BiKEGG: a COBRA toolbox extension for bridging the BiGG and KEGG databases.
Jamialahmadi, Oveis; Motamedian, Ehsan; Hashemi-Najafabadi, Sameereh
2016-10-18
Development of an interface tool between the Biochemical, Genetic and Genomic (BiGG) and KEGG databases is necessary for simultaneous access to the features of both databases. For this purpose, we present the BiKEGG toolbox, an open source COBRA toolbox extension providing a set of functions to infer the reaction correspondences between the KEGG reaction identifiers and those in the BiGG knowledgebase using a combination of manual verification and computational methods. Inferred reaction correspondences using this approach are supported by evidence from the literature, which provides a higher number of reconciled reactions between these two databases compared to the MetaNetX and MetRxn databases. This set of equivalent reactions is then used to automatically superimpose the predicted fluxes using COBRA methods on classical KEGG pathway maps or to create a customized metabolic map based on the KEGG global metabolic pathway, and to find the corresponding reactions in BiGG based on the genome annotation of an organism in the KEGG database. Customized metabolic maps can be created for a set of pathways of interest, for the whole KEGG global map or exclusively for all pathways for which there exists at least one flux carrying reaction. This flexibility in visualization enables BiKEGG to indicate reaction directionality as well as to visualize the reaction fluxes for different static or dynamic conditions in an animated manner. BiKEGG allows the user to export (1) the output visualized metabolic maps to various standard image formats or save them as a video or animated GIF file, and (2) the equivalent reactions for an organism as an Excel spreadsheet.
KEGGParser: parsing and editing KEGG pathway maps in Matlab.
Arakelyan, Arsen; Nersisyan, Lilit
2013-02-15
The KEGG pathway database is a collection of manually drawn pathway maps accompanied by KGML format files intended for use in automatic analysis. KGML files, however, do not contain the information required to completely reproduce all the events indicated in the static image of a pathway map. Several parsers and editors of KEGG pathways exist for processing KGML files. We introduce KEGGParser, a MATLAB-based tool for KEGG pathway parsing, semiautomatic fixing, editing, visualization and analysis in the MATLAB environment. It also works with Scilab. The source code is available at http://www.mathworks.com/matlabcentral/fileexchange/37561.
Conversion of KEGG metabolic pathways to SBGN maps including automatic layout
2013-01-01
Background Biologists make frequent use of databases containing large and complex biological networks. One popular database is the Kyoto Encyclopedia of Genes and Genomes (KEGG) which uses its own graphical representation and manual layout for pathways. While some general drawing conventions exist for biological networks, arbitrary graphical representations are very common. Recently, a new standard has been established for displaying biological processes, the Systems Biology Graphical Notation (SBGN), which aims to unify the look of such maps. Ideally, online repositories such as KEGG would automatically provide networks in a variety of notations including SBGN. Unfortunately, this is non‐trivial, since converting between notations may add, remove or otherwise alter map elements so that the existing layout cannot be simply reused. Results Here we describe a methodology for automatic translation of KEGG metabolic pathways into the SBGN format. We infer important properties of the KEGG layout and treat these as layout constraints that are maintained during the conversion to SBGN maps. Conclusions This allows for the drawing and layout conventions of SBGN to be followed while creating maps that are still recognizably the original KEGG pathways. This article details the steps in this process and provides examples of the final result. PMID:23953132
VisANT 3.0: new modules for pathway visualization, editing, prediction and construction.
Hu, Zhenjun; Ng, David M; Yamada, Takuji; Chen, Chunnuan; Kawashima, Shuichi; Mellor, Joe; Linghu, Bolan; Kanehisa, Minoru; Stuart, Joshua M; DeLisi, Charles
2007-07-01
With the integration of the KEGG and Predictome databases as well as two search engines for coexpressed genes/proteins using data sets obtained from the Stanford Microarray Database (SMD) and Gene Expression Omnibus (GEO) database, VisANT 3.0 supports exploratory pathway analysis, which includes multi-scale visualization of multiple pathways, editing and annotating pathways using a KEGG compatible visual notation and visualization of expression data in the context of pathways. Expression levels are represented either by color intensity or by nodes with an embedded expression profile. Multiple experiments can be navigated or animated. Known KEGG pathways can be enriched by querying either coexpressed components of known pathway members or proteins with known physical interactions. Predicted pathways for genes/proteins with unknown functions can be inferred from coexpression or physical interaction data. Pathways produced in VisANT can be saved as computer-readable XML format (VisML), graphic images or high-resolution Scalable Vector Graphics (SVG). Pathways in the format of VisML can be securely shared within an interested group or published online using a simple Web link. VisANT is freely available at http://visant.bu.edu.
KEGGtranslator: visualizing and converting the KEGG PATHWAY database to various formats.
Wrzodek, Clemens; Dräger, Andreas; Zell, Andreas
2011-08-15
The KEGG PATHWAY database provides a widely used service for metabolic and nonmetabolic pathways. It contains manually drawn pathway maps with information about the genes, reactions and relations contained therein. To store these pathways, KEGG uses KGML, a proprietary XML-format. Parsers and translators are needed to process the pathway maps for usage in other applications and algorithms. We have developed KEGGtranslator, an easy-to-use stand-alone application that can visualize and convert KGML formatted XML-files into multiple output formats. Unlike other translators, KEGGtranslator supports a plethora of output formats, is able to augment the information in translated documents (e.g. MIRIAM annotations) beyond the scope of the KGML document, and amends missing components to fragmentary reactions within the pathway to allow simulations on those. KEGGtranslator is freely available as a Java(™) Web Start application and for download at http://www.cogsys.cs.uni-tuebingen.de/software/KEGGtranslator/. KGML files can be downloaded from within the application. clemens.wrzodek@uni-tuebingen.de Supplementary data are available at Bioinformatics online.
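As a rough illustration of what processing a KGML file involves, the sketch below reads entries and reactions from a small KGML-style fragment using Python's standard library. It is not KEGGtranslator's implementation. The fragment is hand-written for illustration rather than downloaded from KEGG, and the function name `parse_kgml` and the returned dictionary layout are assumptions made for this example.

```python
import xml.etree.ElementTree as ET

# Hand-written, illustrative KGML-style fragment (an assumption, not a real KEGG file).
KGML_SAMPLE = """<pathway name="path:ko00010" org="ko" number="00010" title="Glycolysis / Gluconeogenesis">
  <entry id="1" name="ko:K00844" type="ortholog" reaction="rn:R01786">
    <graphics name="K00844" type="rectangle" x="100" y="200" width="46" height="17"/>
  </entry>
  <entry id="2" name="cpd:C00267" type="compound">
    <graphics name="C00267" type="circle" x="50" y="200" width="8" height="8"/>
  </entry>
  <entry id="3" name="cpd:C00668" type="compound">
    <graphics name="C00668" type="circle" x="150" y="200" width="8" height="8"/>
  </entry>
  <reaction id="1" name="rn:R01786" type="irreversible">
    <substrate id="2" name="cpd:C00267"/>
    <product id="3" name="cpd:C00668"/>
  </reaction>
</pathway>
"""


def parse_kgml(text):
    """Return the pathway title, its entries and its reactions as plain Python data."""
    root = ET.fromstring(text)

    # Each <entry> is a node on the map (gene, ortholog, compound, ...).
    entries = {}
    for entry in root.findall("entry"):
        entries[entry.get("id")] = {
            "name": entry.get("name"),
            "type": entry.get("type"),
        }

    # Each <reaction> lists its substrates and products as child elements.
    reactions = []
    for reaction in root.findall("reaction"):
        reactions.append({
            "name": reaction.get("name"),
            "type": reaction.get("type"),
            "substrates": [s.get("name") for s in reaction.findall("substrate")],
            "products": [p.get("name") for p in reaction.findall("product")],
        })

    return {"title": root.get("title"), "entries": entries, "reactions": reactions}


parsed = parse_kgml(KGML_SAMPLE)
```

A converter to another format would start from a structure like `parsed` and then add whatever the target format needs beyond what KGML provides.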
NemaPath: online exploration of KEGG-based metabolic pathways for nematodes
Wylie, Todd; Martin, John; Abubucker, Sahar; Yin, Yong; Messina, David; Wang, Zhengyuan; McCarter, James P; Mitreva, Makedonka
2008-01-01
Background Nematode.net is a web-accessible resource for investigating gene sequences from parasitic and free-living nematode genomes. Beyond the well-characterized model nematode C. elegans, over 500,000 expressed sequence tags (ESTs) and nearly 600,000 genome survey sequences (GSSs) have been generated from 36 nematode species as part of the Parasitic Nematode Genomics Program undertaken by the Genome Center at Washington University School of Medicine. However, these sequencing data are not present in most publicly available protein databases, which only include sequences in Swiss-Prot. Swiss-Prot, in turn, relies on GenBank/EMBL/DDBJ for predicted proteins from complete genomes or full-length proteins. Description Here we present the NemaPath pathway server, a web-based pathway-level visualization tool for navigating putative metabolic pathways for over 30 nematode species, including 27 parasites. The NemaPath approach consists of two parts: 1) a backend tool to align and evaluate nematode genomic sequences (curated EST contigs) against the annotated Kyoto Encyclopedia of Genes and Genomes (KEGG) protein database; 2) a web viewing application that displays annotated KEGG pathway maps based on desired confidence levels of primary sequence similarity as defined by a user. NemaPath also provides cross-referenced access to nematode genome information provided by other tools available on Nematode.net, including: detailed NemaGene EST cluster information; putative translations; GBrowse EST cluster views; links from nematode data to external databases for corresponding synonymous C. elegans counterparts, subject matches in KEGG's gene database, and also KEGG Orthology (KO) identification. Conclusion The NemaPath server hosts metabolic pathway mappings for 30 nematode species and is available on the World Wide Web at .
The nematode source sequences used for the metabolic pathway mappings are available via FTP , as provided by the Genome Center at Washington University School of Medicine. PMID:18983679
Hadadi, Noushin; Hafner, Jasmin; Shajkofci, Adrian; Zisaki, Aikaterini; Hatzimanikatis, Vassily
2016-10-21
Because the complexity of metabolism cannot be intuitively understood or analyzed, computational methods are indispensable for studying biochemistry and deepening our understanding of cellular metabolism to promote new discoveries. We used the computational framework BNICE.ch along with cheminformatic tools to assemble the whole theoretical reactome from the known metabolome through expansion of the known biochemistry presented in the Kyoto Encyclopedia of Genes and Genomes (KEGG) database. We constructed the ATLAS of Biochemistry, a database of all theoretical biochemical reactions based on known biochemical principles and compounds. ATLAS includes more than 130 000 hypothetical enzymatic reactions that connect two or more KEGG metabolites through novel enzymatic reactions that have never been reported to occur in living organisms. Moreover, ATLAS reactions integrate 42% of KEGG metabolites that are not currently present in any KEGG reaction into one or more novel enzymatic reactions. The generated repository of information is organized in a Web-based database ( http://lcsb-databases.epfl.ch/atlas/ ) that allows the user to search for all possible routes from any substrate compound to any product. The resulting pathways involve known and novel enzymatic steps that may indicate unidentified enzymatic activities and provide potential targets for protein engineering. Our approach of introducing novel biochemistry into pathway design and associated databases will be important for synthetic biology and metabolic engineering.
From genomics to chemical genomics: new developments in KEGG
Kanehisa, Minoru; Goto, Susumu; Hattori, Masahiro; Aoki-Kinoshita, Kiyoko F.; Itoh, Masumi; Kawashima, Shuichi; Katayama, Toshiaki; Araki, Michihiro; Hirakawa, Mika
2006-01-01
The increasing amount of genomic and molecular information is the basis for understanding higher-order biological systems, such as the cell and the organism, and their interactions with the environment, as well as for medical, industrial and other practical applications. The KEGG resource () provides a reference knowledge base for linking genomes to biological systems, categorized as building blocks in the genomic space (KEGG GENES) and the chemical space (KEGG LIGAND), and wiring diagrams of interaction networks and reaction networks (KEGG PATHWAY). A fourth component, KEGG BRITE, has been formally added to the KEGG suite of databases. This reflects our attempt to computerize functional interpretations as part of the pathway reconstruction process based on the hierarchically structured knowledge about the genomic, chemical and network spaces. In accordance with the new chemical genomics initiatives, the scope of KEGG LIGAND has been significantly expanded to cover both endogenous and exogenous molecules. Specifically, RPAIR contains curated chemical structure transformation patterns extracted from known enzymatic reactions, which would enable analysis of genome-environment interactions, such as the prediction of new reactions and new enzyme genes that would degrade new environmental compounds. Additionally, drug information is now stored separately and linked to new KEGG DRUG structure maps. PMID:16381885
KEGG Bioinformatics Resource for Plant Genomics and Metabolomics.
Kanehisa, Minoru
2016-01-01
In the era of high-throughput biology it is necessary to develop not only elaborate computational methods but also well-curated databases that can be used as reference for data interpretation. KEGG ( http://www.kegg.jp/ ) is such a reference knowledge base with two specific aims. One is to compile knowledge on high-level functions of the cell and the organism in terms of the molecular interaction and reaction networks, which is implemented in KEGG pathway maps, BRITE functional hierarchies, and KEGG modules. The other is to expand knowledge on genes and proteins involved in the molecular networks from experimentally observed organisms to other organisms using the concept of orthologs, which is implemented in the KEGG Orthology (KO) system. Thus, KEGG is a generic resource applicable to all organisms and enables interpretation of high-level functions from genomic and molecular data. Here we first present a brief overview of the entire KEGG resource, and then give an introduction of how to use KEGG in plant genomics and metabolomics research.
Zhou, Hufeng; Jin, Jingjing; Zhang, Haojun; Yi, Bo; Wozniak, Michal; Wong, Limsoon
2012-01-01
Pathway data are important for understanding the relationship between genes, proteins and many other molecules in living organisms. Pathway gene relationships are crucial information for guidance, prediction, reference and assessment in biochemistry, computational biology, and medicine. Many well-established databases--e.g., KEGG, WikiPathways, and BioCyc--are dedicated to collecting pathway data for public access. However, the effectiveness of these databases is hindered by issues such as incompatible data formats, inconsistent molecular representations, inconsistent molecular relationship representations, inconsistent referrals to pathway names, and incomplete data from different databases. In this paper, we overcome these issues through extraction, normalization and integration of pathway data from several major public databases (KEGG, WikiPathways, BioCyc, etc.). We build a database that not only hosts our integrated pathway gene relationship data for public access but also maintains the necessary updates in the long run. This public repository is named IntPath (Integrated Pathway gene relationship database for model organisms and important pathogens). Four organisms--S. cerevisiae, M. tuberculosis H37Rv, H. sapiens and M. musculus--are included in this version (V2.0) of IntPath. IntPath uses the "full unification" approach to ensure no deletion and no introduced noise in this process. Therefore, IntPath contains much richer pathway-gene and pathway-gene pair relationships and a much larger number of non-redundant genes and gene pairs than any of the single-source databases. The gene relationships of each gene (measured by average node degree) per pathway are significantly richer. The gene relationships in each pathway (measured by average number of gene pairs per pathway) are also considerably richer in the integrated pathways.
Moderate manual curation is involved to remove errors and noise from the source data (e.g., the gene ID errors in WikiPathways and relationship errors in KEGG). We turn complicated and incompatible XML data formats and inconsistent gene and gene relationship representations from different source databases into normalized and unified pathway-gene and pathway-gene pair relationships neatly recorded in simple tab-delimited text format and MySQL tables, which facilitates convenient automatic computation and large-scale referencing in many related studies. IntPath data can be downloaded in text format or as a MySQL dump. IntPath data can also be retrieved and analyzed conveniently through a web service by local programs or through a web interface by mouse clicks. Several useful analysis tools are also provided in IntPath. We have overcome in IntPath the issues of compatibility, consistency, and comprehensiveness that often hamper effective use of pathway databases. We have included four organisms in the current release of IntPath. Our methodology and programs described in this work can be easily applied to other organisms; and we will include more model organisms and important pathogens in future releases of IntPath. IntPath maintains regular updates and is freely available at http://compbio.ddns.comp.nus.edu.sg:8080/IntPath.
2012-01-01
Background Pathway data are important for understanding the relationship between genes, proteins and many other molecules in living organisms. Pathway gene relationships are crucial information for guidance, prediction, reference and assessment in biochemistry, computational biology, and medicine. Many well-established databases--e.g., KEGG, WikiPathways, and BioCyc--are dedicated to collecting pathway data for public access. However, the effectiveness of these databases is hindered by issues such as incompatible data formats, inconsistent molecular representations, inconsistent molecular relationship representations, inconsistent referrals to pathway names, and incomplete data from different databases. Results In this paper, we overcome these issues through extraction, normalization and integration of pathway data from several major public databases (KEGG, WikiPathways, BioCyc, etc.). We build a database that not only hosts our integrated pathway gene relationship data for public access but also maintains the necessary updates in the long run. This public repository is named IntPath (Integrated Pathway gene relationship database for model organisms and important pathogens). Four organisms--S. cerevisiae, M. tuberculosis H37Rv, H. sapiens and M. musculus--are included in this version (V2.0) of IntPath. IntPath uses the "full unification" approach to ensure no deletion and no introduced noise in this process. Therefore, IntPath contains much richer pathway-gene and pathway-gene pair relationships and a much larger number of non-redundant genes and gene pairs than any of the single-source databases. The gene relationships of each gene (measured by average node degree) per pathway are significantly richer. The gene relationships in each pathway (measured by average number of gene pairs per pathway) are also considerably richer in the integrated pathways.
Moderate manual curation is involved to remove errors and noise from the source data (e.g., the gene ID errors in WikiPathways and relationship errors in KEGG). We turn complicated and incompatible XML data formats and inconsistent gene and gene relationship representations from different source databases into normalized and unified pathway-gene and pathway-gene pair relationships neatly recorded in simple tab-delimited text format and MySQL tables, which facilitates convenient automatic computation and large-scale referencing in many related studies. IntPath data can be downloaded in text format or as a MySQL dump. IntPath data can also be retrieved and analyzed conveniently through a web service by local programs or through a web interface by mouse clicks. Several useful analysis tools are also provided in IntPath. Conclusions We have overcome in IntPath the issues of compatibility, consistency, and comprehensiveness that often hamper effective use of pathway databases. We have included four organisms in the current release of IntPath. Our methodology and programs described in this work can be easily applied to other organisms; and we will include more model organisms and important pathogens in future releases of IntPath. IntPath maintains regular updates and is freely available at http://compbio.ddns.comp.nus.edu.sg:8080/IntPath. PMID:23282057
Liu, Shanshan; Chen, Guanxing; Xu, Haidong; Zou, Weibin; Yan, Wenrui; Wang, Qianqian; Deng, Hengwei; Zhang, Heqian; Yu, Guojiao; He, Jianguo; Weng, Shaoping
2017-01-01
Mud crab (Scylla paramamosain) is an economically important marine cultured species in China's coastal area. Mud crab reovirus (MCRV) is the most important pathogen of mud crab, resulting in large economic losses in crab farming. In this paper, next-generation sequencing technology and bioinformatics analysis are used to study transcriptome differences between MCRV-infected mud crabs and normal controls. A total of 104.3 million clean reads were obtained, including 52.7 million and 51.6 million clean reads from MCRV-infected (CA) and control (HA) mud crabs, respectively. A total of 81,901, 70,059 and 67,279 unigenes were obtained from HA reads, CA reads and HA&CA reads, respectively. A total of 32,547 unigenes from HA&CA reads, called All-Unigenes, were matched to at least one database among the Nr, Nt, Swiss-prot, COG, GO and KEGG databases. Among these, 13,039, 20,260 and 11,866 unigenes belonged to the 3, 258 and 25 categories of the GO, KEGG pathway, and COG databases, respectively. Solexa/Illumina's DGE platform was also used, and about 13,856 differentially expressed genes (DEGs), including 4444 significantly upregulated and 9412 downregulated DEGs, were detected in diseased crabs compared with the control. KEGG pathway analysis revealed that DEGs were clearly enriched in pathways related to different diseases or infections. This transcriptome analysis provided valuable information on gene functions associated with the response to MCRV in mud crab, as well as detailed information for identifying novel genes in the absence of a mud crab genome database. Copyright © 2016. Published by Elsevier Ltd.
miRPathDB: a new dictionary on microRNAs and target pathways.
Backes, Christina; Kehl, Tim; Stöckel, Daniel; Fehlmann, Tobias; Schneider, Lara; Meese, Eckart; Lenhof, Hans-Peter; Keller, Andreas
2017-01-04
In the last decade, miRNAs and their regulatory mechanisms have been intensively studied and many tools for the analysis of miRNAs and their targets have been developed. We previously presented a dictionary on single miRNAs and their putative target pathways. Since then, the number of miRNAs has tripled and the knowledge on miRNAs and targets has grown substantially. This, along with changes in pathway resources such as KEGG, leads to an improved understanding of miRNAs, their target genes and related pathways. Here, we introduce the miRNA Pathway Dictionary Database (miRPathDB), freely accessible at https://mpd.bioinf.uni-sb.de/. With the database we aim to complement available target pathway web-servers by providing researchers easy access to information on which pathways are regulated by a miRNA, which miRNAs target a pathway and how specific these regulations are. The database contains a large number of miRNAs (2595 human miRNAs), different miRNA target sets (14 773 experimentally validated target genes as well as 19 281 predicted target genes) and a broad selection of functional biochemical categories (KEGG-, WikiPathways-, BioCarta-, SMPDB-, PID-, Reactome pathways, functional categories from gene ontology (GO), protein families from Pfam and chromosomal locations totaling 12 875 categories). In addition to Homo sapiens, Mus musculus data are also stored and can be compared to human target pathways. © The Author(s) 2016. Published by Oxford University Press on behalf of Nucleic Acids Research.
Shirdel, Elize A.; Xie, Wing; Mak, Tak W.; Jurisica, Igor
2011-01-01
Background MicroRNAs are a class of small RNAs known to regulate gene expression at the transcript level, the protein level, or both. Since microRNA binding is sequence-based but possibly structure-specific, work in this area has resulted in multiple databases storing predicted microRNA:target relationships computed using diverse algorithms. We integrate prediction databases, compare predictions to in vitro data, and use cross-database predictions to model the microRNA:transcript interactome – referred to as the micronome – to study microRNA involvement in well-known signalling pathways as well as associations with disease. We make this data freely available with a flexible user interface as our microRNA Data Integration Portal — mirDIP (http://ophid.utoronto.ca/mirDIP). Results mirDIP integrates prediction databases to elucidate accurate microRNA:target relationships. Using NAViGaTOR to produce interaction networks implicating microRNAs in literature-based, KEGG-based and Reactome-based pathways, we find these signalling pathway networks have significantly more microRNA involvement compared to chance (p<0.05), suggesting microRNAs co-target many genes in a given pathway. Further examination of the micronome shows two distinct classes of microRNAs; universe microRNAs, which are involved in many signalling pathways; and intra-pathway microRNAs, which target multiple genes within one signalling pathway. We find universe microRNAs to have more targets (p<0.0001), to be more studied (p<0.0002), and to have higher degree in the KEGG cancer pathway (p<0.0001), compared to intra-pathway microRNAs. Conclusions Our pathway-based analysis of mirDIP data suggests microRNAs are involved in intra-pathway signalling. We identify two distinct classes of microRNAs, suggesting a hierarchical organization of microRNAs co-targeting genes both within and between pathways, and implying differential involvement of universe and intra-pathway microRNAs at the disease level. PMID:21364759
An editor for pathway drawing and data visualization in the Biopathways Workbench.
Byrnes, Robert W; Cotter, Dawn; Maer, Andreia; Li, Joshua; Nadeau, David; Subramaniam, Shankar
2009-10-02
Pathway models serve as the basis for much of systems biology. They are often built using programs designed for the purpose. Constructing new models generally requires simultaneous access to experimental data of diverse types, to databases of well-characterized biological compounds and molecular intermediates, and to reference model pathways. However, few if any software applications provide all such capabilities within a single user interface. The Pathway Editor is a program written in the Java programming language that allows de-novo pathway creation and downloading of LIPID MAPS (Lipid Metabolites and Pathways Strategy) and KEGG lipid metabolic pathways, and of measured time-dependent changes to lipid components of metabolism. Accessed through Java Web Start, the program downloads pathways from the LIPID MAPS Pathway database (Pathway) as well as from the LIPID MAPS web server http://www.lipidmaps.org. Data arises from metabolomic (lipidomic), microarray, and protein array experiments performed by the LIPID MAPS consortium of laboratories and is arranged by experiment. Facility is provided to create, connect, and annotate nodes and processes on a drawing panel with reference to database objects and time course data. Node and interaction layout as well as data display may be configured in pathway diagrams as desired. Users may extend diagrams, and may also read and write data and non-lipidomic KEGG pathways to and from files. Pathway diagrams in XML format, containing database identifiers referencing specific compounds and experiments, can be saved to a local file for subsequent use. The program is built upon a library of classes, referred to as the Biopathways Workbench, that convert between different file formats and database objects. An example of this feature is provided in the form of read/construct/write access to models in SBML (Systems Biology Markup Language) contained in the local file system. 
Inclusion of access to multiple experimental data types and of pathway diagrams within a single interface, automatic updating through connectivity to an online database, and a focus on annotation, including reference to standardized lipid nomenclature as well as common lipid names, supports the view that the Pathway Editor represents a significant, practicable contribution to current pathway modeling tools.
RUAN, XIYUN; LI, HONGYUN; LIU, BO; CHEN, JIE; ZHANG, SHIBAO; SUN, ZEQIANG; LIU, SHUANGQING; SUN, FAHAI; LIU, QINGYONG
2015-01-01
The aim of the present study was to develop a novel method for identifying pathways associated with renal cell carcinoma (RCC) based on a gene co-expression network. A framework was established where a co-expression network was derived from the database as well as various co-expression approaches. First, the backbone of the network based on differentially expressed (DE) genes between RCC patients and normal controls was constructed using the Search Tool for the Retrieval of Interacting Genes/Proteins (STRING) database. The differentially co-expressed links were detected by Pearson’s correlation, the empirical Bayesian (EB) approach and Weighted Gene Co-expression Network Analysis (WGCNA). The co-expressed gene pairs were merged by a rank-based algorithm. We obtained 842, 371, 2,883 and 1,595 co-expressed gene pairs from the co-expression networks of the STRING database, Pearson’s correlation, the EB method and WGCNA, respectively. Two hundred and eighty-one differentially co-expressed (DC) gene pairs were obtained from the merged network using this novel method. Pathway enrichment analysis based on the Kyoto Encyclopedia of Genes and Genomes (KEGG) database and the network enrichment analysis (NEA) method were performed to verify the feasibility of the merged method. Results of the KEGG and NEA pathway analyses showed that the network was associated with RCC. The suggested method was computationally efficient in identifying pathways associated with RCC and has been identified as a useful complement to traditional co-expression analysis. PMID:26058425
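One of the co-expression measures named above, Pearson's correlation, can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the authors' pipeline: the expression values are invented, the function names are hypothetical, and the cutoff of 0.9 is arbitrary. The EB, WGCNA and rank-based merging steps are not shown.

```python
from itertools import combinations
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length numeric sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    denom = sqrt(var_x * var_y)
    # A constant profile has zero variance; treat it as uncorrelated here.
    return cov / denom if denom else 0.0


def coexpressed_pairs(expression, threshold=0.9):
    """Return (gene1, gene2, r) for every gene pair with |r| >= threshold."""
    pairs = []
    for g1, g2 in combinations(sorted(expression), 2):
        r = pearson(expression[g1], expression[g2])
        if abs(r) >= threshold:
            pairs.append((g1, g2, r))
    return pairs


# Invented expression profiles across four samples (illustrative only).
expression = {
    "geneA": [1.0, 2.0, 3.0, 4.0],
    "geneB": [2.0, 4.0, 6.0, 8.0],
    "geneC": [4.0, 1.0, 3.0, 2.0],
}
pairs = coexpressed_pairs(expression)
```

With these toy profiles only geneA and geneB pass the cutoff, since geneC correlates weakly with both.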
FMM: a web server for metabolic pathway reconstruction and comparative analysis.
Chou, Chih-Hung; Chang, Wen-Chi; Chiu, Chih-Min; Huang, Chih-Chang; Huang, Hsien-Da
2009-07-01
Synthetic Biology, a multidisciplinary field, is growing rapidly. Improving the understanding of biological systems through mimicry and producing bio-orthogonal systems with new functions are two complementary pursuits in this field. A web server called FMM (From Metabolite to Metabolite) was developed for this purpose. FMM can reconstruct metabolic pathways from one metabolite to another metabolite among different species, based mainly on the Kyoto Encyclopedia of Genes and Genomes (KEGG) database and other integrated biological databases. A novel presentation for connecting different KEGG maps is provided. Both local and global graphical views of the metabolic pathways are designed. FMM has many applications in Synthetic Biology and Metabolic Engineering. For example, the reconstruction of metabolic pathways to produce valuable metabolites or secondary metabolites in bacteria or yeast is a promising strategy for drug production. FMM provides a highly effective way to identify which genes, from which species, should be cloned into those microorganisms, based on FMM pathway comparative analysis. Consequently, FMM is an effective tool for applications in synthetic biology to produce both drugs and biofuels. This novel and innovative resource is now freely available at http://FMM.mbc.nctu.edu.tw/.
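The idea of finding a route from one metabolite to another can be illustrated with a breadth-first search over a reaction list. The abstract does not state which search method FMM uses, so this is only a generic sketch. The reaction identifiers (R1 to R5), the metabolite names (A to E) and the function name `shortest_route` are all hypothetical.

```python
from collections import deque


def shortest_route(reactions, source, target):
    """Return the shortest list of (substrate, reaction_id, product) steps, or None."""
    # Build a directed graph: substrate -> [(reaction_id, product), ...]
    graph = {}
    for reaction_id, substrates, products in reactions:
        for substrate in substrates:
            for product in products:
                graph.setdefault(substrate, []).append((reaction_id, product))

    # Breadth-first search, remembering how each metabolite was reached.
    came_from = {source: None}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        if node == target:
            break
        for reaction_id, nxt in graph.get(node, []):
            if nxt not in came_from:
                came_from[nxt] = (node, reaction_id)
                queue.append(nxt)

    if target not in came_from:
        return None

    # Walk back from the target to the source to recover the steps.
    steps = []
    node = target
    while came_from[node] is not None:
        previous, reaction_id = came_from[node]
        steps.append((previous, reaction_id, node))
        node = previous
    steps.reverse()
    return steps


# Toy network: two routes from A to C, one with two steps and one with three.
toy_reactions = [
    ("R1", ["A"], ["B"]),
    ("R2", ["B"], ["C"]),
    ("R3", ["A"], ["D"]),
    ("R4", ["D"], ["E"]),
    ("R5", ["E"], ["C"]),
]
route = shortest_route(toy_reactions, "A", "C")
```

Here `route` is the two-step path through B. A real reconstruction would also need to handle reversible reactions and exclude ubiquitous cofactors, which this sketch ignores.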
Inferring gene and protein interactions using PubMed citations and consensus Bayesian networks.
Deeter, Anthony; Dalman, Mark; Haddad, Joseph; Duan, Zhong-Hui
2017-01-01
The PubMed database offers an extensive set of publication data that can be useful, yet is inherently complex to use without automated computational techniques. Data repositories such as the Genomic Data Commons (GDC) and the Gene Expression Omnibus (GEO) offer experimental data storage and retrieval as well as curated gene expression profiles. Genetic interaction databases, including Reactome and Ingenuity Pathway Analysis, offer pathway and experiment data analysis using data curated from these publications and data repositories. We have created a method to generate and analyze consensus networks, inferring potential gene interactions, using large numbers of Bayesian networks generated by data mining publications in the PubMed database. Through the concept of network resolution, these consensus networks can be tailored to represent possible genetic interactions. We designed a set of experiments to confirm that our method is stable across variation in both sample and topological input sizes. Using gene product interactions from the KEGG pathway database and data mining PubMed publication abstracts, we verify that regardless of the network resolution or the inferred consensus network, our method is capable of inferring meaningful gene interactions through consensus Bayesian network generation with multiple, randomized topological orderings. Our method can not only confirm the existence of currently accepted interactions, but also has the potential to hypothesize new ones. We show that our method confirms the existence of known gene interactions such as JAK-STAT-PI3K-AKT-mTOR, infers novel gene interactions such as RAS-Bcl-2 and RAS-AKT, and finds significant pathway-pathway interactions between the JAK-STAT signaling and Cardiac Muscle Contraction KEGG pathways.
Drug-Path: a database for drug-induced pathways
Zeng, Hui; Cui, Qinghua
2015-01-01
Some databases for drug-associated pathways have been built and are publicly available. However, the pathways curated in most of these databases are drug-action or drug-metabolism pathways. In recent years, high-throughput technologies such as microarray and RNA-sequencing have produced many drug-induced gene expression profiles. Interestingly, drug-induced gene expression profiles frequently show distinct patterns, indicating that drugs normally induce the activation or repression of distinct pathways. Therefore, these pathways can help in studying the mechanisms of drugs and drug repurposing. Here, we present Drug-Path, a database of drug-induced pathways, which was generated by KEGG pathway enrichment analysis for drug-induced upregulated genes and downregulated genes based on drug-induced gene expression datasets in Connectivity Map. Drug-Path provides user-friendly interfaces to retrieve, visualize and download the drug-induced pathway data in the database. In addition, the genes deregulated by a given drug are highlighted in the pathways. All data were organized using SQLite. The web site was implemented using Django, a Python web framework. Finally, we believe that this database will be useful for related research. Database URL: http://www.cuilab.cn/drugpath PMID:26130661
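KEGG pathway enrichment of a gene list is commonly scored with a one-sided hypergeometric test. The abstract does not say which statistic Drug-Path uses, so the following is only a minimal sketch of that common approach; the function name and the example counts are hypothetical.

```python
from math import comb


def hypergeom_enrichment_p(n_background, n_pathway, n_selected, n_overlap):
    """One-sided enrichment p-value P(X >= n_overlap).

    n_background: genes in the background set
    n_pathway:    background genes annotated to the pathway
    n_selected:   genes in the query list (e.g. upregulated genes)
    n_overlap:    query genes that fall in the pathway
    """
    total = comb(n_background, n_selected)
    upper = min(n_pathway, n_selected)
    # Sum the hypergeometric probabilities of overlaps at least as large.
    tail = sum(
        comb(n_pathway, k) * comb(n_background - n_pathway, n_selected - k)
        for k in range(n_overlap, upper + 1)
    )
    return tail / total


# Hypothetical counts: 8 of 200 upregulated genes fall in a 100-gene pathway,
# against a background of 20,000 genes.
p_value = hypergeom_enrichment_p(20000, 100, 200, 8)
```

In practice the p-values for all tested pathways would also be corrected for multiple testing.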
Dong, Bin; Wu, Bin; Hong, Wenhong; Li, Xiuping; Li, Zhuo; Xue, Li; Huang, Yongfang
2017-01-01
The tea-oil camellia (Camellia oleifera) is the most important oil plant in southern China, and has a strong resistance to drought and barren soil. Understanding the molecular mechanisms of drought tolerance would greatly promote its cultivation and molecular breeding. In total, we obtained 76,585 unigenes with an average length of 810 bp and an N50 of 1,092 bp. We mapped all the unigenes to the NCBI 'nr' (non-redundant), SwissProt, KEGG, and clusters of orthologous groups (COG) databases, where 52,531 (68.6%) unigenes were functionally annotated. According to the annotation, 46,171 (60.8%) unigenes belong to 338 KEGG pathways. We identified a series of unigenes that are related to the synthesis and regulation of abscisic acid (ABA), the activity of protective enzymes, vitamin B6 metabolism, the metabolism of osmolytes, and pathways related to the biosynthesis of secondary metabolites. After exposure to drought for 12 hours, the number of differentially expressed genes (DEGs) between treated plants and control plants increased in the G4 cultivar, while there was no significant increase in the drought-tolerant C3 cultivar. DEGs associated with drought stress responsive pathways were identified by KEGG pathway enrichment analysis. Moreover, we found 789 DEGs related to transcription factors. Finally, according to the results of qRT-PCR, the expression levels of the 20 unigenes tested were consistent with the results of next-generation sequencing. In the present study, we identified a large set of cDNA unigenes from C. oleifera annotated using public databases. Further studies of DEGs involved in metabolic pathways related to drought stress and transcription will facilitate the discovery of novel genes involved in resistance to drought stress in this commercially important plant.
Wu, Bin; Hong, Wenhong; Li, Xiuping; Li, Zhuo; Xue, Li; Huang, Yongfang
2017-01-01
Background The tea-oil camellia (Camellia oleifera) is the most important oil plant in southern China, and has a strong resistance to drought and barren soil. Understanding the molecular mechanisms of drought tolerance would greatly promote its cultivation and molecular breeding. Results In total, we obtained 76,585 unigenes with an average length of 810 bp and an N50 of 1,092 bp. We mapped all the unigenes to the NCBI ‘nr’ (non-redundant), SwissProt, KEGG, and clusters of orthologous groups (COG) databases, where 52,531 (68.6%) unigenes were functionally annotated. According to the annotation, 46,171 (60.8%) unigenes belong to 338 KEGG pathways. We identified a series of unigenes that are related to the synthesis and regulation of abscisic acid (ABA), the activity of protective enzymes, vitamin B6 metabolism, the metabolism of osmolytes, and pathways related to the biosynthesis of secondary metabolites. After exposure to drought for 12 hours, the number of differentially expressed genes (DEGs) between treated plants and control plants increased in the G4 cultivar, while there was no significant increase in the drought-tolerant C3 cultivar. DEGs associated with drought stress responsive pathways were identified by KEGG pathway enrichment analysis. Moreover, we found 789 DEGs related to transcription factors. Finally, according to the results of qRT-PCR, the expression levels of the 20 unigenes tested were consistent with the results of next-generation sequencing. Conclusions In the present study, we identified a large set of cDNA unigenes from C. oleifera annotated using public databases. Further studies of DEGs involved in metabolic pathways related to drought stress and transcription will facilitate the discovery of novel genes involved in resistance to drought stress in this commercially important plant. PMID:28759610
Suzuki, Mami; Nakabayashi, Ryo; Ogata, Yoshiyuki; Sakurai, Nozomu; Tokimatsu, Toshiaki; Goto, Susumu; Suzuki, Makoto; Jasinski, Michal; Martinoia, Enrico; Otagaki, Shungo; Matsumoto, Shogo; Saito, Kazuki; Shiratake, Katsuhiro
2015-01-01
Grape (Vitis vinifera) accumulates various polyphenolic compounds, which protect against environmental stresses, including ultraviolet-C (UV-C) light and pathogens. In this study, we looked at the transcriptome and metabolome in grape berry skin after UV-C irradiation, which demonstrated the effectiveness of omics approaches to clarify important traits of grape. We performed transcriptome analysis using a genome-wide microarray, which revealed 238 genes up-regulated more than 5-fold by UV-C light. Enrichment analysis of Gene Ontology terms showed that genes encoding stilbene synthase, a key enzyme for resveratrol synthesis, were enriched in the up-regulated genes. We performed metabolome analysis using liquid chromatography-quadrupole time-of-flight mass spectrometry, and 2,012 metabolite peaks, including unidentified peaks, were detected. Principal component analysis using the peaks showed that only one metabolite peak, identified as resveratrol, was highly induced by UV-C light. We updated the metabolic pathway map of grape in the Kyoto Encyclopedia of Genes and Genomes (KEGG) database and in the KaPPA-View 4 KEGG system, then projected the transcriptome and metabolome data on a metabolic pathway map. The map showed specific induction of the resveratrol synthetic pathway by UV-C light. Our results showed that multiomics is a powerful tool to elucidate the accumulation mechanisms of secondary metabolites, and updated systems, such as KEGG and KaPPA-View 4 KEGG for grape, can support such studies. PMID:25761715
Suzuki, Mami; Nakabayashi, Ryo; Ogata, Yoshiyuki; Sakurai, Nozomu; Tokimatsu, Toshiaki; Goto, Susumu; Suzuki, Makoto; Jasinski, Michal; Martinoia, Enrico; Otagaki, Shungo; Matsumoto, Shogo; Saito, Kazuki; Shiratake, Katsuhiro
2015-05-01
Grape (Vitis vinifera) accumulates various polyphenolic compounds, which protect against environmental stresses, including ultraviolet-C (UV-C) light and pathogens. In this study, we looked at the transcriptome and metabolome in grape berry skin after UV-C irradiation, which demonstrated the effectiveness of omics approaches to clarify important traits of grape. We performed transcriptome analysis using a genome-wide microarray, which revealed 238 genes up-regulated more than 5-fold by UV-C light. Enrichment analysis of Gene Ontology terms showed that genes encoding stilbene synthase, a key enzyme for resveratrol synthesis, were enriched in the up-regulated genes. We performed metabolome analysis using liquid chromatography-quadrupole time-of-flight mass spectrometry, and 2,012 metabolite peaks, including unidentified peaks, were detected. Principal component analysis using the peaks showed that only one metabolite peak, identified as resveratrol, was highly induced by UV-C light. We updated the metabolic pathway map of grape in the Kyoto Encyclopedia of Genes and Genomes (KEGG) database and in the KaPPA-View 4 KEGG system, then projected the transcriptome and metabolome data on a metabolic pathway map. The map showed specific induction of the resveratrol synthetic pathway by UV-C light. Our results showed that multiomics is a powerful tool to elucidate the accumulation mechanisms of secondary metabolites, and updated systems, such as KEGG and KaPPA-View 4 KEGG for grape, can support such studies. © 2015 American Society of Plant Biologists. All Rights Reserved.
Metabolomics analysis: Finding out metabolic building blocks
2017-01-01
In this paper we propose a new methodology for the analysis of metabolic networks. We use the notion of strongly connected components of a graph, called in this context metabolic building blocks. Every strongly connected component is contracted to a single node in such a way that the resulting graph is a directed acyclic graph, called a metabolic DAG, with a considerably reduced number of nodes. The property of being a directed acyclic graph brings out a background graph topology that reveals the connectivity of the metabolic network, as well as bridges, isolated nodes and cut nodes. Altogether, this is key information for the discovery of functional metabolic relations. Our methodology has been applied to the glycolysis and the purine metabolic pathways for all organisms in the KEGG database, although it is general enough to work on any database. As expected, using the metabolic DAG formalism, a considerable reduction in the size of the metabolic networks has been obtained, especially in the case of the purine pathway owing to its relatively larger size. As a proof of concept, from the information captured by a metabolic DAG and its corresponding metabolic building blocks, we obtain the core of the glycolysis pathway and the core of the purine metabolism pathway and detect some essential metabolic building blocks that reveal the key reactions in both pathways. Finally, the application of our methodology to the glycolysis pathway and the purine metabolism pathway reproduces the tree of life for the whole set of organisms represented in the KEGG database, which supports the utility of this research. PMID:28493998
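The contraction of strongly connected components into a DAG can be sketched as follows. This is a generic illustration using Kosaraju's two-pass algorithm, not the authors' implementation (the abstract does not name one); the function name `condense` and the toy graph are hypothetical.

```python
def condense(graph):
    """Contract each strongly connected component (SCC) to a single node.

    graph: dict mapping node -> iterable of successor nodes.
    Returns (component_of, dag_edges), where component_of maps every node to a
    representative of its SCC and dag_edges are the edges between different SCCs.
    """
    nodes = set(graph)
    for targets in graph.values():
        nodes.update(targets)

    # Pass 1: record nodes in order of DFS completion (iterative DFS).
    order, seen = [], set()
    for start in nodes:
        if start in seen:
            continue
        seen.add(start)
        stack = [(start, iter(graph.get(start, ())))]
        while stack:
            node, successors = stack[-1]
            for nxt in successors:
                if nxt not in seen:
                    seen.add(nxt)
                    stack.append((nxt, iter(graph.get(nxt, ()))))
                    break
            else:
                # All successors explored: the node is finished.
                order.append(node)
                stack.pop()

    # Build the reversed graph.
    reverse = {n: [] for n in nodes}
    for src, targets in graph.items():
        for dst in targets:
            reverse[dst].append(src)

    # Pass 2: sweep the reversed graph in decreasing finish time;
    # each sweep collects exactly one SCC.
    component_of = {}
    for root in reversed(order):
        if root in component_of:
            continue
        component_of[root] = root
        stack = [root]
        while stack:
            node = stack.pop()
            for prev in reverse[node]:
                if prev not in component_of:
                    component_of[prev] = root
                    stack.append(prev)

    # Edges between different components form the condensed DAG.
    dag_edges = {
        (component_of[s], component_of[d])
        for s, targets in graph.items()
        for d in targets
        if component_of[s] != component_of[d]
    }
    return component_of, dag_edges


# Toy network: A -> B -> C -> A forms one cycle (a single "building block").
toy = {"A": ["B"], "B": ["C"], "C": ["A", "D"], "D": ["E"], "E": []}
component_of, dag_edges = condense(toy)
```

For the toy graph, the five nodes collapse to three components joined by two DAG edges, which is the kind of size reduction the abstract describes.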
Inferring gene and protein interactions using PubMed citations and consensus Bayesian networks
Dalman, Mark; Haddad, Joseph; Duan, Zhong-Hui
2017-01-01
The PubMed database offers an extensive set of publication data that can be useful, yet inherently complex to use without automated computational techniques. Data repositories such as the Genomic Data Commons (GDC) and the Gene Expression Omnibus (GEO) offer experimental data storage and retrieval as well as curated gene expression profiles. Genetic interaction databases, including Reactome and Ingenuity Pathway Analysis, offer pathway and experiment data analysis using data curated from these publications and data repositories. We have created a method to generate and analyze consensus networks, inferring potential gene interactions, using large numbers of Bayesian networks generated by data mining publications in the PubMed database. Through the concept of network resolution, these consensus networks can be tailored to represent possible genetic interactions. We designed a set of experiments to confirm that our method is stable across variation in both sample and topological input sizes. Using gene product interactions from the KEGG pathway database and data mining PubMed publication abstracts, we verify that regardless of the network resolution or the inferred consensus network, our method is capable of inferring meaningful gene interactions through consensus Bayesian network generation with multiple, randomized topological orderings. Our method can not only confirm the existence of currently accepted interactions, but has the potential to hypothesize new ones as well. We show our method confirms the existence of known gene interactions such as JAK-STAT-PI3K-AKT-mTOR, infers novel gene interactions such as RAS-Bcl-2 and RAS-AKT, and finds significant pathway-pathway interactions between the JAK-STAT signaling and Cardiac Muscle Contraction KEGG pathways. PMID:29049295
Drug-Path: a database for drug-induced pathways.
Zeng, Hui; Qiu, Chengxiang; Cui, Qinghua
2015-01-01
Some databases for drug-associated pathways have been built and are publicly available. However, the pathways curated in most of these databases are drug-action or drug-metabolism pathways. In recent years, high-throughput technologies such as microarray and RNA-sequencing have produced many drug-induced gene expression profiles. Interestingly, drug-induced gene expression profiles frequently show distinct patterns, indicating that drugs normally induce the activation or repression of distinct pathways. Therefore, these pathways can help in studying the mechanisms of drugs and drug repurposing. Here, we present Drug-Path, a database of drug-induced pathways, which was generated by KEGG pathway enrichment analysis for drug-induced upregulated genes and downregulated genes based on drug-induced gene expression datasets in Connectivity Map. Drug-Path provides user-friendly interfaces to retrieve, visualize and download the drug-induced pathway data in the database. In addition, the genes deregulated by a given drug are highlighted in the pathways. All data were organized using SQLite. The web site was implemented using Django, a Python web framework. Finally, we believe that this database will be useful for related research. © The Author(s) 2015. Published by Oxford University Press.
Xtalk: a path-based approach for identifying crosstalk between signaling pathways
Tegge, Allison N.; Sharp, Nicholas; Murali, T. M.
2016-01-01
Motivation: Cells communicate with their environment via signal transduction pathways. On occasion, the activation of one pathway can produce an effect downstream of another pathway, a phenomenon known as crosstalk. Existing computational methods to discover such pathway pairs rely on simple overlap statistics. Results: We present Xtalk, a path-based approach for identifying pairs of pathways that may crosstalk. Xtalk computes the statistical significance of the average length of multiple short paths that connect receptors in one pathway to the transcription factors in another. By design, Xtalk reports the precise interactions and mechanisms that support the identified crosstalk. We applied Xtalk to signaling pathways in the KEGG and NCI-PID databases. We manually curated a gold standard set of 132 crosstalking pathway pairs and a set of 140 pairs that did not crosstalk, for which Xtalk achieved an area under the receiver operator characteristic curve of 0.65, a 12% improvement over the closest competing approach. The area under the receiver operator characteristic curve varied with the pathway, suggesting that crosstalk should be evaluated on a pathway-by-pathway level. We also analyzed an extended set of 658 pathway pairs in KEGG and a set of more than 7000 pathway pairs in NCI-PID. For the top-ranking pairs, we found substantial support in the literature (81% for KEGG and 78% for NCI-PID). We provide examples of networks computed by Xtalk that accurately recovered known mechanisms of crosstalk. Availability and implementation: The XTALK software is available at http://bioinformatics.cs.vt.edu/~murali/software. Crosstalk networks are available at http://graphspace.org/graphs?tags=2015-bioinformatics-xtalk. Contact: ategge@vt.edu, murali@cs.vt.edu Supplementary information: Supplementary data are available at Bioinformatics online. PMID:26400040
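The area under the ROC curve reported against a gold standard of positive and negative pathway pairs can be computed directly from pairwise comparisons of scores. The sketch below shows that general calculation only; it assumes that a higher score means stronger predicted crosstalk, and the function name and scores are hypothetical rather than taken from Xtalk.

```python
def roc_auc(pos_scores, neg_scores):
    """AUC as the probability that a random positive outranks a random negative.

    Ties count as half a win (the Mann-Whitney U formulation).
    ASSUMPTION: higher score = stronger predicted crosstalk.
    """
    if not pos_scores or not neg_scores:
        raise ValueError("both positive and negative examples are required")
    wins = 0.0
    for p in pos_scores:
        for n in neg_scores:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos_scores) * len(neg_scores))


# Hypothetical scores for gold-standard crosstalking and non-crosstalking pairs.
auc = roc_auc([0.9, 0.4], [0.5, 0.1])
```

An AUC of 0.5 corresponds to random ranking and 1.0 to perfect separation of the two gold-standard sets.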
The use of functional chemical-protein associations to identify multi-pathway renoprotectants.
Xu, Jia; Meng, Kexin; Zhang, Rui; Yang, He; Liao, Chang; Zhu, Wenliang; Jiao, Jundong
2014-01-01
Typically, most nephropathies can be categorized as complex human diseases in which the cumulative effect of multiple minor genes, combined with environmental and lifestyle factors, determines the disease phenotype. Thus, multi-target drugs would be more likely to facilitate comprehensive renoprotection than single-target agents. In this study, functional chemical-protein association analysis was performed to retrieve multi-target drugs of high pathway wideness from the STITCH 3.1 database. Pathway wideness of a drug evaluated the efficiency of regulation of Kyoto Encyclopedia of Genes and Genomes (KEGG) pathways in quantity. We identified nine experimentally validated renoprotectants that exerted remarkable impact on KEGG pathways by targeting a limited number of proteins. We selected curcumin as an illustrative compound to display the advantage of multi-pathway drugs on renoprotection. We compared curcumin with hemin, an agonist of heme oxygenase-1 (HO-1), which significantly affects only one KEGG pathway, porphyrin and chlorophyll metabolism (adjusted p = 1.5×10⁻⁵). At the same concentration (10 µM), both curcumin and hemin equivalently mitigated oxidative stress in H2O2-treated glomerular mesangial cells. The benefit of using hemin was derived from its agonistic effect on HO-1, providing relief from oxidative stress. Selective inhibition of HO-1 completely blocked the action of hemin but not that of curcumin, suggesting simultaneous multi-pathway intervention by curcumin. Curcumin also increased cellular autophagy levels, enhancing its protective effect; however, hemin had no effects. Based on the fact that the dysregulation of multiple pathways is implicated in the etiology of complex diseases, we proposed a feasible method for identifying multi-pathway drugs from compounds with validated targets. Our efforts will help identify multi-pathway agents capable of providing comprehensive protection against renal injuries.
ProCarDB: a database of bacterial carotenoids.
Nupur, L N U; Vats, Asheema; Dhanda, Sandeep Kumar; Raghava, Gajendra P S; Pinnaka, Anil Kumar; Kumar, Ashwani
2016-05-26
Carotenoids have important functions in bacteria, ranging from harvesting light energy to neutralizing oxidants and acting as virulence factors. However, information pertaining to the carotenoids is scattered throughout the literature. Furthermore, information about the genes/proteins involved in the biosynthesis of carotenoids has tremendously increased in the post-genomic era. A web server providing the information about microbial carotenoids in a structured manner is required and will be a valuable resource for the scientific community working with microbial carotenoids. Here, we have created a manually curated, open access, comprehensive compilation of bacterial carotenoids named ProCarDB (Prokaryotic Carotenoid Database). ProCarDB includes 304 unique carotenoids arising from 50 biosynthetic pathways distributed among 611 prokaryotes. ProCarDB provides important information on carotenoids, such as 2D and 3D structures, molecular weight, molecular formula, SMILES, InChI, InChIKey, IUPAC name, KEGG Id, PubChem Id, and ChEBI Id. The database also provides NMR data, UV-vis absorption data, IR data, MS data and HPLC data that play key roles in the identification of carotenoids. An important feature of this database is the extension of biosynthetic pathways from the literature and through the presence of the genes/enzymes in different organisms. The information contained in the database was mined from published literature and databases such as KEGG, PubChem, ChEBI, LipidBank, LPSN, and Uniprot. The database integrates user-friendly browsing and searching with carotenoid analysis tools to help the user. We believe that this database will serve as a major information centre for researchers working on bacterial carotenoids.
Bioinformatics analysis of transcriptome dynamics during growth in angus cattle longissimus muscle.
Moisá, Sonia J; Shike, Daniel W; Graugnard, Daniel E; Rodriguez-Zas, Sandra L; Everts, Robin E; Lewin, Harris A; Faulkner, Dan B; Berger, Larry L; Loor, Juan J
2013-01-01
Transcriptome dynamics in the longissimus muscle (LM) of young Angus cattle were evaluated at 0, 60, 120, and 220 days from early-weaning. Bioinformatic analysis was performed using the dynamic impact approach (DIA) by means of Kyoto Encyclopedia of Genes and Genomes (KEGG) and Database for Annotation, Visualization and Integrated Discovery (DAVID) databases. Between 0 and 120 days (growing phase) most of the highly-impacted pathways (e.g., ascorbate and aldarate metabolism, drug metabolism, cytochrome P450 and Retinol metabolism) were inhibited. The phase between 120 and 220 days (finishing phase) was characterized by the most striking differences with 3,784 differentially expressed genes (DEGs). Analysis of those DEGs revealed that the most impacted KEGG canonical pathway was glycosylphosphatidylinositol (GPI)-anchor biosynthesis, which was inhibited. Furthermore, inhibition of calpastatin and activation of tyrosine aminotransferase ubiquitination at 220 days promotes proteasomal degradation, while the concurrent activation of ribosomal proteins promotes protein synthesis. Therefore, the balance of these processes likely results in a steady-state of protein turnover during the finishing phase. Results underscore the importance of transcriptome dynamics in LM during growth.
BioNetSim: a Petri net-based modeling tool for simulations of biochemical processes.
Gao, Junhui; Li, Li; Wu, Xiaolin; Wei, Dong-Qing
2012-03-01
BioNetSim, a Petri net-based software tool for modeling and simulating biochemical processes, has been developed. Its design and implementation are presented in this paper, including logic construction and real-time access to the KEGG (Kyoto Encyclopedia of Genes and Genomes) and BioModel databases. Furthermore, glycolysis is simulated as an example of its application. BioNetSim is a helpful tool for researchers to download data, model biological networks, and simulate complicated biochemical processes. Gene regulatory networks, metabolic pathways, signaling pathways, and kinetics of cell interaction are all available in BioNetSim, which makes modeling more efficient and effective. Like other Petri net-based software, BioNetSim performs well in graphical application and mathematical construction. Moreover, it offers several notable advantages. (1) It creates models in a database. (2) It provides real-time access to KEGG and BioModel and transfers data to a Petri net. (3) It provides qualitative analysis, such as computation of constants. (4) It generates graphs for tracing the concentration of every molecule during the simulation process.
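For readers unfamiliar with Petri nets, the sketch below shows the basic token-firing rule on a single biochemical step. It is a generic, minimal illustration and not BioNetSim code; the data layout, function names, and the example marking are all hypothetical.

```python
def enabled(marking, transition):
    """A transition is enabled when every input place holds enough tokens."""
    return all(marking.get(place, 0) >= weight
               for place, weight in transition["in"].items())


def fire(marking, transition):
    """Return the new marking after firing; the input marking is not modified."""
    if not enabled(marking, transition):
        raise ValueError("transition is not enabled")
    new_marking = dict(marking)
    for place, weight in transition["in"].items():
        new_marking[place] = new_marking.get(place, 0) - weight  # consume
    for place, weight in transition["out"].items():
        new_marking[place] = new_marking.get(place, 0) + weight  # produce
    return new_marking


# First step of glycolysis as a transition: glucose + ATP -> G6P + ADP.
hexokinase = {"in": {"glucose": 1, "ATP": 1}, "out": {"G6P": 1, "ADP": 1}}

start = {"glucose": 2, "ATP": 1}  # hypothetical initial token counts
after = fire(start, hexokinase)   # one firing uses up the only ATP token
can_fire_again = enabled(after, hexokinase)
```

Places stand for molecular species, tokens for discrete amounts, and transitions for reactions; a simulator repeatedly selects and fires enabled transitions.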
Exercise-driven metabolic pathways in healthy cartilage.
Blazek, A D; Nam, J; Gupta, R; Pradhan, M; Perera, P; Weisleder, N L; Hewett, T E; Chaudhari, A M; Lee, B S; Leblebicioglu, B; Butterfield, T A; Agarwal, S
2016-07-01
Exercise is vital for maintaining cartilage integrity in healthy joints. Here we examined the exercise-driven transcriptional regulation of genes in healthy rat articular cartilage to dissect the metabolic pathways responsible for the potential benefits of exercise. Transcriptome-wide gene expression in the articular cartilage of healthy Sprague-Dawley female rats exercised daily (low intensity treadmill walking) for 2, 5, or 15 days was compared to that of non-exercised rats, using Affymetrix GeneChip arrays. Database for Annotation, Visualization and Integrated Discovery (DAVID) was used for Gene Ontology (GO)-term enrichment and Functional Annotation analysis of differentially expressed genes (DEGs). Kyoto Encyclopedia of Genes and Genomes (KEGG) pathway mapper was used to identify the metabolic pathways regulated by exercise. Microarray analysis revealed that exercise induced 644 DEGs in healthy articular cartilage. The DAVID bioinformatics tool demonstrated high prevalence of functional annotation clusters with greater enrichment scores and GO-terms associated with extracellular matrix (ECM) biosynthesis/remodeling and inflammation/immune response. The KEGG database revealed that exercise regulates 147 metabolic pathways representing molecular interaction networks for Metabolism, Genetic Information Processing, Environmental Information Processing, Cellular Processes, Organismal Systems, and Diseases. These pathways collectively supported the complex regulation of the beneficial effects of exercise on the cartilage. Overall, the findings highlight that exercise is a robust transcriptional regulator of a wide array of metabolic pathways in healthy cartilage. The major actions of exercise involve ECM biosynthesis/cartilage strengthening and attenuation of inflammatory pathways to provide prophylaxis against onset of arthritic diseases in healthy cartilage. Copyright © 2016 Osteoarthritis Research Society International. Published by Elsevier Ltd. All rights reserved.
Hu, Wei Qi; Wang, Wei; Fang, Di Long; Yin, Xue Feng
2018-05-24
BACKGROUND We screened the potential molecular targets and investigated the molecular mechanisms of hepatocellular carcinoma (HCC). MATERIAL AND METHODS Microarray data of GSE47786, comprising samples of the HepG2 human hepatoma cell line treated with 40 μM berberine and control cells treated with 0.08% DMSO, were downloaded from the GEO database. Gene ontology (GO) and Kyoto Encyclopedia of Genes and Genomes (KEGG) pathway enrichment analyses were performed; the protein-protein interaction (PPI) networks were constructed using the STRING database and Cytoscape; the genetic alteration, neighboring genes networks, and survival analysis of hub genes were explored by cBio portal; and the mRNA expression levels of hub genes were obtained from the Oncomine databases. RESULTS A total of 56 upregulated and 8 downregulated DEGs were identified. The GO analysis results were significantly enriched in cell-cycle arrest, regulation of transcription, DNA-dependent, protein amino acid phosphorylation, cell cycle, and apoptosis. The KEGG pathway analysis showed that DEGs were enriched in MAPK signaling pathway, ErbB signaling pathway, and p53 signaling pathway. JUN, EGR1, MYC, and CDKN1A were identified as hub genes in PPI networks. The genetic alteration of hub genes was mainly concentrated in amplification. TP53, NDRG1, and MAPK15 were found in neighboring genes networks. Altered genes had worse overall survival and disease-free survival than unaltered genes. The expressions of EGR1, MYC, and CDKN1A were significantly increased, but expression of JUN was not, in the Roessler Liver datasets. CONCLUSIONS We found that JUN, EGR1, MYC, and CDKN1A might be used as diagnostic and therapeutic molecular biomarkers and broaden our understanding of the molecular mechanisms of HCC.
Exploring of the molecular mechanism of rhinitis via bioinformatics methods
Song, Yufen; Yan, Zhaohui
2018-01-01
The aim of this study was to analyze gene expression profiles for exploring the function and regulatory network of differentially expressed genes (DEGs) in pathogenesis of rhinitis by a bioinformatics method. The gene expression profile of GSE43523 was downloaded from the Gene Expression Omnibus database. The dataset contained 7 seasonal allergic rhinitis samples and 5 non-allergic normal samples. DEGs between rhinitis samples and normal samples were identified via the limma package of R. The WebGestalt database was used to identify enriched Kyoto Encyclopedia of Genes and Genomes (KEGG) pathways of the DEGs. The differentially co-expressed pairs of the DEGs were identified via the DCGL package in R, and the differential co-expression network was constructed based on these pairs. A protein-protein interaction (PPI) network of the DEGs was constructed based on the Search Tool for the Retrieval of Interacting Genes database. A total of 263 DEGs were identified in rhinitis samples compared with normal samples, including 125 downregulated ones and 138 upregulated ones. The DEGs were enriched in 7 KEGG pathways. A total of 308 differential co-expression gene pairs were obtained. A differential co-expression network was constructed, containing 212 nodes. In total, 148 PPI pairs of the DEGs were identified, and a PPI network was constructed based on these pairs. Bioinformatics methods could help us identify significant genes and pathways related to the pathogenesis of rhinitis. Steroid biosynthesis pathway and metabolic pathways might play important roles in the development of allergic rhinitis (AR). Genes such as CDC42 effector protein 5, solute carrier family 39 member A11 and PR/SET domain 10 might also be associated with the pathogenesis of AR, which provides references for the molecular mechanisms of AR. PMID:29257233
2014-11-14
2 Xueping Yu,1 Bhaskar Dutta,1 Jacob D. Feala,1 Kara Schmid,2 Jitendra Dave,2 Gregory J. Tawa,1 Anders Wallqvist,1 and Jaques Reifman1* 1Department of...pathway.html), downloaded in December, 2011. KEGG, one of the largest and most widely used publicly available pathway databases, annotates pathways...Ansari MA, Roberts KN, Scheff SW. 2008b. A time course of contusion-induced oxidative stress and synaptic proteins in cortex in a rat model of TBI. J
Gene expression analysis of colorectal cancer by bioinformatics strategy.
Cui, Meng; Yuan, Junhua; Li, Jun; Sun, Bing; Li, Tao; Li, Yuantao; Wu, Guoliang
2014-10-01
We used bioinformatics technology to analyze gene expression profiles involved in colorectal cancer tissue samples and healthy controls. In this paper, we downloaded the gene expression profile GSE4107 from Gene Expression Omnibus (GEO) database, in which a total of 22 chips were available, including normal colonic mucosa tissue from normal healthy donors (n=10), colorectal cancer tissue samples from colorectal patients (n=33). To further understand the biological functions of the screened DEGs, KEGG pathway enrichment analysis was conducted. Then we built a transcriptome network to study differentially co-expressed links. A total of 3151 DEGs of CRC were selected. In addition, a total of 164 DCGs (Differentially Co-expressed Genes) and 29279 DCLs (Differentially Co-expressed Links) were obtained. Furthermore, the significantly enriched KEGG pathways were Endocytosis, Calcium signaling pathway, Vascular smooth muscle contraction, Linoleic acid metabolism, Arginine and proline metabolism, Inositol phosphate metabolism and MAPK signaling pathway. Our results show that the generation of CRC involves multiple genes, TFs and pathways. Several signal and immune pathways are linked to CRC and give us more clues in the process of CRC. Hence, our work may pave the way for novel diagnosis of CRC and provide theoretical guidance for cancer therapy.
Identifying pathways affected by cancer mutations.
Iengar, Prathima
2017-12-16
Mutations in 15 cancers, sourced from the COSMIC Whole Genomes database, and 297 human pathways, arranged into pathway groups based on the processes they orchestrate, and sourced from the KEGG pathway database, have together been used to identify pathways affected by cancer mutations. Genes studied in ≥15, and mutated in ≥10 samples of a cancer have been considered recurrently mutated, and pathways with recurrently mutated genes have been considered affected in the cancer. Novel doughnut plots have been presented which enable visualization of the extent to which pathways and genes, in each pathway group, are targeted, in each cancer. The 'organismal systems' pathway group (including organism-level pathways; e.g., nervous system) is the most targeted, more than even the well-recognized signal transduction, cell-cycle and apoptosis, and DNA repair pathway groups. The important, yet poorly-recognized, role played by the group merits attention. Pathways affected in ≥7 cancers yielded insights into processes affected. Copyright © 2017 Elsevier Inc. All rights reserved.
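The two thresholds stated in the abstract (a gene studied in at least 15 samples and mutated in at least 10 samples of a cancer is recurrently mutated; a pathway containing such a gene is affected) amount to a simple filter. The sketch below applies them to made-up inputs; the function names, data layout, gene counts, and pathway memberships are all hypothetical.

```python
def recurrently_mutated(gene_stats, min_studied=15, min_mutated=10):
    """Genes studied in >= min_studied and mutated in >= min_mutated samples.

    gene_stats: dict gene -> (samples_studied, samples_mutated) for one cancer.
    """
    return {
        gene
        for gene, (studied, mutated) in gene_stats.items()
        if studied >= min_studied and mutated >= min_mutated
    }


def affected_pathways(pathways, recurrent_genes):
    """Pathways that contain at least one recurrently mutated gene."""
    return {name for name, genes in pathways.items()
            if recurrent_genes & set(genes)}


# Hypothetical per-cancer counts and pathway memberships (illustrative only).
gene_stats = {"TP53": (120, 45), "KRAS": (120, 9), "GENE_X": (12, 11)}
pathways = {
    "p53 signaling": ["TP53", "CDKN1A"],
    "MAPK signaling": ["KRAS", "MAPK1"],
}

recurrent = recurrently_mutated(gene_stats)
affected = affected_pathways(pathways, recurrent)
```

Here `GENE_X` is excluded despite 11 mutated samples because it was studied in fewer than 15, which shows why both thresholds are applied together.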
Du, Lianming; Li, Wujiao; Fan, Zhenxin; Shen, Fujun; Yang, Mingyu; Wang, Zili; Jian, Zuoyi; Hou, Rong; Yue, Bisong; Zhang, Xiuyue
2015-07-01
The giant panda (Ailuropoda melanoleuca) is one of the most famous flagship species for conservation, and its draft genome has recently been assembled. However, the transcriptome is not yet available. In this study, the blood transcriptomes of three pandas were characterized and about 160 million sequencing reads were generated using Illumina HiSeq 2000 paired-end sequencing technology. The assembly yielded 92 598 transcripts with an average length of 1626 bp and N50 length of 2842 bp. Based on a sequence similarity search against nonredundant (nr) protein database, a total of 38 522 (41.6%) transcripts were annotated. Of these annotated transcripts, 25 142 and 8272 transcripts were assigned to gene ontology terms and clusters of orthologous group, respectively. A search against the Kyoto Encyclopedia of Genes and Genomes Pathway database (KEGG) indicated that 9098 (9.83%) transcripts mapped to 324 KEGG pathways, and the best represented functional categories of pathways were signal transduction and immune system. We have also identified 23 460 microsatellites, 43 560 SNPs as well as 21 456 alternative splicing events in the assembly. Additionally, a total of 24 341 complete open reading frames (ORFs) were detected from the assembly where 1492 ORFs were found to be novel gene loci as these have not been annotated so far in any public database. © 2014 John Wiley & Sons Ltd.
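The N50 length quoted for the assembly is the length of the shortest transcript in the smallest set of longest transcripts that together cover at least half of the total assembled bases. A minimal sketch of that calculation, with made-up lengths rather than the panda data:

```python
def n50(lengths):
    """Return the N50 of a collection of sequence lengths."""
    ordered = sorted(lengths, reverse=True)
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half_total:
            return length
    raise ValueError("lengths must be non-empty")

# Hypothetical transcript lengths in bp.
example = n50([2, 2, 2, 3, 3, 4, 8, 8])
```

Because long transcripts dominate the cumulative sum, N50 is typically larger than the mean length, which is consistent with the 2842 bp N50 and 1626 bp average reported above.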
Analysis of cancer-related lncRNAs using gene ontology and KEGG pathways.
Chen, Lei; Zhang, Yu-Hang; Lu, Guohui; Huang, Tao; Cai, Yu-Dong
2017-02-01
Cancer is a disease that involves abnormal cell growth and can invade or metastasize to other tissues. It is known that several factors are related to its initiation, proliferation, and invasiveness. Recently, it has been reported that long non-coding RNAs (lncRNAs) can participate in specific functional pathways and further regulate the biological function of cancer cells. Studies on lncRNAs are therefore helpful for uncovering the underlying mechanisms of cancer biological processes. We investigated cancer-related lncRNAs using gene ontology (GO) terms and KEGG pathway enrichment scores of neighboring genes that are co-expressed with the lncRNAs, extracting important GO terms and KEGG pathways that can help identify cancer-related lncRNAs. The enrichment theory of GO terms and KEGG pathways was adopted to encode each lncRNA. Then, feature selection methods were employed to analyze these features and obtain the key GO terms and KEGG pathways. The analysis indicated that the extracted GO terms and KEGG pathways are closely related to several cancer-associated processes, such as hormone-associated pathways, energy-associated pathways, and ribosome-associated pathways. They can also accurately predict cancer-related lncRNAs. This study provides novel insight into how lncRNAs may affect tumorigenesis and which pathways may play important roles in it. These results could help in understanding the biological mechanisms of lncRNAs and in treating cancer. Copyright © 2017 Elsevier B.V. All rights reserved.
[Transcriptome analysis of Dunaliella viridis].
Zhu, Shuai-qi; Gong, Yi-fu; Hang, Yu-qing; Liu, Hao; Wang, He-yu
2015-08-01
In order to understand the gene information, function, haloduric pathway (glycerolipid metabolism) and related key genes of Dunaliella viridis, we used Illumina HiSeq™ 2000 high-throughput sequencing technology to sequence its transcriptome. Trinity software was used to assemble the data into transcripts. Based on the Clusters of Orthologous Groups (COG), Gene Ontology (GO) and Kyoto Encyclopedia of Genes and Genomes (KEGG) databases, we carried out functional annotation and classification, pathway annotation, and open reading frame (ORF) sequence prediction of the transcripts. The key genes in glycerolipid metabolism were analyzed. The results showed that 81,593 transcripts were found, and 77,117 ORF sequences were predicted, accounting for 94.50% of all transcripts. COG classification results showed that 16,569 transcripts were assigned to 24 categories. GO classification annotated 76,436 transcripts. The number of transcripts for biological processes was 30,678, accounting for 40.14% of all transcripts. KEGG pathway analysis showed that 26,428 transcripts were annotated to 317 pathways, and 131 pathways were related to metabolism, accounting for 41.32% of all annotated pathways. Only one transcript was annotated as coding for the key enzyme dihydroxyacetone kinase involved in the glycerolipid pathway. This enzyme could be related to glycerol biosynthesis under salt stress. This study further improved the gene information and laid the foundation for metabolic pathway research in Dunaliella viridis.
Xie, Qi; Niu, Jun; Xu, Xilin; Xu, Lixin; Zhang, Yinbing; Fan, Bo; Liang, Xiaohong; Zhang, Lijuan; Yin, Shuxia; Han, Liebao
2015-01-01
Japanese lawngrass (Zoysia japonica Steud.) is an important warm-season turfgrass that is able to survive in a range of soils, from infertile sands to clays, and to grow well under saline conditions. However, little is known about the molecular mechanisms involved in its resistance to salt stress. Here, we used high-throughput RNA sequencing (RNA-seq) to investigate the changes in gene expression of Zoysia grass at high NaCl concentrations. We first constructed two sequencing libraries, from control and NaCl-treated samples, and sequenced them using the Illumina HiSeq™ 2000 platform. Approximately 157.20 million paired-end reads with a total length of 68.68 Mb were obtained. Subsequently, 32,849 unigenes with an N50 length of 1781 bp were assembled using Trinity. Furthermore, three public databases, the Kyoto Encyclopedia of Genes and Genomes (KEGG), Swiss-prot, and Clusters of Orthologous Groups (COGs), were used for gene function analysis and enrichment. The annotations covered 57 Gene Ontology (GO) terms, 120 KEGG pathways, and 24 COGs. Compared with the control, 1455 genes were significantly differentially expressed (false discovery rate ≤ 0.01, |log2Ratio| ≥ 1) in the NaCl-treated samples. These genes were enriched in 10 KEGG pathways and 73 GO terms, and assigned to 25 COG categories. Using high-throughput next-generation sequencing, we built a database as a global transcript resource for Z. japonica Steud. roots. The results of this study will advance our understanding of the early salt response in Japanese lawngrass roots. PMID:26347751
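The significance cutoff quoted in this abstract (false discovery rate ≤ 0.01 and |log2Ratio| ≥ 1) amounts to a simple two-condition filter. The sketch below assumes that log2Ratio means log2 of treated over control expression and that both expression values are positive; the function name and the example values are hypothetical.

```python
from math import log2

def is_deg(fdr, control_expr, treated_expr, max_fdr=0.01, min_abs_log2=1.0):
    """Apply the FDR and fold-change thresholds to one gene.

    Assumes control_expr and treated_expr are strictly positive.
    """
    log2_ratio = log2(treated_expr / control_expr)
    return fdr <= max_fdr and abs(log2_ratio) >= min_abs_log2

# Hypothetical genes: (FDR, control expression, NaCl-treated expression).
genes = {
    "gene_up": (0.005, 10.0, 20.0),     # two-fold up, significant
    "gene_small": (0.005, 10.0, 15.0),  # fold change too small
    "gene_noisy": (0.05, 10.0, 40.0),   # FDR too high
    "gene_down": (0.001, 20.0, 5.0),    # four-fold down, significant
}
degs = sorted(name for name, args in genes.items() if is_deg(*args))
```

A threshold of |log2Ratio| ≥ 1 corresponds to at least a two-fold change in either direction.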
Baten, Abdul; Ngangbam, Ajit Kumar; Waters, Daniel L. E.; Benkendorff, Kirsten
2016-01-01
Dicathais orbita is a mollusc of the Muricidae family and is well known for the production of the expensive dye Tyrian purple and its brominated precursors that have anticancer properties, in addition to choline esters with muscle-relaxing properties. However, the biosynthetic pathways that produce these secondary metabolites in D. orbita are not known. Illumina HiSeq 2000 transcriptome sequencing of hypobranchial glands, prostate glands, albumen glands, capsule glands, and mantle and foot tissues of D. orbita generated over 201 million high quality reads that were de novo assembled into 219,437 contigs. Annotation with reference to the Nr, Swiss-Prot and Kyoto Encyclopedia of Genes and Genomes (KEGG) databases identified candidate-coding regions in 76,152 of these contigs, with transcripts for many enzymes in various metabolic pathways associated with secondary metabolite biosynthesis represented. This study revealed that D. orbita expresses a number of genes associated with indole, sulfur and histidine metabolism pathways that are relevant to Tyrian purple precursor biosynthesis, and many of which were not found in the fully annotated genomes of three other molluscs in the KEGG database. However, there were no matches to known bromoperoxidase enzymes within the D. orbita transcripts. These transcriptome data provide a significant molecular resource for gastropod research in general and Tyrian purple producing Muricidae in particular. PMID:27447649
Prediction and analysis of essential genes using the enrichments of gene ontology and KEGG pathways.
Chen, Lei; Zhang, Yu-Hang; Wang, ShaoPeng; Zhang, YunHua; Huang, Tao; Cai, Yu-Dong
2017-01-01
Identifying essential genes in a given organism is important for research on their fundamental roles in organism survival. Furthermore, if possible, uncovering the links between core functions or pathways and these essential genes will further help us obtain deep insight into the key roles of these genes. In this study, we investigated the essential and non-essential genes reported in a previous study and extracted gene ontology (GO) terms and biological pathways that are important for the determination of essential genes. Through the enrichment theory of GO and KEGG pathways, we encoded each essential/non-essential gene into a vector in which each component represented the relationship between the gene and one GO term or KEGG pathway. To analyze these relationships, the maximum relevance minimum redundancy (mRMR) method was adopted. Then, incremental feature selection (IFS) and a support vector machine (SVM) were employed to extract important GO terms and KEGG pathways. A prediction model was built simultaneously using the extracted GO terms and KEGG pathways, which yielded nearly perfect performance, with a Matthews correlation coefficient of 0.951, for distinguishing essential and non-essential genes. To fully investigate the key factors influencing the fundamental roles of essential genes, the 21 most important GO terms and three KEGG pathways were analyzed in detail. In addition, several genes that were predicted to be essential by our prediction model are provided in this study. We suggest that this study provides more functional and pathway information on the essential genes and provides a new way to investigate related problems.
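The Matthews correlation coefficient reported above summarizes a binary confusion matrix in a single value between -1 and 1. A minimal sketch of the standard formula follows; the confusion-matrix counts are invented for illustration and are not the counts behind the 0.951 figure.

```python
from math import sqrt

def matthews_corrcoef(tp, tn, fp, fn):
    """Matthews correlation coefficient from confusion-matrix counts.

    Returns 0.0 when any marginal total is zero, a common convention.
    """
    denom = sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denom == 0:
        return 0.0
    return (tp * tn - fp * fn) / denom

# Hypothetical counts for an essential vs non-essential gene classifier.
perfect = matthews_corrcoef(tp=5, tn=5, fp=0, fn=0)
chance = matthews_corrcoef(tp=1, tn=1, fp=1, fn=1)
```

Unlike plain accuracy, the coefficient stays informative when the two classes are of very different sizes, which is one reason it is often reported for this kind of prediction task.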
Identification of transcriptional factors and key genes in primary osteoporosis by DNA microarray.
Xie, Wengui; Ji, Lixin; Zhao, Teng; Gao, Pengfei
2015-05-09
A number of genes have been identified as related to primary osteoporosis, while less is known about the comprehensive interactions between regulating genes and proteins. We aimed to identify the differentially expressed genes (DEGs) and regulatory effects of transcription factors (TFs) involved in primary osteoporosis. The gene expression profile GSE35958 was obtained from the Gene Expression Omnibus database, including 5 primary osteoporosis and 4 normal bone tissues. The differentially expressed genes between primary osteoporosis and normal bone tissues were identified using an R package. The TFs of these DEGs were predicted with the Essaghir A method. DAVID (The Database for Annotation, Visualization and Integrated Discovery) was applied to perform the GO (Gene Ontology) and KEGG (Kyoto Encyclopedia of Genes and Genomes) pathway enrichment analysis of DEGs. After analyzing regulatory effects, a regulatory network was built between TFs and the related DEGs. A total of 579 DEGs were screened, including 310 up-regulated genes and 269 down-regulated genes in primary osteoporosis samples. In GO terms, up-regulated genes were enriched most in transcription regulator activity, followed by transcription factor activity. A total of 10 significant pathways were enriched in KEGG analysis, including colorectal cancer, Wnt signaling pathway, Focal adhesion, and MAPK signaling pathway. Moreover, a total of 7 TFs were enriched, of which CTNNB1, SP1, and TP53 regulated the most up-regulated DEGs. The discovery of the enriched TFs might contribute to the understanding of the mechanism of primary osteoporosis. Further research on genes and TFs related to the WNT signaling pathway and MAPK pathway is urgently needed for clinical diagnosis and for directing treatment of primary osteoporosis.
Men, Xin; Ma, Jun; Wu, Tong; Pu, Junyi; Wen, Shaojia; Shen, Jianfeng; Wang, Xun; Wang, Yamin; Chen, Chao; Dai, Penggao
2018-01-01
Tamoxifen (TAM) resistance is an important clinical problem in the treatment of breast cancer. In order to identify the mechanism of TAM resistance in estrogen receptor (ER)-positive breast cancer, we screened the transcriptome using RNA-seq and compared the gene expression profiles between the MCF-7 mamma carcinoma cell line and the TAM-resistant cell line TAMR/MCF-7. A total of 52 significant differentially expressed genes (DEGs) were identified, including SLIT2, ROBO, LHX, KLF, VEGFC, BAMBI, LAMA1, FLT4, PNMT, DHRS2, MAOA and ALDH. The DEGs were annotated in the GO, COG and KEGG databases. Annotation of the function of the DEGs in the KEGG database revealed the top three pathways enriched with the most DEGs: pathways in cancer, the PI3K-AKT pathway, and focal adhesion. Then we compared the gene expression profiles between clinical progressive disease (PD) and complete response (CR) samples from The Cancer Genome Atlas (TCGA). Ten common DEGs were identified by combining the clinical and cellular analysis results. A protein-protein interaction network was applied to analyze the association of the ER signaling pathway with the 10 DEGs. Three significant genes (GFRA3, NPY1R and PTPRN2) were closely related to the ER-related pathway. These significant DEGs regulated many biological activities such as cell proliferation and survival, motility and migration, and tumor cell invasion. The interactions between these DEGs and the drug resistance phenomenon need to be elucidated at a functional level in further studies. Based on our findings, we believe that these DEGs could be therapeutic targets, which can be explored to develop new treatment options. PMID:29423105
Key genes and pathways in measles and their interaction with environmental chemicals.
Zhang, Rongqiang; Jiang, Hualin; Li, Fengying; Su, Ning; Ding, Yi; Mao, Xiang; Ren, Dan; Wang, Jing
2018-06-01
The aim of the present study was to explore key genes that may have a role in the pathology of measles virus infection and to clarify the interaction networks between environmental factors and differentially expressed genes (DEGs). After screening the database of the Gene Expression Omnibus of the National Center for Biotechnology Information, the dataset GSE5808 was downloaded and analyzed. A global normalization method was performed to minimize data inconsistencies and heterogeneity. DEGs during different stages of measles virus infection were explored using R software (v3.4.0). Gene Ontology and Kyoto Encyclopedia of Genes and Genomes (KEGG) pathway analysis of the DEGs were performed using Cytoscape 3.4.0 software. A protein-protein interaction (PPI) network of the DEGs was obtained from the STRING database v9.05. A total of 43 DEGs were obtained from four analyzed sample groups, including 10 highly expressed genes and 33 genes with decreased expression. The most enriched pathways based on KEGG analysis were fatty acid elongation, cytokine-cytokine receptor interaction and RNA degradation. The genes mentioned in the PPI network were mainly associated with protein binding and chemokine activity. A total of 219 chemicals were identified that may, jointly or on their own, interact with the 6 DEGs between the control group and patients with measles (at hospital entry), including benzo(a)pyrene (BaP) and tetrachlorodibenzodioxin (TCDD). In conclusion, the present study revealed that chemokines and environmental chemicals, e.g. BaP and TCDD, may affect the development of measles.
Chen, Qu; Hua, Canfeng; Niu, Liqiong; Geng, Yali; Cai, Liuping; Tao, Shiyu; Ni, Yingdong; Zhao, Ruqian
2018-06-15
Chronic stress severely threatens the welfare and health of animals and humans. In order to study the effects of chronic stress on metabolism, de novo transcriptome sequencing was used to generate the expressed sequence tag dataset for the goat, using next-generation sequencing technology. For this study, consecutive dexamethasone (Dex) injection was used in 10 healthy male goats (body weight 25 ± 1.0 kg) to mimic chronic stress. Ten male goats were randomly assigned to two groups: the Dex group was injected intramuscularly with 0.2 mg/kg Dex for 21 days, and the control (Con) group was injected intramuscularly with the same volume of saline. To elucidate the resulting changes in genes, transcriptome profiling of liver was conducted by analysing samples from three goats of each group using RNA-Seq. A total of 137 differentially expressed genes (DEGs) were identified between the Con group and the Dex group. GO classification showed rhythmic process and hormone secretion in term cellular, and chemoattractant activity in term molecular function had noticeable differences in the proportion between DEGs and all genes. By mapping the DEGs to the COG database, we found that general function prediction only, energy production and conversion, and amino acid transport and metabolism were the most frequently represented functional clusters. We mapped the unigenes to the KEGG pathway database and found most annotated genes were involved in the AMPK signalling pathway as well as pathways in cancer and the insulin signalling pathway. Via KEGG enrichment analysis, we found the DEGs were significantly enriched in the insulin signalling pathway, AMPK signalling pathway and adipocytokine signalling pathway. In addition, these pathways have a close relationship with metabolism, which resulted in metabolic changes in which the identified DEGs may play important roles. These results provide valuable information for further research on the complex molecular mechanisms of dexamethasone in goats and will provide a foundation for future studies. Copyright © 2018 Elsevier B.V. All rights reserved.
Vivar, Juan C; Pemu, Priscilla; McPherson, Ruth; Ghosh, Sujoy
2013-08-01
Unparalleled technological advances have fueled an explosive growth in the scope and scale of biological data and have propelled life sciences into the realm of "Big Data" that cannot be managed or analyzed by conventional approaches. Big Data in the life sciences are driven primarily via a diverse collection of 'omics'-based technologies, including genomics, proteomics, metabolomics, transcriptomics, metagenomics, and lipidomics. Gene-set enrichment analysis is a powerful approach for interrogating large 'omics' datasets, leading to the identification of biological mechanisms associated with observed outcomes. While several factors influence the results from such analysis, the impact from the contents of pathway databases is often under-appreciated. Pathway databases often contain variously named pathways that overlap with one another to varying degrees. Ignoring such redundancies during pathway analysis can lead to the designation of several pathways as being significant due to high content-similarity, rather than truly independent biological mechanisms. Statistically, such dependencies also result in correlated p values and overdispersion, leading to biased results. We investigated the level of redundancies in multiple pathway databases and observed large discrepancies in the nature and extent of pathway overlap. This prompted us to develop the application, ReCiPa (Redundancy Control in Pathway Databases), to control redundancies in pathway databases based on user-defined thresholds. Analysis of genomic and genetic datasets, using ReCiPa-generated overlap-controlled versions of KEGG and Reactome pathways, led to a reduction in redundancy among the top-scoring gene-sets and allowed for the inclusion of additional gene-sets representing possibly novel biological mechanisms. Using obesity as an example, bioinformatic analysis further demonstrated that gene-sets identified from overlap-controlled pathway databases show stronger evidence of prior association to obesity compared to pathways identified from the original databases.
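The idea of controlling pathway redundancy with user-defined thresholds can be illustrated with a small greedy merge. This is a sketch only: the overlap measure (shared genes as a fraction of each pathway), the two threshold names, the default values, and the merge order are assumptions made for illustration and may differ from what ReCiPa actually implements.

```python
def merge_redundant(pathways, max_thr=0.75, min_thr=0.25):
    """Greedily merge pathway pairs whose mutual overlap is high.

    pathways maps name -> iterable of genes (each assumed non-empty).
    Two pathways are merged when the larger of their two overlap
    fractions is >= max_thr and the smaller is >= min_thr.
    """
    merged = {name: set(genes) for name, genes in pathways.items()}
    changed = True
    while changed:
        changed = False
        names = sorted(merged)
        for i, a in enumerate(names):
            for b in names[i + 1:]:
                shared = len(merged[a] & merged[b])
                frac_a = shared / len(merged[a])
                frac_b = shared / len(merged[b])
                if max(frac_a, frac_b) >= max_thr and min(frac_a, frac_b) >= min_thr:
                    # Replace the pair with their union, then rescan.
                    merged[a + "+" + b] = merged.pop(a) | merged.pop(b)
                    changed = True
                    break
            if changed:
                break
    return merged

# Hypothetical gene sets: P1 and P2 overlap heavily, P3 is independent.
toy = {"P1": {"g1", "g2", "g3", "g4"}, "P2": {"g1", "g2", "g3"}, "P3": {"g9"}}
controlled = merge_redundant(toy)
```

Running an enrichment analysis on the merged collection rather than the original one is what reduces the number of near-duplicate pathways among the top hits.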
Zhao, Zheng; Bai, Jing; Wu, Aiwei; Wang, Yuan; Zhang, Jinwen; Wang, Zishan; Li, Yongsheng; Xu, Juan; Li, Xia
2015-01-01
Long non-coding RNAs (lncRNAs) are emerging as key regulators of diverse biological processes and diseases. However, the combinatorial effects of these molecules in a specific biological function are poorly understood. Identifying co-expressed protein-coding genes of lncRNAs would provide ample insight into lncRNA functions. To facilitate such an effort, we have developed Co-LncRNA, which is a web-based computational tool that allows users to identify GO annotations and KEGG pathways that may be affected by co-expressed protein-coding genes of a single or multiple lncRNAs. LncRNA co-expressed protein-coding genes were first identified in publicly available human RNA-Seq datasets, including 241 datasets across 6560 total individuals representing 28 tissue types/cell lines. Then, the lncRNA combinatorial effects in a given GO annotations or KEGG pathways are taken into account by the simultaneous analysis of multiple lncRNAs in user-selected individual or multiple datasets, which is realized by enrichment analysis. In addition, this software provides a graphical overview of pathways that are modulated by lncRNAs, as well as a specific tool to display the relevant networks between lncRNAs and their co-expressed protein-coding genes. Co-LncRNA also supports users in uploading their own lncRNA and protein-coding gene expression profiles to investigate the lncRNA combinatorial effects. It will be continuously updated with more human RNA-Seq datasets on an annual basis. Taken together, Co-LncRNA provides a web-based application for investigating lncRNA combinatorial effects, which could shed light on their biological roles and could be a valuable resource for this community. Database URL: http://www.bio-bigdata.com/Co-LncRNA/ PMID:26363020
Wei, Jiankai; Zhang, Xiaojun; Yu, Yang; Huang, Hao; Li, Fuhua; Xiang, Jianhai
2014-01-01
Penaeid shrimp have a distinctive metamorphosis stage during early development. Although morphological and biochemical studies of this ontogeny have been carried out for decades, research at the gene expression level is still scarce. In this study, we investigated the transcriptomes of five continuous developmental stages in Pacific white shrimp (Litopenaeus vannamei) with high-throughput Illumina sequencing technology. The reads were assembled and clustered into 66,815 unigenes, of which 32,398 have putative homologues in the nr database, 14,981 have been classified into diverse functional categories by Gene Ontology (GO) annotation and 26,257 have been associated with 255 pathways by KEGG pathway mapping. Meanwhile, the differentially expressed genes (DEGs) between adjacent developmental stages were identified and gene expression patterns were clustered. Through GO term enrichment analysis, KEGG pathway enrichment analysis and functional gene profiling, the physiological changes during shrimp metamorphosis could be better understood, especially histogenesis, diet transition, muscle development and exoskeleton reconstruction. In conclusion, this is the first study to characterize the integrated transcriptomic profiles during early development of penaeid shrimp, and these findings will serve as significant references for shrimp developmental biology and aquaculture research. PMID:25197823
Tian, Honglai; Guan, Donghui; Li, Jianmin
2018-06-01
Osteosarcoma (OS), the most common malignant bone tumor, poses a serious health threat to children and adolescents. OS occurrence usually correlates with early metastasis and a high death rate. This study aimed to better understand the mechanism of OS metastasis. Based on the Gene Expression Omnibus (GEO) database, we downloaded 4 expression profile data sets associated with OS metastasis and selected differentially expressed genes. The weighted gene co-expression network analysis (WGCNA) approach allowed us to investigate the module most correlated with OS metastasis. Gene Ontology functional and Kyoto Encyclopedia of Genes and Genomes (KEGG) pathway enrichment analyses were used to annotate the selected OS metastasis-associated genes. We selected 897 differentially expressed genes between the OS metastasis and OS non-metastasis groups. Based on these selected genes, WGCNA further identified 142 genes included in the module most correlated with OS metastasis. Gene Ontology functional and KEGG pathway enrichment analyses showed that the genes significantly associated with OS metastasis were involved in pathways related to insulin-like growth factor binding. Our research identified several potential molecules participating in the metastasis process and factors acting as biomarkers. With this study, we could better explore the mechanism of OS metastasis and discover more therapeutic targets.
Serial analysis of gene expression in a rat lung model of asthma.
Yin, Lei-Miao; Jiang, Gong-Hao; Wang, Yu; Wang, Yan; Liu, Yan-Yan; Jin, Wei-Rong; Zhang, Zen; Xu, Yu-Dong; Yang, Yong-Qing
2008-11-01
The pathogenesis and molecular mechanism underlying asthma remain undetermined. The purpose of this study was to identify genes and pathways involved in the early airway response (EAR) phase of asthma by using serial analysis of gene expression (SAGE). Two SAGE tag libraries of lung tissues derived from a rat model of asthma and controls were generated. Bioinformatic analyses were carried out using the Database for Annotation, Visualization and Integrated Discovery Functional Annotation Tool, Gene Ontology (GO) TreeMachine and Kyoto Encyclopedia of Genes and Genomes (KEGG) pathway analysis. A total of 26 552 SAGE tags of asthmatic rat lung were obtained, of which 12 221 were unique tags. Of the unique tags, 55.5% were matched with known genes. By comparison of the two libraries, 186 differentially expressed tags (P < 0.05) were identified, of which 103 were upregulated and 83 were downregulated. Using the bioinformatic tools these genes were classified into 23 functional groups, 15 KEGG pathways and 37 enriched GO categories. The bioinformatic analyses of gene distribution, enriched categories and the involvement of specific pathways in the SAGE libraries have provided information on regulatory networks of the EAR phase of asthma. Analyses of the regulated genes of interest may inform new hypotheses, increase our understanding of the disease and provide a foundation for future research.
Subramani, Suresh; Kalpana, Raja; Monickaraj, Pankaj Moses; Natarajan, Jeyakumar
2015-04-01
Knowledge of protein-protein interactions (PPI) and of their related pathways is equally important for understanding the biological functions of the living cell. Such information on human proteins is highly desirable for understanding the mechanisms of several diseases such as cancer, diabetes, and Alzheimer's disease. Because much of that information is buried in biomedical literature, an automated text mining system for visualizing human PPI and pathways is highly desirable. In this paper, we present HPIminer, a text mining system for visualizing human protein interactions and pathways from biomedical literature. HPIminer extracts human PPI information and PPI pairs from biomedical literature, and visualizes their associated interactions, networks and pathways using two curated databases, HPRD and KEGG. To our knowledge, HPIminer is the first system to build interaction networks from literature as well as curated databases. Further, the new interactions mined only from literature and not reported earlier in databases are highlighted as new. A comparative study with other similar tools shows that the resultant network is more informative and provides additional information on interacting proteins and their associated networks. Copyright © 2015 Elsevier Inc. All rights reserved.
Chen, L; Yue, J; Han, X; Li, J; Hu, Y
2016-02-01
Intrauterine growth restriction (IUGR) is associated with a reduction in the number of nephrons in neonates, which increases the risk of hypertension. Our previous study showed that ouabain protects the development of the embryonic kidney during IUGR. To explore this molecular mechanism, IUGR was induced in rats by protein and calorie restriction throughout pregnancy, and ouabain was delivered using a mini osmotic pump. RNA sequencing technology was used to identify the differentially expressed genes (DEGs) of the embryonic kidneys. DEGs were submitted to the Database for Annotation, Visualization and Integrated Discovery, and gene ontology enrichment analysis and Kyoto Encyclopedia of Genes and Genomes (KEGG) pathway analysis were conducted. Maternal malnutrition significantly reduced fetal weight, but ouabain treatment had no significant effect on body weight. A total of 322 (177 upregulated and 145 downregulated) DEGs were detected between the control and the IUGR group. Meanwhile, 318 DEGs were found to be differentially expressed (180 increased and 138 decreased) between the IUGR group and the ouabain-treated group. KEGG pathway analysis indicated that maternal undernutrition mainly disrupts the complement and coagulation cascades and the calcium signaling pathway, which could be protected by ouabain treatment. Taken together, these two biological pathways may play an important role in nephrogenesis, indicating potential novel therapeutic targets against the unfavorable effects of IUGR.
Metabolic pathway reconstruction of eugenol to vanillin bioconversion in Aspergillus niger
Srivastava, Suchita; Luqman, Suaib; Khan, Feroz; Chanotiya, Chandan S; Darokar, Mahendra P
2010-01-01
Identification of missing genes or proteins participating in metabolic pathways as enzymes is of great interest. One such class of pathway is involved in the eugenol to vanillin bioconversion. Our goal is to develop an integral approach for identifying the topology of a reference or known pathway in another organism. We successfully identified the missing enzymes and then reconstructed the vanillin biosynthetic pathway in Aspergillus niger. The procedure combines enzyme sequence similarity searches through BLAST homology search with ortholog detection through the COG and KEGG databases. Conservation of protein domains and motifs was searched through the CDD, PFAM and PROSITE databases. Predictions regarding how proteins act in the pathway were validated experimentally and also compared with reported data. The bioconversion of vanillin was screened on UV-TLC plates and later confirmed through GC and GC-MS techniques. We applied a procedure for identifying missing enzymes on the basis of conserved functional motifs and then reconstructed the metabolic pathway in the target organism. Using the vanillin biosynthetic pathway of Pseudomonas fluorescens as a case study, we show how this approach can be used to reconstruct the reference pathway in A. niger; the results were later validated experimentally through chromatography and spectroscopy techniques. PMID:20978605
dEMBF: A Comprehensive Database of Enzymes of Microalgal Biofuel Feedstock
Misra, Namrata; Panda, Prasanna Kumar; Parida, Bikram Kumar; Mishra, Barada Kanta
2016-01-01
Microalgae have attracted wide attention as one of the most versatile renewable feedstocks for production of biofuel. To develop genetically engineered high lipid yielding algal strains, a thorough understanding of the lipid biosynthetic pathway and the underpinning enzymes is essential. In this work, we have systematically mined the genomes of fifteen diverse algal species belonging to Chlorophyta, Heterokontophyta, Rhodophyta, and Haptophyta, to identify and annotate the putative enzymes of lipid metabolic pathway. Consequently, we have also developed a database, dEMBF (Database of Enzymes of Microalgal Biofuel Feedstock), which catalogues the complete list of identified enzymes along with their computed annotation details including length, hydrophobicity, amino acid composition, subcellular location, gene ontology, KEGG pathway, orthologous group, Pfam domain, intron-exon organization, transmembrane topology, and secondary/tertiary structural data. Furthermore, to facilitate functional and evolutionary study of these enzymes, a collection of built-in applications for BLAST search, motif identification, sequence and phylogenetic analysis have been seamlessly integrated into the database. dEMBF is the first database that brings together all enzymes responsible for lipid synthesis from available algal genomes, and provides an integrative platform for enzyme inquiry and analysis. This database will be extremely useful for algal biofuel research. It can be accessed at http://bbprof.immt.res.in/embf. PMID:26727469
Li, Yanyun; Chen, Minjian; Liu, Cuiping; Xia, Yankai; Xu, Bo; Hu, Yanhui; Chen, Ting; Shen, Meiping; Tang, Wei
2018-05-01
Papillary thyroid carcinoma (PTC) is the most common thyroid cancer. The nuclear magnetic resonance (NMR)-based metabolomic technique is the gold standard in metabolite structural elucidation, and can provide different coverage of information compared with other metabolomic techniques. Here, we first conducted an NMR-based metabolomics study of detailed metabolic changes, especially metabolic pathway changes, related to PTC pathogenesis. A 1H NMR-based metabolomic technique was adopted in conjunction with multivariate analysis to analyze matched tumor and normal thyroid tissues obtained from 16 patients. The results were further annotated with the Kyoto Encyclopedia of Genes and Genomes (KEGG) and the Human Metabolome Database, and were then analyzed using the pathway analysis and enrichment analysis modules of MetaboAnalyst 3.0. Based on these analytical techniques, we established principal component analysis (PCA), partial least squares-discriminant analysis (PLS-DA), and orthogonal partial least-squares discriminant analysis (OPLS-DA) models that could discriminate PTC from normal thyroid tissue, and found 15 robust differentiated metabolites from two OPLS-DA models. We identified 8 KEGG pathways and 3 pathways of the small molecular pathway database that were significantly related to PTC by using pathway analysis and enrichment analysis, respectively, through which we identified metabolisms related to PTC including branched chain amino acid metabolism (leucine and valine), other amino acid metabolism (glycine and taurine), glycolysis (lactate), tricarboxylic acid cycle (citrate), choline metabolism (choline, ethanolamine and glycerolphosphocholine) and lipid metabolism (very-low-density lipoprotein and low-density lipoprotein). In conclusion, PTC was characterized by increased glycolysis and an inhibited tricarboxylic acid cycle, increased oncogenic amino acids, as well as abnormal choline and lipid metabolism.
The findings in this study provide new insights into detailed metabolic changes of PTC, and hold great potential in the treatment of PTC.
Key genes and pathways in measles and their interaction with environmental chemicals
Zhang, Rongqiang; Jiang, Hualin; Li, Fengying; Su, Ning; Ding, Yi; Mao, Xiang; Ren, Dan; Wang, Jing
2018-01-01
The aim of the present study was to explore key genes that may have a role in the pathology of measles virus infection and to clarify the interaction networks between environmental factors and differentially expressed genes (DEGs). After screening the database of the Gene Expression Omnibus of the National Center for Biotechnology Information, the dataset GSE5808 was downloaded and analyzed. A global normalization method was performed to minimize data inconsistencies and heterogeneity. DEGs during different stages of measles virus infection were explored using R software (v3.4.0). Gene Ontology and Kyoto Encyclopedia of Genes and Genomes (KEGG) pathway analysis of the DEGs were performed using Cytoscape 3.4.0 software. A protein-protein interaction (PPI) network of the DEGs was obtained from the STRING database v9.05. A total of 43 DEGs were obtained from four analyzed sample groups, including 10 highly expressed genes and 33 genes with decreased expression. The most enriched pathways based on KEGG analysis were fatty acid elongation, cytokine-cytokine receptor interaction and RNA degradation. The genes mentioned in the PPI network were mainly associated with protein binding and chemokine activity. A total of 219 chemicals were identified that may, jointly or on their own, interact with the 6 DEGs between the control group and patients with measles (at hospital entry), including benzo(a)pyrene (BaP) and tetrachlorodibenzodioxin (TCDD). In conclusion, the present study revealed that chemokines and environmental chemicals, e.g. BaP and TCDD, may affect the development of measles. PMID:29805511
Zhang, Bofei; Hu, Senyang; Baskin, Elizabeth; Patt, Andrew; Siddiqui, Jalal K; Mathé, Ewy A
2018-02-22
The value of metabolomics in translational research is undeniable, and metabolomics data are increasingly generated in large cohorts. The functional interpretation of disease-associated metabolites though is difficult, and the biological mechanisms that underlie cell type or disease-specific metabolomics profiles are oftentimes unknown. To help fully exploit metabolomics data and to aid in its interpretation, analysis of metabolomics data with other complementary omics data, including transcriptomics, is helpful. To facilitate such analyses at a pathway level, we have developed RaMP (Relational database of Metabolomics Pathways), which combines biological pathways from the Kyoto Encyclopedia of Genes and Genomes (KEGG), Reactome, WikiPathways, and the Human Metabolome DataBase (HMDB). To the best of our knowledge, an off-the-shelf, public database that maps genes and metabolites to biochemical/disease pathways and can readily be integrated into other existing software is currently lacking. For consistent and comprehensive analysis, RaMP enables batch and complex queries (e.g., list all metabolites involved in glycolysis and lung cancer), can readily be integrated into pathway analysis tools, and supports pathway overrepresentation analysis given a list of genes and/or metabolites of interest. For usability, we have developed a RaMP R package (https://github.com/Mathelab/RaMP-DB), including a user-friendly RShiny web application, that supports basic simple and batch queries, pathway overrepresentation analysis given a list of genes or metabolites of interest, and network visualization of gene-metabolite relationships. The package also includes the raw database file (mysql dump), thereby providing a stand-alone downloadable framework for public use and integration with other tools. In addition, the Python code needed to recreate the database on another system is also publicly available (https://github.com/Mathelab/RaMP-BackEnd). Updates for databases in RaMP will be checked multiple times a year and RaMP will be updated accordingly. PMID:29470400
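The pathway overrepresentation analysis mentioned in the RaMP abstract is commonly computed as a one-sided hypergeometric (Fisher-type) test. The sketch below illustrates that general idea only; it is not taken from the RaMP package, and the function name, the analyte identifiers, and the toy counts are all hypothetical.

```python
from math import comb


def hypergeom_pvalue(universe: int, in_pathway: int, selected: int, overlap: int) -> float:
    """Return P(X >= overlap) when `selected` analytes are drawn without
    replacement from `universe` analytes, `in_pathway` of which belong to the pathway."""
    upper = min(in_pathway, selected)
    total = comb(universe, selected)
    tail = sum(
        comb(in_pathway, k) * comb(universe - in_pathway, selected - k)
        for k in range(overlap, upper + 1)
    )
    return tail / total


# Toy, made-up data: 20 background analytes, 5 of them annotated to one pathway.
background = {f"m{i}" for i in range(20)}
pathway = {"m0", "m1", "m2", "m3", "m4"}
hits = {"m0", "m1", "m2", "m10"}  # hypothetical analytes of interest

overlap = len(hits & pathway)
p = hypergeom_pvalue(len(background), len(pathway), len(hits), overlap)
print(f"overlap={overlap} p={p:.4f}")
```

In practice the p-values for all tested pathways would also need multiple-testing correction, which this sketch leaves out.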
Wang, Anping; Zhang, Guibin
2017-11-01
Differentially expressed genes between glioblastoma (GBM) cells and normal human brain cells were investigated, and pathway analysis and protein interaction network analysis were performed for these genes. The GSE12657 and GSE42656 gene chips, which contain gene expression profiles of GBM, were obtained from the Gene Expression Omnibus (GEO) database of the National Center for Biotechnology Information (NCBI). The 'limma' package in 'R' software was used to analyze the differentially expressed genes in the two gene chips, and gene integration was performed using the 'RobustRankAggreg' package. Finally, pheatmap software was used for heatmap analysis, and Cytoscape, DAVID, STRING and KOBAS were used for protein-protein interaction, Gene Ontology (GO) and KEGG analyses. The results were as follows: i) 702 differentially expressed genes were identified in GSE12657, of which 548 were significantly upregulated and 154 were significantly downregulated (p<0.01, fold-change >1), and 1,854 differentially expressed genes were identified in GSE42656, of which 1,068 were significantly upregulated and 786 were significantly downregulated (p<0.01, fold-change >1). A total of 167 differentially expressed genes, including 100 upregulated genes and 67 downregulated genes, were identified after gene integration, and these genes showed significantly different expression levels in GBM compared with normal human brain cells (p<0.05). ii) Interactions between the protein products of 101 differentially expressed genes were identified using STRING, and an expression network was established. A key gene, CALM3, was identified by Cytoscape software. iii) GO enrichment analysis showed that differentially expressed genes were mainly enriched in 'neurotransmitter:sodium symporter activity' and 'neurotransmitter transporter activity', which can affect the activity of neurotransmitter transportation. KEGG pathway analysis showed that the differentially expressed genes were mainly enriched in 'protein processing in endoplasmic reticulum', which can affect protein processing in the endoplasmic reticulum. In summary: i) 167 differentially expressed genes were identified from two gene chips after integration; and ii) a protein interaction network was established, and GO and KEGG pathway analyses were successfully performed to identify and annotate the key gene, which provides new insights for studies on GBM at the gene level.
Using Bioinformatic Approaches to Identify Pathways Targeted by Human Leukemogens
Thomas, Reuben; Phuong, Jimmy; McHale, Cliona M.; Zhang, Luoping
2012-01-01
We have applied bioinformatic approaches to identify pathways common to chemical leukemogens and to determine whether leukemogens could be distinguished from non-leukemogenic carcinogens. From all known and probable carcinogens classified by IARC and NTP, we identified 35 carcinogens that were associated with leukemia risk in human studies and 16 non-leukemogenic carcinogens. Using data on gene/protein targets available in the Comparative Toxicogenomics Database (CTD) for 29 of the leukemogens and 11 of the non-leukemogenic carcinogens, we analyzed for enrichment of all 250 human biochemical pathways in the Kyoto Encyclopedia of Genes and Genomes (KEGG) database. The top pathways targeted by the leukemogens included metabolism of xenobiotics by cytochrome P450, glutathione metabolism, neurotrophin signaling pathway, apoptosis, MAPK signaling, Toll-like receptor signaling and various cancer pathways. The 29 leukemogens formed 18 distinct clusters comprising 1 to 3 chemicals that did not correlate with known mechanism of action or with structural similarity as determined by 2D Tanimoto coefficients in the PubChem database. Unsupervised clustering and one-class support vector machines, based on the pathway data, were unable to distinguish the 29 leukemogens from 11 non-leukemogenic known and probable IARC carcinogens. However, using two-class random forests to estimate leukemogen and non-leukemogen patterns, we estimated a 76% chance of distinguishing a random leukemogen/non-leukemogen pair from each other. PMID:22851955
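The structural similarity measure named in the abstract above, the 2D Tanimoto coefficient, is the size of the intersection of two binary fingerprints divided by the size of their union. The following is a minimal sketch of that formula; the fingerprints are invented bit positions, not real PubChem fingerprints, and the function name is hypothetical.

```python
def tanimoto(a: set[int], b: set[int]) -> float:
    """Tanimoto similarity of two fingerprints given as sets of 'on' bit positions."""
    if not a and not b:
        return 1.0  # convention assumed here for two empty fingerprints
    return len(a & b) / len(a | b)


# Made-up fingerprints for two hypothetical chemicals.
fp_x = {1, 4, 7, 9}
fp_y = {1, 4, 8, 9, 12}

similarity = tanimoto(fp_x, fp_y)
print(similarity)
```

A value of 1 means identical fingerprints and 0 means no shared bits, so a clustering by this measure groups chemicals with similar substructure patterns.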
The Use of Gene Ontology Term and KEGG Pathway Enrichment for Analysis of Drug Half-Life
Chen, Lei; Lu, Jing; Kong, XiangYin; Huang, Tao; Li, HaiPeng
2016-01-01
A drug’s biological half-life is defined as the time required for the human body to metabolize or eliminate 50% of the initial drug dosage. Correctly measuring the half-life of a given drug is helpful for the safe and accurate usage of the drug. In this study, we investigated which gene ontology (GO) terms and biological pathways were highly related to the determination of drug half-life. The investigated drugs, with known half-lives, were analyzed based on their enrichment scores for associated GO terms and KEGG pathways. These scores indicate which GO terms or KEGG pathways the drug targets. The feature selection method, minimum redundancy maximum relevance, was used to analyze these GO terms and KEGG pathways and to identify important GO terms and pathways, such as sodium-independent organic anion transmembrane transporter activity (GO:0015347), monoamine transmembrane transporter activity (GO:0008504), negative regulation of synaptic transmission (GO:0050805), neuroactive ligand-receptor interaction (hsa04080), serotonergic synapse (hsa04726), and linoleic acid metabolism (hsa00591), among others. This analysis confirmed our results and may show evidence for a new method in studying drug half-lives and building effective computational methods for the prediction of drug half-lives. PMID:27780226
Zhang, Qian; Sun, Xiaofang; Zheng, Jia; Li, Ming; Yu, Miao; Ping, Fan; Wang, Zhixin; Qi, Cuijuan; Wang, Tong; Wang, Xiaojing
2017-01-01
Maternal malnutrition leads to the incidence of metabolic diseases in offspring. The purpose of this project was to examine whether maternal low chromium could disturb normal lipid metabolism in offspring, altering adipose cell differentiation and leading to the incidence of lipid metabolism diseases, including metabolic syndrome and obesity. Female C57BL mice were given a control diet (CD) or a low chromium diet (LCD) during the gestational and lactation periods. After weaning, offspring were fed CD or LCD. The female offspring were assessed at 32 weeks of age. Fresh adipose samples from the CD–CD group and the LCD–CD group were collected. Genome-wide mRNA expression was analysed using the Affymetrix GeneChip Mouse Gene 2.0 ST Whole Transcript-based array. Differentially expressed genes (DEGs) were analysed based on the gene ontology (GO) and Kyoto Encyclopedia of Genes and Genomes (KEGG) pathway analysis databases. Maternal low chromium irreversibly increased offspring body weight, fat-pad weight, serum triglyceride (TG) and TNF-α. Expression of 85 genes increased and that of 109 genes decreased in the offspring adipose tissue of the maternal low chromium group. According to KEGG pathway and String analyses, the PPAR signalling pathway may be the key controlled pathway related to the effect of maternal low chromium on female offspring. Maternal chromium status has long-term effects on lipid metabolism in female mouse offspring. Normalizing the offspring diet cannot reverse these effects. The potential underlying mechanism is the disturbance of the PPAR signalling pathway in adipose tissue. PMID:28320771
Investigation of candidate genes for osteoarthritis based on gene expression profiles.
Dong, Shuanghai; Xia, Tian; Wang, Lei; Zhao, Qinghua; Tian, Jiwei
2016-12-01
To explore the mechanism of osteoarthritis (OA) and provide valid biological information for further investigation. Gene expression profile of GSE46750 was downloaded from Gene Expression Omnibus database. The Linear Models for Microarray Data (limma) package (Bioconductor project, http://www.bioconductor.org/packages/release/bioc/html/limma.html) was used to identify differentially expressed genes (DEGs) in inflamed OA samples. Gene Ontology function enrichment analysis and Kyoto Encyclopedia of Genes and Genomes (KEGG) pathways enrichment analysis of DEGs were performed based on Database for Annotation, Visualization and Integrated Discovery data, and protein-protein interaction (PPI) network was constructed based on the Search Tool for the Retrieval of Interacting Genes/Proteins database. Regulatory network was screened based on Encyclopedia of DNA Elements. Molecular Complex Detection was used for sub-network screening. Two sub-networks with highest node degree were integrated with transcriptional regulatory network and KEGG functional enrichment analysis was processed for 2 modules. In total, 401 up- and 196 down-regulated DEGs were obtained. Up-regulated DEGs were involved in inflammatory response, while down-regulated DEGs were involved in cell cycle. PPI network with 2392 protein interactions was constructed. Moreover, 10 genes including Interleukin 6 (IL6) and Aurora B kinase (AURKB) were found to be outstanding in PPI network. There are 214 up- and 8 down-regulated transcription factor (TF)-target pairs in the TF regulatory network. Module 1 had TFs including SPI1, PRDM1, and FOS, while module 2 contained FOSL1. The nodes in module 1 were enriched in chemokine signaling pathway, while the nodes in module 2 were mainly enriched in cell cycle. 
The screened DEGs including IL6, AGT, and AURKB might be potential biomarkers for gene therapy for OA by being regulated by TFs such as FOS and SPI1, and participating in the cell cycle and cytokine-cytokine receptor interaction pathway. Copyright © 2016 Turkish Association of Orthopaedics and Traumatology. Production and hosting by Elsevier B.V. All rights reserved.
Zhao, Zheng; Bai, Jing; Wu, Aiwei; Wang, Yuan; Zhang, Jinwen; Wang, Zishan; Li, Yongsheng; Xu, Juan; Li, Xia
2015-01-01
Long non-coding RNAs (lncRNAs) are emerging as key regulators of diverse biological processes and diseases. However, the combinatorial effects of these molecules in a specific biological function are poorly understood. Identifying co-expressed protein-coding genes of lncRNAs would provide ample insight into lncRNA functions. To facilitate such an effort, we have developed Co-LncRNA, which is a web-based computational tool that allows users to identify GO annotations and KEGG pathways that may be affected by co-expressed protein-coding genes of a single or multiple lncRNAs. LncRNA co-expressed protein-coding genes were first identified in publicly available human RNA-Seq datasets, including 241 datasets across 6560 total individuals representing 28 tissue types/cell lines. Then, the lncRNA combinatorial effects in given GO annotations or KEGG pathways are taken into account by the simultaneous analysis of multiple lncRNAs in user-selected individual or multiple datasets, which is realized by enrichment analysis. In addition, this software provides a graphical overview of pathways that are modulated by lncRNAs, as well as a specific tool to display the relevant networks between lncRNAs and their co-expressed protein-coding genes. Co-LncRNA also supports users in uploading their own lncRNA and protein-coding gene expression profiles to investigate the lncRNA combinatorial effects. It will be continuously updated with more human RNA-Seq datasets on an annual basis. Taken together, Co-LncRNA provides a web-based application for investigating lncRNA combinatorial effects, which could shed light on their biological roles and could be a valuable resource for this community. Database URL: http://www.bio-bigdata.com/Co-LncRNA/. © The Author(s) 2015. Published by Oxford University Press.
GEM System: automatic prototyping of cell-wide metabolic pathway models from genomes.
Arakawa, Kazuharu; Yamada, Yohei; Shinoda, Kosaku; Nakayama, Yoichi; Tomita, Masaru
2006-03-23
Successful realization of a "systems biology" approach to analyzing cells is a grand challenge for our understanding of life. However, current modeling approaches to cell simulation are labor-intensive, manual affairs, and therefore constitute a major bottleneck in the evolution of computational cell biology. We developed the Genome-based Modeling (GEM) System for the purpose of automatically prototyping simulation models of cell-wide metabolic pathways from genome sequences and other public biological information. Models generated by the GEM System include an entire Escherichia coli metabolism model comprising 968 reactions of 1195 metabolites, achieving 100% coverage when compared with the KEGG database, 92.38% with the EcoCyc database, and 95.06% with iJR904 genome-scale model. The GEM System prototypes qualitative models to reduce the labor-intensive tasks required for systems biology research. Models of over 90 bacterial genomes are available at our web site.
Carmona, Rosario; Zafra, Adoración; Seoane, Pedro; Castro, Antonio J.; Guerrero-Fernández, Darío; Castillo-Castillo, Trinidad; Medina-García, Ana; Cánovas, Francisco M.; Aldana-Montes, José F.; Navas-Delgado, Ismael; Alché, Juan de Dios; Claros, M. Gonzalo
2015-01-01
Plant reproductive transcriptomes have been analyzed in different species due to the agronomical and biotechnological importance of plant reproduction. Here we present an olive tree reproductive transcriptome database with samples from pollen and pistil at different developmental stages, and leaf and root as control vegetative tissues (http://reprolive.eez.csic.es). It was developed from 2,077,309 raw reads to 1,549 Sanger sequences. Using a pre-defined workflow based on open-source tools, sequences were pre-processed, assembled, mapped, and annotated with expression data, descriptions, GO terms, InterPro signatures, EC numbers, KEGG pathways, ORFs, and SSRs. Tentative transcripts (TTs) were also annotated with the corresponding orthologs in Arabidopsis thaliana from TAIR and RefSeq databases to enable Linked Data integration. The result is a reproductive transcriptome comprising 72,846 contigs with an average length of 686 bp, of which 63,965 (87.8%) included at least one functional annotation, and 55,356 (75.9%) had an ortholog. A minimum of 23,568 different TTs was identified and 5,835 of them contain a complete ORF. The representative reproductive transcriptome can be reduced to 28,972 TTs for further gene expression studies. Partial transcriptomes from pollen, pistil, and vegetative tissues as control were also constructed. ReprOlive provides free access and download capability to these results. Retrieval mechanisms for sequences and transcript annotations are provided. Graphical localization of annotated enzymes into KEGG pathways is also possible. Finally, ReprOlive has included a semantic conceptualisation by means of a Resource Description Framework (RDF) allowing a Linked Data search for extracting the most updated information related to enzymes, interactions, allergens, structures, and reactive oxygen species. PMID:26322066
Bioinformatics approach reveals systematic mechanism underlying lung adenocarcinoma.
Wu, Xiya; Zhang, Wei; Hu, Yunhua; Yi, Xianghua
2015-01-01
The purpose of this work was to explore the systematic molecular mechanism of lung adenocarcinoma and gain a deeper insight into it. Comprehensive bioinformatics methods were applied. Initially, significant differentially expressed genes (DEGs) were analyzed from the Affymetrix microarray data (GSE27262) deposited in the Gene Expression Omnibus (GEO). Subsequently, gene ontology (GO) analysis was performed using the online Database for Annotation, Visualization and Integrated Discovery (DAVID) software. Finally, significant pathway crosstalk was investigated based on the information derived from the Kyoto Encyclopedia of Genes and Genomes (KEGG) database. According to our results, the N-terminal globular domain of the type X collagen (COL10A1) gene and transmembrane protein 100 (TMEM100) gene were identified to be the most significant DEGs in tumor tissue compared with the adjacent normal tissues. The main GO categories were biological process, cellular component and molecular function. In addition, the crosstalk was significantly different between non-small cell lung cancer pathways and the inositol phosphate metabolism pathway, focal adhesion signal pathway, vascular smooth muscle contraction signal pathway, peroxisome proliferator-activated receptor (PPAR) signaling pathway and calcium signaling pathway in tumor. Dysfunctional genes and pathways may play key roles in the progression and development of lung adenocarcinoma. Our data provide a systematic perspective for understanding this mechanism and may be helpful in discovering an effective treatment for lung adenocarcinoma.
BioWarehouse: a bioinformatics database warehouse toolkit
Lee, Thomas J; Pouliot, Yannick; Wagner, Valerie; Gupta, Priyanka; Stringer-Calvert, David WJ; Tenenbaum, Jessica D; Karp, Peter D
2006-03-23
Background: This article addresses the problem of interoperation of heterogeneous bioinformatics databases. Results: We introduce BioWarehouse, an open source toolkit for constructing bioinformatics database warehouses using the MySQL and Oracle relational database managers. BioWarehouse integrates its component databases into a common representational framework within a single database management system, thus enabling multi-database queries using the Structured Query Language (SQL) but also facilitating a variety of database integration tasks such as comparative analysis and data mining. BioWarehouse currently supports the integration of a pathway-centric set of databases including ENZYME, KEGG, and BioCyc, and in addition the UniProt, GenBank, NCBI Taxonomy, and CMR databases, and the Gene Ontology. Loader tools, written in the C and JAVA languages, parse and load these databases into a relational database schema. The loaders also apply a degree of semantic normalization to their respective source data, decreasing semantic heterogeneity. The schema supports the following bioinformatics datatypes: chemical compounds, biochemical reactions, metabolic pathways, proteins, genes, nucleic acid sequences, features on protein and nucleic-acid sequences, organisms, organism taxonomies, and controlled vocabularies. As an application example, we applied BioWarehouse to determine the fraction of biochemically characterized enzyme activities for which no sequences exist in the public sequence databases. The answer is that no sequence exists for 36% of enzyme activities for which EC numbers have been assigned. These gaps in sequence data significantly limit the accuracy of genome annotation and metabolic pathway prediction, and are a barrier for metabolic engineering. Complex queries of this type provide examples of the value of the data warehousing approach to bioinformatics research. Conclusion: BioWarehouse embodies significant progress on the database integration problem for bioinformatics. PMID:16556315
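The application example in the BioWarehouse abstract (the fraction of enzyme activities with no known sequence) is the kind of question a single SQL query can answer once the sources share one schema. The sketch below shows the shape of such a query only. The two-table schema, the table and column names, and the rows are invented for illustration and are not the BioWarehouse schema, and SQLite stands in for the MySQL or Oracle back ends named in the abstract.

```python
import sqlite3

# In-memory toy warehouse; table names, columns, and rows are hypothetical.
con = sqlite3.connect(":memory:")
con.executescript("""
CREATE TABLE enzyme_activity (ec_number TEXT PRIMARY KEY);
CREATE TABLE protein_sequence (id INTEGER PRIMARY KEY, ec_number TEXT);
INSERT INTO enzyme_activity VALUES ('1.1.1.1'), ('2.7.1.1'), ('3.1.3.9'), ('4.2.1.11');
INSERT INTO protein_sequence (ec_number)
    VALUES ('1.1.1.1'), ('1.1.1.1'), ('2.7.1.1'), ('4.2.1.11');
""")

# A LEFT JOIN keeps every EC number; a NULL on the right side marks an
# activity with no sequence, and AVG over the 0/1 flag gives the fraction.
orphan_fraction = con.execute("""
SELECT AVG(CASE WHEN s.ec_number IS NULL THEN 1.0 ELSE 0.0 END)
FROM enzyme_activity AS a
LEFT JOIN (SELECT DISTINCT ec_number FROM protein_sequence) AS s
       ON s.ec_number = a.ec_number
""").fetchone()[0]

print(f"{orphan_fraction:.0%} of EC numbers in the toy data have no sequence")
```

The inner `SELECT DISTINCT` prevents an activity with several sequences from being counted more than once.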
Abbott, Kenneth L; Nyre, Erik T; Abrahante, Juan; Ho, Yen-Yi; Isaksson Vogel, Rachel; Starr, Timothy K
2015-01-01
Identification of cancer driver gene mutations is crucial for advancing cancer therapeutics. Due to the overwhelming number of passenger mutations in the human tumor genome, it is difficult to pinpoint causative driver genes. Using transposon mutagenesis in mice many laboratories have conducted forward genetic screens and identified thousands of candidate driver genes that are highly relevant to human cancer. Unfortunately, this information is difficult to access and utilize because it is scattered across multiple publications using different mouse genome builds and strength metrics. To improve access to these findings and facilitate meta-analyses, we developed the Candidate Cancer Gene Database (CCGD, http://ccgd-starrlab.oit.umn.edu/). The CCGD is a manually curated database containing a unified description of all identified candidate driver genes and the genomic location of transposon common insertion sites (CISs) from all currently published transposon-based screens. To demonstrate relevance to human cancer, we performed a modified gene set enrichment analysis using KEGG pathways and show that human cancer pathways are highly enriched in the database. We also used hierarchical clustering to identify pathways enriched in blood cancers compared to solid cancers. The CCGD is a novel resource available to scientists interested in the identification of genetic drivers of cancer. © The Author(s) 2014. Published by Oxford University Press on behalf of Nucleic Acids Research.
Tariq, Mansoor; Chen, Rong; Yuan, Hongyu; Liu, Yanjie; Wu, Yanan; Wang, Junya; Xia, Chun
2015-01-01
Background The Chinese goose is one of the most economically important poultry birds and is a natural reservoir for many avian viruses. However, the nature and regulation of the innate and adaptive immune systems of this waterfowl species are not completely understood due to limited information on the goose genome. Recently, transcriptome sequencing technology has been applied in genomic studies focused on novel gene discovery. Thus, this study describes the transcriptome of goose peripheral blood lymphocytes to identify immunity-relevant genes. Principal Findings The transcriptome of goose peripheral blood lymphocytes was sequenced using Illumina-Solexa technology and assembled de novo. In total, 211,198 unigenes were assembled from the 69.36 million cleaned reads. The average length, N50 size and maximum length of the assembled unigenes were 687 bp, 1,298 bp and 18,992 bp, respectively. A total of 36,854 unigenes showed similarity by BLAST search against the NCBI non-redundant (Nr) protein database. For functional classification, 163,161 unigenes were assigned to three Gene Ontology (GO) categories and 67 subcategories. A total of 15,334 unigenes were annotated into 25 eukaryotic orthologous groups (KOGs) categories. The Kyoto Encyclopedia of Genes and Genomes (KEGG) database annotated 39,585 unigenes into six biological functional groups and 308 pathways. Among the 2,757 unigenes that participated in the 15 immune system KEGG pathways, 125 of the most important immune-relevant genes were summarized and analyzed by STRING analysis to identify gene interactions and relationships. Moreover, 10 genes were confirmed by PCR and analyzed. Of these 125 unigenes, 109 unigenes, approximately 87%, were not previously identified in the goose.
Conclusion This de novo transcriptome analysis could provide important Chinese goose sequence information and highlights the value of new gene discovery, pathways investigation and immune system gene identification, and comparison with other avian species as useful tools to understand the goose immune system. PMID:25816068
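For readers unfamiliar with the N50 statistic reported above, the following sketch shows one common way to compute it from a list of unigene lengths. The function name and the toy lengths are assumptions for illustration and are not taken from the study's pipeline.

```python
def n50(lengths):
    """Return the N50: the largest length L such that sequences of
    length >= L together cover at least half of the total bases."""
    if not lengths:
        raise ValueError("lengths must not be empty")
    total = sum(lengths)
    running = 0
    # Walk from the longest sequence down, accumulating covered bases.
    for length in sorted(lengths, reverse=True):
        running += length
        if 2 * running >= total:
            return length


toy_lengths = [100, 200, 300, 400, 500]  # toy values, not data from the study
toy_n50 = n50(toy_lengths)
```

Because N50 weights long sequences more heavily than the mean does, it is usually larger than the average length, which is consistent with the 1,298 bp N50 and 687 bp mean reported in the abstract.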
Zhang, Qian; Sun, Xiaofang; Xiao, Xinhua; Zheng, Jia; Li, Ming; Yu, Miao; Ping, Fan; Wang, Zhixin; Qi, Cuijuan; Wang, Tong; Wang, Xiaojing
2017-04-30
Maternal malnutrition leads to the incidence of metabolic diseases in offspring. The purpose of this project was to examine whether maternal low chromium could disturb normal lipid metabolism in offspring, altering adipose cell differentiation and leading to the incidence of lipid metabolism diseases, including metabolic syndrome and obesity. Female C57BL mice were given a control diet (CD) or a low chromium diet (LCD) during the gestational and lactation periods. After weaning, offspring were fed CD or LCD. The female offspring were assessed at 32 weeks of age. Fresh adipose samples from the CD-CD group and the LCD-CD group were collected. Genome-wide mRNA expression was analysed using the Affymetrix GeneChip Mouse Gene 2.0 ST Whole Transcript-based array. Differentially expressed genes (DEGs) were analysed using gene ontology (GO) and Kyoto Encyclopedia of Genes and Genomes (KEGG) pathway analysis. Maternal low chromium irreversibly increased offspring body weight, fat-pad weight, serum triglyceride (TG) and TNF-α. The expression of 85 genes increased and that of 109 genes decreased in the offspring adipose tissue of the maternal low chromium group. According to KEGG pathway and STRING analyses, the PPAR signalling pathway may be the key controlled pathway related to the effect of maternal low chromium on female offspring. Maternal chromium status has long-term effects on lipid metabolism in female mouse offspring. Normalizing the offspring diet cannot reverse these effects. The potential underlying mechanism is the disturbance of the PPAR signalling pathway in adipose tissue. © 2017 The Author(s).
Chatterjee, Shatakshee; Verma, Srikant Prasad; Pandey, Priyanka
2017-09-05
Initiation and progression of fluid-filled cysts mark Autosomal Dominant Polycystic Kidney Disease (ADPKD). Thus, improved therapeutics targeting cystogenesis remain a constant challenge. Microarray studies in single-species ADPKD animal models with limited sample sizes tend to provide scattered views of the underlying ADPKD pathogenesis. We therefore aimed to perform a cross-species meta-analysis to profile conserved biological pathways that might be key targets for therapy. Nine ADPKD microarray datasets on rat, mouse and human fulfilled our study criteria and were chosen. Intra-species combined analysis was performed after considering removal of batch effects. Significantly enriched GO biological processes and KEGG pathways were computed and their overlap was examined. For the conserved pathways, biological modules and gene regulatory networks were examined. Additionally, Gene Set Enrichment Analysis (GSEA) using the Molecular Signatures Database (MSigDB) was performed for genes found in conserved pathways. We obtained 28 modules of significantly enriched GO processes and 5 major functional categories from significantly enriched KEGG pathways conserved in human, mouse and rat, which in turn suggest a global transcriptomic perturbation affecting cyst formation, growth and progression. Significantly enriched pathways obtained from up-regulated genes, such as Genomic instability, Protein localization in ER and Insulin Resistance, were found to regulate cyst formation and growth, whereas cyst progression due to increased cell adhesion and inflammation was suggested by perturbations in Angiogenesis, TGF-beta, CAMs, and Infection-related pathways. Additionally, networks revealed shared genes among pathways, e.g. SMAD2 and SMAD7 in Endocytosis and TGF-beta. Our study suggests that cyst formation and progression are an outcome of interplay between a set of several key deregulated pathways.
Thus, further translational research is warranted focusing on developing a combinatorial therapeutic approach for ADPKD redressal. Copyright © 2017 Elsevier B.V. All rights reserved.
1-CMDb: A Curated Database of Genomic Variations of the One-Carbon Metabolism Pathway.
Bhat, Manoj K; Gadekar, Veerendra P; Jain, Aditya; Paul, Bobby; Rai, Padmalatha S; Satyamoorthy, Kapaettu
2017-01-01
The one-carbon metabolism pathway is vital in maintaining tissue homeostasis by driving the critical reactions of folate and methionine cycles. A myriad of genetic and epigenetic events mark the rate of reactions in a tissue-specific manner. Integration of these to predict and provide personalized health management requires robust computational tools that can process multiomics data. The DNA sequences that may determine the chain of biological events and the endpoint reactions within one-carbon metabolism genes remain to be comprehensively recorded. Hence, we designed the one-carbon metabolism database (1-CMDb) as a platform to interrogate its association with a host of human disorders. DNA sequence and network information of a total of 48 genes were extracted from a literature survey and KEGG pathway that are involved in the one-carbon folate-mediated pathway. The information generated, collected, and compiled for all these genes from the UCSC genome browser included the single nucleotide polymorphisms (SNPs), CpGs, copy number variations (CNVs), and miRNAs, and a comprehensive database was created. Furthermore, a significant correlation analysis was performed for SNPs in the pathway genes. Detailed data of SNPs, CNVs, CpG islands, and miRNAs for 48 folate pathway genes were compiled. The SNPs in CNVs (9670), CpGs (984), and miRNAs (14) were also compiled for all pathway genes. The SIFT score, the prediction and PolyPhen score, as well as the prediction for each of the SNPs were tabulated and represented for folate pathway genes. Also included in the database for folate pathway genes were the links to 124 various phenotypes and disease associations as reported in the literature and from publicly available information. 
A comprehensive database was generated consisting of genomic elements within and among SNPs, CNVs, CpGs, and miRNAs of one-carbon metabolism pathways to facilitate (a) single source of information and (b) integration into large-genome scale network analysis to be developed in the future by the scientific community. The database can be accessed at http://slsdb.manipal.edu/ocm/. © 2017 S. Karger AG, Basel.
2014-01-01
Automatic reconstruction of metabolic pathways for an organism from genomics and transcriptomics data has been a challenging and important problem in bioinformatics. Traditionally, known reference pathways are mapped onto organism-specific ones based on the organism's genome annotation and protein homology. However, this simple knowledge-based mapping method might produce incomplete pathways and generally cannot predict unknown new relations and reactions. In contrast, ab initio metabolic network construction methods can predict novel reactions and interactions, but their accuracy tends to be low, leading to many false positives. Here we combine existing pathway knowledge and a new ab initio Bayesian probabilistic graphical model in a novel fashion to improve automatic reconstruction of metabolic networks. Specifically, we built a knowledge database containing known, individual gene/protein interactions and metabolic reactions extracted from existing reference pathways. Known reactions and interactions were then used as constraints for Bayesian network learning methods to predict metabolic pathways. Using individual reactions and interactions extracted from different pathways of many organisms to guide pathway construction is new and improves both the coverage and accuracy of metabolic pathway construction. We applied this probabilistic knowledge-based approach to construct the metabolic networks from yeast gene expression data and compared its results with 62 known metabolic networks in the KEGG database. The experiment showed that the method improved the coverage of metabolic network construction over the traditional reference pathway mapping method and was more accurate than pure ab initio methods. PMID:25374614
Computational analysis of microRNA function in heart development.
Liu, Ganqiang; Ding, Min; Chen, Jiajia; Huang, Jinyan; Wang, Haiyun; Jing, Qing; Shen, Bairong
2010-09-01
Emerging evidence suggests that specific spatio-temporal microRNA (miRNA) expression is required for heart development. In recent years, hundreds of miRNAs have been discovered. In contrast, functional annotations are available only for a very small fraction of these regulatory molecules. In order to provide a global perspective for the biologists who study the relationship between differentially expressed miRNAs and heart development, we employed computational analysis to uncover the specific cellular processes and biological pathways targeted by miRNAs in mouse heart development. Here, we utilized Gene Ontology (GO) categories, KEGG Pathway, and GeneGo Pathway Maps as a gene functional annotation system for miRNA target enrichment analysis. The target genes of miRNAs were found to be enriched in functional categories and pathway maps in which miRNAs could play important roles during heart development. Meanwhile, we developed miRHrt (http://sysbio.suda.edu.cn/mirhrt/), a database aiming to provide a comprehensive resource of miRNA function in regulating heart development. These computational analysis results effectively illustrated the correlation of differentially expressed miRNAs with cellular functions and heart development. We hope that the identified novel heart development-associated pathways and the database presented here would facilitate further understanding of the roles and mechanisms of miRNAs in heart development.
Liu, Xiaozhen; Jin, Gan; Qian, Jiacheng; Yang, Hongjian; Tang, Hongchao; Meng, Xuli; Li, Yongfeng
2018-04-23
This study aimed to screen sensitive biomarkers for the efficacy evaluation of neoadjuvant chemotherapy in breast cancer. Illumina digital gene expression sequencing technology was applied, and differentially expressed genes (DEGs) between patients presenting pathological complete response (pCR) and non-pathological complete response (NpCR) were identified. Gene ontology and Kyoto Encyclopedia of Genes and Genomes (KEGG) pathway enrichment analyses were then performed. The genes in significantly enriched pathways were finally quantified by quantitative real-time PCR (qRT-PCR) to confirm that they were differentially expressed. Additionally, GSE23988 from the Gene Expression Omnibus database was used as the validation dataset to confirm the DEGs. After removing the low-quality reads, 715 DEGs were finally detected. After mapping to KEGG pathways, 10 DEGs belonging to the ubiquitin proteasome pathway (HECTD3, PSMB10, UBD, UBE2C, and UBE2S) and cytokine-cytokine receptor interactions (CCL2, CCR1, CXCL10, CXCL11, and IL2RG) were selected for further analysis. These 10 genes were finally quantified by qRT-PCR to confirm that they were differentially expressed (the log2 fold changes of the selected genes were -5.34, 7.81, 6.88, 5.74, 3.11, 19.58, 8.73, 8.88, 7.42, and 34.61 for HECTD3, PSMB10, UBD, UBE2C, UBE2S, CCL2, CCR1, CXCL10, CXCL11, and IL2RG, respectively). Moreover, 53 common genes were confirmed by the validation dataset, including downregulated UBE2C and UBE2S. Our results suggested that these 10 genes belonging to these two pathways might be useful as sensitive biomarkers for the efficacy evaluation of neoadjuvant chemotherapy in breast cancer.
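Pathway enrichment of a DEG list is often assessed with a one-sided hypergeometric (over-representation) test. The abstract does not state which test was used, so the sketch below is only an assumed, generic illustration; the function name and all counts are hypothetical.

```python
from math import comb


def enrichment_p_value(n_background, n_pathway, n_degs, n_overlap):
    """Upper-tail hypergeometric probability P(X >= n_overlap).

    n_background: annotated genes in the background set
    n_pathway:    background genes that belong to the pathway
    n_degs:       differentially expressed genes drawn from the background
    n_overlap:    DEGs that fall in the pathway
    """
    upper = min(n_pathway, n_degs)
    total = comb(n_background, n_degs)
    # math.comb returns 0 when k > n, so impossible terms drop out on their own.
    tail = sum(comb(n_pathway, k) * comb(n_background - n_pathway, n_degs - k)
               for k in range(n_overlap, upper + 1))
    return tail / total


# Hypothetical counts: 10 background genes, 5 in the pathway, 5 DEGs, all 5 in the pathway.
toy_p = enrichment_p_value(10, 5, 5, 5)
```

In practice the resulting p-values would be corrected for the number of pathways tested, for example with a false discovery rate procedure.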
Comparative transcriptome analysis of microsclerotia development in Nomuraea rileyi.
Song, Zhangyong; Yin, Youping; Jiang, Shasha; Liu, Juanjuan; Chen, Huan; Wang, Zhongkang
2013-06-19
Nomuraea rileyi is used as an environmentally friendly biopesticide. However, mass production and commercialization of this organism are limited due to its fastidious growth and sporulation requirements. We found that, when cultured in amended medium, N. rileyi could produce microsclerotia bodies, replacing conidiophores as the infectious agent. However, little is known about the genes involved in microsclerotia development. In the present study, the transcriptomes were analyzed using next-generation sequencing technology to find the genes involved in microsclerotia development. A total of 4.69 Gb of clean nucleotides comprising 32,061 sequences was obtained, and 20,919 sequences were annotated (about 65%). Among the annotated sequences, only 5928 were annotated with 34 gene ontology (GO) functional categories, and 12,778 sequences were mapped to 165 pathways by searching against the Kyoto Encyclopedia of Genes and Genomes (KEGG) pathway database. Furthermore, we assessed the transcriptomic differences between cultures grown in minimal and amended medium. In total, 4808 sequences were found to be differentially expressed; 719 differentially expressed unigenes were assigned to 25 GO classes and 1888 differentially expressed unigenes were assigned to 161 KEGG pathways, including 25 enriched pathways. Subsequently, we examined the genes that were up-regulated or uniquely expressed following amended medium treatment and that also belonged to the enriched pathways, and found that most of them participated in mediating oxidative stress homeostasis. To elucidate the role of oxidative stress in microsclerotia development, we analyzed the diversification of unigenes using quantitative reverse transcription-PCR (RT-qPCR). Our findings suggest that oxidative stress occurs during microsclerotia development, along with a broad change in metabolic activity. Our data provide the most comprehensive sequence resource available for the study of N. rileyi.
We believe that the transcriptome datasets will serve as an important public information platform to accelerate studies on N. rileyi microsclerotia.
Wang, Weijing; Jiang, Wenjie; Hou, Lin; Duan, Haiping; Wu, Yili; Xu, Chunsheng; Tan, Qihua; Li, Shuxia; Zhang, Dongfeng
2017-11-13
The therapeutic management of obesity is challenging; hence, further elucidating the underlying mechanisms of obesity development and identifying new diagnostic biomarkers and therapeutic targets are urgent and necessary. Here, we performed differential gene expression analysis and weighted gene co-expression network analysis (WGCNA) to identify significant genes and specific modules related to BMI based on gene expression profile data of 7 discordant monozygotic twins. In the differential gene expression analysis, 32 differentially expressed genes (DEGs) showed a trend of up-regulation in twins with higher BMI when compared to their siblings. Categories of positive regulation of nitric-oxide synthase biosynthetic process, positive regulation of NF-kappa B import into nucleus, and peroxidase activity were significantly enriched within the GO database, as was the NF-kappa B signaling pathway within the KEGG database. The DEGs NAMPT, TLR9, PTGS2, HBD, and PCSK1N might be associated with obesity. In the WGCNA, among the 20 distinct co-expression modules identified, the coral1 module (68 genes) had the strongest positive correlation with BMI (r = 0.56, P = 0.04) and disease status (r = 0.56, P = 0.04). Categories of positive regulation of phospholipase activity, high-density lipoprotein particle clearance, chylomicron remnant clearance, reverse cholesterol transport, intermediate-density lipoprotein particle, chylomicron, low-density lipoprotein particle, very-low-density lipoprotein particle, voltage-gated potassium channel complex, cholesterol transporter activity, and neuropeptide hormone activity were significantly enriched within the GO database for this module. Alcoholism and cell adhesion molecules pathways were also significantly enriched within the KEGG database. Several hub genes, such as GAL, ASB9, NPPB, TBX2, IL17C, APOE, ABCG4, and APOC2, were also identified.
The module eigengene of the saddlebrown module (212 genes) was also significantly correlated with BMI (r = 0.56, P = 0.04), and the hub genes KCNN1 and AQP10 were differentially expressed. We identified significant genes and specific modules potentially related to BMI based on the gene expression profile data of monozygotic twins. The findings may help further elucidate the underlying mechanisms of obesity development and provide novel insights for research into potential gene biomarkers and signaling pathways for obesity treatment. Further analysis and validation of the findings reported here are important and necessary when a larger sample size is available.
Liu, Ken H; Walker, Douglas I; Uppal, Karan; Tran, ViLinh; Rohrbeck, Patricia; Mallon, Timothy M; Jones, Dean P
2016-08-01
The aim of this study was to maximize detection of serum metabolites with high-resolution metabolomics (HRM). Department of Defense Serum Repository (DoDSR) samples were analyzed using ultrahigh resolution mass spectrometry with three complementary chromatographic phases and four ionization modes. Chemical coverage was evaluated by number of ions detected and accurate mass matches to a human metabolomics database. Individual HRM platforms provided accurate mass matches for up to 58% of the KEGG metabolite database. Combining two analytical methods increased matches to 72% and included metabolites in most major human metabolic pathways and chemical classes. Detection and feature quality varied by analytical configuration. Dual chromatography HRM with positive and negative electrospray ionization provides an effective generalized method for metabolic assessment of military personnel.
Liu, Ken H.; Walker, Douglas I.; Uppal, Karan; Tran, ViLinh; Rohrbeck, Patricia; Mallon, Timothy M.; Jones, Dean P.
2016-01-01
Objective To maximize detection of serum metabolites with high-resolution metabolomics (HRM). Methods Department of Defense Serum Repository (DoDSR) samples were analyzed using ultra-high resolution mass spectrometry with three complementary chromatographic phases and four ionization modes. Chemical coverage was evaluated by number of ions detected and accurate mass matches to a human metabolomics database. Results Individual HRM platforms provided accurate mass matches for up to 58% of the KEGG metabolite database. Combining two analytical methods increased matches to 72%, and included metabolites in most major human metabolic pathways and chemical classes. Detection and feature quality varied by analytical configuration. Conclusions Dual chromatography HRM with positive and negative electrospray ionization provides an effective generalized method for metabolic assessment of military personnel. PMID:27501105
Dhanasekaran, A Ranjitha; Pearson, Jon L; Ganesan, Balasubramanian; Weimer, Bart C
2015-02-25
Mass spectrometric analysis of microbial metabolism provides a long list of possible compounds. Restricting the identification of the possible compounds to those produced by the specific organism would benefit the identification process. Currently, identification of mass spectrometry (MS) data is commonly done using empirically derived compound databases. Unfortunately, most databases contain relatively few compounds, leaving long lists of unidentified molecules. Incorporating genome-encoded metabolism enables MS output identification that may not be included in databases. Using an organism's genome as a database restricts metabolite identification to only those compounds that the organism can produce. To address the challenge of metabolomic analysis from MS data, a web-based application to directly search genome-constructed metabolic databases was developed. The user query returns a genome-restricted list of possible compound identifications along with the putative metabolic pathways based on the name, formula, SMILES structure, and the compound mass as defined by the user. Multiple queries can be done simultaneously by submitting a text file created by the user or obtained from the MS analysis software. The user can also provide parameters specific to the experiment's MS analysis conditions, such as mass deviation, adducts, and detection mode during the query so as to provide additional levels of evidence to produce the tentative identification. The query results are provided as an HTML page and downloadable text file of possible compounds that are restricted to a specific genome. Hyperlinks provided in the HTML file connect the user to the curated metabolic databases housed in ProCyc, a Pathway Tools platform, as well as the KEGG Pathway database for visualization and metabolic pathway analysis. Metabolome Searcher, a web-based tool, facilitates putative compound identification of MS output based on genome-restricted metabolic capability. 
This enables researchers to rapidly extend the possible identifications of large data sets for metabolites that are not in compound databases. Putative compound names with their associated metabolic pathways from metabolomics data sets are returned to the user for additional biological interpretation and visualization. This novel approach enables compound identification by restricting the possible masses to those encoded in the genome.
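The mass-based query with adducts and a mass deviation can be sketched as follows. This is not the Metabolome Searcher implementation: the function name, the three-compound toy list standing in for a genome-restricted database, the adduct table and the 5 ppm default are all assumptions chosen for illustration.

```python
# Mass shifts (Da) applied to a neutral monoisotopic mass for a few
# common singly charged adducts (assumed, abbreviated table).
PROTON = 1.007276
ADDUCT_SHIFT = {
    "[M+H]+": PROTON,
    "[M-H]-": -PROTON,
    "[M+Na]+": 22.989218,
}

# Toy stand-in for a genome-restricted compound list (monoisotopic masses, Da).
TOY_COMPOUNDS = {
    "D-glucose": 180.063388,
    "pyruvate": 88.016044,
    "L-alanine": 89.047678,
}


def search_by_mass(observed_mz, adduct, compounds, ppm_tolerance=5.0):
    """Return (name, ppm_error) for compounds whose neutral mass matches
    the observed m/z, after removing the adduct shift, within the tolerance."""
    neutral_mass = observed_mz - ADDUCT_SHIFT[adduct]
    hits = []
    for name, mass in compounds.items():
        ppm_error = (neutral_mass - mass) / mass * 1e6
        if abs(ppm_error) <= ppm_tolerance:
            hits.append((name, round(ppm_error, 2)))
    return hits


# Hypothetical observed ion in positive mode.
glucose_hits = search_by_mass(181.0707, "[M+H]+", TOY_COMPOUNDS)
```

Restricting `compounds` to those an organism's genome can produce is what shortens the candidate list; the matching step itself is the same as for any compound database.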
Chen, Wei; Zhao, Wenshan; Yang, Aiting; Xu, Anjian; Wang, Huan; Cong, Min; Liu, Tianhui; Wang, Ping; You, Hong
2017-12-15
Liver fibrosis, characterized by the excessive accumulation of extracellular matrix (ECM) proteins, represents the final common pathway of chronic liver inflammation. Increasing evidence indicates that microRNA (miRNA) dysregulation has important implications in the different stages of liver fibrosis. However, our knowledge of the details of miRNA-gene regulation in this disease remains limited. The publicly available Gene Expression Omnibus (GEO) datasets of patients suffering from cirrhosis were extracted for integrated analysis. Differentially expressed miRNAs (DEMs) and genes (DEGs) were identified using the GEO2R web tool. Putative target gene prediction of DEMs was carried out using the intersection of five major algorithms: DIANA-microT, TargetScan, miRanda, PICTAR5 and miRWalk. A functional miRNA-gene regulatory network (FMGRN) was constructed based on the computational target predictions at the sequence level and the inverse expression relationships between DEMs and DEGs. The DAVID web server was used to perform KEGG pathway enrichment analysis. A functional miRNA-gene regulatory module was generated based on the biological interpretation. Internal connections among genes in the liver fibrosis-related module were determined using the STRING database. MiRNA-gene regulatory modules related to liver fibrosis were experimentally verified in LX-2 cells stimulated with recombinant human TGFβ1 and treated with specific miRNA inhibitors. In total, we identified 85 dysregulated miRNAs and 923 dysregulated genes in liver cirrhosis biopsy samples compared to their normal controls. All evident miRNA-gene pairs were identified and assembled into the FMGRN, which consisted of 990 regulations between 51 miRNAs and 275 genes, forming two big sub-networks that were defined as the down-network and the up-network, respectively.
KEGG pathway enrichment analysis revealed that the up-network was prominently involved in several KEGG pathways, among which "Focal adhesion", "PI3K-Akt signaling pathway" and "ECM-receptor interaction" were highly significant (adjusted p<0.001). Genes enriched in these pathways, coupled with their regulatory miRNAs, formed a functional miRNA-gene regulatory module that contains 7 miRNAs, 22 genes and 42 miRNA-gene connections. Gene interaction analysis based on the STRING database revealed that 8 out of 22 genes were highly clustered. Finally, we experimentally confirmed a functional regulatory module containing 5 miRNAs (miR-130b-3p, miR-148a-3p, miR-345-5p, miR-378a-3p, and miR-422a) and 6 genes (COL6A1, COL6A2, COL6A3, PIK3R3, COL1A1, CCND2) associated with liver fibrosis. Our integrated analysis of miRNA and gene expression profiles highlighted a functional miRNA-gene regulatory module associated with liver fibrosis, which, to some extent, may provide important clues to better understand the underlying pathogenesis of liver fibrosis. Copyright © 2017. Published by Elsevier B.V.
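The two filtering steps named in this abstract (intersection of the five prediction algorithms, then an inverse expression relationship between miRNA and gene) can be sketched as set operations. The identifiers miR-A and GENE1 to GENE5, the fold-change values and both function names are hypothetical, not data or code from the study.

```python
def consensus_targets(predictions):
    """Return genes predicted as targets by every algorithm (set intersection)."""
    return set.intersection(*predictions.values())


def inverse_pairs(mirna, mirna_log_fc, gene_log_fc, targets):
    """Keep predicted targets whose expression changes in the opposite
    direction to the miRNA (product of the log fold changes is negative)."""
    return [(mirna, gene) for gene in sorted(targets)
            if gene in gene_log_fc and mirna_log_fc * gene_log_fc[gene] < 0]


# Hypothetical predictions for one miRNA, keyed by algorithm name.
predictions = {
    "DIANA-microT": {"GENE1", "GENE2", "GENE3"},
    "TargetScan": {"GENE1", "GENE2"},
    "miRanda": {"GENE1", "GENE2", "GENE4"},
    "PICTAR5": {"GENE1", "GENE2"},
    "miRWalk": {"GENE1", "GENE2", "GENE5"},
}
gene_log_fc = {"GENE1": 2.0, "GENE2": -0.8}  # hypothetical DEG log fold changes

consensus = consensus_targets(predictions)
pairs = inverse_pairs("miR-A", -1.5, gene_log_fc, consensus)
```

Requiring agreement among all five tools trades sensitivity for specificity, and the sign filter then keeps only pairs consistent with miRNA-mediated repression.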
Xue, Yunping; Lv, Juan; Xu, Pengfei; Gu, Lin; Cao, Jian; Xu, Lingling; Xue, Kai; Li, Qian
2018-05-01
Polycystic ovary syndrome (PCOS) is a common reproductive endocrine disease, which is characterized by hyperandrogenism (HA), chronic anovulation, polycystic ovaries, insulin resistance, and obesity. At present, the mechanism by which PCOS/HA occurs has not been fully elucidated; thus, the mechanisms behind and interventions for HA in PCOS are current hot topics in research. MiRNAs have recently been shown to serve as diagnostic or prognostic biomarkers in patients with cancer. Thus, we are currently focused on studying the altered expression of miRNAs in follicular fluid and their correlation with HA in PCOS. Illumina deep sequencing technology was used to explore differentially expressed miRNAs in the follicular fluid of women with PCOS/HA and in the follicular fluid of women in a control group. Target prediction databases were then used to analyse the target genes of the differentially expressed miRNAs, and GO analysis and the KEGG pathway database were used to identify the functions and the main biochemical and signalling pathways of differentially expressed target genes. The expression levels of 263 miRNAs were significantly different (>2-fold up-regulated or <0.5-fold down-regulated, P < 0.05) between the two groups of women. For example, the expression levels of six miRNAs (200a-3p, 10b-3p, 200b-3p, 29c-3p, 99a-3p, and 125a-5p) were significantly increased, while the expression of miR-105-3p was decreased in PCOS patients with respect to the control. The literature has shown that these seven miRNAs are associated with HA in PCOS. Furthermore, 31,770 genes were predicted to be targets of the 263 differentially expressed microRNAs. GO analysis and the KEGG pathway database showed involvement of these target genes in HA in PCOS. These results suggest the presence of differentially expressed miRNAs in the follicular fluid of women with PCOS/HA versus women in the control group.
The potential role of these microRNAs was elucidated using bioinformatics tools, and they were found to be involved in the regulation of different pathways, biological functions, and cellular components underlying PCOS. The results of this research may reveal new mechanisms of PCOS/HA and suggest potential treatment targets. © 2017 Wiley Periodicals, Inc.
Gao, Li; Zhang, Li-Jie; Li, Sheng-Hua; Wei, Li-Li; Luo, Bin; He, Rong-Quan; Xia, Shuang
2018-03-06
MiR-452-5p has been reported to be down-regulated in prostate cancer, affecting the development of this type of cancer. However, the molecular mechanism of miR-452-5p in prostate cancer remains unclear. Therefore, we investigated the network of target genes of miR-452-5p in prostate cancer using bioinformatics analyses. We first analyzed the expression profiles and prognostic value of miR-452-5p in prostate cancer tissues from a public database. Gene Ontology (GO), Kyoto Encyclopedia of Genes and Genomes (KEGG), and PANTHER pathway analyses and a disease ontology (DO) analysis were performed to find the molecular functions of the target genes from GSE datasets and miRWalk. Finally, we validated hub genes from the protein-protein interaction (PPI) networks of the target genes in the Human Protein Atlas (HPA) database and Gene Expression Profiling Interactive Analysis (GEPIA). The optimal target genes were narrowed down by taking the genes common to the up-regulated genes from GEPIA, the down-regulated genes from the GSE datasets, and the predicted genes in miRWalk. Based on mining of GEO and ArrayExpress microarray chips and miRNA-Seq data in the TCGA database, which includes 1007 prostate cancer samples and 387 non-cancer samples, miR-452-5p is shown to be down-regulated in prostate cancer. GO, KEGG, and PANTHER pathway analyses suggested that the target genes might participate in important biological processes, such as transforming growth factor beta signaling and the positive regulation of brown fat cell differentiation and mesenchymal cell differentiation, as well as the Ras signaling pathway and pathways regulating the pluripotency of stem cells and arrhythmogenic right ventricular cardiomyopathy (ARVC). Nine genes (GABBR, PNISR, NTSR1, DOCK1, EREG, SFRP1, PTGS2, LEF1, and BMP2) were defined as hub genes in the PPI network. Three genes (FAM174B, SLC30A4, and SLIT1) were jointly shared by GEPIA, the GSE datasets, and miRWalk.
Down-regulated miR-452-5p might play an essential role in the tumorigenesis of prostate cancer. Copyright © 2018. Published by Elsevier GmbH.
Yang, Deying; Fu, Yan; Wu, Xuhang; Xie, Yue; Nie, Huaming; Chen, Lin; Nong, Xiang; Gu, Xiaobin; Wang, Shuxian; Peng, Xuerong; Yan, Ning; Zhang, Runhui; Zheng, Wanpeng; Yang, Guangyou
2012-01-01
Background Taenia pisiformis is one of the most common intestinal tapeworms and can cause infections in canines. Adult T. pisiformis (canines as definitive hosts) and Cysticercus pisiformis (rabbits as intermediate hosts) cause significant health problems to the host and considerable socio-economic losses as a consequence. No complete genomic data regarding T. pisiformis are currently available in public databases. RNA-seq provides an effective approach to analyze the eukaryotic transcriptome to generate large functional gene datasets that can be used for further studies. Methodology/Principal Findings In this study, 2.67 million sequencing clean reads and 72,957 unigenes were generated using the RNA-seq technique. Based on a sequence similarity search with known proteins, a total of 26,012 unigenes (no redundancy) were identified after quality control procedures via the alignment of four databases. Overall, 15,920 unigenes were mapped to 203 Kyoto Encyclopedia of Genes and Genomes (KEGG) pathways. Through analyzing the glycolysis/gluconeogenesis and axonal guidance pathways, we achieved an in-depth understanding of the biochemistry of T. pisiformis. Here, we selected four unigenes at random and obtained their full-length cDNA clones using RACE PCR. Functional distribution characteristics were gained through comparing four cestode species (72,957 unigenes of T. pisiformis, 30,700 ESTs of T. solium, 1,058 ESTs of Eg+Em [conserved ESTs between Echinococcus granulosus and Echinococcus multilocularis]), with the cluster of orthologous groups (COG) and gene ontology (GO) functional classification systems. Furthermore, the conserved common genes in these four cestode species were obtained and aligned by the KEGG database. Conclusion This study provides an extensive transcriptome dataset obtained from the deep sequencing of T. pisiformis in a non-model whole genome. 
The identification of conserved genes may provide novel approaches for potential drug targets and vaccinations against cestode infections. Research can now accelerate into the functional genomics, immunity and gene expression profiles of cestode species. PMID:22514598
ESEA: Discovering the Dysregulated Pathways based on Edge Set Enrichment Analysis
Han, Junwei; Shi, Xinrui; Zhang, Yunpeng; Xu, Yanjun; Jiang, Ying; Zhang, Chunlong; Feng, Li; Yang, Haixiu; Shang, Desi; Sun, Zeguo; Su, Fei; Li, Chunquan; Li, Xia
2015-01-01
Pathway analyses are playing an increasingly important role in understanding biological mechanism, cellular function and disease states. Current pathway-identification methods generally focus on only the changes of gene expression levels; however, the biological relationships among genes are also the fundamental components of pathways, and the dysregulated relationships may also alter the pathway activities. We propose a powerful computational method, Edge Set Enrichment Analysis (ESEA), for the identification of dysregulated pathways. This provides a novel way of pathway analysis by investigating the changes of biological relationships of pathways in the context of gene expression data. Simulation studies illustrate the power and performance of ESEA under various simulated conditions. Using real datasets from p53 mutation, Type 2 diabetes and lung cancer, we validate effectiveness of ESEA in identifying dysregulated pathways. We further compare our results with five other pathway enrichment analysis methods. With these analyses, we show that ESEA is able to help uncover dysregulated biological pathways underlying complex traits and human diseases via specific use of the dysregulated biological relationships. We develop a freely available R-based tool of ESEA. Currently, ESEA can support pathway analysis of the seven public databases (KEGG; Reactome; Biocarta; NCI; SPIKE; HumanCyc; Panther). PMID:26267116
SoyFN: a knowledge database of soybean functional networks.
Xu, Yungang; Guo, Maozu; Liu, Xiaoyan; Wang, Chunyu; Liu, Yang
2014-01-01
Many databases for soybean genomic analysis have been built and made publicly available, but few of them contain knowledge specifically targeting the omics-level gene-gene, gene-microRNA (miRNA) and miRNA-miRNA interactions. Here, we present SoyFN, a knowledge database of soybean functional gene networks and miRNA functional networks. SoyFN provides user-friendly interfaces to retrieve, visualize, analyze and download the functional networks of soybean genes and miRNAs. In addition, it incorporates much information about KEGG pathways, gene ontology annotations and 3'-UTR sequences as well as many useful tools including SoySearch, ID mapping, Genome Browser, eFP Browser and promoter motif scan. SoyFN is a schema-free database that can be accessed as a Web service from any modern programming language using a simple Hypertext Transfer Protocol call. The Web site is implemented in Java, JavaScript, PHP, HTML and Apache, with all major browsers supported. We anticipate that this database will be useful for members of research communities both in soybean experimental science and bioinformatics. Database URL: http://nclab.hit.edu.cn/SoyFN.
Sivakumar, Subramaniam; Anitha, Palanivel; Ramesh, Balsubramanian; Suresh, Gopal
2017-01-01
Insecticides are toxic substances used to kill insects. Their use is believed to be one of the major factors behind the increase in agricultural productivity in the 20th century. Organophosphates are now the largest and most versatile class of insecticides in use, and Malathion is the predominant type. The accumulation of Malathion in the environment is the biggest threat because of its toxicity. Malathion is lethal to beneficial insects, snails, microcrustaceans, fish, birds, amphibians, and soil microorganisms. Chronic exposure of non-diabetic farmers to organophosphorus Malathion pesticides may induce insulin resistance, which might ultimately result in diabetes mellitus. Given the potential carcinogenic risk from pesticides, there is a serious need to develop remediation processes to eliminate or minimize contamination in the environment. Biodegradation could be a reliable and cost-effective technique for pesticide abatement. To date, no metabolic pathway for the degradation of the organophosphate pesticide Malathion has been predicted in the KEGG database or in any other pathway database. In the present study, an attempt was therefore made to predict the microbial biodegradation pathway of Malathion using bioinformatics tools. The study predicted a degradation pathway for Malathion and identified Streptomyces sp. and E. coli as capable of degrading Malathion according to the pathway prediction system. PMID:28584447
WholePathwayScope: a comprehensive pathway-based analysis tool for high-throughput data
Yi, Ming; Horton, Jay D; Cohen, Jonathan C; Hobbs, Helen H; Stephens, Robert M
2006-01-01
Background Analysis of High Throughput (HTP) Data such as microarray and proteomics data has provided a powerful methodology to study patterns of gene regulation at genome scale. A major unresolved problem in the post-genomic era is to assemble the large amounts of data generated into a meaningful biological context. We have developed a comprehensive software tool, WholePathwayScope (WPS), for deriving biological insights from analysis of HTP data. Result WPS extracts gene lists with shared biological themes through color cue templates. WPS statistically evaluates global functional category enrichment of gene lists and pathway-level pattern enrichment of data. WPS incorporates well-known biological pathways from KEGG (Kyoto Encyclopedia of Genes and Genomes) and Biocarta, GO (Gene Ontology) terms as well as user-defined pathways or relevant gene clusters or groups, and explores gene-term relationships within the derived gene-term association networks (GTANs). WPS simultaneously compares multiple datasets within biological contexts either as pathways or as association networks. WPS also integrates Genetic Association Database and Partial MedGene Database for disease-association information. We have used this program to analyze and compare microarray and proteomics datasets derived from a variety of biological systems. Application examples demonstrated the capacity of WPS to significantly facilitate the analysis of HTP data for integrative discovery. Conclusion This tool represents a pathway-based platform for discovery integration to maximize analysis power. The tool is freely available at . PMID:16423281
Luo, Jie; Shi, Ke; Yin, Shu-Ya; Tang, Rui-Xue; Chen, Wen-Jie; Huang, Lin-Zhen; Gan, Ting-Qing; Cai, Zheng-Wen; Chen, Gang
2018-04-10
MiR-182-5p, a member of the miRNA family, can be detected in lung cancer and plays an important role in the disease. This study aimed to explore the clinical value of miR-182-5p in lung squamous cell carcinoma (LUSC) and to unveil the molecular mechanism of LUSC. The clinical value of miR-182-5p in LUSC was investigated by collecting and calculating data from The Cancer Genome Atlas (TCGA) database, the Gene Expression Omnibus (GEO) database, and real-time quantitative polymerase chain reaction (RT-qPCR). Twelve prediction platforms were used to predict the target genes of miR-182-5p. Protein-protein interaction (PPI) networks and gene ontology (GO) and Kyoto Encyclopedia of Genes and Genomes (KEGG) analyses were used to explore the molecular mechanism of LUSC. MiR-182-5p was significantly over-expressed in LUSC compared with non-cancerous tissues, as evidenced by various approaches, including the TCGA database, GEO microarrays, RT-qPCR, and a comprehensive meta-analysis of 501 LUSC cases and 148 non-cancerous cases. Furthermore, a total of 81 potential target genes were chosen from the union of predicted genes and the TCGA database. GO and KEGG analyses demonstrated that the target genes are involved in pathways related to biological processes. PPIs revealed the relationships between these genes, with EPAS1, PRKCE, NR3C1, and RHOB being located in the center of the PPI network. MiR-182-5p upregulation greatly contributes to LUSC and may serve as a biomarker in LUSC.
The Pathway Coexpression Network: Revealing pathway relationships
Tanzi, Rudolph E.
2018-01-01
A goal of genomics is to understand the relationships between biological processes. Pathways contribute to functional interplay within biological processes through complex but poorly understood interactions. However, limited functional references for global pathway relationships exist. Pathways from databases such as KEGG and Reactome provide discrete annotations of biological processes. Their relationships are currently either inferred from gene set enrichment within specific experiments, or by simple overlap, linking pathway annotations that have genes in common. Here, we provide a unifying interpretation of functional interaction between pathways by systematically quantifying coexpression between 1,330 canonical pathways from the Molecular Signatures Database (MSigDB) to establish the Pathway Coexpression Network (PCxN). We estimated the correlation between canonical pathways valid in a broad context using a curated collection of 3,207 microarrays from 72 normal human tissues. PCxN accounts for shared genes between annotations to estimate significant correlations between pathways with related functions rather than with similar annotations. We demonstrate that PCxN provides novel insight into mechanisms of complex diseases using an Alzheimer’s Disease (AD) case study. PCxN retrieved pathways significantly correlated with an expert curated AD gene list. These pathways have known associations with AD and were significantly enriched for genes independently associated with AD. As a further step, we show how PCxN complements the results of gene set enrichment methods by revealing relationships between enriched pathways, and by identifying additional highly correlated pathways. PCxN revealed that correlated pathways from an AD expression profiling study include functional clusters involved in cell adhesion and oxidative stress. PCxN provides expanded connections to pathways from the extracellular matrix. 
PCxN provides a powerful new framework for interrogation of global pathway relationships. Comprehensive exploration of PCxN can be performed at http://pcxn.org/. PMID:29554099
Chai, Hui; Yan, Zhaoyuan; Huang, Ke; Jiang, Yuanqing; Zhang, Lin
2018-02-01
This study aimed to systematically investigate the relationship between miRNA expression and the occurrence of ventricular septal defect (VSD), and characterize the miRNA target genes and pathways that can lead to VSD. The miRNAs that were differentially expressed in blood samples from VSD and normal infants were screened and validated by implementing miRNA microarrays and qRT-PCR. The target genes regulated by differentially expressed miRNAs were predicted using three target gene databases. The functions and signaling pathways of the target genes were enriched using the GO database and KEGG database, respectively. The transcription and protein expression of specific target genes in critical pathways were compared in the VSD and normal control groups using qRT-PCR and western blotting, respectively. Compared with the normal control group, the VSD group had 22 differentially expressed miRNAs; 19 were downregulated and three were upregulated. The 10,677 predicted target genes participated in many biological functions related to cardiac development and morphogenesis. Four target genes (mGLUR, Gq, PLC, and PKC) were involved in the PKC pathway and four (ECM, FAK, PI3K, and PDK1) were involved in the PI3K-Akt pathway. The transcription and protein expression of these eight target genes were significantly upregulated in the VSD group. The 22 miRNAs that were dysregulated in the VSD group were mainly downregulated, which may result in the dysregulation of several key genes and biological functions related to cardiac development. These effects could also be exerted via the upregulation of eight specific target genes, the subsequent over-activation of the PKC and PI3K-Akt pathways, and the eventual abnormal cardiac development and VSD.
Informatics approaches in the Biological Characterization of ...
Adverse Outcome Pathways (AOPs) are a conceptual framework to characterize toxicity pathways by a series of mechanistic steps from a molecular initiating event to population outcomes. This framework helps to direct risk assessment research, for example by aiding in computational prioritization of chemicals, genes, and tissues relevant to an adverse health outcome. We have designed and implemented a computational workflow to access a wealth of public data relating genes, chemicals, diseases, pathways, and species, to provide a biological context for putative AOPs. We selected three AOP case studies: ER/Aromatase Antagonism Leading to Reproductive Dysfunction, AHR1 Activation Leading to Cardiotoxicity, and AChE Inhibition Leading to Acute Mortality, and deduced a taxonomic range of applicability for each AOP. We developed computational tools to automatically access and analyze the pathway activity of AOP-relevant protein orthologs, finding broad similarity among vertebrate species for the ER/Aromatase and AHR1 AOPs, and similarity extending to invertebrate animal species for AChE inhibition. Additionally, we used public gene expression data to find groups of highly co-expressed genes, and compared those groups across organisms. To interpret these findings at a higher level of biological organization, we created the AOPdb, a relational database that mines results from sources including NCBI, KEGG, Reactome, CTD, and OMIM. This multi-source database connects genes,
Min, Li; Cheng, Jianbo; Zhao, Shengguo; Tian, He; Zhang, Yangdong; Li, Songli; Yang, Hongjian; Zheng, Nan; Wang, Jiaqi
2016-09-02
Heat stress (HS) has an enormous economic impact on the dairy industry. In recent years, many researchers have investigated changes in the gene expression and metabolomics profiles in dairy cows caused by HS. However, the proteomics profiles of heat-stressed dairy cows have not yet been completely elucidated. We compared plasma proteomics from HS-free and heat-stressed dairy cows using an iTRAQ labeling approach. After the depletion of high abundant proteins in the plasma, 1472 proteins were identified. Of these, 85 proteins were differentially abundant in cows exposed to HS relative to HS-free. Database searches combined with GO and KEGG pathway enrichment analyses revealed that many components of the complement and coagulation cascades were altered in heat-stressed cows compared with HS-free cows. Of these, many factors in the complement system (including complement components C1, C3, C5, C6, C7, C8, and C9, complement factor B, and factor H) were down-regulated by HS, while components of the coagulation system (including coagulation factors, vitamin K-dependent proteins, and fibrinogens) were up-regulated by HS. In conclusion, our results indicate that HS decreases plasma levels of complement system proteins, suggesting that immune function is impaired in dairy cows exposed to HS. Though many aspects of heat stress (HS) have been extensively researched, relatively little is known about the proteomics profile changes that occur during heat exposure. In this work, we employed a proteomics approach to investigate differential abundance of plasma proteins in HS-free and heat-stressed dairy cows. Database searches combined with GO and KEGG pathway enrichment analyses revealed that HS resulted in a decrease in complement components, suggesting that heat-stressed dairy cows have impaired immune function. 
In addition, through integrative analyses of proteomics and previous metabolomics, we showed enhanced glycolysis, lipid metabolic pathway shifts, and nitrogen repartitioning in dairy cows exposed to HS. Our findings expand our current knowledge on the effects of HS on plasma proteomics in dairy cows and offer a new perspective for future research. Copyright © 2016 Elsevier B.V. All rights reserved.
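As an illustration of the KEGG pathway enrichment step mentioned in the abstract above, the following is a minimal sketch of an over-representation test based on the hypergeometric distribution. The abstract does not say which test or software the authors used, so this is only one common choice. The function name `enrichment_p_value` and the pathway size and hit count in the example are hypothetical; only the totals of 1472 identified and 85 differentially abundant proteins come from the abstract.

```python
from math import comb


def enrichment_p_value(population, pathway_size, sample, hits):
    """Upper-tail hypergeometric probability P(X >= hits).

    population   -- number of background proteins (or genes)
    pathway_size -- background members annotated to the pathway
    sample       -- number of differentially abundant members
    hits         -- differentially abundant members that fall in the pathway
    """
    upper = min(pathway_size, sample)
    total = comb(population, sample)
    # comb() returns 0 for impossible draws, so no extra bounds checks are needed
    tail = sum(
        comb(pathway_size, i) * comb(population - pathway_size, sample - i)
        for i in range(hits, upper + 1)
    )
    return tail / total


if __name__ == "__main__":
    # 1472 and 85 are taken from the abstract; 60 and 12 are made-up values
    p = enrichment_p_value(population=1472, pathway_size=60, sample=85, hits=12)
    print(f"hypothetical enrichment p-value: {p:.3g}")
```

In practice the p-values for all tested pathways would also be corrected for multiple testing, which this sketch leaves out.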
Rama Reddy, Nagaraja Reddy; Mehta, Rucha Harishbhai; Soni, Palak Harendrabhai; Makasana, Jayanti; Gajbhiye, Narendra Athamaram; Ponnuchamy, Manivel; Kumar, Jitendra
2015-01-01
Senna (Cassia angustifolia Vahl.) is one of the world's natural laxative medicinal plants. Its laxative properties are due to sennosides, which are anthraquinone glycoside natural products. However, little genetic information is available for this species, especially concerning the biosynthetic pathways of sennosides. We present here the transcriptome sequencing of young and mature leaf tissue of Cassia angustifolia using the Illumina MiSeq platform, which resulted in a total of 6.34 Gb of raw nucleotide sequence. The sequence assembly resulted in 42230 and 37174 transcripts with an average length of 1119 bp and 1467 bp for young and mature leaf, respectively. The transcripts were annotated using NCBI BLAST with the ‘green plant database (txid 33090)’, Swiss Prot, Kyoto Encyclopedia of Genes & Genomes (KEGG), Cluster of Orthologous Gene (COG) and Gene Ontology (GO). Out of the total transcripts, 40138 (95.0%) and 36349 (97.7%) from young and mature leaf, respectively, were annotated by BLASTX against the green plant database of NCBI. We used InterProScan to assess protein similarity at the domain level; a total of 34031 (young leaf) and 32077 (mature leaf) transcripts were annotated against the Pfam domains. All transcripts from young and mature leaf were assigned to 191 KEGG pathways. There were 166 and 159 CDS, respectively, from young and mature leaf involved in the metabolism of terpenoids and polyketides. Many CDS encoding enzymes leading to the biosynthesis of sennosides were identified. A total of 10,763 CDS were differentially expressed between the young and mature leaf libraries, of which 2,343 (21.7%) were up-regulated in young compared to mature leaf. Several differentially expressed genes were found to be functionally associated with sennoside biosynthesis. CDS encoding many CYPs and TF families were identified as having probable roles in the metabolism of primary as well as secondary metabolites. We developed SSR markers for molecular breeding of senna. 
We have identified a set of putative genes involved in various secondary metabolite pathways, especially those related to the synthesis of sennosides which will serve as an important platform for public information about gene expression, genomics, and functional genomics in senna. PMID:26098898
Park, So Young; Patnaik, Bharat Bhusan; Kang, Se Won; Hwang, Hee-Ju; Chung, Jong Min; Song, Dae Kwon; Sang, Min Kyu; Patnaik, Hongray Howrelia; Lee, Jae Bong; Noh, Mi Young; Kim, Changmu; Kim, Soonok; Park, Hong Seog; Lee, Jun Sang; Han, Yeon Soo; Lee, Yong Seok
2016-01-01
An aquatic gastropod belonging to the family Neritidae, Clithon retropictus is listed as an endangered class II species in South Korea. The lack of information on its genomic background limits the ability to obtain functional data resources and inhibits informed conservation planning for this species. In the present study, the transcriptomic sequencing and de novo assembly of C. retropictus generated a total of 241,696,750 high-quality reads. These assembled to 282,838 unigenes with mean and N50 lengths of 736.9 and 1201 base pairs, respectively. Of these, 125,616 unigenes were subjected to annotation analysis with known proteins in Protostome DB, COG, GO, and KEGG protein databases (BLASTX; E ≤ 0.00001) and with known nucleotides in the Unigene database (BLASTN; E ≤ 0.00001). The GO analysis indicated that cellular process, cell, and catalytic activity are the predominant GO terms in the biological process, cellular component, and molecular function categories, respectively. In addition, 2093 unigenes were distributed in 107 different KEGG pathways. Furthermore, 49,280 simple sequence repeats were identified in the unigenes (>1 kilobase sequences). This is the first report on the identification of transcriptomic and microsatellite resources for C. retropictus, which opens up the possibility of exploring traits related to the adaptation and acclimatization of this species. PMID:27455329
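The N50 length reported for assemblies such as the one above is the length L for which sequences of length L or longer contain at least half of the total assembled bases. A minimal sketch of that calculation follows; the function name `n50` and the example lengths are illustrative and are not taken from any of the studies listed here.

```python
def n50(lengths):
    """Return the N50 of a collection of sequence lengths.

    Sequences are visited from longest to shortest, and the length at which
    the running total first reaches half of all bases is returned.
    """
    ordered = sorted(lengths, reverse=True)
    half = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half:
            return length
    raise ValueError("lengths must be non-empty")


if __name__ == "__main__":
    # Toy unigene lengths in base pairs (made-up values)
    print(n50([2, 3, 4, 5, 6, 7, 8, 9, 10]))
```

Because long sequences dominate the running total, N50 is usually larger than the mean length, as in the abstract above (N50 of 1201 bp against a mean of 736.9 bp).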
He, Bangxiang; Hou, Lulu; Dong, Manman; Shi, Jiawei; Huang, Xiaoyun; Ding, Yating; Cong, Xiaomei; Zhang, Feng; Zhang, Xuecheng; Zang, Xiaonan
2018-01-07
Haematococcus pluvialis is a commercial microalga that produces abundant levels of astaxanthin under stress conditions. Acetate and Fe2+ are reported to be important for astaxanthin accumulation in H. pluvialis. In order to study the synergistic effects of high light stress and these two factors, we obtained transcriptomes for four groups: high light irradiation (HL), addition of 25 mM acetate under high light (HA), addition of 20 μM Fe2+ under high light (HF) and normal green growing cells (HG). Among the total clean reads of the four groups, 156,992 unigenes were found, of which 48.88% were annotated in at least one database (Nr, Nt, Pfam, KOG/COG, SwissProt, KEGG, GO). The statistics for DEGs (differentially expressed genes) showed that there were more than 10 thousand DEGs caused by high light and 1800-1900 DEGs caused by acetate or Fe2+. The results of DEG analysis by GO and KEGG enrichments showed that, under the high light condition, the expression of genes related to many pathways had changed, such as the pathway for carotenoid biosynthesis, fatty acid elongation, photosynthesis-antenna proteins, carbon fixation in photosynthetic organisms and so on. Addition of acetate under high light significantly promoted the expression of key genes related to the pathways for carotenoid biosynthesis and fatty acid elongation. Furthermore, acetate could obviously inhibit the expression of genes related to the pathway for photosynthesis-antenna proteins. For addition of Fe2+, the genes related to photosynthesis-antenna proteins were promoted significantly and there was no obvious change in the gene expressions related to carotenoid and fatty acid synthesis.
Ren, Yipeng; Xue, Junli; Yang, Huanhuan; Pan, Baoping; Bu, Wenjun
2017-05-01
The Manila clam, Ruditapes philippinarum, is one of the most economically important aquatic clams that are harvested on a large scale by the mariculture industry in China. However, increasing reports of bacterial pathogenic diseases have had a negative effect on the aquaculture industry of R. philippinarum. In the present study, the two transcriptome libraries of untreated (termed H) and challenged Vibrio anguillarum (termed HV) hepatopancreas were constructed and sequenced from Manila clam using an Illumina-based paired-end sequencing platform. In total, 75,302,886 and 66,578,976 high-quality clean reads were assembled from 101,080,746 and 99,673,538 raw data points from the two transcriptome libraries described above, respectively. Furthermore, 156,116 unigenes were generated from 210,685 transcripts, with an N50 length of 1125 bp, and from the annotated SwissProt, NR, NT, KO, GO, KOG and KEGG databases. Moreover, a total of 4071 differentially expressed unigenes (HV vs H) were detected, including 903 up-regulated and 3168 down-regulated genes. Among these differentially expressed unigenes, 226 unigenes were annotated using KEGG annotation in 16 immune-related signaling pathways, including Toll-like receptor, NF-kappa B, MAPK, NOD-like receptor, RIG-I-like receptor, and the TNF and chemokine signaling pathways. Finally, 20,341 simple sequence repeats (SSRs) and 214,430 potential single nucleotide polymorphisms (SNPs) were detected from the H and HV transcriptome libraries. In conclusion, these studies identified many candidate immune-related genes and signaling pathways and conducted a comparative analysis of the differentially expressed unigenes from Manila clam hepatopancreas in response to V. anguillarum stimulation. These data laid the foundation for studying the innate immune systems and defense mechanisms in R. philippinarum. Copyright © 2017 Elsevier Ltd. All rights reserved.
Chen, Lei; Zhang, Yu-Hang; Zheng, Mingyue; Huang, Tao; Cai, Yu-Dong
2016-12-01
Compound-protein interactions play important roles in every cell via the recognition and regulation of specific functional proteins. The correct identification of compound-protein interactions can lead to a good comprehension of this complicated system and provide useful input for the investigation of various attributes of compounds and proteins. In this study, we attempted to understand this system by extracting properties from both proteins and compounds, in which proteins were represented by gene ontology and KEGG pathway enrichment scores and compounds were represented by molecular fragments. Advanced feature selection methods, including minimum redundancy maximum relevance, incremental feature selection, and the basic machine learning algorithm random forest, were used to analyze these properties and extract core factors for the determination of actual compound-protein interactions. Compound-protein interactions reported in The Binding Databases were used as positive samples. To improve the reliability of the results, the analytic procedure was executed five times using different negative samples. Simultaneously, five optimal prediction methods based on a random forest and yielding maximum MCCs of approximately 77.55 % were constructed and may be useful tools for the prediction of compound-protein interactions. This work provides new clues to understanding the system of compound-protein interactions by analyzing extracted core features. Our results indicate that compound-protein interactions are related to biological processes involving immune, developmental and hormone-associated pathways.
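The MCC values quoted above refer to the Matthews correlation coefficient, which summarises a binary confusion matrix as a single number between -1 and 1. A small sketch of the standard formula is given below; the function name `matthews_corrcoef_from_counts` and the counts in the example are hypothetical, and returning 0.0 when a marginal total is zero is a common convention rather than something stated in the abstract.

```python
from math import sqrt


def matthews_corrcoef_from_counts(tp, tn, fp, fn):
    """Matthews correlation coefficient from confusion-matrix counts.

    MCC = (TP*TN - FP*FN) / sqrt((TP+FP)(TP+FN)(TN+FP)(TN+FN))
    """
    denominator = (tp + fp) * (tp + fn) * (tn + fp) * (tn + fn)
    if denominator == 0:
        # Convention: undefined cases (an empty row or column) are reported as 0
        return 0.0
    return (tp * tn - fp * fn) / sqrt(denominator)


if __name__ == "__main__":
    # Made-up counts for a compound-protein interaction classifier
    print(round(matthews_corrcoef_from_counts(tp=6, tn=3, fp=1, fn=2), 4))
```

Unlike plain accuracy, MCC stays informative when positive and negative samples are unbalanced, which matters when negative samples are drawn repeatedly as described in the abstract.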
Yuan, Can; Peng, Fang; Yang, Ze-Mao; Zhong, Wen-Juan; Mou, Fang-Sheng; Gong, Yi-Yun; Ji, Pei-Cheng; Pu, De-Qiang; Huang, Hai-Yan; Yang, Xiao; Zhang, Chao
2017-09-01
Ligusticum chuanxiong is a well-known traditional Chinese medicinal plant, and the development of molecular markers and the study of its germplasm resources are very important. In this study, we obtained 24,422 unigenes by assembling transcriptome sequencing reads from L. chuanxiong root. EST-SSRs were detected, and 4,073 SSR loci were identified. Analysis of EST-SSR distribution and characteristics showed that mono-nucleotide repeats were the main repeat type, accounting for 41.0%. In addition, the sequences containing SSRs were functionally annotated against Gene Ontology (GO) and KEGG pathways and were assigned to 49 GO categories and 242 KEGG pathways; among them, 2,201 sequences were annotated against the Nr database. Of 235 EST-SSRs validated, 74 primer pairs were ultimately shown to give high-quality amplification. Subsequently, genetic diversity analysis, UPGMA cluster analysis, PCoA and population structure analysis of 34 L. chuanxiong germplasm resources were carried out with the 74 primer pairs. In both the UPGMA tree and the PCoA results, the L. chuanxiong resources clustered into two groups, which are believed to be partially related to their geographical distribution. In this study, EST-SSRs in L. chuanxiong were identified for the first time, and the newly developed molecular markers should contribute significantly to further genetic diversity studies, purity detection, gene mapping, and molecular breeding. Copyright© by the Chinese Pharmaceutical Association.
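To make the SSR detection step above concrete, here is a rough sketch of how mono- to hexa-nucleotide repeats can be located in a sequence with regular expressions. It is not the tool used in the study, which the abstract does not name. The minimum repeat counts in `MIN_REPEATS`, the function names and the example sequence are all assumptions for illustration, and the scan is a simplification that can miss a repeat starting immediately inside a longer homopolymer run.

```python
import re

# Minimum number of repeat units per motif length (assumed thresholds)
MIN_REPEATS = {1: 10, 2: 6, 3: 5, 4: 5, 5: 5, 6: 5}


def is_primitive(motif):
    """True if the motif is not itself a repetition of a shorter unit."""
    n = len(motif)
    return not any(
        n % d == 0 and motif == motif[:d] * (n // d) for d in range(1, n)
    )


def find_ssrs(seq):
    """Return sorted (start, motif, repeat_count) tuples for SSRs in seq."""
    seq = seq.upper()
    hits = []
    for k, min_rep in MIN_REPEATS.items():
        # A k-mer followed by at least (min_rep - 1) further copies of itself
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (k, min_rep - 1))
        for match in pattern.finditer(seq):
            motif = match.group(1)
            # Skip e.g. "AA" as a di-nucleotide motif; it is counted as mono
            if is_primitive(motif):
                hits.append((match.start(), motif, len(match.group(0)) // k))
    return sorted(hits)


if __name__ == "__main__":
    example = "GG" + "A" * 12 + "CT" + "AG" * 7 + "T"  # made-up sequence
    for start, motif, count in find_ssrs(example):
        print(start, motif, count)
```

Counting the hits by motif length would give the kind of repeat-type breakdown reported in the abstract, such as the share of mono-nucleotide repeats.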
Yan, Guoyong; Zhang, Gen; Huang, Jiaomei; Lan, Yi; Sun, Jin; Zeng, Cong; Wang, Yong; Qian, Pei-Yuan; He, Lisheng
2017-10-27
Megabalanus barnacle is one of the model organisms for marine biofouling research. However, further elucidation of molecular mechanisms underlying larval settlement has been hindered due to the lack of genomic information thus far. In the present study, cDNA libraries were constructed for cyprids, the key stage for larval settlement, and adults of Megabalanus volcano . After high-throughput sequencing and de novo assembly, 42,620 unigenes were obtained with a N50 value of 1532 bp. These unigenes were annotated by blasting against the NCBI non-redundant (nr), Swiss-Prot, Cluster of Orthologous Groups (COG), and Kyoto Encyclopedia of Genes and Genomes (KEGG) databases. Finally, 19,522, 15,691, 14,459, and 10,914 unigenes were identified correspondingly. There were 22,158 differentially expressed genes (DEGs) identified between two stages. Compared with the cyprid stage, 8241 unigenes were down-regulated and 13,917 unigenes were up-regulated at the adult stage. The neuroactive ligand-receptor interaction pathway (ko04080) was significantly enriched by KEGG enrichment analysis of the DEGs, suggesting that it possibly involved in larval settlement. Potential functions of three conserved allatostatin neuropeptide-receptor pairs and two light-sensitive opsin proteins were further characterized, indicating that they might regulate attachment and metamorphosis at cyprid stage. These results provided a deeper insight into the molecular mechanisms underlying larval settlement of barnacles.
Yan, Guoyong; Huang, Jiaomei; Lan, Yi; Zeng, Cong; Wang, Yong; Qian, Pei-Yuan; He, Lisheng
2017-01-01
Megabalanus barnacle is one of the model organisms for marine biofouling research. However, further elucidation of the molecular mechanisms underlying larval settlement has so far been hindered by the lack of genomic information. In the present study, cDNA libraries were constructed for cyprids, the key stage for larval settlement, and adults of Megabalanus volcano. After high-throughput sequencing and de novo assembly, 42,620 unigenes were obtained with an N50 value of 1532 bp. These unigenes were annotated by BLAST searches against the NCBI non-redundant (nr), Swiss-Prot, Cluster of Orthologous Groups (COG), and Kyoto Encyclopedia of Genes and Genomes (KEGG) databases, in which 19,522, 15,691, 14,459, and 10,914 unigenes were identified, respectively. There were 22,158 differentially expressed genes (DEGs) identified between the two stages. Compared with the cyprid stage, 8241 unigenes were down-regulated and 13,917 unigenes were up-regulated at the adult stage. The neuroactive ligand-receptor interaction pathway (ko04080) was significantly enriched in the KEGG enrichment analysis of the DEGs, suggesting that it is possibly involved in larval settlement. Potential functions of three conserved allatostatin neuropeptide-receptor pairs and two light-sensitive opsin proteins were further characterized, indicating that they might regulate attachment and metamorphosis at the cyprid stage. These results provide deeper insight into the molecular mechanisms underlying larval settlement of barnacles. PMID:29077039
Yang, Yi; Maxwell, Andrew; Zhang, Xiaowei; Wang, Nan; Perkins, Edward J; Zhang, Chaoyang; Gong, Ping
2013-01-01
Pathway alterations reflected as changes in gene expression regulation and gene interaction can result from cellular exposure to toxicants. Such information is often used to elucidate toxicological modes of action. From a risk assessment perspective, alterations in biological pathways are a rich resource for setting toxicant thresholds, which may be more sensitive and mechanism-informed than traditional toxicity endpoints. Here we developed a novel differential networks (DNs) approach to connect pathway perturbation with toxicity threshold setting. Our DNs approach consists of 6 steps: time-series gene expression data collection, identification of altered genes, gene interaction network reconstruction, differential edge inference, mapping of genes with differential edges to pathways, and establishment of causal relationships between chemical concentration and perturbed pathways. A one-sample Gaussian process model and a linear regression model were used to identify genes that exhibited significant profile changes across an entire time course and between treatments, respectively. Interaction networks of differentially expressed (DE) genes were reconstructed for different treatments using a state space model and then compared to infer differential edges/interactions. DE genes possessing differential edges were mapped to biological pathways in databases such as KEGG pathways. Using the DNs approach, we analyzed a time-series Escherichia coli live cell gene expression dataset consisting of 4 treatments (control, 10, 100, 1000 mg/L naphthenic acids, NAs) and 18 time points. Through comparison of reconstructed networks and construction of differential networks, 80 genes were identified as DE genes with a significant number of differential edges, and 22 KEGG pathways were altered in a concentration-dependent manner. Some of these pathways were perturbed to a degree as high as 70% even at the lowest exposure concentration, implying a high sensitivity of our DNs approach. 
Findings from this proof-of-concept study suggest that our approach has a great potential in providing a novel and sensitive tool for threshold setting in chemical risk assessment. In future work, we plan to analyze more time-series datasets with a full spectrum of concentrations and sufficient replications per treatment. The pathway alteration-derived thresholds will also be compared with those derived from apical endpoints such as cell growth rate.
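The differential edge inference step of the DNs approach can be pictured as a comparison of edge sets between two reconstructed networks. The sketch below assumes each network has already been reduced to a list of undirected gene pairs; the state space model used for reconstruction is not reproduced here, and the function names and gene labels are hypothetical.

```python
def differential_edges(control_edges, treated_edges):
    """Return edges present in exactly one of the two networks.

    Edges are treated as undirected (an assumption of this sketch), so
    ("a", "b") and ("b", "a") count as the same interaction.
    """
    control = {frozenset(edge) for edge in control_edges}
    treated = {frozenset(edge) for edge in treated_edges}
    return control ^ treated  # symmetric difference


def differential_edge_counts(diff_edges):
    """Count how many differential edges touch each gene."""
    counts = {}
    for edge in diff_edges:
        for gene in edge:
            counts[gene] = counts.get(gene, 0) + 1
    return counts


control = [("geneA", "geneB"), ("geneB", "geneC")]
treated = [("geneB", "geneA"), ("geneC", "geneD")]
diff = differential_edges(control, treated)
print(sorted(differential_edge_counts(diff).items()))
```

Genes with many differential edges would then be the candidates mapped to KEGG pathways; how "a significant number" is decided is not specified in the abstract.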
Identification of key microRNAs and genes in preeclampsia by bioinformatics analysis
Luo, Shouling; Cao, Nannan; Tang, Yao; Gu, Weirong
2017-01-01
Preeclampsia is a leading cause of perinatal maternal–foetal mortality and morbidity. The aim of this study is to identify the key microRNAs and genes in preeclampsia and uncover their potential functions. We downloaded the miRNA expression profile of GSE84260 and the gene expression profile of GSE73374 from the Gene Expression Omnibus database. Differentially expressed miRNAs and genes were identified and compared to miRNA-target information from MiRWalk 2.0, and a total of 65 differentially expressed miRNAs (DEMIs), including 32 up-regulated miRNAs and 33 down-regulated miRNAs, and 91 differentially expressed genes (DEGs), including 83 up-regulated genes and 8 down-regulated genes, were identified. The pathway enrichment analyses of the DEMIs showed that the up-regulated DEMIs were enriched in the Hippo signalling pathway and MAPK signalling pathway, and the down-regulated DEMIs were enriched in HTLV-I infection and miRNAs in cancers. The gene ontology (GO) and Kyoto Encyclopedia of Genes and Genomes pathway (KEGG) enrichment analyses of the DEGs were performed using Multifaceted Analysis Tool for Human Transcriptome. The up-regulated DEGs were enriched in biological processes (BPs), including the response to cAMP, response to hydrogen peroxide and cell-cell adhesion mediated by integrin; no enrichment of down-regulated DEGs was identified. KEGG analysis showed that the up-regulated DEGs were enriched in the Hippo signalling pathway and pathways in cancer. A PPI network of the DEGs was constructed by using Cytoscape software, and FOS, STAT1, MMP14, ITGB1, VCAN, DUSP1, LDHA, MCL1, MET, and ZFP36 were identified as the hub genes. The current study illustrates a characteristic microRNA profile and gene profile in preeclampsia, which may contribute to the interpretation of the progression of preeclampsia and provide novel biomarkers and therapeutic targets for preeclampsia. PMID:28594854
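Pathway enrichment of a DEG list, as reported here for KEGG, is commonly scored with a one-sided hypergeometric test. The abstract does not say which statistic the enrichment tool applied, so the following is a sketch of that common choice only; the function name and all counts in the usage line are hypothetical.

```python
from math import comb


def enrichment_p_value(n_background, n_pathway, n_selected, n_overlap):
    """One-sided hypergeometric p-value for pathway over-representation.

    n_background: genes in the background set
    n_pathway:    background genes annotated to the pathway
    n_selected:   genes in the selected list (e.g. DEGs)
    n_overlap:    selected genes that are in the pathway
    Returns P(X >= n_overlap) when n_selected genes are drawn at random.
    """
    upper = min(n_pathway, n_selected)
    total = comb(n_background, n_selected)
    tail = sum(
        comb(n_pathway, k) * comb(n_background - n_pathway, n_selected - k)
        for k in range(n_overlap, upper + 1)
    )
    return tail / total


# Hypothetical counts: 20,000 background genes, 150 in the pathway,
# 90 selected genes, 6 of which fall in the pathway.
print(enrichment_p_value(20000, 150, 90, 6))
```

In practice the p-values for all tested pathways would also be corrected for multiple testing.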
Dou, Yun-De; Huang, Tao; Wang, Qun; Shu, Xin; Zhao, Shi-Gang; Li, Lei; Liu, Tao; Lu, Gang; Chan, Wai-Yee; Liu, Hong-Bin
2018-01-29
This study characterizes the genetic landscape of familial ovarian cancer through integrated analysis of microRNA and mRNA by partial least squares (PLS) and Monte Carlo techniques based on genome-wide association studies (GWAS). The miRNA and mRNA transcriptional data in familial ovarian cancer were obtained from the Gene Expression Omnibus (GEO) database. The miRNA and mRNA expression profiles in peripheral blood lymphocytes (PBLs) of 74 familial ovarian cancer patients and 47 control subjects were analyzed by integrating partial least squares (PLS) and Monte Carlo techniques. Gene Ontology (GO) and Kyoto Encyclopedia of Genes and Genomes (KEGG) pathway enrichment analyses were also performed. A total of 16 miRNA-mRNA pairs were identified from the target gene prediction results for the miRNAs and mRNAs. A novel integrated miRNA-mRNA network was constructed that included 6 downregulated miRNAs and 1 upregulated miRNA. KEGG and GO pathway enrichment analysis revealed over-representation of dysregulated miRNAs in various biological processes, especially in cancer pathology. Hsa-miR-34b played a pivotal role in this network and interacted with other miRNAs. Hsa-miR-136 and hsa-miR-335 were associated with the p53 and Erk1/2 pathways and with tumor suppressors such as PTEN. The results of this research provide insights into miRNA-mRNA networks and offer new tools for studying transcriptional variants in familial ovarian cancer. Copyright © 2018 Elsevier Inc. All rights reserved.
Pathway-Based Genome-Wide Association Studies for Two Meat Production Traits in Simmental Cattle.
Fan, Huizhong; Wu, Yang; Zhou, Xiaojing; Xia, Jiangwei; Zhang, Wengang; Song, Yuxin; Liu, Fei; Chen, Yan; Zhang, Lupei; Gao, Xue; Gao, Huijiang; Li, Junya
2015-12-17
Most single nucleotide polymorphisms (SNPs) detected by genome-wide association studies (GWAS) explain only a small fraction of phenotypic variation. Pathway-based GWAS were proposed to improve the proportion of genes for some human complex traits that could be explained by enriching a mass of SNPs within genetic groups. However, few attempts have been made to apply them to quantitative traits in domestic animals. In this study, we used a dataset with approximately 7,700,000 SNPs from 807 Simmental cattle and analyzed live weight and longissimus muscle area using a modified pathway-based GWAS method that orthogonalises the highly linked SNPs within each gene using principal component analysis (PCA). As a result, of the 262 biological pathways of cattle collected from the KEGG database, the gamma aminobutyric acid (GABA)ergic synapse pathway and the non-alcoholic fatty liver disease (NAFLD) pathway were significantly associated with the two traits analyzed. The GABAergic synapse pathway was biologically relevant to the traits analyzed because of its roles in feed intake and weight gain. The proposed method had high statistical power and a low false discovery rate compared with the smallest P-value and SNP set enrichment analysis methods.
Yang, Wei; Chen, Huapu; Cui, Xuefan; Zhang, Kewei; Jiang, Dongneng; Deng, Siping; Zhu, Chunhua; Li, Guangli
2017-09-01
Spotted scat (Scatophagus argus) is an economically important farmed fish, particularly in East and Southeast Asia. Because there has been little research on reproductive development and regulation in this species, the lack of a mature artificial reproduction technology remains a barrier for the sustainable development of the aquaculture industry. More genetic and genomic background knowledge is urgently needed for an in-depth understanding of the molecular mechanism of reproductive process and identification of functional genes related to sexual differentiation, gonad maturation and gametogenesis. For these reasons, we performed transcriptomic analysis on spotted scat using a multiple tissue sample mixing strategy. The Illumina RNA sequencing generated 118 510 486 raw reads. After trimming, de novo assembly was performed and yielded 99 888 unigenes with an average length of 905.75 bp. A total of 45 015 unigenes were successfully annotated to the Nr, Swiss-Prot, KOG and KEGG databases. Additionally, 23 783 and 27 183 annotated unigenes were assigned to 56 Gene Ontology (GO) functional groups and 228 KEGG pathways, respectively. Subsequently, 2 474 transcripts associated with reproduction were selected using GO term and KEGG pathway assignments, and a number of reproduction-related genes involved in sex differentiation, gonad development and gametogenesis were identified. Furthermore, 22 279 simple sequence repeat (SSR) loci were discovered and characterized. The comprehensive transcript dataset described here greatly increases the genetic information available for spotted scat and contributes valuable sequence resources for functional gene mining and analysis. Candidate transcripts involved in reproduction would make good starting points for future studies on reproductive mechanisms, and the putative sex differentiation-related genes will be helpful for sex-determining gene identification and sex-specific marker isolation. 
Lastly, the SSRs can serve as marker resources for future research into genetics, marker-assisted selection (MAS) and conservation biology.
A combined computational-experimental analyses of selected metabolic enzymes in Pseudomonas species.
Perumal, Deepak; Lim, Chu Sing; Chow, Vincent T K; Sakharkar, Kishore R; Sakharkar, Meena K
2008-09-10
Comparative genomic analysis has revolutionized our ability to predict the metabolic subsystems that occur in newly sequenced genomes, and to explore the functional roles of the set of genes within each subsystem. These computational predictions can considerably reduce the volume of experimental studies required to assess basic metabolic properties of multiple bacterial species. However, experimental validations are still required to resolve the apparent inconsistencies in the predictions by multiple resources. Here, we present combined computational-experimental analyses on eight completely sequenced Pseudomonas species. Comparative pathway analyses reveal that several pathways within the Pseudomonas species show high plasticity and versatility. Potential bypasses in 11 metabolic pathways were identified. We further confirmed the presence of the enzyme O-acetyl homoserine (thiol) lyase (EC: 2.5.1.49) in P. syringae pv. tomato that revealed inconsistent annotations in KEGG and in the recently published SYSTOMONAS database. These analyses connect and integrate systematic data generation, computational data interpretation, and experimental validation and represent a synergistic and powerful means for conducting biological research.
Comparative transcriptome analysis of microsclerotia development in Nomuraea rileyi
2013-01-01
Background Nomuraea rileyi is used as an environmentally friendly biopesticide. However, mass production and commercialization of this organism are limited due to its fastidious growth and sporulation requirements. When cultured in amended medium, we found that N. rileyi could produce microsclerotia bodies, replacing conidiophores as the infectious agent. However, little is known about the genes involved in microsclerotia development. In the present study, the transcriptomes were analyzed using next-generation sequencing technology to find the genes involved in microsclerotia development. Results A total of 4.69 Gb of clean nucleotides comprising 32,061 sequences was obtained, and 20,919 sequences were annotated (about 65%). Among the annotated sequences, only 5928 were annotated with 34 gene ontology (GO) functional categories, and 12,778 sequences were mapped to 165 pathways by searching against the Kyoto Encyclopedia of Genes and Genomes (KEGG) pathway database. Furthermore, we assessed the transcriptomic differences between cultures grown in minimal and amended medium. In total, 4808 sequences were found to be differentially expressed; 719 differentially expressed unigenes were assigned to 25 GO classes and 1888 differentially expressed unigenes were assigned to 161 KEGG pathways, including 25 enriched pathways. Subsequently, we examined the genes that were up-regulated or uniquely expressed following amended medium treatment and that also mapped to the enriched pathways, and found that most of them participated in mediating oxidative stress homeostasis. To elucidate the role of oxidative stress in microsclerotia development, we analyzed the diversification of unigenes using quantitative reverse transcription-PCR (RT-qPCR). Conclusion Our findings suggest that oxidative stress occurs during microsclerotia development, along with a broad metabolic activity change. Our data provide the most comprehensive sequence resource available for the study of N. rileyi. 
We believe that the transcriptome datasets will serve as an important public information platform to accelerate studies on N. rileyi microsclerotia. PMID:23777366
Gene expression in obstetric antiphospholipid syndrome: a systematic review.
Muhammad Aliff, M; Muhammad Shazwan, S; Nur Fariha, M M; Hayati, A R; Nur Syahrina, A R; Maizatul Azma, M; Nazefah, A H; Jameela, S; Asral Wirda, A A
2016-12-01
Antiphospholipid syndrome (APS) is a multisystem disease that may present as venous or arterial thrombosis and/or pregnancy complications with the presence of antiphospholipid antibodies. To date, the heterogeneity of its pathogenic mechanisms is consistent with its varied clinical manifestations. Moreover, previous studies have indicated that genes are differentially expressed between the normal and disease states. Hence, this study systematically searched the literature on human genes that were differentially expressed in Obstetric APS. An electronic search was performed up to 31st March 2015 through the PubMed and Embase databases, using the following Medical Subject Heading (MeSH) terms, specified as the primary focus of the articles: gene, antiphospholipid, obstetric, and pregnancy in the title or abstract. From 502 studies retrieved from the search, only original publications that had performed gene expression analyses of human placental tissue and reported differentially expressed genes in pregnancies with Obstetric APS were included. Two reviewers independently scrutinized the titles and the abstracts before examining the eligibility of studies that met the inclusion criteria. For each study, the diagnostic criteria for APS, the method of analysis, and the gene signature were extracted independently by two reviewers. The genes listed were further analysed with DAVID and KEGG pathways. Three eligible gene expression studies involving obstetric APS, comprising the datasets on gene expression, were identified. All three studies showed a reduction in transcript expression of PRL, STAT5, TF, DAF, ABCA1, and HBEGF in Obstetric APS. The functional term with a high enrichment score in DAVID was positive regulation of cell proliferation. Meanwhile, in the KEGG analysis, two pathways were associated with some of the listed genes: the ErbB signalling pathway and the JAK-STAT signalling pathway. 
Ultimately, studies on a genetic level have the potential to provide new insights into the regulation and to widen the basis for identification of changes in the mechanism of Obstetric APS.
Song, Huwei; Zhao, Xiangxiang; Hu, Weicheng; Wang, Xinfeng; Shen, Ting; Yang, Liming
2016-11-04
Loquat (Eriobotrya japonica Lindl.) is an important non-climacteric fruit that is rich in essential nutrients such as minerals and carotenoids. During fruit development and ripening, thousands of differentially expressed genes (DEGs) from various metabolic pathways cause a series of physiological and biochemical changes. To better understand the underlying mechanism of fruit development, Solexa/Illumina RNA-seq high-throughput sequencing was used to evaluate the global changes in gene transcription levels. More than 51,610,234 high quality reads from ten runs of fruit development were sequenced and assembled into 48,838 unigenes. Among 3256 DEGs, 2304 unigenes could be annotated to the Gene Ontology database. These DEGs were distributed into 119 pathways described in the Kyoto Encyclopedia of Genes and Genomes (KEGG) database. A large number of DEGs were involved in carbohydrate metabolism, hormone signaling, and cell-wall degradation. Real-time quantitative reverse transcription PCR (qRT-PCR) analyses revealed that several genes related to cell expansion, auxin signaling and ethylene response were differentially expressed during fruit development. Other members of transcription factor families were also identified. There were 952 DEGs considered to be novel genes with no annotation in any database. These unigenes will serve as an invaluable genetic resource for loquat molecular breeding and postharvest storage.
Robinson, J M; Henderson, W A
2018-01-12
We report a method using functional-molecular databases and network modelling to identify hypothetical mRNA-miRNA interaction networks regulating intestinal epithelial barrier function. The model forms a data-analysis component of our cell culture experiments, which produce RNA expression data from the Nanostring Technologies nCounter® system. The epithelial tight-junction (TJ) and actin cytoskeleton interact as molecular components of the intestinal epithelial barrier. Upstream regulation of TJ-cytoskeleton interaction is effected by the Rac/Rock/Rho signaling pathway and other associated pathways, which may be activated or suppressed by extracellular signaling from growth factors, hormones, and immune receptors. Pathway activations affect epithelial homeostasis, contributing to degradation of the epithelial barrier associated with osmotic dysregulation, inflammation, and tumor development. The complexity underlying miRNA-mRNA interaction networks represents a roadblock for prediction and validation of competing-endogenous RNA network function. We developed a network model to identify hypothetical co-regulatory motifs in a miRNA-mRNA interaction network related to epithelial function. An mRNA-miRNA interaction list was generated using the KEGG and miRWalk2.0 databases. R code was developed to quantify and visualize inherent network structures. We identified a sub-network of genes associated with cellular proliferation and cancer, including c-MYC and Cyclin D, that share a high number of targeting miRNAs.
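The idea of a sub-network defined by shared targeting miRNAs can be illustrated by counting, for every pair of genes, how many miRNAs target both. The authors report using R code that is not shown here; the sketch below is an independent Python illustration, and the function name and identifiers in the toy list are hypothetical.

```python
from collections import defaultdict
from itertools import combinations


def shared_mirna_counts(interactions):
    """Count miRNAs shared by each pair of target genes.

    interactions: iterable of (mirna, gene) pairs from an interaction list.
    Returns {(gene1, gene2): number_of_shared_mirnas} for pairs that
    share at least one miRNA, with gene names in sorted order.
    """
    targeting = defaultdict(set)
    for mirna, gene in interactions:
        targeting[gene].add(mirna)
    shared = {}
    for gene1, gene2 in combinations(sorted(targeting), 2):
        common = targeting[gene1] & targeting[gene2]
        if common:
            shared[(gene1, gene2)] = len(common)
    return shared


toy_interactions = [
    ("miR-a", "G1"), ("miR-a", "G2"),
    ("miR-b", "G1"), ("miR-b", "G2"),
    ("miR-c", "G3"),
]
print(shared_mirna_counts(toy_interactions))
```

Gene pairs with high counts would be candidates for the kind of co-regulatory motif the abstract describes.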
Rawal, Hukam C.; Kumar, Shrawan; Mithra S.V., Amitha; Solanke, Amolkumar U.; Saxena, Swati; Tyagi, Anshika; V., Sureshkumar; Yadav, Neelam R.; Kalia, Pritam; Singh, Narendra Pratap; Singh, Nagendra Kumar; Sharma, Tilak Raj; Gaikwad, Kishor
2017-01-01
Clusterbean (Cyamopsis tetragonoloba L. Taub) is an important industrial, vegetable and forage crop. This crop owes its commercial importance to the presence of guar gum (galactomannans) in its endosperm, which is used as a lubricant in a range of industries. Despite its relevance to agriculture and industry, the genomic resources available for this crop are limited. Therefore, the present study was undertaken to generate an RNA-Seq based transcriptome from leaf, shoot, and flower tissues. A total of 145 million high quality Illumina reads were assembled using Trinity into 127,706 transcripts and 48,007 non-redundant high quality (HQ) unigenes. We annotated 79% of the unigenes against Plant Genes from the National Center for Biotechnology Information (NCBI), Swiss-Prot, Pfam, gene ontology (GO) and KEGG databases. Among the annotated unigenes, 30,020 were assigned 116,964 GO terms, 9984 were assigned EC numbers and 6111 were assigned to 137 KEGG pathways. At different fragments per kilobase of transcript per million fragments sequenced (FPKM) levels, gene expression was highest in flower tissue, followed by shoot and leaf. Additionally, we identified 8687 potential simple sequence repeats (SSRs) with an average frequency of one SSR per 8.75 kb. Amplification of 28 SSRs in 21 clusterbean genotypes revealed polymorphism in 13 markers, with an average polymorphic information content (PIC) of 0.21. We also constructed a database named ‘ClustergeneDB’ for easy retrieval of unigenes and the microsatellite markers. The tissue specific genes identified and the molecular marker resources developed in this study are expected to aid in the genetic improvement of clusterbean for its end use. PMID:29120386
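The polymorphic information content (PIC) reported for the SSR markers is usually computed from allele frequencies at each marker. The abstract does not give the formula used, so the sketch below assumes the commonly used definition PIC = 1 - sum(p_i^2) - sum over i<j of 2 * p_i^2 * p_j^2; the function name and the allele frequencies are illustrative only.

```python
def pic(allele_frequencies):
    """Polymorphic information content of one marker.

    allele_frequencies: frequencies of the alleles at the marker,
    expected to sum to 1. Uses the definition assumed in the lead-in:
    1 - sum(p_i^2) - sum_{i<j} 2 * p_i^2 * p_j^2.
    """
    freqs = list(allele_frequencies)
    if abs(sum(freqs) - 1.0) > 1e-9:
        raise ValueError("allele frequencies must sum to 1")
    homozygosity = sum(p * p for p in freqs)
    correction = 0.0
    for i in range(len(freqs)):
        for j in range(i + 1, len(freqs)):
            correction += 2 * freqs[i] ** 2 * freqs[j] ** 2
    return 1 - homozygosity - correction


# A marker with two equally frequent alleles.
print(pic([0.5, 0.5]))
```

An average PIC such as the 0.21 reported above would be the mean of this value over the polymorphic markers.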
Mitchell, Joshua M.; Fan, Teresa W.-M.; Lane, Andrew N.; Moseley, Hunter N. B.
2014-01-01
Large-scale identification of metabolites is key to elucidating and modeling metabolism at the systems level. Advances in metabolomics technologies, particularly ultra-high resolution mass spectrometry (MS) enable comprehensive and rapid analysis of metabolites. However, a significant barrier to meaningful data interpretation is the identification of a wide range of metabolites including unknowns and the determination of their role(s) in various metabolic networks. Chemoselective (CS) probes to tag metabolite functional groups combined with high mass accuracy provide additional structural constraints for metabolite identification and quantification. We have developed a novel algorithm, Chemically Aware Substructure Search (CASS) that efficiently detects functional groups within existing metabolite databases, allowing for combined molecular formula and functional group (from CS tagging) queries to aid in metabolite identification without a priori knowledge. Analysis of the isomeric compounds in both Human Metabolome Database (HMDB) and KEGG Ligand demonstrated a high percentage of isomeric molecular formulae (43 and 28%, respectively), indicating the necessity for techniques such as CS-tagging. Furthermore, these two databases have only moderate overlap in molecular formulae. Thus, it is prudent to use multiple databases in metabolite assignment, since each major metabolite database represents different portions of metabolism within the biosphere. In silico analysis of various CS-tagging strategies under different conditions for adduct formation demonstrate that combined FT-MS derived molecular formulae and CS-tagging can uniquely identify up to 71% of KEGG and 37% of the combined KEGG/HMDB database vs. 41 and 17%, respectively without adduct formation. This difference between database isomer disambiguation highlights the strength of CS-tagging for non-lipid metabolite identification. 
However, unique identification of complex lipids still needs additional information. PMID:25120557
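The isomer statistic in this abstract can be read as the share of distinct molecular formulae in a database that are carried by more than one compound. That reading is an assumption of this sketch, as is the function name; the formula list is a toy example, not data from HMDB or KEGG Ligand.

```python
from collections import Counter


def isomeric_formula_fraction(formulas):
    """Fraction of distinct molecular formulae shared by 2+ compounds.

    formulas: one molecular formula string per database compound.
    """
    counts = Counter(formulas)
    if not counts:
        raise ValueError("no formulas given")
    shared = sum(1 for n in counts.values() if n > 1)
    return shared / len(counts)


# Toy database: two compounds share C6H12O6, the other formulae are unique.
toy_formulas = ["C6H12O6", "C6H12O6", "C3H6O3", "H2O"]
print(isomeric_formula_fraction(toy_formulas))
```

A formula-only query cannot separate compounds that share a formula, which is the gap the functional-group constraints from CS-tagging are meant to narrow.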
Meta-All: a system for managing metabolic pathway information.
Weise, Stephan; Grosse, Ivo; Klukas, Christian; Koschützki, Dirk; Scholz, Uwe; Schreiber, Falk; Junker, Björn H
2006-10-23
Many attempts are being made to understand biological subjects at a systems level. A major resource for these approaches are biological databases, storing manifold information about DNA, RNA and protein sequences including their functional and structural motifs, molecular markers, mRNA expression levels, metabolite concentrations, protein-protein interactions, phenotypic traits or taxonomic relationships. The use of these databases is often hampered by the fact that they are designed for special application areas and thus lack universality. Databases on metabolic pathways, which provide an increasingly important foundation for many analyses of biochemical processes at a systems level, are no exception from the rule. Data stored in central databases such as KEGG, BRENDA or SABIO-RK is often limited to read-only access. If experimentalists want to store their own data, possibly still under investigation, there are two possibilities. They can either develop their own information system for managing that own data, which is very time-consuming and costly, or they can try to store their data in existing systems, which is often restricted. Hence, an out-of-the-box information system for managing metabolic pathway data is needed. We have designed META-ALL, an information system that allows the management of metabolic pathways, including reaction kinetics, detailed locations, environmental factors and taxonomic information. Data can be stored together with quality tags and in different parallel versions. META-ALL uses Oracle DBMS and Oracle Application Express. We provide the META-ALL information system for download and use. In this paper, we describe the database structure and give information about the tools for submitting and accessing the data. As a first application of META-ALL, we show how the information contained in a detailed kinetic model can be stored and accessed. META-ALL is a system for managing information about metabolic pathways. 
It facilitates the handling of pathway-related data and is designed to help biochemists and molecular biologists in their daily research. It is available on the Web at http://bic-gh.de/meta-all and can be downloaded free of charge and installed locally.
Meta-All: a system for managing metabolic pathway information
Weise, Stephan; Grosse, Ivo; Klukas, Christian; Koschützki, Dirk; Scholz, Uwe; Schreiber, Falk; Junker, Björn H
2006-01-01
Background Many attempts are being made to understand biological subjects at a systems level. A major resource for these approaches are biological databases, storing manifold information about DNA, RNA and protein sequences including their functional and structural motifs, molecular markers, mRNA expression levels, metabolite concentrations, protein-protein interactions, phenotypic traits or taxonomic relationships. The use of these databases is often hampered by the fact that they are designed for special application areas and thus lack universality. Databases on metabolic pathways, which provide an increasingly important foundation for many analyses of biochemical processes at a systems level, are no exception from the rule. Data stored in central databases such as KEGG, BRENDA or SABIO-RK is often limited to read-only access. If experimentalists want to store their own data, possibly still under investigation, there are two possibilities. They can either develop their own information system for managing that own data, which is very time-consuming and costly, or they can try to store their data in existing systems, which is often restricted. Hence, an out-of-the-box information system for managing metabolic pathway data is needed. Results We have designed META-ALL, an information system that allows the management of metabolic pathways, including reaction kinetics, detailed locations, environmental factors and taxonomic information. Data can be stored together with quality tags and in different parallel versions. META-ALL uses Oracle DBMS and Oracle Application Express. We provide the META-ALL information system for download and use. In this paper, we describe the database structure and give information about the tools for submitting and accessing the data. As a first application of META-ALL, we show how the information contained in a detailed kinetic model can be stored and accessed. 
Conclusion META-ALL is a system for managing information about metabolic pathways. It facilitates the handling of pathway-related data and is designed to help biochemists and molecular biologists in their daily research. It is available on the Web and can be downloaded free of charge and installed locally. PMID:17059592
Li, Shi-Weng; Shi, Rui-Fang; Leng, Yan
2015-01-01
Adventitious rooting is the most important mechanism underlying vegetative propagation and an important strategy for plant propagation under environmental stress. The present study was conducted to obtain transcriptomic data and examine gene expression using RNA-Seq and bioinformatics analysis, thereby providing a foundation for understanding the molecular mechanisms controlling adventitious rooting. Three cDNA libraries constructed from mRNA samples from mung bean hypocotyls during adventitious rooting were sequenced. These three samples generated a total of 73 million, 60 million, and 59 million 100-bp reads, respectively. These reads were assembled into 78,697 unigenes with an average length of 832 bp, totaling 65 Mb. The unigenes were aligned against six public protein databases, and 29,029 unigenes (36.77%) were annotated using BLASTx. Among them, 28,225 (35.75%) and 28,119 (35.62%) unigenes had homologs in the TrEMBL and NCBI non-redundant (Nr) databases, respectively. Of these unigenes, 21,140 were assigned to gene ontology classes, and a total of 11,990 unigenes were classified into 25 KOG functional categories. A total of 7,357 unigenes were annotated to 4,524 KOs, and 4,651 unigenes were mapped onto 342 KEGG pathways using BLAST comparison against the KEGG database. A total of 11,717 unigenes were differentially expressed (fold change>2) during the root induction stage, with 8,772 unigenes down-regulated and 2,945 unigenes up-regulated. A total of 12,737 unigenes were differentially expressed during the root initiation stage, with 9,303 unigenes down-regulated and 3,434 unigenes up-regulated. A total of 5,334 unigenes were differentially expressed between the root induction and initiation stage, with 2,167 unigenes down-regulated and 3,167 unigenes up-regulated. qRT-PCR validation of the 39 genes with known functions indicated a strong correlation (92.3%) with the RNA-Seq data. 
The GO enrichment, pathway mapping, and gene expression profiles reveal molecular traits for root induction and initiation. This study provides a platform for functional genomic research with this species. PMID:26177103
Li, Shi-Weng; Shi, Rui-Fang; Leng, Yan
2015-01-01
Adventitious rooting is the most important mechanism underlying vegetative propagation and an important strategy for plant propagation under environmental stress. The present study was conducted to obtain transcriptomic data and examine gene expression using RNA-Seq and bioinformatics analysis, thereby providing a foundation for understanding the molecular mechanisms controlling adventitious rooting. Three cDNA libraries constructed from mRNA samples from mung bean hypocotyls during adventitious rooting were sequenced. These three samples generated a total of 73 million, 60 million, and 59 million 100-bp reads, respectively. These reads were assembled into 78,697 unigenes with an average length of 832 bp, totaling 65 Mb. The unigenes were aligned against six public protein databases, and 29,029 unigenes (36.77%) were annotated using BLASTx. Among them, 28,225 (35.75%) and 28,119 (35.62%) unigenes had homologs in the TrEMBL and NCBI non-redundant (Nr) databases, respectively. Of these unigenes, 21,140 were assigned to gene ontology classes, and a total of 11,990 unigenes were classified into 25 KOG functional categories. A total of 7,357 unigenes were annotated to 4,524 KOs, and 4,651 unigenes were mapped onto 342 KEGG pathways using BLAST comparison against the KEGG database. A total of 11,717 unigenes were differentially expressed (fold change>2) during the root induction stage, with 8,772 unigenes down-regulated and 2,945 unigenes up-regulated. A total of 12,737 unigenes were differentially expressed during the root initiation stage, with 9,303 unigenes down-regulated and 3,434 unigenes up-regulated. A total of 5,334 unigenes were differentially expressed between the root induction and initiation stage, with 2,167 unigenes down-regulated and 3,167 unigenes up-regulated. qRT-PCR validation of the 39 genes with known functions indicated a strong correlation (92.3%) with the RNA-Seq data. 
The GO enrichment, pathway mapping, and gene expression profiles reveal molecular traits for root induction and initiation. This study provides a platform for functional genomic research with this species.
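The fold-change criterion mentioned in this abstract (fold change > 2) can be illustrated with a minimal sketch. The function name, the pseudocount, and the toy expression values below are assumptions made for illustration; the abstract does not say how the authors computed or normalized expression.

```python
import math

def classify_by_fold_change(control, treated, threshold=2.0, pseudocount=1.0):
    """Label each gene 'up', 'down', or 'unchanged' from two expression dicts.

    Assumption: values are already normalized expression levels.
    """
    log_cut = math.log2(threshold)
    calls = {}
    for gene in control.keys() & treated.keys():
        # The pseudocount (hypothetical choice) avoids division by zero.
        log_fc = math.log2((treated[gene] + pseudocount) / (control[gene] + pseudocount))
        if log_fc > log_cut:
            calls[gene] = "up"
        elif log_fc < -log_cut:
            calls[gene] = "down"
        else:
            calls[gene] = "unchanged"
    return calls

# Toy values, not data from the study.
control = {"unigene_1": 10.0, "unigene_2": 50.0, "unigene_3": 8.0}
treated = {"unigene_1": 45.0, "unigene_2": 9.0, "unigene_3": 9.0}
calls = classify_by_fold_change(control, treated)
```

Counting the "up" and "down" labels per comparison would give totals of the kind reported above for the induction and initiation stages.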
Griffith, Rachel M.; Li, Hu; Zhang, Nan; Favazza, Tara L.; Fulton, Anne B.; Hansen, Ronald M.; Akula, James D.
2013-01-01
Purpose: To identify the genes, biochemical signaling pathways and biological themes involved in the pathogenesis of retinopathy of prematurity (ROP). Methods: Next-generation sequencing (NGS) was performed on the RNA transcriptome of rats with the Penn et al. (1994) oxygen-induced retinopathy (OIR) model of ROP at the height of vascular abnormality, postnatal day (P) 19, and normalized to age-matched, room-air-reared littermate controls. Eight custom-developed pathways with potential relevance to known ROP sequelae were evaluated for significant regulation in ROP: the three major Wnt signaling pathways (canonical, planar cell polarity (PCP), and Wnt/Ca2+); two signaling pathways mediated by the Rho GTPases RhoA and Cdc42, which are respectively thought to intersect with canonical and noncanonical Wnt signaling; nitric oxide signaling pathways mediated by two nitric oxide synthase (NOS) enzymes, neuronal (nNOS) and endothelial (eNOS); and the retinoic acid (RA) signaling pathway. Regulation of other biological pathways and themes was detected by gene ontology using the Kyoto Encyclopedia of Genes and Genomes (KEGG) and the GO terms databases of the NIH's Database for Annotation, Visualization and Integrated Discovery (DAVID). Results: Canonical Wnt signaling was found to be regulated, but the non-canonical PCP and Wnt/Ca2+ pathways were not. Nitric oxide (NO) signaling, as measured by the activation of nNOS and eNOS, was also regulated, as was RA signaling. Biological themes related to protein translation (ribosomes), neural signaling, inflammation and immunity, cell cycle and cell death were (among others) highly regulated in ROP rats. Conclusions: These genes and pathways identified by NGS might provide novel targets for intervention in ROP. PMID:23775346
Yang, Qian; Wang, Shuyuan; Dai, Enyu; Zhou, Shunheng; Liu, Dianming; Liu, Haizhou; Meng, Qianqian; Jiang, Bin; Jiang, Wei
2017-08-16
Pathway enrichment analysis has been widely used to identify cancer risk pathways, and contributes to elucidating the mechanism of tumorigenesis. However, most existing approaches use outdated pathway information and neglect the complex gene interactions within pathways. Here, we first briefly reviewed the widely used pathway enrichment analysis approaches, and then proposed a novel topology-based pathway enrichment analysis (TPEA) method, which integrates topological properties and global upstream/downstream positions of genes in pathways. We compared TPEA with four widely used pathway enrichment analysis tools, namely the database for annotation, visualization and integrated discovery (DAVID), gene set enrichment analysis (GSEA), centrality-based pathway enrichment (CePa) and signaling pathway impact analysis (SPIA), by analyzing six gene expression profiles of three tumor types (colorectal cancer, thyroid cancer and endometrial cancer). As a result, we identified several well-known cancer risk pathways that could not be obtained by the existing tools, and the results of TPEA were more stable than those of the other tools when analyzing different data sets of the same cancer. Finally, we developed an R package to implement TPEA, which can update KEGG pathway information online and is available at the Comprehensive R Archive Network (CRAN): https://cran.r-project.org/web/packages/TPEA/. © The Author 2017. Published by Oxford University Press.
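For orientation, the sketch below shows a plain hypergeometric over-representation test, the simplest form of pathway enrichment and the kind of gene-list baseline that topology-aware methods extend. It is not the TPEA algorithm, which the abstract describes only at a high level; the function name and the toy numbers are assumptions.

```python
from math import comb

def hypergeom_enrichment_p(n_universe, n_pathway, n_selected, n_overlap):
    """Return P(X >= n_overlap) for a hypergeometric draw.

    n_universe: total annotated genes; n_pathway: genes in the pathway;
    n_selected: size of the gene list; n_overlap: list genes in the pathway.
    """
    upper = min(n_pathway, n_selected)
    total = comb(n_universe, n_selected)
    # Sum the probability mass for every overlap at least as large as observed.
    tail = sum(
        comb(n_pathway, k) * comb(n_universe - n_pathway, n_selected - k)
        for k in range(n_overlap, upper + 1)
    )
    return tail / total

# Toy numbers: 20 genes in the universe, 5 in the pathway, 4 selected, 2 overlap.
p_value = hypergeom_enrichment_p(20, 5, 4, 2)
```

A test of this form treats every pathway gene as interchangeable, which is the limitation the abstract attributes to approaches that neglect gene interactions within pathways.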
ARMOUR - A Rice miRNA: mRNA Interaction Resource.
Sanan-Mishra, Neeti; Tripathi, Anita; Goswami, Kavita; Shukla, Rohit N; Vasudevan, Madavan; Goswami, Hitesh
2018-01-01
ARMOUR was developed as A Rice miRNA:mRNA interaction resource. This informative and interactive database includes the experimentally validated expression profiles of miRNAs under different developmental and abiotic stress conditions across seven Indian rice cultivars. The database covers 689 known and 1664 predicted novel miRNAs and their expression profiles in more than 38 different tissues or conditions, along with their predicted or known target transcripts. Understanding of the miRNA:mRNA interactome in the regulation of functional cellular machinery is supported by sequence information for the mature and hairpin structures. ARMOUR lets users query the database in multiple ways, such as by known gene identifiers, gene ontology identifiers and KEGG identifiers, and also allows on-the-fly fold change analysis and sequence search queries with an inbuilt BLAST algorithm. The ARMOUR database provides a cohesive platform for novel and mature miRNAs and their expression in different experimental conditions, and allows searching for their interacting mRNA targets, GO annotation and their involvement in various biological pathways. The ARMOUR database includes a provision for adding more experimental data from users, with the aim of developing it into a platform for sharing and comparing experimental data contributed by research groups working on rice.
Wenne, Roman; Burzynski, Artur
2017-01-01
In fish, the skin is a multifunctional organ and the first barrier against pathogens. Salmonids differ in their susceptibility to microorganisms due to varied skin morphology and gene expression patterns. The brown trout is a salmonid species with important commercial and ecological value in Europe. However, there is a lack of knowledge regarding the genes involved in the immune response and mucus secretion in the skin of this fish. Thus, we characterized the skin transcriptome of anadromous brown trout using next-generation sequencing (NGS). A total of 1,348,306 filtered reads were obtained and assembled into 75,970 contigs. Of these contigs, 48.57% were identified using BLAST tool searches against four public databases. KEGG pathway and Gene Ontology analyses revealed that 13.40% and 34.57% of the annotated transcripts, respectively, represent a variety of biological processes and functions. Among the identified KEGG Orthology categories, the best represented were signal transduction (23.28%) and immune system (8.82%), with a variety of genes involved in immune pathways, implying the differentiation of immune responses in the trout skin. We also identified and transcriptionally characterized 8 types of mucin proteins, the main structural components of the mucosal layer. Moreover, 140 genes involved in mucin synthesis were identified, and 1,119 potential simple sequence repeats (SSRs) were detected in 3,134 transcripts. PMID:28212382
A Web Tool for Generating High Quality Machine-readable Biological Pathways.
Ramirez-Gaona, Miguel; Marcu, Ana; Pon, Allison; Grant, Jason; Wu, Anthony; Wishart, David S
2017-02-08
PathWhiz is a web server built to facilitate the creation of colorful, interactive, visually pleasing pathway diagrams that are rich in biological information. The pathways generated by this online application are machine-readable and fully compatible with essentially all web-browsers and computer operating systems. It uses a specially developed, web-enabled pathway drawing interface that permits the selection and placement of different combinations of pre-drawn biological or biochemical entities to depict reactions, interactions, transport processes and binding events. This palette of entities consists of chemical compounds, proteins, nucleic acids, cellular membranes, subcellular structures, tissues, and organs. All of the visual elements in it can be interactively adjusted and customized. Furthermore, because this tool is a web server, all pathways and pathway elements are publicly accessible. This kind of pathway "crowd sourcing" means that PathWhiz already contains a large and rapidly growing collection of previously drawn pathways and pathway elements. Here we describe a protocol for the quick and easy creation of new pathways and the alteration of existing pathways. To further facilitate pathway editing and creation, the tool contains replication and propagation functions. The replication function allows existing pathways to be used as templates to create or edit new pathways. The propagation function allows one to take an existing pathway and automatically propagate it across different species. Pathways created with this tool can be "re-styled" into different formats (KEGG-like or text-book like), colored with different backgrounds, exported to BioPAX, SBGN-ML, SBML, or PWML data exchange formats, and downloaded as PNG or SVG images. The pathways can easily be incorporated into online databases, integrated into presentations, posters or publications, or used exclusively for online visualization and exploration. 
This protocol has been successfully applied to generate over 2,000 pathway diagrams, which are now found in many online databases including HMDB, DrugBank, SMPDB, and ECMDB.
Characterization of mango (Mangifera indica L.) transcriptome and chloroplast genome.
Azim, M Kamran; Khan, Ishtaiq A; Zhang, Yong
2014-05-01
We characterized the mango leaf transcriptome and chloroplast genome using next generation DNA sequencing. The RNA-seq output of the mango transcriptome generated >12 million reads (total nucleotides sequenced >1 Gb). De novo transcriptome assembly generated 30,509 unigenes with lengths in the range of 300 to ≥3,000 nt and 67× depth of coverage. Blast searching against nonredundant nucleotide databases and several Viridiplantae genomic datasets annotated 24,593 mango unigenes (80% of total) and identified Citrus sinensis as the closest neighbor of mango, with 9,141 (37%) matched sequences. Annotation with gene ontology and Clusters of Orthologous Group terms categorized unigene sequences into 57 and 25 classes, respectively. More than 13,500 unigenes were assigned to 293 KEGG pathways. Besides major plant biology related pathways, KEGG based gene annotation pointed out the active presence of an array of biochemical pathways involved in (a) biosynthesis of bioactive flavonoids, flavones and flavonols, (b) biosynthesis of terpenoids and lignins and (c) plant hormone signal transduction. The mango transcriptome sequences revealed 235 proteases belonging to five catalytic classes of proteolytic enzymes. The draft genome of the mango chloroplast (cp) was obtained by a combination of Sanger and next generation sequencing. The draft mango cp genome is 151,173 bp in size, with a pair of inverted repeats of 27,093 bp separated by small and large single copy regions. Of the 139 genes in the mango cp genome, 91 were found to be protein coding. Sequence analysis revealed the cp genome of C. sinensis as the closest neighbor of mango. We found 51 short repeats in the mango cp genome that are presumed to be associated with extensive rearrangements. This is the first report of transcriptome and chloroplast genome analysis of any Anacardiaceae family member.
Sun, Haiyue; Liu, Yushan; Gai, Yuzhuo; Geng, Jinman; Chen, Li; Liu, Hongdi; Kang, Limin; Tian, Youwen; Li, Yadong
2015-09-02
Cranberries (Vaccinium macrocarpon Ait.), renowned for their excellent health benefits, are an important berry crop. Here, we performed transcriptome sequencing of one cranberry cultivar, from fruits at two different developmental stages, on the Illumina HiSeq 2000 platform. Our main goals were to identify putative genes for major metabolic pathways of bioactive compounds and compare the expression patterns between white fruit (W) and red fruit (R) in cranberry. In this study, two cDNA libraries of W and R were constructed. Approximately 119 million raw sequencing reads were generated and assembled de novo, yielding 57,331 high quality unigenes with an average length of 739 bp. Using BLASTx, 38,460 unigenes were identified as putative homologs of annotated sequences in public protein databases, including NCBI NR, NT, Swiss-Prot, KEGG, COG and GO. Of these, 21,898 unigenes mapped to 128 KEGG pathways, with the metabolic pathways, secondary metabolites, glycerophospholipid metabolism, ether lipid metabolism, starch and sucrose metabolism, purine metabolism, and pyrimidine metabolism being well represented. Among them, many candidate genes were involved in flavonoid biosynthesis, transport and regulation. Furthermore, digital gene expression (DEG) analysis identified 3,257 unigenes that were differentially expressed between the two fruit developmental stages. In addition, 14,473 simple sequence repeats (SSRs) were detected. Our results present comprehensive gene expression information about the cranberry fruit transcriptome that could facilitate our understanding of the molecular mechanisms of fruit development in cranberries. Although it will be necessary to validate the functions carried out by these genes, these results could be used to improve the quality of breeding programs for the cranberry and related species.
Li, Shicheng; Sun, Xiao; Miao, Shuncheng; Liu, Jia; Jiao, Wenjie
2017-11-01
Cigarette smoking is one of the greatest preventable risk factors for developing cancer, and most cases of lung squamous cell carcinoma (lung SCC) are associated with smoking. The pathogenic mechanism of tumor progression is unclear. This study aimed to identify biomarkers in smoking-related lung cancer, including protein-coding genes, long noncoding RNAs, and transcription factors. We selected and obtained messenger RNA microarray datasets and clinical data from the Gene Expression Omnibus database to identify gene expression altered by cigarette smoking. Integrated bioinformatic analysis was used to clarify the biological functions of the identified genes, including Gene Ontology (GO) and Kyoto Encyclopedia of Genes and Genomes (KEGG) pathway analyses, the construction of a protein-protein interaction network, transcription factor prediction, and statistical analyses. Quantitative real-time PCR was subsequently used to verify these bioinformatic analyses. Five hundred and ninety-eight differentially expressed genes and 21 long noncoding RNAs were identified in smoking-related lung SCC. GO and KEGG pathway analysis showed that the identified genes were enriched in cancer-related functions and pathways. The protein-protein interaction network revealed seven hub genes in lung SCC. Several transcription factors and their binding sites were predicted. The results of real-time quantitative PCR revealed that AURKA and BIRC5 were significantly upregulated and LINC00094 was downregulated in the tumor tissues of smoking patients. Further statistical analysis indicated that dysregulation of AURKA, BIRC5, and LINC00094 was associated with poor prognosis in lung SCC. Protein-coding genes AURKA, BIRC5, and LINC00094 could be biomarkers or therapeutic targets for smoking-related lung SCC. © 2017 The Authors. Thoracic Cancer published by China Lung Oncology Group and John Wiley & Sons Australia, Ltd.
Jeena, Gajendra Singh; Fatima, Shahnoor; Tripathi, Pragya; Upadhyay, Swati; Shukla, Rakesh Kumar
2017-06-28
Bacopa monnieri, commonly known as Brahmi, is used in Ayurveda to improve memory and for many other human health benefits. A bacoside-enriched standardized extract of Bacopa monnieri is marketed as a memory enhancing agent. In spite of its well known pharmacological properties, the plant has been little studied in terms of the transcripts involved in the biosynthetic pathway and the regulation that controls secondary metabolism. The aim of this study was to identify potential transcripts and provide a framework of identified transcripts involved in bacoside production through transcriptome assembly. We performed comparative transcriptome analysis of shoot and root tissue of Bacopa monnieri in two independent biological replicates and obtained 22.48 million and 22.0 million high quality processed reads in shoot and root, respectively. After de novo assembly and quantitative assessment, a total of 26,412 genes were annotated in the root sample and 18,500 genes in the shoot sample. The quality of raw reads was determined using SeqQC-V2.2. Assembled sequences were annotated using BLASTX against public databases such as NR or UniProt. Searching against the KEGG pathway database indicated that 37,918 unigenes from root and 35,130 unigenes from shoot were mapped to 133 KEGG pathways. Based on the DGE data, we found that most of the transcripts related to CYP450s and UDP-glucosyltransferases were specifically upregulated in shoot tissue compared to root tissue. Finally, we selected 43 transcripts related to secondary metabolism, including transcription factor families, that are differentially expressed in shoot and root tissues; these were validated by qRT-PCR, and their expression levels were monitored after MeJA treatment and wounding for 1, 3 and 5 h.
This study not only represents the first de novo transcriptome analysis of Bacopa monnieri but also provides information about the identification, expression and tissue-specific distribution of transcripts related to triterpenoid sapogenin, one of the most important pharmacologically active secondary metabolites present in Bacopa monnieri. The transcripts identified in this study will establish a foundation for future metabolic engineering studies aimed at increasing bacoside biosynthesis, and for studying its regulation, for human health benefits.
Bi, Lei; Guan, Chun-jie; Yang, Guan-e; Yang, Fei; Yan, Hong-yu; Li, Qing-shan
2016-04-01
The purple photosynthetic bacterium Rhodopseudomonas palustris has been widely applied to enhance the therapeutic effects of traditional Chinese medicine using novel biotransformation technology. However, comprehensive studies of the R. palustris biotransformation mechanism are rare. Therefore, investigation of the expression patterns of genes involved in metabolic pathways that are active during the biotransformation process is essential to elucidate this complicated mechanism. To promote further study of the biotransformation of R. palustris, we assembled all R. palustris transcripts using Trinity software and performed differential expression analysis of the resulting unigenes. A total of 9725, 7341 and 10,963 unigenes were obtained by assembling the alpha-rhamnetin-3-rhamnoside-treated R. palustris (RPB) reads, control R. palustris (RPS) reads and combined RPB&RPS reads, respectively. A total of 9971 unigenes assembled from the RPB&RPS reads were mapped to the nr, nt, Swiss-Prot, Gene Ontology (GO), Clusters of Orthologous Groups (COGs) and Kyoto Encyclopedia of Genes and Genomes (KEGG) (E-value <0.00001) databases using BLAST software. A total of 3360 unique differentially expressed genes (DEGs) in RPB versus RPS were identified, among which 922 unigenes were up-regulated and 2438 were down-regulated. The unigenes were mapped to the KEGG database, resulting in the identification of 7676 pathways among all annotated unigenes and 2586 pathways among the DEGs. Some sets of functional unigenes annotated to important metabolic pathways and environmental information processing were differentially expressed between the RPS and RPB samples, including those involved in energy metabolism (18.4% of total DEGs), carbohydrate metabolism (36.0% of total DEGs), ABC transport (6.0% of total DEGs), the two-component system (8.6% of total DEGs), cell motility (4.3% of total DEGs) and the cell cycle (1.5% of total DEGs). 
We also identified 19 transcripts annotated as hydrolytic enzymes and other enzymes involved in ARR catabolism in R. palustris. We present the first comparative transcriptome profiles of RPB and RPS samples to facilitate elucidation of the molecular mechanism of biotransformation in R. palustris. Furthermore, we propose two putative ARR biotransformation mechanisms in R. palustris. These analytical results represent a useful genomic resource for in-depth research into the molecular basis of biotransformation and genetic modification in R. palustris. Copyright © 2016 Elsevier GmbH. All rights reserved.
Han, Jeongsukhyeon; Thamilarasan, Senthil Kumar; Natarajan, Sathishkumar; Park, Jong-In; Chung, Mi-Young; Nou, Ill-Sup
2016-01-01
Bulb onion (Allium cepa) is the second most widely cultivated and consumed vegetable crop in the world. During winter, cold injury can limit the production of bulb onion. Genomic resources available for bulb onion are still very limited. To date, no studies on heritably durable cold and freezing tolerance have been carried out in bulb onion genotypes. We applied high-throughput sequencing technology to cold (2°C), freezing (-5 and -15°C), and control (25°C)-treated samples of cold tolerant (CT) and cold susceptible (CS) genotypes of A. cepa lines. A total of 452 million paired-end reads were de novo assembled into 54,047 genes with an average length of 1,331 bp. Based on similarity searches, these genes were aligned with entries in the public non-redundant (nr) database, as well as the KEGG and COG databases. Differentially expressed genes (DEGs) were identified using log10 values with the FPKM method. Among 5,167 DEGs, 491 genes were differentially expressed at freezing temperature compared to the control temperature in both CT and CS libraries. The DEG results were validated with qRT-PCR. We performed GO and KEGG pathway enrichment analyses of all DEGs, and iPath interactive analysis found 31 pathways, including those related to the metabolism of carbohydrates, nucleotides, energy, cofactors and vitamins, and other amino acids, and to xenobiotics biodegradation. Furthermore, a large number of molecular markers were identified from the assembled genes, including 4,437 simple sequence repeats (SSRs) and SNP substitutions of transition and transversion types in CT and CS. Our study is the first to provide a transcriptome sequence resource for Allium spp. with regard to cold and freezing stress. We identified a large set of genes and determined their DEG profiles under cold and freezing conditions using two different genotypes. These data represent a valuable resource for genetic and genomic studies of Allium spp. PMID:27627679
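The abstract refers to log10 values computed with the FPKM method without giving details. As a rough illustration, the sketch below uses the standard FPKM definition (fragments per kilobase of transcript per million mapped fragments) followed by a log10 ratio between two conditions. The pseudocount, function names and toy counts are assumptions, not the authors' settings.

```python
import math

def fpkm(fragment_count, gene_length_bp, total_mapped_fragments):
    """Fragments per kilobase of transcript per million mapped fragments."""
    return fragment_count * 1e9 / (gene_length_bp * total_mapped_fragments)

def log10_ratio(fpkm_treated, fpkm_control, pseudocount=0.01):
    """log10 of the expression ratio; the pseudocount (assumed) guards against zeros."""
    return math.log10((fpkm_treated + pseudocount) / (fpkm_control + pseudocount))

# Toy counts for one 2,000 bp gene in two libraries of 10 million fragments each.
control_fpkm = fpkm(500, 2000, 10_000_000)
freezing_fpkm = fpkm(5000, 2000, 10_000_000)
ratio = log10_ratio(freezing_fpkm, control_fpkm)
```

Dividing by gene length and library size is what makes values comparable across genes and across the CT and CS libraries before any threshold is applied.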
PathCase-SB architecture and database design
2011-01-01
Background: Integrating metabolic pathway resources with regulatory metabolic network models, and deploying new tools on the integrated platform, can help make systems biology research on the regulation of metabolic networks more effective and more efficient. Therefore, the tasks of (a) integrating regulatory metabolic networks and existing models under a single database environment, and (b) building tools to help with modeling and analysis, are desirable and intellectually challenging computational tasks. Description: PathCase Systems Biology (PathCase-SB) has been built and released. The PathCase-SB database provides data and an API for multiple user interfaces and software tools. The current PathCase-SB system provides a database-enabled framework and web-based computational tools to facilitate the development of kinetic models for biological systems. PathCase-SB aims to integrate data from selected biological data sources on the web (currently, the BioModels database and KEGG), and to provide more powerful and/or new capabilities via the new web-based integrative framework. This paper describes the architecture and database design issues encountered in PathCase-SB's design and implementation, and presents the current design of PathCase-SB's architecture and database. Conclusions: The PathCase-SB architecture and database provide a highly extensible and scalable environment with easy and fast (real-time) access to the data in the database. PathCase-SB itself is already being used by researchers across the world. PMID:22070889
Ran, Xia; Cai, Wei-Jun; Huang, Xiu-Feng; Liu, Qi; Lu, Fan; Qu, Jia; Wu, Jinyu; Jin, Zi-Bing
2014-01-01
Inherited retinal degeneration (IRD), a leading cause of human blindness worldwide, is exceptionally heterogeneous, both clinically and genetically. During the past decades, tremendous efforts have been made to explore this complex heterogeneity, and with the significant advancement of sequencing technology a large number of mutations have been identified in different genes underlying IRD. In this study, we developed a comprehensive database, 'RetinoGenetics', which contains informative knowledge about all known IRD-related genes and mutations. 'RetinoGenetics' currently contains 4270 mutations in 186 genes, with detailed information associated with 164 phenotypes from 934 publications and various types of functional annotations. Extensive annotations were then performed for each gene using various resources, including Gene Ontology, KEGG pathways, protein-protein interaction, mutational annotations and gene-disease network. Furthermore, with its search functions, convenient browsing options and intuitive graphical displays, 'RetinoGenetics' could serve as a valuable resource for unveiling the genetic basis of IRD. Taken together, 'RetinoGenetics' is an integrative, informative and updatable resource for IRD-related genetic predispositions. Database URL: http://www.retinogenetics.org/. © The Author(s) 2014. Published by Oxford University Press.
2012-01-01
Background: In rubber tree, bark is one of the important agricultural and biological organs. However, the molecular mechanism involved in bark formation and development in rubber tree remains largely unknown, which is at least partially due to the lack of bark transcriptomic and genomic information. Therefore, it is necessary to carry out high-throughput transcriptome sequencing of rubber tree bark to generate a large number of transcript sequences for functional characterization and molecular marker development. Results: In this study, more than 30 million sequencing reads were generated using Illumina paired-end sequencing technology. In total, 22,756 unigenes with an average length of 485 bp were obtained by de novo assembly. The similarity search indicated that 16,520 and 12,558 unigenes showed significant similarities to known proteins from the NCBI non-redundant and Swissprot protein databases, respectively. Among these annotated unigenes, 6,867 and 5,559 unigenes were assigned to Gene Ontology (GO) and Clusters of Orthologous Group (COG), respectively. When the 22,756 unigenes were searched against the Kyoto Encyclopedia of Genes and Genomes Pathway (KEGG) database, 12,097 unigenes were assigned to 5 main categories including 123 KEGG pathways. Among the main KEGG categories, metabolism was the biggest category (9,043, 74.75%), suggesting active metabolic processes in rubber tree bark. In addition, a total of 39,257 EST-SSRs were identified from the 22,756 unigenes, and the characteristics of the EST-SSRs were further analyzed. A total of 110 potential marker sites were randomly selected to validate the assembly quality and develop EST-SSR markers. Among 13 Hevea germplasms, the PCR success rate and polymorphism rate of the 110 markers were 96.36% and 55.45%, respectively. Conclusion: By assembling and analyzing de novo transcriptome sequencing data, we reported the comprehensive functional characterization of rubber tree bark. 
This research generated a substantial fraction of rubber tree transcriptome sequences, which were very useful resources for gene annotation and discovery, molecular markers development, genome assembly and annotation, and microarrays development in rubber tree. The EST-SSR markers identified and developed in this study will facilitate marker-assisted selection breeding in rubber tree. Moreover, this study also supported that transcriptome analysis based on Illumina paired-end sequencing is a powerful tool for transcriptome characterization and molecular marker development in non-model species, especially those with large and complex genomes. PMID:22607098
Analysis of molecular pathways in pancreatic ductal adenocarcinomas with a bioinformatics approach.
Wang, Yan; Li, Yan
2015-01-01
Pancreatic ductal adenocarcinoma (PDAC) is a leading cause of cancer death worldwide. Our study aimed to reveal its molecular mechanisms. Microarray data of GSE15471 (including 39 matching pairs of pancreatic tumor tissues and patient-matched normal tissues) were downloaded from the Gene Expression Omnibus (GEO) database. We identified differentially expressed genes (DEGs) in PDAC tissues compared with normal tissues using the limma package in R. GO and KEGG pathway enrichment analyses were then conducted with the online DAVID tool. In addition, principal component analysis was performed, and a protein-protein interaction (PPI) network was constructed using the STRING database to study relationships between the DEGs. A total of 532 DEGs were identified in the 38 PDAC tissues compared with 33 normal tissues. Principal component analysis of the top 20 DEGs could directly differentiate the PDAC tissues from normal tissues. In the PPI network, 8 of the 20 DEGs were key genes of the collagen family. Additionally, FN1 (fibronectin 1) was a hub node in the network. The genes of the collagen family as well as FN1 were significantly enriched in the complement and coagulation cascades, ECM-receptor interaction and focal adhesion pathways. Our results suggest that genes of the collagen family and FN1 may play an important role in PDAC progression. Meanwhile, these DEGs and enriched pathways, such as complement and coagulation cascades, ECM-receptor interaction and focal adhesion, may be important molecular mechanisms involved in the development and progression of PDAC.
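A hub node in a protein-protein interaction network is usually taken to be a node with an unusually high number of interaction partners. The sketch below ranks nodes by degree on a toy edge list. The protein names are placeholders and the degree-based definition is an assumption, since the abstract does not state how hubs were called.

```python
from collections import Counter

def node_degrees(edges):
    """Count distinct interaction partners for every node in an undirected network."""
    # frozenset makes A-B and B-A the same edge; self-loops are dropped.
    unique_edges = {frozenset(edge) for edge in edges if edge[0] != edge[1]}
    degree = Counter()
    for edge in unique_edges:
        for node in edge:
            degree[node] += 1
    return degree

def top_hubs(edges, top_n=1):
    """Return the top_n (node, degree) pairs, ties broken alphabetically."""
    ranked = sorted(node_degrees(edges).items(), key=lambda item: (-item[1], item[0]))
    return ranked[:top_n]

# Placeholder interactions, not STRING data.
toy_edges = [
    ("protein_A", "protein_B"),
    ("protein_A", "protein_C"),
    ("protein_A", "protein_D"),
    ("protein_B", "protein_C"),
    ("protein_C", "protein_A"),  # same interaction as A-C, listed in reverse
]
hub_list = top_hubs(toy_edges, top_n=2)
```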
Subhra Das, Sankha; James, Mithun; Paul, Sandip
2017-01-01
The various pathophysiological processes occurring in living systems are known to be orchestrated by delicate interplays and cross-talks between different genes and their regulators. Among the various regulators of genes, there is a class of small non-coding RNA molecules known as microRNAs (miRNAs). Although the relative simplicity of miRNAs and their ability to modulate cellular processes make them attractive therapeutic candidates, their presence in large numbers makes it challenging for experimental researchers to interpret the intricacies of the molecular processes they regulate. Most of the existing bioinformatic tools fail to address these challenges. Here, we present a new web resource, ‘miRnalyze’, that has been specifically designed to directly identify the putative regulation of cell signaling pathways by miRNAs. The tool integrates miRNA-target predictions with signaling cascade members by utilizing the TargetScanHuman 7.1 miRNA-target prediction tool and the KEGG pathway database, and thus provides researchers with in-depth insights into the modulation of signal transduction pathways by miRNAs. miRnalyze is capable of identifying common miRNAs targeting more than one gene in the same signaling pathway, a feature that further increases the probability of modulating the pathway and downstream reactions when using miRNA modulators. Additionally, miRnalyze can sort miRNAs according to the seed-match types and TargetScan Context++ score, thus providing a hierarchical list of the most valuable miRNAs. Furthermore, in order to provide users with comprehensive information regarding miRNAs, genes and pathways, miRnalyze also links to expression data of miRNAs (miRmine) and genes (TiGER) and proteome abundance (PaxDb) data. To validate the capability of the tool, we have documented the correlation of miRnalyze’s predictions with experimental confirmation studies. Database URL: http://www.mirnalyze.in PMID:28365733
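The idea of finding miRNAs that target more than one gene in the same pathway can be sketched as a set intersection between a pathway's gene list and each miRNA's predicted targets. This is an illustrative toy, not miRnalyze's implementation; the function name, the `min_genes` parameter and all identifiers in the example data are placeholders.

```python
def shared_pathway_mirnas(pathway_genes, mirna_targets, min_genes=2):
    """List miRNAs whose predicted targets include at least min_genes pathway members."""
    pathway = set(pathway_genes)
    hits = {}
    for mirna, targets in mirna_targets.items():
        in_pathway = sorted(pathway & set(targets))
        if len(in_pathway) >= min_genes:
            hits[mirna] = in_pathway
    # Most pathway genes first; ties broken by miRNA name.
    return sorted(hits.items(), key=lambda item: (-len(item[1]), item[0]))

# Placeholder pathway and predictions, not TargetScan or KEGG data.
toy_pathway = ["gene_1", "gene_2", "gene_3", "gene_4"]
toy_targets = {
    "miR-a": ["gene_1", "gene_2", "gene_9"],
    "miR-b": ["gene_3"],
    "miR-c": ["gene_1", "gene_3", "gene_4"],
}
shared = shared_pathway_mirnas(toy_pathway, toy_targets)
```

A real ranking would add the per-site scores mentioned in the abstract (seed-match type, context score) as further sort keys.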
Jeffryes, James G.; Colastani, Ricardo L.; Elbadawi-Sidhu, Mona; ...
2015-08-28
Metabolomics has proven difficult to execute in an untargeted and generalizable manner. Liquid chromatography–mass spectrometry (LC–MS) has made it possible to gather data on thousands of cellular metabolites. However, matching metabolites to their spectral features continues to be a bottleneck, meaning that much of the collected information remains uninterpreted and that new metabolites are seldom discovered in untargeted studies. These challenges require new approaches that consider compounds beyond those available in curated biochemistry databases. Here we present Metabolic In silico Network Expansions (MINEs), an extension of known metabolite databases to include molecules that have not been observed, but are likely to occur based on known metabolites and common biochemical reactions. We utilize an algorithm called the Biochemical Network Integrated Computational Explorer (BNICE) and expert-curated reaction rules based on the Enzyme Commission classification system to propose the novel chemical structures and reactions that comprise MINE databases. Starting from the Kyoto Encyclopedia of Genes and Genomes (KEGG) COMPOUND database, the MINE contains over 571,000 compounds, of which 93% are not present in the PubChem database. However, these MINE compounds have on average higher structural similarity to natural products than compounds from KEGG or PubChem. MINE databases were able to propose annotations for 98.6% of a set of 667 MassBank spectra, 14% more than KEGG alone and equivalent to PubChem while returning far fewer candidates per spectra than PubChem (46 vs. 1715 median candidates). Application of MINEs to LC–MS accurate mass data enabled the identity of an unknown peak to be confidently predicted. MINE databases are freely accessible for non-commercial use via user-friendly web-tools at http://minedatabase.mcs.anl.gov and developer-friendly APIs.
MINEs improve metabolomics peak identification as compared to general chemical databases whose results include irrelevant synthetic compounds. MINEs complement and expand on previous in silico generated compound databases that focus on human metabolism. We are actively developing the database; future versions of this resource will incorporate transformation rules for spontaneous chemical reactions and more advanced filtering and prioritization of candidate structures.
Mallik, Saurav; Sen, Sagnik; Maulik, Ujjwal
2016-07-15
Involvement of intrinsically disordered proteins (IDPs) with various dreadful diseases like cancer is an interesting research topic. In order to gain novel insights into the regulation of IDPs, in this article, we perform a transcriptomic analysis of mRNAs (genes) for transcripts encoding IDPs on a human multi-omics prostate carcinoma dataset having both gene expression and methylation data. In this regard, firstly the genes that consist of both the expression and methylation data, and that are corresponding to the cancer-related prostate-tissue-specific disordered proteins of MobiDb database, are selected. We apply standard t-test for determining differentially expressed genes as well as differentially methylated genes. A network having these genes and their targeter miRNAs from Diana Tarbase v7.0 database and corresponding Transcription Factors from TRANSFAC and ITFP databases, is then built. Thereafter, we perform literature search, and KEGG pathway and Gene Ontology analyses using DAVID database. Finally, we report several significant potential gene-markers (with the corresponding IDPs) that have inverse relationship between differential expression and methylation patterns, and that are hub genes of the TF-miRNA-gene network. Copyright © 2016 Elsevier B.V. All rights reserved.
NASA Astrophysics Data System (ADS)
Prastowo, S.; Widyas, N.
2018-03-01
AMP-activated protein kinase (AMPK) is a cellular energy sensor that responds to ATP and AMP concentrations. This protein interacts with mitochondria in determining its activity to generate energy for cell metabolism. This paper therefore aims to compare the protein-protein interactions of AMPK and mitochondrial activity genes in the metabolism of common domesticated farm animals, namely cattle (Bos taurus), pig (Sus scrofa) and chicken (Gallus gallus). The in silico study was done using STRING V.10 as a prominent protein interaction database, followed by a biological function comparison in the KEGG PATHWAY database. A set of 12 genes was used as input for the analysis: PRKAA1, PRKAA2, PRKAB1, PRKAB2, PRKAG1, PRKAG2, PRKAG3, PPARGC1, ACC, CPT1B, NRF2 and SOD. The first 7 genes belong to the AMPK family, while the last 5 are mitochondrial activity genes. The protein interaction results show 11, 8 and 5 metabolism pathways in Bos taurus, Sus scrofa and Gallus gallus, respectively. The top pathway in Bos taurus is the AMPK signaling pathway (10 genes), in Sus scrofa the Adipocytokine signaling pathway (8 genes) and in Gallus gallus the FoxO signaling pathway (5 genes). Moreover, the pathways common to all 3 species are the Adipocytokine, Insulin and FoxO signaling pathways. The genes clustered in the Adipocytokine and Insulin signaling pathways are PRKAA2, PPARGC1A, PRKAB1 and PRKAG2, while those in the FoxO signaling pathway are PRKAA2, PRKAB1 and PRKAG2. Accordingly, PRKAA2, PRKAB1 and PRKAG2 are the common genes. Based on this bioinformatics analysis, protein-protein interactions show distinct differences in metabolism between species. However, further validation is needed to give a clear explanation.
ChemiRs: a web application for microRNAs and chemicals.
Su, Emily Chia-Yu; Chen, Yu-Sing; Tien, Yun-Cheng; Liu, Jeff; Ho, Bing-Ching; Yu, Sung-Liang; Singh, Sher
2016-04-18
MicroRNAs (miRNAs) are non-coding RNAs of about 22 nucleotides that affect various cellular functions and play a regulatory role in different organisms, including humans. To date, more than 2500 mature human miRNAs have been discovered and registered, but there is still a lack of information or algorithms to reveal the relations among miRNAs, environmental chemicals and human health. Chemicals in the environment affect our health and daily life, and some of them can lead to diseases by interfering with biological pathways. We developed a credible online web server, ChemiRs, for predicting interactions and relations among miRNAs, chemicals and pathways. The database not only compares gene lists affected by chemicals and miRNAs, but also incorporates curated pathways to identify possible interactions. We manually retrieved associations of miRNAs and chemicals from the biomedical literature. ChemiRs contains miRNAs, diseases, Medical Subject Heading (MeSH) terms, chemicals, genes, pathways and PubMed IDs. We connected each miRNA to miRBase, and every current gene symbol to the HUGO Gene Nomenclature Committee (HGNC) for genome annotation. Human pathway information is also provided from the KEGG and REACTOME databases. Information about Gene Ontology (GO) is queried from the GO Online SQL Environment (GOOSE). With a user-friendly interface, the web application is easy to use. Multiple query results can be easily integrated and exported as report documents in PDF format. Association analysis of miRNAs and chemicals can help us understand the pathogenesis of chemical components. ChemiRs is freely available for public use at http://omics.biol.ntnu.edu.tw/ChemiRs .
Transcriptomic analysis of flower development in tea (Camellia sinensis (L.)).
Liu, Feng; Wang, Yu; Ding, Zhaotang; Zhao, Lei; Xiao, Jun; Wang, Linjun; Ding, Shibo
2017-10-05
Flowering is a critical and complicated process in plant development, involving interactions of numerous endogenous and environmental factors, but little is known about the complex network regulating flower development in tea plants. In this study, de novo transcriptome assembly and gene expression analysis using Illumina sequencing technology were performed. Transcriptomic analysis assembles gene-related information involved in reproductive growth of C. sinensis. Gene Ontology (GO) analysis of the annotated unigenes revealed that the majority of sequenced genes were associated with metabolic and cellular processes, cell and cell parts, catalytic activity and binding. Kyoto Encyclopedia of Genes and Genomes (KEGG) pathway analysis indicated that metabolic pathways, biosynthesis of secondary metabolites, and plant hormone signal transduction were enriched among the DEGs. Furthermore, 207 flowering-associated unigenes were identified from our database. Some transcription factors, such as WRKY, ERF, bHLH, MYB and MADS-box were shown to be up-regulated in floral transition, which might play the role of progression of flowering. Furthermore, 14 genes were selected for confirmation of expression levels using quantitative real-time PCR (qRT-PCR). The comprehensive transcriptomic analysis presents fundamental information on the genes and pathways which are involved in flower development in C. sinensis. Our data also provided a useful database for further research of tea and other species of plants. Copyright © 2017 Elsevier B.V. All rights reserved.
Gao, Jing; Li, Yuhong; Wang, Tongmei; Shi, Zhuo; Zhang, Yiqi; Liu, Shuang; Wen, Pushuai; Ma, Chunyan
2018-03-06
The aim of this study was to identify the key genes involved in the cardiac hypertrophy (CH) induced by pressure overload. mRNA microarray dataset GSE5500 and GSE18801 were downloaded from GEO database, and differentially expressed genes (DEGs) were screened using Limma package; then, functional and pathway enrichment analysis were performed for common DEGs using DAVID database. Furthermore, the top DEGs were further validated using qPCR in the hypertrophic heart tissue induced by Isoprenaline (ISO). A total of 113 common DEGs with absolute fold change >0.5, including 60 significantly up-regulated DEGs and 53 down-regulated DEGs were obtained. GO term enrichment analysis suggested that common up-regulated DEG mainly enriched in neutrophil chemotaxis, extracellular fibril organization and cell proliferation, and the common down-regulated genes were significantly enriched in ion transport, endoplasmic reticulum and dendritic spine. KEGG pathway analysis found that the common DEGs were mainly enriched in ECM-receptor interaction, phagosome, and focal adhesion. Additionally, the expression of Mfap4, Ltbp2, Aspn, Serpina3n, and Cnksr1 were up-regulated in the model of cardiac hypertrophy, while the expression of Anp32a was down-regulated. The current study identified the key deregulated genes and pathways involved in the CH, which could shed new light to understand the mechanism of CH.
NASA Astrophysics Data System (ADS)
Ferreira, M.; Creveling, J.; Hilburn, I.; Karlsson, E.; Pepe-Ranney, C.; Spear, J.; Dawson, S.; Geobio2008, I.
2008-12-01
Silicified structures that exhibit a putative biologic component in their formation permeate the rock record as stromatolites. We have studied a silicified microbial structure from a hot spring in Yellowstone National Park using phenotypic, phylogenetic, and metagenomic analyses to determine microbial carbon metabolic pathways and the phylogenetic affiliations of microbes present in this unique structure. In this multi-faceted approach, dominant physiologies, specifically with regards to anaerobic and aerobic metabolisms, were inferred from 16S rRNA gene sequences and 454 sequencing data from bulk DNA samples of the structure. Carbon utilization as indicated by ECO Biolog plates showed abundant heterotrophy and heterotrophic diversity throughout the microbial structure. Microbes within the structure are able to utilize all tested sources of carbohydrates, lipids/fatty acids, and protein/amino acids as carbon sources. ECO plate testing of the hot spring water yielded considerably less carbohydrate consumption (only 4 out of 13 tested carbohydrates) and similar lipids/fatty acids and protein/amino acids consumption (2 out of 3 and 5 out of 5 tested sources respectively). Full length 16S rRNA gene sequences and metagenomic 454 pyrosequencing of community DNA showed limited diversity among primary producers. From the 16S data, the majority of the autotrophs are inferred to utilize the Calvin cycle for CO2 fixation, followed by 3-hydroxypropionate/4-hydroxybutyrate CO2 fixation. However, an analysis of the metagenomic data compared to the KEGG database does not show genes directly involved with Calvin cycle carbon fixation. Further BLAST searches of our data failed to find significant matches within our 6514 metagenomic sequences to known RuBisCo sequences taken from the NCBI database. This is likely due to a far under-sampled dataset of metagenomic sequences, and the low number (958) that had matches to the KEGG pathways database.
Anaerobic versus aerobic physiology also can be estimated from the 16S clone libraries. Phylogenetic analysis of recovered 16S sequences suggests that 15% of the 16S sequences can be attributed to anaerobic microbes while 42% likely come from aerobes. The remaining 43% of 16S rRNA gene sequences belong to metabolically unassigned phyla both known and novel. This preliminary study demonstrates that the small spatially stratified silicified microbial structure present on the margins of a hot spring contains a rich and complex microbial community with different trophic levels and enzymatic pathways.
Jeffryes, James G; Colastani, Ricardo L; Elbadawi-Sidhu, Mona; Kind, Tobias; Niehaus, Thomas D; Broadbelt, Linda J; Hanson, Andrew D; Fiehn, Oliver; Tyo, Keith E J; Henry, Christopher S
2015-01-01
In spite of its great promise, metabolomics has proven difficult to execute in an untargeted and generalizable manner. Liquid chromatography-mass spectrometry (LC-MS) has made it possible to gather data on thousands of cellular metabolites. However, matching metabolites to their spectral features continues to be a bottleneck, meaning that much of the collected information remains uninterpreted and that new metabolites are seldom discovered in untargeted studies. These challenges require new approaches that consider compounds beyond those available in curated biochemistry databases. Here we present Metabolic In silico Network Expansions (MINEs), an extension of known metabolite databases to include molecules that have not been observed, but are likely to occur based on known metabolites and common biochemical reactions. We utilize an algorithm called the Biochemical Network Integrated Computational Explorer (BNICE) and expert-curated reaction rules based on the Enzyme Commission classification system to propose the novel chemical structures and reactions that comprise MINE databases. Starting from the Kyoto Encyclopedia of Genes and Genomes (KEGG) COMPOUND database, the MINE contains over 571,000 compounds, of which 93% are not present in the PubChem database. However, these MINE compounds have on average higher structural similarity to natural products than compounds from KEGG or PubChem. MINE databases were able to propose annotations for 98.6% of a set of 667 MassBank spectra, 14% more than KEGG alone and equivalent to PubChem while returning far fewer candidates per spectra than PubChem (46 vs. 1715 median candidates). Application of MINEs to LC-MS accurate mass data enabled the identity of an unknown peak to be confidently predicted. MINE databases are freely accessible for non-commercial use via user-friendly web-tools at http://minedatabase.mcs.anl.gov and developer-friendly APIs. 
MINEs improve metabolomics peak identification as compared to general chemical databases whose results include irrelevant synthetic compounds. Furthermore, MINEs complement and expand on previous in silico generated compound databases that focus on human metabolism. We are actively developing the database; future versions of this resource will incorporate transformation rules for spontaneous chemical reactions and more advanced filtering and prioritization of candidate structures. Graphical abstract: MINE database construction and access methods. The process of constructing a MINE database from the curated source databases is depicted on the left. The methods for accessing the database are shown on the right.
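The accurate-mass annotation idea in this abstract (proposing candidate compounds for an observed peak) can be sketched as a lookup within a parts-per-million tolerance. This is only an illustration under stated assumptions: it does not use the MINE web tools or APIs, the function names and the 5 ppm tolerance are hypothetical, and the three-compound table is a toy stand-in for a real compound database with approximate monoisotopic masses.

```python
def ppm_error(observed, theoretical):
    """Signed mass error in parts per million."""
    return (observed - theoretical) / theoretical * 1e6


def candidates_within_ppm(observed_mass, compounds, tol_ppm=5.0):
    """Return (name, mass, ppm error) for compounds within the tolerance,
    sorted by absolute error (ties keep their input order)."""
    hits = []
    for name, mass in compounds:
        err = ppm_error(observed_mass, mass)
        if abs(err) <= tol_ppm:
            hits.append((name, mass, err))
    return sorted(hits, key=lambda hit: abs(hit[2]))


if __name__ == "__main__":
    # Toy table: approximate monoisotopic neutral masses (hypothetical database).
    toy_db = [
        ("glucose", 180.06339),      # C6H12O6
        ("fructose", 180.06339),     # C6H12O6, isomer with identical mass
        ("citric acid", 192.02700),  # C6H8O7
    ]
    for name, mass, err in candidates_within_ppm(180.0630, toy_db):
        print(f"{name}: {mass:.5f} Da ({err:+.2f} ppm)")
```

The example also shows why mass alone leaves several candidates per peak: isomers such as glucose and fructose cannot be separated without further evidence, for example spectra.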
Identification of pathogenic genes and upstream regulators in age-related macular degeneration.
Zhao, Bin; Wang, Mengya; Xu, Jing; Li, Min; Yu, Yuhui
2017-06-26
Age-related macular degeneration (AMD) is the leading cause of irreversible blindness in older individuals. Our study aims to identify the key genes and upstream regulators in AMD. To screen pathogenic genes of AMD, an integrated analysis was performed by using the microarray datasets in AMD derived from the Gene Expression Omnibus (GEO) database. The functional annotation and potential pathways of differentially expressed genes (DEGs) were further discovered by Gene Ontology (GO) and Kyoto Encyclopedia of Genes and Genomes (KEGG) enrichment analysis. We constructed the AMD-specific transcriptional regulatory network to find the crucial transcriptional factors (TFs) which target the DEGs in AMD. Quantitative real time polymerase chain reaction (qRT-PCR) was performed to verify the DEGs and TFs obtained by integrated analysis. From two GEO datasets obtained, we identified 1280 DEGs (730 up-regulated and 550 down-regulated genes) between AMD and normal control (NC). KEGG analysis showed steroid biosynthesis to be a significantly enriched pathway for DEGs. The expression of 8 genes (TNC, GRP, TRAF6, ADAMTS5, GPX3, FAP, DHCR7 and FDFT1) was detected. Except for TNC and GPX3, the other 6 genes showed the same expression pattern in qRT-PCR as in our integrated analysis. The dysregulation of these eight genes may be involved in the process of AMD. Two crucial transcription factors (c-rel and myogenin) were concluded to play a role in AMD. In particular, myogenin was associated with AMD by regulating TNC, GRP and FAP. Our findings can contribute to developing new potential biomarkers, revealing the underlying pathogenesis, and further raising new therapeutic targets for AMD.
Gene network biological validity based on gene-gene interaction relevance.
Gómez-Vela, Francisco; Díaz-Díaz, Norberto
2014-01-01
In recent years, gene networks have become one of the most useful tools for modeling biological processes. Many inference gene network algorithms have been developed as techniques for extracting knowledge from gene expression data. Ensuring the reliability of the inferred gene relationships is a crucial task in any study in order to prove that the algorithms used are precise. Usually, this validation process can be carried out using prior biological knowledge. The metabolic pathways stored in KEGG are one of the most widely used knowledgeable sources for analyzing relationships between genes. This paper introduces a new methodology, GeneNetVal, to assess the biological validity of gene networks based on the relevance of the gene-gene interactions stored in KEGG metabolic pathways. Hence, a complete KEGG pathway conversion into a gene association network and a new matching distance based on gene-gene interaction relevance are proposed. The performance of GeneNetVal was established with three different experiments. Firstly, our proposal is tested in a comparative ROC analysis. Secondly, a randomness study is presented to show the behavior of GeneNetVal when the noise is increased in the input network. Finally, the ability of GeneNetVal to detect biological functionality of the network is shown.
2011-01-01
Background Sporadic amyotrophic lateral sclerosis (sALS) is a motor neuron disease with poorly understood etiology. Results of gene expression profiling studies of whole blood from ALS patients have not been validated and are difficult to relate to ALS pathogenesis because gene expression profiles depend on the relative abundance of the different cell types present in whole blood. We conducted microarray analyses using Agilent Human Whole Genome 4 × 44k Arrays on a more homogeneous cell population, namely purified peripheral blood lymphocytes (PBLs), from ALS patients and healthy controls to identify molecular signatures possibly relevant to ALS pathogenesis. Methods Differentially expressed genes were determined by LIMMA (Linear Models for MicroArray) and SAM (Significance Analysis of Microarrays) analyses. The SAFE (Significance Analysis of Function and Expression) procedure was used to identify molecular pathway perturbations. Proteasome inhibition assays were conducted on cultured peripheral blood mononuclear cells (PBMCs) from ALS patients to confirm alteration of the Ubiquitin/Proteasome System (UPS). Results For the first time, using SAFE in a global gene ontology analysis (gene set size 5-100), we show significant perturbation of the KEGG (Kyoto Encyclopedia of Genes and Genomes) ALS pathway of motor neuron degeneration in PBLs from ALS patients. This was the only KEGG disease pathway significantly upregulated among 25, and contributing genes, including SOD1, represented 54% of the encoded proteins or protein complexes of the KEGG ALS pathway. Further SAFE analysis, including gene set sizes >100, showed that only neurodegenerative diseases (4 out of 34 disease pathways) including ALS were significantly upregulated. Changes in UBR2 expression correlated inversely with time since onset of disease and directly with ALSFRS-R, implying that UBR2 was increased early in the course of ALS. 
Cultured PBMCs from ALS patients accumulated more ubiquitinated proteins than PBMCs from healthy controls in a serum-dependent manner confirming changes in this pathway. Conclusions Our study indicates that PBLs from sALS patients are strong responders to systemic signals or local signals acquired by cell trafficking, representing changes in gene expression similar to those present in brain and spinal cord of sALS patients. PBLs may provide a useful means to study ALS pathogenesis. PMID:22027401
Columba: an integrated database of proteins, structures, and annotations.
Trissl, Silke; Rother, Kristian; Müller, Heiko; Steinke, Thomas; Koch, Ina; Preissner, Robert; Frömmel, Cornelius; Leser, Ulf
2005-03-31
Structural and functional research often requires the computation of sets of protein structures based on certain properties of the proteins, such as sequence features, fold classification, or functional annotation. Compiling such sets using current web resources is tedious because the necessary data are spread over many different databases. To facilitate this task, we have created COLUMBA, an integrated database of annotations of protein structures. COLUMBA currently integrates twelve different databases, including PDB, KEGG, Swiss-Prot, CATH, SCOP, the Gene Ontology, and ENZYME. The database can be searched using either keyword search or data source-specific web forms. Users can thus quickly select and download PDB entries that, for instance, participate in a particular pathway, are classified as containing a certain CATH architecture, are annotated as having a certain molecular function in the Gene Ontology, and whose structures have a resolution under a defined threshold. The results of queries are provided in both machine-readable extensible markup language and human-readable format. The structures themselves can be viewed interactively on the web. The COLUMBA database facilitates the creation of protein structure data sets for many structure-based studies. It allows to combine queries on a number of structure-related databases not covered by other projects at present. Thus, information on both many and few protein structures can be used efficiently. The web interface for COLUMBA is available at http://www.columba-db.de.
NASA Astrophysics Data System (ADS)
Wu, Kun; Huang, Chao; Shi, Xi; Chen, Feng; Xu, Yi-Huan; Pan, Ya-Xiong; Luo, Zhi; Liu, Xu
2016-12-01
Previous studies have investigated the physiological responses in the liver of Synechogobius hasta exposed to waterborne zinc (Zn). However, at present, very little is known about the underlying molecular mechanisms of these responses. In this study, RNA sequencing (RNA-seq) was performed to analyse the differences in the hepatic transcriptomes between control and Zn-exposed S. hasta. A total of 36,339 unigenes and 1,615 bp of unigene N50 were detected. These genes were further annotated to the Nonredundant protein (NR), Nonredundant nucleotide (Nt), Swiss-Prot, Kyoto Encyclopedia of Genes and Genomes (KEGG), Clusters of Orthologous Groups (COG) and Gene Ontology (GO) databases. After 60 days of Zn exposure, 708 and 237 genes were significantly up- and down-regulated, respectively. Many differentially expressed genes (DEGs) involved in energy metabolic pathways were identified, and their expression profiles suggested increased catabolic processes and reduced biosynthetic processes. These changes indicated that waterborne Zn exposure increased the energy production and requirement, which was related to the activation of the AMPK signalling pathway. Furthermore, using the primary hepatocytes of S. hasta, we identified the role of the AMPK signalling pathway in Zn-influenced energy metabolism.
Gene expression profiles in liver of mouse after chronic exposure to drinking water.
Wu, Bing; Zhang, Yan; Zhao, Dayong; Zhang, Xuxiang; Kong, Zhiming; Cheng, Shupei
2009-10-01
A cDNA microarray approach was applied to hepatic transcriptional profile analysis in male mice (Mus musculus, ICR) to assess the potential health effects of drinking water in Nanjing, China. Mice were treated with continuous exposure to drinking water for 90 days. Hepatic gene expression was analyzed with Affymetrix Mouse Genome 430A 2.0 arrays, and pathway analysis was carried out by Molecule Annotation System 2.0 and the KEGG pathway database. A total of 836 genes were found to be significantly altered (1.5-fold, P ≤ 0.05), including 294 up-regulated genes and 542 down-regulated genes. According to biological pathway analysis, drinking water exposure resulted in aberration of gene expression and biological pathways linked to xenobiotic metabolism, signal transduction, cell cycle and oxidative stress response. Further, deregulation of several genes associated with carcinogenesis or tumor progression including Ccnd1, Egfr, Map2k3, Mcm2, Orc2l and Smad2 was observed. Although transcription changes in identified genes are unlikely to be used as a sole indicator of adverse health effects, the results of this study could enhance our understanding of early toxic effects of drinking water exposure and support future studies on drinking water safety.
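The selection rule quoted in this abstract (1.5-fold change with P ≤ 0.05) can be sketched as a simple filter. This is an assumption-laden illustration: the abstract does not say how fold change was computed, so the sketch assumes a treated/control expression ratio, and the function name, gene labels and values are all invented.

```python
def classify_genes(records, fold_cutoff=1.5, p_cutoff=0.05):
    """Split genes into up- and down-regulated lists.

    records: iterable of (gene, ratio, p_value), where ratio is assumed to be
    treated/control expression (an assumption, not stated in the abstract).
    """
    up, down = [], []
    for gene, ratio, p_value in records:
        if p_value > p_cutoff:
            continue  # not statistically significant
        if ratio >= fold_cutoff:
            up.append(gene)
        elif ratio <= 1.0 / fold_cutoff:
            down.append(gene)
    return up, down


if __name__ == "__main__":
    # Invented example values, for illustration only.
    example = [
        ("geneA", 2.10, 0.010),  # up: ratio >= 1.5 and significant
        ("geneB", 0.50, 0.030),  # down: ratio <= 1/1.5 and significant
        ("geneC", 1.20, 0.001),  # significant, but change too small
        ("geneD", 3.00, 0.200),  # large change, but not significant
    ]
    up, down = classify_genes(example)
    print("up:", up)
    print("down:", down)
```

Applying both cutoffs together is what keeps a gene with a large but noisy change (geneD here) out of the final list.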
Li, Xihong; Cui, Zhaoxia; Liu, Yuan; Song, Chengwen; Shi, Guohui
2013-01-01
Background The Chinese mitten crab Eriocheir sinensis is an economically important crustacean that has been seriously affected by various diseases, which calls for more information on immune-relevant genes at the genome level. Recently, high-throughput RNA sequencing (RNA-seq) technology has provided a powerful and efficient method for transcript analysis and immune gene discovery. Methods/Principal Findings A cDNA library from hepatopancreas of E. sinensis challenged by a mixture of three pathogen strains (Gram-positive bacteria Micrococcus luteus, Gram-negative bacteria Vibrio alginolyticus and fungi Pichia pastoris; 10⁸ cfu·mL⁻¹) was constructed and randomly sequenced using the Illumina technique. In total, 39.76 million clean reads were assembled into 70,300 unigenes. After ruling out short-length and low-quality sequences, 52,074 non-redundant unigenes were compared to public databases for homology searching and 17,617 of them showed high similarity to sequences in the NCBI non-redundant protein (Nr) database. For function classification and pathway assignment, 18,734 (36.00%) unigenes were categorized to three Gene Ontology (GO) categories, 12,243 (23.51%) were classified to 25 Clusters of Orthologous Groups (COG), and 8,983 (17.25%) were assigned to six Kyoto Encyclopedia of Genes and Genomes (KEGG) pathways. Potentially, 24, 14, 47 and 132 unigenes were characterized to be involved in Toll, IMD, JAK-STAT and MAPK pathways, respectively. Conclusions/Significance This is the first systematic transcriptome analysis of components relating to innate immune pathways in E. sinensis. Functional genes and putative pathways identified here will contribute to a better understanding of the immune system and to the prevention of various diseases in crab. PMID:23874555
Liu, Yao-Zhong; Zhang, Lei; Roy-Engel, Astrid M; Saito, Shigeki; Lasky, Joseph A; Wang, Guangdi; Wang, He
2016-01-01
The health impacts of the BP oil spill are yet to be further revealed as the toxicological effects of oil products and dispersants on human respiratory system may be latent and complex, and hence difficult to study and follow up. Here we performed RNA-seq analyses of a system of human airway epithelial cells treated with the BP crude oil and/or dispersants Corexit 9500 and Corexit 9527 that were used to help break up the oil spill. Based on the RNA-seq data, we then systemically analyzed the transcriptomic perturbations of the cells at the KEGG pathway level using two pathway-based analysis tools, GAGE (generally applicable gene set enrichment) and GSNCA (Gene Sets Net Correlations Analysis). Our results suggested a pattern of change towards carcinogenesis for the treated cells marked by upregulation of ribosomal biosynthesis (hsa03008) (p = 1.97e-13), protein processing (hsa04141) (p = 4.09e-7), Wnt signaling (hsa04310) (p = 6.76e-3), neurotrophin signaling (hsa04722) (p = 7.73e-3) and insulin signaling (hsa04910) (p = 1.16e-2) pathways under the dispersant Corexit 9527 treatment, as identified by GAGE analysis. Furthermore, through GSNCA analysis, we identified gene co-expression changes for several KEGG cancer pathways, including small cell lung cancer pathway (hsa05222, p = 9.99e-5), under various treatments of oil/dispersant, especially the mixture of oil and Corexit 9527. Overall, our results suggested carcinogenic effects of dispersants (in particular Corexit 9527) and their mixtures with the BP crude oil, and provided further support for more stringent safety precautions and regulations for operations involving long-term respiratory exposure to oil and dispersants. PMID:27866042
METscout: a pathfinder exploring the landscape of metabolites, enzymes and transporters.
Geffers, Lars; Tetzlaff, Benjamin; Cui, Xiao; Yan, Jun; Eichele, Gregor
2013-01-01
METscout (http://metscout.mpg.de) brings together metabolism and gene expression landscapes. It is a MySQL relational database linking biochemical pathway information with 3D patterns of gene expression determined by robotic in situ hybridization in the E14.5 mouse embryo. The sites of expression of ∼1500 metabolic enzymes and of ∼350 solute carriers (SLCs) were included and are accessible as single cell resolution images and in the form of semi-quantitative image abstractions. METscout provides several graphical web-interfaces allowing navigation through complex anatomical and metabolic information. Specifically, the database shows where in the organism each of the many metabolic reactions take place and where SLCs transport metabolites. To link enzymatic reactions and transport, the KEGG metabolic reaction network was extended to include metabolite transport. This network in conjunction with spatial expression pattern of the network genes allows for a tracing of metabolic reactions and transport processes across the entire body of the embryo.
Fang, Lu; Yang, Yuchen; Guo, Wuxia; Li, Jianfang; Zhong, Cairong; Huang, Yelin; Zhou, Renchao; Shi, Suhua
2016-08-01
Aegiceras corniculatum (L.) Blanco is one of the most salt tolerant mangrove species and can thrive in 3% salinity at the seaward edge of mangrove forests. Here we sequenced the transcriptome of A. corniculatum using the Illumina GA platform to develop its genomic resources for ecological and evolutionary studies. We obtained about 50 million high-quality paired-end reads 75 bp in length. Using the short read assembler Velvet, we obtained 49,437 contigs with an average length of 625 bp. A total of 32,744 (66.23%) contigs showed significant similarity to the GenBank non-redundant (NR) protein database. 30,911 and 18,004 of these sequences were assigned to Gene Ontology and eukaryotic orthologous groups of proteins (KOG), respectively. A total of 4942 transcripts from our assemblies had significant similarity with KEGG Orthologs and were involved in 144 KEGG pathways, while 9899 unigenes had enzyme commission (EC) numbers. In addition, 9792 transcriptome-derived SSRs were identified from 7342 sequences. With our strict criteria, 4165 candidate SNPs were also identified from 2058 contigs. Some of these SNPs were further validated by Sanger sequencing. Genomic resources generated in this study should be valuable in ecological, evolutionary, and functional genomics studies for this mangrove species. Copyright © 2016 Elsevier B.V. All rights reserved.
B-cell Ligand Processing Pathways Detected by Large-scale Comparative Analysis
Towfic, Fadi; Gupta, Shakti; Honavar, Vasant; Subramaniam, Shankar
2012-01-01
The initiation of B-cell ligand recognition is a critical step for the generation of an immune response against foreign bodies. We sought to identify the biochemical pathways involved in the B-cell ligand recognition cascade and sets of ligands that trigger similar immunological responses. We utilized several comparative approaches to analyze the gene coexpression networks generated from a set of microarray experiments spanning 33 different ligands. First, we compared the degree distributions of the generated networks. Second, we utilized a pairwise network alignment algorithm, BiNA, to align the networks based on the hubs in the networks. Third, we aligned the networks based on a set of KEGG pathways. We summarized our results by constructing a consensus hierarchy of pathways that are involved in B cell ligand recognition. The resulting pathways were further validated through literature for their common physiological responses. Collectively, the results based on our comparative analyses of degree distributions, alignment of hubs, and alignment based on KEGG pathways provide a basis for molecular characterization of the immune response states of B-cells and demonstrate the power of comparative approaches (e.g., gene coexpression network alignment algorithms) in elucidating biochemical pathways involved in complex signaling events in cells. PMID:22917187
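The first comparative step described above, comparing the degree distributions of coexpression networks, can be illustrated with a minimal sketch. This is not the authors' pipeline: the edge lists, the gene names, and the choice of total variation distance as the comparison measure are all assumptions made for illustration.

```python
from collections import Counter


def degree_distribution(edges):
    """Return {degree: fraction of nodes with that degree} for an undirected edge list."""
    degree = Counter()
    for u, v in edges:
        degree[u] += 1
        degree[v] += 1
    counts = Counter(degree.values())
    n_nodes = len(degree)
    return {k: c / n_nodes for k, c in sorted(counts.items())}


def total_variation(p, q):
    """Half the L1 distance between two distributions (0 = identical, 1 = disjoint)."""
    keys = set(p) | set(q)
    return 0.5 * sum(abs(p.get(k, 0.0) - q.get(k, 0.0)) for k in keys)


# Hypothetical toy coexpression networks for two ligands (not data from the study).
net_a = [("g1", "g2"), ("g1", "g3"), ("g1", "g4")]  # star: one hub gene
net_b = [("g1", "g2"), ("g2", "g3"), ("g3", "g4")]  # path: no hub gene

dist_a = degree_distribution(net_a)
dist_b = degree_distribution(net_b)
distance = total_variation(dist_a, dist_b)
print(dist_a, dist_b, distance)
```

A small distance would suggest that two ligands induce networks with similar connectivity; other measures could be substituted without changing the overall structure.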
Akune, Yukie; Lin, Chi-Hung; Abrahams, Jodie L; Zhang, Jingyu; Packer, Nicolle H; Aoki-Kinoshita, Kiyoko F; Campbell, Matthew P
2016-08-05
Glycan structures attached to proteins are comprised of diverse monosaccharide sequences and linkages that are produced from precursor nucleotide-sugars by a series of glycosyltransferases. Databases of these structures are an essential resource for the interpretation of analytical data and the development of bioinformatics tools. However, with no template to predict what structures are possible, the human glycan structure databases are incomplete and rely heavily on the curation of published, experimentally determined glycan structure data. In this work, a library of 45 human glycosyltransferases was used to generate a theoretical database of N-glycan structures comprised of 15 or fewer monosaccharide residues. Enzyme specificities were sourced from major online databases including Kyoto Encyclopedia of Genes and Genomes (KEGG) Glycan, Consortium for Functional Glycomics (CFG), Carbohydrate-Active enZymes (CAZy), GlycoGene DataBase (GGDB) and BRENDA. Based on the known activities, more than 1.1 million theoretical structures and 4.7 million synthetic reactions were generated and stored in our database called UniCorn. Furthermore, we analyzed the differences between the predicted glycan structures in UniCorn and those contained in UniCarbKB (www.unicarbkb.org), a database which stores experimentally described glycan structures reported in the literature, and demonstrate that UniCorn can be used to aid in the assignment of ambiguous structures whilst also serving as a discovery database. Copyright © 2016 Elsevier Ltd. All rights reserved.
Gene Polymorphism Studies in a Teaching Laboratory
NASA Astrophysics Data System (ADS)
Shultz, Jeffry
2009-02-01
I present a laboratory procedure for illustrating transcription, post-transcriptional modification, gene conservation, and comparative genetics for use in undergraduate biology education. Students are individually assigned genes in a targeted biochemical pathway, for which they design and test polymerase chain reaction (PCR) primers. In this example, students used genes annotated for the steroid biosynthesis pathway in soybean. The authoritative Kyoto encyclopedia of genes and genomes (KEGG) interactive database and other online resources were used to design primers based first on soybean expressed sequence tags (ESTs), then on ESTs from an alternate organism if soybean sequence was unavailable. Students designed a total of 50 gene-based primer pairs (37 soybean, 13 alternative) and tested these for polymorphism state and similarity between two soybean and two pea lines. Student assessment was based on acquisition of laboratory skills and successful project completion. This simple procedure illustrates conservation of genes and is not limited to soybean or pea. Cost per student estimates are included, along with a detailed protocol and flow diagram of the procedure.
Zhang, Chaoyang; Peng, Li; Zhang, Yaqin; Liu, Zhaoyang; Li, Wenling; Chen, Shilian; Li, Guancheng
2017-06-01
Liver cancer is a serious threat to public health and has a fairly complicated pathogenesis. Therefore, the identification of key genes and pathways is important for clarifying the molecular mechanism of hepatocellular carcinoma (HCC) initiation and progression. An HCC-associated gene expression dataset was downloaded from the Gene Expression Omnibus database. The statistical software R was used for significance analysis of differentially expressed genes (DEGs) between liver cancer samples and normal samples. Gene Ontology (GO) term enrichment analysis and Kyoto Encyclopedia of Genes and Genomes (KEGG) pathway analysis, based on R software, were applied to identify the pathways in which DEGs were significantly enriched. Cytoscape software was used for the construction of a protein-protein interaction (PPI) network and for module analysis to find the hub genes and key pathways. Finally, weighted correlation network analysis (WGCNA) was conducted to further screen critical gene modules with similar expression patterns and to explore their biological significance. Significance analysis identified 1230 DEGs with fold change >2, including 632 significantly down-regulated DEGs and 598 significantly up-regulated DEGs. GO term enrichment analysis suggested that up-regulated DEGs were significantly enriched in immune response, cell adhesion, cell migration, type I interferon signaling pathway, and cell proliferation, and that the down-regulated DEGs were mainly enriched in response to endoplasmic reticulum stress and endoplasmic reticulum unfolded protein response. KEGG pathway analysis found DEGs significantly enriched in five pathways: complement and coagulation cascades, focal adhesion, ECM-receptor interaction, antigen processing and presentation, and protein processing in endoplasmic reticulum. The top 10 hub genes in HCC, derived from the PPI network, were GMPS, ACACA, ALB, TGFB1, KRAS, ERBB2, BCL2, EGFR, STAT3, and CD8A.
The top 3 gene interaction modules in the PPI network were enriched in immune response, organ development, and response to other organism, respectively. WGCNA revealed that the eight confirmed gene modules were significantly enriched in monooxygenase and oxidoreductase activity, response to endoplasmic reticulum stress, type I interferon signaling pathway, processing, presentation and binding of peptide antigen, cellular response to cadmium and zinc ion, cell locomotion and differentiation, ribonucleoprotein complex and RNA processing, and immune system process, respectively. In conclusion, we identified some key genes and pathways closely related to HCC initiation and progression through a series of bioinformatics analyses on DEGs. These screened genes and pathways provide a more detailed view of the molecular mechanism underlying HCC occurrence and progression, and hold promise as biomarkers and potential therapeutic targets.
Aberrant methylation patterns affect the molecular pathogenesis of rheumatoid arthritis.
Lin, Yang; Luo, Zhengqiang
2017-05-01
This study aims to investigate DNA methylation signatures in fibroblast-like synoviocytes (FLS) from patients with rheumatoid arthritis (RA), and to explore the relationship with transcription factors (TFs) that help to distinguish RA from osteoarthritis (OA). Microarray dataset of GSE46346, including six FLS samples from patients with RA and five FLS samples from patients with OA, was downloaded from the Gene Expression Omnibus database. RA and OA samples were screened for differentially methylated loci (DMLs). The corresponding differentially methylated genes (DMGs) were identified, followed by Kyoto Encyclopedia of Genes and Genomes (KEGG) pathway and Gene Ontology (GO) enrichment analysis. A transcriptional regulatory network was built with TFs and their corresponding DMGs. Overall, 280 hypomethylated loci and 561 hypermethylated loci were screened. Genes containing hypermethylated loci were enriched in pathways in cancer, ECM-receptor interaction, focal adhesion and neurotrophin signaling pathways. Genes containing hypomethylated loci were enriched in the neurotrophin signaling pathway. Moreover, we found that CCCTC-binding factor (CTCF), Yin Yang 1 (YY1), v-myc avian myelocytomatosis viral oncogene homolog (c-MYC), and early growth response 1 (EGR1) were important TFs in the transcriptional regulatory network. Therefore, DMGs might participate in the neurotrophin signaling pathway, pathways in cancer, ECM-receptor interaction and focal adhesion pathways in RA. Furthermore, CTCF, c-MYC, YY1, and EGR1 may play important roles in RA through regulating DMGs. Copyright © 2017 Elsevier B.V. All rights reserved.
Correcting ligands, metabolites, and pathways
Ott, Martin A; Vriend, Gert
2006-01-01
Background A wide range of research areas in bioinformatics, molecular biology and medicinal chemistry require precise chemical structure information about molecules and reactions, e.g. drug design, ligand docking, metabolic network reconstruction, and systems biology. Most available databases, however, treat chemical structures more as illustrations than as a datafield in its own right. Lack of chemical accuracy impedes progress in the areas mentioned above. We present a database of metabolites called BioMeta that augments the existing pathway databases by explicitly assessing the validity, correctness, and completeness of chemical structure and reaction information. Description The main bulk of the data in BioMeta were obtained from the KEGG Ligand database. We developed a tool for chemical structure validation which assesses the chemical validity and stereochemical completeness of a molecule description. The validation tool was used to examine the compounds in BioMeta, showing that a relatively small number of compounds had an incorrect constitution (connectivity only, not considering stereochemistry) and that a considerable number (about one third) had incomplete or even incorrect stereochemistry. We made a large effort to correct the errors and to complete the structural descriptions. A total of 1468 structures were corrected and/or completed. We also established the reaction balance of the reactions in BioMeta and corrected 55% of the unbalanced (stoichiometrically incorrect) reactions in an automatic procedure. The BioMeta database was implemented in PostgreSQL and provided with a web-based interface. Conclusion We demonstrate that the validation of metabolite structures and reactions is a feasible and worthwhile undertaking, and that the validation results can be used to trigger corrections and improvements to BioMeta, our metabolite database. BioMeta provides some tools for rational drug design, reaction searches, and visualization. 
It is freely available at provided that the copyright notice of all original data is cited. The database will be useful for querying and browsing biochemical pathways, and to obtain reference information for identifying compounds. However, these applications require that the underlying data be correct, and that is the focus of BioMeta. PMID:17132165
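The notion of a stoichiometrically balanced reaction used above can be made concrete with a short sketch. It is an illustration only and not the BioMeta validation tool: the function names are hypothetical, the parser handles only flat molecular formulas (no parentheses or charges), and the example reaction uses the neutral molecular formulas of glucose, ATP, glucose 6-phosphate and ADP.

```python
import re
from collections import Counter


def parse_formula(formula):
    """Count atoms in a flat molecular formula such as 'C6H12O6' (no parentheses)."""
    counts = Counter()
    for element, number in re.findall(r"([A-Z][a-z]?)(\d*)", formula):
        counts[element] += int(number) if number else 1
    return counts


def imbalance(substrates, products):
    """Return per-element atom counts (products minus substrates).

    Each side is a list of (coefficient, formula) pairs.
    An empty dict means the reaction is balanced.
    """
    diff = Counter()
    for coeff, formula in products:
        for element, n in parse_formula(formula).items():
            diff[element] += coeff * n
    for coeff, formula in substrates:
        for element, n in parse_formula(formula).items():
            diff[element] -= coeff * n
    return {element: n for element, n in diff.items() if n != 0}


# Hexokinase reaction: glucose + ATP -> glucose 6-phosphate + ADP
substrates = [(1, "C6H12O6"), (1, "C10H16N5O13P3")]
products = [(1, "C6H13O9P"), (1, "C10H15N5O10P2")]
balanced = imbalance(substrates, products)

# The same reaction with ADP omitted is reported as unbalanced.
unbalanced = imbalance(substrates, [(1, "C6H13O9P")])
print(balanced, unbalanced)
```

A check of this kind only detects an imbalance; deciding which compound or coefficient to correct requires additional rules or manual curation.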
Hayashi, Takanori; Matsuzaki, Yuri; Yanagisawa, Keisuke; Ohue, Masahito; Akiyama, Yutaka
2018-05-08
Protein-protein interactions (PPIs) play several roles in living cells, and computational PPI prediction is a major focus of many researchers. The three-dimensional (3D) structure and binding surface are important for the design of PPI inhibitors. Therefore, rigid body protein-protein docking calculations for two protein structures are expected to allow elucidation of PPIs different from known complexes in terms of 3D structures because known PPI information is not explicitly required. We have developed rapid PPI prediction software based on protein-protein docking, called MEGADOCK. In order to fully utilize the benefits of computational PPI predictions, it is necessary to construct a comprehensive database to gather prediction results and their predicted 3D complex structures and to make them easily accessible. Although several databases exist that provide predicted PPIs, the previous databases do not contain a sufficient number of entries for the purpose of discovering novel PPIs. In this study, we constructed an integrated database of MEGADOCK PPI predictions, named MEGADOCK-Web. MEGADOCK-Web provides more than 10 times the number of PPI predictions than previous databases and enables users to conduct PPI predictions that cannot be found in conventional PPI prediction databases. In MEGADOCK-Web, there are 7528 protein chains and 28,331,628 predicted PPIs from all possible combinations of those proteins. Each protein structure is annotated with PDB ID, chain ID, UniProt AC, related KEGG pathway IDs, and known PPI pairs. Additionally, MEGADOCK-Web provides four powerful functions: 1) searching precalculated PPI predictions, 2) providing annotations for each predicted protein pair with an experimentally known PPI, 3) visualizing candidates that may interact with the query protein on biochemical pathways, and 4) visualizing predicted complex structures through a 3D molecular viewer. 
MEGADOCK-Web provides a huge amount of comprehensive PPI predictions based on docking calculations with biochemical pathways and enables users to easily and quickly assess PPI feasibilities by archiving PPI predictions. MEGADOCK-Web also promotes the discovery of new PPIs and protein functions and is freely available for use at http://www.bi.cs.titech.ac.jp/megadock-web/ .
MetaMapR: pathway independent metabolomic network analysis incorporating unknowns.
Grapov, Dmitry; Wanichthanarak, Kwanjeera; Fiehn, Oliver
2015-08-15
Metabolic network mapping is a widely used approach for integration of metabolomic experimental results with biological domain knowledge. However, current approaches can be limited by biochemical domain or pathway knowledge which results in sparse disconnected graphs for real world metabolomic experiments. MetaMapR integrates enzymatic transformations with metabolite structural similarity, mass spectral similarity and empirical associations to generate richly connected metabolic networks. This open source, web-based or desktop software, written in the R programming language, leverages KEGG and PubChem databases to derive associations between metabolites even in cases where biochemical domain or molecular annotations are unknown. Network calculation is enhanced through an interface to the Chemical Translation System, which allows metabolite identifier translation between >200 common biochemical databases. Analysis results are presented as interactive visualizations or can be exported as high-quality graphics and numerical tables which can be imported into common network analysis and visualization tools. Freely available at http://dgrapov.github.io/MetaMapR/. Requires R and a modern web browser. Installation instructions, tutorials and application examples are available at http://dgrapov.github.io/MetaMapR/. Contact: ofiehn@ucdavis.edu. © The Author 2015. Published by Oxford University Press. All rights reserved.
LigandBox: A database for 3D structures of chemical compounds
Kawabata, Takeshi; Sugihara, Yusuke; Fukunishi, Yoshifumi; Nakamura, Haruki
2013-01-01
A database for the 3D structures of available compounds is essential for the virtual screening by molecular docking. We have developed the LigandBox database (http://ligandbox.protein.osaka-u.ac.jp/ligandbox/) containing four million available compounds, collected from the catalogues of 37 commercial suppliers, and approved drugs and biochemical compounds taken from KEGG_DRUG, KEGG_COMPOUND and PDB databases. Each chemical compound in the database has several 3D conformers with hydrogen atoms and atomic charges, which are ready to be docked into receptors using docking programs. The 3D conformations were generated using our molecular simulation program package, myPresto. Various physical properties, such as aqueous solubility (LogS) and carcinogenicity have also been calculated to characterize the ADME-Tox properties of the compounds. The Web database provides two services for compound searches: a property/chemical ID search and a chemical structure search. The chemical structure search is performed by a descriptor search and a maximum common substructure (MCS) search combination, using our program kcombu. By specifying a query chemical structure, users can find similar compounds among the millions of compounds in the database within a few minutes. Our database is expected to assist a wide range of researchers, in the fields of medical science, chemical biology, and biochemistry, who are seeking to discover active chemical compounds by the virtual screening. PMID:27493549
Singh, Satendra; Singh, Dev Bukhsh; Singh, Anamika; Gautam, Budhayash; Ram, Gurudayal; Dwivedi, Seema; Ramteke, Pramod W
2016-12-01
Streptococcus pyogenes is one of the most important pathogens as it is involved in various infections affecting the upper respiratory tract and skin. Due to the emergence of multidrug resistance and cross-resistance, S. pyogenes is becoming more pathogenic and dangerous. In the present study, an in silico comparative analysis of a total of 65 metabolic pathways of the host (Homo sapiens) and the pathogen was performed. Initially, 486 paralogous enzymes were identified so that they could be removed from the list of possible drug targets. The 105 enzymes of the biochemical pathways of S. pyogenes from the KEGG metabolic pathway database were compared with the proteins from Homo sapiens by performing a BLASTP search against the non-redundant database restricted to the Homo sapiens subset. Out of these, 83 enzymes were identified as non-human homologous, while 30 enzymes of inadequate amino acid length were removed from further processing. Essential enzymes were then mined from the remaining 53 enzymes. Finally, 28 essential enzymes were identified in S. pyogenes SF370 (serotype M1). In the subcellular localization study, 18 enzymes were predicted to have cytoplasmic localization and ten enzymes to have membrane localization. These ten enzymes with putative membrane localization should be of particular interest. Acyl-carrier-protein S-malonyltransferase, DNA polymerase III subunit beta and dihydropteroate synthase are novel drug targets and thus can be used to design potential inhibitors against S. pyogenes infection. The 3D structure of dihydropteroate synthase was modeled and validated, and it can be used for virtual screening and interaction studies of potential inhibitors with the target enzyme.
The chemokine receptor CCR1 is identified in mast cell-derived exosomes.
Liang, Yuting; Qiao, Longwei; Peng, Xia; Cui, Zelin; Yin, Yue; Liao, Huanjin; Jiang, Min; Li, Li
2018-01-01
Mast cells are important effector cells of the immune system, and mast cell-derived exosomes carrying RNAs play a role in immune regulation. However, the molecular function of mast cell-derived exosomes is currently unknown, and here, we identify differentially expressed genes (DEGs) in mast cells and exosomes. We isolated mast cell-derived exosomes through differential centrifugation and screened the DEGs from mast cell-derived exosomes, using the GSE25330 array dataset downloaded from the Gene Expression Omnibus database. Biochemical pathways were analyzed by Gene Ontology (GO) annotation and Kyoto Encyclopedia of Genes and Genomes (KEGG) pathway analysis with the online tool DAVID. DEG-associated protein-protein interaction (PPI) networks were constructed using the STRING database and Cytoscape software. The genes identified from these bioinformatics analyses were verified by qRT-PCR and Western blot in mast cells and exosomes. We identified 2121 DEGs (843 up-regulated and 1278 down-regulated genes) in HMC-1 cell-derived exosomes and HMC-1 cells. The up-regulated DEGs were classified into two significant modules. The chemokine receptor CCR1 was screened as a hub gene and enriched in the cytokine-mediated signaling pathway in module one. Seven genes, including CCR1, CD9, KIT, TGFBR1, TLR9, TPSAB1 and TPSB2, were screened and validated through qRT-PCR analysis. We have achieved a comprehensive view of the pivotal genes and pathways in mast cells and exosomes and identified CCR1 as a hub gene in mast cell-derived exosomes. Our results provide novel clues with respect to the biological processes through which mast cell-derived exosomes modulate immune responses.
De novo assembly and transcriptomic profiling of the grazing response in Stipa grandis.
Wan, Dongli; Wan, Yongqing; Hou, Xiangyang; Ren, Weibo; Ding, Yong; Sa, Rula
2015-01-01
Stipa grandis (Poaceae) is one of the dominant species in a typical steppe of the Inner Mongolian Plateau. However, primarily due to heavy grazing, the grasslands have become seriously degraded, and S. grandis has developed a special growth-inhibition phenotype against the stressful habitat. Because of the lack of transcriptomic and genomic information, the understanding of the molecular mechanisms underlying the grazing response of S. grandis has been hindered. Using the Illumina HiSeq 2000 platform, two libraries prepared from non-grazing (FS) and overgrazing samples (OS) were sequenced. De novo assembly produced 94,674 unigenes, of which 65,047 unigenes had BLAST hits in the National Center for Biotechnology Information (NCBI) non-redundant (nr) database (E-value < 10^-5). In total, 47,747, 26,156 and 40,842 unigenes were assigned to the Gene Ontology (GO), Clusters of Orthologous Group (COG), and Kyoto Encyclopedia of Genes and Genomes (KEGG) databases, respectively. A total of 13,221 unigenes showed significant differences in expression under the overgrazing condition, with a threshold false discovery rate ≤ 0.001 and an absolute value of log2Ratio ≥ 1. These differentially expressed genes (DEGs) were assigned to 43,257 GO terms and were significantly enriched in 32 KEGG pathways (q-value ≤ 0.05). The alterations in the wound-, drought- and defense-related genes indicate that stressors have an additive effect on the growth inhibition of this species. This first large-scale transcriptome study will provide important information for further gene expression and functional genomics studies, and it facilitated our investigation of the molecular mechanisms of the S. grandis grazing response and the associated morphological and physiological characteristics.
De novo Assembly and Transcriptomic Profiling of the Grazing Response in Stipa grandis
Hou, Xiangyang; Ren, Weibo; Ding, Yong; Sa, Rula
2015-01-01
Background Stipa grandis (Poaceae) is one of the dominant species in a typical steppe of the Inner Mongolian Plateau. However, primarily due to heavy grazing, the grasslands have become seriously degraded, and S. grandis has developed a special growth-inhibition phenotype against the stressful habitat. Because of the lack of transcriptomic and genomic information, the understanding of the molecular mechanisms underlying the grazing response of S. grandis has been hindered. Results Using the Illumina HiSeq 2000 platform, two libraries prepared from non-grazing (FS) and overgrazing samples (OS) were sequenced. De novo assembly produced 94,674 unigenes, of which 65,047 unigenes had BLAST hits in the National Center for Biotechnology Information (NCBI) non-redundant (nr) database (E-value < 10^-5). In total, 47,747, 26,156 and 40,842 unigenes were assigned to the Gene Ontology (GO), Clusters of Orthologous Group (COG), and Kyoto Encyclopedia of Genes and Genomes (KEGG) databases, respectively. A total of 13,221 unigenes showed significant differences in expression under the overgrazing condition, with a threshold false discovery rate ≤ 0.001 and an absolute value of log2Ratio ≥ 1. These differentially expressed genes (DEGs) were assigned to 43,257 GO terms and were significantly enriched in 32 KEGG pathways (q-value ≤ 0.05). The alterations in the wound-, drought- and defense-related genes indicate that stressors have an additive effect on the growth inhibition of this species. Conclusions This first large-scale transcriptome study will provide important information for further gene expression and functional genomics studies, and it facilitated our investigation of the molecular mechanisms of the S. grandis grazing response and the associated morphological and physiological characteristics. PMID:25875617
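The DEG criterion stated above (false discovery rate ≤ 0.001 and absolute log2Ratio ≥ 1) amounts to a two-condition filter, sketched below. The unigene identifiers, expression values and FDR values are invented for illustration, and the sketch assumes strictly positive expression values in both samples; it does not reproduce the authors' statistical procedure for estimating the FDR.

```python
import math


def is_deg(expr_fs, expr_os, fdr, fdr_max=0.001, min_abs_log2=1.0):
    """Apply the two thresholds: FDR <= fdr_max and |log2(OS / FS)| >= min_abs_log2.

    Assumes both expression values are strictly positive.
    """
    log2_ratio = math.log2(expr_os / expr_fs)
    return fdr <= fdr_max and abs(log2_ratio) >= min_abs_log2


# Hypothetical records: (unigene id, expression in FS, expression in OS, FDR).
unigenes = [
    ("u1", 10.0, 40.0, 0.0001),  # log2Ratio = 2, passes both thresholds
    ("u2", 10.0, 15.0, 0.0001),  # log2Ratio is about 0.58, change too small
    ("u3", 40.0, 10.0, 0.0005),  # log2Ratio = -2, down-regulated, passes
    ("u4", 10.0, 40.0, 0.01),    # large change but FDR too high
]

degs = [uid for uid, fs, os_, fdr in unigenes if is_deg(fs, os_, fdr)]
print(degs)
```

Because the absolute value is used, a log2Ratio of 1 corresponds to at least a two-fold change in either direction.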
Lee, A Yeong; Park, Won; Kang, Tae-Wook; Cha, Min Ho; Chun, Jin Mi
2018-07-15
Yijin-Tang (YJT) is a traditional prescription for the treatment of hyperlipidaemia, atherosclerosis and other ailments related to dampness phlegm, a typical pathological symptom of abnormal body fluid metabolism in Traditional Korean Medicine. However, a holistic network pharmacology approach to understanding the therapeutic mechanisms underlying hyperlipidaemia and atherosclerosis has not been pursued. To examine the network pharmacological potential effects of YJT on hyperlipidaemia and atherosclerosis, we analysed components, performed target prediction and network analysis, and investigated interacting pathways using a network pharmacology approach. Information on compounds in herbal medicines was obtained from public databases, and oral bioavailability and drug-likeness was screened using absorption, distribution, metabolism, and excretion (ADME) criteria. Correlations between compounds and genes were linked using the STITCH database, and genes related to hyperlipidaemia and atherosclerosis were gathered using the GeneCards database. Human genes were identified and subjected to Kyoto Encyclopedia of Genes and Genomes (KEGG) pathway analysis. Network analysis identified 447 compounds in five herbal medicines that were subjected to ADME screening, and 21 compounds and 57 genes formed the main pathways linked to hyperlipidaemia and atherosclerosis. Among them, 10 compounds (naringenin, nobiletin, hesperidin, galangin, glycyrrhizin, homogentisic acid, stigmasterol, 6-gingerol, quercetin and glabridin) were linked to more than four genes, and are bioactive compounds and key chemicals. Core genes in this network were CASP3, CYP1A1, CYP1A2, MMP2 and MMP9. The compound-target gene network revealed close interactions between multiple components and multiple targets, and facilitates a better understanding of the potential therapeutic effects of YJT. 
Pharmacological network analysis can help to explain the potential effects of YJT for treating dampness phlegm-related diseases such as hyperlipidaemia and atherosclerosis. Copyright © 2018 Elsevier B.V. All rights reserved.
A network pharmacology study of Sendeng-4, a Mongolian medicine.
Zi, Tian; Yu, Dong
2015-02-01
We collected data on the chemical composition of Sendeng-4 and the corresponding targets from the literature and from DrugBank, SuperTarget, TTD (Therapeutic Targets Database) and other databases, and the relevant signaling pathways from the KEGG (Kyoto Encyclopedia of Genes and Genomes) database. We then established models of the chemical composition-target network and the chemical composition-target-disease network using Cytoscape software. The analysis indicated that the chemical composition had at least nine different types of targets that acted together to exert effects on the diseases, suggesting a "multi-component, multi-target" feature of the traditional Mongolian medicine. We also employed the rat model of rheumatoid arthritis induced by collagen type II to validate the key targets of the chemical components of Sendeng-4, and three of the key targets were validated through laboratory experiments, further confirming the anti-inflammatory effects of Sendeng-4. In all, this study predicted the active ingredients and targets of Sendeng-4 and explored its mechanism of action, which provides new strategies and methods for further research and development of Sendeng-4 and other traditional Mongolian medicines. Copyright © 2015 China Pharmaceutical University. Published by Elsevier B.V. All rights reserved.
Zhao, He; Duan, Li-Jun; Sun, Qing-Ling; Gao, Yu-Shan; Yang, Yong-Dong; Tang, Xiang-Sheng; Zhao, Ding-Yan; Xiong, Yang; Hu, Zhen-Guo; Li, Chuan-Hong; Chen, Si-Xue; Liu, Tao; Yu, Xing
2018-04-19
Peripheral nerve injury (PNI) has devastating consequences. The dorsal root ganglion, as a pivotal locus, participates in the process of neuropathic pain and nerve regeneration. In recent years, gene sequencing technology has seen a rapid rise in the biomedical field. We therefore attempted to gain insight into the mechanism of neuropathic pain and nerve regeneration at the transcriptional level and to explore novel genes through bioinformatics analysis. The gene expression profiles of GSE96051 were downloaded from the GEO database. Gene Ontology (GO) and Kyoto Encyclopedia of Genes and Genomes (KEGG) pathway enrichment analyses were performed, and a protein-protein interaction (PPI) network of the differentially expressed genes (DEGs) was constructed with Cytoscape software. Our results showed that both the IL-6 and Jun genes and the MAPK, apoptosis, and P53 signaling pathways play vital modulatory roles in nerve regeneration and neuropathic pain. Notably, 13 hub genes associated with neuropathic pain and nerve regeneration, including Ccl12, Ppp1r15a, Cdkn1a, Atf3, Nts, Dusp1, Ccl7, Csf, Gadd45a, Serpine1, Timp1, have rarely been reported in the PubMed database; these genes may provide a new direction for experimental research and clinical study. Our results may provide deeper insight into the mechanism and a promising therapeutic target. The next step is to move to the experimental level and to verify the novel genes among the 13 hub genes.
Guo, Nan; Zhang, Nan; Yan, Liqiu; Lian, Zheng; Wang, Jiawang; Lv, Fengfeng; Wang, Yunfei; Cao, Xufen
2018-06-14
Acute myocardial infarction induces ventricular remodeling, which is implicated in dilated heart and heart failure. The pathogenic mechanism of myocardium remodeling remains to be elucidated. The aim of the present study was to identify key genes and networks for myocardium remodeling following ischemia-reperfusion (IR). First, the mRNA expression data from the National Center for Biotechnology Information database were downloaded to identify differences in mRNA expression of the IR heart at days 2 and 7. Then, weighted gene co-expression network analysis, hierarchical clustering, protein-protein interaction (PPI) network analysis, Gene Ontology (GO) analysis, and Kyoto Encyclopedia of Genes and Genomes (KEGG) pathway analysis were used to identify key genes and networks for the heart remodeling process following IR. A total of 3,321 differentially expressed genes were identified during the heart remodeling process. A total of 6 modules were identified through gene co-expression network analysis. GO and KEGG analysis results suggested that each module represented a different biological function and was associated with different pathways. Finally, hub genes of each module were identified by PPI network construction. The present study revealed that heart remodeling following IR is a complicated process, involving extracellular matrix organization, neural development, apoptosis and energy metabolism. The dysregulated genes, including SRC proto-oncogene, non-receptor tyrosine kinase; discs large MAGUK scaffold protein 1; ATP citrate lyase; RAN, member RAS oncogene family; tumor protein p53; and polo like kinase 2, may be essential for heart remodeling following IR and may be used as potential targets for the inhibition of heart remodeling following acute myocardial infarction.
Chen, Hui; Zhao, Mingwen; Shi, Liang; Chen, Mingjie; Wang, Hong; Feng, Zhiyong
2015-01-01
To elucidate the mechanisms of fruit body development in H. marmoreus, a total of 43609521 high-quality RNA-seq reads were obtained from four developmental stages, including the mycelial knot (H-M), mycelial pigmentation (H-V), primordium (H-P) and fruiting body (H-F) stages. These reads were assembled to obtain 40568 unigenes with an average length of 1074 bp. A total of 26800 (66.06%) unigenes were annotated and analyzed with the Kyoto Encyclopedia of Genes and Genomes (KEGG), Gene Ontology (GO), and Eukaryotic Orthologous Group (KOG) databases. Differentially expressed genes (DEGs) from the four transcriptomes were analyzed. The KEGG enrichment analysis revealed that the mycelium pigmentation stage was associated with the MAPK, cAMP, and blue light signal transduction pathways. In addition, expression of the two-component system members changed with the transition from H-M to H-V, suggesting that light affected the expression of genes related to fruit body initiation in H. marmoreus. During the transition from H-V to H-P, stress signals associated with MAPK, cAMP and ROS signals might be the most important inducers. Our data suggested that nitrogen starvation might be one of the most important factors in promoting fruit body maturation, and nitrogen metabolism and mTOR signaling pathway were associated with this process. In addition, 30 genes of interest were analyzed by quantitative real-time PCR to verify their expression profiles at the four developmental stages. This study advances our understanding of the molecular mechanism of fruiting body development in H. marmoreus by identifying a wealth of new genes that may play important roles in mushroom morphogenesis. PMID:25837428
Xue, Shuxia; Liu, Yichen; Zhang, Yichen; Sun, Yan; Geng, Xuyun; Sun, Jinsheng
2013-01-01
White spot syndrome virus (WSSV) is a causative pathogen found in most shrimp farming areas of the world and causes large economic losses to shrimp aquaculture. The mechanism underlying the molecular pathogenesis of the highly virulent WSSV remains unknown. To better understand the virus-host interactions at the molecular level, the transcriptome profiles in hemocytes of unchallenged and WSSV-challenged shrimp (Litopenaeus vannamei) were compared using a short-read deep sequencing method (Illumina). RNA-seq analysis generated more than 25.81 million clean paired-end (PE) reads, which were assembled into 52,073 unigenes (mean size = 520 bp). Based on sequence similarity searches, 23,568 (45.3%) genes were identified, among which 6,562 and 7,822 unigenes were assigned to gene ontology (GO) categories and clusters of orthologous groups (COG), respectively. Searches in the Kyoto Encyclopedia of Genes and Genomes Pathway database (KEGG) mapped 14,941 (63.4%) unigenes to 240 KEGG pathways. Among all the annotated unigenes, 1,179 were associated with immune-related genes. Digital gene expression (DGE) analysis revealed that the host transcriptome profile was slightly changed in the early infection (5 hours post injection, hpi) of the virus, while large transcriptional differences were identified in the late infection (48 hpi) of WSSV. The differentially expressed genes were mainly involved in pattern recognition and some immune response factors. The results indicated that antiviral immune mechanisms were probably involved in the recognition of pathogen-associated molecular patterns. This study provided a global survey of host gene activities against virus infection in a non-model organism, the Pacific white shrimp. The results can contribute to the in-depth study of candidate genes in white shrimp and help to improve the current understanding of host-pathogen interactions.
Huang, Wen; Ren, Chunhua; Li, Hongmei; Huo, Da; Wang, Yanhong; Jiang, Xiao; Tian, Yushun; Luo, Peng; Chen, Ting; Hu, Chaoqun
2017-01-01
The Pacific white shrimp (Litopenaeus vannamei) is an important cultured crustacean species worldwide. However, little is known about the molecular mechanisms involved in this species' response to cold stress. In this study, four separate RNA-Seq libraries of L. vannamei were generated from 13°C stress and control temperature. A total of 29,662 unigenes and 19,619 annotated genes were obtained. Three comparisons were carried out among the four libraries, in which 72 of the top 20% of differentially expressed genes were obtained, and 15 GO and 5 KEGG temperature-sensitive pathways were identified. Catalytic activity (GO: 0003824) and Metabolic pathways (ko01100) were the most annotated GO and KEGG pathways in response to cold stress, respectively. In addition, the calcium, MAPK cascade, transcription factor, and serine/threonine-protein kinase signaling pathways were selected and clustered. The serine/threonine-protein kinase signaling pathway might play a more important role in cold adaptation, while the other three signaling pathways were not widely transcribed. Our results summarize the differentially expressed genes and suggest the major signaling pathways and related genes. These findings provide the first profile of the molecular basis of the L. vannamei response to cold stress. PMID:28575089
Anand Brown, Andrew; Ding, Zhihao; Viñuela, Ana; Glass, Dan; Parts, Leopold; Spector, Tim; Winn, John; Durbin, Richard
2015-03-09
Statistical factor analysis methods have previously been used to remove noise components from high-dimensional data prior to genetic association mapping and, in a guided fashion, to summarize biologically relevant sources of variation. Here, we show how the derived factors summarizing pathway expression can be used to analyze the relationships between expression, heritability, and aging. We used skin gene expression data from 647 twins from the MuTHER Consortium and applied factor analysis to concisely summarize patterns of gene expression to remove broad confounding influences and to produce concise pathway-level phenotypes. We derived 930 “pathway phenotypes” that summarized patterns of variation across 186 KEGG pathways (five phenotypes per pathway). We identified 69 significant associations of age with phenotype from 57 distinct KEGG pathways at a stringent Bonferroni threshold (P < 5.38 × 10⁻⁵). These phenotypes are more heritable (h² = 0.32) than gene expression levels. On average, expression levels of 16% of genes within these pathways are associated with age. Several significant pathways relate to metabolizing sugars and fatty acids; others relate to insulin signaling. We have demonstrated that factor analysis methods combined with biological knowledge can produce more reliable phenotypes with less stochastic noise than the individual gene expression levels, which increases our power to discover biologically relevant associations. These phenotypes could also be applied to discover associations with other environmental factors. Copyright © 2015 Brown et al. PMID:25758824
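The test count and the Bonferroni threshold in the abstract above can be checked with simple arithmetic. The following is a minimal sketch; the family-wise error rate of 0.05 is an assumption (the abstract does not state it), chosen because it gives a threshold consistent with the reported value of about 5.38 × 10⁻⁵.

```python
# Counts taken from the abstract: 186 KEGG pathways, five phenotypes each.
n_pathways = 186
phenotypes_per_pathway = 5
n_tests = n_pathways * phenotypes_per_pathway

# Assumed family-wise error rate; not stated in the abstract.
alpha = 0.05

# Bonferroni correction divides alpha by the number of tests performed.
bonferroni_threshold = alpha / n_tests

print(n_tests)
print(f"{bonferroni_threshold:.2e}")
```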
Tracing the Repertoire of Promiscuous Enzymes along the Metabolic Pathways in Archaeal Organisms.
Martínez-Núñez, Mario Alberto; Rodríguez-Escamilla, Zuemy; Rodríguez-Vázquez, Katya; Pérez-Rueda, Ernesto
2017-07-13
The metabolic pathways that carry out the biochemical transformations sustaining life depend on the efficiency of their associated enzymes. In recent years, it has become clear that promiscuous enzymes have played an important role in the function and evolution of metabolism. In this work we analyze the repertoire of promiscuous enzymes in 89 non-redundant genomes of the Archaea cellular domain. Promiscuous enzymes are defined as those proteins with two or more different Enzyme Commission (E.C.) numbers, according to the Kyoto Encyclopedia of Genes and Genomes (KEGG) database. From this analysis, it was found that the fraction of promiscuous enzymes is lower in Archaea than in Bacteria. A greater diversity of superfamily domains is associated with promiscuous enzymes compared to specialized enzymes, both in Archaea and Bacteria, and there is an enrichment of substrate promiscuity rather than catalytic promiscuity in the archaeal enzymes. Finally, the presence of promiscuous enzymes in the metabolic pathways was found to be heterogeneously distributed at the domain level and in the phyla that make up the Archaea. These analyses increase our understanding of promiscuous enzymes and provide additional clues to the evolution of metabolism in Archaea.
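The definition of promiscuity used in the abstract above (two or more different E.C. numbers per protein) can be sketched in a few lines. The protein identifiers and annotations below are hypothetical placeholders for illustration, not data from the study, and the function name is likewise an assumption.

```python
def is_promiscuous(ec_numbers):
    """Return True if a protein carries two or more distinct E.C. numbers."""
    return len(set(ec_numbers)) >= 2


# Hypothetical protein-to-E.C. annotations, for illustration only.
annotations = {
    "protA": ["1.1.1.1"],
    "protB": ["2.7.1.1", "2.7.1.2"],
    "protC": ["3.5.1.4", "3.5.1.4"],  # a repeated number counts only once
    "protD": [],                      # no E.C. number: not counted as an enzyme
}

# Only proteins with at least one E.C. number are treated as enzymes.
enzymes = {name: ecs for name, ecs in annotations.items() if ecs}
promiscuous = sorted(name for name, ecs in enzymes.items() if is_promiscuous(ecs))
fraction = len(promiscuous) / len(enzymes)

print(promiscuous, round(fraction, 3))
```

Computing the fraction over enzymes rather than over all proteins is a design choice made here; the abstract does not say which denominator the authors used.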
Yang, Mei; Cong, Min; Peng, Xiuming; Wu, Junrui; Wu, Rina; Liu, Biao; Ye, Wenhui; Yue, Xiqing
2016-05-18
Milk fat globule membrane (MFGM) proteins have many functions. To explore the different proteomics of human and bovine MFGM, MFGM proteins were separated from human and bovine colostrum and mature milk, and analyzed by the iTRAQ proteomic approach. A total of 411 proteins were recognized and quantified. Among these, 232 kinds of differentially expressed proteins were identified. These differentially expressed proteins were analyzed based on multivariate analysis, gene ontology (GO) annotation and KEGG pathway. Biological processes involved were response to stimulus, localization, establishment of localization, and the immune system process. Cellular components engaged were the extracellular space, extracellular region parts, cell fractions, and vesicles. Molecular functions touched upon were protein binding, nucleotide binding, and enzyme inhibitor activity. The KEGG pathway analysis showed several pathways, including regulation of the actin cytoskeleton, focal adhesion, neurotrophin signaling pathway, leukocyte transendothelial migration, tight junction, complement and coagulation cascades, vascular endothelial growth factor signaling pathway, and adherens junction. These results enhance our understanding of different proteomes of human and bovine MFGM across different lactation phases, which could provide important information and potential directions for the infant milk powder and functional food industries.
NASA Astrophysics Data System (ADS)
Liu, Wuxing; Wang, Qingling; Hou, Jinyu; Tu, Chen; Luo, Yongming; Christie, Peter
2016-05-01
This research undertook the systematic analysis of the Klebsiella sp. D5A genome and identification of genes that contribute to plant growth-promoting (PGP) traits, especially genes related to salt tolerance and wide pH adaptability. The genome sequence of isolate D5A was obtained using an Illumina HiSeq 2000 sequencing system with average coverages of 174.7× and 200.1× using the paired-end and mate-pair sequencing, respectively. Predicted and annotated gene sequences were analyzed for similarity with the Kyoto Encyclopedia of Genes and Genomes (KEGG) enzyme database followed by assignment of each gene into the KEGG pathway charts. The results show that the Klebsiella sp. D5A genome has a total of 5,540,009 bp with 57.15% G + C content. PGP conferring genes such as indole-3-acetic acid (IAA) biosynthesis, phosphate solubilization, siderophore production, acetoin and 2,3-butanediol synthesis, and N2 fixation were determined. Moreover, genes putatively responsible for resistance to high salinity including glycine-betaine synthesis, trehalose synthesis and a number of osmoregulation receptors and transport systems were also observed in the D5A genome together with numerous genes that contribute to pH homeostasis. These genes reveal the genetic adaptation of D5A to versatile environmental conditions and the effectiveness of the isolate to serve as a plant growth stimulator.
Gene expression profile after activation of RIG-I in 5'ppp-dsRNA challenged DF1.
Chen, Yang; Xu, Qi; Li, Yang; Liu, Ran; Huang, Zhengyang; Wang, Bin; Chen, Guohong
2016-12-01
Retinoic acid inducible gene I (RIG-I) can recognize influenza viruses and evoke the innate immune response. RIG-I is absent in the chicken genome, but is conserved in the genome of ducks. Lack of RIG-I renders chickens more susceptible to avian influenza infection, and the clinical symptoms are more prominent than in other poultry. It is unknown whether introduction of duck RIG-I into chicken cells can establish the immunity as is seen in ducks and the role of RIG-I in established immunity is unknown. In this study, a chicken cell strain with stable expression of duRIG-I was established by lentiviral infection, giving DF1/LV5-RIG-I, and a control strain DF1/LV5 was established in parallel. To verify stable, high level expression of duRIG-I in DF1 cells, the levels of duRIG-I mRNA and protein were determined by real-time RT-PCR and Western blot, respectively. Further, 5'triphosphate double stranded RNA (5'ppp-dsRNA) was used to mimic an RNA virus infection and the infected DF1/LV5-RIG-I and DF1/LV5 cells were subjected to high-throughput RNA-sequencing, which yielded 193.46 M reads and 39.07 G bases. A total of 278 differentially expressed genes (DEGs), i.e., duRIG-I-mediated responsive genes, were identified by RNA-seq. Among the 278 genes, 120 DEGs are annotated in the KEGG database, and the most reliable KEGG pathways are likely to be the signaling pathways of RIG-I like receptors. Functional analysis by Gene ontology (GO) indicates that the functions of these DEGs are primarily related to Type I interferon (IFN) signaling, IFN-β-mediated cellular responses and up-regulation of the RIG-I signaling pathway. Based on the shared genes among different pathways, a network representing crosstalk between RIG-I and other signaling pathways was constructed using Cytoscape software. 
The network suggests that the RIG-I-mediated pathway may crosstalk with the Jak-STAT signaling pathway, Toll-like receptor signaling pathway, Wnt signaling pathway, ubiquitin-mediated proteolysis and MAPK signaling pathway during the transduction of antiviral signals. After screening, a group of key responsive genes in RIG-I-mediated signaling pathways, such as ISG12-2, Mx1, IFIT5, TRIM25, USP18, STAT1, STAT2, IRF1, IRF7 and IRF8, were tested for differential expression by real-time RT-PCR. In summary, by combining our results and the current literature, we propose a RIG-I-mediated signaling network in chickens. Copyright © 2016 Elsevier Ltd. All rights reserved.
Ge, Cheng-Hao; Sun, Na; Kang, Qi; Ren, Long-Fei; Ahmad, Hafiz Adeel; Ni, Shou-Qing; Wang, Zhibin
2018-03-01
A distinct shift of bacterial community driven by organic matter (OM) and powder activated carbon (PAC) was discovered in the simultaneous anammox and denitrification (SAD) process which was operated in an anti-fouling submerged anaerobic membrane bio-reactor. Based on anammox performance, optimal OM dose (50 mg/L) was advised to start up SAD process successfully. The results of qPCR and high throughput sequencing analysis indicated that OM played a key role in microbial community evolutions, impelling denitrifiers to challenge anammox's dominance. The addition of PAC not only mitigated the membrane fouling, but also stimulated the enrichment of denitrifiers, accounting for the predominant phylum changing from Planctomycetes to Proteobacteria in SAD process. Functional genes forecasts based on KEGG database and COG database showed that the expressions of full denitrification functional genes were highly promoted in R C , which demonstrated the enhanced full denitrification pathway driven by OM and PAC under low COD/N value (0.11). Copyright © 2017 Elsevier Ltd. All rights reserved.
A systematic analysis of a mi-RNA inter-pathway regulatory motif
2013-01-01
Background: The continuing discovery of new types and functions of small non-coding RNAs suggests the presence of regulatory mechanisms far more complex than the ones currently used to study and design Gene Regulatory Networks. MicroRNAs (miRNAs) alone have been found to be part of several intra-pathway regulatory motifs. However, inter-pathway regulatory mechanisms have often been neglected and require further investigation. Results: In this paper we present the results of a systems biology study aimed at analyzing a high-level inter-pathway regulatory motif called the Pathway Protection Loop, not previously described, in which miRNAs seem to play a crucial role in the successful behavior and activation of a pathway. Through the automatic analysis of a large set of publicly available databases, we found statistical evidence that this inter-pathway regulatory motif is very common in several classes of KEGG Homo sapiens pathways and contributes to a complex regulatory network involving several pathways connected by this specific motif. The role of this motif also seems to be confirmed by a deeper review of other research activities on selected representative pathways. Conclusions: Although previous studies suggested transcriptional regulation mechanisms at the pathway level such as the Pathway Protection Loop, a high-level analysis like the one proposed in this paper is still missing. Understanding higher-level regulatory motifs could, for instance, lead to new approaches to identifying therapeutic targets because it could unveil new and “indirect” paths to activate or silence a target pathway. However, much work remains to better uncover this high-level inter-pathway regulation, including extending the analysis to other small non-coding RNA molecules. PMID:24152805
Morphinome Database - The database of proteins altered by morphine administration - An update.
Bodzon-Kulakowska, Anna; Padrtova, Tereza; Drabik, Anna; Ner-Kluza, Joanna; Antolak, Anna; Kulakowski, Konrad; Suder, Piotr
2018-04-13
Morphine is considered a gold standard in pain treatment. Nevertheless, its use could be associated with severe side effects, including drug addiction. Thus, it is very important to understand the molecular mechanism of morphine action in order to develop new methods of pain therapy, or at least to attenuate the side effects of opioids usage. Proteomics allows for the indication of proteins involved in certain biological processes, but the number of items identified in a single study is usually overwhelming. Thus, researchers face the difficult problem of choosing the proteins which are really important for the investigated processes and worth further studies. Therefore, based on the 29 published articles, we created a database of proteins regulated by morphine administration - The Morphinome Database (addiction-proteomics.org). This web tool allows for indicating proteins that were identified during different proteomics studies. Moreover, the collection and organization of such a vast amount of data allows us to find the same proteins that were identified in various studies and to create their ranking, based on the frequency of their identification. STRING and KEGG databases indicated metabolic pathways which those molecules are involved in. This means that those molecular pathways seem to be strongly affected by morphine administration and could be important targets for further investigations. The data about proteins identified by different proteomics studies of molecular changes caused by morphine administration (29 published articles) were gathered in the Morphinome Database. Unification of those data allowed for the identification of proteins that were indicated several times by distinct proteomics studies, which means that they seem to be very well verified and important for the entire process. Those proteins might be now considered promising aims for more detailed studies of their role in the molecular mechanism of morphine action. Copyright © 2018. 
Published by Elsevier B.V.
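The ranking idea described in the abstract above (ordering proteins by how often they were identified across independent studies) can be sketched as follows. This is an illustrative assumption about how such a ranking might be computed, not the Morphinome Database's actual code; the study and protein names are hypothetical placeholders.

```python
from collections import Counter


def rank_by_frequency(studies):
    """Rank proteins by the number of studies in which they were identified."""
    counts = Counter()
    for proteins in studies.values():
        counts.update(set(proteins))  # count each protein once per study
    # Sort by descending count, then alphabetically to break ties.
    return sorted(counts.items(), key=lambda item: (-item[1], item[0]))


# Hypothetical identifications from three studies, for illustration only.
studies = {
    "study_1": ["protein_A", "protein_B", "protein_C"],
    "study_2": ["protein_B", "protein_A"],
    "study_3": ["protein_B", "protein_B", "protein_D"],
}

ranking = rank_by_frequency(studies)
print(ranking)
```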
Identification of hub subnetwork based on topological features of genes in breast cancer
ZHUANG, DA-YONG; JIANG, LI; HE, QING-QING; ZHOU, PENG; YUE, TAO
2015-01-01
The aim of this study was to provide functional insight into the identification of hub subnetworks by aggregating the behavior of genes connected in a protein-protein interaction (PPI) network. We applied a protein network-based approach to identify subnetworks which may provide new insight into the functions of pathways involved in breast cancer rather than individual genes. Five groups of breast cancer data were downloaded from the Gene Expression Omnibus (GEO) database of high-throughput gene expression data and analyzed to identify gene signatures using the genome-wide global significance (GWGS) method. A PPI network was constructed using Cytoscape, and clusters that focused on highly connected nodes were obtained using the molecular complex detection (MCODE) clustering algorithm. Pathway analysis was performed to assess the functional relevance of selected gene signatures based on the Kyoto Encyclopedia of Genes and Genomes (KEGG) database. Topological centrality was used to characterize the biological importance of gene signatures, pathways and clusters. The results revealed that cluster 1 and the cell cycle and oocyte meiosis pathways were significant subnetworks in the analysis of degree and other centralities, and most hub nodes were distributed within them. The most important hub nodes, those with top-ranked centrality, were also similar to the genes common to the intersection of these three subnetworks, which was viewed as a hub subnetwork that is more reproducible than individual critical genes selected without network information. This hub subnetwork was attributed to a single biological process that is essential to cell growth and death. This increased the accuracy of identifying gene interactions that took place within the same functional process and was potentially useful for the development of biomarkers and networks for breast cancer. PMID:25573623
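Degree centrality, one of the topological measures mentioned in the abstract above, can be illustrated with a small dependency-free sketch. The edge list is a hypothetical toy network, not data from the study, and the normalization by (n − 1) is the common convention rather than something the abstract specifies.

```python
from collections import defaultdict


def degree_centrality(edges):
    """Return each node's degree divided by (n - 1) for an undirected network."""
    neighbors = defaultdict(set)
    for a, b in edges:
        if a != b:  # ignore self-interactions
            neighbors[a].add(b)
            neighbors[b].add(a)
    n = len(neighbors)
    if n < 2:
        return {}
    return {node: len(nbrs) / (n - 1) for node, nbrs in neighbors.items()}


# Hypothetical PPI edges, for illustration only.
edges = [("G1", "G2"), ("G1", "G3"), ("G1", "G4"), ("G2", "G3"), ("G4", "G5")]

centrality = degree_centrality(edges)
hub = max(centrality, key=centrality.get)  # node with the highest centrality
print(hub, centrality[hub])
```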
The chemokine receptor CCR1 is identified in mast cell-derived exosomes
Liang, Yuting; Qiao, Longwei; Peng, Xia; Cui, Zelin; Yin, Yue; Liao, Huanjin; Jiang, Min; Li, Li
2018-01-01
Mast cells are important effector cells of the immune system, and mast cell-derived exosomes carrying RNAs play a role in immune regulation. However, the molecular function of mast cell-derived exosomes is currently unknown, and here, we identify differentially expressed genes (DEGs) in mast cells and exosomes. We isolated mast cells derived exosomes through differential centrifugation and screened the DEGs from mast cell-derived exosomes, using the GSE25330 array dataset downloaded from the Gene Expression Omnibus database. Biochemical pathways were analyzed by Gene ontology (GO) annotation and Kyoto Encyclopedia of Genes and Genomes (KEGG) pathway on the online tool DAVID. DEGs-associated protein-protein interaction networks (PPIs) were constructed using the STRING database and Cytoscape software. The genes identified from these bioinformatics analyses were verified by qRT-PCR and Western blot in mast cells and exosomes. We identified 2121 DEGs (843 up and 1278 down-regulated genes) in HMC-1 cell-derived exosomes and HMC-1 cells. The up-regulated DEGs were classified into two significant modules. The chemokine receptor CCR1 was screened as a hub gene and enriched in cytokine-mediated signaling pathway in module one. Seven genes, including CCR1, CD9, KIT, TGFBR1, TLR9, TPSAB1 and TPSB2 were screened and validated through qRT-PCR analysis. We have achieved a comprehensive view of the pivotal genes and pathways in mast cells and exosomes and identified CCR1 as a hub gene in mast cell-derived exosomes. Our results provide novel clues with respect to the biological processes through which mast cell-derived exosomes modulate immune responses. PMID:29511430
DOE Office of Scientific and Technical Information (OSTI.GOV)
Ovacik, Meric A.; Sen, Banalata; Euling, Susan Y.
Pathway activity level analysis, the approach pursued in this study, focuses on all genes that are known to be members of metabolic and signaling pathways as defined by the KEGG database. The pathway activity level analysis entails singular value decomposition (SVD) of the expression data of the genes constituting a given pathway. We explore an extension of the pathway activity methodology for application to time-course microarray data. We show that pathway analysis enhances our ability to detect biologically relevant changes in pathway activity using synthetic data. As a case study, we apply the pathway activity level formulation coupled with significance analysis to microarray data from two different rat testes exposed in utero to Dibutyl Phthalate (DBP). In utero DBP exposure in the rat results in developmental toxicity of a number of male reproductive organs, including the testes. One well-characterized mode of action for DBP and the male reproductive developmental effects is the repression of expression of genes involved in cholesterol transport, steroid biosynthesis and testosterone synthesis that lead to a decreased fetal testicular testosterone. Previous analyses of DBP testes microarray data focused on either individual gene expression changes or changes in the expression of specific genes that are hypothesized, or known, to be important in testicular development and testosterone synthesis. However, a pathway analysis may inform whether there are additional affected pathways that could inform additional modes of action linked to DBP developmental toxicity. We show that pathway activity analysis may be considered for a more comprehensive analysis of microarray data.
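The SVD step described in the abstract above can be illustrated with a dependency-free sketch. It assumes that the pathway activity is taken as the leading right singular vector of the row-centred genes × samples matrix, and it approximates that vector by power iteration on the Gram matrix instead of running a full SVD. Both choices are assumptions made for this example and are not details confirmed by the abstract; the expression values are hypothetical.

```python
import math


def pathway_activity(expr, n_iter=200):
    """Approximate the leading right singular vector of a genes x samples matrix.

    Returns the vector (one value per sample) and the fraction of total
    variance it explains.
    """
    # Centre each gene (row) so the vector captures variation across samples.
    centred = [[x - sum(row) / len(row) for x in row] for row in expr]
    n = len(centred[0])
    # Gram matrix X^T X (samples x samples); its top eigenvector equals the
    # leading right singular vector of X.
    gram = [[sum(r[i] * r[j] for r in centred) for j in range(n)] for i in range(n)]
    trace = sum(gram[i][i] for i in range(n))
    if trace == 0:
        return [0.0] * n, 0.0
    # Start from the largest-norm row: it lies in the row space of X, whereas a
    # constant start vector is orthogonal to every centred row.
    v = max(centred, key=lambda r: sum(x * x for x in r))
    for _ in range(n_iter):
        w = [sum(gram[i][j] * v[j] for j in range(n)) for i in range(n)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    eigenvalue = sum(v[i] * sum(gram[i][j] * v[j] for j in range(n)) for i in range(n))
    return v, eigenvalue / trace


# Hypothetical expression of three pathway genes at four time points.
expr = [
    [1.0, 2.0, 3.0, 4.0],
    [2.0, 4.0, 6.0, 8.0],
    [9.0, 8.0, 7.0, 6.0],
]

activity, explained = pathway_activity(expr)
print([round(a, 4) for a in activity], round(explained, 4))
```

The sign of a singular vector is arbitrary, so only the shape of the activity profile across samples is meaningful, not its direction.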
Senachak, Jittisak; Cheevadhanarak, Supapon; Hongsthong, Apiradee
2015-07-29
Spirulina (Arthrospira) platensis is the only cyanobacterium that in addition to being studied at the molecular level and subjected to gene manipulation, can also be mass cultivated in outdoor ponds for commercial use as a food supplement. Thus, encountering environmental changes, including temperature stresses, is common during the mass production of Spirulina. The use of cyanobacteria as an experimental platform, especially for photosynthetic gene manipulation in plants and bacteria, is becoming increasingly important. Understanding the mechanisms and protein-protein interaction networks that underlie low- and high-temperature responses is relevant to Spirulina mass production. To accomplish this goal, high-throughput techniques such as OMICs analyses are used. Thus, large datasets must be collected, managed and subjected to information extraction. Therefore, databases including (i) proteomic analysis and protein-protein interaction (PPI) data and (ii) domain/motif visualization tools are required for potential use in temperature response models for plant chloroplasts and photosynthetic bacteria. A web-based repository was developed including an embedded database, SpirPro, and tools for network visualization. Proteome data were analyzed integrated with protein-protein interactions and/or metabolic pathways from KEGG. The repository provides various information, ranging from raw data (2D-gel images) to associated results, such as data from interaction and/or pathway analyses. This integration allows in silico analyses of protein-protein interactions affected at the metabolic level and, particularly, analyses of interactions between and within the affected metabolic pathways under temperature stresses for comparative proteomic analysis. The developed tool, which is coded in HTML with CSS/JavaScript and depicted in Scalable Vector Graphics (SVG), is designed for interactive analysis and exploration of the constructed network. 
SpirPro is publicly available on the web at http://spirpro.sbi.kmutt.ac.th . SpirPro is an analysis platform containing an integrated proteome and PPI database that provides the most comprehensive data on this cyanobacterium at the systematic level. As an integrated database, SpirPro can be applied in various analyses, such as temperature stress response networking analysis in cyanobacterial models and interacting domain-domain analysis between proteins of interest.
Dehne, T.; Lindahl, A.; Brittberg, M.; Pruss, A.; Ringe, J.; Sittinger, M.; Karlsson, C.
2012-01-01
Objective: It is well known that expression of markers for WNT signaling is dysregulated in osteoarthritic (OA) bone. However, it is still not fully known if the expression of these markers also is affected in OA cartilage. The aim of this study was therefore to examine this issue. Methods: Human cartilage biopsies from OA and control donors were subjected to genome-wide oligonucleotide microarrays. Genes involved in WNT signaling were selected using the BioRetis database, KEGG pathway analysis was searched using DAVID software tools, and cluster analysis was performed using Genesis software. Results from the microarray analysis were verified using quantitative real-time PCR and immunohistochemistry. In order to study the impact of cytokines for the dysregulated WNT signaling, OA and control chondrocytes were stimulated with interleukin-1 and analyzed with real-time PCR for their expression of WNT-related genes. Results: Several WNT markers displayed a significantly altered expression in OA compared to normal cartilage. Interestingly, inhibitors of the canonical and planar cell polarity WNT signaling pathways displayed significantly increased expression in OA cartilage, while the Ca2+/WNT signaling pathway was activated. Both real-time PCR and immunohistochemistry verified the microarray results. Real-time PCR analysis demonstrated that interleukin-1 upregulated expression of important WNT markers. Conclusions: WNT signaling is significantly affected in OA cartilage. The result suggests that both the canonical and planar cell polarity WNT signaling pathways were partly inhibited while the Ca2+/WNT pathway was activated in OA cartilage. PMID:26069618
Wu, Jie; Li, Lian; Sun, Yu; Huang, Shuai; Tang, Juan; Yu, Pan; Wang, Genlin
2015-01-01
Toll-like receptor 4 (TLR4) mediated activation of the nuclear transcription factor κB (NF-κB) signaling pathway by mastitis initiates expression of genes associated with inflammation and the innate immune response. In this study, the profile of mastitis-induced differential gene expression in the mammary tissue of Chinese Holstein cattle was investigated by Gene-Chip microarray and bioinformatics. The microarray results revealed that 79 genes associated with the TLR4/NF-κB signaling pathway were differentially expressed. Of these genes, 19 were up-regulated and 29 were down-regulated in mastitis tissue compared to normal, healthy tissue. Statistical analysis of transcript and protein level expression changes indicated significant changes in 10 genes: TLR4, MyD88, IL-6, and IL-10 were up-regulated, while CD14, TNF-α, MD-2, IL-β, NF-κB, and IL-12 were down-regulated in mastitis tissue in comparison with normal tissue. Analyses using bioinformatics database resources, such as the Kyoto Encyclopedia of Genes and Genomes (KEGG) for pathway analysis and the Gene Ontology Consortium (GO) for term enrichment analysis, suggested that these differentially expressed genes implicate different regulatory pathways for immune function in the mammary gland. In conclusion, our study provides new evidence for better understanding the differential expression and mechanisms of the TLR4/NF-κB signaling pathway in Chinese Holstein cattle with mastitis. PMID:25706977
Cui, Kai; Wang, Haiying; Liao, Shengxi; Tang, Qi; Li, Li; Cui, Yongzhong; He, Yuan
2016-01-01
Dendrocalamus sinicus is the world’s largest bamboo species, with strong, fast-growing woody culms. As an economically important bamboo species, it has been popularized for multi-functional applications including furniture, construction, and industrial paper pulp. To comprehensively elucidate the molecular processes involved in its culm elongation, Illumina paired-end sequencing was conducted. About 65.08 million high-quality reads were produced and assembled into 81,744 unigenes with an average length of 723 bp. A total of 64,338 (79%) unigenes were functionally annotated, of which 56,587 were annotated in the NCBI non-redundant protein database and 35,262 in the Swiss-Prot database. Also, 42,508 and 21,009 annotated unigenes were allocated to gene ontology (GO) categories and clusters of orthologous groups (COG), respectively. By searching against the Kyoto Encyclopedia of Genes and Genomes Pathway database (KEGG), 33,920 unigenes were assigned to 128 KEGG pathways. Meanwhile, 8,553 simple sequence repeats (SSRs) and 81,534 single-nucleotide polymorphisms (SNPs) were identified. Additionally, 388 transcripts involved in lignin biosynthesis were detected, among which 27 transcripts encoding shikimate O-hydroxycinnamoyltransferase (HCT) were specifically expressed in D. sinicus when compared to other bamboo species and rice. The phylogenetic relationship between D. sinicus and other plants was analyzed, suggesting functional diversity of HCT unigenes in D. sinicus. We conjectured that HCT might lead to the high lignin content and giant culm. Given that the leaves are not yet formed and the culm is covered with sheaths during culm elongation, photosynthesis in the bamboo culm is usually neglected.
Surprisingly, 109 photosynthesis-related transcripts were identified, including those for photosystem I and II, the cytochrome b6/f complex, photosynthetic electron transport and F-type ATPase, and 24 transcripts were characterized as antenna proteins, which are regarded as the main light-capturing apparatus of plants, implying that stem photosynthesis plays a key role during culm elongation because leaves are not yet available. The expression levels of 6 unigenes were measured by real-time quantitative PCR. The expression levels of all six genes agreed with the transcriptome data, confirming its reliability. To our knowledge, this is the first study to describe the D. sinicus transcriptome, which will deepen the understanding of the molecular mechanisms of culm development. The results may help variety improvement and resource utilization of bamboos. PMID:27304219
Aligning Metabolic Pathways Exploiting Binary Relation of Reactions.
Huang, Yiran; Zhong, Cheng; Lin, Hai Xiang; Huang, Jing
2016-01-01
Metabolic pathway alignment has been widely used to find one-to-one and/or one-to-many reaction mappings that identify alternative pathways performing similar functions through different sets of reactions, which has important applications in reconstructing phylogeny and understanding metabolic functions. The existing alignment methods exhaustively search reaction sets, which may become infeasible for large pathways. To address this problem, we present an effective alignment method for accurately extracting reaction mappings between two metabolic pathways. We show that the connectivity relation between reactions can be formalized as a binary relation of reactions in metabolic pathways, and that the multiplications of zero-one matrices for binary relations of reactions can be accomplished in a finite number of steps. By utilizing the multiplications of zero-one matrices for binary relations of reactions, we efficiently obtain reaction sets in a small number of steps without exhaustive search, and accurately uncover biologically relevant reaction mappings. Furthermore, we introduce a measure of topological similarity of nodes (reactions) by comparing the structural similarity of the k-neighborhood subgraphs of the nodes in aligning metabolic pathways. We employ this similarity metric to improve the accuracy of the alignments. The experimental results on the KEGG database show that, compared with other state-of-the-art methods, our method in most cases obtains better node correctness and edge correctness, more edges in the largest common connected subgraph for one-to-one reaction mappings, and more correct one-to-many reaction mappings. Our method is scalable in finding more reaction mappings with better biological relevance in large metabolic pathways.
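The zero-one matrix idea in this abstract can be illustrated with a short Python sketch. It is not the authors' implementation: the function names `bool_matmul` and `reachable_within` and the four-reaction toy pathway are assumptions made for illustration. The sketch only shows how repeated Boolean products of a reaction adjacency matrix reveal which reactions are connected within a given number of steps, without enumerating paths.

```python
from typing import List

Matrix = List[List[int]]


def bool_matmul(a: Matrix, b: Matrix) -> Matrix:
    """Boolean (zero-one) product: entry (i, j) is 1 if a[i][k] and b[k][j] for some k."""
    inner = len(b)
    return [
        [int(any(a[i][k] and b[k][j] for k in range(inner))) for j in range(len(b[0]))]
        for i in range(len(a))
    ]


def bool_or(a: Matrix, b: Matrix) -> Matrix:
    """Element-wise OR (union of two binary relations)."""
    return [[x | y for x, y in zip(row_a, row_b)] for row_a, row_b in zip(a, b)]


def reachable_within(adj: Matrix, steps: int) -> Matrix:
    """Union of adj^1 .. adj^steps: reactions connected by a path of at most `steps` edges."""
    power = adj
    total = adj
    for _ in range(steps - 1):
        power = bool_matmul(power, adj)
        total = bool_or(total, power)
    return total


# Hypothetical linear pathway R0 -> R1 -> R2 -> R3 (toy data, not from KEGG).
adj = [
    [0, 1, 0, 0],
    [0, 0, 1, 0],
    [0, 0, 0, 1],
    [0, 0, 0, 0],
]
two_step = bool_matmul(adj, adj)    # pairs connected by exactly two edges
closure = reachable_within(adj, 3)  # pairs connected by one to three edges
```

Because each product is itself a zero-one matrix, the number of products needed is bounded by the number of reactions, which is the sense in which the computation finishes in a finite number of steps.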
Kim, Mi Ae; Rhee, Jae-Sung; Kim, Tae Ha; Lee, Jung Sick; Choi, Ah-Young; Choi, Beom-Soon; Choi, Ik-Young; Sohn, Young Chang
2017-03-09
In order to characterize the female or male transcriptome of the Pacific abalone and further increase genomic resources, we sequenced the mRNA of full-length complementary DNA (cDNA) libraries derived from pooled tissues of female and male Haliotis discus hannai by employing the Iso-Seq protocol of the PacBio RSII platform. We successfully assembled whole full-length cDNA sequences and constructed a transcriptome database that included isoform information. After clustering, a total of 15,110 and 12,145 genes that coded for proteins were identified in female and male abalones, respectively. A total of 13,057 putative orthologs were retained from each transcriptome in abalones. Overall Gene Ontology terms and Kyoto Encyclopedia of Genes and Genomes (KEGG) pathways analyzed in each database showed a similar composition between sexes. In addition, a total of 519 and 391 isoforms were genome-widely identified with at least two isoforms from female and male transcriptome databases. We found that the number of isoforms and their alternatively spliced patterns are variable and sex-dependent. This information represents the first significant contribution to sex-preferential genomic resources of the Pacific abalone. The availability of whole female and male transcriptome database and their isoform information will be useful to improve our understanding of molecular responses and also for the analysis of population dynamics in the Pacific abalone. PMID:28282934
Li, Cong; Wu, Xia; Zhang, Wei; Li, Jia; Liu, Huawei; Hao, Ming; Wang, Junsong; Zhang, Honghai; Yang, Gengxia; Hao, Meijun; Sheng, Shoupeng; Sun, Yu; Long, Jiang; Li, Juan; Zhuang, Fengfeng; Hu, Caixia; Li, Li; Zheng, Jiasheng
2016-01-01
Liver cancer is one of the most lethal cancer types in humans, but our understanding of the molecular mechanisms underlying its metastasis remains insufficient. Here, we conducted high-content screening of potential genes involved in liver cancer metastasis, which we selected from the Kyoto Encyclopedia of Genes and Genomes (KEGG) pathway database, based on the SAMcell method and RNA interference technology. We identified two powerful genes in the liver cancer metastasis process, AEG-1 and AKR1C2, both of which proved to be positive regulators of metastasis in liver cancer. Further clinical results verified their roles in liver cancer. In summary, these findings could provide new insight into the mechanism of liver cancer and potential novel therapeutic targets for future liver cancer therapies. © 2015 Society for Laboratory Automation and Screening.
BioRuby: bioinformatics software for the Ruby programming language.
Goto, Naohisa; Prins, Pjotr; Nakao, Mitsuteru; Bonnal, Raoul; Aerts, Jan; Katayama, Toshiaki
2010-10-15
The BioRuby software toolkit contains a comprehensive set of free development tools and libraries for bioinformatics and molecular biology, written in the Ruby programming language. BioRuby has components for sequence analysis, pathway analysis, protein modelling and phylogenetic analysis; it supports many widely used data formats and provides easy access to databases, external programs and public web services, including BLAST, KEGG, GenBank, MEDLINE and GO. BioRuby comes with a tutorial, documentation and an interactive environment, which can be used in the shell and in the web browser. BioRuby is free and open source software, made available under the Ruby license. BioRuby runs on all platforms that support Ruby, including Linux, Mac OS X and Windows, and, with JRuby, on the Java Virtual Machine. The source code is available from http://www.bioruby.org/. Contact: katayama@bioruby.org
Transcriptome Sequencing in a Tibetan Barley Landrace with High Resistance to Powdery Mildew
Zeng, Xing-Quan; Luo, Xiao-Mei; Wang, Yu-Lin; Xu, Qi-Jun; Bai, Li-Jun; Yuan, Hong-Jun; Tashi, Nyima
2014-01-01
Hulless barley is an important cereal crop worldwide, especially in Tibet, China. However, this crop is usually susceptible to powdery mildew caused by Blumeria graminis f. sp. hordei. In this study, we aimed to understand the functions and pathways of genes involved in disease resistance by transcriptome sequencing of a Tibetan barley landrace with high resistance to powdery mildew. A total of 831 significantly differentially expressed genes were found in the infected seedlings, covering 19 functions. The terms “cell,” “cell part,” and “extracellular region” in the cellular component category, “binding” and “catalytic” in the molecular function category, and “metabolic process” and “cellular process” in the biological process category together suggested that these functions may be involved in the resistance of hulless barley to powdery mildew. In addition, 330 KEGG pathways were found using BLASTx with an E-value cut-off of <10^-5. Among them, three pathways, namely, “photosynthesis,” “plant-pathogen interaction,” and “photosynthesis-antenna proteins,” had significant matches in the database. Significant expression of the three pathways was detected at 24 h, 48 h, and 96 h after infection, respectively. These results indicated a complex process of barley response to powdery mildew infection. PMID:25587568
Zhou, Lei-Lei; Xu, Xiao-Yue; Ni, Jie; Zhao, Xia; Zhou, Jian-Wei; Feng, Ji-Feng
2018-06-01
Due to the low incidence and the heterogeneity of subtypes, the biological process of T-cell lymphomas is largely unknown. Although many genes have been detected in T-cell lymphomas, the role of these genes in the biological process of T-cell lymphomas has not been further analyzed. Two qualified datasets were downloaded from the Gene Expression Omnibus database. The biological functions of differentially expressed genes were evaluated by gene ontology enrichment and KEGG pathway analysis. The network for intersection genes was constructed with the Cytoscape v3.0 software. Kaplan-Meier survival curves and the log-rank test were employed to assess the association between differentially expressed genes and clinical characteristics. The intersection mRNAs were shown to be associated with fundamental processes of T-cell lymphoma cells. These intersection mRNAs were involved in the activation of some cancer-related pathways, including the PI3K/AKT, Ras, JAK-STAT, and NF-kappa B signaling pathways. PDGFRA, CXCL12, and CCL19 were the most significant central genes in the signal-net analysis. The results of the survival analysis are not entirely credible. Our findings uncovered aberrantly expressed genes and a complex RNA signal network in T-cell lymphomas and indicated cancer-related pathways involved in disease initiation and progression, providing new insight for biotargeted therapy in T-cell lymphomas. © 2018 John Wiley & Sons A/S. Published by John Wiley & Sons Ltd.
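The Kaplan-Meier survival curves mentioned in this abstract follow the product-limit estimator, which can be sketched in a few lines of Python. This is a generic illustration with made-up follow-up times, not the authors' analysis; the function name `kaplan_meier` is hypothetical, and the log-rank test is not implemented here.

```python
from typing import List, Tuple


def kaplan_meier(times: List[float], events: List[int]) -> List[Tuple[float, float]]:
    """Product-limit estimate: (time, survival probability) at each observed event time.

    events[i] is 1 if the event (e.g. death) was observed at times[i],
    and 0 if the subject was censored at that time.
    """
    data = sorted(zip(times, events))
    at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = 0
        removed = 0
        # Group all subjects sharing the same time point.
        while i < len(data) and data[i][0] == t:
            deaths += data[i][1]
            removed += 1
            i += 1
        if deaths > 0:
            survival *= 1.0 - deaths / at_risk
            curve.append((t, survival))
        # Both events and censored subjects leave the risk set.
        at_risk -= removed
    return curve


# Toy follow-up data (hypothetical): months and event indicators.
toy_curve = kaplan_meier([2, 3, 3, 5, 8], [1, 1, 0, 1, 0])
```

With the toy data, the estimate drops at months 2, 3 and 5 only; the censored subjects at months 3 and 8 reduce the risk set without lowering the curve.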
Zuo, Qisheng; Li, Dong; Zhang, Lei; Elsayed, Ahmed Kamel; Lian, Chao; Shi, Qingqing; Zhang, Zhentao; Zhu, Rui; Wang, Yinjie; Jin, Kai; Zhang, Yani; Li, Bichun
2015-01-01
Here, we explore the regulatory mechanism of lipid metabolic signaling pathways and related genes during differentiation of male germ cells in chickens, with the hope that better understanding of these pathways may improve in vitro induction. Fluorescence-activated cell sorting was used to obtain highly purified cultures of embryonic stem cells (ESCs), primitive germ cells (PGCs), and spermatogonial stem cells (SSCs). Total RNA was then extracted from each cell type, and the transcriptomes were sequenced by high-throughput RNA-seq. Gene Ontology (GO) analysis and the KEGG database were used to identify lipid metabolism pathways and related genes. Retinoic acid (RA), the end-product of the retinol metabolism pathway, induced in vitro differentiation of ESCs into male germ cells. Quantitative real-time PCR (qRT-PCR) was used to detect changes in the expression of the genes involved in the retinol metabolic pathways. From the results of RNA-seq and the database analyses, we concluded that 328 genes in 27 lipid metabolic pathways, including retinol metabolism, are continuously involved in lipid metabolism during the differentiation of ESCs into SSCs in vivo. Alcohol dehydrogenase 5 (ADH5) and aldehyde dehydrogenase 1 family member A1 (ALDH1A1) are involved in RA synthesis in the cell. ADH5 was specifically expressed in PGCs in our experiments, and ALDH1A1 expression persistently increased throughout development. CYP26b1, a member of the cytochrome P450 superfamily, is involved in the degradation of RA. Expression of CYP26b1, in contrast, decreased throughout development. Exogenous RA in the culture medium induced differentiation of ESCs into SSC-like cells. The expression patterns of ADH5, ALDH1A1, and CYP26b1 were consistent with RNA-seq results. We conclude that the retinol metabolism pathway plays an important role in the process of chicken male germ cell differentiation.
Lan, Daoliang; Xiong, Xianrong; Huang, Cai; Mipam, Tserang Donko; Li, Jian
2016-01-01
Yaks (Bos grunniens) are an endemic species that adapts well to thin air, cold temperatures, and high altitude. They can survive in harsh plateau environments and are a major source of animal production for local residents, being an important breed in the Qinghai-Tibet Plateau. However, compared with ordinary cattle that live in the plains, yaks generally have lower fertility. Investigating the basic physiological molecular features of the yak ovary and identifying the biological events underlying the differences between the ovaries of yak and plain cattle is necessary to understand the specificity of yak reproduction. Therefore, RNA-seq technology was applied to compare transcriptome data between yak and plain cattle estrous ovaries. After deep sequencing, 3,653,032 clean reads with a total of 4,828,772,880 base pairs were obtained from the yak ovary library. Alignment analysis showed that 16,992 yak genes mapped to the yak genome, among which 12,731 and 14,631 genes were assigned to Gene Ontology (GO) categories and Kyoto Encyclopedia of Genes and Genomes (KEGG) pathways, respectively. Furthermore, comparison of yak and cattle ovary transcriptome data revealed that 1,307 genes were significantly differentially expressed between the two libraries, of which 661 genes were upregulated and 646 genes were downregulated in yak ovary. Functional analysis showed that the differentially expressed genes were involved in various GO categories and KEGG pathways. GO annotations indicated that the genes related to the "cell adhesion" and "hormonal" biological processes and the "calcium ion binding" and "cation transmembrane transport" molecular functions were significantly active. KEGG pathway analysis showed that the "complement and coagulation cascade" pathway was the most enriched in yak ovary transcriptome data, followed by the "cytochrome P450" related and "ECM-receptor interaction" pathways.
Moreover, several novel pathways, such as "circadian rhythm," were significantly enriched despite having no evident associations with the reproductive function. Our findings provide a molecular resource for further investigation of the general molecular mechanism of yak ovary and offer new insights to understand comprehensively the specificity of yak reproduction.
Zhang, Yu; Mo, Wei-Jia; Wang, Xiao; Zhang, Tong-Tong; Qin, Yuan; Wang, Han-Lin; Chen, Gang; Wei, Dan-Ming; Dang, Yi-Wu
2018-05-02
The long non-coding RNA (lncRNA) PVT1 plays vital roles in the tumorigenesis and development of various types of cancer. However, the potential expression profiling, functions and pathways of PVT1 in hepatocellular carcinoma (HCC) remain unknown. PVT1 was knocked down in SMMC-7721 cells, and a miRNA microarray analysis was performed to detect the differentially expressed miRNAs. Twelve target prediction algorithms were used to predict the underlying targets of these differentially expressed miRNAs. Bioinformatics analysis was performed to explore the underlying functions, pathways and networks of the targeted genes. Furthermore, the relationship between PVT1 and the clinical parameters in HCC was confirmed based on the original data in the TCGA database. Among the differentially expressed miRNAs, the top two upregulated and top two downregulated miRNAs were selected for further analysis based on the false discovery rate (FDR), fold-change (FC) and P-values. Based on the TCGA database, PVT1 was highly expressed in HCC, and statistically higher PVT1 expression was found for sex (male), ethnicity (Asian) and pathological grade (G3+G4) compared to the control groups (P<0.05). Furthermore, Gene Ontology (GO) analysis revealed that the target genes were involved in complex cellular pathways, such as the macromolecule biosynthetic process, compound metabolic process, and transcription. Kyoto Encyclopedia of Genes and Genomes (KEGG) analysis revealed that the MAPK and Wnt signaling pathways may be correlated with the regulation of the four candidate miRNAs. The results therefore provide significant information on the differentially expressed miRNAs associated with PVT1 in HCC, and we hypothesize that PVT1 may play vital roles in HCC by regulating different miRNAs or target gene expression (particularly MAPK8) via the MAPK or Wnt signaling pathways. Thus, further investigation of the molecular mechanism of PVT1 in HCC is needed.
Prioritizing biological pathways by recognizing context in time-series gene expression data.
Lee, Jusang; Jo, Kyuri; Lee, Sunwon; Kang, Jaewoo; Kim, Sun
2016-12-23
The primary goal of pathway analysis using transcriptome data is to find significantly perturbed pathways. However, pathway analysis is not always successful in identifying pathways that are truly relevant to the context under study. A major reason for this difficulty is that a single gene is involved in multiple pathways. In the KEGG pathway database, there are 146 genes, each of which is involved in more than 20 pathways. Thus activation of even a single gene will result in activation of many pathways. This complex relationship often makes the pathway analysis very difficult. While we need much more powerful pathway analysis methods, a readily available alternative way is to incorporate the literature information. In this study, we propose a novel approach for prioritizing pathways by combining results from both pathway analysis tools and literature information. The basic idea is as follows. Whenever there are enough articles that provide evidence on which pathways are relevant to the context, we can be assured that the pathways are indeed related to the context, which is termed as relevance in this paper. However, if there are few or no articles reported, then we should rely on the results from the pathway analysis tools, which is termed as significance in this paper. We realized this concept as an algorithm by introducing Context Score and Impact Score and then combining the two into a single score. Our method ranked truly relevant pathways significantly higher than existing pathway analysis tools in experiments with two data sets. Our novel framework was implemented as ContextTRAP by utilizing two existing tools, TRAP and BEST. ContextTRAP will be a useful tool for the pathway based analysis of gene expression data since the user can specify the context of the biological experiment in a set of keywords. The web version of ContextTRAP is available at http://biohealth.snu.ac.kr/software/contextTRAP .
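The difficulty described in this abstract, that a single gene can belong to many pathways, can be made concrete with a small Python sketch. It counts pathway memberships per gene from a pathway-to-genes mapping; the function name `genes_in_many_pathways` and the toy identifiers are hypothetical, and a real analysis would load the mapping from KEGG rather than hard-code it.

```python
from collections import Counter
from typing import Dict, List, Set


def genes_in_many_pathways(pathways: Dict[str, Set[str]], min_pathways: int) -> List[str]:
    """Return genes that occur in more than `min_pathways` pathways, sorted by name."""
    counts = Counter(gene for genes in pathways.values() for gene in genes)
    return sorted(gene for gene, n in counts.items() if n > min_pathways)


# Toy mapping (hypothetical identifiers, not real KEGG entries).
toy_pathways = {
    "pathA": {"g1", "g2"},
    "pathB": {"g1", "g3"},
    "pathC": {"g1", "g2"},
}
shared_genes = genes_in_many_pathways(toy_pathways, 1)
hub_genes = genes_in_many_pathways(toy_pathways, 2)
```

With a threshold of 20 on the full KEGG mapping, this kind of count would correspond to the set of highly shared genes the abstract refers to; the toy thresholds here are lowered only so the example has non-empty output.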
Poot-Hernandez, Augusto Cesar; Rodriguez-Vazquez, Katya; Perez-Rueda, Ernesto
2015-11-17
It is generally accepted that gene duplication followed by functional divergence is one of the main sources of metabolic diversity. In this regard, there is an increasing interest in the development of methods that allow the systematic identification of these evolutionary events in metabolism. Here, we used a method not based on biomolecular sequence analysis to compare and identify common and variable routes in the metabolism of 40 Gammaproteobacteria species. The metabolic maps deposited in the KEGG database were transformed into linear Enzymatic Step Sequences (ESS) by using the breadth-first search algorithm. These ESS represent subsequent enzymes linked to each other, where their catalytic activities are encoded in the Enzyme Commission numbers. The ESS were compared in an all-against-all (pairwise comparisons) approach by using a dynamic programming algorithm, leaving only a set of significant pairs. From these comparisons, we identified a set of functionally conserved enzymatic steps in different metabolic maps, in which cell wall components and fatty acid and lysine biosynthesis were included. In addition, we found that pathways associated with biosynthesis share a higher proportion of similar ESS than degradation pathways and secondary metabolism pathways. Also, maps associated with the metabolism of similar compounds contain a high proportion of similar ESS, such as those maps from nucleotide metabolism pathways, in particular the inosine monophosphate pathway. Furthermore, diverse ESS associated with the low part of the glycolysis pathway were identified as functionally similar to multiple metabolic pathways. In summary, our comparisons may help to identify similar reactions in different metabolic pathways and could reinforce the patchwork model in the evolution of metabolism in Gammaproteobacteria.
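The two algorithmic steps named in this abstract, breadth-first linearization of a metabolic map and dynamic-programming comparison of the resulting sequences, can be sketched in Python as follows. This is an illustration under stated assumptions rather than the authors' method: the EC similarity (shared leading EC levels divided by four), the gap penalty of -0.5, the use of global alignment, and the names `bfs_order`, `ec_similarity` and `align_score` are all choices made for this example.

```python
from collections import deque
from typing import Dict, List


def bfs_order(graph: Dict[str, List[str]], start: str) -> List[str]:
    """Linearize an enzyme graph into one sequence by breadth-first visiting order."""
    seen = {start}
    order = []
    queue = deque([start])
    while queue:
        node = queue.popleft()
        order.append(node)
        for nxt in graph.get(node, []):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return order


def ec_similarity(a: str, b: str) -> float:
    """Fraction of the four EC levels shared from the left (assumed scoring scheme)."""
    shared = 0
    for x, y in zip(a.split("."), b.split(".")):
        if x != y:
            break
        shared += 1
    return shared / 4.0


def align_score(s: List[str], t: List[str], gap: float = -0.5) -> float:
    """Global alignment score of two EC-number sequences by dynamic programming."""
    rows, cols = len(s) + 1, len(t) + 1
    dp = [[0.0] * cols for _ in range(rows)]
    for i in range(1, rows):
        dp[i][0] = i * gap
    for j in range(1, cols):
        dp[0][j] = j * gap
    for i in range(1, rows):
        for j in range(1, cols):
            dp[i][j] = max(
                dp[i - 1][j - 1] + ec_similarity(s[i - 1], t[j - 1]),  # pair two steps
                dp[i - 1][j] + gap,                                    # gap in t
                dp[i][j - 1] + gap,                                    # gap in s
            )
    return dp[-1][-1]


# Toy enzyme graph keyed by EC number (illustrative, not parsed from KEGG).
toy_graph = {"2.7.1.1": ["5.3.1.9"], "5.3.1.9": ["2.7.1.11"], "2.7.1.11": []}
ess_a = bfs_order(toy_graph, "2.7.1.1")
ess_b = ["2.7.1.2", "5.3.1.9", "2.7.1.11"]
score = align_score(ess_a, ess_b)
```

Scoring on EC levels instead of sequence identity is what lets two steps catalyzed by unrelated proteins still count as functionally similar, which is the point of comparing enzymatic steps rather than biomolecular sequences.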
Chen, Langdong; Cao, Yan; Zhang, Hai; Lv, Diya; Zhao, Yahong; Liu, Yanjun; Ye, Guan; Chai, Yifeng
2018-01-31
Yangxinshi tablet (YXST) is an effective treatment for heart failure and myocardial infarction; it consists of 13 herbal medicines formulated according to traditional Chinese medicine (TCM) practices. It has been used for the treatment of cardiovascular disease for many years in China. In this study, a network pharmacology-based strategy was used to elucidate the mechanism of action of YXST for the treatment of heart failure. Cardiovascular disease-related protein target and compound databases were constructed for YXST. A molecular docking platform was used to predict the protein targets of YXST. The affinity between proteins and ingredients was determined using surface plasmon resonance (SPR) assays. The action modes between targets and representative ingredients were calculated using Glide docking, and the related pathways were predicted using the Kyoto Encyclopedia of Genes and Genomes (KEGG) database. A protein target database containing 924 proteins was constructed; 179 compounds in YXST were identified, and 48 compounds with high relevance to the proteins were defined as representative ingredients. Thirty-four protein targets of the 48 representative ingredients were analyzed and classified into two categories: immune and cardiovascular systems. The SPR assay and molecular docking partly validated the interplay between protein targets and representative ingredients. Moreover, 28 pathways related to heart failure were identified, which provided directions for further research on YXST. This study demonstrated that the cardiovascular protective effect of YXST mainly involved the immune and cardiovascular systems. Through the research strategy based on network pharmacology, we analyzed the complex system of YXST and found 48 representative compounds, 34 proteins and 28 related pathways of YXST, which could help us understand the underlying mechanism of YXST's anti-heart failure effect.
The network-based investigation could help researchers simplify the complex system of YXST. It may also offer a feasible approach to decipher the chemical and pharmacological bases of other TCM formulas.
Jia, Tianqi; Wei, Danfeng; Meng, Shan; Allan, Andrew C.; Zeng, Lihui
2014-01-01
Longan (Dimocarpus longan L.) is a tropical/subtropical fruit tree of significant economic importance in Southeast Asia. However, a lack of transcriptomic and genomic information hinders research on longan traits, such as the control of flowering. In this study, high-throughput RNA sequencing (RNA-Seq) was used to investigate differentially expressed genes between a unique longan cultivar ‘Sijimi’ (S), which flowers throughout the year, and a more typical cultivar ‘Lidongben’ (L), which flowers only once in the season, with the aim of identifying candidate genes associated with continuous flowering. 36,527 and 40,982 unigenes were obtained by de novo assembly of the clean reads from cDNA libraries of L and S cultivars. Additionally, 40,513 unigenes were assembled from combined reads of these libraries. A total of 32,475 unigenes were annotated by BLAST search to NCBI non-redundant protein (NR), Swiss-Prot, Clusters of Orthologous Groups (COGs) and Kyoto Encyclopedia of Genes and Genomes (KEGG) databases. Of these, almost fifteen thousand unigenes were identified as significantly differentially expressed genes (DEGs) using the Reads Per kb per Million reads (RPKM) method. A total of 6,415 DEGs were mapped to 128 KEGG pathways, and 8,743 DEGs were assigned to 54 Gene Ontology categories. After blasting the DEGs to public sequence databases, 539 potential flowering-related DEGs were identified. In addition, 107 flowering-time genes were identified in longan; their expression levels in the two longan samples were compared by the RPKM method, and the expression levels of 15 of them were confirmed by real-time quantitative PCR. Our results suggest that longan homologues of SHORT VEGETATIVE PHASE (SVP), GIGANTEA (GI), F-BOX 1 (FKF1) and EARLY FLOWERING 4 (ELF4) may be involved in this flowering trait and that ELF4 may be a key gene. The identification of candidate genes related to continuous flowering will provide new insight into the molecular process of regulating flowering time in woody plants.
PMID:25479005
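The RPKM normalization named in the abstract above can be illustrated with a minimal sketch. It uses the standard definition (reads mapped to a gene, scaled by gene length in kilobases and by total mapped reads in millions). The function names, the pseudocount in the fold-change helper, and all numbers are assumptions chosen for illustration; they are not taken from the study.

```python
from math import log2


def rpkm(read_count, gene_length_bp, total_mapped_reads):
    """Reads Per Kilobase per Million mapped reads (standard definition)."""
    if gene_length_bp <= 0 or total_mapped_reads <= 0:
        raise ValueError("gene length and total mapped reads must be positive")
    # 1e9 = 1e3 (bases per kilobase) * 1e6 (reads per million)
    return read_count * 1e9 / (gene_length_bp * total_mapped_reads)


def log2_fold_change(rpkm_a, rpkm_b, pseudocount=1.0):
    """log2 ratio of two RPKM values; the pseudocount (an assumption here)
    avoids division by zero for unexpressed genes."""
    return log2((rpkm_a + pseudocount) / (rpkm_b + pseudocount))


# Hypothetical unigene: 500 reads, 2,000 bp long, 10 million mapped reads.
example_rpkm = rpkm(500, 2000, 10_000_000)
example_lfc = log2_fold_change(25.0, 12.0)
```

With these made-up inputs the unigene has an RPKM of 25; comparing RPKM values of 25 and 12 with a pseudocount of 1 gives a log2 fold change of 1.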
Accurate atom-mapping computation for biochemical reactions.
Latendresse, Mario; Malerich, Jeremiah P; Travers, Mike; Karp, Peter D
2012-11-26
The complete atom mapping of a chemical reaction is a bijection of the reactant atoms to the product atoms that specifies the terminus of each reactant atom. Atom mapping of biochemical reactions is useful for many applications of systems biology, in particular for metabolic engineering, where synthesizing new biochemical pathways has to take into account the number of carbon atoms from a source compound that are conserved in the synthesis of a target compound. Rapid, accurate computation of the atom mapping(s) of a biochemical reaction remains elusive despite significant work on this topic. In particular, past researchers did not validate the accuracy of mapping algorithms. We introduce a new method for computing atom mappings called the minimum weighted edit-distance (MWED) metric. The metric is based on bond propensity to react and computes biochemically valid atom mappings for a large percentage of biochemical reactions. MWED models can be formulated efficiently as Mixed-Integer Linear Programs (MILPs). We have demonstrated this approach on 7501 reactions of the MetaCyc database, for which 87% of the models could be solved in less than 10 s. For 2.1% of the reactions, we found multiple optimal atom mappings. We show that the error rate is 0.9% (22 reactions) by comparing these atom mappings to 2446 atom mappings of the manually curated Kyoto Encyclopedia of Genes and Genomes (KEGG) RPAIR database. To our knowledge, our computational atom-mapping approach is the most accurate and among the fastest published to date. The atom-mapping data will be available in the MetaCyc database later in 2012; the atom-mapping software will be available within the Pathway Tools software later in 2012.
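To make the idea of a minimum weighted edit distance concrete, the sketch below scores every element-preserving bijection between reactant and product atoms by the total weight of bonds broken plus bonds formed, and keeps the cheapest ones. This is a brute-force enumeration for toy molecules, not the MILP formulation described in the abstract; the function names, the uniform default bond weight of 1.0, and the three-atom example are all assumptions made for illustration.

```python
from itertools import permutations


def bond_weight(elem_a, elem_b, weights):
    """Cost of breaking or forming a bond between two elements.
    Unlisted element pairs default to 1.0 (an illustrative assumption)."""
    return weights.get(frozenset((elem_a, elem_b)), 1.0)


def best_atom_mappings(reactant_atoms, reactant_bonds,
                       product_atoms, product_bonds, weights=None):
    """Return (minimum cost, list of optimal mappings).
    A mapping is a tuple m where reactant atom i maps to product atom m[i]."""
    weights = weights or {}
    if sorted(reactant_atoms) != sorted(product_atoms):
        raise ValueError("reaction is not element-balanced")
    product_set = {frozenset(b) for b in product_bonds}
    best_cost, best = None, []
    for perm in permutations(range(len(reactant_atoms))):
        # Only bijections that map each atom to the same element are valid.
        if any(reactant_atoms[i] != product_atoms[j] for i, j in enumerate(perm)):
            continue
        images = {frozenset((perm[a], perm[b])): (a, b) for a, b in reactant_bonds}
        cost = 0.0
        for image, (a, b) in images.items():
            if image not in product_set:  # bond broken
                cost += bond_weight(reactant_atoms[a], reactant_atoms[b], weights)
        for pb in product_set:
            if pb not in images:  # bond formed
                a, b = tuple(pb)
                cost += bond_weight(product_atoms[a], product_atoms[b], weights)
        if best_cost is None or cost < best_cost:
            best_cost, best = cost, [perm]
        elif cost == best_cost:
            best.append(perm)
    return best_cost, best


# Toy example: the chain C-C-O written in reverse atom order as O-C-C.
cost, mappings = best_atom_mappings(
    ["C", "C", "O"], [(0, 1), (1, 2)],
    ["O", "C", "C"], [(0, 1), (1, 2)],
)
```

For this toy input the only zero-cost mapping sends reactant atoms 0, 1, 2 to product atoms 2, 1, 0. Enumeration grows factorially with the number of atoms, which is one reason an optimization formulation such as the MILP mentioned above is needed for real metabolites.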
Naganeeswaran, Sudalaimuthu Asari; Subbian, Elain Apshara; Ramaswamy, Manimekalai
2012-01-01
Phytophthora megakarya, the causative agent of cacao black pod disease in West African countries, causes an extensive loss of yield. In this study, we analyzed 4 libraries of ESTs derived from Phytophthora megakarya-infected cocoa leaf and pod tissues. In total, 6379 redundant sequences were retrieved from the ESTtik database, and EST processing was performed using the seqclean tool. Clustering and assembling using CAP3 generated 3333 non-redundant sequences (907 contigs and 2426 singletons). The primary sequence analysis of the 3333 non-redundant sequences showed that the GC percentage was 42.7 and the sequence length ranged from 101 to 2576 nucleotides. Further, functional analyses (Blast, Interproscan, Gene Ontology and KEGG search) were executed and 1230 orthologous genes were annotated. In total, 272 enzymes corresponding to 114 metabolic pathways were identified. Functional annotation revealed that most of the sequences are related to molecular function, stress response and biological processes. The annotated enzymes include aldehyde dehydrogenase (E.C: 1.2.1.3), catalase (E.C: 1.11.1.6), acetyl-CoA C-acetyltransferase (E.C: 2.3.1.9), threonine ammonia-lyase (E.C: 4.3.1.19), acetolactate synthase (E.C: 2.2.1.6) and O-methyltransferase (E.C: 2.1.1.68), which play an important role in amino acid biosynthesis and phenylpropanoid biosynthesis. All this information was stored in a MySQL database management system to be used in future for reconstruction of the biotic stress response pathway in cocoa.
Li, Fengmei; Liu, Wuyi
2017-06-01
The basic helix-loop-helix (bHLH) transcription factors (TFs) form a huge superfamily and play crucial roles in many essential developmental, genetic, and physiological-biochemical processes of eukaryotes. In total, 109 putative bHLH TFs were identified and categorized successfully in the genomic databases of cattle, Bos taurus, after removing redundant sequences and merging genetic isoforms. Through phylogenetic analyses, 105 proteins among these bHLH TFs were classified into 44 families with 46, 25, 14, 3, 13, and 4 members in the high-order groups A, B, C, D, E, and F, respectively. The remaining 4 bHLH proteins were sorted out as 'orphans.' Next, these 109 putative bHLH proteins were further characterized as enriched in 524 Gene Ontology (GO) annotations (corrected P value ≤ 0.05) and 21 pathways (corrected P value ≤ 0.05) that had been mapped by the web server KOBAS 2.0. Furthermore, 95 bHLH proteins were further screened and analyzed together with two uncharacterized proteins in the STRING online database to reconstruct the protein-protein interaction network of cattle bHLH TFs. Ultimately, 89 bHLH proteins were fully mapped in a network with 67 biological processes, 13 molecular functions, 5 KEGG pathways, 12 PFAM protein domains, and 25 INTERPRO classified protein domains and features. These results provide much useful information and a good reference for further functional investigations and updated research on cattle bHLH TFs.
Identification of Key Transcription Factors Associated with Lung Squamous Cell Carcinoma
Zhang, Feng; Chen, Xia; Wei, Ke; Liu, Daoming; Xu, Xiaodong; Zhang, Xing; Shi, Hong
2017-01-01
Background Lung squamous cell carcinoma (lung SCC) is a common type of lung cancer, but its mechanism of pathogenesis is unclear. The aim of this study was to identify key transcription factors in lung SCC and elucidate its mechanism. Material/Methods Six published microarray datasets of lung SCC were downloaded from Gene Expression Omnibus (GEO) for integrated bioinformatics analysis. Significance analysis of microarrays was used to identify differentially expressed genes (DEGs) between lung SCC and normal controls. The biological functions and signaling pathways of DEGs were mapped in the Gene Ontology and Kyoto Encyclopedia of Genes and Genomes (KEGG) pathway database, respectively. A transcription factor gene regulatory network was used to obtain insights into the functions of DEGs. Results A total of 1,011 genes, including 539 upregulated genes and 472 downregulated genes, were filtered as DEGs between lung SCC and normal controls. DEGs were significantly enriched in cell cycle, DNA replication, p53 signaling pathway, pathways in cancer, adherens junction, and cell adhesion molecules signaling pathways. There were 57 transcription factors identified, which were used to construct a regulatory network. The network consisted of 736 interactions between 49 transcription factors and 486 DEGs. NFIC, BRCA1, and NFATC2 were the top 3 transcription factors that had the highest connectivity with DEGs and that regulated 83, 82, and 75 DEGs in the network, respectively. Conclusions NFIC, BRCA1, and NFATC2 might be the key transcription factors in the development of lung SCC by regulating the genes involved in cell cycle and DNA replication pathways. PMID:28081052
Yang, Yujia; Wang, Xiaozhu; Liu, Yang; Fu, Qiang; Tian, Changxu; Wu, Chenglong; Shi, Huitong; Yuan, Zihao; Tan, Suxu; Liu, Shikai; Gao, Dongya; Dunham, Rex; Liu, Zhanjiang
2018-04-30
In aquatic organisms, hearing is an important sense for acoustic communications and detection of sound-emitting predators and prey. Channel catfish is a dominant aquaculture species in the United States. As channel catfish can hear sounds of relatively high frequency, it serves as a good model for studying auditory mechanisms. In catfishes, Weberian ossicles connect the swimbladder to the inner ear to transfer the forced vibrations and improve hearing ability. In this study, we examined the transcriptional profiles of channel catfish swimbladder and four other tissues (gill, liver, skin, and intestine). We identified a total of 1777 genes that exhibited a preferential expression pattern in the swimbladder of channel catfish. Based on Gene Ontology enrichment analysis, many of the swimbladder-enriched genes were categorized into sensory perception of sound, auditory behavior, response to auditory stimulus, or detection of mechanical stimulus involved in sensory perception of sound, such as coch, kcnq4, sptbn1, sptbn4, dnm1, ush2a, and col11a1. Six signaling pathways associated with hearing (Glutamatergic synapse, GABAergic synapse pathways, Axon guidance, cAMP signaling pathway, Ionotropic glutamate receptor pathway, and Metabotropic glutamate receptor group III pathway) were over-represented in KEGG and PANTHER databases. Protein interaction prediction revealed an interactive relationship among the swimbladder-enriched genes and genes involved in sensory perception of sound. This study identified a set of genes and signaling pathways associated with the auditory system in the swimbladder of channel catfish and provides resources for further study on the biological and physiological roles of the catfish swimbladder. Copyright © 2018 Elsevier Inc. All rights reserved.
Xu, Yiran; Cheng, Xiaorui; Cui, Xiuliang; Wang, Tongxing; Liu, Gang; Yang, Ruishang; Wang, Jianhui; Bo, Xiaochen; Wang, Shengqi; Zhou, Wenxia; Zhang, Yongxiang
2015-09-01
Stress induces cognitive impairments, which are likely related to the damaged dendritic morphology in the brain. Treatments for stress-induced impairments remain limited because the molecules and pathways underlying these impairments are unknown. Therefore, the aim of this study was to find the potential molecules and pathways related to damage of the dendritic morphology induced by stress. To do this, we detected gene expression, constructed a protein-protein interaction (PPI) network, and analyzed the molecular pathways in the brains of mice exposed to 5-h multimodal stress. The results showed that stress increased plasma corticosterone concentration, decreased cognitive function, damaged dendritic morphologies, and altered APBB1, CLSTN1, KCNA4, NOTCH3, PLAU, RPS6KA1, SYP, TGFB1, KCNA1, NTRK3, and SNCA expression in the brains of mice. Further analyses found that the abnormal expression of CLSTN1, PLAU, NOTCH3, and TGFB1 induced by stress was related to alterations in the dendritic morphology. These four genes demonstrated interactions with 55 other genes and formed a closed PPI network. Molecular pathway analyses using the Database for Annotation, Visualization, and Integrated Discovery (DAVID), specifically with gene ontology and the Kyoto Encyclopedia of Genes and Genomes (KEGG), each identified three pathways that were significantly enriched in the gene list of the PPI network, with genes belonging to the Notch and transforming growth factor-beta (TGF-B) signaling pathways being the most enriched. Our results suggest that TGFB1, PLAU, NOTCH3, and CLSTN1 may be related to the alterations in dendritic morphology induced by stress, and imply that the Notch and TGF-B signaling pathways may be involved. Copyright © 2015 Elsevier Inc. All rights reserved.
Tissue Non-Specific Genes and Pathways Associated with Diabetes: An Expression Meta-Analysis.
Mei, Hao; Li, Lianna; Liu, Shijian; Jiang, Fan; Griswold, Michael; Mosley, Thomas
2017-01-21
We performed expression studies to identify tissue non-specific genes and pathways of diabetes by meta-analysis. We searched curated datasets of the Gene Expression Omnibus (GEO) database and identified 13 and five expression studies of diabetes and insulin responses in various tissues, respectively. We tested differential gene expression by an empirical Bayes-based linear method and investigated gene set expression association by knowledge-based enrichment analysis. Meta-analysis by different methods was applied to identify tissue non-specific genes and gene sets. We also proposed pathway mapping analysis to infer functions of the identified gene sets, and correlation and independence analysis to evaluate the expression association profile of genes and gene sets between studies and tissues. Our analysis showed that PGRMC1 and HADH genes were significant over diabetes studies, while IRS1 and MPST genes were significant over insulin response studies, and joint analysis showed that HADH and MPST genes were significant over all combined data sets. The pathway analysis identified six significant gene sets over all studies. The KEGG pathway mapping indicated that the significant gene sets are related to diabetes pathogenesis. The results also showed that 12.8% and 59.0% of pairwise studies had significantly correlated expression association for genes and gene sets, respectively; moreover, 12.8% of pairwise studies had independent expression association for genes, but no studies were observed to differ significantly in expression association of gene sets. Our analysis indicated that there are both tissue-specific and non-specific genes and pathways associated with diabetes pathogenesis. Compared to gene expression, pathway association tends to be tissue non-specific, and a common pathway influencing diabetes development is activated through different genes in different tissues.
Conceptualizing adverse outcome pathways for ...
Cyclooxygenase (COX) inhibition is of concern in fish because COX inhibitors (e.g., ibuprofen) are ubiquitous in aquatic systems/fish tissues, and can disrupt synthesis of prostaglandins that modulate a variety of essential biological functions (e.g., reproduction). This study utilized newly generated high content (transcriptomic and metabolomic) empirical data in combination with existing high throughput (ACTOR, epa.gov) toxicity data to facilitate development of adverse outcome pathways (AOPs) for the molecular initiating event (MIE) of COX inhibition. We examined effects of a waterborne, 96 h exposure to three COX inhibitors (indomethacin (IN; 100 µg/L), ibuprofen (IB; 200 µg/L) and celecoxib (CX; 20 µg/L)) on the liver metabolome and ovarian gene expression (using oligonucleotide microarray 4 x15K platform) in sexually mature fathead minnows (n=8). Differentially expressed genes were identified (t-test, p < 0.01), and functional analyses performed to determine enriched Kyoto Encyclopedia of Genes and Genomes (KEGG) pathways (p < 0.05). Principal component analysis indicated that liver metabolomics profiles of IN, IB and CX were not significantly different from control or one another. When compared to control, exposure to IB and CX resulted in differential expression of comparable numbers of genes (IB = 433, CX= 545). In contrast, 2558 genes were differentially expressed in IN-treated fish. KEGG pathway analyses show that IN had extensive effects on oocyte meios
Deriving pathway maps from automated text analysis using a grammar-based approach.
Olsson, Björn; Gawronska, Barbara; Erlendsson, Björn
2006-04-01
We demonstrate how automated text analysis can be used to support the large-scale analysis of metabolic and regulatory pathways by deriving pathway maps from textual descriptions found in the scientific literature. The main assumption is that correct syntactic analysis combined with domain-specific heuristics provides a good basis for relation extraction. Our method uses an algorithm that searches through the syntactic trees produced by a parser based on a Referent Grammar formalism, identifies relations mentioned in the sentence, and classifies them with respect to their semantic class and epistemic status (facts, counterfactuals, hypotheses). The semantic categories used in the classification are based on the relation set used in KEGG (Kyoto Encyclopedia of Genes and Genomes), so that pathway maps using KEGG notation can be automatically generated. We present the current version of the relation extraction algorithm and an evaluation based on a corpus of abstracts obtained from PubMed. The results indicate that the method is able to combine a reasonable coverage with high accuracy. We found that 61% of all sentences were parsed, and 97% of the parse trees were judged to be correct. The extraction algorithm was tested on a sample of 300 parse trees and was found to produce correct extractions in 90.5% of the cases.
Lu, Xiao-Ming; Chen, Chang; Zheng, Tian-Ling
2017-05-01
Pyrosequencing and metagenomic profiling were used to assess the phylogenetic and functional characteristics of microbial communities residing in sediments collected from the estuaries of Rivers Oujiang (OS) and Jiaojiang (JS) in the western region of the East China Sea. Another sediment sample, obtained near the shore far from the estuaries, was used for comparison (CS). Characterization of estuary sediment bacterial communities showed that toxic chemicals potentially reduced the natural variability in microbial communities, while they increased the microbial metabolic enzymes and pathways. Polycyclic aromatic hydrocarbons (PAHs) and nitrobenzene were negatively correlated with the bacterial community variation. The dominant class in the sediments was Gammaproteobacteria. According to Kyoto Encyclopedia of Genes and Genomes (KEGG) enzyme profiles, the dominant enzymes in estuarine sediments increased greatly; these included 2-oxoglutarate synthase, acetolactate synthase, inorganic diphosphatase, and aconitate hydratase. In KEGG pathway profiles, most of the pathways were also dominated by specific metabolism in these sediments and showed a marked increase, for instance alanine, aspartate, and glutamate metabolism, carbon fixation pathways in prokaryotes, and aminoacyl-tRNA biosynthesis. The estuarine sediment bacterial diversity varied with the polluted river water inputs. In the estuary receiving river water from the more seriously polluted River Oujiang, the sediment bacterial community function was more severely affected.
Matrine inhibits the progression of prostate cancer by promoting expression of GADD45B.
Huang, Hai; Wang, Qiong; Du, Tao; Lin, Chunhao; Lai, Yiming; Zhu, Dingjun; Wu, Wanhua; Ma, Xiaoming; Bai, Soumin; Li, Zean; Liu, Leyuan; Li, Qi
2018-04-01
Matrine is a naturally occurring alkaloid extracted from the Chinese herb Sophora flavescens. It has been demonstrated to exhibit antiproliferative properties, promote apoptosis, and inhibit cell invasion in a number of cancer cell lines by modulating the NF-κB pathway to downregulate the expression of MMP2 and MMP9. It has also been shown to improve the efficacy of chemotherapy when it is combined with other chemotherapy drugs. However, the therapeutic potential of matrine for prostate cancer needs to be further studied. We analyzed KEGG pathways of differential gene expression between matrine-treated and untreated prostate cancer cell lines and identified GADD45B as one of the major target genes of matrine based on its role in apoptosis and its prognostic value for prostate cancer patients in the TCGA database. We further analyzed the expression of GADD45B protein in a tissue microarray and of its mRNA in the TCGA database, and tested the synergistic impacts of matrine and GADD45B overexpression on proliferation, apoptosis, migration and invasion of the prostate cancer cell line DU145. Matrine promoted the expression of GADD45B, a tumor suppressive gene that is involved in the regulation of cell cycle, DNA damage repair, cell survival, aging, apoptosis and other cellular processes through p38/JNK, ROS-GADD45B-p38, or other signal pathways. Although GADD45B is elevated in prostate cancer tissues, levels of GADD45B in prostate tumor tissues are reduced at the late stage of tumor invasion, and higher levels of GADD45B predict better survival of prostate cancer patients. Matrine may be used to treat prostate cancer patients to increase the levels of GADD45B, inhibit tumor invasion and improve patient survival. © 2018 Wiley Periodicals, Inc.
The Importance of Biological Databases in Biological Discovery.
Baxevanis, Andreas D; Bateman, Alex
2015-06-19
Biological databases play a central role in bioinformatics. They offer scientists the opportunity to access a wide variety of biologically relevant data, including the genomic sequences of an increasingly broad range of organisms. This unit provides a brief overview of major sequence databases and portals, such as GenBank, the UCSC Genome Browser, and Ensembl. Model organism databases, including WormBase, The Arabidopsis Information Resource (TAIR), and those made available through the Mouse Genome Informatics (MGI) resource, are also covered. Non-sequence-centric databases, such as Online Mendelian Inheritance in Man (OMIM), the Protein Data Bank (PDB), MetaCyc, and the Kyoto Encyclopedia of Genes and Genomes (KEGG), are also discussed. Copyright © 2015 John Wiley & Sons, Inc.
Comparative study on gene set and pathway topology-based enrichment methods.
Bayerlová, Michaela; Jung, Klaus; Kramer, Frank; Klemm, Florian; Bleckmann, Annalen; Beißbarth, Tim
2015-10-22
Enrichment analysis is a popular approach to identify pathways or sets of genes which are significantly enriched in the context of differentially expressed genes. The traditional gene set enrichment approach considers a pathway as a simple gene list disregarding any knowledge of gene or protein interactions. In contrast, the new group of so called pathway topology-based methods integrates the topological structure of a pathway into the analysis. We comparatively investigated gene set and pathway topology-based enrichment approaches, considering three gene set and four topological methods. These methods were compared in two extensive simulation studies and on a benchmark of 36 real datasets, providing the same pathway input data for all methods. In the benchmark data analysis both types of methods showed a comparable ability to detect enriched pathways. The first simulation study was conducted with KEGG pathways, which showed considerable gene overlaps between each other. In this study with original KEGG pathways, none of the topology-based methods outperformed the gene set approach. Therefore, a second simulation study was performed on non-overlapping pathways created by unique gene IDs. Here, methods accounting for pathway topology reached higher accuracy than the gene set methods, however their sensitivity was lower. We conducted one of the first comprehensive comparative works on evaluating gene set against pathway topology-based enrichment methods. The topological methods showed better performance in the simulation scenarios with non-overlapping pathways, however, they were not conclusively better in the other scenarios. This suggests that simple gene set approach might be sufficient to detect an enriched pathway under realistic circumstances. Nevertheless, more extensive studies and further benchmark data are needed to systematically evaluate these methods and to assess what gain and cost pathway topology information introduces into enrichment analysis. 
Both types of methods for enrichment analysis require further improvements in order to deal with the problem of pathway overlaps.
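The traditional gene set approach described above, which treats a pathway as a plain gene list, is often implemented as an over-representation test. The sketch below computes a one-sided hypergeometric p-value for the overlap between a list of differentially expressed genes and a pathway. The abstract does not name the three gene set methods it compared, so this is an assumed, commonly used variant; the function name and the toy numbers are hypothetical.

```python
from math import comb


def enrichment_p_value(population, pathway_size, deg_count, overlap):
    """One-sided hypergeometric test: probability of seeing at least
    `overlap` pathway genes among `deg_count` genes drawn without
    replacement from `population` genes, of which `pathway_size`
    belong to the pathway."""
    if not (0 <= pathway_size <= population and 0 <= deg_count <= population):
        raise ValueError("pathway and DEG list must fit in the population")
    upper = min(pathway_size, deg_count)
    # comb() returns 0 when the second argument exceeds the first,
    # so impossible overlaps contribute nothing to the tail.
    tail = sum(
        comb(pathway_size, i) * comb(population - pathway_size, deg_count - i)
        for i in range(overlap, upper + 1)
    )
    return tail / comb(population, deg_count)


# Toy numbers: 10 genes in total, 4 in the pathway, 3 DEGs, 2 of them in the pathway.
p = enrichment_p_value(10, 4, 3, 2)
```

In the toy case the p-value is 40/120, that is, one third. A test of this form uses only gene membership, which is the limitation that pathway topology-based methods try to address; it also treats overlapping pathways as independent, the issue highlighted in the simulation with original KEGG pathways.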
Sequencing and characterization of lncRNAs in the breast muscle of Gushi and Arbor Acres chickens.
Ren, Tuanhui; Li, Zhuanjian; Zhou, Yu; Liu, Xuelian; Han, Ruili; Wang, Yongcai; Yan, FengBin; Sun, GuiRong; Li, Hong; Kang, Xiangtao
2018-05-01
Chicken muscle quality is one of the most important factors determining the economic value of poultry, and muscle development and growth are affected by genetics, environment, and nutrition. However, little is known about the molecular regulatory mechanisms of long non-coding RNAs (lncRNAs) in chicken skeletal muscle development. Our study aimed to better understand muscle development in chickens and thereby improve meat quality. In this study, Ribo-Zero RNA-Seq was used to investigate differences in the expression profiles of muscle development related genes and associated pathways between Gushi (GS) and Arbor Acres (AA) chickens. We identified two lncRNAs with muscle tissue-specific expression. In addition, the target genes of these lncRNAs were significantly enriched in certain biological processes and molecular functions, as demonstrated by Gene Ontology (GO) analysis, and these target genes participate in five signaling pathways, as revealed by an analysis of the Kyoto Encyclopedia of Genes and Genomes (KEGG) database. Taken together, these data suggest that different lncRNAs might be involved in regulating chicken muscle development and growth and provide new insight into the molecular mechanisms of lncRNAs.
Deng, Yu; Li, Fei; Rieske, Lynne K; Sun, Li-Li; Sun, Shou-Hui
2018-08-20
Fall webworm, Hyphantria cunea Drury (Lepidoptera: Arctiidae), is extremely adaptable and highly invasive in China as a defoliator of ornamental and forest trees. Both voltinism and diapause strategies of fall webworm in China are variable, and this variability contributes to its invasiveness. Little is known about the molecular regulation of diapause in fall webworm. To gain insight into possible mechanisms of diapause induction, high-throughput RNA-seq data were generated from non-diapause pupae (NDP) and diapause pupae (DP). A total of 58,151 unigenes were assembled and searched against nine public databases. In total, 29,013 up-regulated and 3451 down-regulated unigenes were differentially expressed in DP when compared with NDP. Genes encoding proteins such as UDP-glycosyl transferase (UGT), cytochrome P450 and Hsp70 were predicted to be involved in diapause. Moreover, GO function and KEGG pathway enrichments were performed on all differentially expressed genes (DEGs) and showed that cell cycle and insulin signaling pathways may be related to the diapause of the fall webworm. This study provides valuable information about the fall webworm transcriptome for future gene function research, especially as it relates to diapause. Copyright © 2018 Elsevier B.V. All rights reserved.
Sun, Xiujun; Li, Dongming; Liu, Zhihong; Zhou, Liqing; Wu, Biao; Yang, Aiguo
2017-10-01
The pen shell (Atrina pectinata) is a large wedge-shaped bivalve that belongs to the family Pinnidae. Due to its large and nutritious adductor muscle, it is a popular seafood with high commercial value in Asia-Pacific countries. However, limited genomic and transcriptomic data have hampered genetic investigations of this species. In this study, the transcriptome of A. pectinata was deeply sequenced using Illumina paired-end sequencing technology. After assembly, a total of 127263 unigenes were obtained. Functional annotation indicated that the highest percentage of unigenes (18.60%) was annotated in the GO database, followed by 18.44% in the PFAM database and 17.04% in the NR database. There were 270 biological pathways matched with those in the KEGG database. Furthermore, a total of 23452 potential simple sequence repeats (SSRs) were identified, of which the most abundant type was mono-nucleotide repeats (12902, 55.01%), followed by di-nucleotide (8132, 34.68%), tri-nucleotide (2010, 8.57%), tetra-nucleotide (401, 1.71%), and penta-nucleotide (7, 0.03%) repeats. Sixty SSRs were selected for validating and developing genic SSR markers, of which 23 showed polymorphism in a cultured population, with average observed and expected heterozygosities of 0.412 and 0.579, respectively. In this study, we established the first comprehensive transcript dataset of A. pectinata genes. Our results demonstrated that RNA-Seq is a fast and cost-effective method for genic SSR development in non-model species.
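The SSR screening step described above can be sketched with regular expressions that look for a short motif repeated in tandem. The abstract does not state which tool or thresholds were used, so the minimum repeat counts below (10 for mono-, 6 for di-, 5 for tri- to penta-nucleotide motifs), the function names, and the test sequence are all assumptions for illustration.

```python
import re

# Hypothetical minimum tandem copies per motif length (1 = mono ... 5 = penta).
MIN_REPEATS = {1: 10, 2: 6, 3: 5, 4: 5, 5: 5}


def is_primitive(motif):
    """True if the motif is not itself a repetition of a shorter unit
    (for example, 'AA' and 'ATAT' are rejected, 'AG' and 'CAT' are kept)."""
    return motif not in (motif + motif)[1:-1]


def find_ssrs(sequence, min_repeats=MIN_REPEATS):
    """Return a sorted list of (start, motif, copies) for each SSR found."""
    seq = sequence.upper()
    hits = []
    for unit_len, min_count in sorted(min_repeats.items()):
        # A motif of unit_len bases followed by at least min_count - 1 more copies.
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (unit_len, min_count - 1))
        for match in pattern.finditer(seq):
            motif = match.group(1)
            if is_primitive(motif):
                copies = len(match.group(0)) // unit_len
                hits.append((match.start(), motif, copies))
    return sorted(hits)


# Made-up sequence containing one mono-, one di- and one tri-nucleotide repeat.
demo = "GG" + "A" * 12 + "CT" + "AG" * 7 + "TTC" + "CAT" * 5 + "G"
ssrs = find_ssrs(demo)
```

On the made-up sequence this reports an A run of 12 copies at position 2, an AG repeat of 7 copies at position 16, and a CAT repeat of 5 copies at position 33. Counting hits by motif length over all unigenes would give a breakdown of the same kind as the percentages reported in the abstract.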
ESTree db: a Tool for Peach Functional Genomics
Lazzari, Barbara; Caprera, Andrea; Vecchietti, Alberto; Stella, Alessandra; Milanesi, Luciano; Pozzi, Carlo
2005-01-01
Background The ESTree db represents a collection of Prunus persica expressed sequence tags (ESTs) and is intended as a resource for peach functional genomics. A total of 6,155 successful EST sequences were obtained from four in-house prepared cDNA libraries from Prunus persica mesocarps at different developmental stages. Another 12,475 peach EST sequences were downloaded from public databases and added to the ESTree db. An automated pipeline was prepared to process EST sequences using public software integrated by in-house developed Perl scripts and data were collected in a MySQL database. A php-based web interface was developed to query the database. Results The ESTree db version as of April 2005 encompasses 18,630 sequences representing eight libraries. Contig assembly was performed with CAP3. Putative single nucleotide polymorphism (SNP) detection was performed with the AutoSNP program and a search engine was implemented to retrieve results. All the sequences and all the contig consensus sequences were annotated both with blastx against the GenBank nr db and with GOblet against the viridiplantae section of the Gene Ontology db. Links to NiceZyme (Expasy) and to the KEGG metabolic pathways were provided. A local BLAST utility is available. A text search utility allows querying and browsing the database. Statistics were provided on Gene Ontology occurrences to assign sequences to Gene Ontology categories. Conclusion The resulting database is a comprehensive resource of data and links related to peach EST sequences. The Sequence Report and Contig Report pages work as the web interface core structures, giving quick access to data related to each sequence/contig. PMID:16351742
ESTree db: a tool for peach functional genomics.
Lazzari, Barbara; Caprera, Andrea; Vecchietti, Alberto; Stella, Alessandra; Milanesi, Luciano; Pozzi, Carlo
2005-12-01
The ESTree db (http://www.itb.cnr.it/estree/) represents a collection of Prunus persica expressed sequence tags (ESTs) and is intended as a resource for peach functional genomics. A total of 6,155 successful EST sequences were obtained from four in-house prepared cDNA libraries from Prunus persica mesocarps at different developmental stages. Another 12,475 peach EST sequences were downloaded from public databases and added to the ESTree db. An automated pipeline was prepared to process EST sequences using public software integrated by in-house developed Perl scripts and data were collected in a MySQL database. A php-based web interface was developed to query the database. The ESTree db version as of April 2005 encompasses 18,630 sequences representing eight libraries. Contig assembly was performed with CAP3. Putative single nucleotide polymorphism (SNP) detection was performed with the AutoSNP program and a search engine was implemented to retrieve results. All the sequences and all the contig consensus sequences were annotated both with blastx against the GenBank nr db and with GOblet against the viridiplantae section of the Gene Ontology db. Links to NiceZyme (Expasy) and to the KEGG metabolic pathways were provided. A local BLAST utility is available. A text search utility allows querying and browsing the database. Statistics were provided on Gene Ontology occurrences to assign sequences to Gene Ontology categories. The resulting database is a comprehensive resource of data and links related to peach EST sequences. The Sequence Report and Contig Report pages work as the web interface core structures, giving quick access to data related to each sequence/contig.
Chen, Xin; Zhang, Jin; Liu, Qingzhong; Guo, Wei; Zhao, Tiantian; Ma, Qinghua; Wang, Guixi
2014-01-01
Corylus is an important genus of woody plants in Northeast China. Its products, hazelnuts, constitute one of the most important raw materials for the pastry and chocolate industry. However, limited genetic research has focused on Corylus because of the lack of genomic resources. The advent of high-throughput sequencing technologies provides a turning point for Corylus research. In the present study, we performed de novo transcriptome sequencing for the first time to produce a comprehensive database for the Corylus heterophylla Fisch floral buds. The C. heterophylla Fisch floral buds transcriptome was sequenced using the Illumina paired-end sequencing technology. We produced 28,930,890 raw reads and assembled them into 82,684 contigs. A total of 40,941 unigenes were identified, among which 30,549 were annotated in the NCBI Non-redundant (Nr) protein database and 18,581 were annotated in the Swiss-Prot database. Of these annotated unigenes, 25,311 and 10,514 unigenes were assigned to gene ontology (GO) categories and clusters of orthologous groups (COG), respectively. We could map 17,207 unigenes onto 128 pathways using the Kyoto Encyclopedia of Genes and Genomes Pathway (KEGG) database. Additionally, based on the transcriptome, we constructed a candidate cold tolerance gene set of C. heterophylla Fisch floral buds. The expression patterns of selected genes during four stages of cold acclimation suggested that these genes might be involved in different cold responsive stages in C. heterophylla Fisch floral buds. The transcriptome of C. heterophylla Fisch floral buds was deeply sequenced, de novo assembled, and annotated, providing abundant data to better understand the C. heterophylla Fisch floral buds transcriptome. Candidate genes potentially involved in cold tolerance were identified, providing a material basis for future analysis of the molecular mechanisms of cold stress tolerance in C. heterophylla Fisch floral buds.
The aquatic animals' transcriptome resource for comparative functional analysis.
Chou, Chih-Hung; Huang, Hsi-Yuan; Huang, Wei-Chih; Hsu, Sheng-Da; Hsiao, Chung-Der; Liu, Chia-Yu; Chen, Yu-Hung; Liu, Yu-Chen; Huang, Wei-Yun; Lee, Meng-Lin; Chen, Yi-Chang; Huang, Hsien-Da
2018-05-09
Aquatic animals have great economic and ecological importance. Among them, non-model organisms have been studied regarding eco-toxicity, stress biology, and environmental adaptation. Due to recent advances in next-generation sequencing techniques, large amounts of RNA-seq data for aquatic animals are publicly available. However, no comprehensive resource currently exists for the analysis, unification, and integration of these datasets. This study utilizes computational approaches to build a new resource of transcriptomic maps for aquatic animals. This aquatic animal transcriptome map database, dbATM, provides de novo transcriptome assembly, gene annotation and comparative analysis of more than twenty aquatic organisms without a draft genome. To improve the assembly quality, three computational tools (Trinity, Oases and SOAPdenovo-Trans) were employed to enhance individual transcriptome assembly, and CAP3 and CD-HIT-EST software were then used to merge these three assembled transcriptomes. In addition, functional annotation analysis provides valuable clues to gene characteristics, including full-length transcript coding regions, conserved domains, gene ontology and KEGG pathways. Furthermore, all aquatic animal genes are essential for comparative genomics tasks such as constructing homologous gene groups and blast databases and phylogenetic analysis. In conclusion, we establish a resource for non-model aquatic animals, which are of great economic and ecological importance, and provide transcriptomic information including functional annotation and comparative transcriptome analysis. The database is now publicly accessible through the URL http://dbATM.mbc.nctu.edu.tw/ .
Tian, Wenlan; Paudel, Dev
2017-01-01
Jatropha (Jatropha curcas L.) is an economically important species with a great potential for biodiesel production. To enrich the jatropha genomic databases and resources for microgravity studies, we sequenced and annotated the transcriptome of jatropha and developed SSR and SNP markers from the transcriptome sequences. In total, 1,714,433 raw reads with an average length of 441.2 nucleotides were generated. De novo assembly and clustering resulted in 115,611 uniquely assembled sequences (UASs) including 21,418 full-length cDNAs and 23,264 new jatropha transcript sequences. The whole set of UASs was fully annotated: 59,903 (51.81%) were assigned gene ontology (GO) terms, 12,584 (10.88%) had orthologs in Eukaryotic Orthologous Groups (KOG), and 8,822 (7.63%) were mapped to 317 pathways in six different categories in the Kyoto Encyclopedia of Genes and Genomes (KEGG) database; the set also contained 3,588 putative transcription factors. From the UASs, 9,798 SSRs were discovered with AG/CT as the most frequent (45.8%) SSR motif type. A further 38,693 SNPs were detected, of which 7,584 remained after filtering. This UAS set has enriched the current jatropha genomic databases and provided a large number of genetic markers, which can facilitate jatropha genetic improvement and many other genetic and biological studies. PMID:28154822
Predicting miRNA targets for head and neck squamous cell carcinoma using an ensemble method.
Gao, Hong; Jin, Hui; Li, Guijun
2018-01-01
This study aimed to uncover potential microRNA (miRNA) targets in head and neck squamous cell carcinoma (HNSCC) using an ensemble method which combined 3 different methods: Pearson's correlation coefficient (PCC), Lasso and a causal inference method (i.e., intervention calculus when the directed acyclic graph (DAG) is absent [IDA]), based on Borda count election. The Borda count election method was used to integrate the top 100 predicted targets of each miRNA generated by individual methods. Afterwards, to assess the performance of our method, we checked the predictions for miRNAs against the TarBase v6.0, miRecords v2013, miRWalk v2.0 and miRTarBase v4.5 databases. Pathway enrichment analysis of target genes in the top 1,000 miRNA-messenger RNA (mRNA) interactions was conducted to focus on significant KEGG pathways. Finally, we extracted target genes based on occurrence frequency ≥3. Based on an absolute value of PCC >0.7, we found 33 miRNAs and 288 mRNAs for further analysis. We extracted 10 target genes with predicted frequencies not less than 3. The target gene MYO5C possessed the highest frequency, being predicted by 7 different miRNAs. A total of 8 significant pathways were identified; cytokine-cytokine receptor interaction and the chemokine signaling pathway were the most significant. We successfully predicted target genes and pathways for HNSCC relying on miRNA expression data, mRNA expression profile, an ensemble method and pathway information. Our results may offer new information for the diagnosis and estimation of the prognosis of HNSCC.
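The Borda count step described in the abstract above can be illustrated with a minimal sketch. The scoring rule (a target at 0-based position i in a list of n considered entries earns n - i points, absent targets earn 0), the function name `borda_merge` and the placeholder gene names other than MYO5C are assumptions made for illustration; the abstract does not give the authors' exact scoring scheme.

```python
from collections import defaultdict


def borda_merge(rankings, top_k=100):
    """Merge several best-first ranked lists with a simple Borda count.

    Assumed scoring rule: within each list, only the first `top_k` entries
    are considered; an entry at 0-based position i among n considered
    entries earns n - i points. Entries missing from a list earn nothing.
    """
    scores = defaultdict(int)
    for ranked in rankings:
        considered = ranked[:top_k]
        n = len(considered)
        for position, item in enumerate(considered):
            scores[item] += n - position
    # Highest total first; ties broken alphabetically for determinism.
    return sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))


# Hypothetical per-method rankings of targets for one miRNA.
pcc_rank = ["MYO5C", "GENE_A", "GENE_B"]
lasso_rank = ["GENE_A", "MYO5C", "GENE_C"]
ida_rank = ["MYO5C", "GENE_C", "GENE_A"]

merged = borda_merge([pcc_rank, lasso_rank, ida_rank])
```

With these toy lists, MYO5C collects 3 + 2 + 3 = 8 points and ranks first, which shows how a target supported by all three methods rises above one ranked highly by a single method.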
MESSI: metabolic engineering target selection and best strain identification tool.
Kang, Kang; Li, Jun; Lim, Boon Leong; Panagiotou, Gianni
2015-01-01
Metabolic engineering and synthetic biology are synergistically related fields for manipulating target pathways and designing microorganisms that can act as chemical factories. Saccharomyces cerevisiae's ideal bioprocessing traits make yeast a very attractive chemical factory for production of fuels, pharmaceuticals, nutraceuticals as well as a wide range of chemicals. However, future attempts of engineering S. cerevisiae's metabolism using synthetic biology need to move towards more integrative models that incorporate the high connectivity of metabolic pathways and regulatory processes and the interactions in genetic elements across those pathways and processes. To contribute in this direction, we have developed Metabolic Engineering target Selection and best Strain Identification tool (MESSI), a web server for predicting efficient chassis and regulatory components for yeast bio-based production. The server provides an integrative platform for users to analyse ready-to-use public high-throughput metabolomic data, which are transformed to metabolic pathway activities for identifying the most efficient S. cerevisiae strain for the production of a compound of interest. As input MESSI accepts metabolite KEGG IDs or pathway names. MESSI outputs a ranked list of S. cerevisiae strains based on aggregation algorithms. Furthermore, through a genome-wide association study of the metabolic pathway activities with the strains' natural variation, MESSI prioritizes genes and small variants as potential regulatory points and promising metabolic engineering targets. Users can choose various parameters in the whole process such as (i) weight and expectation of each metabolic pathway activity in the final ranking of the strains, (ii) Weighted AddScore Fuse or Weighted Borda Fuse aggregation algorithm, (iii) type of variants to be included, (iv) variant sets in different biological levels. Database URL: http://sbb.hku.hk/MESSI/. © The Author(s) 2015. Published by Oxford University Press.
Dynamics of NAD-metabolism: everything but constant.
Opitz, Christiane A; Heiland, Ines
2015-12-01
NAD, as well as its phosphorylated form, NADP, are best known as electron carriers and co-substrates of various redox reactions. As such they participate in approximately one quarter of all reactions listed in the reaction database KEGG. In metabolic pathway analysis, the total amount of NAD is usually assumed to be constant. That means that changes in the redox state might be considered, but concentration changes of the NAD moiety are usually neglected. However, a growing number of NAD-consuming reactions have been identified, showing that this assumption does not hold true in general. NAD-consuming reactions are common characteristics of NAD(+)-dependent signalling pathways and include mono- and poly-ADP-ribosylation of proteins, NAD(+)-dependent deacetylation by sirtuins and the formation of messenger molecules such as cyclic ADP-ribose (cADPR) and nicotinic acid (NA)-ADP (NAADP). NAD-consuming reactions are thus involved in major signalling and gene regulation pathways such as DNA-repair or regulation of enzymes central in metabolism. All known NAD(+)-dependent signalling processes include the release of nicotinamide (Nam). Thus cellular NAD pools need to be constantly replenished, mostly by recycling Nam to NAD(+). This process is, among others, regulated by the circadian clock, causing complex dynamic changes in NAD concentration. As disturbances in NAD homoeostasis are associated with a large number of diseases ranging from cancer to diabetes, it is important to better understand the dynamics of NAD metabolism to develop efficient pharmacological intervention strategies to target this pathway. © 2015 Authors; published by Portland Press Limited.
Yang, Hong; Lin, Shan; Cui, Jingru
2014-02-10
Arsenic trioxide (ATO) is presently the most active single agent in the treatment of acute promyelocytic leukemia (APL). In order to explore the molecular mechanism of ATO in leukemia cells over time, we adopted a bioinformatics strategy to analyze expression changing patterns and changes in transcription regulation modules of time series genes filtered from the Gene Expression Omnibus database (GSE24946). In total, we screened out 1847 time series genes for subsequent analysis. The KEGG (Kyoto Encyclopedia of Genes and Genomes) pathway enrichment analysis of these genes showed that oxidative phosphorylation and ribosome were the top 2 significantly enriched pathways. STEM software was employed to compare changing patterns of gene expression with 50 assigned expression patterns. We screened out 7 significantly enriched patterns and 4 tendency charts of time series genes. The Gene Ontology results showed that functions of time series genes were mainly distributed in profiles 41, 40, 39 and 38. Seven genes with positive regulation of cell adhesion function were enriched in profile 40, and showed the same pattern as profile 40: an initial increase followed by a decrease. The transcription module analysis showed that they were mainly involved in the oxidative phosphorylation pathway and the ribosome pathway. Overall, our data summarized the gene expression changes over time in ATO-treated K562-r cell lines and suggested that time series genes mainly regulated cell adhesion. Furthermore, our results may provide a theoretical molecular biology basis for treating acute promyelocytic leukemia. Copyright © 2013 Elsevier B.V. All rights reserved.
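The abstract above does not state which statistical test was used for the KEGG pathway enrichment analysis. A common choice for this kind of over-representation analysis is a one-sided hypergeometric test, sketched here under that assumption; the function name and the toy numbers are illustrative only and are not taken from the study.

```python
from math import comb


def hypergeom_enrichment_p(background_size, pathway_size, selected_size, overlap):
    """One-sided hypergeometric p-value P(X >= overlap).

    background_size: number of genes in the background set
    pathway_size:    background genes annotated to the pathway
    selected_size:   genes in the list being tested (e.g. time series genes)
    overlap:         tested genes that fall in the pathway
    """
    max_overlap = min(pathway_size, selected_size)
    total = comb(background_size, selected_size)
    tail = sum(
        comb(pathway_size, k) * comb(background_size - pathway_size, selected_size - k)
        for k in range(overlap, max_overlap + 1)
    )
    return tail / total


# Toy example: 10 background genes, 4 in the pathway, 3 selected, 2 of them in the pathway.
p_value = hypergeom_enrichment_p(10, 4, 3, 2)
```

In practice the p-values for all tested pathways would additionally be corrected for multiple testing before pathways are called significantly enriched.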
Mokhtar, Morad M; Adawy, Sami S; El-Assal, Salah El-Din S; Hussein, Ebtissam H A
2016-01-01
The present investigation aimed to use bioinformatics tools to identify and characterize simple sequence repeats (SSRs) within the third version of the date palm genome and to develop a new SSR primer database. In addition, single nucleotide polymorphisms (SNPs) located within the SSR flanking regions were identified. Moreover, the pathways for the sequences assigned by SSR primers, the biological functions and gene interactions were determined. A total of 172,075 SSR motifs were identified in the date palm genome sequence with a frequency of 450.97 SSRs per Mb. Of these, 130,014 SSRs (75.6%) were located within the intergenic regions with a frequency of 499 SSRs per Mb, while only 42,061 SSRs (24.4%) were located within the genic regions with a frequency of 347.5 SSRs per Mb. A total of 111,403 SSR primer pairs were designed, representing 291.9 SSR primers per Mb. Of the 111,403, only 31,380 SSR primers were in the genic regions, while 80,023 primers were in the intergenic regions. A total of 250,507 SNPs were identified in 84,172 SSR flanking regions, which represent 75.55% of the total SSR flanking regions. Out of 12,274 genes, only 463 genes comprising 896 SSR primers were mapped onto 111 pathways using the KEGG database. The most abundant enzymes were identified in the pathway related to the biosynthesis of antibiotics. We tested 1031 SSR primers using both publicly available date palm genome sequences as templates in the in silico PCR reactions. For in vitro validation, 31 SSR primers among those used in the in silico PCR were synthesized and tested for their ability to detect polymorphism among six Egyptian date palm cultivars. All tested primers successfully amplified products, but only 18 primers detected polymorphic amplicons among the studied date palm cultivars.
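The abstract above does not name the SSR detection tool or its settings. As a rough sketch of how perfect SSRs can be located and how a density in SSRs per Mb is derived, the following uses regular expressions with assumed minimum repeat counts (10 for mono-, 6 for di-, 5 for tri- to hexanucleotide motifs); these thresholds, the function names and the test sequence are illustrative assumptions, not the study's parameters.

```python
import re

# Assumed minimum repeat counts per motif length (not taken from the abstract).
MIN_REPEATS = {1: 10, 2: 6, 3: 5, 4: 5, 5: 5, 6: 5}


def find_ssrs(sequence):
    """Return (start, motif, repeat_count) for perfect SSRs, sorted by start."""
    seq = sequence.upper()
    hits = []
    for motif_len, min_rep in MIN_REPEATS.items():
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (motif_len, min_rep - 1))
        for match in pattern.finditer(seq):
            motif = match.group(1)
            # Skip motifs that are repeats of a shorter unit (e.g. "AGAG" is just "AG").
            if any(
                motif == motif[:k] * (motif_len // k)
                for k in range(1, motif_len)
                if motif_len % k == 0
            ):
                continue
            hits.append((match.start(), motif, len(match.group(0)) // motif_len))
    return sorted(hits)


def ssr_per_mb(ssr_count, sequence_length_bp):
    """SSR density expressed as SSRs per megabase."""
    return ssr_count / (sequence_length_bp / 1_000_000)


toy_sequence = "TTC" + "AG" * 7 + "CCA" + "GAT" * 5 + "C"
ssrs = find_ssrs(toy_sequence)
```

On the toy sequence this finds a dinucleotide AG repeat (7 copies) and a trinucleotide GAT repeat (5 copies); the per-Mb figures quoted in the abstract follow from dividing such counts by the length of the scanned region in megabases.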
Xu, Xing-Li; Cheng, Tian-Yin; Yang, Hu; Yan, Fen; Yang, Ya
2015-06-01
Saliva plays an important role in feeding and pathogen transmission, and the identification and analysis of tick salivary gland (SG) proteins is considered a hot spot in anti-tick research. Herein, we present the first description of the SG transcriptome of Haemaphysalis flava using next-generation sequencing (NGS). A total of over 143 million high-quality reads were assembled into 54,357 unigenes, of which 20,145 (37.06%) had significant similarities to proteins in the Swiss-Prot database. 13,513 annotated sequences were associated with GO terms. Kyoto Encyclopedia of Genes and Genomes (KEGG) analysis showed that 14,280 unigenes were assigned to 279 KEGG pathways in total. Reads per kb per million reads (RPKM) analysis showed that there were 3035 down-regulated unigenes and 2260 up-regulated unigenes in the engorged ticks (ET) compared with the semi-engorged ticks (SET). Several important genes associated with blood feeding and ingestion as secreted salivary proteins, including cysteine, longipain, 4D8, calreticulin, metalloproteases, serine protease inhibitor, enolase, heat shock protein and AV422 in SG, were identified. The qRT-PCR results confirmed that the expression patterns of these genes (except for the longipain gene) were consistent with the RNA-seq results. This de novo assembly of the SG transcriptome of H. flava not only provides more opportunities for screening and cloning functional genes, but also forms a solid basis for further insight into the changes of salivary proteins during blood-feeding. Copyright © 2015 Elsevier B.V. All rights reserved.
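The RPKM normalization named in the abstract above follows the standard definition: reads mapped to a gene, scaled by gene length in kilobases and by the total number of mapped reads in millions. A minimal sketch is given below; the log2 fold-change helper with a pseudocount is an added assumption for comparing two conditions and is not described in the abstract.

```python
import math


def rpkm(read_count, gene_length_bp, total_mapped_reads):
    """Reads per kilobase of transcript per million mapped reads."""
    if gene_length_bp <= 0 or total_mapped_reads <= 0:
        raise ValueError("gene length and total mapped reads must be positive")
    return read_count * 1e9 / (gene_length_bp * total_mapped_reads)


def log2_fold_change(rpkm_a, rpkm_b, pseudocount=1.0):
    """log2 ratio of condition A over condition B.

    The pseudocount (an assumption of this sketch) avoids division by zero
    for unigenes with no reads in one condition.
    """
    return math.log2((rpkm_a + pseudocount) / (rpkm_b + pseudocount))


# Toy example: 500 reads on a 2,000 bp unigene in a library of 10 million mapped reads.
example_rpkm = rpkm(500, 2000, 10_000_000)
example_lfc = log2_fold_change(7.0, 1.0)
```

A positive log2 fold change would correspond to up-regulation in condition A relative to condition B; the cut-offs used to call the 3035 down-regulated and 2260 up-regulated unigenes are not given in the abstract.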
Genome-Wide Gene Set Analysis for Identification of Pathways Associated with Alcohol Dependence
Biernacka, Joanna M.; Geske, Jennifer; Jenkins, Gregory D.; Colby, Colin; Rider, David N.; Karpyak, Victor M.; Choi, Doo-Sup; Fridley, Brooke L.
2013-01-01
It is believed that multiple genetic variants with small individual effects contribute to the risk of alcohol dependence. Such polygenic effects are difficult to detect in genome-wide association studies that test for association of the phenotype with each single nucleotide polymorphism (SNP) individually. To overcome this challenge, gene set analysis (GSA) methods that jointly test for the effects of pre-defined groups of genes have been proposed. Rather than testing for association between the phenotype and individual SNPs, these analyses evaluate the global evidence of association with a set of related genes enabling the identification of cellular or molecular pathways or biological processes that play a role in development of the disease. It is hoped that by aggregating the evidence of association for all available SNPs in a group of related genes, these approaches will have enhanced power to detect genetic associations with complex traits. We performed GSA using data from a genome-wide study of 1165 alcohol dependent cases and 1379 controls from the Study of Addiction: Genetics and Environment (SAGE), for all 200 pathways listed in the Kyoto Encyclopedia of Genes and Genomes (KEGG) database. Results demonstrated a potential role of the “Synthesis and Degradation of Ketone Bodies” pathway. Our results also support the potential involvement of the “Neuroactive Ligand Receptor Interaction” pathway, which has previously been implicated in addictive disorders. These findings demonstrate the utility of GSA in the study of complex disease, and suggest specific directions for further research into the genetic architecture of alcohol dependence. PMID:22717047
Zhang, Shu; Sui, Zhenghong; Chang, Lianpeng; Kang, Kyoungho; Ma, Jinhua; Kong, Fanna; Zhou, Wei; Wang, Jinguo; Guo, Liliang; Geng, Huili; Zhong, Jie; Ma, Qingxia
2014-03-10
In this article, high-throughput de novo transcriptomic sequencing was performed in Alexandrium catenella, which provided the first view of the gene repertoire in this dinoflagellate based on next-generation sequencing (NGS) technologies. A total of 118,304 unigenes were identified with an average length of 673 bp (base pairs). Of these unigenes, 77,936 (65.9%) were annotated with known proteins based on sequence similarities, among which 24,149 and 22,956 unigenes were assigned to gene ontology categories (GO) and clusters of orthologous groups (COGs), respectively. Furthermore, 16,467 unigenes were mapped onto 322 pathways using the Kyoto Encyclopedia of Genes and Genomes Pathway database (KEGG). We also detected 1143 simple sequence repeats (SSRs), in which the tri-nucleotide repeat motif (69.3%) was the most abundant. The genetic facts and significance derived from the transcriptome dataset were suggested and discussed. All four core nucleosomal histones and linker histones were detected, in addition to the unigenes involved in histone modifications. A total of 190 unigenes were identified as being involved in the endocytosis pathway, and clathrin-dependent endocytosis was suggested to play a role in the heterotrophy of A. catenella. A conserved 22-nt spliced leader (SL) was identified in 21 unigenes, which suggested the existence of trans-splicing processing of mRNA in A. catenella. Crown Copyright © 2013. Published by Elsevier B.V. All rights reserved.
Huang, Cai; Mipam, Tserang Donko; Li, Jian
2016-01-01
Background Yaks (Bos grunniens) are an endemic species that adapts well to thin air, cold temperatures, and high altitude. They can survive in harsh plateau environments and are a major source of animal production for local residents, being an important breed in the Qinghai–Tibet Plateau. However, compared with ordinary cattle that live in the plains, yaks generally have lower fertility. Investigating the basic physiological molecular features of yak ovary and identifying the biological events underlying the differences between the ovaries of yak and plain cattle is necessary to understand the specificity of yak reproduction. Therefore, RNA-seq technology was applied to compare transcriptome data between the yak and plain cattle estrous ovaries. Results After deep sequencing, 3,653,032 clean reads with a total of 4,828,772,880 base pairs were obtained from the yak ovary library. Alignment analysis showed that 16,992 yak genes mapped to the yak genome, among which 12,731 and 14,631 genes were assigned to Gene Ontology (GO) categories and Kyoto Encyclopedia of Genes and Genomes (KEGG) pathways, respectively. Furthermore, comparison of yak and cattle ovary transcriptome data revealed that 1,307 genes were significantly differentially expressed between the two libraries, of which 661 genes were upregulated and 646 genes were downregulated in yak ovary. Functional analysis showed that the differentially expressed genes were involved in various GO categories and KEGG pathways. GO annotations indicated that the genes related to “cell adhesion,” “hormonal” biological processes, and “calcium ion binding,” “cation transmembrane transport” molecular events were significantly active. KEGG pathway analysis showed that the “complement and coagulation cascade” pathway was the most enriched in yak ovary transcriptome data, followed by the “cytochrome P450” related and “ECM–receptor interaction” pathways. 
Moreover, several novel pathways, such as “circadian rhythm,” were significantly enriched despite having no evident associations with the reproductive function. Conclusion Our findings provide a molecular resource for further investigation of the general molecular mechanism of yak ovary and offer new insights to understand comprehensively the specificity of yak reproduction. PMID:27044040
TabPath: interactive tables for metabolic pathway analysis.
Moraes, Lauro Ângelo Gonçalves de; Felestrino, Érica Barbosa; Assis, Renata de Almeida Barbosa; Matos, Diogo; Lima, Joubert de Castro; Lima, Leandro de Araújo; Almeida, Nalvo Franco; Setubal, João Carlos; Garcia, Camila Carrião Machado; Moreira, Leandro Marcio
2018-03-15
Information about metabolic pathways in a comparative context is one of the most powerful tools to help the understanding of genome-based differences in phenotypes among organisms. Although several platforms exist that provide a wealth of information on metabolic pathways of diverse organisms, the comparison among organisms using metabolic pathways is still a difficult task. We present TabPath (Tables for Metabolic Pathway), a web-based tool to facilitate comparison of metabolic pathways in genomes based on KEGG. From a selection of pathways and genomes of interest on the menu, TabPath generates user-friendly tables that facilitate analysis of variations in metabolism among the selected organisms. TabPath is available at http://200.239.132.160:8686.
2013-01-01
Background Contemporary coral reef research has firmly established that a genomic approach is urgently needed to better understand the effects of anthropogenic environmental stress and global climate change on coral holobiont interactions. Here we present KEGG orthology-based annotation of the complete genome sequence of the scleractinian coral Acropora digitifera and provide the first comprehensive view of the genome of a reef-building coral by applying advanced bioinformatics. Description Sequences from the KEGG database of protein function were used to construct hidden Markov models. These models were used to search the predicted proteome of A. digitifera to establish complete genomic annotation. The annotated dataset is published in ZoophyteBase, an open access format with different options for searching the data. A particularly useful feature is the ability to use a Google-like search engine that links query words to protein attributes. We present features of the annotation that underpin the molecular structure of key processes of coral physiology that include (1) regulatory proteins of symbiosis, (2) planula and early developmental proteins, (3) neural messengers, receptors and sensory proteins, (4) calcification and Ca2+-signalling proteins, (5) plant-derived proteins, (6) proteins of nitrogen metabolism, (7) DNA repair proteins, (8) stress response proteins, (9) antioxidant and redox-protective proteins, (10) proteins of cellular apoptosis, (11) microbial symbioses and pathogenicity proteins, (12) proteins of viral pathogenicity, (13) toxins and venom, (14) proteins of the chemical defensome and (15) coral epigenetics. 
Conclusions We advocate that providing annotation in an open-access searchable database available to the public domain will give an unprecedented foundation to interrogate the fundamental molecular structure and interactions of coral symbiosis and allow critical questions to be addressed at the genomic level based on combined aspects of evolutionary, developmental, metabolic, and environmental perspectives. PMID:23889801
Metabolic Pathway Assignment of Plant Genes based on Phylogenetic Profiling–A Feasibility Study
Weißenborn, Sandra; Walther, Dirk
2017-01-01
Despite many developed experimental and computational approaches, functional gene annotation remains challenging. With the rapidly growing number of sequenced genomes, the concept of phylogenetic profiling, which predicts functional links between genes that share a common co-occurrence pattern across different genomes, has gained renewed attention as it promises to annotate gene functions based on presence/absence calls alone. We applied phylogenetic profiling to the problem of metabolic pathway assignments of plant genes with a particular focus on secondary metabolism pathways. We determined phylogenetic profiles for 40,960 metabolic pathway enzyme genes with assigned EC numbers from 24 plant species based on sequence and pathway annotation data from KEGG and Ensembl Plants. For gene sequence family assignments, needed to determine the presence or absence of particular gene functions in the given plant species, we included data of all 39 species available at the Ensembl Plants database and established gene families based on pairwise sequence identities and annotation information. Aside from performing profiling comparisons, we used machine learning approaches to predict pathway associations from phylogenetic profiles alone. Selected metabolic pathways were indeed found to be composed of gene families of greater than expected phylogenetic profile similarity. This was particularly evident for primary metabolism pathways, whereas for secondary pathways, both the available annotation in different species as well as the abstraction of functional association via distinct pathways proved limiting. While phylogenetic profile similarity was generally not found to correlate with gene co-expression, direct physical interactions of proteins were reflected by a significantly increased profile similarity suggesting an application of phylogenetic profiling methods as a filtering step in the identification of protein-protein interactions. 
This feasibility study highlights the potential and challenges associated with phylogenetic profiling methods for the detection of functional relationships between genes as well as the need to enlarge the set of plant genes with proven secondary metabolism involvement as well as the limitations of distinct pathways as abstractions of relationships between genes. PMID:29163570
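The core idea of phylogenetic profiling described in the abstract above, comparing presence/absence patterns of gene families across genomes, can be sketched as follows. The Jaccard index is used here as one possible similarity measure; the abstract does not say which measure the authors applied, and the family names and six-species profiles are hypothetical.

```python
def jaccard_similarity(profile_a, profile_b):
    """Jaccard index of two presence/absence profiles.

    Each profile is a sequence of 0/1 calls, one per species, in the same
    species order. Returns 0.0 when both families are absent everywhere.
    """
    if len(profile_a) != len(profile_b):
        raise ValueError("profiles must cover the same species")
    both = sum(1 for a, b in zip(profile_a, profile_b) if a and b)
    either = sum(1 for a, b in zip(profile_a, profile_b) if a or b)
    return both / either if either else 0.0


# Hypothetical gene families scored across six plant species.
profiles = {
    "family_1": [1, 1, 0, 1, 0, 0],
    "family_2": [1, 1, 0, 0, 0, 1],
    "family_3": [1, 1, 0, 1, 0, 0],
}

sim_12 = jaccard_similarity(profiles["family_1"], profiles["family_2"])
sim_13 = jaccard_similarity(profiles["family_1"], profiles["family_3"])
```

Families with identical co-occurrence patterns (family_1 and family_3 here) score 1.0 and would be candidates for a shared pathway, whereas partially overlapping patterns score lower.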
Zhao, Min; Li, XiaoMo; Qu, Hong
2013-12-01
Eating disorder is a group of physiological and psychological disorders affecting approximately 1% of the female population worldwide. Although the genetic epidemiology of eating disorder is becoming increasingly clear with accumulated studies, the underlying molecular mechanisms are still unclear. Recently, integration of various high-throughput data expanded the range of candidate genes and started to generate hypotheses for understanding potential pathogenesis in complex diseases. This article presents EDdb (Eating Disorder database), the first evidence-based gene resource for eating disorder. Fifty-nine experimentally validated genes from the literature in relation to eating disorder were collected as the core dataset. Another four datasets with 2824 candidate genes across 601 genome regions were expanded based on the core dataset using different criteria (e.g., protein-protein interactions, shared cytobands, and related complex diseases). Based on human protein-protein interaction data, we reconstructed a potential molecular sub-network related to eating disorder. Furthermore, with an integrative pathway enrichment analysis of genes in EDdb, we identified an extended adipocytokine signaling pathway in eating disorder. Three genes in EDdb (ADIPO (adiponectin), TNF (tumor necrosis factor) and NR3C1 (nuclear receptor subfamily 3, group C, member 1)) link the KEGG (Kyoto Encyclopedia of Genes and Genomes) "adipocytokine signaling pathway" with the BioCarta "visceral fat deposits and the metabolic syndrome" pathway to form a joint pathway. In total, the joint pathway contains 43 genes, among which 39 genes are related to eating disorder. As the first comprehensive gene resource for eating disorder, EDdb ( http://eddb.cbi.pku.edu.cn ) enables the exploration of gene-disease relationships and cross-talk mechanisms between related disorders. 
Through pathway statistical studies, we revealed that abnormal body weight caused by eating disorder and obesity may both be related to dysregulation of the novel joint pathway of adipocytokine signaling. In addition, this joint pathway may be the common pathway for body weight regulation in complex human diseases related to unhealthy lifestyle.
Alteration of metabolite profiling by cold atmospheric plasma treatment in human myeloma cells.
Xu, Dehui; Xu, Yujing; Ning, Ning; Cui, Qingjie; Liu, Zhijie; Wang, Xiaohua; Liu, Dingxin; Chen, Hailan; Kong, Michael G
2018-01-01
Despite recent progress in chemotherapy for the clinical treatment of multiple myeloma (MM), MM is still a refractory disease, and new technology is needed to improve outcomes and prolong survival. Cold atmospheric plasma is a technology that has developed rapidly in recent years and has been widely applied in biomedicine. Although plasma can efficiently inactivate various tumor cells, the effects of plasma on tumor cell metabolism have not yet been studied. In this study, we investigated the metabolite profile of myeloma tumor cells after He plasma treatment by gas-chromatography time-of-flight (GC-TOF) mass-spectrometry. In addition, using bioinformatic analyses such as GO and KEGG analysis, we sought to identify the metabolic pathways that were significantly affected by gas plasma treatment. By GC-TOF mass-spectrometry, 573 signals were detected and evaluated using PCA and OPLS-DA. By KEGG analysis, we listed all the differential metabolites and classified them into different metabolic pathways. The results showed that the beta-alanine metabolism pathway was the most significantly changed after He gas plasma treatment in myeloma cells. In addition, propanoate metabolism and linoleic acid metabolism also warrant attention during gas plasma treatment of cancer cells. Cold atmospheric plasma treatment could significantly alter the metabolite profile of myeloma tumor cells, among which the beta-alanine metabolism pathway is the most susceptible to He gas plasma treatment.
PUFA diets alter the microRNA expression profiles in an inflammation rat model
ZHENG, ZHENG; GE, YINLIN; ZHANG, JINYU; XUE, MEILAN; LI, QUAN; LIN, DONGLIANG; MA, WENHUI
2015-01-01
Omega-3 and -6 polyunsaturated fatty acids (PUFAs) can directly or indirectly regulate immune homeostasis via inflammatory pathways, and components of these pathways are crucial targets of microRNAs (miRNAs). However, no study has examined the changes in the miRNA transcriptome during PUFA-regulated inflammatory processes. Here, we established PUFA diet-induced autoimmune-prone (AP) and autoimmune-averse (AA) rat models, and studied their physical characteristics and immune status. Additionally, miRNA expression patterns in the rat models were compared using microarray assays and bioinformatic methods. A total of 54 miRNAs were differentially expressed in common between the AP and the AA rats, and the changes in rno-miR-19b-3p, -146b-5p and -183-5p expression were validated using stem-loop reverse transcription-quantitative polymerase chain reaction. To better understand the mechanisms underlying PUFA-regulated miRNA changes during inflammation, computational algorithms and biological databases were used to identify the target genes of the three validated miRNAs. Furthermore, Gene Ontology (GO) term annotation and KEGG pathway analyses of the miRNA targets further allowed to explore the potential implication of the miRNAs in inflammatory pathways. The predicted PUFA-regulated inflammatory pathways included the Toll-like receptor (TLR), T cell receptor (TCR), NOD-like receptor (NLR), RIG-I-like receptor (RLR), mitogen-activated protein kinase (MAPK) and the transforming growth factor-β (TGF-β) pathway. This study is the first report, to the best of our knowledge, on in vivo comparative profiling of miRNA transcriptomes in PUFA diet-induced inflammatory rat models using a microarray approach. The results provide a useful resource for future investigation of the role of PUFA-regulated miRNAs in immune homeostasis. PMID:25672643
Li, Hong-Mei; Yang, Hong; Wen, Dong-Yue; Luo, Yi-Huan; Liang, Chun-Yan; Pan, Deng-Hua; Ma, Wei; Chen, Gang; He, Yun; Chen, Jun-Qiang
2017-05-01
The role of long non-coding RNA (lncRNA) HOX transcript antisense RNA (HOTAIR) in thyroid carcinoma (TC) remains unclear. The current study aimed to assess the clinical value of HOTAIR expression levels in TC based on publicly available data and to evaluate its potential signaling pathways. The expression data of HOTAIR and clinical information concerning TC were downloaded from The Cancer Genome Atlas (TCGA) and Gene Expression Omnibus (GEO), respectively. Furthermore, 3 online biological databases, Starbase, Cbioportal, and Multi Experiment Matrix, were used to identify HOTAIR-related genes in TC. Gene Ontology (GO), Kyoto Encyclopedia of Genes and Genomes (KEGG), and Panther pathway analyses were then undertaken to study the most enriched signaling pathways in TC (EASE score<0.1, Bonferroni<0.05). The TCGA results demonstrated that the expression level of HOTAIR in TC tissues was significantly increased compared with non-cancerous tissues (p<0.001). HOTAIR over-expression was significantly associated with poor survival in TC patients (p=0.03). Meta-analyses of GEO datasets revealed a trend consistent with the above results on HOTAIR expression levels in TC (SMD=0.23; 95%CI, 0.00-0.45; p=0.047). Finally, the results of functional analysis for HOTAIR-related genes indicated that HOTAIR might participate in tumorigenesis via the Wnt signaling pathway. In conclusion, our study demonstrates that HOTAIR may be involved in thyroid carcinogenesis, and the over-expression of HOTAIR could act as a biomarker associated with a poor outcome in TC patients. Moreover, the Wnt signaling pathway may be the key pathway regulated by HOTAIR in TC. © Georg Thieme Verlag KG Stuttgart · New York.
Tang, Hongwei; Wei, Peng; Duell, Eric J; Risch, Harvey A; Olson, Sara H; Bueno-de-Mesquita, H Bas; Gallinger, Steven; Holly, Elizabeth A; Petersen, Gloria; Bracci, Paige M; McWilliams, Robert R; Jenab, Mazda; Riboli, Elio; Tjønneland, Anne; Boutron-Ruault, Marie Christine; Kaaks, Rudolph; Trichopoulos, Dimitrios; Panico, Salvatore; Sund, Malin; Peeters, Petra H M; Khaw, Kay-Tee; Amos, Christopher I; Li, Donghui
2014-05-01
Cigarette smoking is the best established modifiable risk factor for pancreatic cancer. Genetic factors that underlie smoking-related pancreatic cancer have previously not been examined at the genome-wide level. Taking advantage of the existing genome-wide association study (GWAS) genotype and risk factor data from the Pancreatic Cancer Case Control Consortium, we conducted a discovery study in 2028 cases and 2109 controls to examine gene-smoking interactions at the pathway/gene/single nucleotide polymorphism (SNP) level. Using the likelihood ratio test nested in logistic regression models and ingenuity pathway analysis (IPA), we examined 172 KEGG (Kyoto Encyclopedia of Genes and Genomes) pathways, 3 manually curated gene sets, 3 nicotine dependency gene ontology pathways, 17 912 genes and 468 114 SNPs. None of the individual pathways/genes/SNPs showed significant interaction with smoking after adjusting for multiple comparisons. Six KEGG pathways showed nominal interactions (P < 0.05) with smoking, and the top two are the pancreatic secretion and salivary secretion pathways (major contributing genes: RAB8A, PLCB and CTRB1). Nine genes, i.e. ZBED2, EXO1, PSG2, SLC36A1, CLSTN1, MTHFSD, FAT2, IL10RB and ATXN2 had P interaction < 0.0005. Five intergenic region SNPs and two SNPs of the EVC and KCNIP4 genes had P interaction < 0.00003. In IPA analysis of genes with nominal interactions with smoking, axonal guidance signaling ($P = 2.12 \times 10^{-7}$) and α-adrenergic signaling ($P = 2.52 \times 10^{-5}$) genes were significantly overrepresented canonical pathways. Genes contributing to the axon guidance signaling pathway included the SLIT/ROBO signaling genes that were frequently altered in pancreatic cancer. These observations need to be confirmed in additional data sets. Once confirmed, they will open a new avenue to unveiling the etiology of smoking-associated pancreatic cancer.
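The testing logic in the abstract above (a likelihood ratio test between nested logistic models, followed by a multiple-comparison adjustment) can be illustrated with a minimal sketch. It rests on several assumptions not stated in the abstract: a single interaction term (1 degree of freedom), a Bonferroni adjustment, and hypothetical log-likelihood values. The function names are illustrative only.

```python
from math import erfc, sqrt

def lrt_pvalue_1df(loglik_reduced, loglik_full):
    """P-value of a likelihood ratio test with 1 degree of freedom.

    Assumption: the full model adds exactly one gene-by-smoking
    interaction term to the reduced model.
    """
    stat = 2.0 * (loglik_full - loglik_reduced)
    if stat < 0:
        raise ValueError("full model must fit at least as well as reduced model")
    # Chi-square survival function for df = 1: P(X > x) = erfc(sqrt(x / 2)).
    return erfc(sqrt(stat / 2.0))

def bonferroni_threshold(alpha, n_tests):
    """Per-test significance threshold under a Bonferroni adjustment."""
    return alpha / n_tests

# Hypothetical log-likelihoods for one pathway-level test.
p = lrt_pvalue_1df(loglik_reduced=-2800.0, loglik_full=-2797.5)
# 172 is the number of KEGG pathways examined in the abstract.
threshold = bonferroni_threshold(0.05, 172)
print(f"p = {p:.4g}, threshold = {threshold:.4g}, significant = {p < threshold}")
```

With thresholds of this kind, a nominal P < 0.05 can easily fail to survive adjustment, which is consistent with the pattern the abstract reports.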
Ouyang, Kunxi; Li, Juncheng; Zhao, Xianhai; Que, Qingmin; Li, Pei; Huang, Hao; Deng, Xiaomei; Singh, Sunil Kumar; Wu, Ai-Min; Chen, Xiaoyang
2016-01-01
Neolamarckia cadamba is a fast-growing tropical hardwood tree that is used extensively for plywood and pulp production, light furniture fabrication, building materials, and as a raw material for the preparation of certain indigenous medicines. Lack of genomic resources hampers progress in the molecular breeding and genetic improvement of this multipurpose tree species. In this study, transcriptome profiling of differentiating stems was performed to understand N. cadamba xylogenesis. The N. cadamba transcriptome was sequenced using Illumina paired-end sequencing technology. This generated 42.49 G of raw data that was then de novo assembled into 55,432 UniGenes with a mean length of 803.2 bp. Approximately 47.8% of the UniGenes (26,487) were annotated against publicly available protein databases, among which 21,699 and 7,754 UniGenes were assigned to Gene Ontology categories (GO) and Clusters of Orthologous Groups (COG), respectively. 5,589 UniGenes could be mapped onto 116 pathways using the Kyoto Encyclopedia of Genes and Genomes (KEGG) pathway database. Among 6,202 UniGenes exhibiting differential expression during xylogenesis, 1,634 showed significantly higher levels of expression in the basal and middle stem segments compared to the apical stem segment. These genes included NAC and MYB transcription factors related to secondary cell wall biosynthesis, genes related to most metabolic steps of lignin biosynthesis, and CesA genes involved in cellulose biosynthesis. This study lays the foundation for further screening of key genes associated with xylogenesis in N. cadamba as well as enhancing our understanding of the mechanism of xylogenesis in fast-growing trees.
Mantello, Camila Campos; Cardoso-Silva, Claudio Benicio; da Silva, Carla Cristina; de Souza, Livia Moura; Scaloppi Junior, Erivaldo José; de Souza Gonçalves, Paulo; Vicentini, Renato; de Souza, Anete Pereira
2014-01-01
Hevea brasiliensis (Willd. Ex Adr. Juss.) Muell.-Arg. is the primary source of natural rubber that is native to the Amazon rainforest. The singular properties of natural rubber make it superior to and competitive with synthetic rubber for use in several applications. Here, we performed RNA sequencing (RNA-seq) of H. brasiliensis bark on the Illumina GAIIx platform, which generated 179,326,804 raw reads. A total of 50,384 contigs that were over 400 bp in size were obtained and subjected to further analyses. A similarity search against the non-redundant (nr) protein database returned 32,018 (63%) positive BLASTx hits. The transcriptome was annotated using the clusters of orthologous groups (COG), gene ontology (GO), Kyoto Encyclopedia of Genes and Genomes (KEGG), and Pfam databases. A search for putative molecular markers was performed to identify simple sequence repeats (SSRs) and single nucleotide polymorphisms (SNPs). In total, 17,927 SSRs and 404,114 SNPs were detected. Finally, we selected sequences that were identified as belonging to the mevalonate (MVA) and 2-C-methyl-D-erythritol 4-phosphate (MEP) pathways, which are involved in rubber biosynthesis, to validate the SNP markers. A total of 78 SNPs were validated in 36 genotypes of H. brasiliensis. This new dataset represents a powerful information source for rubber tree bark genes and will be an important tool for the development of microsatellites and SNP markers for use in future genetic analyses such as genetic linkage mapping, quantitative trait loci identification, investigations of linkage disequilibrium and marker-assisted selection.
Ke, Tao; Yu, Jingyin; Dong, Caihua; Mao, Han; Hua, Wei; Liu, Shengyi
2015-01-21
Oil crop seeds are important sources of fatty acids (FAs) for human and animal nutrition. Despite their importance, there is a lack of an essential bioinformatics resource on gene transcription of oil crops from a comparative perspective. In this study, we developed ocsESTdb, the first database of expressed sequence tag (EST) information on seeds of four large-scale oil crops with an emphasis on global metabolic networks and oil accumulation metabolism that target the involved unigenes. A total of 248,522 ESTs and 106,835 unigenes were collected from the cDNA libraries of rapeseed (Brassica napus), soybean (Glycine max), sesame (Sesamum indicum) and peanut (Arachis hypogaea). These unigenes were annotated by a sequence similarity search against databases including TAIR, NR protein database, Gene Ontology, COG, Swiss-Prot, TrEMBL and Kyoto Encyclopedia of Genes and Genomes (KEGG). Five genome-scale metabolic networks that contain different numbers of metabolites and gene-enzyme reaction-association entries were analysed and constructed using Cytoscape and yEd programs. Details of unigene entries, deduced amino acid sequences and putative annotation are available from our database to browse, search and download. Intuitive and graphical representations of EST/unigene sequences, functional annotations, metabolic pathways and metabolic networks are also available. ocsESTdb will be updated regularly and can be freely accessed at http://ocri-genomics.org/ocsESTdb/ . ocsESTdb may serve as a valuable and unique resource for comparative analysis of acyl lipid synthesis and metabolism in oilseed plants. It also may provide vital insights into improving oil content in seeds of oil crop species by transcriptional reconstruction of the metabolic network.
De novo transcriptomic analysis and development of EST-SSRs for Sorbus pohuashanensis (Hance) Hedl.
Guan, Xuelian; Fu, Qiang; Zhang, Ze; Hu, Zenghui; Zheng, Jian; Lu, Yizeng; Li, Wei
2017-01-01
Sorbus pohuashanensis is a native tree species of northern China that is used for a variety of ecological purposes. The species is often grown as an ornamental landscape tree because of its beautiful form, silver flowers in early summer, attractive pinnate leaves in summer, and red leaves and fruits in autumn. However, development and further utilization of the species are hindered by the lack of comprehensive genetic information, which impedes research into its genetics and molecular biology. Recent advances in de novo transcriptome sequencing (RNA-seq) technology have provided an effective means to obtain genomic information from non-model species. Here, we applied RNA-seq for sequencing S. pohuashanensis leaves and obtained a total of 137,506 clean reads. After assembly, 96,213 unigenes with an average length of 770 bp were obtained. We found that 64.5% of the unigenes could be annotated using bioinformatics tools to analyze gene function and alignment with the NCBI database. Overall, 59,089 unigenes were annotated using the Nr (non-redundant protein) database, 35,225 unigenes were annotated using the GO (Gene Ontology) database, and 33,168 unigenes were annotated using the COG (Clusters of Orthologous Groups) database. Analysis of the unigenes using the KEGG (Kyoto Encyclopedia of Genes and Genomes) database indicated that 13,953 unigenes were involved in 322 metabolic pathways. Finally, simple sequence repeat (SSR) site detection identified 6,604 unigenes that included EST-SSRs and a total of 7,473 EST-SSRs in the unigene sequences. Fifteen polymorphic SSRs were screened and found to be of use for future genetic research. These unigene sequences will provide important genetic resources for genetic improvement and investigation of biochemical processes in S. pohuashanensis. PMID:28614366
Brito, Rory C. F.; Guimarães, Frederico G.; Velloso, João P. L.; Corrêa-Oliveira, Rodrigo; Ruiz, Jeronimo C.; Reis, Alexandre B.; Resende, Daniela M.
2017-01-01
Leishmaniasis is a wide-spectrum disease caused by parasites of the genus Leishmania. There is no human vaccine available, although many studies consider one a potentially effective tool for disease control. To discover novel antigens, computational programs have been used in reverse vaccinology strategies. In this work, we developed an antigen validation approach that integrates prediction of B and T cell epitopes, analysis of Protein-Protein Interaction (PPI) networks and metabolic pathways. We selected twenty candidate proteins from Leishmania tested in a murine model, with experimental outcomes published in the literature. The predictions for CD4+ and CD8+ T cell epitopes were correlated with protection in experimental outcomes. We also mapped immunogenic proteins on PPI networks in order to find Kyoto Encyclopedia of Genes and Genomes (KEGG) pathways associated with them. Our results suggest that non-protective antigens have the lowest frequency of predicted T CD4+ and T CD8+ epitopes, compared with protective ones. T CD4+ and T CD8+ cells are more related to leishmaniasis protection in experimental outcomes than B cell predicted epitopes. Considering KEGG analysis, the proteins considered protective are connected to nodes with few pathways, including those associated with ribosome biosynthesis and purine metabolism. PMID:28208616
Brito, Rory C F; Guimarães, Frederico G; Velloso, João P L; Corrêa-Oliveira, Rodrigo; Ruiz, Jeronimo C; Reis, Alexandre B; Resende, Daniela M
2017-02-10
Leishmaniasis is a wide-spectrum disease caused by parasites of the genus Leishmania. There is no human vaccine available, although many studies consider one a potentially effective tool for disease control. To discover novel antigens, computational programs have been used in reverse vaccinology strategies. In this work, we developed an antigen validation approach that integrates prediction of B and T cell epitopes, analysis of Protein-Protein Interaction (PPI) networks and metabolic pathways. We selected twenty candidate proteins from Leishmania tested in a murine model, with experimental outcomes published in the literature. The predictions for CD4⁺ and CD8⁺ T cell epitopes were correlated with protection in experimental outcomes. We also mapped immunogenic proteins on PPI networks in order to find Kyoto Encyclopedia of Genes and Genomes (KEGG) pathways associated with them. Our results suggest that non-protective antigens have the lowest frequency of predicted T CD4⁺ and T CD8⁺ epitopes, compared with protective ones. T CD4⁺ and T CD8⁺ cells are more related to leishmaniasis protection in experimental outcomes than B cell predicted epitopes. Considering KEGG analysis, the proteins considered protective are connected to nodes with few pathways, including those associated with ribosome biosynthesis and purine metabolism.
Determining the semantic similarities among Gene Ontology terms.
Taha, Kamal
2013-05-01
We present in this paper novel techniques that determine the semantic relationships among Gene Ontology (GO) terms. We implemented these techniques in a prototype system called GoSE, which resides between the user application and the GO database. Given a set S of GO terms, GoSE returns another set S' of GO terms, where each term in S' is semantically related to each term in S. Most current research is focused on determining the semantic similarities among GO terms based solely on their IDs and proximity to one another in the GO graph structure, while overlooking the contexts of the terms, which may lead to erroneous results. The context of a GO term T is the set of other terms whose existence in the GO graph structure is dependent on T. We propose novel techniques that determine the contexts of terms based on the concept of existence dependency. We present a stack-based sort-merge algorithm employing these techniques for determining the semantic similarities among GO terms. We evaluated GoSE experimentally and compared it with three existing methods. The results of measuring the semantic similarities among genes in KEGG and Pfam pathways retrieved from the DBGET and Sanger Pfam databases, respectively, have shown that our method outperforms the other three methods in recall and precision.
The functional cancer map: a systems-level synopsis of genetic deregulation in cancer.
Krupp, Markus; Maass, Thorsten; Marquardt, Jens U; Staib, Frank; Bauer, Tobias; König, Rainer; Biesterfeld, Stefan; Galle, Peter R; Tresch, Achim; Teufel, Andreas
2011-06-30
Cancer cells are characterized by massive dysregulation of physiological cell functions with considerable disruption of transcriptional regulation. Genome-wide transcriptome profiling can be utilized for early detection and molecular classification of cancers. Accurate discrimination of functionally different tumor types may help to guide selection of targeted therapy in translational research. Concise grouping of tumor types in cancer maps according to their molecular profile may further be helpful for the development of new therapeutic modalities or open new avenues for already established therapies. Complete available human tumor data of the Stanford Microarray Database was downloaded and filtered for relevance, adequacy and reliability. A total of 649 tumor samples from more than 1400 experiments and 58 different tissues were analyzed. Next, a method to score deregulation of KEGG pathway maps in different tumor entities was established, which was then used to convert hundreds of gene expression profiles into corresponding tumor-specific pathway activity profiles. Based on the latter, we defined a measure for functional similarity between tumor entities, which yielded a phylogeny of tumors. We provide a comprehensive, easy-to-interpret functional cancer map that characterizes tumor types with respect to their biological and functional behavior. Consistently, multiple pathways commonly associated with tumor progression were revealed as common features in the majority of the tumors. However, several pathways previously not linked to carcinogenesis were identified in multiple cancers, suggesting an essential role of these pathways in cancer biology. Among these pathways were 'ECM-receptor interaction', 'Complement and Coagulation cascades', and 'PPAR signaling pathway'. The functional cancer map provides a systematic view on molecular similarities across different cancers by comparing tumors on the level of pathway activity.
This work resulted in identification of novel superimposed functional pathways potentially linked to cancer biology. Therefore, our work may serve as a starting point for rationalizing combination of tumor therapeutics as well as for expanding the application of well-established targeted tumor therapies.
ExplorEnz: a MySQL database of the IUBMB enzyme nomenclature.
McDonald, Andrew G; Boyce, Sinéad; Moss, Gerard P; Dixon, Henry B F; Tipton, Keith F
2007-07-27
We describe the database ExplorEnz, which is the primary repository for EC numbers and enzyme data that are being curated on behalf of the IUBMB. The enzyme nomenclature is incorporated into many other resources, including the ExPASy-ENZYME, BRENDA and KEGG bioinformatics databases. The data, which are stored in a MySQL database, preserve the formatting of chemical and enzyme names. A simple, easy to use, web-based query interface is provided, along with an advanced search engine for more complex queries. The database is publicly available at http://www.enzyme-database.org. The data are available for download as SQL and XML files via FTP. ExplorEnz has powerful and flexible search capabilities and provides the scientific community with the most up-to-date version of the IUBMB Enzyme List.
Proteomic analysis of human follicular fluid associated with successful in vitro fertilization.
Shen, Xiaofang; Liu, Xin; Zhu, Peng; Zhang, Yuhua; Wang, Jiahui; Wang, Yanwei; Wang, Wenting; Liu, Juan; Li, Ning; Liu, Fujun
2017-07-27
Human follicular fluid (HFF) provides a key environment for follicle development and oocyte maturation, and contributes to oocyte quality and in vitro fertilization (IVF) outcome. To better understand folliculogenesis in the ovary, a proteomic strategy based on dual reverse phase high performance liquid chromatography (RP-HPLC) coupled to matrix-assisted laser desorption/ionization time-of-flight tandem mass spectrometry (LC-MALDI TOF/TOF MS) was used to investigate the protein profile of HFF from women undergoing successful IVF. A total of 219 unique high-confidence (False Discovery Rate (FDR) < 0.01) HFF proteins were identified by searching the reviewed Swiss-Prot human database (20,183 sequences), and MS data were further verified by western blot. PANTHER showed HFF proteins were involved in complement and coagulation cascade, growth factor and hormone, immunity, and transportation, KEGG indicated their pathway, and STRING demonstrated their interaction networks. In comparison, 32% and 50% of proteins have not been reported in previous human follicular fluid and plasma. Our HFF proteome research provided a new complementary high-confidence dataset of folliculogenesis and oocyte maturation environment. Those proteins associated with innate immunity, complement cascade, blood coagulation, and angiogenesis might serve as the biomarkers of female infertility and IVF outcome, and their pathways facilitated a complete exhibition of reproductive process.
Nuñez-Acuña, Gustavo; Valenzuela-Muñoz, Valentina; Gallardo-Escárate, Cristian
2014-06-01
The salmon louse Caligus rogercresseyi is the dominant ectoparasite species affecting the salmon aquaculture industry in the Southern hemisphere, and it is currently the main cause for economic losses in Chilean aquaculture. However, despite the great concern over Caligus infestations, genomic information on this louse is still scarce, even while the need to develop high-resolution molecular markers is growing. This study provides the first deep transcriptome survey to identify thousands of SNP markers from C. rogercresseyi, with a total of 69,466 SNPs identified using the MiSeq platform (Illumina®), 30,605 (52%) of which were found in contigs successfully annotated against known protein databases. Furthermore, in silico gene expression profiles associated with SNP variants were evaluated, and the results evidenced a wide array of genes that were down- and upregulated throughout the developmental stages of C. rogercresseyi. Interestingly, putative KEGG pathways involved in resistance to antiparasitic agents were also identified, where ten pathways were associated with the nervous system and one was related to ABC transporters. Taken together, this information could be highly useful for investigating the molecular underpinnings involved in the susceptibility or resistance of salmon lice to chemical treatments. Copyright © 2014 Elsevier Inc. All rights reserved.
Gilany, Kambiz; Minai-Tehrani, Arash; Savadi-Shiraz, Elham; Rezadoost, Hassan; Lakpour, Niknam
2015-01-01
The human seminal fluid is a complex body fluid. It is not known how many proteins are expressed in the seminal plasma; however, by analogy with blood, up to 10,000 proteins may be expressed. Human seminal fluid is a rich source of potential biomarkers for male infertility and reproductive disorders. In this review, the growing list of proteins identified in human seminal fluid was compiled. To date, 4188 redundant seminal fluid proteins have been identified using different proteomics technologies, including 2-DE, SDS-PAGE-LC-MS/MS, and MudPIT. This list was reduced to a database of 2168 non-redundant proteins using the UniProtKB/Swiss-Prot reviewed database. Core features of the proteome were analyzed, including pI, MW, amino acid, chromosome and PTM distributions in the human seminal plasma proteome. Additionally, biological processes, molecular functions and KEGG pathways were investigated using DAVID software. Finally, biomarkers identified so far in different male reproductive system disorders using proteomics platforms were reviewed. In this study, an attempt was made to update the human seminal plasma proteome database. Our findings showed that human seminal plasma studies to date seem to have converged on a set of proteins that are repeatedly identified in many studies and that represent only a small fraction of the entire human seminal plasma proteome.
Mardi, Mohsen; Karimi Farsad, Laleh; Gharechahi, Javad; Salekdeh, Ghasem Hosseini
2015-01-01
Witches' broom disease of acid lime greatly affects the production of Mexican lime in Iran. It is caused by a phytoplasma (Candidatus Phytoplasma aurantifolia). However, the molecular mechanisms that underlie phytoplasma pathogenicity and the mode of interactions with host plants are largely unknown. Here, high-throughput transcriptome sequencing was conducted to explore gene expression signatures associated with phytoplasma infection in Mexican lime trees. We assembled 78,185 unique transcript sequences (unigenes) with an average length of 530 nt. Of these, 41,805 (53.4%) were annotated against the NCBI non-redundant (nr) protein database using a BLASTx search (e-value ≤ 1e-5). When the abundances of unigenes in healthy and infected plants were compared, 2,805 transcripts showed significant differences (false discovery rate ≤ 0.001 and log2 ratio ≥ 1.5). These differentially expressed genes (DEGs) were significantly enriched in 43 KEGG metabolic and regulatory pathways. The up-regulated DEGs were mainly categorized into pathways with possible implication in plant-pathogen interaction, including cell wall biogenesis and degradation, sucrose metabolism, secondary metabolism, hormone biosynthesis and signalling, amino acid and lipid metabolism, while down-regulated DEGs were predominantly enriched in ubiquitin proteolysis and oxidative phosphorylation pathways. Our analysis provides novel insight into the molecular pathways that are deregulated during the host-pathogen interaction in Mexican lime trees infected by phytoplasma. The findings can be valuable for unravelling the molecular mechanisms of plant-phytoplasma interactions and can pave the way for engineering lime trees with resistance to witches' broom disease.
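The differential-expression cutoffs quoted in the abstract above (false discovery rate ≤ 0.001 and log2 ratio ≥ 1.5) can be sketched as a simple filter. This is only an illustration under stated assumptions: the log2 ratio is taken as log2(infected / healthy), the threshold is applied to its absolute value so that both up- and down-regulated transcripts are kept, and the `Unigene` record and all demo values are hypothetical.

```python
from dataclasses import dataclass

@dataclass
class Unigene:
    name: str
    log2_ratio: float  # assumed to be log2(infected / healthy)
    fdr: float         # false discovery rate for the comparison

def select_degs(unigenes, fdr_max=0.001, min_abs_log2=1.5):
    """Return (up, down) name lists for unigenes passing both cutoffs."""
    up, down = [], []
    for u in unigenes:
        if u.fdr <= fdr_max and abs(u.log2_ratio) >= min_abs_log2:
            (up if u.log2_ratio > 0 else down).append(u.name)
    return up, down

# Hypothetical records, not data from the study.
demo = [
    Unigene("unigene_a", 2.3, 0.0002),   # passes, up-regulated
    Unigene("unigene_b", -1.8, 0.0009),  # passes, down-regulated
    Unigene("unigene_c", 1.2, 0.0001),   # fold change too small
    Unigene("unigene_d", 3.0, 0.02),     # FDR too high
]
up, down = select_degs(demo)
print("up:", up, "down:", down)
```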
Gene expression profiles of fin regeneration in loach (Paramisgurnus dabryanu).
Li, Li; He, Jingya; Wang, Linlin; Chen, Weihua; Chang, Zhongjie
2017-11-01
Teleost fins can regenerate accurate position-matched structure and function after amputation. However, we still lack systematic transcriptional profiling and methodologies to understand the molecular basis of fin regeneration. After histological analysis, we established a suppression subtraction hybridization library containing 418 distinct sequences expressed differentially during the process of blastema formation and differentiation in caudal fin regeneration. Genome ontology and comparative analysis of differential distribution of our data and the reference zebrafish genome showed notable subcategories, including multi-organism processes, response to stimuli, extracellular matrix, antioxidant activity, and cell junction function. KEGG pathway analysis allowed the effective identification of relevant genes in those pathways involved in tissue morphogenesis and regeneration, including tight junction, cell adhesion molecules, mTOR and Jak-STAT signaling pathway. From relevant function subcategories and signaling pathways, 78 clones were examined for further Southern-blot hybridization. Then, 17 genes were chosen and characterized using semi-quantitative PCR. Then 4 candidate genes were identified, including F11r, Mmp9, Agr2 and one without a match to any database. After real-time quantitative PCR, the results showed obvious expression changes in different periods of caudal fin regeneration. We can assume that the 4 candidates, likely valuable genes associated with fin regeneration, deserve additional attention. Thus, our study demonstrated how to investigate the transcript profiles with an emphasis on bioinformatics intervention and how to identify potential genes related to fin regeneration processes. The results also provide a foundation or knowledge for further research into genes and molecular mechanisms of fin regeneration. Copyright © 2017 Elsevier B.V. All rights reserved.
Pang, Xiaocong; Zhao, Ying; Wang, Jinhua; Zhou, Qimeng; Xu, Lvjie; Kang, De
2017-01-01
Aim: The incidence of Alzheimer's disease (AD) has been increasing in recent years, but there exists no cure and the pathological mechanisms are not fully understood. This study aimed to find out the pathogenesis of learning and memory impairment, new biomarkers, potential therapeutic targets, and drugs for AD. Methods: We downloaded the microarray data of entorhinal cortex (EC) and hippocampus (HIP) of AD and controls from the Gene Expression Omnibus (GEO) database, and then the differentially expressed genes (DEGs) in EC and HIP regions were analyzed for functional and pathway enrichment. Furthermore, we utilized the DEGs to construct coexpression networks to identify hub genes and discover the small molecules which were capable of reversing the gene expression profile of AD. Finally, we also analyzed microarray and RNA-seq datasets of blood samples to find the biomarkers related to gene expression in brain. Results: We found some functional hub genes, such as ErbB2, ErbB4, OCT3, MIF, CDK13, and GPI. According to GO and KEGG pathway enrichment, several pathways were significantly dysregulated in EC and HIP. CTSD and VCAM1 were dysregulated significantly in blood, EC, and HIP, which were potential biomarkers for AD. Target genes of four microRNAs had a similar distribution of GO terms to the DEGs in EC and HIP. In addition, small molecules were screened out for AD treatment. Conclusion: These biological pathways and DEGs or hub genes will be useful to elucidate AD pathogenesis and identify novel biomarkers or drug targets for developing improved diagnostics and therapeutics against AD. PMID:29359159
Zhang, Jing; Blessing, Danso; Wu, Chenyu; Liu, Na; Li, Juan; Qin, Sheng
2017-01-01
Wings of Bombyx mori (B. mori) develop from the primordium, and different B. mori strains have different wing types. In order to identify the key factors influencing B. mori wing development, we chose strains P50 and U11, which are typical for normal wing and minute wing phenotypes, respectively. We dissected the wing disc on the 1st day of the wandering stage (P50D1 and U11D1), the 2nd day of the wandering stage (P50D2 and U11D2), and the 3rd day of the wandering stage (P50D3 and U11D3). Subsequently, RNA-sequencing (RNA-Seq) was performed on both strains in order to construct their gene expression profiles. Compared with U11, P50 exhibited 628 differentially expressed genes: 324 up-regulated and 304 down-regulated. Five enriched gene ontology (GO) terms were identified by GO enrichment analysis based on these differentially expressed genes (DEGs). KEGG enrichment analysis results showed that the DEGs were enriched in five pathways; of these, we identified three pathways related to the development of wings. The three pathways are the amino sugar and nucleotide sugar metabolism pathway, the proteasome signaling pathway, and the Hippo signaling pathway. The representative genes in the enriched pathways were further verified by quantitative real-time reverse transcription polymerase chain reaction (qRT-PCR). The RNA-Seq and qRT-PCR results were largely consistent with each other. Our results also revealed that the significantly different genes obtained in our study might be involved in determining the size of B. mori wings. In addition, several KEGG enriched pathways might be involved in the regulation of the pathways of wing formation. These results provide a basis for further research of wing development in B. mori. PMID:28617839
Pathway-based variant enrichment analysis on the example of dilated cardiomyopathy.
Backes, Christina; Meder, Benjamin; Lai, Alan; Stoll, Monika; Rühle, Frank; Katus, Hugo A; Keller, Andreas
2016-01-01
Genome-wide association (GWA) studies have significantly contributed to the understanding of human genetic variation and its impact on clinical traits. Frequently only a limited number of highly significant associations were considered as biologically relevant. Increasingly, network analysis of affected genes is used to explore the potential role of the genetic background on disease mechanisms. Instead of first determining affected genes or calculating scores for genes and performing pathway analysis on the gene level, we integrated both steps and directly calculated enrichment on the genetic variant level. The respective approach has been tested on dilated cardiomyopathy (DCM) GWA data as a showcase. To compute significance values, 5000 permutation tests were carried out and p values were adjusted for multiple testing. For 282 KEGG pathways, we computed variant enrichment scores and significance values. Of these, 65 were significant. Surprisingly, we discovered the "nucleotide excision repair" and "tuberculosis" pathways to be most significantly associated with DCM (p = 10^-9). The latter pathway is driven by genes of the HLA-D antigen group, a finding that closely resembles previous discoveries made by expression quantitative trait locus analysis in the context of DCM-GWA. Next, we implemented a sub-network-based analysis, which searches for affected parts of KEGG independently of the pre-defined pathways. Here, proteins of the contractile apparatus of cardiac cells as well as the FAS sub-network were found to be affected by common polymorphisms in DCM. In this work, we performed enrichment analysis directly on variants, leveraging the potential to discover biological information in thousands of published GWA studies. The applied approach is cutoff free and considers a ranked list of genetic variants as input.
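The permutation-based significance computation described in the abstract above can be sketched as follows. This is a minimal illustration, not the authors' implementation. The abstract states only that 5000 permutations were run on a ranked variant list and that p values were adjusted for multiple testing. The enrichment statistic used here (mean rank of a pathway's variants), the Benjamini-Hochberg adjustment, and the function names `permutation_p_value` and `benjamini_hochberg` are all assumptions made for illustration.

```python
import random


def permutation_p_value(ranks, pathway_idx, n_perm=5000, seed=0):
    """One-sided permutation p-value for a pathway's mean rank.

    ranks       -- list of variant ranks (1 = most strongly associated)
    pathway_idx -- indices into `ranks` of the variants mapped to the pathway
    Assumption: a lower mean rank means stronger enrichment.
    """
    rng = random.Random(seed)
    k = len(pathway_idx)
    observed = sum(ranks[i] for i in pathway_idx) / k
    hits = 0
    for _ in range(n_perm):
        # Draw a random variant set of the same size as the pathway.
        sample = rng.sample(ranks, k)
        if sum(sample) / k <= observed:
            hits += 1
    # Add-one correction so the estimate is never exactly zero.
    return (hits + 1) / (n_perm + 1)


def benjamini_hochberg(p_values):
    """Benjamini-Hochberg adjusted p-values, returned in the input order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted
```

In such a setup, `permutation_p_value` would be called once per pathway and the resulting list passed to `benjamini_hochberg`. Note that with 5000 permutations the smallest p value this sketch can return is 1/5001.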
Yan, Hong-Bin; Lou, Zhong-Zi; Li, Li; Brindley, Paul J; Zheng, Yadong; Luo, Xuenong; Hou, Junling; Guo, Aijiang; Jia, Wan-Zhong; Cai, Xuepeng
2014-06-04
Cysticercosis remains a major neglected tropical disease of humanity in many regions, especially in sub-Saharan Africa, Central America and elsewhere. Owing to the emerging drug resistance and the inability of current drugs to prevent re-infection, identification of novel vaccines and chemotherapeutic agents against Taenia solium and related helminth pathogens is a public health priority. The T. solium genome and the predicted proteome were reported recently, providing a wealth of information from which new interventional targets might be identified. In order to characterize and classify the entire repertoire of protease-encoding genes of T. solium, which play fundamental biological roles in all life processes, we analyzed the predicted proteins of this cestode through a combination of bioinformatics tools. Functional annotation was performed to yield insights into the signaling processes relevant to the complex developmental cycle of this tapeworm and to highlight a suite of the proteases as potential intervention targets. Within the genome of this helminth parasite, we identified 200 open reading frames encoding proteases from five clans, which correspond to 1.68% of the 11,902 protein-encoding genes predicted to be present in its genome. These proteases include calpains, cytosolic, mitochondrial signal peptidases, ubiquitylation related proteins, and others. Many not only show significant similarity to proteases in the Conserved Domain Database but also have conserved active sites and catalytic domains. KEGG Automatic Annotation Server (KAAS) analysis indicated that ~60% of these proteases share strong sequence identities with proteins of the KEGG database, which are involved in human disease, metabolic pathways, genetic information processes, cellular processes, environmental information processes and organismal systems. Also, we identified signal peptides and transmembrane helices through comparative analysis with classes of important regulatory proteases.
Phylogenetic analysis using Bayes approach provided support for inferring functional divergence among regulatory cysteine and serine proteases. Numerous putative proteases were identified for the first time in T. solium, and important regulatory proteases have been predicted. This comprehensive analysis not only complements the growing knowledge base of proteolytic enzymes, but also provides a platform from which to expand knowledge of cestode proteases and to explore their biochemistry and potential as intervention targets.
Brambila-Tapia, Aniel Jessica Leticia; Poot-Hernández, Augusto Cesar; Garcia-Guevara, Jose Fernando; Rodríguez-Vázquez, Katya
2016-06-01
To date, a few works have performed a correlation of metabolic variables in bacteria; however, specific correlations with these variables have not been reported. In this work, we included 36 human pathogenic bacteria and 18 non- or less-pathogenic-related bacteria and obtained all metabolic variables, including enzymes, metabolic pathways, enzymatic steps and specific metabolic pathways, and enzymatic steps of particular metabolic processes, from a reliable metabolic database (KEGG). Then, we correlated the number of open reading frames (ORFs) with these variables and with the proportions of these variables, and we observed a negative correlation with the proportion of enzymes (r = -0.506, p < 0.0001), metabolic pathways (r = -0.871, p < 0.0001), enzymatic reactions (r = -0.749, p < 0.0001), and with the proportions of central metabolism variables as well as a positive correlation with the proportions of multistep reactions (r = 0.650, p < 0.0001) and secondary metabolism variables. The proportion of multifunctional reactions (r = -0.114, p = 0.41) and the proportion of enzymatic steps (r = -0.205, p = 0.14) did not present a significant correlation. These correlations indicate that as the size of a genome (measured in the number of ORFs) increases, the proportion of genes that encode enzymes significantly diminishes (especially those related to central metabolism), suggesting that when essential metabolic pathways are complete, an increase in the number of ORFs does not require a similar increase in the metabolic pathways and enzymes, but only a slight increase is sufficient to cope with a large genome.
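The kind of correlation reported above (ORF count against the proportion of a metabolic variable) can be illustrated with a plain Pearson coefficient. The sketch below is an assumption-laden example: the abstract does not say which correlation coefficient or software was used, the function name `pearson_r` is hypothetical, and the ORF and enzyme counts are invented toy values, not data from the study.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length numeric sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


# Toy values (hypothetical): enzyme counts grow more slowly than ORF counts.
orf_counts = [1000, 2000, 3000, 4000, 5000]
enzyme_counts = [400, 700, 900, 1000, 1100]

# Proportion of ORFs that encode enzymes, per genome.
enzyme_proportions = [e / o for e, o in zip(enzyme_counts, orf_counts)]

r = pearson_r(orf_counts, enzyme_proportions)
print(f"r = {r:.3f}")  # negative for these toy values
```

The toy data mimic the pattern the abstract describes: the absolute number of enzymes rises with genome size, but their share of all ORFs falls, which yields a negative coefficient.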
Guo, Junguo; Yan, Tingqin; Bi, Hongsheng; Xie, Xiaofeng; Wang, Xingrong; Guo, Dadong; Jiang, Haiqiang
2014-06-01
The identification of the biomarkers of patients with acute anterior uveitis (AAU) may allow for a less invasive and more accurate diagnosis, as well as serving as a predictor in AAU progression and treatment response. The aim of this study was to identify the potential biomarkers and the metabolic pathways from plasma in patients with AAU. Both plasma metabolic biomarkers and metabolic pathways in the AAU patients versus healthy volunteers were investigated using ultra-performance liquid chromatography-mass spectrometry (UPLC-MS) and a metabonomics approach. Principal component analysis (PCA) was used to separate AAU patients from healthy volunteers as well as to identify the different biomarkers between the two groups. Metabolic compounds were matched to the KEGG, METLIN, and HMDB databases, and metabolic pathways associated with AAU were identified. The PCA for UPLC-MS data shows that the metabolites in AAU patients were significantly different from those of healthy volunteers. Of the 4,396 total features detected by UPLC-MS, 102 features were significantly different between AAU patients and healthy volunteers according to the variable importance plot (VIP) values (greater than two) of partial least squares discriminant analysis (PLS-DA). Thirty-three metabolic compounds were identified and were considered as potential biomarkers. Meanwhile, ten metabolic pathways were found that were related to AAU according to the identified biomarkers. These data suggest that a metabolomics study can identify potential metabolites that differ between AAU patients and healthy volunteers. Based on PCA and PLS-DA, several potential metabolic biomarkers and pathways in AAU patients were found and identified. In addition, the UPLC-MS technique combined with metabonomics could be a suitable systems biology tool for research into clinical problems in ophthalmology, and can provide further insight into the pathophysiology of AAU.
Darkazalli, Ali; Vied, Cynthia; Badger, Crystal-Dawn; Levenson, Cathy W
2017-01-01
Traumatic brain injury (TBI) results in a progressive disease state with many adverse and long-term neurological consequences. Mesenchymal stem cells (MSCs) have emerged as a promising cytotherapy and have been previously shown to reduce secondary apoptosis and cognitive deficits associated with TBI. Consistent with the established literature, we observed that systemically administered human MSCs (hMSCs) accumulate with high specificity at the TBI lesion boundary zone known as the penumbra. Substantial work has been done to illuminate the mechanisms by which MSCs, and the bioactive molecules they secrete, exert their therapeutic effect. However, no such work has been published to examine the effect of MSC treatment on gene expression in the brain post-TBI. In the present study, we use high-throughput RNA sequencing (RNAseq) of cortical tissue from the TBI penumbra to assess the molecular effects of both TBI and subsequent treatment with intravenously delivered hMSCs. RNAseq revealed that expression of almost 7000 cortical genes in the penumbra was differentially regulated by TBI. Pathway analysis using the KEGG (Kyoto Encyclopedia of Genes and Genomes) pathway database revealed that TBI regulated a large number of genes belonging to pathways involved in metabolism, receptor-mediated cell signaling, neuronal plasticity, immune cell recruitment and infiltration, and neurodegenerative disease. Remarkably, hMSC treatment was found to normalize 49% of all genes disrupted by TBI, with notably robust normalization of specific pathways within the categories mentioned above, including neuroactive receptor-ligand interactions (57%), glycolysis and gluconeogenesis (81%), and Parkinson's disease (100%). These data provide evidence in support of the multi-mechanistic nature of stem cell therapy and suggest that hMSC treatment is capable of simultaneously normalizing a wide variety of important molecular pathways that are disrupted by brain injury.
Wang, Huan; Tang, Lei; Wei, Hongling; Lu, Junkai; Mu, Changkao; Wang, Chunlin
2018-05-31
Scylla paramamosain (Crustacea: Decapoda: Portunidae: Scylla De Haan) is a commercially important mud crab distributed along the coast of southern China and other Indo-Pacific countries (Lin Z, Hao M, Zhu D, et al, Comp Biochem Physiol B Biochem Mol Biol 208-209:29-37, 2017; Walton ME, Vay LL, Lebata JH, et al, Estuar Coast Shelf Sci 66(3-4):493-500, 2006; Wang Z, Sun B, Zhu F, Fish Shellfish Immunol 67:612-9, 2017). While S. paramamosain is a euryhaline species, a sudden drop in salinity has a negative impact on growth, molting, and reproduction, and may even cause death. The mechanism of osmotic regulation of marine crustaceans has been recently under investigation. However, the mechanism of adapting to a sudden drop in salinity has not been reported. In this study, transcriptomics analysis was conducted on the gills of S. paramamosain to test its adaptive capabilities over 120 h with a sudden drop in salinity from 23‰ to 3‰. At the level of transcription, 135 DEGs (108 up-regulated and 27 down-regulated) annotated by the NCBI non-redundant (nr) protein database were screened. GO analysis showed that the catalytic activity category contained the most participating genes in the 24 s-tier GO terms, indicating that intracellular metabolic activities in S. paramamosain were enhanced. Of the 164 mapped KEGG pathways, seven of the top 20 pathways were closely related to regulation of the Na+/K+-ATPase. Seven additional amino acid metabolism-related pathways were also found, along with other important signaling pathways. Ion transport and amino acid metabolism were key factors in regulating the salinity adaptation of S. paramamosain in addition to several important signaling pathways.
Fox, Simon A; Currie, Sean S; Dalley, Andrew J; Farah, Camile S
2018-05-01
The role of alcohol-containing mouthwash as a risk factor for the development of oral cancer is a subject of conflicting epidemiological evidence in the literature despite alcohol being a recognised carcinogen. The aim of this study was to use in vitro models to investigate mechanistic and global gene expression effects of exposure to alcohol-containing mouthwash. Two brands of alcohol-containing mouthwash and their alcohol-free counterparts were used to treat two oral cell lines derived from normal (OKF6-TERT) and dysplastic (DOK) tissues. Genotoxicity was determined by Comet assay. RNA-seq was performed using the Ion Torrent platform. Bioinformatics analysis used R/Bioconductor packages with differential expression using DESeq2. Pathway enrichment analysis used EnrichR with the WikiPathways and KEGG databases. Both cell lines displayed dose-dependent DNA damage in response to acute exposure to ethanol and alcohol-containing mouthwashes as well as alcohol-free mouthwashes reconstituted with ethanol, as shown by Comet assay. The transcriptomic effects of alcohol-containing mouthwash exposure were more complex, with significant differential gene expression ranging from >2000 genes in dysplastic (DOK) cells to <100 genes in normal (OKF6-TERT) cells. Pathway enrichment analysis in DOK cells revealed that the two brands of alcohol-containing mouthwash shared common features, including DNA damage response as well as cancer-associated pathways. In OKF6-TERT cells, the most significantly enriched pathways involved inflammatory signalling. Alcohol-containing mouthwashes are genotoxic in vitro to normal and dysplastic oral keratinocytes and induce widespread changes in gene expression. Dysplastic cells are more susceptible to the transcriptomic effects of mouthwash. © 2018 John Wiley & Sons A/S. Published by John Wiley & Sons Ltd.
Luo, Hui; Xiao, Shijun; Ye, Hua; Zhang, Zhengshi; Lv, Changhuan; Zheng, Shuming; Wang, Zhiyong; Wang, Xiaoqing
2016-01-01
Schizothorax prenanti (S. prenanti) is mainly distributed in the upstream regions of the Yangtze River and its tributaries in China. This species is indigenous and commercially important. However, in recent years, wild populations and aquacultures have faced the serious challenges of germplasm variation loss and an increased susceptibility to a range of pathogens. Currently, the genetics and immune mechanisms of S. prenanti are unknown, partly due to a lack of genome and transcriptome information. Here, we sought to identify genes related to immune functions and to identify molecular markers to study the function of these genes and for trait mapping. To this end, the transcriptome from spleen tissues of S. prenanti was sequenced and analyzed. Using paired-end reads from the Illumina Hiseq2500 platform, 48,517 transcripts were isolated from the spleen transcriptome. These transcripts could be clustered into 37,785 unigenes with an N50 length of 2,539 bp. The majority of the unigenes (35,653, 94.4%) were successfully annotated using non-redundant nucleotide sequence analysis (nt), and the non-redundant protein (nr), Swiss-Prot, Gene Ontology (GO), and Kyoto Encyclopedia of Genes and Genomes (KEGG) databases. KEGG pathway assignment identified more than 500 immune-related genes. Furthermore, 7,545 putative simple sequence repeats (SSRs), 857,535 single nucleotide polymorphisms (SNPs), and 53,481 insertions/deletions (InDels) were detected from the transcriptome. This is the first reported high-throughput transcriptome analysis of S. prenanti, and it provides valuable genetic resources for the investigation of immune mechanisms, conservation of germplasm, and molecular marker-assisted breeding of S. prenanti.
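The N50 length quoted for the unigene set above is a standard assembly statistic: the length L such that sequences of length L or longer together contain at least half of all assembled bases. A minimal sketch of that definition follows. The function name `n50` and the example lengths are hypothetical, and assembly tools may differ in how they handle edge cases.

```python
def n50(lengths):
    """Return the N50 of a collection of sequence lengths.

    Sequences are visited from longest to shortest; the N50 is the length
    of the sequence at which the running total first reaches half of the
    summed length of all sequences.
    """
    if not lengths:
        raise ValueError("lengths must not be empty")
    total = sum(lengths)
    running = 0
    for length in sorted(lengths, reverse=True):
        running += length
        if 2 * running >= total:  # integer-safe test for "at least half"
            return length


# Hypothetical unigene lengths in bp, for illustration only.
example_lengths = [2, 3, 4, 5, 6]
print(n50(example_lengths))
```

Because N50 is weighted by length, a few long sequences can raise it substantially, so it is usually read alongside the sequence count and total assembled length.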
A toolbox model of evolution of metabolic pathways on networks of arbitrary topology.
Pang, Tin Yau; Maslov, Sergei
2011-05-01
In prokaryotic genomes the number of transcriptional regulators is known to be proportional to the square of the total number of protein-coding genes. A toolbox model of evolution was recently proposed to explain this empirical scaling for metabolic enzymes and their regulators. According to its rules, the metabolic network of an organism evolves by horizontal transfer of pathways from other species. These pathways are part of a larger "universal" network formed by the union of all species-specific networks. It remained to be understood, however, how the topological properties of this universal network influence the scaling law of functional content of genomes in the toolbox model. Here we answer this question by first analyzing the scaling properties of the toolbox model on arbitrary tree-like universal networks. We prove that critical branching topology, in which the average number of upstream neighbors of a node is equal to one, is both necessary and sufficient for quadratic scaling. We further generalize the rules of the model to incorporate reactions with multiple substrates/products as well as branched and cyclic metabolic pathways. To achieve its metabolic tasks, the new model employs evolutionary optimized pathways with minimal number of reactions. Numerical simulations of this realistic model on the universal network of all reactions in the KEGG database produced approximately quadratic scaling between the number of regulated pathways and the size of the metabolic network. To quantify the geometrical structure of individual pathways, we investigated the relationship between their number of reactions, byproducts, intermediate, and feedback metabolites. Our results validate and explain the ubiquitous appearance of the quadratic scaling for a broad spectrum of topologies of underlying universal metabolic networks. 
They also demonstrate why, in spite of "small-world" topology, real-life metabolic networks are characterized by a broad distribution of pathway lengths and sizes of metabolic regulons in regulatory networks.
DNA microarray analysis is plagued by a lack of data reproducibility and by limits to the detectability of transcripts by hybridization. To mitigate these limitations, we employed transcriptional coupling within the S. typhimurium genome. This genome has 2664 transcriptionally co...
Nevalainen, Jaana; Skarp, Sini; Savolainen, Eeva-Riitta; Ryynänen, Markku; Järvenpää, Jouko
2017-10-26
To evaluate placental gene expression in severe early- or late-onset preeclampsia with intrauterine growth restriction compared to controls. Chorionic villus sampling was conducted after cesarean section from the placentas of five women with early- or late-onset severe preeclampsia and five controls for each preeclampsia group. Microarray analysis was performed to identify gene expression differences between the groups. Pathway analysis showed over-representation of gene ontology (GO) biological process terms related to inflammatory and immune response pathways, platelet development, vascular development, female pregnancy and reproduction in early-onset preeclampsia. Pathways related to immunity, complement and coagulation cascade were overrepresented in the hypergeometric test for the Kyoto Encyclopedia of Genes and Genomes (KEGG) database. Ten genes (ABI3BP, C7, HLA-G, IL2RB, KRBOX1, LRRC15, METTL7B, MPP5, RFLNB and SLC20A) had a ≥±1 fold expression difference in severe early-onset preeclampsia group compared to early controls. There were 362 genes that had a ≥±1 fold expression difference in severe early-onset preeclampsia group compared to late-onset preeclampsia group including ABI3BP, C7, HLA-G and IL2RB. There are significant differences in placental gene expression between severe early- and late-onset preeclampsia when both are associated with intrauterine growth restriction. ABI3BP, C7, HLA-G and IL2RB might contribute to the development of early form of severe preeclampsia.
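The hypergeometric over-representation test mentioned above asks how likely it is to see at least the observed number of pathway genes among the differentially expressed genes if genes were drawn at random. The sketch below computes that upper-tail probability with exact integer arithmetic. The function name `hypergeom_enrichment_p` and all counts in the example are assumptions for illustration, since the abstract does not report the underlying gene counts or the software used.

```python
from math import comb


def hypergeom_enrichment_p(population, pathway_size, selected, overlap):
    """Upper-tail hypergeometric probability P(X >= overlap).

    population   -- number of genes measured (the background)
    pathway_size -- background genes annotated to the pathway
    selected     -- number of differentially expressed genes
    overlap      -- differentially expressed genes in the pathway
    """
    upper = min(pathway_size, selected)
    total = comb(population, selected)
    # Sum the probability mass for every overlap at least as large as observed.
    tail = sum(
        comb(pathway_size, k) * comb(population - pathway_size, selected - k)
        for k in range(overlap, upper + 1)
    )
    return tail / total


# Hypothetical counts: 20,000 background genes, 80 in the pathway,
# 362 differentially expressed genes, 8 of them in the pathway.
p = hypergeom_enrichment_p(20000, 80, 362, 8)
print(f"p = {p:.3g}")
```

When many pathways are tested, the resulting p values would normally be adjusted for multiple testing before any pathway is called over-represented.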
Duan, Xinle; Wang, Kang; Su, Sha; Tian, Ruizheng; Li, Yuting; Chen, Maohua
2017-01-01
The bird cherry-oat aphid, Rhopalosiphum padi (L.), is one of the most abundant aphid pests of cereals and has a global distribution. Next-generation sequencing (NGS) is a rapid and efficient method for developing molecular markers. However, transcriptomic and genomic resources of R. padi have not been investigated. In this study, we used transcriptome information obtained by RNA-Seq to develop polymorphic microsatellites for investigating population genetics in this species. The transcriptome of R. padi was sequenced on an Illumina HiSeq 2000 platform. A total of 114.4 million raw reads with a GC content of 40.03% was generated. The raw reads were cleaned and assembled into 29,467 unigenes with an N50 length of 1,580 bp. Using several public databases, 82.47% of these unigenes were annotated. Of the annotated unigenes, 8,022 were assigned to COG pathways, 9,895 were assigned to GO pathways, and 14,586 were mapped to 257 KEGG pathways. A total of 7,936 potential microsatellites were identified in 5,564 unigenes, 60 of which were selected randomly and amplified using specific primer pairs. Fourteen loci were found to be polymorphic in the four R. padi populations. The transcriptomic data presented herein will facilitate gene discovery, gene analyses, and development of molecular markers for future studies of R. padi and other closely related aphid species.
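Microsatellite (SSR) discovery in assembled unigenes, as described above, amounts to scanning each sequence for short motifs repeated in tandem. The sketch below does this with a regular expression and a backreference. It is an illustration under stated assumptions, not the pipeline used in the study: the function name `find_ssrs`, the motif lengths (2 to 6 bp), and the single minimum repeat count are hypothetical choices, and dedicated SSR tools typically use a different threshold for each motif length.

```python
import re


def find_ssrs(sequence, min_repeats=5, motif_lengths=(2, 3, 4, 5, 6)):
    """Return (start, motif, repeat_count) for each perfect tandem repeat."""
    sequence = sequence.upper()
    hits = []
    for size in motif_lengths:
        # Group 2 captures one motif; \2{n,} demands at least n further copies.
        pattern = re.compile(r"(([ACGT]{%d})\2{%d,})" % (size, min_repeats - 1))
        for match in pattern.finditer(sequence):
            motif = match.group(2)
            # Skip motifs that are repeats of a shorter unit ("ATAT", "AA").
            if any(
                motif == motif[:d] * (size // d)
                for d in range(1, size)
                if size % d == 0
            ):
                continue
            repeat_count = len(match.group(1)) // size
            hits.append((match.start(), motif, repeat_count))
    return sorted(hits)


# Hypothetical sequence containing six tandem copies of "AT".
example = "GGC" + "AT" * 6 + "CCG"
print(find_ssrs(example))
```

Start positions are 0-based. A real workflow would additionally keep flanking sequence around each hit so that primer pairs can be designed for amplification.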
Kong, Fan-Yun; Wei, Xiao; Zhou, Kai; Hu, Wei; Kou, Yan-Bo; You, Hong-Juan; Liu, Xiao-Mei; Zheng, Kui-Yang; Tang, Ren-Xian
2016-01-01
Hepatocellular carcinoma (HCC) is the fifth most common malignancy associated with high mortality. One of the risk factors for HCC is chronic hepatitis B virus (HBV) infection. The treatment strategy for the disease is dependent on the stage of HCC, and the Barcelona clinic liver cancer (BCLC) staging system is used in most HCC cases. However, the molecular characteristics of HBV-related HCC in different BCLC stages are still unknown. Using GSE14520 microarray data from HBV-related HCC cases with BCLC stages from 0 (very early stage) to C (advanced stage) in the Gene Expression Omnibus (GEO) database, differentially expressed genes (DEGs), including common DEGs and unique DEGs in different BCLC stages, were identified. These DEGs were located on different chromosomes. The molecular functions and biological pathways of DEGs were identified by gene ontology (GO) analysis and Kyoto Encyclopedia of Genes and Genomes (KEGG) pathway analysis, and the interactome networks of DEGs were constructed using the NetVenn online tool. The results revealed that both common DEGs and stage-specific DEGs were associated with various molecular functions and were involved in specific biological pathways. In addition, several hub genes were found in the interactome networks of DEGs. The identified DEGs and hub genes promote our understanding of the molecular mechanisms underlying the development of HBV-related HCC through the different BCLC stages, and might be used as staging biomarkers or molecular targets for the treatment of HCC with HBV infection.
Liu, Shaoqun; Li, Wanshun; Wu, Yimin; Chen, Changming; Lei, Jianjun
2013-01-01
The capsaicinoids are a group of compounds produced by chili pepper fruits and are used widely in many fields, especially for medical purposes. The capsaicinoid biosynthetic pathway has not yet been clearly established. To learn more about capsaicinoid biosynthesis, we applied RNA-seq to a mixture of placenta and pericarp of pungent pepper (Capsicum frutescens L.). We assessed the effect of various assembly parameters using different assembly software, and obtained one of the best strategies for de novo assembly of transcriptome data. We obtained a total of 54,045 high-quality unigenes (transcripts) using Trinity software. About 92.65% of unigenes showed similarity to public protein sequences, the potato and tomato genomes, and pepper (C. annuum) EST databases. Our results predicted 3 new structural genes (DHAD, TD, PAT), which filled gaps in the capsaicinoid biosynthetic pathway predicted by Mazourek, and revealed new candidate genes involved in capsaicinoid biosynthesis based on KEGG (Kyoto Encyclopedia of Genes and Genomes) analysis. A significant number of SSR (Simple Sequence Repeat) and SNP (Single Nucleotide Polymorphism) markers were predicted in C. frutescens and C. annuum sequences, which will be helpful in the identification of polymorphisms within chili pepper populations. These data will provide new insights into the pathway of capsaicinoid biosynthesis and subsequent research on chili peppers. In addition, our strategy of de novo transcriptome assembly is applicable to a wide range of similar studies. PMID:23349661
Gui, Duan; Jia, Kuntong; Xia, Jia; Yang, Lili; Chen, Jialin; Wu, Yuping; Yi, Meisheng
2013-01-01
The Indo-Pacific humpback dolphin (Sousa chinensis), a marine mammal species inhabited in the waters of Southeast Asia, South Africa and Australia, has attracted much attention because of the dramatic decline in population size in the past decades, which raises the concern of extinction. So far, this species is poorly characterized at molecular level due to little sequence information available in public databases. Recent advances in large-scale RNA sequencing provide an efficient approach to generate abundant sequences for functional genomic analyses in the species with un-sequenced genomes. We performed a de novo assembly of the Indo-Pacific humpback dolphin leucocyte transcriptome by Illumina sequencing. 108,751 high quality sequences from 47,840,388 paired-end reads were generated, and 48,868 and 46,587 unigenes were functionally annotated by BLAST search against the NCBI non-redundant and Swiss-Prot protein databases (E-value < 10^-5), respectively. In total, 16,467 unigenes were clustered into 25 functional categories by searching against the COG database, and BLAST2GO search assigned 37,976 unigenes to 61 GO terms. In addition, 36,345 unigenes were grouped into 258 KEGG pathways. We also identified 9,906 simple sequence repeats and 3,681 putative single nucleotide polymorphisms as potential molecular markers in our assembled sequences. A large number of unigenes were predicted to be involved in immune response, and many genes were predicted to be relevant to adaptive evolution and cetacean-specific traits. This study represented the first transcriptome analysis of the Indo-Pacific humpback dolphin, an endangered species. The de novo transcriptome analysis of the unique transcripts will provide valuable sequence information for discovery of new genes, characterization of gene expression, investigation of various pathways and adaptive evolution, as well as identification of genetic markers. PMID:24015242
ExplorEnz: a MySQL database of the IUBMB enzyme nomenclature
McDonald, Andrew G; Boyce, Sinéad; Moss, Gerard P; Dixon, Henry BF; Tipton, Keith F
2007-01-01
Background: We describe the database ExplorEnz, which is the primary repository for EC numbers and enzyme data that are being curated on behalf of the IUBMB. The enzyme nomenclature is incorporated into many other resources, including the ExPASy-ENZYME, BRENDA and KEGG bioinformatics databases. Description: The data, which are stored in a MySQL database, preserve the formatting of chemical and enzyme names. A simple, easy to use, web-based query interface is provided, along with an advanced search engine for more complex queries. The database is publicly available at . The data are available for download as SQL and XML files via FTP. Conclusion: ExplorEnz has powerful and flexible search capabilities and provides the scientific community with the most up-to-date version of the IUBMB Enzyme List. PMID:17662133
Gacesa, Ranko; Zucko, Jurica; Petursdottir, Solveig K; Gudmundsdottir, Elisabet Eik; Fridjonsson, Olafur H; Diminic, Janko; Long, Paul F; Cullum, John; Hranueli, Daslav; Hreggvidsson, Gudmundur O; Starcevic, Antonio
2017-06-01
The MEGGASENSE platform constructs relational databases of DNA or protein sequences. The default functional analysis uses 14 106 hidden Markov model (HMM) profiles based on sequences in the KEGG database. The Solr search engine allows sophisticated queries and a BLAST search function is also incorporated. These standard capabilities were used to generate the SCATT database from the predicted proteome of Streptomyces cattleya . The implementation of a specialised metagenome database (AMYLOMICS) for bioprospecting of carbohydrate-modifying enzymes is described. In addition to standard assembly of reads, a novel 'functional' assembly was developed, in which screening of reads with the HMM profiles occurs before the assembly. The AMYLOMICS database incorporates additional HMM profiles for carbohydrate-modifying enzymes and it is illustrated how the combination of HMM and BLAST analyses helps identify interesting genes. A variety of different proteome and metagenome databases have been generated by MEGGASENSE.
Kalra, Shikha; Puniya, Bhanwar Lal; Kulshreshtha, Deepika; Kumar, Sunil; Kaur, Jagdeep; Ramachandran, Srinivasan; Singh, Kashmir
2013-01-01
Chlorophytum borivilianum, an endangered medicinal plant species, is highly recognized for its aphrodisiac properties, which are provided by saponins present in the plant. The transcriptome information of this species is limited and only a few hundred expressed sequence tags (ESTs) are available in the public databases. To gain molecular insight into this plant, high throughput transcriptome sequencing of leaf RNA was carried out using Illumina's HiSeq 2000 sequencing platform. A total of 22,161,444 single end reads were retrieved after quality filtering. Available (e.g., De-Bruijn/Eulerian graph) and in-house developed bioinformatics tools were used for assembly and annotation of the transcriptome. A total of 101,141 assembled transcripts were obtained, with a coverage size of 22.42 Mb and an average length of 221 bp. Guanine-cytosine (GC) content was found to be 44%. Bioinformatics analysis, using non-redundant proteins, gene ontology (GO), enzyme commission (EC) and Kyoto Encyclopedia of Genes and Genomes (KEGG) databases, extracted all the known enzymes involved in saponin and flavonoid biosynthesis. A few genes of alkaloid biosynthesis, along with anticancer and plant defense genes, were also discovered. Additionally, several cytochrome P450 (CYP450) and glycosyltransferase unique sequences were found. We identified simple sequence repeat motifs in transcripts with an abundance of di-nucleotide simple sequence repeat (SSR; 43.1%) markers. Large scale expression profiling through Reads per Kilobase per Million mapped reads (RPKM) showed major genes involved in different metabolic pathways of the plant. Genes, expressed sequence tags (ESTs) and unique sequences from this study provide an important resource for the scientific community interested in the molecular genetics and functional genomics of C. borivilianum. PMID:24376689
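The RPKM measure named in the abstract above normalizes a raw read count by both transcript length and sequencing depth. The snippet below is a minimal sketch of the standard RPKM formula only; the function name, the transcript labels and the counts are hypothetical and are not taken from the study, whose in-house pipeline is not described here.

```python
def rpkm(read_count: int, transcript_length_bp: int, total_mapped_reads: int) -> float:
    """Reads Per Kilobase of transcript per Million mapped reads.

    RPKM = read_count * 1e9 / (transcript_length_bp * total_mapped_reads)
    """
    if transcript_length_bp <= 0 or total_mapped_reads <= 0:
        raise ValueError("length and total mapped reads must be positive")
    return read_count * 1e9 / (transcript_length_bp * total_mapped_reads)


# Hypothetical example data: transcript name -> (mapped reads, length in bp).
example_counts = {
    "transcript_A": (100, 1000),
    "transcript_B": (100, 2000),  # same reads, twice the length
}
total_mapped = 1_000_000  # hypothetical library size

for name, (reads, length) in example_counts.items():
    print(name, rpkm(reads, length, total_mapped))
```

With equal read counts, the longer transcript receives half the RPKM of the shorter one, which is the length correction the measure is meant to provide.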
Mantello, Camila Campos; Cardoso-Silva, Claudio Benicio; da Silva, Carla Cristina; de Souza, Livia Moura; Scaloppi Junior, Erivaldo José; de Souza Gonçalves, Paulo; Vicentini, Renato; de Souza, Anete Pereira
2014-01-01
Hevea brasiliensis (Willd. Ex Adr. Juss.) Muell.-Arg., which is native to the Amazon rainforest, is the primary source of natural rubber. The singular properties of natural rubber make it superior to and competitive with synthetic rubber for use in several applications. Here, we performed RNA sequencing (RNA-seq) of H. brasiliensis bark on the Illumina GAIIx platform, which generated 179,326,804 raw reads. A total of 50,384 contigs that were over 400 bp in size were obtained and subjected to further analyses. A similarity search against the non-redundant (nr) protein database returned 32,018 (63%) positive BLASTx hits. The transcriptome was annotated using the clusters of orthologous groups (COG), gene ontology (GO), Kyoto Encyclopedia of Genes and Genomes (KEGG), and Pfam databases. A search for putative molecular markers was performed to identify simple sequence repeats (SSRs) and single nucleotide polymorphisms (SNPs). In total, 17,927 SSRs and 404,114 SNPs were detected. Finally, we selected sequences that were identified as belonging to the mevalonate (MVA) and 2-C-methyl-D-erythritol 4-phosphate (MEP) pathways, which are involved in rubber biosynthesis, to validate the SNP markers. A total of 78 SNPs were validated in 36 genotypes of H. brasiliensis. This new dataset represents a powerful information source for rubber tree bark genes and will be an important tool for the development of microsatellites and SNP markers for use in future genetic analyses such as genetic linkage mapping, quantitative trait loci identification, investigations of linkage disequilibrium and marker-assisted selection. PMID:25048025
Deep RNA-Seq to Unlock the Gene Bank of Floral Development in Sinapis arvensis
Liu, Jia; Mei, Desheng; Li, Yunchang; Huang, Shunmou; Hu, Qiong
2014-01-01
Sinapis arvensis is a weed with strong biological activity. Despite being a problematic annual weed that contaminates agricultural crop yield, it is a valuable alien germplasm resource. It can be utilized for broadening the genetic background of Brassica crops with desirable agricultural traits like resistance to blackleg (Leptosphaeria maculans), stem rot (Sclerotinia sclerotium) and pod shatter (caused by FRUITFULL gene). However, few genetic studies of S. arvensis were reported because of the lack of genomic resources. In the present study, we performed de novo transcriptome sequencing to produce a comprehensive dataset for S. arvensis for the first time. We used Illumina paired-end sequencing technology to sequence the S. arvensis flower transcriptome and generated 40,981,443 reads that were assembled into 131,278 transcripts. We de novo assembled 96,562 high quality unigenes with an average length of 832 bp. A total of 33,662 full-length ORF complete sequences were identified, and 41,415 unigenes were mapped onto 128 pathways using the KEGG Pathway database. The annotated unigenes were compared against Brassica rapa, B. oleracea, B. napus and Arabidopsis thaliana. Among these unigenes, 76,324 were identified as putative homologs of annotated sequences in the public protein databases, of which 1194 were associated with plant hormone signal transduction and 113 were related to gibberellin homeostasis/signaling. Unigenes that did not match any of those sequence datasets were considered to be unique to S. arvensis. Furthermore, 21,321 simple sequence repeats were found. Our study will enhance the currently available resources for Brassicaceae and will provide a platform for future genomic studies for genetic improvement of Brassica crops. PMID:25192023
Zhang, Jianxia; He, Chunmei; Wu, Kunlin; Teixeira da Silva, Jaime A.; Zeng, Songjun; Zhang, Xinhua; Yu, Zhenming; Xia, Haoqiang; Duan, Jun
2016-01-01
Dendrobium officinale is one of the most important Chinese medicinal herbs. Polysaccharides are one of the main active ingredients of D. officinale. To identify the genes that may be related to polysaccharide synthesis, two cDNA libraries were prepared from juvenile and adult D. officinale, and were named Dendrobium-1 and Dendrobium-2, respectively. Illumina sequencing for Dendrobium-1 generated 102 million high quality reads that were assembled into 93,881 unigenes with an average sequence length of 790 base pairs. The sequencing for Dendrobium-2 generated 86 million reads that were assembled into 114,098 unigenes with an average sequence length of 695 base pairs. The two transcriptome databases were integrated and assembled into a total of 145,791 unigenes. Among them, 17,281 unigenes were assigned to 126 KEGG pathways while 135 unigenes were involved in fructose and mannose metabolism. Gene Ontology analysis revealed that the majority of genes were associated with metabolic and cellular processes. Furthermore, 430 glycosyltransferase and 89 cellulose synthase genes were identified. Comparative analysis of both transcriptome databases revealed a total of 32,794 differentially expressed genes (DEGs), including 22,051 up-regulated and 10,743 down-regulated genes in Dendrobium-2 compared to Dendrobium-1. Furthermore, a total of 1142 and 7918 unigenes showed unique expression in Dendrobium-1 and Dendrobium-2, respectively. These DEGs were mainly correlated with metabolic pathways and the biosynthesis of secondary metabolites. In addition, 170 DEGs belonged to glycosyltransferase genes, 37 DEGs were related to cellulose synthase genes and 627 DEGs encoded transcription factors. This study substantially expands the transcriptome information for D. officinale and provides valuable clues for identifying candidate genes involved in polysaccharide biosynthesis and elucidating the mechanism of polysaccharide biosynthesis. PMID:26904032
Xiong, Hongchun; Guo, Huijun; Xie, Yongdun; Zhao, Linshu; Gu, Jiayu; Zhao, Shirong; Li, Junhui; Liu, Luxiang
2017-06-02
Salinity stress has become an increasing threat to food security worldwide, and elucidation of the mechanism of salinity tolerance is of great significance. Induced mutation, especially spaceflight mutagenesis, is one important method for crop breeding. In this study, we show that a spaceflight-induced wheat mutant, named salinity tolerance 1 (st1), is a salinity-tolerant line. We report the characteristics of transcriptomic sequence variation induced by spaceflight, and show that mutations in genes associated with sodium ion transport may directly contribute to salinity tolerance in st1. Furthermore, GO and KEGG enrichment analysis of differentially expressed genes (DEGs) between salinity-treated st1 and wild type (WT) suggested that the homeostasis of the oxidation-reduction process is important for salt tolerance in st1. Through KEGG pathway analysis, "Butanoate metabolism" was identified as a new pathway for salinity responses. Additionally, key genes for salinity tolerance, such as genes encoding arginine decarboxylase and polyamine oxidase and hormone-related genes, were not only salt-induced in st1 but also showed higher expression in salt-treated st1 compared with salt-treated WT, indicating that these genes may play important roles in salinity tolerance in st1. This study presents valuable genetic resources for studies on transcriptome variation caused by induced mutation and the identification of salt tolerance genes in crops.
HuMiChip: Development of a Functional Gene Array for the Study of Human Microbiomes
DOE Office of Scientific and Technical Information (OSTI.GOV)
Tu, Q.; Deng, Ye; Lin, Lu
Microbiomes play very important roles in terms of nutrition, health and disease by interacting with their hosts. Based on sequence data currently available in public domains, we have developed a functional gene array to monitor both organismal and functional gene profiles of normal microbiota in human and mouse hosts, and such an array is called human and mouse microbiota array, HMM-Chip. First, seed sequences were identified from KEGG databases, and used to construct a seed database (seedDB) containing 136 gene families in 19 metabolic pathways closely related to human and mouse microbiomes. Second, a mother database (motherDB) was constructed with 81 genomes of bacterial strains with 54 from gut and 27 from oral environments, and 16 metagenomes, and used for selection of genes and probe design. Gene prediction was performed by Glimmer3 for bacterial genomes, and by the Metagene program for metagenomes. In total, 228,240 and 801,599 genes were identified for bacterial genomes and metagenomes, respectively. Then the motherDB was searched against the seedDB using the HMMer program, and gene sequences in the motherDB that were highly homologous with seed sequences in the seedDB were used for probe design by the CommOligo software. Different degrees of specific probes, including gene-specific, inclusive and exclusive group-specific probes were selected. All candidate probes were checked against the motherDB and NCBI databases for specificity. Finally, 7,763 probes covering 91.2% (12,601 out of 13,814) of the HMMer-confirmed sequences from 75 bacterial genomes and 16 metagenomes were selected. This developed HMM-Chip is able to detect the diversity and abundance of functional genes, the gene expression of microbial communities, and potentially, the interactions of microorganisms and their hosts.
Qi, Zhitao; Wu, Ping; Zhang, Qihuan; Wei, Youchuan; Wang, Zisheng; Qiu, Ming; Shao, Rong; Li, Yao; Gao, Qian
2016-02-01
Soiny mullet (Liza haematocheila) is becoming an economically important aquaculture mugilid species in China and other Asian countries. However, the increasing incidence of bacterial pathogenic diseases has greatly hampered the production of the soiny mullet. A deeper understanding of the soiny mullet immune system and its related genes in response to bacterial infections is necessary for disease control in this species. In this study, the transcriptomic profile of spleen from soiny mullet challenged with Streptococcus dysgalactiae was analyzed by an Illumina-based paired-end sequencing method. After assembly, 86,884 unique transcript fragments (unigenes) were obtained, with an average length of 991 bp. Approximately 41,795 (48.1%) unigenes were annotated in the NCBI nr database and 57.9% of the unigenes were similar to those of the Nile tilapia. A total of 24,299 unigenes were categorized into three Gene Ontology (GO) categories (molecular function, cellular component and biological process), 13,570 unigenes into 25 functional Clusters of Orthologous Groups of proteins (COG) categories, and 30,547 unigenes were grouped into 258 known pathways in the Kyoto Encyclopedia of Genes and Genomes (KEGG) database. Following S. dysgalactiae infection, 11,461 differentially expressed unigenes were identified including 4658 up-regulated unigenes and 6803 down-regulated unigenes. Significant enrichment analysis of these differentially expressed unigenes identified major immune related pathways, including the Toll-like receptor, complement and coagulation cascades, T cell receptor signaling pathway and B cell receptor signaling pathway. In addition, 24,813 simple sequence repeats (SSRs) and 127,503 candidate single nucleotide polymorphisms (SNPs) were identified from the mullet spleen transcriptome. To date, this study has globally analyzed the transcriptome profile from the spleen of L. haematocheila after S. dysgalactiae infection. Therefore, the results of our study contribute to a better understanding of the immune system and defense mechanisms of soiny mullet in response to bacterial infection, and provide valuable references for related studies in Mugilidae species which currently lack a genomic reference. Copyright © 2015 Elsevier Ltd. All rights reserved.
Wang, Xiaoliang; Shojaie, Ali; Zhang, Yuzheng; Shelley, David; Lampe, Paul D; Levy, Lisa; Peters, Ulrike; Potter, John D; White, Emily; Lampe, Johanna W
2017-01-01
Long-term use of aspirin is associated with lower risk of colorectal cancer and other cancers; however, the mechanism of chemopreventive effect of aspirin is not fully understood. Animal studies suggest that COX-2, NFκB signaling and Wnt/β-catenin pathways may play a role, but no clinical trials have systematically evaluated the biological response to aspirin in healthy humans. Using a high-density antibody array, we assessed the difference in plasma protein levels after 60 days of regular dose aspirin (325 mg/day) compared to placebo in a randomized double-blinded crossover trial of 44 healthy non-smoking men and women, aged 21-45 years. The plasma proteome was analyzed on an antibody microarray with ~3,300 full-length antibodies, printed in triplicate. Moderated paired t-tests were performed on individual antibodies, and gene-set analyses were performed based on KEGG and GO pathways. Among the 3,000 antibodies analyzed, statistically significant differences in plasma protein levels were observed for nine antibodies after adjusting for false discoveries (FDR adjusted p-value<0.1). The most significant protein was succinate dehydrogenase subunit C (SDHC), a key enzyme complex of the mitochondrial tricarboxylic acid (TCA) cycle. The other statistically significant proteins (NR2F1, MSI1, MYH1, FOXO1, KHDRBS3, NFKBIE, LYZ and IKZF1) are involved in multiple pathways, including DNA base-pair repair, inflammation and oncogenic pathways. None of the 258 KEGG and 1,139 GO pathways was found to be statistically significant after FDR adjustment. This study suggests several chemopreventive mechanisms of aspirin in humans, which have previously been reported to play a role in anti- or pro-carcinogenesis in cell systems; however, larger, confirmatory studies are needed.
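The abstract above reports FDR-adjusted p-values without naming the adjustment procedure. As an illustration only, the sketch below implements the widely used Benjamini-Hochberg step-up adjustment; it is an assumption that this (or something similar) was applied in the trial, and the function name and the p-values are hypothetical.

```python
from typing import List, Sequence


def benjamini_hochberg(p_values: Sequence[float]) -> List[float]:
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    m = len(p_values)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        candidate = p_values[idx] * m / rank
        running_min = min(running_min, candidate)
        adjusted[idx] = running_min
    return adjusted


# Hypothetical raw p-values for four antibodies.
raw = [0.01, 0.04, 0.03, 0.005]
adj = benjamini_hochberg(raw)
# Keep the tests whose adjusted p-value falls below the 0.1 threshold.
significant = [i for i, q in enumerate(adj) if q < 0.1]
print(adj, significant)
```

The same thresholding step (adjusted p-value < 0.1) corresponds to the cutoff quoted in the abstract.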
Highly Disordered Proteins in Prostate Cancer.
Uversky, Vladimir N; Na, Insung; Landau, Kevin S; Schenck, Ryan O
2017-01-01
Prostate cancer is one of the major threats to men's health. There are several mechanisms of prostate cancer development characterized by the involvement of various androgen-related and androgen-unrelated factors in prostate cancer pathogenesis and in the metastatic carcinogenesis of the prostate. In all these processes, proteins play various important roles, and the KEGG database has information on 88 human proteins experimentally shown to be involved in prostate cancer. It is known that many proteins associated with different human maladies are intrinsically disordered (i.e., they do not have stable secondary and/or tertiary structure in their unbound states). The goal of this review is to consider several highly disordered proteins known to be associated with prostate cancer pathogenesis in order to better understand the roles of disordered proteins in this disease. We also hope that consideration of the pathology-related proteins from the perspective of intrinsic disorder can potentially lead to future experimental studies of these proteins to find novel pathways associated with prostate cancer. Copyright© Bentham Science Publishers; For any queries, please email at epub@benthamscience.org.
Li, Jitao; Li, Jian; Chen, Ping; Liu, Ping; He, Yuying
2015-01-01
The ridgetail white prawn Exopalaemon carinicauda is one of the major economic mariculture species in eastern China. The deficiency of genomic and transcriptomic data is becoming a bottleneck for further research on its desirable traits. In the present study, 454 pyrosequencing was undertaken to investigate the transcriptome profiles of E. carinicauda. A collection of 1,028,710 sequence reads (459.59 Mb) obtained from cDNA prepared from eyestalk and hemocytes was assembled into 162,056 expressed sequence tags (ESTs). Of these, 29.88 % of 48,428 contigs and 70.12 % of 113,628 singlets possessed high similarities to sequences in the GenBank non-redundant database, with most significant (E value < 1e-10) unigene matches occurring with crustacean and insect sequences. KEGG analysis of unigenes identified putative members of biological pathways related to growth and immunity. In addition, we obtained a total of 125,112 putative SNPs and 13,467 microsatellites. These results will contribute to the understanding of the genome makeup and provide useful information for future functional genomic research in E. carinicauda.
Xu, Weichao; Zhang, Yuxiu; Cao, Hongbin; Sheng, Yuxing; Li, Haibo; Li, Yuping; Zhao, He; Gui, Xuefei
2018-05-18
Coal gasification wastewater is a typical high phenol-containing, toxic and refractory industrial wastewater. Here, a lab-scale anaerobic-anoxic-oxic system was employed to treat real coal gasification wastewater, and methanol was added to the oxic tank as the co-substrate to enhance the removal of refractory organic pollutants. The results showed that, by adding methanol, the average COD removal in the oxic effluent increased from 24.9% to 36.0% and the total phenols concentration decreased from 54.4 to 44.9 mg/L. GC-MS analysis revealed that the contents of phenolic components and polycyclic aromatic hydrocarbons (PAHs) were decreased compared to the control and that their degradation intermediates were observed. Microbial community analysis revealed that methanol increased the abundance of phenolics and PAHs degraders such as Comamonas, Burkholderia and Sphingopyxis. Moreover, functional analysis based on the KEGG database revealed that the relative abundance of functional genes associated with toluene, benzoate and PAHs degradation pathways was higher than in the control. Copyright © 2018. Published by Elsevier Ltd.
Luo, Lin; Zhou, Wen-Hua; Cai, Jiang-Jia; Feng, Mei; Zhou, Mi; Hu, Su-Pei; Xu, Jin; Ji, Lin-Dan
2017-01-01
Diabetic peripheral neuropathy (DPN) is a common complication of diabetes mellitus (DM). It is not diagnosed or managed properly in the majority of patients because its pathogenesis remains controversial. In this study, human whole genome microarrays identified 2898 and 4493 differentially expressed genes (DEGs) in DM and DPN patients, respectively. A further KEGG pathway analysis indicated that DPN and DM share four pathways, including apoptosis, B cell receptor signaling pathway, endocytosis, and Toll-like receptor signaling pathway. The DEGs identified through comparison of DPN and DM were significantly enriched in MAPK signaling pathway, NOD-like receptor signaling pathway, and neurotrophin signaling pathway, while the "neurotrophin-MAPK signaling pathway" was notably downregulated. Seven DEGs from the neurotrophin-MAPK signaling pathway were validated in additional 78 samples, and the results confirmed the initial microarray findings. These findings demonstrated that downregulation of the neurotrophin-MAPK signaling pathway may be the major mechanism of DPN pathogenesis, thus providing a potential approach for DPN treatment.
Rai, Amit; Nakaya, Taiki; Shimizu, Yohei; Rai, Megha; Nakamura, Michimi; Suzuki, Hideyuki; Saito, Kazuki; Yamazaki, Mami
2018-05-29
Lithospermum officinale is a valuable source of bioactive metabolites with medicinal and industrial values. However, little is known about genes involved in the biosynthesis of these metabolites, primarily due to the lack of genome or transcriptome resources. This study presents the first effort to establish and characterize a de novo transcriptome assembly resource for L. officinale and expression analysis for three of its tissues, namely leaf, stem, and root. Using over 4Gbps of RNA-sequencing datasets, we obtained a de novo transcriptome assembly of L. officinale, consisting of 77,047 unigenes with an assembly N50 value of 1524 bp. Based on transcriptome annotation and functional classification, 52,766 unigenes were assigned putative gene functions, gene ontology terms, and Kyoto Encyclopedia of Genes and Genomes (KEGG) pathways. KEGG pathway and gene ontology enrichment analysis using highly expressed unigenes across three tissues and targeted metabolome analysis showed active secondary metabolic processes enriched specifically in the root of L. officinale. Using co-expression analysis, we also identified 20 and 48 unigenes representing different enzymes of lithospermic/chlorogenic acid and shikonin biosynthesis pathways, respectively. We further identified 15 candidate unigenes annotated as cytochrome P450 with the highest expression in the root of L. officinale as novel genes with a role in key biochemical reactions toward shikonin biosynthesis. Thus, through this study, we not only generated a high-quality genomic resource for L. officinale but also propose candidate genes involved in shikonin biosynthesis pathways for further functional characterization. Georg Thieme Verlag KG Stuttgart · New York.
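The assembly N50 quoted in the abstract above is the length L such that sequences of length L or longer together contain at least half of all assembled bases. The following is a minimal sketch of that standard definition; the function name and the example lengths are hypothetical and unrelated to the L. officinale assembly.

```python
from typing import Iterable


def n50(lengths: Iterable[int]) -> int:
    """Return the N50 of a collection of sequence lengths (in bp)."""
    ordered = sorted(lengths, reverse=True)
    if not ordered:
        raise ValueError("at least one sequence length is required")
    total = sum(ordered)
    running = 0
    for length in ordered:
        running += length
        # First length at which the cumulative sum covers half the assembly.
        if 2 * running >= total:
            return length
    return ordered[-1]  # not reached for positive lengths


# Hypothetical unigene lengths in bp.
print(n50([2, 2, 2, 3, 3, 4, 8, 8]))
```

Because the statistic depends only on the length distribution, two assemblies with the same N50 can still differ widely in the number of sequences they contain.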
GeneSCF: a real-time based functional enrichment tool with support for multiple organisms.
Subhash, Santhilal; Kanduri, Chandrasekhar
2016-09-13
High-throughput technologies such as ChIP-sequencing, RNA-sequencing, DNA sequencing and quantitative metabolomics generate a huge volume of data. Researchers often rely on functional enrichment tools to interpret the biological significance of the affected genes from these high-throughput studies. However, currently available functional enrichment tools need to be updated frequently to adapt to new entries from the functional database repositories. Hence there is a need for a simplified tool that can perform functional enrichment analysis by using updated information directly from source databases such as KEGG, Reactome or Gene Ontology. In this study, we focused on designing a command-line tool called GeneSCF (Gene Set Clustering based on Functional annotations) that can predict the functionally relevant biological information for a set of genes in a real-time updated manner. It is designed to handle information from more than 4000 organisms from freely available prominent functional databases like KEGG, Reactome and Gene Ontology. We successfully employed our tool on two published datasets to predict the biologically relevant functional information. The core features of this tool were tested on Linux machines without the need to install additional dependencies. GeneSCF is more reliable than other enrichment tools because of its ability to use reference functional databases in real time to perform enrichment analysis. It is easy to integrate with other pipelines available for downstream analysis of high-throughput data. More importantly, GeneSCF can run multiple gene lists simultaneously on different organisms, thereby saving time for the users. Since the tool is designed to be ready-to-use, there is no need for any complex compilation and installation procedures.
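Functional enrichment of the kind described above is commonly framed as an over-representation test: given a gene list, is a pathway's gene set hit more often than chance would predict? The sketch below uses a one-sided hypergeometric test for this. It illustrates the general technique and should not be read as GeneSCF's own implementation, which the abstract does not specify; the function name and all numbers are hypothetical.

```python
from math import comb


def enrichment_p_value(background: int, in_pathway: int,
                       list_size: int, overlap: int) -> float:
    """One-sided hypergeometric p-value P(X >= overlap).

    background: number of annotated genes in the organism
    in_pathway: number of background genes belonging to the pathway
    list_size:  number of genes in the query list
    overlap:    number of query genes that belong to the pathway
    """
    if not (0 <= in_pathway <= background and 0 <= list_size <= background):
        raise ValueError("set sizes must fit inside the background")
    max_overlap = min(in_pathway, list_size)
    denominator = comb(background, list_size)
    tail = sum(
        comb(in_pathway, i) * comb(background - in_pathway, list_size - i)
        for i in range(overlap, max_overlap + 1)
    )
    return tail / denominator


# Hypothetical numbers: 20,000 background genes, a 150-gene pathway,
# a 300-gene query list, 12 of which fall in the pathway.
print(enrichment_p_value(20_000, 150, 300, 12))
```

In practice the resulting p-values for all tested pathways would then be corrected for multiple testing before any pathway is reported as enriched.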
DGEM--a microarray gene expression database for primary human disease tissues.
Xia, Yuni; Campen, Andrew; Rigsby, Dan; Guo, Ying; Feng, Xingdong; Su, Eric W; Palakal, Mathew; Li, Shuyu
2007-01-01
Gene expression patterns can reflect gene regulations in human tissues under normal or pathologic conditions. Gene expression profiling data from studies of primary human disease samples are particularly valuable since these studies often span many years in order to collect patient clinical information and achieve a large sample size. Disease-to-Gene Expression Mapper (DGEM) provides a beneficial community resource to access and analyze these data; it currently includes Affymetrix oligonucleotide array datasets for more than 40 human diseases and 1400 samples. The data are normalized to the same scale and stored in a relational database. A statistical-analysis pipeline was implemented to identify genes abnormally expressed in disease tissues or genes whose expressions are associated with clinical parameters such as cancer patient survival. Data-mining results can be queried through a web-based interface at http://dgem.dhcp.iupui.edu/. The query tool enables dynamic generation of graphs and tables that are further linked to major gene and pathway resources that connect the data to relevant biology, including Entrez Gene and Kyoto Encyclopedia of Genes and Genomes (KEGG). In summary, DGEM provides scientists and physicians a valuable tool to study disease mechanisms, to discover potential disease biomarkers for diagnosis and prognosis, and to identify novel gene targets for drug discovery. The source code is freely available for non-profit use, on request to the authors.
Dang, Yunfei; Xu, Xiaoyan; Shen, Yubang; Hu, Moyan; Zhang, Meng; Li, Lisen; Lv, Liqun; Li, Jiale
2016-01-01
The grass carp (Ctenopharyngodon idella) is an important commercial farmed herbivorous fish species in China, but is susceptible to Aeromonas hydrophila infections. In the present study, we performed de novo RNA-Seq sequencing of spleen tissue from specimens of a disease-resistant family, which were given intra-peritoneal injections containing PBS with or without a dose of A. hydrophila. The fish were sampled from the control group at 0 h, and from the experimental group at 4, 8, 12, 24, 48 and 72 h. 122.18 million clean reads were obtained from the normalized cDNA libraries; these were assembled into 425,260 contigs and then 191,795 transcripts. Of those, 52,668 transcripts were annotated with the NCBI Nr database, and 41,347 of the annotated transcripts were assigned into 90 functional groups. 20,569 unigenes were classified into six main categories, including 38 secondary KEGG pathways. 2,992 unigenes were used in the analysis of differentially expressed genes (DEGs). 89 of the putative DEGs were related to the immune system and 41 of them were involved in the complement and coagulation cascades pathway. This study provides insights into the complement and complement-related pathways involved in innate immunity, through expression profile analysis of the genomic resources in C. idella. We conclude that complement and complement-related genes play important roles during defense against A. hydrophila infection. The immune response is activated at 4 h after the bacterial injections, indicating that the complement pathways are activated at the early stage of bacterial infection. The study has improved our understanding of the immune response mechanisms in C. idella to bacterial pathogens. PMID:27383749
Khaldun, A. B. M.; Huang, Wenjun; Liao, Sihong; Lv, Haiyan; Wang, Ying
2015-01-01
Although Lycium chinense (goji berry) is an important traditional Chinese medicinal plant, little genome information is available for this plant, particularly at the small-RNA level. Recent findings indicate that the evolutionary role of miRNAs is very important for a better understanding of gene regulation in different plant species. To elucidate small RNAs and their potential target genes in fruit and shoot tissues, high-throughput RNA sequencing technology was used, followed by qRT-PCR and RLM 5’-RACE experiments. A total of 60 conserved miRNAs belonging to 31 families and 30 putative novel miRNAs were identified. A total of 62 significantly differentially expressed miRNAs were identified, of which 15 (14 known and 1 novel) were shoot-specific, and 12 (7 known and 5 novel) were fruit-specific. Additionally, 28 differentially expressed miRNAs were recorded as up-regulated in fruit tissues. The predicted potential targets were involved in a wide range of metabolic and regulatory pathways. GO (Gene Ontology) enrichment analysis and the KEGG (Kyoto Encyclopedia of Genes and Genomes) database revealed that “metabolic pathways” is the most significant pathway with respect to the rich factor and gene numbers. Moreover, five miRNAs were related to fruit maturation, lycopene biosynthesis and signaling pathways, which might be important for the further study of fruit molecular biology. This study is the first to detect known and novel miRNAs, and their potential targets, in L. chinense. The data and findings presented here might be a good source for the functional genomic study of medicinal plants and for understanding the links among diversified biological pathways. PMID:25587984
Reconstruction of biological pathways and metabolic networks from in silico labeled metabolites.
Hadadi, Noushin; Hafner, Jasmin; Soh, Keng Cher; Hatzimanikatis, Vassily
2017-01-01
Reaction atom mappings track the positional changes of all of the atoms between the substrates and the products as they undergo the biochemical transformation. However, information on atom transitions in the context of metabolic pathways is not widely available in the literature. The understanding of metabolic pathways at the atomic level is of great importance as it can deconvolute the overlapping catabolic/anabolic pathways resulting in the observed metabolic phenotype. The automated identification of atom transitions within a metabolic network is a very challenging task since the degree of complexity of metabolic networks dramatically increases when we transit from metabolite-level studies to atom-level studies. Despite being studied extensively in various approaches, the field of atom mapping of metabolic networks is lacking an automated approach, which (i) accounts for the information of reaction mechanism for atom mapping and (ii) is extendable from individual atom-mapped reactions to atom-mapped reaction networks. Hereby, we introduce a computational framework, iAM.NICE (in silico Atom Mapped Network Integrated Computational Explorer), for the systematic atom-level reconstruction of metabolic networks from in silico labelled substrates. iAM.NICE is to our knowledge the first automated atom-mapping algorithm that is based on the underlying enzymatic biotransformation mechanisms, and its application goes beyond individual reactions and it can be used for the reconstruction of atom-mapped metabolic networks. We illustrate the applicability of our method through the reconstruction of atom-mapped reactions of the KEGG database and we provide an example of an atom-level representation of the core metabolic network of E. coli. Copyright © 2017 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim.
Nagaraj, Shivashankar H; Gasser, Robin B; Nisbet, Alasdair J; Ranganathan, Shoba
2008-01-01
The analysis of expressed sequence tags (EST) offers a rapid and cost effective approach to elucidate the transcriptome of an organism, but requires several computational methods for assembly and annotation. Researchers frequently analyse each step manually, which is laborious and time consuming. We have recently developed ESTExplorer, a semi-automated computational workflow system, in order to achieve the rapid analysis of EST datasets. In this study, we evaluated EST data analysis for the parasitic nematode Trichostrongylus vitrinus (order Strongylida) using ESTExplorer, compared with database matching alone. We functionally annotated 1776 ESTs obtained via suppressive-subtractive hybridisation from T. vitrinus, an important parasitic trichostrongylid of small ruminants. Cluster and comparative genomic analyses of the transcripts using ESTExplorer indicated that 290 (41%) sequences had homologues in Caenorhabditis elegans, 329 (42%) in parasitic nematodes, 202 (28%) in organisms other than nematodes, and 218 (31%) had no significant match to any sequence in the current databases. Of the C. elegans homologues, 90 were associated with 'non-wildtype' double-stranded RNA interference (RNAi) phenotypes, including embryonic lethality, maternal sterility, sterile progeny, larval arrest and slow growth. We could functionally classify 267 (38%) sequences using the Gene Ontologies (GO) and establish pathway associations for 230 (33%) sequences using the Kyoto Encyclopedia of Genes and Genomes (KEGG). Further examination of this EST dataset revealed a number of signalling molecules, proteases, protease inhibitors, enzymes, ion channels and immune-related genes. In addition, we identified 40 putative secreted proteins that could represent potential candidates for developing novel anthelmintics or vaccines. We further compared the automated EST sequence annotations, using ESTExplorer, with database search results for individual T. vitrinus ESTs. 
ESTExplorer reliably and rapidly annotated 301 ESTs, with pathway and GO information, eliminating 60 low quality hits from database searches. We evaluated the efficacy of ESTExplorer in analysing EST data, and demonstrate that computational tools can be used to accelerate the process of gene discovery in EST sequencing projects. The present study has elucidated sets of relatively conserved and potentially novel genes for biological investigation, and the annotated EST set provides further insight into the molecular biology of T. vitrinus, towards the identification of novel drug targets.
Arvind, Akanksha; Jain, Vaibhav; Saravanan, Parameswaran; Mohan, C Gopi
2013-12-01
Mycobacterium tuberculosis (Mtb) is the causative agent of tuberculosis (TB), which has affected approximately 2 billion people worldwide. Due to the emergence of resistance towards the existing drugs, discovery of new anti-TB drugs is an important global healthcare challenge. To address this problem, there is an urgent need to identify new drug targets in Mtb. In the present study, the subtractive genomics approach has been employed for the identification of new drug targets against TB. Screening the Mtb proteome using the Database of Essential Genes (DEG) and the human proteome resulted in the identification of 60 key proteins which have no eukaryotic counterparts. Critical analysis of these proteins using the Kyoto Encyclopedia of Genes and Genomes (KEGG) metabolic pathways database revealed the uridine monophosphate kinase (UMPK) enzyme as a potential drug target for developing novel anti-TB drugs. A homology model of Mtb-UMPK was constructed for the first time on the basis of the crystal structure of E. coli-UMPK in order to understand its structure-function relationships, which would in turn facilitate structure-based inhibitor design. Furthermore, a structural similarity search was carried out using UTP, the physiological inhibitor of Mtb-UMPK, to virtually screen the ZINC database. Retrieved hits were further screened by applying several filters, such as ADME and toxicity, followed by molecular docking. Finally, on the basis of the Glide docking score and the mode of binding, 6 putative leads were identified as inhibitors of this enzyme which can potentially emerge as future drugs for the treatment of TB.
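The subtractive step described above can be pictured as a set filter: keep pathogen proteins that are essential and have no host homolog. The sketch below is only a schematic of that idea; the function name and the protein identifiers are hypothetical placeholders, and the homology search that would produce these sets is not shown.

```python
def subtractive_filter(pathogen_proteins, essential, host_homologs):
    """Return pathogen proteins that are essential and lack a host homolog.

    All three arguments are collections of protein identifiers; how they are
    derived (e.g. sequence similarity searches) is outside this sketch.
    """
    essential = set(essential)
    host_homologs = set(host_homologs)
    return sorted(
        p for p in pathogen_proteins
        if p in essential and p not in host_homologs
    )


if __name__ == "__main__":
    # Placeholder identifiers, not real Mtb gene names.
    proteome = ["protA", "protB", "protC", "protD"]
    essential_hits = {"protA", "protB", "protD"}
    human_like = {"protB"}
    print(subtractive_filter(proteome, essential_hits, human_like))
```

The surviving candidates would then go on to pathway analysis and structural work, as the abstract describes.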
Li, Dong-Yao; Chen, Wen-Jie; Shang, Jun; Chen, Gang; Li, Shi-Kang
2018-06-01
Long non-coding RNAs (lncRNAs) have been demonstrated to mediate carcinogenesis in various types of cancer. However, the regulatory role of lncRNA LINC00968 in lung adenocarcinoma remains unclear. The microRNA (miRNA) expression in LINC00968-overexpressing human lung adenocarcinoma A549 cells was detected using miRNA microarray analysis. miR-9-3p was selected for further analysis, and its expression was verified in the Gene Expression Omnibus (GEO) database. In addition, the regulatory axis of LINC00968 was validated using The Cancer Genome Atlas (TCGA) database. Results of the GEO database indicated miR-9-3p expression in lung adenocarcinoma was significantly higher compared with normal tissues. Functional enrichment analyses of the target genes of miR-9-3p indicated protein binding and the AMP-activated protein kinase pathway were the most enriched Gene Ontology and KEGG terms, respectively. Combining target genes with the correlated genes of LINC00968 and miR-9-3p, 120 objective genes were obtained, which were used to construct a protein-protein interaction (PPI) network. Cyclin A2 (CCNA2) was identified to have a vital role in the PPI network. Significant correlations were detected between LINC00968, miR-9-3p and CCNA2 in lung adenocarcinoma. The LINC00968/miR-9-3p/CCNA2 regulatory axis provides a new foundation for further evaluating the regulatory mechanisms of LINC00968 in lung adenocarcinoma.
A New Omics Data Resource of Pleurocybella porrigens for Gene Discovery
Dohra, Hideo; Someya, Takumi; Takano, Tomoyuki; Harada, Kiyonori; Omae, Saori; Hirai, Hirofumi; Yano, Kentaro; Kawagishi, Hirokazu
2013-01-01
Background Pleurocybella porrigens is a mushroom-forming fungus, which has been consumed as a traditional food in Japan. In 2004, 55 people were poisoned by eating the mushroom and 17 of them died of acute encephalopathy. Since then, the Japanese government has been alerting Japanese people to take precautions against eating the P. porrigens mushroom. Unfortunately, despite efforts, the molecular mechanism of the encephalopathy remains elusive. The genome and transcriptome sequence data of P. porrigens and related species, however, are not stored in public databases. To obtain omics data for P. porrigens, we sequenced the genome and transcriptome of its fruiting bodies and mycelia by next generation sequencing. Methodology/Principal Findings Short read sequences of genomic DNAs and mRNAs in P. porrigens were generated by Illumina Genome Analyzer. Genome short reads were de novo assembled into scaffolds using Velvet. Comparisons of genome signatures among Agaricales showed that P. porrigens has a unique genome signature. Transcriptome sequences were assembled into contigs (unigenes). Biological functions of unigenes were predicted by Gene Ontology and KEGG pathway analyses. The majority of unigenes would be novel genes without significant counterparts in the public omics databases. Conclusions Functional analyses of unigenes indicate the existence of numerous novel genes in the basidiomycetes division. The results mean that omics information such as genome, transcriptome and metabolome data for basidiomycetes is scarce in the current databases. The large-scale omics information on P. porrigens provided by this research will serve as a new data resource for gene discovery in basidiomycetes. PMID:23936076
Luo, Lin; Zhou, Wen-Hua; Cai, Jiang-Jia; Feng, Mei; Zhou, Mi; Hu, Su-Pei
2017-01-01
Diabetic peripheral neuropathy (DPN) is a common complication of diabetes mellitus (DM). It is not diagnosed or managed properly in the majority of patients because its pathogenesis remains controversial. In this study, human whole genome microarrays identified 2898 and 4493 differentially expressed genes (DEGs) in DM and DPN patients, respectively. A further KEGG pathway analysis indicated that DPN and DM share four pathways, including apoptosis, B cell receptor signaling pathway, endocytosis, and Toll-like receptor signaling pathway. The DEGs identified through comparison of DPN and DM were significantly enriched in MAPK signaling pathway, NOD-like receptor signaling pathway, and neurotrophin signaling pathway, while the “neurotrophin-MAPK signaling pathway” was notably downregulated. Seven DEGs from the neurotrophin-MAPK signaling pathway were validated in additional 78 samples, and the results confirmed the initial microarray findings. These findings demonstrated that downregulation of the neurotrophin-MAPK signaling pathway may be the major mechanism of DPN pathogenesis, thus providing a potential approach for DPN treatment. PMID:28900628
Qian, Baoying; Xue, Liangyi; Huang, Hongli
2016-01-01
The large yellow croaker (Larimichthys crocea) is an economically important fish species in the Chinese mariculture industry. To understand the molecular basis underlying the response to fasting, Illumina HiSeq™ 2000 was used to analyze the liver transcriptome of fasting large yellow croakers. A total of 54,933,550 clean reads were obtained and assembled into 110,364 contigs. Annotation to the NCBI database identified a total of 38,728 unigenes, of which 19,654 were classified into Gene Ontology and 22,683 were found in Kyoto Encyclopedia of Genes and Genomes (KEGG). Comparative analysis of the expression profiles between fasting fish and normal-feeding fish identified a total of 7,623 differentially expressed genes (P < 0.05), including 2,500 upregulated genes and 5,123 downregulated genes. Dramatic differences were observed in the genes involved in metabolic pathways such as fat digestion and absorption, citrate cycle, and glycolysis/gluconeogenesis, and similar results were found in the transcriptome of skeletal muscle. Further qPCR analysis confirmed that the expression levels of the genes encoding the factors involved in those pathways changed significantly. The results of the present study provide insights into the molecular mechanisms underlying the metabolic response of the large yellow croaker to fasting and identify areas that require further investigation. PMID:26967898
Praveen, Paurush; Fröhlich, Holger
2013-01-01
Inferring regulatory networks from experimental data via probabilistic graphical models is a popular framework to gain insights into biological systems. However, the inherent noise in experimental data coupled with a limited sample size reduces the performance of network reverse engineering. Prior knowledge from existing sources of biological information can address this low signal to noise problem by biasing the network inference towards biologically plausible network structures. Although integrating various sources of information is desirable, their heterogeneous nature makes this task challenging. We propose two computational methods to incorporate various information sources into a probabilistic consensus structure prior to be used in graphical model inference. Our first model, called Latent Factor Model (LFM), assumes a high degree of correlation among external information sources and reconstructs a hidden variable as a common source in a Bayesian manner. The second model, a Noisy-OR, picks up the strongest support for an interaction among information sources in a probabilistic fashion. Our extensive computational studies on KEGG signaling pathways as well as on gene expression data from breast cancer and yeast heat shock response reveal that both approaches can significantly enhance the reconstruction accuracy of Bayesian Networks compared to other competing methods as well as to the situation without any prior. Our framework allows for using diverse information sources, like pathway databases, GO terms and protein domain data, etc. and is flexible enough to integrate new sources, if available.
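The Noisy-OR idea mentioned above can be shown with the textbook formula: if each information source i supports an interaction with confidence c_i, the combined support is 1 - prod(1 - c_i). The sketch below implements only that generic formula; it is not the authors' full model, and the function name and example confidences are assumptions made for illustration.

```python
def noisy_or(confidences):
    """Combine per-source confidences (each in [0, 1]) with a noisy-OR.

    The result is the probability that at least one source 'fires',
    treating sources as independent: 1 - prod(1 - c_i).
    """
    prob_no_support = 1.0
    for c in confidences:
        if not 0.0 <= c <= 1.0:
            raise ValueError("confidences must lie in [0, 1]")
        prob_no_support *= 1.0 - c
    return 1.0 - prob_no_support


if __name__ == "__main__":
    # Hypothetical confidences from, say, a pathway database and GO similarity.
    print(noisy_or([0.5, 0.5]))
```

Because a single strong source drives the result towards 1, this combination rule emphasizes the strongest available support, which matches the behavior the abstract attributes to the Noisy-OR model.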
PUFA diets alter the microRNA expression profiles in an inflammation rat model.
Zheng, Zheng; Ge, Yinlin; Zhang, Jinyu; Xue, Meilan; Li, Quan; Lin, Dongliang; Ma, Wenhui
2015-06-01
Omega‑3 and ‑6 polyunsaturated fatty acids (PUFAs) can directly or indirectly regulate immune homeostasis via inflammatory pathways, and components of these pathways are crucial targets of microRNAs (miRNAs). However, no study has examined the changes in the miRNA transcriptome during PUFA‑regulated inflammatory processes. Here, we established PUFA diet‑induced autoimmune‑prone (AP) and autoimmune‑averse (AA) rat models, and studied their physical characteristics and immune status. Additionally, miRNA expression patterns in the rat models were compared using microarray assays and bioinformatic methods. A total of 54 miRNAs were differentially expressed in common between the AP and the AA rats, and the changes in rno‑miR‑19b‑3p, ‑146b‑5p and ‑183‑5p expression were validated using stem‑loop reverse transcription‑quantitative polymerase chain reaction. To better understand the mechanisms underlying PUFA‑regulated miRNA changes during inflammation, computational algorithms and biological databases were used to identify the target genes of the three validated miRNAs. Furthermore, Gene Ontology (GO) term annotation and KEGG pathway analyses of the miRNA targets allowed us to explore the potential involvement of the miRNAs in inflammatory pathways. The predicted PUFA‑regulated inflammatory pathways included the Toll‑like receptor (TLR), T cell receptor (TCR), NOD‑like receptor (NLR), RIG‑I‑like receptor (RLR), mitogen‑activated protein kinase (MAPK) and transforming growth factor‑β (TGF‑β) pathways. This study is the first report, to the best of our knowledge, on in vivo comparative profiling of miRNA transcriptomes in PUFA diet‑induced inflammatory rat models using a microarray approach. The results provide a useful resource for future investigation of the role of PUFA‑regulated miRNAs in immune homeostasis.
Discovery of cashmere goat (Capra hircus) microRNAs in skin and hair follicles by Solexa sequencing.
Yuan, Chao; Wang, Xiaolong; Geng, Rongqing; He, Xiaolin; Qu, Lei; Chen, Yulin
2013-07-28
MicroRNAs (miRNAs) are a large family of endogenous, non-coding RNAs, about 22 nucleotides long, which regulate gene expression through sequence-specific base pairing with target mRNAs. Extensive studies have shown that miRNA expression in the skin changes remarkably during distinct stages of the hair cycle in humans, mice, goats and sheep. In this study, skin tissues were harvested from the three stages of hair follicle cycling (anagen, catagen and telogen) in a fibre-producing goat breed. In total, 63,109,004 raw reads were obtained by Solexa sequencing and 61,125,752 clean reads remained for the small RNA digitalisation analysis. This resulted in the identification of 399 conserved miRNAs; among these, 326 miRNAs were expressed in all three follicular cycling stages, whereas 3, 12 and 11 miRNAs were specifically expressed in anagen, catagen, and telogen, respectively. We also identified 172 potential novel miRNAs by Mireap; of these, 36 miRNAs were expressed in all three cycling stages, whereas 23, 29 and 44 miRNAs were specifically expressed in anagen, catagen, and telogen, respectively. The expression level of five arbitrarily selected miRNAs was analyzed by quantitative PCR, and the results indicated that the expression patterns were consistent with the Solexa sequencing results. Gene Ontology and KEGG pathway analyses indicated that five major biological pathways (Metabolic pathways, Pathways in cancer, MAPK signalling pathway, Endocytosis and Focal adhesion) accounted for 23.08% of target genes among 278 biological functions, indicating that these pathways are likely to play significant roles during hair cycling. Across all hair cycle stages of cashmere goats, a large number of conserved and novel miRNAs were identified through a high-throughput sequencing approach. This study enriches the Capra hircus miRNA databases and provides a comprehensive miRNA transcriptome profile in the skin of goats during the hair follicle cycle.
Chen, Jing; Zhang, Hanping; Feng, Mingfeng; Zuo, Dengpan; Hu, Yahui; Jiang, Tong
2016-07-13
Woodland strawberry (Fragaria vesca) infected with Strawberry vein banding virus (SVBV) exhibits chlorotic symptoms along the leaf veins. However, little is known about the molecular mechanism of strawberry disease caused by SVBV. We performed a next-generation sequencing (RNA-Seq) study to identify gene expression changes induced by SVBV in woodland strawberry, using mock-inoculated plants as a control. Using RNA-Seq, we identified 36,850 unigenes, of which 517 were differentially expressed in the virus-infected plants (DEGs). The unigenes were annotated and classified with Gene Ontology (GO), Clusters of Orthologous Group (COG) and Kyoto Encyclopedia of Genes and Genomes (KEGG) analyses. The KEGG pathway analysis of these genes suggested that strawberry disease caused by SVBV may affect multiple processes including pigment metabolism, photosynthesis and plant-pathogen interactions. Our research provides comprehensive transcriptome information regarding SVBV infection in strawberry.
HypoxiaDB: a database of hypoxia-regulated proteins
Khurana, Pankaj; Sugadev, Ragumani; Jain, Jaspreet; Singh, Shashi Bala
2013-01-01
There has been intense interest in the cellular response to hypoxia, and a large number of differentially expressed proteins have been identified through various high-throughput experiments. These valuable data are scattered, and there have been no systematic attempts to document the various proteins regulated by hypoxia. Compilation, curation and annotation of these data are important in deciphering their role in hypoxia and hypoxia-related disorders. Therefore, we have compiled HypoxiaDB, a database of hypoxia-regulated proteins. It is a comprehensive, manually-curated, non-redundant catalog of proteins whose expressions are shown experimentally to be altered at different levels and durations of hypoxia. The database currently contains 72 000 manually curated entries taken on 3500 proteins extracted from 73 peer-reviewed publications selected from PubMed. HypoxiaDB is distinctive from other generalized databases: (i) it compiles tissue-specific protein expression changes under different levels and duration of hypoxia. Also, it provides manually curated literature references to support the inclusion of the protein in the database and establish its association with hypoxia. (ii) For each protein, HypoxiaDB integrates data on gene ontology, KEGG (Kyoto Encyclopedia of Genes and Genomes) pathway, protein–protein interactions, protein family (Pfam), OMIM (Online Mendelian Inheritance in Man), PDB (Protein Data Bank) structures and homology to other sequenced genomes. (iii) It also provides pre-compiled information on hypoxia-proteins, which otherwise requires tedious computational analysis. This includes information like chromosomal location, identifiers like Entrez, HGNC, Unigene, Uniprot, Ensembl, Vega, GI numbers and Genbank accession numbers associated with the protein. These are further cross-linked to respective public databases augmenting HypoxiaDB to the external repositories. 
(iv) In addition, HypoxiaDB provides an online sequence-similarity search tool for users to compare their protein sequences with HypoxiaDB protein database. We hope that HypoxiaDB will enrich our knowledge about hypoxia-related biology and eventually will lead to the development of novel hypothesis and advancements in diagnostic and therapeutic activities. HypoxiaDB is freely accessible for academic and non-profit users via http://www.hypoxiadb.com. Database URL: http://www.hypoxiadb.com PMID:24178989
Yang, Mengquan; You, Wenjing; Wu, Shiwen; Fan, Zhen; Xu, Baofu; Zhu, Mulan; Li, Xuan; Xiao, Youli
2017-03-22
Huperzia serrata (H. serrata) is an economically important traditional Chinese herb with notable medicinal value. As a representative member of the Lycopodiaceae family, H. serrata produces various types of bioactive lycopodium alkaloids, especially huperzine A (HupA), which is a promising drug for Alzheimer's disease. Despite their medicinal importance, public genomic and transcriptomic resources are very limited and the biosynthesis of HupA is largely unknown. Previous studies comparing 454-ESTs from H. serrata and Phlegmariurus carinatus predicted putative genes involved in lycopodium alkaloid biosynthesis, such as lysine decarboxylase like (LDC-like) protein and some CYP450s. However, these gene annotations were not followed up with biochemical characterization. To understand the biosynthesis of HupA and its regulation in H. serrata, a global transcriptome analysis of H. serrata tissues was performed. In this study, we used the Illumina HiSeq 4000 platform to generate a substantial RNA sequencing dataset of H. serrata. A total of 40.1 Gb of clean data was generated from four different tissues (root, stem, leaf, and sporangia) and assembled into 181,141 unigenes. The total length, average length, N50 and GC content of unigenes were 219,520,611 bp, 1,211 bp, 2,488 bp and 42.51%, respectively. Among them, 105,516 unigenes (58.25%) were annotated by seven public databases (NR, NT, Swiss-Prot, KEGG, COG, Interpro, GO), and 54 GO terms and 3,391 transcription factors (TFs) were functionally classified, respectively. KEGG pathway analysis revealed that 72,230 unigenes were classified into 21 functional pathways. Three types of candidate enzymes, LDC, CAO and PKS, responsible for the biosynthesis of precursors of HupA were all identified in the transcripts. Four hundred and fifty-seven CYP450 genes in H. serrata were also analyzed and compared with tissue-specific gene expression. 
Moreover, two key classes of CYP450 genes BBE and SLS, with 23 members in total, for modification of the lycopodium alkaloid scaffold in the late two stages of biosynthesis of HupA were further evaluated. This study is the first report of global transcriptome analysis on all tissues of H. serrata, and critical genes involved in the biosynthesis of precursors and scaffold modifications of HupA were discovered and predicted. The transcriptome data from this work not only could provide an important resource for further investigating on metabolic pathways in H. serrata, but also shed light on synthetic biology study of HupA.
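The assembly statistics quoted in this abstract (average length and N50) follow from standard definitions, sketched below. N50 is taken here as the length of the shortest sequence in the smallest set of longest sequences that together cover at least half of the total assembly length; the function name is an illustrative choice, and the toy lengths are not from the study.

```python
def n50(lengths):
    """Return the N50 of a collection of sequence lengths.

    Sort lengths from longest to shortest and accumulate them; N50 is the
    length at which the running sum first reaches half of the total.
    """
    if not lengths:
        raise ValueError("need at least one sequence length")
    total = sum(lengths)
    running = 0
    for length in sorted(lengths, reverse=True):
        running += length
        if 2 * running >= total:
            return length


if __name__ == "__main__":
    # Toy example: total is 20, and 6 + 5 = 11 covers at least half, so N50 is 5.
    print(n50([2, 3, 4, 5, 6]))
    # Consistency check on the reported figures: total length / unigene count.
    # Integer division gives 1211, matching the reported average of 1,211 bp.
    print(219_520_611 // 181_141)
```

An N50 well above the mean, as reported here (2,488 bp versus 1,211 bp), indicates that a comparatively small number of long unigenes account for much of the total assembled length.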
Comparison of Metabolic Pathways in Escherichia coli by Using Genetic Algorithms
Ortegon, Patricia; Poot-Hernández, Augusto C.; Perez-Rueda, Ernesto; Rodriguez-Vazquez, Katya
2015-01-01
In order to understand how cellular metabolism has taken its modern form, the conservation and variations between metabolic pathways were evaluated by using a genetic algorithm (GA). The GA approach considered information on the complete metabolism of the bacterium Escherichia coli K-12, as deposited in the KEGG database, and the enzymes belonging to a particular pathway were transformed into enzymatic step sequences by using the breadth-first search algorithm. These sequences represent contiguous enzymes linked to each other, based on their catalytic activities as they are encoded in the Enzyme Commission numbers. In a posterior step, these sequences were compared using a GA in an all-against-all (pairwise comparisons) approach. Individual reactions were chosen based on their measure of fitness to act as parents of offspring, which constitute the new generation. The sequences compared were used to construct a similarity matrix (of fitness values) that was then considered to be clustered by using a k-medoids algorithm. A total of 34 clusters of conserved reactions were obtained, and their sequences were finally aligned with a multiple-sequence alignment GA optimized to align all the reaction sequences included in each group or cluster. From these comparisons, maps associated with the metabolism of similar compounds also contained similar enzymatic step sequences, reinforcing the Patchwork Model for the evolution of metabolism in E. coli K-12, an observation that can be expanded to other organisms, for which there is metabolism information. Finally, our mapping of these reactions is discussed, with illustrations from a particular case. PMID:25973143
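The conversion of a pathway into an ordered sequence of enzymatic steps relies on breadth-first search (BFS). The sketch below shows plain BFS over a small directed graph whose nodes are EC numbers; the graph is a hand-made toy based on the first steps of glycolysis with one branch, not data taken from KEGG, and the function name is hypothetical. How the authors turn the traversal into step sequences in detail is not specified in the abstract.

```python
from collections import deque


def bfs_order(graph, start):
    """Return nodes in breadth-first order from `start`.

    `graph` maps each node to a list of successor nodes.
    """
    seen = {start}
    order = []
    queue = deque([start])
    while queue:
        node = queue.popleft()
        order.append(node)
        for nxt in graph.get(node, []):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return order


if __name__ == "__main__":
    # Toy graph (illustrative only): hexokinase feeds glucose-6-phosphate
    # isomerase and, as a branch, glucose-6-phosphate dehydrogenase.
    toy_pathway = {
        "2.7.1.1": ["5.3.1.9", "1.1.1.49"],
        "5.3.1.9": ["2.7.1.11"],
        "2.7.1.11": ["4.1.2.13"],
    }
    print(bfs_order(toy_pathway, "2.7.1.1"))
```

BFS visits all enzymes one step from the start before moving two steps away, so branch points appear next to each other in the resulting order.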
Jia, Zhiying; Wang, Qiai; Wu, Kaikai; Wei, Zhenlin; Zhou, Zunchun; Liu, Xiaolin
2017-09-01
Strongylocentrotus nudus is an edible sea urchin, mainly harvested in China. Correlation studies indicated that S. nudus with larger diameter have a prolonged marketing time and better palatability owing to their precocious gonads and extended maturation process. However, the molecular mechanism underlying this phenomenon is still unknown. Here, transcriptome sequencing was applied to study the ovaries of adult S. nudus with different shell diameters to explore the possible mechanism. In this study, four independent cDNA libraries were constructed, including two from the big size urchins and two from the small ones using a HiSeq™2500 platform. A total of 88,581 unigenes were acquired with a mean length of 1354bp, of which 66,331 (74.88%) unigenes could be annotated using six major publicly available databases. Comparative analysis revealed that 353 unigenes were differentially expressed (with log2(ratio)≥1, FDR≤0.001) between the two groups. Of these, 20 differentially expressed genes (DEGs) were selected to confirm the accuracy of RNA-seq data by quantitative real-time RT-PCR. Furthermore, gene ontology and KEGG pathway enrichment analyses were performed to find the putative genes and pathways related to ovarian maturity. Eight unigenes were identified as significant DEGs involved in reproduction related pathways; these included Mos, Cdc20, Rec8, YP30, cytochrome P450 2U1, ovoperoxidase, proteoliaisin, and rendezvin. Our research fills the gap in the studies on the S. nudus ovaries using transcriptome analysis. Copyright © 2017 Elsevier Inc. All rights reserved.
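The selection rule quoted in this abstract (log2(ratio) ≥ 1, FDR ≤ 0.001) can be sketched as a simple filter. The sketch assumes the fold-change threshold is applied to the absolute value so that both up- and down-regulated unigenes are kept; the abstract does not state this explicitly. The function name, record layout, and values are hypothetical.

```python
def select_degs(records, min_abs_log2=1.0, max_fdr=0.001):
    """Return the IDs of records that pass both thresholds.

    `records` is an iterable of (unigene_id, log2_ratio, fdr) tuples.
    Using abs() on the log2 ratio is an assumption, not stated in the abstract.
    """
    return [
        unigene_id
        for unigene_id, log2_ratio, fdr in records
        if abs(log2_ratio) >= min_abs_log2 and fdr <= max_fdr
    ]


# Made-up values for illustration only.
toy_records = [
    ("unigene_1", 2.3, 0.0001),   # passes both thresholds
    ("unigene_2", -1.5, 0.0005),  # passes (down-regulated)
    ("unigene_3", 0.4, 0.0001),   # fold change too small
    ("unigene_4", 3.0, 0.02),     # FDR too high
]

print(select_degs(toy_records))
```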
2012-01-01
Background Huntington’s disease (HD) is a fatal progressive neurodegenerative disorder caused by the expansion of the polyglutamine repeat region in the huntingtin gene. Although the disease is triggered by the mutation of a single gene, intensive research has linked numerous other genes to its pathogenesis. To obtain a systematic overview of these genes, which may serve as therapeutic targets, CHDI Foundation has recently established the HD Research Crossroads database. With currently over 800 cataloged genes, this web-based resource constitutes the most extensive curation of genes relevant to HD. It provides us with an unprecedented opportunity to survey molecular mechanisms involved in HD in a holistic manner. Methods To gain a synoptic view of therapeutic targets for HD, we have carried out a variety of bioinformatical and statistical analyses to scrutinize the functional association of genes curated in the HD Research Crossroads database. In particular, enrichment analyses were performed with respect to Gene Ontology categories, KEGG signaling pathways, and Pfam protein families. For selected processes, we also analyzed differential expression, using published microarray data. Additionally, we generated a candidate set of novel genetic modifiers of HD by combining information from the HD Research Crossroads database with previous genome-wide linkage studies. Results Our analyses led to a comprehensive identification of molecular mechanisms associated with HD. Remarkably, we not only recovered processes and pathways, which have frequently been linked to HD (such as cytotoxicity, apoptosis, and calcium signaling), but also found strong indications for other potentially disease-relevant mechanisms that have been less intensively studied in the context of HD (such as the cell cycle and RNA splicing, as well as Wnt and ErbB signaling). 
For follow-up studies, we provide a regularly updated compendium of molecular mechanisms that are associated with HD at http://hdtt.sysbiolab.eu. Additionally, we derived a candidate set of 24 novel genetic modifiers, including histone deacetylase 3 (HDAC3), metabotropic glutamate receptor 1 (GRM1), CDK5 regulatory subunit 2 (CDK5R2), and coactivator 1β of the peroxisome proliferator-activated receptor gamma (PPARGC1B). Conclusions The results of our study give us an intriguing picture of the molecular complexity of HD. Our analyses can be seen as a first step towards a comprehensive list of biological processes, molecular functions, and pathways involved in HD, and may provide a basis for the development of more holistic disease models and new therapeutics. PMID:22741533
Ling, JunJun; Yang, Shengyou; Huang, Yi; Wei, Dongfeng; Cheng, Weidong
2018-06-01
Alzheimer disease (AD) is a progressive neurodegenerative disease, the etiology of which remains largely unknown. Accumulating evidence indicates that elevated manganese (Mn) in brain exerts toxic effects on neurons and contributes to AD development. Thus, we aimed to explore the gene and pathway variations in this process through analysis of high-throughput data. To screen the differentially expressed genes (DEGs) that may play critical roles in Mn-induced AD, public microarray data regarding Mn-treated neurocytes versus controls (GSE70845), and AD versus controls (GSE48350), were downloaded and the DEGs were screened out, respectively. The intersection of the DEGs of each dataset was obtained by using Venn analysis. Then, gene ontology (GO) function analysis and KEGG pathway analysis were carried out. For screening hub genes, a protein-protein interaction network was constructed. Finally, DEGs were analyzed in Connectivity Map (CMAP) for identification of small molecules that overcome Mn-induced neurotoxicity or AD development. The intersection of the DEGs yielded 140 upregulated and 267 downregulated genes. The top 5 items of biological processes of GO analysis were taxis, chemotaxis, cell-cell signaling, regulation of cellular physiological process, and response to wounding. The top 5 items of KEGG pathway analysis were cytokine-cytokine receptor interaction, apoptosis, oxidative phosphorylation, Toll-like receptor signaling pathway, and insulin signaling pathway. Afterwards, several hub genes such as INSR, VEGFA, PRKACB, DLG4, and BCL2 that might play key roles in Mn-induced AD were further screened out. 
Interestingly, tyrphostin AG-825, an inhibitor of tyrosine phosphorylation, was predicted to be a potential agent for overcoming Mn-induced neurotoxicity or AD development. The present study provided a novel insight into the molecular mechanisms of Mn-induced neurotoxicity or AD development and screened out several small molecular candidates that might be critical for Mn neurotoxicity prevention and Mn-induced AD treatment.
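The Venn step described in this abstract amounts to a set intersection of two DEG lists. The sketch below assumes each list is available as a mapping from gene ID to log2 fold change, and that a shared gene is counted as up- or down-regulated only when both datasets agree on the direction; the abstract does not spell out that rule. All names and values are hypothetical.

```python
def shared_degs(degs_a, degs_b):
    """Split genes present in both DEG lists into shared up and shared down sets.

    `degs_a` and `degs_b` map gene ID -> log2 fold change. Requiring the same
    sign in both datasets is an assumption made for this illustration.
    """
    common = degs_a.keys() & degs_b.keys()
    up = {g for g in common if degs_a[g] > 0 and degs_b[g] > 0}
    down = {g for g in common if degs_a[g] < 0 and degs_b[g] < 0}
    return up, down


# Made-up fold changes for illustration only.
mn_treated = {"G1": 1.8, "G2": -2.1, "G3": 1.2, "G4": -1.4}
ad_vs_control = {"G1": 2.4, "G2": -1.1, "G4": 1.9, "G5": 3.0}

up, down = shared_degs(mn_treated, ad_vs_control)
print(sorted(up), sorted(down))
```

Here G4 is dropped because the two datasets disagree on direction, and G3 and G5 are dropped because each occurs in only one list.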
Hu, Qian-Nan; Deng, Zhe; Hu, Huanan; Cao, Dong-Sheng; Liang, Yi-Zeng
2011-09-01
Biochemical reactions play a key role in sustaining life and allowing cells to grow. RxnFinder was developed to search biochemical reactions from the KEGG reaction database using three search criteria: molecular structures, molecular fragments and reaction similarity. RxnFinder is helpful for obtaining reference reactions for biosynthesis and xenobiotics metabolism. RxnFinder is freely available via: http://sdd.whu.edu.cn/rxnfinder. Contact: qnhu@whu.edu.cn.
Yang, Mei; Cao, Xueyan; Wu, Rina; Liu, Biao; Ye, Wenhui; Yue, Xiqing; Wu, Junrui
2017-09-01
Whey, an essential source of dietary nutrients, is widely used in dairy foods for infants. A total of 584 whey proteins in human and bovine colostrum and mature milk were identified and quantified by the isobaric tag for relative and absolute quantification (iTRAQ) proteomic method. The 424 differentially expressed whey proteins were identified and analyzed according to gene ontology (GO) annotation, Kyoto encyclopedia of genes and genomes (KEGG) pathway, and multivariate statistical analysis. Biological processes principally involved biological regulation and response to stimulus. Major cellular components were extracellular region part and extracellular space. The most prevalent molecular function was protein binding. Twenty immune-related proteins and 13 proteins related to enzyme regulatory activity were differentially expressed in human and bovine milk. Differentially expressed whey proteins participated in many KEGG pathways, including major complement and coagulation cascades and in phagosomes. Whey proteins show obvious differences in expression in human and bovine colostrum and mature milk, with consequences for biological function. The results here increase our understanding of different whey proteomes, which could provide useful information for the development and manufacture of dairy products and nutrient food for infants. The advanced iTRAQ proteomic approach was used to analyze differentially expressed whey proteins in human and bovine colostrum and mature milk.
A statistical method for measuring activation of gene regulatory networks.
Esteves, Gustavo H; Reis, Luiz F L
2018-06-13
Gene expression data analysis is of great importance for modern molecular biology, given our ability to measure the expression profiles of thousands of genes and enabling studies rooted in systems biology. In this work, we propose a simple statistical model for the activation measuring of gene regulatory networks, instead of the traditional gene co-expression networks. We present the mathematical construction of a statistical procedure for testing hypothesis regarding gene regulatory network activation. The real probability distribution for the test statistic is evaluated by a permutation based study. To illustrate the functionality of the proposed methodology, we also present a simple example based on a small hypothetical network and the activation measuring of two KEGG networks, both based on gene expression data collected from gastric and esophageal samples. The two KEGG networks were also analyzed for a public database, available through NCBI-GEO, presented as Supplementary Material. This method was implemented in an R package that is available at the BioConductor project website under the name maigesPack.
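The permutation idea mentioned in this abstract can be shown with a generic sketch. It does not implement the test statistic of the paper or the maigesPack API; it swaps in an absolute difference of group means as a stand-in statistic, and every name below is hypothetical.

```python
import random


def permutation_p_value(group_a, group_b, n_perm=999, seed=0):
    """Estimate a p-value by reshuffling group labels.

    The statistic (absolute difference of means) is a placeholder chosen for
    illustration, not the network activation statistic of the paper.
    """
    rng = random.Random(seed)

    def stat(a, b):
        return abs(sum(a) / len(a) - sum(b) / len(b))

    observed = stat(group_a, group_b)
    pooled = list(group_a) + list(group_b)
    n_a = len(group_a)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(pooled)  # random relabelling of the samples
        if stat(pooled[:n_a], pooled[n_a:]) >= observed:
            hits += 1
    # Add-one correction keeps the estimate strictly positive.
    return (hits + 1) / (n_perm + 1)


# Made-up expression values for illustration only.
p_separated = permutation_p_value([10, 11, 12, 13, 14], [0, 1, 2, 3, 4])
p_identical = permutation_p_value([1, 2, 3], [1, 2, 3])
print(p_separated, p_identical)
```

The empirical distribution of the permuted statistic plays the role of the null distribution, so no parametric form has to be assumed.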
Investigation of anti-cancer mechanisms by comparative analysis of naked mole rat and rat
2013-01-01
Background The naked mole rats (NMRs) are small-sized underground rodents with plenty of unusual traits. Their life expectancy can be up to thirty years, more than seven times longer than that of the laboratory rat. Furthermore, they are resistant to both congenital and experimentally induced carcinogenesis. These peculiar physiological and pathological characteristics make them a suitable model for cancer and aging research. Results In this paper, we carried out a genome-wide comparative analysis of rat and NMR using the recently published genome sequence of NMR. First, we identified all the rat-NMR orthologous genes and specific genes within each of them. The expanded and contracted numbers of protein families in NMR were also analyzed when compared to rat. Seven cancer-related protein families appeared to be significantly expanded, whereas several receptor families were found to be contracted in NMR. We then chose those rat genes that were absent in NMR and used the KEGG pathway database to investigate the metabolic processes in which their proteins may be involved. These genes were significantly enriched in two rat cancer pathways, "Pathway in cancer" and "Bladder cancer". In the rat "Pathway in cancer", 9 out of 14 paths leading to evading apoptosis appeared to be affected in NMR. In addition, a significant number of other NMR-missing genes enriched in several cancer-related pathways have been known to be related to a variety of cancers, implying that many of them may be also related to tumorigenesis in mammals. Finally, investigation of sequence variations among orthologous proteins between rat and NMR revealed that significant fragment insertions/deletions within important functional domains were present in some NMR proteins, which might lead to expressional and/or functional changes of these genes in different species. 
Conclusions Overall, this study provides insights into understanding the possible anti-cancer mechanisms of NMR as well as searching for new cancer-related candidate genes. PMID:24565050
PiiL: visualization of DNA methylation and gene expression data in gene pathways.
Moghadam, Behrooz Torabi; Zamani, Neda; Komorowski, Jan; Grabherr, Manfred
2017-08-02
DNA methylation is a major mechanism involved in the epigenetic state of a cell. It has been observed that the methylation status of certain CpG sites close to or within a gene can directly affect its expression, either by silencing or, in some cases, up-regulating transcription. However, a vertebrate genome contains millions of CpG sites, all of which are potential targets for methylation, and the specific effects of most sites have not been characterized to date. To study the complex interplay between methylation status, cellular programs, and the resulting phenotypes, we present PiiL, an interactive gene expression pathway browser, facilitating analyses through an integrated view of methylation and expression on multiple levels. PiiL allows for specific hypothesis testing by quickly assessing pathways or gene networks, where the data is projected onto pathways that can be downloaded directly from the online KEGG database. PiiL provides a comprehensive set of analysis features that allow for quick and specific pattern searches. Individual CpG sites and their impact on host gene expression, as well as the impact on other genes present in the regulatory network, can be examined. To exemplify the power of this approach, we analyzed two types of brain tumors, Glioblastoma multiform and lower grade gliomas. At a glance, we could confirm earlier findings that the predominant methylation and expression patterns separate perfectly by mutations in the IDH genes, rather than by histology. We could also infer the IDH mutation status for samples for which the genotype was not known. By applying different filtering methods, we show that a subset of CpG sites exhibits consistent methylation patterns, and that the status of sites affect the expression of key regulator genes, as well as other genes located downstream in the same pathways. PiiL is implemented in Java with focus on a user-friendly graphical interface. 
The source code is available under the GPL license from https://github.com/behroozt/PiiL.git .
Li, Weiguo; Zhang, Lihui; Ding, Zhan; Wang, Guodong; Zhang, Yandi; Gong, Hongmei; Chang, Tianjun; Zhang, Yanwen
2017-02-28
Taihangia rupestris, an andromonoecious plant species, bears both male and hermaphroditic flowers within the same individual. However, the establishment and development of male and hermaphroditic flowers in andromonoecious Taihangia remain poorly understood, due to the limited genetic and sequence information. To investigate the potential molecular mechanism in the regulation of Taihangia flower formation, we used de novo RNA sequencing to compare the transcriptome profiles of male and hermaphroditic flowers at early and late developmental stages. Four cDNA libraries, including male floral bud, hermaphroditic floral bud, male flower, and hermaphroditic flower, were constructed and sequenced by using the Illumina RNA-Seq method. In total, 84,596,426 qualified Illumina reads were obtained and then assembled into 59,064 unigenes, of which 24,753 unigenes were annotated in the NCBI non-redundant protein database. In addition, 12,214, 7,153, and 8,115 unigenes were assigned into 53 Gene Ontology (GO) functional groups, 25 Clusters of Orthologous Group (COG) categories, and 126 Kyoto Encyclopedia of Genes and Genomes (KEGG) pathways, respectively. By pairwise comparison of unigene abundance between the samples, we identified 1,668 differentially expressed genes (DEGs), including 176 transcription factors (TFs) between the male and hermaphroditic flowers. At the early developmental stage, we found 263 up-regulated genes and 436 down-regulated genes expressed in hermaphroditic floral buds, while 844 up-regulated genes and 314 down-regulated genes were detected in hermaphroditic flowers at the late developmental stage. GO and KEGG enrichment analyses showed that a large number of DEGs were associated with a wide range of functions, including cell cycle, epigenetic processes, flower development, and biosynthesis of unsaturated fatty acid pathway. Finally, real-time quantitative PCR was conducted to validate the DEGs identified in the present study. 
In this study, transcriptome data of this rare andromonoecious Taihangia were reported for the first time. Comparative transcriptome analysis revealed the significant differences in gene expression profiles between male and hermaphroditic flowers at early and late developmental stages. The transcriptome data of Taihangia would be helpful to improve the understanding of the underlying molecular mechanisms in regulation of flower formation and unisexual flower establishment in andromonoecious plants.
2012-01-01
Background Fusarium wilt, caused by the fungal pathogen Fusarium oxysporum f. sp. cubense tropical race 4 (Foc TR4), is considered the most lethal disease of Cavendish bananas in the world. The disease can be managed in the field by planting resistant Cavendish plants generated by somaclonal variation. However, little information is available on the genetic basis of plant resistance to Foc TR4. To better understand the defense response of resistant banana plants to the Fusarium wilt pathogen, the transcriptome profiles in roots of resistant and susceptible Cavendish banana challenged with Foc TR4 were compared. Results RNA-seq analysis generated more than 103 million 90-bp clean paired-end (PE) reads, which were assembled into 88,161 unigenes (mean size = 554 bp). Based on sequence similarity searches, 61,706 (69.99%) genes were identified, among which 21,273 and 50,410 unigenes were assigned to gene ontology (GO) categories and clusters of orthologous groups (COG), respectively. Searches in the Kyoto Encyclopedia of Genes and Genomes Pathway database (KEGG) mapped 33,243 (37.71%) unigenes to 119 KEGG pathways. A total of 5,008 genes were assigned to plant-pathogen interactions, including disease defense and signal transduction. Digital gene expression (DGE) analysis revealed large differences in the transcriptome profiles of the Foc TR4-resistant somaclonal variant and its susceptible wild-type. Expression patterns of genes involved in pathogen-associated molecular pattern (PAMP) recognition, activation of effector-triggered immunity (ETI), ion influx, and biosynthesis of hormones as well as pathogenesis-related (PR) genes, transcription factors, signaling/regulatory genes, cell wall modification genes and genes with other functions were analyzed and compared. The results indicated that basal defense mechanisms are involved in the recognition of PAMPs, and that high levels of defense-related transcripts may contribute to Foc TR4 resistance in banana. 
Conclusions This study generated a substantial amount of banana transcript sequences and compared the defense responses against Foc TR4 between resistant and susceptible Cavendish bananas. The results contribute to the identification of candidate genes related to plant resistance in a non-model organism, banana, and help to improve the current understanding of host-pathogen interactions. PMID:22863187
Featured Article: Genotation: Actionable knowledge for the scientific reader
Willis, Ethan; Sakauye, Mark; Jose, Rony; Chen, Hao; Davis, Robert L
2016-01-01
We present an article viewer application that allows a scientific reader to easily discover and share knowledge by linking genomics-related concepts to knowledge of disparate biomedical databases. High-throughput data streams generated by technical advancements have contributed to scientific knowledge discovery at an unprecedented rate. Biomedical Informaticists have created a diverse set of databases to store and retrieve the discovered knowledge. The diversity and abundance of such resources present biomedical researchers a challenge with knowledge discovery. These challenges highlight a need for a better informatics solution. We use a text mining algorithm, Genomine, to identify gene symbols from the text of a journal article. The identified symbols are supplemented with information from the GenoDB knowledgebase. Self-updating GenoDB contains information from NCBI Gene, Clinvar, Medgen, dbSNP, KEGG, PharmGKB, Uniprot, and Hugo Gene databases. The journal viewer is a web application accessible via a web browser. The features described herein are accessible on www.genotation.org. The Genomine algorithm identifies gene symbols with an accuracy shown by .65 F-Score. GenoDB currently contains information regarding 59,905 gene symbols, 5633 drug–gene relationships, 5981 gene–disease relationships, and 713 pathways. This application provides scientific readers with actionable knowledge related to concepts of a manuscript. The reader will be able to save and share supplements to be visualized in a graphical manner. This provides convenient access to details of complex biological phenomena, enabling biomedical researchers to generate novel hypothesis to further our knowledge in human health. This manuscript presents a novel application that integrates genomic, proteomic, and pharmacogenomic information to supplement content of a biomedical manuscript and enable readers to automatically discover actionable knowledge. PMID:26900164
Featured Article: Genotation: Actionable knowledge for the scientific reader.
Nagahawatte, Panduka; Willis, Ethan; Sakauye, Mark; Jose, Rony; Chen, Hao; Davis, Robert L
2016-06-01
We present an article viewer application that allows a scientific reader to easily discover and share knowledge by linking genomics-related concepts to knowledge of disparate biomedical databases. High-throughput data streams generated by technical advancements have contributed to scientific knowledge discovery at an unprecedented rate. Biomedical Informaticists have created a diverse set of databases to store and retrieve the discovered knowledge. The diversity and abundance of such resources present biomedical researchers a challenge with knowledge discovery. These challenges highlight a need for a better informatics solution. We use a text mining algorithm, Genomine, to identify gene symbols from the text of a journal article. The identified symbols are supplemented with information from the GenoDB knowledgebase. Self-updating GenoDB contains information from NCBI Gene, Clinvar, Medgen, dbSNP, KEGG, PharmGKB, Uniprot, and Hugo Gene databases. The journal viewer is a web application accessible via a web browser. The features described herein are accessible on www.genotation.org. The Genomine algorithm identifies gene symbols with an accuracy shown by .65 F-Score. GenoDB currently contains information regarding 59,905 gene symbols, 5633 drug-gene relationships, 5981 gene-disease relationships, and 713 pathways. This application provides scientific readers with actionable knowledge related to concepts of a manuscript. The reader will be able to save and share supplements to be visualized in a graphical manner. This provides convenient access to details of complex biological phenomena, enabling biomedical researchers to generate novel hypothesis to further our knowledge in human health. This manuscript presents a novel application that integrates genomic, proteomic, and pharmacogenomic information to supplement content of a biomedical manuscript and enable readers to automatically discover actionable knowledge. 
© 2016 by the Society for Experimental Biology and Medicine.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Maranas, Costas D.
With advances in DNA sequencing and genome annotation techniques, the breadth of metabolic knowledge across all kingdoms of life is increasing. The construction of genome-scale models (GSMs) facilitates this distillation of knowledge by systematically accounting for reaction stoichiometry and directionality, gene to protein to reaction relationships, reaction localization among cellular organelles, metabolite transport costs and routes, transcriptional regulation, and biomass composition. Genome-scale reconstructions available now span across all kingdoms of life, from microbes to whole-plant models, and have become indispensable for driving informed metabolic designs and interventions. A key barrier to the pace of this development is our inability to utilize metabolite/reaction information from databases such as BRENDA [1], KEGG [2], MetaCyc [3], etc. due to incompatibilities of representation, duplications, and errors. Duplicate entries constitute a major impediment, where the same metabolite is found with multiple names across databases and models, which significantly slows down the collating of information from multiple data sources. This can also lead to serious modeling errors such as charge/mass imbalances [4,5] which can thwart model predictive abilities such as identifying synthetic lethal gene pairs and quantifying metabolic flows. Hence, we created the MetRxn database [6] that takes the next step in integrating data from multiple sources and formats to automatically create a standardized knowledgebase. We subsequently deployed this resource to bring about new paradigms in genome-scale metabolic model reconstruction, metabolic flux elucidation through MFA, modeling of microbial communities, and pathway prospecting. This research has enabled the PI’s group to continue building upon research milestones and reach new ones (see list of MetRxn-related publications below).
Philipp, E E R; Kraemer, L; Mountfort, D; Schilhabel, M; Schreiber, S; Rosenstiel, P
2012-03-15
Next generation sequencing (NGS) technologies allow a rapid and cost-effective compilation of large RNA sequence datasets in model and non-model organisms. However, the storage and analysis of transcriptome information from different NGS platforms is still a significant bottleneck, leading to a delay in data dissemination and subsequent biological understanding. Especially database interfaces with transcriptome analysis modules going beyond mere read counts are missing. Here, we present the Transcriptome Analysis and Comparison Explorer (T-ACE), a tool designed for the organization and analysis of large sequence datasets, and especially suited for transcriptome projects of non-model organisms with little or no a priori sequence information. T-ACE offers a TCL-based interface, which accesses a PostgreSQL database via a php-script. Within T-ACE, information belonging to single sequences or contigs, such as annotation or read coverage, is linked to the respective sequence and immediately accessible. Sequences and assigned information can be searched via keyword- or BLAST-search. Additionally, T-ACE provides within and between transcriptome analysis modules on the level of expression, GO terms, KEGG pathways and protein domains. Results are visualized and can be easily exported for external analysis. We developed T-ACE for laboratory environments, which have only a limited amount of bioinformatics support, and for collaborative projects in which different partners work on the same dataset from different locations or platforms (Windows/Linux/MacOS). For laboratories with some experience in bioinformatics and programming, the low complexity of the database structure and open-source code provides a framework that can be customized according to the different needs of the user and transcriptome project.
High precision multi-genome scale reannotation of enzyme function by EFICAz
Arakaki, Adrian K; Tian, Weidong; Skolnick, Jeffrey
2006-01-01
Background The functional annotation of most genes in newly sequenced genomes is inferred from similarity to previously characterized sequences, an annotation strategy that often leads to erroneous assignments. We have performed a reannotation of 245 genomes using an updated version of EFICAz, a highly precise method for enzyme function prediction. Results Based on our three-field EC number predictions, we have obtained lower-bound estimates for the average enzyme content in Archaea (29%), Bacteria (30%) and Eukarya (18%). Most annotations added in KEGG from 2005 to 2006 agree with EFICAz predictions made in 2005. The coverage of EFICAz predictions is significantly higher than that of KEGG, especially for eukaryotes. Thousands of our novel predictions correspond to hypothetical proteins. We have identified a subset of 64 hypothetical proteins with low sequence identity to EFICAz training enzymes, whose biochemical functions have been recently characterized and find that in 96% (84%) of the cases we correctly identified their three-field (four-field) EC numbers. For two of the 64 hypothetical proteins: PA1167 from Pseudomonas aeruginosa, an alginate lyase (EC 4.2.2.3) and Rv1700 of Mycobacterium tuberculosis H37Rv, an ADP-ribose diphosphatase (EC 3.6.1.13), we have detected annotation lag of more than two years in databases. Two examples are presented where EFICAz predictions act as hypothesis generators for understanding the functional roles of hypothetical proteins: FLJ11151, a human protein overexpressed in cancer that EFICAz identifies as an endopolyphosphatase (EC 3.6.1.10), and MW0119, a protein of Staphylococcus aureus strain MW2 that we propose as candidate virulence factor based on its EFICAz predicted activity, sphingomyelin phosphodiesterase (EC 3.1.4.12). Conclusion Our results suggest that we have generated enzyme function annotations of high precision and recall. 
These predictions can be mined and correlated with other information sources to generate biologically significant hypotheses and can be useful for comparative genome analysis and automated metabolic pathway reconstruction. PMID:17166279
Li, Guoxi; Zhao, Yinli; Liu, Zhonghu; Gao, Chunsheng; Yan, Fengbin; Liu, Bianzhi; Feng, Jianxin
2015-06-01
Common carp (Cyprinus carpio) is one of the most important aquacultured species of the family Cyprinidae, and breeding this species for disease resistance is becoming more and more important. However, at the genome or transcriptome levels, study of the immunogenetics of disease resistance in the common carp is lacking. In this study, 60,316,906 and 75,200,328 paired-end clean reads were obtained from two cDNA libraries of the common carp spleen by Illumina paired-end sequencing technology. Totally, 130,293 unique transcript fragments (unigenes) were assembled, with an average length of 1400.57 bp. Approximately 105,612 (81.06%) unigenes could be annotated according to their homology with matches in the Nr, Nt, Swiss-Prot, COG, GO, or KEGG databases, and they were found to represent 46,747 non-redundant genes. Comparative analysis showed that 59.82% of the unigenes have significant similarity to zebrafish Refseq proteins. Gene expression comparison revealed that 10,432 and 6889 annotated unigenes were, respectively, up- and down-regulated with at least twofold changes between two developmental stages of the common carp spleen. Gene ontology and KEGG analysis were performed to classify all unigenes into functional categories for understanding gene functions and regulation pathways. In addition, 46,847 simple sequence repeats (SSRs) were detected from 35,618 unigenes, and a large number of single nucleotide polymorphism (SNP) and insertion/deletion (INDEL) sites were identified in the spleen transcriptome of common carp. This study has characterized the spleen transcriptome of the common carp for the first time, providing a valuable resource for a better understanding of the common carp immune system and defense mechanisms. This knowledge will also facilitate future functional studies on common carp immunogenetics that may eventually be applied in breeding programs. Copyright © 2015 Elsevier Ltd. All rights reserved.
Piao, Junjie; Sun, Jie; Yang, Yang; Jin, Tiefeng; Chen, Liyan; Lin, Zhenhua
2018-03-20
Non-small cell lung cancer (NSCLC) is the leading cause of cancer-related deaths worldwide. This study aims to explore the molecular mechanism of NSCLC. A microarray dataset was obtained from the Gene Expression Omnibus (GEO) database and analyzed by using GEO2R. Functional and pathway enrichment analyses were performed based on the Gene Ontology (GO) and Kyoto Encyclopedia of Genes and Genomes (KEGG) databases. Then, STRING, Cytoscape and MCODE were applied to construct the protein-protein interaction (PPI) network and screen hub genes. Subsequently, overall survival (OS) analysis of hub genes was performed by using the Kaplan-Meier plotter online tool. Moreover, miRecords was also applied to predict the targets of the differentially expressed microRNAs (DEMs). A total of 228 DEGs were identified, and they were mainly enriched in the terms of cell adhesion molecules, leukocyte transendothelial migration and ECM-receptor interaction. A PPI network was constructed, and 16 hub genes were identified, including TEK, ANGPT1, MMP9, VWF, CDH5, EDN1, ESAM, CCNE1, CDC45, PRC1, CCNB2, AURKA, MELK, CDC20, TOP2A and PTTG1. Among the genes, expressions of 14 hub genes were associated with prognosis of NSCLC patients. Additionally, a total of 11 DEMs were also identified. Our results provide some potential underlying biomarkers for NSCLC. Further studies are required to elucidate the pathogenesis of NSCLC. Copyright © 2018 Elsevier B.V. All rights reserved.
Zhang, X J; Jiang, H Y; Li, L M; Yuan, L H; Chen, J P
2016-06-20
The aim of this study was to provide comprehensive insights into the genetic background of sturgeon by transcriptome study. We performed a de novo assembly of the Amur sturgeon Acipenser schrenckii transcriptome using Illumina HiSeq 2000 sequencing. A total of 148,817 non-redundant unigenes were obtained, with a total base length of approximately 121,698,536 bp and individual lengths ranging from 201 to 26,789 bp. All the unigenes were classified into 3368 distinct categories and 145,449 singletons by homologous transcript cluster analysis. In all, 46,865 (31.49%) unigenes showed homologous matches with the Nr database and 32,214 (21.65%) unigenes were matched to the Nt database. In total, 24,862 unigenes were categorized into 52 significantly enriched functional groups by GO analysis, 38,436 unigenes were classified into 25 groups by KOG prediction, and 45,598 unigenes were mapped to 128 enriched KEGG pathways (P < 0.05). Subsequently, a total of 19,860 SSR markers were identified; the di-nucleotide type was the most abundant (10,658; 53.67%), and AT/TA was the most frequent motif repeat (2689; 13.54%). A total of 1341 conserved lncRNAs were identified by a customized pipeline. Our study provides new sequence and function information for A. schrenckii, which will be the basis for further genetic studies on sturgeon species. The huge number of potential SSRs and putatively conserved lncRNAs identified from the transcriptome also sheds light on research in many fields, including the evolution, conservation management, and biological processes in sturgeon.
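The abstract above does not say which tool was used to detect SSRs. As a hedged illustration of the general idea, the sketch below finds perfect tandem repeats of a fixed unit length with a regular expression; the function name, the minimum repeat count, and the example sequence are assumptions, not parameters from the study.

```python
import re

def find_ssrs(seq, unit_len=2, min_repeats=6):
    """Return (start, motif, repeat_count) for perfect tandem repeats.

    unit_len=2 targets di-nucleotide SSRs; min_repeats is an assumed cutoff.
    """
    # One captured motif followed by at least (min_repeats - 1) further copies.
    pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (unit_len, min_repeats - 1))
    hits = []
    for match in pattern.finditer(seq.upper()):
        motif = match.group(1)
        if len(set(motif)) == 1:
            continue  # skip homopolymer runs such as AAAAAA
        hits.append((match.start(), motif, len(match.group(0)) // unit_len))
    return hits

# Hypothetical sequence containing seven AT repeats.
example = "GGC" + "AT" * 7 + "CCG"
ssrs = find_ssrs(example)
```

A regular expression only captures perfect repeats; compound or interrupted SSRs would need a dedicated tool.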
In silico analysis of the potential mechanism of telocinobufagin on breast cancer MCF-7 cells.
Dang, Yi-Wu; Lin, Peng; Liu, Li-Min; He, Rong-Quan; Zhang, Li-Jie; Peng, Zhi-Gang; Li, Xiao-Jiao; Chen, Gang
2018-05-01
Extracts from ChanSu, a traditional Chinese medicine, have been found to possess anti-inflammatory and tumor-suppressing abilities. However, the molecular mechanism of telocinobufagin, a compound extracted from ChanSu, on breast cancer cells has not been clarified. The aim of this study is to investigate the underlying mechanism of telocinobufagin on breast cancer cells. The differentially expressed genes after telocinobufagin treatment on breast cancer cells were retrieved from Gene Expression Omnibus (GEO), ArrayExpress and the literature. Bioinformatics tools were applied to further explore the potential mechanism of telocinobufagin in breast cancer using Kyoto Encyclopedia of Genes and Genomes (KEGG) pathway, Gene Ontology (GO) enrichment, PANTHER, and protein-protein interaction analyses. To better comprehend the role of telocinobufagin in breast cancer, we also queried the Connectivity Map using the gene expression profiles of telocinobufagin treatment. One GEO accession (GSE85871) provided 1251 differentially expressed genes after telocinobufagin treatment on MCF-7 cells. Neuroactive ligand-receptor interaction, cell adhesion molecules (CAMs), intestinal immune network for IgA production, hematopoietic cell lineage and calcium signaling were the key pathways from KEGG analysis. Owing to their higher protein levels in breast cancer tissues, IGF1 and KSR1 could be the hub genes related to telocinobufagin treatment. It was indicated that the molecular mechanism of telocinobufagin resembled that of fenspiride. Telocinobufagin might regulate the neuroactive ligand-receptor interaction pathway to exert its influences in breast cancer MCF-7 cells, and its molecular mechanism might share some similarities with fenspiride. This study only presented a comprehensive picture of the role of telocinobufagin in breast cancer MCF-7 cells using big data. However, more thorough and in-depth research is required to add to the validity of this study. Copyright © 2018 Elsevier GmbH. All rights reserved.
MicroRNA meta-signature of oral cancer: evidence from a meta-analysis.
Zeljic, Katarina; Jovanovic, Ivan; Jovanovic, Jasmina; Magic, Zvonko; Stankovic, Aleksandra; Supic, Gordana
2018-03-01
The aim of this study was to identify commonly deregulated miRNAs in oral cancer patients by performing a meta-analysis of previously published miRNA expression profiles in cancer and matched normal non-cancerous tissue in such patients. The meta-analysis included seven independent studies analyzed by a vote-counting method followed by bioinformatic enrichment analysis. Amongst the seven independent studies included in the meta-analysis, 20 miRNAs were found to be deregulated in oral cancer when compared with non-cancerous tissue. Eleven miRNAs were consistently up-regulated in three or more studies (miR-21-5p, miR-31-5p, miR-135b-5p, miR-31-3p, miR-93-5p, miR-34b-5p, miR-424-5p, miR-18a-5p, miR-455-3p, miR-450a-5p, miR-21-3p), and nine were down-regulated (miR-139-5p, miR-30a-3p, miR-376c-3p, miR-885-5p, miR-375, miR-486-5p, miR-411-5p, miR-133a-3p, miR-30a-5p). The meta-signature of identified miRNAs was functionally characterized by KEGG enrichment analysis. Twenty-four KEGG pathways were significantly enriched, and TGF-beta signaling was the most enriched signaling pathway. The highest number of meta-signature miRNAs was involved in the sphingolipid signaling pathway. Natural killer cell-mediated cytotoxicity was the pathway with the most genes regulated by the identified miRNAs. The rest of the enriched pathways in our miRNA list describe different malignancies and signaling. The identified miRNA meta-signature might be considered as a potential battery of biomarkers when distinguishing oral cancer tissue from normal, non-cancerous tissue. Further mechanistic studies are warranted in order to confirm and fully elucidate the role of deregulated miRNAs in oral cancer.
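The vote-counting step described above (a miRNA is kept when it is deregulated in the same direction in three or more studies) can be sketched as follows. The function name, the rule that any opposing vote disqualifies a miRNA, and the placeholder miRNA names are assumptions for illustration; the published procedure may differ in how ties and conflicts were handled.

```python
from collections import Counter

def vote_count(studies, min_studies=3):
    """Return {mirna: direction} for miRNAs reported consistently.

    studies is a list of dicts mapping a miRNA name to 'up' or 'down'.
    """
    up, down = Counter(), Counter()
    for study in studies:
        for mirna, direction in study.items():
            if direction == "up":
                up[mirna] += 1
            elif direction == "down":
                down[mirna] += 1
    signature = {}
    for mirna in set(up) | set(down):
        # Assumed rule: enough votes in one direction and none in the other.
        if up[mirna] >= min_studies and down[mirna] == 0:
            signature[mirna] = "up"
        elif down[mirna] >= min_studies and up[mirna] == 0:
            signature[mirna] = "down"
    return signature

# Placeholder names and calls; not results from the meta-analysis.
studies = [
    {"miR-A": "up", "miR-B": "down"},
    {"miR-A": "up", "miR-B": "down", "miR-C": "up"},
    {"miR-A": "up", "miR-B": "up", "miR-C": "up"},
    {"miR-B": "down"},
]
signature = vote_count(studies)
```

Here miR-B is dropped because one study reports it in the opposite direction, and miR-C is dropped because it appears in only two studies.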
2012-01-01
Background Exposure to environmental tobacco smoke (ETS) leads to higher rates of pulmonary diseases and infections in children. To study the biochemical changes that may precede lung diseases, metabolomic effects on fetal and maternal lungs and plasma from rats exposed to ETS were compared with those in filtered-air control animals. Genome-reconstructed metabolic pathways may be used to map and interpret dysregulation in metabolic networks. However, mass spectrometry-based non-targeted metabolomics datasets often comprise many metabolites for which links to enzymatic reactions have not yet been reported. Hence, network visualizations that rely on current biochemical databases are incomplete and also fail to visualize novel, structurally unidentified metabolites. Results We present a novel approach to integrate biochemical pathway and chemical relationships to map all detected metabolites in network graphs (MetaMapp) using the KEGG reactant pair database together with Tanimoto chemical and NIST mass spectral similarity scores. In fetal and maternal lungs, and in maternal blood plasma from pregnant rats exposed to environmental tobacco smoke (ETS), 459 unique metabolites comprising 179 structurally identified compounds were detected by gas chromatography time of flight mass spectrometry (GC-TOF MS) and BinBase data processing. MetaMapp graphs in Cytoscape showed much clearer metabolic modularity and complete content visualization compared to conventional biochemical mapping approaches. Cytoscape visualization of differential statistics results using these graphs showed that overall, fetal lung metabolism was more impaired than lung and blood metabolism in dams. Fetuses from ETS-exposed dams expressed lower lipid and nucleotide levels and higher amounts of energy metabolism intermediates than control animals, indicating lower biosynthetic rates of metabolites for cell division, structural proteins and lipids that are critical for lung development. Conclusions MetaMapp graphs efficiently visualize mass spectrometry based metabolomics datasets as network graphs in Cytoscape, and highlight metabolic alterations that can be associated with a higher rate of pulmonary diseases and infections in children prenatally exposed to ETS. The MetaMapp scripts can be accessed at http://metamapp.fiehnlab.ucdavis.edu. PMID:22591066
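The Tanimoto chemical similarity mentioned above can be illustrated with a small sketch that turns pairwise similarity scores into network edges. It is not the MetaMapp implementation: fingerprints are represented here as plain sets of bit indices, and the function names, the threshold, and the toy fingerprints are assumptions.

```python
from itertools import combinations

def tanimoto(bits_a, bits_b):
    """Tanimoto coefficient of two fingerprints given as sets of 'on' bit indices."""
    a, b = set(bits_a), set(bits_b)
    union = a | b
    if not union:
        return 0.0  # convention chosen here for two empty fingerprints
    return len(a & b) / len(union)

def similarity_edges(fingerprints, threshold):
    """Return (name_1, name_2, score) for every pair scoring at or above threshold."""
    edges = []
    for n1, n2 in combinations(sorted(fingerprints), 2):
        score = tanimoto(fingerprints[n1], fingerprints[n2])
        if score >= threshold:
            edges.append((n1, n2, score))
    return edges

# Hypothetical fingerprints, not real metabolite data.
fingerprints = {"m1": {1, 2, 3, 4}, "m2": {1, 2, 3, 5}, "m3": {7, 8}}
edges = similarity_edges(fingerprints, threshold=0.5)
```

An edge list of this shape can be written out as a simple interaction table and loaded into a network viewer such as Cytoscape.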
Molecular dysexpression in gastric cancer revealed by integrated analysis of transcriptome data.
Li, Xiaomei; Dong, Weiwei; Qu, Xueling; Zhao, Huixia; Wang, Shuo; Hao, Yixin; Li, Qiuwen; Zhu, Jianhua; Ye, Min; Xiao, Wenhua
2017-05-01
Gastric cancer (GC) is often diagnosed in the advanced stages and is associated with a poor prognosis. In-depth understanding of the molecular mechanisms of GC has lagged behind that of other cancers. This study aimed to identify candidate biomarkers for GC. An integrated analysis of microarray datasets was performed to identify differentially expressed genes (DEGs) between GC and normal tissues. Gene ontology and Kyoto Encyclopedia of Genes and Genomes (KEGG) enrichment analyses were then performed to identify the functions of the DEGs. Furthermore, a protein-protein interaction (PPI) network of the DEGs was constructed. The expression levels of the DEGs were validated in human GC tissues using reverse transcription-quantitative polymerase chain reaction (RT-qPCR). A set of 689 DEGs was identified in GC tissues, as compared with normal tissues, including 202 upregulated DEGs and 487 downregulated DEGs. The KEGG pathway analysis suggested that various pathways may play important roles in the pathology of GC, including pathways related to protein digestion and absorption, extracellular matrix-receptor interaction, and the metabolism of xenobiotics by cytochrome P450. The PPI network analysis indicated that the significant hub proteins consisted of SPP1, TOP2A and ARPC1B. RT-qPCR validation indicated that the expression levels of the top 10 most significantly dysexpressed genes were consistent with the results of the integrated analysis. The present study yielded a reference list of reliable DEGs, which represents a robust pool of candidates for further evaluation of GC pathogenesis and treatment.
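The abstract above does not state which statistical test underlies the KEGG enrichment analysis. A common choice for this kind of over-representation analysis is the one-sided hypergeometric test, sketched below under that assumption; the function name and the counts in the example are hypothetical.

```python
from math import comb

def enrichment_p_value(n_background, n_pathway, n_selected, n_overlap):
    """P(X >= n_overlap) for a hypergeometric draw.

    n_background: annotated genes in total
    n_pathway:    background genes belonging to the pathway
    n_selected:   size of the DEG list
    n_overlap:    DEGs that fall in the pathway
    """
    total = comb(n_background, n_selected)
    upper = min(n_pathway, n_selected)
    tail = sum(
        comb(n_pathway, i) * comb(n_background - n_pathway, n_selected - i)
        for i in range(n_overlap, upper + 1)
    )
    return tail / total

# Hypothetical counts: 10 background genes, 5 in the pathway, 5 DEGs, all 5 overlapping.
p_value = enrichment_p_value(10, 5, 5, 5)
```

In practice such p-values are computed for every pathway and then corrected for multiple testing, a step this sketch leaves out.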
Zhang, Lei; Ma, Shiyun; Wang, Huailiang; Su, Hang; Su, Ke; Li, Longjie
2017-11-15
The purpose of our study was to identify new pathogenic genes used for exploring the pathogenesis of rheumatoid arthritis (RA). To screen pathogenic genes of RA, an integrated analysis was performed by using the microarray datasets in RA derived from the Gene Expression Omnibus (GEO) database. The functional annotation and potential pathways of differentially expressed genes (DEGs) were further discovered by Gene Ontology (GO) and Kyoto Encyclopedia of Genes and Genomes (KEGG) enrichment analysis. Afterwards, the integrated analysis of DNA methylation and gene expression profiling was used to screen crucial genes. In addition, we used RT-PCR and MSP to verify the expression levels and methylation status of these crucial genes in 20 synovial biopsy samples obtained from 10 RA model mice and 10 normal mice. BCL11B, CCDC88C, FCRLA and APOL6 were both up-regulated and hypomethylated in RA according to integrated analysis, RT-PCR and MSP verification. Four crucial genes (BCL11B, CCDC88C, FCRLA and APOL6) identified and analyzed in this study might be closely connected with the pathogenesis of RA. Copyright © 2017. Published by Elsevier B.V.
Protein-protein interaction network of gene expression in the hydrocortisone-treated keloid.
Chen, Rui; Zhang, Zhiliang; Xue, Zhujia; Wang, Lin; Fu, Mingang; Lu, Yi; Bai, Ling; Zhang, Ping; Fan, Zhihong
2015-01-01
In order to explore the molecular mechanism of hydrocortisone in keloid tissue, the gene expression profiles of keloid samples treated with hydrocortisone were subjected to bioinformatics analysis. Firstly, the gene expression profiles (GSE7890) of five samples of keloid treated with hydrocortisone and five untreated keloid samples were downloaded from the Gene Expression Omnibus (GEO) database. Secondly, data were preprocessed using packages in R language and differentially expressed genes (DEGs) were screened using a significance analysis of microarrays (SAM) protocol. Thirdly, the DEGs were subjected to gene ontology (GO) function and KEGG pathway enrichment analysis. Finally, the interactions of DEGs in samples of keloid treated with hydrocortisone were explored in a human protein-protein interaction (PPI) network, and sub-modules of the DEGs interaction network were analyzed using Cytoscape software. Based on the analysis, 572 DEGs in the hydrocortisone-treated samples were screened; most of these were involved in the signal transduction and cell cycle. Furthermore, three critical genes in the module, including COL1A1, NID1, and PRELP, were screened in the PPI network analysis. These findings enhance understanding of the pathogenesis of the keloid and provide references for keloid therapy. © 2015 The International Society of Dermatology.
Yu, Liang; Wang, Bingbo; Ma, Xiaoke; Gao, Lin
2016-12-23
Extracting drug-disease correlations is crucial in unveiling disease mechanisms, as well as discovering new indications of available drugs, or drug repositioning. Both the interactome and the knowledge of disease-associated and drug-associated genes remain incomplete. We present a new method to predict the associations between drugs and diseases. Our method is based on a module distance, which was originally proposed to calculate distances between modules in the incomplete human interactome. We first map all the disease genes and drug genes to a combined protein interaction network. Then, based on the module distance, we calculate the distances between drug gene sets and disease gene sets, and take these distances as measures of the relationships between drug-disease pairs. We also filter possible false positive drug-disease correlations by p-value. Finally, we validate the top-100 drug-disease associations related to six drugs in the predicted results. The overlap between our predicted correlations and those reported in the Comparative Toxicogenomics Database (CTD) and the literature, together with their enriched Kyoto Encyclopedia of Genes and Genomes (KEGG) pathways, demonstrates that our approach can not only effectively identify new drug indications but also provide new insight into drug-disease discovery.
Employing conservation of co-expression to improve functional inference
Daub, Carsten O; Sonnhammer, Erik LL
2008-01-01
Background Observing co-expression between genes suggests that they are functionally coupled. Co-expression of orthologous gene pairs across species may improve function prediction beyond the level achieved in a single species. Results We used orthology between genes of the three different species S. cerevisiae, D. melanogaster, and C. elegans to combine co-expression across two species at a time. This led to increased function prediction accuracy when we incorporated expression data from either of the other two species, and accuracy increased even further when conservation across both of the other two species was considered at the same time. Employing the conservation across species to incorporate abundant model organism data for the prediction of protein interactions in poorly characterized species constitutes a very powerful annotation method. Conclusion To be able to employ the most suitable co-expression distance measure for our analysis, we evaluated the ability of four popular gene co-expression distance measures to detect biologically relevant interactions between pairs of genes. For the expression datasets employed in our co-expression conservation analysis above, we used the GO and the KEGG PATHWAY databases as gold standards. While the differences between distance measures were small, Spearman correlation gave the most robust results. PMID:18808668
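A Spearman-based co-expression distance of the kind evaluated above can be sketched in a few lines. The abstract does not give the exact definition used, so the form "one minus the rank correlation" is an assumption, as are the function names and the toy expression profiles.

```python
def average_ranks(values):
    """1-based ranks, with tied values sharing the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks

def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = sum((a - mean_x) ** 2 for a in x) ** 0.5
    sd_y = sum((b - mean_y) ** 2 for b in y) ** 0.5
    if sd_x == 0 or sd_y == 0:
        raise ValueError("correlation is undefined for a constant profile")
    return cov / (sd_x * sd_y)

def spearman_distance(x, y):
    """Assumed distance: 1 - Spearman rho (0 = same ordering, 2 = reversed)."""
    return 1.0 - pearson(average_ranks(x), average_ranks(y))

# Hypothetical expression profiles of two genes across five conditions.
gene_a = [1.0, 2.0, 3.0, 4.0, 5.0]
gene_b = [2.0, 4.0, 8.0, 16.0, 32.0]
distance = spearman_distance(gene_a, gene_b)
```

Because only ranks enter the calculation, the nonlinear but monotonic relationship between the two toy profiles still yields a distance of essentially zero.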
The Human Cell Surfaceome of Breast Tumors
da Cunha, Júlia Pinheiro Chagas; Galante, Pedro Alexandre Favoretto; de Souza, Jorge Estefano Santana; Pieprzyk, Martin; Carraro, Dirce Maria; Old, Lloyd J.; Camargo, Anamaria Aranha; de Souza, Sandro José
2013-01-01
Introduction. Cell surface proteins are ideal targets for cancer therapy and diagnosis. We have identified a set of more than 3700 genes that code for transmembrane proteins believed to be at the human cell surface. Methods. We used a high-throughput qPCR system for the analysis of 573 cell surface protein-coding genes in 12 primary breast tumors, 8 breast cell lines, and 21 normal human tissues including breast. To better understand the role of these genes in breast tumors, we used a series of bioinformatics strategies to integrate different types of datasets, such as KEGG, protein-protein interaction databases, ONCOMINE, and data from the literature. Results. We found that at least 77 genes are overexpressed in breast primary tumors, while at least 2 of them also have a restricted expression pattern in normal tissues. We found common signaling pathways that may be regulated in breast tumors through the overexpression of these cell surface protein-coding genes. Furthermore, a comparison was made between the genes found in this report and other genes associated with features clinically relevant for breast tumorigenesis. Conclusions. The expression profiling generated in this study, together with an integrative bioinformatics analysis, allowed us to identify putative targets for breast tumors. PMID:24195083
Huang, Xiaoyun; Zang, Xiaonan; Wu, Fei; Jin, Yuming; Wang, Haitao; Liu, Chang; Ding, Yating; He, Bangxiang; Xiao, Dongfang; Song, Xinwei; Liu, Zhu
2017-01-01
Gracilariopsis lemaneiformis (aka Gracilaria lemaneiformis) is a red macroalga rich in phycoerythrin, which can capture light efficiently and transfer it to photosystem II. However, little is known about the synthesis of optically active phycoerythrin in G. lemaneiformis at the molecular level. With the advent of high-throughput sequencing technology, analysis of genetic information for G. lemaneiformis by transcriptome sequencing is an effective means to get a deeper insight into the molecular mechanism of phycoerythrin synthesis. Illumina technology was employed to sequence the transcriptome of two strains of G. lemaneiformis: the wild type and a green-pigmented mutant. We obtained a total of 86915 assembled unigenes as a reference gene set, and 42884 unigenes were annotated in at least one public database. Taking the above transcriptome sequencing as a reference gene set, 4041 differentially expressed genes were screened to analyze and compare the gene expression profiles of the wild type and green mutant. By GO and KEGG pathway analysis, we concluded that three factors, including a reduction in the expression level of apo-phycoerythrin, an increase of chlorophyll light-harvesting complex synthesis, and reduction of phycoerythrobilin by competitive inhibition, caused the reduction of optically active phycoerythrin in the green-pigmented mutant.
Characterization of gonadal transcriptomes from the turbot (Scophthalmus maximus).
Hu, Yulong; Huang, Meng; Wang, Weiji; Guan, Jiantao; Kong, Jie
2016-01-01
The mechanisms underlying sexual reproduction and sex ratio determination remain unclear in turbot, a flatfish of great commercial value. In addition, there is limited information in the turbot database regarding genes related to the reproductive system. Here, we conducted high-throughput transcriptome profiling of turbot gonad tissues to better understand their reproductive functions and to supply essential gene sequence information for marker-assisted selection programs in the turbot industry. In this study, two gonad libraries representing sex differences in Scophthalmus maximus yielded 453 818 high-quality reads that were assembled into 24 611 contigs and 33 713 singletons by using 454 pyrosequencing, 13 936 contigs and singletons (CS) of which were annotated using BLASTx. GO (Gene Ontology) and KEGG (Kyoto Encyclopedia of Genes and Genomes) pathway analyses revealed that various biological functions and processes were associated with many of the annotated CS. Expression analyses showed that 510 genes were differentially expressed in males versus females; 80% of these genes were annotated. In addition, 6484 and 6036 single nucleotide polymorphisms (SNPs) were identified in male and female libraries, respectively. This transcriptome resource will serve as the foundation for cDNA or SNP microarray construction, gene expression characterization, and sex-specific linkage mapping in turbot.
NASA Astrophysics Data System (ADS)
Fang, Hua; Xu, Tianheng; Cao, Duantao; Cheng, Longyin; Yu, Yunlong
2016-08-01
A novel bacterium capable of utilizing metamitron as the sole source of carbon and energy was isolated from contaminated soil and identified as Rhodococcus sp. MET based on its morphological characteristics, BIOLOG GP2 microplate profile, and 16S rDNA phylogeny. Genome sequencing and functional annotation of the isolate MET showed a 6,340,880 bp genome with a 62.47% GC content and 5,987 protein-coding genes. In total, 5,907 genes were annotated with the COG, GO, KEGG, Pfam, Swiss-Prot, TrEMBL, and nr databases. The degradation rate of metamitron by the isolate MET obviously increased with increasing substrate concentrations from 1 to 10 mg/l and subsequently decreased at 100 mg/l. The optimal pH and temperature for metamitron biodegradation were 7.0 and 20-30 °C, respectively. Based on genome annotation of the metamitron degradation genes and the metabolites detected by HPLC-MS/MS, the following metamitron biodegradation pathways were proposed: 1) Metamitron was transformed into 2-(3-hydrazinyl-2-ethyl)-hydrazono-2-phenylacetic acid by triazinone ring cleavage and further mineralization; 2) Metamitron was converted into 3-methyl-4-amino-6(2-hydroxy-muconic acid)-1,2,4-triazine-5(4H)-one by phenyl ring cleavage and further mineralization. The coexistence of diverse mineralization pathways indicates that our isolate may effectively bioremediate triazinone herbicide-contaminated soils.
DIANA-microT web server: elucidating microRNA functions through target prediction.
Maragkakis, M; Reczko, M; Simossis, V A; Alexiou, P; Papadopoulos, G L; Dalamagas, T; Giannopoulos, G; Goumas, G; Koukis, E; Kourtis, K; Vergoulis, T; Koziris, N; Sellis, T; Tsanakas, P; Hatzigeorgiou, A G
2009-07-01
Computational microRNA (miRNA) target prediction is one of the key means for deciphering the role of miRNAs in development and disease. Here, we present the DIANA-microT web server as the user interface to the DIANA-microT 3.0 miRNA target prediction algorithm. The web server provides extensive information for predicted miRNA:target gene interactions with a user-friendly interface, providing extensive connectivity to online biological resources. Target gene and miRNA functions may be elucidated through automated bibliographic searches, and functional information is accessible through Kyoto Encyclopedia of Genes and Genomes (KEGG) pathways. The web server offers links to nomenclature, sequence and protein databases, and users can search for targeted genes using different nomenclatures or functional features, such as a gene's possible involvement in biological pathways. The target prediction algorithm supports parameters calculated individually for each miRNA:target gene interaction and provides a signal-to-noise ratio and a precision score that help in the evaluation of the significance of the predicted results. Using a set of miRNA targets recently identified through the pSILAC method, the performance of several computational target prediction programs was assessed. DIANA-microT 3.0 achieved the highest ratio of correctly predicted targets over all predicted targets, at 66%. The DIANA-microT web server is freely available at www.microrna.gr/microT.
Li, Xiaoying; Korir, Nicholas Kibet; Liu, Lili; Shangguan, Lingfei; Wang, Yuzhu; Han, Jian; Chen, Ming; Fang, Jinggui
2012-11-15
Microarray analysis is a technique that can be employed to provide expression profiles of single genes and new insights to elucidate the biological mechanisms responsible for fruit development. To evaluate expression of genes mostly engaged in fruit development between Prunus mume and Prunus armeniaca, we first identified differentially expressed transcripts along the entire fruit life cycle by using microarrays spotted with 10,641 ESTs collected from P. mume and other Prunus EST sequences. A total of 1418 ESTs were selected after quality control of microarray spots and analysis for differential gene expression patterns during fruit development of P. mume and P. armeniaca. From these, 707 up-regulated and 711 down-regulated genes showing more than two-fold differences in expression level were annotated by GO based on biological processes, molecular functions and cellular components. These differentially expressed genes were found to be involved in several important pathways of carbohydrate, galactose, and starch and sucrose metabolism as well as in biosynthesis of other secondary metabolites via KEGG. This could provide detailed information on the fruit quality differences during development and ripening of these two species. With the results obtained, we provide a practical database for comprehensive understanding of molecular events during fruit development and also lay a theoretical foundation for the cloning of genes regulating a series of important rate-limiting enzymes involved in vital metabolic pathways during fruit development. Copyright © 2012 Elsevier GmbH. All rights reserved.
Rebeca, Carballar-Lejarazú; Zhu, Xiaoli; Guo, Yajie; Lin, Qiannan; Hu, Xia; Wang, Rong; Liang, Guanghong; Guan, Xiong
2017-01-01
The pine aphid Cinara pinitabulaeformis Zhang et Zhang is the main pine pest in China; it causes pine needles to produce dense dew (honeydew), which can lead to sooty mold (black filamentous saprophytic ascomycetes). Although common chemical and physical strategies are used to prevent the disease caused by C. pinitabulaeformis Zhang et Zhang, new strategies based on biological and/or genetic approaches are promising to control and eradicate the disease. However, there is no information about genomics, proteomics or transcriptomics to allow the design of new control strategies for this pine aphid. We used next generation sequencing technology to sequence the transcriptome of C. pinitabulaeformis Zhang et Zhang and built a transcriptome database. We identified 80,259 unigenes assigned to Gene Ontology (GO) terms, and information for a total of 11,609 classified unigenes was obtained in the Clusters of Orthologous Groups (COGs). A total of 10,806 annotated unigenes were analyzed to identify the represented biological pathways; among them, 8,845 unigenes matched with 228 KEGG pathways. In addition, our data describe propagative viruses, nutrition-related genes, detoxification-related molecules, olfactory-related receptors, stress-related proteins, putative insecticide resistance genes and possible insecticide targets. Moreover, this study provides valuable information about putative insecticide resistance related genes and for the design of new genetic/biological based strategies to manage and control C. pinitabulaeformis Zhang et Zhang populations. PMID:28570707
Jiang, Hai-Qiang; Li, Yun-Lun; Xie, Jun
2012-03-01
The aims of this study were to examine the changes in urine metabolites in hypertension patients with ascendant hyperactivity of Gan yang syndrome (AHGYS) and to explore the essence of this syndrome in such patients. Ten typical hypertension patients with AHGYS were recruited as the patient group, and twelve healthy volunteers were recruited as the normal group. Urine metabolite profiles were acquired by high performance liquid chromatography coupled with time of flight mass spectrometry (HPLC-TOFMS). Principal component analysis (PCA) and partial least-squares discriminant analysis (PLS-DA) were performed using SIMCA-P software. The differential metabolites in the urine were identified, and the possibly relevant metabolic pathways were interpreted. PCA of the urine samples showed that the patient group and the normal group could be clearly separated in the score plot. Compared with the normal group, body metabolism changed significantly in the patient group. The metabolites relevant to hypertension patients with AHGYS were determined using PLS-DA. The structures and metabolic pathways of fifteen compounds were confirmed by querying the KEGG database; these mainly included amino acids, free fatty acids, and sphingosine, among others. Studying hypertension patients with AHGYS by HPLC-TOFMS combined with pattern recognition revealed small-molecule metabolic markers at the microscopic level, which is advantageous for probing the biological nature of Chinese medicine syndromes.
Praveen, Paurush; Fröhlich, Holger
2013-01-01
Inferring regulatory networks from experimental data via probabilistic graphical models is a popular framework to gain insights into biological systems. However, the inherent noise in experimental data coupled with a limited sample size reduces the performance of network reverse engineering. Prior knowledge from existing sources of biological information can address this low signal to noise problem by biasing the network inference towards biologically plausible network structures. Although integrating various sources of information is desirable, their heterogeneous nature makes this task challenging. We propose two computational methods to incorporate various information sources into a probabilistic consensus structure prior to be used in graphical model inference. Our first model, called Latent Factor Model (LFM), assumes a high degree of correlation among external information sources and reconstructs a hidden variable as a common source in a Bayesian manner. The second model, a Noisy-OR, picks up the strongest support for an interaction among information sources in a probabilistic fashion. Our extensive computational studies on KEGG signaling pathways as well as on gene expression data from breast cancer and yeast heat shock response reveal that both approaches can significantly enhance the reconstruction accuracy of Bayesian Networks compared to other competing methods as well as to the situation without any prior. Our framework allows for using diverse information sources, like pathway databases, GO terms and protein domain data, etc. and is flexible enough to integrate new sources, if available. PMID:23826291
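The Noisy-OR idea mentioned above can be illustrated with the textbook form of the model, in which an interaction is supported unless every information source independently fails to support it. The paper's exact parameterization is not given in the abstract, so the formula below, the function name, and the confidence values are assumptions.

```python
def noisy_or(confidences):
    """Combine per-source confidences p_i into 1 - prod(1 - p_i).

    Each p_i is an assumed probability, in [0, 1], that one source
    supports the interaction.
    """
    prob_no_support = 1.0
    for p in confidences:
        if not 0.0 <= p <= 1.0:
            raise ValueError("confidences must lie in [0, 1]")
        prob_no_support *= 1.0 - p
    return 1.0 - prob_no_support

# Hypothetical confidences from three sources (e.g. pathway, GO, domain data).
prior_edge_probability = noisy_or([0.9, 0.1, 0.0])
```

A single strongly supporting source dominates the result (0.9 alone already gives 0.9), which matches the abstract's description of picking up the strongest support among the sources.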
Wang, Yanjie; Dong, Chunlan; Xue, Zeyun; Jin, Qijiang; Xu, Yingchun
2016-01-15
Paeonia ostii, an important ornamental and medicinal plant, grows normally on copper (Cu) mines with widespread Cu contamination of soils, and it has the ability to lower Cu contents in the Cu-contaminated soils. However, very little molecular information concerned with Cu resistance of P. ostii is available. In this study, high-throughput de novo transcriptome sequencing was carried out for P. ostii with and without Cu treatment using Illumina HiSeq 2000 platform. A total of 77,704 All-unigenes were obtained with a mean length of 710 bp. Of these unigenes, 47,461 were annotated with public databases based on sequence similarities. Comparative transcript profiling allowed the discovery of 4324 differentially expressed genes (DEGs), with 2207 up-regulated and 2117 down-regulated unigenes in Cu-treated library as compared to the control counterpart. Based on these DEGs, Gene Ontology (GO) enrichment analysis indicated Cu stress-relevant terms, such as 'membrane' and 'antioxidant activity'. Meanwhile, Kyoto Encyclopedia of Genes and Genomes (KEGG) pathway enrichment analysis uncovered some important pathways, including 'biosynthesis of secondary metabolites' and 'metabolic pathways'. In addition, expression patterns of 12 selected DEGs derived from quantitative real-time polymerase chain reaction (qRT-PCR) were consistent with their transcript abundance changes obtained by transcriptomic analyses, suggesting that all the 12 genes were authentically involved in Cu tolerance in P. ostii. This is the first report to identify genes related to Cu stress responses in P. ostii, which could offer valuable information on the molecular mechanisms of Cu resistance, and provide a basis for further genomics research on this and related ornamental species for phytoremediation. Copyright © 2015 Elsevier B.V. All rights reserved.
Zhu, Baojie; Cao, Huiting; Sun, Limin; Li, Bo; Guo, Liwei; Duan, Jinao; Zhu, Huaxu; Zhang, Qichun
2018-04-24
Huang-Lian Jie-Du decoction (HLJDD), a traditional formula of Chinese medicine constituted with Rhizoma Coptidis, Radix Scutellariae, Cortex Phellodendri amurensis and Fructus Gardeniae, exhibits unambiguous therapeutic effect on cerebral ischemia via multi-target action. Further investigation, however, is still required to explore the relationship between those mechanisms and targets through system approaches. Cerebral ischemia was induced in rats by middle cerebral artery occlusion (MCAO) with reperfusion. Following evaluation of pharmacological actions of HLJDD on MCAO rats, the plasma samples from rats of control, MCAO and HLJDD-treated MCAO groups were prepared strictly and subjected to ultra-performance liquid chromatography quadrupole time of flight mass spectrometry for metabolite analysis. The raw mass data were imported to MassLynx software for peak detection and alignment, and further introduced to EZinfo 2.0 software for orthogonal projection to latent structures analysis, principal component analysis and partial least-squares-discriminant analysis. Metabolic pathway analysis of the potential biomarkers was performed with MetaboAnalyst using the online databases HMDB, Metlin, KEGG and SMPD. The metabolic pathways of interest were further investigated via biochemical assay. HLJDD ameliorated the MCAO-induced cerebral damage and blocked the severe inflammation response. Nineteen different biomarkers were identified among control, MCAO and HLJDD-treated MCAO groups. Ten metabolic pathways were proposed from these significant metabolites. Together with the biochemical assays of cerebral tissue, these results indicate that modulation of metabolic stress, regulation of the glutamate/GABA-glutamine cycle and enhancement of cholinergic neuron function are involved in the actions of HLJDD on cerebral ischemia.
HLJDD achieves therapeutic action on cerebral ischemia via coordinating the basic pathophysiological network of metabolic stress, glutamate metabolism, and acetylcholine levels and function. Copyright © 2018 Elsevier B.V. All rights reserved.
Large-Scale Event Extraction from Literature with Multi-Level Gene Normalization
Wei, Chih-Hsuan; Hakala, Kai; Pyysalo, Sampo; Ananiadou, Sophia; Kao, Hung-Yu; Lu, Zhiyong; Salakoski, Tapio; Van de Peer, Yves; Ginter, Filip
2013-01-01
Text mining for the life sciences aims to aid database curation, knowledge summarization and information retrieval through the automated processing of biomedical texts. To provide comprehensive coverage and enable full integration with existing biomolecular database records, it is crucial that text mining tools scale up to millions of articles and that their analyses can be unambiguously linked to information recorded in resources such as UniProt, KEGG, BioGRID and NCBI databases. In this study, we investigate how fully automated text mining of complex biomolecular events can be augmented with a normalization strategy that identifies biological concepts in text, mapping them to identifiers at varying levels of granularity, ranging from canonicalized symbols to unique gene and proteins and broad gene families. To this end, we have combined two state-of-the-art text mining components, previously evaluated on two community-wide challenges, and have extended and improved upon these methods by exploiting their complementary nature. Using these systems, we perform normalization and event extraction to create a large-scale resource that is publicly available, unique in semantic scope, and covers all 21.9 million PubMed abstracts and 460 thousand PubMed Central open access full-text articles. This dataset contains 40 million biomolecular events involving 76 million gene/protein mentions, linked to 122 thousand distinct genes from 5032 species across the full taxonomic tree. Detailed evaluations and analyses reveal promising results for application of this data in database and pathway curation efforts. The main software components used in this study are released under an open-source license. Further, the resulting dataset is freely accessible through a novel API, providing programmatic and customized access (http://www.evexdb.org/api/v001/). 
Finally, to allow for large-scale bioinformatic analyses, the entire resource is available for bulk download from http://evexdb.org/download/, under the Creative Commons – Attribution – Share Alike (CC BY-SA) license. PMID:23613707
Deep Sequencing-Based Analysis of the Cymbidium ensifolium Floral Transcriptome
Li, Xiaobai; Luo, Jie; Yan, Tianlian; Xiang, Lin; Jin, Feng; Qin, Dehui; Sun, Chongbo; Xie, Ming
2013-01-01
Cymbidium ensifolium is a Chinese Cymbidium with an elegant shape, beautiful appearance, and a fragrant aroma. C. ensifolium has a long history of cultivation in China and it has excellent commercial value as a potted plant and cut flower. The development of C. ensifolium genomic resources has been delayed because of its large genome size. Taking advantage of technical and cost improvement of RNA-Seq, we extracted total mRNA from flower buds and mature flowers and obtained a total of 9.52 Gb of filtered nucleotides comprising 98,819,349 filtered reads. The filtered reads were assembled into 101,423 isotigs, representing 51,696 genes. Of the 101,423 isotigs, 41,873 were putative homologs of annotated sequences in the public databases, of which 158 were associated with floral development and 119 were associated with flowering. The isotigs were categorized according to their putative functions. In total, 10,212 of the isotigs were assigned into 25 eukaryotic orthologous groups (KOGs), 41,690 into 58 gene ontology (GO) terms, and 9,830 into 126 Arabidopsis Kyoto Encyclopedia of Genes and Genomes (KEGG) pathways, and 9,539 isotigs into 123 rice pathways. Comparison of the isotigs with those of the two related orchid species P. equestris and C. sinense showed that 17,906 isotigs are unique to C. ensifolium. In addition, a total of 7,936 SSRs and 16,676 putative SNPs were identified. To our knowledge, this transcriptome database is the first major genomic resource for C. ensifolium and the most comprehensive transcriptomic resource for genus Cymbidium. These sequences provide valuable information for understanding the molecular mechanisms of floral development and flowering. Sequences predicted to be unique to C. ensifolium would provide more insights into C. ensifolium gene diversity. The numerous SNPs and SSRs identified in the present study will contribute to marker development for C. ensifolium. PMID:24392013
Wei, Lin; Li, Shenghua; Liu, Shenggui; He, Anna; Wang, Dan; Wang, Jie; Tang, Yulian; Wu, Xianjin
2014-01-01
Houttuynia cordata Thunb. is an important traditional medical herb in China and other Asian countries, with high medicinal and economic value. However, a lack of available genomic information has become a limitation for research on this species. Thus, we carried out high-throughput transcriptomic sequencing of H. cordata to generate an enormous transcriptome sequence dataset for gene discovery and molecular marker development. Illumina paired-end sequencing technology produced over 56 million sequencing reads from H. cordata mRNA. Subsequent de novo assembly yielded 63,954 unigenes, 39,982 (62.52%) and 26,122 (40.84%) of which had significant similarity to proteins in the NCBI nonredundant protein and Swiss-Prot databases (E-value <10(-5)), respectively. Of these annotated unigenes, 30,131 and 15,363 unigenes were assigned to gene ontology categories and clusters of orthologous groups, respectively. In addition, 24,434 (38.21%) unigenes were mapped onto 128 pathways using the KEGG pathway database and 17,964 (44.93%) unigenes showed homology to Vitis vinifera (Vitaceae) genes in BLASTx analysis. Furthermore, 4,800 cDNA SSRs were identified as potential molecular markers. Fifty primer pairs were randomly selected to detect polymorphism among 30 samples of H. cordata; 43 (86%) produced fragments of expected size, suggesting that the unigenes were suitable for specific primer design and of high quality, and the SSR marker could be widely used in marker-assisted selection and molecular breeding of H. cordata in the future. This is the first application of Illumina paired-end sequencing technology to investigate the whole transcriptome of H. cordata and to assemble RNA-seq reads without a reference genome. These data should help researchers investigating the evolution and biological processes of this species. The SSR markers developed can be used for construction of high-resolution genetic linkage maps and for gene-based association analyses in H. cordata. 
This work will enable future functional genomic research and research into the distinctive active constituents of this genus.
Transcriptome profiling of pumpkin (Cucurbita moschata Duch.) leaves infected with powdery mildew
Chen, Bi-Hua; Chen, Xue-Jin; Guo, Yan-Yan; Yang, He-Lian; Li, Xin-Zheng; Wang, Guang-Yin
2018-01-01
Cucurbit powdery mildew (PM) is one of the most severe fungal diseases, but the molecular mechanisms underlying PM resistance remain largely unknown, especially in pumpkin (Cucurbita moschata Duch.). The goal of this study was to identify gene expression differences in PM-treated plants (harvested at 24 h and 48 h after inoculation) and untreated (control) plants of inbred line “112–2” using RNA sequencing (RNA-Seq). The inbred line “112–2” has been purified over 8 consecutive generations of self-pollination and shows high resistance to PM. More than 7600 transcripts were examined in pumpkin leaves, and 3129 and 3080 differentially expressed genes (DEGs) were identified in inbred line “112–2” at 24 and 48 hours post inoculation (hpi), respectively. Based on the KEGG (Kyoto Encyclopedia of Genes and Genomes) pathway database and GO (Gene Ontology) database, a complex regulatory network for PM resistance that may involve hormone signal transduction pathways, transcription factors and defense responses was revealed at the transcription level. In addition, the expression profiles of 16 selected genes were analyzed using quantitative RT-PCR. Among these genes, the transcript levels of 6 DEGs, including bHLH87 (Basic Helix-loop-helix transcription factor), ERF014 (Ethylene response factor), WRKY21 (WRKY domain), HSF (heat stress transcription factor A), MLO3 (Mildew Locus O), and SGT1 (Suppressor of G-Two Allele of Skp1), in PM-resistant “112–2” were found to be significantly up- or down-regulated both before 9 hpi and at 24 hpi or 48 hpi; this behavior differed from that observed in the PM-susceptible material (cultivar “Jiujiangjiaoding”). The transcriptome data provide novel insights into the response of Cucurbita moschata to PM stress and are expected to be highly useful for dissecting PM defense mechanisms in this major vegetable and for improving pumpkin breeding with enhanced resistance to PM. PMID:29320569
Transcriptome profiling of pumpkin (Cucurbita moschata Duch.) leaves infected with powdery mildew.
Guo, Wei-Li; Chen, Bi-Hua; Chen, Xue-Jin; Guo, Yan-Yan; Yang, He-Lian; Li, Xin-Zheng; Wang, Guang-Yin
2018-01-01
Cucurbit powdery mildew (PM) is one of the most severe fungal diseases, but the molecular mechanisms underlying PM resistance remain largely unknown, especially in pumpkin (Cucurbita moschata Duch.). The goal of this study was to identify gene expression differences in PM-treated plants (harvested at 24 h and 48 h after inoculation) and untreated (control) plants of inbred line "112-2" using RNA sequencing (RNA-Seq). The inbred line "112-2" has been purified over 8 consecutive generations of self-pollination and shows high resistance to PM. More than 7600 transcripts were examined in pumpkin leaves, and 3129 and 3080 differentially expressed genes (DEGs) were identified in inbred line "112-2" at 24 and 48 hours post inoculation (hpi), respectively. Based on the KEGG (Kyoto Encyclopedia of Genes and Genomes) pathway database and GO (Gene Ontology) database, a complex regulatory network for PM resistance that may involve hormone signal transduction pathways, transcription factors and defense responses was revealed at the transcription level. In addition, the expression profiles of 16 selected genes were analyzed using quantitative RT-PCR. Among these genes, the transcript levels of 6 DEGs, including bHLH87 (Basic Helix-loop-helix transcription factor), ERF014 (Ethylene response factor), WRKY21 (WRKY domain), HSF (heat stress transcription factor A), MLO3 (Mildew Locus O), and SGT1 (Suppressor of G-Two Allele of Skp1), in PM-resistant "112-2" were found to be significantly up- or down-regulated both before 9 hpi and at 24 hpi or 48 hpi; this behavior differed from that observed in the PM-susceptible material (cultivar "Jiujiangjiaoding"). The transcriptome data provide novel insights into the response of Cucurbita moschata to PM stress and are expected to be highly useful for dissecting PM defense mechanisms in this major vegetable and for improving pumpkin breeding with enhanced resistance to PM.
Wen, Dong-Yue; Lin, Peng; Pang, Yu-Yan; Chen, Gang; He, Yun; Dang, Yi-Wu; Yang, Hong
2018-05-05
BACKGROUND Long non-coding RNAs (lncRNAs) have a role in physiological and pathological processes, including cancer. The aim of this study was to investigate the expression of the long intergenic non-protein coding RNA 665 (LINC00665) gene and the cell cycle in hepatocellular carcinoma (HCC) using database analysis including The Cancer Genome Atlas (TCGA), the Gene Expression Omnibus (GEO), and quantitative real-time polymerase chain reaction (qPCR). MATERIAL AND METHODS Expression levels of LINC00665 were compared between human tissue samples of HCC and adjacent normal liver, clinicopathological correlations were made using TCGA and the GEO, and qPCR was performed to validate the findings. Other public databases were searched for other genes associated with LINC00665 expression, including The Atlas of Noncoding RNAs in Cancer (TANRIC), the Multi Experiment Matrix (MEM), Gene Ontology (GO), Kyoto Encyclopedia of Genes and Genomes (KEGG) and protein-protein interaction (PPI) networks. RESULTS Overexpression of LINC00665 in patients with HCC was significantly associated with gender, tumor grade, stage, and tumor cell type. Overexpression of LINC00665 in patients with HCC was significantly associated with overall survival (OS) (HR=1.477; 95% CI: 1.046-2.086). Bioinformatics analysis identified 469 related genes and further analysis supported a hypothesis that LINC00665 regulates pathways in the cell cycle to facilitate the development and progression of HCC through ten identified core genes: CDK1, BUB1B, BUB1, PLK1, CCNB2, CCNB1, CDC20, ESPL1, MAD2L1, and CCNA2. CONCLUSIONS Overexpression of the lncRNA LINC00665 may be involved in the regulation of cell cycle pathways in HCC through ten identified hub genes.
Comparative transcriptome analysis of papilla and skin in the sea cucumber, Apostichopus japonicus.
Zhou, Xiaoxu; Cui, Jun; Liu, Shikai; Kong, Derong; Sun, He; Gu, Chenlei; Wang, Hongdi; Qiu, Xuemei; Chang, Yaqing; Liu, Zhanjiang; Wang, Xiuli
2016-01-01
Papilla and skin are two important organs of the sea cucumber. Both tissues have ectodermic origin, but they are morphologically and functionally very different. In the present study, we performed comparative transcriptome analysis of the papilla and skin from the sea cucumber (Apostichopus japonicus) in order to identify and characterize gene expression profiles by using RNA-Seq technology. We generated 30.6 and 36.4 million clean reads from the papilla and skin and assembled them de novo into 156,501 transcripts. The Gene Ontology (GO) analysis indicated that cell part, metabolic process and catalytic activity were the most abundant GO categories in cell component, biological process and molecular function, respectively. Comparative transcriptome analysis between the papilla and skin allowed the identification of 1,059 differentially expressed genes, of which 739 genes were expressed at higher levels in papilla, while 320 were expressed at higher levels in skin. In addition, 236 differentially expressed unigenes were not annotated in any database, 160 of which were expressed at higher levels in papilla and 76 at higher levels in skin. We identified a total of 288 papilla-specific genes, 171 skin-specific genes and 600 co-expressed genes. Also, 40 of the papilla-specific genes and 2 of the skin-specific genes were not annotated in any database. Development-related genes were also enriched, such as fibroblast growth factor, transforming growth factor-β, collagen-α2 and Integrin-α2, which may be related to the formation of the papilla and skin in sea cucumber. Further pathway analysis identified ten KEGG pathways that were differently enriched between the papilla and skin. The findings on expression profiles between two key organs of the sea cucumber should be valuable to reveal molecular mechanisms involved in the development of organs that are related but with morphological differences in the sea cucumber.
Wu, Hao; Wu, Runliu; Chen, Miao; Li, Daojiang; Dai, Jing; Zhang, Yi; Gao, Kai; Yu, Jun; Hu, Gui; Guo, Yihang; Lin, Changwei; Li, Xiaorong
2017-03-28
Growing evidence suggests that long non-coding RNAs (lncRNAs) play a key role in tumorigenesis. However, the mechanism remains largely unknown. Thousands of significantly dysregulated lncRNAs and mRNAs were identified by microarray. Furthermore, a miR-133b-mediated lncRNA-mRNA ceRNA network was revealed, a subset of which was validated in 14 paired CRC patient tumor/non-tumor samples. Gene set enrichment analysis (GSEA) results demonstrated that lncRNAs ENST00000520055 and ENST00000535511 shared KEGG pathways with miR-133b target genes. We used microarrays to survey the lncRNA and mRNA expression profiles of colorectal cancer and para-cancer tissues. Gene Ontology (GO) and KEGG pathway enrichment analyses were performed to explore the functions of the significantly dysregulated genes. An innovative method was employed that combined analyses of two microarray data sets to construct a miR-133b-mediated lncRNA-mRNA competing endogenous RNA (ceRNA) network. Quantitative RT-PCR analysis was used to validate part of this network. GSEA was used to predict the potential functions of these lncRNAs. This study identifies and validates a new method to investigate the miR-133b-mediated lncRNA-mRNA ceRNA network and lays the foundation for future investigation into the role of lncRNAs in colorectal cancer.
De Novo Transcriptome Analysis for Kentucky Bluegrass Dwarf Mutants Induced by Space Mutation
Gan, Lu; Di, Rong; Chao, Yuehui; Han, Liebao; Chen, Xingwu; Wu, Chao; Yin, Shuxia
2016-01-01
Kentucky bluegrass (Poa pratensis L.) is a major cool-season turfgrass requiring frequent mowing. Utilization of cultivars with slow growth is a promising method to decrease mowing frequency. In this study, two dwarf mutant selections of Kentucky bluegrass (A12 and A16) induced by space mutation were analyzed for the differentially expressed genes compared with the wild type (WT) by the high-throughput RNA-Seq technology. 253,909 unigenes were obtained by de novo assembly. 24.20% of the unigenes had a significant level of amino acid sequence identity to Brachypodium distachyon proteins, followed by Hordeum vulgare with 18.72% among the non-redundant (NR) Blastx top hits. Assembled unigenes were associated with 32 pathways using KEGG orthology terms and their respective KEGG maps. Between WT and A16 libraries, 4,203 differentially expressed genes (DEGs) were identified, whereas there were 883 DEGs between WT and A12 libraries. Further investigation revealed that the DEG pathways were mainly involved in terpenoid biosynthesis and plant hormone metabolism, which might account for the differences of plant height and leaf blade color between dwarf mutant and WT plants. Our study presents the first comprehensive transcriptomic data and gene function analysis of Poa pratensis L., providing a valuable resource for future studies in plant dwarfing breeding and comparative genome analysis for Pooideae plants. PMID:27010560
Integrating Microarray Data and GRNs.
Koumakis, L; Potamias, G; Tsiknakis, M; Zervakis, M; Moustakis, V
2016-01-01
With the completion of the Human Genome Project and the emergence of high-throughput technologies, a vast amount of molecular and biological data are being produced. Two of the most important and significant data sources come from microarray gene-expression experiments and respective databanks (e.g., Gene Expression Omnibus-GEO (http://www.ncbi.nlm.nih.gov/geo)), and from molecular pathways and Gene Regulatory Networks (GRNs) stored and curated in public (e.g., Kyoto Encyclopedia of Genes and Genomes-KEGG (http://www.genome.jp/kegg/pathway.html), Reactome (http://www.reactome.org/ReactomeGWT/entrypoint.html)) as well as in commercial repositories (e.g., Ingenuity IPA (http://www.ingenuity.com/products/ipa)). The association of these two sources aims to give new insight into disease understanding and reveal new molecular targets in the treatment of specific phenotypes. Three major research lines and respective efforts that try to utilize and combine data from both of these sources can be identified, namely: (1) de novo reconstruction of GRNs, (2) identification of gene signatures, and (3) identification of differentially expressed GRN functional paths (i.e., sub-GRN paths that distinguish between different phenotypes). In this chapter, we give an overview of the existing methods that support the different types of gene-expression and GRN integration with a focus on methodologies that aim to identify phenotype-discriminant GRNs or subnetworks, and we also present our methodology.
Label free quantitative proteomics analysis on the cisplatin resistance in ovarian cancer cells.
Wang, F; Zhu, Y; Fang, S; Li, S; Liu, S
2017-05-20
Quantitative proteomics has made great progress in recent years. Label-free quantitative proteomics analysis based on mass spectrometry is widely used. Using this technique, we determined the differentially expressed proteins in the cisplatin-sensitive ovarian cancer cells COC1 and cisplatin-resistant cells COC1/DDP before and after the application of cisplatin. Using GO analysis, we classified those proteins into different subgroups based on their cellular component, biological process, and molecular function. We also used KEGG pathway analysis to determine the key signal pathways that those proteins were involved in. There are 710 differential proteins between COC1 and COC1/DDP cells, 783 between COC1 and COC1/DDP cells treated with cisplatin, 917 between the COC1/DDP cells and COC1/DDP cells treated with LaCl3, and 775 between COC1/DDP cells treated with cisplatin and COC1/DDP cells treated with cisplatin and LaCl3. Among the same 411 differentially expressed proteins in cisplatin-sensitive COC1 cells and cisplatin-resistant COC1/DDP cells before and after cisplatin treatment, 14% were localized on the cell membrane. According to the KEGG results, differentially expressed proteins were classified into 21 groups. The most abundant proteins were involved in the spliceosome. This study lays a foundation for deciphering the mechanism of drug resistance in ovarian tumors.
Gao, Fan-Xiang; Wang, Yang; Zhang, Qi-Ya; Mou, Cheng-Yan; Li, Zhi; Deng, Yuan-Sheng; Zhou, Li; Gui, Jian-Fang
2017-07-24
Gibel carp is an important aquaculture species in China, and a herpesvirus, called Carassius auratus herpesvirus (CaHV), has hampered the development of its aquaculture. Diverse gynogenetic clones of gibel carp have been identified or created, and some of them have been used as aquaculture varieties, but their resistance to herpesvirus and the underlying mechanisms remain unknown. To reveal their susceptibility differences, we first performed herpesvirus challenge experiments in three gynogenetic clones of gibel carp, including the leading variety clone A+, candidate variety clone F and wild clone H. The three clones showed distinct resistance to CaHV. Moreover, 8772, 8679 and 10,982 differentially expressed unigenes (DEUs) were identified from comparative transcriptomes between diseased individuals and control individuals of clone A+, F and H, respectively. Comprehensive analysis of the shared DEUs in all three clones displayed common defense pathways to the herpesvirus infection, activating the IFN system and suppressing complements. KEGG pathway analysis of specifically changed DEUs in respective clones revealed distinct immune responses to the herpesvirus infection. The DEU numbers identified from clone H in KEGG immune-related pathways, such as "chemokine signaling pathway", "Toll-like receptor signaling pathway" and others, were markedly higher than those from clone A+ and F. Several IFN-related genes, including Mx1, viperin, PKR and others, showed higher increases in the resistant clone H than in the others. IFNphi3, IFI44-like and Gig2 displayed the highest expression in clone F, and IRF1 uniquely increased in susceptible clone A+. In contrast to the strong immune defense in resistant clone H, susceptible clone A+ showed remarkable up-regulation of genes related to apoptosis or death, indicating that clone A+ failed to resist the viral attack and evidently induced apoptosis or death.
Our study is the first attempt to screen distinct resistances and immune responses of three gynogenetic gibel carp clones to herpesvirus infection by comprehensive transcriptomes. These differential DEUs, immune-related pathways and IFN system genes identified from susceptible and resistant clones will be beneficial to marker-assisted selection (MAS) breeding or molecular module-based resistance breeding in gibel carp.
Zhao, Daqiu; Jiang, Yao; Ning, Chuanlong; Meng, Jiasong; Lin, Shasha; Ding, Wen; Tao, Jun
2014-08-19
Herbaceous peony (Paeonia lactiflora Pall.) is a traditional flower in China and a popular wedding flower worldwide. Among its flower colours, yellow is the rarest and commands ten times the price of the other colours. However, the breeding of new yellow P. lactiflora varieties using genetic engineering is severely limited because the biochemical and molecular mechanisms underlying yellow colour formation are poorly understood. In this study, two cDNA libraries generated from a P. lactiflora chimaera with red outer-petal and yellow inner-petal were sequenced using an Illumina HiSeq™ 2000 platform. 66,179,398 and 65,481,444 total raw reads from red outer-petal and yellow inner-petal cDNA libraries were generated, which were assembled into 61,431 and 70,359 Unigenes with an average length of 628 and 617 nt, respectively. Moreover, 61,408 non-redundant All-unigenes were obtained, with 37,511 All-unigenes (61.08%) annotated in public databases. In addition, 6,345 All-unigenes were differentially expressed between the red outer-petal and yellow inner-petal, with 3,899 up-regulated and 2,446 down-regulated All-unigenes, and the flavonoid metabolic pathway related to colour development was identified using the Kyoto Encyclopedia of Genes and Genomes (KEGG) database. Subsequently, the expression patterns of 10 candidate differentially expressed genes (DEGs) involved in the flavonoid metabolic pathway were examined, and flavonoids were qualitatively and quantitatively analysed. Numerous anthoxanthins (flavone and flavonol) and a few anthocyanins were detected in the yellow inner-petal, which were all lower than those in the red outer-petal due to the low expression levels of the phenylalanine ammonia-lyase gene (PlPAL), flavonol synthase gene (PlFLS), dihydroflavonol 4-reductase gene (PlDFR), anthocyanidin synthase gene (PlANS), anthocyanidin 3-O-glucosyltransferase gene (Pl3GT) and anthocyanidin 5-O-glucosyltransferase gene (Pl5GT).
Transcriptome sequencing (RNA-Seq) analysis based on high-throughput sequencing technology was an efficient approach to identify critical genes in P. lactiflora and other non-model plants. The flavonoid metabolic pathway and glucide metabolic pathway were identified as related to yellow formation in P. lactiflora. PlPAL, PlFLS, PlDFR, PlANS, Pl3GT and Pl5GT were selected as potential candidates in the flavonoid metabolic pathway whose low expression inhibits anthocyanin biosynthesis and thereby mediates yellow formation in P. lactiflora. This study could lay a theoretical foundation for breeding new yellow P. lactiflora varieties.
Bikel, Shirley; Jacobo-Albavera, Leonor; Sánchez-Muñoz, Fausto; Cornejo-Granados, Fernanda; Canizales-Quinteros, Samuel; Soberón, Xavier; Sotelo-Mundo, Rogerio R; Del Río-Navarro, Blanca E; Mendoza-Vargas, Alfredo; Sánchez, Filiberto; Ochoa-Leyva, Adrian
2017-01-01
In spite of the emergence of RNA sequencing (RNA-seq), microarrays remain in widespread use for gene expression analysis in the clinic. There are over 767,000 RNA microarrays from human samples in public repositories, which are an invaluable resource for biomedical research and personalized medicine. The absolute gene expression analysis allows the transcriptome profiling of all expressed genes under a specific biological condition without the need of a reference sample. However, the background fluorescence represents a challenge to determine the absolute gene expression in microarrays. Given that the Y chromosome is absent in female subjects, we used it as a new approach for absolute gene expression analysis in which the fluorescence of the Y chromosome genes of female subjects was used as the background fluorescence for all the probes in the microarray. This fluorescence was used to establish an absolute gene expression threshold, allowing the differentiation between expressed and non-expressed genes in microarrays. We extracted the RNA from 16 children leukocyte samples (nine males and seven females, ages 6-10 years). An Affymetrix Gene Chip Human Gene 1.0 ST Array was carried out for each sample and the fluorescence of 124 genes of the Y chromosome was used to calculate the absolute gene expression threshold. After that, several expressed and non-expressed genes according to our absolute gene expression threshold were compared against the expression obtained using real-time quantitative polymerase chain reaction (RT-qPCR). From the 124 genes of the Y chromosome, three genes (DDX3Y, TXLNG2P and EIF1AY) that displayed significant differences between sexes were used to calculate the absolute gene expression threshold. Using this threshold, we selected 13 expressed and non-expressed genes and confirmed their expression level by RT-qPCR. Then, we selected the top 5% most expressed genes and found that several KEGG pathways were significantly enriched. 
Interestingly, these pathways were related to the typical functions of leukocytes cells, such as antigen processing and presentation and natural killer cell mediated cytotoxicity. We also applied this method to obtain the absolute gene expression threshold in already published microarray data of liver cells, where the top 5% expressed genes showed an enrichment of typical KEGG pathways for liver cells. Our results suggest that the three selected genes of the Y chromosome can be used to calculate an absolute gene expression threshold, allowing a transcriptome profiling of microarray data without the need of an additional reference experiment. Our approach based on the establishment of a threshold for absolute gene expression analysis will allow a new way to analyze thousands of microarrays from public databases. This allows the study of different human diseases without the need of having additional samples for relative expression experiments.
Vernocchi, Pamela; Del Chierico, Federica; Quagliariello, Andrea; Ercolini, Danilo; Lucidi, Vincenzina; Putignani, Lorenza
2017-12-09
Cystic fibrosis (CF) is a life-limiting hereditary disorder that results in aberrant mucosa in the lungs and digestive tract, chronic respiratory infections, chronic inflammation, and the need for repeated antibiotic treatments. Probiotics have been demonstrated to improve the quality of life of CF patients. We investigated the distribution of gut microbiota (GM) bacteria to identify new potential probiotics for CF patients on the basis of GM patterns. Fecal samples of 28 CF patients and 31 healthy controls (HC) were collected and analyzed by 16S rRNA-based pyrosequencing analysis of GM, to produce CF-HC paired maps of the distribution of operational taxonomic units (OTUs), and by Phylogenetic Investigation of Communities by Reconstruction of Unobserved States (PICRUSt) for Kyoto Encyclopedia of Genes and Genomes (KEGG) biomarker prediction. The maps were scanned to highlight the distribution of bacteria commonly claimed as probiotics, such as bifidobacteria and lactobacilli, and of butyrate-producing colon bacteria, such as Eubacterium spp. and Faecalibacterium prausnitzii. The analyses highlighted 24 OTUs eligible as putative probiotics. Eleven and nine species were prevalently associated with the GM of CF and HC subjects, respectively. Their KEGG prediction provided differential CF and HC pathways, indeed associated with health-promoting biochemical activities in the latter case. GM profiling and KEGG biomarkers concurred in the evaluation of nine bacterial species as novel putative probiotics that could be investigated for the nutritional management of CF patients.
de Anda-Jáuregui, Guillermo; Guo, Kai; McGregor, Brett A.; Hur, Junguk
2018-01-01
The quintessential biological response to disease is inflammation. It is a driver and an important element in a wide range of pathological states. Pharmacological management of inflammation is therefore central in the clinical setting. Anti-inflammatory drugs modulate specific molecules involved in the inflammatory response; these drugs are traditionally classified as steroidal and non-steroidal drugs. However, the effects of these drugs are rarely limited to their canonical targets, affecting other molecules and altering biological functions with system-wide effects that can lead to the emergence of secondary therapeutic applications or adverse drug reactions (ADRs). In this study, relationships among anti-inflammatory drugs, functional pathways, and ADRs were explored through network models. We integrated structural drug information, experimental anti-inflammatory drug perturbation gene expression profiles obtained from the Connectivity Map and Library of Integrated Network-Based Cellular Signatures, functional pathways in the Kyoto Encyclopedia of Genes and Genomes (KEGG) and Reactome databases, as well as adverse reaction information from the U.S. Food and Drug Administration (FDA) Adverse Event Reporting System (FAERS). The network models comprise nodes representing anti-inflammatory drugs, functional pathways, and adverse effects. We identified structural and gene perturbation similarities linking anti-inflammatory drugs. Functional pathways were connected to drugs by implementing Gene Set Enrichment Analysis (GSEA). Drugs and adverse effects were connected based on the proportional reporting ratio (PRR) of an adverse effect in response to a given drug. Through these network models, relationships among anti-inflammatory drugs, their functional effects at the pathway level, and their adverse effects were explored. These networks comprise 70 different anti-inflammatory drugs, 462 functional pathways, and 1,175 ADRs. 
Network-based properties, such as degree, clustering coefficient, and node strength, were used to identify new therapeutic applications within and beyond the anti-inflammatory context, as well as ADR risk for these drugs, helping to select better repurposing candidates. Based on these parameters, we identified naproxen, meloxicam, etodolac, tenoxicam, flufenamic acid, fenoprofen, and nabumetone as candidates for drug repurposing with lower ADR risk. This network-based analysis pipeline provides a novel way to explore the effects of drugs in a therapeutic space. PMID:29545755
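The abstract above links drugs to adverse effects through the proportional reporting ratio (PRR) but does not spell out the calculation. A minimal sketch of the usual 2x2 definition follows; it assumes the standard pharmacovigilance formula, and the function name, argument names and counts are illustrative rather than taken from the study.

```python
def proportional_reporting_ratio(a, b, c, d):
    """Compute the PRR from a 2x2 table of spontaneous reports.

    a: reports of the adverse effect for the drug of interest
    b: reports of all other effects for the drug of interest
    c: reports of the adverse effect for all other drugs
    d: reports of all other effects for all other drugs

    PRR = [a / (a + b)] / [c / (c + d)]  (standard definition, assumed here)
    """
    if a + b == 0 or c + d == 0:
        raise ValueError("each row of the 2x2 table needs at least one report")
    rate_drug = a / (a + b)
    rate_others = c / (c + d)
    if rate_others == 0:
        # The effect was never reported for other drugs: the ratio is unbounded.
        return float("inf")
    return rate_drug / rate_others


if __name__ == "__main__":
    # Made-up counts, for illustration only.
    print(proportional_reporting_ratio(20, 80, 10, 190))
```

A PRR above 1 means the effect is reported proportionally more often for the drug than for the comparison set; the cut-off that defines a drug-effect edge in the network is a choice the abstract does not state.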
Gao, Lijie; Wang, Yunqi; Li, Yi; Dong, Ya; Yang, Aimin; Zhang, Jie; Li, Fengying; Zhang, Rongqiang
2018-07-01
Comprehensive bioinformatics analyses were performed to explore the key biomarkers in response to HIV infection of CD4+ and CD8+ T cells. The numbers of CD4+ and CD8+ T cells of HIV-infected individuals were analyzed and the GEO database (GSE6740) was screened for differentially expressed genes (DEGs) in HIV-infected CD4+ and CD8+ T cells. Gene Ontology enrichment, KEGG pathway analyses, and protein-protein interaction (PPI) network analyses were performed to identify the key pathway and core proteins in the anti-HIV process of CD4+ and CD8+ T cells. Finally, we analyzed the expression of key proteins in HIV-infected T cells (GSE6740 dataset) and peripheral blood mononuclear cells (PBMCs) (GSE511 dataset). 1) CD4+ T cell counts and the CD4+/CD8+ T cell ratio decreased while CD8+ T cell counts increased in HIV-positive individuals; 2) 517 DEGs were found in HIV-infected CD4+ and CD8+ T cells at the acute and chronic stages with the criteria of P-value <0.05 and fold change (FC) ≥2; 3) In acute HIV infection, the type 1 interferon (IFN-1) pathway might play a critical role in the response of T cells to HIV infection. The main biological processes of the DEGs were response to virus and defense response to virus. At the chronic stage, ISG15 protein, in conjunction with the IFN-1 pathway, might play key roles in anti-HIV responses of CD4+ T cells; and 4) The expression of ISG15 increased in both T cells and PBMCs after HIV infection. The gene expression profile of CD4+ and CD8+ T cells changed significantly in HIV infection, in which the ISG15 gene may play a central role in activating the natural antiviral process of immune cells. © 2018 Wiley Periodicals, Inc.
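The DEG criteria quoted in this abstract (P-value <0.05 and FC ≥2) amount to a simple two-condition filter. The sketch below illustrates it under one stated assumption: the fold-change cut-off is applied in both directions (|log2 FC| ≥ 1), which the abstract does not confirm. Gene labels and values are hypothetical placeholders.

```python
import math


def filter_degs(records, p_cutoff=0.05, fc_cutoff=2.0):
    """Return the genes that pass both the P-value and fold-change cut-offs.

    records: iterable of (gene, fold_change, p_value) tuples, where
    fold_change is a positive ratio (treated / control).
    Assumption: a change of fc_cutoff in either direction counts,
    i.e. |log2(fold_change)| >= log2(fc_cutoff).
    """
    min_abs_log2 = math.log2(fc_cutoff)
    selected = []
    for gene, fold_change, p_value in records:
        if fold_change <= 0:
            raise ValueError(f"fold change must be positive, got {fold_change!r}")
        if p_value < p_cutoff and abs(math.log2(fold_change)) >= min_abs_log2:
            selected.append(gene)
    return selected


if __name__ == "__main__":
    # Placeholder records, not values from GSE6740.
    example = [
        ("GENE_A", 3.1, 0.001),  # up-regulated, significant
        ("GENE_B", 1.4, 0.001),  # significant but below the FC cut-off
        ("GENE_C", 0.4, 0.030),  # down-regulated, significant
        ("GENE_D", 5.0, 0.200),  # large change, not significant
    ]
    print(filter_degs(example))
```

If only up-regulated genes are wanted, the `abs(...)` can be dropped so that the test becomes `math.log2(fold_change) >= min_abs_log2`.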
Mangalam, AK; Poisson, LM; Nemutlu, E; Datta, I; Denic, A; Dzeja, P; Rodriguez, M; Rattan, R; Giri, S
2013-01-01
Multiple sclerosis (MS) is a chronic inflammatory and demyelinating disease of the CNS. Although MS is well characterized in terms of the role played by immune cells, cytokines and CNS pathology, nothing is known about the metabolic alterations that occur during the disease process in circulation. Recently, metabolic aberrations have been defined in various disease processes either as contributing to the disease, as potential biomarkers, or as therapeutic targets. Thus, in an attempt to define the metabolic alterations that may be associated with MS disease progression, we profiled the plasma metabolites at the chronic phase of disease utilizing the relapsing remitting-experimental autoimmune encephalomyelitis (RR-EAE) model in SJL mice. At the chronic phase of the disease (day 45), untargeted global metabolomic profiling of plasma collected from EAE diseased SJL and healthy mice was performed, using a combination of high-throughput liquid-and-gas chromatography with mass spectrometry. A total of 282 metabolites were identified, with significant changes observed in 44 metabolites (32 up-regulated and 12 down-regulated), that mapped to lipid, amino acid, nucleotide and xenobiotic metabolism and distinguished EAE from healthy group (p<0.05, false discovery rate (FDR)<0.23). Mapping the differential metabolite signature to their respective biochemical pathways using the Kyoto Encyclopedia of Genes and Genomes (KEGG) database, we found six major pathways that were significantly altered (containing concerted alterations) or impacted (containing alteration in key junctions). These included bile acid biosynthesis, taurine metabolism, tryptophan and histidine metabolism, linoleic acid and D-arginine metabolism pathways. 
Overall, this study identified a 44 metabolite signature drawn from various metabolic pathways which correlated well with severity of the EAE disease, suggesting that these metabolic changes could be exploited as (1) biomarkers for EAE/MS progression and (2) to design new treatment paradigms where metabolic interventions could be combined with present and experimental therapeutics to achieve better treatment of MS. PMID:24273690
Long, Yong; Li, Qing; Zhou, Bolan; Song, Guili; Li, Tao; Cui, Zongbin
2013-01-01
Fish skin serves as the first line of defense against a wide variety of chemical, physical and biological stressors. Secretion of mucus is among the most prominent characteristics of fish skin and numerous innate immune factors have been identified in the epidermal mucus. However, molecular mechanisms underlying the mucus secretion and immune activities of fish skin remain largely unclear due to the lack of genomic and transcriptomic data for most economically important fish species. In this study, we characterized the skin transcriptome of mud loach using Illumina paired-end sequencing. A total of 40364 unigenes were assembled from 86.6 million (3.07 gigabases) filtered reads. The mean length, N50 size and maximum length of assembled transcripts were 387, 611 and 8670 bp, respectively. A total of 17336 (43.76%) unigenes were annotated by blast searches against the NCBI non-redundant protein database. Gene ontology mapping assigned a total of 108513 GO terms to 15369 (38.08%) unigenes. KEGG orthology mapping annotated 9337 (23.23%) unigenes. Among the identified KO categories, immune system is the largest category that contains various components of multiple immune pathways such as chemokine signaling, leukocyte transendothelial migration and T cell receptor signaling, suggesting the complexity of immune mechanisms in fish skin. As for mucin biosynthesis, 37 unigenes were mapped to 7 enzymes of the mucin type O-glycan biosynthesis pathway and 8 members of the polypeptide N-acetylgalactosaminyltransferase family were identified. Additionally, 38 unigenes were mapped to 23 factors of the SNARE interactions in vesicular transport pathway, indicating that the activity of this pathway is required for the processes of epidermal mucus storage and release. Moreover, 1754 simple sequence repeats (SSRs) were detected in 1564 unigenes and dinucleotide repeats represented the most abundant type. 
These findings have laid the foundation for further understanding the secretory processes and immune functions of loach skin mucus. PMID:23437293
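The abstract above reports an N50 size for the assembled transcripts. As a reminder of what that statistic measures, here is a minimal sketch of the common definition (the length of the shortest sequence in the smallest set of longest sequences that covers at least half of the total assembled bases). The function name and the toy lengths are illustrative, and assemblers may differ in tie-handling details.

```python
def n50(lengths):
    """Return the N50 of a collection of sequence lengths.

    Sequences are taken from longest to shortest until their cumulative
    length reaches at least half of the total; the length of the last
    sequence added is the N50.
    """
    lengths = list(lengths)
    if not lengths:
        raise ValueError("N50 is undefined for an empty assembly")
    if any(length <= 0 for length in lengths):
        raise ValueError("sequence lengths must be positive")
    total = sum(lengths)
    running = 0
    for length in sorted(lengths, reverse=True):
        running += length
        # Compare 2 * running with total to stay in integer arithmetic.
        if 2 * running >= total:
            return length


if __name__ == "__main__":
    # Toy transcript lengths in bp, not data from the study.
    print(n50([200, 300, 400, 500, 600]))
```

Because N50 weights long sequences more heavily, it normally exceeds the mean length, which is consistent with the 611 bp N50 and 387 bp mean reported above.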
Zhang, Zhijun; Zhang, Pengjun; Li, Weidi; Zhang, Jinming; Huang, Fang; Yang, Jian; Bei, Yawei; Lu, Yaobin
2013-05-01
The western flower thrips (WFT), Frankliniella occidentalis, a world-wide invasive insect, causes agricultural damage by directly feeding and by indirectly vectoring Tospoviruses, such as Tomato spotted wilt virus (TSWV). We characterized the transcriptome of WFT and analyzed the global gene expression response of WFT to TSWV infection using the Illumina sequencing platform. We compiled 59,932 unigenes, and identified 36,339 unigenes by similarity analysis against public databases, most of which were annotated using gene ontology (GO) and Kyoto Encyclopedia of Genes and Genomes (KEGG) analysis. Within these annotated transcripts, we collected 278 sequences related to insecticide resistance. GO and KEGG analysis of differentially expressed genes between TSWV-infected and non-infected WFT populations revealed that TSWV can regulate cellular processes and immune response, which might lead to low virus titers in thrips cells and no detrimental effects on F. occidentalis. This data set not only enriches genomic resources for WFT, but also benefits research into its molecular genetics and functional genomics. Copyright © 2013 Elsevier Inc. All rights reserved.
In silico analysis of fragile histidine triad involved in regression of carcinoma.
Rasheed, Muhammad Asif; Tariq, Fatima; Afzal, Sara; Mannanv, Shazia
2017-04-01
Hepatocellular carcinoma (HCCa) is a primary malignancy of the liver. Many different proteins are involved in HCCa including insulin growth factor (IGF) II, signal transducers and activators of transcription (STAT) 3, STAT4, mothers against decapentaplegic homolog 4 (SMAD 4), fragile histidine triad (FHIT) and selective internal radiation therapy (SIRT) etc. The present study is based on the bioinformatics analysis of FHIT protein in order to understand the proteomics aspect and improvement of the diagnosis of the disease based on the protein. Information related to the protein was gathered from different databases, including National Centre for Biotechnology Information (NCBI) Gene, Protein and Online Mendelian Inheritance in Man (OMIM) databases, Uniprot database, String database and Kyoto Encyclopedia of Genes and Genomes (KEGG) database. Moreover, the structure of the protein and an evaluation of the quality of the structure were obtained from the Easy modeler programme. Hence, this analysis not only helped to gather information related to the protein in one place, but also analysed the structure and quality of the protein to conclude that the protein has a role in carcinoma.
mESAdb: microRNA Expression and Sequence Analysis Database
Kaya, Koray D.; Karakülah, Gökhan; Yakıcıer, Cengiz M.; Acar, Aybar C.; Konu, Özlen
2011-01-01
microRNA expression and sequence analysis database (http://konulab.fen.bilkent.edu.tr/mirna/) (mESAdb) is a regularly updated database for the multivariate analysis of sequences and expression of microRNAs from multiple taxa. mESAdb is modular and has a user interface implemented in PHP and JavaScript and coupled with statistical analysis and visualization packages written for the R language. The database primarily comprises mature microRNA sequences and their target data, along with selected human, mouse and zebrafish expression data sets. mESAdb analysis modules allow (i) mining of microRNA expression data sets for subsets of microRNAs selected manually or by motif; (ii) pair-wise multivariate analysis of expression data sets within and between taxa; and (iii) association of microRNA subsets with annotation databases, HUGE Navigator, KEGG and GO. The use of existing and customized R packages facilitates future addition of data sets and analysis tools. Furthermore, the ability to upload and analyze user-specified data sets makes mESAdb an interactive and expandable analysis tool for microRNA sequence and expression data. PMID:21177657
Vrahatis, Aristidis G; Dimitrakopoulos, Georgios N; Tsakalidis, Athanasios K; Bezerianos, Anastasios
2015-01-01
On the road to network medicine, the newly emerged systems-level subpathway-based analysis methods offer new disease genes, drug targets and network-based biomarkers. In parallel, paired miRNA/mRNA expression data enable simultaneous monitoring of the micronome effect upon the signaling pathways. In this direction, we present a methodological pipeline for the identification of differentially expressed subpathways along with their miRNA regulators by using KEGG signaling pathway maps, miRNA-target interactions and expression profiles from paired miRNA/mRNA experiments. Our pipeline offered new biological insights in a real application to paired miRNA/mRNA expression profiles with respect to the dynamic changes from colostrum to mature milk whey; several literature-supported genes and miRNAs were recontextualized through miRNA-mediated differentially expressed subpathways.
Zhu, Youyin; Li, Yongqiang; Xin, Dedong; Chen, Wenrong; Shao, Xu; Wang, Yue; Guo, Weidong
2015-01-25
Bud dormancy is a critical biological process allowing Chinese cherry (Prunus pseudocerasus) to survive in winter. Due to the lack of genomic information, molecular mechanisms triggering endodormancy release in flower buds have remained unclear. Hence, we used Illumina RNA-Seq technology to carry out de novo transcriptome assembly and digital gene expression profiling of flower buds. Approximately 47 million clean reads were assembled into 50,604 sequences with an average length of 837 bp. A total of 37,650 unigene sequences were successfully annotated. 128 pathways were annotated by Kyoto Encyclopedia of Genes and Genomes (KEGG) pathway analysis, and metabolic pathways, biosynthesis of secondary metabolites and plant hormone signal transduction accounted for higher percentages in flower buds. In the critical period of endodormancy release, 1,644 significantly differentially expressed genes (DEGs) were identified from the expression profile. DEGs related to oxidoreductase activity were especially abundant in the Gene Ontology (GO) molecular function category. KEGG pathway analysis demonstrated that DEGs were involved in various metabolic processes, including phytohormone metabolism. Quantitative real-time PCR (qRT-PCR) analysis indicated that levels of DEGs for abscisic acid and gibberellin biosynthesis decreased while the abundance of DEGs encoding their degradation enzymes increased and GID1 was down-regulated. Concomitant with endodormancy release, MADS-box transcription factors including P. pseudocerasus dormancy-associated MADS-box (PpcDAM), Agamous-like2, and APETALA3-like genes showed remarkable epigenetic roles. The newly generated transcriptome and gene expression profiling data provide valuable genetic information for revealing transcriptomic variation during bud dormancy in Chinese cherry. The uncovered data should be useful for future studies of bud dormancy in Prunus fruit trees lacking genomic information. Copyright © 2014 Elsevier B.V. All rights reserved.
Huang, X C; Maimaiti, X Y M; Huang, C W; Zhang, L; Li, Z B; Chen, Z G; Gao, X; Chen, T Y
2014-01-01
To further understand the synergistic mechanism of As2O3 and ascorbic acid (AA) in human osteosarcoma MG-63 cells by systems biology analysis. Human osteosarcoma MG-63 cells were treated with As2O3 (1 µmol/L), AA (62.5 µmol/L) and combined drugs (1 µmol/L As2O3 plus 62.5 µmol/L AA). Dynamic morphological characteristics were recorded by the Cell-IQ system, and growth rate was calculated. An Illumina beadchip assay was used to analyze the differentially expressed genes (DEGs) in different groups. Synergic effects on DEGs were analyzed by a mixture linear model and a singular value decomposition model. KEGG pathway annotations and GO enrichment analysis were performed to figure out the pathways involved in the synergic effects. We captured 1987 differentially expressed genes in combined therapy MG-63 cells. The FAT1 gene was significantly upregulated in all three groups, which is a promising drug target as an important tumor suppressor analogue; meanwhile, the HIST1H2BD gene was markedly downregulated in the As2O3 monotherapy group and the combined therapy group, which was found to be upregulated in prostatic cancer. These two genes might play critical roles in synergetic effects of AA and As2O3, although the exact mechanism needs further investigation. KEGG pathway analysis showed many DEGs were related to tight junction, and GO analysis also indicated that DEGs in the combined therapy cells gathered in occluding junction, apical junction complex, cell junction, and tight junction. AA potentiates the efficacy of As2O3 in MG-63 cells. Systems biology analysis showed the synergic effect on the DEGs.
Jani, Saurin D; Argraves, Gary L; Barth, Jeremy L; Argraves, W Scott
2010-04-01
An important objective of DNA microarray-based gene expression experimentation is determining inter-relationships that exist between differentially expressed genes and biological processes, molecular functions, cellular components, signaling pathways, physiologic processes and diseases. Here we describe GeneMesh, a web-based program that facilitates analysis of DNA microarray gene expression data. GeneMesh relates genes in a query set to categories available in the Medical Subject Headings (MeSH) hierarchical index. The interface enables hypothesis driven relational analysis to a specific MeSH subcategory (e.g., Cardiovascular System, Genetic Processes, Immune System Diseases etc.) or unbiased relational analysis to broader MeSH categories (e.g., Anatomy, Biological Sciences, Disease etc.). Genes found associated with a given MeSH category are dynamically linked to facilitate tabular and graphical depiction of Entrez Gene information, Gene Ontology information, KEGG metabolic pathway diagrams and intermolecular interaction information. Expression intensity values of groups of genes that cluster in relation to a given MeSH category, gene ontology or pathway can be displayed as heat maps of Z score-normalized values. GeneMesh operates on gene expression data derived from a number of commercial microarray platforms including Affymetrix, Agilent and Illumina. GeneMesh is a versatile web-based tool for testing and developing new hypotheses through relating genes in a query set (e.g., differentially expressed genes from a DNA microarray experiment) to descriptors making up the hierarchical structure of the National Library of Medicine controlled vocabulary thesaurus, MeSH. 
The system further enhances the discovery process by providing links between sets of genes associated with a given MeSH category to a rich set of html linked tabular and graphic information including Entrez Gene summaries, gene ontologies, intermolecular interactions, overlays of genes onto KEGG pathway diagrams and heatmaps of expression intensity values. GeneMesh is freely available online at http://proteogenomics.musc.edu/genemesh/.
Nie, Hongyi; Liu, Chun; Zhang, Yinxia; Zhou, Mengting; Huang, Xiaofeng; Peng, Li; Xia, Qingyou
2014-01-01
The ability to respond quickly and efficiently to transient extreme environmental conditions is an important property of all biota. However, the physiological basis of thermotolerance in different species is still unclear. Here, we found that the cot mutant showed a seizure phenotype including contraction of the body, rolling, vomiting gut juice and a momentary cessation of movement, and the heartbeat rhythm of the dorsal vessel significantly increases after hyperthermia. To comprehensively understand this process at the molecular level, the transcriptomic profile of cot mutant, which is a behavior mutant that exhibits a seizure phenotype, was investigated after hyperthermia (42°C) that was induced for 5 min. By digital gene expression profiling, we determined the gene expression profile of three strains (cot/cot ok/ok, +/+ ok/ok and +/+ +/+) under hyperthermia (42°C) and normal (25°C) conditions. A Venn diagram showed that the most common differentially expressed genes (DEGs, FDR<0.01 and log2 Ratio≥1) were up-regulated and annotated with the heat shock proteins (HSPs) in 3 strains after treatment with hyperthermia, suggesting that HSPs rapidly increased in response to high temperature; 110 unique DEGs, could be identified in the cot mutant after inducing hyperthermia when compared to the control strains. Of these 110 unique DEGs, 98.18% (108 genes) were up-regulated and 1.82% (two genes) were down-regulated in the cot mutant. KEGG pathways analysis of these unique DEGs suggested that the top three KEGG pathways were “Biotin metabolism,” “Fatty acid biosynthesis” and “Purine metabolism,” implying that diverse metabolic processes are active in cot mutant induced-hyperthermia. Unique DEGs of interest were mainly involved in the ubiquitin system, nicotinic acetylcholine receptor genes, cardiac excitation–contraction coupling or the Notch signaling pathway. 
Insights into hyperthermia-induced alterations in gene expression and related pathways could yield hints for understanding the relationship between behaviors and environmental stimuli (hyperthermia) in insects. PMID:25423472
Evaluation and comparison of bioinformatic tools for the enrichment analysis of metabolomics data.
Marco-Ramell, Anna; Palau-Rodriguez, Magali; Alay, Ania; Tulipani, Sara; Urpi-Sarda, Mireia; Sanchez-Pla, Alex; Andres-Lacueva, Cristina
2018-01-02
Bioinformatic tools for the enrichment of 'omics' datasets facilitate interpretation and understanding of data. To date few are suitable for metabolomics datasets. The main objective of this work is to give a critical overview, for the first time, of the performance of these tools. To that aim, datasets from metabolomic repositories were selected and enriched data were created. Both types of data were analysed with these tools and outputs were thoroughly examined. An exploratory multivariate analysis of the most used tools for the enrichment of metabolite sets, based on a non-metric multidimensional scaling (NMDS) of Jaccard's distances, was performed and mirrored their diversity. Codes (identifiers) of the metabolites of the datasets were searched in different metabolite databases (HMDB, KEGG, PubChem, ChEBI, BioCyc/HumanCyc, LipidMAPS, ChemSpider, METLIN and Recon2). The databases that presented more identifiers of the metabolites of the dataset were PubChem, followed by METLIN and ChEBI. However, these databases had duplicated entries and might present false positives. The performance of over-representation analysis (ORA) tools, including BioCyc/HumanCyc, ConsensusPathDB, IMPaLA, MBRole, MetaboAnalyst, Metabox, MetExplore, MPEA, PathVisio and Reactome and the mapping tool KEGGREST, was examined. Results were mostly consistent among tools and between real and enriched data despite the variability of the tools. Nevertheless, a few controversial results such as differences in the total number of metabolites were also found. Disease-based enrichment analyses were also assessed, but they were not found to be accurate probably due to the fact that metabolite disease sets are not up-to-date and the difficulty of predicting diseases from a list of metabolites. We have extensively reviewed the state-of-the-art of the available range of tools for metabolomic datasets, the completeness of metabolite databases, the performance of ORA methods and disease-based analyses. 
Despite the variability of the tools, they provided consistent results independent of their analytic approach. However, more work on the completeness of metabolite and pathway databases is required, which strongly affects the accuracy of enrichment analyses. Improvements will be translated into more accurate and global insights of the metabolome.
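The tool comparison in this abstract rests on Jaccard's distances between the outputs of enrichment tools, which are then embedded by NMDS. The distance itself is simple to state; the sketch below shows it for two sets of pathway names. How the authors encoded each tool's output as a set is not described in the abstract, so the input representation and the pathway labels here are assumptions.

```python
def jaccard_distance(set_a, set_b):
    """Jaccard distance between two collections treated as sets.

    distance = 1 - |A intersect B| / |A union B|
    Two empty sets are treated as identical (distance 0.0) by convention.
    """
    a, b = set(set_a), set(set_b)
    union = a | b
    if not union:
        return 0.0
    return 1.0 - len(a & b) / len(union)


if __name__ == "__main__":
    # Hypothetical pathway lists returned by two enrichment tools.
    tool_1 = {"Glycolysis", "TCA cycle", "Purine metabolism"}
    tool_2 = {"TCA cycle", "Purine metabolism", "Biotin metabolism"}
    print(jaccard_distance(tool_1, tool_2))
```

Computing this for every pair of tools gives the symmetric distance matrix that an NMDS routine takes as input.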
Uddin, Reaz; Sufian, Muhammad
2016-01-01
Infections caused by Salmonella enterica, a Gram-negative facultative anaerobic bacterium belonging to the family Enterobacteriaceae, are major threats to the health of humans and animals. The recent availability of complete genome data of pathogenic strains of S. enterica gives new avenues for the identification of drug targets and drug candidates. We have used the genomic and metabolic pathway data to identify pathways and proteins essential to the pathogen and absent from the host. We took the whole proteome sequence data of 42 strains of S. enterica and Homo sapiens along with KEGG-annotated metabolic pathway data, clustered protein sequences using CD-HIT, identified essential genes using the DEG database, discarded S. enterica homologs of human proteins in unique metabolic pathways (UMPs) and characterized hypothetical proteins with SVM-prot and InterProScan. Through this core proteomic analysis we have identified enzymes essential to the pathogen. The identification of 73 enzymes common to 42 strains of S. enterica is the real strength of the current study. We proposed all 73 unexplored enzymes as potential drug targets against the infections caused by S. enterica. The study is comprehensive around S. enterica and simultaneously considered every possible pathogenic strain of S. enterica. This comprehensiveness makes the current study significant since, to the best of our knowledge, it is the first subtractive core proteomic analysis of the unique metabolic pathways applied to any pathogen for the identification of drug targets. We applied extensive computational methods to shortlist a few potential drug targets considering the druggability criteria, e.g. non-homologous to the human host, essential to the pathogen and playing a significant role in essential metabolic pathways of the pathogen (i.e. S. enterica). In the current study, subtractive proteomics was applied through a novel approach, i.e. by considering only proteins of the unique metabolic pathways of the pathogens and mining the proteomic data of all completely sequenced strains of the pathogen, thus improving the quality and application of the results. We believe that the sharing of the knowledge from this study would eventually lead to novel and unique therapeutic regimens against the infections caused by S. enterica.
Zhang, Wei-Dong; Zhao, Yong; Zhang, Hong-Fu; Wang, Shu-Kun; Hao, Zhi-Hui; Liu, Jing; Yuan, Yu-Qing; Zhang, Peng-Fei; Yang, Hong-Di; Shen, Wei; Li, Lan
2016-08-01
Granulosa cells (GCs) are the somatic cells closest to the female germ cell. GCs play a vital role in oocyte growth and development, and the oocyte is necessary for the propagation of a species. Zinc oxide (ZnO) nanoparticles (NPs) readily cross biological barriers and are absorbed into biological systems, which makes them promising candidates as food additives. The objective of the present investigation was to explore the impact of intact NPs on gene expression and the functional classification of altered genes in hen GCs in vivo, to compare the data from in vivo and in vitro studies, and finally to point out the adverse effects of ZnO NPs on the reproductive system. After a 24-week treatment, hen GCs were isolated and gene expression was quantified. Intact NPs were found in the ovary and other organs. Zn levels were similar in ZnO-NP-100 mg/kg- and ZnSO4-100 mg/kg-treated hen ovaries. ZnO-NP-100 mg/kg and ZnSO4-100 mg/kg regulated the expression of the same sets of genes, and they also altered the expression of different sets of genes individually. The number of genes altered by the ZnO-NP-100 mg/kg and ZnSO4-100 mg/kg treatments was different. Gene Ontology (GO) functional analysis reported different results for the two treatments and, in Kyoto Encyclopedia of Genes and Genomes (KEGG) enrichment, 12 pathways (out of the top 20 pathways) in each treatment were different. These results suggested that intact NPs and Zn²⁺ had different effects on gene expression in GCs in vivo. In our recent publication, we noted that intact NPs and Zn²⁺ differentially altered gene expression in GCs in vitro. However, GO functional classification and KEGG pathway enrichment analyses revealed close similarities for the changed genes in vivo and in vitro after ZnO NP treatment. Furthermore, close similarities were observed for the changed genes after ZnSO4 treatments in vivo and in vitro by GO functional classification and KEGG pathway enrichment analyses. 
Therefore, the effects of ZnO NPs on gene expression in vitro might represent their effects on gene expression in vivo. The results from this study and our earlier studies support previous findings indicating ZnO NPs promote adverse effects on organisms. Therefore, precautions should be taken when ZnO NPs are used as diet additives for hens because they might cause reproductive issues.
PathwayAccess: CellDesigner plugins for pathway databases.
Van Hemert, John L; Dickerson, Julie A
2010-09-15
CellDesigner provides a user-friendly interface for graphical biochemical pathway description. Many pathway databases are not directly exportable to CellDesigner models. PathwayAccess is an extensible suite of CellDesigner plugins, which connect CellDesigner directly to pathway databases using respective Java application programming interfaces. The process is streamlined for creating new PathwayAccess plugins for specific pathway databases. Three PathwayAccess plugins, MetNetAccess, BioCycAccess and ReactomeAccess, directly connect CellDesigner to the pathway databases MetNetDB, BioCyc and Reactome. PathwayAccess plugins enable CellDesigner users to expose pathway data to analytical CellDesigner functions, curate their pathway databases and visually integrate pathway data from different databases using standard Systems Biology Markup Language and Systems Biology Graphical Notation. Implemented in Java, PathwayAccess plugins run with CellDesigner version 4.0.1 and were tested on Ubuntu Linux, Windows XP and 7, and MacOSX. Source code, binaries, documentation and video walkthroughs are freely available at http://vrac.iastate.edu/~jlv.
HDAPD: a web tool for searching the disease-associated protein structures
2010-01-01
Background: The structures of disease-associated proteins are important for structure-based drug design against a particular disease. Until now, protein structures have usually been searched through a PDB id or some sequence information. However, in the HDAPD database presented here, the structure of a disease-associated protein can be searched directly by keying in the associated disease name. Description: A search in HDAPD can be easily initiated by keying in keywords for a disease, protein name, protein type, or PDB id. The protein sequence can be presented in FASTA format and directly copied for a BLAST search. HDAPD is also interfaced with Jmol so that users can observe and manipulate a protein structure with Jmol. Gene ontology data such as cellular components, molecular functions, and biological processes are provided once a hyperlink to Gene Ontology (GO) is clicked. Further, HDAPD provides a link to the KEGG map so that the position of the protein and its relationship with other proteins in a metabolic pathway can be found from the map. The latest literature for the protein (titles, journals, authors, and abstracts retrieved from PubMed) is also presented as a list of controllable length. Conclusions: Since the HDAPD data content can be routinely updated through a PHP-MySQL web page, the new database is useful for searching the structures of disease-associated proteins that may play important roles in disease development, in support of structure-based drug design against those diseases. PMID:20158919
Convergent evidence from systematic analysis of GWAS revealed genetic basis of esophageal cancer.
Gao, Xue-Xin; Gao, Lei; Wang, Jiu-Qiang; Qu, Su-Su; Qu, Yue; Sun, Hong-Lei; Liu, Si-Dang; Shang, Ying-Li
2016-07-12
Recent genome-wide association studies (GWAS) have identified single nucleotide polymorphisms (SNPs) associated with risk of esophageal cancer (EC). However, investigation of genetic basis from the perspective of systematic biology and integrative genomics remains scarce. In this study, we explored genetic basis of EC based on GWAS data and implemented a series of bioinformatics methods including functional annotation, expression quantitative trait loci (eQTL) analysis, pathway enrichment analysis and pathway grouped network analysis. Two hundred and thirteen risk SNPs were identified, in which 44 SNPs were found to have significantly differential gene expression in esophageal tissues by eQTL analysis. By pathway enrichment analysis, 170 risk genes mapped by risk SNPs were enriched into 38 significant GO terms and 17 significant KEGG pathways, which were significantly grouped into 9 sub-networks by pathway grouped network analysis. The 9 groups of interconnected pathways were mainly involved with muscle cell proliferation, cellular response to interleukin-6, cell adhesion molecules, and ethanol oxidation, which might participate in the development of EC. Our findings provide genetic evidence and new insight for exploring the molecular mechanisms of EC.
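The abstract above does not say which statistic its pathway enrichment analysis used. One common choice for this kind of over-representation question is a one-sided hypergeometric test; the sketch below is a minimal illustration under that assumption. The function name and all counts are hypothetical and are not taken from the study.

```python
from math import comb


def hypergeom_enrichment_p(background, pathway_size, selected, overlap):
    """One-sided over-representation p-value, P(X >= overlap).

    background   -- number of annotated genes in the universe (assumed)
    pathway_size -- number of background genes that belong to the pathway
    selected     -- number of genes in the list being tested (e.g. risk genes)
    overlap      -- number of selected genes that fall in the pathway
    """
    upper = min(pathway_size, selected)
    total = comb(background, selected)
    # Sum the hypergeometric probabilities for overlap, overlap + 1, ..., upper.
    tail = sum(
        comb(pathway_size, k) * comb(background - pathway_size, selected - k)
        for k in range(overlap, upper + 1)
    )
    return tail / total


if __name__ == "__main__":
    # Hypothetical counts chosen only to show the call.
    p = hypergeom_enrichment_p(background=20000, pathway_size=150,
                               selected=170, overlap=8)
    print(f"p = {p:.3g}")
```

In practice the p-values for all tested pathways would then be adjusted for multiple testing, a step this sketch leaves out.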
Wu, Shengru; Liu, Yanli; Guo, Wei; Cheng, Xi; Ren, Xiaochun; Chen, Si; Li, Xueyuan; Duan, Yongle; Sun, Qingzhu; Yang, Xiaojun
2018-06-27
The liver is mainly hematopoietic in the embryo, and converts into a major metabolic organ in the adult. Therefore, it is intensively remodeled after birth to adapt and perform adult functions. Long non-coding RNAs (lncRNAs) are involved in organ development and cell differentiation, and likely have roles in regulating postnatal liver development. Herein, in order to understand the roles of lncRNAs in postnatal liver maturation, we analyzed the lncRNA and mRNA expression profiles in immature and mature livers from one-day-old and adult (40 weeks of age) breeder roosters by Ribo-Zero RNA-Sequencing. Around 21,939 protein-coding genes and 2220 predicted lncRNAs were expressed in livers of breeder roosters. Compared to protein-coding genes, the identified chicken lncRNAs had fewer exons, shorter transcript lengths, and significantly lower expression levels. Notably, in the comparison between the livers of newborn and adult breeder roosters, a total of 1570 mRNAs and 214 lncRNAs were differentially expressed with the criteria of log2 fold change > 1 or < -1 and P values < 0.05, which was validated by qPCR using five randomly selected mRNAs and five lncRNAs. Further GO and KEGG analyses revealed that the differentially expressed mRNAs were involved in hepatic metabolic and immune functional changes, as well as some biological processes and pathways, including cell proliferation, apoptosis, and the cell cycle, that are implicated in the development of the liver. We also investigated the cis- and trans-regulatory effects of differentially expressed lncRNAs on their target genes. GO and KEGG analyses indicated that the neighboring protein-coding genes and trans-regulated genes of these lncRNAs were associated with adaptation to adult hepatic functions, as well as some pathways involved in liver development, such as the cell cycle pathway, Notch signaling pathway, Hedgehog signaling pathway, and Wnt signaling pathway. 
This study provides a catalog of mRNAs and lncRNAs related to postnatal liver maturation in chicken, and will contribute to a fuller understanding of the biological processes and signaling pathways in which differentially expressed genes and lncRNAs could take part during the significant functional transition of postnatal liver development.
Liu, Yanqing; Wang, Yueqiu; Zhang, Yanxia; Liu, Zhiyong; Xiang, Hongfei; Peng, Xianbo
2017-01-01
Objectives. We aimed to find the key pathways associated with the development of osteoporosis. Methods. We downloaded expression profile data of GSE35959 and analyzed the differentially expressed genes (DEGs) in 3 comparison groups (old_op versus middle, old_op versus old, and old_op versus senescent). KEGG (Kyoto Encyclopedia of Genes and Genomes) pathway enrichment analyses were carried out. In addition, Venn diagram analysis and gene functional interaction (FI) network analysis were performed. Results. In total, 520 DEGs, 966 DEGs, and 709 DEGs were obtained in the old_op versus middle, old_op versus old, and old_op versus senescent groups, respectively. The lysosome pathway was significantly enriched among the intersection genes. Pathway enrichment of the subnetwork modules suggested that, in module 1, mitotic metaphase and anaphase and signaling by Rho GTPases had more proteins from the module. Conclusions. The lysosome pathway, mitotic metaphase and anaphase, and signaling by Rho GTPases may be involved in the development of osteoporosis. Furthermore, Rho GTPases may regulate the balance of bone resorption and bone formation by controlling osteoclasts and osteoblasts. These 3 pathways may be regarded as treatment targets for osteoporosis. PMID:28466021
Zhang, Ying; Zhang, Wei; Li, Xinglan; Li, Dapeng; Zhang, Xiaoling; Yin, Yajie; Deng, Xiangyun; Sheng, Xiugui
2016-06-01
Endometrial cancer (EC) is the most prevalent malignancy worldwide. Although several efforts have been made to explore the molecular mechanism responsible for EC progression, it is still not fully understood. We aimed to evaluate the clinical characteristics and prognostic factors of patients with EC, and further to search for novel genes associated with EC progression. We recruited 328 patients with EC and analyzed prognostic factors using a Cox proportional hazard regression model. Further, a gene expression profile of EC was used to identify the differentially expressed genes (DEGs) between normal samples and tumor samples. Subsequently, Kyoto Encyclopedia of Genes and Genomes pathway enrichment analysis ( http://www.genome.jp/kegg/ ) of the DEGs was performed, and then a protein-protein interaction (PPI) network of the DEGs as well as a subnetwork of the PPI network were constructed with the MCODE plug-in by mapping the DEGs into the Search Tool for the Retrieval of Interacting Genes database. Our results showed that body mass index (BMI), hypertension, myometrial invasion, pathological type, and Glut4-positive expression were prognostic factors in EC (P < 0.05). Bioinformatics analysis showed that upregulated DEGs were associated with the cell cycle, and downregulated DEGs were related to the MAPK pathway. Meanwhile, PPI network analysis revealed that upregulated CDK1 and CCNA2 as well as downregulated JUN and FOS were listed as the top two nodes with high degrees. Patients with EC should be given more focused attention with respect to pathological type, BMI, hypertension, and Glut4-positive expression. In addition, CDK1, CCNA2, JUN, and FOS might play important roles in EC development.
Gupta, Parul; Goel, Ridhi; Pathak, Sumya; Srivastava, Apeksha; Singh, Surya Pratap; Sangwan, Rajender Singh; Asif, Mehar Hasan; Trivedi, Prabodh Kumar
2013-01-01
Withania somnifera is one of the most valuable medicinal plants used in Ayurvedic and other indigenous medicine systems due to bioactive molecules known as withanolides. As genomic information regarding this plant is very limited, little information is available about biosynthesis of withanolides. To facilitate the basic understanding about the withanolide biosynthesis pathways, we performed transcriptome sequencing for Withania leaf (101L) and root (101R) which specifically synthesize withaferin A and withanolide A, respectively. Pyrosequencing yielded 8,34,068 and 7,21,755 reads which got assembled into 89,548 and 1,14,814 unique sequences from 101L and 101R, respectively. A total of 47,885 (101L) and 54,123 (101R) could be annotated using TAIR10, NR, tomato and potato databases. Gene Ontology and KEGG analyses provided a detailed view of all the enzymes involved in withanolide backbone synthesis. Our analysis identified members of cytochrome P450, glycosyltransferase and methyltransferase gene families with unique presence or differential expression in leaf and root and might be involved in synthesis of tissue-specific withanolides. We also detected simple sequence repeats (SSRs) in transcriptome data for use in future genetic studies. Comprehensive sequence resource developed for Withania, in this study, will help to elucidate biosynthetic pathway for tissue-specific synthesis of secondary plant products in non-model plant organisms as well as will be helpful in developing strategies for enhanced biosynthesis of withanolides through biotechnological approaches. PMID:23667511
Down-weighting overlapping genes improves gene set analysis
2012-01-01
Background: The identification of gene sets that are significantly impacted in a given condition based on microarray data is a crucial step in current life science research. Most gene set analysis methods treat genes equally, regardless of how specific they are to a given gene set. Results: In this work we propose a new gene set analysis method that computes a gene set score as the mean of absolute values of weighted moderated gene t-scores. The gene weights are designed to emphasize the genes appearing in few gene sets, versus genes that appear in many gene sets. We demonstrate the usefulness of the method when analyzing gene sets that correspond to the KEGG pathways, and hence we called our method Pathway Analysis with Down-weighting of Overlapping Genes (PADOG). Unlike most gene set analysis methods, which are validated through the analysis of 2-3 data sets followed by a human interpretation of the results, the validation employed here uses 24 different data sets and a completely objective assessment scheme that makes minimal assumptions and eliminates the need for possibly biased human assessments of the analysis results. Conclusions: PADOG significantly improves gene set ranking and boosts sensitivity of analysis using information already available in the gene expression profiles and the collection of gene sets to be analyzed. The advantages of PADOG over other existing approaches are shown to be stable to changes in the database of gene sets to be analyzed. PADOG was implemented as an R package available at: http://bioinformaticsprb.med.wayne.edu/PADOG/ or http://www.bioconductor.org. PMID:22713124
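The scoring idea described in this abstract (a gene set score computed as the mean of absolute weighted gene t-scores, with larger weights for genes that appear in few gene sets) can be sketched in a few lines. This is an illustration only, not the PADOG package: the weighting function below is an assumed form that maps a gene's frequency across sets to a weight between 1 and 2, the moderated t-scores are taken as given inputs, and any further steps of the published method that the abstract does not describe are omitted. All names and numbers are hypothetical.

```python
from math import sqrt


def gene_weights(gene_sets):
    """Weights in [1, 2]; genes found in few sets get weights near 2.

    Assumed illustrative form: 1 + sqrt((max_f - f) / (max_f - min_f)),
    where f is the number of gene sets containing the gene.
    """
    freq = {}
    for genes in gene_sets.values():
        for gene in set(genes):
            freq[gene] = freq.get(gene, 0) + 1
    lo, hi = min(freq.values()), max(freq.values())
    if hi == lo:
        # Every gene is equally common, so no gene is down-weighted.
        return {gene: 1.0 for gene in freq}
    return {gene: 1.0 + sqrt((hi - f) / (hi - lo)) for gene, f in freq.items()}


def gene_set_score(genes, t_scores, weights):
    """Mean of |t-score| * weight over the genes in one set."""
    return sum(abs(t_scores[g]) * weights[g] for g in genes) / len(genes)


if __name__ == "__main__":
    # Toy input: g2 sits in all three sets, so it receives the lowest weight.
    sets = {"A": ["g1", "g2"], "B": ["g2", "g3"], "C": ["g2", "g4"]}
    t = {"g1": 2.0, "g2": -3.0, "g3": 0.5, "g4": -1.0}
    w = gene_weights(sets)
    for name, genes in sets.items():
        print(name, gene_set_score(genes, t, w))
```

With these toy numbers the strongly changed but ubiquitous gene g2 contributes with weight 1, while the set-specific genes contribute with weight 2, which is the down-weighting behaviour the abstract describes.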
Comparative de novo transcriptome analysis of male and female Sea buckthorn.
Bansal, Ankush; Salaria, Mehul; Sharma, Tashil; Stobdan, Tsering; Kant, Anil
2018-02-01
Sea buckthorn is a dioecious medicinal plant found at high altitude. The plant has both male and female reproductive organs in separate individuals. In this article, whole transcriptome de novo assemblies of male and female flower bud samples were carried out using Illumina NextSeq 500 platform to determine the role of the genes involved in sex determination. Moreover, genes with differential expression in male and female transcriptomes were identified to understand the underlying sex determination mechanism. The current study showed 63,904 and 62,272 coding sequences (CDS) in female and male transcriptome data sets, respectively. 16,831 common CDS were screened out from both transcriptomes, out of which 625 were upregulated and 491 were found to be downregulated. To understand the potential regulatory roles of differentially expressed genes in metabolic networks and biosynthetic pathways, KEGG mapping, gene ontology, and co-expression network analysis were performed. Comparison with Flowering Interactive Database (FLOR-ID) resulted in eight differentially expressed genes, viz. CHD3-type chromatin-remodeling factor PICKLE (PKL), phytochrome-associated serine/threonine-protein phosphatase (FYPP), protein TOPLESS (TPL), sensitive to freezing 6 (SFR6), lysine-specific histone demethylase 1 homolog 1 (LDL1), pre-mRNA-processing-splicing factor 8A (PRP8A), sucrose synthase 4 (SUS4), and ubiquitin carboxyl-terminal hydrolase 12 (UBP12), known to be broadly involved in flowering, photoperiodism, embryo development, and cold response pathways. Male and female flower bud transcriptome data of Sea buckthorn may provide comprehensive information at genomic level for the identification of genetic regulation involved in sex determination.
Zhang, Shi-tao; Zuo, Chao; Li, Wan-nan; Fu, Xue-qi; Xing, Shu; Zhang, Xiao-ping
2016-02-01
The aim was to identify key genes related to the effect of estrogen on ovarian cancer. Microarray data (GSE22600) were downloaded from Gene Expression Omnibus. Eight estrogen and seven placebo treatment samples were obtained using a 2 × 2 factorial design, which contained 2 cell lines (PEO4 and 2008) and 2 treatments (estrogen and placebo). Differentially expressed genes (DEGs) were identified by Bayesian methods, with P < 0.05 and |log2FC (fold change)| ≥ 0.5 as the cut-off criteria. Differentially co-expressed genes (DCGs) and differentially regulated genes (DRGs) were identified by the DCe function and the DRsort function in the DCGL package, respectively. Topological structure analysis was performed on the important transcriptional factors (TFs) and genes in the transcriptional regulatory network using tYNA. Functional enrichment analysis was performed for the DEGs and for the important genes using the Gene Ontology and KEGG databases. In total, 465 DEGs were identified. Functional enrichment analysis of the DEGs indicated that ACVR2B, LTBP1, BMP7 and MYC were involved in the TGF-beta signaling pathway. In addition, 2285 DCG pairs and 357 DRGs were identified. Topological structure analysis identified 52 important TFs and 65 important genes. Functional enrichment analysis of the important genes showed that TP53 and MLH1 participated in the DNA damage response and that ACVR2B, LTBP1, BMP7 and MYC were involved in the TGF-beta signaling pathway. TP53, MLH1, ACVR2B, LTBP1 and BMP7 might participate in the pathogenesis of ovarian cancer.
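The cut-off stated in this abstract (P < 0.05 and |log2FC| ≥ 0.5) amounts to a simple two-condition filter. The sketch below shows that filter on toy data; the function name, the input layout (gene mapped to a p-value and a log2 fold change) and the gene labels and values are hypothetical and only illustrate the thresholds, not the study's Bayesian testing step.

```python
def select_degs(results, p_cutoff=0.05, lfc_cutoff=0.5):
    """Return genes with p < p_cutoff and |log2 fold change| >= lfc_cutoff.

    results -- dict mapping gene name to a (p_value, log2_fold_change) pair
    """
    return [
        gene
        for gene, (p_value, log2_fc) in results.items()
        if p_value < p_cutoff and abs(log2_fc) >= lfc_cutoff
    ]


if __name__ == "__main__":
    # Made-up values; only geneA and geneD pass both thresholds.
    toy = {
        "geneA": (0.001, 1.2),    # significant and large change
        "geneB": (0.200, 2.0),    # large change but not significant
        "geneC": (0.010, 0.3),    # significant but change too small
        "geneD": (0.049, -0.5),   # passes: negative changes count via abs()
    }
    print(select_degs(toy))
```

Taking the absolute value is what lets a single threshold capture both up- and downregulated genes.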
Zhao, Wenchao; Yang, Xueyong; Yu, Hongjun; Jiang, Weijie; Sun, Na; Liu, Xiaoran; Liu, Xiaolin; Zhang, Xiaomeng; Wang, Yan; Gu, Xingfang
2015-03-01
Nitrogen (N) is both an important macronutrient and a signal for plant growth and development. However, the early regulatory mechanism of plants in response to N starvation is not well understood, especially in cucumber, an economically important crop that normally consumes excessive N during production. In this study, the early time-course transcriptome response of cucumber leaves under N deficiency was monitored using RNA sequencing (RNA-Seq). More than 23,000 transcripts were examined in cucumber leaves, of which 364 genes were differentially expressed in response to N deficiency. Based on the Kyoto Encyclopedia of Genes and Genomes (KEGG) pathway database, gene ontology (GO) and protein-protein interaction analysis, 64 signaling-related N-deficiency-responsive genes were identified. Furthermore, the potential regulatory mechanisms of anthocyanin accumulation, Chl decline and cell wall remodeling were assessed at the transcription level. Increased ascorbic acid synthesis was identified in cucumber seedlings and fruit under N-deficient conditions, and a new corresponding regulatory hypothesis has been proposed. A data cross-comparison between model plants and cucumber was made, and some common and specific N-deficient response mechanisms were found in the present study. Our study provides novel insights into the responses of cucumber to nitrogen starvation at the global transcriptome level, which are expected to be highly useful for dissecting the N response pathways in this major vegetable and for improving N fertilization practices.
Gong, Ai-Xiu; Zhang, Jing-Han; Li, Jing; Wu, Jun; Wang, Lin; Miao, Deng-Shun
2017-01-01
There are anatomical and functional differences between human dental pulp (DP) and periodontal ligament (PDL). However, the molecular biological differences and function of these tissues are poorly understood. In the present study, we employed a cDNA microarray to screen for differentially expressed genes (DEGs) between human DP and PDL tissues, and used the online software WebGestalt to perform the functional analysis of the DEGs. In addition, the STRING database and KEGG pathway analysis were applied for interaction network and pathway analysis of the DEGs. DP and PDL samples were obtained from permanent premolars (n=16) extracted for orthodontic purposes. The results of the microarray assay were confirmed by RT-qPCR. The DEGs were found to be significantly associated with the extracellular matrix and focal adhesion. A total of 10 genes were selected to confirm the results. The mRNA levels of integrin alpha 4 (ITGA4), integrin alpha 8 (ITGA8), neurexin 1 (NRXN1) and contactin 1 (CNTN1) were significantly higher in the DP than in the PDL tissues. However, the levels of collagen type XI alpha 1 (COL11A1), aggrecan (ACAN), collagen type VI alpha 1 (COL6A1), chondroadherin (CHAD), laminin gamma 2 (LAMC2) and laminin alpha 3 (LAMA3) were higher in the PDL than in the DP samples. The gene expression profiles provide novel insight into the characterization of DP and PDL tissues, and contribute to our understanding of the potential molecular mechanisms of dental tissue mineralization and regeneration. PMID:28713908
Fischer, Carol L; Dawson, Deborah V; Blanchette, Derek R; Drake, David R; Wertz, Philip W; Brogden, Kim A
2016-01-01
Lipids endogenous to skin and mucosal surfaces exhibit potent antimicrobial activity against Porphyromonas gingivalis, an important colonizer of the oral cavity implicated in periodontitis. Our previous work demonstrated the antimicrobial activity of the fatty acid sapienic acid (C16:1Δ6) against P. gingivalis and found that sapienic acid treatment alters both protein and lipid composition from those in controls. In this study, we further examined whole-cell protein differences between sapienic acid-treated bacteria and untreated controls, and we utilized open-source functional association and annotation programs to explore potential mechanisms for the antimicrobial activity of sapienic acid. Our analyses indicated that sapienic acid treatment induces a unique stress response in P. gingivalis resulting in differential expression of proteins involved in a variety of metabolic pathways. This network of differentially regulated proteins was enriched in protein-protein interactions (P = 2.98 × 10⁻⁸), including six KEGG pathways (P value ranges, 2.30 × 10⁻⁵ to 0.05) and four Gene Ontology (GO) molecular functions (P value ranges, 0.02 to 0.04), with multiple suggestive enriched relationships in KEGG pathways and GO molecular functions. Upregulated metabolic pathways suggest increases in energy production, lipid metabolism, iron acquisition and processing, and respiration. Combined with a suggested preferential metabolism of serine, which is necessary for fatty acid biosynthesis, these data support our previous findings that the site of sapienic acid antimicrobial activity is likely at the bacterial membrane. P. gingivalis is an important opportunistic pathogen implicated in periodontitis. Affecting nearly 50% of the population, periodontitis is treatable, but the resulting damage is irreversible and eventually progresses to tooth loss. 
There is a great need for natural products that can be used to treat and/or prevent the overgrowth of periodontal pathogens and increase oral health. Sapienic acid is endogenous to the oral cavity and is a potent antimicrobial agent, suggesting a potential therapeutic or prophylactic use for this fatty acid. This study examines the effects of sapienic acid treatment on P. gingivalis and highlights the membrane as the likely site of antimicrobial activity.
Screening and analyzing genes associated with Amur tiger placental development.
Li, Q; Lu, T F; Liu, D; Hu, P F; Sun, B; Ma, J Z; Wang, W J; Wang, K F; Zhang, W X; Chen, J; Guan, W J; Ma, Y H; Zhang, M H
2014-09-26
The Amur tiger is a unique endangered species in the world, and thus, protection of its genetic resources is extremely important. In this study, an Amur tiger placenta cDNA library was constructed using the SMART cDNA Library Construction kit. A total of 508 colonies were sequenced, in which 205 (76%) genes were annotated and mapped to 74 KEGG pathways, including 29 metabolism, 29 genetic information processing, 4 environmental information processing, 7 cell motility, and 5 organismal system pathways. Additionally, PLAC8, PEG10 and IGF-II were identified after screening genes from the expressed sequence tags, and they were associated with placental development. These findings could lay the foundation for future functional genomic studies of the Amur tiger.
Characterization of the myometrial transcriptome in women with an arrest of dilatation during labor
Chaemsaithong, Piya; Madan, Ichchha; Romero, Roberto; Than, Nandor G; Tarca, Adi L; Draghici, Sorin; Bhatti, Gaurav; Mazor, Moshe; Kim, Chong Jai; Hassan, Sonia S; Chaiworapongsa, Tinnakorn
2014-01-01
Objective: The molecular basis of failure to progress in labor is poorly understood. This study was undertaken to characterize the myometrial transcriptome of patients with an arrest of dilatation (AODIL). Study design: Human myometrium was prospectively collected from women in the following groups: 1) spontaneous term labor (TL; n=29); and 2) arrest of dilatation (AODIL; n=14). Gene expression was characterized using Illumina® HumanHT-12 microarrays. A moderated Student's t-test and false discovery rate adjustment were used for analysis. Quantitative reverse transcription-polymerase chain reaction (qRT-PCR) of selected genes was performed in an independent sample set. Pathway analysis was performed on the Kyoto Encyclopedia of Genes and Genomes (KEGG) pathway database using Pathway Analysis with Down-weighting of Overlapping Genes (PADOG). The Metacore knowledge base was also mined for pathway analysis. Results: 1) 42 differentially expressed genes were identified in women with an AODIL; 2) gene ontology analysis indicated enrichment of biological processes including regulation of angiogenesis, response to hypoxia, inflammatory response, and chemokine-mediated signaling pathway. Enriched molecular functions included transcription repressor activity, heat shock protein (Hsp) 90 binding, and nitric oxide synthase (NOS) activity; 3) Metacore analysis identified immune response chemokine (C-C motif) ligand 2 (CCL2) signaling, muscle contraction regulation of eNOS activity in endothelial cells, and triiodothyronine and thyroxine signaling as significantly over-represented (FDR<0.05); 4) qRT-PCR confirmed overexpression of nitric oxide synthase 3 (NOS3), hypoxia-inducible factor 1 alpha (HIF1A), chemokine (C-C motif) ligand 2 (CCL2), angiopoietin-like 4 (ANGPTL4), ADAM metallopeptidase with thrombospondin type 1 motif 9 (ADAMTS9), G protein-coupled receptor 4 (GPR4), metallothionein 1A (MT1A), MT2A, and selectin E (SELE) in an AODIL. Conclusion: The myometrium of women with arrest of dilatation has a stereotypic transcriptome profile. This disorder was associated with a pattern of gene expression involved in muscle contraction, an inflammatory response, and hypoxia. This is the first comprehensive and unbiased examination of the molecular basis of an AODIL. PMID:23893668
Hampel, Miriam; Alonso, Esteban; Aparicio, Irene; Bron, James E; Santos, Juan Luis; Taggart, John B; Leaver, Michael J
2010-05-01
Pharmaceuticals are emerging pollutants widely used in everyday urban activities which can be detected in surface, ground, and drinking waters. Their presence is derived from consumption of medicines, disposal of expired medications, release of treated and untreated urban effluents, and from the pharmaceutical industry. Their growing use has become an alarming environmental problem which potentially will become dangerous in the future. However, there is still a lack of knowledge about long-term effects in non-target organisms as well as for human health. Toxicity testing has indicated a relatively low acute toxicity to fish species, but no information is available on possible sublethal effects. This study provides data on the physiological pathways involved in the exposure of Atlantic salmon as representative test species to three pharmaceutical compounds found in ground, surface, and drinking waters based on the evaluation of the xenobiotic-induced impairment resulting in the activation and silencing of specific genes. Individuals of Atlantic salmon (Salmo salar) parr were exposed during 5 days to environmentally relevant concentrations of three representative pharmaceutical compounds with high consumption rates: the analgesic acetaminophen (54.77 ± 34.67 µg L⁻¹), the anticonvulsant carbamazepine (7.85 ± 0.13 µg L⁻¹), and the beta-blocker atenolol (11.08 ± 7.98 µg L⁻¹). Five immature males were selected for transcriptome analysis in brain tissues by means of a 17k salmon cDNA microarray. For this purpose, mRNA was isolated and reverse-transcribed into cDNA which was labeled with fluorescent dyes and hybridized against a common pool to the arrays. Lists of significantly up- and down-regulated candidate genes were submitted to KEGG (Kyoto Encyclopedia of Genes and Genomes) in order to analyze for induced pathways and to evaluate the usefulness of this method in cases of not completely annotated test organisms. 
Exposure for 5 days to environmentally relevant concentrations of the selected pharmaceutical compounds acetaminophen, carbamazepine, and atenolol produced differences in the expression of 659, 700, and 480 candidate genes, respectively. KEGG annotation numbers (KO annotations) were obtained for between 26.57% and 33.33% of these differentially expressed genes per treatment in comparison to non-exposure conditions. Pathways shown to be induced did not always follow previously reported targets or metabolic routes for the employed treatments; however, several other pathways were found (four or more features) to be significantly induced. Energy-related pathways were altered under exposure in all the selected treatments, indicating a possible energy budget leakage due to additional processes resulting from the exposure to environmental contaminants. The observed induction of pathways may indicate additional processes involved in the mode of action of the selected pharmaceuticals which may not have been detected with conventional methods like quantitative PCR, in which only suspected features are analyzed individually for effects. The employment of novel high-throughput screening techniques in combination with global pathway analysis methods, even if the organism is not completely annotated, allows the examination of a much broader range of candidates for potential effects of exposure at the gene level. The continuously growing number of annotations of representative species relevant for environmental quality testing is facilitating pathway analysis processes for not completely annotated organisms. 
KEGG has shown to be a useful tool for the analysis of induced pathways from data generated by microarray techniques with the selected pharmaceutical contaminants acetaminophen, carbamazepine, and atenolol, but further studies have to be carried out in order to determine if a similar expression pattern in terms of fold change quantity and pathways is observed after long-term exposure. Together with the information obtained in this study, it will then be possible to evaluate the potential risk that the continuous release of these compounds may have on the environment and ecosystem functioning.
Wan, Xuebin; Wang, Dan; Xiong, Qi; Xiang, Hong; Li, Huanan; Wang, Hongshuai; Liu, Zezhang; Niu, Hongdan; Peng, Jian; Jiang, Siwen; Chai, Jin
2016-11-11
Stress response is tightly linked to meat quality. The current understanding of the intrinsic mechanism of meat deterioration under stress is limited. Here, male piglets were randomly assigned to cortisol and control groups. Our results showed that when serum cortisol level was significantly increased, the meat color at 1 h postmortem, muscle bundle ratio, apoptosis rate, and expression levels of calcium channel and apoptosis genes, including SERCA1, IP3R1, BAX, Bcl-2, and Caspase-3, were notably increased. However, drip loss at 24 h postmortem and serum CK were significantly decreased. Additionally, a large number of differentially expressed genes (DEGs) involved in the GC regulation mechanism were screened out using transcriptome sequencing. A total of 223 DEGs were found, including 80 up-regulated genes and 143 down-regulated genes. A total of 204 genes were enriched in GO terms, and 140 genes were annotated in the KEGG database. Numerous genes were primarily involved in defense, inflammatory and wound responses. This study not only identifies important genes and signalling pathways that may affect meat quality but also offers a reference for breeding and feeding management to provide consumers with better quality pork products.
Singh, Vinayak; Goel, Ridhi; Pande, Veena; Asif, Mehar Hasan; Mohanty, Chandra Sekhar
2017-01-01
Condensed tannins (CTs), or proanthocyanidins (PAs), are a unique group of high-molecular-weight phenolic metabolites with a specific structure. High CT content in legumes is reported to adversely affect the nutrients in the plant and to impair digestibility upon consumption by animals. Winged bean (Psophocarpus tetragonolobus (L.) DC.) is a promising underutilized legume with high protein and oil content. One of the reasons for its underutilization is the presence of CT. Transcriptome sequencing of leaves of two diverse CT-containing lines of P. tetragonolobus was carried out on an Illumina NextSeq 500 sequencer to identify the underlying genes and contigs responsible for CT biosynthesis. RNA-Seq data generated 102586 and 88433 contigs for the high-CT (HCTW) and low-CT (LCTW) lines of P. tetragonolobus, respectively. Similarity searches against the Gene Ontology (GO) and Kyoto Encyclopedia of Genes and Genomes (KEGG) databases revealed 5210 contigs involved in 229 different pathways. A total of 1235 contigs were found to be differentially expressed between the HCTW and LCTW lines. The findings of this study will be helpful for functional and comparative genomic analysis of condensed tannin biosynthesis in this plant in particular and in legumes in general. PMID:28322296
Long non-coding RNA expression profile in Cdk5-knockdown mouse skin.
Ji, Kaiyuan; Fan, Ruiwen; Zhang, Junzhen; Yang, Shanshan; Dong, Changsheng
2018-06-08
To elucidate the Cdk5 regulatory molecular mechanism in skin, we generated Cdk5-knockdown mice and subjected their skins to lncRNA sequencing. The results showed that there were 4533 novel lncRNAs from 142 lncRNA families. In total, 693 lncRNAs were significantly differentially expressed. Alignment analysis of the lncRNAs in miRBase identified 45 pre-mRNAs. By KEGG PATHWAY Database analysis, we found that lncRNAs (lnc-NONMMUT064276.2, lnc-NONMMUT075728.1, and lnc-NONMMUT039653.2) may regulate pigmentation by regulating target genes. To reveal potential antisense lncRNA-mRNA interactions, we searched all lncRNA-mRNA duplexes using RNAplex, and found that 97 lncRNAs interacted with mRNAs. Based on the bioinformatics analysis, a luciferase assay with co-transfection of pVAX1-TCONS_00049140 and pGL0-Krt80 expression plasmids in 293T cells confirmed that TCONS_00049140 bound to Krt80. Overexpression of TCONS_00049140 in mouse melanocytes down-regulated Krt80 and resulted in a phenotype of increased cell proliferation and increased melanin production. The results suggested that TCONS_00049140 contributed to skin thickening through Krt80. Our findings provide a direction for research into the molecular mechanism of Cdk5 function. Copyright © 2017. Published by Elsevier B.V.
Yang, Qing; Sun, Fanyue; Yang, Zhi; Li, Hongjun
2014-01-01
Calanus sinicus Brodsky (Copepoda, Crustacea) is a dominant zooplanktonic species widely distributed in the margin seas of the Northwest Pacific Ocean. In this study, we utilized an RNA-Seq-based approach to develop molecular resources for C. sinicus. Adult samples were sequenced using the Illumina HiSeq 2000 platform. The sequencing data generated 69,751 contigs from 58.9 million filtered reads. The assembled contigs had an average length of 928.8 bp. Gene annotation allowed the identification of 43,417 unigene hits against the NCBI database. Gene ontology (GO) and KEGG pathway mapping analysis revealed various functional genes related to diverse biological functions and processes. Transcripts potentially involved in stress response and lipid metabolism were identified among these genes. Furthermore, 4,871 microsatellites and 110,137 single nucleotide polymorphisms (SNPs) were identified in the C. sinicus transcriptome sequences. SNP validation by the melting temperature (T m)-shift method suggested that 16 primer pairs amplified target products and showed biallelic polymorphism among 30 individuals. The present work demonstrates the power of Illumina-based RNA-Seq for the rapid development of molecular resources in nonmodel species. The validated SNP set from our study is currently being utilized in an ongoing ecological analysis to support a future study of C. sinicus population genetics. PMID:24982883
Integrative Functional Genomics for Systems Genetics in GeneWeaver.org.
Bubier, Jason A; Langston, Michael A; Baker, Erich J; Chesler, Elissa J
2017-01-01
The abundance of existing functional genomics studies permits an integrative approach to interpreting and resolving the results of diverse systems genetics studies. However, a major challenge lies in assembling and harmonizing heterogeneous data sets across species for facile comparison to the positional candidate genes and coexpression networks that come from systems genetic studies. GeneWeaver is an online database and suite of tools at www.geneweaver.org that allows for fast aggregation and analysis of gene set-centric data. GeneWeaver contains curated experimental data together with resource-level data such as GO annotations, MP annotations, and KEGG pathways, along with persistent stores of user-entered data sets. These can be entered directly into GeneWeaver or transferred from widely used resources such as GeneNetwork.org. Data are analyzed using statistical tools and advanced graph algorithms to discover new relations, prioritize candidate genes, and generate function hypotheses. Here we use GeneWeaver to find genes common to multiple gene sets, prioritize candidate genes from a quantitative trait locus, and characterize a set of differentially expressed genes. Coupling a large multispecies repository of curated and empirical functional genomics data to fast computational tools allows for the rapid integrative analysis of heterogeneous data for interpreting and extrapolating systems genetics results.
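The first use case named in this abstract, finding genes common to multiple gene sets, can be illustrated with a minimal Python sketch. It does not use GeneWeaver's own tools or API; the helper `genes_shared_by`, the gene set names, and the gene symbols are hypothetical placeholders chosen only for illustration.

```python
from collections import Counter


def genes_shared_by(gene_sets, min_sets):
    """Return {gene: number of sets containing it} for genes in >= min_sets sets."""
    counts = Counter()
    for genes in gene_sets.values():
        counts.update(set(genes))  # count each gene at most once per set
    return {gene: n for gene, n in counts.items() if n >= min_sets}


# Hypothetical gene sets (placeholder symbols, not real study results).
gene_sets = {
    "qtl_candidates": {"GeneA", "GeneB", "GeneC"},
    "differential_expression": {"GeneB", "GeneC", "GeneD"},
    "pathway_members": {"GeneC", "GeneE"},
}

shared = genes_shared_by(gene_sets, min_sets=2)
# Rank by how many sets support each gene, then alphabetically.
ranked = sorted(shared.items(), key=lambda item: (-item[1], item[0]))
print(ranked)
```

Ranking by the number of supporting sets is one simple way to prioritize candidates; a real analysis would also need to map gene identifiers across species before comparing sets.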
Analysis of expressed sequence tags from the Ulva prolifera (Chlorophyta)
NASA Astrophysics Data System (ADS)
Niu, Jianfeng; Hu, Haiyan; Hu, Songnian; Wang, Guangce; Peng, Guang; Sun, Song
2010-01-01
In 2008, a green tide broke out before the sailing competition of the 29th Olympic Games in Qingdao. The causative species was determined to be Enteromorpha prolifera (Ulva prolifera O. F. Müller), a familiar green macroalga along the coastline of China. Rapid accumulation of a large biomass of floating U. prolifera prompted research on different aspects of this species. In this study, we constructed a nonnormalized cDNA library from the thalli of U. prolifera and acquired 10 072 high-quality expressed sequence tags (ESTs). These ESTs were assembled into 3 519 nonredundant gene groups, including 1 446 clusters and 2 073 singletons. After annotation with the nr database, a large number of genes were found to be related to chloroplast and ribosomal proteins. GO functional classification showed that 1 418 ESTs participated in photosynthesis and 1 359 ESTs were responsible for the generation of precursor metabolites and energy. In addition, rather comprehensive carbon fixation pathways were found in U. prolifera using KEGG. Some stress-related and signal transduction-related genes were also found in this study. All of this evidence indicates that U. prolifera has the material and energy basis for intense photosynthesis and rapid proliferation. Phylogenetic analysis of cytochrome c oxidase subunit I revealed that this green-tide causative species is most closely affiliated to Pseudendoclonium akinetum (Ulvophyceae).
Zeng, Xu; Ling, Hong; Yang, Jianwen; Chen, Juan; Guo, Shunxing
2018-05-05
Hericium erinaceus, a famous edible mushroom, is also a well-known traditional medicinal fungus. To date, a large number of bioactive metabolites with antitumor, antibacterial, and immune-boosting effects have been isolated from the free-living mycelium and fruiting body of H. erinaceus. Here we used a proteomic approach to explore proteins involved in the regulation of bioactive metabolites, including terpenoids, polyketides, and sterols. RESULTS: Using mass spectrometry, a total of 2543 unique proteins were identified using the H. erinaceus genome, of which 2449, 1855, 1533 and 690 proteins were successfully annotated in the Nr, KOG, KEGG and GO databases, respectively. Among them, 722 proteins were differentially expressed (528 up- and 194 down-regulated) in the fruiting body compared with the mycelium. Most of the differentially expressed proteins were putatively involved in energy metabolism, molecular signaling, and secondary metabolism. Additionally, numerous proteins involved in terpenoid, polyketide, and sterol biosynthesis were identified. Our data revealed that proteins involved in polyketide biosynthesis were up-regulated in the fruiting body, while some proteins in mevalonate (MEP) pathway from terpenoid biosynthesis were generally up-regulated in mycelium. The present study suggested that the differential regulation of biosynthesis genes could produce various bioactive metabolites with pharmacological effects in H. erinaceus. Copyright © 2017. Published by Elsevier B.V.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Shi, CY; Yang, H; Wei, CL
Tea is one of the most popular non-alcoholic beverages worldwide. However, the tea plant, Camellia sinensis, is difficult to culture in vitro, to transform, and has a large genome, rendering little genomic information available. Recent advances in large-scale RNA sequencing (RNA-seq) provide a fast, cost-effective, and reliable approach to generate large expression datasets for functional genomic analysis, which is especially suitable for non-model species with un-sequenced genomes. Using high-throughput Illumina RNA-seq, the transcriptome from poly (A)+ RNA of C. sinensis was analyzed at an unprecedented depth (2.59 gigabase pairs). Approximately 34.5 million reads were obtained, trimmed, and assembled into 127,094 unigenes, with an average length of 355 bp and an N50 of 506 bp, which consisted of 788 contig clusters and 126,306 singletons. This number of unigenes was 10-fold higher than existing C. sinensis sequences deposited in GenBank (as of August 2010). Sequence similarity analyses against six public databases (Uniprot, NR and COGs at NCBI, Pfam, InterPro and KEGG) found 55,088 unigenes that could be annotated with gene descriptions, conserved protein domains, or gene ontology terms. Some of the unigenes were assigned to putative metabolic pathways. Targeted searches using these annotations identified the majority of genes associated with several primary metabolic pathways and natural product pathways that are important to tea quality, such as flavonoid, theanine and caffeine biosynthesis pathways. Novel candidate genes of these secondary pathways were discovered. Comparisons with four previously prepared cDNA libraries revealed that this transcriptome dataset has both a high degree of consistency with previous EST data and an approximately 20-fold increase in coverage. Thirteen unigenes related to theanine and flavonoid synthesis were validated.
Their expression patterns in different organs of the tea plant were analyzed by RT-PCR and quantitative real time PCR (qRT-PCR). An extensive transcriptome dataset has been obtained from the deep sequencing of tea plant. The coverage of the transcriptome is comprehensive enough to discover all known genes of several major metabolic pathways. This transcriptome dataset can serve as an important public information platform for gene expression, genomics, and functional genomic studies in C. sinensis.
2011-01-01
Background Tea is one of the most popular non-alcoholic beverages worldwide. However, the tea plant, Camellia sinensis, is difficult to culture in vitro, to transform, and has a large genome, rendering little genomic information available. Recent advances in large-scale RNA sequencing (RNA-seq) provide a fast, cost-effective, and reliable approach to generate large expression datasets for functional genomic analysis, which is especially suitable for non-model species with un-sequenced genomes. Results Using high-throughput Illumina RNA-seq, the transcriptome from poly (A)+ RNA of C. sinensis was analyzed at an unprecedented depth (2.59 gigabase pairs). Approximately 34.5 million reads were obtained, trimmed, and assembled into 127,094 unigenes, with an average length of 355 bp and an N50 of 506 bp, which consisted of 788 contig clusters and 126,306 singletons. This number of unigenes was 10-fold higher than existing C. sinensis sequences deposited in GenBank (as of August 2010). Sequence similarity analyses against six public databases (Uniprot, NR and COGs at NCBI, Pfam, InterPro and KEGG) found 55,088 unigenes that could be annotated with gene descriptions, conserved protein domains, or gene ontology terms. Some of the unigenes were assigned to putative metabolic pathways. Targeted searches using these annotations identified the majority of genes associated with several primary metabolic pathways and natural product pathways that are important to tea quality, such as flavonoid, theanine and caffeine biosynthesis pathways. Novel candidate genes of these secondary pathways were discovered. Comparisons with four previously prepared cDNA libraries revealed that this transcriptome dataset has both a high degree of consistency with previous EST data and an approximately 20-fold increase in coverage. Thirteen unigenes related to theanine and flavonoid synthesis were validated.
Their expression patterns in different organs of the tea plant were analyzed by RT-PCR and quantitative real time PCR (qRT-PCR). Conclusions An extensive transcriptome dataset has been obtained from the deep sequencing of tea plant. The coverage of the transcriptome is comprehensive enough to discover all known genes of several major metabolic pathways. This transcriptome dataset can serve as an important public information platform for gene expression, genomics, and functional genomic studies in C. sinensis. PMID:21356090
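The assembly statistics quoted in this abstract (average unigene length and N50) can be computed from a list of sequence lengths. The sketch below assumes the common definition of N50: the length L such that sequences of length L or longer together contain at least half of all assembled bases. The function names and the toy lengths are illustrative and are not taken from the tea transcriptome data.

```python
def mean_length(lengths):
    """Average sequence length in bp."""
    if not lengths:
        raise ValueError("lengths must not be empty")
    return sum(lengths) / len(lengths)


def n50(lengths):
    """Length L such that sequences >= L hold at least half of all bases."""
    if not lengths:
        raise ValueError("lengths must not be empty")
    ordered = sorted(lengths, reverse=True)
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half_total:
            return length


# Toy unigene lengths in bp (made up for illustration).
toy_lengths = [100, 200, 300, 400, 500]
print(mean_length(toy_lengths), n50(toy_lengths))
```

Because N50 weights long sequences more heavily than the mean does, it is normally larger than the average length for a skewed length distribution, which is consistent with the 355 bp average and 506 bp N50 reported above.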
Identification of DNA Methyltransferase Genes in Human Pathogenic Bacteria by Comparative Genomics.
Brambila-Tapia, Aniel Jessica Leticia; Poot-Hernández, Augusto Cesar; Perez-Rueda, Ernesto; Rodríguez-Vázquez, Katya
2016-06-01
DNA methylation plays an important role in gene expression and virulence in some pathogenic bacteria. In this report, we describe DNA methyltransferases (MTases) present in human pathogenic bacteria and compare them with related species that are non-pathogenic or less pathogenic, based on comparative genomics. We searched the KEGG database for the KEGG orthology groups associated with adenine and cytosine DNA MTase activities (EC: 2.1.1.37, EC: 2.1.1.113 and EC: 2.1.1.72) in 37 human pathogenic species and 18 non/less pathogenic relatives, and compared the number of these MTase sequences according to genome size, DNA MTase type, and their non/less pathogenic relatives. We observed that Helicobacter pylori and Neisseria spp. presented the highest number of MTases while ten different species did not present a predicted DNA MTase. We also detected significantly more adenine MTases than cytosine MTases (2.19 vs. 1.06, respectively, p < 0.001). Adenine MTases were the only MTases associated with restriction modification systems, and DNA MTases associated with type I restriction modification systems were more numerous than those associated with type III restriction modification systems (0.84 vs. 0.17, p < 0.001); additionally, there was no correlation between genome size and the total number of DNA MTases, indicating that the number of DNA MTases is related to the particular evolution and lifestyle of specific species, regulating the expression of virulence genes in some pathogenic bacteria.
Kong, Wei; Mou, Xiaoyang; Di, Benteng; Deng, Jin; Zhong, Ruxing; Wang, Shuaiqun
2017-11-20
Dysregulated pathway identification is an important task that can provide insight into the underlying biological processes of disease. Current pathway-identification methods focus on sets of co-expressed genes and single pathways and ignore the correlation between genes and pathways. The method proposed in this study takes into account the internal correlations not only between genes but also between pathways to identify dysregulated pathways related to Alzheimer's disease (AD), the most common form of dementia. In order to find the significantly differential genes for AD, mutual information (MI) is used to measure interdependencies between genes rather than expression values. Then, by integrating the topology information from KEGG, the significant pathways involving the feature genes are identified. Next, the distance correlation (DC) is applied to measure the pairwise pathway crosstalk, since DC has the advantage of detecting nonlinear correlations when compared to Pearson correlation. Finally, the pathway pairs with significantly different correlations between normal and AD samples are identified as dysregulated pathways. The molecular biology analysis demonstrated that many dysregulated pathways related to AD pathogenesis were successfully discovered by the internal correlation detection. Furthermore, insight into the dysregulated pathways in the development and deterioration of AD will help to find new effective target genes and provide important theoretical guidance for drug design. Copyright© Bentham Science Publishers; For any queries, please email at epub@benthamscience.org.
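The claim that distance correlation detects nonlinear dependence that Pearson correlation misses can be checked on a toy example. The sketch below implements the standard sample distance correlation (Székely's definition with double-centered distance matrices) for one-dimensional samples in plain Python. It is not the authors' code, and the function names and toy data are assumptions made for illustration.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length samples."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def _centered_distances(values):
    """Pairwise |a - b| matrix, double-centered by row, column and grand means."""
    n = len(values)
    d = [[abs(a - b) for b in values] for a in values]
    row_means = [sum(row) / n for row in d]  # matrix is symmetric: also column means
    grand_mean = sum(row_means) / n
    return [[d[i][j] - row_means[i] - row_means[j] + grand_mean
             for j in range(n)] for i in range(n)]


def distance_correlation(x, y):
    """Sample distance correlation of two equal-length 1-D samples (0 to 1)."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    n = len(x)
    a, b = _centered_distances(x), _centered_distances(y)
    dcov2 = sum(a[i][j] * b[i][j] for i in range(n) for j in range(n)) / n ** 2
    dvar_x = sum(v * v for row in a for v in row) / n ** 2
    dvar_y = sum(v * v for row in b for v in row) / n ** 2
    denom = math.sqrt(dvar_x * dvar_y)
    if denom == 0:
        return 0.0  # at least one sample is constant
    return math.sqrt(max(dcov2, 0.0) / denom)


# Toy data: y depends on x through a symmetric parabola.
x = [-2, -1, 0, 1, 2]
y = [v * v for v in x]
print(pearson(x, y), distance_correlation(x, y))
```

On this parabola the Pearson coefficient is zero because the relationship is symmetric about the mean of x, while the distance correlation is clearly positive. This implementation needs O(n²) memory, so it suits small samples only.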
2012-01-01
Background Chinese fir (Cunninghamia lanceolata) is an important timber species that accounts for 20–30% of the total commercial timber production in China. However, the available genomic information for Chinese fir is limited, and this severely encumbers functional genomic analysis and molecular breeding in Chinese fir. Recently, major advances in transcriptome sequencing have provided fast and cost-effective approaches to generate large expression datasets that have proven to be powerful tools to profile the transcriptomes of non-model organisms with undetermined genomes. Results In this study, the transcriptomes of nine tissues from Chinese fir were analyzed using the Illumina HiSeq™ 2000 sequencing platform. Approximately 40 million paired-end reads were obtained, generating 3.62 gigabase pairs of sequencing data. These reads were assembled into 83,248 unique sequences (i.e. Unigenes) with an average length of 449 bp, amounting to 37.40 Mb. A total of 73,779 Unigenes were supported by more than 5 reads, of which 42,663 (57.83%) had homologs in the NCBI non-redundant and Swiss-Prot protein databases, corresponding to 27,224 unique protein entries. Of these Unigenes, 16,750 were assigned to Gene Ontology classes, and 14,877 were clustered into orthologous groups. A total of 21,689 (29.40%) were mapped to 119 pathways by BLAST comparison against the Kyoto Encyclopedia of Genes and Genomes (KEGG) database. The majority of the genes encoding the enzymes in the biosynthetic pathways of cellulose and lignin were identified in the Unigene dataset by targeted searches of their annotations. A number of candidate Chinese fir genes in the two metabolic pathways were discovered for the first time. Eighteen genes related to cellulose and lignin biosynthesis were cloned for experimental validation of the transcriptome data. Overall, 49 Unigenes, covering different regions of these selected genes, were found by alignment.
Their expression patterns in different tissues were analyzed by qRT-PCR to explore their putative functions. Conclusions A substantial fraction of transcript sequences was obtained from the deep sequencing of Chinese fir. The assembled Unigene dataset was used to discover candidate genes of cellulose and lignin biosynthesis. This transcriptome dataset will provide a comprehensive sequence resource for molecular genetics research of C. lanceolata. PMID:23171398
ReNE: A Cytoscape Plugin for Regulatory Network Enhancement
Politano, Gianfranco; Benso, Alfredo; Savino, Alessandro; Di Carlo, Stefano
2014-01-01
One of the biggest challenges in the study of biological regulatory mechanisms is the integration, modeling, and analysis of the complex interactions which take place in biological networks. Although post-transcriptional regulatory elements (i.e., miRNAs) are widely investigated in current research, their usage and visualization in biological networks are very limited. Regulatory networks are commonly limited to gene entities. To integrate networks with post-transcriptional regulatory data, researchers are therefore forced to manually resort to specific third party databases. In this context, we introduce ReNE, a Cytoscape 3.x plugin designed to automatically enrich a standard gene-based regulatory network with more detailed transcriptional, post-transcriptional, and translational data, resulting in an enhanced network that more precisely models the actual biological regulatory mechanisms. ReNE can automatically import a network layout from the Reactome or KEGG repositories, or work with custom pathways described using a standard OWL/XML data format that the Cytoscape import procedure accepts. Moreover, ReNE allows researchers to merge multiple pathways coming from different sources. The merged network structure is normalized to guarantee a consistent and uniform description of the network nodes and edges and to enrich all integrated data with additional annotations retrieved from genome-wide databases like NCBI, thus producing a pathway fully manageable through the Cytoscape environment. The normalized network is then analyzed to include missing transcription factors, miRNAs, and proteins. The resulting enhanced network is still a fully functional Cytoscape network where each regulatory element (transcription factor, miRNA, gene, protein) and regulatory mechanism (up-regulation/down-regulation) is clearly visually identifiable, thus enabling a better visual understanding of its role and effect in the network behavior.
The enhanced network produced by ReNE is exportable in multiple formats for further analysis via third party applications. ReNE can be freely installed from the Cytoscape App Store (http://apps.cytoscape.org/apps/rene) and the full source code is freely available for download through a SVN repository accessible at http://www.sysbio.polito.it/tools_svn/BioInformatics/Rene/releases/. ReNE enhances a network by only integrating data from public repositories, without any inference or prediction. The reliability of the introduced interactions depends only on the reliability of the source data, which is outside the control of the ReNE developers. PMID:25541727
2013-01-01
Background Olive cDNA libraries were constructed and analyzed to isolate candidate genes that can help elucidate the molecular mechanism of periodicity and/or fruit production. For this purpose, cDNA libraries from the leaves of trees in “on year” and in “off year” in July (when fruits start to appear) and in November (harvest time) were constructed. One hundred randomly selected positive clones from each library were analyzed with respect to sequence and size. A fruit-flesh cDNA library was also constructed and characterized to confirm the reliability of each library’s temporal and spatial properties. Results Quantitative real-time RT-PCR (qRT-PCR) analyses of the cDNA libraries confirmed cDNA molecules that are associated with different developmental stages (e.g. “on year” leaves in July, “off year” leaves in July, leaves in November) and fruits. Hence, a number of candidate cDNAs associated with “on year” and “off year” were isolated. Comparison of the detected cDNAs to the current EST database of GenBank along with other non-redundant databases of NCBI revealed homologs of previously described genes along with several unknown cDNAs. Of around 500 screened cDNAs, 48 cDNA elements were obtained after eliminating ribosomal RNA sequences. These independent transcripts were analyzed using BLAST searches (cutoff E-value of 1.0E-5) against the KEGG and GenBank nucleotide databases and 37 putative transcripts corresponding to known gene functions were annotated with gene names and Gene Ontology (GO) terms. Transcripts in the biological process category were found to be related to metabolic process (27%), cellular process (23%), response to stimulus (17%), localization process (8.5%), multicellular organismal process (6.25%), developmental process (6.25%) and reproduction (4.2%). Conclusions A putative P450 monooxygenase was expressed fivefold more in “on year” leaves than in “off year” leaves in July.
Two putative dehydrins were expressed significantly more in “on year” leaves than in “off year” leaves in November. Homologs of UDP-glucose epimerase, acyl-CoA binding protein, triose phosphate isomerase and a putative nuclear core anchor protein were significant in fruits only, while a homolog of an embryo binding protein/small GTPase regulator was detected in “on year” leaves only. One of the two unknown cDNAs was specific to leaves in July while the other was detected in all of the libraries except fruits. KEGG pathway analyses for the obtained sequences correlated with essential metabolic processes such as galactose metabolism, amino sugar and nucleotide sugar metabolism, and photosynthesis. Detailed analysis of the results presents candidate cDNAs that can be used to dissect further the genetic basis of fruit production and/or alternate bearing, which causes significant economic loss for olive growers. PMID:23552171
JBioWH: an open-source Java framework for bioinformatics data integration
Vera, Roberto; Perez-Riverol, Yasset; Perez, Sonia; Ligeti, Balázs; Kertész-Farkas, Attila; Pongor, Sándor
2013-01-01
The Java BioWareHouse (JBioWH) project is an open-source platform-independent programming framework that allows a user to build his/her own integrated database from the most popular data sources. JBioWH can be used for intensive querying of multiple data sources and the creation of streamlined task-specific data sets on local PCs. JBioWH is based on a MySQL relational database scheme and includes JAVA API parser functions for retrieving data from 20 public databases (e.g. NCBI, KEGG, etc.). It also includes a client desktop application for (non-programmer) users to query data. In addition, JBioWH can be tailored for use in specific circumstances, including the handling of massive queries for high-throughput analyses or CPU intensive calculations. The framework is provided with complete documentation and application examples and it can be downloaded from the Project Web site at http://code.google.com/p/jbiowh. A MySQL server is available for demonstration purposes at hydrax.icgeb.trieste.it:3307. Database URL: http://code.google.com/p/jbiowh PMID:23846595
Liu, Chuan-He; Fan, Chao
2016-01-01
A remarkable characteristic of pineapple is its ability to undergo floral induction in response to external ethylene stimulation. However, little information is available regarding the molecular mechanism underlying this process. In this study, the differentially expressed genes (DEGs) in plants exposed to 1.80 mL·L−1 (T1) or 2.40 mL·L−1 ethephon (T2) compared with Ct plants (control, cleaning water) were identified using RNA-seq and gene expression profiling. Illumina sequencing generated 65,825,224 high-quality reads that were assembled into 129,594 unigenes with an average sequence length of 1173 bp. Of these unigenes, 24,775 were assigned to specific KEGG pathways, of which metabolic pathways and biosynthesis of secondary metabolites were the most highly represented. Gene Ontology (GO) analysis of the annotated unigenes revealed that the majority were involved in metabolic and cellular processes, cell and cell part, catalytic activity and binding. Gene expression profiling analysis revealed 3788, 3062, and 758 DEGs in the comparisons of T1 with Ct, T2 with Ct, and T2 with T1, respectively. GO analysis indicated that these DEGs were predominantly annotated to metabolic and cellular processes, cell and cell part, catalytic activity, and binding. KEGG pathway analysis revealed the enrichment of several important pathways among the DEGs, including metabolic pathways, biosynthesis of secondary metabolites and plant hormone signal transduction. Thirteen DEGs were identified as candidate genes associated with the process of floral induction by ethephon, including three ERF-like genes, one ETR-like gene, one LTI-like gene, one FT-like gene, one VRN1-like gene, three FRI-like genes, one AP1-like gene, one CAL-like gene, and one AG-like gene. 
qPCR analysis indicated that the changes in the expression of these 13 candidate genes were consistent with the alterations in the corresponding RPKM values, confirming the accuracy and credibility of the RNA-seq and gene expression profiling results. Ethephon-mediated induction likely mimics the process of vernalization in the floral transition in pineapple by increasing LTI, FT, and VRN1 expression and promoting the up-regulation of floral meristem identity genes involved in flower development. The candidate genes screened can be used in investigations of the molecular mechanisms of the flowering pathway and of various other biological mechanisms in pineapple. PMID:26955375
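The RPKM values mentioned in this abstract normalize raw read counts by transcript length and sequencing depth. The sketch below assumes the standard definition (reads per kilobase of transcript per million mapped reads) and a simple log2 fold change with a pseudocount for comparing a treatment with the control. The function names, the pseudocount, and all numbers are illustrative assumptions and are not values from the study.

```python
import math


def rpkm(read_count, gene_length_bp, total_mapped_reads):
    """Reads per kilobase of transcript per million mapped reads."""
    if gene_length_bp <= 0 or total_mapped_reads <= 0:
        raise ValueError("gene length and library size must be positive")
    return read_count * 1e9 / (gene_length_bp * total_mapped_reads)


def log2_fold_change(treated, control, pseudocount=1.0):
    """log2 ratio of two expression values; the pseudocount avoids division by zero."""
    return math.log2((treated + pseudocount) / (control + pseudocount))


# Invented example: 500 reads on a 2,000 bp unigene in a 10-million-read library.
example = rpkm(read_count=500, gene_length_bp=2000, total_mapped_reads=10_000_000)
print(example)
print(log2_fold_change(treated=7.0, control=1.0))
```

A positive log2 fold change indicates higher expression in the treated sample, so its sign can be compared directly with the direction of change measured by qPCR for the same gene.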
2014-01-01
Background Syntrichia caninervis is a desiccation-tolerant moss and the dominant bryophyte of the Biological Soil Crusts (BSCs) found in the Mojave and Gurbantunggut deserts. Next generation high throughput sequencing technologies offer an efficient and economic choice for characterizing non-model organism transcriptomes with little or no prior molecular information available. Results In this study, we employed next generation, high-throughput, Illumina RNA-Seq to analyze the poly-(A) + mRNA from hydrated, dehydrating and desiccated S. caninervis gametophores. Approximately 58.0 million paired-end short reads were obtained and 92,240 unigenes were assembled with an average size of 493 bp, N50 value of 662 bp and a total size of 45.48 Mbp. Sequence similarity searches against five public databases (NR, Swiss-Prot, COSMOSS, KEGG and COG) found 54,125 unigenes (58.7%) with significant similarity to an existing sequence (E-value ≤ 1e-5) and could be annotated. Gene Ontology (GO) annotation assigned 24,183 unigenes to the three GO terms: Biological Process, Cellular Component or Molecular Function. GO comparison between P. patens and S. caninervis demonstrated similar sequence enrichment across all three GO categories. 29,370 deduced polypeptide sequences were assigned Pfam domain information and categorized into 4,212 Pfam domains/families. Using the PlantTFDB, 778 unigenes were predicted to be involved in the regulation of transcription and were classified into 49 transcription factor families. Annotated unigenes were mapped to the KEGG pathways and further annotated using MapMan. Comparative genomics revealed that 44% of protein families are shared in common by S. caninervis, P. patens and Arabidopsis thaliana and that 80% are shared by both moss species. Conclusions This study is one of the first comprehensive transcriptome analyses of the moss S. caninervis. 
Our data extend our knowledge of bryophyte transcriptomes, provide insight into plants adapted to the arid regions of central Asia, and continue the development of S. caninervis as a model for understanding the molecular aspects of desiccation tolerance. PMID:25086984
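The N50 value reported for the assembly above is a standard contiguity statistic: the length of the shortest sequence in the smallest set of longest sequences that together cover at least half of the total assembled length. The sketch below shows how it is usually computed; the function name and the toy lengths are hypothetical.

```python
def n50(lengths):
    """Return the N50 of a collection of sequence lengths.

    Sequences are visited from longest to shortest; the N50 is the length
    at which the running total first reaches half of the overall total.
    """
    if not lengths:
        raise ValueError("at least one sequence length is required")
    total = sum(lengths)
    running = 0
    for length in sorted(lengths, reverse=True):
        running += length
        if 2 * running >= total:
            return length


# Toy example, not data from the study.
toy_lengths = [2, 3, 4, 5, 6, 7, 8, 9, 10]
toy_n50 = n50(toy_lengths)
```

Because N50 weights long sequences more heavily, it can exceed the mean length, as it does in the abstract above (662 bp versus 493 bp).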
Yang, Wei; Yang, Chunping; Zhang, Jin; Yang, Yang; Wang, Baoxin; Guan, Fengrong
2018-01-01
The white-striped longhorn beetle Batocera horsfieldi (Coleoptera: Cerambycidae) is a polyphagous wood-boring pest that causes substantial damage to the lumber industry. Moreover, olfactory proteins are crucial components that function in related processes, but the B. horsfieldi genome is not readily available for the analysis of olfactory proteins. In the present study, developmental transcriptomes of larvae from the first instar to the prepupal stage, pupae, and adults (females and males) from emergence to mating were built by RNA sequencing to establish a genetic background that may help in understanding olfactory genes. Approximately 199 million clean reads were obtained and assembled into 171,664 transcripts, which were classified into 23,380, 26,511, 22,393, 30,270, and 87,732 unigenes for larvae, pupae, females, males, and combined datasets, respectively. The unigenes were annotated against NCBI’s non-redundant nucleotide and protein sequences, Swiss-Prot, Gene Ontology (GO), Pfam, Clusters of Eukaryotic Orthologous Groups (KOG), and KEGG Orthology (KO) databases. A total of 43,197 unigenes were annotated into 55 sub-categories under the three main GO categories; 25,237 unigenes were classified into 26 functional KOG categories, and 25,814 unigenes were classified into five functional KEGG Pathway categories. RSEM software identified 2,983, 3,097, 870, 2,437, 5,161, and 2,882 genes that were differentially expressed between larvae and males, larvae and pupae, larvae and females, males and females, males and pupae, and females and pupae, respectively. Among them, genes encoding seven candidate odorant binding proteins (OBPs) and three chemosensory proteins (CSPs) were identified. RT-PCR and RT-qPCR analyses showed that BhorOBP3, BhorCSP2, and BhorOBPC1/C3/C4 were highly expressed in the antenna of males, indicating that these genes may play key roles in foraging and host-orientation in B. horsfieldi. 
Our results provide valuable molecular information about the olfactory system in B. horsfieldi and will help guide future functional studies on olfactory genes. PMID:29474419
2015-07-24
remodeling & satellite cell activation. Fig. S8. a) Enriched KEGG pathways from differentially expressed genes for the late time points. The size of...Socs3, IL-1rn, IL-4rα, IL-10rα, IL-13rα1, FDR=4.31e-10 - GO:0050728, negative regulation of inflammatory response Invading immune cell genes: Cd68...Inflammatory States Several Days After Injury Innate immunity and microbial recognition: Tlr1, Tlr7, Tlr8, FDR=0.003 - GO:0034121, regulation of
Ye, Yaqiong; Lin, Shumao; Mu, Heping; Tang, Xiaohong; Ou, Yangdan; Chen, Jian; Ma, Yongjiang; Li, Yugu
2014-01-01
Intramuscular fat (IMF) plays an important role in meat quality. However, the molecular mechanisms underlying IMF deposition in skeletal muscle have not been addressed for the sex-linked dwarf (SLD) chicken. In this study, potential candidate genes and signaling pathways related to IMF deposition in chicken leg muscle tissue were characterized using gene expression profiling of both 7-week-old SLD and normal chickens. A total of 173 differentially expressed genes (DEGs) were identified between the two breeds. Subsequently, 6 DEGs related to lipid metabolism or muscle development were verified in each breed based on gene ontology (GO) analysis. In addition, KEGG pathway analysis of DEGs indicated that some of them (GHR, SOCS3, and IGF2BP3) participate in adipocytokine and insulin signaling pathways. To investigate the role of these signaling pathways in IMF deposition, the expression of pathway factors and other downstream genes was measured using qRT-PCR and Western blot analyses. Collectively, the results identified potential candidate genes related to IMF deposition and suggested that IMF deposition in skeletal muscle of SLD chicken is regulated partially by the adipocytokine and insulin pathways and other downstream signaling pathways (TGF-β/SMAD3 and Wnt/catenin-β pathway). PMID:24757673
ProteinWorldDB: querying radical pairwise alignments among protein sets from complete genomes.
Otto, Thomas Dan; Catanho, Marcos; Tristão, Cristian; Bezerra, Márcia; Fernandes, Renan Mathias; Elias, Guilherme Steinberger; Scaglia, Alexandre Capeletto; Bovermann, Bill; Berstis, Viktors; Lifschitz, Sergio; de Miranda, Antonio Basílio; Degrave, Wim
2010-03-01
Many analyses in modern biological research are based on comparisons between biological sequences, resulting in functional, evolutionary and structural inferences. When large numbers of sequences are compared, heuristics are often used resulting in a certain lack of accuracy. In order to improve and validate results of such comparisons, we have performed radical all-against-all comparisons of 4 million protein sequences belonging to the RefSeq database, using an implementation of the Smith-Waterman algorithm. This extremely intensive computational approach was made possible with the help of World Community Grid, through the Genome Comparison Project. The resulting database, ProteinWorldDB, which contains coordinates of pairwise protein alignments and their respective scores, is now made available. Users can download, compare and analyze the results, filtered by genomes, protein functions or clusters. ProteinWorldDB is integrated with annotations derived from Swiss-Prot, Pfam, KEGG, NCBI Taxonomy database and gene ontology. The database is a unique and valuable asset, representing a major effort to create a reliable and consistent dataset of cross-comparisons of the whole protein content encoded in hundreds of completely sequenced genomes using a rigorous dynamic programming approach. The database can be accessed through http://proteinworlddb.org
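The Smith-Waterman algorithm named above is an exact dynamic-programming method for local alignment, in contrast to heuristic searches. The sketch below computes only the optimal local alignment score with a simple match/mismatch scheme and a linear gap penalty. These scoring parameters are assumptions made for illustration; the abstract does not state the scoring scheme used for ProteinWorldDB, and protein comparisons normally rely on a substitution matrix with affine gap penalties.

```python
def smith_waterman_score(a, b, match=2, mismatch=-1, gap=-2):
    """Best local alignment score between sequences a and b.

    Simplified scoring (an assumption for illustration): fixed match and
    mismatch scores and a linear gap penalty. Only two rows of the
    dynamic-programming matrix are kept in memory.
    """
    prev = [0] * (len(b) + 1)
    best = 0
    for i in range(1, len(a) + 1):
        curr = [0] * (len(b) + 1)
        for j in range(1, len(b) + 1):
            diag = prev[j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            # A cell never drops below zero: this is what makes the alignment local.
            curr[j] = max(0, diag, prev[j] + gap, curr[j - 1] + gap)
            best = max(best, curr[j])
        prev = curr
    return best


# Toy sequences, not data from the database.
toy_score = smith_waterman_score("TTACGTT", "GGACGGG")
```

The running time grows with the product of the two sequence lengths, which helps explain why an all-against-all comparison of millions of proteins called for a distributed computing grid.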
Analysis of the Genome and Chromium Metabolism-Related Genes of Serratia sp. S2.
Dong, Lanlan; Zhou, Simin; He, Yuan; Jia, Yan; Bai, Qunhua; Deng, Peng; Gao, Jieying; Li, Yingli; Xiao, Hong
2018-05-01
This study investigated the genome sequence of Serratia sp. S2. The genomic DNA of Serratia sp. S2 was extracted and the sequencing library was constructed. Sequencing was carried out by Illumina 2000 and complete genomic sequences were obtained. Gene function annotation and bioinformatics analysis were performed by comparison with known databases. The genome size of Serratia sp. S2 was 5,604,115 bp and the G+C content was 57.61%. There were 5373 protein-coding genes, of which 3732, 3614, and 3942 were annotated in the GO, KEGG, and COG databases, respectively. There were 12 genes related to chromium metabolism in the Serratia sp. S2 genome. The whole genome sequence of Serratia sp. S2 has been submitted to the GenBank database under accession number LNRP00000000. Our findings may provide a theoretical basis for the subsequent development of new biotechnology to remediate environmental chromium pollution.
Forth, Thomas; McConkey, Glenn A; Westhead, David R
2010-09-15
An application has been developed to help with the creation and editing of Systems Biology Markup Language (SBML) format metabolic networks up to the organism scale. Networks are defined as a collection of Kyoto Encyclopedia of Genes and Genomes (KEGG) LIGAND reactions with an optional associated Enzyme Classification (EC) number for each reaction. Additional custom reactions can be defined by the user. Reactions within the network can be assigned flux constraints and compartmentalization is supported for each reaction in addition to the support for reactions that occur across compartment boundaries. Exported networks are fully SBML L2V4 compatible with an optional L2V1 export for compatibility with old versions of the COBRA toolbox. The software runs in the free Microsoft Access 2007 Runtime (Microsoft Inc.), which is included with the installer and works on Windows XP SP2 or better. Full source code is viewable in the full version of Access 2007 or 2010. Users must have a license to use the KEGG LIGAND database (free academic licensing is available). Please go to www.bioinformatics.leeds.ac.uk/~pytf/metnetmaker for software download, help and tutorials.
GarlicESTdb: an online database and mining tool for garlic EST sequences.
Kim, Dae-Won; Jung, Tae-Sung; Nam, Seong-Hyeuk; Kwon, Hyuk-Ryul; Kim, Aeri; Chae, Sung-Hwa; Choi, Sang-Haeng; Kim, Dong-Wook; Kim, Ryong Nam; Park, Hong-Seog
2009-05-18
Allium sativum, commonly known as garlic, is a species in the onion genus (Allium), a large and diverse genus containing over 1,250 species. Its close relatives include chives, onion, leek and shallot. Garlic has been used throughout recorded history for culinary and medicinal purposes and for its health benefits. Currently, interest in garlic is increasing rapidly because of its nutritional and pharmaceutical value in conditions including high blood pressure and cholesterol, atherosclerosis and cancer. Nevertheless, no comprehensive database of garlic Expressed Sequence Tags (ESTs) is available for gene discovery and future genome annotation efforts. We therefore developed a new garlic database and applications to enable comprehensive analysis of garlic gene expression. GarlicESTdb is an integrated database and mining tool for large-scale garlic (Allium sativum) EST sequencing. A total of 21,595 ESTs collected from an in-house cDNA library were used to construct the database. The analysis pipeline is an automated system written in JAVA and consists of the following components: automatic preprocessing of EST reads, assembly of raw sequences, annotation of the assembled sequences, storage of the analyzed information into MySQL databases, and graphic display of all processed data. A web application was implemented with the latest J2EE (Java 2 Platform Enterprise Edition) software technology (JSP/EJB/JavaServlet) for browsing and querying the database, for creating dynamic web pages on the client side, and for mapping annotated enzymes to KEGG pathways; the AJAX framework was also used in part. The online resources, such as putative annotation, single nucleotide polymorphisms (SNP) and tandem repeat data sets, can be searched by text, explored on the website, searched using BLAST, and downloaded. To archive more significant BLAST results, a curation system was introduced with which biologists can easily edit best-hit annotation information for others to view. 
The GarlicESTdb web application is freely available at http://garlicdb.kribb.re.kr. GarlicESTdb is the first incorporated online information database of EST sequences isolated from garlic that can be freely accessed and downloaded. It has many useful features for interactive mining of EST contigs and datasets from each library, including curation of annotated information, expression profiling, information retrieval, and summary of statistics of functional annotation. Consequently, the development of GarlicESTdb will provide a crucial contribution to biologists for data-mining and more efficient experimental studies.
Tang, Cheng; Lan, Daoliang; Zhang, Huanrong; Ma, Jing; Yue, Hua
2013-01-01
Duck is an economically important poultry and animal model for human viral hepatitis B. However, the molecular mechanisms underlying host-virus interaction remain unclear because of limited information on the duck genome. This study aims to characterize the duck normal liver transcriptome and to identify the differentially expressed transcripts at 24 h after duck hepatitis A virus genotype C (DHAV-C) infection using Illumina-Solexa sequencing. After removal of low-quality sequences and assembly, a total of 52,757 unigenes was obtained from the normal liver group. Further blast analysis showed that 18,918 unigenes successfully matched the known genes in the database. GO analysis revealed that 25,116 unigenes took part in 61 categories of biological processes, cellular components, and molecular functions. Among the 25 clusters of orthologous group categories (COG), the cluster for "General function prediction only" represented the largest group, followed by "Transcription" and "Replication, recombination, and repair." KEGG analysis showed that 17,628 unigenes were involved in 301 pathways. Through comparison of normal and infected transcriptome data, we identified 20 significantly differentially expressed unigenes, which were further confirmed by real-time polymerase chain reaction. Of the 20 unigenes, nine matched the known genes in the database, including three up-regulated genes (virus replicase polyprotein, LRRC3B, and PCK1) and six down-regulated genes (CRP, AICL-like 2, L1CAM, CYB26A1, CHAC1, and ADAM32). The remaining 11 novel unigenes that did not match any known genes in the database may provide a basis for the discovery of new transcripts associated with infection. This study provided a gene expression pattern for normal duck liver and for the previously unrecognized changes in gene transcription that are altered during DHAV-C infection. 
Our data revealed useful information for future studies on the duck genome and provided new insights into the molecular mechanism of host-DHAV-C interaction.
Yan, Hai-Biao; Huang, Jia-Cheng; Chen, You-Rong; Yao, Jian-Ni; Cen, Wei-Ning; Li, Jia-Yi; Jiang, Yi-Fan; Chen, Gang; Li, Sheng-Hua
2018-02-01
This study aimed to investigate the clinical value and potential molecular mechanisms of miR-1 in clear cell renal cell carcinoma (ccRCC). We searched the Gene Expression Omnibus (GEO), ArrayExpress, several online publication databases and the Cancer Genome Atlas (TCGA). Continuous variable meta-analysis and diagnostic meta-analysis were conducted, both in Stata 14, to show the expression of miR-1 in ccRCC. Furthermore, we acquired the potential targets of miR-1 from datasets in which miR-1 was transfected into ccRCC cells, online prediction databases, differentially expressed genes from TCGA and the literature. Subsequently, bioinformatics analysis based on the selected target genes was conducted. The combined effect was -0.92 with a 95% confidence interval (CI) of -1.08 to -0.77 based on a fixed effect model (I² = 81.3%, P < 0.001). No publication bias was found in our investigation. Sensitivity analysis showed that GSE47582 and 2 TCGA studies might cause heterogeneity. After eliminating them, the combined effect was -0.47 (95%CI: -0.78, -0.16) with I² = 18.3%. As for the diagnostic meta-analysis, the combined sensitivity and specificity were 0.90 (95%CI: 0.61, 0.98) and 0.63 (95%CI: 0.39, 0.82). The area under the curve (AUC) in the summarized receiver operating characteristic (SROC) curve was 0.83 (95%CI: 0.80, 0.86). No publication bias was found (P = 0.15). We finally obtained 67 genes, which were defined as the promising target genes of miR-1 in ccRCC. The three most significant KEGG pathways based on these genes were Complement and coagulation cascades, ECM-receptor interaction and Focal adhesion. The downregulation of miR-1 might play an important role in ccRCC through its target genes. Copyright © 2017 Elsevier GmbH. All rights reserved.
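The pooled effect, confidence interval and I² heterogeneity statistic reported above are standard outputs of a fixed effect (inverse-variance) meta-analysis. The sketch below shows the textbook calculation, assuming each study contributes an effect size and its standard error. It is not the Stata procedure used by the authors, and the function name and toy inputs are hypothetical.

```python
import math


def fixed_effect_meta(effects, std_errors, z=1.96):
    """Inverse-variance fixed effect pooling with Cochran's Q and I^2 (in %)."""
    if len(effects) != len(std_errors) or len(effects) < 2:
        raise ValueError("need at least two studies with matching inputs")
    weights = [1.0 / se ** 2 for se in std_errors]
    total_w = sum(weights)
    pooled = sum(w * y for w, y in zip(weights, effects)) / total_w
    pooled_se = math.sqrt(1.0 / total_w)
    # Cochran's Q: weighted squared deviations from the pooled effect.
    q = sum(w * (y - pooled) ** 2 for w, y in zip(weights, effects))
    df = len(effects) - 1
    # I^2: share of total variation attributed to between-study heterogeneity.
    i2 = max(0.0, (q - df) / q) * 100.0 if q > 0 else 0.0
    return {
        "pooled": pooled,
        "ci": (pooled - z * pooled_se, pooled + z * pooled_se),
        "Q": q,
        "I2": i2,
    }


# Toy inputs, not the studies analysed in the abstract.
toy_result = fixed_effect_meta([0.0, 2.0], [0.5, 0.5])
```

A high I², such as the 81.3% reported before the sensitivity analysis, indicates that most of the observed variation reflects differences between studies rather than sampling error. This is why removing the heterogeneous datasets changed the pooled estimate.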
Bikel, Shirley; Jacobo-Albavera, Leonor; Sánchez-Muñoz, Fausto; Cornejo-Granados, Fernanda; Canizales-Quinteros, Samuel; Soberón, Xavier; Sotelo-Mundo, Rogerio R.; del Río-Navarro, Blanca E.; Mendoza-Vargas, Alfredo; Sánchez, Filiberto
2017-01-01
Background In spite of the emergence of RNA sequencing (RNA-seq), microarrays remain in widespread use for gene expression analysis in the clinic. There are over 767,000 RNA microarrays from human samples in public repositories, which are an invaluable resource for biomedical research and personalized medicine. Absolute gene expression analysis allows transcriptome profiling of all expressed genes under a specific biological condition without the need for a reference sample. However, background fluorescence makes it challenging to determine absolute gene expression in microarrays. Given that the Y chromosome is absent in female subjects, we used it as a new approach for absolute gene expression analysis in which the fluorescence of the Y chromosome genes of female subjects was used as the background fluorescence for all the probes in the microarray. This fluorescence was used to establish an absolute gene expression threshold, allowing the differentiation between expressed and non-expressed genes in microarrays. Methods We extracted RNA from leukocyte samples of 16 children (nine males and seven females, ages 6–10 years). An Affymetrix Gene Chip Human Gene 1.0 ST Array was run for each sample and the fluorescence of 124 genes of the Y chromosome was used to calculate the absolute gene expression threshold. After that, several expressed and non-expressed genes according to our absolute gene expression threshold were compared against the expression obtained using real-time quantitative polymerase chain reaction (RT-qPCR). Results From the 124 genes of the Y chromosome, three genes (DDX3Y, TXLNG2P and EIF1AY) that displayed significant differences between sexes were used to calculate the absolute gene expression threshold. Using this threshold, we selected 13 expressed and non-expressed genes and confirmed their expression level by RT-qPCR. 
Then, we selected the top 5% most expressed genes and found that several KEGG pathways were significantly enriched. Interestingly, these pathways were related to the typical functions of leukocytes, such as antigen processing and presentation and natural killer cell mediated cytotoxicity. We also applied this method to obtain the absolute gene expression threshold in already published microarray data of liver cells, where the top 5% expressed genes showed an enrichment of typical KEGG pathways for liver cells. Our results suggest that the three selected genes of the Y chromosome can be used to calculate an absolute gene expression threshold, allowing transcriptome profiling of microarray data without the need for an additional reference experiment. Discussion Our approach based on the establishment of a threshold for absolute gene expression analysis will allow a new way to analyze thousands of microarrays from public databases. This allows the study of different human diseases without the need for additional samples for relative expression experiments. PMID:29230367
Gandhi, Deepa; Sivanesan, Saravanadevi; Kannan, Krishnamurthi
2018-06-01
Manganese (Mn) is an essential trace element required for many physiological functions including proper biochemical and cellular functioning of the central nervous system (CNS). However, exposure to excess levels of Mn in occupational settings or from environmental sources has been associated with neurotoxicity. The cellular and molecular mechanism of Mn-induced neurotoxicity remains unclear. In the current study, we investigated the effects of 30-day exposure to a sub-lethal concentration of Mn (100 μM) in human neuroblastoma cells (SH-SY5Y) using a transcriptomic approach. Microarray analysis revealed differential expression of 1057 transcripts in Mn-exposed SH-SY5Y cells as compared to control cells. Gene functional annotation cluster analysis showed that the differentially expressed genes were associated with several biological pathways. Specifically, genes involved in neuronal pathways including neuron differentiation and development, regulation of neurogenesis, synaptic transmission, and neuronal cell death (apoptosis) were found to be significantly altered. KEGG pathway analysis showed upregulation of the p53 signaling and neuroactive ligand-receptor interaction pathways, and downregulation of the neurotrophin signaling pathway. On the basis of the gene expression profile, possible molecular mechanisms underlying Mn-induced neuronal toxicity were predicted.
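KEGG pathway analyses of differentially expressed genes, such as the one above, are commonly based on an over-representation test. The abstract does not say which test was used, so the sketch below assumes the widely used one-sided hypergeometric test. The function name and all numbers are hypothetical.

```python
from math import comb


def enrichment_p_value(population, in_pathway, selected, overlap):
    """One-sided hypergeometric p-value for pathway over-representation.

    population : number of genes in the background set
    in_pathway : background genes annotated to the pathway
    selected   : number of differentially expressed genes
    overlap    : differentially expressed genes annotated to the pathway

    Returns P(X >= overlap) when `selected` genes are drawn at random
    without replacement from the background.
    """
    if not (0 <= overlap <= min(in_pathway, selected) <= population):
        raise ValueError("inconsistent gene counts")
    upper = min(in_pathway, selected)
    favourable = sum(
        comb(in_pathway, i) * comb(population - in_pathway, selected - i)
        for i in range(overlap, upper + 1)
    )
    return favourable / comb(population, selected)


# Hypothetical counts, not taken from the study.
toy_p = enrichment_p_value(population=20000, in_pathway=70, selected=1000, overlap=12)
```

In practice the test is repeated for every pathway, so the resulting p-values are usually corrected for multiple testing before pathways are reported as enriched.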
Genome-wide analysis of long non-coding RNAs and their role in postnatal porcine testis development.
Weng, Bo; Ran, Maoliang; Chen, Bin; He, Changqing; Dong, Lianhua; Peng, Fuzhi
2017-10-01
A comprehensive and systematic understanding of the roles of lncRNAs in the postnatal development of the pig testis has still not been achieved. In the present study, we obtained more than one billion clean reads and identified 15,528 lncRNA transcripts; these transcripts included 5032 known and 10,496 novel porcine lncRNA transcripts and corresponded to 10,041 lncRNA genes. Pairwise comparisons identified 449 known and 324 novel lncRNAs that showed differential expression patterns. GO and KEGG pathway enrichment analyses revealed that the targeted genes were involved in metabolic pathways regulating testis development and spermatogenesis, such as the TGF-beta pathway, the PI3K-Akt pathway, the Wnt/β-catenin pathway, and the AMPK pathway. Using this information, we predicted some lncRNA and coding gene pairs that may function in testis development and spermatogenesis; these are listed in detail. This study has provided the most comprehensive catalog to date of lncRNAs in the postnatal pig testis and will aid our understanding of their functional roles in testis development and spermatogenesis. Copyright © 2017. Published by Elsevier Inc.
Schachtschneider, Kyle Michael; Liu, Xiaolin; Huang, Wei; Xie, Ming; Hou, Shuisheng
2014-01-01
Lean-type Pekin duck is a commercial breed that has been obtained through long-term selection. Investigation of the differentially expressed genes in breast muscle and skin fat at different developmental stages will contribute to a comprehensive understanding of the potential mechanisms underlying the lean-type Pekin duck phenotype. In the present study, RNA-seq was performed on breast muscle and skin fat at 2-, 4- and 6-weeks of age. More than 89% of the annotated duck genes were covered by our RNA-seq dataset. Thousands of differentially expressed genes, including many important genes involved in the regulation of muscle development and fat deposition, were detected through comparison of the expression levels in the muscle and skin fat of the same time point, or the same tissue at different time points. KEGG pathway analysis showed that the differentially expressed genes clustered significantly in many muscle development and fat deposition related pathways such as MAPK signaling pathway, PPAR signaling pathway, Calcium signaling pathway, Fat digestion and absorption, and TGF-beta signaling pathway. The results presented here could provide a basis for further investigation of the mechanisms involved in muscle development and fat deposition in Pekin duck. PMID:25264787
Zhang, Yongqiang; Pei, Xinwu; Zhang, Chao; Lu, Zifeng; Wang, Zhixing; Jia, Shirong; Li, Weimin
2012-01-01
Background The hypersensitive response (HR) system of Chenopodium spp. confers broad-spectrum virus resistance. However, little knowledge exists at the genomic level for Chenopodium, thus impeding the advanced molecular research of this attractive feature. Hence, we took advantage of RNA-seq to survey the foliar transcriptome of C. amaranticolor, a Chenopodium species widely used as a laboratory indicator for pathogenic viruses, in order to facilitate the characterization of the HR-type of virus resistance. Methodology and Principal Findings Using the Illumina HiSeq™ 2000 platform, we obtained 39,868,984 reads with 3,588,208,560 bp, which were assembled into 112,452 unigenes (3,847 clusters and 108,605 singletons). BlastX search against the NCBI NR database identified 61,698 sequences with a cut-off E-value above 10^-5. Assembled sequences were annotated with gene descriptions, GO, COG and KEGG terms, respectively. A total of 738 resistance gene analogs (RGAs) and homologous sequences of 6 key signaling proteins within the R proteins-directed signaling pathway were identified. Based on this transcriptome data, we investigated the gene expression profiles over the stage of HR induced by Tobacco mosaic virus and Cucumber mosaic virus by using digital gene expression analysis. Numerous candidate genes specifically or commonly regulated by these two distinct viruses at early and late stages of the HR were identified, and the dynamic changes of the differentially expressed genes enriched in the pathway of plant-pathogen interaction were particularly emphasized. Conclusions To our knowledge, this study is the first description of the genetic makeup of C. amaranticolor, providing deep insight into the comprehensive gene expression information at transcriptional level in this species. 
The 738 RGAs as well as the differentially regulated genes, particularly the common genes regulated by both TMV and CMV, are suitable candidates which merit further functional characterization to dissect the molecular mechanisms and regulatory pathways of the HR-type of virus resistance in Chenopodium. PMID:23029338
Prediction of Oncogenic Interactions and Cancer-Related Signaling Networks Based on Network Topology
Acencio, Marcio Luis; Bovolenta, Luiz Augusto; Camilo, Esther; Lemke, Ney
2013-01-01
Cancer has been increasingly recognized as a systems biology disease since many investigators have demonstrated that this malignant phenotype emerges from abnormal protein-protein, regulatory and metabolic interactions induced by simultaneous structural and regulatory changes in multiple genes and pathways. Therefore, the identification of oncogenic interactions and cancer-related signaling networks is crucial for better understanding cancer. As experimental techniques for determining such interactions and signaling networks are labor-intensive and time-consuming, the development of a computational approach capable of accomplishing this task would be of great value. For this purpose, we present here a novel computational approach based on network topology and machine learning capable of predicting oncogenic interactions and extracting relevant cancer-related signaling subnetworks from an integrated network of human gene interactions (INHGI). This approach, called graph2sig, is twofold: first, it assigns oncogenic scores to all interactions in the INHGI and then these oncogenic scores are used as edge weights to extract oncogenic signaling subnetworks from INHGI. Regarding the prediction of oncogenic interactions, we showed that graph2sig is able to recover 89% of known oncogenic interactions with a precision of 77%. Moreover, the interactions that received high oncogenic scores are enriched in genes for which mutations have been causally implicated in cancer. We also demonstrated that graph2sig is potentially useful in extracting oncogenic signaling subnetworks: more than 80% of constructed subnetworks contain more than 50% of original interactions in their corresponding oncogenic linear pathways present in the KEGG PATHWAY database. In addition, the potential oncogenic signaling subnetworks discovered by graph2sig are supported by experimental evidence. 
Taken together, these results suggest that graph2sig can be a useful tool for cancer researchers interested in detecting the signaling networks most likely to contribute to the emergence of the malignant phenotype. PMID:24204854
NASA Astrophysics Data System (ADS)
Wun, S. R.; Huang, T. Y.; Hsu, B. M.; Fan, C. W.
2017-12-01
We aimed to study the effects of physical factors on the relative abundance of bacteria and on their preferred autotrophic CO2 fixation pathways after long-term environmental influence. The Narrow-Sky, located in the upper part of Takangshan, is a small gulch in a Pleistocene coralline limestone formation in southern Taiwan. Physical parameters such as illumination, humidity, and temperature varied widely among the habitats around the gulch, namely the limestone wall at the opening of the gulch, the coordinate ground soil, the wall inside the gulch, and the water dripping from the limestone wall. Total organic carbon was measured in solid samples to evaluate the biomass of the habitats. A metagenomic approach was used to reveal their microbial community structure. After the metagenomic library of operational taxonomic units (OTUs) was constructed, a BLAST search by "nomenclature of bacteria" instead of sequences between the OTU libraries and the KEGG database was carried out to generate libraries of "model microbial communities", for which the complete genomes of the entire bacterial populations were available. Our results showed that the biomass of habitats at the opening of the gulch was twice that of the inside, suggesting that illumination played an important role in biosynthesis. In a quantitative comparison of key enzymes of CO2 fixation pathways across model communities, 70% to 90% of bacteria possessed key enzymes of the Fuchs-Holo cycle, while only 5% to 20% of bacteria contained key enzymes of the Calvin-Benson cycle. The key enzymes for the hydroxypropionate/hydroxybutyrate and dicarboxylate/4-hydroxybutyrate cycles were not found in this study. In the water sample, approximately 10% of bacteria possessed the key enzyme for the Arnon-Buchanan cycle. Less than 2% of bacteria in all habitats use the reductive acetyl-CoA cycle for CO2 fixation. This study provides a novel method to study the biosynthetic processes of microbial communities in natural habitats.
De novo transcriptomic analysis during Lentinula edodes fruiting body growth.
Wang, Yingzhu; Zeng, Xianlu; Liu, Wenguang
2018-01-30
The fruiting body of Lentinula edodes is a popular edible mushroom, and extracts from the mycelium and the fruiting body of this species have diverse therapeutic potential. To gain insights into the molecular mechanisms underlying the fruiting body growth of L. edodes from the early bud stage (EBS), through the intermediate developing stage (IDS), to the fully developed stage (FDS), we performed de novo transcriptomic analysis using high-throughput Illumina RNA-sequencing. First, we generated three cDNA libraries representative of the three respective stages. We then obtained 38,933,148, 44,594,472, and 37,905,646 high-quality reads from the respective libraries and assembled the reads into 25,104 transcriptional contigs, containing 15,199 unigenes. We found that only 9331 of the unigenes had been annotated in the NCBI non-redundant protein database, and we functionally annotated 4758 of them through Gene Ontology (GO) analysis and 2921 of them through Clusters of Orthologous Groups of proteins (COGs) analysis. We also assigned 3995 unigenes to metabolic pathways by using the Kyoto Encyclopedia of Genes and Genomes (KEGG). We further identified 399 differentially expressed genes (DEGs) between EBS and IDS, 1428 between IDS and FDS, and 1830 between EBS and FDS, uncovering 769 DEGs in multiple metabolic and signaling pathways. Interestingly, there were a limited number of DEGs whose expression was dramatically associated with FDS. Finally, genes, whose expression was either highly up-regulated in FDS or remained at a high level during fruiting body growth, were annotated specifically in the pathways of purine metabolism, unsaturated fatty acid metabolism and meiosis, suggesting that these key molecular events were actively occurring in the fruiting body. Our work is the first high-throughput transcriptome study on the growth of L. edodes fruiting bodies, and the results uncovered candidate genes for future gene identification and utilization of this commercially and medically important mushroom.
Yang, Mei; Zhu, Lingping; Li, Ling; Li, Juanjuan; Xu, Liming; Feng, Ji; Liu, Yanling
2017-01-01
The predominant alkaloids in lotus leaves are aporphine alkaloids. These are the most important active components and have many pharmacological properties, but little is known about their biosynthesis. We used digital gene expression (DGE) technology to identify differentially-expressed genes (DEGs) between two lotus cultivars with different alkaloid contents at four leaf development stages. We also predicted potential genes involved in aporphine alkaloid biosynthesis by weighted gene co-expression network analysis (WGCNA). Approximately 335 billion nucleotides were generated; and 94% of which were aligned against the reference genome. Of 22 thousand expressed genes, 19,000 were differentially expressed between the two cultivars at the four stages. Gene Ontology (GO) enrichment analysis revealed that catalytic activity and oxidoreductase activity were enriched significantly in most pairwise comparisons. In Kyoto Encyclopedia of Genes and Genomes (KEGG) analysis, dozens of DEGs were assigned to the categories of biosynthesis of secondary metabolites, isoquinoline alkaloid biosynthesis, and flavonoid biosynthesis. The genes encoding norcoclaurine synthase (NCS), norcoclaurine 6-O-methyltransferase (6OMT), coclaurine N-methyltransferase (CNMT), N-methylcoclaurine 3′-hydroxylase (NMCH), and 3′-hydroxy-N-methylcoclaurine 4′-O-methyltransferase (4′OMT) in the common pathways of benzylisoquinoline alkaloid biosynthesis and the ones encoding corytuberine synthase (CTS) in aporphine alkaloid biosynthetic pathway, which have been characterized in other plants, were identified in lotus. These genes had positive effects on alkaloid content, albeit with phenotypic lag. The WGCNA of DEGs revealed that one network module was associated with the dynamic change of alkaloid content. Eleven genes encoding proteins with methyltransferase, oxidoreductase and CYP450 activities were identified. These were surmised to be genes involved in aporphine alkaloid biosynthesis. 
This transcriptomic database provides new directions for future studies on clarifying the aporphine alkaloid pathway. PMID:28197160
Cardiac transcriptional response to acute and chronic angiotensin II treatments.
Larkin, Jennie E; Frank, Bryan C; Gaspard, Renee M; Duka, Irena; Gavras, Haralambos; Quackenbush, John
2004-07-08
Exposure of experimental animals to increased angiotensin II (ANG II) induces hypertension associated with cardiac hypertrophy, inflammation, and myocardial necrosis and fibrosis. Some of the most effective antihypertensive treatments are those that antagonize ANG II. We investigated cardiac gene expression in response to acute (24 h) and chronic (14 day) infusion of ANG II in mice; 24-h treatment induces hypertension, and 14-day treatment induces hypertension and extensive cardiac hypertrophy and necrosis. For genes differentially expressed in response to ANG II treatment, we tested for significant regulation of pathways, based on Kyoto Encyclopedia of Genes and Genomes (KEGG) and Gene Microarray Pathway Profiler (GenMAPP) databases, as well as functional classes based on Gene Ontology (GO) terms. Both acute and chronic ANG II treatments resulted in decreased expression of mitochondrial metabolic genes, notably those for the electron transport chain and Krebs-TCA cycle; chronic ANG II treatment also resulted in decreased expression of genes involved in fatty acid metabolism. In contrast, genes involved in protein translation and ribosomal activity increased expression following both acute and chronic ANG II treatments. Some classes of genes showed differential response between acute and chronic ANG II treatments. Acute treatment increased expression of genes involved in oxidative stress and amino acid metabolism, whereas chronic treatments increased cytoskeletal and extracellular matrix genes, second messenger cascades responsive to ANG II, and amyloidosis genes. Although a functional linkage between Alzheimer disease, hypertension, and high cholesterol has been previously documented in studies of brain tissue, this is the first demonstration of induction of Alzheimer disease pathways by hypertension in heart tissue. 
This study provides the most comprehensive available survey of gene expression changes in response to acute and chronic ANG II treatment, verifying results from disparate studies, and suggests mechanisms that provide novel insight into the etiology of hypertensive heart disease and possible therapeutic interventions that may help to mitigate its effects.
Foerster, Hartmut; Bombarely, Aureliano; Battey, James N D; Sierro, Nicolas; Ivanov, Nikolai V; Mueller, Lukas A
2018-01-01
Abstract SolCyc is the entry portal to pathway/genome databases (PGDBs) for major species of the Solanaceae family hosted at the Sol Genomics Network. Currently, SolCyc comprises six organism-specific PGDBs for tomato, potato, pepper, petunia, tobacco and one Rubiaceae, coffee. The metabolic networks of those PGDBs have been computationally predicted by the pathologic component of the pathway tools software using the manually curated multi-domain database MetaCyc (http://www.metacyc.org/) as reference. SolCyc has been recently extended by taxon-specific databases, i.e. the family-specific SolanaCyc database, containing only curated data pertinent to species of the nightshade family, and NicotianaCyc, a genus-specific database that stores all relevant metabolic data of the Nicotiana genus. Through manual curation of the published literature, new metabolic pathways have been created in those databases, which are complemented by the continuously updated, relevant species-specific pathways from MetaCyc. At present, SolanaCyc comprises 199 pathways and 29 superpathways and NicotianaCyc accounts for 72 pathways and 13 superpathways. Curator-maintained, taxon-specific databases such as SolanaCyc and NicotianaCyc are characterized by an enrichment of data specific to these taxa and free of falsely predicted pathways. Both databases have been used to update recently created Nicotiana-specific databases for Nicotiana tabacum, Nicotiana benthamiana, Nicotiana sylvestris and Nicotiana tomentosiformis by propagating verifiable data into those PGDBs. In addition, in-depth curation of the pathways in N.tabacum has been carried out which resulted in the elimination of 156 pathways from the 569 pathways predicted by pathway tools. Together, in-depth curation of the predicted pathway network and the supplementation with curated data from taxon-specific databases has substantially improved the curation status of the species–specific N.tabacum PGDB. 
The implementation of this strategy will significantly advance the curation status of all organism-specific databases in SolCyc resulting in the improvement on database accuracy, data analysis and visualization of biochemical networks in those species. Database URL https://solgenomics.net/tools/solcyc/ PMID:29762652
Jiang, Tao; Guo, Junjie; Hu, Zhongchun; Zhao, Ming; Gu, Zhenggang; Miao, Shu
2018-06-20
BACKGROUND Long noncoding RNAs (lncRNAs) have been revealed to function as competing endogenous RNAs (ceRNAs), which can sequester shared microRNAs (miRNAs) and hence prevent the miRNAs from binding to their target genes. Nonetheless, the role of lncRNA-mediated ceRNAs in prostate cancer has not yet been elucidated. MATERIAL AND METHODS Using The Cancer Genome Atlas (TCGA) database, lncRNA, miRNA, and mRNA profiles from 499 prostate cancer tissues and 52 normal prostate tissues were analyzed with the R package "DESeq" to identify the differentially expressed RNAs. GO and KEGG pathway analyses were performed using "DAVID 6.8" and the R package "clusterProfiler." The ceRNA network in prostate cancer was constructed using the miRDB, miRTarBase, and TargetScan databases. Survival analysis was performed with Kaplan-Meier analysis. RESULTS A total of 376 lncRNAs, 33 miRNAs, and 687 mRNAs were identified as significant factors in tumorigenesis. Based on the hypothesis that the ceRNA network (lncRNA-miRNA-mRNA regulatory axis) is involved in prostate cancer and forms competitive interrelations between miRNA and mRNA or lncRNA, we constructed a ceRNA network that included 23 lncRNAs, 6 miRNAs, and 2 mRNAs that were differentially expressed in prostate cancer. Only 3 lncRNAs (LINC00308, LINC00355, and OSTN-AS1) had a significant association with survival (P<0.05). The 3 prostate cancer-specific lncRNAs were validated in the prostate cancer cell lines PC3 and DU145 using qRT-PCR. CONCLUSIONS We demonstrated the differential lncRNA expression profiles in prostate cancer, which provide new insights for future studies of the ceRNA network and its regulatory mechanisms in prostate cancer.
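The lncRNA-miRNA-mRNA axis described in the abstract above can be illustrated with a minimal sketch. It assumes that predicted lncRNA-miRNA pairs and miRNA-mRNA pairs are already available as plain tuples; the function name and all identifiers in the example data are hypothetical and are not taken from the study, which does not describe its implementation.

```python
from collections import defaultdict

def build_cerna_triples(lnc_mirna_pairs, mirna_mrna_pairs):
    """Join lncRNA-miRNA and miRNA-mRNA pairs on the shared miRNA."""
    targets_by_mirna = defaultdict(set)
    for mirna, mrna in mirna_mrna_pairs:
        targets_by_mirna[mirna].add(mrna)

    triples = set()
    for lnc, mirna in lnc_mirna_pairs:
        # Only miRNAs that also have an mRNA target yield a full axis.
        for mrna in targets_by_mirna.get(mirna, ()):
            triples.add((lnc, mirna, mrna))
    return sorted(triples)

# Hypothetical interaction pairs, for illustration only.
lnc_mirna = [("lncA", "miR-1"), ("lncB", "miR-1"), ("lncC", "miR-2")]
mirna_mrna = [("miR-1", "GENE1"), ("miR-3", "GENE2")]

triples = build_cerna_triples(lnc_mirna, mirna_mrna)
print(triples)
```

In this toy input, miR-2 has no mRNA target and miR-3 has no lncRNA partner, so neither contributes a triple.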
Transcriptomics of the Bed Bug (Cimex lectularius)
Rajarapu, Swapna P.; Jones, Susan C.; Mittapalli, Omprakash
2011-01-01
Background Bed bugs (Cimex lectularius) are blood-feeding insects poised to become one of the major pests in households throughout the United States. Resistance of C. lectularius to insecticides/pesticides is one factor thought to be involved in its sudden resurgence. Despite its high-impact status, scant knowledge exists at the genomic level for C. lectularius. Hence, we subjected the C. lectularius transcriptome to 454 pyrosequencing in order to identify potential genes involved in pesticide resistance. Methodology and Principal Findings Using 454 pyrosequencing, we obtained a total of 216,419 reads with 79,596,412 bp, which were assembled into 35,646 expressed sequence tags (3,902 contigs and 31,744 singletons). Nearly 85.9% of the C. lectularius sequences showed similarity to insect sequences, but 44.8% of the deduced proteins of C. lectularius did not show similarity with sequences in the GenBank non-redundant database. KEGG analysis revealed putative members of several detoxification pathways involved in pesticide resistance. Lamprin domains, Protein Kinase domains, Protein Tyrosine Kinase domains and cytochrome P450 domains were among the top Pfam domains predicted for the C. lectularius sequences. An initial assessment of putative defense genes, including a cytochrome P450 and a glutathione-S-transferase (GST), revealed high transcript levels for the cytochrome P450 (CYP9) in pesticide-exposed versus pesticide-susceptible C. lectularius populations. A significant number of single nucleotide polymorphisms (296) and microsatellite loci (370) were predicted in the C. lectularius sequences. Furthermore, 59 putative sequences of Wolbachia were retrieved from the database. Conclusions To our knowledge this is the first study to elucidate the genetic makeup of C. lectularius. This pyrosequencing effort provides clues to the identification of potential detoxification genes involved in pesticide resistance of C. lectularius and lays the foundation for future functional genomics studies. PMID:21283830
[Bioinformatics on vascular invasion markers in hepatocellular carcinoma via Big-Data analysis].
Chen, Q; Qiu, X Q
2017-04-10
Objective: To investigate biomarkers of hepatocellular carcinoma and their prognostic value using the GEO (Gene Expression Omnibus) and TCGA (The Cancer Genome Atlas) databases. Methods: Datasets of hepatocellular carcinoma were downloaded from GEO (GSE67140) and TCGA. MicroRNAs in the SNU423, SNU449, HepG2, Hep3B and SNU398 cell lines, which had low or high invasion capabilities, were investigated and then verified in 81 hepatocellular carcinoma patients with vascular invasion and 91 without. The prognostic value of these microRNAs was studied in TCGA data from 362 patients with hepatocellular carcinoma, using Kaplan-Meier and multivariate Cox proportional hazards analyses. Target genes were analyzed by GO and KEGG. Results: Expression of hsa-mir-1180, hsa-mir-149, hsa-mir-744 and hsa-mir-940 was up-regulated in the highly invasive cell lines (SNU423, SNU449) and in hepatocellular carcinoma patients with vascular invasion (logFC>1, P<0.05). Survival analysis showed that hsa-mir-1180 (HR=1.623, 95% CI: 1.114-2.365, P=0.012), hsa-mir-149 (HR=2.400, 95% CI: 1.639-3.514) and hsa-mir-940 (HR=1.704, 95% CI: 1.188-2.443, P=0.004) were independent risk factors for the prognosis of patients with hepatocellular carcinoma (P<0.05). The mechanism might be related to the immune response, focal adhesion and adherens junction signaling pathways. Conclusion: Through TCGA and GEO data mining, we found that hsa-mir-1180, hsa-mir-149, hsa-mir-744 and hsa-mir-940 were all closely related to the prognosis of hepatocellular carcinoma and could be used in further studies of prognostic biomarkers for hepatocellular carcinoma.
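The relation between a reported hazard ratio, its 95% confidence interval and the P value can be checked with a short calculation. The sketch below assumes a Wald-type interval that is symmetric on the log scale and a normal approximation for the test statistic; the abstract does not state how its intervals were computed, so this is an illustrative consistency check rather than the authors' method. The function name is hypothetical.

```python
import math

def p_from_hr_ci(hr, lower, upper, z_crit=1.96):
    """Recover SE(log HR), z and a two-sided P value from an HR and its 95% CI.

    Assumes the interval is exp(log(HR) +/- z_crit * SE), i.e. a Wald interval.
    """
    se = (math.log(upper) - math.log(lower)) / (2 * z_crit)
    z = math.log(hr) / se
    p = math.erfc(abs(z) / math.sqrt(2))  # two-sided normal tail probability
    return se, z, p

# Values reported in the abstract for hsa-mir-1180 and hsa-mir-940.
_, _, p_1180 = p_from_hr_ci(1.623, 1.114, 2.365)
_, _, p_940 = p_from_hr_ci(1.704, 1.188, 2.443)
print(round(p_1180, 3), round(p_940, 3))
```

Under these assumptions the recovered P values agree with the reported 0.012 and 0.004 to three decimal places, which suggests the reported intervals and P values are mutually consistent.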
Identification of aberrantly expressed long non-coding RNAs in stomach adenocarcinoma.
Gu, Jianbin; Li, Yong; Fan, Liqiao; Zhao, Qun; Tan, Bibo; Hua, Kelei; Wu, Guobin
2017-07-25
Stomach adenocarcinoma (STAD) is a common malignancy worldwide. This study aimed to identify the aberrantly expressed long non-coding RNAs (lncRNAs) in STAD. A total of 74 DElncRNAs and 449 DEmRNAs were identified in STAD compared with paired non-tumor tissues. The DElncRNA/DEmRNA co-expression network was constructed, which covered 519 nodes and 2993 edges. The qRT-PCR validation results for the DElncRNAs were consistent with our bioinformatics analysis based on RNA-sequencing. The DEmRNAs co-expressed with DElncRNAs were significantly enriched in gastric acid secretion, complement and coagulation cascades, pancreatic secretion, cytokine-cytokine receptor interaction and the Jak-STAT signaling pathway. The expression levels of the nine candidate DElncRNAs in the TCGA database were consistent with our RNA-sequencing results. FEZF1-AS1, HOTAIR and LINC01234 had potential diagnostic value for STAD. The lncRNA and mRNA expression profiles of 3 STAD tissues and 3 matched adjacent non-tumor tissues were obtained through high-throughput RNA-sequencing. Differentially expressed lncRNAs/mRNAs (DElncRNAs/DEmRNAs) were identified in STAD. DElncRNA/DEmRNA co-expression network construction and Gene Ontology (GO) and Kyoto Encyclopedia of Genes and Genomes (KEGG) enrichment analyses were conducted to predict the biological functions of the DElncRNAs. Quantitative real-time polymerase chain reaction (qRT-PCR) was used to validate the expression levels of DEmRNAs and DElncRNAs. Moreover, the expression of DElncRNAs was validated through The Cancer Genome Atlas (TCGA) database. The diagnostic value of candidate DElncRNAs was assessed by receiver operating characteristic (ROC) analysis. Our work might provide useful information for exploring the tumorigenesis mechanism of STAD and pave the way for identification of diagnostic biomarkers in STAD.
Xia, Wei; Mason, Annaliese S.; Xia, Zhihui; Qiao, Fei; Zhao, Songlin; Tang, Haoru
2013-01-01
Background Cocos nucifera (coconut), a member of the Arecaceae family, is an economically important woody palm grown in tropical regions. Despite its agronomic importance, previous germplasm assessment studies have relied solely on morphological and agronomical traits. Molecular biology techniques have been scarcely used in assessment of genetic resources and for improvement of important agronomic and quality traits in Cocos nucifera, mostly due to the absence of available sequence information. Methodology/Principal Findings To provide basic information for molecular breeding and further molecular biological analysis in Cocos nucifera, we applied RNA-seq technology and de novo assembly to gain a global overview of the Cocos nucifera transcriptome from mixed tissue samples. Using Illumina sequencing, we obtained 54.9 million short reads and conducted de novo assembly to obtain 57,304 unigenes with an average length of 752 base pairs. Sequence comparison between assembled unigenes and released cDNA sequences of Cocos nucifera and Elaeis guineensis indicated that the assembled sequences were of high quality. Approximately 99.9% of unigenes were novel compared to the released coconut EST sequences. Using BLASTX, 68.2% of unigenes were successfully annotated based on the Genbank non-redundant (Nr) protein database. The annotated unigenes were then further classified using the Gene Ontology (GO), Clusters of Orthologous Groups (COG) and Kyoto Encyclopedia of Genes and Genomes (KEGG) databases. Conclusions/Significance Our study provides a large quantity of novel genetic information for Cocos nucifera. This information will act as a valuable resource for further molecular genetic studies and breeding in coconut, as well as for isolation and characterization of functional genes involved in different biochemical pathways in this important tropical crop species. PMID:23555859
A gene network bioinformatics analysis for pemphigoid autoimmune blistering diseases.
Barone, Antonio; Toti, Paolo; Giuca, Maria Rita; Derchi, Giacomo; Covani, Ugo
2015-07-01
In this theoretical study, a text mining search and clustering analysis of data related to genes potentially involved in human pemphigoid autoimmune blistering diseases (PAIBD) was performed using web tools to create a gene/protein interaction network. The Search Tool for the Retrieval of Interacting Genes/Proteins (STRING) database was employed to identify a final set of PAIBD-involved genes and to calculate the overall significant interactions among genes: for each gene, the weighted number of links, or WNL, was registered and a clustering procedure was performed using the WNL analysis. Genes were ranked in class (leader, B, C, D and so on, up to orphans). An ontological analysis was performed for the set of 'leader' genes. Using the above-mentioned data network, 115 genes represented the final set; leader genes numbered 7 (intercellular adhesion molecule 1 (ICAM-1), interferon gamma (IFNG), interleukin (IL)-2, IL-4, IL-6, IL-8 and tumour necrosis factor (TNF)), class B genes were 13, whereas the orphans were 24. The ontological analysis attested that the molecular action was focused on extracellular space and cell surface, whereas the activation and regulation of the immunity system was widely involved. Despite the limited knowledge of the present pathologic phenomenon, attested by the presence of 24 genes revealing no protein-protein direct or indirect interactions, the network showed significant pathways gathered in several subgroups: cellular components, molecular functions, biological processes and the pathologic phenomenon obtained from the Kyoto Encyclopaedia of Genes and Genomes (KEGG) database. The molecular basis for PAIBD was summarised and expanded, which will perhaps give researchers promising directions for the identification of new therapeutic targets.
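The weighted number of links (WNL) used in the abstract above can be sketched as the sum of interaction scores over all links of a gene, with genes that have no links counted as orphans. This is a minimal illustration under that assumption; the gene names and scores are hypothetical, and the clustering step that assigns the leader, B, C and later classes is not reproduced here.

```python
def weighted_number_of_links(genes, scored_links):
    """Sum the interaction scores of every link touching each gene."""
    wnl = {gene: 0.0 for gene in genes}
    for gene_a, gene_b, score in scored_links:
        wnl[gene_a] += score
        wnl[gene_b] += score
    return wnl

# Hypothetical genes and STRING-like combined scores (0 to 1).
genes = ["G1", "G2", "G3", "G4"]
links = [("G1", "G2", 0.9), ("G2", "G3", 0.8), ("G1", "G3", 0.5)]

wnl = weighted_number_of_links(genes, links)
ranked = sorted(genes, key=lambda g: wnl[g], reverse=True)  # candidate leaders first
orphans = [g for g in genes if wnl[g] == 0]                  # genes with no links
print(ranked, orphans)
```

A real analysis would then cluster the WNL values to separate the classes instead of relying on a simple ranking.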
Martínez-Núñez, Mario Alberto; Poot-Hernandez, Augusto Cesar; Rodríguez-Vázquez, Katya; Perez-Rueda, Ernesto
2013-01-01
In this work, the content of enzymes and DNA-binding transcription factors (TFs) in 794 non-redundant prokaryotic genomes was evaluated. The identification of enzymes was based on annotations deposited in the KEGG database as well as in databases of functional domains (COG and PFAM) and structural domains (Superfamily). To identify the TFs, hidden Markov model profiles were constructed based on well-known transcriptional regulatory families. From these analyses, we obtained diverse and interesting results, such as a negative rate of incremental change in the number of detected enzymes with respect to genome size. In contrast, for TFs the rate increased as genome complexity increased. This inverse relationship shapes the diversity of metabolic and regulatory networks and impacts the availability of enzymes and TFs. Furthermore, the intersection of the derivatives for enzymes and TFs was identified at 9,659 genes; beyond this point, regulatory complexity grows faster than metabolic complexity. In addition, TFs have a low number of duplications, in contrast to the apparent high number of duplications associated with enzymes. Despite the greater number of duplicated enzymes versus TFs, the increment by which duplicates appear is higher in TFs. A lower proportion of enzymes among archaeal genomes (22%) than in the bacterial ones (27%) was also found. This low proportion might be compensated by the interconnection between the metabolic pathways in Archaea. A similar proportion was also found for the archaeal TFs, for which the formation of regulatory complexes has been proposed. Finally, an enrichment of multifunctional enzymes in Bacteria, as a mechanism of ecological adaptation, was detected. PMID:23922780
Fan, Haikuo; Xiao, Yong; Yang, Yaodong; Xia, Wei; Mason, Annaliese S; Xia, Zhihui; Qiao, Fei; Zhao, Songlin; Tang, Haoru
2013-01-01
Cocos nucifera (coconut), a member of the Arecaceae family, is an economically important woody palm grown in tropical regions. Despite its agronomic importance, previous germplasm assessment studies have relied solely on morphological and agronomical traits. Molecular biology techniques have been scarcely used in assessment of genetic resources and for improvement of important agronomic and quality traits in Cocos nucifera, mostly due to the absence of available sequence information. To provide basic information for molecular breeding and further molecular biological analysis in Cocos nucifera, we applied RNA-seq technology and de novo assembly to gain a global overview of the Cocos nucifera transcriptome from mixed tissue samples. Using Illumina sequencing, we obtained 54.9 million short reads and conducted de novo assembly to obtain 57,304 unigenes with an average length of 752 base pairs. Sequence comparison between assembled unigenes and released cDNA sequences of Cocos nucifera and Elaeis guineensis indicated that the assembled sequences were of high quality. Approximately 99.9% of unigenes were novel compared to the released coconut EST sequences. Using BLASTX, 68.2% of unigenes were successfully annotated based on the Genbank non-redundant (Nr) protein database. The annotated unigenes were then further classified using the Gene Ontology (GO), Clusters of Orthologous Groups (COG) and Kyoto Encyclopedia of Genes and Genomes (KEGG) databases. Our study provides a large quantity of novel genetic information for Cocos nucifera. This information will act as a valuable resource for further molecular genetic studies and breeding in coconut, as well as for isolation and characterization of functional genes involved in different biochemical pathways in this important tropical crop species.
Seifi Moroudi, Reihane; Masoudi, Ali Akbar; Vaez Torshizi, Rasoul; Zandi, Mohammad
2014-12-01
One of the important behaviors of dogs is trainability, which is affected by learning and memory genes. Such genes have not yet been identified in dogs. In the current research, these genes were identified in animal models by mining biological data and the scientific literature. The corresponding proteins in dogs and humans were obtained from the UniProt database. Because not all homologous proteins perform similar functions, these proteins were compared in terms of protein families, domains, biological processes, molecular functions, and cellular location of metabolic pathways using the InterPro, KEGG, QuickGO and PSORT databases. The results showed that some of these proteins have the same function in the rat or mouse, dog, and human. It is anticipated that the proteins encoded by these genes may be involved in learning and memory in dogs. Then, the expression pattern of the identified genes was investigated in the dog hippocampus using existing information in GEO Profiles. The results showed that the BDNF, TAC1 and CCK genes are expressed in the dog hippocampus; therefore, these genes could be strong candidates associated with learning and memory in dogs. Subsequently, because of the importance of promoter regions in gene function, the promoter regions of these genes were investigated. Promoter analysis indicated that the HNF-4 site of the BDNF gene and the transcription start site of the CCK gene are subject to methylation. Phylogenetic analysis of the protein sequences showed high similarity among the studied species for each of these three genes. The dN/dS ratios for the BDNF, TAC1 and CCK genes indicate purifying selection during the evolution of these genes.
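The interpretation of the dN/dS ratio mentioned at the end of the abstract above follows a standard convention: a ratio below 1 points to purifying selection, a ratio near 1 to neutral evolution, and a ratio above 1 to positive selection. The sketch below encodes that convention; the tolerance band around 1 and the input values are hypothetical, and a formal analysis would rely on a statistical test rather than a fixed cutoff.

```python
def classify_selection(dn, ds, tol=0.05):
    """Classify selection from nonsynonymous (dN) and synonymous (dS) rates.

    tol is an illustrative band around 1, not a published threshold.
    """
    if ds <= 0:
        raise ValueError("dS must be positive to compute dN/dS")
    omega = dn / ds
    if omega < 1 - tol:
        return omega, "purifying"
    if omega > 1 + tol:
        return omega, "positive"
    return omega, "neutral"

# Hypothetical rates for illustration only.
omega, label = classify_selection(dn=0.02, ds=0.40)
print(omega, label)
```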
Fernández-Suárez, Xosé M; Rigden, Daniel J; Galperin, Michael Y
2014-01-01
The 2014 Nucleic Acids Research Database Issue includes descriptions of 58 new molecular biology databases and recent updates to 123 databases previously featured in NAR or other journals. For convenience, the issue is now divided into eight sections that reflect major subject categories. Among the highlights of this issue are six databases of the transcription factor binding sites in various organisms and updates on such popular databases as CAZy, Database of Genomic Variants (DGV), dbGaP, DrugBank, KEGG, miRBase, Pfam, Reactome, SEED, TCDB and UniProt. There is a strong block of structural databases, which includes, among others, the new RNA Bricks database, updates on PDBe, PDBsum, ArchDB, Gene3D, ModBase, Nucleic Acid Database and the recently revived iPfam database. An update on the NCBI's MMDB describes VAST+, an improved tool for protein structure comparison. Two articles highlight the development of the Structural Classification of Proteins (SCOP) database: one describes SCOPe, which automates assignment of new structures to the existing SCOP hierarchy; the other one describes the first version of SCOP2, with its more flexible approach to classifying protein structures. This issue also includes a collection of articles on bacterial taxonomy and metagenomics, which includes updates on the List of Prokaryotic Names with Standing in Nomenclature (LPSN), Ribosomal Database Project (RDP), the Silva/LTP project and several new metagenomics resources. The NAR online Molecular Biology Database Collection, http://www.oxfordjournals.org/nar/database/c/, has been expanded to 1552 databases. The entire Database Issue is freely available online on the Nucleic Acids Research website (http://nar.oxfordjournals.org/).
Wei, Lin; Li, Shenghua; Liu, Shenggui; He, Anna; Wang, Dan; Wang, Jie; Tang, Yulian; Wu, Xianjin
2014-01-01
Background Houttuynia cordata Thunb. is an important traditional medical herb in China and other Asian countries, with high medicinal and economic value. However, a lack of available genomic information has become a limitation for research on this species. Thus, we carried out high-throughput transcriptomic sequencing of H. cordata to generate an enormous transcriptome sequence dataset for gene discovery and molecular marker development. Principal Findings Illumina paired-end sequencing technology produced over 56 million sequencing reads from H. cordata mRNA. Subsequent de novo assembly yielded 63,954 unigenes, 39,982 (62.52%) and 26,122 (40.84%) of which had significant similarity to proteins in the NCBI nonredundant protein and Swiss-Prot databases (E-value <10−5), respectively. Of these annotated unigenes, 30,131 and 15,363 unigenes were assigned to gene ontology categories and clusters of orthologous groups, respectively. In addition, 24,434 (38.21%) unigenes were mapped onto 128 pathways using the KEGG pathway database and 17,964 (44.93%) unigenes showed homology to Vitis vinifera (Vitaceae) genes in BLASTx analysis. Furthermore, 4,800 cDNA SSRs were identified as potential molecular markers. Fifty primer pairs were randomly selected to detect polymorphism among 30 samples of H. cordata; 43 (86%) produced fragments of expected size, suggesting that the unigenes were suitable for specific primer design and of high quality, and the SSR marker could be widely used in marker-assisted selection and molecular breeding of H. cordata in the future. Conclusions This is the first application of Illumina paired-end sequencing technology to investigate the whole transcriptome of H. cordata and to assemble RNA-seq reads without a reference genome. These data should help researchers investigating the evolution and biological processes of this species. 
The SSR markers developed can be used for construction of high-resolution genetic linkage maps and for gene-based association analyses in H. cordata. This work will enable future functional genomic research and research into the distinctive active constituents of this genus. PMID:24392108
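Identification of SSRs (microsatellites) in assembled unigenes, as reported in the abstract above, is commonly done by scanning for short motifs repeated in tandem. The following is a minimal regex-based sketch of that idea; the minimum repeat counts and the test sequence are hypothetical and do not reflect the settings or tool used in the study.

```python
import re

# Minimum number of tandem repeats per motif length (hypothetical thresholds).
MIN_REPEATS = {2: 6, 3: 5, 4: 5}

def is_compound_of_shorter_unit(motif):
    """True if the motif is itself a repeat of a shorter unit (e.g. AGAG, AAA)."""
    n = len(motif)
    return any(n % k == 0 and motif == motif[:k] * (n // k) for k in range(1, n))

def find_ssrs(seq, min_repeats=MIN_REPEATS):
    """Return (start, motif, repeat_count) for each perfect tandem repeat."""
    seq = seq.upper()
    hits = []
    for motif_len, min_rep in sorted(min_repeats.items()):
        # A motif followed by at least (min_rep - 1) further copies of itself.
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (motif_len, min_rep - 1))
        for match in pattern.finditer(seq):
            motif = match.group(1)
            if is_compound_of_shorter_unit(motif):
                continue  # already reported under the shorter motif, or a homopolymer
            hits.append((match.start(), motif, len(match.group(0)) // motif_len))
    return hits

# Hypothetical sequence containing (AG)8 and (CTT)5.
example = "TTGC" + "AG" * 8 + "CCGTA" + "CTT" * 5 + "GGA"
ssrs = find_ssrs(example)
print(ssrs)
```

The filter on compound motifs keeps a dinucleotide run such as (AG)12 from being counted a second time as a tetranucleotide (AGAG)6.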
Yang, Jiajia; Lin, Yao; Jiang, Liming; Xi, Juemin; Wang, Xiaodan; Guan, Jiaoqiong; Chen, Junying; Pan, Yue; Luo, Jia; Ye, Chao; Sun, Qiangming
2018-05-02
To elucidate the differences in microRNAs during dengue virus infection between the Vero cell-adapted strain (DENV-2-Vero) and its source, the clinical C6/36-isolated strain (DENV-2-C6/36), a comparative analysis was performed in Vero cells by high-throughput sequencing. The results showed that the expression of 16 known and 3 novel miRNAs differed markedly. Five known miRNAs were up-regulated in the DENV-2-C6/36 group, while 11 known miRNAs were down-regulated in the DENV-2-Vero group. GO enrichment and KEGG pathway analyses showed a distinct difference between the two strains in the regulation of viral replication. In the DENV-2-Vero infection group, significantly enriched GO terms included virion attachment to host cells and viral structural protein/genome processing and packaging. Meanwhile, the regulation of cell death and apoptosis differed between the two groups in the early stage of infection. KEGG enrichment analysis showed that DENV-2-C6/36 infection induced more intense regulation of immune-related pathways, including Fc gamma R-mediated phagocytosis. DENV-2-Vero infection could partially alleviate the immune defense of Vero cells compared with DENV-2-C6/36. The results indicated that the distinct microRNA changes induced by the two DENV-2 strains may be partly related to their infective abilities. Our data provide useful insights that help elucidate the host-pathogen interactions following DENV infection.
Guo, Zhiqiang; Zhao, Chuncheng; Wang, Zheng
2014-09-26
To identify critical genes and biological pathways in acute lung injury (ALI), a comparative analysis of gene expression profiles of patients with ALI + sepsis and patients with sepsis alone was performed with bioinformatic tools. GSE10474 was downloaded from Gene Expression Omnibus, including a total of 13 whole blood samples with ALI + sepsis and 21 whole blood samples with sepsis alone. After pre-treatment with the robust multichip averaging (RMA) method, differential analysis was conducted using the simpleaffy package based upon t-test and fold change. Hierarchical clustering was also performed using the function hclust from the package stats. In addition, functional enrichment analysis was conducted using iGepros. Moreover, the gene regulatory network was constructed with information from the Kyoto Encyclopedia of Genes and Genomes (KEGG) and then visualized with Cytoscape. A total of 128 differentially expressed genes (DEGs) were identified, including 47 up- and 81 down-regulated genes. The significantly enriched functions included negative regulation of cell proliferation, regulation of response to stimulus and cellular component morphogenesis. A total of 27 DEGs were significantly enriched in 16 KEGG pathways, such as protein digestion and absorption, fatty acid metabolism and amoebiasis. Furthermore, the regulatory network of these 27 DEGs was constructed, which involved several key genes, including protein tyrosine kinase 2 (PTK2), v-src avian sarcoma (SRC) and Caveolin 2 (CAV2). PTK2, SRC and CAV2 may be potential markers for diagnosis and treatment of ALI. The virtual slide(s) for this article can be found here: http://www.diagnosticpathology.diagnomx.eu/vs/5865162912987143.
Comparison of the gene expression profiles between gallstones and gallbladder polyps.
Li, Quanfu; Ge, Xin; Xu, Xu; Zhong, Yonggang; Qie, Zengwang
2014-01-01
Gallstones and gallbladder polyps (GPs) are two major types of gallbladder disease that share multiple common symptoms. However, their pathological mechanisms remain largely unknown. The aim of our study was to identify genes related to gallstones and GPs and to gain insight into the underlying genetic basis of these diseases. We enrolled 7 patients with gallstones and 2 patients with GPs for RNA-Seq, and we conducted functional enrichment analysis and protein-protein interaction (PPI) network analysis for the identified differentially expressed genes (DEGs). RNA-Seq produced 41.7 million in gallstones and 32.1 million pairs in GPs. A total of 147 DEGs were identified between gallstones and GPs. Among GO terms for molecular function, antigen binding was significantly enriched (GO:0003823, P=5.9E-11); for biological process, the enriched GO term was immune response (GO:0006955, P=2.6E-15); and for cellular component, the enriched GO term was extracellular region (GO:0005576, P=2.7E-15). To further evaluate the biological significance of the DEGs, we also performed KEGG pathway enrichment analysis. The most significant pathway in our KEGG analysis was Cytokine-cytokine receptor interaction (P=7.5E-06). PPI network analysis indicated that the significant hub proteins included S100A9 (S100 calcium binding protein A9, Degree=94) and CR2 (complement component receptor 2, Degree=8). The present study suggests some promising genes and may provide a clue to the role these genes play in the development of gallstones and GPs.
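Pathway enrichment P-values such as those quoted above are commonly obtained from a one-sided hypergeometric (Fisher-type) test. The abstract does not say which test or tool was used, so the following Python sketch is an assumption about the general technique rather than the authors' procedure; the function name and the example numbers are hypothetical.

```python
from math import comb


def hypergeom_enrichment_p(N: int, K: int, n: int, k: int) -> float:
    """One-sided enrichment P-value, P(X >= k).

    N: genes in the background set
    K: background genes annotated to the pathway
    n: genes in the study list (e.g. DEGs)
    k: study-list genes annotated to the pathway
    """
    upper = min(K, n)
    total = comb(N, n)
    # Sum the hypergeometric probabilities for k, k+1, ..., upper hits.
    tail = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1))
    return tail / total


# Hypothetical numbers for illustration only (not taken from the study):
# 20,000 background genes, 250 in the pathway, 147 DEGs, 12 of them in the pathway.
p_value = hypergeom_enrichment_p(20_000, 250, 147, 12)
print(f"enrichment P = {p_value:.3g}")
```

In practice the P-values from many pathways would also be corrected for multiple testing, which this sketch leaves out.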
He, Rong-Quan; Yang, Xia; Liang, Liang; Chen, Gang; Ma, Jie
2018-04-01
The present study aimed to explore the potential clinical significance of microRNA (miR)-124-3p expression in the hepatocarcinogenesis and development of hepatocellular carcinoma (HCC), as well as the potential target genes of functional HCC pathways. Reverse transcription-quantitative polymerase chain reaction was performed to evaluate the expression of miR-124-3p in 101 HCC and adjacent non-cancerous tissue samples. Additionally, the association between miR-124-3p expression and clinical parameters was also analyzed. Differentially expressed genes identified following miR-124-3p transfection, the prospective target genes predicted in silico and the key genes of HCC obtained from Natural Language Processing (NLP) were integrated to obtain potential target genes of miR-124-3p in HCC. Relevant signaling pathways were assessed with protein-protein interaction (PPI) networks, Gene Ontology (GO) enrichment analysis, Kyoto Encyclopedia of Genes and Genomes (KEGG) and Protein Annotation Through Evolutionary Relationships (PANTHER) pathway enrichment analysis. miR-124-3p expression was significantly reduced in HCC tissues compared with expression in adjacent non-cancerous liver tissues. In HCC, miR-124-3p was demonstrated to be associated with clinical stage. The mean survival time of the low miR-124-3p expression group was reduced compared with that of the high expression group. A total of 132 genes overlapped from differentially expressed genes, miR-124-3p predicted target genes and NLP identified genes. PPI network construction revealed a total of 109 nodes and 386 edges, and 20 key genes were identified. The major enriched terms of three GO categories included regulation of cell proliferation, positive regulation of cellular biosynthetic processes, cell leading edge, cytosol and cell projection, protein kinase activity, transcription activator activity and enzyme binding. 
KEGG analysis revealed pancreatic cancer, prostate cancer and non-small cell lung cancer as the top three terms. Angiogenesis, the endothelial growth factor receptor signaling pathway and the fibroblast growth factor signaling pathway were identified as the most significant terms in the PANTHER pathway analysis. The present study confirmed that miR-124-3p acts as a tumor suppressor in HCC. miR-124-3p may target multiple genes, exerting its effect spatiotemporally, or in combination with a diverse range of processes in HCC. Functional characterization of miR-124-3p targets will offer novel insight into the molecular changes that occur in HCC progression.
Composition of dissolved organic matter in groundwater
NASA Astrophysics Data System (ADS)
Longnecker, Krista; Kujawinski, Elizabeth B.
2011-05-01
Groundwater constitutes a globally important source of freshwater for drinking water and other agricultural and industrial purposes, and is a prominent source of freshwater flowing into the coastal ocean. Therefore, understanding the chemical components of groundwater is relevant to both coastal and inland communities. We used electrospray ionization coupled with Fourier-transform ion cyclotron resonance mass spectrometry (ESI FT-ICR MS) to examine dissolved organic compounds in groundwater prior to and after passage through a sediment-filled column containing microorganisms. The data revealed that an unexpectedly high proportion of organic compounds contained nitrogen and sulfur, possibly due to transport of surface waters from septic systems and rain events. We matched 292 chemical features, based on measured mass:charge (m/z) values, to compounds stored in the Kyoto Encyclopedia of Genes and Genomes (KEGG). A subset of these compounds (88) had only one structural isomer in KEGG, thus supporting tentative identification. Most identified elemental formulas were linked with metabolic pathways that produce polyketides or with secondary metabolites produced by plants. The presence of polyketides in groundwater is notable because of their anti-bacterial and anti-cancer properties. However, their relative abundance must be quantified with appropriate analyses to assess any implications for public health.
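The matching step described above (measured m/z values compared against database compound masses, with single-isomer hits treated as tentative identifications) can be sketched in a few lines of Python. Everything specific here is an assumption made for illustration: the ionization mode (singly charged [M-H]- ions), the 1 ppm tolerance, the function names, and the placeholder compound table, which stands in for masses that would really come from KEGG.

```python
PROTON_MASS = 1.007276  # Da; assumes singly charged, deprotonated [M-H]- ions


def neutral_mass(mz: float) -> float:
    """Convert a measured m/z of an [M-H]- ion to the neutral monoisotopic mass."""
    return mz + PROTON_MASS


def match_features(mz_values, compound_masses, tol_ppm=1.0):
    """Map each measured m/z to the compounds whose mass lies within tol_ppm."""
    matches = {}
    for mz in mz_values:
        mass = neutral_mass(mz)
        hits = [
            name
            for name, ref in compound_masses.items()
            if abs(mass - ref) / ref * 1e6 <= tol_ppm
        ]
        if hits:
            matches[mz] = sorted(hits)
    return matches


def unique_matches(matches):
    """Keep only features with exactly one candidate (no isomers in the table)."""
    return {mz: hits[0] for mz, hits in matches.items() if len(hits) == 1}


# Placeholder compound table: cpd_A and cpd_B are isomers (same mass).
compounds = {"cpd_A": 180.06339, "cpd_B": 180.06339, "cpd_C": 342.11621}
features = [179.056114, 341.108934, 250.0]

matches = match_features(features, compounds)
unique = unique_matches(matches)
print(matches)
print(unique)
```

Features that hit several isomers remain ambiguous, which mirrors why only a subset of the matched features in the abstract supported tentative identification.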
Effects of aqueous extract of Arctium lappa L. roots on serum lipid metabolism.
Hou, Bo; Wang, Wencheng; Gao, Hui; Cai, Shanglang; Wang, Chunbo
2018-01-01
Objective To identify potential genes that may be involved in lipid metabolism in rats after treatment with aqueous extract of Arctium lappa L (burdock). Methods Rats were randomly divided into six groups: (i) control (standard diet); (ii) model group (high-fat diet only); (iii) high-fat diet and low-dose aqueous burdock root extract (2 g/kg); (iv) high-fat diet and moderate-dose aqueous burdock root extract (4 g/kg); (v) high-fat diet and high-dose aqueous burdock root extract (8 g/kg); and (vi) a positive control group exposed to a high-fat diet and simvastatin (10 mg/kg). Kyoto Encyclopedia of Genes and Genomes (KEGG) analysis was performed to find the potential candidate genes involved in the modulation of blood lipids by treatment with aqueous burdock root extract. Results Burdock root extract reduced body weight and cholesterol levels in rats. KEGG analysis revealed 113 genes that were involved in metabolic pathways. Of these, 27 potential genes associated with blood lipid metabolism were identified. Conclusions Aqueous extract of burdock root reduced body weight and cholesterol in rats, possibly by modulating the differential expression of genes.
Collard, J-F; Hinsenkamp, M
2015-05-01
We have observed a biological response in different tissues and organisms after exposure to pulsed low-frequency, low-amplitude electric or electromagnetic fields, but the precise mechanism of the cell response remains unknown. The aim of this publication is to understand, using bioinformatics, the biological relevance of the processes involved in the modification of gene expression. The list of genes analyzed was obtained from a microarray protocol performed on cultures of human epidermal explants growing on de-epidermized human skin exposed to a pulsed low-frequency electric field. The directed acyclic graph from the WebGestalt Gene Ontology module shows six categories under the biological process root: "biological regulation", "cellular process", "cell proliferation", "death", "metabolic process" and "response to stimulus". Enriched derived categories are coherent with the type of in vitro culture, the stimulation protocol or the previous results showing a decrease in cell proliferation and an increase in differentiation. The KEGG module in WebGestalt highlighted "cell cycle" and "p53 signaling pathway" as significantly involved. The KEGG website brings out interactions between FoxO, MAPK, JNK, p53, p38, PI3K/Akt, Wnt, mTOR or NF-kappaB. Some genes expressed by the stimulation are known to have an exclusive function on these pathways. Analyses performed with Pathway Studio linked cell proliferation, cell differentiation, apoptosis, cell cycle, mitosis, cell death, etc. with our microarray results. The Medline citations generated by the software and the fold-change variation confirm a decrease in proliferation, an activation of differentiation and a less well-defined role of apoptosis or wound healing. The Wnt and DKK functional classes and the DKK1, MACF1, ATF3, MME, TXNRD1, and BMP-2 genes, proposed in previous publications after a manual analysis, are also highlighted along with other genes by the Pathway Studio automatic procedure. 
Finally, an analysis conducted on a list of genes characterized by an accelerated regulation after extremely low frequency pulsed stimulation also confirms their role in the processes of cell proliferation and differentiation. A bioinformatics approach allows in-depth research, without the bias of pre-selection, on the cellular processes involved in a huge gene list. Copyright © 2015 Elsevier Inc. All rights reserved.
Xu, Liming; Dan, Mo; Shao, Anliang; Cheng, Xiang; Zhang, Cuiping; Yokel, Robert A; Takemura, Taro; Hanagata, Nobutaka; Niwa, Masami; Watanabe, Daisuke
2015-01-01
Silver nanoparticles (Ag-NPs) can enter the brain and induce neurotoxicity. However, the toxicity of Ag-NPs on the blood-brain barrier (BBB) and the underlying mechanism(s) of action on the BBB and the brain are not well understood. To investigate Ag-NP suspension (Ag-NPS)-induced toxicity, a triple coculture BBB model of rat brain microvascular endothelial cells, pericytes, and astrocytes was established. The BBB permeability and tight junction protein expression in response to Ag-NPS, NP-released Ag ions, and polystyrene-NP exposure were investigated. Ultrastructural changes of the microvascular endothelial cells, pericytes, and astrocytes were observed using transmission electron microscopy (TEM). Global gene expression of astrocytes was measured using a DNA microarray. A triple coculture BBB model of primary rat brain microvascular endothelial cells, pericytes, and astrocytes was established, with transendothelial electrical resistance values >200 Ω·cm². After Ag-NPS exposure for 24 hours, the BBB permeability was significantly increased and expression of the tight junction (TJ) protein ZO-1 was decreased. Discontinuous TJs were also observed between microvascular endothelial cells. After Ag-NPS exposure, severe mitochondrial shrinkage, vacuolations, endoplasmic reticulum expansion, and Ag-NPs were observed in astrocytes by TEM. Global gene expression analysis showed that three genes were upregulated and 20 genes were downregulated in astrocytes treated with Ag-NPS. Gene ontology (GO) and Kyoto Encyclopedia of Genes and Genomes (KEGG) pathway analysis showed that the 23 genes were associated with metabolic processes, biosynthetic processes, response to stimuli, cell death, the MAPK pathway, and so on. No GO terms or KEGG pathways were changed in the released-ion or polystyrene-NP groups. Ag-NPS inhibited the antioxidant defense of the astrocytes by increasing thioredoxin interacting protein, which inhibits the Trx system, and decreasing Nr4a1 and Dusp1. 
Meanwhile, Ag-NPS induced inflammation and apoptosis through modulation of the MAPK pathway or B-cell lymphoma-2 expression or mTOR activity in astrocytes. These results draw our attention to the importance of Ag-NP-induced toxicity on the neurovascular unit and provide a better understanding of its toxicological mechanisms on astrocytes.
Differential gene expression in Schistosoma japonicum schistosomula from Wistar rats and BALB/c mice
2011-01-01
Background More than 46 species of mammals can be naturally infected with Schistosoma japonicum in mainland China. Mice are permissive and may act as the definitive host of the life cycle. In contrast, rats are less susceptible to S. japonicum infection, and are considered to provide an unsuitable micro-environment for parasite growth and development. Since little is known of what effects this micro-environment has on the parasite itself, we have in the present study utilised a S. japonicum oligonucleotide microarray to compare the gene expression differences of 10-day-old schistosomula maintained in Wistar rats with those maintained in BALB/c mice. Results In total 3,468 schistosome genes were found to be differentially expressed, of which the majority (3,335) were down-regulated (≤ 2 fold) and 133 were up-regulated (≥ 2 fold) in schistosomula from Wistar rats compared with those from BALB/c mice. Gene ontology (GO) analysis revealed that of the differentially expressed genes with already established functions or close homology to well characterized genes in other organisms, many are related to important biological functions or molecular processes. Among the genes that were down-regulated in schistosomula from Wistar rats, some were associated with metabolism, signal transduction and development. Among the genes related to metabolic processes, areas including translation, protein and amino acid phosphorylation, proteolysis, oxidoreductase activities, catalytic activities and hydrolase activities were represented. KEGG (Kyoto Encyclopedia of Genes and Genomes) pathway analysis of the differentially expressed genes indicated that of the 328 genes that had a specific KEGG pathway annotation, 324 were down-regulated and were mainly associated with metabolism, growth, redox pathway, oxidative phosphorylation, the cell cycle, ubiquitin-mediated proteolysis, protein export and the MAPK (mitogen-activated protein kinases) signaling pathway. 
Conclusions This work presents the first large scale gene expression study identifying the differences between schistosomula maintained in mice and those maintained in rats, and specifically highlights differential expression that may impact on the survival and development of the parasite within the definitive host. The research presented here provides valuable information for the better understanding of schistosome development and host-parasite interactions. PMID:21819550
2013-01-01
Arthrospira (Spirulina) platensis, a representative species of cyanobacteria, is recognized and used worldwide as a source of dietary protein and possesses some unusual and valuable physiological characteristics, such as alkali and salt tolerance. Based on complete genome sequencing of Arthrospira (Spirulina) platensis-YZ, we compared the protein expression profiles of this organism under different salt-stress conditions (i.e. 0.02 M, 0.5 M and 1.0 M NaCl, respectively), using 2-D electrophoresis and peptide mass fingerprinting, and retrieved 141 proteins showing significantly differential expression in response to salt stress. Of the 141 proteins, 114 Arthrospira (Spirulina) platensis-YZ proteins were found to have significant homology to those found in Arthrospira (76 proteins in Arthrospira platensis str. Paraca and 38 in Arthrospira maxima CS-328). The remaining 27 proteins belong to other bacteria. Subsequently, we determined the transcriptional level of 29 genes in vivo in response to NaCl treatments and verified them by qRT-PCR. We found that 12 genes showed consistent changes at both the transcript and protein levels, and all but one of them were transcriptionally up-regulated. We classified the 141 differentially expressed proteins into 18 functional categories using the COG database, and linked them to their respective KEGG metabolic pathways. These proteins are involved in 31 metabolic pathways, such as photosynthesis, glucose metabolism, cysteine and methionine metabolism, lysine synthesis, fatty acid metabolism and glutathione metabolism. Additionally, SRPs, heat shock proteins and ABC transporter proteins were identified, which probably confer resistance against high salt stress on Arthrospira (Spirulina) platensis. PMID:23363438
Comparative transcriptome analysis of the Asteraceae halophyte Karelinia caspica under salt stress.
Zhang, Xia; Liao, Maoseng; Chang, Dan; Zhang, Fuchun
2014-12-17
Much attention has been given to the potential of halophytes as sources of tolerance traits for introduction into cereals. However, a great deal remains unknown about the diverse mechanisms employed by halophytes to cope with salinity. To characterize the salt tolerance mechanisms of Karelinia caspica, an Asteraceae halophyte, we performed large-scale transcriptomic analysis using a high-throughput Illumina sequencing platform. Comparative gene expression analysis was performed to correlate the effects of salt stress and ABA regulation at the molecular level. Total sequence reads generated by pyrosequencing were assembled into 287,185 non-redundant transcripts with an average length of 652 bp. Using the BLAST function in the Swiss-Prot, NCBI nr, GO, KEGG, and KOG databases, a total of 216,416 coding sequences associated with known proteins were annotated. Among these, 35,533 unigenes were classified into 69 gene ontology categories, and 18,378 unigenes were classified into 202 known pathways. Based on the fold changes observed when comparing the salt stress and control samples, 60,127 unigenes were differentially expressed, with 38,122 and 22,005 up- and down-regulated, respectively. Several of the differentially expressed genes are known to be involved in the signaling pathway of the plant hormone ABA, including ABA metabolism, transport, and sensing as well as the ABA signaling cascade. Transcriptome profiling of K. caspica contributes to a comprehensive understanding of this species at the molecular level. Moreover, the global survey of differentially expressed genes in this species under salt stress and analyses of the effects of salt stress and ABA regulation will contribute to the identification and characterization of genes and molecular mechanisms underlying salt stress responses in Asteraceae plants.
Walus, Marius; Kida, Elizabeth; Rabe, Ausma; Albertini, Giorgio; Golabek, Adam A
2016-01-01
Our previous study showed an improvement in locomotor deficits after voluntary lifelong running in Ts65Dn mice, an animal model for Down syndrome (DS). In the present study, we employed mouse microarrays printed with 55,681 probes in an attempt to identify molecular changes in the cerebellar transcriptome that might contribute to the observed behavioral benefits of voluntary long-term running in Ts65Dn mice. Euploid mice were processed in parallel for comparative purposes in some analyses. We found that running significantly changed the expression of 4,315 genes in the cerebellum of Ts65Dn mice, over five times more than in euploid animals, up-regulating 1,991 and down-regulating 2,324 genes. Functional analysis of these genes revealed a significant enrichment of 92 terms in the biological process category, including regulation of biosynthesis and metabolism, protein modification, phosphate metabolism, synaptic transmission, development, regulation of cell death/apoptosis, protein transport, development, neurogenesis and neuron differentiation. The KEGG pathway database identified 18 pathways that were up-regulated and two that were down-regulated by running; these were associated with learning, memory, cell signaling, proteolysis, regeneration, cell cycle, proliferation, growth, migration, and survival. Of six mRNA protein products we tested by immunoblotting, four showed significant running-associated changes in their levels, the most prominent in glutaminergic receptor metabotropic 1, and two showed changes that were close to significant. Thus, unexpectedly, our data point to the high molecular plasticity of the Ts65Dn mouse cerebellum, which, translated to humans with DS, suggests that the motor deficits of individuals with DS could markedly benefit from prolonged exercise. Copyright © 2015 Elsevier B.V. All rights reserved.
Wu, Xia; Zhu, Jian-Cheng; Zhang, Yu; Li, Wei-Min; Rong, Xiang-Lu; Feng, Yi-Fan
2016-08-25
The potential impact of lipid research has been increasingly recognized in both disease treatment and prevention. An effective metabolomics approach based on ultra-performance liquid chromatography/quadrupole-time-of-flight mass spectrometry (UPLC/Q-TOF-MS) along with multivariate statistical analysis was applied to investigate the dynamic change of plasma phospholipid composition in early type 2 diabetic rats after treatment with Huang-Qi-San, an ancient prescription of Chinese Medicine. The exported UPLC/Q-TOF-MS data of plasma samples were subjected to SIMCA-P and processed by the bioMark, mixOmics and Rcomdr packages with R software. Clear score plots of the plasma sample groups, including the normal control group (NC), model group (MC), positive medicine control group (Flu) and Huang-Qi-San group (HQS), were achieved by principal-components analysis (PCA), partial least-squares discriminant analysis (PLS-DA) and orthogonal partial least-squares discriminant analysis (OPLS-DA). Biomarkers were screened out using Student's t-test, principal component regression (PCR), partial least-squares regression (PLS) and the important variable method (variable influence on projection, VIP). Structures of metabolites were identified and metabolic pathways were deduced by correlation coefficient. The relationship between compounds was explained by the correlation coefficient diagram, and the metabolic differences between similar compounds were illustrated. Based on the KEGG database, the biological significance of the identified biomarkers was described. The correlation coefficient was applied for the first time to identify the structure and deduce the metabolic pathways of phospholipid metabolites, and the study provided a new methodological cue for further understanding the molecular mechanisms of metabolites in the process of regulating Huang-Qi-San for treating early type 2 diabetes. Copyright © 2016 Elsevier Ireland Ltd. All rights reserved.
ESTs Analysis Reveals Putative Genes Involved in Symbiotic Seed Germination in Dendrobium officinale
Zhao, Ming-Ming; Zhang, Gang; Zhang, Da-Wei; Hsiao, Yu-Yun; Guo, Shun-Xing
2013-01-01
Dendrobium officinale (Orchidaceae) is one of the world’s most endangered plants with great medicinal value. In nature, D. officinale seeds must establish symbiotic relationships with fungi to germinate. However, the molecular events involved in the interaction between fungus and plant during this process are poorly understood. To isolate the genes involved in symbiotic germination, a suppression subtractive hybridization (SSH) cDNA library of symbiotically germinated D. officinale seeds was constructed. From this library, 1437 expressed sequence tags (ESTs) were clustered to 1074 Unigenes (including 902 singletons and 172 contigs), which were searched against the NCBI non-redundant (NR) protein database (E-value cutoff, 1e-5). Based on sequence similarity with known proteins, 579 differentially expressed genes in D. officinale were identified and classified into different functional categories by Gene Ontology (GO), Clusters of orthologous Groups of proteins (COGs) and Kyoto Encyclopedia of Genes and Genomes (KEGG) pathways. The expression levels of 15 selected genes emblematic of symbiotic germination were confirmed via real-time quantitative PCR. These genes were classified into various categories, including defense and stress response, metabolism, transcriptional regulation, transport process and signal transduction pathways. All transcripts were upregulated in the symbiotically germinated seeds (SGS). The functions of these genes in symbiotic germination were predicted. Furthermore, two fungus-induced calcium-dependent protein kinases (CDPKs), which were upregulated 6.76- and 26.69-fold in SGS compared with un-germinated seeds (UGS), were cloned from D. officinale and characterized for the first time. This study provides the first global overview of genes putatively involved in D. officinale symbiotic seed germination and provides a foundation for further functional research regarding symbiotic relationships in orchids. PMID:23967335
Linking metabolic network features to phenotypes using sparse group lasso.
Samal, Satya Swarup; Radulescu, Ovidiu; Weber, Andreas; Fröhlich, Holger
2017-11-01
Integration of metabolic networks with '-omics' data has been a subject of recent research in order to better understand the behaviour of such networks with respect to differences between biological and clinical phenotypes. Under the conditions of steady state of the reaction network and the non-negativity of fluxes, metabolic networks can be algebraically decomposed into a set of sub-pathways often referred to as extreme currents (ECs). Our objective is to find the statistical association of such sub-pathways with given clinical outcomes, resulting in a particular instance of a self-contained gene set analysis method. In this direction, we propose a method based on sparse group lasso (SGL) to identify phenotype-associated ECs based on gene expression data. SGL selects a sparse set of feature groups and also introduces sparsity within each group. Features in our model are clusters of ECs, and feature groups are defined based on correlations among these features. We apply our method to metabolic networks from the KEGG database and study the association of network features with prostate cancer (where the outcome is tumor versus normal) as well as glioblastoma multiforme (where the outcome is survival time). In addition, simulations show the superior performance of our method compared to the global test, which is an existing self-contained gene set analysis method. R code (compatible with version 3.2.5) is available from http://www.abi.bit.uni-bonn.de/index.php?id=17. Contact: samal@combine.rwth-aachen.de or frohlich@bit.uni-bonn.de. Supplementary data are available at Bioinformatics online. © The Author 2017. Published by Oxford University Press. All rights reserved. For Permissions, please e-mail: journals.permissions@oup.com
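The statement that SGL selects a sparse set of groups and also induces sparsity within each group follows from the form of its penalty, which mixes a group-wise L2 term with an element-wise L1 term. The abstract does not give the exact formulation used, so the Python sketch below assumes the commonly cited form, lambda * ((1 - alpha) * sum_g sqrt(p_g) * ||beta_g||_2 + alpha * ||beta||_1); the function name and example coefficients are illustrative, and this is not the authors' R code.

```python
from math import sqrt


def sgl_penalty(beta_groups, lam: float, alpha: float) -> float:
    """Sparse group lasso penalty for coefficients split into groups.

    beta_groups: list of coefficient lists, one list per feature group
    lam:         overall regularization strength (lambda >= 0)
    alpha:       mixing weight in [0, 1]; 1 gives the lasso, 0 the group lasso
    """
    # Group term: L2 norm of each group, weighted by sqrt(group size),
    # which can drive whole groups to zero.
    group_term = sum(
        sqrt(len(g)) * sqrt(sum(b * b for b in g)) for g in beta_groups
    )
    # Element-wise L1 term, which induces sparsity inside surviving groups.
    l1_term = sum(abs(b) for g in beta_groups for b in g)
    return lam * ((1.0 - alpha) * group_term + alpha * l1_term)


# Illustrative coefficients: the second group is entirely zero.
beta = [[3.0, 4.0], [0.0, 0.0], [1.0]]
print(sgl_penalty(beta, lam=2.0, alpha=0.5))
```

A group whose coefficients are all zero contributes nothing to either term, which is how group-level selection shows up in the penalty.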
Zhao, Lijuan; Lu, Hong; Meng, Qinglei; Wang, Jinfu; Wang, Weimin; Yang, Ling; Lin, Li
2016-01-01
MicroRNAs (miRNAs) play important roles in the regulation of many biological processes in eukaryotes, including pathogen infection and host interactions. Flavobacterium columnare (FC) infection can cause great economic loss in common carp (Cyprinus carpio), which is one of the most important cultured fish in the world. However, miRNAs that respond to FC infection in common carp have not been characterized. To identify specific miRNAs involved in common carp infected with FC, we performed microRNA sequencing using livers of common carp infected with and without FC. A total of 698 miRNAs were identified, including 142 that were already identified and deposited in the miRbase database (Available online: http://www.mirbase.org/) and 556 that were only predicted miRNAs. Among the deposited miRNAs, eight were identified in common carp for the first time. Thirty of the 698 miRNAs were differentially expressed miRNAs (DIE-miRNAs) between the FC-infected and control samples. From the DIE-miRNAs, seven were selected randomly and their expression profiles were confirmed to be consistent with the microRNA sequencing results using RT-PCR and qRT-PCR. In addition, a total of 27,363 target genes of the 30 DIE-miRNAs were predicted. The target genes were enriched in five Kyoto Encyclopedia of Genes and Genomes (KEGG) pathways, including focal adhesion, extracellular matrix (ECM)-receptor interaction, erythroblastic leukemia viral oncogene homolog (ErbB) signaling pathway, regulation of actin cytoskeleton, and adherens junction. The miRNA expression profile of the liver of common carp infected with FC will pave the way for the development of effective strategies to fight against FC infection. PMID:27092486
Wang, Qian; Li, Yanwei; Dong, Hong; Wang, Li; Peng, Jinmei; An, Tongqing; Yang, Xufu; Tian, Zhijun; Cai, Xuehui
2017-02-22
The highly pathogenic porcine reproductive and respiratory syndrome virus (HP-PRRSV) continues to pose one of the greatest threats to the swine industry. M protein is the most conserved and important structural protein of PRRSV. However, information about the host cellular proteins that interact with M protein remains limited. Host cellular proteins that interact with the M protein of HP-PRRSV were immunoprecipitated from MARC-145 cells infected with PRRSV HuN4-F112 using the M monoclonal antibody (mAb). The differentially expressed proteins were identified by LC-MS/MS. The screened proteins were used for bioinformatics analysis including Gene Ontology, the interaction network, and the enriched KEGG pathways. Selected cellular proteins of interest were validated as interacting with M protein by Co-IP. The PRRSV HuN4-F112 infection group had 10 bands compared with the control group. The bands included 219 non-redundant cellular proteins that interact with M protein, which were identified by LC-MS/MS with high confidence. The Gene Ontology and Kyoto Encyclopedia of Genes and Genomes (KEGG) pathway analyses indicated that the identified proteins could be assigned to several different subcellular locations and functional classes. Functional analysis of the interactome profile highlighted cellular pathways associated with protein translation, infectious disease, and signal transduction. Two cellular proteins of interest that could interact with M protein, nuclear factor of activated T cells 45 kDa (NF45) and proliferating cell nuclear antigen (PCNA), were validated by Co-IP and confocal analyses. The interactome data between PRRSV M protein and cellular proteins contribute to the understanding of the roles of M protein in the replication and pathogenesis of PRRSV. The interactome of M protein will aid studies of virus/host interactions and provide means to decrease the threat of PRRSV to the swine industry in the future.
Lu, Xia; Kong, Jie; Luan, Sheng; Dai, Ping; Meng, Xianhong; Cao, Baoxiang; Luo, Kun
2016-01-01
In the practical farming of Litopenaeus vannamei, the intensive culture system and environmental pollution usually result in a high concentration of ammonia, which has large detrimental effects on shrimp, such as increasing the susceptibility to pathogens, reducing growth, decreasing osmoregulatory capacity, increasing the molting frequency, and even causing high mortality. However, little information is available on the molecular mechanisms of the detrimental effects of ammonia stress in shrimp. In this study, we performed comparative transcriptome analysis between ammonia-challenged and control groups from the same family of L. vannamei to identify the key genes and pathways responding to ammonia stress. The comparative transcriptome analysis identified 136 significantly differentially expressed genes that have high homology with known proteins in aquatic species, among which 94 genes are reported to be potentially related to immune function, and the rest of the genes are involved in apoptosis, growth, molting, and osmoregulation. Fourteen GO terms and 6 KEGG pathways were identified as significantly changed by ammonia stress. In these GO terms, 13 genes have been studied in aquatic species; 11 of them were reported to be potentially involved in immune defense and two were related to molting. In the significantly changed KEGG pathways, all 7 significantly changed genes have been reported in shrimp; four of them were potentially involved in immune defense and the other three were related to molting, defense against toxicity, and osmoregulation, respectively. In addition, the majority of the significantly changed genes involved in nitrogen metabolism, which plays an important role in reducing ammonia toxicity, failed to perform their protective function.
The present results provide molecular-level support for previous findings on the detrimental effects of ammonia stress in shrimp, which is a prerequisite for better understanding the molecular mechanism of immunosuppression caused by ammonia stress. PMID:27760162
Kilicoglu, Halil; Shin, Dongwook; Rindflesch, Thomas C.
2014-01-01
Gene regulatory networks are a crucial aspect of systems biology in describing molecular mechanisms of the cell. Various computational models rely on random gene selection to infer such networks from microarray data. While incorporation of prior knowledge into data analysis has been deemed important, in practice, it has generally been limited to referencing genes in probe sets and using curated knowledge bases. We investigate the impact of augmenting microarray data with semantic relations automatically extracted from the literature, with the view that relations encoding gene/protein interactions eliminate the need for random selection of components in non-exhaustive approaches, producing a more accurate model of cellular behavior. A genetic algorithm is then used to optimize the strength of interactions using microarray data and an artificial neural network fitness function. The result is a directed and weighted network providing the individual contribution of each gene to its target. For testing, we used invasive ductile carcinoma of the breast to query the literature and a microarray set containing gene expression changes in these cells over several time points. Our model demonstrates significantly better fitness than the state-of-the-art model, which relies on an initial random selection of genes. Comparison to the component pathways of the KEGG Pathways in Cancer map reveals that the resulting networks contain both known and novel relationships. The p53 pathway results were manually validated in the literature. 60% of non-KEGG relationships were supported (74% for highly weighted interactions). The method was then applied to yeast data and our model again outperformed the comparison model. Our results demonstrate the advantage of combining gene interactions extracted from the literature in the form of semantic relations with microarray analysis in generating contribution-weighted gene regulatory networks. 
This methodology can make a significant contribution to understanding the complex interactions involved in cellular behavior and molecular physiology. PMID:24921649
Chen, Guocai; Cairelli, Michael J; Kilicoglu, Halil; Shin, Dongwook; Rindflesch, Thomas C
2014-06-01
Gene regulatory networks are a crucial aspect of systems biology in describing molecular mechanisms of the cell. Various computational models rely on random gene selection to infer such networks from microarray data. While incorporation of prior knowledge into data analysis has been deemed important, in practice, it has generally been limited to referencing genes in probe sets and using curated knowledge bases. We investigate the impact of augmenting microarray data with semantic relations automatically extracted from the literature, with the view that relations encoding gene/protein interactions eliminate the need for random selection of components in non-exhaustive approaches, producing a more accurate model of cellular behavior. A genetic algorithm is then used to optimize the strength of interactions using microarray data and an artificial neural network fitness function. The result is a directed and weighted network providing the individual contribution of each gene to its target. For testing, we used invasive ductile carcinoma of the breast to query the literature and a microarray set containing gene expression changes in these cells over several time points. Our model demonstrates significantly better fitness than the state-of-the-art model, which relies on an initial random selection of genes. Comparison to the component pathways of the KEGG Pathways in Cancer map reveals that the resulting networks contain both known and novel relationships. The p53 pathway results were manually validated in the literature. 60% of non-KEGG relationships were supported (74% for highly weighted interactions). The method was then applied to yeast data and our model again outperformed the comparison model. Our results demonstrate the advantage of combining gene interactions extracted from the literature in the form of semantic relations with microarray analysis in generating contribution-weighted gene regulatory networks. 
This methodology can make a significant contribution to understanding the complex interactions involved in cellular behavior and molecular physiology.
Zhou, Wei; Song, Xiang-gang; Chen, Chao; Wang, Shu-mei; Liang, Sheng-wang
2015-08-01
This paper discusses the mechanism of action and material basis of compound Danshen dripping pills in the treatment of carotid atherosclerosis, based on gene expression profiles and molecular fingerprints. First, gene expression profiles of human atherosclerotic carotid artery tissues and histologically normal tissues were collected and screened using significance analysis of microarrays (SAM) to identify differentially expressed genes. The differential genes were then analyzed by Gene Ontology (GO) analysis and KEGG pathway analysis. To avoid missing genes whose differential expression is modest but which are biologically important, Gene Set Enrichment Analysis (GSEA) was performed, and 7 chemical ingredients with higher negative enrichment scores were obtained by the Cmap method, implying that they could reverse the gene expression profiles of pathological tissues. Last, based on the hypothesis that similar structures have similar activities, 336 ingredients of compound Danshen dripping pills were compared with the 7 drug molecules using a 2D molecular fingerprint method. The results showed that 147 differential genes, including 60 up-regulated genes and 87 down-regulated genes, were identified by SAM. In the GO analysis, Biological Process (BP) terms were mainly concerned with biological adhesion, response to wounding and inflammatory response; Cellular Component (CC) terms with extracellular region, extracellular space and plasma membrane; and Molecular Function (MF) terms with antigen binding, metalloendopeptidase activity and peptide binding. The KEGG pathway analysis mainly pointed to the JAK-STAT, RIG-I-like receptor and PPAR signaling pathways. There were 10 compounds, such as hexadecane, with Tanimoto coefficients greater than 0.85, which implied that they may be the active ingredients (AIs) of compound Danshen dripping pills in the treatment of carotid atherosclerosis (CAs).
The present method can be applied to the research on material base and molecular action mechanism of TCM.
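The fingerprint comparison described in the abstract above relies on the Tanimoto coefficient. The following is a minimal sketch of that calculation, assuming each 2D fingerprint is represented as the set of its "on" bit positions; the function name, the example fingerprints, and the reuse of the 0.85 cut-off as a simple filter are illustrative assumptions, not the authors' pipeline.

```python
def tanimoto(fp_a: set, fp_b: set) -> float:
    """Tanimoto coefficient of two fingerprints given as sets of on-bit indices."""
    if not fp_a and not fp_b:
        return 0.0  # convention chosen here for two empty fingerprints
    shared = len(fp_a & fp_b)
    return shared / (len(fp_a) + len(fp_b) - shared)


# Hypothetical fingerprints for one herbal ingredient and one reference drug.
fp_ingredient = {1, 4, 7, 9, 12, 15, 20}
fp_drug = {1, 4, 7, 9, 12, 15, 21}

score = tanimoto(fp_ingredient, fp_drug)   # 6 shared bits out of 8 distinct bits
is_candidate = score > 0.85                # threshold quoted in the abstract
print(f"Tanimoto = {score:.2f}, candidate = {is_candidate}")
```

In practice the fingerprints would come from a cheminformatics toolkit; the set-based form is used here only to keep the arithmetic visible.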
Wang, Wenbo; Zhao, Linlin; He, Zhenyu; Wu, Ning; Li, Qiuxia; Qiu, Xinjian; Zhou, Lu; Wang, Dongsheng
2018-06-12
Ge-Gen-Jiao-Tai-Wan (GGJTW) formula, derived from traditional Chinese herbal medicine, is composed of Pueraria montana var. lobata (Willd.) Sanjappa & Pradeep (Ge-Gen in Chinese), Coptis chinensis Franch (Huang-Lian), and Cinnamomum cassia (L.) J. Presl (Rou-Gui). GGJTW is used for treatment of diabetes in China, reflecting the potent hypoglycemic effect of its ingredients. However, little is known of the hypoglycemic effect of GGJTW and the underlying metabolic mechanism. This study aimed to investigate the hypoglycemic effect of GGJTW in type 2 diabetic rats and the metabolic mechanism of action. Ultra high-performance liquid chromatography coupled with quadrupole-time-of-flight tandem mass spectrometry (UHPLC-QTOF/MS)-based metabolomics approach was used for monitoring hyperglycaemia induced by high-sugar high-fat fodder and streptozotocin (STZ), and the protective effect of GGJTW. Dynamic fasting blood glucose (FBG) levels, body weight, and biochemical parameters, including lipid levels, hepatic-renal function, and hepatic histopathology were used to confirm the hyperglycaemic toxicity and attenuation effects. An orthogonal partial least squared-discriminant analysis (OPLS-DA) approach highlighted significant differences in the metabolome of the healthy control, diabetic, and drug-treated rats. The metabolomics pathway analysis (MetPA) and Kyoto encyclopedia of genes and genomes (KEGG) database were used to investigate the underlying metabolic pathways. Metabolic profiling revealed 37 metabolites as the most potential biomarker metabolites distinguishing GGJTW-treated rats from model rats. Most of the metabolites were primarily associated with bile acid metabolism and lipid metabolism. The most critical pathway was primary bile acid biosynthesis pathway involving the up-regulation of the levels of cholic acid (CA), chenodeoxycholic acid (CDCA), taurocholic acid (TCA), glycocholic acid (GCA), taurochenodesoxycholic acid (TCDCA), and taurine. 
The significantly-altered metabolite levels indicated the hypoglycemic effect of GGJTW on diabetic rats and the underlying metabolic mechanism. This study will be meaningful for the clinical application of GGJTW and valuable for further exploration of the mechanism.
Grohar: Automated Visualization of Genome-Scale Metabolic Models and Their Pathways.
Moškon, Miha; Zimic, Nikolaj; Mraz, Miha
2018-05-01
Genome-scale metabolic models (GEMs) have become a powerful tool for investigating the entire metabolism of an organism in silico. These models are, however, often extremely hard to reconstruct and also difficult to apply to a selected problem. Visualization of a GEM makes it easier to comprehend the model, to perform its graphical analysis, to find and correct faulty relations, to identify the parts of the system with a designated function, etc. Even though several approaches for the automatic visualization of GEMs have been proposed, metabolic maps are still drawn manually or at least require a large amount of manual curation. We present Grohar, a computational tool for automatic identification and visualization of GEM (sub)networks and their metabolic fluxes. These (sub)networks can be specified directly by listing the metabolites of interest or indirectly by providing reference metabolic pathways from different sources, such as KEGG, SBML, or a Matlab file. These pathways are identified within the GEM using three different pathway alignment algorithms. Grohar also supports the visualization of model adjustments (e.g., activation or inhibition of metabolic reactions) after perturbations are induced.
PaintOmics 3: a web resource for the pathway analysis and visualization of multi-omics data.
Hernández-de-Diego, Rafael; Tarazona, Sonia; Martínez-Mira, Carlos; Balzano-Nogueira, Leandro; Furió-Tarí, Pedro; Pappas, Georgios J; Conesa, Ana
2018-05-25
The increasing availability of multi-omic platforms poses new challenges to data analysis. Joint visualization of multi-omics data is instrumental in better understanding interconnections across molecular layers and in fully utilizing the multi-omic resources available to make biological discoveries. We present here PaintOmics 3, a web-based resource for the integrated visualization of multiple omic data types onto KEGG pathway diagrams. PaintOmics 3 combines server-end capabilities for data analysis with the potential of modern web resources for data visualization, providing researchers with a powerful framework for interactive exploration of their multi-omics information. Unlike other visualization tools, PaintOmics 3 covers a comprehensive pathway analysis workflow, including automatic feature name/identifier conversion, multi-layered feature matching, pathway enrichment, network analysis, interactive heatmaps, trend charts, and more. It accepts a wide variety of omic types, including transcriptomics, proteomics and metabolomics, as well as region-based approaches such as ATAC-seq or ChIP-seq data. The tool is freely available at www.paintomics.org.
Gao, Tingting; Zhao, Xin; Liu, Chenchen; Shao, Binbin; Zhang, Xi; Li, Kai; Cai, Jinyang; Wang, Su; Huang, Xiaoyan
2018-05-24
Spermatogonial stem cell (SSC) self-renewal is an indispensable part of spermatogenesis. Angiotensin I-converting enzyme (ACE) is a zinc dipeptidyl carboxypeptidase that plays a critical role in regulation of the renin-angiotensin system. Here, we used RT-PCR and Western blot analysis to confirm that somatic ACE (sACE) but not testicular ACE (tACE) is highly expressed in mouse testis before postpartum day 7 and in cultured SSCs. Our results revealed that sACE is located on the membrane of SSCs. Treating cultured SSCs with the ACE competitive inhibitor captopril was found to inhibit sACE activity, and significantly reduced the proliferation rate of SSCs. Microarray analysis identified 651 genes with significant differential expression. KEGG pathway analysis showed that these differentially expressed genes are mainly involved in the mitogen-activated protein kinase (MAPK) signaling pathway and cell cycle. sACE was found to play an important role in SSC self-renewal via the regulation of MAPK-dependent cell proliferation.
Systematic analysis of molecular mechanisms for HCC metastasis via text mining approach.
Zhen, Cheng; Zhu, Caizhong; Chen, Haoyang; Xiong, Yiru; Tan, Junyuan; Chen, Dong; Li, Jin
2017-02-21
To systematically explore the molecular mechanism of hepatocellular carcinoma (HCC) metastasis and identify regulatory genes with text mining methods. Genes with the highest frequencies and significant pathways related to HCC metastasis were listed. A handful of proteins, such as EGFR, MDM2, TP53 and APP, were identified as hub nodes in the PPI (protein-protein interaction) network. Compared with genes unique to HBV-HCCs, genes particular to HCV-HCCs were fewer, but may participate in more extensive signaling processes. VEGFA, PI3KCA, MAPK1, MMP9 and other genes may play important roles in multiple phenotypes of metastasis. Genes in abstracts of the HCC-metastasis literature were identified. Word frequency analysis, KEGG pathway analysis and PPI network analysis were performed. Then co-occurrence analysis between genes and metastasis-related phenotypes was carried out. Text mining is effective for revealing potential regulators or pathways, but its purpose should be specific, and a combination of various methods will be more useful.
Castro, Juan C; Maddox, J Dylan; Cobos, Marianela; Requena, David; Zimic, Mirko; Bombarely, Aureliano; Imán, Sixto A; Cerdeira, Luis A; Medina, Andersson E
2015-11-24
Myrciaria dubia is an Amazonian fruit shrub that produces numerous bioactive phytochemicals, but is best known for the high L-ascorbic acid (AsA) content of its fruits. Pronounced variation in AsA content has been observed both within and among individuals, but the genetic factors responsible for this variation are largely unknown. The goals of this research, therefore, were to assemble, characterize, and annotate the fruit transcriptome of M. dubia in order to reconstruct metabolic pathways and determine if multiple pathways contribute to AsA biosynthesis. In total, 24,551,882 high-quality sequence reads were de novo assembled into 70,048 unigenes (mean length = 1150 bp, N50 = 1775 bp). Assembled sequences were annotated using BLASTX against public databases such as TAIR, GR-protein, FB, MGI, RGD, ZFIN, SGN, WB, TIGR_CMR, and JCVI-CMR, with 75.2 % of unigenes having annotations. Of the three core GO annotation categories, biological processes comprised 53.6 % of the total assigned annotations, whereas cellular components and molecular functions comprised 23.3 and 23.1 %, respectively. Based on the KEGG pathway assignment of the functionally annotated transcripts, five metabolic pathways for AsA biosynthesis were identified: animal-like pathway, myo-inositol pathway, L-gulose pathway, D-mannose/L-galactose pathway, and uronic acid pathway. All transcripts coding for enzymes involved in the ascorbate-glutathione cycle were also identified. Finally, we used the assembly to identify 6314 genic microsatellites and 23,481 high-quality SNPs. This study describes the first next-generation sequencing effort and transcriptome annotation of a non-model Amazonian plant that is relevant for the production of AsA and other bioactive phytochemicals. Genes encoding key enzymes were successfully identified, and metabolic pathways involved in the biosynthesis of AsA, anthocyanins, and other compounds have been reconstructed.
The identification of these genes and pathways is in agreement with the empirically observed capability of M. dubia to synthesize and accumulate AsA and other important molecules, and adds to our current knowledge of the molecular biology and biochemistry of their production in plants. By providing insights into the mechanisms underpinning these metabolic processes, these results can be used to direct efforts to genetically manipulate this organism in order to enhance the production of these bioactive phytochemicals. The accumulation of AsA precursor and discovery of genes associated with their biosynthesis and metabolism in M. dubia is intriguing and worthy of further investigation. The sequences and pathways produced here present the genetic framework required for further studies. Quantitative transcriptomics in concert with studies of the genome, proteome, and metabolome under conditions that stimulate production and accumulation of AsA and their precursors are needed to provide a more comprehensive view of how these pathways for AsA metabolism are regulated and linked in this species.
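The assembly statistics quoted in the abstract above include an N50 value. As a reminder of what that number measures, here is a minimal sketch of the usual N50 definition (the length of the shortest sequence in the smallest set of longest sequences that together cover at least half of the total assembly length); the function name and the toy lengths are illustrative assumptions, not data from the study.

```python
def n50(lengths):
    """Return the N50 of a collection of contig or unigene lengths."""
    ordered = sorted(lengths, reverse=True)
    if not ordered:
        raise ValueError("N50 is undefined for an empty assembly")
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        # The first length at which the cumulative sum reaches half of the
        # total assembly size is the N50.
        if running >= half_total:
            return length


# Toy assembly with hypothetical unigene lengths (bp).
toy_lengths = [2, 3, 4, 5, 6, 7, 8, 9, 10]
print(n50(toy_lengths))
```

Because N50 is weighted by length, it is typically larger than the mean length, as is the case for the values reported in the abstract (mean 1150 bp, N50 1775 bp).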
Expression Profile of Long Noncoding RNAs in Human Earlobe Keloids: A Microarray Analysis
Guo, Liang; Xu, Kai; Yan, Hongbo; Feng, Haifeng
2016-01-01
Background. Long noncoding RNAs (lncRNAs) play key roles in a wide range of biological processes and their deregulation results in human disease, including keloids. Earlobe keloid is a type of pathological skin scar, and the molecular pathogenesis of this disease remains largely unknown. Methods. In this study, microarray analysis was used to determine the expression profiles of lncRNAs and mRNAs between 3 pairs of earlobe keloid and normal specimens. Gene Ontology (GO) categories and Kyoto Encyclopedia of Genes and Genomes (KEGG) pathway enrichment analyses were performed to identify the main functions of the differentially expressed genes and earlobe keloid-related pathways. Results. A total of 2068 lncRNAs and 1511 mRNAs were differentially expressed between earlobe keloid and normal tissues. Among them, 1290 lncRNAs and 1092 mRNAs were upregulated, and 778 lncRNAs and 419 mRNAs were downregulated. Pathway analysis revealed that 24 pathways were correlated to the upregulated transcripts, while 11 pathways were associated with the downregulated transcripts. Conclusion. We characterized the expression profiles of lncRNA and mRNA in earlobe keloids and suggest that lncRNAs may serve as diagnostic biomarkers for the therapy of earlobe keloid. PMID:28101509
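KEGG pathway enrichment of a gene list, as used in the abstract above and in several others in this listing, is often assessed with a one-sided hypergeometric test. The sketch below shows that calculation under this assumption; the abstract does not state which test or software was used, and the function name and all counts are hypothetical.

```python
from math import comb


def enrichment_p_value(universe: int, in_pathway: int, selected: int, overlap: int) -> float:
    """One-sided hypergeometric P(X >= overlap).

    universe   -- number of annotated background genes
    in_pathway -- background genes annotated to the pathway
    selected   -- size of the differentially expressed gene list
    overlap    -- selected genes that fall in the pathway
    """
    upper = min(in_pathway, selected)
    total = comb(universe, selected)
    tail = sum(
        comb(in_pathway, i) * comb(universe - in_pathway, selected - i)
        for i in range(overlap, upper + 1)
    )
    return tail / total


# Hypothetical counts: 1000 background genes, 40 in the pathway,
# 100 differentially expressed genes, 12 of which are in the pathway.
p = enrichment_p_value(universe=1000, in_pathway=40, selected=100, overlap=12)
print(f"enrichment p-value: {p:.3g}")
```

When many pathways are tested, the resulting p-values would normally be adjusted for multiple testing before pathways are reported as enriched.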
De novo transcriptome assembly databases for the butterfly orchid Phalaenopsis equestris
Niu, Shan-Ce; Xu, Qing; Zhang, Guo-Qiang; Zhang, Yong-Qiang; Tsai, Wen-Chieh; Hsu, Jui-Ling; Liang, Chieh-Kai; Luo, Yi-Bo; Liu, Zhong-Jian
2016-01-01
Orchids are renowned for their spectacular flowers and ecological adaptations. After the sequencing of the genome of the tropical epiphytic orchid Phalaenopsis equestris, we combined Illumina HiSeq2000 for RNA-Seq and Trinity for de novo assembly to characterize the transcriptomes for 11 diverse P. equestris tissues representing the root, stem, leaf, flower buds, column, lip, petal, sepal and three developmental stages of seeds. Our aims were to contribute to a better understanding of the molecular mechanisms driving the analysed tissue characteristics and to enrich the available data for P. equestris. Here, we present three databases. The first dataset is the RNA-Seq raw reads, which can be used to execute new experiments with different analysis approaches. The other two datasets allow different types of searches for candidate homologues. The second dataset includes the sets of assembled unigenes and predicted coding sequences and proteins, enabling a sequence-based search. The third dataset consists of the annotation results of the aligned unigenes versus the Nonredundant (Nr) protein database, Kyoto Encyclopaedia of Genes and Genomes (KEGG) and Clusters of Orthologous Groups (COG) databases with low e-values, enabling a name-based search. PMID:27673730
Sun, Xiudong; Zhou, Shumei; Meng, Fanlu; Liu, Shiqi
2012-10-01
Garlic is widely used as a spice throughout the world for the culinary value of its flavor and aroma, which are created by the chemical transformation of a series of organic sulfur compounds. To analyze the transcriptome of Allium sativum and discover the genes involved in sulfur metabolism, cDNAs derived from the total RNA of Allium sativum buds were analyzed by Illumina sequencing. Approximately 26.67 million 90 bp paired-end clean reads were obtained in two libraries. A total of 127,933 unigenes were generated by de novo assembly and were compared with the sequences in public databases. Of these, 45,286 unigenes had significant hits to the sequences in the Nr database, 29,514 showed significant similarity to known proteins in the Swiss-Prot database, and 20,706 and 21,952 unigenes had significant similarity to existing sequences in the KEGG and COG databases, respectively. Moreover, genes involved in organic sulfur biosynthesis were identified. These unigene data will provide the foundation for research on gene expression, genomics and functional genomics in Allium sativum. Key message: The obtained unigenes will provide the foundation for research on functional genomics in Allium sativum and its closely related species, and fill the gap in the existing plant EST database.
ProteinWorldDB: querying radical pairwise alignments among protein sets from complete genomes
Otto, Thomas Dan; Catanho, Marcos; Tristão, Cristian; Bezerra, Márcia; Fernandes, Renan Mathias; Elias, Guilherme Steinberger; Scaglia, Alexandre Capeletto; Bovermann, Bill; Berstis, Viktors; Lifschitz, Sergio; de Miranda, Antonio Basílio; Degrave, Wim
2010-01-01
Motivation: Many analyses in modern biological research are based on comparisons between biological sequences, resulting in functional, evolutionary and structural inferences. When large numbers of sequences are compared, heuristics are often used resulting in a certain lack of accuracy. In order to improve and validate results of such comparisons, we have performed radical all-against-all comparisons of 4 million protein sequences belonging to the RefSeq database, using an implementation of the Smith–Waterman algorithm. This extremely intensive computational approach was made possible with the help of World Community Grid™, through the Genome Comparison Project. The resulting database, ProteinWorldDB, which contains coordinates of pairwise protein alignments and their respective scores, is now made available. Users can download, compare and analyze the results, filtered by genomes, protein functions or clusters. ProteinWorldDB is integrated with annotations derived from Swiss-Prot, Pfam, KEGG, NCBI Taxonomy database and gene ontology. The database is a unique and valuable asset, representing a major effort to create a reliable and consistent dataset of cross-comparisons of the whole protein content encoded in hundreds of completely sequenced genomes using a rigorous dynamic programming approach. Availability: The database can be accessed through http://proteinworlddb.org Contact: otto@fiocruz.br PMID:20089515
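The abstract above contrasts heuristic comparisons with rigorous Smith–Waterman dynamic programming. The following is a minimal sketch of the Smith–Waterman local alignment score, assuming a simple match/mismatch scheme and a linear gap penalty; real protein comparisons (including, presumably, the one described here) use a substitution matrix and usually affine gaps, so the scoring parameters below are placeholders.

```python
def smith_waterman_score(seq_a: str, seq_b: str,
                         match: int = 2, mismatch: int = -1, gap: int = -2) -> int:
    """Best local alignment score of two sequences (score only, no traceback)."""
    previous = [0] * (len(seq_b) + 1)
    best = 0
    for i in range(1, len(seq_a) + 1):
        current = [0] * (len(seq_b) + 1)
        for j in range(1, len(seq_b) + 1):
            substitution = match if seq_a[i - 1] == seq_b[j - 1] else mismatch
            current[j] = max(
                0,                             # start a new local alignment
                previous[j - 1] + substitution,  # align the two residues
                previous[j] + gap,             # gap in seq_b
                current[j - 1] + gap,          # gap in seq_a
            )
            best = max(best, current[j])
        previous = current
    return best


print(smith_waterman_score("XXACGTXX", "YYACGTYY"))
```

The algorithm fills a table of size len(seq_a) × len(seq_b) for every pair, which is why an all-against-all comparison of millions of sequences needed grid-scale computing as described in the abstract.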
Metabolite profiling of human colon carcinoma--deregulation of TCA cycle and amino acid turnover.
Denkert, Carsten; Budczies, Jan; Weichert, Wilko; Wohlgemuth, Gert; Scholz, Martin; Kind, Tobias; Niesporek, Silvia; Noske, Aurelia; Buckendahl, Anna; Dietel, Manfred; Fiehn, Oliver
2008-09-18
Apart from genetic alterations, development and progression of colorectal cancer has been linked to influences from nutritional intake, hyperalimentation, and cellular metabolic changes that may be the basis for new diagnostic and therapeutic approaches. However, in contrast to genomics and proteomics, comprehensive metabolomic investigations of alterations in malignant tumors have rarely been conducted. In this study we investigated a set of paired samples of normal colon tissue and colorectal cancer tissue with gas-chromatography time-of-flight mass-spectrometry, which resulted in robust detection of a total of 206 metabolites. Metabolic phenotypes of colon cancer and normal tissues were different at a Bonferroni corrected significance level of p=0.00170 and p=0.00005 for the first two components of an unsupervised PCA analysis. Subsequent supervised analysis found 82 metabolites to be significantly different at p<0.01. Metabolites were connected to abnormalities in metabolic pathways by a new approach that calculates the distance of each pair of metabolites in the KEGG database interaction lattice. Intermediates of the TCA cycle and lipids were found down-regulated in cancer, whereas urea cycle metabolites, purines, pyrimidines and amino acids were generally found at higher levels compared to normal colon mucosa. This study demonstrates that metabolic profiling facilitates biochemical phenotyping of normal and neoplastic colon tissue at high significance levels and points to GC-TOF-based metabolomics as a new method for molecular pathology investigations.
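The abstract above describes relating metabolites by their distance in the KEGG interaction network. One simple way to realise such a distance is the shortest path length in an undirected metabolite graph, sketched below with breadth-first search. This is an illustrative assumption about the kind of distance meant, not the authors' published method, and the small graph is a hand-written toy rather than data retrieved from KEGG.

```python
from collections import deque


def graph_distance(adjacency: dict, source: str, target: str):
    """Number of edges on the shortest path, or None if the nodes are not connected."""
    if source == target:
        return 0
    seen = {source}
    queue = deque([(source, 0)])
    while queue:
        node, dist = queue.popleft()
        for neighbour in adjacency.get(node, ()):
            if neighbour == target:
                return dist + 1
            if neighbour not in seen:
                seen.add(neighbour)
                queue.append((neighbour, dist + 1))
    return None


# Toy undirected graph (hypothetical edges loosely following the TCA cycle).
toy_graph = {
    "citrate": ["isocitrate"],
    "isocitrate": ["citrate", "2-oxoglutarate"],
    "2-oxoglutarate": ["isocitrate", "succinyl-CoA"],
    "succinyl-CoA": ["2-oxoglutarate"],
    "urea": [],
}

print(graph_distance(toy_graph, "citrate", "succinyl-CoA"))
```

Pairs of significantly changed metabolites with small distances can then be grouped as likely members of the same perturbed pathway.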
miRNAome expression profiles in the gonads of adult Melopsittacus undulatus
Jiang, Lan; Wang, Qingqing; Yu, Jue; Gowda, Vinita; Johnson, Gabriel; Yang, Jianke
2018-01-01
The budgerigar (Melopsittacus undulatus) is one of the most widely studied parrot species, serving as an excellent animal model for behavior and neuroscience research. Until recently, it was unknown how sexual differences in the behavior, physiology, and development of organisms are regulated by differential gene expression. MicroRNAs (miRNAs) are endogenous short non-coding RNA molecules that can post-transcriptionally regulate gene expression and play a critical role in gonadal differentiation as well as early development of animals. However, very little is known about the role gonadal miRNAs play in the early development of birds. Research on the sex-biased expression of miRNAs in avian gonads is limited, and little is known about M. undulatus. In the current study, we sequenced two small non-coding RNA libraries made from the gonads of adult male and female budgerigars using Illumina paired-end sequencing technology. We obtained 254 known and 141 novel miRNAs, and randomly validated five miRNAs. Of these, three miRNAs were differentially expressed and 18 miRNAs were involved in sexual differentiation, as determined by functional analysis with GO annotation and KEGG pathway analysis. In conclusion, this work is the first report of sex-biased miRNA expression in the budgerigar, and provides additional sequences to the avian miRNAome database which will foster further functional genomic research. PMID:29666766
Rampuria, Sakshi; Joshi, Uma; Palit, Paramita; Deokar, Amit A; Meghwal, Raju R; Mohapatra, T; Srinivasan, R; Bhatt, K V; Sharma, Ramavtar
2012-11-01
Moth bean (Vigna aconitifolia (Jacq.) Marechal) is an important grain legume crop grown in rain-fed areas of the hot desert regions of Thar, India, under scorching sun with very little supplementation of water. An SSH cDNA library was generated from leaf tissues of V. aconitifolia var. RMO-40 exposed to an elevated temperature of 42 °C for 5 min to identify early-induced genes. A total of 488 unigenes (114 contigs and 374 singletons) were derived by cluster assembly and sequence alignment of 738 ESTs; out of 206 ESTs (28%) of unknown proteins, 160 ESTs (14%) were found to be novel to moth bean. Only 578 ESTs (78%) showed significant BLASTX similarity (E-value < 1 × 10^-6) in the NCBI non-redundant database. Gene ontology functional classification terms were retrieved for 479 (65%) sequences, and 339 sequences were annotated with 165 EC codes and mapped to 68 different KEGG pathways. Four hundred and fifty-two ESTs were further annotated with InterProScan (IPS), and no IPS was assigned to 153 ESTs. In addition, the expression level of 27 ESTs in response to heat stress was evaluated through a semiquantitative RT-PCR assay. Approximately 20 different signaling genes and 16 different transcription factors have been shown to be associated with heat stress in moth bean for the first time.
Hinsu, Ankit T; Parmar, Nidhi R; Nathani, Neelam M; Pandit, Ramesh J; Patel, Anand B; Patel, Amrutlal K; Joshi, Chaitanya G
2017-04-01
Recent advances in next-generation sequencing technology have enabled analysis of complex microbial communities from the genome to the transcriptome level. In the present study, a metatranscriptomic approach was applied to elucidate functionally active bacteria and their biological processes in the rumen of buffalo (Bubalus bubalis) adapted to different dietary treatments. Buffaloes were adapted to diets containing 50:50, 75:25 and 100:0 forage-to-concentrate ratios, each for 6 weeks, before ruminal content sample collection. Metatranscriptomes from rumen fiber-adherent and fiber-free active bacteria were sequenced using the Ion Torrent PGM platform, followed by annotation using the MG-RAST server and the CAZymes (carbohydrate-active enzymes) analysis toolkit. In all the samples Bacteroidetes was the most abundant phylum, followed by Firmicutes. Functional analysis using the KEGG Orthology database revealed Metabolism as the most abundant category at level 1, within which Carbohydrate metabolism dominated. Diet treatments also produced significant differences in the proportion of enzymes involved in metabolic pathways for VFA production. Carbohydrate-active enzyme (CAZy) analysis revealed an abundance of genes encoding glycoside hydrolases, with the GH13 CAZy family most highly represented in all the samples. The findings provide an overview of the activities occurring in the rumen as well as the active bacterial population and the changes occurring across different dietary treatments. Copyright © 2017 Elsevier Ltd. All rights reserved.
Gene discovery using next-generation pyrosequencing to develop ESTs for Phalaenopsis orchids
2011-01-01
Background Orchids are among the most diversified angiosperms, but few genomic resources are available for these non-model plants. In addition to its ecological significance, Phalaenopsis is economically important to the floriculture industry worldwide. We aimed to use massively parallel 454 pyrosequencing for a global characterization of the Phalaenopsis transcriptome. Results To maximize sequence diversity, we pooled RNA from 10 samples of different tissues, various developmental stages, and biotic- or abiotic-stressed plants. We obtained 206,960 expressed sequence tags (ESTs) with an average read length of 228 bp. These reads were assembled into 8,233 contigs and 34,630 singletons. The unigenes were searched against the NCBI non-redundant (NR) protein database. Based on sequence similarity with known proteins, these analyses identified 22,234 different genes (E-value cutoff, 1e-7). Assembled sequences were annotated with Gene Ontology, Gene Family and Kyoto Encyclopedia of Genes and Genomes (KEGG) pathways. Among these annotations, over 780 unigenes encoding putative transcription factors were identified. Conclusion Pyrosequencing was effective in identifying a large set of unigenes from Phalaenopsis. The informative EST dataset we developed constitutes a much-needed resource for discovery of genes involved in various biological processes in Phalaenopsis and other orchid species. These transcribed sequences will narrow the gap between the study of model organisms with many genomic resources and species that are important for ecological and evolutionary studies. PMID:21749684
DOE Office of Scientific and Technical Information (OSTI.GOV)
Gao, Jian; Luo, Mao; Zhu, Ye
2015-03-27
Viola yedoensis Makino is an important traditional Chinese medicinal plant adapted to cadmium (Cd)-polluted regions. Illumina sequencing technology was used to sequence the transcriptome of V. yedoensis Makino. We sequenced Cd-treated (VIYCd) and untreated (VIYCK) samples of V. yedoensis, and obtained 100,410,834 and 83,587,676 high quality reads, respectively. After de novo assembly and quantitative assessment, 109,800 unigenes were finally generated with an average length of 661 bp. We then obtained functional annotations by aligning unigenes with public protein databases including NR, NT, SwissProt, KEGG and COG. In addition, 892 differentially expressed genes (DEGs) were investigated between the two libraries of untreated (VIYCK) and Cd-treated (VIYCd) plants. Moreover, 15 randomly selected DEGs were further validated with qRT-PCR and the results were highly consistent with the Solexa analysis. This study is the first to generate a global analysis of the V. yedoensis transcriptome, and it will provide a basis for further studies on gene expression, genomics, and functional genomics in Violaceae. - Highlights: • A de novo assembly generated 109,800 unigenes and 54,479 of them were annotated. • 31,285 could be classified into 26 COG categories. • 263 biosynthesis pathways were predicted and classified into five categories. • 892 DEGs were detected and 15 of them were validated by qRT-PCR.
Zhou, Zhongkai; Wang, Yuyang; Jiang, Yumei; Diao, Yongjia; Strappe, Padraig; Prenzler, Paul; Ayton, Jamie; Blanchard, Chris
2016-04-28
Deep frying in oil is a popular cooking method around the world. However, the safety of deep-fried edible oil, which is ingested with fried food, is a concern because the oil is repeatedly re-used at high temperature, leading to a number of well-known chemical reactions. This study therefore investigates the changes in energy metabolism, colon histology and gut microbiota in rats following deep-fried oil consumption and explores the mechanisms involved in these alterations. Deep-fried oil was prepared following a published method. Adult male Wistar rats were randomly divided into three groups (n = 8/group). Group 1: basal diet without extra oil consumption (control group); Group 2: basal diet supplemented with non-heated canola oil (NEO group); Group 3: basal diet supplemented with deep-fried canola oil (DFEO group). Non-heated or heated oil (1.5 mL) was administered by oral gavage using a feeding needle once daily for 6 consecutive weeks. The effects of DFEO on rat body weight, KEGG pathways related to lipid metabolism, gut histology and gut microbiota were analyzed using techniques including RNA sequencing and the Illumina HiSeq sequencing platform. Among the three groups, the DFEO diet resulted in the lowest rat body weight. Metabolic pathway analysis showed 13 significantly enriched KEGG pathways in the control versus NEO comparison, and the majority of these were linked to carbohydrate, lipid and amino acid metabolism. In the NEO versus DFEO comparison, the significantly enriched functional pathways were mainly associated with chronic diseases. Among them, only one metabolic pathway (the glycerolipid metabolism pathway) was significantly enriched, indicating that inhibition of glycerolipid metabolism may be a response to the reduction in energy metabolism in the rats of the DFEO group. Related gene analysis indicated that the down-regulation of Lpin1 appears to be highly associated with the inhibition of the glycerolipid metabolism pathway. Histological analysis of the gastrointestinal tract demonstrated several changes induced by DFEO in the intestinal mucosa, with associated destruction of endocrine tissue and evidence of inflammation. Microbiota data showed that rats in the DFEO group had the lowest proportion of Prevotella and the highest proportion of Bacteroides among the three groups. In particular, rats in the DFEO group were characterized by a higher presence of Allobaculum (Firmicutes), which was not observed in the control and NEO groups. This study investigated the negative effects of DFEO on health: DFEO could impair glycerolipid metabolism, damage gut histological structure and unbalance the microbiota profile. More importantly, this is the first attempt to reveal the mechanisms involved in these changes, which may provide guidance for designing healthy diets.
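The abstract above reports "significantly enriched KEGG pathways" without naming the statistical test. A common choice for this kind of over-representation analysis is the one-sided hypergeometric test, sketched below under that assumption. The function name and all counts are hypothetical and do not come from the study.

```python
from math import comb


def enrichment_p_value(universe, pathway_size, selected, overlap):
    """One-sided hypergeometric tail P(X >= overlap).

    universe     -- number of annotated background genes
    pathway_size -- background genes that belong to the pathway
    selected     -- number of differentially expressed genes
    overlap      -- differentially expressed genes found in the pathway
    """
    if not 0 <= pathway_size <= universe:
        raise ValueError("pathway_size must lie between 0 and universe")
    if not 0 <= selected <= universe:
        raise ValueError("selected must lie between 0 and universe")
    upper = min(pathway_size, selected)
    total = comb(universe, selected)
    tail = sum(
        comb(pathway_size, i) * comb(universe - pathway_size, selected - i)
        for i in range(max(overlap, 0), upper + 1)
    )
    return tail / total


if __name__ == "__main__":
    # Hypothetical counts: 60 of 20000 background genes are in the pathway,
    # and 12 of 500 differentially expressed genes fall inside it.
    print(enrichment_p_value(20000, 60, 500, 12))
```

In practice the p-values for all tested pathways would also be corrected for multiple testing (for example with a Bonferroni or false-discovery-rate procedure) before a pathway is called enriched. That step is omitted here.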
GlycomeDB – integration of open-access carbohydrate structure databases
Ranzinger, René; Herget, Stephan; Wetter, Thomas; von der Lieth, Claus-Wilhelm
2008-01-01
Background Although carbohydrates are the third major class of biological macromolecules, after proteins and DNA, there is neither a comprehensive database for carbohydrate structures nor an established universal structure encoding scheme for computational purposes. Funding for further development of the Complex Carbohydrate Structure Database (CCSD or CarbBank) ceased in 1997, and since then several initiatives have developed independent databases with partially overlapping foci. For each database, different encoding schemes for residues and sequence topology were designed. Therefore, it is virtually impossible to obtain an overview of all deposited structures or to compare the contents of the various databases. Results We have implemented procedures which download the structures contained in the seven major databases, e.g. GLYCOSCIENCES.de, the Consortium for Functional Glycomics (CFG), the Kyoto Encyclopedia of Genes and Genomes (KEGG) and the Bacterial Carbohydrate Structure Database (BCSDB). We have created a new database called GlycomeDB, containing all structures, their taxonomic annotations and references (IDs) for the original databases. More than 100000 datasets were imported, resulting in more than 33000 unique sequences now encoded in GlycomeDB using the universal format GlycoCT. Inconsistencies were found in all public databases, which were discussed and corrected in multiple feedback rounds with the responsible curators. Conclusion GlycomeDB is a new, publicly available database for carbohydrate sequences with a unified, all-encompassing structure encoding format and NCBI taxonomic referencing. The database is updated weekly and can be downloaded free of charge. The JAVA application GlycoUpdateDB is also available for establishing and updating a local installation of GlycomeDB. With the advent of GlycomeDB, the distributed islands of knowledge in glycomics are now bridged to form a single resource. PMID:18803830
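The integration step described above (many imported datasets collapsing to fewer unique sequences, each keeping references to the original databases) can be illustrated with a small sketch. It assumes every record has already been translated into one canonical string encoding, which is the hard part and is not shown. The function name, record IDs and encoding strings are hypothetical placeholders, not real GlycoCT or real database entries.

```python
from collections import defaultdict


def merge_structures(records):
    """Group (database, local_id, encoding) records by canonical encoding.

    Returns a dict mapping each unique encoding to the list of
    (database, local_id) cross-references that share it.
    """
    merged = defaultdict(list)
    for database, local_id, encoding in records:
        merged[encoding].append((database, local_id))
    return dict(merged)


# Hypothetical imported records; the encodings stand in for canonical strings.
RECORDS = [
    ("KEGG", "id-001", "structure-A"),
    ("CFG", "id-17", "structure-A"),
    ("BCSDB", "id-9", "structure-B"),
]

if __name__ == "__main__":
    unique = merge_structures(RECORDS)
    print(len(RECORDS), "records ->", len(unique), "unique structures")
    for encoding, references in unique.items():
        print(encoding, references)
```

Keying on the canonical encoding is what makes duplicates across databases visible. Two entries that describe the same structure with different native notations only merge once both are converted to the same representation.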
Tian, Tongde; Chen, Chuanliang; Yang, Feng; Tang, Jingwen; Pei, Junwen; Shi, Bian; Zhang, Ning; Zhang, Jianhua
2017-03-01
The paper aimed to screen genetic markers applicable to the early diagnosis of colorectal cancer, to establish an apoptotic regulatory network model for colorectal cancer, and to analyze the current state of traditional Chinese medicine (TCM) targets, thereby providing theoretical evidence for early diagnosis and targeted therapy of colorectal cancer. Using databases including CNKI, VIP, Wanfang Data, PubMed, and MEDLINE as the main sources for literature retrieval, studies on genetic markers applied to the early diagnosis of colorectal cancer were retrieved and analyzed comprehensively and quantitatively by meta-analysis in order to screen genetic markers for early diagnosis of colorectal cancer. KEGG analysis was employed to establish an apoptotic regulatory network model based on the screened genetic markers, and TCM targets were optimized. Through meta-analysis, seven genetic markers were screened out: WWOX, K-ras, COX-2, P53, APC, DCC and PTEN, among which DCC had the highest diagnostic efficiency. The apoptotic regulatory network was built by KEGG analysis. Currently, TCM has been reported to have a regulatory function on gene loci in the apoptotic regulatory network. The apoptotic regulatory model of colorectal cancer established in this study provides theoretical evidence for early diagnosis and TCM targeted therapy of colorectal cancer in the clinic.
Yamamoto, Naoki; Suzuki, Tomohiro; Kobayashi, Masaaki; Dohra, Hideo; Sasaki, Yohei; Hirai, Hirofumi; Yokoyama, Koji; Kawagishi, Hirokazu; Yano, Kentaro
2014-12-03
The angel's wing oyster mushroom (Pleurocybella porrigens, Sugihiratake) is a well-known delicacy. However, its potential risk in acute encephalopathy was recently revealed by a food poisoning incident. To disclose the genes underlying the accident and provide mechanistic insight, we seek to develop an information infrastructure containing omics data. In our previous work, we sequenced the genome and transcriptome using next-generation sequencing techniques. The next step in achieving our goal is to develop a web database to facilitate the efficient mining of large-scale omics data and identification of genes specifically expressed in the mushroom. This paper introduces a web database A-WINGS (http://bioinf.mind.meiji.ac.jp/a-wings/) that provides integrated genomic and transcriptomic information for the angel's wing oyster mushroom. The database contains structure and functional annotations of transcripts and gene expressions. Functional annotations contain information on homologous sequences from NCBI nr and UniProt, Gene Ontology, and KEGG Orthology. Digital gene expression profiles were derived from RNA sequencing (RNA-seq) analysis in the fruiting bodies and mycelia. The omics information stored in the database is freely accessible through interactive and graphical interfaces by search functions that include 'GO TREE VIEW' browsing, keyword searches, and BLAST searches. The A-WINGS database will accelerate omics studies on specific aspects of the angel's wing oyster mushroom and the family Tricholomataceae.
Huang, Cong; Zhao, Fengguang; Lin, Ying; Zheng, Suiping; Liang, Shuli; Han, Shuangyan
2018-06-07
FKS1 encodes a β-1,3-glucan synthase, which is a key player in cell wall assembly in Saccharomyces cerevisiae. Here we analyzed the global transcriptomic changes in the FKS1 mutant to establish a correlation between the changes in the cell wall of the FKS1 mutant and the molecular mechanism of cell wall maintenance. These transcriptomic profiles showed that there are 1151 differentially expressed genes (DEGs) in the FKS1 mutant. KEGG pathway analysis of the DEGs showed that the MAPK pathway and seven pathways involved in carbon metabolism were significantly enriched. We found that the MAPK pathway is activated for FKS1 mutant survival and that the synthesis of cell wall components is reinforced in the FKS1 mutant. Our results confirm that the FKS1 mutant has a β-1,3-glucan defect that affects the cell wall and partly elucidate the molecular mechanism responsible for cell wall synthesis. Our greater understanding of these mechanisms helps to explain how the FKS1 mutant survives, has useful implications for the study of similar pathways in other fungi, and strengthens the theoretical foundation for the regulation of the cell wall in S. cerevisiae. Copyright © 2018 Elsevier Inc. All rights reserved.