ZBIT Bioinformatics Toolbox: A Web-Platform for Systems Biology and Expression Data Analysis
Römer, Michael; Eichner, Johannes; Dräger, Andreas; Wrzodek, Clemens; Wrzodek, Finja; Zell, Andreas
2016-01-01
Bioinformatics analysis has become an integral part of research in biology. However, installation and use of scientific software can be difficult and often requires technical expert knowledge. Reasons are dependencies on certain operating systems or required third-party libraries, missing graphical user interfaces and documentation, or nonstandard input and output formats. In order to make bioinformatics software easily accessible to researchers, we here present a web-based platform. The Center for Bioinformatics Tuebingen (ZBIT) Bioinformatics Toolbox provides web-based access to a collection of bioinformatics tools developed for systems biology, protein sequence annotation, and expression data analysis. Currently, the collection encompasses software for conversion and processing of the community standards SBML and BioPAX, transcription factor analysis, and analysis of microarray data from transcriptomics and proteomics studies. All tools are hosted on a customized Galaxy instance and run on a dedicated computation cluster. Users only need a web browser and an active internet connection in order to benefit from this service. The web platform is designed to facilitate the usage of the bioinformatics tools for researchers without advanced technical background. Users can combine tools for complex analyses or use predefined, customizable workflows. All results are stored persistently and are reproducible. For each tool, we provide documentation, tutorials, and example data to maximize usability. The ZBIT Bioinformatics Toolbox is freely available at https://webservices.cs.uni-tuebingen.de/. PMID:26882475
Bioinformatics/biostatistics: microarray analysis.
Eichler, Gabriel S
2012-01-01
The quantity and complexity of the molecular-level data generated in both research and clinical settings require the use of sophisticated, powerful computational interpretation techniques. It is for this reason that bioinformatic analysis of complex molecular profiling data has become a fundamental technology in the development of personalized medicine. This chapter provides a high-level overview of the field of bioinformatics and outlines several classic bioinformatic approaches. The highlighted approaches can be aptly applied to nearly any sort of high-dimensional genomic, proteomic, or metabolomic experiment. Reviewed technologies in this chapter include traditional clustering analysis, the Gene Expression Dynamics Inspector (GEDI), GoMiner, Gene Set Enrichment Analysis (GSEA), and the Learner of Functional Enrichment (LeFE).
Teaching bioinformatics and neuroinformatics by using free web-based tools.
Grisham, William; Schottler, Natalie A; Valli-Marill, Joanne; Beck, Lisa; Beatty, Jackson
2010-01-01
This completely computer-based module's purpose is to introduce students to bioinformatics resources. We present an easy-to-adopt module that weaves together several important bioinformatic tools so students can grasp how these tools are used in answering research questions. Students integrate information gathered from websites dealing with anatomy (Mouse Brain Library), quantitative trait locus analysis (WebQTL from GeneNetwork), bioinformatics and gene expression analyses (University of California, Santa Cruz Genome Browser, National Center for Biotechnology Information's Entrez Gene, and the Allen Brain Atlas), and information resources (PubMed). Instructors can use these various websites in concert to teach genetics from the phenotypic level to the molecular level, aspects of neuroanatomy and histology, statistics, quantitative trait locus analysis, and molecular biology (including in situ hybridization and microarray analysis), and to introduce bioinformatic resources. Students use these resources to discover 1) the region(s) of chromosome(s) influencing the phenotypic trait, 2) a list of candidate genes-narrowed by expression data, 3) the in situ pattern of a given gene in the region of interest, 4) the nucleotide sequence of the candidate gene, and 5) articles describing the gene. Teaching materials such as a detailed student/instructor's manual, PowerPoints, sample exams, and links to free Web resources can be found at http://mdcune.psych.ucla.edu/modules/bioinformatics.
Rahpeyma, Mehdi; Fotouhi, Fatemeh; Makvandi, Manouchehr; Ghadiri, Ata; Samarbaf-Zadeh, Alireza
2015-11-01
Crimean-Congo hemorrhagic fever virus (CCHFV) is a member of the genus Nairovirus in the family Bunyaviridae and causes a life-threatening disease in humans. Currently, there is no vaccine against CCHFV, and the detailed structure of CCHFV proteins remains undefined. The CCHFV M RNA segment encodes two viral surface glycoproteins known as Gn and Gc. Viral glycoproteins can be considered key targets for vaccine development. The current study aimed to investigate the structural bioinformatics of the CCHFV Gn protein and to design a construct for a recombinant bacmid that expresses Gn in the baculovirus system, so that Gn produced in insect cells can be used as an antigen in animal-model vaccine studies. Bioinformatic analysis of the CCHFV Gn protein was performed, and a construct was designed and cloned into the pFastBacHTb vector; a recombinant Gn-bacmid was then generated with the Bac-to-Bac system. The primary, secondary, and 3D structures of CCHFV Gn were obtained, and PCR with M13 forward and reverse primers confirmed the generation of recombinant bacmid DNA harboring the Gn coding region under the polyhedrin promoter. Characterization of the detailed structure of CCHFV Gn by bioinformatics software provides the basis for the development of new experiments and for the construction of a recombinant bacmid harboring CCHFV Gn, which is valuable for designing a recombinant vaccine against deadly pathogens like CCHFV.
Development of a Web-Enabled Informatics Platform for Manipulation of Gene Expression Data
2004-12-01
genomic platforms such as metabolomics and proteomics, and to federated databases for knowledge management. A successful SBIR Phase I completed...measurements that require sophisticated bioinformatic platforms for data archival, management, integration, and analysis if researchers are to derive...web-enabled bioinformatic platform consisting of a Laboratory Information Management System (LIMS), an Analysis Information Management System (AIMS
Rahpeyma, Mehdi; Fotouhi, Fatemeh; Makvandi, Manouchehr; Ghadiri, Ata; Samarbaf-Zadeh, Alireza
2015-01-01
Background Crimean-Congo hemorrhagic fever virus (CCHFV) is a member of the genus Nairovirus in the family Bunyaviridae and causes a life-threatening disease in humans. Currently, there is no vaccine against CCHFV, and the detailed structure of CCHFV proteins remains undefined. The CCHFV M RNA segment encodes two viral surface glycoproteins known as Gn and Gc. Viral glycoproteins can be considered key targets for vaccine development. Objectives The current study aimed to investigate the structural bioinformatics of the CCHFV Gn protein and to design a construct for a recombinant bacmid that expresses Gn in the baculovirus system. Materials and Methods The goal was to express the Gn protein in insect cells for use as an antigen in animal-model vaccine studies. Bioinformatic analysis of the CCHFV Gn protein was performed, and a construct was designed and cloned into the pFastBacHTb vector; a recombinant Gn-bacmid was then generated with the Bac-to-Bac system. Results The primary, secondary, and 3D structures of CCHFV Gn were obtained, and PCR with M13 forward and reverse primers confirmed the generation of recombinant bacmid DNA harboring the Gn coding region under the polyhedrin promoter. Conclusions Characterization of the detailed structure of CCHFV Gn by bioinformatics software provides the basis for the development of new experiments and for the construction of a recombinant bacmid harboring CCHFV Gn, which is valuable for designing a recombinant vaccine against deadly pathogens like CCHFV. PMID:26862379
Integration of QTL and bioinformatic tools to identify candidate genes for triglycerides in mice
Leduc, Magalie S.; Hageman, Rachael S.; Verdugo, Ricardo A.; Tsaih, Shirng-Wern; Walsh, Kenneth; Churchill, Gary A.; Paigen, Beverly
2011-01-01
To identify genetic loci influencing lipid levels, we performed quantitative trait loci (QTL) analysis between inbred mouse strains MRL/MpJ and SM/J, measuring triglyceride levels at 8 weeks of age in F2 mice fed a chow diet. We identified one significant QTL on chromosome (Chr) 15 and three suggestive QTL on Chrs 2, 7, and 17. We also carried out microarray analysis on the livers of parental strains of 282 F2 mice and used these data to find cis-regulated expression QTL. We then narrowed the list of candidate genes under significant QTL using a “toolbox” of bioinformatic resources, including haplotype analysis; parental strain comparison for gene expression differences and nonsynonymous coding single nucleotide polymorphisms (SNP); cis-regulated eQTL in livers of F2 mice; correlation between gene expression and phenotype; and conditioning of expression on the phenotype. We suggest Slc25a7 as a candidate gene for the Chr 7 QTL and, based on expression differences, five genes (Polr3h, Cyp2d22, Cyp2d26, Tspo, and Ttll12) as candidate genes for Chr 15 QTL. This study shows how bioinformatics can be used effectively to reduce candidate gene lists for QTL related to complex traits. PMID:21622629
van Haaften, Rachel I M; Luceri, Cristina; van Erk, Arie; Evelo, Chris T A
2009-06-01
Omics technology used for large-scale measurements of gene expression is rapidly evolving. This work points out the need for extensive bioinformatics analyses for array quality assessment before and after gene expression clustering and pathway analysis. A study focused on the effect of red wine polyphenols on rat colon mucosa was used to test the impact of quality control and normalisation steps on the biological conclusions. The integration of data visualization, pathway analysis and clustering revealed an artifact problem that was solved with an adapted normalisation. We propose a possible point-to-point standard analysis procedure, based on a combination of clustering and data visualization, for the analysis of microarray data.
Rot, Gregor; Parikh, Anup; Curk, Tomaz; Kuspa, Adam; Shaulsky, Gad; Zupan, Blaz
2009-08-25
Bioinformatics often leverages on recent advancements in computer science to support biologists in their scientific discovery process. Such efforts include the development of easy-to-use web interfaces to biomedical databases. Recent advancements in interactive web technologies require us to rethink the standard submit-and-wait paradigm, and craft bioinformatics web applications that share analytical and interactive power with their desktop relatives, while retaining simplicity and availability. We have developed dictyExpress, a web application that features a graphical, highly interactive explorative interface to our database that consists of more than 1000 Dictyostelium discoideum gene expression experiments. In dictyExpress, the user can select experiments and genes, perform gene clustering, view gene expression profiles across time, view gene co-expression networks, perform analyses of Gene Ontology term enrichment, and simultaneously display expression profiles for a selected gene in various experiments. Most importantly, these tasks are achieved through web applications whose components are seamlessly interlinked and immediately respond to events triggered by the user, thus providing a powerful explorative data analysis environment. dictyExpress is a precursor for a new generation of web-based bioinformatics applications with simple but powerful interactive interfaces that resemble that of the modern desktop. While dictyExpress serves mainly the Dictyostelium research community, it is relatively easy to adapt it to other datasets. We propose that the design ideas behind dictyExpress will influence the development of similar applications for other model organisms.
Rot, Gregor; Parikh, Anup; Curk, Tomaz; Kuspa, Adam; Shaulsky, Gad; Zupan, Blaz
2009-01-01
Background Bioinformatics often leverages on recent advancements in computer science to support biologists in their scientific discovery process. Such efforts include the development of easy-to-use web interfaces to biomedical databases. Recent advancements in interactive web technologies require us to rethink the standard submit-and-wait paradigm, and craft bioinformatics web applications that share analytical and interactive power with their desktop relatives, while retaining simplicity and availability. Results We have developed dictyExpress, a web application that features a graphical, highly interactive explorative interface to our database that consists of more than 1000 Dictyostelium discoideum gene expression experiments. In dictyExpress, the user can select experiments and genes, perform gene clustering, view gene expression profiles across time, view gene co-expression networks, perform analyses of Gene Ontology term enrichment, and simultaneously display expression profiles for a selected gene in various experiments. Most importantly, these tasks are achieved through web applications whose components are seamlessly interlinked and immediately respond to events triggered by the user, thus providing a powerful explorative data analysis environment. Conclusion dictyExpress is a precursor for a new generation of web-based bioinformatics applications with simple but powerful interactive interfaces that resemble that of the modern desktop. While dictyExpress serves mainly the Dictyostelium research community, it is relatively easy to adapt it to other datasets. We propose that the design ideas behind dictyExpress will influence the development of similar applications for other model organisms. PMID:19706156
Soulet, Fabienne; Kilarski, Witold W.; Roux-Dalvai, Florence; Herbert, John M. J.; Sacewicz, Izabela; Mouton-Barbosa, Emmanuelle; Bicknell, Roy; Lalor, Patricia; Monsarrat, Bernard; Bikfalvi, Andreas
2013-01-01
In order to map the extracellular or membrane proteome associated with the vasculature and the stroma in an embryonic organism in vivo, we developed a biotinylation technique for chicken embryo and combined it with mass spectrometry and bioinformatic analysis. We also applied this procedure to implanted tumors growing on the chorioallantoic membrane or after the induction of granulation tissue. Membrane and extracellular matrix proteins were the most abundant components identified. Relative quantitative analysis revealed differential protein expression patterns in several tissues. Through a bioinformatic approach, we determined endothelial cell protein expression signatures, which allowed us to identify several proteins not yet reported to be associated with endothelial cells or the vasculature. This is the first study reported so far that applies in vivo biotinylation, in combination with robust label-free quantitative proteomics approaches and bioinformatic analysis, to an embryonic organism. It also provides the first description of the vascular and matrix proteome of the embryo that might constitute the starting point for further developments. PMID:23674615
Systemic bioinformatics analysis of skeletal muscle gene expression profiles of sepsis
Yang, Fang; Wang, Yumei
2018-01-01
Sepsis is a type of systemic inflammatory response syndrome with high morbidity and mortality. Skeletal muscle dysfunction is one of the major complications of sepsis that may also influence the outcome of sepsis. The aim of the present study was to explore and identify potential mechanisms and therapeutic targets of sepsis. Systemic bioinformatics analysis of skeletal muscle gene expression profiles from the Gene Expression Omnibus was performed. Differentially expressed genes (DEGs) in samples from patients with sepsis and control samples were screened out using the limma package. Differential co-expression and coregulation (DCE and DCR, respectively) analysis was performed based on the Differential Co-expression Analysis package to identify differences in gene co-expression and coregulation patterns between the control and sepsis groups. Gene Ontology terms and Kyoto Encyclopedia of Genes and Genomes pathways of DEGs were identified using the Database for Annotation, Visualization and Integrated Discovery, and inflammatory, cancer and skeletal muscle development-associated biological processes and pathways were identified. DCE and DCR analysis revealed several potential therapeutic targets for sepsis, including genes and transcription factors. The results of the present study may provide a basis for the development of novel therapeutic targets and treatment methods for sepsis. PMID:29805480
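The screening step for differentially expressed genes can be illustrated with a minimal sketch. The study itself used the limma R package; the Python below is not limma and does not reproduce its moderated statistics. It assumes log2-scale expression values and a plain Welch t statistic, and the gene names, values, and cutoffs are hypothetical.

```python
import math
from statistics import mean, variance


def welch_t(control, case):
    """Welch t statistic for case minus control (unequal variances allowed)."""
    se = math.sqrt(variance(control) / len(control) + variance(case) / len(case))
    if se == 0.0:
        return 0.0  # no spread in either group: treat as uninformative
    return (mean(case) - mean(control)) / se


def screen_degs(expr, n_control, min_abs_log_fc=1.0, min_abs_t=2.0):
    """Return {gene: (log2 fold change, t)} for genes passing both cutoffs.

    expr maps a gene name to log2 expression values, control samples first.
    The cutoffs are illustrative assumptions, not values from the study.
    """
    hits = {}
    for gene, values in expr.items():
        control, case = values[:n_control], values[n_control:]
        log_fc = mean(case) - mean(control)
        t = welch_t(control, case)
        if abs(log_fc) >= min_abs_log_fc and abs(t) >= min_abs_t:
            hits[gene] = (log_fc, t)
    return hits


# Hypothetical data: three control samples followed by three sepsis samples.
example = {
    "geneX": [5.0, 5.2, 4.8, 7.1, 6.9, 7.0],
    "geneY": [6.0, 6.1, 5.9, 6.0, 6.2, 5.9],
}
degs = screen_degs(example, n_control=3)
```

In this toy input only geneX passes both cutoffs. A real analysis would also correct for multiple testing across thousands of genes, which this sketch omits.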
Mirel, Barbara; Görg, Carsten
2014-04-26
A common class of biomedical analysis is to explore expression data from high throughput experiments for the purpose of uncovering functional relationships that can lead to a hypothesis about mechanisms of a disease. We call this analysis expression driven, -omics hypothesizing. In it, scientists use interactive data visualizations and read deeply in the research literature. Little is known, however, about the actual flow of reasoning and behaviors (sense making) that scientists enact in this analysis, end-to-end. Understanding this flow is important because if bioinformatics tools are to be truly useful they must support it. Sense making models of visual analytics in other domains have been developed and used to inform the design of useful and usable tools. We believe they would be helpful in bioinformatics. To characterize the sense making involved in expression-driven, -omics hypothesizing, we conducted an in-depth observational study of one scientist as she engaged in this analysis over six months. From findings, we abstracted a preliminary sense making model. Here we describe its stages and suggest guidelines for developing visualization tools that we derived from this case. A single case cannot be generalized. But we offer our findings, sense making model and case-based tool guidelines as a first step toward increasing interest and further research in the bioinformatics field on scientists' analytical workflows and their implications for tool design.
2014-01-01
A common class of biomedical analysis is to explore expression data from high throughput experiments for the purpose of uncovering functional relationships that can lead to a hypothesis about mechanisms of a disease. We call this analysis expression driven, -omics hypothesizing. In it, scientists use interactive data visualizations and read deeply in the research literature. Little is known, however, about the actual flow of reasoning and behaviors (sense making) that scientists enact in this analysis, end-to-end. Understanding this flow is important because if bioinformatics tools are to be truly useful they must support it. Sense making models of visual analytics in other domains have been developed and used to inform the design of useful and usable tools. We believe they would be helpful in bioinformatics. To characterize the sense making involved in expression-driven, -omics hypothesizing, we conducted an in-depth observational study of one scientist as she engaged in this analysis over six months. From findings, we abstracted a preliminary sense making model. Here we describe its stages and suggest guidelines for developing visualization tools that we derived from this case. A single case cannot be generalized. But we offer our findings, sense making model and case-based tool guidelines as a first step toward increasing interest and further research in the bioinformatics field on scientists’ analytical workflows and their implications for tool design. PMID:24766796
Li, Chen; Shen, Weixing; Shen, Sheng; Ai, Zhilong
2013-12-01
To explore the molecular mechanisms of cholangiocarcinoma (CC), microarray technology was used to find biomarkers for early detection and diagnosis. The gene expression profiles from 6 patients with CC and 5 normal controls were downloaded from Gene Expression Omnibus and compared. As a result, 204 differentially co-expressed genes (DCGs) in CC patients compared to normal controls were identified using a computational bioinformatics analysis. These genes were mainly involved in coenzyme metabolic process, peptidase activity and oxidation reduction. A regulatory network was constructed by mapping the DCGs to known regulation data. Four transcription factors, FOXC1, ZIC2, NKX2-2 and GCGR, were hub nodes in the network. In conclusion, this study provides a set of targets useful for future investigations into molecular biomarker studies. Copyright © 2013 Elsevier Ltd. All rights reserved.
An integrated bioinformatics analysis to dissect kinase dependency in triple negative breast cancer.
Ryall, Karen A; Kim, Jihye; Klauck, Peter J; Shin, Jimin; Yoo, Minjae; Ionkina, Anastasia; Pitts, Todd M; Tentler, John J; Diamond, Jennifer R; Eckhardt, S Gail; Heasley, Lynn E; Kang, Jaewoo; Tan, Aik Choon
2015-01-01
Triple-Negative Breast Cancer (TNBC) is an aggressive disease with a poor prognosis. Clinically, TNBC patients have limited treatment options besides chemotherapy. The goal of this study was to determine the kinase dependency in TNBC cell lines and to predict compounds that could inhibit these kinases using integrative bioinformatics analysis. We integrated publicly available gene expression data, high-throughput pharmacological profiling data, and quantitative in vitro kinase binding data to determine the kinase dependency in 12 TNBC cell lines. We employed Kinase Addiction Ranker (KAR), a novel bioinformatics approach, which integrated these data sources to dissect kinase dependency in TNBC cell lines. We then used the kinase dependency predicted by KAR for each TNBC cell line to query K-Map for compounds targeting these kinases. We validated our predictions using published and new experimental data. In summary, we implemented an integrative bioinformatics analysis that determines kinase dependency in TNBC. Our analysis revealed candidate kinases as potential targets in TNBC for further pharmacological and biological studies.
BIAS: Bioinformatics Integrated Application Software.
Finak, G; Godin, N; Hallett, M; Pepin, F; Rajabi, Z; Srivastava, V; Tang, Z
2005-04-15
We introduce a development platform especially tailored to Bioinformatics research and software development. BIAS (Bioinformatics Integrated Application Software) provides the tools necessary for carrying out integrative Bioinformatics research requiring multiple datasets and analysis tools. It follows an object-relational strategy for providing persistent objects, allows third-party tools to be easily incorporated within the system and supports standards and data-exchange protocols common to Bioinformatics. BIAS is an OpenSource project and is freely available to all interested users at http://www.mcb.mcgill.ca/~bias/. This website also contains a paper containing a more detailed description of BIAS and a sample implementation of a Bayesian network approach for the simultaneous prediction of gene regulation events and of mRNA expression from combinations of gene regulation events.
VLSI Microsystem for Rapid Bioinformatic Pattern Recognition
NASA Technical Reports Server (NTRS)
Fang, Wai-Chi; Lue, Jaw-Chyng
2009-01-01
A system comprising very-large-scale integrated (VLSI) circuits is being developed as a means of bioinformatics-oriented analysis and recognition of patterns of fluorescence generated in a microarray in an advanced, highly miniaturized, portable genetic-expression-assay instrument. Such an instrument implements an on-chip combination of polymerase chain reactions and electrochemical transduction for amplification and detection of deoxyribonucleic acid (DNA).
Introduction to bioinformatics.
Can, Tolga
2014-01-01
Bioinformatics is an interdisciplinary field mainly involving molecular biology and genetics, computer science, mathematics, and statistics. Data intensive, large-scale biological problems are addressed from a computational point of view. The most common problems are modeling biological processes at the molecular level and making inferences from collected data. A bioinformatics solution usually involves the following steps: Collect statistics from biological data. Build a computational model. Solve a computational modeling problem. Test and evaluate a computational algorithm. This chapter gives a brief introduction to bioinformatics by first providing an introduction to biological terminology and then discussing some classical bioinformatics problems organized by the types of data sources. Sequence analysis is the analysis of DNA and protein sequences for clues regarding function and includes subproblems such as identification of homologs, multiple sequence alignment, searching sequence patterns, and evolutionary analyses. Protein structures are three-dimensional data and the associated problems are structure prediction (secondary and tertiary), analysis of protein structures for clues regarding function, and structural alignment. Gene expression data is usually represented as matrices and analysis of microarray data mostly involves statistics analysis, classification, and clustering approaches. Biological networks such as gene regulatory networks, metabolic pathways, and protein-protein interaction networks are usually modeled as graphs and graph theoretic approaches are used to solve associated problems such as construction and analysis of large-scale networks.
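The statement that biological networks are usually modeled as graphs can be made concrete with a small sketch. It stores an undirected interaction network as an adjacency mapping and finds its connected components with breadth-first search. The node names and edges are hypothetical placeholders, not data from the chapter.

```python
from collections import defaultdict, deque


def build_graph(edges):
    """Build an undirected adjacency mapping from (node, node) pairs."""
    graph = defaultdict(set)
    for u, v in edges:
        graph[u].add(v)
        graph[v].add(u)
    return graph


def connected_components(graph):
    """Return a list of node sets, one per connected component (BFS)."""
    seen = set()
    components = []
    for start in list(graph):
        if start in seen:
            continue
        seen.add(start)
        component = {start}
        queue = deque([start])
        while queue:
            node = queue.popleft()
            for neighbour in graph[node]:
                if neighbour not in seen:
                    seen.add(neighbour)
                    component.add(neighbour)
                    queue.append(neighbour)
        components.append(component)
    return components


# Hypothetical protein-protein interactions.
interactions = [("A", "B"), ("B", "C"), ("D", "E")]
network = build_graph(interactions)
modules = sorted(connected_components(network), key=len, reverse=True)
degree_of_b = len(network["B"])
```

Node degree and connected components are among the simplest graph-theoretic summaries; larger networks would typically call for a dedicated graph library.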
Stevens, David Cole; Conway, Kyle R.; Pearce, Nelson; Villegas-Peñaranda, Luis Roberto; Garza, Anthony G.; Boddy, Christopher N.
2013-01-01
Background Heterologous expression of bacterial biosynthetic gene clusters is currently an indispensable tool for characterizing biosynthetic pathways. Development of an effective, general heterologous expression system that can be applied to bioprospecting from metagenomic DNA will enable the discovery of a wealth of new natural products. Methodology We have developed a new Escherichia coli-based heterologous expression system for polyketide biosynthetic gene clusters. We have demonstrated the over-expression of the alternative sigma factor σ54 directly and positively regulates heterologous expression of the oxytetracycline biosynthetic gene cluster in E. coli. Bioinformatics analysis indicates that σ54 promoters are present in nearly 70% of polyketide and non-ribosomal peptide biosynthetic pathways. Conclusions We have demonstrated a new mechanism for heterologous expression of the oxytetracycline polyketide biosynthetic pathway, where high-level pleiotropic sigma factors from the heterologous host directly and positively regulate transcription of the non-native biosynthetic gene cluster. Our bioinformatics analysis is consistent with the hypothesis that heterologous expression mediated by the alternative sigma factor σ54 may be a viable method for the production of additional polyketide products. PMID:23724102
Correspondence regarding Zhong et al., BMC Bioinformatics 2013 Mar 7;14:89.
Kuhn, Alexandre
2014-11-28
Computational expression deconvolution aims to estimate the contribution of individual cell populations to expression profiles measured in samples of heterogeneous composition. Zhong et al. recently proposed Digital Sorting Algorithm (BMC Bioinformatics 2013 Mar 7;14:89) and showed that they could accurately estimate population-specific expression levels and expression differences between two populations. They compared DSA with Population-Specific Expression Analysis (PSEA), a previous deconvolution method that we developed to detect expression changes occurring within the same population between two conditions (e.g. disease versus non-disease). However, Zhong et al. compared PSEA-derived specific expression levels across different cell populations. Specific expression levels obtained with PSEA cannot be directly compared across different populations as they are on a relative scale. They are accurate as we demonstrate by deconvolving the same dataset used by Zhong et al. and, importantly, allow for comparison of population-specific expression across conditions.
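The idea behind expression deconvolution can be sketched with a generic linear mixing model. This is neither PSEA nor DSA as published; it only assumes that the measured expression of a gene in a mixed sample is the proportion-weighted sum of two population-specific expression levels, and that the proportions are known. All numbers are hypothetical.

```python
def deconvolve_two_populations(mixed, frac_a):
    """Least-squares estimate of (expr_a, expr_b).

    Assumed model: mixed[i] = frac_a[i] * expr_a + (1 - frac_a[i]) * expr_b.
    Solves the 2x2 normal equations directly, so no external library is needed.
    """
    s11 = sum(f * f for f in frac_a)
    s12 = sum(f * (1 - f) for f in frac_a)
    s22 = sum((1 - f) ** 2 for f in frac_a)
    t1 = sum(f * y for f, y in zip(frac_a, mixed))
    t2 = sum((1 - f) * y for f, y in zip(frac_a, mixed))
    det = s11 * s22 - s12 * s12
    if abs(det) < 1e-12:
        raise ValueError("proportions must vary across samples")
    expr_a = (t1 * s22 - s12 * t2) / det
    expr_b = (s11 * t2 - s12 * t1) / det
    return expr_a, expr_b


# Hypothetical samples: fraction of population A and the measured mixed signal,
# generated from expr_a = 10 and expr_b = 2 without noise.
fractions = [0.2, 0.5, 0.8]
measured = [3.6, 6.0, 8.4]
est_a, est_b = deconvolve_two_populations(measured, fractions)
```

With noise-free input the fit recovers the two generating values up to floating-point rounding. The estimates are only as good as the assumed proportions, which is one reason scale and comparability across populations need care.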
DOE Office of Scientific and Technical Information (OSTI.GOV)
Kalra, Rajkumar S.; Wadhwa, Renu
2015-02-27
Epithelial membrane antigen (EMA or MUC1) is a heavily glycosylated, type I transmembrane glycoprotein commonly expressed by epithelial cells of duct organs. It has been shown to be aberrantly glycosylated in several diseases including cancer. Protein sequence based annotation and analysis of the glycosylation profile of glycoproteins by robust computational and comprehensive algorithms provides possible insights into the mechanism(s) of anomalous glycosylation. In the present report, by using a number of bioinformatics applications we studied EMA/MUC1 and explored its trans-membrane structural domain sequence that is widely subjected to glycosylation. Exploration of different extracellular motifs led to prediction of N and O-linked glycosylation target sites. Based on the putative O-linked target sites, glycosylated moieties and pathways were envisaged. Furthermore, protein network analysis demonstrated physical interaction of EMA with a number of proteins and confirmed its functional involvement in cell growth and proliferation pathways. Gene Ontology analysis suggested an involvement of EMA in a number of functions including signal transduction, protein binding, processing and transport along with glycosylation. Thus, the present study explored the potential of a bioinformatics prediction approach in analyzing glycosylation, co-expression and interaction patterns of the EMA/MUC1 glycoprotein.
Serial analysis of gene expression in a rat lung model of asthma.
Yin, Lei-Miao; Jiang, Gong-Hao; Wang, Yu; Wang, Yan; Liu, Yan-Yan; Jin, Wei-Rong; Zhang, Zen; Xu, Yu-Dong; Yang, Yong-Qing
2008-11-01
The pathogenesis and molecular mechanism underlying asthma remain undetermined. The purpose of this study was to identify genes and pathways involved in the early airway response (EAR) phase of asthma by using serial analysis of gene expression (SAGE). Two SAGE tag libraries of lung tissues derived from a rat model of asthma and controls were generated. Bioinformatic analyses were carried out using the Database for Annotation, Visualization and Integrated Discovery Functional Annotation Tool, Gene Ontology (GO) TreeMachine and Kyoto Encyclopedia of Genes and Genomes (KEGG) pathway analysis. A total of 26 552 SAGE tags of asthmatic rat lung were obtained, of which 12 221 were unique tags. Of the unique tags, 55.5% were matched with known genes. By comparison of the two libraries, 186 differentially expressed tags (P < 0.05) were identified, of which 103 were upregulated and 83 were downregulated. Using the bioinformatic tools these genes were classified into 23 functional groups, 15 KEGG pathways and 37 enriched GO categories. The bioinformatic analyses of gene distribution, enriched categories and the involvement of specific pathways in the SAGE libraries have provided information on regulatory networks of the EAR phase of asthma. Analyses of the regulated genes of interest may inform new hypotheses, increase our understanding of the disease and provide a foundation for future research.
Kang, Yuan; Dong, Xinran; Zhou, Qiongjie; Zhang, Ying; Cheng, Yan; Hu, Rong; Su, Cuihong; Jin, Hong; Liu, Xiaohui; Ma, Duan; Tian, Weidong; Li, Xiaotian
2012-03-01
This study aimed to identify candidate protein biomarkers from maternal serum for Down syndrome (DS) by integrated proteomic and bioinformatics analysis. A pregnancy DS group of 18 women and a control group with the same number were prepared, and the maternal serum proteins were analyzed by isobaric tags for relative and absolute quantitation and mass spectrometry, to identify DS differentially expressed maternal serum proteins (DS-DEMSPs). Comprehensive bioinformatics analysis was then employed to analyze DS-DEMSPs both in this paper and seven related publications. Down syndrome differentially expressed maternal serum proteins from different studies are significantly enriched with common Gene Ontology functions, Kyoto Encyclopedia of Genes and Genomes (KEGG) pathways, transcription factor binding sites, and Pfam protein domains. However, the DS-DEMSPs are less functionally related to known DS-related genes. This evidence suggests that common molecular mechanisms induced by secondary effects may be present upon DS carrying. A simple scoring scheme revealed the following proteins as potential DS biomarkers: Alpha-2-macroglobulin; Apolipoprotein A1; Apolipoprotein E; Complement C1s subcomponent; Complement component 5; Complement component 8, alpha polypeptide; Complement component 8, beta polypeptide; and Fibronectin. The integration of proteomics and bioinformatics studies provides a novel approach to develop new prenatal screening methods for noninvasive yet accurate diagnosis of DS. Copyright © 2012 John Wiley & Sons, Ltd.
Revealing biological information using data structuring and automated learning.
Mohorianu, Irina; Moulton, Vincent
2010-11-01
The intermediary steps between a biological hypothesis, concretized in the input data, and meaningful results, validated using biological experiments, commonly employ bioinformatics tools. Starting with storage of the data and ending with a statistical analysis of the significance of the results, every step in a bioinformatics analysis has been intensively studied and the resulting methods and models patented. This review summarizes the bioinformatics patents that have been developed mainly for the study of genes, and points out the universal applicability of bioinformatics methods to other related studies such as RNA interference. More specifically, we overview the steps undertaken in the majority of bioinformatics analyses, highlighting, for each, various approaches that have been developed to reveal details from different perspectives. First we consider data warehousing, the first task that has to be performed efficiently, optimizing the structure of the database, in order to facilitate both the subsequent steps and the retrieval of information. Next, we review data mining, which occupies the central part of most bioinformatics analyses, presenting patents concerning differential expression, unsupervised and supervised learning. Last, we discuss how networks of interactions of genes or other players in the cell may be created, which help draw biological conclusions and have been described in several patents.
RAP: RNA-Seq Analysis Pipeline, a new cloud-based NGS web application.
D'Antonio, Mattia; D'Onorio De Meo, Paolo; Pallocca, Matteo; Picardi, Ernesto; D'Erchia, Anna Maria; Calogero, Raffaele A; Castrignanò, Tiziana; Pesole, Graziano
2015-01-01
The study of RNA has been dramatically improved by the introduction of Next Generation Sequencing platforms allowing massive and cheap sequencing of selected RNA fractions, also providing information on strand orientation (RNA-Seq). The complexity of transcriptomes and of their regulative pathways makes RNA-Seq one of the most complex fields of NGS applications, addressing several aspects of the expression process (e.g. identification and quantification of expressed genes and transcripts, alternative splicing and polyadenylation, fusion genes and trans-splicing, post-transcriptional events, etc.). In order to provide researchers with an effective and friendly resource for analyzing RNA-Seq data, we present here RAP (RNA-Seq Analysis Pipeline), a cloud computing web application implementing a complete but modular analysis workflow. This pipeline integrates both state-of-the-art bioinformatics tools for RNA-Seq analysis and in-house developed scripts to offer the user a comprehensive strategy for data analysis. RAP is able to perform quality checks (adopting FastQC and NGS QC Toolkit), identify and quantify expressed genes and transcripts (with Tophat, Cufflinks and HTSeq), detect alternative splicing events (using SpliceTrap) and chimeric transcripts (with ChimeraScan). This pipeline is also able to identify splicing junctions and constitutive or alternative polyadenylation sites (implementing custom analysis modules) and call statistically significant differences in gene and transcript expression, splicing pattern and polyadenylation site usage (using Cuffdiff2 and DESeq). Through a user friendly web interface, the RAP workflow can be suitably customized by the user and it is automatically executed on our cloud computing environment. This strategy allows access to bioinformatics tools and computational resources without specific bioinformatics and IT skills. 
RAP provides a set of tabular and graphical results that can be helpful to browse, filter and export analyzed data, according to the user needs.
Biophysics and bioinformatics of transcription regulation in bacteria and bacteriophages
NASA Astrophysics Data System (ADS)
Djordjevic, Marko
2005-11-01
Due to rapid accumulation of biological data, bioinformatics has become a very important branch of biological research. In this thesis, we develop novel bioinformatic approaches and aid the design of biological experiments by using ideas and methods from statistical physics. Identification of transcription factor binding sites within the regulatory segments of genomic DNA is an important step towards understanding of the regulatory circuits that control expression of genes. We propose a novel, biophysics-based algorithm for the supervised detection of transcription factor (TF) binding sites. The method classifies potential binding sites by explicitly estimating the sequence-specific binding energy and the chemical potential of a given TF. In contrast with the widely used information theory based weight matrix method, our approach correctly incorporates saturation in the transcription factor/DNA binding probability. This results in a significant reduction in the number of expected false positives, and in the explicit appearance (and determination) of a binding threshold. The new method was used to identify likely genomic binding sites for the Escherichia coli TFs, and to examine the relationship between TF binding specificity and degree of pleiotropy (number of regulatory targets). We next address how parameters of protein-DNA interactions can be obtained from data on protein binding to random oligos under controlled conditions (SELEX experiment data). We show that 'robust' generation of an appropriate data set is achieved by a suitable modification of the standard SELEX procedure, and propose a novel bioinformatic algorithm for analysis of such data. Finally, we use quantitative data analysis, bioinformatic methods and kinetic modeling to analyze gene expression strategies of bacterial viruses. We study bacteriophage Xp10, which infects the rice pathogen Xanthomonas oryzae. 
Xp10 is an unusual bacteriophage, which has morphology and genome organization that most closely resembles temperate phages, such as lambda. It, however, encodes its own T7-like RNA polymerase (characteristic of virulent phages), whose role in gene expression was unclear. Our analysis resulted in quantitative understanding of the role of both host and phage RNA polymerase, and in the identification of the previously unknown promoter sequence for Xp10 RNA polymerase. More generally, an increasing number of phage genomes are being sequenced every year, and we expect that methods of quantitative data analysis that we introduced will provide an efficient way to study gene expression strategies of novel bacterial viruses.
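The saturation argument in the abstract above can be illustrated with a small sketch. It assumes an additive (position-independent) energy model and the standard statistical-physics occupancy form 1 / (1 + exp(E - mu)), with energies in units of kT; the thesis abstract does not give the exact functional form, so this is an assumption. The energy matrix values, the chemical potential, and all function names are hypothetical.

```python
import math

# Hypothetical per-position energy contributions in units of kT.
# The numbers are illustrative only and are not taken from the thesis.
ENERGY_MATRIX = [
    {"A": 0.0, "C": 2.0, "G": 2.5, "T": 1.5},
    {"A": 1.0, "C": 0.0, "G": 3.0, "T": 2.0},
    {"A": 2.0, "C": 2.5, "G": 0.0, "T": 1.0},
    {"A": 1.5, "C": 3.0, "G": 2.0, "T": 0.0},
]


def binding_energy(site):
    """Sum independent per-base contributions (additive energy model)."""
    if len(site) != len(ENERGY_MATRIX):
        raise ValueError("site length must match the matrix width")
    return sum(column[base] for column, base in zip(ENERGY_MATRIX, site))


def binding_probability(energy, mu):
    """Saturating occupancy: approaches 1 (never exceeds it) when energy << mu."""
    return 1.0 / (1.0 + math.exp(energy - mu))


def is_binding_site(site, mu):
    """Classify a site as bound when its energy lies below the threshold mu."""
    return binding_energy(site) < mu
```

Under this form the chemical potential acts as the binding threshold: sites with energy well below mu all have probability close to 1, whereas an unsaturated Boltzmann weight exp(-E) would keep distinguishing among them.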
Fang, H; Tong, W; Perkins, R; Shi, L; Hong, H; Cao, X; Xie, Q; Yim, SH; Ward, JM; Pitot, HC; Dragan, YP
2005-01-01
Background The completion of the sequencing of human, mouse and rat genomes and knowledge of cross-species gene homologies enables studies of differential gene expression in animal models. These types of studies have the potential to greatly enhance our understanding of diseases such as liver cancer in humans. Genes co-expressed across multiple species are most likely to have conserved functions. We have used various bioinformatics approaches to examine microarray expression profiles from liver neoplasms that arise in albumin-SV40 transgenic rats to elucidate genes, chromosome aberrations and pathways that might be associated with human liver cancer. Results In this study, we first identified 2223 differentially expressed genes by comparing gene expression profiles for two control, two adenoma and two carcinoma samples using an F-test. These genes were subsequently mapped to the rat chromosomes using a novel visualization tool, the Chromosome Plot. Using the same plot, we further mapped the significant genes to orthologous chromosomal locations in human and mouse. Many genes expressed in rat 1q that are amplified in rat liver cancer map to the human chromosomes 10, 11 and 19 and to the mouse chromosomes 7, 17 and 19, which have been implicated in studies of human and mouse liver cancer. Using Comparative Genomics Microarray Analysis (CGMA), we identified regions of potential aberrations in human. Lastly, a pathway analysis was conducted to predict altered human pathways based on statistical analysis and extrapolation from the rat data. All of the identified pathways have been known to be important in the etiology of human liver cancer, including cell cycle control, cell growth and differentiation, apoptosis, transcriptional regulation, and protein metabolism. 
Conclusion The study demonstrates that the hepatic gene expression profiles from the albumin-SV40 transgenic rat model revealed genes, pathways and chromosome alterations consistent with experimental and clinical research in human liver cancer. The bioinformatics tools presented in this paper are essential for cross species extrapolation and mapping of microarray data, its analysis and interpretation. PMID:16026603
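The F-test step described in the abstract above (comparing control, adenoma and carcinoma groups gene by gene) can be sketched as a one-way ANOVA F statistic. This is a minimal illustration, not the authors' implementation; the function name and the expression values in the usage note are hypothetical.

```python
def f_statistic(groups):
    """One-way ANOVA F statistic for a list of numeric groups."""
    k = len(groups)
    n = sum(len(g) for g in groups)
    if k < 2 or n <= k:
        raise ValueError("need at least two groups and more samples than groups")
    grand_mean = sum(sum(g) for g in groups) / n
    means = [sum(g) / len(g) for g in groups]
    # Variation of the group means around the grand mean.
    ss_between = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, means))
    # Variation of the samples around their own group mean.
    ss_within = sum((x - m) ** 2 for g, m in zip(groups, means) for x in g)
    if ss_within == 0:
        return float("inf")
    return (ss_between / (k - 1)) / (ss_within / (n - k))
```

For example, `f_statistic([[1.0, 1.2], [2.0, 2.2], [3.0, 3.2]])` evaluates one hypothetical gene measured in two samples per group. With two replicates per group, as in the study design, only three within-group degrees of freedom are available, so per-gene F values should be interpreted with caution.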
A regulatory toolbox of MiniPromoters to drive selective expression in the brain.
Portales-Casamar, Elodie; Swanson, Douglas J; Liu, Li; de Leeuw, Charles N; Banks, Kathleen G; Ho Sui, Shannan J; Fulton, Debra L; Ali, Johar; Amirabbasi, Mahsa; Arenillas, David J; Babyak, Nazar; Black, Sonia F; Bonaguro, Russell J; Brauer, Erich; Candido, Tara R; Castellarin, Mauro; Chen, Jing; Chen, Ying; Cheng, Jason C Y; Chopra, Vik; Docking, T Roderick; Dreolini, Lisa; D'Souza, Cletus A; Flynn, Erin K; Glenn, Randy; Hatakka, Kristi; Hearty, Taryn G; Imanian, Behzad; Jiang, Steven; Khorasan-zadeh, Shadi; Komljenovic, Ivana; Laprise, Stéphanie; Liao, Nancy Y; Lim, Jonathan S; Lithwick, Stuart; Liu, Flora; Liu, Jun; Lu, Meifen; McConechy, Melissa; McLeod, Andrea J; Milisavljevic, Marko; Mis, Jacek; O'Connor, Katie; Palma, Betty; Palmquist, Diana L; Schmouth, Jean-François; Swanson, Magdalena I; Tam, Bonny; Ticoll, Amy; Turner, Jenna L; Varhol, Richard; Vermeulen, Jenny; Watkins, Russell F; Wilson, Gary; Wong, Bibiana K Y; Wong, Siaw H; Wong, Tony Y T; Yang, George S; Ypsilanti, Athena R; Jones, Steven J M; Holt, Robert A; Goldowitz, Daniel; Wasserman, Wyeth W; Simpson, Elizabeth M
2010-09-21
The Pleiades Promoter Project integrates genomewide bioinformatics with large-scale knockin mouse production and histological examination of expression patterns to develop MiniPromoters and related tools designed to study and treat the brain by directed gene expression. Genes with brain expression patterns of interest are subjected to bioinformatic analysis to delineate candidate regulatory regions, which are then incorporated into a panel of compact human MiniPromoters to drive expression to brain regions and cell types of interest. Using single-copy, homologous-recombination "knockins" in embryonic stem cells, each MiniPromoter reporter is integrated immediately 5' of the Hprt locus in the mouse genome. MiniPromoter expression profiles are characterized in differentiation assays of the transgenic cells or in mouse brains following transgenic mouse production. Histological examination of adult brains, eyes, and spinal cords for reporter gene activity is coupled to costaining with cell-type-specific markers to define expression. The publicly available Pleiades MiniPromoter Project is a key resource to facilitate research on brain development and therapies.
Lü, Dingding; Hou, Chengxiang; Qin, Guangxing; Gao, Kun; Chen, Tian; Guo, Xijie
2017-01-01
A full-length cDNA of lebocin 5 (BmLeb5) was first cloned from silkworm, Bombyx mori, by rapid amplification of cDNA ends. The BmLeb5 gene is 808 bp in length and the open reading frame encodes a 179-amino acid hydroxyproline-rich peptide. Bioinformatic analysis results showed that BmLeb5 has an O-glycosylation site and four RXXR motifs, like other lebocins. Sequence similarity and phylogenetic analysis results indicated that lebocins form a multiple gene family in silkworm, as cecropins do. Quantitative real-time PCR analysis revealed that BmLeb5 was most highly expressed in the fat body. In silkworm larvae infected by Beauveria bassiana, the expression level of BmLeb5 was upregulated in the fat body and hemolymph, which are the most important immune tissues in silkworm. The recombinant protein of BmLeb5 was for the first time successfully expressed with a prokaryotic expression system and purified. There are no reports so far that the expression of lebocins could be induced by entomopathogenic fungus. Our study suggested that BmLeb5 might play an important role in the immune response of silkworm to defend against B. bassiana infection. The results also provided helpful information for further study of how the lebocin family functions in the antifungal immune response in the silkworm.
Beer, Lucian; Mlitz, Veronika; Gschwandtner, Maria; Berger, Tanja; Narzt, Marie-Sophie; Gruber, Florian; Brunner, Patrick M; Tschachler, Erwin; Mildner, Michael
2015-10-01
Reverse transcription polymerase chain reaction (qRT-PCR) has become a mainstay in many areas of skin research. To enable quantitative analysis, it is necessary to analyse expression of reference genes (RGs) for normalization of target gene expression. The selection of reliable RGs therefore has an important impact on the experimental outcome. In this study, we aimed to identify and validate the best suited RGs for qRT-PCR in human primary keratinocytes (KCs) over a broad range of experimental conditions using the novel bioinformatics tool 'RefGenes', which is based on a manually curated database of published microarray data. Expression of 6 RGs identified by RefGenes software and 12 commonly used RGs were validated by qRT-PCR. We assessed whether these 18 markers fulfilled the requirements for a valid RG by the comprehensive ranking of four bioinformatics tools and the coefficient of variation (CV). In an overall ranking, we found GUSB to be the most stably expressed RG, whereas the expression values of the commonly used RGs, GAPDH and B2M were significantly affected by varying experimental conditions. Our results identify RefGenes as a powerful tool for the identification of valid RGs and suggest GUSB as the most reliable RG for KCs. © 2015 John Wiley & Sons A/S. Published by John Wiley & Sons Ltd.
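The coefficient of variation (CV) criterion mentioned in the abstract above can be sketched as follows. This covers only the CV part, not the four ranking tools; the gene labels and expression values are hypothetical placeholders rather than data from the study.

```python
import statistics


def coefficient_of_variation(values):
    """CV in percent: sample standard deviation divided by the mean."""
    mean = statistics.mean(values)
    if mean == 0:
        raise ValueError("CV is undefined for a zero mean")
    return 100.0 * statistics.stdev(values) / mean


def rank_by_cv(expression):
    """Return gene names ordered from most to least stable (lowest CV first)."""
    return sorted(expression, key=lambda gene: coefficient_of_variation(expression[gene]))


# Hypothetical relative expression of three candidate reference genes
# across four experimental conditions.
example = {
    "RG_A": [1.00, 1.02, 0.98, 1.01],
    "RG_B": [1.0, 1.6, 0.7, 1.3],
    "RG_C": [1.0, 1.3, 0.8, 1.1],
}
```

Calling `rank_by_cv(example)` orders the candidates by stability. A low CV alone does not establish a valid reference gene, which is presumably why the study combined it with a ranking from several tools.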
Lin, Meng-Lay; Patel, Hetal; Remenyi, Judit; Banerji, Christopher R S; Lai, Chun-Fui; Periyasamy, Manikandan; Lombardo, Ylenia; Busonero, Claudia; Ottaviani, Silvia; Passey, Alun; Quinlan, Philip R; Purdie, Colin A; Jordan, Lee B; Thompson, Alastair M; Finn, Richard S; Rueda, Oscar M; Caldas, Carlos; Gil, Jesus; Coombes, R Charles; Fuller-Pace, Frances V; Teschendorff, Andrew E; Buluwela, Laki; Ali, Simak
2015-08-28
The Nuclear Receptor (NR) superfamily of transcription factors comprises 48 members, several of which have been implicated in breast cancer. Most important is estrogen receptor-α (ERα), which is a key therapeutic target. ERα action is facilitated by co-operativity with other NR and there is evidence that ERα function may be recapitulated by other NRs in ERα-negative breast cancer. In order to examine the inter-relationships between nuclear receptors, and to obtain evidence for previously unsuspected roles for any NRs, we undertook quantitative RT-PCR and bioinformatics analysis to examine their expression in breast cancer. While most NRs were expressed, bioinformatic analyses differentiated tumours into distinct prognostic groups that were validated by analyzing public microarray data sets. Although ERα and progesterone receptor were dominant in distinguishing prognostic groups, other NR strengthened these groups. Clustering analysis identified several family members with potential importance in breast cancer. Specifically, RORγ is identified as being co-expressed with ERα, whilst several NRs are preferentially expressed in ERα-negative disease, with TLX expression being prognostic in this subtype. Functional studies demonstrated the importance of TLX in regulating growth and invasion in ERα-negative breast cancer cells.
Gene expression analysis of colorectal cancer by bioinformatics strategy.
Cui, Meng; Yuan, Junhua; Li, Jun; Sun, Bing; Li, Tao; Li, Yuantao; Wu, Guoliang
2014-10-01
We used bioinformatics technology to analyze gene expression profiles of colorectal cancer tissue samples and healthy controls. In this paper, we downloaded the gene expression profile GSE4107 from Gene Expression Omnibus (GEO) database, in which a total of 22 chips were available, including normal colonic mucosa tissue from normal healthy donors (n=10), colorectal cancer tissue samples from colorectal patients (n=33). To further understand the biological functions of the screened DEGs, KEGG pathway enrichment analysis was conducted. Then we built a transcriptome network to study differentially co-expressed links. A total of 3151 DEGs of CRC were selected. In addition, a total of 164 differentially coexpressed genes (DCGs) and 29279 differentially co-expressed links (DCLs) were obtained. Furthermore, the significantly enriched KEGG pathways were Endocytosis, Calcium signaling pathway, Vascular smooth muscle contraction, Linoleic acid metabolism, Arginine and proline metabolism, Inositol phosphate metabolism and MAPK signaling pathway. Our results show that the generation of CRC involves multiple genes, TFs and pathways. Several signal and immune pathways are linked to CRC and provide further clues to the process of CRC. Hence, our work may pave the way for novel diagnosis of CRC and provide theoretical guidance for cancer therapy.
Unity in defence: honeybee workers exhibit conserved molecular responses to diverse pathogens.
Doublet, Vincent; Poeschl, Yvonne; Gogol-Döring, Andreas; Alaux, Cédric; Annoscia, Desiderato; Aurori, Christian; Barribeau, Seth M; Bedoya-Reina, Oscar C; Brown, Mark J F; Bull, James C; Flenniken, Michelle L; Galbraith, David A; Genersch, Elke; Gisder, Sebastian; Grosse, Ivo; Holt, Holly L; Hultmark, Dan; Lattorff, H Michael G; Le Conte, Yves; Manfredini, Fabio; McMahon, Dino P; Moritz, Robin F A; Nazzi, Francesco; Niño, Elina L; Nowick, Katja; van Rij, Ronald P; Paxton, Robert J; Grozinger, Christina M
2017-03-02
Organisms typically face infection by diverse pathogens, and hosts are thought to have developed specific responses to each type of pathogen they encounter. The advent of transcriptomics now makes it possible to test this hypothesis and compare host gene expression responses to multiple pathogens at a genome-wide scale. Here, we performed a meta-analysis of multiple published and new transcriptomes using a newly developed bioinformatics approach that filters genes based on their expression profile across datasets. Thereby, we identified common and unique molecular responses of a model host species, the honey bee (Apis mellifera), to its major pathogens and parasites: the Microsporidia Nosema apis and Nosema ceranae, RNA viruses, and the ectoparasitic mite Varroa destructor, which transmits viruses. We identified a common suite of genes and conserved molecular pathways that respond to all investigated pathogens, a result that suggests a commonality in response mechanisms to diverse pathogens. We found that genes differentially expressed after infection exhibit a higher evolutionary rate than non-differentially expressed genes. Using our new bioinformatics approach, we unveiled additional pathogen-specific responses of honey bees; we found that apoptosis appeared to be an important response following microsporidian infection, while genes from the immune signalling pathways, Toll and Imd, were differentially expressed after Varroa/virus infection. Finally, we applied our bioinformatics approach and generated a gene co-expression network to identify highly connected (hub) genes that may represent important mediators and regulators of anti-pathogen responses. Our meta-analysis generated a comprehensive overview of the host metabolic and other biological processes that mediate interactions between insects and their pathogens. 
We identified key host genes and pathways that respond to phylogenetically diverse pathogens, representing an important source for future functional studies as well as offering new routes to identify or generate pathogen resilient honey bee stocks. The statistical and bioinformatics approaches that were developed for this study are broadly applicable to synthesize information across transcriptomic datasets. These approaches will likely have utility in addressing a variety of biological questions.
Screening circular RNA related to chemotherapeutic resistance in breast cancer.
Gao, Danfeng; Zhang, Xiufen; Liu, Beibei; Meng, Dong; Fang, Kai; Guo, Zijian; Li, Lihua
2017-09-01
We aimed to identify circular RNAs (circRNAs) associated with breast cancer chemoresistance. CircRNA microarray expression profiles were obtained from Adriamycin (ADM) resistant MCF-7 breast cancer cells (MCF-7/ADM) and parental MCF-7 cells and were validated using quantitative real-time reverse transcription PCR. The expression data were analyzed bioinformatically. We detected 3093 circRNAs and identified 18 circRNAs that are differentially expressed between MCF-7/ADM and MCF-7 cells; after validating by quantitative real-time reverse transcription PCR, we predicted the possible miRNAs and potential target genes of the seven upregulated circRNAs using TargetScan and miRanda. The bioinformatics analysis revealed several target genes related to cancer-related signaling pathways. Additionally, we discovered a regulatory role of the circ_0006528-miR-7-5p-Raf1 axis in ADM-resistant breast cancer. These results revealed that circRNAs may play a role in breast cancer chemoresistance and that hsa_circ_0006528 might be a promising candidate for further functional analysis.
NASA Astrophysics Data System (ADS)
Symeonidis, Iphigenia Sofia
This paper aims to elucidate guiding concepts for the design of powerful undergraduate bioinformatics degrees which will lead to a conceptual framework for the curriculum. "Powerful" here should be understood as having truly bioinformatics objectives rather than enrichment of existing computer science or life science degrees on which bioinformatics degrees are often based. As such, the conceptual framework will be one which aims to demonstrate intellectual honesty in regards to the field of bioinformatics. A synthesis/conceptual analysis approach was followed as elaborated by Hurd (1983). The approach takes into account the following: bioinformatics educational needs and goals as expressed by different authorities, five undergraduate bioinformatics degrees case-studies, educational implications of bioinformatics as a technoscience and approaches to curriculum design promoting interdisciplinarity and integration. Given these considerations, guiding concepts emerged and a conceptual framework was elaborated. The practice of bioinformatics was given a closer look, which led to defining tool-integration skills and tool-thinking capacity as crucial areas of the bioinformatics activities spectrum. It was argued, finally, that a process-based curriculum as a variation of a concept-based curriculum (where the concepts are processes) might be more conducive to the teaching of bioinformatics given a foundational first year of integrated science education as envisioned by Bialek and Botstein (2004). Furthermore, the curriculum design needs to define new avenues of communication and learning which bypass the traditional disciplinary barriers of academic settings as undertaken by Tador and Tidmor (2005) for graduate studies.
RAP: RNA-Seq Analysis Pipeline, a new cloud-based NGS web application
2015-01-01
Background The study of RNA has been dramatically improved by the introduction of Next Generation Sequencing platforms allowing massive and cheap sequencing of selected RNA fractions, also providing information on strand orientation (RNA-Seq). The complexity of transcriptomes and of their regulative pathways makes RNA-Seq one of the most complex fields of NGS applications, addressing several aspects of the expression process (e.g. identification and quantification of expressed genes and transcripts, alternative splicing and polyadenylation, fusion genes and trans-splicing, post-transcriptional events, etc.). Moreover, the huge volume of data generated by NGS platforms introduces unprecedented computational and technological challenges to efficiently analyze and store sequence data and results. Methods In order to provide researchers with an effective and friendly resource for analyzing RNA-Seq data, we present here RAP (RNA-Seq Analysis Pipeline), a cloud computing web application implementing a complete but modular analysis workflow. This pipeline integrates both state-of-the-art bioinformatics tools for RNA-Seq analysis and in-house developed scripts to offer the user a comprehensive strategy for data analysis. RAP is able to perform quality checks (adopting FastQC and NGS QC Toolkit), identify and quantify expressed genes and transcripts (with Tophat, Cufflinks and HTSeq), detect alternative splicing events (using SpliceTrap) and chimeric transcripts (with ChimeraScan). This pipeline is also able to identify splicing junctions and constitutive or alternative polyadenylation sites (implementing custom analysis modules) and call statistically significant differences in gene and transcript expression, splicing pattern and polyadenylation site usage (using Cuffdiff2 and DESeq). Results Through a user friendly web interface, the RAP workflow can be suitably customized by the user and it is automatically executed on our cloud computing environment. 
This strategy allows access to bioinformatics tools and computational resources without specific bioinformatics and IT skills. RAP provides a set of tabular and graphical results that can be helpful to browse, filter and export analyzed data, according to the user's needs. PMID:26046471
Poswar, Fabiano de Oliveira; Farias, Lucyana Conceição; Fraga, Carlos Alberto de Carvalho; Bambirra, Wilson; Brito-Júnior, Manoel; Sousa-Neto, Manoel Damião; Santos, Sérgio Henrique Souza; de Paula, Alfredo Maurício Batista; D'Angelo, Marcos Flávio Silveira Vasconcelos; Guimarães, André Luiz Sena
2015-06-01
Bioinformatics has emerged as an important tool to analyze the large amount of data generated by research in different diseases. In this study, gene expression for radicular cysts (RCs) and periapical granulomas (PGs) was characterized based on a leader gene approach. A validated bioinformatics algorithm was applied to identify leader genes for RCs and PGs. Genes related to RCs and PGs were first identified in PubMed, GenBank, GeneAtlas, and GeneCards databases. The Web-available STRING software (The European Molecular Biology Laboratory [EMBL], Heidelberg, Baden-Württemberg, Germany) was used in order to build the interaction map among the identified genes by a significance score named weighted number of links. Based on the weighted number of links, genes were clustered using k-means. The genes in the highest cluster were considered leader genes. Multilayer perceptron neural network analysis was used as a complementary supplement for gene classification. For RCs, the suggested leader genes were TP53 and EP300, whereas PGs were associated with IL2RG, CCL2, CCL4, CCL5, CCR1, CCR3, and CCR5 genes. Our data revealed different gene expression for RCs and PGs, suggesting that not only the inflammatory nature but also other biological processes might differentiate RCs and PGs. Copyright © 2015 American Association of Endodontists. Published by Elsevier Inc. All rights reserved.
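The clustering step of the leader gene approach described above (k-means on the weighted number of links, with the highest cluster taken as leader genes) can be sketched in one dimension. This is an illustrative sketch only: the deterministic initialisation, the default of three clusters, the function names, and the scores in the usage note are assumptions, not details from the study.

```python
def kmeans_1d(values, k, max_iter=100):
    """Lloyd's algorithm on scalar values; returns (centroids, labels)."""
    if not 1 <= k <= len(values):
        raise ValueError("k must be between 1 and the number of values")
    ordered = sorted(values)
    if k == 1:
        centroids = [sum(ordered) / len(ordered)]
    else:
        # Deterministic start: spread the centroids evenly over the sorted data.
        centroids = [ordered[i * (len(ordered) - 1) // (k - 1)] for i in range(k)]
    labels = []
    for _ in range(max_iter):
        # Assignment step: each value joins its nearest centroid.
        labels = [min(range(k), key=lambda c: abs(v - centroids[c])) for v in values]
        # Update step: move each centroid to the mean of its members.
        updated = []
        for c in range(k):
            members = [v for v, lab in zip(values, labels) if lab == c]
            updated.append(sum(members) / len(members) if members else centroids[c])
        if updated == centroids:
            break
        centroids = updated
    return centroids, labels


def leader_genes(weighted_links, k=3):
    """Return the genes that fall in the cluster with the highest centroid."""
    genes = list(weighted_links)
    centroids, labels = kmeans_1d([weighted_links[g] for g in genes], k)
    top = max(range(k), key=lambda c: centroids[c])
    return sorted(g for g, lab in zip(genes, labels) if lab == top)
```

For instance, `leader_genes({"g1": 9.5, "g2": 9.0, "g3": 4.0, "g4": 3.5, "g5": 0.5, "g6": 0.0, "g7": 1.0})` uses made-up scores for seven hypothetical genes and returns the members of the top-scoring cluster.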
Bălăcescu, Loredana; Bălăcescu, O; Crişan, N; Fetica, B; Petruţ, B; Bungărdean, Cătălina; Rus, Meda; Tudoran, Oana; Meurice, G; Irimie, Al; Dragoş, N; Berindan-Neagoe, Ioana
2011-01-01
Prostate cancer is the leading cancer in the Western male population, with different clinical behavior ranging from indolent to metastatic disease. Although many molecules and deregulated pathways are known, the molecular mechanisms involved in the development of prostate cancer are not fully understood. The aim of this study was to explore the molecular variation underlying prostate cancer, based on microarray analysis and bioinformatics approaches. Normal and prostate cancer tissues were collected by macrodissection from prostatectomy pieces. All prostate cancer specimens used in our study were Gleason score 7. Gene expression microarray (Agilent Technologies) was used for Whole Human Genome evaluation. The bioinformatics and functional analysis were based on Limma and Ingenuity software. The microarray analysis identified 1119 differentially expressed genes between prostate cancer and normal prostate, which were up- or down-regulated at least 2-fold. P-values were adjusted for multiple testing using the Benjamini-Hochberg method with a false discovery rate of 0.01. These genes were analyzed with Ingenuity Pathway Analysis software and 23 genetic networks were established. Our microarray results provide new information regarding the molecular networks in prostate cancer stratified as Gleason 7. These data highlighted gene expression profiles for better understanding of prostate cancer progression.
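The multiple-testing step named in the abstract above can be sketched as a plain Benjamini-Hochberg adjustment followed by the 2-fold and FDR 0.01 filters. This is a minimal sketch and does not reproduce the Limma analysis itself; the function names are hypothetical, and fold changes are assumed to be given as linear ratios (cancer / normal).

```python
def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted p-values in the original order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so adjusted values stay monotone.
    for rank in range(m, 0, -1):
        index = order[rank - 1]
        running_min = min(running_min, pvalues[index] * m / rank)
        adjusted[index] = running_min
    return adjusted


def select_genes(fold_changes, pvalues, fdr=0.01, min_fold=2.0):
    """Indices of genes passing both the FDR and the fold-change filter."""
    adjusted = benjamini_hochberg(pvalues)
    return [
        i
        for i, (fold, q) in enumerate(zip(fold_changes, adjusted))
        if q <= fdr and (fold >= min_fold or fold <= 1.0 / min_fold)
    ]
```

As a usage example, `select_genes([2.5, 1.1, 0.4, 3.0], [0.0001, 0.5, 0.0002, 0.2])` keeps the first and third of four hypothetical genes: both change at least 2-fold and have adjusted p-values below 0.01.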
Zhang, Dong-Mei; Feng, Li-Xing; Li, Lu; Liu, Miao; Jiang, Bao-Hong; Yang, Min; Li, Guo-Qiang; Wu, Wan-Ying; Guo, De-An; Liu, Xuan
2016-09-01
The sea dragon Solenognathus hardwickii has long been used as a traditional Chinese medicine for the treatment of various diseases, such as male impotency. To gain a comprehensive insight into the protein components of the sea dragon, shotgun proteomic analysis of its protein expression profiling was conducted in the present study. Proteins were extracted from dried sea dragon using a trichloroacetic acid/acetone precipitation method and then separated by SDS-PAGE. The protein bands were cut from the gel and digested by trypsin to generate a peptide mixture. The peptide fragments were then analyzed using nano liquid chromatography tandem mass spectrometry (nano-LC-ESI MS/MS). In total, 810 proteins and 1 577 peptides were identified in the dried sea dragon. The identified proteins exhibited molecular weight values ranging from 1 900 to 3 516 900 Da and pI values from 3.8 to 12.18. Bioinformatic analysis was conducted using the DAVID Bioinformatics Resources 6.7 Gene Ontology (GO) analysis tool to explore possible functions of the identified proteins. Ascribed functions of the proteins mainly included intracellular non-membrane-bound organelle, non-membrane-bounded organelle, cytoskeleton, structural molecule activity and calcium ion binding. Furthermore, possible signal networks of the identified proteins were predicted using the STRING (Search Tool for the Retrieval of Interacting Genes) database. Ribosomal protein synthesis was found to play an important role in the signal network. The results of this study, to the best of our knowledge, are the first to provide a reference proteome profile for the sea dragon, and would aid in the understanding of the expression and functions of the identified proteins. Copyright © 2016 China Pharmaceutical University. Published by Elsevier B.V. All rights reserved.
Diao, Hongyu; Li, Xinxing; Hu, Sheng; Liu, Yunhui
2012-01-01
Parkinson disease (PD) progresses relentlessly and affects approximately 4% of the population aged over 80 years. It is difficult to diagnose in its early stages. The purpose of our study was to identify molecular biomarkers for PD initiation using a computational bioinformatics analysis of gene expression. We downloaded the gene expression profile of PD from Gene Expression Omnibus and identified differentially coexpressed genes (DCGs) and dysfunctional pathways in PD patients compared to controls. We also built a regulatory network by mapping the DCGs to known regulatory data between transcription factors (TFs) and target genes, and calculated the regulatory impact factor of each transcription factor. As a result, a total of 1004 genes associated with PD initiation were identified. Pathway enrichment of these genes suggests that biological processes of protein turnover were impaired in PD. In the regulatory network, HLF, E2F1 and STAT4 were found to have altered expression levels in PD patients. The expression levels of other transcription factors, NKX3-1, TAL1, RFX1 and EGR3, were not found to be altered; however, they regulated differentially expressed genes. In conclusion, we suggest that HLF, E2F1 and STAT4 may be used as molecular biomarkers for PD; however, more work is needed to validate our result. PMID:23284986
Bioinformatic investigation of the role of ubiquitins in cucumber flower morphogenesis
NASA Astrophysics Data System (ADS)
Pawełkowicz, Magdalena; Osipowski, Paweł; Wojcieszek, Michał; Kowalczuk, Cezary; Pląder, Wojciech; Przybecki, Zbigniew
2016-09-01
Three cDNA clones were used to screen the cucumber genome in order to find genes and proteins. Functional annotation reveals that they are correlated with ubiquitination pathways. Various bioinformatics tools were used to screen and check protein sequence features such as the presence of specific domains, transmembrane regions, cleavage sites and cellular placement. Computational analysis of the promoter region shows many binding sites for transcription factors, which could regulate the expression of the genes. In order to check gene expression levels in developing flower buds of monoecious (B10) and gynoecious (2gg) cucumber lines, the real-time PCR technique was applied. Expression was checked for whole buds and for the 3rd and 4th whorls of the bud alone, at the stage when the generative organs are formed; these whorls were obtained by the Laser Capture Microdissection (LCM) technique.
Microarray gene expression profiling analysis combined with bioinformatics in multiple sclerosis.
Liu, Mingyuan; Hou, Xiaojun; Zhang, Ping; Hao, Yong; Yang, Yiting; Wu, Xiongfeng; Zhu, Desheng; Guan, Yangtai
2013-05-01
Multiple sclerosis (MS) is the most prevalent demyelinating disease and the principal cause of neurological disability in young adults. Recent microarray gene expression profiling studies have identified several genetic variants contributing to the complex pathogenesis of MS; however, expression and functional studies are still required to further understand its molecular mechanism. The present study aimed to analyze the molecular mechanism of MS using microarray analysis combined with bioinformatics techniques. We downloaded the gene expression profile of MS from Gene Expression Omnibus (GEO) and analysed the microarray data using the differentially coexpressed genes (DCGs) and links package in R and the Database for Annotation, Visualization and Integrated Discovery. The regulatory impact factor (RIF) algorithm was used to measure the impact factor of transcription factors. A total of 1,297 DCGs between MS patients and healthy controls were identified. Functional annotation indicated that these DCGs were associated with immune and neurological functions. Furthermore, the RIF result suggested that IKZF1, BACH1, CEBPB, EGR1, and FOS may play central regulatory roles in controlling gene expression in the pathogenesis of MS. Our findings confirm the presence of multiple molecular alterations in MS and indicate the possibility of identifying prognostic factors associated with MS pathogenesis.
Functional analysis of the Arabidopsis PHT4 family of intracellular phosphate transporters.
Guo, B; Jin, Y; Wussler, C; Blancaflor, E B; Motes, C M; Versaw, W K
2008-01-01
The transport of phosphate (Pi) between subcellular compartments is central to metabolic regulation. Although some of the transporters involved in controlling the intracellular distribution of Pi have been identified in plants, others are predicted from genetic, biochemical and bioinformatics studies. Heterologous expression in yeast, and gene expression and localization in plants were used to characterize all six members of an Arabidopsis thaliana membrane transporter family designated here as PHT4. PHT4 proteins share similarity with SLC17/type I Pi transporters, a diverse group of animal proteins involved in the transport of Pi, organic anions and chloride. All of the PHT4 proteins mediate Pi transport in yeast with high specificity. Bioinformatic analysis and localization of PHT4-GFP fusion proteins indicate that five of the proteins are targeted to the plastid envelope, and the sixth resides in the Golgi apparatus. PHT4 genes are expressed in both roots and leaves, although two of the genes are expressed predominantly in leaves and one mostly in roots. These expression patterns, together with Pi transport activities and subcellular locations, suggest roles for PHT4 proteins in the transport of Pi between the cytosol and chloroplasts, heterotrophic plastids and the Golgi apparatus.
Augustin, Regina; Lichtenthaler, Stefan F.; Greeff, Michael; Hansen, Jens; Wurst, Wolfgang; Trümbach, Dietrich
2011-01-01
The molecular mechanisms and genetic risk factors underlying Alzheimer's disease (AD) pathogenesis are only partly understood. To identify new factors, which may contribute to AD, different approaches are taken including proteomics, genetics, and functional genomics. Here, we used a bioinformatics approach and found that distinct AD-related genes share modules of transcription factor binding sites, suggesting a transcriptional coregulation. To detect additional coregulated genes, which may potentially contribute to AD, we established a new bioinformatics workflow with known multivariate methods like support vector machines, biclustering, and predicted transcription factor binding site modules by using in silico analysis and over 400 expression arrays from human and mouse. Two significant modules are composed of three transcription factor families: CTCF, SP1F, and EGRF/ZBPF, which are conserved between human and mouse APP promoter sequences. The specific combination of in silico promoter and multivariate analysis can identify regulation mechanisms of genes involved in multifactorial diseases. PMID:21559189
A regulatory toolbox of MiniPromoters to drive selective expression in the brain
Portales-Casamar, Elodie; Swanson, Douglas J.; Liu, Li; de Leeuw, Charles N.; Banks, Kathleen G.; Ho Sui, Shannan J.; Fulton, Debra L.; Ali, Johar; Amirabbasi, Mahsa; Arenillas, David J.; Babyak, Nazar; Black, Sonia F.; Bonaguro, Russell J.; Brauer, Erich; Candido, Tara R.; Castellarin, Mauro; Chen, Jing; Chen, Ying; Cheng, Jason C. Y.; Chopra, Vik; Docking, T. Roderick; Dreolini, Lisa; D'Souza, Cletus A.; Flynn, Erin K.; Glenn, Randy; Hatakka, Kristi; Hearty, Taryn G.; Imanian, Behzad; Jiang, Steven; Khorasan-zadeh, Shadi; Komljenovic, Ivana; Laprise, Stéphanie; Liao, Nancy Y.; Lim, Jonathan S.; Lithwick, Stuart; Liu, Flora; Liu, Jun; Lu, Meifen; McConechy, Melissa; McLeod, Andrea J.; Milisavljevic, Marko; Mis, Jacek; O'Connor, Katie; Palma, Betty; Palmquist, Diana L.; Schmouth, Jean-François; Swanson, Magdalena I.; Tam, Bonny; Ticoll, Amy; Turner, Jenna L.; Varhol, Richard; Vermeulen, Jenny; Watkins, Russell F.; Wilson, Gary; Wong, Bibiana K. Y.; Wong, Siaw H.; Wong, Tony Y. T.; Yang, George S.; Ypsilanti, Athena R.; Jones, Steven J. M.; Holt, Robert A.; Goldowitz, Daniel; Wasserman, Wyeth W.; Simpson, Elizabeth M.
2010-01-01
The Pleiades Promoter Project integrates genomewide bioinformatics with large-scale knockin mouse production and histological examination of expression patterns to develop MiniPromoters and related tools designed to study and treat the brain by directed gene expression. Genes with brain expression patterns of interest are subjected to bioinformatic analysis to delineate candidate regulatory regions, which are then incorporated into a panel of compact human MiniPromoters to drive expression to brain regions and cell types of interest. Using single-copy, homologous-recombination “knockins” in embryonic stem cells, each MiniPromoter reporter is integrated immediately 5′ of the Hprt locus in the mouse genome. MiniPromoter expression profiles are characterized in differentiation assays of the transgenic cells or in mouse brains following transgenic mouse production. Histological examination of adult brains, eyes, and spinal cords for reporter gene activity is coupled to costaining with cell-type–specific markers to define expression. The publicly available Pleiades MiniPromoter Project is a key resource to facilitate research on brain development and therapies. PMID:20807748
Clinical proteomic analysis of scrub typhus infection.
Park, Edmond Changkyun; Lee, Sang-Yeop; Yun, Sung Ho; Choi, Chi-Won; Lee, Hayoung; Song, Hyun Seok; Jun, Sangmi; Kim, Gun-Hwa; Lee, Chang-Seop; Kim, Seung Il
2018-01-01
Scrub typhus is an acute and febrile infectious disease caused by the Gram-negative α-proteobacterium Orientia tsutsugamushi from the family Rickettsiaceae that is widely distributed in Northern, Southern and Eastern Asia. In the present study, we analysed the serum proteome of scrub typhus patients to investigate specific clinical protein patterns in an attempt to explain pathophysiology and discover potential biomarkers of infection. Serum samples were collected from three patients (before and after treatment with antibiotics) and three healthy subjects. One-dimensional sodium dodecyl sulphate-polyacrylamide gel electrophoresis followed by liquid chromatography-tandem mass spectrometry was performed to identify differentially abundant proteins using quantitative proteomic approaches. Bioinformatic analysis was then performed using Ingenuity Pathway Analysis. Proteomic analysis identified 236 serum proteins, of which 32 were differentially expressed in normal subjects, naive scrub typhus patients and patients treated with antibiotics. Comparative bioinformatic analysis of the identified proteins revealed up-regulation of proteins involved in immune responses, especially the complement system, following infection with O. tsutsugamushi, and normal expression was largely rescued by antibiotic treatment. This is the first proteomic study of clinical serum samples from scrub typhus patients. Proteomic analysis identified changes in protein expression upon infection with O. tsutsugamushi and following antibiotic treatment. Our results provide valuable information for further investigation of scrub typhus therapy and diagnosis.
PSMB5 plays a dual role in cancer development and immunosuppression
Wang, Chih-Yang; Li, Chung-Yen; Hsu, Hui-Ping; Cho, Chien-Yu; Yen, Meng-Chi; Weng, Tzu-Yang; Chen, Wei-Ching; Hung, Yu-Hsuan; Lee, Kuo-Ting; Hung, Jui-Hsiang; Chen, Yi-Ling; Lai, Ming-Derg
2017-01-01
Tumor progression and metastasis are dependent on the intrinsic properties of tumor cells and the influence of the microenvironment, including the immune system. It would be important to identify target drugs that can inhibit cancer cells and activate immune cells. The proteasome β subunit (PSMB) family, one component of the ubiquitin-proteasome system, has been demonstrated to play an important role in tumor cells and immune cells. Therefore, we used a bioinformatics approach to examine the potential role of the PSMB family. Analysis of the breast TCGA and METABRIC databases revealed that high expression of PSMB5 was observed in breast cancer tissue and that high expression of PSMB5 predicted worse survival. In addition, high expression of PSMB5 was observed in M2 macrophages. Based on our bioinformatics analysis, we hypothesized that PSMB5 has immunosuppressive and oncogenic characteristics. To study the effects of PSMB5 on cancer cells and macrophages in vitro, we silenced PSMB5 expression with shRNA in THP-1 monocytes and MDA-MB-231 cells, respectively. Knockdown of PSMB5 promoted human THP-1 monocyte differentiation into M1 macrophages. On the other hand, knockdown of PSMB5 gene expression inhibited MDA-MB-231 cell growth and migration, as measured by colony formation and Boyden chamber assays. Collectively, our data demonstrated that delivery of PSMB5 shRNA suppressed cell growth and activated defensive M1 macrophages in vitro. Furthermore, lentiviral delivery of PSMB5 shRNA significantly decreased tumor growth in a subcutaneous mouse model. In conclusion, our bioinformatics study and functional experiments revealed PSMB5 as a novel cancer therapeutic target. These results also demonstrated a novel translational approach to improve cancer immunotherapy. PMID:29218236
Remenyi, Judit; Banerji, Christopher R.S.; Lai, Chun-Fui; Periyasamy, Manikandan; Lombardo, Ylenia; Busonero, Claudia; Ottaviani, Silvia; Passey, Alun; Quinlan, Philip R.; Purdie, Colin A.; Jordan, Lee B.; Thompson, Alastair M.; Finn, Richard S.; Rueda, Oscar M.; Caldas, Carlos; Gil, Jesus; Coombes, R. Charles; Fuller-Pace, Frances V.; Teschendorff, Andrew E.; Buluwela, Laki; Ali, Simak
2015-01-01
The Nuclear Receptor (NR) superfamily of transcription factors comprises 48 members, several of which have been implicated in breast cancer. Most important is estrogen receptor-α (ERα), which is a key therapeutic target. ERα action is facilitated by co-operativity with other NRs and there is evidence that ERα function may be recapitulated by other NRs in ERα-negative breast cancer. In order to examine the inter-relationships between nuclear receptors, and to obtain evidence for previously unsuspected roles for any NRs, we undertook quantitative RT-PCR and bioinformatics analysis to examine their expression in breast cancer. While most NRs were expressed, bioinformatic analyses differentiated tumours into distinct prognostic groups that were validated by analyzing public microarray data sets. Although ERα and progesterone receptor were dominant in distinguishing prognostic groups, other NRs strengthened these groups. Clustering analysis identified several family members with potential importance in breast cancer. Specifically, RORγ is identified as being co-expressed with ERα, whilst several NRs are preferentially expressed in ERα-negative disease, with TLX expression being prognostic in this subtype. Functional studies demonstrated the importance of TLX in regulating growth and invasion in ERα-negative breast cancer cells. PMID:26280373
Cer, Regina Z; Herrera-Galeano, J Enrique; Anderson, Joseph J; Bishop-Lilly, Kimberly A; Mokashi, Vishwesh P
2014-01-01
Understanding the biological roles of microRNAs (miRNAs) is an active area of research that has produced a surge of publications in PubMed, particularly in cancer research. Along with this increasing interest, many open-source bioinformatics tools to identify existing and/or discover novel miRNAs in next-generation sequencing (NGS) reads have become available. While miRNA identification and discovery tools have significantly improved, the development of miRNA differential expression analysis tools, especially for temporal studies, remains substantially challenging. Further, the installation of currently available software is non-trivial, and the steps of testing with example datasets, trying with one's own dataset, and interpreting the results require notable expertise and time. Consequently, there is a strong need for a tool that allows scientists to normalize raw data, perform statistical analyses, and obtain intuitive results without having to invest significant effort. We have developed miRNA Temporal Analyzer (mirnaTA), a bioinformatics package to identify differentially expressed miRNAs in temporal studies. mirnaTA is written in Perl and R (Version 2.13.0 or later) and can be run across multiple platforms, such as Linux, Mac and Windows. In the current version, mirnaTA requires users to provide a simple, tab-delimited matrix file containing miRNA names and count data from a minimum of two to a maximum of 20 time points and three replicates. To recalibrate data and remove technical variability, raw data are normalized using Normal Quantile Transformation (NQT), and a linear regression model is used to locate any miRNAs that are differentially expressed in a linear pattern. Subsequently, the remaining miRNAs that do not fit a linear model are further analyzed by one of two non-linear methods: 1) cumulative distribution function (CDF) or 2) analysis of variance (ANOVA). 
After both linear and non-linear analyses are completed, statistically significant miRNAs (P < 0.05) are plotted as heat maps using hierarchical cluster analysis and Euclidean distance matrix computation methods. mirnaTA is an open-source, bioinformatics tool to aid scientists in identifying differentially expressed miRNAs which could be further mined for biological significance. It is expected to provide researchers with a means of interpreting raw data to statistical summaries in a fast and intuitive manner.
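The normalization and linear-trend steps described in the mirnaTA abstract can be illustrated with a minimal Python sketch. This is not mirnaTA's code (the package itself is written in Perl and R); the function names `normal_quantile_transform` and `linear_fit` are hypothetical, the rank-based mapping `(rank - 0.5) / n` is one common convention assumed here, and the fit reports R² rather than the P value the tool uses.

```python
from statistics import NormalDist


def normal_quantile_transform(values):
    """Replace each value by the standard-normal quantile of its rank.

    Assumption: ranks are mapped to probabilities with (rank - 0.5) / n;
    ties are broken by input order.
    """
    n = len(values)
    order = sorted(range(n), key=lambda i: values[i])
    transformed = [0.0] * n
    for rank, idx in enumerate(order, start=1):
        transformed[idx] = NormalDist().inv_cdf((rank - 0.5) / n)
    return transformed


def linear_fit(x, y):
    """Ordinary least squares of y on x; returns (slope, intercept, r_squared)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    syy = sum((yi - mean_y) ** 2 for yi in y)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    r_squared = (sxy ** 2) / (sxx * syy) if syy > 0 else 0.0
    return slope, intercept, r_squared


# Hypothetical example: one miRNA measured at four time points.
time_points = [0, 1, 2, 3]
expression = [1.0, 3.0, 5.0, 7.0]
slope, intercept, r2 = linear_fit(time_points, expression)
```

A miRNA whose normalized profile gives a fit with a clearly non-zero slope would be a candidate for the linear category; the remaining miRNAs would go on to the non-linear analyses named in the abstract.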
Estrogen Receptor Alpha (ESR1)-Dependent Regulation of the Mouse Oviductal Transcriptome.
Cerny, Katheryn L; Ribeiro, Rosanne A C; Jeoung, Myoungkun; Ko, CheMyong; Bridges, Phillip J
2016-01-01
Estrogen receptor-α (ESR1) is an important transcriptional regulator in the mammalian oviduct; however, ESR1-dependent regulation of the transcriptome of this organ is not well defined, especially at the genomic level. The objective of this study was therefore to investigate estradiol- and ESR1-dependent regulation of the transcriptome of the oviduct using transgenic mice, both with (ESR1KO) and without (wild-type, WT) a global deletion of ESR1. Oviducts were collected from ESR1KO and WT littermates at 23 days of age, or ESR1KO and WT mice were treated with 5 IU PMSG to stimulate follicular development and the production of ovarian estradiol, and the oviducts collected 48 h later. RNA extracted from whole oviducts was hybridized to Affymetrix Genechip Mouse Genome 430-2.0 arrays (n = 3 arrays per genotype and treatment) or reverse transcribed to cDNA for analysis of the expression of selected mRNAs by real-time PCR. Following microarray analysis, a statistical two-way ANOVA and pairwise comparison (LSD test) revealed 2428 differentially expressed transcripts (DEGs, P < 0.01). Genotype affected the expression of 2215 genes, treatment (PMSG) affected the expression of 465 genes, and genotype x treatment affected the expression of 438 genes. With the goal of determining estradiol/ESR1-regulated function, gene ontology (GO) and bioinformatic pathway analyses were performed on DEGs in the oviducts of PMSG-treated ESR1KO versus PMSG-treated WT mice. Significantly enriched GO molecular function categories included binding and catalytic activity. Significantly enriched GO cellular component categories indicated the extracellular region. Significantly enriched GO biological process categories involved a single organism, modulation of a measurable attribute and developmental processes. Bioinformatic analysis revealed ESR1-regulation of the immune response within the oviduct as the primary canonical pathway. 
In summary, a transcriptomal profile of estradiol- and ESR1-regulated gene expression and related bioinformatic analysis is presented to increase our understanding of how estradiol/ESR1 affects function of the oviduct, and to identify genes that may be proven as important regulators of fertility in the future.
Wu, Jie; Li, Lian; Sun, Yu; Huang, Shuai; Tang, Juan; Yu, Pan; Wang, Genlin
2015-01-01
Toll-like receptor 4 (TLR4) mediated activation of the nuclear transcription factor κB (NF-κB) signaling pathway by mastitis initiates expression of genes associated with inflammation and the innate immune response. In this study, the profile of mastitis-induced differential gene expression in the mammary tissue of Chinese Holstein cattle was investigated by Gene-Chip microarray and bioinformatics. The microarray results revealed that 79 genes associated with the TLR4/NF-κB signaling pathway were differentially expressed. Of these genes, 19 were up-regulated and 29 were down-regulated in mastitis tissue compared to normal, healthy tissue. Statistical analysis of transcript and protein level expression changes indicated that 10 genes differed: TLR4, MyD88, IL-6, and IL-10 were up-regulated, while CD14, TNF-α, MD-2, IL-β, NF-κB, and IL-12 were significantly down-regulated in mastitis tissue in comparison with normal tissue. Analyses using bioinformatics database resources, such as the Kyoto Encyclopedia of Genes and Genomes (KEGG) for pathway analysis and the Gene Ontology Consortium (GO) for term enrichment analysis, suggested that these differentially expressed genes implicate different regulatory pathways for immune function in the mammary gland. In conclusion, our study provides new evidence for better understanding the differential expression and mechanisms of the TLR4/NF-κB signaling pathway in Chinese Holstein cattle with mastitis. PMID:25706977
Pao, Sheng-Ying; Lin, Win-Li; Hwang, Ming-Jing
2006-01-01
Background: Screening for differentially expressed genes on the genomic scale and comparative analysis of the expression profiles of orthologous genes between species to study gene function and regulation are becoming increasingly feasible. Expressed sequence tags (ESTs) are an excellent source of data for such studies using bioinformatic approaches because of the rich libraries and tremendous amount of data now available in the public domain. However, any large-scale EST-based bioinformatics analysis must deal with the heterogeneous, and often ambiguous, tissue and organ terms used to describe EST libraries. Results: To deal with the issue of tissue source, in this work, we carefully screened and organized more than 8 million human and mouse ESTs into 157 human and 108 mouse tissue/organ categories, to which we applied an established statistical test using different thresholds of the p value to identify genes differentially expressed in different tissues. Further analysis of the tissue distribution and level of expression of human and mouse orthologous genes showed that tissue-specific orthologs tended to have more similar expression patterns than those lacking significant tissue specificity. On the other hand, a number of orthologs were found to have significant disparity in their expression profiles, hinting at novel functions, divergent regulation, or new ortholog relationships. Conclusion: Comprehensive statistics on the tissue-specific expression of human and mouse genes were obtained in this very large-scale, EST-based analysis. These statistical results have been organized into a database, freely accessible at our website, for easy searching of human and mouse tissue-specific genes and for investigating gene expression profiles in the context of comparative genomics. 
Comparative analysis showed that, although highly tissue-specific genes tend to exhibit similar expression profiles in human and mouse, there are significant exceptions, indicating that orthologous genes, while sharing basic genomic properties, could result in distinct phenotypes. PMID:16626500
Jha, Prabhash Kumar; Vijay, Aatira; Sahu, Anita; Ashraf, Mohammad Zahid
2016-01-01
Thrombosis is a leading cause of morbidity and mortality in patients with myeloproliferative disorders (MPDs), particularly polycythemia vera (PV) and essential thrombocythemia (ET). Despite attempts to establish a link between them, the shared biological mechanisms are yet to be characterized. An integrated gene expression meta-analysis of five independent, publicly available microarray datasets of the three diseases was conducted to identify shared gene expression signatures and overlapping biological processes. Using the INMEX bioinformatic tool, based on combined Effect Size (ES) approaches, we identified a total of 1,157 differentially expressed genes (DEGs) (697 overexpressed and 460 underexpressed genes) shared between the three diseases. The EnrichR tool’s rich library was used for comprehensive functional enrichment and pathway analysis, which revealed “mRNA Splicing” and “SUMO E3 ligases SUMOylate target proteins” among the most enriched terms. Network-based meta-analysis identified MYC and FN1 as the most highly ranked hub genes. Our results reveal that alterations in biomarkers of the coagulation cascade such as F2R, PROS1, SELPLG and ITGB2 were common between the three diseases. Interestingly, the study has generated a novel database of candidate genetic markers, pathways and transcription factors shared between thrombosis and MPDs, which might aid in the development of prognostic therapeutic biomarkers. PMID:27892526
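As an illustration of what a combined effect size approach can look like, the sketch below pools per-study effect sizes for one gene with inverse-variance (fixed-effect) weights. This is an assumption chosen for simplicity and is not taken from the abstract, which does not state which effect size model was used; the function name `fixed_effect_pool` and the numbers are hypothetical.

```python
from math import sqrt


def fixed_effect_pool(effects, variances):
    """Inverse-variance weighted mean of per-study effect sizes.

    Returns (pooled_effect, standard_error, z_score).
    """
    if len(effects) != len(variances) or not effects:
        raise ValueError("need one variance per effect size")
    weights = [1.0 / v for v in variances]
    total_weight = sum(weights)
    pooled = sum(w * e for w, e in zip(weights, effects)) / total_weight
    standard_error = sqrt(1.0 / total_weight)
    return pooled, standard_error, pooled / standard_error


# Hypothetical standardized effect sizes for one gene in five studies.
effects = [0.8, 1.1, 0.6, 0.9, 1.0]
variances = [0.10, 0.20, 0.15, 0.12, 0.25]
pooled, se, z = fixed_effect_pool(effects, variances)
```

Studies with smaller variance contribute more to the pooled estimate, which is why a gene can reach significance in the combined analysis without being significant in every individual dataset.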
Molecular cloning and expression analysis of annexin A2 gene in sika deer antler tip.
Xia, Yanling; Qu, Haomiao; Lu, Binshan; Zhang, Qiang; Li, Heping
2018-04-01
Molecular cloning and bioinformatics analysis of the annexin A2 (ANXA2) gene in sika deer antler tip were conducted, and the role of the ANXA2 gene in the growth and development of the antler was initially analyzed. Reverse transcriptase polymerase chain reaction (RT-PCR) was used to clone the cDNA sequence of the ANXA2 gene from the antler tip of sika deer (Cervus nippon hortulorum), and bioinformatics methods were applied to analyze the amino acid sequence of the Anxa2 protein. The mRNA expression levels of the ANXA2 gene in different growth stages were examined by real-time reverse transcriptase polymerase chain reaction (real-time RT-PCR). Nucleotide sequence analysis revealed an open reading frame of 1,020 bp encoding a 339-amino-acid protein with a calculated molecular weight of 38.6 kDa and an isoelectric point of 6.09. Homologous sequence alignment and phylogenetic analysis indicated that the mature Anxa2 protein of sika deer had the closest genetic distance to those of Cervus elaphus and Bos mutus. Real-time RT-PCR results showed that the gene was differentially expressed across growth stages, with the highest expression of ANXA2 at metaphase (the rapid growing period). The ANXA2 gene may promote cell proliferation, and the findings suggest Anxa2 as an important candidate for regulating the growth and development of deer antler.
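The relationship between the reported open reading frame length and protein length can be checked with a short calculation. The sketch below assumes that the 1,020 bp figure includes the stop codon, which the abstract does not state explicitly; the function name `orf_protein_length` is hypothetical.

```python
def orf_protein_length(orf_bp, includes_stop=True):
    """Number of amino acids encoded by an ORF of the given length in bp.

    Assumption: if includes_stop is True, the final codon is a stop codon
    and does not contribute an amino acid.
    """
    if orf_bp <= 0 or orf_bp % 3 != 0:
        raise ValueError("ORF length must be a positive multiple of 3")
    codons = orf_bp // 3
    return codons - 1 if includes_stop else codons


# 1,020 bp is 340 codons; minus one stop codon gives 339 amino acids.
length = orf_protein_length(1020)
```

Under that assumption the two numbers in the abstract are consistent with each other.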
Sitras, V; Fenton, C; Acharya, G
2015-02-01
Cardiovascular disease (CVD) and preeclampsia (PE) share common clinical features. We aimed to identify common transcriptomic signatures involved in CVD and PE in humans. We performed a meta-analysis of individual raw microarray data deposited in GEO, obtained from blood samples of patients with CVD versus controls and from placental samples of women with PE versus healthy women with uncomplicated pregnancies. Annotation of case versus control samples was taken directly from the microarray documentation. Genes that showed significant differential expression in the majority of experiments were selected for subsequent analysis. Hypergeometric gene list analysis was performed using the Bioconductor GOstats package. Bioinformatic analysis was performed in PANTHER. Seven studies in CVD and 5 studies in PE were eligible for meta-analysis. A total of 181 genes were found to be differentially expressed in microarray studies investigating gene expression in blood samples obtained from patients with CVD compared to controls, and 925 genes were differentially expressed between preeclamptic and healthy placentas. Among these differentially expressed genes, 22 were common between CVD and PE. Bioinformatic analysis of these genes revealed oxidative stress, p53 pathway feedback, inflammation mediated by chemokines and cytokines, interleukin signaling, B-cell activation, PDGF signaling, Wnt signaling, integrin signaling and Alzheimer disease pathways to be involved in the pathophysiology of both CVD and PE. Metabolism, development, response to stimulus, immune response and cell communication were the associated biologic processes in both conditions. Gene set enrichment analysis showed the following overlapping pathways between CVD and PE: TGF-β signaling, apoptosis, graft-versus-host disease, allograft rejection, chemokine signaling, steroid hormone synthesis, type I and II diabetes mellitus, VEGF signaling, pathways in cancer, GNRH signaling, Huntington's disease and Notch signaling. 
CVD and PE share common traits in their gene expression profiles, indicating common pathways in their pathophysiology. Copyright © 2014 Elsevier Ltd. All rights reserved.
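A hypergeometric gene list analysis of the kind named in the abstract above asks how likely it is to see at least k annotated genes in a list of n genes drawn from a universe of N genes, K of which carry the annotation. The sketch below is a plain Python illustration of that test, not the GOstats implementation (GOstats is an R package); the function name `hypergeom_upper_tail` and the example numbers are hypothetical.

```python
from math import comb


def hypergeom_upper_tail(k, universe, annotated, drawn):
    """P(X >= k) for a hypergeometric draw without replacement.

    universe:  total number of genes (N)
    annotated: genes in the universe carrying the term (K)
    drawn:     size of the gene list being tested (n)
    """
    if not (0 <= annotated <= universe and 0 <= drawn <= universe):
        raise ValueError("inconsistent set sizes")
    upper = min(annotated, drawn)
    favourable = sum(
        comb(annotated, i) * comb(universe - annotated, drawn - i)
        for i in range(max(k, 0), upper + 1)
    )
    return favourable / comb(universe, drawn)


# Hypothetical numbers: 20,000 genes, 300 annotated to a term,
# a list of 22 genes of which 4 carry the term.
p_value = hypergeom_upper_tail(4, 20000, 300, 22)
```

In practice the resulting p values are computed for many terms at once and would normally be corrected for multiple testing.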
USDA-ARS's Scientific Manuscript database
One important mechanism plants use to cope with salinity is keeping the cytosolic Na+ concentration low by sequestering Na+ in vacuoles, a process facilitated by Na+/H+ exchangers (NHX). There are eight NHX genes (NHX1 through NHX8) identified and characterized in Arabidopsis. Bioinformatic analysis...
Content of intrinsic disorder influences the outcome of cell-free protein synthesis.
Tokmakov, Alexander A; Kurotani, Atsushi; Ikeda, Mariko; Terazawa, Yumiko; Shirouzu, Mikako; Stefanov, Vasily; Sakurai, Tetsuya; Yokoyama, Shigeyuki
2015-09-11
Cell-free protein synthesis is used to produce proteins with various structural traits. Recent bioinformatics analyses indicate that more than half of eukaryotic proteins possess long intrinsically disordered regions. However, no systematic study concerning the connection between intrinsic disorder and expression success of cell-free protein synthesis has been presented until now. To address this issue, we examined correlations of the experimentally observed cell-free protein expression yields with the contents of intrinsic disorder bioinformatically predicted in the expressed sequences. This analysis revealed strong relationships between intrinsic disorder and protein amenability to heterologous cell-free expression. On the one hand, elevated disorder content was associated with the increased ratio of soluble expression. On the other hand, overall propensity for detectable protein expression decreased with disorder content. We further demonstrated that these tendencies are rooted in some distinct features of intrinsically disordered regions, such as low hydrophobicity, elevated surface accessibility and high abundance of sequence motifs for proteolytic degradation, including sites of ubiquitination and PEST sequences. Our findings suggest that identification of intrinsically disordered regions in the expressed amino acid sequences can be of practical use for predicting expression success and optimizing cell-free protein synthesis.
Huang, Lei; Zhao, Shuangping; Frasor, Jonna M.; Dai, Yang
2011-01-01
Approximately half of estrogen receptor (ER) positive breast tumors will fail to respond to endocrine therapy. Here we used an integrative bioinformatics approach to analyze three gene expression profiling data sets from breast tumors in an attempt to uncover underlying mechanisms contributing to the development of resistance and potential therapeutic strategies to counteract these mechanisms. Genes that are differentially expressed in tamoxifen resistant vs. sensitive breast tumors were identified from three different publicly available microarray datasets. These differentially expressed (DE) genes were analyzed using gene function and gene set enrichment and examined in intrinsic subtypes of breast tumors. The Connectivity Map analysis was utilized to link gene expression profiles of tamoxifen resistant tumors to small molecules and validation studies were carried out in a tamoxifen resistant cell line. Despite little overlap in genes that are differentially expressed in tamoxifen resistant vs. sensitive tumors, a high degree of functional similarity was observed among the three datasets. Tamoxifen resistant tumors displayed enriched expression of genes related to cell cycle and proliferation, as well as elevated activity of E2F transcription factors, and were highly correlated with a Luminal intrinsic subtype. A number of small molecules, including phenothiazines, were found that induced a gene signature in breast cancer cell lines opposite to that found in tamoxifen resistant vs. sensitive tumors and the ability of phenothiazines to down-regulate cyclin E2 and inhibit proliferation of tamoxifen resistant breast cancer cells was validated. Our findings demonstrate that an integrated bioinformatics approach to analyze gene expression profiles from multiple breast tumor datasets can identify important biological pathways and potentially novel therapeutic options for tamoxifen-resistant breast cancers. PMID:21789246
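The idea of a compound signature that is "opposite" to a disease signature can be illustrated with a deliberately simplified sketch. This is not the Connectivity Map scoring scheme; it is a plain sign-comparison score, and the function name, gene labels, and compound labels are hypothetical placeholders.

```python
def reversal_score(disease_signature, drug_signature):
    """Fraction of shared genes whose direction of change is opposite.

    Each signature maps a gene label to +1 (up-regulated) or -1 (down-regulated).
    Returns 0.0 when the two signatures share no genes.
    """
    shared = disease_signature.keys() & drug_signature.keys()
    if not shared:
        return 0.0
    opposite = sum(
        1 for gene in shared if disease_signature[gene] * drug_signature[gene] < 0
    )
    return opposite / len(shared)


# Toy inputs (assumed for illustration only, not data from the study).
resistant = {"G1": 1, "G2": 1, "G3": -1, "G4": 1}
compound_x = {"G1": -1, "G2": -1, "G3": 1, "G5": 1}
compound_y = {"G1": 1, "G2": -1, "G4": 1}

# Rank compounds so that the strongest reversal comes first.
ranking = sorted(
    {"compound_x": compound_x, "compound_y": compound_y}.items(),
    key=lambda item: reversal_score(resistant, item[1]),
    reverse=True,
)
```

In this toy ranking, a compound whose shared genes all change in the opposite direction scores 1.0 and would be prioritized for follow-up.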
Melendez, Roberto I.; McGinty, Jacqueline F.; Kalivas, Peter W.; Becker, Howard C.
2014-01-01
Neuroadaptations that participate in the ontogeny of alcohol dependence are likely a result of altered gene expression in various brain regions. The present study investigated brain region-specific changes in the pattern and magnitude of gene expression immediately following chronic intermittent ethanol (CIE) exposure and 8 hours following final ethanol exposure [i.e. early withdrawal (EWD)]. High-density oligonucleotide microarrays (Affymetrix 430A 2.0, Affymetrix, Santa Clara, CA, USA) and bioinformatics analysis were used to characterize gene expression and function in the prefrontal cortex (PFC), hippocampus (HPC) and nucleus accumbens (NAc) of C57BL/6J mice (Jackson Laboratories, Bar Harbor, ME, USA). Gene expression levels were determined using gene chip robust multi-array average followed by statistical analysis of microarrays and validated by quantitative real-time reverse transcription polymerase chain reaction and Western blot analysis. Results indicated that immediately following CIE exposure, changes in gene expression were strikingly greater in the PFC (284 genes) compared with the HPC (16 genes) and NAc (32 genes). Bioinformatics analysis revealed that most of the transcriptionally responsive genes in the PFC were involved in Ras/MAPK signaling, notch signaling or ubiquitination. In contrast, during EWD, changes in gene expression were greatest in the HPC (139 genes) compared with the PFC (four genes) and NAc (eight genes). The most transcriptionally responsive genes in the HPC were involved in mRNA processing or actin dynamics. Of the few genes detected in the NAc, the most represented were involved in circadian rhythms. Overall, these findings indicate that brain region-specific and time-dependent neuroadaptive alterations in gene expression play an integral role in the development of alcohol dependence and withdrawal. PMID:21812870
Dueholm, Morten S; Søndergaard, Mads T; Nilsson, Martin; Christiansen, Gunna; Stensballe, Allan; Overgaard, Michael T; Givskov, Michael; Tolker-Nielsen, Tim; Otzen, Daniel E; Nielsen, Per H
2013-01-01
The fap operon, encoding functional amyloids in Pseudomonas (Fap), is present in most pseudomonads, but so far the expression and importance for biofilm formation has only been investigated for P. fluorescens strain UK4. In this study, we demonstrate the capacity of P. aeruginosa PAO1, P. fluorescens Pf-5, and P. putida F1 to express Fap fibrils, and investigated the effect of Fap expression on aggregation and biofilm formation. The fap operon in all three Pseudomonas species conferred the ability to express Fap fibrils as shown using a recombinant approach. This Fap overexpression consistently resulted in highly aggregative phenotypes and in increased biofilm formation. Detailed biophysical investigations of purified fibrils confirmed FapC as the main fibril monomer and supported the role of FapB as a minor, nucleating constituent as also indicated by bioinformatic analysis. Bioinformatics analysis suggested FapF and FapD as a potential β-barrel membrane pore and protease, respectively. Manipulation of the fap operon showed that FapA affects monomer composition of the final amyloid fibril, and that FapB is an amyloid protein, probably a nucleator for FapC polymerization. Our study highlights the fap operon as a molecular machine for functional amyloid formation. PMID:23504942
Tapocik, Jenica D.; Solomon, Matthew; Flanigan, Meghan; Meinhardt, Marcus; Barbier, Estelle; Schank, Jesse; Schwandt, Melanie; Sommer, Wolfgang H.; Heilig, Markus
2012-01-01
Long-term changes in brain gene expression have been identified in alcohol dependence, but underlying mechanisms remain unknown. Here, we examined the potential role of microRNAs for persistent gene expression changes in the rat medial prefrontal cortex after a history of alcohol dependence. Two-bottle free-choice alcohol consumption increased following 7-week exposure to intermittent alcohol intoxication. A bioinformatic approach using microarray analysis, qPCR, bioinformatic analysis, and microRNA-mRNA integrative analysis identified expression patterns indicative of a disruption in synaptic processes and neuroplasticity. 41 rat-microRNAs and 165 mRNAs in the medial prefrontal cortex were significantly altered after chronic alcohol exposure. A subset of the microRNAs and mRNAs was confirmed by qPCR. Gene ontology categories of differential expression pointed to functional processes commonly associated with neurotransmission, neuroadaptation, and synaptic plasticity. microRNA-mRNA expression pairing identified 33 microRNAs putatively targeting 89 mRNAs suggesting transcriptional networks involved in axonal guidance and neurotransmitter signaling. Our results demonstrate a significant shift in microRNA expression patterns in the medial prefrontal cortex following a history of dependence. Due to their global regulation of multiple downstream target transcripts, microRNAs may play a pivotal role in the reorganization of synaptic connections and long term neuroadaptations in alcohol dependence. microRNA-mediated alterations of transcriptional networks may be involved in disrupted prefrontal control over alcohol-drinking observed in alcoholic patients. PMID:22614244
High-throughput protein analysis integrating bioinformatics and experimental assays
del Val, Coral; Mehrle, Alexander; Falkenhahn, Mechthild; Seiler, Markus; Glatting, Karl-Heinz; Poustka, Annemarie; Suhai, Sandor; Wiemann, Stefan
2004-01-01
The wealth of transcript information that has been made publicly available in recent years requires the development of high-throughput functional genomics and proteomics approaches for its analysis. Such approaches need suitable data integration procedures and a high level of automation in order to gain maximum benefit from the results generated. We have designed an automatic pipeline to analyse annotated open reading frames (ORFs) stemming from full-length cDNAs produced mainly by the German cDNA Consortium. The ORFs are cloned into expression vectors for use in large-scale assays such as the determination of subcellular protein localization or kinase reaction specificity. Additionally, all identified ORFs undergo exhaustive bioinformatic analysis such as similarity searches, protein domain architecture determination and prediction of physicochemical characteristics and secondary structure, using a wide variety of bioinformatic methods in combination with the most up-to-date public databases (e.g. PRINTS, BLOCKS, INTERPRO, PROSITE, SWISSPROT). Data from experimental results and from the bioinformatic analysis are integrated and stored in a relational database (MS SQL-Server), which makes it possible for researchers to find answers to biological questions easily, thereby speeding up the selection of targets for further analysis. The designed pipeline constitutes a new automatic approach to obtaining and administrating relevant biological data from high-throughput investigations of cDNAs in order to systematically identify and characterize novel genes, as well as to comprehensively describe the function of the encoded proteins. PMID:14762202
Bioinformatics of cardiovascular miRNA biology.
Kunz, Meik; Xiao, Ke; Liang, Chunguang; Viereck, Janika; Pachel, Christina; Frantz, Stefan; Thum, Thomas; Dandekar, Thomas
2015-12-01
MicroRNAs (miRNAs) are small ~22 nucleotide non-coding RNAs and are highly conserved among species. Moreover, miRNAs regulate gene expression of a large number of genes associated with important biological functions and signaling pathways. Recently, several miRNAs have been found to be associated with cardiovascular diseases. Thus, investigating the complex regulatory effect of miRNAs may lead to a better understanding of their functional role in the heart. To achieve this, bioinformatics approaches have to be coupled with validation and screening experiments to understand the complex interactions of miRNAs with the genome. This will boost the subsequent development of diagnostic markers and our understanding of the physiological and therapeutic role of miRNAs in cardiac remodeling. In this review, we focus on and explain different bioinformatics strategies and algorithms for the identification and analysis of miRNAs and their regulatory elements to better understand cardiac miRNA biology. Starting with the biogenesis of miRNAs, we present approaches such as LocARNA and miRBase for combining sequence and structure analysis including phylogenetic comparisons as well as detailed analysis of RNA folding patterns, functional target prediction, signaling pathway as well as functional analysis. We also show how far bioinformatics helps to tackle the unprecedented level of complexity and systemic effects by miRNA, underlining the strong therapeutic potential of miRNA and miRNA target structures in cardiovascular disease. In addition, we discuss drawbacks and limitations of bioinformatics algorithms and the necessity of experimental approaches for miRNA target identification. This article is part of a Special Issue entitled 'Non-coding RNAs'.
Zhang, Yu; Mo, Wei-Jia; Wang, Xiao; Zhang, Tong-Tong; Qin, Yuan; Wang, Han-Lin; Chen, Gang; Wei, Dan-Ming; Dang, Yi-Wu
2018-05-02
The long non‑coding RNA (lncRNA) PVT1 plays vital roles in the tumorigenesis and development of various types of cancer. However, the potential expression profiling, functions and pathways of PVT1 in HCC remain unknown. PVT1 was knocked down in SMMC‑7721 cells, and a miRNA microarray analysis was performed to detect the differentially expressed miRNAs. Twelve target prediction algorithms were used to predict the underlying targets of these differentially expressed miRNAs. Bioinformatics analysis was performed to explore the underlying functions, pathways and networks of the targeted genes. Furthermore, the relationship between PVT1 and the clinical parameters in HCC was confirmed based on the original data in the TCGA database. Among the differentially expressed miRNAs, the top two upregulated and downregulated miRNAs were selected for further analysis based on the false discovery rate (FDR), fold‑change (FC) and P‑values. Based on the TCGA database, PVT1 was obviously highly expressed in HCC, and a statistically higher PVT1 expression was found for sex (male), ethnicity (Asian) and pathological grade (G3+G4) compared to the control groups (P<0.05). Furthermore, Gene Ontology (GO) analysis revealed that the target genes were involved in complex cellular pathways, such as the macromolecule biosynthetic process, compound metabolic process, and transcription. Kyoto Encyclopedia of Genes and Genomes (KEGG) analysis revealed that the MAPK and Wnt signaling pathways may be correlated with the regulation of the four candidate miRNAs. The results therefore provide significant information on the differentially expressed miRNAs associated with PVT1 in HCC, and we hypothesized that PVT1 may play vital roles in HCC by regulating different miRNAs or target gene expression (particularly MAPK8) via the MAPK or Wnt signaling pathways. Thus, further investigation of the molecular mechanism of PVT1 in HCC is needed.
A Bioinformatics Facility for NASA
NASA Technical Reports Server (NTRS)
Schweighofer, Karl; Pohorille, Andrew
2006-01-01
Building on an existing prototype, we have fielded a facility with bioinformatics technologies that will help NASA meet its unique requirements for biological research. This facility consists of a cluster of computers capable of performing computationally intensive tasks, software tools, databases and knowledge management systems. Novel computational technologies for analyzing and integrating new biological data and already existing knowledge have been developed. With continued development and support, the facility will fulfill NASA's strategic bioinformatics needs in astrobiology and space exploration. As a demonstration of these capabilities, we will present a detailed analysis of how spaceflight factors impact gene expression in the liver and kidney for mice flown aboard shuttle flight STS-108. We have found that many genes involved in signal transduction, cell cycle, and development respond to changes in microgravity, but that most metabolic pathways appear unchanged.
Bioinformatics approaches to single-cell analysis in developmental biology.
Yalcin, Dicle; Hakguder, Zeynep M; Otu, Hasan H
2016-03-01
Individual cells within the same population show various degrees of heterogeneity, which may be better handled with single-cell analysis to address biological and clinical questions. Single-cell analysis is especially important in developmental biology as subtle spatial and temporal differences in cells have significant associations with cell fate decisions during differentiation and with the description of a particular state of a cell exhibiting an aberrant phenotype. Biotechnological advances, especially in the area of microfluidics, have led to a robust, massively parallel and multi-dimensional capturing, sorting, and lysis of single-cells and amplification of related macromolecules, which have enabled the use of imaging and omics techniques on single cells. There have been improvements in computational single-cell image analysis in developmental biology regarding feature extraction, segmentation, image enhancement and machine learning, handling limitations of optical resolution to gain new perspectives from the raw microscopy images. Omics approaches, such as transcriptomics, genomics and epigenomics, targeting gene and small RNA expression, single nucleotide and structural variations and methylation and histone modifications, rely heavily on high-throughput sequencing technologies. Although there are well-established bioinformatics methods for analysis of sequence data, there are limited bioinformatics approaches which address experimental design, sample size considerations, amplification bias, normalization, differential expression, coverage, clustering and classification issues, specifically applied at the single-cell level. In this review, we summarize biological and technological advancements, discuss challenges faced in the aforementioned data acquisition and analysis issues and present future prospects for application of single-cell analyses to developmental biology.
A new paradigm for transcription factor TFIIB functionality
Gelev, Vladimir; Zabolotny, Janice M.; Lange, Martin; Hiromura, Makoto; Yoo, Sang Wook; Orlando, Joseph S.; Kushnir, Anna; Horikoshi, Nobuo; Paquet, Eric; Bachvarov, Dimcho; Schaffer, Priscilla A.; Usheva, Anny
2014-01-01
Experimental and bioinformatic studies of transcription initiation by RNA polymerase II (RNAP2) have revealed a mechanism of RNAP2 transcription initiation less uniform across gene promoters than initially thought. However, the general transcription factor TFIIB is presumed to be universally required for RNAP2 transcription initiation. Based on bioinformatic analysis of data and effects of TFIIB knockdown in primary and transformed cell lines on cellular functionality and global gene expression, we report that TFIIB is dispensable for transcription of many human promoters, but is essential for herpes simplex virus-1 (HSV-1) gene transcription and replication. We report a novel cell cycle TFIIB regulation and localization of the acetylated TFIIB variant on the transcriptionally silent mitotic chromatids. Taken together, these results establish a new paradigm for TFIIB functionality in human gene expression, which when downregulated has potent anti-viral effects. PMID:24441171
Zhang, Ying; Zhang, Wei; Li, Xinglan; Li, Dapeng; Zhang, Xiaoling; Yin, Yajie; Deng, Xiangyun; Sheng, Xiugui
2016-06-01
Endometrial cancer (EC) is the most prevalent malignancy worldwide. Although several efforts have been made to explore the molecular mechanisms responsible for EC progression, they are still not fully understood. This study aimed to evaluate the clinical characteristics and prognostic factors of patients with EC, and to search for novel genes associated with EC progression. We recruited 328 patients with EC and analyzed prognostic factors using a Cox proportional hazards regression model. Further, a gene expression profile of EC was used to identify the differentially expressed genes (DEGs) between normal samples and tumor samples. Subsequently, Kyoto Encyclopedia of Genes and Genomes pathway enrichment analysis (http://www.genome.jp/kegg/) was performed for the DEGs, and a protein-protein interaction (PPI) network of the DEGs, as well as PPI subnetworks, was constructed with the MCODE plug-in by mapping the DEGs into the Search Tool for the Retrieval of Interacting Genes database. Our results showed that body mass index (BMI), hypertension, myometrial invasion, pathological type, and Glut4-positive expression were prognostic factors in EC (P < 0.05). Bioinformatics analysis showed that upregulated DEGs were associated with the cell cycle, and downregulated DEGs were related to the MAPK pathway. Meanwhile, PPI network analysis revealed that upregulated CDK1 and CCNA2 as well as downregulated JUN and FOS were listed as the top two nodes with high degrees. Patients with EC should be given more focused attention with respect to pathological type, BMI, hypertension, and Glut4-positive expression. In addition, CDK1, CCNA2, JUN, and FOS might play important roles in EC development.
Zhao, Xiaoqin; Rong, Can; Pan, Fenghui; Xiang, Lizhi; Wang, Xinlei; Hu, Yun
2018-06-28
Increasing evidence indicates that long noncoding RNAs (lncRNAs) perform special biological functions by regulating gene expression through multiple pathways and molecular mechanisms. The aim of this study was to explore the expression characteristics of lncRNA uc.322 in pancreatic islet cells and its effects on the secretion function of islet cells. Bioinformatics analysis was used to detect the lncRNA uc.322 sequence, location, and structural features. Expression of lncRNA uc.322 in different tissues was detected by quantitative polymerase chain reaction analyses. Quantitative polymerase chain reaction, Western blot analysis, adenosine triphosphate determination, glucose-stimulated insulin secretion, and enzyme-linked immunosorbent assay were used to evaluate the effects of lncRNA uc.322 on insulin secretion. The results showed that the full length of lncRNA uc.322 is 224 bp and that it is highly conserved in various species. Bioinformatics analysis revealed that lncRNA uc.322 is located on chr7:122893196-122893419 (GRCh37/hg19) within the SRY-related HMG-box 6 gene exon region. Compared with other tissues, lncRNA uc.322 is highly expressed in pancreatic tissue. Upregulation of lncRNA uc.322 expression increases the expression of the insulin transcription factors pancreatic and duodenal homeobox 1 and Forkhead box O1, promotes insulin secretion in the extracellular fluid of Min6 cells, and increases the adenosine triphosphate concentration. On the other hand, knockdown of lncRNA uc.322 has opposite effects on Min6 cells. Overall, this study showed that upregulation of lncRNA uc.322 in islet β-cells can increase the expression of insulin transcription factors and promote insulin secretion, and it may be a new therapeutic target for diabetes.
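The reported 224 bp length can be checked against the quoted coordinates. A minimal sketch, assuming the position string is 1-based and inclusive (the usual convention for genome-browser position strings); the function name is hypothetical.

```python
def interval_length(start: int, end: int) -> int:
    """Length in bp of a 1-based, inclusive genomic interval."""
    if end < start:
        raise ValueError("end must not be smaller than start")
    return end - start + 1


# Coordinates quoted in the abstract (chr7, GRCh37/hg19).
uc322_length = interval_length(122893196, 122893419)
print(uc322_length)  # 224
```

Under a 0-based, half-open convention (as used in BED files) the same numbers would give 223 bp, so the agreement with the stated length depends on the inclusive reading assumed here.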
Nesprin-2 epsilon: A novel nesprin isoform expressed in human ovary and Ntera-2 cells
DOE Office of Scientific and Technical Information (OSTI.GOV)
Lam, Le Thanh; Boehm, Sabrina V.; Roberts, Roland G.
2011-08-26
Highlights: A novel epsilon isoform of nesprin-2 has been discovered. This 120 kDa protein was predicted by bioinformatic analysis, but has not previously been observed. It is the main isoform expressed in a teratocarcinoma cell line and is also found in ovary. Like other nesprins, it is located at the nuclear envelope. We suggest it may have a role in very early development or in some ovary-specific function. Abstract: The nuclear envelope-associated cytoskeletal protein, nesprin-2, is encoded by a large gene containing several internal promoters that produce shorter isoforms. In a study of Ntera-2 teratocarcinoma cells, a novel isoform, nesprin-2-epsilon, was found to be the major mRNA and protein product of the nesprin-2 gene. Its existence was predicted by bioinformatic analysis, but this is the first direct demonstration of both the mRNA and the 120 kDa protein which is located at the nuclear envelope. In a panel of 21 adult and foetal human tissues, the nesprin-2-epsilon mRNA was strongly expressed in ovary but was a minor isoform elsewhere. The expression pattern suggests a possible link with very early development and a likely physiological role in ovary.
Tcof1-Related Molecular Networks in Treacher Collins Syndrome.
Dai, Jiewen; Si, Jiawen; Wang, Minjiao; Huang, Li; Fang, Bing; Shi, Jun; Wang, Xudong; Shen, Guofang
2016-09-01
Treacher Collins syndrome (TCS) is a rare, autosomal-dominant disorder characterized by craniofacial deformities, and is primarily caused by mutations in the Tcof1 gene. This article aimed to perform a comprehensive literature review and systematic bioinformatic analysis of Tcof1-related molecular networks in TCS. First, the up- and down-regulated genes in Tcof1 heterozygous haploinsufficient mutant mice embryos and Tcof1 knockdown and Tcof1 over-expressed neuroblastoma N1E-115 cells were obtained from the Gene Expression Omnibus database. The GeneDecks database was used to calculate the 500 genes most closely related to Tcof1. Then, the relationships between 4 gene sets (a predicted set and sets comparing the wildtype with the 3 Gene Expression Omnibus datasets) were analyzed using the DAVID, GeneMANIA and STRING databases. The analysis results showed that the Tcof1-related genes were enriched in various biological processes, including cell proliferation, apoptosis, cell cycle, differentiation, and migration. They were also enriched in several signaling pathways, such as the ribosome, p53, cell cycle, and WNT signaling pathways. Additionally, these genes clearly had direct or indirect interactions with Tcof1 and between each other. The findings of the literature review and bioinformatic analysis imply that special attention should be given to these pathways, as they may offer target points for TCS therapies.
Li, Bing; Shi, Xiao-Yu; Liao, Dai-Xiang; Cao, Bang-Rong; Luo, Cheng-Hua; Cheng, Shu-Jun
2015-01-01
There are still no absolute parameters predicting progression of adenoma into cancer. The present study aimed to characterize functional differences on the multistep carcinogenetic process from the adenoma-carcinoma sequence. All samples were collected and mRNA expression profiling was performed by using Agilent Microarray high-throughput gene-chip technology. Then, the characteristics of mRNA expression profiles of adenoma-carcinoma sequence were described with bioinformatics software, and we analyzed the relationship between gene expression profiles of adenoma-adenocarcinoma sequence and clinical prognosis of colorectal cancer. The mRNA expressions of adenoma-carcinoma sequence were significantly different between high-grade intraepithelial neoplasia group and adenocarcinoma group. The biological process of gene ontology function enrichment analysis on differentially expressed genes between high-grade intraepithelial neoplasia group and adenocarcinoma group showed that genes enriched in the extracellular structure organization, skeletal system development, biological adhesion and itself regulated growth regulation, with the P value after FDR correction of less than 0.05. In addition, IPR-related protein mainly focused on the insulin-like growth factor binding proteins. The variable trends of gene expression profiles for adenoma-carcinoma sequence were mainly concentrated in high-grade intraepithelial neoplasia and adenocarcinoma. The differentially expressed genes are significantly correlated between high-grade intraepithelial neoplasia group and adenocarcinoma group. Bioinformatics analysis is an effective way to study the gene expression profiles in the adenoma-carcinoma sequence, and may provide an effective tool to involve colorectal cancer research strategy into colorectal adenoma or advanced adenoma.
USDA-ARS's Scientific Manuscript database
RNA expression analysis was performed on the corpus luteum tissue at five time points after prostaglandin F2 alpha treatment of midcycle cows using an Affymetrix Bovine Gene v1 Array. The normalized linear microarray data was uploaded to the NCBI GEO repository (GSE94069). Subsequent statistical ana...
Partnering for functional genomics research conference: Abstracts of poster presentations
DOE Office of Scientific and Technical Information (OSTI.GOV)
NONE
1998-06-01
This report contains abstracts of poster presentations presented at the Functional Genomics Research Conference held April 16-17, 1998 in Oak Ridge, Tennessee. Attention is focused on the following areas: mouse mutagenesis and genomics; phenotype screening; gene expression analysis; DNA analysis technology development; bioinformatics; comparative analyses of mouse, human, and yeast sequences; and pilot projects to evaluate methodologies.
Matoušková, Petra; Hanousková, Barbora; Skálová, Lenka
2018-04-14
Glutathione peroxidases (GPxs) belong to the eight-member family of phylogenetically related enzymes with different cellular localization, but distinct antioxidant function. Several GPxs are important selenoproteins. Dysregulated GPx expression is connected with severe pathologies, including obesity and diabetes. We performed a comprehensive bioinformatic analysis using the programs miRDB, miRanda, TargetScan, and Diana in the search for hypothetical microRNAs targeting the 3' untranslated regions (3' UTRs) of GPxs. We cross-referenced the literature for possible intersections between our results and available reports on identified microRNAs, with a special focus on the microRNAs related to oxidative stress, obesity, and related pathologies. We identified many microRNAs with an association with oxidative stress and obesity as putative regulators of GPxs. In particular, miR-185-5p was predicted by a larger number of programs to target six GPxs and thus could play the role of their master regulator. This microRNA was altered by selenium deficiency and can play a role as a feedback control of selenoproteins' expression. Through the bioinformatics analysis we revealed the potential connection of microRNAs, GPxs, obesity, and other redox imbalance related diseases.
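Combining the output of several target-prediction programs amounts to counting how many programs support each microRNA-target pair. The following is a minimal sketch of that bookkeeping, not the authors' procedure: the predicted pairs are hypothetical placeholders, the microRNA labels are invented, and the threshold `min_tools` is an assumed parameter.

```python
from collections import defaultdict

# Hypothetical toy predictions: program name -> set of (microRNA, target gene) pairs.
# The pairs are illustrative placeholders, not results from the study.
predictions = {
    "miRDB": {("miR-A", "GPX1"), ("miR-B", "GPX2")},
    "miRanda": {("miR-A", "GPX1"), ("miR-A", "GPX3")},
    "TargetScan": {("miR-A", "GPX1"), ("miR-B", "GPX2")},
    "Diana": {("miR-A", "GPX3")},
}


def consensus_pairs(predictions, min_tools=2):
    """Return pairs predicted by at least `min_tools` programs, with their supporters."""
    support = defaultdict(set)
    for tool, pairs in predictions.items():
        for pair in pairs:
            support[pair].add(tool)
    return {
        pair: sorted(tools)
        for pair, tools in support.items()
        if len(tools) >= min_tools
    }


def targets_per_mirna(consensus):
    """Count the distinct consensus target genes of each microRNA."""
    genes = defaultdict(set)
    for mirna, gene in consensus:
        genes[mirna].add(gene)
    return {mirna: len(targets) for mirna, targets in genes.items()}


consensus = consensus_pairs(predictions, min_tools=2)
breadth = targets_per_mirna(consensus)
```

A microRNA with many consensus targets within one gene family is the kind of candidate the abstract describes as a possible master regulator.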
CellLineNavigator: a workbench for cancer cell line analysis
Krupp, Markus; Itzel, Timo; Maass, Thorsten; Hildebrandt, Andreas; Galle, Peter R.; Teufel, Andreas
2013-01-01
The CellLineNavigator database, freely available at http://www.medicalgenomics.org/celllinenavigator, is a web-based workbench for large scale comparisons of a large collection of diverse cell lines. It aims to support experimental design in the fields of genomics, systems biology and translational biomedical research. Currently, this compendium holds genome wide expression profiles of 317 different cancer cell lines, categorized into 57 different pathological states and 28 individual tissues. To enlarge the scope of CellLineNavigator, the database was furthermore closely linked to commonly used bioinformatics databases and knowledge repositories. To ensure easy data access and search ability, a simple data and an intuitive querying interface were implemented. It allows the user to explore and filter gene expression, focusing on pathological or physiological conditions. For a more complex search, the advanced query interface may be used to query for (i) differentially expressed genes; (ii) pathological or physiological conditions; or (iii) gene names or functional attributes, such as Kyoto Encyclopaedia of Genes and Genomes pathway maps. These queries may also be combined. Finally, CellLineNavigator allows additional advanced analysis of differentially regulated genes by a direct link to the Database for Annotation, Visualization and Integrated Discovery (DAVID) Bioinformatics Resources. PMID:23118487
He, Hailong; Mao, Lingzhou; Xu, Peng; Xi, Yanhai; Xu, Ning; Xue, Mingtao; Yu, Jiangming; Ye, Xiaojian
2014-01-10
Ossification of the posterior longitudinal ligament (OPLL) is a kind of disease with physical barriers and neurological disorders. The objective of this study was to explore the differentially expressed genes (DEGs) in OPLL patient ligament cells and identify the target sites for the prevention and treatment of OPLL in the clinic. Gene expression data GSE5464 was downloaded from Gene Expression Omnibus; then DEGs were screened with the limma package in the R language, and altered functions and pathways of OPLL cells compared to normal cells were identified by DAVID (The Database for Annotation, Visualization and Integrated Discovery); finally, an interaction network of DEGs was constructed using STRING. A total of 1536 DEGs were screened, with 31 down-regulated and 1505 up-regulated genes. The response to wounding function and the Toll-like receptor signaling pathway may be involved in the development of OPLL. Genes such as PDGFB and PRDX2 may be involved in OPLL through the response to wounding function. Genes enriched in the Toll-like receptor signaling pathway, such as TLR1, TLR5, and TLR7, may be involved in spinal cord injury in OPLL. PIK3R1 was the hub gene in the network of DEGs with the highest degree; INSR was one of the genes most closely related to it. OPLL-related genes screened by microarray gene expression profiling and bioinformatics analysis may be helpful for elucidating the mechanism of OPLL.
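Identifying a hub gene "with the highest degree" means counting the distinct interaction partners of each node. A minimal sketch under stated assumptions: the edge list is a toy example in which only the PIK3R1-INSR link is suggested by the abstract, and the GENE_* labels and the function name are hypothetical.

```python
# Toy undirected edge list; GENE_* nodes are placeholders, not the study's network.
edges = [
    ("PIK3R1", "INSR"),
    ("PIK3R1", "GENE_A"),
    ("PIK3R1", "GENE_B"),
    ("GENE_B", "GENE_C"),
    ("GENE_C", "GENE_D"),
    ("INSR", "PIK3R1"),  # duplicate of the first edge in reverse order
]


def node_degrees(edges):
    """Count distinct neighbours per node in an undirected network."""
    neighbours = {}
    for a, b in edges:
        if a == b:
            continue  # ignore self-loops
        neighbours.setdefault(a, set()).add(b)
        neighbours.setdefault(b, set()).add(a)
    return {node: len(adjacent) for node, adjacent in neighbours.items()}


degrees = node_degrees(edges)
hub = max(degrees, key=degrees.get)  # node with the most distinct partners
```

Storing neighbours in sets keeps duplicate or reversed edges from inflating the degree, which matters when interaction tables list each pair more than once.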
Dueholm, Morten S; Søndergaard, Mads T; Nilsson, Martin; Christiansen, Gunna; Stensballe, Allan; Overgaard, Michael T; Givskov, Michael; Tolker-Nielsen, Tim; Otzen, Daniel E; Nielsen, Per H
2013-06-01
The fap operon, encoding functional amyloids in Pseudomonas (Fap), is present in most pseudomonads, but so far the expression and importance for biofilm formation has only been investigated for P. fluorescens strain UK4. In this study, we demonstrate the capacity of P. aeruginosa PAO1, P. fluorescens Pf-5, and P. putida F1 to express Fap fibrils, and investigated the effect of Fap expression on aggregation and biofilm formation. The fap operon in all three Pseudomonas species conferred the ability to express Fap fibrils as shown using a recombinant approach. This Fap overexpression consistently resulted in highly aggregative phenotypes and in increased biofilm formation. Detailed biophysical investigations of purified fibrils confirmed FapC as the main fibril monomer and supported the role of FapB as a minor, nucleating constituent as also indicated by bioinformatic analysis. Bioinformatics analysis suggested FapF and FapD as a potential β-barrel membrane pore and protease, respectively. Manipulation of the fap operon showed that FapA affects monomer composition of the final amyloid fibril, and that FapB is an amyloid protein, probably a nucleator for FapC polymerization. Our study highlights the fap operon as a molecular machine for functional amyloid formation. © 2013 The Authors. Microbiology Open published by John Wiley & Sons Ltd.
RNA-Rocket: an RNA-Seq analysis resource for infectious disease research
Warren, Andrew S.; Aurrecoechea, Cristina; Brunk, Brian; Desai, Prerak; Emrich, Scott; Giraldo-Calderón, Gloria I.; Harb, Omar; Hix, Deborah; Lawson, Daniel; Machi, Dustin; Mao, Chunhong; McClelland, Michael; Nordberg, Eric; Shukla, Maulik; Vosshall, Leslie B.; Wattam, Alice R.; Will, Rebecca; Yoo, Hyun Seung; Sobral, Bruno
2015-01-01
Motivation: RNA-Seq is a method for profiling transcription using high-throughput sequencing and is an important component of many research projects that wish to study transcript isoforms, condition specific expression and transcriptional structure. The methods, tools and technologies used to perform RNA-Seq analysis continue to change, creating a bioinformatics challenge for researchers who wish to exploit these data. Resources that bring together genomic data, analysis tools, educational material and computational infrastructure can minimize the overhead required of life science researchers. Results: RNA-Rocket is a free service that provides access to RNA-Seq and ChIP-Seq analysis tools for studying infectious diseases. The site makes available thousands of pre-indexed genomes, their annotations and the ability to stream results to the bioinformatics resources VectorBase, EuPathDB and PATRIC. The site also provides a combination of experimental data and metadata, examples of pre-computed analysis, step-by-step guides and a user interface designed to enable both novice and experienced users of RNA-Seq data. Availability and implementation: RNA-Rocket is available at rnaseq.pathogenportal.org. Source code for this project can be found at github.com/cidvbi/PathogenPortal. Contact: anwarren@vt.edu Supplementary information: Supplementary materials are available at Bioinformatics online. PMID:25573919
RNA-Rocket: an RNA-Seq analysis resource for infectious disease research.
Warren, Andrew S; Aurrecoechea, Cristina; Brunk, Brian; Desai, Prerak; Emrich, Scott; Giraldo-Calderón, Gloria I; Harb, Omar; Hix, Deborah; Lawson, Daniel; Machi, Dustin; Mao, Chunhong; McClelland, Michael; Nordberg, Eric; Shukla, Maulik; Vosshall, Leslie B; Wattam, Alice R; Will, Rebecca; Yoo, Hyun Seung; Sobral, Bruno
2015-05-01
RNA-Seq is a method for profiling transcription using high-throughput sequencing and is an important component of many research projects that wish to study transcript isoforms, condition specific expression and transcriptional structure. The methods, tools and technologies used to perform RNA-Seq analysis continue to change, creating a bioinformatics challenge for researchers who wish to exploit these data. Resources that bring together genomic data, analysis tools, educational material and computational infrastructure can minimize the overhead required of life science researchers. RNA-Rocket is a free service that provides access to RNA-Seq and ChIP-Seq analysis tools for studying infectious diseases. The site makes available thousands of pre-indexed genomes, their annotations and the ability to stream results to the bioinformatics resources VectorBase, EuPathDB and PATRIC. The site also provides a combination of experimental data and metadata, examples of pre-computed analysis, step-by-step guides and a user interface designed to enable both novice and experienced users of RNA-Seq data. RNA-Rocket is available at rnaseq.pathogenportal.org. Source code for this project can be found at github.com/cidvbi/PathogenPortal. anwarren@vt.edu Supplementary materials are available at Bioinformatics online. © The Author 2015. Published by Oxford University Press.
Sequence analysis reveals genomic factors affecting EST-SSR primer performance and polymorphism
USDA-ARS's Scientific Manuscript database
Search for simple sequence repeat (SSR) motifs and design of flanking primers in expressed sequence tag (EST) sequences can be easily done at a large scale using bioinformatics programs. However, failed amplification and/or detection, along with lack of polymorphism, is often seen among randomly sel...
G-DOC Plus - an integrative bioinformatics platform for precision medicine.
Bhuvaneshwar, Krithika; Belouali, Anas; Singh, Varun; Johnson, Robert M; Song, Lei; Alaoui, Adil; Harris, Michael A; Clarke, Robert; Weiner, Louis M; Gusev, Yuriy; Madhavan, Subha
2016-04-30
G-DOC Plus is a data integration and bioinformatics platform that uses cloud computing and other advanced computational tools to handle a variety of biomedical BIG DATA including gene expression arrays, NGS and medical images so that they can be analyzed in the full context of other omics and clinical information. G-DOC Plus currently holds data from over 10,000 patients selected from private and public resources including Gene Expression Omnibus (GEO), The Cancer Genome Atlas (TCGA) and the recently added datasets from REpository for Molecular BRAin Neoplasia DaTa (REMBRANDT), caArray studies of lung and colon cancer, ImmPort and the 1000 genomes data sets. The system allows researchers to explore clinical-omic data one sample at a time, as a cohort of samples; or at the level of population, providing the user with a comprehensive view of the data. G-DOC Plus tools have been leveraged in cancer and non-cancer studies for hypothesis generation and validation; biomarker discovery and multi-omics analysis, to explore somatic mutations and cancer MRI images; as well as for training and graduate education in bioinformatics, data and computational sciences. Several of these use cases are described in this paper to demonstrate its multifaceted usability. G-DOC Plus can be used to support a variety of user groups in multiple domains to enable hypothesis generation for precision medicine research. The long-term vision of G-DOC Plus is to extend this translational bioinformatics platform to stay current with emerging omics technologies and analysis methods to continue supporting novel hypothesis generation, analysis and validation for integrative biomedical research. By integrating several aspects of the disease and exposing various data elements, such as outpatient lab workup, pathology, radiology, current treatments, molecular signatures and expected outcomes over a web interface, G-DOC Plus will continue to strengthen precision medicine research. 
G-DOC Plus is available at: https://gdoc.georgetown.edu .
DiscoverySpace: an interactive data analysis application
Robertson, Neil; Oveisi-Fordorei, Mehrdad; Zuyderduyn, Scott D; Varhol, Richard J; Fjell, Christopher; Marra, Marco; Jones, Steven; Siddiqui, Asim
2007-01-01
DiscoverySpace is a graphical application for bioinformatics data analysis. Users can seamlessly traverse references between biological databases and draw together annotations in an intuitive tabular interface. Datasets can be compared using a suite of novel tools to aid in the identification of significant patterns. DiscoverySpace is of broad utility and its particular strength is in the analysis of serial analysis of gene expression (SAGE) data. The application is freely available online. PMID:17210078
In the loop: promoter–enhancer interactions and bioinformatics
Mora, Antonio; Sandve, Geir Kjetil; Gabrielsen, Odd Stokke
2016-01-01
Enhancer–promoter regulation is a fundamental mechanism underlying differential transcriptional regulation. Spatial chromatin organization brings remote enhancers in contact with target promoters in cis to regulate gene expression. There is considerable evidence for promoter–enhancer interactions (PEIs). In the recent years, genome-wide analyses have identified signatures and mapped novel enhancers; however, being able to precisely identify their target gene(s) requires massive biological and bioinformatics efforts. In this review, we give a short overview of the chromatin landscape and transcriptional regulation. We discuss some key concepts and problems related to chromatin interaction detection technologies, and emerging knowledge from genome-wide chromatin interaction data sets. Then, we critically review different types of bioinformatics analysis methods and tools related to representation and visualization of PEI data, raw data processing and PEI prediction. Lastly, we provide specific examples of how PEIs have been used to elucidate a functional role of non-coding single-nucleotide polymorphisms. The topic is at the forefront of epigenetic research, and by highlighting some future bioinformatics challenges in the field, this review provides a comprehensive background for future PEI studies. PMID:26586731
van Uitert, Miranda; Moerland, Perry D; Enquobahrie, Daniel A; Laivuori, Hannele; van der Post, Joris A M; Ris-Stalpers, Carrie; Afink, Gijs B
2015-01-01
Studies using the placental transcriptome to identify key molecules relevant for preeclampsia are hampered by a relatively small sample size. In addition, they use a variety of bioinformatics and statistical methods, making comparison of findings challenging. To generate a more robust preeclampsia gene expression signature, we performed a meta-analysis on the original data of 11 placenta RNA microarray experiments, representing 139 normotensive and 116 preeclamptic pregnancies. Microarray data were pre-processed and analyzed using standardized bioinformatics and statistical procedures and the effect sizes were combined using an inverse-variance random-effects model. Interactions between genes in the resulting gene expression signature were identified by pathway analysis (Ingenuity Pathway Analysis, Gene Set Enrichment Analysis, Graphite) and protein-protein associations (STRING). This approach has resulted in a comprehensive list of differentially expressed genes that led to a 388-gene meta-signature of preeclamptic placenta. Pathway analysis highlights the involvement of the previously identified hypoxia/HIF1A pathway in the establishment of the preeclamptic gene expression profile, while analysis of protein interaction networks indicates CREBBP/EP300 as a novel element central to the preeclamptic placental transcriptome. In addition, there is an apparent high incidence of preeclampsia in women carrying a child with a mutation in CREBBP/EP300 (Rubinstein-Taybi Syndrome). The 388-gene preeclampsia meta-signature offers a vital starting point for further studies into the relevance of these genes (in particular CREBBP/EP300) and their concomitant pathways as biomarkers or functional molecules in preeclampsia. This will result in a better understanding of the molecular basis of this disease and opens up the opportunity to develop rational therapies targeting the placental dysfunction causal to preeclampsia.
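The pooling step described above (combining per-study effect sizes with an inverse-variance random-effects model) can be sketched in a few lines. The abstract does not say which between-study variance estimator was used; the DerSimonian-Laird estimator below is an assumption chosen because it is a common default, and the function name is hypothetical.

```python
import math

def random_effects_meta(effects, variances):
    """Pool effect sizes with inverse-variance weights under a random-effects model.

    Uses the DerSimonian-Laird estimate of the between-study variance tau^2
    (an assumption; other estimators exist). Returns (pooled, se, tau2).
    """
    if len(effects) != len(variances) or len(effects) < 2:
        raise ValueError("need at least two studies with matching variances")
    k = len(effects)
    # Fixed-effect (inverse-variance) weights and pooled estimate.
    w = [1.0 / v for v in variances]
    sw = sum(w)
    fixed = sum(wi * yi for wi, yi in zip(w, effects)) / sw
    # Cochran's Q and the DerSimonian-Laird tau^2, truncated at zero.
    q = sum(wi * (yi - fixed) ** 2 for wi, yi in zip(w, effects))
    c = sw - sum(wi ** 2 for wi in w) / sw
    tau2 = max(0.0, (q - (k - 1)) / c) if c > 0 else 0.0
    # Random-effects weights add tau^2 to each within-study variance.
    w_star = [1.0 / (v + tau2) for v in variances]
    pooled = sum(wi * yi for wi, yi in zip(w_star, effects)) / sum(w_star)
    se = math.sqrt(1.0 / sum(w_star))
    return pooled, se, tau2
```

In a gene expression meta-analysis this function would be applied once per gene, with one effect size and variance per contributing study.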
Practical applications of the bioinformatics toolbox for narrowing quantitative trait loci.
Burgess-Herbert, Sarah L; Cox, Allison; Tsaih, Shirng-Wern; Paigen, Beverly
2008-12-01
Dissecting the genes involved in complex traits can be confounded by multiple factors, including extensive epistatic interactions among genes, the involvement of epigenetic regulators, and the variable expressivity of traits. Although quantitative trait locus (QTL) analysis has been a powerful tool for localizing the chromosomal regions underlying complex traits, systematically identifying the causal genes remains challenging. Here, through its application to plasma levels of high-density lipoprotein cholesterol (HDL) in mice, we demonstrate a strategy for narrowing QTL that utilizes comparative genomics and bioinformatics techniques. We show how QTL detected in multiple crosses are subjected to both combined cross analysis and haplotype block analysis; how QTL from one species are mapped to the concordant regions in another species; and how genomewide scans associating haplotype groups with their phenotypes can be used to prioritize the narrowed regions. Then we illustrate how these individual methods for narrowing QTL can be systematically integrated for mouse chromosomes 12 and 15, resulting in a significantly reduced number of candidate genes, often from hundreds to <10. Finally, we give an example of how additional bioinformatics resources can be combined with experiments to determine the most likely quantitative trait genes.
ERIC Educational Resources Information Center
Almeida, Craig A.; Tardiff, Daniel F.; De Luca, Jane P.
2004-01-01
We have developed an introductory bioinformatics exercise for sophomore biology and biochemistry students that reinforces the understanding of the structure of a gene and the principles and events involved in its expression. In addition, the activity illustrates the severe effect mutations in a gene sequence can have on the protein product.…
SPARTA: Simple Program for Automated reference-based bacterial RNA-seq Transcriptome Analysis.
Johnson, Benjamin K; Scholz, Matthew B; Teal, Tracy K; Abramovitch, Robert B
2016-02-04
Many tools exist in the analysis of bacterial RNA sequencing (RNA-seq) transcriptional profiling experiments to identify differentially expressed genes between experimental conditions. Generally, the workflow includes quality control of reads, mapping to a reference, counting transcript abundance, and statistical tests for differentially expressed genes. In spite of the numerous tools developed for each component of an RNA-seq analysis workflow, easy-to-use bacterially oriented workflow applications to combine multiple tools and automate the process are lacking. With many tools to choose from for each step, the task of identifying a specific tool, adapting the input/output options to the specific use-case, and integrating the tools into a coherent analysis pipeline is not a trivial endeavor, particularly for microbiologists with limited bioinformatics experience. To make bacterial RNA-seq data analysis more accessible, we developed a Simple Program for Automated reference-based bacterial RNA-seq Transcriptome Analysis (SPARTA). SPARTA is a reference-based bacterial RNA-seq analysis workflow application for single-end Illumina reads. SPARTA is turnkey software that simplifies the process of analyzing RNA-seq data sets, making bacterial RNA-seq analysis a routine process that can be undertaken on a personal computer or in the classroom. The easy-to-install, complete workflow processes whole transcriptome shotgun sequencing data files by trimming reads and removing adapters, mapping reads to a reference, counting gene features, calculating differential gene expression, and, importantly, checking for potential batch effects within the data set. SPARTA outputs quality analysis reports, gene feature counts and differential gene expression tables and scatterplots. SPARTA provides an easy-to-use bacterial RNA-seq transcriptional profiling workflow to identify differentially expressed genes between experimental conditions. 
This software will enable microbiologists with limited bioinformatics experience to analyze their data and integrate next generation sequencing (NGS) technologies into the classroom. The SPARTA software and tutorial are available at sparta.readthedocs.org.
Wang, Longxin; Fu, Dian; Qiu, Yongbin; Xing, Xiaoxiao; Xu, Feng; Han, Conghui; Xu, Xiaofeng; Wei, Zhifeng; Zhang, Zhengyu; Ge, Jingping; Cheng, Wen; Xie, Hai-Long
2014-07-10
To understand lncRNAs expression profiling and their potential functions in bladder cancer, we investigated the lncRNA and coding RNA expression on human bladder cancer and normal bladder tissues. Bioinformatic analysis revealed thousands of significantly differentially expressed lncRNAs and coding mRNA in bladder cancer relative to normal bladder tissue. Co-expression analysis revealed that 50% of lncRNAs and coding RNAs expressed in the same direction. A subset of lncRNAs might be involved in mTOR signaling, p53 signaling, cancer pathways. Our study provides a large scale of co-expression between lncRNA and coding RNAs in bladder cancer cells and lays biological basis for further investigation. Copyright © 2014 Elsevier Ireland Ltd. All rights reserved.
Bayesian models based on test statistics for multiple hypothesis testing problems.
Ji, Yuan; Lu, Yiling; Mills, Gordon B
2008-04-01
We propose a Bayesian method for the problem of multiple hypothesis testing that is routinely encountered in bioinformatics research, such as the differential gene expression analysis. Our algorithm is based on modeling the distributions of test statistics under both null and alternative hypotheses. We substantially reduce the complexity of the process of defining posterior model probabilities by modeling the test statistics directly instead of modeling the full data. Computationally, we apply a Bayesian FDR approach to control the number of rejections of null hypotheses. To check if our model assumptions for the test statistics are valid for various bioinformatics experiments, we also propose a simple graphical model-assessment tool. Using extensive simulations, we demonstrate the performance of our models and the utility of the model-assessment tool. In the end, we apply the proposed methodology to an siRNA screening and a gene expression experiment.
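The idea of using a Bayesian FDR to limit the number of rejections can be illustrated with one common formulation: reject the hypotheses with the smallest posterior null probabilities for as long as their running average stays at or below a target level. This is a sketch under that assumption, not the authors' algorithm; the abstract does not give their exact rule, and the function name and the default `alpha` are hypothetical.

```python
def bayesian_fdr_reject(post_null, alpha=0.05):
    """Return sorted indices of rejected hypotheses.

    post_null[i] is the posterior probability that hypothesis i is null.
    Hypotheses are rejected in order of increasing posterior null
    probability while the mean of those probabilities (the estimated
    Bayesian FDR of the rejection set) stays at or below alpha.
    """
    order = sorted(range(len(post_null)), key=lambda i: post_null[i])
    rejected = []
    running = 0.0
    for rank, i in enumerate(order, start=1):
        running += post_null[i]
        if running / rank > alpha:
            break  # the running mean is non-decreasing, so no later set qualifies
        rejected.append(i)
    return sorted(rejected)
```

Because the probabilities are visited in ascending order, the running mean never decreases, so stopping at the first violation yields the largest rejection set that meets the target.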
Chondrocyte channel transcriptomics
Lewis, Rebecca; May, Hannah; Mobasheri, Ali; Barrett-Jolley, Richard
2013-01-01
To date, a range of ion channels have been identified in chondrocytes using a number of different techniques, predominantly electrophysiological and/or biomolecular; each of these has its advantages and disadvantages. Here we aim to compare and contrast the data available from biophysical and microarray experiments. This letter analyses recent transcriptomics datasets from chondrocytes, accessible from the European Bioinformatics Institute (EBI). We discuss whether such bioinformatic analysis of microarray datasets can potentially accelerate identification and discovery of ion channels in chondrocytes. The ion channels which appear most frequently across these microarray datasets are discussed, along with their possible functions. We discuss whether functional or protein data exist which support the microarray data. A microarray experiment comparing gene expression in osteoarthritis and healthy cartilage is also discussed and we verify the differential expression of 2 of these genes, namely the genes encoding large calcium-activated potassium (BK) and aquaporin channels. PMID:23995703
DOE Office of Scientific and Technical Information (OSTI.GOV)
Hatazawa, Yukino; Research Fellow of Japan Society for the Promotion of Science, Tokyo; Minami, Kimiko
The expression of the transcriptional coactivator PGC1α is increased in skeletal muscles during exercise. Previously, we showed that increased PGC1α leads to prolonged exercise performance (the duration for which running can be continued) and, at the same time, increases the expression of branched-chain amino acid (BCAA) metabolism-related enzymes and genes that are involved in supplying substrates for the TCA cycle. We recently created mice with PGC1α knockout specifically in the skeletal muscles (PGC1α KO mice), which show decreased mitochondrial content. In this study, global gene expression (microarray) analysis was performed in the skeletal muscles of PGC1α KO mice compared with that of wild-type control mice. As a result, decreased expression of genes involved in the TCA cycle, oxidative phosphorylation, and BCAA metabolism was observed. Compared with previously obtained microarray data on PGC1α-overexpressing transgenic mice, each gene showed the completely opposite direction of expression change. Bioinformatic analysis of the promoter region of genes with decreased expression in PGC1α KO mice predicted the involvement of several transcription factors, including a nuclear receptor, ERR, in their regulation. As PGC1α KO microarray data in this study show opposing findings to the PGC1α transgenic data, a loss-of-function experiment, as well as a gain-of-function experiment, revealed PGC1α’s function in the oxidative energy metabolism of skeletal muscles. - Highlights: • Microarray analysis was performed in the skeletal muscle of PGC1α KO mice. • Expression of genes in the oxidative energy metabolism was decreased. • Bioinformatic analysis of promoter region of the genes predicted involvement of ERR. • PGC1α KO microarray data in this study show the mirror image of transgenic data.
Guo, Shuang-shuang; Cheng, Lin; Yang, Li-min; Han, Mei
2015-11-01
The β-glucuronidase gene (sbGUS) cDNA was cloned for the first time from Scutellaria baicalensis leaf by RT-PCR (GenBank accession number KR364726). The full-length cDNA of sbGUS was 1584 bp with an open reading frame (ORF) encoding an unstable protein of 527 amino acids. Bioinformatic analysis showed that the protein encoded by sbGUS had an isoelectric point (pI) of 5.55 and a calculated molecular weight of about 58.7248 kDa, with a transmembrane region and a signal peptide, and had conserved domains of the glycoside hydrolase superfamily and an unintegrated trans-glycosidase catalytic structure. In the secondary structure, the percentages of alpha helix, extended strand, β-extended and random coil were 25.62%, 28.84%, 13.28% and 32.26%, respectively. Homology analysis indicated 98.93% nucleotide sequence similarity and 98.29% amino acid sequence similarity with S. baicalensis (BAA97804.1), with differences at nine positions. Based on real-time PCR analysis, the expression level of sbGUS was highest in root, followed by flower and stem, and the lowest was in stem. The results provide a foundation for exploring the molecular function of sbGUS involved in baicalcin biosynthesis based on synthetic biology approach in S. baicalensis plants.
Secretome analysis of rat osteoblasts during icariin treatment induced osteogenesis
Qian, Weiqing; Su, Yan; Zhang, Yajie; Yao, Nianwei; Gu, Nin; Zhang, Xu; Yin, Hong
2018-01-01
Osteoporosis is a serious public health problem and icariin (ICA) is the active component of the Epimedium sagittatum, a traditional Chinese medicinal herb. The present study aimed to investigate the effects and underlying mechanisms of ICA as a potential therapy for osteoporosis. Calvaria osteoblasts were isolated from newborn rats and treated with ICA. Cell viability, apoptosis, alkaline phosphatase activity and calcium deposition were analyzed. Bioinformatics analyses were performed to identify differentially expressed proteins (DEPs) in response to ICA treatment. Western blot analysis was performed to validate the expression of DEPs. ICA administration promoted osteoblast viability, alkaline phosphatase activity, calcium deposition and inhibited osteoblast apoptosis. Secretome analysis of ICA-treated cells was performed using two-dimensional gel electrophoresis and matrix-assisted laser desorption/ionization time-of-flight mass spectrometry. A total of 56 DEPs were identified, including serpin family F member 1 (PEDF), protein disulfide isomerase family A, member 3 (PDIA3), nuclear protein, co-activator of histone transcription (NPAT), c-Myc and heat shock protein 70 (HSP70). These proteins were associated with signaling pathways, including Fas and p53. Bioinformatics and western blot analyses confirmed that the expression levels of the six DEPs were upregulated following ICA treatment. These genes may be directly or indirectly involved in ICA-mediated osteogenic differentiation and osteogenesis. It was demonstrated that ICA treatment promoted osteogenesis by modulating the expression of PEDF, PDIA3, NPAT and HSP70 through signaling pathways, including Fas and p53. PMID:29532868
O'Brien, M.A.; Costin, B.N.; Miles, M.F.
2014-01-01
Postgenomic studies of the function of genes and their role in disease have now become an area of intense study since efforts to define the raw sequence material of the genome have largely been completed. The use of whole-genome approaches such as microarray expression profiling and, more recently, RNA-sequence analysis of transcript abundance has allowed an unprecedented look at the workings of the genome. However, the accurate derivation of such high-throughput data and their analysis in terms of biological function has been critical to truly leveraging the postgenomic revolution. This chapter will describe an approach that focuses on the use of gene networks to both organize and interpret genomic expression data. Such networks, derived from statistical analysis of large genomic datasets and the application of multiple bioinformatics data resources, potentially allow the identification of key control elements for networks associated with human disease, and thus may lead to derivation of novel therapeutic approaches. However, as discussed in this chapter, the leveraging of such networks cannot occur without a thorough understanding of the technical and statistical factors influencing the derivation of genomic expression data. Thus, while the catch phrase may be “it's the network … stupid,” the understanding of factors extending from RNA isolation to genomic profiling technique, multivariate statistics, and bioinformatics are all critical to defining fully useful gene networks for study of complex biology. PMID:23195313
Use of toxicogenomics for identifying genetic markers of pulmonary oedema
DOE Office of Scientific and Technical Information (OSTI.GOV)
Balharry, Dominique; Oreffo, Victor; Richards, Roy
2005-04-15
This study was undertaken primarily to identify genetic markers of oedema and inflammation. Mild pulmonary injury was induced following the instillation of the oedema-producing agent, bleomycin (0.5 units). Oedema was then confirmed by conventional toxicology (lavage protein levels, free cell counts and lung/body weight ratios) and histology 3 days post-bleomycin instillation. The expression profile of 1176 mRNA species was determined for bleomycin-exposed lung (Clontech Atlas macroarray, n = 9). To obtain pertinent results from these data, it was necessary to develop a simple, effective method for bioinformatic analysis of altered gene expression. Data were log10 transformed followed by global normalisation. Differential gene expression was accepted if: (a) genes were statistically significant (P ≤ 0.05) from a two-tailed t test; (b) genes were consistently outside a two standard deviation (SD) range from control levels. A combination of these techniques identified 31 mRNA transcripts (approximately 3%) which were significantly altered in bleomycin treated tissue. Of these genes, 26 were down-regulated whilst only five were up-regulated. Two distinct clusters were identified, with 17 genes classified as encoding hormone receptors, and nine as encoding ion channels. Both these clusters were consistently down-regulated. The magnitude of the changes in gene expression were quantified and confirmed by Q-PCR (n = 6), validating the macroarray data and the bioinformatic analysis employed. In conclusion, this study has developed a suitable macroarray analysis procedure and provides the basis for a better understanding of the gene expression changes occurring during the early phase of drug-induced pulmonary oedema.
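Part of the analysis procedure in the abstract above can be sketched with the Python standard library: the log10 transform, a global normalisation, and criterion (b), the two-SD rule. The sketch rests on stated assumptions that the abstract does not confirm: global normalisation is taken to mean centring each array on its own mean log intensity, and "consistently outside" is taken to mean that every treated replicate lies beyond control mean ± 2 SD on the same side. Criterion (a), the two-tailed t test, is deliberately left out, and both function names are hypothetical.

```python
import math
import statistics

def log10_global_normalise(intensities):
    """log10-transform one array's intensities and centre them on the array mean.

    Assumes all intensities are positive; centring on the mean is one
    simple form of global normalisation (an assumption of this sketch).
    """
    logged = [math.log10(x) for x in intensities]
    centre = statistics.mean(logged)
    return [value - centre for value in logged]

def outside_two_sd(control_values, treated_values):
    """Apply the two-SD rule to one gene.

    Returns True only if every treated value lies above control mean + 2 SD,
    or every treated value lies below control mean - 2 SD.
    Needs at least two control values to compute the sample SD.
    """
    mean = statistics.mean(control_values)
    sd = statistics.stdev(control_values)
    above = all(v > mean + 2 * sd for v in treated_values)
    below = all(v < mean - 2 * sd for v in treated_values)
    return above or below
```

A gene would be called differentially expressed only if it also passed the t-test threshold, which would be computed separately with a statistics package.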
Su, Huafang; Lin, Fuqiang; Deng, Xia; Shen, Lanxiao; Fang, Ya; Fei, Zhenghua; Zhao, Lihao; Zhang, Xuebang; Pan, Huanle; Xie, Deyao; Jin, Xiance; Xie, Congying
2016-07-28
Acquired radioresistance during radiotherapy is considered the most important reason for local tumor recurrence or treatment failure. Circular RNAs (circRNAs) have recently been identified as microRNA sponges and are involved in various biological processes. The purpose of this study is to investigate the role of circRNAs in the radioresistance of esophageal cancer. Total RNA was isolated from the human parental cell line KYSE-150 and the self-established radioresistant esophageal cancer cell line KYSE-150R, and hybridized to the Arraystar Human circRNA Array. Quantitative real-time PCR was used to confirm the circRNA expression profiles obtained from the microarray data. Bioinformatic analyses including gene ontology (GO) analysis, KEGG pathway analysis and network analysis were performed for further assessment. Among the 3752 candidate circRNA genes detected, significant upregulation of 57 circRNAs and downregulation of 17 circRNAs were observed in the radioresistant cell line KYSE-150R compared with the parental cell line KYSE-150 (fold change ≥2.0 and P < 0.05). Nine of these candidate circRNAs were validated by real-time PCR. GO analysis revealed that numerous target genes, including most microRNAs, were involved in the biological processes. More than 400 target genes were enriched in the Wnt signaling pathway. CircRNA_001059 and circRNA_000167 were the two largest nodes in the circRNA/microRNA co-expression network. Our study revealed a comprehensive expression and functional profile of differentially expressed circRNAs in radioresistant esophageal cancer cells, indicating possible involvement of these dysregulated circRNAs in the development of radiation resistance.
Translational bioinformatics: linking the molecular world to the clinical world.
Altman, R B
2012-06-01
Translational bioinformatics represents the union of translational medicine and bioinformatics. Translational medicine moves basic biological discoveries from the research bench into the patient-care setting and uses clinical observations to inform basic biology. It focuses on patient care, including the creation of new diagnostics, prognostics, prevention strategies, and therapies based on biological discoveries. Bioinformatics involves algorithms to represent, store, and analyze basic biological data, including DNA sequence, RNA expression, and protein and small-molecule abundance within cells. Translational bioinformatics spans these two fields; it involves the development of algorithms to analyze basic molecular and cellular data with an explicit goal of affecting clinical care.
Bioinformatics and expressional analysis of cDNA clones from floral buds
NASA Astrophysics Data System (ADS)
Pawełkowicz, Magdalena Ewa; Skarzyńska, Agnieszka; Cebula, Justyna; Hincha, Dirck; Ziąbska, Karolina; Pląder, Wojciech; Przybecki, Zbigniew
2017-08-01
The application of genomic approaches may serve as an initial step in understanding the complexity of biochemical network and cellular processes responsible for regulation and execution of many developmental tasks. The molecular mechanism of sex expression in cucumber is still not elucidated. A study of differential expression was conducted to identify genes involved in sex determination and floral organ morphogenesis. Herein, we present generation of expression sequence tags (EST) obtained by differential hybridization (DH) and subtraction technique (cDNA-DSC) and their characteristic features such as molecular function, involvement in biology processes, expression and mapping position on the genome.
Zhang, Ting; Guo, Yueshuai; Guo, Xuejiang; Zhou, Tao; Chen, Daozhen; Xiang, Jingying; Zhou, Zuomin
2013-01-01
Intrahepatic cholestasis of pregnancy (ICP) usually occurs in the third trimester and is associated with increased risks of fetal complications. Currently, the exact cause of this disease is unknown. In this study we aim to investigate the potential proteins in placenta that may participate in the molecular mechanisms of ICP-related fetal complications, using an iTRAQ-based proteomics approach. The iTRAQ analysis combined with liquid chromatography-tandem mass spectrometry (LC-MS/MS) was performed to separate differentially expressed placental proteins from 4 pregnant women with ICP and 4 healthy pregnant women. Bioinformatics analysis was used to find the relevant processes in which these differentially expressed proteins were involved. Three apoptosis-related proteins identified by iTRAQ-based proteomics, ERp29, PRDX6 and MPO, were further verified in placenta by Western blotting and immunohistochemistry. Placental apoptosis was also detected by TUNEL assay. Proteomics results showed 38 differentially expressed proteins between pregnant women with ICP and healthy pregnant women; 29 were upregulated and 9 were downregulated in placenta from pregnant women with ICP. Bioinformatics analysis showed that most of the identified proteins were functionally related to specific cell processes, including apoptosis, oxidative stress, and lipid metabolism. The expression levels of ERp29, PRDX6 and MPO were consistent with the proteomics data. The apoptosis index in placenta from ICP patients was significantly increased. This preliminary work provides a better understanding of the proteomic alterations of placenta from pregnant women with ICP and may provide new insights into the pathophysiology of ICP and potential novel treatment targets.
Wang, Jingrui; Tang, Wei; Zheng, Yongna; Xing, Zhuqing; Wang, Yanping
2016-09-01
A novel lactic acid bacteria strain Lactobacillus kefiranofaciens ZW3 exhibited the characteristics of high production of exopolysaccharide (EPS). The epsN gene, located in the eps gene cluster of this strain, is associated with EPS biosynthesis. Bioinformatics analysis of this gene was performed. The conserved domain analysis showed that the EpsN protein contained MATE-Wzx-like domains. Then the epsN gene was amplified to construct the recombinant expression vector pMG36e-epsN. The results showed that the EPS yields of the recombinants were significantly improved. By determining the yields of EPS and intracellular polysaccharide, it was considered that epsN gene could play its Wzx flippase role in the EPS biosynthesis. This is the first time to prove the effect of EpsN on L. kefiranofaciens EPS biosynthesis and further prove its functional property.
Analysis of functional redundancies within the Arabidopsis TCP transcription factor family.
Danisman, Selahattin; van Dijk, Aalt D J; Bimbo, Andrea; van der Wal, Froukje; Hennig, Lars; de Folter, Stefan; Angenent, Gerco C; Immink, Richard G H
2013-12-01
Analyses of the functions of TEOSINTE-LIKE1, CYCLOIDEA, and PROLIFERATING CELL FACTOR1 (TCP) transcription factors have been hampered by functional redundancy between its individual members. In general, putative functionally redundant genes are predicted based on sequence similarity and confirmed by genetic analysis. In the TCP family, however, identification is impeded by relatively low overall sequence similarity. In a search for functionally redundant TCP pairs that control Arabidopsis leaf development, this work performed an integrative bioinformatics analysis, combining protein sequence similarities, gene expression data, and results of pair-wise protein-protein interaction studies for the 24 members of the Arabidopsis TCP transcription factor family. For this, the work completed any lacking gene expression and protein-protein interaction data experimentally and then performed a comprehensive prediction of potential functional redundant TCP pairs. Subsequently, redundant functions could be confirmed for selected predicted TCP pairs by genetic and molecular analyses. It is demonstrated that the previously uncharacterized class I TCP19 gene plays a role in the control of leaf senescence in a redundant fashion with TCP20. Altogether, this work shows the power of combining classical genetic and molecular approaches with bioinformatics predictions to unravel functional redundancies in the TCP transcription factor family.
Analysis of functional redundancies within the Arabidopsis TCP transcription factor family
Danisman, Selahattin; de Folter, Stefan; Immink, Richard G. H.
2013-01-01
Analyses of the functions of TEOSINTE-LIKE1, CYCLOIDEA, and PROLIFERATING CELL FACTOR1 (TCP) transcription factors have been hampered by functional redundancy between its individual members. In general, putative functionally redundant genes are predicted based on sequence similarity and confirmed by genetic analysis. In the TCP family, however, identification is impeded by relatively low overall sequence similarity. In a search for functionally redundant TCP pairs that control Arabidopsis leaf development, this work performed an integrative bioinformatics analysis, combining protein sequence similarities, gene expression data, and results of pair-wise protein–protein interaction studies for the 24 members of the Arabidopsis TCP transcription factor family. For this, the work completed any lacking gene expression and protein–protein interaction data experimentally and then performed a comprehensive prediction of potential functional redundant TCP pairs. Subsequently, redundant functions could be confirmed for selected predicted TCP pairs by genetic and molecular analyses. It is demonstrated that the previously uncharacterized class I TCP19 gene plays a role in the control of leaf senescence in a redundant fashion with TCP20. Altogether, this work shows the power of combining classical genetic and molecular approaches with bioinformatics predictions to unravel functional redundancies in the TCP transcription factor family. PMID:24129704
USDA-ARS's Scientific Manuscript database
A bioinformatic study was conducted to identify the putative genes in the biocontrol agent Trichoderma virens that encode for non-ribosomal peptide synthetases (NRPS). Gene expression analysis of 22 putative NRPSs and 4 NRPS/PKS (polyketide synthase) hybrid enzymes was conducted in the presence and...
Lima, L S; Gramacho, K P; Carels, N; Novais, R; Gaiotto, F A; Lopes, U V; Gesteira, A S; Zaidan, H A; Cascardo, J C M; Pires, J L; Micheli, F
2009-07-14
In order to increase the efficiency of cacao tree resistance to witches' broom disease, which is caused by Moniliophthora perniciosa (Tricholomataceae), we looked for molecular markers that could help in the selection of resistant cacao genotypes. Among the different markers useful for developing marker-assisted selection, single nucleotide polymorphisms (SNPs) constitute the most common type of sequence difference between alleles and can be easily detected by in silico analysis from expressed sequence tag libraries. We report the first detection and analysis of SNPs from cacao-M. perniciosa interaction expressed sequence tags, using bioinformatics. Selection based on analysis of these SNPs should be useful for developing cacao varieties resistant to this devastating disease.
Koschmann, Jeannette; Machens, Fabian; Becker, Marlies; Niemeyer, Julia; Schulze, Jutta; Bülow, Lorenz; Stahl, Dietmar J.; Hehl, Reinhard
2012-01-01
A combination of bioinformatic tools, high-throughput gene expression profiles, and the use of synthetic promoters is a powerful approach to discover and evaluate novel cis-sequences in response to specific stimuli. With Arabidopsis (Arabidopsis thaliana) microarray data annotated to the PathoPlant database, 732 different queries with a focus on fungal and oomycete pathogens were performed, leading to 510 up-regulated gene groups. Using the binding site estimation suite of tools, BEST, 407 conserved sequence motifs were identified in promoter regions of these coregulated gene sets. Motif similarities were determined with STAMP, classifying the 407 sequence motifs into 37 families. A comparative analysis of these 37 families with the AthaMap, PLACE, and AGRIS databases revealed similarities to known cis-elements but also led to the discovery of cis-sequences not yet implicated in pathogen response. Using a parsley (Petroselinum crispum) protoplast system and a modified reporter gene vector with an internal transformation control, 25 elicitor-responsive cis-sequences from 10 different motif families were identified. Many of the elicitor-responsive cis-sequences also drive reporter gene expression in an Agrobacterium tumefaciens infection assay in Nicotiana benthamiana. This work significantly increases the number of known elicitor-responsive cis-sequences and demonstrates the successful integration of a diverse set of bioinformatic resources combined with synthetic promoter analysis for data mining and functional screening in plant-pathogen interaction. PMID:22744985
Pan, Hai-Tao; Ding, Hai-Gang; Fang, Min; Yu, Bin; Cheng, Yi; Tan, Ya-Jing; Fu, Qi-Qin; Lu, Bo; Cai, Hong-Guang; Jin, Xin; Xia, Xian-Qing; Zhang, Tao
2018-01-01
Recurrent miscarriage (RM) affects 5% of women and has an adverse emotional impact on them. Because of the complexities of early development, the mechanism of recurrent miscarriage is still unclear. We hypothesized that an abnormal placenta leads to early recurrent miscarriage (ERM). The aim of this study was to identify ERM-associated factors in human placental villous tissue using proteomics. Investigating these differences in protein expression by parallel profiling is essential to understand the comprehensive pathophysiological mechanism underlying RM. To gain more insight into the mechanisms of RM, a comparative proteome profile of human placental villous tissue in normal and RM pregnancies was analyzed using iTRAQ technology and bioinformatics analysis with Ingenuity Pathway Analysis (IPA) software. In this study, we employed an iTRAQ-based proteomics analysis of four placental villous tissues from patients with ERM and four from normal pregnant women. We identified 2805 proteins and 79,998 peptides across patients with RM and the matched normal group. Further analysis identified 314 differentially expressed proteins in placental villous tissue (≥1.3-fold, Student's t-test, p < 0.05); 209 proteins showed increased expression while 105 proteins showed decreased expression. These 314 proteins were analyzed by IPA and were found to play important roles in the growth of the embryo. Furthermore, network analysis showed that Angiotensinogen (AGT), MAPK14 and Prothrombin (F2) are core factors in early embryonic development. We used another 8 independent samples (4 cases and 4 controls) to cross-validate the proteomic data. This study has identified several proteins that are associated with early development; these results may supply new insight into the mechanisms behind recurrent miscarriage. Copyright © 2017 Elsevier Ltd. All rights reserved.
Structure prediction, expression, and antigenicity of c-terminal of GRP78.
Aghamollaei, Hossein; Mousavi Gargari, Seyed Latif; Ghanei, Mostafa; Rasaee, Mohamad Javad; Amani, Jafar; Bakherad, Hamid; Farnoosh, Gholamreza
2017-01-01
Glucose-regulated protein 78 (GRP78) is a typical endoplasmic reticulum luminal chaperone with a main role in the activation of the unfolded protein response. Because of hypoxia and nutrient deprivation in the tumor microenvironment, expression of GRP78 in these cells becomes higher than in native cells, which makes it a suitable candidate for cancer targeting. Suppression of survival signals by antibody production against the C-terminal domain of GRP78 (CGRP) can induce apoptosis of cancer cells. The aim of this study was in silico analysis, recombinant production, and characterization of CGRP in Escherichia coli. Structural prediction of CGRP was carried out with bioinformatics tools, and the construct containing the optimized sequence was transferred to E. coli T7 shuffle. Expression was induced by isopropyl-β-d-thiogalactoside, and the recombinant protein was purified on Ni-NTA agarose resin. The content of secondary structures was obtained from the circular dichroism (CD) spectrum. CGRP immunogenicity was evaluated using sera from immunized mice. SDS-PAGE analysis showed CGRP expression in E. coli. The CD spectrum also confirmed the structures predicted by the bioinformatics tools. The enzyme-linked immunosorbent assay using sera from immunized mice revealed CGRP to be a good immunogen. The results obtained in this study showed that the structure of truncated CGRP is very similar to its structure in the context of the whole protein. This protein can be used in cancer research. © 2015 International Union of Biochemistry and Molecular Biology, Inc.
Han, Jun; Xie, Hao; Sun, Qingpeng; Wang, Jun; Lu, Min; Wang, Weixiang; Guo, Erhu; Pan, Jinbao
2014-08-10
MiRNAs are a novel group of non-coding small RNAs that negatively regulate gene expression. Many miRNAs have been identified and investigated extensively in plant species with sequenced genomes. However, few miRNAs have been identified in foxtail millet (Setaria italica), which is an ancient cereal crop of great importance for dry land agriculture. In this study, 271 foxtail millet miRNAs belonging to 44 families were identified using a bioinformatics approach. Twenty-three pairs of sense/antisense miRNAs belonging to 13 families, and 18 miRNA clusters containing members of 8 families were discovered in foxtail millet. We identified 432 potential targets for 38 miRNA families, most of which were predicted to be involved in plant development, signal transduction, metabolic pathways, disease resistance, and environmental stress responses. Gene ontology (GO) analysis revealed that 101, 56, and 23 target genes were involved in molecular functions, biological processes, and cellular components, respectively. We investigated the expression patterns of 43 selected miRNAs using qRT-PCR analysis. All of the miRNAs were expressed ubiquitously with many exhibiting different expression levels in different tissues. We validated five predicted targets of four miRNAs using the RNA ligase mediated rapid amplification of cDNA end (5'-RLM-RACE) method. Copyright © 2014 Elsevier B.V. All rights reserved.
Yan, Hai-Biao; Huang, Jia-Cheng; Chen, You-Rong; Yao, Jian-Ni; Cen, Wei-Ning; Li, Jia-Yi; Jiang, Yi-Fan; Chen, Gang; Li, Sheng-Hua
2018-02-01
To investigate the clinical value and potential molecular mechanisms of miR-1 in clear cell renal cell carcinoma (ccRCC). We searched the Gene Expression Omnibus (GEO), ArrayExpress, several online publication databases and the Cancer Genome Atlas (TCGA). Continuous variable meta-analysis and diagnostic meta-analysis were conducted, both in Stata 14, to show the expression of miR-1 in ccRCC. Furthermore, we acquired the potential targets of miR-1 from datasets in which miR-1 was transfected into ccRCC cells, online prediction databases, differentially expressed genes from TCGA and the literature. Subsequently, bioinformatics analysis based on the selected target genes was conducted. The combined effect was -0.92 with a 95% confidence interval (CI) of -1.08 to -0.77 based on a fixed effect model (I² = 81.3%, P < 0.001). No publication bias was found in our investigation. Sensitivity analysis showed that GSE47582 and 2 TCGA studies might cause heterogeneity. After eliminating them, the combined effect was -0.47 (95%CI: -0.78, -0.16) with I² = 18.3%. As for the diagnostic meta-analysis, the combined sensitivity and specificity were 0.90 (95%CI: 0.61, 0.98) and 0.63 (95%CI: 0.39, 0.82). The area under the curve (AUC) in the summarized receiver operating characteristic (SROC) curve was 0.83 (95%CI: 0.80, 0.86). No publication bias was found (P = 0.15). We finally obtained 67 genes, which were defined as the promising target genes of miR-1 in ccRCC. The three most significant KEGG pathways based on these genes were Complement and coagulation cascades, ECM-receptor interaction and Focal adhesion. The downregulation of miR-1 might play an important role in ccRCC through its target genes. Copyright © 2017 Elsevier GmbH. All rights reserved.
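The pooled effect and I² statistic reported in this abstract come from Stata 14. As a rough illustration of what such numbers mean, the standard inverse-variance fixed-effect calculation can be sketched as below. The function name and the three example studies are hypothetical and are not taken from the paper.

```python
import math

def fixed_effect_meta(effects, std_errors):
    """Inverse-variance fixed-effect pooling with Cochran's Q and I^2.

    effects    : per-study effect sizes (e.g. standardized mean differences)
    std_errors : per-study standard errors of those effects
    Returns (pooled effect, 95% CI as a tuple, Q, I^2 in percent).
    """
    weights = [1.0 / se ** 2 for se in std_errors]
    total_w = sum(weights)
    pooled = sum(w * y for w, y in zip(weights, effects)) / total_w
    se_pooled = math.sqrt(1.0 / total_w)
    ci = (pooled - 1.96 * se_pooled, pooled + 1.96 * se_pooled)
    # Cochran's Q: weighted squared deviations from the pooled effect
    q = sum(w * (y - pooled) ** 2 for w, y in zip(weights, effects))
    df = len(effects) - 1
    # I^2: share of total variation attributed to heterogeneity, floored at 0
    i2 = max(0.0, (q - df) / q) * 100.0 if q > 0 else 0.0
    return pooled, ci, q, i2

# Hypothetical studies, not data from the paper
pooled, ci, q, i2 = fixed_effect_meta([-1.0, -0.5, -0.8], [0.2, 0.2, 0.2])
print(round(pooled, 3), round(i2, 1))
```

A high I², such as the 81.3% reported before the sensitivity analysis, indicates that most of the observed variation reflects between-study heterogeneity rather than sampling error.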
In Silico Prediction and Validation of Gfap as an miR-3099 Target in Mouse Brain.
Abidin, Shahidee Zainal; Leong, Jia-Wen; Mahmoudi, Marzieh; Nordin, Norshariza; Abdullah, Syahril; Cheah, Pike-See; Ling, King-Hwa
2017-08-01
MicroRNAs are small non-coding RNAs that play crucial roles in the regulation of gene expression and protein synthesis during brain development. MiR-3099 is highly expressed throughout embryogenesis, especially in the developing central nervous system. Moreover, miR-3099 is also expressed at a higher level in differentiating neurons in vitro, suggesting that it is a potential regulator during neuronal cell development. This study aimed to predict the target genes of miR-3099 via in-silico analysis using four independent prediction algorithms (miRDB, miRanda, TargetScan, and DIANA-micro-T-CDS) with emphasis on target genes related to brain development and function. Based on the analysis, a total of 3,174 miR-3099 target genes were predicted. Those predicted by at least three algorithms (324 genes) were subjected to DAVID bioinformatics analysis to understand their overall functional themes and representation. The analysis revealed that nearly 70% of the target genes were expressed in the nervous system and a significant proportion were associated with transcriptional regulation and protein ubiquitination mechanisms. Comparison of in situ hybridization (ISH) expression patterns of miR-3099 in both published and in-house-generated ISH sections with the ISH sections of target genes from the Allen Brain Atlas identified 7 target genes (Dnmt3a, Gabpa, Gfap, Itga4, Lxn, Smad7, and Tbx18) having expression patterns complementary to miR-3099 in the developing and adult mouse brain samples. Of these, we validated Gfap as a direct downstream target of miR-3099 using the luciferase reporter gene system. In conclusion, we report the successful prediction and validation of Gfap as an miR-3099 target gene using a combination of bioinformatics resources with enrichment of annotations based on functional ontologies and a spatio-temporal expression dataset.
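The filtering step described here, keeping only targets predicted by at least three of the four algorithms, amounts to a simple vote count. The sketch below uses placeholder gene identifiers and a hypothetical `consensus_targets` helper; it does not reproduce the study's actual prediction lists.

```python
from collections import Counter

def consensus_targets(predictions, min_support=3):
    """Return genes predicted by at least `min_support` tools, sorted by name.

    predictions maps a tool name to the collection of genes it predicts.
    """
    counts = Counter(
        gene for genes in predictions.values() for gene in set(genes)
    )
    return sorted(gene for gene, n in counts.items() if n >= min_support)

# Placeholder predictions; the gene identifiers are illustrative only
predictions = {
    "miRDB": {"gene1", "gene2", "gene3"},
    "miRanda": {"gene1", "gene2", "gene4"},
    "TargetScan": {"gene1", "gene3", "gene2"},
    "DIANA-micro-T-CDS": {"gene1", "gene5"},
}
shared = consensus_targets(predictions)
print(shared)
```

Raising `min_support` trades sensitivity for specificity: requiring all four tools would leave only `gene1` in this toy example.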
Guzmán-Flores, Juan Manuel; Flores-Pérez, Elsa Cristina; Hernández-Ortiz, Magdalena; Vargas-Ortiz, Katya; Ramírez-Emiliano, Joel; Encarnación-Guevara, Sergio; Pérez-Vázquez, Victoriano
2018-06-01
Type 2 diabetes mellitus is characterized by insulin resistance in the liver. Insulin is not only involved in carbohydrate metabolism, it also regulates protein synthesis. This work describes the expression of proteins in the liver of a diabetic mouse and identifies the metabolic pathways involved. Twenty-week-old diabetic db/db mice were hepatectomized, after which proteins were separated by 2D-Polyacrylamide Gel Electrophoresis (2D-PAGE). Spots varying in intensity were analyzed using mass spectrometry, and biological function was assigned by the Database for Annotation, Visualization and Integrated Discovery (DAVID) software. A differential expression of 26 proteins was identified; among these were arginase-1, pyruvate carboxylase, peroxiredoxin-1, regucalcin, and sorbitol dehydrogenase. Bioinformatics analysis indicated that many of these proteins are mitochondrial and participate in metabolic pathways, such as the citrate cycle, fructose and mannose metabolism, and glycolysis or gluconeogenesis. In addition, these proteins are related to oxidation-reduction reactions and to the molecular functions of vitamin binding and amino acid metabolism. In conclusion, the proteomic profile of the liver of the diabetic db/db mouse exhibited mainly alterations in the metabolism of carbohydrates and nitrogen. These differences illustrate the heterogeneity of diabetes in its different stages and under different conditions and highlight the need to improve treatments for this disease.
Jiang, Min; Lash, Gendie E; Zhao, Xueqing; Long, Yan; Guo, Caijiao; Yang, Hongling
2018-05-07
Circular RNAs (circRNAs) are transcribed prevalently in the genome; however, their potential roles in multiple cardiovascular diseases, particularly preeclampsia (PE), are not yet well understood. This study investigated the expression profiles of circRNAs and explored circRNA-mediated pregnancy-associated plasma protein A (PAPP-A) expression as a potential biomarker for PE before 20 weeks of pregnancy. A nested case-control two-phase screening/validation study was performed in pregnant women before 20 weeks of gestation (before clinical diagnosis) at Guangzhou Women and Children's Medical Center from 2012 to 2015. In the screening phase, circRNA expression profiles of blood cells were assessed using a human circRNA microarray, which was designed to detect simultaneously 5396 circRNAs, in 5 patients with PE and 5 age- and gestational week-matched controls. In the validation phase, 18 circRNAs in blood cells predicted by bioinformatics tools were validated by quantitative reverse transcription PCR in a cohort of 60 patients (PE and age-, gestational week-, and sample storage time-matched controls). Then, we examined the involvement of circRNAs in PE-related pathways via interactions with miRNAs by multiple bioinformatics approaches. Bioinformatics analysis predicted that hsa_circ_0004904 and hsa_circ_0001855 miRNA sponges directly target PAPP-A. PAPP-A was verified in the serum of the same cohort of patients using an enzyme-linked immunosorbent assay. Finally, we combined PAPP-A with circRNAs to create a novel preclinical diagnostic model for PE with logistic regression and evaluated the efficiency of this model with receiver operating curve analysis. Volcano plot analysis using various parameters showed that circRNAs were differentially expressed among both groups (P < 0.01, fold change > 2). 
In the screening phase, we found that 2178 circRNAs were differentially expressed between the control and PE groups, in which 884 circRNAs were downregulated and 1294 circRNAs were upregulated in the PE group compared with the control group. In the validation phase, two circRNAs, hsa_circ_0004904 and hsa_circ_0001855, were significantly upregulated in PE patients compared with healthy pregnant women (P < 0.05). PAPP-A expression levels, related to the two circRNAs based on bioinformatics prediction, were increased in the PE group compared with the control group. The area under the curve of the combined model was 0.94 in the predicted PE subjects. This is the first study to report circRNA profiling in patients with PE prior to the onset of symptoms. Our data suggested that hsa_circ_0004904 and hsa_circ_0001855 combined with PAPP-A might be promising biomarkers for the detection of PE. Moreover, circRNAs may provide new insights into the potential mechanisms underlying the pathophysiology of PE. © 2018 The Author(s). Published by S. Karger AG, Basel.
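The area under the ROC curve reported for the combined model can be read as the probability that a randomly chosen case receives a higher model score than a randomly chosen control. A minimal sketch of that rank-based definition follows; the scores are made up and the `auc` helper is hypothetical, not the authors' implementation.

```python
def auc(case_scores, control_scores):
    """Area under the ROC curve via pairwise comparison.

    Counts the fraction of (case, control) pairs in which the case has the
    higher score; ties contribute one half.
    """
    wins = 0.0
    for case in case_scores:
        for control in control_scores:
            if case > control:
                wins += 1.0
            elif case == control:
                wins += 0.5
    return wins / (len(case_scores) * len(control_scores))

# Made-up predicted probabilities from a hypothetical logistic model
cases = [0.9, 0.8, 0.4]
controls = [0.7, 0.3, 0.2]
print(round(auc(cases, controls), 3))
```

The pairwise loop is quadratic in sample size, which is acceptable for a cohort of a few dozen subjects; larger data sets would normally use a rank-sum formulation.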
Dai, Guangyao; Yao, Xiaoguang; Zhang, Yubin; Gu, Jianbin; Geng, Yunfeng; Xue, Fei; Zhang, Jingcheng
2018-04-01
Cancer-associated fibroblasts (CAFs) contribute to the proliferation of colorectal cancer (CRC) cells. However, the mechanism by which CAFs develop in the tumor microenvironment remains unknown. Exosomes may be involved in activating CAFs. Using a miRNA expression profiling array, we determined the miRNA expression profile of secretory exosomes in CRC cells and then identified potential miRNAs with significant differential expression compared to normal cells via enrichment analysis. Predicted targets of candidate miRNAs were then assessed via bioinformatics analysis. Real-time qPCR, western blot, and cell cycle analyses were performed to evaluate the role of candidate exosomal miRNAs. Luciferase reporter assays were applied to confirm whether candidate exosomal miRNAs control target pathway expression. A CRC xenograft mouse model was constructed to evaluate tumor growth in vivo. Exosomes from CRC cells contained significantly higher levels of miR-10b than did exosomes from normal colorectal epithelial cells. Moreover, exosomes containing miR-10b were transferred to fibroblasts. Bioinformatics analysis identified PIK3CA as a potential target of miR-10b. Luciferase reporter assays confirmed that miR-10b directly inhibited PIK3CA expression. Co-culturing fibroblasts with exosomes containing miR-10b significantly suppressed PIK3CA expression and decreased PI3K/Akt/mTOR pathway activity. Finally, exosomes containing miR-10b reduced fibroblast proliferation but promoted expression of TGF-β and SM α-actin, suggesting that exosomal miR-10b may activate fibroblasts to become CAFs that express myofibroblast markers. These activated fibroblasts were able to promote CRC growth in vitro and in vivo. CRC-derived exosomes actively promote disease progression by modulating surrounding stromal cells, which subsequently acquire features of CAFs. Copyright © 2018 Société Française du Cancer. Published by Elsevier Masson SAS. All rights reserved.
Li, Nan; Han, Zhenzhen; Li, Lin; Zhang, Bing; Liu, Zhidong; Li, Jiawei
2018-01-01
The objective of this study was to investigate the effects of the solid lipid nanoparticles of baicalin (BA-SLNs) on an experimental cataract model and explore the molecular mechanism combined with bioinformatics analysis. The transparency of lens was observed daily by slit-lamp and photography. Lenticular opacity was graded. Two-dimensional gel electrophoresis (2-DE) was employed to analyze the differential protein expression modes in each group. Proteins of interest were subjected to protein identification by nano-liquid chromatography tandem mass spectrometry (LC-MS/MS). Bioinformatics analysis was performed using the Ingenuity Pathway Analysis (IPA) online software to comprehend the biological implications of the proteins identified by proteomics. At the end of the sodium selenite-induced cataract progression, almost all lenses from the model group developed partial nuclear opacity; however, all lenses were clear and normal in the blank group. There was no significant difference between the BA-SLNs group and the blank group. Many protein spots were differently expressed in 2-DE patterns of total proteins of lenses from each group, and 65 highly different protein spots were selected to be identified between the BA-SLNs group and the model group. A total of 23 proteins were identified, and 12 of which were crystalline proteins. We considered crystalline proteins to play important roles in preserving the normal expression levels of proteins and the transparency of lenses. The general trend in the BA-SLN-treated lenses' data showed that BA-SLNs regulated the protein expression mode of cataract lenses to normal lenses. Our findings suggest that BA-SLNs may be a potential therapeutic agent in treating cataract by regulating protein expression and may also be a strong candidate for future clinical research.
A role for circadian evening elements in cold-regulated gene expression in Arabidopsis.
Mikkelsen, Michael D; Thomashow, Michael F
2009-10-01
The plant transcriptome is dramatically altered in response to low temperature. The cis-acting DNA regulatory elements and trans-acting factors that regulate the majority of cold-regulated genes are unknown. Previous bioinformatic analysis has indicated that the promoters of cold-induced genes are enriched in the Evening Element (EE), AAAATATCT, a DNA regulatory element that has a role in circadian-regulated gene expression. Here we tested the role of EE and EE-like (EEL) elements in cold-induced expression of two Arabidopsis genes, CONSTANS-like 1 (COL1; At5g54470) and a gene encoding a 27-kDa protein of unknown function that we designated COLD-REGULATED GENE 27 (COR27; At5g42900). Mutational analysis indicated that the EE/EEL elements were required for cold induction of COL1 and COR27, and that their action was amplified through coupling with ABA response element (ABRE)-like (ABREL) motifs. An artificial promoter consisting solely of four EE motifs interspersed with three ABREL motifs was sufficient to impart cold-induced gene expression. Both COL1 and COR27 were found to be regulated by the circadian clock at warm growth temperatures and cold-induction of COR27 was gated by the clock. These results suggest that cold- and clock-regulated gene expression are integrated through regulatory proteins that bind to EE and EEL elements supported by transcription factors acting at ABREL sequences. Bioinformatic analysis indicated that the coupling of EE and EEL motifs with ABREL motifs is highly enriched in cold-induced genes and thus may constitute a DNA regulatory element pair with a significant role in configuring the low-temperature transcriptome.
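Searching promoters for the Evening Element quoted above (AAAATATCT) is a straightforward string scan on both strands. The sketch below assumes exact matching only; the EE-like (EEL) variants mentioned in the abstract would need a degenerate pattern that the abstract does not specify. The example promoter sequence is invented.

```python
def reverse_complement(seq):
    """Return the reverse complement of an uppercase A/C/G/T sequence."""
    complement = {"A": "T", "C": "G", "G": "C", "T": "A"}
    return "".join(complement[base] for base in reversed(seq))

def find_motif(sequence, motif="AAAATATCT"):
    """Find exact motif occurrences on both strands.

    Returns a sorted list of (0-based start on the forward strand, strand).
    """
    sequence = sequence.upper()
    hits = []
    for strand, pattern in (("+", motif), ("-", reverse_complement(motif))):
        start = sequence.find(pattern)
        while start != -1:
            hits.append((start, strand))
            # Step by one so overlapping occurrences are also reported
            start = sequence.find(pattern, start + 1)
    return sorted(hits)

# Invented promoter fragment with one EE on each strand
promoter = "GGAAAATATCTCCAGATATTTTGG"
print(find_motif(promoter))
```

Counting such hits across the promoters of cold-induced genes and comparing against a background gene set is the usual basis for an enrichment statement like the one in the abstract.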
Schönbach, Christian; Li, Jinyan; Ma, Lan; Horton, Paul; Sjaugi, Muhammad Farhan; Ranganathan, Shoba
2018-01-19
The 16th International Conference on Bioinformatics (InCoB) was held at Tsinghua University, Shenzhen from September 20 to 22, 2017. The annual conference of the Asia-Pacific Bioinformatics Network featured six keynotes, two invited talks, a panel discussion on big data driven bioinformatics and precision medicine, and 66 oral presentations of accepted research articles or posters. Fifty-seven articles comprising a topic assortment of algorithms, biomolecular networks, cancer and disease informatics, drug-target interactions and drug efficacy, gene regulation and expression, imaging, immunoinformatics, metagenomics, next generation sequencing for genomics and transcriptomics, ontologies, post-translational modification, and structural bioinformatics are the subject of this editorial for the InCoB2017 supplement issues in BMC Genomics, BMC Bioinformatics, BMC Systems Biology and BMC Medical Genomics. New Delhi will be the location of InCoB2018, scheduled for September 26-28, 2018.
Peng, Silu; Yang, Huilin; Zhu, Du; Zhang, Zhibin; Yan, Riming; Wang, Ya
2016-04-14
Huperzine A (HupA) was approved as a drug for the treatment of Alzheimer's disease. The HupA biosynthetic pathway starts with lysine decarboxylase (LDC), which catalyzes the conversion of lysine to cadaverine. In this study, we cloned and expressed an LDC gene from a HupA-producing endophytic fungus, and tested LDC activities. An endophytic fungus, Shiraia sp. Slf14 from Huperzia serrata, was used. The LDC gene was obtained by RT-PCR and cloned into the pET-22b(+) and pET-32a(+) vectors to construct the recombinant plasmids pET-22b-LDC and pET-32a-LDC. These two recombinant plasmids were transformed into E. coli BL21 and the cells were cultured for 8 h at 24 °C and 200 r/min, with 1×10⁻³ mol/L IPTG added to the medium, to express the respective LDC proteins. LDC proteins were purified by Ni2+ affinity chromatography. Catalytic activities were measured by Thin Layer Chromatography (TLC). Finally, the physicochemical properties and structures of these two LDCs were obtained with bioinformatics software. LDC and Trx-LDC were expressed in E. coli BL21 successfully. SDS-PAGE analysis showed that the molecular weights of LDC and Trx-LDC were 24.4 kDa and 42.7 kDa respectively, which is consistent with the bioinformatics analysis. In addition, TLC analysis revealed that both LDC and Trx-LDC had catalytic abilities. This work can provide fundamental data for enriching LDC molecular information and help reveal the HupA biosynthetic pathway in endophytic fungi.
Fang, Xiang; Li, Ning-qiu; Fu, Xiao-zhe; Li, Kai-bin; Lin, Qiang; Liu, Li-hui; Shi, Cun-bin; Wu, Shu-qin
2015-07-01
As a key component of life science, bioinformatics has been widely applied in genomics, transcriptomics, and proteomics. However, the requirement of high-performance computers rather than common personal computers for constructing a bioinformatics platform has significantly limited the application of bioinformatics in aquatic science. In this study, we constructed a bioinformatic analysis platform for aquatic pathogens based on the MilkyWay-2 supercomputer. The platform consisted of three functional modules: genomic and transcriptomic sequencing data analysis, protein structure prediction, and molecular dynamics simulations. To validate the practicability of the platform, we performed bioinformatic analysis on aquatic pathogenic organisms. For example, genes of Flavobacterium johnsoniae M168 were identified and annotated via Blast searches, GO and InterPro annotations. Protein structural models for five small segments of grass carp reovirus HZ-08 were constructed by homology modeling. Molecular dynamics simulations were performed on outer membrane protein A of Aeromonas hydrophila, and the changes of system temperature, total energy, root mean square deviation and conformation of the loops during equilibration were also observed. These results showed that the bioinformatic analysis platform for aquatic pathogens has been successfully built on the MilkyWay-2 supercomputer. This study will provide insights into the construction of bioinformatic analysis platforms for other subjects.
Fu, Wenjiang J.; Stromberg, Arnold J.; Viele, Kert; Carroll, Raymond J.; Wu, Guoyao
2009-01-01
Over the past two decades, there have been revolutionary developments in life science technologies characterized by high throughput, high efficiency, and rapid computation. Nutritionists now have the advanced methodologies for the analysis of DNA, RNA, protein, low-molecular-weight metabolites, as well as access to bioinformatics databases. Statistics, which can be defined as the process of making scientific inferences from data that contain variability, has historically played an integral role in advancing nutritional sciences. Currently, in the era of systems biology, statistics has become an increasingly important tool to quantitatively analyze information about biological macromolecules. This article describes general terms used in statistical analysis of large, complex experimental data. These terms include experimental design, power analysis, sample size calculation, and experimental errors (type I and II errors) for nutritional studies at population, tissue, cellular, and molecular levels. In addition, we highlighted various sources of experimental variations in studies involving microarray gene expression, real-time polymerase chain reaction, proteomics, and other bioinformatics technologies. Moreover, we provided guidelines for nutritionists and other biomedical scientists to plan and conduct studies and to analyze the complex data. Appropriate statistical analyses are expected to make an important contribution to solving major nutrition-associated problems in humans and animals (including obesity, diabetes, cardiovascular disease, cancer, ageing, and intrauterine fetal retardation). PMID:20233650
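The sample size calculation mentioned in the abstract above can be illustrated with the standard normal-approximation formula for comparing two group means. This is a minimal sketch, not a method taken from the article: the function name, the effect size, and the standard deviation below are illustrative assumptions.

```python
from math import ceil
from statistics import NormalDist


def samples_per_group(effect, sd, alpha=0.05, power=0.80):
    """Approximate samples per group for a two-sided, two-sample z-test.

    effect: smallest difference in means worth detecting (assumed value)
    sd:     common standard deviation of the measurement (assumed value)
    alpha:  accepted type I error rate
    power:  1 - accepted type II error rate
    """
    z = NormalDist()
    z_alpha = z.inv_cdf(1 - alpha / 2)  # critical value for the type I error
    z_beta = z.inv_cdf(power)           # quantile matching the desired power
    n = 2 * ((z_alpha + z_beta) * sd / effect) ** 2
    return ceil(n)


# Hypothetical design: detect a difference of one standard deviation.
n = samples_per_group(effect=1.0, sd=1.0)
print(n)
```

Because the formula uses normal quantiles rather than the t distribution, it slightly underestimates the requirement for small groups; it is meant only to show how type I error, power, variability, and effect size interact.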
Keck, Michael; van Dijk, Roelof Maarten; Deeg, Cornelia A; Kistler, Katharina; Walker, Andreas; von Rüden, Eva-Lotta; Russmann, Vera; Hauck, Stefanie M; Potschka, Heidrun
2018-04-01
Information about epileptogenesis-associated changes in protein expression patterns is of particular interest for future selection of target and biomarker candidates. Bioinformatic analysis of proteomic data sets can increase our knowledge about molecular alterations characterizing the different phases of epilepsy development following an initial epileptogenic insult. Here, we report findings from a focused analysis of proteomic data obtained for the hippocampus and parahippocampal cortex samples collected during the early post-insult phase, latency phase, and chronic phase of a rat model of epileptogenesis. The study focused on proteins functionally associated with cell stress, cell death, extracellular matrix (ECM) remodeling, cell-ECM interaction, cell-cell interaction, angiogenesis, and blood-brain barrier function. The analysis revealed prominent pathway enrichment providing information about the complex expression alterations of the respective protein groups. In the hippocampus, the number of differentially expressed proteins declined over time during the course of epileptogenesis. In contrast, a peak in the regulation of proteins linked with cell stress and death as well as ECM and cell-cell interaction became evident at later phases during epileptogenesis in the parahippocampal cortex. The data sets provide valuable information about the time course of protein expression patterns during epileptogenesis for a series of proteins. Moreover, the findings provide comprehensive novel information about expression alterations of proteins that have not been discussed yet in the context of epileptogenesis. These for instance include different members of the lamin protein family as well as the fermitin family member 2 (FERMT2). Induction of FERMT2 and other selected proteins, CD18 (ITGB2), CD44 and Nucleolin were confirmed by immunohistochemistry. 
Taken together, focused bioinformatic analysis of the proteomic data sets completes our knowledge about molecular alterations linked with cell death and cellular plasticity during epileptogenesis. The analysis provided can guide future selection of target and biomarker candidates. Copyright © 2018 Elsevier Inc. All rights reserved.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Handakumbura, Pubudu; Hixson, Kim K.; Purvine, Samuel O.
We present a simple one-pot extraction protocol, which rapidly isolates hydrophilic metabolites, lipids, and proteins from the same pulverized plant sample. Also detailed is a global plant proteomics sample preparation method utilizing iTRAQ multiplexing reagents that enables deep proteome coverage due to the use of HPLC fractionation of the peptides prior to mass spectrometric analysis. We have successfully used this protocol on several different plant tissues (e.g., roots, stems, leaves) from different plants (e.g., sorghum, poplar, Arabidopsis, soybean), and have been able to detect and quantify thousands of proteins. Multiplexing strategies such as iTRAQ and the bioinformatics strategy outlined here ultimately provide insight into which proteins are significantly changed in abundance between two or more groups (e.g., control, perturbation). Our bioinformatics strategy yields z-score values, which normalize the expression data into a format that can easily be cross-compared with other expression data (e.g., metabolomics, transcriptomics) obtained from different analytical methods and instrumentation.
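The z-score normalization described in the abstract above can be sketched as follows. This is a minimal illustration that assumes the z-score is computed across all proteins in one sample comparison; the record does not state the exact procedure, and the protein names and log2 ratios are made up.

```python
from statistics import mean, stdev


def z_scores(log_ratios):
    """Return {protein: z-score} for a mapping of protein -> log2 abundance ratio."""
    values = list(log_ratios.values())
    mu = mean(values)
    sigma = stdev(values)  # sample standard deviation; needs at least two values
    return {name: (value - mu) / sigma for name, value in log_ratios.items()}


# Hypothetical log2(perturbation / control) ratios; not data from the protocol.
toy = {"protA": 0.1, "protB": -0.2, "protC": 1.5, "protD": 0.0, "protE": -0.1}
scores = z_scores(toy)

# List proteins from the most to the least extreme change.
for name, z in sorted(scores.items(), key=lambda item: -abs(item[1])):
    print(f"{name}\t{z:+.2f}")
```

After this transformation every data set has mean 0 and standard deviation 1, which is what makes values from different platforms comparable on one scale.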
DOE Office of Scientific and Technical Information (OSTI.GOV)
Lu, Jian, E-mail: lujian@ujs.edu.cn; Institute of Life Sciences, Jiangsu University, Zhenjiang 212013; Zhou, Zhongping
Cadmium is a toxic heavy metal present in the environment and in industrial materials. Cadmium has demonstrated carcinogenic activity that induces cell transformation, but how this occurs is unclear. We used 2D-DIGE and MALDI TOF/TOF MS combined with bioinformatics and immunoblotting to investigate the molecular mechanism of cadmium transformation. We found that small GTPases were critical for transformation. Additionally, proteins involved in mitochondrial transcription, DNA repair, and translation also had altered expression patterns in cadmium-treated cells. Collectively, our results suggest that activation of small GTPases contributes to cadmium-induced transformation of colon cells. - Highlights: • A colon epithelial cell line is successfully transformed by cadmium for the first time. • 2D-DIGE is applied to visualize the differentially expressed proteins. • RhoA plays an important role in cadmium-induced malignant transformation. • Bioinformatic and experimental methods are combined to explore new mechanisms.
Jiang, Z; Gui, S; Zhang, Y
2011-05-01
Nonfunctioning pituitary adenomas (NFPAs) are relatively common, accounting for 30% of all pituitary adenomas; however, their pathogenesis remains enigmatic. To explore the possible pathogenesis of NFPAs, we used fiber-optic BeadArray to examine gene expression in 5 NFPAs compared with 3 normal pituitaries. Four differentially expressed genes were chosen randomly for validation by reverse transcriptase-real time quantitative polymerase chain reaction (RT-qPCR). We then analyzed the differentially expressed gene profile with the Kyoto Encyclopedia of Genes and Genomes (KEGG). The array analysis identified significant increases in the expression of 1,402 genes and 383 expressed sequence tags (ESTs), and decreases in 1,697 genes and 113 ESTs in the NFPAs. Bioinformatic and pathway analysis showed that the genes HIGD1B, FAM5C, and PMAIP1 and the cell-cycle regulation pathway may play an important role in tumorigenesis and progression of NFPAs. Our data suggest that fiber-optic BeadArray combined with pathway analysis of differential gene expression profiles is a valid approach for investigating the pathogenesis of tumors. © Georg Thieme Verlag KG Stuttgart · New York.
Exploring Wound-Healing Genomic Machinery with a Network-Based Approach
Vitali, Francesca; Marini, Simone; Balli, Martina; Grosemans, Hanne; Sampaolesi, Maurilio; Lussier, Yves A.; Cusella De Angelis, Maria Gabriella; Bellazzi, Riccardo
2017-01-01
The molecular mechanisms underlying tissue regeneration and wound healing are still poorly understood despite their importance. In this paper we develop a bioinformatics approach, combining biology and network theory to drive experiments for better understanding the genetic underpinnings of wound healing mechanisms and for selecting potential drug targets. We start by selecting literature-relevant genes in murine wound healing, and inferring from them a Protein-Protein Interaction (PPI) network. Then, we analyze the network to rank wound healing-related genes according to their topological properties. Lastly, we perform a procedure for in-silico simulation of a treatment action in a biological pathway. The findings obtained by applying the developed pipeline, including gene expression analysis, confirms how a network-based bioinformatics method is able to prioritize candidate genes for in vitro analysis, thus speeding up the understanding of molecular mechanisms and supporting the discovery of potential drug targets. PMID:28635674
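The step of ranking genes by their topological properties in a PPI network can be sketched with the simplest such property, node degree. This is only an illustration under stated assumptions: the abstract does not say which topological measures were used, and the gene labels and edges below are placeholders rather than real wound-healing interactions.

```python
from collections import defaultdict


def rank_by_degree(edges):
    """Rank nodes of an undirected network by number of distinct neighbors."""
    neighbors = defaultdict(set)
    for a, b in edges:
        if a == b:
            continue  # ignore self-interactions
        neighbors[a].add(b)
        neighbors[b].add(a)
    # Highest degree first; ties broken alphabetically for a stable order.
    return sorted(neighbors, key=lambda gene: (-len(neighbors[gene]), gene))


# Placeholder interaction list (hypothetical genes G1..G5).
edges = [("G1", "G2"), ("G1", "G3"), ("G1", "G4"), ("G2", "G3"), ("G4", "G5")]
ranking = rank_by_degree(edges)
print(ranking)
```

Other centrality measures, such as betweenness or closeness, would slot into the same sort key; degree is used here only because it needs no extra library.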
Fiat lux! Phylogeny and bioinformatics shed light on GABA functions in plants.
Renault, Hugues
2013-06-01
The non-protein amino acid γ-aminobutyric acid (GABA) accumulates in plants in response to a wide variety of environmental cues. Recent data point toward an involvement of GABA in tricarboxylic acid (TCA) cycle activity and respiration, especially in stressed roots. To gain further insights into potential GABA functions in plants, phylogenetic and bioinformatic approaches were undertaken. Phylogenetic reconstruction of the GABA transaminase (GABA-T) protein family revealed the monophyletic nature of plant GABA-Ts. However, this analysis also pointed to the common origin of several plant aminotransferases families, which were found more similar to plant GABA-Ts than yeast and human GABA-Ts. A computational analysis of AtGABA-T co-expressed genes was performed in roots and in stress conditions. This second approach uncovered a strong connection between GABA metabolism and glyoxylate cycle during stress. Both in silico analyses open new perspectives and hypotheses for GABA metabolic functions in plants.
Human L-DOPA decarboxylase mRNA is a target of miR-145: A prediction to validation workflow.
Papadopoulos, Emmanuel I; Fragoulis, Emmanuel G; Scorilas, Andreas
2015-01-10
l-DOPA decarboxylase (DDC) is a multiply-regulated gene which encodes the enzyme that catalyzes the biosynthesis of dopamine in humans. MicroRNAs comprise a novel class of endogenously transcribed small RNAs that can post-transcriptionally regulate the expression of various genes. Given that the mechanism of microRNA target recognition remains elusive, several genes, including DDC, have not yet been identified as microRNA targets. Nevertheless, a number of specifically designed bioinformatic algorithms provide candidate miRNAs for almost every gene, but still their results exhibit moderate accuracy and should be experimentally validated. Motivated by the above, we herein sought to discover a microRNA that regulates DDC expression. By using the current algorithms according to bibliographic recommendations we found that miR-145 could be predicted with high specificity as a candidate regulatory microRNA for DDC expression. Thus, a validation experiment followed by firstly transfecting an appropriate cell culture system with a synthetic miR-145 sequence and sequentially assessing the mRNA and protein levels of DDC via real-time PCR and Western blotting, respectively. Our analysis revealed that miR-145 had no significant impact on protein levels of DDC but managed to dramatically downregulate its mRNA expression. Overall, the experimental and bioinformatic analysis conducted herein indicate that miR-145 has the ability to regulate DDC mRNA expression and potentially this occurs by recognizing its mRNA as a target. Copyright © 2014. Published by Elsevier B.V.
Zhao, Wei; Liu, Zhongjie; Yu, Xujiao; Lai, Luying; Li, Haobo; Liu, Zipeng; Li, Le; Jiang, Shan; Xia, Zhengyuan; Xu, Shi-yuan
2016-02-01
Bupivacaine, a commonly used local anesthetic, has potential neurotoxicity through diverse signaling pathways. However, the key mechanism of bupivacaine-induced neurotoxicity remains unclear. Cultured human SH-SY5Y neuroblastoma cells were treated with bupivacaine (bupivacaine group) or left untreated (control group) for 24 h. Compared to the control group, bupivacaine significantly increased cyto-inhibition, cellular reactive oxygen species, DNA damage, mitochondrial injury, apoptosis (increased TUNEL-positive cells, cleaved caspase 3, and Bcl-2/Bax), and activated autophagy (enhanced LC3II/LC3I ratio). To explore changes in protein expression and intercommunication among the pathways involved in bupivacaine-induced neurotoxicity, an 8-plex iTRAQ proteomic technique and bioinformatics analysis were performed. Compared to the control group, 241 differentially expressed proteins were identified, of which 145 were up-regulated and 96 were down-regulated. Bioinformatics analysis of the cross-talk between the significant proteins with altered expression in bupivacaine-induced neurotoxicity indicated that phosphatidylinositol-3-kinase (PI3K) was the most frequently targeted protein in each of the interactions. We further confirmed these results by determining the downstream targets of the identified signaling pathways (PI3K, Akt, FoxO1, Erk, and JNK). In conclusion, our study demonstrated that PI3K may play a central role in connecting and regulating the signaling pathways that contribute to bupivacaine-induced neurotoxicity. © 2015 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim.
Hu, Qiping; Fu, Jun; Luo, Bin; Huang, Miao; Guo, Wenwen; Lin, Yongda; Xie, Xiaoxun; Xiao, Shaowen
2015-04-01
Given its tumor-specific expression, including liver cancer, OY-TES-1 is a potential molecular marker for the diagnosis and immunotherapy of liver cancers. However, investigations of the mechanisms and the role of OY-TES-1 in liver cancer are rare. In the present study, based on a comprehensive bioinformatic analysis combined with RNA interference (RNAi) and oligonucleotide microarray, we report for the first time that downregulation of OY-TES-1 resulted in significant changes in expression of NANOG, CD9, CCND2 and CDCA3 in the liver cancer cell line BEL-7404. NANOG, CD9, CCND2 and CDCA3 may be involved in cell proliferation, migration, invasion and apoptosis, yet also may be functionally related to each other and OY-TES-1. Among these molecules, we identified that NANOG, containing a Kazal-2 binding motif and homeobox, may be the most likely candidate protein interacting with OY-TES-1 in liver cancer. Thus, the present study may provide important information for further investigation of the roles of OY-TES-1 in liver cancer.
Organellar proteome analyses of ricin toxin-treated HeLa cells.
Liao, Peng; Li, Yunhu; Li, Hongyang; Liu, Wensen
2016-07-01
Apoptosis triggered by ricin toxin (RT) has previously been associated with certain cellular organellar compartments, but the diversity in the composition of the organellar proteins remains unclear. Here, we applied a shotgun proteomics strategy to examine the differential expression of proteins in the mitochondria, nuclei, and cytoplasm of HeLa cells treated and not treated with RT. Data were combined with a global bioinformatics analysis and experimental confirmations. A total of 3107 proteins were identified. Bioinformatics predictors (Proteome Analyst, WoLF PSORT, TargetP, MitoPred, Nucleo, MultiLoc, and k-nearest neighbor) and a Bayesian model that integrated these predictors were used to predict the locations of 1349 distinct organellar proteins. Our data indicate that the Bayesian model was more efficient than the individual implementation of these predictors. Additionally, a Biomolecular Interaction Network (BIN) analysis was used to identify 149 BIN subnetworks. Our experimental confirmations indicate that certain apoptosis-related proteins (e.g. cytochrome c, enolase, lamin B, Bax, and Drp1) were found to be translocated and had variable expression levels. These results provide new insights for the systematic understanding of RT-induced apoptosis responses. © The Author(s) 2014.
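One way to picture how a Bayesian model can integrate several localization predictors is a naive Bayes vote. The sketch below is an assumption-laden illustration, not the model used in the study: it treats predictors as conditionally independent, gives each a single symmetric accuracy, and uses hypothetical predictor names and accuracy values.

```python
def combine_votes(votes, accuracy, prior=0.5):
    """Posterior probability that a protein is in a compartment.

    votes:    {predictor: True/False}, whether each predictor assigns the
              protein to the compartment
    accuracy: {predictor: probability that the predictor is correct}
              (assumed equal for positive and negative calls)
    prior:    prior probability of the compartment before seeing any vote
    """
    odds = prior / (1 - prior)
    for name, vote in votes.items():
        acc = accuracy[name]
        # A supporting vote multiplies the odds by acc/(1-acc);
        # a dissenting vote multiplies them by the inverse.
        odds *= acc / (1 - acc) if vote else (1 - acc) / acc
    return odds / (1 + odds)


# Hypothetical predictors and accuracies, chosen only for illustration.
accuracy = {"predictor_a": 0.8, "predictor_b": 0.7, "predictor_c": 0.6}
votes = {"predictor_a": True, "predictor_b": True, "predictor_c": False}
posterior = combine_votes(votes, accuracy)
print(round(posterior, 3))
```

The combined posterior weights each tool by how reliable it is assumed to be, which is one reason an integrated model can outperform any single predictor.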
The Human Cell Surfaceome of Breast Tumors
da Cunha, Júlia Pinheiro Chagas; Galante, Pedro Alexandre Favoretto; de Souza, Jorge Estefano Santana; Pieprzyk, Martin; Carraro, Dirce Maria; Old, Lloyd J.; Camargo, Anamaria Aranha; de Souza, Sandro José
2013-01-01
Introduction. Cell surface proteins are ideal targets for cancer therapy and diagnosis. We have identified a set of more than 3700 genes that code for transmembrane proteins believed to be at the human cell surface. Methods. We used a high-throughput qPCR system for the analysis of 573 cell surface protein-coding genes in 12 primary breast tumors, 8 breast cell lines, and 21 normal human tissues including breast. To better understand the role of these genes in breast tumors, we used a series of bioinformatics strategies to integrate different types of datasets, such as KEGG, protein-protein interaction databases, ONCOMINE, and data from the literature. Results. We found that at least 77 genes are overexpressed in primary breast tumors, while at least 2 of them also have a restricted expression pattern in normal tissues. We found common signaling pathways that may be regulated in breast tumors through the overexpression of these cell surface protein-coding genes. Furthermore, a comparison was made between the genes found in this report and other genes associated with features clinically relevant for breast tumorigenesis. Conclusions. The expression profiling generated in this study, together with an integrative bioinformatics analysis, allowed us to identify putative targets for breast tumors. PMID:24195083
Privacy Preserving PCA on Distributed Bioinformatics Datasets
ERIC Educational Resources Information Center
Li, Xin
2011-01-01
In recent years, new bioinformatics technologies, such as gene expression microarray, genome-wide association study, proteomics, and metabolomics, have been widely used to simultaneously identify a huge number of human genomic/genetic biomarkers, generate a tremendously large amount of data, and dramatically increase the knowledge on human…
NASA Technical Reports Server (NTRS)
Karlin, Samuel
2004-01-01
We used bioinformatics methods to study phylogenetic relations and differentiation patterns of the archaeal chaperonin 60 kDa heat-shock protein (HSP60) genes in support of the study of differential expression patterns of the three chaperonin genes encoded in Sulfolobus shibatae.
Specific regions of the brain are capable of fructose metabolism.
Oppelt, Sarah A; Zhang, Wanming; Tolan, Dean R
2017-02-15
High fructose consumption in the Western diet correlates with disease states such as obesity and metabolic syndrome complications, including type II diabetes, chronic kidney disease, and non-alcoholic fatty liver disease. Liver and kidneys are responsible for metabolism of 40-60% of ingested fructose, while the physiological fate of the remaining fructose remains poorly understood. The primary metabolic pathway for fructose includes the fructose-transporting solute carrier 2a transport proteins (SLC2a or GLUT), including GLUT5 and GLUT9, ketohexokinase (KHK), and aldolase. Bioinformatic analysis of gene expression encoding these proteins (glut5, glut9, khk, and aldoC, respectively) identifies other organs capable of this fructose metabolism. This analysis predicts brain, lymphoreticular tissue, placenta, and reproductive tissues as possible additional organs for fructose metabolism. While expression of these genes is highest in liver, the brain is predicted to have expression levels of these genes similar to kidney. RNA in situ hybridization of coronal slices of adult mouse brains validates the in silico expression of glut5, glut9, khk, and aldoC, and shows expression across many regions of the brain, with the most notable expression in the cerebellum, hippocampus, cortex, and olfactory bulb. Dissected samples of these brain regions show KHK and aldolase enzyme activity 5-10 times that in liver. Furthermore, rates of fructose oxidation in these brain regions are 15-150 times that of liver slices, confirming the bioinformatics prediction and in situ hybridization data. This suggests that previously unappreciated regions across the brain can use fructose, in addition to glucose, for energy production. Copyright © 2016 Elsevier B.V. All rights reserved.
Specific regions of the brain are capable of fructose metabolism
Oppelt, Sarah A.; Zhang, Wanming; Tolan, Dean R.
2017-01-01
High fructose consumption in the Western diet correlates with disease states such as obesity and metabolic syndrome complications, including type II diabetes, chronic kidney disease, and nonalcoholic fatty liver disease. Liver and kidneys are responsible for metabolism of 40–60% of ingested fructose, while the physiological fate of the remaining fructose remains poorly understood. The primary metabolic pathway for fructose includes the fructose-transporting solute carrier 2a transport proteins (SLC2a or GLUT), including GLUT5 and GLUT9, ketohexokinase (KHK), and aldolase. Bioinformatic analysis of gene expression encoding these proteins (glut5, glut9, khk, and aldoC, respectively) identifies other organs capable of this fructose metabolism. This analysis predicts brain, lymphoreticular tissue, placenta, and reproductive tissues as possible additional organs for fructose metabolism. While expression of these genes is highest in liver, the brain is predicted to have expression levels of these genes similar to kidney. RNA in situ hybridization of coronal slices of adult mouse brains validates the in silico expression of glut5, glut9, khk, and aldoC, and shows expression across many regions of the brain, with the most notable expression in the cerebellum, hippocampus, cortex, and olfactory bulb. Dissected samples of these brain regions show KHK and aldolase enzyme activity 5–10 times that in liver. Furthermore, rates of fructose oxidation in these brain regions are 15–150 times that of liver slices, confirming the bioinformatics prediction and in situ hybridization data. This suggests that previously unappreciated regions across the brain can use fructose, in addition to glucose, for energy production. PMID:28034722
Guo, Can-Jie; Xiao, Xiao; Sheng, Li; Chen, Lili; Zhong, Wei; Li, Hai; Hua, Jing; Ma, Xiong
2017-01-01
To analyze the long noncoding (lncRNA)-mRNA expression network and potential roles in rat hepatic stellate cells (HSCs) during activation. LncRNA expression was analyzed in quiescent and culture-activated HSCs by RNA sequencing, and differentially expressed lncRNAs verified by quantitative reverse transcription polymerase chain reaction (qRT-PCR) were subjected to bioinformatics analysis. In vivo analyses of differential lncRNA-mRNA expression were performed on a rat model of liver fibrosis. We identified upregulation of 12 lncRNAs and 155 mRNAs and downregulation of 12 lncRNAs and 374 mRNAs in activated HSCs. Additionally, we identified the differential expression of upregulated lncRNAs (NONRATT012636.2, NONRATT016788.2, and NONRATT021402.2) and downregulated lncRNAs (NONRATT007863.2, NONRATT019720.2, and NONRATT024061.2) in activated HSCs relative to levels observed in quiescent HSCs, and Gene Ontology and Kyoto Encyclopedia of Genes and Genomes pathway analyses showed that changes in lncRNAs associated with HSC activation revealed 11 significantly enriched pathways according to their predicted targets. Moreover, based on the predicted co-expression network, the relative dynamic levels of NONRATT013819.2 and lysyl oxidase (Lox) were compared during HSC activation both in vitro and in vivo. Our results confirmed the upregulation of lncRNA NONRATT013819.2 and Lox mRNA associated with the extracellular matrix (ECM)-related signaling pathway in HSCs and fibrotic livers. Our results detailing a dysregulated lncRNA-mRNA network might provide new treatment strategies for hepatic fibrosis based on findings indicating potentially critical roles for NONRATT013819.2 and Lox in ECM remodeling during HSC activation. © 2017 The Author(s). Published by S. Karger AG, Basel.
miRanalyzer: a microRNA detection and analysis tool for next-generation sequencing experiments.
Hackenberg, Michael; Sturm, Martin; Langenberger, David; Falcón-Pérez, Juan Manuel; Aransay, Ana M
2009-07-01
Next-generation sequencing now allows the sequencing of small RNA molecules and the estimation of their expression levels. Consequently, there will be a high demand for bioinformatics tools to cope with the several gigabytes of sequence data generated in each single deep-sequencing experiment. Against this background, we developed miRanalyzer, a web server tool for the analysis of deep-sequencing experiments for small RNAs. The web server tool requires a simple input file containing a list of unique reads and their copy numbers (expression levels). Using these data, miRanalyzer (i) detects all known microRNA sequences annotated in miRBase, (ii) finds all perfect matches against other libraries of transcribed sequences and (iii) predicts new microRNAs. The prediction of new microRNAs is especially important because there are many species with very few known microRNAs. Therefore, we implemented a highly accurate machine learning algorithm for the prediction of new microRNAs that reaches AUC values of 97.9% and recall values of up to 75% on unseen data. The web tool summarizes all the described steps in a single output page, which provides a comprehensive overview of the analysis, adding links to more detailed output pages for each analysis module. miRanalyzer is available at http://web.bioinformatics.cicbiogune.es/microRNA/.
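The input described above, a list of unique reads with their copy numbers, can be produced from raw reads by collapsing identical sequences. The sketch below is illustrative only: the tab-separated layout and the toy sequences are assumptions, and the exact file format expected by miRanalyzer should be taken from its documentation.

```python
from collections import Counter


def collapse_reads(reads):
    """Collapse raw reads into (sequence, copy number) pairs, most abundant first."""
    cleaned = (read.strip().upper() for read in reads)
    counts = Counter(seq for seq in cleaned if seq)  # skip empty lines
    return counts.most_common()


# Toy reads; real input would be the sequence lines of a sequencing run.
raw_reads = ["ACGTACGT", "acgtacgt", "TTGACCA", "ACGTACGT", " ", "TTGACCA", "GGCAT"]
unique_reads = collapse_reads(raw_reads)

# Assumed layout: one unique read per line, sequence and count separated by a tab.
for sequence, copies in unique_reads:
    print(f"{sequence}\t{copies}")
```

The copy number of each unique read then serves as its expression level, as the abstract describes.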
Leduc, Magalie S; Blair, Rachael Hageman; Verdugo, Ricardo A; Tsaih, Shirng-Wern; Walsh, Kenneth; Churchill, Gary A; Paigen, Beverly
2012-06-01
A higher incidence of coronary artery disease is associated with a lower level of HDL-cholesterol. We searched for genetic loci influencing HDL-cholesterol in F2 mice from a cross between MRL/MpJ and SM/J mice. Quantitative trait loci (QTL) mapping revealed one significant HDL QTL (Apoa2 locus), four suggestive QTL on chromosomes 10, 11, 13, and 18 and four additional QTL on chromosomes 1 proximal, 3, 4, and 7 after adjusting HDL for the strong Apoa2 locus. A novel nonsynonymous polymorphism supports Lipg as the QTL gene for the chromosome 18 QTL, and a difference in Abca1 expression in liver tissue supports it as the QTL gene for the chromosome 4 QTL. Using weighted gene co-expression network analysis, we identified a module that after adjustment for Apoa2, correlated with HDL, was genetically determined by a QTL on chromosome 11, and overlapped with the HDL QTL. A combination of bioinformatics tools and systems genetics helped identify several candidate genes for both the chromosome 11 HDL and module QTL based on differential expression between the parental strains, cis regulation of expression, and causality modeling. We conclude that integrating systems genetics to a more-traditional genetics approach improves the power of complex trait gene identification.
GEOquery: a bridge between the Gene Expression Omnibus (GEO) and BioConductor.
Davis, Sean; Meltzer, Paul S
2007-07-15
Microarray technology has become a standard molecular biology tool. Experimental data have been generated on a huge number of organisms, tissue types, treatment conditions and disease states. The Gene Expression Omnibus (Barrett et al., 2005), developed by the National Center for Biotechnology Information (NCBI) at the National Institutes of Health, is a repository of nearly 140,000 gene expression experiments. The BioConductor project (Gentleman et al., 2004) is an open-source and open-development software project built in the R statistical programming environment (R Development Core Team, 2005) for the analysis and comprehension of genomic data. The tools contained in the BioConductor project represent many state-of-the-art methods for the analysis of microarray and genomics data. We have developed a software tool that allows access to the wealth of information within GEO directly from BioConductor, eliminating many of the formatting and parsing problems that have made such analyses labor-intensive in the past. The software, called GEOquery, effectively establishes a bridge between GEO and BioConductor. Easy access to GEO data from BioConductor will likely lead to new analyses of GEO data using novel and rigorous statistical and bioinformatic tools. Facilitating analyses and meta-analyses of microarray data will increase the efficiency with which biologically important conclusions can be drawn from published genomic data. GEOquery is available as part of the BioConductor project.
SNHG16/miR-216-5p/ZEB1 signal pathway contributes to the tumorigenesis of cervical cancer cells.
Zhu, Hong; Zeng, Yan; Zhou, Chen-Chen; Ye, Weiping
2018-01-01
Long non-coding RNAs (lncRNAs) have been confirmed as crucial regulators in tumorigenesis. Small nucleolar RNA host gene 16 (SNHG16) has recently been identified as a potential oncogene in several types of cancer. However, its expression level and potential role in cervical cancer remain uncertain. In our research, we assessed the expression level of SNHG16 in clinical cervical cancer tissues and cells. We used functional assays to determine the biological effects of SNHG16 on proliferation and migration of cervical cancer cells. Using bioinformatics analysis tools, we found that miR-216-5p could interact with SNHG16 and that there was a negative correlation between the expression levels of miR-216-5p and SNHG16 in cervical cancer specimens. Furthermore, RIP assays, RNA pulldown and dual luciferase reporter assays confirmed that SNHG16 directly targeted miR-216-5p through microRNA binding sites in the SNHG16 sequence. Additionally, bioinformatics analysis provided evidence that ZEB1 was a potential target of miR-216-5p. Collectively, these results suggest that SNHG16 could serve as an oncogene that promotes tumor progression by acting as an endogenous 'sponge' to regulate miR-216-5p/ZEB1. Copyright © 2017 Elsevier Inc. All rights reserved.
Xia, Quan; Zhao, Yingli; Wang, Jiali; Qiao, Wenhao; Zhang, Dongling; Yin, Hao; Xu, Dujuan; Chen, Feihu
2017-07-01
4-amino-2-trifluoromethyl-phenyl retinate (ATPR) was reported to potentially inhibit proliferation and induce differentiation activity in some tumor cells. In this study, a proteomics approach was used to investigate the possible mechanism by screening the differentially expressed protein profiles of SGC-7901 cells before and after ATPR-treatment in vitro. Peptides digested from the total cellular proteins were analyzed by reverse phase LC-MS/MS followed by a label-free quantification analysis. The SEQUEST search engine was used to identify proteins and bioinformatics resources were used to investigate the involved pathways for the differentially expressed proteins. Thirteen down-regulated proteins were identified in the ATPR-treated group. Bioinformatics analysis showed that the effects of ATPR on 14-3-3ε might potentially involve the PI3K-AKT-FOXO pathway and P27Kip1 expression. Western blot and RT-PCR analysis showed that ATPR could inhibit AKT phosphorylation, up-regulate the expression of FOXO1A and P27Kip1 at both the protein and mRNA levels, and down-regulate the cytoplasmic expression of cyclin E and CDK2. ATPR-induced G0/G1 phase arrest and differentiation can be ablated if the P27kip1 gene is silenced with sequence-specific siRNA or in 14-3-3ε overexpression of SGC-7901 cells. ATPR might cause cell cycle arrest and differentiation in SGC-7901 cells by simultaneously inhibiting the phosphorylation of AKT and down-regulating 14-3-3ε. This change would then enhance the inhibition of cyclin E/CDK2 by up-regulating FOXO1A and P27Kip1. Our findings could be of value for finding new drug targets and for developing more effective differentiation inducer. © 2017 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim.
Analyzing gene expression profiles in dilated cardiomyopathy via bioinformatics methods.
Wang, Liming; Zhu, L; Luan, R; Wang, L; Fu, J; Wang, X; Sui, L
2016-10-10
Dilated cardiomyopathy (DCM) is characterized by ventricular dilatation, and it is a common cause of heart failure and cardiac transplantation. This study aimed to explore potential DCM-related genes and their underlying regulatory mechanism using bioinformatics methods. The gene expression profiles of GSE3586, comprising 15 normal samples and 13 DCM samples, were downloaded from the Gene Expression Omnibus database. The differentially expressed genes (DEGs) between normal and DCM samples were identified using the limma package in R. Pathway enrichment analysis of DEGs was then performed. Meanwhile, the potential transcription factors (TFs) and microRNAs (miRNAs) of these DEGs were predicted based on their binding sequences. In addition, DEGs were mapped to the cMap database to find potential small molecule drugs. A total of 4777 genes were identified as DEGs by comparing gene expression profiles between DCM and control samples. DEGs were significantly enriched in 26 pathways, such as lymphocyte TarBase pathway and androgen receptor signaling pathway. Furthermore, potential TFs (SP1, LEF1, and NFAT) were identified, as well as potential miRNAs (miR-9, miR-200 family, and miR-30 family). Additionally, small molecules like isoflupredone and trihexyphenidyl were found to be potential therapeutic drugs for DCM. The identified DEGs (PRSS12 and FOXG1), potential TFs, as well as potential miRNAs, might be involved in DCM.
Analyzing gene expression profiles in dilated cardiomyopathy via bioinformatics methods
Wang, Liming; Zhu, L.; Luan, R.; Wang, L.; Fu, J.; Wang, X.; Sui, L.
2016-01-01
Dilated cardiomyopathy (DCM) is characterized by ventricular dilatation, and it is a common cause of heart failure and cardiac transplantation. This study aimed to explore potential DCM-related genes and their underlying regulatory mechanism using bioinformatics methods. The gene expression profiles of GSE3586, comprising 15 normal samples and 13 DCM samples, were downloaded from the Gene Expression Omnibus database. The differentially expressed genes (DEGs) between normal and DCM samples were identified using the limma package in R. Pathway enrichment analysis of DEGs was then performed. Meanwhile, the potential transcription factors (TFs) and microRNAs (miRNAs) of these DEGs were predicted based on their binding sequences. In addition, DEGs were mapped to the cMap database to find potential small molecule drugs. A total of 4777 genes were identified as DEGs by comparing gene expression profiles between DCM and control samples. DEGs were significantly enriched in 26 pathways, such as lymphocyte TarBase pathway and androgen receptor signaling pathway. Furthermore, potential TFs (SP1, LEF1, and NFAT) were identified, as well as potential miRNAs (miR-9, miR-200 family, and miR-30 family). Additionally, small molecules like isoflupredone and trihexyphenidyl were found to be potential therapeutic drugs for DCM. The identified DEGs (PRSS12 and FOXG1), potential TFs, as well as potential miRNAs, might be involved in DCM. PMID:27737314
Technological advances and genomics in metazoan parasites.
Knox, D P
2004-02-01
Molecular biology has provided the means to identify parasite proteins, to define their function, patterns of expression and the means to produce them in quantity for subsequent functional analyses. Whole genome and expressed sequence tag programmes, and the parallel development of powerful bioinformatics tools, allow the execution of genome-wide between stage or species comparisons and meaningful gene-expression profiling. The latter can be undertaken with several new technologies such as DNA microarray and serial analysis of gene expression. Proteome analysis has come to the fore in recent years providing a crucial link between the gene and its protein product. RNA interference and ballistic gene transfer are exciting developments which can provide the means to precisely define the function of individual genes and, of importance in devising novel parasite control strategies, the effect that gene knockdown will have on parasite survival.
COMAN: a web server for comprehensive metatranscriptomics analysis.
Ni, Yueqiong; Li, Jun; Panagiotou, Gianni
2016-08-11
Microbiota-oriented studies based on metagenomic or metatranscriptomic sequencing have revolutionised our understanding of microbial ecology and the roles of both clinical and environmental microbes. The analysis of massive metatranscriptomic data requires extensive computational resources, a collection of bioinformatics tools and expertise in programming. We developed COMAN (Comprehensive Metatranscriptomics Analysis), a web-based tool dedicated to automatically and comprehensively analysing metatranscriptomic data. The COMAN pipeline includes quality control of raw reads and removal of reads derived from non-coding RNA, followed by functional annotation, comparative statistical analysis, pathway enrichment analysis, co-expression network analysis and high-quality visualisation. The essential data generated by COMAN are also provided in tabular format for additional analysis and integration with other software. The web server has an easy-to-use interface and detailed instructions, and is freely available at http://sbb.hku.hk/COMAN/. Conclusions: COMAN is an integrated web server dedicated to comprehensive functional analysis of metatranscriptomic data, translating massive amounts of reads into data tables and high-standard figures. It is expected to help researchers with less expertise in bioinformatics answer microbiota-related biological questions and to improve the accessibility and interpretation of microbiota RNA-Seq data.
Chance, Mark R.; Chang, Jinsook; Liu, Shuqing; Gokulrangan, Giridharan; Chen, Daniel H.-C.; Lindsay, Aaron; Geng, Ruishuang; Zheng, Qing Y.; Alagramam, Kumar
2010-01-01
Proteins and protein networks associated with cochlear pathogenesis in the Ames waltzer (av) mouse, a model for deafness in Usher syndrome 1F (USH1F), were identified. Cochlear protein from wild-type and av mice at postnatal day 30, a time point in which cochlear pathology is well established, was analyzed by quantitative 2D gel electrophoresis followed by mass spectrometry (MS). The analytic gel resolved 2270 spots; 69 spots showed significant changes in intensity in the av cochlea compared with the control. The cochlin protein was identified in 20 peptide spots, most of which were up-regulated, while a few were down-regulated. Analysis of MS sequence data showed that, in the av cochlea, a set of full-length isoforms of cochlin was up-regulated, while isoforms missing the N-terminal FCH/LCCL domain were down-regulated. Protein interaction network analysis of all differentially expressed proteins was performed with Metacore software. That analysis revealed a number of statistically significant candidate protein networks predicted to be altered in the affected cochlea. Quantitative PCR (qPCR) analysis of select candidates from the proteomic and bioinformatic investigations showed up-regulation of Coch mRNA and those of p53, Brn3a and Nrf2, transcription factors linked to stress response and survival. Increased mRNA of Brn3a and Nrf2 has previously been associated with increased expression of cochlin in human glaucomatous trabecular meshwork. Our report strongly suggests that increased level of cochlin is an important etiologic factor leading to the degeneration of cochlear neuroepithelia in the USH1F model. PMID:20097680
Szabo, R; Samson, A L; Lawrence, D A; Medcalf, R L; Bugge, T H
2016-08-01
Essentials C57BL/6J-tissue plasminogen activator (tPA)-deficient mice are widely used to study tPA function. Congenic C57BL/6J-tPA-deficient mice harbor large 129-derived chromosomal segments. The 129-derived chromosomal segments contain gene mutations that may confound data interpretation. Passenger mutation-free isogenic tPA-deficient mice were generated for study of tPA function. Background The ability to generate defined null mutations in mice revolutionized the analysis of gene function in mammals. However, gene-deficient mice generated by using 129-derived embryonic stem cells may carry large segments of 129 DNA, even when extensively backcrossed to reference strains, such as C57BL/6J, and this may confound interpretation of experiments performed in these mice. Tissue plasminogen activator (tPA), encoded by the PLAT gene, is a fibrinolytic serine protease that is widely expressed in the brain. A number of neurological abnormalities have been reported in tPA-deficient mice. Objectives To study genetic contamination of tPA-deficient mice. Materials and methods Whole genome expression array analysis, RNAseq expression profiling, low- and high-density single nucleotide polymorphism (SNP) analysis, bioinformatics and genome editing were used to analyze gene expression in tPA-deficient mouse brains. Results and conclusions Genes differentially expressed in the brain of Plat(-/-) mice from two independent colonies highly backcrossed onto the C57BL/6J strain clustered near Plat on chromosome 8. SNP analysis attributed this anomaly to about 20 Mbp of DNA flanking Plat being of 129 origin in both strains. Bioinformatic analysis of these 129-derived chromosomal segments identified a significant number of mutations in genes co-segregating with the targeted Plat allele, including several potential null mutations. 
Using zinc finger nuclease technology, we generated novel 'passenger mutation'-free isogenic C57BL/6J-Plat(-/-) and FVB/NJ-Plat(-/-) mouse strains by introducing an 11 bp deletion into the exon encoding the signal peptide. These novel mouse strains will be a useful community resource for further exploration of tPA function in physiological and pathological processes. © 2016 International Society on Thrombosis and Haemostasis.
Li, Shicheng; Sun, Xiao; Miao, Shuncheng; Liu, Jia; Jiao, Wenjie
2017-11-01
Cigarette smoking is one of the greatest preventable risk factors for developing cancer, and most cases of lung squamous cell carcinoma (lung SCC) are associated with smoking. The pathogenic mechanism of tumor progression is unclear. This study aimed to identify biomarkers in smoking-related lung cancer, including protein-coding genes, long noncoding RNAs, and transcription factors. We obtained messenger RNA microarray datasets and clinical data from the Gene Expression Omnibus database to identify gene expression altered by cigarette smoking. Integrated bioinformatic analysis was used to clarify the biological functions of the identified genes, including Gene Ontology (GO) and Kyoto Encyclopedia of Genes and Genomes (KEGG) pathway analysis, construction of a protein-protein interaction network, transcription factor prediction, and statistical analyses. Quantitative real-time PCR was subsequently used to verify these bioinformatic analyses. Five hundred and ninety-eight differentially expressed genes and 21 long noncoding RNAs were identified in smoking-related lung SCC. GO and KEGG pathway analysis showed that the identified genes were enriched in cancer-related functions and pathways. The protein-protein interaction network revealed seven hub genes in lung SCC. Several transcription factors and their binding sites were predicted. The results of real-time quantitative PCR revealed that AURKA and BIRC5 were significantly upregulated and LINC00094 was downregulated in the tumor tissues of smoking patients. Further statistical analysis indicated that dysregulation of AURKA, BIRC5, and LINC00094 was associated with poor prognosis in lung SCC. The protein-coding genes AURKA and BIRC5 and the long noncoding RNA LINC00094 could be biomarkers or therapeutic targets for smoking-related lung SCC. © 2017 The Authors. Thoracic Cancer published by China Lung Oncology Group and John Wiley & Sons Australia, Ltd.
Linking Advanced Visualization and MATLAB for the Analysis of 3D Gene Expression Data
DOE Office of Scientific and Technical Information (OSTI.GOV)
Ruebel, Oliver; Keranen, Soile V.E.; Biggin, Mark
Three-dimensional gene expression PointCloud data generated by the Berkeley Drosophila Transcription Network Project (BDTNP) provides quantitative information about the spatial and temporal expression of genes in early Drosophila embryos at cellular resolution. The BDTNP team visualizes and analyzes PointCloud data using the software application PointCloudXplore (PCX). To maximize the impact of novel, complex data sets, such as PointClouds, the data needs to be accessible to biologists and comprehensible to developers of analysis functions. We address this challenge by linking PCX and Matlab via a dedicated interface, thereby providing biologists seamless access to advanced data analysis functions and giving bioinformatics researchers the opportunity to integrate their analysis directly into the visualization application. To demonstrate the usefulness of this approach, we computationally model parts of the expression pattern of the gene even skipped using a genetic algorithm implemented in Matlab and integrated into PCX via our Matlab interface.
Overcoming confounded controls in the analysis of gene expression data from microarray experiments.
Bhattacharya, Soumyaroop; Long, Dang; Lyons-Weiler, James
2003-01-01
A potential limitation of data from microarray experiments exists when improper control samples are used. In cancer research, comparing tumour expression profiles with those from normal samples is challenging due to tissue heterogeneity (mixed cell populations). A specific example exists in a published colon cancer dataset, in which tissue heterogeneity was reported among the normal samples. In this paper, we show how to overcome or avoid the problem of using normal samples that do not derive from the same tissue of origin as the tumour. We advocate an exploratory unsupervised bootstrap analysis that can reveal unexpected and undesired, but strongly supported, clusters of samples that reflect tissue differences instead of tumour versus normal differences. All of the algorithms used in the analysis, including the maximum difference subset algorithm, unsupervised bootstrap analysis, the pooled variance t-test for finding differentially expressed genes and the jackknife to reduce false positives, are incorporated into our online Gene Expression Data Analyzer (http://bioinformatics.upmc.edu/GE2/GEDA.html).
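As a rough illustration of two of the ingredients named in this abstract, the following minimal Python sketch computes a pooled variance t statistic for one gene and then repeats it with each sample left out in turn. The abstract does not say how GEDA applies the jackknife, so the leave-one-out reading here, and the function names `pooled_t` and `jackknife_min_abs_t`, are assumptions made for illustration only.

```python
import math
from statistics import mean, variance


def pooled_t(a, b):
    """Two-sample t statistic using a pooled (equal-variance) estimate."""
    na, nb = len(a), len(b)
    diff = mean(a) - mean(b)
    # Pooled variance: weighted average of the two sample variances.
    sp2 = ((na - 1) * variance(a) + (nb - 1) * variance(b)) / (na + nb - 2)
    if sp2 == 0:
        # No within-group spread: the statistic is zero or unbounded.
        return 0.0 if diff == 0 else math.copysign(math.inf, diff)
    return diff / math.sqrt(sp2 * (1 / na + 1 / nb))


def jackknife_min_abs_t(a, b):
    """Smallest |t| obtained when each sample is left out in turn.

    Assumed use: a gene whose apparent difference is driven by a single
    sample gets a small value here. Each group needs at least 3 samples.
    """
    stats = []
    for i in range(len(a)):
        stats.append(abs(pooled_t(a[:i] + a[i + 1:], b)))
    for j in range(len(b)):
        stats.append(abs(pooled_t(a, b[:j] + b[j + 1:])))
    return min(stats)


# Hypothetical expression values for one gene (tumour vs normal).
tumour = [1.0, 1.1, 0.9, 9.0]   # one outlying sample
normal = [1.0, 1.05, 0.95, 1.0]
print(pooled_t(tumour, normal), jackknife_min_abs_t(tumour, normal))
```

In this made-up example the difference between groups vanishes once the outlying tumour sample is dropped, which is the kind of single-sample artefact a jackknife filter is meant to expose.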
Detecting circular RNAs: bioinformatic and experimental challenges
Szabo, Linda; Salzman, Julia
2017-01-01
The pervasive expression of circular RNAs (circRNAs) is a recently discovered feature of gene expression in highly diverged eukaryotes. Numerous algorithms that are used to detect genome-wide circRNA expression from RNA sequencing (RNA-seq) data have been developed in the past few years, but there is little overlap in their predictions and no clear gold-standard method to assess the accuracy of these algorithms. We review sources of experimental and bioinformatic biases that complicate the accurate discovery of circRNAs and discuss statistical approaches to address these biases. We conclude with a discussion of the current experimental progress on the topic. PMID:27739534
Quantum Bio-Informatics II From Quantum Information to Bio-Informatics
NASA Astrophysics Data System (ADS)
Accardi, L.; Freudenberg, Wolfgang; Ohya, Masanori
2009-02-01
The problem of quantum-like representation in economy, cognitive science, and genetics / L. Accardi, A. Khrennikov and M. Ohya -- Chaotic behavior observed in linear dynamics / M. Asano, T. Yamamoto and Y. Togawa -- Complete m-level quantum teleportation based on Kossakowski-Ohya scheme / M. Asano, M. Ohya and Y. Tanaka -- Towards quantum cybernetics: optimal feedback control in quantum bio informatics / V. P. Belavkin -- Quantum entanglement and circulant states / D. Chruściński -- The compound Fock space and its application in brain models / K. -H. Fichtner and W. Freudenberg -- Characterisation of beam splitters / L. Fichtner and M. Gäbler -- Application of entropic chaos degree to a combined quantum baker's map / K. Inoue, M. Ohya and I. V. Volovich -- On quantum algorithm for multiple alignment of amino acid sequences / S. Iriyama and M. Ohya -- Quantum-like models for decision making in psychology and cognitive science / A. Khrennikov -- On completely positive non-Markovian evolution of a d-level system / A. Kossakowski and R. Rebolledo -- Measures of entanglement - a Hilbert space approach / W. A. Majewski -- Some characterizations of PPT states and their relation / T. Matsuoka -- On the dynamics of entanglement and characterization of entangling properties of quantum evolutions / M. Michalski -- Perspective from micro-macro duality - towards non-perturbative renormalization scheme / I. Ojima -- A simple symmetric algorithm using a likeness with Introns behavior in RNA sequences / M. Regoli -- Some aspects of quadratic generalized white noise functionals / Si Si and T. Hida -- Analysis of several social mobility data using measure of departure from symmetry / K. Tahata ... [et al.] -- Time in physics and life science / I. V. Volovich -- Note on entropies in quantum processes / N. Watanabe -- Basics of molecular simulation and its application to biomolecules / T. Ando and I. Yamato -- Theory of proton-induced superionic conduction in hydrogen-bonded systems / H. Kamimura -- Massive collection of full-length complementary DNA clones and microarray analyses: keys to rice transcriptome analysis / S. Kikuchi -- Changes of influenza A(H5) viruses by means of entropic chaos degree / K. Sato and M. Ohya -- Basics of genome sequence analysis in bioinformatics - its fundamental ideas and problems / T. Suzuki and S. Miyazaki -- A basic introduction to gene expression studies using microarray expression data analysis / D. Wanke and J. Kilian -- Integrating biological perspectives: a quantum leap for microarray expression analysis / D. Wanke ... [et al.].
Gardiner, Erin J; Cairns, Murray J; Liu, Bing; Beveridge, Natalie J; Carr, Vaughan; Kelly, Brian; Scott, Rodney J; Tooney, Paul A
2013-04-01
Peripheral blood mononuclear cells (PBMCs) represent an accessible tissue source for gene expression profiling in schizophrenia that could provide insight into the molecular basis of the disorder. This study used the Illumina HT_12 microarray platform and quantitative real time PCR (QPCR) to perform mRNA expression profiling on 114 patients with schizophrenia or schizoaffective disorder and 80 non-psychiatric controls from the Australian Schizophrenia Research Bank (ASRB). Differential expression analysis revealed altered expression of 164 genes (59 up-regulated and 105 down-regulated) in the PBMCs from patients with schizophrenia compared to controls. Bioinformatic analysis indicated significant enrichment of differentially expressed genes known to be involved or associated with immune function and regulating the immune response. The differential expression of 6 genes, EIF2C2 (Ago 2), MEF2D, EVL, PI3, S100A12 and DEFA4 was confirmed by QPCR. Genome-wide expression analysis of PBMCs from individuals with schizophrenia was characterized by the alteration of genes with immune system function, supporting the hypothesis that the disorder has a significant immunological component in its etiology. Copyright © 2012 Elsevier Ltd. All rights reserved.
What is bioinformatics? A proposed definition and overview of the field.
Luscombe, N M; Greenbaum, D; Gerstein, M
2001-01-01
The recent flood of data from genome sequences and functional genomics has given rise to a new field, bioinformatics, which combines elements of biology and computer science. Here we propose a definition for this new field and review some of the research that is being pursued, particularly in relation to transcriptional regulatory systems. Our definition is as follows: Bioinformatics is conceptualizing biology in terms of macromolecules (in the sense of physical-chemistry) and then applying "informatics" techniques (derived from disciplines such as applied maths, computer science, and statistics) to understand and organize the information associated with these molecules, on a large scale. Analyses in bioinformatics predominantly focus on three types of large datasets available in molecular biology: macromolecular structures, genome sequences, and the results of functional genomics experiments (e.g. expression data). Additional information includes the text of scientific papers and "relationship data" from metabolic pathways, taxonomy trees, and protein-protein interaction networks. Bioinformatics employs a wide range of computational techniques including sequence and structural alignment, database design and data mining, macromolecular geometry, phylogenetic tree construction, prediction of protein structure and function, gene finding, and expression data clustering. The emphasis is on approaches integrating a variety of computational methods and heterogeneous data sources. Finally, bioinformatics is a practical discipline. We survey some representative applications, such as finding homologues, designing drugs, and performing large-scale censuses. Additional information pertinent to the review is available over the web at http://bioinfo.mbb.yale.edu/what-is-it.
Jiang, Zhiquan; Gui, Songbo; Zhang, Yazhuo
2010-09-01
Growth-hormone-secreting pituitary adenomas (GHomas) account for approximately 20% of all pituitary neoplasms. However, the pathogenesis of GHomas remains to be elucidated. To explore the possible pathogenesis of GHomas, we used bead-based fiber-optic arrays to examine the gene expression in five GHomas and compared them to three healthy pituitaries. Four differentially expressed genes were chosen randomly for validation by quantitative real-time reverse transcription-polymerase chain reaction. We then performed pathway analysis on the identified differentially expressed genes using the Kyoto Encyclopedia of Genes and Genomes. Array analysis showed significant increases in the expression of 353 genes and 206 expressed sequence tags (ESTs) and decreases in 565 genes and 29 ESTs. Bioinformatic analysis showed that the genes HIGD1B, HOXB2, ANGPT2, HPGD and BTG2 may play an important role in the tumorigenesis and progression of GHomas. Pathway analysis showed that the wingless-type signaling pathway and extracellular-matrix receptor interactions may play a key role in the tumorigenesis and progression of GHomas. Our data suggested that there are numerous aberrantly expressed genes and pathways involved in the pathogenesis of GHomas. Bead-based fiber-optic arrays combined with pathway analysis of differentially expressed genes appear to be a valid method for investigating the pathogenesis of tumors.
JIANG, ZHIQUAN; GUI, SONGBO; ZHANG, YAZHUO
2010-01-01
Growth-hormone-secreting pituitary adenomas (GHomas) account for approximately 20% of all pituitary neoplasms. However, the pathogenesis of GHomas remains to be elucidated. To explore the possible pathogenesis of GHomas, we used bead-based fiber-optic arrays to examine the gene expression in five GHomas and compared them to three healthy pituitaries. Four differentially expressed genes were chosen randomly for validation by quantitative real-time reverse transcription-polymerase chain reaction. We then performed pathway analysis on the identified differentially expressed genes using the Kyoto Encyclopedia of Genes and Genomes. Array analysis showed significant increases in the expression of 353 genes and 206 expressed sequence tags (ESTs) and decreases in 565 genes and 29 ESTs. Bioinformatic analysis showed that the genes HIGD1B, HOXB2, ANGPT2, HPGD and BTG2 may play an important role in the tumorigenesis and progression of GHomas. Pathway analysis showed that the wingless-type signaling pathway and extracellular-matrix receptor interactions may play a key role in the tumorigenesis and progression of GHomas. Our data suggested that there are numerous aberrantly expressed genes and pathways involved in the pathogenesis of GHomas. Bead-based fiber-optic arrays combined with pathway analysis of differentially expressed genes appear to be a valid method for investigating the pathogenesis of tumors. PMID:22993617
Goff, Loyal A.; Boucher, Shayne; Ricupero, Christopher L.; Fenstermacher, Sara; Swerdel, Mavis; Chase, Lucas; Adams, Christopher; Chesnut, Jonathan; Lakshmipathy, Uma; Hart, Ronald P.
2009-01-01
Objective Human multipotent mesenchymal stromal cells (MSC) have the potential to differentiate into multiple cell types, although little is known about factors that control their fate. Differentiation-specific microRNAs may play a key role in stem cell self renewal and differentiation. We propose that specific intracellular signalling pathways modulate gene expression during differentiation by regulating microRNA expression. Methods Illumina mRNA and NCode microRNA expression analyses were performed on MSC and their differentiated progeny. A combination of bioinformatic prediction and pathway inhibition was used to identify microRNAs associated with PDGF signalling. Results The pattern of microRNA expression in MSC is distinct from that in pluripotent stem cells such as human embryonic stem cells. Specific populations of microRNAs are regulated in MSC during differentiation targeted towards specific cell types. Complementary mRNA expression analysis increases the pool of markers characteristic of MSC or differentiated progeny. To identify microRNA expression patterns affected by signalling pathways, we examined the PDGF pathway found to be regulated during osteogenesis by microarray studies. A set of microRNAs bioinformatically predicted to respond to PDGF signalling was experimentally confirmed by direct PDGF inhibition. Conclusion Our results demonstrate that a subset of microRNAs regulated during osteogenic differentiation of MSCs is responsive to perturbation of the PDGF pathway. This approach not only identifies characteristic classes of differentiation-specific mRNAs and microRNAs, but begins to link regulated molecules with specific cellular pathways. PMID:18657893
LXtoo: an integrated live Linux distribution for the bioinformatics community
2012-01-01
Background Recent advances in high-throughput technologies dramatically increase biological data generation. However, many research groups lack computing facilities and specialists. This is an obstacle that remains to be addressed. Here, we present a Linux distribution, LXtoo, to provide a flexible computing platform for bioinformatics analysis. Findings Unlike most of the existing live Linux distributions for bioinformatics limiting their usage to sequence analysis and protein structure prediction, LXtoo incorporates a comprehensive collection of bioinformatics software, including data mining tools for microarray and proteomics, protein-protein interaction analysis, and computationally complex tasks like molecular dynamics. Moreover, most of the programs have been configured and optimized for high performance computing. Conclusions LXtoo aims to provide well-supported computing environment tailored for bioinformatics research, reducing duplication of efforts in building computing infrastructure. LXtoo is distributed as a Live DVD and freely available at http://bioinformatics.jnu.edu.cn/LXtoo. PMID:22813356
LXtoo: an integrated live Linux distribution for the bioinformatics community.
Yu, Guangchuang; Wang, Li-Gen; Meng, Xiao-Hua; He, Qing-Yu
2012-07-19
Recent advances in high-throughput technologies dramatically increase biological data generation. However, many research groups lack computing facilities and specialists. This is an obstacle that remains to be addressed. Here, we present a Linux distribution, LXtoo, to provide a flexible computing platform for bioinformatics analysis. Unlike most of the existing live Linux distributions for bioinformatics limiting their usage to sequence analysis and protein structure prediction, LXtoo incorporates a comprehensive collection of bioinformatics software, including data mining tools for microarray and proteomics, protein-protein interaction analysis, and computationally complex tasks like molecular dynamics. Moreover, most of the programs have been configured and optimized for high performance computing. LXtoo aims to provide well-supported computing environment tailored for bioinformatics research, reducing duplication of efforts in building computing infrastructure. LXtoo is distributed as a Live DVD and freely available at http://bioinformatics.jnu.edu.cn/LXtoo.
Bioinformatics on the cloud computing platform Azure.
Shanahan, Hugh P; Owen, Anne M; Harrison, Andrew P
2014-01-01
We discuss the applicability of the Microsoft cloud computing platform, Azure, for bioinformatics. We focus on the usability of the resource rather than its performance. We provide an example of how R can be used on Azure to analyse a large amount of microarray expression data deposited at the public database ArrayExpress. We provide a walk through to demonstrate explicitly how Azure can be used to perform these analyses in Appendix S1 and we offer a comparison with a local computation. We note that the use of the Platform as a Service (PaaS) offering of Azure can represent a steep learning curve for bioinformatics developers who will usually have a Linux and scripting language background. On the other hand, the presence of an additional set of libraries makes it easier to deploy software in a parallel (scalable) fashion and explicitly manage such a production run with only a few hundred lines of code, most of which can be incorporated from a template. We propose that this environment is best suited for running stable bioinformatics software by users not involved with its development.
Bioinformatics on the Cloud Computing Platform Azure
Shanahan, Hugh P.; Owen, Anne M.; Harrison, Andrew P.
2014-01-01
We discuss the applicability of the Microsoft cloud computing platform, Azure, for bioinformatics. We focus on the usability of the resource rather than its performance. We provide an example of how R can be used on Azure to analyse a large amount of microarray expression data deposited at the public database ArrayExpress. We provide a walk through to demonstrate explicitly how Azure can be used to perform these analyses in Appendix S1 and we offer a comparison with a local computation. We note that the use of the Platform as a Service (PaaS) offering of Azure can represent a steep learning curve for bioinformatics developers who will usually have a Linux and scripting language background. On the other hand, the presence of an additional set of libraries makes it easier to deploy software in a parallel (scalable) fashion and explicitly manage such a production run with only a few hundred lines of code, most of which can be incorporated from a template. We propose that this environment is best suited for running stable bioinformatics software by users not involved with its development. PMID:25050811
Kang, Jonghoon; Park, Seyeon; Venkat, Aarya; Gopinath, Adarsh
2015-12-01
New interdisciplinary biological sciences like bioinformatics, biophysics, and systems biology have become increasingly relevant in modern science. Many papers have suggested the importance of adding these subjects, particularly bioinformatics, to an undergraduate curriculum; however, most of their assertions have relied on qualitative arguments. In this paper, we will show our metadata analysis of a scientific literature database (PubMed) that quantitatively describes the importance of the subjects of bioinformatics, systems biology, and biophysics as compared with a well-established interdisciplinary subject, biochemistry. Specifically, we found that the development of each subject assessed by its publication volume was well described by a set of simple nonlinear equations, allowing us to characterize them quantitatively. Bioinformatics, which had the highest ratio of publications produced, was predicted to grow between 77% and 93% by 2025 according to the model. Due to the large number of publications produced in bioinformatics, which nearly matches the number published in biochemistry, it can be inferred that bioinformatics is almost equal in significance to biochemistry. Based on our analysis, we suggest that bioinformatics be added to the standard biology undergraduate curriculum. Adding this course to an undergraduate curriculum will better prepare students for future research in biology.
Cloning and expression analysis of FaPR-1 gene in strawberry
NASA Astrophysics Data System (ADS)
Mo, Fan; Luo, Ya; Ge, Cong; Mo, Qin; Ling, Yajie; Luo, Shu; Tang, Haoru
2018-04-01
The FaPR-1 gene was cloned by RT-PCR from 'Benihoppe' strawberry and subjected to bioinformatics analysis. The open reading frame was 483 bp, encoding 160 amino acids; the predicted protein had a molecular weight of 17854.17 and a theoretical isoelectric point of 8.72. Subcellular localization prediction indicated that the protein is located extracellularly. Homology comparison and phylogenetic tree construction with pathogenesis-related proteins from other plants showed that FaPR-1 is most closely related to those of grape and peach. Under treatment with ABA, sucrose, or a mixture of the two, the expression of FaPR-1 in strawberry fruit increased significantly.
Wang, Jianpeng; Wang, Dong; Wan, Dehong; Ma, Qingxia; Liu, Qian; Li, Jiye; Li, Zhaojian; Gao, Yang; Jiang, Guohui; Ma, Leina; Liu, Jia; Li, Chuzhong
2018-06-14
The invasion and recurrence of clinical nonfunctioning pituitary adenomas (NFA) often lead to surgical treatment failure. Circular RNAs (circRNAs) are a novel class of RNAs whose 3' and 5' ends are joined together and have been shown to play important roles in cancer development. Up to now, the roles of circRNAs in invasive and recurrent NFA have remained unclear. We detected and summarized the circRNA expression pattern in 75 NFA tissues from 10 non-invasive cases and 65 invasive cases, and in 9 pairs of NFA tumor tissues from 9 recurrent cases, using circRNA microarrays. Functional enrichment analysis and pathway analysis were then performed, and a circRNA-microRNA (miRNA) network was generated with bioinformatic analysis tools. Five new invasive NFA samples and 5 non-invasive NFA samples were collected to validate the microarray results. In total, 570 dysregulated circRNAs (invasive tumor vs. non-invasive tumor) and 10 up-regulated circRNAs (recurrent tumor tissue vs. first-surgery tumor tissue) were identified based on the selection criteria (FC>2, P<0.05). The parental genes of the circRNAs dysregulated between invasive and non-invasive tumors were enriched in cell adhesion signaling pathways such as Focal adhesion, Hippo signaling pathway, PI3K-Akt signaling pathway, and Adherens junction. The circRNA-miRNA network showed that the dysregulated circRNAs may function as miRNA sponges. This is the first study to generate and comprehensively analyze the circRNA expression profile in invasive and recurrent NFA. Our findings provide evidence for the significance of circRNAs in NFA diagnosis, prognosis and clinical treatment. Copyright © 2018 Elsevier Inc. All rights reserved.
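The selection rule quoted in this abstract (FC>2, P<0.05) can be illustrated with a minimal Python sketch. It assumes that "dysregulated" covers changes in either direction, so a fold change below 1/2 also passes; the abstract does not state this explicitly. The record names and values are hypothetical and are not taken from the study.

```python
import math

def is_dysregulated(fold_change, p_value, fc_cutoff=2.0, p_cutoff=0.05):
    """Return True if the change exceeds the cutoff in either direction
    and the P value is below the significance cutoff (assumed two-sided rule)."""
    if fold_change <= 0:
        raise ValueError("fold change must be positive")
    return (abs(math.log2(fold_change)) > math.log2(fc_cutoff)
            and p_value < p_cutoff)

# Hypothetical records: (identifier, fold change invasive/non-invasive, P value)
records = [
    ("circ_A", 3.1, 0.010),   # up-regulated and significant
    ("circ_B", 0.4, 0.030),   # down-regulated and significant
    ("circ_C", 1.5, 0.001),   # significant, but the change is too small
    ("circ_D", 2.6, 0.200),   # large change, but not significant
]

selected = [name for name, fc, p in records if is_dysregulated(fc, p)]
print(selected)
```

Comparing on the log2 scale treats a 2-fold increase and a 2-fold decrease symmetrically, which is the usual reason for working with log fold changes.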
Lin, Huapeng; Zhang, Qian; Li, Xiaocheng; Wu, Yushen; Liu, Ye; Hu, Yingchun
2018-01-01
Hepatitis B virus-associated acute liver failure (HBV-ALF) is a rare but life-threatening syndrome that carries high morbidity and mortality. Our study aimed to explore the possible molecular mechanisms of HBV-ALF by means of bioinformatics analysis. Gene expression microarray datasets of HBV-ALF were collected from Gene Expression Omnibus, and differentially expressed genes (DEGs) were identified with the limma package in R. After functional enrichment analysis, we constructed a protein–protein interaction (PPI) network with the Search Tool for the Retrieval of Interacting Genes online database and a weighted gene coexpression network with the WGCNA package in R. Subsequently, we picked out the hub genes among the DEGs. A total of 423 DEGs, comprising 198 upregulated and 225 downregulated genes, were identified between HBV-ALF and normal samples. The upregulated genes were mainly enriched in immune response, and the downregulated genes were mainly enriched in complement and coagulation cascades. Orosomucoid 1 (ORM1), orosomucoid 2 (ORM2), plasminogen (PLG), and aldehyde oxidase 1 (AOX1) were picked out as hub genes with a high degree in both the PPI network and the weighted gene coexpression network. The weighted gene coexpression network analysis found that 3 of the 5 modules in which upregulated genes were enriched were closely related to the immune system. The downregulated genes were enriched in only one module, and the genes in this module were mainly enriched in the complement and coagulation cascades pathway. In conclusion, 4 genes (ORM1, ORM2, PLG, and AOX1), together with the immune response and the complement and coagulation cascades pathway, may take part in the pathogenesis of HBV-ALF, and these candidate genes and pathways could be therapeutic targets for HBV-ALF. PMID:29384847
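The idea of a hub gene as one with a high degree in both the PPI network and the coexpression network can be sketched as follows. This is an illustrative outline only: the top-k cutoff, the tie-breaking by name, and the gene labels G1 to G5 are assumptions made for the example, and the abstract does not describe the authors' exact selection procedure.

```python
from collections import Counter

def degrees(edges):
    """Count how many edges touch each node in an undirected edge list."""
    deg = Counter()
    for a, b in edges:
        deg[a] += 1
        deg[b] += 1
    return deg

def top_nodes(edges, k):
    """Return the k highest-degree nodes (ties broken alphabetically)."""
    deg = degrees(edges)
    ranked = sorted(deg, key=lambda node: (-deg[node], node))
    return set(ranked[:k])

# Hypothetical edge lists for two networks built over the same DEGs
ppi_edges = [("G1", "G2"), ("G1", "G3"), ("G1", "G4"), ("G2", "G3"), ("G3", "G5")]
coexp_edges = [("G3", "G5"), ("G3", "G4"), ("G5", "G2"), ("G5", "G1"), ("G3", "G1")]

# A hub candidate must rank highly in both networks
hub_genes = sorted(top_nodes(ppi_edges, 2) & top_nodes(coexp_edges, 2))
print(hub_genes)
```

Intersecting the two rankings is one simple way to require support from both kinds of evidence; a weighted or rank-sum combination would be an equally plausible alternative.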
Seahawk: moving beyond HTML in Web-based bioinformatics analysis.
Gordon, Paul M K; Sensen, Christoph W
2007-06-18
Traditional HTML interfaces for input to and output from Bioinformatics analysis on the Web are highly variable in style, content and data formats. Combining multiple analyses can therefore be an onerous task for biologists. Semantic Web Services allow automated discovery of conceptual links between remote data analysis servers. A shared data ontology and service discovery/execution framework is particularly attractive in Bioinformatics, where data and services are often both disparate and distributed. Instead of biologists copying, pasting and reformatting data between various Web sites, Semantic Web Service protocols such as MOBY-S hold out the promise of seamlessly integrating multi-step analysis. We have developed a program (Seahawk) that allows biologists to intuitively and seamlessly chain together Web Services using a data-centric, rather than the customary service-centric approach. The approach is illustrated with a ferredoxin mutation analysis. Seahawk concentrates on lowering entry barriers for biologists: no prior knowledge of the data ontology, or relevant services is required. In stark contrast to other MOBY-S clients, in Seahawk users simply load Web pages and text files they already work with. Underlying the familiar Web-browser interaction is an XML data engine based on extensible XSLT style sheets, regular expressions, and XPath statements which import existing user data into the MOBY-S format. As an easily accessible applet, Seahawk moves beyond standard Web browser interaction, providing mechanisms for the biologist to concentrate on the analytical task rather than on the technical details of data formats and Web forms. As the MOBY-S protocol nears a 1.0 specification, we expect more biologists to adopt these new semantic-oriented ways of doing Web-based analysis, which empower them to do more complicated, ad hoc analysis workflow creation without the assistance of a programmer.
Seahawk: moving beyond HTML in Web-based bioinformatics analysis
Gordon, Paul MK; Sensen, Christoph W
2007-01-01
Background Traditional HTML interfaces for input to and output from Bioinformatics analysis on the Web are highly variable in style, content and data formats. Combining multiple analyses can therefore be an onerous task for biologists. Semantic Web Services allow automated discovery of conceptual links between remote data analysis servers. A shared data ontology and service discovery/execution framework is particularly attractive in Bioinformatics, where data and services are often both disparate and distributed. Instead of biologists copying, pasting and reformatting data between various Web sites, Semantic Web Service protocols such as MOBY-S hold out the promise of seamlessly integrating multi-step analysis. Results We have developed a program (Seahawk) that allows biologists to intuitively and seamlessly chain together Web Services using a data-centric, rather than the customary service-centric approach. The approach is illustrated with a ferredoxin mutation analysis. Seahawk concentrates on lowering entry barriers for biologists: no prior knowledge of the data ontology, or relevant services is required. In stark contrast to other MOBY-S clients, in Seahawk users simply load Web pages and text files they already work with. Underlying the familiar Web-browser interaction is an XML data engine based on extensible XSLT style sheets, regular expressions, and XPath statements which import existing user data into the MOBY-S format. Conclusion As an easily accessible applet, Seahawk moves beyond standard Web browser interaction, providing mechanisms for the biologist to concentrate on the analytical task rather than on the technical details of data formats and Web forms. As the MOBY-S protocol nears a 1.0 specification, we expect more biologists to adopt these new semantic-oriented ways of doing Web-based analysis, which empower them to do more complicated, ad hoc analysis workflow creation without the assistance of a programmer. PMID:17577405
Wren, Jonathan D
2016-09-01
To analyze the relative proportion of bioinformatics papers and their non-bioinformatics counterparts in the top 20 most cited papers annually for the past two decades. When defining bioinformatics papers as encompassing both those that provide software for data analysis or methods underlying data analysis software, we find that over the past two decades, more than a third (34%) of the most cited papers in science were bioinformatics papers, which is approximately a 31-fold enrichment relative to the total number of bioinformatics papers published. More than half of the most cited papers during this span were bioinformatics papers. Yet, the average 5-year JIF of top 20 bioinformatics papers was 7.7, whereas the average JIF for top 20 non-bioinformatics papers was 25.8, significantly higher (P < 4.5 × 10(-29)). The 20-year trend in the average JIF between the two groups suggests the gap does not appear to be significantly narrowing. For a sampling of the journals producing top papers, bioinformatics journals tended to have higher Gini coefficients, suggesting that development of novel bioinformatics resources may be somewhat 'hit or miss'. That is, relative to other fields, bioinformatics produces some programs that are extremely widely adopted and cited, yet there are fewer of intermediate success. Supplementary data are available at Bioinformatics online. © The Author 2016. Published by Oxford University Press. All rights reserved.
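The Gini coefficient mentioned in this abstract measures how unevenly a quantity (here, citations per paper within a journal) is distributed: 0 means every paper is cited equally, and values near 1 mean a few papers collect almost all citations. A minimal sketch of the standard sample formula follows; the citation counts are hypothetical, and the abstract does not say which estimator the author used.

```python
def gini(values):
    """Gini coefficient of non-negative values, using the sorted-rank formula
    G = 2 * sum(i * x_i) / (n * sum(x)) - (n + 1) / n, with ranks i = 1..n."""
    xs = sorted(values)
    n = len(xs)
    total = sum(xs)
    if n == 0 or total == 0:
        raise ValueError("need at least one value and a positive total")
    if xs[0] < 0:
        raise ValueError("values must be non-negative")
    weighted = sum(rank * x for rank, x in enumerate(xs, start=1))
    return 2.0 * weighted / (n * total) - (n + 1.0) / n

# Hypothetical citation counts for papers in two journals
evenly_cited = [5, 5, 5, 5]
hit_or_miss = [0, 0, 0, 10]

print(gini(evenly_cited))
print(gini(hit_or_miss))
```

With this estimator the maximum for n items is (n - 1) / n, so the second example, where one paper holds all citations, reaches the upper bound for four papers.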
GSCALite: A Web Server for Gene Set Cancer Analysis.
Liu, Chun-Jie; Hu, Fei-Fei; Xia, Mengxuan; Han, Leng; Zhang, Qiong; Guo, An-Yuan
2018-05-22
The availability of cancer genomic data makes it possible to analyze genes related to cancer. Cancer is usually the result of a set of genes, and the signal of a single gene could be masked by background noise. Here, we present a web server named Gene Set Cancer Analysis (GSCALite) to analyze a set of genes in cancers with the following functional modules: (i) Differential expression in tumor vs normal, and the survival analysis; (ii) Genomic variations and their survival analysis; (iii) Gene expression associated cancer pathway activity; (iv) miRNA regulatory network for genes; (v) Drug sensitivity for genes; (vi) Normal tissue expression and eQTL for genes. GSCALite is a user-friendly web server for dynamic analysis and visualization of gene sets in cancer and of drug sensitivity correlation, which will be of broad utility to cancer researchers. GSCALite is available at http://bioinfo.life.hust.edu.cn/web/GSCALite/. Supplementary data are available at Bioinformatics online.
Jin, Yuan; Goodman, Richard E; Tetteh, Afua O; Lu, Mei; Tripathi, Leena
2017-11-01
Banana Xanthomonas wilt (BXW) disease threatens banana production and food security throughout East Africa. Natural resistance is lacking among common cultivars. Genetically modified (GM) bananas resistant to BXW disease were developed by inserting the hypersensitive response-assisting protein (Hrap) and/or the plant ferredoxin-like protein (Pflp) gene(s) from sweet pepper (Capsicum annuum). Several of these GM banana events showed 100% resistance to BXW disease under field conditions in Uganda. The current study evaluated the potential allergenicity and toxicity of the expressed proteins HRAP and PFLP based on published information on the history of safe use of the natural source of the proteins, as well as established bioinformatics sequence comparisons to known allergens (www.AllergenOnline.org and NCBI Protein) and toxins (NCBI Protein). The results did not identify risks of allergy or toxicity from either the HRAP or the PFLP protein expressed in the GM bananas that might suggest potential health risks to humans. We recognize that additional tests, including stability of these proteins in a pepsin assay, nutrient analysis and possibly an acute rodent toxicity assay, may be required by national regulatory authorities. Copyright © 2017 The Authors. Published by Elsevier Ltd. All rights reserved.
Li, Xiaofang; Tian, Run; Gao, Hugh; Yan, Feng; Ying, Le; Yang, Yongkang; Yang, Pei
2018-01-01
Cervical cancer is the leading cause of death among gynecological malignancies. We aimed to explore the molecular mechanism of carcinogenesis and biomarkers for cervical cancer by integrated bioinformatic analysis. We used RNA-sequencing data of 254 cervical squamous cell carcinomas and 3 normal samples from The Cancer Genome Atlas. To explore the distinct pathways, messenger RNA expression was submitted to a Gene Set Enrichment Analysis. Kyoto Encyclopedia of Genes and Genomes and protein–protein interaction network analyses of differentially expressed genes were performed. Then, we conducted pathway enrichment analysis for the modules acquired in the protein–protein interaction analysis and obtained a list of pathways in every module. After intersecting the results from the 3 approaches, we evaluated the survival rates associated with both the shared pathways and the genes in those pathways, and 5 survival-related genes were obtained. Finally, Cox hazard ratio analysis of these 5 genes was performed. The DNA replication pathway (P < .001; 12 genes included) was suggested to have the strongest association with the prognosis of cervical squamous cancer. In total, 5 of the 12 genes, namely, minichromosome maintenance 2, minichromosome maintenance 4, minichromosome maintenance 5, proliferating cell nuclear antigen, and ribonuclease H2 subunit A, were significantly correlated with survival. Minichromosome maintenance 5 was shown to be an independent prognostic biomarker for patients with cervical cancer. This study identified a distinct pathway (DNA replication) and 5 genes that may be prognostic biomarkers, with minichromosome maintenance 5 identified as an independent prognostic biomarker for patients with cervical cancer. PMID:29642758
Single Cell Gene Expression Profiling of Skeletal Muscle-Derived Cells.
Gatto, Sole; Puri, Pier Lorenzo; Malecova, Barbora
2017-01-01
Single cell gene expression profiling is a fundamental tool for studying the heterogeneity of a cell population by addressing the phenotypic and functional characteristics of each cell. Technological advances that have coupled microfluidic technologies with high-throughput quantitative RT-PCR analyses have enabled detailed analyses of single cells in various biological contexts. In this chapter, we describe the procedure for isolating the skeletal muscle interstitial cells termed Fibro-Adipogenic Progenitors (FAPs) and profiling their gene expression at the single cell level. Moreover, we accompany our bench protocol with bioinformatics analysis designed to process raw data as well as to visualize single cell gene expression data. Single cell gene expression profiling is therefore a useful tool in the investigation of FAP heterogeneity and the contribution of FAPs to muscle homeostasis.
Zhang, Limin; Jiang, Haowen; Xu, Gang; Wen, Hui; Gu, Bin; Liu, Jun; Mao, Shanghua; Na, Rong; Jing, Yan; Ding, Qiang; Zhang, Yuanfang
2015-06-01
In order to investigate two members of the EF-hand Ca2+ binding protein S100 family, S100A8 and S100A9, in renal cell carcinoma (RCC), serum samples were collected from patients with RCC, transitional cell carcinoma in the kidney, benign renal masses and normal controls. The samples were analyzed by isobaric tags for relative and absolute quantification technology to identify the differential expression of S100A8 and S100A9 in the respective groups. Hierarchical clustering analysis was then conducted for the samples and the selected genes. Cross-platform analysis for external validation was performed using The Cancer Genome Atlas database, which contains the gene/microRNA expression patterns and clinical information of patients with RCC. Immunohistochemical staining was used to verify the expression of S100A8 and S100A9 in the four groups. Serum and mRNA expression levels of S100A8 and S100A9 were found to be upregulated in patients with RCC compared with the other three groups, which was consistent with the upregulated mRNA levels in RCC tissue. The overexpression of S100A8 and S100A9 in cancer cells was also confirmed by immunohistochemistry. In addition, bioinformatics revealed that let-7, a microRNA formerly identified as an inhibiting factor of RCC, was downregulated in RCC, in contrast to S100A8. let-7 was also complementary to a sequence in the 3' untranslated region of S100A8. These findings indicate that S100A8 and S100A9 may serve as biomarkers for the detection of RCC.
Pan, Yue; Lu, Lingyun; Chen, Junquan; Zhong, Yong; Dai, Zhehao
2018-01-01
This study aimed to identify potential crucial genes and construct microRNA-mRNA negative regulatory networks in osteosarcoma by comprehensive bioinformatics analysis. Gene expression profiles (GSE28424) and miRNA expression profiles (GSE28423) were downloaded from the GEO database. The differentially expressed genes (DEGs) and miRNAs (DEMIs) were obtained with R Bioconductor packages. Functional and enrichment analyses of selected genes were performed using the DAVID database. A protein-protein interaction (PPI) network was constructed with STRING and visualized in Cytoscape. The relationships among the DEGs and the modules in the PPI network were analyzed with the plug-ins NetworkAnalyzer and MCODE, respectively. By comparing the target genes predicted by TargetScan with the DEGs, the miRNA-mRNA regulation network was established. In total, 346 DEGs and 90 DEMIs were identified. These DEGs were enriched in biological processes and KEGG pathways of inflammatory immune response. Twenty-five genes in the PPI network were selected as hub genes. The top 10 hub genes were TYROBP, HLA-DRA, VWF, PPBP, SERPING1, HLA-DPA1, SERPINA1, KIF20A, FERMT3, and HLA-E. The PPI network of DEGs followed a power-law pattern and met the characteristics of a small-world network. MCODE analysis identified 4 clusters, and the most significant cluster consisted of 11 nodes and 55 edges. SEPP1, CKS2, TCAP, and BPI were identified as the seed genes of their respective clusters. The miRNA-mRNA regulation network, composed of 89 pairs, was established. MiR-210 had the highest connectivity, with 12 target genes. Among the predicted targets of MiR-96, HLA-DPA1 and TYROBP were hub genes. Our study identified candidate differentially expressed genes and miRNAs, and microRNA-mRNA negative regulatory networks, in osteosarcoma by bioinformatics analysis, which may provide novel insights into the pathogenesis of osteosarcoma.
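The construction of a negative regulatory network, as outlined in this abstract, amounts to keeping predicted miRNA-target pairs in which both members are differentially expressed in opposite directions. The following Python sketch illustrates that logic under stated assumptions: directions are encoded as +1 (up) and -1 (down), and every miRNA name, gene name, and predicted target list is hypothetical rather than taken from TargetScan or the study.

```python
from collections import Counter

def negative_pairs(mirna_direction, gene_direction, predicted_targets):
    """Keep (miRNA, gene) pairs where both are differentially expressed
    and their directions of change are opposite."""
    pairs = []
    for mirna, targets in predicted_targets.items():
        m_dir = mirna_direction.get(mirna)
        if m_dir is None:          # miRNA is not differentially expressed
            continue
        for gene in targets:
            g_dir = gene_direction.get(gene)
            if g_dir is not None and g_dir == -m_dir:
                pairs.append((mirna, gene))
    return sorted(pairs)

# Hypothetical inputs: +1 = up-regulated, -1 = down-regulated
mirna_direction = {"miR-X": +1, "miR-Y": -1}
gene_direction = {"GENE_A": -1, "GENE_B": +1, "GENE_C": +1}
predicted_targets = {
    "miR-X": ["GENE_A", "GENE_B", "GENE_D"],
    "miR-Y": ["GENE_C", "GENE_A"],
    "miR-Z": ["GENE_B"],
}

pairs = negative_pairs(mirna_direction, gene_direction, predicted_targets)
connectivity = Counter(mirna for mirna, _ in pairs)  # targets per miRNA
print(pairs)
print(connectivity)
```

Counting pairs per miRNA, as in the last step, corresponds to the connectivity figure the abstract reports for individual miRNAs.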
Prediction of the in planta Phakopsora pachyrhizi secretome and potential effector families.
de Carvalho, Mayra C da C G; Costa Nascimento, Leandro; Darben, Luana M; Polizel-Podanosqui, Adriana M; Lopes-Caitar, Valéria S; Qi, Mingsheng; Rocha, Carolina S; Carazzolle, Marcelo Falsarella; Kuwahara, Márcia K; Pereira, Goncalo A G; Abdelnoor, Ricardo V; Whitham, Steven A; Marcelino-Guimarães, Francismar C
2017-04-01
Asian soybean rust (ASR), caused by the obligate biotrophic fungus Phakopsora pachyrhizi, can cause losses greater than 80%. Despite its economic importance, there is no soybean cultivar with durable ASR resistance. In addition, the P. pachyrhizi genome is not yet available. However, the availability of other rust genomes, as well as the development of sample enrichment strategies and bioinformatics tools, has improved our knowledge of the ASR secretome and its potential effectors. In this context, we used a combination of laser capture microdissection (LCM), RNAseq and a bioinformatics pipeline to identify a total of 36 350 P. pachyrhizi contigs expressed in planta and a predicted secretome of 851 proteins. Some of the predicted secreted proteins had characteristics of candidate effectors: small size, cysteine rich, do not contain PFAM domains (except those associated with pathogenicity) and strongly expressed in planta. A comparative analysis of the predicted secreted proteins present in Pucciniales species identified new members of soybean rust and new Pucciniales- or P. pachyrhizi-specific families (tribes). Members of some families were strongly up-regulated during early infection, starting with initial infection through haustorium formation. Effector candidates selected from two of these families were able to suppress immunity in transient assays, and were localized in the plant cytoplasm and nuclei. These experiments support our bioinformatics predictions and show that these families contain members that have functions consistent with P. pachyrhizi effectors. © 2016 BSPP AND JOHN WILEY & SONS LTD.
RNA sequencing uncovers antisense RNAs and novel small RNAs in Streptococcus pyogenes.
Le Rhun, Anaïs; Beer, Yan Yan; Reimegård, Johan; Chylinski, Krzysztof; Charpentier, Emmanuelle
2016-01-01
Streptococcus pyogenes is a human pathogen responsible for a wide spectrum of diseases ranging from mild to life-threatening infections. During the infectious process, the temporal and spatial expression of pathogenicity factors is tightly controlled by a complex network of protein and RNA regulators acting in response to various environmental signals. Here, we focus on the class of small RNA regulators (sRNAs) and present the first complete analysis of sRNA sequencing data in S. pyogenes. In the SF370 clinical isolate (M1 serotype), we identified 197 and 428 putative regulatory RNAs by visual inspection and bioinformatics screening of the sequencing data, respectively. Only 35 from the 197 candidates identified by visual screening were assigned a predicted function (T-boxes, ribosomal protein leaders, characterized riboswitches or sRNAs), indicating how little is known about sRNA regulation in S. pyogenes. By comparing our list of predicted sRNAs with previous S. pyogenes sRNA screens using bioinformatics or microarrays, 92 novel sRNAs were revealed, including antisense RNAs that are for the first time shown to be expressed in this pathogen. We experimentally validated the expression of 30 novel sRNAs and antisense RNAs. We show that the expression profile of 9 sRNAs including 2 predicted regulatory elements is affected by the endoribonucleases RNase III and/or RNase Y, highlighting the critical role of these enzymes in sRNA regulation.
Leduc, Magalie S.; Blair, Rachael Hageman; Verdugo, Ricardo A.; Tsaih, Shirng-Wern; Walsh, Kenneth; Churchill, Gary A.; Paigen, Beverly
2012-01-01
A higher incidence of coronary artery disease is associated with a lower level of HDL-cholesterol. We searched for genetic loci influencing HDL-cholesterol in F2 mice from a cross between MRL/MpJ and SM/J mice. Quantitative trait loci (QTL) mapping revealed one significant HDL QTL (Apoa2 locus), four suggestive QTL on chromosomes 10, 11, 13, and 18 and four additional QTL on chromosomes 1 proximal, 3, 4, and 7 after adjusting HDL for the strong Apoa2 locus. A novel nonsynonymous polymorphism supports Lipg as the QTL gene for the chromosome 18 QTL, and a difference in Abca1 expression in liver tissue supports it as the QTL gene for the chromosome 4 QTL. Using weighted gene co-expression network analysis, we identified a module that after adjustment for Apoa2, correlated with HDL, was genetically determined by a QTL on chromosome 11, and overlapped with the HDL QTL. A combination of bioinformatics tools and systems genetics helped identify several candidate genes for both the chromosome 11 HDL and module QTL based on differential expression between the parental strains, cis regulation of expression, and causality modeling. We conclude that integrating systems genetics to a more-traditional genetics approach improves the power of complex trait gene identification. PMID:22498810
Systems analysis of arrestin pathway functions.
Maudsley, Stuart; Siddiqui, Sana; Martin, Bronwen
2013-01-01
To fully appreciate the diversity and specificity of complex cellular signaling events, such as arrestin-mediated signaling from G protein-coupled receptor (GPCR) activation, a complex systems-level investigation currently appears to be the best option. A rational combination of transcriptomics, proteomics, and interactomics, all coherently integrated with applied next-generation bioinformatics, is vital for the future understanding of the development, translation, and expression of GPCR-mediated arrestin signaling events in physiological contexts. Through a more nuanced, systems-level appreciation of arrestin-mediated signaling, the creation of arrestin-specific molecular response "signatures" should be made simple and ultimately amenable to drug discovery processes. Arrestin-based signaling paradigms possess important aspects, such as their specific temporal kinetics and ability to strongly affect transcriptional activity, that make them an ideal test bed for next-generation drug discovery bioinformatic approaches such as multi-parallel dose-response analysis, data texturization, and latent semantic indexing-based natural language data processing and feature extraction. Copyright © 2013 Elsevier Inc. All rights reserved.
Gan, Xiao-Ning; Luo, Jie; Tang, Rui-Xue; Wang, Han-Lin; Zhou, Hong; Qin, Hui; Gan, Ting-Qing; Chen, Gang
2017-05-01
The role and mechanism of miR-452-5p in lung adenocarcinoma remain unclear. In this study, we performed a systematic investigation of the clinical value of miR-452-5p expression in lung adenocarcinoma. The expression of miR-452-5p in 101 lung adenocarcinoma patients was detected by quantitative real-time polymerase chain reaction. Data from The Cancer Genome Atlas and Gene Expression Omnibus databases were combined to verify the expression level of miR-452-5p in lung adenocarcinoma. Using several online prediction databases and bioinformatics software, pathway and network analyses of miR-452-5p target genes were performed to explore its prospective molecular mechanism. The expression of miR-452-5p in the in-house lung adenocarcinoma samples was significantly lower than that in adjacent tissues (p < 0.001). Additionally, the expression level of miR-452-5p was negatively correlated with several clinicopathological parameters, including tumor size (p = 0.014), lymph node metastasis (p = 0.032), and tumor-node-metastasis stage (p = 0.036). Data from The Cancer Genome Atlas also confirmed the low expression of miR-452 in lung adenocarcinoma (p < 0.001). Furthermore, reduced expression of miR-452-5p in lung adenocarcinoma (standardized mean difference = -0.393, 95% confidence interval: -0.774 to -0.011, p = 0.044) was validated by a meta-analysis. Five hub genes targeted by miR-452-5p, including SMAD family member 4, SMAD family member 2, cyclin-dependent kinase inhibitor 1B, tyrosine 3-monooxygenase/tryptophan 5-monooxygenase activation protein epsilon, and tyrosine 3-monooxygenase/tryptophan 5-monooxygenase activation protein beta, were significantly enriched in the cell-cycle pathway. In conclusion, low expression of miR-452-5p appears to play an essential role in lung adenocarcinoma. Bioinformatics analysis might help to reveal the potential mechanism of miR-452-5p in lung adenocarcinoma.
Mu, Min; Lu, Xu-Ke; Wang, Jun-Juan; Wang, De-Long; Yin, Zu-Jun; Wang, Shuai; Fan, Wei-Li; Ye, Wu-Wei
2016-03-18
Trehalose (α-D-glucopyranosyl α-D-glucopyranoside) is a nonreducing disaccharide that is widely distributed in bacteria, fungi, algae, plants and invertebrates. In this study, stress-related trehalose-6-phosphate synthase (TPS) genes were identified in cotton, and the genetic structure and molecular evolution of the TPSs were analyzed with bioinformatics methods, which could lay a foundation for further research on TPS functions in cotton. The genome information of Gossypium raimondii (group D), G. arboreum L. (group A), and G. hirsutum L. (group AD) was used in the study. Fifty-three TPSs were identified, comprising 15 genes in group D, 14 in group A, and 24 in group AD. Bioinformatics methods were used to analyze the genetic structure and molecular evolution of TPSs. Real-time PCR analysis was performed to investigate the expression patterns of gene family members. All TPS family members in cotton can be divided into two subfamilies: Class I and Class II. The similarity of TPS sequences is high within the same species and among closely related family members. The genetic structures of the two TPS subfamilies differ, with more introns and a more complicated gene structure in Class I. There is a TPS domain (Glyco_transf_20) at the N-terminus in all TPS family members and a TPP domain (Trehalose_PPase) at the C-terminus in all except GrTPS6, GhTPS4, and GhTPS9. All Class II members contain a UDP-forming domain. The responses to environmental stresses showed that stresses could induce the expression of TPSs, but the expression patterns vary with different stresses. The distribution of TPSs varies among species but is relatively uniform across chromosomes. Genetic structure varies among gene members, and expression levels vary with different stresses and exhibit tissue specificity. The number of upregulated genes in upland cotton TM-1 is significantly greater than that in G. raimondii and G. arboreum L. Shixiya 1.
Stangeland, Biljana; Mughal, Awais A; Grieg, Zanina; Sandberg, Cecilie Jonsgar; Joel, Mrinal; Nygård, Ståle; Meling, Torstein; Murrell, Wayne; Vik Mo, Einar O; Langmoen, Iver A
2015-09-22
Glioblastoma (GBM) is both the most common and the most lethal primary brain tumor. It is thought that GBM stem cells (GSCs) are critically important in resistance to therapy. Therefore, there is a strong rationale to target these cells in order to develop new molecular therapies. To identify molecular targets in GSCs, we compared gene expression in GSCs to that in neural stem cells (NSCs) from the adult human brain, using microarrays. Bioinformatic filtering identified 20 genes (PBK/TOPK, CENPA, KIF15, DEPDC1, CDC6, DLG7/DLGAP5/HURP, KIF18A, EZH2, HMMR/RHAMM/CD168, NOL4, MPP6, MDM1, RAPGEF4, RHBDD1, FNDC3B, FILIP1L, MCC, ATXN7L4/ATXN7L1, P2RY5/LPAR6 and FAM118A) that were consistently expressed in GSC cultures and consistently not expressed in NSC cultures. The expression of these genes was confirmed in clinical samples (TCGA and REMBRANDT). The first nine genes were highly co-expressed in all GBM subtypes and were part of the same protein-protein interaction network. Furthermore, their combined up-regulation correlated negatively with patient survival in the mesenchymal GBM subtype. Using targeted proteomics and the COGNOSCENTE database we linked these genes to GBM signalling pathways. Nine genes: PBK, CENPA, KIF15, DEPDC1, CDC6, DLG7, KIF18A, EZH2 and HMMR should be further explored as targets for treatment of GBM.
Shen, Shixuan; Chen, Xiaohui; Li, Hao; Sun, Liping; Yuan, Yuan
2018-01-01
Background: The relationship between promoter methylation of the MLH1 gene and gastric cancer (GC) has been investigated previously. To reach a more credible conclusion, we performed a systematic review, meta-analysis and bioinformatic analysis to clarify the role of MLH1 methylation in the prediction and prognosis of GC. Methods: Eligible studies were identified by searching PubMed, Web of Science, Embase, BIOSIS, CNKI and Wanfang Data for information on MLH1 methylation and GC. The strength of the association between the two was estimated by the odds ratio (OR) with its 95% confidence interval (CI). The Newcastle-Ottawa scale was used for quality assessment. Subgroup and sensitivity analyses were conducted to explore sources of heterogeneity. The Gene Expression Omnibus (GEO) and The Cancer Genome Atlas (TCGA) were employed for bioinformatics analysis of the correlation between MLH1 methylation and GC risk, clinicopathological behavior and prognosis. Results: 2365 GC cases and 1563 controls were included in the meta-analysis. The pooled OR of MLH1 methylation in GC was 4.895 (95% CI: 3.149-7.611, P<0.001), indicating a significant association with increased GC risk. No significant difference was found in relation to Lauren classification, tumor invasion, lymph node/distant metastasis or tumor stage in GC. Analysis based on GEO and TCGA showed that high MLH1 methylation was associated with increased GC risk but might not be related to GC clinicopathological features or prognosis. Conclusion: MLH1 methylation is a viable biomarker for the prediction of GC, and it might not affect GC behavior. Further study could be conducted to verify the impact of MLH1 methylation on GC prognosis. PMID:29896277
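The abstract above reports effect sizes as an odds ratio with a 95% confidence interval. As a rough illustration of how such a figure is obtained for a single 2x2 table, here is a minimal Python sketch using the standard log-OR (Woolf) interval. The function name and the counts in the usage note are hypothetical and are not taken from the study, and a real meta-analysis would additionally pool per-study estimates (for example with a random-effects model), which this sketch does not do.

```python
import math

def odds_ratio_ci(a, b, c, d, z=1.96):
    """Odds ratio and Woolf confidence interval for one 2x2 table.

    a: methylated cases      b: unmethylated cases
    c: methylated controls   d: unmethylated controls
    z: normal quantile (1.96 gives an approximate 95% interval)
    All four counts must be non-zero.
    """
    or_ = (a * d) / (b * c)
    # Standard error of log(OR)
    se = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(or_) - z * se)
    upper = math.exp(math.log(or_) + z * se)
    return or_, lower, upper
```

For instance, `odds_ratio_ci(20, 80, 10, 90)` (invented counts) gives an OR of 2.25 with an interval that just includes 1, so that single hypothetical table would not by itself indicate a significant association.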
Bioinformatics analysis of transcriptome dynamics during growth in angus cattle longissimus muscle.
Moisá, Sonia J; Shike, Daniel W; Graugnard, Daniel E; Rodriguez-Zas, Sandra L; Everts, Robin E; Lewin, Harris A; Faulkner, Dan B; Berger, Larry L; Loor, Juan J
2013-01-01
Transcriptome dynamics in the longissimus muscle (LM) of young Angus cattle were evaluated at 0, 60, 120, and 220 days from early weaning. Bioinformatic analysis was performed using the dynamic impact approach (DIA) by means of the Kyoto Encyclopedia of Genes and Genomes (KEGG) and Database for Annotation, Visualization and Integrated Discovery (DAVID) databases. Between 0 and 120 days (growing phase), most of the highly impacted pathways (e.g., ascorbate and aldarate metabolism, drug metabolism, cytochrome P450 and retinol metabolism) were inhibited. The phase between 120 and 220 days (finishing phase) was characterized by the most striking differences, with 3,784 differentially expressed genes (DEGs). Analysis of those DEGs revealed that the most impacted KEGG canonical pathway was glycosylphosphatidylinositol (GPI)-anchor biosynthesis, which was inhibited. Furthermore, inhibition of calpastatin and activation of tyrosine aminotransferase ubiquitination at 220 days promote proteasomal degradation, while the concurrent activation of ribosomal proteins promotes protein synthesis. Therefore, the balance of these processes likely results in a steady state of protein turnover during the finishing phase. Results underscore the importance of transcriptome dynamics in LM during growth.
Niu, Sheng-Yong; Yang, Jinyu; McDermaid, Adam; Zhao, Jing; Kang, Yu; Ma, Qin
2017-05-08
Metagenomic and metatranscriptomic sequencing approaches are more frequently being used to link microbiota to important diseases and ecological changes. Many analyses have been used to compare the taxonomic and functional profiles of microbiota across habitats or individuals. While a large portion of metagenomic analyses focus on species-level profiling, some studies use strain-level metagenomic analyses to investigate the relationship between specific strains and certain circumstances. Metatranscriptomic analysis provides another important insight into activities of genes by examining gene expression levels of microbiota. Hence, combining metagenomic and metatranscriptomic analyses will help understand the activity or enrichment of a given gene set, such as drug-resistant genes among microbiome samples. Here, we summarize existing bioinformatics tools of metagenomic and metatranscriptomic data analysis, the purpose of which is to assist researchers in deciding the appropriate tools for their microbiome studies. Additionally, we propose an Integrated Meta-Function mapping pipeline to incorporate various reference databases and accelerate functional gene mapping procedures for both metagenomic and metatranscriptomic analyses. © The Author 2017. Published by Oxford University Press. All rights reserved.
expVIP: a Customizable RNA-seq Data Analysis and Visualization Platform
2016-01-01
The majority of transcriptome sequencing (RNA-seq) expression studies in plants remain underutilized and inaccessible due to the use of disparate transcriptome references and the lack of skills and resources to analyze and visualize these data. We have developed expVIP, an expression visualization and integration platform, which allows easy analysis of RNA-seq data combined with an intuitive and interactive interface. Users can analyze public and user-specified data sets with minimal bioinformatics knowledge using the expVIP virtual machine. This generates a custom Web browser to visualize, sort, and filter the RNA-seq data and provides outputs for differential gene expression analysis. We demonstrate expVIP’s suitability for polyploid crops and evaluate its performance across a range of biologically relevant scenarios. To exemplify its use in crop research, we developed a flexible wheat (Triticum aestivum) expression browser (www.wheat-expression.com) that can be expanded with user-generated data in a local virtual machine environment. The open-access expVIP platform will facilitate the analysis of gene expression data from a wide variety of species by enabling the easy integration, visualization, and comparison of RNA-seq data across experiments. PMID:26869702
Epigenetic regulation of gene expression in cancer: techniques, resources and analysis
Kagohara, Luciane T; Stein-O’Brien, Genevieve L; Kelley, Dylan; Flam, Emily; Wick, Heather C; Danilova, Ludmila V; Easwaran, Hariharan; Favorov, Alexander V; Qian, Jiang; Gaykalova, Daria A; Fertig, Elana J
2018-01-01
Cancer is a complex disease, driven by aberrant activity in numerous signaling pathways in even individual malignant cells. Epigenetic changes are critical mediators of these functional changes that drive and maintain the malignant phenotype. Changes in DNA methylation, histone acetylation and methylation, noncoding RNAs, and posttranslational modifications are all epigenetic drivers in cancer, independent of changes in the DNA sequence. These epigenetic alterations were once thought to be crucial only for the malignant phenotype maintenance. Now, epigenetic alterations are also recognized as critical for disrupting essential pathways that protect the cells from uncontrolled growth, longer survival and establishment in distant sites from the original tissue. In this review, we focus on DNA methylation and chromatin structure in cancer. The precise functional role of these alterations is an area of active research using emerging high-throughput approaches and bioinformatics analysis tools. Therefore, this review also describes these high-throughput measurement technologies, public domain databases for high-throughput epigenetic data in tumors and model systems and bioinformatics algorithms for their analysis. Advances in bioinformatics data that combine these epigenetic data with genomics data are essential to infer the function of specific epigenetic alterations in cancer. These integrative algorithms are also a focus of this review. Future studies using these emerging technologies will elucidate how alterations in the cancer epigenome cooperate with genetic aberrations during tumor initiation and progression. This deeper understanding is essential to future studies with epigenetics biomarkers and precision medicine using emerging epigenetic therapies. PMID:28968850
PanGEA: identification of allele specific gene expression using the 454 technology.
Kofler, Robert; Teixeira Torres, Tatiana; Lelley, Tamas; Schlötterer, Christian
2009-05-14
Next generation sequencing technologies hold great potential for many biological questions. While mainly used for genomic sequencing, they are also very promising for gene expression profiling. Sequencing of cDNA not only provides an estimate of the absolute expression level, it can also be used for the identification of allele specific gene expression. We developed PanGEA, a tool which enables a fast and user-friendly analysis of allele specific gene expression using the 454 technology. PanGEA allows mapping of 454-ESTs to genes or whole genomes, displaying gene expression profiles, identification of SNPs and the quantification of allele specific gene expression. The intuitive GUI of PanGEA facilitates a flexible and interactive analysis of the data. PanGEA additionally implements a modification of the Smith-Waterman algorithm which deals with incorrect estimates of homopolymer length as occurring in the 454 technology. To our knowledge, PanGEA is the first tool which facilitates the identification of allele specific gene expression. PanGEA is distributed under the Mozilla Public License and available at: http://www.kofler.or.at/bioinformatics/PanGEA
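For readers unfamiliar with the Smith-Waterman algorithm mentioned above, the following Python sketch computes the best local alignment score with a linear gap penalty. It shows only the textbook recurrence. It is not PanGEA's implementation and does not include the homopolymer-aware modification the abstract describes, and the scoring values (match 2, mismatch -1, gap -2) are arbitrary assumptions chosen for illustration.

```python
def smith_waterman_score(a, b, match=2, mismatch=-1, gap=-2):
    """Best local alignment score between sequences a and b.

    Textbook Smith-Waterman with a linear gap penalty. Only two rows
    of the dynamic-programming matrix are kept, because only the
    score (not the traceback) is returned.
    """
    cols = len(b) + 1
    prev = [0] * cols
    best = 0
    for i in range(1, len(a) + 1):
        cur = [0] * cols
        for j in range(1, cols):
            diag = prev[j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            # A cell never drops below 0: this is what makes the alignment local.
            cur[j] = max(0, diag, prev[j] + gap, cur[j - 1] + gap)
            best = max(best, cur[j])
        prev = cur
    return best
```

With these assumed scores, two sequences that share only the substring `ACG` score 6 (three matches), regardless of the unrelated flanking bases.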
Protein-protein interaction network of gene expression in the hydrocortisone-treated keloid.
Chen, Rui; Zhang, Zhiliang; Xue, Zhujia; Wang, Lin; Fu, Mingang; Lu, Yi; Bai, Ling; Zhang, Ping; Fan, Zhihong
2015-01-01
In order to explore the molecular mechanism of hydrocortisone in keloid tissue, the gene expression profiles of keloid samples treated with hydrocortisone were subjected to bioinformatics analysis. Firstly, the gene expression profiles (GSE7890) of five samples of keloid treated with hydrocortisone and five untreated keloid samples were downloaded from the Gene Expression Omnibus (GEO) database. Secondly, data were preprocessed using packages in R language and differentially expressed genes (DEGs) were screened using a significance analysis of microarrays (SAM) protocol. Thirdly, the DEGs were subjected to gene ontology (GO) function and KEGG pathway enrichment analysis. Finally, the interactions of DEGs in samples of keloid treated with hydrocortisone were explored in a human protein-protein interaction (PPI) network, and sub-modules of the DEGs interaction network were analyzed using Cytoscape software. Based on the analysis, 572 DEGs in the hydrocortisone-treated samples were screened; most of these were involved in the signal transduction and cell cycle. Furthermore, three critical genes in the module, including COL1A1, NID1, and PRELP, were screened in the PPI network analysis. These findings enhance understanding of the pathogenesis of the keloid and provide references for keloid therapy. © 2015 The International Society of Dermatology.
Characterization of microRNAs from goat (Capra hircus) by Solexa deep-sequencing technology.
Ling, Y H; Ding, J P; Zhang, X D; Wang, L J; Zhang, Y H; Li, Y S; Zhang, Z J; Zhang, X R
2013-06-13
MicroRNAs (miRNAs) are an important class of small noncoding RNAs that are highly conserved in plants and animals. Many miRNAs are known to mediate a myriad of cell processes, including proliferation and differentiation, via the regulation of some transcription and signaling factors, which are closely related to muscle development and disease. In this study, small RNA cDNA libraries of Boer goats were constructed. In addition, we obtained the goat muscle miRNAs by using Solexa deep-sequencing technology and characterized them using bioinformatics analysis. Based on Solexa sequencing and bioinformatics analysis, 562 species-conserved and 5 goat genome-specific miRNAs were identified, 322 of which had expression levels exceeding 100. The results of real-time quantitative polymerase chain reaction for 8 randomly selected miRNAs showed that the 8 miRNAs were expressed in goat muscle, and the expression patterns were consistent with the Solexa sequencing results. The identification and characterization of miRNAs in goat muscle provide important information on the role of miRNA regulation in muscle growth and development. These data will help to facilitate studies on the regulatory roles played by miRNAs during goat growth and development.
ERIC Educational Resources Information Center
Shachak, Aviv; Ophir, Ron; Rubin, Eitan
2005-01-01
The need to support bioinformatics training has been widely recognized by scientists, industry, and government institutions. However, the discussion of instructional methods for teaching bioinformatics is only beginning. Here we report on a systematic attempt to design two bioinformatics workshops for graduate biology students on the basis of…
Curtis, Ross E; Kim, Seyoung; Woolford, John L; Xu, Wenjie; Xing, Eric P
2013-03-21
Association analysis using genome-wide expression quantitative trait locus (eQTL) data investigates the effect that genetic variation has on cellular pathways and leads to the discovery of candidate regulators. Traditional analysis of eQTL data via pairwise statistical significance tests or linear regression does not leverage the availability of the structural information of the transcriptome, such as presence of gene networks that reveal correlation and potentially regulatory relationships among the study genes. We employ a new eQTL mapping algorithm, GFlasso, which we have previously developed for sparse structured regression, to reanalyze a genome-wide yeast dataset. GFlasso fully takes into account the dependencies among expression traits to suppress false positives and to enhance the signal/noise ratio. Thus, GFlasso leverages the gene-interaction network to discover the pleiotropic effects of genetic loci that perturb the expression level of multiple (rather than individual) genes, which enables us to gain more power in detecting previously neglected signals that are marginally weak but pleiotropically significant. While eQTL hotspots in yeast have been reported previously as genomic regions controlling multiple genes, our analysis reveals additional novel eQTL hotspots and, more interestingly, uncovers groups of multiple contributing eQTL hotspots that affect the expression level of functional gene modules. To our knowledge, our study is the first to report this type of gene regulation stemming from multiple eQTL hotspots. Additionally, we report the results from in-depth bioinformatics analysis for three groups of these eQTL hotspots: ribosome biogenesis, telomere silencing, and retrotransposon biology. We suggest candidate regulators for the functional gene modules that map to each group of hotspots. 
Not only do we find that many of these candidate regulators contain mutations in the promoter and coding regions of the genes, in the case of the Ribi group, we provide experimental evidence suggesting that the identified candidates do regulate the target genes predicted by GFlasso. Thus, this structured association analysis of a yeast eQTL dataset via GFlasso, coupled with extensive bioinformatics analysis, discovers a novel regulation pattern between multiple eQTL hotspots and functional gene modules. Furthermore, this analysis demonstrates the potential of GFlasso as a powerful computational tool for eQTL studies that exploit the rich structural information among expression traits due to correlation, regulation, or other forms of biological dependencies.
NEIBank: Genomics and bioinformatics resources for vision research
Peterson, Katherine; Gao, James; Buchoff, Patee; Jaworski, Cynthia; Bowes-Rickman, Catherine; Ebright, Jessica N.; Hauser, Michael A.; Hoover, David
2008-01-01
NEIBank is an integrated resource for genomics and bioinformatics in vision research. It includes expressed sequence tag (EST) data and sequence-verified cDNA clones for multiple eye tissues of several species, web-based access to human eye-specific SAGE data through EyeSAGE, and comprehensive, annotated databases of known human eye disease genes and candidate disease gene loci. All expression- and disease-related data are integrated in EyeBrowse, an eye-centric genome browser. NEIBank provides a comprehensive overview of current knowledge of the transcriptional repertoires of eye tissues and their relation to pathology. PMID:18648525
Carving a niche: establishing bioinformatics collaborations
Lyon, Jennifer A.; Tennant, Michele R.; Messner, Kevin R.; Osterbur, David L.
2006-01-01
Objectives: The paper describes collaborations and partnerships developed between library bioinformatics programs and other bioinformatics-related units at four academic institutions. Methods: A call for information on bioinformatics partnerships was made via email to librarians who have participated in the National Center for Biotechnology Information's Advanced Workshop for Bioinformatics Information Specialists. Librarians from Harvard University, the University of Florida, the University of Minnesota, and Vanderbilt University responded and expressed willingness to contribute information on their institutions, programs, services, and collaborating partners. Similarities and differences in programs and collaborations were identified. Results: The four librarians have developed partnerships with other units on their campuses that can be categorized into the following areas: knowledge management, instruction, and electronic resource support. All primarily support freely accessible electronic resources, while other campus units deal with fee-based ones. These demarcations are apparent in resource provision as well as in subsequent support and instruction. Conclusions and Recommendations: Through environmental scanning and networking with colleagues, librarians who provide bioinformatics support can develop fruitful collaborations. Visibility is key to building collaborations, as is broad-based thinking in terms of potential partners. PMID:16888668
Analysis of microRNA profile of Anopheles sinensis by deep sequencing and bioinformatic approaches.
Feng, Xinyu; Zhou, Xiaojian; Zhou, Shuisen; Wang, Jingwen; Hu, Wei
2018-03-12
microRNAs (miRNAs) are small non-coding RNAs widely identified in many mosquitoes. They are reported to play important roles in development, differentiation and innate immunity. However, miRNAs in Anopheles sinensis, one of the Chinese malaria mosquitoes, remain largely unknown. We investigated the global miRNA expression profile of An. sinensis using Illumina Hiseq 2000 sequencing. Meanwhile, we applied a bioinformatic approach to identify potential miRNAs in An. sinensis. The identified miRNA profiles were compared and analyzed by two approaches. The selected miRNAs from the sequencing result and the bioinformatic approach were confirmed with qRT-PCR. Moreover, target prediction, GO annotation and pathway analysis were carried out to understand the role of miRNAs in An. sinensis. We identified 49 conserved miRNAs and 12 novel miRNAs by next-generation high-throughput sequencing technology. In contrast, 43 miRNAs were predicted by the bioinformatic approach, of which two were assigned as novel. Comparative analysis of miRNA profiles by two approaches showed that 21 miRNAs were shared between them. Twelve novel miRNAs did not match any known miRNAs of any organism, indicating that they are possibly species-specific. Forty miRNAs were found in many mosquito species, indicating that these miRNAs are evolutionarily conserved and may have critical roles in the process of life. Both the selected known and novel miRNAs (asi-miR-281, asi-miR-184, asi-miR-14, asi-miR-nov5, asi-miR-nov4, asi-miR-9383, and asi-miR-2a) could be detected by quantitative real-time PCR (qRT-PCR) in the sequenced sample, and the expression patterns of these miRNAs measured by qRT-PCR were in concordance with the original miRNA sequencing data. The predicted targets for the known and the novel miRNAs covered many important biological roles and pathways, indicating the diversity of miRNA functions. We also found 21 conserved miRNAs and eight counterparts of target immune pathway genes in An. sinensis based on the analysis of An. gambiae. Our results provide the first lead to the elucidation of the miRNA profile in An. sinensis. Unveiling the roles of mosquito miRNAs will undoubtedly lead to a better understanding of mosquito biology and mosquito-pathogen interactions. This work lays the foundation for the further functional study of An. sinensis miRNAs and will facilitate their application in vector control.
A Web-based assessment of bioinformatics end-user support services at US universities.
Messersmith, Donna J; Benson, Dennis A; Geer, Renata C
2006-07-01
This study was conducted to gauge the availability of bioinformatics end-user support services at US universities and to identify the providers of those services. The study primarily focused on the availability of short-term workshops that introduce users to molecular biology databases and analysis software. Websites of selected US universities were reviewed to determine if bioinformatics educational workshops were offered, and, if so, what organizational units in the universities provided them. Of 239 reviewed universities, 72 (30%) offered bioinformatics educational workshops. These workshops were located at libraries (N = 15), bioinformatics centers (N = 38), or other facilities (N = 35). No such training was noted on the sites of 167 universities (70%). Of the 115 bioinformatics centers identified, two-thirds did not offer workshops. This analysis of university Websites indicates that a gap may exist in the availability of workshops and related training to assist researchers in the use of bioinformatics resources, representing a potential opportunity for libraries and other facilities to provide training and assistance for this growing user group.
Application of machine learning methods in bioinformatics
NASA Astrophysics Data System (ADS)
Yang, Haoyu; An, Zheng; Zhou, Haotian; Hou, Yawen
2018-05-01
With the development of bioinformatics, high-throughput genomic technologies have enabled biology to enter the era of big data [1]. Bioinformatics is an interdisciplinary field that covers the acquisition, management, analysis, interpretation and application of biological information. It derives from the Human Genome Project. The field of machine learning, which aims to develop computer algorithms that improve with experience, holds promise to enable computers to assist humans in the analysis of large, complex data sets [2]. This paper analyzes and compares various algorithms of machine learning and their applications in bioinformatics.
Stephan, Christian; Hamacher, Michael; Blüggel, Martin; Körting, Gerhard; Chamrad, Daniel; Scheer, Christian; Marcus, Katrin; Reidegeld, Kai A; Lohaus, Christiane; Schäfer, Heike; Martens, Lennart; Jones, Philip; Müller, Michael; Auyeung, Kevin; Taylor, Chris; Binz, Pierre-Alain; Thiele, Herbert; Parkinson, David; Meyer, Helmut E; Apweiler, Rolf
2005-09-01
The Bioinformatics Committee of the HUPO Brain Proteome Project (HUPO BPP) meets regularly to execute the post-lab analyses of the data produced in the HUPO BPP pilot studies. On July 7, 2005 the members came together for the 5th time at the European Bioinformatics Institute (EBI) in Hinxton, UK, hosted by Rolf Apweiler. As a main result, the parameter set of the semi-automated data re-analysis of MS/MS spectra has been elaborated and the subsequent work steps have been defined.
Mining featured biomarkers associated with prostatic carcinoma based on bioinformatics.
Piao, Guanying; Wu, Jiarui
2013-11-01
The aim was to analyze the differentially expressed genes and identify featured biomarkers of prostatic carcinoma. The software "Significance Analysis of Microarray" (SAM) was used to identify the differentially coexpressed genes (DCGs). The DCGs present in two datasets were analyzed by GO (Gene Ontology) functional annotation. A total of 389 DCGs were obtained. By GO analysis, we found that these DCGs were closely related to acinus development and to the TGF-β receptor and signal transduction pathways. Furthermore, five featured biomarkers were discovered by interaction analysis. These important signal pathways and oncogenes may provide potential therapeutic targets for prostatic carcinoma.
2016-09-01
assigned a classification. MLST analysis MLST was determined using an in-house automated pipeline that first searches for homologs of each gene of...and virulence mechanism contributing to their success as pathogens in the wound environment. A novel bioinformatics pipeline was used to incorporate...monitored in two ways: read-based genome QC and assembly based metrics. The JCVI Genome QC pipeline samples sequence reads and performs BLAST
Stephens, Susie M; Chen, Jake Y; Davidson, Marcel G; Thomas, Shiby; Trute, Barry M
2005-01-01
As database management systems expand their array of analytical functionality, they become powerful research engines for biomedical data analysis and drug discovery. Databases can hold most of the data types commonly required in life sciences and consequently can be used as flexible platforms for the implementation of knowledgebases. Performing data analysis in the database simplifies data management by minimizing the movement of data from disks to memory, allowing pre-filtering and post-processing of datasets, and enabling data to remain in a secure, highly available environment. This article describes the Oracle Database 10g implementation of BLAST and Regular Expression Searches and provides case studies of their usage in bioinformatics. http://www.oracle.com/technology/software/index.html.
Wu, Chengjiang; Zhao, Yangjing; Lin, Yu; Yang, Xinxin; Yan, Meina; Min, Yujiao; Pan, Zihui; Xia, Sheng; Shao, Qixiang
2018-01-01
DNA microarray and high-throughput sequencing have been widely used to identify the differentially expressed genes (DEGs) in systemic lupus erythematosus (SLE). However, the big data from gene microarrays are also challenging to work with in terms of analysis and processing. The present study combined data from the microarray expression profile (GSE65391) and bioinformatics analysis to identify the key genes and cellular pathways in SLE. Gene ontology (GO) and cellular pathway enrichment analyses of DEGs were performed to investigate significantly enriched pathways. A protein-protein interaction network was constructed to determine the key genes in the occurrence and development of SLE. A total of 310 DEGs were identified in SLE, including 193 upregulated genes and 117 downregulated genes. GO analysis revealed that the most significant biological process of DEGs was immune system process. Kyoto Encyclopedia of Genes and Genomes pathway analysis showed that these DEGs were enriched in signaling pathways associated with the immune system, including the RIG-I-like receptor signaling pathway, intestinal immune network for IgA production, antigen processing and presentation and the toll-like receptor signaling pathway. The current study screened the 10 genes with the highest degrees as hub genes, which included 2′-5′-oligoadenylate synthetase 1, MX dynamin like GTPase 2, interferon induced protein with tetratricopeptide repeats 1, interferon regulatory factor 7, interferon induced with helicase C domain 1, signal transducer and activator of transcription 1, ISG15 ubiquitin-like modifier, DExD/H-box helicase 58, interferon induced protein with tetratricopeptide repeats 3 and 2′-5′-oligoadenylate synthetase 2. Module analysis revealed that these hub genes were also involved in the RIG-I-like receptor signaling, cytosolic DNA-sensing, toll-like receptor signaling and ribosome biogenesis pathways. In addition, these hub genes, from different probe sets, exhibited a significant co-expression tendency in multi-experiment microarray datasets (P<0.01). In conclusion, these key genes and cellular pathways may improve the current understanding of the underlying mechanism of development of SLE. These key genes may be potential biomarkers of diagnosis, therapy and prognosis for SLE. PMID:29257335
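Selecting hub genes by degree, as described above, amounts to counting each gene's interaction partners in the protein-protein interaction network and ranking by that count. The Python sketch below shows the idea on an edge list. The function name is hypothetical, the edges in the usage note are invented toy data rather than interactions reported by the study, and the study itself may have used dedicated network software instead.

```python
from collections import Counter

def top_hub_genes(edges, n=10):
    """Rank genes by degree in an undirected interaction network.

    edges: iterable of (gene_a, gene_b) pairs. Self-loops are ignored
    and duplicate or reversed pairs are counted once.
    Returns up to n (gene, degree) tuples, highest degree first,
    with ties broken alphabetically.
    """
    unique = {tuple(sorted(e)) for e in edges if e[0] != e[1]}
    degree = Counter()
    for u, v in unique:
        degree[u] += 1
        degree[v] += 1
    return sorted(degree.items(), key=lambda kv: (-kv[1], kv[0]))[:n]
```

On a toy edge list such as `[("STAT1", "IRF7"), ("STAT1", "ISG15"), ("STAT1", "OAS1"), ("IRF7", "ISG15")]`, the gene with three partners ranks first. Degree is only one of several centrality measures that could be used to define hubs.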
Pan, Feng; You, Jinwei; Liu, Yuan; Qiu, Xuefeng; Yu, Wen; Ma, Jiehua; Pan, Lianjun; Zhang, Aixia; Zhang, Qipeng
2016-12-01
To better understand the molecular aetiology of type 2 diabetes mellitus-associated erectile dysfunction (T2DMED) and to provide candidates for further study of its diagnosis and treatment, this study was designed to investigate differentially expressed microRNAs (miRNAs) in the corpus cavernosum (CC) of mice with T2DMED using GeneChip array techniques (Affymetrix miRNA 4.0 Array) and to predict target genes and signalling pathways regulated by these miRNAs based on bioinformatic analysis using TargetScan, the DAIAN web platform and DAVID. In the initial screening, 21 miRNAs appeared differentially expressed in the T2DMED group (fold change ≥3, p ≤ 0.01). Among them, the differential expression of miR-18a, miR-206, miR-122, and miR-133 was confirmed by qRT-PCR (p < 0.05 and FDR < 5%). According to bioinformatic analysis, the four miRNAs were speculated to play potential roles in the mechanisms of T2DMED via regulating 28 different genes and several pathways, including apoptosis, fibrosis, eNOS/cGMP/PKG, and vascular smooth muscle contraction processes, which mainly focused on influencing the functions of the endothelium and smooth muscle in the CC. IGF-1, as one of the target genes, was verified to decrease in the CCs of T2DMED animals via ELISA and was confirmed as the target of miR-18a or miR-206 via luciferase assay. Finally, these four miRNAs deserve further confirmation as biomarkers of T2DMED in larger studies. Additionally, miR-18a and/or miR-206 may provide new preventive/therapeutic targets for ED management by targeting IGF-1.
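The initial screening step above applies two thresholds, fold change ≥3 and p ≤ 0.01. A minimal Python sketch of such a filter follows. The record layout, function name and miRNA labels are hypothetical, and treating a three-fold decrease (ratio ≤ 1/3) as meeting the fold-change criterion is an assumption, since the abstract does not state how down-regulation was handled.

```python
def screen_mirnas(records, min_fold=3.0, max_p=0.01):
    """Return names of miRNAs passing fold-change and p-value cut-offs.

    records: iterable of (name, fold_change, p_value), where
    fold_change is the disease/control expression ratio (> 0).
    Assumption: a ratio <= 1/min_fold (strong down-regulation)
    counts as passing, as does a ratio >= min_fold.
    """
    selected = []
    for name, fold_change, p_value in records:
        strong_change = fold_change >= min_fold or fold_change <= 1.0 / min_fold
        if strong_change and p_value <= max_p:
            selected.append(name)
    return selected
```

Candidates passing a screen like this would still need independent validation, which the study performed by qRT-PCR.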
Comparative bioinformatics analyses and profiling of lysosome-related organelle proteomes
NASA Astrophysics Data System (ADS)
Hu, Zhang-Zhi; Valencia, Julio C.; Huang, Hongzhan; Chi, An; Shabanowitz, Jeffrey; Hearing, Vincent J.; Appella, Ettore; Wu, Cathy
2007-01-01
Complete and accurate profiling of cellular organelle proteomes, while challenging, is important for the understanding of detailed cellular processes at the organelle level. Mass spectrometry technologies coupled with bioinformatics analysis provide an effective approach for protein identification and functional interpretation of organelle proteomes. In this study, we have compiled human organelle reference datasets from large-scale proteomic studies and protein databases for seven lysosome-related organelles (LROs), as well as the endoplasmic reticulum and mitochondria, for comparative organelle proteome analysis. Heterogeneous sources of human organelle proteins and rodent homologs are mapped to human UniProtKB protein entries based on ID and/or peptide mappings, followed by functional annotation and categorization using the iProXpress proteomic expression analysis system. Cataloging organelle proteomes allows close examination of both shared and unique proteins among various LROs and reveals their functional relevance. The proteomic comparisons show that LROs are a closely related family of organelles. The shared proteins indicate the dynamic and hybrid nature of LROs, while the unique transmembrane proteins may represent additional candidate marker proteins for LROs. This comparative analysis, therefore, provides a basis for hypothesis formulation and experimental validation of organelle proteins and their functional roles.
Marshall, Elaine; Lowrey, Jacqueline; MacPherson, Sheila; Maybin, Jacqueline A.; Collins, Frances; Critchley, Hilary O. D.
2011-01-01
Context: The endometrium is a multicellular, steroid-responsive tissue that undergoes dynamic remodeling every menstrual cycle in preparation for implantation and, in absence of pregnancy, menstruation. Androgen receptors are present in the endometrium. Objective: The objective of the study was to investigate the impact of androgens on human endometrial stromal cells (hESC). Design: Bioinformatics was used to identify an androgen-regulated gene set and processes associated with their function. Regulation of target genes and impact of androgens on cell function were validated using primary hESC. Setting: The study was conducted at the University Research Institute. Patients: Endometrium was collected from women with regular menses; tissues were used for recovery of cells, total mRNA, or protein and for immunohistochemistry. Results: A new endometrial androgen target gene set (n = 15) was identified. Bioinformatics revealed 12 of these genes interacted in one pathway and identified an association with control of cell survival. Dynamic androgen-dependent changes in expression of the gene set were detected in hESC with nine significantly down-regulated at 2 and/or 8 h. Treatment of hESC with dihydrotestosterone reduced staurosporine-induced apoptosis and cell migration/proliferation. Conclusions: Rigorous in silico analysis resulted in identification of a group of androgen-regulated genes expressed in human endometrium. Pathway analysis and functional assays suggest androgen-dependent changes in gene expression may have a significant impact on stromal cell proliferation, migration, and survival. These data provide the platform for further studies on the role of circulatory or local androgens in the regulation of endometrial function and identify androgens as candidates in the pathogenesis of common endometrial disorders including polycystic ovarian syndrome, cancer, and endometriosis. PMID:21865353
Ferreira Filho, Jaire Alves; Horta, Maria Augusta Crivelente; Beloti, Lilian Luzia; Dos Santos, Clelton Aparecido; de Souza, Anete Pereira
2017-10-12
Trichoderma harzianum is used in biotechnology applications due to its ability to produce powerful enzymes for the conversion of lignocellulosic substrates into soluble sugars. Active enzymes involved in carbohydrate metabolism are defined as carbohydrate-active enzymes (CAZymes), and the most abundant family in the CAZy database is the glycoside hydrolases. The enzymes of this family play a fundamental role in the decomposition of plant biomass. In this study, the CAZymes of T. harzianum were identified and classified using bioinformatic approaches after which the expression profiles of all annotated CAZymes were assessed via RNA-Seq, and a phylogenetic analysis was performed. A total of 430 CAZymes (3.7% of the total proteins for this organism) were annotated in T. harzianum, including 259 glycoside hydrolases (GHs), 101 glycosyl transferases (GTs), 6 polysaccharide lyases (PLs), 22 carbohydrate esterases (CEs), 42 auxiliary activities (AAs) and 46 carbohydrate-binding modules (CBMs). Among the identified T. harzianum CAZymes, 47% were predicted to harbor a signal peptide sequence and were therefore classified as secreted proteins. The GH families were the CAZyme class with the greatest number of expressed genes, including GH18 (23 genes), GH3 (17 genes), GH16 (16 genes), GH2 (13 genes) and GH5 (12 genes). A phylogenetic analysis of the proteins in the AA9/GH61, CE5 and GH55 families showed high functional variation among the proteins. Identifying the main proteins used by T. harzianum for biomass degradation can ensure new advances in the biofuel production field. Herein, we annotated and characterized the expression levels of all of the CAZymes from T. harzianum, which may contribute to future studies focusing on the functional and structural characterization of the identified proteins.
Gelbart, Hadas; Ben-Dor, Shifra; Yarden, Anat
2017-01-01
Despite the central place held by bioinformatics in modern life sciences and related areas, it has only recently been integrated to a limited extent into high-school teaching and learning programs. Here we describe the assessment of a learning environment entitled ‘Bioinformatics in the Service of Biotechnology’. Students’ learning outcomes and attitudes toward the bioinformatics learning environment were measured by analyzing their answers to questions embedded within the activities, questionnaires, interviews and observations. Students’ difficulties and knowledge acquisition were characterized based on four categories: the required domain-specific knowledge (declarative, procedural, strategic or situational), the scientific field that each question stems from (biology, bioinformatics or their combination), the associated cognitive-process dimension (remember, understand, apply, analyze, evaluate, create) and the type of question (open-ended or multiple choice). Analysis of students’ cognitive outcomes revealed learning gains in bioinformatics and related scientific fields, as well as appropriation of the bioinformatics approach as part of the students’ scientific ‘toolbox’. For students, questions stemming from the ‘old world’ biology field and requiring declarative or strategic knowledge were harder to deal with. This stands in contrast to their teachers’ prediction. Analysis of students’ affective outcomes revealed positive attitudes toward bioinformatics and the learning environment, as well as their perception of the teacher’s role. Insights from this analysis yielded implications and recommendations for curriculum design, classroom enactment, teacher education and research. For example, we recommend teaching bioinformatics in an integrative and comprehensive manner, through an inquiry process, and linking it to the wider science curriculum. PMID:26801769
Machluf, Yossy; Gelbart, Hadas; Ben-Dor, Shifra; Yarden, Anat
2017-01-01
Despite the central place held by bioinformatics in modern life sciences and related areas, it has only recently been integrated to a limited extent into high-school teaching and learning programs. Here we describe the assessment of a learning environment entitled 'Bioinformatics in the Service of Biotechnology'. Students' learning outcomes and attitudes toward the bioinformatics learning environment were measured by analyzing their answers to questions embedded within the activities, questionnaires, interviews and observations. Students' difficulties and knowledge acquisition were characterized based on four categories: the required domain-specific knowledge (declarative, procedural, strategic or situational), the scientific field that each question stems from (biology, bioinformatics or their combination), the associated cognitive-process dimension (remember, understand, apply, analyze, evaluate, create) and the type of question (open-ended or multiple choice). Analysis of students' cognitive outcomes revealed learning gains in bioinformatics and related scientific fields, as well as appropriation of the bioinformatics approach as part of the students' scientific 'toolbox'. For students, questions stemming from the 'old world' biology field and requiring declarative or strategic knowledge were harder to deal with. This stands in contrast to their teachers' prediction. Analysis of students' affective outcomes revealed positive attitudes toward bioinformatics and the learning environment, as well as their perception of the teacher's role. Insights from this analysis yielded implications and recommendations for curriculum design, classroom enactment, teacher education and research. For example, we recommend teaching bioinformatics in an integrative and comprehensive manner, through an inquiry process, and linking it to the wider science curriculum. © The Author 2016. Published by Oxford University Press.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Sundstrom, J.; Tash, B; Murakami, T
2009-01-01
The molecular function of occludin, an integral membrane component of tight junctions, remains unclear. VEGF-induced phosphorylation sites were mapped on occludin by combining MS data analysis with bioinformatics. In vivo phosphorylation of Ser490 was validated and protein interaction studies combined with crystal structure analysis suggest that Ser490 phosphorylation attenuates the interaction between occludin and ZO-1. This study demonstrates that combining MS data and bioinformatics can successfully identify novel phosphorylation sites from limiting samples.
Exploring of the molecular mechanism of rhinitis via bioinformatics methods
Song, Yufen; Yan, Zhaohui
2018-01-01
The aim of this study was to analyze gene expression profiles for exploring the function and regulatory network of differentially expressed genes (DEGs) in pathogenesis of rhinitis by a bioinformatics method. The gene expression profile of GSE43523 was downloaded from the Gene Expression Omnibus database. The dataset contained 7 seasonal allergic rhinitis samples and 5 non-allergic normal samples. DEGs between rhinitis samples and normal samples were identified via the limma package of R. The WebGestalt database was used to identify enriched Kyoto Encyclopedia of Genes and Genomes (KEGG) pathways of the DEGs. The differentially co-expressed pairs of the DEGs were identified via the DCGL package in R, and the differential co-expression network was constructed based on these pairs. A protein-protein interaction (PPI) network of the DEGs was constructed based on the Search Tool for the Retrieval of Interacting Genes database. A total of 263 DEGs were identified in rhinitis samples compared with normal samples, including 125 downregulated ones and 138 upregulated ones. The DEGs were enriched in 7 KEGG pathways. 308 differential co-expression gene pairs were obtained. A differential co-expression network was constructed, containing 212 nodes. In total, 148 PPI pairs of the DEGs were identified, and a PPI network was constructed based on these pairs. Bioinformatics methods could help us identify significant genes and pathways related to the pathogenesis of rhinitis. Steroid biosynthesis pathway and metabolic pathways might play important roles in the development of allergic rhinitis (AR). Genes such as CDC42 effector protein 5, solute carrier family 39 member A11 and PR/SET domain 10 might be also associated with the pathogenesis of AR, which provided references for the molecular mechanisms of AR. PMID:29257233
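The DEG step described above can be illustrated with a minimal sketch. It is not the limma method used in the study (limma fits linear models with moderated statistics); it substitutes a plain Welch t-statistic and a log2 fold-change cutoff, and the function names, thresholds, and toy expression values are all assumptions made for illustration.

```python
import math
from statistics import mean, variance


def welch_t(case, control):
    """Welch t-statistic for two groups of log2 expression values."""
    se = math.sqrt(variance(case) / len(case) + variance(control) / len(control))
    return (mean(case) - mean(control)) / se


def classify_gene(case, control, t_cut=2.0, lfc_cut=1.0):
    """Label a gene 'up', 'down', or 'ns' (not significant).

    t_cut and lfc_cut are hypothetical thresholds, not those of the study.
    """
    lfc = mean(case) - mean(control)  # difference of log2 means = log2 fold change
    if abs(lfc) < lfc_cut or abs(welch_t(case, control)) < t_cut:
        return "ns"
    return "up" if lfc > 0 else "down"


# Toy log2 intensities (invented): rhinitis samples vs. normal samples.
toy = {
    "geneA": ([8.1, 8.3, 7.9, 8.2], [6.0, 6.2, 5.9, 6.1]),
    "geneB": ([5.0, 5.1, 4.9], [5.0, 5.2, 4.8]),
    "geneC": ([3.0, 3.2, 2.9], [5.1, 5.0, 5.3]),
}
labels = {gene: classify_gene(case, ctrl) for gene, (case, ctrl) in toy.items()}
```

Counting the "up" and "down" labels over all genes would give totals of the kind reported in the abstract (138 upregulated and 125 downregulated, 263 in all).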
Lin, Chih-Hui; Lu, Chien-Te; Lin, Hsin-Tang; Pan, Tzu-Ming
2009-03-11
Sporamins are tuberous storage proteins and account for 80% of soluble protein in sweet potato tubers with trypsin-inhibitory activity. The expression of sporamin protein in transgenic Chinese kale (line BoA 3-1) conferred insecticidal activity toward corn earworm [Helicoverpa armigera (Hubner)] in a previous report. In this study, we present a preliminary safety assessment of transgenic Chinese kale BoA 3-1. Bioinformatic and simulated gastric fluid (SGF) analyses were performed to evaluate the allergenicity of sporamin protein. The substantial equivalence between transgenic Chinese kale and its wild-type host has been demonstrated by the comparison of important constituents. A reliable real-time polymerase chain reaction (PCR) detection method was also developed to control sample quality. Despite the results of most evaluations in this study being negative, the safety of sporamin in transgenic Chinese kale BoA 3-1 could not be concluded because of the allergenic risk revealed by bioinformatic analysis.
Bioinformatics tools in predictive ecology: applications to fisheries
Tucker, Allan; Duplisea, Daniel
2012-01-01
There has been a huge effort in the advancement of analytical techniques for molecular biological data over the past decade. This has led to many novel algorithms that are specialized to deal with data associated with biological phenomena, such as gene expression and protein interactions. In contrast, ecological data analysis has remained focused to some degree on off-the-shelf statistical techniques though this is starting to change with the adoption of state-of-the-art methods, where few assumptions can be made about the data and a more explorative approach is required, for example, through the use of Bayesian networks. In this paper, some novel bioinformatics tools for microarray data are discussed along with their ‘crossover potential’ with an application to fisheries data. In particular, a focus is made on the development of models that identify functionally equivalent species in different fish communities with the aim of predicting functional collapse. PMID:22144390
Giraldo-Calderón, Gloria I.; Emrich, Scott J.; MacCallum, Robert M.; Maslen, Gareth; Dialynas, Emmanuel; Topalis, Pantelis; Ho, Nicholas; Gesing, Sandra; Madey, Gregory; Collins, Frank H.; Lawson, Daniel
2015-01-01
VectorBase is a National Institute of Allergy and Infectious Diseases supported Bioinformatics Resource Center (BRC) for invertebrate vectors of human pathogens. Now in its 11th year, VectorBase currently hosts the genomes of 35 organisms including a number of non-vectors for comparative analysis. Hosted data range from genome assemblies with annotated gene features, transcript and protein expression data to population genetics including variation and insecticide-resistance phenotypes. Here we describe improvements to our resource and the set of tools available for interrogating and accessing BRC data including the integration of Web Apollo to facilitate community annotation and providing Galaxy to support user-based workflows. VectorBase also actively supports our community through hands-on workshops and online tutorials. All information and data are freely available from our website at https://www.vectorbase.org/. PMID:25510499
He, Yongqun
2011-01-01
Brucella is a Gram-negative, facultative intracellular bacterium that causes zoonotic brucellosis in humans and various animals. Out of 10 classified Brucella species, B. melitensis, B. abortus, B. suis, and B. canis are pathogenic to humans. In the past decade, the mechanisms of Brucella pathogenesis and host immunity have been extensively investigated using the cutting edge systems biology and bioinformatics approaches. This article provides a comprehensive review of the applications of Omics (including genomics, transcriptomics, and proteomics) and bioinformatics technologies for the analysis of Brucella pathogenesis, host immune responses, and vaccine targets. Based on more than 30 sequenced Brucella genomes, comparative genomics is able to identify gene variations among Brucella strains that help to explain host specificity and virulence differences among Brucella species. Diverse transcriptomics and proteomics gene expression studies have been conducted to analyze gene expression profiles of wild type Brucella strains and mutants under different laboratory conditions. High throughput Omics analyses of host responses to infections with virulent or attenuated Brucella strains have been focused on responses by mouse and cattle macrophages, bovine trophoblastic cells, mouse and boar splenocytes, and ram buffy coat. Differential serum responses in humans and rams to Brucella infections have been analyzed using high throughput serum antibody screening technology. The Vaxign reverse vaccinology has been used to predict many Brucella vaccine targets. More than 180 Brucella virulence factors and their gene interaction networks have been identified using advanced literature mining methods. The recent development of community-based Vaccine Ontology and Brucellosis Ontology provides an efficient way for Brucella data integration, exchange, and computer-assisted automated reasoning. PMID:22919594
Yersinia Type III Secretion System Master Regulator LcrF
Schwiesow, Leah; Lam, Hanh
2015-01-01
Many Gram-negative pathogens express a type III secretion (T3SS) system to enable growth and survival within a host. The three human-pathogenic Yersinia species, Y. pestis, Y. pseudotuberculosis, and Y. enterocolitica, encode the Ysc T3SS, whose expression is controlled by an AraC-like master regulator called LcrF. In this review, we discuss LcrF structure and function as well as the environmental cues and pathways known to regulate LcrF expression. Similarities and differences in binding motifs and modes of action between LcrF and the Pseudomonas aeruginosa homolog ExsA are summarized. In addition, we present a new bioinformatics analysis that identifies putative LcrF binding sites within Yersinia target gene promoters. PMID:26644429
A Web-based assessment of bioinformatics end-user support services at US universities
Messersmith, Donna J.; Benson, Dennis A.; Geer, Renata C.
2006-01-01
Objectives: This study was conducted to gauge the availability of bioinformatics end-user support services at US universities and to identify the providers of those services. The study primarily focused on the availability of short-term workshops that introduce users to molecular biology databases and analysis software. Methods: Websites of selected US universities were reviewed to determine if bioinformatics educational workshops were offered, and, if so, what organizational units in the universities provided them. Results: Of 239 reviewed universities, 72 (30%) offered bioinformatics educational workshops. These workshops were located at libraries (N = 15), bioinformatics centers (N = 38), or other facilities (N = 35). No such training was noted on the sites of 167 universities (70%). Of the 115 bioinformatics centers identified, two-thirds did not offer workshops. Conclusions: This analysis of university Websites indicates that a gap may exist in the availability of workshops and related training to assist researchers in the use of bioinformatics resources, representing a potential opportunity for libraries and other facilities to provide training and assistance for this growing user group. PMID:16888663
RNA sequencing uncovers antisense RNAs and novel small RNAs in Streptococcus pyogenes
Le Rhun, Anaïs; Beer, Yan Yan; Reimegård, Johan; Chylinski, Krzysztof; Charpentier, Emmanuelle
2016-01-01
Streptococcus pyogenes is a human pathogen responsible for a wide spectrum of diseases ranging from mild to life-threatening infections. During the infectious process, the temporal and spatial expression of pathogenicity factors is tightly controlled by a complex network of protein and RNA regulators acting in response to various environmental signals. Here, we focus on the class of small RNA regulators (sRNAs) and present the first complete analysis of sRNA sequencing data in S. pyogenes. In the SF370 clinical isolate (M1 serotype), we identified 197 and 428 putative regulatory RNAs by visual inspection and bioinformatics screening of the sequencing data, respectively. Only 35 of the 197 candidates identified by visual screening were assigned a predicted function (T-boxes, ribosomal protein leaders, characterized riboswitches or sRNAs), indicating how little is known about sRNA regulation in S. pyogenes. By comparing our list of predicted sRNAs with previous S. pyogenes sRNA screens using bioinformatics or microarrays, 92 novel sRNAs were revealed, including antisense RNAs that are for the first time shown to be expressed in this pathogen. We experimentally validated the expression of 30 novel sRNAs and antisense RNAs. We show that the expression profile of 9 sRNAs including 2 predicted regulatory elements is affected by the endoribonucleases RNase III and/or RNase Y, highlighting the critical role of these enzymes in sRNA regulation. PMID:26580233
Zong, Yanan; Liu, Ning; Ma, Shanshan; Bai, Ying; Guan, Fangxia; Kong, Xiangdong
2018-08-20
Phenylketonuria (PKU) is the most common inherited metabolic disease, an autosomal recessive disorder affecting >10,000 newborns each year globally. It can be caused by over 1000 different naturally occurring mutations in the phenylalanine hydroxylase (PAH) gene. We analyzed three novel naturally occurring PAH gene variants: p.Glu178Lys (c.532G>A), p.Val245Met (c.733G>A) and p.Ser250Phe (c.749C>T). The mutant effect on the PAH enzyme structure and function was predicted by bioinformatics software. Vectors expressing the corresponding PAH variants were generated for expression in E. coli and in HEK293T cells. The RNA expression of the three PAH variants was measured by quantitative reverse transcription polymerase chain reaction (RT-qPCR). The mutant PAH protein levels were determined by sodium dodecyl sulfate-polyacrylamide gel electrophoresis (SDS-PAGE), western blot and enzyme-linked immunosorbent assay (ELISA). All three variants were predicted to be pathogenic by bioinformatics analysis. The transcription of the three PAH variants was similar to the wild type PAH gene in HEK293T cells. In contrast, the levels of mutant PAH proteins decreased significantly compared to the wild type control, in both E. coli and HEK293T cells. Our results indicate that the three novel PAH gene variants (p.Glu178Lys, p.Val245Met, p.Ser250Phe) impair PAH protein expression and function in prokaryotic and eukaryotic cells. Copyright © 2018. Published by Elsevier B.V.
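The paired protein-level and cDNA-level names of the three variants above are linked by simple arithmetic, which the following sketch checks. It assumes the usual convention that coding position c.1 is the first base of the start codon, so that each codon spans three consecutive coding positions; the function names are hypothetical.

```python
def codon_number(cdna_pos: int) -> int:
    """Codon (amino-acid) number containing a 1-based coding position."""
    return (cdna_pos - 1) // 3 + 1


def position_in_codon(cdna_pos: int) -> int:
    """Position (1, 2 or 3) of the base inside its codon."""
    return (cdna_pos - 1) % 3 + 1


# Variants as named in the abstract: protein-level name -> cDNA position.
variants = {"p.Glu178Lys": 532, "p.Val245Met": 733, "p.Ser250Phe": 749}

for name, pos in variants.items():
    print(f"{name}: c.{pos} falls in codon {codon_number(pos)}, "
          f"base {position_in_codon(pos)} of the codon")
```

Under that assumption, c.532, c.733 and c.749 map to codons 178, 245 and 250, matching the residue numbers in the protein-level names.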
Park, Hyun-Seok
2012-12-01
Whereas a vast amount of new information on bioinformatics is made available to the public through patents, only a small set of patents are cited in academic papers. A detailed analysis of registered bioinformatics patents, using the existing patent search system, can provide valuable information links between science and technology. However, it is extremely difficult to select keywords to capture bioinformatics patents, reflecting the convergence of several underlying technologies. No single word or even several words are sufficient to identify such patents. The analysis of patent subclasses can provide valuable information. In this paper, I did a preliminary study of the current status of bioinformatics patents and their International Patent Classification (IPC) groups registered in the Korea Intellectual Property Rights Information Service (KIPRIS) database.
Development of Bioinformatics Infrastructure for Genomics Research.
Mulder, Nicola J; Adebiyi, Ezekiel; Adebiyi, Marion; Adeyemi, Seun; Ahmed, Azza; Ahmed, Rehab; Akanle, Bola; Alibi, Mohamed; Armstrong, Don L; Aron, Shaun; Ashano, Efejiro; Baichoo, Shakuntala; Benkahla, Alia; Brown, David K; Chimusa, Emile R; Fadlelmola, Faisal M; Falola, Dare; Fatumo, Segun; Ghedira, Kais; Ghouila, Amel; Hazelhurst, Scott; Isewon, Itunuoluwa; Jung, Segun; Kassim, Samar Kamal; Kayondo, Jonathan K; Mbiyavanga, Mamana; Meintjes, Ayton; Mohammed, Somia; Mosaku, Abayomi; Moussa, Ahmed; Muhammd, Mustafa; Mungloo-Dilmohamud, Zahra; Nashiru, Oyekanmi; Odia, Trust; Okafor, Adaobi; Oladipo, Olaleye; Osamor, Victor; Oyelade, Jellili; Sadki, Khalid; Salifu, Samson Pandam; Soyemi, Jumoke; Panji, Sumir; Radouani, Fouzia; Souiai, Oussama; Tastan Bishop, Özlem
2017-06-01
Although pockets of bioinformatics excellence have developed in Africa, generally, large-scale genomic data analysis has been limited by the availability of expertise and infrastructure. H3ABioNet, a pan-African bioinformatics network, was established to build capacity specifically to enable H3Africa (Human Heredity and Health in Africa) researchers to analyze their data in Africa. Since the inception of the H3Africa initiative, H3ABioNet's role has evolved in response to changing needs from the consortium and the African bioinformatics community. H3ABioNet set out to develop core bioinformatics infrastructure and capacity for genomics research in various aspects of data collection, transfer, storage, and analysis. Various resources have been developed to address genomic data management and analysis needs of H3Africa researchers and other scientific communities on the continent. NetMap was developed and used to build an accurate picture of network performance within Africa and between Africa and the rest of the world, and Globus Online has been rolled out to facilitate data transfer. A participant recruitment database was developed to monitor participant enrollment, and data is being harmonized through the use of ontologies and controlled vocabularies. The standardized metadata will be integrated to provide a search facility for H3Africa data and biospecimens. Because H3Africa projects are generating large-scale genomic data, facilities for analysis and interpretation are critical. H3ABioNet is implementing several data analysis platforms that provide a large range of bioinformatics tools or workflows, such as Galaxy, the Job Management System, and eBiokits. A set of reproducible, portable, and cloud-scalable pipelines to support the multiple H3Africa data types are also being developed and dockerized to enable execution on multiple computing infrastructures. 
In addition, new tools have been developed for analysis of the uniquely divergent African data and for downstream interpretation of prioritized variants. To provide support for these and other bioinformatics queries, an online bioinformatics helpdesk backed by broad consortium expertise has been established. Further support is provided by means of various modes of bioinformatics training. For the past 4 years, the development of infrastructure support and human capacity through H3ABioNet has significantly contributed to the establishment of African scientific networks, data analysis facilities, and training programs. Here, we describe the infrastructure and how it has affected genomics and bioinformatics research in Africa. Copyright © 2017 World Heart Federation (Geneva). Published by Elsevier B.V. All rights reserved.
TEcandidates: Prediction of genomic origin of expressed Transposable Elements using RNA-seq data.
Valdebenito-Maturana, Braulio; Riadi, Gonzalo
2018-06-01
In recent years, Transposable Elements (TEs) have been related to gene regulation. However, estimating the origin of expression of TEs through RNA-seq is complicated by multimapping reads coming from their repetitive sequences. Current approaches that address multimapping reads are focused on expression quantification and not on finding the origin of expression. Addressing the genomic origin of expressed TEs could further aid in understanding the role that TEs might have in the cell. We have developed a new pipeline called TEcandidates, based on de novo transcriptome assembly to assess the instances of TEs being expressed, along with their location, to include in downstream DE analysis. TEcandidates takes as input the RNA-seq data, the genome sequence and the TE annotation file, and returns a list of coordinates of candidate TEs being expressed, the TEs that have been removed, and the genome sequence with removed TEs as masked. This masked genome is suited to include TEs in downstream expression analysis, as the ambiguity of reads coming from TEs is significantly reduced in the mapping step of the analysis. The script which runs the pipeline can be downloaded at http://www.mobilomics.org/tecandidates/downloads or http://github.com/TEcandidates/TEcandidates. Contact: griadi@utalca.cl. Supplementary data are available at Bioinformatics online.
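The idea of returning a genome in which removed TEs are masked can be sketched as follows. This is not the TEcandidates code: the function name, the use of "N" as the masking character, and the 0-based half-open coordinate convention are all assumptions made for illustration.

```python
def mask_intervals(sequence: str, intervals) -> str:
    """Return `sequence` with every base inside the given intervals set to 'N'.

    `intervals` holds (start, end) pairs, assumed 0-based and half-open,
    e.g. the coordinates of TE copies that are not candidates for expression.
    """
    bases = list(sequence)
    for start, end in intervals:
        # Clip to the sequence bounds so out-of-range annotations are harmless.
        for i in range(max(start, 0), min(end, len(bases))):
            bases[i] = "N"
    return "".join(bases)


# Toy chromosome and toy TE coordinates (both invented).
genome = "ACGTACGTAC"
removed_tes = [(2, 5)]
masked = mask_intervals(genome, removed_tes)
print(masked)
```

Reads that originate from a masked TE copy can no longer align to it, which is how masking reduces ambiguity in the mapping step.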
Wan, Changrong; Yin, Peng; Xu, Xiaolong; Liu, Mingjiang; He, Shasha; Song, Shixiu; Liu, Fenghua; Xu, Jianqin
2014-04-01
The present study investigated the effects of simulated transport stress on morphology and gene expression in the small intestine of laboratory rats. Sprague Dawley rats were subjected to 35°C and 0.1×g on a constant temperature shaker for physiological, biochemical, morphological and microarray analysis before and after treatment. The treatment induced obvious stress responses with significant decreases in body weight (P<0.01), increases in rectal temperature, serum corticosterone (CORT), serum glucose (GLU), creatine kinase (CK) and lactate dehydrogenase (LDH) levels (P<0.01), as well as expression of Hsp27/70/90 mRNA (P<0.05; P<0.01). The rat jejunum was severely damaged and apoptotic after mimicking transport stress, which may mainly be related to cell death, oxidation reduction and hormone imbalance determined by microarray analysis. The bioinformatics analysis from the present study would provide insight into the potential mechanisms underlying transport stress-induced injury in the rat small intestine. Copyright © 2014 Elsevier Ltd. All rights reserved.
An alternative splicing program promotes adipose tissue thermogenesis
Vernia, Santiago; Edwards, Yvonne JK; Han, Myoung Sook; Cavanagh-Kyros, Julie; Barrett, Tamera; Kim, Jason K; Davis, Roger J
2016-01-01
Alternative pre-mRNA splicing expands the complexity of the transcriptome and controls isoform-specific gene expression. Whether alternative splicing contributes to metabolic regulation is largely unknown. Here we investigated the contribution of alternative splicing to the development of diet-induced obesity. We found that obesity-induced changes in adipocyte gene expression include alternative pre-mRNA splicing. Bioinformatics analysis associated part of this alternative splicing program with sequence specific NOVA splicing factors. This conclusion was confirmed by studies of mice with NOVA deficiency in adipocytes. Phenotypic analysis of the NOVA-deficient mice demonstrated increased adipose tissue thermogenesis and improved glycemia. We show that NOVA proteins mediate a splicing program that suppresses adipose tissue thermogenesis. Together, these data provide quantitative analysis of gene expression at exon-level resolution in obesity and identify a novel mechanism that contributes to the regulation of adipose tissue function and the maintenance of normal glycemia. DOI: http://dx.doi.org/10.7554/eLife.17672.001 PMID:27635635
Human, vector and parasite Hsp90 proteins: A comparative bioinformatics analysis.
Faya, Ngonidzashe; Penkler, David L; Tastan Bishop, Özlem
2015-01-01
The treatment of protozoan parasitic diseases is challenging, and thus identification and analysis of new drug targets is important. Parasites survive within host organisms, and some need intermediate hosts to complete their life cycle. Changing host environment puts stress on parasites, and often adaptation is accompanied by the expression of large amounts of heat shock proteins (Hsps). Among Hsps, Hsp90 proteins play an important role in stress environments. Yet, there has been little computational research on Hsp90 proteins to analyze them comparatively as potential parasitic drug targets. Here, an attempt was made to gain detailed insights into the differences between host, vector and parasitic Hsp90 proteins by large-scale bioinformatics analysis. A total of 104 Hsp90 sequences were divided into three groups based on their cellular localizations; namely cytosolic, mitochondrial and endoplasmic reticulum (ER). Further, the parasitic proteins were divided according to the type of parasite (protozoa, helminth and ectoparasite). Primary sequence analysis, phylogenetic tree calculations, motif analysis and physicochemical properties of Hsp90 proteins suggested that despite the overall structural conservation of these proteins, parasitic Hsp90 proteins have unique features which differentiate them from human ones, thus encouraging the idea that protozoan Hsp90 proteins should be further analyzed as potential drug targets.
Radulović, Željko; Porter, Lindsay M.; Kim, Tae K.; Mulenga, Albert
2015-01-01
Organic anion-transporting polypeptides (Oatps) are an integral part of the detoxification mechanism in vertebrates and invertebrates. These cell surface proteins are involved in mediating the sodium-independent uptake and/or distribution of a broad array of organic amphipathic compounds and xenobiotic drugs. This study describes bioinformatics and biological characterization of 9 Oatp sequences in the Ixodes scapularis genome. These sequences have been annotated on the basis of 12 transmembrane domains, consensus motif D-X-RW-(I,V)-GAWW-X-G-(F,L)-L, and 11 conserved cysteine amino acid residues in the large extracellular loop 5 that characterize the Oatp superfamily. Ixodes scapularis Oatps may regulate non-redundant cross-tick species conserved functions in that they did not cluster as a monolithic group on the phylogeny tree and that they have orthologs in other ticks. Phylogeny clustering patterns also suggest that some tick Oatp sequences transport substrates that are similar to those of body louse, mosquito, eye worm, and filarial worm Oatps. Semi-quantitative RT-PCR analysis demonstrated that all 9 I. scapularis Oatp sequences were expressed during tick feeding. Ixodes scapularis Oatp genes potentially regulate functions during early and/or late-stage tick feeding as revealed by normalized mRNA profiles. Normalized transcript abundance indicates that I. scapularis Oatp genes are strongly expressed in unfed ticks during the first 24 h of feeding and/or at the end of the tick feeding process. Except for 2 I. scapularis Oatps, which were expressed in the salivary glands and ovaries, all other genes were expressed in all tested organs, suggesting the significance of I. scapularis Oatps in maintaining tick homeostasis. Different I. scapularis Oatp mRNA expression patterns were detected and discussed with reference to different physiological states of unfed and feeding ticks. PMID:24582512
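The consensus motif quoted in this abstract, D-X-RW-(I,V)-GAWW-X-G-(F,L)-L, can be read as a regular expression in which X is any residue and parenthesized letters are alternatives. The following is a minimal sketch of such a motif scan, not the authors' annotation pipeline; the function name and the toy sequences are hypothetical and are not real I. scapularis proteins.

```python
import re

# Consensus motif from the abstract: D-X-RW-(I,V)-GAWW-X-G-(F,L)-L
# X = any amino acid; (I,V) and (F,L) = either residue at that position.
OATP_MOTIF = re.compile(r"D[A-Z]RW[IV]GAWW[A-Z]G[FL]L")


def find_oatp_motif(protein_seq):
    """Return (start_index, matched_text) for each motif hit (0-based index)."""
    return [(m.start(), m.group()) for m in OATP_MOTIF.finditer(protein_seq.upper())]


# Hypothetical toy sequences used only to exercise the pattern.
hits = find_oatp_motif("MKTDSRWIGAWWLGFLAAA")     # motif present
no_hits = find_oatp_motif("MKTDSRWAGAWWLGFLAAA")  # A where I or V is required
```

A scan like this only flags candidate sequences; the abstract also relies on the 12 transmembrane domains and the 11 conserved cysteines for annotation.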
Stephens, Susie M.; Chen, Jake Y.; Davidson, Marcel G.; Thomas, Shiby; Trute, Barry M.
2005-01-01
As database management systems expand their array of analytical functionality, they become powerful research engines for biomedical data analysis and drug discovery. Databases can hold most of the data types commonly required in life sciences and consequently can be used as flexible platforms for the implementation of knowledgebases. Performing data analysis in the database simplifies data management by minimizing the movement of data from disks to memory, allowing pre-filtering and post-processing of datasets, and enabling data to remain in a secure, highly available environment. This article describes the Oracle Database 10g implementation of BLAST and Regular Expression Searches and provides case studies of their usage in bioinformatics. http://www.oracle.com/technology/software/index.html PMID:15608287
[Genome-wide identification and expression analysis of auxin-related gene families in grape].
Yuan, Hua-zhao; Zhao, Mi-zhen; Wu, Wei-min; Yu, Hong-Mei; Qian, Ya-ming; Wang, Zhuang-wei; Wang, Xi-cheng
2015-07-01
The auxin response gene family adjusts the auxin balance and the growth hormone signaling pathways in plants. Using bioinformatics methods, the auxin-response genes were identified from the grape genome database, and their chromosomal locations, gene collinearity and phylogenetic relationships were analyzed. Probable genes include 25 AUX_IAA, 19 ARF, 9 GH3 and 42 LBD genes, which are unevenly distributed across all 19 chromosomes; some of them form distinct tandem duplicate gene clusters. The available grape microarray databases show that all of the auxin-response genes are expressed in fruit and leaf buds, and are significantly overexpressed during the fruit color-changing, bud break and bud dormancy periods. This paper provides a resource for functional studies of auxin-response genes in grape leaf and fruit development.
Taking Bioinformatics to Systems Medicine.
van Kampen, Antoine H C; Moerland, Perry D
2016-01-01
Systems medicine promotes a range of approaches and strategies to study human health and disease at a systems level with the aim of improving the overall well-being of (healthy) individuals, and preventing, diagnosing, or curing disease. In this chapter we discuss how bioinformatics critically contributes to systems medicine. First, we explain the role of bioinformatics in the management and analysis of data. In particular we show the importance of publicly available biological and clinical repositories to support systems medicine studies. Second, we discuss how the integration and analysis of multiple types of omics data through integrative bioinformatics may facilitate the determination of more predictive and robust disease signatures, lead to a better understanding of (patho)physiological molecular mechanisms, and facilitate personalized medicine. Third, we focus on network analysis and discuss how gene networks can be constructed from omics data and how these networks can be decomposed into smaller modules. We discuss how the resulting modules can be used to generate experimentally testable hypotheses, provide insight into disease mechanisms, and lead to predictive models. Throughout, we provide several examples demonstrating how bioinformatics contributes to systems medicine and discuss future challenges in bioinformatics that need to be addressed to enable the advancement of systems medicine.
Workflows for microarray data processing in the Kepler environment.
Stropp, Thomas; McPhillips, Timothy; Ludäscher, Bertram; Bieda, Mark
2012-05-17
Microarray data analysis has been the subject of extensive and ongoing pipeline development due to its complexity, the availability of several options at each analysis step, and the development of new analysis demands, including integration with new data sources. Bioinformatics pipelines are usually custom built for different applications, making them typically difficult to modify, extend and repurpose. Scientific workflow systems are intended to address these issues by providing general-purpose frameworks in which to develop and execute such pipelines. The Kepler workflow environment is a well-established system under continual development that is employed in several areas of scientific research. Kepler provides a flexible graphical interface, featuring clear display of parameter values, for design and modification of workflows. It has capabilities for developing novel computational components in the R, Python, and Java programming languages, all of which are widely used for bioinformatics algorithm development, along with capabilities for invoking external applications and using web services. We developed a series of fully functional bioinformatics pipelines addressing common tasks in microarray processing in the Kepler workflow environment. These pipelines consist of a set of tools for GFF file processing of NimbleGen chromatin immunoprecipitation on microarray (ChIP-chip) datasets and more comprehensive workflows for Affymetrix gene expression microarray bioinformatics and basic primer design for PCR experiments, which are often used to validate microarray results. Although functional in themselves, these workflows can be easily customized, extended, or repurposed to match the needs of specific projects and are designed to be a toolkit and starting point for specific applications. 
These workflows illustrate a workflow programming paradigm focusing on local resources (programs and data) and therefore are close to traditional shell scripting or R/BioConductor scripting approaches to pipeline design. Finally, we suggest that microarray data processing task workflows may provide a basis for future example-based comparison of different workflow systems. We provide a set of tools and complete workflows for microarray data analysis in the Kepler environment, which has the advantages of offering graphical, clear display of conceptual steps and parameters and the ability to easily integrate other resources such as remote data and web services.
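As a rough illustration of the kind of local GFF processing step such a workflow might wrap, the sketch below parses tab-separated GFF lines and keeps features above a score threshold. It is an assumption-laden example, not one of the published Kepler components: the record type, function names, threshold and sample lines are all hypothetical.

```python
from typing import Iterable, Iterator, List, NamedTuple


class GffRecord(NamedTuple):
    """One GFF feature line (the nine standard tab-separated columns)."""
    seqid: str
    source: str
    feature: str
    start: int
    end: int
    score: float
    strand: str
    frame: str
    attributes: str


def parse_gff(lines: Iterable[str]) -> Iterator[GffRecord]:
    """Yield a record per feature line, skipping comments and malformed lines."""
    for line in lines:
        line = line.rstrip("\n")
        if not line or line.startswith("#"):
            continue
        cols = line.split("\t")
        if len(cols) != 9:
            continue
        # "." means "no score"; store it as NaN so comparisons are False.
        score = float("nan") if cols[5] == "." else float(cols[5])
        yield GffRecord(cols[0], cols[1], cols[2], int(cols[3]), int(cols[4]),
                        score, cols[6], cols[7], cols[8])


def filter_by_score(records: Iterable[GffRecord], min_score: float) -> List[GffRecord]:
    """Keep records whose score is at least min_score (unscored records drop out)."""
    return [r for r in records if r.score >= min_score]


# Hypothetical toy input, not real ChIP-chip data.
sample = [
    "# toy data",
    "chr1\tNimbleGen\tpeak\t100\t250\t2.5\t+\t.\tID=p1",
    "chr1\tNimbleGen\tpeak\t400\t520\t0.8\t+\t.\tID=p2",
    "chr2\tNimbleGen\tpeak\t10\t90\t.\t.\t.\tID=p3",
]
strong = filter_by_score(parse_gff(sample), min_score=1.0)
```

In a workflow system, each function would typically become its own component so that the threshold appears as a visible, editable parameter.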
Buying in to bioinformatics: an introduction to commercial sequence analysis software.
Smith, David Roy
2015-07-01
Advancements in high-throughput nucleotide sequencing techniques have brought with them state-of-the-art bioinformatics programs and software packages. Given the importance of molecular sequence data in contemporary life science research, these software suites are becoming an essential component of many labs and classrooms, and as such are frequently designed for non-computer specialists and marketed as one-stop bioinformatics toolkits. Although beautifully designed and powerful, user-friendly bioinformatics packages can be expensive and, as more arrive on the market each year, it can be difficult for researchers, teachers and students to choose the right software for their needs, especially if they do not have a bioinformatics background. This review highlights some of the currently available and most popular commercial bioinformatics packages, discussing their prices, usability, features and suitability for teaching. Although several commercial bioinformatics programs are arguably overpriced and overhyped, many are well designed, sophisticated and, in my opinion, worth the investment. If you are just beginning your foray into molecular sequence analysis or an experienced genomicist, I encourage you to explore proprietary software bundles. They have the potential to streamline your research, increase your productivity, energize your classroom and, if anything, add a bit of zest to the often dry detached world of bioinformatics. © The Author 2014. Published by Oxford University Press. PMID:25183247
Vidak, Marko; Jovcevska, Ivana; Samec, Neja; Zottel, Alja; Liovic, Mirjana; Rozman, Damjana; Dzeroski, Saso; Juvan, Peter; Komel, Radovan
2018-05-04
Glioblastoma (GB) is the most aggressive brain malignancy. Although some potential glioblastoma biomarkers have already been identified, there is a lack of cell membrane-bound biomarkers capable of distinguishing brain tissue from glioblastoma and/or glioblastoma stem cells (GSC), which are responsible for the rapid post-operative tumor reoccurrence. In order to find new GB/GSC marker candidates that would be cell surface proteins (CSP), we have performed meta-analysis of genome-scale mRNA expression data from three data repositories (GEO, ArrayExpress and GLIOMASdb). The search yielded ten appropriate datasets, and three (GSE4290/GDS1962, GSE23806/GDS3885, and GLIOMASdb) were used for selection of new GB/GSC marker candidates, while the other seven (GSE4412/GDS1975, GSE4412/GDS1976, E-GEOD-52009, E-GEOD-68848, E-GEOD-16011, E-GEOD-4536, and E-GEOD-74571) were used for bioinformatic validation. The selection identified four new CSP-encoding candidate genes (CD276, FREM2, SPRY1, and SLC47A1), and the bioinformatic validation confirmed these findings. A review of the literature revealed that CD276 is not a novel candidate, while SLC47A1 had lower validation test scores than the other new candidates and was therefore not considered for experimental validation. This validation revealed that the expression of FREM2, but not SPRY1, is higher in glioblastoma cell lines when compared to non-malignant astrocytes. In addition, FREM2 gene and protein expression levels are higher in GB stem-like cell lines than in conventional glioblastoma cell lines. FREM2 is thus proposed as a novel GB biomarker and a putative biomarker of glioblastoma stem cells. Both FREM2 and SPRY1 are expressed on the surface of the GB cells, while SPRY1 alone was found overexpressed in the cytosol of non-malignant astrocytes.
A Microarray Tool Provides Pathway and GO Term Analysis.
Koch, Martin; Royer, Hans-Dieter; Wiese, Michael
2011-12-01
Analysis of gene expression profiles is no longer exclusively a task for bioinformatic experts. However, gaining statistically significant results is challenging and requires both biological knowledge and computational know-how. Here we present a novel, user-friendly microarray reporting tool called maRt. The software provides access to bioinformatic resources, such as gene ontology terms and biological pathways, through the DAVID and BioMart web services. Results are summarized in structured HTML reports, each presenting a different layer of information. In these reports, contents from diverse sources are integrated and interlinked. To speed up processing, maRt takes advantage of the multi-core technology of modern desktop computers by using parallel processing. Since the software is built upon an RCP infrastructure, it might serve as a starting point for developers aiming to integrate novel R-based applications. Installer, documentation and various kinds of tutorials are available under LGPL license at the website of our institute http://www.pharma.uni-bonn.de/www/mart. This software is free for academic use. Copyright © 2011 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim.
Saliva Proteomics Analysis Offers Insights on Type 1 Diabetes Pathology in a Pediatric Population
Pappa, Eftychia; Vastardis, Heleni; Mermelekas, George; Gerasimidi-Vazeou, Andriani; Zoidakis, Jerome; Vougas, Konstantinos
2018-01-01
The composition of the salivary proteome is affected by pathological conditions. Using high-resolution mass spectrometry approaches, we analyzed saliva samples collected from children and adolescents with type 1 diabetes and from healthy controls. The list of more than 2000 high-confidence protein identifications constitutes a comprehensive characterization of the salivary proteome. Patients with good glycemic regulation and healthy individuals have comparable proteomic profiles. In contrast, a significant number of differentially expressed proteins were identified in the saliva of patients with poor glycemic regulation compared to patients with good glycemic control and healthy children. These proteins are involved in biological processes relevant to diabetic pathology such as endothelial damage and inflammation. Moreover, a putative preventive therapeutic approach was identified based on bioinformatic analysis of the deregulated salivary proteins. Thus, thorough characterization of saliva proteins in diabetic pediatric patients established a connection between molecular changes and disease pathology. This proteomic and bioinformatic approach highlights the potential of salivary diagnostics in diabetes pathology and opens the way for preventive treatment of the disease. PMID:29755368
Ooka, Hideshi; Hashimoto, Kazuhito; Nakamura, Ryuhei
2018-05-14
Understanding the design strategy of photosynthetic and respiratory enzymes is important to develop efficient artificial catalysts for oxygen evolution and reduction reactions. Here, based on a bioinformatic analysis of cyanobacterial oxygen evolution and reduction enzymes (photosystem II: PS II and cytochrome c oxidase: COX, respectively), the gene encoding the catalytic D1 subunit of PS II was found to be expressed individually across 38 phylogenetically diverse strains, which is in contrast to the operon structure of the genes encoding major COX subunits. Selective synthesis of the D1 subunit minimizes the repair cost of PS II, which allows compensation for its instability by lowering the turnover number required to generate a net positive energy yield. The different bioenergetics observed between PS II and COX suggest that in addition to the catalytic activity rationalized by the Sabatier principle, stability factors have also provided a major influence on the design strategy of biological multi-electron transfer enzymes. © 2018 Wiley-VCH Verlag GmbH & Co. KGaA, Weinheim.
Molecular cloning and bioinformatics analysis of lactate dehydrogenase from Taenia multiceps.
Guo, Cheng; Wang, Yu; Huang, Xing; Wang, Ning; Yan, Ming; He, Ran; Gu, Xiaobin; Xie, Yue; Lai, Weimin; Jing, Bo; Peng, Xuerong; Yang, Guangyou
2017-10-01
Coenurus cerebralis, the larval stage (metacestode or coenurus) of Taenia multiceps, parasitizes sheep, goats, and other ruminants and causes coenurosis. In this study, we isolated and characterized complementary DNAs that encode lactate dehydrogenase A (Tm-LDHA) and B (Tm-LDHB) from the transcriptome of T. multiceps and expressed recombinant Tm-LDHB (rTm-LDHB) in Escherichia coli. Bioinformatic analysis showed that both Tm-LDH genes (LDHA and LDHB) contain a 996-bp open reading frame and encode a protein of 331 amino acids. After determination of the immunogenicity of the recombinant Tm-LDHB, an indirect enzyme-linked immunosorbent assay (ELISA) was developed for preliminary evaluation of the serodiagnostic potential of rTm-LDHB in goats. However, the rTm-LDHB-based indirect ELISA developed here exhibited specificity of only 71.42% (10/14) and sensitivity of 1:3200 in detection of goats infected with T. multiceps in the field. This study is the first to describe LDHA and LDHB of T. multiceps; meanwhile, our results indicate that rTm-LDHB is not a specific antigen candidate for immunodiagnosis of T. multiceps infection in goats.
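The figures in this abstract can be checked with simple arithmetic: a 996-bp open reading frame holds 332 codons, and if the last one is a stop codon (an assumption, since the abstract does not say so explicitly) the encoded protein has 331 residues. The sketch below shows that calculation together with the 10/14 specificity ratio; the function names are hypothetical.

```python
def protein_length_from_orf(orf_bp, includes_stop=True):
    """Number of amino acids encoded by an ORF of orf_bp base pairs.

    Assumes the ORF length counts the stop codon unless includes_stop is False.
    """
    if orf_bp % 3 != 0:
        raise ValueError("ORF length must be a multiple of 3")
    codons = orf_bp // 3
    return codons - 1 if includes_stop else codons


def percentage(numerator, denominator):
    """Express a ratio as a percentage."""
    return 100.0 * numerator / denominator


tm_ldh_length = protein_length_from_orf(996)  # 332 codons minus the stop codon
specificity = percentage(10, 14)              # the 10/14 ratio reported for the ELISA
```

The ratio 10/14 is about 71.4%, which matches the reported specificity to the first decimal place.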
NaderiSoorki, Maryam; Galehdari, Hamid; Baradaran, Masomeh; Jalali, Amir
2016-09-15
Scorpion venom contains a mixture of biological molecules, including selective toxins with medical potential. Odontobuthus doriae (O. doriae) belongs to the Buthidae family of scorpions and has attracted growing interest among dangerous Iranian scorpions since 2005. We constructed the first cDNA library to explore the transcriptomic composition of the telson of this Iranian scorpion. Each expressed sequence tag (EST) from the library was then analyzed with bioinformatic software and its identity determined. The analysis showed that toxins (42%) accounted for more venom transcripts than other components such as antimicrobial peptides, venom peptides and cell proteins. Over 16% of transcripts did not have any open reading frame (ORF); however, their sequences showed similarity to other scorpion sequences. One EST did not show similarity to any known scorpion peptide. We report, for the first time, a comprehensive study of an Iranian scorpion with interesting and novel findings. We characterized a new putative sodium channel modifier in scorpions using bioinformatics software, and then predicted its structure and function. Copyright © 2016. Published by Elsevier Ltd.
2013-01-01
Background To understand the carcinogenesis caused by accumulated genetic and epigenetic alterations and seek novel biomarkers for various cancers, studying differentially expressed genes between cancerous and normal tissues is crucial. In the study, two cDNA libraries of lung cancer were constructed and screened for identification of differentially expressed genes. Methods Two cDNA libraries of differentially expressed genes were constructed using lung adenocarcinoma tissue and adjacent nonmalignant lung tissue by suppression subtractive hybridization. The data of the cDNA libraries were then analyzed and compared using bioinformatics analysis. Levels of mRNA and protein were measured by quantitative real-time polymerase chain reaction (q-RT-PCR) and western blot respectively, and the expression and localization of proteins were determined by immunostaining. Gene functions were investigated using proliferation and migration assays after gene silencing and gene over-expression. Results Two libraries of differentially expressed genes were obtained. The forward-subtracted library (FSL) and the reverse-subtracted library (RSL) contained 177 and 59 genes, respectively. Bioinformatic analysis demonstrated that these genes were involved in a wide range of cellular functions. The vast majority of these genes were newly identified to be abnormally expressed in lung cancer. In the first stage of the screening for 16 genes, we compared lung cancer tissues with their adjacent non-malignant tissues at the mRNA level, and found six genes (ERGIC3, DDR1, HSP90B1, SDC1, RPSA, and LPCAT1) from the FSL were significantly up-regulated while two genes (GPX3 and TIMP3) from the RSL were significantly down-regulated (P < 0.05). The ERGIC3 protein was also over-expressed in lung cancer tissues and cultured cells, and expression of ERGIC3 was correlated with the degree of differentiation and histological type of lung cancer. 
The up-regulation of ERGIC3 could promote cellular migration and proliferation in vitro. Conclusions The two libraries of differentially expressed genes may provide the basis for new insights or clues for finding novel lung cancer-related genes; several genes were newly found in lung cancer with ERGIC3 seeming a novel lung cancer-related gene. ERGIC3 may play an active role in the development and progression of lung cancer. PMID:23374247
PmiRExAt: plant miRNA expression atlas database and web applications
Gurjar, Anoop Kishor Singh; Panwar, Abhijeet Singh; Gupta, Rajinder; Mantri, Shrikant S.
2016-01-01
High-throughput small RNA (sRNA) sequencing technology enables an entirely new perspective for plant microRNA (miRNA) research and has immense potential to unravel regulatory networks. Novel insights gained through data mining in the rich, publicly available sRNA data will help in designing biotechnology-based approaches for crop improvement to enhance plant yield and nutritional value. Bioinformatics resources enabling meta-analysis of miRNA expression across multiple plant species are still evolving. Here, we report PmiRExAt, a new online database resource that provides a plant miRNA expression atlas. The web-based repository comprises miRNA expression profiles and a query tool for 1859 wheat, 2330 rice and 283 maize miRNAs. The database interface offers open and easy access to miRNA expression profiles and helps in identifying tissue-preferential, differential and constitutively expressed miRNAs. A feature enabling expression study of conserved miRNAs across multiple species is also implemented. A custom expression analysis feature enables expression analysis of novel miRNAs in a total of 117 datasets. New sRNA datasets can also be uploaded for analysing miRNA expression profiles for 73 plant species. The PmiRExAt application program interface, a Simple Object Access Protocol (SOAP) web service, allows other programmers to remotely invoke the methods written for doing programmatic search operations on the PmiRExAt database. Database URL: http://pmirexat.nabi.res.in. PMID:27081157
Is there room for ethics within bioinformatics education?
Taneri, Bahar
2011-07-01
When bioinformatics education is considered, several issues are addressed. At the undergraduate level, the main issue revolves around conveying information from two main and different fields: biology and computer science. At the graduate level, the main issue is bridging the gap between biology students and computer science students. However, there is an educational component that is rarely addressed within the context of bioinformatics education: the ethics component. Here, a different perspective is provided on bioinformatics education, and the current status of ethics is analyzed within the existing bioinformatics programs. Analysis of the existing undergraduate and graduate programs, in both Europe and the United States, reveals the minimal attention given to ethics within bioinformatics education. Given that bioinformaticians speedily and effectively shape the biomedical sciences and hence their implications for society, here redesigning of the bioinformatics curricula is suggested in order to integrate the necessary ethics education. Unique ethical problems awaiting bioinformaticians and bioinformatics ethics as a separate field of study are discussed. In addition, a template for an "Ethics in Bioinformatics" course is provided.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Lo, Chien-Chi
2015-08-03
Edge Bioinformatics is a developmental bioinformatics and data management platform which seeks to supply laboratories with bioinformatics pipelines for analyzing data associated with common samples case goals. Edge Bioinformatics enables sequencing as a solution and forward-deployed situations where human resources, space, bandwidth, and time are limited. The Edge bioinformatics pipeline was designed based on the following use cases and is specific to Illumina sequencing reads. 1. Assay performance adjudication (PCR): Analysis of an existing PCR assay in a genomic context, and automated design of a new assay to resolve conflicting results; 2. Clinical presentation with extreme symptoms: Characterization of a known pathogen or co-infection with a. Novel emerging disease outbreak or b. Environmental surveillance
Liu, Qun; Peng, Yong-Bo; Qi, Lian-Wen; Cheng, Xiao-Lan; Xu, Xiao-Jun; Liu, Le-Le; Liu, E-Hu; Li, Ping
2012-01-01
Cervical cancer is one of the most common cancers among women in the world. 6-Shogaol is a natural compound isolated from the rhizome of ginger (Zingiber officinale). In this paper, we demonstrated that 6-shogaol induced apoptosis and G2/M phase arrest in human cervical cancer HeLa cells. Endoplasmic reticulum stress and mitochondrial pathway were involved in 6-shogaol-mediated apoptosis. Proteomic analysis based on label-free strategy by liquid chromatography chip quadrupole time-of-flight mass spectrometry was subsequently proposed to identify, in a non-target-biased manner, the molecular changes in cellular proteins in response to 6-shogaol treatment. A total of 287 proteins were differentially expressed in response to 24 h treatment with 15 μM 6-shogaol in HeLa cells. Significantly changed proteins were subjected to functional pathway analysis by multiple analyzing software. Ingenuity pathway analysis (IPA) suggested that 14-3-3 signaling is a predominant canonical pathway involved in networks which may be significantly associated with the process of apoptosis and G2/M cell cycle arrest induced by 6-shogaol. In conclusion, this work developed an unbiased protein analysis strategy by shotgun proteomics and bioinformatics analysis. Data observed provide a comprehensive analysis of the 6-shogaol-treated HeLa cell proteome and reveal protein alterations that are associated with its anticancer mechanism. PMID:23243437
Yang, Yalan; Sun, Wei; Wang, Ruiqi; Lei, Chuzhao; Zhou, Rong; Tang, Zhonglin; Li, Kui
2015-03-08
The Wnt signaling pathway is involved in the control of cell proliferation and differentiation during skeletal muscle development. Secreted frizzled-related proteins (SFRPs), such as SFRP1, function as inhibitors of Wnt signaling. MicroRNA-1/206 (miRNA-1/206) is specifically expressed in skeletal muscle and plays a critical role in myogenesis. The miRNA-mRNA profiles and bioinformatics study suggested that the SFRP1 gene was potentially regulated by miRNA-1/206 during porcine skeletal muscle development. To understand the function of SFRP1 and miRNA-1/206 in swine myogenesis, we first predicted the targets of miRNA-1/206 with the TargetScan and PicTar programs, and analyzed the molecular characterization of the porcine SFRP1 gene. We performed a temporal-spatial expression analysis of SFRP1 mRNA and miRNA-206 in Tongcheng pigs (a Chinese indigenous breed) by quantitative real-time polymerase chain reaction, and conducted co-expression analyses of SFRP1 and miRNA-1/206. Subsequently, the interaction between SFRP1 and miRNA-1/206 was validated via dual luciferase and Western blot assays. The bioinformatics analysis predicted SFRP1 to be a target of miRNA-1/206. The expression level of SFRP1 varied widely across numerous pig tissues, and it was down-regulated during porcine skeletal muscle development. The expression level of SFRP1 was significantly higher in embryonic skeletal muscle than in postnatal skeletal muscle, whereas miR-206 showed the inverse pattern of expression. A significant negative correlation was observed between the expression of miR-1/206 and SFRP1 during porcine skeletal muscle development (p < 0.05). Dual luciferase assay and Western blot results demonstrated that SFRP1 was a target of miR-1/206 in porcine iliac endothelial cells. Our results indicate that the SFRP1 gene is regulated by miR-1/206 and potentially affects skeletal muscle development. 
These findings increase understanding of the biological functions and the regulation of the SFRP1 gene in mammals.
AnaBench: a Web/CORBA-based workbench for biomolecular sequence analysis
Badidi, Elarbi; De Sousa, Cristina; Lang, B Franz; Burger, Gertraud
2003-01-01
Background Sequence data analyses such as gene identification, structure modeling or phylogenetic tree inference involve a variety of bioinformatics software tools. Due to the heterogeneity of bioinformatics tools in usage and data requirements, scientists spend much effort on technical issues including data format, storage and management of input and output, and memorization of numerous parameters and multi-step analysis procedures. Results In this paper, we present the design and implementation of AnaBench, an interactive, Web-based bioinformatics Analysis workBench allowing streamlined data analysis. Our philosophy was to minimize the technical effort not only for the scientist who uses this environment to analyze data, but also for the administrator who manages and maintains the workbench. With new bioinformatics tools published daily, AnaBench permits easy incorporation of additional tools. This flexibility is achieved by employing a three-tier distributed architecture and recent technologies including CORBA middleware, Java, JDBC, and JSP. A CORBA server permits transparent access to a workbench management database, which stores information about the users, their data, as well as the description of all bioinformatics applications that can be launched from the workbench. Conclusion AnaBench is an efficient and intuitive interactive bioinformatics environment, which offers scientists application-driven, data-driven and protocol-driven analysis approaches. The prototype of AnaBench, managed by a team at the Université de Montréal, is accessible on-line at: . Please contact the authors for details about setting up a local-network AnaBench site elsewhere. PMID:14678565
ERIC Educational Resources Information Center
Rowe, Laura
2017-01-01
An introductory bioinformatics laboratory experiment focused on protein analysis has been developed that is suitable for undergraduate students in introductory biochemistry courses. The laboratory experiment is designed to be potentially used as a "stand-alone" activity in which students are introduced to basic bioinformatics tools and…
Rodríguez-García, María Juliana; García-Reina, Andrés; Machado, Vilmar; Galián, José
2016-09-01
In this study, a defensin gene (Clit-Def) has been characterised in the tiger beetle Calomera littoralis for the first time. Bioinformatic analysis showed that the gene has an open reading frame of 246 bp that contains a 46 amino acid mature peptide. The phylogenetic analysis showed a high variability in the coleopteran defensins analysed. The Clit-Def mature peptide has features consistent with an antimicrobial function: a predicted cationic isoelectric point of 8.94, six cysteine residues that form three disulfide bonds, and the typical cysteine-stabilized α-helix β-sheet (CSαβ) structural fold. Real time quantitative PCR analysis showed that Clit-Def was upregulated in the different body parts analysed after infection with lipopolysaccharides of Escherichia coli, and also indicated that it has an expression peak at 12 h post infection. The expression patterns of Clit-Def suggest that this gene plays important roles in the humoral system in the adephagan beetle Calomera littoralis. Copyright © 2016 Elsevier B.V. All rights reserved.
Differentially-Expressed Pseudogenes in HIV-1 Infection.
Gupta, Aditi; Brown, C Titus; Zheng, Yong-Hui; Adami, Christoph
2015-09-29
Not all pseudogenes are transcriptionally silent as previously thought. Pseudogene transcripts, although not translated, contribute to the non-coding RNA pool of the cell that regulates the expression of other genes. Pseudogene transcripts can also directly compete with the parent gene transcripts for mRNA stability and other cell factors, modulating their expression levels. Tissue-specific and cancer-specific differential expression of these "functional" pseudogenes has been reported. To ascertain potential pseudogene:gene interactions in HIV-1 infection, we analyzed transcriptomes from infected and uninfected T-cells and found that 21 pseudogenes are differentially expressed in HIV-1 infection. This is interesting because parent genes of one-third of these differentially-expressed pseudogenes are implicated in HIV-1 life cycle, and parent genes of half of these pseudogenes are involved in different viral infections. Our bioinformatics analysis identifies candidate pseudogene:gene interactions that may be of significance in HIV-1 infection. Experimental validation of these interactions would establish that retroviruses exploit this newly-discovered layer of host gene expression regulation for their own benefit.
Circular RNA expression in basal cell carcinoma.
Sand, Michael; Bechara, Falk G; Sand, Daniel; Gambichler, Thilo; Hahn, Stephan A; Bromba, Michael; Stockfleth, Eggert; Hessam, Schapoor
2016-05-01
Circular RNAs (circRNAs) are non-protein-coding RNAs consisting of a circular loop with multiple miRNA binding sites, called miRNA response elements (MREs), that function as miRNA sponges. This study was performed to identify differentially expressed circRNAs and their MREs in basal cell carcinoma (BCC). Microarray circRNA expression profiles were acquired from BCC and control samples, followed by qRT-PCR validation. Bioinformatic target prediction revealed multiple MREs. Sequence analysis was performed concerning MRE interaction potential with the BCC miRNome. We identified 23 upregulated and 48 downregulated circRNAs with 354 miRNA response elements capable of sequestering miRNA target sequences of the BCC miRNome. The present study describes a variety of circRNAs that are potentially involved in the molecular pathogenesis of BCC.
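The abstract does not say how the MREs were predicted. As a rough illustration only, and not the study's method, a common first-pass heuristic is to scan an RNA sequence for sites complementary to the miRNA seed (nucleotides 2 to 8). The function names and the toy sequences below are hypothetical.

```python
def reverse_complement_rna(seq: str) -> str:
    """Reverse complement of an RNA string (A<->U, G<->C)."""
    comp = {"A": "U", "U": "A", "G": "C", "C": "G"}
    return "".join(comp[base] for base in reversed(seq.upper()))


def find_seed_sites(mirna: str, target: str) -> list[int]:
    """Return 0-based start positions of perfect seed matches in `target`.

    Assumption: the seed is taken as miRNA nucleotides 2-8 (a 7-mer),
    and only exact Watson-Crick complementarity is considered.
    """
    seed = mirna.upper()[1:8]            # nucleotides 2-8, 5' to 3'
    site = reverse_complement_rna(seed)  # what the seed pairs with
    target = target.upper()
    positions = []
    start = target.find(site)
    while start != -1:
        positions.append(start)
        start = target.find(site, start + 1)  # allow overlapping hits
    return positions


# Toy sequences, invented for illustration (not from the study).
toy_mirna = "UGGAAUGUAAAG"
toy_circrna = "GGGACAUUCCAAAACAUUCCUU"
print(find_seed_sites(toy_mirna, toy_circrna))
```

Real prediction tools add further criteria (site context, conservation, pairing energy), so a seed scan like this over-predicts and is best read as a starting filter.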
Agyei, Dominic; Tsopmo, Apollinaire; Udenigwe, Chibuike C
2018-06-01
There are emerging advancements in the strategies used for the discovery and development of food-derived bioactive peptides because of their multiple food and health applications. Bioinformatics and peptidomics are two computational and analytical techniques that have the potential to speed up the development of bioactive peptides from bench to market. Structure-activity relationships observed in peptides form the basis for bioinformatics and in silico prediction of bioactive sequences encrypted in food proteins. Peptidomics, on the other hand, relies on "hyphenated" (liquid chromatography-mass spectrometry-based) techniques for the detection, profiling, and quantitation of peptides. Together, bioinformatics and peptidomics approaches provide a low-cost and effective means of predicting, profiling, and screening bioactive protein hydrolysates and peptides from food. This article discusses the basis, strengths, and limitations of bioinformatics and peptidomics approaches currently used for the discovery and analysis of food-derived bioactive peptides.
Secretome Analysis of Vibrio cholerae Type VI Secretion System Reveals a New Effector-Immunity Pair
Altindis, Emrah; Dong, Tao; Catalano, Christy
2015-01-01
The type VI secretion system (T6SS) is a dynamic macromolecular organelle that many Gram-negative bacteria use to inhibit or kill other prokaryotic or eukaryotic cells. The toxic effectors of T6SS are delivered to the prey cells in a contact-dependent manner. In Vibrio cholerae, the etiologic agent of cholera, T6SS is active during intestinal infection. Here, we describe the use of comparative proteomics coupled with bioinformatics to identify a new T6SS effector-immunity pair. This analysis was able to identify all previously identified secreted substrates of T6SS except PAAR (proline, alanine, alanine, arginine) motif-containing proteins. Additionally, this approach led to the identification of a new secreted protein encoded by VCA0285 (TseH) that carries a predicted hydrolase domain. We confirmed that TseH is toxic when expressed in the periplasm of Escherichia coli and V. cholerae cells. The toxicity observed in V. cholerae was suppressed by coexpression of the protein encoded by VCA0286 (TsiH), indicating that this protein is the cognate immunity protein of TseH. Furthermore, exogenous addition of purified recombinant TseH to permeabilized E. coli cells caused cell lysis. Bioinformatics analysis of the TseH protein sequence suggests that it is a member of a new family of cell wall-degrading enzymes that include proteins belonging to the YD repeat and Rhs superfamilies and that orthologs of TseH are likely expressed by species belonging to phyla as diverse as Bacteroidetes and Proteobacteria. PMID:25759499
BioVLAB-MMIA-NGS: microRNA-mRNA integrated analysis using high-throughput sequencing data.
Chae, Heejoon; Rhee, Sungmin; Nephew, Kenneth P; Kim, Sun
2015-01-15
It is now well established that microRNAs (miRNAs) play a critical role in regulating gene expression in a sequence-specific manner, and genome-wide efforts are underway to predict known and novel miRNA targets. However, the integrated miRNA-mRNA analysis remains a major computational challenge, requiring powerful informatics systems and bioinformatics expertise. The objective of this study was to modify our widely recognized Web server for the integrated mRNA-miRNA analysis (MMIA) and its subsequent deployment on the Amazon cloud (BioVLAB-MMIA) to be compatible with high-throughput platforms, including next-generation sequencing (NGS) data (e.g. RNA-seq). We developed a new version called BioVLAB-MMIA-NGS, deployed both on the Amazon cloud and on a high-performance publicly available server called MAHA. By using NGS data and integrating various bioinformatics tools and databases, BioVLAB-MMIA-NGS offers several advantages. First, sequencing data is more accurate than array-based methods for determining miRNA expression levels. Second, potential novel miRNAs can be detected by using various computational methods for characterizing miRNAs. Third, because miRNA-mediated gene regulation is due to hybridization of an miRNA to its target mRNA, sequencing data can be used to identify many-to-many relationships between miRNAs and target genes with high accuracy. http://epigenomics.snu.ac.kr/biovlab_mmia_ngs/. © The Author 2014. Published by Oxford University Press. All rights reserved. For Permissions, please e-mail: journals.permissions@oup.com.
Tadano, Toshihiro; Kakuta, Yoichi; Hamada, Shin; Shimodaira, Yosuke; Kuroha, Masatake; Kawakami, Yoko; Kimura, Tomoya; Shiga, Hisashi; Endo, Katsuya; Masamune, Atsushi; Takahashi, Seiichi; Kinouchi, Yoshitaka; Shimosegawa, Tooru
2016-01-01
AIM: To investigate the microRNA (miRNA) expression during histological progression from colorectal normal mucosa through adenoma to carcinoma within a lesion. METHODS: Using microarray, the sequential changes in miRNA expression profiles were compared in colonic lesions from matched samples; histologically, non-neoplastic mucosa, adenoma, and submucosal invasive carcinoma were microdissected from a tissue sample. Cell proliferation assay was performed to observe the effect of miRNA, and its target genes were predicted using bioinformatics approaches and the expression profile of SW480 transfected with the miRNA mimics. mRNA and protein levels of the target gene in colon cancer cell lines with a mimic control or miRNA mimics were measured using qRT-PCR and Western blotting. The expression levels of miRNA and target gene in colorectal tissue samples were also measured. RESULTS: Microarray analysis identified that the miR-320 family, including miR-320a, miR-320b, miR-320c, miR-320d and miR-320e, were differentially expressed in adenoma and submucosal invasive carcinoma. The miR-320 family, which inhibits cell proliferation, is frequently downregulated in colorectal adenoma and submucosal invasive carcinoma tissues. Seven genes including CDK6 were identified to be common in the results of gene expression array and bioinformatics analyses performed to find the target gene of the miR-320 family. We confirmed that mRNA and protein levels of CDK6 were significantly suppressed in colon cancer cell lines with miR-320 family mimics. CDK6 expression was found to increase from non-neoplastic mucosa through adenoma to submucosal invasive carcinoma tissues and showed an inverse correlation with miR-320 family expression. CONCLUSION: The miR-320 family affects colorectal tumor proliferation by targeting CDK6, plays an important role in its growth, and is considered to be a biomarker for its early detection. PMID:27559432
NASA Astrophysics Data System (ADS)
Balqis, Widodo, Lukiati, Betty; Amin, Mohamad
2017-05-01
A way to improve the quality of learning in the Plant Metabolism course in the Department of Biology, State University of Malang, is to develop teaching materials. This research evaluates the need for bioinformatics-based teaching material in the Plant Metabolism course using the Analyze, Design, Develop, Implement, and Evaluate (ADDIE) development model. Data were collected through questionnaires distributed to students in the Plant Metabolism course of the Department of Biology, State University of Malang, and through analysis of the semester lecture plan (RPS). The learning outcomes of this course show that it is not yet integrated with the field of bioinformatics. All respondents stated that plant metabolism books do not include bioinformatics and fail to explain the metabolism of chemical compounds of local plants in Indonesia. Respondents thought that bioinformatics can provide examples of secondary metabolite metabolism and analysis techniques and support discussion of potential medicinal compounds from local plants. As many as 65% of the respondents said that the existing metabolism book could not be used to understand secondary metabolism in lectures on plant metabolism. Therefore, the development of bioinformatics-based plant metabolism teaching materials is important to improve the understanding of the lecture material in plant metabolism.
An overview of the Hadoop/MapReduce/HBase framework and its current applications in bioinformatics
2010-01-01
BACKGROUND: Bioinformatics researchers are now confronted with analysis of ultra large-scale data sets, a problem that will only increase at an alarming rate in coming years. Recent developments in open source software, that is, the Hadoop project and associated software, provide a foundation for scaling to petabyte scale data warehouses on Linux clusters, providing fault-tolerant parallelized analysis on such data using a programming style named MapReduce. DESCRIPTION: An overview is given of the current usage within the bioinformatics community of Hadoop, a top-level Apache Software Foundation project, and of associated open source software projects. The concepts behind Hadoop and the associated HBase project are defined, and current bioinformatics software that employs Hadoop is described. The focus is on next-generation sequencing, as the leading application area to date. CONCLUSIONS: Hadoop and the MapReduce programming paradigm already have a substantial base in the bioinformatics community, especially in the field of next-generation sequencing analysis, and such use is increasing. This is due to the cost-effectiveness of Hadoop-based analysis on commodity Linux clusters, and in the cloud via data upload to cloud vendors who have implemented Hadoop/HBase; and due to the effectiveness and ease-of-use of the MapReduce method in parallelization of many data analysis algorithms. PMID:21210976
An overview of the Hadoop/MapReduce/HBase framework and its current applications in bioinformatics.
Taylor, Ronald C
2010-12-21
Bioinformatics researchers are now confronted with analysis of ultra large-scale data sets, a problem that will only increase at an alarming rate in coming years. Recent developments in open source software, that is, the Hadoop project and associated software, provide a foundation for scaling to petabyte scale data warehouses on Linux clusters, providing fault-tolerant parallelized analysis on such data using a programming style named MapReduce. An overview is given of the current usage within the bioinformatics community of Hadoop, a top-level Apache Software Foundation project, and of associated open source software projects. The concepts behind Hadoop and the associated HBase project are defined, and current bioinformatics software that employs Hadoop is described. The focus is on next-generation sequencing, as the leading application area to date. Hadoop and the MapReduce programming paradigm already have a substantial base in the bioinformatics community, especially in the field of next-generation sequencing analysis, and such use is increasing. This is due to the cost-effectiveness of Hadoop-based analysis on commodity Linux clusters, and in the cloud via data upload to cloud vendors who have implemented Hadoop/HBase; and due to the effectiveness and ease-of-use of the MapReduce method in parallelization of many data analysis algorithms.
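The MapReduce style named in this abstract can be illustrated without a cluster. The sketch below is plain Python, not the Hadoop API: it imitates the map, shuffle and reduce phases in a single process to count k-mers in sequencing reads. The function names and the choice of k-mer counting as the task are illustrative assumptions, not taken from the review.

```python
from collections import defaultdict
from itertools import chain


def map_kmers(read, k):
    """Map phase: emit a (k-mer, 1) pair for every k-mer in one read."""
    return [(read[i:i + k], 1) for i in range(len(read) - k + 1)]


def shuffle(pairs):
    """Shuffle phase: group all emitted values by their key."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_counts(grouped):
    """Reduce phase: sum the values collected for each key."""
    return {key: sum(values) for key, values in grouped.items()}


def count_kmers(reads, k):
    """Run the three phases in sequence (single process, for illustration)."""
    mapped = chain.from_iterable(map_kmers(read, k) for read in reads)
    return reduce_counts(shuffle(mapped))


# Toy reads, invented for illustration.
print(count_kmers(["ACGT", "CGTA"], 2))
```

On a real cluster the framework runs many map and reduce tasks in parallel on different nodes and performs the shuffle over the network; because each read is mapped independently and each key is reduced independently, the same logic parallelizes without change.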
ERIC Educational Resources Information Center
Magana, Alejandra J.; Taleyarkhan, Manaz; Alvarado, Daniela Rivera; Kane, Michael; Springer, John; Clase, Kari
2014-01-01
Bioinformatics education can be broadly defined as the teaching and learning of the use of computer and information technology, along with mathematical and statistical analysis for gathering, storing, analyzing, interpreting, and integrating data to solve biological problems. The recent surge of genomics, proteomics, and structural biology in the…
Hu, Hejing; Zhang, Yannan; Shi, Yanfeng; Feng, Lin; Duan, Junchao; Sun, Zhiwei
2017-10-01
With the rapid development of nanotechnology and growing environmental pollution, the combined toxic effects of SiNPs and heavy-metal pollutants such as lead have received global attention. The aim of this study was to explore the cardiovascular effects of co-exposure to SiNPs and lead acetate (PbAc) in zebrafish using microarray and bioinformatics analysis. Although there was no obvious cardiovascular malformation other than a bleeding phenotype, bradycardia, angiogenesis inhibition and reduced cardiac output in zebrafish co-exposed to SiNPs and PbAc at the NOAEL level, significant changes were observed in mRNA and microRNA (miRNA) expression patterns. STC-GO analysis indicated that the co-exposure might have more toxic effects on the cardiovascular system than either exposure alone. Key differentially expressed genes were identified based on the Dynamic-gene-network, including stxbp1a, ndfip2, celf4 and gsk3b. Furthermore, several miRNAs obtained from the miRNA-Gene-Network might play crucial roles in cardiovascular disease, such as dre-miR-93, dre-miR-34a, dre-miR-181c, dre-miR-7145, dre-miR-730, dre-miR-129-5p, dre-miR-19d, dre-miR-218b and dre-miR-221. In addition, the analysis of the miRNA-pathway-network indicated that the zebrafish were stimulated by the co-exposure to SiNPs and PbAc, which might cause the disturbance of calcium homeostasis and endoplasmic reticulum stress. As a result, cardiac muscle contraction might be impaired. In general, our data provide abundant fundamental research clues to the combined toxicity of environmental pollutants, and further in-depth verification is needed. Copyright © 2017 Elsevier Ltd. All rights reserved.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Mefford, Megan E.; Kunstman, Kevin; Wolinsky, Steven M.
Macrophages express low levels of the CD4 receptor compared to T-cells. Macrophage-tropic HIV strains replicating in brain of untreated patients with HIV-associated dementia (HAD) express Envs that are adapted to overcome this restriction through mechanisms that are poorly understood. Here, bioinformatic analysis of env sequence datasets together with functional studies identified polymorphisms in the β3 strand of the HIV gp120 bridging sheet that increase M-tropism. D197, which results in loss of an N-glycan located near the HIV Env trimer apex, was detected in brain in some HAD patients, while position 200 was estimated to be under positive selection. D197 and T/V200 increased fusion and infection of cells expressing low CD4 by enhancing gp120 binding to CCR5. These results identify polymorphisms in the HIV gp120 bridging sheet that overcome the restriction to macrophage infection imposed by low CD4 through enhanced gp120–CCR5 interactions, thereby promoting infection of brain and other macrophage-rich tissues. Highlights: • We analyze HIV Env sequences and identify amino acids in beta 3 of the gp120 bridging sheet that enhance macrophage tropism. • These amino acids at positions 197 and 200 are present in brain of some patients with HIV-associated dementia. • D197 results in loss of a glycan near the HIV Env trimer apex, which may increase exposure of V3. • These variants may promote infection of macrophages in the brain by enhancing gp120–CCR5 interactions.
Taleyarkhan, Manaz; Alvarado, Daniela Rivera; Kane, Michael; Springer, John; Clase, Kari
2014-01-01
Bioinformatics education can be broadly defined as the teaching and learning of the use of computer and information technology, along with mathematical and statistical analysis for gathering, storing, analyzing, interpreting, and integrating data to solve biological problems. The recent surge of genomics, proteomics, and structural biology in the potential advancement of research and development in complex biomedical systems has created a need for an educated workforce in bioinformatics. However, effectively integrating bioinformatics education through formal and informal educational settings has been a challenge due in part to its cross-disciplinary nature. In this article, we seek to provide an overview of the state of bioinformatics education. This article identifies: 1) current approaches of bioinformatics education at the undergraduate and graduate levels; 2) the most common concepts and skills being taught in bioinformatics education; 3) pedagogical approaches and methods of delivery for conveying bioinformatics concepts and skills; and 4) assessment results on the impact of these programs, approaches, and methods in students’ attitudes or learning. Based on these findings, it is our goal to describe the landscape of scholarly work in this area and, as a result, identify opportunities and challenges in bioinformatics education. PMID:25452484
Bauerová-Hlinková, Vladena; Hostinová, Eva; Gašperík, Juraj; Beck, Konrad; Borko, Ľubomír; Lai, F. Anthony; Zahradníková, Alexandra; Ševčík, Jozef
2010-01-01
We report the domain analysis of the N-terminal region (residues 1–759) of the human cardiac ryanodine receptor (RyR2) that encompasses one of the discrete RyR2 mutation clusters associated with catecholaminergic polymorphic ventricular tachycardia (CPVT1) and arrhythmogenic right ventricular dysplasia (ARVD2). Our strategy utilizes a bioinformatics approach complemented by protein expression, solubility analysis and limited proteolytic digestion. Based on the bioinformatics analysis, we designed a series of specific RyR2 N-terminal fragments for cloning and overexpression in Escherichia coli. High yields of soluble proteins were achieved for fragments RyR2(1–606)·His6, RyR2(391–606)·His6, RyR2(409–606)·His6, Trx·RyR2(384–606)·His6, Trx·RyR2(391–606)·His6 and Trx·RyR2(409–606)·His6. The folding of RyR2(1–606)·His6 was analyzed by circular dichroism spectroscopy resulting in α-helix and β-sheet content of ∼23% and ∼29%, respectively, at temperatures up to 35 °C, which is in agreement with sequence based secondary structure predictions. Tryptic digestion of the largest recombinant protein, RyR2(1–606)·His6, resulted in the appearance of two specific subfragments of ∼40 and 25 kDa. The 25 kDa fragment exhibited greater stability. Hybridization with anti-His6·Tag antibody indicated that RyR2(1–606)·His6 is cleaved from the N-terminus and amino acid sequencing of the proteolytic fragments revealed that digestion occurred after residues 259 and 384, respectively. PMID:20045464
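For readers unfamiliar with tryptic digestion, the sketch below lists the positions at which trypsin could in principle cut a sequence, using the commonly cited rule that it cleaves on the C-terminal side of lysine (K) or arginine (R) unless the next residue is proline (P). It is an illustrative sketch, not the authors' analysis: in limited proteolysis of a folded protein only the accessible sites are cut, which is why the abstract reports just two cleavage positions. The function names and the toy peptide are hypothetical.

```python
def tryptic_sites(seq):
    """Return 1-based positions after which trypsin may cleave.

    Assumed rule: cleave after K or R, except when followed by P.
    The final residue is skipped because nothing follows it.
    """
    seq = seq.upper()
    sites = []
    for i, residue in enumerate(seq[:-1]):
        if residue in "KR" and seq[i + 1] != "P":
            sites.append(i + 1)
    return sites


def digest(seq):
    """Split the sequence at every candidate site (a complete digest)."""
    seq = seq.upper()
    cuts = [0] + tryptic_sites(seq) + [len(seq)]
    return [seq[start:end] for start, end in zip(cuts, cuts[1:])]


# Toy peptide, invented for illustration (not an RyR2 sequence).
toy_peptide = "MKTAYRPLGKE"
print(tryptic_sites(toy_peptide))
print(digest(toy_peptide))
```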
Takashima, Y; Fujita, K; Ardin, A C; Nagayama, K; Nomura, R; Nakano, K; Matsumoto-Nakano, M
2015-10-01
Streptococcus mutans produces multiple glucan-binding proteins (Gbps), among which GbpC, encoded by the gbpC gene, is known to be a cell-surface-associated protein involved in dextran-induced aggregation. The purpose of the present study was to characterize the dextran-binding domain of GbpC using bioinformatics analysis and molecular techniques. Bioinformatics analysis specified five possible regions containing molecular binding sites, termed GB1 through GB5. Next, truncated recombinant GbpC (rGbpC) encoding each region was produced using a protein expression vector, and five deletion mutant strains were generated, termed CDGB1 through CDGB5, respectively. The dextran-binding rates of truncated rGbpC that included the GB1, GB3, GB4 and GB5 regions in the upstream sequences were higher than that of the construct containing GB2 in the downstream region. In addition, the rates of dextran binding for strains CDGB4 and CD1, an entire gbpC deletion mutant, were significantly lower than for the other strains, while those of all other deletion mutants were quite similar to that of the parental strain MT8148. Biofilm structures formed by CDGB4 and CD1 were not as pronounced as that of MT8148, while those formed by other strains had greater density as compared to that of CD1. Our results suggest that the dextran-binding domain may be located in the GB4 region in the interior of the gbpC gene. Bioinformatics analysis is useful for determination of functional domains in many bacterial species. © 2015 The Society for Applied Microbiology.
Moore, Jason H
2007-11-01
Bioinformatics is an interdisciplinary field that blends computer science and biostatistics with biological and biomedical sciences such as biochemistry, cell biology, developmental biology, genetics, genomics, and physiology. An important goal of bioinformatics is to facilitate the management, analysis, and interpretation of data from biological experiments and observational studies. The goal of this review is to introduce some of the important concepts in bioinformatics that must be considered when planning and executing a modern biological research study. We review database resources as well as data mining software tools.
Ladics, Gregory S; Cressman, Robert F; Herouet-Guicheney, Corinne; Herman, Rod A; Privalle, Laura; Song, Ping; Ward, Jason M; McClain, Scott
2011-06-01
Bioinformatic tools are being increasingly utilized to evaluate the degree of similarity between a novel protein and known allergens within the context of a larger allergy safety assessment process. Importantly, bioinformatics is not a predictive analysis that can determine if a novel protein will "become" an allergen, but rather a tool to assess whether the protein is a known allergen or is potentially cross-reactive with an existing allergen. Bioinformatic tools are key components of the 2009 Codex Alimentarius Commission's weight-of-evidence approach, which encompasses a variety of experimental approaches for an overall assessment of the allergenic potential of a novel protein. Bioinformatic search comparisons between novel protein sequences, as well as potential novel fusion sequences derived from the genome and transgene, and known allergens are required by all regulatory agencies that assess the safety of genetically modified (GM) products. The objective of this paper is to identify opportunities for consensus in the methods of applying bioinformatics and to outline differences that impact a consistent and reliable allergy safety assessment. The bioinformatic comparison process has some critical features, which are outlined in this paper. One of them is a curated, publicly available and well-managed database with known allergenic sequences. In this paper, the best practices, scientific value, and food safety implications of bioinformatic analyses, as they are applied to GM food crops, are discussed. Recommendations for conducting bioinformatic analysis on novel food proteins for potential cross-reactivity to known allergens are also put forth. Copyright © 2011 Elsevier Inc. All rights reserved.
Xiang, Xue-Lian; Yang, Xia; Liang, Hai-Wei; Qiu, Xiao-Hui; Yang, Li-Hua; Peng, Zhi-Gang; Chen, Gang
2018-01-01
Mounting evidence has shown that miR-23b-3p, which is associated with cell proliferation, invasion, and apoptosis, acts as a biomarker for diagnosis and outcomes in numerous cancers. However, the clinicopathological implication of miR-23b-3p in hepatocellular carcinoma (HCC) remains unclear. Our study evaluated the role of miR-23b-3p in HCC and investigated its potential application as a marker for preliminary diagnosis and therapy in HCC. High-throughput data from the NCBI Gene Expression Omnibus (GEO) and The Cancer Genome Atlas (TCGA) were collected and analyzed. One hundred and one HCC tissue sections, paired with adjacent non-cancerous tissues, were included as a further supplement. miR-23b-3p expression was detected using quantitative real-time PCR. Additionally, the relationship between miR-23b-3p expression and HCC progression and time to recurrence (months) was explored. Ten algorithms were applied to predict the prospective target genes of miR-23b-3p. Next, we conducted bioinformatics analysis for further study. By RT-qPCR, miR-23b-3p expression was markedly lower in HCC tissues than in their paired adjacent non-cancerous tissues (P<0.001). In total, 405 targets, acquired with consistent prediction from at least five databases, were used for the bioinformatics analysis. According to the Gene Ontology (GO) analysis, all targets were classified into biological processes, cellular components and molecular functions. In the pathway analysis, targets of miR-23b-3p were primarily enriched in the signaling pathways of renal cell carcinoma, hepatitis B and pancreatic cancer (corrected P-value <0.05). In the protein-protein interaction (PPI) network for miR-23b-3p, a total of 8 targets, including SRC, AKT1, EGFR, CTNNB1, BCL2, SMAD3, PTEN and KDM6A, were located in the key nodes with high degree (>35). In conclusion, this study sheds light on the potential role of miR-23b-3p in HCC tumorigenesis and progression. Furthermore, miR-23b-3p may act as a predictor of HCC and could be a new treatment target. PMID:29484429
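Two filtering steps in this abstract lend themselves to a short sketch: keeping only targets predicted by at least five of the ten algorithms, and ranking network nodes by degree. The code below is a minimal illustration under stated assumptions, not the authors' pipeline; the function names, tool labels and gene labels are hypothetical placeholders, and the thresholds are parameters rather than fixed values.

```python
from collections import Counter


def consensus_targets(predictions, min_support=5):
    """Keep genes predicted by at least `min_support` tools.

    `predictions` maps a tool name to the collection of genes it predicts.
    """
    votes = Counter()
    for genes in predictions.values():
        votes.update(set(genes))  # one vote per tool per gene
    return sorted(gene for gene, n in votes.items() if n >= min_support)


def node_degrees(edges):
    """Degree of each node in an undirected network.

    Duplicate edges (in either orientation) and self-loops are ignored.
    """
    unique_edges = {frozenset(edge) for edge in edges}
    degree = Counter()
    for edge in unique_edges:
        if len(edge) == 2:  # skip self-loops
            for node in edge:
                degree[node] += 1
    return degree


# Placeholder inputs, invented for illustration.
toy_predictions = {
    "tool_a": {"GENE1", "GENE2"},
    "tool_b": {"GENE1"},
    "tool_c": {"GENE1", "GENE2", "GENE3"},
}
print(consensus_targets(toy_predictions, min_support=2))

toy_edges = [("GENE1", "GENE2"), ("GENE2", "GENE1"), ("GENE1", "GENE3")]
degrees = node_degrees(toy_edges)
print([gene for gene, d in degrees.items() if d > 1])
```

Requiring agreement between several predictors trades sensitivity for specificity, so the support threshold is a design choice that depends on how many false positives the downstream enrichment analysis can tolerate.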
Effect of Wnt3a on Keratinocytes Utilizing in Vitro and Bioinformatics Analysis
Nam, Ju-Suk; Chakraborty, Chiranjib; Sharma, Ashish Ranjan; Her, Young; Bae, Kee-Jeong; Sharma, Garima; Doss, George Priya; Lee, Sang-Soo; Hong, Myung-Sun; Song, Dong-Keun
2014-01-01
Wingless-type (Wnt) signaling proteins participate in various cell developmental processes. A suppressive role of Wnt5a on keratinocyte growth has already been observed. However, the role of other Wnt proteins in proliferation and differentiation of keratinocytes remains unknown. Here, we investigated the effects of the Wnt ligand, Wnt3a, on proliferation and differentiation of keratinocytes. Keratinocytes from normal human skin were cultured and treated with recombinant Wnt3a alone or in combination with the inflammatory cytokine, tumor necrosis factor α (TNFα). Furthermore, using bioinformatics, we analyzed the biochemical parameters, molecular evolution, and protein–protein interaction network for the Wnt family. Application of recombinant Wnt3a showed an anti-proliferative effect on keratinocytes in a dose-dependent manner. After treatment with TNFα, Wnt3a still demonstrated an anti-proliferative effect on human keratinocytes. Exogenous treatment of Wnt3a was unable to alter mRNA expression of differentiation markers of keratinocytes, whereas an altered expression was observed in TNFα-stimulated keratinocytes. In silico phylogenetic, biochemical, and protein–protein interaction analysis showed several close relationships among the family members of the Wnt family. Moreover, a close phylogenetic and biochemical similarity was observed between Wnt3a and Wnt5a. Finally, we proposed a hypothetical mechanism to illustrate how the Wnt3a protein may inhibit the process of proliferation in keratinocytes, which would be useful for future researchers. PMID:24686518
Zhou, Lei-Lei; Xu, Xiao-Yue; Ni, Jie; Zhao, Xia; Zhou, Jian-Wei; Feng, Ji-Feng
2018-06-01
Due to the low incidence and the heterogeneity of subtypes, the biological processes underlying T-cell lymphomas are largely unknown. Although many genes have been detected in T-cell lymphomas, the role of these genes in the biological processes of T-cell lymphomas has not been further analyzed. Two qualified datasets were downloaded from the Gene Expression Omnibus database. The biological functions of differentially expressed genes were evaluated by gene ontology enrichment and KEGG pathway analysis. The network for intersection genes was constructed with Cytoscape v3.0. Kaplan-Meier survival curves and the log-rank test were employed to assess the association between differentially expressed genes and clinical characteristics. The intersection mRNAs were shown to be associated with fundamental processes of T-cell lymphoma cells. These intersection mRNAs were involved in the activation of some cancer-related pathways, including the PI3K/AKT, Ras, JAK-STAT, and NF-kappa B signaling pathways. PDGFRA, CXCL12, and CCL19 were the most significant central genes in the signal-net analysis. The results of survival analysis are not entirely credible. Our findings uncovered aberrantly expressed genes and a complex RNA signal network in T-cell lymphomas and indicated cancer-related pathways involved in disease initiation and progression, providing a new insight for biotargeted therapy in T-cell lymphomas. © 2018 John Wiley & Sons A/S. Published by John Wiley & Sons Ltd.
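For readers unfamiliar with the Kaplan-Meier curves mentioned in this abstract, the following is a minimal sketch of the product-limit estimator in plain Python. It covers only the survival estimate, not the log-rank test, and the function name and toy follow-up data are assumptions for illustration rather than anything taken from the study.

```python
def kaplan_meier(times, events):
    """Kaplan-Meier product-limit estimate.

    times:  follow-up durations, one per subject
    events: 1 if the event was observed at that time, 0 if the subject was censored
    Returns a list of (time, survival_probability) at each distinct event time.
    """
    data = sorted(zip(times, events))
    n_at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        n_events = 0
        n_leaving = 0
        # Group all subjects sharing the same follow-up time.
        while i < len(data) and data[i][0] == t:
            n_events += data[i][1]
            n_leaving += 1
            i += 1
        if n_events:
            survival *= 1 - n_events / n_at_risk
            curve.append((t, survival))
        n_at_risk -= n_leaving  # events and censored subjects both leave the risk set
    return curve

# Hypothetical follow-up data (arbitrary time units).
toy_curve = kaplan_meier([1, 2, 2, 3, 4], [1, 1, 0, 1, 0])
for t, s in toy_curve:
    print(t, round(s, 3))
```

Censored subjects reduce the number at risk without lowering the curve, which is what separates this estimator from a simple fraction of survivors.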
Fang, Yantian; Ma, Minzhe; Wang, Jiangli; Liu, Xiaowen; Wang, Yanong
2017-06-01
Gastric cancer is one of the most common tumors of the digestive system. Here, analysis of the expression profiles of circular RNAs in advanced gastric adenocarcinoma and adjacent normal mucosa tissues revealed differential expression of 306 circular RNAs, among which 273 were predicted to exert regulatory effects on target microRNAs. The downstream pathway networks of circular RNA-microRNA were mapped and the node genes were identified. In particular, we found that the expression of hsa_circ_0058246 was elevated in tumor specimens of patients with poor clinical outcomes. Our collective findings indicate that circular RNAs play a critical role in gastric cancer tumorigenesis. Data from this study provide a new perspective on the molecular pathways underlying metastasis and recurrence of gastric cancer and highlight potential therapeutic targets that may contribute to more effective diagnosis and treatment of the disease.
Polyester: simulating RNA-seq datasets with differential transcript expression.
Frazee, Alyssa C; Jaffe, Andrew E; Langmead, Ben; Leek, Jeffrey T
2015-09-01
Statistical methods development for differential expression analysis of RNA sequencing (RNA-seq) requires software tools to assess accuracy and error rate control. Since true differential expression status is often unknown in experimental datasets, artificially constructed datasets must be utilized, either by generating costly spike-in experiments or by simulating RNA-seq data. Polyester is an R package designed to simulate RNA-seq data, beginning with an experimental design and ending with collections of RNA-seq reads. Its main advantage is the ability to simulate reads indicating isoform-level differential expression across biological replicates for a variety of experimental designs. Data generated by Polyester is a reasonable approximation to real RNA-seq data and standard differential expression workflows can recover differential expression set in the simulation by the user. Polyester is freely available from Bioconductor (http://bioconductor.org/). Contact: jtleek@gmail.com. Supplementary data are available at Bioinformatics online. © The Author 2015. Published by Oxford University Press. All rights reserved. For Permissions, please e-mail: journals.permissions@oup.com.
Wood, David L. A.; Nones, Katia; Steptoe, Anita; Christ, Angelika; Harliwong, Ivon; Newell, Felicity; Bruxner, Timothy J. C.; Miller, David; Cloonan, Nicole; Grimmond, Sean M.
2015-01-01
Genetic variation modulates gene expression transcriptionally or post-transcriptionally, and can profoundly alter an individual’s phenotype. Measuring allelic differential expression at heterozygous loci within an individual, a phenomenon called allele-specific expression (ASE), can assist in identifying such factors. Massively parallel DNA and RNA sequencing and advances in bioinformatic methodologies provide an outstanding opportunity to measure ASE genome-wide. In this study, matched DNA and RNA sequencing, genotyping arrays and computationally phased haplotypes were integrated to comprehensively and conservatively quantify ASE in a single human brain and liver tissue sample. We describe a methodological evaluation and assessment of common bioinformatic steps for ASE quantification, and recommend a robust approach to accurately measure SNP, gene and isoform ASE through the use of personalized haplotype genome alignment, strict alignment quality control and intragenic SNP aggregation. Our results indicate that accurate ASE quantification requires careful bioinformatic analyses and is adversely affected by sample specific alignment confounders and random sampling even at moderate sequence depths. We identified multiple known and several novel ASE genes in liver, including WDR72, DSP and UBD, as well as genes that contained ASE SNPs with imbalance direction discordant with haplotype phase, explainable by annotated transcript structure, suggesting isoform derived ASE. The methods evaluated in this study will be of use to researchers performing highly conservative quantification of ASE, and the genes and isoforms identified as ASE of interest to researchers studying those loci. PMID:25965996
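As a rough illustration of the intragenic SNP aggregation idea in this abstract, the sketch below pools phased read counts across the heterozygous SNPs of one gene and applies an exact two-sided binomial test against a 50:50 expectation. This is an assumed, simplified model: the function names and counts are hypothetical, and the pipeline in the paper involves alignment and quality-control steps that are not reproduced here.

```python
from math import comb

def binom_two_sided_p(k, n, p=0.5):
    """Exact two-sided binomial p-value: total probability of outcomes
    that are no more likely than the observed count k."""
    pmf = [comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(n + 1)]
    threshold = pmf[k] * (1 + 1e-9)  # small tolerance for floating-point ties
    return min(1.0, sum(q for q in pmf if q <= threshold))

def gene_ase(snp_counts):
    """Aggregate phased read counts over the SNPs of one gene.

    snp_counts: list of (haplotype_A_reads, haplotype_B_reads), one pair per SNP.
    Returns (fraction of reads from haplotype A, binomial p-value).
    """
    a = sum(x for x, _ in snp_counts)
    b = sum(y for _, y in snp_counts)
    if a + b == 0:
        raise ValueError("no reads cover the heterozygous SNPs of this gene")
    return a / (a + b), binom_two_sided_p(a, a + b)

# Hypothetical counts at two phased SNPs within one gene.
ratio, p_value = gene_ase([(8, 2), (7, 3)])
print(ratio, p_value)
```

A plain binomial model ignores overdispersion and alignment bias, which the abstract identifies as confounders, so p-values from this sketch would be optimistic on real data. The exact sum is also only practical at moderate read depths.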
DR-Integrator: a new analytic tool for integrating DNA copy number and gene expression data.
Salari, Keyan; Tibshirani, Robert; Pollack, Jonathan R
2010-02-01
DNA copy number alterations (CNA) frequently underlie gene expression changes by increasing or decreasing gene dosage. However, only a subset of genes with altered dosage exhibit concordant changes in gene expression. This subset is likely to be enriched for oncogenes and tumor suppressor genes, and can be identified by integrating these two layers of genome-scale data. We introduce DNA/RNA-Integrator (DR-Integrator), a statistical software tool to perform integrative analyses on paired DNA copy number and gene expression data. DR-Integrator identifies genes with significant correlations between DNA copy number and gene expression, and implements a supervised analysis that captures genes with significant alterations in both DNA copy number and gene expression between two sample classes. DR-Integrator is freely available for non-commercial use from the Pollack Lab at http://pollacklab.stanford.edu/ and can be downloaded as a plug-in application to Microsoft Excel and as a package for the R statistical computing environment. The R package is available under the name 'DRI' at http://cran.r-project.org/. An example analysis using DR-Integrator is included as supplemental material. Supplementary data are available at Bioinformatics online.
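The core idea of pairing copy number with expression can be sketched as a per-gene correlation across samples. The example below uses a plain Pearson coefficient with an arbitrary cutoff. It is not DR-Integrator's actual statistic or significance procedure, which the abstract does not specify, and all gene names, values, and function names are hypothetical.

```python
from math import sqrt

def pearson(x, y):
    """Pearson correlation of two equal-length numeric sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0  # a constant profile carries no correlation signal
    return sxy / sqrt(sxx * syy)

def dosage_sensitive_genes(copy_number, expression, min_r=0.6):
    """Genes whose expression tracks copy number across paired samples.

    Both arguments map gene -> list of per-sample values in the same sample order.
    The cutoff min_r is an arbitrary illustrative choice.
    """
    return sorted(
        gene
        for gene in copy_number
        if gene in expression
        and pearson(copy_number[gene], expression[gene]) >= min_r
    )

# Hypothetical paired profiles over four samples.
toy_cn = {"GENE_A": [1, 2, 3, 4], "GENE_B": [1, 2, 3, 4]}
toy_expr = {"GENE_A": [2, 4, 6, 8], "GENE_B": [5, 1, 4, 2]}
print(dosage_sensitive_genes(toy_cn, toy_expr))
```

In practice a permutation test or multiple-testing correction would be needed before calling a correlation significant, and that step is deliberately left out of this sketch.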
DNA methylation biomarkers for head and neck squamous cell carcinoma.
Zhou, Chongchang; Ye, Meng; Ni, Shumin; Li, Qun; Ye, Dong; Li, Jinyun; Shen, Zhishen; Deng, Hongxia
2018-06-21
DNA methylation plays an important role in the etiology and pathogenesis of head and neck squamous cell carcinoma (HNSCC). The current study aimed to identify aberrantly methylated-differentially expressed genes (DEGs) by a comprehensive bioinformatics analysis. In addition, we screened for DEGs affected by DNA methylation modification and further investigated their prognostic values for HNSCC. We included microarray data of DNA methylation (GSE25093 and GSE33202) and gene expression (GSE23036 and GSE58911) from Gene Expression Omnibus. Aberrantly methylated-DEGs were analyzed with R software. The Cancer Genome Atlas (TCGA) RNA sequencing and DNA methylation (Illumina HumanMethylation450) databases were utilized for validation. In total, 27 aberrantly methylated genes accompanied by altered expression were identified. After confirmation by The Cancer Genome Atlas (TCGA) database, 2 hypermethylated-low-expression genes (FAM135B and ZNF610) and 2 hypomethylated-high-expression genes (HOXA9 and DCC) were identified. A receiver operating characteristic (ROC) curve confirmed the diagnostic value of these four methylated genes for HNSCC. Multivariate Cox proportional hazards analysis showed that FAM135B methylation was a favorable independent prognostic biomarker for overall survival of HNSCC patients.
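The diagnostic value summarized by a ROC curve is often reported as the area under the curve (AUC). A minimal sketch of that quantity follows, computed as the probability that a randomly chosen tumor sample scores higher than a randomly chosen normal sample. The scores are hypothetical methylation-like values, not data from the study, and the function name is an assumption.

```python
def roc_auc(scores_pos, scores_neg):
    """Area under the ROC curve via pairwise comparison.

    Equals the probability that a random positive scores above a random
    negative, with ties counted as one half.
    """
    wins = 0.0
    for p in scores_pos:
        for n in scores_neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(scores_pos) * len(scores_neg))

# Hypothetical marker values for tumor (positive) and normal (negative) samples.
tumor = [0.9, 0.8, 0.4]
normal = [0.5, 0.3, 0.2]
print(roc_auc(tumor, normal))
```

The pairwise form is quadratic in sample size, which is fine for a sketch. For a marker that is lower in tumors (for example, a hypomethylated gene), the scores would be negated or the two groups swapped.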
Xiong, Kun; Long, Lingling; Zhang, Xudong; Qu, Hongke; Deng, Haixiao; Ding, Yanjun; Cai, Jifeng; Wang, Shuchao; Wang, Mi; Liao, Lvshuang; Huang, Jufang; Yi, Chun-Xia; Yan, Jie
2017-10-01
Long non-coding RNAs (lncRNAs) display multiple functions including regulation of neuronal injury. However, their impact in methamphetamine (METH)-induced neurotoxicity has rarely been reported. Here, using microarray analysis, we investigated the expression profiling of lncRNAs and mRNAs in primary cultured prefrontal cortical neurons after METH treatment. We observed a difference in lncRNA and mRNA expression between the experimental and sham control groups. Using bioinformatics, we analyzed the highest enriched gene ontology (GO) terms of biological process, cellular component, and molecular function, and Kyoto Encyclopedia of Genes and Genomes (KEGG) pathway and pathway network analysis. Furthermore, an lncRNA-mRNA co-expression sub-network for aberrantly expressed terms revealed possible interactions of lncRNA NR_110713 and NR_027943 with their related genes. Afterwards, three lncRNAs (NR_110713, NR_027943, GAS5) and two mRNAs (Ddit3, Casp12) were targeted to validate the microarray data by qRT-PCR. This presented an overview of lncRNA and mRNA expression profiling and indicated that lncRNA might participate in METH-induced neuronal apoptosis by regulating the coding genes of neurons. Copyright © 2017 Elsevier Ltd. All rights reserved.
Zhang, Chaoyang; Peng, Li; Zhang, Yaqin; Liu, Zhaoyang; Li, Wenling; Chen, Shilian; Li, Guancheng
2017-06-01
Liver cancer is a serious threat to public health and has a fairly complicated pathogenesis. Therefore, the identification of key genes and pathways is important for clarifying the molecular mechanisms of hepatocellular carcinoma (HCC) initiation and progression. An HCC-associated gene expression dataset was downloaded from the Gene Expression Omnibus database. The statistical software R was used for significance analysis of differentially expressed genes (DEGs) between liver cancer samples and normal samples. Gene Ontology (GO) term enrichment analysis and Kyoto Encyclopedia of Genes and Genomes (KEGG) pathway analysis, both performed in R, were applied to identify the pathways in which DEGs were significantly enriched. Cytoscape software was used to construct the protein-protein interaction (PPI) network and to perform module analysis to find the hub genes and key pathways. Finally, weighted correlation network analysis (WGCNA) was conducted to further screen critical gene modules with similar expression patterns and explore their biological significance. Significance analysis identified 1230 DEGs with fold change >2, including 632 significantly down-regulated DEGs and 598 significantly up-regulated DEGs. GO term enrichment analysis suggested that up-regulated DEGs were significantly enriched in immune response, cell adhesion, cell migration, type I interferon signaling pathway, and cell proliferation, and that down-regulated DEGs were mainly enriched in response to endoplasmic reticulum stress and endoplasmic reticulum unfolded protein response. KEGG pathway analysis found DEGs significantly enriched in five pathways: complement and coagulation cascades, focal adhesion, ECM-receptor interaction, antigen processing and presentation, and protein processing in endoplasmic reticulum. The top 10 hub genes in HCC identified from the PPI network were GMPS, ACACA, ALB, TGFB1, KRAS, ERBB2, BCL2, EGFR, STAT3, and CD8A. The top 3 gene interaction modules in the PPI network were enriched in immune response, organ development, and response to other organism, respectively. WGCNA revealed that the eight confirmed gene modules were significantly enriched in monooxygenase and oxidoreductase activity, response to endoplasmic reticulum stress, type I interferon signaling pathway, processing, presentation and binding of peptide antigen, cellular response to cadmium and zinc ion, cell locomotion and differentiation, ribonucleoprotein complex and RNA processing, and immune system process, respectively. In conclusion, we identified some key genes and pathways closely related to HCC initiation and progression through a series of bioinformatics analyses of DEGs. These screened genes and pathways provide a more detailed picture of the molecular mechanism underlying HCC occurrence and progression and hold promise as biomarkers and potential therapeutic targets.
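The fold-change criterion in this abstract (fold change >2 in either direction) can be sketched as a filter on log2 ratios of group means. The study performed this step in R together with a significance test. The Python version below omits the statistical test entirely, and the pseudocount, gene names, and expression values are illustrative assumptions.

```python
from math import log2

def split_degs(mean_tumor, mean_normal, min_fold=2.0, pseudocount=1.0):
    """Partition genes into up- and down-regulated sets by fold change.

    mean_tumor, mean_normal: dicts mapping gene -> mean expression per group.
    A pseudocount (an assumption of this sketch) avoids division by zero.
    Returns (sorted up-regulated genes, sorted down-regulated genes).
    """
    cutoff = log2(min_fold)
    up, down = [], []
    for gene, tumor_value in mean_tumor.items():
        if gene not in mean_normal:
            continue  # skip genes not measured in both groups
        lfc = log2((tumor_value + pseudocount) / (mean_normal[gene] + pseudocount))
        if lfc > cutoff:
            up.append(gene)
        elif lfc < -cutoff:
            down.append(gene)
    return sorted(up), sorted(down)

# Hypothetical group means for three genes.
toy_tumor = {"G1": 15.0, "G2": 3.0, "G3": 7.0}
toy_normal = {"G1": 3.0, "G2": 15.0, "G3": 7.0}
print(split_degs(toy_tumor, toy_normal))
```

Working on the log2 scale makes the cutoff symmetric, so a two-fold decrease is treated the same way as a two-fold increase.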
Colangelo, Christopher M.; Shifman, Mark; Cheung, Kei-Hoi; Stone, Kathryn L.; Carriero, Nicholas J.; Gulcicek, Erol E.; Lam, TuKiet T.; Wu, Terence; Bjornson, Robert D.; Bruce, Can; Nairn, Angus C.; Rinehart, Jesse; Miller, Perry L.; Williams, Kenneth R.
2015-01-01
We report a significantly-enhanced bioinformatics suite and database for proteomics research called Yale Protein Expression Database (YPED) that is used by investigators at more than 300 institutions worldwide. YPED meets the data management, archival, and analysis needs of a high-throughput mass spectrometry-based proteomics research ranging from a single laboratory, group of laboratories within and beyond an institution, to the entire proteomics community. The current version is a significant improvement over the first version in that it contains new modules for liquid chromatography–tandem mass spectrometry (LC–MS/MS) database search results, label and label-free quantitative proteomic analysis, and several scoring outputs for phosphopeptide site localization. In addition, we have added both peptide and protein comparative analysis tools to enable pairwise analysis of distinct peptides/proteins in each sample and of overlapping peptides/proteins between all samples in multiple datasets. We have also implemented a targeted proteomics module for automated multiple reaction monitoring (MRM)/selective reaction monitoring (SRM) assay development. We have linked YPED’s database search results and both label-based and label-free fold-change analysis to the Skyline Panorama repository for online spectra visualization. In addition, we have built enhanced functionality to curate peptide identifications into an MS/MS peptide spectral library for all of our protein database search identification results. PMID:25712262
Bioinformatics clouds for big data manipulation.
Dai, Lin; Gao, Xin; Guo, Yan; Xiao, Jingfa; Zhang, Zhang
2012-11-28
As advances in life sciences and information technology bring profound influences on bioinformatics due to its interdisciplinary nature, bioinformatics is experiencing a new leap-forward from in-house computing infrastructure into utility-supplied cloud computing delivered over the Internet, in order to handle the vast quantities of biological data generated by high-throughput experimental technologies. Albeit relatively new, cloud computing promises to address big data storage and analysis issues in the bioinformatics field. Here we review extant cloud-based services in bioinformatics, classify them into Data as a Service (DaaS), Software as a Service (SaaS), Platform as a Service (PaaS), and Infrastructure as a Service (IaaS), and present our perspectives on the adoption of cloud computing in bioinformatics. This article was reviewed by Frank Eisenhaber, Igor Zhulin, and Sandor Pongor.
Bioinformatics approach reveals systematic mechanism underlying lung adenocarcinoma.
Wu, Xiya; Zhang, Wei; Hu, Yunhua; Yi, Xianghua
2015-01-01
The purpose of this work was to explore the systematic molecular mechanism of lung adenocarcinoma and gain deeper insight into it. Comprehensive bioinformatics methods were applied. Initially, significant differentially expressed genes (DEGs) were analyzed from the Affymetrix microarray data (GSE27262) deposited in the Gene Expression Omnibus (GEO). Subsequently, gene ontology (GO) analysis was performed using the online Database for Annotation, Visualization and Integrated Discovery (DAVID) software. Finally, significant pathway crosstalk was investigated based on the information derived from the Kyoto Encyclopedia of Genes and Genomes (KEGG) database. According to our results, the N-terminal globular domain of the type X collagen (COL10A1) gene and transmembrane protein 100 (TMEM100) gene were identified to be the most significant DEGs in tumor tissue compared with the adjacent normal tissues. The main GO categories were biological process, cellular component and molecular function. In addition, in tumor tissue the crosstalk was significantly different between non-small cell lung cancer pathways and the inositol phosphate metabolism pathway, focal adhesion signal pathway, vascular smooth muscle contraction signal pathway, peroxisome proliferator-activated receptor (PPAR) signaling pathway and calcium signaling pathway. Dysfunctional genes and pathways may play key roles in the progression and development of lung adenocarcinoma. Our data provide a systematic perspective for understanding this mechanism and may be helpful in discovering an effective treatment for lung adenocarcinoma.
Li, Chen-Ye; Ma, Lan; Yu, Bo
2017-11-01
Circular RNAs (circRNAs) are a novel class of RNAs generated from back-splicing and characterized by covalently closed continuous loops. Recently, circRNAs have been shown to play a substantial regulatory role in the cardiovascular system, including in atherosclerosis. The present study aims to investigate the circRNA expression profile and identify the roles of circRNAs in vascular endothelial cells induced by oxLDL. Human circRNA microarray analysis identified a total of 943 differentially expressed circRNAs at a 2-fold change threshold. Hsa_circ_0003575 was validated to be significantly up-regulated in oxLDL-induced HUVECs. Loss-of-function experiments indicated that hsa_circ_0003575 silencing promoted the proliferation and angiogenesis ability of HUVECs. Online bioinformatics programs predicted the potential circRNA-miRNA-mRNA network for hsa_circ_0003575. In summary, circRNA microarray analysis reveals the expression profiles of HUVECs and verifies the role of hsa_circ_0003575 in HUVECs, providing a therapeutic strategy for vascular endothelial cell injury in atherosclerosis. Copyright © 2017. Published by Elsevier Masson SAS.
Expression of fox-related genes in the skin follicles of Inner Mongolia cashmere goat.
Han, Wenjing; Li, Xiaoyan; Wang, Lele; Wang, Honghao; Yang, Kun; Wang, Zhixin; Wang, Ruijun; Su, Rui; Liu, Zhihong; Zhao, Yanhong; Zhang, Yanjun; Li, Jinquan
2018-03-01
This study investigated the expression of genes in cashmere goats at different periods of their fetal development. Bioinformatics analysis was used to evaluate data obtained by transcriptome sequencing of fetal skin samples collected from Inner Mongolia cashmere goats on days 45, 55, and 65 of fetal age. We found that the FoxN1, FoxE1, and FoxI3 genes of the Fox gene family were probably involved in the growth and development of the follicle and the formation of hair, which is consistent with previous findings. Real-time quantitative polymerase chain reaction and Western blot analysis were employed to study the relative expression of the differentially expressed genes FoxN1, FoxE1, and FoxI3 in the body skin of cashmere goat fetuses and adult individuals. This study provided new fundamental information for further investigation of the genes related to follicle development and exploration of their roles in hair follicle initiation, growth, and development.
Analysis of Altered Micro RNA Expression Profiles in Focal Cortical Dysplasia IIB.
Li, Lin; Liu, Chang-Qing; Li, Tian-Fu; Guan, Yu-Guang; Zhou, Jian; Qi, Xue-Ling; Yang, Yu-Tao; Deng, Jia-Hui; Xu, Zhi-Qing David; Luan, Guo-Ming
2016-04-01
Focal cortical dysplasia type IIB is a commonly encountered subtype of developmental malformation of the cerebral cortex and is often associated with pharmacoresistant epilepsy. In this study, to investigate the molecular etiology of focal cortical dysplasia type IIB, the authors performed micro ribonucleic acid (RNA) microarray analysis on surgical specimens from 5 children (2 female and 3 male; mean age 73.4 months, range 50-112 months) diagnosed with focal cortical dysplasia type IIB and on matched normal tissue adjacent to the lesion. In all, 24 micro RNAs were differentially expressed in focal cortical dysplasia type IIB, and the microarray results were validated using quantitative real-time polymerase chain reaction (PCR). The putative target genes of the differentially expressed micro RNAs were then identified by bioinformatics analysis. Moreover, the biological significance of the target genes was evaluated by investigating the pathways in which the genes were enriched, and the Hippo signaling pathway was proposed to be closely related to the pathogenesis of focal cortical dysplasia type IIB. © The Author(s) 2015.
A novel gene expression-based prognostic scoring system to predict survival in gastric cancer
Wang, Pin; Wang, Yunshan; Hang, Bo; ...
2016-07-11
Analysis of gene expression patterns in gastric cancer (GC) can help to identify a comprehensive panel of gene biomarkers for predicting clinical outcomes and to discover potential new therapeutic targets. Here, a multi-step bioinformatics analytic approach was developed to establish a novel prognostic scoring system for GC. We first identified 276 genes that were robustly differentially expressed between normal and GC tissues, of which 249 were found to be significantly associated with overall survival (OS) by univariate Cox regression analysis. The biological functions of the 249 genes are related to cell cycle, RNA/ncRNA process, acetylation and extracellular matrix organization. A network was generated for view of the gene expression architecture of the 249 genes in 265 GCs. Finally, we applied a canonical discriminant analysis approach to identify a 53-gene signature, and a prognostic scoring system was established based on a canonical discriminant function of the 53 genes. The prognostic scores strongly predicted patients with GC to have either a poor or good OS. Our study raises the prospect that the practicality of GC patient prognosis can be assessed by this prognostic scoring system.
Identification of microRNAs differentially expressed involved in male flower development.
Wang, Zhengjia; Huang, Jianqin; Sun, Zhichao; Zheng, Bingsong
2015-03-01
Hickory (Carya cathayensis Sarg.) is one of the most economically important woody trees in eastern China, but its long flowering phase delays yield. Our understanding of the regulatory roles of microRNAs (miRNAs) in male flower development in hickory remains poor. Using high-throughput sequencing technology, we have pyrosequenced two small RNA libraries from two male flower differentiation stages in hickory. Analysis of the sequencing data identified 114 conserved miRNAs that belonged to 23 miRNA families, five novel miRNAs including their corresponding miRNA*s, and 22 plausible miRNA candidates. Differential expression analysis revealed 12 miRNA sequences that were upregulated in the later (reproductive) stage of male flower development. Quantitative real-time PCR showed similar expression trends as that of the deep sequencing. Novel miRNAs and plausible miRNA candidates were predicted using bioinformatic analysis methods. The miRNAs newly identified in this study have increased the number of known miRNAs in hickory, and the identification of differentially expressed miRNAs will provide new avenues for studies into miRNAs involved in the process of male flower development in hickory and other related trees.
Circular RNA Profiling and Bioinformatic Modeling Identify Its Regulatory Role in Hepatic Steatosis.
Guo, Xing-Ya; He, Chong-Xin; Wang, Yu-Qin; Sun, Chao; Li, Guang-Ming; Su, Qing; Pan, Qin; Fan, Jian-Gao
2017-01-01
Circular RNAs (circRNAs) exhibit a wide range of physiological and pathological activities. To uncover their role in hepatic steatosis, we investigated the expression profile of circRNAs in HepG2-based hepatic steatosis induced by high-fat stimulation. Differentially expressed circRNAs were subjected to validation using QPCR and functional analyses using principal component analysis, hierarchical clustering, target prediction, gene ontology (GO), and pathway annotation, respectively. Bioinformatic integration established the circRNA-miRNA-mRNA regulatory network so as to identify the mechanisms underlying circRNAs' metabolic effect. Here we reported that hepatic steatosis was associated with a total of 357 circRNAs. Enrichment of transcription-related GOs, especially GO: 0006355, GO: 004589, GO: 0045944, GO: 0045892, and GO: 0000122, demonstrated their specific actions in transcriptional regulation. Lipin 1 (LPIN1) was recognized to mediate the transcriptional regulatory effect of circRNAs on metabolic pathways. circRNA-miRNA-mRNA network further identified the signaling cascade of circRNA_021412/miR-1972/LPIN1, which was characterized by decreased level of circRNA_021412 and miR-1972-based inhibition of LPIN1. LPIN1-induced downregulation of long chain acyl-CoA synthetases (ACSLs) expression finally resulted in the hepatosteatosis. These findings identify circRNAs to be important regulators of hepatic steatosis. Transcription-dependent modulation of metabolic pathways may underlie their effects, partially by the circRNA_021412/miR-1972/LPIN1 signaling.
Datasets2Tools, repository and search engine for bioinformatics datasets, tools and canned analyses
Torre, Denis; Krawczuk, Patrycja; Jagodnik, Kathleen M.; Lachmann, Alexander; Wang, Zichen; Wang, Lily; Kuleshov, Maxim V.; Ma’ayan, Avi
2018-01-01
Biomedical data repositories such as the Gene Expression Omnibus (GEO) enable the search and discovery of relevant biomedical digital data objects. Similarly, resources such as OMICtools index bioinformatics tools that can extract knowledge from these digital data objects. However, systematic access to pre-generated ‘canned’ analyses applied by bioinformatics tools to biomedical digital data objects is currently not available. Datasets2Tools is a repository indexing 31,473 canned bioinformatics analyses applied to 6,431 datasets. The Datasets2Tools repository also indexes 4,901 published bioinformatics software tools and all the analyzed datasets. Datasets2Tools enables users to rapidly find datasets, tools, and canned analyses through an intuitive web interface, a Google Chrome extension, and an API. Furthermore, Datasets2Tools provides a platform for contributing canned analyses, datasets, and tools, as well as evaluating these digital objects according to their compliance with the findable, accessible, interoperable, and reusable (FAIR) principles. By incorporating community engagement, Datasets2Tools promotes sharing of digital resources to stimulate the extraction of knowledge from biomedical research data. Datasets2Tools is freely available from: http://amp.pharm.mssm.edu/datasets2tools. PMID:29485625
Wu, Zhifeng; Ding, Nannan; Yu, Mengxi; Wang, Ke; Luo, Shasha; Zou, Wenjun; Zhou, Ying; Yan, Biao; Jiang, Qin
2016-01-01
Rhegmatogenous retinal detachment associated with choroidal detachment (RRDCD) is a complicated and serious type of rhegmatogenous retinal detachment (RRD). In this study, we identified differentially expressed proteins in the vitreous humors of RRDCD and RRD using isobaric tags for relative and absolute quantitation (iTRAQ) combined with nano-liquid chromatography-electrospray ion trap-mass spectrometry-mass spectrometry (nano-LC-ESI-MS/MS) and bioinformatic analysis. Our results show that 103 differentially expressed proteins, including 54 up-regulated and 49 down-regulated proteins, were identified in RRDCD. Gene ontology (GO) analysis suggested that most of the differentially expressed proteins were extracellular. The Kyoto encyclopedia of genes and genomes (KEGG) pathway analysis suggested that proteins related to complement and coagulation cascades were significantly enriched. iTRAQ-based proteomic profiling reveals that complement and coagulation cascades and inflammation may play important roles in the pathogenesis of RRDCD. This study may provide novel insights into the pathogenesis of RRDCD and offer potential opportunities for the diagnosis and treatment of RRDCD. PMID:27941623
Janjanam, Jagadeesh; Singh, Surender; Jena, Manoj K; Varshney, Nishant; Kola, Srujana; Kumar, Sudarshan; Kaushik, Jai K; Grover, Sunita; Dang, Ajay K; Mukesh, Manishi; Prakash, B S; Mohanty, Ashok K
2014-01-01
The mammary gland is made up of a branching network of ducts that end in alveoli, which surround the lumen. These alveolar mammary epithelial cells (MEC) reflect the milk producing ability of farm animals. In this study, we have used 2D-DIGE and mass spectrometry to identify the protein changes in MEC during immediate early, peak and late stages of lactation and also compared differentially expressed proteins in MEC isolated from milk of high and low milk producing cows. We have identified 41 differentially expressed proteins during lactation stages and 22 proteins in high and low milk yielding cows. Bioinformatics analysis showed that a majority of the differentially expressed proteins are associated with metabolic processes and with catalytic and binding activity. The differentially expressed proteins were mapped to the available biological pathways and networks involved in lactation. The proteins up-regulated during the late stage of lactation are associated with NF-κB stress-induced signaling pathways, whereas Akt, PI3K and p38/MAPK signaling pathways are associated with high milk production mediated through insulin hormone signaling.
Chen, Chao-Jin; Liu, De-Zhao; Yao, Wei-Feng; Gu, Yu; Huang, Fei; Hei, Zi-Qing; Li, Xiang
2017-01-01
Neuropathic pain is a complex chronic condition occurring post-nervous system damage. The transcriptional reprogramming of injured dorsal root ganglia (DRGs) drives neuropathic pain. However, few comparative analyses using high-throughput platforms have investigated uninjured DRG in neuropathic pain, and potential interactions among differentially expressed genes (DEGs) and pathways were not taken into consideration. The aim of this study was to identify changes in genes and pathways associated with neuropathic pain in uninjured L4 DRG after L5 spinal nerve ligation (SNL) by using bioinformatic analysis. The microarray profile GSE24982 was downloaded from the Gene Expression Omnibus database to identify DEGs between DRGs in SNL and sham rats. The prioritization for these DEGs was performed using the Toppgene database followed by gene ontology and pathway enrichment analyses. The relationships among DEGs from the protein interactive perspective were analyzed using protein-protein interaction (PPI) network and module analysis. Real-time polymerase chain reaction (PCR) and Western blotting were used to confirm the expression of DEGs in the rodent neuropathic pain model. A total of 206 DEGs that might play a role in neuropathic pain were identified in L4 DRG, of which 75 were upregulated and 131 were downregulated. The upregulated DEGs were enriched in biological processes related to transcription regulation and molecular functions such as DNA binding, cell cycle, and the FoxO signaling pathway. Ctnnb1 protein had the highest connectivity degrees in the PPI network. The in vivo studies also validated that mRNA and protein levels of Ctnnb1 were upregulated in both L4 and L5 DRGs. This study provides insight into the functional gene sets and pathways associated with neuropathic pain in L4 uninjured DRG after L5 SNL, which might promote our understanding of the molecular mechanisms underlying the development of neuropathic pain.
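The network step in the abstract above (finding the protein with the highest connectivity degree in a PPI network) can be illustrated with a minimal sketch. It assumes the network is available as an undirected edge list; the node names `P1` to `P5` and the edges are hypothetical placeholders, not interactions from the study, and the abstract does not say which software the authors used for this step.

```python
from collections import defaultdict


def node_degrees(edges):
    """Return {node: degree} for an undirected graph given as (a, b) pairs."""
    neighbors = defaultdict(set)
    for a, b in edges:
        if a == b:
            continue  # ignore self-interactions
        # Sets make repeated edges count only once.
        neighbors[a].add(b)
        neighbors[b].add(a)
    return {node: len(partners) for node, partners in neighbors.items()}


# Hypothetical toy network (placeholder node names, not real proteins).
toy_edges = [
    ("P1", "P2"),
    ("P1", "P3"),
    ("P1", "P4"),
    ("P2", "P3"),
    ("P4", "P5"),
]

degrees = node_degrees(toy_edges)
hub = max(degrees, key=degrees.get)  # node with the most interaction partners
print(hub, degrees[hub])
```

Degree is only the simplest centrality measure; module analysis, as mentioned in the abstract, would need a dedicated graph library.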
Malin, Bradley; Carley, Kathleen
2007-01-01
The goal of this research is to learn how the editorial staffs of bioinformatics and medical informatics journals provide support for cross-community exposure. Models such as co-citation and co-author analysis measure the relationships between researchers; but they do not capture how environments that support knowledge transfer across communities are organized. In this paper, we propose a social network analysis model to study how editorial boards integrate researchers from disparate communities. We evaluate our model by building relational networks based on the editorial boards of approximately 40 journals that serve as research outlets in medical informatics and bioinformatics. We track the evolution of editorial relationships through a longitudinal investigation over the years 2000 through 2005. Our findings suggest that there are research journals that support the collocation of editorial board members from the bioinformatics and medical informatics communities. Network centrality metrics indicate that editorial board members are located in the intersection of the communities and that the number of individuals in the intersection is growing with time. Social network analysis methods provide insight into the relationships between the medical informatics and bioinformatics communities. The number of editorial board members facilitating the publication intersection of the communities has grown, but the intersection remains dependent on a small group of individuals and fragile.
Qu, Changfeng; He, Yingying; Zheng, Zhou; An, Meiling; Li, Lulu; Wang, Xixi; He, Xiaodong; Wang, Yibin; Liu, Fangming; Miao, Jinlai
2018-01-01
The α-carbonic anhydrase (α-CA) is a zinc ion-containing enzyme that catalyzes the hydration of carbon dioxide. In this paper, a full-length α-CA gene was cloned from Chlamydomonas sp. ICE-L using RT-PCR and RACE-PCR for bioinformatic analysis. The α-CA open reading frame obtained by PCR was cloned into a vector and transformed into Escherichia coli to generate α-CA-producing bacteria. The α-CA was highly expressed upon induction with isopropyl-β-d-thiogalactoside (IPTG) at a final concentration of 0.8 mM. A single band with a molecular weight of approximately 40 kDa expressed in the recombinant E. coli strain harboring the α-CA vector was observed in SDS-PAGE analysis. The carbon dioxide hydration activity and esterase activity of α-CA expressed by the recombinant strain were 0.404 U/mg and 0.319 U, respectively. In addition, three conditions, temperature, salinity and UVB radiation exposure, were selected to analyze α-CA transcription levels by qRT-PCR. The results suggested that UVB exposure increased relative mRNA expression; meanwhile, the α-CA mRNA expression was rapidly induced by temperature and salinity stress, indicating that Chlamydomonas sp. ICE-L might modulate the α-CA mRNA expression to adapt to extreme environments.
Granatum: a graphical single-cell RNA-Seq analysis pipeline for genomics scientists.
Zhu, Xun; Wolfgruber, Thomas K; Tasato, Austin; Arisdakessian, Cédric; Garmire, David G; Garmire, Lana X
2017-12-05
Single-cell RNA sequencing (scRNA-Seq) is an increasingly popular platform to study heterogeneity at the single-cell level. Computational methods to process scRNA-Seq data are not very accessible to bench scientists as they require a significant amount of bioinformatic skills. We have developed Granatum, a web-based scRNA-Seq analysis pipeline to make analysis more broadly accessible to researchers. Without a single line of programming code, users can click through the pipeline, setting parameters and visualizing results via the interactive graphical interface. Granatum conveniently walks users through various steps of scRNA-Seq analysis. It has a comprehensive list of modules, including plate merging and batch-effect removal, outlier-sample removal, gene-expression normalization, imputation, gene filtering, cell clustering, differential gene expression analysis, pathway/ontology enrichment analysis, protein network interaction visualization, and pseudo-time cell series construction. Granatum enables broad adoption of scRNA-Seq technology by empowering bench scientists with an easy-to-use graphical interface for scRNA-Seq data analysis. The package is freely available for research use at http://garmiregroup.org/granatum/app.
Convergent evidence from systematic analysis of GWAS revealed genetic basis of esophageal cancer.
Gao, Xue-Xin; Gao, Lei; Wang, Jiu-Qiang; Qu, Su-Su; Qu, Yue; Sun, Hong-Lei; Liu, Si-Dang; Shang, Ying-Li
2016-07-12
Recent genome-wide association studies (GWAS) have identified single nucleotide polymorphisms (SNPs) associated with risk of esophageal cancer (EC). However, investigation of genetic basis from the perspective of systematic biology and integrative genomics remains scarce. In this study, we explored genetic basis of EC based on GWAS data and implemented a series of bioinformatics methods including functional annotation, expression quantitative trait loci (eQTL) analysis, pathway enrichment analysis and pathway grouped network analysis. Two hundred and thirteen risk SNPs were identified, in which 44 SNPs were found to have significantly differential gene expression in esophageal tissues by eQTL analysis. By pathway enrichment analysis, 170 risk genes mapped by risk SNPs were enriched into 38 significant GO terms and 17 significant KEGG pathways, which were significantly grouped into 9 sub-networks by pathway grouped network analysis. The 9 groups of interconnected pathways were mainly involved with muscle cell proliferation, cellular response to interleukin-6, cell adhesion molecules, and ethanol oxidation, which might participate in the development of EC. Our findings provide genetic evidence and new insight for exploring the molecular mechanisms of EC.
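Pathway enrichment analysis, as used in the abstract above, is often based on a one-sided hypergeometric test. The abstract does not state which test the authors applied, so the following is only a sketch of that common choice. The function name and all numbers in the example are hypothetical.

```python
from math import comb


def hypergeom_enrichment_p(population, annotated, sample, overlap):
    """P(X >= overlap) when `sample` genes are drawn without replacement
    from `population` genes, of which `annotated` belong to the pathway."""
    upper = min(annotated, sample)
    total = comb(population, sample)
    # Sum the probability mass of every overlap at least as large as observed.
    tail = sum(
        comb(annotated, k) * comb(population - annotated, sample - k)
        for k in range(overlap, upper + 1)
    )
    return tail / total


# Hypothetical toy numbers: 20 background genes, 5 in the pathway,
# 4 risk genes, 2 of which fall in the pathway.
p_value = hypergeom_enrichment_p(population=20, annotated=5, sample=4, overlap=2)
print(round(p_value, 4))
```

In a real analysis the p-values for all tested GO terms or KEGG pathways would additionally be corrected for multiple testing before being called significant.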
Improvements to PATRIC, the all-bacterial Bioinformatics Database and Analysis Resource Center
Davis, James J.; Brettin, Thomas; Dietrich, Emily M.; ...
2016-11-28
The Pathosystems Resource Integration Center (PATRIC) is the bacterial Bioinformatics Resource Center. Recent changes to PATRIC include a redesign of the web interface and some new services that provide users with a platform that takes them from raw reads to an integrated analysis experience. The redesigned interface allows researchers direct access to tools and data, and the emphasis has changed to user-created genome-groups, with detailed summaries and views of the data that researchers have selected. Perhaps the biggest change has been the enhanced capability for researchers to analyze their private data and compare it to the available public data. Researchers can assemble their raw sequence reads and annotate the contigs using RASTtk. PATRIC also provides services for RNA-Seq, variation, model reconstruction and differential expression analysis, all delivered through an updated private workspace. Private data can be compared by 'virtual integration' to any of PATRIC's public data. The number of genomes available for comparison in PATRIC has expanded to over 80 000, with a special emphasis on genomes with antimicrobial resistance data. PATRIC uses this data to improve both subsystem annotation and k-mer classification, and tags new genomes as having signatures that indicate susceptibility or resistance to specific antibiotics.
Improvements to PATRIC, the all-bacterial Bioinformatics Database and Analysis Resource Center
Wattam, Alice R.; Davis, James J.; Assaf, Rida; Boisvert, Sébastien; Brettin, Thomas; Bun, Christopher; Conrad, Neal; Dietrich, Emily M.; Disz, Terry; Gabbard, Joseph L.; Gerdes, Svetlana; Henry, Christopher S.; Kenyon, Ronald W.; Machi, Dustin; Mao, Chunhong; Nordberg, Eric K.; Olsen, Gary J.; Murphy-Olson, Daniel E.; Olson, Robert; Overbeek, Ross; Parrello, Bruce; Pusch, Gordon D.; Shukla, Maulik; Vonstein, Veronika; Warren, Andrew; Xia, Fangfang; Yoo, Hyunseung; Stevens, Rick L.
2017-01-01
The Pathosystems Resource Integration Center (PATRIC) is the bacterial Bioinformatics Resource Center (https://www.patricbrc.org). Recent changes to PATRIC include a redesign of the web interface and some new services that provide users with a platform that takes them from raw reads to an integrated analysis experience. The redesigned interface allows researchers direct access to tools and data, and the emphasis has changed to user-created genome-groups, with detailed summaries and views of the data that researchers have selected. Perhaps the biggest change has been the enhanced capability for researchers to analyze their private data and compare it to the available public data. Researchers can assemble their raw sequence reads and annotate the contigs using RASTtk. PATRIC also provides services for RNA-Seq, variation, model reconstruction and differential expression analysis, all delivered through an updated private workspace. Private data can be compared by ‘virtual integration’ to any of PATRIC's public data. The number of genomes available for comparison in PATRIC has expanded to over 80 000, with a special emphasis on genomes with antimicrobial resistance data. PATRIC uses this data to improve both subsystem annotation and k-mer classification, and tags new genomes as having signatures that indicate susceptibility or resistance to specific antibiotics. PMID:27899627
Secretome profiles of immortalized dental follicle cells using iTRAQ-based proteomic analysis.
Dou, Lei; Wu, Yan; Yan, Qifang; Wang, Jinhua; Zhang, Yan; Ji, Ping
2017-08-04
Secretomes produced by mesenchymal stromal cells (MSCs) are considered to have therapeutic potential. However, harvesting enough primary MSCs from tissue is time-consuming and costly, which has impeded the application of MSC secretomes. This study aimed to immortalize MSCs and compare the secretome profiles of immortalized and original MSCs. Human dental follicle cells (DFCs) were isolated and immortalized using pMPH86. The secretome profile of immortalized DFCs (iDFCs) was investigated and compared using iTRAQ labeling combined with mass spectrometry (MS) quantitative proteomics. The MS data were analyzed using ProteinPilot™ software, followed by bioinformatic analysis of the identified proteins. A total of 2092 secreted proteins were detected in conditioned media of iDFCs. Compared with primary DFCs, 253 differentially expressed proteins were found in the iDFC secretome (142 up-regulated and 111 down-regulated). Intensive bioinformatic analysis revealed that the majority of secreted proteins were involved in cellular process, metabolic process, biological regulation, cellular component organization or biogenesis, immune system process, developmental process, response to stimulus and signaling. The proteomic profile of the cell secretome was not largely affected by immortalization with this piggyBac immortalization system. The secretome of iDFCs may be a good alternative to that of primary DFCs for regenerative medicine.
Redox-active antibiotics control gene expression and community behavior in divergent bacteria.
Dietrich, Lars E P; Teal, Tracy K; Price-Whelan, Alexa; Newman, Dianne K
2008-08-29
It is thought that bacteria excrete redox-active pigments as antibiotics to inhibit competitors. In Pseudomonas aeruginosa, the endogenous antibiotic pyocyanin activates SoxR, a transcription factor conserved in Proteo- and Actinobacteria. In Escherichia coli, SoxR regulates the superoxide stress response. Bioinformatic analysis coupled with gene expression studies in P. aeruginosa and Streptomyces coelicolor revealed that the majority of SoxR regulons in bacteria lack the genes required for stress responses, despite the fact that many of these organisms still produce redox-active small molecules, which indicates that redox-active pigments play a role independent of oxidative stress. These compounds had profound effects on the structural organization of colony biofilms in both P. aeruginosa and S. coelicolor, which shows that "secondary metabolites" play important conserved roles in gene expression and development.
No-boundary thinking in bioinformatics research
2013-01-01
Currently there are definitions from many agencies and research societies defining “bioinformatics” as deriving knowledge from computational analysis of large volumes of biological and biomedical data. Should this be the bioinformatics research focus? We will discuss this issue in this review article. We would like to promote the idea of supporting human-infrastructure (HI) with no-boundary thinking (NT) in bioinformatics (HINT). PMID:24192339
2005-01-01
The need to support bioinformatics training has been widely recognized by scientists, industry, and government institutions. However, the discussion of instructional methods for teaching bioinformatics is only beginning. Here we report on a systematic attempt to design two bioinformatics workshops for graduate biology students on the basis of Gagne's Conditions of Learning instructional design theory. This theory, although first published in the early 1970s, is still fundamental in instructional design and instructional technology. First, top-level as well as prerequisite learning objectives for a microarray analysis workshop and a primer design workshop were defined. Then a hierarchy of objectives for each workshop was created. Hands-on tutorials were designed to meet these objectives. Finally, events of learning proposed by Gagne's theory were incorporated into the hands-on tutorials. The resultant manuals were tested on a small number of trainees, revised, and applied in 1-day bioinformatics workshops. Based on this experience and on observations made during the workshops, we conclude that Gagne's Conditions of Learning instructional design theory provides a useful framework for developing bioinformatics training, but may not be optimal as a method for teaching it. PMID:16220141
contamDE: differential expression analysis of RNA-seq data for contaminated tumor samples.
Shen, Qi; Hu, Jiyuan; Jiang, Ning; Hu, Xiaohua; Luo, Zewei; Zhang, Hong
2016-03-01
Accurate detection of differentially expressed genes between tumor and normal samples is a primary approach of cancer-related biomarker identification. Due to the infiltration of tumor-surrounding normal cells, the expression data derived from tumor samples would always be contaminated with normal cells. Ignoring such cellular contamination would deflate the power of detecting DE genes and further confound the biological interpretation of the analysis results. For the time being, there does not exist any differential expression analysis approach for RNA-seq data in the literature that can properly account for the contamination of tumor samples. Without appealing to any extra information, we develop a new method 'contamDE' based on a novel statistical model that associates RNA-seq expression levels with cell types. It is demonstrated through simulation studies that contamDE could be much more powerful than the existing methods that ignore the contamination. In the application to two cancer studies, contamDE uniquely found several potential therapy and prognostic biomarkers of prostate cancer and non-small cell lung cancer. An R package contamDE is freely available at http://homepage.fudan.edu.cn/zhangh/softwares/ (contact: zhanghfd@fudan.edu.cn). Supplementary data are available at Bioinformatics online. © The Author 2015. Published by Oxford University Press.
miRToolsGallery: a tag-based and rankable microRNA bioinformatics resources database portal
Chen, Liang; Heikkinen, Liisa; Wang, ChangLiang; Yang, Yang; Knott, K Emily
2018-01-01
Hundreds of bioinformatics tools have been developed for MicroRNA (miRNA) investigations including those used for identification, target prediction, structure and expression profile analysis. However, finding the correct tool for a specific application requires the tedious and laborious process of locating, downloading, testing and validating the appropriate tool from a group of nearly a thousand. In order to facilitate this process, we developed a novel database portal named miRToolsGallery. We constructed the portal by manually curating > 950 miRNA analysis tools and resources. In the portal, a query to locate the appropriate tool is expedited by being searchable, filterable and rankable. The ranking feature is vital to quickly identify and prioritize the more useful from the obscure tools. Tools are ranked via different criteria including the PageRank algorithm, date of publication, number of citations, average of votes and number of publications. miRToolsGallery provides links and data for the comprehensive collection of currently available miRNA tools with a ranking function which can be adjusted using different criteria according to specific requirements. Database URL: http://www.mirtoolsgallery.org PMID:29688355
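One of the ranking criteria named in the abstract above is the PageRank algorithm. The abstract does not describe which graph the portal applies it to, so the sketch below is only a generic power-iteration PageRank on a hypothetical link graph; the tool names `tool_a`, `tool_b`, `tool_c` and the default damping factor of 0.85 are assumptions for illustration.

```python
def pagerank(links, damping=0.85, iterations=100):
    """Power-iteration PageRank.

    links: {node: [nodes it points to]}. Returns {node: score}; scores sum to 1.
    """
    nodes = set(links)
    for targets in links.values():
        nodes.update(targets)
    nodes = sorted(nodes)
    n = len(nodes)
    rank = {node: 1.0 / n for node in nodes}
    for _ in range(iterations):
        # Rank held by nodes without outgoing links is spread over all nodes.
        dangling = sum(rank[node] for node in nodes if not links.get(node))
        base = (1.0 - damping) / n + damping * dangling / n
        new_rank = {node: base for node in nodes}
        for node in nodes:
            targets = links.get(node, [])
            if targets:
                share = damping * rank[node] / len(targets)
                for target in targets:
                    new_rank[target] += share
        rank = new_rank
    return rank


# Hypothetical graph: an edge means "points to" (for example, cites or links to).
toy_links = {
    "tool_a": ["tool_c"],
    "tool_b": ["tool_c"],
    "tool_c": ["tool_a"],
}

scores = pagerank(toy_links)
ranking = sorted(scores, key=scores.get, reverse=True)
print(ranking)
```

The other criteria in the abstract (publication date, citation count, votes) are plain sort keys and need no iteration.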
USDA-ARS's Scientific Manuscript database
Remarkable advances in next-generation sequencing (NGS) technologies, bioinformatics algorithms, and computational technologies have significantly accelerated genomic research. However, complicated NGS data analysis still remains as a major bottleneck. RNA-seq, as one of the major area in the NGS fi...
Assessing an effective undergraduate module teaching applied bioinformatics to biology students
2018-01-01
Applied bioinformatics skills are becoming ever more indispensable for biologists, yet incorporation of these skills into the undergraduate biology curriculum is lagging behind, in part due to a lack of instructors willing and able to teach basic bioinformatics in classes that don’t specifically focus on quantitative skill development, such as statistics or computer sciences. To help undergraduate course instructors who themselves did not learn bioinformatics as part of their own education and are hesitant to plunge into teaching big data analysis, a module was developed that is written in plain-enough language, using publicly available computing tools and data, to allow novice instructors to teach next-generation sequence analysis to upper-level undergraduate students. To determine if the module allowed students to develop a better understanding of and appreciation for applied bioinformatics, various tools were developed and employed to assess the impact of the module. This article describes both the module and its assessment. Students found the activity valuable for their education and, in focus group discussions, emphasized that they saw a need for more and earlier instruction of big data analysis as part of the undergraduate biology curriculum. PMID:29324777
Bioinformatics clouds for big data manipulation
2012-01-01
As advances in life sciences and information technology bring profound influences on bioinformatics due to its interdisciplinary nature, bioinformatics is experiencing a new leap-forward from in-house computing infrastructure into utility-supplied cloud computing delivered over the Internet, in order to handle the vast quantities of biological data generated by high-throughput experimental technologies. Albeit relatively new, cloud computing promises to address big data storage and analysis issues in the bioinformatics field. Here we review extant cloud-based services in bioinformatics, classify them into Data as a Service (DaaS), Software as a Service (SaaS), Platform as a Service (PaaS), and Infrastructure as a Service (IaaS), and present our perspectives on the adoption of cloud computing in bioinformatics. Reviewers: This article was reviewed by Frank Eisenhaber, Igor Zhulin, and Sandor Pongor. PMID:23190475
[Preliminary analysis of retinal gene expression profile of diabetic rat].
Mei, Yan; Zhou, Hong-ying; Xiang, Tao; Lu, You-guang; Li, Ai-dong; Tang, En-jie; Yang, Hui-jun
2005-10-01
To establish and compare the retinal gene expression profiles of non-diabetic and diabetic rats in order to identify genes possibly related to diabetic retinopathy. The whole retinal transcriptional fragments of non-diabetic rats and 8-week diabetic rats were obtained by restriction fragment differential display-PCR (RFDD-PCR). Bioinformatic analysis of retinal gene expression was performed using software, including Fragment Analysis. After comparison of the expression profiles, gene fragments related to diabetic retinopathy were selected as candidate targets for further study. A total of 3639 significant fragments were obtained. Using a more than 3-fold difference in fluorescence intensity as the criterion for differential expression, the authors obtained 840 differential fragments, accounting for 23.08% of the expressed fragments and including 5 vision-related genes, 13 excitatory neurotransmitter genes and 3 inhibitory neurotransmitter genes. At the 8th week, the expression of Rhodopsin kinase, beta-arrestin, Phosducin, rod photoreceptor cGMP-gated channel and Rpe65, as well as iGluR1-4, was down-regulated. mGluRs and GABA-Rs were all up-regulated, whereas the expression of GlyR was unchanged. These results again suggest that changes in the retinal neural layer of the rat occur at an early stage of diabetes. The expression patterns of vision-related genes and of excitatory and inhibitory neurotransmitter genes in the diabetic rat retina are involved in the neural dysfunction of the diabetic retina.
Identification and pathway analysis of microRNAs with no previous involvement in breast cancer.
Romero-Cordoba, Sandra; Rodriguez-Cuevas, Sergio; Rebollar-Vega, Rosa; Quintanar-Jurado, Valeria; Maffuz-Aziz, Antonio; Jimenez-Sanchez, Gerardo; Bautista-Piña, Veronica; Arellano-Llamas, Rocio; Hidalgo-Miranda, Alfredo
2012-01-01
microRNA expression signatures can differentiate normal and breast cancer tissues and can define specific clinico-pathological phenotypes in breast tumors. In order to further evaluate the microRNA expression profile in breast cancer, we analyzed the expression of 667 microRNAs in 29 tumors and 21 adjacent normal tissues using TaqMan Low-density arrays. 130 miRNAs showed significant differential expression (adjusted P value = 0.05, Fold Change = 2) in breast tumors compared to the normal adjacent tissue. Importantly, the role of 43 of these microRNAs has not been previously reported in breast cancer, including several evolutionarily conserved microRNA*, showing similar expression rates to those of their corresponding leading strands. The expression of 14 microRNAs was replicated in an independent set of 55 tumors. Bioinformatic analysis of mRNA targets of the altered miRNAs identified oncogenes like ERBB2, YY1, several MAP kinases, and known tumor-suppressors like FOXA1 and SMAD4. Pathway analysis identified that some biological processes that are important in breast carcinogenesis are affected by the altered microRNA expression, including signaling through MAP kinases and TP53 pathways, as well as biological processes like cell death and communication, focal adhesion and ERBB2-ERBB3 signaling. Our data identified the altered expression of several microRNAs whose aberrant expression might have an important impact on cancer-related cellular pathways and whose role in breast cancer has not been previously described.
Rebholz-Schuhman, Dietrich; Cameron, Graham; Clark, Dominic; van Mulligen, Erik; Coatrieux, Jean-Louis; Del Hoyo Barbolla, Eva; Martin-Sanchez, Fernando; Milanesi, Luciano; Porro, Ivan; Beltrame, Francesco; Tollis, Ioannis; Van der Lei, Johan
2007-03-08
The SYMBIOmatics Specific Support Action (SSA) is "an information gathering and dissemination activity" that seeks "to identify synergies between the bioinformatics and the medical informatics" domain to improve collaborative progress between both domains (ref. to http://www.symbiomatics.org). As part of the project experts in both research fields will be identified and approached through a survey. To provide input to the survey, the scientific literature was analysed to extract topics relevant to both medical informatics and bioinformatics. This paper presents results of a systematic analysis of the scientific literature from medical informatics research and bioinformatics research. In the analysis pairs of words (bigrams) from the leading bioinformatics and medical informatics journals have been used as indication of existing and emerging technologies and topics over the period 2000-2005 ("recent") and 1990-1990 ("past"). We identified emerging topics that were equally important to bioinformatics and medical informatics in recent years such as microarray experiments, ontologies, open source, text mining and support vector machines. Emerging topics that evolved only in bioinformatics were system biology, protein interaction networks and statistical methods for microarray analyses, whereas emerging topics in medical informatics were grid technology and tissue microarrays. We conclude that although both fields have their own specific domains of interest, they share common technological developments that tend to be initiated by new developments in biotechnology and computer science.
Rebholz-Schuhman, Dietrich; Cameron, Graham; Clark, Dominic; van Mulligen, Erik; Coatrieux, Jean-Louis; Del Hoyo Barbolla, Eva; Martin-Sanchez, Fernando; Milanesi, Luciano; Porro, Ivan; Beltrame, Francesco; Tollis, Ioannis; Van der Lei, Johan
2007-01-01
Background: The SYMBIOmatics Specific Support Action (SSA) is "an information gathering and dissemination activity" that seeks "to identify synergies between the bioinformatics and the medical informatics" domain to improve collaborative progress between both domains (ref. to http://www.symbiomatics.org). As part of the project experts in both research fields will be identified and approached through a survey. To provide input to the survey, the scientific literature was analysed to extract topics relevant to both medical informatics and bioinformatics. Results: This paper presents results of a systematic analysis of the scientific literature from medical informatics research and bioinformatics research. In the analysis pairs of words (bigrams) from the leading bioinformatics and medical informatics journals have been used as indication of existing and emerging technologies and topics over the period 2000–2005 ("recent") and 1990–1990 ("past"). We identified emerging topics that were equally important to bioinformatics and medical informatics in recent years such as microarray experiments, ontologies, open source, text mining and support vector machines. Emerging topics that evolved only in bioinformatics were system biology, protein interaction networks and statistical methods for microarray analyses, whereas emerging topics in medical informatics were grid technology and tissue microarrays. Conclusion: We conclude that although both fields have their own specific domains of interest, they share common technological developments that tend to be initiated by new developments in biotechnology and computer science. PMID:17430562
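The bigram analysis described in the abstract above can be sketched as counting adjacent word pairs across a collection of texts. The tokenization rule and the two example sentences are assumptions for illustration; the abstract does not describe the authors' actual preprocessing.

```python
import re
from collections import Counter


def bigram_counts(texts):
    """Count adjacent word pairs (bigrams) across a list of texts."""
    counts = Counter()
    for text in texts:
        # Lowercase and keep alphanumeric runs as words (a simplifying assumption).
        words = re.findall(r"[a-z0-9]+", text.lower())
        counts.update(zip(words, words[1:]))
    return counts


# Hypothetical example sentences, not taken from the analysed journals.
toy_texts = [
    "Text mining of microarray experiments",
    "Support vector machines for text mining",
]

counts = bigram_counts(toy_texts)
print(counts.most_common(1))
```

Comparing such counts between two time periods, after normalizing by the number of texts per period, would point to bigrams whose use is rising or falling.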
Analyzing the field of bioinformatics with the multi-faceted topic modeling technique.
Heo, Go Eun; Kang, Keun Young; Song, Min; Lee, Jeong-Hoon
2017-05-31
Bioinformatics is an interdisciplinary field at the intersection of molecular biology and computing technology. To characterize the field as a convergent domain, researchers have used bibliometrics, augmented with text-mining techniques for content analysis. In previous studies, Latent Dirichlet Allocation (LDA) was the most representative topic modeling technique for identifying topic structure of subject areas. However, as opposed to revealing the topic structure in relation to metadata such as authors, publication date, and journals, LDA only displays the simple topic structure. In this paper, we adopt Tang et al.'s Author-Conference-Topic (ACT) model to study the field of bioinformatics from the perspective of keyphrases, authors, and journals. The ACT model is capable of incorporating the paper, author, and conference into the topic distribution simultaneously. To obtain more meaningful results, we use journals and keyphrases instead of conferences and bag-of-words. For analysis, we use PubMed to collect forty-six bioinformatics journals from the MEDLINE database. We conducted time series topic analysis over four periods from 1996 to 2015 to further examine the interdisciplinary nature of bioinformatics. We analyze the ACT Model results in each period. Additionally, for further integrated analysis, we conduct a time series analysis among the top-ranked keyphrases, journals, and authors according to their frequency. We also examine the patterns in the top journals by simultaneously identifying the topical probability in each period, as well as the top authors and keyphrases. The results indicate that in recent years diversified topics have become more prevalent and convergent topics have become more clearly represented. The results of our analysis imply that over time the field of bioinformatics has become more interdisciplinary, with a steady increase in peripheral fields such as conceptual, mathematical, and systems biology. These results are confirmed by integrated analysis of topic distribution as well as top ranked keyphrases, authors, and journals.
Singh, Amarjeet; Baranwal, Vinay; Shankar, Alka; Kanwar, Poonam; Ranjan, Rajeev; Yadav, Sandeep; Pandey, Amita; Kapoor, Sanjay; Pandey, Girdhar K.
2012-01-01
Background Phospholipase A (PLA) is an important group of enzymes responsible for phospholipid hydrolysis in lipid signaling. PLAs have been implicated in abiotic stress signaling and developmental events in various plant species. Genome-wide analysis of the PLA superfamily has been carried out in the dicot plant Arabidopsis. A comprehensive genome-wide analysis of PLAs has not yet been presented in the crop plant rice. Methodology/Principal Findings A comprehensive bioinformatics analysis identified a total of 31 PLA encoding genes in the rice genome, which are divided into three classes: phospholipase A1 (PLA1), patatin like phospholipases (pPLA) and low molecular weight secretory phospholipase A2 (sPLA2), based on their sequences and phylogeny. A subset of 10 rice PLAs exhibited chromosomal duplication, emphasizing the role of duplication in the expansion of this gene family in rice. Microarray expression profiling revealed a number of PLA members expressing differentially and significantly under abiotic stresses and reproductive development. Comparative expression analysis with Arabidopsis PLAs revealed a high degree of functional conservation between the orthologs in the two plant species, which also indicated the vital role of PLAs in stress signaling and plant development across different plant species. Moreover, sub-cellular localization of a few candidates suggests their differential localization and functional role in lipid signaling. Conclusion/Significance The comprehensive analysis and expression profiling would provide a critical platform for the functional characterization of the candidate PLA genes in crop plants. PMID:22363522
DEsingle for detecting three types of differential expression in single-cell RNA-seq data.
Miao, Zhun; Deng, Ke; Wang, Xiaowo; Zhang, Xuegong
2018-04-24
The excessive number of zeros in single-cell RNA-seq data includes "real" zeros due to the on-off nature of gene transcription in single cells and "dropout" zeros due to technical reasons. Existing differential expression (DE) analysis methods cannot distinguish these two types of zeros. We developed an R package, DEsingle, which employs a Zero-Inflated Negative Binomial model to estimate the proportion of real and dropout zeros and to define and detect 3 types of DE genes in single-cell RNA-seq data with higher accuracy. The R package DEsingle is freely available at https://github.com/miaozhun/DEsingle and is under Bioconductor's consideration now. Contact: zhangxg@tsinghua.edu.cn. Supplementary data are available at Bioinformatics online.
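To make the zero-inflation idea concrete, the following is a minimal sketch of a Zero-Inflated Negative Binomial probability mass function, not the DEsingle implementation. The function names and the parameterization (mixing weight `theta`, NB size `r`, NB success probability `p`) are assumptions chosen for illustration; which mixture component is interpreted as "real" versus "dropout" zeros is a modeling decision that this sketch does not fix.

```python
import math

def nb_pmf(k, r, p):
    """Negative binomial pmf for count k, with size r > 0 and success probability 0 < p < 1."""
    log_coef = math.lgamma(k + r) - math.lgamma(k + 1) - math.lgamma(r)
    return math.exp(log_coef + r * math.log(p) + k * math.log1p(-p))

def zinb_pmf(k, theta, r, p):
    """Zero-inflated NB: with probability theta the count is a constant zero, otherwise NB(r, p)."""
    base = (1.0 - theta) * nb_pmf(k, r, p)
    return theta + base if k == 0 else base

def zero_split(theta, r, p):
    """Fractions of all expected zeros that come from the constant-zero and the NB component."""
    total = zinb_pmf(0, theta, r, p)
    return theta / total, (1.0 - theta) * nb_pmf(0, r, p) / total
```

With the illustrative values `theta=0.2`, `r=2`, `p=0.5`, the NB component alone gives zero with probability 0.25, so the overall zero probability is 0.2 + 0.8 × 0.25 = 0.4 and the two kinds of zeros contribute equally.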
Federation in genomics pipelines: techniques and challenges.
Chaterji, Somali; Koo, Jinkyu; Li, Ninghui; Meyer, Folker; Grama, Ananth; Bagchi, Saurabh
2017-08-29
Federation is a popular concept in building distributed cyberinfrastructures, whereby computational resources are provided by multiple organizations through a unified portal, decreasing the complexity of moving data back and forth among multiple organizations. Federation has been used in bioinformatics only to a limited extent, namely, federation of datastores, e.g. SBGrid Consortium for structural biology and Gene Expression Omnibus (GEO) for functional genomics. Here, we posit that it is important to federate both computational resources (CPU, GPU, FPGA, etc.) and datastores to support popular bioinformatics portals, with fast-increasing data volumes and increasing processing requirements. A prime example, and one that we discuss here, is in genomics and metagenomics. It is critical that the processing of the data be done without having to transport the data across large network distances. We exemplify our design and development through our experience with metagenomics-RAST (MG-RAST), the most popular metagenomics analysis pipeline. Currently, it is hosted completely at Argonne National Laboratory. However, through a recently started collaborative National Institutes of Health project, we are taking steps toward federating this infrastructure. Being a widely used resource, we have to move toward federation without disrupting 50 K annual users. In this article, we describe the computational tools that will be useful for federating a bioinformatics infrastructure and the open research challenges that we see in federating such infrastructures. It is hoped that our manuscript can serve to spur greater federation of bioinformatics infrastructures by showing the steps involved, and thus, allow them to scale to support larger user bases. © The Author 2017. Published by Oxford University Press. All rights reserved. For Permissions, please email: journals.permissions@oup.com.
Feng, Yuandong; Shen, Ying; Chen, Hongli; Wang, Xiaman; Zhang, Ru; Peng, Yue; Lei, Xiaoru; Liu, Tian; Liu, Jing; Gu, Liufang; Wang, Fangxia; Yang, Yun; Bai, Ju; Wang, Jianli; Zhao, Wanhong; He, Aili
2018-02-01
Long non-coding RNAs (lncRNAs) are transcripts longer than 200 nt that are involved in tumorigenesis and play a key role in cancer progression. To determine whether lncRNAs are involved in acute myeloid leukemia (AML), we analyzed the expression profile of lncRNAs and mRNAs in AML. Five pairs of AML patients and iron deficiency anemia (IDA) controls were screened by microarray. Through coexpression analysis, differentially expressed transcripts were divided into modules, and lncRNAs were functionally annotated. We further analyzed the clinical significance of crucial lncRNAs from modules in public data. Finally, the expression of three lncRNAs, RP11-222K16.2, AC092580.4, and RP11-305O.6, which may be associated with AML patients' overall survival, was validated in newly diagnosed AML, AML relapse, and IDA patient groups by quantitative RT-PCR. Further analysis showed that RP11-222K16.2 might affect the differentiation of natural killer cells and promote the immune evasion of AML by regulating Eomesodermin expression. This study revealed that dysregulated lncRNAs and mRNAs in AML vs IDA controls could affect the immune system and hematopoietic cell differentiation. The biological functions of those lncRNAs need to be further validated. © 2017 The Authors. Cancer Science published by John Wiley & Sons Australia, Ltd on behalf of Japanese Cancer Association.
Familial aggregation analysis of gene expressions
Rao, Shao-Qi; Xu, Liang-De; Zhang, Guang-Mei; Li, Xia; Li, Lin; Shen, Gong-Qing; Jiang, Yang; Yang, Yue-Ying; Gong, Bin-Sheng; Jiang, Wei; Zhang, Fan; Xiao, Yun; Wang, Qing K
2007-01-01
Traditional studies of familial aggregation are aimed at defining the genetic (and non-genetic) causes of a disease from physiological or clinical traits. However, there has been little attempt to use genome-wide gene expressions, the direct phenotypic measures of genes, as the traits to investigate several extended issues regarding the distributions of familially aggregated genes on chromosomes or in functions. In this study we conducted a genome-wide familial aggregation analysis by using the in vitro cell gene expressions of 3300 human autosome genes (Problem 1 data provided to Genetic Analysis Workshop 15) in order to answer three basic genetics questions. First, we investigated how gene expressions aggregate among different types (degrees) of relative pairs. Second, we conducted a bioinformatics analysis of highly familially aggregated genes to see how they are distributed on chromosomes. Third, we performed a gene ontology enrichment test of familially aggregated genes to find evidence to support their functional consensus. The results indicated that 1) gene expressions did aggregate in families, especially between sibs. Of 3300 human genes analyzed, there were a total of 1105 genes with one or more significant (empirical p < 0.05) familial correlation; 2) there were several genomic hot spots where highly familially aggregated genes (e.g., the chromosome 6 HLA genes cluster) were clustered; 3) as we expected, gene ontology enrichment tests revealed that the 1105 genes were aggregating not only in families but also in functional categories. PMID:18466548
Accessing and integrating data and knowledge for biomedical research.
Burgun, A; Bodenreider, O
2008-01-01
To review the issues that have arisen with the advent of translational research in terms of integration of data and knowledge, and survey current efforts to address these issues. Using examples from the biomedical literature, we identified new trends in biomedical research and their impact on bioinformatics. We analyzed the requirements for effective knowledge repositories and studied issues in the integration of biomedical knowledge. New diagnostic and therapeutic approaches based on gene expression patterns have brought about new issues in the statistical analysis of data, and new workflows are needed to support translational research. Interoperable data repositories based on standard annotations, infrastructures and services are needed to support the pooling and meta-analysis of data, as well as their comparison to earlier experiments. High-quality, integrated ontologies and knowledge bases serve as a source of prior knowledge used in combination with traditional data mining techniques and contribute to the development of more effective data analysis strategies. As biomedical research evolves from traditional clinical and biological investigations towards omics sciences and translational research, specific needs have emerged, including integrating data collected in research studies with patient clinical data, linking omics knowledge with medical knowledge, modeling the molecular basis of diseases, and developing tools that support in-depth analysis of research data. As such, translational research illustrates the need to bridge the gap between bioinformatics and medical informatics, and opens new avenues for biomedical informatics research.
Nebula--a web-server for advanced ChIP-seq data analysis.
Boeva, Valentina; Lermine, Alban; Barette, Camille; Guillouf, Christel; Barillot, Emmanuel
2012-10-01
ChIP-seq consists of chromatin immunoprecipitation and deep sequencing of the extracted DNA fragments. It is the technique of choice for accurate characterization of the binding sites of transcription factors and other DNA-associated proteins. We present a web service, Nebula, which allows inexperienced users to perform a complete bioinformatics analysis of ChIP-seq data. Nebula was designed for both bioinformaticians and biologists. It is based on the Galaxy open source framework. Galaxy already includes a large number of functionalities for mapping reads and peak calling. We added the following to Galaxy: (i) peak calling with FindPeaks and a module for immunoprecipitation quality control, (ii) de novo motif discovery with ChIPMunk, (iii) calculation of the density and the cumulative distribution of peak locations relative to gene transcription start sites, (iv) annotation of peaks with genomic features and (v) annotation of genes with peak information. Nebula generates the graphs and the enrichment statistics at each step of the process. During Steps 3-5, Nebula optionally repeats the analysis on a control dataset and compares these results with those from the main dataset. Nebula can also incorporate gene expression (or gene modulation) data during these steps. In summary, Nebula is an innovative web service that offers an advanced ChIP-seq analysis pipeline providing ready-to-publish results. Nebula is available at http://nebula.curie.fr/. Supplementary data are available at Bioinformatics online.
Chen, Jinyun; Wu, Xifeng; Huang, Yujing; Chen, Wei; Brand, Randall E.; Killary, Ann M.; Sen, Subrata; Frazier, Marsha L.
2016-01-01
Biomarkers are urgently needed for the early detection of pancreatic cancer (PC). Our purpose was to identify a panel of genetic variants that, combined, can predict increased risk for early-onset PC and thereby identify individuals who should begin screening at an early age. Previously, using a functional genomic approach, we identified genes that were aberrantly expressed in early pathways to PC tumorigenesis. We now report the discovery of single nucleotide polymorphisms (SNPs) in these genes associated with early age at diagnosis of PC using a two-phase study design. In silico and bioinformatics tools were used to examine functional relevance of the identified SNPs. Eight SNPs were consistently associated with age at diagnosis in the discovery phase, validation phase and pooled analysis. Further analysis of the joint effects of these 8 SNPs showed that, compared to participants carrying none of these unfavorable genotypes (median age at PC diagnosis 70 years), those carrying 1–2, 3–4, or 5 or more unfavorable genotypes had median ages at diagnosis of 64, 63, and 62 years, respectively (P = 3.0E–04). A gene-dosage effect was observed, with age at diagnosis inversely related to number of unfavorable genotypes (Ptrend = 1.0E–04). Using bioinformatics tools, we found that all of the 8 SNPs were predicted to play functional roles in the disruption of transcription factor and/or enhancer binding sites and most of them were expression quantitative trait loci (eQTL) of the target genes. The panel of genetic markers identified may serve as susceptibility markers for earlier PC diagnosis. PMID:27486767
Li, Shengjie; Shen, Li; Sun, Lianjie; Xu, Jiao; Jin, Ping; Chen, Liming; Ma, Fei
2017-05-01
Drosophila have served as a model for research on innate immunity for decades. However, knowledge of the post-transcriptional regulation of immune gene expression by microRNAs (miRNAs) remains rudimentary. In the present study, using small RNA-seq and bioinformatics analysis, we identified 67 differentially expressed miRNAs in Drosophila infected with Escherichia coli compared to injured flies at three time-points. Furthermore, we found that 21 of these miRNAs were potentially involved in the regulation of Imd pathway-related genes. Strikingly, based on UAS-miRNAs line screening and Dual-luciferase assay, we identified that miR-9a and miR-981 could both negatively regulate Drosophila antibacterial defenses and decrease the level of the antibacterial peptide, Diptericin. Taken together, these data support the involvement of miRNAs in the regulation of the Drosophila Imd pathway. Copyright © 2017 Elsevier Ltd. All rights reserved.
[Weighted gene co-expression network analysis in biomedicine research].
Liu, Wei; Li, Li; Ye, Hua; Tu, Wei
2017-11-25
High-throughput biological technologies are now widely applied in biology and medicine, allowing scientists to monitor thousands of parameters simultaneously in a specific sample. However, it is still an enormous challenge to mine useful information from high-throughput data. The emergence of network biology provides deeper insights into complex bio-system and reveals the modularity in tissue/cellular networks. Correlation networks are increasingly used in bioinformatics applications. Weighted gene co-expression network analysis (WGCNA) tool can detect clusters of highly correlated genes. Therefore, we systematically reviewed the application of WGCNA in the study of disease diagnosis, pathogenesis and other related fields. First, we introduced principle, workflow, advantages and disadvantages of WGCNA. Second, we presented the application of WGCNA in disease, physiology, drug, evolution and genome annotation. Then, we indicated the application of WGCNA in newly developed high-throughput methods. We hope this review will help to promote the application of WGCNA in biomedicine research.
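As a rough illustration of the first step of a weighted co-expression network, the sketch below raises absolute Pearson correlations to a soft-threshold power to obtain an unsigned adjacency. It is a simplified stand-in written in plain Python, not the WGCNA R package itself; the function names, the toy expression values, and the default `beta=6` are assumptions made for this example.

```python
import math

def pearson(x, y):
    """Pearson correlation of two equal-length expression profiles."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def soft_threshold_adjacency(expr, beta=6):
    """Unsigned weighted adjacency: a_ij = |cor(x_i, x_j)| ** beta, with a_ii = 1."""
    genes = list(expr)
    return {
        g: {h: 1.0 if g == h else abs(pearson(expr[g], expr[h])) ** beta
            for h in genes}
        for g in genes
    }

# Toy expression profiles across four samples (invented values).
expr = {"A": [1, 2, 3, 4], "B": [2, 4, 6, 8], "C": [1, -1, 1, -1]}
adj = soft_threshold_adjacency(expr)
```

The power transform keeps strongly correlated pairs (A and B, adjacency 1.0) while pushing weak correlations toward zero (A and C, adjacency 0.008), which is what lets later clustering steps pick out modules of highly correlated genes.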
Pathway analysis from lists of microRNAs: common pitfalls and alternative strategy
Godard, Patrice; van Eyll, Jonathan
2015-01-01
MicroRNAs (miRNAs) are involved in the regulation of gene expression at a post-transcriptional level. As such, monitoring miRNA expression has been increasingly used to assess their role in regulatory mechanisms of biological processes. In large scale studies, once miRNAs of interest have been identified, the target genes they regulate are often inferred using algorithms or databases. A pathway analysis is then often performed in order to generate hypotheses about the relevant biological functions controlled by the miRNA signature. Here we show that the method widely used in scientific literature to identify these pathways is biased and leads to inaccurate results. In addition to describing the bias and its origin we present an alternative strategy to identify potential biological functions specifically impacted by a miRNA signature. More generally, our study exemplifies the crucial need of relevant negative controls when developing, and using, bioinformatics methods. PMID:25800743
2012-01-01
Background Thalidomide is an anti-inflammatory and anti-angiogenic drug currently used for the treatment of several diseases, including erythema nodosum leprosum, which occurs in patients with lepromatous leprosy. In this research, we use DNA microarray analysis to identify the impact of thalidomide on gene expression responses in human cells after lipopolysaccharide (LPS) stimulation. We employed a two-stage framework. Initially, we identified 1584 altered genes in response to LPS. Modulation of this set of genes was then analyzed in the LPS stimulated cells treated with thalidomide. Results We identified 64 genes with altered expression induced by thalidomide using the rank product method. In addition, the lists of up-regulated and down-regulated genes were investigated by means of bioinformatics functional analysis, which allowed for the identification of biological processes affected by thalidomide. Confirmatory analysis was done in five of the identified genes using real time PCR. Conclusions The results showed some genes that can further our understanding of the biological mechanisms in the action of thalidomide. Of the five genes evaluated with real time PCR, three were down regulated and two were up regulated confirming the initial results of the microarray analysis. PMID:22695124
Unmasking Upstream Gene Expression Regulators with miRNA-corrected mRNA Data
Bollmann, Stephanie; Bu, Dengpan; Wang, Jiaqi; Bionaz, Massimo
2015-01-01
Expressed micro-RNA (miRNA) affects messenger RNA (mRNA) abundance, hindering the accuracy of upstream regulator analysis. Our objective was to provide an algorithm to correct such bias. Large mRNA and miRNA analyses were performed on RNA extracted from bovine liver and mammary tissue. Using four levels of target scores from TargetScan (all miRNA:mRNA target gene pairs or only the top 25%, 50%, or 75%) and four levels of the magnitude of miRNA effect (ME) on mRNA expression (30%, 50%, 75%, and 83% mRNA reduction), we generated 17 different datasets (including the original dataset). For each dataset, we performed upstream regulator analysis using two bioinformatics tools. We detected an increased effect on the upstream regulator analysis with larger miRNA:mRNA pair bins and higher ME. The miRNA correction allowed identification of several upstream regulators not present in the analysis of the original dataset. Thus, the proposed algorithm improved the prediction of upstream regulators. PMID:27279737
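The count of 17 datasets follows from crossing the four target-score bins with the four ME levels and adding the uncorrected original. The short sketch below only enumerates that grid; the variable names are illustrative, and the correction algorithm itself is not reproduced because the abstract does not specify it.

```python
from itertools import product

# Levels as stated in the abstract; labels are illustrative.
target_bins = ["all pairs", "top 25%", "top 50%", "top 75%"]
me_levels = [0.30, 0.50, 0.75, 0.83]  # assumed fraction of mRNA reduction

# One corrected dataset per (bin, ME) combination, plus the original data.
corrected = list(product(target_bins, me_levels))
datasets = [("original", None)] + corrected
```

Here `len(corrected)` is 16 and `len(datasets)` is 17, matching the number reported.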
Dong, Zhiyong; Zheng, Longzhi; Liu, Weimin; Wang, Cunchuan
2018-01-01
Background The relationship between TP53 codon 72 Pro/Arg gene polymorphism and colorectal cancer risk in Asians is still controversial, and this bioinformatics analysis and meta-analysis was performed to assess the associations. Methods The association studies were identified from PubMed, and eligible reports were included. RevMan 5.3.1 software, Oncolnc, cBioPortal, and Oncomine online tools were used for statistical analysis. A random/fixed effects model was used in meta-analysis. The data were reported as risk ratios or mean differences with corresponding 95% CI. Results We confirmed that TP53 was associated with colorectal cancer, the alteration frequency of TP53 was 53% mutation and 7% deep deletion, and TP53 mRNA expression was different in different types of colorectal cancer based on The Cancer Genome Atlas database. Then, 18 studies were included that examine the association of TP53 codon 72 gene polymorphism with colorectal cancer risk in Asians. The meta-analysis indicated that TP53 Pro allele and Pro/Pro genotype were associated with colorectal cancer risk in Asian population, but Arg/Arg genotype was not (Pro allele: odds ratios [OR]=1.20, 95% CI: 1.06 to 1.35, P=0.003; Pro/Pro genotype: OR=1.39, 95% CI: 1.15 to 1.69, P=0.0007; Arg/Arg genotype: OR=0.86, 95% CI: 0.74 to 1.00, P=0.05). Interestingly, in the meta-analysis of the controls from the population-based studies, we found that TP53 codon 72 Pro/Arg gene polymorphism was associated with colorectal cancer risk (Pro allele: OR=1.33, 95% CI: 1.15 to 1.55, P=0.0002; Pro/Pro genotype: OR=1.61, 95% CI: 1.28 to 2.02, P<0.0001; Arg/Arg genotype: OR=0.77, 95% CI: 0.63 to 0.93, P=0.009). Conclusion TP53 was associated with colorectal cancer, but the different value levels of mRNA expression were not associated with survival rate of colon and rectal cancer. TP53 Pro allele and Pro/Pro genotype were associated with colorectal cancer risk in Asians. PMID:29872345
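The reported odds ratios, confidence intervals, and P values can be related to each other with a few lines of arithmetic. The sketch below assumes a normal approximation on the log odds ratio scale and shows inverse-variance fixed-effect pooling; it is one common estimator and not necessarily the one RevMan applied in this study. All function names are hypothetical.

```python
import math

Z95 = 1.959964  # two-sided 95% quantile of the standard normal

def se_from_ci(lower, upper):
    """Standard error of log(OR) recovered from a reported 95% CI."""
    return (math.log(upper) - math.log(lower)) / (2 * Z95)

def two_sided_p(odds_ratio, se):
    """Two-sided normal p-value for the null hypothesis OR = 1."""
    z = abs(math.log(odds_ratio)) / se
    return math.erfc(z / math.sqrt(2))

def pool_fixed(odds_ratios, ses):
    """Inverse-variance fixed-effect pooled OR and its 95% CI."""
    weights = [1.0 / s ** 2 for s in ses]
    pooled_log = sum(w * math.log(o) for w, o in zip(weights, odds_ratios)) / sum(weights)
    pooled_se = math.sqrt(1.0 / sum(weights))
    return (math.exp(pooled_log),
            math.exp(pooled_log - Z95 * pooled_se),
            math.exp(pooled_log + Z95 * pooled_se))

# Consistency check on the reported Pro allele estimate (OR=1.20, 95% CI 1.06 to 1.35).
p_pro_allele = two_sided_p(1.20, se_from_ci(1.06, 1.35))
```

For the Pro allele estimate this gives a p-value of about 0.003, in line with the reported P=0.003. `pool_fixed` would be called with per-study odds ratios and standard errors, which the abstract does not list.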
Differentially-Expressed Pseudogenes in HIV-1 Infection
Gupta, Aditi; Brown, C. Titus; Zheng, Yong-Hui; Adami, Christoph
2015-01-01
Not all pseudogenes are transcriptionally silent as previously thought. Pseudogene transcripts, although not translated, contribute to the non-coding RNA pool of the cell that regulates the expression of other genes. Pseudogene transcripts can also directly compete with the parent gene transcripts for mRNA stability and other cell factors, modulating their expression levels. Tissue-specific and cancer-specific differential expression of these “functional” pseudogenes has been reported. To ascertain potential pseudogene:gene interactions in HIV-1 infection, we analyzed transcriptomes from infected and uninfected T-cells and found that 21 pseudogenes are differentially expressed in HIV-1 infection. This is interesting because parent genes of one-third of these differentially-expressed pseudogenes are implicated in HIV-1 life cycle, and parent genes of half of these pseudogenes are involved in different viral infections. Our bioinformatics analysis identifies candidate pseudogene:gene interactions that may be of significance in HIV-1 infection. Experimental validation of these interactions would establish that retroviruses exploit this newly-discovered layer of host gene expression regulation for their own benefit. PMID:26426037
Phenome-genome association studies of pancreatic cancer: new targets for therapy and diagnosis.
Narayanan, Ramaswamy
2015-01-01
Pancreatic cancer has a very high mortality rate and requires novel molecular targets for diagnosis and therapy. Genetic association studies over databases offer an attractive starting point for gene discovery. The National Center for Biotechnology Information (NCBI) Phenome Genome Integrator (PheGenI) tool was enriched for pancreatic cancer-associated traits. The genes associated with the trait were characterized using diverse bioinformatics tools for Genome-Wide Association (GWA), transcriptome and proteome profile and protein classes for motif and domain. Two hundred twenty-six genes were identified that had a genetic association with pancreatic cancer in the human genome. This included 25 uncharacterized open reading frames (ORFs). Bioinformatics analysis of these ORFs identified putative druggable proteins and biomarkers including enzymes, transporters and G-protein-coupled receptor signaling proteins. Secreted proteins including a neuroendocrine factor and a chemokine were identified. Five of these ORFs encompassed non-coding RNAs. The ORF protein expression was detected in numerous body fluids, such as ascites, bile, pancreatic juice, milk, plasma, serum and saliva. Transcriptome and proteome analyses showed a correlation of mRNA and protein expression for nine ORFs. Analysis of the Catalogue of Somatic Mutations in Cancer (COSMIC) database revealed a strong correlation across copy number variations and mRNA over-expression for four ORFs. Mining of the International Cancer Genome Consortium (ICGC) database identified somatic mutations in a significant number of pancreatic cancer patients' tumors for most of these ORFs. The pancreatic cancer-associated ORFs were also found to be genetically associated with other neoplasms, including leukemia, malignant melanoma, neuroblastoma and prostate carcinomas, as well as other unrelated diseases and disorders, such as Alzheimer's disease, Crohn's disease, coronary diseases, attention deficit disorder and addiction.
Based on Genome-Wide Association Studies (GWAS), copy number variations, somatic mutational status and correlation of gene expression in pancreatic tumors at the mRNA and protein level, expression specificity in normal tissues and detection in body fluids, six ORFs emerged as putative leads for pancreatic cancer. These six targets provide a basis for accelerated drug discovery and diagnostic marker development for pancreatic cancer. Copyright© 2015, International Institute of Anticancer Research (Dr. John G. Delinasios), All rights reserved.
Xie, Zu-Cheng; Li, Tian-Tian; Gan, Bin-Liang; Gao, Xiang; Gao, Li; Chen, Gang; Hu, Xiao-Hua
2018-05-01
Lung squamous cell cancer (LUSC) is a common but challenging malignancy. It is important to illuminate the molecular mechanism of LUSC. Thus, we aim to explore the molecular mechanism of miR-136-5p in relation to LUSC. We used the Cancer Genome Atlas (TCGA) database to investigate the expression of miR-136-5p in relation to LUSC. Then, we identified the possible miR-136-5p target genes through intersection of the predicted miR-136-5p target genes and LUSC upregulated genes from TCGA. Bioinformatics analysis was performed to determine the key miR-136-5p targets and pathways associated with LUSC. Finally, the expression of hub genes, correlation between miR-136-5p and hub genes, and expected significance of hub genes were evaluated via the TCGA and Genotype-Tissue Expression (GTEx) project. MiR-136-5p was significantly downregulated in LUSC patients. Glucuronidation, glucuronosyltransferase, and the retinoic acid metabolic process were the most enriched metabolic interactions in LUSC patients. Ascorbate and aldarate metabolism, pentose and glucuronate interconversions, and retinol metabolism were identified as crucial pathways. Seven hub genes (UGT1A1, UGT1A3, UGT1A6, UGT1A7, UGT1A10, SRD5A1, and ADH7) were found to be upregulated, and UGT1A1, UGT1A3, UGT1A6, UGT1A7, and ADH7 were negatively correlated with miR-136-5p. UGT1A7 and ADH7 were the most significantly involved miR-136-5p target genes, and high expression of these genes was correlated with better overall survival and disease-free survival of LUSC patients. Downregulated miR-136-5p may target UGT1A7 and ADH7 and participate in ascorbate and aldarate metabolism, pentose and glucuronate interconversions, and retinol metabolism. High expression of UGT1A7 and ADH7 may indicate better prognosis of LUSC patients. Copyright © 2018. Published by Elsevier GmbH.
Two interactive Bioinformatics courses at the Bielefeld University Bioinformatics Server.
Sczyrba, Alexander; Konermann, Susanne; Giegerich, Robert
2008-05-01
Conferences in computational biology continue to provide tutorials on classical and new methods in the field. This can be taken as an indicator that education is still a bottleneck in our field's process of becoming an established scientific discipline. Bielefeld University has been one of the early providers of bioinformatics education, both locally and via the internet. The Bielefeld Bioinformatics Server (BiBiServ) offers a variety of older and new materials. Here, we report on two online courses made available recently, one introductory and one on the advanced level: (i) SADR: Sequence Analysis with Distributed Resources (http://bibiserv.techfak.uni-bielefeld.de/sadr/) and (ii) ADP: Algebraic Dynamic Programming in Bioinformatics (http://bibiserv.techfak.uni-bielefeld.de/dpcourse/).
Ramharack, Pritika; Soliman, Mahmoud E S
2018-06-01
Originally developed for the analysis of biological sequences, bioinformatics has advanced into one of the most widely recognized domains in the scientific community. Despite this technological evolution, there is still an urgent need for nontoxic and efficient drugs. The onus now falls on the 'omics domain to meet this need by implementing bioinformatics techniques that will allow for the introduction of pioneering approaches in the rational drug design process. Here, we categorize an updated list of informatics tools and explore the capabilities of integrative bioinformatics in disease control. We believe that our review will serve as a comprehensive guide toward bioinformatics-oriented disease and drug discovery research. Copyright © 2018 Elsevier Ltd. All rights reserved.
Qian, Heying; Li, Gang; He, Qingling; Zhang, Huaguang; Xu, Anying
2016-08-15
Fluoride tolerance is an economically important trait of silkworm. Near-isogenic lines (NILs) of the dominant endurance to fluoride (Def) gene in Bombyx mori have been constructed previously. Here, we analyzed the gene expression profiles of midgut of fluoride-sensitive and fluoride-endurable individuals of Def NILs by using high-throughput Illumina sequencing technology and bioinformatics tools, and identified differentially expressed genes between these individuals. A total of 3,612,399 and 3,567,631 clean tags for the libraries of fluoride-endurable and fluoride-sensitive individuals were obtained, which corresponded to 32,933 and 43,976 distinct clean tags, respectively. Analysis of differentially expressed genes indicates that 241 genes are differentially expressed between the two libraries. Among the 241 genes, 30 are up-regulated and 211 are down-regulated in fluoride-endurable individuals. Pathway enrichment analysis demonstrates that genes related to ribosomes, pancreatic secretion, steroid biosynthesis, glutathione metabolism, and glycerolipid metabolism are down-regulated in fluoride-endurable individuals. qRT-PCR was conducted to confirm the results of the DGE. The present study analyzed differential expression of related genes and examined whether the crucial genes were related to fluoride detoxification, which might elucidate the effect of fluoride and provide a new direction for fluorosis research. Copyright © 2016 Elsevier B.V. All rights reserved.
An overview of the Hadoop/MapReduce/HBase framework and its current applications in bioinformatics
DOE Office of Scientific and Technical Information (OSTI.GOV)
Taylor, Ronald C.
Bioinformatics researchers are increasingly confronted with analysis of ultra large-scale data sets, a problem that will only increase at an alarming rate in coming years. Recent developments in open source software, that is, the Hadoop project and associated software, provide a foundation for scaling to petabyte scale data warehouses on Linux clusters, providing fault-tolerant parallelized analysis on such data using a programming style named MapReduce. An overview is given of the current usage within the bioinformatics community of Hadoop, a top-level Apache Software Foundation project, and of associated open source software projects. The concepts behind Hadoop and the associated HBase project are defined, and current bioinformatics software that employs Hadoop is described. The focus is on next-generation sequencing, as the leading application area to date.
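The MapReduce programming style mentioned in this abstract can be illustrated with a minimal single-process sketch. It is not Hadoop code and uses no Hadoop API; the function names (map_kmers, shuffle, reduce_counts, count_kmers) and the k-mer counting task are assumptions chosen only to show the map, shuffle, and reduce phases that a framework such as Hadoop would distribute across a cluster.

```python
from collections import defaultdict
from itertools import chain


def map_kmers(read, k=3):
    """Map step: emit a (k-mer, 1) pair for every k-mer in one read."""
    return [(read[i:i + k], 1) for i in range(len(read) - k + 1)]


def shuffle(pairs):
    """Shuffle step: group all emitted values by their key."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_counts(groups):
    """Reduce step: sum the values collected for each key."""
    return {key: sum(values) for key, values in groups.items()}


def count_kmers(reads, k=3):
    """Run map, shuffle, and reduce in sequence on a list of reads."""
    mapped = chain.from_iterable(map_kmers(read, k) for read in reads)
    return reduce_counts(shuffle(mapped))


# Toy reads (hypothetical, for illustration only).
counts = count_kmers(["ACGTAC", "CGTACG"], k=3)
print(counts)
```

Because each map call looks at one read and each reduce call looks at one key, both phases can in principle run in parallel on different machines, which is the property the abstract refers to.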
Isoform-level gene expression patterns in single-cell RNA-sequencing data.
Vu, Trung Nghia; Wills, Quin F; Kalari, Krishna R; Niu, Nifang; Wang, Liewei; Pawitan, Yudi; Rantalainen, Mattias
2018-02-27
RNA sequencing of single cells enables characterization of transcriptional heterogeneity in seemingly homogeneous cell populations. Single-cell sequencing has been applied in a wide range of research fields. However, few studies have focused on characterization of isoform-level expression patterns at the single-cell level. In this study we propose and apply a novel method, ISOform-Patterns (ISOP), based on mixture modeling, to characterize the expression patterns of isoform pairs from the same gene in single-cell isoform-level expression data. We define six principal patterns of isoform expression relationships and describe a method for differential-pattern analysis. We demonstrate ISOP through analysis of single-cell RNA-sequencing data from a breast cancer cell line, with replication in three independent datasets. We assigned the pattern types to each of 16,562 isoform-pairs from 4,929 genes. Among those, 26% of the discovered patterns were significant (p<0.05), while the remaining patterns are possibly effects of transcriptional bursting, drop-out and stochastic biological heterogeneity. Furthermore, 32% of genes discovered through differential-pattern analysis were not detected by differential-expression analysis. The effect of drop-out events, mean expression level, and properties of the expression distribution on the performance of ISOP was also investigated through simulated datasets. To conclude, ISOP provides a novel approach for characterization of isoform-level preference, commitment and heterogeneity in single-cell RNA-sequencing data. The ISOP method has been implemented as an R package and is available at https://github.com/nghiavtr/ISOP under a GPL-3 license. Supplementary data are available at Bioinformatics online.
Genome-wide meta-analysis identifies five new susceptibility loci for cutaneous malignant melanoma.
Law, Matthew H; Bishop, D Timothy; Lee, Jeffrey E; Brossard, Myriam; Martin, Nicholas G; Moses, Eric K; Song, Fengju; Barrett, Jennifer H; Kumar, Rajiv; Easton, Douglas F; Pharoah, Paul D P; Swerdlow, Anthony J; Kypreou, Katerina P; Taylor, John C; Harland, Mark; Randerson-Moor, Juliette; Akslen, Lars A; Andresen, Per A; Avril, Marie-Françoise; Azizi, Esther; Scarrà, Giovanna Bianchi; Brown, Kevin M; Dębniak, Tadeusz; Duffy, David L; Elder, David E; Fang, Shenying; Friedman, Eitan; Galan, Pilar; Ghiorzo, Paola; Gillanders, Elizabeth M; Goldstein, Alisa M; Gruis, Nelleke A; Hansson, Johan; Helsing, Per; Hočevar, Marko; Höiom, Veronica; Ingvar, Christian; Kanetsky, Peter A; Chen, Wei V; Landi, Maria Teresa; Lang, Julie; Lathrop, G Mark; Lubiński, Jan; Mackie, Rona M; Mann, Graham J; Molven, Anders; Montgomery, Grant W; Novaković, Srdjan; Olsson, Håkan; Puig, Susana; Puig-Butille, Joan Anton; Qureshi, Abrar A; Radford-Smith, Graham L; van der Stoep, Nienke; van Doorn, Remco; Whiteman, David C; Craig, Jamie E; Schadendorf, Dirk; Simms, Lisa A; Burdon, Kathryn P; Nyholt, Dale R; Pooley, Karen A; Orr, Nick; Stratigos, Alexander J; Cust, Anne E; Ward, Sarah V; Hayward, Nicholas K; Han, Jiali; Schulze, Hans-Joachim; Dunning, Alison M; Bishop, Julia A Newton; Demenais, Florence; Amos, Christopher I; MacGregor, Stuart; Iles, Mark M
2015-09-01
Thirteen common susceptibility loci have been reproducibly associated with cutaneous malignant melanoma (CMM). We report the results of an international 2-stage meta-analysis of CMM genome-wide association studies (GWAS). This meta-analysis combines 11 GWAS (5 previously unpublished) and a further three stage 2 data sets, totaling 15,990 CMM cases and 26,409 controls. Five loci not previously associated with CMM risk reached genome-wide significance (P < 5 × 10^-8), as did 2 previously reported but unreplicated loci and all 13 established loci. Newly associated SNPs fall within putative melanocyte regulatory elements, and bioinformatic and expression quantitative trait locus (eQTL) data highlight candidate genes in the associated regions, including one involved in telomere biology.
Genome-wide meta-analysis identifies five new susceptibility loci for cutaneous malignant melanoma
Law, Matthew H.; Bishop, D. Timothy; Martin, Nicholas G.; Moses, Eric K.; Song, Fengju; Barrett, Jennifer H.; Kumar, Rajiv; Easton, Douglas F.; Pharoah, Paul D. P.; Swerdlow, Anthony J.; Kypreou, Katerina P.; Taylor, John C.; Harland, Mark; Randerson-Moor, Juliette; Akslen, Lars A.; Andresen, Per A.; Avril, Marie-Françoise; Azizi, Esther; Scarrà, Giovanna Bianchi; Brown, Kevin M.; Dębniak, Tadeusz; Duffy, David L.; Elder, David E.; Fang, Shenying; Friedman, Eitan; Galan, Pilar; Ghiorzo, Paola; Gillanders, Elizabeth M.; Goldstein, Alisa M.; Gruis, Nelleke A.; Hansson, Johan; Helsing, Per; Hočevar, Marko; Höiom, Veronica; Ingvar, Christian; Kanetsky, Peter A.; Chen, Wei V.; Landi, Maria Teresa; Lang, Julie; Lathrop, G. Mark; Lubiński, Jan; Mackie, Rona M.; Mann, Graham J.; Molven, Anders; Montgomery, Grant W.; Novaković, Srdjan; Olsson, Håkan; Puig, Susana; Puig-Butille, Joan Anton; Qureshi, Abrar A.; Radford-Smith, Graham L.; van der Stoep, Nienke; van Doorn, Remco; Whiteman, David C.; Craig, Jamie E.; Schadendorf, Dirk; Simms, Lisa A.; Burdon, Kathryn P.; Nyholt, Dale R.; Pooley, Karen A.; Orr, Nick; Stratigos, Alexander J.; Cust, Anne E.; Ward, Sarah V.; Hayward, Nicholas K.; Han, Jiali; Schulze, Hans-Joachim; Dunning, Alison M.; Bishop, Julia A. Newton; MacGregor, Stuart; Iles, Mark M.
2015-01-01
Thirteen common susceptibility loci have been reproducibly associated with cutaneous malignant melanoma (CMM). We report the results of an international 2-stage meta-analysis of CMM genome-wide association studies (GWAS). This meta-analysis combines 11 GWAS (5 previously unpublished) and a further three stage 2 data sets, totaling 15,990 CMM cases and 26,409 controls. Five loci not previously associated with CMM risk reached genome-wide significance (P < 5 × 10^-8), as did two previously reported but unreplicated loci and all thirteen established loci. Novel SNPs fall within putative melanocyte regulatory elements, and bioinformatic and expression quantitative trait locus (eQTL) data highlight candidate genes including one involved in telomere biology. PMID:26237428
Cyclin D1 and Ewing's sarcoma/PNET: A microarray analysis.
Fagone, Paolo; Nicoletti, Ferdinando; Salvatorelli, Lucia; Musumeci, Giuseppe; Magro, Gaetano
2015-10-01
Recent immunohistochemical analyses have shown that cyclin D1 is expressed in soft tissue Ewing's sarcoma/peripheral neuroectodermal tumor (PNET) of childhood and adolescence, while it is undetectable in both embryonal and alveolar rhabdomyosarcoma. In the present paper, microarray analysis provided evidence of a significant upregulation of cyclin D1 in Ewing's sarcoma as compared to normal tissues. In addition, we confirmed our previous findings of a significant over-expression of cyclin D1 in Ewing's sarcoma as compared to rhabdomyosarcoma. Bioinformatic analysis also identified other genes, strongly correlated with cyclin D1, which, although not previously studied in pediatric tumors, could represent novel markers for the diagnosis and prognosis of Ewing's sarcoma/PNET. The data provided here support not only the use of cyclin D1 as a diagnostic marker of Ewing's sarcoma/PNET but also the possibility of using drugs targeting cyclin D1 as potential therapeutic strategies. Copyright © 2015 Elsevier GmbH. All rights reserved.
Use of a bovine genome chip to identify new biological pathways for beef quality in cattle.
Guifen, Liu; Xiaomu, Liu; Fachun, Wan; Xiuwen, Tan; Haijian, Cheng; Enliang, Song
2012-12-01
The accumulation of muscle is largely influenced by the genetic background of cattle. Muscle tissue was collected from the longissimus muscle of Lilu beef cattle at 12, 18, 24 and 30 months old. Using meat quality analysis, we found that the Lilu beef cattle have good production and slaughter performance that meets the criteria for beef cattle. Microarray analysis identified a total of 4,219 genes that are differentially expressed (P ≤ 0.01) between consecutive age groups of cattle (12 vs 18; 18 vs 24; 24 vs 30). Bioinformatics analysis suggested that most of the differentially expressed genes are involved in metabolic pathways and neuroactive ligand-receptor interaction pathways. In future studies aimed at identifying genes related to growth and meat quality, we will focus on the genes that vary significantly between groups and are involved in these two pathways.
USDA-ARS's Scientific Manuscript database
The JAK signal transducer and STAT signaling pathway is an important regulator of cell proliferation, differentiation, survival, motility, apoptosis, immune response, and development. In this study, we used RNA-Sequencing, qRT-PCR, and bioinformatics tools to investigate the differential expression ...
The MIGenAS integrated bioinformatics toolkit for web-based sequence analysis
Rampp, Markus; Soddemann, Thomas; Lederer, Hermann
2006-01-01
We describe a versatile and extensible integrated bioinformatics toolkit for the analysis of biological sequences over the Internet. The web portal offers convenient interactive access to a growing pool of chainable bioinformatics software tools and databases that are centrally installed and maintained by the RZG. Currently, supported tasks comprise sequence similarity searches in public or user-supplied databases, computation and validation of multiple sequence alignments, phylogenetic analysis and protein–structure prediction. Individual tools can be seamlessly chained into pipelines allowing the user to conveniently process complex workflows without the necessity to take care of any format conversions or tedious parsing of intermediate results. The toolkit is part of the Max-Planck Integrated Gene Analysis System (MIGenAS) of the Max Planck Society available at (click ‘Start Toolkit’). PMID:16844980
Sehgal, Vasudha; Seviour, Elena G; Moss, Tyler J; Mills, Gordon B; Azencott, Robert; Ram, Prahlad T
2015-01-01
MicroRNAs (miRNAs) play a crucial role in the maintenance of cellular homeostasis by regulating the expression of their target genes. As such, the dysregulation of miRNA expression has been frequently linked to cancer. With rapidly accumulating molecular data linked to patient outcome, the need for identification of robust multi-omic molecular markers is critical in order to provide clinical impact. While previous bioinformatic tools have been developed to identify potential biomarkers in cancer, these methods do not allow for rapid classification of oncogenes versus tumor suppressors taking into account robust differential expression, cutoffs, p-values and non-normality of the data. Here, we propose a methodology, Robust Selection Algorithm (RSA) that addresses these important problems in big data omics analysis. The robustness of the survival analysis is ensured by identification of optimal cutoff values of omics expression, strengthened by p-value computed through intensive random resampling taking into account any non-normality in the data and integration into multi-omic functional networks. Here we have analyzed pan-cancer miRNA patient data to identify functional pathways involved in cancer progression that are associated with selected miRNA identified by RSA. Our approach demonstrates the way in which existing survival analysis techniques can be integrated with a functional network analysis framework to efficiently identify promising biomarkers and novel therapeutic candidates across diseases.
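The idea of choosing an expression cutoff and then checking it by random resampling can be sketched as follows. This is not the authors' RSA implementation: as a simplifying assumption it scores a cutoff by the absolute difference in mean outcome between the low and high groups instead of a survival statistic, it ignores censoring, and the names best_cutoff and permutation_pvalue, the min_group parameter, and the toy data are all hypothetical.

```python
import random


def best_cutoff(expression, outcome, min_group=2):
    """Scan candidate cutoffs and return (cutoff, separation) for the best one.

    Separation is the absolute difference in mean outcome between samples
    below the cutoff and samples at or above it (a stand-in statistic).
    """
    best = (None, 0.0)
    for cutoff in sorted(set(expression)):
        low = [o for e, o in zip(expression, outcome) if e < cutoff]
        high = [o for e, o in zip(expression, outcome) if e >= cutoff]
        if len(low) < min_group or len(high) < min_group:
            continue  # skip cutoffs that leave a group too small
        diff = abs(sum(high) / len(high) - sum(low) / len(low))
        if diff > best[1]:
            best = (cutoff, diff)
    return best


def permutation_pvalue(expression, outcome, n_perm=1000, seed=0):
    """Estimate how often shuffled outcomes separate as well as the real ones.

    The cutoff scan is repeated inside every permutation, so the p-value
    accounts for having picked the best cutoff rather than a fixed one.
    """
    rng = random.Random(seed)
    _, observed = best_cutoff(expression, outcome)
    shuffled = list(outcome)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        _, stat = best_cutoff(expression, shuffled)
        if stat >= observed:
            hits += 1
    return (hits + 1) / (n_perm + 1)


# Toy data (hypothetical): expression levels and an outcome such as months survived.
expression = [1, 2, 3, 4, 5, 6, 7, 8]
outcome = [10, 11, 9, 10, 3, 2, 4, 3]
cutoff, separation = best_cutoff(expression, outcome)
p_value = permutation_pvalue(expression, outcome)
print(cutoff, separation, p_value)
```

Resampling makes no normality assumption about the data, which is the motivation the abstract gives for computing p-values this way.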
Camicia, Federico; Paredes, Rodolfo; Chalar, Cora; Galanti, Norbel; Kamenetzky, Laura; Gutierrez, Ariana; Rosenzvit, Mara C
2008-03-31
We have sequenced and partially characterized an Echinococcus granulosus cDNA, termed egat1, from a protoscolex signal sequence trap (SST) cDNA library. The isolated 1627 bp long cDNA contains an ORF of 489 amino acids and shows an amino acid identity of 30% with neutral and excitatory amino acid transporters members of the Dicarboxylate/Amino Acid Na+ and/or H+ Cation Symporter family (DAACS) (TC 2.A.23). Additional bioinformatics analysis of EgAT1, confirmed the results obtained by similarity searches and showed the presence of 9 to 10 transmembrane domains, consensus sequences for N-glycosylation between the third and fourth transmembrane domain, a highly similar hydropathy profile with ASCT1 (a known member of DAACS family), high score with SDF (Sodium Dicarboxilate Family) and similar motifs with EDTRANSPORT, a fingerprint of excitatory amino acid transporters. The localization of the putative amino acid transporter was analyzed by in situ hybridization and immunofluorescence in protoscoleces and associated germinal layer. The in situ hybridization labelling indicates the distribution of egat1 mRNA throughout the tegument. EgAT1 protein, which showed in Western blots a molecular mass of approximately 60 kD, is localized in the subtegumental region of the metacestode, particularly around suckers and rostellum of protoscoleces and layers from brood capsules. The sequence and expression analyses of EgAT1 pave the way for functional analysis of amino acids transporters of E. granulosus and its evaluation as new drug targets against cystic echinococcosis.
Screening key candidate genes and pathways involved in insulinoma by microarray analysis.
Zhou, Wuhua; Gong, Li; Li, Xuefeng; Wan, Yunyan; Wang, Xiangfei; Li, Huili; Jiang, Bin
2018-06-01
Insulinoma is a rare type of tumor and its genetic features remain largely unknown. This study aimed to search for potential key genes and relevant enriched pathways of insulinoma. The gene expression data from GSE73338 were downloaded from the Gene Expression Omnibus database. Differentially expressed genes (DEGs) were identified between insulinoma tissues and normal pancreas tissues, followed by pathway enrichment analysis, protein-protein interaction (PPI) network construction, and module analysis. The expression of candidate key genes was validated by quantitative real-time polymerase chain reaction (RT-PCR) in insulinoma tissues. A total of 1632 DEGs were obtained, including 1117 upregulated genes and 514 downregulated genes. Pathway enrichment results showed that upregulated DEGs were significantly implicated in insulin secretion, and downregulated DEGs were mainly enriched in pancreatic secretion. PPI network analysis revealed 7 hub genes with degrees more than 10, including GCG (glucagon), GCGR (glucagon receptor), PLCB1 (phospholipase C, beta 1), CASR (calcium sensing receptor), F2R (coagulation factor II thrombin receptor), GRM1 (glutamate metabotropic receptor 1), and GRM5 (glutamate metabotropic receptor 5). DEGs involved in the significant modules were enriched in calcium signaling pathway, protein ubiquitination, and platelet degranulation. Quantitative RT-PCR data confirmed that the expression trends of these hub genes were similar to the results of bioinformatic analysis. The present study demonstrated that candidate DEGs and enriched pathways were potentially critical molecular events involved in the development of insulinoma, and these findings are useful for a better understanding of insulinoma genesis.
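Selecting hub genes by node degree, as described in this abstract, amounts to counting the distinct interaction partners of each node and keeping those above a threshold. The sketch below assumes an undirected edge list; the function names (node_degrees, hub_nodes), the placeholder node labels, and the toy threshold are hypothetical, and the study itself reports a threshold of degree greater than 10.

```python
from collections import Counter


def node_degrees(edges):
    """Count distinct neighbours per node in an undirected edge list.

    Duplicate edges (in either orientation) and self-loops are ignored.
    """
    unique = {frozenset((u, v)) for u, v in edges if u != v}
    degrees = Counter()
    for edge in unique:
        for node in edge:
            degrees[node] += 1
    return degrees


def hub_nodes(edges, min_degree):
    """Return, in sorted order, nodes whose degree is strictly above min_degree."""
    degrees = node_degrees(edges)
    return sorted(node for node, deg in degrees.items() if deg > min_degree)


# Toy network with placeholder labels (not real interaction data).
toy_edges = [("A", "B"), ("A", "C"), ("A", "D"), ("B", "C"), ("C", "A")]
hubs = hub_nodes(toy_edges, min_degree=2)
print(hubs)
```

For a real PPI network, hub_nodes(edges, min_degree=10) would mirror the "degrees more than 10" criterion stated in the abstract.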
Analysis of miRNA expression profiles in melatonin-exposed GC-1 spg cell line.
Zhu, Xiaoling; Chen, Shuxiong; Jiang, Yanwen; Xu, Ying; Zhao, Yun; Chen, Lu; Li, Chunjin; Zhou, Xu
2018-02-05
Melatonin is an endocrine neurohormone secreted by pinealocytes in the pineal gland. It exerts diverse physiological effects, acting for example as a circadian rhythm regulator and an antioxidant. However, the functional importance of melatonin in the regulation of spermatogenesis remains unclear. The objectives of this study were to: (1) detect the effect of melatonin on miRNA expression profiles in GC-1 spg cells by miRNA deep sequencing (DeepSeq) and (2) define melatonin-affected miRNA-mRNA interactions and associated biological processes using bioinformatics analysis. GC-1 spg cells were cultured with melatonin (10^-7 M) for 24 h. DeepSeq data were validated using quantitative real-time reverse transcription polymerase chain reaction (qRT-PCR) analysis. A total of 176 miRNAs were found to be significantly differentially expressed between the two groups (fold change of >2 or <0.5 and FDR<0.05). Among these miRNAs, 171 were up-regulated and 5 were down-regulated. Ontology analysis of the biological processes of the predicted targets indicated a variety of biological functions. Pathway analysis indicated that the predicted targets were involved in cancers, apoptosis and signaling pathways, such as VEGF, TNF, Ras and Notch. The results indicated that melatonin could regulate the expression of miRNAs to exert its physiological effects in GC-1 spg cells. These results should be useful for investigating the biological function of miRNAs regulated by melatonin in spermatogenesis and testicular germ cell tumors. Copyright © 2017 Elsevier B.V. All rights reserved.
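The selection rule quoted in this abstract (fold change >2 or <0.5 together with FDR<0.05) can be written as a short filter. The abstract does not say how the FDR was computed; the sketch below assumes the Benjamini-Hochberg adjustment as one common choice, and the function names, record layout, and miRNA labels are hypothetical.

```python
def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so each adjusted value is the
    # minimum of p * m / rank over this rank and all larger ranks.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, pvalues[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


def select_differential(records, fc_up=2.0, fc_down=0.5, fdr_cutoff=0.05):
    """Split (name, fold_change, p_value) records into up- and down-regulated."""
    fdr = benjamini_hochberg([p for _, _, p in records])
    up, down = [], []
    for (name, fold_change, _), q in zip(records, fdr):
        if q >= fdr_cutoff:
            continue  # not significant after multiple-testing adjustment
        if fold_change > fc_up:
            up.append(name)
        elif fold_change < fc_down:
            down.append(name)
    return up, down


# Toy records (hypothetical names and values): (miRNA, fold change, raw p-value).
records = [
    ("mir_a", 3.0, 0.001),
    ("mir_b", 0.4, 0.002),
    ("mir_c", 1.5, 0.03),
    ("mir_d", 2.5, 0.5),
]
up, down = select_differential(records)
print(up, down)
```

Here mir_c passes the FDR cutoff but not the fold-change cutoff, and mir_d passes the fold-change cutoff but not the FDR cutoff, so neither is reported.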
Workflows in bioinformatics: meta-analysis and prototype implementation of a workflow generator.
Garcia Castro, Alexander; Thoraval, Samuel; Garcia, Leyla J; Ragan, Mark A
2005-04-07
Computational methods for problem solving need to interleave information access and algorithm execution in a problem-specific workflow. The structures of these workflows are defined by a scaffold of syntactic, semantic and algebraic objects capable of representing them. Despite the proliferation of GUIs (Graphic User Interfaces) in bioinformatics, only some of them provide workflow capabilities; surprisingly, no meta-analysis of workflow operators and components in bioinformatics has been reported. We present a set of syntactic components and algebraic operators capable of representing analytical workflows in bioinformatics. Iteration, recursion, the use of conditional statements, and management of suspend/resume tasks have traditionally been implemented on an ad hoc basis and hard-coded; by having these operators properly defined it is possible to use and parameterize them as generic re-usable components. To illustrate how these operations can be orchestrated, we present GPIPE, a prototype graphic pipeline generator for PISE that allows the definition of a pipeline, parameterization of its component methods, and storage of metadata in XML formats. This implementation goes beyond the macro capacities currently in PISE. As the entire analysis protocol is defined in XML, a complete bioinformatic experiment (linked sets of methods, parameters and results) can be reproduced or shared among users. http://if-web1.imb.uq.edu.au/Pise/5.a/gpipe.html (interactive), ftp://ftp.pasteur.fr/pub/GenSoft/unix/misc/Pise/ (download). From our meta-analysis we have identified syntactic structures and algebraic operators common to many workflows in bioinformatics. The workflow components and algebraic operators can be assimilated into re-usable software components. GPIPE, a prototype implementation of this framework, provides a GUI builder to facilitate the generation of workflows and integration of heterogeneous analytical tools.
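The point that iteration and conditional statements can be defined once as generic, re-usable workflow operators, rather than hard-coded per pipeline, can be sketched with higher-order functions. This is not GPIPE or PISE code; the operator names (sequence, conditional, iterate) and the toy sequence-cleaning steps are assumptions made only for illustration.

```python
def sequence(*steps):
    """Operator: run steps one after another, feeding each output forward."""
    def run(data):
        for step in steps:
            data = step(data)
        return data
    return run


def conditional(predicate, if_true, if_false):
    """Operator: choose a branch based on a test of the current data."""
    def run(data):
        return if_true(data) if predicate(data) else if_false(data)
    return run


def iterate(step, until, max_rounds=100):
    """Operator: repeat a step until a stopping condition holds."""
    def run(data):
        for _ in range(max_rounds):
            if until(data):
                break
            data = step(data)
        return data
    return run


# Toy component methods operating on a list of sequence strings.
def strip_gaps(seqs):
    return [s.replace("-", "") for s in seqs]


def to_upper(seqs):
    return [s.upper() for s in seqs]


def identity(seqs):
    return seqs


def trim_long(seqs):
    """Remove the last character from every sequence longer than 4."""
    return [s[:-1] if len(s) > 4 else s for s in seqs]


workflow = sequence(
    strip_gaps,
    conditional(lambda seqs: any(s != s.upper() for s in seqs), to_upper, identity),
    iterate(trim_long, until=lambda seqs: all(len(s) <= 4 for s in seqs)),
)
result = workflow(["ac-gt", "ACGTAC"])
print(result)
```

Because the operators take the component methods as parameters, the same three definitions can be reused to assemble unrelated workflows, which is the kind of parameterization the abstract argues for.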
Huang, Jianyan; Zhao, Xiaobo; Weng, Xiaoyu; Wang, Lei; Xie, Weibo
2012-01-01
Background: The B-box (BBX)-containing proteins are a class of zinc finger proteins that contain one or two B-box domains and play important roles in plant growth and development. The Arabidopsis BBX gene family has recently been re-identified and renamed. However, there has not been a genome-wide survey of the rice BBX (OsBBX) gene family until now. Methodology/Principal Findings: In this study, we identified 30 rice BBX genes through a comprehensive bioinformatics analysis. Each gene was assigned a uniform nomenclature. We described the chromosome localizations, gene structures, protein domains, phylogenetic relationship, whole life-cycle expression profile and diurnal expression patterns of the OsBBX family members. Based on the phylogeny and domain constitution, the OsBBX gene family was classified into five subfamilies. The gene duplication analysis revealed that only chromosomal segmental duplication contributed to the expansion of the OsBBX gene family. The expression profile of the OsBBX genes was analyzed by Affymetrix GeneChip microarrays throughout the entire life-cycle of rice cultivar Zhenshan 97 (ZS97). In addition, microarray analysis was performed to obtain the expression patterns of these genes under light/dark conditions and after three phytohormone treatments. This analysis revealed that the expression patterns of the OsBBX genes could be classified into eight groups. Eight genes were regulated under the light/dark treatments, and eleven genes showed differential expression under at least one phytohormone treatment. Moreover, we verified the diurnal expression of the OsBBX genes using the data obtained from the Diurnal Project and qPCR analysis, and the results indicated that many of these genes had a diurnal expression pattern. Conclusions/Significance: The combination of the genome-wide identification and the expression and diurnal analysis of the OsBBX gene family should facilitate additional functional studies of the OsBBX genes. PMID:23118960
Nohata, Nijiro; Abba, Martin C; Gutkind, J Silvio
2016-08-01
The role of long non-coding RNA (lncRNA) expression in human head and neck squamous cell carcinoma (HNSCC) is still poorly understood. In this study, we aimed at establishing the onco-lncRNAome profiling of HNSCC and to identify lncRNAs correlating with prognosis and patient survival. The Atlas of Noncoding RNAs in Cancer (TANRIC) database was employed to retrieve the lncRNA expression information generated from The Cancer Genome Atlas (TCGA) HNSCC RNA-sequencing data. RNA-sequencing data from HNSCC cell lines were also considered for this study. Bioinformatics approaches, such as differential gene expression analysis, survival analysis, principal component analysis, and Co-LncRNA enrichment analysis were performed. Using TCGA HNSCC RNA-sequencing data from 426 HNSCC and 42 adjacent normal tissues, we found 728 lncRNA transcripts significantly and differentially expressed in HNSCC. Among the 728 lncRNAs, 55 lncRNAs were significantly associated with poor prognosis, such as overall survival and/or disease-free survival. Next, we found 140 lncRNA transcripts significantly and differentially expressed between Human Papilloma Virus (HPV) positive tumors and HPV negative tumors. Thirty lncRNA transcripts were differentially expressed between TP53 mutated and TP53 wild type tumors. Co-LncRNA analysis suggested that protein-coding genes that are co-expressed with these deregulated lncRNAs might be involved in cancer associated molecular events. With consideration of differential expression of lncRNAs in a HNSCC cell lines panel (n=22), we found several lncRNAs that may represent potential targets for diagnosis, therapy and prevention of HNSCC. LncRNAs profiling could provide novel insights into the potential mechanisms of HNSCC oncogenesis. Copyright © 2016 Elsevier Ltd. All rights reserved.
Nohata, Nijiro; Abba, Martin C.; Gutkind, J. Silvio
2017-01-01
Objectives: The role of long non-coding RNA (lncRNA) expression in human head and neck squamous cell carcinoma (HNSCC) is still poorly understood. In this study, we aimed at establishing the onco-lncRNAome profiling of HNSCC and to identify lncRNAs correlating with prognosis and patient survival. Materials and Methods: The Atlas of Noncoding RNAs in Cancer (TANRIC) database was employed to retrieve the lncRNA expression information generated from The Cancer Genome Atlas (TCGA) HNSCC RNA-sequencing data. RNA-sequencing data from HNSCC cell lines were also considered for this study. Bioinformatics approaches, such as differential gene expression analysis, survival analysis, principal component analysis, and Co-LncRNA enrichment analysis were performed. Results: Using TCGA HNSCC RNA-sequencing data from 426 HNSCC and 42 adjacent normal tissues, we found 728 lncRNA transcripts significantly and differentially expressed in HNSCC. Among the 728 lncRNAs, 55 lncRNAs were significantly associated with poor prognosis, such as overall survival and/or disease-free survival. Next, we found 140 lncRNA transcripts significantly and differentially expressed between Human Papilloma Virus (HPV) positive tumors and HPV negative tumors. Thirty lncRNA transcripts were differentially expressed between TP53 mutated and TP53 wild type tumors. Co-LncRNA analysis suggested that protein-coding genes that are co-expressed with these deregulated lncRNAs might be involved in cancer associated molecular events. With consideration of differential expression of lncRNAs in a HNSCC cell lines panel (n=22), we found several lncRNAs that may represent potential targets for diagnosis, therapy and prevention of HNSCC. Conclusion: LncRNAs profiling could provide novel insights into the potential mechanisms of HNSCC oncogenesis. PMID:27424183
Whole transcriptome profiling of taste bud cells.
Sukumaran, Sunil K; Lewandowski, Brian C; Qin, Yumei; Kotha, Ramana; Bachmanov, Alexander A; Margolskee, Robert F
2017-08-08
Analysis of single-cell RNA-Seq data can provide insights into the specific functions of individual cell types that compose complex tissues. Here, we examined gene expression in two distinct subpopulations of mouse taste cells: Tas1r3-expressing type II cells and physiologically identified type III cells. Our RNA-Seq libraries met high quality control standards and accurately captured differential expression of marker genes for type II (e.g. the Tas1r genes, Plcb2, Trpm5) and type III (e.g. Pkd2l1, Ncam, Snap25) taste cells. Bioinformatics analysis showed that genes regulating responses to stimuli were up-regulated in type II cells, while pathways related to neuronal function were up-regulated in type III cells. We also identified highly expressed genes and pathways associated with chemotaxis and axon guidance, providing new insights into the mechanisms underlying integration of new taste cells into the taste bud. We validated our results by immunohistochemically confirming expression of selected genes encoding synaptic (Cplx2 and Pclo) and semaphorin signalling pathway (Crmp2, PlexinB1, Fes and Sema4a) components. The approach described here could provide a comprehensive map of gene expression for all taste cell subpopulations and will be particularly relevant for cell types in taste buds and other tissues that can be identified only by physiological methods.
Xu, Zhongwei; Jin, Xiaohan; Cai, Wei; Zhou, Maobin; Shao, Ping; Yang, Zhen; Fu, Rong; Cao, Jin; Liu, Yan; Yu, Fang; Fan, Rong; Zhang, Yan; Zou, Shuang; Zhou, Xin; Yang, Ning; Chen, Xu; Li, Yuming
2018-04-20
Early-onset preeclampsia (EOS-PE) refers to preeclampsia that occurs before 34 weeks of gestation. This study was conducted to explore the relationship between mitochondrial dysfunction and the pathogenesis of EOS-PE using a proteomic strategy. To identify differentially expressed mitochondrial proteins between severe EOS-PE and healthy pregnancies, mitochondrial enrichment coupled with an iTRAQ-based quantitative proteomic method was performed. Immunohistochemistry (IHC) and western blotting were performed to detect changes in the differentially expressed proteins and to confirm the accuracy of the proteomic results. A total of 1372 proteins were quantified and 132 differentially expressed proteins were identified, including 86 downregulated and 46 upregulated proteins (p < 0.05). Bioinformatics analysis showed that the differentially expressed proteins participate in numerous biological processes, including the oxidation-reduction process, the respiratory electron transport chain, and oxidative phosphorylation. In particular, the mitochondria-related molecules PRDX2, PARK7, BNIP3, BCL2, PDHA1, SUCLG1, ACADM, and NDUFV1 are involved in energy production in the mitochondrial matrix and membrane. The results show that abnormal electron transport, excessive oxidative stress, and mitochondrion disassembly might be the main causes of mitochondrial dysfunction and are related to the pathogenesis of EOS-PE. © 2018 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim.
Bioinformatic training needs at a health sciences campus.
Oliver, Jeffrey C
2017-01-01
Health sciences research is increasingly focusing on big data applications, such as genomic technologies and precision medicine, to address key issues in human health. These approaches rely on biological data repositories and bioinformatic analyses, both of which are growing rapidly in size and scope. Libraries play a key role in supporting researchers in navigating these and other information resources. With the goal of supporting bioinformatics research in the health sciences, the University of Arizona Health Sciences Library established a Bioinformation program. To shape the support provided by the library, I developed and administered a needs assessment survey to the University of Arizona Health Sciences campus in Tucson, Arizona. The survey was designed to identify the training topics of interest to health sciences researchers and the preferred modes of training. Survey respondents expressed an interest in a broad array of potential training topics, including "traditional" information seeking as well as interest in analytical training. Of particular interest were training in transcriptomic tools and the use of databases linking genotypes and phenotypes. Staff were most interested in bioinformatics training topics, while faculty were the least interested. Hands-on workshops were significantly preferred over any other mode of training. The University of Arizona Health Sciences Library is meeting those needs through internal programming and external partnerships. The results of the survey demonstrate a keen interest in a variety of bioinformatic resources; the challenge to the library is how to address those training needs. The mode of support depends largely on library staff expertise in the numerous subject-specific databases and tools. Librarian-led bioinformatic training sessions provide opportunities for engagement with researchers at multiple points of the research life cycle. 
When training needs exceed library capacity, partnering with intramural and extramural units will be crucial in library support of health sciences bioinformatic research.
Xie, Xin-Ping; Xie, Yu-Feng; Wang, Hong-Qiang
2017-08-23
Large-scale accumulation of omics data poses a pressing challenge of integrative analysis of multiple data sets in bioinformatics. An open question of such integrative analysis is how to pinpoint consistent but subtle gene activity patterns across studies. Study heterogeneity needs to be addressed carefully for this goal. This paper proposes a regulation probability model-based meta-analysis, jGRP, for identifying differentially expressed genes (DEGs). The method integrates multiple transcriptomics data sets in a gene regulatory space instead of in a gene expression space, which makes it easy to capture and manage data heterogeneity across studies from different laboratories or platforms. Specifically, we transform gene expression profiles into a unified gene regulation profile across studies by mathematically defining two gene regulation events between two conditions and estimating their probabilities of occurrence in a sample. Finally, a novel differential expression statistic is established based on the gene regulation profiles, realizing accurate and flexible identification of DEGs in gene regulation space. We evaluated the proposed method on simulation data and real-world cancer datasets and showed the effectiveness and efficiency of jGRP in identifying DEGs in the context of meta-analysis. Data heterogeneity largely influences the performance of meta-analysis for DEG identification. Existing meta-analysis methods were found to exhibit very different degrees of sensitivity to study heterogeneity. The proposed method, jGRP, can serve as a standalone tool owing to its unified framework and controllable way of dealing with study heterogeneity.
Feng, Cong; Wu, Bo; Fan, Hongxia; Li, Changfei; Meng, Songdong
2014-10-04
To investigate the mechanism by which gp96 expression is elevated during hepatitis B virus (HBV) infection and the associated pathological mechanism. The mechanism by which NF-κB activates gp96 expression was determined by bioinformatics analysis, luciferase reporter assay, real-time PCR and Western blot. The effect of gp96 over-expression by transfection and gp96 knockdown by RNA interference on hepatocyte proliferation, apoptosis and cell cycle was examined by CCK-8 and flow cytometry. The role of gp96 in hepatocellular carcinoma (HCC) development was determined by epithelial-mesenchymal transition (EMT) and colony formation assay. NF-κB significantly increased gp96 expression by binding to the NF-κB binding site. Over-expression and knockdown studies both showed that gp96 promoted hepatocyte proliferation, inhibited apoptosis, and induced G0/G1 to S phase cell cycle progression. Moreover, gp96 induced epithelial-mesenchymal transition and increased the colony formation ability of hepatocytes. Our results therefore provide insights into chronic HBV infection-induced gp96 expression, and indicate that elevated gp96 may contribute to HCC development during chronic inflammation.
Chen, Xiao-Min; Feng, Ming-Jun; Shen, Cai-Jie; He, Bin; Du, Xian-Feng; Yu, Yi-Bo; Liu, Jing; Chu, Hui-Min
2017-07-01
The present study was designed to develop a novel method for identifying significant pathways associated with human hypertrophic cardiomyopathy (HCM), based on gene co‑expression analysis. The microarray dataset associated with HCM (E‑GEOD‑36961) was obtained from the European Molecular Biology Laboratory‑European Bioinformatics Institute database. Informative pathways were selected based on the Reactome pathway database and screening treatments. An empirical Bayes method was utilized to construct co‑expression networks for informative pathways, and a weight value was assigned to each pathway. Differential pathways were extracted based on weight threshold, which was calculated using a random model. In order to assess whether the co‑expression method was feasible, it was compared with traditional pathway enrichment analysis of differentially expressed genes, which were identified using the significance analysis of microarrays package. A total of 1,074 informative pathways were screened out for subsequent investigations and their weight values were also obtained. According to the threshold of weight value of 0.01057, 447 differential pathways, including folding of actin by chaperonin containing T‑complex protein 1 (CCT)/T‑complex protein 1 ring complex (TRiC), purine ribonucleoside monophosphate biosynthesis and ubiquinol biosynthesis, were obtained. Compared with traditional pathway enrichment analysis, the number of pathways obtained from the co‑expression approach was increased. The results of the present study demonstrated that this method may be useful to predict marker pathways for HCM. The pathways of folding of actin by CCT/TRiC and purine ribonucleoside monophosphate biosynthesis may provide evidence of the underlying molecular mechanisms of HCM, and offer novel therapeutic directions for HCM.
Novel approaches for bioinformatic analysis of salivary RNA sequencing data for development.
Kaczor-Urbanowicz, Karolina Elzbieta; Kim, Yong; Li, Feng; Galeev, Timur; Kitchen, Rob R; Gerstein, Mark; Koyano, Kikuye; Jeong, Sung-Hee; Wang, Xiaoyan; Elashoff, David; Kang, So Young; Kim, Su Mi; Kim, Kyoung; Kim, Sung; Chia, David; Xiao, Xinshu; Rozowsky, Joel; Wong, David T W
2018-01-01
Analysis of RNA sequencing (RNA-Seq) data in human saliva is challenging. Lack of standardization and unification of the bioinformatic procedures undermines saliva's diagnostic potential. Thus, it motivated us to perform this study. We applied principal pipelines for bioinformatic analysis of small RNA-Seq data of saliva of 98 healthy Korean volunteers including either direct or indirect mapping of the reads to the human genome using Bowtie1. Analysis of alignments to exogenous genomes by another pipeline revealed that almost all of the reads map to bacterial genomes. Thus, salivary exRNA has fundamental properties that warrant the design of unique additional steps while performing the bioinformatic analysis. Our pipelines can serve as potential guidelines for processing of RNA-Seq data of human saliva. Processing and analysis results of the experimental data generated by the exceRpt (v4.6.3) small RNA-seq pipeline (github.gersteinlab.org/exceRpt) are available from exRNA atlas (exrna-atlas.org). Alignment to exogenous genomes and their quantification results were used in this paper for the analyses of small RNAs of exogenous origin. © The Author (2017). Published by Oxford University Press. All rights reserved.
Gu, Yifeng; Zhang, Lei; Chen, Xiaowu
2014-08-01
MicroRNAs (miRNAs) play an important role in gonadal development and differentiation in fish. However, understanding of the mechanism of this process is hindered by our poor knowledge of miRNA expression patterns in fish gonads. In this study, miRNA libraries derived from adult gonads of Paralichthys olivaceus were generated by using next-generation sequencing (NGS) technology. Bioinformatics analysis was performed to distinguish mature miRNA sequences from two classes of small RNAs represented in the sequencing data. A total of 141 mature miRNAs were identified, of which 21 were found in P. olivaceus for the first time. Variation and bias in miRNA expression were inferred from the deep sequencing reads. Some miRNAs, such as pol-miR-143, pol-miR-26a and pol-let-7a, were found at quite high expression levels in both gonads, while others exhibited a clear sex-biased expression between the two gonads. Approximately 20.0% and 13.1% of the isolated miRNAs were preferentially expressed in the testis (FC<0.5) or ovary (FC>2), respectively. The identification and preliminary analysis of the sex-biased expression of miRNAs in P. olivaceus gonads by NGS in our work provide a basic catalog of miRNAs to facilitate future improvement and exploitation of sexual regulatory mechanisms in P. olivaceus. Copyright © 2014. Published by Elsevier Inc.
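The fold-change (FC) cut-offs quoted in the abstract above (FC<0.5 for testis, FC>2 for ovary) can be illustrated with a minimal Python sketch. It assumes that FC is the ovary-to-testis expression ratio, which the abstract implies but does not state; the function name, miRNA labels and read counts are hypothetical and are not data from the study.

```python
def classify_sex_bias(fc, low=0.5, high=2.0):
    """Label a miRNA by its fold change (assumed here to be ovary / testis)."""
    if fc <= 0:
        raise ValueError("fold change must be positive")
    if fc < low:
        return "testis-biased"
    if fc > high:
        return "ovary-biased"
    return "unbiased"


# Hypothetical normalized read counts as (ovary, testis); illustration only.
example_counts = {
    "miR-A": (120.0, 900.0),
    "miR-B": (800.0, 150.0),
    "miR-C": (500.0, 450.0),
}

labels = {
    name: classify_sex_bias(ovary / testis)
    for name, (ovary, testis) in example_counts.items()
}
print(labels)
```

Keeping the two thresholds as parameters makes it easy to see how the reported proportions would shift under stricter or looser cut-offs.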
Akbarzadeh-Sharbaf, Soudabeh; Yakhchali, Bagher; Minuchehr, Zarrin; Shokrgozar, Mohammad Ali; Zeinali, Sirous
2012-01-01
Background: A novel hypothesis holds that antibodies may have specificity for two distinct antigens, termed “dual specificity”. This hypothesis was evaluated for some defined therapeutic monoclonal antibodies (mAbs) such as Trastuzumab, Pertuzumab, Bevacizumab, and Cetuximab. In silico design and construction of expression vectors for the Trastuzumab monoclonal antibody were also performed in this work. Materials and Methods: First, in bioinformatics studies, the 3D structures of the mAbs concerned were obtained from the Protein Data Bank (PDB). Three-dimensional structural alignments were performed with the SIM and MUSTANG software. AutoDock4.2 software was also used for the docking analysis. Second, suitable genes for the Trastuzumab heavy and light chains were designed, synthesized, and cloned in the prokaryotic vector. These fragments were individually PCR amplified and cloned into pcDNA™ 3.3-TOPO® and pOptiVEC™ TOPO® shuttle vectors, using standard methods. Results: First, many bioinformatics tools and software packages were applied, but we did not find any new dual specificity in the selected antibodies. In the following step, suitable expression cassettes for the heavy and light chains of the Trastuzumab therapeutic mAb were designed and constructed. Gene cloning was successfully performed and the resulting constructs were confirmed using gene mapping and sequencing. Conclusions: This study was based on a recently developed technology for mAb expression in mammalian cells. The obtained constructs could be successfully used for biosimilar recombinant mAb production in the CHO DG44 dihydrofolate reductase (DHFR) gene deficient cell line in suspension culture medium. PMID:23210080
Hernandez-Prieto, Miguel A; Futschik, Matthias E
2012-01-01
Synechocystis sp. PCC6803 is one of the best studied cyanobacteria and an important model organism for our understanding of photosynthesis. The early availability of its complete genome sequence initiated numerous transcriptome studies, which have generated a wealth of expression data. Analysis of the accumulated data can be a powerful tool to study transcription in a comprehensive manner and to reveal underlying regulatory mechanisms, as well as to annotate genes whose functions are yet unknown. However, use of divergent microarray platforms, as well as distributed data storage make meta-analyses of Synechocystis expression data highly challenging, especially for researchers with limited bioinformatic expertise and resources. To facilitate utilisation of the accumulated expression data for a wider research community, we have developed CyanoEXpress, a web database for interactive exploration and visualisation of transcriptional response patterns in Synechocystis. CyanoEXpress currently comprises expression data for 3073 genes and 178 environmental and genetic perturbations obtained in 31 independent studies. At present, CyanoEXpress constitutes the most comprehensive collection of expression data available for Synechocystis and can be freely accessed. The database is available for free at http://cyanoexpress.sysbiolab.eu.
ENFIN--A European network for integrative systems biology.
Kahlem, Pascal; Clegg, Andrew; Reisinger, Florian; Xenarios, Ioannis; Hermjakob, Henning; Orengo, Christine; Birney, Ewan
2009-11-01
Integration of biological data of various types and the development of adapted bioinformatics tools represent critical objectives to enable research at the systems level. The European Network of Excellence ENFIN is engaged in developing an adapted infrastructure to connect databases, and platforms to enable both the generation of new bioinformatics tools and the experimental validation of computational predictions. With the aim of bridging the gap existing between standard wet laboratories and bioinformatics, the ENFIN Network runs integrative research projects to bring the latest computational techniques to bear directly on questions dedicated to systems biology in the wet laboratory environment. The Network maintains internally close collaboration between experimental and computational research, enabling a permanent cycling of experimental validation and improvement of computational prediction methods. The computational work includes the development of a database infrastructure (EnCORE), bioinformatics analysis methods and a novel platform for protein function analysis FuncNet.
Beinke, C; Port, M; Ullmann, R; Gilbertz, K; Majewski, M; Abend, M
2018-06-01
Dicentric chromosome analysis (DCA) is the gold standard for individual radiation dose assessment. However, DCA is limited by the time-consuming phytohemagglutinin (PHA)-mediated lymphocyte activation. In this study using human peripheral blood lymphocytes, we investigated PHA-associated whole genome gene expression changes to elucidate this process and sought to identify suitable gene targets as a means of meeting our long-term objective of accelerating cell cycle kinetics to reduce DCA culture time. Human peripheral whole blood from three healthy donors was separately cultured in RPMI/FCS/antibiotics with BrdU and PHA-M. Diluted whole blood samples were transferred into PAXgene tubes at 0, 12, 24 and 36 h culture time. RNA was isolated and aliquots were used for whole genome gene expression screening. Microarray results were validated using qRT-PCR and differentially expressed genes [significantly (FDR corrected) twofold different from the 0 h value reference] were analyzed using several bioinformatic tools. The cell cycle positions and DNA-synthetic activities of lymphocytes were determined by analyzing the correlated total DNA content and incorporated BrdU level with flow cytometry after continued BrdU incubation. From 42,545 transcripts of the whole genome microarray 47.6%, on average, appeared expressed. The number of differentially expressed genes increased linearly from 855 to 2,858 and 4,607 at 12, 24 and 36 h after PHA addition, respectively. Approximately 2-3 times more up- than downregulated genes were observed with several hundred genes differentially expressed at each time point. Earliest enrichment was observed for gene sets related to the nucleus (12 h) followed by genes assigned to intracellular structures such as organelles (24 h) and finally genes related to the membrane and the extracellular matrix were enriched (36 h). 
Early gene expression changes at 12 h, in particular, were associated with protein classes such as chemokines/cytokines (e.g., CXCL1, CXCL2) and chaperones. Genes coding for biological processes involved in cell cycle control (e.g., MYBL2, RBL1, CCNA, CCNE) and DNA replication (e.g., POLA, POLE, MCM) appeared enriched at 24 h and later, but many more biological processes (42 altogether) showed enrichment as well. Flow cytometry data were consistent with the gene expression and bioinformatic analyses: cell cycle transition into S phase was observed with interindividual differences from 12 h onward, whereas progression into G2 as well as into the second G1 occurred from 36 h onward after activation. Gene set enrichment analysis over time identifies, in particular, two molecular categories of PHA-responsive gene targets (cytokine and cell cycle control genes). Based on that analysis, target genes for cell cycle acceleration in lymphocytes have been identified (CDKN1A/B/C, RBL-1/RBL-2, E2F2, Deaf-1), and it remains undetermined whether the time expenditure for DCA can be reduced by influencing gene expression involved in the regulatory circuits controlling PHA-associated cell cycle entry and/or progression at a specific early cell cycle phase.
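The selection rule described in the abstract above (FDR-corrected significance plus an at least twofold difference from the 0 h reference) can be sketched in Python. The abstract does not name the FDR procedure or the significance level, so the Benjamini-Hochberg adjustment and the 0.05 cut-off used here are assumptions, and all function names and input values are hypothetical.

```python
import math


def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so adjusted values stay monotone.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


def select_degs(log2_fc, pvalues, fdr=0.05, min_fold=2.0):
    """Indices of genes passing both the FDR and the fold-change filter."""
    adjusted = benjamini_hochberg(pvalues)
    cutoff = math.log2(min_fold)  # twofold corresponds to |log2 FC| >= 1
    return [
        i
        for i, (lfc, q) in enumerate(zip(log2_fc, adjusted))
        if q <= fdr and abs(lfc) >= cutoff
    ]


# Hypothetical values versus the 0 h reference; illustration only.
example_log2_fc = [1.5, -2.0, 0.3, 3.0]
example_pvalues = [0.01, 0.04, 0.03, 0.5]
selected = select_degs(example_log2_fc, example_pvalues)
print(selected)
```

Applying the fold-change filter on the log2 scale treats up- and downregulation symmetrically, which matches the abstract's separate counting of up- and downregulated genes.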
Lo, Wan-Yu; Yang, Wen-Kai; Peng, Ching-Tien; Pai, Wan-Yu; Wang, Huang-Joe
2018-01-01
Background and Aims: Increased O-linked N-acetylglucosamine (O-GlcNAc) modification of proteins by O-GlcNAc transferase (OGT) is associated with diabetic complications. Furthermore, oxidative stress promotes endothelial inflammation during diabetes. A previous study reported that microRNA-200 (miR-200) family members are sensitive to oxidative stress. In this study, we examined whether miR-200a and miR-200b regulate high-glucose (HG)-induced OGT expression in human aortic endothelial cells (HAECs) and whether miR-200a/200b downregulate OGT expression to control HG-induced endothelial inflammation. Methods: HAECs were stimulated with high glucose (25 mM) for 12 and 24 h. Real-time polymerase chain reaction (PCR), western blotting, THP-1 adhesion assay, bioinformatics prediction, transfection of miR-200a/200b mimic or inhibitor, luciferase reporter assay, and transfection of siRNA OGT were performed. The aortic endothelium of db/db diabetic mice was evaluated by immunohistochemistry staining. Results: HG upregulated OGT mRNA and protein expression and protein O-GlcNAcylation levels (RL2 antibody) in HAECs, and increased intercellular adhesion molecule 1 (ICAM-1), vascular cell adhesion molecule 1 (VCAM-1), and E-selectin gene expression; ICAM-1 expression; and THP-1 adhesion. Bioinformatics analysis revealed homologous sequences between members of the miR-200 family and the 3'-untranslated region (3'-UTR) of OGT mRNA, and real-time PCR analysis confirmed that members of the miR-200 family were significantly decreased in HG-stimulated HAECs. This suggests the presence of an impaired feedback restraint on HG-induced endothelial protein O-GlcNAcylation levels because of OGT upregulation. A luciferase reporter assay demonstrated that miR-200a/200b mimics bind to the 3'-UTR of OGT mRNA. 
Transfection with miR-200a/200b mimics significantly inhibited HG-induced OGT mRNA expression; OGT protein expression; protein O-GlcNAcylation levels; ICAM-1, VCAM-1, and E-selectin gene expression; ICAM-1 expression; and THP-1 adhesion. Additionally, siRNA-mediated OGT depletion reduced HG-induced protein O-GlcNAcylation; ICAM-1, VCAM-1, and E-selectin gene expression; ICAM-1 expression; and THP-1 adhesion, confirming that HG-induced endothelial inflammation is partially mediated via OGT-induced protein O-GlcNAcylation. These results were validated in vivo: tail-vein injection of miR-200a/200b mimics downregulated endothelial OGT and ICAM-1 expression in db/db mice. Conclusion: miR-200a/200b are involved in modulating HG-induced endothelial inflammation by regulating OGT-mediated protein O-GlcNAcylation, suggesting a therapeutic role for miR-200a/200b in vascular complications of diabetes.
Zhou, Yunying; Zhang, Qishu; Gao, Ge; Zhang, Xiaoli; Liu, Yafei; Yuan, Shoudao
2016-01-01
ABSTRACT The E7 oncoprotein of the high-risk human papillomavirus (HPV) plays a major role in HPV-induced carcinogenesis. E7 abrogates the G1 cell cycle checkpoint and induces genomic instability, but the mechanism is not fully understood. In this study, we performed RNA sequencing (RNA-seq) to characterize the transcriptional profile of keratinocytes expressing HPV 16 (HPV-16) E7. At the transcriptome level, 236 genes were differentially expressed between E7 and vector control cells. A subset of the differentially expressed genes, most of them novel to E7-expressing cells, was further confirmed by real-time PCR. Of interest, the activities of multiple transcription factors were altered in E7-expressing cells. Through bioinformatics analysis, pathways altered in E7-expressing cells were investigated. The upregulated genes were enriched in cell cycle and DNA replication, as well as in the DNA metabolic process, transcription, DNA damage, DNA repair, and nucleotide metabolism. Specifically, we focused our studies on the gene encoding WDHD1 (WD repeat and high mobility group [HMG]-box DNA-binding protein), one of the genes that was upregulated in E7-expressing cells. WDHD1 is a component of the replisome that regulates DNA replication. Recent studies suggest that WDHD1 may also function as a DNA replication initiation factor as well as a G1 checkpoint regulator. We found that in E7-expressing cells, the steady-state level of WDHD1 protein was increased along with the half-life. Moreover, downregulation of WDHD1 reduced E7-induced G1 checkpoint abrogation and rereplication, demonstrating a novel function for WDHD1. These studies shed light on mechanisms by which HPV induces genomic instability and have therapeutic implications. IMPORTANCE The high-risk HPV types induce cervical cancer and encode an E7 oncoprotein that plays a major role in HPV-induced carcinogenesis. 
However, the mechanism by which E7 induces carcinogenesis is not fully understood; specific anti-HPV agents are not available. In this study, we performed RNA-seq to characterize transcriptional profiling of keratinocytes expressing HPV-16 E7 and identified more than 200 genes that were differentially expressed between E7 and vector control cells. Through bioinformatics analysis, pathways altered in E7-expressing cells were identified. Significantly, the WDHD1 gene, one of the genes that is upregulated in E7-expressing cells, was found to play an important role in E7-induced G1 checkpoint abrogation and rereplication. These studies shed light on mechanisms by which HPV induces genomic instability and have therapeutic implications. PMID:27099318
Bioinformatics core competencies for undergraduate life sciences education.
Wilson Sayres, Melissa A; Hauser, Charles; Sierk, Michael; Robic, Srebrenka; Rosenwald, Anne G; Smith, Todd M; Triplett, Eric W; Williams, Jason J; Dinsdale, Elizabeth; Morgan, William R; Burnette, James M; Donovan, Samuel S; Drew, Jennifer C; Elgin, Sarah C R; Fowlks, Edison R; Galindo-Gonzalez, Sebastian; Goodman, Anya L; Grandgenett, Nealy F; Goller, Carlos C; Jungck, John R; Newman, Jeffrey D; Pearson, William; Ryder, Elizabeth F; Tosado-Acevedo, Rafael; Tapprich, William; Tobin, Tammy C; Toro-Martínez, Arlín; Welch, Lonnie R; Wright, Robin; Barone, Lindsay; Ebenbach, David; McWilliams, Mindy; Olney, Kimberly C; Pauley, Mark A
2018-01-01
Although bioinformatics is becoming increasingly central to research in the life sciences, bioinformatics skills and knowledge are not well integrated into undergraduate biology education. This curricular gap prevents biology students from harnessing the full potential of their education, limiting their career opportunities and slowing research innovation. To advance the integration of bioinformatics into life sciences education, a framework of core bioinformatics competencies is needed. To that end, we here report the results of a survey of biology faculty in the United States about teaching bioinformatics to undergraduate life scientists. Responses were received from 1,260 faculty representing institutions in all fifty states with a combined capacity to educate hundreds of thousands of students every year. Results indicate strong, widespread agreement that bioinformatics knowledge and skills are critical for undergraduate life scientists as well as considerable agreement about which skills are necessary. Perceptions of the importance of some skills varied with the respondent's degree of training, time since degree earned, and/or the Carnegie Classification of the respondent's institution. To assess which skills are currently being taught, we analyzed syllabi of courses with bioinformatics content submitted by survey respondents. Finally, we used the survey results, the analysis of the syllabi, and our collective research and teaching expertise to develop a set of bioinformatics core competencies for undergraduate biology students. These core competencies are intended to serve as a guide for institutions as they work to integrate bioinformatics into their life sciences curricula. PMID:29870542
yStreX: yeast stress expression database
Wanichthanarak, Kwanjeera; Nookaew, Intawat; Petranovic, Dina
2014-01-01
Over the past decade, genome-wide expression analyses have often been used to study how the expression of genes changes in response to various environmental stresses. Many of these studies (such as effects of oxygen concentration, temperature stress, low pH stress, osmotic stress, depletion or limitation of nutrients, addition of different chemical compounds, etc.) have been conducted in the unicellular eukaryotic model yeast Saccharomyces cerevisiae. However, the lack of a unifying or integrated bioinformatics platform that would permit efficient and rapid use of all these existing data remains an important issue. To facilitate research by exploiting existing transcription data in the field of yeast physiology, we have developed the yStreX database. It is an online repository of analyzed gene expression data from curated data sets from different studies that capture genome-wide transcriptional changes in response to diverse environmental transitions. The first aim of this online database is to facilitate comparison of cross-platform and cross-laboratory gene expression data. Additionally, we performed different expression analyses, meta-analyses and gene set enrichment analyses, and the results are also deposited in this database. Lastly, we constructed a user-friendly Web interface with interactive visualization to provide intuitive access and to display the queried data for users with no background in bioinformatics. Database URL: http://www.ystrexdb.com PMID:25024351
Large-scale modelling of the divergent spectrin repeats in nesprins: giant modular proteins.
Autore, Flavia; Pfuhl, Mark; Quan, Xueping; Williams, Aisling; Roberts, Roland G; Shanahan, Catherine M; Fraternali, Franca
2013-01-01
Nesprin-1 and nesprin-2 are nuclear envelope (NE) proteins characterized by a common structure of an SR (spectrin repeat) rod domain and a C-terminal transmembrane KASH [Klarsicht-ANC-Syne-homology] domain and display N-terminal actin-binding CH (calponin homology) domains. Mutations in these proteins have been described in Emery-Dreifuss muscular dystrophy and attributed to disruptions of interactions at the NE with the nesprin binding partners lamin A/C and emerin. Evolutionary analysis of the rod domains of the nesprins has shown that they are almost entirely composed of unbroken SR-like structures. We present a bioinformatical approach to the accurate definition of the boundaries of each SR by comparison with canonical SR structures, allowing for large-scale homology modelling of the 74 nesprin-1 and 56 nesprin-2 SRs. The exposed and evolutionarily conserved residues identify important pbs for protein-protein interactions that can guide tailored binding experiments. Most importantly, the bioinformatics analyses and the 3D models have been central to the design of selected constructs for protein expression. 1D NMR and CD spectra have been recorded for the expressed SRs, showing a folded, stable structure with high α-helical content, typical of SRs. Molecular Dynamics simulations have been performed to study the structural and elastic properties of consecutive SRs, revealing insights into the mechanical properties adopted by these modules in the cell.
Li, Hua-Xiang; Lu, Zhen-Ming; Zhu, Qing; Gong, Jin-Song; Geng, Yan; Shi, Jin-Song; Xu, Zheng-Hong; Ma, Yan-He
2017-09-01
The medicinal mushroom Antrodia camphorata produces large numbers of arthroconidia in submerged fermentation, which is rarely reported in basidiomycetous fungi. Nevertheless, the molecular mechanisms underlying this asexual sporulation (conidiation) remain unclear. Here, we used comparative transcriptomic and proteomic approaches to elucidate possible signaling pathways related to the asexual sporulation of A. camphorata. First, 104 differentially expressed proteins and 2586 differential cDNA sequences during the culture process of A. camphorata were identified by 2DE and RNA-seq, respectively. By applying bioinformatics analysis, a total of 67 genes that might play roles in sporulation were obtained, and 18 of these genes, including fluG, sfgA, SfaD, flbA, flbB, flbC, flbD, nsdD, brlA, abaA, wetA, ganB, fadA, PkaA, veA, velB, vosA, and stuA, might be involved in a potential FluG-mediated signaling pathway. Furthermore, the mRNA expression levels of the 18 genes in the proposed FluG-mediated signaling pathway were analyzed by quantitative real-time PCR. In summary, our study helps elucidate the molecular mechanisms underlying the asexual sporulation of A. camphorata, and also provides useful transcriptome and proteome data for further bioinformatics study of this valuable medicinal mushroom. © 2017 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim.
Randhawa, Gurinder Jit; Singh, Monika; Grover, Monendra
2011-02-01
The novel proteins introduced into the genetically modified (GM) crops need to be evaluated for the potential allergenicity before their introduction into the food chain to address the safety concerns of consumers. At present, there is no single definitive test that can be relied upon to predict allergic response in humans to a new protein; hence a composite approach to allergic response prediction is described in this study. The present study reports on the evaluation of the Cry proteins, encoded by cry1Ac, cry1Ab, cry2Ab, cry1Ca, cry1Fa/cry1Ca hybrid, being expressed in Bt food crops that are under field trials in India, for potential allergenic cross-reactivity using bioinformatics search tools. The sequence identity of amino acids was analyzed using FASTA3 of AllergenOnline version 10.0 and BLASTX of NCBI Entrez to identify any potential sequence matches to allergen proteins. As a step further in the detection of allergens, an independent database of domains in the allergens available in the AllergenOnline database was also developed. The results indicated no significant alignment and similarity of Cry proteins at domain level with any of the known allergens revealing that there is no potential risk of allergenic cross-reactivity. Copyright © 2010 Elsevier Ltd. All rights reserved.
Application of proteomics to ecology and population biology.
Karr, T L
2008-02-01
Proteomics is a relatively new scientific discipline that merges protein biochemistry, genome biology and bioinformatics to determine the spatial and temporal expression of proteins in cells, tissues and whole organisms. There has been very little application of proteomics to the fields of behavioral genetics, evolution, ecology and population dynamics, and it has only recently been effectively applied to the closely allied fields of molecular evolution and genetics. However, there exists considerable potential for proteomics to have an impact in areas related to functional ecology; this review will introduce the general concepts and methodologies that define the field of proteomics and compare its advantages and disadvantages with those of other methods. Examples of how proteomics can aid, complement and indeed extend the study of functional ecology will be discussed, including the main tool of ecological studies, population genetics, with an emphasis on metapopulation structure analysis. Because proteomic analysis provides a direct measure of gene expression, it obviates some of the limitations associated with other genomic approaches, such as microarray and EST analyses. Likewise, in conjunction with associated bioinformatics and molecular evolutionary tools, proteomics can provide the foundation of a systems-level integration approach that can enhance ecological studies. It can be envisioned that proteomics will provide important new information on issues specific to metapopulation biology and adaptive processes in nature. A specific example of the application of proteomics to sperm ageing is provided to illustrate the potential utility of the approach.
Pan, Jie-Xue; Tan, Ya-Jing; Wang, Fang-Fang; Hou, Ning-Ning; Xiang, Yu-Qian; Zhang, Jun-Yu; Liu, Ye; Qu, Fan; Meng, Qing; Xu, Jian; Sheng, Jian-Zhong; Huang, He-Feng
2018-01-01
Polycystic ovary syndrome (PCOS), whose etiology remains uncertain, is a highly heterogeneous and genetically complex endocrine disorder. The aim of this study was to identify differentially expressed genes (DEGs) in granulosa cells (GCs) from PCOS patients and to gain epigenetic insights into the pathogenesis of PCOS. Included in this study were 110 women with PCOS and, as the control group, 119 women with normal ovulatory cycles undergoing in vitro fertilization. RNA-seq identified 92 DEGs unique to PCOS GCs in comparison with the control group. Bioinformatic analysis indicated that synthesis of lipids and steroids was activated in PCOS GCs. 5-Methylcytosine analysis demonstrated that there was an approximate 25% reduction in global DNA methylation of GCs in PCOS women (4.44 ± 0.65%) compared with the controls (6.07 ± 0.72%; P < 0.05). Using MassArray EpiTYPER quantitative DNA methylation analysis, we also found hypomethylation of several gene promoters related to lipid and steroid synthesis, which might result in the aberrant expression of these genes. Our results suggest that hypomethylation of genes related to the synthesis of lipids and steroids may dysregulate the expression of these genes and promote synthesis of steroid hormones including androgen, which could partially explain mechanisms of hyperandrogenism in PCOS.
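As a quick arithmetic check on the methylation figures quoted in this abstract, the relative reduction can be recomputed from the two group means. The means are taken from the abstract; reading the "approximate 25% reduction" as a relative difference of means is an assumption made here.

```latex
\[
\frac{6.07 - 4.44}{6.07} \;=\; \frac{1.63}{6.07} \;\approx\; 0.27
\]
```

Under that reading the reduction is about 27%, which is in line with the "approximate 25%" stated in the abstract.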
Chen, Min; Tan, Qiuping; Sun, Mingyue; Li, Dongmei; Fu, Xiling; Chen, Xiude; Xiao, Wei; Li, Ling; Gao, Dongsheng
2016-06-01
Bud dormancy in deciduous fruit trees is an important adaptive mechanism for their survival in cold climates. The WRKY genes participate in several developmental and physiological processes, including dormancy. However, the dormancy mechanisms of WRKY genes have not been studied in detail. We conducted a genome-wide analysis and identified 58 WRKY genes in peach. These putative genes were located on all eight chromosomes. In bioinformatics analyses, we compared the sequences of WRKY genes from peach, rice, and Arabidopsis. In a cluster analysis, the gene sequences formed three groups, of which group II was further divided into five subgroups. Gene structure was highly conserved within each group, especially in groups IId and III. Gene expression analyses by qRT-PCR showed that WRKY genes showed different expression patterns in peach buds during dormancy. The mean expression levels of six WRKY genes (Prupe.6G286000, Prupe.1G393000, Prupe.1G114800, Prupe.1G071400, Prupe.2G185100, and Prupe.2G307400) increased during endodormancy and decreased during ecodormancy, indicating that these six WRKY genes may play a role in dormancy in a perennial fruit tree. This information will be useful for selecting fruit trees with desirable dormancy characteristics or for manipulating dormancy in genetic engineering programs.
Deng, Dawei; Li, Yang; Xue, Jianpeng; Wang, Jie; Ai, Guanhua; Li, Xin; Gu, Yueqing
2015-01-01
Messenger RNA (mRNA), a single-strand ribonucleic acid with functional gene information is usually abnormally expressed in cancer cells and has become a promising biomarker for the study of tumor progress. Hairpin DNA-coated gold nanoparticle (hDAuNP) beacon containing a bare gold nanoparticle (AuNP) as fluorescence quencher and thiol-terminated fluorescently labeled stem–loop–stem oligonucleotide sequences attached by Au–S bond is currently a new nanoscale biodiagnostic platform capable of mRNA detection, in which the design of the loop region sequence is crucial for hybridizing with the target mRNA. Hence, in this study, to improve the sensitivity and selectivity of hDAuNP beacon simultaneously, the loop region of hairpin DNA was screened by bioinformatics strategy. Here, signal transducer and activator of transcription 5b (STAT5b) mRNA was selected and used as a practical example. The results from the combined characterizations using optical techniques, flow cytometry assay, and cell microscopic imaging showed that after optimization, the as-prepared hDAuNP beacon had higher selectivity and sensitivity for the detection of STAT5b mRNA in living cells, as compared with our previous beacon. Thus, the bioinformatics method may be a promising new strategy for assisting in the designing of the hDAuNP beacon, extending its application in the detection of mRNA expression and the resultant mRNA-based biological processes and disease pathogenesis. PMID:25987838
Analysis of genetic association using hierarchical clustering and cluster validation indices.
Pagnuco, Inti A; Pastore, Juan I; Abras, Guillermo; Brun, Marcel; Ballarin, Virginia L
2017-10-01
It is usually assumed that co-expressed genes suggest co-regulation in the underlying regulatory network. Determining sets of co-expressed genes, based on some criterion of similarity, is therefore an important task. This task is usually performed by clustering algorithms, where the genes are clustered into meaningful groups based on their expression values in a set of experiments. In this work, we propose a method to find sets of co-expressed genes, based on cluster validation indices as a measure of similarity for individual gene groups, and a combination of variants of hierarchical clustering to generate the candidate groups. We evaluated its ability to retrieve significant sets on simulated correlated data and real genomics data, where performance is measured by its ability to detect co-regulated sets compared with a full search. Additionally, we analyzed the quality of the best ranked groups using an online bioinformatics tool that provides network information for the selected genes. Copyright © 2017 Elsevier Inc. All rights reserved.
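The general idea described in this abstract (generate candidate gene groups by hierarchical clustering, then score each group with a validation index) can be sketched as follows. This is a minimal illustration, not the authors' method: average linkage, a correlation-based distance, and the silhouette width as the per-group index are all assumptions chosen for brevity, and the gene names and expression values are invented toy data.

```python
import math
from itertools import combinations

def pearson(x, y):
    """Pearson correlation of two equally long expression profiles."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

def distance_matrix(profiles):
    """Correlation distance (1 - r) for every ordered gene pair."""
    return {(g, h): 1.0 - pearson(profiles[g], profiles[h])
            for g in profiles for h in profiles}

def average_linkage(a, b, dist):
    """Mean pairwise distance between two clusters."""
    return sum(dist[(g, h)] for g in a for h in b) / (len(a) * len(b))

def agglomerate(profiles, k):
    """Merge the two closest clusters repeatedly until k clusters remain."""
    dist = distance_matrix(profiles)
    clusters = [[g] for g in profiles]
    while len(clusters) > k:
        i, j = min(combinations(range(len(clusters)), 2),
                   key=lambda p: average_linkage(clusters[p[0]], clusters[p[1]], dist))
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]
    return clusters, dist

def silhouette(cluster, clusters, dist):
    """Mean silhouette width of one cluster, used here as its validation index."""
    others = [c for c in clusters if c is not cluster]
    if len(cluster) < 2 or not others:
        return 0.0
    widths = []
    for g in cluster:
        a = sum(dist[(g, h)] for h in cluster if h != g) / (len(cluster) - 1)
        b = min(sum(dist[(g, h)] for h in c) / len(c) for c in others)
        widths.append((b - a) / max(a, b) if max(a, b) > 0 else 0.0)
    return sum(widths) / len(widths)

# Toy expression profiles (hypothetical genes, four experiments each).
profiles = {
    "g1": [1.0, 2.0, 3.0, 4.0],
    "g2": [2.0, 4.0, 6.0, 8.5],
    "g3": [1.1, 2.1, 2.9, 4.2],
    "g4": [4.0, 3.0, 2.0, 1.0],
    "g5": [8.0, 6.5, 4.0, 2.0],
}

clusters, dist = agglomerate(profiles, k=2)
groups = sorted(sorted(c) for c in clusters)
scores = {tuple(sorted(c)): silhouette(c, clusters, dist) for c in clusters}
for group, score in sorted(scores.items(), key=lambda kv: -kv[1]):
    print(group, round(score, 3))
```

Ranking groups by their own index, rather than scoring the whole partition once, is what lets individual well-separated gene sets stand out even when the rest of the clustering is poor.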
Online Tools for Bioinformatics Analyses in Nutrition Sciences
Malkaram, Sridhar A.; Hassan, Yousef I.; Zempleni, Janos
2012-01-01
Recent advances in “omics” research have resulted in the creation of large datasets that were generated by consortiums and centers, small datasets that were generated by individual investigators, and bioinformatics tools for mining these datasets. It is important for nutrition laboratories to take full advantage of the analysis tools to interrogate datasets for information relevant to genomics, epigenomics, transcriptomics, proteomics, and metabolomics. This review provides guidance regarding bioinformatics resources that are currently available in the public domain, with the intent to provide a starting point for investigators who want to take advantage of the opportunities provided by the bioinformatics field. PMID:22983844
India's Computational Biology Growth and Challenges.
Chakraborty, Chiranjib; Bandyopadhyay, Sanghamitra; Agoramoorthy, Govindasamy
2016-09-01
India's computational science is growing swiftly owing to the rapid expansion of internet and information technology services. The bioinformatics sector of India has been transforming rapidly, creating a competitive position in the global bioinformatics market. Bioinformatics is widely used across India to address a wide range of biological issues. Recently, computational researchers and biologists have been collaborating in projects such as database development, sequence analysis, genomic prospects and algorithm generation. In this paper, we present the Indian computational biology scenario, highlighting bioinformatics-related educational activities, manpower development, the internet boom, the service industry, research activities, and conferences and training undertaken by the corporate and government sectors. Nonetheless, this new field of science faces many challenges.
Li, Nan; Tang, Anliu; Huang, Shuo; Li, Zeng; Li, Xiayu; Shen, Shourong; Ma, Jian; Wang, Xiaoyan
2013-08-01
Recent data strongly suggest a profound role for miRNAs in cancer progression. Here, we showed that miR-126 expression was much lower in HCT116, SW620 and HT-29 colon cancer cells with highly metastatic potential and that miR-126 downregulation was more frequent in colorectal cancers with metastasis. Restored miR-126 expression inhibited HT-29 cell growth, cell-cycle progression and invasion. Mechanistically, microarray results combined with bioinformatic and experimental analysis demonstrated that miR-126 exerted its tumor suppressor role by inhibiting the RhoA/ROCK signaling pathway. These results suggest that miR-126 functions as a potential tumor suppressor in colon cancer progression and that miR-126/RhoA/ROCK may be a novel candidate for developing rational therapeutic strategies.
Xu, Fan; Yang, Jing; Chen, Jin; Wu, Qingyuan; Gong, Wei; Zhang, Jianguo; Shao, Weihua; Mu, Jun; Yang, Deyu; Yang, Yongtao; Li, Zhiwei; Xie, Peng
2015-04-03
Recent depression research has revealed a growing awareness of how to best classify depression into depressive subtypes. Appropriately subtyping depression can lead to identification of subtypes that are more responsive to current pharmacological treatment and aid in separating out depressed patients in which current antidepressants are not particularly effective. Differential co-expression analysis (DCEA) and differential regulation analysis (DRA) were applied to compare the transcriptomic profiles of peripheral blood lymphocytes from patients with two depressive subtypes: major depressive disorder (MDD) and subsyndromal symptomatic depression (SSD). Six differentially regulated genes (DRGs) (FOSL1, SRF, JUN, TFAP4, SOX9, and HLF) and 16 transcription factor-to-target differentially co-expressed gene links or pairs (TF2target DCLs) appear to be the key differential factors in MDD; in contrast, one DRG (PATZ1) and eight TF2target DCLs appear to be the key differential factors in SSD. There was no overlap between the MDD target genes and SSD target genes. Venlafaxine (Efexor™, Effexor™) appears to have a significant effect on the gene expression profile of MDD patients but no significant effect on the gene expression profile of SSD patients. DCEA and DRA revealed no apparent similarities between the differential regulatory processes underlying MDD and SSD. This bioinformatic analysis may provide novel insights that can support future antidepressant R&D efforts.
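The notion of a differentially co-expressed link used in this abstract can be illustrated with a small sketch: a gene pair is flagged when its correlation differs strongly between two groups of samples. This is only an assumed, simplified stand-in for the DCEA and DRA procedures the authors applied. The threshold, the use of plain Pearson correlation, and the gene labels `TF`, `T1` and `T2` are all hypothetical.

```python
import math

def pearson(x, y):
    """Pearson correlation of two equally long expression vectors."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

def differential_coexpression(expr_a, expr_b, threshold=1.0):
    """Return gene pairs whose correlation changes by at least `threshold`
    between condition A and condition B (hypothetical cutoff)."""
    genes = sorted(set(expr_a) & set(expr_b))
    links = []
    for i, g in enumerate(genes):
        for h in genes[i + 1:]:
            r_a = pearson(expr_a[g], expr_a[h])
            r_b = pearson(expr_b[g], expr_b[h])
            if abs(r_a - r_b) >= threshold:
                links.append((g, h, round(r_a, 3), round(r_b, 3)))
    return links

# Invented toy data: T1 follows TF in group A but opposes it in group B.
group_a = {"TF": [1, 2, 3, 4], "T1": [2, 4, 6, 8], "T2": [5, 3, 4, 6]}
group_b = {"TF": [1, 2, 3, 4], "T1": [8, 6, 4, 2], "T2": [5, 3, 4, 6]}

links = differential_coexpression(group_a, group_b)
for g, h, r_a, r_b in links:
    print(f"{g}-{h}: r_A={r_a}, r_B={r_b}")
```

A real analysis would add a significance test and multiple-testing correction rather than a fixed cutoff on the correlation difference.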
Deonovic, Benjamin; Wang, Yunhao; Weirather, Jason; Wang, Xiu-Jie; Au, Kin Fai
2017-01-01
Allele-specific expression (ASE) is a fundamental problem in studying gene regulation and diploid transcriptome profiles, with two key challenges: (i) haplotyping and (ii) estimation of ASE at the gene isoform level. Existing ASE analysis methods are limited by a dependence on haplotyping from laborious experiments or extra genome/family trio data. In addition, there is a lack of methods for gene isoform level ASE analysis. We developed a tool, IDP-ASE, for full ASE analysis. By innovative integration of Third Generation Sequencing (TGS) long reads with Second Generation Sequencing (SGS) short reads, the accuracy of haplotyping and ASE quantification at the gene and gene isoform level was greatly improved, as demonstrated by the gold standard GM12878 data and semi-simulation data. In addition to methodology development, applications of IDP-ASE to human embryonic stem cells and breast cancer cells indicate that the imbalance of ASE and non-uniformity of gene isoform ASE are widespread, including in tumorigenesis-relevant genes and pluripotency markers. These results show that gene isoform expression and allele-specific expression cooperate to provide high diversity and complexity of gene regulation and expression, highlighting the importance of studying ASE at the gene isoform level. Our study provides a robust bioinformatics solution to understand ASE using RNA sequencing data only. PMID:27899656
Microarray Data Mining for Potential Selenium Targets in Chemoprevention of Prostate Cancer
ZHANG, HAITAO; DONG, YAN; ZHAO, HONGJUAN; BROOKS, JAMES D.; HAWTHORN, LESLEYANN; NOWAK, NORMA; MARSHALL, JAMES R.; GAO, ALLEN C.; IP, CLEMENT
2008-01-01
Background: A previous clinical trial showed that selenium supplementation significantly reduced the incidence of prostate cancer. We report here a bioinformatics approach to gain new insights into selenium molecular targets that might be relevant to prostate cancer chemoprevention. Materials and Methods: We first performed data mining analysis to identify genes which are consistently dysregulated in prostate cancer using published datasets from gene expression profiling of clinical prostate specimens. We then devised a method to systematically analyze three selenium microarray datasets from the LNCaP human prostate cancer cells, and to match the analysis to the cohort of genes implicated in prostate carcinogenesis. Moreover, we compared the selenium datasets with two datasets obtained from expression profiling of androgen-stimulated LNCaP cells. Results: We found that selenium reverses the expression of genes implicated in prostate carcinogenesis. In addition, we found that selenium could counteract the effect of androgen on the expression of a subset of androgen-regulated genes. Conclusions: The above information provides us with a treasure of new clues to investigate the mechanism of selenium chemoprevention of prostate cancer. Furthermore, these selenium target genes could also serve as biomarkers in future clinical trials to gauge the efficacy of selenium intervention. PMID:18548127
Li, Jingyun; Chen, Ling; Li, Qian; Cao, Jing; Gao, Yanli; Li, Jun
2018-08-01
Endogenous peptides have recently attracted increasing attention for their participation in various biological processes. Their roles in the pathogenesis of human hypertrophic scar remain poorly understood. In this study, we used liquid chromatography-tandem mass spectrometry to construct a comparative peptidomic profile of human hypertrophic scar tissue and matched normal skin. A total of 179 peptides were significantly differentially expressed in human hypertrophic scar tissue, with 95 upregulated and 84 downregulated peptides relative to matched normal skin. Further bioinformatics analysis (Gene Ontology and Kyoto Encyclopedia of Genes and Genomes pathway analysis) indicated that precursor proteins of these differentially expressed peptides correlate with cellular process, biological regulation, cell part, binding and structural molecule activity, and with the ribosome and PPAR signaling pathways, during pathological changes of hypertrophic scar. Based on prediction databases, we found that 78 differentially expressed peptides shared homology with antimicrobial peptides and five matched known immunomodulatory peptides. In conclusion, our results show significantly altered expression profiles of peptides in human hypertrophic scar tissue. These peptides may participate in the etiology of hypertrophic scar and provide a beneficial scheme for scar evaluation and treatment. © 2017 Wiley Periodicals, Inc.
Shi, Xingjie; Zhao, Qing; Huang, Jian; Xie, Yang; Ma, Shuangge
2015-01-01
Motivation: Both gene expression levels (GEs) and copy number alterations (CNAs) have important biological implications. GEs are partly regulated by CNAs, and much effort has been devoted to understanding their relations. The regulation analysis is challenging with one gene expression possibly regulated by multiple CNAs and one CNA potentially regulating the expressions of multiple genes. The correlations among GEs and among CNAs make the analysis even more complicated. The existing methods have limitations and cannot comprehensively describe the regulation. Results: A sparse double Laplacian shrinkage method is developed. It jointly models the effects of multiple CNAs on multiple GEs. Penalization is adopted to achieve sparsity and identify the regulation relationships. Network adjacency is computed to describe the interconnections among GEs and among CNAs. Two Laplacian shrinkage penalties are imposed to accommodate the network adjacency measures. Simulation shows that the proposed method outperforms the competing alternatives with more accurate marker identification. The Cancer Genome Atlas data are analysed to further demonstrate advantages of the proposed method. Availability and implementation: R code is available at http://works.bepress.com/shuangge/49/ Contact: shuangge.ma@yale.edu Supplementary information: Supplementary data are available at Bioinformatics online. PMID:26342102
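The Laplacian shrinkage idea mentioned in this abstract can be illustrated with a generic network-smoothing penalty. The sketch below is not the paper's estimator: it only evaluates a standard Laplacian quadratic penalty for one coefficient vector, assuming a signed, symmetric adjacency matrix, and checks that the pairwise form equals the quadratic form with L = D - A. The adjacency values and coefficients are invented.

```python
def laplacian_penalty(beta, adjacency):
    """Pairwise form: sum over j<k of |a_jk| * (beta_j - sign(a_jk) * beta_k)^2.
    Connected variables are pushed toward similar (or opposite) effects."""
    p = len(beta)
    total = 0.0
    for j in range(p):
        for k in range(j + 1, p):
            a = adjacency[j][k]
            if a != 0.0:
                sign = 1.0 if a > 0 else -1.0
                total += abs(a) * (beta[j] - sign * beta[k]) ** 2
    return total

def quadratic_form(beta, adjacency):
    """Equivalent form beta^T (D - A) beta, with D the diagonal matrix of
    absolute row sums of the adjacency matrix A."""
    p = len(beta)
    degree = [sum(abs(adjacency[j][k]) for k in range(p) if k != j)
              for j in range(p)]
    total = 0.0
    for j in range(p):
        for k in range(p):
            l_jk = degree[j] if j == k else -adjacency[j][k]
            total += beta[j] * l_jk * beta[k]
    return total

# Hypothetical network: variables 0 and 1 positively linked, 1 and 2 negatively.
adjacency = [
    [0.0, 0.8, 0.0],
    [0.8, 0.0, -0.5],
    [0.0, -0.5, 0.0],
]

smooth = [1.0, 1.0, -1.0]   # agrees with the network signs
rough = [1.0, -1.0, -1.0]   # disagrees on both edges

print(laplacian_penalty(smooth, adjacency))
print(laplacian_penalty(rough, adjacency))
```

In a penalized regression, a term of this kind would typically be added to the loss alongside a sparsity penalty, so that selected markers tend to be selected together with their network neighbours.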
Janjanam, Jagadeesh; Singh, Surender; Jena, Manoj K.; Varshney, Nishant; Kola, Srujana; Kumar, Sudarshan; Kaushik, Jai K.; Grover, Sunita; Dang, Ajay K.; Mukesh, Manishi; Prakash, B. S.; Mohanty, Ashok K.
2014-01-01
The mammary gland is made up of a branching network of ducts that end in alveoli, which surround the lumen. These alveolar mammary epithelial cells (MEC) reflect the milk-producing ability of farm animals. In this study, we used 2D-DIGE and mass spectrometry to identify the protein changes in MEC during the immediate early, peak and late stages of lactation and also compared differentially expressed proteins in MEC isolated from the milk of high and low milk producing cows. We identified 41 differentially expressed proteins across lactation stages and 22 proteins between high and low milk yielding cows. Bioinformatics analysis showed that a majority of the differentially expressed proteins are associated with metabolic process, catalytic and binding activity. The differentially expressed proteins were mapped to the available biological pathways and networks involved in lactation. The proteins up-regulated during the late stage of lactation are associated with NF-κB stress-induced signaling pathways, whereas Akt, PI3K and p38/MAPK signaling pathways are associated with high milk production mediated through insulin hormone signaling. PMID:25111801
Jones, John T; Kumar, Amar; Pylypenko, Liliya A; Thirugnanasambandam, Amarnath; Castelli, Lydia; Chapman, Sean; Cock, Peter J A; Grenier, Eric; Lilley, Catherine J; Phillips, Mark S; Blok, Vivian C
2009-11-01
In this article, we describe the analysis of over 9000 expressed sequence tags (ESTs) from cDNA libraries obtained from various life cycle stages of Globodera pallida. We have identified over 50 G. pallida effectors from this dataset using bioinformatics analysis, by screening clones in order to identify secreted proteins up-regulated after the onset of parasitism and using in situ hybridization to confirm the expression in pharyngeal gland cells. A substantial gene family encoding G. pallida SPRYSEC proteins has been identified. The expression of these genes is restricted to the dorsal pharyngeal gland cell. Different members of the SPRYSEC family of proteins from G. pallida show different subcellular localization patterns in plants, with some localized to the cytoplasm and others to the nucleus and nucleolus. Differences in subcellular localization may reflect diverse functional roles for each individual protein or, more likely, variety in the compartmentalization of plant proteins targeted by the nematode. Our data are therefore consistent with the suggestion that the SPRYSEC proteins suppress host defences, as suggested previously, and that they achieve this through interaction with a range of host targets.
Wang, Li-Chao; Wei, Wen-Hui; Zhang, Xiao-Wen; Liu, Dan; Zeng, Ke-Wu; Tu, Peng-Fei
2018-01-01
Drastic macrophage activation triggered by exogenous infection or endogenous stresses is thought to be implicated in the pathogenesis of various inflammatory diseases. Carnosic acid (CA), a natural phenolic diterpene extracted from the Salvia officinalis plant, has been reported to possess anti-inflammatory activity. However, its role in macrophage activation, as well as the potential molecular mechanism, is largely unexplored. In the current study, we sought to elucidate the anti-inflammatory property of CA using an integrated approach based on unbiased proteomics and bioinformatics analysis. CA significantly inhibited the robust increase of nitric oxide and TNF-α, downregulated COX2 protein expression, and lowered the transcriptional level of inflammatory genes including Nos2, Tnfα, Cox2, and Mcp1 in LPS-stimulated RAW264.7 cells, a murine model of peritoneal macrophage cell line. The LC-MS/MS-based shotgun proteomics analysis showed that CA negatively regulated 217 LPS-elicited proteins which were involved in multiple inflammatory processes including MAPK, nuclear factor (NF)-κB, and FoxO signaling pathways. A further molecular biology analysis revealed that CA effectively inactivated IKKβ/IκB-α/NF-κB, ERK/JNK/p38 MAPKs, and FoxO1/3 signaling pathways. Collectively, our findings demonstrate the role of CA in regulating the inflammatory response and provide some insights into the proteomics-guided pharmacological mechanism study of natural products. PMID:29713284
Chen, Yanyu; Xie, Yong; Xu, Lai; Zhan, Shaohua; Xiao, Yi; Gao, Yanpan; Wu, Bin; Ge, Wei
2017-02-15
Tumor cells of colorectal cancer (CRC) release exosomes into the circulation. These exosomes can mediate communication between cells and affect various tumor-related processes in their target cells. We present a quantitative proteomics analysis of the exosomes purified from serum of patients with CRC and normal volunteers; data are available via ProteomeXchange with identifier PXD003875. We identified 918 proteins with an overlap of 725 Gene IDs in the Exocarta proteins list. Compared with the serum-purified exosomes (SPEs) of normal volunteers, we found 36 proteins upregulated and 22 proteins downregulated in the SPEs of CRC patients. Bioinformatics analysis revealed that upregulated proteins are involved in processes that modulate the pretumorigenic microenvironment for metastasis. In contrast, differentially expressed proteins (DEPs) that play critical roles in tumor growth and cell survival were principally downregulated. Our study demonstrates that SPEs of CRC patients play a pivotal role in promoting the tumor invasiveness, but have minimal influence on putative alterations in tumor survival or proliferation. According to bioinformatics analysis, we speculate that the protein contents of exosomes might be associated with whether they are involved in premetastatic niche establishment or growth and survival of metastatic tumor cells. This information will be helpful in elucidating the pathophysiological functions of tumor-derived exosomes, and aid in the development of CRC diagnostics and therapeutics. © 2016 UICC.
Yao, Lu; Zhu, Li-Ping; Xu, Xiao-Yan; Tan, Ling-Ling; Sadilek, Martin; Fan, Huan; Hu, Bo; Shen, Xiao-Ting; Yang, Jie; Qiao, Bin; Yang, Song
2016-01-01
Transcriptomic analysis of cultured fungi suggests that many genes for secondary metabolite synthesis are presumably silent under standard laboratory conditions. In order to investigate the expression of silent genes in symbiotic systems, 136 fungi-fungi symbiotic systems were built up by co-culturing seventeen basidiomycetes, among which the co-culture of Trametes versicolor and Ganoderma applanatum demonstrated the strongest coloration of confrontation zones. A metabolomics study of this co-culture discovered that sixty-two features were either newly synthesized or highly produced in the co-culture compared with individual cultures. Molecular network analysis highlighted a subnetwork including two novel xylosides (compounds 2 and 3). Compound 2 was further identified as N-(4-methoxyphenyl)formamide 2-O-β-D-xyloside and was revealed to have the potential to enhance the cell viability of the human immortalized bronchial epithelial cell line Beas-2B. Moreover, bioinformatics and transcriptional analysis of T. versicolor revealed a potential candidate gene (GI: 636605689) encoding xylosyltransferases for xylosylation. Additionally, 3-phenyllactic acid and orsellinic acid were detected for the first time in G. applanatum, which may be ascribed to a response against T. versicolor stress. In general, the described co-culture platform provides a powerful tool to discover novel metabolites and help gain insights into the mechanism of silent gene activation in fungal defense. PMID:27616058
Genomewide analysis of TCP transcription factor gene family in Malus domestica.
Xu, Ruirui; Sun, Peng; Jia, Fengjuan; Lu, Longtao; Li, Yuanyuan; Zhang, Shizhong; Huang, Jinguang
2014-12-01
Teosinte branched 1/cycloidea/proliferating cell factor 1 (TCP) proteins are a large family of transcriptional regulators in angiosperms. They are involved in various biological processes, including development and plant metabolism pathways. In this study, a total of 52 TCP genes were identified in the apple (Malus domestica) genome. Bioinformatic methods were employed to predict and analyse the gene classification, gene structure, chromosome location, sequence alignment and conserved domains of the MdTCP proteins. Expression analysis from microarray data showed that the expression levels of 28 and 51 MdTCP genes changed during the ripening and rootstock-scion interaction processes, respectively. The expression patterns of 12 selected MdTCP genes were analysed in different tissues and in response to abiotic stresses. All of the selected genes were detected in at least one of the tissues tested, and most of them were modulated by adverse treatments, indicating that the MdTCPs are involved in various developmental and physiological processes. To the best of our knowledge, this is the first genomewide analysis of the apple TCP gene family. These results provide valuable information for studies on the functions of the TCP transcription factor genes in apple.
2013-01-01
Background: High risk, unfavorable classical Hodgkin lymphoma (cHL) includes those patients with primary refractory or early relapse, and progressive disease. To improve the availability of biomarkers for this group of patients, we investigated both tumor biopsies and peripheral blood leukocytes (PBL) of untreated (chemo-naïve, CN) Nodular Sclerosis Classic Hodgkin Lymphoma (NS-cHL) patients for consistent biomarkers that can predict the outcome prior to frontline treatment. Methods and materials: Bioinformatics data mining was used to generate 151 candidate biomarkers, which were screened against a library of 10 HL cell lines. Expression of FGF2 and SDC1 by CD30+ cells from HL patient samples representing good and poor outcomes were analyzed by qRT-PCR, immunohistochemical (IHC), and immunofluorescence analyses. Results: To identify predictive HL-specific biomarkers, potential marker genes selected using bioinformatics approaches were screened against HL cell lines and HL patient samples. Fibroblast Growth Factor-2 (FGF2) and Syndecan-1 (SDC1) were overexpressed in all HL cell lines, and the overexpression was HL-specific when compared to 116 non-Hodgkin lymphoma tissues. In the analysis of stratified NS-cHL patient samples, expression of FGF2 and SDC1 were 245 fold and 91 fold higher, respectively, in the poor outcome (PO) group than in the good outcome (GO) group. The PO group exhibited higher expression of the HL marker CD30, the macrophage marker CD68, and metastatic markers TGFβ1 and MMP9 compared to the GO group. This expression signature was confirmed by qualitative immunohistochemical and immunofluorescent data. A Kaplan-Meier analysis indicated that samples in which the CD30+ cells carried an FGF2+/SDC1+ immunophenotype showed shortened survival. Analysis of chemo-naive HL blood samples suggested that in the PO group a subset of CD30+ HL cells had entered the circulation. These cells significantly overexpressed FGF2 and SDC1 compared to the GO group. The PO group showed significant down-regulation of markers for monocytes, T-cells, and B-cells. These expression signatures were eliminated in heavily pretreated patients. Conclusion: The results suggest that small subsets of circulating CD30+/CD15+ cells expressing FGF2 and SDC1 represent biomarkers that identify NS-cHL patients who will experience a poor outcome (primary refractory and early relapsing). PMID:23988031
Evolutionarily conserved ELOVL4 gene expression in the vertebrate retina.
Lagali, Pamela S; Liu, Jiafan; Ambasudhan, Rajesh; Kakuk, Laura E; Bernstein, Steven L; Seigel, Gail M; Wong, Paul W; Ayyagari, Radha
2003-07-01
The gene elongation of very long chain fatty acids-4 (ELOVL4) has been shown to underlie phenotypically heterogeneous forms of autosomal dominant macular degeneration. In this study, the extent of evolutionary conservation and the existence and localization of retinal expression of this gene was investigated across a wide variety of species. Southern blot analysis of genomic DNA and bioinformatic analysis using the human ELOVL4 cDNA and protein sequences, respectively, were performed to identify species in which ELOVL4 orthologues and/or homologues are present. Retinal RNA and protein extracts derived from different species were assessed by Northern hybridization and immunoblot techniques to assess evolutionary conservation of gene expression. Immunohistochemical analysis of tissue sections prepared from various mammalian retinas was performed to determine the distribution of ELOVL4 and homologous proteins within specific retinal cell layers. The existence of ELOVL4 sequence orthologues and homologues was confirmed by both Southern blot analysis and in silico searches of protein sequence databases. Phylogenetic analysis places ELOVL4 among a large family of known and putative fatty acid elongase proteins. Northern blot analysis revealed the presence of multiple transcripts corresponding to ELOVL4 homologues expressed in the retina of several different mammalian species. Conserved proteins were also detected among retinal extracts of different mammals and were found to localize predominantly to the photoreceptor cell layer within retinal tissue preparations. The ELOVL4 gene is highly conserved throughout evolution and is expressed in the photoreceptor cells of the retina in a variety of different species, which suggests that it plays a critical role in retinal cell biology.
Lutz, Michael W.; Saul, Robert; Linnertz, Colton; Glenn, Omolara-Chinue; Roses, Allen D.; Chiba-Falek, Ornit
2015-01-01
INTRODUCTION We recently showed that tagging-SNPs across the SNCA locus were significantly associated with increased risk for LB pathology in AD cases. However, the actual genetic variant(s) that underlie the observed associations remain elusive. METHODS We used a bioinformatics algorithm to catalogue Structural-Variants in a region of SNCA-intron4, followed by phased-sequencing. We performed a genetic-association analysis in autopsy series of LBV/AD cases compared with AD-only controls. We investigated the biological functions by expression analysis using temporal-cortex samples. RESULTS We identified four distinct haplotypes within a highly-polymorphic-low-complexity CT-rich region. We showed that a specific haplotype conferred risk to develop LBV/AD. We demonstrated that the CT-rich site acts as an enhancer element, where the risk haplotype was significantly associated with elevated levels of SNCA-mRNA. DISCUSSION We have discovered a novel haplotype in a CT-rich region in SNCA that contributes to LB pathology in AD patients, possibly via cis-regulation of the gene expression. PMID:26079410
Honts, Jerry E.
2003-01-01
Recent advances in genomics and structural biology have resulted in an unprecedented increase in biological data available from Internet-accessible databases. In order to help students effectively use this vast repository of information, undergraduate biology students at Drake University were introduced to bioinformatics software and databases in three courses, beginning with an introductory course in cell biology. The exercises and projects that were used to help students develop literacy in bioinformatics are described. In a recently offered course in bioinformatics, students developed their own simple sequence analysis tool using the Perl programming language. These experiences are described from the point of view of the instructor as well as the students. A preliminary assessment has been made of the degree to which students had developed a working knowledge of bioinformatics concepts and methods. Finally, some conclusions have been drawn from these courses that may be helpful to instructors wishing to introduce bioinformatics within the undergraduate biology curriculum. PMID:14673489
Zheng, Linli; Ge, Yumei; Hu, Weilin; Yan, Jie
2013-03-01
To determine expression changes of major outer membrane protein (OMP) antigens of Leptospira interrogans serogroup Icterohaemorrhagiae serovar Lai strain Lai during infection of human macrophages, and the underlying mechanism. OmpR-encoding genes and an OmpR-related histidine kinase (HK)-encoding gene of L. interrogans strain Lai and their functional domains were predicted using bioinformatics techniques. Changes in mRNA levels of the leptospiral major OMP-encoding genes before and after infection of human THP-1 macrophages were detected by real-time fluorescence quantitative RT-PCR. Effects of the OmpR-encoding genes and the HK-encoding gene on the expression of leptospiral OMPs during infection were determined by HK-peptide antiserum blocking and closantel inhibition assays. The bioinformatics analysis indicated that LB015 and LB333 were identified as OmpR-encoding genes of the spirochete, while LB014 might act as an OmpR-related HK-encoding gene. After the spirochete infected THP-1 cells, mRNA levels of the leptospiral lipL21, lipL32 and lipL41 genes were rapidly and persistently down-regulated (P < 0.01), whereas mRNA levels of the leptospiral groEL, mce, loa22 and ligB genes were rapidly but transiently up-regulated (P < 0.01). Treatment with closantel and HK-peptide antiserum partly reversed the infection-induced down-regulation of mRNA levels of the lipL21 and lipL48 genes (P < 0.01). Moreover, closantel decreased the infection-induced up-regulation of mRNA levels of the groEL, mce, loa22 and ligB genes (P < 0.01). Expression levels of the major OMP antigens of L. interrogans strain Lai change notably during infection of human macrophages. A group of OmpR- and HK-encoding genes may play a major role in down-regulating the expression of some OMP antigens during infection.
Sun, Wen-Chong; Liang, Zuo-Di; Pei, Ling
2015-12-01
Propofol exerts neurotoxic effects on the developing mammalian brain, but the underlying molecular mechanism remains unclear. MicroRNAs (miRNAs) are a class of small noncoding RNAs that modulate gene expression at the post-transcriptional level. However, in specific types of neurocytes, the detailed functions of miRNAs are not entirely understood. We investigated the potential role of miRNAs in astrocyte pathogenesis caused by propofol. We performed genome-wide microRNA expression profiling in immature cultured hippocampal astrocytes by microarray analysis and predicted their targets and functions using bioinformatics tools. The functional effects of one differentially expressed miRNA were examined experimentally in relation to astrocyte viability. The results showed that 13 miRNAs were significantly differentially expressed after both short-term exposure to high-concentration propofol (10 μg/ml for 1 h) and long-term exposure to low-concentration propofol (0.9 μg/ml for 48 h), including rno-miR-665, which differed significantly between the two. Bioinformatics predicted putative binding sites for rno-miR-665 in the 3'-untranslated region of Bcl-2-like protein 1 (BCL2L1, Bcl-xl) mRNA. Moreover, this relationship was assessed by luciferase reporter assay, qRT-PCR and western blot. Rno-miR-665, which was significantly up-regulated by propofol, can suppress BCL2L1 and elevate cleaved caspase-3 expression in immature astrocytes in vitro. Apoptosis of developing hippocampal astrocytes was thus significantly influenced by propofol or rno-miR-665, or both. Taken together, rno-miR-665 is involved in the neurotoxicity induced by propofol via a caspase-3 mediated mechanism by negatively regulating BCL2L1. It might act as an alternative therapeutic target for treatment of neurological disorders in paediatric prolonged anesthesia or sedation with propofol clinically. Copyright © 2015. Published by Elsevier B.V.
Ma, Xinlong; Shang, Feng; Zhu, Weidong; Lin, Qingtang
2017-09-01
CXCR4 is an oncogene in glioblastoma multiforme (GBM) but the mechanism of its dysregulation and its prognostic value in GBM have not been fully understood. Bioinformatic analysis was performed by using R2 and the UCSC Xena browser based on data from GSE16011 in GEO datasets and in GBM cohort in TCGA database (TCGA-GBM). Kaplan Meier curves of overall survival (OS) were generated to assess the association between CXCR4 expression/methylation and OS in patients with GBM. GBM patients with high CXCR4 expression had significantly worse 5 and 10 yrs OS (p < 0.05). Across different GBM subtypes, there was an inverse relationship between overall DNA methylation and CXCR4 expression. CXCR4 expression was significantly lower in CpG island methylation phenotype (CIMP) group than in non CIMP group. Log rank test results showed that patients with high CXCR4 methylation (first tertile) had significantly better 5 yrs OS (p = 0.038). CXCR4 expression is regulated by DNA methylation in GBM and its low expression or hypermethylation might indicate favorable OS in GBM patients.
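The Kaplan Meier curves mentioned in the abstract above are built with the product-limit estimator. The following is a minimal sketch of that estimator only; the function name `kaplan_meier` and the follow-up values are illustrative assumptions, not data or code from the study (which used R2 and the UCSC Xena browser).

```python
from collections import Counter


def kaplan_meier(times, events):
    """Product-limit estimate of the survival function.

    times  -- follow-up duration for each patient
    events -- 1 if death was observed at that time, 0 if censored
    Returns a list of (time, survival probability) at each observed death time.
    """
    deaths = Counter(t for t, e in zip(times, events) if e)
    leaving = Counter(times)      # every patient leaves the risk set at their own time
    at_risk = len(times)
    survival = 1.0
    curve = []
    for t in sorted(leaving):
        d = deaths.get(t, 0)
        if d:
            # Multiply by the conditional probability of surviving past t.
            survival *= 1 - d / at_risk
            curve.append((t, survival))
        at_risk -= leaving[t]
    return curve


if __name__ == "__main__":
    # Made-up follow-up times (years) and event flags, for illustration only.
    for t, s in kaplan_meier([1, 2, 2, 3, 4], [1, 1, 0, 1, 0]):
        print(t, round(s, 3))
```

Comparing two such curves (for example high versus low expression) would additionally require a log rank test, which is not shown here.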
H3ABioNet, a sustainable pan-African bioinformatics network for human heredity and health in Africa
Mulder, Nicola J.; Adebiyi, Ezekiel; Alami, Raouf; Benkahla, Alia; Brandful, James; Doumbia, Seydou; Everett, Dean; Fadlelmola, Faisal M.; Gaboun, Fatima; Gaseitsiwe, Simani; Ghazal, Hassan; Hazelhurst, Scott; Hide, Winston; Ibrahimi, Azeddine; Jaufeerally Fakim, Yasmina; Jongeneel, C. Victor; Joubert, Fourie; Kassim, Samar; Kayondo, Jonathan; Kumuthini, Judit; Lyantagaye, Sylvester; Makani, Julie; Mansour Alzohairy, Ahmed; Masiga, Daniel; Moussa, Ahmed; Nash, Oyekanmi; Ouwe Missi Oukem-Boyer, Odile; Owusu-Dabo, Ellis; Panji, Sumir; Patterton, Hugh; Radouani, Fouzia; Sadki, Khalid; Seghrouchni, Fouad; Tastan Bishop, Özlem; Tiffin, Nicki; Ulenga, Nzovu
2016-01-01
The application of genomics technologies to medicine and biomedical research is increasing in popularity, made possible by new high-throughput genotyping and sequencing technologies and improved data analysis capabilities. Some of the greatest genetic diversity among humans, animals, plants, and microbiota occurs in Africa, yet genomic research outputs from the continent are limited. The Human Heredity and Health in Africa (H3Africa) initiative was established to drive the development of genomic research for human health in Africa, and through recognition of the critical role of bioinformatics in this process, spurred the establishment of H3ABioNet, a pan-African bioinformatics network for H3Africa. The limitations in bioinformatics capacity on the continent have been a major contributory factor to the lack of notable outputs in high-throughput biology research. Although pockets of high-quality bioinformatics teams have existed previously, the majority of research institutions lack experienced faculty who can train and supervise bioinformatics students. H3ABioNet aims to address this dire need, specifically in the area of human genetics and genomics, but knock-on effects are ensuring this extends to other areas of bioinformatics. Here, we describe the emergence of genomics research and the development of bioinformatics in Africa through H3ABioNet. PMID:26627985
Legendre, Marine; Rodriguez-Ballesteros, Montserrat; Rossi, Massimiliano; Abadie, Véronique; Amiel, Jeanne; Revencu, Nicole; Blanchet, Patricia; Brioude, Frédéric; Delrue, Marie-Ange; Doubaj, Yassamine; Sefiani, Abdelaziz; Francannet, Christine; Holder-Espinasse, Muriel; Jouk, Pierre-Simon; Julia, Sophie; Melki, Judith; Mur, Sébastien; Naudion, Sophie; Fabre-Teste, Jennifer; Busa, Tiffany; Stamm, Stephen; Lyonnet, Stanislas; Attie-Bitach, Tania; Kitzis, Alain; Gilbert-Dussardier, Brigitte; Bilan, Frédéric
2018-02-01
CHARGE syndrome is a rare genetic disorder mainly due to de novo and private truncating mutations of the CHD7 gene. Here we report an intriguing hot spot of intronic mutations (c.5405-7G > A, c.5405-13G > A, c.5405-17G > A and c.5405-18C > A) located in CHD7 IVS25. Combining computational in silico analysis, experimental branch-point determination and in vitro minigene assays, our study explains this mutation hot spot by a particular genomic context, including the weakness of the IVS25 natural acceptor site and an unconventional lariat sequence localized outside the common 40 bp upstream of the acceptor splice site. For each of the mutations reported here, bioinformatic tools indicated a newly created 3' splice site, the existence of which was confirmed using pSpliceExpress, an easy-to-use and reliable splicing reporter tool. Our study emphasizes the idea that combining these two complementary approaches could increase the efficiency of routine molecular diagnosis.
Planning bioinformatics workflows using an expert system.
Chen, Xiaoling; Chang, Jeffrey T
2017-04-15
Bioinformatic analyses are becoming formidably more complex due to the increasing number of steps required to process the data, as well as the proliferation of methods that can be used in each step. To alleviate this difficulty, pipelines are commonly employed. However, pipelines are typically implemented to automate a specific analysis, and thus are difficult to use for exploratory analyses requiring systematic changes to the software or parameters used. To automate the development of pipelines, we have investigated expert systems. We created the Bioinformatics ExperT SYstem (BETSY) that includes a knowledge base where the capabilities of bioinformatics software are explicitly and formally encoded. BETSY is a backwards-chaining rule-based expert system comprised of a data model that can capture the richness of biological data, and an inference engine that reasons on the knowledge base to produce workflows. Currently, the knowledge base is populated with rules to analyze microarray and next generation sequencing data. We evaluated BETSY and found that it could generate workflows that reproduce and go beyond previously published bioinformatics results. Finally, a meta-investigation of the workflows generated from the knowledge base produced a quantitative measure of the technical burden imposed by each step of bioinformatics analyses, revealing the large number of steps devoted to the pre-processing of data. In sum, an expert system approach can facilitate exploratory bioinformatic analysis by automating the development of workflows, a task that requires significant domain expertise. Availability: https://github.com/jefftc/changlab. Contact: jeffrey.t.chang@uth.tmc.edu. © The Author 2017. Published by Oxford University Press. All rights reserved. For Permissions, please e-mail: journals.permissions@oup.com
Planning bioinformatics workflows using an expert system
Chen, Xiaoling; Chang, Jeffrey T.
2017-01-01
Abstract Motivation: Bioinformatic analyses are becoming formidably more complex due to the increasing number of steps required to process the data, as well as the proliferation of methods that can be used in each step. To alleviate this difficulty, pipelines are commonly employed. However, pipelines are typically implemented to automate a specific analysis, and thus are difficult to use for exploratory analyses requiring systematic changes to the software or parameters used. Results: To automate the development of pipelines, we have investigated expert systems. We created the Bioinformatics ExperT SYstem (BETSY) that includes a knowledge base where the capabilities of bioinformatics software are explicitly and formally encoded. BETSY is a backwards-chaining rule-based expert system comprised of a data model that can capture the richness of biological data, and an inference engine that reasons on the knowledge base to produce workflows. Currently, the knowledge base is populated with rules to analyze microarray and next generation sequencing data. We evaluated BETSY and found that it could generate workflows that reproduce and go beyond previously published bioinformatics results. Finally, a meta-investigation of the workflows generated from the knowledge base produced a quantitative measure of the technical burden imposed by each step of bioinformatics analyses, revealing the large number of steps devoted to the pre-processing of data. In sum, an expert system approach can facilitate exploratory bioinformatic analysis by automating the development of workflows, a task that requires significant domain expertise. Availability and Implementation: https://github.com/jefftc/changlab Contact: jeffrey.t.chang@uth.tmc.edu PMID:28052928
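The backwards-chaining idea described in the abstract above can be illustrated with a toy planner: start from the desired result, look up the rule that produces it, and recurse on that rule's inputs until only data the user already has remain. This is a minimal sketch under stated assumptions; the rule table, tool names and the `plan` function are hypothetical and do not reflect BETSY's actual knowledge base or rule syntax.

```python
# Hypothetical knowledge base: output data type -> (tool name, required input types).
RULES = {
    "differential_expression": ("run_diff_expr", ["normalized_signal", "class_labels"]),
    "normalized_signal": ("normalize", ["raw_signal"]),
    "raw_signal": ("read_cel_files", ["cel_files"]),
}


def plan(goal, available, steps=None):
    """Backward-chain from `goal` to data in `available`.

    Returns the tools to run, in execution order. This toy version assumes
    exactly one rule per data type and no cyclic rules.
    """
    if steps is None:
        steps = []
    if goal in available:
        return steps                      # nothing to compute
    if goal not in RULES:
        raise ValueError(f"no rule produces {goal!r}")
    tool, inputs = RULES[goal]
    for needed in inputs:
        plan(needed, available, steps)    # satisfy prerequisites first
    if tool not in steps:
        steps.append(tool)
    return steps


if __name__ == "__main__":
    print(plan("differential_expression", {"cel_files", "class_labels"}))
```

A real expert system would also have to choose among several rules that produce the same data type and carry parameters along with each data object, which is where most of the complexity lies.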
Prediction of Acute Mountain Sickness using a Blood-Based Test
2016-01-01
2015): In quarter 17 we focused on two major tasks: getting the RNA purified and ready for chip analysis and working on the bioinformatics ... bioinformatics organization of all the data we will examine for this study. To remind the reviewer, we have a primary dataset of ~120 subjects who were studied...companion study, AltitudeOmics, to the database of gene studies to be analyzed for AMS prediction • expansion of a bioinformatics team to include an
THULIN, PETRA; WEI, TIANLING; WERNGREN, OLIVERA; CHEUNG, LOUISA; FISHER, RACHEL M.; GRANDÉR, DAN; CORCORAN, MARTIN; EHRENBORG, EWA
2013-01-01
PPARδ is involved in the inflammatory response and its expression is induced by cytokines, however, limited knowledge has been produced regarding its regulation. Since recent findings have shown that microRNAs, which are small non-coding RNAs that regulate gene expression, are involved in the immune response, we set out to investigate whether PPARδ can be regulated by microRNAs expressed in monocytes. Bioinformatic analysis identified a putative miR-9 target site within the 3′-UTR of PPARδ that was subsequently verified to be functional using reporter constructs. Primary human monocytes stimulated with LPS showed a downregulation of PPARδ and its target genes after 4 h while the expression of miR-9 was induced. Analysis of pro-inflammatory (M1) and anti-inflammatory (M2) macrophages showed that human PPARδ mRNA as well as miR-9 expression was higher in M1 compared to M2 macrophages. Furthermore, treatment with the PPARδ agonist, GW501516, induced the expression of PPARδ target genes in the pro-inflammatory M1 macrophages while no change was observed in the anti-inflammatory M2 macrophages. Taken together, these data suggest that PPARδ is regulated by miR-9 in monocytes and that activation of PPARδ may be of importance in M1 pro-inflammatory but not in M2 anti-inflammatory macrophages in humans. PMID:23525285
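Putative microRNA target sites such as the one described above are commonly predicted by searching the 3′-UTR for a stretch complementary to the miRNA "seed" (nucleotides 2 to 8). The sketch below shows only that basic seed-matching step; the sequences are invented placeholders rather than the real miR-9 or PPARδ sequences, and the abstract does not state which prediction tool the authors used.

```python
COMPLEMENT = {"A": "U", "U": "A", "G": "C", "C": "G"}


def seed_sites(mirna, utr):
    """Return 0-based positions in `utr` that pair perfectly with the miRNA seed.

    Both sequences are RNA strings written 5'->3'. The seed is taken as
    miRNA nucleotides 2-8; the target site is its reverse complement.
    """
    seed = mirna[1:8]
    site = "".join(COMPLEMENT[base] for base in reversed(seed))
    return [i for i in range(len(utr) - len(site) + 1)
            if utr[i:i + len(site)] == site]


if __name__ == "__main__":
    toy_mirna = "AACCGGUUACGU"        # placeholder, not a real miRNA
    toy_utr = "GGGAACCGGUCCC"         # placeholder 3'-UTR fragment
    print(seed_sites(toy_mirna, toy_utr))
```

As in the study, a computational hit of this kind is only a candidate and needs experimental confirmation, for example with reporter constructs.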
Wang, Jian; Qi, Meng-Die; Guo, Juan; Shen, Ye; Lin, Hui-Xin; Huang, Lu-Qi
2017-03-01
Andrographis paniculata has long been widely used as a medicinal herb in China, and andrographolide is its main medicinal constituent. To investigate the underlying andrographolide biosynthesis mechanisms, RNA-seq of A. paniculata leaves treated with methyl jasmonate (MeJA) was performed. In the A. paniculata transcriptomic data, the expression pattern of one member of the NAC transcription factor family (ApNAC1) matched andrographolide accumulation. The coding sequence of ApNAC1 was cloned by RT-PCR, and its GenBank accession number is KY196416. Bioinformatics analysis showed that the gene encodes a peptide of 323 amino acids, with a predicted relative molecular weight of 35.9 kDa and an isoelectric point of 6.14. To confirm the subcellular localization, ApNAC1-GFP was transiently expressed in A. paniculata protoplasts. The results indicated that ApNAC1 is a nucleus-localized protein. Real-time quantitative PCR revealed that the ApNAC1 gene is predominantly expressed in leaves. Compared with the control sample, its expression abundance increased sharply with methyl jasmonate treatment. Based on its expression pattern, the ApNAC1 gene might be involved in andrographolide biosynthesis. ApNAC1 was heterologously expressed in Escherichia coli and the recombinant protein was purified with Ni-NTA agarose. Further study will help us to understand the function of ApNAC1 in andrographolide biosynthesis. Copyright© by the Chinese Pharmaceutical Association.
Image analysis tools and emerging algorithms for expression proteomics
English, Jane A.; Lisacek, Frederique; Morris, Jeffrey S.; Yang, Guang-Zhong; Dunn, Michael J.
2012-01-01
Since their origins in academic endeavours in the 1970s, computational analysis tools have matured into a number of established commercial packages that underpin research in expression proteomics. In this paper we describe the image analysis pipeline for the established 2-D Gel Electrophoresis (2-DE) technique of protein separation, and by first covering signal analysis for Mass Spectrometry (MS), we also explain the current image analysis workflow for the emerging high-throughput ‘shotgun’ proteomics platform of Liquid Chromatography coupled to MS (LC/MS). The bioinformatics challenges for both methods are illustrated and compared, whilst existing commercial and academic packages and their workflows are described from both a user’s and a technical perspective. Attention is given to the importance of sound statistical treatment of the resultant quantifications in the search for differential expression. Despite wide availability of proteomics software, a number of challenges have yet to be overcome regarding algorithm accuracy, objectivity and automation, generally due to deterministic spot-centric approaches that discard information early in the pipeline, propagating errors. We review recent advances in signal and image analysis algorithms in 2-DE, MS, LC/MS and Imaging MS. Particular attention is given to wavelet techniques, automated image-based alignment and differential analysis in 2-DE, Bayesian peak mixture models and functional mixed modelling in MS, and group-wise consensus alignment methods for LC/MS. PMID:21046614
Prabhanjan, Manasa; Suresh, Raviraj V; Murthy, Megha N; Ramachandra, Nallur B
2016-03-01
To identify the role of copy number variations (CNVs) in disease risk genes and their effect on disease phenotypes in type 2 diabetes mellitus (T2DM) in 12 random populations using high throughput arrays. CNV analysis was carried out on a total of 1715 individuals from 12 populations, from the ArrayExpress Archive of the European Bioinformatics Institute along with our subjects, using the Affymetrix Genome Wide SNP 6.0 array. CNV effects on T2DM genes were analyzed using several bioinformatics tools, and a molecular protein interaction network was constructed to identify the disease mechanism altered by the CNVs. Analysis showed 34.4% of the total population to be under CNV burden for T2DM, with 83 disease causal and associated genes being under CNV influence. Hotspots were identified on chromosomes 22, 12, 6, 19 and 11. Overlap studies with case cohorts revealed significant disease risk genes such as EGFR, E2F1, PPP1R3A, HLA and TSPAN8. CNVs play a significant role in predisposing to T2DM in normal cohorts and contribute to the phenotypic effects. Thus, CNVs should be considered as one of the major contributors to predisposition to the disease. Copyright © 2016 Elsevier Ireland Ltd. All rights reserved.
NASA Astrophysics Data System (ADS)
Bryzgunova, O. E.; Lekchnov, E. A.; Zaripov, M. M.; Yurchenko, Yu. B.; Yarmoschuk, S. V.; Pashkovskaya, O. A.; Rykova, E. Yu.; Zheravin, A. A.; Laktionov, P. P.
2017-09-01
Presence of tumor-derived cell-free miRNA in biological fluids as well as simplicity and robustness of cell-free miRNA quantification makes them suitable markers for cancer diagnostics. Based on previously published data demonstrating diagnostic potentialities of miR-205 in blood and miR-19b as well as miR-125b in urine of prostate cancer patients, bioinformatics analysis was carried out to follow their involvement in prostate cancer development and select additional miRNA-markers for prostate cancer diagnostics. Studied miRNAs are involved in different signaling pathways and regulate a number of genes involved in cancer development. Five of their targets (CCND1, BRAF, CCNE1, CCNE2, RAF1), according to the STRING database, act as part of the same signaling pathway. RAF1 is regulated by miR-19b and miR-125b, and it was shown to be involved in prostate cancer development by DIANA and STRING databases. Thus, other microRNAs regulating RAF1 expression such as miR-16, -195, -497, and -7 (suggested by DIANA, TargetScan, MiRTarBase and miRDB databases) can potentially be regarded as prostate cancer markers.
Qaadri, Kashef [Biomatters Inc., San Francisco, CA (United States)
2018-05-21
Kashef Qaadri on "NGS for the Masses: Empowering biologists to improve bioinformatic productivity" at the 2012 Sequencing, Finishing, Analysis in the Future Meeting held June 5-7, 2012 in Santa Fe, New Mexico.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Qaadri, Kashef
2012-06-01
Kashef Qaadri on "NGS for the Masses: Empowering biologists to improve bioinformatic productivity" at the 2012 Sequencing, Finishing, Analysis in the Future Meeting held June 5-7, 2012 in Santa Fe, New Mexico.
Han, Jingjia; Gerstenhaber, Jonathan A; Lazarovici, Philip; Lelkes, Peter I
2013-05-13
All blood vessels are lined with a quiescent endothelium, which aids in regulating regular blood flow and avoiding thrombus formation. Current attempts at replacing diseased blood vessels frequently fail due to the intrinsic thrombogenicity of the materials used as vascular grafts. Extending our previous work, in which we introduced new candidate scaffolds for vascular grafts electrospun from a blend solution of PLGA, gelatin, and elastin (PGE), this study aimed to evaluate the potential of PGE scaffolds to support nonthrombogenic monolayers of primary isolates of human aortic endothelial cells (HAECs), as assessed by a combination of biochemical, molecular, and bioinformatics-based analyses. After 24 h of culture on 3-D fibrous PGE scaffolds, HAECs formed a confluent, nonthrombogenic, and physiologically competent monolayer, as assessed by tissue factor (TF) gene expression and protein activity assays. The levels of TF mRNA/protein activity in HAECs grown on PGE scaffolds were similar to those on gelatin or collagen IV-coated 2-D surfaces. In addition, bioinformatics-based analysis of a focused microarray containing 84 ECM-related cDNA probes demonstrated that HAECs essentially expressed a histotypic ECM-related "transcriptome" on PGE scaffolds, where cells were more quiescent than cells cultured on 2-D coverslips coated with gelatin (a well-known "inert" substrate for conventional EC culture), but less so than on 2-D PGE films. These data suggest an important role for nanorough substrates (PGE films) in passivating endothelial cells and confirm the crucial effect of substrate composition in this process. Principal component analysis of microarray data on the above substrates (including collagen IV) implied that substrate composition plays a greater role than surface topography in affecting the endothelial ECM-related "transcriptome". Taken together, our findings suggest that electrospun PGE scaffolds are potentially suitable for application in small diameter vascular tissue engineering.
MAAMD: a workflow to standardize meta-analyses and comparison of affymetrix microarray data
2014-01-01
Background: Mandatory deposit of raw microarray data files for public access, prior to study publication, provides significant opportunities to conduct new bioinformatics analyses within and across multiple datasets. Analysis of raw microarray data files (e.g. Affymetrix CEL files) can be time consuming and complex, and requires fundamental computational and bioinformatics skills. The development of analytical workflows to automate these tasks simplifies the processing of, improves the efficiency of, and serves to standardize multiple and sequential analyses. Once installed, workflows facilitate the tedious steps required to run rapid intra- and inter-dataset comparisons. Results: We developed a workflow to facilitate and standardize Meta-Analysis of Affymetrix Microarray Data analysis (MAAMD) in Kepler. Two freely available stand-alone software tools, R and AltAnalyze, were embedded in MAAMD. The inputs of MAAMD are user-editable csv files, which contain sample information and parameters describing the locations of input files and required tools. MAAMD was tested by analyzing 4 different GEO datasets from mice and Drosophila. MAAMD automates data downloading, data organization, data quality control assessment, differential gene expression analysis, clustering analysis, pathway visualization, gene-set enrichment analysis, and cross-species orthologous-gene comparisons. MAAMD was utilized to identify gene orthologues responding to hypoxia or hyperoxia in both mice and Drosophila. The entire set of analyses for 4 datasets (34 total microarrays) finished in approximately one hour. Conclusions: MAAMD saves time, minimizes the required computer skills, and offers a standardized procedure for users to analyze microarray datasets and make new intra- and inter-dataset comparisons. PMID:24621103
Stajdohar, Miha; Rosengarten, Rafael D; Kokosar, Janez; Jeran, Luka; Blenkus, Domen; Shaulsky, Gad; Zupan, Blaz
2017-06-02
Dictyostelium discoideum, a soil-dwelling social amoeba, is a model for the study of numerous biological processes. Research in the field has benefited mightily from the adoption of next-generation sequencing for genomics and transcriptomics. Dictyostelium biologists now face the widespread challenges of analyzing and exploring high dimensional data sets to generate hypotheses and discover novel insights. We present dictyExpress (2.0), a web application designed for exploratory analysis of gene expression data, as well as data from related experiments such as Chromatin Immunoprecipitation sequencing (ChIP-Seq). The application features visualization modules that include time course expression profiles, clustering, gene ontology enrichment analysis, differential expression analysis and comparison of experiments. All visualizations are interactive and interconnected, such that the selection of genes in one module propagates instantly to visualizations in other modules. dictyExpress currently stores the data from over 800 Dictyostelium experiments and is embedded within a general-purpose software framework for management of next-generation sequencing data. dictyExpress allows users to explore their data in a broader context by reciprocal linking with dictyBase, a repository of Dictyostelium genomic data. In addition, we introduce a companion application called GenBoard, an intuitive graphic user interface for data management and bioinformatics analysis. dictyExpress and GenBoard enable broad adoption of next generation sequencing based inquiries by the Dictyostelium research community. Labs without the means to undertake deep sequencing projects can mine the data available to the public. The entire information flow, from raw sequence data to hypothesis testing, can be accomplished in an efficient workspace. The software framework is generalizable and represents a useful approach for any research community. To encourage wider usage, the backend is open-source, available for extension and further development by bioinformaticians and data scientists.
Proteomic profiling of early degenerative retina of RCS rats.
Zhu, Zhi-Hong; Fu, Yan; Weng, Chuan-Huang; Zhao, Cong-Jian; Yin, Zheng-Qin
2017-01-01
To identify the underlying cellular and molecular changes in retinitis pigmentosa (RP). Label-free quantification-based proteomics analysis, with its advantages of being more economical and involving simpler procedures, has been used with increasing frequency in modern biological research. Dystrophic RCS rats, the first laboratory animal model for the study of RP, follow a pathological course similar to that of humans with the disease. Thus, we employed a comparative proteomics analysis approach for in-depth proteome profiling of retinas from dystrophic RCS rats and non-dystrophic congenic controls through Linear Trap Quadrupole-Orbitrap MS/MS, to identify the significantly differentially expressed proteins (DEPs). Bioinformatics analyses, including Gene Ontology (GO) and Kyoto Encyclopedia of Genes and Genomes (KEGG) pathway annotation and upstream regulatory analysis, were then performed on these retinal proteins. Finally, a Western blotting experiment was carried out to verify the difference in the abundance of transcription factor E2F1. In this study, we identified a total of 2375 protein groups from the retinal protein samples of RCS rats and non-dystrophic congenic controls. Four hundred thirty-four significant DEPs were selected by Student's t-test. Based on the results of the bioinformatics analysis, we identified mitochondrial dysfunction and transcription factor E2F1 as the key initiation factors in the early retinal degenerative process. We showed that mitochondrial dysfunction and the transcription factor E2F1 substantially contribute to the disease etiology of RP. The results provide a new potential therapeutic approach for this retinal degenerative disease.
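Selecting differentially expressed proteins by Student's t-test, as described above, amounts to computing one test statistic per protein from its intensities in the two groups. The following is a minimal sketch of the classical pooled-variance statistic; the equal-variance form is an assumption (the abstract does not say which variant was used), and the function name and intensity values are invented for illustration.

```python
from math import sqrt
from statistics import mean, variance


def student_t(group_a, group_b):
    """Two-sample Student's t statistic with pooled (equal) variance."""
    na, nb = len(group_a), len(group_b)
    pooled = ((na - 1) * variance(group_a) + (nb - 1) * variance(group_b)) / (na + nb - 2)
    return (mean(group_a) - mean(group_b)) / sqrt(pooled * (1 / na + 1 / nb))


if __name__ == "__main__":
    # Made-up log-intensities for one protein in dystrophic vs. control retinas.
    dystrophic = [1.0, 2.0, 3.0]
    control = [4.0, 5.0, 6.0]
    print(round(student_t(dystrophic, control), 4))
```

Turning the statistic into a p-value requires the t distribution with na + nb - 2 degrees of freedom, and with thousands of proteins tested a multiple-testing correction would usually be considered as well.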
Accessing and Integrating Data and Knowledge for Biomedical Research
Burgun, A.; Bodenreider, O.
2008-01-01
Summary. Objectives: To review the issues that have arisen with the advent of translational research in terms of integration of data and knowledge, and survey current efforts to address these issues. Methods: Using examples from the biomedical literature, we identified new trends in biomedical research and their impact on bioinformatics. We analyzed the requirements for effective knowledge repositories and studied issues in the integration of biomedical knowledge. Results: New diagnostic and therapeutic approaches based on gene expression patterns have brought about new issues in the statistical analysis of data, and new workflows are needed to support translational research. Interoperable data repositories based on standard annotations, infrastructures and services are needed to support the pooling and meta-analysis of data, as well as their comparison to earlier experiments. High-quality, integrated ontologies and knowledge bases serve as a source of prior knowledge used in combination with traditional data mining techniques and contribute to the development of more effective data analysis strategies. Conclusion: As biomedical research evolves from traditional clinical and biological investigations towards omics sciences and translational research, specific needs have emerged, including integrating data collected in research studies with patient clinical data, linking omics knowledge with medical knowledge, modeling the molecular basis of diseases, and developing tools that support in-depth analysis of research data. As such, translational research illustrates the need to bridge the gap between bioinformatics and medical informatics, and opens new avenues for biomedical informatics research. PMID:18660883
Hu, Qing-bi; He, Yu; Zhou, Xun
2015-01-01
Species included in the Sporothrix schenckii complex are temperature-dependent with dimorphic growth and cause sporotrichosis that is characterized by chronic and fatal lymphocutaneous lesions. The putative species included in the Sporothrix complex are S. brasiliensis, S. globosa, S. mexicana, S. pallida, S. schenckii, and S. lurei. S. globosa is the causal agent of sporotrichosis in China, and its pathogenicity appears to be closely related to the dimorphic transition, i.e. from the mycelial to the yeast phase, it adapts to changing environmental conditions. To determine the molecular mechanisms of the switching process that mediates the dimorphic transition of S. globosa, suppression subtractive hybridization (SSH) was used to prepare a complementary DNA (cDNA) subtraction library from the yeast and mycelial phases. Bioinformatics analysis was performed to profile the relationship between differently expressed genes and the dimorphic transition. Two genes that were expressed at higher levels by the yeast form were selected, and their differential expression levels were verified using a quantitative real-time reverse transcriptase polymerase chain reaction (qRT-PCR). It is believed that these differently expressed genes are involved in the pathogenesis of S. globosa infection in China. PMID:26642182
[DNA microarray reveals changes in gene expression of endothelial cells under shear stress].
Cheng, Min; Zhang, Wensheng; Chen, Huaiqing; Wu, Wenchao; Huang, Hua
2004-04-01
cDNA microarray technology is used as a powerful tool for rapid, comprehensive, and quantitative analysis of gene profiles of cultured human umbilical vein endothelial cells (HUVECs) in the normal static group and the shear stressed (4.20 dyne/cm2, 2 h) group. The total RNA from normal static cultured HUVECs was labeled by Cy3-dCTP, and total RNA of HUVECs from the paired shear stressed experiment was labeled by Cy5-dCTP. The expression ratios reported are the average from the two separate experiments. After bioinformatics analysis, we identified a total of 108 genes (approximately 0.026%) revealing differential expression. Of these, 53 genes were up-regulated, the most enhanced being human homolog of yeast IPP isomerase, human low density lipoprotein receptor gene, squalene epoxidase gene, and 7-dehydrocholesterol reductase, and 55 were down-regulated, the most decreased being heat shock 70 kD protein 1 and TCB gene encoding cytosolic thyroid hormone-binding protein, in HUVECs exposed to low shear stress. These results indicate that the cDNA microarray technique is effective in screening the differentially expressed genes in endothelial cells induced by various experimental conditions, and the data may stimulate further research.
bioalcidae, samjs and vcffilterjs: object-oriented formatters and filters for bioinformatics files.
Lindenbaum, Pierre; Redon, Richard
2018-04-01
Reformatting and filtering bioinformatics files are common tasks for bioinformaticians. Standard Linux tools and specific programs are usually used to perform such tasks but there is still a gap between using these tools and the programming interface of some existing libraries. In this study, we developed a set of tools namely bioalcidae, samjs and vcffilterjs that reformat or filter files using a JavaScript engine or a pure java expression and taking advantage of the java API for high-throughput sequencing data (htsjdk). https://github.com/lindenb/jvarkit. pierre.lindenbaum@univ-nantes.fr.
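The idea of filtering a bioinformatics file with a user-supplied expression, as described in the abstract above, can be sketched in a few lines. This is only an illustration in Python and not the jvarkit tools or their API: the function name `filter_records`, the record fields, and the toy VCF-like lines are all hypothetical, and the sketch assumes a numeric QUAL column.

```python
def filter_records(lines, predicate):
    """Yield header lines unchanged and data lines whose record passes `predicate`."""
    for line in lines:
        if line.startswith("#"):
            # Header and meta lines are always kept.
            yield line
            continue
        chrom, pos, _id, ref, alt, qual = line.rstrip("\n").split("\t")[:6]
        # Hypothetical record layout; assumes QUAL is numeric in this toy input.
        record = {"chrom": chrom, "pos": int(pos), "ref": ref,
                  "alt": alt, "qual": float(qual)}
        if predicate(record):
            yield line


# Toy VCF-like input (not real data).
vcf = [
    "##fileformat=VCFv4.2",
    "#CHROM\tPOS\tID\tREF\tALT\tQUAL",
    "chr1\t100\t.\tA\tT\t50",
    "chr1\t200\t.\tG\tC\t10",
]

# The "expression" is an ordinary Python callable applied to each record.
kept = list(filter_records(vcf, lambda r: r["qual"] >= 30))
```

Passing the filter as a callable keeps the parsing loop generic, which is the gap between line-oriented tools and a programming interface that the abstract points to.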
Khorsandi, Shirin Elizabeth; Quaglia, Alberto; Salehi, Siamak; Jassem, Wayel; Vilca-Melendez, Hector; Prachalias, Andreas; Srinivasan, Parthi; Heaton, Nigel
2015-01-01
Donation after cardiac death (DCD) livers are marginal organs for transplant and their use is associated with a higher risk of primary non function (PNF) or early graft dysfunction (EGD). The aim was to determine if microRNA (miRNA) was able to discriminate between DCD livers of varying clinical outcome. DCD groups were categorized as PNF retransplanted within a week (n=7), good functional outcome (n=7) peak aspartate transaminase (AST) ≤ 1000 IU/L and EGD (n=9) peak AST ≥ 2500 IU/L. miRNA was extracted from archival formalin fixed post-perfusion tru-cut liver biopsies. High throughput expression analysis was performed using miRNA arrays. Bioinformatics for expression data analysis was performed and validated with real time quantitative PCR (RT-qPCR). The function of miRNA of interest was investigated using computational biology prediction algorithms. From the array analysis 16 miRNAs were identified as significantly different (p<0.05). On RT-qPCR miR-155 and miR-940 had the highest expression across all three DCD clinical groups. Only one miRNA, miR-22, was validated with marginal significance, to have differential expression between the three groups (p=0.049). From computational biology miR-22 was predicted to affect signalling pathways that impact protein turnover, metabolism and apoptosis/cell cycle. In conclusion, microRNA expression patterns have a low diagnostic potential clinically in discriminating DCD liver quality and outcome.
Yang, Hong; Lin, Shan; Cui, Jingru
2014-02-10
Arsenic trioxide (ATO) is presently the most active single agent in the treatment of acute promyelocytic leukemia (APL). In order to explore the molecular mechanism of ATO in leukemia cells over time, we adopted a bioinformatics strategy to analyze expression changing patterns and changes in transcription regulation modules of time series genes filtered from the Gene Expression Omnibus database (GSE24946). In total, we screened out 1847 time series genes for subsequent analysis. The KEGG (Kyoto Encyclopedia of Genes and Genomes) pathway enrichment analysis of these genes showed that oxidative phosphorylation and ribosome were the top 2 significantly enriched pathways. STEM software was employed to compare changing patterns of gene expression with 50 assigned expression patterns. We screened out 7 significantly enriched patterns and 4 tendency charts of time series genes. The result of Gene Ontology showed that functions of time series genes were mainly distributed in profiles 41, 40, 39 and 38. Seven genes with positive regulation of cell adhesion function were enriched in profile 40, and presented the same first-increased, then-decreased model as profile 40. The transcription module analysis showed that they were mainly involved in the oxidative phosphorylation pathway and the ribosome pathway. Overall, our data summarized the gene expression changes in ATO-treated K562-r cell lines over time and suggested that time series genes mainly regulated cell adhesion. Furthermore, our result may provide a theoretical basis of molecular biology in treating acute promyelocytic leukemia. Copyright © 2013 Elsevier B.V. All rights reserved.
OP17 MICRORNA PROFILING USING SMALL RNA-SEQ IN PAEDIATRIC LOW GRADE GLIOMAS
Jeyapalan, Jennie N.; Jones, Tania A.; Tatevossian, Ruth G.; Qaddoumi, Ibrahim; Ellison, David W.; Sheer, Denise
2014-01-01
INTRODUCTION: MicroRNAs regulate gene expression by targeting mRNAs for translational repression or degradation at the post-transcriptional level. In paediatric low-grade gliomas a few key genetic mutations have been identified, including BRAF fusions, FGFR1 duplications and MYB rearrangements. Our aim in the current study is to profile aberrant microRNA expression in paediatric low-grade gliomas and determine the role of epigenetic changes in the aetiology and behaviour of these tumours. METHOD: MicroRNA profiling of tumour samples (6 pilocytic, 2 diffuse, 2 pilomyxoid astrocytomas) and normal brain controls (4 adult normal brain samples and a primary glial progenitor cell-line) was performed using small RNA sequencing. Bioinformatic analysis included sequence alignment, analysis of the number of reads (CPM, counts per million) and differential expression. RESULTS: Sequence alignment identified 695 microRNAs, whose expression was compared in tumours v. normal brain. PCA and hierarchical clustering showed separate groups for tumours and normal brain. Computational analysis identified approximately 400 differentially expressed microRNAs in the tumours compared to matched location controls. Our findings will then be validated and integrated with extensive genetic and epigenetic information we have previously obtained for the full tumour cohort. CONCLUSION: We have identified microRNAs that are differentially expressed in paediatric low-grade gliomas. As microRNAs are known to target genes involved in the initiation and progression of cancer, they provide critical information on tumour pathogenesis and are an important class of biomarkers.
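The CPM (counts per million) measure named in the abstract above can be illustrated with a minimal sketch. It assumes the usual definition, a raw count divided by the library's total reads and multiplied by one million; the function name and the toy read counts are hypothetical and are not data from the study.

```python
def counts_per_million(counts):
    """Scale raw read counts so that the library sums to one million."""
    total = sum(counts.values())
    if total == 0:
        raise ValueError("library has no reads")
    return {name: count * 1_000_000 / total for name, count in counts.items()}


# Hypothetical toy library with three microRNAs (not data from the study).
sample = {"miR-A": 50, "miR-B": 150, "miR-C": 800}
cpm = counts_per_million(sample)
```

Because each library is rescaled to the same total, CPM values can be compared between samples sequenced to different depths, which is a precondition for the differential expression step.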
Modern Computational Techniques for the HMMER Sequence Analysis
2013-01-01
This paper focuses on the latest research and critical reviews on modern computing architectures and on software and hardware accelerated algorithms for bioinformatics data analysis, with an emphasis on one of the most important sequence analysis applications, hidden Markov models (HMM). We show a detailed performance comparison of sequence analysis tools on various computing platforms recently developed in the bioinformatics community. The characteristics of sequence analysis, such as its data- and compute-intensive nature, make it very attractive to optimize and parallelize using both traditional software approaches and innovative hardware acceleration technologies. PMID:25937944
Tripathi, Anita; Goswami, Kavita; Sanan-Mishra, Neeti
2015-01-01
microRNAs (miRs) are a class of 21–24 nucleotide long non-coding RNAs responsible for regulating the expression of associated genes mainly by cleavage or translational inhibition of the target transcripts. With this characteristic of silencing, miRs act as an important component in the regulation of plant responses under various stress conditions. In recent years, with drastic changes in environmental and soil conditions, different types of stresses have emerged as a major challenge for plant growth and productivity. The identification and profiling of miRs has itself been a challenge for researchers, given their small size and the large number of probable sequences in the genome. Application of computational approaches has expedited the identification of miRs and their expression profiling in different conditions. The development of High-Throughput Sequencing (HTS) techniques has provided access to the global profiles of the miRs for understanding their mode of action in plants. The introduction of various bioinformatics databases and tools has revolutionized the study of miRs and other small RNAs. This review focuses on the role of bioinformatics approaches in the identification and study of the regulatory roles of plant miRs in the adaptive response to stresses. PMID:26578966
Li, Xiao-Jiao; Pang, Jin-Shu; Li, Yao-Mei; Ahmed, Farah Abdirahman; He, Rong-Quan; Ma, Jie; Ma, Fu-Chao; Chen, Gang
2018-03-01
An increasing number of studies have confirmed that survivin (BIRC5) plays essential roles in ovarian cancer. Nevertheless, inconsistent or controversial results exist in some studies. In the present study, we sought to determine the clinical significance of survivin and its potential molecular pathways. The correlation between survivin (BIRC5) expression and diagnostic value, prognostic value and clinicopathological features was assessed by meta-analysis with more than 4000 patients from literature, GEO and TCGA. In addition, the potential molecular mechanism of survivin in ovarian cancer was also determined. The pooled sensitivity and specificity were 0.71 (95%CI: 0.68-0.74) and 0.97 (95%CI: 0.94-0.98), respectively. The AUC of sROC was 0.8765. The results showed that there was also a significant relationship between survivin expression and poor overall survival (HR: 1.24, 95%CI: 1.14-1.35, p < 0.001), disease-free survival (HR: 1.53, 95%CI: 0.57-4.09, p < 0.001), as well as higher recurrence rate (HR: 1.11, 95%CI: 0.97-1.27). Moreover, survivin expression was also associated with tumor progression (cancerous vs. benign, OR: 11.29, 95%CI: 8.96-14.24, p < 0.001), TNM stage (III + IV vs. I + II, OR: 5.38, 95%CI: 4.16-6.97, p < 0.001), histological grades (G3 vs. G1 ∼ G2, OR: 4.36, 95%CI: 3.29-5.77, p < 0.001), and lymphatic metastasis (metastasis vs. non-metastasis, OR: 3.35, 95%CI: 2.36-4.75, p < 0.001). Bioinformatics analysis revealed the 50 most frequently altered neighboring genes of survivin in OC, and then Gene Ontology (GO) and Kyoto Encyclopedia of Genes and Genomes (KEGG) analyses were conducted. GO analysis showed that these genes were related to signal conduction, cell cycle, apoptosis, and metabolism. KEGG pathway analysis indicated that these genes were primarily enriched in mitotic prometaphase, PLK1 signaling events and the regulation of glucokinase by the glucokinase regulatory protein.
Survivin (BIRC5) expression might become a specific but low-sensitivity biomarker in ovarian cancer patients, and its presence indicated poor prognosis and worse TNM stages. This protein might function as an oncoprotein by influencing specific pathways involving the 50 genes identified herein. Additional studies are needed to confirm these results. Copyright © 2018 Elsevier GmbH. All rights reserved.
Chae, Heejoon; Lee, Sangseon; Seo, Seokjun; Jung, Daekyoung; Chang, Hyeonsook; Nephew, Kenneth P; Kim, Sun
2016-12-01
Measuring gene expression, DNA sequence variation, and DNA methylation status is routinely done using high throughput sequencing technologies. To analyze such multi-omics data and explore relationships, reliable bioinformatics systems are much needed. Existing systems are either for exploring curated data or for processing omics data in the form of a library such as R. Thus scientists have much difficulty in investigating relationships among gene expression, DNA sequence variation, and DNA methylation using multi-omics data. In this study, we report a system called BioVLAB-mCpG-SNP-EXPRESS for the integrated analysis of DNA methylation, sequence variation (SNPs), and gene expression for distinguishing cellular phenotypes at the pairwise and multiple phenotype levels. The system can be deployed on either the Amazon cloud or a publicly available high-performance computing node, and the data analysis and exploration of the analysis result can be conveniently done using a web-based interface. In order to alleviate analysis complexity, all processes are fully automated, and a graphical workflow system is integrated to represent real-time analysis progression. The BioVLAB-mCpG-SNP-EXPRESS system works in three stages. First, it processes and analyzes multi-omics data as input in the form of raw data, i.e., FastQ files. Second, various integrated analyses such as methylation vs. gene expression and mutation vs. methylation are performed. Finally, the analysis result can be explored in a number of ways through a web interface for multi-level, multi-perspective exploration. Multi-level interpretation can be done at the gene, gene set, pathway or network level, and multi-perspective exploration can be done from the gene expression, DNA methylation, sequence variation, or relationship perspective. The utility of the system is demonstrated by analyzing a data set of 30 phenotypically distinct breast cancer cell lines.
BioVLAB-mCpG-SNP-EXPRESS is available at http://biohealth.snu.ac.kr/software/biovlab_mcpg_snp_express/. Copyright © 2016 Elsevier Inc. All rights reserved.
Winterhoff, Boris J; Maile, Makayla; Mitra, Amit Kumar; Sebe, Attila; Bazzaro, Martina; Geller, Melissa A; Abrahante, Juan E; Klein, Molly; Hellweg, Raffaele; Mullany, Sally A; Beckman, Kenneth; Daniel, Jerry; Starr, Timothy K
2017-03-01
The purpose of this study was to determine the level of heterogeneity in high grade serous ovarian cancer (HGSOC) by analyzing RNA expression in single epithelial and cancer associated stromal cells. In addition, we explored the possibility of identifying subgroups based on pathway activation and pre-defined signatures from cancer stem cells and chemo-resistant cells. A fresh HGSOC tumor specimen derived from ovary was enzymatically digested and depleted of immune infiltrating cells. RNA sequencing was performed on 92 single cells and 66 of these single cell datasets passed quality control checks. Sequences were analyzed using multiple bioinformatics tools, including clustering, principal component analysis, and gene set enrichment analysis to identify subgroups and activated pathways. Immunohistochemistry for ovarian cancer, stem cell and stromal markers was performed on adjacent tumor sections. Analysis of the gene expression patterns identified two major subsets of cells characterized by epithelial and stromal gene expression patterns. The epithelial group was characterized by proliferative genes including genes associated with oxidative phosphorylation and MYC activity, while the stromal group was characterized by increased expression of extracellular matrix (ECM) genes and genes associated with epithelial-to-mesenchymal transition (EMT). Neither group expressed a signature correlating with published chemo-resistant gene signatures, but many cells, predominantly in the stromal subgroup, expressed markers associated with cancer stem cells. Single cell sequencing provides a means of identifying subpopulations of cancer cells within a single patient. Single cell sequence analysis may prove to be critical for understanding the etiology, progression and drug resistance in ovarian cancer. Copyright © 2017 Elsevier Inc. All rights reserved.
Genome-Wide Identification and Expression of Xenopus F-Box Family of Proteins.
Saritas-Yildirim, Banu; Pliner, Hannah A; Ochoa, Angelica; Silva, Elena M
2015-01-01
Protein degradation via the multistep ubiquitin/26S proteasome pathway is a rapid way to alter the protein profile and drive cell processes and developmental changes. Many key regulators of embryonic development are targeted for degradation by E3 ubiquitin ligases. The most studied family of E3 ubiquitin ligases is the SCF ubiquitin ligases, which use F-box adaptor proteins to recognize and recruit target proteins. Here, we used a bioinformatics screen and phylogenetic analysis to identify and annotate the family of F-box proteins in the Xenopus tropicalis genome. To shed light on the function of the F-box proteins, we analyzed expression of F-box genes during early stages of Xenopus development. Many F-box genes are broadly expressed with expression domains localized to diverse tissues including brain, spinal cord, eye, neural crest derivatives, somites, kidneys, and heart. All together, our genome-wide identification and expression profiling of the Xenopus F-box family of proteins provide a foundation for future research aimed to identify the precise role of F-box dependent E3 ubiquitin ligases and their targets in the regulatory circuits of development.
Zhou, Yong; Hu, Lifang; Wu, Hao; Jiang, Lunwei
2017-01-01
Superoxide dismutase (SOD) proteins are widely present in the plant kingdom and play important roles in different biological processes. However, little is known about the SOD genes in cucumber. In this study, nine SOD genes were identified from cucumber (Cucumis sativus) using bioinformatics-based methods, including 5 Cu/ZnSODs, 3 FeSODs, and 1 MnSOD. Gene structure and motif analysis indicated that most of the SOD genes have relatively conserved exon/intron arrangement and motif composition. Phylogenetic analyses with SODs from cucumber and several other species revealed that these SOD proteins can be traced back to two ancestral SODs before the divergence of monocot and dicot plants. Many cis-elements related to stress responses and plant hormones were found in the promoter sequence of each CsSOD gene. Gene expression analysis revealed that most of the CsSOD genes are expressed in almost all the tested tissues. qRT-PCR analysis of 8 selected CsSOD genes showed that these genes could respond to heat, cold, osmotic, and salt stresses. Our results provide a basis for further functional research on the SOD gene family in cucumber and facilitate their potential applications in the genetic improvement of cucumber. PMID:28808654
Transcriptional profiling of CD31(+) cells isolated from murine embryonic stem cells.
Mariappan, Devi; Winkler, Johannes; Chen, Shuhua; Schulz, Herbert; Hescheler, Jürgen; Sachinidis, Agapios
2009-02-01
Identification of genes involved in endothelial differentiation is of great interest for the understanding of the cellular and molecular mechanisms involved in the development of new blood vessels. Mouse embryonic stem (mES) cells serve as a potential source of endothelial cells for transcriptomic analysis. We isolated endothelial cells from 8-day-old embryoid bodies by immuno-magnetic separation using platelet endothelial cell adhesion molecule-1 (also known as CD31), which is expressed on both early and mature endothelial cells. CD31(+) cells exhibit endothelial-like behavior by being able to incorporate DiI-labeled acetylated low-density lipoprotein as well as form tubular structures on matrigel. Quantitative and semi-quantitative PCR analysis further demonstrated the increased expression of endothelial transcripts. To ascertain the specific transcriptomic identity of the CD31(+) cells, large-scale microarray analysis was carried out. Comparative bioinformatic analysis reveals an enrichment of the gene ontology categories angiogenesis, blood vessel morphogenesis, vasculogenesis and blood coagulation in the CD31(+) cell population. Based on the transcriptomic signatures of the CD31(+) cells, we conclude that this ES cell-derived population contains endothelial-like cells expressing the mesodermal marker BMP2 and possessing angiogenic potential. The transcriptomic characterization of CD31(+) cells enables an in vitro functional genomic model to identify genes required for angiogenesis.
Cake: a bioinformatics pipeline for the integrated analysis of somatic variants in cancer genomes
Rashid, Mamunur; Robles-Espinoza, Carla Daniela; Rust, Alistair G.; Adams, David J.
2013-01-01
Summary: We have developed Cake, a bioinformatics software pipeline that integrates four publicly available somatic variant-calling algorithms to identify single nucleotide variants with higher sensitivity and accuracy than any one algorithm alone. Cake can be run on a high-performance computer cluster or used as a stand-alone application. Availability: Cake is open-source and is available from http://cakesomatic.sourceforge.net/ Contact: da1@sanger.ac.uk Supplementary Information: Supplementary data are available at Bioinformatics online. PMID:23803469
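One simple way to combine the output of several variant callers is a consensus vote, sketched below. The abstract does not state how Cake merges its four callers, so the voting rule, the `min_callers` threshold, the function name and the toy variant tuples are all assumptions made for illustration and should not be read as Cake's implementation.

```python
from collections import Counter


def consensus_variants(call_sets, min_callers=2):
    """Return variants reported by at least `min_callers` of the call sets."""
    support = Counter()
    for calls in call_sets:
        support.update(set(calls))  # each caller votes once per variant
    return {variant for variant, n in support.items() if n >= min_callers}


# Hypothetical calls as (chromosome, position, ref, alt) tuples.
caller_a = {("chr1", 100, "A", "T"), ("chr2", 200, "G", "C")}
caller_b = {("chr1", 100, "A", "T"), ("chr3", 300, "C", "G")}
caller_c = {("chr1", 100, "A", "T"), ("chr2", 200, "G", "C")}
caller_d = {("chr4", 400, "T", "A")}

kept = consensus_variants([caller_a, caller_b, caller_c, caller_d])
```

Raising `min_callers` trades sensitivity for specificity: a threshold of 1 gives the union of all calls, while a threshold equal to the number of callers gives their intersection.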
USDA-ARS's Scientific Manuscript database
This study is focused on the characterization and expression of genes in the red flour beetle, Tribolium castaneum, encoding proteins that possess six-cysteine-containing chitin-binding domains (CBDs) related to the peritrophin A domain (ChtBD2). An exhaustive bioinformatics search of the genome of...
Promoting synergistic research and education in genomics and bioinformatics.
Yang, Jack Y; Yang, Mary Qu; Zhu, Mengxia Michelle; Arabnia, Hamid R; Deng, Youping
2008-01-01
Bioinformatics and Genomics are closely related disciplines that hold great promise for the advancement of research and development in complex biomedical systems, as well as public health, drug design, comparative genomics, personalized medicine and so on. Research and development in these two important areas are impacting science and technology. High throughput sequencing and molecular imaging technologies marked the beginning of a new era for modern translational medicine and personalized healthcare. The impact of having the human sequence and personalized digital images in hand has also created tremendous demand for powerful supercomputing, statistical learning and artificial intelligence approaches to handle the massive bioinformatics and personalized healthcare data, which will have a profound effect on how biomedical research will be conducted toward the improvement of human health and prolonging of human life in the future. The International Society of Intelligent Biological Medicine (http://www.isibm.org) and its official journals, the International Journal of Functional Informatics and Personalized Medicine (http://www.inderscience.com/ijfipm) and the International Journal of Computational Biology and Drug Design (http://www.inderscience.com/ijcbdd), in collaboration with the International Conference on Bioinformatics and Computational Biology (Biocomp), touch tomorrow's bioinformatics and personalized medicine through today's efforts in promoting the research, education and awareness of the upcoming integrated inter/multidisciplinary field. The 2007 International Conference on Bioinformatics and Computational Biology (BIOCOMP07) was held in Las Vegas, the United States of America, on June 25-28, 2007. The conference attracted over 400 papers, covering broad research areas in genomics, biomedicine and bioinformatics.
The Biocomp 2007 provides a common platform for the cross fertilization of ideas, and to help shape knowledge and scientific achievements by bridging these two very important disciplines into an interactive and attractive forum. Keeping this objective in mind, Biocomp 2007 aims to promote interdisciplinary and multidisciplinary education and research. 25 high quality peer-reviewed papers were selected from 400+ submissions for this supplementary issue of BMC Genomics. Those papers contributed to a wide-range of important research fields including gene expression data analysis and applications, high-throughput genome mapping, sequence analysis, gene regulation, protein structure prediction, disease prediction by machine learning techniques, systems biology, database and biological software development. We always encourage participants submitting proposals for genomics sessions, special interest research sessions, workshops and tutorials to Professor Hamid R. Arabnia (hra@cs.uga.edu) in order to ensure that Biocomp continuously plays the leadership role in promoting inter/multidisciplinary research and education in the fields. Biocomp received top conference ranking with a high score of 0.95/1.00. Biocomp is academically co-sponsored by the International Society of Intelligent Biological Medicine and the Research Laboratories and Centers of Harvard University--Massachusetts Institute of Technology, Indiana University--Purdue University, Georgia Tech--Emory University, UIUC, UCLA, Columbia University, University of Texas at Austin and University of Iowa etc. Biocomp--Worldcomp brings leading scientists together across the nation and all over the world and aims to promote synergistic components such as keynote lectures, special interest sessions, workshops and tutorials in response to the advances of cutting-edge research.
Profiling and bioinformatic analysis of circular RNA expression regulated by c-Myc.
Gou, Qiheng; Wu, Ke; Zhou, Jian-Kang; Xie, Yuxin; Liu, Lunxu; Peng, Yong
2017-09-22
The c-Myc transcription factor is involved in cell proliferation, cell cycle and apoptosis by activating or repressing transcription of multiple genes. Circular RNAs (circRNAs) are widely expressed non-coding RNAs participating in the regulation of gene expression. Using a high-throughput microarray assay, we showed that Myc regulates the expression of certain circRNAs. A total of 309 up- and 252 down-regulated circRNAs were identified. Among them, randomly selected 8 circRNAs were confirmed by real-time PCR. Subsequently, Myc-binding sites were found to generally exist in the promoter regions of differentially expressed circRNAs. Based on miRNA sponge mechanism, we constructed circRNAs/miRNAs network regulated by Myc, suggesting that circRNAs may widely regulate protein expression through miRNA sponge mechanism. Lastly, we took advantage of Gene Ontology and KEGG analyses to point out that Myc-regulated circRNAs could impact cell proliferation through affecting Ras signaling pathway and pathways in cancer. Our study for the first time demonstrated that Myc transcription factor regulates the expression of circRNAs, adding a novel component of the Myc tumorigenic program and opening a window to investigate the function of certain circRNAs in tumorigenesis.
Dinga, Jerome Nyhalah; Wamalwa, Mark; Njimoh, Dieudonné Lemuh; Njahira, Moses N.; Djikeng, Appolinaire; Skilton, Rob; Titanji, Vincent Pryde Kehdingha; Pellé, Roger
2015-01-01
Introduction East Coast fever, a devastating disease of cattle, can be controlled partially by vaccination with live T. parva sporozoites. The antigens responsible for conferring immunity are not fully characterized. Recently it was shown that the P. falciparum immunodominant protein UB05 is highly conserved in T. parva, the causative agent of East Coast fever. The aim of the present investigation was to determine the role of the homologue TpUB05 in protective immunity to East Coast fever. Methods The cloning, sequencing and expression of TpUB05 were done according to standard protocols. Bioinformatics analysis of the TpUB05 gene was carried out using algorithms found in the public domain. Polyclonal antiserum against recombinant TpUB05 was raised in rabbits and used for further analysis by Western blotting, ELISA, immunolocalization and in vitro infection neutralization assay. The ability of recombinant TpUB05 (r-TpUB05) to stimulate bovine PBMCs ex-vivo to produce IFN-γ or to proliferate was tested using ELISpot and [3H]-thymidine incorporation assays, respectively. Results All the 20 cattle immunised by the infection and treatment method (ITM) developed significantly higher levels of TpUB05 specific antibodies (p<0.0001) compared to the non-vaccinated ones. Similarly, r-TpUB05 highly stimulated bovine PBMCs from 8/12 (67%) of ITM-immunized cattle tested to produce IFN-γ and proliferate (p< 0.029) as compared to the 4 naïve cattle included as controls. Polyclonal TpUB05 antiserum raised against r-TpUB05 also marginally inhibited infection (p < 0.046) of bovine PBMCs by T. parva sporozoites. In further experiments RT-PCR showed that the TpUB05 gene is expressed by the parasite. This was confirmed by immunolocalization studies which revealed TpUB05 expression by schizonts and piroplasms. Bioinformatics analysis also revealed that this antigen possesses two transmembrane domains, an N-glycosylation site and several O-glycosylation sites.
Conclusion It was concluded that TpUB05 is a potential marker of protective immunity in ECF worth investigating further. PMID:26053064
Yoo, Minjae; Shin, Jimin; Kim, Hyunmin; Kim, Jihye; Kang, Jaewoo; Tan, Aik Choon
2018-04-04
Traditional Chinese Medicine (TCM) has been practiced over thousands of years in China and other Asian countries for treating various symptoms and diseases. However, the underlying molecular mechanisms of TCM are poorly understood, partly due to the "multi-component, multi-target" nature of TCM. To uncover the molecular mechanisms of TCM, we performed a comprehensive gene expression analysis using the connectivity map. We interrogated gene expression signatures obtained from 102 TCM components using the next generation Connectivity Map (CMap) resource. We performed systematic data mining and analysis on the mechanism of action (MoA) of these TCM components based on the CMap results. We clustered the 102 TCM components into four groups based on their MoAs using the next generation CMap resource. We performed gene set enrichment analysis on these components to provide additional support for explaining these molecular mechanisms. We also provided literature evidence to validate the MoAs identified through this bioinformatics analysis. Finally, we developed the Traditional Chinese Medicine Drug Repurposing Hub (TCM Hub) - a connectivity map resource to facilitate the elucidation of TCM MoA for drug repurposing research. TCMHub is freely available at http://tanlab.ucdenver.edu/TCMHub. Molecular mechanisms of TCM could be uncovered by using gene expression signatures and the connectivity map. Through this analysis, we found that many of the TCM components possess diverse MoAs, which may explain the applications of TCM in treating various symptoms and diseases. Copyright © 2018 The Authors. Published by Elsevier B.V. All rights reserved.
Cheng, Gong; Lu, Quan; Ma, Ling; Zhang, Guocai; Xu, Liang; Zhou, Zongshan
2017-01-01
Recently, Docker technology has received increasing attention throughout the bioinformatics community. However, its implementation has not yet been mastered by most biologists; accordingly, its application in biological research has been limited. In order to popularize this technology in the field of bioinformatics and to promote the use of publicly available bioinformatics tools, such as Dockerfiles and Images from communities, government sources, and private owners in the Docker Hub Registry and other Docker-based resources, we introduce here a complete and accurate bioinformatics workflow based on Docker. The present workflow enables analysis and visualization of pan-genomes and biosynthetic gene clusters of bacteria. This provides a new solution for bioinformatics mining of big data from various publicly available biological databases. The present step-by-step guide creates an integrative workflow through a Dockerfile to allow researchers to build their own Image and run Container easily.
Cheng, Gong; Zhang, Guocai; Xu, Liang
2017-01-01
Recently, Docker technology has received increasing attention throughout the bioinformatics community. However, its implementation has not yet been mastered by most biologists; accordingly, its application in biological research has been limited. In order to popularize this technology in the field of bioinformatics and to promote the use of publicly available bioinformatics tools, such as Dockerfiles and Images from communities, government sources, and private owners in the Docker Hub Registry and other Docker-based resources, we introduce here a complete and accurate bioinformatics workflow based on Docker. The present workflow enables analysis and visualization of pan-genomes and biosynthetic gene clusters of bacteria. This provides a new solution for bioinformatics mining of big data from various publicly available biological databases. The present step-by-step guide creates an integrative workflow through a Dockerfile to allow researchers to build their own Image and run Container easily. PMID:29204317
Sgadò, Paola; Provenzano, Giovanni; Dassi, Erik; Adami, Valentina; Zunino, Giulia; Genovesi, Sacha; Casarosa, Simona; Bozzi, Yuri
2013-12-19
Transcriptome analysis has been used in autism spectrum disorder (ASD) to unravel common pathogenic pathways based on the assumption that distinct rare genetic variants or epigenetic modifications affect common biological pathways. To unravel recurrent ASD-related neuropathological mechanisms, we took advantage of the En2-/- mouse model and performed transcriptome profiling on cerebellar and hippocampal adult tissues. Cerebellar and hippocampal tissue samples from three En2-/- and wild type (WT) littermate mice were assessed for differential gene expression using microarray hybridization followed by RankProd analysis. To identify functional categories overrepresented in the differentially expressed genes, we used integrated gene-network analysis, gene ontology enrichment and mouse phenotype ontology analysis. Furthermore, we performed direct enrichment analysis of ASD-associated genes from the SFARI repository in our differentially expressed genes. Given the limited number of animals used in the study, we used permissive criteria and identified 842 differentially expressed genes in En2-/- cerebellum and 862 in the En2-/- hippocampus. Our functional analysis revealed that the molecular signature of En2-/- cerebellum and hippocampus shares convergent pathological pathways with ASD, including abnormal synaptic transmission, altered developmental processes and increased immune response. Furthermore, when directly compared to the repository of the SFARI database, our differentially expressed genes in the hippocampus showed enrichment of ASD-associated genes significantly higher than previously reported. qPCR was performed for representative genes to confirm relative transcript levels compared to those detected in microarrays. Despite the limited number of animals used in the study, our bioinformatic analysis indicates the En2-/- mouse is a valuable tool for investigating molecular alterations related to ASD.
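The abstract above does not say which statistic was used for the direct enrichment of SFARI genes among the differentially expressed genes. A common choice for this kind of overlap question is a one-sided hypergeometric test, sketched below under that assumption. The function name and the toy numbers are hypothetical and are not taken from the study.

```python
from math import comb


def hypergeom_enrichment_p(universe, annotated, selected, overlap):
    """One-sided hypergeometric p-value P(X >= overlap).

    universe  -- number of genes measured on the platform
    annotated -- number of those genes in the reference list (e.g. an ASD gene list)
    selected  -- number of differentially expressed genes
    overlap   -- number of differentially expressed genes that are also annotated
    """
    upper = min(annotated, selected)
    total = comb(universe, selected)
    tail = sum(
        comb(annotated, x) * comb(universe - annotated, selected - x)
        for x in range(overlap, upper + 1)
    )
    return tail / total


# Toy numbers only: 10 genes measured, 4 annotated, 3 selected, 2 in the overlap.
p_value = hypergeom_enrichment_p(universe=10, annotated=4, selected=3, overlap=2)
print(f"enrichment p-value: {p_value:.4f}")
```

With real data the universe should be the set of genes actually assayed on the array, since the choice of background strongly affects the resulting p-value.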
Muralidharan, Arumugam Ramachandran; Leema, George; Annadurai, Thangaraj; Anitha, Thirugnanasambandhar Sivasubramanian; Thomas, Philip A.
2012-01-01
Purpose To determine the putative role of acetyl-L-carnitine (ALCAR) in maintaining normal intercellular communication in the lens through connexin. Methods In the present study, Wistar rat pups were divided into 3 groups of eight each. On postpartum day ten, Group I rat pups received an intraperitoneal injection (50 µl) of 0.89% saline. Rats in Groups II and III received a subcutaneous injection (50 µl) of sodium selenite (19 µmol/kg bodyweight); Group III rat pups also received an intraperitoneal injection of ALCAR (200 mg/kg bodyweight) once daily on postpartum days 9–14. Both eyes of each pup were examined from day 16 up to postpartum day 30. Alterations in the mean activity of the channel pumps, calcium-ATPase and sodium/potassium-ATPase, were determined. The expression of genes encoding key lenticular gap junctions (connexin 46 and connexin 50) and a channel pump (plasma membrane Ca2+-ATPase [PMCA1]) was evaluated by reverse transcription-PCR. Immunoblot analysis was also performed to confirm the differential expression of key lenticular connexin proteins. In addition, bioinformatics analysis was performed to determine the interacting residues of the connexin proteins with ALCAR. Results Significantly lower mean activities of Ca2+-ATPase and Na+/K+ -ATPase were observed in the lenses of Group II rats than those in Group I rat lenses. However, the observed mean activities of Ca2+-ATPase and Na+/K+-ATPase in Group III rat lenses were significantly higher than those in Group II rat lenses. The mean mRNA transcript levels of the connexin 46 and connexin 50 genes were significantly lower, while the mean levels of PMCA1 gene transcripts were significantly higher, in Group II rat lenses than in Group I rat lenses. Immunoblot analysis also confirmed the altered expression of connexin proteins in lysates of whole lenses of Group II rats. 
However, the expression of connexin 46 and connexin 50 proteins in lenses from group III rats was essentially similar to that noted in lenses from normal (Group I) rats. Hydrogen bond-interaction between ALCAR and amino acid residues at the functional domain regions of connexin 46 and connexin 50 proteins was also demonstrated through bioinformatics tools. Conclusions The results suggest that ALCAR plays a key role in maintaining lenticular homeostasis by promoting gap junctional intercellular communication. PMID:22876134
Muralidharan, Arumugam Ramachandran; Leema, George; Annadurai, Thangaraj; Anitha, Thirugnanasambandhar Sivasubramanian; Thomas, Philip A; Geraldine, Pitchairaj
2012-01-01
To determine the putative role of acetyl-L-carnitine (ALCAR) in maintaining normal intercellular communication in the lens through connexin. In the present study, Wistar rat pups were divided into 3 groups of eight each. On postpartum day ten, Group I rat pups received an intraperitoneal injection (50 µl) of 0.89% saline. Rats in Groups II and III received a subcutaneous injection (50 µl) of sodium selenite (19 µmol/kg bodyweight); Group III rat pups also received an intraperitoneal injection of ALCAR (200 mg/kg bodyweight) once daily on postpartum days 9-14. Both eyes of each pup were examined from day 16 up to postpartum day 30. Alterations in the mean activity of the channel pumps, calcium-ATPase and sodium/potassium-ATPase, were determined. The expression of genes encoding key lenticular gap junctions (connexin 46 and connexin 50) and a channel pump (plasma membrane Ca(2+)-ATPase [PMCA1]) was evaluated by reverse transcription-PCR. Immunoblot analysis was also performed to confirm the differential expression of key lenticular connexin proteins. In addition, bioinformatics analysis was performed to determine the interacting residues of the connexin proteins with ALCAR. Significantly lower mean activities of Ca(2+)-ATPase and Na(+)/K(+) -ATPase were observed in the lenses of Group II rats than those in Group I rat lenses. However, the observed mean activities of Ca(2+)-ATPase and Na(+)/K(+)-ATPase in Group III rat lenses were significantly higher than those in Group II rat lenses. The mean mRNA transcript levels of the connexin 46 and connexin 50 genes were significantly lower, while the mean levels of PMCA1 gene transcripts were significantly higher, in Group II rat lenses than in Group I rat lenses. Immunoblot analysis also confirmed the altered expression of connexin proteins in lysates of whole lenses of Group II rats. 
However, the expression of connexin 46 and connexin 50 proteins in lenses from group III rats was essentially similar to that noted in lenses from normal (Group I) rats. Hydrogen bond-interaction between ALCAR and amino acid residues at the functional domain regions of connexin 46 and connexin 50 proteins was also demonstrated through bioinformatics tools. The results suggest that ALCAR plays a key role in maintaining lenticular homeostasis by promoting gap junctional intercellular communication.
Gao, Li; Zhang, Li-Jie; Li, Sheng-Hua; Wei, Li-Li; Luo, Bin; He, Rong-Quan; Xia, Shuang
2018-03-06
MiR-452-5p has been reported to be down-regulated in prostate cancer, affecting the development of this type of cancer. However, the molecular mechanism of miR-452-5p in prostate cancer remains unclear. Therefore, we investigated the network of target genes of miR-452-5p in prostate cancer using bioinformatics analyses. We first analyzed the expression profiles and prognostic value of miR-452-5p in prostate cancer tissues from a public database. Gene Ontology (GO), the Kyoto Encyclopedia of Genes and Genomes (KEGG), PANTHER pathway analyses, and a disease ontology (DO) analysis were performed to find the molecular functions of the target genes from GSE datasets and miRWalk. Finally, we validated hub genes from the protein-protein interaction (PPI) networks of the target genes in the Human Protein Atlas (HPA) database and Gene Expression Profiling Interactive Analysis (GEPIA). The optimal target genes were narrowed down by taking the intersection of up-regulated genes from GEPIA, down-regulated genes from GSE datasets, and predicted genes in miRWalk. Based on mining of GEO and ArrayExpress microarray chips and miRNA-Seq data in the TCGA database, which includes 1007 prostate cancer samples and 387 non-cancer samples, miR-452-5p is shown to be down-regulated in prostate cancer. GO, KEGG, and PANTHER pathway analyses suggested that the target genes might participate in important biological processes, such as transforming growth factor beta signaling and the positive regulation of brown fat cell differentiation and mesenchymal cell differentiation, as well as the Ras signaling pathway and pathways regulating the pluripotency of stem cells and arrhythmogenic right ventricular cardiomyopathy (ARVC). Nine genes (GABBR, PNISR, NTSR1, DOCK1, EREG, SFRP1, PTGS2, LEF1, and BMP2) were defined as hub genes in the PPI network. Three genes (FAM174B, SLC30A4, and SLIT1) were jointly shared by GEPIA, the GSE datasets, and miRWalk. Down-regulated miR-452-5p might play an essential role in the tumorigenesis of prostate cancer. Copyright © 2018. Published by Elsevier GmbH.
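The narrowing step in the abstract above amounts to intersecting three gene lists. The sketch below illustrates that operation only. The helper name and the placeholder genes `GENE_A`, `GENE_B`, and `GENE_C` are hypothetical; only the three shared gene symbols come from the abstract.

```python
def shared_targets(*gene_sets):
    """Return the genes present in every supplied collection, sorted by name."""
    if not gene_sets:
        return []
    common = set(gene_sets[0])
    for genes in gene_sets[1:]:
        common &= set(genes)
    return sorted(common)


# Toy lists: the GENE_* entries are placeholders, not real results.
gepia_up = {"FAM174B", "SLC30A4", "SLIT1", "GENE_A"}
gse_down = {"FAM174B", "SLC30A4", "SLIT1", "GENE_B"}
mirwalk_predicted = {"FAM174B", "SLC30A4", "SLIT1", "GENE_C"}

print(shared_targets(gepia_up, gse_down, mirwalk_predicted))
```

In practice the gene symbols from the three sources would need to be normalised to one naming scheme before intersecting, otherwise aliases silently drop out of the result.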
Xue, Linlin; Xie, Li; Song, Xingguo; Song, Xianrang
2018-04-17
Platelets have emerged as key players in tumorigenesis and tumor progression. The tumor-educated platelet (TEP) RNA profile has the potential to diagnose non-small-cell lung cancer (NSCLC). The objective of this study was to identify potential TEP RNA biomarkers for the diagnosis of NSCLC and to explore the mechanisms underlying alterations of the TEP RNA profile. The RNA-seq datasets GSE68086 and GSE89843 were downloaded from Gene Expression Omnibus DataSets (GEO DataSets). Then, the functional enrichment of the differentially expressed mRNAs was analyzed by the Database for Annotation Visualization and Integrated Discovery (DAVID). The miRNAs which regulated the differential mRNAs and the target mRNAs of miRNAs were identified by miRanda and miRDB. Then, the miRNA-mRNA regulatory network was visualized via Cytoscape software. Twenty consistently altered mRNAs (2 up-regulated and 18 down-regulated) were identified from the two GSE datasets, and they were significantly enriched in several biological processes, including transport and establishment of localization. Twenty miRNAs were shared between the exosomal miRNA-seq dataset and the 229 miRNAs that regulated the 20 consistently differential mRNAs in platelets. We also analyzed 13 spliceosomal mRNAs and their miRNA predictions; there were 27 common miRNAs between 206 differential exosomal miRNAs and 338 miRNAs that regulated 13 distinct spliceosomal mRNAs. This study identified 20 potential TEP RNA biomarkers in NSCLC for diagnosis by integrated bioinformatic analysis, and alterations in the TEP RNA profile may be related to post-transcriptional regulation and the splicing metabolism of the spliceosome. © 2018 Wiley Periodicals, Inc.
The Landscape of MicroRNA, Piwi-Interacting RNA, and Circular RNA in Human Saliva
Bahn, Jae Hoon; Zhang, Qing; Li, Feng; Chan, Tak-Ming; Lin, Xianzhi; Kim, Yong; Wong, David T.W.; Xiao, Xinshu
2015-01-01
BACKGROUND Extracellular RNAs (exRNAs) in human body fluids are emerging as effective biomarkers for detection of diseases. Saliva, as the most accessible and noninvasive body fluid, has been shown to harbor exRNA biomarkers for several human diseases. However, the entire spectrum of exRNA from saliva has not been fully characterized. METHODS Using high-throughput RNA sequencing (RNA-Seq), we conducted an in-depth bioinformatic analysis of noncoding RNAs (ncRNAs) in human cell-free saliva (CFS) from healthy individuals, with a focus on microRNAs (miRNAs), piwi-interacting RNAs (piRNAs), and circular RNAs (circRNAs). RESULTS Our data demonstrated robust reproducibility of miRNA and piRNA profiles across individuals. Furthermore, individual variability of these salivary RNA species was highly similar to those in other body fluids or cellular samples, despite the direct exposure of saliva to environmental impacts. By comparative analysis of >90 RNA-Seq data sets of different origins, we observed that piRNAs were surprisingly abundant in CFS compared with other body fluid or intracellular samples, with expression levels in CFS comparable to those found in embryonic stem cells and skin cells. Conversely, miRNA expression profiles in CFS were highly similar to those in serum and cerebrospinal fluid. Using a customized bioinformatics method, we identified >400 circRNAs in CFS. These data represent the first global characterization and experimental validation of circRNAs in any type of extracellular body fluid. CONCLUSIONS Our study provides a comprehensive landscape of ncRNA species in human saliva that will facilitate further biomarker discoveries and lay a foundation for future studies related to ncRNAs in human saliva. PMID:25376581
Gong, Cuihua; Sun, Shangtong; Liu, Bing; Wang, Jing; Chen, Xiaodong
2017-06-01
The study aimed to identify the potential target genes and key miRNAs as well as to explore the underlying mechanisms in the pathogenesis of oral lichen planus (OLP) by bioinformatics analysis. The microarray data of GSE38617 were downloaded from the Gene Expression Omnibus (GEO) database. A total of 7 OLP and 7 normal samples were used to identify the differentially expressed genes (DEGs) and miRNAs. Functional enrichment analyses were then performed on the DEGs. Furthermore, a DEG-miRNA network and a miRNA-function network were constructed with Cytoscape software. A total of 1758 DEGs (598 up- and 1160 down-regulated genes) and 40 miRNAs (17 up- and 23 down-regulated miRNAs) were selected. The up-regulated genes were related to the nuclear factor-kappa B (NF-κB) signaling pathway, while down-regulated genes were mainly enriched in the function of the ribosome. Tumor necrosis factor (TNF), caspase recruitment domain family, member 11 (CARD11) and mitochondrial ribosomal protein (MRP) genes were identified in these functions. In addition, miR-302 was a hub node in the DEG-miRNA network and regulated cyclin D1 (CCND1). MiR-548a-2 was the key miRNA in the miRNA-function network, regulating multiple functions including ribosomal function. The NF-κB signaling pathway and ribosome function may be the pathogenic mechanisms of OLP. Genes such as TNF, CARD11, MRP genes and CCND1 may be potential therapeutic target genes in OLP. MiR-548a-2 and miR-302 may play important roles in OLP development. Copyright © 2017 Elsevier Ltd. All rights reserved.
Djordjevic, Michael A; Chen, Han Cai; Natera, Siria; Van Noorden, Giel; Menzel, Christian; Taylor, Scott; Renard, Clotilde; Geiger, Otto; Weiller, Georg F
2003-06-01
A proteomic examination of Sinorhizobium meliloti strain 1021 was undertaken using a combination of 2-D gel electrophoresis, peptide mass fingerprinting, and bioinformatics. Our goal was to identify (i) putative symbiosis- or nutrient-stress-specific proteins, (ii) the biochemical pathways active under different conditions, (iii) potential new genes, and (iv) the extent of posttranslational modifications of S. meliloti proteins. In total, we identified the protein products of 810 genes (13.1% of the genome's coding capacity). The 810 genes generated 1,180 gene products, with chromosomal genes accounting for 78% of the gene products identified (18.8% of the chromosome's coding capacity). The activity of 53 metabolic pathways was inferred from bioinformatic analysis of proteins with assigned Enzyme Commission numbers. Of the remaining proteins that did not encode enzymes, ABC-type transporters composed 12.7% and regulatory proteins 3.4% of the total. Proteins with up to seven transmembrane domains were identified in membrane preparations. A total of 27 putative nodule-specific proteins and 35 nutrient-stress-specific proteins were identified and used as a basis to define genes and describe processes occurring in S. meliloti cells in nodules and under stress. Several nodule proteins from the plant host were present in the nodule bacteria preparations. We also identified seven potentially novel proteins not predicted from the DNA sequence. Post-translational modifications such as N-terminal processing could be inferred from the data. The posttranslational addition of UMP to the key regulator of nitrogen metabolism, PII, was demonstrated. This work demonstrates the utility of combining mass spectrometry with protein arraying or separation techniques to identify candidate genes involved in important biological processes and niche occupations that may be intransigent to other methods of gene expression profiling.
Kim, Dong Hyun; Patnaik, Bharat Bhusan; Seo, Gi Won; Kang, Seong Min; Lee, Yong Seok; Lee, Bok Luel; Han, Yeon Soo
2013-11-01
We have identified a novel ricin-type (R-type) lectin by sequencing of random clones from a cDNA library of the coleopteran beetle, Tenebrio molitor. The cDNA sequence comprises 495 bp encoding a protein of 164 amino acid residues and shows 49% identity with galectin of Tribolium castaneum. Bioinformatics analysis shows that the amino acid residues from 35 to 162 belong to a ricin-type beta-trefoil structure. The transcript was significantly upregulated in the early hours after injection with peptidoglycans derived from Gram (+) and Gram (-) bacteria, beta-1,3-glucan from fungi, and an intracellular pathogen, Listeria monocytogenes, suggesting a putative function in innate immunity. Copyright © 2013 Elsevier Inc. All rights reserved.
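The two sizes quoted in the abstract above are consistent with each other if the 495 bp figure is read as the open reading frame including its stop codon, which is an assumption here since the abstract does not say so. A small check of that arithmetic, with a hypothetical helper name:

```python
def orf_length_bp(n_residues, include_stop=True):
    """Length in base pairs of an open reading frame encoding n_residues amino acids.

    Each residue is encoded by one 3-nucleotide codon; the stop codon adds
    three more nucleotides when include_stop is True.
    """
    codons = n_residues + (1 if include_stop else 0)
    return 3 * codons


# 164 coding codons plus one stop codon: 165 codons in total.
print(orf_length_bp(164))
# The coding codons alone, without the stop codon.
print(orf_length_bp(164, include_stop=False))
```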
Bioinformatics and the Undergraduate Curriculum
ERIC Educational Resources Information Center
Maloney, Mark; Parker, Jeffrey; LeBlanc, Mark; Woodard, Craig T.; Glackin, Mary; Hanrahan, Michael
2010-01-01
Recent advances involving high-throughput techniques for data generation and analysis have made familiarity with basic bioinformatics concepts and programs a necessity in the biological sciences. Undergraduate students increasingly need training in methods related to finding and retrieving information stored in vast databases. The rapid rise of…
Using Kepler for Tool Integration in Microarray Analysis Workflows.
Gan, Zhuohui; Stowe, Jennifer C; Altintas, Ilkay; McCulloch, Andrew D; Zambon, Alexander C
Increasing numbers of genomic technologies are leading to massive amounts of genomic data, all of which require complex analysis. More and more bioinformatics analysis tools are being developed by scientists to simplify these analyses. However, different pipelines have been developed using different software environments. This makes integration of these diverse bioinformatics tools difficult. Kepler provides an open source environment to integrate these disparate packages. Using Kepler, we integrated several external tools, including Bioconductor packages, AltAnalyze (a Python-based open source tool), and an R-based comparison tool, to build an automated workflow to meta-analyze both online and local microarray data. The automated workflow connects the integrated tools seamlessly, delivers data flow between the tools smoothly, and hence improves the efficiency and accuracy of complex data analyses. Our workflow exemplifies the usage of Kepler as a scientific workflow platform for bioinformatics pipelines.
Scalability and Validation of Big Data Bioinformatics Software.
Yang, Andrian; Troup, Michael; Ho, Joshua W K
2017-01-01
This review examines two important aspects that are central to modern big data bioinformatics analysis - software scalability and validity. We argue that not only are the issues of scalability and validation common to all big data bioinformatics analyses, they can be tackled by conceptually related methodological approaches, namely divide-and-conquer (scalability) and multiple executions (validation). Scalability is defined as the ability for a program to scale based on workload. It has always been an important consideration when developing bioinformatics algorithms and programs. Nonetheless the surge of volume and variety of biological and biomedical data has posed new challenges. We discuss how modern cloud computing and big data programming frameworks such as MapReduce and Spark are being used to effectively implement divide-and-conquer in a distributed computing environment. Validation of software is another important issue in big data bioinformatics that is often ignored. Software validation is the process of determining whether the program under test fulfils the task for which it was designed. Determining the correctness of the computational output of big data bioinformatics software is especially difficult due to the large input space and complex algorithms involved. We discuss how state-of-the-art software testing techniques that are based on the idea of multiple executions, such as metamorphic testing, can be used to implement an effective bioinformatics quality assurance strategy. We hope this review will raise awareness of these critical issues in bioinformatics.
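The two ideas named in the review above can be illustrated on a small scale without any cluster framework. The sketch below is not taken from the review: it counts k-mers by splitting a sequence into overlapping chunks and merging the per-chunk counts (divide-and-conquer, in the spirit of map then reduce), then checks the result with two metamorphic relations that need no known "correct" output. All function names are hypothetical.

```python
from collections import Counter

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def count_kmers(seq, k):
    """Reference implementation: count every k-mer in one pass."""
    return Counter(seq[i:i + k] for i in range(len(seq) - k + 1))


def count_kmers_chunked(seq, k, chunk_size):
    """Divide-and-conquer version: count per chunk, then merge the counters."""
    if chunk_size < k:
        raise ValueError("chunk_size must be at least k")
    total = Counter()
    # Each chunk contributes chunk_size - k + 1 k-mer start positions, so
    # consecutive chunks overlap by k - 1 bases and no k-mer is lost or doubled.
    step = chunk_size - k + 1
    for start in range(0, max(len(seq) - k + 1, 0), step):
        total.update(count_kmers(seq[start:start + chunk_size], k))
    return total


def reverse_complement(seq):
    return seq.translate(_COMPLEMENT)[::-1]


def check_relations(seq, k):
    """Metamorphic checks based on multiple executions of the same program."""
    reference = count_kmers(seq, k)
    # Relation 1: the chunk size must not change the merged result.
    for chunk_size in (k, k + 3, 2 * k + 1):
        assert count_kmers_chunked(seq, k, chunk_size) == reference
    # Relation 2: reverse-complementing the input maps every k-mer to its
    # reverse complement while preserving its count.
    rc_counts = count_kmers(reverse_complement(seq), k)
    for kmer, n in reference.items():
        assert rc_counts[reverse_complement(kmer)] == n
    return True


print(check_relations("ACGTACGTTGCAACGT", 3))
```

In a distributed setting the per-chunk counting would run on separate workers and the merge would be the reduce step; the metamorphic relations stay the same regardless of where the chunks are processed.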
Li, Dandan; Li, Chunjin; Xu, Ying; Xu, Duo; Li, Hongjiao; Gao, Liwei; Chen, Shuxiong; Fu, Lulu; Xu, Xin; Liu, Yongzheng; Zhang, Xueying; Zhang, Jingshun; Ming, Hao; Zheng, Lianwen
2016-04-01
Polycystic ovary syndrome (PCOS) is a complex and heterogeneous endocrine disorder. To understand the pathogenesis of PCOS, we established rat models of PCOS induced by letrozole and employed deep sequencing to screen the differential expression of microRNAs (miRNAs) in PCOS rats and control rats. We examined vaginal smears and detected ovarian pathological alterations and hormone level changes in PCOS rats. Deep sequencing showed that a total of 129 miRNAs were differentially expressed in the ovaries from the letrozole-induced rat model compared with the control, including 49 upregulated and 80 downregulated miRNAs. Furthermore, the differential expression of miR-201-5p, miR-34b-5p, miR-141-3p, and miR-200a-3p was confirmed by real-time polymerase chain reaction. Bioinformatic analysis revealed that these four miRNAs were predicted to target a large set of genes with different functions. Pathway analysis supported that the miRNAs regulate oocyte meiosis, mitogen-activated protein kinase (MAPK) signaling, phosphoinositide 3-kinase/Akt (PI3K-Akt) signaling, Rap1 signaling, and Notch signaling. These data indicate that miRNAs are differentially expressed in the rat PCOS model and that the differentially expressed miRNAs are involved in the etiology and pathophysiology of PCOS. Our findings will help identify miRNAs as novel diagnostic markers and therapeutic targets for PCOS.
Dysregulation of hepatic microRNA expression profiles with Clonorchis sinensis infection.
Han, Su; Tang, Qiaoran; Lu, Xi; Chen, Rui; Li, Yihong; Shu, Jing; Zhang, Xiaoli; Cao, Jianping
2016-11-30
Clonorchiasis remains an important zoonotic parasitic disease worldwide. The molecular mechanisms of host-parasite interaction are not fully understood. Non-coding microRNAs (miRNAs) are considered to be key regulators in parasitic diseases. The regulation of miRNAs and host micro-environment may be involved in clonorchiasis, and require further investigation. MiRNA microarray technology and bioinformatic analysis were used to investigate the regulatory mechanisms of host miRNA and to compare miRNA expression profiles in the liver tissues of control and Clonorchis sinensis (C. sinensis)-infected rats. A total of eight miRNAs were downregulated and two were upregulated, which showed differentially altered expression profiles in the liver tissue of C. sinensis-infected rats. Further analysis of the differentially expressed miRNAs revealed that many important signal pathways were triggered after infection with C. sinensis, which were related to clonorchiasis pathogenesis, such as cell apoptosis and inflammation, as well as genes involved in signal transduction mechanisms, such as pathways in cancer and the Wnt and Mitogen-activated protein kinases (MAPK) signaling pathways. The present study revealed that the miRNA expression profiles of the host were changed by C. sinensis infection. This dysregulation in miRNA expression may contribute to the etiology and pathophysiology of clonorchiasis. These results also provide new insights into the regulatory mechanisms of miRNAs in clonorchiasis, which may present potential targets for future C. sinensis control strategies.
2010-01-01
Background Infection by infectious laryngotracheitis virus (ILTV; gallid herpesvirus 1) causes acute respiratory diseases in chickens, often with high mortality. To better understand host-ILTV interactions at the host transcriptional level, a microarray analysis was performed using 4 × 44 K Agilent chicken custom oligo microarrays. Results Microarrays were hybridized using the two-color hybridization method with total RNA extracted from ILTV-infected chicken embryo lung cells at 0, 1, 3, 5, and 7 days post infection (dpi). Results showed that 789 genes were differentially expressed in response to ILTV infection, including genes involved in the immune system (cytokines, chemokines, MHC, and NF-κB), cell cycle regulation (cyclin B2, CDK1, and CKI3), matrix metalloproteinases (MMPs) and cellular metabolism. Differential expression of 20 of the 789 genes was confirmed by quantitative reverse transcription-PCR (qRT-PCR). A bioinformatics tool (Ingenuity Pathway Analysis) used to analyze biological functions and pathways on the group of 789 differentially expressed genes revealed 21 possible gene networks with intermolecular connections among 275 functionally identified genes. These 275 genes were classified into a number of functional groups that included cancer, genetic disorder, cellular growth and proliferation, and cell death. Conclusion The results of this study provide comprehensive knowledge on global gene expression and the biological functionalities of differentially expressed genes in chicken embryo lung cells in response to ILTV infection. PMID:20663125
Sjögren, Rasmus J. O.; Egan, Brendan; Katayama, Mutsumi; Zierath, Juleen R.
2014-01-01
microRNAs (miRNAs) are short noncoding RNAs that regulate gene expression through posttranscriptional repression of target genes. miRNAs exert a fundamental level of control over many developmental processes, but their role in the differentiation and development of skeletal muscle from myogenic progenitor cells in humans remains incompletely understood. Using primary cultures established from human skeletal muscle satellite cells, we performed microarray profiling of miRNA expression during differentiation of myoblasts (day 0) into myotubes at 48 h intervals (day 2, 4, 6, 8, and 10). Based on a time-course analysis, we identified 44 miRNAs with altered expression [false discovery rate (FDR) < 5%, fold change > ±1.2] during differentiation, including the marked upregulation of the canonical myogenic miRNAs miR-1, miR-133a, miR-133b, and miR-206. Microarray profiling of mRNA expression at day 0, 4, and 10 identified 842 and 949 genes differentially expressed (FDR < 10%) at day 4 and 10, respectively. At day 10, 42% of altered transcripts demonstrated reciprocal expression patterns in relation to the directional change of their in silico predicted regulatory miRNAs based on analysis using Ingenuity Pathway Analysis microRNA Target Filter. Bioinformatic analysis predicted networks of regulation during differentiation including myomiRs miR-1/206 and miR-133a/b, miRNAs previously established in differentiation including miR-26 and miR-30, and novel miRNAs regulated during differentiation of human skeletal muscle cells such as miR-138-5p and miR-20a. These reciprocal expression patterns may represent new regulatory nodes in human skeletal muscle cell differentiation. This analysis serves as a reference point for future studies of human skeletal muscle differentiation and development in healthy and disease states. PMID:25547110
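The selection rule quoted in the abstract above (FDR < 5%, fold change > ±1.2) combines a multiple-testing threshold with an effect-size threshold. The abstract does not name the FDR procedure, so the sketch below assumes the Benjamini-Hochberg adjustment. The function names, the signed fold-change convention, and the example values are hypothetical and are not data from the study.

```python
def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value to the smallest, enforcing monotonicity.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


def select_features(names, pvalues, fold_changes, fdr=0.05, min_fc=1.2):
    """Keep features with adjusted p < fdr and |signed fold change| > min_fc.

    fold_changes uses the signed convention: +1.5 means 1.5-fold up,
    -1.5 means 1.5-fold down.
    """
    q_values = benjamini_hochberg(pvalues)
    return [
        name
        for name, q, fc in zip(names, q_values, fold_changes)
        if q < fdr and abs(fc) > min_fc
    ]


# Placeholder feature names and values for illustration only.
names = ["feature_a", "feature_b", "feature_c", "feature_d"]
pvalues = [0.001, 0.01, 0.03, 0.5]
fold_changes = [2.0, -1.1, -1.5, 3.0]
print(select_features(names, pvalues, fold_changes))
```

Applying both filters means a feature with a large fold change but weak statistical support, or strong support but a negligible change, is excluded.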
Wang, Lingyan; Yu, Xiaoling; Wu, Chao; Zhu, Teng; Wang, Wenming; Zheng, Xiaofeng; Jin, Hongzhong
2018-06-05
Generalized pustular psoriasis (GPP) is a rare, episodic, potentially life-threatening inflammatory disease. However, the pathogenesis of GPP, and universally accepted therapies for treating it, remain undefined. To better understand the disease mechanism of GPP, we performed a transcriptome analysis to profile the gene expression of peripheral blood mononuclear cells (PBMCs) from patients enrolled at the time of diagnosis and receiving follow-up treatment for up to 6 months. RNA sequencing data revealed that gene expression in five GPP patients' PBMCs was profoundly altered following acitretin treatment. Differentially expressed gene (DEG) analysis suggested that genes related to psoriatic inflammation, including CXCL1, CXCL8 (IL-8), S100A8, S100A9, S100A12 and LCN2, were significantly downregulated in patients in remission from GPP. Functional enrichment and annotation analysis unveiled a cluster of DEGs significantly associated with the function of leukocytes, particularly neutrophils. Pathway analysis suggested that a variety of pro-inflammatory pathways were inhibited in patients in remission. This analysis not only reaffirmed known signaling pathways in GPP pathogenesis, but also implicated novel factors and pathways, such as cell cycle regulation pathways. Furthermore, regulator network analysis provided bioinformatics-based support for upstream molecules as potential therapeutic targets such as oncostatin M. This longitudinal analysis of blood transcriptomes provides the first evidence that dysregulated gene expression in peripheral blood may significantly contribute to psoriatic inflammation in GPP patients. Novel canonical pathways and biomarkers identified in the current research may provide insights to help understand GPP pathobiology and advance novel therapeutics.
Jin, Xiaohan; Xu, Zhongwei; Cao, Jin; Shao, Ping; Zhou, Maobin; Qin, Zhe; Liu, Yan; Yu, Fang; Zhou, Xin; Ji, Wenjie; Cai, Wei; Ma, Yongqiang; Wang, Chengyan; Shan, Nana; Yang, Ning; Chen, Xu; Li, Yuming
2017-09-01
Hypertensive disorder in pregnancy (HDP) refers to a series of diseases that cause hypertension during pregnancy, including HDP, preeclampsia (PE) and eclampsia. This study screens differentially expressed proteins in placenta tissues from PE cases using a 2D LC-MS/MS quantitative proteomics strategy. A total of 2281 proteins are quantified; of these, 145 differentially expressed proteins are identified between PE and control cases (p<0.05). Bioinformatics analysis suggests that these proteins are mainly involved in biological processes such as oxidation reduction, mitochondrion organization, and acute inflammatory response. In particular, the glutamine metabolic process related molecules GPX1, GPX3, SMS, GGCT, GSTK1, NFκB, GSTT2, SOD1 and GCLM are involved in the conversion of oxidized glutathione (GSSG) to reduced glutathione (GSH) through glutathione, mercapturic acid and arginine metabolism. The results of this study reveal that a glutathione metabolism disorder in placenta tissues may contribute to the occurrence of PE. Copyright © 2017. Published by Elsevier B.V.
Genome-wide analysis of TCP family in tobacco.
Chen, L; Chen, Y Q; Ding, A M; Chen, H; Xia, F; Wang, W F; Sun, Y H
2016-05-23
The TCP family is a transcription factor family whose members are extensively involved in plant growth and development as well as in signal transduction in response to many physiological and biochemical stimuli. In the present study, 61 TCP genes were identified in the tobacco (Nicotiana tabacum) genome. Bioinformatic methods were employed to predict and analyze the gene structure, gene expression, phylogeny, and conserved domains of TCP proteins in tobacco. The 61 NtTCP genes were divided into three diverse groups, based on the division of TCP genes in tomato and Arabidopsis, and the results of the conserved domain and sequence analyses further confirmed the classification of the NtTCP genes. The expression pattern of NtTCP also demonstrated that the majority of these genes play important roles in all tissues, while some genes exercise their functions only in specific tissues. In brief, the comprehensive and thorough study of the TCP family in other plants provides sufficient resources for studying the structure and functions of TCPs in tobacco.
Transcriptional profiling of Medicago truncatula meristematic root cells
Holmes, Peta; Goffard, Nicolas; Weiller, Georg F; Rolfe, Barry G; Imin, Nijat
2008-01-01
Background The root apical meristem of the crop and model legume Medicago truncatula is a significantly different stem cell system to that of the widely studied model plant species Arabidopsis thaliana. In this study we used the Affymetrix Medicago GeneChip® to compare the transcriptomes of meristem and non-meristematic root to identify root meristem specific candidate genes. Results Using mRNA from root meristem and non-meristem we were able to identify 324 and 363 transcripts differentially expressed from the two regions. With bioinformatics tools developed to functionally annotate the Medicago genome array we could identify significant changes in metabolism and signalling, and the differential expression of 55 transcription factors in meristematic and non-meristematic roots. Conclusion This is the first comprehensive analysis of M. truncatula root meristem cells using this genome array. These data will facilitate the mapping of regulatory and metabolic networks involved in the open root meristem of M. truncatula and provide candidates for functional analysis. PMID:18302802
Sequence analysis and molecular characterization of Wnt4 gene in metacestodes of Taenia solium.
Hou, Junling; Luo, Xuenong; Wang, Shuai; Yin, Cai; Zhang, Shaohua; Zhu, Xueliang; Dou, Yongxi; Cai, Xuepeng
2014-04-01
Wnt proteins are a family of secreted glycoproteins that are evolutionarily conserved and considered to be involved in extensive developmental processes in metazoan organisms. The characterization of wnt genes may improve understanding of the parasite's development. In the present study, a wnt4 gene encoding 491 amino acids was amplified from cDNA of metacestodes of Taenia solium using reverse transcription PCR (RT-PCR). Bioinformatics tools were used for sequence analysis. The conserved domain of the wnt gene family was predicted. The expression profile of Wnt4 was investigated using real-time PCR. Wnt4 expression was found to be dramatically increased in scolex evaginated cysticerci when compared to invaginated cysticerci. In situ hybridization showed that the wnt4 gene was distributed in the posterior end of the worm along the primary body axis in evaginated cysticerci. These findings indicate that wnt4 may take part in the process of cysticerci evagination and play a role in scolex/bladder development of cysticerci of T. solium.
Proteomic profiling of halloysite clay nanotube exposure in intestinal cell co-culture
Lai, Xianyin; Agarwal, Mangilal; Lvov, Yuri M.; Pachpande, Chetan; Varahramyan, Kody; Witzmann, Frank A.
2013-01-01
Halloysite is an aluminosilicate clay with a hollow tubular structure with nanoscale internal and external diameters. Assessment of halloysite biocompatibility has gained importance in view of its potential application in oral drug delivery. To investigate the effect of halloysite nanotubes on an in vitro model of the large intestine, Caco-2/HT29-MTX cells in monolayer co-culture were exposed to nanotubes for toxicity tests and proteomic analysis. Results indicate that halloysite exhibits a high degree of biocompatibility characterized by an absence of cytotoxicity, in spite of elevated pro-inflammatory cytokine release. Exposure-specific changes in expression were observed among 4081 proteins analyzed. Bioinformatic analysis of differentially expressed protein profiles suggests that halloysite stimulates processes related to cell growth and proliferation, subtle responses to cell infection, irritation and injury, enhanced antioxidant capability, and an overall adaptive response to exposure. These potentially relevant functional effects warrant further investigation in in vivo models and suggest that chronic or bolus occupational exposure to halloysite nanotubes may have unintended outcomes. PMID:23606564
Generalized Centroid Estimators in Bioinformatics
Hamada, Michiaki; Kiryu, Hisanori; Iwasaki, Wataru; Asai, Kiyoshi
2011-01-01
In a number of estimation problems in bioinformatics, accuracy measures of the target problem are usually given, and it is important to design estimators that are suited to those accuracy measures. However, there is often a discrepancy between an employed estimator and a given accuracy measure of the problem. In this study, we introduce a general class of efficient estimators for estimation problems on high-dimensional binary spaces, which represent many fundamental problems in bioinformatics. Theoretical analysis reveals that the proposed estimators generally fit commonly used accuracy measures (e.g. sensitivity, PPV, MCC and F-score), can be computed efficiently in many cases, and cover a wide range of problems in bioinformatics from the viewpoint of the principle of maximum expected accuracy (MEA). It is also shown that some important algorithms in bioinformatics can be interpreted in a unified manner. The concept presented in this paper not only gives a useful framework for designing MEA-based estimators but is also highly extendable and sheds new light on many problems in bioinformatics. PMID:21365017
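The maximum expected accuracy idea mentioned in this abstract can be sketched on an unconstrained binary space. Assume a gain of gamma for each true positive and 1 for each true negative, and assume the marginal probability p_i that position i is 1 is already known. Predicting 1 at position i then has expected gain gamma * p_i and predicting 0 has expected gain 1 - p_i, so the best choice is 1 exactly when p_i > 1 / (gamma + 1). The gain function, the parameter name gamma, and the absence of structural constraints are assumptions of this sketch, not details taken from the abstract.

```python
from itertools import product


def gamma_centroid(marginals, gamma=1.0):
    """Predict 1 wherever the marginal probability exceeds 1 / (gamma + 1)."""
    threshold = 1.0 / (gamma + 1.0)
    return [1 if p > threshold else 0 for p in marginals]


def expected_gain(prediction, marginals, gamma=1.0):
    """Expected gain: gamma per expected true positive, 1 per expected true negative."""
    return sum(
        gamma * p if y == 1 else (1.0 - p)
        for y, p in zip(prediction, marginals)
    )


def brute_force_best_gain(marginals, gamma=1.0):
    """Best expected gain over all binary predictions (small inputs only)."""
    return max(
        expected_gain(candidate, marginals, gamma)
        for candidate in product([0, 1], repeat=len(marginals))
    )
```

A larger gamma lowers the threshold, so more positions are predicted as 1; this trades positive predictive value for sensitivity. The brute-force helper is only there to check the threshold rule on tiny inputs.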
Kalra, Shikha; Puniya, Bhanwar Lal; Kulshreshtha, Deepika; Kumar, Sunil; Kaur, Jagdeep; Ramachandran, Srinivasan; Singh, Kashmir
2013-01-01
Chlorophytum borivilianum, an endangered medicinal plant species, is highly recognized for its aphrodisiac properties provided by saponins present in the plant. The transcriptome information of this species is limited and only a few hundred expressed sequence tags (ESTs) are available in the public databases. To gain molecular insight into this plant, high throughput transcriptome sequencing of leaf RNA was carried out using Illumina's HiSeq 2000 sequencing platform. A total of 22,161,444 single end reads were retrieved after quality filtering. Available (e.g., De-Bruijn/Eulerian graph) and in-house developed bioinformatics tools were used for assembly and annotation of the transcriptome. A total of 101,141 assembled transcripts were obtained, with coverage size of 22.42 Mb and average length of 221 bp. Guanine-cytosine (GC) content was found to be 44%. Bioinformatics analysis, using non-redundant proteins, Gene Ontology (GO), Enzyme Commission (EC) and Kyoto Encyclopedia of Genes and Genomes (KEGG) databases, extracted all the known enzymes involved in saponin and flavonoid biosynthesis. A few genes of alkaloid biosynthesis, along with anticancer and plant defense genes, were also discovered. Additionally, several cytochrome P450 (CYP450) and glycosyltransferase unique sequences were also found. We identified simple sequence repeat motifs in transcripts with an abundance of di-nucleotide simple sequence repeat (SSR; 43.1%) markers. Large scale expression profiling through Reads per Kilobase per Million mapped reads (RPKM) showed major genes involved in different metabolic pathways of the plant. Genes, expressed sequence tags (ESTs) and unique sequences from this study provide an important resource for the scientific community interested in the molecular genetics and functional genomics of C. borivilianum. PMID:24376689
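Two of the quantities named in this abstract, RPKM and GC content, follow from simple formulas. The sketch below uses the standard RPKM definition (reads, scaled per kilobase of transcript and per million mapped reads); the function names and the example numbers are hypothetical and are not taken from the study.

```python
def rpkm(read_count, transcript_length_bp, total_mapped_reads):
    """Reads per kilobase of transcript per million mapped reads."""
    return read_count * 1e9 / (transcript_length_bp * total_mapped_reads)


def gc_content(sequence):
    """Fraction of G and C bases in a nucleotide sequence (0.0 for empty input)."""
    sequence = sequence.upper()
    if not sequence:
        return 0.0
    return (sequence.count("G") + sequence.count("C")) / len(sequence)
```

For example, 100 reads on a 1,000 bp transcript in a library of 1,000,000 mapped reads gives an RPKM of 100, because both the length and the library size scale factors equal one.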
Fish oil improves gene targets of Down syndrome in C57BL and BALB/c mice.
Zmijewski, Peter A; Gao, Linda Y; Saxena, Abhinav R; Chavannes, Nastacia K; Hushmendy, Shazaan F; Bhoiwala, Devang L; Crawford, Dana R
2015-05-01
We have considered a novel gene targeting approach for treating pathologies and conditions whose genetic bases are defined using diet and nutrition. One such condition is Down syndrome, which is linked to overexpression of RCAN1 on human chromosome 21 for some phenotypes. We hypothesize that a decrease in RCAN1 expression with dietary supplements in individuals with Down syndrome represents a potential treatment. Toward this, we used in vivo studies and bioinformatic analysis to identify potential healthy dietary RCAN1 expression modulators. We observed Rcan1 isoform 1 (Rcan1-1) protein reduction in mice pup hippocampus after a 4-week curcumin and fish oil supplementation, with only fish oil reduction being statistically significant. Focusing on fish oil, we observed a 17% Rcan1-1 messenger RNA (mRNA) and 19% Rcan1-1 protein reduction in BALB/c mice after 5 weeks of fish oil supplementation. Fish oil supplementation starting at conception and in a different mouse strain (C57BL) led to a 27% reduction in hippocampal Rcan1-1 mRNA and a 34% reduction in spleen Rcan1-1 mRNA at 6 weeks of age. Hippocampal protein results revealed a modest 11% reduction in RCAN1-1, suggesting translational compensation. Bioinformatic mining of human fish oil studies also revealed reduced RCAN1 mRNA expression, consistent with the above studies. These results suggest the potential use of fish oil in treating Down syndrome and support our strategy of using select healthy dietary agents to treat genetically defined pathologies, an approach that we believe is simple, healthy, and cost-effective. Copyright © 2015 Elsevier Inc. All rights reserved.
Identification of potential target genes of ROR-alpha in THP1 and HUVEC cell lines.
Gulec, Cagri; Coban, Neslihan; Ozsait-Selcuk, Bilge; Sirma-Ekmekci, Sema; Yildirim, Ozlem; Erginel-Unaltuna, Nihan
2017-04-01
ROR-alpha is a nuclear receptor whose activity can be modulated by natural or synthetic ligands. Due to its possible involvement in atherosclerosis and its potential as a therapeutic target, we aimed to identify ROR-alpha target genes in monocytic and endothelial cell lines. We performed chromatin immunoprecipitation (ChIP) followed by tiling array (ChIP-on-chip) for ROR-alpha in monocytic cell line THP1 and endothelial cell line HUVEC. Following bioinformatic analysis of the array data, we tested four candidate genes in terms of dependence of their expression level on ligand-mediated ROR-alpha activity, and two of them in terms of promoter occupancy by ROR-alpha. Bioinformatic analyses of ChIP-on-chip data suggested that ROR-alpha binds to genomic regions near the transcription start site (TSS) of more than 3000 genes in THP1 and HUVEC. Potential ROR-alpha target genes in both cell types seem to be involved mainly in membrane receptor activity, signal transduction and ion transport. While SPP1 and IKBKA were shown to be direct target genes of ROR-alpha in THP1 monocytes, inflammation related gene HMOX1 and heat shock protein gene HSPA8 were shown to be potential target genes of ROR-alpha. Our results suggest that ROR-alpha may regulate signaling receptor activity and transmembrane transport activity through its potential target genes. ROR-alpha also seems to play a role in cellular sensitivity to environmental substances like arsenite and chloroprene. Although the expression analyses have shown that synthetic ROR-alpha ligands can modulate some of the potential ROR-alpha target genes, the functional significance of ligand-dependent modulation of gene expression needs to be confirmed with further analyses. Copyright © 2017 Elsevier Inc. All rights reserved.
Yang, Yabo; Hu, Dong; Wang, Lexun; Liang, Chi; Hu, Xuchu; Xu, Jin; Huang, Yan; Yu, Xinbing
2014-01-01
Clonorchiasis, which has been an important public health problem in China, is caused by ingestion of raw or undercooked fish contaminated by live metacercaria. Therefore, preventing infection of fish is of great significance for controlling the disease. SERPINs (serine protease inhibitors) are well known as negative regulators of hemostasis, thrombolysis, and innate immune responses. In the present study, two full-length sequences encoding SERPIN were identified from a metacercaria cDNA library of Clonorchis sinensis (C. sinensis) and were denominated as CsSERPIN and CsSERPIN3, respectively. Bioinformatics analysis showed that the two sequences share 35.9% identity with each other. Both sequences have a SERPIN domain, and the greatest difference between the two domains is the reactive centre loop. A transmembrane region was found in CsSERPIN3 but not in CsSERPIN. The expression of the two CsSERPINs was significantly higher at the metacercaria life stage than at the adult stage. The transcription levels of CsSERPIN and CsSERPIN3 at the metacercaria stage were 3.249- and 11.314-fold of those at the adult stage, respectively. Furthermore, the expression of CsSERPIN was 4.32-fold of that of CsSERPIN3 at the metacercaria stage. Immunobiochemistry revealed that CsSERPIN was dispersed at the subtegument and oral sucker of metacercaria, while CsSERPIN3 localized intensely in the tegument of metacercaria of C. sinensis inside the cyst wall. All these results indicate that the CsSERPINs play important roles at the metacercaria stage of the parasite. CsSERPIN may take part in regulation of endogenous serine proteinase, and CsSERPIN3 may be involved in immune evasion and be a potential candidate for vaccine and drug target for clonorchiasis. PMID:24831344
Myocyte enhancer factor 2D provides a cross-talk between chronic inflammation and lung cancer.
Zhu, Hai-Xing; Shi, Lin; Zhang, Yong; Zhu, Yi-Chun; Bai, Chun-Xue; Wang, Xiang-Dong; Zhou, Jie-Bai
2017-03-24
Lung cancer is the leading cause of cancer-related morbidity and mortality worldwide. Patients with chronic respiratory diseases, such as chronic obstructive pulmonary disease (COPD), are exposed to a higher risk of developing lung cancer. Chronic inflammation may play an important role in lung carcinogenesis among those patients. The present study aimed at identifying a candidate biomarker predicting lung cancer risk among patients with chronic respiratory diseases. We applied clinical bioinformatics tools to analyze different gene profile datasets with a special focus on screening for a potential biomarker during the chronic inflammation-lung cancer transition. Then we adopted an in vitro model based on LPS-challenged A549 cells to validate the biomarker through RNA-sequencing, quantitative real time polymerase chain reaction, and western blot analysis. Bioinformatics analyses of the 16 enrolled GSE datasets from the Gene Expression Omnibus online database showed that the myocyte enhancer factor 2D (MEF2D) level was significantly increased in COPD patients with coexisting non-small-cell lung carcinoma (NSCLC). Inflammation challenge increased MEF2D expression in NSCLC cell line A549, in association with the severity of inflammation. Extracellular signal-regulated protein kinase inhibition could reverse the up-regulation of MEF2D in inflammation-activated A549. MEF2D played a critical role in NSCLC cell bio-behaviors, including proliferation, differentiation, and movement. Inflammatory conditions led to increased MEF2D expression, which might further contribute to the development of lung cancer through influencing cancer microenvironment and cell bio-behaviors. MEF2D might be a potential biomarker during the chronic inflammation-lung cancer transition, predicting the risk of lung cancer among patients with chronic respiratory diseases.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Gulec, Cagri, E-mail: cagri.gulec@gmail.com; Coban, Neslihan, E-mail: neslic@istanbul.edu.tr; Ozsait-Selcuk, Bilge, E-mail: ozsaitb@istanbul.edu.tr
ROR-alpha is a nuclear receptor whose activity can be modulated by natural or synthetic ligands. Due to its possible involvement in atherosclerosis and its potential as a therapeutic target, we aimed to identify ROR-alpha target genes in monocytic and endothelial cell lines. We performed chromatin immunoprecipitation (ChIP) followed by tiling array (ChIP-on-chip) for ROR-alpha in monocytic cell line THP1 and endothelial cell line HUVEC. Following bioinformatic analysis of the array data, we tested four candidate genes in terms of dependence of their expression level on ligand-mediated ROR-alpha activity, and two of them in terms of promoter occupancy by ROR-alpha. Bioinformatic analyses of ChIP-on-chip data suggested that ROR-alpha binds to genomic regions near the transcription start site (TSS) of more than 3000 genes in THP1 and HUVEC. Potential ROR-alpha target genes in both cell types seem to be involved mainly in membrane receptor activity, signal transduction and ion transport. While SPP1 and IKBKA were shown to be direct target genes of ROR-alpha in THP1 monocytes, inflammation related gene HMOX1 and heat shock protein gene HSPA8 were shown to be potential target genes of ROR-alpha. Our results suggest that ROR-alpha may regulate signaling receptor activity and transmembrane transport activity through its potential target genes. ROR-alpha also seems to play a role in cellular sensitivity to environmental substances like arsenite and chloroprene. Although the expression analyses have shown that synthetic ROR-alpha ligands can modulate some of the potential ROR-alpha target genes, the functional significance of ligand-dependent modulation of gene expression needs to be confirmed with further analyses.
Pap, Domonkos; Sziksz, Erna; Kiss, Zoltán; Rokonay, Réka; Veres-Székely, Apor; Lippai, Rita; Takács, István Márton; Kis, Éva; Fekete, Andrea; Reusz, György; Szabó, Attila J; Vannay, Adam
2017-01-01
Congenital obstructive nephropathy (CON) is the main cause of pediatric chronic kidney diseases leading to renal fibrosis. The high morbidity and limited treatment opportunities of CON call for a better understanding of the underlying molecular mechanisms. To identify the differentially expressed genes, microarray analysis was performed on kidney samples of neonatal rats that underwent unilateral ureteral obstruction (UUO). Microarray results were then validated by real-time RT-PCR, and bioinformatics analysis was carried out to identify the relevant genes, functional groups and pathways involved in the pathomechanism of CON. Renal expression of matrix metalloproteinase (MMP)-12 and interleukin (IL)-24 was evaluated by real-time RT-PCR, flow cytometry and immunohistochemical analysis. The effect of the main profibrotic factors on the expression of MMP-12 and IL-24 was investigated in HK-2 and HEK-293 cell lines. Finally, the effect of IL-24 treatment on the expression of pro-inflammatory cytokines and MMPs was tested in vitro. Microarray analysis revealed 880 transcripts showing >2.0-fold change following UUO, enriched mainly in immune response related processes. The most up-regulated genes were MMPs and members of the IL-20 cytokine subfamily, including MMP-3, MMP-7, MMP-12, IL-19 and IL-24. We found that while TGF-β treatment inhibits the expression of MMP-12 and IL-24, H2O2 or PDGF-B treatment induces the epithelial expression of MMP-12. We demonstrated that IL-24 treatment decreases the expression of IL-6 and MMP-3 in renal epithelial cells. This study provides an extensive view of UUO-induced changes in the gene expression profile of the developing kidney and describes novel molecules that may play a significant role in the pathomechanism of CON. © 2017 The Author(s). Published by S. Karger AG, Basel.
Li, Xiaohui; Han, Xingtao; Yang, Jinhui; Sun, Jiantao; Wei, Pengtao
2018-01-01
Objective To observe the effect of microRNA-519d-3p (miR-519d-3p) on the proliferation of prostate cancer cells and explore the possible molecular mechanism. Methods The expression level of miR-519d-3p in PC-3, DU-145, 22RV1, PC-3M, LNCaP human prostate cancer cells and RWPE-1 human normal prostate epithelial cells was detected by real-time quantitative PCR. miR-519d-3p mimics or negative control microRNAs (miR-NC) were transfected into the prostate cancer cells with the lowest level of miR-519d-3p expression. Transfection efficiency was examined. The effect of miR-519d-3p on the cell cycle of prostate cancer cells was detected by flow cytometry. MTT assay and plate clone formation assay were used to detect its effect on the proliferation of prostate cancer cells. Bioinformatics software was used to predict, and a dual luciferase reporter assay was used to validate, the target gene of miR-519d-3p. Real-time quantitative PCR was used to detect the expression of the miR-519d-3p target gene. Western blot analysis was used to detect the expression of the target gene protein and downstream proteins. Results The expression of miR-519d-3p in normal prostate epithelial cells was significantly higher than that in prostate cancer cells, and the lowest was found in DU-145 cells. After transfection with miR-519d-3p mimics, the expression level of miR-519d-3p in DU-145 cells increased significantly. Bioinformatics prediction and the dual luciferase reporter assay confirmed that tumor necrosis factor receptor associated factor 4 (TRAF4) was the target gene of miR-519d-3p. Overexpression of miR-519d-3p significantly reduced the expression of the TRAF4 gene and its downstream TGF-β signaling pathway proteins in the prostate cancer cells. Conclusion The expression of miR-519d-3p is down-regulated in prostate cancer cells. Overexpression of miR-519d-3p can inhibit the proliferation of prostate cancer cells. The possible mechanism is that miR-519d-3p inhibits the expression of TRAF4.
RNA-Seq Analysis to Measure the Expression of SINE Retroelements.
Román, Ángel Carlos; Morales-Hernández, Antonio; Fernández-Salguero, Pedro M
2016-01-01
The intrinsic features of retroelements, like their repetitive nature and disseminated presence in their host genomes, demand the use of advanced methodologies for their bioinformatic and functional study. The short length of SINE (short interspersed elements) retrotransposons makes such analyses even more complex. Next-generation sequencing (NGS) technologies are currently one of the most widely used tools to characterize the whole repertoire of gene expression in a specific tissue. In this chapter, we will review the molecular and computational methods needed to perform NGS analyses on SINE elements. We will also describe new methods of potential interest for researchers studying repetitive elements. We intend to outline the general ideas behind the computational analyses of NGS data obtained from SINE elements, and to stimulate other scientists to expand our current knowledge on SINE biology using RNA-seq and other NGS tools.
The use of open source bioinformatics tools to dissect transcriptomic data.
Nitsche, Benjamin M; Ram, Arthur F J; Meyer, Vera
2012-01-01
Microarrays are a valuable technology to study fungal physiology on a transcriptomic level. Various microarray platforms are available comprising both single and two channel arrays. Despite different technologies, preprocessing of microarray data generally includes quality control, background correction, normalization, and summarization of probe level data. Subsequently, depending on the experimental design, diverse statistical analyses can be performed, including the identification of differentially expressed genes and the construction of gene coexpression networks. We describe how Bioconductor, a collection of open source and open development packages for the statistical programming language R, can be used for dissecting microarray data. We provide fundamental details that facilitate the process of getting started with R and Bioconductor. Using two publicly available microarray datasets from Aspergillus niger, we give detailed protocols on how to identify differentially expressed genes and how to construct gene coexpression networks.
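Of the preprocessing steps listed in this abstract, normalization is the easiest to illustrate in isolation. The sketch below shows quantile normalization, one commonly used normalization scheme; the abstract does not say which method the protocols use, and this plain-Python version is a stand-in for illustration rather than the Bioconductor implementation. The function name is hypothetical and ties are handled naively.

```python
def quantile_normalize(columns):
    """Force every array (column) to share the same value distribution.

    columns: list of equal-length lists, one list of intensities per array.
    Returns new lists; the k-th smallest value of each array is replaced by
    the mean of the k-th smallest values across all arrays.
    """
    n_probes = len(columns[0])
    if any(len(col) != n_probes for col in columns):
        raise ValueError("all arrays must have the same number of probes")

    sorted_cols = [sorted(col) for col in columns]
    rank_means = [
        sum(sc[k] for sc in sorted_cols) / len(columns)
        for k in range(n_probes)
    ]

    normalized = []
    for col in columns:
        # Probe indices ordered from lowest to highest intensity.
        order = sorted(range(n_probes), key=lambda i: col[i])
        out = [0.0] * n_probes
        for rank, probe_index in enumerate(order):
            out[probe_index] = rank_means[rank]
        normalized.append(out)
    return normalized
```

After this step every array contains exactly the same set of values, only assigned to different probes, which removes array-wide intensity differences before differential expression testing.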
Philipp, E E R; Kraemer, L; Mountfort, D; Schilhabel, M; Schreiber, S; Rosenstiel, P
2012-03-15
Next generation sequencing (NGS) technologies allow a rapid and cost-effective compilation of large RNA sequence datasets in model and non-model organisms. However, the storage and analysis of transcriptome information from different NGS platforms is still a significant bottleneck, leading to a delay in data dissemination and subsequent biological understanding. Especially database interfaces with transcriptome analysis modules going beyond mere read counts are missing. Here, we present the Transcriptome Analysis and Comparison Explorer (T-ACE), a tool designed for the organization and analysis of large sequence datasets, and especially suited for transcriptome projects of non-model organisms with little or no a priori sequence information. T-ACE offers a TCL-based interface, which accesses a PostgreSQL database via a php-script. Within T-ACE, information belonging to single sequences or contigs, such as annotation or read coverage, is linked to the respective sequence and immediately accessible. Sequences and assigned information can be searched via keyword- or BLAST-search. Additionally, T-ACE provides within and between transcriptome analysis modules on the level of expression, GO terms, KEGG pathways and protein domains. Results are visualized and can be easily exported for external analysis. We developed T-ACE for laboratory environments, which have only a limited amount of bioinformatics support, and for collaborative projects in which different partners work on the same dataset from different locations or platforms (Windows/Linux/MacOS). For laboratories with some experience in bioinformatics and programming, the low complexity of the database structure and open-source code provides a framework that can be customized according to the different needs of the user and transcriptome project.
Expression and Bioinformatics Analysis of Pectate Lyase Gene from Bacillus subtilis521
NASA Astrophysics Data System (ADS)
Xiao, Jing; Lu, Fu-Ping; Li, Yu; Li, Jin-Ting
In order to exploit new genetic resources, the pectate lyase (PEL) gene was amplified by PCR using genomic DNA from an alkaline Bacillus subtilis521. The PCR product was inserted into the pET22b(+) vector. The recombinant plasmids were cloned in E. coli DH5α and then expressed in E. coli BL21. When cultured in the optimized medium, the positive clones E. coli BL21(pET22b(+)pel) showed intracellular pectate lyase activity of 90.0 U/mL, indicating that the correct PEL gene had been obtained. The pel gene has an open reading frame of 1263 nucleotides and codes for a product of 420 amino acids with a calculated molecular mass of 45.5 kD. Computer-assisted analysis revealed a signal peptide and two conserved domains. Sequence analysis showed that PEL shares 26-82% homology with sequences from other strains in GenBank. In addition, the higher-order structure of PEL was predicted and analysed. This study will aid the experimental design of PEL fermentation, production, purification and enzyme evolution.
Marzano, Valeria; Santini, Simonetta; Rossi, Claudia; Zucchelli, Mirco; D'Alessandro, Annamaria; Marchetti, Carlo; Mingardi, Michele; Stagni, Venturina; Barilà, Daniela; Urbani, Andrea
2012-01-01
Ataxia Telangiectasia Mutated (ATM) protein kinase is a key effector in the modulation of the functionality of some important stress responses, including DNA damage and oxidative stress response, and its deficiency is the hallmark of Ataxia Telangiectasia (A-T), a rare genetic disorder. ATM modulates the activity of hundreds of target proteins, essential for the correct balance between proliferation and cell death. The aim of this study is to evaluate the phenotypic adaptation at the protein level both in basal condition and in presence of proteasome blockage in order to identify the molecules whose level and stability are modulated through ATM expression. We pursued a comparative analysis of ATM deficient and proficient lymphoblastoid cells by label-free shotgun proteomic experiments comparing the panel of proteins differentially expressed. Through a non-supervised comparative bioinformatic analysis these data provided an insight on the functional role of ATM deficiency in cellular carbohydrate metabolism's regulation. This hypothesis has been demonstrated by targeted metabolic fingerprint analysis SRM (Selected Reaction Monitoring) on specific thermodynamic checkpoints of glycolysis. This article is part of a Special Issue entitled: Translational Proteomics. PMID:22641158
Ji, Xiaoyu; Liu, Xiaoqiang; Peng, Yuanxia; Zhan, Ruoting; Xu, Hui; Ge, Xijin
2017-12-09
Emodin has strong antibacterial activity, including against methicillin-resistant Staphylococcus aureus (MRSA). However, the mechanism by which emodin induces growth inhibition against MRSA remains unclear. In this study, the isobaric tags for relative and absolute quantitation (iTRAQ) proteomics approach was used to investigate the modes of action of emodin on an MRSA isolate and methicillin-sensitive S. aureus ATCC29213 (MSSA). Proteomic analysis showed that expression levels of 145 and 122 proteins were changed significantly in MRSA and MSSA, respectively, after emodin treatment. Comparative analysis of the functions of differentially expressed proteins between the two strains was performed via the bioinformatics tool blast2go and the STRING database. Proteins related to pyruvate pathway imbalance induction, protein synthesis inhibition, and DNA synthesis suppression were found in both methicillin-sensitive and resistant strains. Moreover, interference proteins related to a membrane damage mechanism were also observed in MRSA. Our findings indicate that emodin is a potential antibacterial agent targeting MRSA via multiple mechanisms. Copyright © 2017 Elsevier Inc. All rights reserved.
Interoperability of GADU in using heterogeneous Grid resources for bioinformatics applications.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Sulakhe, D.; Rodriguez, A.; Wilde, M.
2008-03-01
Bioinformatics tools used for efficient and computationally intensive analysis of genetic sequences require large-scale computational resources to accommodate the growing data. Grid computational resources such as the Open Science Grid and TeraGrid have proved useful for scientific discovery. The genome analysis and database update system (GADU) is a high-throughput computational system developed to automate the steps involved in accessing the Grid resources for running bioinformatics applications. This paper describes the requirements for building an automated scalable system such as GADU that can run jobs on different Grids. The paper describes the resource-independent configuration of GADU using the Pegasus-based virtual data system that makes high-throughput computational tools interoperable on heterogeneous Grid resources. The paper also highlights the features implemented to make GADU a gateway to computationally intensive bioinformatics applications on the Grid. The paper will not go into the details of problems involved or the lessons learned in using individual Grid resources as it has already been published in our paper on genome analysis research environment (GNARE) and will focus primarily on the architecture that makes GADU resource independent and interoperable across heterogeneous Grid resources.
Anslan, Sten; Bahram, Mohammad; Hiiesalu, Indrek; Tedersoo, Leho
2017-11-01
High-throughput sequencing methods have become a routine analysis tool in environmental sciences as well as in the public and private sectors. These methods provide vast amounts of data, which need to be analysed in several steps. Although the bioinformatics analysis can be performed with several public tools, many analytical pipelines allow too few options for the optimal analysis of more complicated or customized designs. Here, we introduce PipeCraft, a flexible and handy bioinformatics pipeline with a user-friendly graphical interface that links several public tools for analysing amplicon sequencing data. Users are able to customize the pipeline by selecting the most suitable tools and options to process raw sequences from Illumina, Pacific Biosciences, Ion Torrent and Roche 454 sequencing platforms. We described the design and options of PipeCraft and evaluated its performance by analysing the data sets from three different sequencing platforms. We demonstrated that PipeCraft is able to process large data sets within 24 hr. The graphical user interface and the automated links between various bioinformatics tools enable easy customization of the workflow. All analytical steps and options are recorded in log files and are easily traceable. © 2017 John Wiley & Sons Ltd.
Survey of MapReduce frame operation in bioinformatics.
Zou, Quan; Li, Xu-Bin; Jiang, Wen-Rui; Lin, Zi-Yu; Li, Gui-Lin; Chen, Ke
2014-07-01
Bioinformatics is challenged by the fact that traditional analysis tools have difficulty in processing large-scale data from high-throughput sequencing. The open source Apache Hadoop project, which adopts the MapReduce framework and a distributed file system, has recently given bioinformatics researchers an opportunity to achieve scalable, efficient and reliable computing performance on Linux clusters and on cloud computing services. In this article, we present MapReduce frame-based applications that can be employed in the next-generation sequencing and other biological domains. In addition, we discuss the challenges faced by this field as well as the future works on parallel computing in bioinformatics. © The Author 2013. Published by Oxford University Press. For Permissions, please email: journals.permissions@oup.com.
A Scientific Software Product Line for the Bioinformatics domain.
Costa, Gabriella Castro B; Braga, Regina; David, José Maria N; Campos, Fernanda
2015-08-01
Most specialized users (scientists) that use bioinformatics applications do not have suitable training on software development. Software Product Line (SPL) employs the concept of reuse considering that it is defined as a set of systems that are developed from a common set of base artifacts. In some contexts, such as in bioinformatics applications, it is advantageous to develop a collection of related software products, using SPL approach. If software products are similar enough, there is the possibility of predicting their commonalities, differences and then reuse these common features to support the development of new applications in the bioinformatics area. This paper presents the PL-Science approach which considers the context of SPL and ontology in order to assist scientists to define a scientific experiment, and to specify a workflow that encompasses bioinformatics applications of a given experiment. This paper also focuses on the use of ontologies to enable the use of Software Product Line in biological domains. In the context of this paper, Scientific Software Product Line (SSPL) differs from the Software Product Line due to the fact that SSPL uses an abstract scientific workflow model. This workflow is defined according to a scientific domain and using this abstract workflow model the products (scientific applications/algorithms) are instantiated. Through the use of ontology as a knowledge representation model, we can provide domain restrictions as well as add semantic aspects in order to facilitate the selection and organization of bioinformatics workflows in a Scientific Software Product Line. The use of ontologies enables not only the expression of formal restrictions but also the inferences on these restrictions, considering that a scientific domain needs a formal specification. This paper presents the development of the PL-Science approach, encompassing a methodology and an infrastructure, and also presents an approach evaluation. 
This evaluation presents case studies in bioinformatics, which were conducted in two renowned research institutions in Brazil. Copyright © 2015 Elsevier Inc. All rights reserved.
CHARACTERIZING THE ROLE OF THE NELL1 GENE IN CARDIOVASCULAR DEVELOPMENT
DOE Office of Scientific and Technical Information (OSTI.GOV)
Liu, L. Y.; Culiat, C.
Nell1^6R is a chemically-induced point mutation in a novel cell-signaling gene, Nell1, which results in truncation of the protein and degradation of the Nell1^6R transcript. Earlier studies revealed that loss of Nell1 function reduces expression of numerous extracellular matrix (ECM) proteins required for differentiation of bone and cartilage precursor cells, thereby causing severe skull and spinal defects. Since skeletal and cardiovascular development are closely linked biological processes, this research focused on: a) examining Nell1^6R mutant mice for cardiovascular defects, b) determining Nell1 expression in fetal and adult hearts, and c) establishing how ECM genes affected by Nell1 influence heart development. Structural heart defects in Nell1^6R mutant fetuses were analyzed by heart length and width measurements and standard histological methods (haematoxylin and eosin staining). Nell1 expression was assayed in fetal and adult hearts using reverse transcription polymerase chain reaction (RT-PCR). A comprehensive bioinformatics analysis using public databases (Stanford SOURCE Search, Integrated Cartilage Gene Database, Mouse Genome Informatics, and NCBI UniGene) was undertaken to investigate the relationship between cardiovascular development and each of twenty-eight genes affected by Nell1. Nell1-deficient mice have significantly enlarged hearts (particularly the heart width), dramatically reduced blood flow out of the heart and unexpanded lungs. Isolation of total RNAs from hearts of adult (control and heterozygote) and fetal (control and homozygous mutant) mice has been completed and RT-PCR assays are in progress. The bioinformatics analysis showed that the majority of genes with reduced expression in Nell1-deficient mice are normally expressed in the heart (79%; 22/28), blood vessels (71%; 20/28) and bone marrow (61%; 17/28). Moreover, mouse mutations in seven of these genes (Col15a1, Osf-2, Bmpr1a, Pkd1, Mfge8, Ptger4, Col5a1) manifest abnormalities in cardiovascular development. These data demonstrate for the first time that Nell1 has a role in early mammalian cardiovascular development, mediated by its regulation of ECM proteins necessary for normal cell growth and differentiation. In addition, understanding the mechanisms by which Nell1 and its associated ECM genes affect the cardiovascular system can provide future strategies for the treatment of heart and blood vessel defects.
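The tissue proportions quoted in this abstract can be re-derived from the stated gene counts. The snippet below is only an arithmetic check; the variable names are illustrative and are not taken from the report.

```python
# Counts as quoted in the abstract: genes (out of 28 affected by Nell1)
# that are normally expressed in each tissue.
affected_genes = 28
expressed_in = {"heart": 22, "blood vessels": 20, "bone marrow": 17}

# Convert each count to a whole-number percentage of the 28 genes.
percentages = {
    tissue: round(100 * count / affected_genes)
    for tissue, count in expressed_in.items()
}

for tissue, pct in percentages.items():
    print(f"{tissue}: {pct}% ({expressed_in[tissue]}/{affected_genes})")
```

Rounded to whole numbers, the results agree with the 79%, 71% and 61% given in the text.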
Analysis of gene expression profile microarray data in complex regional pain syndrome.
Tan, Wulin; Song, Yiyan; Mo, Chengqiang; Jiang, Shuangjian; Wang, Zhongxing
2017-09-01
The aim of the present study was to predict key genes and proteins associated with complex regional pain syndrome (CRPS) using bioinformatics analysis. The gene expression profiling microarray data, GSE47603, which included peripheral blood samples from 4 patients with CRPS and 5 healthy controls, was obtained from the Gene Expression Omnibus (GEO) database. The differentially expressed genes (DEGs) in CRPS patients compared with healthy controls were identified using the GEO2R online tool. Functional enrichment analysis was then performed using The Database for Annotation Visualization and Integrated Discovery online tool. Protein‑protein interaction (PPI) network analysis was subsequently performed using Search Tool for the Retrieval of Interaction Genes database and analyzed with Cytoscape software. A total of 257 DEGs were identified, including 243 upregulated genes and 14 downregulated ones. Genes in the human leukocyte antigen (HLA) family were most significantly differentially expressed. Enrichment analysis demonstrated that signaling pathways, including immune response, cell motion, adhesion and angiogenesis were associated with CRPS. PPI network analysis revealed that key genes, including early region 1A binding protein p300 (EP300), CREB‑binding protein (CREBBP), signal transducer and activator of transcription (STAT)3, STAT5A and integrin α M were associated with CRPS. The results suggest that the immune response may therefore serve an important role in CRPS development. In addition, genes in the HLA family, such as HLA‑DQB1 and HLA‑DRB1, may present potential biomarkers for the diagnosis of CRPS. Furthermore, EP300, its paralog CREBBP, and the STAT family genes, STAT3 and STAT5 may be important in the development of CRPS.
Ning, Tongbo; Cui, Hao; Sun, Feng; Zou, Jidian
2017-09-05
Glioblastoma is one of the most aggressive malignant brain tumors, with high morbidity and mortality. Demethylation drugs have been developed for its treatment, but little efficacy has been observed. The purpose of this study was to screen therapeutic targets of demethylation drugs or bioactive molecules for glioblastoma through systematic bioinformatics analysis. We first downloaded genome-wide expression profiles from the Gene Expression Omnibus (GEO) and conducted the primary analysis in R, mainly including preprocessing of raw microarray data, conversion of probe IDs to gene symbols and identification of differentially expressed genes (DEGs). Second, functional enrichment analysis was conducted via the Database for Annotation, Visualization and Integrated Discovery (DAVID) to explore biological processes involved in the development of glioblastoma. Third, we constructed a protein-protein interaction (PPI) network of the genes of interest and conducted cross analysis of multiple datasets to obtain potential therapeutic targets for glioblastoma. Finally, we confirmed the therapeutic targets by real-time RT-PCR. As a result, biological processes related to cancer development, amino metabolism, immune response, etc. were found to be significantly enriched among genes that are differentially expressed in glioblastoma and regulated by 5'aza-dC. In addition, network and cross analysis identified ACAT2, UFC1 and CYB5R1 as novel therapeutic targets of demethylation drugs, which was also confirmed by real-time RT-PCR. In conclusion, our study identified several biological processes and genes that are involved in the development of glioblastoma and regulated by 5'aza-dC, which should be helpful for the treatment of glioblastoma. Copyright © 2017 Elsevier B.V. All rights reserved.
Collado-Romero, Melania; Aguilar, Carmen; Arce, Cristina; Lucena, Concepción; Codrea, Marius C.; Morera, Luis; Bendixen, Emoke; Moreno, Ángela; Garrido, Juan J.
2015-01-01
The enteropathogen Salmonella Typhimurium (S. Typhimurium) is the non-typhoidal serotype most commonly isolated from pigs worldwide. Currently, one of the main sources of human infection is the consumption of pork. Therefore, prevention and control of salmonellosis in pigs is crucial for minimizing risks to public health. The aim of the present study was to use isobaric tags for relative and absolute quantification (iTRAQ) to explore differences in the response to Salmonella in two segments of the porcine gut (ileum and colon) along a time course of 1, 2, and 6 days post infection (dpi) with S. Typhimurium. A total of 298 proteins were identified in the infected ileum samples, of which 112 displayed significant expression differences due to Salmonella infection. In colon, 184 proteins were detected in the infected samples, of which 46 were differentially expressed with respect to the controls. The highest number of changes in protein expression was quantified in ileum at 2 dpi. Further biological interpretation of the proteomics data using bioinformatics tools demonstrated that the expression changes in colon were found in proteins involved in cell death and survival, tissue morphology or molecular transport at the early stages and tissue regeneration at 6 dpi. In ileum, however, changes in protein expression were mainly related to immunological and infectious diseases, inflammatory response or connective tissue disorders at 1 and 2 dpi. iTRAQ has proved to be a robust proteomic approach, allowing us to identify ileum as the earliest response focus upon S. Typhimurium infection in the porcine gut. In addition, new functions involved in the response to bacteria, such as eIF2 signaling, free radical scavengers or antimicrobial peptide (AMP) expression, have been identified. Finally, the impairment of the enterohepatic circulation of bile acids and lipid metabolism by means of the downregulation of FABP6 protein and the FXR/RXR and LXR/RXR signaling pathways in ileum has been established for the first time in pigs. Taken together, our results provide a better understanding of the porcine response to Salmonella infection and the molecular mechanisms underlying Salmonella-host interactions. PMID:26389078
Progress and challenges in bioinformatics approaches for enhancer identification
Kleftogiannis, Dimitrios; Kalnis, Panos
2016-01-01
Enhancers are cis-acting DNA elements that play critical roles in distal regulation of gene expression. Identifying enhancers is an important step for understanding distinct gene expression programs that may reflect normal and pathogenic cellular conditions. Experimental identification of enhancers is constrained by the set of conditions used in the experiment. This requires multiple experiments to identify enhancers, as they can be active under specific cellular conditions but not in different cell types/tissues or cellular states. This has opened prospects for computational prediction methods that can be used for high-throughput identification of putative enhancers to complement experimental approaches. Potential functions and properties of predicted enhancers have been catalogued and summarized in several enhancer-oriented databases. Because the current methods for the computational prediction of enhancers produce significantly different enhancer predictions, it will be beneficial for the research community to have an overview of the strategies and solutions developed in this field. In this review, we focus on the identification and analysis of enhancers by bioinformatics approaches. First, we describe a general framework for computational identification of enhancers, present relevant data types and discuss possible computational solutions. Next, we cover over 30 existing computational enhancer identification methods that were developed since 2000. Our review highlights advantages, limitations and potentials, while suggesting pragmatic guidelines for development of more efficient computational enhancer prediction methods. Finally, we discuss challenges and open problems of this topic, which require further consideration. PMID:26634919
A case study of tuning MapReduce for efficient Bioinformatics in the cloud
DOE Office of Scientific and Technical Information (OSTI.GOV)
Shi, Lizhen; Wang, Zhong; Yu, Weikuan
The combination of the Hadoop MapReduce programming model and cloud computing allows biological scientists to analyze next-generation sequencing (NGS) data in a timely and cost-effective manner. Cloud computing platforms remove the burden of IT facility procurement and management from end users and provide ease of access to Hadoop clusters. However, biological scientists are still expected to choose appropriate Hadoop parameters for running their jobs. More importantly, the available Hadoop tuning guidelines are either obsolete or too general to capture the particular characteristics of bioinformatics applications. In this paper, we aim to minimize the cloud computing cost spent on bioinformatics data analysis by optimizing the extracted significant Hadoop parameters. When using MapReduce-based bioinformatics tools in the cloud, the default settings often lead to resource underutilization and wasteful expenses. We choose k-mer counting, a representative application used in a large number of NGS data analysis tools, as our study case. Experimental results show that, with the fine-tuned parameters, we achieve a total of 4× speedup compared with the original performance (using the default settings). Finally, this paper presents an exemplary case for tuning MapReduce-based bioinformatics applications in the cloud, and documents the key parameters that could lead to significant performance benefits.
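For readers unfamiliar with the study case named in this abstract, k-mer counting can be expressed as a map step followed by a reduce step. The sketch below is a minimal single-process illustration in plain Python, not Hadoop code and not the authors' implementation; the function names `map_kmers` and `reduce_counts` and the two toy reads are assumptions made for illustration.

```python
from collections import Counter
from itertools import chain


def map_kmers(read, k):
    """Map step: emit a (k-mer, 1) pair for every window of length k in one read."""
    return [(read[i:i + k], 1) for i in range(len(read) - k + 1)]


def reduce_counts(pairs):
    """Reduce step: sum the counts emitted for each distinct k-mer."""
    totals = Counter()
    for kmer, n in pairs:
        totals[kmer] += n
    return totals


# Toy input (hypothetical reads, not real sequencing data).
reads = ["ACGTAC", "GTACGT"]

# Apply the map step to each read independently, then reduce all emitted pairs.
counts = reduce_counts(chain.from_iterable(map_kmers(r, 3) for r in reads))
print(dict(counts))
```

In a real MapReduce job the map calls run in parallel across input splits and the framework groups pairs by key before the reduce step; settings that govern that parallelism and grouping are the kind of parameters the paper reports tuning.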
LGR5/GPR49 is implicated in motor neuron specification in nervous system.
Song, Shao-jun; Mao, Xing-gang; Wang, Chao; Han, An-guo; Yan, Ming; Xue, Xiao-yan
2015-01-01
The biological roles of stem cell marker LGR5, the receptor for the Wnt-agonistic R-spondins, for nervous system are poorly known. Bioinformatics analysis in normal human brain tissues revealed that LGR5 is closely related with neuron development and functions. Interestingly, LGR5 and its ligands R-spondins (RSPO2 and RSPO3) are specifically highly expressed in projection motor neurons in the spinal cord, brain stem and cerebral. Inhibition of Notch activity in neural stem cells (NSCs) increased the percentage of neuronal cells and promoted LGR5 expression, while activation of Notch signal decreased neuronal cells and inhibited the LGR5 expression. Furthermore, knockdown of LGR5 inhibited the expression of neuronal markers MAP2, NeuN, GAP43, SYP and CHRM3, and also reduced the expression of genes that program the identity of motor neurons, including Isl1, Lhx3, PHOX2A, TBX20 and NEUROG2. Our data demonstrated that LGR5 is highly expressed in motor neurons in nervous system and is involved in their development by regulating transcription factors that program motor neuron identity. Copyright © 2014 Elsevier Ireland Ltd. All rights reserved.
Trayhurn, Paul; Denyer, Gareth
2012-01-01
Microarray datasets are a rich source of information in nutritional investigation. Targeted mining of microarray data following initial, non-biased bioinformatic analysis can provide key insight into specific genes and metabolic processes of interest. Microarrays from human adipocytes were examined to explore the effects of macrophage secretions on the expression of the G-protein-coupled receptor (GPR) genes that encode fatty acid receptors/sensors. Exposure of the adipocytes to macrophage-conditioned medium for 4 or 24 h had no effect on GPR40 and GPR43 expression, but there was a marked stimulation of GPR84 expression (receptor for medium-chain fatty acids), the mRNA level increasing 13·5-fold at 24 h relative to unconditioned medium. Importantly, expression of GPR120, which encodes an n-3 PUFA receptor/sensor, was strongly inhibited by the conditioned medium (15-fold decrease in mRNA at 24 h). Macrophage secretions have major effects on the expression of fatty acid receptor/sensor genes in human adipocytes, which may lead to an augmentation of the inflammatory response in adipose tissue in obesity. PMID:25191551
A Guide for Designing and Analyzing RNA-Seq Data.
Chatterjee, Aniruddha; Ahn, Antonio; Rodger, Euan J; Stockwell, Peter A; Eccles, Michael R
2018-01-01
The identity of a cell or an organism is at least in part defined by its gene expression and therefore analyzing gene expression remains one of the most frequently performed experimental techniques in molecular biology. The development of the RNA-Sequencing (RNA-Seq) method allows an unprecedented opportunity to analyze expression of protein-coding, noncoding RNA and also de novo transcript assembly of a new species or organism. However, the planning and design of RNA-Seq experiments has important implications for addressing the desired biological question and maximizing the value of the data obtained. In addition, RNA-Seq generates a huge volume of data and accurate analysis of this data involves several different steps and choices of tools. This can be challenging and overwhelming, especially for bench scientists. In this chapter, we describe an entire workflow for performing RNA-Seq experiments. We describe critical aspects of wet lab experiments such as RNA isolation, library preparation and the initial design of an experiment. Further, we provide a step-by-step description of the bioinformatics workflow for different steps involved in RNA-Seq data analysis. This includes power calculations, setting up a computational environment, acquisition and processing of publicly available data if desired, quality control measures, preprocessing steps for the raw data, differential expression analysis, and data visualization. We particularly mention important considerations for each step to provide a guide for designing and analyzing RNA-Seq data.
Visualising "Junk" DNA through Bioinformatics
ERIC Educational Resources Information Center
Elwess, Nancy L.; Latourelle, Sandra M.; Cauthorn, Olivia
2005-01-01
One of the hottest areas of science today is the field in which biology, information technology, and computer science are merged into a single discipline called bioinformatics. This field enables the discovery and analysis of biological data, including nucleotide and amino acid sequences that are easily accessed through the use of computers. As…
Jha, Prabhash Kumar; Sahu, Anita; Prabhakar, Amit; Tyagi, Tarun; Chatterjee, Tathagata; Arvind, Prathima; Nair, Jiny; Gupta, Neha; Kumari, Babita; Nair, Velu; Bajaj, Nitin; Shanker, Jayashree; Sharma, Manish; Kumar, Bhuvnesh; Ashraf, Mohammad Zahid
2018-06-04
Venous thromboembolism (VTE), a multi-factorial disease, is the third most common cardiovascular disease. Established genetic and acquired risk factors are responsible for the onset of VTE. High altitude (HA) also poses as an additional risk factor, predisposing individuals to VTE; however, its molecular mechanism remains elusive. This study aimed to identify genes/pathways associated with the pathophysiology of deep vein thrombosis (DVT) at HA. Gene expression profiling of DVT patients, who developed the disease, either at sea level or at HA-DVT locations, resulted in differential expression of 378 and 875 genes, respectively. Gene expression profiles were subjected to bioinformatic analysis, followed by technical and biological validation of selected genes using quantitative reverse transcription-polymerase chain reaction. Both gene ontology and pathway analysis showed enrichment of genes involved in haemostasis and platelet activation in HA-DVT patients with the most relevant pathway being 'response to hypoxia'. Thus, given the environmental condition the differential expression of hypoxia-responsive genes (angiogenin, ribonuclease, RNase A family, 5; early growth response 1; lamin A; matrix metallopeptidase 14 [membrane-inserted]; neurofibromin 1; PDZ and LIM domain 1; procollagen-lysine 1, 2-oxoglutarate 5-dioxygenase 1; solute carrier family 6 [neurotransmitter transporter, serotonin], member 4; solute carrier family 9 [sodium/hydrogen exchanger], member 1; and TEK tyrosine kinase, endothelial) in HA-DVT could be a determining factor to understand the pathophysiology of DVT at HA. Schattauer GmbH Stuttgart.
Shi, Bing; Gao, Hongmin; Zhang, Tianyang; Cui, Qinghua
2016-01-01
Cigarette smoking is a worldwide habit and an important risk factor for cancer. It is known that cigarette smoking can change the expression of circulating microRNAs (miRNAs) in healthy middle-aged adults. However, it remains unclear whether cigarette smoking can change the levels of circulating miRNAs in young healthy smokers and whether cancer susceptibility differs between the two age groups. In this study, the miRNA expression profiles of 28 smokers and 12 non-smokers were determined using the Agilent human MicroRNA array. We further performed bioinformatics analysis of the differentially expressed miRNAs. The results showed that 35 miRNAs were differentially expressed in smokers: 24 were up-regulated and 11 were down-regulated. Functional enrichment analysis showed that the deregulated miRNAs are related to the immune system and hormone regulation. Strikingly, the up-regulated miRNAs are mostly associated with hematologic cancers, such as lymphoma and leukemia. By comparison, the up-regulated plasma miRNAs in middle-aged smokers are mostly associated with solid cancers, such as hepatocellular carcinoma and lung cancer, suggesting that smoking could have different influences on young adults and middle-aged adults. In conclusion, we identified the circulating miRNAs deregulated by cigarette smoking and revealed that the age-dependent deregulated miRNAs tend to be mainly involved in different types of human cancers. PMID:26943588
Fan, Sheng; Zhang, Dong; Xing, Libo; Qi, Siyan; Du, Lisha; Wu, Haiqin; Shao, Hongxia; Li, Youmei; Ma, Juanjuan; Han, Mingyu
2017-08-01
Although INDETERMINATE DOMAIN (IDD) genes encoding specific plant transcription factors have important roles in plant growth and development, little is known about apple IDD (MdIDD) genes and their potential functions in flower induction. In this study, we identified 20 putative IDD genes in apple and named them according to their chromosomal locations. All identified MdIDD genes shared a conserved IDD domain. A phylogenetic analysis separated MdIDDs and other plant IDD genes into four groups. Bioinformatic analysis of chemical characteristics, gene structure, and predicted protein-protein interactions demonstrated the functional and structural diversity of MdIDD genes. To further uncover their potential functions, we analyzed tandem duplications, synteny, and gene duplications, which indicated several paired homologs of IDD genes between apple and Arabidopsis. Additionally, genome duplications promoted the expansion and evolution of the MdIDD genes. Quantitative real-time PCR revealed that all the MdIDD genes showed distinct expression levels in five different tissues (stems, leaves, buds, flowers, and fruits). Furthermore, the expression levels of candidate MdIDD genes were investigated under various conditions, including GA treatment (which decreased the flowering rate), sugar treatment (which increased the flowering rate), alternate-bearing conditions, and two varieties with different flowering intensities. Some of them were affected by the exogenous treatments and showed different expression patterns. Additionally, expression changes under alternate-bearing conditions and between apple varieties with different flowering intensities indicated that they were also responsive to flower induction. Taken together, our comprehensive analysis provides valuable information for further analysis of IDD genes involved in flower induction.