Sample records for expression statistical analysis

  1. Evaluating the consistency of gene sets used in the analysis of bacterial gene expression data.

    PubMed

    Tintle, Nathan L; Sitarik, Alexandra; Boerema, Benjamin; Young, Kylie; Best, Aaron A; Dejongh, Matthew

    2012-08-08

    Statistical analyses of whole genome expression data require functional information about genes in order to yield meaningful biological conclusions. The Gene Ontology (GO) and Kyoto Encyclopedia of Genes and Genomes (KEGG) are common sources of functionally grouped gene sets. For bacteria, the SEED and MicrobesOnline provide alternative, complementary sources of gene sets. To date, no comprehensive evaluation of the data obtained from these resources has been performed. We define a series of gene set consistency metrics directly related to the most common classes of statistical analyses for gene expression data, and then perform a comprehensive analysis of 3581 Affymetrix® gene expression arrays across 17 diverse bacteria. We find that gene sets obtained from GO and KEGG demonstrate lower consistency than those obtained from the SEED and MicrobesOnline, regardless of gene set size. Despite the widespread use of GO and KEGG gene sets in bacterial gene expression data analysis, the SEED and MicrobesOnline provide more consistent sets for a wide variety of statistical analyses. Increased use of the SEED and MicrobesOnline gene sets in the analysis of bacterial gene expression data may improve statistical power and utility of expression data.

  2. Exploratory Visual Analysis of Statistical Results from Microarray Experiments Comparing High and Low Grade Glioma

    PubMed Central

    Reif, David M.; Israel, Mark A.; Moore, Jason H.

    2007-01-01

    The biological interpretation of gene expression microarray results is a daunting challenge. For complex diseases such as cancer, wherein the body of published research is extensive, the incorporation of expert knowledge provides a useful analytical framework. We have previously developed the Exploratory Visual Analysis (EVA) software for exploring data analysis results in the context of annotation information about each gene, as well as biologically relevant groups of genes. We present EVA as a flexible combination of statistics and biological annotation that provides a straightforward visual interface for the interpretation of microarray analyses of gene expression in the most commonly occurring class of brain tumors, glioma. We demonstrate the utility of EVA for the biological interpretation of statistical results by analyzing publicly available gene expression profiles of two important glial tumors. The results of a statistical comparison between 21 malignant, high-grade glioblastoma multiforme (GBM) tumors and 19 indolent, low-grade pilocytic astrocytomas were analyzed using EVA. By using EVA to examine the results of a relatively simple statistical analysis, we were able to identify tumor class-specific gene expression patterns having both statistical and biological significance. Our interactive analysis highlighted the potential importance of genes involved in cell cycle progression, proliferation, signaling, adhesion, migration, motility, and structure, as well as candidate gene loci on a region of Chromosome 7 that has been implicated in glioma. Because EVA does not require statistical or computational expertise and has the flexibility to accommodate any type of statistical analysis, we anticipate EVA will prove a useful addition to the repertoire of computational methods used for microarray data analysis. EVA is available at no charge to academic users and can be found at http://www.epistasis.org. PMID:19390666

  3. The statistics of identifying differentially expressed genes in Expresso and TM4: a comparison

    PubMed Central

    Sioson, Allan A; Mane, Shrinivasrao P; Li, Pinghua; Sha, Wei; Heath, Lenwood S; Bohnert, Hans J; Grene, Ruth

    2006-01-01

    Background Analysis of DNA microarray data takes as input spot intensity measurements from scanner software and returns differential expression of genes between two conditions, together with a statistical significance assessment. This process typically consists of two steps: data normalization and identification of differentially expressed genes through statistical analysis. The Expresso microarray experiment management system implements these steps with a two-stage, log-linear ANOVA mixed model technique, tailored to individual experimental designs. The complement of tools in TM4, on the other hand, is based on a number of preset design choices that limit its flexibility. In the TM4 microarray analysis suite, normalization, filter, and analysis methods form an analysis pipeline. TM4 computes integrated intensity values (IIV) from the average intensities and spot pixel counts returned by the scanner software as input to its normalization steps. By contrast, Expresso can use either IIV data or median intensity values (MIV). Here, we compare Expresso and TM4 analysis of two experiments and assess the results against qRT-PCR data. Results The Expresso analysis using MIV data consistently identifies more genes as differentially expressed, when compared to Expresso analysis with IIV data. The typical TM4 normalization and filtering pipeline corrects systematic intensity-specific bias on a per microarray basis. Subsequent statistical analysis with Expresso or a TM4 t-test can effectively identify differentially expressed genes. The best agreement with qRT-PCR data is obtained through the use of Expresso analysis and MIV data. Conclusion The results of this research are of practical value to biologists who analyze microarray data sets. The TM4 normalization and filtering pipeline corrects microarray-specific systematic bias and complements the normalization stage in Expresso analysis. The results of Expresso using MIV data have the best agreement with qRT-PCR results. 
In one experiment, MIV is a better choice than IIV as input to data normalization and statistical analysis methods, as it yields a greater number of statistically significant differentially expressed genes; TM4 does not support the choice of MIV input data. Overall, the more flexible and extensive statistical models of Expresso achieve more accurate analytical results, when judged by the yardstick of qRT-PCR data, in the context of an experimental design of modest complexity. PMID:16626497

  4. New Statistics for Testing Differential Expression of Pathways from Microarray Data

    NASA Astrophysics Data System (ADS)

    Siu, Hoicheong; Dong, Hua; Jin, Li; Xiong, Momiao

    Exploring biological meaning from microarray data is very important but remains a great challenge. Here, we developed three new statistics (a linear combination test, a quadratic test and a de-correlation test) to identify differentially expressed pathways from gene expression profiles. We apply our statistics to two rheumatoid arthritis datasets. Notably, our results reveal three significant pathways and 275 genes in common between the two datasets. The pathways we found help to uncover the disease mechanisms of rheumatoid arthritis, which implies that our statistics are a powerful tool in functional analysis of gene expression data.

  5. TRAPR: R Package for Statistical Analysis and Visualization of RNA-Seq Data.

    PubMed

    Lim, Jae Hyun; Lee, Soo Youn; Kim, Ju Han

    2017-03-01

    High-throughput transcriptome sequencing, also known as RNA sequencing (RNA-Seq), is a standard technology for measuring gene expression with unprecedented accuracy. Numerous Bioconductor packages have been developed for the statistical analysis of RNA-Seq data. However, these tools focus on specific aspects of the data analysis pipeline, and are difficult to appropriately integrate with one another due to their disparate data structures and processing methods. They also lack visualization methods to confirm the integrity of the data and the process. In this paper, we propose an R-based RNA-Seq analysis pipeline called TRAPR, an integrated tool that facilitates the statistical analysis and visualization of RNA-Seq expression data. TRAPR provides various functions for data management, the filtering of low-quality data, normalization, transformation, statistical analysis, data visualization, and result visualization that allow researchers to build customized analysis pipelines.

  6. Similar protein expression profiles of ovarian and endometrial high-grade serous carcinomas.

    PubMed

    Hiramatsu, Kosuke; Yoshino, Kiyoshi; Serada, Satoshi; Yoshihara, Kosuke; Hori, Yumiko; Fujimoto, Minoru; Matsuzaki, Shinya; Egawa-Takata, Tomomi; Kobayashi, Eiji; Ueda, Yutaka; Morii, Eiichi; Enomoto, Takayuki; Naka, Tetsuji; Kimura, Tadashi

    2016-03-01

    Ovarian and endometrial high-grade serous carcinomas (HGSCs) have similar clinical and pathological characteristics; however, exhaustive protein expression profiling of these cancers has yet to be reported. We performed protein expression profiling on 14 cases of HGSCs (7 ovarian and 7 endometrial) and 18 endometrioid carcinomas (9 ovarian and 9 endometrial) using iTRAQ-based exhaustive and quantitative protein analysis. We identified 828 tumour-expressed proteins and evaluated the statistical similarity of protein expression profiles between ovarian and endometrial HGSCs using unsupervised hierarchical cluster analysis (P<0.01). Using 45 statistically highly expressed proteins in HGSCs, protein ontology analysis detected two enriched terms and proteins composing each term: IMP2 and MCM2. Immunohistochemical analyses confirmed the higher expression of IMP2 and MCM2 in ovarian and endometrial HGSCs as well as in tubal and peritoneal HGSCs than in endometrioid carcinomas (P<0.01). The knockdown of either IMP2 or MCM2 by siRNA interference significantly decreased the proliferation rate of ovarian HGSC cell line (P<0.01). We demonstrated the statistical similarity of the protein expression profiles of ovarian and endometrial HGSC beyond the organs. We suggest that increased IMP2 and MCM2 expression may underlie some of the rapid HGSC growth observed clinically.

  7. Kolmogorov-Smirnov statistical test for analysis of ZAP-70 expression in B-CLL, compared with quantitative PCR and IgV(H) mutation status.

    PubMed

    Van Bockstaele, Femke; Janssens, Ann; Piette, Anne; Callewaert, Filip; Pede, Valerie; Offner, Fritz; Verhasselt, Bruno; Philippé, Jan

    2006-07-15

    ZAP-70 has been proposed as a surrogate marker for immunoglobulin heavy-chain variable region (IgV(H)) mutation status, which is known as a prognostic marker in B-cell chronic lymphocytic leukemia (CLL). The flow cytometric analysis of ZAP-70 suffers from difficulties in standardization and interpretation. We applied the Kolmogorov-Smirnov (KS) statistical test to make analysis more straightforward. We examined ZAP-70 expression by flow cytometry in 53 patients with CLL. Analysis was performed as initially described by Crespo et al. (New England J Med 2003; 348:1764-1775) and alternatively by application of the KS statistical test comparing T cells with B cells. Receiver-operating-characteristics (ROC)-curve analyses were performed to determine the optimal cut-off values for ZAP-70 measured by the two approaches. ZAP-70 protein expression was compared with ZAP-70 mRNA expression measured by a quantitative PCR (qPCR) and with the IgV(H) mutation status. Both flow cytometric analyses correlated well with the molecular technique and proved to be of equal value in predicting the IgV(H) mutation status. Applying the KS test is reproducible, simple, straightforward, and overcomes a number of difficulties encountered in the Crespo-method. The KS statistical test is an essential part of the software delivered with modern routine analytical flow cytometers and is well suited for analysis of ZAP-70 expression in CLL. (c) 2006 International Society for Analytical Cytology.
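
The two-sample KS comparison mentioned in the record above can be illustrated with a minimal sketch. It computes only the D statistic (the largest gap between two empirical distribution functions) and is not the routine shipped with any flow cytometer software. The function name `ks_statistic` and the intensity values are hypothetical, chosen for illustration.

```python
from bisect import bisect_right


def ks_statistic(sample_a, sample_b):
    """Two-sample Kolmogorov-Smirnov D: largest gap between the two empirical CDFs."""
    a = sorted(sample_a)
    b = sorted(sample_b)
    d = 0.0
    # The gap can only change at observed values, so checking those is enough.
    for x in a + b:
        cdf_a = bisect_right(a, x) / len(a)  # fraction of sample A <= x
        cdf_b = bisect_right(b, x) / len(b)  # fraction of sample B <= x
        d = max(d, abs(cdf_a - cdf_b))
    return d


# Hypothetical fluorescence intensities for two cell populations.
t_cells = [5.1, 5.4, 5.8, 6.0, 6.3]
b_cells = [2.0, 2.2, 2.9, 5.2, 5.5]
print(ks_statistic(t_cells, b_cells))
```

A D value near 0 means the two populations have nearly the same intensity distribution; a value near 1 means they barely overlap.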

  8. Combining Shapley value and statistics to the analysis of gene expression data in children exposed to air pollution

    PubMed Central

    Moretti, Stefano; van Leeuwen, Danitsja; Gmuender, Hans; Bonassi, Stefano; van Delft, Joost; Kleinjans, Jos; Patrone, Fioravante; Merlo, Domenico Franco

    2008-01-01

    Background In gene expression analysis, statistical tests for differential gene expression provide lists of candidate genes having, individually, a sufficiently low p-value. However, the interpretation of each single p-value within complex systems involving several interacting genes is problematic. In parallel, in the last sixty years, game theory has been applied to political and social problems to assess the power of interacting agents in forcing a decision and, more recently, to represent the relevance of genes in response to certain conditions. Results In this paper we introduce a Bootstrap procedure to test the null hypothesis that each gene has the same relevance between two conditions, where the relevance is represented by the Shapley value of a particular coalitional game defined on a microarray data-set. This method, which is called Comparative Analysis of Shapley value (shortly, CASh), is applied to data concerning the gene expression in children differentially exposed to air pollution. The results provided by CASh are compared with the results from a parametric statistical test for testing differential gene expression. Both lists of genes provided by CASh and t-test are informative enough to discriminate exposed subjects on the basis of their gene expression profiles. While many genes are selected in common by CASh and the parametric test, it turns out that the biological interpretation of the differences between these two selections is more interesting, suggesting a different interpretation of the main biological pathways in gene expression regulation for exposed individuals. A simulation study suggests that CASh offers more power than t-test for the detection of differential gene expression variability. Conclusion CASh is successfully applied to gene expression analysis of a data-set where the joint expression behavior of genes may be critical to characterize the expression response to air pollution. 
We demonstrate a synergistic effect between coalitional games and statistics that resulted in a selection of genes with a potential impact in the regulation of complex pathways. PMID:18764936
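
To make the notion of a Shapley value concrete, the sketch below computes exact Shapley values for a tiny coalitional game by averaging each player's marginal contribution over all orderings. The toy game, the gene names and the function names are hypothetical; this is not the microarray game or the bootstrap procedure used by CASh.

```python
from itertools import permutations


def shapley_values(players, value):
    """Average each player's marginal contribution over all orderings of the players."""
    totals = {p: 0.0 for p in players}
    orderings = list(permutations(players))
    for order in orderings:
        coalition = frozenset()
        for p in order:
            with_p = coalition | {p}
            totals[p] += value(with_p) - value(coalition)
            coalition = with_p
    return {p: t / len(orderings) for p, t in totals.items()}


def toy_value(coalition):
    """Hypothetical game: a coalition is worth 1 if it holds geneA plus geneB or geneC."""
    has_partner = bool({"geneB", "geneC"} & coalition)
    return 1.0 if "geneA" in coalition and has_partner else 0.0


phi = shapley_values(["geneA", "geneB", "geneC"], toy_value)
print(phi)
```

Enumerating all orderings costs n! evaluations, so this exact form is only practical for a handful of players; larger games need sampling or a closed-form expression specific to the game.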

  9. Explorations in Statistics: The Analysis of Change

    ERIC Educational Resources Information Center

    Curran-Everett, Douglas; Williams, Calvin L.

    2015-01-01

    Learning about statistics is a lot like learning about science: the learning is more meaningful if you can actively explore. This tenth installment of "Explorations in Statistics" explores the analysis of a potential change in some physiological response. As researchers, we often express absolute change as percent change so we can…

  10. QPROT: Statistical method for testing differential expression using protein-level intensity data in label-free quantitative proteomics.

    PubMed

    Choi, Hyungwon; Kim, Sinae; Fermin, Damian; Tsou, Chih-Chiang; Nesvizhskii, Alexey I

    2015-11-03

    We introduce QPROT, a statistical framework and computational tool for differential protein expression analysis using protein intensity data. QPROT is an extension of the QSPEC suite, originally developed for spectral count data, adapted for the analysis using continuously measured protein-level intensity data. QPROT offers a new intensity normalization procedure and model-based differential expression analysis, both of which account for missing data. Determination of differential expression of each protein is based on the standardized Z-statistic based on the posterior distribution of the log fold change parameter, guided by the false discovery rate estimated by a well-known Empirical Bayes method. We evaluated the classification performance of QPROT using the quantification calibration data from the clinical proteomic technology assessment for cancer (CPTAC) study and a recently published Escherichia coli benchmark dataset, with evaluation of FDR accuracy in the latter. QPROT is a statistical framework with computational software tool for comparative quantitative proteomics analysis. It features various extensions of QSPEC method originally built for spectral count data analysis, including probabilistic treatment of missing values in protein intensity data. With the increasing popularity of label-free quantitative proteomics data, the proposed method and accompanying software suite will be immediately useful for many proteomics laboratories. This article is part of a Special Issue entitled: Computational Proteomics. Copyright © 2015 Elsevier B.V. All rights reserved.

  11. Common pitfalls in statistical analysis: “P” values, statistical significance and confidence intervals

    PubMed Central

    Ranganathan, Priya; Pramesh, C. S.; Buyse, Marc

    2015-01-01

    In the second part of a series on pitfalls in statistical analysis, we look at various ways in which a statistically significant study result can be expressed. We debunk some of the myths regarding the ‘P’ value, explain the importance of ‘confidence intervals’ and clarify the importance of including both values in a paper. PMID:25878958
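
As a small illustration of reporting an interval alongside a point estimate, the sketch below computes a confidence interval for a mean. It assumes a normal approximation, which is a simplification: for small samples a t quantile would give a wider interval. The function name and data are hypothetical and do not come from the article.

```python
from math import sqrt
from statistics import NormalDist, mean, stdev


def mean_confidence_interval(values, level=0.95):
    """Normal-approximation confidence interval for the mean of `values`."""
    n = len(values)
    centre = mean(values)
    standard_error = stdev(values) / sqrt(n)
    # Two-sided quantile, e.g. about 1.96 for a 95% interval.
    z = NormalDist().inv_cdf(0.5 + level / 2)
    return centre - z * standard_error, centre + z * standard_error


low, high = mean_confidence_interval([1, 2, 3, 4, 5])
print(low, high)
```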

  12. In silico identification and comparative analysis of differentially expressed genes in human and mouse tissues

    PubMed Central

    Pao, Sheng-Ying; Lin, Win-Li; Hwang, Ming-Jing

    2006-01-01

    Background Screening for differentially expressed genes on the genomic scale and comparative analysis of the expression profiles of orthologous genes between species to study gene function and regulation are becoming increasingly feasible. Expressed sequence tags (ESTs) are an excellent source of data for such studies using bioinformatic approaches because of the rich libraries and tremendous amount of data now available in the public domain. However, any large-scale EST-based bioinformatics analysis must deal with the heterogeneous, and often ambiguous, tissue and organ terms used to describe EST libraries. Results To deal with the issue of tissue source, in this work, we carefully screened and organized more than 8 million human and mouse ESTs into 157 human and 108 mouse tissue/organ categories, to which we applied an established statistical test using different thresholds of the p value to identify genes differentially expressed in different tissues. Further analysis of the tissue distribution and level of expression of human and mouse orthologous genes showed that tissue-specific orthologs tended to have more similar expression patterns than those lacking significant tissue specificity. On the other hand, a number of orthologs were found to have significant disparity in their expression profiles, hinting at novel functions, divergent regulation, or new ortholog relationships. Conclusion Comprehensive statistics on the tissue-specific expression of human and mouse genes were obtained in this very large-scale, EST-based analysis. These statistical results have been organized into a database, freely accessible at our website, for easy searching of human and mouse tissue-specific genes and for investigating gene expression profiles in the context of comparative genomics.
Comparative analysis showed that, although highly tissue-specific genes tend to exhibit similar expression profiles in human and mouse, there are significant exceptions, indicating that orthologous genes, while sharing basic genomic properties, could result in distinct phenotypes. PMID:16626500

  13. Sex genes for genomic analysis in human brain: internal controls for comparison of probe level data extraction.

    PubMed Central

    Galfalvy, Hanga C; Erraji-Benchekroun, Loubna; Smyrniotopoulos, Peggy; Pavlidis, Paul; Ellis, Steven P; Mann, J John; Sibille, Etienne; Arango, Victoria

    2003-01-01

    Background Genomic studies of complex tissues pose unique analytical challenges for assessment of data quality, performance of statistical methods used for data extraction, and detection of differentially expressed genes. Ideally, to assess the accuracy of gene expression analysis methods, one needs a set of genes which are known to be differentially expressed in the samples and which can be used as a "gold standard". We introduce the idea of using sex-chromosome genes as an alternative to spiked-in control genes or simulations for assessment of microarray data and analysis methods. Results Expression of sex-chromosome genes was used as a true internal biological control to compare alternate probe-level data extraction algorithms (Microarray Suite 5.0 [MAS5.0], Model Based Expression Index [MBEI] and Robust Multi-array Average [RMA]), to assess microarray data quality and to establish some statistical guidelines for analyzing large-scale gene expression. These approaches were implemented on a large new dataset of human brain samples. RMA-generated gene expression values were markedly less variable and more reliable than MAS5.0 and MBEI-derived values. A statistical technique controlling the false discovery rate was applied to adjust for multiple testing, as an alternative to the Bonferroni method, and showed no evidence of false negative results. Fourteen probesets, representing nine Y- and two X-chromosome linked genes, displayed significant sex differences in brain prefrontal cortex gene expression. Conclusion In this study, we have demonstrated the use of sex genes as true biological internal controls for genomic analysis of complex tissues, and suggested analytical guidelines for testing alternate oligonucleotide microarray data extraction protocols and for adjusting multiple statistical analysis of differentially expressed genes.
Our results also provided evidence for sex differences in gene expression in the brain prefrontal cortex, supporting the notion of a putative direct role of sex-chromosome genes in differentiation and maintenance of sexual dimorphism of the central nervous system. Importantly, these analytical approaches are applicable to all microarray studies that include male and female human or animal subjects. PMID:12962547
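
The contrast between false discovery rate control and the Bonferroni method can be sketched as follows. The record does not name the FDR procedure used, so the Benjamini-Hochberg step-up rule here is an assumption made for illustration; the function names and p-values are likewise hypothetical.

```python
def benjamini_hochberg(p_values, alpha=0.05):
    """Return True for each hypothesis rejected by the Benjamini-Hochberg step-up rule."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    # Find the largest 1-based rank k whose sorted p-value satisfies p_(k) <= (k / m) * alpha.
    cutoff = 0
    for rank, i in enumerate(order, start=1):
        if p_values[i] <= rank / m * alpha:
            cutoff = rank
    rejected = [False] * m
    for i in order[:cutoff]:
        rejected[i] = True
    return rejected


def bonferroni(p_values, alpha=0.05):
    """Return True for each hypothesis whose p-value is at most alpha / m."""
    m = len(p_values)
    return [p <= alpha / m for p in p_values]


# Hypothetical p-values for eight probesets.
p_values = [0.001, 0.008, 0.039, 0.041, 0.042, 0.06, 0.074, 0.205]
print(sum(benjamini_hochberg(p_values)), sum(bonferroni(p_values)))
```

Because the Benjamini-Hochberg threshold grows with rank while the Bonferroni threshold stays fixed at alpha / m, the former never rejects fewer hypotheses than the latter at the same alpha.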

  14. Sex genes for genomic analysis in human brain: internal controls for comparison of probe level data extraction.

    PubMed

    Galfalvy, Hanga C; Erraji-Benchekroun, Loubna; Smyrniotopoulos, Peggy; Pavlidis, Paul; Ellis, Steven P; Mann, J John; Sibille, Etienne; Arango, Victoria

    2003-09-08

    Genomic studies of complex tissues pose unique analytical challenges for assessment of data quality, performance of statistical methods used for data extraction, and detection of differentially expressed genes. Ideally, to assess the accuracy of gene expression analysis methods, one needs a set of genes which are known to be differentially expressed in the samples and which can be used as a "gold standard". We introduce the idea of using sex-chromosome genes as an alternative to spiked-in control genes or simulations for assessment of microarray data and analysis methods. Expression of sex-chromosome genes was used as a true internal biological control to compare alternate probe-level data extraction algorithms (Microarray Suite 5.0 [MAS5.0], Model Based Expression Index [MBEI] and Robust Multi-array Average [RMA]), to assess microarray data quality and to establish some statistical guidelines for analyzing large-scale gene expression. These approaches were implemented on a large new dataset of human brain samples. RMA-generated gene expression values were markedly less variable and more reliable than MAS5.0 and MBEI-derived values. A statistical technique controlling the false discovery rate was applied to adjust for multiple testing, as an alternative to the Bonferroni method, and showed no evidence of false negative results. Fourteen probesets, representing nine Y- and two X-chromosome linked genes, displayed significant sex differences in brain prefrontal cortex gene expression. In this study, we have demonstrated the use of sex genes as true biological internal controls for genomic analysis of complex tissues, and suggested analytical guidelines for testing alternate oligonucleotide microarray data extraction protocols and for adjusting multiple statistical analysis of differentially expressed genes.
Our results also provided evidence for sex differences in gene expression in the brain prefrontal cortex, supporting the notion of a putative direct role of sex-chromosome genes in differentiation and maintenance of sexual dimorphism of the central nervous system. Importantly, these analytical approaches are applicable to all microarray studies that include male and female human or animal subjects.

  15. mapDIA: Preprocessing and statistical analysis of quantitative proteomics data from data independent acquisition mass spectrometry.

    PubMed

    Teo, Guoshou; Kim, Sinae; Tsou, Chih-Chiang; Collins, Ben; Gingras, Anne-Claude; Nesvizhskii, Alexey I; Choi, Hyungwon

    2015-11-03

    Data independent acquisition (DIA) mass spectrometry is an emerging technique that offers more complete detection and quantification of peptides and proteins across multiple samples. DIA allows fragment-level quantification, which can be considered as repeated measurements of the abundance of the corresponding peptides and proteins in the downstream statistical analysis. However, few statistical approaches are available for aggregating these complex fragment-level data into peptide- or protein-level statistical summaries. In this work, we describe a software package, mapDIA, for statistical analysis of differential protein expression using DIA fragment-level intensities. The workflow consists of three major steps: intensity normalization, peptide/fragment selection, and statistical analysis. First, mapDIA offers normalization of fragment-level intensities by total intensity sums as well as a novel alternative normalization by local intensity sums in retention time space. Second, mapDIA removes outlier observations and selects peptides/fragments that preserve the major quantitative patterns across all samples for each protein. Last, using the selected fragments and peptides, mapDIA performs model-based statistical significance analysis of protein-level differential expression between specified groups of samples. Using a comprehensive set of simulation datasets, we show that mapDIA detects differentially expressed proteins with accurate control of the false discovery rates. We also describe the analysis procedure in detail using two recently published DIA datasets generated for 14-3-3β dynamic interaction network and prostate cancer glycoproteome. The software was written in C++ language and the source code is available for free through SourceForge website http://sourceforge.net/projects/mapdia/. This article is part of a Special Issue entitled: Computational Proteomics. Copyright © 2015 Elsevier B.V. All rights reserved.
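
The first workflow step, normalization by total intensity sums, can be sketched in a few lines. This is one plausible reading under stated assumptions, not mapDIA's actual implementation: each sample is rescaled so that its total intensity equals the average total across samples. The function name, the data layout and the choice of target are all hypothetical.

```python
def normalize_by_total_intensity(samples):
    """Rescale each sample so its total intensity equals the mean total across samples.

    `samples` maps a sample name to a list of fragment intensities.
    """
    totals = {name: sum(values) for name, values in samples.items()}
    target = sum(totals.values()) / len(totals)
    return {
        name: [v * target / totals[name] for v in values]
        for name, values in samples.items()
    }


# Hypothetical fragment intensities; sample s2 was measured at twice the overall signal.
raw = {"s1": [1.0, 2.0, 3.0], "s2": [2.0, 4.0, 6.0]}
normalized = normalize_by_total_intensity(raw)
print(normalized)
```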

  16. Dynamic association rules for gene expression data analysis.

    PubMed

    Chen, Shu-Chuan; Tsai, Tsung-Hsien; Chung, Cheng-Han; Li, Wen-Hsiung

    2015-10-14

    The purpose of gene expression analysis is to look for the association between regulation of gene expression levels and phenotypic variations. This association based on gene expression profile has been used to determine whether the induction/repression of genes corresponds to phenotypic variations including cell regulations, clinical diagnoses and drug development. Statistical analyses on microarray data have been developed to resolve the gene selection issue. However, these methods do not inform us of causality between genes and phenotypes. In this paper, we propose the dynamic association rule algorithm (DAR algorithm), which helps one to efficiently select a subset of significant genes for subsequent analysis. The DAR algorithm is based on association rules from market basket analysis in marketing. We first propose a statistical way, based on constructing a one-sided confidence interval and hypothesis testing, to determine if an association rule is meaningful. Based on the proposed statistical method, we then developed the DAR algorithm for gene expression data analysis. The method was applied to analyze four microarray datasets and one Next Generation Sequencing (NGS) dataset: the Mice Apo A1 dataset, the whole genome expression dataset of mouse embryonic stem cells, expression profiling of the bone marrow of Leukemia patients, Microarray Quality Control (MAQC) data set and the RNA-seq dataset of a mouse genomic imprinting study. A comparison of the proposed method with the t-test on the expression profiling of the bone marrow of Leukemia patients was conducted. We developed a statistical way, based on the concept of confidence interval, to determine the minimum support and minimum confidence for mining association relationships among items. With the minimum support and minimum confidence, one can find significant rules in one single step. The DAR algorithm was then developed for gene expression data analysis.
Four gene expression datasets showed that the proposed DAR algorithm not only was able to identify a set of differentially expressed genes that largely agreed with that of other methods, but also provided an efficient and accurate way to find influential genes of a disease. In the paper, the well-established association rule mining technique from marketing has been successfully modified to determine the minimum support and minimum confidence based on the concept of confidence interval and hypothesis testing. It can be applied to gene expression data to mine significant association rules between gene regulation and phenotype. The proposed DAR algorithm provides an efficient way to find influential genes that underlie the phenotypic variance.
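
The two quantities that the minimum thresholds apply to, support and confidence, can be sketched for a single rule as follows. The sketch shows only the standard market-basket definitions; it does not reproduce the DAR algorithm or its confidence-interval-based choice of thresholds. The item labels and function names are hypothetical.

```python
def support(transactions, items):
    """Fraction of transactions that contain every item in `items`."""
    wanted = set(items)
    return sum(wanted <= t for t in transactions) / len(transactions)


def confidence(transactions, antecedent, consequent):
    """Estimated P(consequent | antecedent) for the rule antecedent -> consequent."""
    base = support(transactions, antecedent)
    if base == 0:
        return 0.0
    joint = support(transactions, set(antecedent) | set(consequent))
    return joint / base


# Hypothetical samples, each described by a set of discretized observations.
samples = [
    {"g1_up", "disease"},
    {"g1_up", "disease"},
    {"g1_up"},
    {"disease"},
    set(),
]
print(support(samples, {"g1_up", "disease"}))
print(confidence(samples, {"g1_up"}, {"disease"}))
```

A rule such as g1_up -> disease is kept only if both numbers clear their minimum thresholds, which is why the way those thresholds are chosen matters.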

  17. DEIVA: a web application for interactive visual analysis of differential gene expression profiles.

    PubMed

    Harshbarger, Jayson; Kratz, Anton; Carninci, Piero

    2017-01-07

    Differential gene expression (DGE) analysis is a technique to identify statistically significant differences in RNA abundance for genes or arbitrary features between different biological states. The result of a DGE test is typically further analyzed using statistical software, spreadsheets or custom ad hoc algorithms. We identified a need for a web-based system to share DGE statistical test results, and locate and identify genes in DGE statistical test results with a very low barrier of entry. We have developed DEIVA, a free and open source, browser-based single page application (SPA) with a strong emphasis on being user friendly that enables locating and identifying single or multiple genes in an immediate, interactive, and intuitive manner. By design, DEIVA scales with very large numbers of users and datasets. Compared to existing software, DEIVA offers a unique combination of design decisions that enable inspection and analysis of DGE statistical test results with an emphasis on ease of use.

  18. Time Series Expression Analyses Using RNA-seq: A Statistical Approach

    PubMed Central

    Oh, Sunghee; Song, Seongho; Grabowski, Gregory; Zhao, Hongyu; Noonan, James P.

    2013-01-01

    RNA-seq is becoming the de facto standard approach for transcriptome analysis with ever-reducing cost. It has considerable advantages over conventional technologies (microarrays) because it allows for direct identification and quantification of transcripts. Many time series RNA-seq datasets have been collected to study the dynamic regulations of transcripts. However, statistically rigorous and computationally efficient methods are needed to explore the time-dependent changes of gene expression in biological systems. These methods should explicitly account for the dependencies of expression patterns across time points. Here, we discuss several methods that can be applied to model timecourse RNA-seq data, including statistical evolutionary trajectory index (SETI), autoregressive time-lagged regression (AR(1)), and hidden Markov model (HMM) approaches. We use three real datasets and simulation studies to demonstrate the utility of these dynamic methods in temporal analysis. PMID:23586021
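
The idea of modelling dependence between consecutive time points can be illustrated with a plain first-order autoregression fitted by least squares. This is a generic AR(1) sketch for one gene's expression series, assumed for illustration; it is not the time-lagged regression procedure from the article, and the function name and data are hypothetical.

```python
def fit_ar1(series):
    """Least-squares fit of x[t] = c + phi * x[t-1] + noise; returns (c, phi)."""
    previous = series[:-1]
    current = series[1:]
    n = len(previous)
    mean_prev = sum(previous) / n
    mean_curr = sum(current) / n
    # Slope of the regression of each value on the value one step earlier.
    sxy = sum((p - mean_prev) * (c - mean_curr) for p, c in zip(previous, current))
    sxx = sum((p - mean_prev) ** 2 for p in previous)
    phi = sxy / sxx
    intercept = mean_curr - phi * mean_prev
    return intercept, phi


# Hypothetical noise-free series generated by x[t] = 1 + 0.5 * x[t-1], starting at 10.
expression = [10.0, 6.0, 4.0, 3.0, 2.5, 2.25]
print(fit_ar1(expression))
```

A phi close to zero suggests consecutive time points are nearly independent, while a value near one suggests strong carry-over from the previous time point.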

  19. Time series expression analyses using RNA-seq: a statistical approach.

    PubMed

    Oh, Sunghee; Song, Seongho; Grabowski, Gregory; Zhao, Hongyu; Noonan, James P

    2013-01-01

    RNA-seq is becoming the de facto standard approach for transcriptome analysis with ever-reducing cost. It has considerable advantages over conventional technologies (microarrays) because it allows for direct identification and quantification of transcripts. Many time series RNA-seq datasets have been collected to study the dynamic regulations of transcripts. However, statistically rigorous and computationally efficient methods are needed to explore the time-dependent changes of gene expression in biological systems. These methods should explicitly account for the dependencies of expression patterns across time points. Here, we discuss several methods that can be applied to model timecourse RNA-seq data, including statistical evolutionary trajectory index (SETI), autoregressive time-lagged regression (AR(1)), and hidden Markov model (HMM) approaches. We use three real datasets and simulation studies to demonstrate the utility of these dynamic methods in temporal analysis.

  20. Searching for molecular markers in head and neck squamous cell carcinomas (HNSCC) by statistical and bioinformatic analysis of larynx-derived SAGE libraries

    PubMed Central

    Silveira, Nelson JF; Varuzza, Leonardo; Machado-Lima, Ariane; Lauretto, Marcelo S; Pinheiro, Daniel G; Rodrigues, Rodrigo V; Severino, Patrícia; Nobrega, Francisco G; Silva, Wilson A; de B Pereira, Carlos A; Tajara, Eloiza H

    2008-01-01

    Background: Head and neck squamous cell carcinoma (HNSCC) is one of the most common malignancies in humans. The average 5-year survival rate is one of the lowest among aggressive cancers, showing no significant improvement in recent years. When detected early, HNSCC has a good prognosis, but most patients present metastatic disease at the time of diagnosis, which significantly reduces survival rate. Despite extensive research, no molecular markers are currently available for diagnostic or prognostic purposes. Methods: Aiming to identify differentially-expressed genes involved in laryngeal squamous cell carcinoma (LSCC) development and progression, we generated individual Serial Analysis of Gene Expression (SAGE) libraries from a metastatic and non-metastatic larynx carcinoma, as well as from a normal larynx mucosa sample. Approximately 54,000 unique tags were sequenced in three libraries. Results: Statistical data analysis identified a subset of 1,216 differentially expressed tags between tumor and normal libraries, and 894 differentially expressed tags between metastatic and non-metastatic carcinomas. Three genes displaying differential regulation, one down-regulated (KRT31) and two up-regulated (BST2, MFAP2), as well as one with a non-significant differential expression pattern (GNA15) in our SAGE data were selected for real-time polymerase chain reaction (PCR) in a set of HNSCC samples. Consistent with our statistical analysis, quantitative PCR confirmed the upregulation of BST2 and MFAP2 and the downregulation of KRT31 when samples of HNSCC were compared to tumor-free surgical margins. As expected, GNA15 presented a non-significant differential expression pattern when tumor samples were compared to normal tissues. Conclusion: To the best of our knowledge, this is the first study reporting SAGE data in head and neck squamous cell tumors. Statistical analysis was effective in identifying differentially expressed genes reportedly involved in cancer development. The differential expression of a subset of genes was confirmed in additional larynx carcinoma samples and in carcinomas from a distinct head and neck subsite. This result suggests the existence of potential common biomarkers for prognosis and targeted-therapy development in this heterogeneous type of tumor. PMID:19014460

  1. Genome Expression Pathway Analysis Tool – Analysis and visualization of microarray gene expression data under genomic, proteomic and metabolic context

    PubMed Central

    Weniger, Markus; Engelmann, Julia C; Schultz, Jörg

    2007-01-01

    Background: Regulation of gene expression is relevant to many areas of biology and medicine, in the study of treatments, diseases, and developmental stages. Microarrays can be used to measure the expression level of thousands of mRNAs at the same time, allowing insight into or comparison of different cellular conditions. The data derived from microarray experiments are high-dimensional and often noisy, and interpretation of the results can become intricate. Although programs for the statistical analysis of microarray data exist, most of them lack an integration of analysis results and biological interpretation. Results: We have developed GEPAT, Genome Expression Pathway Analysis Tool, offering an analysis of gene expression data under genomic, proteomic and metabolic context. We provide an integration of statistical methods for data import and data analysis together with a biological interpretation for subsets of probes or single probes on the chip. GEPAT imports various types of oligonucleotide and cDNA array data formats. Different normalization methods can be applied to the data; afterwards, data annotation is performed. After import, GEPAT offers various statistical data analysis methods, such as hierarchical, k-means and PCA clustering, a linear model based t-test or chromosomal profile comparison. The results of the analysis can be interpreted by enrichment of biological terms, pathway analysis or interaction networks. Different biological databases are included to give various information for each probe on the chip. GEPAT does not impose a linear workflow, but allows the usage of any subset of probes and samples as a start for a new data analysis. GEPAT relies on established data analysis packages, offers a modular approach for easy extension, and can be run on a computer grid to allow a large number of users. It is freely available under the LGPL open source license for academic and commercial users at . Conclusion: GEPAT is a modular, scalable and professional-grade software integrating analysis and interpretation of microarray gene expression data. An installation available for academic users can be found at . PMID:17543125

  2. Acoustic correlates of Japanese expressions associated with voice quality of male adults

    NASA Astrophysics Data System (ADS)

    Kido, Hiroshi; Kasuya, Hideki

    2004-05-01

    Japanese expressions associated with the voice quality of male adults were extracted by a series of questionnaire surveys and statistical multivariate analysis. One hundred and thirty-seven Japanese expressions were collected through the first questionnaire and careful investigations of well-established Japanese dictionaries and articles. From the second questionnaire about familiarity with each of the expressions and synonymity that were addressed to 249 subjects, 25 expressions were extracted. The third questionnaire was about an evaluation of their own voice quality. By applying a statistical clustering method and a correlation analysis to the results of the questionnaires, eight bipolar expressions and one unipolar expression were obtained. They constituted high-pitched/low-pitched, masculine/feminine, hoarse/clear, calm/excited, powerful/weak, youthful/elderly, thick/thin, tense/lax, and nasal, respectively. Acoustic correlates of each of the eight bipolar expressions were extracted by means of perceptual evaluation experiments that were made with sentence utterances of 36 males and by a statistical decision tree method. They included an average of the fundamental frequency (F0) of the utterance, speaking rate, spectral tilt, formant frequency parameter, standard deviation of F0 values, and glottal noise, when SPL of each of the stimuli was maintained identical in the perceptual experiments.

  3. Prognostic value of GLUT-1 expression in ovarian surface epithelial tumors: a morphometric study.

    PubMed

    Ozcan, Ayhan; Deveci, Mehmet Salih; Oztas, Emin; Dede, Murat; Yenen, Mufit Cemal; Korgun, Emin Turkay; Gunhan, Omer

    2005-08-01

    To investigate the reported increase in the expression of the glucose transporter GLUT-1 in borderline and malignant ovarian epithelial tumors and its relationship to prognosis. In this study, areas in which immunohistochemical membranous staining with GLUT-1 were most evident were selected, and the proportions of GLUT-1 expression in 46 benign, 11 borderline and 42 malignant cases of ovarian epithelial tumors were determined quantitatively with a computer and Zeiss Vision KS 400 3.0 (Göttingen, Germany) for Windows (Microsoft, Redmond, Washington, U.S.A.) image analysis. GLUT-1 expression was determined in all borderline tumors (11 of 11) and in 97.6% of malignant tumors (41 of 42). No GLUT-1 expression was observed in benign tumors. The intensity of GLUT-1 staining was lower in borderline tumors than in malignant cases. This was statistically significant (p = 0.005). As differentiation in malignant tumors increased, proportions of GLUT-1 expression showed a relative increase, but this difference was not statistically significant (p = 0.68). When GLUT-1 expression in borderline and malignant ovarian epithelial tumors was analyzed against prognosis, no statistically significant difference was identified. Assessment of GLUT-1 expression using the image analysis program was more reliable, with higher reproducibility than in previous studies.

  4. Differential gene expression detection and sample classification using penalized linear regression models.

    PubMed

    Wu, Baolin

    2006-02-15

    Differential gene expression detection and sample classification using microarray data have received much research interest recently. Owing to the large number of genes p and small number of samples n (p > n), microarray data analysis poses big challenges for statistical analysis. An obvious problem owing to the 'large p small n' is over-fitting. Just by chance, we are likely to find some non-differentially expressed genes that can classify the samples very well. The idea of shrinkage is to regularize the model parameters to reduce the effects of noise and produce reliable inferences. Shrinkage has been successfully applied in the microarray data analysis. The SAM statistics proposed by Tusher et al. and the 'nearest shrunken centroid' proposed by Tibshirani et al. are ad hoc shrinkage methods. Both methods are simple, intuitive and prove to be useful in empirical studies. Recently Wu proposed the penalized t/F-statistics with shrinkage by formally using the L1-penalized linear regression models for two-class microarray data, showing good performance. In this paper we systematically discuss the use of penalized regression models for analyzing microarray data. We generalize the two-class penalized t/F-statistics proposed by Wu to multi-class microarray data. We formally derive the ad hoc shrunken centroid used by Tibshirani et al. using the L1-penalized regression models. And we show that the penalized linear regression models provide a rigorous and unified statistical framework for sample classification and differential gene expression detection.
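
    To make the shrinkage idea in this abstract concrete, the sketch below applies soft-thresholding, the closed-form solution of an L1-penalized least-squares problem with an orthonormal design, to per-gene differences in class means. It is a simplified illustration rather than the penalized t/F-statistic from the paper, which would also standardize by a variance estimate; the function names, gene names, and data are all hypothetical.

```python
from statistics import mean


def soft_threshold(value, penalty):
    """Shrink value toward zero by penalty; values within +/-penalty become 0."""
    if value > penalty:
        return value - penalty
    if value < -penalty:
        return value + penalty
    return 0.0


def shrunken_differences(group_a, group_b, penalty):
    """Per-gene difference of class means, soft-thresholded by penalty.

    group_a and group_b map gene name -> list of expression values.
    """
    return {
        gene: soft_threshold(mean(group_a[gene]) - mean(group_b[gene]), penalty)
        for gene in group_a
    }


# Hypothetical toy data: two genes, three samples per class.
tumor = {"geneA": [5.0, 6.0, 7.0], "geneB": [2.0, 2.5, 3.0]}
normal = {"geneA": [2.0, 3.0, 4.0], "geneB": [2.0, 2.0, 3.0]}
shrunk = shrunken_differences(tumor, normal, penalty=1.0)
```

    Here the large difference for geneA survives in reduced form, while the small difference for geneB is set exactly to zero, which is how an L1 penalty performs gene selection and guards against over-fitting when p exceeds n.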

  5. The Shock and Vibration Digest. Volume 13, Number 12

    DTIC Science & Technology

    1981-12-01

    Resulting Unsteady Forces and Flow Phenomenon. Part III 26 BOOK REVIEWS STATISTICAL ENERGY ANALYSIS Chapter IV considers the problems of estimating J OF...stress, acceleration, modes. Statistical energy analysis (SEA), which is and pressure; estimations of the average system expressed in terms of random...by F.C. Nelson, SVD, 13 (8), pp 30-31 (Aug 1981) Lyons, R.H., Statistical Energy Analysis of Dynamic Systems, MIT Press, Cambridge, MA; Reviewed by H

  6. Immunohistochemical evaluation of inducible nitric oxide synthase in the epithelial lining of odontogenic cysts: A qualitative and quantitative analysis

    PubMed Central

    Akshatha, B K; Karuppiah, Karpagaselvi; Manjunath, G S; Kumarswamy, Jayalakshmi; Papaiah, Lokesh; Rao, Jyothi

    2017-01-01

    Introduction: The three common odontogenic cysts include radicular cysts (RCs), dentigerous cysts (DCs), and odontogenic keratocysts (OKCs). Among these 3 cysts, OKC has recently been classified as a benign keratocystic odontogenic tumor owing to its aggressive behavior, recurrence rate, and malignant potential. The present study involved qualitative and quantitative analysis of inducible nitric oxide synthase (iNOS) expression in the epithelial lining of RCs, DCs, and OKCs, compared iNOS expression in the epithelial linings of all 3 cysts, and determined overexpression of iNOS in OKCs, which might contribute to their aggressive behavior and malignant potential. Aims: The present study aims to investigate the role of iNOS in the pathogenesis of OKCs, DCs, and RCs by evaluating the iNOS expression in the epithelial lining of these cysts. Subjects and Methods: Analysis of iNOS expression in epithelial lining cells of 20 RCs, 20 DCs, and 20 OKCs was done using immunohistochemistry. Statistical Analysis Used: The percentage of positive cells and intensity of stain were assessed and compared among all 3 cysts using the contingency coefficient. Kappa statistics for the two observers were computed to assess interobserver agreement. Results: The percentage of iNOS-positive cells was found to be remarkably high in OKCs (12/20) – 57.1% as compared to RCs (6/20) – 28.6% and DCs (3/20) – 14.3%. The interobserver agreement for the percentage of iNOS-positive cells, assessed with kappa values, was statistically significant for OKCs (P > 0.000) and RCs (P > 0.001), with no significant values for DCs. No statistical difference exists among the 3 study samples in regard to the intensity of staining with iNOS. Conclusions: Increased iNOS expression in OKCs may contribute to bone resorption and accumulation of wild-type p53, hence making OKCs more aggressive. PMID:29391711
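
    The interobserver agreement step in this abstract can be illustrated with a small sketch. It assumes Cohen's kappa for two raters, which the abstract does not specify, and the function name `cohens_kappa` and the ratings are hypothetical rather than taken from the study.

```python
from collections import Counter


def cohens_kappa(ratings_a, ratings_b):
    """Cohen's kappa: agreement between two raters corrected for chance."""
    if len(ratings_a) != len(ratings_b) or not ratings_a:
        raise ValueError("need two non-empty rating lists of equal length")
    n = len(ratings_a)
    # Proportion of cases on which the two observers agree.
    observed = sum(a == b for a, b in zip(ratings_a, ratings_b)) / n
    # Agreement expected by chance from each observer's category frequencies.
    count_a, count_b = Counter(ratings_a), Counter(ratings_b)
    expected = sum(count_a[c] * count_b[c] for c in count_a) / (n * n)
    if expected == 1.0:
        raise ValueError("kappa is undefined when chance agreement equals 1")
    return (observed - expected) / (1 - expected)


# Made-up staining calls from two observers on ten sections.
observer_1 = ["pos", "pos", "neg", "neg", "pos", "neg", "pos", "neg", "neg", "pos"]
observer_2 = ["pos", "pos", "neg", "neg", "pos", "neg", "neg", "neg", "neg", "pos"]
kappa = cohens_kappa(observer_1, observer_2)
```

    In this toy case the observers agree on 9 of 10 sections, chance agreement is 0.5, and kappa comes out at about 0.8.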

  7. A global estimate of the Earth's magnetic crustal thickness

    NASA Astrophysics Data System (ADS)

    Vervelidou, Foteini; Thébault, Erwan

    2014-05-01

    The Earth's lithosphere is considered to be magnetic only down to the Curie isotherm. Therefore the Curie isotherm can, in principle, be estimated by analysis of magnetic data. Here, we propose such an analysis in the spectral domain by means of a newly introduced regional spatial power spectrum. This spectrum is based on the Revised Spherical Cap Harmonic Analysis (R-SCHA) formalism (Thébault et al., 2006). We briefly discuss its properties and its relationship with the Spherical Harmonic spatial power spectrum. This relationship allows us to adapt any theoretical expression of the lithospheric field power spectrum expressed in Spherical Harmonic degrees to the regional formulation. We compared previously published statistical expressions (Jackson, 1994; Voorhies et al., 2002) to the recent lithospheric field models derived from the CHAMP and airborne measurements and we finally developed a new statistical form for the power spectrum of the Earth's magnetic lithosphere that we think provides more consistent results. This expression depends on the mean magnetization, the mean crustal thickness and a power law value that describes the amount of spatial correlation of the sources. In this study, we make combined use of the R-SCHA surface power spectrum and this statistical form. We conduct a series of regional spectral analyses for the entire Earth. For each region, we estimate the R-SCHA surface power spectrum of the NGDC-720 Spherical Harmonic model (Maus, 2010). We then fit each of these observational spectra to the statistical expression of the power spectrum of the Earth's lithosphere. By doing so, we estimate the large wavelengths of the magnetic crustal thickness on a global scale that are not accessible directly from the magnetic measurements due to the masking core field. We then discuss these results and compare them to the results we obtained by conducting a similar spectral analysis, but this time in Cartesian coordinates, by means of a published statistical expression (Maus et al., 1997). We also compare our results to crustal thickness global maps derived by means of additional geophysical data (Purucker et al., 2002).

  8. Gene coexpression measures in large heterogeneous samples using count statistics.

    PubMed

    Wang, Y X Rachel; Waterman, Michael S; Huang, Haiyan

    2014-11-18

    With the advent of high-throughput technologies making large-scale gene expression data readily available, developing appropriate computational tools to process these data and distill insights into systems biology has been an important part of the "big data" challenge. Gene coexpression is one of the earliest techniques developed that is still widely in use for functional annotation, pathway analysis, and, most importantly, the reconstruction of gene regulatory networks, based on gene expression data. However, most coexpression measures do not specifically account for local features in expression profiles. For example, it is very likely that the patterns of gene association may change or only exist in a subset of the samples, especially when the samples are pooled from a range of experiments. We propose two new gene coexpression statistics based on counting local patterns of gene expression ranks to take into account the potentially diverse nature of gene interactions. In particular, one of our statistics is designed for time-course data with local dependence structures, such as time series coupled over a subregion of the time domain. We provide asymptotic analysis of their distributions and power, and evaluate their performance against a wide range of existing coexpression measures on simulated and real data. Our new statistics are fast to compute, robust against outliers, and show comparable and often better general performance.

  9. The MetabolomeExpress Project: enabling web-based processing, analysis and transparent dissemination of GC/MS metabolomics datasets.

    PubMed

    Carroll, Adam J; Badger, Murray R; Harvey Millar, A

    2010-07-14

    Standardization of analytical approaches and reporting methods via community-wide collaboration can work synergistically with web-tool development to result in rapid community-driven expansion of online data repositories suitable for data mining and meta-analysis. In metabolomics, the inter-laboratory reproducibility of gas-chromatography/mass-spectrometry (GC/MS) makes it an obvious target for such development. While a number of web-tools offer access to datasets and/or tools for raw data processing and statistical analysis, none of these systems are currently set up to act as a public repository by easily accepting, processing and presenting publicly submitted GC/MS metabolomics datasets for public re-analysis. Here, we present MetabolomeExpress, a new File Transfer Protocol (FTP) server and web-tool for the online storage, processing, visualisation and statistical re-analysis of publicly submitted GC/MS metabolomics datasets. Users may search a quality-controlled database of metabolite response statistics from publicly submitted datasets by a number of parameters (e.g., metabolite, species, organ/biofluid etc.). Users may also perform meta-analysis comparisons of multiple independent experiments or re-analyse public primary datasets via user-friendly tools for t-test, principal components analysis, hierarchical cluster analysis and correlation analysis. They may interact with chromatograms, mass spectra and peak detection results via an integrated raw data viewer. Researchers who register for a free account may upload (via FTP) their own data to the server for online processing via a novel raw data processing pipeline. MetabolomeExpress (https://www.metabolome-express.org) provides a new opportunity for the general metabolomics community to transparently present online the raw and processed GC/MS data underlying their metabolomics publications. Transparent sharing of these data will allow researchers to assess data quality and draw their own insights from published metabolomics datasets.

  10. Featured Article: Transcriptional landscape analysis identifies differently expressed genes involved in follicle-stimulating hormone induced postmenopausal osteoporosis.

    PubMed

    Maasalu, Katre; Laius, Ott; Zhytnik, Lidiia; Kõks, Sulev; Prans, Ele; Reimann, Ene; Märtson, Aare

    2017-01-01

    Osteoporosis is a disorder associated with bone tissue reorganization, bone mass, and mineral density. Osteoporosis can severely affect postmenopausal women, causing bone fragility and osteoporotic fractures. The aim of the current study was to compare blood mRNA profiles of postmenopausal women with and without osteoporosis, in order to find differences in gene expression and thus targets for future osteoporosis biomarker studies. Our study consisted of transcriptome analysis of whole blood serum from 12 elderly female osteoporotic patients and 12 non-osteoporotic elderly female controls. The transcriptome analysis was performed with RNA sequencing technology. For data analysis, the edgeR package of R Bioconductor was used. Two hundred and fourteen genes were expressed differently in osteoporotic compared with non-osteoporotic patients. Statistical analysis revealed 20 differently expressed genes with a false discovery rate of less than 1.47 × 10⁻⁴ among osteoporotic patients. The expression of 10 genes was up-regulated and that of 10 genes was down-regulated. Further statistical analysis identified a potential osteoporosis mRNA biomarker pattern consisting of six genes: CACNA1G, ALG13, SBK1, GGT7, MBNL3, and RIOK3. Functional ingenuity pathway analysis identified the strongest candidate genes with regard to potential involvement in a follicle-stimulating hormone activated network of increased osteoclast activity and hypogonadal bone loss. The differentially expressed genes identified in this study may contribute to future research of postmenopausal osteoporosis blood biomarkers.
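
    The false discovery rate threshold reported in this abstract is commonly obtained by adjusting per-gene p-values with the Benjamini-Hochberg procedure. The sketch below shows that adjustment on made-up p-values; whether the study used exactly this adjustment is an assumption, and the function name `benjamini_hochberg` is hypothetical.

```python
def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted p-values in the original order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so adjusted values stay monotone.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, pvalues[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


# Made-up raw p-values for four genes.
raw = [0.01, 0.04, 0.03, 0.005]
adjusted = benjamini_hochberg(raw)
# Genes whose adjusted value falls below the chosen FDR level are reported.
significant = [i for i, q in enumerate(adjusted) if q < 0.03]
```

    With these toy values the adjusted p-values are 0.02, 0.04, 0.04, and 0.02, so only the first and last genes pass an FDR level of 0.03.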

  11. A statistical method for measuring activation of gene regulatory networks.

    PubMed

    Esteves, Gustavo H; Reis, Luiz F L

    2018-06-13

    Gene expression data analysis is of great importance for modern molecular biology, given our ability to measure the expression profiles of thousands of genes, which enables studies rooted in systems biology. In this work, we propose a simple statistical model for measuring the activation of gene regulatory networks, instead of the traditional gene co-expression networks. We present the mathematical construction of a statistical procedure for testing hypotheses regarding gene regulatory network activation. The real probability distribution for the test statistic is evaluated by a permutation-based study. To illustrate the functionality of the proposed methodology, we also present a simple example based on a small hypothetical network and the activation measuring of two KEGG networks, both based on gene expression data collected from gastric and esophageal samples. The two KEGG networks were also analyzed for a public database, available through NCBI-GEO, presented as Supplementary Material. This method was implemented in an R package that is available at the Bioconductor project website under the name maigesPack.
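
    The permutation-based evaluation mentioned in this abstract follows a general recipe: compute the statistic on the observed labels, recompute it on many random relabelings, and report how often the permuted statistic is at least as extreme. The sketch below uses a plain difference in group means as a stand-in statistic, since the abstract does not give the network activation statistic itself; the function name and data are hypothetical.

```python
import random
from statistics import mean


def permutation_pvalue(group_a, group_b, n_permutations=2000, seed=0):
    """Two-sided permutation p-value for a difference in group means."""
    rng = random.Random(seed)
    observed = abs(mean(group_a) - mean(group_b))
    pooled = list(group_a) + list(group_b)
    n_a = len(group_a)
    hits = 0
    for _ in range(n_permutations):
        rng.shuffle(pooled)  # random reassignment of samples to groups
        stat = abs(mean(pooled[:n_a]) - mean(pooled[n_a:]))
        if stat >= observed:
            hits += 1
    # Add-one correction keeps the estimate strictly positive.
    return (hits + 1) / (n_permutations + 1)


# Made-up expression scores for two tissue types.
tissue_a = [10, 11, 12, 13, 14, 15]
tissue_b = [1, 2, 3, 4, 5, 6]
p_separated = permutation_pvalue(tissue_a, tissue_b)
p_identical = permutation_pvalue([1, 2, 3], [1, 2, 3])
```

    Because the null distribution is built from the data, the approach needs no distributional assumption about the statistic, at the cost of more computation as the number of permutations grows.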

  12. CXCR4 expression is associated with time-course permanent and temporary myocardial infarction in rats.

    PubMed

    Kiani, Ali Asghar; Babaei, Fereshteh; Sedighi, Mehrnoosh; Soleimani, Azam; Ahmadi, Kolsum; Shahrokhi, Somayeh; Anbari, Khatereh; Nazari, Afshin

    2017-06-01

    Experimental myocardial infarction triggers secretion of stromal cell-derived factor 1 and leads to an increase in the expression of its receptor, CXCR4, on the surface of various cells. The aim of this study was to evaluate the expression pattern of CXCR4 in peripheral blood cells following time-course permanent and temporary ischemia in rats. Fourteen male Wistar rats were divided into two groups of seven and were placed under permanent and transient ischemia. Peripheral blood mononuclear cells were isolated at different time points, and RNA was extracted and subjected to qRT-PCR analysis of the CXCR4 gene. Based on repeated measures analysis of variance, the differences in the expression levels of the gene in each of the groups were statistically significant over time (the effect of time) (P < 0.001). Additionally, the difference in gene expression between the two groups was statistically significant (the effect of group), such that at all times, the expression levels of the gene were significantly higher in the permanent ischemia group than in the transient ischemia group (P < 0.001). Moreover, the interactive effect of time-group on gene expression was statistically significant (P < 0.001). CXCR4 is modulated in an induced ischemia context, implying a possible association with myocardial infarction. Examining CXCR4 expression during ischemic changes shows that damage to heart tissue triggers secretion of the inflammatory chemokine SDF, followed by CXCR4 expression in blood cells. These observations suggest that changes in the expression of CXCR4 may be considered a valuable marker for monitoring myocardial infarction.

  13. Immunohistochemical evaluation of inducible nitric oxide synthase in the epithelial lining of odontogenic cysts: A qualitative and quantitative analysis.

    PubMed

    Akshatha, B K; Karuppiah, Karpagaselvi; Manjunath, G S; Kumarswamy, Jayalakshmi; Papaiah, Lokesh; Rao, Jyothi

    2017-01-01

    The three common odontogenic cysts include radicular cysts (RCs), dentigerous cysts (DCs), and odontogenic keratocysts (OKCs). Among these 3 cysts, OKC has recently been classified as a benign keratocystic odontogenic tumor owing to its aggressive behavior, recurrence rate, and malignant potential. The present study involved qualitative and quantitative analysis of inducible nitric oxide synthase (iNOS) expression in the epithelial lining of RCs, DCs, and OKCs, compared iNOS expression in the epithelial linings of all 3 cysts, and determined overexpression of iNOS in OKCs, which might contribute to their aggressive behavior and malignant potential. The present study aims to investigate the role of iNOS in the pathogenesis of OKCs, DCs, and RCs by evaluating the iNOS expression in the epithelial lining of these cysts. Analysis of iNOS expression in epithelial lining cells of 20 RCs, 20 DCs, and 20 OKCs was done using immunohistochemistry. The percentage of positive cells and intensity of stain were assessed and compared among all 3 cysts using the contingency coefficient. Kappa statistics for the two observers were computed to assess interobserver agreement. The percentage of iNOS-positive cells was found to be remarkably high in OKCs (12/20) - 57.1% as compared to RCs (6/20) - 28.6% and DCs (3/20) - 14.3%. The interobserver agreement for the percentage of iNOS-positive cells, assessed with kappa values, was statistically significant for OKCs (P > 0.000) and RCs (P > 0.001), with no significant values for DCs. No statistical difference exists among the 3 study samples in regard to the intensity of staining with iNOS. Increased iNOS expression in OKCs may contribute to bone resorption and accumulation of wild-type p53, hence making OKCs more aggressive.

  14. A user-friendly workflow for analysis of Illumina gene expression bead array data available at the arrayanalysis.org portal.

    PubMed

    Eijssen, Lars M T; Goelela, Varshna S; Kelder, Thomas; Adriaens, Michiel E; Evelo, Chris T; Radonjic, Marijana

    2015-06-30

    Illumina whole-genome expression bead arrays are a widely used platform for transcriptomics. Most of the tools available for the analysis of the resulting data are not easily applicable by less experienced users. ArrayAnalysis.org provides researchers with an easy-to-use and comprehensive interface to the functionality of R and Bioconductor packages for microarray data analysis. As a modular open source project, it allows developers to contribute modules that provide support for additional types of data or extend workflows. To enable data analysis of Illumina bead arrays for a broad user community, we have developed a module for ArrayAnalysis.org that provides a free and user-friendly web interface for quality control and pre-processing for these arrays. This module can be used together with existing modules for statistical and pathway analysis to provide a full workflow for Illumina gene expression data analysis. The module accepts data exported from Illumina's GenomeStudio, and provides the user with quality control plots and normalized data. The outputs are directly linked to the existing statistics module of ArrayAnalysis.org, but can also be downloaded for further downstream analysis in third-party tools. The Illumina bead arrays analysis module is available at http://www.arrayanalysis.org . A user guide, a tutorial demonstrating the analysis of an example dataset, and R scripts are available. The module can be used as a starting point for statistical evaluation and pathway analysis provided on the website or to generate processed input data for a broad range of applications in life sciences research.

  15. MUC4: a novel prognostic factor of oral squamous cell carcinoma.

    PubMed

    Hamada, Tomofumi; Wakamatsu, Tsunenobu; Miyahara, Mayumi; Nagata, Satoshi; Nomura, Masahiro; Kamikawa, Yoshiaki; Yamada, Norishige; Batra, Surinder K; Yonezawa, Suguru; Sugihara, Kazumasa

    2012-04-15

    MUC4 mucin is now known to be expressed in various normal and cancer tissues. We have previously reported that MUC4 expression is a novel prognostic factor in several malignant tumors; however, it has not been investigated in oral squamous cell carcinoma (OSCC). The aim of our study is to evaluate the prognostic significance of MUC4 expression in OSCC. We examined the expression profile of MUC4 in OSCC tissues from 150 patients using immunohistochemistry. Its prognostic significance in OSCC was statistically analyzed. MUC4 was expressed in 61 of the 150 patients with OSCC. MUC4 expression was significantly correlated with higher T classification (p = 0.0004), positive nodal metastasis (p = 0.049), advanced tumor stage (p = 0.002), diffuse invasion of cancer cells (p = 0.004) and patient's death (p = 0.004) in OSCC. Multivariate analysis showed that MUC4 expression (p = 0.011), tumor location (p = 0.032) and diffuse invasion (p = 0.009) were statistically significant risk factors. Backward stepwise multivariate analysis demonstrated MUC4 expression (p = 0.0015) and diffuse invasion (p = 0.018) to be statistically significant independent risk factors of poor survival in OSCC. The disease-free and overall survival of patients with MUC4 expression was significantly worse than those without MUC4 expression (p < 0.0001 and p = 0.0001). In addition, the MUC4 expression was a significant risk factor for local recurrence and subsequent nodal metastasis in OSCC (p = 0.017 and p = 0.0001). We first report MUC4 overexpression is an independent factor for poor prognosis of patients with OSCC; therefore, patients with OSCC showing positive MUC4 expression should be followed up carefully. Copyright © 2011 UICC.

  16. HYPOTHESIS SETTING AND ORDER STATISTIC FOR ROBUST GENOMIC META-ANALYSIS.

    PubMed

    Song, Chi; Tseng, George C

    2014-01-01

    Meta-analysis techniques have been widely developed and applied in genomic applications, especially for combining multiple transcriptomic studies. In this paper, we propose an order statistic of p-values (rth ordered p-value, rOP) across combined studies as the test statistic. We illustrate different hypothesis settings that detect gene markers differentially expressed (DE) "in all studies", "in the majority of studies", or "in one or more studies", and specify rOP as a suitable method for detecting DE genes "in the majority of studies". We develop methods to estimate the parameter r in rOP for real applications. Statistical properties such as its asymptotic behavior and a one-sided testing correction for detecting markers of concordant expression changes are explored. Power calculation and simulation show better performance of rOP compared to classical Fisher's method, Stouffer's method, minimum p-value method and maximum p-value method under the focused hypothesis setting. Theoretically, rOP is found connected to the naïve vote counting method and can be viewed as a generalized form of vote counting with better statistical properties. The method is applied to three microarray meta-analysis examples including major depressive disorder, brain cancer and diabetes. The results demonstrate rOP as a more generalizable, robust and sensitive statistical framework to detect disease-related markers.
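
    The order-statistic idea in this abstract can be sketched with standard probability alone. If the K study-level p-values for a gene are independent and uniform under the null, the rth smallest follows a Beta(r, K - r + 1) distribution, whose CDF at x equals the probability that a Binomial(K, x) variable is at least r. The code below uses that identity; the function name `rop_pvalue` and the p-values are hypothetical, and the paper's method for estimating r and its one-sided correction are not reproduced.

```python
from math import comb


def rop_pvalue(pvalues, r):
    """Combined p-value based on the r-th smallest of K independent p-values."""
    k = len(pvalues)
    if not 1 <= r <= k:
        raise ValueError("r must be between 1 and the number of studies")
    x = sorted(pvalues)[r - 1]  # r-th ordered p-value
    # Beta(r, k - r + 1) CDF at x, written as a binomial upper tail.
    return sum(comb(k, j) * x**j * (1 - x) ** (k - j) for j in range(r, k + 1))


# Made-up p-values for one gene across four studies.
study_pvalues = [0.01, 0.20, 0.03, 0.50]
majority = rop_pvalue(study_pvalues, r=3)   # evidence in at least 3 of 4 studies
minimum = rop_pvalue(study_pvalues, r=1)    # reduces to the minimum p-value method
maximum = rop_pvalue(study_pvalues, r=4)    # reduces to the maximum p-value method
```

    Setting r = 1 or r = K recovers the minimum and maximum p-value methods that the abstract lists as comparators, which is why an intermediate r suits the "majority of studies" setting.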

  17. Finding differentially expressed genes in high dimensional data: Rank based test statistic via a distance measure.

    PubMed

    Mathur, Sunil; Sadana, Ajit

    2015-12-01

    We present a rank-based test statistic for the identification of differentially expressed genes using a distance measure. The proposed test statistic is highly robust against extreme values and makes no assumption about the distribution of the parent population. Simulation studies show that the proposed test is more powerful than some of the commonly used methods, such as the paired t-test, the Wilcoxon signed rank test, and significance analysis of microarrays (SAM), under certain non-normal distributions. The asymptotic distribution of the test statistic and the p-value function are discussed. The application of the proposed method is shown using a real-life data set. © The Author(s) 2011.

  18. The chemiluminescence based Ziplex automated workstation focus array reproduces ovarian cancer Affymetrix GeneChip expression profiles.

    PubMed

    Quinn, Michael C J; Wilson, Daniel J; Young, Fiona; Dempsey, Adam A; Arcand, Suzanna L; Birch, Ashley H; Wojnarowicz, Paulina M; Provencher, Diane; Mes-Masson, Anne-Marie; Englert, David; Tonin, Patricia N

    2009-07-06

    As gene expression signatures may serve as biomarkers, there is a need to develop technologies based on mRNA expression patterns that are adaptable for translational research. Xceed Molecular has recently developed a Ziplex technology, that can assay for gene expression of a discrete number of genes as a focused array. The present study has evaluated the reproducibility of the Ziplex system as applied to ovarian cancer research of genes shown to exhibit distinct expression profiles initially assessed by Affymetrix GeneChip analyses. The new chemiluminescence-based Ziplex gene expression array technology was evaluated for the expression of 93 genes selected based on their Affymetrix GeneChip profiles as applied to ovarian cancer research. Probe design was based on the Affymetrix target sequence that favors the 3' UTR of transcripts in order to maximize reproducibility across platforms. Gene expression analysis was performed using the Ziplex Automated Workstation. Statistical analyses were performed to evaluate reproducibility of both the magnitude of expression and differences between normal and tumor samples by correlation analyses, fold change differences and statistical significance testing. Expressions of 82 of 93 (88.2%) genes were highly correlated (p < 0.01) in a comparison of the two platforms. Overall, 75 of 93 (80.6%) genes exhibited consistent results in normal versus tumor tissue comparisons for both platforms (p < 0.001). The fold change differences were concordant for 87 of 93 (94%) genes, where there was agreement between the platforms regarding statistical significance for 71 (76%) of 87 genes. There was a strong agreement between the two platforms as shown by comparisons of log2 fold differences of gene expression between tumor versus normal samples (R = 0.93) and by Bland-Altman analysis, where greater than 90% of expression values fell within the 95% limits of agreement. 
Overall concordance of gene expression patterns based on correlations, statistical significance between tumor and normal ovary data, and fold changes was consistent between the Ziplex and Affymetrix platforms. The reproducibility and ease-of-use of the technology suggests that the Ziplex array is a suitable platform for translational research.
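
    The Bland-Altman comparison mentioned in this abstract can be sketched as follows. This is an illustration under the usual convention that the 95% limits of agreement are the mean difference plus or minus 1.96 sample standard deviations; the function name and the log2 fold-change values are hypothetical and do not come from the study.

```python
from statistics import mean, stdev


def bland_altman(platform_a, platform_b):
    """Return (bias, lower limit, upper limit, fraction of differences inside)."""
    diffs = [a - b for a, b in zip(platform_a, platform_b)]
    bias = mean(diffs)                # mean difference between the platforms
    spread = 1.96 * stdev(diffs)      # half-width of the 95% limits of agreement
    lower, upper = bias - spread, bias + spread
    inside = sum(lower <= d <= upper for d in diffs) / len(diffs)
    return bias, lower, upper, inside


# Hypothetical log2 fold differences (tumor versus normal) for four genes.
ziplex = [1.0, 2.0, 3.0, 4.0]
affymetrix = [0.5, 2.5, 2.0, 4.0]
print(bland_altman(ziplex, affymetrix))
```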

  19. Accurate landmarking of three-dimensional facial data in the presence of facial expressions and occlusions using a three-dimensional statistical facial feature model.

    PubMed

    Zhao, Xi; Dellandréa, Emmanuel; Chen, Liming; Kakadiaris, Ioannis A

    2011-10-01

    Three-dimensional face landmarking aims at automatically localizing facial landmarks and has a wide range of applications (e.g., face recognition, face tracking, and facial expression analysis). Existing methods assume neutral facial expressions and unoccluded faces. In this paper, we propose a general learning-based framework for reliable landmark localization on 3-D facial data under challenging conditions (i.e., facial expressions and occlusions). Our approach relies on a statistical model, called 3-D statistical facial feature model, which learns both the global variations in configurational relationships between landmarks and the local variations of texture and geometry around each landmark. Based on this model, we further propose an occlusion classifier and a fitting algorithm. Results from experiments on three publicly available 3-D face databases (FRGC, BU-3-DFE, and Bosphorus) demonstrate the effectiveness of our approach, in terms of landmarking accuracy and robustness, in the presence of expressions and occlusions.

  20. Analyzing gene expression from relative codon usage bias in Yeast genome: a statistical significance and biological relevance.

    PubMed

    Das, Shibsankar; Roymondal, Uttam; Sahoo, Satyabrata

    2009-08-15

    Based on the hypothesis that highly expressed genes are often characterized by strong compositional bias in terms of codon usage, there are a number of measures currently in use that quantify codon usage bias in genes, and hence provide numerical indices to predict the expression levels of genes. With the recent advent of an expression measure based on the score of the relative codon usage bias (RCBS), we have explicitly tested the performance of this numerical measure in predicting the gene expression level and illustrate this with an analysis of yeast genomes. In contrast to previous studies, we observe a weak correlation between GC content and RCBS, but a selective pressure on the codon preferences in highly expressed genes. The assertion that the expression of a given gene depends on the score of relative codon usage bias (RCBS) is supported by the data. We further observe a strong correlation between RCBS and protein length, indicating natural selection in favour of shorter genes being expressed at higher levels. We also attempt a statistical analysis to assess the strength of relative codon bias in genes as a guide to their likely expression level, suggesting a decrease of the informational entropy in the highly expressed genes.

  1. Application of survival analysis methodology to the quantitative analysis of LC-MS proteomics data.

    PubMed

    Tekwe, Carmen D; Carroll, Raymond J; Dabney, Alan R

    2012-08-01

    Protein abundance in quantitative proteomics is often based on observed spectral features derived from liquid chromatography mass spectrometry (LC-MS) or LC-MS/MS experiments. Peak intensities are largely non-normal in distribution. Furthermore, LC-MS-based proteomics data frequently have large proportions of missing peak intensities due to censoring mechanisms on low-abundance spectral features. Recognizing that the observed peak intensities detected with the LC-MS method are all positive, skewed and often left-censored, we propose using survival methodology to carry out differential expression analysis of proteins. Various standard statistical techniques, including non-parametric tests such as the Kolmogorov-Smirnov and Wilcoxon-Mann-Whitney rank sum tests, and the parametric survival model and accelerated failure time (AFT) model with log-normal, log-logistic and Weibull distributions, were used to detect any differentially expressed proteins. The statistical operating characteristics of each method are explored using both real and simulated datasets. Survival methods generally have greater statistical power than standard differential expression methods when the proportion of missing protein level data is 5% or more. In particular, the AFT models we consider consistently achieve greater statistical power than standard testing procedures, with the discrepancy widening with increasing missingness in the proportions. The testing procedures discussed in this article can all be performed using readily available software such as R. The R code is provided as supplemental material.

  2. Length bias correction in gene ontology enrichment analysis using logistic regression.

    PubMed

    Mi, Gu; Di, Yanming; Emerson, Sarah; Cumbie, Jason S; Chang, Jeff H

    2012-01-01

    When assessing differential gene expression from RNA sequencing data, commonly used statistical tests tend to have greater power to detect differential expression of genes encoding longer transcripts. This phenomenon, called "length bias", will influence subsequent analyses such as Gene Ontology enrichment analysis. In the presence of length bias, Gene Ontology categories that include longer genes are more likely to be identified as enriched. These categories, however, are not necessarily biologically more relevant. We show that one can effectively adjust for length bias in Gene Ontology analysis by including transcript length as a covariate in a logistic regression model. The logistic regression model makes the statistical issue underlying length bias more transparent: transcript length becomes a confounding factor when it correlates with both the Gene Ontology membership and the significance of the differential expression test. The inclusion of the transcript length as a covariate allows one to investigate the direct correlation between the Gene Ontology membership and the significance of testing differential expression, conditional on the transcript length. We present both real and simulated data examples to show that the logistic regression approach is simple, effective, and flexible.
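
    The length adjustment described above can be sketched as a logistic regression with transcript length as a covariate. In this minimal, dependency-free illustration, membership in a Gene Ontology category is regressed on a differential-expression score and on transcript length; the choice of response variable, the plain gradient-ascent fit, the function names and the toy data are all assumptions made for illustration, not the authors' implementation.

```python
import math


def standardize(values):
    """Centre to mean 0 and scale to unit (population) standard deviation."""
    m = sum(values) / len(values)
    sd = math.sqrt(sum((v - m) ** 2 for v in values) / len(values))
    return [(v - m) / sd for v in values]


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def log_likelihood(beta, rows, y):
    """Bernoulli log-likelihood of the logistic model."""
    total = 0.0
    for x, yi in zip(rows, y):
        p = sigmoid(sum(b * xi for b, xi in zip(beta, x)))
        total += yi * math.log(p) + (1 - yi) * math.log(1 - p)
    return total


def fit_logistic(rows, y, steps=2000, lr=0.5):
    """Maximise the average log-likelihood by fixed-step gradient ascent."""
    beta = [0.0] * len(rows[0])
    n = len(rows)
    for _ in range(steps):
        grad = [0.0] * len(beta)
        for x, yi in zip(rows, y):
            p = sigmoid(sum(b * xi for b, xi in zip(beta, x)))
            for j, xi in enumerate(x):
                grad[j] += (yi - p) * xi / n
        beta = [b + lr * g for b, g in zip(beta, grad)]
    return beta


# Hypothetical genes: DE score (e.g. -log10 p), transcript length in kb,
# and membership in one GO category.
de_score = [0.2, 1.5, 0.4, 2.8, 0.1, 3.0, 0.9, 2.2, 0.3, 1.1, 2.5, 0.6]
length_kb = [0.8, 2.5, 1.0, 4.0, 0.6, 3.5, 1.2, 3.8, 0.9, 2.0, 1.5, 3.0]
in_category = [0, 1, 0, 1, 0, 0, 1, 1, 0, 0, 1, 1]

# Design matrix columns: intercept, standardized DE score, standardized length.
rows = [[1.0, d, l] for d, l in zip(standardize(de_score), standardize(length_kb))]
intercept, beta_de, beta_length = fit_logistic(rows, in_category)
print(beta_de, beta_length)
```

    In this setup the coefficient on the DE score describes its association with category membership conditional on transcript length, which is the quantity the abstract proposes to examine.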

  3. The Omics Dashboard for interactive exploration of gene-expression data.

    PubMed

    Paley, Suzanne; Parker, Karen; Spaulding, Aaron; Tomb, Jean-Francois; O'Maille, Paul; Karp, Peter D

    2017-12-01

    The Omics Dashboard is a software tool for interactive exploration and analysis of gene-expression datasets. The Omics Dashboard is organized as a hierarchy of cellular systems. At the highest level of the hierarchy the Dashboard contains graphical panels depicting systems such as biosynthesis, energy metabolism, regulation and central dogma. Each of those panels contains a series of X-Y plots depicting expression levels of subsystems of that panel, e.g. subsystems within the central dogma panel include transcription, translation and protein maturation and folding. The Dashboard presents a visual read-out of the expression status of cellular systems to facilitate a rapid top-down user survey of how all cellular systems are responding to a given stimulus, and to enable the user to quickly view the responses of genes within specific systems of interest. Although the Dashboard is complementary to traditional statistical methods for analysis of gene-expression data, we show how it can detect changes in gene expression that statistical techniques may overlook. We present the capabilities of the Dashboard using two case studies: the analysis of lipid production for the marine alga Thalassiosira pseudonana, and an investigation of a shift from anaerobic to aerobic growth for the bacterium Escherichia coli. © The Author(s) 2017. Published by Oxford University Press on behalf of Nucleic Acids Research.

  4. The Omics Dashboard for interactive exploration of gene-expression data

    PubMed Central

    Paley, Suzanne; Parker, Karen; Spaulding, Aaron; Tomb, Jean-Francois; O’Maille, Paul

    2017-01-01

    The Omics Dashboard is a software tool for interactive exploration and analysis of gene-expression datasets. The Omics Dashboard is organized as a hierarchy of cellular systems. At the highest level of the hierarchy the Dashboard contains graphical panels depicting systems such as biosynthesis, energy metabolism, regulation and central dogma. Each of those panels contains a series of X–Y plots depicting expression levels of subsystems of that panel, e.g. subsystems within the central dogma panel include transcription, translation and protein maturation and folding. The Dashboard presents a visual read-out of the expression status of cellular systems to facilitate a rapid top-down user survey of how all cellular systems are responding to a given stimulus, and to enable the user to quickly view the responses of genes within specific systems of interest. Although the Dashboard is complementary to traditional statistical methods for analysis of gene-expression data, we show how it can detect changes in gene expression that statistical techniques may overlook. We present the capabilities of the Dashboard using two case studies: the analysis of lipid production for the marine alga Thalassiosira pseudonana, and an investigation of a shift from anaerobic to aerobic growth for the bacterium Escherichia coli. PMID:29040755

  5. Transcriptome profiling of a Saccharomyces cerevisiae mutant with a constitutively activated Ras/cAMP pathway.

    PubMed

    Jones, D L; Petty, J; Hoyle, D C; Hayes, A; Ragni, E; Popolo, L; Oliver, S G; Stateva, L I

    2003-12-16

    Often changes in gene expression levels have been considered significant only when above/below some arbitrarily chosen threshold. We investigated the effect of applying a purely statistical approach to microarray analysis and demonstrated that small changes in gene expression have biological significance. Whole genome microarray analysis of a pde2Delta mutant, constructed in the Saccharomyces cerevisiae reference strain FY23, revealed altered expression of approximately 11% of protein encoding genes. The mutant, characterized by constitutive activation of the Ras/cAMP pathway, has increased sensitivity to stress, reduced ability to assimilate nonfermentable carbon sources, and some cell wall integrity defects. Applying the Munich Information Centre for Protein Sequences (MIPS) functional categories revealed increased expression of genes related to ribosome biogenesis and downregulation of genes in the cell rescue, defense, cell death and aging category, suggesting a decreased response to stress conditions. A reduced level of gene expression in the unfolded protein response pathway (UPR) was observed. Cell wall genes whose expression was affected by this mutation were also identified. Several of the cAMP-responsive orphan genes, upon further investigation, revealed cell wall functions; others had previously unidentified phenotypes assigned to them. This investigation provides a statistical global transcriptome analysis of the cellular response to constitutive activation of the Ras/cAMP pathway.

  6. mESAdb: microRNA Expression and Sequence Analysis Database

    PubMed Central

    Kaya, Koray D.; Karakülah, Gökhan; Yakıcıer, Cengiz M.; Acar, Aybar C.; Konu, Özlen

    2011-01-01

    microRNA expression and sequence analysis database (http://konulab.fen.bilkent.edu.tr/mirna/) (mESAdb) is a regularly updated database for the multivariate analysis of sequences and expression of microRNAs from multiple taxa. mESAdb is modular and has a user interface implemented in PHP and JavaScript and coupled with statistical analysis and visualization packages written for the R language. The database primarily comprises mature microRNA sequences and their target data, along with selected human, mouse and zebrafish expression data sets. mESAdb analysis modules allow (i) mining of microRNA expression data sets for subsets of microRNAs selected manually or by motif; (ii) pair-wise multivariate analysis of expression data sets within and between taxa; and (iii) association of microRNA subsets with annotation databases, HUGE Navigator, KEGG and GO. The use of existing and customized R packages facilitates future addition of data sets and analysis tools. Furthermore, the ability to upload and analyze user-specified data sets makes mESAdb an interactive and expandable analysis tool for microRNA sequence and expression data. PMID:21177657

  7. mESAdb: microRNA expression and sequence analysis database.

    PubMed

    Kaya, Koray D; Karakülah, Gökhan; Yakicier, Cengiz M; Acar, Aybar C; Konu, Ozlen

    2011-01-01

    microRNA expression and sequence analysis database (http://konulab.fen.bilkent.edu.tr/mirna/) (mESAdb) is a regularly updated database for the multivariate analysis of sequences and expression of microRNAs from multiple taxa. mESAdb is modular and has a user interface implemented in PHP and JavaScript and coupled with statistical analysis and visualization packages written for the R language. The database primarily comprises mature microRNA sequences and their target data, along with selected human, mouse and zebrafish expression data sets. mESAdb analysis modules allow (i) mining of microRNA expression data sets for subsets of microRNAs selected manually or by motif; (ii) pair-wise multivariate analysis of expression data sets within and between taxa; and (iii) association of microRNA subsets with annotation databases, HUGE Navigator, KEGG and GO. The use of existing and customized R packages facilitates future addition of data sets and analysis tools. Furthermore, the ability to upload and analyze user-specified data sets makes mESAdb an interactive and expandable analysis tool for microRNA sequence and expression data.

  8. EzArray: A web-based highly automated Affymetrix expression array data management and analysis system

    PubMed Central

    Zhu, Yuerong; Zhu, Yuelin; Xu, Wei

    2008-01-01

    Background Though microarray experiments are very popular in life science research, managing and analyzing microarray data are still challenging tasks for many biologists. Most microarray programs require users to have sophisticated knowledge of mathematics, statistics and computer skills for usage. With accumulating microarray data deposited in public databases, easy-to-use programs to re-analyze previously published microarray data are in high demand. Results EzArray is a web-based Affymetrix expression array data management and analysis system for researchers who need to organize microarray data efficiently and get data analyzed instantly. EzArray organizes microarray data into projects that can be analyzed online with predefined or custom procedures. EzArray performs data preprocessing and detection of differentially expressed genes with statistical methods. All analysis procedures are optimized and highly automated so that even novice users with limited pre-knowledge of microarray data analysis can complete initial analysis quickly. Since all input files, analysis parameters, and executed scripts can be downloaded, EzArray provides maximum reproducibility for each analysis. In addition, EzArray integrates with Gene Expression Omnibus (GEO) and allows instantaneous re-analysis of published array data. Conclusion EzArray is a novel Affymetrix expression array data analysis and sharing system. EzArray provides easy-to-use tools for re-analyzing published microarray data and will help both novice and experienced users perform initial analysis of their microarray data from the location of data storage. We believe EzArray will be a useful system for facilities with microarray services and laboratories with multiple members involved in microarray data analysis. EzArray is freely available from . PMID:18218103

  9. A comprehensive comparison of RNA-Seq-based transcriptome analysis from reads to differential gene expression and cross-comparison with microarrays: a case study in Saccharomyces cerevisiae

    PubMed Central

    Nookaew, Intawat; Papini, Marta; Pornputtapong, Natapol; Scalcinati, Gionata; Fagerberg, Linn; Uhlén, Matthias; Nielsen, Jens

    2012-01-01

    RNA-seq has recently become an attractive method of choice in the study of transcriptomes, promising several advantages compared with microarrays. In this study, we sought to assess the contribution of the different analytical steps involved in the analysis of RNA-seq data generated with the Illumina platform, and to perform a cross-platform comparison based on the results obtained through Affymetrix microarray. As a case study for our work, we used the Saccharomyces cerevisiae strain CEN.PK 113-7D, grown under two different conditions (batch and chemostat). Here, we assess the influence of genetic variation on the estimation of gene expression level using three different aligners for read-mapping (Gsnap, Stampy and TopHat) on the S288c genome, the capabilities of five different statistical methods to detect differential gene expression (baySeq, Cuffdiff, DESeq, edgeR and NOISeq), and we explore the consistency between RNA-seq analysis using a reference genome and a de novo assembly approach. High reproducibility among biological replicates (correlation ≥0.99) and high consistency between the two platforms for analysis of gene expression levels (correlation ≥0.91) are reported. The results from differential gene expression identification derived from the different statistical methods, as well as their integrated analysis results based on gene ontology annotation, are in good agreement. Overall, our study provides a useful and comprehensive comparison between the two platforms (RNA-seq and microarrays) for gene expression analysis and addresses the contribution of the different steps involved in the analysis of RNA-seq data. PMID:22965124

  10. Statistical Test of Expression Pattern (STEPath): a new strategy to integrate gene expression data with genomic information in individual and meta-analysis studies.

    PubMed

    Martini, Paolo; Risso, Davide; Sales, Gabriele; Romualdi, Chiara; Lanfranchi, Gerolamo; Cagnin, Stefano

    2011-04-11

    In the last decades, microarray technology has spread, leading to a dramatic increase of publicly available datasets. The first statistical tools developed were focused on the identification of significant differentially expressed genes. Later, researchers moved toward the systematic integration of gene expression profiles with additional biological information, such as chromosomal location, ontological annotations or sequence features. The analysis of gene expression linked to physical location of genes on chromosomes allows the identification of transcriptionally imbalanced regions, while, Gene Set Analysis focuses on the detection of coordinated changes in transcriptional levels among sets of biologically related genes. In this field, meta-analysis offers the possibility to compare different studies, addressing the same biological question to fully exploit public gene expression datasets. We describe STEPath, a method that starts from gene expression profiles and integrates the analysis of imbalanced region as an a priori step before performing gene set analysis. The application of STEPath in individual studies produced gene set scores weighted by chromosomal activation. As a final step, we propose a way to compare these scores across different studies (meta-analysis) on related biological issues. One complication with meta-analysis is batch effects, which occur because molecular measurements are affected by laboratory conditions, reagent lots and personnel differences. Major problems occur when batch effects are correlated with an outcome of interest and lead to incorrect conclusions. We evaluated the power of combining chromosome mapping and gene set enrichment analysis, performing the analysis on a dataset of leukaemia (example of individual study) and on a dataset of skeletal muscle diseases (meta-analysis approach). 
In leukaemia, we identified the Hox gene set, a gene set closely related to the pathology that other algorithms of gene set analysis do not identify, while the meta-analysis approach on muscular disease discriminates between related pathologies and correlates similar ones from different studies. STEPath is a new method that integrates gene expression profiles, genomic co-expressed regions and the information about the biological function of genes. The usage of the STEPath-computed gene set scores overcomes batch effects in the meta-analysis approaches allowing the direct comparison of different pathologies and different studies on a gene set activation level.

  11. CHESS (CgHExpreSS): a comprehensive analysis tool for the analysis of genomic alterations and their effects on the expression profile of the genome.

    PubMed

    Lee, Mikyung; Kim, Yangseok

    2009-12-16

    Genomic alterations frequently occur in many cancer patients and play important mechanistic roles in the pathogenesis of cancer. Furthermore, they can modify the expression level of genes due to altered copy number in the corresponding region of the chromosome. An accumulating body of evidence supports the possibility that strong genome-wide correlation exists between DNA content and gene expression. Therefore, more comprehensive analysis is needed to quantify the relationship between genomic alteration and gene expression. A well-designed bioinformatics tool is essential to perform this kind of integrative analysis. A few programs have already been introduced for integrative analysis. However, the published software is limited in its ability to perform comprehensive integrated analysis because of limitations in the implemented algorithms and visualization modules. To address this issue, we have implemented the Java-based program CHESS to allow integrative analysis of two experimental data sets: genomic alteration and genome-wide expression profile. CHESS is composed of a genomic alteration analysis module and an integrative analysis module. The genomic alteration analysis module detects genomic alteration by applying a threshold-based method or the SW-ARRAY algorithm and investigates whether the detected alteration is phenotype specific or not. On the other hand, the integrative analysis module measures the genomic alteration's influence on gene expression. It is divided into two separate parts. The first part calculates overall correlation between comparative genomic hybridization ratio and gene expression level by applying the following three statistical methods: simple linear regression, Spearman rank correlation and Pearson's correlation. 
In the second part, CHESS detects the genes that are differentially expressed according to the genomic alteration pattern with three alternative statistical approaches: Student's t-test, Fisher's exact test and Chi square test. By successive operations of two modules, users can clarify how gene expression levels are affected by the phenotype specific genomic alterations. As CHESS was developed in both Java application and web environments, it can be run on a web browser or a local machine. It also supports all experimental platforms if a properly formatted text file is provided to include the chromosomal position of probes and their gene identifiers. CHESS is a user-friendly tool for investigating disease specific genomic alterations and quantitative relationships between those genomic alterations and genome-wide gene expression profiling.
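
    The three measures named for the first part of the integrative module (simple linear regression, Spearman rank correlation and Pearson's correlation) can be sketched in a few lines. This is a generic illustration rather than CHESS code: the function names and the per-gene CGH ratios and expression values are hypothetical.

```python
from math import sqrt


def pearson(x, y):
    """Pearson's product-moment correlation coefficient."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def average_ranks(values):
    """Rank from 1, giving tied values the average of their positions."""
    order = sorted(range(len(values)), key=values.__getitem__)
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        for k in range(i, j + 1):
            ranks[order[k]] = (i + j) / 2 + 1
        i = j + 1
    return ranks


def spearman(x, y):
    """Spearman rank correlation: Pearson's correlation of the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


def linear_fit(x, y):
    """Least-squares slope and intercept of y regressed on x."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    slope = sum((a - mx) * (b - my) for a, b in zip(x, y)) / sum((a - mx) ** 2 for a in x)
    return slope, my - slope * mx


# Hypothetical log2 CGH ratios and log2 expression changes for five genes.
cgh_ratio = [-1.0, -0.5, 0.0, 0.5, 1.0]
expression = [-2.0, -0.4, 0.1, 0.3, 3.0]
print(pearson(cgh_ratio, expression), spearman(cgh_ratio, expression))
print(linear_fit(cgh_ratio, expression))
```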

  12. ArraySolver: an algorithm for colour-coded graphical display and Wilcoxon signed-rank statistics for comparing microarray gene expression data.

    PubMed

    Khan, Haseeb Ahmad

    2004-01-01

    The massive surge in the production of microarray data poses a great challenge for proper analysis and interpretation. In recent years numerous computational tools have been developed to extract meaningful interpretation of microarray gene expression data. However, a convenient tool for two-groups comparison of microarray data is still lacking and users have to rely on commercial statistical packages that might be costly and require special skills, in addition to extra time and effort for transferring data from one platform to another. Various statistical methods, including the t-test, analysis of variance, Pearson test and Mann-Whitney U test, have been reported for comparing microarray data, whereas the utilization of the Wilcoxon signed-rank test, which is an appropriate test for two-groups comparison of gene expression data, has largely been neglected in microarray studies. The aim of this investigation was to build an integrated tool, ArraySolver, for colour-coded graphical display and comparison of gene expression data using the Wilcoxon signed-rank test. The results of software validation showed similar outputs with ArraySolver and SPSS for large datasets, whereas the former program appeared to be more accurate for 25 or fewer pairs (n ≤ 25), suggesting its potential application in analysing molecular signatures that usually contain small numbers of genes. The main advantages of ArraySolver are easy data selection, convenient report format, accurate statistics and the familiar Excel platform.
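
    For small numbers of pairs, the Wilcoxon signed-rank test can be evaluated exactly by enumerating the null distribution of the positive rank sum. The sketch below is a generic illustration, not ArraySolver's implementation: it assumes no tied absolute differences, drops zero differences, and uses hypothetical expression values and function names.

```python
def signed_rank(x, y):
    """Return (W+, n): the positive rank sum and the number of non-zero differences."""
    diffs = [a - b for a, b in zip(x, y) if a != b]
    abs_sorted = sorted(abs(d) for d in diffs)
    if len(set(abs_sorted)) != len(abs_sorted):
        raise ValueError("tied absolute differences are not handled in this sketch")
    rank_of = {value: i + 1 for i, value in enumerate(abs_sorted)}
    w_plus = sum(rank_of[abs(d)] for d in diffs if d > 0)
    return w_plus, len(diffs)


def exact_two_sided_p(w_plus, n):
    """Exact two-sided p-value from the null distribution of W+ for n untied pairs."""
    max_sum = n * (n + 1) // 2
    # counts[s] = number of sign assignments whose positive ranks sum to s
    counts = [0] * (max_sum + 1)
    counts[0] = 1
    for rank in range(1, n + 1):
        for s in range(max_sum, rank - 1, -1):
            counts[s] += counts[s - rank]
    w = min(w_plus, max_sum - w_plus)      # the null distribution is symmetric
    tail = sum(counts[: w + 1]) / 2**n
    return min(1.0, 2 * tail)


# Hypothetical paired expression values for six genes under two conditions.
condition_a = [5.1, 6.3, 7.2, 8.0, 9.4, 6.6]
condition_b = [4.9, 5.8, 7.5, 6.9, 8.0, 5.7]
w_plus, n_pairs = signed_rank(condition_a, condition_b)
print(w_plus, n_pairs, exact_two_sided_p(w_plus, n_pairs))
```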

  13. ArraySolver: An Algorithm for Colour-Coded Graphical Display and Wilcoxon Signed-Rank Statistics for Comparing Microarray Gene Expression Data

    PubMed Central

    2004-01-01

    The massive surge in the production of microarray data poses a great challenge for proper analysis and interpretation. In recent years numerous computational tools have been developed to extract meaningful interpretation of microarray gene expression data. However, a convenient tool for two-groups comparison of microarray data is still lacking and users have to rely on commercial statistical packages that might be costly and require special skills, in addition to extra time and effort for transferring data from one platform to another. Various statistical methods, including the t-test, analysis of variance, Pearson test and Mann–Whitney U test, have been reported for comparing microarray data, whereas the utilization of the Wilcoxon signed-rank test, which is an appropriate test for two-groups comparison of gene expression data, has largely been neglected in microarray studies. The aim of this investigation was to build an integrated tool, ArraySolver, for colour-coded graphical display and comparison of gene expression data using the Wilcoxon signed-rank test. The results of software validation showed similar outputs with ArraySolver and SPSS for large datasets, whereas the former program appeared to be more accurate for 25 or fewer pairs (n ≤ 25), suggesting its potential application in analysing molecular signatures that usually contain small numbers of genes. The main advantages of ArraySolver are easy data selection, convenient report format, accurate statistics and the familiar Excel platform. PMID:18629036

  14. Separate-channel analysis of two-channel microarrays: recovering inter-spot information.

    PubMed

    Smyth, Gordon K; Altman, Naomi S

    2013-05-26

    Two-channel (or two-color) microarrays are cost-effective platforms for comparative analysis of gene expression. They are traditionally analysed in terms of the log-ratios (M-values) of the two channel intensities at each spot, but this analysis does not use all the information available in the separate channel observations. Mixed models have been proposed to analyse intensities from the two channels as separate observations, but such models can be complex to use and the gain in efficiency over the log-ratio analysis is difficult to quantify. Mixed models yield test statistics for which the null distributions can be specified only approximately, and some approaches do not borrow strength between genes. This article reformulates the mixed model to clarify the relationship with the traditional log-ratio analysis, to facilitate information borrowing between genes, and to obtain an exact distributional theory for the resulting test statistics. The mixed model is transformed to operate on the M-values and A-values (average log-expression for each spot) instead of on the log-expression values. The log-ratio analysis is shown to ignore information contained in the A-values. The relative efficiency of the log-ratio analysis is shown to depend on the size of the intraspot correlation. A new separate channel analysis method is proposed that assumes a constant intra-spot correlation coefficient across all genes. This approach permits the mixed model to be transformed into an ordinary linear model, allowing the data analysis to use a well-understood empirical Bayes analysis pipeline for linear modeling of microarray data. This yields statistically powerful test statistics that have an exact distributional theory. The log-ratio, mixed model and common correlation methods are compared using three case studies. The results show that separate channel analyses that borrow strength between genes are more powerful than log-ratio analyses. 
The common correlation analysis is the most powerful of all. The common correlation method proposed in this article for separate-channel analysis of two-channel microarray data is no more difficult to apply in practice than the traditional log-ratio analysis. It provides an intuitive and powerful means to conduct analyses and make comparisons that might otherwise not be possible.
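    The M-values and A-values mentioned in this abstract can be illustrated with a minimal Python sketch. It assumes background-corrected, strictly positive red and green intensities and takes the log-ratio as red over green; the function name, the channel orientation and the example intensities are illustrative assumptions, not taken from the article.

```python
import math


def ma_values(red, green):
    """Return (M, A) for one spot.

    M is the log2 ratio of the two channel intensities and A is the
    average log2 intensity of the spot. The red-over-green orientation
    is an assumption made for this sketch.
    """
    log_r = math.log2(red)
    log_g = math.log2(green)
    m_value = log_r - log_g          # log-ratio (M-value)
    a_value = (log_r + log_g) / 2.0  # average log-expression (A-value)
    return m_value, a_value


# Hypothetical (red, green) intensities for three spots.
spots = [(1024.0, 256.0), (512.0, 512.0), (64.0, 256.0)]
ma = [ma_values(r, g) for r, g in spots]
```

    A log-ratio analysis would model only the first element of each pair; the separate channel analysis described above also uses the second.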

  15. Reproducibility-optimized test statistic for ranking genes in microarray studies.

    PubMed

    Elo, Laura L; Filén, Sanna; Lahesmaa, Riitta; Aittokallio, Tero

    2008-01-01

    A principal goal of microarray studies is to identify the genes showing differential expression under distinct conditions. In such studies, the selection of an optimal test statistic is a crucial challenge, which depends on the type and amount of data under analysis. While previous studies on simulated or spike-in datasets do not provide practical guidance on how to choose the best method for a given real dataset, we introduce an enhanced reproducibility-optimization procedure, which enables the selection of a suitable gene-ranking statistic directly from the data. In comparison with existing ranking methods, the reproducibility-optimized statistic shows good performance consistently under various simulated conditions and on Affymetrix spike-in dataset. Further, the feasibility of the novel statistic is confirmed in a practical research setting using data from an in-house cDNA microarray study of asthma-related gene expression changes. These results suggest that the procedure facilitates the selection of an appropriate test statistic for a given dataset without relying on a priori assumptions, which may bias the findings and their interpretation. Moreover, the general reproducibility-optimization procedure is not limited to detecting differential expression only but could be extended to a wide range of other applications as well.

  16. Vascular endothelial growth factor (VEGF) expression in locally advanced prostate cancer: secondary analysis of radiation therapy oncology group (RTOG) 8610.

    PubMed

    Pan, Larry; Baek, Seunghee; Edmonds, Pamela R; Roach, Mack; Wolkov, Harvey; Shah, Satish; Pollack, Alan; Hammond, M Elizabeth; Dicker, Adam P

    2013-04-25

    Angiogenesis is a key element in solid-tumor growth, invasion, and metastasis. VEGF is among the most potent angiogenic factors thus far detected. The aim of the present study is to explore the potential of VEGF (also known as VEGF-A) as a prognostic and predictive biomarker among men with locally advanced prostate cancer. The analysis was performed using patients enrolled on RTOG 8610, a phase III randomized control trial of radiation therapy alone (Arm 1) versus short-term neoadjuvant and concurrent androgen deprivation and radiation therapy (Arm 2) in men with locally advanced prostate carcinoma. Tissue samples were obtained from the RTOG tissue repository. Hematoxylin and eosin slides were reviewed, and paraffin blocks were immunohistochemically stained for VEGF expression and graded by Intensity score (0-3). Cox or Fine and Gray's proportional hazards models were used. Sufficient pathologic material was available from 103 (23%) of the 456 analyzable patients enrolled in the RTOG 8610 study. There were no statistically significant differences in the pre-treatment characteristics between the patient groups with and without VEGF intensity data. Median follow-up for all surviving patients with VEGF intensity data is 12.2 years. Univariate and multivariate analyses demonstrated no statistically significant correlation between the intensity of VEGF expression and overall survival, distant metastasis, local progression, disease-free survival, or biochemical failure. VEGF expression was also not statistically significantly associated with any of the endpoints when analyzed by treatment arm. This study revealed no statistically significant prognostic or predictive value of VEGF expression for locally advanced prostate cancer. This analysis is among the largest sample bases with long-term follow-up in a well-characterized patient population. There is an urgent need to establish multidisciplinary initiatives for coordinating further research in the area of human prostate cancer biomarkers.

  17. DR-Integrator: a new analytic tool for integrating DNA copy number and gene expression data.

    PubMed

    Salari, Keyan; Tibshirani, Robert; Pollack, Jonathan R

    2010-02-01

    DNA copy number alterations (CNA) frequently underlie gene expression changes by increasing or decreasing gene dosage. However, only a subset of genes with altered dosage exhibit concordant changes in gene expression. This subset is likely to be enriched for oncogenes and tumor suppressor genes, and can be identified by integrating these two layers of genome-scale data. We introduce DNA/RNA-Integrator (DR-Integrator), a statistical software tool to perform integrative analyses on paired DNA copy number and gene expression data. DR-Integrator identifies genes with significant correlations between DNA copy number and gene expression, and implements a supervised analysis that captures genes with significant alterations in both DNA copy number and gene expression between two sample classes. DR-Integrator is freely available for non-commercial use from the Pollack Lab at http://pollacklab.stanford.edu/ and can be downloaded as a plug-in application to Microsoft Excel and as a package for the R statistical computing environment. The R package is available under the name 'DRI' at http://cran.r-project.org/. An example analysis using DR-Integrator is included as supplemental material. Supplementary data are available at Bioinformatics online.
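    The idea of flagging genes whose expression tracks their DNA copy number can be sketched as a per-gene Pearson correlation across paired samples. This is a minimal illustration only: it is not the DR-Integrator implementation, it omits any significance testing, and the gene names and values are invented for the example.

```python
import math


def pearson(x, y):
    """Pearson correlation of two equal-length numeric sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0  # no variation in one variable: treat as uncorrelated
    return sxy / math.sqrt(sxx * syy)


# Hypothetical paired measurements: one value per sample, same sample order.
copy_number = {
    "GENE_A": [2.0, 3.0, 4.0, 5.0],
    "GENE_B": [2.0, 2.0, 4.0, 4.0],
}
expression = {
    "GENE_A": [1.0, 2.0, 3.0, 4.0],
    "GENE_B": [5.0, 1.0, 4.0, 2.0],
}

# Correlation between copy number and expression for each gene.
correlations = {
    gene: pearson(copy_number[gene], expression[gene]) for gene in copy_number
}
```

    In this toy input, GENE_A shows expression that rises with copy number while GENE_B does not; a real analysis would also need a permutation or similar test to judge significance.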

  18. Using Genome-Wide Expression Profiling to Define Gene Networks Relevant to the Study of Complex Traits: From RNA Integrity to Network Topology

    PubMed Central

    O'Brien, M.A.; Costin, B.N.; Miles, M.F.

    2014-01-01

    Postgenomic studies of the function of genes and their role in disease have now become an area of intense study since efforts to define the raw sequence material of the genome have largely been completed. The use of whole-genome approaches such as microarray expression profiling and, more recently, RNA-sequence analysis of transcript abundance has allowed an unprecedented look at the workings of the genome. However, the accurate derivation of such high-throughput data and their analysis in terms of biological function has been critical to truly leveraging the postgenomic revolution. This chapter will describe an approach that focuses on the use of gene networks to both organize and interpret genomic expression data. Such networks, derived from statistical analysis of large genomic datasets and the application of multiple bioinformatics data resources, potentially allow the identification of key control elements for networks associated with human disease, and thus may lead to derivation of novel therapeutic approaches. However, as discussed in this chapter, the leveraging of such networks cannot occur without a thorough understanding of the technical and statistical factors influencing the derivation of genomic expression data. Thus, while the catch phrase may be “it's the network … stupid,” the understanding of factors extending from RNA isolation to genomic profiling technique, multivariate statistics, and bioinformatics are all critical to defining fully useful gene networks for study of complex biology. PMID:23195313

  19. Prognostic value of stromal decorin expression in patients with breast cancer: a meta-analysis.

    PubMed

    Li, Shuang-Jiang; Chen, Da-Li; Zhang, Wen-Biao; Shen, Cheng; Che, Guo-Wei

    2015-11-01

    Numerous studies have investigated the biological functions of decorin (DCN) in oncogenesis, tumor progression, angiogenesis and metastasis. Although many of them aim to highlight the prognostic value of stromal DCN expression in breast cancer, some controversial results still exist and a consensus has not yet been reached. Therefore, our meta-analysis aims to determine the prognostic significance of stromal DCN expression in breast cancer patients. PubMed, EMBASE, the Web of Science and China National Knowledge Infrastructure (CNKI) databases were searched for full-text literature that met our inclusion criteria. We used the hazard ratio (HR) with 95% confidence interval (CI) as the summary statistic. The Q-test and I² statistic were employed to estimate the level of heterogeneity across the included studies. Sensitivity analysis was conducted to further identify the possible origins of heterogeneity. Publication bias was assessed by Begg's test and Egger's test. Three English-language publications (involving 6 studies) were included in our meta-analysis. On the one hand, the summarized outcomes based on both univariate analysis (HR: 0.513; 95% CI: 0.406-0.648; P<0.001) and multivariate analysis (HR: 0.544; 95% CI: 0.388-0.763; P<0.001) indicated that stromal DCN expression predicted high cancer-specific survival (CSS) in breast cancer patients. On the other hand, the summarized outcomes based on both univariate analysis (HR: 0.504; 95% CI: 0.389-0.651; P<0.001) and multivariate analysis (HR: 0.568; 95% CI: 0.400-0.806; P=0.002) also indicated that stromal DCN expression was positively associated with high disease-free survival (DFS) in breast cancer patients. No significant heterogeneity or publication bias was observed within this meta-analysis. The present evidence indicates that high stromal DCN expression significantly predicts a good prognosis in patients with breast cancer. These findings should be confirmed in an updated review pooling more relevant investigations in the future.
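    The pooling and heterogeneity statistics named in this abstract (a summary HR with 95% CI, Cochran's Q and I²) can be sketched with a standard fixed-effect, inverse-variance calculation. This is an illustrative Python sketch, not the authors' procedure: the choice of a fixed-effect model, the function name and the per-study inputs are assumptions, and the inputs are invented rather than taken from the included studies.

```python
import math

Z_95 = 1.959963984540054  # standard normal quantile for a two-sided 95% CI


def pool_hazard_ratios(studies):
    """Fixed-effect pooling of hazard ratios.

    `studies` is a list of (hr, ci_low, ci_high) tuples with 95% CIs.
    Standard errors of log(HR) are recovered from the CI width.
    """
    log_hr = [math.log(hr) for hr, _, _ in studies]
    se = [(math.log(hi) - math.log(lo)) / (2 * Z_95) for _, lo, hi in studies]
    weights = [1.0 / s ** 2 for s in se]

    total_w = sum(weights)
    pooled = sum(w * y for w, y in zip(weights, log_hr)) / total_w
    pooled_se = math.sqrt(1.0 / total_w)

    # Cochran's Q and the I² statistic (percentage of variation due to heterogeneity).
    q = sum(w * (y - pooled) ** 2 for w, y in zip(weights, log_hr))
    df = len(studies) - 1
    i2 = max(0.0, (q - df) / q) * 100.0 if q > 0 else 0.0

    return {
        "hr": math.exp(pooled),
        "ci_low": math.exp(pooled - Z_95 * pooled_se),
        "ci_high": math.exp(pooled + Z_95 * pooled_se),
        "q": q,
        "i2": i2,
    }


# Invented per-study hazard ratios, for illustration only.
example = pool_hazard_ratios([(0.45, 0.30, 0.68), (0.60, 0.40, 0.90), (0.52, 0.33, 0.82)])
```

    When heterogeneity is substantial, a random-effects model is the usual alternative; the abstract reports that no significant heterogeneity was observed.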

  20. Using the gene ontology for microarray data mining: a comparison of methods and application to age effects in human prefrontal cortex.

    PubMed

    Pavlidis, Paul; Qin, Jie; Arango, Victoria; Mann, John J; Sibille, Etienne

    2004-06-01

    One of the challenges in the analysis of gene expression data is placing the results in the context of other data available about genes and their relationships to each other. Here, we approach this problem in the study of gene expression changes associated with age in two areas of the human prefrontal cortex, comparing two computational methods. The first method, "overrepresentation analysis" (ORA), is based on statistically evaluating the fraction of genes in a particular gene ontology class found among the set of genes showing age-related changes in expression. The second method, "functional class scoring" (FCS), examines the statistical distribution of individual gene scores among all genes in the gene ontology class and does not involve an initial gene selection step. We find that FCS yields more consistent results than ORA, and the results of ORA depended strongly on the gene selection threshold. Our findings highlight the utility of functional class scoring for the analysis of complex expression data sets and emphasize the advantage of considering all available genomic information rather than sets of genes that pass a predetermined "threshold of significance."
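    Overrepresentation analysis as described here asks whether a gene ontology class contributes more genes to the selected list than expected by chance. A common way to score this is the upper tail of a hypergeometric distribution; the Python sketch below assumes that choice of test, which the abstract does not specify, and uses invented counts.

```python
from math import comb


def ora_p_value(n_genes, n_in_class, n_selected, n_overlap):
    """Upper-tail hypergeometric p-value for overrepresentation.

    n_genes    -- total genes assayed
    n_in_class -- genes annotated to the class
    n_selected -- genes passing the selection threshold
    n_overlap  -- selected genes that are in the class
    Returns P(overlap >= n_overlap) under random selection.
    """
    total = comb(n_genes, n_selected)
    upper = min(n_in_class, n_selected)
    tail = sum(
        comb(n_in_class, k) * comb(n_genes - n_in_class, n_selected - k)
        for k in range(n_overlap, upper + 1)
    )
    return tail / total


# Invented counts: 1000 genes assayed, 40 in the class, 100 selected, 12 in both.
p_example = ora_p_value(1000, 40, 100, 12)
```

    Because `n_selected` depends on the chosen significance cutoff, the resulting p-value changes with that cutoff, which is the threshold sensitivity of ORA that the abstract reports.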

  1. Contributions to Statistical Problems Related to Microarray Data

    ERIC Educational Resources Information Center

    Hong, Feng

    2009-01-01

    Microarrays are a high-throughput technology for measuring gene expression. Analysis of microarray data brings many interesting and challenging problems. This thesis consists of three studies related to microarray data. First, we propose a Bayesian model for microarray data and use Bayes Factors to identify differentially expressed genes. Second, we…

  2. Senior Computational Scientist | Center for Cancer Research

    Cancer.gov

    The Basic Science Program (BSP) pursues independent, multidisciplinary research in basic and applied molecular biology, immunology, retrovirology, cancer biology, and human genetics. Research efforts and support are an integral part of the Center for Cancer Research (CCR) at the Frederick National Laboratory for Cancer Research (FNLCR). The Cancer & Inflammation Program (CIP), Basic Science Program, HLA Immunogenetics Section, under the leadership of Dr. Mary Carrington, studies the influence of human leukocyte antigens (HLA) and specific KIR/HLA genotypes on risk of and outcomes to infection, cancer, autoimmune disease, and maternal-fetal disease. Recent studies have focused on the impact of HLA gene expression in disease, the molecular mechanism regulating expression levels, and the functional basis for the effect of differential expression on disease outcome. The lab’s further focus is on the genetic basis for resistance/susceptibility to disease conferred by immunogenetic variation. KEY ROLES/RESPONSIBILITIES: The Senior Computational Scientist will provide research support to the CIP-BSP-HLA Immunogenetics Section performing bio-statistical design, analysis and reporting of research projects conducted in the lab. This individual will be involved in the implementation of statistical models and data preparation. Successful candidate should have 5 or more years of competent, innovative biostatistics/bioinformatics research experience, beyond doctoral training. Considerable experience with statistical software, such as SAS, R and S-Plus. Sound knowledge, and demonstrated experience of theoretical and applied statistics. Write program code to analyze data using statistical analysis software. Contribute to the interpretation and publication of research results.

  3. From reads to genes to pathways: differential expression analysis of RNA-Seq experiments using Rsubread and the edgeR quasi-likelihood pipeline.

    PubMed

    Chen, Yunshun; Lun, Aaron T L; Smyth, Gordon K

    2016-01-01

    In recent years, RNA sequencing (RNA-seq) has become a very widely used technology for profiling gene expression. One of the most common aims of RNA-seq profiling is to identify genes or molecular pathways that are differentially expressed (DE) between two or more biological conditions. This article demonstrates a computational workflow for the detection of DE genes and pathways from RNA-seq data by providing a complete analysis of an RNA-seq experiment profiling epithelial cell subsets in the mouse mammary gland. The workflow uses R software packages from the open-source Bioconductor project and covers all steps of the analysis pipeline, including alignment of read sequences, data exploration, differential expression analysis, visualization and pathway analysis. Read alignment and count quantification are conducted using the Rsubread package and the statistical analyses are performed using the edgeR package. The differential expression analysis uses the quasi-likelihood functionality of edgeR.

  4. A κ-generalized statistical mechanics approach to income analysis

    NASA Astrophysics Data System (ADS)

    Clementi, F.; Gallegati, M.; Kaniadakis, G.

    2009-02-01

    This paper proposes a statistical mechanics approach to the analysis of income distribution and inequality. A new distribution function, having its roots in the framework of κ-generalized statistics, is derived that is particularly suitable for describing the whole spectrum of incomes, from the low-middle income region up to the high income Pareto power-law regime. Analytical expressions for the shape, moments and some other basic statistical properties are given. Furthermore, several well-known econometric tools for measuring inequality, which all exist in a closed form, are considered. A method for parameter estimation is also discussed. The model is shown to fit remarkably well the data on personal income for the United States, and the analysis of inequality performed in terms of its parameters is revealed as very powerful.

  5. Detecting discordance enrichment among a series of two-sample genome-wide expression data sets.

    PubMed

    Lai, Yinglei; Zhang, Fanni; Nayak, Tapan K; Modarres, Reza; Lee, Norman H; McCaffrey, Timothy A

    2017-01-25

    With the current microarray and RNA-seq technologies, two-sample genome-wide expression data have been widely collected in biological and medical studies. The related differential expression analysis and gene set enrichment analysis have been frequently conducted. Integrative analysis can be conducted when multiple data sets are available. In practice, discordant molecular behaviors among a series of data sets can be of biological and clinical interest. In this study, a statistical method is proposed for detecting discordance gene set enrichment. Our method is based on a two-level multivariate normal mixture model. It is statistically efficient with linearly increased parameter space when the number of data sets is increased. The model-based probability of discordance enrichment can be calculated for gene set detection. We apply our method to a microarray expression data set collected from forty-five matched tumor/non-tumor pairs of tissues for studying pancreatic cancer. We divided the data set into a series of non-overlapping subsets according to the tumor/non-tumor paired expression ratio of gene PNLIP (pancreatic lipase, recently shown to be associated with pancreatic cancer). The log-ratio ranges from a negative value (e.g. more expressed in non-tumor tissue) to a positive value (e.g. more expressed in tumor tissue). Our purpose is to understand whether any gene sets are enriched in discordant behaviors among these subsets (when the log-ratio is increased from negative to positive). We focus on KEGG pathways. The detected pathways will be useful for our further understanding of the role of gene PNLIP in pancreatic cancer research. Among the top list of detected pathways, the neuroactive ligand receptor interaction and olfactory transduction pathways are the two most significant. Then, we consider gene TP53, which is well known for its role as a tumor suppressor in cancer research. The log-ratio also ranges from a negative value (e.g. more expressed in non-tumor tissue) to a positive value (e.g. more expressed in tumor tissue). We divided the microarray data set again according to the expression ratio of gene TP53. After the discordance enrichment analysis, we observed overall similar results and the above two pathways are still the most significant detections. More interestingly, only these two pathways have been identified for their association with pancreatic cancer in a pathway analysis of genome-wide association study (GWAS) data. This study illustrates that some disease-related pathways can be enriched in discordant molecular behaviors when an important disease-related gene changes its expression. Our proposed statistical method is useful in the detection of these pathways. Furthermore, our method can also be applied to genome-wide expression data collected by the recent RNA-seq technology.

  6. Analyzing Large Gene Expression and Methylation Data Profiles Using StatBicRM: Statistical Biclustering-Based Rule Mining

    PubMed Central

    Maulik, Ujjwal; Mallik, Saurav; Mukhopadhyay, Anirban; Bandyopadhyay, Sanghamitra

    2015-01-01

    Microarray and beadchip are two of the most efficient techniques for measuring gene expression and methylation data in bioinformatics. Biclustering deals with the simultaneous clustering of genes and samples. In this article, we propose a computational rule mining framework, StatBicRM (i.e., statistical biclustering-based rule mining) to identify special types of rules and potential biomarkers using integrated approaches of statistical and binary inclusion-maximal biclustering techniques from the biological datasets. At first, a novel statistical strategy has been utilized to eliminate the insignificant/low-significant/redundant genes in such a way that the significance level must satisfy the data distribution property (viz., either normal distribution or non-normal distribution). The data is then discretized and post-discretized, consecutively. Thereafter, the biclustering technique is applied to identify maximal frequent closed homogeneous itemsets. Corresponding special types of rules are then extracted from the selected itemsets. Our proposed rule mining method performs better than the other rule mining algorithms as it generates maximal frequent closed homogeneous itemsets instead of frequent itemsets. Thus, it saves elapsed time, and can work on big datasets. Pathway and Gene Ontology analyses are conducted on the genes of the evolved rules using the DAVID database. Frequency analysis of the genes appearing in the evolved rules is performed to determine potential biomarkers. Furthermore, we also classify the data to know how accurately the evolved rules are able to describe the remaining test (unknown) data. Subsequently, we also compare the average classification accuracy, and other related factors with other rule-based classifiers. Statistical significance tests are also performed for verifying the statistical relevance of the comparative results. Here, each of the other rule mining methods or rule-based classifiers also starts with the same post-discretized data-matrix. Finally, we have also included the integrated analysis of gene expression and methylation for determining epigenetic effect (viz., effect of methylation) on gene expression level. PMID:25830807

  7. Analyzing large gene expression and methylation data profiles using StatBicRM: statistical biclustering-based rule mining.

    PubMed

    Maulik, Ujjwal; Mallik, Saurav; Mukhopadhyay, Anirban; Bandyopadhyay, Sanghamitra

    2015-01-01

    Microarray and beadchip are two of the most efficient techniques for measuring gene expression and methylation data in bioinformatics. Biclustering deals with the simultaneous clustering of genes and samples. In this article, we propose a computational rule mining framework, StatBicRM (i.e., statistical biclustering-based rule mining) to identify special types of rules and potential biomarkers using integrated approaches of statistical and binary inclusion-maximal biclustering techniques from the biological datasets. At first, a novel statistical strategy has been utilized to eliminate the insignificant/low-significant/redundant genes in such a way that the significance level must satisfy the data distribution property (viz., either normal distribution or non-normal distribution). The data is then discretized and post-discretized, consecutively. Thereafter, the biclustering technique is applied to identify maximal frequent closed homogeneous itemsets. Corresponding special types of rules are then extracted from the selected itemsets. Our proposed rule mining method performs better than the other rule mining algorithms as it generates maximal frequent closed homogeneous itemsets instead of frequent itemsets. Thus, it saves elapsed time, and can work on big datasets. Pathway and Gene Ontology analyses are conducted on the genes of the evolved rules using the DAVID database. Frequency analysis of the genes appearing in the evolved rules is performed to determine potential biomarkers. Furthermore, we also classify the data to know how accurately the evolved rules are able to describe the remaining test (unknown) data. Subsequently, we also compare the average classification accuracy, and other related factors with other rule-based classifiers. Statistical significance tests are also performed for verifying the statistical relevance of the comparative results. Here, each of the other rule mining methods or rule-based classifiers also starts with the same post-discretized data-matrix. Finally, we have also included the integrated analysis of gene expression and methylation for determining epigenetic effect (viz., effect of methylation) on gene expression level.

  8. Bayesian test for colocalisation between pairs of genetic association studies using summary statistics.

    PubMed

    Giambartolomei, Claudia; Vukcevic, Damjan; Schadt, Eric E; Franke, Lude; Hingorani, Aroon D; Wallace, Chris; Plagnol, Vincent

    2014-05-01

    Genetic association studies, in particular the genome-wide association study (GWAS) design, have provided a wealth of novel insights into the aetiology of a wide range of human diseases and traits, in particular cardiovascular diseases and lipid biomarkers. The next challenge consists of understanding the molecular basis of these associations. The integration of multiple association datasets, including gene expression datasets, can contribute to this goal. We have developed a novel statistical methodology to assess whether two association signals are consistent with a shared causal variant. An application is the integration of disease scans with expression quantitative trait locus (eQTL) studies, but any pair of GWAS datasets can be integrated in this framework. We demonstrate the value of the approach by re-analysing a gene expression dataset in 966 liver samples with a published meta-analysis of lipid traits including >100,000 individuals of European ancestry. Combining all lipid biomarkers, our re-analysis supported 26 out of 38 reported colocalisation results with eQTLs and identified 14 new colocalisation results, hence highlighting the value of a formal statistical test. In three cases of reported eQTL-lipid pairs (SYPL2, IFT172, TBKBP1) for which our analysis suggests that the eQTL pattern is not consistent with the lipid association, we identify alternative colocalisation results with SORT1, GCKR, and KPNB1, indicating that these genes are more likely to be causal in these genomic intervals. A key feature of the method is the ability to derive the output statistics from single SNP summary statistics, hence making it possible to perform systematic meta-analysis type comparisons across multiple GWAS datasets (implemented online at http://coloc.cs.ucl.ac.uk/coloc/). Our methodology provides information about candidate causal genes in associated intervals and has direct implications for the understanding of complex diseases as well as the design of drugs to target disease pathways.

  9. Statistical Analysis of Big Data on Pharmacogenomics

    PubMed Central

    Fan, Jianqing; Liu, Han

    2013-01-01

    This paper discusses statistical methods for estimating complex correlation structure from large pharmacogenomic datasets. We selectively review several prominent statistical methods for estimating large covariance matrix for understanding correlation structure, inverse covariance matrix for network modeling, large-scale simultaneous tests for selecting significantly differently expressed genes and proteins and genetic markers for complex diseases, and high dimensional variable selection for identifying important molecules for understanding molecule mechanisms in pharmacogenomics. Their applications to gene network estimation and biomarker selection are used to illustrate the methodological power. Several new challenges of Big data analysis, including complex data distribution, missing data, measurement error, spurious correlation, endogeneity, and the need for robust statistical methods, are also discussed. PMID:23602905

  10. Differential Expression and Clinical Significance of DNA Methyltransferase 3B (DNMT3B), Phosphatase and Tensin Homolog (PTEN) and Human MutL Homologs 1 (hMLH1) in Endometrial Carcinomas.

    PubMed

    Li, Wenting; Wang, Ying; Fang, Xinzhi; Zhou, Mei; Li, Yiqun; Dong, Ying; Wang, Ruozheng

    2017-02-21

    BACKGROUND The aim of this study was to investigate the expression and the clinicopathologic significance of DNA methyltransferase 3B (DNMT3B), phosphatase and tensin homolog (PTEN) and human MutL homologs 1 (hMLH1) in endometrial carcinomas between Han and Uygur women in Xinjiang. MATERIAL AND METHODS The expression of DNMT3B, PTEN, and hMLH1 in endometrial carcinomas was assessed by immunohistochemistry, followed by an analysis of their relationship to clinical-pathological features and prognosis. RESULTS There was 61.7% (95/154) overexpression of DNMT3B, 50.0% (77/154) loss of PTEN expression and 18.2% (28/154) loss of hMLH1 expression. The expression of DNMT3B and PTEN in endometrial carcinomas was statistically significantly different between Uygur women and Han women (p=0.001, p=0.010, respectively). DNMT3B expression was statistically significant based on the grade of endometrial carcinomas (p=0.031). PTEN loss was statistically significant between endometrioid carcinomas (ECs) and non-endometrioid carcinomas (NECs) (p=0.040). DNMT3B expression was statistically significant in different myometrial invasion groups in Uygur women (p=0.010). Furthermore, the correlation of DNMT3B and PTEN expression was significant in endometrial carcinomas (p=0.021). PTEN expression was statistically significant in the overall survival (OS) rate of women with endometrial cancers (p=0.041). CONCLUSIONS Our findings suggest that PTEN and DNMT3B possess common regulation features as well as certain ethnic differences in expression between Han women and Uygur women. An interaction may exist in the pathogenesis of endometrial carcinoma. DNMT3B was expressed differently in cases of myometrial invasion and PTEN was associated with OS, which suggested that these molecular markers may be useful in the evaluation of the biological behavior of endometrial carcinomas and may be useful indicators of prognosis in women with endometrial carcinomas.

  11. Differential Expression and Clinical Significance of DNA Methyltransferase 3B (DNMT3B), Phosphatase and Tensin Homolog (PTEN) and Human MutL Homologs 1 (hMLH1) in Endometrial Carcinomas

    PubMed Central

    Li, Wenting; Wang, Ying; Fang, Xinzhi; Zhou, Mei; Li, Yiqun; Dong, Ying; Wang, Ruozheng

    2017-01-01

    Background The aim of this study was to investigate the expression and the clinicopathologic significance of DNA methyltransferase 3B (DNMT3B), phosphatase and tensin homolog (PTEN) and human MutL homologs 1 (hMLH1) in endometrial carcinomas between Han and Uygur women in Xinjiang. Material/Methods The expression of DNMT3B, PTEN, and hMLH1 in endometrial carcinomas was assessed by immunohistochemistry, followed by an analysis of their relationship to clinical-pathological features and prognosis. Results There was 61.7% (95/154) overexpression of DNMT3B, 50.0% (77/154) loss of PTEN expression and 18.2% (28/154) loss of hMLH1 expression. The expression of DNMT3B and PTEN in endometrial carcinomas was statistically significantly different between Uygur women and Han women (p=0.001, p=0.010, respectively). DNMT3B expression was statistically significant based on the grade of endometrial carcinomas (p=0.031). PTEN loss was statistically significant between endometrioid carcinomas (ECs) and non-endometrioid carcinomas (NECs) (p=0.040). DNMT3B expression was statistically significant in different myometrial invasion groups in Uygur women (p=0.010). Furthermore, the correlation of DNMT3B and PTEN expression was significant in endometrial carcinomas (p=0.021). PTEN expression was statistically significant in the overall survival (OS) rate of women with endometrial cancers (p=0.041). Conclusions Our findings suggest that PTEN and DNMT3B possess common regulation features as well as certain ethnic differences in expression between Han women and Uygur women. An interaction may exist in the pathogenesis of endometrial carcinoma. DNMT3B was expressed differently in cases of myometrial invasion and PTEN was associated with OS, which suggested that these molecular markers may be useful in the evaluation of the biological behavior of endometrial carcinomas and may be useful indicators of prognosis in women with endometrial carcinomas. PMID:28220037

  12. Proof of Concept Study to Assess Fetal Gene Expression in Amniotic Fluid by NanoArray PCR

    PubMed Central

    Massingham, Lauren J.; Johnson, Kirby L.; Bianchi, Diana W.; Pei, Shermin; Peter, Inga; Cowan, Janet M.; Tantravahi, Umadevi; Morrison, Tom B.

    2011-01-01

    Microarray analysis of cell-free RNA in amniotic fluid (AF) supernatant has revealed differential fetal gene expression as a function of gestational age and karyotype. Once informative genes are identified, research moves to a more focused platform such as quantitative reverse transcriptase-PCR. Standardized NanoArray PCR (SNAP) is a recently developed gene profiling technology that enables the measurement of transcripts from samples containing reduced quantities or degraded nucleic acids. We used a previously developed SNAP gene panel as proof of concept to determine whether fetal functional gene expression could be ascertained from AF supernatant. RNA was extracted and converted to cDNA from 19 AF supernatant samples of euploid fetuses between 15 to 20 weeks of gestation, and transcript abundance of 21 genes was measured. Statistically significant differences in expression, as a function of advancing gestational age, were observed for 5 of 21 genes. ANXA5, GUSB, and PPIA showed decreasing gene expression over time, whereas CASC3 and ZNF264 showed increasing gene expression over time. Statistically significantly increased expression of MTOR and STAT2 was seen in female compared with male fetuses. This study demonstrates the feasibility of focused fetal gene expression analysis using SNAP technology. In the future, this technique could be optimized to examine specific genes instrumental in fetal organ system function, which could be a useful addition to prenatal care. PMID:21827969

  13. Fas expression in renal cell carcinoma accurately predicts patient survival after radical nephrectomy.

    PubMed

    Sejima, Takehiro; Morizane, Shuichi; Hinata, Nobuyuki; Yao, Akihisa; Isoyama, Tadahiro; Saito, Motoaki; Takenaka, Atsushi

    2012-01-01

    To investigate Fas, Fas ligand (FasL) and Bcl-2 expression, which are considered to be important apoptotic regulatory factors in renal cell carcinomas (RCCs). mRNA quantification and immunohistochemistry allowed for the determination of the expression of these three factors in surgically resected tumors from 82 patients with RCC. The correlation of protein and gene expression with more than 10 years of survival data following nephrectomy (along with clinical and pathologic parameters) was analyzed using uni- and multivariate statistical models. A significantly poorer outcome was observed in patients with tumors expressing high levels of Fas mRNA in the multivariate analysis (p = 0.0002). In addition, patient survival was significantly worse in FasL mRNA-positive tumor cases when compared with FasL mRNA-negative cases (p = 0.0345). Ten cases relapsed more than 5 years after nephrectomy. Among them, the tumors of 8 cases (80%) did not express FasL mRNA. Analysis of Bcl-2 did not show statistical significance of Bcl-2 expression as a prognostic indicator. The data suggest that pronounced Fas expression is a surrogate biomarker of active cancer cell proliferation. Given the FasL tumor counterattack theory, FasL overexpression in RCC may be one of the host immune deficiencies, consequently leading to poor prognosis. Copyright © 2012 S. Karger AG, Basel.

  14. Implementation and evaluation of an efficient secure computation system using ‘R’ for healthcare statistics

    PubMed Central

    Chida, Koji; Morohashi, Gembu; Fuji, Hitoshi; Magata, Fumihiko; Fujimura, Akiko; Hamada, Koki; Ikarashi, Dai; Yamamoto, Ryuichi

    2014-01-01

    Background and objective While the secondary use of medical data has gained attention, its adoption has been constrained due to protection of patient privacy. Making medical data secure by de-identification can be problematic, especially when the data concerns rare diseases. We require rigorous security management measures. Materials and methods Using secure computation, an approach from cryptography, our system can compute various statistics over encrypted medical records without decrypting them. An issue of secure computation is that the amount of processing time required is immense. We implemented a system that securely computes healthcare statistics from the statistical computing software ‘R’ by effectively combining secret-sharing-based secure computation with original computation. Results Testing confirmed that our system could correctly complete computation of average and unbiased variance of approximately 50 000 records of dummy insurance claim data in a little over a second. Computation including conditional expressions and/or comparison of values, for example, t test and median, could also be correctly completed in several tens of seconds to a few minutes. Discussion If medical records are simply encrypted, the risk of leaks exists because decryption is usually required during statistical analysis. Our system possesses high-level security because medical records remain in encrypted state even during statistical analysis. Also, our system can securely compute some basic statistics with conditional expressions using ‘R’ that works interactively while secure computation protocols generally require a significant amount of processing time. Conclusions We propose a secure statistical analysis system using ‘R’ for medical data that effectively integrates secret-sharing-based secure computation and original computation. PMID:24763677
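
    The abstract above describes computing statistics over secret-shared records without decrypting them. A minimal sketch of the general idea, using additive secret sharing to obtain a sum and a mean, might look as follows. This is an illustration only and not the authors' protocol: the modulus `PRIME`, the three-party setup, the function names, and the dummy claim amounts are all assumptions of this sketch.

```python
import secrets

# Modulus for share arithmetic (an assumption of this sketch).
PRIME = 2**61 - 1


def share(value, n_parties=3):
    """Split an integer into n additive shares that sum to value mod PRIME."""
    shares = [secrets.randbelow(PRIME) for _ in range(n_parties - 1)]
    shares.append((value - sum(shares)) % PRIME)
    return shares


def reconstruct(shares):
    """Recombine shares; any proper subset of them looks uniformly random."""
    return sum(shares) % PRIME


def secure_sum(records, n_parties=3):
    """Each party adds up the shares it holds; only the totals are combined."""
    party_totals = [0] * n_parties
    for value in records:
        for i, s in enumerate(share(value, n_parties)):
            party_totals[i] = (party_totals[i] + s) % PRIME
    return reconstruct(party_totals)


# Hypothetical dummy claim amounts, not data from the paper.
claims = [1200, 530, 980, 2210]
total = secure_sum(claims)
mean = total / len(claims)
```

    Sums are cheap under this scheme because addition can be done share by share. Operations that need comparisons, such as the median or t test mentioned in the abstract, require interactive protocols and are not covered by this sketch.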

  15. Implementation and evaluation of an efficient secure computation system using 'R' for healthcare statistics.

    PubMed

    Chida, Koji; Morohashi, Gembu; Fuji, Hitoshi; Magata, Fumihiko; Fujimura, Akiko; Hamada, Koki; Ikarashi, Dai; Yamamoto, Ryuichi

    2014-10-01

    While the secondary use of medical data has gained attention, its adoption has been constrained due to protection of patient privacy. Making medical data secure by de-identification can be problematic, especially when the data concerns rare diseases. We require rigorous security management measures. Using secure computation, an approach from cryptography, our system can compute various statistics over encrypted medical records without decrypting them. An issue of secure computation is that the amount of processing time required is immense. We implemented a system that securely computes healthcare statistics from the statistical computing software 'R' by effectively combining secret-sharing-based secure computation with original computation. Testing confirmed that our system could correctly complete computation of average and unbiased variance of approximately 50,000 records of dummy insurance claim data in a little over a second. Computation including conditional expressions and/or comparison of values, for example, t test and median, could also be correctly completed in several tens of seconds to a few minutes. If medical records are simply encrypted, the risk of leaks exists because decryption is usually required during statistical analysis. Our system possesses high-level security because medical records remain in encrypted state even during statistical analysis. Also, our system can securely compute some basic statistics with conditional expressions using 'R' that works interactively while secure computation protocols generally require a significant amount of processing time. We propose a secure statistical analysis system using 'R' for medical data that effectively integrates secret-sharing-based secure computation and original computation. Published by the BMJ Publishing Group Limited. For permission to use (where not already granted under a licence) please go to http://group.bmj.com/group/rights-licensing/permissions.

  16. Integrated Analysis of Pharmacologic, Clinical, and SNP Microarray Data using Projection onto the Most Interesting Statistical Evidence with Adaptive Permutation Testing

    PubMed Central

    Pounds, Stan; Cao, Xueyuan; Cheng, Cheng; Yang, Jun; Campana, Dario; Evans, William E.; Pui, Ching-Hon; Relling, Mary V.

    2010-01-01

    Powerful methods for integrated analysis of multiple biological data sets are needed to maximize interpretation capacity and acquire meaningful knowledge. We recently developed Projection Onto the Most Interesting Statistical Evidence (PROMISE). PROMISE is a statistical procedure that incorporates prior knowledge about the biological relationships among endpoint variables into an integrated analysis of microarray gene expression data with multiple biological and clinical endpoints. Here, PROMISE is adapted to the integrated analysis of pharmacologic, clinical, and genome-wide genotype data that incorporates knowledge about the biological relationships among pharmacologic and clinical response data. An efficient permutation-testing algorithm is introduced so that statistical calculations are computationally feasible in this higher-dimension setting. The new method is applied to a pediatric leukemia data set. The results clearly indicate that PROMISE is a powerful statistical tool for identifying genomic features that exhibit a biologically meaningful pattern of association with multiple endpoint variables. PMID:21516175
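
    The abstract does not spell out its permutation-testing algorithm. One common way to make permutation testing adaptive is sequential early stopping: permutations for a feature stop as soon as enough permuted statistics exceed the observed one, so clearly non-significant features cost little. The sketch below shows that generic idea for a two-group difference in means. The function name, the stopping threshold `stop_after`, and the toy data are assumptions, and this is not claimed to be the PROMISE procedure.

```python
import random
import statistics


def adaptive_permutation_pvalue(x, y, max_perms=2000, stop_after=20, seed=0):
    """Two-sided permutation p-value with sequential early stopping.

    Returns (p_value, permutations_used). Sampling stops once `stop_after`
    permuted statistics are at least as extreme as the observed one.
    """
    rng = random.Random(seed)
    observed = abs(statistics.fmean(x) - statistics.fmean(y))
    pooled = list(x) + list(y)
    n_x = len(x)
    exceed = 0
    for done in range(1, max_perms + 1):
        rng.shuffle(pooled)
        stat = abs(statistics.fmean(pooled[:n_x]) - statistics.fmean(pooled[n_x:]))
        if stat >= observed:
            exceed += 1
            if exceed >= stop_after:
                # Early stop: estimate is exceedances / permutations drawn.
                return exceed / done, done
    # Ran the full budget: add one to numerator and denominator.
    return (exceed + 1) / (max_perms + 1), max_perms


# Hypothetical toy data: one feature with no real group difference,
# one with a large difference.
p_null, n_null = adaptive_permutation_pvalue(
    [1, 2, 3, 4, 5, 6], [1.5, 2.5, 3.5, 4.5, 5.5, 6.5])
p_diff, n_diff = adaptive_permutation_pvalue(
    [1, 2, 3, 4, 5, 6], [11, 12, 13, 14, 15, 16])
```

    In a genome-wide setting most features resemble the first case, so the permutation budget is spent mainly on the few features that may be significant.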

  17. Gene expression analysis of rheumatoid arthritis synovial lining regions by cDNA microarray combined with laser microdissection: up-regulation of inflammation-associated STAT1, IRF1, CXCL9, CXCL10, and CCL5

    PubMed Central

    Yoshida, S; Arakawa, F; Higuchi, F; Ishibashi, Y; Goto, M; Sugita, Y; Nomura, Y; Niino, D; Shimizu, K; Aoki, R; Hashikawa, K; Kimura, Y; Yasuda, K; Tashiro, K; Kuhara, S; Nagata, K; Ohshima, K

    2012-01-01

    Objectives The main histological change in rheumatoid arthritis (RA) is the villous proliferation of synovial lining cells, an important source of cytokines and chemokines, which are associated with inflammation. The aim of this study was to evaluate gene expression in the microdissected synovial lining cells of RA patients, using those of osteoarthritis (OA) patients as the control. Methods Samples were obtained during total joint replacement from 11 RA and five OA patients. Total RNA from the synovial lining cells was derived from selected specimens by laser microdissection (LMD) for subsequent cDNA microarray analysis. In addition, the expression of significant genes was confirmed immunohistochemically. Results The 14 519 genes detected by cDNA microarray were used to compare gene expression levels in synovial lining cells from RA with those from OA patients. Cluster analysis indicated that RA cells, including low- and high-expression subgroups, and OA cells were sorted into two main clusters. The molecular activity of RA was statistically consistent with its clinical and histological activity. Expression levels of signal transducer and activator of transcription 1 (STAT1), interferon regulatory factor 1 (IRF1), and the chemokines CXCL9, CXCL10, and CCL5 were statistically significantly higher in the synovium of RA than in that of OA. Immunohistochemically, the lining synovium of RA, but not that of OA, clearly expressed STAT1, IRF1, and chemokines, as was seen in microarray analysis combined with LMD. Conclusions Our findings indicate an important role for lining synovial cells in the inflammatory and proliferative processes of RA. Further understanding of the local signalling in structural components is important in rheumatology. PMID:22401175

  18. Clinicopathological and prognostic significance of cyclooxygenase-2 expression in head and neck cancer: A meta-analysis

    PubMed Central

    Guo, Qiaojuan; Ren, Hui; Hu, Yanping; Xie, Tao

    2016-01-01

    Several studies have assessed the clinicopathological and prognostic value of cyclooxygenase-2 (COX-2) expression in patients with head and neck cancer (HNC), but their results remain controversial. To address this issue, a meta-analysis was carried out. A total of 29 studies involving 2430 patients were subjected to final analysis. Our results indicated that COX-2 expression was not statistically associated with advanced tumor stage (OR, 1.23; 95% CI, 0.98–1.55) but correlated with high risk of lymph node metastasis (OR, 1.28; 95% CI, 1.03–1.60) and advanced TNM stage (OR, 1.33; 95% CI, 1.06–1.66). Moreover, COX-2 expression had significant effect on poor OS (HR, 1.93; 95% CI, 1.29–2.90), RFS (HR, 2.02; 95% CI, 1.00–4.08) and DFS (HR, 5.14; 95% CI, 2.84–9.31). The results of subgroup analyses revealed that COX-2 expression was related with high possibility of lymph node metastasis in oral cancer (OR, 1.49; 95% CI, 1.01–2.20) and advanced TNM stage in oral cancer (OR, 1.58; 95% CI, 1.05–2.37) and no site-specific HNC (OR, 1.64; 95% CI, 1.02–2.62). However, subgroup analyses only showed a tendency without statistically significant association between COX-2 expression and survival. Significant heterogeneity was not found when analyzing clinicopathological data, but it appeared when considering survival data. No publication bias was detected in this study. This meta-analysis suggested that COX-2 expression could act as a prognostic factor for patients with HNC. PMID:27323811
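
    The pooled estimates above are reported as ratios with 95% confidence intervals. As a rough check on which intervals exclude 1, the standard error on the log scale can be recovered from the interval width and turned into an approximate z statistic and two-sided p-value. The sketch below does this for two of the reported odds ratios. It assumes Wald-type intervals that are symmetric on the log scale, and results are approximate because the published values are rounded; the function names are illustrative.

```python
import math


def z_from_ci(estimate, lower, upper, z_crit=1.96):
    """Approximate z statistic for a ratio estimate from its 95% CI."""
    se_log = (math.log(upper) - math.log(lower)) / (2 * z_crit)
    return math.log(estimate) / se_log


def two_sided_p(z):
    """Two-sided p-value under a standard normal reference distribution."""
    return math.erfc(abs(z) / math.sqrt(2))


# Values taken from the abstract above.
z_stage = z_from_ci(1.23, 0.98, 1.55)   # advanced tumor stage
z_node = z_from_ci(1.28, 1.03, 1.60)    # lymph node metastasis

p_stage = two_sided_p(z_stage)
p_node = two_sided_p(z_node)
```

    Under these assumptions the tumor-stage interval, which spans 1, gives p above 0.05, while the lymph-node interval gives p below 0.05, in line with how the abstract words the two findings.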

  19. [Effect of overdose fluoride on expression of bone sialoprotein in developing dental tissues of rats].

    PubMed

    Xu, Zhi-ling; Wang, Qiang; Liu, Tian-lin; Guo, Li-ying; Jing, Feng-qiu; Liu, Hui

    2006-04-01

    To investigate the changes of bone sialoprotein (BSP) in developing dental tissues of rats exposed to fluoride. Twenty rats were randomly divided into two groups: one received distilled water (control group), and the other received distilled water treated with fluoride (experimental group). When the fluorosis model was established, the changes in the expression of BSP were investigated and compared between the two groups. HE staining was used to observe cell morphology, and immunohistochemistry assay was used to determine the expression of BSP in rat incisor. Student's t test was used for statistical analysis. In the control group, the ameloblasts had normal morphology and were arranged in an orderly manner, and BSP immunoreactivity was present in matured ameloblasts, dentinoblasts, cementoblasts, and the matrix. In the experimental group, however, the ameloblasts were arranged in multiple layers, the enamel matrix was disorganized, and the expression of BSP was significantly lower than that of the control group. Statistical analysis showed significant differences between the two groups (P<0.01). Fluoride can inhibit the expression of BSP in developing dental tissues of rats, and then inhibit differentiation of the tooth epithelial cells and secretion of matrix. This is a probable intracellular mechanism of dental fluorosis.

  20. Cyclin d1 expression in odontogenic cysts.

    PubMed

    Taghavi, Nasim; Modabbernia, Shirin; Akbarzadeh, Alireza; Sajjadi, Samad

    2013-01-01

    In the present study, expression of cyclin D1 in the epithelial lining of odontogenic keratocyst, radicular cyst, dentigerous cyst and glandular odontogenic cyst was investigated to compare proliferative activity in these lesions. Immunohistochemical staining of cyclin D1 on formalin-fixed, paraffin-embedded tissue sections of odontogenic keratocysts (n=23), dentigerous cysts (n=20), radicular cysts (n=20) and glandular odontogenic cysts (n=5) was performed by the standard EnVision method. Slides were then studied to evaluate the following parameters in the epithelial lining of cysts: expression, expression pattern, staining intensity and localization of expression. The data analysis showed a statistically significant difference in cyclin D1 expression among the studied groups (p < 0.001). Assessment of staining intensity and staining pattern showed stronger intensity and a focal pattern in odontogenic keratocysts, but the differences among groups were not statistically significant (p=0.204 and p=0.469, respectively). Considering expression localization, cyclin D1 positive cells in odontogenic keratocysts and dentigerous cysts were frequently confined to the parabasal layer, unlike in radicular cysts and glandular odontogenic cysts. The difference was statistically significant (p < 0.01). Findings showed higher expression of cyclin D1 in the parabasal layer of odontogenic keratocysts and the entire cystic epithelium of glandular odontogenic cysts compared to dentigerous cysts and radicular cysts, implying a possible role of G1-S cell cycle phase disturbances in the aggressiveness of odontogenic keratocysts and glandular odontogenic cysts.

  1. Prognostic relevance of Centromere protein H expression in esophageal carcinoma.

    PubMed

    Guo, Xian-Zhi; Zhang, Ge; Wang, Jun-Ye; Liu, Wan-Li; Wang, Fang; Dong, Ju-Qin; Xu, Li-Hua; Cao, Jing-Yan; Song, Li-Bing; Zeng, Mu-Sheng

    2008-08-13

    Many kinetochore proteins have been shown to be associated with human cancers. The aim of the present study was to clarify the expression of Centromere protein H (CENP-H), one of the fundamental components of the human active kinetochore, in esophageal carcinoma and its correlation with clinicopathological features. We examined the expression of CENP-H in immortalized esophageal epithelial cells as well as in esophageal carcinoma cells, and in 12 cases of esophageal carcinoma tissues and the paired normal esophageal tissues by RT-PCR and Western blot analysis. In addition, we analyzed CENP-H protein expression in 177 clinicopathologically characterized esophageal carcinoma cases by immunohistochemistry. Statistical analyses were applied to test for prognostic and diagnostic associations. The levels of CENP-H mRNA and protein were higher in the immortalized cells, cancer cell lines and most cancer tissues than in normal control tissues. Immunohistochemistry showed that CENP-H was expressed in 127 of 171 ESCC cases (74.3%) and in 3 of 6 esophageal adenocarcinoma cases (50%). Statistical analysis of ESCC cases showed that there was a significant difference in CENP-H expression in patients categorized according to gender (P = 0.013), stage (P = 0.023) and T classification (P = 0.019). Patients with lower CENP-H expression had longer overall survival time than those with higher CENP-H expression. Multivariate analysis suggested that CENP-H expression was an independent prognostic marker for esophageal carcinoma patients. A prognostic value of CENP-H was also found in the subgroup of T3–T4 and N0 tumor classification. Our results suggest that CENP-H protein is a valuable marker of esophageal carcinoma progression. CENP-H might be used as a valuable prognostic marker for esophageal carcinoma patients.

  2. Expression Profiling of Nonpolar Lipids in Meibum From Patients With Dry Eye: A Pilot Study

    PubMed Central

    Chen, Jianzhong; Keirsey, Jeremy K.; Green, Kari B.; Nichols, Kelly K.

    2017-01-01

    Purpose The purpose of this investigation was to characterize differentially expressed lipids in meibum samples from patients with dry eye disease (DED) in order to better understand the underlying pathologic mechanisms. Methods Meibum samples were collected from postmenopausal women with DED (PW-DED; n = 5) and a control group of postmenopausal women without DED (n = 4). Lipid profiles were analyzed by direct infusion full-scan electrospray ionization mass spectrometry (ESI-MS). An initial analysis of 145 representative peaks from four classes of lipids in PW-DED samples revealed that additional manual corrections for peak overlap and isotopes only slightly affected the statistical analysis. Therefore, analysis of uncorrected data, which can be applied to a greater number of peaks, was used to compare more than 500 lipid peaks common to PW-DED and control samples. Statistical analysis of peak intensities identified several lipid species that differed significantly between the two groups. Data from contact lens wearers with DED (CL-DED; n = 5) were also analyzed. Results Many species of the two types of diesters (DE) and very long chain wax esters (WE) were decreased by ∼20% in PW-DED, whereas levels of triacylglycerols were increased by an average of 39% ± 3% in meibum from PW-DED compared to that in the control group. Approximately the same reduction (20%) of similar DE and WE was observed for CL-DED. Conclusions Statistical analysis of peak intensities from direct infusion ESI-MS results identified differentially expressed lipids in meibum from dry eye patients. Further studies are warranted to support these findings. PMID:28426869

  3. Volcano plots in analyzing differential expressions with mRNA microarrays.

    PubMed

    Li, Wentian

    2012-12-01

    A volcano plot displays unstandardized signal (e.g. log-fold-change) against noise-adjusted/standardized signal (e.g. t-statistic or -log(10)(p-value) from the t-test). We review the basic and interactive use of the volcano plot and its crucial role in understanding the regularized t-statistic. The joint filtering gene selection criterion based on regularized statistics has a curved discriminant line in the volcano plot, as compared to the two perpendicular lines for the "double filtering" criterion. This review attempts to provide a unifying framework for discussions on alternative measures of differential expression, improved methods for estimating variance, and visual display of a microarray analysis result. We also discuss the possibility of applying volcano plots to other fields beyond microarray.
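
    To make the two axes of a volcano plot concrete, the sketch below computes, for each gene, a log-fold-change (unstandardized signal) and a Welch t-statistic (standardized signal), then applies the "double filtering" criterion of two perpendicular cutoffs. The gene names, expression values, and cutoff values are hypothetical and chosen only for illustration. The t-statistic here is the ordinary one, not the regularized version discussed in the review.

```python
import math
import statistics


def volcano_point(group_a, group_b):
    """Return (log-fold-change, Welch t-statistic) for one gene.

    Inputs are assumed to be log2-scale expression values, so the
    difference of group means is the log2 fold change.
    """
    lfc = statistics.fmean(group_b) - statistics.fmean(group_a)
    se = math.sqrt(statistics.variance(group_a) / len(group_a)
                   + statistics.variance(group_b) / len(group_b))
    return lfc, lfc / se


def double_filter(lfc, t, lfc_cut=1.0, t_cut=4.0):
    """Select a gene only if it clears both cutoffs (two perpendicular lines)."""
    return abs(lfc) >= lfc_cut and abs(t) >= t_cut


# Hypothetical toy data: (control values, treated values) per gene.
genes = {
    "gene_up": ([5.0, 5.2, 4.8, 5.1], [7.1, 6.9, 7.3, 7.0]),
    "gene_flat": ([6.0, 6.3, 5.8, 6.1], [6.1, 5.9, 6.2, 6.0]),
    "gene_noisy": ([4.0, 7.0, 5.0, 8.0], [6.0, 9.0, 5.5, 9.5]),
}
points = {name: volcano_point(a, b) for name, (a, b) in genes.items()}
selected = [name for name, (lfc, t) in points.items() if double_filter(lfc, t)]
```

    In this toy set, `gene_noisy` has a fold change above the cutoff but a small t-statistic because of its high variance, so it is filtered out. Plotting `points` with the fold change on the horizontal axis and the t-statistic (or a transformed p-value) on the vertical axis gives the volcano display.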

  4. Comparisons of non-Gaussian statistical models in DNA methylation analysis.

    PubMed

    Ma, Zhanyu; Teschendorff, Andrew E; Yu, Hong; Taghia, Jalil; Guo, Jun

    2014-06-16

    As a key regulatory mechanism of gene expression, DNA methylation patterns are widely altered in many complex genetic diseases, including cancer. DNA methylation is naturally quantified by bounded support data; therefore, it is non-Gaussian distributed. In order to capture such properties, we introduce some non-Gaussian statistical models to perform dimension reduction on DNA methylation data. Afterwards, non-Gaussian statistical model-based unsupervised clustering strategies are applied to cluster the data. Comparisons and analysis of different dimension reduction strategies and unsupervised clustering methods are presented. Experimental results show that the non-Gaussian statistical model-based methods are superior to the conventional Gaussian distribution-based method. They are meaningful tools for DNA methylation analysis. Moreover, among several non-Gaussian methods, the one that captures the bounded nature of DNA methylation data reveals the best clustering performance.

  5. Comparisons of Non-Gaussian Statistical Models in DNA Methylation Analysis

    PubMed Central

    Ma, Zhanyu; Teschendorff, Andrew E.; Yu, Hong; Taghia, Jalil; Guo, Jun

    2014-01-01

    As a key regulatory mechanism of gene expression, DNA methylation patterns are widely altered in many complex genetic diseases, including cancer. DNA methylation is naturally quantified by bounded support data; therefore, it is non-Gaussian distributed. In order to capture such properties, we introduce some non-Gaussian statistical models to perform dimension reduction on DNA methylation data. Afterwards, non-Gaussian statistical model-based unsupervised clustering strategies are applied to cluster the data. Comparisons and analysis of different dimension reduction strategies and unsupervised clustering methods are presented. Experimental results show that the non-Gaussian statistical model-based methods are superior to the conventional Gaussian distribution-based method. They are meaningful tools for DNA methylation analysis. Moreover, among several non-Gaussian methods, the one that captures the bounded nature of DNA methylation data reveals the best clustering performance. PMID:24937687

  6. CGO: utilizing and integrating gene expression microarray data in clinical research and data management.

    PubMed

    Bumm, Klaus; Zheng, Mingzhong; Bailey, Clyde; Zhan, Fenghuang; Chiriva-Internati, M; Eddlemon, Paul; Terry, Julian; Barlogie, Bart; Shaughnessy, John D

    2002-02-01

    Clinical GeneOrganizer (CGO) is a novel windows-based archiving, organization and data mining software for the integration of gene expression profiling in clinical medicine. The program implements various user-friendly tools and extracts data for further statistical analysis. This software was written for Affymetrix GeneChip *.txt files, but can also be used for any other microarray-derived data. The MS-SQL server version acts as a data mart and links microarray data with clinical parameters of any other existing database and therefore represents a valuable tool for combining gene expression analysis and clinical disease characteristics.

  7. Super-delta: a new differential gene expression analysis procedure with robust data normalization.

    PubMed

    Liu, Yuhang; Zhang, Jinfeng; Qiu, Xing

    2017-12-21

    Normalization is an important data preparation step in gene expression analyses, designed to remove various kinds of systematic noise. Sample variance is greatly reduced after normalization, hence the power of subsequent statistical analyses is likely to increase. On the other hand, variance reduction is made possible by borrowing information across all genes, including differentially expressed genes (DEGs) and outliers, which will inevitably introduce some bias. This bias typically inflates type I error and can reduce statistical power in certain situations. In this study we propose a new differential expression analysis pipeline, dubbed super-delta, that consists of a multivariate extension of the global normalization and a modified t-test. A robust procedure is designed to minimize the bias introduced by DEGs in the normalization step. The modified t-test is derived based on asymptotic theory for hypothesis testing that suitably pairs with the proposed robust normalization. We first compared super-delta with four commonly used normalization methods: global, median-IQR, quantile, and cyclic loess normalization in simulation studies. Super-delta was shown to have better statistical power with tighter control of type I error rate than its competitors. In many cases, the performance of super-delta is close to that of an oracle test in which datasets without technical noise were used. We then applied all methods to a collection of gene expression datasets on breast cancer patients who received neoadjuvant chemotherapy. While there is a substantial overlap of the DEGs identified by all of them, super-delta was able to identify comparatively more DEGs than its competitors. Downstream gene set enrichment analysis confirmed that all these methods selected largely consistent pathways. Detailed investigations on the relatively small differences showed that pathways identified by super-delta have better connections to breast cancer than other methods. As a new pipeline, super-delta provides new insights to the area of differential gene expression analysis. Solid theoretical foundation supports its asymptotic unbiasedness and technical noise-free properties. Implementation on real and simulated datasets demonstrates its decent performance compared with state-of-the-art procedures. It also has the potential of expansion to be incorporated with other data types and/or more general between-group comparison problems.

  8. Statistical Analysis of Microarray Data with Replicated Spots: A Case Study with Synechococcus WH8102

    PubMed Central

    Thomas, E. V.; Phillippy, K. H.; Brahamsha, B.; Haaland, D. M.; Timlin, J. A.; Elbourne, L. D. H.; Palenik, B.; Paulsen, I. T.

    2009-01-01

    Until recently microarray experiments often involved relatively few arrays with only a single representation of each gene on each array. A complete genome microarray with multiple spots per gene (spread out spatially across the array) was developed in order to compare the gene expression of a marine cyanobacterium and a knockout mutant strain in a defined artificial seawater medium. Statistical methods were developed for analysis in the special situation of this case study where there is gene replication within an array and where relatively few arrays are used, which can be the case with current array technology. Due in part to the replication within an array, it was possible to detect very small changes in the levels of expression between the wild type and mutant strains. One interesting biological outcome of this experiment is the indication of the extent to which the phosphorus regulatory system of this cyanobacterium affects the expression of multiple genes beyond those strictly involved in phosphorus acquisition. PMID:19404483

  9. Statistical Analysis of Microarray Data with Replicated Spots: A Case Study with Synechococcus WH8102

    DOE PAGES

    Thomas, E. V.; Phillippy, K. H.; Brahamsha, B.; ...

    2009-01-01

    Until recently microarray experiments often involved relatively few arrays with only a single representation of each gene on each array. A complete genome microarray with multiple spots per gene (spread out spatially across the array) was developed in order to compare the gene expression of a marine cyanobacterium and a knockout mutant strain in a defined artificial seawater medium. Statistical methods were developed for analysis in the special situation of this case study where there is gene replication within an array and where relatively few arrays are used, which can be the case with current array technology. Due in part to the replication within an array, it was possible to detect very small changes in the levels of expression between the wild type and mutant strains. One interesting biological outcome of this experiment is the indication of the extent to which the phosphorus regulatory system of this cyanobacterium affects the expression of multiple genes beyond those strictly involved in phosphorus acquisition.

  10. Bayesian models based on test statistics for multiple hypothesis testing problems.

    PubMed

    Ji, Yuan; Lu, Yiling; Mills, Gordon B

    2008-04-01

    We propose a Bayesian method for the problem of multiple hypothesis testing that is routinely encountered in bioinformatics research, such as the differential gene expression analysis. Our algorithm is based on modeling the distributions of test statistics under both null and alternative hypotheses. We substantially reduce the complexity of the process of defining posterior model probabilities by modeling the test statistics directly instead of modeling the full data. Computationally, we apply a Bayesian FDR approach to control the number of rejections of null hypotheses. To check if our model assumptions for the test statistics are valid for various bioinformatics experiments, we also propose a simple graphical model-assessment tool. Using extensive simulations, we demonstrate the performance of our models and the utility of the model-assessment tool. In the end, we apply the proposed methodology to an siRNA screening and a gene expression experiment.
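
    The abstract mentions a Bayesian FDR approach without giving details. A widely used form of that idea takes the posterior probability that each null hypothesis is true and rejects the hypotheses with the smallest such probabilities, for as long as their average stays at or below a target level. The sketch below implements that generic rule. It may differ from the authors' procedure, and the function name and posterior probabilities are hypothetical.

```python
def bayesian_fdr_reject(posterior_null, alpha=0.05):
    """Return indices of rejected hypotheses under a Bayesian FDR rule.

    posterior_null[i] is the posterior probability that hypothesis i is null.
    The mean of these probabilities over the rejected set estimates the
    false discovery rate, so hypotheses are added in increasing order
    until that mean would exceed alpha.
    """
    order = sorted(range(len(posterior_null)), key=lambda i: posterior_null[i])
    rejected = []
    running_sum = 0.0
    for i in order:
        if (running_sum + posterior_null[i]) / (len(rejected) + 1) > alpha:
            break  # the running mean only grows from here on
        running_sum += posterior_null[i]
        rejected.append(i)
    return sorted(rejected)


# Hypothetical posterior null probabilities for six hypotheses.
post = [0.01, 0.90, 0.03, 0.20, 0.02, 0.60]
strict = bayesian_fdr_reject(post, alpha=0.05)
lenient = bayesian_fdr_reject(post, alpha=0.10)
```

    With these values the strict threshold rejects hypotheses 0, 2 and 4, and relaxing the target to 0.10 also admits hypothesis 3.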

  11. Expression of CD10 predicts tumor progression and unfavorable prognosis in malignant melanoma.

    PubMed

    Oba, Junna; Nakahara, Takeshi; Hayashida, Sayaka; Kido, Makiko; Xie, Lining; Takahara, Masakazu; Uchi, Hiroshi; Miyazaki, Shogo; Abe, Takeru; Hagihara, Akihito; Moroi, Yoichi; Furue, Masutaka

    2011-12-01

    CD10 expression in malignant melanoma (MM) has been reported to increase according to tumor progression and metastasis; however, its association with patient outcome has not been clarified. We examined the immunohistochemical expression of CD10 in MM to determine whether or not it could serve as a marker for tumor progression and prognosis. A total of 64 formalin-fixed, paraffin-embedded samples of primary MM were immunostained for CD10. Similarly, 40 samples of melanocytic nevus and 20 of metastatic MM were analyzed for comparison. The following clinicopathologic variables were evaluated: age, gender, histologic type, tumor site, Breslow thickness, Clark level, the presence or absence of ulceration and tumor-infiltrating lymphocytes, and survival. Statistical analyses were performed to assess for associations. Several parameters were analyzed for survival using the Kaplan-Meier method and Cox proportional hazards model. Immunohistochemical analysis revealed that 34 of 64 cases (53%) of primary MM expressed CD10, compared with 15 of 20 cases (75%) of metastatic MM and only 4 of 40 cases (10%) of nevus. There was a significant positive relationship between CD10 expression and Breslow thickness, Clark level, and ulceration. Univariate analysis revealed 4 significant factors for shorter survival periods: CD10 expression, high Breslow thickness, high Clark level, and the presence of ulceration (P < .01 each). In multivariate analysis, CD10 expression was revealed to be a statistically significant and independent prognostic factor. The major limitation was the small sample size. CD10 expression may serve as a progression marker and can predict unfavorable prognosis in patients with MM. Copyright © 2010 American Academy of Dermatology, Inc. Published by Mosby, Inc. All rights reserved.
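
    The survival analysis above relies on the Kaplan-Meier method. For readers unfamiliar with it, a minimal sketch of the product-limit estimator follows: at each time with an observed event, the running survival probability is multiplied by the fraction of at-risk subjects who did not have the event. The follow-up times and censoring flags are hypothetical and are not data from the study.

```python
def kaplan_meier(times, events):
    """Return [(time, survival probability)] at each distinct event time.

    events[i] is True if the event was observed at times[i] and False if
    the subject was censored at that time.
    """
    data = sorted(zip(times, events))
    at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = 0
        removed = 0
        # Group all subjects sharing the same time point.
        while i < len(data) and data[i][0] == t:
            deaths += int(data[i][1])
            removed += 1
            i += 1
        if deaths:
            survival *= 1 - deaths / at_risk
            curve.append((t, survival))
        # Both events and censored subjects leave the risk set.
        at_risk -= removed
    return curve


# Hypothetical follow-up times (months) and event indicators.
times = [6, 7, 10, 15, 19, 25]
events = [True, False, True, True, False, True]
curve = kaplan_meier(times, events)
```

    Censored subjects never lower the curve directly. They only shrink the risk set, which is why the drop at month 10 is 1/4 rather than 1/5.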

  12. Non-Gaussian Distributions Affect Identification of Expression Patterns, Functional Annotation, and Prospective Classification in Human Cancer Genomes

    PubMed Central

    Marko, Nicholas F.; Weil, Robert J.

    2012-01-01

    Introduction: Gene expression data is often assumed to be normally-distributed, but this assumption has not been tested rigorously. We investigate the distribution of expression data in human cancer genomes and study the implications of deviations from the normal distribution for translational molecular oncology research. Methods: We conducted a central moments analysis of five cancer genomes and performed empiric distribution fitting to examine the true distribution of expression data both on the complete-experiment and on the individual-gene levels. We used a variety of parametric and nonparametric methods to test the effects of deviations from normality on gene calling, functional annotation, and prospective molecular classification using a sixth cancer genome. Results: Central moments analyses reveal statistically-significant deviations from normality in all of the analyzed cancer genomes. We observe as much as 37% variability in gene calling, 39% variability in functional annotation, and 30% variability in prospective, molecular tumor subclassification associated with this effect. Conclusions: Cancer gene expression profiles are not normally-distributed, either on the complete-experiment or on the individual-gene level. Instead, they exhibit complex, heavy-tailed distributions characterized by statistically-significant skewness and kurtosis. The non-Gaussian distribution of this data affects identification of differentially-expressed genes, functional annotation, and prospective molecular classification. These effects may be reduced in some circumstances, although not completely eliminated, by using nonparametric analytics. This analysis highlights two unreliable assumptions of translational cancer gene expression analysis: that “small” departures from normality in the expression data distributions are analytically-insignificant and that “robust” gene-calling algorithms can fully compensate for these effects. PMID:23118863
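
    The central moments analysis mentioned above can be illustrated with a minimal sketch. It computes skewness and excess kurtosis from population central moments; the function name and the expression values are hypothetical, and the abstract does not say which estimators the authors used.

```python
def central_moments(values):
    """Return (mean, variance, skewness, excess kurtosis).

    Uses population (divide-by-n) central moments; this is an assumption,
    not necessarily the estimator used in the study.
    """
    n = len(values)
    mean = sum(values) / n
    m2 = sum((x - mean) ** 2 for x in values) / n  # second central moment
    m3 = sum((x - mean) ** 3 for x in values) / n  # third central moment
    m4 = sum((x - mean) ** 4 for x in values) / n  # fourth central moment
    skewness = m3 / m2 ** 1.5
    excess_kurtosis = m4 / m2 ** 2 - 3.0  # 0 for a normal distribution
    return mean, m2, skewness, excess_kurtosis


# Hypothetical log-expression values with one heavy right-tail observation
expression = [6.1, 6.3, 6.0, 6.2, 6.4, 6.1, 6.2, 11.5]
mean, var, skew, kurt = central_moments(expression)
print(f"skewness={skew:.2f}, excess kurtosis={kurt:.2f}")
```

    A normal sample would give values near zero for both quantities; a single extreme observation, as in the toy data, is enough to push the skewness well above zero.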

  13. Expression and prognostic examination of heat shock proteins (HSP 27, HSP 70, and HSP 90) in medulloblastoma.

    PubMed

    Hauser, Péter; Hanzély, Zoltán; Jakab, Zsuzsanna; Oláh, Lászlóné; Szabó, Erika; Jeney, András; Schuler, Dezso; Fekete, György; Bognár, László; Garami, Miklós

    2006-07-01

    Expression of heat shock proteins (HSPs) is of prognostic significance in several tumor types. HSP expression levels were determined in medulloblastomas and tested for association with prognostic parameters. Expression of antiapoptotic HSP 27, HSP 70, and HSP 90 was investigated by immunohistochemistry on paraffin-embedded sections from 65 patients. Expression of HSPs was validated on internal vascular controls and by Western blotting analysis. Sample evaluation was based on the estimated percentage of HSP-positive tumor cells. The Kaplan-Meier method was used for survival analysis; the chi-square test, univariate analysis, and the log-rank test were applied for statistical analysis. Expression of HSPs varied in medulloblastomas. On the basis of the average expression rate of HSPs, 2 groups were created using a 10% cutoff for HSP 27 and HSP 90 and a 70% cutoff for HSP 70. The amount of expression of any of the HSP types was not significantly associated with known prognostic factors (age of patient, extent of resection, presence of metastasis) or histologic subtype. After an average follow-up period of 4.30 years, no significant difference was observed in survival depending on the expression of HSP 27, HSP 70, or HSP 90. The high expression of HSPs indicates that these proteins are potential therapeutic targets.
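
    The Kaplan-Meier method named in this abstract can be sketched as follows. This is a minimal product-limit estimator under stated assumptions: the function name and the follow-up data are hypothetical, and none of the values come from the study.

```python
def kaplan_meier(times, events):
    """Return [(t, S(t))] at each distinct time with an observed event.

    times  : follow-up durations
    events : 1 if the event was observed, 0 if the subject was censored
    """
    data = sorted(zip(times, events))
    at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = 0
        leaving = 0
        # Group all subjects that share this follow-up time
        while i < len(data) and data[i][0] == t:
            deaths += data[i][1]
            leaving += 1
            i += 1
        if deaths:
            # Product-limit step: multiply by the fraction surviving time t
            survival *= 1.0 - deaths / at_risk
            curve.append((t, survival))
        # Both events and censored subjects leave the risk set after t
        at_risk -= leaving
    return curve


# Hypothetical follow-up times in years (not data from the study)
times = [1.0, 2.0, 2.0, 3.5, 4.0, 5.0]
events = [1, 1, 0, 1, 0, 0]
for t, s in kaplan_meier(times, events):
    print(f"t={t:.1f}  S(t)={s:.3f}")
```

    Comparing two such curves (for example, high versus low expression groups) is what the log-rank test formalizes.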

  14. Separate enrichment analysis of pathways for up- and downregulated genes.

    PubMed

    Hong, Guini; Zhang, Wenjing; Li, Hongdong; Shen, Xiaopei; Guo, Zheng

    2014-03-06

    Two strategies are often adopted for enrichment analysis of pathways: the analysis of all differentially expressed (DE) genes together or the analysis of up- and downregulated genes separately. However, few studies have examined the rationales of these enrichment analysis strategies. Using both microarray and RNA-seq data, we show that gene pairs with functional links in pathways tended to have positively correlated expression levels, which could result in an imbalance between the up- and downregulated genes in particular pathways. We then show that the imbalance could greatly reduce the statistical power for finding disease-associated pathways through the analysis of all-DE genes. Further, using gene expression profiles from five types of tumours, we illustrate that the separate analysis of up- and downregulated genes could identify more pathways that are really pertinent to phenotypic difference. In conclusion, analysing up- and downregulated genes separately is more powerful than analysing all of the DE genes together.
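
    The power argument above can be illustrated with a small over-representation sketch. It assumes a one-sided hypergeometric test, which is a common choice for pathway enrichment but is not confirmed by the abstract as the authors' test; all gene counts are hypothetical.

```python
from math import comb


def hypergeom_upper_tail(N, K, n, k):
    """P(X >= k) when n genes are drawn from N, of which K are in the pathway."""
    total = comb(N, n)
    upper = min(K, n)
    hits = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1))
    return hits / total


# Hypothetical numbers: 1000 background genes, a 50-gene pathway,
# 100 upregulated and 100 downregulated DE genes.
N, K = 1000, 50
up_in_pathway, down_in_pathway = 15, 1  # strongly imbalanced pathway

p_up = hypergeom_upper_tail(N, K, 100, up_in_pathway)
p_down = hypergeom_upper_tail(N, K, 100, down_in_pathway)
p_all = hypergeom_upper_tail(N, K, 200, up_in_pathway + down_in_pathway)

print(f"up only:   p = {p_up:.2e}")
print(f"down only: p = {p_down:.2e}")
print(f"all DE:    p = {p_all:.2e}")
```

    In this toy setting the pathway is much more strongly enriched among the upregulated genes than among all DE genes pooled together, because the nearly empty downregulated list dilutes the signal.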

  15. Gene co-expression network analysis in Rhodobacter capsulatus and application to comparative expression analysis of Rhodobacter sphaeroides

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Pena-Castillo, Lourdes; Mercer, Ryan; Gurinovich, Anastasia

    2014-08-28

    The genus Rhodobacter contains purple nonsulfur bacteria found mostly in freshwater environments. Representative strains of two Rhodobacter species, R. capsulatus and R. sphaeroides, have had their genomes fully sequenced and both have been the subject of transcriptional profiling studies. Gene co-expression networks can be used to identify modules of genes with similar expression profiles. Functional analysis of gene modules can then associate co-expressed genes with biological pathways, and network statistics can determine the degree of module preservation in related networks. In this paper, we constructed an R. capsulatus gene co-expression network, performed functional analysis of identified gene modules, and investigated preservation of these modules in R. capsulatus proteomics data and in R. sphaeroides transcriptomics data. Results: The analysis identified 40 gene co-expression modules in R. capsulatus. Investigation of the module gene contents and expression profiles revealed patterns that were validated based on previous studies supporting the biological relevance of these modules. We identified two R. capsulatus gene modules preserved in the protein abundance data. We also identified several gene modules preserved between both Rhodobacter species, which indicate that these cellular processes are conserved between the species and are candidates for functional information transfer between species. Many gene modules were non-preserved, providing insight into processes that differentiate the two species. In addition, using Local Network Similarity (LNS), a recently proposed metric for expression divergence, we assessed the expression conservation of between-species pairs of orthologs, and within-species gene-protein expression profiles. Conclusions: Our analyses provide new sources of information for functional annotation in R. capsulatus because uncharacterized genes in modules are now connected with groups of genes that constitute a joint functional annotation. We identified R. capsulatus modules enriched with genes for ribosomal proteins, porphyrin and bacteriochlorophyll anabolism, and biosynthesis of secondary metabolites to be preserved in R. sphaeroides whereas modules related to RcGTA production and signalling showed lack of preservation in R. sphaeroides. In addition, we demonstrated that network statistics may also be applied within-species to identify congruence between mRNA expression and protein abundance data for which simple correlation measurements have previously had mixed results.

  16. Characteristics of genomic signatures derived using univariate methods and mechanistically anchored functional descriptors for predicting drug- and xenobiotic-induced nephrotoxicity.

    PubMed

    Shi, Weiwei; Bugrim, Andrej; Nikolsky, Yuri; Nikolskya, Tatiana; Brennan, Richard J

    2008-01-01

    The ideal toxicity biomarker is composed of the properties of prediction (is detected prior to traditional pathological signs of injury), accuracy (high sensitivity and specificity), and mechanistic relationships to the endpoint measured (biological relevance). Gene expression-based toxicity biomarkers ("signatures") have shown good predictive power and accuracy, but are difficult to interpret biologically. We have compared different statistical methods of feature selection with knowledge-based approaches, using GeneGo's database of canonical pathway maps, to generate gene sets for the classification of renal tubule toxicity. The gene set selection algorithms include four univariate analyses: t-statistics, fold-change, B-statistics, and RankProd, and their combination and overlap for the identification of differentially expressed probes. Enrichment analysis following the results of the four univariate analyses, Hotelling T-square test, and, finally out-of-bag selection, a variant of cross-validation, were used to identify canonical pathway maps (sets of genes coordinately involved in key biological processes) with classification power. Differentially expressed genes identified by the different statistical univariate analyses all generated reasonably performing classifiers of tubule toxicity. Maps identified by enrichment analysis or Hotelling T-square had lower classification power, but highlighted perturbed lipid homeostasis as a common discriminator of nephrotoxic treatments. The out-of-bag method yielded the best functionally integrated classifier. The map "ephrins signaling" performed comparably to a classifier derived using sparse linear programming, a machine learning algorithm, and represents a signaling network specifically involved in renal tubule development and integrity. Such functional descriptors of toxicity promise to better integrate predictive toxicogenomics with mechanistic analysis, facilitating the interpretation and risk assessment of predictive genomic investigations.

  17. Aberrant Expression of Calretinin, D2-40 and Mesothelin in Mucinous and Non-Mucinous Colorectal Carcinomas and Relation to Clinicopathological Features and Prognosis.

    PubMed

    Foda, Abd AlRahman Mohammad; El-Hawary, Amira Kamal; Hamed, Hazem

    2016-10-01

    CRC is a heterogeneous disease in terms of morphology, invasive behavior, metastatic capacity, and clinical outcome. Recently, many so-called mesothelial markers, including calretinin, D2-40, WT1, thrombomodulin, mesothelin, and others, have been certified. The aim of this study was to assess the immunohistochemical expression of calretinin and other mesothelial markers (D2-40 and mesothelin) in colorectal mucinous adenocarcinoma (MA) and non-mucinous adenocarcinoma (NMA) specimens and its relation to clinicopathological features and prognosis using a manual tissue microarray technique. We studied tumor tissue specimens from 150 patients with colorectal MA and NMA who underwent radical surgery from January 2007 to January 2012. High-density manual tissue microarrays were constructed using a modified mechanical pencil tip technique, and paraffin sections were submitted for immunohistochemistry for calretinin, D2-40, and mesothelin. We found that NMA showed significantly more calretinin and D2-40 expression than MA. In contrast, no statistically significant difference between NMA and MA was detected in mesothelin expression. There were no statistically significant relations between any of the clinicopathological or histological parameters and any of the three markers. In a univariate analysis, neither calretinin nor D2-40 expression showed any significant relation to DFS or OS. However, mesothelin luminal expression was significantly associated with worse DFS. Multivariate Cox regression analysis showed that luminal mesothelin expression was an independent negative prognostic factor in NMA. In conclusion, calretinin, D2-40, and mesothelin are aberrantly expressed in a proportion of CRC cases, with more expression in NMA than MA. Aberrant expression of these mesothelial markers was not associated with clinicopathological or histological features of CRCs. Only mesothelin expression appears to be a strong predictor of adverse prognosis.

  18. Cytokeratin 19 Expression Patterns of Dentigerous Cysts and Odontogenic Keratocysts

    PubMed Central

    Kamath, KP; Vidya, M

    2015-01-01

    Background: Although numerous investigators have studied the pattern of keratin expression in different odontogenic cysts, the results have been variable. Aim: The present study was conducted to determine the pattern of expression of cytokeratin 19 (CK 19) in the epithelial lining of odontogenic keratocysts and dentigerous cysts. Materials and Methods: The epithelial layers showing expression of the epithelial marker CK 19 were determined by immunohistochemical methods in 15 tissue specimens each of histopathologically confirmed cases of dentigerous cysts and odontogenic keratocysts. Statistical analysis was done to compare the CK 19 expression between dentigerous cysts and odontogenic keratocysts using the Chi-square test. P < 0.05 was considered to be statistically significant. Results: All specimens of dentigerous cysts were positive for CK 19 with 20% (3/15) of the specimens showing expression only in a single layer of the epithelium, 40% (6/15) of the specimens showing expression in more than one layer but not the entire thickness of the epithelium, and the remaining 40% (6/15) showing expression throughout the entire thickness of the epithelium. In the case of odontogenic keratocysts, 40% (6/15) of the specimens were negative for CK 19, 40% (6/15) of the specimens showed expression only in a single layer of the epithelium, and 20% (3/15) of the specimens showed expression in more than one layer, but not the entire thickness of the epithelium. The observed differences in CK 19 expression by the two lesions were statistically significant (P < 0.01). Conclusion: The differences in CK 19 expression by these cysts may be utilized as a diagnostic tool in differentiating between these two lesions. PMID:25861531
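
    The counts reported in this abstract are enough to sketch a Chi-square test of independence. The 2 x 4 layout below (lesion type by expression category) is an assumption, since the abstract does not state how the authors arranged their contingency table; the function names are illustrative.

```python
import math


def chi_square_statistic(table):
    """Pearson chi-square statistic and degrees of freedom for an r x c table."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / grand_total
            stat += (observed - expected) ** 2 / expected
    dof = (len(table) - 1) * (len(table[0]) - 1)
    return stat, dof


def chi2_sf_3dof(x):
    """Upper-tail probability of a chi-square variable with exactly 3 d.o.f."""
    return math.erfc(math.sqrt(x / 2)) + math.sqrt(2 * x / math.pi) * math.exp(-x / 2)


# Columns: negative, single layer, more than one layer, entire thickness
# (counts taken from the abstract; the table layout is assumed)
table = [
    [0, 3, 6, 6],  # dentigerous cysts
    [6, 6, 3, 0],  # odontogenic keratocysts
]
stat, dof = chi_square_statistic(table)
p_value = chi2_sf_3dof(stat)  # valid here because dof == 3
print(f"chi-square = {stat:.2f}, dof = {dof}, p = {p_value:.4f}")
```

    Under this assumed layout the p-value falls below 0.01, in line with the significance level reported in the abstract.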

  19. A common base method for analysis of qPCR data and the application of simple blocking in qPCR experiments.

    PubMed

    Ganger, Michael T; Dietz, Geoffrey D; Ewing, Sarah J

    2017-12-01

    qPCR has established itself as the technique of choice for the quantification of gene expression. Procedures for conducting qPCR have received significant attention; however, more rigorous approaches to the statistical analysis of qPCR data are needed. Here we develop a mathematical model, termed the Common Base Method, for analysis of qPCR data based on threshold cycle values ($C_q$) and efficiencies of reactions ($E$). The Common Base Method keeps all calculations in the logscale as long as possible by working with $\log_{10}(E) \cdot C_q$, which we call the efficiency-weighted $C_q$ value; subsequent statistical analyses are then applied in the logscale. We show how efficiency-weighted $C_q$ values may be analyzed using a simple paired or unpaired experimental design and develop blocking methods to help reduce unexplained variation. The Common Base Method has several advantages. It allows for the incorporation of well-specific efficiencies and multiple reference genes. The method does not necessitate the pairing of samples that must be performed using traditional analysis methods in order to calculate relative expression ratios. Our method is also simple enough to be implemented in any spreadsheet or statistical software without additional scripts or proprietary components.
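
    The efficiency-weighted Cq value, log10(E) · Cq, can be sketched in a few lines. The relative-expression step below follows from the standard amplification model (template quantity proportional to E raised to the power of minus Cq); it is an assumption for illustration rather than the authors' exact procedure, and all names and values are hypothetical.

```python
import math


def efficiency_weighted_cq(cq, efficiency):
    """Efficiency-weighted Cq: log10(E) * Cq (stays on the log10 scale)."""
    return math.log10(efficiency) * cq


def log10_relative_expression(goi_treat, ref_treat, goi_ctrl, ref_ctrl):
    """log10 of the treatment/control expression ratio.

    Each argument is a (Cq, E) pair for the gene of interest (goi) or the
    reference gene (ref). Assumes quantity is proportional to E ** (-Cq).
    """
    delta_treat = efficiency_weighted_cq(*goi_treat) - efficiency_weighted_cq(*ref_treat)
    delta_ctrl = efficiency_weighted_cq(*goi_ctrl) - efficiency_weighted_cq(*ref_ctrl)
    return -(delta_treat - delta_ctrl)


# Hypothetical wells with perfect doubling (E = 2) in every reaction
log_ratio = log10_relative_expression(
    goi_treat=(23.0, 2.0), ref_treat=(20.0, 2.0),
    goi_ctrl=(25.0, 2.0), ref_ctrl=(20.0, 2.0),
)
ratio = 10 ** log_ratio  # back-transform only at the very end
print(f"log10 ratio = {log_ratio:.4f}, ratio = {ratio:.2f}")
```

    With E = 2 everywhere this reduces to the familiar 2 to the power of minus delta-delta-Cq calculation; well-specific efficiencies simply change the weights applied to each Cq.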

  20. Statistical Model to Analyze Quantitative Proteomics Data Obtained by 18O/16O Labeling and Linear Ion Trap Mass Spectrometry

    PubMed Central

    Jorge, Inmaculada; Navarro, Pedro; Martínez-Acedo, Pablo; Núñez, Estefanía; Serrano, Horacio; Alfranca, Arántzazu; Redondo, Juan Miguel; Vázquez, Jesús

    2009-01-01

    Statistical models for the analysis of protein expression changes by stable isotope labeling are still poorly developed, particularly for data obtained by 16O/18O labeling. Moreover, large-scale test experiments to validate the null hypothesis are lacking. Although the study of mechanisms underlying biological actions promoted by vascular endothelial growth factor (VEGF) on endothelial cells is of considerable interest, quantitative proteomics studies on this subject are scarce and have been performed after exposing cells to the factor for long periods of time. In this work we present the largest quantitative proteomics study to date on the short term effects of VEGF on human umbilical vein endothelial cells by 18O/16O labeling. Current statistical models based on normality and variance homogeneity were found unsuitable to describe the null hypothesis in a large scale test experiment performed on these cells, producing false expression changes. A random effects model was developed including four different sources of variance at the spectrum-fitting, scan, peptide, and protein levels. With the new model the number of outliers at scan and peptide levels was negligible in three large scale experiments, and only one false protein expression change was observed in the test experiment among more than 1000 proteins. The new model allowed the detection of significant protein expression changes upon VEGF stimulation for 4 and 8 h. The consistency of the changes observed at 4 h was confirmed by a replica at a smaller scale and further validated by Western blot analysis of some proteins. Most of the observed changes have not been described previously and are consistent with a pattern of protein expression that dynamically changes over time following the evolution of the angiogenic response. With this statistical model the 18O labeling approach emerges as a very promising and robust alternative to perform quantitative proteomics studies at a depth of several thousand proteins. PMID:19181660

  1. Temporal expression profiling of plasma proteins reveals oxidative stress in early stages of Type 1 Diabetes progression

    DOE PAGES

    Liu, Chih-Wei; Bramer, Lisa; Webb-Robertson, Bobbie-Jo; ...

    2017-10-07

    We report that blood markers other than islet autoantibodies are greatly needed to indicate the pancreatic beta cell destruction process as early as possible, and more accurately reflect the progression of Type 1 Diabetes Mellitus (T1D). To this end, a longitudinal proteomic profiling of human plasma using TMT-10plex-based LC-MS/MS analysis was performed to track temporal proteomic changes of T1D patients (n = 11) across 9 serial time points, spanning the period of T1D natural progression, in comparison with those of the matching healthy controls (n = 10). To our knowledge, the current study represents the largest (> 2000 proteins measured) longitudinal expression profiles of human plasma proteome in T1D research. By applying statistical trend analysis on the temporal expression patterns between T1D and controls, and Benjamini-Hochberg procedure for multiple-testing correction, 13 protein groups were regarded as having statistically significant differences during the entire follow-up period. Moreover, 16 protein groups, which play pivotal roles in response to oxidative stress, have consistently abnormal expression trend before seroconversion to islet autoimmunity. Importantly, the expression trends of two key reactive oxygen species-decomposing enzymes, Catalase and Superoxide dismutase were verified independently by ELISA.
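
    The Benjamini-Hochberg procedure referred to above can be sketched as a short step-up adjustment. The p-values are hypothetical and the function name is illustrative; the abstract does not report the false discovery rate threshold the authors applied, so 0.05 is an assumption.

```python
def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])  # indices by ascending p
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, pvalues[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


# Hypothetical per-protein trend-test p-values
pvalues = [0.001, 0.008, 0.039, 0.041, 0.6]
adjusted = benjamini_hochberg(pvalues)
significant = [i for i, q in enumerate(adjusted) if q <= 0.05]  # assumed FDR of 5%
print(adjusted, significant)
```

    In this toy example two of the five tests pass the assumed 5% threshold, even though four raw p-values are below 0.05.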

  2. Temporal expression profiling of plasma proteins reveals oxidative stress in early stages of Type 1 Diabetes progression

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Liu, Chih-Wei; Bramer, Lisa; Webb-Robertson, Bobbie-Jo

    We report that blood markers other than islet autoantibodies are greatly needed to indicate the pancreatic beta cell destruction process as early as possible, and more accurately reflect the progression of Type 1 Diabetes Mellitus (T1D). To this end, a longitudinal proteomic profiling of human plasma using TMT-10plex-based LC-MS/MS analysis was performed to track temporal proteomic changes of T1D patients (n = 11) across 9 serial time points, spanning the period of T1D natural progression, in comparison with those of the matching healthy controls (n = 10). To our knowledge, the current study represents the largest (> 2000 proteins measured) longitudinal expression profiles of human plasma proteome in T1D research. By applying statistical trend analysis on the temporal expression patterns between T1D and controls, and Benjamini-Hochberg procedure for multiple-testing correction, 13 protein groups were regarded as having statistically significant differences during the entire follow-up period. Moreover, 16 protein groups, which play pivotal roles in response to oxidative stress, have consistently abnormal expression trend before seroconversion to islet autoimmunity. Importantly, the expression trends of two key reactive oxygen species-decomposing enzymes, Catalase and Superoxide dismutase were verified independently by ELISA.

  3. Effect of the absolute statistic on gene-sampling gene-set analysis methods.

    PubMed

    Nam, Dougu

    2017-06-01

    Gene-set enrichment analysis and its modified versions have commonly been used for identifying altered functions or pathways in disease from microarray data. In particular, the simple gene-sampling gene-set analysis methods have been heavily used for datasets with only a few sample replicates. The biggest problem with this approach is the highly inflated false-positive rate. In this paper, the effect of absolute gene statistic on gene-sampling gene-set analysis methods is systematically investigated. Thus far, the absolute gene statistic has merely been regarded as a supplementary method for capturing the bidirectional changes in each gene set. Here, it is shown that incorporating the absolute gene statistic in gene-sampling gene-set analysis substantially reduces the false-positive rate and improves the overall discriminatory ability. Its effect was investigated by power, false-positive rate, and receiver operating curve for a number of simulated and real datasets. The performances of gene-set analysis methods in one-tailed (genome-wide association study) and two-tailed (gene expression data) tests were also compared and discussed.
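
    The role of the absolute gene statistic in a gene-sampling test can be illustrated with a toy sketch. It scores a gene set by the mean of its gene statistics and compares that score with randomly sampled gene sets of the same size. The scoring function, the permutation count, and the data are all hypothetical and are not taken from the paper.

```python
import random


def gene_sampling_pvalue(stats, gene_set, use_absolute, n_perm=1000, seed=0):
    """Gene-sampling p-value for a gene set.

    Score = |mean(t)| for the signed version, mean(|t|) for the absolute
    version; random gene sets of equal size form the null distribution.
    """
    def score(indices):
        values = [abs(stats[i]) if use_absolute else stats[i] for i in indices]
        return abs(sum(values) / len(values))

    rng = random.Random(seed)
    observed = score(gene_set)
    n_genes = len(stats)
    hits = sum(
        score(rng.sample(range(n_genes), len(gene_set))) >= observed
        for _ in range(n_perm)
    )
    return (hits + 1) / (n_perm + 1)


# Hypothetical t-statistics: 96 near-null genes plus a set of four genes
# that change strongly but in opposite directions.
stats = [0.1 if i % 2 == 0 else -0.1 for i in range(96)] + [3.0, -3.0, 3.0, -3.0]
gene_set = [96, 97, 98, 99]

p_signed = gene_sampling_pvalue(stats, gene_set, use_absolute=False)
p_abs = gene_sampling_pvalue(stats, gene_set, use_absolute=True)
print(f"signed: p = {p_signed:.3f}, absolute: p = {p_abs:.3f}")
```

    The signed score cancels to zero for this bidirectional set, so it is never detected, while the absolute score separates it clearly from random gene sets. The sketch shows only this bidirectional-capture property, not the false-positive behavior studied in the paper.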

  4. GENE-Counter: A Computational Pipeline for the Analysis of RNA-Seq Data for Gene Expression Differences

    PubMed Central

    Di, Yanming; Schafer, Daniel W.; Wilhelm, Larry J.; Fox, Samuel E.; Sullivan, Christopher M.; Curzon, Aron D.; Carrington, James C.; Mockler, Todd C.; Chang, Jeff H.

    2011-01-01

    GENE-counter is a complete Perl-based computational pipeline for analyzing RNA-Sequencing (RNA-Seq) data for differential gene expression. In addition to its use in studying transcriptomes of eukaryotic model organisms, GENE-counter is applicable for prokaryotes and non-model organisms without an available genome reference sequence. For alignments, GENE-counter is configured for CASHX, Bowtie, and BWA, but an end user can use any Sequence Alignment/Map (SAM)-compliant program of preference. To analyze data for differential gene expression, GENE-counter can be run with any one of three statistics packages that are based on variations of the negative binomial distribution. The default method is a new and simple statistical test we developed based on an over-parameterized version of the negative binomial distribution. GENE-counter also includes three different methods for assessing differentially expressed features for enriched gene ontology (GO) terms. Results are transparent and data are systematically stored in a MySQL relational database to facilitate additional analyses as well as quality assessment. We used next generation sequencing to generate a small-scale RNA-Seq dataset derived from the heavily studied defense response of Arabidopsis thaliana and used GENE-counter to process the data. Collectively, the support from analysis of microarrays as well as the observed and substantial overlap in results from each of the three statistics packages demonstrates that GENE-counter is well suited for handling the unique characteristics of small sample sizes and high variability in gene counts. PMID:21998647
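
    The negative binomial distribution that underlies the statistics packages mentioned above can be sketched with a mean and dispersion parameterization, in which the variance is mu + phi * mu^2. This parameterization is a common convention for RNA-Seq counts and is an assumption here; the abstract does not describe the exact form used by GENE-counter, and the numbers are hypothetical.

```python
import math


def nb_pmf(k, mu, phi):
    """Negative binomial probability of count k with mean mu and dispersion phi."""
    r = 1.0 / phi          # "size" parameter
    p = r / (r + mu)       # success probability
    log_pmf = (
        math.lgamma(k + r) - math.lgamma(r) - math.lgamma(k + 1)
        + r * math.log(p) + k * math.log(1.0 - p)
    )
    return math.exp(log_pmf)


mu, phi = 10.0, 0.5  # hypothetical mean read count and dispersion
support = range(0, 1000)
total = sum(nb_pmf(k, mu, phi) for k in support)
mean = sum(k * nb_pmf(k, mu, phi) for k in support)
variance = sum((k - mean) ** 2 * nb_pmf(k, mu, phi) for k in support)
print(f"mean = {mean:.3f}, variance = {variance:.3f} (Poisson would give {mu:.3f})")
```

    The variance is far larger than the mean, which is the overdispersion that makes a Poisson model unsuitable for replicated gene counts.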

  5. Establishing glucose- and ABA-regulated transcription networks in Arabidopsis by microarray analysis and promoter classification using a Relevance Vector Machine.

    PubMed

    Li, Yunhai; Lee, Kee Khoon; Walsh, Sean; Smith, Caroline; Hadingham, Sophie; Sorefan, Karim; Cawley, Gavin; Bevan, Michael W

    2006-03-01

    Establishing transcriptional regulatory networks by analysis of gene expression data and promoter sequences shows great promise. We developed a novel promoter classification method using a Relevance Vector Machine (RVM) and Bayesian statistical principles to identify discriminatory features in the promoter sequences of genes that can correctly classify transcriptional responses. The method was applied to microarray data obtained from Arabidopsis seedlings treated with glucose or abscisic acid (ABA). Of those genes showing >2.5-fold changes in expression level, approximately 70% were correctly predicted as being up- or down-regulated (under 10-fold cross-validation), based on the presence or absence of a small set of discriminative promoter motifs. Many of these motifs have known regulatory functions in sugar- and ABA-mediated gene expression. One promoter motif that was not known to be involved in glucose-responsive gene expression was identified as the strongest classifier of glucose-up-regulated gene expression. We show it confers glucose-responsive gene expression in conjunction with another promoter motif, thus validating the classification method. We were able to establish a detailed model of glucose and ABA transcriptional regulatory networks and their interactions, which will help us to understand the mechanisms linking metabolism with growth in Arabidopsis. This study shows that machine learning strategies coupled to Bayesian statistical methods hold significant promise for identifying functionally significant promoter sequences.

  6. Characterization and recognition of mixed emotional expressions in thermal face image

    NASA Astrophysics Data System (ADS)

    Saha, Priya; Bhattacharjee, Debotosh; De, Barin K.; Nasipuri, Mita

    2016-05-01

    Facial expressions in infrared imaging have been introduced to solve the problem of illumination, which is an integral constituent of visual imagery. The paper investigates facial skin temperature distribution on mixed thermal facial expressions of our created face database, where six are basic expressions and the remaining 12 are mixtures of those basic expressions. Temperature analysis has been performed on three facial regions of interest (ROIs): periorbital, supraorbital, and mouth. Temperature variability of the ROIs in different expressions has been measured using statistical parameters. The temperature variation measurement in ROIs of a particular expression corresponds to a vector, which is later used in recognition of mixed facial expressions. Investigations show that facial features in mixed facial expressions can be characterized by positive emotion induced facial features and negative emotion induced facial features. The supraorbital region is a useful facial region that can differentiate basic expressions from mixed expressions. Analysis and interpretation of mixed expressions have been conducted with the help of box-and-whisker plots. A facial region containing a mixture of two expressions is generally less temperature inducing than the corresponding facial region containing basic expressions.

  7. Structural Analysis of Covariance and Correlation Matrices.

    ERIC Educational Resources Information Center

    Joreskog, Karl G.

    1978-01-01

    A general approach to analysis of covariance structures is considered, in which the variances and covariances or correlations of the observed variables are directly expressed in terms of the parameters of interest. The statistical problems of identification, estimation and testing of such covariance or correlation structures are discussed.…

  8. Unsupervised Outlier Profile Analysis

    PubMed Central

    Ghosh, Debashis; Li, Song

    2014-01-01

    In much of the analysis of high-throughput genomic data, “interesting” genes have been selected based on assessment of differential expression between two groups or generalizations thereof. Most of the literature focuses on changes in mean expression or the entire distribution. In this article, we explore the use of C(α) tests, which have been applied in other genomic data settings. Their use for the outlier expression problem, in particular with continuous data, is problematic but nevertheless motivates new statistics that give an unsupervised analog to previously developed outlier profile analysis approaches. Some simulation studies are used to evaluate the proposal. A bivariate extension is described that can accommodate data from two platforms on matched samples. The proposed methods are applied to data from a prostate cancer study. PMID:25452686

  9. Meta-Analysis of Placental Transcriptome Data Identifies a Novel Molecular Pathway Related to Preeclampsia.

    PubMed

    van Uitert, Miranda; Moerland, Perry D; Enquobahrie, Daniel A; Laivuori, Hannele; van der Post, Joris A M; Ris-Stalpers, Carrie; Afink, Gijs B

    2015-01-01

    Studies using the placental transcriptome to identify key molecules relevant for preeclampsia are hampered by a relatively small sample size. In addition, they use a variety of bioinformatics and statistical methods, making comparison of findings challenging. To generate a more robust preeclampsia gene expression signature, we performed a meta-analysis on the original data of 11 placenta RNA microarray experiments, representing 139 normotensive and 116 preeclamptic pregnancies. Microarray data were pre-processed and analyzed using standardized bioinformatics and statistical procedures and the effect sizes were combined using an inverse-variance random-effects model. Interactions between genes in the resulting gene expression signature were identified by pathway analysis (Ingenuity Pathway Analysis, Gene Set Enrichment Analysis, Graphite) and protein-protein associations (STRING). This approach has resulted in a comprehensive list of differentially expressed genes that led to a 388-gene meta-signature of preeclamptic placenta. Pathway analysis highlights the involvement of the previously identified hypoxia/HIF1A pathway in the establishment of the preeclamptic gene expression profile, while analysis of protein interaction networks indicates CREBBP/EP300 as a novel element central to the preeclamptic placental transcriptome. In addition, there is an apparent high incidence of preeclampsia in women carrying a child with a mutation in CREBBP/EP300 (Rubinstein-Taybi Syndrome). The 388-gene preeclampsia meta-signature offers a vital starting point for further studies into the relevance of these genes (in particular CREBBP/EP300) and their concomitant pathways as biomarkers or functional molecules in preeclampsia. This will result in a better understanding of the molecular basis of this disease and opens up the opportunity to develop rational therapies targeting the placental dysfunction causal to preeclampsia.
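
    The inverse-variance random-effects pooling described above can be sketched for a single gene. The DerSimonian-Laird estimator of between-study variance is used here as an assumption, since the abstract does not name the estimator; the effect sizes and variances are hypothetical.

```python
def random_effects_pool(effects, variances):
    """Pool per-study effect sizes with an inverse-variance random-effects model.

    Uses the DerSimonian-Laird estimate of between-study variance (tau^2).
    Returns (pooled effect, standard error, tau^2).
    """
    k = len(effects)
    w = [1.0 / v for v in variances]                      # fixed-effect weights
    sw = sum(w)
    fixed = sum(wi * yi for wi, yi in zip(w, effects)) / sw
    q = sum(wi * (yi - fixed) ** 2 for wi, yi in zip(w, effects))  # Cochran's Q
    c = sw - sum(wi ** 2 for wi in w) / sw
    tau2 = max(0.0, (q - (k - 1)) / c)
    w_star = [1.0 / (v + tau2) for v in variances]        # random-effects weights
    pooled = sum(wi * yi for wi, yi in zip(w_star, effects)) / sum(w_star)
    se = (1.0 / sum(w_star)) ** 0.5
    return pooled, se, tau2


# Hypothetical per-study effect sizes for one gene, with their variances
effects = [0.2, 0.8, 0.5, 1.1]
variances = [0.04, 0.09, 0.05, 0.10]
pooled, se, tau2 = random_effects_pool(effects, variances)
print(f"pooled = {pooled:.3f}, SE = {se:.3f}, tau^2 = {tau2:.3f}")
```

    When the studies agree closely, tau^2 shrinks to zero and the result coincides with a fixed-effect inverse-variance average.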

  10. A probabilistic framework for microarray data analysis: fundamental probability models and statistical inference.

    PubMed

    Ogunnaike, Babatunde A; Gelmi, Claudio A; Edwards, Jeremy S

    2010-05-21

    Gene expression studies generate large quantities of data with the defining characteristic that the number of genes (whose expression profiles are to be determined) exceeds the number of available replicates by several orders of magnitude. Standard spot-by-spot analysis still seeks to extract useful information for each gene on the basis of the number of available replicates, and thus plays to the weakness of microarrays. On the other hand, because of the data volume, treating the entire data set as an ensemble, and developing theoretical distributions for these ensembles provides a framework that plays instead to the strength of microarrays. We present theoretical results that under reasonable assumptions, the distribution of microarray intensities follows the Gamma model, with the biological interpretations of the model parameters emerging naturally. We subsequently establish that for each microarray data set, the fractional intensities can be represented as a mixture of Beta densities, and develop a procedure for using these results to draw statistical inference regarding differential gene expression. We illustrate the results with experimental data from gene expression studies on Deinococcus radiodurans following DNA damage using cDNA microarrays. Copyright (c) 2010 Elsevier Ltd. All rights reserved.

  11. Comparative effects of conjugated linoleic acid (CLA) and linoleic acid (LA) on the oxidoreduction status in THP-1 macrophages.

    PubMed

    Rybicka, Marta; Stachowska, Ewa; Gutowska, Izabela; Parczewski, Miłosz; Baśkiewicz, Magdalena; Machaliński, Bogusław; Boroń-Kaczmarska, Anna; Chlubek, Dariusz

    2011-04-27

    The aim of this study was to investigate the effect of conjugated linoleic acids (CLAs) on macrophage reactive oxygen species synthesis and the activity and expression of antioxidant enzymes, catalase (Cat), glutathione peroxidase (GPx), and superoxide dismutase (SOD). The macrophages were obtained from the THP-1 monocytic cell line. Cells were incubated with the addition of cis-9,trans-11 CLA or trans-10,cis-12 CLA or linoleic acid. Reactive oxygen species (ROS) formation was estimated by flow cytometry. Enzyme activity was measured spectrophotometrically. The antioxidant enzyme mRNA expression was estimated by real-time reverse transcriptase polymerase chain reaction (RT-PCR). Statistical analysis was based on nonparametric statistical tests [Friedman analysis of variance (ANOVA) and Wilcoxon signed-rank test]. cis-9,trans-11 CLA significantly increased the activity of Cat, while trans-10,cis-12 CLA notably influenced GPx activity. Both isomers significantly decreased mRNA expression for Cat. Only trans-10,cis-12 CLA significantly influenced SOD-2 mRNA expression. The CLAs activate processes of ROS formation in macrophages. Adverse metabolic effects of each isomer were observed.

  12. Immunohistochemical study of p21 and Bcl-2 in leukoplakia, oral submucous fibrosis and oral squamous cell carcinoma.

    PubMed

    Sutariya, Rakesh V; Manjunatha, Bhari Sharanesha

    2016-11-01

    Oral squamous cell carcinoma (OSCC) results from genetic damage, leading to uncontrolled proliferation of damaged cells and cell death. In the course of its progression, visible changes take place at the cellular level (atypia) and, as a result, at the tissue level (epithelial dysplasia). The aim of the present study was to evaluate and compare the expression and intensity of p21 and Bcl-2 in leukoplakia, oral submucous fibrosis (OSMF) and oral squamous cell carcinoma. A total of 60 cases (30 cases of oral squamous cell carcinoma, 15 cases of oral submucous fibrosis and 15 cases of leukoplakia) were evaluated immunohistochemically for p21 and Bcl-2 expression. p21 showed positive expression in 13 (86.67%) of 15 cases of OSMF, 12 (80%) of 15 cases of leukoplakia and 24 (80%) of 30 cases of OSCC. Bcl-2 expression was positive in 13 (86.67%) cases of OSMF, all cases of leukoplakia and 25 (83.33%) cases of OSCC. No statistically significant difference in positive expression of p21 and Bcl-2 was noted between OSMF, leukoplakia and OSCC. Statistical comparison of the intensity of p21 expression in different grades of OSCC showed no significance. A statistically significant difference was found between the expression of Bcl-2 in moderately and poorly differentiated SCC. The intensity of p21 and Bcl-2 expression in different grades of OSCC indicates a key role in the progression of oral neoplasia.

  13. On some stochastic formulations and related statistical moments of pharmacokinetic models.

    PubMed

    Matis, J H; Wehrly, T E; Metzler, C M

    1983-02-01

    This paper presents the deterministic and stochastic model for a linear compartment system with constant coefficients, and it develops expressions for the mean residence times (MRT) and the variances of the residence times (VRT) for the stochastic model. The expressions are relatively simple computationally, involving primarily matrix inversion, and they are elegant mathematically, in avoiding eigenvalue analysis and the complex domain. The MRT and VRT provide a set of new meaningful response measures for pharmacokinetic analysis and they give added insight into the system kinetics. The new analysis is illustrated with an example involving the cholesterol turnover in rats.
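    The remark that the residence-time expressions involve primarily matrix inversion can be illustrated with a small sketch. It assumes the standard result that, for a linear compartment system dx/dt = Kx, the matrix of mean residence times is the negative inverse of K; the two-compartment layout and the rate constants below are hypothetical and are not taken from the paper's cholesterol example.

```python
def mean_residence_times(k10, k12, k21):
    """Mean residence times for a hypothetical two-compartment model.

    k10: elimination rate from compartment 1
    k12: transfer rate from compartment 1 to compartment 2
    k21: transfer rate from compartment 2 to compartment 1
    Returns M with M[i][j] = expected total time spent in compartment
    i+1 by a particle that starts in compartment j+1 (M = -K^{-1}).
    """
    # Rate matrix K for dx/dt = K x (columns index the source compartment)
    a, b = -(k10 + k12), k21
    c, d = k12, -k21
    det = a * d - b * c
    if det == 0:
        raise ValueError("singular rate matrix: no elimination pathway")
    # Closed-form inverse of a 2x2 matrix, then negate every entry
    inv = [[d / det, -b / det], [-c / det, a / det]]
    return [[-inv[0][0], -inv[0][1]], [-inv[1][0], -inv[1][1]]]
```

    For larger systems the same idea applies with a general matrix inverse; no eigenvalue decomposition is needed, which matches the point made in the abstract.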

  14. Phosphorylated neurofilament heavy: A potential blood biomarker to evaluate the severity of acute spinal cord injuries in adults

    PubMed Central

    Singh, Ajai; Kumar, Vineet; Ali, Sabir; Mahdi, Abbas Ali; Srivastava, Rajeshwer Nath

    2017-01-01

    Aims: The aim of this study is to analyze the serial estimation of phosphorylated neurofilament heavy (pNF-H) in blood plasma that would act as a potential biomarker for early prediction of the neurological severity of acute spinal cord injuries (SCI) in adults. Settings and Design: Pilot study/observational study. Subjects and Methods: A total of 40 patients (28 cases and 12 controls) of spine injury were included in this study. In the enrolled cases, plasma level of pNF-H was evaluated in blood samples and neurological evaluation was performed by the American Spinal Injury Association Injury Scale at specified period. Serial plasma neurofilament heavy values were then correlated with the neurological status of these patients during follow-up visits and were analyzed statistically. Statistical Analysis Used: Statistical analysis was performed using GraphPad InStat software (version 3.05 for Windows, San Diego, CA, USA). The correlation analysis between the clinical progression and pNF-H expression was done using Spearman's correlation. Results: The mean baseline level of pNF-H in cases was 6.40 ± 2.49 ng/ml, whereas in controls it was 0.54 ± 0.27 ng/ml. On analyzing the association between the two by Mann–Whitney U–test, the difference in levels was found to be statistically significant. The association between the neurological progression and pNF-H expression was determined using correlation analysis (Spearman's correlation). At 95% confidence interval, the correlation coefficient was found to be 0.64, and the correlation was statistically significant. Conclusions: Plasma pNF-H levels were elevated in accordance with the severity of SCI. Therefore, pNF-H may be considered as a potential biomarker to determine early the severity of SCI in adult patients. PMID:29291173

  15. Genetic architecture of wood properties based on association analysis and co-expression networks in white spruce.

    PubMed

    Lamara, Mebarek; Raherison, Elie; Lenz, Patrick; Beaulieu, Jean; Bousquet, Jean; MacKay, John

    2016-04-01

    Association studies are widely utilized to analyze complex traits but their ability to disclose genetic architectures is often limited by statistical constraints, and functional insights are usually minimal in nonmodel organisms like forest trees. We developed an approach to integrate association mapping results with co-expression networks. We tested single nucleotide polymorphisms (SNPs) in 2652 candidate genes for statistical associations with wood density, stiffness, microfibril angle and ring width in a population of 1694 white spruce trees (Picea glauca). Association mapping identified 229-292 genes per wood trait using a statistical significance level of P < 0.05 to maximize discovery. Over-representation of genes associated for nearly all traits was found in a xylem preferential co-expression group developed in independent experiments. A xylem co-expression network was reconstructed with 180 wood associated genes and several known MYB and NAC regulators were identified as network hubs. The network revealed a link between the gene PgNAC8, wood stiffness and microfibril angle, as well as considerable within-season variation for both genetic control of wood traits and gene expression. Trait associations were distributed throughout the network suggesting complex interactions and pleiotropic effects. Our findings indicate that integration of association mapping and co-expression networks enhances our understanding of complex wood traits. © 2015 The Authors. New Phytologist © 2015 New Phytologist Trust.

  16. Functional genomics annotation of a statistical epistasis network associated with bladder cancer susceptibility.

    PubMed

    Hu, Ting; Pan, Qinxin; Andrew, Angeline S; Langer, Jillian M; Cole, Michael D; Tomlinson, Craig R; Karagas, Margaret R; Moore, Jason H

    2014-04-11

    Several different genetic and environmental factors have been identified as independent risk factors for bladder cancer in population-based studies. Recent studies have turned to understanding the role of gene-gene and gene-environment interactions in determining risk. We previously developed the bioinformatics framework of statistical epistasis networks (SEN) to characterize the global structure of interacting genetic factors associated with a particular disease or clinical outcome. By applying SEN to a population-based study of bladder cancer among Caucasians in New Hampshire, we were able to identify a set of connected genetic factors with strong and significant interaction effects on bladder cancer susceptibility. To support our statistical findings using networks, in the present study, we performed pathway enrichment analyses on the set of genes identified using SEN, and found that they are associated with the carcinogen benzo[a]pyrene, a component of tobacco smoke. We further carried out an mRNA expression microarray experiment to validate statistical genetic interactions, and to determine if the set of genes identified in the SEN were differentially expressed in a normal bladder cell line and a bladder cancer cell line in the presence or absence of benzo[a]pyrene. Significant nonrandom sets of genes from the SEN were found to be differentially expressed in response to benzo[a]pyrene in both the normal bladder cells and the bladder cancer cells. In addition, the patterns of gene expression were significantly different between these two cell types. The enrichment analyses and the gene expression microarray results support the idea that SEN analysis of bladder in population-based studies is able to identify biologically meaningful statistical patterns. These results bring us a step closer to a systems genetic approach to understanding cancer susceptibility that integrates population and laboratory-based studies.

  17. Energy-density field approach for low- and medium-frequency vibroacoustic analysis of complex structures using a statistical computational model

    NASA Astrophysics Data System (ADS)

    Kassem, M.; Soize, C.; Gagliardini, L.

    2009-06-01

    In this paper, an energy-density field approach applied to the vibroacoustic analysis of complex industrial structures in the low- and medium-frequency ranges is presented. This approach uses a statistical computational model. The analyzed system consists of an automotive vehicle structure coupled with its internal acoustic cavity. The objective of this paper is to make use of the statistical properties of the frequency response functions of the vibroacoustic system observed from previous experimental and numerical work. The frequency response functions are expressed in terms of a dimensionless matrix which is estimated using the proposed energy approach. Using this dimensionless matrix, a simplified vibroacoustic model is proposed.

  18. pcr: an R package for quality assessment, analysis and testing of qPCR data

    PubMed Central

    Ahmed, Mahmoud

    2018-01-01

    Background: Real-time quantitative PCR (qPCR) is a broadly used technique in biomedical research. Currently, a few different analysis models are used to determine the quality of data and to quantify the mRNA level across the experimental conditions. Methods: We developed an R package to implement methods for quality assessment, analysis and testing of qPCR data for statistical significance. Double Delta CT and standard curve models were implemented to quantify the relative expression of target genes from CT in standard qPCR control-group experiments. In addition, calculation of amplification efficiency and curves from serial dilution qPCR experiments are used to assess the quality of the data. Finally, two-group testing and linear models were used to test for significance of the difference in expression between control groups and conditions of interest. Results: Using two datasets from qPCR experiments, we applied different quality assessment, analysis and statistical testing in the pcr package and compared the results to the original published articles. The final relative expression values from the different models, as well as the intermediary outputs, were checked against the expected results in the original papers and were found to be accurate and reliable. Conclusion: The pcr package provides an intuitive and unified interface for its main functions to allow biologists to perform all necessary steps of qPCR analysis and produce graphs in a uniform way. PMID:29576953
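    The Double Delta CT model named in this abstract can be sketched as follows. This is an illustrative calculation only, written in Python rather than R; it does not reproduce the pcr package's interface, the function and argument names are hypothetical, and it assumes roughly 100% amplification efficiency (a doubling per cycle).

```python
from statistics import mean

def double_delta_ct(target_control, ref_control, target_treated, ref_treated):
    """Relative expression of a target gene by the double delta CT model.

    Each argument is a list of replicate CT values (hypothetical inputs).
    Assumes perfect amplification efficiency, so one cycle = twofold change.
    """
    # Delta CT: normalize the target gene to the reference gene per group
    delta_control = mean(target_control) - mean(ref_control)
    delta_treated = mean(target_treated) - mean(ref_treated)
    # Double delta CT: normalized treated group relative to control group
    ddct = delta_treated - delta_control
    # Lower CT means more template, hence the negative exponent
    return 2.0 ** (-ddct)
```

    For example, a target gene that crosses threshold two cycles earlier in the treated group, with an unchanged reference gene, gives a fourfold relative expression.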

  19. Similarity of markers identified from cancer gene expression studies: observations from GEO.

    PubMed

    Shi, Xingjie; Shen, Shihao; Liu, Jin; Huang, Jian; Zhou, Yong; Ma, Shuangge

    2014-09-01

    Gene expression profiling has been extensively conducted in cancer research. The analysis of multiple independent cancer gene expression datasets may provide additional information and complement single-dataset analysis. In this study, we conduct multi-dataset analysis and are interested in evaluating the similarity of cancer-associated genes identified from different datasets. The first objective of this study is to briefly review some statistical methods that can be used for such evaluation. Both marginal analysis and joint analysis methods are reviewed. The second objective is to apply those methods to 26 Gene Expression Omnibus (GEO) datasets on five types of cancers. Our analysis suggests that for the same cancer, the marker identification results may vary significantly across datasets, and different datasets share few common genes. In addition, datasets on different cancers share few common genes. The shared genetic basis of datasets on the same or different cancers, which has been suggested in the literature, is not observed in the analysis of GEO data. © The Author 2013. Published by Oxford University Press. For Permissions, please email: journals.permissions@oup.com.

  20. Genome-scale cluster analysis of replicated microarrays using shrinkage correlation coefficient.

    PubMed

    Yao, Jianchao; Chang, Chunqi; Salmi, Mari L; Hung, Yeung Sam; Loraine, Ann; Roux, Stanley J

    2008-06-18

    Currently, clustering with some form of correlation coefficient as the gene similarity metric has become a popular method for profiling genomic data. The Pearson correlation coefficient and the standard deviation (SD)-weighted correlation coefficient are the two most widely-used correlations as the similarity metrics in clustering microarray data. However, these two correlations are not optimal for analyzing replicated microarray data generated by most laboratories. An effective correlation coefficient is needed to provide statistically sufficient analysis of replicated microarray data. In this study, we describe a novel correlation coefficient, shrinkage correlation coefficient (SCC), that fully exploits the similarity between the replicated microarray experimental samples. The methodology considers both the number of replicates and the variance within each experimental group in clustering expression data, and provides a robust statistical estimation of the error of replicated microarray data. The value of SCC is revealed by its comparison with two other correlation coefficients that are currently the most widely-used (Pearson correlation coefficient and SD-weighted correlation coefficient) using statistical measures on both synthetic expression data as well as real gene expression data from Saccharomyces cerevisiae. Two leading clustering methods, hierarchical and k-means clustering were applied for the comparison. The comparison indicated that using SCC achieves better clustering performance. Applying SCC-based hierarchical clustering to the replicated microarray data obtained from germinating spores of the fern Ceratopteris richardii, we discovered two clusters of genes with shared expression patterns during spore germination. Functional analysis suggested that some of the genetic mechanisms that control germination in such diverse plant lineages as mosses and angiosperms are also conserved among ferns. 
    This study shows that SCC is an alternative to the Pearson correlation coefficient and the SD-weighted correlation coefficient, and is particularly useful for clustering replicated microarray data. This computational approach should be generally useful for proteomic data or other high-throughput analysis methodology.

  1. CORNAS: coverage-dependent RNA-Seq analysis of gene expression data without biological replicates.

    PubMed

    Low, Joel Z B; Khang, Tsung Fei; Tammi, Martti T

    2017-12-28

    In current statistical methods for calling differentially expressed genes in RNA-Seq experiments, the assumption is that an adjusted observed gene count represents an unknown true gene count. This adjustment usually consists of a normalization step to account for heterogeneous sample library sizes, and then the resulting normalized gene counts are used as input for parametric or non-parametric differential gene expression tests. A distribution of true gene counts, each with a different probability, can result in the same observed gene count. Importantly, sequencing coverage information is currently not explicitly incorporated into any of the statistical models used for RNA-Seq analysis. We developed a fast Bayesian method which uses the sequencing coverage information determined from the concentration of an RNA sample to estimate the posterior distribution of a true gene count. Our method has better or comparable performance compared to NOISeq and GFOLD, according to the results from simulations and experiments with real unreplicated data. We incorporated a previously unused sequencing coverage parameter into a procedure for differential gene expression analysis with RNA-Seq data. Our results suggest that our method can be used to overcome analytical bottlenecks in experiments with limited number of replicates and low sequencing coverage. The method is implemented in CORNAS (Coverage-dependent RNA-Seq), and is available at https://github.com/joel-lzb/CORNAS .

  2. Immunohistochemical Analysis of ATRX, IDH1 and p53 in Glioblastoma and Their Correlations with Patient Survival

    PubMed Central

    2016-01-01

    Glioblastoma (GBM) can be classified into molecular subgroups on the basis of biomarker expression. Here, we classified our cohort of 163 adult GBMs into molecular subgroups according to the expression of proteins encoded by the genes alpha thalassemia/mental retardation syndrome X-linked (ATRX), isocitrate dehydrogenase (IDH) and TP53. We focused on the survival rate of the molecular subgroups, depending on each of these biomarkers and on various combinations of them. ATRX, IDH1 and p53 protein expression was evaluated immunohistochemically, and Kaplan-Meier analysis was carried out in each group. A total of 15.3% of enrolled GBMs demonstrated loss of ATRX expression (ATRX-), 10.4% expressed an aberrant IDH1 R132H protein (IDH1+), and 48.4% exhibited p53 overexpression (p53+). Survival differences were statistically significant when single protein expression or different combinations of expression of these proteins were analyzed. In conclusion, in the case of single protein expression, patients with IDH1+, ATRX-, or p53- GBMs showed better survival than patients with GBMs of the opposite expression status. In the case of double protein pairs, patients with ATRX-/p53-, ATRX-/IDH1+, and IDH1+/p53- GBMs showed better survival than patients with GBMs with the remaining pairs. In the case of triple protein combinations, patients with ATRX-/p53-/IDH+ GBMs showed a statistically significant survival gain over patients with the remaining combinations of protein expression status. Therefore, these three biomarkers, individually and in combination, can stratify GBMs into prognostically relevant subgroups and have strong prognostic value in adult GBMs. PMID:27478330

  3. Developing a Zebrafish Model of NF1 for Structure-Function Analysis and Identification of Modifier Genes

    DTIC Science & Technology

    2010-04-01

    equipped with a spinning-disc confocal system (Yokogawa) was used. The statistical significance of changes to OPC cell numbers and migration upon nf1...that they are expressed in overlapping tissues. We examined the expression of both genes by whole mount in situ hybridization between the 4-cell stage...sorted cells confirmed expression, particularly in the vascular endothelium (Figure 4E-G), while RNA from 1-cell embryos indicate that both genes are

  4. Tightly Regulated Expression of Autographa californica Multicapsid Nucleopolyhedrovirus Immediate Early Genes Emerges from Their Interactions and Possible Collective Behaviors

    PubMed Central

    Taka, Hitomi; Asano, Shin-ichiro; Matsuura, Yoshiharu; Bando, Hisanori

    2015-01-01

    To infect their hosts, DNA viruses must successfully initiate the expression of viral genes that control subsequent viral gene expression and manipulate the host environment. Viral genes that are immediately expressed upon infection play critical roles in the early infection process. In this study, we investigated the expression and regulation of five canonical regulatory immediate-early (IE) genes of Autographa californica multicapsid nucleopolyhedrovirus: ie0, ie1, ie2, me53, and pe38. A systematic transient gene-expression analysis revealed that these IE genes are generally transactivators, suggesting the existence of a highly interactive regulatory network. A genetic analysis using gene knockout viruses demonstrated that the expression of these IE genes was tolerant to the single deletions of activator IE genes in the early stage of infection. A network graph analysis on the regulatory relationships observed in the transient expression analysis suggested that the robustness of IE gene expression is due to the organization of the IE gene regulatory network and how each IE gene is activated. However, some regulatory relationships detected by the genetic analysis were contradictory to those observed in the transient expression analysis, especially for IE0-mediated regulation. Statistical modeling, combined with genetic analysis using knockout alleles for ie0 and ie1, showed that the repressor function of ie0 was due to the interaction between ie0 and ie1, not ie0 itself. Taken together, these systematic approaches provided insight into the topology and nature of the IE gene regulatory network. PMID:25816136

  5. RNA-Seq Mouse Brain Regions Expression Data Analysis: Focus on ApoE Functional Network

    PubMed

    Babenko, Vladimir N; Smagin, Dmitry A; Kudryavtseva, Natalia N

    2017-09-13

    ApoE expression status was proved to be a highly specific marker of energy metabolism rate in the brain. Along with its neighbor, Translocase of Outer Mitochondrial Membrane 40 kDa (TOMM40) which is involved in mitochondrial metabolism, the corresponding genomic region constitutes the neuroenergetic hotspot. Using RNA-Seq data from a murine model of chronic stress a significant positive expression coordination of seven neighboring genes in ApoE locus in five brain regions was observed. ApoE maintains one of the highest absolute expression values genome-wide, implying that ApoE can be the driver of the neighboring gene expression alteration observed under stressful loads. Notably, we revealed the highly statistically significant increase of ApoE expression in the hypothalamus of chronically aggressive (FDR < 0.007) and defeated (FDR < 0.001) mice compared to the control. Correlation analysis revealed a close association of ApoE and proopiomelanocortin (Pomc) gene expression profiles implying the putative neuroendocrine stress response background of ApoE expression elevation therein.

  6. Expression of Vascular Endothelial Growth Factor in Odontogenic Cysts: Is There Any Impression on Clinical Outcome?

    PubMed

    Sadri, Donia; Farhadi, Sareh; Shahabi, Zahra; Sarshar, Samaneh

    2016-01-01

    Recent scientific reports have shown that angiogenesis can affect the biological behavior of pathologic lesions. Given the unique clinical outcome of odontogenic keratocyst (OKC), the present study aimed to compare angiogenesis in odontogenic keratocyst and dentigerous cyst (DC). In this experimental study, tissue sections of 46 samples of OKC and DC were stained by an immunohistochemical method using Vascular Endothelial Growth Factor (VEGF) antibody. VEGF expression was evaluated in epithelial cells, fibroblasts and endothelial cells. The average percentage of stained cells in each sample was categorized into 3 groups as follows: SCORE 0: 10% of cells or less are positive. SCORE 1: 10 to 50% of cells are positive. SCORE 2: more than 50% of cells are positive. The Mann-Whitney U test, t-test and chi-square test were used for statistical analysis. The average VEGF expression was 20.2% in 24 samples of DC and 52.6% in 22 samples of OKC. The average VEGF expression differed significantly between the two cysts (PV= 0.045). There was a statistically significant difference between the two cysts in terms of VEGF SCORE (PV= 0.000). OKC samples had a significantly higher VEGF SCORE than DC. There was no difference in VEGF expression in epithelial cells between the two cysts (PV= 0.268), but there was a statistically significant difference between the two cysts in terms of endothelial cell staining. Endothelial cell staining was significantly higher in OKC than in DC (PV= 0.037%). Given the higher expression of VEGF in OKC than in DC, it seems that angiogenesis may have a strong influence on the clinical outcome of OKC.

  7. Anger Expression Types and Interpersonal Problems in Nurses.

    PubMed

    Han, Aekyung; Won, Jongsoon; Kim, Oksoo; Lee, Sang E

    2015-06-01

    The purpose of this study was to investigate the anger expression types in nurses and to analyze the differences between the anger expression types and interpersonal problems. The data were collected from 149 nurses working in general hospitals with 300 beds or more in Seoul or Gyeonggi province, Korea. For anger expression type, the anger expression scale from the Korean State-Trait Anger Expression Inventory was used. For interpersonal problems, the short form of the Korean Inventory of Interpersonal Problems Circumplex Scales was used. Data were analyzed using descriptive statistics, cluster analysis, multivariate analysis of variance, and Duncan's multiple comparisons test. Three anger expression types in nurses were found: low-anger expression, anger-in, and anger-in/control type. From the results of multivariate analysis of variance, there were significant differences between anger expression types and interpersonal problems (Wilks lambda F = 3.52, p < .001). Additionally, anger-in/control type was found to have the most difficulty with interpersonal problems by Duncan's post hoc test (p < .050). Based on this research, the development of an anger expression intervention program for nurses is recommended to establish the means of expressing suppressed emotions, which would help nurses experience fewer interpersonal problems. Copyright © 2015. Published by Elsevier B.V.

  8. ExAtlas: An interactive online tool for meta-analysis of gene expression data.

    PubMed

    Sharov, Alexei A; Schlessinger, David; Ko, Minoru S H

    2015-12-01

    We have developed ExAtlas, an on-line software tool for meta-analysis and visualization of gene expression data. In contrast to existing software tools, ExAtlas compares multi-component data sets and generates results for all combinations (e.g. all gene expression profiles versus all Gene Ontology annotations). ExAtlas handles both users' own data and data extracted semi-automatically from the public repository (GEO/NCBI database). ExAtlas provides a variety of tools for meta-analyses: (1) standard meta-analysis (fixed effects, random effects, z-score, and Fisher's methods); (2) analyses of global correlations between gene expression data sets; (3) gene set enrichment; (4) gene set overlap; (5) gene association by expression profile; (6) gene specificity; and (7) statistical analysis (ANOVA, pairwise comparison, and PCA). ExAtlas produces graphical outputs, including heatmaps, scatter-plots, bar-charts, and three-dimensional images. Some of the most widely used public data sets (e.g. GNF/BioGPS, Gene Ontology, KEGG, GAD phenotypes, BrainScan, ENCODE ChIP-seq, and protein-protein interaction) are pre-loaded and can be used for functional annotations.
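    Of the standard meta-analysis methods listed in this abstract, Fisher's method is the simplest to show in code. The sketch below is a generic textbook version and not ExAtlas's implementation; the function name is hypothetical, and it assumes independent p-values that are all greater than zero.

```python
import math

def fisher_combined_pvalue(pvalues):
    """Combine independent p-values with Fisher's method.

    The statistic X = -2 * sum(ln p_i) follows a chi-square distribution
    with 2k degrees of freedom under the null hypothesis. For even degrees
    of freedom the upper-tail probability has a closed form, so no
    statistics library is needed.
    """
    k = len(pvalues)
    if k == 0 or any(p <= 0.0 or p > 1.0 for p in pvalues):
        raise ValueError("p-values must lie in (0, 1]")
    half_x = -sum(math.log(p) for p in pvalues)  # this is X / 2
    # Chi-square survival function for 2k degrees of freedom:
    # exp(-x/2) * sum_{i=0}^{k-1} (x/2)^i / i!
    term = 1.0
    total = 1.0
    for i in range(1, k):
        term *= half_x / i
        total += term
    return math.exp(-half_x) * total
```

    A single p-value is returned unchanged, which is a quick sanity check on the formula; several moderately small p-values combine into a smaller one.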

  9. A consistent framework for Horton regression statistics that leads to a modified Hack's law

    USGS Publications Warehouse

    Furey, P.R.; Troutman, B.M.

    2008-01-01

    A statistical framework is introduced that resolves important problems with the interpretation and use of traditional Horton regression statistics. The framework is based on a univariate regression model that leads to an alternative expression for Horton ratio, connects Horton regression statistics to distributional simple scaling, and improves the accuracy in estimating Horton plot parameters. The model is used to examine data for drainage area A and mainstream length L from two groups of basins located in different physiographic settings. Results show that confidence intervals for the Horton plot regression statistics are quite wide. Nonetheless, an analysis of covariance shows that regression intercepts, but not regression slopes, can be used to distinguish between basin groups. The univariate model is generalized to include n > 1 dependent variables. For the case where the dependent variables represent ln A and ln L, the generalized model performs somewhat better at distinguishing between basin groups than two separate univariate models. The generalized model leads to a modification of Hack's law where L depends on both A and Strahler order ω. Data show that ω plays a statistically significant role in the modified Hack's law expression. © 2008 Elsevier B.V.

  10. Chicken ovalbumin upstream promoter-transcription factor II regulates nuclear receptor, myogenic, and metabolic gene expression in skeletal muscle cells.

    PubMed

    Crowther, Lisa M; Wang, Shu-Ching Mary; Eriksson, Natalie A; Myers, Stephen A; Murray, Lauren A; Muscat, George E O

    2011-02-24

    We demonstrate that chicken ovalbumin upstream promoter-transcription factor II (COUP-TFII) mRNA is more abundantly expressed (than COUP-TFI mRNA) in skeletal muscle C2C12 cells and in (type I and II) skeletal muscle tissue from C57BL/10 mice. Consequently, we have utilized the ABI TaqMan Low Density Array (TLDA) platform to analyze gene expression changes specifically attributable to ectopic COUP-TFII (relative to vector only) expression in muscle cells. Utilizing a TLDA-based platform and 5 internal controls, we analyzed the entire NR superfamily, 96 critical metabolic genes, and 48 important myogenic regulatory genes. The low density arrays were analyzed by rigorous statistical analysis (with geNorm normalization, Bioconductor R, and the Empirical Bayes statistic) using the Integromics StatMiner software. In addition, we validated the differentially expressed patho-physiologically relevant gene (identified on the TLDA platform) glucose transporter type 4 (Glut4). We demonstrated that COUP-TFII expression increased the steady state levels of Glut4 mRNA and protein, while ectopic expression of truncated COUP-TFII lacking helix 12 (COUP-TFΔH12) reduced Glut4 mRNA expression in C2C12 cells. Moreover, COUP-TFII expression trans-activated the Glut4 promoter (-997/+3), and ChIP analysis identified selective recruitment of COUP-TFII to a region encompassing a highly conserved SP1 binding site (in mouse, rat, and human) at nt positions -131/-118. Mutation of the SP1 site ablated COUP-TFII mediated trans-activation of the Glut4 promoter. In conclusion, this study demonstrates that in skeletal muscle cells, COUP-TFII regulates several nuclear hormone receptors, and critical metabolic and muscle specific genes.

  11. A database application for pre-processing, storage and comparison of mass spectra derived from patients and controls

    PubMed Central

    Titulaer, Mark K; Siccama, Ivar; Dekker, Lennard J; van Rijswijk, Angelique LCT; Heeren, Ron MA; Sillevis Smitt, Peter A; Luider, Theo M

    2006-01-01

    Background Statistical comparison of peptide profiles in biomarker discovery requires fast, user-friendly software for high throughput data analysis. Important features are flexibility in changing input variables and statistical analysis of peptides that are differentially expressed between patient and control groups. In addition, integration of the mass spectrometry data with the results of other experiments, such as microarray analysis, and information from other databases requires a central storage of the profile matrix, where protein IDs can be added to peptide masses of interest. Results A new database application is presented to detect and identify significantly differentially expressed peptides in peptide profiles obtained from body fluids of patient and control groups. The presented modular software is capable of central storage of mass spectra and results in fast analysis. The software architecture consists of 4 pillars: 1) a Graphical User Interface written in Java, 2) a MySQL database, which contains all metadata, such as experiment numbers and sample codes, 3) an FTP (File Transfer Protocol) server to store all raw mass spectrometry files and processed data, and 4) the software package R, which is used for modular statistical calculations, such as the Wilcoxon-Mann-Whitney rank sum test. Statistical analysis by the Wilcoxon-Mann-Whitney test in R demonstrates that peptide profiles of two patient groups, 1) breast cancer patients with leptomeningeal metastases and 2) prostate cancer patients in end stage disease, can be distinguished from those of control groups. Conclusion The database application is capable of distinguishing patient Matrix Assisted Laser Desorption Ionization (MALDI-TOF) peptide profiles from those of control groups using large datasets. The modular architecture of the application makes it possible to adapt the application to also handle large datasets from MS/MS and Fourier Transform Ion Cyclotron Resonance (FT-ICR) mass spectrometry experiments. It is expected that the higher resolution and mass accuracy of FT-ICR mass spectrometry will prevent the clustering of peaks of different peptides and allow the identification of differentially expressed proteins from the peptide profiles. PMID:16953879
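
    The Wilcoxon-Mann-Whitney test mentioned in this abstract can be illustrated with a short sketch. The authors ran the test in R; the Python below is only an illustration with hypothetical function names. It assumes two independent samples (for example, one peptide's intensity in patients and in controls) and uses the large-sample normal approximation without tie or continuity correction, so it is not suitable for very small groups.

```python
import math


def mann_whitney_u(sample_a, sample_b):
    """U statistic for sample_a: number of pairs (x, y) with x > y, ties counted as 0.5."""
    u = 0.0
    for x in sample_a:
        for y in sample_b:
            if x > y:
                u += 1.0
            elif x == y:
                u += 0.5
    return u


def mann_whitney_p_normal(sample_a, sample_b):
    """Two-sided p-value from the normal approximation to U (no tie correction)."""
    n1, n2 = len(sample_a), len(sample_b)
    u = mann_whitney_u(sample_a, sample_b)
    mean_u = n1 * n2 / 2.0
    sd_u = math.sqrt(n1 * n2 * (n1 + n2 + 1) / 12.0)
    z = (u - mean_u) / sd_u
    return math.erfc(abs(z) / math.sqrt(2.0))
```

    Applied per peptide mass across the profile matrix, a test of this kind yields one p-value per peptide, which would then need multiple-testing correction.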

  12. A database application for pre-processing, storage and comparison of mass spectra derived from patients and controls.

    PubMed

    Titulaer, Mark K; Siccama, Ivar; Dekker, Lennard J; van Rijswijk, Angelique L C T; Heeren, Ron M A; Sillevis Smitt, Peter A; Luider, Theo M

    2006-09-05

    Statistical comparison of peptide profiles in biomarker discovery requires fast, user-friendly software for high throughput data analysis. Important features are flexibility in changing input variables and statistical analysis of peptides that are differentially expressed between patient and control groups. In addition, integration of the mass spectrometry data with the results of other experiments, such as microarray analysis, and information from other databases requires a central storage of the profile matrix, where protein IDs can be added to peptide masses of interest. A new database application is presented to detect and identify significantly differentially expressed peptides in peptide profiles obtained from body fluids of patient and control groups. The presented modular software is capable of central storage of mass spectra and results in fast analysis. The software architecture consists of 4 pillars: 1) a Graphical User Interface written in Java, 2) a MySQL database, which contains all metadata, such as experiment numbers and sample codes, 3) an FTP (File Transfer Protocol) server to store all raw mass spectrometry files and processed data, and 4) the software package R, which is used for modular statistical calculations, such as the Wilcoxon-Mann-Whitney rank sum test. Statistical analysis by the Wilcoxon-Mann-Whitney test in R demonstrates that peptide profiles of two patient groups, 1) breast cancer patients with leptomeningeal metastases and 2) prostate cancer patients in end stage disease, can be distinguished from those of control groups. The database application is capable of distinguishing patient Matrix Assisted Laser Desorption Ionization (MALDI-TOF) peptide profiles from those of control groups using large datasets. The modular architecture of the application makes it possible to adapt the application to also handle large datasets from MS/MS and Fourier Transform Ion Cyclotron Resonance (FT-ICR) mass spectrometry experiments. It is expected that the higher resolution and mass accuracy of FT-ICR mass spectrometry will prevent the clustering of peaks of different peptides and allow the identification of differentially expressed proteins from the peptide profiles.

  13. Identification of Reference Genes for Quantitative Gene Expression Studies in a Non-Model Tree Pistachio (Pistacia vera L.)

    PubMed Central

    Moazzam Jazi, Maryam; Ghadirzadeh Khorzoghi, Effat; Botanga, Christopher; Seyedi, Seyed Mahdi

    2016-01-01

    The tree species, Pistacia vera (P. vera) is an important commercial product that is salt-tolerant and long-lived, with a possible lifespan of over one thousand years. Gene expression analysis is an efficient method to explore the possible regulatory mechanisms underlying these characteristics. Therefore, having the most suitable set of reference genes is required for transcript level normalization under different conditions in P. vera. In the present study, we selected eight widely used reference genes, ACT, EF1α, α-TUB, β-TUB, GAPDH, CYP2, UBQ10, and 18S rRNA. Using qRT-PCR their expression was assessed in 54 different samples of three cultivars of P. vera. The samples were collected from different organs under various abiotic treatments (cold, drought, and salt) across three time points. Several statistical programs (geNorm, NormFinder, and BestKeeper) were applied to estimate the expression stability of candidate reference genes. Results obtained from the statistical analysis were then exposed to Rank aggregation package to generate a consensus gene rank. Based on our results, EF1α was found to be the superior reference gene in all samples under all abiotic treatments. In addition to EF1α, ACT and β-TUB were the second best reference genes for gene expression analysis in leaf and root. We recommended β-TUB as the second most stable gene for samples under the cold and drought treatments, while ACT holds the same position in samples analyzed under salt treatment. This report will benefit future research on the expression profiling of P. vera and other members of the Anacardiaceae family. PMID:27308855
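
    As a rough illustration of how a geNorm-style stability measure ranks candidate reference genes, the sketch below computes an M value per gene: the mean, over all other candidates, of the standard deviation of the log2 expression ratio across samples. Lower M indicates more stable expression. This is a simplified sketch with a hypothetical function name, not the geNorm program itself, and it assumes the input is relative quantities on a linear scale (not raw Cq values) with at least two genes and two samples.

```python
import math
import statistics


def genorm_m_values(expression):
    """Return {gene: M} for a dict mapping gene name -> list of relative quantities.

    Each list holds one value per sample, in the same sample order for every gene.
    """
    genes = list(expression)
    m_values = {}
    for gene in genes:
        pairwise_variation = []
        for other in genes:
            if other == gene:
                continue
            log_ratios = [
                math.log2(a / b)
                for a, b in zip(expression[gene], expression[other])
            ]
            # Pairwise variation: spread of the log ratio across samples.
            pairwise_variation.append(statistics.stdev(log_ratios))
        m_values[gene] = sum(pairwise_variation) / len(pairwise_variation)
    return m_values
```

    Two genes whose ratio is constant across samples contribute zero pairwise variation to each other, so a gene that fluctuates independently of the rest ends up with the highest M.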

  14. Identification of Reference Genes for Quantitative Gene Expression Studies in a Non-Model Tree Pistachio (Pistacia vera L.).

    PubMed

    Moazzam Jazi, Maryam; Ghadirzadeh Khorzoghi, Effat; Botanga, Christopher; Seyedi, Seyed Mahdi

    2016-01-01

    The tree species, Pistacia vera (P. vera) is an important commercial product that is salt-tolerant and long-lived, with a possible lifespan of over one thousand years. Gene expression analysis is an efficient method to explore the possible regulatory mechanisms underlying these characteristics. Therefore, having the most suitable set of reference genes is required for transcript level normalization under different conditions in P. vera. In the present study, we selected eight widely used reference genes, ACT, EF1α, α-TUB, β-TUB, GAPDH, CYP2, UBQ10, and 18S rRNA. Using qRT-PCR their expression was assessed in 54 different samples of three cultivars of P. vera. The samples were collected from different organs under various abiotic treatments (cold, drought, and salt) across three time points. Several statistical programs (geNorm, NormFinder, and BestKeeper) were applied to estimate the expression stability of candidate reference genes. Results obtained from the statistical analysis were then exposed to Rank aggregation package to generate a consensus gene rank. Based on our results, EF1α was found to be the superior reference gene in all samples under all abiotic treatments. In addition to EF1α, ACT and β-TUB were the second best reference genes for gene expression analysis in leaf and root. We recommended β-TUB as the second most stable gene for samples under the cold and drought treatments, while ACT holds the same position in samples analyzed under salt treatment. This report will benefit future research on the expression profiling of P. vera and other members of the Anacardiaceae family.

  15. High CTLA-4 expression correlates with poor prognosis in thymoma patients

    PubMed Central

    Santoni, Giorgio; Amantini, Consuelo; Morelli, Maria Beatrice; Tomassoni, Daniele; Santoni, Matteo; Marinelli, Oliviero; Nabissi, Massimo; Cardinali, Claudio; Paolucci, Vittorio; Torniai, Mariangela; Rinaldi, Silvia; Morgese, Francesca; Bernardini, Giovanni; Berardi, Rossana

    2018-01-01

    Thymomas, tumors that arise from epithelial cells of the thymus gland, are the most common neoplasms of the anterior mediastinum, with an incidence rate of approximately 2.5 per million/year. Cytotoxic T Lymphocyte Antigen 4 (CTLA-4 or CD152) exerts inhibitory activity on T cells, and because of its oncogenic role in the progression of different types of tumors, it has emerged as a potential therapeutic target in cancer patients. In this study, we assessed the expression of CTLA-4 both at mRNA and protein levels in paraffin-embedded tissues from patients with thymomas. Furthermore, we evaluated the relationship between CTLA-4 expression and the clinical-pathologic characteristics and prognosis in patients with thymomas. Sixty-eight patients with a median age of 62 years were included in this analysis. Thymomas were classified according to the WHO and Masaoka-Koga systems for histochemical analysis and for prognostic significance. A statistical difference was found between CTLA-4 mRNA levels in human normal thymus compared with thymoma specimens. CTLA-4 expression was statistically found to progressively increase in A, B1, B2, AB and it was maximal in B3 thymomas. According to Masaoka-Koga pathological classification, CTLA-4 expression was lower in I, IIA and IIB, and higher in invasive III and IV stages. By confocal microscopy analysis we identified the expression of CTLA-4 both in tumor cells and in CD45+ tumor-infiltrating leukocytes, mainly in B3 and AB thymomas. Finally, CTLA-4 overexpression significantly correlates with reduced overall survival in thymoma patients and in atypical thymoma subgroup, suggesting that it represents a negative prognostic factor. PMID:29682176

  16. Do Deregulated Cas Proteins Induce Genomic Instability in Early-Stage Ovarian Cancer

    DTIC Science & Technology

    2006-12-01

    use Western blot analysis of tumor lysates to correlate expression of HEF1, p130Cas, Aurora A, and phospho-Aurora A. This analysis is in progress. In...and importantly, evaluated a number of different detection/image analysis systems to ensure reproducible quantitative results. We have used a pilot...reproducible Interestingly, preliminary statistical analysis using Spearman and Pearson correlation indicates at least one striking correlation

  17. Method Designed to Respect Molecular Heterogeneity Can Profoundly Correct Present Data Interpretations for Genome-Wide Expression Analysis

    PubMed Central

    Chen, Chih-Hao; Hsu, Chueh-Lin; Huang, Shih-Hao; Chen, Shih-Yuan; Hung, Yi-Lin; Chen, Hsiao-Rong; Wu, Yu-Chung

    2015-01-01

    Although genome-wide expression analysis has become a routine tool for gaining insight into molecular mechanisms, extraction of information remains a major challenge. It has been unclear why standard statistical methods, such as the t-test and ANOVA, often lead to low levels of reproducibility, how likely applying fold-change cutoffs to enhance reproducibility is to miss key signals, and how adversely using such methods has affected data interpretations. We broadly examined expression data to investigate the reproducibility problem and discovered that molecular heterogeneity, a biological property of genetically different samples, has been improperly handled by the statistical methods. Here we give a mathematical description of the discovery and report the development of a statistical method, named HTA, for better handling molecular heterogeneity. We broadly demonstrate the improved sensitivity and specificity of HTA over the conventional methods and show that using fold-change cutoffs has lost much information. We illustrate the especial usefulness of HTA for heterogeneous diseases, by applying it to existing data sets of schizophrenia, bipolar disorder and Parkinson’s disease, and show it can abundantly and reproducibly uncover disease signatures not previously detectable. Based on 156 biological data sets, we estimate that the methodological issue has affected over 96% of expression studies and that HTA can profoundly correct 86% of the affected data interpretations. The methodological advancement can better facilitate systems understandings of biological processes, render biological inferences that are more reliable than they have hitherto been and engender translational medical applications, such as identifying diagnostic biomarkers and drug prediction, which are more robust. PMID:25793610

  18. Comparison of culture media for ex vivo cultivation of limbal epithelial progenitor cells

    PubMed Central

    Loureiro, Renata Ruoco; Cristovam, Priscila Cardoso; Martins, Caio Marques; Covre, Joyce Luciana; Sobrinho, Juliana Aparecida; Ricardo, José Reinaldo da Silva; Hazarbassanov, Rossen Myhailov; Höfling-Lima, Ana Luisa; Belfort, Rubens; Nishi, Mauro

    2013-01-01

    Purpose To compare the effectiveness of three culture media for growth, proliferation, differentiation, and viability of ex vivo cultured limbal epithelial progenitor cells. Methods Limbal epithelial progenitor cell cultures were established from ten human corneal rims and grew on plastic wells in three culture media: supplemental hormonal epithelial medium (SHEM), keratinocyte serum-free medium (KSFM), and Epilife. The performance of culturing limbal epithelial progenitor cells in each medium was evaluated according to the following parameters: growth area of epithelial migration; immunocytochemistry for adenosine 5′-triphosphate-binding cassette member 2 (ABCG2), p63, Ki67, cytokeratin 3 (CK3), and vimentin (VMT) and real-time reverse transcription polymerase chain reaction (RT–PCR) for CK3, ABCG2, and p63, and cell viability using Hoechst staining. Results Limbal epithelial progenitor cells cultivated in SHEM showed a tendency to faster migration, compared to KSFM and Epilife. Immunocytochemical analysis showed that proliferated cells in the SHEM had lower expression for markers related to progenitor epithelial cells (ABCG2) and putative progenitor cells (p63), and a higher percentage of positive cells for differentiated epithelium (CK3) when compared to KSFM and Epilife. In PCR analysis, ABCG2 expression was statistically higher for Epilife compared to SHEM. Expression of p63 was statistically higher for Epilife compared to SHEM and KSFM. However, CK3 expression was statistically lower for KSFM compared to SHEM. Conclusions Based on our findings, we concluded that cells cultured in KSFM and Epilife media presented a higher percentage of limbal epithelial progenitor cells, compared to SHEM. PMID:23378720

  19. [Distribution of human enterovirus 71 in brainstem of infants with brain stem encephalitis and infection mechanism].

    PubMed

    Hao, Bo; Gao, Di; Tang, Da-Wei; Wang, Xiao-Guang; Liu, Shui-Ping; Kong, Xiao-Ping; Liu, Chao; Huang, Jing-Lu; Bi, Qi-Ming; Quan, Li; Luo, Bin

    2012-04-01

    To explore how human enterovirus 71 (EV71) invades the brainstem and how intercellular adhesion molecule-1 (ICAM-1) participates, by analyzing the expression and distribution of EV71 and ICAM-1 in the brainstem of infants with brainstem encephalitis. Twenty-two brainstems of infants with brainstem encephalitis were collected as the experimental group, and 10 brainstems from fatal congenital heart disease cases were selected as the control group. Sections with perivascular cuffing were selected to observe EV71-VP1 expression by immunohistochemistry, and ICAM-1 expression was detected in the sections with positive EV71-VP1 expression. Staining image analysis and statistical analysis were performed, and the experimental and control groups were compared. (1) EV71-VP1 positive cells in the experimental group were mainly astrocytes in the brainstem, containing dark brown particles; the control group was negative. (2) ICAM-1 positive cells stained dark brown. Expression in inflammatory cells (around blood vessels of the brainstem and in glial nodules) and in gliocytes was increased. The results differed significantly from those of the control group (P < 0.05). Brainstem encephalitis can be used to diagnose fatal EV71 infection in infants. EV71 can invade the brainstem via the hematogenous route. ICAM-1 may play an important role in the pathogenic process.

  20. A statistical method for the conservative adjustment of false discovery rate (q-value).

    PubMed

    Lai, Yinglei

    2017-03-14

    q-value is a widely used statistical method for estimating false discovery rate (FDR), which is a conventional significance measure in the analysis of genome-wide expression data. q-value is a random variable and it may underestimate FDR in practice. An underestimated FDR can lead to unexpected false discoveries in the follow-up validation experiments. This issue has not been well addressed in literature, especially in the situation when the permutation procedure is necessary for p-value calculation. We proposed a statistical method for the conservative adjustment of q-value. In practice, it is usually necessary to calculate p-value by a permutation procedure. This was also considered in our adjustment method. We used simulation data as well as experimental microarray or sequencing data to illustrate the usefulness of our method. The conservativeness of our approach has been mathematically confirmed in this study. We have demonstrated the importance of conservative adjustment of q-value, particularly in the situation that the proportion of differentially expressed genes is small or the overall differential expression signal is weak.
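
    As background for the FDR estimates discussed in this abstract, the sketch below computes Benjamini-Hochberg adjusted p-values, which coincide with q-values when the proportion of true null hypotheses is taken to be 1. It is a baseline illustration only: it does not implement the author's conservative adjustment or the permutation procedure, and the function name is hypothetical.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    m = len(p_values)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        index = order[rank - 1]
        running_min = min(running_min, p_values[index] * m / rank)
        adjusted[index] = running_min
    return adjusted
```

    Genes whose adjusted value falls below a chosen threshold (for example 0.05) are declared differentially expressed at that estimated FDR; the abstract's concern is that such an estimate can be too optimistic in practice.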

  1. Analysis of temporal gene expression profiles: clustering by simulated annealing and determining the optimal number of clusters.

    PubMed

    Lukashin, A V; Fuchs, R

    2001-05-01

    Cluster analysis of genome-wide expression data from DNA microarray hybridization studies has proved to be a useful tool for identifying biologically relevant groupings of genes and samples. In the present paper, we focus on several important issues related to clustering algorithms that have not yet been fully studied. We describe a simple and robust algorithm for the clustering of temporal gene expression profiles that is based on the simulated annealing procedure. In general, this algorithm guarantees to eventually find the globally optimal distribution of genes over clusters. We introduce an iterative scheme that serves to evaluate quantitatively the optimal number of clusters for each specific data set. The scheme is based on standard approaches used in regular statistical tests. The basic idea is to organize the search of the optimal number of clusters simultaneously with the optimization of the distribution of genes over clusters. The efficiency of the proposed algorithm has been evaluated by means of a reverse engineering experiment, that is, a situation in which the correct distribution of genes over clusters is known a priori. The employment of this statistically rigorous test has shown that our algorithm places greater than 90% genes into correct clusters. Finally, the algorithm has been tested on real gene expression data (expression changes during yeast cell cycle) for which the fundamental patterns of gene expression and the assignment of genes to clusters are well understood from numerous previous studies.

  2. Principal Angle Enrichment Analysis (PAEA): Dimensionally Reduced Multivariate Gene Set Enrichment Analysis Tool

    PubMed Central

    Clark, Neil R.; Szymkiewicz, Maciej; Wang, Zichen; Monteiro, Caroline D.; Jones, Matthew R.; Ma’ayan, Avi

    2016-01-01

    Gene set analysis of differential expression, which identifies collectively differentially expressed gene sets, has become an important tool for biology. The power of this approach lies in its reduction of the dimensionality of the statistical problem and its incorporation of biological interpretation by construction. Many approaches to gene set analysis have been proposed, but benchmarking their performance in the setting of real biological data is difficult due to the lack of a gold standard. In a previously published work we proposed a geometrical approach to differential expression which performed highly in benchmarking tests and compared well to the most popular methods of differential gene expression. As reported, this approach has a natural extension to gene set analysis which we call Principal Angle Enrichment Analysis (PAEA). PAEA employs dimensionality reduction and a multivariate approach for gene set enrichment analysis. However, the performance of this method has not been assessed nor its implementation as a web-based tool. Here we describe new benchmarking protocols for gene set analysis methods and find that PAEA performs highly. The PAEA method is implemented as a user-friendly web-based tool, which contains 70 gene set libraries and is freely available to the community. PMID:26848405

  3. Principal Angle Enrichment Analysis (PAEA): Dimensionally Reduced Multivariate Gene Set Enrichment Analysis Tool.

    PubMed

    Clark, Neil R; Szymkiewicz, Maciej; Wang, Zichen; Monteiro, Caroline D; Jones, Matthew R; Ma'ayan, Avi

    2015-11-01

    Gene set analysis of differential expression, which identifies collectively differentially expressed gene sets, has become an important tool for biology. The power of this approach lies in its reduction of the dimensionality of the statistical problem and its incorporation of biological interpretation by construction. Many approaches to gene set analysis have been proposed, but benchmarking their performance in the setting of real biological data is difficult due to the lack of a gold standard. In a previously published work we proposed a geometrical approach to differential expression which performed highly in benchmarking tests and compared well to the most popular methods of differential gene expression. As reported, this approach has a natural extension to gene set analysis which we call Principal Angle Enrichment Analysis (PAEA). PAEA employs dimensionality reduction and a multivariate approach for gene set enrichment analysis. However, the performance of this method has not been assessed nor its implementation as a web-based tool. Here we describe new benchmarking protocols for gene set analysis methods and find that PAEA performs highly. The PAEA method is implemented as a user-friendly web-based tool, which contains 70 gene set libraries and is freely available to the community.

  4. Near-exact distributions for the block equicorrelation and equivariance likelihood ratio test statistic

    NASA Astrophysics Data System (ADS)

    Coelho, Carlos A.; Marques, Filipe J.

    2013-09-01

    In this paper the authors combine the equicorrelation and equivariance test introduced by Wilks [13] with the likelihood ratio test (l.r.t.) for independence of groups of variables to obtain the l.r.t. of block equicorrelation and equivariance. This test or its single block version may find applications in many areas as in psychology, education, medicine, genetics and they are important "in many tests of multivariate analysis, e.g. in MANOVA, Profile Analysis, Growth Curve analysis, etc" [12, 9]. By decomposing the overall hypothesis into the hypotheses of independence of groups of variables and the hypothesis of equicorrelation and equivariance we are able to obtain the expressions for the overall l.r.t. statistic and its moments. From these we obtain a suitable factorization of the characteristic function (c.f.) of the logarithm of the l.r.t. statistic, which enables us to develop highly manageable and precise near-exact distributions for the test statistic.

  5. The role of TNF alpha polymorphism and expression in susceptibility to nasal polyposis.

    PubMed

    Zhang, Guimin; Zhang, Jinmei; Kuang, Manbao; Lin, Peng

    2018-05-01

    In this study, we first performed a meta-analysis to assess the role of single-nucleotide polymorphism (SNP) within tumor necrosis factor alpha (TNF alpha) gene and TNF alpha expression in the risk of nasal polyposis. STATA 12.0 software was utilized to conduct the Mantel-Haenszel statistics, Cohen statistics, Begg's test, Egger's tests and sensitivity analysis. We systemically carried out the database retrieval and initially identified 486 articles. After screening, 15 articles were included in our meta-analysis. For TNF alpha rs1800629 G/A SNP, compared with control group, an increased risk of nasal polyposis of case group was observed in the models of A vs. G [p (P value of association) = 0.009, OR (odds ratio) = 1.35], GA vs. GG (p = 0.001, OR = 1.69), GA+AA vs. GG (p = 0.010, OR = 1.47). The similar results were observed in Caucasian subgroup (p < 0.05, OR > 1). For TNF alpha rs361525 G/A SNP, no significant difference between control and case group was detected (all p > 0.05). In addition, a significant difference exists between case and control groups in the meta-analyses of TNF alpha expression in nasal mucosal cells, secreted TNF alpha (p < 0.05, OR > 1), but not serum TNF alpha (p = 0.090). The present meta-analysis revealed that TNF alpha rs1800629, increased TNF alpha expression and secretion of nasal mucosal cells were associated with an increased risk of nasal polyposis.
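
    The pooled odds ratios reported above come from Mantel-Haenszel statistics computed in STATA. A minimal sketch of the Mantel-Haenszel pooled odds ratio is given below for orientation; the function name and the table layout are assumptions made for illustration, and the sketch omits the confidence interval and heterogeneity tests that a full meta-analysis would report.

```python
def mantel_haenszel_or(tables):
    """Pooled odds ratio across studies.

    Each table is (a, b, c, d):
      a = exposed cases,   b = exposed controls,
      c = unexposed cases, d = unexposed controls.
    """
    numerator = 0.0
    denominator = 0.0
    for a, b, c, d in tables:
        n = a + b + c + d
        numerator += a * d / n      # weighted "concordant" cross product
        denominator += b * c / n    # weighted "discordant" cross product
    return numerator / denominator
```

    For an allele contrast such as A vs. G, "exposed" would correspond to the A allele count and "unexposed" to the G allele count in each study.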

  6. Perceptual integration of kinematic components in the recognition of emotional facial expressions.

    PubMed

    Chiovetto, Enrico; Curio, Cristóbal; Endres, Dominik; Giese, Martin

    2018-04-01

    According to a long-standing hypothesis in motor control, complex body motion is organized in terms of movement primitives, reducing massively the dimensionality of the underlying control problems. For body movements, this low-dimensional organization has been convincingly demonstrated by the learning of low-dimensional representations from kinematic and EMG data. In contrast, the effective dimensionality of dynamic facial expressions is unknown, and dominant analysis approaches have been based on heuristically defined facial "action units," which reflect contributions of individual face muscles. We determined the effective dimensionality of dynamic facial expressions by learning of a low-dimensional model from 11 facial expressions. We found an amazingly low dimensionality with only two movement primitives being sufficient to simulate these dynamic expressions with high accuracy. This low dimensionality is confirmed statistically, by Bayesian model comparison of models with different numbers of primitives, and by a psychophysical experiment that demonstrates that expressions, simulated with only two primitives, are indistinguishable from natural ones. In addition, we find statistically optimal integration of the emotion information specified by these primitives in visual perception. Taken together, our results indicate that facial expressions might be controlled by a very small number of independent control units, permitting very low-dimensional parametrization of the associated facial expression.

  7. Quantum noise in SIS mixers

    NASA Astrophysics Data System (ADS)

    Zorin, A. B.

    1985-03-01

    In the present, quantum-statistical analysis of SIS heterodyne mixer performance, the conventional three-port model of the mixer circuit and the microscopic theory of superconducting tunnel junctions are used to derive a general expression for a noise parameter previously used for the case of parametric amplifiers. This expression is numerically evaluated for various quasiparticle current step widths, dc bias voltages, local oscillator powers, signal frequencies, signal source admittances, and operation temperatures.

  8. Causal network analysis of head and neck keloid tissue identifies potential master regulators.

    PubMed

    Garcia-Rodriguez, Laura; Jones, Lamont; Chen, Kang Mei; Datta, Indrani; Divine, George; Worsham, Maria J

    2016-10-01

    To generate novel insights and hypotheses in keloid development from potential master regulators. Prospective cohort. Six fresh keloid and six normal skin samples from 12 anonymous donors were used in a prospective cohort study. Genome-wide profiling was done previously on the cohort using the Infinium HumanMethylation450 BeadChip (Illumina, San Diego, CA). The 190 statistically significant CpG islands between keloid and normal tissue mapped to 152 genes (P < .05). The top 10 statistically significant genes (VAMP5, ACTR3C, GALNT3, KCNAB2, LRRC61, SCML4, SYNGR1, TNS1, PLEKHG5, PPP1R13-α, false discovery rate <.015) were uploaded into the Ingenuity Pathway Analysis software's Causal Network Analysis (QIAGEN, Redwood City, CA). To reflect expected gene expression direction in the context of methylation changes, the inverse of the methylation ratio from keloid versus normal tissue was used for the analysis. Causal Network Analysis identified disease-specific master regulator molecules based on downstream differentially expressed keloid-specific genes and expected directionality of expression (hypermethylated vs. hypomethylated). Causal Network Analysis software identified four hierarchical networks that included four master regulators (pyroxamide, tributyrin, PRKG2, and PENK) and 19 intermediate regulators. Causal Network Analysis of differentially methylated gene data of keloid versus normal skin demonstrated four causal networks with four master regulators. These hierarchical networks suggest potential driver roles for their downstream keloid gene targets in the pathogenesis of the keloid phenotype, likely triggered by perturbation of or injury to normal tissue. NA Laryngoscope, 126:E319-E324, 2016. © 2016 The American Laryngological, Rhinological and Otological Society, Inc.

  9. Prognostic implications of adhesion molecule expression in colorectal cancer.

    PubMed

    Seo, Kyung-Jin; Kim, Maru; Kim, Jeana

    2015-01-01

    Research on the expression of adhesion molecules, E-cadherin (ECAD), CD24, CD44 and osteopontin (OPN) in colorectal cancer (CRC) has been limited, even though CRC is one of the leading causes of cancer-related deaths. This study was conducted to evaluate the expression of adhesion molecules in CRC and to determine their relationships with clinicopathologic variables, and the prognostic significance. The expression of ECAD, CD24, CD44 and OPN was examined in 174 stage II and III CRC specimens by immunohistochemistry of TMA. Negative ECAD expression was significantly correlated with advanced nodal stage and poor tumor differentiation. Multivariate analysis showed that both negative expression of ECAD and positive expression of CD24 were independent prognostic factors for disease-free survival (DFS) in CRC patients (P<0.001, relative risk [RR] = 5.596, 95% CI = 2.712-11.549; P = 0.038, RR = 3.768, 95% CI = 1.077-13.185, respectively). However, for overall survival (OS), only ECAD negativity showed statistically significant results in multivariate analysis (P<0.001, RR = 4.819, 95% CI = 2.515-9.234). Positive expression of CD24 was associated with poor OS in univariate analysis but was of no prognostic value in multivariate analysis. In conclusion, our study suggests that among these four adhesion molecules, ECAD and CD24 expression can be considered independent prognostic factors. The role of CD44 and OPN may need further evaluation.

  10. Prognostic implications of adhesion molecule expression in colorectal cancer

    PubMed Central

    Seo, Kyung-Jin; Kim, Maru; Kim, Jeana

    2015-01-01

    Research on the expression of adhesion molecules, E-cadherin (ECAD), CD24, CD44 and osteopontin (OPN) in colorectal cancer (CRC) has been limited, even though CRC is one of the leading causes of cancer-related deaths. This study was conducted to evaluate the expression of adhesion molecules in CRC and to determine their relationships with clinicopathologic variables, and the prognostic significance. The expression of ECAD, CD24, CD44 and OPN was examined in 174 stage II and III CRC specimens by immunohistochemistry of TMA. Negative ECAD expression was significantly correlated with advanced nodal stage and poor tumor differentiation. Multivariate analysis showed that both negative expression of ECAD and positive expression of CD24 were independent prognostic factors for disease-free survival (DFS) in CRC patients (P<0.001, relative risk [RR] = 5.596, 95% CI = 2.712-11.549; P = 0.038, RR = 3.768, 95% CI = 1.077-13.185, respectively). However, for overall survival (OS), only ECAD negativity showed statistically significant results in multivariate analysis (P<0.001, RR = 4.819, 95% CI = 2.515-9.234). Positive expression of CD24 was associated with poor OS in univariate analysis but was of no prognostic value in multivariate analysis. In conclusion, our study suggests that among these four adhesion molecules, ECAD and CD24 expression can be considered independent prognostic factors. The role of CD44 and OPN may need further evaluation. PMID:26097606

  11. RepExplore: addressing technical replicate variance in proteomics and metabolomics data analysis.

    PubMed

    Glaab, Enrico; Schneider, Reinhard

    2015-07-01

    High-throughput omics datasets often contain technical replicates included to account for technical sources of noise in the measurement process. Although summarizing these replicate measurements by using robust averages may help to reduce the influence of noise on downstream data analysis, the information on the variance across the replicate measurements is lost in the averaging process and therefore typically disregarded in subsequent statistical analyses. We introduce RepExplore, a web service dedicated to exploiting the information captured in the technical replicate variance to provide more reliable and informative differential expression and abundance statistics for omics datasets. The software builds on previously published statistical methods, which have been applied successfully to biomedical omics data but are difficult to use without prior experience in programming or scripting. RepExplore facilitates the analysis by providing fully automated data processing and interactive ranking tables, whisker plot, heat map and principal component analysis visualizations to interpret omics data and derived statistics. Freely available at http://www.repexplore.tk. Contact: enrico.glaab@uni.lu. Supplementary data are available at Bioinformatics online. © The Author 2015. Published by Oxford University Press.

  12. Visualization of time series statistical data by shape analysis (GDP ratio changes among Asia countries)

    NASA Astrophysics Data System (ADS)

    Shirota, Yukari; Hashimoto, Takako; Fitri Sari, Riri

    2018-03-01

    Visualizing time series big data has become very important. In this paper we discuss a new analysis method called “statistical shape analysis” or “geometry driven statistics” applied to time series statistical data in economics. We analyse the changes in agriculture, value added and industry, value added (percentage of GDP) from 2000 to 2010 in Asia. We handle the data as a set of landmarks on a two-dimensional image to see the deformation using the principal components. The key point of the method is that the principal components of the given formation are eigenvectors of its bending energy matrix. The local deformation can be expressed as a set of non-affine transformations. The transformations give us information about the local differences between 2000 and 2010. Because a non-affine transformation can be decomposed into a set of partial warps, we present the partial warps visually. Statistical shape analysis is widely used in biology but, in economics, no application can be found. In the paper, we investigate its potential for analysing economic data.

  13. New dimensions from statistical graphics for GIS (geographic information system) analysis and interpretation

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    McCord, R.A.; Olson, R.J.

    1988-01-01

    Environmental research and assessment activities at Oak Ridge National Laboratory (ORNL) include the analysis of spatial and temporal patterns of ecosystem response at a landscape scale. Analysis through use of geographic information system (GIS) involves an interaction between the user and thematic data sets frequently expressed as maps. A portion of GIS analysis has a mathematical or statistical aspect, especially for the analysis of temporal patterns. ARC/INFO is an excellent tool for manipulating GIS data and producing the appropriate map graphics. INFO also has some limited ability to produce statistical tabulation. At ORNL we have extended our capabilities by graphically interfacing ARC/INFO and SAS/GRAPH to provide a combined mapping and statistical graphics environment. With the data management, statistical, and graphics capabilities of SAS added to ARC/INFO, we have expanded the analytical and graphical dimensions of the GIS environment. Pie or bar charts, frequency curves, hydrographs, or scatter plots as produced by SAS can be added to maps from attribute data associated with ARC/INFO coverages. Numerous, small, simplified graphs can also become a source of complex map "symbols." These additions extend the dimensions of GIS graphics to include time, details of the thematic composition, distribution, and interrelationships. 7 refs., 3 figs.

  14. Cross-correlation detection and analysis for California's electricity market based on analogous multifractal analysis

    NASA Astrophysics Data System (ADS)

    Wang, Fang; Liao, Gui-ping; Li, Jian-hui; Zou, Rui-biao; Shi, Wen

    2013-03-01

    A novel method, which we call the analogous multifractal cross-correlation analysis, is proposed in this paper to study the multifractal behavior in the power-law cross-correlation between price and load in the California electricity market. In addition, a statistic ρAMF-XA, which we call the analogous multifractal cross-correlation coefficient, is defined to test whether the cross-correlation between two given signals is genuine or not. Our analysis finds that both the price and load time series in the California electricity market exhibit a multifractal nature. However, as indicated by the ρAMF-XA statistical test, there is a huge difference in the cross-correlation behavior between the years 1999 and 2000 in the California electricity market.

  15. Cross-correlation detection and analysis for California's electricity market based on analogous multifractal analysis.

    PubMed

    Wang, Fang; Liao, Gui-ping; Li, Jian-hui; Zou, Rui-biao; Shi, Wen

    2013-03-01

    A novel method, which we call the analogous multifractal cross-correlation analysis, is proposed in this paper to study the multifractal behavior in the power-law cross-correlation between price and load in the California electricity market. In addition, a statistic ρAMF-XA, which we call the analogous multifractal cross-correlation coefficient, is defined to test whether the cross-correlation between two given signals is genuine or not. Our analysis finds that both the price and load time series in the California electricity market exhibit a multifractal nature. However, as indicated by the ρAMF-XA statistical test, there is a huge difference in the cross-correlation behavior between the years 1999 and 2000 in the California electricity market.

  16. Elevated expression of LSD1 (Lysine-specific demethylase 1) during tumour progression from pre-invasive to invasive ductal carcinoma of the breast

    PubMed Central

    2012-01-01

    Background: Lysine-specific demethylase1 (LSD1) is a nuclear protein which belongs to the aminooxidase-enzymes playing an important role in controlling gene expression. It has also been found highly expressed in several human malignancies including breast carcinoma. Our aim was to detect LSD1 expression also in pre-invasive neoplasias of the breast. In the current study we therefore analysed LSD1 protein expression in ductal carcinoma in situ (DCIS) in comparison to invasive ductal breast cancer (IDC). Methods: Using immunohistochemistry we systematically analysed LSD1 expression in low grade DCIS (n = 27), intermediate grade DCIS (n = 30), high grade DCIS (n = 31) and in invasive ductal breast cancer (n = 32). SPSS version 18.0 was used for statistical analysis. Results: LSD1 was differentially expressed in DCIS and invasive ductal breast cancer. Interestingly, LSD1 was significantly overexpressed in high grade DCIS versus low grade DCIS. Differences in LSD1 expression levels were also statistically significant between low/intermediate DCIS and invasive ductal breast carcinoma. Conclusions: LSD1 is also expressed in pre-invasive neoplasias of the breast. Additionally, there is a gradual increase of LSD1 expression within tumour progression from pre-invasive DCIS to invasive ductal breast carcinoma. Therefore upregulation of LSD1 may be an early tumour promoting event. PMID:22920283

  17. Elevated expression of LSD1 (Lysine-specific demethylase 1) during tumour progression from pre-invasive to invasive ductal carcinoma of the breast.

    PubMed

    Serce, Nuran; Gnatzy, Annette; Steiner, Susanne; Lorenzen, Henning; Kirfel, Jutta; Buettner, Reinhard

    2012-08-24

    Lysine-specific demethylase1 (LSD1) is a nuclear protein which belongs to the aminooxidase-enzymes playing an important role in controlling gene expression. It has also been found highly expressed in several human malignancies including breast carcinoma. Our aim was to detect LSD1 expression also in pre-invasive neoplasias of the breast. In the current study we therefore analysed LSD1 protein expression in ductal carcinoma in situ (DCIS) in comparison to invasive ductal breast cancer (IDC). Using immunohistochemistry we systematically analysed LSD1 expression in low grade DCIS (n = 27), intermediate grade DCIS (n = 30), high grade DCIS (n = 31) and in invasive ductal breast cancer (n = 32). SPSS version 18.0 was used for statistical analysis. LSD1 was differentially expressed in DCIS and invasive ductal breast cancer. Interestingly, LSD1 was significantly overexpressed in high grade DCIS versus low grade DCIS. Differences in LSD1 expression levels were also statistically significant between low/intermediate DCIS and invasive ductal breast carcinoma. LSD1 is also expressed in pre-invasive neoplasias of the breast. Additionally, there is a gradual increase of LSD1 expression within tumour progression from pre-invasive DCIS to invasive ductal breast carcinoma. Therefore upregulation of LSD1 may be an early tumour promoting event.

  18. Amelogenin in odontogenic cysts and tumors: An immunohistochemical study

    PubMed Central

    Anigol, Praveen; Kamath, Venkatesh V.; Satelur, Krishnanand; Anand, Nagaraja; Yerlagudda, Komali

    2014-01-01

    Background: Amelogenins are the major enamel proteins that play a major role in the biomineralization and structural organization of enamel. Aberrations of enamel-related proteins are thought to be involved in oncogenesis of odontogenic epithelium. The expression of amelogenin is possibly an indicator of differentiation of epithelial cells in the odontogenic lesions. Aims and Objectives: The present study aimed to observe the expression of amelogenin immunohistochemically in various odontogenic lesions. Materials and Methods: Paraffin sections of 40 odontogenic lesions were stained immunohistochemically with amelogenin antibodies. The positivity, pattern and intensity of expression of the amelogenin antibody were assessed, graded and statistically compared between groups of odontogenic cysts and tumors. Results: Almost all the odontogenic lesions expressed amelogenin in the epithelial component with the exception of an ameloblastic carcinoma. Differing grades of intensity and pattern were seen between the cysts and tumors. Intensity of expression was uniformly prominent in all odontogenic lesions with hard tissue formation. Statistical analysis however did not indicate significant differences between the two groups. Conclusion: The expression of amelogenin antibody is ubiquitous in odontogenic tissues and can be used as a definitive marker for identification of odontogenic epithelium. PMID:25937729

  19. Human equilibrative nucleoside transporter 1 (hENT1) levels predict response to gemcitabine in patients with biliary tract cancer (BTC).

    PubMed

    Santini, Daniele; Schiavon, Gaia; Vincenzi, Bruno; Cass, Carol E; Vasile, Enrico; Manazza, Andrea D; Catalano, Vincenzo; Baldi, Giacomo Giulio; Lai, Raymond; Rizzo, Sergio; Giacobino, Alice; Chiusa, Luigi; Caraglia, Michele; Russo, Antonio; Mackey, John; Falcone, Alfredo; Tonini, Giuseppe

    2011-01-01

    Translational data suggest that nucleoside transporters, in particular human equilibrative nucleoside transporter 1 (hENT1), play an important role in predicting clinical outcome after gemcitabine chemotherapy for several types of cancer. The aim of this study was to retrospectively determine patients' outcome according to the expression of hENT1 in tumoral cells of patients receiving gemcitabine-based therapy. The immunohistochemistry analysis was performed on samples from thirty-one patients with unresectable biliary tract cancer (BTC) consecutively treated with first line gemcitabine-based regimens. Twenty-one patients (67.7%) had positive hENT1 staining; 10 patients (32.3%) had negative hENT1 staining. Statistical analysis revealed no association between baseline characteristics, toxicities and tumor response to gemcitabine and hENT1 levels. In the univariate analysis, hENT1 expression was significantly correlated with time to progression (TTP) (p=0.0394; HR 2.902, 95%CI 1.053-7.996). The median TTP was 6.33 versus 2.83 months in patients with positive versus negative hENT1 staining, respectively. Moreover, patients with positive hENT1 expression showed a longer median overall survival when compared with patients with low hENT1 expression (14 versus 7 months, respectively), but this difference did not reach statistical significance (p=0.128). Therefore, hENT1 may be a relevant predictive marker of benefit from gemcitabine-based therapies in patients with advanced BTC.

  20. Statistical methods for astronomical data with upper limits. I - Univariate distributions

    NASA Technical Reports Server (NTRS)

    Feigelson, E. D.; Nelson, P. I.

    1985-01-01

    The statistical treatment of univariate censored data is discussed. A heuristic derivation of the Kaplan-Meier maximum-likelihood estimator from first principles is presented which results in an expression amenable to analytic error analysis. Methods for comparing two or more censored samples are given along with simple computational examples, stressing the fact that most astronomical problems involve upper limits while the standard mathematical methods require lower limits. The application of univariate survival analysis to six data sets in the recent astrophysical literature is described, and various aspects of the use of survival analysis in astronomy, such as the limitations of various two-sample tests and the role of parametric modelling, are discussed.
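
    The product-limit idea described in this abstract can be illustrated with a minimal Python sketch. This is not the authors' code: the function names `kaplan_meier` and `flip_upper_limits` are hypothetical, and the step of subtracting each value from a constant so that upper limits become lower limits (right-censored points) is an assumption about how the mismatch mentioned in the abstract is commonly handled, not something the abstract itself states.

```python
def flip_upper_limits(values, pivot):
    """Map x -> pivot - x so that upper limits become lower limits.

    Assumption for illustration: pivot is any constant >= max(values).
    """
    return [pivot - v for v in values]


def kaplan_meier(values, observed):
    """Product-limit estimate of S(t) = P(T > t) for right-censored data.

    values   : measured values (or censoring limits)
    observed : True for a detection, False for a censored point
    Returns a list of (t, S(t)) pairs at each distinct detected value.
    """
    data = sorted(zip(values, observed))
    n_at_risk = len(data)
    surv = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        events = 0   # detections tied at t
        ties = 0     # all points (detected or censored) tied at t
        while i < len(data) and data[i][0] == t:
            events += 1 if data[i][1] else 0
            ties += 1
            i += 1
        if events:
            # Each detected value multiplies S by (1 - d / n_at_risk).
            surv *= 1.0 - events / n_at_risk
            curve.append((t, surv))
        n_at_risk -= ties
    return curve
```

    In this sketch a censored point never lowers the curve directly; it only shrinks the at-risk count used at later detected values.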

  1. DOE Office of Scientific and Technical Information (OSTI.GOV)

    Kolker, Eugene

    Our project focused primarily on analysis of different types of data produced by global high-throughput technologies, data integration of gene annotation, and gene and protein expression information, as well as on getting a better functional annotation of Shewanella genes. Specifically, four of our numerous major activities and achievements include the development of: statistical models for identification and expression proteomics, superior to currently available approaches (including our own earlier ones); approaches to improve gene annotations on the whole-organism scale; standards for annotation, transcriptomics and proteomics approaches; and generalized approaches for data integration of gene annotation, gene and protein expression information.

  2. Histogram analysis of ADC in rectal cancer: associations with different histopathological findings including expression of EGFR, Hif1-alpha, VEGF, p53, PD1, and KI 67. A preliminary study.

    PubMed

    Meyer, Hans Jonas; Höhn, Annekathrin; Surov, Alexey

    2018-04-06

    Functional imaging modalities like diffusion-weighted imaging are increasingly used to predict tumor behavior like cellularity and vascularity in different tumors. Histogram analysis is an emergent imaging analysis, in which every voxel is used to obtain a histogram and therefore statistical information about tumors can be provided. The purpose of this study was to elucidate possible associations between ADC histogram parameters and several immunohistochemical features in rectal cancer. Overall, 11 patients with histologically proven rectal cancer were included in the study. There were 2 (18.18%) females and 9 males with a mean age of 67.1 years. KI 67-index, expression of p53, EGFR, VEGF, and Hif1-alpha were semiautomatically estimated. The tumors were divided into PD1-positive and PD1-negative lesions. ADC histogram analysis was performed as a whole lesion measurement using an in-house MATLAB application. Spearman's correlation analysis revealed a strong correlation between EGFR expression and ADCmax (ρ=0.72, P=0.02). None of the vascular parameters (VEGF, Hif1-alpha) correlated with ADC parameters. Kurtosis and skewness correlated inversely with p53 expression (ρ=-0.64, P=0.03 and ρ=-0.81, P=0.002, respectively). ADCmedian and ADCmode correlated with Ki67 (ρ=-0.62, P=0.04 and ρ=-0.65, P=0.03, respectively). PD1-positive tumors showed statistically significantly lower ADCmax values in comparison to PD1-negative tumors, 1.93 ± 0.36 vs 2.32 ± 0.47 × 10⁻³ mm²/s, P=0.04. Several associations were identified between histogram parameters derived from ADC maps and EGFR, KI 67 and p53 expression in rectal cancer. Furthermore, ADCmax was different between PD1 positive and PD1 negative tumors indicating an important role of ADC parameters for possible future treatment prediction.

  3. Histogram analysis of ADC in rectal cancer: associations with different histopathological findings including expression of EGFR, Hif1-alpha, VEGF, p53, PD1, and KI 67. A preliminary study

    PubMed Central

    Meyer, Hans Jonas; Höhn, Annekathrin; Surov, Alexey

    2018-01-01

    Functional imaging modalities like diffusion-weighted imaging are increasingly used to predict tumor behavior like cellularity and vascularity in different tumors. Histogram analysis is an emergent imaging analysis, in which every voxel is used to obtain a histogram and therefore statistical information about tumors can be provided. The purpose of this study was to elucidate possible associations between ADC histogram parameters and several immunohistochemical features in rectal cancer. Overall, 11 patients with histologically proven rectal cancer were included in the study. There were 2 (18.18%) females and 9 males with a mean age of 67.1 years. KI 67-index, expression of p53, EGFR, VEGF, and Hif1-alpha were semiautomatically estimated. The tumors were divided into PD1-positive and PD1-negative lesions. ADC histogram analysis was performed as a whole lesion measurement using an in-house MATLAB application. Spearman's correlation analysis revealed a strong correlation between EGFR expression and ADCmax (ρ=0.72, P=0.02). None of the vascular parameters (VEGF, Hif1-alpha) correlated with ADC parameters. Kurtosis and skewness correlated inversely with p53 expression (ρ=-0.64, P=0.03 and ρ=-0.81, P=0.002, respectively). ADCmedian and ADCmode correlated with Ki67 (ρ=-0.62, P=0.04 and ρ=-0.65, P=0.03, respectively). PD1-positive tumors showed statistically significantly lower ADCmax values in comparison to PD1-negative tumors, 1.93 ± 0.36 vs 2.32 ± 0.47 × 10⁻³ mm²/s, P=0.04. Several associations were identified between histogram parameters derived from ADC maps and EGFR, KI 67 and p53 expression in rectal cancer. Furthermore, ADCmax was different between PD1 positive and PD1 negative tumors indicating an important role of ADC parameters for possible future treatment prediction. PMID:29719621

  4. Correlation between chemotherapy resistance in osteosarcoma patients and PAK5 and Ezrin gene expression

    PubMed Central

    Liu, Qian; Xu, Bo; Zhou, Wanshan

    2018-01-01

    The correlation between PAK5 (P21-activated kinase 5) and Ezrin gene expression and chemotherapy resistance of osteosarcoma patients was investigated. The cisplatin (CDDP)-resistance model of osteosarcoma cells SOSP-9607/CDDP was established to detect the cell growth curve. Methyl thiazolyl tetrazolium (MTT) assay was used to detect the drug resistance of cells to chemotherapy drugs. Transwell assay was used to detect the invasive capacity of cells. Semi-quantitative PCR (qPCR) was used to detect the mRNA expression levels in the drug resistance-related genes PAK5 and Ezrin. Western blot analysis was used to detect the protein expression levels in PAK5 and Ezrin. Tumor tissues were taken from the osteosarcoma patients with chemotherapy resistance to detect the expression levels of PAK5 and Ezrin via immunohistochemical detection, and the correlation between PAK5 and Ezrin expressions was studied. The results of MTT assay showed that the growth rate of SOSP-9607 was similar to that of SOSP-9607/CDDP, and the difference was not statistically significant (P>0.05). The sensitivity of SOSP-9607 to CDDP was significantly higher than that of SOSP-9607/CDDP, and the difference was statistically significant (P<0.01). Transwell assay showed that the migration capacity of SOSP-9607/CDDP was significantly better than that of SOSP-9607 (P<0.01), indicating that the drug resistance cell lines of osteosarcoma were constructed successfully. Semi-qPCR and western blot analysis showed that the protein expression levels in PAK5 and Ezrin in SOSP-9607/CDDP were significantly higher than those in SOSP-9607 (P<0.01). The results of immunohistochemistry showed that the expression quantities of PAK5 and Ezrin in osteosarcoma tissues were significantly higher than those in para-tumor tissues (P<0.01). Pearson's correlation analysis showed that expression of PAK5 and Ezrin was positively correlated (r=0.197, P=0.023). The osteosarcoma resistance is closely related to the expression levels of PAK5 and Ezrin genes. Thus, PAK5 and Ezrin genes may affect the tolerance of osteosarcoma patients to chemotherapy drugs during treatment via the synergistic effect. PMID:29391894

  5. Elevated expression of matrix metalloproteinase-9 is associated with bladder cancer pathogenesis.

    PubMed

    Wu, Gong-Jin; Bao, Jun-Sheng; Yue, Zhong-Jin; Zeng, Fan-Chang; Cen, Song; Tang, Zheng-Yan; Kang, Xin-Li

    2018-01-01

    This study investigated the association between abnormal matrix metalloproteinase-9 (MMP-9) expression and bladder cancer (BC) development. In a retrospective analysis, this study used tissue samples derived from 92 patients pathologically diagnosed with BC (experimental group), who were hospitalized between September 2012 and June 2014 at the Urinary Surgery of Department of Urology, Lanzhou University Second Hospital. As controls (control group), 63 normal pericancerous bladder mucosal tissues (3 cm distant from the edge of BC foci) with confirmed pathology were selected from the same time period. Immunohistochemistry was employed to detect MMP-9 protein expression in the tissues and enzyme-linked immunosorbent assay was performed to measure MMP-9 protein levels in tissue samples of patients and control subjects. Finally, a meta-analysis was conducted to understand the overall impact of MMP-9 on BC pathogenesis. STATA 12.0 software (Stata Corp, College Station, TX, USA) was used for all statistical analyses. The MMP-9 positive expression rate in tissue samples and MMP-9 levels were significantly greater in the experimental group compared to the control group (both P < 0.001). The frequency of MMP-9 positive status showed statistically significant differences between G1 (low-grade) and G3 (high-grade) (P < 0.001), between G2 and G3 (P < 0.05), and between G1/G2 and G3 (P = 0.001). Our meta-analysis findings provided further evidence that MMP-9 positive expression status and MMP-9 levels in the experimental group were significantly higher than the control group (positive expressions: Odds ratio [OR] = 18.59, 95% confidence interval [95% CI] = 11.63-29.71, P < 0.001; expression levels: Standard mean difference = 1.51, 95%CI = 0.63-2.39, P = 0.001). The positive expression status of MMP-9 was notably lower in G1/G2 compared to G3 (OR = 0.24, 95%CI = 0.15-0.36, P < 0.001). Our study demonstrated that both positive expression status in tumor tissue and expression levels of MMP-9 are significantly elevated in BC patients and correlate with disease progression. Thus, MMP-9 can serve as a biomarker to determine the degree of BC malignancy.

  6. CorSig: a general framework for estimating statistical significance of correlation and its application to gene co-expression analysis.

    PubMed

    Wang, Hong-Qiang; Tsai, Chung-Jui

    2013-01-01

    With the rapid increase of omics data, correlation analysis has become an indispensable tool for inferring meaningful associations from a large number of observations. Pearson correlation coefficient (PCC) and its variants are widely used for such purposes. However, it remains challenging to test whether an observed association is reliable both statistically and biologically. We present here a new method, CorSig, for statistical inference of correlation significance. CorSig is based on a biology-informed null hypothesis, i.e., testing whether the true PCC (ρ) between two variables is statistically larger than a user-specified PCC cutoff (τ), as opposed to the simple null hypothesis of ρ = 0 in existing methods, i.e., testing whether an association can be declared without a threshold. CorSig incorporates Fisher's Z transformation of the observed PCC (r), which facilitates use of standard techniques for p-value computation and multiple testing corrections. We compared CorSig against two methods: one uses a minimum PCC cutoff while the other (Zhu's procedure) controls correlation strength and statistical significance in two discrete steps. CorSig consistently outperformed these methods in various simulation data scenarios by balancing between false positives and false negatives. When tested on real-world Populus microarray data, CorSig effectively identified co-expressed genes in the flavonoid pathway, and discriminated between closely related gene family members for their differential association with flavonoid and lignin pathways. The p-values obtained by CorSig can be used as a stand-alone parameter for stratification of co-expressed genes according to their correlation strength in lieu of an arbitrary cutoff. CorSig requires one single tunable parameter, and can be readily extended to other correlation measures. Thus, CorSig should be useful for a wide range of applications, particularly for network analysis of high-dimensional genomic data. A web server for CorSig is provided at http://202.127.200.1:8080/probeWeb. R code for CorSig is freely available for non-commercial use at http://aspendb.uga.edu/downloads.
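
    The kind of test this abstract describes can be sketched with the textbook Fisher Z approximation. The Python below is an illustrative sketch only, not the CorSig implementation: the function name `fisher_z_threshold_pvalue` is hypothetical, and the standard error 1/sqrt(n - 3) with a one-sided normal tail is the standard large-sample approximation, assumed here rather than taken from the abstract.

```python
import math


def fisher_z_threshold_pvalue(r, n, tau):
    """One-sided p-value for H0: rho <= tau versus H1: rho > tau.

    r   : observed Pearson correlation, strictly between -1 and 1
    n   : number of paired observations (must exceed 3)
    tau : user-specified correlation cutoff, strictly between -1 and 1
    """
    if n <= 3:
        raise ValueError("n must be greater than 3")
    if not (-1.0 < r < 1.0 and -1.0 < tau < 1.0):
        raise ValueError("r and tau must lie strictly between -1 and 1")
    z_obs = math.atanh(r)            # Fisher Z transform of the observed r
    z_null = math.atanh(tau)         # Fisher Z transform of the cutoff
    se = 1.0 / math.sqrt(n - 3)      # approximate standard error of z_obs
    stat = (z_obs - z_null) / se     # approximately N(0, 1) when rho == tau
    # Upper-tail probability of the standard normal distribution.
    return 0.5 * math.erfc(stat / math.sqrt(2.0))
```

    Setting tau to 0 reduces this to the usual one-sided test of ρ = 0; a larger tau demands stronger evidence before an association is declared, which is the contrast the abstract draws.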

  7. Expression analysis of selected classes of circulating exosomal miRNAs in soccer players as an indicator of adaptation to physical activity

    PubMed Central

    Jastrzębski, Zbigniew; Kiszałkiewicz, Justyna; Brzeziański, Michał; Pastuszak-Lewandoska, Dorota; Radzimińki, Łukasz; Brzeziańska-Lasota, Ewa; Jegier, Anna

    2017-01-01

    Recent studies have shown that, depending on the type of training and its duration, the expression levels of selected circulating myomiRNAs (c-miR-27a,b, c-miR-29a,b,c, c-miR-133a) differ and correlate with the physiological indicators of adaptation to physical activity. The aim of this study was to analyse the expression of selected classes of miRNAs in soccer players during different periods of their training cycle. The study involved 22 soccer players aged 17-18 years. The multi-stage 20-m shuttle run test was used to estimate VO2 max among the soccer players. Serum samples were collected at baseline (time point I), after one week (time point II), and after 2 months of training (time point III). The analysis of the relative quantification (RQ) level of three exosomal myomiRNAs, c-miRNA-27b, c-miR-29a, and c-miR-133, was performed by quantitative polymerase chain reaction (qPCR) at three time points – before the training, after 1 week of training and after the completion of two months of competition season training. The expression analysis showed low expression levels (according to references) of all evaluated myomiRNAs before the training cycle. Analysis performed after a week of the training cycle and after completion of the entire training cycle showed elevated expression of all tested myomiRNAs. Statistical analysis revealed significant differences between the first and the second time point in soccer players for c-miR-27b and c-miR-29a; between the first and the third time point for c-miR-27b and c-miR-29a; and between the second and the third time point for c-miR-27b. Statistical analysis showed a positive correlation between the levels of c-miR-29a and VO2 max. Two months of training affected the expression of c-miR-27b and miR-29a in soccer players. The increased expression of c-miR-27b and c-miR-29 with training could indicate their probable role in the adaptation process that takes place in the muscular system. Possibly, the expression of c-miR-29a will be found to be involved in cardiorespiratory fitness in future research. PMID:29472735

  8. Decreased expression of dual specificity phosphatase 22 in colorectal cancer and its potential prognostic relevance for stage IV CRC patients.

    PubMed

    Yu, Dan; Li, Zhenli; Gan, Meifu; Zhang, Hanyun; Yin, Xiaoyang; Tang, Shunli; Wan, Ledong; Tian, Yiping; Zhang, Shuai; Zhu, Yimin; Lai, Maode; Zhang, Dandan

    2015-11-01

    Dual specificity phosphatase 22 (DUSP22) is a novel dual specificity phosphatase that has been demonstrated to be a cancer suppressor gene associated with numerous biological and pathological processes. However, little is known of DUSP22 expression profiling in colorectal cancer and its prognostic value. Our study aims to investigate the role of DUSP22 expression in the prognosis of colorectal cancer. We detected the mRNA expression in 92 paired primary colorectal cancer tissues and the corresponding adjacent normal tissues using the QuantiGenePlex assay. The Friedman test was used to determine the statistical difference of gene expression. Kaplan-Meier survival analysis was performed. The Mann-Whitney and Kruskal-Wallis tests were used to conduct data analyses to determine the prognostic value. Statistical significance was set at P < 0.05. In 74 of 92 cases, DUSP22 mRNA was reduced in primary colorectal cancer tissues compared with the adjacent normal tissues. The mRNA levels of DUSP22 were significantly lower in colorectal cancer tissues than in adjacent normal tissues (0.0290 vs. 0.0658; P < 0.001). Low expression of DUSP22 correlated significantly with large tumor size (P = 0.013). No association was observed between DUSP22 mRNA expression and differentiation, histopathological type, tumor invasion, lymph node metastases, metastases, TNM stage, and Dukes' stage (all P > 0.05). Kaplan-Meier analysis indicated that DUSP22 expression had no significant relationship with overall survival in all patients (P > 0.05). Interestingly, stage IV patients with low DUSP22 expression showed poorer survival, although the difference was only marginally significant (P = 0.07). Reduced DUSP22 expression was found in colorectal cancer specimens. Low DUSP22 expression in stage IV patients was associated with a poor survival outcome. Further study is required for the investigation of the role of DUSP22 in colorectal cancer.

  9. BEAT: Bioinformatics Exon Array Tool to store, analyze and visualize Affymetrix GeneChip Human Exon Array data from disease experiments

    PubMed Central

    2012-01-01

    Background It is known from recent studies that more than 90% of human multi-exon genes are subject to Alternative Splicing (AS), a key molecular mechanism in which multiple transcripts may be generated from a single gene. It is widely recognized that a breakdown in AS mechanisms plays an important role in cellular differentiation and pathologies. Polymerase Chain Reactions, microarrays and sequencing technologies have been applied to the study of transcript diversity arising from alternative expression. Last generation Affymetrix GeneChip Human Exon 1.0 ST Arrays offer a more detailed view of the gene expression profile providing information on the AS patterns. The exon array technology, with more than five million data points, can detect approximately one million exons, and it allows performing analyses at both gene and exon level. In this paper we describe BEAT, an integrated user-friendly bioinformatics framework to store, analyze and visualize exon arrays datasets. It combines a data warehouse approach with some rigorous statistical methods for assessing the AS of genes involved in diseases. Meta statistics are proposed as a novel approach to explore the analysis results. BEAT is available at http://beat.ba.itb.cnr.it. Results BEAT is a web tool which allows uploading and analyzing exon array datasets using standard statistical methods and an easy-to-use graphical web front-end. BEAT has been tested on a dataset with 173 samples and tuned using new datasets of exon array experiments from 28 colorectal cancer and 26 renal cell cancer samples produced at the Medical Genetics Unit of IRCCS Casa Sollievo della Sofferenza. To highlight all possible AS events, alternative names, accession Ids, Gene Ontology terms and biochemical pathways annotations are integrated with exon and gene level expression plots. 
The user can customize the results choosing custom thresholds for the statistical parameters and exploiting the available clinical data of the samples for a multivariate AS analysis. Conclusions Despite exon array chips being widely used for transcriptomics studies, there is a lack of analysis tools offering advanced statistical features and requiring no programming knowledge. BEAT provides a user-friendly platform for a comprehensive study of AS events in human diseases, displaying the analysis results with easily interpretable and interactive tables and graphics. PMID:22536968

  10. Pathway-based factor analysis of gene expression data produces highly heritable phenotypes that associate with age.

    PubMed

    Anand Brown, Andrew; Ding, Zhihao; Viñuela, Ana; Glass, Dan; Parts, Leopold; Spector, Tim; Winn, John; Durbin, Richard

    2015-03-09

    Statistical factor analysis methods have previously been used to remove noise components from high-dimensional data prior to genetic association mapping and, in a guided fashion, to summarize biologically relevant sources of variation. Here, we show how the derived factors summarizing pathway expression can be used to analyze the relationships between expression, heritability, and aging. We used skin gene expression data from 647 twins from the MuTHER Consortium and applied factor analysis to concisely summarize patterns of gene expression to remove broad confounding influences and to produce concise pathway-level phenotypes. We derived 930 "pathway phenotypes" that summarized patterns of variation across 186 KEGG pathways (five phenotypes per pathway). We identified 69 significant associations of age with phenotype from 57 distinct KEGG pathways at a stringent Bonferroni threshold ([Formula: see text]). These phenotypes are more heritable ([Formula: see text]) than gene expression levels. On average, expression levels of 16% of genes within these pathways are associated with age. Several significant pathways relate to metabolizing sugars and fatty acids; others relate to insulin signaling. We have demonstrated that factor analysis methods combined with biological knowledge can produce more reliable phenotypes with less stochastic noise than the individual gene expression levels, which increases our power to discover biologically relevant associations. These phenotypes could also be applied to discover associations with other environmental factors. Copyright © 2015 Brown et al.

  11. Pathway-Based Factor Analysis of Gene Expression Data Produces Highly Heritable Phenotypes That Associate with Age

    PubMed Central

    Anand Brown, Andrew; Ding, Zhihao; Viñuela, Ana; Glass, Dan; Parts, Leopold; Spector, Tim; Winn, John; Durbin, Richard

    2015-01-01

    Statistical factor analysis methods have previously been used to remove noise components from high-dimensional data prior to genetic association mapping and, in a guided fashion, to summarize biologically relevant sources of variation. Here, we show how the derived factors summarizing pathway expression can be used to analyze the relationships between expression, heritability, and aging. We used skin gene expression data from 647 twins from the MuTHER Consortium and applied factor analysis to concisely summarize patterns of gene expression to remove broad confounding influences and to produce concise pathway-level phenotypes. We derived 930 “pathway phenotypes” that summarized patterns of variation across 186 KEGG pathways (five phenotypes per pathway). We identified 69 significant associations of age with phenotype from 57 distinct KEGG pathways at a stringent Bonferroni threshold (P < 5.38×10⁻⁵). These phenotypes are more heritable (h² = 0.32) than gene expression levels. On average, expression levels of 16% of genes within these pathways are associated with age. Several significant pathways relate to metabolizing sugars and fatty acids; others relate to insulin signaling. We have demonstrated that factor analysis methods combined with biological knowledge can produce more reliable phenotypes with less stochastic noise than the individual gene expression levels, which increases our power to discover biologically relevant associations. These phenotypes could also be applied to discover associations with other environmental factors. PMID:25758824
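
    The Bonferroni threshold quoted in this abstract is consistent with dividing a family-wise significance level by the number of pathway phenotypes tested. The short Python sketch below reproduces that arithmetic. The significance level of 0.05 is an assumption (the abstract does not state it), and the function name is illustrative only.

```python
# Bonferroni correction: per-test threshold = family-wise alpha / number of tests.
# ASSUMPTION: alpha = 0.05 is not given in the abstract; it is the conventional choice.

def bonferroni_threshold(alpha: float, n_tests: int) -> float:
    """Return the per-test significance threshold for n_tests comparisons."""
    if n_tests <= 0:
        raise ValueError("n_tests must be positive")
    return alpha / n_tests

# 186 KEGG pathways with five phenotypes each, as reported in the abstract.
n_phenotypes = 186 * 5
threshold = bonferroni_threshold(0.05, n_phenotypes)
print(f"{n_phenotypes} tests -> per-test threshold {threshold:.2e}")
```

    Under that assumption, the computed value rounds to the threshold reported in the abstract.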

  12. Inferring molecular interactions pathways from eQTL data

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Rashid, Imran; McDermott, Jason E.; Samudrala, Ram

    Analysis of expression quantitative trait loci (eQTL) helps elucidate the connection between genotype, gene expression levels, and phenotype. However, standard statistical genetics can only attribute changes in expression levels to loci on the genome, not specific genes. Each locus can contain many genes, making it very difficult to discover which gene is controlling the expression levels of other genes. Furthermore, it is even more difficult to find a pathway of molecular interactions responsible for controlling the expression levels. Here we describe a series of techniques for finding explanatory pathways by exploring graphs of molecular interactions. We show that several simple methods can find complete pathways that explain the mechanism of differential expression in eQTL data.

  13. Expression of FOXP3, CD68, and CD20 at Diagnosis in the Microenvironment of Classical Hodgkin Lymphoma Is Predictive of Outcome

    PubMed Central

    Greaves, Paul; Clear, Andrew; Coutinho, Rita; Wilson, Andrew; Matthews, Janet; Owen, Andrew; Shanyinde, Milensu; Lister, T. Andrew; Calaminici, Maria; Gribben, John G.

    2013-01-01

    Purpose The immune microenvironment is key to the pathophysiology of classical Hodgkin lymphoma (CHL). Twenty percent of patients experience failure of their initial treatment, and others receive excessively toxic treatment. Prognostic scores and biomarkers have yet to influence outcomes significantly. Previous biomarker studies have been limited by the extent of tissue analyzed, statistical inconsistencies, and failure to validate findings. We aimed to overcome these limitations by validating recently identified microenvironment biomarkers (CD68, FOXP3, and CD20) in a new patient cohort with a greater extent of tissue and by using rigorous statistical methodology. Patients and Methods Diagnostic tissue from 122 patients with CHL was microarrayed and stained, and positive cells were counted across 10 to 20 high-powered fields per patient by using an automated system. Two statistical analyses were performed: a categorical analysis with test/validation set-defined cut points and Kaplan-Meier estimated outcome measures of 5-year overall survival (OS), disease-specific survival (DSS), and freedom from first-line treatment failure (FFTF) and an independent multivariate analysis of absolute uncategorized counts. Results Increased CD20 expression confers superior OS. Increased FOXP3 expression confers superior OS, and increased CD68 confers inferior FFTF and OS. FOXP3 varies independently of CD68 expression and retains significance when analyzed as a continuous variable in multivariate analysis. A simple score combining FOXP3 and CD68 discriminates three groups: FFTF 93%, 62%, and 47% (P < .001), DSS 93%, 82%, and 63% (P = .03), and OS 93%, 82%, and 59% (P = .002). Conclusion We have independently validated CD68, FOXP3, and CD20 as prognostic biomarkers in CHL, and we demonstrate, to the best of our knowledge for the first time, that combining FOXP3 and CD68 may further improve prognostic stratification. PMID:23045593

  14. Comparative study of joint analysis of microarray gene expression data in survival prediction and risk assessment of breast cancer patients

    PubMed Central

    2016-01-01

    Microarray gene expression data sets are jointly analyzed to increase statistical power. They could either be merged together or analyzed by meta-analysis. For a given ensemble of data sets, it cannot be foreseen which of these paradigms, merging or meta-analysis, works better. In this article, three joint analysis methods, Z-score normalization, ComBat and the inverse normal method (meta-analysis) were selected for survival prognosis and risk assessment of breast cancer patients. The methods were applied to eight microarray gene expression data sets, totaling 1324 patients with two clinical endpoints, overall survival and relapse-free survival. The performance derived from the joint analysis methods was evaluated using Cox regression for survival analysis and independent validation used as bias estimation. Overall, Z-score normalization had a better performance than ComBat and meta-analysis. Higher Area Under the Receiver Operating Characteristic curve and hazard ratio were also obtained when independent validation was used as bias estimation. With a lower time and memory complexity, Z-score normalization is a simple method for joint analysis of microarray gene expression data sets. The derived findings suggest further assessment of this method in future survival prediction and cancer classification applications. PMID:26504096
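
    As a rough illustration of the two paradigms compared in this abstract, the Python sketch below standardizes one gene's values within each data set before merging (Z-score normalization) and combines per-study p-values with the unweighted inverse normal method, often attributed to Stouffer. The expression values, p-values and function names are hypothetical, and the study's actual pipeline (weighting, one- or two-sided tests, ComBat settings) is not described in the abstract.

```python
from math import sqrt
from statistics import NormalDist, mean, stdev

def zscore(values):
    """Standardize one gene's expression values within a single data set."""
    mu, sigma = mean(values), stdev(values)
    if sigma == 0:
        return [0.0] * len(values)  # constant gene: no variation to scale
    return [(v - mu) / sigma for v in values]

def inverse_normal_combine(p_values):
    """Combine one-sided p-values from independent studies (unweighted)."""
    nd = NormalDist()
    # Map each p-value to a standard normal deviate; each p must lie in (0, 1).
    z_scores = [nd.inv_cdf(1.0 - p) for p in p_values]
    combined_z = sum(z_scores) / sqrt(len(z_scores))
    return 1.0 - nd.cdf(combined_z)

# Hypothetical values for one gene measured on two different platforms.
study_a = [7.1, 7.9, 8.4, 6.6]
study_b = [102.0, 95.0, 110.0]

# Merging paradigm: put each data set on a common scale, then pool the samples.
merged = zscore(study_a) + zscore(study_b)

# Meta-analysis paradigm: analyze each study separately, then combine p-values.
combined_p = inverse_normal_combine([0.04, 0.10, 0.03])
print(len(merged), combined_p)
```

    Merging pools samples before any modelling, whereas the inverse normal method only needs a summary statistic from each study, which is why the two can behave differently on the same ensemble of data sets.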

  15. Transcriptome profiling analysis reveals biomarkers in colon cancer samples of various differentiation

    PubMed Central

    Yu, Tonghu; Zhang, Huaping; Qi, Hong

    2018-01-01

    The aim of the present study was to investigate more colon cancer-related genes in different stages. Gene expression profile E-GEOD-62932 was extracted for differentially expressed gene (DEG) screening. Series test of cluster analysis was used to obtain significant trending models. Based on the Gene Ontology and Kyoto Encyclopedia of Genes and Genomes databases, functional and pathway enrichment analysis were processed and a pathway relation network was constructed. Gene co-expression network and gene signal network were constructed for common DEGs. The DEGs with the same trend were clustered and in total, 16 clusters with statistical significance were obtained. The screened DEGs were enriched into small molecule metabolic process and metabolic pathways. The pathway relation network was constructed with 57 nodes. A total of 328 common DEGs were obtained. Gene signal network was constructed with 71 nodes. Gene co-expression network was constructed with 161 nodes and 211 edges. ABCD3, CPT2, AGL and JAM2 are potential biomarkers for the diagnosis of colon cancer. PMID:29928385

  16. Gene expression profiling of Japanese psoriatic skin reveals an increased activity in molecular stress and immune response signals.

    PubMed

    Kulski, Jerzy K; Kenworthy, William; Bellgard, Matthew; Taplin, Ross; Okamoto, Koichi; Oka, Akira; Mabuchi, Tomotaka; Ozawa, Akira; Tamiya, Gen; Inoko, Hidetoshi

    2005-12-01

    Gene expression profiling was performed on biopsies of affected and unaffected psoriatic skin and normal skin from seven Japanese patients to obtain insights into the pathways that control this disease. HUG95A Affymetrix DNA chips that contained oligonucleotide arrays of approximately 12,000 well-characterized human genes were used in the study. The statistical analysis of the Affymetrix data, based on the ranking of the Student t-test statistic, revealed a complex regulation of molecular stress and immune gene responses. The majority of the 266 induced genes in affected and unaffected psoriatic skin were involved with interferon mediation, immunity, cell adhesion, cytoskeleton restructuring, protein trafficking and degradation, RNA regulation and degradation, signalling transduction, apoptosis and atypical epidermal cellular proliferation and differentiation. The disturbances in the normal protein degradation equilibrium of skin were reflected by the significant increase in the gene expression of various protease inhibitors and proteinases, including the induced components of the ATP/ubiquitin-dependent non-lysosomal proteolytic pathway that is involved with peptide processing and presentation to T cells. Some of the up-regulated genes, such as TGM1, IVL, FABP5, CSTA and SPRR, are well-known psoriatic markers involved in atypical epidermal cellular organization and differentiation. In the comparison between the affected and unaffected psoriatic skin, the transcription factor JUNB was found at the top of the statistical rankings for the up-regulated genes in affected skin, suggesting that it has an important but as yet undefined role in psoriasis. Our gene expression data and analysis suggest that psoriasis is a chronic interferon- and T-cell-mediated immune disease of the skin where the imbalance in epidermal cellular structure, growth and differentiation arises from the molecular antiviral stress signals initiating inappropriate immune responses.

  17. Transcriptomic and bioinformatics analysis of the early time-course of the response to prostaglandin F2 alpha in the bovine corpus luteum

    USDA-ARS's Scientific Manuscript database

    RNA expression analysis was performed on the corpus luteum tissue at five time points after prostaglandin F2 alpha treatment of midcycle cows using an Affymetrix Bovine Gene v1 Array. The normalized linear microarray data was uploaded to the NCBI GEO repository (GSE94069). Subsequent statistical ana...

  18. AhR-mediated gene expression in the developing mouse telencephalon.

    PubMed

    Gohlke, Julia M; Stockton, Pat S; Sieber, Stella; Foley, Julie; Portier, Christopher J

    2009-11-01

    We hypothesize that TCDD-induced developmental neurotoxicity is modulated through an AhR-dependent interaction with key regulatory neuronal differentiation pathways during telencephalon development. To test this hypothesis we examined global gene expression in both dorsal and ventral telencephalon tissues in E13.5 AhR-/- and wildtype mice exposed to TCDD or vehicle. Consistent with previous biochemical, pathological and behavioral studies, our results suggest TCDD initiated changes in gene expression in the developing telencephalon are primarily AhR-dependent, as no statistically significant gene expression changes are evident after TCDD exposure in AhR-/- mice. Based on a gene regulatory network for neuronal specification in the developing telencephalon, the present analysis suggests differentiation of GABAergic neurons in the ventral telencephalon is compromised in TCDD exposed and AhR-/- mice. In addition, our analysis suggests Sox11 may be directly regulated by AhR based on gene expression and comparative genomics analyses. In conclusion, this analysis supports the hypothesis that AhR has a specific role in the normal development of the telencephalon and provides a mechanistic framework for neurodevelopmental toxicity of chemicals that perturb AhR signaling.

  19. [The use of expressive writing in the course of care for cancer patients to reduce emotional distress: analysis of the literature].

    PubMed

    Gallo, Isabella; Garrino, Lorenza; Di Monte, Valerio

    2015-01-01

    Emotional distress is one of the symptoms most frequently reported by cancer patients undergoing therapy, and it increases the risk of developing depressive illness. Through an analysis of the literature, we assessed whether expressive writing, compared with neutral writing, reduces emotional distress in cancer patients along their care pathway. The bibliographic search was conducted using the CINAHL, PubMed, Cochrane Library and PsycInfo databases. The 7 randomized controlled trials identified, including 3 pilot studies, showed that after expressive writing sessions (experimental group) versus neutral writing (control group) there was a significant reduction in distress in the experimental group among patients in the early stages of cancer (p = 0.0183); in patients with a metastatic diagnosis assigned to the expressive writing group there was a statistically significant reduction in mood disorders (p = 0.03). Statistically significant group differences were also found for some measures of sleep quality (p = 0.04). Expressive writing did not produce significant reductions in psychological distress or improvements in physical health (p > 0.20) in patients diagnosed with long-standing metastatic disease, and in palliative care the feasibility results pointed to poor adherence at follow-up. The results indicate that expressive writing strategies improve the management of the disease and reduce tumor-related physical and psychological symptoms, while reducing emotional distress in patients at an early stage of the disease.

  20. Lewis x is highly expressed in normal tissues: a comparative immunohistochemical study and literature revision.

    PubMed

    Croce, María V; Isla-Larrain, Marina; Rabassa, Martín E; Demichelis, Sandra; Colussi, Andrea G; Crespo, Marina; Lacunza, Ezequiel; Segal-Eiras, Amada

    2007-01-01

    An immunohistochemical analysis was employed to determine the expression of carbohydrate antigens associated to mucins in normal epithelia. Tissue samples were obtained as biopsies from normal breast (18), colon (35) and oral cavity mucosa (8). The following carbohydrate epitopes were studied: sialyl-Lewis x, Lewis x, Lewis y, Tn hapten, sialyl-Tn and Thomsen-Friedenreich antigen. Mucins were also studied employing antibodies against MUC1, MUC2, MUC4, MUC5AC, MUC6 and also normal colonic glycolipid. Statistical analysis was performed and Kendall correlations were obtained. Lewis x showed an apical pattern mainly at plasma membrane, although cytoplasmic staining was also found in most samples. TF, Tn and sTn haptens were detected in few specimens, while sLewis x was found in oral mucosa and breast tissue. Also, normal breast expressed MUC1 at a high percentage, whereas MUC4 was observed in a small number of samples. Colon specimens mainly expressed MUC2 and MUC1, while most oral mucosa samples expressed MUC4 and MUC1. A positive correlation between MUC1VNTR and TF epitope (r=0.396) was found in breast samples, while in colon specimens MUC2 and colonic glycolipid versus Lewis x were statistically significantly correlated (r=0.28 and r=0.29, respectively). As a conclusion, a defined carbohydrate epitope expression is not exclusive of normal tissue or a determined localization, and it is possible to assume that different glycoproteins and glycolipids may be carriers of carbohydrate antigens depending on the tissue localization considered.
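
    The abstract reports Kendall correlations between staining results without saying which variant was used. As a minimal sketch, the Python function below computes Kendall's tau-b, a tie-corrected variant often chosen for graded scores such as immunohistochemical staining. The function name and the staining scores are hypothetical and are not data from the study.

```python
from math import sqrt

def kendall_tau_b(x, y):
    """Kendall's tau-b rank correlation with correction for tied pairs."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    n = len(x)
    concordant = discordant = tied_x = tied_y = 0
    for i in range(n - 1):
        for j in range(i + 1, n):
            dx = x[i] - x[j]
            dy = y[i] - y[j]
            if dx == 0:
                tied_x += 1
            if dy == 0:
                tied_y += 1
            if dx * dy > 0:
                concordant += 1
            elif dx * dy < 0:
                discordant += 1
    n_pairs = n * (n - 1) // 2
    denominator = sqrt((n_pairs - tied_x) * (n_pairs - tied_y))
    if denominator == 0:
        raise ValueError("tau-b is undefined when one variable is constant")
    return (concordant - discordant) / denominator

# Hypothetical staining scores (0 = negative ... 3 = strong) for two markers.
marker_a = [0, 1, 1, 2, 3, 3]
marker_b = [0, 0, 1, 2, 2, 3]
tau = kendall_tau_b(marker_a, marker_b)
print(round(tau, 3))
```

    A rank-based coefficient of this kind depends only on the ordering of the scores, which suits semi-quantitative staining grades better than a coefficient that assumes interval-scaled data.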

  1. The antenna transcriptome changes in mosquito Anopheles sinensis, pre- and post- blood meal.

    PubMed

    Chen, Qian; Pei, Di; Li, Jianyong; Jing, Chengyu; Wu, Wenjian; Man, Yahui

    2017-01-01

    The antenna is the main chemosensory organ in mosquitoes. Characterization of the transcriptional changes after a blood meal, especially those related to chemoreception, may help to explain mosquito blood-sucking behavior and to identify novel targets for mosquito control. Anopheles sinensis is an Asiatic mosquito species which transmits malaria and lymphatic filariasis. However, studies on chemosensory biology in female An. sinensis are lacking. Here we report a transcriptome analysis of An. sinensis female antennae pre- and post-blood meal. We created six An. sinensis antenna RNA-seq libraries, three from females without a blood meal and three from females five hours after a blood meal. Illumina sequencing was conducted to analyze the transcriptome differences between the two groups. In total, the sequenced fragments yielded 21,643 genes, 1,828 of which were novel. Of these genes, 12,861 were considered to be expressed (FPKM >1.0) in at least one of the two groups, with 12,159 genes expressed in both groups. A total of 548 genes were differentially expressed in the blood-fed group, with 331 genes up-regulated and 217 genes down-regulated. GO enrichment analysis of the differentially expressed genes suggested that there were no statistically overrepresented GO terms among down-regulated genes in blood-fed mosquitoes, while the enriched GO terms of the up-regulated genes occurred mainly in metabolic process. For the chemosensory gene families, only subtle differences in expression levels were observed in our statistical analysis. Nevertheless, this first comprehensive identification of these chemosensory gene families in An. sinensis antennae will help to characterize the precise function of these proteins in odor recognition in mosquitoes. This study provides a first global view of the changes in transcript accumulation elicited by a blood meal in An. sinensis female antennae.
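
    The expression filter described in this abstract (FPKM above 1.0 in at least one group) can be sketched in a few lines of Python. The gene names and FPKM values below are hypothetical, and summarizing each group by a single FPKM value per gene is an assumption, since the abstract does not say how replicates were aggregated. The final lines only restate arithmetic implied by the reported counts.

```python
FPKM_CUTOFF = 1.0  # expression threshold quoted in the abstract

def classify_expression(fpkm_unfed, fpkm_fed, cutoff=FPKM_CUTOFF):
    """Label a gene by the groups in which its FPKM exceeds the cutoff."""
    in_unfed = fpkm_unfed > cutoff
    in_fed = fpkm_fed > cutoff
    if in_unfed and in_fed:
        return "both"
    if in_unfed or in_fed:
        return "one_group"
    return "not_expressed"

# Hypothetical per-group FPKM values: (non-blood-fed, blood-fed).
genes = {"gene_A": (5.2, 7.9), "gene_B": (0.4, 2.1), "gene_C": (0.1, 0.3)}
labels = {name: classify_expression(*fpkm) for name, fpkm in genes.items()}
print(labels)

# Counts reported in the abstract.
expressed_any, expressed_both = 12861, 12159
only_one_group = expressed_any - expressed_both  # expressed in exactly one group
up_regulated, down_regulated = 331, 217
print(only_one_group, up_regulated + down_regulated)
```

    The reported counts imply that 702 genes passed the cutoff in only one of the two groups, and the up- and down-regulated counts sum to the 548 differentially expressed genes.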

  2. [Expressions of EMMPRIN and its ligand CyPA in gingival crevicular fluid of chronic periodontitis patients].

    PubMed

    He, Yan-ping; Xie, Ming; Jiao, Ting

    2016-02-01

    To detect the expressions of EMMPRIN and its ligand CyPA in gingival crevicular fluid (GCF) of chronic periodontitis (CP) patients and explore their possible relation to the status of periodontal inflammation. GCF of CP patients (group CP) and periodontitis-free patients with intact dentition (the control group) was collected and assayed for EMMPRIN and CyPA expressions by ELISA. The clinical periodontal status of these patients was examined. Statistical analysis was performed using the SPSS 17.0 software package. Spearman's correlation analysis was used to determine the relationships between the expressions of EMMPRIN and CyPA in GCF and the clinical parameters. In addition, analysis of variance (ANOVA) was used to compare group CP and the control group. In group CP, GCF volume was positively correlated with EMMPRIN total amount, CyPA total amount and some clinical periodontal indexes (GI, SBI, AL). EMMPRIN total amount was positively correlated with GCF volume, CyPA total amount and some clinical periodontal indexes (GI, SBI, AL), but it was negatively correlated with smoking status (P<0.05). Moreover, CyPA total amount was positively correlated with GCF volume, EMMPRIN total amount and some clinical periodontal indexes (GI, SBI, AL). In the control group, there were significant positive correlations among GCF volume, EMMPRIN total amount and CyPA total amount. The differences in GCF, EMMPRIN and CyPA between the 2 groups were statistically significant (P<0.05). EMMPRIN and its ligand CyPA were detected in the GCF of both periodontitis-free patients with intact dentition and CP patients. As periodontal inflammation progresses, GCF secretion increases, as do the expressions of EMMPRIN and CyPA in GCF.

  3. A knowledge-based T2-statistic to perform pathway analysis for quantitative proteomic data

    PubMed Central

    Chen, Yi-Hau

    2017-01-01

    Approaches to identify significant pathways from high-throughput quantitative data have been developed in recent years. Still, the analysis of proteomic data stays difficult because of limited sample size. This limitation also leads to the practice of using a competitive null as common approach; which fundamentally implies genes or proteins as independent units. The independent assumption ignores the associations among biomolecules with similar functions or cellular localization, as well as the interactions among them manifested as changes in expression ratios. Consequently, these methods often underestimate the associations among biomolecules and cause false positives in practice. Some studies incorporate the sample covariance matrix into the calculation to address this issue. However, sample covariance may not be a precise estimation if the sample size is very limited, which is usually the case for the data produced by mass spectrometry. In this study, we introduce a multivariate test under a self-contained null to perform pathway analysis for quantitative proteomic data. The covariance matrix used in the test statistic is constructed by the confidence scores retrieved from the STRING database or the HitPredict database. We also design an integrating procedure to retain pathways of sufficient evidence as a pathway group. The performance of the proposed T2-statistic is demonstrated using five published experimental datasets: the T-cell activation, the cAMP/PKA signaling, the myoblast differentiation, and the effect of dasatinib on the BCR-ABL pathway are proteomic datasets produced by mass spectrometry; and the protective effect of myocilin via the MAPK signaling pathway is a gene expression dataset of limited sample size. Compared with other popular statistics, the proposed T2-statistic yields more accurate descriptions in agreement with the discussion of the original publication. 
We implemented the T2-statistic into an R package T2GA, which is available at https://github.com/roqe/T2GA. PMID:28622336

  4. A knowledge-based T2-statistic to perform pathway analysis for quantitative proteomic data.

    PubMed

    Lai, En-Yu; Chen, Yi-Hau; Wu, Kun-Pin

    2017-06-01

    Approaches to identify significant pathways from high-throughput quantitative data have been developed in recent years. Still, the analysis of proteomic data stays difficult because of limited sample size. This limitation also leads to the practice of using a competitive null as common approach; which fundamentally implies genes or proteins as independent units. The independent assumption ignores the associations among biomolecules with similar functions or cellular localization, as well as the interactions among them manifested as changes in expression ratios. Consequently, these methods often underestimate the associations among biomolecules and cause false positives in practice. Some studies incorporate the sample covariance matrix into the calculation to address this issue. However, sample covariance may not be a precise estimation if the sample size is very limited, which is usually the case for the data produced by mass spectrometry. In this study, we introduce a multivariate test under a self-contained null to perform pathway analysis for quantitative proteomic data. The covariance matrix used in the test statistic is constructed by the confidence scores retrieved from the STRING database or the HitPredict database. We also design an integrating procedure to retain pathways of sufficient evidence as a pathway group. The performance of the proposed T2-statistic is demonstrated using five published experimental datasets: the T-cell activation, the cAMP/PKA signaling, the myoblast differentiation, and the effect of dasatinib on the BCR-ABL pathway are proteomic datasets produced by mass spectrometry; and the protective effect of myocilin via the MAPK signaling pathway is a gene expression dataset of limited sample size. Compared with other popular statistics, the proposed T2-statistic yields more accurate descriptions in agreement with the discussion of the original publication. 
We implemented the T2-statistic into an R package T2GA, which is available at https://github.com/roqe/T2GA.

  5. Temporal scaling and spatial statistical analyses of groundwater level fluctuations

    NASA Astrophysics Data System (ADS)

    Sun, H.; Yuan, L., Sr.; Zhang, Y.

    2017-12-01

    Natural dynamics such as groundwater level fluctuations can exhibit multifractionality and/or multifractality, likely due to multi-scale aquifer heterogeneity and controlling factors, whose statistics require efficient quantification methods. This study explores multifractionality and non-Gaussian properties in groundwater dynamics expressed by time series of daily level fluctuation at three wells located in the lower Mississippi valley, after removing the seasonal cycle in the temporal scaling and spatial statistical analysis. First, using the time-scale multifractional analysis, a systematic statistical method is developed to analyze groundwater level fluctuations quantified by the time-scale local Hurst exponent (TS-LHE). Results show that the TS-LHE does not remain constant, implying that the fractal-scaling behavior changes with time and location. Hence, we can distinguish the potentially location-dependent scaling feature, which may characterize the hydrological dynamic system. Second, spatial statistical analysis shows that the increment of groundwater level fluctuations exhibits a heavy-tailed, non-Gaussian distribution, which can be better quantified by a Lévy stable distribution. Monte Carlo simulations of the fluctuation process also show that the linear fractional stable motion model can well depict the transient dynamics (i.e., fractal non-Gaussian property) of groundwater level, while fractional Brownian motion is inadequate to describe natural processes with anomalous dynamics. Analysis of temporal scaling and spatial statistics therefore may provide useful information and quantification to understand further the nature of complex dynamics in hydrology.

  6. [Inhibitory effect of nimesulide and oxaliplatin on tumor growth and lymphatic metastasis of transplanted human lung cancer in nude mice].

    PubMed

    Lang, Zhe; Chen, Gang; Wang, Dong-chang

    2012-10-01

    This study was designed to evaluate the inhibitory effect of nimesulide in combination with oxaliplatin on tumor growth, expression of COX-2, VEGF-C, VEGFR-3, survivin and β-catenin, and lymphatic metastasis in lung cancer xenograft in nude mice, and to discuss the possible synergistic effect of nimesulide in combination with oxaliplatin. Human lung cancer A549 cells were injected into BALB/c nude mice subcutaneously. Thirty-three healthy male nude mice were randomly divided into 4 groups: the control group, nimesulide group, oxaliplatin group and nimesulide combined with oxaliplatin group. Transplanted tumor tissues were collected and the expressions of COX-2, VEGF-C, VEGFR-3, survivin, β-catenin protein were detected by immunohistochemistry, and RT-PCR assay was used to assess the expression of tumor COX-2, VEGF-C, VEGFR-3, survivin and β-catenin mRNA. SPSS 16.0 was used for statistical analysis. Data were presented as mean ± standard deviation (x̄ ± s), and the means were compared by analysis of variance. Tumor inhibition rates of the nimesulide group, oxaliplatin group and nimesulide + oxaliplatin group were 39.73%, 48.04% and 65.94%, respectively. Immunohistochemical and RT-PCR analysis showed that compared with the control group, the expression levels of COX-2, VEGF-C, VEGFR-3, survivin and β-catenin of the nimesulide group were significantly reduced (all P < 0.05). Compared with the control group, statistical analysis of variance showed that the expression levels of COX-2, VEGF-C and VEGFR-3 of the oxaliplatin group were significantly increased (P < 0.05), the expression levels of survivin and β-catenin protein and mRNA of the oxaliplatin group were significantly reduced (P < 0.05). Compared with the control group, the expression levels of COX-2, VEGF-C, VEGFR-3, survivin and β-catenin of the nimesulide + oxaliplatin group were significantly reduced (all P < 0.05). Nimesulide, alone or in combination with oxaliplatin, can significantly inhibit the growth of lung cancer xenografts in nude mice and the expression levels of COX-2, VEGF-C, VEGFR-3, survivin and β-catenin. Oxaliplatin can significantly inhibit the growth of lung cancer xenografts in nude mice, and the expression of survivin and β-catenin. Nimesulide in combination with oxaliplatin enhances the antitumor effect of oxaliplatin.

  7. TRAM (Transcriptome Mapper): database-driven creation and analysis of transcriptome maps from multiple sources

    PubMed Central

    2011-01-01

    Background Several tools have been developed to perform global gene expression profile data analysis, to search for specific chromosomal regions whose features meet defined criteria as well as to study neighbouring gene expression. However, most of these tools are tailored for a specific use in a particular context (e.g. they are species-specific, or limited to a particular data format) and they typically accept only gene lists as input. Results TRAM (Transcriptome Mapper) is a new general tool that allows the simple generation and analysis of quantitative transcriptome maps, starting from any source listing gene expression values for a given gene set (e.g. expression microarrays), implemented as a relational database. It includes a parser able to assign univocal and updated gene symbols to gene identifiers from different data sources. Moreover, TRAM is able to perform intra-sample and inter-sample data normalization, including an original variant of quantile normalization (scaled quantile), useful to normalize data from platforms with highly different numbers of investigated genes. When in 'Map' mode, the software generates a quantitative representation of the transcriptome of a sample (or of a pool of samples) and identifies if segments of defined lengths are over/under-expressed compared to the desired threshold. When in 'Cluster' mode, the software searches for a set of over/under-expressed consecutive genes. Statistical significance for all results is calculated with respect to genes localized on the same chromosome or to all genome genes. Transcriptome maps, showing differential expression between two sample groups, relative to two different biological conditions, may be easily generated. We present the results of a biological model test, based on a meta-analysis comparison between a sample pool of human CD34+ hematopoietic progenitor cells and a sample pool of megakaryocytic cells. 
    Biologically relevant chromosomal segments and gene clusters with differential expression during the differentiation toward megakaryocyte were identified. Conclusions TRAM is designed to create, and statistically analyze, quantitative transcriptome maps, based on gene expression data from multiple sources. The release includes a FileMaker Pro database management runtime application and is freely available at http://apollo11.isto.unibo.it/software/, along with preconfigured implementations for mapping of human, mouse and zebrafish transcriptomes. PMID:21333005
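
    For readers unfamiliar with quantile normalization, the inter-sample step mentioned above can be sketched in its standard form. The sketch below is plain quantile normalization in Python and requires every sample to cover the same number of genes; TRAM's "scaled quantile" variant, which the abstract says handles platforms with very different gene counts, is not reproduced because its details are not given. The function name is hypothetical and ties are broken by position rather than averaged.

```python
from typing import List, Sequence


def quantile_normalize(samples: Sequence[Sequence[float]]) -> List[List[float]]:
    """Give every sample the same distribution: the mean of the sorted samples.

    samples: one sequence of expression values per sample, all of equal length,
             with position i referring to the same gene in every sample.
    """
    n_genes = len(samples[0])
    if any(len(s) != n_genes for s in samples):
        raise ValueError("all samples must have the same number of genes")
    sorted_samples = [sorted(s) for s in samples]
    # Reference distribution: average value at each rank across samples.
    reference = [sum(s[rank] for s in sorted_samples) / len(samples)
                 for rank in range(n_genes)]
    normalized = []
    for s in samples:
        order = sorted(range(n_genes), key=lambda i: s[i])  # gene indices by rank
        out = [0.0] * n_genes
        for rank, gene_index in enumerate(order):
            out[gene_index] = reference[rank]
        normalized.append(out)
    return normalized


# Two made-up samples measured on the same three genes.
normalized = quantile_normalize([[5.0, 2.0, 3.0], [4.0, 1.0, 6.0]])
```

    After normalization both samples contain exactly the same set of values, and only the gene-to-rank assignment differs between them.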

  8. Human cytosolic glutathione-S-transferases: quantitative analysis of expression, comparative analysis of structures and inhibition strategies of isozymes involved in drug resistance.

    PubMed

    Mohana, Krishnamoorthy; Achary, Anant

    2017-08-01

    Glutathione-S-transferase (GST) inhibition is a strategy to overcome drug resistance. Several isoforms of human GSTs are present and they are expressed in almost all the organs. Specific expression levels of GSTs in various organs are collected from the human transcriptome data and analysis of the organ-specific expression of GST isoforms is carried out. The variations in the expression levels of GST isoforms are statistically significant. GST expression differs in diseased conditions, as reported by many investigators, and some GST isoforms are disease markers or drug targets. Structure analysis of various isoforms is carried out and literature mining has been performed to identify the differences in the active sites of the GSTs. The xenobiotic binding H site is classified into H1, H2, and H3 and the differences in the amino acid composition, the hydrophobicity and other structural features of the H site of GSTs are discussed. The existing inhibition strategies are compared. With the advent of rational drug design, mechanism-based inhibition strategies, and the availability of high-throughput screening, target-specific and selective inhibition of GST isoforms involved in drug resistance could be achieved for the reversal of drug resistance and could aid in the treatment of diseases.

  9. Discrimination between smiling faces: Human observers vs. automated face analysis.

    PubMed

    Del Líbano, Mario; Calvo, Manuel G; Fernández-Martín, Andrés; Recio, Guillermo

    2018-05-11

    This study investigated (a) how prototypical happy faces (with happy eyes and a smile) can be discriminated from blended expressions with a smile but non-happy eyes, depending on type and intensity of the eye expression; and (b) how smile discrimination differs for human perceivers versus automated face analysis, depending on affective valence and morphological facial features. Human observers categorized faces as happy or non-happy, or rated their valence. Automated analysis (FACET software) computed seven expressions (including joy/happiness) and 20 facial action units (AUs). Physical properties (low-level image statistics and visual saliency) of the face stimuli were controlled. Results revealed, first, that some blended expressions (especially, with angry eyes) had lower discrimination thresholds (i.e., they were identified as "non-happy" at lower non-happy eye intensities) than others (especially, with neutral eyes). Second, discrimination sensitivity was better for human perceivers than for automated FACET analysis. As an additional finding, affective valence predicted human discrimination performance, whereas morphological AUs predicted FACET discrimination. FACET can be a valid tool for categorizing prototypical expressions, but is currently more limited than human observers for discrimination of blended expressions. Configural processing facilitates detection of in/congruence(s) across regions, and thus detection of non-genuine smiling faces (due to non-happy eyes). Copyright © 2018 Elsevier B.V. All rights reserved.

  10. Modulation of intestinal gene expression by dietary zinc status: Effectiveness of cDNA arrays for expression profiling of a single nutrient deficiency

    PubMed Central

    Blanchard, Raymond K.; Moore, J. Bernadette; Green, Calvert L.; Cousins, Robert J.

    2001-01-01

    Mammalian nutritional status affects the homeostatic balance of multiple physiological processes and their associated gene expression. Although DNA array analysis can monitor large numbers of genes, there are no reports of expression profiling of a micronutrient deficiency in an intact animal system. In this report, we have tested the feasibility of using cDNA arrays to compare the global changes in expression of genes of known function that occur in the early stages of rodent zinc deficiency. The gene-modulating effects of this deficiency were demonstrated by real-time quantitative PCR measurements of altered mRNA levels for metallothionein 1, zinc transporter 2, and uroguanylin, all of which have been previously documented as zinc-regulated genes. As a result of the low level of inherent noise within this model system and application of a recently reported statistical tool for statistical analysis of microarrays [Tusher, V.G., Tibshirani, R. & Chu, G. (2001) Proc. Natl. Acad. Sci. USA 98, 5116–5121], we demonstrate the ability to reproducibly identify the modest changes in mRNA abundance produced by this single micronutrient deficiency. Among the genes identified by this array profile are intestinal genes that influence signaling pathways, growth, transcription, redox, and energy utilization. Additionally, the influence of dietary zinc supply on the expression of some of these genes was confirmed by real-time quantitative PCR. Overall, these data support the effectiveness of cDNA array expression profiling to investigate the pleiotropic effects of specific nutrients and may provide an approach to establishing markers for assessment of nutritional status. PMID:11717422
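
    The statistical tool cited above (Tusher, Tibshirani and Chu, 2001) ranks genes by a "relative difference" that adds a small constant s0 to the denominator of a t-like statistic, so that genes with tiny variance do not dominate. A minimal Python sketch of that per-gene statistic follows. The permutation-based false discovery estimate of the full method is omitted, s0 is passed in as a plain parameter rather than chosen from the data, and the function name and input values are hypothetical.

```python
import math
from typing import Sequence


def relative_difference(group_a: Sequence[float],
                        group_b: Sequence[float],
                        s0: float = 0.0) -> float:
    """d = (mean_a - mean_b) / (s + s0), where s is the pooled standard error.

    With s0 = 0 this reduces to the ordinary two-sample t statistic.
    """
    n_a, n_b = len(group_a), len(group_b)
    mean_a = sum(group_a) / n_a
    mean_b = sum(group_b) / n_b
    ss_a = sum((x - mean_a) ** 2 for x in group_a)
    ss_b = sum((x - mean_b) ** 2 for x in group_b)
    scale = (1.0 / n_a + 1.0 / n_b) / (n_a + n_b - 2)
    s = math.sqrt(scale * (ss_a + ss_b))
    return (mean_a - mean_b) / (s + s0)


# Made-up log-expression values for one gene in two diet groups.
deficient = [1.0, 2.0, 3.0]
adequate = [4.0, 5.0, 6.0]

d_plain = relative_difference(deficient, adequate)             # ordinary t statistic
d_shrunk = relative_difference(deficient, adequate, s0=0.5)    # damped by s0
```

    A positive s0 pulls every score toward zero, and it does so most strongly for genes whose standard error is small relative to s0.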

  11. Statistical models for the analysis and design of digital polymerase chain reaction (dPCR) experiments

    USGS Publications Warehouse

    Dorazio, Robert; Hunter, Margaret

    2015-01-01

    Statistical methods for the analysis and design of experiments using digital PCR (dPCR) have received only limited attention and have been misused in many instances. To address this issue and to provide a more general approach to the analysis of dPCR data, we describe a class of statistical models for the analysis and design of experiments that require quantification of nucleic acids. These models are mathematically equivalent to generalized linear models of binomial responses that include a complementary, log–log link function and an offset that is dependent on the dPCR partition volume. These models are both versatile and easy to fit using conventional statistical software. Covariates can be used to specify different sources of variation in nucleic acid concentration, and a model’s parameters can be used to quantify the effects of these covariates. For purposes of illustration, we analyzed dPCR data from different types of experiments, including serial dilution, evaluation of copy number variation, and quantification of gene expression. We also showed how these models can be used to help design dPCR experiments, as in selection of sample sizes needed to achieve desired levels of precision in estimates of nucleic acid concentration or to detect differences in concentration among treatments with prescribed levels of statistical power.
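
    The link between the binomial model and the partition volume can be made concrete with a single-sample example. If molecules are distributed at random, a partition of volume v is positive with probability p = 1 - exp(-c * v) for concentration c, so cloglog(p) = log(-log(1 - p)) = log(c) + log(v), with log(v) acting as the offset. The Python sketch below only covers this covariate-free case; the function names and the partition counts and volume are assumptions for illustration, and a full analysis with covariates would use GLM software as the abstract describes.

```python
import math


def cloglog(p: float) -> float:
    """Complementary log-log link: log(-log(1 - p))."""
    return math.log(-math.log(1.0 - p))


def estimate_concentration(positives: int, partitions: int, volume: float) -> float:
    """Maximum-likelihood concentration (copies per unit volume) for one sample.

    Solves 1 - exp(-c * volume) = positives / partitions for c.
    """
    if not 0 < positives < partitions:
        raise ValueError("need at least one positive and one negative partition")
    p_hat = positives / partitions
    return -math.log(1.0 - p_hat) / volume


# Hypothetical run: 20000 partitions of 0.00085 microlitre, 5000 of them positive.
volume_ul = 0.00085
conc = estimate_concentration(positives=5000, partitions=20000, volume=volume_ul)

# The estimate satisfies the cloglog relation with log(volume) as offset.
lhs = cloglog(5000 / 20000)
rhs = math.log(conc) + math.log(volume_ul)
```

    Because the offset carries the partition volume, the fitted intercept is directly the log concentration, which is what makes covariate effects interpretable on the concentration scale.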

  12. Statistical Models for the Analysis and Design of Digital Polymerase Chain Reaction (dPCR) Experiments.

    PubMed

    Dorazio, Robert M; Hunter, Margaret E

    2015-11-03

    Statistical methods for the analysis and design of experiments using digital PCR (dPCR) have received only limited attention and have been misused in many instances. To address this issue and to provide a more general approach to the analysis of dPCR data, we describe a class of statistical models for the analysis and design of experiments that require quantification of nucleic acids. These models are mathematically equivalent to generalized linear models of binomial responses that include a complementary, log-log link function and an offset that is dependent on the dPCR partition volume. These models are both versatile and easy to fit using conventional statistical software. Covariates can be used to specify different sources of variation in nucleic acid concentration, and a model's parameters can be used to quantify the effects of these covariates. For purposes of illustration, we analyzed dPCR data from different types of experiments, including serial dilution, evaluation of copy number variation, and quantification of gene expression. We also showed how these models can be used to help design dPCR experiments, as in selection of sample sizes needed to achieve desired levels of precision in estimates of nucleic acid concentration or to detect differences in concentration among treatments with prescribed levels of statistical power.

  13. An extended data mining method for identifying differentially expressed assay-specific signatures in functional genomic studies.

    PubMed

    Rollins, Derrick K; Teh, Ailing

    2010-12-17

    Microarray data sets provide relative expression levels for thousands of genes for a small number, in comparison, of different experimental conditions called assays. Data mining techniques are used to extract specific information of genes as they relate to the assays. The multivariate statistical technique of principal component analysis (PCA) has proven useful in providing effective data mining methods. This article extends the PCA approach of Rollins et al. to the development of ranking genes of microarray data sets that express most differently between two biologically different groupings of assays. This method is evaluated on real and simulated data and compared to a current approach on the basis of false discovery rate (FDR) and statistical power (SP), which is the ability to correctly identify important genes. This work developed and evaluated two new test statistics based on PCA and compared them to a popular method that is not PCA based. Both test statistics were found to be effective as evaluated in three case studies: (i) exposing E. coli cells to two different ethanol levels; (ii) application of myostatin to two groups of mice; and (iii) a simulated data study derived from the properties of (ii). The proposed method (PM) effectively identified critical genes in these studies based on comparison with the current method (CM). The simulation study supports higher identification accuracy for PM over CM for both proposed test statistics when the gene variance is constant and for one of the test statistics when the gene variance is non-constant. PM compares quite favorably to CM in terms of lower FDR and much higher SP. Thus, PM can be quite effective in producing accurate signatures from large microarray data sets for differential expression between assay groups identified in a preliminary step of the PCA procedure and is, therefore, recommended for use in these applications.

  14. Linnorm: improved statistical analysis for single cell RNA-seq expression data

    PubMed Central

    Yip, Shun H.; Wang, Panwen; Kocher, Jean-Pierre A.; Sham, Pak Chung

    2017-01-01

    Linnorm is a novel normalization and transformation method for the analysis of single cell RNA sequencing (scRNA-seq) data. Linnorm is developed to remove technical noise and simultaneously preserve biological variation in scRNA-seq data, such that existing statistical methods can be improved. Using real scRNA-seq data, we compared Linnorm with existing normalization methods, including NODES, SAMstrt, SCnorm, scran, DESeq and TMM. Linnorm shows advantages in speed, technical noise removal and preservation of cell heterogeneity, which can improve existing methods in the discovery of novel subtypes, pseudo-temporal ordering of cells, clustering analysis, etc. Linnorm also performs better than existing DEG analysis methods, including BASiCS, NODES, SAMstrt, Seurat and DESeq2, in false positive rate control and accuracy. PMID:28981748

  15. MALDI-TOF Mass Spectrometry Enables a Comprehensive and Fast Analysis of Dynamics and Qualities of Stress Responses of Lactobacillus paracasei subsp. paracasei F19

    PubMed Central

    Schott, Ann-Sophie; Behr, Jürgen; Quinn, Jennifer; Vogel, Rudi F.

    2016-01-01

    Lactic acid bacteria (LAB) are widely used as starter cultures in the manufacture of foods. Upon preparation, these cultures undergo various stresses resulting in losses of survival and fitness. In order to find conditions for the subsequent identification of proteomic biomarkers and their exploitation for preconditioning of strains, we subjected Lactobacillus (Lb.) paracasei subsp. paracasei TMW 1.1434 (F19) to different stress qualities (osmotic stress, oxidative stress, temperature stress, pH stress and starvation stress). We analysed the dynamics of its stress responses based on the expression of stress proteins using MALDI-TOF mass spectrometry (MS), which has so far been used for species identification. Exploiting the methodology of accumulating protein expression profiles by MALDI-TOF MS followed by the statistical evaluation with cluster analysis and discriminant analysis of principal components (DAPC), it was possible to monitor the expression of low molecular weight stress proteins, identify a specific time point when the expression of stress proteins reached its maximum, and statistically differentiate types of adaptive responses into groups. Beyond the specific result for F19 and its stress response, these results demonstrate the discriminatory power of MALDI-TOF MS to characterize even dynamics of stress responses of bacteria and enable a knowledge-based focus on the laborious identification of biomarkers and stress proteins. To our knowledge, the implementation of MALDI-TOF MS protein profiling for the fast and comprehensive analysis of various stress responses is new to the field of bacterial stress responses. Consequently, we generally propose MALDI-TOF MS as an easy and quick method to characterize responses of microbes to different environmental conditions, to focus efforts of more elaborate approaches on time points and dynamics of stress responses. PMID:27783652

  16. Simultaneous measurement of cerebral blood flow and mRNA signals: pixel-based inter-modality correlational analysis.

    PubMed

    Zhao, W; Busto, R; Truettner, J; Ginsberg, M D

    2001-07-30

    The analysis of pixel-based relationships between local cerebral blood flow (LCBF) and mRNA expression can reveal important insights into brain function. Traditionally, LCBF and in situ hybridization studies for genes of interest have been analyzed in separate series. To overcome this limitation and to increase the power of statistical analysis, this study focused on developing a double-label method to measure local cerebral blood flow (LCBF) and gene expressions simultaneously by means of a dual-autoradiography procedure. A 14C-iodoantipyrine autoradiographic LCBF study was first performed. Serial brain sections (12 in this study) were obtained at multiple coronal levels and were processed in the conventional manner to yield quantitative LCBF images. Two replicate sections at each bregma level were then used for in situ hybridization. To eliminate the 14C-iodoantipyrine from these sections, a chloroform-washout procedure was first performed. The sections were then processed for in situ hybridization autoradiography for the probes of interest. This method was tested in Wistar rats subjected to 12 min of global forebrain ischemia by two-vessel occlusion plus hypotension, followed by 2 or 6 h of reperfusion (n=4-6 per group). LCBF and in situ hybridization images for heat shock protein 70 (HSP70) were generated for each rat, aligned by disparity analysis, and analyzed on a pixel-by-pixel basis. This method yielded detailed inter-modality correlation between LCBF and HSP70 mRNA expressions. The advantages of this method include reducing the number of experimental animals by one-half; and providing accurate pixel-based correlations between different modalities in the same animals, thus enabling paired statistical analyses. This method can be extended to permit correlation of LCBF with the expression of multiple genes of interest.

  17. Analysis of expression and prognostic significance of vimentin and the response to temozolomide in glioma patients.

    PubMed

    Lin, Lin; Wang, Guangzhi; Ming, Jianguang; Meng, Xiangqi; Han, Bo; Sun, Bo; Cai, Jinquan; Jiang, Chuanlu

    2016-11-01

    Gliomas are the most common primary intracranial malignant tumors in adults. Surgical resection followed by optional radiotherapy and chemotherapy is the current standard therapy for glioma patients. Vimentin, a protein of the intermediate filament family, can maintain cellular integrity and participate in several cell signaling pathways to modulate the motility and invasion of cancer cells. The purpose of the present research was to identify the relationship between vimentin expression and clinical characteristics and to assess the prognostic and predictive ability of vimentin in patients with glioma. To determine the expression of vimentin in glioma tissues, paraffin-embedded blocks from glioma patients who underwent surgical resection were obtained and evaluated by immunohistochemistry. To further investigate the association of vimentin expression with survival, we employed mRNA expression of vimentin genes from the Chinese Glioma Genome Atlas (CGGA) and the GSE16011 dataset. Kaplan-Meier analysis and a Cox regression model were used for statistical analysis. We detected positive vimentin staining in 84% of high-grade compared to 47% of low-grade glioma patients. Additionally, vimentin mRNA expression was correlated with glioma grade in both the CGGA and GSE16011 datasets. Patients with low vimentin expression had longer survival than those with high expression. In multivariate analysis, vimentin was an independent significant prognostic factor for high-grade glioma patients. We also identified that glioblastoma patients with low vimentin expression had a better response to temozolomide therapy. Vimentin expression has a significant association with tumor grade and overall survival of high-grade glioma patients. Patients with low vimentin expression may benefit from temozolomide therapy.
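
    The Kaplan-Meier analysis mentioned above estimates survival as a running product over event times: at each time with d deaths among r patients still at risk, the survival estimate is multiplied by (1 - d / r), and censored patients simply leave the risk set. A minimal Python sketch follows; the follow-up times and event flags are invented for illustration and have nothing to do with the study's data, and the function name is hypothetical.

```python
from typing import List, Sequence, Tuple


def kaplan_meier(times: Sequence[float],
                 events: Sequence[int]) -> List[Tuple[float, float]]:
    """Product-limit survival estimate.

    times:  follow-up time for each patient.
    events: 1 if the patient died at that time, 0 if censored.
    Returns (time, survival) pairs at each distinct time with at least one death.
    """
    data = sorted(zip(times, events))
    at_risk = len(data)
    survival = 1.0
    curve: List[Tuple[float, float]] = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = 0
        leaving = 0
        # Collect everyone whose follow-up ends at time t.
        while i < len(data) and data[i][0] == t:
            deaths += data[i][1]
            leaving += 1
            i += 1
        if deaths:
            survival *= 1.0 - deaths / at_risk
            curve.append((t, survival))
        at_risk -= leaving
    return curve


# Five hypothetical patients: months of follow-up and death indicator.
curve = kaplan_meier(times=[2, 3, 3, 5, 8], events=[1, 1, 0, 1, 0])
```

    Comparing two such curves (for example, low versus high expression) is then typically done with a log-rank test or a Cox model, as in the abstract.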

  18. [Again review of research design and statistical methods of Chinese Journal of Cardiology].

    PubMed

    Kong, Qun-yu; Yu, Jin-ming; Jia, Gong-xian; Lin, Fan-li

    2012-11-01

    The aim was to re-evaluate and compare the research design and the use of statistical methods in the Chinese Journal of Cardiology. We summarized the research design and statistical methods of all original papers published in the Chinese Journal of Cardiology in 2011 and compared the results with the 2008 evaluation. (1) There was no difference in the distribution of research designs between the two volumes. Compared with the earlier volume, the use of survival regression and non-parametric tests increased, while the proportion of articles with no statistical analysis decreased. (2) The proportions of flawed articles in the later volume were significantly lower than in the former: 6 (4%) had flaws in design, 5 (3%) had flaws in expression, and 9 (5%) had incomplete analysis. (3) The rate of correct use of analysis of variance increased, as did that of multi-group comparisons and tests of normality. The error rate of usage decreased from 17% to 25%, without statistical significance, owing to neglect of the test of homogeneity of variance. The Chinese Journal of Cardiology showed many improvements, such as in the regulation of design and statistics. More attention should be paid to homogeneity of variance in future applications.

  19. Decreased expression of class III β-tubulin is associated with unfavourable prognosis in patients with malignant melanoma.

    PubMed

    Shimizu, Akira; Kaira, Kyoichi; Yasuda, Masahito; Asao, Takayuki; Ishikawa, Osamu

    2016-02-01

    Class III β-tubulin (TUBB3) has been recognized as being associated with resistance to taxane-based regimens in several cancers. However, little is known about the clinicopathological significance of TUBB3 expression in patients with cutaneous malignant melanoma. The aim of this study was to examine the prognostic significance of TUBB3 expression in cutaneous malignant melanoma. A total of 106 patients with surgically resected cutaneous malignant melanoma were assessed. Tumour sections were immunohistochemically stained for TUBB3, Ki-67 and microvessel density with CD34. TUBB3 was highly expressed in 80% (85/106) of patients. No statistically significant relationship was observed between the high expression of TUBB3 and any variables. On univariate analysis, ulceration, disease stage, TUBB3 and CD34 revealed a significant relationship with overall survival and progression-free survival. Multivariate analysis confirmed that a low TUBB3 expression was an independent prognostic factor for poor prognosis of cutaneous malignant melanoma. The decreased expression of TUBB3 could be a significant marker for predicting unfavourable prognosis in patients with cutaneous malignant melanoma.

  20. Identification of new participants in the rainbow trout (Oncorhynchus mykiss) oocyte maturation and ovulation processes using cDNA microarrays

    PubMed Central

    Bobe, Julien; Montfort, Jerôme; Nguyen, Thaovi; Fostier, Alexis

    2006-01-01

    Background The hormonal control of oocyte maturation and ovulation as well as the molecular mechanisms of nuclear maturation have been thoroughly studied in fish. In contrast, the other molecular events occurring in the ovary during post-vitellogenesis have received far less attention. Methods Nylon microarrays displaying 9152 rainbow trout cDNAs were hybridized using RNA samples originating from ovarian tissue collected during late vitellogenesis, post-vitellogenesis and oocyte maturation. Differentially expressed genes were identified using a statistical analysis. A supervised clustering analysis was performed using only differentially expressed genes in order to identify gene clusters exhibiting similar expression profiles. In addition, specific genes were selected and their preovulatory ovarian expression was analyzed using real-time PCR. Results From the statistical analysis, 310 differentially expressed genes were identified. Among those genes, 90 were up-regulated at the time of oocyte maturation while 220 exhibited an opposite pattern. After clustering analysis, 90 clones belonging to 3 gene clusters exhibiting the most remarkable expression patterns were kept for further analysis. Using real-time PCR analysis, we observed a strong up-regulation of ion and water transport genes such as aquaporin 4 (aqp4) and pendrin (slc26). In addition, a dramatic up-regulation of vasotocin (avt) gene was observed. Furthermore, angiotensin-converting-enzyme 2 (ace2), coagulation factor V (cf5), adam 22, and the chemokine cxcl14 genes exhibited a sharp up-regulation at the time of oocyte maturation. Finally, ovarian aromatase (cyp19a1) exhibited a dramatic down-regulation over the post-vitellogenic period while a down-regulation of Cytidine monophosphate-N-acetylneuraminic acid hydroxylase (cmah) was observed at the time of oocyte maturation. 
    Conclusion We showed the over- or under-expression of more than 300 genes, most of them being previously unstudied or unknown in the fish preovulatory ovary. Our data confirmed the down-regulation of estrogen synthesis genes during the preovulatory period. In addition, the strong up-regulation of aqp4 and slc26 genes prior to ovulation suggests their participation in the oocyte hydration process occurring at that time. Furthermore, among the most up-regulated clones, several genes such as cxcl14, ace2, adam22, cf5 have pro-inflammatory, vasodilatory, proteolytic and coagulatory functions. The identity and expression patterns of those genes support the theory comparing ovulation to an inflammatory-like reaction. PMID:16872517

  1. Differentiation of five body fluids from forensic samples by expression analysis of four microRNAs using quantitative PCR.

    PubMed

    Sauer, Eva; Reinke, Ann-Kathrin; Courts, Cornelius

    2016-05-01

    Applying molecular genetic approaches for the identification of forensically relevant body fluids, which often yield crucial information for the reconstruction of a potential crime, is a current topic of forensic research. Due to their body fluid specific expression patterns and stability against degradation, microRNAs (miRNA) emerged as a promising molecular species, with a range of candidate markers published. The analysis of miRNA via quantitative Real-Time PCR, however, should be based on a relevant strategy of normalization of non-biological variances to deliver reliable and biologically meaningful results. The work presented here is, as yet, the most comprehensive study of forensic body fluid identification via miRNA expression analysis based on a thoroughly validated qPCR procedure and unbiased statistical decision making to identify single source samples. Copyright © 2016 Elsevier Ireland Ltd. All rights reserved.

  2. HER2 Expression Status and Prognostic, Diagnostic, and Demographic Properties of Patients with Gastric Cancer: a Single Center Cohort Study from Iran

    PubMed

    Feizy, Abdolamir; Karami, Aida; Eghdamzamiri, Reza; Moghimi, Minoosh; Taheri, Hadi; Mousavinasab, Nouraddin

    2018-06-25

    Background: The fourth most prevalent cancer worldwide and a major cause of death in developing countries is gastric cancer (GC). Human epidermal growth factor receptor 2 (HER2) is a proto-oncogene expressed in different solid tumors. This study aimed to evaluate possible associations of HER2 expression status with survival rate, age, sex, tumor grade, histopathological type, and primary tumor location in patients with GC. Methods: Subjects were enrolled in this cohort study after consideration of inclusion and exclusion criteria. Biopsy specimens were stained using immunohistochemistry. Samples with a score of 3+ were considered to exhibit HER2 overexpression. The mentioned variables were extracted from patients’ files as well as by clinical evaluation. The Kaplan-Meier method was applied for analyzing the survival rate and Chi square for possible factor associations. Results: A total of 210 patients (25.2% female and 74.8% male) were enrolled. In a 5-year follow-up (adherence rate: 45.7%), the average survival was 9.4±10.9 months. HER2 overexpression was evident in 24%. There was no statistically significant association found between HER2 expression and primary tumor location (p-value=0.63), histopathological type (p-value=0.72), or tumor grade (p-value=0.051). Furthermore, no statistically significant links were apparent with tumor grade in either male or female groups as well as patients aged ≥60 and <60 years (all p-values >0.05). Moreover, no statistically significant association was detected between HER2 expression status (p-value=0.88), sex (p-value=0.31), and age (p-value=0.055) with patient survival. Conclusions: No statistically meaningful association was found between any of the parameters examined and HER2 expression status. Divergence of the results from earlier studies might be due to genetic variation. Thus, performing a meta-analysis on certain races might be helpful for clarification. Creative Commons Attribution License

  3. Why weight? Modelling sample and observational level variability improves power in RNA-seq analyses

    PubMed Central

    Liu, Ruijie; Holik, Aliaksei Z.; Su, Shian; Jansz, Natasha; Chen, Kelan; Leong, Huei San; Blewitt, Marnie E.; Asselin-Labat, Marie-Liesse; Smyth, Gordon K.; Ritchie, Matthew E.

    2015-01-01

    Variations in sample quality are frequently encountered in small RNA-sequencing experiments, and pose a major challenge in a differential expression analysis. Removal of high variation samples reduces noise, but at a cost of reducing power, thus limiting our ability to detect biologically meaningful changes. Similarly, retaining these samples in the analysis may not reveal any statistically significant changes due to the higher noise level. A compromise is to use all available data, but to down-weight the observations from more variable samples. We describe a statistical approach that facilitates this by modelling heterogeneity at both the sample and observational levels as part of the differential expression analysis. At the sample level this is achieved by fitting a log-linear variance model that includes common sample-specific or group-specific parameters that are shared between genes. The estimated sample variance factors are then converted to weights and combined with observational level weights obtained from the mean–variance relationship of the log-counts-per-million using ‘voom’. A comprehensive analysis involving both simulations and experimental RNA-sequencing data demonstrates that this strategy leads to a universally more powerful analysis and fewer false discoveries when compared to conventional approaches. This methodology has wide application and is implemented in the open-source ‘limma’ package. PMID:25925576
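
    The down-weighting idea can be illustrated with a toy calculation. The sketch below is not the limma implementation: it assumes that a sample weight is the reciprocal of the estimated sample variance factor and that sample and observation weights combine by multiplication, neither of which the abstract spells out. All names and numbers are hypothetical.

```python
def combined_weights(obs_weights, sample_var_factors):
    """Scale each observation weight by 1 / (variance factor of its sample).

    obs_weights        -- one row per gene, one column per sample
    sample_var_factors -- one estimated variance factor per sample
    """
    sample_w = [1.0 / v for v in sample_var_factors]  # assumed conversion
    return [[w * s for w, s in zip(row, sample_w)] for row in obs_weights]


def weighted_mean(values, weights):
    """Weighted average of values."""
    return sum(v * w for v, w in zip(values, weights)) / sum(weights)


# Hypothetical data: one gene, four samples, the last sample is noisy.
log_cpm = [5.0, 5.2, 4.8, 9.0]
obs_w = [[1.0, 1.0, 1.0, 1.0]]
var_factors = [1.0, 1.0, 1.0, 4.0]

w = combined_weights(obs_w, var_factors)[0]
unweighted = weighted_mean(log_cpm, [1.0] * 4)
weighted = weighted_mean(log_cpm, w)
print(unweighted, weighted)
```

    The noisy sample still contributes to the estimate, but with a quarter of the influence of the others, which is the compromise between discarding and fully retaining it that the abstract describes.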

  4. Prolactin receptor expression in gynaecomastia and male breast carcinoma.

    PubMed

    Ferreira, M; Mesquita, M; Quaresma, M; André, S

    2008-07-01

    Despite the well-established function of prolactin (PRL) in normal breast development, its role in breast cancer pathogenesis is still controversial. PRL activity is dependent on the activation of a transmembrane protein, the PRL receptor (PRLR). The aim was to evaluate and compare PRLR expression in gynaecomastia and male breast carcinoma (MBC). PRLR expression was detected immunohistochemically in 30 cases of gynaecomastia and 30 cases of MBC. The whole series was also assessed for oestrogen receptors (ER), progesterone receptors (PR) and androgen receptors (AR). A cut-off of 10% was used as the criterion for positivity. Histological type and tumour differentiation were evaluated. Pathological stage was assessed [Tumour Node Metastasis (TNM)-International Union Against Cancer system]. Statistical analysis was performed with Fisher's exact test. PRLR positivity was seen in 20% of gynaecomastia cases and in 60% of MBC cases (P = 0.003). In gynaecomastia immunoreactivity was predominantly observed in luminal cell borders, whereas in MBC the reactivity was heterogeneous and mainly cytoplasmic. There was no statistically significant correlation between PRLR expression and ER, PR, AR, pTNM, or histological grade. PRLR is significantly more expressed in MBC than in gynaecomastia, and with different patterns of reactivity, suggesting a role for PRL in male breast carcinogenesis.
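
    The reported comparison can be checked with a small Fisher's exact test. The counts below are derived from the abstract (20% of 30 gynaecomastia cases is 6 positive, 60% of 30 MBC cases is 18 positive); the function name and the two-sided convention (summing all tables no more probable than the observed one) are assumptions, since the abstract does not state which variant was used.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact test for the 2x2 table [[a, b], [c, d]]."""
    row1, row2 = a + b, c + d
    col1 = a + c
    denom = comb(row1 + row2, col1)

    def prob(k):
        # Hypergeometric probability of k positives in the first row.
        return comb(row1, k) * comb(row2, col1 - k) / denom

    p_obs = prob(a)
    lo = max(0, col1 - row2)
    hi = min(row1, col1)
    # Sum every table that is no more probable than the observed one
    # (small tolerance guards against floating-point ties).
    return sum(p for p in (prob(k) for k in range(lo, hi + 1))
               if p <= p_obs * (1 + 1e-9))


# Rows: gynaecomastia, MBC. Columns: PRLR positive, PRLR negative.
p_value = fisher_exact_two_sided(6, 24, 18, 12)
print(round(p_value, 4))
```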

  5. Spatio-temporal analysis of aftershock sequences in terms of Non Extensive Statistical Physics.

    NASA Astrophysics Data System (ADS)

    Chochlaki, Kalliopi; Vallianatos, Filippos

    2017-04-01

    Earth's seismicity is considered an extremely complicated process in which long-range interactions and fracturing exist (Vallianatos et al., 2016). For this reason, in order to analyze it, we use an innovative methodological approach, introduced by Tsallis (Tsallis, 1988; 2009), named Non Extensive Statistical Physics. This approach introduces a generalization of the Boltzmann-Gibbs statistical mechanics and is based on the definition of the Tsallis entropy Sq, whose maximization leads to the so-called q-exponential function, the probability distribution function that maximizes Sq. In the present work, we utilize the concept of Non Extensive Statistical Physics in order to analyze the spatiotemporal properties of several aftershock series. Marekova (Marekova, 2014) suggested that the probability densities of the inter-event distances between successive aftershocks follow a beta distribution. Using the same data set, we analyze the inter-event distance distribution of several aftershock sequences in different geographic regions by calculating non-extensive parameters that determine the behavior of the system and by fitting the q-exponential function, which expresses the degree of non-extensivity of the investigated system. Furthermore, the inter-event time distribution of the aftershocks as well as the frequency-magnitude distribution have been analyzed. The results support the applicability of Non Extensive Statistical Physics ideas in aftershock sequences where a strong correlation exists along with memory effects. References: C. Tsallis, Possible generalization of Boltzmann-Gibbs statistics, J. Stat. Phys. 52 (1988) 479-487. doi:10.1007/BF01016429. C. Tsallis, Introduction to nonextensive statistical mechanics: Approaching a complex world, 2009. doi:10.1007/978-0-387-85359-8. E. Marekova, Analysis of the spatial distribution between successive earthquakes in aftershocks series, Annals of Geophysics, 57, 5, doi:10.4401/ag-6556, 2014. F. Vallianatos, G. Papadakis, G. Michas, Generalized statistical mechanics approaches to earthquakes and tectonics. Proc. R. Soc. A, 472, 20160497, 2016.
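
    For orientation, the standard textbook forms of the Tsallis entropy and the q-exponential are reproduced here. The abstract gives no formulas, so the notation (k, p_i, x) is an assumption taken from the general literature rather than from the authors.

```latex
% Tsallis entropy for a discrete distribution {p_i}, entropic index q
S_q = k \, \frac{1 - \sum_i p_i^{\,q}}{q - 1},
\qquad
\lim_{q \to 1} S_q = -k \sum_i p_i \ln p_i .

% q-exponential, which reduces to the ordinary exponential as q -> 1
\exp_q(x) =
\begin{cases}
\left[\, 1 + (1 - q)\, x \,\right]^{\frac{1}{1 - q}} & \text{if } 1 + (1 - q)\, x > 0, \\[4pt]
0 & \text{otherwise.}
\end{cases}
```

    In the limit q → 1 both expressions recover the Boltzmann-Gibbs case, so the fitted value of q measures how far a system departs from it.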

  6. High EphA2 protein expression in renal cell carcinoma is associated with a poor disease outcome.

    PubMed

    Xu, Jinsheng; Zhang, Junxia; Cui, Liwen; Zhang, Huiran; Zhang, Shenglei; Bai, Yaling

    2014-08-01

    The receptor tyrosine kinase, ephrin type-A receptor 2 (EphA2), is normally expressed at sites of cell-to-cell contact in adult epithelial tissues. However, recent studies have shown that it is also overexpressed in various types of epithelial carcinomas, with the greatest level of EphA2 expression observed in metastatic lesions. In the present study, the association between the expression of EphA2 and the outcome of RCC patients was assessed. By log-rank test, a high expression level of EphA2 was a statistically significant predictor of RCC outcome. In an overall multivariate analysis, the high expression level of EphA2 was identified as an independent predictor of RCC outcome. The length of survival of the patients with high EphA2 expression was shorter than that of the patients with a low level of expression (relative risk, 2.304; 95% CI, 1.102-4.818; P=0.027). The analysis of the expression levels of EphA2 in tumor tissues may aid in the identification of the patient subgroup that is at a high risk of a poor disease outcome.

  7. Distributional fold change test – a statistical approach for detecting differential expression in microarray experiments

    PubMed Central

    2012-01-01

    Background Because of the large volume of data and the intrinsic variation of data intensity observed in microarray experiments, different statistical methods have been used to systematically extract biological information and to quantify the associated uncertainty. The simplest method to identify differentially expressed genes is to evaluate the ratio of average intensities in two different conditions and consider all genes that differ by more than an arbitrary cut-off value to be differentially expressed. This filtering approach is not a statistical test and there is no associated value that can indicate the level of confidence in the designation of genes as differentially expressed or not differentially expressed. At the same time the fold change by itself provides valuable information and it is important to find unambiguous ways of using this information in expression data treatment. Results A new method of finding differentially expressed genes, called the distributional fold change (DFC) test, is introduced. The method is based on an analysis of the intensity distribution of all microarray probe sets mapped to a three dimensional feature space composed of average expression level, average difference of gene expression and total variance. The proposed method allows one to rank each feature based on the signal-to-noise ratio and to ascertain for each feature the confidence level and power for being differentially expressed. The performance of the new method was evaluated using the total and partial area under receiver operating curves and tested on 11 data sets from the Gene Expression Omnibus database with independently verified differentially expressed genes and compared with the t-test and shrinkage t-test. Overall the DFC test performed the best – on average it had higher sensitivity and partial AUC and its elevation was most prominent in the low range of differentially expressed features, typical for formalin-fixed paraffin-embedded sample sets. Conclusions The distributional fold change test is an effective method for finding and ranking differentially expressed probesets on microarrays. The application of this test is advantageous to data sets using formalin-fixed paraffin-embedded samples or other systems where degradation effects diminish the applicability of correlation adjusted methods to the whole feature set. PMID:23122055
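
    The "simplest method" that the Background criticises, a plain fold-change cut-off, is easy to sketch. This illustrates only that baseline filter, not the DFC test itself. The gene names, intensities and the two-fold cut-off are hypothetical.

```python
from math import log2


def fold_change_filter(cond_a, cond_b, cutoff=2.0):
    """Return genes whose ratio of mean intensities is at least `cutoff`
    in either direction (up- or down-regulated)."""
    selected = []
    for gene in cond_a:
        mean_a = sum(cond_a[gene]) / len(cond_a[gene])
        mean_b = sum(cond_b[gene]) / len(cond_b[gene])
        # Working on the log2 scale treats 4x up and 4x down symmetrically.
        if abs(log2(mean_b / mean_a)) >= log2(cutoff):
            selected.append(gene)
    return selected


# Hypothetical probe intensities, three replicates per condition.
control = {"geneA": [100, 110, 90], "geneB": [200, 210, 190], "geneC": [800, 790, 810]}
treated = {"geneA": [400, 420, 380], "geneB": [220, 200, 210], "geneC": [200, 190, 210]}

hits = fold_change_filter(control, treated)
print(hits)
```

    The filter returns a list of genes but no p-value or confidence level, and it ignores replicate variance entirely. That is the gap the abstract says the DFC test is meant to fill.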

  8. A discriminative test among the different theories proposed to explain the origin of the genetic code: the coevolution theory finds additional support.

    PubMed

    Giulio, Massimo Di

    2018-05-19

    A discriminative statistical test among the different theories proposed to explain the origin of the genetic code is presented. Gathering the amino acids into polarity and biosynthetic classes that are the first expression of the physicochemical theory of the origin of the genetic code and the second expression of the coevolution theory, these classes are utilized in the Fisher's exact test to establish their significance within the genetic code table. Linking to the rows and columns of the genetic code of probabilities that express the statistical significance of these classes, I have finally been in the condition to be able to calculate a χ value to link to both the physicochemical theory and to the coevolution theory that would express the corroboration level referred to these theories. The comparison between these two χ values showed that the coevolution theory is able to explain - in this strictly empirical analysis - the origin of the genetic code better than that of the physicochemical theory. Copyright © 2018 Elsevier B.V. All rights reserved.

  9. When is hub gene selection better than standard meta-analysis?

    PubMed

    Langfelder, Peter; Mischel, Paul S; Horvath, Steve

    2013-01-01

    Since hub nodes have been found to play important roles in many networks, highly connected hub genes are expected to play an important role in biology as well. However, the empirical evidence remains ambiguous. An open question is whether (or when) hub gene selection leads to more meaningful gene lists than a standard statistical analysis based on significance testing when analyzing genomic data sets (e.g., gene expression or DNA methylation data). Here we address this question for the special case when multiple genomic data sets are available. This is of great practical importance since for many research questions multiple data sets are publicly available. In this case, the data analyst can decide between a standard statistical approach (e.g., based on meta-analysis) and a co-expression network analysis approach that selects intramodular hubs in consensus modules. We assess the performance of these two types of approaches according to two criteria. The first criterion evaluates the biological insights gained and is relevant in basic research. The second criterion evaluates the validation success (reproducibility) in independent data sets and often applies in clinical diagnostic or prognostic applications. We compare meta-analysis with consensus network analysis based on weighted correlation network analysis (WGCNA) in three comprehensive and unbiased empirical studies: (1) finding genes predictive of lung cancer survival, (2) finding methylation markers related to age, and (3) finding mouse genes related to total cholesterol. The results demonstrate that intramodular hub gene status with respect to consensus modules is more useful than a meta-analysis p-value when identifying biologically meaningful gene lists (reflecting criterion 1). However, standard meta-analysis methods perform as well as (if not better than) a consensus network approach in terms of validation success (criterion 2). The article also reports a comparison of meta-analysis techniques applied to gene expression data and presents novel R functions for carrying out consensus network analysis, network based screening, and meta analysis.

  10. Statistical indicators of collective behavior and functional clusters in gene networks of yeast

    NASA Astrophysics Data System (ADS)

    Živković, J.; Tadić, B.; Wick, N.; Thurner, S.

    2006-03-01

    We analyze gene expression time-series data of yeast (S. cerevisiae) measured along two full cell-cycles. We quantify these data by using q-exponentials, gene expression ranking and a temporal mean-variance analysis. We construct gene interaction networks based on correlation coefficients and study the formation of the corresponding giant components and minimum spanning trees. By coloring genes according to their cell function we find functional clusters in the correlation networks and functional branches in the associated trees. Our results suggest that a percolation point of functional clusters can be identified on these gene expression correlation networks.
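
    The network construction can be sketched as follows: link two genes when their time series are strongly correlated, then measure the largest connected component as the correlation threshold changes. This is an illustration under stated assumptions, not the authors' procedure. Linking on the absolute Pearson correlation, the union-find bookkeeping, and the toy series are all choices made here.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length series."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def giant_component_size(series, threshold):
    """Size of the largest connected component when genes are linked
    whenever |correlation| >= threshold."""
    genes = list(series)
    parent = {g: g for g in genes}

    def find(g):
        # Union-find root lookup with path halving.
        while parent[g] != g:
            parent[g] = parent[parent[g]]
            g = parent[g]
        return g

    for i, g in enumerate(genes):
        for h in genes[i + 1:]:
            if abs(pearson(series[g], series[h])) >= threshold:
                parent[find(g)] = find(h)

    sizes = {}
    for g in genes:
        root = find(g)
        sizes[root] = sizes.get(root, 0) + 1
    return max(sizes.values())


# Hypothetical expression time series for four genes.
series = {
    "g1": [1, 2, 3, 4, 5],
    "g2": [2, 4, 6, 8, 10],   # rises with g1
    "g3": [5, 4, 3, 2, 1],    # falls as g1 rises
    "g4": [1, 3, 1, 3, 1],    # uncorrelated with the others
}
print(giant_component_size(series, 0.9))
```

    Sweeping the threshold from high to low and recording the component size at each step gives the kind of curve on which a percolation point would show up as an abrupt jump.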

  11. SVAw - a web-based application tool for automated surrogate variable analysis of gene expression studies

    PubMed Central

    2013-01-01

    Background Surrogate variable analysis (SVA) is a powerful method to identify, estimate, and utilize the components of gene expression heterogeneity due to unknown and/or unmeasured technical, genetic, environmental, or demographic factors. These sources of heterogeneity are common in gene expression studies, and failing to incorporate them into the analysis can obscure results. Using SVA increases the biological accuracy and reproducibility of gene expression studies by identifying these sources of heterogeneity and correctly accounting for them in the analysis. Results Here we have developed a web application called SVAw (Surrogate variable analysis Web app) that provides a user friendly interface for SVA analyses of genome-wide expression studies. The software has been developed based on the open source Bioconductor SVA package. In our software, we have extended the SVA program functionality in three aspects: (i) SVAw performs a fully automated and user friendly analysis workflow; (ii) it calculates probe/gene statistics for both pre and post SVA analysis and provides a table of results for the regression of gene expression on the primary variable of interest before and after correcting for surrogate variables; and (iii) it generates a comprehensive report file, including graphical comparison of the outcome for the user. Conclusions SVAw is a freely accessible web server solution for the surrogate variable analysis of high-throughput datasets and facilitates removing all unwanted and unknown sources of variation. It is freely available for use at http://psychiatry.igm.jhmi.edu/sva. The executable packages for both web and standalone application and the instruction for installation can be downloaded from our web site. PMID:23497726

  12. Quality of human milk expressed in a human milk bank and at home.

    PubMed

    Borges, Mayla S; Oliveira, Angela M de M; Hattori, Wallisen T; Abdallah, Vânia O S

    2017-08-30

    To evaluate the quality of the human milk expressed at home and at a human milk bank. This is a retrospective, analytical, and observational study, performed by assessing titratable acidity records and the microbiological culture of 100 human milk samples expressed at home and at a human milk bank, in 2014. For the statistical analysis, generalized estimating equations (GEE) and the chi-squared test were used. When comparing the two sample groups, no significant difference was found, with 98% and 94% of the samples being approved among those collected at the milk bank and at home, respectively. No main interaction effect between collection place and titratable acidity records (p=0.285) was observed, and there was no statistically significant difference between the expected and observed values for the association between the collection place and the microbiological culture results (p=0.307). The quality of human milk expressed at home and at the milk bank is in agreement with the recommended standards, confirming that the expression of human milk at home is as safe as expression at the human milk bank, provided that the established hygiene, conservation, storage, and transport standards are followed. Copyright © 2017 Sociedade Brasileira de Pediatria. Published by Elsevier Editora Ltda. All rights reserved.

  13. Prognostic significance of MCM 2 and Ki-67 in neuroblastic tumors in children.

    PubMed

    Lewandowska, Magdalena; Taran, Katarzyna; Sitkiewicz, Anna; Andrzejewska, Ewa

    2015-12-02

    Neuroblastic tumors can be characterized by three features: spontaneous regression, maturation and aggressive proliferation. The most common and routinely used method of assessing tumor cell proliferation is to determine the Ki-67 index in the tumor tissue. Despite numerous studies, neuroblastoma biology is not fully understood, which makes treatment results unsatisfactory. MCM 2 is a potential prognostic factor in the neuroblastoma group. The study is based on retrospective analysis of 35 patients treated for neuroblastic tumors in the Department of Pediatric Surgery and Oncology of the Medical University of Lodz, during the period 2001-2011. The material comprised tissues of 16 tumors excised during the operation and 19 biopsy specimens. Immunohistochemical examinations were performed with immunoperoxidase using mouse monoclonal anti-MCM 2 and anti-Ki-67 antibodies. We observed that MCM 2 expression ranged from 2% to 98% and the Ki-67 index ranged from 0 to 95%. There was a statistically significant correlation between expression of MCM 2 and the value of the Ki-67 index and a correlation close to statistical significance between expression of MCM 2 and unfavorable histopathology. There was no statistical relationship between expression of MCM 2 and age over 1 year and N-myc amplification. The presented research shows that MCM 2 may have prognostic significance in neuroblastic pediatric tumors and as a potential prognostic factor could be the starting point of new individualized therapy.

  14. Methods for Genome-Wide Analysis of Gene Expression Changes in Polyploids

    PubMed Central

    Wang, Jianlin; Lee, Jinsuk J.; Tian, Lu; Lee, Hyeon-Se; Chen, Meng; Rao, Sheetal; Wei, Edward N.; Doerge, R. W.; Comai, Luca; Jeffrey Chen, Z.

    2007-01-01

    Polyploidy is an evolutionary innovation, providing extra sets of genetic material for phenotypic variation and adaptation. It is predicted that changes of gene expression by genetic and epigenetic mechanisms are responsible for novel variation in nascent and established polyploids (Liu and Wendel, 2002; Osborn et al., 2003; Pikaard, 2001). Studying gene expression changes in allopolyploids is more complicated than in autopolyploids, because allopolyploids contain more than two sets of genomes originating from divergent, but related, species. Here we describe two methods that are applicable to the genome-wide analysis of gene expression differences resulting from genome duplication in autopolyploids or interactions between homoeologous genomes in allopolyploids. First, we describe an amplified fragment length polymorphism (AFLP)–complementary DNA (cDNA) display method that allows the discrimination of homoeologous loci based on restriction polymorphisms between the progenitors. Second, we describe microarray analyses that can be used to compare gene expression differences between the allopolyploids and respective progenitors using appropriate experimental design and statistical analysis. We demonstrate the utility of these two complementary methods and discuss the pros and cons of using the methods to analyze gene expression changes in autopolyploids and allopolyploids. Furthermore, we describe these methods in general terms to be of wider applicability for comparative gene expression in a variety of evolutionary, genetic, biological, and physiological contexts. PMID:15865985

  15. Automated Assessment of Child Vocalization Development Using LENA.

    PubMed

    Richards, Jeffrey A; Xu, Dongxin; Gilkerson, Jill; Yapanel, Umit; Gray, Sharmistha; Paul, Terrance

    2017-07-12

    To produce a novel, efficient measure of children's expressive vocal development on the basis of automatic vocalization assessment (AVA), child vocalizations were automatically identified and extracted from audio recordings using Language Environment Analysis (LENA) System technology. Assessment was based on full-day audio recordings collected in a child's unrestricted, natural language environment. AVA estimates were derived using automatic speech recognition modeling techniques to categorize and quantify the sounds in child vocalizations (e.g., protophones and phonemes). These were expressed as phone and biphone frequencies, reduced to principal components, and inputted to age-based multiple linear regression models to predict independently collected criterion-expressive language scores. From these models, we generated vocal development AVA estimates as age-standardized scores and development age estimates. AVA estimates demonstrated strong statistical reliability and validity when compared with standard criterion expressive language assessments. Automated analysis of child vocalizations extracted from full-day recordings in natural settings offers a novel and efficient means to assess children's expressive vocal development. More research remains to identify specific mechanisms of operation.

  16. Threshold-free high-power methods for the ontological analysis of genome-wide gene-expression studies

    PubMed Central

    Nilsson, Björn; Håkansson, Petra; Johansson, Mikael; Nelander, Sven; Fioretos, Thoas

    2007-01-01

    Ontological analysis facilitates the interpretation of microarray data. Here we describe new ontological analysis methods which, unlike existing approaches, are threshold-free and statistically powerful. We perform extensive evaluations and introduce a new concept, detection spectra, to characterize methods. We show that different ontological analysis methods exhibit distinct detection spectra, and that it is critical to account for this diversity. Our results argue strongly against the continued use of existing methods, and provide directions towards an enhanced approach. PMID:17488501

  17. Expression of survivin and clinical correlation in patients with breast cancer.

    PubMed

    Sohn, Doo Min; Kim, Sung Yong; Baek, Moo Jun; Lim, Cheol Wan; Lee, Min Hyuk; Cho, Moo Sik; Kim, Tae Yoon

    2006-07-01

    Survivin is a member of the inhibitor of apoptosis (IAP) family, which is also involved in the regulation of cell division and is overexpressed and associated with parameters of poor prognosis in most human cancers, including carcinomas of the lung, breast, colon, stomach, esophagus and pancreas. This study examined the expression patterns of survivin in normal breast tissue, atypical hyperplasia, primary breast cancer and lymph node tissues involved in breast cancer and determined whether the expression of survivin is associated with the characteristics and prognosis of breast cancer. Formalin-fixed paraffin-embedded samples from 80 breast cancer, 20 atypical hyperplasia and 20 malignant lymph node tissue cases were immunostained using polyclonal survivin (Novus Biologicals, CO, USA). The degree of immunostaining was recorded on a scale of 0-3 according to the percentages of staining and distributions within the cytoplasm and nucleus. Survivin was expressed in 52, 14 and 17 of the 80 breast cancer (65%), atypical hyperplasia (70%) and breast cancer lymphoid (85%) specimens, respectively. Among those expressing cancer, 11.3%, 31.3% and 22.5% demonstrated only nuclear staining, only cytoplasmic staining and both nuclear and cytoplasmic staining, respectively. A statistical analysis revealed that cytoplasmic survivin expression was correlated with the stage, histological grade and L/N metastasis. In a Cox proportional hazard model analysis, the expression of survivin was not identified as a significant independent predictor of overall survival (P=0.168), although the decrease in the survival rate of survivin-positive patients did reach statistical significance (P=0.048). Our results show that survivin is frequently overexpressed in primary breast cancer and its expression gradually increased from normal breast tissue to malignant lymph nodes. The expression of cytoplasmic survivin was common in breast cancer and could be both a useful diagnostic marker and an important source of prognostic information.

  18. Are Interactions between cis-Regulatory Variants Evidence for Biological Epistasis or Statistical Artifacts?

    PubMed

    Fish, Alexandra E; Capra, John A; Bush, William S

    2016-10-06

    The importance of epistasis (statistical interactions between genetic variants) to the development of complex disease in humans has been controversial. Genome-wide association studies of statistical interactions influencing human traits have recently become computationally feasible and have identified many putative interactions. However, statistical models used to detect interactions can be confounded, which makes it difficult to be certain that observed statistical interactions are evidence for true molecular epistasis. In this study, we investigate whether there is evidence for epistatic interactions between genetic variants within the cis-regulatory region that influence gene expression after accounting for technical, statistical, and biological confounding factors. We identified 1,119 (FDR = 5%) interactions that appear to regulate gene expression in human lymphoblastoid cell lines, a tightly controlled, largely genetically determined phenotype. Many of these interactions replicated in an independent dataset (90 of 803 tested, Bonferroni threshold). We then performed an exhaustive analysis of both known and novel confounders, including ceiling/floor effects, missing genotype combinations, haplotype effects, single variants tagged through linkage disequilibrium, and population stratification. Every interaction could be explained by at least one of these confounders, and replication in independent datasets did not protect against some confounders. Assuming that the confounding factors provide a more parsimonious explanation for each interaction, we find it unlikely that cis-regulatory interactions contribute strongly to human gene expression, which calls into question the relevance of cis-regulatory interactions for other human phenotypes. We additionally propose several best practices for epistasis testing to protect future studies from confounding. Copyright © 2016 American Society of Human Genetics. Published by Elsevier Inc. All rights reserved.
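
    To give a sense of how strict the replication criterion is, the Bonferroni threshold for 803 tests can be worked out directly. The family-wise level of 0.05 is an assumption, since the abstract says only "Bonferroni threshold". The function names and the three example p-values are hypothetical.

```python
def bonferroni_threshold(alpha, n_tests):
    """Per-test significance threshold that keeps the family-wise error rate at alpha."""
    return alpha / n_tests


def count_replicated(p_values, alpha=0.05):
    """Number of tests whose p-value passes the Bonferroni threshold."""
    threshold = bonferroni_threshold(alpha, len(p_values))
    return sum(p <= threshold for p in p_values)


# 803 interactions were tested for replication; alpha = 0.05 is assumed.
threshold = bonferroni_threshold(0.05, 803)
print(f"{threshold:.2e}")

# Hypothetical p-values, only to show the counting step.
example = count_replicated([1e-6, 0.01, 0.04])
print(example)
```

    Under that assumed level, each of the 90 replicated interactions had to reach a p-value of roughly 6e-5 in the independent dataset.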

  19. dbMDEGA: a database for meta-analysis of differentially expressed genes in autism spectrum disorder.

    PubMed

    Zhang, Shuyun; Deng, Libin; Jia, Qiyue; Huang, Shaoting; Gu, Junwang; Zhou, Fankun; Gao, Meng; Sun, Xinyi; Feng, Chang; Fan, Guangqin

    2017-11-16

    Autism spectrum disorders (ASD) are hereditary, heterogeneous and biologically complex neurodevelopmental disorders. Individual studies on gene expression in ASD cannot provide clear consensus conclusions. Therefore, a systematic review to synthesize the current findings from brain tissues and a search tool to share the meta-analysis results are urgently needed. Here, we conducted a meta-analysis of brain gene expression profiles in the current reported human ASD expression datasets (with 84 frozen male cortex samples, 17 female cortex samples, 32 cerebellum samples and 4 formalin fixed samples) and knock-out mouse ASD model expression datasets (with 80 collective brain samples). Then, we applied R language software and developed an interactive shared and updated database (dbMDEGA) displaying the results of meta-analysis of data from ASD studies regarding differentially expressed genes (DEGs) in the brain. This database, dbMDEGA ( https://dbmdega.shinyapps.io/dbMDEGA/ ), is a publicly available web-portal for manual annotation and visualization of DEGs in the brain from data from ASD studies. This database uniquely presents meta-analysis values and homologous forest plots of DEGs in brain tissues. Gene entries are annotated with meta-values, statistical values and forest plots of DEGs in brain samples. This database aims to provide searchable meta-analysis results based on the current reported brain gene expression datasets of ASD to help detect candidate genes underlying this disorder. This new analytical tool may provide valuable assistance in the discovery of DEGs and the elucidation of the molecular pathogenicity of ASD. This database model may be replicated to study other disorders.

  20. Aberrant membranous expression of β-catenin predicts poor prognosis in patients with craniopharyngioma.

    PubMed

    Li, Zongping; Xu, Jianguo; Huang, Siqing; You, Chao

    2015-12-01

    The objective of this study is to investigate β-catenin expression in craniopharyngioma patients and determine its significance in predicting the prognosis of this disease. Fifty craniopharyngioma patients were enrolled in this study. Expression of β-catenin in tumor specimens collected from these patients was examined through immunostaining. In addition, mutation of exon 3 in the β-catenin gene, CTNNB1, was analyzed using polymerase chain reaction, denaturing high-pressure liquid chromatography, and DNA sequencing. Based on these results, we explored the association between membranous β-catenin expression, clinical and pathologic characteristics, and prognoses in these patients. Of all craniopharyngioma specimens, 31 (62.0%) had preserved membranous β-catenin expression, whereas the remaining 19 specimens (38.0%) displayed aberrant expression. Statistical analysis showed a significant correlation between aberrant membranous β-catenin expression and CTNNB1 exon 3 mutation, as well as between aberrant membranous β-catenin expression and the histopathologic type of craniopharyngioma and type of resection in our patient population. Furthermore, aberrant membranous β-catenin expression was found to be associated with poor patient survival. Results of Kaplan-Meier survival analysis and Cox regression analysis further confirmed this finding. In conclusion, our study demonstrated that aberrant membranous β-catenin expression was significantly correlated with poor survival in patients with craniopharyngioma. This raises the possibility for use of aberrant membranous β-catenin expression as an independent risk factor in predicting the prognosis of this disease. Copyright © 2015 Elsevier Inc. All rights reserved.

  1. Statistical analysis and modeling of intermittent transport events in the tokamak scrape-off layer

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Anderson, Johan, E-mail: anderson.johan@gmail.com; Halpern, Federico D.; Ricci, Paolo

    The turbulence observed in the scrape-off-layer of a tokamak is often characterized by intermittent events of bursty nature, a feature which raises concerns about the prediction of heat loads on the physical boundaries of the device. It appears thus necessary to delve into the statistical properties of turbulent physical fields such as density, electrostatic potential, and temperature, focusing on the mathematical expression of tails of the probability distribution functions. The method followed here is to generate statistical information from time-traces of the plasma density stemming from Braginskii-type fluid simulations and check this against a first-principles theoretical model. The analysis of the numerical simulations indicates that the probability distribution function of the intermittent process contains strong exponential tails, as predicted by the analytical theory.

  2. Data-adaptive test statistics for microarray data.

    PubMed

    Mukherjee, Sach; Roberts, Stephen J; van der Laan, Mark J

    2005-09-01

    An important task in microarray data analysis is the selection of genes that are differentially expressed between different tissue samples, such as healthy and diseased. However, microarray data contain an enormous number of dimensions (genes) and very few samples (arrays), a mismatch which poses fundamental statistical problems for the selection process that have defied easy resolution. In this paper, we present a novel approach to the selection of differentially expressed genes in which test statistics are learned from data using a simple notion of reproducibility in selection results as the learning criterion. Reproducibility, as we define it, can be computed without any knowledge of the 'ground-truth', but takes advantage of certain properties of microarray data to provide an asymptotically valid guide to expected loss under the true data-generating distribution. We are therefore able to indirectly minimize expected loss, and obtain results substantially more robust than conventional methods. We apply our method to simulated and oligonucleotide array data. Availability: by request to the corresponding author.

  3. Statistical theory and applications of lock-in carrierographic image pixel brightness dependence on multi-crystalline Si solar cell efficiency and photovoltage

    NASA Astrophysics Data System (ADS)

    Mandelis, Andreas; Zhang, Yu; Melnikov, Alexander

    2012-09-01

    A solar cell lock-in carrierographic image generation theory based on the concept of non-equilibrium radiation chemical potential was developed. An optoelectronic diode expression was derived linking the emitted radiative recombination photon flux (current density), the solar conversion efficiency, and the external load resistance via the closed- and/or open-circuit photovoltage. The expression was shown to be of a structure similar to the conventional electrical photovoltaic I-V equation, thereby allowing the carrierographic image to be used in a quantitative statistical pixel brightness distribution analysis with outcome being the non-contacting measurement of mean values of these important parameters averaged over the entire illuminated solar cell surface. This is the optoelectronic equivalent of the electrical (contacting) measurement method using an external resistor circuit and the outputs of the solar cell electrode grid, the latter acting as an averaging distribution network over the surface. The statistical theory was confirmed using multi-crystalline Si solar cells.

  4. Chemokine-like factor-like MARVEL transmembrane domain-containing 3 expression is associated with a favorable prognosis in esophageal squamous cell carcinoma.

    PubMed

    Han, Tianci; Shu, Tianci; Dong, Siyuan; Li, Peiwen; Li, Weinan; Liu, Dali; Qi, Ruiqun; Zhang, Shuguang; Zhang, Lin

    2017-05-01

    Decreased expression of human chemokine-like factor-like MARVEL transmembrane domain-containing 3 (CMTM3) has been identified in a number of human tumors and tumor cell lines, including gastric and testicular cancer, and PC3, CAL27 and Tca-83 cell lines. However, the association between CMTM3 expression and the clinicopathological features and prognosis of esophageal squamous cell carcinoma (ESCC) patients remains unclear. The aim of the present study was to investigate the correlation between CMTM3 expression and clinicopathological parameters and prognosis in ESCC. CMTM3 mRNA and protein expression was analyzed in ESCC and paired non-tumor tissues by quantitative real-time polymerase chain reaction, western blotting and immunohistochemical analysis. The Kaplan-Meier method was used to plot survival curves and the Cox proportional hazards regression model was also used for univariate and multivariate survival analysis. The results revealed that CMTM3 mRNA and protein expression levels were lower in 82.5% (30/40) and 75% (30/40) of ESCC tissues, respectively, when compared with matched non-tumor tissues. Statistical analysis demonstrated that CMTM3 expression was significantly correlated with lymph node metastasis (P=0.002) and clinical stage (P<0.001) in ESCC tissues. Furthermore, the survival time of ESCC patients exhibiting low CMTM3 expression was significantly shorter than that of ESCC patients exhibiting high CMTM3 expression (P=0.01). In addition, Kaplan-Meier survival analysis revealed that the overall survival time of patients exhibiting low CMTM3 expression was significantly decreased compared with patients exhibiting high CMTM3 expression (P=0.010). Cox multivariate analysis indicated that CMTM3 protein expression was an independent prognostic predictor for ESCC after resection. This study indicated that CMTM3 expression is significantly decreased in ESCC tissues and CMTM3 protein expression in resected tumors may present an effective prognostic biomarker.

  5. The Role of IQGAP1 in Breast Carcinoma

    DTIC Science & Technology

    2012-10-01

    and"-tubulin expression was measured as described above. Statistical Analysis: All experiments were repeated independently at least three times...IQGAP1 Binds HER2: In vitro analysis with pure proteins was used to examine a possible interaction between IQGAP1 and HER2. GST alone or GST-HER2 was...incubated with purified IQGAP1, and complexes were isolated with glutathione-Sepharose. Analysis by Western blotting reveals that IQGAP1 binds HER2

  6. Time-series RNA-seq analysis package (TRAP) and its application to the analysis of rice, Oryza sativa L. ssp. Japonica, upon drought stress.

    PubMed

    Jo, Kyuri; Kwon, Hawk-Bin; Kim, Sun

    2014-06-01

    Measuring expression levels of genes at the whole genome level can be useful for many purposes, especially for revealing biological pathways underlying specific phenotype conditions. When gene expression is measured over a time period, we have opportunities to understand how organisms react to stress conditions over time. Thus many biologists routinely measure whole genome level gene expressions at multiple time points. However, there are several technical difficulties for analyzing such whole genome expression data. In addition, these days gene expression data is often measured by using RNA-sequencing rather than microarray technologies and then analysis of expression data is much more complicated since the analysis process should start with mapping short reads and produce differentially activated pathways and also possibly interactions among pathways. In addition, many useful tools for analyzing microarray gene expression data are not applicable for the RNA-seq data. Thus a comprehensive package for analyzing time series transcriptome data is much needed. In this article, we present a comprehensive package, Time-series RNA-seq Analysis Package (TRAP), integrating all necessary tasks such as mapping short reads, measuring gene expression levels, finding differentially expressed genes (DEGs), clustering and pathway analysis for time-series data in a single environment. In addition to implementing useful algorithms that are not available for RNA-seq data, we extended existing pathway analysis methods, ORA and SPIA, for time-series analysis and estimated statistical values for the combined dataset using an advanced metric. TRAP also produces a visual summary of pathway interactions. Gene expression change labeling, a practical clustering method used in TRAP, enables more accurate interpretation of the data when combined with pathway analysis. We applied our methods on a real dataset for the analysis of rice (Oryza sativa L. Japonica nipponbare) upon drought stress. The result showed that TRAP was able to detect pathways more accurately than several existing methods. TRAP is available at http://biohealth.snu.ac.kr/software/TRAP/. Copyright © 2014 Elsevier Inc. All rights reserved.
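
    The over-representation analysis (ORA) mentioned in this abstract is conventionally a hypergeometric upper-tail test. The sketch below shows only that standard single-time-point form; it does not reproduce TRAP's time-series extension or its combining metric, and the function name and arguments are hypothetical.

```python
from math import comb


def ora_p_value(n_universe, n_pathway, n_deg, n_overlap):
    """Upper-tail hypergeometric probability P(X >= n_overlap).

    n_universe: number of genes measured
    n_pathway:  number of those genes annotated to the pathway
    n_deg:      number of differentially expressed genes (DEGs)
    n_overlap:  number of DEGs that fall in the pathway
    """
    max_overlap = min(n_pathway, n_deg)
    total = comb(n_universe, n_deg)
    tail = sum(
        comb(n_pathway, k) * comb(n_universe - n_pathway, n_deg - k)
        for k in range(n_overlap, max_overlap + 1)
    )
    return tail / total
```

    A small p-value suggests the pathway contains more DEGs than random sampling of the measured genes would explain; in practice the p-values for many pathways would also need multiple-testing correction.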

  7. The research of statistical properties of colorimetric features of screens with a three-component color formation principle

    NASA Astrophysics Data System (ADS)

    Zharinov, I. O.; Zharinov, O. O.

    2017-12-01

    This study quantitatively analyzes how technological variation in screen color profile parameters affects the chromaticity coordinates of the displayed image. Mathematical expressions are proposed that approximate the two-dimensional distribution of chromaticity coordinates for an image displayed on a screen with a three-component color formation principle. These expressions point the way toward correction techniques that improve the reproducibility of the colorimetric features of displays.

  8. Linnorm: improved statistical analysis for single cell RNA-seq expression data.

    PubMed

    Yip, Shun H; Wang, Panwen; Kocher, Jean-Pierre A; Sham, Pak Chung; Wang, Junwen

    2017-12-15

    Linnorm is a novel normalization and transformation method for the analysis of single cell RNA sequencing (scRNA-seq) data. Linnorm is developed to remove technical noises and simultaneously preserve biological variations in scRNA-seq data, such that existing statistical methods can be improved. Using real scRNA-seq data, we compared Linnorm with existing normalization methods, including NODES, SAMstrt, SCnorm, scran, DESeq and TMM. Linnorm shows advantages in speed, technical noise removal and preservation of cell heterogeneity, which can improve existing methods in the discovery of novel subtypes, pseudo-temporal ordering of cells, clustering analysis, etc. Linnorm also performs better than existing DEG analysis methods, including BASiCS, NODES, SAMstrt, Seurat and DESeq2, in false positive rate control and accuracy. © The Author(s) 2017. Published by Oxford University Press on behalf of Nucleic Acids Research.

  9. MicroRNA Expression Analysis in Serum of Patients with Congenital Hemochromatosis and Age-Related Macular Degeneration (AMD)

    PubMed Central

    Szemraj, Maciej; Oszajca, Katarzyna; Szemraj, Janusz; Jurowski, Piotr

    2017-01-01

    Background Congenital hemochromatosis is a disorder caused by mutations of genes involved in iron metabolism, leading to increased levels of iron concentration in tissues and serum. High concentrations of iron can lead to the development of AMD. The aim of this study was to analyze circulating miRNAs in the serum of congenital hemochromatosis patients with AMD and their correlation with the expression of genes involved in iron metabolism. Material/Methods Peripheral blood monolayer cells and serum were obtained from patients with congenital hemochromatosis, congenital hemochromatosis and AMD, AMD patients without congenital hemochromatosis, and healthy controls. Serum miRNA expression was analyzed by RT-PCR (qRT-PCR) using TaqMan MicroRNA probes, and protein levels were measured by ELISA kits. Gene polymorphisms in TF and TFRC genes were determined using the TaqMan discrimination assay. Results Statistical analysis of miRNA expression selected miR-31, miR-133a, miR-141, miR-145, miR-149, and miR-182 for further study; these miRNAs are involved in the posttranscriptional expression of iron-related genes: TF, TFRI, DMT1, FTL, and FPN1. The observed changes in miRNA expression were correlated with the serum protein levels of the analyzed genes. There were no statistically significant differences in the distribution of genotype and allele frequencies in TF and TFRC genes between analyzed groups of patients. Conclusions The differences studied in the miRNA serum profile, in conjunction with the changes in the analyzed protein levels, may be useful in the early detection of congenital hemochromatosis in patients who may develop AMD disease. PMID:28827515

  10. Epidermal Growth Factor Receptor Is Related to Poor Survival in Glioblastomas: Single-Institution Experience

    PubMed Central

    Choi, Youngmin; Lee, Hyung-Sik; Hur, Won-Joo; Sung, Ki-Han; Kim, Ki-Uk; Choi, Sun-Seob; Kim, Su-Jin; Kim, Dae-Cheol

    2013-01-01

    Purpose There are conflicting results surrounding the prognostic significance of epidermal growth factor receptor (EGFR) status in glioblastoma (GBM) patients. Accordingly, we attempted to assess the influence of EGFR expression on the survival of GBM patients receiving postoperative radiotherapy. Materials and Methods Thirty three GBM patients who had received surgery and postoperative radiotherapy at our institute, between March 1997 and February 2006, were included. The evaluation of EGFR expression with immunohistochemistry was available for 30 patients. Kaplan-Meier survival analysis and Cox regression were used for statistical analysis. Results EGFR was expressed in 23 patients (76.7%), and not expressed in seven (23.3%). Survival in EGFR expressing GBM patients was significantly less than that in non-expressing patients (median survival: 12.5 versus 17.5 months, p=0.013). Patients who received more than 60 Gy showed improved survival over those who received up to 60 Gy (median survival: 17.0 versus 9.0 months, p=0.000). Negative EGFR expression and a higher radiation dose were significantly correlated with improved survival on multivariate analysis. Survival rates showed no differences according to age, sex, and surgical extent. Conclusion The expression of EGFR demonstrated a significantly deleterious effect on the survival of GBM patients. Therefore, approaches targeting EGFR should be considered in potential treatment methods for GBM patients, in addition to current management strategies. PMID:23225805
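
    Several records in this listing compare groups by Kaplan-Meier survival analysis and report median survival. As a minimal sketch of the product-limit estimator behind those curves, the code below uses made-up follow-up data, not values from any study listed here; the function names are hypothetical.

```python
def kaplan_meier(times, events):
    """Return [(t, S(t))] at each distinct time with at least one death.

    times:  follow-up durations (e.g. months)
    events: 1 if death was observed at that time, 0 if censored
    """
    data = sorted(zip(times, events))
    at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = 0
        removed = 0
        # Group all subjects who share the same follow-up time.
        while i < len(data) and data[i][0] == t:
            deaths += data[i][1]
            removed += 1
            i += 1
        if deaths > 0:
            survival *= 1.0 - deaths / at_risk
            curve.append((t, survival))
        # Deaths and censored subjects both leave the risk set.
        at_risk -= removed
    return curve


def median_survival(curve):
    """First time at which S(t) drops to 0.5 or below, or None if never."""
    for t, s in curve:
        if s <= 0.5:
            return t
    return None


# Hypothetical data: four patients, the second one censored at month 2.
curve = kaplan_meier([1, 2, 3, 4], [1, 0, 1, 1])
```

    Comparing two such curves, as the abstracts do, additionally needs a log-rank test or a Cox model, which this sketch does not implement.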

  11. Selection of Valid Reference Genes for Reverse Transcription Quantitative PCR Analysis in Heliconius numata (Lepidoptera: Nymphalidae)

    PubMed Central

    Chouteau, Mathieu; Whibley, Annabel; Joron, Mathieu; Llaurens, Violaine

    2016-01-01

    Identifying the genetic basis of adaptive variation is challenging in non-model organisms, and quantitative real-time PCR is a useful tool for validating predictions regarding the expression of candidate genes. However, comparing expression levels in different conditions requires rigorous experimental design and statistical analyses. Here, we focused on the neotropical passion-vine butterflies Heliconius, non-model species studied in evolutionary biology for their adaptive variation in wing color patterns involved in mimicry and in the signaling of their toxicity to predators. We aimed at selecting stable reference genes to be used for normalization of gene expression data in RT-qPCR analyses from developing wing discs according to the minimal guidelines described in Minimum Information for publication of Quantitative Real-Time PCR Experiments (MIQE). To design internal RT-qPCR controls, we studied the stability of expression of nine candidate reference genes (actin, annexin, eF1α, FK506BP, PolyABP, PolyUBQ, RpL3, RPS3A, and tubulin) at two developmental stages (prepupal and pupal) using three widely used programs (GeNorm, NormFinder and BestKeeper). Results showed that, despite differences in statistical methods, genes RpL3, eF1α, polyABP, and annexin were stably expressed in wing discs in late larval and pupal stages of Heliconius numata. This combination of genes may be used as a reference for a reliable study of differential expression in wings, for instance for genes involved in important phenotypic variation, such as wing color pattern variation. Through this example, we provide general useful technical recommendations as well as relevant statistical strategies for evolutionary biologists aiming to identify candidate genes involved in adaptive variation in non-model organisms. PMID:27271971
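
    One of the programs named above, geNorm, ranks candidate reference genes by a stability measure M: for each gene, the average over all other candidates of the standard deviation of the log2 expression ratio across samples. The sketch below illustrates that definition only; the gene names and quantities are invented, and it is not the geNorm software itself.

```python
from math import log2
from statistics import stdev


def genorm_m(expr):
    """Stability measure M per gene; lower M means more stable expression.

    expr: dict mapping gene name to a list of relative expression
          quantities (linear scale, one value per sample).
    """
    genes = list(expr)
    m = {}
    for j in genes:
        variations = []
        for k in genes:
            if k == j:
                continue
            # Pairwise variation: SD of the log2 ratio across samples.
            ratios = [log2(a / b) for a, b in zip(expr[j], expr[k])]
            variations.append(stdev(ratios))
        m[j] = sum(variations) / len(variations)
    return m


# Hypothetical quantities: geneA and geneB keep a constant ratio, geneC does not.
m_values = genorm_m({
    "geneA": [1.0, 2.0, 4.0],
    "geneB": [2.0, 4.0, 8.0],
    "geneC": [1.0, 4.0, 1.0],
})
```

    With these inputs geneC receives the highest M and would be the first candidate dropped in a stepwise exclusion.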

  12. Predictors of Science Subject Discipline Identities: A Statistical Analysis

    ERIC Educational Resources Information Center

    Nieswandt, Martina; Barrett, Sarah E.; McEneaney, Elizabeth H.

    2013-01-01

    This quantitative study (n = 247) explores whether preservice science teachers express science-specific identities that reflect multiple areas of their beliefs (e.g., purpose for science teaching, inclusion of science-technology-society-environment issues into science teaching, and nature of science) as well as other individual characteristics…

  13. Allium sativum L. regulates in vitro IL-17 gene expression in human peripheral blood mononuclear cells.

    PubMed

    Moutia, Mouna; Seghrouchni, Fouad; Abouelazz, Omar; Elouaddari, Anass; Al Jahid, Abdellah; Elhou, Abdelhalim; Nadifi, Sellama; Jamal Eddine, Jamal; Habti, Norddine; Badou, Abdallah

    2016-09-29

    Allium sativum L. (A.S.) "garlic", one of the most interesting medicinal plants, has been suggested to contain compounds that could be beneficial in numerous pathological situations including cancer. In this work, we aimed to assess the immunomodulatory effect of A.S. preparation on human peripheral blood mononuclear cells from healthy individuals. Nontoxic doses of A.S. were identified using MTT assay. Effects on CD4+ or CD8+ T lymphocyte proliferation were studied using flow cytometry. The effect of A.S. on cytokine gene expression was studied using qRT-PCR. Finally, qualitative analysis of A.S. was performed by HPLC approach. Data were analyzed statistically by one-way ANOVA test. The nontoxic doses of A.S. preparation affected neither spontaneous nor TCR-mediated CD4+ or CD8+ T lymphocyte proliferation. Interestingly, A.S. exhibited a statistically significant regulation of IL-17 gene expression, a cytokine involved in several inflammatory and autoimmune diseases. In contrast, the expression of IL-4, an anti-inflammatory cytokine, was unaffected. Qualitative analysis of A.S. ethanol preparation indicated the presence of three polyphenol bioactive compounds, which are catechin, vanillic acid and ferulic acid. The specific inhibition of the pro-inflammatory cytokine, IL-17 without affecting cell proliferation in human PBMCs by the Allium sativum L. preparation suggests a potential valuable effect of the compounds present in this plant for the treatment of inflammatory diseases and cancer, where IL-17 is highly expressed. The individual contribution of these three compounds to this global effect will be assessed.
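
    The one-way ANOVA used here (and in the HOX record further down) compares between-group with within-group variance. The sketch below computes the F statistic from first principles on invented numbers; it illustrates the test only, not the authors' analysis, and the function name is hypothetical.

```python
def one_way_anova_f(groups):
    """Return (F, df_between, df_within) for a list of sample groups."""
    k = len(groups)
    n = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n
    means = [sum(g) / len(g) for g in groups]
    # Variation of group means around the grand mean.
    ss_between = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, means))
    # Variation of observations around their own group mean.
    ss_within = sum((x - m) ** 2 for g, m in zip(groups, means) for x in g)
    df_between, df_within = k - 1, n - k
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within


# Hypothetical expression values for three dose groups.
f_stat, df_b, df_w = one_way_anova_f([[1, 2, 3], [2, 3, 4], [3, 4, 5]])
```

    Turning F into a p-value needs the F distribution, which the Python standard library does not provide; `scipy.stats.f_oneway` returns both the statistic and the p-value if SciPy is available.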

  14. B7-H4 overexpression in ovarian tumors.

    PubMed

    Tringler, Barbara; Liu, Wenhui; Corral, Laura; Torkko, Kathleen C; Enomoto, Takayuki; Davidson, Susan; Lucia, M Scott; Heinz, David E; Papkoff, Jackie; Shroyer, Kenneth R

    2006-01-01

    Despite great advances in therapeutic management, the mortality rate for ovarian cancer has remained relatively stable over the past 50 years. This study was designed to evaluate the expression of B7-H4 protein, recently identified as a potential molecular marker of breast and ovarian cancer by quantitative PCR analysis, in benign tumors, tumors of low malignant potential and malignant tumors of the ovary. Archival formalin-fixed tissue blocks from serous, mucinous, endometrioid and clear cell ovarian tumors were evaluated by immunohistochemistry for the distribution of B7-H4 expression, and staining intensity was measured by automated image analysis. Univariate analyses were used to test for statistically significant relationships. B7-H4 cytoplasmic and membranous expression was detected in all primary serous (n = 32), endometrioid (n = 12), and clear cell carcinomas (n = 15), and in all metastatic serous (n = 23) and endometrioid (n = 7) ovarian carcinomas. By contrast, focal B7-H4 expression was detected in only 1/11 mucinous carcinomas. The proportion of positive cells and median staining intensity was greater in serous carcinomas than in serous cystadenomas or serous tumors of low malignant potential, and the differences were statistically significant (P < 0.0001 and P = 0.034, respectively). The median staining intensity was also significantly greater in endometrioid carcinomas than in endometriosis (P = 0.005). The consistent overexpression of B7-H4 in serous, endometrioid and clear cell ovarian carcinomas and the relative absence of expression in most normal somatic tissues indicates that B7-H4 should be further investigated as a potential diagnostic marker or therapeutic target for ovarian cancer.

  15. The prognostic significance of specific HOX gene expression patterns in ovarian cancer.

    PubMed

    Kelly, Zoe; Moller-Levet, Carla; McGrath, Sophie; Butler-Manuel, Simon; Kavitha Madhuri, Thumuluru; Kierzek, Andrzej M; Pandha, Hardev; Morgan, Richard; Michael, Agnieszka

    2016-10-01

    HOX genes are vital for all aspects of mammalian growth and differentiation, and their dysregulated expression is related to ovarian carcinogenesis. The aim of the current study was to establish the prognostic value of HOX dysregulation as well as its role in platinum resistance. The potential to target HOX proteins through the HOX/PBX interaction was also explored in the context of platinum resistance. HOX gene expression was determined in ovarian cancer cell lines and primary EOCs by QPCR, and compared to expression in normal ovarian epithelium and fallopian tube tissue samples. Statistical analysis included one-way ANOVA and t-tests, using statistical software R and GraphPad. The analysis identified 36 of the 39 HOX genes as being overexpressed in high grade serous EOC compared to normal tissue. We detected a molecular HOX gene-signature that predicted poor outcome. Overexpression of HOXB4 and HOXB9 was identified in high grade serous cell lines after platinum resistance developed. Targeting the HOX/PBX dimer with the HXR9 peptide enhanced the cytotoxicity of cisplatin in platinum-resistant ovarian cancer. In conclusion, this study has shown the HOX genes are highly dysregulated in ovarian cancer with high expression of HOXA13, B6, C13, D1 and D13 being predictive of poor clinical outcome. Targeting the HOX/PBX dimer in platinum-resistant cancer represents a potentially new therapeutic option that should be further developed and tested in clinical trials. © 2016 The Authors International Journal of Cancer published by John Wiley & Sons Ltd on behalf of UICC.

  16. PathMAPA: a tool for displaying gene expression and performing statistical tests on metabolic pathways at multiple levels for Arabidopsis.

    PubMed

    Pan, Deyun; Sun, Ning; Cheung, Kei-Hoi; Guan, Zhong; Ma, Ligeng; Holford, Matthew; Deng, Xingwang; Zhao, Hongyu

    2003-11-07

    To date, many genomic and pathway-related tools and databases have been developed to analyze microarray data. In published web-based applications to date, however, complex pathways have been displayed with static image files that may not be up-to-date or are time-consuming to rebuild. In addition, gene expression analyses focus on individual probes and genes with little or no consideration of pathways. These approaches reveal little information about pathways that are key to a full understanding of the building blocks of biological systems. Therefore, there is a need to provide useful tools that can generate pathways without manually building images and allow gene expression data to be integrated and analyzed at pathway levels for such experimental organisms as Arabidopsis. We have developed PathMAPA, a web-based application written in Java that can be easily accessed over the Internet. An Oracle database is used to store, query, and manipulate the large amounts of data that are involved. PathMAPA allows its users to (i) upload and populate microarray data into a database; (ii) integrate gene expression with enzymes of the pathways; (iii) generate pathway diagrams without building image files manually; (iv) visualize gene expressions for each pathway at enzyme, locus, and probe levels; and (v) perform statistical tests at pathway, enzyme and gene levels. PathMAPA can be used to examine Arabidopsis thaliana gene expression patterns associated with metabolic pathways. PathMAPA provides two unique features for the gene expression analysis of Arabidopsis thaliana: (i) automatic generation of pathways associated with gene expression and (ii) statistical tests at pathway level. The first feature allows for the periodical updating of genomic data for pathways, while the second feature can provide insight into how treatments affect relevant pathways for the selected experiment(s).

  17. PathMAPA: a tool for displaying gene expression and performing statistical tests on metabolic pathways at multiple levels for Arabidopsis

    PubMed Central

    Pan, Deyun; Sun, Ning; Cheung, Kei-Hoi; Guan, Zhong; Ma, Ligeng; Holford, Matthew; Deng, Xingwang; Zhao, Hongyu

    2003-01-01

    Background To date, many genomic and pathway-related tools and databases have been developed to analyze microarray data. In published web-based applications to date, however, complex pathways have been displayed with static image files that may not be up-to-date or are time-consuming to rebuild. In addition, gene expression analyses focus on individual probes and genes with little or no consideration of pathways. These approaches reveal little information about pathways that are key to a full understanding of the building blocks of biological systems. Therefore, there is a need to provide useful tools that can generate pathways without manually building images and allow gene expression data to be integrated and analyzed at pathway levels for such experimental organisms as Arabidopsis. Results We have developed PathMAPA, a web-based application written in Java that can be easily accessed over the Internet. An Oracle database is used to store, query, and manipulate the large amounts of data that are involved. PathMAPA allows its users to (i) upload and populate microarray data into a database; (ii) integrate gene expression with enzymes of the pathways; (iii) generate pathway diagrams without building image files manually; (iv) visualize gene expressions for each pathway at enzyme, locus, and probe levels; and (v) perform statistical tests at pathway, enzyme and gene levels. PathMAPA can be used to examine Arabidopsis thaliana gene expression patterns associated with metabolic pathways. Conclusion PathMAPA provides two unique features for the gene expression analysis of Arabidopsis thaliana: (i) automatic generation of pathways associated with gene expression and (ii) statistical tests at pathway level. The first feature allows for the periodical updating of genomic data for pathways, while the second feature can provide insight into how treatments affect relevant pathways for the selected experiment(s). PMID:14604444

  18. GECKO: a complete large-scale gene expression analysis platform.

    PubMed

    Theilhaber, Joachim; Ulyanov, Anatoly; Malanthara, Anish; Cole, Jack; Xu, Dapeng; Nahf, Robert; Heuer, Michael; Brockel, Christoph; Bushnell, Steven

    2004-12-10

    Gecko (Gene Expression: Computation and Knowledge Organization) is a complete, high-capacity centralized gene expression analysis system, developed in response to the needs of a distributed user community. Based on a client-server architecture, with a centralized repository of typically many tens of thousands of Affymetrix scans, Gecko includes automatic processing pipelines for uploading data from remote sites, a database, a computational engine implementing approximately 50 different analysis tools, and a client application. Among available analysis tools are clustering methods, principal component analysis, supervised classification including feature selection and cross-validation, multi-factorial ANOVA, statistical contrast calculations, and various post-processing tools for extracting data at given error rates or significance levels. On account of its open architecture, Gecko also allows for the integration of new algorithms. The Gecko framework is very general: non-Affymetrix and non-gene expression data can be analyzed as well. A unique feature of the Gecko architecture is the concept of the Analysis Tree (actually, a directed acyclic graph), in which all successive results in ongoing analyses are saved. This approach has proven invaluable in allowing a large (approximately 100 users) and distributed community to share results, and to repeatedly return over a span of years to older and potentially very complex analyses of gene expression data. The Gecko system is being made publicly available as free software at http://sourceforge.net/projects/geckoe. In totality or in parts, the Gecko framework should prove useful to users and system developers with a broad range of analysis needs.

  19. Statistics and bioinformatics in nutritional sciences: analysis of complex data in the era of systems biology

    PubMed Central

    Fu, Wenjiang J.; Stromberg, Arnold J.; Viele, Kert; Carroll, Raymond J.; Wu, Guoyao

    2009-01-01

    Over the past two decades, there have been revolutionary developments in life science technologies characterized by high throughput, high efficiency, and rapid computation. Nutritionists now have the advanced methodologies for the analysis of DNA, RNA, protein, low-molecular-weight metabolites, as well as access to bioinformatics databases. Statistics, which can be defined as the process of making scientific inferences from data that contain variability, has historically played an integral role in advancing nutritional sciences. Currently, in the era of systems biology, statistics has become an increasingly important tool to quantitatively analyze information about biological macromolecules. This article describes general terms used in statistical analysis of large, complex experimental data. These terms include experimental design, power analysis, sample size calculation, and experimental errors (type I and II errors) for nutritional studies at population, tissue, cellular, and molecular levels. In addition, we highlighted various sources of experimental variations in studies involving microarray gene expression, real-time polymerase chain reaction, proteomics, and other bioinformatics technologies. Moreover, we provided guidelines for nutritionists and other biomedical scientists to plan and conduct studies and to analyze the complex data. Appropriate statistical analyses are expected to make an important contribution to solving major nutrition-associated problems in humans and animals (including obesity, diabetes, cardiovascular disease, cancer, ageing, and intrauterine fetal retardation). PMID:20233650
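
    The power analysis and sample size calculation mentioned in this abstract can be illustrated with the textbook normal-approximation formula for comparing two group means, n = 2((z_{1-α/2} + z_{power})σ/δ)² per group. The sketch below assumes equal group sizes and a known common standard deviation; the function name and default values are illustrative and are not taken from the article.

```python
from math import ceil
from statistics import NormalDist


def n_per_group(delta, sigma, alpha=0.05, power=0.80):
    """Approximate sample size per group for a two-sided two-sample z-test.

    delta: smallest difference in means worth detecting
    sigma: common standard deviation (assumed known)
    alpha: type I error rate; power: 1 - type II error rate
    """
    z = NormalDist()
    z_alpha = z.inv_cdf(1 - alpha / 2)
    z_beta = z.inv_cdf(power)
    return ceil(2 * ((z_alpha + z_beta) * sigma / delta) ** 2)
```

    For a difference of one standard deviation this gives 16 per group at the defaults; halving the detectable difference roughly quadruples the requirement. A t-test based calculation would give slightly larger numbers for small samples.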

  20. HDBStat!: a platform-independent software suite for statistical analysis of high dimensional biology data.

    PubMed

    Trivedi, Prinal; Edwards, Jode W; Wang, Jelai; Gadbury, Gary L; Srinivasasainagendra, Vinodh; Zakharkin, Stanislav O; Kim, Kyoungmi; Mehta, Tapan; Brand, Jacob P L; Patki, Amit; Page, Grier P; Allison, David B

    2005-04-06

    Many efforts in microarray data analysis are focused on providing tools and methods for the qualitative analysis of microarray data. HDBStat! (High-Dimensional Biology-Statistics) is a software package designed for analysis of high dimensional biology data such as microarray data. It was initially developed for the analysis of microarray gene expression data, but it can also be used for some applications in proteomics and other aspects of genomics. HDBStat! provides statisticians and biologists a flexible and easy-to-use interface to analyze complex microarray data using a variety of methods for data preprocessing, quality control analysis and hypothesis testing. Results generated from data preprocessing methods, quality control analysis and hypothesis testing methods are output in the form of Excel CSV tables, graphs and an HTML report summarizing data analysis. HDBStat! is platform-independent software that is freely available to academic institutions and non-profit organizations. It can be downloaded from our website http://www.soph.uab.edu/ssg_content.asp?id=1164.

  1. ABCA Transporter Gene Expression and Poor Outcome in Epithelial Ovarian Cancer

    PubMed Central

    Hedditch, Ellen L.; Gao, Bo; Russell, Amanda J.; Lu, Yi; Emmanuel, Catherine; Beesley, Jonathan; Johnatty, Sharon E.; Chen, Xiaoqing; Harnett, Paul; George, Joshy; Williams, Rebekka T.; Flemming, Claudia; Lambrechts, Diether; Despierre, Evelyn; Lambrechts, Sandrina; Vergote, Ignace; Karlan, Beth; Lester, Jenny; Orsulic, Sandra; Walsh, Christine; Fasching, Peter; Beckmann, Matthias W.; Ekici, Arif B.; Hein, Alexander; Matsuo, Keitaro; Hosono, Satoyo; Nakanishi, Toru; Yatabe, Yasushi; Pejovic, Tanja; Bean, Yukie; Heitz, Florian; Harter, Philipp; du Bois, Andreas; Schwaab, Ira; Hogdall, Estrid; Kjaer, Susan K.; Jensen, Allan; Hogdall, Claus; Lundvall, Lene; Engelholm, Svend Aage; Brown, Bob; Flanagan, James; Metcalf, Michelle D; Siddiqui, Nadeem; Sellers, Thomas; Fridley, Brooke; Cunningham, Julie; Schildkraut, Joellen; Iversen, Ed; Weber, Rachel P.; Berchuck, Andrew; Goode, Ellen; Bowtell, David D.; Chenevix-Trench, Georgia; deFazio, Anna; Norris, Murray D.; MacGregor, Stuart; Haber, Michelle; Henderson, Michelle J.

    2014-01-01

    Background ATP-binding cassette (ABC) transporters play various roles in cancer biology and drug resistance, but their association with outcomes in serous epithelial ovarian cancer (EOC) is unknown. Methods The relationship between clinical outcomes and ABC transporter gene expression in two independent cohorts of high-grade serous EOC tumors was assessed with real-time quantitative polymerase chain reaction, analysis of expression microarray data, and immunohistochemistry. Associations between clinical outcomes and ABCA transporter gene single nucleotide polymorphisms were tested in a genome-wide association study. Impact of short interfering RNA–mediated gene suppression was determined by colony forming and migration assays. Association with survival was assessed with Kaplan–Meier analysis and log-rank tests. All statistical tests were two-sided. Results Associations with outcome were observed with ABC transporters of the “A” subfamily, but not with multidrug transporters. High-level expression of ABCA1, ABCA6, ABCA8, and ABCA9 in primary tumors was statistically significantly associated with reduced survival in serous ovarian cancer patients. Low levels of ABCA5 and the C-allele of rs536009 were associated with shorter overall survival (hazard ratio for death = 1.50; 95% confidence interval [CI] =1.26 to 1.79; P = 6.5e−6). The combined expression pattern of ABCA1, ABCA5, and either ABCA8 or ABCA9 was associated with particularly poor outcome (mean overall survival in group with adverse ABCA1, ABCA5 and ABCA9 gene expression = 33.2 months, 95% CI = 26.4 to 40.1; vs 55.3 months in the group with favorable ABCA gene expression, 95% CI = 49.8 to 60.8; P = .001), independently of tumor stage or surgical debulking status. Suppression of cholesterol transporter ABCA1 inhibited ovarian cancer cell growth and migration in vitro, and statin treatment reduced ovarian cancer cell migration. 
    Conclusions Expression of ABCA transporters was associated with poor outcome in serous ovarian cancer, implicating lipid trafficking as a potentially important process in EOC. PMID:24957074
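    The Kaplan-Meier analysis named in the Methods can be sketched as a product-limit estimator. This is a minimal illustration, not the authors' code; the follow-up times and event flags in the usage example are made up.

```python
def kaplan_meier(times, events):
    """Product-limit estimate of the survival function.

    times:  follow-up duration for each patient.
    events: 1 if the death was observed at that time, 0 if censored.
    Returns a list of (time, survival_probability) at each distinct death time.
    """
    data = sorted(zip(times, events))
    n_at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = 0
        leaving = 0
        # Group all patients sharing the same follow-up time.
        while i < len(data) and data[i][0] == t:
            deaths += data[i][1]
            leaving += 1
            i += 1
        if deaths > 0:
            survival *= 1 - deaths / n_at_risk
            curve.append((t, survival))
        # Deaths and censored patients both leave the risk set after time t.
        n_at_risk -= leaving
    return curve


if __name__ == "__main__":
    # Hypothetical follow-up in months; 0 marks a censored patient.
    months = [6, 7, 10, 15, 19, 25]
    died = [1, 0, 1, 1, 0, 1]
    for t, s in kaplan_meier(months, died):
        print(f"t={t:>3}  S(t)={s:.3f}")
```

    Comparing two such curves, for example high versus low expressers, would additionally need a log-rank test, which is not shown here.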

  2. TLR and NLRP3 inflammasome expression deregulation in macrophages of adult rats subjected to neonatal malnutrition and infected with methicillin-resistant Staphylococcus aureus.

    PubMed

    Gomes de Morais, Natália; Barreto da Costa, Thacianna; Bezerra de Lira, Joana Maria; da Cunha Gonçalves de Albuquerque, Suênia; Alves Pereira, Valéria Rêgo; de Paiva Cavalcanti, Milena; Machado Barbosa de Castro, Célia Maria

    2017-01-01

    Nutritional aggression in critical periods may lead to epigenetic changes that affect gene expression. The aim of this study was to assess the effect of neonatal malnutrition on the expression of toll-like receptor (TLR)-2, TLR-4, and NLRP3 receptors, caspase-1 enzyme, and interleukin (IL)-1β production in macrophages infected with methicillin-resistant (MRSA) and methicillin-sensitive (MSSA) Staphylococcus aureus. Wistar rats (N = 24) were divided into two distinct groups: nourished (17% casein) and malnourished (8% casein). Four systems were established after the isolation of mononuclear cells: negative control, positive control, MRSA, and MSSA. The plates were incubated at 37°C for 24 h in a humidified atmosphere with 5% carbon dioxide. Tests were performed after this period to analyze the expression of standard recognition receptors, caspase-1 enzyme, and the production of IL-1β. Student's t test and analysis of variance were used in the statistical analysis; P < 0.05 was considered statistically significant. Malnutrition reduced animal growth and the expression of TLR-2, TLR-4, and NLRP3 receptors, the caspase-1 enzyme, and the IL-1β levels in macrophages infected with lipopolysaccharides in the present study. However, the interaction between S. aureus and the macrophages promoted greater gene expression of receptors and enzymes. The neonatal malnutrition model compromised the expression of standard recognition receptors and of the caspase-1 enzyme, as well as the production of IL-1β. However, the combination of S. aureus and neonatal malnutrition led to intense transcription of such innate immunity components. Therefore, the deregulation in the expression of TLR and NLRP3 receptors and of the caspase-1 enzyme may induce extensive tissue injury and favor the permanence and spread of these bacteria, especially those that are methicillin resistant. Copyright © 2016 Elsevier Inc. All rights reserved.

  3. Evaluation of Different Normalization and Analysis Procedures for Illumina Gene Expression Microarray Data Involving Small Changes

    PubMed Central

    Johnstone, Daniel M.; Riveros, Carlos; Heidari, Moones; Graham, Ross M.; Trinder, Debbie; Berretta, Regina; Olynyk, John K.; Scott, Rodney J.; Moscato, Pablo; Milward, Elizabeth A.

    2013-01-01

    While Illumina microarrays can be used successfully for detecting small gene expression changes due to their high degree of technical replicability, there is little information on how different normalization and differential expression analysis strategies affect outcomes. To evaluate this, we assessed concordance across gene lists generated by applying different combinations of normalization strategy and analytical approach to two Illumina datasets with modest expression changes. In addition to using traditional statistical approaches, we also tested an approach based on combinatorial optimization. We found that the choice of both normalization strategy and analytical approach considerably affected outcomes, in some cases leading to substantial differences in gene lists and subsequent pathway analysis results. Our findings suggest that important biological phenomena may be overlooked when there is a routine practice of using only one approach to investigate all microarray datasets. Analytical artefacts of this kind are likely to be especially relevant for datasets involving small fold changes, where inherent technical variation—if not adequately minimized by effective normalization—may overshadow true biological variation. This report provides some basic guidelines for optimizing outcomes when working with Illumina datasets involving small expression changes. PMID:27605185
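    One simple way to quantify concordance across gene lists is pairwise set overlap. The sketch below uses the Jaccard index, which is an assumption made for illustration; the abstract does not state which concordance measure was used, and the pipeline names and gene symbols are hypothetical.

```python
from itertools import combinations


def jaccard(list_a, list_b):
    """Jaccard index: shared genes divided by all genes in either list."""
    a, b = set(list_a), set(list_b)
    union = a | b
    return len(a & b) / len(union) if union else 1.0


def pairwise_concordance(gene_lists):
    """Jaccard index for every pair of named gene lists."""
    return {
        (x, y): jaccard(gene_lists[x], gene_lists[y])
        for x, y in combinations(sorted(gene_lists), 2)
    }


if __name__ == "__main__":
    # Hypothetical differentially expressed gene lists from three
    # normalization/analysis combinations.
    lists = {
        "quantile+ttest": ["Hfe", "Tfrc", "Hamp", "Lcn2"],
        "loess+ttest": ["Hfe", "Tfrc", "Lcn2", "Slc40a1"],
        "quantile+rankprod": ["Hfe", "Hamp"],
    }
    for pair, score in pairwise_concordance(lists).items():
        print(pair, round(score, 2))
```

    Low pairwise scores would flag the situation the abstract warns about, where the choice of pipeline rather than the biology determines the gene list.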

  4. Estimating differential expression from multiple indicators

    PubMed Central

    Ilmjärv, Sten; Hundahl, Christian Ansgar; Reimets, Riin; Niitsoo, Margus; Kolde, Raivo; Vilo, Jaak; Vasar, Eero; Luuk, Hendrik

    2014-01-01

    Regardless of the advent of high-throughput sequencing, microarrays remain central in current biomedical research. Conventional microarray analysis pipelines apply data reduction before the estimation of differential expression, which is likely to render the estimates susceptible to noise from signal summarization and reduce statistical power. We present a probe-level framework, which capitalizes on the high number of concurrent measurements to provide more robust differential expression estimates. The framework naturally extends to various experimental designs and target categories (e.g. transcripts, genes, genomic regions) as well as small sample sizes. Benchmarking in relation to popular microarray and RNA-sequencing data-analysis pipelines indicated high and stable performance on the Microarray Quality Control dataset and in a cell-culture model of hypoxia. Experimental data exhibiting long-range epigenetic silencing of gene expression was used to demonstrate the efficacy of detecting differential expression of genomic regions, a level of analysis not embraced by conventional workflows. Finally, we designed and conducted an experiment to identify hypothermia-responsive genes in terms of monotonic time-response. As a novel insight, hypothermia-dependent up-regulation of multiple genes of two major antioxidant pathways was identified and verified by quantitative real-time PCR. PMID:24586062

  5. Synthetic data sets for the identification of key ingredients for RNA-seq differential analysis.

    PubMed

    Rigaill, Guillem; Balzergue, Sandrine; Brunaud, Véronique; Blondet, Eddy; Rau, Andrea; Rogier, Odile; Caius, José; Maugis-Rabusseau, Cathy; Soubigou-Taconnat, Ludivine; Aubourg, Sébastien; Lurin, Claire; Martin-Magniette, Marie-Laure; Delannoy, Etienne

    2018-01-01

    Numerous statistical pipelines are now available for the differential analysis of gene expression measured with RNA-sequencing technology. Most of them are based on similar statistical frameworks after normalization, differing primarily in the choice of data distribution, mean and variance estimation strategy and data filtering. We propose an evaluation of the impact of these choices when few biological replicates are available through the use of synthetic data sets. This framework is based on real data sets and allows the exploration of various scenarios differing in the proportion of non-differentially expressed genes. Hence, it provides an evaluation of the key ingredients of the differential analysis, free of the biases associated with the simulation of data using parametric models. Our results show the relevance of a proper modeling of the mean by using linear or generalized linear modeling. Once the mean is properly modeled, the impact of the other parameters on the performance of the test is much less important. Finally, we propose to use the simple visualization of the raw P-value histogram as a practical evaluation criterion of the performance of differential analysis methods on real data sets. © The Author 2016. Published by Oxford University Press. All rights reserved. For Permissions, please email: journals.permissions@oup.com.
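    The raw P-value histogram proposed as an evaluation criterion is easy to produce. The sketch below bins P-values into equal-width intervals and renders a text histogram; the bin count and the function names are illustrative choices, not taken from the paper.

```python
def pvalue_histogram(pvalues, n_bins=20):
    """Count raw P-values in n_bins equal-width bins covering [0, 1]."""
    counts = [0] * n_bins
    for p in pvalues:
        if not 0.0 <= p <= 1.0:
            raise ValueError(f"P-value out of range: {p}")
        # p == 1.0 is placed in the last bin.
        counts[min(int(p * n_bins), n_bins - 1)] += 1
    return counts


def render(counts, width=40):
    """Draw the histogram as text, one line per bin."""
    peak = max(counts) or 1
    n = len(counts)
    lines = []
    for i, c in enumerate(counts):
        bar = "#" * round(width * c / peak)
        lines.append(f"[{i / n:.2f}, {(i + 1) / n:.2f}) {bar} {c}")
    return "\n".join(lines)


if __name__ == "__main__":
    import random

    random.seed(0)
    # Simulated mixture: uniform P-values for null genes plus a small
    # set of P-values concentrated near zero for truly changed genes.
    pvals = [random.random() for _ in range(900)]
    pvals += [random.random() ** 6 for _ in range(100)]
    print(render(pvalue_histogram(pvals)))
```

    Under the null hypothesis P-values are uniformly distributed, so a well-behaved analysis typically shows a flat histogram with an excess near zero; other shapes suggest that the model fits the data poorly.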

  6. Verification of TREX1 as a promising indicator of judging the prognosis of osteosarcoma.

    PubMed

    Feng, Jinyi; Lan, Ruilong; Cai, Guanxiong; Lin, Jinluan; Wang, Xinwen; Lin, Jianhua; Han, Deping

    2016-11-24

    The study aimed to explore the correlation between the expression of TREX1 and the metastasis and survival time of patients with osteosarcoma, as well as the biological characteristics of osteosarcoma cells, for judging the prognosis of osteosarcoma. The correlation between the expression of TREX1 protein and the occurrence of pulmonary metastasis in 45 cases of osteosarcoma was analyzed. The CD133+ and CD133- cell subsets of osteosarcoma stem cells were sorted by flow cytometry. Tumorsphere culture, clone formation, growth curve, osteogenic and adipogenic differentiation, tumor-formation ability in nude mice, sensitivity to chemotherapeutic drugs, and other cytobiological behaviors were compared between the two cell subsets; the expression of the stem cell-related genes Nanog and Oct4 was compared; and the expression of TREX1 protein and mRNA was compared between the two cell subsets. The data were statistically analyzed. Measurement data were compared between the two groups using the t test. Count data were compared between the two groups using the χ² test and Kaplan-Meier survival analysis. A P value <0.05 indicated that the difference was statistically significant. The expression of TREX1 protein in patients with osteosarcoma in the metastasis group was significantly lower than that in the non-metastasis group (P < 0.05). Up to the last follow-up visit, the average survival time of the former was significantly shorter than that of the latter (P < 0.05). The expression of TREX1 in human osteosarcoma CD133+ cell subsets was significantly lower than that in CD133- cell subsets.
    The stemness-related genes Nanog and Oct4 were highly expressed in human osteosarcoma CD133+ cell subsets with lower expression of TREX1. The biological characterization experiments showed that human CD133+ cell subsets with low TREX1 expression could form tumorspheres, formed more colonies, proliferated strongly, had high osteogenic and adipogenic differentiation potential, had strong tumor-forming ability in nude mice, and had low sensitivity to cisplatin. The expression of TREX1 may be related to metastasis in patients with osteosarcoma. The expression of TREX1 was closely related to the cytobiological characteristics of osteosarcoma stem cells. TREX1 can play an important role in the occurrence and development of osteosarcoma, and it is expected to become an effective new index for the evaluation of prognosis.

  7. ABCG2 in peptic ulcer: gene expression and mutation analysis.

    PubMed

    Salagacka-Kubiak, Aleksandra; Żebrowska, Marta; Wosiak, Agnieszka; Balcerczak, Mariusz; Mirowski, Marek; Balcerczak, Ewa

    2016-08-01

    The aim of this study was to evaluate the participation of polymorphism at position C421A and mRNA expression of the ABCG2 gene in the development of peptic ulcers, which is a very common and severe disease. ABCG2, encoded by the ABCG2 gene, has been found inter alia in the gastrointestinal tract, where it plays a protective role eliminating xenobiotics from cells into the extracellular environment. The materials for the study were biopsies of gastric mucosa taken during a routine endoscopy. For genotyping by polymerase chain reaction-restriction fragment length polymorphism (PCR-RFLP) at position C421A, DNA was isolated from 201 samples, while for the mRNA expression level by real-time PCR, RNA was isolated from 60 patients. The control group of healthy individuals consisted of 97 blood donors. The dominant genotype in the group of peptic ulcer patients and healthy individuals was homozygous CC. No statistically significant differences were found between healthy individuals and the whole group of peptic ulcer patients or between the subgroups of peptic ulcer patients (infected and uninfected with Helicobacter pylori). ABCG2 expression relative to GAPDH expression was found in 38 of the 60 gastric mucosa samples. The expression level of the gene varied greatly among cases. A statistically significant relationship between the intensity of H. pylori infection and ABCG2 gene expression was shown (p = 0.0375): the more intense the infection, the higher the level of ABCG2 expression.

  8. The application of artificial intelligence to microarray data: identification of a novel gene signature to identify bladder cancer progression.

    PubMed

    Catto, James W F; Abbod, Maysam F; Wild, Peter J; Linkens, Derek A; Pilarsky, Christian; Rehman, Ishtiaq; Rosario, Derek J; Denzinger, Stefan; Burger, Maximilian; Stoehr, Robert; Knuechel, Ruth; Hartmann, Arndt; Hamdy, Freddie C

    2010-03-01

    New methods for identifying bladder cancer (BCa) progression are required. Gene expression microarrays can reveal insights into disease biology and identify novel biomarkers. However, these experiments produce large datasets that are difficult to interpret. Our objective was to develop a novel method of microarray analysis combining two forms of artificial intelligence (AI), neurofuzzy modelling (NFM) and artificial neural networks (ANN), and to validate it in a BCa cohort. We used AI and statistical analyses to identify progression-related genes in a microarray dataset (n=66 tumours, n=2800 genes). The AI-selected genes were then investigated in a second cohort (n=262 tumours) using immunohistochemistry. We compared the accuracy of AI and statistical approaches to identify tumour progression. AI identified 11 progression-associated genes (odds ratio [OR]: 0.70; 95% confidence interval [CI], 0.56-0.87; p=0.0004), and these were more discriminate than genes chosen using statistical analyses (OR: 1.24; 95% CI, 0.96-1.60; p=0.09). The expression of six AI-selected genes (LIG3, FAS, KRT18, ICAM1, DSG2, and BRCA2) was determined using commercial antibodies and successfully identified tumour progression (concordance index: 0.66; log-rank test: p=0.01). AI-selected genes were more discriminate than pathologic criteria at determining progression (Cox multivariate analysis: p=0.01). Limitations include the use of statistical correlation to identify 200 genes for AI analysis and the fact that we did not compare regression-identified genes with immunohistochemistry. AI and statistical analyses use different techniques of inference to determine gene-phenotype associations and identify distinct prognostic gene signatures that are equally valid. We have identified a prognostic gene signature whose members reflect a variety of carcinogenic pathways that could identify progression in non-muscle-invasive BCa. 2009 European Association of Urology. Published by Elsevier B.V. All rights reserved.

  9. Intra-articular decorin influences the fibrosis genetic expression profile in a rabbit model of joint contracture.

    PubMed

    Abdel, M P; Morrey, M E; Barlow, J D; Grill, D E; Kolbert, C P; An, K N; Steinmann, S P; Morrey, B F; Sanchez-Sotelo, J

    2014-01-01

    The goal of this study was to determine whether intra-articular administration of the potentially anti-fibrotic agent decorin influences the expression of genes involved in the fibrotic cascade, and ultimately leads to less contracture, in an animal model. A total of 18 rabbits underwent an operation on their right knees to form contractures. Six limbs in group 1 received four intra-articular injections of decorin; six limbs in group 2 received four intra-articular injections of bovine serum albumin (BSA) over eight days; six limbs in group 3 received no injections. The contracted limbs of rabbits in group 1 were biomechanically and genetically compared with the contracted limbs of rabbits in groups 2 and 3, with the use of a calibrated joint measuring device and custom microarray, respectively. There was no statistical difference in the flexion contracture angles between those limbs that received intra-articular decorin versus those that received intra-articular BSA (66° vs 69°; p = 0.41). Likewise, there was no statistical difference between those limbs that received intra-articular decorin versus those who had no injection (66° vs 72°; p = 0.27). When compared with BSA, decorin led to a statistically significant increase in the mRNA expression of 12 genes (p < 0.01). In addition, there was a statistical change in the mRNA expression of three genes, when compared with those without injection. In this model, when administered intra-articularly at eight weeks, 2 mg of decorin had no significant effect on joint contractures. However, our genetic analysis revealed a significant alteration in several fibrotic genes. Cite this article: Bone Joint Res 2014;3:82-8.

  10. Accounting for isotopic clustering in Fourier transform mass spectrometry data analysis for clinical diagnostic studies.

    PubMed

    Kakourou, Alexia; Vach, Werner; Nicolardi, Simone; van der Burgt, Yuri; Mertens, Bart

    2016-10-01

    Mass spectrometry based clinical proteomics has emerged as a powerful tool for high-throughput protein profiling and biomarker discovery. Recent improvements in mass spectrometry technology have boosted the potential of proteomic studies in biomedical research. However, the complexity of the proteomic expression introduces new statistical challenges in summarizing and analyzing the acquired data. Statistical methods for optimally processing proteomic data are currently a growing field of research. In this paper we present simple, yet appropriate methods to preprocess, summarize and analyze high-throughput MALDI-FTICR mass spectrometry data, collected in a case-control fashion, while dealing with the statistical challenges that accompany such data. The known statistical properties of the isotopic distribution of the peptide molecules are used to preprocess the spectra and translate the proteomic expression into a condensed data set. Information on either the intensity level or the shape of the identified isotopic clusters is used to derive summary measures on which diagnostic rules for disease status allocation will be based. Results indicate that both the shape of the identified isotopic clusters and the overall intensity level carry information on the class outcome and can be used to predict the presence or absence of the disease.

  11. HIF-1α and GLUT-1 Expression in Atypical Endometrial Hyperplasia, Type I and II Endometrial Carcinoma: A Potential Role in Pathogenesis

    PubMed Central

    Abdou, Asmaa Gaber; Wahed, Moshira Mohammed Abdel; Kassem, Hend Abdou

    2016-01-01

    Introduction Hypoxia-Inducible Factor 1α (HIF-1α) is one of the major adaptive responses to hypoxia, regulating the activity of glucose transporter-1 (GLUT-1), responsible for glucose uptake. Aim To evaluate the immunohistochemical expression of both HIF-1α and GLUT-1 in type I and II endometrial carcinoma and their correlation with the available clinicopathologic variables in each type. Materials and Methods A retrospective study was conducted on archival blocks diagnosed in the pathology department between April 2010 and August 2014, including 9 cases of atypical hyperplasia and 67 cases of endometrial carcinoma. Evaluation of both HIF-1α and GLUT-1 expression was performed using standard immunohistochemical techniques on cut sections from selected paraffin embedded blocks. Statistical Analysis Descriptive analysis of the variables was performed, and statistical significance was calculated by the non-parametric chi-square test using the Statistical Package for the Social Sciences version 12.0 (SPSS). Results HIF-1α was expressed in epithelial (88.9%, 52.2%, 61.2% and 50%) and stromal (33.3%, 74.6%, 71.4% and 83.3%) components of hyperplasia, total cases of EC, type I and II EC, respectively. GLUT-1 was expressed in the epithelial component of 88.9%, 98.5%, 98% and 100% of hyperplasia, total EC cases, type I and II EC, respectively. The necrosis related pattern of epithelial HIF-1α expression was in favour of type II (p=0.018) and grade III (p=0.038). HIF-1α H-score was associated with high apoptosis in both type I and total cases of EC (p=0.04). GLUT-1 H-score was negatively correlated with apoptotic count (p=0.04) and associated with high grade (p=0.003) and advanced stage in total EC (p=0.004). GLUT-1 H-score was correlated with the pattern of HIF-1α staining in all cases of EC (p=0.04). Conclusion The role of HIF-1α in epithelial cells may differ from that of stromal cells in EC; however, they augment the expression of each other, supporting the crosstalk between them.
    The stepwise increase in H-score of GLUT-1 in the studied cases implies its potential role in carcinogenesis of EC. HIF-1α may promote GLUT-1 expression in EC, especially in areas surrounding necrosis. The differences between type I and type II EC regarding HIF-1α and GLUT-1 expression may confirm the differences in their aetiopathogenesis. PMID:27437226

  12. Comparisons between Arabidopsis thaliana and Drosophila melanogaster in relation to Coding and Noncoding Sequence Length and Gene Expression

    PubMed Central

    Caldwell, Rachel; Lin, Yan-Xia; Zhang, Ren

    2015-01-01

    There is a continuing interest in the analysis of gene architecture and gene expression to determine the relationship that may exist. Advances in high-quality sequencing technologies and large-scale resource datasets have increased the understanding of relationships and cross-referencing of expression data to the large genome data. Although a negative correlation between expression level and gene (especially transcript) length has been generally accepted, there have been some conflicting results arising from the literature concerning the impacts of different regions of genes, and the underlying reason is not well understood. The research aims to apply quantile regression techniques for statistical analysis of coding and noncoding sequence length and gene expression data in the plant, Arabidopsis thaliana, and fruit fly, Drosophila melanogaster, to determine if a relationship exists and if there is any variation or similarities between these species. The quantile regression analysis found that the coding sequence length and gene expression correlations varied, and similarities emerged for the noncoding sequence length (5′ and 3′ UTRs) between animal and plant species. In conclusion, the information described in this study provides the basis for further exploration into gene regulation with regard to coding and noncoding sequence length. PMID:26114098
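    Quantile regression fits a chosen quantile of the response by minimizing an asymmetric absolute loss, often called the pinball or check loss. The sketch below shows only that loss and the simplest case of a constant-only model, where the minimizer is a sample quantile; it is not the authors' analysis, and the expression values are invented.

```python
def pinball_loss(residual, tau):
    """Check loss for quantile level tau (0 < tau < 1).

    Positive residuals are weighted by tau, negative ones by (1 - tau).
    """
    return tau * residual if residual >= 0 else (tau - 1) * residual


def total_loss(values, constant, tau):
    """Summed pinball loss when every value is predicted by one constant."""
    return sum(pinball_loss(v - constant, tau) for v in values)


def best_constant(values, tau):
    """Observed value minimizing the summed pinball loss (a sample quantile)."""
    return min(values, key=lambda c: total_loss(values, c, tau))


if __name__ == "__main__":
    # Hypothetical expression levels with one extreme gene.
    expression = [1, 2, 3, 4, 100]
    print(best_constant(expression, 0.5))  # median-type fit
    print(best_constant(expression, 0.9))  # upper-quantile fit
```

    A full quantile regression replaces the constant with a linear function of predictors such as coding or noncoding sequence length and minimizes the same loss, which is usually done with a linear programming solver.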

  13. Identifiability of PBPK Models with Applications to ...

    EPA Pesticide Factsheets

    Any statistical model should be identifiable in order for estimates and tests using it to be meaningful. We consider statistical analysis of physiologically-based pharmacokinetic (PBPK) models in which parameters cannot be estimated precisely from available data, and discuss different types of identifiability that occur in PBPK models and give reasons why they occur. We particularly focus on how the mathematical structure of a PBPK model and lack of appropriate data can lead to statistical models in which it is impossible to estimate at least some parameters precisely. Methods are reviewed which can determine whether a purely linear PBPK model is globally identifiable. We propose a theorem which determines when identifiability at a set of finite and specific values of the mathematical PBPK model (global discrete identifiability) implies identifiability of the statistical model. However, we are unable to establish conditions that imply global discrete identifiability, and conclude that the only safe approach to analysis of PBPK models involves Bayesian analysis with truncated priors. Finally, computational issues regarding posterior simulations of PBPK models are discussed. The methodology is very general and can be applied to numerous PBPK models which can be expressed as linear time-invariant systems. A real data set of a PBPK model for exposure to dimethyl arsinic acid (DMA(V)) is presented to illustrate the proposed methodology.

  14. Computerized image analysis for quantitative neuronal phenotyping in zebrafish.

    PubMed

    Liu, Tianming; Lu, Jianfeng; Wang, Ye; Campbell, William A; Huang, Ling; Zhu, Jinmin; Xia, Weiming; Wong, Stephen T C

    2006-06-15

    An integrated microscope image analysis pipeline is developed for automatic analysis and quantification of phenotypes in zebrafish with altered expression of Alzheimer's disease (AD)-linked genes. We hypothesize that a slight impairment of neuronal integrity in a large number of zebrafish carrying the mutant genotype can be detected through the computerized image analysis method. Key functionalities of our zebrafish image processing pipeline include quantification of neuron loss in zebrafish embryos due to knockdown of AD-linked genes, automatic detection of defective somites, and quantitative measurement of gene expression levels in zebrafish with altered expression of AD-linked genes or treatment with a chemical compound. These quantitative measurements enable the archival of analyzed results and relevant meta-data. The structured database is organized for statistical analysis and data modeling to better understand neuronal integrity and phenotypic changes of zebrafish under different perturbations. Our results show that the computerized analysis is comparable to manual counting with equivalent accuracy and improved efficacy and consistency. Development of such an automated data analysis pipeline represents a significant step forward to achieve accurate and reproducible quantification of neuronal phenotypes in large scale or high-throughput zebrafish imaging studies.

  15. Thymidylate synthase (TS) protein expression as a prognostic factor in advanced colorectal cancer: a comparison with TS mRNA expression.

    PubMed

    Nakagawa, Tateo; Shimada, Mitsuo; Kurita, Nobuhiro; Iwata, Takashi; Nishioka, Masanori; Yoshikawa, Kozo; Higashijima, Jun; Utsunomiya, Tohru

    2012-06-01

    The role of intratumoral thymidylate synthase (TS) mRNA or protein expression is still controversial, and little has been reported regarding the relationship between them in colorectal cancer. Forty-six patients with advanced colorectal cancer who underwent surgical resection were included. TS mRNA expression was determined by the Danenberg tumor profile method based on laser-captured micro-dissection of the tumor cells. TS protein expression was evaluated using immunohistochemical staining. TS mRNA expression tended to be related to TS protein expression. No statistically significant difference in overall survival was found between the TS mRNA high group and low group, regardless of whether adjuvant chemotherapy was performed. Overall survival in the TS protein negative group was significantly higher than that in the positive group, both in all patients and in the patients without adjuvant chemotherapy. Multivariate analysis showed that TS protein expression was an independent prognostic factor. TS protein expression tends to be related to TS mRNA expression and is an independent prognostic factor in advanced colorectal cancer.

  16. Robust Linear Models for Cis-eQTL Analysis.

    PubMed

    Rantalainen, Mattias; Lindgren, Cecilia M; Holmes, Christopher C

    2015-01-01

    Expression Quantitative Trait Loci (eQTL) analysis enables characterisation of functional genetic variation influencing expression levels of individual genes. In outbred populations, including humans, eQTLs are commonly analysed using the conventional linear model, adjusting for relevant covariates, assuming an allelic dosage model and a Gaussian error term. However, gene expression data generally have noise that induces heavy-tailed errors relative to the Gaussian distribution and often include atypical observations, or outliers. Such departures from modelling assumptions can lead to an increased rate of type II errors (false negatives), and to some extent also type I errors (false positives). Careful model checking can reduce the risk of type I errors but often not type II errors, since it is generally too time-consuming to carefully check all models with a non-significant effect in large-scale and genome-wide studies. Here we propose the application of a robust linear model for eQTL analysis to reduce adverse effects of deviations from the assumption of Gaussian residuals. We present results from a simulation study as well as results from the analysis of real eQTL data sets. Our findings suggest that in many situations robust models have the potential to provide more reliable eQTL results compared to conventional linear models, particularly in respect to reducing type II errors due to non-Gaussian noise. Post-genomic data, such as that generated in genome-wide eQTL studies, are often noisy and frequently contain atypical observations. Robust statistical models have the potential to provide more reliable results and increased statistical power under non-Gaussian conditions. The results presented here suggest that robust models should be considered routinely alongside other commonly used methodologies for eQTL analysis.
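    The contrast between a conventional and a robust linear model can be sketched for a single gene regressed on allelic dosage. The example below uses a Huber-type M-estimator fitted by iteratively reweighted least squares, which is one common robust approach and an assumption here; the abstract does not specify the estimator, and the data are invented.

```python
from statistics import median


def weighted_line_fit(x, y, w):
    """Weighted least-squares fit of y = intercept + slope * x."""
    sw = sum(w)
    mx = sum(wi * xi for wi, xi in zip(w, x)) / sw
    my = sum(wi * yi for wi, yi in zip(w, y)) / sw
    sxx = sum(wi * (xi - mx) ** 2 for wi, xi in zip(w, x))
    sxy = sum(wi * (xi - mx) * (yi - my) for wi, xi, yi in zip(w, x, y))
    slope = sxy / sxx
    return my - slope * mx, slope


def huber_line_fit(x, y, k=1.345, n_iter=50):
    """Huber M-estimate via iteratively reweighted least squares."""
    w = [1.0] * len(x)
    intercept, slope = weighted_line_fit(x, y, w)
    for _ in range(n_iter):
        resid = [yi - (intercept + slope * xi) for xi, yi in zip(x, y)]
        med = median(resid)
        # Robust residual scale from the median absolute deviation.
        scale = median([abs(r - med) for r in resid]) / 0.6745
        if scale == 0:
            break
        # Residuals beyond k * scale are downweighted instead of dropped.
        w = [1.0 if abs(r) <= k * scale else k * scale / abs(r) for r in resid]
        intercept, slope = weighted_line_fit(x, y, w)
    return intercept, slope


if __name__ == "__main__":
    # Hypothetical allelic dosage (0, 1, 2) and expression; last value is an outlier.
    dosage = [0, 0, 0, 1, 1, 1, 2, 2, 2]
    expr = [1.0, 1.1, 0.9, 1.5, 1.6, 1.4, 2.0, 2.1, 6.0]
    print("OLS slope:   ", weighted_line_fit(dosage, expr, [1.0] * 9)[1])
    print("Robust slope:", huber_line_fit(dosage, expr)[1])
```

    In this toy data the single atypical observation inflates the ordinary least-squares slope, while the reweighted fit stays closer to the trend of the remaining samples.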

  17. Whole Genome Gene Expression Meta-Analysis of Inflammatory Bowel Disease Colon Mucosa Demonstrates Lack of Major Differences between Crohn's Disease and Ulcerative Colitis

    PubMed Central

    Østvik, Ann E.; Drozdov, Ignat; Gustafsson, Bjørn I.; Kidd, Mark; Beisvag, Vidar; Torp, Sverre H.; Waldum, Helge L.; Martinsen, Tom Christian; Damås, Jan Kristian; Espevik, Terje; Sandvik, Arne K.

    2013-01-01

    Background: In inflammatory bowel disease (IBD), genetic susceptibility together with environmental factors disturbs gut homeostasis producing chronic inflammation. The two main IBD subtypes are Ulcerative colitis (UC) and Crohn’s disease (CD). We present the to-date largest microarray gene expression study on IBD encompassing both inflamed and un-inflamed colonic tissue. A meta-analysis including all available, comparable data was used to explore important aspects of IBD inflammation, thereby validating consistent gene expression patterns. Methods: Colon pinch biopsies from IBD patients were analysed using Illumina whole genome gene expression technology. Differential expression (DE) was identified using LIMMA linear model in the R statistical computing environment. Results were enriched for gene ontology (GO) categories. Sets of genes encoding antimicrobial proteins (AMP) and proteins involved in T helper (Th) cell differentiation were used in the interpretation of the results. All available data sets were analysed using the same methods, and results were compared on a global and focused level as t-scores. Results: Gene expression in inflamed mucosa from UC and CD are remarkably similar. The meta-analysis confirmed this. The patterns of AMP and Th cell-related gene expression were also very similar, except for IL23A which was consistently higher expressed in UC than in CD. Un-inflamed tissue from patients demonstrated minimal differences from healthy controls. Conclusions: There is no difference in the Th subgroup involvement between UC and CD. Th1/Th17 related expression, with little Th2 differentiation, dominated both diseases. The different IL23A expression between UC and CD suggests an IBD subtype specific role. AMPs, previously little studied, are strongly overexpressed in IBD. The presented meta-analysis provides a sound background for further research on IBD pathobiology. PMID:23468882

  18. Whole genome gene expression meta-analysis of inflammatory bowel disease colon mucosa demonstrates lack of major differences between Crohn's disease and ulcerative colitis.

    PubMed

    Granlund, Atle van Beelen; Flatberg, Arnar; Østvik, Ann E; Drozdov, Ignat; Gustafsson, Bjørn I; Kidd, Mark; Beisvag, Vidar; Torp, Sverre H; Waldum, Helge L; Martinsen, Tom Christian; Damås, Jan Kristian; Espevik, Terje; Sandvik, Arne K

    2013-01-01

    In inflammatory bowel disease (IBD), genetic susceptibility together with environmental factors disturbs gut homeostasis producing chronic inflammation. The two main IBD subtypes are Ulcerative colitis (UC) and Crohn's disease (CD). We present the to-date largest microarray gene expression study on IBD encompassing both inflamed and un-inflamed colonic tissue. A meta-analysis including all available, comparable data was used to explore important aspects of IBD inflammation, thereby validating consistent gene expression patterns. Colon pinch biopsies from IBD patients were analysed using Illumina whole genome gene expression technology. Differential expression (DE) was identified using LIMMA linear model in the R statistical computing environment. Results were enriched for gene ontology (GO) categories. Sets of genes encoding antimicrobial proteins (AMP) and proteins involved in T helper (Th) cell differentiation were used in the interpretation of the results. All available data sets were analysed using the same methods, and results were compared on a global and focused level as t-scores. Gene expression in inflamed mucosa from UC and CD are remarkably similar. The meta-analysis confirmed this. The patterns of AMP and Th cell-related gene expression were also very similar, except for IL23A which was consistently higher expressed in UC than in CD. Un-inflamed tissue from patients demonstrated minimal differences from healthy controls. There is no difference in the Th subgroup involvement between UC and CD. Th1/Th17 related expression, with little Th2 differentiation, dominated both diseases. The different IL23A expression between UC and CD suggests an IBD subtype specific role. AMPs, previously little studied, are strongly overexpressed in IBD. The presented meta-analysis provides a sound background for further research on IBD pathobiology.

  19. Expression of thymidylate synthase (TS) and its prognostic significance in patients with cutaneous angiosarcoma.

    PubMed

    Shimizu, A; Kaira, K; Okubo, Y; Utsumi, D; Bolag, A; Yasuda, M; Takahashi, K; Ishikawa, O

    2017-01-01

    Cutaneous angiosarcoma (CA) is extremely rare, and little is known about the biological significance of possible biomarkers for chemotherapeutic agents. Thymidylate synthase (TS) is an attractive target for cancer treatment in various human neoplasms. It remains unclear whether the expression of TS is associated with the clinicopathological features of CA patients. The aim of this study was to elucidate the relationship between TS expression and the clinicopathological significance in CA patients. Fifty-one patients with CA were included in this study. TS expression and Ki-67 labeling index were examined using immunohistochemical analysis. TS was positively expressed in 39% (20/51) of CA patients. No statistically significant prognostic factor was identified as a predictor of overall survival (OS) for all patients by univariate analysis, whereas a significant prognostic variable for progression free survival (PFS) was found to be the clinical stage. In addition, both univariate and multivariate analyses confirmed that positive expression of TS was a significant predictor of worse PFS in CA patients of clinical stage 1. Positive TS expression in CA was identified as a significant predictor of worse outcome in patients of clinical stage 1.

  20. Increase in the adhesion molecule P-selectin in endothelium overlying atherosclerotic plaques. Coexpression with intercellular adhesion molecule-1.

    PubMed Central

    Johnson-Tidey, R. R.; McGregor, J. L.; Taylor, P. R.; Poston, R. N.

    1994-01-01

    P-selectin (GMP-140) is an adhesion molecule present within endothelial cells that is rapidly translocated to the cell membrane upon activation, where it mediates endothelial-leukocyte interactions. Immunohistochemical analysis of human atherosclerotic plaques has shown strong expression of P-selectin by the endothelium overlying active atherosclerotic plaques. P-selectin is not, however, detected in normal arterial endothelium or in endothelium overlying inactive fibrous plaques. Color image analysis was used to quantitate the degree of P-selectin expression in the endothelium and demonstrates a statistically significant increase in P-selectin expression by atherosclerotic endothelial cells. Double immunofluorescence shows that some of this P-selectin is expressed on the luminal surface of the endothelial cells. Previous work has demonstrated a significant up-regulation in the expression of the intercellular adhesion molecule-1 in atherosclerotic endothelium and a study on the expression of intercellular adhesion molecule-1 and P-selectin in atherosclerosis shows a highly positive correlation. These results suggest that the selective and cooperative expression of P-selectin and intercellular adhesion molecule-1 may be involved in the recruitment of monocytes into sites of atherosclerosis. PMID:7513951

  1. Statistical plant set estimation using Schroeder-phased multisinusoidal input design

    NASA Technical Reports Server (NTRS)

    Bayard, D. S.

    1992-01-01

    A frequency domain method is developed for plant set estimation. The estimation of a plant 'set' rather than a point estimate is required to support many methods of modern robust control design. The approach here is based on using a Schroeder-phased multisinusoid input design which has the special property of placing input energy only at the discrete frequency points used in the computation. A detailed analysis of the statistical properties of the frequency domain estimator is given, leading to exact expressions for the probability distribution of the estimation error, and many important properties. It is shown that, for any nominal parametric plant estimate, one can use these results to construct an overbound on the additive uncertainty to any prescribed statistical confidence. The 'soft' bound thus obtained can be used to replace 'hard' bounds presently used in many robust control analysis and synthesis methods.
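
    As a rough illustration of the input design named in the title, the sketch below builds one period of a flat-amplitude multisine with Schroeder phases, phi_k = -pi*k*(k-1)/N, and compares its crest factor with that of a zero-phase multisine. The report's own parameterisation is not given in the abstract, so the equal amplitudes, the harmonic grid k = 1..N, and the function names are assumptions.

```python
import math


def schroeder_phases(n_harmonics):
    """Schroeder phases for a flat-amplitude multisine: phi_k = -pi*k*(k-1)/N."""
    return [-math.pi * k * (k - 1) / n_harmonics for k in range(1, n_harmonics + 1)]


def multisine(n_harmonics, n_samples, phases):
    """One period of sum_k cos(2*pi*k*t/n_samples + phi_k) for harmonics k = 1..N."""
    return [
        sum(math.cos(2 * math.pi * k * t / n_samples + phases[k - 1])
            for k in range(1, n_harmonics + 1))
        for t in range(n_samples)
    ]


def crest_factor(signal):
    """Peak amplitude divided by RMS amplitude."""
    rms = math.sqrt(sum(s * s for s in signal) / len(signal))
    return max(abs(s) for s in signal) / rms


N, SAMPLES = 16, 256  # SAMPLES > 2*N keeps every harmonic below the Nyquist frequency
crest_zero = crest_factor(multisine(N, SAMPLES, [0.0] * N))
crest_schroeder = crest_factor(multisine(N, SAMPLES, schroeder_phases(N)))
print(f"zero-phase crest factor: {crest_zero:.2f}, Schroeder: {crest_schroeder:.2f}")
```

    Both signals carry energy only at the N chosen harmonics. The Schroeder phasing spreads that energy more evenly in time, which is why this kind of input is attractive when the peak amplitude is constrained.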

  2. miRNA Temporal Analyzer (mirnaTA): a bioinformatics tool for identifying differentially expressed microRNAs in temporal studies using normal quantile transformation.

    PubMed

    Cer, Regina Z; Herrera-Galeano, J Enrique; Anderson, Joseph J; Bishop-Lilly, Kimberly A; Mokashi, Vishwesh P

    2014-01-01

    Understanding the biological roles of microRNAs (miRNAs) is an active area of research that has produced a surge of publications in PubMed, particularly in cancer research. Along with this increasing interest, many open-source bioinformatics tools to identify existing and/or discover novel miRNAs in next-generation sequencing (NGS) reads have become available. While miRNA identification and discovery tools are significantly improved, the development of miRNA differential expression analysis tools, especially in temporal studies, remains substantially challenging. Further, the installation of currently available software is non-trivial and steps of testing with example datasets, trying with one's own dataset, and interpreting the results require notable expertise and time. Consequently, there is a strong need for a tool that allows scientists to normalize raw data, perform statistical analyses, and provide intuitive results without having to invest significant efforts. We have developed miRNA Temporal Analyzer (mirnaTA), a bioinformatics package to identify differentially expressed miRNAs in temporal studies. mirnaTA is written in Perl and R (Version 2.13.0 or later) and can be run across multiple platforms, such as Linux, Mac and Windows. In the current version, mirnaTA requires users to provide a simple, tab-delimited, matrix file containing miRNA name and count data from a minimum of two to a maximum of 20 time points and three replicates. To recalibrate data and remove technical variability, raw data is normalized using Normal Quantile Transformation (NQT), and a linear regression model is used to locate any miRNAs which are differentially expressed in a linear pattern. Subsequently, remaining miRNAs which do not fit a linear model are further analyzed using one of two non-linear methods: 1) cumulative distribution function (CDF) or 2) analysis of variance (ANOVA). After both linear and non-linear analyses are completed, statistically significant miRNAs (P < 0.05) are plotted as heat maps using hierarchical cluster analysis and Euclidean distance matrix computation methods. mirnaTA is an open-source, bioinformatics tool to aid scientists in identifying differentially expressed miRNAs which could be further mined for biological significance. It is expected to provide researchers with a means of interpreting raw data to statistical summaries in a fast and intuitive manner.
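
    The normalization step can be sketched as a rank-based mapping onto standard-normal quantiles. mirnaTA itself is written in Perl and R and its exact NQT formula is not stated in the abstract, so the version below (average ranks for ties, probabilities (rank - 0.5)/n) is one common variant offered as an assumption, with a hypothetical function name.

```python
from statistics import NormalDist


def normal_quantile_transform(values):
    """Map values to standard-normal quantiles by rank (ties share their average rank)."""
    n = len(values)
    order = sorted(range(n), key=lambda i: values[i])
    ranks = [0.0] * n
    i = 0
    while i < n:
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < n and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg_rank = (i + j) / 2 + 1  # 1-based average rank of the tie group
        for k in range(i, j + 1):
            ranks[order[k]] = avg_rank
        i = j + 1
    std_normal = NormalDist()
    # (rank - 0.5) / n keeps every probability strictly inside (0, 1).
    return [std_normal.inv_cdf((r - 0.5) / n) for r in ranks]


raw_counts = [5.0, 100.0, 20.0]  # hypothetical counts for one sample
z_scores = normal_quantile_transform(raw_counts)
print(z_scores)
```

    The transform depends only on ranks, so a single very large count cannot dominate the later regression step.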

  3. Kuhn-Tucker optimization based reliability analysis for probabilistic finite elements

    NASA Technical Reports Server (NTRS)

    Liu, W. K.; Besterfield, G.; Lawrence, M.; Belytschko, T.

    1988-01-01

    The fusion of probability finite element method (PFEM) and reliability analysis for fracture mechanics is considered. Reliability analysis with specific application to fracture mechanics is presented, and computational procedures are discussed. Explicit expressions for the optimization procedure with regard to fracture mechanics are given. The results show the PFEM is a very powerful tool in determining the second-moment statistics. The method can determine the probability of failure or fracture subject to randomness in load, material properties and crack length, orientation, and location.

  4. CD10 and osteopontin expression in dentigerous cyst and ameloblastoma.

    PubMed

    Masloub, Shaimaa M; Abdel-Azim, Adel M; Elhamid, Ehab S Abd

    2011-05-24

    To investigate the expression of CD10 and osteopontin in dentigerous cyst and ameloblastoma and to correlate their expression with neoplastic potentiality of dentigerous cyst and local invasion and risk of local recurrence in ameloblastoma. CD10 and osteopontin expression was studied by means of immunohistochemistry in 9 cases of dentigerous cysts (DC) and 17 cases of ameloblastoma. There were 7 unicystic ameloblastoma (UCA) and 10 multicystic ameloblastoma (MCA). Positive cases were included in the statistical analysis, carried out on the tabulated data using the Open Office Spreadsheet 3.2.1 under Linux operating system. Analysis of variance and correlation studies were performed using "R" under Linux operating system (R Development Core Team, 2010). Tukey post-hoc test was also performed as a pair-wise test. The significance level was set at 0.05. High CD10 and osteopontin expression was observed in UCA and MCA, and low CD10 and osteopontin expression was observed in DC. Significant correlation was seen between CD10 and osteopontin expression and neoplastic potentiality of DC and local invasion and risk of recurrences in ameloblastoma. In DC, high CD10 and osteopontin expression may indicate the neoplastic potentiality of certain areas. In UCA & MCA, high CD10 and osteopontin expression may identify areas with locally invasive behavior and high risk of recurrence.

  5. COSMIC MICROWAVE BACKGROUND LIKELIHOOD APPROXIMATION FOR BANDED PROBABILITY DISTRIBUTIONS

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Gjerløw, E.; Mikkelsen, K.; Eriksen, H. K.

    We investigate sets of random variables that can be arranged sequentially such that a given variable only depends conditionally on its immediate predecessor. For such sets, we show that the full joint probability distribution may be expressed exclusively in terms of uni- and bivariate marginals. Under the assumption that the cosmic microwave background (CMB) power spectrum likelihood only exhibits correlations within a banded multipole range, Δl_C, we apply this expression to two outstanding problems in CMB likelihood analysis. First, we derive a statistically well-defined hybrid likelihood estimator, merging two independent (e.g., low- and high-l) likelihoods into a single expression that properly accounts for correlations between the two. Applying this expression to the Wilkinson Microwave Anisotropy Probe (WMAP) likelihood, we verify that the effect of correlations on cosmological parameters in the transition region is negligible in terms of cosmological parameters for WMAP; the largest relative shift seen for any parameter is 0.06σ. However, because this may not hold for other experimental setups (e.g., for different instrumental noise properties or analysis masks), but must rather be verified on a case-by-case basis, we recommend our new hybridization scheme for future experiments for statistical self-consistency reasons. Second, we use the same expression to improve the convergence rate of the Blackwell-Rao likelihood estimator, reducing the required number of Monte Carlo samples by several orders of magnitude, and thereby extend it to high-l applications.
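
    For the simplest case described in the first sentence, where each variable depends only on its immediate predecessor, the factorization into uni- and bivariate marginals follows from the chain rule. The notation x_1, ..., x_n is ours, and the paper's generalisation to a band of width Δl_C is not reproduced here:

```latex
\begin{align}
p(x_1,\dots,x_n)
  &= p(x_1)\prod_{i=2}^{n} p(x_i \mid x_{i-1})
   = p(x_1)\prod_{i=2}^{n} \frac{p(x_{i-1},x_i)}{p(x_{i-1})} \\
  &= \frac{\prod_{i=1}^{n-1} p(x_i,x_{i+1})}{\prod_{i=2}^{n-1} p(x_i)} .
\end{align}
```

    The leading factor p(x_1) cancels against the i = 2 denominator, so only the interior univariate marginals remain in the final expression.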

  6. An effect size filter improves the reproducibility in spectral counting-based comparative proteomics.

    PubMed

    Gregori, Josep; Villarreal, Laura; Sánchez, Alex; Baselga, José; Villanueva, Josep

    2013-12-16

    The microarray community has shown that the low reproducibility observed in gene expression-based biomarker discovery studies is partially due to relying solely on p-values to get the lists of differentially expressed genes. Their conclusions recommended complementing the p-value cutoff with the use of effect-size criteria. The aim of this work was to evaluate the influence of such an effect-size filter on spectral counting-based comparative proteomic analysis. The results proved that the filter increased the number of true positives and decreased the number of false positives and the false discovery rate of the dataset. These results were confirmed by simulation experiments where the effect size filter was used to evaluate systematically variable fractions of differentially expressed proteins. Our results suggest that relaxing the p-value cut-off followed by a post-test filter based on effect size and signal level thresholds can increase the reproducibility of statistical results obtained in comparative proteomic analysis. Based on our work, we recommend using a filter consisting of a minimum absolute log2 fold change of 0.8 and a minimum signal of 2-4 SpC on the most abundant condition for the general practice of comparative proteomics. The implementation of feature filtering approaches could improve proteomic biomarker discovery initiatives by increasing the reproducibility of the results obtained among independent laboratories and MS platforms. Quality control analysis of microarray-based gene expression studies pointed out that the low reproducibility observed in the lists of differentially expressed genes could be partially attributed to the fact that these lists are generated relying solely on p-values. Our study has established that the implementation of an effect size post-test filter improves the statistical results of spectral count-based quantitative proteomics. The results proved that the filter increased the number of true positives while decreasing the false positives and the false discovery rate of the datasets. The results presented here prove that a post-test filter applying a reasonable effect size and signal level thresholds helps to increase the reproducibility of statistical results in comparative proteomic analysis. Furthermore, the implementation of feature filtering approaches could improve proteomic biomarker discovery initiatives by increasing the reproducibility of results obtained among independent laboratories and MS platforms. This article is part of a Special Issue entitled: Standardization and Quality Control in Proteomics. Copyright © 2013 Elsevier B.V. All rights reserved.
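
    The recommended post-test filter can be written as a short predicate. The thresholds of 0.8 for |log2 fold change| and 2 SpC for minimum signal come from the abstract; the p-value cutoff, the pseudocount, the use of mean spectral counts per condition, and the function name are assumptions made for this sketch.

```python
import math


def passes_filter(counts_a, counts_b, p_value,
                  p_cutoff=0.05, min_abs_log2fc=0.8, min_signal=2.0, pseudocount=0.5):
    """Return True if a protein passes the p-value, effect-size and signal thresholds."""
    mean_a = sum(counts_a) / len(counts_a)
    mean_b = sum(counts_b) / len(counts_b)
    # The pseudocount (an assumption) avoids taking the log of zero.
    log2fc = math.log2((mean_a + pseudocount) / (mean_b + pseudocount))
    significant = p_value <= p_cutoff
    large_effect = abs(log2fc) >= min_abs_log2fc
    # Signal is checked on the most abundant condition, as the abstract recommends.
    enough_signal = max(mean_a, mean_b) >= min_signal
    return significant and large_effect and enough_signal


# Hypothetical spectral counts for three proteins, three replicates per condition.
strong_change = passes_filter([10, 12, 11], [4, 5, 5], p_value=0.03)
small_effect = passes_filter([10, 11, 12], [9, 10, 9], p_value=0.01)
low_signal = passes_filter([1, 0, 1], [0, 0, 0], p_value=0.01)
print(strong_change, small_effect, low_signal)
```

    Only the first protein passes: the second has a significant p-value but a small fold change, and the third has a large fold change built on very few spectral counts.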

  7. Differential protein-coding gene and long noncoding RNA expression in smoking-related lung squamous cell carcinoma.

    PubMed

    Li, Shicheng; Sun, Xiao; Miao, Shuncheng; Liu, Jia; Jiao, Wenjie

    2017-11-01

    Cigarette smoking is one of the greatest preventable risk factors for developing cancer, and most cases of lung squamous cell carcinoma (lung SCC) are associated with smoking. The pathogenesis mechanism of tumor progress is unclear. This study aimed to identify biomarkers in smoking-related lung cancer, including protein-coding gene, long noncoding RNA, and transcription factors. We selected and obtained messenger RNA microarray datasets and clinical data from the Gene Expression Omnibus database to identify gene expression altered by cigarette smoking. Integrated bioinformatic analysis was used to clarify biological functions of the identified genes, including Gene Ontology (GO), Kyoto Encyclopedia of Genes and Genomes (KEGG) pathway, the construction of a protein-protein interaction network, transcription factor, and statistical analyses. Subsequent quantitative real-time PCR was utilized to verify these bioinformatic analyses. Five hundred and ninety-eight differentially expressed genes and 21 long noncoding RNA were identified in smoking-related lung SCC. GO and KEGG pathway analysis showed that identified genes were enriched in the cancer-related functions and pathways. The protein-protein interaction network revealed seven hub genes identified in lung SCC. Several transcription factors and their binding sites were predicted. The results of real-time quantitative PCR revealed that AURKA and BIRC5 were significantly upregulated and LINC00094 was downregulated in the tumor tissues of smoking patients. Further statistical analysis indicated that dysregulation of AURKA, BIRC5, and LINC00094 indicated poor prognosis in lung SCC. Protein-coding genes AURKA, BIRC5, and LINC00094 could be biomarkers or therapeutic targets for smoking-related lung SCC. © 2017 The Authors. Thoracic Cancer published by China Lung Oncology Group and John Wiley & Sons Australia, Ltd.

  8. Why weight? Modelling sample and observational level variability improves power in RNA-seq analyses.

    PubMed

    Liu, Ruijie; Holik, Aliaksei Z; Su, Shian; Jansz, Natasha; Chen, Kelan; Leong, Huei San; Blewitt, Marnie E; Asselin-Labat, Marie-Liesse; Smyth, Gordon K; Ritchie, Matthew E

    2015-09-03

    Variations in sample quality are frequently encountered in small RNA-sequencing experiments, and pose a major challenge in a differential expression analysis. Removal of high variation samples reduces noise, but at a cost of reducing power, thus limiting our ability to detect biologically meaningful changes. Similarly, retaining these samples in the analysis may not reveal any statistically significant changes due to the higher noise level. A compromise is to use all available data, but to down-weight the observations from more variable samples. We describe a statistical approach that facilitates this by modelling heterogeneity at both the sample and observational levels as part of the differential expression analysis. At the sample level this is achieved by fitting a log-linear variance model that includes common sample-specific or group-specific parameters that are shared between genes. The estimated sample variance factors are then converted to weights and combined with observational level weights obtained from the mean-variance relationship of the log-counts-per-million using 'voom'. A comprehensive analysis involving both simulations and experimental RNA-sequencing data demonstrates that this strategy leads to a universally more powerful analysis and fewer false discoveries when compared to conventional approaches. This methodology has wide application and is implemented in the open-source 'limma' package. © The Author(s) 2015. Published by Oxford University Press on behalf of Nucleic Acids Research.

  9. Controlling false-negative errors in microarray differential expression analysis: a PRIM approach.

    PubMed

    Cole, Steve W; Galic, Zoran; Zack, Jerome A

    2003-09-22

    Theoretical considerations suggest that current microarray screening algorithms may fail to detect many true differences in gene expression (Type II analytic errors). We assessed 'false negative' error rates in differential expression analyses by conventional linear statistical models (e.g. t-test), microarray-adapted variants (e.g. SAM, Cyber-T), and a novel strategy based on hold-out cross-validation. The latter approach employs the machine-learning algorithm Patient Rule Induction Method (PRIM) to infer minimum thresholds for reliable change in gene expression from Boolean conjunctions of fold-induction and raw fluorescence measurements. Monte Carlo analyses based on four empirical data sets show that conventional statistical models and their microarray-adapted variants overlook more than 50% of genes showing significant up-regulation. Conjoint PRIM prediction rules recover approximately twice as many differentially expressed transcripts while maintaining strong control over false-positive (Type I) errors. As a result, experimental replication rates increase and total analytic error rates decline. RT-PCR studies confirm that gene inductions detected by PRIM but overlooked by other methods represent true changes in mRNA levels. PRIM-based conjoint inference rules thus represent an improved strategy for high-sensitivity screening of DNA microarrays. Freestanding JAVA application at http://microarray.crump.ucla.edu/focus

  10. Immunohistochemical assessment of Fhit protein expression in advanced gastric carcinomas in correlation with Helicobacter pylori infection and survival time.

    PubMed

    Czyzewska, Jolanta; Guzińska-Ustymowicz, Katarzyna; Pryczynicz, Anna; Kemona, Andrzej; Bandurski, Roman

    2009-01-01

    Fhit protein is known to play a role in the process of neoplastic transformation. It has been demonstrated that FHIT gene inactivation is manifested by a lack or very low concentration of Fhit protein in tissues collected from tumours in many organs, including head, neck, breast, lungs, stomach or large intestine. The study included a group of 80 patients with advanced gastric carcinomas. The expression of Fhit protein was assessed by means of the immunohistochemical method (avidin-biotin-streptavidin) in the sections fixed in formalin and embedded in paraffin, using rabbit polyclonal anti-Fhit antibody (Abcam, UK) at 1:200. Statistical analysis did not show any correlation of the expression of Fhit protein in the main mass of tumour and in the metastasis to lymph node with gender, depth of wall invasion, histological differentiation, Lauren's classification, Bormann's classification, metastases to local lymph nodes or Helicobacter pylori infection. However, a strong statistical correlation was revealed of Fhit protein expression in the main mass of tumour with patients' age (p=0.04) and tumour location in the stomach (p=0.02). No relationship was found between Fhit expression in the main mass of tumour and survival time (p=0.26).

  11. Cloning of the cDNA encoding adenosine 5'-monophosphate deaminase 1 and its mRNA expression in Japanese flounder Paralichthys olivaceus

    NASA Astrophysics Data System (ADS)

    Jiang, Keyong; Sun, Shujuan; Liu, Mei; Wang, Baojie; Meng, Xiaolin; Wang, Lei

    2013-01-01

    AMP deaminase catalyzes the conversion of AMP into IMP and ammonia. In the present study, a full-length cDNA of AMPD1 from skeletal muscle of Japanese flounder Paralichthys olivaceus was cloned and characterized. The 2 526 bp cDNA contains a 5'-UTR of 78 bp, a 3'-UTR of 237 bp and an open reading frame (ORF) of 2 211 bp, which encodes a protein of 736 amino acids. The predicted protein contains a highly conserved AMP deaminase motif (SLSTDDP) and an ATP-binding site sequence (EPLMEEYAIAAQVFK). Phylogenetic analysis showed that the AMPD1 and AMPD3 genes originate from the same branch, but are evolutionarily distant from the AMPD2 gene. RT-PCR showed that the flounder AMPD1 gene was expressed only in skeletal muscle. QRT-PCR analysis revealed a statistically significant 2.54 fold higher level of AMPD1 mRNA in adult muscle (750±40 g) compared with juvenile muscle (7.5±2 g) (P<0.05). HPLC analysis showed that the IMP content in adult muscle (3.35±0.21 mg/g) was also statistically significantly higher than in juvenile muscle (1.08±0.04 mg/g) (P<0.05). There is a direct relationship between the AMPD1 gene expression level and IMP content in the skeletal muscle of juvenile and adult flounders. These results may provide useful information for quality improvement and molecular breeding of aquatic animals.

  12. An 80-gene set to predict response to preoperative chemoradiotherapy for rectal cancer by principle component analysis.

    PubMed

    Empuku, Shinichiro; Nakajima, Kentaro; Akagi, Tomonori; Kaneko, Kunihiko; Hijiya, Naoki; Etoh, Tsuyoshi; Shiraishi, Norio; Moriyama, Masatsugu; Inomata, Masafumi

    2016-05-01

    Preoperative chemoradiotherapy (CRT) for locally advanced rectal cancer not only improves the postoperative local control rate, but also induces downstaging. However, it has not been established how to individually select patients who receive effective preoperative CRT. The aim of this study was to identify a predictor of response to preoperative CRT for locally advanced rectal cancer. This study is additional to our multicenter phase II study evaluating the safety and efficacy of preoperative CRT using oral fluorouracil (UMIN ID: 03396). From April, 2009 to August, 2011, 26 biopsy specimens obtained prior to CRT were analyzed by cyclopedic microarray analysis. Response to CRT was evaluated according to a histological grading system using surgically resected specimens. To decide on the number of genes for dividing into responder and non-responder groups, we statistically analyzed the data using a dimension reduction method, a principal component analysis. Of the 26 cases, 11 were responders and 15 non-responders. No significant difference was found in clinical background data between the two groups. We determined that the optimal number of genes for the prediction of response was 80 of 40,000 and the functions of these genes were analyzed. When comparing non-responders with responders, genes expressed at a high level functioned in alternative splicing, whereas those expressed at a low level functioned in the septin complex. Thus, an 80-gene expression set that predicts response to preoperative CRT for locally advanced rectal cancer was identified using a novel statistical method.

  13. The use of open source bioinformatics tools to dissect transcriptomic data.

    PubMed

    Nitsche, Benjamin M; Ram, Arthur F J; Meyer, Vera

    2012-01-01

    Microarrays are a valuable technology to study fungal physiology on a transcriptomic level. Various microarray platforms are available comprising both single and two channel arrays. Despite different technologies, preprocessing of microarray data generally includes quality control, background correction, normalization, and summarization of probe level data. Subsequently, depending on the experimental design, diverse statistical analysis can be performed, including the identification of differentially expressed genes and the construction of gene coexpression networks. We describe how Bioconductor, a collection of open source and open development packages for the statistical programming language R, can be used for dissecting microarray data. We provide fundamental details that facilitate the process of getting started with R and Bioconductor. Using two publicly available microarray datasets from Aspergillus niger, we give detailed protocols on how to identify differentially expressed genes and how to construct gene coexpression networks.

  14. The statistical kinematical theory of X-ray diffraction as applied to reciprocal-space mapping

    PubMed

    Nesterets; Punegov

    2000-11-01

    The statistical kinematical X-ray diffraction theory is developed to describe reciprocal-space maps (RSMs) from deformed crystals with defects of the structure. The general solutions for coherent and diffuse components of the scattered intensity in reciprocal space are derived. As an example, the explicit expressions for intensity distributions in the case of spherical defects and of a mosaic crystal were obtained. The theory takes into account the instrumental function of the triple-crystal diffractometer and can therefore be used for experimental data analysis.

  15. Spectral statistics of the uni-modular ensemble

    NASA Astrophysics Data System (ADS)

    Joyner, Christopher H.; Smilansky, Uzy; Weidenmüller, Hans A.

    2017-09-01

    We investigate the spectral statistics of Hermitian matrices in which the elements are chosen uniformly from U(1), called the uni-modular ensemble (UME), in the limit of large matrix size. Using three complementary methods (a supersymmetric integration method, a combinatorial graph-theoretical analysis and a Brownian motion approach), we are able to derive expressions for 1/N corrections to the mean spectral moments and also analyse the fluctuations about this mean. By addressing the same ensemble from three different points of view, we can critically compare their relative advantages and derive some new results.

  16. Modelling gene expression profiles related to prostate tumor progression using binary states

    PubMed Central

    2013-01-01

    Background Cancer is a complex disease commonly characterized by the disrupted activity of several cancer-related genes such as oncogenes and tumor-suppressor genes. Previous studies suggest that the process of tumor progression to malignancy is dynamic and can be traced by changes in gene expression. Despite the enormous efforts made for differential expression detection and biomarker discovery, few methods have been designed to relate the gene expression level to tumor stage during malignancy progression. Such models could help us understand the dynamics and simplify or reveal the complexity of tumor progression. Methods We have modeled an on-off state of gene activation per sample then per stage to select gene expression profiles associated with tumor progression. The selection is guided by the statistical significance of profiles based on randomly permuted datasets. Results We show that our method identifies expected profiles corresponding to oncogenes and tumor suppressor genes in a prostate tumor progression dataset. Comparisons with other methods support our findings and indicate that a considerable proportion of significant profiles is found neither by other statistical tests commonly used to detect differential expression between tumor stages nor by other tailored methods. Ontology and pathway analysis concurred with these findings. Conclusions Results suggest that our methodology may be a valuable tool to study tumor malignancy progression, which might reveal novel cancer therapies. PMID:23721350
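
    The on-off modelling and permutation-based selection described in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the fixed threshold, the spread statistic (largest minus smallest per-stage "on" fraction), the function names and the toy data are all assumptions made for illustration.

```python
import random

def binarize(values, threshold):
    """Map expression values to on (1) / off (0) states (threshold is an assumption)."""
    return [1 if v > threshold else 0 for v in values]

def stage_profile(states, stages):
    """Fraction of 'on' samples in each stage."""
    profile = {}
    for stage in sorted(set(stages)):
        members = [s for s, g in zip(states, stages) if g == stage]
        profile[stage] = sum(members) / len(members)
    return profile

def permutation_p_value(states, stages, n_perm=2000, seed=0):
    """Permutation p-value for the spread (max - min) of per-stage 'on' fractions."""
    def spread(labels):
        fractions = stage_profile(states, labels).values()
        return max(fractions) - min(fractions)

    rng = random.Random(seed)
    observed = spread(stages)
    shuffled = list(stages)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)  # break any link between state and stage
        if spread(shuffled) >= observed:
            hits += 1
    # Add-one correction keeps the estimate strictly positive.
    return (hits + 1) / (n_perm + 1)

# Hypothetical toy data: one gene measured in 12 samples across three stages.
expression = [0.2, 0.4, 0.3, 0.5, 2.1, 2.4, 1.9, 2.2, 3.0, 2.8, 3.1, 2.9]
stages = ["normal"] * 4 + ["primary"] * 4 + ["metastatic"] * 4
states = binarize(expression, threshold=1.0)
profile = stage_profile(states, stages)
p_value = permutation_p_value(states, stages)
```

    In this toy case the gene is off in every normal sample and on in every tumor sample, so few random relabelings reproduce a profile as extreme as the observed one.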

  17. Identifying Epigenetic Biomarkers using Maximal Relevance and Minimal Redundancy Based Feature Selection for Multi-Omics Data.

    PubMed

    Mallik, Saurav; Bhadra, Tapas; Maulik, Ujjwal

    2017-01-01

    Epigenetic biomarker discovery is an important task in bioinformatics. In this article, we develop a new framework for identifying statistically significant epigenetic biomarkers using maximal-relevance and minimal-redundancy criterion based feature (gene) selection for multi-omics datasets. Firstly, we determine the genes that have both expression and methylation values and follow a normal distribution. Similarly, we identify the genes that have both expression and methylation values but do not follow a normal distribution. For each case, we utilize a gene-selection method that provides maximal-relevant, but variable-weighted minimum-redundant genes as top ranked genes. For statistical validation, we apply a t-test on both the expression and methylation data consisting of only the normally distributed top ranked genes to determine how many of them are both differentially expressed and methylated. Similarly, we utilize the Limma package for performing a non-parametric Empirical Bayes test on both expression and methylation data comprising only the non-normally distributed top ranked genes to identify how many of them are both differentially expressed and methylated. We finally report the top-ranking significant gene-markers with biological validation. Moreover, our framework improves positive predictive rate and reduces false positive rate in marker identification. In addition, we provide a comparative analysis of our gene-selection method as well as other methods based on classification performances obtained using several well-known classifiers.

  18. Harnessing the complexity of gene expression data from cancer: from single gene to structural pathway methods

    PubMed Central

    2012-01-01

    High-dimensional gene expression data provide a rich source of information because they capture the expression level of genes in dynamic states that reflect the biological functioning of a cell. For this reason, such data are suitable to reveal systems related properties inside a cell, e.g., in order to elucidate molecular mechanisms of complex diseases like breast or prostate cancer. However, this is not only strongly dependent on the sample size and the correlation structure of a data set, but also on the statistical hypotheses tested. Many different approaches have been developed over the years to analyze gene expression data to (I) identify changes in single genes, (II) identify changes in gene sets or pathways, and (III) identify changes in the correlation structure in pathways. In this paper, we review statistical methods for all three types of approaches, including subtypes, in the context of cancer data and provide links to software implementations and tools and address also the general problem of multiple hypotheses testing. Further, we provide recommendations for the selection of such analysis methods. Reviewers This article was reviewed by Arcady Mushegian, Byung-Soo Kim and Joel Bader. PMID:23227854

  19. A Comparative Study of Student Math Skills: Perceptions, Validation, and Recommendations

    ERIC Educational Resources Information Center

    Jones, Thomas W.; Price, Barbara A.; Randall, Cindy H.

    2011-01-01

    A study was conducted at a southern university in sophomore level production classes to assess skills such as the order of arithmetic operations, decimal and percent conversion, solving of algebraic expressions, and evaluation of formulas. The study was replicated using business statistics and quantitative analysis classes at a southeastern…

  20. Uncertainty Analysis of Instrument Calibration and Application

    NASA Technical Reports Server (NTRS)

    Tripp, John S.; Tcheng, Ping

    1999-01-01

    Experimental aerodynamic researchers require estimated precision and bias uncertainties of measured physical quantities, typically at 95 percent confidence levels. Uncertainties of final computed aerodynamic parameters are obtained by propagation of individual measurement uncertainties through the defining functional expressions. In this paper, rigorous mathematical techniques are extended to determine precision and bias uncertainties of any instrument-sensor system. Through this analysis, instrument uncertainties determined through calibration are now expressed as functions of the corresponding measurement for linear and nonlinear univariate and multivariate processes. Treatment of correlated measurement precision error is developed. During laboratory calibration, calibration standard uncertainties are assumed to be an order of magnitude less than those of the instrument being calibrated. Often calibration standards do not satisfy this assumption. This paper applies rigorous statistical methods for inclusion of calibration standard uncertainty and covariance due to the order of their application. The effects of mathematical modeling error on calibration bias uncertainty are quantified. The effects of experimental design on uncertainty are analyzed. The importance of replication is emphasized, techniques for estimation of both bias and precision uncertainties using replication are developed. Statistical tests for stationarity of calibration parameters over time are obtained.
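
    The propagation of individual measurement uncertainties through a defining functional expression, as mentioned in this abstract, can be sketched as follows. This is a first-order sketch that assumes uncorrelated inputs (the report itself also treats correlated errors, which this does not); the helper name, the central-difference derivative and the example quantities are assumptions for illustration.

```python
import math

def propagate_uncertainty(func, values, sigmas, rel_step=1e-6):
    """First-order propagation for uncorrelated inputs:
    sigma_f^2 = sum_i (df/dx_i * sigma_i)^2, with numerical derivatives."""
    variance = 0.0
    for i, (value, sigma) in enumerate(zip(values, sigmas)):
        step = rel_step * max(abs(value), 1.0)
        upper = list(values)
        lower = list(values)
        upper[i] = value + step
        lower[i] = value - step
        # Central difference approximates the partial derivative df/dx_i.
        derivative = (func(*upper) - func(*lower)) / (2 * step)
        variance += (derivative * sigma) ** 2
    return math.sqrt(variance)

def dynamic_pressure(density, velocity):
    """Example functional expression: q = 0.5 * rho * v^2."""
    return 0.5 * density * velocity ** 2

# Hypothetical measurements: density 1.225 +/- 0.005 kg/m^3, velocity 50.0 +/- 0.2 m/s.
sigma_q = propagate_uncertainty(dynamic_pressure, [1.225, 50.0], [0.005, 0.2])
```

    With these made-up numbers the velocity term dominates the combined uncertainty, because its sensitivity-weighted contribution (rho * v * sigma_v) is about twice that of the density term.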

  1. MiRNA-TF-gene network analysis through ranking of biomolecules for multi-informative uterine leiomyoma dataset.

    PubMed

    Mallik, Saurav; Maulik, Ujjwal

    2015-10-01

    Gene ranking is an important problem in bioinformatics. Here, we propose a new framework for ranking biomolecules (viz., miRNAs, transcription-factors/TFs and genes) in a multi-informative uterine leiomyoma dataset having both gene expression and methylation data using (statistical) eigenvector centrality based approach. At first, genes that are both differentially expressed and methylated, are identified using Limma statistical test. A network, comprising these genes, corresponding TFs from TRANSFAC and ITFP databases, and targeter miRNAs from miRWalk database, is then built. The biomolecules are then ranked based on eigenvector centrality. Our proposed method provides better average accuracy in hub gene and non-hub gene classifications than other methods. Furthermore, pre-ranked Gene set enrichment analysis is applied on the pathway database as well as GO-term databases of Molecular Signatures Database with providing a pre-ranked gene-list based on different centrality values for comparing among the ranking methods. Finally, top novel potential gene-markers for the uterine leiomyoma are provided. Copyright © 2015 Elsevier Inc. All rights reserved.
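
    Ranking the nodes of a network by eigenvector centrality, the core idea in this abstract, can be sketched with a plain power iteration. The node names and the small undirected network below are hypothetical, and the sketch is not the authors' pipeline.

```python
def eigenvector_centrality(adjacency, n_iter=200, tol=1e-10):
    """Power iteration on A + I for an undirected, connected network.

    Adding the identity leaves the eigenvectors unchanged but avoids
    oscillation on bipartite graphs. Scores are normalised to sum to 1.
    """
    n = len(adjacency)
    scores = [1.0 / n] * n
    for _ in range(n_iter):
        updated = [
            scores[i] + sum(adjacency[i][j] * scores[j] for j in range(n))
            for i in range(n)
        ]
        norm = sum(updated)
        updated = [u / norm for u in updated]
        if max(abs(u - s) for u, s in zip(updated, scores)) < tol:
            return updated
        scores = updated
    return scores

# Hypothetical network: one miRNA, one transcription factor, two genes.
nodes = ["miRNA_1", "TF_1", "gene_1", "gene_2"]
adjacency = [
    [0, 1, 1, 0],  # miRNA_1 - TF_1, miRNA_1 - gene_1
    [1, 0, 1, 1],  # TF_1 - miRNA_1, TF_1 - gene_1, TF_1 - gene_2
    [1, 1, 0, 0],  # gene_1 - miRNA_1, gene_1 - TF_1
    [0, 1, 0, 0],  # gene_2 - TF_1
]
scores = eigenvector_centrality(adjacency)
ranking = sorted(zip(nodes, scores), key=lambda pair: pair[1], reverse=True)
```

    Here the transcription factor ranks first because it connects to every other node, while the gene attached only to it ranks last.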

  2. Meta-analysis methods for combining multiple expression profiles: comparisons, statistical characterization and an application guideline

    PubMed Central

    2013-01-01

    Background As high-throughput genomic technologies become accurate and affordable, an increasing number of data sets have been accumulated in the public domain and genomic information integration and meta-analysis have become routine in biomedical research. In this paper, we focus on microarray meta-analysis, where multiple microarray studies with relevant biological hypotheses are combined in order to improve candidate marker detection. Many methods have been developed and applied in the literature, but their performance and properties have only been minimally investigated. There is currently no clear conclusion or guideline as to the proper choice of a meta-analysis method given an application; the decision essentially requires both statistical and biological considerations. Results We performed 12 microarray meta-analysis methods for combining multiple simulated expression profiles, and such methods can be categorized for different hypothesis setting purposes: (1) HS(A): DE genes with non-zero effect sizes in all studies, (2) HS(B): DE genes with non-zero effect sizes in one or more studies and (3) HS(r): DE gene with non-zero effect in "majority" of studies. We then performed a comprehensive comparative analysis through six large-scale real applications using four quantitative statistical evaluation criteria: detection capability, biological association, stability and robustness. We elucidated hypothesis settings behind the methods and further applied multi-dimensional scaling (MDS) and an entropy measure to characterize the meta-analysis methods and data structure, respectively. Conclusions The aggregated results from the simulation study categorized the 12 methods into three hypothesis settings (HS(A), HS(B), and HS(r)). Evaluation in real data and results from MDS and entropy analyses provided an insightful and practical guideline to the choice of the most suitable method in a given application. All source files for simulation and real data are available on the author’s publication website. PMID:24359104

  3. Meta-analysis methods for combining multiple expression profiles: comparisons, statistical characterization and an application guideline.

    PubMed

    Chang, Lun-Ching; Lin, Hui-Min; Sibille, Etienne; Tseng, George C

    2013-12-21

    As high-throughput genomic technologies become accurate and affordable, an increasing number of data sets have been accumulated in the public domain and genomic information integration and meta-analysis have become routine in biomedical research. In this paper, we focus on microarray meta-analysis, where multiple microarray studies with relevant biological hypotheses are combined in order to improve candidate marker detection. Many methods have been developed and applied in the literature, but their performance and properties have only been minimally investigated. There is currently no clear conclusion or guideline as to the proper choice of a meta-analysis method given an application; the decision essentially requires both statistical and biological considerations. We performed 12 microarray meta-analysis methods for combining multiple simulated expression profiles, and such methods can be categorized for different hypothesis setting purposes: (1) HS(A): DE genes with non-zero effect sizes in all studies, (2) HS(B): DE genes with non-zero effect sizes in one or more studies and (3) HS(r): DE gene with non-zero effect in "majority" of studies. We then performed a comprehensive comparative analysis through six large-scale real applications using four quantitative statistical evaluation criteria: detection capability, biological association, stability and robustness. We elucidated hypothesis settings behind the methods and further applied multi-dimensional scaling (MDS) and an entropy measure to characterize the meta-analysis methods and data structure, respectively. The aggregated results from the simulation study categorized the 12 methods into three hypothesis settings (HS(A), HS(B), and HS(r)). Evaluation in real data and results from MDS and entropy analyses provided an insightful and practical guideline to the choice of the most suitable method in a given application. All source files for simulation and real data are available on the author's publication website.

  4. Comparison of software packages for detecting differential expression in RNA-seq studies

    PubMed Central

    Seyednasrollah, Fatemeh; Laiho, Asta

    2015-01-01

    RNA-sequencing (RNA-seq) has rapidly become a popular tool to characterize transcriptomes. A fundamental research problem in many RNA-seq studies is the identification of reliable molecular markers that show differential expression between distinct sample groups. Together with the growing popularity of RNA-seq, a number of data analysis methods and pipelines have already been developed for this task. Currently, however, there is no clear consensus about the best practices yet, which makes the choice of an appropriate method a daunting task especially for a basic user without a strong statistical or computational background. To assist the choice, we perform here a systematic comparison of eight widely used software packages and pipelines for detecting differential expression between sample groups in a practical research setting and provide general guidelines for choosing a robust pipeline. In general, our results demonstrate how the data analysis tool utilized can markedly affect the outcome of the data analysis, highlighting the importance of this choice. PMID:24300110

  5. Comparison of software packages for detecting differential expression in RNA-seq studies.

    PubMed

    Seyednasrollah, Fatemeh; Laiho, Asta; Elo, Laura L

    2015-01-01

    RNA-sequencing (RNA-seq) has rapidly become a popular tool to characterize transcriptomes. A fundamental research problem in many RNA-seq studies is the identification of reliable molecular markers that show differential expression between distinct sample groups. Together with the growing popularity of RNA-seq, a number of data analysis methods and pipelines have already been developed for this task. Currently, however, there is no clear consensus about the best practices yet, which makes the choice of an appropriate method a daunting task especially for a basic user without a strong statistical or computational background. To assist the choice, we perform here a systematic comparison of eight widely used software packages and pipelines for detecting differential expression between sample groups in a practical research setting and provide general guidelines for choosing a robust pipeline. In general, our results demonstrate how the data analysis tool utilized can markedly affect the outcome of the data analysis, highlighting the importance of this choice. © The Author 2013. Published by Oxford University Press.

  6. Big Data, Big Opportunities, and Big Challenges.

    PubMed

    Frelinger, Jeffrey A

    2015-11-01

    High-throughput assays have begun to revolutionize modern biology and medicine. The advent of cheap next-generation sequencing (NGS) has made it possible to interrogate cells and human populations as never before. Although this has allowed us to investigate the genetics, gene expression, and impacts of the microbiome, there remain both practical and conceptual challenges. These include data handling, storage, and statistical analysis, as well as an inherent problem of the analysis of heterogeneous cell populations.

  7. EBprot: Statistical analysis of labeling-based quantitative proteomics data.

    PubMed

    Koh, Hiromi W L; Swa, Hannah L F; Fermin, Damian; Ler, Siok Ghee; Gunaratne, Jayantha; Choi, Hyungwon

    2015-08-01

    Labeling-based proteomics is a powerful method for detection of differentially expressed proteins (DEPs). The current data analysis platform typically relies on protein-level ratios, which are obtained by summarizing peptide-level ratios for each protein. In shotgun proteomics, however, some proteins are quantified with more peptides than others, and this reproducibility information is not incorporated into the differential expression (DE) analysis. Here, we propose a novel probabilistic framework EBprot that directly models the peptide-protein hierarchy and rewards the proteins with reproducible evidence of DE over multiple peptides. To evaluate its performance with known DE states, we conducted a simulation study to show that the peptide-level analysis of EBprot provides better receiver-operating characteristic and more accurate estimation of the false discovery rates than the methods based on protein-level ratios. We also demonstrate superior classification performance of peptide-level EBprot analysis in a spike-in dataset. To illustrate the wide applicability of EBprot in different experimental designs, we applied EBprot to a dataset for lung cancer subtype analysis with biological replicates and another dataset for time course phosphoproteome analysis of EGF-stimulated HeLa cells with multiplexed labeling. Through these examples, we show that the peptide-level analysis of EBprot is a robust alternative to the existing statistical methods for the DE analysis of labeling-based quantitative datasets. The software suite is freely available on the Sourceforge website http://ebprot.sourceforge.net/. All MS data have been deposited in the ProteomeXchange with identifier PXD001426 (http://proteomecentral.proteomexchange.org/dataset/PXD001426/). © 2015 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim.

  8. [The effect of butylphthalide on expression of NGF and BDNF in ischemia stroke tissue of rat cerebrum].

    PubMed

    Kong, Shuang-yan; Li, Qi-fu; Yang, Jie; He, Li

    2007-06-01

    To study the expressions of BDNF, BDNF mRNA, NGF and NGF mRNA in the permanent focal cerebral ischemia tissues of rats. METHODS: Healthy male Sprague-Dawley rats were used in this study. Following the procedure of Zea-Longa, a rat model of permanent cerebral ischemia was established by middle cerebral artery occlusion (MCAO) with a nylon thread, and model rats with neurobehavioral scores of grade 1-3 were randomly divided into two groups: a butylphthalide group (group A) and a control group (group B). Group A was given 25 mg/kg butylphthalide and group B was given edible oil, twice daily. Three days after occlusion, all rats were sacrificed after evaluation of their neurobehavioral scores, and cerebrum samples were obtained after in situ perfusion and fixation with 40 g/L paraformaldehyde. Five rats in each group were used for tetrazolium chloride (TTC) staining for macroscopic observation of the cerebral infarction area; the remaining samples were processed by immunohistochemistry to evaluate the effects of butylphthalide on BDNF and NGF expression, and by in situ hybridization to evaluate its effects on BDNF mRNA and NGF mRNA expression. SPSS 12.0 was used for statistical analysis, with P<0.05 considered statistically significant. Compared with the control group (group B), the butylphthalide group (group A) showed no significant pathological difference, but the behavior grade and infarction area were markedly reduced (P<0.05). In the butylphthalide group, expression of BDNF, NGF, BDNF mRNA and NGF mRNA was significantly up-regulated in the area surrounding the infarction and in the cornu ammonis or hippocampus area (P<0.05). In the infarction area itself, however, the expressions of BDNF, NGF, BDNF mRNA and NGF mRNA showed no statistically significant difference (P>0.05). Compared with the control group, butylphthalide can significantly up-regulate the expression of BDNF and NGF at the transcriptional level and protect against ischemic injury.

  9. Clinicopathological and prognostic significance of the RUNX3 expression in gastric cancer: a systematic review and meta-analysis.

    PubMed

    Liu, Baiying; Han, Yao; Jiang, Lu; Jiang, Dongdong; Li, Wenbin; Zhang, Taotao; Zu, Guo; Zhang, Xiangwen

    2018-05-01

    The relationship between expression of runt related transcription factor 3 (RUNX3) and clinicopathological parameters of patients with gastric cancer (GC) is controversial. Studies were retrieved from articles already published in PubMed, EMBASE, Wan Fang, CNKI (China National Knowledge Infrastructure), the Cochrane Library and Google Scholar. All statistical tests in this meta-analysis were performed using Stata 10.0 software (Stata Corp, College Station, TX). A P value less than 0.05 was considered statistically significant. A total of nine studies involving 796 patients were included in the final meta-analysis. The pooled data showed that expression of RUNX3 was significantly correlated with tumor differentiation (OR = 0.387; 95%CI: 0.237-0.633; P = 0.000), depth of invasion (OR = 0.443; 95%CI: 0.273-0.717; P = 0.001), lymph node metastasis (OR = 0.394; 95%CI: 0.259-0.598; P = 0.000), distant metastasis (OR = 0.403; 95%CI: 0.213-0.764; P = 0.005) and TNM stage (OR = 0.461; 95%CI: 0.322-0.659; P = 0.000) in GC. Expression of RUNX3 was significantly correlated with good overall survival (OS) [1-year OS (OR = 2.735; 95%CI: 1.966-3.806; P = 0.000), 3-year OS (OR = 4.782; 95%CI: 3.634-6.292; P = 0.000), 5-year OS (OR = 5.191; 95%CI: 3.775-7.138; P = 0.000)]. However, RUNX3 was not correlated with gender (OR = 1.409; 95%CI: 0.986-2.014; P = 0.060). RUNX3 expression correlates with tumor differentiation, depth of invasion, lymph node metastasis, distant metastasis, TNM stage and OS of GC patients. Copyright © 2018 IJS Publishing Group Ltd. Published by Elsevier Ltd. All rights reserved.
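
    Pooled odds ratios with 95% confidence intervals of the kind reported in this abstract can be computed, for instance, by inverse-variance weighting of per-study log odds ratios. The sketch below assumes a fixed-effect model, which the abstract does not specify; the 2x2 counts are invented for illustration and are not taken from the nine included studies.

```python
import math

def log_odds_ratio(a, b, c, d):
    """Log odds ratio and its standard error for one 2x2 table.

    a, b: cases with / without the feature in the marker-positive group
    c, d: cases with / without the feature in the marker-negative group
    """
    log_or = math.log((a * d) / (b * c))
    se = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    return log_or, se

def pool_fixed_effect(tables, z=1.96):
    """Inverse-variance pooled odds ratio with an approximate 95% CI."""
    total_weight = 0.0
    weighted_sum = 0.0
    for table in tables:
        log_or, se = log_odds_ratio(*table)
        weight = 1 / se ** 2  # more precise studies get more weight
        total_weight += weight
        weighted_sum += weight * log_or
    pooled = weighted_sum / total_weight
    pooled_se = math.sqrt(1 / total_weight)
    return (
        math.exp(pooled),
        math.exp(pooled - z * pooled_se),
        math.exp(pooled + z * pooled_se),
    )

# Hypothetical counts for three studies (not data from the meta-analysis).
tables = [(12, 38, 25, 25), (8, 42, 20, 30), (15, 35, 28, 22)]
odds_ratio, ci_low, ci_high = pool_fixed_effect(tables)
```

    A confidence interval that lies entirely below 1, as in this toy example, corresponds to the kind of inverse association the abstract reports for lymph node metastasis and TNM stage.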

  10. Generalizing Terwilliger's likelihood approach: a new score statistic to test for genetic association.

    PubMed

    el Galta, Rachid; Uitte de Willige, Shirley; de Visser, Marieke C H; Helmer, Quinta; Hsu, Li; Houwing-Duistermaat, Jeanine J

    2007-09-24

    In this paper, we propose a one degree of freedom test for association between a candidate gene and a binary trait. This method is a generalization of Terwilliger's likelihood ratio statistic and is especially powerful for the situation of one associated haplotype. As an alternative to the likelihood ratio statistic, we derive a score statistic, which has a tractable expression. For haplotype analysis, we assume that phase is known. By means of a simulation study, we compare the performance of the score statistic to Pearson's chi-square statistic and the likelihood ratio statistic proposed by Terwilliger. We illustrate the method on three candidate genes studied in the Leiden Thrombophilia Study. We conclude that the statistic follows a chi square distribution under the null hypothesis and that the score statistic is more powerful than Terwilliger's likelihood ratio statistic when the associated haplotype has frequency between 0.1 and 0.4 and has a small impact on the studied disorder. With regard to Pearson's chi-square statistic, the score statistic has more power when the associated haplotype has frequency above 0.2 and the number of variants is above five.

  11. Asymptotic Linear Spectral Statistics for Spiked Hermitian Random Matrices

    NASA Astrophysics Data System (ADS)

    Passemier, Damien; McKay, Matthew R.; Chen, Yang

    2015-07-01

    Using the Coulomb Fluid method, this paper derives central limit theorems (CLTs) for linear spectral statistics of three "spiked" Hermitian random matrix ensembles. These include Johnstone's spiked model (i.e., central Wishart with spiked correlation), non-central Wishart with rank-one non-centrality, and a related class of non-central matrices. For a generic linear statistic, we derive simple and explicit CLT expressions as the matrix dimensions grow large. For all three ensembles under consideration, we find that the primary effect of the spike is to introduce a correction term to the asymptotic mean of the linear spectral statistic, which we characterize with simple formulas. The utility of our proposed framework is demonstrated through application to three different linear statistics problems: the classical likelihood ratio test for a population covariance, the capacity analysis of multi-antenna wireless communication systems with a line-of-sight transmission path, and a classical multiple sample significance testing problem.

  12. System Biology Approach: Gene Network Analysis for Muscular Dystrophy.

    PubMed

    Censi, Federica; Calcagnini, Giovanni; Mattei, Eugenio; Giuliani, Alessandro

    2018-01-01

    Phenotypic changes at different organization levels from cell to entire organism are associated with changes in the pattern of gene expression. These changes involve the entire genome expression pattern and heavily rely upon correlation patterns among genes. The classical approach used to analyze gene expression data builds upon the application of supervised statistical techniques to detect genes differentially expressed among two or more phenotypes (e.g., normal vs. disease). The use of an a posteriori, unsupervised approach based on principal component analysis (PCA) and the subsequent construction of gene correlation networks can shed light on unexpected behaviour of the gene regulation system while maintaining a more naturalistic view of the studied system. In this chapter we applied an unsupervised method to discriminate DMD patients and controls. The genes having the highest absolute scores in the discrimination between the groups were then analyzed in terms of gene expression networks, on the basis of their mutual correlation in the two groups. The correlation network structures suggest two different modes of gene regulation in the two groups, reminiscent of important aspects of DMD pathogenesis.
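
    The network-construction step described in this abstract (building a gene correlation network separately in each group and comparing the structures) can be sketched as follows. The PCA-based gene selection is not reproduced here; the Pearson threshold of 0.8, the gene names and the expression values are assumptions for illustration only.

```python
import math

def pearson(x, y):
    """Pearson correlation coefficient between two equal-length profiles."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)

def correlation_network(expression, threshold=0.8):
    """Edges between genes whose |r| across samples reaches the threshold."""
    genes = sorted(expression)
    edges = set()
    for i, gene_1 in enumerate(genes):
        for gene_2 in genes[i + 1:]:
            r = pearson(expression[gene_1], expression[gene_2])
            if abs(r) >= threshold:
                edges.add((gene_1, gene_2))
    return edges

# Hypothetical expression profiles (five samples per group).
controls = {
    "geneA": [1.0, 2.0, 3.0, 4.0, 5.0],
    "geneB": [2.1, 3.9, 6.2, 8.0, 9.8],
    "geneC": [5.0, 1.0, 4.0, 2.0, 3.0],
}
patients = {
    "geneA": [1.0, 2.0, 3.0, 4.0, 5.0],
    "geneB": [4.0, 1.0, 5.0, 2.0, 3.0],
    "geneC": [9.7, 8.1, 6.0, 4.2, 1.9],
}
control_edges = correlation_network(controls)
patient_edges = correlation_network(patients)
```

    In this toy data the same three genes are wired differently in the two groups, which is the kind of structural difference the chapter interprets as distinct modes of regulation.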

  13. Xylella fastidiosa gene expression analysis by DNA microarrays.

    PubMed

    Travensolo, Regiane F; Carareto-Alves, Lucia M; Costa, Maria V C G; Lopes, Tiago J S; Carrilho, Emanuel; Lemos, Eliana G M

    2009-04-01

    Xylella fastidiosa genome sequencing has generated valuable data by identifying genes acting either on metabolic pathways or in associated pathogenicity and virulence. Based on available information on these genes, new strategies for studying their expression patterns, such as microarray technology, were employed. A total of 2,600 primer pairs were synthesized and then used to generate fragments using the PCR technique. The arrays were hybridized against cDNAs labeled during reverse transcription reactions and which were obtained from bacteria grown under two different conditions (liquid XDM(2) and liquid BCYE). All data were statistically analyzed to verify which genes were differentially expressed. In addition to exploring conditions for X. fastidiosa genome-wide transcriptome analysis, the present work observed the differential expression of several classes of genes (energy, protein, amino acid and nucleotide metabolism, transport, degradation of substances, toxins and hypothetical proteins, among others). The understanding of expressed genes in these two different media will be useful in comprehending the metabolic characteristics of X. fastidiosa, and in evaluating how important certain genes are for the functioning and survival of these bacteria in plants.

  14. Holo-analysis.

    PubMed

    Rosen, G D

    2006-06-01

    Meta-analysis is a vague descriptor used to encompass very diverse methods of data collection analysis, ranging from simple averages to more complex statistical methods. Holo-analysis is a fully comprehensive statistical analysis of all available data and all available variables in a specified topic, with results expressed in a holistic factual empirical model. The objectives and applications of holo-analysis include software production for prediction of responses with confidence limits, translation of research conditions to praxis (field) circumstances, exposure of key missing variables, discovery of theoretically unpredictable variables and interactions, and planning future research. Holo-analyses are cited as examples of the effects on broiler feed intake and live weight gain of exogenous phytases, which account for 70% of variation in responses in terms of 20 highly significant chronological, dietary, environmental, genetic, managemental, and nutrient variables. Even better future accountancy of variation will be facilitated if and when authors of papers routinely provide key data for currently neglected variables, such as temperatures, complete feed formulations, and mortalities.

  15. NATbox: a network analysis toolbox in R.

    PubMed

    Chavan, Shweta S; Bauer, Michael A; Scutari, Marco; Nagarajan, Radhakrishnan

    2009-10-08

    There has been recent interest in capturing the functional relationships (FRs) from high-throughput assays using suitable computational techniques. FRs elucidate the working of genes in concert as a system as opposed to independent entities and hence may provide preliminary insights into biological pathways and signalling mechanisms. Bayesian structure learning (BSL) techniques and their extensions have been used successfully for modelling FRs from expression profiles. Such techniques are especially useful in discovering undocumented FRs, investigating non-canonical signalling mechanisms and cross-talk between pathways. The objective of the present study is to develop a graphical user interface (GUI), NATbox: Network Analysis Toolbox in the language R that houses a battery of BSL algorithms in conjunction with suitable statistical tools for modelling FRs in the form of acyclic networks from gene expression profiles and their subsequent analysis. NATbox is a menu-driven open-source GUI implemented in the R statistical language for modelling and analysis of FRs from gene expression profiles. It provides options to (i) impute missing observations in the given data, (ii) model FRs and network structure from gene expression profiles using a battery of BSL algorithms and identify robust dependencies using a bootstrap procedure, (iii) present the FRs in the form of acyclic graphs for visualization and investigate their topological properties using network analysis metrics, (iv) retrieve FRs of interest from published literature and subsequently use these FRs as structural priors in BSL, and (v) enhance scalability of BSL across high-dimensional data by parallelizing the bootstrap routines. NATbox provides a menu-driven GUI for modelling and analysis of FRs from gene expression profiles. By incorporating readily available functions from existing R-packages, it minimizes redundancy and improves reproducibility, transparency and sustainability, characteristic of open-source environments. NATbox is especially suited for interdisciplinary researchers and biologists who have minimal programming experience and would like to use systems biology approaches without delving into the algorithmic aspects. The GUI provides appropriate parameter recommendations for the various menu options including default parameter choices for the user. NATbox can also prove to be a useful demonstration and teaching tool in graduate and undergraduate courses in systems biology. It has been tested successfully under Windows and Linux operating systems. The source code along with installation instructions and accompanying tutorial can be found at http://bioinformatics.ualr.edu/natboxWiki/index.php/Main_Page.

  16. Effect of local and global geomagnetic activity on human cardiovascular homeostasis.

    PubMed

    Dimitrova, Svetla; Stoilova, Irina; Yanev, Toni; Cholakov, Ilia

    2004-02-01

    The authors investigated the effects of local and planetary geomagnetic activity on human physiology. They collected data in Sofia, Bulgaria, from a group of 86 volunteers during the periods of the autumnal and vernal equinoxes. They used the factors local/planetary geomagnetic activity, day of measurement, gender, and medication use to apply a four-factor multiple analysis of variance. They also used a post hoc analysis to establish the statistical significance of the differences between the average values of the measured physiological parameters in the separate factor levels. In addition, the authors performed correlation analysis between the physiological parameters examined and geophysical factors. The results revealed that geomagnetic changes had a statistically significant influence on arterial blood pressure. Participants expressed this reaction with weak local geomagnetic changes and when major and severe global geomagnetic storms took place.

  17. Using protein-protein interactions for refining gene networks estimated from microarray data by Bayesian networks.

    PubMed

    Nariai, N; Kim, S; Imoto, S; Miyano, S

    2004-01-01

    We propose a statistical method to estimate gene networks from DNA microarray data and protein-protein interactions. Because physical interactions between proteins or multiprotein complexes are likely to regulate biological processes, using only mRNA expression data is not sufficient for estimating a gene network accurately. Our method adds knowledge about protein-protein interactions to the estimation method of gene networks under a Bayesian statistical framework. In the estimated gene network, a protein complex is modeled as a virtual node based on principal component analysis. We show the effectiveness of the proposed method through the analysis of Saccharomyces cerevisiae cell cycle data. The proposed method improves the accuracy of the estimated gene networks, and successfully identifies some biological facts.

  18. Statistical design of quantitative mass spectrometry-based proteomic experiments.

    PubMed

    Oberg, Ann L; Vitek, Olga

    2009-05-01

    We review the fundamental principles of statistical experimental design, and their application to quantitative mass spectrometry-based proteomics. We focus on class comparison using Analysis of Variance (ANOVA), and discuss how randomization, replication and blocking help avoid systematic biases due to the experimental procedure, and help optimize our ability to detect true quantitative changes between groups. We also discuss the issues of pooling multiple biological specimens for a single mass analysis, and calculation of the number of replicates in a future study. When applicable, we emphasize the parallels between designing quantitative proteomic experiments and experiments with gene expression microarrays, and give examples from that area of research. We illustrate the discussion using theoretical considerations, and using real-data examples of profiling of disease.
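
    The randomization and blocking principles described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not a procedure from the review: the function name `randomized_block_design`, the group labels, and the block count are hypothetical. Each block (for example, one day of instrument time) receives one sample from every group, and the run order inside the block is shuffled so that instrument drift is not confounded with group membership.

```python
import random


def randomized_block_design(groups, n_blocks, seed=0):
    """Return a run sheet in which every block contains each group once,
    in an independently shuffled order (hypothetical helper)."""
    rng = random.Random(seed)  # fixed seed so the run sheet is reproducible
    design = []
    for block in range(1, n_blocks + 1):
        order = list(groups)
        rng.shuffle(order)  # randomize run order within the block
        for run, group in enumerate(order, start=1):
            design.append({"block": block, "run": run, "group": group})
    return design


# Illustrative labels only; they are not taken from the review.
design = randomized_block_design(["control", "disease"], n_blocks=4)
for row in design:
    print(row)
```

    Keeping every group inside every block means that block-to-block differences can later be separated from the group effect in an ANOVA.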

  19. Introduction of statistical information in a syntactic analyzer for document image recognition

    NASA Astrophysics Data System (ADS)

    Maroneze, André O.; Coüasnon, Bertrand; Lemaitre, Aurélie

    2011-01-01

    This paper presents an improvement to document layout analysis systems, offering a possible solution to Sayre's paradox (which states that an element "must be recognized before it can be segmented; and it must be segmented before it can be recognized"). This improvement, based on stochastic parsing, allows integration of statistical information, obtained from recognizers, during syntactic layout analysis. We present how this fusion of numeric and symbolic information in a feedback loop can be applied to syntactic methods to improve document description expressiveness. To limit combinatorial explosion during exploration of solutions, we devised an operator that allows optional activation of the stochastic parsing mechanism. Our evaluation on 1250 handwritten business letters shows this method allows the improvement of global recognition scores.

  20. Building gene expression profile classifiers with a simple and efficient rejection option in R.

    PubMed

    Benso, Alfredo; Di Carlo, Stefano; Politano, Gianfranco; Savino, Alessandro; Hafeezurrehman, Hafeez

    2011-01-01

    The collection of gene expression profiles from DNA microarrays and their analysis with pattern recognition algorithms is a powerful technology applied to several biological problems. Common pattern recognition systems classify samples by assigning them to a set of known classes. However, in a clinical diagnostics setup, novel and unknown classes (new pathologies) may appear, and one must be able to reject those samples that do not fit the trained model. The problem of implementing a rejection option in a multi-class classifier has not been widely addressed in the statistical literature. Gene expression profiles represent a critical case study since they suffer from the curse of dimensionality, which reduces the reliability of both traditional rejection models and more recent approaches such as one-class classifiers. This paper presents a set of empirical decision rules that can be used to implement a rejection option in a set of multi-class classifiers widely used for the analysis of gene expression profiles. In particular, we focus on the classifiers implemented in the R Language and Environment for Statistical Computing (R for short in the remainder of this paper). The main contribution of the proposed rules is their simplicity, which enables an easy integration with available data analysis environments. Since tuning the parameters involved in the definition of a rejection model is often a complex and delicate task, in this paper we exploit an evolutionary strategy to automate this process. This allows the final user to maximize the rejection accuracy with minimum manual intervention. This paper shows how simple decision rules can support the use of complex machine learning algorithms in real experimental setups. 
The proposed approach is almost completely automated and therefore a good candidate for being integrated in data analysis flows in labs where the machine learning expertise required to tune traditional classifiers might not be available.
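
    As a rough illustration of what a rejection option looks like, the sketch below applies a generic confidence threshold to class scores. The abstract does not spell out the paper's decision rules, so this is an assumed stand-in rather than the authors' method; the function name `classify_with_rejection`, the threshold value, and the class labels are all hypothetical, and Python is used here instead of R purely for illustration.

```python
def classify_with_rejection(posteriors, threshold=0.7):
    """Return the most probable class, or "rejected" when the classifier
    is not confident enough (generic rule, assumed for illustration)."""
    label, score = max(posteriors.items(), key=lambda item: item[1])
    return label if score >= threshold else "rejected"


# Hypothetical class scores for two samples.
confident = {"class_a": 0.92, "class_b": 0.05, "class_c": 0.03}
ambiguous = {"class_a": 0.40, "class_b": 0.35, "class_c": 0.25}

print(classify_with_rejection(confident))
print(classify_with_rejection(ambiguous))
```

    The threshold is the kind of parameter that the abstract says is tuned automatically with an evolutionary strategy; here it is simply fixed by hand.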

  1. Statistical analysis of aerosol species, trace gasses, and meteorology in Chicago.

    PubMed

    Binaku, Katrina; O'Brien, Timothy; Schmeling, Martina; Fosco, Tinamarie

    2013-09-01

    Both canonical correlation analysis (CCA) and principal component analysis (PCA) were applied to atmospheric aerosol and trace gas concentrations and meteorological data collected in Chicago during the summer months of 2002, 2003, and 2004. Concentrations of ammonium, calcium, nitrate, sulfate, and oxalate particulate matter, as well as the meteorological parameters temperature, wind speed, wind direction, and humidity, were subjected to CCA and PCA. Ozone and nitrogen oxide mixing ratios were also included in the data set. The purpose of the statistical analysis was to determine the extent of existing linear relationship(s), or lack thereof, between meteorological parameters and pollutant concentrations, in addition to reducing the dimensionality of the original data to determine sources of pollutants. In CCA, the first three canonical variate pairs derived were statistically significant at the 0.05 level. The canonical correlation of the first canonical variate pair was 0.821, while the correlations of the second and third canonical variate pairs were 0.562 and 0.461, respectively. The first canonical variate pair indicated that increasing temperatures resulted in high ozone mixing ratios, while the second canonical variate pair showed the influence of wind speed and humidity on local ammonium concentrations. No new information was uncovered in the third variate pair. Canonical loadings were also interpreted for information regarding relationships between data sets. Four principal components (PCs), expressing 77.0% of the original data variance, were derived in PCA. Interpretation of the PCs suggested significant production and/or transport of secondary aerosols in the region (PC1). Furthermore, photochemical production of ozone and the influence of wind speed on pollutants were expressed (PC2), along with an overall measure of local meteorology (PC3). 
In summary, CCA and PCA results combined were successful in uncovering linear relationships between meteorology and air pollutants in Chicago and aided in determining possible pollutant sources.
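
    The statement that a handful of principal components expresses a given share of the variance can be made concrete with a short sketch. The eigenvalues below are hypothetical and are not the study's data; the function name `components_for_variance` is likewise an assumption. The code only shows how the cumulative fraction of variance is computed from the eigenvalues of the covariance (or correlation) matrix and how many components are needed to reach a target.

```python
def components_for_variance(eigenvalues, target):
    """Return (k, fraction): the smallest number of leading components
    whose cumulative share of total variance reaches `target`."""
    ordered = sorted(eigenvalues, reverse=True)
    total = sum(ordered)
    running = 0.0
    for k, value in enumerate(ordered, start=1):
        running += value
        if running / total >= target:
            return k, running / total
    return len(ordered), 1.0


# Hypothetical eigenvalues, chosen only to illustrate the arithmetic.
eigenvalues = [4.0, 2.0, 1.5, 1.0, 0.8, 0.4, 0.3]
k, fraction = components_for_variance(eigenvalues, target=0.77)
print(k, round(fraction, 3))
```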

  2. MetNet: Software to Build and Model the Biogenetic Lattice of Arabidopsis

    DOE PAGES

    Wurtele, Eve Syrkin; Li, Jie; Diao, Lixia; ...

    2003-01-01

    MetNet (http://www.botany.iastate.edu/∼mash/metnetex/metabolicnetex.html) is publicly available software in development for analysis of genome-wide RNA, protein and metabolite profiling data. The software is designed to enable the biologist to visualize, statistically analyse and model a metabolic and regulatory network map of Arabidopsis, combined with gene expression profiling data. It contains a JAVA interface to an interactions database (MetNetDB) containing information on regulatory and metabolic interactions derived from a combination of web databases (TAIR, KEGG, BRENDA) and input from biologists in their area of expertise. FCModeler captures input from MetNetDB in a graphical form. Sub-networks can be identified and interpreted using simple fuzzy cognitive maps. FCModeler is intended to develop and evaluate hypotheses, and provide a modelling framework for assessing the large amounts of data captured by high-throughput gene expression experiments. FCModeler and MetNetDB are currently being extended to three-dimensional virtual reality display. The MetNet map, together with gene expression data, can be viewed using multivariate graphics tools in GGobi linked with the data analytic tools in R. Users can highlight different parts of the metabolic network and see the relevant expression data highlighted in other data plots. Multi-dimensional expression data can be rotated through different dimensions. Statistical analysis can be computed alongside the visual. MetNet is designed to provide a framework for the formulation of testable hypotheses regarding the function of specific genes, and in the long term provide the basis for identification of metabolic and regulatory networks that control plant composition and development.

  3. Relation of glypican-3 and E-cadherin expressions to clinicopathological features and prognosis of mucinous and non-mucinous colorectal adenocarcinoma.

    PubMed

    Foda, Abd Al-Rahman Mohammad; Mohammad, Mie Ali; Abdel-Aziz, Azza; El-Hawary, Amira Kamal

    2015-06-01

    Glypican-3 (GPC3) is a member of the membrane-bound heparan sulfate proteoglycans. E-cadherin is an adhesive receptor that is believed to act as a tumor suppressor. Many studies have investigated E-cadherin expression in colorectal carcinoma (CRC), while only one study has investigated GPC3 expression in CRC. This study aims to investigate expression of GPC3 and E-cadherin in colorectal mucinous adenocarcinoma (MA) and non-mucinous adenocarcinoma (NMA) using a manual tissue microarray technique. Tumor tissue specimens were collected from 75 cases of MA and 75 cases of NMA who underwent radical surgery from Jan 2007 to Jan 2012 at the Gastroenterology Centre, Mansoura University, Egypt. Their clinicopathological parameters and survival data were revised and analyzed using established statistical methodologies. High-density manual tissue microarrays were constructed using a modified mechanical pencil tip technique, and immunohistochemistry for GPC3 and E-cadherin was done. NMA showed higher expression of GPC3 than MA, but the difference was not statistically significant. NMA showed a significantly higher E-cadherin expression than MA. GPC3 and E-cadherin positivity rates were significantly interrelated in the NMA group, but not in the MA group. In the NMA group, there was no significant relation between either GPC3 or E-cadherin expression and the clinicopathological features. In a univariate analysis, neither GPC3 nor E-cadherin expression showed a significant impact on disease-free survival (DFS) or overall survival (OS). GPC3 and E-cadherin expressions are not independent prognostic factors in CRC. However, expressions of both are significantly interrelated in NMA patients, suggesting an excellent interplay between both, in contrast to MA. Further molecular studies are needed to explore the relationship between GPC3 and E-cadherin in colorectal carcinogenesis.

  4. FAS ligand expression in inflammatory infiltrate lymphoid cells as a prognostic marker in oral squamous cell carcinoma.

    PubMed

    Peterle, G T; Santos, M; Mendes, S O; Carvalho-Neto, P B; Maia, L L; Stur, E; Agostini, L P; Silva, C V M; Trivilin, L O; Nunes, F D; Carvalho, M B; Tajara, E H; Louro, I D; Silva-Conforti, A M A

    2015-09-22

    Currently, the most important prognostic factor in oral squamous cell carcinoma (OSCC) is the presence of regional lymph node metastases, which correlates with a 50% reduction in life expectancy. We have previously observed that expression of hypoxia genes in the tumor inflammatory infiltrate is statistically related to prognosis in OSCC. FAS and FASL expression levels in OSCC have previously been related to patient survival. The present study analyzed the relationship between FASL expression in the inflammatory infiltrate lymphoid cells and clinical variables, tumor histology, and prognosis of OSCC. Strong FASL expression was significantly associated with lymph node metastases (P = 0.035) and disease-specific death (P = 0.014), but multivariate analysis did not confirm FASL expression as an independent death risk factor (OR = 2.78, 95%CI = 0.81-9.55). Disease-free and disease-specific survival were significantly correlated with FASL expression (P = 0.016 and P = 0.005, respectively). Multivariate analysis revealed that strong FASL expression is an independent marker for earlier disease relapse and disease-specific death, with approximately 2.5-fold increased risk compared with weak expression (HR = 2.24, 95%CI = 1.08-4.65 and HR = 2.49, 95%CI = 1.04-5.99, respectively). Our results suggest a potential role for this expression profile as a tumor prognostic marker in OSCC patients.
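
    The hazard ratios above are reported with 95% confidence intervals, and a short worked calculation shows how such an interval can be read. The sketch assumes the usual Wald construction, in which the interval is symmetric on the log scale; that assumption is not stated in the abstract, and the function name `se_from_ci` is hypothetical. Under it, the geometric mean of the bounds should recover the point estimate, and the width gives the standard error of log(HR).

```python
import math


def se_from_ci(lower, upper, z_crit=1.96):
    """Standard error of log(HR), assuming a Wald-type 95% interval."""
    return (math.log(upper) - math.log(lower)) / (2 * z_crit)


# Values reported in the abstract for earlier disease relapse.
hr, lower, upper = 2.24, 1.08, 4.65

midpoint = math.sqrt(lower * upper)  # geometric mean of the bounds
se = se_from_ci(lower, upper)
z = math.log(hr) / se                # approximate Wald statistic

print(f"geometric midpoint = {midpoint:.2f}")
print(f"SE of log(HR) = {se:.3f}, z = {z:.2f}")
```

    Because the lower bound sits just above 1, the approximate z statistic is only modestly above 1.96, which is consistent with an effect that is significant but estimated with considerable uncertainty.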

  5. Predictive value of PD-L1 based on mRNA level in the treatment of stage IV melanoma with ipilimumab.

    PubMed

    Brüggemann, C; Kirchberger, M C; Goldinger, S M; Weide, B; Konrad, A; Erdmann, M; Schadendorf, D; Croner, R S; Krähenbühl, L; Kähler, K C; Hafner, C; Leisgang, W; Kiesewetter, F; Dummer, R; Schuler, G; Stürzl, M; Heinzerling, L

    2017-10-01

    PD-L1 is established as a predictive marker for therapy of non-small cell lung cancer with pembrolizumab. Furthermore, PD-L1 positive melanoma has shown more favorable outcomes when treated with anti-PD1 antibodies and dacarbazine compared to PD-L1 negative melanoma. However, the role of PD-L1 expression with regard to response to checkpoint inhibition with anti-CTLA-4 is not yet clear. In addition, the lack of standardization in the immunohistochemical assessment of PD-L1 makes the comparison of results difficult. In this study, we investigated PD-L1 gene expression with a new fully automated technique via RT-PCR and correlated the findings with the response to the anti-CTLA-4 antibody ipilimumab. Within a retrospective multi-center trial, PD-L1 gene expression was evaluated in 78 melanoma patients in a total of 111 pre-treatment tumor samples from 6 skin cancer centers and analyzed with regard to response to ipilimumab. For meaningful statistical analysis, the cohort was enriched for responders with 30 responders and 48 non-responders. Gene expression was assessed by quantitative RT-PCR after extracting mRNA from formalin-fixed paraffin embedded tumor tissue and correlated with results from immunohistochemical (IHC) stainings. The evaluation of PD-L1 expression based on mRNA level is feasible. Correlation between PD-L1 expression as assessed by IHC and RT-PCR showed varying levels of concordance depending on the antibody employed. RT-PCR should be further investigated to measure PD-L1 expression, since it is a semi-quantitative method with observer-independent evaluation. With this approach, there was no statistically significant difference in PD-L1 expression between responders and non-responders to therapy with ipilimumab.

  6. Statistical analysis of an RNA titration series evaluates microarray precision and sensitivity on a whole-array basis

    PubMed Central

    Holloway, Andrew J; Oshlack, Alicia; Diyagama, Dileepa S; Bowtell, David DL; Smyth, Gordon K

    2006-01-01

    Background: Concerns are often raised about the accuracy of microarray technologies and the degree of cross-platform agreement, but there are as yet no methods which can unambiguously evaluate precision and sensitivity for these technologies on a whole-array basis. Results: A methodology is described for evaluating the precision and sensitivity of whole-genome gene expression technologies such as microarrays. The method consists of an easy-to-construct titration series of RNA samples and an associated statistical analysis using non-linear regression. The method evaluates the precision and responsiveness of each microarray platform on a whole-array basis, i.e., using all the probes, without the need to match probes across platforms. An experiment is conducted to assess and compare four widely used microarray platforms. All four platforms are shown to have satisfactory precision but the commercial platforms are superior for resolving differential expression for genes at lower expression levels. The effective precision of the two-color platforms is improved by allowing for probe-specific dye-effects in the statistical model. The methodology is used to compare three data extraction algorithms for the Affymetrix platforms, demonstrating poor performance for the commonly used proprietary algorithm relative to the other algorithms. For probes which can be matched across platforms, the cross-platform variability is decomposed into within-platform and between-platform components, showing that platform disagreement is almost entirely systematic rather than due to measurement variability. Conclusion: The results demonstrate good precision and sensitivity for all the platforms, but highlight the need for improved probe annotation. They quantify the extent to which cross-platform measures can be expected to be less accurate than within-platform comparisons for predicting disease progression or outcome. PMID:17118209

  7. A statistical spatial power spectrum of the Earth's lithospheric magnetic field

    NASA Astrophysics Data System (ADS)

    Thébault, E.; Vervelidou, F.

    2015-05-01

    The magnetic field of the Earth's lithosphere arises from rock magnetization contrasts that were shaped over geological times. The field can be described mathematically in spherical harmonics or with distributions of magnetization. We exploit this dual representation and assume that the lithospheric field is induced by spatially varying susceptibility values within a shell of constant thickness. By introducing a statistical assumption about the power spectrum of the susceptibility, we then derive a statistical expression for the spatial power spectrum of the crustal magnetic field for the spatial scales ranging from 60 to 2500 km. This expression depends on the mean induced magnetization, the thickness of the shell, and a power law exponent for the power spectrum of the susceptibility. We test the relevance of this form with a misfit analysis to the observational NGDC-720 lithospheric magnetic field model power spectrum. This allows us to estimate a mean global apparent induced magnetization value between 0.3 and 0.6 A m⁻¹, a mean magnetic crustal thickness value between 23 and 30 km, and a root mean square for the field value between 190 and 205 nT at 95 per cent. These estimates are in good agreement with independent models of the crustal magnetization and of the seismic crustal thickness. We carry out the same analysis in the continental and oceanic domains separately. We complement the misfit analyses with a Kolmogorov-Smirnov goodness-of-fit test and we conclude that, in each case, the observed power spectrum can be regarded as a sample of the statistical one.

  8. Using Peptide-Level Proteomics Data for Detecting Differentially Expressed Proteins.

    PubMed

    Suomi, Tomi; Corthals, Garry L; Nevalainen, Olli S; Elo, Laura L

    2015-11-06

    The expression of proteins can be quantified in a high-throughput manner using different types of mass spectrometers. In recent years, label-free methods for determining protein abundance have emerged. Although the expression is initially measured at the peptide level, a common approach is to combine the peptide-level measurements into protein-level values before differential expression analysis. However, this simple combination is prone to inconsistencies between peptides and may lose valuable information. To this end, we introduce here a method for detecting differentially expressed proteins by combining peptide-level expression-change statistics. Using controlled spike-in experiments, we show that the approach of averaging peptide-level expression changes yields more accurate lists of differentially expressed proteins than does the conventional protein-level approach. This is particularly true when there are only a few replicate samples or the differences between the sample groups are small. The proposed technique is implemented in the Bioconductor package PECA, and it can be downloaded from http://www.bioconductor.org.
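
    The idea of computing expression changes per peptide and only then summarizing them per protein can be sketched as follows. This is a simplified illustration, not the PECA implementation: the function name `protein_level_changes`, the record layout, and the use of the median of log2 fold changes as the summary are assumptions, and the intensities are made up.

```python
import math
from collections import defaultdict
from statistics import mean, median


def protein_level_changes(peptides):
    """Summarize peptide-level log2 fold changes per protein (median).

    `peptides` is a list of dicts with the hypothetical keys
    "protein", "group_a" and "group_b" (lists of intensities).
    """
    by_protein = defaultdict(list)
    for pep in peptides:
        # Expression change is computed for each peptide first ...
        change = (mean(math.log2(x) for x in pep["group_b"])
                  - mean(math.log2(x) for x in pep["group_a"]))
        by_protein[pep["protein"]].append(change)
    # ... and only then combined into one value per protein.
    return {prot: median(vals) for prot, vals in by_protein.items()}


# Made-up intensities for illustration only.
peptides = [
    {"protein": "P1", "group_a": [100, 100], "group_b": [200, 200]},
    {"protein": "P1", "group_a": [50, 50], "group_b": [200, 200]},
    {"protein": "P1", "group_a": [10, 10], "group_b": [10, 10]},
    {"protein": "P2", "group_a": [8, 8], "group_b": [4, 4]},
]
result = protein_level_changes(peptides)
print(result)
```

    Using the median makes the protein-level value less sensitive to a single inconsistent peptide, such as the third P1 peptide in the example.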

  9. Accounting for technical noise in differential expression analysis of single-cell RNA sequencing data.

    PubMed

    Jia, Cheng; Hu, Yu; Kelly, Derek; Kim, Junhyong; Li, Mingyao; Zhang, Nancy R

    2017-11-02

    Recent technological breakthroughs have made it possible to measure RNA expression at the single-cell level, thus paving the way for exploring expression heterogeneity among individual cells. Current single-cell RNA sequencing (scRNA-seq) protocols are complex and introduce technical biases that vary across cells, which can bias downstream analysis without proper adjustment. To account for cell-to-cell technical differences, we propose a statistical framework, TASC (Toolkit for Analysis of Single Cell RNA-seq), an empirical Bayes approach to reliably model the cell-specific dropout rates and amplification bias by use of external RNA spike-ins. TASC incorporates the technical parameters, which reflect cell-to-cell batch effects, into a hierarchical mixture model to estimate the biological variance of a gene and detect differentially expressed genes. More importantly, TASC is able to adjust for covariates to further eliminate confounding that may originate from cell size and cell cycle differences. In simulation and real scRNA-seq data, TASC achieves accurate Type I error control and displays competitive sensitivity and improved robustness to batch effects in differential expression analysis, compared to existing methods. TASC is programmed to be computationally efficient, taking advantage of multi-threaded parallelization. We believe that TASC will provide a robust platform for researchers to leverage the power of scRNA-seq. © The Author(s) 2017. Published by Oxford University Press on behalf of Nucleic Acids Research.

  10. Accounting for technical noise in differential expression analysis of single-cell RNA sequencing data

    PubMed Central

    Jia, Cheng; Hu, Yu; Kelly, Derek; Kim, Junhyong

    2017-01-01

    Recent technological breakthroughs have made it possible to measure RNA expression at the single-cell level, thus paving the way for exploring expression heterogeneity among individual cells. Current single-cell RNA sequencing (scRNA-seq) protocols are complex and introduce technical biases that vary across cells, which can bias downstream analysis without proper adjustment. To account for cell-to-cell technical differences, we propose a statistical framework, TASC (Toolkit for Analysis of Single Cell RNA-seq), an empirical Bayes approach to reliably model the cell-specific dropout rates and amplification bias by use of external RNA spike-ins. TASC incorporates the technical parameters, which reflect cell-to-cell batch effects, into a hierarchical mixture model to estimate the biological variance of a gene and detect differentially expressed genes. More importantly, TASC is able to adjust for covariates to further eliminate confounding that may originate from cell size and cell cycle differences. In simulation and real scRNA-seq data, TASC achieves accurate Type I error control and displays competitive sensitivity and improved robustness to batch effects in differential expression analysis, compared to existing methods. TASC is programmed to be computationally efficient, taking advantage of multi-threaded parallelization. We believe that TASC will provide a robust platform for researchers to leverage the power of scRNA-seq. PMID:29036714

  11. Expression quantitative trait loci and genetic regulatory network analysis reveals that Gabra2 is involved in stress responses in the mouse.

    PubMed

    Dai, Jiajuan; Wang, Xusheng; Chen, Ying; Wang, Xiaodong; Zhu, Jun; Lu, Lu

    2009-11-01

    Previous studies have revealed that the subunit alpha 2 (Gabra2) of the gamma-aminobutyric acid receptor plays a critical role in the stress response. However, little is known about the genetic regulatory network for Gabra2 and the stress response. We combined gene expression microarray analysis and quantitative trait loci (QTL) mapping to characterize the genetic regulatory network for Gabra2 expression in the hippocampus of BXD recombinant inbred (RI) mice. Our analysis found that the expression level of Gabra2 exhibited much variation in the hippocampus across the BXD RI strains and between the parental strains, C57BL/6J and DBA/2J. Expression QTL (eQTL) mapping showed three microarray probe sets of Gabra2 to have highly significant linkage likelihood ratio statistic (LRS) scores. Gene co-regulatory network analysis showed that 10 genes, including Gria3, Chka, Drd3, Homer1, Grik2, Odz4, Prkag2, Grm5, Gabrb1, and Nlgn1, are directly or indirectly associated with stress responses. Eleven genes were implicated as Gabra2 downstream genes through mapping joint modulation. The genetical genomics approach demonstrates the importance and the potential power of eQTL studies in identifying genetic regulatory networks that contribute to complex traits, such as stress responses.

  12. Machine Learning–Based Differential Network Analysis: A Study of Stress-Responsive Transcriptomes in Arabidopsis

    PubMed Central

    Ma, Chuang; Xin, Mingming; Feldmann, Kenneth A.; Wang, Xiangfeng

    2014-01-01

    Machine learning (ML) is an intelligent data mining technique that builds a prediction model based on the learning of prior knowledge to recognize patterns in large-scale data sets. We present an ML-based methodology for transcriptome analysis via comparison of gene coexpression networks, implemented as an R package called machine learning–based differential network analysis (mlDNA), and apply this method to reanalyze a set of abiotic stress expression data in Arabidopsis thaliana. mlDNA first used an ML-based filtering process to remove nonexpressed, constitutively expressed, or non-stress-responsive “noninformative” genes prior to network construction, through learning the patterns of 32 expression characteristics of known stress-related genes. The retained “informative” genes were subsequently analyzed by ML-based network comparison to predict candidate stress-related genes showing expression and network differences between control and stress networks, based on 33 network topological characteristics. Comparative evaluation of the network-centric and gene-centric analytic methods showed that mlDNA substantially outperformed traditional statistical testing–based differential expression analysis at identifying stress-related genes, with markedly improved prediction accuracy. To experimentally validate the mlDNA predictions, we selected 89 candidates out of the 1784 predicted salt stress–related genes with available SALK T-DNA mutagenesis lines for phenotypic screening and identified two previously unreported genes, mutants of which showed salt-sensitive phenotypes. PMID:24520154

  13. A complementation assay for in vivo protein structure/function analysis in Physcomitrella patens (Funariaceae)

    DOE PAGES

    Scavuzzo-Duggan, Tess R.; Chaves, Arielle M.; Roberts, Alison W.

    2015-07-14

    Here, a method for rapid in vivo functional analysis of engineered proteins was developed using Physcomitrella patens. A complementation assay was designed for testing structure/function relationships in cellulose synthase (CESA) proteins. The components of the assay include (1) construction of test vectors that drive expression of epitope-tagged PpCESA5 carrying engineered mutations, (2) transformation of a ppcesa5 knockout line that fails to produce gametophores with test and control vectors, (3) scoring the stable transformants for gametophore production, (4) statistical analysis comparing complementation rates for test vectors to positive and negative control vectors, and (5) analysis of transgenic protein expression by Western blotting. The assay distinguished mutations that generate fully functional, nonfunctional, and partially functional proteins. In conclusion, compared with existing methods for in vivo testing of protein function, this complementation assay provides a rapid method for investigating protein structure/function relationships in plants.

  14. Immunohistochemical Analysis of the Role Connective Tissue Growth Factor in Drug-induced Gingival Overgrowth in Response to Phenytoin, Cyclosporine, and Nifedipine

    PubMed Central

    Anand, A. J.; Gopalakrishnan, Sivaram; Karthikeyan, R.; Mishra, Debasish; Mohapatra, Shreeyam

    2018-01-01

    Objective: To evaluate the presence of connective tissue growth factor (CTGF) in drug (phenytoin, cyclosporine, and nifedipine)-induced gingival overgrowth (DIGO) and to compare it with healthy controls in the absence of overgrowth. Materials and Methods: Thirty-five patients were chosen for the study and segregated into study (25) and control groups (10). The study group consisted of phenytoin-induced (10), cyclosporine-induced (10), and nifedipine-induced (5) gingival overgrowth. After the necessary medical evaluations were completed, a biopsy was done. The tissue samples were fixed in 10% formalin and then immunohistochemically evaluated for the presence of CTGF. The statistical analysis of the values was done using the statistical package SPSS PC+ (Statistical Package for the Social Sciences, version 4.01). Results: The outcome of immunohistochemistry shows that DIGO samples express more CTGF than the control group, and that phenytoin-induced samples express the most CTGF, followed by nifedipine and cyclosporine. Conclusion: The study shows that there is an increase in the levels of CTGF in patients with DIGO in comparison to the control group without any gingival overgrowth. In the study, we compared the levels of CTGF in DIGO induced by the three most commonly used drugs: phenytoin, cyclosporine, and nifedipine. By comparing the levels of CTGF, we find that cyclosporine induces the production of the least amount of CTGF. Therefore, it might be a more viable drug choice with reduced side effects. PMID:29629324

  15. High efficiency family shuffling based on multi-step PCR and in vivo DNA recombination in yeast: statistical and functional analysis of a combinatorial library between human cytochrome P450 1A1 and 1A2.

    PubMed

    Abécassis, V; Pompon, D; Truan, G

    2000-10-15

    The design of a family shuffling strategy (CLERY: Combinatorial Libraries Enhanced by Recombination in Yeast) associating PCR-based and in vivo recombination and expression in yeast is described. This strategy was tested using human cytochrome P450 CYP1A1 and CYP1A2 as templates, which share 74% nucleotide sequence identity. Construction of highly shuffled libraries of mosaic structures and reduction of parental gene contamination were two major goals. Library characterization involved multiprobe hybridization on DNA macro-arrays. The statistical analysis of randomly selected clones revealed a high proportion of chimeric genes (86%) and a homogeneous representation of the parental contribution among the sequences (55.8 +/- 2.5% for parental sequence 1A2). A microtiter plate screening system was designed to achieve colorimetric detection of polycyclic hydrocarbon hydroxylation by transformed yeast cells. Full sequences of five randomly picked and five functionally selected clones were analyzed. Results confirmed the shuffling efficiency and allowed calculation of the average length of sequence exchange and mutation rates. The efficient and statistically representative generation of mosaic structures by this type of family shuffling in a yeast expression system constitutes a novel and promising tool for structure-function studies and tuning enzymatic activities of multicomponent eucaryote complexes involving non-soluble enzymes.

  16. Introduction to bioinformatics.

    PubMed

    Can, Tolga

    2014-01-01

    Bioinformatics is an interdisciplinary field mainly involving molecular biology and genetics, computer science, mathematics, and statistics. Data intensive, large-scale biological problems are addressed from a computational point of view. The most common problems are modeling biological processes at the molecular level and making inferences from collected data. A bioinformatics solution usually involves the following steps: (1) collect statistics from biological data; (2) build a computational model; (3) solve a computational modeling problem; and (4) test and evaluate a computational algorithm. This chapter gives a brief introduction to bioinformatics by first providing an introduction to biological terminology and then discussing some classical bioinformatics problems organized by the types of data sources. Sequence analysis is the analysis of DNA and protein sequences for clues regarding function and includes subproblems such as identification of homologs, multiple sequence alignment, searching sequence patterns, and evolutionary analyses. Protein structures are three-dimensional data and the associated problems are structure prediction (secondary and tertiary), analysis of protein structures for clues regarding function, and structural alignment. Gene expression data is usually represented as matrices and analysis of microarray data mostly involves statistical analysis, classification, and clustering approaches. Biological networks such as gene regulatory networks, metabolic pathways, and protein-protein interaction networks are usually modeled as graphs and graph theoretic approaches are used to solve associated problems such as construction and analysis of large-scale networks.

  17. BAG3 promotes chondrosarcoma progression by upregulating the expression of β-catenin

    PubMed Central

    Shi, Huijuan; Chen, Wenfang; Dong, Yu; Lu, Xiaofang; Zhang, Wenhui; Wang, Liantang

    2018-01-01

    To investigate the roles of B-cell lymphoma-2 associated athanogene 3 (BAG3) in human chondrosarcoma and the potential mechanisms, the expression levels of BAG3 were detected in the present study, and the associations between BAG3 and clinical pathological parameters, clinical stage as well as the survival of patients were analyzed. The present study detected BAG3 mRNA and protein expression in the normal cartilage cell line HC-a and in SW1353 chondrosarcoma cells by reverse transcription-quantitative polymerase chain reaction and western blot analysis. The BAG3 protein expression in 59 cases of chondrosarcoma, 30 patients with endogenous chondroma and 8 cases of normal cartilage was semi-quantitatively analyzed using the immunohistochemical method. In addition, the BAG3 protein expression level, the clinical pathological parameters, clinical stage and the survival time of patients with chondrosarcoma were analyzed. The plasmid transfection method was employed to upregulate the expression of BAG3 and small RNA interference to downregulate the expression of BAG3 in SW1353 cells. The expression levels of BAG3 protein and mRNA were significantly increased in the chondrosarcoma cell line when compared with the normal cartilage cell line. The immunohistochemistry results indicated that BAG3 protein was overexpressed in the tissue of human chondrosarcoma. Statistical analysis showed that the expression level of BAG3 was significantly increased in the different Enneking staging of patients with chondrosarcoma and Tumor staging, and there were no statistical differences in age, gender, histological classification and tumor size. In the in vitro experiments, the data revealed that BAG3 significantly promoted chondrosarcoma cell proliferation, colony-formation, migration and invasion; however, it inhibited chondrosarcoma cell apoptosis. It was observed that BAG3 upregulated β-catenin expression at the mRNA and protein levels. In addition, BAG3 induced the expression of runt-related transcription factor 2 (RUNX2) in chondrosarcoma cells by upregulating β-catenin. These clinical analyses revealed a positive association between β-catenin and BAG3 in chondrosarcoma tumors. BAG3 was significantly increased in chondrosarcoma cells and tissues compared with the normal cartilage cells, tissue and cartilage benign tumors. Thus, BAG3 may serve as an oncogene in the development of chondrosarcoma via the induction of RUNX2 expression. The results of the present study contribute to further research on the biological development of chondrosarcoma. PMID:29484408

  18. THD-Module Extractor: An Application for CEN Module Extraction and Interesting Gene Identification for Alzheimer's Disease.

    PubMed

    Kakati, Tulika; Kashyap, Hirak; Bhattacharyya, Dhruba K

    2016-11-30

    There exist many tools and methods for construction of co-expression networks from gene expression data and for extraction of densely connected gene modules. In this paper, a method is introduced to construct a co-expression network and to extract co-expressed modules having high biological significance. The proposed method has been validated on several well known microarray datasets extracted from a diverse set of species, using statistical measures, such as p and q values. The modules obtained in these studies are found to be biologically significant based on Gene Ontology enrichment analysis, pathway analysis, and KEGG enrichment analysis. Further, the method was applied on an Alzheimer's disease dataset and some interesting genes are found, which have high semantic similarity among them, but are not significantly correlated in terms of expression similarity. Some of these interesting genes, such as MAPT, CASP2, and PSEN2, are linked with important aspects of Alzheimer's disease, such as dementia, increased cell death, and deposition of amyloid-beta proteins in Alzheimer's disease brains. The biological pathways associated with Alzheimer's disease, such as Wnt signaling, Apoptosis, p53 signaling, and Notch signaling, incorporate these interesting genes. The proposed method is evaluated in regard to existing literature.

  19. Identification of expression quantitative trait loci by the interaction analysis using genetic algorithm.

    PubMed

    Namkung, Junghyun; Nam, Jin-Wu; Park, Taesung

    2007-01-01

    Many genes with major effects on quantitative traits have been reported to interact with other genes. However, finding a group of interacting genes from thousands of SNPs is challenging. Hence, an efficient and robust algorithm is needed. The genetic algorithm (GA) is useful in searching for the optimal solution from a very large searchable space. In this study, we show that genome-wide interaction analysis using GA and a statistical interaction model can provide a practical method to detect biologically interacting loci. We focus our search on transcriptional regulators by analyzing gene x gene interactions for cancer-related genes. The expression values of three cancer-related genes were selected from the expression data of the Genetic Analysis Workshop 15 Problem 1 data set. We implemented a GA to identify the expression quantitative trait loci that are significantly associated with expression levels of the cancer-related genes. The time complexity of the GA was compared with that of an exhaustive search algorithm. As a result, our GA, which included heuristic methods, such as archive, elitism, and local search, has greatly reduced computational time in a genome-wide search for gene x gene interactions. In general, the GA took one-fifth the computation time of an exhaustive search for the most significant pair of single-nucleotide polymorphisms.
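
    The contrast between an exhaustive pair search and a GA with an archive and elitism can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the synthetic genotypes, the scoring function (absolute Pearson correlation between the genotype product and expression), the mutation scheme, and all names (`make_data`, `pair_score`, `exhaustive_search`, `ga_search`) are hypothetical. The abstract's statistical interaction model and local search step are not reproduced.

```python
import itertools
import random


def pearson(x, y):
    """Pearson correlation of two equal-length lists (0.0 if either is constant)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return 0.0 if sxx == 0 or syy == 0 else sxy / (sxx * syy) ** 0.5


def make_data(n_samples=60, n_snps=12, causal=(2, 7), seed=1):
    """Synthetic data (assumption): expression driven by the product of two SNPs."""
    rng = random.Random(seed)
    genotypes = [[rng.randint(0, 2) for _ in range(n_snps)] for _ in range(n_samples)]
    expression = [g[causal[0]] * g[causal[1]] + rng.gauss(0, 0.1) for g in genotypes]
    return genotypes, expression


def pair_score(genotypes, expression, pair):
    """Stand-in interaction score: |corr(g_i * g_j, expression)|."""
    i, j = pair
    product = [g[i] * g[j] for g in genotypes]
    return abs(pearson(product, expression))


def exhaustive_search(genotypes, expression):
    """Score every SNP pair; cost grows as n_snps * (n_snps - 1) / 2."""
    n_snps = len(genotypes[0])
    return max(itertools.combinations(range(n_snps), 2),
               key=lambda p: pair_score(genotypes, expression, p))


def ga_search(genotypes, expression, pop_size=20, generations=30, seed=0):
    """Tiny GA: the archive caches scores, the elite survive unchanged."""
    rng = random.Random(seed)
    n_snps = len(genotypes[0])
    archive = {}  # pair -> score, so no pair is scored twice

    def fitness(pair):
        if pair not in archive:
            archive[pair] = pair_score(genotypes, expression, pair)
        return archive[pair]

    def random_pair():
        return tuple(sorted(rng.sample(range(n_snps), 2)))

    def mutate(pair):
        keep = rng.choice(pair)  # keep one SNP, redraw the other
        new = rng.choice([s for s in range(n_snps) if s != keep])
        return tuple(sorted((keep, new)))

    population = [random_pair() for _ in range(pop_size)]
    for _ in range(generations):
        population.sort(key=fitness, reverse=True)
        elite = population[: pop_size // 4]  # elitism
        children = [mutate(rng.choice(elite)) for _ in range(pop_size - len(elite))]
        population = elite + children
    best = max(population, key=fitness)
    return best, len(archive)  # best pair and number of distinct pairs scored
```

    On a toy problem this small the GA saves little; the point is only that the number of scored pairs is bounded by the archive size rather than by the full set of pairs, which is where a saving could come from in a genome-wide search.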

  20. Identification of expression quantitative trait loci by the interaction analysis using genetic algorithm

    PubMed Central

    Namkung, Junghyun; Nam, Jin-Wu; Park, Taesung

    2007-01-01

    Many genes with major effects on quantitative traits have been reported to interact with other genes. However, finding a group of interacting genes from thousands of SNPs is challenging. Hence, an efficient and robust algorithm is needed. The genetic algorithm (GA) is useful in searching for the optimal solution from a very large searchable space. In this study, we show that genome-wide interaction analysis using GA and a statistical interaction model can provide a practical method to detect biologically interacting loci. We focus our search on transcriptional regulators by analyzing gene × gene interactions for cancer-related genes. The expression values of three cancer-related genes were selected from the expression data of the Genetic Analysis Workshop 15 Problem 1 data set. We implemented a GA to identify the expression quantitative trait loci that are significantly associated with expression levels of the cancer-related genes. The time complexity of the GA was compared with that of an exhaustive search algorithm. As a result, our GA, which included heuristic methods, such as archive, elitism, and local search, has greatly reduced computational time in a genome-wide search for gene × gene interactions. In general, the GA took one-fifth the computation time of an exhaustive search for the most significant pair of single-nucleotide polymorphisms. PMID:18466570

  1. Kidney Transplant Rejection and Tissue Injury by Gene Profiling of Biopsies and Peripheral Blood Lymphocytes

    PubMed Central

    Flechner, Stuart M.; Kurian, Sunil M.; Head, Steven R.; Sharp, Starlette M.; Whisenant, Thomas C.; Zhang, Jie; Chismar, Jeffrey D.; Horvath, Steve; Mondala, Tony; Gilmartin, Timothy; Cook, Daniel J.; Kay, Steven A.; Walker, John R.; Salomon, Daniel R.

    2007-01-01

    A major challenge for kidney transplantation is balancing the need for immunosuppression to prevent rejection, while minimizing drug-induced toxicities. We used DNA microarrays (HG-U95Av2 GeneChips, Affymetrix) to determine gene expression profiles for kidney biopsies and peripheral blood lymphocytes (PBLs) in transplant patients including normal donor kidneys, well-functioning transplants without rejection, kidneys undergoing acute rejection, and transplants with renal dysfunction without rejection. We developed a data analysis schema based on expression signal determination, class comparison and prediction, hierarchical clustering, statistical power analysis and real-time quantitative PCR validation. We identified distinct gene expression signatures for both biopsies and PBLs that correlated significantly with each of the different classes of transplant patients. This is the most complete report to date using commercial arrays to identify unique expression signatures in transplant biopsies distinguishing acute rejection, acute dysfunction without rejection and well-functioning transplants with no rejection history. We demonstrate for the first time the successful application of high density DNA chip analysis of PBL as a diagnostic tool for transplantation. The significance of these results, if validated in a multicenter prospective trial, would be the establishment of a metric based on gene expression signatures for monitoring the immune status and immunosuppression of transplanted patients. PMID:15307835

  2. contamDE: differential expression analysis of RNA-seq data for contaminated tumor samples.

    PubMed

    Shen, Qi; Hu, Jiyuan; Jiang, Ning; Hu, Xiaohua; Luo, Zewei; Zhang, Hong

    2016-03-01

    Accurate detection of differentially expressed genes between tumor and normal samples is a primary approach of cancer-related biomarker identification. Due to the infiltration of tumor surrounding normal cells, the expression data derived from tumor samples would always be contaminated with normal cells. Ignoring such cellular contamination would deflate the power of detecting DE genes and further confound the biological interpretation of the analysis results. For the time being, there does not exist any differential expression analysis approach for RNA-seq data in the literature that can properly account for the contamination of tumor samples. Without appealing to any extra information, we develop a new method 'contamDE' based on a novel statistical model that associates RNA-seq expression levels with cell types. It is demonstrated through simulation studies that contamDE could be much more powerful than the existing methods that ignore the contamination. In the application to two cancer studies, contamDE uniquely found several potential therapy and prognostic biomarkers of prostate cancer and non-small cell lung cancer. An R package contamDE is freely available at http://homepage.fudan.edu.cn/zhangh/softwares/ (contact: zhanghfd@fudan.edu.cn). Supplementary data are available at Bioinformatics online. © The Author 2015. Published by Oxford University Press. All rights reserved. For Permissions, please e-mail: journals.permissions@oup.com.

  3. THD-Module Extractor: An Application for CEN Module Extraction and Interesting Gene Identification for Alzheimer’s Disease

    PubMed Central

    Kakati, Tulika; Kashyap, Hirak; Bhattacharyya, Dhruba K.

    2016-01-01

    There exist many tools and methods for construction of co-expression networks from gene expression data and for extraction of densely connected gene modules. In this paper, a method is introduced to construct a co-expression network and to extract co-expressed modules having high biological significance. The proposed method has been validated on several well known microarray datasets extracted from a diverse set of species, using statistical measures, such as p and q values. The modules obtained in these studies are found to be biologically significant based on Gene Ontology enrichment analysis, pathway analysis, and KEGG enrichment analysis. Further, the method was applied on an Alzheimer’s disease dataset and some interesting genes are found, which have high semantic similarity among them, but are not significantly correlated in terms of expression similarity. Some of these interesting genes, such as MAPT, CASP2, and PSEN2, are linked with important aspects of Alzheimer’s disease, such as dementia, increased cell death, and deposition of amyloid-beta proteins in Alzheimer’s disease brains. The biological pathways associated with Alzheimer’s disease, such as Wnt signaling, Apoptosis, p53 signaling, and Notch signaling, incorporate these interesting genes. The proposed method is evaluated in regard to existing literature. PMID:27901073

  4. Development of a gene expression database and related analysis programs for evaluation of anticancer compounds.

    PubMed

    Ushijima, Masaru; Mashima, Tetsuo; Tomida, Akihiro; Dan, Shingo; Saito, Sakae; Furuno, Aki; Tsukahara, Satomi; Seimiya, Hiroyuki; Yamori, Takao; Matsuura, Masaaki

    2013-03-01

    Genome-wide transcriptional expression analysis is a powerful strategy for characterizing the biological activity of anticancer compounds. It is often instructive to identify gene sets involved in the activity of a given drug compound for comparison with different compounds. Currently, however, there is no comprehensive gene expression database and related application system that is (i) specialized in anticancer agents, (ii) easy to use, and (iii) open to the public. To develop a public gene expression database of antitumor agents, we first examined gene expression profiles in human cancer cells after exposure to 35 compounds including 25 clinically used anticancer agents. Gene signatures were extracted that were classified as upregulated or downregulated after exposure to the drug. Hierarchical clustering showed that drugs with similar mechanisms of action, such as genotoxic drugs, were clustered. Connectivity map analysis further revealed that our gene signature data reflected modes of action of the respective agents. Together with the database, we developed analysis programs that calculate scores for ranking changes in gene expression and for searching statistically significant pathways from the Kyoto Encyclopedia of Genes and Genomes database in order to analyze the datasets more easily. Our database and the analysis programs are available online at our website (http://scads.jfcr.or.jp/db/cs/). Using these systems, we successfully showed that proteasome inhibitors are selectively classified as endoplasmic reticulum stress inducers and induce atypical endoplasmic reticulum stress. Thus, our public access database and related analysis programs constitute a set of efficient tools to evaluate the mode of action of novel compounds and identify promising anticancer lead compounds. © 2012 Japanese Cancer Association.

  5. Statistical approach for selection of biologically informative genes.

    PubMed

    Das, Samarendra; Rai, Anil; Mishra, D C; Rai, Shesh N

    2018-05-20

    Selection of informative genes from high dimensional gene expression data has emerged as an important research area in genomics. Many gene selection techniques proposed so far are based on either a relevancy or a redundancy measure. Further, the performance of these techniques has been adjudged through post selection classification accuracy computed through a classifier using the selected genes. This performance metric may be statistically sound but may not be biologically relevant. A statistical approach, i.e. Boot-MRMR, was proposed based on a composite measure of maximum relevance and minimum redundancy, which is both statistically sound and biologically relevant for informative gene selection. For comparative evaluation of the proposed approach, we developed two biological sufficient criteria, i.e. Gene Set Enrichment with QTL (GSEQ) and biological similarity score based on Gene Ontology (GO). Further, a systematic and rigorous evaluation of the proposed technique with 12 existing gene selection techniques was carried out using five gene expression datasets. This evaluation was based on a broad spectrum of statistically sound (e.g. subject classification) and biologically relevant (based on QTL and GO) criteria under a multiple criteria decision-making framework. The performance analysis showed that the proposed technique selects informative genes which are more biologically relevant. The proposed technique is also found to be quite competitive with the existing techniques with respect to subject classification and computational time. Our results also showed that under the multiple criteria decision-making setup, the proposed technique is best for informative gene selection over the available alternatives. Based on the proposed approach, an R package, i.e. BootMRMR, has been developed and is available at https://cran.r-project.org/web/packages/BootMRMR. This study will provide a practical guide to select statistical techniques for selecting informative genes from high dimensional expression data for breeding and system biology studies. Published by Elsevier B.V.

  6. Microarray‑based bioinformatics analysis of the prospective target gene network of key miRNAs influenced by long non‑coding RNA PVT1 in HCC.

    PubMed

    Zhang, Yu; Mo, Wei-Jia; Wang, Xiao; Zhang, Tong-Tong; Qin, Yuan; Wang, Han-Lin; Chen, Gang; Wei, Dan-Ming; Dang, Yi-Wu

    2018-05-02

    The long non‑coding RNA (lncRNA) PVT1 plays vital roles in the tumorigenesis and development of various types of cancer. However, the potential expression profiling, functions and pathways of PVT1 in HCC remain unknown. PVT1 was knocked down in SMMC‑7721 cells, and a miRNA microarray analysis was performed to detect the differentially expressed miRNAs. Twelve target prediction algorithms were used to predict the underlying targets of these differentially expressed miRNAs. Bioinformatics analysis was performed to explore the underlying functions, pathways and networks of the targeted genes. Furthermore, the relationship between PVT1 and the clinical parameters in HCC was confirmed based on the original data in the TCGA database. Among the differentially expressed miRNAs, the top two upregulated and downregulated miRNAs were selected for further analysis based on the false discovery rate (FDR), fold‑change (FC) and P‑values. Based on the TCGA database, PVT1 was obviously highly expressed in HCC, and a statistically higher PVT1 expression was found for sex (male), ethnicity (Asian) and pathological grade (G3+G4) compared to the control groups (P<0.05). Furthermore, Gene Ontology (GO) analysis revealed that the target genes were involved in complex cellular pathways, such as the macromolecule biosynthetic process, compound metabolic process, and transcription. Kyoto Encyclopedia of Genes and Genomes (KEGG) analysis revealed that the MAPK and Wnt signaling pathways may be correlated with the regulation of the four candidate miRNAs. The results therefore provide significant information on the differentially expressed miRNAs associated with PVT1 in HCC, and we hypothesized that PVT1 may play vital roles in HCC by regulating different miRNAs or target gene expression (particularly MAPK8) via the MAPK or Wnt signaling pathways. Thus, further investigation of the molecular mechanism of PVT1 in HCC is needed.
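
    Ranking candidates by false discovery rate, as mentioned above, is often done with Benjamini-Hochberg adjusted P-values. The sketch below shows that procedure under stated assumptions: the abstract does not say which FDR method was used, and the function name `benjamini_hochberg` and the example P-values are illustrative only.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted P-values in the original order.

    For the k-th smallest of m P-values the raw adjustment is p * m / k;
    a running minimum taken from the largest P-value downward enforces
    monotonicity, and values are capped at 1.
    """
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])  # indices by ascending p
    adjusted = [0.0] * m
    running_min = 1.0
    for rank in range(m, 0, -1):  # walk from largest to smallest P-value
        idx = order[rank - 1]
        running_min = min(running_min, p_values[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


# Hypothetical P-values for four miRNAs; keep those with adjusted P < 0.05.
p = [0.01, 0.04, 0.03, 0.005]
adj = benjamini_hochberg(p)
selected = [i for i, q in enumerate(adj) if q < 0.05]
```

    In practice such a filter would be combined with a fold-change threshold, as the abstract describes; the threshold values themselves are not given in the source.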

  7. A pedagogical approach to the Boltzmann factor through experiments and simulations

    NASA Astrophysics Data System (ADS)

    Battaglia, O. R.; Bonura, A.; Sperandeo-Mineo, R. M.

    2009-09-01

    The Boltzmann factor is the basis of a huge amount of thermodynamic and statistical physics, both classical and quantum. It governs the behaviour of all systems in nature that are exchanging energy with their environment. To understand why the expression has this specific form involves a deep mathematical analysis, whose flow of logic is hard to see and is not at the level of high school or college students' preparation. We here present some experiments and simulations aimed at directly deriving its mathematical expression and illustrating the fundamental concepts on which it is grounded. Experiments use easily available apparatuses, and simulations are developed in the Net-Logo environment that, besides having a user-friendly interface, allows an easy interaction with the algorithm. The approach supplies pedagogical support for the introduction of the Boltzmann factor at the undergraduate level to students without a background in statistical mechanics.
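
    A simulation in the spirit of the one described can be sketched in a few lines. This is a generic energy-exchange toy model and not the authors' NetLogo code: particles hold integer energy quanta, and at each step a randomly chosen donor passes one quantum to a randomly chosen receiver. The function names and parameter values are assumptions. Because every allowed move and its reverse are equally likely, all microstates with the same total energy become equally probable, and the occupation of energy level E falls off roughly geometrically, the discrete analogue of exp(-E/kT).

```python
import random
from collections import Counter


def simulate_energy_exchange(n_particles=500, quanta_per_particle=2,
                             n_steps=100_000, seed=0):
    """Randomly move single energy quanta between particles; total energy is conserved."""
    rng = random.Random(seed)
    energy = [quanta_per_particle] * n_particles  # start with equal energies
    for _ in range(n_steps):
        donor = rng.randrange(n_particles)
        receiver = rng.randrange(n_particles)
        if energy[donor] > 0:  # a particle with no energy cannot donate
            energy[donor] -= 1
            energy[receiver] += 1
    return energy


def occupation_counts(energy):
    """Number of particles found at each energy level."""
    return Counter(energy)


energy = simulate_energy_exchange()
counts = occupation_counts(energy)
for level in sorted(counts):
    print(level, counts[level])
```

    Plotting the logarithm of the counts against the energy level should give an approximately straight line with negative slope, which is the signature students are meant to recognise.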

  8. Beta-Catenin and Epithelial Tumors: A Study Based on 374 Oropharyngeal Cancers

    PubMed Central

    Santoro, Angela; Pannone, Giuseppe; Papagerakis, Silvana; McGuff, H. Stan; Cafarelli, Barbara; Lepore, Silvia; De Maria, Salvatore; Rubini, Corrado; Mattoni, Marilena; Staibano, Stefania; Mezza, Ernesto; De Rosa, Gaetano; Aquino, Gabriella; Losito, Simona; Loreto, Carla; Crimi, Salvatore; Bufo, Pantaleo

    2014-01-01

    Introduction. Although altered regulation of the Wnt pathway via beta-catenin is a frequent event in several human cancers, its potential implications in oral/oropharyngeal squamous cell carcinomas (OSCC/OPSCC) are largely unexplored. The purpose of this work was to define the association between beta-catenin expression and clinical-pathological parameters in 374 OSCCs/OPSCCs by immunohistochemistry (IHC). Materials and Methods. Association between IHC detected patterns of protein expression and clinical-pathological parameters was assessed by statistical analysis and survival rates by Kaplan-Meier curves. Beta-catenin expression was also investigated in OSCC cell lines by Real-Time PCR. An additional analysis of the DNA content was performed on 22 representative OSCCs/OPSCCs by DNA-image-cytometric analysis. Results and Discussion. All carcinomas exhibited significant alterations of beta-catenin expression (P < 0.05). Beta-catenin protein was mainly detected in the cytoplasm of cancerous cells and only focal nuclear positivity was observed. Higher cytoplasmic expression correlated significantly with poor histological differentiation, advanced stage, and worse patient outcome (P < 0.05). By Real-Time PCR a significant increase of beta-catenin mRNA was detected in OSCC cell lines and in 45% of surgical specimens. DNA ploidy study demonstrated high levels of aneuploidy in beta-catenin overexpressing carcinomas. Conclusions. This is the largest study reporting significant association between beta-catenin expression and clinical-pathological factors in patients with OSCCs/OPSCCs. PMID:24511551

  9. Intratumoral heterogeneity analysis reveals hidden associations between protein expression losses and patient survival in clear cell renal cell carcinoma

    PubMed Central

    Devarajan, Karthik; Parsons, Theodore; Wang, Qiong; O'Neill, Raymond; Solomides, Charalambos; Peiper, Stephen C.; Testa, Joseph R.; Uzzo, Robert; Yang, Haifeng

    2017-01-01

    Intratumoral heterogeneity (ITH) is a prominent feature of kidney cancer. It is not known whether it has utility in finding associations between protein expression and clinical parameters. We used ITH that is detected by immunohistochemistry (IHC) to aid the association analysis between the loss of SWI/SNF components and clinical parameters. 160 ccRCC tumors (40 per tumor stage) were used to generate tissue microarray (TMA). Four foci from different regions of each tumor were selected. IHC was performed against PBRM1, ARID1A, SETD2, SMARCA4, and SMARCA2. Statistical analyses were performed to correlate biomarker losses with patho-clinical parameters. Categorical variables were compared between groups using Fisher's exact tests. Univariate and multivariable analyses were used to correlate biomarker changes and patient survival. Multivariable analyses were performed by constructing decision trees using the classification and regression trees (CART) methodology. IHC detected widespread ITH in ccRCC tumors. The statistical analysis of the “Truncal loss” (root loss) found more correlations between biomarker losses and tumor stages than the traditional “Loss in tumor (total)”. Losses of SMARCA4 or SMARCA2 significantly improved prognosis for overall survival (OS). Losses of PBRM1, ARID1A or SETD2 had the opposite effect. Thus “Truncal Loss” analysis revealed hidden links between protein losses and patient survival in ccRCC. PMID:28445125
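
    The comparison of categorical variables with Fisher's exact test can be illustrated with a small standard-library implementation for a 2x2 table. This is a sketch for orientation only: the function name `fisher_exact_two_sided` is hypothetical, the counts in the usage example are invented and do not come from the study, and the two-sided rule used here (sum the probabilities of all tables no more likely than the observed one) is one common convention.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact test P-value for the 2x2 table [[a, b], [c, d]].

    With all margins fixed, the top-left cell follows a hypergeometric
    distribution; the P-value sums the probabilities of every table that
    is no more likely than the observed one.
    """
    row1, row2, col1 = a + b, c + d, a + c
    n = row1 + row2

    def prob(x):
        # Probability that the top-left cell equals x given the margins.
        return comb(row1, x) * comb(row2, col1 - x) / comb(n, col1)

    lo, hi = max(0, col1 - row2), min(row1, col1)
    p_obs = prob(a)
    # Small relative tolerance guards against floating-point ties.
    total = sum(prob(x) for x in range(lo, hi + 1) if prob(x) <= p_obs * (1 + 1e-9))
    return min(1.0, total)


# Invented example: biomarker loss (rows: lost / retained) versus
# tumor stage (columns: early / late).
p_value = fisher_exact_two_sided(3, 1, 1, 3)
```

    For the table in the usage example the margins are all 4, so the P-value can be checked by hand as (16 + 1 + 1 + 16) / 70.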

  10. A Pedagogical Approach to the Boltzmann Factor through Experiments and Simulations

    ERIC Educational Resources Information Center

    Battaglia, O. R.; Bonura, A.; Sperandeo-Mineo, R. M.

    2009-01-01

    The Boltzmann factor is the basis of a huge amount of thermodynamic and statistical physics, both classical and quantum. It governs the behaviour of all systems in nature that are exchanging energy with their environment. To understand why the expression has this specific form involves a deep mathematical analysis, whose flow of logic is hard to…

  11. Utilization of Lymphoblastoid Cell Lines as a System for the Molecular Modeling of Autism

    ERIC Educational Resources Information Center

    Baron, Colin A.; Liu, Stephenie Y.; Hicks, Chindo; Gregg, Jeffrey P.

    2006-01-01

    In order to provide an alternative approach for understanding the biology and genetics of autism, we performed statistical analysis of gene expression profiles of lymphoblastoid cell lines derived from children with autism and their families. The goal was to assess the feasibility of using this model in identifying autism-associated genes.…

  12. Brain region-specific gene expression changes after chronic intermittent ethanol exposure and early withdrawal in C57BL/6J mice

    PubMed Central

    Melendez, Roberto I.; McGinty, Jacqueline F.; Kalivas, Peter W.; Becker, Howard C.

    2014-01-01

    Neuroadaptations that participate in the ontogeny of alcohol dependence are likely a result of altered gene expression in various brain regions. The present study investigated brain region-specific changes in the pattern and magnitude of gene expression immediately following chronic intermittent ethanol (CIE) exposure and 8 hours following final ethanol exposure [i.e. early withdrawal (EWD)]. High-density oligonucleotide microarrays (Affymetrix 430A 2.0, Affymetrix, Santa Clara, CA, USA) and bioinformatics analysis were used to characterize gene expression and function in the prefrontal cortex (PFC), hippocampus (HPC) and nucleus accumbens (NAc) of C57BL/6J mice (Jackson Laboratories, Bar Harbor, ME, USA). Gene expression levels were determined using gene chip robust multi-array average followed by statistical analysis of microarrays and validated by quantitative real-time reverse transcription polymerase chain reaction and Western blot analysis. Results indicated that immediately following CIE exposure, changes in gene expression were strikingly greater in the PFC (284 genes) compared with the HPC (16 genes) and NAc (32 genes). Bioinformatics analysis revealed that most of the transcriptionally responsive genes in the PFC were involved in Ras/MAPK signaling, notch signaling or ubiquitination. In contrast, during EWD, changes in gene expression were greatest in the HPC (139 genes) compared with the PFC (four genes) and NAc (eight genes). The most transcriptionally responsive genes in the HPC were involved in mRNA processing or actin dynamics. Of the few genes detected in the NAc, the most representatives were involved in circadian rhythms. Overall, these findings indicate that brain region-specific and time-dependent neuroadaptive alterations in gene expression play an integral role in the development of alcohol dependence and withdrawal. PMID:21812870

  13. GEOquery: a bridge between the Gene Expression Omnibus (GEO) and BioConductor.

    PubMed

    Davis, Sean; Meltzer, Paul S

    2007-07-15

    Microarray technology has become a standard molecular biology tool. Experimental data have been generated on a huge number of organisms, tissue types, treatment conditions and disease states. The Gene Expression Omnibus (Barrett et al., 2005), developed by the National Center for Biotechnology Information (NCBI) at the National Institutes of Health, is a repository of nearly 140,000 gene expression experiments. The BioConductor project (Gentleman et al., 2004) is an open-source and open-development software project built in the R statistical programming environment (R Development Core Team, 2005) for the analysis and comprehension of genomic data. The tools contained in the BioConductor project represent many state-of-the-art methods for the analysis of microarray and genomics data. We have developed a software tool that allows access to the wealth of information within GEO directly from BioConductor, eliminating many of the formatting and parsing problems that have made such analyses labor-intensive in the past. The software, called GEOquery, effectively establishes a bridge between GEO and BioConductor. Easy access to GEO data from BioConductor will likely lead to new analyses of GEO data using novel and rigorous statistical and bioinformatic tools. Facilitating analyses and meta-analyses of microarray data will increase the efficiency with which biologically important conclusions can be drawn from published genomic data. GEOquery is available as part of the BioConductor project.

  14. Meta-analysis of human gene expression in response to Mycobacterium tuberculosis infection reveals potential therapeutic targets.

    PubMed

    Wang, Zhang; Arat, Seda; Magid-Slav, Michal; Brown, James R

    2018-01-10

    With the global emergence of multi-drug resistant strains of Mycobacterium tuberculosis, new strategies to treat tuberculosis are urgently needed such as therapeutics targeting potential human host factors. Here we performed a statistical meta-analysis of human gene expression in response to both latent and active pulmonary tuberculosis infections from nine published datasets. We found 1655 genes that were significantly differentially expressed during active tuberculosis infection. In contrast, no gene was significant for latent tuberculosis. Pathway enrichment analysis identified 90 significant canonical human pathways, including several pathways more commonly related to non-infectious diseases such as the LRRK2 pathway in Parkinson's disease, and PD-1/PD-L1 signaling pathway important for new immuno-oncology therapies. The analysis of human genome-wide association studies datasets revealed tuberculosis-associated genetic variants proximal to several genes in major histocompatibility complex for antigen presentation. We propose several new targets and drug-repurposing opportunities including intravenous immunoglobulin, ion-channel blockers and cancer immuno-therapeutics for development as combination therapeutics with anti-mycobacterial agents. Our meta-analysis provides novel insights into host genes and pathways important for tuberculosis and brings forth potential drug repurposing opportunities for host-directed therapies.

  15. EVALUATION OF THE EXTRACELLULAR MATRIX OF INJURED SUPRASPINATUS IN RATS

    PubMed Central

    Almeida, Luiz Henrique Oliveira; Ikemoto, Roberto; Mader, Ana Maria; Pinhal, Maria Aparecida Silva; Munhoz, Bruna; Murachovsky, Joel

    2016-01-01

    ABSTRACT Objective: To evaluate the evolution of injuries of the supraspinatus muscle by immunohistochemistry (IHC) and anatomopathological analysis in an animal model (Wistar rats). Methods: Twenty-five Wistar rats were submitted to complete injury of the supraspinatus tendon, then sacrificed in groups of five animals at the following periods: immediately after the injury, 24h after, 48h after, 30 days after and three months after the injury. All groups underwent histological and IHC analysis. Results: Regarding vascular proliferation and inflammatory infiltrate, we found a statistically significant difference between groups 1 (control group) and 2 (24h after injury). IHC analysis showed a statistically significant difference in the expression of vascular endothelial growth factor (VEGF) between groups 1 and 2, and in collagen type 1 (Col-1) between groups 1 and 4. Conclusion: We observed changes in the extracellular matrix components compatible with remodeling and healing. Remodeling is more intense 24h after injury. However, VEGF and Col-1 are substantially increased at 24h and 30 days after the injury, respectively. Level of Evidence I, Experimental Study. PMID:26997907

  16. Statistical analysis on experimental calibration data for flowmeters in pressure pipes

    NASA Astrophysics Data System (ADS)

    Lazzarin, Alessandro; Orsi, Enrico; Sanfilippo, Umberto

    2017-08-01

    This paper presents a statistical analysis of experimental calibration data for flowmeters (electromagnetic, ultrasonic and turbine flowmeters) in pressure pipes. The experimental calibration data set consists of the whole archive of the calibration tests carried out on 246 flowmeters from January 2001 to October 2015 at Settore Portate of Laboratorio di Idraulica “G. Fantoli” of Politecnico di Milano, which is accredited as LAT 104 for a flow range between 3 l/s and 80 l/s, with a certified Calibration and Measurement Capability (CMC) - formerly known as Best Measurement Capability (BMC) - equal to 0.2%. The data set is split into three subsets, consisting respectively of 94 electromagnetic, 83 ultrasonic and 69 turbine flowmeters; each subset is analysed separately from the others, and a final comparison is then carried out. In particular, the main focus of the statistical analysis is the correction C, that is, the difference between the flow rate Q measured by the calibration facility (through the accredited procedures and the certified reference specimen) and the flow rate QM simultaneously recorded by the flowmeter under calibration, expressed as a percentage of QM.
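
    The correction C described above can be written as C = (Q - QM) / QM x 100. A minimal sketch in Python follows; the function name, argument names and the sample flow rates are illustrative assumptions, not values from the paper.

```python
def correction_percent(q_reference: float, q_meter: float) -> float:
    """Correction C in percent of the meter reading QM.

    q_reference: flow rate Q measured by the calibration facility (e.g. in l/s)
    q_meter:     flow rate QM recorded by the flowmeter under calibration
    """
    if q_meter == 0:
        raise ValueError("meter reading QM must be non-zero")
    return (q_reference - q_meter) / q_meter * 100.0


# Hypothetical readings: the facility measures 50.0 l/s, the meter shows 49.5 l/s.
# A positive C means the meter reads low relative to the reference.
print(f"C = {correction_percent(50.0, 49.5):.3f} %")
```

    Dividing by QM rather than Q follows the definition in the abstract, where the correction is expressed as a percentage of the meter's own reading.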

  17. Application of meta-analysis methods for identifying proteomic expression level differences.

    PubMed

    Amess, Bob; Kluge, Wolfgang; Schwarz, Emanuel; Haenisch, Frieder; Alsaif, Murtada; Yolken, Robert H; Leweke, F Markus; Guest, Paul C; Bahn, Sabine

    2013-07-01

    We present new statistical approaches for identification of proteins with expression levels that are significantly changed when applying meta-analysis to two or more independent experiments. We showed that the Euclidean distance measure has reduced risk of false positives compared to the rank product method. Our Ψ-ranking method has advantages over the traditional fold-change approach by incorporating both the fold-change direction as well as the p-value. In addition, the second novel method, Π-ranking, considers the ratio of the fold-change and thus integrates all three parameters. We further improved the latter by introducing our third technique, Σ-ranking, which combines all three parameters in a balanced nonparametric approach.

  18. Evaluation and Validation of Housekeeping Genes as Reference for Gene Expression Studies in Pigeonpea (Cajanus cajan) Under Drought Stress Conditions

    PubMed Central

    Sinha, Pallavi; Singh, Vikas K.; Suryanarayana, V.; Krishnamurthy, L.; Saxena, Rachit K.; Varshney, Rajeev K.

    2015-01-01

    Gene expression analysis using quantitative real-time PCR (qRT-PCR) is a very sensitive technique and its sensitivity depends on the stable performance of the reference gene(s) used in the study. A number of housekeeping genes have been used in various expression studies in many crops; however, their expression was found to be inconsistent under different stress conditions. As a result, species-specific housekeeping genes have been recommended for different expression studies in several crop species. However, such specific housekeeping genes have not been reported in the case of pigeonpea (Cajanus cajan) despite the fact that a genome sequence has become available for the crop. To identify stable housekeeping genes in pigeonpea for expression analysis under drought stress conditions, the relative expression variations of 10 commonly used housekeeping genes (EF1α, UBQ10, GAPDH, 18SrRNA, 25SrRNA, TUB6, ACT1, IF4α, UBC and HSP90) were studied in root, stem and leaf tissues of Asha (ICPL 87119). Three statistical algorithms, geNorm, NormFinder and BestKeeper, were used to define the stability of candidate genes. geNorm analysis identified IF4α and TUB6 as the most stable housekeeping genes; however, NormFinder analysis determined IF4α and HSP90 as the most stable housekeeping genes under drought stress conditions. Subsequently, validation of the identified candidate genes was undertaken in qRT-PCR based gene expression analysis of the uspA gene, which plays an important role under drought stress conditions in pigeonpea. The relative quantification of the uspA gene varied according to the internal controls (stable and least stable genes), thus highlighting the importance of the choice, as well as the validation, of internal controls in such experiments. The identified stable and validated housekeeping genes will facilitate gene expression studies in pigeonpea, especially under drought stress conditions. PMID:25849964
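
    geNorm, one of the algorithms named above, ranks candidate reference genes by a stability measure M: for each gene, the average standard deviation of its log2 expression ratios against every other candidate, taken across samples (lower M means more stable). A minimal sketch of that measure follows, assuming relative quantities on a linear scale. The function name, gene labels and numbers are illustrative and are not data from the study, and geNorm's stepwise exclusion of the least stable gene is omitted.

```python
import math
from statistics import stdev


def genorm_m(expr):
    """Return the geNorm-style stability measure M for each gene.

    expr maps a gene name to a list of relative expression quantities
    (linear scale, all > 0), one value per sample, in the same sample order.
    """
    genes = list(expr)
    if len(genes) < 2:
        raise ValueError("need at least two candidate genes")
    m_values = {}
    for j in genes:
        pairwise_sd = []
        for k in genes:
            if k == j:
                continue
            # log2 ratio of gene j to gene k in every sample
            log_ratios = [math.log2(a / b) for a, b in zip(expr[j], expr[k])]
            pairwise_sd.append(stdev(log_ratios))
        m_values[j] = sum(pairwise_sd) / len(pairwise_sd)
    return m_values


# Hypothetical quantities for three candidates in three samples.
# geneA and geneB keep a constant ratio; geneC fluctuates independently.
example = {
    "geneA": [1.0, 2.0, 4.0],
    "geneB": [2.0, 4.0, 8.0],
    "geneC": [1.0, 4.0, 2.0],
}
for gene, m in sorted(genorm_m(example).items(), key=lambda item: item[1]):
    print(f"{gene}: M = {m:.2f}")
```

    Because M depends only on ratios between candidates, two genes that are co-regulated can both look stable; this is one reason studies such as this one compare several algorithms.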

  20. Expression of miR-146a-5p in patients with intracranial aneurysms and its association with prognosis.

    PubMed

    Zhang, H-L; Li, L; Cheng, C-J; Sun, X-C

    2018-02-01

    The study aims to detect the association of miR-146a-5p with intracranial aneurysms (IAs). The expression of miR-146a-5p in plasma samples was compared between 72 patients with IAs and 40 healthy volunteers by quantitative real-time polymerase chain reaction (qRT-PCR). Statistical analysis was performed to analyze the relationship between miR-146a-5p expression and clinical data and overall survival (OS) time of IAs patients. Univariate and multivariate Cox proportional hazards analyses were also performed. Notably, higher miR-146a-5p expression was found in plasma samples from the 72 patients with IAs compared with the 40 healthy controls. Higher miR-146a-5p expression was significantly associated with rupture and Hunt-Hess level in IAs patients. Kaplan-Meier survival analysis verified that higher miR-146a-5p expression predicted a shorter OS compared with lower miR-146a-5p expression in IAs patients. Univariate and multivariate Cox proportional hazards analyses demonstrated that higher miR-146a-5p expression, rupture, and Hunt-Hess level were independent risk factors of OS in patients with IAs. MiR-146a-5p expression may serve as a biomarker for predicting prognosis in patients with IAs.

  1. The Human EST Ontology Explorer: a tissue-oriented visualization system for ontologies distribution in human EST collections.

    PubMed

    Merelli, Ivan; Caprera, Andrea; Stella, Alessandra; Del Corvo, Marcello; Milanesi, Luciano; Lazzari, Barbara

    2009-10-15

    The NCBI dbEST currently contains more than eight million human Expressed Sequence Tags (ESTs). This wide collection represents an important source of information for gene expression studies, provided it can be inspected according to biologically relevant criteria. EST data can be browsed using different dedicated web resources, which allow investigation of library-specific gene expression levels and comparisons among libraries, highlighting significant differences in gene expression. Nonetheless, no tool is available to examine distributions of quantitative EST collections in Gene Ontology (GO) categories, nor to retrieve information concerning library-dependent EST involvement in metabolic pathways. In this work we present the Human EST Ontology Explorer (HEOE) http://www.itb.cnr.it/ptp/human_est_explorer, a web facility for comparison of expression levels among libraries from several healthy and diseased tissues. The HEOE provides library-dependent statistics on the distribution of sequences in the GO Directed Acyclic Graph (DAG) that can be browsed at each GO hierarchical level. The tool is based on large-scale BLAST annotation of EST sequences. Due to the huge number of input sequences, this BLAST analysis was performed with the aid of grid computing technology, which is particularly suitable for data-parallel tasks. Relying on the achieved annotation, library-specific distributions of ESTs in the GO graph were inferred. A pathway-based search interface was also implemented, for a quick evaluation of the representation of libraries in metabolic pathways. EST processing steps were integrated in a semi-automatic procedure that relies on Perl scripts and stores results in a MySQL database. A PHP-based web interface offers the possibility to simultaneously visualize, retrieve and compare data from the different libraries. Statistically significant differences in GO categories among user-selected libraries can also be computed. The HEOE provides an alternative and complementary way to inspect EST expression levels with respect to approaches currently offered by other resources. Furthermore, BLAST computation on the whole human EST dataset was a suitable test of grid scalability in the context of large-scale bioinformatics analysis. The HEOE currently comprises sequence analysis from 70 non-normalized libraries, representing a comprehensive overview of healthy and unhealthy tissues. As the analysis procedure can be easily applied to other libraries, the number of represented tissues is intended to increase.

  2. MAGMA: analysis of two-channel microarrays made easy.

    PubMed

    Rehrauer, Hubert; Zoller, Stefan; Schlapbach, Ralph

    2007-07-01

    The web application MAGMA provides a simple and intuitive interface to identify differentially expressed genes from two-channel microarray data. While the underlying algorithms are not superior to those of similar web applications, MAGMA is particularly user friendly and can be used without prior training. The user interface guides the novice user through the most typical microarray analysis workflow consisting of data upload, annotation, normalization and statistical analysis. It automatically generates R-scripts that document MAGMA's entire data processing steps, thereby allowing the user to regenerate all results in a local R installation. The implementation of MAGMA follows the model-view-controller design pattern that strictly separates the R-based statistical data processing, the web-representation and the application logic. This modular design makes the application flexible and easily extendible by experts in one of the fields: statistical microarray analysis, web design or software development. State-of-the-art JavaServer Faces technology was used to generate the web interface and to perform user input processing. MAGMA's object-oriented modular framework makes it easily extendible and applicable to other fields and demonstrates that modern Java technology is also suitable for rather small and concise academic projects. MAGMA is freely available at www.magma-fgcz.uzh.ch.

  3. Expression of Organic Anion Transporters 1 and 3 in the Ovine Fetal Brain During the Latter Half of Gestation

    PubMed Central

    Cousins, Roderick; Wood, Charles E.

    2010-01-01

    Development and maturation of the fetal brain is critical for homeostasis in utero, responsiveness to fetal stress and, in ruminants, control of the timing of birth. In the sheep, as in the human, the placenta secretes estrogen and other signaling molecules into both the fetal and maternal blood, molecules whose entry or exit across the blood-brain barrier is likely to be facilitated by transporters. The purpose of this study was to test the hypothesis that the ovine fetal brain expresses organic anion transporters, and that the expression of these transporters varies as a function of brain region and fetal gestational age. Brains and pituitaries were collected at the time of sacrifice from fetal and newborn sheep at 80, 100, 120, 130 and 145 days gestation and on the first day of postnatal life (parturition in sheep is at approximately 147 days gestation). Hypothalamus, medullary brainstem, cerebellum, and pituitary were processed for mRNA extraction and synthesis of cDNA (4–5/group). Real-time PCR analysis of OAT1 and OAT3 expression revealed significant expression of both genes in all of the tissues tested. In hypothalamus and cerebellum, there were statistically significant increases in the expression of one or both genes towards the end of gestation. In medullary brainstem and pituitary, the levels of expression were relatively unchanged, with no statistically significant changes with developmental age. We conclude that the ovine fetal brain expresses both OAT1 and OAT3, and that the pattern of expression suggests an increasing role for these transporters in the physiology of the developing fetal brain as the fetus nears the time of spontaneous parturition. PMID:20708067

  4. Expression of HSP27 in Hepatocellular Carcinoma.

    PubMed

    Eto, Daimei; Hisaka, Toru; Horiuchi, Hiroyuki; Uchida, Shinji; Ishikawa, Hiroto; Kawashima, Yusuke; Kinugasa, Tetsushi; Nakashima, Osamu; Yano, Hirohisa; Okuda, Koji; Akagi, Yoshito

    2016-07-01

    Heat-shock protein 27 (HSP27), a low molecular weight stress protein, is recognized as a molecular chaperone. The expression of HSP27 has been detected in some human tumors and, while HSP27 is phosphorylated as a response to stress, the function of phosphorylated HSP27 (p-HSP27) is not known. The aim of this study was to investigate the effect of HSP27 and p-HSP27 expression in HCC on clinicopathological characteristics and prognosis. An immunohistochemical study for HSP27 and p-HSP27 was performed on 194 resected HCC cases. We analyzed the correlation of HSP27 expression with various parameters statistically. There was no correlation between expression of HSP27 and the clinicopathological characteristics and prognosis from the analysis of 194 cases. From the analysis of the hepatitis C virus (HCV)-positive group of 142 cases, those that were p-HSP27-positive had a larger tumor diameter and a high portal vein invasion rate. The expression of total HSP27 may serve as a new, clinically useful marker of HCC. In addition, the present study suggests that the expression of phosphorylated HSP27 is useful in the screening and grading of HCC occurring in the setting of HCV.

  5. EMMPRIN expression in oral squamous cell carcinomas: correlation with tumor proliferation and patient survival.

    PubMed

    Monteiro, Luís Silva; Delgado, Maria Leonor; Ricardo, Sara; Garcez, Fernanda; do Amaral, Barbas; Pacheco, José Júlio; Lopes, Carlos; Bousbaa, Hassan

    2014-01-01

    The aim of our study was to explore the clinicopathological and prognostic significance of extracellular matrix metalloproteinase inducer (EMMPRIN) expression in oral squamous cell carcinomas (OSCC), and its relation with the proliferative tumor status of OSCC. We examined EMMPRIN and Ki-67 protein expression by immunohistochemistry in 74 cases with OSCC. Statistical analysis was conducted to examine their clinicopathological and prognostic significance in OSCC. EMMPRIN membrane expression was observed in all cases, with both membrane and cytoplasmic tumor expression in 61 cases (82.4%). EMMPRIN overexpression was observed in 56 cases (75.7%). Moderately or poorly differentiated tumors showed EMMPRIN overexpression more frequently than well-differentiated tumors (P = 0.002). Overexpression of EMMPRIN was correlated with high Ki-67 expression (P = 0.004). In the multivariate analysis, EMMPRIN overexpression revealed an adverse independent prognostic value for cancer-specific survival (CSS) (P = 0.034). Our results reveal that EMMPRIN protein is overexpressed in more than two-thirds of OSCC cases, especially in highly proliferative and less differentiated tumors. The independent value of EMMPRIN overexpression in CSS suggests that this protein could be used as an important biological prognostic marker for patients with OSCC. Moreover, the high expression of EMMPRIN makes it a possible therapeutic target in OSCC patients.

  6. Expression of p53, p21 and cyclin D1 in penile cancer: p53 predicts poor prognosis.

    PubMed

    Gunia, Sven; Kakies, Christoph; Erbersdobler, Andreas; Hakenberg, Oliver W; Koch, Stefan; May, Matthias

    2012-03-01

    To evaluate the role of p53, p21 and cyclin D1 expression in patients with penile cancer (PC). Paraffin-embedded tissues from PC specimens from six pathology departments were subjected to a central histopathological review performed by one pathologist. The tissue microarray technique was used for immunostaining which was evaluated by two independent pathologists and correlated with cancer-specific survival (CSS). κ-statistics were used to assess interobserver variability. Uni- and multivariable Cox proportional hazards analysis was applied to assess the independent effects of several prognostic factors on CSS over a median of 32 months (IQR 6-66 months). Specimens and clinical data from 110 men treated surgically for primary PC were collected. p53 staining was positive in 30 and negative in 62 specimens. κ-statistics showed substantial interobserver reproducibility of p53 staining evaluation (κ=0.73; p<0.001). The 5-year CSS rate for the entire study cohort was 74%. Five-year CSS was 84% in p53-negative and 51% in p53-positive PC patients (p=0.003). Multivariable analysis showed p53 (HR=3.20; p=0.041) and pT-stage (HR=4.29; p<0.001) as independent significant prognostic factors for CSS. Cyclin D1 and p21 expression were not correlated with survival. However, incorporating p21 into a multivariable Cox model did contribute to improved model quality for predicting CSS. In patients with PC, the expression of p53 in the primary tumour specimen can be reproducibly assessed and is negatively associated with cancer specific survival.
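
    The κ-statistic used above to quantify interobserver reproducibility compares the observed agreement between two raters with the agreement expected by chance. A minimal sketch follows, assuming the reported κ is Cohen's kappa for two raters (the abstract does not say which variant was used); the function name and the toy ratings are illustrative and are not data from the study.

```python
from collections import Counter


def cohens_kappa(rater_a, rater_b):
    """Cohen's kappa for two raters scoring the same specimens.

    rater_a and rater_b are equal-length sequences of category labels,
    e.g. "pos" / "neg" for a staining evaluation.
    """
    if len(rater_a) != len(rater_b) or len(rater_a) == 0:
        raise ValueError("both raters must score the same, non-empty set")
    n = len(rater_a)
    # Proportion of specimens on which the raters agree
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Agreement expected by chance from each rater's marginal frequencies
    freq_a = Counter(rater_a)
    freq_b = Counter(rater_b)
    expected = sum(freq_a[c] * freq_b[c] for c in freq_a) / (n * n)
    if expected == 1.0:
        raise ValueError("kappa is undefined when chance agreement is 1")
    return (observed - expected) / (1.0 - expected)


# Hypothetical scores from two pathologists on four specimens.
pathologist_1 = ["pos", "pos", "neg", "neg"]
pathologist_2 = ["pos", "neg", "neg", "neg"]
print(f"kappa = {cohens_kappa(pathologist_1, pathologist_2):.2f}")
```

    A value of 1 indicates perfect agreement and 0 indicates agreement no better than chance, which is why κ is preferred over raw percent agreement when one category dominates.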

  8. Genome-Wide Identification and Evaluation of Reference Genes for Quantitative RT-PCR Analysis during Tomato Fruit Development

    PubMed Central

    Cheng, Yuan; Bian, Wuying; Pang, Xin; Yu, Jiahong; Ahammed, Golam J.; Zhou, Guozhi; Wang, Rongqing; Ruan, Meiying; Li, Zhimiao; Ye, Qingjing; Yao, Zhuping; Yang, Yuejian; Wan, Hongjian

    2017-01-01

    Gene expression analysis in tomato fruit has drawn increasing attention. Quantitative real-time PCR (qPCR) is a routine technique for gene expression analysis. In qPCR, the reliability of results largely depends on the choice of appropriate reference genes (RGs). Although tomato is a model for fruit biology, few RGs for qPCR analysis in tomato fruit have yet been developed. In this study, we initially identified the 38 most stably expressed genes based on a tomato transcriptome data set, and their expression stabilities were further determined by qPCR analysis in a set of tomato fruit samples from four different developmental stages (immature, mature green, breaker, mature red). Two statistical algorithms, geNorm and NormFinder, concordantly determined the superiority of these identified putative RGs. Notably, SlFRG05 (Solyc01g104170), SlFRG12 (Solyc04g009770), SlFRG16 (Solyc10g081190), SlFRG27 (Solyc06g007510), and SlFRG37 (Solyc11g005330) proved to be suitable RGs for the study of tomato fruit development. Further analysis using geNorm indicated that the combined use of SlFRG03 (Solyc02g063070) and SlFRG27 would provide more reliable normalization results in qPCR experiments. The RGs identified in this study will be beneficial for future qPCR analysis of tomato fruit development, as well as for the potential identification of optimal normalization controls in other plant species. PMID:28900431

  9. A regulation probability model-based meta-analysis of multiple transcriptomics data sets for cancer biomarker identification.

    PubMed

    Xie, Xin-Ping; Xie, Yu-Feng; Wang, Hong-Qiang

    2017-08-23

    Large-scale accumulation of omics data poses a pressing challenge of integrative analysis of multiple data sets in bioinformatics. An open question of such integrative analysis is how to pinpoint consistent but subtle gene activity patterns across studies. Study heterogeneity needs to be addressed carefully for this goal. This paper proposes a regulation probability model-based meta-analysis, jGRP, for identifying differentially expressed genes (DEGs). The method integrates multiple transcriptomics data sets in a gene regulatory space instead of in a gene expression space, which makes it easy to capture and manage data heterogeneity across studies from different laboratories or platforms. Specifically, we transform gene expression profiles into a united gene regulation profile across studies by mathematically defining two gene regulation events between two conditions and estimating their occurring probabilities in a sample. Finally, a novel differential expression statistic is established based on the gene regulation profiles, realizing accurate and flexible identification of DEGs in gene regulation space. We evaluated the proposed method on simulation data and real-world cancer datasets and showed the effectiveness and efficiency of jGRP in identifying DEGs in the context of meta-analysis. Data heterogeneity largely influences the performance of meta-analysis for DEG identification. Existing meta-analysis methods were revealed to exhibit very different degrees of sensitivity to study heterogeneity. The proposed method, jGRP, can be a standalone tool due to its united framework and controllable way to deal with study heterogeneity.

  10. Statistical error in simulations of Poisson processes: Example of diffusion in solids

    NASA Astrophysics Data System (ADS)

    Nilsson, Johan O.; Leetmaa, Mikael; Vekilova, Olga Yu.; Simak, Sergei I.; Skorodumova, Natalia V.

    2016-08-01

    Simulations of diffusion in solids often produce poor statistics of diffusion events. We present an analytical expression for the statistical error in ion conductivity obtained in such simulations. The error expression is not restricted to any computational method in particular, but is valid in the context of simulation of Poisson processes in general. This analytical error expression is verified numerically for the case of Gd-doped ceria by running a large number of kinetic Monte Carlo calculations.
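
    The abstract does not reproduce the paper's error expression. As background only, the textbook property of Poisson counting that makes "poor statistics" a concern is that a count of N independent events has standard deviation sqrt(N), so its relative error is 1/sqrt(N). The sketch below illustrates that general scaling; it is not the authors' formula for ion conductivity, and the function name is an assumption.

```python
import math


def poisson_relative_error(n_events: int) -> float:
    """Relative standard error, 1/sqrt(N), of a Poisson-distributed count N."""
    if n_events <= 0:
        raise ValueError("need at least one observed event")
    return 1.0 / math.sqrt(n_events)


# Hypothetical event counts: the relative error shrinks only as 1/sqrt(N),
# so a tenfold improvement in precision needs a hundredfold more events.
for n in (10, 100, 10_000):
    print(f"N = {n:>6}: relative error = {poisson_relative_error(n):.3f}")
```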

  11. High Rab27A expression indicates favorable prognosis in CRC.

    PubMed

    Shi, Chuanbing; Yang, Xiaojun; Ni, Yijiang; Hou, Ning; Xu, Li; Zhan, Feng; Zhu, Huijun; Xiong, Lin; Chen, Pingsheng

    2015-06-13

    Rab27A is a peculiar member of the Rab family and has been suggested to play essential roles in the development of human cancers. However, the association between Rab27A expression and clinicopathological characteristics of colorectal cancer (CRC) has not yet been elucidated. A one-step quantitative real-time polymerase chain reaction (qPCR) test with 18 fresh-frozen CRC samples and immunohistochemistry (IHC) analysis in 112 CRC cases were performed to evaluate the relationship between Rab27A expression and the clinicopathological features of CRC. Cox regression and Kaplan-Meier survival analyses were performed to identify the prognostic factors for 112 CRC patients. The results showed that the expression levels of Rab27A mRNA and protein were significantly higher in CRC tissues than in matched non-cancerous tissues, in both the qPCR test (p = 0.029) and IHC analysis (p = 0.020). The IHC data indicated that Rab27A protein expression in CRC was statistically correlated with lymph node metastasis (p = 0.022) and TNM stage (p = 0.026). Cox multi-factor analysis and the Kaplan-Meier method suggested that Rab27A protein expression (p = 0.012) and tumor differentiation (p = 0.004) were significantly associated with the overall survival of CRC patients. The data indicated differential expression of Rab27A between CRC tissues and matched non-cancerous tissues. Rab27A may be used as a valuable prognostic biomarker for CRC patients.

  12. Altered Molecular Expression of the TLR4/NF-κB Signaling Pathway in Mammary Tissue of Chinese Holstein Cattle with Mastitis

    PubMed Central

    Wu, Jie; Li, Lian; Sun, Yu; Huang, Shuai; Tang, Juan; Yu, Pan; Wang, Genlin

    2015-01-01

    Toll-like receptor 4 (TLR4) mediated activation of the nuclear transcription factor κB (NF-κB) signaling pathway by mastitis initiates expression of genes associated with inflammation and the innate immune response. In this study, the profile of mastitis-induced differential gene expression in the mammary tissue of Chinese Holstein cattle was investigated by Gene-Chip microarray and bioinformatics. The microarray results revealed that 79 genes associated with the TLR4/NF-κB signaling pathway were differentially expressed. Of these genes, 19 were up-regulated and 29 were down-regulated in mastitis tissue compared to normal, healthy tissue. Statistical analysis of transcript and protein level expression changes indicated that 10 genes were significantly altered: TLR4, MyD88, IL-6, and IL-10 were up-regulated, while CD14, TNF-α, MD-2, IL-β, NF-κB, and IL-12 were significantly down-regulated in mastitis tissue in comparison with normal tissue. Analyses using bioinformatics database resources, such as the Kyoto Encyclopedia of Genes and Genomes (KEGG) pathway analysis and the Gene Ontology Consortium (GO) for term enrichment analysis, suggested that these differentially expressed genes implicate different regulatory pathways for immune function in the mammary gland. In conclusion, our study provides new evidence for better understanding the differential expression and mechanisms of the TLR4/NF-κB signaling pathway in Chinese Holstein cattle with mastitis. PMID:25706977

  14. Proteomic Analysis of Matched Formalin-Fixed, Paraffin-Embedded Specimens in Patients with Advanced Serous Ovarian Carcinoma

    PubMed Central

    Smith, Ashlee L.; Sun, Mai; Bhargava, Rohit; Stewart, Nicolas A.; Flint, Melanie S.; Bigbee, William L.; Krivak, Thomas C.; Strange, Mary A.; Cooper, Kristine L.; Zorn, Kristin K.

    2013-01-01

    Objective: The biology of high grade serous ovarian carcinoma (HGSOC) is poorly understood. Little has been reported on intratumoral homogeneity or heterogeneity of primary HGSOC tumors and their metastases. We evaluated the global protein expression profiles of paired primary and metastatic HGSOC from formalin-fixed, paraffin-embedded (FFPE) tissue samples. Methods: After IRB approval, six patients with advanced HGSOC were identified with tumor in both ovaries at initial surgery. Laser capture microdissection (LCM) was used to extract tumor for protein digestion. Peptides were extracted and analyzed by reversed-phase liquid chromatography coupled to a linear ion trap mass spectrometer. Tandem mass spectra were searched against the UniProt human protein database. Differences in protein abundance between samples were assessed and analyzed by Ingenuity Pathway Analysis software. Immunohistochemistry (IHC) for select proteins from the original and an additional validation set of five patients was performed. Results: Unsupervised clustering of the abundance profiles placed the paired specimens adjacent to each other. IHC H-score analysis of the validation set revealed a strong correlation between paired samples for all proteins. For the similarly expressed proteins, the estimated correlation coefficients in two of three experimental samples and all validation samples were statistically significant (p < 0.05). The estimated correlation coefficients in the experimental sample proteins classified as differentially expressed were not statistically significant. Conclusion: A global proteomic screen of primary HGSOC tumors and their metastatic lesions identifies tumoral homogeneity and heterogeneity and provides preliminary insight into these protein profiles and the cellular pathways they constitute. PMID:28250404

  15. A Prototype System for Retrieval of Gene Functional Information

    PubMed Central

    Folk, Lillian C.; Patrick, Timothy B.; Pattison, James S.; Wolfinger, Russell D.; Mitchell, Joyce A.

    2003-01-01

    Microarrays allow researchers to gather data about the expression patterns of thousands of genes simultaneously. Statistical analysis can reveal which genes show statistically significant results. Making biological sense of those results requires the retrieval of functional information about the genes thus identified, typically a manual gene-by-gene retrieval of information from various on-line databases. For experiments generating thousands of genes of interest, retrieval of functional information can become a significant bottleneck. To address this issue, we are currently developing a prototype system to automate the process of retrieval of functional information from multiple on-line sources. PMID:14728346

  16. Gender discrimination and prediction on the basis of facial metric information.

    PubMed

    Fellous, J M

    1997-07-01

    Horizontal and vertical facial measurements are statistically independent. Discriminant analysis shows that five such normalized distances explain over 95% of the gender differences of "training" samples and predict the gender of 90% of novel test faces exhibiting various facial expressions. The robustness of the method and its results are assessed. It is argued that these distances (termed fiducial) are compatible with those found experimentally by psychophysical and neurophysiological studies. In consequence, partial explanations for the effects observed in these experiments can be found in the intrinsic statistical nature of the facial stimuli used.

  17. Cancer testis antigen OY-TES-1: analysis of protein expression in ovarian cancer with tissue microarrays.

    PubMed

    Fan, R; Huang, W; Luo, B; Zhang, Q M; Xiao, S W; Xie, X X

    2015-01-01

    Objectives: The purpose of this study was to determine the potential of cancer testis antigen OY-TES-1 as a vaccine for ovarian cancer (OC). A tissue microarray (TMA) containing 107 samples from OC tissues and 48 samples from OC adjacent tissues was analyzed by immunohistochemistry with the OY-TES-1 polyclonal antibody. The correlation between OY-TES-1 and clinicopathological traits of OC was statistically analyzed. The expression of OY-TES-1 protein was found in 81% (87/107) of OC tissues and 56% (27/48) of OC adjacent tissues. The immunostaining intensity of OY-TES-1 in OC tissues was significantly higher than that in the OC adjacent tissues tested (p = 0.040). OC adjacent tissues demonstrated only lower immunostaining intensity, whereas some OC tissues presented higher immunostaining intensity and the majority showed heterogeneity of protein distribution. There was no statistically significant correlation found between OY-TES-1 expression and any other clinicopathological traits such as age, FIGO stage, pathological grade, and histological type. OY-TES-1 was expressed in a high proportion of OC tissues, and some OC tissues presented a higher level of OY-TES-1 expression than OC adjacent tissues. OY-TES-1 could be an attractive target for immunotherapy for OC in the future.

  18. Comparative study of Hsp27, GSK3β, Wnt1 and PRDX3 in Hirschsprung's disease.

    PubMed

    Gao, Hong; Liu, Xiaomei; Chen, Dong; Lv, Liangying; Wu, Mei; Mi, Jie; Wang, Weilin

    2014-06-01

    Hirschsprung's disease (HSCR) is a developmental disorder of the enteric nervous system characterized by aganglionosis in the distal gut. In this study, we used two-dimensional gel electrophoresis (2-DE) technology coupled with matrix assisted laser desorption/ionization time-of-flight mass spectrometry (MALDI-TOF-MS) analysis to identify differentially expressed proteins in the aganglionic (stenotic) and ganglionic (normal) colon segment tissues from patients with HSCR. We identified 15 proteins with different expression levels between the stenotic and the normal colon segment tissues from patients with HSCR. Nine proteins were upregulated and six proteins downregulated in the stenotic colon segment tissues compared to the normal colon segment tissues. Based on their biological functions, we selected the upregulated protein Hsp27 and the downregulated protein PRDX3 to confirm their expression in 20 patients. The protein and mRNA expression of Hsp27 was statistically higher in the stenotic colon segment tissues than in the normal colon segment tissues, whereas the protein and mRNA expression of PRDX3 was statistically lower in the stenotic colon segment tissues than in the normal colon segment tissues. These findings of changes in mRNA and protein in tissues from patients with HSCR provide information which may be helpful in understanding the pathomechanism that is implicated in the disease. © 2014 The Authors. International Journal of Experimental Pathology © 2014 International Journal of Experimental Pathology.

  19. Surprisal analysis of genome-wide transcript profiling identifies differentially expressed genes and pathways associated with four growth conditions in the microalga Chlamydomonas.

    PubMed

    Bogaert, Kenny A; Manoharan-Basil, Sheeba S; Perez, Emilie; Levine, Raphael D; Remacle, Francoise; Remacle, Claire

    2018-01-01

    The usual cultivation mode of the green microalga Chlamydomonas is liquid medium and light. However, the microalga can also be grown on agar plates and in darkness. Our aim is to analyze and compare gene expression of cells cultivated in these different conditions. For that purpose, RNA-seq data are obtained from Chlamydomonas samples of two different labs grown in four environmental conditions (agar@light, agar@dark, liquid@light, liquid@dark). The RNA-seq data are analyzed by surprisal analysis, which allows the simultaneous meta-analysis of all the samples. First, we identify a balance state, which defines a state where the expression levels are similar in all the samples irrespective of their growth conditions or lab origin. In addition, our analysis identifies the constraints needed to quantify the deviation with respect to the balance state. The first constraint differentiates the agar samples from the liquid ones; the second constraint differentiates the dark samples from the light ones. The two constraints are almost of equal importance. Pathways involved in stress responses are found in the agar phenotype, while the liquid phenotype comprises ATP and NADH production pathways. Remodeling of membrane is suggested in the dark phenotype, while photosynthetic pathways characterize the light phenotype. The same trends are also present when performing purely statistical analyses such as K-means clustering and identification of differentially expressed genes.

  20. Behavioral analysis of Drosophila transformants expressing human taste receptor genes in the gustatory receptor neurons.

    PubMed

    Adachi, Ryota; Sasaki, Yuko; Morita, Hiromi; Komai, Michio; Shirakawa, Hitoshi; Goto, Tomoko; Furuyama, Akira; Isono, Kunio

    2012-06-01

    Transgenic Drosophila expressing human T2R4 and T2R38 bitter-taste receptors or PKD2L1 sour-taste receptor in the fly gustatory receptor neurons and other tissues were prepared using conventional Gal4/UAS binary system. Molecular analysis showed that the transgene mRNAs are expressed according to the tissue specificity of the Gal4 drivers. Transformants expressing the transgene taste receptors in the fly taste neurons were then studied by a behavioral assay to analyze whether transgene chemoreceptors are functional and coupled to the cell response. Since wild-type flies show strong aversion against the T2R ligands as in mammals, the authors analyzed the transformants where the transgenes are expressed in the fly sugar receptor neurons so that they promote feeding ligand-dependently if they are functional and activate the neurons. Although the feeding preference varied considerably among different strains and individuals, statistical analysis using large numbers of transformants indicated that transformants expressing T2R4 showed a small but significant increase in the preference for denatonium and quinine, the T2R4 ligands, as compared to the control flies, whereas transformants expressing T2R38 did not. Similarly, transformants expressing T2R38 and PKD2L1 also showed a similar preference increase for T2R38-specific ligand phenylthiocarbamide (PTC) and a sour-taste ligand, citric acid, respectively. Taken together, the transformants expressing mammalian taste receptors showed a small but significant increase in the feeding preference that is taste receptor and also ligand dependent. Although future improvements are required to attain performance comparable to the endogenous robust response, Drosophila taste neurons may serve as a potential in vivo heterologous expression system for analyzing chemoreceptor function.

  1. Inducible nitric oxide expression correlates with the level of inflammation in periapical cysts.

    PubMed

    Matsumoto, Mariza Akemi; Ribeiro, Daniel Araki

    2007-10-01

    To elucidate whether inducible nitric oxide synthase (iNOS) expression is correlated with the level of inflammation in periapical cysts, this study evaluated iNOS expression in these lesions. Thirty cases were included, and iNOS was evaluated by immunohistochemistry. Statistical analysis was performed with the Kruskal-Wallis non-parametric test followed by Dunn's post-hoc test. iNOS staining was detected throughout the epithelium, subepithelial fibroblasts, and macrophages in all cases. Nevertheless, iNOS immunostaining in periapical cysts differed according to the level of inflammation, with the strongest staining associated with intense inflammatory infiltrate. Taken together, our results indicate that iNOS immunoreactivity was present in several cell types in periapical cysts and was positively correlated with the level of inflammation. Therefore, iNOS expression plays an important role in the pathogenesis of periapical cysts.
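
    The Kruskal-Wallis test named in this abstract can be illustrated with a minimal sketch. It is not the authors' analysis: the function names and the staining scores below are hypothetical, Dunn's post-hoc test is not implemented, and the closed-form p-value applies only to the three-group case (chi-square with 2 degrees of freedom).

```python
import math
from itertools import chain


def average_ranks(values):
    """Return 1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def kruskal_wallis(groups):
    """Return (H, degrees of freedom), with H corrected for ties.

    Assumes the pooled data are not all identical (otherwise the tie
    correction would divide by zero).
    """
    pooled = list(chain.from_iterable(groups))
    n = len(pooled)
    ranks = average_ranks(pooled)
    total = 0.0
    start = 0
    for group in groups:
        rank_sum = sum(ranks[start:start + len(group)])
        total += rank_sum ** 2 / len(group)
        start += len(group)
    h = 12.0 / (n * (n + 1)) * total - 3 * (n + 1)
    # Tie correction: 1 - sum(t^3 - t) / (n^3 - n) over groups of tied values.
    counts = {}
    for value in pooled:
        counts[value] = counts.get(value, 0) + 1
    correction = 1 - sum(c ** 3 - c for c in counts.values()) / (n ** 3 - n)
    return h / correction, len(groups) - 1


def p_value_three_groups(h):
    """Chi-square survival function for 2 degrees of freedom: exp(-H / 2)."""
    return math.exp(-h / 2)


# Hypothetical semiquantitative iNOS scores grouped by inflammation level.
mild = [1, 1, 2, 1, 2]
moderate = [2, 2, 3, 2, 3]
intense = [3, 4, 4, 3, 4]
h_stat, dof = kruskal_wallis([mild, moderate, intense])
print(h_stat, dof, p_value_three_groups(h_stat))
```

    A significant H only indicates that at least one group differs; a pairwise procedure such as Dunn's test is still needed to say which levels of inflammation differ from each other.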

  2. Growth Factors and COX2 Expression in Canine Perivascular Wall Tumors.

    PubMed

    Avallone, G; Stefanello, D; Boracchi, P; Ferrari, R; Gelain, M E; Turin, L; Tresoldi, E; Roccabianca, P

    2015-11-01

    Canine perivascular wall tumors (PWTs) are a group of subcutaneous soft tissue sarcomas developing from vascular mural cells. Mural cells are involved in angiogenesis through a complex crosstalk with endothelial cells mediated by several growth factors and their receptors. The evaluation of their expression may have relevance since they may represent a therapeutic target in the control of canine PWTs. The expression of vascular endothelial growth factor (VEGF) and receptors VEGFR-I/II, basic fibroblast growth factor (bFGF) and receptor Flg, platelet-derived growth factor B (PDGFB) and receptor PDGFRβ, transforming growth factor β1 (TGFβ1) and receptors TGFβR-I/II, and cyclooxygenase 2 (COX2) was evaluated on frozen sections of 40 PWTs by immunohistochemistry and semiquantitatively scored to identify their potential role in PWT development. Statistical analysis was performed to analyze possible correlations between Ki67 labeling index and the expression of each molecule. Proteins of the VEGF-, PDGFB-, and bFGF-mediated pathways were highly expressed in 27 (67.5%), 30 (75%), and 19 (47.5%) of 40 PWTs, respectively. Proteins of the TGFβ1- and COX2-mediated pathways were highly expressed in 4 (10%) and 14 (35%) of 40 cases. Statistical analysis identified an association between VEGF and VEGFR-I/II (P = .015 and .003, respectively), bFGF and Flg (P = .038), bFGF and PDGFRβ (P = .003), and between TGFβ1 and COX2 (P = .006). These findings were consistent with the mechanisms that have been reported to play a role in angiogenesis and in tumor development. No association with Ki67 labeling index was found. VEGF-, PDGFB-, and bFGF-mediated pathways seem to have a key role in PWT development and growth. Blockade of tyrosine kinase receptors after surgery could represent a promising therapy with the aim to reduce the PWT relapse rate and prolong the time to relapse. © The Author(s) 2015.

  3. ABCA transporter gene expression and poor outcome in epithelial ovarian cancer.

    PubMed

    Hedditch, Ellen L; Gao, Bo; Russell, Amanda J; Lu, Yi; Emmanuel, Catherine; Beesley, Jonathan; Johnatty, Sharon E; Chen, Xiaoqing; Harnett, Paul; George, Joshy; Williams, Rebekka T; Flemming, Claudia; Lambrechts, Diether; Despierre, Evelyn; Lambrechts, Sandrina; Vergote, Ignace; Karlan, Beth; Lester, Jenny; Orsulic, Sandra; Walsh, Christine; Fasching, Peter; Beckmann, Matthias W; Ekici, Arif B; Hein, Alexander; Matsuo, Keitaro; Hosono, Satoyo; Nakanishi, Toru; Yatabe, Yasushi; Pejovic, Tanja; Bean, Yukie; Heitz, Florian; Harter, Philipp; du Bois, Andreas; Schwaab, Ira; Hogdall, Estrid; Kjaer, Susan K; Jensen, Allan; Hogdall, Claus; Lundvall, Lene; Engelholm, Svend Aage; Brown, Bob; Flanagan, James; Metcalf, Michelle D; Siddiqui, Nadeem; Sellers, Thomas; Fridley, Brooke; Cunningham, Julie; Schildkraut, Joellen; Iversen, Ed; Weber, Rachel P; Berchuck, Andrew; Goode, Ellen; Bowtell, David D; Chenevix-Trench, Georgia; deFazio, Anna; Norris, Murray D; MacGregor, Stuart; Haber, Michelle; Henderson, Michelle J

    2014-07-01

    ATP-binding cassette (ABC) transporters play various roles in cancer biology and drug resistance, but their association with outcomes in serous epithelial ovarian cancer (EOC) is unknown. The relationship between clinical outcomes and ABC transporter gene expression in two independent cohorts of high-grade serous EOC tumors was assessed with real-time quantitative polymerase chain reaction, analysis of expression microarray data, and immunohistochemistry. Associations between clinical outcomes and ABCA transporter gene single nucleotide polymorphisms were tested in a genome-wide association study. Impact of short interfering RNA-mediated gene suppression was determined by colony forming and migration assays. Association with survival was assessed with Kaplan-Meier analysis and log-rank tests. All statistical tests were two-sided. Associations with outcome were observed with ABC transporters of the "A" subfamily, but not with multidrug transporters. High-level expression of ABCA1, ABCA6, ABCA8, and ABCA9 in primary tumors was statistically significantly associated with reduced survival in serous ovarian cancer patients. Low levels of ABCA5 and the C-allele of rs536009 were associated with shorter overall survival (hazard ratio for death = 1.50; 95% confidence interval [CI] = 1.26 to 1.79; P = 6.5e-6). The combined expression pattern of ABCA1, ABCA5, and either ABCA8 or ABCA9 was associated with particularly poor outcome (mean overall survival in group with adverse ABCA1, ABCA5 and ABCA9 gene expression = 33.2 months, 95% CI = 26.4 to 40.1; vs 55.3 months in the group with favorable ABCA gene expression, 95% CI = 49.8 to 60.8; P = .001), independently of tumor stage or surgical debulking status. Suppression of cholesterol transporter ABCA1 inhibited ovarian cancer cell growth and migration in vitro, and statin treatment reduced ovarian cancer cell migration. Expression of ABCA transporters was associated with poor outcome in serous ovarian cancer, implicating lipid trafficking as a potentially important process in EOC. © The Author 2014. Published by Oxford University Press. All rights reserved.
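
    The Kaplan-Meier analysis mentioned in this abstract can be sketched as follows. This is an illustration of the product-limit estimator only, not the authors' code: the function name `kaplan_meier` and the follow-up data are hypothetical, and the log-rank comparison between groups is left out.

```python
def kaplan_meier(times, events):
    """Product-limit survival estimate.

    times  -- follow-up duration for each patient (e.g. months)
    events -- 1 if death was observed at that time, 0 if censored
    Returns a list of (time, survival probability) at each distinct
    time where at least one death occurred.
    """
    data = sorted(zip(times, events))
    at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = 0
        leaving = 0
        # Collect everyone whose follow-up ends at time t.
        while i < len(data) and data[i][0] == t:
            deaths += data[i][1]
            leaving += 1
            i += 1
        if deaths:
            survival *= 1 - deaths / at_risk
            curve.append((t, survival))
        # Deaths and censored patients both leave the risk set.
        at_risk -= leaving
    return curve


# Hypothetical follow-up (months) for two expression groups.
high_expression = kaplan_meier([6, 12, 18, 24, 30], [1, 1, 0, 1, 1])
low_expression = kaplan_meier([20, 36, 48, 60, 72], [1, 0, 1, 0, 0])
print(high_expression)
print(low_expression)
```

    Censored patients reduce the number at risk without lowering the curve, which is why the estimator can use incomplete follow-up; comparing the two curves formally would still require a test such as the log-rank test used in the study.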

  4. An information-theoretic approach to the modeling and analysis of whole-genome bisulfite sequencing data.

    PubMed

    Jenkinson, Garrett; Abante, Jordi; Feinberg, Andrew P; Goutsias, John

    2018-03-07

    DNA methylation is a stable form of epigenetic memory used by cells to control gene expression. Whole genome bisulfite sequencing (WGBS) has emerged as a gold-standard experimental technique for studying DNA methylation by producing high resolution genome-wide methylation profiles. Statistical modeling and analysis is employed to computationally extract and quantify information from these profiles in an effort to identify regions of the genome that demonstrate crucial or aberrant epigenetic behavior. However, the performance of most currently available methods for methylation analysis is hampered by their inability to directly account for statistical dependencies between neighboring methylation sites, thus ignoring significant information available in WGBS reads. We present a powerful information-theoretic approach for genome-wide modeling and analysis of WGBS data based on the 1D Ising model of statistical physics. This approach takes into account correlations in methylation by utilizing a joint probability model that encapsulates all information available in WGBS methylation reads and produces accurate results even when applied on single WGBS samples with low coverage. Using the Shannon entropy, our approach provides a rigorous quantification of methylation stochasticity in individual WGBS samples genome-wide. Furthermore, it utilizes the Jensen-Shannon distance to evaluate differences in methylation distributions between a test and a reference sample. Differential performance assessment using simulated and real human lung normal/cancer data demonstrate a clear superiority of our approach over DSS, a recently proposed method for WGBS data analysis. Critically, these results demonstrate that marginal methods become statistically invalid when correlations are present in the data. This contribution demonstrates clear benefits and the necessity of modeling joint probability distributions of methylation using the 1D Ising model of statistical physics and of quantifying methylation stochasticity using concepts from information theory. By employing this methodology, substantial improvement of DNA methylation analysis can be achieved by effectively taking into account the massive amount of statistical information available in WGBS data, which is largely ignored by existing methods.
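
    The two information-theoretic quantities named in this abstract, the Shannon entropy and the Jensen-Shannon distance, can be sketched for plain discrete distributions. The sketch below does not reproduce the Ising-model step that the authors use to obtain the methylation distributions; the function names and the example distributions are hypothetical, and base-2 logarithms are assumed so that the distance lies between 0 and 1.

```python
import math


def shannon_entropy(p):
    """Entropy in bits of a discrete distribution (zero terms contribute 0)."""
    return -sum(x * math.log2(x) for x in p if x > 0)


def jensen_shannon_distance(p, q):
    """Square root of the Jensen-Shannon divergence between p and q.

    p and q are probability vectors over the same outcomes. With base-2
    logarithms the result lies between 0 (identical) and 1 (disjoint support).
    """
    mixture = [(a + b) / 2 for a, b in zip(p, q)]
    divergence = shannon_entropy(mixture) - (
        shannon_entropy(p) + shannon_entropy(q)
    ) / 2
    # Guard against tiny negative values caused by floating-point rounding.
    return math.sqrt(max(divergence, 0.0))


# Hypothetical distributions over low / intermediate / high methylation
# levels in one genomic region, for a reference and a test sample.
reference = [0.6, 0.3, 0.1]
test = [0.1, 0.2, 0.7]
print(shannon_entropy(reference), shannon_entropy(test))
print(jensen_shannon_distance(reference, test))
```

    In this reading, the entropy summarizes how variable methylation is within a single sample, while the distance summarizes how far a test sample's distribution has moved from the reference.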

  5. Power flow as a complement to statistical energy analysis and finite element analysis

    NASA Technical Reports Server (NTRS)

    Cuschieri, J. M.

    1987-01-01

    Present methods of analysis of the structural response and the structure-borne transmission of vibrational energy use either finite element (FE) techniques or statistical energy analysis (SEA) methods. The FE methods are a very useful tool at low frequencies where the number of resonances involved in the analysis is rather small. On the other hand, SEA methods can predict with acceptable accuracy the response and energy transmission between coupled structures at relatively high frequencies where the structural modal density is high and a statistical approach is the appropriate solution. In the mid-frequency range, a relatively large number of resonances exist, which makes the finite element method too costly, while SEA methods can predict only an average level. In this mid-frequency range a possible alternative is to use power flow techniques, where the input and flow of vibrational energy to excited and coupled structural components can be expressed in terms of input and transfer mobilities. This power flow technique can be extended from low to high frequencies and can be integrated with established FE models at low frequencies and SEA models at high frequencies to verify the method. This method of structural analysis using power flow and mobility methods, and its integration with SEA and FE analysis, is applied to the case of two thin beams joined together at right angles.
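
    As a small illustration of expressing power in terms of mobility, the sketch below uses the standard relation for a harmonic point force: the time-averaged input power is half the squared force amplitude times the real part of the input mobility. It covers only the input power, not the transfer between coupled components treated in the report; the function names and numerical values are hypothetical.

```python
def input_power(force, mobility):
    """Time-averaged power injected by a harmonic point force.

    force    -- complex peak force amplitude in newtons
    mobility -- complex input mobility (velocity / force) at the drive point
    Returns 0.5 * |F|^2 * Re{Y} in watts.
    """
    return 0.5 * abs(force) ** 2 * mobility.real


def input_power_from_velocity(force, mobility):
    """Same quantity computed as 0.5 * Re{F * conj(v)} with v = Y * F."""
    velocity = mobility * force
    return 0.5 * (force * velocity.conjugate()).real


# Hypothetical drive-point values at a single frequency.
force = 2 + 0j
mobility = 0.01 + 0.02j
print(input_power(force, mobility))
print(input_power_from_velocity(force, mobility))
```

    Only the real part of the mobility carries net power into the structure; the imaginary part corresponds to energy exchanged back and forth within each cycle.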

  6. A method for developing design diagrams for ceramic and glass materials using fatigue data

    NASA Technical Reports Server (NTRS)

    Heslin, T. M.; Magida, M. B.; Forrest, K. A.

    1986-01-01

    The service lifetime of glass and ceramic materials can be expressed as a plot of time-to-failure versus applied stress that is parametric in percent probability of failure. This type of plot is called a design diagram. Confidence interval estimates for such plots depend on the type of test that is used to generate the data, on assumptions made concerning the statistical distribution of the test results, and on the type of analysis used. This report outlines the development of design diagrams for glass and ceramic materials in engineering terms using static or dynamic fatigue tests, assuming either no particular statistical distribution of test results or a Weibull distribution, and using either median value or homologous ratio analysis of the test results.

  7. Statistical functions and relevant correlation coefficients of clearness index

    NASA Astrophysics Data System (ADS)

    Pavanello, Diego; Zaaiman, Willem; Colli, Alessandra; Heiser, John; Smith, Scott

    2015-08-01

    This article presents a statistical analysis of the sky conditions during the years 2010 to 2012 for three different locations: the Joint Research Centre site in Ispra (Italy, European Solar Test Installation - ESTI laboratories), the site of the National Renewable Energy Laboratory in Golden (Colorado, USA) and the site of Brookhaven National Laboratory in Upton (New York, USA). The key parameter is the clearness index kT, a dimensionless expression of the global irradiance impinging upon a horizontal surface at a given instant of time. In the first part, the sky conditions are characterized using daily averages, giving a general overview of the three sites. In the second part, the analysis is performed using data sets with a short-term resolution of 1 sample per minute, demonstrating remarkable properties of the statistical distributions of the clearness index, reinforced by a proof using fuzzy logic methods. Subsequently, some time-dependent correlations between different meteorological variables are presented in terms of Pearson and Spearman correlation coefficients, and a new coefficient is introduced.
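
    The clearness index and the two established correlation coefficients named in this abstract can be sketched as follows. The sketch assumes the common definition of kT as global horizontal irradiance divided by the extraterrestrial irradiance on a horizontal surface; the function names and the sample series are hypothetical, and the new coefficient proposed by the authors is not reproduced.

```python
import math


def clearness_index(global_horizontal, extraterrestrial_horizontal):
    """kT as the ratio of measured to extraterrestrial horizontal irradiance."""
    return global_horizontal / extraterrestrial_horizontal


def pearson(x, y):
    """Pearson product-moment correlation of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    norm_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    norm_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (norm_x * norm_y)


def average_ranks(values):
    """1-based ranks; tied values receive the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        for k in range(i, j + 1):
            ranks[order[k]] = (i + j) / 2 + 1
        i = j + 1
    return ranks


def spearman(x, y):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


# Hypothetical one-minute samples: irradiance in W/m^2, temperature in deg C.
ghi = [210.0, 480.0, 650.0, 720.0, 300.0]
extraterrestrial = [900.0, 950.0, 1000.0, 1000.0, 950.0]
temperature = [14.0, 17.5, 21.0, 22.5, 16.0]
kt = [clearness_index(g, e) for g, e in zip(ghi, extraterrestrial)]
print(pearson(kt, temperature), spearman(kt, temperature))
```

    Pearson measures linear association, while Spearman depends only on the ordering of the values, so the two can differ noticeably when the relationship between variables is monotonic but not linear.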

  8. Oncogene GAEC1 regulates CAPN10 expression which predicts survival in esophageal squamous cell carcinoma

    PubMed Central

    Chan, Dessy; Tsoi, Miriam Yuen-Tung; Liu, Christina Di; Chan, Sau-Hing; Law, Simon Ying-Kit; Chan, Kwok-Wah; Chan, Yuen-Piu; Gopalan, Vinod; Lam, Alfred King-Yin; Tang, Johnny Cheuk-On

    2013-01-01

    AIM: To identify the downstream regulated genes of GAEC1 oncogene in esophageal squamous cell carcinoma and their clinicopathological significance. METHODS: The anti-proliferative effect of knocking down the expression of GAEC1 oncogene was studied by using the RNA interference (RNAi) approach through transfecting the GAEC1-overexpressed esophageal carcinoma cell line KYSE150 with the pSilencer vector cloned with a GAEC1-targeted sequence, followed by MTS cell proliferation assay and cell cycle analysis using flow cytometry. RNA was then extracted from the parental, pSilencer-GAEC1-targeted sequence transfected and pSilencer negative control vector transfected KYSE150 cells for further analysis of different patterns in gene expression. Genes differentially expressed with suppressed GAEC1 expression were then determined using Human Genome U133 Plus 2.0 cDNA microarray analysis by comparing with the parental cells and normalized with the pSilencer negative control vector transfected cells. The most prominently regulated genes were then studied by immunohistochemical staining using tissue microarrays to determine their clinicopathological correlations in esophageal squamous cell carcinoma by statistical analyses. RESULTS: The RNAi approach of knocking down gene expression showed the effective suppression of GAEC1 expression in esophageal squamous cell carcinoma cell line KYSE150 that resulted in the inhibition of cell proliferation and increase of apoptotic population. cDNA microarray analysis for identifying differentially expressed genes detected the greatest levels of downregulation of calpain 10 (CAPN10) and upregulation of trinucleotide repeat containing 6C (TNRC6C) transcripts when GAEC1 expression was suppressed. At the tissue level, the high level expression of calpain 10 protein was significantly associated with longer patient survival (months) of esophageal squamous cell carcinoma compared to the patients with low level of calpain 10 expression (37.73 ± 16.33 vs 12.62 ± 12.44, P = 0.032). No significant correlation was observed between the TNRC6C protein expression level and the clinicopathological features of esophageal squamous cell carcinoma. CONCLUSION: GAEC1 regulates the expression of CAPN10 and TNRC6C downstream. Calpain 10 expression is a potential prognostic marker in patients with esophageal squamous cell carcinoma. PMID:23687414

  9. Global identification and expression analysis of stress-responsive genes of the Argonaute family in apple.

    PubMed

    Xu, Ruirui; Liu, Caiyun; Li, Ning; Zhang, Shizhong

    2016-12-01

    Argonaute (AGO) proteins, which are found in yeast, animals, and plants, are the core molecules of the RNA-induced silencing complex. These proteins play important roles in plant growth, development, and responses to biotic stresses. The complete analysis and classification of the AGO gene family have been recently reported in different plants. Nevertheless, systematic analysis and expression profiling of these genes have not been performed in apple (Malus domestica). Approximately 15 AGO genes were identified in the apple genome. The phylogenetic tree, chromosome location, conserved protein motifs, gene structure, and expression of the AGO gene family in apple were analyzed for gene prediction. All AGO genes were phylogenetically clustered into four groups (i.e., AGO1, AGO4, MEL1/AGO5, and ZIPPY/AGO7) with the AGO genes of Arabidopsis. These groups of the AGO gene family were statistically analyzed and compared among 31 plant species. The predicted apple AGO genes are distributed across nine chromosomes at different densities and include three segment duplications. Expression studies indicated that 15 AGO genes exhibit different expression patterns in at least one of the tissues tested. Additionally, analysis of gene expression levels indicated that the genes are mostly involved in responses to NaCl, PEG, heat, and low-temperature stresses. Hence, several candidate AGO genes are involved in different aspects of physiological and developmental processes and may play an important role in abiotic stress responses in apple. To the best of our knowledge, this study is the first to report a comprehensive analysis of the apple AGO gene family. Our results provide useful information to understand the classification and putative functions of these proteins, especially for gene members that may play important roles in abiotic stress responses in M. hupehensis.

  10. Integrative multi-platform meta-analysis of gene expression profiles in pancreatic ductal adenocarcinoma patients for identifying novel diagnostic biomarkers.

    PubMed

    Irigoyen, Antonio; Jimenez-Luna, Cristina; Benavides, Manuel; Caba, Octavio; Gallego, Javier; Ortuño, Francisco Manuel; Guillen-Ponce, Carmen; Rojas, Ignacio; Aranda, Enrique; Torres, Carolina; Prados, Jose

    2018-01-01

    Applying differentially expressed genes (DEGs) to identify feasible biomarkers in diseases can be a hard task when working with heterogeneous datasets. Expression data are strongly influenced by technology, sample preparation processes, and/or labeling methods. The proliferation of different microarray platforms for measuring gene expression increases the need to develop models able to compare their results, especially when different technologies can lead to signal values that vary greatly. Integrative meta-analysis can significantly improve the reliability and robustness of DEG detection. The objective of this work was to develop an integrative approach for identifying potential cancer biomarkers by integrating gene expression data from two different platforms. Pancreatic ductal adenocarcinoma (PDAC), where there is an urgent need to find new biomarkers due to its late diagnosis, is an ideal candidate for testing this technology. Expression data from two different datasets, namely Affymetrix and Illumina (18 and 36 PDAC patients, respectively), as well as from 18 healthy controls, were used for this study. A meta-analysis based on an empirical Bayesian methodology (ComBat) was then proposed to integrate these datasets. DEGs were finally identified from the integrated data by using the statistical programming language R. After our integrative meta-analysis, 5 genes were commonly identified within the individual analyses of the independent datasets. In addition, 28 novel genes that were not reported by the individual analyses ('gained' genes) were discovered. Several of these gained genes have already been related to other gastroenterological tumors. The proposed integrative meta-analysis has revealed novel DEGs that may play an important role in PDAC and could be potential biomarkers for diagnosing the disease.

  11. Investigating a multigene prognostic assay based on significant pathways for Luminal A breast cancer through gene expression profile analysis.

    PubMed

    Gao, Haiyan; Yang, Mei; Zhang, Xiaolan

    2018-04-01

    The present study aimed to investigate potential recurrence-risk biomarkers based on significant pathways for Luminal A breast cancer through gene expression profile analysis. Initially, the gene expression profiles of Luminal A breast cancer patients were downloaded from The Cancer Genome Atlas database. The differentially expressed genes (DEGs) were identified using the limma package and hierarchical clustering analysis was conducted for the DEGs. In addition, the functional pathways were screened using Kyoto Encyclopedia of Genes and Genomes pathway enrichment analyses and rank ratio calculation. The multigene prognostic assay was developed based on the statistically significant pathways; its prognostic function was tested using a training set and verified using the gene expression data and survival data of Luminal A breast cancer patients downloaded from the Gene Expression Omnibus. A total of 300 DEGs were identified between good and poor outcome groups, including 176 upregulated genes and 124 downregulated genes. Hierarchical clustering analysis verified that the DEGs may be used to effectively distinguish Luminal A samples with different prognoses. There were 9 pathways screened as significant pathways and a total of 18 DEGs involved in these 9 pathways were identified as prognostic biomarkers. According to the survival analysis and receiver operating characteristic curve, the obtained 18-gene prognostic assay exhibited good prognostic function with high sensitivity and specificity to both the train and test samples. In conclusion, the 18-gene prognostic assay including the key genes, transcription factor 7-like 2, anterior parietal cortex and lymphocyte enhancer factor-1 may provide a new method for predicting outcomes and may be conducive to the promotion of precision medicine for Luminal A breast cancer.

  12. Time-Course Analysis of Gene Expression During the Saccharomyces cerevisiae Hypoxic Response.

    PubMed

    Bendjilali, Nasrine; MacLeon, Samuel; Kalra, Gurmannat; Willis, Stephen D; Hossian, A K M Nawshad; Avery, Erica; Wojtowicz, Olivia; Hickman, Mark J

    2017-01-05

    Many cells experience hypoxia, or low oxygen, and respond by dramatically altering gene expression. In the yeast Saccharomyces cerevisiae, genes that respond are required for many oxygen-dependent cellular processes, such as respiration, biosynthesis, and redox regulation. To more fully characterize the global response to hypoxia, we exposed yeast to hypoxic conditions, extracted RNA at different times, and performed RNA sequencing (RNA-seq) analysis. Time-course statistical analysis revealed hundreds of genes that changed expression by up to 550-fold. The genes responded with varying kinetics suggesting that multiple regulatory pathways are involved. We identified most known oxygen-regulated genes and also uncovered new regulated genes. Reverse transcription-quantitative PCR (RT-qPCR) analysis confirmed that the lysine methyltransferase EFM6 and the recombinase DMC1, both conserved in humans, are indeed oxygen-responsive. Looking more broadly, oxygen-regulated genes participate in expected processes like respiration and lipid metabolism, but also in unexpected processes like amino acid and vitamin metabolism. Using principal component analysis, we discovered that the hypoxic response largely occurs during the first 2 hr and then a new steady-state expression state is achieved. Moreover, we show that the oxygen-dependent genes are not part of the previously described environmental stress response (ESR) consisting of genes that respond to diverse types of stress. While hypoxia appears to cause a transient stress, the hypoxic response is mostly characterized by a transition to a new state of gene expression. In summary, our results reveal that hypoxia causes widespread and complex changes in gene expression to prepare the cell to function with little or no oxygen. Copyright © 2017 Bendjilali et al.

  13. Immunophenotypic and Molecular Analysis of Human Dental Pulp Stem Cells Potential for Neurogenic Differentiation

    PubMed Central

    Fatima, Nikhat; Khan, Aleem A.; Vishwakarma, Sandeep K.

    2017-01-01

    Background: Growing evidence shows that dental pulp (DP) tissues could be a potential source of adult stem cells for the treatment of devastating neurological diseases and several other conditions. Aims: Exploration of the expression profile of several key molecular markers to evaluate the molecular dynamics in undifferentiated and differentiated DP-derived stem cells (DPSCs) in vitro. Settings and Design: The characteristics and multilineage differentiation ability of DPSCs were determined by cellular and molecular kinetics. DPSCs were further induced to form adherent (ADH) and non-ADH (NADH) neurospheres under serum-free condition which was further induced into neurogenic lineage cells and characterized for their molecular and cellular diversity at each stage. Statistical Analysis Used: Statistical analysis used one-way analysis of variance, Student's t-test, Livak method for relative quantification, and R programming. Results: Immunophenotypic analysis of DPSCs revealed >80% cells positive for mesenchymal markers CD90 and CD105, >70% positive for transferrin receptor (CD71), and >30% for chemotactic factor (CXCR3). These cells also showed mesodermal differentiation, confirmed by specific staining and molecular analysis. Activation of neuronal lineage markers and neurogenic growth factors was observed during lineage differentiation of cells derived from NADH and ADH spheroids. Greater than 80% of cells were found to express β-tubulin III in both differentiation conditions. Conclusions: The present study reported a cascade of immunophenotypic and molecular markers to characterize neurogenic differentiation of DPSCs under serum-free condition. These findings trigger the future analyses for clinical applicability of DP-derived cells in regenerative applications. PMID:28566856
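
    The Livak method named in the abstract above is the standard 2^(-ΔΔCt) calculation for relative quantification of qPCR data. The following is a minimal sketch of that calculation, not the authors' own code; the function name and the Ct values are hypothetical, and the formula assumes roughly 100% amplification efficiency for both target and reference genes.

```python
def livak_fold_change(ct_target_treated, ct_ref_treated,
                      ct_target_control, ct_ref_control):
    """Relative expression by the 2^(-ddCt) (Livak) method.

    All arguments are qPCR threshold-cycle (Ct) values. The names are
    illustrative; assumes ~100% amplification efficiency.
    """
    # Normalize the target gene to the reference gene within each sample.
    d_ct_treated = ct_target_treated - ct_ref_treated
    d_ct_control = ct_target_control - ct_ref_control
    # Compare the normalized treated sample with the normalized control.
    dd_ct = d_ct_treated - d_ct_control
    return 2.0 ** (-dd_ct)


# Hypothetical Ct values: the target crosses threshold 2 cycles earlier
# (relative to the reference gene) in the differentiated sample.
fold = livak_fold_change(24.0, 18.0, 26.0, 18.0)
print(fold)
```

    A fold change above 1 indicates higher expression than in the control sample; below 1 indicates lower expression.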

  14. Mucinous Colorectal Adenocarcinoma: Influence of EGFR and E-Cadherin Expression on Clinicopathologic Features and Prognosis.

    PubMed

    Foda, Abd AlRahman M; AbdelAziz, Azza; El-Hawary, Amira K; Hosni, Ali; Zalata, Khalid R; Gado, Asmaa I

    2015-08-01

    Previous studies have shown conflicting results on epidermal growth factor receptor (EGFR) and E-cadherin expression in colorectal carcinoma and their prognostic significance. To the best of our knowledge, this study is the first to investigate EGFR and E-cadherin expression, interrelation and relation to clinicopathologic, histologic parameters, and survival in rare colorectal mucinous adenocarcinoma (MA). In this study, we studied tumor tissue specimens from 150 patients with colorectal MA and nonmucinous adenocarcinoma (NMA). High-density manual tissue microarrays were constructed using modified mechanical pencil tips technique, and immunohistochemistry for EGFR and E-cadherin was performed. All relations were analyzed using established statistical methodologies. NMA expressed EGFR and E-cadherin in significantly higher rates with significant heterogenous pattern than MA. EGFR and E-cadherin positivity rates were significantly interrelated in both NMA and MA groups. In the NMA group, high EGFR expression was associated with old age, male sex, multiplicity of tumors, lack of mucinous component, and association with schistosomiasis. However, in the MA group, high EGFR expression was associated only with old age and MA subtype rather than signet ring carcinoma subtype. Conversely, high E-cadherin expression in MA cases was associated with old age, fungating tumor configuration, MA subtype, and negative intratumoral lymphocytic response. However, in the NMA cases, none of these factors was statistically significant. In a univariate analysis, neither EGFR nor E-cadherin expression showed a significant impact on disease-free or overall survival. Targeted therapy against EGFR and E-cadherin may not be useful in patients with MA. Neither EGFR nor E-cadherin is an independent prognostic factor in NMA or MA.

  15. Sedimentological analysis and bed thickness statistics from a Carboniferous deep-water channel-levee complex: Myall Trough, SE Australia

    NASA Astrophysics Data System (ADS)

    Palozzi, Jason; Pantopoulos, George; Maravelis, Angelos G.; Nordsvan, Adam; Zelilidis, Avraam

    2018-02-01

    This investigation presents an outcrop-based integrated study of internal division analysis and statistical treatment of turbidite bed thickness applied to a Carboniferous deep-water channel-levee complex in the Myall Trough, southeast Australia. Turbidite beds of the studied succession are characterized by a range of sedimentary structures grouped into two main associations, a thick-bedded and a thin-bedded one, that reflect channel-fill and overbank/levee deposits, respectively. Three vertically stacked channel-levee cycles have been identified. Results of statistical analysis of bed thickness, grain-size and internal division patterns applied on the studied channel-levee succession, indicate that turbidite bed thickness data seem to be well characterized by a bimodal lognormal distribution, which is possibly reflecting the difference between deposition from lower-density flows (in a levee/overbank setting) and very high-density flows (in a channel fill setting). Power law and exponential distributions were observed to hold only for the thick-bedded parts of the succession and cannot characterize the whole bed thickness range of the studied sediments. The succession also exhibits non-random clustering of bed thickness and grain-size measurements. The studied sediments are also characterized by the presence of statistically detected fining-upward sandstone packets. A novel quantitative approach (change-point analysis) is proposed for the detection of those packets. Markov permutation statistics also revealed the existence of order in the alternation of internal divisions in the succession expressed by an optimal internal division cycle reflecting two main types of gravity flow events deposited within both thick-bedded conglomeratic and thin-bedded sandstone associations. The analytical methods presented in this study can be used as additional tools for quantitative analysis and recognition of depositional environments in hydrocarbon-bearing research of ancient deep-water channel-levee settings.

  16. Expression of miR-155, miR-146a, and miR-326 in T1D patients from Chile: relationship with autoimmunity and inflammatory markers.

    PubMed

    García-Díaz, Diego F; Pizarro, Carolina; Camacho-Guillén, Patricia; Codner, Ethel; Soto, Néstor; Pérez-Bravo, Francisco

    2018-02-01

    Objective: The aim of this research was to analyze the expression profile of miR-155, miR-146a, and miR-326 in peripheral blood mononuclear cells (PBMC) of 47 patients with type 1 diabetes mellitus (T1D) and 39 control subjects, as well as the possible association with autoimmune or inflammatory markers. Subjects and methods: The expression profile of the miRs was measured by qPCR using TaqMan probes. Autoantibodies and inflammatory markers were measured by ELISA. Statistical analysis used bivariate correlation. Results: The analysis shows an increase in the expression of miR-155 in T1D patients in basal conditions compared to the controls (p < 0.001) and a decreased expression level of miR-326 (p < 0.01) and miR-146a (p < 0.05) in T1D patients compared to the controls. miR-155 was the only miR associated with autoimmunity (ZnT8) and inflammatory status (vCAM). Conclusion: Our data show a possible role of miR-155 related to autoimmunity and inflammation in Chilean patients with T1D.

  17. Exploiting the full power of temporal gene expression profiling through a new statistical test: application to the analysis of muscular dystrophy data.

    PubMed

    Vinciotti, Veronica; Liu, Xiaohui; Turk, Rolf; de Meijer, Emile J; 't Hoen, Peter A C

    2006-04-03

    The identification of biologically interesting genes in a temporal expression profiling dataset is challenging and complicated by high levels of experimental noise. Most statistical methods used in the literature do not fully exploit the temporal ordering in the dataset and are not suited to the case where temporal profiles are measured for a number of different biological conditions. We present a statistical test that makes explicit use of the temporal order in the data by fitting polynomial functions to the temporal profile of each gene and for each biological condition. A Hotelling T2-statistic is derived to detect the genes for which the parameters of these polynomials are significantly different from each other. We validate the temporal Hotelling T2-test on muscular gene expression data from four mouse strains which were profiled at different ages: dystrophin-, beta-sarcoglycan and gamma-sarcoglycan deficient mice, and wild-type mice. The first three are animal models for different muscular dystrophies. Extensive biological validation shows that the method is capable of finding genes with temporal profiles significantly different across the four strains, as well as identifying potential biomarkers for each form of the disease. The added value of the temporal test compared to an identical test which does not make use of temporal ordering is demonstrated via a simulation study, and through confirmation of the expression profiles from selected genes by quantitative PCR experiments. The proposed method maximises the detection of the biologically interesting genes, whilst minimising false detections. The temporal Hotelling T2-test is capable of finding relatively small and robust sets of genes that display different temporal profiles between the conditions of interest. The test is simple, it can be used on gene expression data generated from any experimental design and for any number of conditions, and it allows fast interpretation of the temporal behaviour of genes. The R code is available from V.V. The microarray data have been submitted to GEO under series GSE1574 and GSE3523.
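
    The general idea described above (fit a polynomial to each temporal profile, then compare the fitted parameters between conditions with a Hotelling T2 statistic) can be sketched as follows. This is an illustrative simplification and not the authors' derivation or their R code: it assumes a degree-1 polynomial fitted separately to each replicate animal, two conditions only, and the textbook two-sample T2 with a pooled covariance matrix. All names and expression values are hypothetical.

```python
def fit_line(times, values):
    """Least-squares (intercept, slope) for one temporal profile."""
    n = len(times)
    mean_t = sum(times) / n
    mean_v = sum(values) / n
    sxx = sum((t - mean_t) ** 2 for t in times)
    sxy = sum((t - mean_t) * (v - mean_v) for t, v in zip(times, values))
    slope = sxy / sxx
    return (mean_v - slope * mean_t, slope)


def mean_vec(rows):
    """Component-wise mean of a list of 2-element coefficient vectors."""
    n = len(rows)
    return [sum(r[j] for r in rows) / n for j in range(2)]


def scatter(rows, mean):
    """2x2 sum of outer products of deviations from the mean."""
    s = [[0.0, 0.0], [0.0, 0.0]]
    for r in rows:
        d = [r[0] - mean[0], r[1] - mean[1]]
        for i in range(2):
            for j in range(2):
                s[i][j] += d[i] * d[j]
    return s


def hotelling_t2(coefs_a, coefs_b):
    """Two-sample Hotelling T2 on (intercept, slope) vectors."""
    na, nb = len(coefs_a), len(coefs_b)
    ma, mb = mean_vec(coefs_a), mean_vec(coefs_b)
    sa, sb = scatter(coefs_a, ma), scatter(coefs_b, mb)
    # Pooled covariance of the coefficient vectors.
    pooled = [[(sa[i][j] + sb[i][j]) / (na + nb - 2) for j in range(2)]
              for i in range(2)]
    det = pooled[0][0] * pooled[1][1] - pooled[0][1] * pooled[1][0]
    inv = [[pooled[1][1] / det, -pooled[0][1] / det],
           [-pooled[1][0] / det, pooled[0][0] / det]]
    d = [ma[0] - mb[0], ma[1] - mb[1]]
    quad = sum(d[i] * inv[i][j] * d[j] for i in range(2) for j in range(2))
    return na * nb / (na + nb) * quad


# Hypothetical log-expression of one gene at four ages, three animals each.
ages = [2, 4, 6, 8]
wild_type = [[5.0, 5.1, 5.0, 5.2], [4.9, 5.0, 5.2, 5.1], [5.1, 5.0, 5.1, 5.3]]
mutant = [[5.0, 5.6, 6.1, 6.5], [5.2, 5.5, 6.2, 6.8], [4.9, 5.7, 6.0, 6.6]]

wt_coefs = [fit_line(ages, profile) for profile in wild_type]
mut_coefs = [fit_line(ages, profile) for profile in mutant]
print(hotelling_t2(wt_coefs, mut_coefs))
```

    A large T2 value suggests that the two conditions follow different temporal trends for this gene. Higher polynomial degrees or more than two conditions would need a general matrix inverse and a different test construction, which this sketch does not attempt.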

  18. Exploiting the full power of temporal gene expression profiling through a new statistical test: Application to the analysis of muscular dystrophy data

    PubMed Central

    Vinciotti, Veronica; Liu, Xiaohui; Turk, Rolf; de Meijer, Emile J; 't Hoen, Peter AC

    2006-01-01

    Background The identification of biologically interesting genes in a temporal expression profiling dataset is challenging and complicated by high levels of experimental noise. Most statistical methods used in the literature do not fully exploit the temporal ordering in the dataset and are not suited to the case where temporal profiles are measured for a number of different biological conditions. We present a statistical test that makes explicit use of the temporal order in the data by fitting polynomial functions to the temporal profile of each gene and for each biological condition. A Hotelling T2-statistic is derived to detect the genes for which the parameters of these polynomials are significantly different from each other. Results We validate the temporal Hotelling T2-test on muscular gene expression data from four mouse strains which were profiled at different ages: dystrophin-, beta-sarcoglycan and gamma-sarcoglycan deficient mice, and wild-type mice. The first three are animal models for different muscular dystrophies. Extensive biological validation shows that the method is capable of finding genes with temporal profiles significantly different across the four strains, as well as identifying potential biomarkers for each form of the disease. The added value of the temporal test compared to an identical test which does not make use of temporal ordering is demonstrated via a simulation study, and through confirmation of the expression profiles from selected genes by quantitative PCR experiments. The proposed method maximises the detection of the biologically interesting genes, whilst minimising false detections. Conclusion The temporal Hotelling T2-test is capable of finding relatively small and robust sets of genes that display different temporal profiles between the conditions of interest. The test is simple, it can be used on gene expression data generated from any experimental design and for any number of conditions, and it allows fast interpretation of the temporal behaviour of genes. The R code is available from V.V. The microarray data have been submitted to GEO under series GSE1574 and GSE3523. PMID:16584545

  19. Psychometric challenges and proposed solutions when scoring facial emotion expression codes.

    PubMed

    Olderbak, Sally; Hildebrandt, Andrea; Pinkpank, Thomas; Sommer, Werner; Wilhelm, Oliver

    2014-12-01

    Coding of facial emotion expressions is increasingly performed by automated emotion expression scoring software; however, there is limited discussion on how best to score the resulting codes. We present a discussion of facial emotion expression theories and a review of contemporary emotion expression coding methodology. We highlight methodological challenges pertinent to scoring software-coded facial emotion expression codes and present important psychometric research questions centered on comparing competing scoring procedures of these codes. Then, on the basis of a time series data set collected to assess individual differences in facial emotion expression ability, we derive, apply, and evaluate several statistical procedures, including four scoring methods and four data treatments, to score software-coded emotion expression data. These scoring procedures are illustrated to inform analysis decisions pertaining to the scoring and data treatment of other emotion expression questions and under different experimental circumstances. Overall, we found applying loess smoothing and controlling for baseline facial emotion expression and facial plasticity are recommended methods of data treatment. When scoring facial emotion expression ability, maximum score is preferred. Finally, we discuss the scoring methods and data treatments in the larger context of emotion expression research.

  20. DEApp: an interactive web interface for differential expression analysis of next generation sequence data.

    PubMed

    Li, Yan; Andrade, Jorge

    2017-01-01

    A growing trend in the biomedical community is the use of Next Generation Sequencing (NGS) technologies in genomics research. The complexity of downstream differential expression (DE) analysis is however still challenging, as it requires sufficient computer programming and command-line knowledge. Furthermore, researchers often need to evaluate and visualize interactively the effect of using differential statistical and error models, assess the impact of selecting different parameters and cutoffs, and finally explore the overlapping consensus of cross-validated results obtained with different methods. This represents a bottleneck that slows down or impedes the adoption of NGS technologies in many labs. We developed DEApp, an interactive and dynamic web application for differential expression analysis of count based NGS data. This application enables model selection, parameter tuning, cross validation and visualization of results in a user-friendly interface. DEApp enables labs with no access to full time bioinformaticians to exploit the advantages of NGS applications in biomedical research. This application is freely available at https://yanli.shinyapps.io/DEApp and https://gallery.shinyapps.io/DEApp.

  1. Identification of rat lung-specific microRNAs by microRNA microarray: valuable discoveries for the facilitation of lung research.

    PubMed

    Wang, Yang; Weng, Tingting; Gou, Deming; Chen, Zhongming; Chintagari, Narendranath Reddy; Liu, Lin

    2007-01-24

    An important mechanism for gene regulation utilizes small non-coding RNAs called microRNAs (miRNAs). These small RNAs play important roles in tissue development, cell differentiation and proliferation, lipid and fat metabolism, stem cells, exocytosis, diseases and cancers. To date, relatively little is known about functions of miRNAs in the lung except lung cancer. In this study, we utilized a rat miRNA microarray containing 216 miRNA probes, printed in-house, to detect the expression of miRNAs in the rat lung compared to the rat heart, brain, liver, kidney and spleen. Statistical analysis using Significance Analysis of Microarrays (SAM) and Tukey Honestly Significant Difference (HSD) revealed 2 miRNAs (miR-195 and miR-200c) expressed specifically in the lung and 9 miRNAs co-expressed in the lung and another organ. Twelve selected miRNAs were verified by Northern blot analysis. The identified lung-specific miRNAs from this work will facilitate functional studies of miRNAs during normal physiological and pathophysiological processes of the lung.

  2. [Influence of demographic and socioeconomic characteristics on the quality of life].

    PubMed

    Grbić, Gordana; Djokić, Dragoljub; Kocić, Sanja; Mitrašinović, Dejan; Rakić, Ljiljana; Prelević, Rade; Krivokapić, Žarko; Miljković, Snežana

    2011-01-01

    The quality of life is a multidimensional concept, which is best expressed by the subjective well-being. Evaluation of the quality of life is the basis for measuring the well-being, and the determination of factors that determine the quality of life is the basis for its improvement. The aim was to evaluate and assess the determinants of the perceived quality of life in groups distinguished by demographic and socioeconomic characteristics. This was a cross-sectional study of a representative sample of the population in Serbia aged over 20 years (9479 examinees). The quality of life was expressed by the perception of well-being (pleasure of life). Data on the examinees (demographic and socioeconomic characteristics) were collected by using a questionnaire for adults of each household. To process, analyze and present the data, we used the methods of parametric descriptive statistics (mean value, standard deviation, coefficient of variation), variance analysis and factor analysis. Although men evaluated the quality of life with a slightly higher grading, there was no statistically significant difference in the evaluation of the quality of life in relation to the examinee's gender (p > 0.005). Among the examinees there was a high statistically significant difference in grading the quality of life depending on age, level of education, marital status and type of job (p < 0.001). In relation to the number of children, there was no statistically significant difference in the grading of the quality of life (p > 0.005). The quality of life is influenced by numerous factors that characterize each person (demographic and socioeconomic characteristics of individual). Determining factors of the quality of life are numerous and diverse, and the manner and the strength of their influence are variable.

  3. quenched-smFISH: Counting small RNA in Pathogenic Bacteria

    NASA Astrophysics Data System (ADS)

    Shepherd, Douglas; Li, Nan; Micheva-Viteva, Sofiya; Munsky, Brian; Hong-Geller, Elizabeth; Werner, James

    2014-03-01

    Here, we present a modification to single-molecule fluorescence in situ hybridization, quenched smFISH (q-smFISH), that enables quantitative detection and analysis of small RNA (sRNA) expressed in bacteria. We show that short nucleic acid targets can be detected when the background of unbound singly dye-labeled DNA oligomers is reduced through hybridization with a set of complementary DNA oligomers labeled with a fluorescence quencher. Exploiting an automated, multi-color wide-field microscope and GPU-accelerated data analysis package, we analyzed the statistics of sRNA expression in thousands of individual Yersinia pseudotuberculosis and Yersinia pestis bacteria before and during a simulated infection. Before infection, we find only a small fraction of either bacteria express the small RNAs YSR35 or YSP8. The copy numbers of these RNA are increased during simulated infection, suggesting a role in pathogenesis. The ability to directly quantify expression level changes of sRNA in single cells as a function of external stimuli provides key information on the role of sRNA in bacterial regulatory networks.

  4. Prognostic Significance of Nuclear β-Catenin Expression in Patients with Colorectal Cancer from Iran

    PubMed Central

    Nazemalhosseini Mojarad, Ehsan; Kashfi, Seyed Mohammad Hossein; Mirtalebi, Hanieh; Almasi, Shohre; Chaleshi, Vahid; Kishani Farahani, Roya; Tarban, Peyman; Molaei, Mahsa; Zali, Mohammad Reza; J.K. Kuppen, Peter

    2015-01-01

    Background: Beta catenin plays a key role in cancer tumorigenesis. However, its prognostic significance in patients with colorectal cancer (CRC) remains controversial. It has been demonstrated that 90% of all tumors have a mutation in individual components of multiple oncogenes in Wnt/β-catenin pathway. Accumulation of nuclear β-catenin in cytoplasm leads to uncontrolled cell proliferation. Thus, nuclear β-catenin accumulation may be a valuable biomarker associated with invasion, metastasis and poor prognosis of CRC. Objectives: In this study the prognostic value of beta catenin expression in 165 Iranian CRC patients was evaluated. Patients and Methods: In this cross sectional retrospective study immunohistochemistry analyses of formalin-fixed paraffin-embedded (FFPE) tumor tissues were performed to characterize the expression of nuclear β-catenin in a series of 165 Iranian patients with colorectal carcinoma. Heat-induced antigen retrieval using the microwave method was applied for all staining procedures. Staining was scored independently by two observers, and a high level of concordance (90%) was achieved. Statistical analysis was done using the SPSS software for Windows, version 13.0.0 (SPSS Inc., Chicago, IL). Two-tailed P < 0.05 was considered statistically significant. Results: The patients consisted of 85 males and 80 females. Eighty-eight patients had primary tumor of the rectum and sigmoid, while 77 patients had primary tumor of the colon. The mean period of follow-up was 47.2 ± 10 months and the median period of follow-up was 38 months (range 6 - 58) for each patient. Of 165 tumors, 32 tumors (19.39 %) showed expression of β-catenin and 133 (80.6 %) were negative for β-catenin expression. Based on our findings the distribution of Microsatellite Instability (MSI) status differed between patients with nuclear β-catenin positive and negative tumors and this difference was significant (P = 0.001). Patients with nuclear β-catenin positive expression profile were found to be younger than patients with negative nuclear β-catenin expression (P = 0.010). Univariate and multivariate analysis showed that tumors with β-catenin expression had a poorer prognosis compared to tumors without β-catenin expression. Conclusions: According to our findings, the distribution of nuclear β-catenin expression is a poor prognostic marker in patients with colon cancer. PMID:26421170

  5. [Elevated expression of CLOCK is associated with poor prognosis in hepatocellular carcinoma].

    PubMed

    Li, Bo; Yang, Xiliang; Li, Jiaqi; Yang, Yi; Yan, Zhaoyong; Zhang, Hongxin; Mu, Jiao

    2018-02-01

    Objective To evaluate the expression of circadian locomotor output cycles kaput (CLOCK) and its effects on cell growth in hepatocellular carcinoma (HCC). Methods The expression of CLOCK in 158 pairs of human HCC tissues and matched noncancerous samples was detected by immunohistochemical (IHC) staining. The expression of CLOCK in HCC patients was also verified using the data from GEO and TCGA (a total of 356 cases). The relationship between CLOCK expression and clinicopathological features of HCC patients was analyzed by single factor statistical analysis. Kaplan-Meier survival curves of HCC patients were drawn to study the relationship between the expression level of CLOCK and the survival state. The effect of CLOCK on the growth of HepG2 cells was detected by MTS assay. Results The expression of CLOCK in HCC tissues was significantly higher than that in the adjacent tissues, and the up-regulation of CLOCK expression in HCC tissue was also confirmed in the public data of HCC (356 cases). HCC patients were divided into low CLOCK expression group and high CLOCK expression group. Univariate analysis showed that the expression of CLOCK was related to tumor size, TNM stage, and portal vein invasion in HCC patients. HCC patients with low CLOCK expression had longer overall survival time and relapse-free survival time than those with high CLOCK expression. The proliferation of cells significantly decreased after the expression of CLOCK was knocked down in HepG2 cells. Conclusion The expression of CLOCK in HCC tissues was much higher than that in normal liver tissues, and the high expression of CLOCK indicated the poor prognosis. The knockdown of CLOCK in HCC cells could inhibit the proliferation of HepG2 cells.
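
    The Kaplan-Meier survival curves mentioned above are built with the product-limit estimator, which can be sketched in a few lines. This is an illustrative sketch with made-up follow-up times, not the study's data or code; the function name and the input layout (parallel lists of times and event flags) are assumptions.

```python
def kaplan_meier(times, events):
    """Product-limit (Kaplan-Meier) estimate of the survival function.

    times  : follow-up time for each patient
    events : 1 if the event (e.g. death) was observed, 0 if censored
    Returns a list of (time, survival probability) at each event time.
    """
    data = sorted(zip(times, events))
    at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = 0
        removed = 0
        # Group all patients who share the same follow-up time.
        while i < len(data) and data[i][0] == t:
            deaths += data[i][1]
            removed += 1
            i += 1
        if deaths:
            # Multiply by the conditional probability of surviving time t.
            survival *= 1 - deaths / at_risk
            curve.append((t, survival))
        # Both deaths and censored patients leave the risk set.
        at_risk -= removed
    return curve


# Hypothetical follow-up (months) for five patients in one expression group.
curve = kaplan_meier([5, 8, 12, 12, 20], [1, 0, 1, 1, 0])
print(curve)
```

    Running the estimator separately on the low-expression and high-expression groups gives the two curves that are usually compared, typically with a log-rank test that is not shown here.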

  6. Expression and Functional Significance of HtrA1 Loss in Endometrial Cancer

    PubMed Central

    Mullany, Sally A.; Moslemi-Kebria, Mehdi; Rattan, Ramandeep; Khurana, Ashwani; Clayton, Amy; Ota, Takayo; Mariani, Andrea; Podratz, Karl C.; Chien, Jeremy; Shridhar, Viji

    2010-01-01

    Purpose The purpose of this study was to determine if loss of serine protease HtrA1 in endometrial cancer will promote the invasive potential of EC cell lines. Experimental design Western blot analysis and immunohistochemistry methods were used to determine HtrA1 expression in EC cell lines and primary tumors, respectively. Migration, invasion assays and in vivo xenograft experiment were performed to compare the extent of metastasis between HtrA1 expressing and HtrA1 knockdown clones. Results Western blot analysis of HtrA1 in 13 EC cell lines revealed complete loss of HtrA1 expression in all 7 papillary serous EC cell lines. Downregulation of HtrA1 in Hec1A and Hec1B cell lines resulted in a 3-4 fold increase in the invasive potential. Exogenous expression of HtrA1 in Ark 1 and Ark 2 cells resulted in 3-4 fold decrease in both invasive and migration potential of these cells. There was an increased rate of metastasis to the lungs associated with HtrA1 downregulation in Hec1B cells compared to control cells with endogenous HtrA1 expression. Enhanced expression of HtrA1 in Ark 2 cells resulted in significantly fewer tumor nodules metastasizing to the lungs compared to parental or protease deficient (SA mutant) Ark 2 cells. Immunohistochemical (IHC) analysis showed 57% (105/184) of primary EC tumors had low HtrA1 expression. The association of low HtrA1 expression with high-grade endometrioid tumors was statistically significant (p=0.016). Conclusions Collectively, these data indicate loss of HtrA1 may contribute to the aggressiveness and metastatic ability of endometrial tumors. PMID:21098697

  7. Prognostic Value of microRNA-9 in Various Cancers: a Meta-analysis.

    PubMed

    Zhang, Yunyuan; Zhou, Jun; Sun, Meiling; Sun, Guirong; Cao, Yongxian; Zhang, Haiping; Tian, Runhua; Zhou, Lan; Duan, Liang; Chen, Xian; Lun, Limin

    2017-07-01

    Recently, a growing body of evidence has revealed an association between microRNA-9 (miR-9) expression and outcome in multiple cancers, but inconsistent results have also been reported. A meta-analysis of all available data is therefore needed to clarify the prognostic role of miR-9. Eligible studies were selected through multiple search strategies and the quality was assessed by MOOSE. Data were extracted from studies according to the key statistics index. All analyses were performed using STATA software. Twenty studies were included in the meta-analysis to evaluate the prognostic role of miR-9 in multiple tumors. MiR-9 expression level was an independent prognostic biomarker for OS in tumor patients using multivariate and univariate analyses. High expression levels of miR-9 were demonstrated to be associated with poor overall survival (OS) (HR = 2.23, 95 % CI: 1.56-3.17, P < 0.05) and recurrence-free survival/progression-free survival (RFS/PFS) (HR = 2.08, 95 % CI: 1.33-3.27, P < 0.05). Subgroup analysis showed that residence region (China and Japan), sample size, cancer type (solid or leukemia), follow-up months and analysis method (qPCR) did not alter the predictive value of miR-9 on OS in various cancers. Furthermore, no significant associations were detected for miR-9 expression and lymph node metastasis or distant metastasis. The present results suggest that elevated miR-9 expression is associated with poor OS in patients with general cancers.
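
    Pooled hazard ratios such as those reported above are commonly obtained by inverse-variance weighting of the per-study log hazard ratios. The sketch below shows a fixed-effect version of that calculation under stated assumptions: the abstract does not say whether a fixed-effect or random-effects model was used, the function name is hypothetical, and the three example studies are invented for illustration.

```python
import math

Z_95 = 1.959964  # normal quantile for a two-sided 95% confidence interval


def pooled_hazard_ratio(studies):
    """Fixed-effect inverse-variance pooling of hazard ratios.

    studies : list of (hr, ci_lower, ci_upper) with 95% confidence limits
    Returns (pooled_hr, pooled_ci_lower, pooled_ci_upper).
    """
    weighted_sum = 0.0
    weight_total = 0.0
    for hr, lower, upper in studies:
        # Recover the standard error of log(HR) from the reported 95% CI.
        se = (math.log(upper) - math.log(lower)) / (2 * Z_95)
        weight = 1.0 / se ** 2
        weighted_sum += weight * math.log(hr)
        weight_total += weight
    log_pooled = weighted_sum / weight_total
    se_pooled = math.sqrt(1.0 / weight_total)
    return (math.exp(log_pooled),
            math.exp(log_pooled - Z_95 * se_pooled),
            math.exp(log_pooled + Z_95 * se_pooled))


# Invented example studies: (HR, lower 95% limit, upper 95% limit).
example = [(1.8, 1.1, 2.9), (2.6, 1.4, 4.8), (2.1, 1.2, 3.7)]
print(pooled_hazard_ratio(example))
```

    A random-effects model would add a between-study variance term to each weight; that step is omitted here.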

  8. Analysis of myosin heavy chain mRNA expression by RT-PCR

    NASA Technical Reports Server (NTRS)

    Wright, C.; Haddad, F.; Qin, A. X.; Baldwin, K. M.

    1997-01-01

    An assay was developed for rapid and sensitive analysis of myosin heavy chain (MHC) mRNA expression in rodent skeletal muscle. Only 2 microg of total RNA were necessary for the simultaneous analysis of relative mRNA expression of six different MHC genes. We designed synthetic DNA fragments as internal standards, which contained the relevant primer sequences for the adult MHC mRNAs type I, IIa, IIx, IIb as well as the embryonic and neonatal MHC mRNAs. A known amount of the synthetic fragment was added to each polymerase chain reaction (PCR) and yielded a product of different size than the amplified MHC mRNA fragment. The ratio of amplified MHC fragment to synthetic fragment allowed us to calculate percentages of the gene expression of the different MHC genes in a given muscle sample. Comparison with the traditional Northern blot analysis demonstrated that our reverse transcriptase-PCR-based assay was reliable, fast, and quantitative over a wide range of relative MHC mRNA expression in a spectrum of adult and neonatal rat skeletal muscles. Furthermore, the high sensitivity of the assay made it very useful when only small quantities of tissue were available. Statistical analysis of the signals for each MHC isoform across the analyzed samples showed a highly significant correlation between the PCR and the Northern signals as Pearson correlation coefficients ranged between 0.77 and 0.96 (P < 0.005). This assay has potential use in analyzing small muscle samples such as biopsies and samples from pre- and/or neonatal stages of development.
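
    The ratio-based quantification described above can be sketched as follows. This is a minimal illustration, assuming that the same known amount of synthetic standard is added to every reaction so that the MHC-to-standard ratios are directly comparable; the band intensities and the function name `mhc_percentages` are hypothetical and do not come from the paper.

```python
def mhc_percentages(mhc_signal, standard_signal):
    """Convert band intensities into relative MHC isoform percentages.

    Both arguments map isoform name -> band intensity: one for the amplified
    mRNA fragment, one for the co-amplified synthetic standard.
    """
    # Ratio of amplified MHC fragment to the synthetic internal standard.
    ratios = {iso: mhc_signal[iso] / standard_signal[iso] for iso in mhc_signal}
    total = sum(ratios.values())
    # Express each isoform as a share of the summed ratios.
    return {iso: 100.0 * r / total for iso, r in ratios.items()}


# Hypothetical intensities in arbitrary densitometry units.
mhc = {"I": 120.0, "IIa": 300.0, "IIx": 450.0, "IIb": 630.0}
std = {"I": 100.0, "IIa": 100.0, "IIx": 150.0, "IIb": 90.0}
percentages = mhc_percentages(mhc, std)
for isoform, pct in percentages.items():
    print(f"MHC {isoform}: {pct:.1f} %")
```

    Dividing by the internal standard before normalizing is what makes the result insensitive to tube-to-tube differences in amplification efficiency, at least to the extent that the standard and the target amplify equally well.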

  9. Customized Molecular Phenotyping by Quantitative Gene Expression and Pattern Recognition Analysis

    PubMed Central

    Akilesh, Shreeram; Shaffer, Daniel J.; Roopenian, Derry

    2003-01-01

    Description of the molecular phenotypes of pathobiological processes in vivo is a pressing need in genomic biology. We have implemented a high-throughput real-time PCR strategy to establish quantitative expression profiles of a customized set of target genes. It enables rapid, reproducible data acquisition from limited quantities of RNA, permitting serial sampling of mouse blood during disease progression. We developed an easy to use statistical algorithm—Global Pattern Recognition—to readily identify genes whose expression has changed significantly from healthy baseline profiles. This approach provides unique molecular signatures for rheumatoid arthritis, systemic lupus erythematosus, and graft versus host disease, and can also be applied to defining the molecular phenotype of a variety of other normal and pathological processes. PMID:12840047

  10. Performing statistical analyses on quantitative data in Taverna workflows: an example using R and maxdBrowse to identify differentially-expressed genes from microarray data.

    PubMed

    Li, Peter; Castrillo, Juan I; Velarde, Giles; Wassink, Ingo; Soiland-Reyes, Stian; Owen, Stuart; Withers, David; Oinn, Tom; Pocock, Matthew R; Goble, Carole A; Oliver, Stephen G; Kell, Douglas B

    2008-08-07

    There has been a dramatic increase in the amount of quantitative data derived from the measurement of changes at different levels of biological complexity during the post-genomic era. However, there are a number of issues associated with the use of computational tools employed for the analysis of such data. For example, computational tools such as R and MATLAB require prior knowledge of their programming languages in order to implement statistical analyses on data. Combining two or more tools in an analysis may also be problematic since data may have to be manually copied and pasted between separate user interfaces for each tool. Furthermore, this transfer of data may require a reconciliation step in order for there to be interoperability between computational tools. Developments in the Taverna workflow system have enabled pipelines to be constructed and enacted for generic and ad hoc analyses of quantitative data. Here, we present an example of such a workflow involving the statistical identification of differentially-expressed genes from microarray data followed by the annotation of their relationships to cellular processes. This workflow makes use of customised maxdBrowse web services, a system that allows Taverna to query and retrieve gene expression data from the maxdLoad2 microarray database. These data are then analysed by R to identify differentially-expressed genes using the Taverna RShell processor which has been developed for invoking this tool when it has been deployed as a service using the RServe library. In addition, the workflow uses Beanshell scripts to reconcile mismatches of data between services as well as to implement a form of user interaction for selecting subsets of microarray data for analysis as part of the workflow execution. A new plugin system in the Taverna software architecture is demonstrated by the use of renderers for displaying PDF files and CSV formatted data within the Taverna workbench. 
Taverna can be used by data analysis experts as a generic tool for composing ad hoc analyses of quantitative data by combining the use of scripts written in the R programming language with tools exposed as services in workflows. When these workflows are shared with colleagues and the wider scientific community, they provide an approach for other scientists wanting to use tools such as R without having to learn the corresponding programming language to analyse their own data.

  11. Performing statistical analyses on quantitative data in Taverna workflows: An example using R and maxdBrowse to identify differentially-expressed genes from microarray data

    PubMed Central

    Li, Peter; Castrillo, Juan I; Velarde, Giles; Wassink, Ingo; Soiland-Reyes, Stian; Owen, Stuart; Withers, David; Oinn, Tom; Pocock, Matthew R; Goble, Carole A; Oliver, Stephen G; Kell, Douglas B

    2008-01-01

    Background There has been a dramatic increase in the amount of quantitative data derived from the measurement of changes at different levels of biological complexity during the post-genomic era. However, there are a number of issues associated with the use of computational tools employed for the analysis of such data. For example, computational tools such as R and MATLAB require prior knowledge of their programming languages in order to implement statistical analyses on data. Combining two or more tools in an analysis may also be problematic since data may have to be manually copied and pasted between separate user interfaces for each tool. Furthermore, this transfer of data may require a reconciliation step in order for there to be interoperability between computational tools. Results Developments in the Taverna workflow system have enabled pipelines to be constructed and enacted for generic and ad hoc analyses of quantitative data. Here, we present an example of such a workflow involving the statistical identification of differentially-expressed genes from microarray data followed by the annotation of their relationships to cellular processes. This workflow makes use of customised maxdBrowse web services, a system that allows Taverna to query and retrieve gene expression data from the maxdLoad2 microarray database. These data are then analysed by R to identify differentially-expressed genes using the Taverna RShell processor which has been developed for invoking this tool when it has been deployed as a service using the RServe library. In addition, the workflow uses Beanshell scripts to reconcile mismatches of data between services as well as to implement a form of user interaction for selecting subsets of microarray data for analysis as part of the workflow execution. A new plugin system in the Taverna software architecture is demonstrated by the use of renderers for displaying PDF files and CSV formatted data within the Taverna workbench. 
Conclusion Taverna can be used by data analysis experts as a generic tool for composing ad hoc analyses of quantitative data by combining the use of scripts written in the R programming language with tools exposed as services in workflows. When these workflows are shared with colleagues and the wider scientific community, they provide an approach for other scientists wanting to use tools such as R without having to learn the corresponding programming language to analyse their own data. PMID:18687127

  12. Analysis of the fluctuations of the tumour/host interface

    NASA Astrophysics Data System (ADS)

    Milotti, Edoardo; Vyshemirsky, Vladislav; Stella, Sabrina; Dogo, Federico; Chignola, Roberto

    2017-11-01

    In a recent analysis of metabolic scaling in solid tumours we found a scaling law that interpolates between the power laws μ ∝ V and μ ∝ V^{2/3}, where μ is the metabolic rate expressed as the glucose absorption rate and V is the tumour volume. The scaling law fits both in vitro and in vivo data quite well; however, we also observed marked fluctuations that are associated with the specific biological properties of individual tumours. Here we analyse these fluctuations, in an attempt to find the population-wide distribution of an important parameter (A) which expresses the total extent of the interface between the solid tumour and the non-cancerous environment. Heuristic considerations suggest that the values of the A parameter follow a lognormal distribution, and, allowing for the large uncertainties of the experimental data, our statistical analysis confirms this.

  13. Clinical validation of nuclear factor kappa B expression in invasive breast cancer.

    PubMed

    Agrawal, Anil Kumar; Pielka, Ewa; Lipinski, Artur; Jelen, Michal; Kielan, Wojciech; Agrawal, Siddarth

    2018-01-01

    Breast cancer is the most commonly diagnosed cancer in Polish women. The expression of transcription nuclear factor kappa B, a key inducer of inflammatory response promoting carcinogenesis and cancer progression in breast cancer, is not well-established. We assessed the nuclear factor kappa B expression in a total of 119 invasive breast carcinomas and 25 healthy control samples and correlated this expression pattern with several clinical and pathologic parameters including histologic type and grade, tumor size, lymph node status, estrogen receptor status, and progesterone receptor status. The data used for the analysis were derived from medical records. An immunohistochemical analysis of nuclear factor kappa B, estrogen receptor, and progesterone receptor was carried out and evaluation of stainings was performed. The expression of nuclear factor kappa B was significantly higher than that in the corresponding healthy control samples. No statistical difference was demonstrated in nuclear factor kappa B expression in relation to age, menopausal status, lymph node status, tumor size and location, grade and histologic type of tumor, and hormonal status (estrogen receptor and progesterone receptor). Nuclear factor kappa B is significantly overexpressed in invasive breast cancer tissues. Although nuclear factor kappa B status does not correlate with clinicopathological findings, it might provide important additional information on prognosis and become a promising object for targeted therapy.

  14. EMMPRIN/CD147 is an independent prognostic biomarker in cutaneous melanoma.

    PubMed

    Caudron, Anne; Battistella, Maxime; Feugeas, Jean-Paul; Pages, Cécile; Basset-Seguin, Nicole; Mazouz Dorval, Sarra; Funck Brentano, Elisa; Sadoux, Aurélie; Podgorniak, Marie-Pierre; Menashi, Suzanne; Janin, Anne; Lebbé, Céleste; Mourah, Samia

    2016-08-01

    CD147 has been implicated in melanoma invasion and metastasis mainly through increasing metalloproteinase synthesis and regulating VEGF/VEGFR signalling. In this study, the prognostic value of CD147 expression was investigated in a cohort of 196 cutaneous melanomas including 136 consecutive primary malignant melanomas, 30 lymph nodes, 16 in-transit and 14 visceral metastases. A series of 10 normal skin, 10 blue nevi and 10 dermal nevi was used as control. CD147 expression was assessed by immunohistochemistry, and the association of its expression with the clinicopathological characteristics of patients and survival was evaluated using univariate and multivariate statistical analyses. Univariate analysis showed that high CD147 expression was significantly associated with metastatic potential and with a reduced overall survival (P < 0.05 for both) in primary melanoma patients. CD147 expression level was correlated with histological factors which were associated with prognosis: Clark level, ulceration status and more particularly with Breslow index (r = 0.7, P < 10^-8). Multivariate analysis retained CD147 expression level and ulceration status as predicting factors for metastasis and overall survival (P < 0.05 for both). CD147 emerges as an important factor in the aggressive behaviour of melanoma and deserves further evaluation as an independent prognostic biomarker. © 2016 John Wiley & Sons A/S. Published by John Wiley & Sons Ltd.

  15. Study on the relationship between the methylation of the MMP-9 gene promoter region and diabetic nephropathy.

    PubMed

    Yang, Xiao-Hui; Feng, Shi-Ya; Yu, Yang; Liang, Zhou

    2018-01-01

    This study aims to explore the relationship between methylation of the matrix metalloproteinase (MMP)-9 gene promoter region and diabetic nephropathy (DN) by measuring the methylation level of the MMP-9 gene promoter region in peripheral blood, together with the serum MMP-9 concentration, in patients at different stages of DN. The methylation level of the MMP-9 gene promoter region was detected by methylation-specific polymerase chain reaction (MSP), and the content of MMP-9 in serum was determined by enzyme-linked immunosorbent assay (ELISA). Statistical analysis revealed that, compared with the control group, serum MMP-9 protein expression levels gradually increased across the simple diabetic group, early diabetic nephropathy group and clinical diabetic nephropathy group, and the difference was statistically significant (P < 0.05). Compared with the control group, the methylation levels of the MMP-9 gene promoter region gradually decreased across the simple diabetic group, early diabetic nephropathy group, and clinical diabetic nephropathy group, and the difference was statistically significant (P < 0.05). Furthermore, correlation analysis indicated that the demethylation level of the MMP-9 gene promoter region was positively correlated with serum protein levels, urinary albumin to creatinine ratio (UACR), urea and creatinine, and was negatively correlated with GFR. Demethylation of the MMP-9 gene promoter region may be involved in the occurrence and development of diabetic nephropathy by regulating the expression of MMP-9 protein in serum.

  16. Polyester: simulating RNA-seq datasets with differential transcript expression.

    PubMed

    Frazee, Alyssa C; Jaffe, Andrew E; Langmead, Ben; Leek, Jeffrey T

    2015-09-01

    Statistical methods development for differential expression analysis of RNA sequencing (RNA-seq) requires software tools to assess accuracy and error rate control. Since true differential expression status is often unknown in experimental datasets, artificially constructed datasets must be utilized, either by generating costly spike-in experiments or by simulating RNA-seq data. Polyester is an R package designed to simulate RNA-seq data, beginning with an experimental design and ending with collections of RNA-seq reads. Its main advantage is the ability to simulate reads indicating isoform-level differential expression across biological replicates for a variety of experimental designs. Data generated by Polyester are a reasonable approximation to real RNA-seq data, and standard differential expression workflows can recover the differential expression set in the simulation by the user. Polyester is freely available from Bioconductor (http://bioconductor.org/). Contact: jtleek@gmail.com. Supplementary data are available at Bioinformatics online. © The Author 2015. Published by Oxford University Press. All rights reserved. For Permissions, please e-mail: journals.permissions@oup.com.

  17. Improving information retrieval in functional analysis.

    PubMed

    Rodriguez, Juan C; González, Germán A; Fresno, Cristóbal; Llera, Andrea S; Fernández, Elmer A

    2016-12-01

    Transcriptome analysis is essential to understand the mechanisms regulating key biological processes and functions. The first step usually consists of identifying candidate genes; to find out which pathways are affected by those genes, however, functional analysis (FA) is mandatory. The most frequently used strategies for this purpose are Gene Set and Singular Enrichment Analysis (GSEA and SEA) over Gene Ontology. Several statistical methods have been developed and compared in terms of computational efficiency and/or statistical appropriateness. However, whether their results are similar or complementary, the sensitivity to parameter settings, or possible bias in the analyzed terms has not been addressed so far. Here, two GSEA and four SEA methods and their parameter combinations were evaluated in six datasets by comparing two breast cancer subtypes with well-known differences in genetic background and patient outcomes. We show that GSEA and SEA lead to different results depending on the chosen statistic, model and/or parameters. Both approaches provide complementary results from a biological perspective. Hence, an Integrative Functional Analysis (IFA) tool is proposed to improve information retrieval in FA. It provides a common gene expression analytic framework that grants a comprehensive and coherent analysis. Only a minimal user parameter setting is required, since the best SEA/GSEA alternatives are integrated. IFA utility was demonstrated by evaluating four prostate cancer and the TCGA breast cancer microarray datasets, which showed its biological generalization capabilities. Copyright © 2016 Elsevier Ltd. All rights reserved.

  18. A statistical physics perspective on alignment-independent protein sequence comparison.

    PubMed

    Chattopadhyay, Amit K; Nasiev, Diar; Flower, Darren R

    2015-08-01

    Within bioinformatics, the textual alignment of amino acid sequences has long dominated the determination of similarity between proteins, with all that implies for shared structure, function and evolutionary descent. Despite the relative success of modern-day sequence alignment algorithms, so-called alignment-free approaches offer a complementary means of determining and expressing similarity, with potential benefits in certain key applications, such as regression analysis of protein structure-function studies, where alignment-based similarity has performed poorly. Here, we offer a fresh, statistical physics-based perspective focusing on the question of alignment-free comparison, in the process adapting results from 'first passage probability distribution' to summarize statistics of ensemble averaged amino acid propensity values. In this article, we introduce and elaborate this approach. © The Author 2015. Published by Oxford University Press.

  19. Scoring clustering solutions by their biological relevance.

    PubMed

    Gat-Viks, I; Sharan, R; Shamir, R

    2003-12-12

    A central step in the analysis of gene expression data is the identification of groups of genes that exhibit similar expression patterns. Clustering gene expression data into homogeneous groups was shown to be instrumental in functional annotation, tissue classification, regulatory motif identification, and other applications. Although there is a rich literature on clustering algorithms for gene expression analysis, very few works addressed the systematic comparison and evaluation of clustering results. Typically, different clustering algorithms yield different clustering solutions on the same data, and there is no agreed upon guideline for choosing among them. We developed a novel statistically based method for assessing a clustering solution according to prior biological knowledge. Our method can be used to compare different clustering solutions or to optimize the parameters of a clustering algorithm. The method is based on projecting vectors of biological attributes of the clustered elements onto the real line, such that the ratio of between-groups and within-group variance estimators is maximized. The projected data are then scored using a non-parametric analysis of variance test, and the score's confidence is evaluated. We validate our approach using simulated data and show that our scoring method outperforms several extant methods, including the separation to homogeneity ratio and the silhouette measure. We apply our method to evaluate results of several clustering methods on yeast cell-cycle gene expression data. The software is available from the authors upon request.

  20. Time-Course Gene Set Analysis for Longitudinal Gene Expression Data

    PubMed Central

    Hejblum, Boris P.; Skinner, Jason; Thiébaut, Rodolphe

    2015-01-01

    Gene set analysis methods, which consider predefined groups of genes in the analysis of genomic data, have been successfully applied to gene expression data in cross-sectional studies. The time-course gene set analysis (TcGSA) introduced here is an extension of gene set analysis to longitudinal data. The proposed method relies on random effects modeling with maximum likelihood estimates. It allows the use of all available repeated measurements while dealing with unbalanced data due to measurements missing at random (MAR). TcGSA is a hypothesis-driven method that identifies a priori defined gene sets with significant expression variations over time, taking into account the potential heterogeneity of expression within gene sets. When biological conditions are compared, the method indicates whether the time patterns of gene sets differ significantly according to these conditions. The value of the method is illustrated by its application to two real-life datasets: an HIV therapeutic vaccine trial (DALIA-1 trial), and data from a recent study on influenza and pneumococcal vaccines. In the DALIA-1 trial, TcGSA revealed a significant change in gene expression over time within 69 gene sets during vaccination, while a standard univariate individual gene analysis corrected for multiple testing and a standard Gene Set Enrichment Analysis (GSEA) for time series both failed to detect any significant pattern change over time. When applied to the second illustrative data set, TcGSA identified 4 gene sets that were found to be linked with the influenza vaccine as well, although previous analyses had found them to be associated with the pneumococcal vaccine only. In our simulation study, TcGSA exhibits good statistical properties and increased power compared to other approaches for analyzing time-course expression patterns of gene sets. The method is made available to the community through an R package. PMID:26111374

  1. TROP2 correlates with microvessel density and poor prognosis in hilar cholangiocarcinoma.

    PubMed

    Ning, Shanglei; Guo, Sen; Xie, Jianjun; Xu, Yunfei; Lu, Xiaofei; Chen, Yuxin

    2013-02-01

    Trophoblast cell surface antigen 2 (TROP2) was found to be associated with tumor progression and poor prognosis in a variety of epithelial carcinomas. The aim of the study was to investigate TROP2 expression and its prognostic impact in hilar cholangiocarcinoma. Immunohistochemistry and quantitative real-time PCR were used to determine TROP2 expression in surgical specimens from 70 hilar cholangiocarcinoma patients receiving radical resection. The relationship between TROP2 expression and microvessel density was investigated, and standard statistical analysis was used to evaluate the prognostic significance of TROP2 in hilar cholangiocarcinoma. High TROP2 expression by immunohistochemistry was found in 43 (61.4 %) of the 70 tumor specimens. Quantitative real-time PCR confirmed that the TROP2 level in tumor tissue was significantly higher than in non-tumoral biliary tissues (P = 0.001). Significant correlations were found between TROP2 expression and histological differentiation (P = 0.016) and tumor T stage (P = 0.031) in hilar cholangiocarcinoma. TROP2 expression correlated with microvessel density in hilar cholangiocarcinoma (P = 0.026). Patients with high TROP2 expression had a significantly poorer overall survival rate than those with low TROP2 expression (30 vs. 68.5 %, P = 0.001), and multivariate Cox regression analysis indicated TROP2 as an independent prognostic factor for hilar cholangiocarcinoma (P = 0.004). TROP2 expression correlates significantly with microvessel density and is an independent prognostic factor in human hilar cholangiocarcinoma.

  2. Venous Shunt Versus Venous Ligation for Vascular Damage Control: The Immunohistochemical Evidence.

    PubMed

    Góes Junior, Adenauer Marinho de Oliveira; Abib, Simone de Campos Vieira; Alves, Maria Teresa de Seixas; Ferreira, Paulo Sérgio Venerando da Silva; Andrade, Mariseth Carvalho de

    2017-05-01

    To evaluate the expression of immunohistochemical markers of tissue ischemia (iNOS, eNOS, and HSP70) in an experimental model of vascular damage control, to determine whether insertion of a temporary venous shunt leads to better limb perfusion than ligature of the injured vein. Experimental study in male Sus scrofa weighing 40 kg. Animals were distributed into 5 groups: group 1 animals were submitted to right external iliac artery (EIA) shunting and right external iliac vein (EIV) ligation; group 2 animals were submitted to right EIA shunting and right EIV shunting; group 3 animals were submitted to right EIV ligation; group 4 animals were submitted to right EIV shunting; group 5 animals were not submitted to vascular shunting or venous ligation. Transonic Systems flowmeters were used to measure vascular flow in the right and left external iliac vessels, and an i-STAT (Abbott) portable blood analyzer was used for biochemical analysis of EIV blood. An initial baseline record of invasive arterial pressure, iliac vessel flow, and venous blood analysis was taken. Arterial pressure and iliac vessel flow were measured immediately after right iliac vessel shunting or ligation. Then, hemorrhagic shock was induced by continuous blood withdrawal at 20 mL/min from the right external jugular vein, while arterial blood pressure and iliac vessel flow were recorded every 10 min and blood samples from the EIVs were obtained every 30 min, until vascular flow through the right EIA (or through the shunt inserted into the right EIV for group 4 animals) ceased or until the animal's death. After the end of the experiments, bilateral hind limb biopsies were obtained for immunohistochemical analysis. Using image editing and analysis software, the expression of iNOS, eNOS, and HSP70 (3 well-known ischemia-associated immunohistochemical markers) was assessed. The mean expression of each marker in the right hind limb was compared between groups.
For statistical analysis, Microsoft Office Excel 2007 and BioEstat 5.0 (2007) were used. Immunohistochemical analysis showed no difference in iNOS expression; nevertheless, both eNOS and HSP70 expression were significantly more intense (P < 0.05) in group 1 (eNOS = 1.32; HSP70 = 15.05) than in group 2 (eNOS = 0.018; HSP70 = 8.56). The higher expression of eNOS and HSP70 in the right hind limbs of group 1 animals (arterial shunt and venous ligature) than in group 2 animals (arterial shunt and venous shunt) suggests that venous ligation is associated with more intense ischemic histological findings than venous shunting. Copyright © 2017 Elsevier Inc. All rights reserved.

  3. Extended local similarity analysis (eLSA) of microbial community and other time series data with replicates.

    PubMed

    Xia, Li C; Steele, Joshua A; Cram, Jacob A; Cardon, Zoe G; Simmons, Sheri L; Vallino, Joseph J; Fuhrman, Jed A; Sun, Fengzhu

    2011-01-01

    The increasing availability of time series microbial community data from metagenomics and other molecular biological studies has enabled the analysis of large-scale microbial co-occurrence and association networks. Among the many analytical techniques available, the Local Similarity Analysis (LSA) method is unique in that it captures local and potentially time-delayed co-occurrence and association patterns in time series data that cannot otherwise be identified by ordinary correlation analysis. However LSA, as originally developed, does not consider time series data with replicates, which hinders the full exploitation of available information. With replicates, it is possible to understand the variability of local similarity (LS) score and to obtain its confidence interval. We extended our LSA technique to time series data with replicates and termed it extended LSA, or eLSA. Simulations showed the capability of eLSA to capture subinterval and time-delayed associations. We implemented the eLSA technique into an easy-to-use analytic software package. The software pipeline integrates data normalization, statistical correlation calculation, statistical significance evaluation, and association network construction steps. We applied the eLSA technique to microbial community and gene expression datasets, where unique time-dependent associations were identified. The extended LSA analysis technique was demonstrated to reveal statistically significant local and potentially time-delayed association patterns in replicated time series data beyond that of ordinary correlation analysis. These statistically significant associations can provide insights to the real dynamics of biological systems. The newly designed eLSA software efficiently streamlines the analysis and is freely available from the eLSA homepage, which can be accessed at http://meta.usc.edu/softs/lsa.
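
    To make the idea of a local, possibly time-delayed association concrete, the following is a simplified sketch of a local-similarity-style score for two series without replicates. It is not the eLSA implementation: the z-score normalization, the function names, and the omission of replicate handling and significance testing are all simplifying assumptions made for illustration.

```python
import statistics


def zscore(series):
    """Standardize a series to mean 0 and population standard deviation 1."""
    mean = statistics.fmean(series)
    sd = statistics.pstdev(series)
    return [(v - mean) / sd for v in series]


def local_similarity(x, y, max_delay):
    """Return a signed local similarity score for two equal-length series."""
    n = len(x)
    xs, ys = zscore(x), zscore(y)
    # pos[i][j] / neg[i][j]: best positively / negatively associated run
    # that ends at x[i-1] and y[j-1].
    pos = [[0.0] * (n + 1) for _ in range(n + 1)]
    neg = [[0.0] * (n + 1) for _ in range(n + 1)]
    best_pos = best_neg = 0.0
    for i in range(1, n + 1):
        for j in range(1, n + 1):
            if abs(i - j) > max_delay:
                continue  # only allow alignments within the delay limit
            prod = xs[i - 1] * ys[j - 1]
            pos[i][j] = max(0.0, pos[i - 1][j - 1] + prod)
            neg[i][j] = max(0.0, neg[i - 1][j - 1] - prod)
            best_pos = max(best_pos, pos[i][j])
            best_neg = max(best_neg, neg[i][j])
    sign = 1.0 if best_pos >= best_neg else -1.0
    return sign * max(best_pos, best_neg) / n


# Hypothetical abundance series: y repeats x's spike one time point later.
x = [0, 0, 5, 0, 0, 0, 0, 0]
y = [0, 0, 0, 5, 0, 0, 0, 0]
print(local_similarity(x, y, max_delay=0))  # no delay allowed
print(local_similarity(x, y, max_delay=1))  # one-step delay allowed
```

    In this toy example the score is negative when no delay is allowed and close to 1 once a one-step delay is permitted, which is the kind of pattern that an ordinary same-time correlation would miss.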

  4. Extended local similarity analysis (eLSA) of microbial community and other time series data with replicates

    PubMed Central

    2011-01-01

    Background The increasing availability of time series microbial community data from metagenomics and other molecular biological studies has enabled the analysis of large-scale microbial co-occurrence and association networks. Among the many analytical techniques available, the Local Similarity Analysis (LSA) method is unique in that it captures local and potentially time-delayed co-occurrence and association patterns in time series data that cannot otherwise be identified by ordinary correlation analysis. However LSA, as originally developed, does not consider time series data with replicates, which hinders the full exploitation of available information. With replicates, it is possible to understand the variability of local similarity (LS) score and to obtain its confidence interval. Results We extended our LSA technique to time series data with replicates and termed it extended LSA, or eLSA. Simulations showed the capability of eLSA to capture subinterval and time-delayed associations. We implemented the eLSA technique into an easy-to-use analytic software package. The software pipeline integrates data normalization, statistical correlation calculation, statistical significance evaluation, and association network construction steps. We applied the eLSA technique to microbial community and gene expression datasets, where unique time-dependent associations were identified. Conclusions The extended LSA analysis technique was demonstrated to reveal statistically significant local and potentially time-delayed association patterns in replicated time series data beyond that of ordinary correlation analysis. These statistically significant associations can provide insights to the real dynamics of biological systems. The newly designed eLSA software efficiently streamlines the analysis and is freely available from the eLSA homepage, which can be accessed at http://meta.usc.edu/softs/lsa. PMID:22784572

  5. Dissociation between recognition and detection advantage for facial expressions: a meta-analysis.

    PubMed

    Nummenmaa, Lauri; Calvo, Manuel G

    2015-04-01

    Happy facial expressions are recognized faster and more accurately than other expressions in categorization tasks, whereas detection in visual search tasks is widely believed to be faster for angry than happy faces. We used meta-analytic techniques for resolving this categorization versus detection advantage discrepancy for positive versus negative facial expressions. Effect sizes were computed on the basis of the r statistic for a total of 34 recognition studies with 3,561 participants and 37 visual search studies with 2,455 participants, yielding a total of 41 effect sizes for recognition accuracy, 25 for recognition speed, and 125 for visual search speed. Random effects meta-analysis was conducted to estimate effect sizes at population level. For recognition tasks, an advantage in recognition accuracy and speed for happy expressions was found for all stimulus types. In contrast, for visual search tasks, moderator analysis revealed that a happy face detection advantage was restricted to photographic faces, whereas a clear angry face advantage was found for schematic and "smiley" faces. Robust detection advantage for nonhappy faces was observed even when stimulus emotionality was distorted by inversion or rearrangement of the facial features, suggesting that visual features primarily drive the search. We conclude that the recognition advantage for happy faces is a genuine phenomenon related to processing of facial expression category and affective valence. In contrast, detection advantages toward either happy (photographic stimuli) or nonhappy (schematic) faces are contingent on visual stimulus features rather than facial expression, and may not involve categorical or affective processing. (c) 2015 APA, all rights reserved.

  6. Evaluation of predictive capacities of biomarkers based on research synthesis.

    PubMed

    Hattori, Satoshi; Zhou, Xiao-Hua

    2016-11-10

    The objective of diagnostic studies or prognostic studies is to evaluate and compare predictive capacities of biomarkers. Suppose we are interested in evaluation and comparison of predictive capacities of continuous biomarkers for a binary outcome based on research synthesis. In analysis of each study, subjects are often classified into high-expression and low-expression groups according to a cut-off value, and statistical analysis is based on a 2 × 2 table defined by the response and the high expression or low expression of the biomarker. Because the cut-off is study specific, it is difficult to interpret a combined summary measure such as an odds ratio based on the standard meta-analysis techniques. The summary receiver operating characteristic curve is a useful method for meta-analysis of diagnostic studies in the presence of heterogeneity of cut-off values to examine discriminative capacities of biomarkers. We develop a method to estimate positive or negative predictive curves, which are alternatives to the receiver operating characteristic curve, based on information reported in published papers of each study. These predictive curves provide a useful graphical presentation of pairs of positive and negative predictive values and allow us to compare predictive capacities of biomarkers of different scales in the presence of heterogeneity in cut-off values among studies. Copyright © 2016 John Wiley & Sons, Ltd.
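
    For a single study, the positive and negative predictive values mentioned above follow directly from the 2 × 2 table. The sketch below shows only that per-study step, not the authors' method for estimating predictive curves across studies; the function name and the example counts are hypothetical.

```python
def predictive_values(tp, fp, fn, tn):
    """Return (PPV, NPV) from the four cells of a 2 x 2 table.

    tp: high expression, outcome present    fp: high expression, outcome absent
    fn: low expression, outcome present     tn: low expression, outcome absent
    """
    if tp + fp == 0 or tn + fn == 0:
        raise ValueError("each expression group needs at least one subject")
    ppv = tp / (tp + fp)  # P(outcome present | high expression)
    npv = tn / (tn + fn)  # P(outcome absent | low expression)
    return ppv, npv


# Hypothetical counts for illustration only.
ppv, npv = predictive_values(tp=40, fp=10, fn=20, tn=30)
```

    Because both values depend on the study-specific cut-off that defines the high-expression group, pairs from different studies are not directly comparable, which is the heterogeneity the abstract refers to.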

  7. Helicobacter pylori and gastric mucin expression: A systematic review and meta-analysis.

    PubMed

    Niv, Yaron

    2015-08-21

    To investigate the relationship between Helicobacter pylori (H. pylori) and mucin expression in gastric mucosa. English Medical literature searches were conducted for gastric mucin expression in H. pylori infected people vs uninfected people. Searches were performed up to December 31, 2014, using MEDLINE, PubMed, EMBASE, Scopus, and CENTRAL. Studies comparing mucin expression in the gastric mucosa in patients positive and negative for H. pylori infection were included. Meta-analysis was performed by using Comprehensive meta-analysis software (Version 3, Biostat Inc., Englewood, NJ, United States). Pooled odds ratios (ORs) and 95% confidence intervals (CIs) were calculated to compare mucin expression in individual studies by using the random effects model. Heterogeneity between studies was evaluated using the Cochran Q-test, and it was considered to be present if the Q-test P value was less than 0.10. The I² statistic was used to measure the proportion of inconsistency in individual studies, with I² > 50% representing substantial heterogeneity. We also calculated a potential publication bias. Eleven studies, which represent 53 sub-studies of 15 different kinds of mucin expression, were selected according to the inclusion criteria. Every kind of mucin has been considered as one study. When a specific mucin has been studied in more than one paper, we combined the results in a nested meta-analysis of this particular mucin: MUC2, MUC6, STn, Paradoxical con A, Tn, T, Type 1 chain mucin, LeA, SLeA, LeB, AB-PAS, MUC1, and MUC5AC. The odds ratio of mucin expression in random analysis was 2.33, 95%CI: 1.230-4.411, P = 0.009, higher expression in H. pylori infected patients. Odds ratio for mucin expression in H. pylori positive patients was higher for MUC6 (9.244, 95%CI: 1.567-54.515, P = 0.014), and significantly lower for MUC5AC (0.447, 95%CI: 0.211-0.949, P = 0.036). Thus, H. pylori infection may increase MUC6 expression and decrease MUC5AC expression by 924% and 52%, respectively. H. pylori inhibits MUC5AC expression in the gastric epithelium, and facilitates colonization. In contrast, increased MUC6 expression may help inhibiting colonization, using MUC6 antibiotics properties.
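
    The quantities named in this abstract (a pooled odds ratio under a random effects model, Cochran's Q, and I²) can be sketched as follows. The abstract does not say which random-effects estimator the software used; the sketch assumes the DerSimonian-Laird moment estimator, and the function name and inputs (per-study log odds ratios and their variances) are hypothetical.

```python
import math


def pool_random_effects(log_ors, variances):
    """Pool per-study log odds ratios; return OR, 95% CI, Q, I2 (%), and tau2."""
    weights = [1.0 / v for v in variances]
    total_w = sum(weights)
    fixed = sum(w * y for w, y in zip(weights, log_ors)) / total_w
    # Cochran's Q: weighted squared deviations from the fixed-effect mean.
    q = sum(w * (y - fixed) ** 2 for w, y in zip(weights, log_ors))
    df = len(log_ors) - 1
    # I2: share of total variation attributed to between-study heterogeneity.
    i2 = max(0.0, (q - df) / q) * 100.0 if q > 0 else 0.0
    # DerSimonian-Laird moment estimate of the between-study variance (assumed).
    c = total_w - sum(w * w for w in weights) / total_w
    tau2 = max(0.0, (q - df) / c) if c > 0 else 0.0
    re_weights = [1.0 / (v + tau2) for v in variances]
    pooled = sum(w * y for w, y in zip(re_weights, log_ors)) / sum(re_weights)
    se = math.sqrt(1.0 / sum(re_weights))
    return {
        "odds_ratio": math.exp(pooled),
        "ci95": (math.exp(pooled - 1.96 * se), math.exp(pooled + 1.96 * se)),
        "q": q,
        "i2": i2,
        "tau2": tau2,
    }
```

    Under the abstract's criterion, a returned `i2` above 50 would be read as substantial heterogeneity; the Q-test P value would additionally require a chi-square tail probability with `df` degrees of freedom, which is omitted here.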

  8. Helicobacter pylori and gastric mucin expression: A systematic review and meta-analysis

    PubMed Central

    Niv, Yaron

    2015-01-01

    AIM: To investigate the relationship between Helicobacter pylori (H. pylori) and mucin expression in gastric mucosa. METHODS: English Medical literature searches were conducted for gastric mucin expression in H. pylori infected people vs uninfected people. Searches were performed up to December 31, 2014, using MEDLINE, PubMed, EMBASE, Scopus, and CENTRAL. Studies comparing mucin expression in the gastric mucosa in patients positive and negative for H. pylori infection were included. Meta-analysis was performed by using Comprehensive meta-analysis software (Version 3, Biostat Inc., Englewood, NJ, United States). Pooled odds ratios (ORs) and 95% confidence intervals (CIs) were calculated to compare mucin expression in individual studies by using the random effects model. Heterogeneity between studies was evaluated using the Cochran Q-test, and it was considered to be present if the Q-test P value was less than 0.10. The I² statistic was used to measure the proportion of inconsistency in individual studies, with I² > 50% representing substantial heterogeneity. We also calculated a potential publication bias. RESULTS: Eleven studies, which represent 53 sub-studies of 15 different kinds of mucin expression, were selected according to the inclusion criteria. Every kind of mucin has been considered as one study. When a specific mucin has been studied in more than one paper, we combined the results in a nested meta-analysis of this particular mucin: MUC2, MUC6, STn, Paradoxical con A, Tn, T, Type 1 chain mucin, LeA, SLeA, LeB, AB-PAS, MUC1, and MUC5AC. The odds ratio of mucin expression in random analysis was 2.33, 95%CI: 1.230-4.411, P = 0.009, higher expression in H. pylori infected patients. Odds ratio for mucin expression in H. pylori positive patients was higher for MUC6 (9.244, 95%CI: 1.567-54.515, P = 0.014), and significantly lower for MUC5AC (0.447, 95%CI: 0.211-0.949, P = 0.036). Thus, H. pylori infection may increase MUC6 expression and decrease MUC5AC expression by 924% and 52%, respectively. CONCLUSION: H. pylori inhibits MUC5AC expression in the gastric epithelium, and facilitates colonization. In contrast, increased MUC6 expression may help inhibiting colonization, using MUC6 antibiotics properties. PMID:26309370

  9. Expression of FLT4 in hypoxia-induced neovascular models in vitro and in vivo.

    PubMed

    Liu, Jiao-Lian; Xia, Xiao-Bo; Xu, Hui-Zhuo

    2011-01-01

    To investigate the expression of FLT4 in retina with oxygen induced retinopathy (OIR) and in brain endothelial cell lines (bEnd3) under hypoxia conditions in mice. Fifty-two one-week-old C57BL/6J mice were divided into control group and hypoxia group. The mice of hypoxia group were exposed to 75% oxygen for 5 days and then returned to the room air to induce retinal neovascularization. Mice in control group were raised in the environment of room air at the same time. The expressions of FLT4 mRNA and protein were checked with RT-PCR and Western Blot analysis at postnatal day 14, 17 and 21 (P14, P17 and P21) respectively. 125 mmol/L CoCl2 was added to the culture medium of bEnd3 cells, proteins were extracted at 12, 24, 48 and 72 hours and FLT4 levels were examined by Western Blot analysis. The mRNA and protein level of FLT4 expressed in P14 and P17 OIR mice retina statistically up-regulated as compared with those in control group, but there was no statistical difference between OIR group and control group at P21. FLT4 levels increased significantly in 12, 24 and 48 hours hypoxia intervened bEnd3 cells, its levels in 72 hours increased mildly but showed no significance. FLT4 levels increase in OIR mice retinas and bEnd3 cells in hypoxia. It may play an important role in endothelial cells proliferation in hypoxia and retinal neovascularization in OIR mice.

  10. Expression of FLT4 in hypoxia-induced neovascular models in vitro and in vivo

    PubMed Central

    Liu, Jiao-Lian; Xia, Xiao-Bo; Xu, Hui-Zhuo

    2011-01-01

    AIM To investigate the expression of FLT4 in retina with oxygen induced retinopathy (OIR) and in brain endothelial cell lines (bEnd3) under hypoxia conditions in mice. METHODS Fifty-two one-week-old C57BL/6J mice were divided into control group and hypoxia group. The mice of hypoxia group were exposed to 75% oxygen for 5 days and then returned to the room air to induce retinal neovascularization. Mice in control group were raised in the environment of room air at the same time. The expressions of FLT4 mRNA and protein were checked with RT-PCR and Western Blot analysis at postnatal day 14, 17 and 21 (P14, P17 and P21) respectively. 125 mmol/L CoCl2 was added to the culture medium of bEnd3 cells, proteins were extracted at 12, 24, 48 and 72 hours and FLT4 levels were examined by Western Blot analysis. RESULTS The mRNA and protein level of FLT4 expressed in P14 and P17 OIR mice retina statistically up-regulated as compared with those in control group, but there was no statistical difference between OIR group and control group at P21. FLT4 levels increased significantly in 12, 24 and 48 hours hypoxia intervened bEnd3 cells, its levels in 72 hours increased mildly but showed no significance. CONCLUSION FLT4 levels increase in OIR mice retinas and bEnd3 cells in hypoxia. It may play an important role in endothelial cells proliferation in hypoxia and retinal neovascularization in OIR mice. PMID:22553602

  11. Expression of microRNA 638 and sex-determining region Y-box 2 in hepatocellular carcinoma: Association between clinicopathological features and prognosis.

    PubMed

    Ye, Weikang; Li, Jieke; Fang, Guan; Cai, Xiupeng; Zhang, Yan; Zhou, Chaojun; Chen, Lei; Yang, Wenjun

    2018-05-01

    The aim of the present study was to determine the expression profile of microRNA 638 (miR-638) and sex-determining region Y-box 2 (SOX2) in hepatocellular carcinoma (HCC), and to investigate their association with clinicopathological features and survival. Reverse transcription-quantitative polymerase chain reaction analysis was used to investigate miR-638 and SOX2 expression in 78 patients with HCC. Western blot and immunohistochemical analyses were performed in order to determine SOX2 protein expression in HCC samples. Combined with the clinical postoperative follow-up data, the expression of miR-638 and SOX2 and the association between this and the prognostic values of patients with HCC were statistically analyzed. The results of the present study confirmed that miR-638 expression in tumor tissues was significantly downregulated (P<0.001), while SOX2 expression was significantly increased, compared with healthy control tissues (P<0.05). In addition, a significant inverse correlation between miR-638 and SOX2 expression was also observed in the HCC tissues (r=-0.675; P<0.05). Clinicopathological correlation analysis demonstrated that reduced miR-638 and elevated SOX2 expression was significantly associated with the Tumor-Node-Metastasis stage and portal vascular invasion (P<0.05). However, no significant differences were observed in other clinicopathological features, including age, sex, tumor size, tumor differentiation and hepatitis status (P>0.05). Notably, follow-up analysis revealed that patients with HCC with low miR-638 expression and high SOX2 expression tended to have a significantly shorter postoperative survival time (P<0.001). It was concluded that miR-638 may serve a vital role in the occurrence and progression of HCC by regulating SOX2 expression and thus, that miR-638 and SOX2 may be critical as novel diagnostic and prognostic biomarkers for HCC.

  12. Modeling genome-wide dynamic regulatory network in mouse lungs with influenza infection using high-dimensional ordinary differential equations.

    PubMed

    Wu, Shuang; Liu, Zhi-Ping; Qiu, Xing; Wu, Hulin

    2014-01-01

    The immune response to viral infection is regulated by an intricate network of many genes and their products. The reverse engineering of gene regulatory networks (GRNs) using mathematical models from time course gene expression data collected after influenza infection is key to our understanding of the mechanisms involved in controlling influenza infection within a host. A five-step pipeline: detection of temporally differentially expressed genes, clustering genes into co-expressed modules, identification of network structure, parameter estimate refinement, and functional enrichment analysis, is developed for reconstructing high-dimensional dynamic GRNs from genome-wide time course gene expression data. Applying the pipeline to the time course gene expression data from influenza-infected mouse lungs, we have identified 20 distinct temporal expression patterns in the differentially expressed genes and constructed a module-based dynamic network using a linear ODE model. Both intra-module and inter-module annotations and regulatory relationships of our inferred network show some interesting findings and are highly consistent with existing knowledge about the immune response in mice after influenza infection. The proposed method is a computationally efficient, data-driven pipeline bridging experimental data, mathematical modeling, and statistical analysis. The application to the influenza infection data elucidates the potentials of our pipeline in providing valuable insights into systematic modeling of complicated biological processes.

  13. Image analysis tools and emerging algorithms for expression proteomics

    PubMed Central

    English, Jane A.; Lisacek, Frederique; Morris, Jeffrey S.; Yang, Guang-Zhong; Dunn, Michael J.

    2012-01-01

    Since their origins in academic endeavours in the 1970s, computational analysis tools have matured into a number of established commercial packages that underpin research in expression proteomics. In this paper we describe the image analysis pipeline for the established 2-D Gel Electrophoresis (2-DE) technique of protein separation, and by first covering signal analysis for Mass Spectrometry (MS), we also explain the current image analysis workflow for the emerging high-throughput ‘shotgun’ proteomics platform of Liquid Chromatography coupled to MS (LC/MS). The bioinformatics challenges for both methods are illustrated and compared, whilst existing commercial and academic packages and their workflows are described from both a user’s and a technical perspective. Attention is given to the importance of sound statistical treatment of the resultant quantifications in the search for differential expression. Despite wide availability of proteomics software, a number of challenges have yet to be overcome regarding algorithm accuracy, objectivity and automation, generally due to deterministic spot-centric approaches that discard information early in the pipeline, propagating errors. We review recent advances in signal and image analysis algorithms in 2-DE, MS, LC/MS and Imaging MS. Particular attention is given to wavelet techniques, automated image-based alignment and differential analysis in 2-DE, Bayesian peak mixture models and functional mixed modelling in MS, and group-wise consensus alignment methods for LC/MS. PMID:21046614

  14. Proteomic Profiling and Differential Messenger RNA Expression Correlate HSP27 and Serpin Family B Member 1 to Apical Periodontitis Outcomes.

    PubMed

    Cavalla, Franco; Biguetti, Claudia; Jain, Sameer; Johnson, Cleverick; Letra, Ariadne; Garlet, Gustavo Pompermaier; Silva, Renato Menezes

    2017-09-01

    Understanding protein expression profiles of apical periodontitis may contribute to the discovery of novel diagnostic or therapeutic molecular targets. Periapical tissue samples (n = 5) of patients with lesions characterized as nonhealing were submitted for proteomic analysis. Two differentially expressed proteins (heat shock protein 27 [HSP27] and serpin family B member 1 [SERPINB1]) were selected for characterization, localization by immunofluorescence, and association with known biomarkers of acute inflammatory response in human apical periodontitis (n = 110) and healthy periodontal ligaments (n = 26). Apical periodontitis samples were categorized as stable/inactive (n = 70) or progressive/active (n = 40) based on the ratio of expression of receptor activator of nuclear factor kappa-B ligand (RANKL)/osteoprotegerin (OPG). Next, the expression of HSP27, SERPINB1, C-X-C motif Chemokine Receptor 1 (CXCR1), matrix metalloproteinase 8 (MMP8), myeloperoxidase (MPO), and cathepsin G (CTSG) messenger RNA was evaluated using real-time polymerase chain reaction. Data analysis was performed using the Shapiro-Wilk test, analysis of variance, and the Pearson test. P values <.05 were considered statistically significant. Proteomic analysis revealed 48 proteins as differentially expressed in apical periodontitis compared with a healthy periodontium, with 30 of these proteins found to be expressed in all 4 lesions. The expression of HSP27 and SERPINB1 was ∼2-fold higher in apical periodontitis. Next, an increased expression of HSP27 was detected in epithelial cells, whereas SERPINB1 expression was noted in neutrophils and epithelial cells. HSP27 and SERPINB1 transcripts were highly expressed in stable/inactive lesions (P < .05). Significant negative correlations were found between the expression of HSP27 and SERPINB1 with biomarkers of acute inflammation including CXCR1, MPO, and CTSG. Our data suggest HSP27 and SERPINB1 as potential regulators of the inflammatory response in apical periodontitis. Additional functional studies should be performed to further characterize the role of these molecules during the development/progression of apical periodontitis. Copyright © 2017 American Association of Endodontists. Published by Elsevier Inc. All rights reserved.

  15. Clustering gene expression data based on predicted differential effects of GV interaction.

    PubMed

    Pan, Hai-Yan; Zhu, Jun; Han, Dan-Fu

    2005-02-01

    Microarray has become a popular biotechnology in biological and medical research. However, systematic and stochastic variabilities in microarray data are expected and unavoidable, resulting in the problem that the raw measurements have inherent "noise" within microarray experiments. Currently, logarithmic ratios are usually analyzed by various clustering methods directly, which may introduce bias interpretation in identifying groups of genes or samples. In this paper, a statistical method based on mixed model approaches was proposed for microarray data cluster analysis. The underlying rationale of this method is to partition the observed total gene expression level into various variations caused by different factors using an ANOVA model, and to predict the differential effects of GV (gene by variety) interaction using the adjusted unbiased prediction (AUP) method. The predicted GV interaction effects can then be used as the inputs of cluster analysis. We illustrated the application of our method with a gene expression dataset and elucidated the utility of our approach using an external validation.

  16. Meta-STEPP: subpopulation treatment effect pattern plot for individual patient data meta-analysis.

    PubMed

    Wang, Xin Victoria; Cole, Bernard; Bonetti, Marco; Gelber, Richard D

    2016-09-20

    We have developed a method, called Meta-STEPP (subpopulation treatment effect pattern plot for meta-analysis), to explore treatment effect heterogeneity across covariate values in the meta-analysis setting for time-to-event data when the covariate of interest is continuous. Meta-STEPP forms overlapping subpopulations from individual patient data containing similar numbers of events with increasing covariate values, estimates subpopulation treatment effects using standard fixed-effects meta-analysis methodology, displays the estimated subpopulation treatment effect as a function of the covariate values, and provides a statistical test to detect possibly complex treatment-covariate interactions. Simulation studies show that this test has adequate type-I error rate recovery as well as power when reasonable window sizes are chosen. When applied to eight breast cancer trials, Meta-STEPP suggests that chemotherapy is less effective for tumors with high estrogen receptor expression compared with those with low expression. Copyright © 2016 John Wiley & Sons, Ltd.

  17. Network theory inspired analysis of time-resolved expression data reveals key players guiding P. patens stem cell development.

    PubMed

    Busch, Hauke; Boerries, Melanie; Bao, Jie; Hanke, Sebastian T; Hiss, Manuel; Tiko, Theodhor; Rensing, Stefan A

    2013-01-01

    Transcription factors (TFs) often trigger developmental decisions, yet, their transcripts are often only moderately regulated and thus not easily detected by conventional statistics on expression data. Here we present a method that allows to determine such genes based on trajectory analysis of time-resolved transcriptome data. As a proof of principle, we have analysed apical stem cells of filamentous moss (P. patens) protonemata that develop from leaflets upon their detachment from the plant. By our novel correlation analysis of the post detachment transcriptome kinetics we predict five out of 1,058 TFs to be involved in the signaling leading to the establishment of pluripotency. Among the predicted regulators is the basic helix loop helix TF PpRSL1, which we show to be involved in the establishment of apical stem cells in P. patens. Our methodology is expected to aid analysis of key players of developmental decisions in complex plant and animal systems.

  18. Network Analysis of Rodent Transcriptomes in Spaceflight

    NASA Technical Reports Server (NTRS)

    Ramachandran, Maya; Fogle, Homer; Costes, Sylvain

    2017-01-01

    Network analysis methods leverage prior knowledge of cellular systems and the statistical and conceptual relationships between analyte measurements to determine gene connectivity. Correlation and conditional metrics are used to infer a network topology and provide a systems-level context for cellular responses. Integration across multiple experimental conditions and omics domains can reveal the regulatory mechanisms that underlie gene expression. GeneLab has assembled rich multi-omic (transcriptomics, proteomics, epigenomics, and epitranscriptomics) datasets for multiple murine tissues from the Rodent Research 1 (RR-1) experiment. RR-1 assesses the impact of 37 days of spaceflight on gene expression across a variety of tissue types, such as adrenal glands, quadriceps, gastrocnemius, tibialis anterior, extensor digitorum longus, soleus, eye, and kidney. Network analysis is particularly useful for RR-1 -omics datasets because it reinforces subtle relationships that may be overlooked in isolated analyses and subdues confounding factors. Our objective is to use network analysis to determine potential target nodes for therapeutic intervention and identify similarities with existing disease models. Multiple network algorithms are used for a higher confidence consensus.

  19. Linear and Order Statistics Combiners for Pattern Classification

    NASA Technical Reports Server (NTRS)

    Tumer, Kagan; Ghosh, Joydeep; Lau, Sonie (Technical Monitor)

    2001-01-01

    Several researchers have experimentally shown that substantial improvements can be obtained in difficult pattern recognition problems by combining or integrating the outputs of multiple classifiers. This chapter provides an analytical framework to quantify the improvements in classification results due to combining. The results apply to both linear combiners and order statistics combiners. We first show that to a first order approximation, the error rate obtained over and above the Bayes error rate is directly proportional to the variance of the actual decision boundaries around the Bayes optimum boundary. Combining classifiers in output space reduces this variance, and hence reduces the 'added' error. If N unbiased classifiers are combined by simple averaging, the added error rate can be reduced by a factor of N if the individual errors in approximating the decision boundaries are uncorrelated. Expressions are then derived for linear combiners which are biased or correlated, and the effect of output correlations on ensemble performance is quantified. For order statistics based non-linear combiners, we derive expressions that indicate how much the median, the maximum and in general the i-th order statistic can improve classifier performance. The analysis presented here facilitates the understanding of the relationships among error rates, classifier boundary distributions, and combining in output space. Experimental results on several public domain data sets are provided to illustrate the benefits of combining and to support the analytical results.
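
    The factor-of-N claim for uncorrelated, unbiased classifiers rests on the variance of an average of N independent errors being 1/N of the individual variance. The simulation below is a small numerical check of that variance argument only, under the assumption of Gaussian boundary errors; the function name, the noise model, and the parameter values are illustrative and do not come from the chapter.

```python
import random
import statistics


def averaged_error_variance(n_classifiers, sigma=1.0, trials=20000, seed=0):
    """Estimate the variance of the boundary error after simple averaging.

    Each classifier's boundary error is modeled as an independent zero-mean
    Gaussian with standard deviation sigma (an assumption for illustration).
    """
    rng = random.Random(seed)
    averaged = []
    for _ in range(trials):
        errors = [rng.gauss(0.0, sigma) for _ in range(n_classifiers)]
        averaged.append(sum(errors) / n_classifiers)
    return statistics.pvariance(averaged)


single = averaged_error_variance(1)
combined = averaged_error_variance(5)
reduction = single / combined  # close to 5 when errors are uncorrelated
```

    If the errors were positively correlated, the simulated reduction would fall below N, which matches the abstract's remark that output correlations limit ensemble performance.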

  20. BAG3 promotes chondrosarcoma progression by upregulating the expression of β-catenin.

    PubMed

    Shi, Huijuan; Chen, Wenfang; Dong, Yu; Lu, Xiaofang; Zhang, Wenhui; Wang, Liantang

    2018-04-01

    To investigate the roles of B‑cell lymphoma‑2 associated athanogene 3 (BAG3) in human chondrosarcoma and the potential mechanisms, the expression levels of BAG3 were detected in the present study, and the associations between BAG3 and clinical pathological parameters, clinical stage as well as the survival of patients were analyzed. The present study detected BAG3 mRNA and protein expression in the normal cartilage cell line HC‑a and in SW1353 chondrosarcoma cells by reverse transcription‑quantitative polymerase chain reaction and western blot analysis. The BAG3 protein expression in 59 cases of chondrosarcoma, 30 patients with endogenous chondroma and 8 cases of normal cartilage was semi-quantitatively analyzed using the immunohistochemical method. In addition, the BAG3 protein expression level, the clinical pathological parameters, clinical stage and the survival time of patients with chondrosarcoma were analyzed. The plasmid transfection method was employed to upregulate the expression of BAG3 and small RNA interference to downregulate the expression of BAG3 in SW1353 cells. The expression levels of BAG3 protein and mRNA were significantly increased in the chondrosarcoma cell line when compared with the normal cartilage cell line. The immunohistochemistry results indicated that BAG3 protein was overexpressed in the tissue of human chondrosarcoma. Statistical analysis showed that the expression level of BAG3 was significantly increased in the different Enneking staging of patients with chondrosarcoma and Tumor staging, and there were no statistical differences in age, gender, histological classification and tumor size. In the in vitro experiments, the data revealed that BAG3 significantly promoted chondrosarcoma cell proliferation, colony‑formation, migration and invasion; however, it inhibited chondrosarcoma cell apoptosis. It was observed that BAG3 upregulated β‑catenin expression at the mRNA and protein levels. In addition, BAG3 induced the expression of runt‑related transcription factor 2 (RUNX2) in chondrosarcoma cells by upregulating β‑catenin. These clinical analyses revealed a positive association between β‑catenin and BAG3 in chondrosarcoma tumors. BAG3 was significantly increased in chondrosarcoma cells and tissues compared with the normal cartilage cells, tissue and cartilage benign tumors. Thus, BAG3 may serve as an oncogene in the development of chondrosarcoma via the induction of RUNX2 expression. The results of the present study contribute to further research on the biological development of chondrosarcoma.

  1. Cross-frequency and band-averaged response variance prediction in the hybrid deterministic-statistical energy analysis method

    NASA Astrophysics Data System (ADS)

    Reynders, Edwin P. B.; Langley, Robin S.

    2018-08-01

    The hybrid deterministic-statistical energy analysis method has proven to be a versatile framework for modeling built-up vibro-acoustic systems. The stiff system components are modeled deterministically, e.g., using the finite element method, while the wave fields in the flexible components are modeled as diffuse. In the present paper, the hybrid method is extended such that not only the ensemble mean and variance of the harmonic system response can be computed, but also of the band-averaged system response. This variance represents the uncertainty that is due to the assumption of a diffuse field in the flexible components of the hybrid system. The developments start with a cross-frequency generalization of the reciprocity relationship between the total energy in a diffuse field and the cross spectrum of the blocked reverberant loading at the boundaries of that field. By making extensive use of this generalization in a first-order perturbation analysis, explicit expressions are derived for the cross-frequency and band-averaged variance of the vibrational energies in the diffuse components and for the cross-frequency and band-averaged variance of the cross spectrum of the vibro-acoustic field response of the deterministic components. These expressions are extensively validated against detailed Monte Carlo analyses of coupled plate systems in which diffuse fields are simulated by randomly distributing small point masses across the flexible components, and good agreement is found.

  2. QNB: differential RNA methylation analysis for count-based small-sample sequencing data with a quad-negative binomial model.

    PubMed

    Liu, Lian; Zhang, Shao-Wu; Huang, Yufei; Meng, Jia

    2017-08-31

    As a newly emerged research area, RNA epigenetics has drawn increasing attention recently for the participation of RNA methylation and other modifications in a number of crucial biological processes. Thanks to high throughput sequencing techniques such as MeRIP-Seq, the transcriptome-wide RNA methylation profile is now available in the form of count-based data, with which it is often of interest to study the dynamics at the epitranscriptomic layer. However, the sample size of an RNA methylation experiment is usually very small due to its costs; additionally, there are usually a large number of genes whose methylation level cannot be accurately estimated due to their low expression level, making differential RNA methylation analysis a difficult task. We present QNB, a statistical approach for differential RNA methylation analysis with count-based small-sample sequencing data. Compared with previous approaches such as the DRME model, which is based on a statistical test covering the IP samples only with 2 negative binomial distributions, QNB is based on 4 independent negative binomial distributions with their variances and means linked by local regressions, and in this way the input control samples are also properly taken care of. In addition, unlike the DRME approach, which relies only on the input control sample for estimating the background, QNB uses a more robust estimator for gene expression by combining information from both input and IP samples, which could largely improve the testing performance for very lowly expressed genes. QNB showed improved performance on both simulated and real MeRIP-Seq datasets when compared with competing algorithms. The QNB model is also applicable to other datasets related to RNA modifications, including but not limited to RNA bisulfite sequencing, m1A-Seq, Par-CLIP, RIP-Seq, etc.

  3. TEtools facilitates big data expression analysis of transposable elements and reveals an antagonism between their activity and that of piRNA genes

    PubMed Central

    Lerat, Emmanuelle; Fablet, Marie; Modolo, Laurent; Lopez-Maestre, Hélène

    2017-01-01

    Over recent decades, substantial efforts have been made to understand the interactions between host genomes and transposable elements (TEs). The impact of TEs on the regulation of host genes is well known, with TEs acting as platforms of regulatory sequences. Nevertheless, due to their repetitive nature, it is considerably hard to integrate TE analysis into genome-wide studies. Here, we developed a specific tool for the analysis of TE expression: TEtools. This tool takes into account the TE sequence diversity of the genome; it can be applied to unannotated or unassembled genomes and is freely available under the GPL3 (https://github.com/l-modolo/TEtools). TEtools performs the mapping of RNA-seq data obtained from classical mRNAs or small RNAs onto a list of TE sequences and performs differential expression analyses with statistical relevance. Using this tool, we analyzed TE expression from five Drosophila wild-type strains. Our data show for the first time that the activity of TEs is strictly linked to the activity of the genes implicated in the piwi-interacting RNA biogenesis and therefore fits an arms race scenario between TE sequences and host control genes. PMID:28204592

  4. [Optimization of prokaryotic expression conditions of Leptospira interrogans trigeminy genus-specific protein antigen based on surface response analysis].

    PubMed

    Wang, Jiang; Luo, Dongjiao; Sun, Aihua; Yan, Jie

    2008-07-01

    Lipoproteins LipL32 and LipL21 and transmembrane protein OMPL1 have been confirmed as the superficial genus-specific antigens of Leptospira interrogans, which can be used as antigens for developing a universal genetic engineering vaccine. In order to obtain high expression of an artificial fusion gene lipL32/1-lipL21-ompL1/2, we optimized prokaryotic expression conditions. We used surface response analysis based on the central composite design (CCD) to optimize culture conditions of a new antigen protein by recombinant Escherichia coli DE3. The culture conditions included initial pH, induction start time, post-induction time, isopropyl beta-D-thiogalactopyranoside (IPTG) concentration, and temperature. The maximal production of antigen protein was 37.78 mg/l. The optimal culture conditions for high expression of the recombinant fusion protein were determined: initial pH 7.9, induction start time 2.5 h, a post-induction time of 5.38 h, 0.20 mM IPTG, and a post-induction temperature of 31 degrees C. Surface response analysis based on CCD increased the target production. This statistical method reduced the number of experiments required for optimization and enabled rapid identification and integration of the key culture condition parameters for optimizing recombinant protein expression.
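
    The abstract does not state how the central composite design was laid out, so the following is only a minimal sketch of a generic rotatable CCD in coded units. The function name, the choice of a full two-level factorial core, the rotatable axial distance and the single center point are all assumptions for illustration, not details taken from the study.

```python
from itertools import product


def central_composite_design(k, alpha=None, n_center=1):
    """Return coded design points for a central composite design with k factors.

    Assumptions (not from the abstract): full 2^k factorial core,
    rotatable axial distance alpha = (2^k)^(1/4), n_center center runs.
    """
    if alpha is None:
        alpha = (2 ** k) ** 0.25  # rotatable design
    # Factorial (corner) points: every combination of -1 and +1.
    factorial = [list(p) for p in product((-1.0, 1.0), repeat=k)]
    # Axial (star) points: one factor at +/- alpha, the others at the center.
    axial = []
    for i in range(k):
        for sign in (-1.0, 1.0):
            point = [0.0] * k
            point[i] = sign * alpha
            axial.append(point)
    # Center points: all factors at their middle level.
    center = [[0.0] * k for _ in range(n_center)]
    return factorial + axial + center


# Five factors, as in the abstract: pH, induction start time,
# post-induction time, IPTG concentration, temperature.
design = central_composite_design(5)
print(len(design))  # 32 corner + 10 axial + 1 center runs
```

    Under these assumptions the design needs 43 runs, compared with 3^5 = 243 runs for a full three-level factorial over the same five factors, which illustrates why such designs reduce the number of experiments.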

  5. Using foreground/background analysis to determine leaf and canopy chemistry

    NASA Technical Reports Server (NTRS)

    Pinzon, J. E.; Ustin, S. L.; Hart, Q. J.; Jacquemoud, S.; Smith, M. O.

    1995-01-01

    Spectral Mixture Analysis (SMA) has become a well established procedure for analyzing imaging spectrometry data; however, the technique is relatively insensitive to minor sources of spectral variation (e.g., discriminating stressed from unstressed vegetation and variations in canopy chemistry). Other statistical approaches have been tried, e.g., stepwise multiple linear regression (SMLR) analysis to predict canopy chemistry. Grossman et al. reported that SMLR is sensitive to measurement error and that the prediction of minor chemical components is not independent of patterns observed in more dominant spectral components like water. Further, they observed that the relationships were strongly dependent on the mode of expressing reflectance (R, -log R) and whether chemistry was expressed on a weight (g/g) or area basis (g/sq m). Thus, alternative multivariate techniques need to be examined. Smith et al. reported a revised SMA that they termed Foreground/Background Analysis (FBA) that permits directing the analysis along any axis of variance by identifying vectors through the n-dimensional spectral volume orthonormal to each other. Here, we report an application of the FBA technique for the detection of canopy chemistry using a modified form of the analysis.

  6. Methods for processing microarray data.

    PubMed

    Ares, Manuel

    2014-02-01

    Quality control must be maintained at every step of a microarray experiment, from RNA isolation through statistical evaluation. Here we provide suggestions for analyzing microarray data. Because the utility of the results depends directly on the design of the experiment, the first critical step is to ensure that the experiment can be properly analyzed and interpreted. What is the biological question? What is the best way to perform the experiment? How many replicates will be required to obtain the desired statistical resolution? Next, the samples must be prepared, pass quality controls for integrity and representation, and be hybridized and scanned. Also, slides with defects, missing data, high background, or weak signal must be rejected. Data from individual slides must be normalized and combined so that the data are as free of systematic bias as possible. The third phase is to apply statistical filters and tests to the data to determine genes (1) expressed above background, (2) whose expression level changes in different samples, and (3) whose RNA-processing patterns or protein associations change. Next, a subset of the data should be validated by an alternative method, such as reverse transcription-polymerase chain reaction (RT-PCR). Provided that this endorses the general conclusions of the array analysis, gene sets whose expression, splicing, polyadenylation, protein binding, etc. change in different samples can be classified with respect to function, sequence motif properties, as well as other categories to extract hypotheses for their biological roles and regulatory logic.

  7. Compositional data analysis for physical activity, sedentary time and sleep research.

    PubMed

    Dumuid, Dorothea; Stanford, Tyman E; Martin-Fernández, Josep-Antoni; Pedišić, Željko; Maher, Carol A; Lewis, Lucy K; Hron, Karel; Katzmarzyk, Peter T; Chaput, Jean-Philippe; Fogelholm, Mikael; Hu, Gang; Lambert, Estelle V; Maia, José; Sarmiento, Olga L; Standage, Martyn; Barreira, Tiago V; Broyles, Stephanie T; Tudor-Locke, Catrine; Tremblay, Mark S; Olds, Timothy

    2017-01-01

    The health effects of daily activity behaviours (physical activity, sedentary time and sleep) are widely studied. While previous research has largely examined activity behaviours in isolation, recent studies have adjusted for multiple behaviours. However, the inclusion of all activity behaviours in traditional multivariate analyses has not been possible due to the perfect multicollinearity of 24-h time budget data. The ensuing lack of adjustment for known effects on the outcome undermines the validity of study findings. We describe a statistical approach that enables the inclusion of all daily activity behaviours, based on the principles of compositional data analysis. Using data from the International Study of Childhood Obesity, Lifestyle and the Environment, we demonstrate the application of compositional multiple linear regression to estimate adiposity from children's daily activity behaviours expressed as isometric log-ratio coordinates. We present a novel method for predicting change in a continuous outcome based on relative changes within a composition, and for calculating associated confidence intervals to allow for statistical inference. The compositional data analysis presented overcomes the lack of adjustment that has plagued traditional statistical methods in the field, and provides robust and reliable insights into the health effects of daily activity behaviours.
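
    As a rough illustration of the isometric log-ratio (ilr) coordinates mentioned above, the sketch below computes one common choice, pivot coordinates, for a composition of positive parts. The function name and the example time budget are hypothetical, and the abstract does not say which ilr basis the authors used.

```python
import math


def ilr(composition):
    """Map a D-part composition to D-1 isometric log-ratio (pivot) coordinates.

    Each coordinate contrasts one part with the geometric mean of the parts
    that follow it. All parts must be strictly positive (zeros are not handled).
    """
    total = sum(composition)
    parts = [x / total for x in composition]  # closure: parts sum to 1
    d = len(parts)
    coords = []
    for i in range(d - 1):
        rest = parts[i + 1:]
        r = len(rest)
        gmean_rest = math.exp(sum(math.log(x) for x in rest) / r)
        coords.append(math.sqrt(r / (r + 1)) * math.log(parts[i] / gmean_rest))
    return coords


# Hypothetical 24-h day in hours: sleep, sedentary time, light activity, MVPA.
day_hours = [9.0, 8.0, 6.0, 1.0]
print(ilr(day_hours))
```

    Because the coordinates depend only on ratios between parts, the same day expressed in minutes gives the same values, and the D-1 coordinates can enter a regression without the perfect multicollinearity of the raw 24-h shares.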

  8. A microarray analysis of the effects of moderate hypothermia and rewarming on gene expression by human hepatocytes (HepG2).

    PubMed

    Sonna, Larry A; Kuhlmeier, Matthew M; Khatri, Purvesh; Chen, Dechang; Lilly, Craig M

    2010-09-01

    The gene expression changes produced by moderate hypothermia are not fully known, but appear to differ in important ways from those produced by heat shock. We examined the gene expression changes produced by moderate hypothermia and tested the hypothesis that rewarming after hypothermia approximates a heat-shock response. Six sets of human HepG2 hepatocytes were subjected to moderate hypothermia (31 degrees C for 16 h), a conventional in vitro heat shock (43 degrees C for 30 min) or control conditions (37 degrees C), then harvested immediately or allowed to recover for 3 h at 37 degrees C. Expression analysis was performed with Affymetrix U133A gene chips, using analysis of variance-based techniques. Moderate hypothermia led to distinct time-dependent expression changes, as did heat shock. Hypothermia initially caused statistically significant, greater than or equal to twofold changes in expression (relative to controls) of 409 sequences (143 increased and 266 decreased), whereas heat shock affected 71 (35 increased and 36 decreased). After 3 h of recovery, 192 sequences (83 increased, 109 decreased) were affected by hypothermia and 231 (146 increased, 85 decreased) by heat shock. Expression of many heat shock proteins was decreased by hypothermia but significantly increased after rewarming. A comparison of sequences affected by thermal stress without regard to the magnitude of change revealed that the overlap between heat and cold stress was greater after 3 h of recovery than immediately following thermal stress. Thus, while some overlap occurs (particularly after rewarming), moderate hypothermia produces extensive, time-dependent gene expression changes in HepG2 cells that differ in important ways from those induced by heat shock.
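
    The twofold criterion described above can be sketched as a simple classifier on the treated-to-control ratio. This is only an illustration with made-up intensities: the function name is hypothetical, and the analysis of variance step that the study used to establish statistical significance is not reproduced here.

```python
import math


def classify_fold_change(treated, control, threshold=2.0):
    """Label a sequence by its fold change relative to control.

    A change counts when the ratio is >= threshold (increased) or
    <= 1/threshold (decreased); significance testing is out of scope.
    """
    log_ratio = math.log2(treated / control)
    cutoff = math.log2(threshold)
    if log_ratio >= cutoff:
        return "increased"
    if log_ratio <= -cutoff:
        return "decreased"
    return "unchanged"


# Hypothetical signal intensities (treated, control) for three probes.
for treated, control in [(200.0, 100.0), (50.0, 100.0), (150.0, 100.0)]:
    print(treated, control, classify_fold_change(treated, control))
```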

  9. [The expression and significance of RORγT in periapical granulomas and radicular cysts].

    PubMed

    Lang, Xiao-ying; Li, Song

    2014-08-01

    To identify retinoic acid-related orphan nuclear receptor-γT (RORγT), the specific marker of T helper 17 (Th17) cells, by immunohistochemical analysis to confirm the presence of Th17 cells in periapical lesions. Eighteen radicular cysts (RCs) and 22 periapical granulomas (PGs) were collected in the Department of Oral Pathology after periapical surgery as the experimental samples. Five alveolar bone samples were obtained from a group of impacted third molars recommended for extraction as the control samples. The protein expression of RORγT was measured by immunohistochemical analysis for all samples. In addition, the protein expression of IL-17 was measured at the same time. Statistical analysis was performed using SPSS 17.0 software package to evaluate the differences in expression of RORγT and IL-17 according to type of lesion (PG vs. RC vs. control group) and intensity of the inflammatory infiltration (mild vs. moderate vs. severe vs. control group). RORγT+ cells were detected in all periapical lesion tissues, and the expression of RORγT was significantly higher in periapical lesions than in normal tissues, which had no expression of RORγT (P<0.05). Significant differences in the expression of RORγT were observed among healthy tissues and lesions with mild, moderate and severe inflammation (P<0.05). Positive correlations between RORγT and IL-17 protein levels were observed in PGs (r=0.935, P<0.05) and RCs (r=0.803, P<0.05). The results demonstrate a significant increase in the expression of RORγT in patients suffering from periapical lesions in comparison with normal control subjects, indicating that Th17 cells are more likely to exist in periapical lesions.

  10. Gender Differences in Expressed Interests in Engineering-Related Fields ACT 30-Year Data Analysis Identified Trends and Suggested Avenues to Reverse Trends

    ERIC Educational Resources Information Center

    Iskander, E. Tiffany; Gore, Paul A., Jr.; Furse, Cynthia; Bergerson, Amy

    2013-01-01

    Historically, women have been underrepresented in the Science, Technology, Engineering, and Math (STEM) fields both as college majors and in the professional community. This disturbing trend, observed in many countries, is more serious and evident in American universities and is reflected in the U.S. workforce statistics. In this article, we…

  11. Expression of CDX-2 and Ki-67 in different grades of colorectal adenocarcinomas.

    PubMed

    Sen, Anway; Mitra, Sumit; Das, Ram Narayan; Dasgupta, Shatavisha; Saha, Koushik; Chatterjee, Uttara; Mukherjee, Krishnendu; Datta, Chhanda; Chattopadhyay, Bitan K

    2015-01-01

    CDX2 is a caudal homeobox gene essential for intestinal differentiation and is specifically expressed in colorectal adenocarcinomas. Its role in colorectal carcinogenesis is not fully elucidated. To study the expression pattern of CDX2 and Ki-67 in different grades of colorectal adenocarcinomas, to observe the relationship of their staining patterns in various tumor stages, and to look for correlation, if any, between Ki-67 labeling index (Ki-67 LI) and CDX2 expression. A total of 74 cases were enrolled. Detailed clinical profile, peroperative findings, histological grading and staging were noted. Immunohistochemistry for CDX2 and Ki-67 was done, and Ki-67 LI was calculated. CDX2 staining was graded semi-quantitatively, and statistical analysis was done. Age of presentation ranged from 20 to 75 years, and the male:female ratio was 1.83:1. There were 8, 47 and 13 cases of well, moderate and poorly differentiated adenocarcinomas, respectively. The mean Ki-67 LI of well, moderate and poorly differentiated adenocarcinomas were 14.25, 31.34 and 43.08, respectively, and their difference was statistically significant; correlation was also noted with stage. CDX2 expression appeared to be stronger in poorly differentiated cases, but there was no significant difference in its expression in the different grades and stages. There was no correlation between Ki-67 LI and CDX2 immunostaining pattern. The lymph node metastasis showed CDX2 positivity in all the cases. Expression of CDX2 does not significantly change with the grade of colorectal adenocarcinomas. However, it is an important diagnostic marker in metastatic colonic lesions. The Ki-67 LI, on the other hand, showed a strong correlation with histopathological grades.

  12. Quantitative analysis of a deeply sequenced marine microbial metatranscriptome.

    PubMed

    Gifford, Scott M; Sharma, Shalabh; Rinta-Kanto, Johanna M; Moran, Mary Ann

    2011-03-01

    The potential of metatranscriptomic sequencing to provide insights into the environmental factors that regulate microbial activities depends on how fully the sequence libraries capture community expression (that is, sample-sequencing depth and coverage depth), and the sensitivity with which expression differences between communities can be detected (that is, statistical power for hypothesis testing). In this study, we use an internal standard approach to make absolute (per liter) estimates of transcript numbers, a significant advantage over proportional estimates that can be biased by expression changes in unrelated genes. Coastal waters of the southeastern United States contain 1 × 10^12 bacterioplankton mRNA molecules per liter of seawater (~200 mRNA molecules per bacterial cell). Even for the large bacterioplankton libraries obtained in this study (~500,000 possible protein-encoding sequences in each of two libraries after discarding rRNAs and small RNAs from >1 million 454 FLX pyrosequencing reads), sample-sequencing depth was only 0.00001%. Expression levels of 82 genes diagnostic for transformations in the marine nitrogen, phosphorus and sulfur cycles ranged from below detection (<1 × 10^6 transcripts per liter) for 36 genes (for example, phosphonate metabolism gene phnH, dissimilatory nitrate reductase subunit napA) to >2.7 × 10^9 transcripts per liter (ammonia transporter amt and ammonia monooxygenase subunit amoC). Half of the categories for which expression was detected, however, had too few copy numbers for robust statistical resolution, as would be required for comparative (experimental or time-series) expression studies. By representing whole community gene abundance and expression in absolute units (per volume or mass of environment), 'omics' data can be better leveraged to improve understanding of microbially mediated processes in the ocean.
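
    The arithmetic behind an internal standard can be sketched as follows: a known number of standard molecules is added to the sample, and the ratio of molecules added to standard reads recovered scales gene read counts to absolute numbers. The function name and every number below are hypothetical; the abstract does not give the study's actual standard amounts, read counts or sample volumes.

```python
def transcripts_per_liter(gene_reads, standard_reads, standard_added, liters):
    """Estimate absolute transcript abundance from an internal standard.

    gene_reads     : reads assigned to the gene of interest
    standard_reads : reads recovered from the internal standard
    standard_added : standard molecules spiked into the sample
    liters         : volume of water the sample represents
    """
    molecules_per_read = standard_added / standard_reads
    return gene_reads * molecules_per_read / liters


# Hypothetical sample: 1e10 standard molecules added, 1,000 standard reads
# recovered, 50 reads for the gene, 5 liters of seawater filtered.
estimate = transcripts_per_liter(50, 1000, 1e10, 5.0)
print(f"{estimate:.2e} transcripts per liter")
```

    Unlike a proportional estimate (gene reads divided by total reads), this value does not shift when unrelated genes change their expression, which is the advantage the abstract points to.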

  13. Up-regulated expression of type II very low density lipoprotein receptor correlates with cancer metastasis and has a potential link to β-catenin in different cancers.

    PubMed

    He, Lei; Lu, Yanjun; Wang, Peng; Zhang, Jun; Yin, Chuanchang; Qu, Shen

    2010-11-03

    Very low density lipoprotein receptor (VLDLR) has been considered a multiple-function receptor because it binds numerous ligands, causing endocytosis and regulating cellular signaling. Our group previously reported that enhanced activity of type II VLDLR (VLDLR II), one subtype of VLDLR, promotes proliferation and migration of adenocarcinoma SGC7901 cells. The aim of this study is to explore the expression levels of VLDLR II in human gastric, breast and lung cancer tissues, and to investigate its relationship with clinical characteristics and β-catenin expression status. VLDLR II expression was examined using immunohistochemistry (IHC) and Western blot in tumor tissues from 213 gastric, breast and lung cancer patients, and in tumor-adjacent noncancerous tissues by the same methods. Correlations between VLDLR II and clinical features, as well as β-catenin expression status, were evaluated by statistical analysis. The immunohistochemical staining of VLDLR II showed a statistical difference between tumor tissues and tumor-adjacent noncancerous tissues in gastric, breast and lung cancers (P = 0.034, 0.018 and 0.043, respectively). Moreover, using Western blot, we found that higher VLDLR II expression levels were associated with lymph node and distant metastasis in gastric and breast cancer (P < 0.05). Furthermore, highly significant positive correlations were found between VLDLR II and β-catenin in gastric cancer (r = 0.689; P < 0.001) and breast cancer (r = 0.594; P < 0.001). According to the results of the current study, high VLDLR II expression is correlated with lymph node and distant metastasis in gastric and breast cancer patients. The data suggest that VLDLR II may be a clinical marker in cancers and has a potential link with the β-catenin signaling pathway. This is the first study to reveal the closer relationship of VLDLR II with clinical information.

  14. Neuropeptide Y2 Receptor (NPY2R) Expression in Saliva Predicts Feeding Immaturity in the Premature Neonate

    PubMed Central

    Maron, Jill L.; Johnson, Kirby L.; Dietz, Jessica A.; Chen, Minghua L.; Bianchi, Diana W.

    2012-01-01

    Background: The current practice in newborn medicine is to subjectively assess when a premature infant is ready to feed by mouth. When the assessment is inaccurate, the resulting feeding morbidities may be significant, resulting in long-term health consequences and millions of health care dollars annually. We hypothesized that the developmental maturation of hypothalamic regulation of feeding behavior is a predictor of successful oral feeding in the premature infant. To test this hypothesis, we analyzed the gene expression of neuropeptide Y2 receptor (NPY2R), a known hypothalamic regulator of feeding behavior, in neonatal saliva to determine its role as a biomarker in predicting oral feeding success in the neonate. Methodology/Principal Findings: Salivary samples (n = 116) were prospectively collected from 63 preterm and 13 term neonates (post-conceptual age (PCA) 26 4/7 to 41 4/7 weeks) from five predefined feeding stages. Expression of NPY2R in neonatal saliva was determined by multiplex RT-qPCR amplification. Expression results were retrospectively correlated with feeding status at time of sample collection. Statistical analysis revealed that expression of NPY2R had a 95% positive predictive value for feeding immaturity. NPY2R expression statistically significantly decreased with advancing PCA (Wilcoxon test p value < 0.01), and was associated with feeding status (chi square p value = 0.013). Conclusions/Significance: Developmental maturation of hypothalamic regulation of feeding behavior is an essential component of oral feeding success in the newborn. NPY2R expression in neonatal saliva is predictive of an immature feeding pattern. It is a clinically relevant biomarker that may be monitored in saliva to improve clinical care and reduce significant feeding-associated morbidities that affect the premature neonate. PMID:22629465
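
    For readers unfamiliar with the metric, positive predictive value is the fraction of positive test results that are true positives. The sketch below uses hypothetical counts chosen only to give a round figure; the abstract does not report the underlying contingency table.

```python
def positive_predictive_value(true_positives, false_positives):
    """Return TP / (TP + FP): the share of positive calls that are correct."""
    positives = true_positives + false_positives
    if positives == 0:
        raise ValueError("no positive test results to evaluate")
    return true_positives / positives


# Hypothetical counts: 19 samples with detectable marker expression came from
# infants who were in fact feeding-immature, and 1 did not.
ppv = positive_predictive_value(19, 1)
print(f"PPV = {ppv:.0%}")
```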

  15. SPARTA: Simple Program for Automated reference-based bacterial RNA-seq Transcriptome Analysis.

    PubMed

    Johnson, Benjamin K; Scholz, Matthew B; Teal, Tracy K; Abramovitch, Robert B

    2016-02-04

    Many tools exist in the analysis of bacterial RNA sequencing (RNA-seq) transcriptional profiling experiments to identify differentially expressed genes between experimental conditions. Generally, the workflow includes quality control of reads, mapping to a reference, counting transcript abundance, and statistical tests for differentially expressed genes. In spite of the numerous tools developed for each component of an RNA-seq analysis workflow, easy-to-use bacterially oriented workflow applications to combine multiple tools and automate the process are lacking. With many tools to choose from for each step, the task of identifying a specific tool, adapting the input/output options to the specific use-case, and integrating the tools into a coherent analysis pipeline is not a trivial endeavor, particularly for microbiologists with limited bioinformatics experience. To make bacterial RNA-seq data analysis more accessible, we developed a Simple Program for Automated reference-based bacterial RNA-seq Transcriptome Analysis (SPARTA). SPARTA is a reference-based bacterial RNA-seq analysis workflow application for single-end Illumina reads. SPARTA is turnkey software that simplifies the process of analyzing RNA-seq data sets, making bacterial RNA-seq analysis a routine process that can be undertaken on a personal computer or in the classroom. The easy-to-install, complete workflow processes whole transcriptome shotgun sequencing data files by trimming reads and removing adapters, mapping reads to a reference, counting gene features, calculating differential gene expression, and, importantly, checking for potential batch effects within the data set. SPARTA outputs quality analysis reports, gene feature counts and differential gene expression tables and scatterplots. SPARTA provides an easy-to-use bacterial RNA-seq transcriptional profiling workflow to identify differentially expressed genes between experimental conditions. This software will enable microbiologists with limited bioinformatics experience to analyze their data and integrate next generation sequencing (NGS) technologies into the classroom. The SPARTA software and tutorial are available at sparta.readthedocs.org.

  16. Comprehensive evaluation of AmpliSeq transcriptome, a novel targeted whole transcriptome RNA sequencing methodology for global gene expression analysis.

    PubMed

    Li, Wenli; Turner, Amy; Aggarwal, Praful; Matter, Andrea; Storvick, Erin; Arnett, Donna K; Broeckel, Ulrich

    2015-12-16

    Whole transcriptome sequencing (RNA-seq) represents a powerful approach for whole transcriptome gene expression analysis. However, RNA-seq carries a few limitations, e.g., the requirement of a significant amount of input RNA and complications caused by non-specific mapping of short reads. The Ion AmpliSeq Transcriptome Human Gene Expression Kit (AmpliSeq) was recently introduced by Life Technologies as a whole-transcriptome, targeted gene quantification kit to overcome these limitations of RNA-seq. To assess the performance of this new methodology, we performed a comprehensive comparison of AmpliSeq with RNA-seq using two well-established next-generation sequencing platforms (Illumina HiSeq and Ion Torrent Proton). We analyzed standard reference RNA samples and RNA samples obtained from human induced pluripotent stem cell derived cardiomyocytes (hiPSC-CMs). Using published data from two standard RNA reference samples, we observed a strong concordance of log2 fold change for all genes when comparing AmpliSeq to Illumina HiSeq (Pearson's r = 0.92) and Ion Torrent Proton (Pearson's r = 0.92). We used ROC, Matthews correlation coefficient and RMSD to determine the overall performance characteristics. All three statistical methods demonstrate AmpliSeq as a highly accurate method for differential gene expression analysis. Additionally, for genes with high abundance, AmpliSeq outperforms the two RNA-seq methods. When analyzing four closely related hiPSC-CM lines, we show that both AmpliSeq and RNA-seq capture similar global gene expression patterns consistent with known sources of variation. Our study indicates that AmpliSeq excels in the limiting areas of RNA-seq for gene expression quantification analysis. Thus, AmpliSeq stands as a very sensitive and cost-effective approach for very large scale gene expression analysis and mRNA marker screening with high accuracy.
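
    Two of the performance measures named above have short closed forms, sketched here with standard definitions. The function names and example inputs are hypothetical, and how the authors derived their confusion-matrix counts or paired fold-change values is not described in the abstract.

```python
import math


def matthews_corrcoef(tp, tn, fp, fn):
    """Matthews correlation coefficient from confusion-matrix counts.

    Ranges from -1 (total disagreement) to +1 (perfect agreement);
    returns 0.0 when any marginal total is zero.
    """
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denom == 0:
        return 0.0
    return (tp * tn - fp * fn) / denom


def rmsd(xs, ys):
    """Root-mean-square deviation between two equal-length value lists."""
    if len(xs) != len(ys) or not xs:
        raise ValueError("inputs must be non-empty and of equal length")
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(xs, ys)) / len(xs))


# Hypothetical calls of differentially expressed genes against a reference.
print(matthews_corrcoef(tp=40, tn=50, fp=5, fn=5))
# Hypothetical log2 fold changes from two platforms for the same genes.
print(rmsd([1.0, -2.0, 0.5], [1.2, -1.8, 0.4]))
```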

  17. Powerful Identification of Cis-regulatory SNPs in Human Primary Monocytes Using Allele-Specific Gene Expression

    PubMed Central

    Almlöf, Jonas Carlsson; Lundmark, Per; Lundmark, Anders; Ge, Bing; Maouche, Seraya; Göring, Harald H. H.; Liljedahl, Ulrika; Enström, Camilla; Brocheton, Jessy; Proust, Carole; Godefroy, Tiphaine; Sambrook, Jennifer G.; Jolley, Jennifer; Crisp-Hihn, Abigail; Foad, Nicola; Lloyd-Jones, Heather; Stephens, Jonathan; Gwilliam, Rhian; Rice, Catherine M.; Hengstenberg, Christian; Samani, Nilesh J.; Erdmann, Jeanette; Schunkert, Heribert; Pastinen, Tomi; Deloukas, Panos; Goodall, Alison H.; Ouwehand, Willem H.; Cambien, François; Syvänen, Ann-Christine

    2012-01-01

    A large number of genome-wide association studies have been performed during the past five years to identify associations between SNPs and human complex diseases and traits. The assignment of a functional role for the identified disease-associated SNP is not straight-forward. Genome-wide expression quantitative trait locus (eQTL) analysis is frequently used as the initial step to define a function while allele-specific gene expression (ASE) analysis has not yet gained a wide-spread use in disease mapping studies. We compared the power to identify cis-acting regulatory SNPs (cis-rSNPs) by genome-wide allele-specific gene expression (ASE) analysis with that of traditional expression quantitative trait locus (eQTL) mapping. Our study included 395 healthy blood donors for whom global gene expression profiles in circulating monocytes were determined by Illumina BeadArrays. ASE was assessed in a subset of these monocytes from 188 donors by quantitative genotyping of mRNA using a genome-wide panel of SNP markers. The performance of the two methods for detecting cis-rSNPs was evaluated by comparing associations between SNP genotypes and gene expression levels in sample sets of varying size. We found that up to 8-fold more samples are required for eQTL mapping to reach the same statistical power as that obtained by ASE analysis for the same rSNPs. The performance of ASE is insensitive to SNPs with low minor allele frequencies and detects a larger number of significantly associated rSNPs using the same sample size as eQTL mapping. An unequivocal conclusion from our comparison is that ASE analysis is more sensitive for detecting cis-rSNPs than standard eQTL mapping. Our study shows the potential of ASE mapping in tissue samples and primary cells which are difficult to obtain in large numbers. PMID:23300628

  18. CellTree: an R/bioconductor package to infer the hierarchical structure of cell populations from single-cell RNA-seq data.

    PubMed

    duVerle, David A; Yotsukura, Sohiya; Nomura, Seitaro; Aburatani, Hiroyuki; Tsuda, Koji

    2016-09-13

    Single-cell RNA sequencing is fast becoming one of the standard methods for gene expression measurement, providing unique insights into cellular processes. A number of methods, based on general dimensionality reduction techniques, have been suggested to help infer and visualise the underlying structure of cell populations from single-cell expression levels, yet their models generally lack proper biological grounding and struggle at identifying complex differentiation paths. Here we introduce cellTree: an R/Bioconductor package that uses a novel statistical approach, based on document analysis techniques, to produce tree structures outlining the hierarchical relationship between single-cell samples, while identifying latent groups of genes that can provide biological insights. With cellTree, we provide experimentalists with an easy-to-use tool, based on statistically and biologically sound algorithms, to efficiently explore and visualise single-cell RNA data. The cellTree package is publicly available in the online Bioconductor repository at: http://bioconductor.org/packages/cellTree/ .

  19. Effect of trapidil in myocardial ischemia-reperfusion injury in rabbit.

    PubMed

    Liu, Mingjie; Sun, Qi; Wang, Qiang; Wang, Xiuying; Lin, Peng; Yang, Ming; Yan, Yuanyuan

    2014-01-01

    To evaluate the cardioprotective effects of trapidil on myocardial ischemia-reperfusion injury (MIRI) in rabbits. Rabbits were subjected to 40 min of myocardial ischemia followed by 120 min of reperfusion. Blood levels of superoxide dismutase (SOD) and malondialdehyde (MDA) were estimated. At the end of reperfusion, the rabbits were sacrificed and the hearts were isolated for histological examination. An apoptotic index (AI) was determined using the terminal deoxynucleotidyl transferase (TdT)-mediated dUTP nick-end-labeling (TUNEL) method. The expression of apoptosis-related proteins Bax and Bcl-2 was analyzed using immunohistochemistry. Statistical analyses were performed by one-way analysis of variance (ANOVA), with P < 0.05 considered statistically significant. Trapidil caused a significant (P < 0.05) increase in SOD activity, decreased MDA levels and significantly (P < 0.05) reduced the expression of Bax as compared with the ischemia-reperfusion (IR) control group. Trapidil may attenuate the myocardial damage produced by IR injury and offer potential cardioprotective action.

  20. NETWORK ASSISTED ANALYSIS TO REVEAL THE GENETIC BASIS OF AUTISM

    PubMed Central

    Liu, Li; Lei, Jing; Roeder, Kathryn

    2016-01-01

    While studies show that autism is highly heritable, the nature of the genetic basis of this disorder remains elusive. Based on the idea that highly correlated genes are functionally interrelated and more likely to affect risk, we develop a novel statistical tool to find more potential autism risk genes by combining the genetic association scores with gene co-expression in specific brain regions and periods of development. The gene dependence network is estimated using a novel partial neighborhood selection (PNS) algorithm, where node specific properties are incorporated into network estimation for improved statistical and computational efficiency. Then we adopt a hidden Markov random field (HMRF) model to combine the estimated network and the genetic association scores in a systematic manner. The proposed modeling framework can be naturally extended to incorporate additional structural information concerning the dependence between genes. Using currently available genetic association data from whole exome sequencing studies and brain gene expression levels, the proposed algorithm successfully identified 333 genes that plausibly affect autism risk. PMID:27134692

  1. Expression of cancer-testis antigens MAGE-A4 and MAGE-C1 in oral squamous cell carcinoma.

    PubMed

    Montoro, José Raphael de Moura Campos; Mamede, Rui Celso Martins; Neder Serafini, Luciano; Saggioro, Fabiano Pinto; Figueiredo, David Livingstone Alves; Silva, Wilson Araújo da; Jungbluth, Achim A; Spagnoli, Giulio Cesare; Zago, Marco Antônio

    2012-08-01

    Tumor markers are genes or their products expressed exclusively or preferentially in tumor cells, and cancer-testis antigens (CTAs) form a group of genes with a typical expression pattern expressed in a variety of malignant neoplasms. CTAs are considered potential targets for cancer vaccines. It is possible that the CTAs MAGE-A4 (melanoma antigen) and MAGE-C1 are expressed in carcinoma of the oral cavity and are related to survival. This study involved immunohistochemical analysis of 23 patients with oral squamous cell carcinoma (SCC) and was carried out using antibodies for MAGE-A4 and MAGE-C1. Fisher's exact test and log-rank test were used to evaluate the results. MAGE-A4 and MAGE-C1 were expressed in 56.5% and 47.8% of cases, respectively, without statistically significant differences in the studied variables or survival. At least 1 CTA was expressed in 78.3% of the patients, but without correlation with clinicopathologic variables or survival. Copyright © 2011 Wiley Periodicals, Inc.

  2. Proteome analysis of the fungus Aspergillus carbonarius under ochratoxin A producing conditions.

    PubMed

    Crespo-Sempere, A; Gil, J V; Martínez-Culebras, P V

    2011-06-30

    Aspergillus carbonarius is an important ochratoxin A (OTA) producing fungus that is responsible for mycotoxin contamination of grapes and wine. In this study, the proteomes of highly (W04-40) and weakly (W04-46) OTA-producing A. carbonarius strains were compared to identify proteins that may be involved in OTA biosynthesis. Protein samples were extracted from two biological replicates and subjected to two dimensional gel electrophoresis analysis and mass spectrometry. Expression profile comparison (PDQuest software) revealed 21 differential spots that were statistically significant and showed a two-fold or greater change in expression. Among these, nine protein spots were identified by MALDI-MS/MS and MASCOT database and twelve remain unidentified. Of the identified proteins, seven showed a higher expression in strain W04-40 (high OTA producer) and two in strain W04-46 (low OTA producer). Some of the identified amino acid sequences shared homology with proteins involved in regulation, amino acid metabolism, oxidative stress and sporulation. It is worth noting the presence of a protein with 126.5 fold higher abundance in strain W04-40 showing homology with protein CipC, a protein of unknown function that some authors have related to pathogenesis and mycotoxin production. Variations in protein expression were also further investigated at the mRNA level by real-time PCR analysis. The mRNA expression levels from three identified proteins including CipC showed correlation with protein expression levels. This study represents the first proteomic analysis for a comparison of two A. carbonarius strains with different OTA production and will contribute to a better understanding of the molecular events involved in OTA biosynthesis. Copyright © 2011 Elsevier B.V. All rights reserved.

  3. Co-acting gene networks predict TRAIL responsiveness of tumour cells with high accuracy.

    PubMed

    O'Reilly, Paul; Ortutay, Csaba; Gernon, Grainne; O'Connell, Enda; Seoighe, Cathal; Boyce, Susan; Serrano, Luis; Szegezdi, Eva

    2014-12-19

    Identification of differentially expressed genes from transcriptomic studies is one of the most common mechanisms to identify tumor biomarkers. This approach, however, is not well suited to identifying interactions between genes whose protein products potentially influence each other, which limits its power to identify molecular wiring of tumour cells dictating response to a drug. Because signal transduction pathways are not linear but highly interlinked, the biological response they drive may be better described by the relative amount of their components and their functional relationships than by their individual, absolute expression. Gene expression microarray data for 109 tumor cell lines with known sensitivity to the death ligand cytokine tumor necrosis factor-related apoptosis-inducing ligand (TRAIL) was used to identify genes with potential functional relationships determining responsiveness to TRAIL-induced apoptosis. The machine learning technique Random Forest in the statistical environment "R" with backward elimination was used to identify the key predictors of TRAIL sensitivity and differentially expressed genes were identified using the software GeneSpring. Gene co-regulation and statistical interaction was assessed with q-order partial correlation analysis and non-rejection rate. Biological (functional) interactions amongst the co-acting genes were studied with Ingenuity network analysis. Prediction accuracy was assessed by calculating the area under the receiver operating characteristic curve using an independent dataset. We show that the gene panel identified could predict TRAIL-sensitivity with a very high degree of sensitivity and specificity (AUC = 0.84). The genes in the panel are co-regulated and at least 40% of them functionally interact in signal transduction pathways that regulate cell death and cell survival, cellular differentiation and morphogenesis. Importantly, only 12% of the TRAIL-predictor genes were differentially expressed, highlighting the importance of functional interactions in predicting the biological response. The advantage of co-acting gene clusters is that this analysis does not depend on differential expression and is able to incorporate direct and indirect gene interactions as well as tissue- and cell-specific characteristics. This approach (1) identified a descriptor of TRAIL sensitivity which performs significantly better as a predictor of TRAIL sensitivity than any previously reported gene signatures, (2) identified potential novel regulators of TRAIL-responsiveness and (3) provided a systematic view highlighting fundamental differences between the molecular wiring of sensitive and resistant cell types.

  4. The expression of full length Gp91-phox protein is associated with reduced amphotropic retroviral production.

    PubMed

    Bellantuono, I; Lashford, L S; Rafferty, J A; Fairbairn, L J

    2000-05-01

    As a single gene defect in mature bone marrow cells, chronic granulomatous disease (X-CGD) represents a disorder which may be amenable to gene therapy by the transfer of the missing subunit into hemopoietic stem cells. In the majority of cases lack of Gp91-phox causes the disease. So far, studies involving transfer of Gp91-phox cDNA, including a phase I clinical trial, have yielded disappointing results. Most often, low titers of virus have been reported. In the present study we investigated the possible reasons for low titer amphotropic viral production. To investigate the effect of Gp91 cDNA on the efficiency of retroviral production from the packaging cell line, GP+envAm12, we constructed vectors containing either the native cDNA, truncated versions of the cDNA or a mutated form (LATG) in which the natural translational start codon was changed to a stop codon. Following derivation of clonal packaging cell lines, these were assessed for viral titer by RNA slot blot and analyzed by nonparametric statistical analysis (Mann-Whitney U-test). An improvement in viral titer of just over two-fold was found in packaging cells containing the start-codon mutant of Gp91 and no evidence of truncated viral RNA was seen in these cells. Further analysis revealed the presence of rearranged forms of the provirus in Gp91-expressing cells, and the production of truncated, unpackaged viral RNA. Protein analysis revealed that LATG-transduced cells did not express full-length Gp91-phox, whereas those containing the wild-type cDNA did. However, a truncated protein was seen in ATG-transduced cells which was also present in wild type cells. No evidence for the presence of a negative transcriptional regulatory element was found from studies with the deletion mutants. A statistically significant effect of protein production on the production of virus from Gp91-expressing cells was found. Our data point to a need to restrict expression of the Gp91-phox protein and its derivatives in order to enhance retroviral production and suggest that improvements in current vectors for CGD gene therapy may need to include controlled, directed expression only in mature neutrophils.
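
    The Mann-Whitney U-test named in the abstract above can be illustrated with a short sketch. This is not the authors' analysis: the function name `mann_whitney_u`, the normal approximation without tie correction, and the titer values are all assumptions chosen for illustration, and the approximation is crude for groups this small.

```python
import math

def mann_whitney_u(x, y):
    """Return (U for sample x, two-sided p-value).

    The p-value uses the normal approximation with no tie correction,
    which is an assumption of this sketch and is rough for small samples.
    """
    nx, ny = len(x), len(y)
    # U counts the (x, y) pairs in which the x value is larger; ties count 0.5.
    u_x = sum(1.0 if a > b else 0.5 if a == b else 0.0 for a in x for b in y)
    mean_u = nx * ny / 2.0
    sd_u = math.sqrt(nx * ny * (nx + ny + 1) / 12.0)
    z = (u_x - mean_u) / sd_u
    p_two_sided = math.erfc(abs(z) / math.sqrt(2.0))
    return u_x, p_two_sided

# Hypothetical relative titers for two groups of clonal packaging lines.
mutant_clones = [5.1, 6.3, 7.0, 8.2]
wild_type_clones = [1.9, 2.4, 3.3, 4.8]

u_stat, p_value = mann_whitney_u(mutant_clones, wild_type_clones)
print(u_stat, p_value)
```

    Because the test works on ranks (here, pairwise comparisons) rather than on the raw values, it needs no assumption that titers are normally distributed, which is the usual reason for choosing it with small clonal sample sets.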

  5. Prognostic Value of NME1 (NM23-H1) in Patients with Digestive System Neoplasms: A Systematic Review and Meta-Analysis.

    PubMed

    Han, Wei; Shi, Chun-Tao; Cao, Fei-Yun; Cao, Fang; Chen, Min-Bin; Lu, Rong-Zhu; Wang, Hua-Bing; Yu, Min; He, Da-Wei; Wang, Qing-Hua; Wang, Jie-Feng; Xu, Xuan-Xuan; Ding, Hou-Zhong

    2016-01-01

    There is a heated debate on whether the prognostic value of NME1 is favorable or unfavorable. Thus, we carried out a meta-analysis to evaluate the relationship between NME1 expression and the prognosis of patients with digestive system neoplasms. We searched PubMed, EMBASE and Web of Science for relevant articles. The pooled odds ratios (ORs) and corresponding 95%CI were calculated to evaluate the prognostic value of NME1 expression in patients with digestive system neoplasms, and the association between NME1 expression and clinicopathological factors. We also performed subgroup analyses to find the source of heterogeneity. A total of 2904 patients were pooled from 28 available studies. Neither the pooled OR from 17 studies for overall survival (OR = 0.65, 95%CI:0.41-1.03, P = 0.07) nor the pooled OR for disease-free survival (OR = 0.75, 95%CI:0.17-3.36, P = 0.71) was statistically significant. Although we found no significant association with TNM stage (OR = 0.78, 95%CI:0.44-1.36, P = 0.38), elevated NME1 expression was related to good tumor differentiation (OR = 0.59, 95%CI:0.47-0.73, P<0.00001), negative N status (OR = 0.54, 95%CI:0.36-0.82, P = 0.003) and Dukes' stage (OR = 0.43, 95%CI:0.24-0.77, P = 0.004). In the subgroup analyses, only "years" was found to be a possible source of heterogeneity for overall survival in gastric cancer. The results showed a statistically significant association between NME1 expression and the tumor differentiation, N status and Dukes' stage of patients with digestive system cancers, while no significance was found for overall survival, disease-free survival or TNM stage. Further research should be conducted to reveal the prognostic value of NME1.
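
    The odds ratios, confidence intervals and P values quoted in the abstract above can be cross-checked against one another. The sketch below assumes the intervals are symmetric Wald intervals on the log-odds scale, which is a common convention but is not stated in the abstract; the function name `p_from_or_ci` is hypothetical. Because the published figures are rounded, the recovered P values should only be expected to land close to the reported ones.

```python
import math

def p_from_or_ci(odds_ratio, ci_low, ci_high, z_crit=1.959964):
    """Two-sided p-value implied by an odds ratio and its 95% CI.

    Assumes a Wald interval on the log scale:
        log(OR) +/- z_crit * SE
    so SE can be recovered from the interval width.
    """
    se = (math.log(ci_high) - math.log(ci_low)) / (2.0 * z_crit)
    z = math.log(odds_ratio) / se
    return math.erfc(abs(z) / math.sqrt(2.0))

# Values as reported in the abstract (overall survival, N status).
p_overall_survival = p_from_or_ci(0.65, 0.41, 1.03)
p_n_status = p_from_or_ci(0.54, 0.36, 0.82)

print(round(p_overall_survival, 3), round(p_n_status, 4))
```

    Under these assumptions the overall-survival interval, which just crosses 1, yields a P value slightly above 0.05, while the N-status interval, which excludes 1, yields one well below it. This matches the pattern of significance described in the abstract.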

  6. [Effect of tongluo xingnao effervescent tablet on learning and memory of AD rats and expression of insulin-degrading enzyme in hippocampus].

    PubMed

    Zhang, Yin-Jie; Dai, Yuan; Hu, Yong; Ma, Yun-Tong; Xu, Shi-Jun; Wang, Yong-Yan

    2013-09-01

    To study the effect of Tongluo Xingnao effervescent tablet on learning and memory of dementia rats induced by injection of Aβ25-35 in hippocampus and on expression of insulin-degrading enzyme in hippocampus, in order to provide a basis for preventing and treating senile dementia. The dementia rat model was established by injecting Aβ25-35 in hippocampus. The rats were divided into the model control group, the Aricept (1.4 mg/kg) group, and Tongluo Xingnao effervescent tablet high dose (7.56 g/kg), middle dose (3.78 g/kg) and low dose (1.59 g/kg) groups. A sham operation group was established by injecting normal saline in hippocampus. The rats were orally given drugs for 90 days, once a day. Their learning and memory were tested by using the Morris water maze. Immunohistochemistry and image analysis were utilized for a quantitative analysis of the expression of insulin-degrading enzyme in hippocampus. Compared with the model control group, Tongluo Xingnao effervescent tablet significantly shortened the escape latency of rats in the directional navigation test, prolonged the retention time in the first quadrant, decreased the retention time in the third quadrant, and increased the frequency of crossing the platform (P < 0.05). Additionally, it markedly increased the average optical density of insulin-degrading enzyme in hippocampus, promoting the expression of insulin-degrading enzyme in hippocampus, compared with the model control group (P < 0.05). Tongluo Xingnao effervescent tablet has the effects of improving learning and memory capacity of AD rats and promoting the expression of insulin-degrading enzyme in hippocampus. Its effect in promoting intelligence may be related to increased insulin-degrading enzyme in hippocampus.

  7. Differential expression of CD10 in prostate cancer and its clinical implication

    PubMed Central

    Dall'Era, Marc A; True, Lawrence D; Siegel, Andrew F; Porter, Michael P; Sherertz, Tracy M; Liu, Alvin Y

    2007-01-01

    Background: CD10 is a transmembrane metallo-endopeptidase that cleaves and inactivates a variety of peptide growth factors. Loss of CD10 expression is a common, early event in human prostate cancer; however, CD10 positive cancer cells frequently appear in lymph node metastasis. We hypothesize that prostate tumors expressing high levels of CD10 have a more aggressive biology with an early propensity towards lymph node metastasis. Methods: Eighty-seven patients, 53 with and 34 without pathologically organ confined prostate cancer at the time of radical prostatectomy (RP), were used for the study. Fourteen patients with lymph node metastasis found at the time of surgery were identified and included in this study. Serial sections from available frozen tumor specimens in OCT were processed for CD10 immunohistochemistry. Cancer glands were graded for the presence and intensity of CD10 staining, and overall percentage of glands staining positive was estimated. Clinical characteristics including pre- and post-operative PSA and Gleason score were obtained. A similar study as a control for the statistical analysis was performed with CD13 staining. For statistical analysis, strong staining was defined as > 20% positivity based on the observed maximum separation of the cumulative distributions. Results: CD10 expression significantly correlated with Gleason grade, tumor stage, and with pre-operative serum PSA. Seventy percent of RP specimens from patients with node metastasis showed strong staining for CD10, compared to 30% in the entire cohort (OR = 3.4, 95% CI: 1.08–10.75, P = 0.019). Increased staining for CD10 was associated with PSA recurrence after RP. CD13 staining did not correlate significantly with any of these same clinical parameters. Conclusion: These results suggest that the expression of CD10 by prostate cancer corresponds to a more aggressive phenotype with a higher malignant potential, described histologically by the Gleason score. CD10 offers potential clinical utility for stratifying prostate cancer to predict biological behavior of the tumor. PMID:17335564

  8. Cell cloning-based transcriptome analysis in Rett patients: relevance to the pathogenesis of Rett syndrome of new human MeCP2 target genes.

    PubMed

    Nectoux, J; Fichou, Y; Rosas-Vargas, H; Cagnard, N; Bahi-Buisson, N; Nusbaum, P; Letourneur, F; Chelly, J; Bienvenu, T

    2010-07-01

    More than 90% of Rett syndrome (RTT) patients have heterozygous mutations in the X-linked methyl-CpG binding protein 2 (MECP2) gene that encodes the methyl-CpG-binding protein 2, a transcriptional modulator. Because MECP2 is subjected to X chromosome inactivation (XCI), girls with RTT express either the wild-type or the mutant allele in each individual cell. To test the consequences of MECP2 mutations resulting from a genome-wide transcriptional dysregulation and to identify its target genes in a system that circumvents the functional mosaicism resulting from XCI, we carried out gene expression profiling of clonal populations derived from fibroblast primary cultures expressing exclusively either the wild-type or the mutant MECP2 allele. Clonal cultures were obtained from skin biopsy of three RTT patients carrying either a non-sense or a frameshift MECP2 mutation. For each patient, gene expression profiles of wild-type and mutant clones were compared by oligonucleotide expression microarray analysis. Firstly, clustering analysis classified the RTT patients according to their genetic background and MECP2 mutation. Secondly, expression profiling by microarray analysis and quantitative RT-PCR indicated four up-regulated genes and five down-regulated genes significantly dysregulated in all our statistical analyses, including excellent potential candidate genes for the understanding of the pathophysiology of this neurodevelopmental disease. Thirdly, chromatin immunoprecipitation analysis confirmed MeCP2 binding to respective CpG islands in three out of four up-regulated candidate genes and sequencing of bisulphite-converted DNA indicated that MeCP2 preferentially binds to methylated-DNA sequences. Most importantly, the finding that at least two of these genes (BMCC1 and RNF182) were shown to be involved in cell survival and/or apoptosis may suggest that impaired MeCP2 function could alter the survival of neurons thus compromising brain function without inducing cell death.

  9. Clinical significance of nm23 gene expression in gastric cancer.

    PubMed

    Mönig, Stefan P; Nolden, Brit; Lübke, Thomas; Pohl, Alexandra; Grass, Guido; Schneider, Paul M; Dienes, Hans P; Hölscher, Arnulf H; Baldus, Stephan E

    2007-01-01

    The expression of the nm23 gene has been associated with the development of metastasis. Numerous studies have shown down-regulation of nm23 expression in metastatic breast and colon cancer. The expression of the putative metastasis-suppressor gene nm23 in gastric carcinoma is controversial. The aim of this study was the analysis of nm23 expression in a large series of gastric cancer patients. In a retrospective immunohistochemical study, specimens obtained from 116 gastric cancer patients (mean age 64 years; range: 33-85) who had undergone gastrectomy with extended lymphadenectomy were analyzed. Nm23 expression in the tumor epithelium was studied by immunohistochemistry followed by a semi-quantitative (score 0-3) evaluation. Statistical analyses, including the chi-square test and uni- and multivariate survival analyses, were performed. The nm23 staining pattern was positive (score 2-3) in 100 (86.2%) specimens and negative (score 0-1) in 16 (13.8%) samples. Lymph node metastasis was found in 65% of the patients. No significant correlations could be determined between nm23 expression and other variables such as gender, age, tumor differentiation, WHO-, Laurén-, Goseki-, or Ming-classification. The intensity of nm23 staining in the tumor cells was not significantly correlated with depth of tumor infiltration (T-stage), lymph node metastasis (N-stage), distant metastasis (M-stage), UICC-stage, or prognosis. Our series did not show a correlation of nm23 expression with lymph node metastasis, distant metastasis or prognosis in gastric cancer patients.

  10. Real-time quantitative polymerase chain reaction analysis of patients with refractory chronic periodontitis.

    PubMed

    Marconcini, Simone; Covani, Ugo; Barone, Antonio; Vittorio, Orazio; Curcio, Michele; Barbuti, Serena; Scatena, Fabrizio; Felli, Lamberto; Nicolini, Claudio

    2011-07-01

    Periodontitis is a complex multifactorial disease and is typically polygenic in origin. Genes play a fundamental part in each biologic process forming complex networks of interactions. However, only some genes have a high number of interactions with other genes in the network and may, therefore, be considered to play an important role. In a preliminary bioinformatic analysis, five genes that showed a higher number of interactions were identified and termed leader genes. In the present study, we use real-time quantitative polymerase chain reaction (PCR) technology to evaluate the expression levels of leader genes in the leukocytes of 10 patients with refractory chronic periodontitis and compare the expression levels with those of the same genes in 24 healthy patients. Blood was collected from 24 healthy human subjects and 10 patients with refractory chronic periodontitis and placed into heparinized blood collection tubes by personnel trained in phlebotomy using a sterile technique. Blood leukocyte cells were immediately lysed by using a kit for total RNA purification from human whole blood. Complementary DNA (cDNA) was synthesized from total RNA and then real-time quantitative PCR was performed. PCR efficiencies were calculated with a relative standard curve derived from a five cDNA dilution series in triplicate that gave regression coefficients >0.98 and efficiencies >96%. The standard curves were obtained using glyceraldehyde-3-phosphate dehydrogenase (GAPDH) and growth factor receptor binding protein 2 (GRB2), casitas B-lineage lymphoma (CBL), nuclear factor-κB1 (NFKB1), and REL-A (gene for transcription factor p65) gene primers and amplified with 1.6, 8, 40, 200, and 1,000 ng/μL total cDNA. Curves obtained for each sample showed a linear relationship between RNA concentrations and the cycle threshold value of real-time quantitative PCR for all genes. Data were expressed as mean ± SE (SEM). The groups were compared using analysis of variance. A probability value <0.01 was considered statistically significant. The present study agrees with the preliminary bioinformatics analysis. In our experiments, the association of pathology with the genes was statistically significant for GRB2 and CBL (P <0.01), and it was not statistically significant for REL-A and NFKB1. This article lends support to our preliminary hypothesis that assigned an important role in refractory aggressive periodontitis to leader genes.
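
    The standard-curve step described in the abstract above can be sketched as a small calculation. The five input concentrations are the ones listed in the abstract; the cycle threshold (Ct) values are hypothetical, and the relation efficiency = 10^(-1/slope) - 1 is the commonly used convention, which the abstract does not explicitly name. The helper names `fit_line` and `amplification_efficiency` are likewise assumptions of this sketch.

```python
import math

def fit_line(xs, ys):
    """Ordinary least-squares fit; returns (slope, intercept, r_squared)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    r_squared = (sxy * sxy) / (sxx * syy)
    return slope, intercept, r_squared

def amplification_efficiency(slope):
    """Efficiency from the slope of Ct versus log10(input amount).

    A slope of about -3.32 corresponds to perfect doubling per cycle (1.0).
    """
    return 10.0 ** (-1.0 / slope) - 1.0

# Dilution series from the abstract (ng/uL total cDNA), a 5-fold series.
concentrations = [1.6, 8.0, 40.0, 200.0, 1000.0]
# Hypothetical mean Ct values for one primer pair (illustration only).
ct_values = [30.1, 27.7, 25.4, 23.0, 20.6]

log_conc = [math.log10(c) for c in concentrations]
slope, intercept, r_squared = fit_line(log_conc, ct_values)
efficiency = amplification_efficiency(slope)
print(slope, r_squared, efficiency)
```

    With these made-up Ct values the fit is nearly perfectly linear and the efficiency falls in the high-90% range, the kind of result the abstract's acceptance thresholds (regression coefficient >0.98, efficiency >96%) are meant to screen for.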

  11. Differential Proteomic Analysis of Noncardia Gastric Cancer from Individuals of Northern Brazil

    PubMed Central

    Leal, Mariana Ferreira; Chung, Janete; Calcagno, Danielle Queiroz; Assumpção, Paulo Pimentel; Demachki, Samia; da Silva, Ismael Dale Cotrim Guerreiro; Chammas, Roger; Burbano, Rommel Rodríguez; de Arruda Cardoso Smith, Marília

    2012-01-01

    Gastric cancer is the second leading cause of cancer-related death worldwide. The identification of new cancer biomarkers is necessary to reduce the mortality rates through the development of new screening assays and early diagnosis, as well as new target therapies. In this study, we performed a proteomic analysis of noncardia gastric neoplasias of individuals from Northern Brazil. The proteins were analyzed by two-dimensional electrophoresis and mass spectrometry. For the identification of differentially expressed proteins, we used statistical tests with bootstrapping resampling to control the type I error in the multiple comparison analyses. We identified 111 proteins involved in gastric carcinogenesis. The computational analysis revealed several proteins involved in the energy production processes and reinforced the Warburg effect in gastric cancer. ENO1 and HSPB1 expression were further evaluated. ENO1 was selected due to its role in aerobic glycolysis that may contribute to the Warburg effect. Although we observed two up-regulated spots of ENO1 in the proteomic analysis, the mean expression of ENO1 was reduced in gastric tumors by western blot. However, mean ENO1 expression seems to increase in more invasive tumors. This lack of correlation between proteomic and western blot analyses may be due to the presence of other ENO1 spots that present a slightly reduced expression, but with a high impact in the mean protein expression. In neoplasias, HSPB1 is induced by cellular stress to protect cells against apoptosis. In the present study, HSPB1 presented an elevated protein and mRNA expression in a subset of gastric cancer samples. However, no association was observed between HSPB1 expression and clinicopathological characteristics. Here, we identified several possible biomarkers of gastric cancer in individuals from Northern Brazil. These biomarkers may be useful for the assessment of prognosis and stratification for therapy if validated in larger clinical study sets. PMID:22860099

  12. Efficiently Identifying Significant Associations in Genome-wide Association Studies

    PubMed Central

    Eskin, Eleazar

    2013-01-01

    Over the past several years, genome-wide association studies (GWAS) have implicated hundreds of genes in common disease. More recently, the GWAS approach has been utilized to identify regions of the genome that harbor variation affecting gene expression, or expression quantitative trait loci (eQTLs). Unlike GWAS applied to clinical traits, where only a handful of phenotypes are analyzed per study, in eQTL studies, tens of thousands of gene expression levels are measured, and the GWAS approach is applied to each gene expression level. This leads to computing billions of statistical tests and requires substantial computational resources, particularly when applying novel statistical methods such as mixed models. We introduce a novel two-stage testing procedure that identifies all of the significant associations more efficiently than testing all the single nucleotide polymorphisms (SNPs). In the first stage, a small number of informative SNPs, or proxies, across the genome are tested. Based on their observed associations, our approach locates the regions that may contain significant SNPs and only tests additional SNPs from those regions. We show through simulations and analysis of real GWAS datasets that the proposed two-stage procedure increases the computational speed by a factor of 10. Additionally, efficient implementation of our software increases the computational speed relative to the state-of-the-art testing approaches by a factor of 75. PMID:24033261

  13. Accounting for mudflow genesis in preliminary assessment of the maximum volume of solid mudflow sediments in the North Caucasus

    NASA Astrophysics Data System (ADS)

    Zalikhanov, M. Ch.; Kondratieva, N. V.; Adzhiev, A. Kh.; Razumov, V. V.

    2016-09-01

    The area of investigation was subject to multifactor analysis of the relationship between the maximum amount of mudflow solid sediments (W) and parameters such as the mudflow basin area (S), average channel slope (α), and mudflow channel length (L). They were used to obtain analytical expressions in order to approximate the W(S, L, α) relation based on the mudflow genesis and source height. Statistical data on mudflow manifestations in different basins in the North Caucasus covering more than fifty years were used to obtain the analytical expressions in order to assess the maximum volume of mudflow solid sediments.

  14. [Arf6, RalA and BIRC5 protein expression in non small cell lung cancer].

    PubMed

    Knizhnik, A V; Kovaleva, O B; Laktionov, K K; Mochal'nikova, V V; Komel'kov, A V; Chevkina, E M; Zborovskaia, I B

    2011-01-01

    Evaluation of tumor marker expression patterns that determine individual progression parameters is one of the major topics in molecular oncopathology research. This work presents an expression analysis of several Ras-Ral associated signal transduction pathway proteins (Arf6, RalA and BIRC5) in relation to clinical criteria in non small cell lung cancer patients. Using Western blot analysis and RT-PCR, Arf6, RalA and BIRC5 expression was analyzed in parallel in 53 non small cell lung cancer samples of different origin. Arf6 protein expression was elevated in 55% of non small cell lung cancer tumor samples in comparison with normal tissue. In the group of squamous cell lung cancer, Arf6 expression elevation was observed more often. RalA protein expression was decreased in comparison to normal tissue samples in 64% of non small cell lung cancer samples regardless of morphological structure. A correlation between decreased RalA protein expression and absence of regional metastases was revealed for squamous cell lung cancer. BIRC5 protein expression in tumor samples versus corresponding normal tissue was 1.3 times more often elevated in the squamous cell lung cancer group (in 76% of tumor samples). At the same time, elevation of BIRC5 expression was observed in only 63% of adenocarcinoma tumor samples. A statistically significant decrease (p = 0.0158) of RalA protein expression and increase (p = 0.0498) of Arf6 protein expression in comparison with normal tissue were found for the T1-2N0M0 and T1-2N1-2M0 groups of squamous cell lung cancer, respectively.

  15. Expression level and clinical significance of IL-2, IL-6 and TGF-β in elderly patients with goiter and hyperthyroidism.

    PubMed

    Lv, L-F; Jia, H-Y; Zhang, H-F; Hu, Y-X

    2017-10-01

    To investigate the level of expression and the clinical significance of IL-2 (interleukin-2), IL-6 (interleukin-6) and TGF-β (transforming growth factor-β) in elderly patients with goiter and hyperthyroidism. Gender, age, course of disease, BMI (Body Mass Index), serum FT3 (free triiodothyronine), FT4 (free thyroxine), TT3 (total triiodothyronine), TT4 (total thyroxine), TSH (Thyroid Stimulating Hormone) and clinical manifestations on admission and other general clinical data and laboratory examination results were collected and statistically analyzed as case group in 128 elderly patients with goiter and hyperthyroidism. An additional 128 patients over 60 years old with hyperthyroidism were selected as control group. The thyroid tissue of these patients and the control group were examined by fine needle aspiration biopsy. The expressions of IL-2, IL-6, TGF-β of the thyroid tissue in all patients were detected by immunohistochemistry, qRT-PCR (Real-time Quantitative Polymerase Chain Reaction) and Western blot method respectively, and the statistical analysis was carried out. p < 0.05 indicated that the difference had statistical significance. Compared with the control group, the expressions of IL-2, IL-6 and TGF-β in the group of patients were significantly higher (p < 0.05). The significantly higher expression of IL-2, IL-6, and TGF-β was mainly concentrated in the thyroid follicular cells of patients with hyperthyroidism and thyroid enlargement (p < 0.05). In the patients with goiter, hyperthyroidism, and symptoms of exophthalmos, the level of expression of IL-6 was significantly higher than that of patients without exophthalmos (p < 0.05). Between the patients with goiter, hyperthyroidism and symptoms of exophthalmos and the patients with goiter and hyperthyroidism without symptoms of exophthalmos, IL-2 and TGF-β expression levels were not different (p > 0.05). The expression levels of IL-2, IL-6, and TGF-β were significantly increased in the patients with senile goiter and hyperthyroidism, but in the senile patients with goiter, hyperthyroidism and exophthalmos symptoms, IL-6 levels were significantly higher than those without exophthalmos. The use of IL-2, IL-6, and TGF-β is of great significance in the diagnosis of goiter with hyperthyroidism, especially for elderly patients with atypical clinical symptoms of hyperthyroidism.

  16. Whole-Genome Analysis of the SHORT-ROOT Developmental Pathway in Arabidopsis

    PubMed Central

    Busch, Wolfgang; Cui, Hongchang; Wang, Jean Y; Blilou, Ikram; Hassan, Hala; Nakajima, Keiji; Matsumoto, Noritaka; Lohmann, Jan U; Scheres, Ben

    2006-01-01

    Stem cell function during organogenesis is a key issue in developmental biology. The transcription factor SHORT-ROOT (SHR) is a critical component in a developmental pathway regulating both the specification of the root stem cell niche and the differentiation potential of a subset of stem cells in the Arabidopsis root. To obtain a comprehensive view of the SHR pathway, we used a statistical method called meta-analysis to combine the results of several microarray experiments measuring the changes in global expression profiles after modulating SHR activity. Meta-analysis was first used to identify the direct targets of SHR by combining results from an inducible form of SHR driven by its endogenous promoter, ectopic expression, followed by cell sorting and comparisons of mutant to wild-type roots. Eight putative direct targets of SHR were identified, all with expression patterns encompassing subsets of the native SHR expression domain. Further evidence for direct regulation by SHR came from binding of SHR in vivo to the promoter regions of four of the eight putative targets. A new role for SHR in the vascular cylinder was predicted from the expression pattern of several direct targets and confirmed with independent markers. The meta-analysis approach was then used to perform a global survey of the SHR indirect targets. Our analysis suggests that the SHR pathway regulates root development not only through a large transcription regulatory network but also through hormonal pathways and signaling pathways using receptor-like kinases. Taken together, our results not only identify the first nodes in the SHR pathway and a new function for SHR in the development of the vascular tissue but also reveal the global architecture of this developmental pathway. PMID:16640459

  17. Biochemical and molecular characterization of thyroid tissue by micro-Raman spectroscopy and gene expression analysis

    NASA Astrophysics Data System (ADS)

    Neto, Lázaro P. M.; Martin, Aírton A.; Soto, Claudio A. T.; Santos, André B. O.; Mello, Evandro S.; Pereira, Marina A.; Cernea, Cláudio R.; Brandão, Lenine G.; Canevari, Renata A.

    2016-02-01

    Thyroid carcinomas represent the main endocrine malignancy, and their diagnosis may produce inconclusive results. Raman spectroscopy and gene expression analysis have shown excellent results in the differentiation of carcinomas. This study aimed to improve the discrimination between different thyroid pathologies by combining both analyses. A total of 35 thyroid tissue samples, including normal tissue (n=10), goiter (n=10), papillary carcinomas (n=10) and follicular carcinomas (n=5), were analyzed. Confocal Raman spectra were obtained using a Rivers Diagnostic System with 785 nm laser excitation and a CCD detector. The data were processed with the software Labspec5 and Origin 8.5 and analyzed with the Minitab® program. The gene expression analysis was performed by the qRT-PCR technique for the TG, TPO, PDGFB, SERPINA1, LGALS3 and TFF3 genes and statistically analyzed by the Mann-Whitney test. Confocal Raman spectroscopy allowed a maximum discrimination of 91.1% between normal and tumor tissues, 84.8% between benign and malignant pathologies and 84.6% among the carcinomas analyzed. Significant differences were observed for the TG, LGALS3, SERPINA1 and TFF3 genes between benign lesions and carcinomas, and for the SERPINA1 and TFF3 genes between papillary and follicular carcinomas. Principal component analysis was performed using PC1 and PC2 in the papillary carcinoma samples that showed gene overexpression when compared with normal samples, where 90% discrimination was observed at the Amide 1 band (1655 cm-1) and in the tyrosine spectral region (856 cm-1). The discrimination of thyroid tissues achieved by confocal Raman spectroscopy and gene expression analysis indicates that these techniques are promising tools for the diagnosis of thyroid lesions.

  18. Computerized measurement of facial expression of emotions in schizophrenia.

    PubMed

    Alvino, Christopher; Kohler, Christian; Barrett, Frederick; Gur, Raquel E; Gur, Ruben C; Verma, Ragini

    2007-07-30

    Deficits in the ability to express emotions characterize several neuropsychiatric disorders and are a hallmark of schizophrenia, and there is need for a method of quantifying expression, which is currently done by clinical ratings. This paper presents the development and validation of a computational framework for quantifying emotional expression differences between patients with schizophrenia and healthy controls. Each face is modeled as a combination of elastic regions, and expression changes are modeled as a deformation between a neutral face and an expressive face. Functions of these deformations, known as the regional volumetric difference (RVD) functions, form distinctive quantitative profiles of expressions. Employing pattern classification techniques, we have designed expression classifiers for the four universal emotions of happiness, sadness, anger and fear by training on RVD functions of expression changes. The classifiers were cross-validated and then applied to facial expression images of patients with schizophrenia and healthy controls. The classification score for each image reflects the extent to which the expressed emotion matches the intended emotion. Group-wise statistical analysis revealed this score to be significantly different between healthy controls and patients, especially in the case of anger. This score correlated with clinical severity of flat affect. These results encourage the use of such deformation based expression quantification measures for research in clinical applications that require the automated measurement of facial affect.

  19. Relationship of the Interaction Between Two Quantitative Trait Loci with γ-Globin Expression in β-Thalassemia Intermedia Patients.

    PubMed

    NickAria, Shiva; Haghpanah, Sezaneh; Ramzi, Mani; Karimi, Mehran

    2018-05-10

    Globin switching is a significant factor in blood hemoglobin (Hb) level, but its molecular mechanisms have not yet been identified. However, several quantitative trait loci (QTL) and polymorphisms involving regions on chromosomes 2p, 6q, 8q and X account for variation in the γ-globin expression level. We studied the effect of the interaction between a region in intron six of the TOX gene on chromosome 8q (chr8q) and the XmnI locus on the γ-globin promoter (chr11p) on γ-globin expression in 150 β-thalassemia intermedia (β-TI) patients, evaluated by statistical interaction analysis. Our results showed a significant interaction between one QTL in intron six of the TOX gene (rs9693712) and the XmnI locus that affects γ-globin expression. Interchromosomal interaction is mediated through transcriptional mechanisms to preserve true genome architectural features, chromosome localization and DNA bending. This interaction may be part of the unknown molecular mechanism of globin switching and regulation of gene expression.

  20. Analysis of baseline gene expression levels from ...

    EPA Pesticide Factsheets

    The use of gene expression profiling to predict chemical mode of action would be enhanced by better characterization of variance due to individual, environmental, and technical factors. Meta-analysis of microarray data from untreated or vehicle-treated animals within the control arm of toxicogenomics studies has yielded useful information on baseline fluctuations in gene expression. A dataset of control animal microarray expression data was assembled by a working group of the Health and Environmental Sciences Institute's Technical Committee on the Application of Genomics in Mechanism Based Risk Assessment in order to provide a public resource for assessments of variability in baseline gene expression. Data from over 500 Affymetrix microarrays from control rat liver and kidney were collected from 16 different institutions. Thirty-five biological and technical factors were obtained for each animal, describing a wide range of study characteristics, and a subset were evaluated in detail for their contribution to total variability using multivariate statistical and graphical techniques. The study factors that emerged as key sources of variability included gender, organ section, strain, and fasting state. These and other study factors were identified as key descriptors that should be included in the minimal information about a toxicogenomics study needed for interpretation of results by an independent source. Genes that are the most and least variable, gender-selectiv

  1. Model-based reconstruction of synthetic promoter library in Corynebacterium glutamicum.

    PubMed

    Zhang, Shuanghong; Liu, Dingyu; Mao, Zhitao; Mao, Yufeng; Ma, Hongwu; Chen, Tao; Zhao, Xueming; Wang, Zhiwen

    2018-05-01

    To develop an efficient synthetic promoter library for fine-tuned expression of target genes in Corynebacterium glutamicum. A synthetic promoter library for C. glutamicum was developed based on conserved sequences of the -10 and -35 regions. The synthetic promoter library covered a wide range of strengths, from 1 to 193% of the tac promoter. Sixty-eight promoters were selected and sequenced for correlation analysis between promoter sequence and strength with a statistical model. A new promoter library was then reconstructed with improved promoter strength and coverage based on the results of the correlation analysis. Tandem promoter P70 was finally constructed with increased strength by 121% over the tac promoter. The promoter library developed in this study showed great potential for applications in metabolic engineering and synthetic biology for the optimization of metabolic networks. To the best of our knowledge, this is the first reconstruction of a synthetic promoter library based on statistical analysis of C. glutamicum.

  2. Interpretation of correlations in clinical research.

    PubMed

    Hung, Man; Bounsanga, Jerry; Voss, Maren Wright

    2017-11-01

    Critically analyzing research is a key skill in evidence-based practice and requires knowledge of research methods, results interpretation, and applications, all of which rely on a foundation based in statistics. Evidence-based practice makes high demands on trained medical professionals to interpret an ever-expanding array of research evidence. As clinical training emphasizes medical care rather than statistics, it is useful to review the basics of statistical methods and what they mean for interpreting clinical studies. We reviewed the basic concepts of correlational associations, violations of normality, unobserved variable bias, sample size, and alpha inflation. The foundations of causal inference were discussed and sound statistical analyses were examined. We discuss four ways in which correlational analysis is misused, including causal inference overreach, over-reliance on significance, alpha inflation, and sample size bias. Recent published studies in the medical field provide evidence of causal assertion overreach drawn from correlational findings. The findings present a primer on the assumptions and nature of correlational methods of analysis and urge clinicians to exercise appropriate caution as they critically analyze the evidence before them and evaluate evidence that supports practice. Critically analyzing new evidence requires statistical knowledge in addition to clinical knowledge. Studies can overstate relationships, expressing causal assertions when only correlational evidence is available. Failure to account for the effect of sample size in the analyses tends to overstate the importance of predictive variables. It is important not to overemphasize the statistical significance without consideration of effect size and whether differences could be considered clinically meaningful.
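
    The alpha inflation mentioned in the abstract above can be made concrete with a small worked example. The sketch below assumes the tests are independent, which is a simplification, and uses the Bonferroni adjustment as one common correction; the function names are illustrative and do not come from the article.

```python
def familywise_error_rate(alpha: float, n_tests: int) -> float:
    """Chance of at least one false positive across n independent tests."""
    return 1.0 - (1.0 - alpha) ** n_tests

def bonferroni_threshold(alpha: float, n_tests: int) -> float:
    """Per-test significance threshold that keeps the familywise rate <= alpha."""
    return alpha / n_tests

if __name__ == "__main__":
    for m in (1, 5, 20):
        fwer = familywise_error_rate(0.05, m)
        print(f"{m:>2} tests: familywise error rate = {fwer:.3f}, "
              f"Bonferroni threshold = {bonferroni_threshold(0.05, m):.4f}")
```

    Under these assumptions, running 20 correlation tests at a nominal 0.05 level gives a better-than-even chance of at least one spurious "significant" result.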

  3. Statistical inference for time course RNA-Seq data using a negative binomial mixed-effect model.

    PubMed

    Sun, Xiaoxiao; Dalpiaz, David; Wu, Di; S Liu, Jun; Zhong, Wenxuan; Ma, Ping

    2016-08-26

    Accurate identification of differentially expressed (DE) genes in time course RNA-Seq data is crucial for understanding the dynamics of transcriptional regulatory network. However, most of the available methods treat gene expressions at different time points as replicates and test the significance of the mean expression difference between treatments or conditions irrespective of time. They thus fail to identify many DE genes with different profiles across time. In this article, we propose a negative binomial mixed-effect model (NBMM) to identify DE genes in time course RNA-Seq data. In the NBMM, mean gene expression is characterized by a fixed effect, and time dependency is described by random effects. The NBMM is very flexible and can be fitted to both unreplicated and replicated time course RNA-Seq data via a penalized likelihood method. By comparing gene expression profiles over time, we further classify the DE genes into two subtypes to enhance the understanding of expression dynamics. A significance test for detecting DE genes is derived using a Kullback-Leibler distance ratio. Additionally, a significance test for gene sets is developed using a gene set score. Simulation analysis shows that the NBMM outperforms currently available methods for detecting DE genes and gene sets. Moreover, our real data analysis of fruit fly developmental time course RNA-Seq data demonstrates the NBMM identifies biologically relevant genes which are well justified by gene ontology analysis. The proposed method is powerful and efficient to detect biologically relevant DE genes and gene sets in time course RNA-Seq data.
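
    To illustrate why a negative binomial distribution is used for RNA-Seq counts, the sketch below shows its mean-variance relationship under the common parameterization var = mu + phi * mu^2, together with a crude method-of-moments dispersion estimate. This is an illustration only: it is not the penalized-likelihood fit or the mixed-effect structure of the NBMM described above, and the function names and counts are hypothetical.

```python
from statistics import mean, variance

def nb_variance(mu: float, dispersion: float) -> float:
    """Negative binomial variance; dispersion = 0 recovers the Poisson case."""
    return mu + dispersion * mu ** 2

def moment_dispersion(counts) -> float:
    """Method-of-moments dispersion estimate from replicate counts.

    Solves var = mu + phi * mu^2 for phi and clips at zero when the
    sample shows no overdispersion relative to a Poisson model.
    """
    m = mean(counts)
    if m <= 0:
        raise ValueError("mean count must be positive")
    v = variance(counts)  # unbiased sample variance
    return max((v - m) / m ** 2, 0.0)

# Hypothetical replicate read counts for one gene at one time point.
counts = [80, 95, 130, 150, 60]
phi = moment_dispersion(counts)
print(f"estimated dispersion: {phi:.3f}")
print(f"implied variance at the sample mean: {nb_variance(mean(counts), phi):.1f}")
```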

  4. Gene set differential analysis of time course expression profiles via sparse estimation in functional logistic model with application to time-dependent biomarker detection.

    PubMed

    Kayano, Mitsunori; Matsui, Hidetoshi; Yamaguchi, Rui; Imoto, Seiya; Miyano, Satoru

    2016-04-01

    High-throughput time course expression profiles have been available in the last decade due to developments in measurement techniques and devices. Functional data analysis, which treats smoothed curves instead of originally observed discrete data, is effective for the time course expression profiles in terms of dimension reduction, robustness, and applicability to data measured at small and irregularly spaced time points. However, the statistical method of differential analysis for time course expression profiles has not been well established. We propose a functional logistic model based on elastic net regularization (F-Logistic) in order to identify the genes with dynamic alterations in case/control study. We employ a mixed model as a smoothing method to obtain functional data; then F-Logistic is applied to time course profiles measured at small and irregularly spaced time points. We evaluate the performance of F-Logistic in comparison with another functional data approach, i.e. functional ANOVA test (F-ANOVA), by applying the methods to real and synthetic time course data sets. The real data sets consist of the time course gene expression profiles for long-term effects of recombinant interferon β on disease progression in multiple sclerosis. F-Logistic distinguishes dynamic alterations, which cannot be found by competitive approaches such as F-ANOVA, in case/control study based on time course expression profiles. F-Logistic is effective for time-dependent biomarker detection, diagnosis, and therapy. © The Author 2015. Published by Oxford University Press. All rights reserved. For permissions, please e-mail: journals.permissions@oup.com.

  5. Novel molecular subtypes of serous and endometrioid ovarian cancer linked to clinical outcome.

    PubMed

    Tothill, Richard W; Tinker, Anna V; George, Joshy; Brown, Robert; Fox, Stephen B; Lade, Stephen; Johnson, Daryl S; Trivett, Melanie K; Etemadmoghadam, Dariush; Locandro, Bianca; Traficante, Nadia; Fereday, Sian; Hung, Jillian A; Chiew, Yoke-Eng; Haviv, Izhak; Gertig, Dorota; DeFazio, Anna; Bowtell, David D L

    2008-08-15

    The study aimed to identify novel molecular subtypes of ovarian cancer by gene expression profiling with linkage to clinical and pathologic features. Microarray gene expression profiling was done on 285 serous and endometrioid tumors of the ovary, peritoneum, and fallopian tube. K-means clustering was applied to identify robust molecular subtypes. Statistical analysis identified differentially expressed genes, pathways, and gene ontologies. Laser capture microdissection, pathology review, and immunohistochemistry validated the array-based findings. Patient survival within k-means groups was evaluated using Cox proportional hazards models. Class prediction validated k-means groups in an independent dataset. A semisupervised survival analysis of the array data was used to compare against unsupervised clustering results. Optimal clustering of array data identified six molecular subtypes. Two subtypes represented predominantly serous low malignant potential and low-grade endometrioid subtypes, respectively. The remaining four subtypes represented higher grade and advanced stage cancers of serous and endometrioid morphology. A novel subtype of high-grade serous cancers reflected a mesenchymal cell type, characterized by overexpression of N-cadherin and P-cadherin and low expression of differentiation markers, including CA125 and MUC1. A poor prognosis subtype was defined by a reactive stroma gene expression signature, correlating with extensive desmoplasia in such samples. A similar poor prognosis signature could be found using a semisupervised analysis. Each subtype displayed distinct levels and patterns of immune cell infiltration. Class prediction identified similar subtypes in an independent ovarian dataset with similar prognostic trends. Gene expression profiling identified molecular subtypes of ovarian cancer of biological and clinical importance.

  6. Comparative study of torque expression among active and passive self-ligating and conventional brackets

    PubMed Central

    Franco, Érika Mendonça Fernandes; Valarelli, Fabrício Pinelli; Fernandes, João Batista; Cançado, Rodrigo Hermont; de Freitas, Karina Maria Salvatore

    2015-01-01

    Objective: The aim of this study was to compare torque expression in active and passive self-ligating and conventional brackets. Methods: A total of 300 segments of stainless steel wire 0.019 x 0.025-in and six different brands of brackets (Damon 3MX, Portia, In-Ovation R, Bioquick, Roth SLI and Roth Max) were used. Torque moments were measured at 12°, 24°, 36° and 48°, using a wire torsion device associated with a universal testing machine. The data obtained were compared by analysis of variance followed by Tukey test for multiple comparisons. Regression analysis was performed by the least-squares method to generate the mathematical equation of the optimal curve for each brand of bracket. Results: Statistically significant differences were observed in the expression of torque among all evaluated bracket brands in all evaluated torsions (p < 0.05). It was found that Bioquick presented the lowest torque expression in all tested torsions; in contrast, Damon 3MX bracket presented the highest torque expression up to 36° torsion. Conclusions: The connection system between wire/bracket (active, passive self-ligating or conventional with elastic ligature) does not seem to interfere with the final torque expression, the latter being probably dependent on the interaction between the wire and the bracket chosen for orthodontic mechanics. PMID:26691972

  7. Gene Expression Analysis Of Circulating Hormone Refractory Prostate Cancer Micrometastases

    DTIC Science & Technology

    2011-02-01

    of prostate cancer. We hypothesized that the copy number changes could be prognostic and aid in future chemotherapy regimen selection. After...Task 1 will be analyzed over the next year to elicit statistically meaningful prognostic DNA based biomarkers. Two of the patients (#8 and #13) had...HRPC), and to determine whether CECs can be used to predict survival in these patients. PATIENTS AND METHODS Several prognostic models that

  8. Computer architecture evaluation for structural dynamics computations: Project summary

    NASA Technical Reports Server (NTRS)

    Standley, Hilda M.

    1989-01-01

    The intent of the proposed effort is the examination of the impact of the elements of parallel architectures on the performance realized in a parallel computation. To this end, three major projects are developed: a language for the expression of high level parallelism, a statistical technique for the synthesis of multicomputer interconnection networks based upon performance prediction, and a queueing model for the analysis of shared memory hierarchies.

  9. Comparative Statistical Analysis of Auroral Models

    DTIC Science & Technology

    2012-03-22

    was willing to add this project to her extremely busy schedule. Lastly, I must also express my sincere appreciation for the rest of the faculty and...models have been extensively used for estimating GPS and other communication satellite disturbances ( Newell et al., 2010a). The auroral oval...models predict changes in the auroral oval in response to various geomagnetic conditions. In 2010, Newell et al. conducted a comparative study of

  10. Correlation between STK33 and the pathology and prognosis of lung cancer

    PubMed Central

    Lu, Yi; Tang, Jie; Zhang, Wenmei; Shen, Ce; Xu, Ling; Yang, Danrong

    2017-01-01

    Correlation between the expression of STK33 and the pathology of lung cancer was investigated, to explore its effects on prognosis. One hundred and two lung cancer patients diagnosed by pathological examinations were randomly selected in Shanghai Jiao Tong University Affiliated Sixth People's Hospital from February 2012 to February 2017 to serve as the observation group, and the tumor tissues were collected. At the same time, 19 patients with benign lung lesions were selected and lung tissues were also collected to serve as the control group. RT-qPCR was used to detect the expression of STK33 mRNA in tissues. Expression levels of STK33 protein were detected and compared by SP immunohistochemistry staining and western blot analysis. Statistical analysis was performed to analyze the correlation between STK33 expression and the pathology and prognosis of lung cancer. Results of PCR showed that the expression level of the STK33 gene in the control group was significantly lower than that in the observation group (p<0.05). The expression level of STK33 mRNA in lung adenocarcinoma and squamous cell carcinoma was lower than that in lung small cell carcinoma and large cell carcinoma (p<0.05). Western blot analysis showed that the expression level of STK33 protein in lung small cell carcinoma and large cell carcinoma was significantly higher than that in lung adenocarcinoma and squamous cell carcinoma (p<0.05). Immunohistochemistry staining showed that the positive rate of STK33 in lung large cell carcinoma (100%) and small cell carcinoma (100%) was significantly higher than that in lung adenocarcinoma (88.1%) and squamous cell carcinoma (86.2%) (p<0.05). The 5-year survival rate analysis showed that the recurrence-free survival rate and overall survival rate of the STK33 high expression group were significantly lower than those of the low expression group (p<0.05). The differential expression level of STK33 is related to the pathology and prognosis of lung cancer, which is of great value in clinical diagnosis and prognosis evaluation. PMID:29085482

  11. A comprehensive analysis on preservation patterns of gene co-expression networks during Alzheimer's disease progression.

    PubMed

    Ray, Sumanta; Hossain, Sk Md Mosaddek; Khatun, Lutfunnesa; Mukhopadhyay, Anirban

    2017-12-20

    Alzheimer's disease (AD) is a chronic neurodegenerative disorder of the brain that involves large-scale transcriptomic variation. The disease does not affect every region of the brain at the same time; instead, it progresses slowly, involving somewhat sequential interaction with different regions. Analysis of the expression patterns of genes in the different brain regions affected in AD should contribute to a better understanding of AD pathogenesis and shed light on the early characterization of the disease. Here, we propose a framework to identify perturbation and preservation characteristics of gene expression patterns across six distinct regions of the brain ("EC", "HIP", "PC", "MTG", "SFG", and "VCX") affected in AD. Co-expression modules were discovered by considering a pair of regions at a time. These were then analyzed to determine their preservation and perturbation characteristics. Different module preservation statistics and a rank aggregation mechanism were adopted to detect changes in expression patterns across brain regions. Gene ontology (GO) and pathway-based analyses were also carried out to determine the biological meaning of preserved and perturbed modules. In this article, we have extensively studied the preservation patterns of co-expressed modules in six distinct brain regions affected in AD. Some modules emerged as the most preserved, while others were detected as perturbed between a pair of brain regions. Further investigation of the topological properties of preserved and non-preserved modules reveals a substantial association between the "betweenness centrality" and "degree" of the involved genes. Our findings may provide a deeper understanding of the preservation characteristics of gene expression patterns in discrete brain regions affected by AD.

  12. The autophagy-related marker LC3 can predict prognosis in human hepatocellular carcinoma.

    PubMed

    Lee, Yoo Jin; Hah, Yu Jin; Ha, Yu Jin; Kang, Yu Na; Kang, Koo Jeong; Hwang, Jae Seok; Chung, Woo Jin; Cho, Kwang Bum; Park, Kyung Sik; Kim, Eun Soo; Seo, Hye-Young; Kim, Mi-Kyung; Park, Keun-Gyu; Jang, Byoung Kuk

    2013-01-01

    Defects of autophagy and endoplasmic reticulum (ER) stress are related to many diseases and tumors. However, only a few studies have examined hepatocellular carcinoma (HCC) as related to these processes. Therefore, in this study, we investigated the expression and extent of autophagy and ER stress-related markers in HCC and their influence on clinical characteristics and prognosis for each protein. The expression of autophagy-related markers (LC3 and Beclin-1) and ER stress-related markers (GRP78 and CHOP) was analyzed by immunohistochemistry on tissues from completely resected specimens of 190 HCC patients. Their influence on clinicopathologic features and prognosis were evaluated using the chi-square test and Kaplan-Meier analysis. Correlations of each protein were determined by Spearman's correlation analysis. LC3 expression was not correlated with TNM, BCLC stage, or Edmonson-Steiner grading, whereas it was correlated with longer overall survival (OS) (p = 0.039) and tended to be related with longer time to recurrence (TTR) (p=0.068) although it did not show statistical significance. Multivariate analysis indicated that LC3 expression was a significantly independent prognostic factor of OS (HR, 0.42; 95% CI, 0.22-0.80; p-value=0.009) and TTR (HR, 0.54; 95% CI, 0.33-0.90; p=0.017). Expression of LC3 in advanced stages of TNM (III) (p=0.045) and Edmonson-Steiner Grades (III and IV) (p=0.043) was correlated with longer survival, but not in the early stages. A positive correlation was not observed between the expression of autophagy-related markers and ER stress-related markers. Our results suggest that the expression and extent of LC3 might be a strong prognostic factor of HCC, especially in patients with surgical resection.

  13. Cross-study projections of genomic biomarkers: an evaluation in cancer genomics.

    PubMed

    Lucas, Joseph E; Carvalho, Carlos M; Chen, Julia Ling-Yu; Chi, Jen-Tsan; West, Mike

    2009-01-01

    Human disease studies using DNA microarrays in both clinical/observational and experimental/controlled studies are having increasing impact on our understanding of the complexity of human diseases. A fundamental concept is the use of gene expression as a "common currency" that links the results of in vitro controlled experiments to in vivo observational human studies. Many studies--in cancer and other diseases--have shown promise in using in vitro cell manipulations to improve understanding of in vivo biology, but experiments often simply fail to reflect the enormous phenotypic variation seen in human diseases. We address this with a framework and methods to dissect, enhance and extend the in vivo utility of in vitro derived gene expression signatures. From an experimentally defined gene expression signature we use statistical factor analysis to generate multiple quantitative factors in human cancer gene expression data. These factors retain their relationship to the original, one-dimensional in vitro signature but better describe the diversity of in vivo biology. In a breast cancer analysis, we show that factors can reflect fundamentally different biological processes linked to molecular and clinical features of human cancers, and that in combination they can improve prediction of clinical outcomes.

  14. An empirical likelihood ratio test robust to individual heterogeneity for differential expression analysis of RNA-seq.

    PubMed

    Xu, Maoqi; Chen, Liang

    2018-01-01

    Individual sample heterogeneity is one of the biggest obstacles in biomarker identification for complex diseases such as cancers. Current statistical models to identify differentially expressed genes between disease and control groups often overlook the substantial human sample heterogeneity. Meanwhile, traditional nonparametric tests lose detailed data information and sacrifice analysis power, although they are distribution free and robust to heterogeneity. Here, we propose an empirical likelihood ratio test with a mean-variance relationship constraint (ELTSeq) for the differential expression analysis of RNA sequencing (RNA-seq). As a distribution-free nonparametric model, ELTSeq handles individual heterogeneity by estimating an empirical probability for each observation without making any assumption about read-count distribution. It also incorporates a constraint for the read-count overdispersion, which is widely observed in RNA-seq data. ELTSeq demonstrates a significant improvement over existing methods such as edgeR, DESeq, t-tests, Wilcoxon tests and the classic empirical likelihood-ratio test when handling heterogeneous groups. It will significantly advance the transcriptomics studies of cancers and other complex diseases. © The Author 2016. Published by Oxford University Press. All rights reserved. For Permissions, please email: journals.permissions@oup.com.

  15. The grapevine expression atlas reveals a deep transcriptome shift driving the entire plant into a maturation program.

    PubMed

    Fasoli, Marianna; Dal Santo, Silvia; Zenoni, Sara; Tornielli, Giovanni Battista; Farina, Lorenzo; Zamboni, Anita; Porceddu, Andrea; Venturini, Luca; Bicego, Manuele; Murino, Vittorio; Ferrarini, Alberto; Delledonne, Massimo; Pezzotti, Mario

    2012-09-01

    We developed a genome-wide transcriptomic atlas of grapevine (Vitis vinifera) based on 54 samples representing green and woody tissues and organs at different developmental stages as well as specialized tissues such as pollen and senescent leaves. Together, these samples expressed ∼91% of the predicted grapevine genes. Pollen and senescent leaves had unique transcriptomes reflecting their specialized functions and physiological status. However, microarray and RNA-seq analysis grouped all the other samples into two major classes based on maturity rather than organ identity, namely, the vegetative/green and mature/woody categories. This division represents a fundamental transcriptomic reprogramming during the maturation process and was highlighted by three statistical approaches identifying the transcriptional relationships among samples (correlation analysis), putative biomarkers (O2PLS-DA approach), and sets of strongly and consistently expressed genes that define groups (topics) of similar samples (biclustering analysis). Gene coexpression analysis indicated that the mature/woody developmental program results from the reiterative coactivation of pathways that are largely inactive in vegetative/green tissues, often involving the coregulation of clusters of neighboring genes and global regulation based on codon preference. This global transcriptomic reprogramming during maturation has not been observed in herbaceous annual species and may be a defining characteristic of perennial woody plants.

  16. Analysis of microarray leukemia data using an efficient MapReduce-based K-nearest-neighbor classifier.

    PubMed

    Kumar, Mukesh; Rath, Nitish Kumar; Rath, Santanu Kumar

    2016-04-01

    Microarray-based gene expression profiling has emerged as an efficient technique for classification, prognosis, diagnosis, and treatment of cancer. Frequent changes in the behavior of this disease generate an enormous volume of data. Microarray data satisfies both the veracity and velocity properties of big data, as it keeps changing with time. Therefore, the analysis of microarray datasets in a small amount of time is essential. They often contain a large amount of expression data, but only a fraction of it comprises genes that are significantly expressed. The precise identification of the genes of interest that are responsible for causing cancer is imperative in microarray data analysis. Most existing schemes employ a two-phase process such as feature selection/extraction followed by classification. In this paper, various statistical methods (tests) based on MapReduce are proposed for selecting relevant features. After feature selection, a MapReduce-based K-nearest neighbor (mrKNN) classifier is also employed to classify microarray data. These algorithms are successfully implemented in a Hadoop framework. A comparative analysis is done on these MapReduce-based models using microarray datasets of various dimensions. From the obtained results, it is observed that these models consume much less execution time than conventional models in processing big data. Copyright © 2016 Elsevier Inc. All rights reserved.
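
    The general shape of a MapReduce-style K-nearest-neighbor classifier can be sketched in a few lines. The code below is an in-memory illustration in plain Python, not the authors' Hadoop implementation: each "map" call finds the k nearest labelled samples within one data partition, and the "reduce" step merges those candidates and takes a majority vote. All function names, the Euclidean distance, and the toy data are assumptions made for the example.

```python
import heapq
from collections import Counter
from math import dist  # Euclidean distance, Python 3.8+

def map_partition(partition, query, k):
    """Map step: the k nearest (distance, label) pairs within one partition."""
    return heapq.nsmallest(k, ((dist(x, query), label) for x, label in partition))

def reduce_neighbors(partial_results, k):
    """Reduce step: merge per-partition candidates, keep the global k nearest, vote."""
    merged = heapq.nsmallest(k, (pair for part in partial_results for pair in part))
    votes = Counter(label for _, label in merged)
    return votes.most_common(1)[0][0]

def mr_knn_predict(partitions, query, k=3):
    """Run the map step on every partition, then reduce to one predicted label."""
    return reduce_neighbors([map_partition(p, query, k) for p in partitions], k)

# Toy expression profiles (two features per sample) split across two partitions.
partitions = [
    [((0.0, 0.0), "A"), ((0.1, 0.2), "A")],
    [((5.0, 5.0), "B"), ((5.1, 4.9), "B"), ((0.2, 0.1), "A")],
]
print(mr_knn_predict(partitions, (0.0, 0.1), k=3))
```

    Because each partition only needs to return its own k best candidates, the global k nearest neighbors are always among the merged results, which is what lets the map step run independently on each partition.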

  17. Exploring root symbiotic programs in the model legume Medicago truncatula using EST analysis.

    PubMed

    Journet, Etienne-Pascal; van Tuinen, Diederik; Gouzy, Jérome; Crespeau, Hervé; Carreau, Véronique; Farmer, Mary-Jo; Niebel, Andreas; Schiex, Thomas; Jaillon, Olivier; Chatagnier, Odile; Godiard, Laurence; Micheli, Fabienne; Kahn, Daniel; Gianinazzi-Pearson, Vivienne; Gamas, Pascal

    2002-12-15

    We report on a large-scale expressed sequence tag (EST) sequencing and analysis program aimed at characterizing the sets of genes expressed in roots of the model legume Medicago truncatula during interactions with either of two microsymbionts, the nitrogen-fixing bacterium Sinorhizobium meliloti or the arbuscular mycorrhizal fungus Glomus intraradices. We have designed specific tools for in silico analysis of EST data, in relation to chimeric cDNA detection, EST clustering, encoded protein prediction, and detection of differential expression. Our 21 473 5'- and 3'-ESTs could be grouped into 6359 EST clusters, corresponding to distinct virtual genes, along with 52 498 other M.truncatula ESTs available in the dbEST (NCBI) database that were recruited in the process. These clusters were manually annotated, using a specifically developed annotation interface. Analysis of EST cluster distribution in various M.truncatula cDNA libraries, supported by a refined R test to evaluate statistical significance and by 'electronic northern' representation, enabled us to identify a large number of novel genes predicted to be up- or down-regulated during either symbiotic root interaction. These in silico analyses provide a first global view of the genetic programs for root symbioses in M.truncatula. A searchable database has been built and can be accessed through a public interface.

  18. Transcription Analysis of the Myometrium of Labouring and Non-Labouring Women

    PubMed Central

    Hutchinson, James L.; Hibbert, Nanette; Freeman, Tom C.; Saunders, Philippa T. K.; Norman, Jane E.

    2016-01-01

    An incomplete understanding of the molecular mechanisms that initiate normal human labour at term seriously hampers the development of effective ways to predict, prevent and treat disorders such as preterm labour. Appropriate analysis of large microarray experiments that compare gene expression in non-labouring and labouring gestational tissues is necessary to help bridge these gaps in our knowledge. In this work, gene expression in 48 (22 labouring, 26 non-labouring) lower-segment myometrial samples collected at Caesarean section were analysed using Illumina HT-12 v4.0 BeadChips. Normalised data were compared between labouring and non-labouring groups using traditional statistical methods and a novel network graph approach. We sought technical validation with quantitative real-time PCR, and biological replication through inverse variance-weighted meta-analysis with published microarray data. We have extended the list of genes suggested to be associated with labour: Compared to non-labouring samples, labouring samples showed apparent higher expression at 960 probes (949 genes) and apparent lower expression at 801 probes (789 genes) (absolute fold change ≥1.2, rank product percentage of false positive value (RP-PFP) <0.05). Although half of the women in the labouring group had received pharmaceutical treatment to induce or augment labour, sensitivity analysis suggested that this did not confound our results. In agreement with previous studies, functional analysis suggested that labour was characterised by an increase in the expression of inflammatory genes and network analysis suggested a strong neutrophil signature. Our analysis also suggested that labour is characterised by a decrease in the expression of muscle-specific processes, which has not been explicitly discussed previously. 
    We validated these findings through the first formal meta-analysis of raw data from previous experiments and we hypothesise that this represents a change in the composition of myometrial tissue at labour. Further work will be necessary to reveal whether these results are solely due to leukocyte infiltration into the myometrium as a mechanism initiating labour, or in addition whether they also represent gene changes in the myocytes themselves. We have made all our data available at www.ebi.ac.uk/arrayexpress/ (accession number E-MTAB-3136) to facilitate progression of this work. PMID:27176052

  19. In vivo characterization of a reporter gene system for imaging hypoxia-induced gene expression.

    PubMed

    Carlin, Sean; Pugachev, Andrei; Sun, Xiaorong; Burke, Sean; Claus, Filip; O'Donoghue, Joseph; Ling, C Clifton; Humm, John L

    2009-10-01

    To characterize a tumor model containing a hypoxia-inducible reporter gene and to demonstrate utility by comparison of reporter gene expression to the uptake and distribution of the hypoxia tracer (18)F-fluoromisonidazole ((18)F-FMISO). Three tumors derived from the rat prostate cancer cell line R3327-AT were grown in each of two rats as follows: (1) parental R3327-AT, (2) positive control R3327-AT/PC in which the HSV1-tkeGFP fusion reporter gene was expressed constitutively, (3) R3327-AT/HRE in which the reporter gene was placed under the control of a hypoxia-inducible factor-responsive promoter sequence (HRE). Animals were coadministered a hypoxia-specific marker (pimonidazole) and the reporter gene probe (124)I-2'-fluoro-2'-deoxy-1-beta-d-arabinofuranosyl-5-iodouracil ((124)I-FIAU) 3 h prior to sacrifice. Statistical analysis of the spatial association between (124)I-FIAU uptake and pimonidazole fluorescent staining intensity was then performed on a pixel-by-pixel basis. Utility of this system was demonstrated by assessment of reporter gene expression versus the exogenous hypoxia probe (18)F-FMISO. Two rats, each bearing a single R3327-AT/HRE tumor, were injected with (124)I-FIAU (3 h before sacrifice) and (18)F-FMISO (2 h before sacrifice). Statistical analysis of the spatial association between (18)F-FMISO and (124)I-FIAU on a pixel-by-pixel basis was performed. Correlation coefficients between (124)I-FIAU uptake and pimonidazole staining intensity were: 0.11 in R3327-AT tumors, -0.66 in R3327-AT/PC and 0.76 in R3327-AT/HRE, confirming that only in the R3327-AT/HRE tumor was HSV1-tkeGFP gene expression associated with hypoxia. Correlation coefficients between (18)F-FMISO and (124)I-FIAU uptakes in R3327-AT/HRE tumors were r=0.56, demonstrating good spatial correspondence between the two tracers. 
    We have confirmed hypoxia-specific expression of the HSV1-tkeGFP fusion gene in the R3327-AT/HRE tumor model and demonstrated the utility of this model for the evaluation of radiolabeled hypoxia tracers.

  20. Bioinformatics approaches for cross-species liver cancer analysis based on microarray gene expression profiling

    PubMed Central

    Fang, H; Tong, W; Perkins, R; Shi, L; Hong, H; Cao, X; Xie, Q; Yim, SH; Ward, JM; Pitot, HC; Dragan, YP

    2005-01-01

    Background The completion of the sequencing of human, mouse and rat genomes and knowledge of cross-species gene homologies enables studies of differential gene expression in animal models. These types of studies have the potential to greatly enhance our understanding of diseases such as liver cancer in humans. Genes co-expressed across multiple species are most likely to have conserved functions. We have used various bioinformatics approaches to examine microarray expression profiles from liver neoplasms that arise in albumin-SV40 transgenic rats to elucidate genes, chromosome aberrations and pathways that might be associated with human liver cancer. Results In this study, we first identified 2223 differentially expressed genes by comparing gene expression profiles for two control, two adenoma and two carcinoma samples using an F-test. These genes were subsequently mapped to the rat chromosomes using a novel visualization tool, the Chromosome Plot. Using the same plot, we further mapped the significant genes to orthologous chromosomal locations in human and mouse. Many genes expressed in rat 1q that are amplified in rat liver cancer map to the human chromosomes 10, 11 and 19 and to the mouse chromosomes 7, 17 and 19, which have been implicated in studies of human and mouse liver cancer. Using Comparative Genomics Microarray Analysis (CGMA), we identified regions of potential aberrations in human. Lastly, a pathway analysis was conducted to predict altered human pathways based on statistical analysis and extrapolation from the rat data. All of the identified pathways have been known to be important in the etiology of human liver cancer, including cell cycle control, cell growth and differentiation, apoptosis, transcriptional regulation, and protein metabolism. 
    Conclusion The study demonstrates that the hepatic gene expression profiles from the albumin-SV40 transgenic rat model revealed genes, pathways and chromosome alterations consistent with experimental and clinical research in human liver cancer. The bioinformatics tools presented in this paper are essential for cross species extrapolation and mapping of microarray data, its analysis and interpretation. PMID:16026603

  1. Anger and depression levels of mothers with premature infants in the neonatal intensive care unit.

    PubMed

    Kardaşözdemir, Funda; AKGüN Şahin, Zümrüt

    2016-02-04

    The aim of this study was to examine anger and depression levels of mothers who had a premature infant in the NICU, and all factors affecting the situation. This descriptive study was performed in the level I and II units of NICU at three state hospitals in Turkey. The data were collected with a demographic questionnaire, the "Beck Depression Inventory" and the "Anger Expression Scale". Descriptive statistics, parametric and nonparametric statistical tests and Pearson correlation were used in the data analysis. Mothers whose infants are under care in NICU have moderate depression. It has also been determined that mothers' educational level, income level and gender of infants were statistically significant (p <0.05). A positive relationship between depression and trait anger scores was found to be statistically significant. A negative relationship existed between depression and anger-control scores for the mothers, which was statistically significant (p <0.05). Based on these results, it is recommended that mothers who are at risk of depression and anger in the NICU be evaluated by nurses and that these nurses develop their consulting roles.

  2. TUSC2(FUS1)-erlotinib Induced Vulnerabilities in Epidermal Growth Factor Receptor(EGFR) Wildtype Non-small Cell Lung Cancer(NSCLC) Targeted by the Repurposed Drug Auranofin.

    PubMed

    Xiaobo, Cao; Majidi, Mourad; Feng, Meng; Shao, Ruping; Wang, Jing; Zhao, Yang; Baladandayuthapani, Veerabhadran; Song, Juhee; Fang, Bingliang; Ji, Lin; Mehran, Reza; Roth, Jack A

    2016-11-15

    Expression of the TUSC2/FUS1 tumor suppressor gene in TUSC2 deficient EGFR wildtype lung cancer cells increased sensitivity to erlotinib. Microarray mRNA expression analysis of TUSC2 inducible lung cancer cells treated with erlotinib uncovered defects in the response to oxidative stress suggesting that increasing reactive oxygen species (ROS) would enhance therapeutic efficacy. Addition of the thioredoxin reductase 1 (TXNRD1) inhibitor auranofin (AF) to NSCLC cells treated with a combination of TUSC2 forced expression and erlotinib increased tumor cell apoptosis and inhibited colony formation. TXNRD1 overexpression rescued tumors from AF-TUSC2-erlotinib induced apoptosis. Neutralizing ROS with nordihydroguaiaretic acid (NDGA) abrogated cell death induced by AF-TUSC2-erlotinib, indicating a regulatory role for ROS in the efficacy of the three-drug combination. Isobologram-based statistical analysis of this combination demonstrated superior synergism, compared with each individual treatment at lower concentrations. In NSCLC tumor xenografts, tumor growth was markedly inhibited and animal survival was prolonged over controls by AF-TUSC2-erlotinib. Microarray mRNA expression analysis uncovered oxidative stress and DNA damage gene signatures significantly upregulated by AF-TUSC2-erlotinib compared to TUSC2-erlotinib. Pathway analysis showed the highest positive z-score for the NRF2-mediated oxidative stress response. Taken together these findings show that the combination of TUSC2-erlotinib induces additional novel vulnerabilities that can be targeted with AF.

  3. Solar activity prediction

    NASA Technical Reports Server (NTRS)

    Slutz, R. J.; Gray, T. B.; West, M. L.; Stewart, F. G.; Leftin, M.

    1971-01-01

    A statistical study of formulas for predicting the sunspot number several years in advance is reported. By using a data lineup with cycle maxima coinciding, and by using multiple and nonlinear predictors, a new formula which gives better error estimates than former formulas derived from the work of McNish and Lincoln is obtained. A statistical analysis is conducted to determine which of several mathematical expressions best describes the relationship between 10.7 cm solar flux and Zurich sunspot numbers. Attention is given to the autocorrelation of the observations, and confidence intervals for the derived relationships are presented. The accuracy of predicting a value of 10.7 cm solar flux from a predicted sunspot number is discussed.

  4. Expected values and variances of Bragg peak intensities measured in a nanocrystalline powder diffraction experiment

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Öztürk, Hande; Noyan, I. Cevdet

    A rigorous study of sampling and intensity statistics applicable for a powder diffraction experiment as a function of crystallite size is presented. Our analysis yields approximate equations for the expected value, variance and standard deviations for both the number of diffracting grains and the corresponding diffracted intensity for a given Bragg peak. The classical formalism published in 1948 by Alexander, Klug & Kummer [J. Appl. Phys.(1948),19, 742–753] appears as a special case, limited to large crystallite sizes, here. It is observed that both the Lorentz probability expression and the statistics equations used in the classical formalism are inapplicable for nanocrystalline powder samples.

  5. Expected values and variances of Bragg peak intensities measured in a nanocrystalline powder diffraction experiment

    DOE PAGES

    Öztürk, Hande; Noyan, I. Cevdet

    2017-08-24

    A rigorous study of sampling and intensity statistics applicable for a powder diffraction experiment as a function of crystallite size is presented. Our analysis yields approximate equations for the expected value, variance and standard deviations for both the number of diffracting grains and the corresponding diffracted intensity for a given Bragg peak. The classical formalism published in 1948 by Alexander, Klug & Kummer [J. Appl. Phys.(1948),19, 742–753] appears as a special case, limited to large crystallite sizes, here. It is observed that both the Lorentz probability expression and the statistics equations used in the classical formalism are inapplicable for nanocrystalline powder samples.

  6. Role of RANK and Akt1 activation in human osteosarcoma progression: A clinicopathological study.

    PubMed

    Zhu, Jianxi; Liu, Yuwei; Zhu, Yong; Zeng, Min; Xie, Jie; Lei, Pengfei; Li, Kanghua; Hu, Yihe

    2017-06-01

    The receptor activator of nuclear factor κB (RANK) axis is the fundamental signaling pathway in bone formation as well as bone tumor pathophysiology. The aim of the present study was to evaluate the impact of the expression of RANK and its downstream signaling molecule Akt1 on tumor progression in patients with osteosarcoma. Expression of RANK and Akt1 was examined in 78 human osteosarcoma samples by immunohistochemistry using formalin-fixed samples. Following this, each graded immunohistochemistry result was correlated with clinicopathological parameters and patient survival. In total, 60 osteosarcomas (76.9%) expressed RANK and 58 cases (74.4%) showed expression of Akt1. In addition, expression of RANK was negatively correlated with disease-free survival by Kaplan-Meier analysis. Resistance to chemotherapy was observed in RANK-expressing cases, which was statistically significant (P<0.05). In addition, chemotherapy and staging of the tumor were found to be independent factors that have an effect on patient survival (P<0.05). Thus, RANK was identified as a negative prognostic factor of osteosarcoma survival.

  7. Clinicopathological significance of SLP-2 overexpression in human gallbladder cancer.

    PubMed

    Wang, Wei-Xin; Lin, Qing-Feng; Shen, Dong; Liu, Shao-Ping; Mao, Wei-Dong; Ma, Gui; Qi, Wei-Dong

    2014-01-01

    Several studies have identified overexpression of stomatin-like protein 2 (SLP-2) in several types of cancer. However, its role and clinical relevance in gallbladder cancer (GBC) is unknown. The purpose of this study was to reveal the prognostic significance of SLP-2 in GBC. The SLP-2 expression was examined at mRNA and protein levels by real-time quantitative polymerase chain reaction (qRT-PCR) and immunohistochemistry in GBC tissues and adjacent noncancerous tissues. Statistical analyses were applied to test the associations between SLP-2 expression, clinicopathologic factors, and prognosis. Immunohistochemistry and qRT-PCR showed that the protein and mRNA expression levels of SLP-2 were both significantly higher in GBC tissues than in adjacent noncancerous tissues. In addition, immunohistochemistry analysis showed that SLP-2 expression was significantly correlated with histological grade (P <0.001), pathologic T stage (P = 0.019), clinical stage (P = 0.001), and lymph node metastasis (P = 0.026). The Kaplan-Meier survival curves indicated that patients with high expression of SLP-2 had shorter overall survival than those with low expression (P <0.001). Meanwhile, the Cox multivariate analysis indicated that high expression of SLP-2 was an independent prognostic factor for patients with GBC. These data showed that SLP-2 may play an important role in human GBC tumorigenesis, and SLP-2 might serve as a novel prognostic marker in human GBC.

  8. Connective tissue growth factor immunohistochemical expression is associated with gallbladder cancer progression.

    PubMed

    Garcia, Patricia; Leal, Pamela; Alvarez, Hector; Brebi, Priscilla; Ili, Carmen; Tapia, Oscar; Roa, Juan C

    2013-02-01

    Gallbladder cancer (GBC) is an aggressive neoplasia associated with late diagnosis, unsatisfactory treatment, and poor prognosis. Molecular mechanisms involved in GBC pathogenesis remain poorly understood. Connective tissue growth factor (CTGF) is thought to play a role in the pathologic processes and is overexpressed in several human cancers, including GBC. No information is available about CTGF expression in early stages of gallbladder carcinogenesis. Objective: To evaluate the expression level of CTGF in benign and malignant lesions of gallbladder and its correlation with clinicopathologic features and GBC prognosis. Connective tissue growth factor protein was examined by immunohistochemistry on tissue microarrays containing tissue samples of chronic cholecystitis (n = 51), dysplasia (n = 15), and GBC (n = 169). The samples were scored according to intensity of staining as low/absent and high CTGF expressers. Statistical analysis was performed using the χ(2) test or Fisher exact probability test with a significance level of P < .05. Survival analysis was assessed by the Kaplan-Meier method and the log-rank test. Connective tissue growth factor expression showed a progressive increase from chronic cholecystitis to dysplasia and then to early and advanced carcinoma. Immunohistochemical expression (score ≥2) was significantly higher in advanced tumors, in comparison with chronic cholecystitis (P < .001) and dysplasia (P = .03). High levels of CTGF expression correlated with better survival (P = .04). Our results suggest a role for CTGF in GBC progression and a positive association with better prognosis. In addition, they underscore the importance of considering the involvement of inflammation on GBC development.

  9. Downregulation of long non-coding RNA ENSG00000241684 is associated with poor prognosis in advanced clear cell renal cell carcinoma.

    PubMed

    Su, Hengchuan; Wang, Hongkai; Shi, Guohai; Zhang, Hailiang; Sun, Fukang; Ye, Dingwei

    2018-06-01

    In order to identify potential novel biomarkers of advanced clear cell renal cell carcinoma (ccRCC), we re-evaluated published long non-coding RNA (lncRNA) expression profiling data. The lncRNA expression profiles in ccRCC microarray dataset GSE47352 were analyzed and an independent cohort of 61 clinical samples including 21 advanced and 40 localized ccRCC patients was used to confirm the most statistically significant lncRNAs by real time PCR. Next, the relationships between the selected lncRNAs and ccRCC patients' clinicopathological features were investigated. The effects of lncRNAs on the invasion and proliferation of renal carcinoma cells were also investigated. The PCR results in a cohort of 21 advanced ccRCC and 40 localized ccRCC tissues were used for confirmation of the selected lncRNAs which were statistically most significant. The PCR results showed that the expression of three lncRNAs (ENSG00000241684, ENSG00000231721 and NEAT1) was significantly downregulated in advanced ccRCC. Kaplan-Meier analysis revealed that reduced expression of lncRNA ENSG00000241684 and NEAT1 was significantly associated with poor overall survival. The univariate and multivariate Cox regression indicated lncRNA ENSG00000241684 had significant hazard ratios for predicting clinical outcome. LncRNA ENSG00000241684 expression was negatively correlated with pTNM stage. Overexpression of ENSG00000241684 significantly impaired cell proliferation and reduced the invasion ability in 786-O and ACHN cells. LncRNAs are involved in renal carcinogenesis and decreased lncRNA ENSG00000241684 expression may be an independent adverse prognostic factor in advanced ccRCC patients. Copyright © 2018 Elsevier Ltd, BASO ~ The Association for Cancer Surgery, and the European Society of Surgical Oncology. All rights reserved.

  10. Gene ARMADA: an integrated multi-analysis platform for microarray data implemented in MATLAB.

    PubMed

    Chatziioannou, Aristotelis; Moulos, Panagiotis; Kolisis, Fragiskos N

    2009-10-27

    The microarray data analysis realm is ever growing through the development of various tools, open source and commercial. However, there is an absence of predefined rational algorithmic analysis workflows or batch standardized processing to incorporate all steps, from raw data import up to the derivation of significantly differentially expressed gene lists. This absence obfuscates the analytical procedure and obstructs the massive comparative processing of genomic microarray datasets. Moreover, the solutions provided heavily depend on the programming skills of the user, whereas in the case of GUI embedded solutions, they do not provide direct support of various raw image analysis formats or a versatile and simultaneously flexible combination of signal processing methods. We describe here Gene ARMADA (Automated Robust MicroArray Data Analysis), a MATLAB implemented platform with a Graphical User Interface. This suite integrates all steps of microarray data analysis including automated data import, noise correction and filtering, normalization, statistical selection of differentially expressed genes, clustering, classification and annotation. In its current version, Gene ARMADA fully supports 2 coloured cDNA and Affymetrix oligonucleotide arrays, plus custom arrays for which experimental details are given in tabular form (Excel spreadsheet, comma separated values, tab-delimited text formats). It also supports the analysis of already processed results through its versatile import editor. Besides being fully automated, Gene ARMADA incorporates numerous functionalities of the Statistics and Bioinformatics Toolboxes of MATLAB. In addition, it provides numerous visualization and exploration tools plus customizable export data formats for seamless integration by other analysis tools or MATLAB, for further processing. Gene ARMADA requires MATLAB 7.4 (R2007a) or higher and is also distributed as a stand-alone application with MATLAB Component Runtime. Gene ARMADA provides a highly adaptable, integrative, yet flexible tool which can be used for automated quality control, analysis, annotation and visualization of microarray data, constituting a starting point for further data interpretation and integration with numerous other tools.

  11. Do alterations in follicular fluid proteases contribute to human infertility?

    PubMed

    Cookingham, Lisa Marii; Van Voorhis, Bradley J; Ascoli, Mario

    2015-05-01

    Cathepsin L and ADAMTS-1 are known to play critical roles in follicular rupture, ovulation, and fertility in mice. Similar studies in humans are limited; however, both are known to increase during the periovulatory period. No studies have examined either protease in the follicular fluid of women with unexplained infertility or infertility related to advanced maternal age (AMA). We sought to determine if alterations in cathepsin L and/or ADAMTS-1 existed in these infertile populations. Patients undergoing in vitro fertilization (IVF) for unexplained infertility or AMA-related infertility were prospectively recruited for the study; patients with tubal or male factor infertility were recruited as controls. Follicular fluid was collected to determine gene expression (via quantitative polymerase chain reaction), enzyme concentrations (via enzyme-linked immunosorbent assays), and enzymatic activities (via fluorogenic enzyme cleavage assay or Western blot analysis) of cathepsin L and ADAMTS-1. The analysis included a total of 42 patients (14 per group). We found no statistically significant difference in gene expression, enzyme concentration, or enzymatic activity of cathepsin L or ADAMTS-1 in unexplained infertility or AMA-related infertility as compared to controls. We also found no statistically significant difference in expression or concentration with advancing age. Cathepsin L and ADAMTS-1 are not altered in women with unexplained infertility or AMA-related infertility undergoing IVF, and they do not decline with advancing age. It is possible that differences exist in natural cycles, contributing to infertility; however, our findings do not support a role for protease alterations as a common cause of infertility.

  12. The prognostic value of interleukin-17 in lung cancer: A systematic review with meta-analysis based on Chinese patients.

    PubMed

    Wang, Xiao-Fei; Zhu, Yi-Tong; Wang, Jia-Jia; Zeng, Da-Xiong; Mu, Chuan-Yong; Chen, Yan-Bin; Lei, Wei; Zhu, Ye-Han; Huang, Jian-An

    2017-01-01

    Interleukin-17 (IL-17) plays an important role in cancer progression. Previous studies remained controversial regarding the correlation between IL-17 expression and lung cancer (LC) prognosis. To comprehensively and quantitatively summarize the prognostic value of IL-17 expression in LC patients, a systematic review and meta-analysis were performed. We identified the relevant literature by searching the PubMed, EMBASE, Cochrane Library, SinoMed, China National Knowledge Infrastructure (CNKI) and Wanfang Data databases, up until April 1, 2017. Overall survival (OS), disease free survival (DFS) and clinicopathological characteristics were collected from relevant studies. Pooled hazard ratios (HR) and corresponding 95% confidence intervals (CI) were calculated to estimate the effective value of IL-17 expression on clinical outcomes. Six studies containing 479 Chinese LC patients were involved in this meta-analysis. The results indicated high IL-17 expression was independently correlated with poorer OS (HR = 1.82, 95% CI 1.44-2.29, P < 0.00001) and shorter DFS (HR = 2.41, 95% CI 1.42-4.08, P = 0.001) in LC patients. Further, when stratified by LC histological type (non-small cell lung cancer and small cell lung cancer), tumor stage (Ⅰ-Ⅲ,Ⅰ-Ⅳ and Ⅳ), detection specimen (serum, intratumoral tissue and pleural effusion), test method (immunological histological chemistry and enzyme linked immunosorbent assay), and HR estimated method (reported and estimated), all of the results were statistically significant. These data indicated that elevated IL-17 expression is correlated with poor clinical outcomes in LC. The meta-analysis did not show heterogeneity or publication bias. The present meta-analysis revealed that high IL-17 expression was an indicator of poor prognosis for Chinese patients with LC. It could potentially help to assess patients' prognosis and estimate treatment efficacy in therapeutic interventions.
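
    A pooled hazard ratio of the kind reported in this record is commonly obtained by inverse-variance weighting on the log scale. The sketch below assumes a fixed-effect model and recovers each study's standard error from its reported 95% CI; the record does not state which model the authors used, and the function name and the study values in the example are hypothetical, not taken from the paper.

```python
import math

Z95 = 1.959963984540054  # standard normal quantile for a two-sided 95% CI

def pool_hazard_ratios(studies):
    """Fixed-effect inverse-variance pooling of hazard ratios.

    studies: list of (hr, ci_low, ci_high) tuples with 95% CIs.
    Returns the pooled (hr, ci_low, ci_high).
    """
    weighted_sum = 0.0
    weight_total = 0.0
    for hr, ci_low, ci_high in studies:
        # The CI is symmetric on the log scale, so its width gives the SE.
        se = (math.log(ci_high) - math.log(ci_low)) / (2 * Z95)
        weight = 1.0 / se ** 2
        weighted_sum += weight * math.log(hr)
        weight_total += weight
    log_hr = weighted_sum / weight_total
    se_pooled = math.sqrt(1.0 / weight_total)
    return (math.exp(log_hr),
            math.exp(log_hr - Z95 * se_pooled),
            math.exp(log_hr + Z95 * se_pooled))

# Hypothetical inputs for illustration only.
single = pool_hazard_ratios([(2.0, 1.0, 4.0)])
doubled = pool_hazard_ratios([(2.0, 1.0, 4.0), (2.0, 1.0, 4.0)])
```

    Pooling one study returns that study's own estimate, and adding a second identical study keeps the point estimate while narrowing the interval, which is the behaviour expected from inverse-variance weights. A random-effects model would add a between-study variance term to each weight.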

  13. Information processing of motion in facial expression and the geometry of dynamical systems

    NASA Astrophysics Data System (ADS)

    Assadi, Amir H.; Eghbalnia, Hamid; McMenamin, Brenton W.

    2005-01-01

    An interesting problem in analysis of video data concerns design of algorithms that detect perceptually significant features in an unsupervised manner, for instance methods of machine learning for automatic classification of human expression. A geometric formulation of this genre of problems could be modeled with help of perceptual psychology. In this article, we outline one approach for a special case where video segments are to be classified according to expression of emotion or other similar facial motions. The encoding of realistic facial motions that convey expression of emotions for a particular person P forms a parameter space XP whose study reveals the "objective geometry" for the problem of unsupervised feature detection from video. The geometric features and discrete representation of the space XP are independent of subjective evaluations by observers. While the "subjective geometry" of XP varies from observer to observer, levels of sensitivity and variation in perception of facial expressions appear to share a certain level of universality among members of similar cultures. Therefore, statistical geometry of invariants of XP for a sample of population could provide effective algorithms for extraction of such features. In cases where frequency of events is sufficiently large in the sample data, a suitable framework could be provided to facilitate the information-theoretic organization and study of statistical invariants of such features. This article provides a general approach to encode motion in terms of a particular genre of dynamical systems and the geometry of their flow. An example is provided to illustrate the general theory.

  14. High-Throughput Analysis of Age-Dependent Protein Changes in Layer II/III of the Human Orbitofrontal Cortex

    NASA Astrophysics Data System (ADS)

    Kapadia, Fenika

    Studies on the orbitofrontal cortex (OFC) during normal aging have shown a decline in cognitive functions, a loss of spines/synapses in layer III and gene expression changes related to neural communication. Biological changes during the course of normal aging are summarized into 9 hallmarks based on aging in peripheral tissue. Whether these hallmarks apply to non-dividing brain tissue is not known. Therefore, we opted to perform large-scale proteomic profiling of the OFC layer II/III during normal aging from 15 young and 18 old male subjects. MaxQuant was utilized for label-free quantification and statistical analysis by the Random Intercept Model (RIM) identified 118 differentially expressed (DE) age-related proteins. Altered neural communication was the most represented hallmark of aging (54% of DE proteins), highlighting the importance of communication in the brain. Functional analysis showed enrichment in GABA/glutamate signaling and pro-inflammatory responses. The former may contribute to alterations in excitation/inhibition, leading to cognitive decline during aging.

  15. A clinicopathological analysis of primary mucosal malignant melanoma.

    PubMed

    Izumi, Daisuke; Ishimoto, Takatsugu; Yoshida, Naoya; Nakamura, Kenichi; Kosumi, Keisuke; Tokunaga, Ryuma; Sugihara, Hidetaka; Sawayama, Hiroshi; Karashima, Ryuichi; Imamura, Yu; Ida, Satoshi; Hiyoshi, Yukiharu; Iwagami, Shiro; Baba, Yoshifumi; Sakamoto, Yasuo; Miyamoto, Yuji; Watanabe, Masayuki; Baba, Hideo

    2015-07-01

    Primary mucosal malignant melanoma (PMMM) is a rare and highly lethal neoplasm associated with a poor prognosis. CXC chemokine receptor 4 (CXCR4) is expressed on various tumor cells, including malignant melanoma. Recent data indicate that CXCL12 and CXCR4 play a critical role in the behavior of cancer cells and in the survival of cancer patients. However, there has been no study that has addressed the expression and function of CXCR4/CXCL12 signaling in PMMM. Immunohistochemical staining for CXCL12 and Ki67 in biopsy tissues from 10 cases of PMMM was performed. We analyzed the correlations between the clinicopathological features and expression levels of CXCL12 and Ki67. Six cases showed a high level of CXCL12 expression, while four cases had a low level of expression. High expression of CXCL12 correlated with a poor prognosis, although statistical significance was not reached (p = 0.054). Ki67 was highly expressed in five cases, while the expression in the other five cases was low. There was no correlation between the Ki67 expression and prognosis. The findings of this study suggest that CXCL12 expression may play an important role in the biological behavior of PMMM and may be associated with a poor prognosis of PMMM patients.

  16. A close examination of double filtering with fold change and t test in microarray analysis

    PubMed Central

    2009-01-01

    Background: Many researchers use the double filtering procedure with fold change and t test to identify differentially expressed genes, in the hope that the double filtering will provide extra confidence in the results. Due to its simplicity, the double filtering procedure has been popular with applied researchers despite the development of more sophisticated methods. Results: This paper, for the first time to our knowledge, provides theoretical insight into the drawback of the double filtering procedure. We show that fold change assumes all genes to have a common variance while the t statistic assumes gene-specific variances. The two statistics are based on contradictory assumptions. Under the assumption that gene variances arise from a mixture of a common variance and gene-specific variances, we develop the theoretically most powerful likelihood ratio test statistic. We further demonstrate that the posterior inference based on a Bayesian mixture model and the widely used significance analysis of microarrays (SAM) statistic are better approximations to the likelihood ratio test than the double filtering procedure. Conclusion: We demonstrate through hypothesis testing theory, simulation studies and real data examples that well-constructed shrinkage testing methods, which can be united under the mixture gene variance assumption, can considerably outperform the double filtering procedure. PMID:19995439
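
    The tension between the two filters can be illustrated with a small sketch. The expression values, the cutoffs, and the constant s0 below are hypothetical, and the shrunken statistic is only a SAM-style stand-in, not the likelihood ratio statistic derived in the paper.

```python
import math
import statistics


def log_fold_change(a, b):
    """Difference of group means on the log2 scale."""
    return statistics.mean(a) - statistics.mean(b)


def pooled_se(a, b):
    """Standard error of the mean difference (pooled-variance two-sample t test)."""
    na, nb = len(a), len(b)
    pooled_var = ((na - 1) * statistics.variance(a)
                  + (nb - 1) * statistics.variance(b)) / (na + nb - 2)
    return math.sqrt(pooled_var * (1 / na + 1 / nb))


def t_statistic(a, b):
    """Scales the mean difference by a gene-specific variance estimate only."""
    return log_fold_change(a, b) / pooled_se(a, b)


def shrunken_statistic(a, b, s0):
    """SAM-style statistic: the constant s0 > 0 damps genes with tiny variance."""
    return log_fold_change(a, b) / (pooled_se(a, b) + s0)


def double_filter(a, b, fc_cutoff, t_cutoff):
    """Keep a gene only if it passes both the fold-change and the t filter."""
    return (abs(log_fold_change(a, b)) >= fc_cutoff
            and abs(t_statistic(a, b)) >= t_cutoff)


# Hypothetical log2 expression values for two genes, three replicates per group.
low_var_gene = ([8.10, 8.12, 8.11], [8.00, 8.02, 8.01])   # tiny shift, tiny variance
high_var_gene = ([10.0, 12.0, 11.0], [8.5, 9.5, 9.0])     # large shift, larger variance

for name, (a, b) in [("low_var", low_var_gene), ("high_var", high_var_gene)]:
    print(name,
          round(log_fold_change(a, b), 3),
          round(t_statistic(a, b), 2),
          round(shrunken_statistic(a, b, s0=0.1), 2),
          double_filter(a, b, fc_cutoff=1.0, t_cutoff=2.776))
```

    With these made-up numbers the t statistic ranks the low-variance gene first, whereas the fold change and the shrunken statistic favour the gene with the larger shift. A single shrinkage statistic interpolates between the two views instead of intersecting two hard cutoffs.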

  17. An Independent Filter for Gene Set Testing Based on Spectral Enrichment.

    PubMed

    Frost, H Robert; Li, Zhigang; Asselbergs, Folkert W; Moore, Jason H

    2015-01-01

    Gene set testing has become an indispensable tool for the analysis of high-dimensional genomic data. An important motivation for testing gene sets, rather than individual genomic variables, is to improve statistical power by reducing the number of tested hypotheses. Given the dramatic growth in common gene set collections, however, testing is often performed with nearly as many gene sets as underlying genomic variables. To address the challenge to statistical power posed by large gene set collections, we have developed spectral gene set filtering (SGSF), a novel technique for independent filtering of gene set collections prior to gene set testing. The SGSF method uses as a filter statistic the p-value measuring the statistical significance of the association between each gene set and the sample principal components (PCs), taking into account the significance of the associated eigenvalues. Because this filter statistic is independent of standard gene set test statistics under the null hypothesis but dependent under the alternative, the proportion of enriched gene sets is increased without impacting the type I error rate. As shown using simulated and real gene expression data, the SGSF algorithm accurately filters out gene sets unrelated to the experimental outcome, resulting in significantly increased gene set testing power.
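
    The general two-stage logic of independent filtering can be sketched as follows. This is not the SGSF algorithm: the filter p-values are supplied directly rather than computed from principal components, every number is hypothetical, and Bonferroni correction is an assumed choice of multiple-testing adjustment. The procedure is only valid when the filter statistic is independent of the test statistic under the null hypothesis.

```python
def bonferroni_hits(p_values, alpha=0.05):
    """Names whose p-value clears the Bonferroni threshold alpha / m."""
    m = len(p_values)
    if m == 0:
        return []
    return sorted(name for name, p in p_values.items() if p <= alpha / m)


def independent_filter(test_p, filter_p, cutoff):
    """Keep only gene sets whose filter p-value is at most the cutoff."""
    return {name: p for name, p in test_p.items() if filter_p[name] <= cutoff}


# Hypothetical p-values for ten gene sets (not taken from the paper).
test_p = {f"set_{i}": p for i, p in enumerate(
    [0.008, 0.30, 0.60, 0.02, 0.45, 0.75, 0.12, 0.90, 0.55, 0.33])}
filter_p = {f"set_{i}": p for i, p in enumerate(
    [0.001, 0.40, 0.70, 0.01, 0.03, 0.80, 0.02, 0.95, 0.65, 0.50])}

kept = independent_filter(test_p, filter_p, cutoff=0.05)
print(bonferroni_hits(test_p))   # threshold is 0.05 / 10
print(bonferroni_hits(kept))     # threshold is 0.05 / len(kept)
```

    In this toy example no gene set survives correction over all ten hypotheses, while one does once the correction is applied only to the four sets that pass the filter.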

  18. DOE Office of Scientific and Technical Information (OSTI.GOV)

    Bowyer, John F., E-mail: john.bowyer@fda.hhs.go; Latendresse, John R.; Delongchamp, Robert R.

    A study was undertaken to determine whether alterations in gene expression or overt histological signs of neurotoxicity in selected regions of the forebrain might occur from acrylamide exposure via drinking water. Gene expression at the mRNA level was evaluated by cDNA array and/or RT-PCR analysis in the striatum, substantia nigra and parietal cortex of rat after a 2-week acrylamide exposure. The highest dose tested (maximally tolerated) of approximately 44 mg/kg/day resulted in significantly decreased body weight, sluggishness, and locomotor activity reduction. These physiological effects were not accompanied by prominent changes in gene expression in the forebrain. All the expression changes seen in the 1200 genes that were evaluated in the three brain regions were <= 1.5-fold, and most were not significant. Very few, if any, statistically significant changes were seen in mRNA levels of the more than 50 genes directly related to the cholinergic, noradrenergic, GABAergic or glutamatergic neurotransmitter systems in the striatum, substantia nigra or parietal cortex. All the expression changes observed in genes related to dopaminergic function were less than 1.5-fold and not statistically significant, and the 5HT1b receptor was the only serotonin-related gene affected. Therefore, gene expression changes were few and modest in basal ganglia and sensory cortex at a time when the behavioral manifestations of acrylamide toxicity had become prominent. No histological evidence of axonal, dendritic or neuronal cell body damage was found in the forebrain due to the acrylamide exposure. Microglial activation was also not present. These findings are consistent with the absence of expression changes in genes related to changes in neuroinflammation or neurotoxicity. Overall, these data suggest that oral ingestion of acrylamide in drinking water or food, even at maximally tolerable levels, induced neither marked changes in gene expression nor neurotoxicity in the motor and somatosensory areas of the central nervous system.

  19. Computational synchronization of microarray data with application to Plasmodium falciparum.

    PubMed

    Zhao, Wei; Dauwels, Justin; Niles, Jacquin C; Cao, Jianshu

    2012-06-21

    Microarrays are widely used to investigate the blood stage of Plasmodium falciparum infection. Starting with synchronized cells, gene expression levels are continually measured over the 48-hour intra-erythrocytic cycle (IDC). However, the cell population gradually loses synchrony during the experiment. As a result, the microarray measurements are blurred. In this paper, we propose a generalized deconvolution approach to reconstruct the intrinsic expression pattern, and apply it to P. falciparum IDC microarray data. We develop a statistical model for the decay of synchrony among cells, and reconstruct the expression pattern through statistical inference. The proposed method can handle microarray measurements with noise and missing data. The original gene expression patterns become more apparent in the reconstructed profiles, making it easier to analyze and interpret the data. We hypothesize that reconstructed gene expression patterns represent better temporally resolved expression profiles that can be probabilistically modeled to match changes in expression level to IDC transitions. In particular, we identify transcriptionally regulated protein kinases putatively involved in regulating the P. falciparum IDC. By analyzing publicly available microarray data sets for the P. falciparum IDC, protein kinases are ranked in terms of their likelihood to be involved in regulating transitions between the ring, trophozoite and schizont developmental stages of the P. falciparum IDC. In our theoretical framework, a few protein kinases have high probability rankings, and could potentially be involved in regulating these developmental transitions. This study proposes a new methodology for extracting intrinsic expression patterns from microarray data. By applying this method to P. falciparum microarray data, several protein kinases are predicted to play a significant role in the P. falciparum IDC. Earlier experiments have indeed confirmed that several of these kinases are involved in this process. Overall, these results indicate that further functional analysis of these additional putative protein kinases may reveal new insights into how the P. falciparum IDC is regulated.
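
    The idea of undoing a synchrony-dependent blur can be sketched with a toy forward model and a simple iterative inversion. Everything here is an assumption made for illustration: the Gaussian spread of cell ages that widens linearly with time, the circular treatment of the cycle, the sinusoidal "true" profile, and the use of Landweber iteration in place of the statistical inference described in the abstract. The data are noise-free, so the sketch does not address noise or missing values.

```python
import math


def blur_matrix(n, base_width, growth):
    """Row i: normalized Gaussian weights over cell ages at time point i.

    The width grows linearly with i to mimic a gradual loss of synchrony.
    """
    rows = []
    for i in range(n):
        width = base_width + growth * i
        weights = []
        for j in range(n):
            d = min(abs(i - j), n - abs(i - j))  # circular distance on the cycle
            weights.append(math.exp(-0.5 * (d / width) ** 2))
        total = sum(weights)
        rows.append([w / total for w in weights])
    return rows


def matvec(a, x):
    """Matrix-vector product a @ x for a list-of-rows matrix."""
    return [sum(aij * xj for aij, xj in zip(row, x)) for row in a]


def rmatvec(a, y):
    """Product of the transpose of a with y."""
    return [sum(a[i][j] * y[i] for i in range(len(a))) for j in range(len(a[0]))]


def landweber(a, y, iterations):
    """Gradient iterations on ||y - a x||^2, started from the measurements."""
    max_row = max(sum(row) for row in a)
    max_col = max(sum(row[j] for row in a) for j in range(len(a[0])))
    step = 1.0 / (max_row * max_col)  # at most 1 / ||a||_2^2, so the iteration is stable
    x = list(y)
    for _ in range(iterations):
        residual = [yi - pi for yi, pi in zip(y, matvec(a, x))]
        x = [xi + step * gi for xi, gi in zip(x, rmatvec(a, residual))]
    return x


def rmse(u, v):
    """Root-mean-square difference between two equal-length profiles."""
    return math.sqrt(sum((ui - vi) ** 2 for ui, vi in zip(u, v)) / len(u))


n = 48  # one hypothetical measurement per hour of the cycle
truth = [1.0 + math.sin(2 * math.pi * t / n) for t in range(n)]
A = blur_matrix(n, base_width=1.0, growth=0.1)
measured = matvec(A, truth)                      # noise-free blurred profile
recovered = landweber(A, measured, iterations=200)
print(round(rmse(measured, truth), 4), round(rmse(recovered, truth), 4))
```

    Starting the iteration from the measured profile means the reconstruction can only move closer to the underlying pattern in this noise-free setting. With noisy data the iteration count would act as a regularization parameter and would need to be chosen with care.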

  20. Transcriptional expression analysis of survivin splice variants reveals differential expression of survivin-3α in breast cancer.

    PubMed

    Moniri Javadhesari, Solmaz; Gharechahi, Javad; Hosseinpour Feizi, Mohammad Ali; Montazeri, Vahid; Halimi, Monireh

    2013-04-01

    Survivin, a novel member of the inhibitor of apoptosis protein family, is known to play an important role in the regulation of cell cycle and apoptosis. Differential expression of survivin in tumor tissues introduces it as a new candidate molecular marker for cancer. Here we investigated the expression of survivin and its splice variants in breast tumors, as well as normal adjacent tissues obtained from the same patients. Thirty-five tumors and 17 normal adjacent tissues from women diagnosed with breast cancer were explored in this study. Differential expression of different survivin splice variants was detected and semiquantitatively analyzed using reverse transcription-polymerase chain reaction. Results showed that survivin and its splice variants were differentially expressed in tumor specimens compared with normal adjacent tissues. The expression of survivin-3B and survivin-3α was specifically detected in tumor tissues compared with normal adjacent ones (53% in tumor tissues compared to 5% in normal adjacent tissues for survivin-3B, and 65% in tumor tissues compared to 0.0% in normal adjacent tissues for survivin-3α). Statistical analysis showed that survivin and survivin-ΔEx3 were upregulated in benign (90%, p<0.034) and malignant (76%, p<0.042) tumors, respectively. On the other hand, our results showed that survivin-2α (100% of the cases) was the dominantly expressed variant of survivin in breast cancer. The data presented here showed that survivin splice variants were differentially expressed in benign and malignant breast cancer tissues, suggesting their potential role in breast cancer development. Differential expression of survivin-2α and survivin-3α splice variants highlights their usefulness as new candidate markers for breast cancer diagnosis and prognosis.
