Survival analysis of cervical cancer using stratified Cox regression
NASA Astrophysics Data System (ADS)
Purnami, S. W.; Inayati, K. D.; Sari, N. W. Wulan; Chosuvivatwong, V.; Sriplung, H.
2016-04-01
Cervical cancer is one of the most common causes of cancer death among women worldwide, including in Indonesia. Most cervical cancer patients arrive at the hospital already at an advanced stage. As a result, treatment becomes more difficult and the risk of death can increase. One parameter that can be used to assess the success of treatment is the probability of survival. This study examines the survival of cervical cancer patients at Dr. Soetomo Hospital using stratified Cox regression based on six factors: age, stage, treatment initiation, companion disease, complication, and anemia. The stratified Cox model is used because one independent variable, stage, does not satisfy the proportional hazards assumption. The results of the stratified Cox model show that complication is a significant factor influencing the survival probability of cervical cancer patients. The estimated hazard ratio is 7.35, meaning that a cervical cancer patient with a complication has a hazard of death 7.35 times that of a patient without a complication. The adjusted survival curves show that stage IV has the lowest probability of survival.
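For reference, the stratified model described in this abstract can be sketched in its usual textbook form. The notation is an assumption made for illustration and is not taken from the paper: $s$ indexes the strata (here, stage), $h_{0s}$ is the stratum-specific baseline hazard, and $\beta$ is the coefficient vector shared across strata.

```latex
h_s(t \mid x) = h_{0s}(t)\,\exp\!\left(\beta^{\top} x\right), \qquad s = 1, \dots, S

\mathrm{HR}_{\text{complication}} = \exp\!\left(\hat\beta_{\text{complication}}\right) = 7.35
\quad\Longrightarrow\quad
\hat\beta_{\text{complication}} = \ln 7.35 \approx 1.99
```

Because each stratum has its own baseline hazard, no hazard ratio is estimated for the stratifying variable itself, which is why stage appears only in the adjusted survival curves.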
Factors Associated with Methadone Treatment Duration: A Cox Regression Analysis
Peng, Ching-Yi; Chao, En; Lee, Tony Szu-Hsien
2015-01-01
This study examined retention rates and associated predictors of methadone maintenance treatment (MMT) duration among 128 newly admitted patients in Taiwan. A semi-structured questionnaire was used to obtain demographic and drug use history. Daily records of methadone taken and test results for HIV, HCV, and morphine toxicology were taken from a computerized medical registry. Cox regression analyses were performed to examine factors associated with MMT duration. MMT retention rates were 80.5%, 68.8%, 53.9%, and 41.4% at 3, 6, 12, and 18 months, respectively. Excluding 38 patients incarcerated during the study period, retention rates were 81.1%, 73.3%, 61.1%, and 48.9% at 3, 6, 12, and 18 months, respectively. No participant seroconverted to HIV and 1 died during the 18-month follow-up. Results showed that being female, imprisonment, a longer distance from home to clinic, a lower methadone dose after 30 days, being HCV positive, and enrollment in the New Taipei City program predicted early patient dropout. The findings suggest favorable MMT outcomes in terms of HIV seroincidence and mortality. The results indicate that the need to minimize travel distance and to provide programs that meet women’s requirements justifies expansion of MMT clinics in Taiwan. PMID:25875531
Vatcheva, KP; Lee, M; McCormick, JB; Rahbar, MH
2016-01-01
Objective To demonstrate the adverse impact of ignoring statistical interactions in regression models used in epidemiologic studies. Study design and setting Based on different scenarios that involved known values for the coefficient of the interaction term in Cox regression models, we generated 1000 samples of size 600 each. The simulated samples and a real-life data set from the Cameron County Hispanic Cohort were used to evaluate the effect of ignoring statistical interactions in these models. Results Compared to correctly specified Cox regression models with interaction terms, misspecified models without interaction terms resulted in up to 8.95-fold bias in estimated regression coefficients. In contrast, when data were generated from a perfectly additive Cox proportional hazards regression model, including the interaction between the two covariates resulted in only 2% estimated bias in the main effect regression coefficient estimates and did not alter the main finding of no significant interactions. Conclusions When the effects are synergistic, the failure to account for an interaction effect could lead to bias and misinterpretation of the results, and in some instances to incorrect policy decisions. Best practices in regression analysis must include identification of interactions, including for analysis of data from epidemiologic studies.
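As a sketch of the two specifications being compared, a Cox model with two covariates and their product term can be written as below. The symbols $x_1$, $x_2$, and $\beta_1, \beta_2, \beta_3$ are illustrative notation, not the authors' own.

```latex
\text{With interaction:}\quad
h(t \mid x_1, x_2) = h_0(t)\,\exp\!\left(\beta_1 x_1 + \beta_2 x_2 + \beta_3 x_1 x_2\right)

\text{Main effects only:}\quad
h(t \mid x_1, x_2) = h_0(t)\,\exp\!\left(\beta_1 x_1 + \beta_2 x_2\right)
```

When the true $\beta_3 \neq 0$, fitting the main-effects-only form forces the estimates of $\beta_1$ and $\beta_2$ to absorb the omitted term, which is the source of the bias the abstract reports.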
Dehesh, Tania; Zare, Najaf; Ayatollahi, Seyyed Mohammad Taghi
2015-01-01
Background. The univariate meta-analysis (UM) procedure, as a technique that provides a single overall result, has become increasingly popular. Neglecting the existence of other concomitant covariates in the models leads to loss of treatment efficiency. Our aim was to propose four new approximation approaches for the covariance matrix of the coefficients, which is not readily available for the multivariate generalized least square (MGLS) method as a multivariate meta-analysis approach. Methods. We evaluated the efficiency of four new approaches, namely zero correlation (ZC), common correlation (CC), estimated correlation (EC), and multivariate multilevel correlation (MMC), on the estimation bias, mean square error (MSE), and 95% probability coverage of the confidence interval (CI) in the synthesis of Cox proportional hazard model coefficients in a simulation study. Result. Comparing the results of the simulation study on the MSE, bias, and CI of the estimated coefficients indicated that the MMC approach was the most accurate procedure compared to the EC, CC, and ZC procedures. The precision ranking of the four approaches across all of the above settings was MMC ≥ EC ≥ CC ≥ ZC. Conclusion. This study highlights the advantages of MGLS meta-analysis over the UM approach. The results suggest using the MMC procedure to overcome the lack of information needed for a complete covariance matrix of the coefficients. PMID:26413142
Simultaneous confidence bands for Cox regression from semiparametric random censorship.
Mondal, Shoubhik; Subramanian, Sundarraman
2016-01-01
Cox regression is combined with semiparametric random censorship models to construct simultaneous confidence bands (SCBs) for subject-specific survival curves. Simulation results are presented to compare the performance of the proposed SCBs with the SCBs that are based only on standard Cox. The new SCBs provide correct empirical coverage and are more informative. The proposed SCBs are illustrated with two real examples. An extension to handle missing censoring indicators is also outlined. PMID:25691289
Partial least squares Cox regression for genome-wide data.
Nygård, Ståle; Borgan, Ornulf; Lingjaerde, Ole Christian; Størvold, Hege Leite
2008-06-01
Most methods for survival prediction from high-dimensional genomic data combine the Cox proportional hazards model with some technique of dimension reduction, such as partial least squares regression (PLS). Applying PLS to the Cox model is not entirely straightforward, and multiple approaches have been proposed. The method of Park et al. (Bioinformatics 18(Suppl. 1):S120-S127, 2002) uses a reformulation of the Cox likelihood to a Poisson type likelihood, thereby enabling estimation by iteratively reweighted partial least squares for generalized linear models. We propose a modification of the method of Park et al. (2002) such that estimates of the baseline hazard and the gene effects are obtained in separate steps. The resulting method has several advantages over the method of Park et al. (2002) and other existing Cox PLS approaches, as it allows for estimation of survival probabilities for new patients, enables a less memory-demanding estimation procedure, and allows for incorporation of lower-dimensional non-genomic variables like disease grade and tumor thickness. We also propose to combine our Cox PLS method with an initial gene selection step in which genes are ordered by their Cox score and only the highest-ranking k% of the genes are retained, obtaining a so-called supervised partial least squares regression method. In simulations, both the unsupervised and the supervised version outperform other Cox PLS methods. PMID:18188699
Iuliano, Antonella; Occhipinti, Annalisa; Angelini, Claudia; De Feis, Italia; Lió, Pietro
2016-01-01
International initiatives such as the Cancer Genome Atlas (TCGA) and the International Cancer Genome Consortium (ICGC) are collecting multiple datasets at different genome-scales with the aim of identifying novel cancer biomarkers and predicting survival of patients. To analyze such data, several statistical methods have been applied, among them Cox regression models. Although these models provide a good statistical framework to analyze omic data, there is still a lack of studies that illustrate advantages and drawbacks in integrating biological information and selecting groups of biomarkers. In fact, classical Cox regression algorithms focus on the selection of a single biomarker, without taking into account the strong correlation between genes. Even though network-based Cox regression algorithms overcome such drawbacks, such network-based approaches are less widely used within the life science community. In this article, we aim to provide a clear methodological framework on the use of such approaches in order to turn cancer research results into clinical applications. Therefore, we first discuss the rationale and the practical usage of three recently proposed network-based Cox regression algorithms (i.e., Net-Cox, AdaLnet, and fastcox). Then, we show how to combine existing biological knowledge and available data with such algorithms to identify networks of cancer biomarkers and to estimate survival of patients. Finally, we describe in detail a new permutation-based approach to better validate the significance of the selection in terms of cancer gene signatures and pathway/networks identification. We illustrate the proposed methodology by means of both simulations and real case studies. Overall, the aim of our work is two-fold. Firstly, to show how network-based Cox regression models can be used to integrate biological knowledge (e.g., multi-omics data) for the analysis of survival data. Secondly, to provide a clear methodological and computational approach for
Cox Regression Models with Functional Covariates for Survival Data
Gellar, Jonathan E.; Colantuoni, Elizabeth; Needham, Dale M.; Crainiceanu, Ciprian M.
2015-01-01
We extend the Cox proportional hazards model to cases when the exposure is a densely sampled functional process, measured at baseline. The fundamental idea is to combine penalized signal regression with methods developed for mixed effects proportional hazards models. The model is fit by maximizing the penalized partial likelihood, with smoothing parameters estimated by a likelihood-based criterion such as AIC or EPIC. The model may be extended to allow for multiple functional predictors, time varying coefficients, and missing or unequally-spaced data. Methods were inspired by and applied to a study of the association between time to death after hospital discharge and daily measures of disease severity collected in the intensive care unit, among survivors of acute respiratory distress syndrome. PMID:26441487
Diagnostic Measures for the Cox Regression Model with Missing Covariates
Zhu, Hongtu; Ibrahim, Joseph G.; Chen, Ming-Hui
2015-01-01
Summary This paper investigates diagnostic measures for assessing the influence of observations and model misspecification in the presence of missing covariate data for the Cox regression model. Our diagnostics include case-deletion measures, conditional martingale residuals, and score residuals. The Q-distance is proposed to examine the effects of deleting individual observations on the estimates of finite-dimensional and infinite-dimensional parameters. Conditional martingale residuals are used to construct goodness of fit statistics for testing possible misspecification of the model assumptions. A resampling method is developed to approximate the p-values of the goodness of fit statistics. Simulation studies are conducted to evaluate our methods, and a real data set is analyzed to illustrate their use. PMID:26903666
ERIC Educational Resources Information Center
Chen, Chau-Kuang
2005-01-01
Logistic and Cox regression methods are practical tools used to model the relationships between certain student learning outcomes and their relevant explanatory variables. The logistic regression model fits an S-shaped curve into a binary outcome with data points of zero and one. The Cox regression model allows investigators to study the duration…
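The S-shaped curve mentioned in this abstract is the logistic function. A minimal Python sketch follows; the function names and the single-predictor form (`intercept`, `slope`) are assumptions chosen for illustration and do not come from the article.

```python
import math


def logistic(z: float) -> float:
    """Return 1 / (1 + exp(-z)), branching on sign to avoid overflow."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)


def predicted_probability(intercept: float, slope: float, x: float) -> float:
    """Probability that the binary outcome equals 1 for predictor value x
    under a one-predictor logistic regression (hypothetical coefficients)."""
    return logistic(intercept + slope * x)
```

The output always lies between 0 and 1 and increases monotonically in the linear predictor, which is what gives the fitted curve its S shape.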
Xu, Haoming; Moni, Mohammad Ali; Liò, Pietro
2015-12-01
In cancer genomics, gene expression levels provide important molecular signatures for all types of cancer, and this could be very useful for predicting the survival of cancer patients. However, the main challenge of gene expression data analysis is high dimensionality: microarray data are characterised by a small number of samples and a large number of genes. To overcome this problem, a variety of penalised Cox proportional hazard models have been proposed. We introduce a novel network regularised Cox proportional hazard model and a novel multiplex network model to measure disease comorbidities and to predict the survival of cancer patients. Our methods are applied to analyse seven microarray cancer gene expression datasets: breast cancer, ovarian cancer, lung cancer, liver cancer, renal cancer and osteosarcoma. Firstly, we applied a principal component analysis to reduce the dimensionality of the original gene expression data. Secondly, we applied a network regularised Cox regression model to the reduced gene expression datasets. By using the normalised mutual information method and the multiplex network model, we predict the comorbidities for liver cancer based on the integration of a diverse set of omics and clinical data, and we find the diseasome associations (disease-gene associations) among different cancers based on the identified common significant genes. Finally, we evaluated the precision of the approach with respect to the accuracy of survival prediction using ROC curves. We report that colon cancer, liver cancer and renal cancer share the CXCL5 gene, and breast cancer, ovarian cancer and renal cancer share the CCND2 gene. Our methods are useful for predicting patient survival and disease comorbidities more accurately and are helpful for improving the care of patients with comorbidity. Software in Matlab and R is available on our GitHub page: https://github.com/ssnhcom/NetworkRegularisedCox.git. PMID:26611766
Nie, Z Q; Ou, Y Q; Zhuang, J; Qu, Y J; Mai, J Z; Chen, J M; Liu, X Q
2016-05-10
Conditional and unconditional logistic regression analyses are commonly used in case-control studies, whereas the Cox proportional hazard model is often used in survival data analysis. Most of the literature refers only to the main effect model; however, the generalized linear model differs from the general linear model, and interaction comprises multiplicative interaction and additive interaction. The former has only statistical significance, whereas the latter has biological significance. In this paper, macros were written in SAS 9.4 to calculate the contrast ratio, the attributable proportion due to interaction, and the synergy index alongside the interaction terms of logistic and Cox regression, and Wald, delta, and profile likelihood confidence intervals were used to evaluate additive interaction, as a reference for big data analysis in clinical epidemiology and for the analysis of genetic multiplicative and additive interactions. PMID:27188374
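The three additive-interaction measures named in this abstract can be sketched from the coefficients of a model with two binary exposures and their product term. The Python below is an illustrative stand-in, not the authors' SAS macros: the function name and the arguments `b1`, `b2`, `b3` are hypothetical, the formulas are the commonly used definitions of the interaction contrast ratio (RERI), attributable proportion (AP), and synergy index (S), and confidence intervals are deliberately omitted.

```python
import math


def additive_interaction(b1: float, b2: float, b3: float) -> dict:
    """Point estimates of additive interaction from log-scale coefficients.

    b1, b2: main-effect coefficients of two binary exposures.
    b3: coefficient of their product term.
    """
    hr10 = math.exp(b1)            # exposed to factor 1 only
    hr01 = math.exp(b2)            # exposed to factor 2 only
    hr11 = math.exp(b1 + b2 + b3)  # exposed to both factors

    reri = hr11 - hr10 - hr01 + 1.0   # interaction contrast ratio
    ap = reri / hr11                  # attributable proportion due to interaction
    denom = (hr10 - 1.0) + (hr01 - 1.0)
    s = (hr11 - 1.0) / denom if denom != 0 else float("nan")  # synergy index
    return {"RERI": reri, "AP": ap, "S": s}
```

With `b3 = 0` the model is exactly multiplicative, yet RERI is generally non-zero, which illustrates why the two scales of interaction have to be assessed separately.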
Mortality Prediction in ICUs Using A Novel Time-Slicing Cox Regression Method
Wang, Yuan; Chen, Wenlin; Heard, Kevin; Kollef, Marin H.; Bailey, Thomas C.; Cui, Zhicheng; He, Yujie; Lu, Chenyang; Chen, Yixin
2015-01-01
Over the last few decades, machine learning and data mining have been increasingly used for clinical prediction in ICUs. However, there is still a huge gap in making full use of the time-series data generated from ICUs. Aiming at filling this gap, we propose a novel approach entitled Time Slicing Cox regression (TS-Cox), which extends the classical Cox regression into a classification method on multi-dimensional time-series. Unlike traditional classifiers such as logistic regression and support vector machines, our model not only incorporates the discriminative features derived from the time-series, but also naturally exploits the temporal orders of these features based on a Cox-like function. Empirical evaluation on MIMIC-II database demonstrates the efficacy of the TS-Cox model. Our TS-Cox model outperforms all other baseline models by a good margin in terms of AUC_PR, sensitivity and PPV, which indicates that TS-Cox may be a promising tool for mortality prediction in ICUs. PMID:26958269
Sneeringer, Stacy
2010-04-01
While a recent paper by Cox in this journal uses as its motivating factor the benefits of quantitative risk assessment, its content is entirely devoted to critiquing Sneeringer's article in the American Journal of Agricultural Economics. Cox's two main critiques of Sneeringer are fundamentally flawed and misrepresent the original article. Cox posits that Sneeringer did A and B, and then argues why A and B are incorrect. However, Sneeringer in fact did C and D; thus critiques of A and B are not applicable to Sneeringer's analysis. PMID:20345577
Box–Cox Transformation and Random Regression Models for Fecal Egg Count Data
da Silva, Marcos Vinícius Gualberto Barbosa; Van Tassell, Curtis P.; Sonstegard, Tad S.; Cobuci, Jaime Araujo; Gasbarre, Louis C.
2012-01-01
Accurate genetic evaluation of livestock is based on appropriate modeling of phenotypic measurements. In ruminants, fecal egg count (FEC) is commonly used to measure resistance to nematodes. FEC values are not normally distributed and logarithmic transformations have been used in an effort to achieve normality before analysis. However, the transformed data are often still not normally distributed, especially when data are extremely skewed. A series of repeated FEC measurements may provide information about the population dynamics of a group or individual. A total of 6375 FEC measures were obtained for 410 animals between 1992 and 2003 from the Beltsville Agricultural Research Center Angus herd. Original data were transformed using an extension of the Box–Cox transformation to approach normality and to estimate (co)variance components. We also proposed using random regression models (RRM) for genetic and non-genetic studies of FEC. Phenotypes were analyzed using RRM and restricted maximum likelihood. Within the different orders of Legendre polynomials used, those with more parameters (order 4) adjusted FEC data best. Results indicated that the transformation of FEC data utilizing the Box–Cox transformation family was effective in reducing the skewness and kurtosis, and dramatically increased estimates of heritability, and measurements of FEC obtained in the period between 12 and 26 weeks in a 26-week experimental challenge period are genetically correlated. PMID:22303406
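For orientation, the standard one-parameter Box–Cox transformation is sketched below in Python. The abstract refers to an extension of this family, so the code is only the textbook base case; the function name and the strict positivity requirement are assumptions of this sketch.

```python
import math


def box_cox(y: float, lam: float) -> float:
    """Standard one-parameter Box-Cox transform of a positive value y.

    Returns log(y) when lam == 0, otherwise (y**lam - 1) / lam.
    """
    if y <= 0:
        raise ValueError("Box-Cox requires y > 0; shift zero counts first")
    if lam == 0:
        return math.log(y)
    return (y ** lam - 1.0) / lam
```

Setting `lam = 0` recovers the logarithmic transformation mentioned in the abstract, and `lam = 1` leaves the data unchanged apart from a shift. Counts of zero would need an offset before transformation, which is one reason extended versions of the family are used.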
Pathway-gene identification for pancreatic cancer survival via doubly regularized Cox regression
2014-01-01
Background Recent global genomic analyses identified 69 gene sets and 12 core signaling pathways genetically altered in pancreatic cancer, which is a highly malignant disease. A comprehensive understanding of the genetic signatures and signaling pathways that are directly correlated to pancreatic cancer survival will help cancer researchers to develop effective multi-gene targeted, personalized therapies for the pancreatic cancer patients at different stages. A previous work that applied a LASSO penalized regression method, which only considered individual genetic effects, identified 12 genes associated with pancreatic cancer survival. Results In this work, we integrate pathway information into pancreatic cancer survival analysis. We introduce and apply a doubly regularized Cox regression model to identify both genes and signaling pathways related to pancreatic cancer survival. Conclusions Four signaling pathways, including Ion transport, immune phagocytosis, TGFβ (spermatogenesis), regulation of DNA-dependent transcription pathways, and 15 genes within the four pathways are identified and verified to be directly correlated to pancreatic cancer survival. Our findings can help cancer researchers design new strategies for the early detection and diagnosis of pancreatic cancer. PMID:24565114
Modern Regression Discontinuity Analysis
ERIC Educational Resources Information Center
Bloom, Howard S.
2012-01-01
This article provides a detailed discussion of the theory and practice of modern regression discontinuity (RD) analysis for estimating the effects of interventions or treatments. Part 1 briefly chronicles the history of RD analysis and summarizes its past applications. Part 2 explains how in theory an RD analysis can identify an average effect of…
Tosteson, Tor D.; Morden, Nancy E.; Stukel, Therese A.; O'Malley, A. James
2014-01-01
The estimation of treatment effects is one of the primary goals of statistics in medicine. Estimation based on observational studies is subject to confounding. Statistical methods for controlling bias due to confounding include regression adjustment, propensity scores and inverse probability weighted estimators. These methods require that all confounders are recorded in the data. The method of instrumental variables (IVs) can eliminate bias in observational studies even in the absence of information on confounders. We propose a method for integrating IVs within the framework of Cox's proportional hazards model and demonstrate the conditions under which it recovers the causal effect of treatment. The methodology is based on the approximate orthogonality of an instrument with unobserved confounders among those at risk. We derive an estimator as the solution to an estimating equation that resembles the score equation of the partial likelihood in much the same way as the traditional IV estimator resembles the normal equations. To justify this IV estimator for a Cox model we perform simulations to evaluate its operating characteristics. Finally, we apply the estimator to an observational study of the effect of coronary catheterization on survival. PMID:25506259
Multiple linear regression analysis
NASA Technical Reports Server (NTRS)
Edwards, T. R.
1980-01-01
Program rapidly selects best-suited set of coefficients. User supplies only vectors of independent and dependent data and specifies confidence level required. Program uses stepwise statistical procedure for relating minimal set of variables to set of observations; final regression contains only most statistically significant coefficients. Program is written in FORTRAN IV for batch execution and has been implemented on NOVA 1200.
Lee, Eunjee; Zhu, Hongtu; Kong, Dehan; Wang, Yalin; Giovanello, Kelly Sullivan; Ibrahim, Joseph G
2015-01-01
The aim of this paper is to develop a Bayesian functional linear Cox regression model (BFLCRM) with both functional and scalar covariates. This new development is motivated by establishing the likelihood of conversion to Alzheimer’s disease (AD) in 346 patients with mild cognitive impairment (MCI) enrolled in the Alzheimer’s Disease Neuroimaging Initiative 1 (ADNI-1) and the early markers of conversion. These 346 MCI patients were followed over 48 months, with 161 MCI participants progressing to AD at 48 months. The functional linear Cox regression model was used to establish that functional covariates including hippocampus surface morphology and scalar covariates including brain MRI volumes, cognitive performance (ADAS-Cog), and APOE status can accurately predict time to onset of AD. Posterior computation proceeds via an efficient Markov chain Monte Carlo algorithm. A simulation study is performed to evaluate the finite sample performance of BFLCRM. PMID:26900412
NASA Technical Reports Server (NTRS)
Kattan, Michael W.; Hess, Kenneth R.; Kattan, Michael W.
1998-01-01
New computationally intensive tools for medical survival analyses include recursive partitioning (also called CART) and artificial neural networks. A challenge that remains is to better understand the behavior of these techniques in an effort to know when they will be effective tools. Theoretically they may overcome limitations of the traditional multivariable survival technique, the Cox proportional hazards regression model. Experiments were designed to test whether the new tools would, in practice, overcome these limitations. Two datasets in which theory suggests CART and the neural network should outperform the Cox model were selected. The first was a published leukemia dataset manipulated to have a strong interaction that CART should detect. The second was a published cirrhosis dataset with pronounced nonlinear effects that a neural network should fit. Repeated sampling of 50 training and testing subsets was applied to each technique. The concordance index C was calculated as a measure of predictive accuracy by each technique on the testing dataset. In the interaction dataset, CART outperformed Cox (P < 0.05) with a C improvement of 0.1 (95% CI, 0.08 to 0.12). In the nonlinear dataset, the neural network outperformed the Cox model (P < 0.05), but by a very slight amount (0.015). As predicted by theory, CART and the neural network were able to overcome limitations of the Cox model. Experiments like these are important to increase our understanding of when one of these new techniques will outperform the standard Cox model. Further research is necessary to predict which technique will do best a priori and to assess the magnitude of superiority.
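The concordance index C used as the accuracy measure here can be sketched as follows. This is a plain Python version of the usual pairwise definition for right-censored data, assuming that a higher risk score means shorter expected survival; the function and argument names are hypothetical, and pairs with tied times are simply skipped.

```python
def concordance_index(times, events, risk_scores):
    """Pairwise concordance index for right-censored survival data.

    times: observed follow-up times.
    events: 1 (or True) if the event was observed, 0 if censored.
    risk_scores: model output; higher means higher predicted risk.
    """
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        if not events[i]:
            continue  # a pair is usable only if the earlier time is an event
        for j in range(n):
            if times[i] < times[j]:
                comparable += 1
                if risk_scores[i] > risk_scores[j]:
                    concordant += 1.0
                elif risk_scores[i] == risk_scores[j]:
                    concordant += 0.5  # tied predictions count as half
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable
```

A value of 0.5 corresponds to random ordering and 1.0 to perfect ordering, so the improvements of 0.1 and 0.015 reported above are differences on that scale.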
Analysis of the correlation between P53 and Cox-2 expression and prognosis in esophageal cancer
CHEN, JUN; WU, FANG; PEI, HONG-LEI; GU, WEN-DONG; NING, ZHONG-HUA; SHAO, YING-JIE; HUANG, JIN
2015-01-01
The present study aimed to explore the importance of P53 and Cox-2 protein expression in esophageal cancer and assess their influence on prognosis. The expression of P53 and Cox-2 was assessed in esophageal cancer samples from 195 patients subjected to radical surgery at Changzhou First People's Hospital (Changzhou, China) between May 2010 and December 2011. Expression of P53 and Cox-2 proteins was detected in 60.5% (118/195) and 69.7% (136/195) of the samples, respectively, and the two proteins were co-expressed in 43.1% (84/195) of the samples. A correlation was identified between P53 expression and overall survival (OS) (P=0.0351) as well as disease-free survival (DFS) (P=0.0307). In addition, the co-expression of P53 and Cox-2 also correlated with OS (P=0.0040) and DFS (P=0.0042). P53 expression (P=0.023), TNM staging (P<0.001) and P53/Cox-2 co-expression (P=0.009) were identified as independent factors affecting OS in patients with esophageal cancer via a Cox multivariate regression model analysis. A similar analysis also identified P53 expression (P=0.020), TNM staging (P<0.001) and P53/Cox-2 co-expression (P=0.008) as independent prognostic factors influencing DFS in these patients. Binary logistic regression analysis demonstrated a correlation between P53 expression (P=0.012), TNM staging (P<0.001), tumor differentiation level (P=0.023) and P53/Cox-2 co-expression (P=0.021), and local recurrence or distant esophageal cancer metastasis. The results of the present study indicate that P53 and Cox-2 proteins may act synergistically in the development of esophageal cancer, and the assessment of P53/Cox-2 co-expression status in esophageal cancer biopsies may become an important diagnostic criterion to evaluate the prognosis of patients with esophageal cancer. PMID:26622818
Precision Efficacy Analysis for Regression.
ERIC Educational Resources Information Center
Brooks, Gordon P.
When multiple linear regression is used to develop a prediction model, sample size must be large enough to ensure stable coefficients. If the derivation sample size is inadequate, the model may not predict well for future subjects. The precision efficacy analysis for regression (PEAR) method uses a cross-validity approach to select sample sizes…
Devarajan, Karthik; Ebrahimi, Nader
2011-01-01
The assumption of proportional hazards (PH) fundamental to the Cox PH model sometimes may not hold in practice. In this paper, we propose a generalization of the Cox PH model in terms of the cumulative hazard function taking a form similar to the Cox PH model, with the extension that the baseline cumulative hazard function is raised to a power function. Our model allows for interaction between covariates and the baseline hazard and it also includes, for the two-sample problem, the case of two Weibull distributions and two extreme value distributions differing in both scale and shape parameters. The partial likelihood approach cannot be applied here to estimate the model parameters. We use the full likelihood approach via a cubic B-spline approximation for the baseline hazard to estimate the model parameters. A semi-automatic procedure for knot selection based on Akaike's Information Criterion is developed. We illustrate the applicability of our approach using real-life data. PMID:21076652
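One way to write a model of the kind described, with the baseline cumulative hazard raised to a covariate-dependent power, is sketched below. The exponential form of the power and the symbols $H_0$, $\beta$, and $\gamma$ are assumptions made for illustration; the abstract itself does not give the formula.

```latex
\text{Cox PH:}\quad
H(t \mid x) = H_0(t)\,\exp\!\left(\beta^{\top} x\right)

\text{Generalized form (assumed notation):}\quad
H(t \mid x) = \left[H_0(t)\right]^{\exp\left(\gamma^{\top} x\right)}\exp\!\left(\beta^{\top} x\right)
```

Setting $\gamma = 0$ recovers the Cox PH model. With a Weibull-type baseline $H_0(t) = t^{k}$, the covariates change the shape through $\gamma$ and the scale through $\beta$, consistent with the two-sample Weibull case mentioned in the abstract.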
Covariate analysis of survival data: a small-sample study of Cox's model
Johnson, M.E.; Tolley, H.D.; Bryson, M.C.; Goldman, A.S.
1982-09-01
Cox's proportional-hazards model is frequently used to adjust for covariate effects in survival-data analysis. The small-sample performances of the maximum partial likelihood estimators of the regression parameters in a two-covariate hazard function model are evaluated with respect to bias, variance, and power in hypothesis tests. Previous Monte Carlo work on the two-sample problem is reviewed.
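The quantity maximized by the maximum partial likelihood estimators discussed here can be sketched as follows. This Python function evaluates the log partial likelihood for a given coefficient vector, assuming no tied event times; the function and argument names are hypothetical, and no optimizer is included.

```python
import math


def cox_partial_loglik(times, events, covariates, beta):
    """Log partial likelihood of a Cox model, assuming no tied event times.

    times: observed follow-up times.
    events: 1 if the event was observed, 0 if censored.
    covariates: one list of covariate values per subject.
    beta: regression coefficients, same length as each covariate list.
    """
    # Linear predictor beta'x for every subject.
    lp = [sum(b * v for b, v in zip(beta, row)) for row in covariates]
    loglik = 0.0
    for i, (t_i, d_i) in enumerate(zip(times, events)):
        if not d_i:
            continue  # censored subjects contribute only through risk sets
        # Risk set: everyone still under observation at time t_i.
        risk_sum = sum(math.exp(lp[j]) for j in range(len(times)) if times[j] >= t_i)
        loglik += lp[i] - math.log(risk_sum)
    return loglik
```

At `beta = [0, 0]` each event contributes minus the log of its risk-set size, which gives a convenient check. The estimators studied in the paper are the coefficient values that maximize this function.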
Including network knowledge into Cox regression models for biomarker signature discovery.
Fröhlich, Holger
2014-03-01
Discovery of prognostic and diagnostic biomarker gene signatures for diseases, such as cancer, is seen as a major step toward better personalized medicine. During the last decade various methods have been proposed for that purpose. However, one important obstacle to making gene signatures a standard tool in clinical diagnosis is the typically low reproducibility of these signatures combined with the difficulty of achieving a clear biological interpretation. For that reason, in recent years there has been a growing interest in approaches that try to integrate information from molecular interaction networks. Most of these methods focus on classification problems, that is, learning a model from data that discriminates patients into distinct clinical groups. Far less has been published on approaches that predict a patient's event risk. In this paper, we investigate eight methods that integrate network information into multivariable Cox proportional hazard models for risk prediction in breast cancer. We compare the prediction performance of our tested algorithms via cross-validation as well as across different datasets. In addition, we highlight the stability and interpretability of obtained gene signatures. In conclusion, we find GeneRank-based filtering to be a simple, computationally cheap and highly predictive technique to integrate network information into event time prediction models. Signatures derived via this method are highly reproducible. PMID:24430933
Regression analysis of networked data
Zhou, Yan; Song, Peter X.-K.
2016-01-01
This paper concerns regression methodology for assessing relationships between multi-dimensional response variables and covariates that are correlated within a network. To address analytical challenges associated with the integration of network topology into the regression analysis, we propose a hybrid quadratic inference method that uses both prior and data-driven correlations among network nodes. A Godambe information-based tuning strategy is developed to allocate weights between the prior and data-driven network structures, so the estimator is efficient. The proposed method is conceptually simple and computationally fast, and has appealing large-sample properties. It is evaluated by simulation, and its application is illustrated using neuroimaging data from an association study of the effects of iron deficiency on auditory recognition memory in infants. PMID:27279658
Analysis of COX2 mutants reveals cytochrome oxidase subassemblies in yeast
2005-01-01
Cytochrome oxidase catalyses the reduction of oxygen to water. The mitochondrial enzyme contains up to 13 subunits, 11 in yeast, of which three, Cox1p, Cox2p and Cox3p, are mitochondrially encoded. The assembly pathway of this complex is still poorly understood. Its study in yeast has been so far impeded by the rapid turnover of unassembled subunits of the enzyme. In the present study, immunoblot analysis of blue native gels of yeast wild-type and Cox2p mutants revealed five cytochrome oxidase complexes or subcomplexes: a, b, c, d and f; a is likely to be the fully assembled enzyme; b lacks Cox6ap; d contains Cox7p and/or Cox7ap; f represents unassembled Cox1p; and c, observed only in the Cox2p mutants, contains Cox1p, Cox3p, Cox5p and Cox6p and lacks the other subunits. The identification of these novel cytochrome oxidase subcomplexes should encourage the reexamination of other yeast mutants. PMID:15921494
Gene-Based Association Analysis for Censored Traits Via Fixed Effect Functional Regressions.
Fan, Ruzong; Wang, Yifan; Yan, Qi; Ding, Ying; Weeks, Daniel E; Lu, Zhaohui; Ren, Haobo; Cook, Richard J; Xiong, Momiao; Swaroop, Anand; Chew, Emily Y; Chen, Wei
2016-02-01
Genetic studies of survival outcomes have been proposed and conducted recently, but statistical methods for identifying genetic variants that affect disease progression are rarely developed. Motivated by our ongoing real studies, here we develop Cox proportional hazard models using functional regression (FR) to perform gene-based association analysis of survival traits while adjusting for covariates. The proposed Cox models are fixed effect models where the genetic effects of multiple genetic variants are assumed to be fixed. We introduce likelihood ratio test (LRT) statistics to test for associations between the survival traits and multiple genetic variants in a genetic region. Extensive simulation studies demonstrate that the proposed Cox FR LRT statistics have well-controlled type I error rates. To evaluate power, we compare the Cox FR LRT with the previously developed burden test (BT) in a Cox model and sequence kernel association test (SKAT), which is based on mixed effect Cox models. The Cox FR LRT statistics have higher power than or similar power as Cox SKAT LRT except when 50%/50% causal variants had negative/positive effects and all causal variants are rare. In addition, the Cox FR LRT statistics have higher power than Cox BT LRT. The models and related test statistics can be useful in the whole genome and whole exome association studies. An age-related macular degeneration dataset was analyzed as an example. PMID:26782979
Regression Analysis by Example. 5th Edition
ERIC Educational Resources Information Center
Chatterjee, Samprit; Hadi, Ali S.
2012-01-01
Regression analysis is a conceptually simple method for investigating relationships among variables. Carrying out a successful application of regression analysis, however, requires a balance of theoretical results, empirical rules, and subjective judgment. "Regression Analysis by Example, Fifth Edition" has been expanded and thoroughly…
Dewi, Lestari
2016-01-01
Introduction: The enzyme cyclooxygenase (COX) is an enzyme that catalyzes the formation of one of the mediators of inflammation, the prostaglandins. Inhibition of COX allegedly can improve inflammation-induced pathological conditions. Aim: The purpose of the present study was to evaluate the potential of Sargassum sp. components, Fucoidan and alginate, as COX inhibitors. Material and methods: The study was conducted by means of a computational (in silico) method. It was performed in two main stages, the docking between COX-1 and COX-2 with Fucoidan, alginate and aspirin (for comparison) and the analysis of the amount of interactions formed and the residues directly involved in the process of interaction. Results: Our results showed that both Fucoidan and alginate had an excellent potential as inhibitors of COX-1 and COX-2. Fucoidan had a better potential as an inhibitor of COX than alginate. COX inhibition was expected to provide a more favorable effect on inflammation-related pathological conditions. Conclusion: The active compounds Fucoidan and alginate derived from Sargassum sp. were suspected to possess a good potential as inhibitors of COX-1 and COX-2. PMID:27594740
Ternès, Nils; Rotolo, Federico; Michiels, Stefan
2016-07-10
Correct selection of prognostic biomarkers among multiple candidates is becoming increasingly challenging as the dimensionality of biological data becomes higher. Therefore, minimizing the false discovery rate (FDR) is of primary importance, while a low false negative rate (FNR) is a complementary measure. The lasso is a popular selection method in Cox regression, but its results depend heavily on the penalty parameter λ. Usually, λ is chosen using maximum cross-validated log-likelihood (max-cvl). However, this method often has a very high FDR. We review methods for a more conservative choice of λ. We propose an empirical extension of the cvl by adding a penalization term, which trades off between the goodness-of-fit and the parsimony of the model, leading to the selection of fewer biomarkers and, as we show, to the reduction of the FDR without a large increase in FNR. We conducted a simulation study considering null and moderately sparse alternative scenarios and compared our approach with the standard lasso and 10 other competitors: Akaike information criterion (AIC), corrected AIC, Bayesian information criterion (BIC), extended BIC, Hannan and Quinn information criterion (HQIC), risk information criterion (RIC), one-standard-error rule, adaptive lasso, stability selection, and percentile lasso. Our extension achieved the best compromise across all the scenarios between a reduction of the FDR and a limited rise of the FNR, followed by the AIC, the RIC, and the adaptive lasso, which performed well in some settings. We illustrate the methods using gene expression data of 523 breast cancer patients. In conclusion, we propose to apply our extension to the lasso whenever a stringent FDR with a limited FNR is targeted. Copyright © 2016 John Wiley & Sons, Ltd. PMID:26970107
Khosravi, Bahareh; Pourahmad, Saeedeh; Bahreini, Amin; Nikeghbalian, Saman; Mehrdad, Goli
2015-01-01
Background: Transplantation is the only treatment for patients with liver failure. Since the therapy imposes high expenses on the patients and community, identification of effective factors on survival of such patients after transplantation is valuable. Objectives: The current study attempted to model the survival of patients (two years old and above) after liver transplantation using neural network and Cox Proportional Hazards (Cox PH) regression models. The event is defined as death due to complications of liver transplantation. Patients and Methods: In a historical cohort study, the clinical findings of 1168 patients who underwent liver transplant surgery (from March 2008 to March 2013) at Shiraz Namazee Hospital Organ Transplantation Center, Shiraz, Southern Iran, were used. To model the one to five years survival of such patients, a Cox PH regression model and a three-layer feed-forward artificial neural network (ANN) method were applied to the data separately and their prediction accuracy was compared using the area under the receiver operating characteristic curve (ROC). Furthermore, the Kaplan-Meier method was used to estimate the survival probabilities in different years. Results: The estimated survival probabilities of one to five years for the patients were 91%, 89%, 85%, 84%, and 83%, respectively. The areas under the ROC were 86.4% and 80.7% for ANN and Cox PH models, respectively. In addition, the accuracy of prediction rate for ANN and Cox PH methods was equally 92.73%. Conclusions: The present study detected more accurate results for ANN method compared to those of Cox PH model to analyze the survival of patients with liver transplantation. Furthermore, the order of effective factors in patients’ survival after transplantation was clinically more acceptable. The large dataset with a few missing data was the advantage of this study, the fact which makes the results more reliable. PMID:26500682
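The Kaplan-Meier estimate of survival probability referred to in this abstract can be sketched as follows. This is a generic textbook product-limit estimator, not the authors' code; the function name `kaplan_meier` and the toy data in the usage note are illustrative assumptions.

```python
import numpy as np

def kaplan_meier(times, events):
    """Product-limit (Kaplan-Meier) estimate of the survival function.

    times  : follow-up time for each subject
    events : 1/True if death was observed, 0/False if censored
    Returns the distinct event times and the estimated survival
    probability just after each of them.
    """
    times = np.asarray(times, dtype=float)
    events = np.asarray(events, dtype=bool)

    event_times = np.unique(times[events])
    survival = []
    s = 1.0
    for t in event_times:
        n_at_risk = np.sum(times >= t)           # still under observation at t
        n_deaths = np.sum((times == t) & events)  # deaths exactly at t
        s *= 1.0 - n_deaths / n_at_risk
        survival.append(s)
    return event_times, np.array(survival)
```

For example, with hypothetical follow-up times `[1, 2, 3, 4]` and event indicators `[1, 0, 1, 1]`, the censored subject at time 2 leaves the risk set without lowering the curve, so the estimate steps down only at times 1, 3, and 4.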
Regression analysis of cytopathological data
Whittemore, A.S.; McLarty, J.W.; Fortson, N.; Anderson, K.
1982-12-01
Epithelial cells from the human body are frequently labelled according to one of several ordered levels of abnormality, ranging from normal to malignant. The label of the most abnormal cell in a specimen determines the score for the specimen. This paper presents a model for the regression of specimen scores against continuous and discrete variables, as in host exposure to carcinogens. Application to data and tests for adequacy of model fit are illustrated using sputum specimens obtained from a cohort of former asbestos workers.
Lee, Paul H.
2016-01-01
Healthy adults are advised to perform at least 150 min of moderate-intensity physical activity weekly, but this advice is based on studies using self-reports of questionable validity. This study examined the dose-response relationship of accelerometer-measured physical activity and sedentary behaviors on all-cause mortality using segmented Cox regression to empirically determine the break-points of the dose-response relationship. Data from 7006 adult participants aged 18 or above in the National Health and Nutrition Examination Survey waves 2003–2004 and 2005–2006 were included in the analysis and linked with death certificate data using a probabilistic matching approach in the National Death Index through December 31, 2011. Physical activity and sedentary behavior were measured using an ActiGraph model 7164 accelerometer over the right hip for 7 consecutive days. Minutes with accelerometer counts of <100, 1952–5724, and ≥5725 were classified as sedentary, moderate-intensity physical activity, and vigorous-intensity physical activity, respectively. Segmented Cox regression was used to estimate the hazard ratio (HR) of time spent in sedentary behaviors, moderate-intensity physical activity, and vigorous-intensity physical activity and all-cause mortality, adjusted for demographic characteristics, health behaviors, and health conditions. Data were analyzed in 2016. During 47,119 person-years of follow-up, 608 deaths occurred. Each additional hour per day of sedentary behaviors was associated with a HR of 1.15 (95% CI 1.01, 1.31) among participants who spent at least 10.9 h per day on sedentary behaviors, and each additional minute per day spent on moderate-intensity physical activity was associated with a HR of 0.94 (95% CI 0.91, 0.96) among participants with daily moderate-intensity physical activity ≤14.1 min. Associations of moderate physical activity and sedentary behaviors on all-cause mortality were independent of each other. To conclude, evidence from
Regression Analysis: Legal Applications in Institutional Research
ERIC Educational Resources Information Center
Frizell, Julie A.; Shippen, Benjamin S., Jr.; Luna, Andrew L.
2008-01-01
This article reviews multiple regression analysis, describes how its results should be interpreted, and instructs institutional researchers on how to conduct such analyses using an example focused on faculty pay equity between men and women. The use of multiple regression analysis will be presented as a method with which to compare salaries of…
Regression Analysis and the Sociological Imagination
ERIC Educational Resources Information Center
De Maio, Fernando
2014-01-01
Regression analysis is an important aspect of most introductory statistics courses in sociology but is often presented in contexts divorced from the central concerns that bring students into the discipline. Consequently, we present five lesson ideas that emerge from a regression analysis of income inequality and mortality in the USA and Canada.
2014-01-01
Background Large-scale public health interventions with rapid scale-up are increasingly being implemented worldwide. Such implementation allows for a large target population to be reached in a short period of time. But when the time comes to investigate the effectiveness of these interventions, the rapid scale-up creates several methodological challenges, such as the lack of baseline data and the absence of control groups. One example of such an intervention is Avahan, the India HIV/AIDS initiative of the Bill & Melinda Gates Foundation. One question of interest is the effect of Avahan on condom use by female sex workers with their clients. By retrospectively reconstructing condom use and sex work history from survey data, it is possible to estimate how condom use rates evolve over time. However formal inference about how this rate changes at a given point in calendar time remains challenging. Methods We propose a new statistical procedure based on a mixture of binomial regression and Cox regression. We compare this new method to an existing approach based on generalized estimating equations through simulations and application to Indian data. Results Both methods are unbiased, but the proposed method is more powerful than the existing method, especially when initial condom use is high. When applied to the Indian data, the new method mostly agrees with the existing method, but seems to have corrected some implausible results of the latter in a few districts. We also show how the new method can be used to analyze the data of all districts combined. Conclusions The use of both methods can be recommended for exploratory data analysis. However for formal statistical inference, the new method has better power. PMID:24397563
Box-Cox Mixed Logit Model for Travel Behavior Analysis
NASA Astrophysics Data System (ADS)
Orro, Alfonso; Novales, Margarita; Benitez, Francisco G.
2010-09-01
To represent the behavior of travelers when they are deciding how they are going to get to their destination, discrete choice models, based on the random utility theory, have become one of the most widely used tools. The field in which these models were developed was halfway between econometrics and transport engineering, although the latter now constitutes one of their principal areas of application. In the transport field, they have mainly been applied to mode choice, but also to the selection of destination, route, and other important decisions such as vehicle ownership. In usual practice, the most frequently employed discrete choice models implement a fixed coefficient utility function that is linear in the parameters. The principal aim of this paper is to present the viability of specifying utility functions with random coefficients that are nonlinear in the parameters, in applications of discrete choice models to transport. Nonlinear specifications in the parameters were present in discrete choice theory at its outset, although they have seldom been used in practice until recently. The specification of random coefficients, however, began with the probit and the hedonic models in the 1970s, and, after a period of apparently little practical interest, has burgeoned into a field of intense activity in recent years with the new generation of mixed logit models. In this communication, we present a Box-Cox mixed logit model developed by the authors. It includes the estimation of the Box-Cox exponents in addition to the parameters of the random coefficients distribution. The probability of choosing an alternative is an integral that is calculated by simulation. The estimation of the model is carried out by maximizing the simulated log-likelihood of a sample of observed individual choices between alternatives. The differences between the predictions yielded by models that are inconsistent with real behavior have been studied with simulation experiments.
Meta-analysis of cyclooxygenase-2 (COX-2) 765G>C polymorphism and Alzheimer's disease.
Su, Jianhua; Wen, Shihong; Zhu, Jinlong; Liu, Ruiping; Yang, Jinsong
2016-09-01
The cyclooxygenase-2 (COX-2) 765G>C polymorphism has been extensively investigated for association with Alzheimer's disease (AD). However, results of different studies have been inconsistent. The aim of the present meta-analysis was to evaluate the association between the 765G>C polymorphism of the COX-2 gene and susceptibility to AD. We searched all related subjects in PubMed, Embase, SinoMed, and China Knowledge Resource Integrated Database and identified seven studies that reported a relationship between the COX-2 765G>C polymorphism and AD. A total of 1260 cases and 1112 controls were included in the seven studies. Our data suggest that the COX-2 765G>C polymorphism may decrease the risk of AD in five genetic models. As a result, this meta-analysis suggests the 765G>C polymorphism of the COX-2 gene may be a protective factor for AD. As our sample size was limited, large-scale, well-designed studies are necessary to validate the association between the COX-2 765G>C polymorphism and AD. PMID:27443496
Using Regression Analysis: A Guided Tour.
ERIC Educational Resources Information Center
Shelton, Fred Ames
1987-01-01
Discusses the use and interpretation of multiple regression analysis with computer programs and presents a flow chart of the process. A general explanation of the flow chart is provided, followed by an example showing the development of a linear equation which could be used in estimating manufacturing overhead cost. (Author/LRW)
Commonality Analysis for the Regression Case.
ERIC Educational Resources Information Center
Murthy, Kavita
Commonality analysis is a procedure for decomposing the coefficient of determination (R superscript 2) in multiple regression analyses into the percent of variance in the dependent variable associated with each independent variable uniquely, and the proportion of explained variance associated with the common effects of predictors in various…
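The decomposition described here can be illustrated for the simplest case of two predictors. The sketch below is an assumption-laden example rather than the procedure from the document: the helper names `r_squared` and `commonality_two_predictors` are hypothetical, and the standard two-predictor identities are used (unique effect of a predictor = full-model R² minus the R² of the model without it; common effect = sum of the single-predictor R² values minus the full-model R²).

```python
import numpy as np

def r_squared(X, y):
    """R^2 of an ordinary least squares fit of y on X (intercept added)."""
    y = np.asarray(y, dtype=float)
    design = np.column_stack([np.ones(len(y)), X])
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)
    resid = y - design @ coef
    return 1.0 - (resid @ resid) / np.sum((y - y.mean()) ** 2)

def commonality_two_predictors(x1, x2, y):
    """Split R^2 into unique and common components for two predictors."""
    x1 = np.asarray(x1, dtype=float)
    x2 = np.asarray(x2, dtype=float)
    r2_full = r_squared(np.column_stack([x1, x2]), y)
    r2_x1 = r_squared(x1, y)
    r2_x2 = r_squared(x2, y)
    return {
        "unique_x1": r2_full - r2_x2,      # variance explained only by x1
        "unique_x2": r2_full - r2_x1,      # variance explained only by x2
        "common": r2_x1 + r2_x2 - r2_full,  # variance shared by x1 and x2
        "total": r2_full,
    }
```

By construction the three components sum to the full-model R². When the two predictors are uncorrelated the common component is zero, and it can be negative when suppression is present.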
Kim, Sungduk; Chen, Ming-Hui; Ibrahim, Joseph G.; Shah, Arvind K.; Lin, Jianxin
2013-01-01
In this paper, we propose a class of Box-Cox transformation regression models with multidimensional random effects for analyzing multivariate responses for individual patient data (IPD) in meta-analysis. Our modeling formulation uses a multivariate normal response meta-analysis model with multivariate random effects, in which each response is allowed to have its own Box-Cox transformation. Prior distributions are specified for the Box-Cox transformation parameters as well as the regression coefficients in this complex model, and the Deviance Information Criterion (DIC) is used to select the best transformation model. Since the model is quite complex, a novel Monte Carlo Markov chain (MCMC) sampling scheme is developed to sample from the joint posterior of the parameters. This model is motivated by a very rich dataset comprising 26 clinical trials involving cholesterol lowering drugs where the goal is to jointly model the three dimensional response consisting of Low Density Lipoprotein Cholesterol (LDL-C), High Density Lipoprotein Cholesterol (HDL-C), and Triglycerides (TG) (LDL-C, HDL-C, TG). Since the joint distribution of (LDL-C, HDL-C, TG) is not multivariate normal and in fact quite skewed, a Box-Cox transformation is needed to achieve normality. In the clinical literature, these three variables are usually analyzed univariately: however, a multivariate approach would be more appropriate since these variables are correlated with each other. A detailed analysis of these data is carried out using the proposed methodology. PMID:23580436
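For readers unfamiliar with the transformation that each response receives in such models, a minimal sketch of the standard one-parameter Box-Cox transform follows. The function name `box_cox` is hypothetical, and the code shows only the transform itself, not the Bayesian random-effects model or the sampling scheme described in the abstract.

```python
import numpy as np

def box_cox(y, lam):
    """One-parameter Box-Cox transform of strictly positive data.

    Returns (y**lam - 1) / lam for lam != 0 and log(y) for lam == 0,
    which is the limit of the first expression as lam approaches 0.
    """
    y = np.asarray(y, dtype=float)
    if np.any(y <= 0):
        raise ValueError("Box-Cox transform requires strictly positive values")
    if lam == 0:
        return np.log(y)
    return (y ** lam - 1.0) / lam
```

A value of `lam = 1` leaves the shape of the distribution unchanged (it only shifts the data by one), while values below 1 progressively compress the right tail, which is why the transform is commonly applied to right-skewed responses.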
Robust Mediation Analysis Based on Median Regression
Yuan, Ying; MacKinnon, David P.
2014-01-01
Mediation analysis has many applications in psychology and the social sciences. The most prevalent methods typically assume that the error distribution is normal and homoscedastic. However, this assumption may rarely be met in practice, which can affect the validity of the mediation analysis. To address this problem, we propose robust mediation analysis based on median regression. Our approach is robust to various departures from the assumption of homoscedasticity and normality, including heavy-tailed, skewed, contaminated, and heteroscedastic distributions. Simulation studies show that under these circumstances, the proposed method is more efficient and powerful than standard mediation analysis. We further extend the proposed robust method to multilevel mediation analysis, and demonstrate through simulation studies that the new approach outperforms the standard multilevel mediation analysis. We illustrate the proposed method using data from a program designed to increase reemployment and enhance mental health of job seekers. PMID:24079925
Wong, May C M; Lam, K F; Lo, Edward C M
2006-02-15
In some controlled clinical trials in dental research, multiple failure time data from the same patient are frequently observed that result in clustered multiple failure time. Moreover, the treatments are often delivered by more than one operator and thus the multiple failure times are clustered according to a multilevel structure when the operator effects are assumed to be random. In practice, it is often too expensive or even impossible to monitor the study subjects continuously, but they are examined periodically at some regular pre-scheduled visits. Hence, discrete or grouped clustered failure time data are collected. The aim of this paper is to illustrate the use of the Monte Carlo Markov chain (MCMC) approach and non-informative prior in a Bayesian framework to mimic the maximum likelihood (ML) estimation in a frequentist approach in multilevel modelling of clustered grouped survival data. A three-level model with additive variance components model for the random effects is considered in this paper. Both the grouped proportional hazards model and the dynamic logistic regression model are used. The approximate intra-cluster correlation of the log failure times can be estimated when the grouped proportional hazards model is used. The statistical package WinBUGS is adopted to estimate the parameter of interest based on the MCMC method. The models and method are applied to a data set obtained from a prospective clinical study on a cohort of Chinese school children in whom atraumatic restorative treatment (ART) restorations were placed on permanent teeth with carious lesions. Altogether 284 ART restorations were placed by five dentists and the clinical status of the ART restorations was evaluated annually for 6 years after placement, thus clustered grouped failure times of the restorations were recorded. Results based on the grouped proportional hazards model revealed that clustering effect among the log failure times of the different restorations from the same child was
A method for nonlinear exponential regression analysis
NASA Technical Reports Server (NTRS)
Junkin, B. G.
1971-01-01
A computer-oriented technique is presented for performing a nonlinear exponential regression analysis on decay-type experimental data. The technique involves the least squares procedure wherein the nonlinear problem is linearized by expansion in a Taylor series. A linear curve fitting procedure for determining the initial nominal estimates for the unknown exponential model parameters is included as an integral part of the technique. A correction matrix was derived and then applied to the nominal estimate to produce an improved set of model parameters. The solution cycle is repeated until some predetermined criterion is satisfied.
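The solution cycle described above (a linear fit for the initial estimates, followed by repeated Taylor-series corrections) can be sketched as a Gauss-Newton iteration. The abstract does not state the exact model form, so the single-term decay model y = a·exp(b·t), the function name `fit_exponential`, and the stopping rule are all assumptions made for illustration.

```python
import numpy as np

def fit_exponential(t, y, max_iter=50, tol=1e-10):
    """Least-squares fit of y = a * exp(b * t) by Gauss-Newton iteration.

    Assumes strictly positive y so that the log-linear starting fit exists.
    """
    t = np.asarray(t, dtype=float)
    y = np.asarray(y, dtype=float)

    # Initial nominal estimates from a straight-line fit of log(y) against t.
    slope, intercept = np.polyfit(t, np.log(y), 1)
    params = np.array([np.exp(intercept), slope])  # (a, b)

    for _ in range(max_iter):
        a, b = params
        growth = np.exp(b * t)
        residual = y - a * growth
        # First-order Taylor expansion: partial derivatives w.r.t. a and b.
        jacobian = np.column_stack([growth, a * t * growth])
        # Correction to the nominal estimate from the linearized problem.
        delta, *_ = np.linalg.lstsq(jacobian, residual, rcond=None)
        params = params + delta
        if np.max(np.abs(delta)) < tol:  # predetermined convergence criterion
            break
    return params
```

The log-linear starting values matter in practice: Gauss-Newton can diverge from a poor initial guess, which is presumably why the abstract treats the linear curve fit as an integral part of the technique.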
Sibling dilution hypothesis: a regression surface analysis.
Marjoribanks, K
2001-08-01
This study examined relationships between sibship size (the number of children in a family), birth order, and measures of academic performance, academic self-concept, and educational aspirations at different levels of family educational resources. As part of a national longitudinal study of Australian secondary school students data were collected from 2,530 boys and 2,450 girls in Years 9 and 10. Regression surfaces were constructed from models that included terms to account for linear, interaction, and curvilinear associations among the variables. Analysis suggests the general propositions (a) family educational resources have significant associations with children's school-related outcomes at different levels of sibling variables, the relationships for girls being curvilinear, and (b) sibling variables continue to have small significant associations with affective and cognitive outcomes, after taking into account variations in family educational resources. That is, the investigation provides only partial support for the sibling dilution hypothesis. PMID:11729548
Technological Forecasting with a Multiple Regression Analysis Approach.
ERIC Educational Resources Information Center
Luftig, Jeffrey T.; Norton, Willis P.
1981-01-01
This article examines simple and multiple regression analysis as forecasting tools, and details the process by which multiple regression analysis may be used to increase the accuracy of the technology forecast. (CT)
ERIC Educational Resources Information Center
Williams, John D.; Lindem, Alfred C.
Four computer programs using the general purpose multiple linear regression program have been developed. Setwise regression analysis is a stepwise procedure for sets of variables; there will be as many steps as there are sets. Covarmlt allows a solution to the analysis of covariance design with multiple covariates. A third program has three…
A rotor optimization using regression analysis
NASA Technical Reports Server (NTRS)
Giansante, N.
1984-01-01
The design and development of helicopter rotors is subject to the many design variables and their interactions that affect rotor operation. Until recently, selection of rotor design variables to achieve specified rotor operational qualities has been a costly, time consuming, repetitive task. For the past several years, Kaman Aerospace Corporation has successfully applied multiple linear regression analysis, coupled with optimization and sensitivity procedures, in the analytical design of rotor systems. It is concluded that approximating equations can be developed rapidly for a multiplicity of objective and constraint functions and optimizations can be performed in a rapid and cost effective manner; the number and/or range of design variables can be increased by expanding the data base and developing approximating functions to reflect the expanded design space; the order of the approximating equations can be expanded easily to improve correlation between analyzer results and the approximating equations; gradients of the approximating equations can be calculated easily and these gradients are smooth functions reducing the risk of numerical problems in the optimization; the use of approximating functions allows the problem to be started easily and rapidly from various initial designs to enhance the probability of finding a global optimum; and the approximating equations are independent of the analysis or optimization codes used.
Using Dominance Analysis to Determine Predictor Importance in Logistic Regression
ERIC Educational Resources Information Center
Azen, Razia; Traxel, Nicole
2009-01-01
This article proposes an extension of dominance analysis that allows researchers to determine the relative importance of predictors in logistic regression models. Criteria for choosing logistic regression R[superscript 2] analogues were determined and measures were selected that can be used to perform dominance analysis in logistic regression. A…
Sliced Inverse Regression for Time Series Analysis
NASA Astrophysics Data System (ADS)
Chen, Li-Sue
1995-11-01
In this thesis, general nonlinear models for time series data are considered. A basic form is $x_t = f(\beta_1^T X_{t-1}, \beta_2^T X_{t-1}, \ldots, \beta_k^T X_{t-1}, \varepsilon_t)$, where $x_t$ is an observed time series, $X_t$ is the first $d$ time lag vector $(x_t, x_{t-1}, \ldots, x_{t-d-1})$, $f$ is an unknown function, the $\beta_i$'s are unknown vectors, and the $\varepsilon_t$'s are independently distributed. Special cases include AR and TAR models. We investigate the feasibility of applying SIR/PHD (Li 1990, 1991) (the sliced inverse regression and principal Hessian directions methods) in estimating the $\beta_i$'s. PCA (principal component analysis) is brought in to check one critical condition for SIR/PHD. Through simulation and a study of three well-known data sets (Canadian lynx, U.S. unemployment rate, and sunspot numbers), we demonstrate how SIR/PHD can effectively retrieve the interesting low-dimensional structures of time series data.
Choi, In-Wook; Kim, Hwang-Yong; Quan, Juan-Hua; Ryu, Jae-Gee; Sun, Rubing; Lee, Young-Ha
2015-01-01
Fascioliasis, a food-borne trematode zoonosis, is a disease primarily in cattle and sheep and occasionally in humans. Water dropwort (Oenanthe javanica), an aquatic perennial herb, is a common second intermediate host of Fasciola, and the fresh stems and leaves are widely used as a seasoning in the Korean diet. However, no information regarding Fasciola species contamination in water dropwort is available. Here, we collected 500 samples of water dropwort in 3 areas in Korea during February and March 2015, and the water dropwort contamination of Fasciola species was monitored by DNA sequencing analysis of the Fasciola hepatica and Fasciola gigantica specific mitochondrial cytochrome c oxidase subunit 1 (cox1) and nuclear ribosomal internal transcribed spacer 2 (ITS-2). Among the 500 samples assessed, the presence of F. hepatica cox1 and ITS-2 markers was detected in 2 samples, and F. hepatica contamination was confirmed by sequencing analysis. The nucleotide sequences of cox1 PCR products from the 2 F. hepatica-contaminated samples were 96.5% identical to the F. hepatica cox1 sequences in GenBank, whereas F. gigantica cox1 sequences were 46.8% similar to the sequence detected from the cox1 positive samples. However, F. gigantica cox1 and ITS-2 markers were not detected by PCR in the 500 samples of water dropwort. Collectively, in this survey of the water dropwort contamination with Fasciola species, a very low prevalence of F. hepatica contamination was detected in the samples. PMID:26537044
Giganti, Mark J.; Luz, Paula M.; Caro-Vega, Yanink; Cesar, Carina; Padgett, Denis; Koenig, Serena; Echevarria, Juan; McGowan, Catherine C.; Shepherd, Bryan E.
2015-01-01
Many studies of HIV/AIDS aggregate data from multiple cohorts to improve power and generalizability. There are several analysis approaches to account for cross-cohort heterogeneity; we assessed how different approaches can impact results from an HIV/AIDS study investigating predictors of mortality. Using data from 13,658 HIV-infected patients starting antiretroviral therapy from seven Latin American and Caribbean cohorts, we illustrate the assumptions of seven readily implementable approaches to account for across cohort heterogeneity with Cox proportional hazards models, and we compare hazard ratio estimates across approaches. As a sensitivity analysis, we modify cohort membership to generate specific heterogeneity conditions. Hazard ratio estimates varied slightly between the seven analysis approaches, but differences were not clinically meaningful. Adjusted hazard ratio estimates for the association between AIDS at treatment initiation and death varied from 2.00 to 2.20 across approaches that accounted for heterogeneity; the adjusted hazard ratio was estimated as 1.73 in analyses that ignored across cohort heterogeneity. In sensitivity analyses with more extreme heterogeneity, we noted a slightly greater distinction between approaches. Despite substantial heterogeneity between cohorts, the impact of the specific approach to account for heterogeneity was minimal in our case study. Our results suggest that it is important to account for across cohort heterogeneity in analyses, but that the specific technique for addressing heterogeneity may be less important. Because of their flexibility in accounting for cohort heterogeneity, we prefer stratification or meta-analysis methods, but we encourage investigators to consider their specific study conditions and objectives. PMID:25647087
Giganti, Mark J; Luz, Paula M; Caro-Vega, Yanink; Cesar, Carina; Padgett, Denis; Koenig, Serena; Echevarria, Juan; McGowan, Catherine C; Shepherd, Bryan E
2015-05-01
Many studies of HIV/AIDS aggregate data from multiple cohorts to improve power and generalizability. There are several analysis approaches to account for cross-cohort heterogeneity; we assessed how different approaches can impact results from an HIV/AIDS study investigating predictors of mortality. Using data from 13,658 HIV-infected patients starting antiretroviral therapy from seven Latin American and Caribbean cohorts, we illustrate the assumptions of seven readily implementable approaches to account for across cohort heterogeneity with Cox proportional hazards models, and we compare hazard ratio estimates across approaches. As a sensitivity analysis, we modify cohort membership to generate specific heterogeneity conditions. Hazard ratio estimates varied slightly between the seven analysis approaches, but differences were not clinically meaningful. Adjusted hazard ratio estimates for the association between AIDS at treatment initiation and death varied from 2.00 to 2.20 across approaches that accounted for heterogeneity; the adjusted hazard ratio was estimated as 1.73 in analyses that ignored across cohort heterogeneity. In sensitivity analyses with more extreme heterogeneity, we noted a slightly greater distinction between approaches. Despite substantial heterogeneity between cohorts, the impact of the specific approach to account for heterogeneity was minimal in our case study. Our results suggest that it is important to account for across cohort heterogeneity in analyses, but that the specific technique for addressing heterogeneity may be less important. Because of their flexibility in accounting for cohort heterogeneity, we prefer stratification or meta-analysis methods, but we encourage investigators to consider their specific study conditions and objectives. PMID:25647087
Topics in route-regression analysis
Geissler, P.H.; Sauer, J.R.
1990-01-01
The route-regression method has been used in recent years to analyze data from roadside surveys. With this method, a population trend is estimated for each route in a region, then regional trends are estimated as a weighted mean of the individual route trends. This method can accurately incorporate data that is unbalanced by changes in years surveyed and observer differences. We suggest that route-regression methodology is most efficient in the estimation of long-term (>5 year) trends, and tends to provide conservative results for low-density species.
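The two-step idea described above (a trend per route, then a weighted mean across routes) can be sketched as follows. This is a minimal illustration, not the authors' implementation: the per-route trend is assumed here to be the ordinary least-squares slope of log counts on year, the counts are assumed to be strictly positive, and the function names and weights are hypothetical.

```python
import math

def route_trend(years, counts):
    """OLS slope of log(count) on year for one route (assumes counts > 0)."""
    logs = [math.log(c) for c in counts]
    n = len(years)
    mean_x = sum(years) / n
    mean_y = sum(logs) / n
    sxx = sum((x - mean_x) ** 2 for x in years)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(years, logs))
    return sxy / sxx

def regional_trend(routes, weights):
    """Weighted mean of the individual route trends.

    routes  : list of (years, counts) pairs, one per route
    weights : one non-negative weight per route (hypothetical choice)
    """
    slopes = [route_trend(years, counts) for years, counts in routes]
    return sum(w * s for w, s in zip(weights, slopes)) / sum(weights)
```

Routes may cover different years, since each slope is fitted separately before averaging.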
Crager, Michael R.; Tang, Gong
2015-01-01
We propose a method for assessing an individual patient’s risk of a future clinical event using clinical trial or cohort data and Cox proportional hazards regression, combining the information from several studies using meta-analysis techniques. The method combines patient-specific estimates of the log cumulative hazard across studies, weighting by the relative precision of the estimates, using either fixed- or random-effects meta-analysis calculations. Risk assessment can be done for any future patient using a few key summary statistics determined once and for all from each study. Generalizations of the method to logistic regression and linear models are immediate. We evaluate the methods using simulation studies and illustrate their application using real data. PMID:26664111
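The precision-weighted combination mentioned above can be illustrated with a fixed-effect (inverse-variance) pooling sketch. It assumes each study supplies an estimate of the patient's log cumulative hazard and its variance; the function names are hypothetical, the random-effects variant is not shown, and the conversion to risk uses the standard relation risk = 1 - exp(-H).

```python
import math

def pool_fixed_effect(estimates, variances):
    """Inverse-variance weighted mean of per-study estimates.

    Returns the pooled estimate and its variance.
    """
    weights = [1.0 / v for v in variances]
    total = sum(weights)
    pooled = sum(w * e for w, e in zip(weights, estimates)) / total
    return pooled, 1.0 / total

def risk_from_log_cumhaz(log_cumhaz):
    """Event probability implied by a log cumulative hazard: 1 - exp(-H)."""
    return 1.0 - math.exp(-math.exp(log_cumhaz))
```

For example, `pool_fixed_effect([-2.0, -1.0], [0.1, 0.1])` gives equal weight to both studies, and the pooled value can then be passed to `risk_from_log_cumhaz`.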
Joint regression analysis for discrete longitudinal data.
Madsen, L; Fang, Y
2011-09-01
We introduce an approximation to the Gaussian copula likelihood of Song, Li, and Yuan (2009, Biometrics 65, 60-68) used to estimate regression parameters from correlated discrete or mixed bivariate or trivariate outcomes. Our approximation allows estimation of parameters from response vectors of length much larger than three, and is asymptotically equivalent to the Gaussian copula likelihood. We estimate regression parameters from the toenail infection data of De Backer et al. (1996, British Journal of Dermatology 134, 16-17), which consist of binary response vectors of length seven or less from 294 subjects. Although maximizing the Gaussian copula likelihood yields estimators that are asymptotically more efficient than generalized estimating equation (GEE) estimators, our simulation study illustrates that for finite samples, GEE estimators can actually be as much as 20% more efficient. PMID:21039391
Xiao, Zengming; Wu, Hao; Wu, Yang
2013-01-01
Background: Numerous studies examining the relationship between Cyclooxygenase-2 (COX-2) immunoexpression and clinical outcome in osteosarcoma patients have yielded inconclusive results. Methods: We accordingly conducted a meta-analysis of 9 studies (442 patients) that evaluated the correlation between COX-2 immunoexpression and clinical prognosis (death). Pooled odds ratios (OR) and risk ratios (RR) with 95% confidence intervals (95% CI) were calculated using the random-effects or fixed-effects model. Results: Meta-analysis showed no significant association between COX-2 positivity and age, gender, tumor location, histology, stage, metastasis or 90% necrosis. Conversely, COX-2 immunoexpression was associated with overall survival rate (RR=2.12; 95% CI: 1.10–3.74; P=0.009) and disease-free survival rate (RR=1.63; 95% CI: 1.17–2.28; P=0.004) at 2 years. Sensitivity analysis performed by omitting low quality studies showed that the pooled results were stable. Conclusions: COX-2 positivity was associated with a lower 2-year overall survival rate and disease-free survival rate. COX-2 expression change is an independent prognostic factor in patients with osteosarcoma. PMID:24358237
Strategies for Detecting Outliers in Regression Analysis: An Introductory Primer.
ERIC Educational Resources Information Center
Evans, Victoria P.
Outliers are extreme data points that have the potential to influence statistical analyses. Outlier identification is important to researchers using regression analysis because outliers can influence the model used to such an extent that they seriously distort the conclusions drawn from the data. The effects of outliers on regression analysis are…
Molecular docking analysis of known flavonoids as duel COX-2 inhibitors in the context of cancer.
Dash, Raju; Uddin, Mir Muhammad Nasir; Hosen, S M Zahid; Rahim, Zahed Bin; Dinar, Abu Mansur; Kabir, Mohammad Shah Hafez; Sultan, Ramiz Ahmed; Islam, Ashekul; Hossain, Md Kamrul
2015-01-01
Cyclooxygenase-2 (COX-2) catalyzes the synthesis of prostaglandin E2 and is associated with tumor growth, infiltration, and metastasis in preclinical experiments. Known inhibitors of COX-2 exhibit toxicity. Therefore, it is of interest to screen natural compounds such as flavonoids against COX-2. Molecular docking of 12 known flavonoids against COX-2 was performed using FlexX and ArgusLab. All compounds showed a favourable binding energy of >-10 kJ/mol in FlexX and >-8 kcal/mol in ArgusLab. However, these data require in vitro and in vivo verification for further consideration. PMID:26770028
Molecular docking analysis of known flavonoids as duel COX-2 inhibitors in the context of cancer
Dash, Raju; Uddin, Mir Muhammad Nasir; Hosen, S.M. Zahid; Rahim, Zahed Bin; Dinar, Abu Mansur; Kabir, Mohammad Shah Hafez; Sultan, Ramiz Ahmed; Islam, Ashekul; Hossain, Md Kamrul
2015-01-01
Cyclooxygenase-2 (COX-2) catalyzes the synthesis of prostaglandin E2 and is associated with tumor growth, infiltration, and metastasis in preclinical experiments. Known inhibitors of COX-2 exhibit toxicity. Therefore, it is of interest to screen natural compounds such as flavonoids against COX-2. Molecular docking of 12 known flavonoids against COX-2 was performed using FlexX and ArgusLab. All compounds showed a favourable binding energy of >-10 kJ/mol in FlexX and >-8 kcal/mol in ArgusLab. However, these data require in vitro and in vivo verification for further consideration. PMID:26770028
ERIC Educational Resources Information Center
Hecht, Jeffrey B.
The analysis of regression residuals and detection of outliers are discussed, with emphasis on determining how deviant an individual data point must be to be considered an outlier and the impact that multiple suspected outlier data points have on the process of outlier determination and treatment. Only bivariate (one dependent and one independent)…
Takagi, Daisuke; Ikeda, Ken'ichi; Kawachi, Ichiro
2012-11-01
Crime is an important determinant of public health outcomes, including quality of life, mental well-being, and health behavior. A body of research has documented the association between community social capital and crime victimization. The association between social capital and crime victimization has been examined at multiple levels of spatial aggregation, ranging from entire countries, to states, metropolitan areas, counties, and neighborhoods. In multilevel analysis, the spatial boundaries at level 2 are most often drawn from administrative boundaries (e.g., Census tracts in the U.S.). One problem with adopting administrative definitions of neighborhoods is that it ignores spatial spillover. We conducted a study of social capital and crime victimization in one ward of Tokyo city, using a spatial Durbin model with an inverse-distance weighting matrix that assigned each respondent a unique level of "exposure" to social capital based on all other residents' perceptions. The study is based on a postal questionnaire sent to 20-69 years old residents of Arakawa Ward, Tokyo. The response rate was 43.7%. We examined the contextual influence of generalized trust, perceptions of reciprocity, two types of social network variables, as well as two principal components of social capital (constructed from the above four variables). Our outcome measure was self-reported crime victimization in the last five years. In the spatial Durbin model, we found that neighborhood generalized trust, reciprocity, supportive networks and two principal components of social capital were each inversely associated with crime victimization. By contrast, a multilevel regression performed with the same data (using administrative neighborhood boundaries) found generally null associations between neighborhood social capital and crime. Spatial regression methods may be more appropriate for investigating the contextual influence of social capital in homogeneous cultural settings such as Japan. PMID
Regression Commonality Analysis: A Technique for Quantitative Theory Building
ERIC Educational Resources Information Center
Nimon, Kim; Reio, Thomas G., Jr.
2011-01-01
When it comes to multiple linear regression analysis (MLR), it is common for social and behavioral science researchers to rely predominately on beta weights when evaluating how predictors contribute to a regression model. Presenting an underutilized statistical technique, this article describes how organizational researchers can use commonality…
The Precision Efficacy Analysis for Regression Sample Size Method.
ERIC Educational Resources Information Center
Brooks, Gordon P.; Barcikowski, Robert S.
The general purpose of this study was to examine the efficiency of the Precision Efficacy Analysis for Regression (PEAR) method for choosing appropriate sample sizes in regression studies used for precision. The PEAR method, which is based on the algebraic manipulation of an accepted cross-validity formula, essentially uses an effect size to…
PRINCIPAL COMPONENTS ANALYSIS AND PARTIAL LEAST SQUARES REGRESSION
The mathematics behind the techniques of principal component analysis and partial least squares regression is presented in detail, starting from the appropriate extreme conditions. The meaning of the resultant vectors and many of their mathematical interrelationships are also pres...
3D Regression Heat Map Analysis of Population Study Data.
Klemm, Paul; Lawonn, Kai; Glaßer, Sylvia; Niemann, Uli; Hegenscheid, Katrin; Völzke, Henry; Preim, Bernhard
2016-01-01
Epidemiological studies comprise heterogeneous data about a subject group to define disease-specific risk factors. These data contain information (features) about a subject's lifestyle, medical status as well as medical image data. Statistical regression analysis is used to evaluate these features and to identify feature combinations indicating a disease (the target feature). We propose an analysis approach of epidemiological data sets by incorporating all features in an exhaustive regression-based analysis. This approach combines all independent features w.r.t. a target feature. It provides a visualization that reveals insights into the data by highlighting relationships. The 3D Regression Heat Map, a novel 3D visual encoding, acts as an overview of the whole data set. It shows all combinations of two to three independent features with a specific target disease. Slicing through the 3D Regression Heat Map allows for the detailed analysis of the underlying relationships. Expert knowledge about disease-specific hypotheses can be included into the analysis by adjusting the regression model formulas. Furthermore, the influences of features can be assessed using a difference view comparing different calculation results. We applied our 3D Regression Heat Map method to a hepatic steatosis data set to reproduce results from a data mining-driven analysis. A qualitative analysis was conducted on a breast density data set. We were able to derive new hypotheses about relations between breast density and breast lesions with breast cancer. With the 3D Regression Heat Map, we present a visual overview of epidemiological data that allows for the first time an interactive regression-based analysis of large feature sets with respect to a disease. PMID:26529689
Linear regression analysis of survival data with missing censoring indicators.
Wang, Qihua; Dinse, Gregg E
2011-04-01
Linear regression analysis has been studied extensively in a random censorship setting, but typically all of the censoring indicators are assumed to be observed. In this paper, we develop synthetic data methods for estimating regression parameters in a linear model when some censoring indicators are missing. We define estimators based on regression calibration, imputation, and inverse probability weighting techniques, and we prove all three estimators are asymptotically normal. The finite-sample performance of each estimator is evaluated via simulation. We illustrate our methods by assessing the effects of sex and age on the time to non-ambulatory progression for patients in a brain cancer clinical trial. PMID:20559722
NASA Astrophysics Data System (ADS)
Ahn, Kuk-Hyun; Palmer, Richard
2016-09-01
Despite wide use of regression-based regional flood frequency analysis (RFFA) methods, the majority are based on either ordinary least squares (OLS) or generalized least squares (GLS). This paper proposes 'spatial proximity' based RFFA methods using the spatial lagged model (SLM) and spatial error model (SEM). The proposed methods are represented by two frameworks: the quantile regression technique (QRT) and parameter regression technique (PRT). The QRT develops prediction equations for flooding quantiles in average recurrence intervals (ARIs) of 2, 5, 10, 20, and 100 years whereas the PRT provides prediction of three parameters for the selected distribution. The proposed methods are tested using data incorporating 30 basin characteristics from 237 basins in Northeastern United States. Results show that generalized extreme value (GEV) distribution properly represents flood frequencies in the study gages. Also, basin area, stream network, and precipitation seasonality are found to be the most effective explanatory variables in prediction modeling by the QRT and PRT. 'Spatial proximity' based RFFA methods provide reliable flood quantile estimates compared to simpler methods. Compared to the QRT, the PRT may be recommended due to its accuracy and computational simplicity. The results presented in this paper may serve as one possible guidepost for hydrologists interested in flood analysis at ungaged sites.
Background stratified Poisson regression analysis of cohort data
Langholz, Bryan
2012-01-01
Background stratified Poisson regression is an approach that has been used in the analysis of data derived from a variety of epidemiologically important studies of radiation-exposed populations, including uranium miners, nuclear industry workers, and atomic bomb survivors. We describe a novel approach to fit Poisson regression models that adjust for a set of covariates through background stratification while directly estimating the radiation-disease association of primary interest. The approach makes use of an expression for the Poisson likelihood that treats the coefficients for stratum-specific indicator variables as ‘nuisance’ variables and avoids the need to explicitly estimate the coefficients for these stratum-specific parameters. Log-linear models, as well as other general relative rate models, are accommodated. This approach is illustrated using data from the Life Span Study of Japanese atomic bomb survivors and data from a study of underground uranium miners. The point estimate and confidence interval obtained from this ‘conditional’ regression approach are identical to the values obtained using unconditional Poisson regression with model terms for each background stratum. Moreover, it is shown that the proposed approach allows estimation of background stratified Poisson regression models of non-standard form, such as models that parameterize latency effects, as well as regression models in which the number of strata is large, thereby overcoming the limitations of previously available statistical software for fitting background stratified Poisson regression models. PMID:22193911
Background stratified Poisson regression analysis of cohort data.
Richardson, David B; Langholz, Bryan
2012-03-01
Background stratified Poisson regression is an approach that has been used in the analysis of data derived from a variety of epidemiologically important studies of radiation-exposed populations, including uranium miners, nuclear industry workers, and atomic bomb survivors. We describe a novel approach to fit Poisson regression models that adjust for a set of covariates through background stratification while directly estimating the radiation-disease association of primary interest. The approach makes use of an expression for the Poisson likelihood that treats the coefficients for stratum-specific indicator variables as 'nuisance' variables and avoids the need to explicitly estimate the coefficients for these stratum-specific parameters. Log-linear models, as well as other general relative rate models, are accommodated. This approach is illustrated using data from the Life Span Study of Japanese atomic bomb survivors and data from a study of underground uranium miners. The point estimate and confidence interval obtained from this 'conditional' regression approach are identical to the values obtained using unconditional Poisson regression with model terms for each background stratum. Moreover, it is shown that the proposed approach allows estimation of background stratified Poisson regression models of non-standard form, such as models that parameterize latency effects, as well as regression models in which the number of strata is large, thereby overcoming the limitations of previously available statistical software for fitting background stratified Poisson regression models. PMID:22193911
Regression Model Optimization for the Analysis of Experimental Data
NASA Technical Reports Server (NTRS)
Ulbrich, N.
2009-01-01
A candidate math model search algorithm was developed at Ames Research Center that determines a recommended math model for the multivariate regression analysis of experimental data. The search algorithm is applicable to classical regression analysis problems as well as wind tunnel strain gage balance calibration analysis applications. The algorithm compares the predictive capability of different regression models using the standard deviation of the PRESS residuals of the responses as a search metric. This search metric is minimized during the search. Singular value decomposition is used during the search to reject math models that lead to a singular solution of the regression analysis problem. Two threshold dependent constraints are also applied. The first constraint rejects math models with insignificant terms. The second constraint rejects math models with near-linear dependencies between terms. The math term hierarchy rule may also be applied as an optional constraint during or after the candidate math model search. The final term selection of the recommended math model depends on the regressor and response values of the data set, the user's function class combination choice, the user's constraint selections, and the result of the search metric minimization. A frequently used regression analysis example from the literature is used to illustrate the application of the search algorithm to experimental data.
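The PRESS residuals used as the search metric above are leave-one-out prediction errors, which for ordinary least squares can be obtained without refitting as e_i / (1 - h_ii), where h_ii is the leverage of point i. The sketch below shows this for a single-regressor straight-line fit only; it is not the Ames search algorithm, the function names are hypothetical, and the root-mean-square summary is an assumed stand-in for the "standard deviation of the PRESS residuals", whose exact definition the abstract does not give.

```python
import math

def fit_line(x, y):
    """Ordinary least-squares intercept and slope for y = b0 + b1 * x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    slope = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y)) / sxx
    return mean_y - slope * mean_x, slope

def press_residuals(x, y):
    """Leave-one-out residuals via the leverage shortcut e_i / (1 - h_ii)."""
    n = len(x)
    mean_x = sum(x) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    b0, b1 = fit_line(x, y)
    result = []
    for xi, yi in zip(x, y):
        leverage = 1.0 / n + (xi - mean_x) ** 2 / sxx
        result.append((yi - (b0 + b1 * xi)) / (1.0 - leverage))
    return result

def press_rms(x, y):
    """Root mean square of the PRESS residuals (assumed summary metric)."""
    r = press_residuals(x, y)
    return math.sqrt(sum(ri ** 2 for ri in r) / len(r))
```

A model search in this spirit would compute such a metric for each candidate model and prefer the candidate with the smallest value.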
Association of COX-2 -765G>C genetic polymorphism with coronary artery disease: a meta-analysis
Zhang, Ming-Ming; Xie, Xiang; Ma, Yi-Tong; Zheng, Ying-Ying; Yang, Yi-Ning; Li, Xiao-Mei; Fu, Zhen-Yan; Liu, Fen; Chen, Bang-Dang
2015-01-01
Background: Previous studies suggested the single nucleotide polymorphism (SNP) of COX-2 -765G>C (rs20417) is associated with coronary artery disease (CAD), but the results were conflicting. In order to derive a more precise estimation of the associations, we performed a meta-analysis of the relationship between rs20417 and CAD in all published studies. Method: Databases including PubMed, Web of Science, Wanfang, SinoMed and CNKI were systematically searched. Data were extracted using standardized methods. The association was assessed by odds ratio (OR) with 95% confidence intervals (CIs). The statistical tests were performed using Review Manager 5.3.3 and Stata 12.0 software. Results: We identified a total of 14 studies involving a total of 18227 subjects. The pooled odds ratio (OR) for the association between COX-2 -765G>C and CAD and its corresponding 95% confidence interval (95% CI) were evaluated by random or fixed effect model. A significant statistical association between COX-2 -765G>C and CAD was observed in an allelic model (P=0.02, OR=0.64, 95% CI: 0.43-0.94), dominant model (P=0.04, OR=0.74, 95% CI: 0.56-0.99), and recessive model (P=0.02, OR=0.46, 95% CI: 0.23-0.90). Conclusion: This meta-analysis suggested that COX-2 -765G>C is protective against CAD. PMID:26221283
Joint regression analysis and AMMI model applied to oat improvement
NASA Astrophysics Data System (ADS)
Oliveira, A.; Oliveira, T. A.; Mejza, S.
2012-09-01
In our work we present an application of some biometrical methods useful in genotype stability evaluation, namely the AMMI model, Joint Regression Analysis (JRA) and multiple comparison tests. A genotype stability analysis of oat (Avena sativa L.) grain yield was carried out using data from the Portuguese Plant Breeding Board for a sample of 22 different genotypes grown during the years 2002, 2003 and 2004 at six locations. Ferreira et al. (2006) state the relevance of regression models and of the Additive Main Effects and Multiplicative Interactions (AMMI) model for studying and estimating phenotypic stability effects. As computational techniques we use the Zigzag algorithm to estimate the regression coefficients and the agricolae package available in R software for AMMI model analysis.
Time series analysis using semiparametric regression on oil palm production
NASA Astrophysics Data System (ADS)
Yundari, Pasaribu, U. S.; Mukhaiyar, U.
2016-04-01
This paper presents a semiparametric kernel regression method, which has shown flexibility and ease of mathematical calculation, especially in estimating density and regression functions. The kernel function is continuous and produces a smooth estimate. The classical kernel density estimator is constructed by completely nonparametric analysis and works reasonably well for all forms of function. Here, we discuss parameter estimation in time series analysis. First, we assume that the parameters exist, then we apply nonparametric estimation; this combination is called semiparametric. The optimum bandwidth is selected by considering an approximation of the Mean Integrated Squared Error (MISE).
Analysis of Sting Balance Calibration Data Using Optimized Regression Models
NASA Technical Reports Server (NTRS)
Ulbrich, N.; Bader, Jon B.
2010-01-01
Calibration data of a wind tunnel sting balance was processed using a candidate math model search algorithm that recommends an optimized regression model for the data analysis. During the calibration the normal force and the moment at the balance moment center were selected as independent calibration variables. The sting balance itself had two moment gages. Therefore, after analyzing the connection between calibration loads and gage outputs, it was decided to choose the difference and the sum of the gage outputs as the two responses that best describe the behavior of the balance. The math model search algorithm was applied to these two responses. An optimized regression model was obtained for each response. Classical strain gage balance load transformations and the equations of the deflection of a cantilever beam under load are used to show that the search algorithm's two optimized regression models are supported by a theoretical analysis of the relationship between the applied calibration loads and the measured gage outputs. The analysis of the sting balance calibration data set is a rare example of a situation when terms of a regression model of a balance can directly be derived from first principles of physics. In addition, it is interesting to note that the search algorithm recommended the correct regression model term combinations using only a set of statistical quality metrics that were applied to the experimental data during the algorithm's term selection process.
Accounting for the correlation between fellow eyes in regression analysis.
Glynn, R J; Rosner, B
1992-03-01
Regression techniques that appropriately use all available eyes have infrequently been applied in the ophthalmologic literature, despite advances both in the development of statistical models and in the availability of computer software to fit these models. We considered the general linear model and polychotomous logistic regression approaches of Rosner and the estimating equation approach of Liang and Zeger, applied to both linear and logistic regression. Methods were illustrated with the use of two real data sets: (1) impairment of visual acuity in patients with retinitis pigmentosa and (2) overall visual field impairment in elderly patients evaluated for glaucoma. We discuss the interpretation of coefficients from these models and the advantages of these approaches compared with alternative approaches, such as treating individuals rather than eyes as the unit of analysis, separate regression analyses of right and left eyes, or utilization of ordinary regression techniques without accounting for the correlation between fellow eyes. Specific advantages include enhanced statistical power, more interpretable regression coefficients, greater precision of estimation, and less sensitivity to missing data for some eyes. We concluded that these models should be used more frequently in ophthalmologic research, and we provide guidelines for choosing between alternative models. PMID:1543458
Regression analysis for solving diagnosis problem of children's health
NASA Astrophysics Data System (ADS)
Cherkashina, Yu A.; Gerget, O. M.
2016-04-01
The paper reports research on applying statistical techniques, namely regression analysis, to assess the health status of children in the neonatal period based on medical data (hemostatic parameters, blood test parameters, gestational age, vascular-endothelial growth factor) measured at 3-5 days of life. A detailed description of the studied medical data is given, and a binary logistic regression procedure is discussed. Basic results of the research are presented. A classification table of predicted and observed values is shown, and the overall percentage of correct recognition is determined. Regression equation coefficients are calculated, and the general regression equation is written based on them. Based on the results of logistic regression, ROC analysis was performed: the sensitivity and specificity of the model are calculated and ROC curves are constructed. These mathematical techniques allow diagnosis of children's health with a high quality of recognition. The results make a significant contribution to the development of evidence-based medicine and have high practical importance in the professional activity of the author.
Regression Analysis: Instructional Resource for Cost/Managerial Accounting
ERIC Educational Resources Information Center
Stout, David E.
2015-01-01
This paper describes a classroom-tested instructional resource, grounded in principles of active learning and constructivism, that embraces two primary objectives: "demystify" for accounting students technical material from statistics regarding ordinary least-squares (OLS) regression analysis--material that students may find obscure or…
Analysis of Sting Balance Calibration Data Using Optimized Regression Models
NASA Technical Reports Server (NTRS)
Ulbrich, Norbert; Bader, Jon B.
2009-01-01
Calibration data of a wind tunnel sting balance was processed using a search algorithm that identifies an optimized regression model for the data analysis. The selected sting balance had two moment gages that were mounted forward and aft of the balance moment center. The difference and the sum of the two gage outputs were fitted in the least squares sense using the normal force and the pitching moment at the balance moment center as independent variables. The regression model search algorithm predicted that the difference of the gage outputs should be modeled using the intercept and the normal force. The sum of the two gage outputs, on the other hand, should be modeled using the intercept, the pitching moment, and the square of the pitching moment. Equations of the deflection of a cantilever beam are used to show that the search algorithm's two recommended math models can also be obtained after performing a rigorous theoretical analysis of the deflection of the sting balance under load. The analysis of the sting balance calibration data set is a rare example of a situation when regression models of balance calibration data can directly be derived from first principles of physics and engineering. In addition, it is interesting to see that the search algorithm recommended the same regression models for the data analysis using only a set of statistical quality metrics.
Quantile Regression with Censored Data
ERIC Educational Resources Information Center
Lin, Guixian
2009-01-01
The Cox proportional hazards model and the accelerated failure time model are frequently used in survival data analysis. They are powerful, yet have limitations due to their model assumptions. Quantile regression offers a semiparametric approach to model data with possible heterogeneity. It is particularly powerful for censored responses, where the…
A SAS macro for residual deviance of ordinal regression analysis.
Wan, J Y; Wang, W; Bromberg, J
1994-12-01
In this paper, a SAS macro is described for calculating the likelihood of the 'saturated' model in the analysis of ordinal regression. The outcome variable is multinomial on an ordinal scale, while the explanatory variables can be nominal or ordinal. Several ordinal regression models may be fitted to the data. One method of testing for the goodness of fit of these regression models is by comparing the residual deviance with the chi-square distribution. In SAS, PROC LOGISTIC may be used to fit this type of data with a proportional odds model. Unfortunately, the residual deviance is not available from the output. Our SAS macro will supplement the SAS output so that the residual deviance test may be carried out. The data from an ongoing HIV study is used as an illustration. PMID:7736732
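The residual deviance test mentioned in this abstract can be written out as a short worked formula. The notation below is assumed for illustration and is not taken from the paper: subjects are grouped into I covariate patterns with J ordered outcome categories, n_{ij} is an observed count, n_{i+} is the row total, \hat{\pi}_{ij} is a fitted category probability, and p is the number of fitted parameters.

```latex
% Assumed notation (not from the paper); terms with n_{ij} = 0 contribute 0.
\[
  \ell_{\text{sat}} = \sum_{i=1}^{I}\sum_{j=1}^{J} n_{ij}\,\log\frac{n_{ij}}{n_{i+}},
  \qquad
  \ell_{\text{fit}} = \sum_{i=1}^{I}\sum_{j=1}^{J} n_{ij}\,\log\hat{\pi}_{ij},
\]
\[
  D = 2\left(\ell_{\text{sat}} - \ell_{\text{fit}}\right)
  \;\overset{\text{approx.}}{\sim}\; \chi^{2}_{\,I(J-1)-p}.
\]
```

The chi-square reference distribution is a large-sample approximation, so it is usually trusted only when the grouped cell counts are not too sparse.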
Kolenda, Rafał; Ugorski, Maciej; Bednarski, Michał
2014-08-01
Sarcocysts from four Polish roe deer were collected and examined by light microscopy, small subunit ribosomal RNA (ssu rRNA), and the subunit I of cytochrome oxidase (cox1) sequence analysis. This resulted in identification of Sarcocystis gracilis, Sarcocystis oviformis, and Sarcocystis silva. However, we were unable to detect Sarcocystis capreolicanis, the fourth Sarcocystis species found previously in Norwegian roe deer. Polish sarcocysts isolated from various tissues differed in terms of their shape and size and were larger than the respective Norwegian isolates. Analysis of the ssu rRNA gene revealed a lack of differences between Sarcocystis isolates belonging to one species and a very low degree of genetic diversity between Polish and Norwegian sarcocysts, ranging from 0.1% for Sarcocystis gracilis and Sarcocystis oviformis to 0.44% for Sarcocystis silva. Contrary to the results of the ssu rRNA analysis, small intraspecies differences in cox1 sequences were found among Polish Sarcocystis gracilis and Sarcocystis silva isolates. The comparison of Polish and Norwegian cox1 sequences representing the same Sarcocystis species revealed a similar degree of sequence identity, namely 99.72% for Sarcocystis gracilis, 98.76% for Sarcocystis silva, and 99.85% for Sarcocystis oviformis. Phylogenetic reconstruction and genetic population analyses showed an unexpectedly high degree of identity between Polish and Norwegian isolates. Moreover, cox1 gene sequences turned out to be more accurate than ssu rRNA when used to reveal phylogenetic relationships among closely related species. The results of our study revealed that the same Sarcocystis species isolated from the same hosts living in different geographic regions show a very high level of genetic similarity. PMID:24948101
Agogo, George O; van der Voet, Hilko; Van't Veer, Pieter; van Eeuwijk, Fred A; Boshuizen, Hendriek C
2016-07-01
Dietary questionnaires are prone to measurement error, which biases the perceived association between dietary intake and risk of disease. Short-term measurements are required to adjust for the bias in the association. For foods that are not consumed daily, the short-term measurements are often characterized by excess zeroes. Via a simulation study, the performance of a two-part calibration model that was developed for a single-replicate study design was assessed by mimicking leafy vegetable intake reports from the multicenter European Prospective Investigation into Cancer and Nutrition (EPIC) study. In part I of the fitted two-part calibration model, a logistic distribution was assumed; in part II, a gamma distribution was assumed. The model was assessed with respect to the magnitude of the correlation between the consumption probability and the consumed amount (hereafter, cross-part correlation), the number and form of covariates in the calibration model, the percentage of zero response values, and the magnitude of the measurement error in the dietary intake. From the simulation study results, transforming the dietary variable in the regression calibration to an appropriate scale was found to be the most important factor for the model performance. Reducing the number of covariates in the model could be beneficial, but was not critical in large-sample studies. The performance was remarkably robust when fitting a one-part rather than a two-part model. The model performance was minimally affected by the cross-part correlation. PMID:27003183
Islam, Abul B M M K; Dave, Mandar; Amin, Sonia; Jensen, Roderick V; Amin, Ashok R
2016-04-01
The constitutively-expressed cyclooxygenase 1 (COX-1) and the inducible COX-2 are both involved in the conversion of arachidonic acid (AA) to prostaglandins (PGs). However, the functional roles of COX-1 at the cellular level remain unclear. We hypothesized that comparing differential gene expression and eicosanoid metabolism in lung fibroblasts from wild-type (WT) mice and COX-2(-/-) or COX-1(-/-) mice may help address the functional roles of COX-1 in inflammation and other cellular functions. Compared to WT, the number of specifically-induced transcripts was altered in descending order as follows: COX-2(-/-)>COX-1(-/-)>WT+IL-1β. COX-1(-/-) or COX-2(-/-) cells shared about 50% of the induced transcripts with WT cells treated with IL-1β, respectively. An interactive "anti-inflammatory, proinflammatory, and redox-activated" signature in the protein-protein interactome map was observed in COX-2(-/-) cells. The augmented COX-1 mRNA (in COX-2(-/-) cells) was associated with the upregulation of mRNAs for glutathione S-transferase (GST), superoxide dismutase (SOD), NAD(P)H dehydrogenase quinone 1 (NQO1), aryl hydrocarbon receptor (AhR), peroxiredoxin, phospholipase, prostacyclin synthase, and prostaglandin E synthase, resulting in a significant increase in the levels of PGE2, PGD2, leukotriene B4 (LTB4), PGF1α, thromboxane B2 (TXB2), and PGF2α. COX-1 plays a dominant role in shifting AA toward the LTB4 pathway and anti-inflammatory activities. Compared to WT, the upregulated COX-1 mRNA in COX-2(-/-) cells generated an "eicosanoid storm". The genomic characteristics of COX-2(-/-) are similar to those of proinflammatory cells as observed in IL-1β induced WT cells. COX-1(-/-) and COX-2(-/-) cells exhibited compensation of various eicosanoids at the genomic and metabolic levels. PMID:27012456
Parra, Edwin Roger; Lin, Flavia; Martins, Vanessa; Rangel, Maristela Peres; Capelozzi, Vera Luiza
2013-01-01
OBJECTIVE: To study the expression of COX-1 and COX-2 in the remodeled lung in systemic sclerosis (SSc) and idiopathic pulmonary fibrosis (IPF) patients, correlating that expression with patient survival. METHODS: We examined open lung biopsy specimens from 24 SSc patients and 30 IPF patients, using normal lung tissue as a control. The histological patterns included fibrotic nonspecific interstitial pneumonia (NSIP) in SSc patients and usual interstitial pneumonia (UIP) in IPF patients. We used immunohistochemistry and histomorphometry to evaluate the expression of COX-1 and COX-2 in alveolar septa, vessels, and bronchioles. We then correlated that expression with pulmonary function test results and evaluated its impact on patient survival. RESULTS: The expression of COX-1 and COX-2 in alveolar septa was significantly higher in IPF-UIP and SSc-NSIP lung tissue than in the control tissue. No difference was found between IPF-UIP and SSc-NSIP tissue regarding COX-1 and COX-2 expression. Multivariate analysis based on the Cox regression model showed that the factors associated with a low risk of death were younger age, high DLCO/alveolar volume, IPF, and high COX-1 expression in alveolar septa, whereas those associated with a high risk of death were advanced age, low DLCO/alveolar volume, SSc (with NSIP), and low COX-1 expression in alveolar septa. CONCLUSIONS: Our findings suggest that strategies aimed at preventing low COX-1 synthesis will have a greater impact on SSc, whereas those aimed at preventing high COX-2 synthesis will have a greater impact on IPF. However, prospective randomized clinical trials are needed in order to confirm that. PMID:24473763
The Consequences Of Model Misspecification In Regression Analysis.
Deegan, J
1976-04-01
In ordinary least squares regression analysis the desired property of unbiasedness in estimated coefficients is contingent upon the correspondence of the fitted model with the true underlying data generating process. This paper focuses on developing a systematic characterization of the error forms resulting from model misspecification in single equation models. The consequences of model misspecification, for the error forms identified, are also evaluated. PMID:26821674
Robust regression applied to fractal/multifractal analysis.
NASA Astrophysics Data System (ADS)
Portilla, F.; Valencia, J. L.; Tarquis, A. M.; Saa-Requejo, A.
2012-04-01
Fractals and multifractals are concepts that have grown increasingly popular in soil analysis in recent years, along with the development of fractal models. One of the common steps is to calculate the slope of a linear fit, commonly using the least squares method. This should not be a particular problem; however, in many situations involving experimental data the researcher has to select the range of scales to work with, neglecting the remaining points, to achieve the linearity that this type of analysis requires. Robust regression is a form of regression analysis designed to circumvent some limitations of traditional parametric and non-parametric methods. In this method we do not have to assume that an outlier is simply an extreme observation drawn from the tail of a normal distribution that does not compromise the validity of the regression results. In this work we have evaluated the capacity of robust regression to select the points in the experimental data, trying to avoid subjective choices. Based on this analysis we have developed a new methodology that involves two basic steps: • Evaluation of the improvement of the linear fit when consecutive points are eliminated, based on R p-value. In this way we consider the implications of reducing the number of points. • Evaluation of the significance of the slope difference between the fit with the two extreme points and the fit with the available points. We compare the results of applying this methodology with those of the commonly used least squares method. The data selected for these comparisons come from experimental soil roughness transects and from simulations based on the midpoint displacement method with added trends and noise. The results are discussed, indicating the advantages and disadvantages of each methodology. Acknowledgements: Funding provided by CEIGRAM (Research Centre for the Management of Agricultural and Environmental Risks) and by Spanish Ministerio de Ciencia e Innovación (MICINN) through project no
Functional Regression Models for Epistasis Analysis of Multiple Quantitative Traits.
Zhang, Futao; Xie, Dan; Liang, Meimei; Xiong, Momiao
2016-04-01
To date, most genetic analyses of phenotypes have focused on analyzing single traits or analyzing each phenotype independently. However, joint epistasis analysis of multiple complementary traits will increase statistical power and improve our understanding of the complicated genetic structure of complex diseases. Despite their importance in uncovering the genetic structure of complex traits, the statistical methods for identifying epistasis in multiple phenotypes remain fundamentally unexplored. To fill this gap, we formulate a test for interaction between two genes in multiple quantitative trait analysis as a multiple functional regression (MFRG) in which the genotype functions (genetic variant profiles) are defined as a function of the genomic position of the genetic variants. We use large-scale simulations to calculate Type I error rates for testing interaction between two genes with multiple phenotypes and to compare the power with multivariate pairwise interaction analysis and single trait interaction analysis by a single variate functional regression model. To further evaluate performance, the MFRG for epistasis analysis is applied to five phenotypes of exome sequence data from the NHLBI's Exome Sequencing Project (ESP) to detect pleiotropic epistasis. A total of 267 pairs of genes that formed a genetic interaction network showed significant evidence of epistasis influencing five traits. The results demonstrate that the joint interaction analysis of multiple phenotypes has a much higher power to detect interaction than the interaction analysis of a single trait and may open a new direction to fully uncovering the genetic structure of multiple phenotypes. PMID:27104857
Functional Regression Models for Epistasis Analysis of Multiple Quantitative Traits
Xie, Dan; Liang, Meimei; Xiong, Momiao
2016-01-01
To date, most genetic analyses of phenotypes have focused on analyzing single traits or analyzing each phenotype independently. However, joint epistasis analysis of multiple complementary traits will increase statistical power and improve our understanding of the complicated genetic structure of complex diseases. Despite their importance in uncovering the genetic structure of complex traits, the statistical methods for identifying epistasis in multiple phenotypes remain fundamentally unexplored. To fill this gap, we formulate a test for interaction between two genes in multiple quantitative trait analysis as a multiple functional regression (MFRG) in which the genotype functions (genetic variant profiles) are defined as a function of the genomic position of the genetic variants. We use large-scale simulations to calculate Type I error rates for testing interaction between two genes with multiple phenotypes and to compare the power with multivariate pairwise interaction analysis and single trait interaction analysis by a single variate functional regression model. To further evaluate performance, the MFRG for epistasis analysis is applied to five phenotypes of exome sequence data from the NHLBI’s Exome Sequencing Project (ESP) to detect pleiotropic epistasis. A total of 267 pairs of genes that formed a genetic interaction network showed significant evidence of epistasis influencing five traits. The results demonstrate that the joint interaction analysis of multiple phenotypes has a much higher power to detect interaction than the interaction analysis of a single trait and may open a new direction to fully uncovering the genetic structure of multiple phenotypes. PMID:27104857
Poisson Regression Analysis of Illness and Injury Surveillance Data
Frome E.L., Watkins J.P., Ellis E.D.
2012-12-12
The Department of Energy (DOE) uses illness and injury surveillance to monitor morbidity and assess the overall health of the work force. Data collected from each participating site include health events and a roster file with demographic information. The source data files are maintained in a relational data base, and are used to obtain stratified tables of health event counts and person time at risk that serve as the starting point for Poisson regression analysis. The explanatory variables that define these tables are age, gender, occupational group, and time. Typical response variables of interest are the number of absences due to illness or injury, i.e., the response variable is a count. Poisson regression methods are used to describe the effect of the explanatory variables on the health event rates using a log-linear main effects model. Results of fitting the main effects model are summarized in a tabular and graphical form and interpretation of model parameters is provided. An analysis of deviance table is used to evaluate the importance of each of the explanatory variables on the event rate of interest and to determine if interaction terms should be considered in the analysis. Although Poisson regression methods are widely used in the analysis of count data, there are situations in which over-dispersion occurs. This could be due to lack-of-fit of the regression model, extra-Poisson variation, or both. A score test statistic and regression diagnostics are used to identify over-dispersion. A quasi-likelihood method of moments procedure is used to evaluate and adjust for extra-Poisson variation when necessary. Two examples are presented using respiratory disease absence rates at two DOE sites to illustrate the methods and interpretation of the results. In the first example the Poisson main effects model is adequate. In the second example the score test indicates considerable over-dispersion and a more detailed analysis attributes the over-dispersion to extra
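The adjustment for extra-Poisson variation mentioned in this abstract can be illustrated with the usual moment estimate of the dispersion, the Pearson chi-square divided by the residual degrees of freedom. This is a minimal sketch and not the report's procedure: the function name `pearson_dispersion`, the counts, and the "fitted" means are made up for illustration and do not come from any fitted model or DOE data.

```python
def pearson_dispersion(observed, fitted, n_params):
    """Moment estimate of dispersion: Pearson chi-square / residual degrees of freedom."""
    if len(observed) != len(fitted):
        raise ValueError("observed and fitted must have the same length")
    resid_df = len(observed) - n_params
    if resid_df <= 0:
        raise ValueError("need more observations than fitted parameters")
    # Under a Poisson model the variance equals the mean, so each squared
    # residual is scaled by the fitted mean.
    chi_square = sum((y - mu) ** 2 / mu for y, mu in zip(observed, fitted))
    return chi_square / resid_df


# Hypothetical event counts and hypothetical fitted means (illustration only).
observed_counts = [12, 30, 7, 45, 19, 3]
fitted_means = [10.0, 20.0, 10.0, 30.0, 20.0, 10.0]

# n_params=2 assumes a model with two estimated coefficients.
phi = pearson_dispersion(observed_counts, fitted_means, n_params=2)
```

A value of `phi` near 1 is consistent with Poisson variation; values well above 1 suggest over-dispersion, in which case quasi-likelihood standard errors are commonly inflated by the square root of `phi`.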
Multivariate concentration determination using principal component regression with residual analysis
Keithley, Richard B.; Heien, Michael L.; Wightman, R. Mark
2009-01-01
Data analysis is an essential tenet of analytical chemistry, extending the possible information obtained from the measurement of chemical phenomena. Chemometric methods have grown considerably in recent years, but their wide use is hindered because some still consider them too complicated. The purpose of this review is to describe a multivariate chemometric method, principal component regression, in a simple manner from the point of view of an analytical chemist, to demonstrate the need for proper quality-control (QC) measures in multivariate analysis and to advocate the use of residuals as a proper QC method. PMID:20160977
Regression Analysis of Electric Power Price in California Power Exchange
NASA Astrophysics Data System (ADS)
Miyauchi, Hajime; Tatsuguchi, Genta; Misawa, Tetsuya
The electric power industry in the State of California was liberalized beginning in April 1998. Although this liberalization has been suspended because of extremely high bids and outages, the price information from the power exchange is very valuable for investigating the price structure and its determining factors. From a publicly accessible web site, we obtained hourly data on the zone prices and the total demand of California from April 1998 to September 2001, the period of deregulation of the electric power industry. We analyze the prices by regression analysis. In this paper, we successfully construct simple regression equations by classifying the price data into four time zones. Next, we analyze the prices from June to September 2000, when the price cap on the power price was changed twice. The Chow test shows that structural changes in the power price occurred when the price cap was changed. Thus we identify the determining factors of the electric power price by regression analysis.
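The Chow test used in this abstract compares one pooled regression against separate regressions fitted before and after a suspected break. The sketch below shows the standard F statistic for a simple (one-regressor) model. The function names and the demand and price numbers are hypothetical and are not taken from the California data.

```python
def simple_ols_rss(x, y):
    """Residual sum of squares of a simple linear regression y = a + b*x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return sum((yi - intercept - slope * xi) ** 2 for xi, yi in zip(x, y))


def chow_f(x1, y1, x2, y2, k=2):
    """Chow F statistic; k is the number of coefficients per regression."""
    rss_pooled = simple_ols_rss(x1 + x2, y1 + y2)
    rss_split = simple_ols_rss(x1, y1) + simple_ols_rss(x2, y2)
    dof = len(x1) + len(x2) - 2 * k
    return ((rss_pooled - rss_split) / k) / (rss_split / dof)


# Hypothetical demand (x) and price (y) before a price-cap change.
demand_before = [1.0, 2.0, 3.0, 4.0, 5.0]
price_before = [2.1, 3.9, 6.2, 7.8, 10.1]

# Hypothetical period after the change: same relationship vs. a steeper one.
demand_after = [6.0, 7.0, 8.0, 9.0, 10.0]
price_after_same = [11.9, 14.2, 15.8, 18.1, 19.9]
price_after_break = [30.2, 34.8, 40.1, 44.9, 50.2]

f_same = chow_f(demand_before, price_before, demand_after, price_after_same)
f_break = chow_f(demand_before, price_before, demand_after, price_after_break)
```

The statistic is referred to an F distribution with k and n1 + n2 - 2k degrees of freedom; a large value, as in the `f_break` case, points to a structural change between the two periods.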
2012-01-01
Background: Evidence is accumulating that chronic inflammation may have an important role in prostate cancer (PCa). The COX-2 polymorphism rs2745557 (+202 C/T) has been extensively investigated as a potential risk factor for PCa, but the results have thus far been inconclusive. This meta-analysis was performed to derive a more precise estimation of the association. Methods: A comprehensive search was conducted to identify all case-control studies of COX-2 rs2745557 polymorphism and PCa risk. We used odds ratios (ORs) to assess the strength of the association, and 95% confidence intervals (CIs) give a sense of the precision of the estimate. Statistical analyses were performed by Review Manager, version 5.0 and Stata 10.0. Results: A total of 8 available studies were considered in the present meta-analysis, with 11356 patients and 11641 controls for rs2745557. When all groups were pooled, there was no evidence that rs2745557 had significant association with PCa under co-dominant, recessive, over-dominant, and allelic models. However, our analysis suggested that rs2745557 was associated with a lower PCa risk under dominant model in overall population (OR = 0.85, 95%CI = 0.74-0.97, P = 0.02). When stratifying for race, there was a significant association between rs2745557 polymorphism and lower PCa risk in dominant model comparison in the subgroup of Caucasians (OR = 0.86, 95%CI = 0.75-0.99, P = 0.04), but not in co-dominant, recessive, over-dominant and allelic comparisons. Conclusion: Based on our meta-analysis, COX-2 rs2745557 was associated with a lower PCa risk under dominant model in Caucasians. PMID:22435969
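The odds ratios and 95% confidence intervals quoted in this abstract are pooled across studies, but the single-study building block is easy to sketch. The example below computes an odds ratio from one 2x2 case-control table with a log-scale (Woolf) interval. The function name and the counts are hypothetical and are not taken from the meta-analysis, and the pooling step itself is not shown.

```python
import math


def odds_ratio_ci(exposed_cases, unexposed_cases, exposed_controls,
                  unexposed_controls, z=1.96):
    """Odds ratio with an approximate 95% CI computed on the log scale."""
    a, b = exposed_cases, unexposed_cases
    c, d = exposed_controls, unexposed_controls
    odds_ratio = (a * d) / (b * c)
    # Standard error of log(OR) from the four cell counts.
    se_log_or = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(odds_ratio) - z * se_log_or)
    upper = math.exp(math.log(odds_ratio) + z * se_log_or)
    return odds_ratio, lower, upper


# Hypothetical single-study counts (carriers vs. non-carriers of a variant).
or_value, ci_low, ci_high = odds_ratio_ci(40, 60, 50, 50)
```

An interval that includes 1, as it does for these made-up counts, means the single study does not show a significant association on its own; pooling several studies narrows the interval.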
Four cases of Taenia saginata infection with an analysis of COX1 gene.
Cho, Jaeeun; Jung, Bong-Kwang; Lim, Hyemi; Kim, Min-Jae; Yooyen, Thanapon; Lee, Dongmin; Eom, Keeseon S; Shin, Eun-Hee; Chai, Jong-Yil
2014-02-01
Human taeniases had been not uncommon in the Republic of Korea (=Korea) until the 1980s. The prevalence decreased and a national survey in 2004 revealed no Taenia egg positive cases. However, a subsequent national survey in 2012 showed 0.04% (10 cases) prevalence of Taenia spp. eggs suggesting its resurgence in Korea. We recently encountered 4 cases of Taenia saginata infection who had symptoms of taeniasis that included discharge of proglottids. We obtained several proglottids from each case. Because the morphological features of T. saginata are almost indistinguishable from those of Taenia asiatica, molecular analyses using the PCR-RFLP and DNA sequencing of the cytochrome c oxidase subunit 1 (cox1) were performed to identify the species. The PCR-RFLP patterns of all of the 4 specimens were consistent with T. saginata, and the cox1 gene sequence showed 99.8-100% identity with that of T. saginata reported previously from Korea, Japan, China, and Cambodia. All of the 4 patients had the history of travel abroad but its relation with contracting taeniasis was unclear. Our findings may suggest resurgence of T. saginata infection among people in Korea. PMID:24623887
Bar-Yaacov, Dan; Bouskila, Amos; Mishmar, Dan
2013-01-01
Recently, we found dramatic mitochondrial DNA divergence of Israeli Chamaeleo chamaeleon populations into two geographically distinct groups. We aimed to examine whether the same pattern of divergence could be found in nuclear genes. However, no genomic resource is available for any chameleon species. Here we present the first chameleon transcriptome, obtained using deep sequencing (SOLiD). Our analysis identified 164,000 sequence contigs of which 19,000 yielded unique BlastX hits. To test the efficacy of our sequencing effort, we examined whether the chameleon and other available reptilian transcriptomes harbored complete sets of genes comprising known biochemical pathways, focusing on the nDNA-encoded oxidative phosphorylation (OXPHOS) genes as a model. As a reference for the screen, we used the human 86 (including isoforms) known structural nDNA-encoded OXPHOS subunits. Analysis of 34 publicly available vertebrate transcriptomes revealed orthologs for most human OXPHOS genes. However, OXPHOS subunit COX8 (Cytochrome C oxidase subunit 8), including all its known isoforms, was consistently absent in transcriptomes of iguanian lizards, implying loss of this subunit during the radiation of this suborder. The lack of COX8 in the suborder Iguania is intriguing, since it is important for cellular respiration and ATP production. Our sequencing effort added a new resource for comparative genomic studies, and shed new light on the evolutionary dynamics of the OXPHOS system. PMID:24009133
FRATS: Functional Regression Analysis of DTI Tract Statistics
Zhu, Hongtu; Styner, Martin; Tang, Niansheng; Liu, Zhexing; Lin, Weili; Gilmore, John H.
2010-01-01
Diffusion tensor imaging (DTI) provides important information on the structure of white matter fiber bundles as well as detailed tissue properties along these fiber bundles in vivo. This paper presents a functional regression framework, called FRATS, for the analysis of multiple diffusion properties along fiber bundle as functions in an infinite dimensional space and their association with a set of covariates of interest, such as age, diagnostic status and gender, in real applications. The functional regression framework consists of four integrated components: the local polynomial kernel method for smoothing multiple diffusion properties along individual fiber bundles, a functional linear model for characterizing the association between fiber bundle diffusion properties and a set of covariates, a global test statistic for testing hypotheses of interest, and a resampling method for approximating the p-value of the global test statistic. The proposed methodology is applied to characterizing the development of five diffusion properties including fractional anisotropy, mean diffusivity, and the three eigenvalues of diffusion tensor along the splenium of the corpus callosum tract and the right internal capsule tract in a clinical study of neurodevelopment. Significant age and gestational age effects on the five diffusion properties were found in both tracts. The resulting analysis pipeline can be used for understanding normal brain development, the neural bases of neuropsychiatric disorders, and the joint effects of environmental and genetic factors on white matter fiber bundles. PMID:20335089
Regression analysis exploring teacher impact on student FCI post scores
NASA Astrophysics Data System (ADS)
Mahadeo, Jonathan V.; Manthey, Seth R.; Brewe, Eric
2013-01-01
High School Modeling Workshops are designed to improve high school physics teachers' understanding of physics and how to teach using the Modeling method. The basic assumption is that the teacher plays a critical role in their students' physics education. This study investigated teacher impacts on students' Force Concept Inventory (FCI) scores, with the hope of identifying quantitative differences between teachers. This study examined student FCI scores from 18 teachers with at least a year of teaching high school physics. This data was then evaluated using a General Linear Model (GLM), which allowed for a regression equation to be fitted to the data. This regression equation was used to predict student post FCI scores, based on: teacher ID, student pre FCI score, gender, and representation. The results show that 12 out of 18 teachers significantly impact their students' post FCI scores. The GLM further revealed that of the 12 teachers only five have a positive impact on student post FCI scores. Given these differences among teachers it is our intention to extend our analysis to investigate pedagogical differences between them.
Nonparametric survival analysis using Bayesian Additive Regression Trees (BART).
Sparapani, Rodney A; Logan, Brent R; McCulloch, Robert E; Laud, Purushottam W
2016-07-20
Bayesian additive regression trees (BART) provide a framework for flexible nonparametric modeling of relationships of covariates to outcomes. Recently, BART models have been shown to provide excellent predictive performance, for both continuous and binary outcomes, and exceeding that of its competitors. Software is also readily available for such outcomes. In this article, we introduce modeling that extends the usefulness of BART in medical applications by addressing needs arising in survival analysis. Simulation studies of one-sample and two-sample scenarios, in comparison with long-standing traditional methods, establish face validity of the new approach. We then demonstrate the model's ability to accommodate data from complex regression models with a simulation study of a nonproportional hazards scenario with crossing survival functions and survival function estimation in a scenario where hazards are multiplicatively modified by a highly nonlinear function of the covariates. Using data from a recently published study of patients undergoing hematopoietic stem cell transplantation, we illustrate the use and some advantages of the proposed method in medical investigations. Copyright © 2016 John Wiley & Sons, Ltd. PMID:26854022
Estimation of crown closure from AVIRIS data using regression analysis
NASA Technical Reports Server (NTRS)
Staenz, K.; Williams, D. J.; Truchon, M.; Fritz, R.
1993-01-01
Crown closure is one of the input parameters used for forest growth and yield modelling. Preliminary work by Staenz et al. indicates that imaging spectrometer data acquired with sensors such as the Airborne Visible/Infrared Imaging Spectrometer (AVIRIS) have some potential for estimating crown closure on a stand level. The objectives of this paper are: (1) to establish a relationship between AVIRIS data and the crown closure derived from aerial photography of a forested test site within the Interior Douglas Fir biogeoclimatic zone in British Columbia, Canada; (2) to investigate the impact of atmospheric effects and the forest background on the correlation between AVIRIS data and crown closure estimates; and (3) to improve this relationship using multiple regression analysis.
A Visual Analytics Approach for Correlation, Classification, and Regression Analysis
Steed, Chad A; SwanII, J. Edward; Fitzpatrick, Patrick J.; Jankun-Kelly, T.J.
2012-02-01
New approaches that combine the strengths of humans and machines are necessary to equip analysts with the proper tools for exploring today's increasingly complex, multivariate data sets. In this paper, a novel visual data mining framework, called the Multidimensional Data eXplorer (MDX), is described that addresses the challenges of today's data by combining automated statistical analytics with a highly interactive parallel coordinates based canvas. In addition to several intuitive interaction capabilities, this framework offers a rich set of graphical statistical indicators, interactive regression analysis, visual correlation mining, automated axis arrangements and filtering, and data classification techniques. The current work provides a detailed description of the system as well as a discussion of key design aspects and critical feedback from domain experts.
A Visual Analytics Approach for Correlation, Classification, and Regression Analysis
Steed, Chad A; SwanII, J. Edward; Fitzpatrick, Patrick J.; Jankun-Kelly, T.J.
2013-01-01
New approaches that combine the strengths of humans and machines are necessary to equip analysts with the proper tools for exploring today's increasingly complex, multivariate data sets. In this paper, a visual data mining framework, called the Multidimensional Data eXplorer (MDX), is described that addresses the challenges of today's data by combining automated statistical analytics with a highly interactive parallel coordinates based canvas. In addition to several intuitive interaction capabilities, this framework offers a rich set of graphical statistical indicators, interactive regression analysis, visual correlation mining, automated axis arrangements and filtering, and data classification techniques. This chapter provides a detailed description of the system as well as a discussion of key design aspects and critical feedback from domain experts.
Moderated regression analysis and Likert scales: too coarse for comfort.
Russell, C J; Bobko, P
1992-06-01
One of the most commonly accepted models of relationships among three variables in applied industrial and organizational psychology is the simple moderator effect. However, many authors have expressed concern over the general lack of empirical support for interaction effects reported in the literature. We demonstrate in the current sample that use of a continuous, dependent-response scale instead of a discrete, Likert-type scale, causes moderated regression analysis effect sizes to increase an average of 93%. We suggest that use of relatively coarse Likert scales to measure fine dependent responses causes information loss that, although varying widely across subjects, greatly reduces the probability of detecting true interaction effects. Specific recommendations for alternate research strategies are made. PMID:1601825
ADVANTAGES OF USING REGRESSION ANALYSIS TO CALCULATE RESULTS OF CHRONIC TOXICITY TESTS
Although it is traditional to calculate results of chronic toxicity tests using hypothesis testing to detect statistically significant differences from the control, calculation of results using regression analysis offers several major advantages. Regression analysis can directly ...
Spatial regression analysis of traffic crashes in Seoul.
Rhee, Kyoung-Ah; Kim, Joon-Ki; Lee, Young-ihn; Ulfarsson, Gudmundur F
2016-06-01
Traffic crashes can be spatially correlated events and the analysis of the distribution of traffic crash frequency requires evaluation of parameters that reflect spatial properties and correlation. Typically this spatial aspect of crash data is not used in everyday practice by planning agencies and this contributes to a gap between research and practice. A database of traffic crashes in Seoul, Korea, in 2010 was developed at the traffic analysis zone (TAZ) level with a number of GIS developed spatial variables. Practical spatial models using available software were estimated. The spatial error model was determined to be better than the spatial lag model and an ordinary least squares baseline regression. A geographically weighted regression model provided useful insights about localization of effects. The results found that an increased length of roads with speed limit below 30 km/h and a higher ratio of residents below age of 15 were correlated with lower traffic crash frequency, while a higher ratio of residents who moved to the TAZ, more vehicle-kilometers traveled, and a greater number of access points with speed limit difference between side roads and mainline above 30 km/h all increased the number of traffic crashes. This suggests, for example, that better control or design for merging lower speed roads with higher speed roads is important. A key result is that the length of bus-only center lanes had the largest effect on increasing traffic crashes. This is important as bus-only center lanes with bus stop islands have been increasingly used to improve transit times. Hence the potential negative safety impacts of such systems need to be studied further and mitigated through improved design of pedestrian access to center bus stop islands. PMID:26994374
Standardized Regression Coefficients as Indices of Effect Sizes in Meta-Analysis
ERIC Educational Resources Information Center
Kim, Rae Seon
2011-01-01
When conducting a meta-analysis, it is common to find many collected studies that report regression analyses, because multiple regression analysis is widely used in many fields. Meta-analysis uses effect sizes drawn from individual studies as a means of synthesizing a collection of results. However, indices of effect size from regression analyses…
Mixed-effects Poisson regression analysis of adverse event reports
Gibbons, Robert D.; Segawa, Eisuke; Karabatsos, George; Amatya, Anup K.; Bhaumik, Dulal K.; Brown, C. Hendricks; Kapur, Kush; Marcus, Sue M.; Hur, Kwan; Mann, J. John
2008-01-01
SUMMARY A new statistical methodology is developed for the analysis of spontaneous adverse event (AE) reports from post-marketing drug surveillance data. The method involves both empirical Bayes (EB) and fully Bayes estimation of rate multipliers for each drug within a class of drugs, for a particular AE, based on a mixed-effects Poisson regression model. Both parametric and semiparametric models for the random-effect distribution are examined. The method is applied to data from Food and Drug Administration (FDA)’s Adverse Event Reporting System (AERS) on the relationship between antidepressants and suicide. We obtain point estimates and 95 per cent confidence (posterior) intervals for the rate multiplier for each drug (e.g. antidepressants), which can be used to determine whether a particular drug has an increased risk of association with a particular AE (e.g. suicide). Confidence (posterior) intervals that do not include 1.0 provide evidence for either significant protective or harmful associations of the drug and the adverse effect. We also examine EB, parametric Bayes, and semiparametric Bayes estimators of the rate multipliers and associated confidence (posterior) intervals. Results of our analysis of the FDA AERS data revealed that newer antidepressants are associated with lower rates of suicide adverse event reports compared with older antidepressants. We recommend improvements to the existing AERS system, which are likely to improve its public health value as an early warning system. PMID:18404622
Integrated analysis of incidence, progression, regression and disappearance probabilities
Huang, Guan-Hua
2008-01-01
Background Age-related maculopathy (ARM) is a leading cause of vision loss in people aged 65 or older. ARM is distinctive in that it is a disease which can transition through incidence, progression, regression and disappearance. The purpose of this study is to develop methodologies for studying the relationship of risk factors with different transition probabilities. Methods Our framework for studying this relationship includes two different analytical approaches. In the first approach, one can define, model and estimate the relationship between each transition probability and risk factors separately. This approach is similar to constraining a population to a certain disease status at the baseline, and then analyzing the probability of the constrained population to develop a different status. While this approach is intuitive, one risks losing available information while at the same time running into the problem of insufficient sample size. The second approach specifies a transition model for analyzing such a disease. This model provides the conditional probability of a current disease status based upon a previous status, and can therefore jointly analyze all transition probabilities. Throughout the paper, an analysis to determine the birth cohort effect on ARM is used as an illustration. Results and conclusion This study has found parallel separate and joint analyses to be more enlightening than any analysis in isolation. By implementing both approaches, one can obtain more reliable and more efficient results. PMID:18577235
Risk factors for temporomandibular disorder: Binary logistic regression analysis
Magalhães, Bruno G.; de-Sousa, Stéphanie T.; de Mello, Victor V C.; da-Silva-Barbosa, André C.; de-Assis-Morais, Mariana P L.; Barbosa-Vasconcelos, Márcia M V.
2014-01-01
Objectives: To analyze the influence of socioeconomic and demographic factors (gender, economic class, age and marital status) on the occurrence of temporomandibular disorder. Study Design: One hundred individuals from urban areas in the city of Recife (Brazil) registered at Family Health Units were examined using Axis I of the Research Diagnostic Criteria for Temporomandibular Disorders (RDC/TMD), which addresses myofascial pain and joint problems (disc displacement, arthralgia, osteoarthritis and osteoarthrosis). The Brazilian Economic Classification Criteria (CCEB) was used for the collection of socioeconomic and demographic data. Economic class was then categorized as Class A (high social class), Classes B/C (middle class) and Classes D/E (very poor social class). The results were analyzed using Pearson’s chi-square test for proportions, Fisher’s exact test, the nonparametric Mann-Whitney test and binary logistic regression analysis. Results: None of the participants belonged to Class A, 72% belonged to Classes B/C and 28% belonged to Classes D/E. The multivariate analysis revealed that participants from Classes D/E had a 4.35-fold greater chance of exhibiting myofascial pain and an 11.3-fold greater chance of exhibiting joint problems. Conclusions: Poverty is an important condition for exhibiting myofascial pain and joint problems. Key words: Temporomandibular joint disorders, risk factors, prevalence. PMID:24316706
Analysis of retirement income adequacy using quantile regression: A case study in Malaysia
NASA Astrophysics Data System (ADS)
Alaudin, Ros Idayuwati; Ismail, Noriszura; Isa, Zaidi
2015-09-01
Quantile regression is a statistical analysis that does not restrict attention to the conditional mean and therefore permits the approximation of the whole conditional distribution of a response variable. Quantile regression is more robust to outliers than mean regression models. In this paper, we demonstrate how the quantile regression approach can be used to analyze the ratio of projected wealth to needs (wealth-needs ratio) during retirement.
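The robustness claim in the abstract above can be illustrated with a minimal sketch. It assumes the simplest possible case, an intercept-only quantile fit, where minimizing the check (pinball) loss recovers a sample quantile. The function names and the wealth-needs ratios are hypothetical and are not taken from the study.

```python
def pinball_loss(residual, tau):
    """Check (pinball) loss of one residual at quantile level tau."""
    return tau * residual if residual >= 0 else (tau - 1) * residual


def best_constant(values, tau):
    """Observed value that minimizes total pinball loss (intercept-only fit)."""
    return min(values, key=lambda c: sum(pinball_loss(v - c, tau) for v in values))


# Made-up wealth-needs ratios with one extreme household (9.0).
ratios = [0.2, 0.4, 0.5, 0.7, 0.9, 1.1, 1.6, 2.5, 9.0]

median_fit = best_constant(ratios, 0.5)   # tau = 0.5 gives the median
mean_fit = sum(ratios) / len(ratios)      # least-squares (mean) fit

print(median_fit, round(mean_fit, 3))
```

In this toy data the single extreme value pulls the mean well above the median fit, which is the sense in which a quantile fit is less sensitive to outliers. A fit with covariates would normally be obtained from a statistics package rather than by enumeration.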
Janssen, I.; Stebbings, J.H.
1990-01-01
In environmental epidemiology, trace and toxic substance concentrations frequently have very highly skewed distributions ranging over one or more orders of magnitude, and prediction by conventional regression is often poor. Classification and Regression Tree Analysis (CART) is an alternative in such contexts. To compare the techniques, two Pennsylvania data sets and three independent variables are used: house radon progeny (RnD) and gamma levels as predicted by construction characteristics in 1330 houses; and approximately 200 house radon (Rn) measurements as predicted by topographic parameters. CART may identify structural variables of interest not identified by conventional regression, and vice versa, but in general the regression models are similar. CART has major advantages in dealing with other common characteristics of environmental data sets, such as missing values, continuous variables requiring transformations, and large sets of potential independent variables. CART is most useful in the identification and screening of independent variables, greatly reducing the need for cross-tabulations and nested breakdown analyses. There is no need to discard cases with missing values for the independent variables because surrogate variables are intrinsic to CART. The tree-structured approach is also independent of the scale on which the independent variables are measured, so that transformations are unnecessary. CART identifies important interactions as well as main effects. The major advantages of CART appear to be in exploring data. Once the important variables are identified, conventional regressions seem to lead to results similar but more interpretable by most audiences. 12 refs., 8 figs., 10 tabs.
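The statement that a tree-structured split does not depend on the measurement scale of a predictor can be checked with a small sketch. It assumes a single regression split chosen by minimum within-group sum of squared errors, which is only one ingredient of CART. The data and the helper name `best_split` are hypothetical.

```python
import math


def best_split(xs, ys):
    """Indices of the left group for the single split on xs with minimum SSE."""
    order = sorted(range(len(xs)), key=lambda i: xs[i])

    def sse(idx):
        mean = sum(ys[i] for i in idx) / len(idx)
        return sum((ys[i] - mean) ** 2 for i in idx)

    best_cost, best_left = None, None
    for k in range(1, len(order)):
        left, right = order[:k], order[k:]
        cost = sse(left) + sse(right)
        if best_cost is None or cost < best_cost:
            best_cost, best_left = cost, frozenset(left)
    return best_left


# Made-up, highly skewed predictor spanning several orders of magnitude.
xs = [1, 3, 10, 30, 100, 300, 1000, 3000]
ys = [2.0, 2.2, 1.9, 2.1, 5.0, 5.2, 4.9, 5.1]

left_raw = best_split(xs, ys)
left_log = best_split([math.log(x) for x in xs], ys)  # monotone transform

print(sorted(left_raw), sorted(left_log))
```

Because the split depends only on the ordering of the predictor values, the raw and log-transformed predictors yield the same partition of cases in this example.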
The Variance Normalization Method of Ridge Regression Analysis.
ERIC Educational Resources Information Center
Bulcock, J. W.; And Others
The testing of contemporary sociological theory often calls for the application of structural-equation models to data which are inherently collinear. It is shown that simple ridge regression, which is commonly used for controlling the instability of ordinary least squares regression estimates in ill-conditioned data sets, is not a legitimate…
An Effect Size for Regression Predictors in Meta-Analysis
ERIC Educational Resources Information Center
Aloe, Ariel M.; Becker, Betsy Jane
2012-01-01
A new effect size representing the predictive power of an independent variable from a multiple regression model is presented. The index, denoted as r[subscript sp], is the semipartial correlation of the predictor with the outcome of interest. This effect size can be computed when multiple predictor variables are included in the regression model…
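As a rough illustration of the index described above, the sketch below uses the textbook two-predictor formula for a semipartial correlation and checks it against the increase in R-squared when the predictor is added. The abstract does not state how r_sp is obtained from reported regression results, so the formula choice, function names, and correlation values here are assumptions for illustration only.

```python
import math


def semipartial_r(r_y1, r_y2, r_12):
    """Semipartial correlation of predictor 1 with y, removing predictor 2 from predictor 1 only."""
    return (r_y1 - r_y2 * r_12) / math.sqrt(1.0 - r_12 ** 2)


def r_squared_two_predictors(r_y1, r_y2, r_12):
    """R-squared of y regressed on two predictors, from pairwise correlations."""
    return (r_y1 ** 2 + r_y2 ** 2 - 2 * r_y1 * r_y2 * r_12) / (1.0 - r_12 ** 2)


# Hypothetical correlations, chosen only for illustration.
r_y1, r_y2, r_12 = 0.50, 0.40, 0.30

r_sp = semipartial_r(r_y1, r_y2, r_12)
# Gain in R-squared from adding predictor 1 to a model that already has predictor 2.
gain = r_squared_two_predictors(r_y1, r_y2, r_12) - r_y2 ** 2

print(round(r_sp, 4), round(r_sp ** 2, 4), round(gain, 4))
```

The squared semipartial correlation equals the R-squared gain, which is why it can be read as the unique predictive contribution of one variable.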
Dunstan, H. M.; Green-Willms, N. S.; Fox, T. D.
1997-01-01
We have used mutational and revertant analysis to study the elements of the 54-nucleotide COX2 5'-untranslated leader involved in translation initiation in yeast mitochondria and in activation by the COX2 translational activator, Pet111p. We generated a collection of mutants with substitutions spanning the entire COX2 5'-UTL by in vitro mutagenesis followed by mitochondrial transformation and gene replacement. The phenotypes of these mutants delimit a 31-nucleotide segment, from -16 to -46, that contains several short sequence elements necessary for COX2 5'-UTL function in translation. The sequences from -16 to -47 were shown to be partially sufficient to promote translation in a foreign context. Analysis of revertants of both the series of linker-scanning alleles and two short deletion/insertion alleles has refined the positions of several possible functional elements of the COX2 5'-untranslated leader, including a putative RNA stem-loop structure that functionally interacts with Pet111p and an octanucleotide sequence present in all S. cerevisiae mitochondrial mRNA 5'-UTLs that is a potential rRNA binding site. PMID:9286670
A Novel Multiobjective Evolutionary Algorithm Based on Regression Analysis
Song, Zhiming; Wang, Maocai; Dai, Guangming; Vasile, Massimiliano
2015-01-01
As is known, the Pareto set of a continuous multiobjective optimization problem with m objective functions is a piecewise continuous (m − 1)-dimensional manifold in the decision space under some mild conditions. However, how to utilize this regularity to design multiobjective optimization algorithms has become the research focus. In this paper, based on this regularity, a model-based multiobjective evolutionary algorithm with regression analysis (MMEA-RA) is put forward to solve continuous multiobjective optimization problems with variable linkages. In the algorithm, the optimization problem is modelled as a promising area in the decision space by a probability distribution, and the centroid of the probability distribution is an (m − 1)-dimensional piecewise continuous manifold. The least squares method is used to construct such a model. A selection strategy based on nondominated sorting is used to choose the individuals for the next generation. The new algorithm is tested and compared with NSGA-II and RM-MEDA. The result shows that MMEA-RA outperforms RM-MEDA and NSGA-II on the test instances with variable linkages. At the same time, MMEA-RA has higher efficiency than the other two algorithms. A few shortcomings of MMEA-RA have also been identified and discussed in this paper. PMID:25874246
A flexible count data regression model for risk analysis.
Guikema, Seth D; Coffelt, Jeremy P; Goffelt, Jeremy P
2008-02-01
In many cases, risk and reliability analyses involve estimating the probabilities of discrete events such as hardware failures and occurrences of disease or death. There is often additional information in the form of explanatory variables that can be used to help estimate the likelihood of different numbers of events in the future through the use of an appropriate regression model, such as a generalized linear model. However, existing generalized linear models (GLM) are limited in their ability to handle the types of variance structures often encountered in using count data in risk and reliability analysis. In particular, standard models cannot handle both underdispersed data (variance less than the mean) and overdispersed data (variance greater than the mean) in a single coherent modeling framework. This article presents a new GLM based on a reformulation of the Conway-Maxwell Poisson (COM) distribution that is useful for both underdispersed and overdispersed count data and demonstrates this model by applying it to the assessment of electric power system reliability. The results show that the proposed COM GLM can fit the data as well as the commonly used existing models for overdispersed data sets while outperforming these commonly used models for underdispersed data sets. PMID:18304118
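To make the dispersion argument concrete, the following sketch evaluates the standard Conway-Maxwell Poisson distribution, with probabilities proportional to lam**y / (y!)**nu, and compares its variance with its mean. This is the usual textbook parameterization, not the reformulation proposed in the article, and the parameter values and truncation point are assumptions.

```python
import math


def com_poisson_moments(lam, nu, max_count=200):
    """Mean and variance of a COM-Poisson(lam, nu), summing counts 0..max_count."""
    # Work in log space so large factorials do not overflow.
    log_w = [y * math.log(lam) - nu * math.lgamma(y + 1) for y in range(max_count + 1)]
    peak = max(log_w)
    weights = [math.exp(v - peak) for v in log_w]
    total = sum(weights)
    mean = sum(y * w for y, w in enumerate(weights)) / total
    var = sum((y - mean) ** 2 * w for y, w in enumerate(weights)) / total
    return mean, var


# Illustrative parameter values only.
poisson_case = com_poisson_moments(3.0, 1.0)   # nu = 1 reduces to Poisson
over_case = com_poisson_moments(3.0, 0.5)      # nu < 1: variance above the mean
under_case = com_poisson_moments(3.0, 2.0)     # nu > 1: variance below the mean

for label, (m, v) in [("nu=1", poisson_case), ("nu=0.5", over_case), ("nu=2", under_case)]:
    print(label, round(m, 3), round(v, 3))
```

A single extra parameter thus covers both dispersion regimes, which is the property the abstract relies on.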
Elghafghuf, Adel; Dufour, Simon; Reyher, Kristen; Dohoo, Ian; Stryhn, Henrik
2014-12-01
Mastitis is a complex disease affecting dairy cows and is considered to be the most costly disease of dairy herds. The hazard of mastitis is a function of many factors, both managerial and environmental, making its control a difficult issue to milk producers. Observational studies of clinical mastitis (CM) often generate datasets with a number of characteristics which influence the analysis of those data: the outcome of interest may be the time to occurrence of a case of mastitis, predictors may change over time (time-dependent predictors), the effects of factors may change over time (time-dependent effects), there are usually multiple hierarchical levels, and datasets may be very large. Analysis of such data often requires expansion of the data into the counting-process format - leading to larger datasets - thus complicating the analysis and requiring excessive computing time. In this study, a nested frailty Cox model with time-dependent predictors and effects was applied to Canadian Bovine Mastitis Research Network data in which 10,831 lactations of 8035 cows from 69 herds were followed through lactation until the first occurrence of CM. The model was fit to the data as a Poisson model with nested normally distributed random effects at the cow and herd levels. Risk factors associated with the hazard of CM during the lactation were identified, such as parity, calving season, herd somatic cell score, pasture access, fore-stripping, and proportion of treated cases of CM in a herd. The analysis showed that most of the predictors had a strong effect early in lactation and also demonstrated substantial variation in the baseline hazard among cows and between herds. A small simulation study for a setting similar to the real data was conducted to evaluate the Poisson maximum likelihood estimation approach with both Gaussian quadrature method and Laplace approximation. Further, the performance of the two methods was compared with the performance of a widely used estimation
Kammarnjesadakul, Patcharee; Palaga, Tanapat; Sritunyalucksana, Kallaya; Mendoza, Leonel; Krajaejun, Theerapong; Vanittanakom, Nongnuch; Tongchusak, Songsak; Denduangboripant, Jessada; Chindamporn, Ariya
2011-04-01
To investigate the phylogenetic relationship among Pythium insidiosum isolates in Thailand, we investigated the genomic DNA of 31 P. insidiosum strains isolated from humans and environmental sources from Thailand, and two from North and Central America. We used PCR to amplify the partial COX II DNA coding sequences and the ITS regions of these isolates. The nucleotide sequences of both amplicons were analyzed by the Bioedit program. Phylogenetic analysis using genetic distance method with Neighbor Joining (NJ) approach was performed using the MEGA4 software. Additional sequences of three other Pythium species, Phytophthora sojae and Lagenidium giganteum were employed as outgroups. The sizes of the COX II amplicons varied from 558-564 bp, whereas the ITS products varied from approximately 871-898 bp. Corrected sequence divergences with Kimura 2-parameter model calculated for the COX II and the ITS DNA sequences ranged between 0.0000-0.0608 and 0.0000-0.2832, respectively. Phylogenetic analysis using both the COX II and the ITS DNA sequences showed similar trees, where we found three sister groups (A(TH), B(TH), and C(TH)) among P. insidiosum strains. All Thai isolates from clinical cases and environmental sources were placed in two separated sister groups (B(TH) and C(TH)), whereas the Americas isolates were grouped into A(TH). Although the phylogenetic tree based on both regions showed similar distribution, the COX II phylogenetic tree showed higher resolution than the one using the ITS sequences. Our study indicates that COX II gene is the better of the two alternatives to study the phylogenetic relationships among P. insidiosum strains. PMID:20818919
Regression analysis of technical parameters affecting nuclear power plant performances
Ghazy, R.; Ricotti, M. E.; Trueco, P.
2012-07-01
Since the 1980s many studies have been conducted in order to explicate good and bad performances of commercial nuclear power plants (NPPs), but as yet no defined correlation has been found to be totally representative of plant operational experience. In early works, data availability and the number of operating power stations were both limited; therefore, results showed that specific technical characteristics of NPPs were supposed to be the main causal factors for successful plant operation. Although these aspects continue to play a significant role, later studies and observations showed that other factors concerning management and organization of the plant could instead be predominant when comparing utilities' operational and economic results. Utility quality, in a word, can be used to summarize all the managerial and operational aspects that seem to be effective in determining plant performance. In this paper operational data of a consistent sample of commercial nuclear power stations, out of the total 433 operating NPPs, are analyzed, mainly focusing on the last decade of operational experience. The sample consists of PWR and BWR technology, operated by utilities located in different countries, including the U.S., Japan, France, Germany, and Finland. Multivariate regression is performed using Unit Capability Factor (UCF) as the dependent variable; this factor reflects the effectiveness of plant programs and practices in maximizing the available electrical generation and consequently provides an overall indication of how well plants are operated and maintained. Aspects that may not be real causal factors but which can have a consistent impact on the UCF, such as technology design, supplier, size and age, are included in the analysis as independent variables. (authors)
Quantile regression provides a fuller analysis of speed data.
Hewson, Paul
2008-03-01
Considerable interest already exists in terms of assessing percentiles of speed distributions, for example monitoring the 85th percentile speed is a common feature of the investigation of many road safety interventions. However, unlike the mean, where t-tests and ANOVA can be used to provide evidence of a statistically significant change, inference on these percentiles is much less common. This paper examines the potential role of quantile regression for modelling the 85th percentile, or any other quantile. Given that crash risk may increase disproportionately with increasing relative speed, it may be argued these quantiles are of more interest than the conditional mean. In common with the more usual linear regression, quantile regression admits a simple test as to whether the 85th percentile speed has changed following an intervention in an analogous way to using the t-test to determine if the mean speed has changed by considering the significance of parameters fitted to a design matrix. Having briefly outlined the technique and briefly examined an application with a widely published dataset concerning speed measurements taken around the introduction of signs in Cambridgeshire, this paper will demonstrate the potential for quantile regression modelling by examining recent data from Northamptonshire collected in conjunction with a "community speed watch" programme. Freely available software is used to fit these models and it is hoped that the potential benefits of using quantile regression methods when examining and analysing speed data are demonstrated. PMID:18329400
Prognostic models in coronary artery disease: Cox and network approaches
Mora, Antonio; Sicari, Rosa; Cortigiani, Lauro; Carpeggiani, Clara; Picano, Eugenio; Capobianco, Enrico
2015-01-01
Predictive assessment of the risk of developing cardiovascular diseases is usually provided by computational approaches centred on Cox models. The complex interdependence structure underlying clinical data patterns can limit the performance of Cox analysis and complicate the interpretation of results, thus calling for complementary and integrative methods. Prognostic models are proposed for studying the risk associated with patients with known or suspected coronary artery disease (CAD) undergoing vasodilator stress echocardiography, an established technique for CAD detection and prognostication. In order to complement standard Cox models, network inference is considered a possible solution to quantify the complex relationships between heterogeneous data categories. In particular, a mutual information network is designed to explore the paths linking patient-associated variables to endpoint events, to reveal prognostic factors and to identify the best possible predictors of death. Data from a prospective, multicentre, observational study are available from a previous study, based on 4313 patients (2532 men; 64±11 years) with known (n=1547) or suspected (n=2766) CAD, who underwent high-dose dipyridamole (0.84 mg kg−1 over 6 min) stress echocardiography with coronary flow reserve (CFR) evaluation of left anterior descending (LAD) artery by Doppler. The overall mortality was the only endpoint analysed by Cox models. The estimated connectivity between clinical variables assigns a complementary value to the proposed network approach in relation to the established Cox model, for instance revealing connectivity paths. Depending on the use of multiple metrics, the constraints of regression analysis in measuring the association strength among clinical variables can be relaxed, and identification of communities and prognostic paths can be provided. On the basis of evidence from various model comparisons, we show in this CAD study that there may be characteristic
Prognostic models in coronary artery disease: Cox and network approaches.
Mora, Antonio; Sicari, Rosa; Cortigiani, Lauro; Carpeggiani, Clara; Picano, Eugenio; Capobianco, Enrico
2015-02-01
Predictive assessment of the risk of developing cardiovascular diseases is usually provided by computational approaches centred on Cox models. The complex interdependence structure underlying clinical data patterns can limit the performance of Cox analysis and complicate the interpretation of results, thus calling for complementary and integrative methods. Prognostic models are proposed for studying the risk associated with patients with known or suspected coronary artery disease (CAD) undergoing vasodilator stress echocardiography, an established technique for CAD detection and prognostication. In order to complement standard Cox models, network inference is considered a possible solution to quantify the complex relationships between heterogeneous data categories. In particular, a mutual information network is designed to explore the paths linking patient-associated variables to endpoint events, to reveal prognostic factors and to identify the best possible predictors of death. Data from a prospective, multicentre, observational study are available from a previous study, based on 4313 patients (2532 men; 64±11 years) with known (n=1547) or suspected (n=2766) CAD, who underwent high-dose dipyridamole (0.84 mg kg(-1) over 6 min) stress echocardiography with coronary flow reserve (CFR) evaluation of left anterior descending (LAD) artery by Doppler. The overall mortality was the only endpoint analysed by Cox models. The estimated connectivity between clinical variables assigns a complementary value to the proposed network approach in relation to the established Cox model, for instance revealing connectivity paths. Depending on the use of multiple metrics, the constraints of regression analysis in measuring the association strength among clinical variables can be relaxed, and identification of communities and prognostic paths can be provided. On the basis of evidence from various model comparisons, we show in this CAD study that there may be characteristic
Seyedmajidi, Maryam; Shafaee, Shahryar; Siadati, Sepideh; Moghaddam, Elham Alizadeh; Ghasemi, Nafiseh; Bijani, Ali; Najafi, Mostafa
2015-01-01
Background: Cyclo-oxygenase-2 (COX-2) is an early response gene that is induced by growth factors, oncogenes and carcinogens and its expression is increased in various tumors. Increased expression of COX-2 plays a significant role in the development and growth of tumors by interfering in biological processes such as cell division, cellular immunity, cell adhesion, apoptosis, and angiogenesis. This study aimed to investigate the immunohistochemical expression of COX-2 in keratocystic odontogenic tumor (KOT) in comparison with ameloblastoma and dentigerous cyst with regards to different clinical behavior and histopathological features of these lesions. Materials and Methods: Paraffined blocks of 45 cases including 15 cases of dentigerous cyst, 15 cases of KOT and 15 cases of ameloblastoma were stained with immunohistochemical method for COX-2. Five high-power fields of each sample were evaluated to determine the percentage of stained cells and the intensity of staining. Degree of immunoreactivity was obtained from the sum of two. Statistical evaluation was performed by the Kruskal-Wallis and ANOVA Mann-Whitney test (P < 0.05). Results: Overexpression of COX-2 in ameloblastoma and KOT was observed compared with dentigerous cyst (P < 0.001). However, no significant difference was observed between the expression of COX-2 in ameloblastoma and KOT (P = 0.148). Conclusion: The COX-2 expression in odontogenic tumors such as ameloblastoma and cystic neoplasm with aggressive behavior such as KOT increases. However, it does not seem that COX-2 affects the development and growth of cysts with noninvasive behavior like dentigerous cyst. PMID:26005470
Technology Transfer Automated Retrieval System (TEKTRAN)
Selective principal component regression analysis (SPCR) uses a subset of the original image bands for principal component transformation and regression. For optimal band selection before the transformation, this paper used genetic algorithms (GA). In this case, the GA process used the regression co...
Exact Analysis of Squared Cross-Validity Coefficient in Predictive Regression Models
ERIC Educational Resources Information Center
Shieh, Gwowen
2009-01-01
In regression analysis, the notion of population validity is of theoretical interest for describing the usefulness of the underlying regression model, whereas the presumably more important concept of population cross-validity represents the predictive effectiveness for the regression equation in future research. It appears that the inference…
NASA Technical Reports Server (NTRS)
Parsons, Vickie S.
2009-01-01
The request to conduct an independent review of regression models, developed for determining the expected Launch Commit Criteria (LCC) External Tank (ET)-04 cycle count for the Space Shuttle ET tanking process, was submitted to the NASA Engineering and Safety Center (NESC) on September 20, 2005. The NESC team performed an independent review of regression models documented in Prepress Regression Analysis, Tom Clark and Angela Krenn, 10/27/05. This consultation consisted of a peer review by statistical experts of the proposed regression models provided in the Prepress Regression Analysis. This document is the consultation's final report.
Grades, Gender, and Encouragement: A Regression Discontinuity Analysis
ERIC Educational Resources Information Center
Owen, Ann L.
2010-01-01
The author employs a regression discontinuity design to provide direct evidence on the effects of grades earned in economics principles classes on the decision to major in economics and finds a differential effect for male and female students. Specifically, for female students, receiving an A for a final grade in the first economics class is…
Analysis and Interpretation of Findings Using Multiple Regression Techniques
ERIC Educational Resources Information Center
Hoyt, William T.; Leierer, Stephen; Millington, Michael J.
2006-01-01
Multiple regression and correlation (MRC) methods form a flexible family of statistical techniques that can address a wide variety of different types of research questions of interest to rehabilitation professionals. In this article, we review basic concepts and terms, with an emphasis on interpretation of findings relevant to research questions…
Teaching Quantitative Literacy through a Regression Analysis of Exam Performance
ERIC Educational Resources Information Center
Lindner, Andrew M.
2012-01-01
Quantitative literacy is increasingly essential for both informed citizenship and a variety of careers. Though regression is one of the most common methods in quantitative sociology, it is rarely taught until late in students' college careers. In this article, the author describes a classroom-based activity introducing students to regression…
Growth in Mathematics Achievement: Analysis with Classification and Regression Trees
ERIC Educational Resources Information Center
Ma, Xin
2005-01-01
A recently developed statistical technique, often referred to as classification and regression trees (CART), holds great potential for researchers to discover how student-level (and school-level) characteristics interactively affect growth in mathematics achievement. CART is a host of advanced statistical methods that statistically cluster…
HIGH RESOLUTION FOURIER ANALYSIS WITH AUTO-REGRESSIVE LINEAR PREDICTION
Barton, J.; Shirley, D.A.
1984-04-01
Auto-regressive linear prediction is adapted to double the resolution of Angle-Resolved Photoemission Extended Fine Structure (ARPEFS) Fourier transforms. Even with the optimal taper (weighting function), the commonly used taper-and-transform Fourier method has limited resolution: it assumes the signal is zero beyond the limits of the measurement. By seeking the Fourier spectrum of an infinite extent oscillation consistent with the measurements but otherwise having maximum entropy, the errors caused by finite data range can be reduced. Our procedure developed to implement this concept adapts auto-regressive linear prediction to extrapolate the signal in an effective and controllable manner. Difficulties encountered when processing actual ARPEFS data are discussed. A key feature of this approach is the ability to convert improved measurements (signal-to-noise or point density) into improved Fourier resolution.
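The extrapolation idea in the abstract above can be illustrated with a minimal sketch. It assumes a plain least-squares fit of the auto-regressive coefficients (not the maximum-entropy procedure the authors describe) and a synthetic two-cosine signal; the function names `fit_ar` and `extrapolate`, the model order, and the frequencies are all hypothetical choices made for illustration.

```python
import numpy as np

def fit_ar(x, order):
    """Least-squares fit of x[n] ~ sum_k coef[k] * x[n-1-k] (illustrative, not Burg's method)."""
    rows = np.array([x[n - order:n][::-1] for n in range(order, len(x))])
    target = x[order:]
    coef, *_ = np.linalg.lstsq(rows, target, rcond=None)
    return coef

def extrapolate(x, coef, n_extra):
    """Extend the measured signal by n_extra samples using the fitted recurrence."""
    order = len(coef)
    out = list(x)
    for _ in range(n_extra):
        recent = out[-1:-order - 1:-1]  # the last `order` samples, most recent first
        out.append(float(np.dot(coef, recent)))
    return np.array(out)

# Synthetic "measurement": two cosines observed over a finite range (assumed data).
n = np.arange(40)
signal = np.cos(0.3 * n) + 0.5 * np.cos(0.7 * n)

coef = fit_ar(signal, order=4)          # two real sinusoids obey an order-4 recurrence
extended = extrapolate(signal, coef, 40)

m = np.arange(80)
truth = np.cos(0.3 * m) + 0.5 * np.cos(0.7 * m)
max_error = float(np.max(np.abs(extended - truth)))
```

In this noise-free case the extended record follows the underlying oscillation beyond the measured range, so a Fourier transform of `extended` would see twice the data length; with real, noisy data the model order and fitting method would need far more care.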
Using Robust Standard Errors to Combine Multiple Regression Estimates with Meta-Analysis
ERIC Educational Resources Information Center
Williams, Ryan T.
2012-01-01
Combining multiple regression estimates with meta-analysis has continued to be a difficult task. A variety of methods have been proposed and used to combine multiple regression slope estimates with meta-analysis, however, most of these methods have serious methodological and practical limitations. The purpose of this study was to explore the use…
Yang, Man; Wang, Hong-Tao; Zhao, Miao; Meng, Wen-Bo; Ou, Jin-Qing; He, Jun-Hui; Zou, Bing; Lei, Ping-Guang
2015-10-01
Currently 2 different classes of cyclooxygenase (COX)-2 inhibitors, coxibs and relatively selective COX-2 inhibitors, are available for patients requiring nonsteroidal anti-inflammatory drug (NSAID) therapy; their gastroprotective effects have rarely been directly compared. The aim of this study was to compare the gastroprotective effect of relatively selective COX-2 inhibitors with coxibs. MEDLINE, EMBASE, and the Cochrane Library (from their inception to March 2015) were searched for potential eligible studies. We included randomized controlled trials comparing coxibs (celecoxib, etoricoxib, parecoxib, and lumiracoxib), relatively selective COX-2 inhibitors (nabumetone, meloxicam, and etodolac), and nonselective NSAIDs with a study duration ≥ 4 weeks. Comparative effectiveness and safety data were pooled by Bayesian network meta-analysis. The primary outcomes were ulcer complications and symptomatic ulcer. Summary effect-size was calculated as risk ratio (RR), together with the 95% confidence interval (CI). This study included 36 trials with a total of 112,351 participants. Network meta-analyses indicated no significant difference between relatively selective COX-2 inhibitors and coxibs regarding ulcer complications (RR, 1.38; 95% CI, 0.47-3.27), symptomatic ulcer (RR, 1.02; 95% CI, 0.09-3.92), and endoscopic ulcer (RR, 1.18; 95% CI, 0.37-2.96). Network meta-analyses adjusting potential influential factors (age, sex, previous ulcer disease, and follow-up time), and sensitivity analyses did not reveal any major change to the main results. Network meta-analyses suggested that relatively selective COX-2 inhibitors and coxibs were associated with comparable incidences of total adverse events (AEs) (RR, 1.09; 95% CI, 0.93-1.31), gastrointestinal AEs (RR, 1.04; 95% CI, 0.87-1.25), total withdrawals (RR, 1.00; 95% CI, 0.74-1.33), and gastrointestinal AE-related withdrawals (RR, 1.02; 95% CI, 0.57-1.74). Relatively selective COX-2 inhibitors appear to be associated with
Analysis of apoptosis during hair follicle regression (catagen)
Lindner, G.; Botchkarev, V. A.; Botchkareva, N. V.; Ling, G.; van der Veen, C.; Paus, R.
1997-01-01
Keratinocyte apoptosis is a central element in the regulation of hair follicle regression (catagen), yet the exact location and the control of follicular keratinocyte apoptosis remain obscure. To generate an "apoptomap" of the hair follicle, we have studied selected apoptosis-associated parameters in the C57BL/6 mouse model for hair research during normal and pharmacologically manipulated, pathological catagen development. As assessed by terminal deoxynucleotide transferase dUTP fluorescein nick end-labeling (TUNEL) stain, apoptotic cells not only appeared in the regressing proximal follicle epithelium but, surprisingly, were also seen in the central inner root sheath, in the bulge/isthmus region, and in the secondary germ, but never in the dermal papilla. These apoptosis hot spots during catagen development correlated largely with a down-regulation of the Bcl-2/Bax ratio but only poorly with the expression patterns of interleukin-1beta converting enzyme, p55TNFR, and Fas/Apo-1 immunoreactivity. Instead, a higher correlation was found with p75NTR expression. During cyclophosphamide-induced follicle dystrophy and alopecia, massive keratinocyte apoptosis occurred in the entire proximal hair bulb, except in the dermal papilla, despite a strong up-regulation of Bax and p75NTR immunoreactivity. Selected receptors of the tumor necrosis factor/nerve growth factor family and members of the Bcl-2 family may also play a key role in the control of follicular keratinocyte apoptosis in situ. PMID:9403711
Striker, Lora K.; Medalie, Laura
1997-01-01
This report provides the results of a detailed Level II analysis of scour potential at structure MORETH00010021 on Town Highway 1 crossing Cox Brook, Moretown, Vermont (figures 1–8). A Level II study is a basic engineering analysis of the site, including a quantitative analysis of stream stability and scour (U.S. Department of Transportation, 1993). Results of a Level I scour investigation also are included in Appendix E of this report. A Level I investigation provides a qualitative geomorphic characterization of the study site. Information on the bridge, gleaned from Vermont Agency of Transportation (VTAOT) files, was compiled prior to conducting Level I and Level II analyses and is found in Appendix D. The site is in the Green Mountain section of the New England physiographic province in north-central Vermont. The 2.85-mi2 drainage area is in a predominantly rural and forested basin. In the vicinity of the study site, the surface cover is predominantly forested. In the study area, Cox Brook has an incised, sinuous channel with a slope of approximately 0.02 ft/ft, an average channel top width of 23 ft and an average bank height of 4 ft. The channel bed material ranges from gravel to cobble with a median grain size (D50) of 47.5 mm (0.156 ft). The geomorphic assessment at the time of the Level I and Level II site visit on July 18, 1996, indicated that the reach was stable. The Town Highway 1 crossing of Cox Brook is a 29-ft-long, two-lane bridge consisting of one 27-foot steel-beam span (Vermont Agency of Transportation, written communication, October 13, 1995). The opening length of the structure parallel to the bridge face is 24.8 ft. The bridge is supported by vertical, concrete abutments with wingwalls. The channel is skewed approximately 60 degrees to the opening while the measured opening-skew-to-roadway is 40 degrees. A scour hole 1.0 ft deeper than the mean thalweg depth was observed along the left abutment downstream during the Level I assessment. The
COX7AR is a Stress-inducible Mitochondrial COX Subunit that Promotes Breast Cancer Malignancy.
Zhang, Kezhong; Wang, Guohui; Zhang, Xuebao; Hüttemann, Philipp P; Qiu, Yining; Liu, Jenney; Mitchell, Allison; Lee, Icksoo; Zhang, Chao; Lee, Jin-Sook; Pecina, Petr; Wu, Guojun; Yang, Zeng-Quan; Hüttemann, Maik; Grossman, Lawrence I
2016-01-01
Cytochrome c oxidase (COX), the terminal enzyme of the mitochondrial respiratory chain, plays a key role in regulating mitochondrial energy production and cell survival. COX subunit VIIa polypeptide 2-like protein (COX7AR) is a novel COX subunit that was recently found to be involved in mitochondrial supercomplex assembly and mitochondrial respiration activity. Here, we report that COX7AR is expressed in high energy-demanding tissues, such as brain, heart, liver, and aggressive forms of human breast cancer cells. Under cellular stress that stimulates energy metabolism, COX7AR is induced and incorporated into the mitochondrial COX complex. Functionally, COX7AR promotes cellular energy production in human mammary epithelial cells. Gain- and loss-of-function analysis demonstrates that COX7AR is required for human breast cancer cells to maintain higher rates of proliferation, clone formation, and invasion. In summary, our study revealed that COX7AR is a stress-inducible mitochondrial COX subunit that facilitates human breast cancer malignancy. These findings have important implications in the understanding and treatment of human breast cancer and the diseases associated with mitochondrial energy metabolism. PMID:27550821
An improved multiple linear regression and data analysis computer program package
NASA Technical Reports Server (NTRS)
Sidik, S. M.
1972-01-01
NEWRAP, an improved version of a previous multiple linear regression program called RAPIER, CREDUC, and CRSPLT, allows for a complete regression analysis including cross plots of the independent and dependent variables, correlation coefficients, regression coefficients, analysis of variance tables, t-statistics and their probability levels, rejection of independent variables, plots of residuals against the independent and dependent variables, and a canonical reduction of quadratic response functions useful in optimum seeking experimentation. A major improvement over RAPIER is that all regression calculations are done in double precision arithmetic.
A regularized multivariate regression approach for eQTL analysis
Zhang, Hexin; Zhang, Yuzheng; Hsu, Li; Wang, Pei
2013-01-01
Expression quantitative trait loci (eQTLs) are genomic loci that regulate expression levels of mRNAs or proteins. Understanding these regulatory relationships provides important clues to biological pathways that underlie diseases. In this paper, we propose a new statistical method, GroupRemMap, for identifying eQTLs. We model the relationship between gene expression and single nucleotide variants (SNVs) through multivariate linear regression models, in which gene expression levels are responses and SNV genotypes are predictors. To handle the high-dimensionality as well as to incorporate the intrinsic group structure of SNVs, we introduce a new regularization scheme to (1) control the overall sparsity of the model; (2) encourage the group selection of SNVs from the same gene; and (3) facilitate the detection of trans-hub-eQTLs. We apply the proposed method to the colorectal and breast cancer data sets from The Cancer Genome Atlas (TCGA), and identify several biologically interesting eQTLs. These findings may provide insight into biological processes associated with cancers and generate hypotheses for future studies. PMID:26085849
Development of a User Interface for a Regression Analysis Software Tool
NASA Technical Reports Server (NTRS)
Ulbrich, Norbert Manfred; Volden, Thomas R.
2010-01-01
An easy-to-use user interface was implemented in a highly automated regression analysis tool. The user interface was developed from the start to run on computers that use the Windows, Macintosh, Linux, or UNIX operating system. Many user interface features were specifically designed such that a novice or inexperienced user can apply the regression analysis tool with confidence. Therefore, the user interface's design minimizes interactive input from the user. In addition, reasonable default combinations are assigned to those analysis settings that influence the outcome of the regression analysis. These default combinations will lead to a successful regression analysis result for most experimental data sets. The user interface comes in two versions. The text user interface version is used for the ongoing development of the regression analysis tool. The official release of the regression analysis tool, on the other hand, has a graphical user interface that is more efficient to use. This graphical user interface displays all input file names, output file names, and analysis settings for a specific software application mode on a single screen, which makes it easier to generate reliable analysis results and to perform input parameter studies. An object-oriented approach was used for the development of the graphical user interface. This choice keeps future software maintenance costs to a reasonable limit. Examples of both the text user interface and graphical user interface are discussed in order to illustrate the user interface's overall design approach.
Advanced GIS Exercise: Predicting Rainfall Erosivity Index Using Regression Analysis
ERIC Educational Resources Information Center
Post, Christopher J.; Goddard, Megan A.; Mikhailova, Elena A.; Hall, Steven T.
2006-01-01
Graduate students from a variety of agricultural and natural resource fields are incorporating geographic information systems (GIS) analysis into their graduate research, creating a need for teaching methodologies that help students understand advanced GIS topics for use in their own research. Graduate-level GIS exercises help students understand…
A Noncentral "t" Regression Model for Meta-Analysis
ERIC Educational Resources Information Center
Camilli, Gregory; de la Torre, Jimmy; Chiu, Chia-Yi
2010-01-01
In this article, three multilevel models for meta-analysis are examined. Hedges and Olkin suggested that effect sizes follow a noncentral "t" distribution and proposed several approximate methods. Raudenbush and Bryk further refined this model; however, this procedure is based on a normal approximation. In the current research literature, this…
ANOVA Versus Regression Analysis of ATI Designs: An Empirical Investigation.
ERIC Educational Resources Information Center
Thompson, Bruce
1986-01-01
This paper reports a Monte Carlo study of differences induced by different analysis choices over selected types of aptitude treatment interaction (ATI) data (nine combinations of three sample sizes and three population parameter effect sizes). Generally, ANOVA methods tended to overestimate smaller effect sizes and to underestimate larger effect…
NASA Astrophysics Data System (ADS)
Nishidate, Izumi; Wiswadarma, Aditya; Hase, Yota; Tanaka, Noriyuki; Maeda, Takaaki; Niizeki, Kyuichi; Aizu, Yoshihisa
2011-08-01
In order to visualize melanin and blood concentrations and oxygen saturation in human skin tissue, a simple imaging technique based on multispectral diffuse reflectance images acquired at six wavelengths (500, 520, 540, 560, 580 and 600 nm) was developed. The technique utilizes multiple regression analysis aided by Monte Carlo simulation for diffuse reflectance spectra. Using the absorbance spectrum as a response variable and the extinction coefficients of melanin, oxygenated hemoglobin, and deoxygenated hemoglobin as predictor variables, multiple regression analysis provides regression coefficients. Concentrations of melanin and total blood are then determined from the regression coefficients using conversion vectors that are deduced numerically in advance, while oxygen saturation is obtained directly from the regression coefficients. Experiments with a tissue-like agar gel phantom validated the method. In vivo experiments on the skin of the human hand during upper limb occlusion and of the inner forearm exposed to UV irradiation demonstrated the ability of the method to evaluate physiological reactions of human skin tissue.
Analysis of Maryland Poisoning Deaths Using Classification And Regression Tree (CART) Analysis
Pamer, Carol; Serpi, Tracey; Finkelstein, Joseph
2008-01-01
Our study is a cross-sectional analysis of Maryland poisoning deaths for years 2003 and 2004. We used Classification and Regression Tree (CART) methodology to classify 1,204 Maryland undetermined intent poisoning deaths as either unintentional or suicidal poisonings. The predictive ability of the selected set of variables (i.e., poisoned in the home or workplace, location type where poisoned, place of death, poison type, victim race and age, year of death) was extremely good. Of the 301 test cases, only eight were misclassified by the CART regression tree. Of 1,204 undetermined intent poisoning deaths, CART classified 903 as suicides and 301 as unintentional deaths. The major strength of our study is the use of CART to differentiate with a high degree of accuracy between unintentional and suicidal poisoning deaths among Maryland undetermined intent poisoning deaths. PMID:18999168
Analysis for Regression Model Behavior by Sampling Strategy for Annual Pollutant Load Estimation.
Park, Youn Shik; Engel, Bernie A
2015-11-01
Water quality data are typically collected less frequently than streamflow data due to the cost of collection and analysis, and therefore water quality data may need to be estimated for additional days. Regression models are applicable to interpolate water quality data associated with streamflow data and have come to be extensively used, requiring relatively small amounts of data. There is a need to evaluate how well the regression models represent pollutant loads from intermittent water quality data sets. Both the specific regression model and water quality data frequency are important factors in pollutant load estimation. In this study, nine regression models from the Load Estimator (LOADEST) and one regression model from the Web-based Load Interpolation Tool (LOADIN) were evaluated with subsampled water quality data sets from daily measured water quality data sets for N, P, and sediment. Each water quality parameter had different correlations with streamflow, and the subsampled water quality data sets had various proportions of storm samples. The behaviors of the regression models differed not only by water quality parameter but also by proportion of storm samples. The regression models from LOADEST provided accurate and precise annual sediment and P load estimates using the water quality data of 20 to 40% storm samples. LOADIN provided more accurate and precise annual N load estimates than LOADEST. In addition, the results indicate that avoidance of water quality data extrapolation and availability of water quality data from storm events were crucial in annual pollutant load estimation using pollutant regression models. PMID:26641336
Barresi, Vincenza; Trovato-Salinaro, Angela; Spampinato, Giorgia; Musso, Nicolò; Castorina, Sergio; Rizzarelli, Enrico; Condorelli, Daniele Filippo
2016-08-01
Copper homeostasis and distribution is strictly regulated by a network of transporters and intracellular chaperones encoded by a group of genes collectively known as copper homeostasis genes (CHGs). In this work, analysis of The Cancer Genome Atlas database for somatic point mutations in colorectal cancer revealed that inactivating mutations are absent or extremely rare in CHGs. Using oligonucleotide microarrays, we found a strong increase in mRNA levels of the membrane copper transporter 1 protein [CTR1; encoded by the solute carrier family 31 member 1 gene (SLC31A1 gene)] in our series of colorectal carcinoma samples. CTR1 is the main copper influx transporter and changes in its expression are able to induce modifications of cellular copper accumulation. The increased SLC31A1 mRNA level is accompanied by a parallel increase in transcript levels for copper efflux pump ATP7A, copper metabolism Murr1 domain containing 1 (COMMD1), the cytochrome C oxidase assembly factors [synthesis of cytochrome c oxidase 1 (SCO1) and cytochrome c oxidase copper chaperone 11 (COX11)], the cupric reductase six transmembrane epithelial antigen of the prostate (STEAP3), and the metal-regulatory transcription factors (MTF1, MTF2) and specificity protein 1 (SP1). The significant correlation between SLC31A1,SCO1, and COX11 mRNA levels suggests that this transcriptional upregulation might be part of a coordinated program of gene regulation. Transcript-level upregulation of SLC31A1,SCO1, and COX11 was also confirmed by the analysis of different colon carcinoma cell lines (Caco-2, HT116, HT29) and cancer cell lines of different tissue origin (MCF7, PC3). Finally, exon-level expression analysis of SLC31A1 reveals differential expression of alternative transcripts in colorectal cancer and normal colonic mucosa. PMID:27516958
Deng, Yangyang; Parajuli, Prem B.
2011-08-10
Evaluation of the economic feasibility of a bio-gasification facility requires an understanding of its unit cost under different production capacities. The objective of this study was to evaluate the unit cost of syngas production at capacities from 60 through 1800 Nm³/h using an economic model with three regression analysis techniques (simple regression, reciprocal regression, and log-log regression). The preliminary result of this study showed that the reciprocal regression analysis technique gave the best-fit curve between per-unit cost and production capacity, with a sum of error squares (SES) lower than 0.001 and a coefficient of determination (R²) of 0.996. The regression analysis techniques determined the minimum unit cost of syngas production for micro-scale bio-gasification facilities to be $0.052/Nm³, at a capacity of 2,880 Nm³/h. The results of this study suggest that to reduce cost, facilities should run at a high production capacity. In addition, this technique could contribute a new categorical criterion for evaluating micro-scale bio-gasification facilities from the perspective of economic analysis.
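The three functional forms named in the abstract above can be compared with a short sketch. The capacity and cost values below are invented for illustration (they are generated from a hypothetical reciprocal cost curve, not taken from the study), and the helper names `ols` and `ses` are assumptions.

```python
import math

def ols(x, y):
    """Ordinary least squares for y = a + b*x; returns (a, b)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    sxx = sum((xi - mx) ** 2 for xi in x)
    b = sxy / sxx
    return my - b * mx, b

def ses(y, y_hat):
    """Sum of error squares on the original cost scale."""
    return sum((yi - fi) ** 2 for yi, fi in zip(y, y_hat))

# Hypothetical data: unit cost falls with capacity along a reciprocal curve.
capacity = [60, 120, 300, 600, 1200, 1800]
cost = [0.05 + 6.0 / q for q in capacity]

# Simple regression: cost = a + b * capacity
a, b = ols(capacity, cost)
pred_simple = [a + b * q for q in capacity]

# Reciprocal regression: cost = a + b / capacity
a, b = ols([1.0 / q for q in capacity], cost)
pred_reciprocal = [a + b / q for q in capacity]

# Log-log regression: ln(cost) = a + b * ln(capacity)
a, b = ols([math.log(q) for q in capacity], [math.log(c) for c in cost])
pred_loglog = [math.exp(a + b * math.log(q)) for q in capacity]

errors = {
    "simple": ses(cost, pred_simple),
    "reciprocal": ses(cost, pred_reciprocal),
    "log-log": ses(cost, pred_loglog),
}
best = min(errors, key=errors.get)
```

Because the synthetic data were built from a reciprocal curve, the reciprocal form necessarily fits best here; the sketch only shows how the three fits and their SES values can be put on a common footing.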
Regression Models for Demand Reduction based on Cluster Analysis of Load Profiles
Yamaguchi, Nobuyuki; Han, Junqiao; Ghatikar, Girish; Piette, Mary Ann; Asano, Hiroshi; Kiliccote, Sila
2009-06-28
This paper provides new regression models for demand reduction of Demand Response programs for the purpose of ex ante evaluation of the programs and screening for recruiting customer enrollment into the programs. The proposed regression models employ load sensitivity to outside air temperature and representative load pattern derived from cluster analysis of customer baseline load as explanatory variables. The performance of the proposed models was examined from the viewpoint of validity of explanatory variables and fitness of regressions, using actual load profile data of Pacific Gas and Electric Company's commercial and industrial customers who participated in the 2008 Critical Peak Pricing program, including Manual and Automated Demand Response.
ERIC Educational Resources Information Center
Barringer, Mary S.
Researchers are becoming increasingly aware of the advantages of using multiple regression as opposed to analysis of variance (ANOVA) or analysis of covariance (ANCOVA). Multiple regression is more versatile and does not force the researcher to throw away variance by categorizing intervally scaled data. Polynomial regression analysis offers the…
Guidelines for the use of structural versus regression analysis in geomorphic studies
Osterkamp, W.R.; McNellis, Jesse M.; Jordan, Paul Robert
1978-01-01
Regression analysis is a useful curve-fitting technique, but it often is misapplied to geomorphic data sets. When error components can be identified for both variables, the statistical technique of structural analysis is preferred. If regression results are available, conversion to a structural analysis can be made either manually or by computer. Use of computer-generated data sets permits the construction of curves relating variation between regression and structural analyses to the range of data of the independent variable. The data have randomly imposed error components of specified standard deviation and a slope of the linear relation that simulates gradient-discharge relations of natural alluvial streams. The empirically developed curves can be used to determine the need for structural analysis of real geomorphic data sets. (Woodard-USGS)
Exploratory regression analysis: a tool for selecting models and determining predictor importance.
Braun, Michael T; Oswald, Frederick L
2011-06-01
Linear regression analysis is one of the most important tools in a researcher's toolbox for creating and testing predictive models. Although linear regression analysis indicates how strongly a set of predictor variables, taken together, will predict a relevant criterion (i.e., the multiple R), the analysis cannot indicate which predictors are the most important. Although there is no definitive or unambiguous method for establishing predictor variable importance, there are several accepted methods. This article reviews those methods for establishing predictor importance and provides a program (in Excel) for implementing them (available for direct download at http://dl.dropbox.com/u/2480715/ERA.xlsm?dl=1). The program investigates all 2^p - 1 submodels and produces several indices of predictor importance. This exploratory approach to linear regression, similar to other exploratory data analysis techniques, has the potential to yield both theoretical and practical benefits. PMID:21298571
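The enumeration of all 2^p - 1 submodels mentioned above can be sketched as follows. This is not the Excel program the article provides; it is a minimal illustration on synthetic data, and the function names `r_squared` and `all_submodels` as well as the simulated coefficients are assumptions.

```python
from itertools import combinations
import numpy as np

def r_squared(X, y):
    """R^2 of an OLS fit of y on the columns of X plus an intercept."""
    A = np.column_stack([np.ones(len(y)), X])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    resid = y - A @ coef
    return 1.0 - float(resid @ resid) / float(np.sum((y - y.mean()) ** 2))

def all_submodels(X, y, names):
    """Fit every non-empty subset of predictors; returns {subset: R^2}."""
    p = X.shape[1]
    results = {}
    for k in range(1, p + 1):
        for cols in combinations(range(p), k):
            key = tuple(names[c] for c in cols)
            results[key] = r_squared(X[:, list(cols)], y)
    return results

# Synthetic data (assumed): x1 matters most, x2 a little, x3 not at all.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 3))
y = 2.0 * X[:, 0] + 0.5 * X[:, 1] + rng.normal(scale=0.1, size=200)

results = all_submodels(X, y, ["x1", "x2", "x3"])
for subset, r2 in sorted(results.items(), key=lambda kv: -kv[1]):
    print(subset, round(r2, 3))
```

With p = 3 the loop fits 7 submodels; importance indices such as those the article reviews would then be computed from how R² changes as each predictor enters or leaves a subset.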
Criteria for the use of regression analysis for remote sensing of sediment and pollutants
NASA Technical Reports Server (NTRS)
Whitlock, C. H.; Kuo, C. Y.; Lecroy, S. R. (Principal Investigator)
1982-01-01
Data analysis procedures for quantification of water quality parameters that are already identified and are known to exist within the water body are considered. The linear multiple-regression technique was examined as a procedure for defining and calibrating data analysis algorithms for such instruments as spectrometers and multispectral scanners.
Partitioning Predicted Variance into Constituent Parts: A Primer on Regression Commonality Analysis.
ERIC Educational Resources Information Center
Amado, Alfred J.
Commonality analysis is a method of decomposing the R squared in a multiple regression analysis into the proportion of explained variance of the dependent variable associated with each independent variable uniquely and the proportion of explained variance associated with the common effects of one or more independent variables in various…
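For the two-predictor case, the decomposition described above reduces to a few subtractions of R² values. The sketch below assumes made-up data and uses the standard two-predictor formula for the full-model R² in terms of pairwise correlations; the function names `corr` and `commonality` are hypothetical.

```python
import math

def corr(a, b):
    """Pearson correlation of two equal-length sequences."""
    n = len(a)
    ma, mb = sum(a) / n, sum(b) / n
    sab = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    saa = sum((x - ma) ** 2 for x in a)
    sbb = sum((y - mb) ** 2 for y in b)
    return sab / math.sqrt(saa * sbb)

def commonality(x1, x2, y):
    """Split the two-predictor R^2 into unique and common parts."""
    r1, r2, r12 = corr(x1, y), corr(x2, y), corr(x1, x2)
    r2_full = (r1 ** 2 + r2 ** 2 - 2 * r1 * r2 * r12) / (1 - r12 ** 2)
    return {
        "full": r2_full,
        "unique_x1": r2_full - r2 ** 2,          # what x1 adds beyond x2
        "unique_x2": r2_full - r1 ** 2,          # what x2 adds beyond x1
        "common": r1 ** 2 + r2 ** 2 - r2_full,   # variance explained jointly
    }

# Made-up illustrative data with correlated predictors.
x1 = [1, 2, 3, 4, 5, 6, 7, 8]
x2 = [2, 1, 4, 3, 6, 5, 8, 7]
y = [1.5, 2.0, 3.9, 4.1, 6.2, 5.8, 8.1, 7.9]

parts = commonality(x1, x2, y)
```

The three components always sum to the full-model R²; with more than two predictors the number of components grows to 2^p - 1 and the bookkeeping becomes correspondingly heavier.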
Regression Analysis of Physician Distribution to Identify Areas of Need: Some Preliminary Findings.
ERIC Educational Resources Information Center
Morgan, Bruce B.; And Others
A regression analysis was conducted of factors that help to explain the variance in physician distribution and which identify those factors that influence the maldistribution of physicians. Models were developed for different geographic areas to determine the most appropriate unit of analysis for the Western Missouri Area Health Education Center…
Modeling of retardance in ferrofluid with Taguchi-based multiple regression analysis
NASA Astrophysics Data System (ADS)
Lin, Jing-Fung; Wu, Jyh-Shyang; Sheu, Jer-Jia
2015-03-01
The citric acid (CA) coated Fe3O4 ferrofluids are prepared by a co-precipitation method and the magneto-optical retardance property is measured by a Stokes polarimeter. Optimization and multiple regression of retardance in ferrofluids are executed by combining the Taguchi method and Excel. From the nine tests for four parameters, including pH of suspension, molar ratio of CA to Fe3O4, volume of CA, and coating temperature, the order of influence and the optimal parameter combination are found. Multiple regression analysis and an F-test on the significance of the regression equation are performed. It is found that the model F value is much larger than Fcritical and that the significance level is P < 0.0001, so it can be concluded that the regression model has statistically significant predictive ability. Substituting the optimal combination into the regression equation, the retardance is obtained as 32.703°, higher than the highest value in the tests by 11.4%.
Hu, W; Yu, X G; Wu, S; Tan, L P; Song, M R; Abdulahi, A Y; Wang, Z; Jiang, B; Li, G Q
2016-07-01
Ancylostoma ceylanicum is a common zoonotic nematode. Cats act as natural reservoirs of the hookworm and are involved in transmitting infection to humans, thus posing a potential risk to public health. The prevalence of feline A. ceylanicum in Guangzhou (South China) was surveyed by polymerase chain reaction-restriction fragment length polymorphism (PCR-RFLP). In total, 112 faecal samples were examined; 34.8% (39/112) and 43.8% (49/112) samples were positive with hookworms by microscopy and PCR method, respectively. Among them, 40.8% of samples harboured A. ceylanicum. Twelve positive A. ceylanicum samples were selected randomly and used for cox 1 sequence analysis. Sequencing results revealed that they had 97-99% similarity with A. ceylanicum cox 1 gene sequences deposited in GenBank. A phylogenetic tree showed that A. ceylanicum isolates were divided into two groups: one comprising four isolates from Guangzhou (South China), and the other comprising those from Malaysia, Cambodia and Guangzhou. In the latter group, all A. ceylanicum isolates from Guangzhou were clustered into a minor group again. The results indicate that the high prevalence of A. ceylanicum in stray cats in South China poses a potential risk of hookworm transmission from pet cats to humans, and that A. ceylanicum may be a species complex worldwide. PMID:26123649
Trend Analysis of Cancer Mortality and Incidence in Panama, Using Joinpoint Regression Analysis
Politis, Michael; Higuera, Gladys; Chang, Lissette Raquel; Gomez, Beatriz; Bares, Juan; Motta, Jorge
2015-01-01
Abstract Cancer is one of the leading causes of death worldwide and its incidence is expected to increase in the future. In Panama, cancer is also one of the leading causes of death. In 1964, a nationwide cancer registry was started and it was restructured and improved in 2012. The aim of this study is to utilize Joinpoint regression analysis to study the trends of the incidence and mortality of cancer in Panama in the last decade. Cancer mortality was estimated from the Panamanian National Institute of Census and Statistics Registry for the period 2001 to 2011. Cancer incidence was estimated from the Panamanian National Cancer Registry for the period 2000 to 2009. The Joinpoint Regression Analysis program, version 4.0.4, was used to calculate trends by age-adjusted incidence and mortality rates for selected cancers. Overall, the trend of age-adjusted cancer mortality in Panama has declined over the last 10 years (−1.12% per year). The cancers for which there was a significant increase in the trend of mortality were female breast cancer and ovarian cancer; while the highest increases in incidence were shown for breast cancer, liver cancer, and prostate cancer. Significant decrease in the trend of mortality was evidenced for the following: prostate cancer, lung and bronchus cancer, and cervical cancer; with respect to incidence, only oral and pharynx cancer in both sexes had a significant decrease. Some cancers showed no significant trends in incidence or mortality. This study reveals contrasting trends in cancer incidence and mortality in Panama in the last decade. Although Panama is considered an upper middle income nation, this study demonstrates that some cancer mortality trends, like the ones seen in cervical and lung cancer, behave similarly to the ones seen in high income countries. In contrast, other types, like breast cancer, follow a pattern seen in countries undergoing a transition to a developed economy with its associated lifestyle, nutrition, and
Regression analysis in interlaboratory surveys: a case study with cholesterol and triglycerides.
Munster, D J; Lever, M; Walmsley, T A
1978-10-01
1. A new interlaboratory survey design, that uses regression analysis to compare results from each laboratory with target values, was tested using cholesterol and triglyceride analyses. The fifty New Zealand laboratories involved showed considerable interlaboratory variation (CV = 8% to 27% for cholesterol, 13% to 113% for triglycerides), 30% and 40% of which was associated with systematic differences between laboratories. 2. End-of-period summaries using regression analysis confirmed the presence of systematic errors. These were either simple types caused apparently by incorrect standardisation (regression slope, B not equal to 1.0) or inappropriate blank correction (intercept, A not equal to zero) or complex types presumably due to nonlinearity or nonspecificity. Graphical display of results from each laboratory aided fault diagnosis and allowed the detection of between-run standardisation differences. 3. Method comparison studies were made: the only highly significant result being lower precision achieved by enzymatic cholesterol methods compared with other colorimetric methods. PMID:729161
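The slope and intercept checks mentioned in this abstract can be sketched as follows. This is a rough illustration under stated assumptions: the target values, the two simulated laboratories, and the tolerances `tol_a` and `tol_b` are hypothetical, and a real survey would judge the coefficients against their confidence intervals rather than fixed cut-offs.

```python
def fit_line(x, y):
    """Ordinary least squares for y = a + b*x; returns (a, b)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    b = sxy / sxx
    return my - b * mx, b

def diagnose(a, b, tol_a=0.1, tol_b=0.05):
    """Flag simple systematic errors; the tolerances are hypothetical."""
    flags = []
    if abs(b - 1.0) > tol_b:
        flags.append("slope differs from 1: possible standardisation error")
    if abs(a) > tol_a:
        flags.append("intercept differs from 0: possible blank-correction error")
    return flags

# Hypothetical target values and two simulated laboratories.
target = [3.0, 4.5, 6.0, 7.5, 9.0]
lab_a = [1.1 * t for t in target]   # reads 10% high at every level
lab_b = [t + 0.5 for t in target]   # constant offset at every level
flags_a = diagnose(*fit_line(target, lab_a))
flags_b = diagnose(*fit_line(target, lab_b))
```

In this toy data, laboratory A is flagged only on slope and laboratory B only on intercept, mirroring the two simple error types the abstract distinguishes.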
2014-01-01
Background and aim Altered glucose metabolism, oxidative stress, lipid levels and inflammatory markers are important risk factors in diabetes, cardiovascular, and many other diseases. Cocoa has been shown to exert antioxidant and anti-inflammatory effects. The aim of this study is twofold: to assess the effect of Cocoa on the lipid profile and peroxidation in addition to the inflammatory markers in type 2 diabetic patients, and to represent a virtual model of probable action mechanism of observed clinical effects of Cocoa consumption using in silico analysis and bioinformatics data. Methods One hundred subjects with type 2 diabetes were included in a randomized clinical control trial. Fifty treatment subjects received 10 grams cocoa powder and 10 grams milk powder dissolved in 250 ml of boiling water, and the other fifty control subjects received only 10 grams milk powder dissolved in 250 ml boiling water. Both groups were on the mentioned regimen twice daily for 6 weeks. Blood samples were obtained prior to Cocoa consumption and 6 weeks after intervention. Serum lipids and lipoproteins profile, malondialdehyde and inflammatory markers including tumor necrosis factor-α (TNF-α), interleukin-6 (IL-6) and high sensitive C-reactive protein (hs-CRP) were measured. For statistical analysis two independent and paired samples t-test and linear regression were used. Bioinformatics and virtual analysis were performed using string data base and Molegro virtual software. Results Cocoa consumption lowered blood cholesterol, triglyceride, LDL-cholesterol, and TNF-α, hs-CRP, IL-6 significantly (P < 0.01). The results showed that the levels of HDL-cholesterol decreased significantly (P < 0.05) but Cocoa inhibited lipid peroxidation in the treatment group compared with the control group (P < 0.0001). Virtual analysis showed that the most frequent Cocoa ingredients, (+)-Catechin and (−)-Epicatechin, can dock to the enzyme COX-2. Conclusion These data support the beneficial effect
Larsson, A
1997-08-01
The objective of this study was to investigate the conditions for regression analysis of data from equilibrium experiments. One important issue was to recognize that Kd and the binding site concentration (A) are not of equal nature, although both are parameters in the regression analysis. Whereas Kd approximates to a true constant, A is subject to experimental variation due to pipetting errors and in solid-phase experiments also to uneven coating properties. While recognizing that the ideal assumptions for ordinary regression analysis are poorly satisfied, different regression models were evaluated by extensive simulations. It was first established by a 'worst case' investigation that a limited error (8%) in the dependent variable is not critical for the results obtained at curve-fitting to Langmuir's equation. Seven different equations were compared for the calculation of data representing a solid-phase equilibrium experiment with statistical but no systematic errors. All the equations are rearrangements of the law of mass action. In this setting the Scatchard plot gave the best result, but also the double reciprocal and the Woolf plots worked well in weighted analysis. Langmuir's equation gave the best result of the 4 nonlinear regression models tested. The influence of one type of systematic error was also investigated. This assumed that 10% of the label was positioned on particles other than the functional ligand molecules. This systematic error was amplified, which resulted in a substantial bias. The calculated Kd-values varied slightly with the regression method used and were almost 24% too high in the best methods. PMID:9328576
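As a minimal sketch of one of the linear rearrangements named in this abstract, the Scatchard form of the law of mass action plots bound/free against bound, so that the slope is -1/Kd and the intercept is A/Kd. The function names and the noise-free data below are hypothetical, and the fit is unweighted, unlike the weighted analyses the study evaluated.

```python
def langmuir(free, kd, a):
    """Bound ligand at equilibrium: B = A * F / (Kd + F)."""
    return a * free / (kd + free)

def scatchard_fit(free, bound):
    """Unweighted line through (B, B/F); returns estimates (Kd, A)."""
    x = bound
    y = [b / f for b, f in zip(bound, free)]
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = my - slope * mx
    kd = -1.0 / slope          # slope of the Scatchard line is -1/Kd
    a = intercept * kd         # intercept of the Scatchard line is A/Kd
    return kd, a

# Hypothetical noise-free experiment with Kd = 2.0 and A = 10.0.
free = [0.5, 1.0, 2.0, 4.0, 8.0, 16.0]
bound = [langmuir(f, 2.0, 10.0) for f in free]
kd_hat, a_hat = scatchard_fit(free, bound)
```

With noise-free data the rearrangement recovers both parameters exactly; the abstract's point is that the rearrangements behave differently once random and systematic errors are present.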
Quantile regression for the statistical analysis of immunological data with many non-detects
2012-01-01
Background Immunological parameters are hard to measure. A well-known problem is the occurrence of values below the detection limit, the non-detects. Non-detects are a nuisance, because classical statistical analyses, like ANOVA and regression, cannot be applied. The more advanced statistical techniques currently available for the analysis of datasets with non-detects can only be used if a small percentage of the data are non-detects. Methods and results Quantile regression, a generalization of percentiles to regression models, models the median or higher percentiles and tolerates very high numbers of non-detects. We present a non-technical introduction and illustrate it with an implementation to real data from a clinical trial. We show that by using quantile regression, groups can be compared and that meaningful linear trends can be computed, even if more than half of the data consists of non-detects. Conclusion Quantile regression is a valuable addition to the statistical methods that can be used for the analysis of immunological datasets with non-detects. PMID:22769433
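Why a percentile tolerates non-detects can be sketched in the intercept-only case, where quantile regression reduces to finding the value that minimizes the pinball (check) loss. This is a simplified illustration, not the paper's implementation: the detection limit, the data, and the function names are hypothetical.

```python
def pinball_loss(q, tau, values):
    """Check loss that quantile regression minimizes for quantile level tau."""
    return sum(tau * (v - q) if v >= q else (1 - tau) * (q - v) for v in values)

def quantile_by_loss(values, tau):
    """Intercept-only quantile fit: the minimum is attained at a data point."""
    return min(sorted(set(values)), key=lambda q: pinball_loss(q, tau, values))

# Hypothetical sample: six non-detects (None) below a detection limit of 1.0.
detection_limit = 1.0
measured = [None] * 6 + [1.2, 1.5, 2.0, 3.1]

# Two arbitrary substitutions for the non-detects.
at_limit = [detection_limit if v is None else v for v in measured]
at_zero = [0.0 if v is None else v for v in measured]

q75_limit = quantile_by_loss(at_limit, 0.75)
q75_zero = quantile_by_loss(at_zero, 0.75)
mean_limit = sum(at_limit) / len(at_limit)
mean_zero = sum(at_zero) / len(at_zero)
```

Here 60% of the values are non-detects, yet the 75th percentile is the same under both substitutions, while the mean depends on the substituted value. A regression with covariates would need a linear-programming or iterative solver, which is outside this sketch.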
ERIC Educational Resources Information Center
Campbell, S. Duke; Greenberg, Barry
The development of a predictive equation capable of explaining a significant percentage of enrollment variability at Florida International University is described. A model utilizing trend analysis and a multiple regression approach to enrollment forecasting was adapted to investigate enrollment dynamics at the university. Four independent…
Catching up with Harvard: Results from Regression Analysis of World Universities League Tables
ERIC Educational Resources Information Center
Li, Mei; Shankar, Sriram; Tang, Kam Ki
2011-01-01
This paper uses regression analysis to test if the universities performing less well according to Shanghai Jiao Tong University's world universities league tables are able to catch up with the top performers, and to identify national and institutional factors that could affect this catching up process. We have constructed a dataset of 461…
Ultrasound-enhanced bioscouring of greige cotton: regression analysis of process factors
Technology Transfer Automated Retrieval System (TEKTRAN)
Process factors of enzyme concentration, time, power and frequency were investigated for ultrasound-enhanced bioscouring of greige cotton. A fractional factorial experimental design and subsequent regression analysis of the process factors were employed to determine the significance of each factor a...
Passing the Test: Ecological Regression Analysis in the Los Angeles County Case and Beyond.
ERIC Educational Resources Information Center
Lichtman, Allan J.
1991-01-01
Statistical analysis of racially polarized voting prepared for the Garza v County of Los Angeles (California) (1990) voting rights case is reviewed to demonstrate that ecological regression is a flexible, robust technique that illuminates the reality of ethnic voting, and superior to the neighborhood model supported by the defendants. (SLD)
Family Background Variables as Instruments for Education in Income Regressions: A Bayesian Analysis
ERIC Educational Resources Information Center
Hoogerheide, Lennart; Block, Joern H.; Thurik, Roy
2012-01-01
The validity of family background variables instrumenting education in income regressions has been much criticized. In this paper, we use data from the 2004 German Socio-Economic Panel and Bayesian analysis to analyze to what degree violations of the strict validity assumption affect the estimation results. We show that, in case of moderate direct…
ERIC Educational Resources Information Center
Preacher, Kristopher J.; Curran, Patrick J.; Bauer, Daniel J.
2006-01-01
Simple slopes, regions of significance, and confidence bands are commonly used to evaluate interactions in multiple linear regression (MLR) models, and the use of these techniques has recently been extended to multilevel or hierarchical linear modeling (HLM) and latent curve analysis (LCA). However, conducting these tests and plotting the…
Factor Regression Analysis: A New Method for Weighting Predictors. Final Report.
ERIC Educational Resources Information Center
Curtis, Ervin W.
The optimum weighting of variables to predict a dependent-criterion variable is an important problem in nearly all of the social and natural sciences. Although the predominant method, multiple regression analysis (MR), yields optimum weights for the sample at hand, these weights are not generally optimum in the population from which the sample was…
Multiple Logistic Regression Analysis of Cigarette Use among High School Students
ERIC Educational Resources Information Center
Adwere-Boamah, Joseph
2011-01-01
A binary logistic regression analysis was performed to predict high school students' cigarette smoking behavior from selected predictors from 2009 CDC Youth Risk Behavior Surveillance Survey. The specific target student behavior of interest was frequent cigarette use. Five predictor variables included in the model were: a) race, b) frequency of…
Isolating the Effects of Training Using Simple Regression Analysis: An Example of the Procedure.
ERIC Educational Resources Information Center
Waugh, C. Keith
This paper provides a case example of simple regression analysis, a forecasting procedure used to isolate the effects of training from an identified extraneous variable. This case example focuses on results of a three-day sales training program to improve bank loan officers' knowledge, skill-level, and attitude regarding solicitation and sale of…
Some Classroom Experiences in the Teaching of Empirical Model Building and Regression Analysis.
ERIC Educational Resources Information Center
Utter, Merlin; Wilkinson, John W.
The use of the digital computer for the presentation of the topics of empirical model building and regression analysis is discussed. The author concentrates upon a description of computing exercises which are employed to provide the students with experience in model building and evaluation in a controlled situation. The types of exercises given…
What Satisfies Students?: Mining Student-Opinion Data with Regression and Decision Tree Analysis
ERIC Educational Resources Information Center
Thomas, Emily H.; Galambos, Nora
2004-01-01
To investigate how students' characteristics and experiences affect satisfaction, this study uses regression and decision tree analysis with the CHAID algorithm to analyze student-opinion data. A data mining approach identifies the specific aspects of students' university experience that most influence three measures of general satisfaction. The…
Using Refined Regression Analysis To Assess The Ecological Services Of Restored Wetlands
A hierarchical approach to regression analysis of wetland water treatment was conducted to determine which factors are the most appropriate for characterizing wetlands of differing structure and function. We used this approach in an effort to identify the types and characteristi...
Predictive Discriminant Analysis Versus Logistic Regression in Two-Group Classification Problems.
ERIC Educational Resources Information Center
Meshbane, Alice; Morris, John D.
A method for comparing the cross-validated classification accuracies of predictive discriminant analysis and logistic regression classification models is presented under varying data conditions for the two-group classification problem. With this method, separate-group, as well as total-sample proportions of the correct classifications, can be…
A use of regression analysis in acoustical diagnostics of gear drives
NASA Technical Reports Server (NTRS)
Balitskiy, F. Y.; Genkin, M. D.; Ivanova, M. A.; Kobrinskiy, A. A.; Sokolova, A. G.
1973-01-01
A study is presented of components of the vibration spectrum as the filtered first and second harmonics of the tooth frequency which permits information to be obtained on the physical characteristics of the vibration excitation process, and an approach to be made to comparison of models of the gearing. Regression analysis of two random processes has shown a strong dependence of the second harmonic on the first, and independence of the first from the second. The nature of change in the regression line, with change in loading moment, gives rise to the idea of a variable phase shift between the first and second harmonics.
Regression analysis of non-contact acousto-thermal signature data
NASA Astrophysics Data System (ADS)
Criner, Amanda; Schehl, Norman
2016-05-01
The non-contact acousto-thermal signature (NCATS) is a nondestructive evaluation technique with potential to detect fatigue in materials such as noisy titanium and polymer matrix composites. The underlying physical mechanisms and properties may be determined by parameter estimation via nonlinear regression. The nonlinear regression analysis formulation, including the underlying models, is discussed. Several models and associated data analyses are given along with the assumptions implicit in the underlying model. The results are anomalous. These anomalous results are evaluated with respect to the accuracy of the implicit assumptions.
Detrended fluctuation analysis as a regression framework: Estimating dependence at different scales
NASA Astrophysics Data System (ADS)
Kristoufek, Ladislav
2015-02-01
We propose a framework combining detrended fluctuation analysis with standard regression methodology. The method is built on detrended variances and covariances and it is designed to estimate regression parameters at different scales and under potential nonstationarity and power-law correlations. The former feature allows for distinguishing between effects for a pair of variables from different temporal perspectives. The latter ones make the method a significant improvement over the standard least squares estimation. Theoretical claims are supported by Monte Carlo simulations. The method is then applied on selected examples from physics, finance, environmental science, and epidemiology. For most of the studied cases, the relationship between variables of interest varies strongly across scales.
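A scale-specific slope built from detrended variances and covariances can be sketched as below. This is an assumption-laden illustration rather than the author's estimator: it takes the slope at scale s to be the detrended covariance of the two profiles divided by the detrended variance of the regressor's profile, uses non-overlapping boxes with linear detrending, and all names and data are hypothetical.

```python
import random

def profile(series):
    """Cumulative sum of deviations from the mean."""
    mean = sum(series) / len(series)
    out, total = [], 0.0
    for v in series:
        total += v - mean
        out.append(total)
    return out

def detrend(block):
    """Residuals of a least-squares line fitted to one box."""
    n = len(block)
    t_mean = (n - 1) / 2
    b_mean = sum(block) / n
    stt = sum((t - t_mean) ** 2 for t in range(n))
    stb = sum((t - t_mean) * (b - b_mean) for t, b in enumerate(block))
    slope = stb / stt
    return [b - (b_mean + slope * (t - t_mean)) for t, b in enumerate(block)]

def detrended_cov(x, y, s):
    """Average product of detrended profile residuals over boxes of size s."""
    px, py = profile(x), profile(y)
    total, count = 0.0, 0
    for start in range(0, len(px) - s + 1, s):
        rx = detrend(px[start:start + s])
        ry = detrend(py[start:start + s])
        total += sum(a * b for a, b in zip(rx, ry))
        count += s
    return total / count

def dfa_slope(x, y, s):
    """Scale-specific slope: detrended covariance over detrended variance."""
    return detrended_cov(x, y, s) / detrended_cov(x, x, s)

# Hypothetical data: y depends on x with slope 2 at every scale.
random.seed(0)
x = [random.gauss(0.0, 1.0) for _ in range(256)]
y = [2.0 * v + 1.0 for v in x]
slopes = {s: dfa_slope(x, y, s) for s in (8, 16, 32)}
```

Because y is an exact linear function of x here, the estimated slope is 2 at every scale; with real data the estimates may differ across scales, which is the behaviour the abstract reports.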
NASA Astrophysics Data System (ADS)
Pradhan, B.; Buchroithner, M. F.; Mansor, S.
2009-04-01
This paper presents the assessment results of three spatially based probabilistic models using Geoinformation Techniques (GIT) for landslide susceptibility analysis at Penang Island in Malaysia. Landslide locations within the study area were identified by interpreting aerial photographs and satellite images, supported by field surveys. Maps of the topography, soil type, lineaments and land cover were constructed from the spatial data sets. Nine landslide-related factors were extracted from the spatial database, and the neural network, frequency ratio and logistic regression coefficients of each factor were computed. Landslide susceptibility maps were drawn for the study area using neural network, frequency ratio and logistic regression models. For verification, the results of the analyses were compared with actual landslide locations in the study area. The verification results show that the frequency ratio model provides higher prediction accuracy than the ANN and regression models.
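The frequency ratio for one factor can be sketched as follows, assuming the commonly used definition: the share of landslide cells falling in a class divided by the share of all cells in that class. The slope classes and cell counts are hypothetical and are not taken from the Penang Island data.

```python
def frequency_ratio(class_cells, landslide_cells):
    """Ratio of landslide share to area share for each class of one factor."""
    total_cells = sum(class_cells.values())
    total_slides = sum(landslide_cells.values())
    return {
        name: (landslide_cells[name] / total_slides) / (class_cells[name] / total_cells)
        for name in class_cells
    }

# Hypothetical slope-angle classes: cells per class and landslide cells per class.
class_cells = {"0-15 deg": 5000, "15-30 deg": 3000, ">30 deg": 2000}
landslide_cells = {"0-15 deg": 10, "15-30 deg": 30, ">30 deg": 60}
fr = frequency_ratio(class_cells, landslide_cells)
```

A ratio above 1 marks a class where landslides are over-represented relative to its area. A susceptibility index is often formed by summing the ratios of all factors at each cell, although the abstract does not state how the maps were combined.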
Wang, Wen-Cheng; Cho, Wen-Chien; Chen, Yin-Jen
2014-01-01
It is estimated that mainland Chinese tourists travelling to Taiwan can bring annual revenues of 400 billion NTD to the Taiwan economy. Thus, how the Taiwanese Government formulates relevant measures to satisfy both sides is the focus of most concern. Taiwan must improve the facilities and service quality of its tourism industry so as to attract more mainland tourists. This paper conducted a questionnaire survey of mainland tourists and used grey relational analysis in grey mathematics to analyze the satisfaction performance of all satisfaction question items. The first eight satisfaction items were used as independent variables, and the overall satisfaction performance was used as a dependent variable for quantile regression model analysis to discuss the relationship between the dependent variable under different quantiles and independent variables. Finally, this study further discussed the predictive accuracy of the least mean regression model and each quantile regression model, as a reference for research personnel. The analysis results showed that other variables could also affect the overall satisfaction performance of mainland tourists, in addition to occupation and age. The overall predictive accuracy of quantile regression model Q0.25 was higher than that of the other three models. PMID:24574916
NASA Technical Reports Server (NTRS)
Rummler, D. R.
1976-01-01
The results are presented of investigations to apply regression techniques to the development of methodology for creep-rupture data analysis. Regression analysis techniques are applied to the explicit description of the creep behavior of materials for space shuttle thermal protection systems. A regression analysis technique is compared with five parametric methods for analyzing three simulated and twenty real data sets, and a computer program for the evaluation of creep-rupture data is presented.
The estimation of Aerosol Optical Depth in eastern China based on regression analysis
NASA Astrophysics Data System (ADS)
Wang, Jing; Shi, Runhe; Liu, Chaoshun; Zhou, Cong
2015-09-01
Atmospheric pollution and air quality issues are getting worse in China, and the formation mechanism of aerosols and their environmental effects have attracted more and more attention. Aerosol Optical Depth (AOD) is one of the most important parameters indicating atmospheric turbidity and aerosol load. High-quality AOD data are significant for studies of the atmospheric environment (i.e., air quality). This paper used MODIS/Terra AOD in 2008 to improve the coverage of MODIS/Aqua AOD, based on a linear regression analysis model. The RMSE between the estimated value and the Aqua AOD detected by satellite is 0.132. The average value of the test data was 0.812, and the average of the regression result was 0.807. This showed that the regression model between AODTerra and AODAqua worked well. We also built two sets of estimation models (MODIS AOD and OMI AOD) through stepwise regression analysis. One uses OMI AOD and meteorological elements to estimate MODIS AOD; its RMSE was 0.113, which represents 13.916% of the average (R2=0.782). The other uses MODIS AOD and meteorological elements to estimate OMI AOD; its RMSE is 0.132, which represents 18.182% of the average (R2=0.726).
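The "percent of the average" figure quoted in this abstract is a relative RMSE. The short sketch below reproduces the arithmetic for the first stepwise model, under the assumption that "the average" is the 0.812 test-data mean reported earlier in the same abstract; the abstract does not say so explicitly, and the function name is hypothetical.

```python
def relative_rmse_percent(rmse, mean_value):
    """RMSE expressed as a percentage of the mean of the reference data."""
    return 100.0 * rmse / mean_value

# Values quoted in the abstract; pairing 0.113 with the 0.812 mean is an assumption.
share = relative_rmse_percent(0.113, 0.812)
```

Under that assumption the result agrees with the 13.916% quoted for the model that estimates MODIS AOD.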
Lo, Benjamin W. Y.; Fukuda, Hitoshi; Angle, Mark; Teitelbaum, Jeanne; Macdonald, R. Loch; Farrokhyar, Forough; Thabane, Lehana; Levine, Mitchell A. H.
2016-01-01
Background: Classification and regression tree analysis involves the creation of a decision tree by recursive partitioning of a dataset into more homogeneous subgroups. Thus far, there is scarce literature on using this technique to create clinical prediction tools for aneurysmal subarachnoid hemorrhage (SAH). Methods: The classification and regression tree analysis technique was applied to the multicenter Tirilazad database (3551 patients) in order to create the decision-making algorithm. In order to elucidate prognostic subgroups in aneurysmal SAH, neurologic, systemic, and demographic factors were taken into account. The dependent variable used for analysis was the dichotomized Glasgow Outcome Score at 3 months. Results: Classification and regression tree analysis revealed seven prognostic subgroups. Neurological grade, occurrence of post-admission stroke, occurrence of post-admission fever, and age represented the explanatory nodes of this decision tree. Split sample validation revealed classification accuracy of 79% for the training dataset and 77% for the testing dataset. In addition, the occurrence of fever at 1-week post-aneurysmal SAH is associated with increased odds of post-admission stroke (odds ratio: 1.83, 95% confidence interval: 1.56–2.45, P < 0.01). Conclusions: A clinically useful classification tree was generated, which serves as a prediction tool to guide bedside prognostication and clinical treatment decision making. This prognostic decision-making algorithm also shed light on the complex interactions between a number of risk factors in determining outcome after aneurysmal SAH. PMID:27512607
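The recursive partitioning step described in this abstract can be sketched for a single split on one predictor. This is an illustration only: it uses Gini impurity, a common CART criterion that the abstract does not name, and the ages and outcomes are hypothetical rather than drawn from the Tirilazad database.

```python
def gini(labels):
    """Gini impurity of a list of 0/1 outcomes."""
    n = len(labels)
    if n == 0:
        return 0.0
    p = sum(labels) / n
    return 2 * p * (1 - p)

def best_split(values, labels):
    """Threshold on one predictor that minimizes weighted Gini impurity."""
    pairs = list(zip(values, labels))
    candidates = sorted(set(values))
    best = None
    # Try the midpoint between each pair of neighbouring distinct values.
    for lo, hi in zip(candidates, candidates[1:]):
        threshold = (lo + hi) / 2
        left = [lab for v, lab in pairs if v <= threshold]
        right = [lab for v, lab in pairs if v > threshold]
        score = (len(left) * gini(left) + len(right) * gini(right)) / len(pairs)
        if best is None or score < best[1]:
            best = (threshold, score)
    return best

# Hypothetical patients: age in years and a dichotomized poor outcome (1 = poor).
ages = [35, 42, 50, 58, 63, 67, 71, 78]
poor_outcome = [0, 0, 0, 0, 1, 1, 1, 1]
threshold, impurity = best_split(ages, poor_outcome)
```

A full tree repeats this search over every predictor within each resulting subgroup until a stopping rule is met, and the split-sample validation mentioned in the abstract would then score the tree on held-out patients.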
A deformation analysis method of stepwise regression for bridge deflection prediction
NASA Astrophysics Data System (ADS)
Shen, Yueqian; Zeng, Ying; Zhu, Lei; Huang, Teng
2015-12-01
Large-scale bridges are among the most important infrastructures, and their safe condition concerns people's daily activities and life safety. Monitoring of large-scale bridges is crucial since deformation might have occurred. How to obtain the deformation information and then judge the safe condition are the key and difficult problems in the bridge deformation monitoring field. Deflection is an important index for the evaluation of bridge safety. This paper proposes a forecasting model based on stepwise regression analysis. Based on the deflection monitoring data of the Yangtze River Bridge, the main factors influencing deflection deformation are studied. The authors use the monitoring data to forecast the deflection of the bridge at different times from the perspective of non-bridge structure, and compare the result to the forecast of gray relational analysis based on linear regression. The results show that the accuracy and reliability of stepwise regression analysis are high, which provides a scientific basis for bridge operation management. Above all, the ideas of this research provide an effective method for bridge deformation analysis.
Gao, Jun; Lavergne, M. Ruth; McIntyre, Paul
2013-01-01
Classification and regression tree (CART) analysis was used to identify subpopulations with lower palliative care program (PCP) enrolment rates. CART analysis uses recursive partitioning to group predictors. The PCP enrolment rate was 72 percent for the 6,892 adults who died of cancer from 2000 to 2005 in two counties in Nova Scotia, Canada. The lowest PCP enrolment rates were for nursing home residents over 82 years (27 percent), a group residing more than 43 kilometres from the PCP (31 percent), and another group living less than two weeks after their cancer diagnosis (37 percent). The highest rate (86 percent) was for the 2,118 persons who received palliative radiation. Findings from multiple logistic regression (MLR) were provided for comparison. CART findings identified low PCP enrolment subpopulations that were defined by interactions among demographic, social, medical, and health system predictors. PMID:21805944
Forecasting municipal solid waste generation using prognostic tools and regression analysis.
Ghinea, Cristina; Drăgoi, Elena Niculina; Comăniţă, Elena-Diana; Gavrilescu, Marius; Câmpean, Teofil; Curteanu, Silvia; Gavrilescu, Maria
2016-11-01
For adequate planning of waste management systems, the accurate forecast of waste generation is an essential step, since various factors can affect waste trends. Predictive and prognosis models are useful tools that provide reliable support for decision-making processes. In this paper some indicators, such as number of residents, population age, urban life expectancy, and total municipal solid waste, were used as input variables in prognostic models in order to predict the amount of solid waste fractions. We applied Waste Prognostic Tool, regression analysis and time series analysis to forecast municipal solid waste generation and composition by considering the Iasi, Romania case study. Regression equations were determined for six solid waste fractions (paper, plastic, metal, glass, biodegradable and other waste). Accuracy measures were calculated and the results showed that the S-curve trend model is the most suitable for municipal solid waste (MSW) prediction. PMID:27454099
Inhibition of cyclooxygenase (COX)-2 affects endothelial progenitor cell proliferation
Colleselli, Daniela; Bijuklic, Klaudija; Mosheimer, Birgit A.; Kaehler, Christian M. (E-mail: C.M.Kaehler@uibk.ac.at)
2006-09-10
Growing evidence indicates that inducible cyclooxygenase-2 (COX-2) is involved in the pathogenesis of inflammatory disorders and various types of cancer. Endothelial progenitor cells recruited from the bone marrow have been shown to be involved in the formation of new vessels in malignancies and are discussed as a key factor in tumour progression and metastasis. However, until now, nothing has been known about an interaction between COX and endothelial progenitor cells (EPC). Expression of COX-1 and COX-2 was detected by semiquantitative RT-PCR and Western blot. Proliferation kinetics, cell cycle distribution and rate of apoptosis were analysed by MTT test and FACS analysis. Further analyses revealed an implication of Akt phosphorylation and caspase-3 activation. Both COX-1 and COX-2 expression can be found in bone-marrow-derived endothelial progenitor cells in vitro. COX-2 inhibition leads to a significant reduction in proliferation of endothelial progenitor cells through an increase in apoptosis and cell cycle arrest. COX-2 inhibition further leads to increased cleavage of caspase-3 protein and, inversely, to inhibition of Akt activation. Highly proliferating endothelial progenitor cells can be targeted by selective COX-2 inhibition in vitro. These results indicate that upcoming therapy strategies in cancer patients targeting COX-2 may be effective in inhibiting tumour vasculogenesis as well as angiogenic processes.
Genetic analysis of tolerance to infections using random regressions: a simulation study.
Kause, Antti
2011-08-01
Tolerance to infections is the ability of a host to limit the impact of a given pathogen burden on host performance. This simulation study demonstrated the merit of using random regressions to estimate unbiased genetic variances for tolerance slope and its genetic correlations with other traits, which could not be obtained using the previously implemented statistical methods. Genetic variance in tolerance was estimated as genetic variance in regression slopes of host performance along an increasing pathogen burden level. Random regressions combined with covariance functions allowed genetic variance for host performance to be estimated at any point along the pathogen burden trajectory, providing a novel means to analyse infection-induced changes in genetic variation of host performance. Yet, the results implied that decreasing family size as well as a non-zero environmental or genetic correlation between initial host performance before infection and pathogen burden led to biased estimates for tolerance genetic variance. In both cases, genetic correlation between tolerance slope and host performance in a pathogen-free environment became artificially negative, implying a genetic trade-off when it did not exist. Moreover, recording a normally distributed pathogen burden as a threshold trait is not a realistic way of obtaining unbiased estimates for tolerance genetic variance. The results show that random regressions are suitable for the genetic analysis of tolerance, given suitable data structure collected either under field or experimental conditions. PMID:21767462
Augmented kludge waveforms and Gaussian process regression for EMRI data analysis
NASA Astrophysics Data System (ADS)
Chua, Alvin J. K.
2016-05-01
Extreme-mass-ratio inspirals (EMRIs) will be an important type of astrophysical source for future space-based gravitational-wave detectors. There is a trade-off between accuracy and computational speed for the EMRI waveform templates required in the analysis of data from these detectors. We discuss how the systematic error incurred by using faster templates may be reduced with improved models such as augmented kludge waveforms, and marginalised over with statistical techniques such as Gaussian process regression.
Repeated-measures regression designs and analysis for environmental effects monitoring programs
NASA Astrophysics Data System (ADS)
Paine, Michael D.; Skinner, Marc A.; Kilgour, Bruce W.; DeBlois, Elisabeth M.; Tracy, Ellen
2014-12-01
This paper provides a general overview of repeated-measures (RM) regression designs and analysis for marine monitoring programs, in support of sediment chemistry, particle size and benthic macroinvertebrate community analyses provided as part of this series. In RM regression designs, the same n replicates (usually stations in monitoring programs) are re-sampled (i.e., repeatedly measured) at t>1 Times (usually years). The stations provide variation in the predictor, or X variables. In the Terra Nova environmental effects monitoring (EEM) program, n=48 stations were sampled in each of t=7 years from 2000 to 2010. Two distance measures from five drill centres (sources of drilling wastes) were fixed predictor variables. RM regression designs are rarely used in environmental monitoring programs, but are often suitable and would be appropriate if applied to data from many monitoring programs. For the Terra Nova EEM program, carry-over effects, or persistent and usually small-scale variations among stations unrelated to distance, were strong for most sediment quality variables. Whenever natural carry-over effects are strong, RM designs and analysis will usually be more powerful and suitable than alternative approaches to the analysis.
Non-Stationary Hydrologic Frequency Analysis using B-Splines Quantile Regression
NASA Astrophysics Data System (ADS)
Nasri, B.; St-Hilaire, A.; Bouezmarni, T.; Ouarda, T.
2015-12-01
Hydrologic frequency analysis is commonly used by engineers and hydrologists to provide the basic information for the planning, design and management of hydraulic structures and water resources systems under the assumption of stationarity. However, with increasing evidence of a changing climate, the assumption of stationarity may no longer be valid, and the results of conventional analysis would become questionable. In this study, we consider a framework for frequency analysis of extreme flows based on B-Splines quantile regression, which allows modeling of non-stationary data that depend on covariates. Such covariates may have linear or nonlinear dependence. A Markov Chain Monte Carlo (MCMC) algorithm is used to estimate quantiles and their posterior distributions. A coefficient of determination for quantile regression is proposed to evaluate the estimation of the proposed model for each quantile level. The method is applied to annual maximum and minimum streamflow records in Ontario, Canada. Climate indices are considered to describe the non-stationarity in these variables and to estimate the quantiles in this case. The results show large differences between the non-stationary quantiles and their stationary equivalents for annual maximum and minimum discharge with high annual non-exceedance probabilities. Keywords: Quantile regression, B-Splines functions, MCMC, Streamflow, Climate indices, non-stationarity.
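As a minimal illustration of the idea behind quantile regression in the abstract above, the sketch below computes the pinball (check) loss and shows that the constant minimising it over a sample is an empirical quantile. It is not the B-spline or MCMC procedure the abstract describes; the function names and the flow values are hypothetical.

```python
def pinball_loss(y_true, y_pred, tau):
    """Mean pinball (check) loss at quantile level tau in (0, 1)."""
    total = 0.0
    for y, q in zip(y_true, y_pred):
        diff = y - q
        # Under-predictions are weighted by tau, over-predictions by (1 - tau).
        total += tau * diff if diff >= 0 else (tau - 1.0) * diff
    return total / len(y_true)


def best_constant_quantile(y, tau):
    """Pick the observed value that minimises the pinball loss."""
    return min(sorted(y), key=lambda q: pinball_loss(y, [q] * len(y), tau))


# Hypothetical annual maximum flows (arbitrary units).
flows = [10.0, 12.0, 15.0, 20.0, 50.0]
median_flow = best_constant_quantile(flows, 0.5)
high_flow = best_constant_quantile(flows, 0.9)
```

A covariate-dependent model replaces the constant with a function of the covariates (for instance a B-spline expansion) and is assessed with the same loss.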
Gulbransen, Dana J; McGlathery, Karen J; Marklund, Maria; Norris, James N; Gurgel, Carlos Frederico D
2012-10-01
Gracilaria vermiculophylla (Ohmi) Papenfuss is an invasive alga that is native to Southeast Asia and has invaded many estuaries in North America and Europe. It is difficult to differentiate G. vermiculophylla from native forms using morphology and therefore molecular techniques are needed. In this study, we used three molecular markers (rbcL, cox2-cox3 spacer, cox1) to identify G. vermiculophylla at several locations in the western Atlantic. RbcL and cox2-cox3 spacer markers confirmed the presence of G. vermiculophylla on the east coast of the USA from Massachusetts to South Carolina. We used a 507 base pair region of cox1 mtDNA to (i) verify the widespread distribution of G. vermiculophylla in the Virginia (VA) coastal bays and (ii) determine the intraspecific diversity of these algae. Cox1 haplotype richness in the VA coastal bays was much higher than that previously found in other invaded locations, as well as some native locations. This difference is likely attributed to the more intensive sampling design used in this study, which was able to detect richness created by multiple, diverse introductions. On the basis of our results, we recommend that future studies take differences in sampling design into account when comparing haplotype richness and diversity between native and non-native studies in the literature. PMID:27011285
Raharjo, Sentot Joko; Mahdi, Chanif; Nurdiana, Nurdiana; Kikuchi, Takheshi; Fatchiyah, Fatchiyah
2014-01-01
To understand the structural features that dictate the selectivity of the two isoforms of the prostaglandin H2 synthase (PGHS/COX), the three-dimensional (3D) structure of COX-1/COX-2 was assessed by means of binding energy calculations from virtual molecular dynamics, using alpha-Patchouli alcohol isomers as ligands. Molecular interaction studies with COX-1 and COX-2 were done using the molecular docking tool Hex 8.0. Interactions were further visualized using the Discovery Studio Client 3.5 software tool. The binding energy of molecular interaction was calculated with AMBER12 and Virtual Molecular Dynamic 1.9.1 software. The analysis of the alpha-Patchouli alcohol isomer compounds suggested that all alpha-Patchouli alcohol isomers are inhibitors of COX-1 and COX-2. Collectively, the binding energy scoring (with PBSA Model Solvent) suggested the alpha-Patchouli alcohol isomer compounds CID442384, CID6432585, CID3080622, CID10955174, and CID56928117 as candidates for selective COX-1 inhibitors and CID521903 as a nonselective COX-1/COX-2 inhibitor. PMID:25484897
GENE-LEVEL PHARMACOGENETIC ANALYSIS ON SURVIVAL OUTCOMES USING GENE-TRAIT SIMILARITY REGRESSION
Tzeng, Jung-Ying; Lu, Wenbin; Hsu, Fang-Chi
2014-01-01
Gene/pathway-based methods are drawing significant attention due to their usefulness in detecting rare and common variants that affect disease susceptibility. The biological mechanism of drug responses indicates that a gene-based analysis has even greater potential in pharmacogenetics. Motivated by a study from the Vitamin Intervention for Stroke Prevention (VISP) trial, we develop a gene-trait similarity regression for survival analysis to assess the effect of a gene or pathway on time-to-event outcomes. The similarity regression has a general framework that covers a range of survival models, such as the proportional hazards model and the proportional odds model. The inference procedure developed under the proportional hazards model is robust against model misspecification. We derive the equivalence between the similarity survival regression and a random effects model, which further unifies the current variance-component based methods. We demonstrate the effectiveness of the proposed method through simulation studies. In addition, we apply the method to the VISP trial data to identify the genes that exhibit an association with the risk of a recurrent stroke. TCN2 gene was found to be associated with the recurrent stroke risk in the low-dose arm. This gene may impact recurrent stroke risk in response to cofactor therapy. PMID:25018788
Yao, Yan; Wang, Chang-yue; Liu, Hui-jun; Tang, Jian-bin; Cai, Jin-hui; Wang, Jing-jun
2015-07-01
Forest bio-fuel, a new type of renewable energy, has attracted increasing attention as a promising alternative. In this study, a new method called Sparse Partial Least Squares Regression (SPLS) is used to construct a proximate analysis model for the fuel characteristics of sawdust in combination with near-infrared (NIR) spectroscopy. The moisture, ash, volatile and fixed carbon percentages of 80 samples were measured by traditional proximate analysis. Spectroscopic data were collected with a Nicolet NIR spectrometer. After being filtered by wavelet transform, the samples were divided into a training set and a validation set according to sample category and producing area. SPLS, Principal Component Regression (PCR), Partial Least Squares Regression (PLS) and the Least Absolute Shrinkage and Selection Operator (LASSO) were used to construct prediction models. The results indicated that SPLS can select grouped wavelengths and improve prediction performance. The absorption peaks of moisture are covered by the selected wavelengths, while those of the other components have not yet been confirmed. In short, SPLS can reduce the dimensionality of complex data sets and interpret the relationship between spectroscopic data and composition concentration, and it will play an increasingly important role in the field of NIR applications. PMID:26717741
NASA Astrophysics Data System (ADS)
Urrutia, J. D.; Bautista, L. A.; Baccay, E. B.
2014-04-01
The aim of this study was to develop mathematical models for estimating earthquake casualties such as deaths, number of injured persons, affected families and total cost of damage. Regression models were built to quantify the direct damages from earthquakes to human beings and properties given the magnitude, intensity, depth of focus, location of epicentre and time duration. The researchers formulated the models through regression analysis using matrices and used α = 0.01. The study considered thirty destructive earthquakes that hit the Philippines from 1968 to 2012, inclusive. Relevant data about these earthquakes were obtained from the Philippine Institute of Volcanology and Seismology. Data on damages and casualties were gathered from the records of the National Disaster Risk Reduction and Management Council. The mathematical models made are as follows: This study will be of great value in emergency planning and in initiating and updating programs for earthquake hazard reduction in the Philippines, which is an earthquake-prone country.
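"Regression analysis using matrices" in the abstract above is commonly understood as solving the normal equations (XᵀX)β = Xᵀy. The sketch below does this for a single predictor with an intercept, where XᵀX is a 2×2 matrix that can be inverted by hand. It is an illustration under that assumption only; the function name and the magnitude and damage values are made up and do not come from the study.

```python
def fit_line_normal_equations(x, y):
    """Solve (X^T X) beta = X^T y for X = [1, x]; return (intercept, slope)."""
    n = len(x)
    sx = sum(x)
    sxx = sum(v * v for v in x)
    sy = sum(y)
    sxy = sum(a * b for a, b in zip(x, y))
    # X^T X = [[n, sx], [sx, sxx]] and X^T y = [sy, sxy].
    det = n * sxx - sx * sx
    if det == 0:
        raise ValueError("predictor has no variation")
    intercept = (sxx * sy - sx * sxy) / det
    slope = (n * sxy - sx * sy) / det
    return intercept, slope


# Made-up values lying exactly on damage = 2 * magnitude - 9.
magnitude = [5.0, 6.0, 7.0, 8.0]
damage = [1.0, 3.0, 5.0, 7.0]
intercept, slope = fit_line_normal_equations(magnitude, damage)
```

With several predictors the same equations apply, but the matrix is larger and is normally solved with a linear algebra library rather than by hand.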
Analysis of ontogenetic spectra of populations of plants and lichens via ordinal regression
NASA Astrophysics Data System (ADS)
Sofronov, G. Yu.; Glotov, N. V.; Ivanov, S. M.
2015-03-01
Ontogenetic spectra of plants and lichens tend to vary across the populations. This means that if several subsamples within a sample (or a population) were collected, then the subsamples would not be homogeneous. Consequently, the statistical analysis of the aggregated data would not be correct, which could potentially lead to false biological conclusions. In order to take into account the heterogeneity of the subsamples, we propose to use ordinal regression, which is a type of generalized linear regression. In this paper, we study the populations of cowberry Vaccinium vitis-idaea L. and epiphytic lichens Hypogymnia physodes (L.) Nyl. and Pseudevernia furfuracea (L.) Zopf. We obtain estimates for the proportions of between-sample variability in the total variability of the ontogenetic spectra of the populations.
NASA Astrophysics Data System (ADS)
Dervilis, N.; Worden, K.; Cross, E. J.
2015-07-01
In the data-based approach to structural health monitoring (SHM), the absence of data from damaged structures in many cases forces a dependence on novelty detection as a means of diagnosis. Unfortunately, this means that benign variations in the operating or environmental conditions of the structure must be handled very carefully, lest they lead to false alarms. If novelty detection is implemented in terms of outlier detection, the outliers may arise in the data as the result of both benign and malign causes and it is important to understand their sources. Comparatively recent developments in the field of robust regression have the potential to provide ways of exploring and visualising SHM data as a means of shedding light on the different origins of outliers. The current paper will illustrate the use of robust regression for SHM data analysis through experimental data acquired from the Z24 and Tamar Bridges, although the methods are general and not restricted to SHM or civil infrastructure.
Cao, Han-Han; Du, Ruo-Fei; Yang, Jia-Ning; Feng, Yi
2014-03-01
In this paper, microcrystalline cellulose WJ101 was used as a model material to investigate the effect of various process parameters on granule yield and friability after dry granulation. Single-factor experiments and a comprehensive examination of the process parameters were carried out, and the correlation between process parameters and granule quality was then established. Regression equations relating the process parameters to granule yield and friability were established by multiple regression analysis. The order of influence of the process parameters on granule yield and friability was: roller speed > roller pressure > horizontal feed speed. Granule yield was positively correlated with roller pressure and horizontal feed speed and negatively correlated with roller speed, while friability showed the opposite pattern. Comparison of fitted and measured values showed that they were essentially the same, with no significant differences (P > 0.05), indicating high precision and reliability. PMID:24961115
Alados, C.L.; Pueyo, Y.; Giner, M.L.; Navarro, T.; Escos, J.; Barroso, F.; Cabezudo, B.; Emlen, J.M.
2003-01-01
We studied the effect of grazing on the degree of regression of successional vegetation dynamic in a semi-arid Mediterranean matorral. We quantified the spatial distribution patterns of the vegetation by fractal analyses, using the fractal information dimension and spatial autocorrelation measured by detrended fluctuation analyses (DFA). It is the first time that fractal analysis of plant spatial patterns has been used to characterize the regressive ecological succession. Plant spatial patterns were compared over a long-term grazing gradient (low, medium and heavy grazing pressure) and on ungrazed sites for two different plant communities: A middle dense matorral of Chamaerops and Periploca at Sabinar-Romeral and a middle dense matorral of Chamaerops, Rhamnus and Ulex at Requena-Montano. The two communities differed also in the microclimatic characteristics (sea oriented at the Sabinar-Romeral site and inland oriented at the Requena-Montano site). The information fractal dimension increased as we moved from a middle dense matorral to discontinuous and scattered matorral and, finally to the late regressive succession, at Stipa steppe stage. At this stage a drastic change in the fractal dimension revealed a change in the vegetation structure, accurately indicating end successional vegetation stages. Long-term correlation analysis (DFA) revealed that an increase in grazing pressure leads to unpredictability (randomness) in species distributions, a reduction in diversity, and an increase in cover of the regressive successional species, e.g. Stipa tenacissima L. These comparisons provide a quantitative characterization of the successional dynamic of plant spatial patterns in response to grazing perturbation gradient. © 2002 Elsevier Science B.V. All rights reserved.
ERIC Educational Resources Information Center
Johns, Stephanie
2010-01-01
Kathy Cox, the superintendent of schools for Georgia, believes "excellence is not an accident". She made a name for herself by winning $1 million proving she was smarter than a fifth-grader on a popular television show. This article presents a profile of Cox, her family, her role as school superintendent, and her accomplishments. Although she…
Validation of a heteroscedastic hazards regression model.
Wu, Hong-Dar Isaac; Hsieh, Fushing; Chen, Chen-Hsin
2002-03-01
A Cox-type regression model accommodating heteroscedasticity, with a power factor of the baseline cumulative hazard, is investigated for analyzing data with crossing hazards behavior. Since the partial likelihood approach cannot eliminate the baseline hazard, an overidentified estimating equation (OEE) approach is introduced in the estimation procedure. Its by-product, a model checking statistic, is presented to test for the overall adequacy of the heteroscedastic model. Further, under the heteroscedastic model setting, we propose two statistics to test the proportional hazards assumption. Implementation of this model is illustrated in a data analysis of a cancer clinical trial. PMID:11878222
Dhanya, S; Kumari Roshni, V S
2016-01-01
Textures play an important role in image classification. This paper proposes a high performance texture classification method using a combination of multiresolution analysis tool and linear regression modelling by channel elimination. The correlation between different frequency regions has been validated as a sort of effective texture characteristic. This method is motivated by the observation that there exists a distinctive correlation between the image samples belonging to the same kind of texture, at different frequency regions obtained by a wavelet transform. Experimentally, it is observed that this correlation differs across textures. The linear regression modelling is employed to analyze this correlation and extract texture features that characterize the samples. Our method considers not only the frequency regions but also the correlation between these regions. This paper primarily focuses on applying the Dual Tree Complex Wavelet Packet Transform and the Linear Regression model for classification of the obtained texture features. Additionally the paper also presents a comparative assessment of the classification results obtained from the above method with two more types of wavelet transform methods namely the Discrete Wavelet Transform and the Discrete Wavelet Packet Transform. PMID:26835234
Oil and gas pipeline construction cost analysis and developing regression models for cost estimation
NASA Astrophysics Data System (ADS)
Thaduri, Ravi Kiran
In this study, cost data for 180 pipelines and 136 compressor stations have been analyzed. On the basis of the distribution analysis, regression models have been developed. Material, Labor, ROW and miscellaneous costs make up the total cost of a pipeline construction. The pipelines are analyzed based on different pipeline lengths, diameter, location, pipeline volume and year of completion. In a pipeline construction, labor costs dominate the total costs with a share of about 40%. Multiple non-linear regression models are developed to estimate the component costs of pipelines for various cross-sectional areas, lengths and locations. The Compressor stations are analyzed based on the capacity, year of completion and location. Unlike the pipeline costs, material costs dominate the total costs in the construction of compressor station, with an average share of about 50.6%. Land costs have very little influence on the total costs. Similar regression models are developed to estimate the component costs of compressor station for various capacities and locations.
Meadows, Cheyney; Rajala-Schultz, Päivi J; Frazer, Grant S; Meiring, Richard W; Hoblet, Kent H
2006-12-18
An observational study was conducted in order to assess the impact of a contract breeding program on the reproductive performance in a selected group of Ohio dairies using event-time analysis. The contract breeding program was offered by a breeding co-operative and featured tail chalking and daily evaluation of cows for insemination by co-operative technicians. Dairy employees no longer handled estrus detection activities. Between early 2002 and mid-2004, test-day records related to production and reproduction were obtained for 16,453 lactations representing 11,398 cows in a non-random sample of 31 dairies identified as well-managed client herds of the breeding co-operative. Of the 31 herds, 15 were using the contract breeding at the start of the data acquisition period, having started in the previous 2 years. The remaining 16 herds managed their own breeding program and used the co-operative for semen purchase. Cox proportional hazards modeling techniques were used to estimate the association of the contract breeding, as well as the effect of other significant predictors, with the hazard of pregnancy. Two separate Cox models were developed and compared: one that only considered fixed covariates and a second that included both fixed and time-varying covariates. Estimates of effects were expressed as the hazard ratio (HR) for pregnancy. Results of the fixed covariates model indicated that, controlling for breed, herd size, use of ovulation synchronization protocols in the herd, whether somatic cell score exceeded 4.5 prior to pregnancy or censoring, parity, calving season, and maximum test-day milk prior to pregnancy or censoring, the contract breeding program was associated with an increased hazard of pregnancy (HR=1.315; 95% CI 1.261-1.371). The results of the time-varying covariates model, which controlled for breed, herd size, use of ovulation synchronization protocols, somatic cell score above 4.5, parity, calving season, and testing season also found that the
Irrechukwu, Onyi N; Reiter, David A; Lin, Ping-Chang; Roque, Remigio A; Fishbein, Kenneth W; Spencer, Richard G
2012-06-01
Increased sensitivity in the characterization of cartilage matrix status by magnetic resonance (MR) imaging, through the identification of surrogate markers for tissue quality, would be of great use in the noninvasive evaluation of engineered cartilage. Recent advances in MR evaluation of cartilage include multiexponential and multiparametric analysis, which we now extend to engineered cartilage. We studied constructs which developed from chondrocytes seeded in collagen hydrogels. MR measurements of transverse relaxation times were performed on samples after 1, 2, 3, and 4 weeks of development. Corresponding biochemical measurements of sulfated glycosaminoglycan (sGAG) were also performed. sGAG per wet weight increased from 7.74±1.34 μg/mg in week 1 to 21.06±4.14 μg/mg in week 4. Using multiexponential T₂ analysis, we detected at least three distinct water compartments, with T₂ values and weight fractions of (45 ms, 3%), (200 ms, 4%), and (500 ms, 97%), respectively. These values are consistent with known properties of engineered cartilage and previous studies of native cartilage. Correlations between sGAG and MR measurements were examined using conventional univariate analysis with T₂ data from monoexponential fits with individual multiexponential compartment fractions and sums of these fractions, through multiple linear regression based on linear combinations of fractions, and, finally, with multivariate analysis using the support vector regression (SVR) formalism. The phenomenological relationship between T₂ from monoexponential fitting and sGAG exhibited a correlation coefficient of r²=0.56, comparable to the more physically motivated correlations between individual fractions or sums of fractions and sGAG; the correlation based on the sum of the two proteoglycan-associated fractions was r²=0.58. Correlations between measured sGAG and those calculated using standard linear regression were more modest, with r² in the range 0
NASA Astrophysics Data System (ADS)
Liu, Pudong; Shi, Runhe; Wang, Hong; Bai, Kaixu; Gao, Wei
2014-10-01
Leaf pigments are key elements for plant photosynthesis and growth. Traditional manual sampling of these pigments is labor-intensive and costly, and it has difficulty capturing their temporal and spatial characteristics. The aim of this work is to estimate photosynthetic pigments at large scale by remote sensing. For this purpose, inverse models were proposed with the aid of stepwise multiple linear regression (SMLR) analysis. A leaf radiative transfer model (the PROSPECT model) was employed to simulate leaf reflectance at wavelengths from 400 to 780 nm at 1 nm intervals, and these values were treated as data from remote sensing observations. Simulated chlorophyll concentration (Cab), carotenoid concentration (Car) and their ratio (Cab/Car) were each taken as the target of a regression model. In this study, a total of 4000 samples were simulated via PROSPECT with different Cab, Car and leaf mesophyll structures; 70% of these samples were used for training and the remaining 30% for model validation. Reflectance (r) and its mathematical transformations (1/r and log (1/r)) were each employed to build regression models. Results showed fair agreement between pigments and simulated reflectance, with all adjusted coefficients of determination (R2) larger than 0.8 when 6 wavebands were selected to build the SMLR model. The largest values of R2 for Cab, Car and Cab/Car are 0.8845, 0.876 and 0.8765, respectively. The mathematical transformations of reflectance had little influence on regression accuracy. We concluded that it is feasible to estimate chlorophyll, carotenoids and their ratio with a statistical model based on leaf reflectance data.
COX2 Inhibition Reduces Aortic Valve Calcification In Vivo
Wirrig, Elaine E.; Gomez, M. Victoria; Hinton, Robert B.; Yutzey, Katherine E.
2016-01-01
Objective Calcific aortic valve disease (CAVD) is a significant cause of morbidity and mortality, which affects approximately 1% of the US population and is characterized by calcific nodule formation and stenosis of the valve. Klotho-deficient mice were used to study the molecular mechanisms of CAVD as they develop robust aortic valve (AoV) calcification. Through microarray analysis of AoV tissues from klotho-deficient and wild type mice, increased expression of the gene encoding cyclooxygenase 2/COX2 (Ptgs2) was found. COX2 activity contributes to bone differentiation and homeostasis, thus the contribution of COX2 activity to AoV calcification was assessed. Approach and Results In klotho-deficient mice, COX2 expression is increased throughout regions of valve calcification and is induced in the valvular interstitial cells (VICs) prior to calcification formation. Similarly, COX2 expression is increased in human diseased AoVs. Treatment of cultured porcine aortic VICs with osteogenic media induces bone marker gene expression and calcification in vitro, which is blocked by inhibition of COX2 activity. In vivo, genetic loss of function of COX2 cyclooxygenase activity partially rescues AoV calcification in klotho-deficient mice. Moreover, pharmacologic inhibition of COX2 activity in klotho-deficient mice via celecoxib-containing diet reduces AoV calcification and blocks osteogenic gene expression. Conclusions COX2 expression is upregulated in CAVD and its activity contributes to osteogenic gene induction and valve calcification in vitro and in vivo. PMID:25722432
Analysis of sparse data in logistic regression in medical research: A newer approach
Devika, S; Jeyaseelan, L; Sebastian, G
2016-01-01
Background and Objective: Logistic regression is usually used in the analysis of a dichotomous response variable. However, the performance of logistic regression in the presence of sparse data is questionable. In such a situation, a common problem is the presence of high odds ratios (ORs) with very wide 95% confidence intervals (CIs) (OR: >999.999, 95% CI: <0.001, >999.999). In this paper, we addressed this issue by using the penalized logistic regression (PLR) method. Materials and Methods: Data from a case-control study on hyponatremia and hiccups conducted at Christian Medical College, Vellore, Tamil Nadu, India were used. The outcome variable was the presence/absence of hiccups and the main exposure variable was the status of hyponatremia. Simulation datasets were created with different sample sizes and with different numbers of covariates. Results: A total of 23 cases and 50 controls were used for the analysis with the ordinary and PLR methods. The main exposure variable, hyponatremia, was present in nine (39.13%) of the cases and in four (8.0%) of the controls. Of the 23 hiccup cases, all were males, and among the controls, 46 (92.0%) were males. Thus, the complete separation between gender and the disease group led to an infinite OR with 95% CI (OR: >999.999, 95% CI: <0.001, >999.999), whereas there was a finite and consistent regression coefficient for gender (OR: 5.35; 95% CI: 0.42, 816.48) using PLR. After adjusting for all the confounding variables, hyponatremia entailed 7.9 (95% CI: 2.06, 38.86) times higher risk for the development of hiccups using PLR, whereas the conventional method overestimated the risk (OR: 10.76; 95% CI: 2.17, 53.41). The simulation experiment shows that the estimated coverage probability of this method is near the nominal level of 95% even for small sample sizes and for a large number of covariates. Conclusions: PLR is almost equal to the ordinary logistic regression when the sample size is large and is superior in
NASA Astrophysics Data System (ADS)
Goovaerts, Pierre
2013-06-01
Analyzing temporal trends in health outcomes can provide a more comprehensive picture of the burden of a disease like cancer and generate new insights about the impact of various interventions. In the United States such an analysis is increasingly conducted using joinpoint regression outside a spatial framework, which overlooks the existence of significant variation among U.S. counties and states with regard to the incidence of cancer. This paper presents several innovative ways to account for space in joinpoint regression: (1) prior filtering of noise in the data by binomial kriging and use of the kriging variance as a measure of reliability in weighted least-squares regression, (2) detection of significant boundaries between adjacent counties based on tests of parallelism of time trends and confidence intervals of the annual percent change of rates, and (3) creation of spatially compact groups of counties with similar temporal trends through the application of hierarchical cluster analysis to the results of boundary analysis. The approach is illustrated using time series of proportions of prostate cancer late-stage cases diagnosed yearly in every county of Florida since the 1980s. The annual percent change (APC) in late-stage diagnosis and the onset years for significant declines vary greatly across Florida. Most counties with non-significant average APC are located in the north-western part of Florida, known as the Panhandle, which is more rural than other parts of Florida. The number of significant boundaries peaked in the early 1990s when the prostate-specific antigen (PSA) test became widely available, a temporal trend that suggests the existence of geographical disparities in the implementation and/or impact of the new screening procedure, in particular as it became available.
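The annual percent change (APC) reported in the abstract above is conventionally obtained from the slope of a log-linear trend within one joinpoint segment. The following is a minimal Python sketch under that assumption; the function names and the rates are hypothetical and are not taken from the paper, and the search for joinpoints themselves is not shown.

```python
import math

def ols_slope(x, y):
    """Slope of the ordinary least-squares line through the points (x, y)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    return sxy / sxx

def annual_percent_change(years, rates):
    """APC = 100 * (exp(b) - 1), where ln(rate) = a + b * year."""
    slope = ols_slope(years, [math.log(r) for r in rates])
    return 100.0 * (math.exp(slope) - 1.0)

# Hypothetical late-stage proportions that fall by 5% each year.
years = list(range(1990, 2000))
rates = [0.40 * 0.95 ** k for k in range(len(years))]
apc = annual_percent_change(years, rates)
print(f"APC = {apc:.2f}% per year")
```

A negative APC indicates a declining trend; a joinpoint analysis would fit one such slope per segment and test where the slope changes.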
NASA Technical Reports Server (NTRS)
Waller, M. C.
1976-01-01
An electro-optical device called an oculometer which tracks a subject's lookpoint as a time function has been used to collect data in a real-time simulation study of instrument landing system (ILS) approaches. The data describing the scanning behavior of a pilot during the instrument approaches have been analyzed by use of a stepwise regression analysis technique. A statistically significant correlation between pilot workload, as indicated by pilot ratings, and scanning behavior has been established. In addition, it was demonstrated that parameters derived from the scanning behavior data can be combined in a mathematical equation to provide a good representation of pilot workload.
Hofland, G.S.; Barton, C.C.
1990-10-01
The computer program FREQFIT is designed to perform regression and statistical chi-squared goodness of fit analysis on one-dimensional or two-dimensional data. The program features an interactive user dialogue, numerous help messages, an option for screen or line printer output, and the flexibility to use practically any commercially available graphics package to create plots of the program's results. FREQFIT is written in Microsoft QuickBASIC, for IBM-PC compatible computers. A listing of the QuickBASIC source code for the FREQFIT program, a user manual, and sample input data, output, and plots are included. 6 refs., 1 fig.
NASA Astrophysics Data System (ADS)
Sugihara, Shigemitsu; Shinozaki, Tsuguhiro; Ohishi, Hiroyuki; Araki, Yoshinori; Furukawa, Kohei
It is difficult to lift sediment-related disaster warning information because the risk of disaster after heavy rain is difficult to quantify. If the risk can be quantified according to the rainfall situation, it can serve as an indicator for lifting the warning. In this study, we used logistic regression analysis to quantify the risk according to the rainfall situation as the probability of disaster occurrence, and we analyzed how to set the criterion for lifting sediment-related disaster warning information. As a result, the method for evaluating the probability of disaster occurrence becomes more convenient to use, which is useful for providing information on imminent situations.
Regression Models for the Analysis of Longitudinal Gaussian Data from Multiple Sources
O’Brien, Liam M.; Fitzmaurice, Garrett M.
2006-01-01
We present a regression model for the joint analysis of longitudinal multiple source Gaussian data. Longitudinal multiple source data arise when repeated measurements are taken from two or more sources, and each source provides a measure of the same underlying variable and on the same scale. This type of data generally produces a relatively large number of observations per subject; thus estimation of an unstructured covariance matrix often may not be possible. We consider two methods by which parsimonious models for the covariance can be obtained for longitudinal multiple source data. The methods are illustrated with an example of multiple informant data arising from a longitudinal interventional trial in psychiatry. PMID:15726666
Amene, E; Hanson, L A; Zahn, E A; Wild, S R; Döpfer, D
2016-07-01
The purpose of this study was to apply a novel statistical method for variable selection and a model-based approach for filling data gaps in mortality rates associated with foodborne diseases using the WHO Vital Registration mortality dataset. Correlation analysis and elastic net regularization methods were applied to drop redundant variables and to select the most meaningful subset of predictors. Whenever predictor data were missing, multiple imputation was used to fill in plausible values. Cluster analysis was applied to identify similar groups of countries based on the values of the predictors. Finally, a Bayesian hierarchical regression model was fit to the final dataset for predicting mortality rates. From 113 potential predictors, 32 were retained after correlation analysis. Out of these 32 predictors, eight with non-zero coefficients were selected using the elastic net regularization method. Based on the values of these variables, four clusters of countries were identified. The uncertainty of predictions was large for countries within clusters lacking mortality rates, and it was low for a cluster that had mortality rate information. Our results demonstrated that, using Bayesian hierarchical regression models, a data-driven clustering of countries and a meaningful subset of predictors can be used to fill data gaps in foodborne disease mortality. PMID:26785774
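The elastic net step described above keeps only predictors whose coefficients stay non-zero under a combined L1/L2 penalty. The sketch below shows how that happens through coordinate descent with soft-thresholding. It assumes centred predictors and response (no intercept) and the common parameterisation (1/2n)·RSS + λ[α‖β‖₁ + (1−α)/2·‖β‖₂²]; the function names and the toy data are hypothetical, and the abstract does not say which software or settings the authors used.

```python
def soft_threshold(z, gamma):
    """Shrink z towards zero by gamma; values within [-gamma, gamma] become 0."""
    if z > gamma:
        return z - gamma
    if z < -gamma:
        return z + gamma
    return 0.0

def elastic_net(X, y, lam, alpha, n_iter=100):
    """Coordinate-descent elastic net for centred data (no intercept)."""
    n, p = len(X), len(X[0])
    beta = [0.0] * p
    for _ in range(n_iter):
        for j in range(p):
            # Correlation of predictor j with the residual that ignores predictor j.
            rho = 0.0
            for i in range(n):
                others = sum(X[i][k] * beta[k] for k in range(p) if k != j)
                rho += X[i][j] * (y[i] - others)
            rho /= n
            col_ms = sum(X[i][j] ** 2 for i in range(n)) / n
            beta[j] = soft_threshold(rho, lam * alpha) / (col_ms + lam * (1.0 - alpha))
    return beta

# Toy data: the response depends on the first predictor only.
X = [[1.0, 1.0], [1.0, -1.0], [-1.0, 1.0], [-1.0, -1.0]]
y = [3.0, 3.0, -3.0, -3.0]
print(elastic_net(X, y, lam=1.0, alpha=0.5))
```

Raising `lam` drives more coefficients exactly to zero, which is the mechanism by which a small subset of predictors can be retained from a larger pool.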
NASA Astrophysics Data System (ADS)
Mandal, Nilrudra; Doloi, Biswanath; Mondal, Biswanath
2016-01-01
In the present study, an attempt has been made to apply the Taguchi parameter design method and regression analysis for optimizing the cutting conditions on surface finish while machining AISI 4340 steel with the help of the newly developed yttria based Zirconia Toughened Alumina (ZTA) inserts. These inserts are prepared through a wet chemical co-precipitation route followed by a powder metallurgy process. Experiments have been carried out based on an L9 orthogonal array with three parameters (cutting speed, depth of cut and feed rate) at three levels (low, medium and high). Based on the mean response and signal-to-noise ratio (SNR), the optimal cutting condition was found to be A3B1C1, i.e. a cutting speed of 420 m/min, a depth of cut of 0.5 mm and a feed rate of 0.12 m/min, under the smaller-is-better criterion. Analysis of Variance (ANOVA) is applied to find out the significance and percentage contribution of each parameter. The mathematical model of surface roughness has been developed using regression analysis as a function of the above mentioned independent variables. The predicted values from the developed model and the experimental values are found to be very close to each other, supporting the significance of the model. A confirmation run has been carried out at the 95% confidence level to verify the optimized result, and the values obtained are within the prescribed limit.
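For a response such as surface roughness, the Taguchi smaller-is-better signal-to-noise ratio is SNR = −10·log10(mean of y²), and the preferred setting is the one with the highest SNR. A minimal Python sketch follows; the run labels and roughness values are hypothetical and are not measurements from the study.

```python
import math

def snr_smaller_is_better(values):
    """Taguchi S/N ratio in dB for a response where smaller is better."""
    mean_square = sum(v * v for v in values) / len(values)
    return -10.0 * math.log10(mean_square)

# Hypothetical replicate roughness readings (micrometres) for three runs.
runs = {
    "run 1": [1.20, 1.25],
    "run 2": [0.80, 0.85],
    "run 3": [1.00, 0.95],
}
snr_by_run = {name: snr_smaller_is_better(vals) for name, vals in runs.items()}
best_run = max(snr_by_run, key=snr_by_run.get)
print(best_run, round(snr_by_run[best_run], 2))
```

In a full L9 analysis the SNR values would be averaged per factor level to pick the best level of each parameter separately.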
Ziemssen, Tjalf; Reimann, Manja; Gasch, Julia; Rüdiger, Heinz
2013-09-01
Biological rhythms, describing the temporal variation of biological processes, are a characteristic feature of complex systems. The analysis of biological rhythms can provide important insights into the pathophysiology of different diseases, especially, in cardiovascular medicine. In the field of the autonomic nervous system, heart rate variability (HRV) and baroreflex sensitivity (BRS) describe important fluctuations of blood pressure and heart rate which are often analyzed by Fourier transformation. However, these parameters are stochastic with overlaying rhythmical structures. R-R intervals as independent variables of time are not equidistant. That is why the trigonometric regressive spectral (TRS) analysis--reviewed in this paper--was introduced, considering both the statistical and rhythmical features of such time series. The data segments required for TRS analysis can be as short as 20 s allowing for dynamic evaluation of heart rate and blood pressure interaction over longer periods. Beyond HRV, TRS also estimates BRS based on linear regression analyses of coherent heart rate and blood pressure oscillations. An additional advantage is that all oscillations are analyzed by the same (maximal) number of R-R intervals thereby providing a high number of individual BRS values. This ensures a high confidence level of BRS determination which, along with short recording periods, may be of profound clinical relevance. The dynamic assessment of heart rate and blood pressure spectra by TRS allows a more precise evaluation of cardiovascular modulation under different settings as has already been demonstrated in different clinical studies. PMID:23812502
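One reason a regression approach suits R-R data is that a sinusoid can be fitted by least squares at arbitrary, non-equidistant sample times, whereas a standard Fourier transform assumes even spacing. The sketch below illustrates only that ingredient for a single known frequency; it is not the TRS algorithm itself, and the function name, frequency and sample times are hypothetical.

```python
import math

def fit_sinusoid(times, values, freq_hz):
    """Least-squares fit of values ~ a*cos(2*pi*f*t) + b*sin(2*pi*f*t).

    Works for unevenly spaced times; assumes the values are mean-centred.
    Returns (a, b, amplitude).
    """
    omega = 2.0 * math.pi * freq_hz
    c = [math.cos(omega * t) for t in times]
    s = [math.sin(omega * t) for t in times]
    scc = sum(ci * ci for ci in c)
    sss = sum(si * si for si in s)
    scs = sum(ci * si for ci, si in zip(c, s))
    syc = sum(v * ci for v, ci in zip(values, c))
    sys_ = sum(v * si for v, si in zip(values, s))
    # Solve the 2x2 normal equations by Cramer's rule.
    det = scc * sss - scs * scs
    a = (syc * sss - sys_ * scs) / det
    b = (sys_ * scc - syc * scs) / det
    return a, b, math.hypot(a, b)

# Hypothetical beat times (s) with irregular spacing and a 0.1 Hz oscillation.
beat_times = [0.0, 0.81, 1.65, 2.43, 3.30, 4.08, 4.95, 5.77, 6.60, 7.45]
signal = [2.0 * math.cos(2 * math.pi * 0.1 * t) + 3.0 * math.sin(2 * math.pi * 0.1 * t)
          for t in beat_times]
a, b, amplitude = fit_sinusoid(beat_times, signal, 0.1)
```

Repeating such a fit over a grid of candidate frequencies gives a spectrum-like summary without resampling the series onto an even grid.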
Selenium Exposure and Cancer Risk: an Updated Meta-analysis and Meta-regression
Cai, Xianlei; Wang, Chen; Yu, Wanqi; Fan, Wenjie; Wang, Shan; Shen, Ning; Wu, Pengcheng; Li, Xiuyang; Wang, Fudi
2016-01-01
The objective of this study was to investigate the associations between selenium exposure and cancer risk. We identified 69 studies and applied meta-analysis, meta-regression and dose-response analysis to synthesize the available evidence. The results indicated that high selenium exposure had a protective effect on cancer risk (pooled OR = 0.78; 95%CI: 0.73–0.83). The results of linear and nonlinear dose-response analysis indicated that high serum/plasma selenium and toenail selenium were effective in cancer prevention. However, we did not find a protective effect of selenium supplements. High selenium exposure may have different effects on specific types of cancer. It decreased the risk of breast cancer, lung cancer, esophageal cancer, gastric cancer, and prostate cancer, but it was not associated with colorectal cancer, bladder cancer, and skin cancer. PMID:26786590
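A pooled odds ratio like the one quoted above is typically a weighted average of study-level log odds ratios. The sketch below uses fixed-effect inverse-variance weighting and recovers each standard error from the reported 95% CI; the abstract does not state which pooling model was used, so this choice, the function name and the three example studies are all assumptions for illustration.

```python
import math

def pooled_odds_ratio(studies, z=1.96):
    """Fixed-effect inverse-variance pooling of odds ratios.

    studies: list of (odds_ratio, ci_low, ci_high) with 95% limits.
    Returns (pooled_or, ci_low, ci_high).
    """
    weighted_sum = 0.0
    weight_total = 0.0
    for odds_ratio, ci_low, ci_high in studies:
        # Standard error of ln(OR) recovered from the width of the CI.
        se = (math.log(ci_high) - math.log(ci_low)) / (2.0 * z)
        weight = 1.0 / se ** 2
        weighted_sum += weight * math.log(odds_ratio)
        weight_total += weight
    log_or = weighted_sum / weight_total
    se_pooled = math.sqrt(1.0 / weight_total)
    return (math.exp(log_or),
            math.exp(log_or - z * se_pooled),
            math.exp(log_or + z * se_pooled))

# Hypothetical study results, not taken from the meta-analysis.
example_studies = [(0.70, 0.50, 0.98), (0.85, 0.70, 1.03), (0.75, 0.55, 1.02)]
pooled, low, high = pooled_odds_ratio(example_studies)
```

A random-effects model would add a between-study variance term to each weight, which widens the interval when the studies disagree.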
Bareth, Bettina; Dennerlein, Sven; Mick, David U.; Nikolov, Miroslav; Urlaub, Henning
2013-01-01
Cox1, the core subunit of the cytochrome c oxidase, receives two heme a cofactors during assembly of the 13-subunit enzyme complex. However, at which step of the assembly process and how heme is inserted into Cox1 have remained an enigma. Shy1, the yeast SURF1 homolog, has been implicated in heme transfer to Cox1, whereas the heme a synthase, Cox15, catalyzes the final step of heme a synthesis. Here we performed a comprehensive analysis of cytochrome c oxidase assembly intermediates containing Shy1. Our analyses suggest that Cox15 displays a role in cytochrome c oxidase assembly, which is independent of its functions as the heme a synthase. Cox15 forms protein complexes with Shy1 and also associates with Cox1-containing complexes independently of Shy1 function. These findings indicate that Shy1 does not serve as a mobile heme carrier between the heme a synthase and maturing Cox1 but rather cooperates with Cox15 for heme transfer and insertion in early assembly intermediates of cytochrome c oxidase. PMID:23979592
NASA Astrophysics Data System (ADS)
Rajab, Jasim M.; MatJafri, M. Z.; Lim, H. S.
2013-06-01
This study encompasses columnar ozone modelling in peninsular Malaysia. A data set of eight atmospheric parameters [air surface temperature (AST), carbon monoxide (CO), methane (CH4), water vapour (H2Ovapour), skin surface temperature (SSKT), atmosphere temperature (AT), relative humidity (RH), and mean surface pressure (MSP)], retrieved from NASA's Atmospheric Infrared Sounder (AIRS) for the entire period (2003-2008), was employed to develop models to predict the value of columnar ozone (O3) in the study area. A combined method, which uses multiple regression together with principal component analysis (PCA) modelling, was used to improve the prediction accuracy of columnar ozone. Separate analyses were carried out for the north east monsoon (NEM) and south west monsoon (SWM) seasons. O3 was negatively correlated with CH4, H2Ovapour, RH, and MSP, whereas it was positively correlated with CO, AST, SSKT, and AT during both the NEM and SWM seasons. Multiple regression analysis was used to fit the columnar ozone data using the atmospheric parameters as predictors. A variable selection method based on high loadings of varimax-rotated principal components was used to obtain subsets of the predictor variables to be included in the linear regression model. It was found that an increase in the columnar O3 value is associated with an increase in the values of AST, SSKT, AT, and CO and with a drop in the levels of CH4, H2Ovapour, RH, and MSP. Fitting the best models for the columnar O3 value using eight of the independent variables gave about the same values of R (≈0.93) and R2 (≈0.86) for both the NEM and SWM seasons. The common variables that appeared in both regression equations were SSKT, CH4 and RH, and the principal precursor of the columnar O3 value in both the NEM and SWM seasons was SSKT.
Ma, Ya-Nan; Wang, Jing; Dong, Guang-Hui; Liu, Miao-Miao; Wang, Da; Liu, Yu-Qin; Zhao, Yang; Ren, Wan-Hui; Lee, Yungling Leo; Zhao, Ya-Dong; He, Qin-Cheng
2013-01-01
Background There have been few published studies on spirometric reference values for healthy children in China. We hypothesize that there would have been changes in lung function that would not have been precisely predicted by the existing spirometric reference equations. The objective of the study was to develop more accurate predictive equations for spirometric reference values for children aged 9 to 15 years in Northeast China. Methodology/Principal Findings Spirometric measurements were obtained from 3,922 children, including 1,974 boys and 1,948 girls, who were randomly selected from five cities of Liaoning province, Northeast China, using the ATS (American Thoracic Society) and ERS (European Respiratory Society) standards. The data was then randomly split into a training subset containing 2078 cases and a validation subset containing 1844 cases. Predictive equations used multiple linear regression techniques with three predictor variables: height, age and weight. Model goodness of fit was examined using the coefficient of determination or the R2 and adjusted R2. The predicted values were compared with those obtained from the existing spirometric reference equations. The results showed the prediction equations using linear regression analysis performed well for most spirometric parameters. Paired t-tests were used to compare the predicted values obtained from the developed and existing spirometric reference equations based on the validation subset. The t-test for males was not statistically significant (p>0.01). The predictive accuracy of the developed equations was higher than the existing equations and the predictive ability of the model was also validated. Conclusion/Significance We developed prediction equations using linear regression analysis of spirometric parameters for children aged 9–15 years in Northeast China. These equations represent the first attempt at predicting lung function for Chinese children following the ATS/ERS Task Force 2005
Regression analysis of growth responses to water depth in three wetland plant species
Sorrell, Brian K.; Tanner, Chris C.; Brix, Hans
2012-01-01
Background and aims Plant species composition in wetlands and on lakeshores often shows dramatic zonation, which is frequently ascribed to differences in flooding tolerance. This study compared the growth responses to water depth of three species (Phormium tenax, Carex secta and Typha orientalis) differing in depth preferences in wetlands, using non-linear and quantile regression analyses to establish how flooding tolerance can explain field zonation. Methodology Plants were established for 8 months in outdoor cultures in waterlogged soil without standing water, and then randomly allocated to water depths from 0 to 0.5 m. Morphological and growth responses to depth were followed for 54 days before harvest, and then analysed by repeated-measures analysis of covariance, and non-linear and quantile regression analysis (QRA), to compare flooding tolerances. Principal results Growth responses to depth differed between the three species, and were non-linear. Phormium tenax growth decreased rapidly in standing water >0.25 m depth, C. secta growth increased initially with depth but then decreased at depths >0.30 m, accompanied by increased shoot height and decreased shoot density, and T. orientalis was unaffected by the 0- to 0.50-m depth range. In P. tenax the decrease in growth was associated with a decrease in the number of leaves produced per ramet and in C. secta the effect of water depth was greatest for the tallest shoots. Allocation patterns were unaffected by depth. Conclusions The responses are consistent with the principle that zonation in the field is primarily structured by competition in shallow water and by physiological flooding tolerance in deep water. Regression analyses, especially QRA, proved to be powerful tools in distinguishing genuine phenotypic responses to water depth from non-phenotypic variation due to size and developmental differences. PMID:23259044
Automated particle identification through regression analysis of size, shape and colour
NASA Astrophysics Data System (ADS)
Rodriguez Luna, J. C.; Cooper, J. M.; Neale, S. L.
2016-04-01
Rapid point of care diagnostic tests and tests to provide therapeutic information are now available for a range of specific conditions, from the measurement of blood glucose levels for diabetes to card agglutination tests for parasitic infections. Due to a lack of specificity, these tests are often backed up by more conventional lab-based diagnostic methods; for example, a card agglutination test may be carried out for a suspected parasitic infection in the field and, if positive, a blood sample can then be sent to a lab for confirmation. The eventual diagnosis is often achieved by microscopic examination of the sample. In this paper we propose a computerized vision system for aiding in the diagnostic process; this system uses a novel particle recognition algorithm to improve specificity and speed during the diagnostic process. We show the detection and classification of different types of cells in a diluted blood sample using regression analysis of their size, shape and colour. The first step is to define the objects to be tracked, using a Gaussian Mixture Model for background subtraction and binary opening and closing for noise suppression. After separating the objects of interest from the background, the next challenge is to predict whether a given object belongs to a certain category or not. This is a classification problem, and the output of the algorithm is a Boolean value (true/false). As such, the computer program should be able to "predict" with a reasonable level of confidence whether a given particle belongs to the kind we are looking for or not. We show the use of a binary logistic regression analysis with three continuous predictors: size, shape and colour histogram. The results suggest these variables could be very useful in a logistic regression equation, as they proved to have a relatively high predictive value on their own.
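The classification step described above, a binary logistic regression on three continuous predictors with a true/false output, can be sketched in plain Python as follows. The feature values are hypothetical numbers scaled to [0, 1], the function names are invented for illustration, and plain gradient descent stands in for whatever fitting routine the authors actually used.

```python
import math

def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)

def fit_logistic(X, y, lr=0.5, n_iter=5000):
    """Fit intercept + weights by gradient descent on the mean log-loss."""
    n, p = len(X), len(X[0])
    w = [0.0] * (p + 1)  # w[0] is the intercept
    for _ in range(n_iter):
        grad = [0.0] * (p + 1)
        for xi, yi in zip(X, y):
            err = sigmoid(w[0] + sum(wj * xj for wj, xj in zip(w[1:], xi))) - yi
            grad[0] += err
            for j, xj in enumerate(xi):
                grad[j + 1] += err * xj
        w = [wj - lr * g / n for wj, g in zip(w, grad)]
    return w

def is_target_particle(w, x):
    """Boolean output: True if the predicted probability is at least 0.5."""
    return sigmoid(w[0] + sum(wj * xj for wj, xj in zip(w[1:], x))) >= 0.5

# Hypothetical (size, shape, colour) features; label 1 = particle of interest.
features = [[0.90, 0.80, 0.70], [0.80, 0.90, 0.60], [0.85, 0.70, 0.80],
            [0.20, 0.30, 0.40], [0.10, 0.20, 0.30], [0.30, 0.10, 0.20]]
labels = [1, 1, 1, 0, 0, 0]
weights = fit_logistic(features, labels)
```

The fitted probability, rather than the thresholded Boolean, is what conveys the level of confidence the abstract refers to.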
JOINT STRUCTURE SELECTION AND ESTIMATION IN THE TIME-VARYING COEFFICIENT COX MODEL
Xiao, Wei; Lu, Wenbin; Zhang, Hao Helen
2016-01-01
Time-varying coefficient Cox model has been widely studied and popularly used in survival data analysis due to its flexibility for modeling covariate effects. It is of great practical interest to accurately identify the structure of covariate effects in a time-varying coefficient Cox model, i.e. covariates with null effect, constant effect and truly time-varying effect, and estimate the corresponding regression coefficients. Combining the ideas of local polynomial smoothing and group nonnegative garrote, we develop a new penalization approach to achieve such goals. Our method is able to identify the underlying true model structure with probability tending to one and simultaneously estimate the time-varying coefficients consistently. The asymptotic normalities of the resulting estimators are also established. We demonstrate the performance of our method using simulations and an application to the primary biliary cirrhosis data. PMID:27540275
NASA Astrophysics Data System (ADS)
Buck, J. A.; Underhill, P. R.; Morelli, J.; Krause, T. W.
2016-02-01
Nuclear steam generators (SGs) are a critical component for ensuring safe and efficient operation of a reactor. Life management strategies are implemented in which SG tubes are regularly inspected by conventional eddy current testing (ECT) and ultrasonic testing (UT) technologies to size flaws, and safe operating life of SGs is predicted based on growth models. ECT, the more commonly used technique, due to the rapidity with which full SG tube wall inspection can be performed, is challenged when inspecting ferromagnetic support structure materials in the presence of magnetite sludge and multiple overlapping degradation modes. In this work, an emerging inspection method, pulsed eddy current (PEC), is being investigated to address some of these particular inspection conditions. Time-domain signals were collected by an 8 coil array PEC probe in which ferromagnetic drilled support hole diameter, depth of rectangular tube frets and 2D tube off-centering were varied. Data sets were analyzed with a modified principal components analysis (MPCA) to extract dominant signal features. Multiple linear regression models were applied to MPCA scores to size hole diameter as well as size rectangular outer diameter tube frets. Models were improved through exploratory factor analysis, which was applied to MPCA scores to refine selection for regression models inputs by removing nonessential information.
Poisson regression analysis of mortality among male workers at a thorium-processing plant
Liu, Zhiyuan; Lee, Tze-San; Kotek, T.J.
1991-12-31
Analyses of mortality among a cohort of 3119 male workers employed between 1915 and 1973 at a thorium-processing plant were updated to the end of 1982. Of the whole group, 761 men were deceased and 2161 men were still alive, while 197 men were lost to follow-up. A total of 250 deaths was added to the 511 deaths observed in the previous study. The standardized mortality ratio (SMR) for all causes of death was 1.12 with 95% confidence interval (CI) of 1.05-1.21. The SMRs were also significantly increased for all malignant neoplasms (SMR = 1.23, 95% CI = 1.04-1.43) and lung cancer (SMR = 1.36, 95% CI = 1.02-1.78). Poisson regression analysis was employed to evaluate the joint effects of job classification, duration of employment, time since first employment, age and year at first employment on mortality of all malignant neoplasms and lung cancer. A comparison of internal and external analyses with the Poisson regression model was also conducted and showed no obvious difference in fitting the data on lung cancer mortality of the thorium workers. The results of the multivariate analysis showed that there was no significant effect of all the study factors on mortality due to all malignant neoplasms and lung cancer. Therefore, further study is needed for the former thorium workers.
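A standardized mortality ratio is the observed number of deaths divided by the number expected from reference rates, and its confidence interval comes from treating the observed count as Poisson. The sketch below uses Byar's approximation for the limits, which is an assumption since the abstract does not say how the intervals were computed; the function name and the counts are hypothetical.

```python
import math

def smr_with_ci(observed, expected, z=1.96):
    """SMR = observed / expected with Byar's approximate Poisson limits."""
    smr = observed / expected
    lower_count = observed * (1.0 - 1.0 / (9.0 * observed)
                              - z / (3.0 * math.sqrt(observed))) ** 3
    upper_obs = observed + 1
    upper_count = upper_obs * (1.0 - 1.0 / (9.0 * upper_obs)
                               + z / (3.0 * math.sqrt(upper_obs))) ** 3
    return smr, lower_count / expected, upper_count / expected

# Hypothetical counts: 100 deaths observed where 80 were expected.
smr, ci_low, ci_high = smr_with_ci(100, 80.0)
print(f"SMR = {smr:.2f} (95% CI {ci_low:.2f}-{ci_high:.2f})")
```

An interval that excludes 1 corresponds to the "significantly increased" wording used in the abstract.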
Error analysis of leaf area estimates made from allometric regression models
NASA Technical Reports Server (NTRS)
Feiveson, A. H.; Chhikara, R. S.
1986-01-01
Biological net productivity, measured in terms of the change in biomass with time, affects global productivity and the quality of life through biochemical and hydrological cycles and by its effect on the overall energy balance. Estimating leaf area for large ecosystems is one of the more important means of monitoring this productivity. For a particular forest plot, the leaf area is often estimated by a two-stage process. In the first stage, known as dimension analysis, a small number of trees are felled so that their areas can be measured as accurately as possible. These leaf areas are then related to non-destructive, easily-measured features such as bole diameter and tree height, by using a regression model. In the second stage, the non-destructive features are measured for all or for a sample of trees in the plots and then used as input into the regression model to estimate the total leaf area. Because both stages of the estimation process are subject to error, it is difficult to evaluate the accuracy of the final plot leaf area estimates. This paper illustrates how a complete error analysis can be made, using an example from a study made on aspen trees in northern Minnesota. The study was a joint effort by NASA and the University of California at Santa Barbara known as COVER (Characterization of Vegetation with Remote Sensing).
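The two-stage procedure described above can be sketched as a regression fitted on felled trees and then applied to every tree in the plot. The sketch assumes a power-law (allometric) relation between leaf area and bole diameter fitted on the log-log scale; the function names and all measurements are hypothetical, and the sketch returns only a point estimate, without the error analysis that is the subject of the paper.

```python
import math

def fit_power_law(diameters, leaf_areas):
    """Stage 1: fit leaf_area = c * diameter**k by OLS on the log-log scale."""
    lx = [math.log(d) for d in diameters]
    ly = [math.log(a) for a in leaf_areas]
    n = len(lx)
    mean_x, mean_y = sum(lx) / n, sum(ly) / n
    sxx = sum((x - mean_x) ** 2 for x in lx)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(lx, ly))
    k = sxy / sxx
    c = math.exp(mean_y - k * mean_x)
    return c, k

def plot_leaf_area(c, k, diameters):
    """Stage 2: sum the predicted leaf areas of all measured trees."""
    return sum(c * d ** k for d in diameters)

# Hypothetical felled-tree sample (diameter in cm, leaf area in m^2).
felled_diameters = [5.0, 10.0, 15.0, 20.0]
felled_areas = [0.5 * d ** 2 for d in felled_diameters]
c, k = fit_power_law(felled_diameters, felled_areas)

# Hypothetical non-destructive diameter measurements for the whole plot.
total_area = plot_leaf_area(c, k, [10.0, 20.0])
```

With real data the stage 1 fit is noisy and the back-transformation from logarithms introduces bias, which is why the errors from both stages have to be combined to judge the final estimate.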
Survival regression analysis: a powerful tool for evaluating fighting and assessment.
Moya-Laraño; Wise
2000-09-01
Theoretical models of animal contests frequently generate predictions about how asymmetries (e.g. differences in size, residence status) between contestants affect fight duration. Linear regression and nonparametric correlation analyses are commonly used to test the fit of data to such models. We show how survival regression analysis (SRA) is a powerful technique for studying the effect of asymmetries on the duration of contests. SRA, which is under-utilized by students of animal behaviour, offers several advantages over more frequently used procedures. It provides unbiased parameter estimates even when including censored data (i.e. results of contests that have not ended at the time when observations are stopped). The analysis of hazard functions, which is a component of SRA, is an easy way to test for consistency with predictions of the sequential assessment game model. These and other advantages of SRA are illustrated by using SRA and more conventional methods to analyse the effect of asymmetries on contest duration for encounters between female Mediterranean tarantulas, Lycosa tarentula (L.). It is hoped that this example of the advantages of SRA will encourage more widespread use of this powerful technique. Copyright 2000 The Association for the Study of Animal Behaviour. PMID:11007639
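The handling of censored contests can be illustrated with the simplest parametric case, a constant hazard (exponential durations), where the maximum-likelihood estimate is the number of contests that ended divided by the total observed time. This is a deliberately reduced sketch and not the survival regression used in the paper; the function names and durations are hypothetical.

```python
def exponential_hazard(durations, ended):
    """Constant-hazard MLE with right-censoring: events / total observed time."""
    events = sum(1 for flag in ended if flag)
    return events / sum(durations)

def naive_hazard(durations, ended):
    """Same estimate after discarding censored contests, for comparison."""
    kept = [d for d, flag in zip(durations, ended) if flag]
    return len(kept) / sum(kept)

# Hypothetical contest durations (minutes); False = still running when
# observation stopped, so only a lower bound on the duration is known.
durations = [2.0, 4.0, 6.0, 6.0]
ended = [True, True, False, False]

rate_with_censoring = exponential_hazard(durations, ended)
rate_naive = naive_hazard(durations, ended)
```

Dropping the censored contests discards the longest fights, so the naive rate overstates how quickly contests end; keeping their observed time in the denominator avoids that.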
Regression-based adaptive sparse polynomial dimensional decomposition for sensitivity analysis
NASA Astrophysics Data System (ADS)
Tang, Kunkun; Congedo, Pietro; Abgrall, Remi
2014-11-01
Polynomial dimensional decomposition (PDD) is employed in this work for global sensitivity analysis and uncertainty quantification of stochastic systems subject to a large number of random input variables. Due to the intimate connection between PDD and Analysis-of-Variance, PDD is able to provide a simpler and more direct evaluation of the Sobol' sensitivity indices than polynomial chaos (PC). Unfortunately, the number of PDD terms grows exponentially with respect to the size of the input random vector, which makes the computational cost of the standard method unaffordable for real engineering applications. In order to address this curse of dimensionality, this work proposes a variance-based adaptive strategy aiming to build a cheap meta-model by sparse-PDD with PDD coefficients computed by regression. During this adaptive procedure, the model representation by PDD contains only a few terms, so that the cost of repeatedly solving the linear system of the least-squares regression problem is negligible. The size of the final sparse-PDD representation is much smaller than the full PDD, since only significant terms are eventually retained. Consequently, far fewer calls to the deterministic model are required to compute the final PDD coefficients.
A least trimmed square regression method for second level FMRI effective connectivity analysis.
Li, Xingfeng; Coyle, Damien; Maguire, Liam; McGinnity, Thomas Martin
2013-01-01
We present a least trimmed square (LTS) robust regression method to combine different runs/subjects for second/high level effective connectivity analysis. The basic idea of this method is to treat the extreme nonlinear model variability as outliers if they exceed a certain threshold. A bootstrap method for the LTS estimation is employed to detect model outliers. We compared the LTS robust method with a non-robust method using simulated and real datasets. The difference between LTS and the non-robust method for second level effective connectivity analysis is significant, suggesting the conventional non-robust method is easily affected by the model variability from the first level analysis. In addition, after these outliers are detected and excluded for the high level analysis, the model coefficients of the second level are combined within the framework of a mixed model. The variance of the mixed model is estimated using the Newton-Raphson (NR) type Levenberg-Marquardt algorithm. Three sets of real data are adopted to compare conventional methods which do not include random effects in the analysis with a mixed model for second level effective connectivity analysis. The results show that the conventional method is significantly different from the mixed model when greater model variability exists, suggesting there is a strong random effect, and the mixed model should be employed for the second level effective connectivity analysis. PMID:23093379
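Least trimmed squares fits the model that minimises the sum of the h smallest squared residuals, so observations outside that subset cannot pull the fit. The sketch below does this for a straight line by starting from every pair of points and applying a few concentration steps; this search strategy is a stand-in chosen for brevity and is not the bootstrap-based procedure described in the abstract. The function names and data are hypothetical, and distinct x values are assumed.

```python
from itertools import combinations

def ols_line(points):
    """Ordinary least-squares line y = a + b*x; returns (a, b)."""
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    b = sxy / sxx
    return mean_y - b * mean_x, b

def trimmed_loss(points, a, b, h):
    """Sum of the h smallest squared residuals."""
    squared = sorted((y - a - b * x) ** 2 for x, y in points)
    return sum(squared[:h])

def lts_line(points, h, n_steps=10):
    """Least trimmed squares line via pairwise starts and concentration steps."""
    best = None
    for p, q in combinations(points, 2):
        if p[0] == q[0]:
            continue  # a vertical pair does not define a slope
        b = (q[1] - p[1]) / (q[0] - p[0])
        a = p[1] - b * p[0]
        for _ in range(n_steps):
            # Refit on the h points that the current line fits best.
            subset = sorted(points, key=lambda pt: (pt[1] - a - b * pt[0]) ** 2)[:h]
            a, b = ols_line(subset)
        loss = trimmed_loss(points, a, b, h)
        if best is None or loss < best[0]:
            best = (loss, a, b)
    return best[1], best[2]

# Hypothetical data: y = 1 + 2x with two gross outliers at the end.
data = [(float(x), 1.0 + 2.0 * x) for x in range(8)] + [(8.0, 100.0), (9.0, 100.0)]
robust_a, robust_b = lts_line(data, h=8)
plain_a, plain_b = ols_line(data)
```

Comparing `robust_b` with `plain_b` shows how strongly the two outlying points distort the non-robust fit.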
Analysis of changes in extreme temperature and precipitation using quantile regression
NASA Astrophysics Data System (ADS)
Lee, Kyoungmi; Baek, Hee-Jeong; Cho, ChunHo
2013-04-01
One of the important research areas in climatology is to identify whether the long-period tendencies of change in meteorological variables appear. In the past, the analysis has been limited by the estimation of long-period trends for annual or seasonal average values on meteorological variables. However, recently, the interest in the trends regarding the whole range of values for meteorological variables, including the extreme ones, has arisen. The quantile regression is the regression analysis method for estimating the regression slopes for the values of any quantile from 0 to 1 of dependent variable distributions. This method provides a more complete picture for the conditional distribution of the dependent variable given the independent variable when both lower and upper or all quantiles are of interest. This study examines the changes in regional extreme temperature and precipitation in South Korea using quantile regression, which is applied to analyze trends, not only in the mean but in all parts of the data distribution. The results show considerable diversity across space and quantile level in South Korea. For daily temperatures in winter, the slopes in lower quantiles generally have a more distinct increase trend compared to the upper quantiles. The time series for daily minimum temperature during the winter season only shows a significant increasing trend in the lower quantile. In case of summer, most sites show an increase trend in both lower and upper quantiles for daily minimum temperature, while there are a number of sites with a decrease trend for daily maximum temperature. It was also found that the increase trend of extreme low temperature in large urban areas (0.80°C/decade) is much larger than in rural areas (0.54°C/decade) due to the effects of urbanization. Extreme climate events can have greater negative impacts on society, economy and natural environments than changes in climate means. The fast growth of population and industrialization in
NASA Astrophysics Data System (ADS)
Păniţă, Ovidiu
2015-09-01
In 2012-2014, an assortment of 25 isogenic lines of wheat (Triticum aestivum ssp. vulgare) was tested at Banu-Maracine DRS. The characters analyzed were the number of seeds/spike, seed weight/spike (g), number of spikes/m2, weight of a thousand seeds (WTS) (g) and number of emerged plants/m2. Based on the recorded data and their statistical processing, a number of links between these characters were identified. Regression models were also identified between some of the studied characters. Based on component analysis, the number of seeds/spike and seed weight/spike are the components that account for more than 88% of the variance, and a total of seven genotypes had positive scores for both factors.
Modelling and analysis of turbulent datasets using Auto Regressive Moving Average processes
NASA Astrophysics Data System (ADS)
Faranda, Davide; Pons, Flavio Maria Emanuele; Dubrulle, Bérengère; Daviaud, François; Saint-Michel, Brice; Herbert, Éric; Cortet, Pierre-Philippe
2014-10-01
We introduce a novel way to extract information from turbulent datasets by applying an Auto Regressive Moving Average (ARMA) statistical analysis. Such analysis goes well beyond the analysis of the mean flow and of the fluctuations and links the behavior of the recorded time series to a discrete version of a stochastic differential equation which is able to describe the correlation structure in the dataset. We introduce a new index Υ that measures the difference between the resulting analysis and the Obukhov model of turbulence, the simplest stochastic model reproducing both Richardson law and the Kolmogorov spectrum. We test the method on datasets measured in a von Kármán swirling flow experiment. We found that the ARMA analysis is well correlated with spatial structures of the flow, and can discriminate between two different flows with comparable mean velocities, obtained by changing the forcing. Moreover, we show that the Υ is highest in regions where shear layer vortices are present, thereby establishing a link between deviations from the Kolmogorov model and coherent structures. These deviations are consistent with the ones observed by computing the Hurst exponents for the same time series. We show that some salient features of the analysis are preserved when considering global instead of local observables. Finally, we analyze flow configurations with multistability features where the ARMA technique is efficient in discriminating different stability branches of the system.
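As a rough illustration of what fitting an ARMA process to a time series involves, the sketch below simulates an ARMA(1,1) series and recovers its two coefficients. The abstract does not say which fitting procedure or model orders the authors used; the two-step Hannan-Rissanen least squares scheme, the order (1,1), and all function names and parameter values here are assumptions made for illustration. The index Υ is not computed because the abstract does not define it.

```python
import numpy as np

def simulate_arma11(phi, theta, n, rng):
    """Simulate x[t] = phi*x[t-1] + e[t] + theta*e[t-1] with unit-variance noise."""
    e = rng.standard_normal(n + 1)
    x = np.zeros(n + 1)
    for t in range(1, n + 1):
        x[t] = phi * x[t - 1] + e[t] + theta * e[t - 1]
    return x[1:]

def lagged_matrix(x, lags, start):
    """Columns x[t-1], ..., x[t-lags] for t = start, ..., len(x)-1."""
    return np.column_stack([x[start - k: len(x) - k] for k in range(1, lags + 1)])

def fit_arma11_hannan_rissanen(x, long_ar_order=20):
    """Two-step least squares estimate of (phi, theta) for an ARMA(1,1) model.

    Assumed procedure (not taken from the source):
    step 1 fits a long autoregression to approximate the innovations,
    step 2 regresses x[t] on x[t-1] and the estimated innovation at t-1.
    """
    x = x - x.mean()
    m = long_ar_order
    # Step 1: long AR fit, residuals stand in for the unobserved noise.
    lags = lagged_matrix(x, m, m)
    ar_coef, *_ = np.linalg.lstsq(lags, x[m:], rcond=None)
    resid = np.zeros_like(x)
    resid[m:] = x[m:] - lags @ ar_coef
    # Step 2: linear regression on lagged series and lagged residual.
    design = np.column_stack([x[m:-1], resid[m:-1]])
    coef, *_ = np.linalg.lstsq(design, x[m + 1:], rcond=None)
    return float(coef[0]), float(coef[1])

rng = np.random.default_rng(42)
series = simulate_arma11(phi=0.6, theta=0.3, n=20000, rng=rng)
phi_hat, theta_hat = fit_arma11_hannan_rissanen(series)
print(f"phi ~ {phi_hat:.2f}, theta ~ {theta_hat:.2f}")
```

In practice a maximum-likelihood ARMA fit from a statistics library, with the orders chosen by an information criterion, would usually replace this hand-written estimator.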
The Impact of Outliers on Net-Benefit Regression Model in Cost-Effectiveness Analysis.
Wen, Yu-Wen; Tsai, Yi-Wen; Wu, David Bin-Chia; Chen, Pei-Fen
2013-01-01
Ordinary least squares (OLS) regression has been widely used to analyze patient-level data in cost-effectiveness analysis (CEA). However, the estimates, inference and decision making in an economic evaluation based on OLS estimation may be biased by the presence of outliers. Robust estimation, by contrast, can remain unaffected and provide results that are resistant to outliers. The objective of this study is to explore the impact of outliers on net-benefit regression (NBR) in CEA using OLS and to propose a potential solution using robust estimation, i.e. Huber M-estimation, Hampel M-estimation, Tukey's bisquare M-estimation, MM-estimation and least trimmed squares estimation. Simulations under different outlier-generating scenarios and an empirical example were used to obtain the regression estimates of NBR by OLS and the five robust estimations. The empirical size and empirical power of both OLS and robust estimations were then compared in the context of hypothesis testing. Simulations showed that, compared with OLS estimation, the five robust approaches led to lower empirical sizes and achieved higher empirical powers in testing cost-effectiveness. In a real example of antiplatelet therapy, the incremental net-benefit estimated by OLS was lower than those estimated by the robust approaches because of outliers in the cost data. Robust estimations demonstrated a higher probability of cost-effectiveness compared to OLS estimation. The presence of outliers can bias the results of NBR and its interpretations. Robust estimation in NBR is recommended as an appropriate method to avoid such biased decision making. PMID:23840378
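The contrast between OLS and one of the robust estimators named above can be sketched on synthetic data. The example below is an illustration only: the data are simulated rather than taken from the study, the incremental net-benefit of 1000, the noise level and the five cost outliers are hypothetical values, and the Huber M-estimator is implemented by iteratively reweighted least squares with a MAD-based scale, which is one common way to compute it and not necessarily the authors' procedure.

```python
import numpy as np

def ols(X, y):
    """Ordinary least squares coefficients."""
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    return beta

def huber_irls(X, y, c=1.345, n_iter=50, tol=1e-8):
    """Huber M-estimate via iteratively reweighted least squares.

    The residual scale is re-estimated each iteration from the median
    absolute deviation (MAD); c is the usual Huber tuning constant.
    """
    beta = ols(X, y)
    for _ in range(n_iter):
        r = y - X @ beta
        scale = np.median(np.abs(r - np.median(r))) / 0.6745
        if scale == 0:
            break
        u = np.abs(r) / scale
        # Weight 1 inside the threshold, c/u beyond it.
        w = np.minimum(1.0, c / np.maximum(u, 1e-12))
        sw = np.sqrt(w)
        new_beta, *_ = np.linalg.lstsq(X * sw[:, None], y * sw, rcond=None)
        converged = np.max(np.abs(new_beta - beta)) < tol
        beta = new_beta
        if converged:
            break
    return beta

# Synthetic net-benefit data (hypothetical values, not from the study).
rng = np.random.default_rng(1)
n = 200
treat = np.repeat([0.0, 1.0], n // 2)          # 0 = control, 1 = treatment
TRUE_INB = 1000.0                               # assumed incremental net-benefit
net_benefit = 5000.0 + TRUE_INB * treat + rng.normal(0, 500, n)
net_benefit[n - 5:] -= 20000.0                  # five treated patients with extreme costs

X = np.column_stack([np.ones(n), treat])
inb_ols = ols(X, net_benefit)[1]
inb_huber = huber_irls(X, net_benefit)[1]
print(f"OLS INB: {inb_ols:.0f}, Huber INB: {inb_huber:.0f}")
```

In this setup the handful of extreme-cost patients pulls the OLS estimate of the treatment coefficient well below the value used to generate the data, while the Huber estimate stays much closer to it.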
Huang, Dong; Cabral, Ricardo; De la Torre, Fernando
2016-02-01
Discriminative methods (e.g., kernel regression, SVM) have been extensively used to solve problems such as object recognition, image alignment and pose estimation from images. These methods typically map image features (X) to continuous (e.g., pose) or discrete (e.g., object category) values. A major drawback of existing discriminative methods is that samples are directly projected onto a subspace and hence fail to account for outliers common in realistic training sets due to occlusion, specular reflections or noise. It is important to notice that existing discriminative approaches assume the input variables X to be noise free. Thus, discriminative methods experience significant performance degradation when gross outliers are present. Despite its obvious importance, the problem of robust discriminative learning has been relatively unexplored in computer vision. This paper develops the theory of robust regression (RR) and presents an effective convex approach that uses recent advances on rank minimization. The framework applies to a variety of problems in computer vision including robust linear discriminant analysis, regression with missing data, and multi-label classification. Several synthetic and real examples with applications to head pose estimation from images, image and video classification and facial attribute classification with missing data are used to illustrate the benefits of RR. PMID:26761740
Combining regression analysis and air quality modelling to predict benzene concentration levels
NASA Astrophysics Data System (ADS)
Vlachokostas, Ch.; Achillas, Ch.; Chourdakis, E.; Moussiopoulos, N.
2011-05-01
State-of-the-art epidemiological research has found consistent associations between traffic-related air pollution and various outcomes, such as respiratory symptoms and premature mortality. However, many urban areas are characterised by the absence of the necessary monitoring infrastructure, especially for benzene (C6H6), which is a known human carcinogen. The use of environmental statistics combined with air quality modelling can be of vital importance in order to assess air quality levels of traffic-related pollutants in an urban area in the case where there are no available measurements. This paper aims at developing and presenting a reliable approach, in order to forecast C6H6 levels in urban environments, demonstrated for Thessaloniki, Greece. Multiple stepwise regression analysis is used and a strong statistical relationship is detected between C6H6 and CO. The adopted regression model is validated in order to depict its applicability and representativeness. The presented results demonstrate that the adopted approach is capable of capturing C6H6 concentration trends and should be considered as complementary to air quality monitoring.
NASA Astrophysics Data System (ADS)
Nordemann, D. J. R.; Rigozo, N. R.; de Souza Echer, M. P.; Echer, E.
2008-11-01
We present here an implementation of a least squares iterative regression method applied to the sine functions embedded in the principal components extracted from geophysical time series. This method seems to represent a useful improvement for the quantitative analysis of periodicities in non-stationary time series. The determination of the principal components, followed by the least squares iterative regression method, was implemented in an algorithm written in the Scilab (2006) language. The main result of the method is the set of sine functions embedded in the series analyzed, in decreasing order of significance, from the most important ones, likely to represent the physical processes involved in the generation of the series, to the less important ones that represent noise components. Taking into account the need for a deeper knowledge of the Sun's past history and its implications for global climate change, the method was applied to the Sunspot Number series (1750-2004). With the threshold and parameter values used here, the application of the method leads to a total of 441 explicit sine functions, among which 65 were considered significant and were used for a reconstruction that gave a normalized mean squared error of 0.146.
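The idea of extracting sine functions one at a time, in decreasing order of importance, can be sketched as follows. This is a simplified stand-in and not the authors' Scilab algorithm: it skips the principal component step, picks each frequency from the FFT peak of the current residual instead of refining it by nonlinear iteration, and fits amplitude and phase by linear least squares. The function names and the synthetic two-sine test signal are assumptions.

```python
import numpy as np

def fit_sine_at(t, y, freq):
    """Least squares fit of a*cos(2*pi*f*t) + b*sin(2*pi*f*t) at a fixed frequency."""
    basis = np.column_stack([np.cos(2 * np.pi * freq * t),
                             np.sin(2 * np.pi * freq * t)])
    coef, *_ = np.linalg.lstsq(basis, y, rcond=None)
    amplitude = float(np.hypot(coef[0], coef[1]))
    return amplitude, basis @ coef

def extract_sines(t, y, n_components):
    """Greedy extraction: strongest spectral peak first, then subtract and repeat."""
    dt = t[1] - t[0]
    resid = y - y.mean()
    freqs = np.fft.rfftfreq(len(y), d=dt)
    found = []
    for _ in range(n_components):
        spectrum = np.abs(np.fft.rfft(resid))
        k = 1 + int(np.argmax(spectrum[1:]))   # skip the zero-frequency bin
        amplitude, fitted = fit_sine_at(t, resid, freqs[k])
        found.append((float(freqs[k]), amplitude))
        resid = resid - fitted
    return found, resid

# Synthetic test signal: two sines on the FFT grid plus weak noise.
rng = np.random.default_rng(0)
t = np.arange(512.0)
f1, f2 = 16 / 512, 40 / 512
signal = (2.0 * np.sin(2 * np.pi * f1 * t + 0.4)
          + 1.0 * np.sin(2 * np.pi * f2 * t + 1.1)
          + rng.normal(0, 0.1, t.size))

components, residual = extract_sines(t, signal, n_components=2)
for freq, amp in components:
    print(f"frequency {freq:.5f}, amplitude {amp:.2f}")
```

A significance threshold on the extracted amplitudes, like the one mentioned in the abstract, would decide where to stop the extraction; here the number of components is simply fixed.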
Rubio, Francisco J; Genton, Marc G
2016-06-30
We study Bayesian linear regression models with skew-symmetric scale mixtures of normal error distributions. These kinds of models can be used to capture departures from the usual assumption of normality of the errors in terms of heavy tails and asymmetry. We propose a general noninformative prior structure for these regression models and show that the corresponding posterior distribution is proper under mild conditions. We extend these propriety results to cases where the response variables are censored. The latter scenario is of interest in the context of accelerated failure time models, which are relevant in survival analysis. We present a simulation study that demonstrates good frequentist properties of the posterior credible intervals associated with the proposed priors. This study also sheds some light on the trade-off between increased model flexibility and the risk of over-fitting. We illustrate the performance of the proposed models with real data. Although we focus on models with univariate response variables, we also present some extensions to the multivariate case in the Supporting Information. Copyright © 2016 John Wiley & Sons, Ltd. PMID:26856806
NASA Astrophysics Data System (ADS)
Simms, Laura E.; Engebretson, Mark J.; Pilipenko, Viacheslav; Reeves, Geoffrey D.; Clilverd, Mark
2016-04-01
The daily maximum relativistic electron flux at geostationary orbit can be predicted well with a set of daily averaged predictor variables including previous day's flux, seed electron flux, solar wind velocity and number density, AE index, IMF Bz, Dst, and ULF and VLF wave power. As predictor variables are intercorrelated, we used multiple regression analyses to determine which are the most predictive of flux when other variables are controlled. Empirical models produced from regressions of flux on measured predictors from 1 day previous were reasonably effective at predicting novel observations. Adding previous flux to the parameter set improves the prediction of the peak of the increases but delays its anticipation of an event. Previous day's solar wind number density and velocity, AE index, and ULF wave activity are the most significant explanatory variables; however, the AE index, measuring substorm processes, shows a negative correlation with flux when other parameters are controlled. This may be due to the triggering of electromagnetic ion cyclotron waves by substorms that cause electron precipitation. VLF waves show lower, but significant, influence. The combined effect of ULF and VLF waves shows a synergistic interaction, where each increases the influence of the other on flux enhancement. Correlations between observations and predictions for this 1 day lag model ranged from 0.71 to 0.89 (average: 0.78). A path analysis of correlations between predictors suggests that solar wind and IMF parameters affect flux through intermediate processes such as ring current (Dst), AE, and wave activity.
A cautionary note on the use of EESC-based regression analysis for ozone trend studies
NASA Astrophysics Data System (ADS)
Kuttippurath, J.; Bodeker, G. E.; Roscoe, H. K.; Nair, P. J.
2015-01-01
Equivalent effective stratospheric chlorine (EESC) construct of ozone regression models attributes ozone changes to EESC changes using a single value of the sensitivity of ozone to EESC over the whole period. Using space-based total column ozone (TCO) measurements, and a synthetic TCO time series constructed such that EESC does not fall below its late 1990s maximum, we demonstrate that the EESC-based estimates of ozone changes in the polar regions (70-90°) after 2000 may, falsely, suggest an EESC-driven increase in ozone over this period. An EESC-based regression of our synthetic "failed Montreal Protocol with constant EESC" time series suggests a positive TCO trend that is statistically significantly different from zero over 2001-2012 when, in fact, no recovery has taken place. Our analysis demonstrates that caution needs to be exercised when using explanatory variables, with a single fit coefficient, fitted to the entire data record, to interpret changes in only part of the record.
Improved Regression Analysis of Temperature-Dependent Strain-Gage Balance Calibration Data
NASA Technical Reports Server (NTRS)
Ulbrich, N.
2015-01-01
An improved approach is discussed that may be used to directly include first and second order temperature effects in the load prediction algorithm of a wind tunnel strain-gage balance. The improved approach was designed for the Iterative Method that fits strain-gage outputs as a function of calibration loads and uses a load iteration scheme during the wind tunnel test to predict loads from measured gage outputs. The improved approach assumes that the strain-gage balance is at a constant uniform temperature when it is calibrated and used. First, the method introduces a new independent variable for the regression analysis of the balance calibration data. The new variable is designed as the difference between the uniform temperature of the balance and a global reference temperature. This reference temperature should be the primary calibration temperature of the balance so that, if needed, a tare load iteration can be performed. Then, two temperature-dependent terms are included in the regression models of the gage outputs. They are the temperature difference itself and the square of the temperature difference. Simulated temperature-dependent data obtained from Triumph Aerospace's 2013 calibration of NASA's ARC-30K five component semi-span balance is used to illustrate the application of the improved approach.
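The two temperature-dependent regression terms described above can be illustrated with a minimal least squares fit. The sketch assumes a single load component and a gage output model of the form output = c0 + c1*F + c2*dT + c3*dT^2, where dT is the balance temperature minus the reference temperature; a real balance model would contain several load components and cross terms. All coefficient values, the reference temperature and the noise level are hypothetical.

```python
import numpy as np

def fit_gage_output(load, temp, output, ref_temp):
    """Fit output = c0 + c1*load + c2*dT + c3*dT**2 by least squares.

    dT is the difference between the uniform balance temperature and the
    reference (primary calibration) temperature.
    """
    d_temp = temp - ref_temp
    design = np.column_stack([np.ones_like(load), load, d_temp, d_temp ** 2])
    coef, *_ = np.linalg.lstsq(design, output, rcond=None)
    return coef

# Synthetic calibration data (hypothetical numbers for illustration only).
rng = np.random.default_rng(0)
REF_TEMP = 294.0                                  # assumed reference temperature, K
TRUE_COEF = np.array([5.0, 0.8, 0.3, 0.01])       # c0, c1, c2, c3
load = rng.uniform(-1000.0, 1000.0, 200)
temp = rng.uniform(280.0, 320.0, 200)
d_temp = temp - REF_TEMP
gage = (TRUE_COEF[0] + TRUE_COEF[1] * load
        + TRUE_COEF[2] * d_temp + TRUE_COEF[3] * d_temp ** 2
        + rng.normal(0.0, 0.5, 200))

coef = fit_gage_output(load, temp, gage, REF_TEMP)
print(np.round(coef, 3))
```

Because dT vanishes at the reference temperature, both added terms drop out there and the fit reduces to the usual load-only model, which is consistent with the abstract's choice of the primary calibration temperature as reference.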
NASA Astrophysics Data System (ADS)
Liu, Pao-Wen Grace; Tsai, Jiun-Horng; Lai, Hsin-Chih; Tsai, Der-Min; Li, Li-Wei
2013-11-01
The sensitivity of air quality to meteorological variation has attracted attention since climate change became a world issue. The goal of this study is to investigate the sensitivity of ground-level ozone concentrations to temperature variation in Taiwan. Several multivariate regression models were built based on historical data of ozone and meteorological variables at three cities located in northern, mid-western, and southern Taiwan. Descriptive statistics indicate that pollution severity, from the most severe to the mildest, follows the order of the southern (Pingtung), mid-western (Fengyuan), and northern (Hsichih) sites. Multiple regression models containing a principal component (PC) trigger variable effectively simulated the historical ozone exceedances during 2004-2009. Inclusion of the PC trigger improved R2 from the lowest value of 0.38 to the highest of 0.58. High probability of detection and critical success index (mostly between 85% and 90%) and low false alarm rates (0-2.6%) were achieved in predicting the high ozone days (≥100 ppb). The results of the sensitivity analysis indicated that (1) the ozone sensitivity was positively correlated with the temperature variation, (2) the sensitivity levels were opposite to the severity of the ozone problem, (3) the sensitivity was most apparent in ozone seasons, and (4) the sensitivity strongly depended on seasonality in the urban cities Hsichih and Fengyuan, but weakly depended on seasonality in the rural city Pingtung.
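The verification scores quoted above can be computed from a 2x2 contingency table of observed and predicted exceedance days. The sketch below uses the standard forecast-verification definitions; the abstract does not say whether its "false alarm rate" is the false alarm ratio (false alarms divided by all predicted events) or the probability of false detection, so the ratio form used here is an assumption, and the eight daily ozone values are invented for illustration.

```python
import numpy as np

def forecast_skill(observed, predicted):
    """Probability of detection, false alarm ratio and critical success index.

    observed, predicted: boolean arrays marking event days.
    Assumes at least one observed and one predicted event, otherwise a
    denominator is zero.
    """
    observed = np.asarray(observed, dtype=bool)
    predicted = np.asarray(predicted, dtype=bool)
    hits = int(np.sum(observed & predicted))
    misses = int(np.sum(observed & ~predicted))
    false_alarms = int(np.sum(~observed & predicted))
    return {
        "POD": hits / (hits + misses),
        "FAR": false_alarms / (hits + false_alarms),
        "CSI": hits / (hits + misses + false_alarms),
    }

THRESHOLD_PPB = 100  # high ozone day threshold from the abstract

# Invented daily ozone values in ppb (not data from the study).
observed_ppb = np.array([120, 105, 101, 80, 95, 60, 70, 130])
predicted_ppb = np.array([110, 102, 90, 85, 104, 65, 72, 125])

scores = forecast_skill(observed_ppb >= THRESHOLD_PPB,
                        predicted_ppb >= THRESHOLD_PPB)
print(scores)
```

With these eight days there are three hits, one miss and one false alarm, so the three scores follow directly from the counts.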