An Entropy-Based Measure for Assessing Fuzziness in Logistic Regression
Weiss, Brandi A.; Dardick, William
2015-01-01
This article introduces an entropy-based measure of data–model fit that can be used to assess the quality of logistic regression models. Entropy has previously been used in mixture-modeling to quantify how well individuals are classified into latent classes. The current study proposes the use of entropy for logistic regression models to quantify the quality of classification and separation of group membership. Entropy complements preexisting measures of data–model fit and provides unique information not contained in other measures. Hypothetical data scenarios, an applied example, and Monte Carlo simulation results are used to demonstrate the application of entropy in logistic regression. Entropy should be used in conjunction with other measures of data–model fit to assess how well logistic regression models classify cases into observed categories. PMID:29795897
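As a rough illustration of the idea, the sketch below computes a normalized entropy index from fitted probabilities. It assumes the relative-entropy form familiar from mixture modeling carries over with two categories; the abstract does not give the article's exact formula, and the function name `relative_entropy` is hypothetical.

```python
import math


def relative_entropy(probs):
    """Normalized entropy index for fitted binary probabilities.

    probs: fitted P(y = 1) values, each in [0, 1].
    Returns a value in [0, 1]: 1 means every case is classified crisply
    (probabilities at 0 or 1), 0 means every case sits at 0.5.
    ASSUMPTION: mixture-modeling relative entropy with K = 2 categories.
    """
    def term(p):
        # Limit of -p * log(p) as p -> 0 is 0; guard against log(0).
        return 0.0 if p <= 0.0 or p >= 1.0 else -p * math.log(p)

    probs = list(probs)
    total = sum(term(p) + term(1.0 - p) for p in probs)
    return 1.0 - total / (len(probs) * math.log(2))


crisp = relative_entropy([0.05, 0.95, 0.10, 0.90])
fuzzy = relative_entropy([0.40, 0.60, 0.45, 0.55])
print(crisp > fuzzy)  # the crisper set of fitted probabilities scores higher
```

Two models with the same classification accuracy at a 0.5 cutoff can differ sharply on an index of this kind, which is the sense in which entropy may add information beyond other fit measures.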
An Entropy-Based Measure for Assessing Fuzziness in Logistic Regression.
Weiss, Brandi A; Dardick, William
2016-12-01
This article introduces an entropy-based measure of data-model fit that can be used to assess the quality of logistic regression models. Entropy has previously been used in mixture-modeling to quantify how well individuals are classified into latent classes. The current study proposes the use of entropy for logistic regression models to quantify the quality of classification and separation of group membership. Entropy complements preexisting measures of data-model fit and provides unique information not contained in other measures. Hypothetical data scenarios, an applied example, and Monte Carlo simulation results are used to demonstrate the application of entropy in logistic regression. Entropy should be used in conjunction with other measures of data-model fit to assess how well logistic regression models classify cases into observed categories.
Robust mislabel logistic regression without modeling mislabel probabilities.
Hung, Hung; Jou, Zhi-Yu; Huang, Su-Yun
2018-03-01
Logistic regression is among the most widely used statistical methods for linear discriminant analysis. In many applications, we only observe possibly mislabeled responses. Fitting a conventional logistic regression can then lead to biased estimation. One common resolution is to fit a mislabel logistic regression model, which takes mislabeled responses into consideration. Another common method is to adopt robust M-estimation by down-weighting suspected instances. In this work, we propose a new robust mislabel logistic regression based on γ-divergence. Our proposal possesses two advantageous features: (1) It does not need to model the mislabel probabilities. (2) The minimum γ-divergence estimation leads to a weighted estimating equation without the need to include any bias correction term, that is, it is automatically bias-corrected. These features make the proposed γ-logistic regression more robust in model fitting and more intuitive for model interpretation through a simple weighting scheme. Our method is also easy to implement, and two types of algorithms are included. Simulation studies and the Pima data application are presented to demonstrate the performance of γ-logistic regression. © 2017, The International Biometric Society.
On the Usefulness of a Multilevel Logistic Regression Approach to Person-Fit Analysis
ERIC Educational Resources Information Center
Conijn, Judith M.; Emons, Wilco H. M.; van Assen, Marcel A. L. M.; Sijtsma, Klaas
2011-01-01
The logistic person response function (PRF) models the probability of a correct response as a function of the item locations. Reise (2000) proposed to use the slope parameter of the logistic PRF as a person-fit measure. He reformulated the logistic PRF model as a multilevel logistic regression model and estimated the PRF parameters from this…
Schörgendorfer, Angela; Branscum, Adam J; Hanson, Timothy E
2013-06-01
Logistic regression is a popular tool for risk analysis in medical and population health science. With continuous response data, it is common to create a dichotomous outcome for logistic regression analysis by specifying a threshold for positivity. Fitting a linear regression to the nondichotomized response variable assuming a logistic sampling model for the data has been empirically shown to yield more efficient estimates of odds ratios than ordinary logistic regression of the dichotomized endpoint. We illustrate that risk inference is not robust to departures from the parametric logistic distribution. Moreover, the model assumption of proportional odds is generally not satisfied when the condition of a logistic distribution for the data is violated, leading to biased inference from a parametric logistic analysis. We develop novel Bayesian semiparametric methodology for testing goodness of fit of parametric logistic regression with continuous measurement data. The testing procedures hold for any cutoff threshold and our approach simultaneously provides the ability to perform semiparametric risk estimation. Bayes factors are calculated using the Savage-Dickey ratio for testing the null hypothesis of logistic regression versus a semiparametric generalization. We propose a fully Bayesian and a computationally efficient empirical Bayesian approach to testing, and we present methods for semiparametric estimation of risks, relative risks, and odds ratios when parametric logistic regression fails. Theoretical results establish the consistency of the empirical Bayes test. Results from simulated data show that the proposed approach provides accurate inference irrespective of whether parametric assumptions hold or not. Evaluation of risk factors for obesity shows that different inferences are derived from an analysis of a real data set when deviations from a logistic distribution are permissible in a flexible semiparametric framework. 
© 2013, The International Biometric Society.
ERIC Educational Resources Information Center
Chen, Chau-Kuang
2005-01-01
Logistic and Cox regression methods are practical tools used to model the relationships between certain student learning outcomes and their relevant explanatory variables. The logistic regression model fits an S-shaped curve into a binary outcome with data points of zero and one. The Cox regression model allows investigators to study the duration…
Model building strategy for logistic regression: purposeful selection.
Zhang, Zhongheng
2016-03-01
Logistic regression is one of the most commonly used models to account for confounders in the medical literature. The article shows how to carry out the purposeful selection model-building strategy in R. I stress the use of the likelihood ratio test to see whether deleting a variable has a significant impact on model fit. A deleted variable should also be checked for whether it is an important adjustment for the remaining covariates. Interactions should be checked to disentangle complex relationships between covariates and their synergistic effect on the response variable. The model should be checked for goodness of fit (GOF), in other words, how well the fitted model reflects the real data. The Hosmer-Lemeshow GOF test is the most widely used for logistic regression models.
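The likelihood ratio check for dropping a single variable can be sketched as follows. The article works in R; this is a language-neutral illustration in Python, and the function name and the log-likelihood values are hypothetical. It assumes the two models are nested and differ by exactly one parameter, so the statistic is referred to a chi-square distribution with 1 degree of freedom.

```python
import math


def lr_test_one_df(loglik_full, loglik_reduced):
    """Likelihood ratio test for deleting ONE parameter from a nested model.

    Returns (statistic, p_value). For 1 degree of freedom the chi-square
    upper tail equals erfc(sqrt(statistic / 2)).
    """
    stat = max(2.0 * (loglik_full - loglik_reduced), 0.0)
    p_value = math.erfc(math.sqrt(stat / 2.0))
    return stat, p_value


# Hypothetical log-likelihoods of a full model and the model without one covariate.
stat, p = lr_test_one_df(loglik_full=-100.0, loglik_reduced=-103.2)
print(f"LR statistic = {stat:.2f}, p = {p:.4f}")
```

A small p-value suggests that deleting the variable worsens fit, so the variable would be retained under this strategy.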
An Entropy-Based Measure for Assessing Fuzziness in Logistic Regression
ERIC Educational Resources Information Center
Weiss, Brandi A.; Dardick, William
2016-01-01
This article introduces an entropy-based measure of data-model fit that can be used to assess the quality of logistic regression models. Entropy has previously been used in mixture-modeling to quantify how well individuals are classified into latent classes. The current study proposes the use of entropy for logistic regression models to quantify…
DOE Office of Scientific and Technical Information (OSTI.GOV)
Ghazali, Amirul Syafiq Mohd; Ali, Zalila; Noor, Norlida Mohd
Multinomial logistic regression is widely used to model the outcomes of a polytomous response variable, a categorical dependent variable with more than two categories. The model assumes that the conditional mean of the dependent categorical variables is the logistic function of an affine combination of predictor variables. Its procedure gives a number of logistic regression models that make specific comparisons of the response categories. When there are q categories of the response variable, the model consists of q-1 logit equations which are fitted simultaneously. The model is validated by variable selection procedures, tests of regression coefficients, a significant test of the overall model, goodness-of-fit measures, and validation of predicted probabilities using odds ratio. This study used the multinomial logistic regression model to investigate obesity and overweight among primary school students in a rural area on the basis of their demographic profiles, lifestyles and on the diet and food intake. The results indicated that obesity and overweight of students are related to gender, religion, sleep duration, time spent on electronic games, breakfast intake in a week, with whom meals are taken, protein intake, and also, the interaction between breakfast intake in a week with sleep duration, and the interaction between gender and protein intake.
NASA Astrophysics Data System (ADS)
Ghazali, Amirul Syafiq Mohd; Ali, Zalila; Noor, Norlida Mohd; Baharum, Adam
2015-10-01
Multinomial logistic regression is widely used to model the outcomes of a polytomous response variable, a categorical dependent variable with more than two categories. The model assumes that the conditional mean of the dependent categorical variables is the logistic function of an affine combination of predictor variables. Its procedure gives a number of logistic regression models that make specific comparisons of the response categories. When there are q categories of the response variable, the model consists of q-1 logit equations which are fitted simultaneously. The model is validated by variable selection procedures, tests of regression coefficients, a significant test of the overall model, goodness-of-fit measures, and validation of predicted probabilities using odds ratio. This study used the multinomial logistic regression model to investigate obesity and overweight among primary school students in a rural area on the basis of their demographic profiles, lifestyles and on the diet and food intake. The results indicated that obesity and overweight of students are related to gender, religion, sleep duration, time spent on electronic games, breakfast intake in a week, with whom meals are taken, protein intake, and also, the interaction between breakfast intake in a week with sleep duration, and the interaction between gender and protein intake.
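To make the "q-1 logit equations" concrete, the sketch below turns q-1 linear predictors into q category probabilities, with the first category as the reference (its linear predictor fixed at 0). The coefficient values and the function name are hypothetical and are not taken from the study.

```python
import math


def category_probabilities(x, coefs):
    """Baseline-category multinomial logit.

    x: list of predictor values.
    coefs: q-1 pairs (intercept, slopes), one per non-reference category.
    Returns q probabilities; index 0 is the reference category.
    """
    etas = [b0 + sum(b * xi for b, xi in zip(slopes, x)) for b0, slopes in coefs]
    shift = max([0.0] + etas)  # subtract the maximum for numerical stability
    exps = [math.exp(0.0 - shift)] + [math.exp(eta - shift) for eta in etas]
    denom = sum(exps)
    return [e / denom for e in exps]


# Hypothetical q = 3 outcome (e.g. normal, overweight, obese) with two predictors.
coefs = [(-1.0, [0.5, 0.2]), (-2.0, [0.8, -0.1])]
probs = category_probabilities([1.0, 3.0], coefs)
print([round(p, 3) for p in probs])
```

Each log ratio log(p_k / p_0) equals the k-th linear predictor, which is why q-1 equations suffice to describe q categories.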
glmnetLRC f/k/a lrc package: Logistic Regression Classification
DOE Office of Scientific and Technical Information (OSTI.GOV)
2016-06-09
Methods for fitting and predicting logistic regression classifiers (LRC) with an arbitrary loss function using elastic net or best subsets. This package adds additional model fitting features to the existing glmnet and bestglm R packages. This package was created to perform the analyses described in Amidan BG, Orton DJ, LaMarche BL, et al. 2014. Signatures for Mass Spectrometry Data Quality. Journal of Proteome Research. 13(4), 2215-2222. It makes the model fitting available in the glmnet and bestglm packages more general by identifying optimal model parameters via cross validation with a customizable loss function. It also identifies the optimal threshold for binary classification.
Staley, James R; Jones, Edmund; Kaptoge, Stephen; Butterworth, Adam S; Sweeting, Michael J; Wood, Angela M; Howson, Joanna M M
2017-06-01
Logistic regression is often used instead of Cox regression to analyse genome-wide association studies (GWAS) of single-nucleotide polymorphisms (SNPs) and disease outcomes with cohort and case-cohort designs, as it is less computationally expensive. Although Cox and logistic regression models have been compared previously in cohort studies, this work does not completely cover the GWAS setting nor extend to the case-cohort study design. Here, we evaluated Cox and logistic regression applied to cohort and case-cohort genetic association studies using simulated data and genetic data from the EPIC-CVD study. In the cohort setting, there was a modest improvement in power to detect SNP-disease associations using Cox regression compared with logistic regression, which increased as the disease incidence increased. In contrast, logistic regression had more power than (Prentice weighted) Cox regression in the case-cohort setting. Logistic regression yielded inflated effect estimates (assuming the hazard ratio is the underlying measure of association) for both study designs, especially for SNPs with greater effect on disease. Given logistic regression is substantially more computationally efficient than Cox regression in both settings, we propose a two-step approach to GWAS in cohort and case-cohort studies. First to analyse all SNPs with logistic regression to identify associated variants below a pre-defined P-value threshold, and second to fit Cox regression (appropriately weighted in case-cohort studies) to those identified SNPs to ensure accurate estimation of association with disease.
Exploring Person Fit with an Approach Based on Multilevel Logistic Regression
ERIC Educational Resources Information Center
Walker, A. Adrienne; Engelhard, George, Jr.
2015-01-01
The idea that test scores may not be valid representations of what students know, can do, and should learn next is well known. Person fit provides an important aspect of validity evidence. Person fit analyses at the individual student level are not typically conducted and person fit information is not communicated to educational stakeholders. In…
Regularization Paths for Conditional Logistic Regression: The clogitL1 Package.
Reid, Stephen; Tibshirani, Rob
2014-07-01
We apply the cyclic coordinate descent algorithm of Friedman, Hastie, and Tibshirani (2010) to the fitting of a conditional logistic regression model with lasso (ℓ1) and elastic net penalties. The sequential strong rules of Tibshirani, Bien, Hastie, Friedman, Taylor, Simon, and Tibshirani (2012) are also used in the algorithm and it is shown that these offer a considerable speed up over the standard coordinate descent algorithm with warm starts. Once implemented, the algorithm is used in simulation studies to compare the variable selection and prediction performance of the conditional logistic regression model against that of its unconditional (standard) counterpart. We find that the conditional model performs admirably on datasets drawn from a suitable conditional distribution, outperforming its unconditional counterpart at variable selection. The conditional model is also fit to a small real world dataset, demonstrating how we obtain regularization paths for the parameters of the model and how we apply cross validation for this method where natural unconditional prediction rules are hard to come by.
Regularization Paths for Conditional Logistic Regression: The clogitL1 Package
Reid, Stephen; Tibshirani, Rob
2014-01-01
We apply the cyclic coordinate descent algorithm of Friedman, Hastie, and Tibshirani (2010) to the fitting of a conditional logistic regression model with lasso (ℓ1) and elastic net penalties. The sequential strong rules of Tibshirani, Bien, Hastie, Friedman, Taylor, Simon, and Tibshirani (2012) are also used in the algorithm and it is shown that these offer a considerable speed up over the standard coordinate descent algorithm with warm starts. Once implemented, the algorithm is used in simulation studies to compare the variable selection and prediction performance of the conditional logistic regression model against that of its unconditional (standard) counterpart. We find that the conditional model performs admirably on datasets drawn from a suitable conditional distribution, outperforming its unconditional counterpart at variable selection. The conditional model is also fit to a small real world dataset, demonstrating how we obtain regularization paths for the parameters of the model and how we apply cross validation for this method where natural unconditional prediction rules are hard to come by. PMID:26257587
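The building block of coordinate descent under lasso and elastic net penalties is the soft-thresholding operator. The sketch below shows that operator and the single-coordinate elastic net update for the simplest setting (squared-error loss, standardized predictors). It is not the clogitL1 implementation, which applies the same idea to the conditional logistic likelihood; the function names are hypothetical.

```python
def soft_threshold(z, gamma):
    """Soft-thresholding: shrink z toward zero by gamma, clipping at zero."""
    if z > gamma:
        return z - gamma
    if z < -gamma:
        return z + gamma
    return 0.0


def elastic_net_update(z, lam, alpha):
    """One coordinate update for squared-error loss with standardized predictors.

    z: inner product of the predictor with the partial residual, divided by n.
    lam: overall penalty strength; alpha: mixing weight (1 = lasso, 0 = ridge).
    ASSUMPTION: Gaussian case only, shown for illustration.
    """
    return soft_threshold(z, lam * alpha) / (1.0 + lam * (1.0 - alpha))


print(soft_threshold(-0.5, 1.0))         # small coefficients are set exactly to zero
print(elastic_net_update(3.0, 1.0, 1.0))  # pure lasso: plain soft-thresholding
```

Setting coefficients exactly to zero is what lets the penalized model perform variable selection along the regularization path.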
Deng, Yingyuan; Wang, Tianfu; Chen, Siping; Liu, Weixiang
2017-01-01
The aim of the study is to screen the significant sonographic features by logistic regression analysis and fit a model to diagnose thyroid nodules. A total of 525 pathological thyroid nodules were retrospectively analyzed. All the nodules underwent conventional ultrasonography (US), strain elastosonography (SE), and contrast-enhanced ultrasound (CEUS). Those nodules’ 12 suspicious sonographic features were used to assess thyroid nodules. The significant features of diagnosing thyroid nodules were picked out by logistic regression analysis. All variables that were statistically related to diagnosis of thyroid nodules, at a level of p < 0.05, were included in a logistic regression analysis model. The significant features in the logistic regression model of diagnosing thyroid nodules were calcification, suspected cervical lymph node metastasis, hypoenhancement pattern, margin, shape, vascularity, posterior acoustic, echogenicity, and elastography score. According to the results of logistic regression analysis, the formula that could predict whether or not thyroid nodules are malignant was established. The area under the receiver operating characteristic (ROC) curve was 0.930 and the sensitivity, specificity, accuracy, positive predictive value, and negative predictive value were 83.77%, 89.56%, 87.05%, 86.04%, and 87.79% respectively. PMID:29228030
Pang, Tiantian; Huang, Leidan; Deng, Yingyuan; Wang, Tianfu; Chen, Siping; Gong, Xuehao; Liu, Weixiang
2017-01-01
The aim of the study is to screen the significant sonographic features by logistic regression analysis and fit a model to diagnose thyroid nodules. A total of 525 pathological thyroid nodules were retrospectively analyzed. All the nodules underwent conventional ultrasonography (US), strain elastosonography (SE), and contrast-enhanced ultrasound (CEUS). Those nodules' 12 suspicious sonographic features were used to assess thyroid nodules. The significant features of diagnosing thyroid nodules were picked out by logistic regression analysis. All variables that were statistically related to diagnosis of thyroid nodules, at a level of p < 0.05, were included in a logistic regression analysis model. The significant features in the logistic regression model of diagnosing thyroid nodules were calcification, suspected cervical lymph node metastasis, hypoenhancement pattern, margin, shape, vascularity, posterior acoustic, echogenicity, and elastography score. According to the results of logistic regression analysis, the formula that could predict whether or not thyroid nodules are malignant was established. The area under the receiver operating characteristic (ROC) curve was 0.930 and the sensitivity, specificity, accuracy, positive predictive value, and negative predictive value were 83.77%, 89.56%, 87.05%, 86.04%, and 87.79% respectively.
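The five reported performance figures all follow from a 2 x 2 table of predicted versus pathological status. The sketch below shows the arithmetic. The cell counts are NOT reported in the abstract: they are a back-calculation that sums to 525 nodules and reproduces the published percentages to two decimals, so they should be read as an assumption rather than as the authors' data.

```python
def diagnostic_metrics(tp, fn, tn, fp):
    """Standard diagnostic accuracy measures from confusion-matrix counts."""
    total = tp + fn + tn + fp
    return {
        "sensitivity": tp / (tp + fn),   # malignant nodules correctly flagged
        "specificity": tn / (tn + fp),   # benign nodules correctly cleared
        "accuracy": (tp + tn) / total,
        "ppv": tp / (tp + fp),           # positive predictive value
        "npv": tn / (tn + fn),           # negative predictive value
    }


# ASSUMED counts (back-calculated, not from the source); they total 525.
metrics = diagnostic_metrics(tp=191, fn=37, tn=266, fp=31)
for name, value in metrics.items():
    print(f"{name}: {100 * value:.2f}%")
```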
Conditional Poisson models: a flexible alternative to conditional logistic case cross-over analysis.
Armstrong, Ben G; Gasparrini, Antonio; Tobias, Aurelio
2014-11-24
The time stratified case cross-over approach is a popular alternative to conventional time series regression for analysing associations between time series of environmental exposures (air pollution, weather) and counts of health outcomes. These are almost always analyzed using conditional logistic regression on data expanded to case-control (case crossover) format, but this has some limitations. In particular adjusting for overdispersion and auto-correlation in the counts is not possible. It has been established that a Poisson model for counts with stratum indicators gives identical estimates to those from conditional logistic regression and does not have these limitations, but it is little used, probably because of the overheads in estimating many stratum parameters. The conditional Poisson model avoids estimating stratum parameters by conditioning on the total event count in each stratum, thus simplifying the computing and increasing the number of strata for which fitting is feasible compared with the standard unconditional Poisson model. Unlike the conditional logistic model, the conditional Poisson model does not require expanding the data, and can adjust for overdispersion and auto-correlation. It is available in Stata, R, and other packages. By applying to some real data and using simulations, we demonstrate that conditional Poisson models were simpler to code and shorter to run than are conditional logistic analyses and can be fitted to larger data sets than possible with standard Poisson models. Allowing for overdispersion or autocorrelation was possible with the conditional Poisson model but when not required this model gave identical estimates to those from conditional logistic regression. Conditional Poisson regression models provide an alternative to case crossover analysis of stratified time series data with some advantages. 
The conditional Poisson model can also be used in other contexts in which primary control for confounding is by fine stratification.
Yusuf, O B; Bamgboye, E A; Afolabi, R F; Shodimu, M A
2014-09-01
The logistic regression model is widely used in health research for descriptive and predictive purposes. Unfortunately, researchers are often not aware that the underlying principles of the technique have failed when the maximum likelihood algorithm does not converge. Young researchers, particularly postgraduate students, may not know why the separation problem, whether quasi or complete, occurs, how to identify it, and how to fix it. This study was designed to critically evaluate convergence issues in articles that employed logistic regression analysis published in an African Journal of Medicine and medical sciences between 2004 and 2013. Problems of quasi or complete separation were described and illustrated with the National Demographic and Health Survey dataset. A critical evaluation of articles that employed logistic regression was conducted. A total of 581 articles were reviewed, of which 40 (6.9%) used binary logistic regression. Twenty-four (60.0%) stated the use of the logistic regression model in the methodology, while none of the articles assessed model fit. Only 3 (12.5%) properly described the procedures. Of the 40 that used the logistic regression model, the problem of convergence occurred in 6 (15.0%). Logistic regression tended to be poorly reported in studies published between 2004 and 2013. Our findings showed that the procedure may not be well understood by researchers, since very few described the process in their reports, and that researchers may be totally unaware of the problem of convergence or how to deal with it.
The association of health-related fitness with indicators of academic performance in Texas schools.
Welk, Gregory J; Jackson, Allen W; Morrow, James R; Haskell, William H; Meredith, Marilu D; Cooper, Kenneth H
2010-09-01
This study examined the associations between indicators of health-related physical fitness (cardiovascular fitness and body mass index) and academic performance (Texas Assessment of Knowledge and Skills). Partial correlations were generally stronger for cardiovascular fitness than body mass index and consistently stronger in the middle school grades. Mixed-model regression analyses revealed modest associations between fitness and academic achievement after controlling for potentially confounding variables. The effects of fitness on academic achievement were positive but small. A separate logistic regression analysis indicated that higher fitness rates increased the odds of schools achieving exemplary/recognized school status within the state. School fitness attainment is an indicator of higher performing schools. Direction of causality cannot be inferred due to the cross-sectional nature of the data.
Determining factors influencing survival of breast cancer by fuzzy logistic regression model.
Nikbakht, Roya; Bahrampour, Abbas
2017-01-01
The fuzzy logistic regression model can be used to determine the influential factors of a disease. This study explores the important predictive factors for survival of patients with breast cancer. We used breast cancer data collected by the cancer registry of Kerman University of Medical Sciences during the period 2000-2007. Variables such as morphology, grade, age, and treatments (surgery, radiotherapy, and chemotherapy) were entered in the fuzzy logistic regression model. Model performance was assessed in terms of the mean degree of membership (MDM). The study results showed that almost 41% of patients were in the neoplasm and malignant group and more than two-thirds of them were still alive after 5-year follow-up. Based on the fuzzy logistic model, the most important factors influencing survival were chemotherapy, morphology, and radiotherapy, in that order. Furthermore, the MDM criterion shows that the fuzzy logistic regression has a good fit to the data (MDM = 0.86). The fuzzy logistic regression model showed that chemotherapy is more important than radiotherapy in the survival of patients with breast cancer. In addition, this model can calculate possibilistic odds of survival in cancer patients. The results of this study can be applied in clinical research. Because few studies have applied fuzzy logistic models, we recommend using this model in various research areas.
Non-proportional odds multivariate logistic regression of ordinal family data.
Zaloumis, Sophie G; Scurrah, Katrina J; Harrap, Stephen B; Ellis, Justine A; Gurrin, Lyle C
2015-03-01
Methods to examine whether genetic and/or environmental sources can account for the residual variation in ordinal family data usually assume proportional odds. However, standard software to fit the non-proportional odds model to ordinal family data is limited because the correlation structure of family data is more complex than for other types of clustered data. To perform these analyses we propose the non-proportional odds multivariate logistic regression model and take a simulation-based approach to model fitting using Markov chain Monte Carlo methods, such as partially collapsed Gibbs sampling and the Metropolis algorithm. We applied the proposed methodology to male pattern baldness data from the Victorian Family Heart Study. © 2014 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim.
Dudley, Robert W.; Hodgkins, Glenn A.; Dickinson, Jesse
2017-01-01
We present a logistic regression approach for forecasting the probability of future groundwater levels declining or maintaining below specific groundwater-level thresholds. We tested our approach on 102 groundwater wells in different climatic regions and aquifers of the United States that are part of the U.S. Geological Survey Groundwater Climate Response Network. We evaluated the importance of current groundwater levels, precipitation, streamflow, seasonal variability, Palmer Drought Severity Index, and atmosphere/ocean indices for developing the logistic regression equations. Several diagnostics of model fit were used to evaluate the regression equations, including testing of autocorrelation of residuals, goodness-of-fit metrics, and bootstrap validation testing. The probabilistic predictions were most successful at wells with high persistence (low month-to-month variability) in their groundwater records and at wells where the groundwater level remained below the defined low threshold for sustained periods (generally three months or longer). The model fit was weakest at wells with strong seasonal variability in levels and with shorter duration low-threshold events. We identified challenges in deriving probabilistic-forecasting models and possible approaches for addressing those challenges.
Item Response Theory Modeling of the Philadelphia Naming Test.
Fergadiotis, Gerasimos; Kellough, Stacey; Hula, William D
2015-06-01
In this study, we investigated the fit of the Philadelphia Naming Test (PNT; Roach, Schwartz, Martin, Grewal, & Brecher, 1996) to an item-response-theory measurement model, estimated the precision of the resulting scores and item parameters, and provided a theoretical rationale for the interpretation of PNT overall scores by relating explanatory variables to item difficulty. This article describes the statistical model underlying the computer adaptive PNT presented in a companion article (Hula, Kellough, & Fergadiotis, 2015). Using archival data, we evaluated the fit of the PNT to 1- and 2-parameter logistic models and examined the precision of the resulting parameter estimates. We regressed the item difficulty estimates on three predictor variables: word length, age of acquisition, and contextual diversity. The 2-parameter logistic model demonstrated marginally better fit, but the fit of the 1-parameter logistic model was adequate. Precision was excellent for both person ability and item difficulty estimates. Word length, age of acquisition, and contextual diversity all independently contributed to variance in item difficulty. Item-response-theory methods can be productively used to analyze and quantify anomia severity in aphasia. Regression of item difficulty on lexical variables supported the validity of the PNT and interpretation of anomia severity scores in the context of current word-finding models.
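For readers unfamiliar with the two models compared here, the sketch below gives the standard item response function. Under the 1-parameter logistic model every item shares the same discrimination (fixed at 1 here), while the 2-parameter model lets discrimination vary by item. The ability and difficulty values are hypothetical and unrelated to the PNT estimates.

```python
import math


def prob_correct(theta, difficulty, discrimination=1.0):
    """Probability of a correct response under a logistic IRT model.

    theta: person ability; difficulty: item location on the same scale.
    discrimination = 1.0 for every item gives the 1-parameter model;
    an item-specific discrimination gives the 2-parameter model.
    """
    return 1.0 / (1.0 + math.exp(-discrimination * (theta - difficulty)))


# Hypothetical person one unit above the item's difficulty.
p_1pl = prob_correct(theta=1.0, difficulty=0.0)
p_2pl = prob_correct(theta=1.0, difficulty=0.0, discrimination=2.0)
print(round(p_1pl, 3), round(p_2pl, 3))
```

When ability equals difficulty the probability is 0.5 under both models, which is what makes difficulty interpretable as an item location.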
London Measure of Unplanned Pregnancy: guidance for its use as an outcome measure
Hall, Jennifer A; Barrett, Geraldine; Copas, Andrew; Stephenson, Judith
2017-01-01
Background: The London Measure of Unplanned Pregnancy (LMUP) is a psychometrically validated measure of the degree of intention of a current or recent pregnancy. The LMUP is increasingly being used worldwide, and can be used to evaluate family planning or preconception care programs. However, beyond recommending the use of the full LMUP scale, there is no published guidance on how to use the LMUP as an outcome measure. Ordinal logistic regression has been recommended informally, but studies published to date have all used binary logistic regression and dichotomized the scale at different cut points. There is thus a need for evidence-based guidance to provide a standardized methodology for multivariate analysis and to enable comparison of results. This paper makes recommendations for the regression method for analysis of the LMUP as an outcome measure. Materials and methods: Data collected from 4,244 pregnant women in Malawi were used to compare five regression methods: linear, logistic with two cut points, and ordinal logistic with either the full or grouped LMUP score. The recommendations were then tested on the original UK LMUP data. Results: There were small but no important differences in the findings across the regression models. Logistic regression resulted in the largest loss of information, and assumptions were violated for the linear and ordinal logistic regression. Consequently, robust standard errors were used for linear regression and a partial proportional odds ordinal logistic regression model attempted. The latter could only be fitted for grouped LMUP score. Conclusion: We recommend the linear regression model with robust standard errors to make full use of the LMUP score when analyzed as an outcome measure. Ordinal logistic regression could be considered, but a partial proportional odds model with grouped LMUP score may be required. Logistic regression is the least-favored option, due to the loss of information. 
For logistic regression, the cut point for un/planned pregnancy should be between nine and ten. These recommendations will standardize the analysis of LMUP data and enhance comparability of results across studies. PMID:28435343
Nonconvex Sparse Logistic Regression With Weakly Convex Regularization
NASA Astrophysics Data System (ADS)
Shen, Xinyue; Gu, Yuantao
2018-06-01
In this work we propose to fit a sparse logistic regression model by a weakly convex regularized nonconvex optimization problem. The idea is based on the finding that a weakly convex function as an approximation of the $\ell_0$ pseudo norm is able to better induce sparsity than the commonly used $\ell_1$ norm. For a class of weakly convex sparsity inducing functions, we prove the nonconvexity of the corresponding sparse logistic regression problem, and study its local optimality conditions and the choice of the regularization parameter to exclude trivial solutions. Despite the nonconvexity, a method based on proximal gradient descent is used to solve the general weakly convex sparse logistic regression, and its convergence behavior is studied theoretically. Then the general framework is applied to a specific weakly convex function, and a necessary and sufficient local optimality condition is provided. The solution method is instantiated in this case as an iterative firm-shrinkage algorithm, and its effectiveness is demonstrated in numerical experiments on both randomly generated and real datasets.
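The abstract does not give the shrinkage rule itself. As a minimal sketch, the block below implements the textbook firm-thresholding operator, assuming two thresholds 0 < lam < mu; the names `firm_shrink`, `lam`, and `mu` are illustrative and the paper's own parameterization may differ.

```python
def firm_shrink(x: float, lam: float, mu: float) -> float:
    """Textbook firm thresholding of a scalar (assumed form, not taken from the paper).

    |x| <= lam        -> 0 (coefficient is zeroed, as in soft thresholding)
    lam < |x| <= mu   -> linear ramp from 0 up to |x| = mu
    |x| > mu          -> x unchanged (no shrinkage bias for large values)
    """
    if not 0 < lam < mu:
        raise ValueError("requires 0 < lam < mu")
    a = abs(x)
    if a <= lam:
        return 0.0
    if a <= mu:
        sign = 1.0 if x > 0 else -1.0
        return sign * mu * (a - lam) / (mu - lam)
    return x
```

In a proximal gradient scheme, an operator of this kind would be applied coordinate-wise to the coefficient vector after each gradient step on the logistic loss.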
Deletion Diagnostics for Alternating Logistic Regressions
Preisser, John S.; By, Kunthel; Perin, Jamie; Qaqish, Bahjat F.
2013-01-01
Deletion diagnostics are introduced for the regression analysis of clustered binary outcomes estimated with alternating logistic regressions, an implementation of generalized estimating equations (GEE) that estimates regression coefficients in a marginal mean model and in a model for the intracluster association given by the log odds ratio. The diagnostics are developed within an estimating equations framework that recasts the estimating functions for association parameters based upon conditional residuals into equivalent functions based upon marginal residuals. Extensions of earlier work on GEE diagnostics follow directly, including computational formulae for one-step deletion diagnostics that measure the influence of a cluster of observations on the estimated regression parameters and on the overall marginal mean or association model fit. The diagnostic formulae are evaluated with simulation studies and with an application concerning an assessment of factors associated with health maintenance visits in primary care medical practices. The application and the simulations demonstrate that the proposed cluster-deletion diagnostics for alternating logistic regressions are good approximations of their exact fully iterated counterparts. PMID:22777960
Correlation and simple linear regression.
Eberly, Lynn E
2007-01-01
This chapter highlights important steps in using correlation and simple linear regression to address scientific questions about the association of two continuous variables with each other. These steps include estimation and inference, assessing model fit, the connection between regression and ANOVA, and study design. Examples in microbiology are used throughout. This chapter provides a framework that is helpful in understanding more complex statistical techniques, such as multiple linear regression, linear mixed effects models, logistic regression, and proportional hazards regression.
Parsaeian, M; Mohammad, K; Mahmoudi, M; Zeraati, H
2012-01-01
Background: The purpose of this investigation was to compare empirically the predictive ability of an artificial neural network with that of a logistic regression in the prediction of low back pain. Methods: Data from the second national health survey were considered in this investigation. These data include information on low back pain and its associated risk factors among Iranian people aged 15 years and older. Artificial neural network and logistic regression models were developed using a set of 17294 records and validated in a test set of 17295 records. The Hosmer and Lemeshow recommendation for model selection was used in fitting the logistic regression. A three-layer perceptron with 9 input, 3 hidden, and 1 output neurons was employed. The efficiency of the two models was compared by receiver operating characteristic analysis, root mean square, and -2 log-likelihood criteria. Results: The area under the ROC curve (SE), root mean square, and -2 log-likelihood of the logistic regression were 0.752 (0.004), 0.3832, and 14769.2, respectively. The area under the ROC curve (SE), root mean square, and -2 log-likelihood of the artificial neural network were 0.754 (0.004), 0.3770, and 14757.6, respectively. Conclusions: Based on these three criteria, the artificial neural network would give better performance than logistic regression. Although the difference is statistically significant, it does not seem to be clinically significant. PMID:23113198
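The three comparison criteria named in this abstract can each be computed from observed 0/1 outcomes and predicted probabilities. The sketch below is illustrative only: it assumes "root mean square" means the root-mean-square error between outcome and predicted probability, and the function names are hypothetical rather than taken from the study.

```python
import math

def auc(y, p):
    """Area under the ROC curve as the share of (positive, negative) pairs
    ranked correctly; ties count one half."""
    pos = [pi for yi, pi in zip(y, p) if yi == 1]
    neg = [pi for yi, pi in zip(y, p) if yi == 0]
    wins = sum(1.0 if a > b else 0.5 if a == b else 0.0 for a in pos for b in neg)
    return wins / (len(pos) * len(neg))

def rmse(y, p):
    """Root-mean-square error between outcomes and predicted probabilities
    (assumed meaning of "root mean square")."""
    return math.sqrt(sum((yi - pi) ** 2 for yi, pi in zip(y, p)) / len(y))

def neg2_loglik(y, p):
    """-2 times the Bernoulli log-likelihood; predictions must lie strictly in (0, 1)."""
    return -2.0 * sum(math.log(pi) if yi == 1 else math.log(1.0 - pi)
                      for yi, pi in zip(y, p))
```

Higher AUC is better, while lower values are better for the other two criteria, which is the direction of the comparison reported in the abstract.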
Fong, Youyi; Yu, Xuesong
2016-01-01
Many modern serial dilution assays are based on fluorescence intensity (FI) readouts. We study optimal transformation model choice for fitting five parameter logistic curves (5PL) to FI-based serial dilution assay data. We first develop a generalized least squares-pseudolikelihood type algorithm for fitting heteroscedastic logistic models. Next we show that the 5PL and log 5PL functions can approximate each other well. We then compare four 5PL models with different choices of log transformation and variance modeling through a Monte Carlo study and real data. Our findings are that the optimal choice depends on the intended use of the fitted curves. PMID:27642502
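For orientation, one commonly used parameterization of the five-parameter logistic curve is sketched below. The abstract does not state which form the authors use, so the parameter names (`a`, `b`, `c`, `d`, `g`) and their roles are assumptions for illustration.

```python
def five_pl(x, a, b, c, d, g):
    """Five-parameter logistic curve in a common (assumed) parameterization.

    a: response as x -> 0, d: response as x -> infinity (for b > 0),
    c: scale of the concentration axis, b: slope, g: asymmetry.
    With g = 1 this reduces to the symmetric four-parameter logistic.
    """
    return d + (a - d) / (1.0 + (x / c) ** b) ** g
```

With `g = 1` the curve passes through the midpoint of `a` and `d` at `x = c`; other values of `g` shift that point, which is what makes the curve asymmetric.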
Modeling health survey data with excessive zero and K responses.
Lin, Ting Hsiang; Tsai, Min-Hsiao
2013-04-30
Zero-inflated Poisson regression is a popular tool used to analyze data with excessive zeros. Although much work has already been performed to fit zero-inflated data, most models heavily depend on special features of the individual data. To be specific, this means that there is a sizable group of respondents who endorse the same answers making the data have peaks. In this paper, we propose a new model with the flexibility to model excessive counts other than zero, and the model is a mixture of multinomial logistic and Poisson regression, in which the multinomial logistic component models the occurrence of excessive counts, including zeros, K (where K is a positive integer) and all other values. The Poisson regression component models the counts that are assumed to follow a Poisson distribution. Two examples are provided to illustrate our models when the data have counts containing many ones and sixes. As a result, the zero-inflated and K-inflated models exhibit a better fit than the zero-inflated Poisson and standard Poisson regressions. Copyright © 2012 John Wiley & Sons, Ltd.
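One way to read the mixture described here is as a probability mass function with extra mass at 0 and at K on top of a Poisson component. The sketch below is an interpretation of the abstract, not the authors' code: the function names and the choice of the Poisson component as the reference category of the multinomial logit are assumptions.

```python
import math

def mixing_probs(eta0, etak):
    """Multinomial-logit mixing probabilities for the 'extra zero' and 'extra K'
    components; the Poisson component is the (assumed) reference category."""
    denom = 1.0 + math.exp(eta0) + math.exp(etak)
    return math.exp(eta0) / denom, math.exp(etak) / denom

def poisson_pmf(y, lam):
    return math.exp(-lam) * lam ** y / math.factorial(y)

def zk_inflated_pmf(y, k, p0, pk, lam):
    """P(Y = y) when a share p0 of responses are structural zeros, a share pk are
    structural K's, and the remaining 1 - p0 - pk follow Poisson(lam)."""
    base = (1.0 - p0 - pk) * poisson_pmf(y, lam)
    if y == 0:
        return p0 + base
    if y == k:
        return pk + base
    return base
```

In a regression setting `eta0`, `etak`, and `log(lam)` would each be linear predictors in the covariates; here they are plain numbers to keep the sketch short.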
Szekér, Szabolcs; Vathy-Fogarassy, Ágnes
2018-01-01
Logistic regression based propensity score matching is a widely used method in case-control studies to select the individuals of the control group. This method creates a suitable control group if all factors affecting the output variable are known. However, if relevant latent variables exist as well, which are not taken into account during the calculations, the quality of the control group is uncertain. In this paper, we present a statistics-based study in which we try to determine the relationship between the accuracy of the logistic regression model and the uncertainty of the dependent variable of the control group defined by propensity score matching. Our analyses show that there is a linear correlation between the fit of the logistic regression model and the uncertainty of the output variable. In certain cases, a latent binary explanatory variable can result in a relative error of up to 70% in the prediction of the outcome variable. The observed phenomenon draws the attention of analysts to an important point, which must be taken into account when drawing conclusions.
No rationale for 1 variable per 10 events criterion for binary logistic regression analysis.
van Smeden, Maarten; de Groot, Joris A H; Moons, Karel G M; Collins, Gary S; Altman, Douglas G; Eijkemans, Marinus J C; Reitsma, Johannes B
2016-11-24
Ten events per variable (EPV) is a widely advocated minimal criterion for sample size considerations in logistic regression analysis. Of three previous simulation studies that examined this minimal EPV criterion only one supports the use of a minimum of 10 EPV. In this paper, we examine the reasons for substantial differences between these extensive simulation studies. The current study uses Monte Carlo simulations to evaluate small sample bias, coverage of confidence intervals and mean square error of logit coefficients. Logistic regression models fitted by maximum likelihood and a modified estimation procedure, known as Firth's correction, are compared. The results show that besides EPV, the problems associated with low EPV depend on other factors such as the total sample size. It is also demonstrated that simulation results can be dominated by even a few simulated data sets for which the prediction of the outcome by the covariates is perfect ('separation'). We reveal that different approaches for identifying and handling separation lead to substantially different simulation results. We further show that Firth's correction can be used to improve the accuracy of regression coefficients and alleviate the problems associated with separation. The current evidence supporting EPV rules for binary logistic regression is weak. Given our findings, there is an urgent need for new research to provide guidance for supporting sample size considerations for binary logistic regression analysis.
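As a small worked illustration of the quantity under discussion, the sketch below computes EPV under one common convention: the size of the smaller outcome group divided by the number of estimated regression parameters. Conventions differ between papers, and the function name and the example counts are hypothetical.

```python
def events_per_variable(n_events, n_nonevents, n_parameters):
    """EPV under an assumed convention: smaller outcome group / number of parameters."""
    if n_parameters <= 0:
        raise ValueError("n_parameters must be positive")
    return min(n_events, n_nonevents) / n_parameters

# Hypothetical data set: 40 events, 160 non-events, 5 candidate predictors.
epv = events_per_variable(40, 160, 5)
```

Here `epv` is 8.0, which would fall below the traditional threshold of 10; the abstract's point is that this single number is a weak guide on its own.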
Mixed conditional logistic regression for habitat selection studies.
Duchesne, Thierry; Fortin, Daniel; Courbin, Nicolas
2010-05-01
1. Resource selection functions (RSFs) are becoming a dominant tool in habitat selection studies. RSF coefficients can be estimated with unconditional (standard) and conditional logistic regressions. While the advantage of mixed-effects models is recognized for standard logistic regression, mixed conditional logistic regression remains largely overlooked in ecological studies. 2. We demonstrate the significance of mixed conditional logistic regression for habitat selection studies. First, we use spatially explicit models to illustrate how mixed-effects RSFs can be useful in the presence of inter-individual heterogeneity in selection and when the assumption of independence from irrelevant alternatives (IIA) is violated. The IIA hypothesis states that the strength of preference for habitat type A over habitat type B does not depend on the other habitat types also available. Secondly, we demonstrate the significance of mixed-effects models to evaluate habitat selection of free-ranging bison Bison bison. 3. When movement rules were homogeneous among individuals and the IIA assumption was respected, fixed-effects RSFs adequately described habitat selection by simulated animals. In situations violating the inter-individual homogeneity and IIA assumptions, however, RSFs were best estimated with mixed-effects regressions, and fixed-effects models could even provide faulty conclusions. 4. Mixed-effects models indicate that bison did not select farmlands, but exhibited strong inter-individual variations in their response to farmlands. Less than half of the bison preferred farmlands over forests. Conversely, the fixed-effect model simply suggested an overall selection for farmlands. 5. Conditional logistic regression is recognized as a powerful approach to evaluate habitat selection when resource availability changes. This regression is increasingly used in ecological studies, but almost exclusively in the context of fixed-effects models. 
Fitness maximization can imply differences in trade-offs among individuals, which can yield inter-individual differences in selection and lead to departure from IIA. These situations are best modelled with mixed-effects models. Mixed-effects conditional logistic regression should become a valuable tool for ecological research.
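The building block of a conditional logistic regression is the probability that the used location is the one chosen within its matched set of available locations. The sketch below shows only that fixed-effects stratum term, under assumed names (`beta`, `used`, `available`); the mixed-effects version discussed in the abstract would additionally let coefficients vary by individual, which is not shown.

```python
import math

def stratum_choice_prob(beta, used, available):
    """Conditional-logit probability of the used location within one stratum.

    beta: list of selection coefficients.
    used: covariate vector of the location actually used.
    available: covariate vectors of all locations in the stratum, including the used one.
    """
    scores = [sum(b * x for b, x in zip(beta, row)) for row in available]
    used_score = sum(b * x for b, x in zip(beta, used))
    m = max(scores)  # subtract the maximum for numerical stability
    denom = sum(math.exp(s - m) for s in scores)
    return math.exp(used_score - m) / denom
```

The conditional likelihood is the product of these terms over strata. Because the term depends only on covariate differences within a stratum, it accommodates availability that changes from one stratum to the next.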
Li, Yi; Tseng, Yufeng J.; Pan, Dahua; Liu, Jianzhong; Kern, Petra S.; Gerberick, G. Frank; Hopfinger, Anton J.
2008-01-01
Currently, the only validated methods to identify skin sensitization effects are in vivo models, such as the Local Lymph Node Assay (LLNA) and guinea pig studies. There is a tremendous need, in particular due to novel legislation, to develop animal alternatives, e.g., Quantitative Structure-Activity Relationship (QSAR) models. Here, QSAR models for skin sensitization using LLNA data have been constructed. The descriptors used to generate these models are derived from the 4D-molecular similarity paradigm and are referred to as universal 4D-fingerprints. A training set of 132 structurally diverse compounds and a test set of 15 structurally diverse compounds were used in this study. The statistical methodologies used to build the models are logistic regression (LR) and partial least squares coupled logistic regression (PLS-LR), which prove to be effective tools for studying skin sensitization measures expressed in the two categorical terms of sensitizer and non-sensitizer. QSAR models with low values of the Hosmer-Lemeshow goodness-of-fit statistic, $\chi^2_{HL}$, are significant and predictive. For the training set, the cross-validated prediction accuracy of the logistic regression models ranges from 77.3% to 78.0%, while that of PLS-logistic regression models ranges from 87.1% to 89.4%. For the test set, the prediction accuracy of logistic regression models ranges from 80.0% to 86.7%, while that of PLS-logistic regression models ranges from 73.3% to 80.0%. The QSAR models are made up of 4D-fingerprints related to aromatic atoms, hydrogen bond acceptors and negatively partially charged atoms. PMID:17226934
Evolution of the Marine Officer Fitness Report: A Multivariate Analysis
This thesis explores the evaluation behavior of United States Marine Corps (USMC) Reporting Seniors (RSs) from 2010 to 2017. Using fitness report...RSs evaluate the performance of subordinate active component unrestricted officer MROs over time. I estimate logistic regression models of the...lowest. However, these correlations indicating the effects of race matching on FITREP evaluations narrow in significance when performance-based factors
Black, L E; Brion, G M; Freitas, S J
2007-06-01
Predicting the presence of enteric viruses in surface waters is a complex modeling problem. Multiple water quality parameters that indicate the presence of human fecal material, the load of fecal material, and the amount of time fecal material has been in the environment are needed. This paper presents the results of a multiyear study of raw-water quality at the inlet of a potable-water plant that related 17 physical, chemical, and biological indices to the presence of enteric viruses as indicated by cytopathic changes in cell cultures. It was found that several simple, multivariate logistic regression models that could reliably identify observations of the presence or absence of total culturable virus could be fitted. The best models developed combined a fecal age indicator (the atypical coliform [AC]/total coliform [TC] ratio), the detectable presence of a human-associated sterol (epicoprostanol) to indicate the fecal source, and one of several fecal load indicators (the levels of Giardia species cysts, coliform bacteria, and coprostanol). The best fit to the data was found when the AC/TC ratio, the presence of epicoprostanol, and the density of fecal coliform bacteria were input into a simple, multivariate logistic regression equation, resulting in 84.5% and 78.6% accuracies for the identification of the presence and absence of total culturable virus, respectively. The AC/TC ratio was the most influential input variable in all of the models generated, but producing the best prediction required additional input related to the fecal source and the fecal load. The potential for replacing microbial indicators of fecal load with levels of coprostanol was proposed and evaluated by multivariate logistic regression modeling for the presence and absence of virus.
Bond, H S; Sullivan, S G; Cowling, B J
2016-06-01
Influenza vaccination is the most practical means available for preventing influenza virus infection and is widely used in many countries. Because vaccine components and circulating strains frequently change, it is important to continually monitor vaccine effectiveness (VE). The test-negative design is frequently used to estimate VE. In this design, patients meeting the same clinical case definition are recruited and tested for influenza; those who test positive are the cases and those who test negative form the comparison group. When determining VE in these studies, the typical approach has been to use logistic regression, adjusting for potential confounders. Because vaccine coverage and influenza incidence change throughout the season, time is included among these confounders. While most studies use unconditional logistic regression, adjusting for time, an alternative approach is to use conditional logistic regression, matching on time. Here, we used simulation data to examine the potential for both regression approaches to permit accurate and robust estimates of VE. In situations where vaccine coverage changed during the influenza season, the conditional model and unconditional models adjusting for categorical week and using a spline function for week provided more accurate estimates. We illustrated the two approaches on data from a test-negative study of influenza VE against hospitalization in children in Hong Kong which resulted in the conditional logistic regression model providing the best fit to the data.
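In a test-negative design, VE is conventionally reported as one minus the odds ratio of vaccination among test-positive versus test-negative patients. The sketch below shows only the crude, unadjusted version from a 2x2 table; the studies described here adjust for confounders and time through logistic regression, and the counts are hypothetical.

```python
def crude_vaccine_effectiveness(vacc_cases, unvacc_cases, vacc_controls, unvacc_controls):
    """Crude VE = 1 - OR, where OR is the odds of vaccination in cases (test-positive)
    divided by the odds of vaccination in controls (test-negative)."""
    odds_ratio = (vacc_cases * unvacc_controls) / (unvacc_cases * vacc_controls)
    return 1.0 - odds_ratio

# Hypothetical counts: 20 of 100 cases vaccinated, 50 of 100 controls vaccinated.
ve = crude_vaccine_effectiveness(20, 80, 50, 50)
```

With these made-up counts the odds ratio is 0.25, so `ve` is 0.75. In the regression setting the odds ratio would instead be the exponentiated vaccination coefficient from the conditional or unconditional model.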
Use and interpretation of logistic regression in habitat-selection studies
Keating, Kim A.; Cherry, Steve
2004-01-01
Logistic regression is an important tool for wildlife habitat-selection studies, but the method frequently has been misapplied due to an inadequate understanding of the logistic model, its interpretation, and the influence of sampling design. To promote better use of this method, we review its application and interpretation under 3 sampling designs: random, case-control, and use-availability. Logistic regression is appropriate for habitat use-nonuse studies employing random sampling and can be used to directly model the conditional probability of use in such cases. Logistic regression also is appropriate for studies employing case-control sampling designs, but careful attention is required to interpret results correctly. Unless bias can be estimated or probability of use is small for all habitats, results of case-control studies should be interpreted as odds ratios, rather than probability of use or relative probability of use. When data are gathered under a use-availability design, logistic regression can be used to estimate approximate odds ratios if probability of use is small, at least on average. More generally, however, logistic regression is inappropriate for modeling habitat selection in use-availability studies. In particular, using logistic regression to fit the exponential model of Manly et al. (2002:100) does not guarantee maximum-likelihood estimates, valid probabilities, or valid likelihoods. We show that the resource selection function (RSF) commonly used for the exponential model is proportional to a logistic discriminant function. Thus, it may be used to rank habitats with respect to probability of use and to identify important habitat characteristics or their surrogates, but it is not guaranteed to be proportional to probability of use. Other problems associated with the exponential model also are discussed. 
We describe an alternative model based on Lancaster and Imbens (1996) that offers a method for estimating conditional probability of use in use-availability studies. Although promising, this model fails to converge to a unique solution in some important situations. Further work is needed to obtain a robust method that is broadly applicable to use-availability studies.
Gruber, Susan; Logan, Roger W; Jarrín, Inmaculada; Monge, Susana; Hernán, Miguel A
2015-01-15
Inverse probability weights used to fit marginal structural models are typically estimated using logistic regression. However, a data-adaptive procedure may be able to better exploit information available in measured covariates. By combining predictions from multiple algorithms, ensemble learning offers an alternative to logistic regression modeling to further reduce bias in estimated marginal structural model parameters. We describe the application of two ensemble learning approaches to estimating stabilized weights: super learning (SL), an ensemble machine learning approach that relies on V-fold cross validation, and an ensemble learner (EL) that creates a single partition of the data into training and validation sets. Longitudinal data from two multicenter cohort studies in Spain (CoRIS and CoRIS-MD) were analyzed to estimate the mortality hazard ratio for initiation versus no initiation of combined antiretroviral therapy among HIV positive subjects. Both ensemble approaches produced hazard ratio estimates further away from the null, and with tighter confidence intervals, than logistic regression modeling. Computation time for EL was less than half that of SL. We conclude that ensemble learning using a library of diverse candidate algorithms offers an alternative to parametric modeling of inverse probability weights when fitting marginal structural models. With large datasets, EL provides a rich search over the solution space in less time than SL with comparable results. Copyright © 2014 John Wiley & Sons, Ltd.
Gruber, Susan; Logan, Roger W.; Jarrín, Inmaculada; Monge, Susana; Hernán, Miguel A.
2014-01-01
Inverse probability weights used to fit marginal structural models are typically estimated using logistic regression. However, a data-adaptive procedure may be able to better exploit information available in measured covariates. By combining predictions from multiple algorithms, ensemble learning offers an alternative to logistic regression modeling to further reduce bias in estimated marginal structural model parameters. We describe the application of two ensemble learning approaches to estimating stabilized weights: super learning (SL), an ensemble machine learning approach that relies on V-fold cross validation, and an ensemble learner (EL) that creates a single partition of the data into training and validation sets. Longitudinal data from two multicenter cohort studies in Spain (CoRIS and CoRIS-MD) were analyzed to estimate the mortality hazard ratio for initiation versus no initiation of combined antiretroviral therapy among HIV positive subjects. Both ensemble approaches produced hazard ratio estimates further away from the null, and with tighter confidence intervals, than logistic regression modeling. Computation time for EL was less than half that of SL. We conclude that ensemble learning using a library of diverse candidate algorithms offers an alternative to parametric modeling of inverse probability weights when fitting marginal structural models. With large datasets, EL provides a rich search over the solution space in less time than SL with comparable results. PMID:25316152
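Whatever method supplies the predicted treatment probabilities (logistic regression, SL, or EL), the stabilized weight is formed in the same way: the probability of the observed treatment given a reduced set of predictors divided by its probability given the full covariate history, multiplied across visits. The sketch below uses assumed names and takes the predicted probabilities as given inputs.

```python
import math

def stabilized_weight(treated, p_numerator, p_denominator):
    """One visit's contribution to a stabilized weight.

    p_numerator: P(treated) from the reduced (numerator) model.
    p_denominator: P(treated | covariate history) from the full (denominator) model.
    """
    num = p_numerator if treated else 1.0 - p_numerator
    den = p_denominator if treated else 1.0 - p_denominator
    return num / den

def cumulative_stabilized_weight(visits):
    """visits: iterable of (treated, p_numerator, p_denominator), one tuple per visit."""
    return math.prod(stabilized_weight(t, pn, pd) for t, pn, pd in visits)
```

Poorly estimated denominator probabilities close to 0 or 1 produce extreme weights, which is one practical reason the quality of the prediction model matters.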
Alghamdi, Manal; Al-Mallah, Mouaz; Keteyian, Steven; Brawner, Clinton; Ehrman, Jonathan; Sakr, Sherif
2017-01-01
Machine learning is becoming a popular and important approach in the field of medical research. In this study, we investigate the relative performance of various machine learning methods such as Decision Tree, Naïve Bayes, Logistic Regression, Logistic Model Tree and Random Forests for predicting incident diabetes using medical records of cardiorespiratory fitness. In addition, we apply different techniques to uncover potential predictors of diabetes. This FIT project study used data of 32,555 patients who were free of any known coronary artery disease or heart failure, underwent clinician-referred exercise treadmill stress testing at Henry Ford Health Systems between 1991 and 2009, and had a complete 5-year follow-up. At the completion of the fifth year, 5,099 of those patients had developed diabetes. The dataset contained 62 attributes classified into four categories: demographic characteristics, disease history, medication use history, and stress test vital signs. We developed an Ensembling-based predictive model using 13 attributes that were selected based on their clinical importance, Multiple Linear Regression, and Information Gain Ranking methods. The negative effect of class imbalance on the constructed model was handled by the Synthetic Minority Oversampling Technique (SMOTE). The overall performance of the predictive model classifier was improved by the Ensemble machine learning approach using the Vote method with three Decision Trees (Naïve Bayes Tree, Random Forest, and Logistic Model Tree) and achieved high accuracy of prediction (AUC = 0.92). The study shows the potential of ensembling and SMOTE approaches for predicting incident diabetes using cardiorespiratory fitness data.
van Turenhout, Sietze T; Oort, Frank A; Terhaar sive Droste, Jochim S; Coupé, Veerle M H; van der Hulst, Rene W; Loffeld, Ruud J; Scholten, Pieter; Depla, Annekatrien C T M; Bouman, Anneke A; Meijer, Gerrit A; Mulder, Chris J J; van Rossum, Leo G M
2012-07-01
Colorectal cancer screening by fecal immunochemical tests (FITs) is hampered by frequent false-positive (FP) results and thereby the risk of complications and strain on colonoscopy capacity. Hemorrhoids might be a plausible cause of FP results. To determine the contribution of hemorrhoids to the frequency of FP FIT results. Retrospective analysis from prospective cohort study. Five large teaching hospitals, including 1 academic hospital. All subjects scheduled for elective colonoscopy. FIT before bowel preparation. Frequency of FP FIT results in subjects with hemorrhoids as the only relevant abnormality compared with FP FIT results in subjects with no relevant abnormalities. Logistic regression analysis to determine colonic abnormalities influencing FP results. In 2855 patients, 434 had positive FIT results: 213 had advanced neoplasia and 221 had FP results. In 9 individuals (4.1%; 95% CI, 1.4-6.8) with an FP FIT result, hemorrhoids were the only abnormality. In univariate unadjusted analysis, subjects with hemorrhoids as the only abnormality did not have more positive results (9/134; 6.7%) compared with subjects without any abnormalities (43/886; 4.9%; P = .396). Logistic regression identified hemorrhoids, nonadvanced polyps, and a group of miscellaneous abnormalities, all significantly influencing false positivity. Of 1000 subjects with hemorrhoids, 67 would have FP results, of whom 18 would have FP results because of hemorrhoids only. Potential underreporting of hemorrhoids; high-risk individuals. Hemorrhoids in individuals participating in colorectal cancer screening will probably not lead to a substantial number of false-positive test results. Copyright © 2012 American Society for Gastrointestinal Endoscopy. Published by Mosby, Inc. All rights reserved.
Intermediate and advanced topics in multilevel logistic regression analysis
Austin, Peter C.; Merlo, Juan
2017-01-01
Multilevel data occur frequently in health services, population and public health, and epidemiologic research. In such research, binary outcomes are common. Multilevel logistic regression models allow one to account for the clustering of subjects within clusters of higher-level units when estimating the effect of subject and cluster characteristics on subject outcomes. A search of the PubMed database demonstrated that the use of multilevel or hierarchical regression models is increasing rapidly. However, our impression is that many analysts simply use multilevel regression models to account for the nuisance of within-cluster homogeneity that is induced by clustering. In this article, we describe a suite of analyses that can complement the fitting of multilevel logistic regression models. These ancillary analyses permit analysts to estimate the marginal or population-average effect of covariates measured at the subject and cluster level, in contrast to the within-cluster or cluster-specific effects arising from the original multilevel logistic regression model. We describe the interval odds ratio and the proportion of opposed odds ratios, which are summary measures of effect for cluster-level covariates. We describe the variance partition coefficient and the median odds ratio, which are measures of components of variance and heterogeneity in outcomes. These measures allow one to quantify the magnitude of the general contextual effect. We describe an R² measure that allows analysts to quantify the proportion of variation explained by different multilevel logistic regression models. We illustrate the application and interpretation of these measures by analyzing mortality in patients hospitalized with a diagnosis of acute myocardial infarction. © 2017 The Authors. Statistics in Medicine published by John Wiley & Sons Ltd. PMID:28543517
Intermediate and advanced topics in multilevel logistic regression analysis.
Austin, Peter C; Merlo, Juan
2017-09-10
Multilevel data occur frequently in health services, population and public health, and epidemiologic research. In such research, binary outcomes are common. Multilevel logistic regression models allow one to account for the clustering of subjects within clusters of higher-level units when estimating the effect of subject and cluster characteristics on subject outcomes. A search of the PubMed database demonstrated that the use of multilevel or hierarchical regression models is increasing rapidly. However, our impression is that many analysts simply use multilevel regression models to account for the nuisance of within-cluster homogeneity that is induced by clustering. In this article, we describe a suite of analyses that can complement the fitting of multilevel logistic regression models. These ancillary analyses permit analysts to estimate the marginal or population-average effect of covariates measured at the subject and cluster level, in contrast to the within-cluster or cluster-specific effects arising from the original multilevel logistic regression model. We describe the interval odds ratio and the proportion of opposed odds ratios, which are summary measures of effect for cluster-level covariates. We describe the variance partition coefficient and the median odds ratio, which are measures of components of variance and heterogeneity in outcomes. These measures allow one to quantify the magnitude of the general contextual effect. We describe an R² measure that allows analysts to quantify the proportion of variation explained by different multilevel logistic regression models. We illustrate the application and interpretation of these measures by analyzing mortality in patients hospitalized with a diagnosis of acute myocardial infarction. © 2017 The Authors. Statistics in Medicine published by John Wiley & Sons Ltd.
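Two of the measures named in this abstract have simple closed forms for a random-intercept logistic model. The sketch below uses the usual formulas, with the latent-variable formulation for the variance partition coefficient; `var_u` denotes the estimated between-cluster variance of the random intercept, and the function names are illustrative.

```python
import math
from statistics import NormalDist

def median_odds_ratio(var_u):
    """MOR = exp( sqrt(2 * var_u) * z_0.75 ), with z_0.75 the 75th percentile
    of the standard normal distribution (about 0.6745)."""
    return math.exp(math.sqrt(2.0 * var_u) * NormalDist().inv_cdf(0.75))

def variance_partition_coefficient(var_u):
    """VPC on the latent scale: var_u / (var_u + pi^2 / 3), where pi^2 / 3 is the
    variance of the standard logistic distribution."""
    return var_u / (var_u + math.pi ** 2 / 3.0)
```

A MOR of 1 means no between-cluster heterogeneity. For a between-cluster variance of 1 on the logit scale the MOR is roughly 2.6, meaning the median odds ratio between two otherwise identical subjects from two randomly chosen clusters, comparing the higher-risk cluster with the lower-risk one, is about 2.6.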
Multivariate prediction of upper limb prosthesis acceptance or rejection.
Biddiss, Elaine A; Chau, Tom T
2008-07-01
To develop a model for prediction of upper limb prosthesis use or rejection. A questionnaire exploring factors in prosthesis acceptance was distributed internationally to individuals with upper limb absence through community-based support groups and rehabilitation hospitals. A total of 191 participants (59 prosthesis rejecters and 132 prosthesis wearers) were included in this study. A logistic regression model, a C5.0 decision tree, and a radial basis function neural network were developed and compared in terms of sensitivity (prediction of prosthesis rejecters), specificity (prediction of prosthesis wearers), and overall cross-validation accuracy. The logistic regression and neural network provided comparable overall accuracies of approximately 84 +/- 3%, specificity of 93%, and sensitivity of 61%. Fitting time-frame emerged as the predominant predictor. Individuals fitted within two years of birth (congenital) or six months of amputation (acquired) were 16 times more likely to continue prosthesis use. To increase rates of prosthesis acceptance, clinical directives should focus on timely, client-centred fitting strategies and the development of improved prostheses and healthcare for individuals with high-level or bilateral limb absence. Multivariate analyses are useful in determining the relative importance of the many factors involved in prosthesis acceptance and rejection.
The 6-min push test is reliable and predicts low fitness in spinal cord injury.
Cowan, Rachel E; Callahan, Morgan K; Nash, Mark S
2012-10-01
The objective of this study is to assess 6-min push test (6MPT) reliability, determine whether the 6MPT is sensitive to fitness differences, and assess if 6MPT distance predicts fitness level in persons with spinal cord injury (SCI) or disease. Forty individuals with SCI who could self-propel a manual wheelchair completed an incremental arm crank peak oxygen consumption assessment and two 6MPTs across 3 d (37% tetraplegia (TP), 63% paraplegia (PP), 85% men, 70% white, 63% Hispanic, mean age = 34 ± 10 yr, mean duration of injury = 13 ± 10 yr, and mean body mass index = 24 ± 5 kg/m²). Intraclass correlation and Bland-Altman plots assessed 6MPT distance (m) reliability. Mann-Whitney U test compared 6MPT distance (m) of high and low fitness groups for TP and PP. The fitness status prediction was developed using N = 30 and validated in N = 10 (validation group (VG)). A nonstatistical prediction approach, below or above a threshold distance (TP = 445 m and PP = 604 m), was validated statistically by binomial logistic regression. Accuracy, sensitivity, and specificity were computed to evaluate the threshold approach. Intraclass correlation coefficients exceeded 0.90 for the whole sample and the TP/PP subsets. High fitness persons propelled farther than low fitness persons for both TP/PP (both P < 0.05). Binomial logistic regression (P < 0.008) predicted the same fitness levels in the VG as the threshold approach. In the VG, overall accuracy was 70%. Eighty-six percent of low fitness persons were correctly identified (sensitivity), and 33% of high fitness persons were correctly identified (specificity). The 6MPT may be a useful tool for SCI clinicians and researchers. 6MPT distance demonstrates excellent reliability and is sensitive to differences in fitness level. Classifying 6MPT distances below a threshold distance may be an effective approach to identify low fitness in persons with SCI.
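The accuracy, sensitivity, and specificity figures reported for the validation group can be reproduced with a short sketch. The abstract gives only percentages, so the cell counts below (6 of 7 low fitness and 1 of 3 high fitness persons correctly identified) are an assumption: they are one set of counts that sums to N = 10 and agrees with the reported 86%, 33%, and 70%. "Positive" here means low fitness.

```python
def classification_metrics(tp: int, fn: int, tn: int, fp: int) -> dict:
    """Accuracy, sensitivity and specificity from confusion-matrix counts."""
    total = tp + fn + tn + fp
    return {
        "accuracy": (tp + tn) / total,
        "sensitivity": tp / (tp + fn),  # low fitness persons correctly identified
        "specificity": tn / (tn + fp),  # high fitness persons correctly identified
    }


if __name__ == "__main__":
    # Hypothetical counts inferred from the reported percentages (N = 10).
    metrics = classification_metrics(tp=6, fn=1, tn=1, fp=2)
    for name, value in metrics.items():
        print(f"{name}: {value:.1%}")
```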
Advanced Statistics for Exotic Animal Practitioners.
Hodsoll, John; Hellier, Jennifer M; Ryan, Elizabeth G
2017-09-01
Correlation and regression assess the association between 2 or more variables. This article reviews the core knowledge needed to understand these analyses, moving from visual analysis in scatter plots through correlation, simple and multiple linear regression, and logistic regression. Correlation estimates the strength and direction of a relationship between 2 variables. Regression can be considered more general and quantifies the numerical relationships between an outcome and 1 or multiple variables in terms of a best-fit line, allowing predictions to be made. Each technique is discussed with examples and the statistical assumptions underlying their correct application. Copyright © 2017 Elsevier Inc. All rights reserved.
Markgraf, Rainer; Deutschinoff, Gerd; Pientka, Ludger; Scholten, Theo; Lorenz, Cristoph
2001-01-01
Background: Mortality predictions calculated using scoring scales are often not accurate in populations other than those in which the scales were developed because of differences in case-mix. The present study investigates the effect of first-level customization, using a logistic regression technique, on discrimination and calibration of the Acute Physiology and Chronic Health Evaluation (APACHE) II and III scales. Method: Probabilities of hospital death for patients were estimated by applying APACHE II and III and comparing these with observed outcomes. Using the split sample technique, a customized model to predict outcome was developed by logistic regression. The overall goodness-of-fit of the original and the customized models was assessed. Results: Of 3383 consecutive intensive care unit (ICU) admissions over 3 years, 2795 patients could be analyzed, and were split randomly into development and validation samples. The discriminative powers of APACHE II and III were unchanged by customization (areas under the receiver operating characteristic [ROC] curve 0.82 and 0.85, respectively). Hosmer-Lemeshow goodness-of-fit tests showed good calibration for APACHE II, but insufficient calibration for APACHE III. Customization improved calibration for both models, with a good fit for APACHE III as well. However, fit was different for various subgroups. Conclusions: The overall goodness-of-fit of APACHE III mortality prediction was improved significantly by customization, but uniformity of fit in different subgroups was not achieved. Therefore, application of the customized model provides no advantage, because differences in case-mix still limit comparisons of quality of care. PMID:11178223
Calibration power of the Braden scale in predicting pressure ulcer development.
Chen, Hong-Lin; Cao, Ying-Juan; Wang, Jing; Huai, Bao-Sha
2016-11-02
Calibration is the degree of correspondence between the estimated probability produced by a model and the actual observed probability. The aim of this study was to investigate the calibration power of the Braden scale in predicting pressure ulcer (PU) development. A retrospective analysis was performed among consecutive patients in 2013. The patients were separated into a training group and a validation group. The predicted incidence was calculated using a logistic regression model in the training group and the Hosmer-Lemeshow test was used for assessing the goodness of fit. In the validation group, the observed and the predicted incidence were compared by the chi-square (χ²) goodness-of-fit test for calibration power. We included 2585 patients in the study, of whom 78 (3.0%) developed a PU. Differences in patient characteristics between the training and validation groups were non-significant (p>0.05). In the training group, the logistic regression model for predicting pressure ulcer was Logit(P) = -0.433*Braden score+2.616. The Hosmer-Lemeshow test indicated poor goodness of fit (χ²=13.472; p=0.019). In the validation group, the predicted pressure ulcer incidence also did not fit well with the observed incidence (χ²=42.154, p=0.000 by Braden scores; and χ²=17.223, p=0.001 by Braden scale risk classification). The Braden scale has low calibration power in predicting PU formation.
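As an illustration of how the fitted equation Logit(P) = -0.433*Braden score + 2.616 maps a score to a predicted probability, the following sketch applies the inverse logit. The coefficients are taken from the abstract; the function name and the example scores are assumptions chosen for illustration, and given the poor calibration reported above the outputs should not be read as reliable risk estimates.

```python
import math

# Coefficients as reported in the abstract.
SLOPE = -0.433
INTERCEPT = 2.616


def predicted_pu_probability(braden_score: float) -> float:
    """Inverse logit of the reported linear predictor."""
    logit = SLOPE * braden_score + INTERCEPT
    return 1.0 / (1.0 + math.exp(-logit))


if __name__ == "__main__":
    # Example scores spanning the Braden scale's usual 6 to 23 range.
    for score in (6, 12, 18, 23):
        print(f"Braden {score:2d}: P = {predicted_pu_probability(score):.4f}")
    # Each additional Braden point multiplies the odds by exp(SLOPE).
    print(f"odds ratio per point: {math.exp(SLOPE):.3f}")
```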
NASA Astrophysics Data System (ADS)
Bradshaw, Tyler; Fu, Rau; Bowen, Stephen; Zhu, Jun; Forrest, Lisa; Jeraj, Robert
2015-07-01
Dose painting relies on the ability of functional imaging to identify resistant tumor subvolumes to be targeted for additional boosting. This work assessed the ability of FDG, FLT, and Cu-ATSM PET imaging to predict the locations of residual FDG PET in canine tumors following radiotherapy. Nineteen canines with spontaneous sinonasal tumors underwent PET/CT imaging with radiotracers FDG, FLT, and Cu-ATSM prior to hypofractionated radiotherapy. Therapy consisted of 10 fractions of 4.2 Gy to the sinonasal cavity with or without an integrated boost of 0.8 Gy to the GTV. Patients had an additional FLT PET/CT scan after fraction 2, a Cu-ATSM PET/CT scan after fraction 3, and follow-up FDG PET/CT scans after radiotherapy. Following image registration, simple and multiple linear and logistic voxel regressions were performed to assess how well pre- and mid-treatment PET imaging predicted post-treatment FDG uptake. R² and pseudo R² were used to assess goodness of fit. For simple linear regression models, regression coefficients for all pre- and mid-treatment PET images were significantly positive across the population (P < 0.05). However, there was large variability among patients in goodness of fit: R² ranged from 0.00 to 0.85, with a median of 0.12. Results for logistic regression models were similar. Multiple linear regression models resulted in better fits (median R² = 0.31), but there was still large variability between patients in R². The R² values from regression models for different predictor variables were highly correlated across patients (R ≈ 0.8), indicating that tumors that were poorly predicted with one tracer were also poorly predicted by other tracers. In conclusion, the high inter-patient variability in goodness of fit indicates that PET was able to predict locations of residual tumor in some patients, but not others. This suggests not all patients would be good candidates for dose painting based on a single biological target.
An examination of constraints to wilderness visitation
Gary T. Green; J. Michael Bowker; Cassandra Y. Johnson; H. Ken Cordell; Xiongfei Wang
2007-01-01
Certain social groups appear notably less in wilderness visitation surveys than their population proportion. This study examines whether different social groups in American society (minorities, women, rural dwellers, low income and less educated populations) perceive more constraints to wilderness visitation than other groups. Logistic regressions were fit to data from...
Optimizing Treatment of Lung Cancer Patients with Comorbidities
2017-10-01
of treatment options, comorbid illness, age, sex , histology, and tumor size. We will simulate base case scenarios for stage I NSCLC for all possible...fitting adjusted logistic regression models controlling for age, sex and cancer stage. Results Overall, 5,644 (80.4%) and 1,377 (19.6%) patients
Epidemiology of Injuries Associated With Physical Training Among Young Men in the Army
1993-01-01
ratio (AOR), which was generated from "back- Achilles tendonitis. and patellofemoral syndrome. stepping" multiple logistic regression output (BMDP...categories of physical 6.3% ankle sprains, 5.9% overuse knee injuries, such activity and components of physical fitness and to as patellofemoral
A nonparametric multiple imputation approach for missing categorical data.
Zhou, Muhan; He, Yulei; Yu, Mandi; Hsu, Chiu-Hsieh
2017-06-06
Incomplete categorical variables with more than two categories are common in public health data. However, most of the existing missing-data methods do not use the information from nonresponse (missingness) probabilities. We propose a nearest-neighbour multiple imputation approach to impute a missing at random categorical outcome and to estimate the proportion of each category. The donor set for imputation is formed by measuring distances between each missing value with other non-missing values. The distance function is calculated based on a predictive score, which is derived from two working models: one fits a multinomial logistic regression for predicting the missing categorical outcome (the outcome model) and the other fits a logistic regression for predicting missingness probabilities (the missingness model). A weighting scheme is used to accommodate contributions from two working models when generating the predictive score. A missing value is imputed by randomly selecting one of the non-missing values with the smallest distances. We conduct a simulation to evaluate the performance of the proposed method and compare it with several alternative methods. A real-data application is also presented. The simulation study suggests that the proposed method performs well when missingness probabilities are not extreme under some misspecifications of the working models. However, the calibration estimator, which is also based on two working models, can be highly unstable when missingness probabilities for some observations are extremely high. In this scenario, the proposed method produces more stable and better estimates. In addition, proper weights need to be chosen to balance the contributions from the two working models and achieve optimal results for the proposed method. 
We conclude that the proposed multiple imputation method is a reasonable approach to dealing with missing categorical outcome data with more than two levels for assessing the distribution of the outcome. In terms of the choices for the working models, we suggest a multinomial logistic regression for predicting the missing outcome and a binary logistic regression for predicting the missingness probability.
Ertas, Gokhan
2018-07-01
To assess the value of joint evaluation of diffusion tensor imaging (DTI) measures by using logistic regression modelling to detect high Gleason score (GS) risk group prostate tumors. Fifty tumors imaged using DTI on a 3 T MRI device were analyzed. Regions of interest focusing on the center of tumor foci and noncancerous tissue on the maps of mean diffusivity (MD) and fractional anisotropy (FA) were used to extract the minimum, the maximum and the mean measures. A measure ratio was computed by dividing the tumor measure by the noncancerous tissue measure. Logistic regression models were fitted for all possible pair combinations of the measures using 5-fold cross validation. Systematic differences are present for all MD measures and also for all FA measures in distinguishing the high risk tumors [GS ≥ 7(4 + 3)] from the low risk tumors [GS ≤ 7(3 + 4)] (P < 0.05). Smaller values for MD measures and larger values for FA measures indicate high risk. The models using the measures achieve good fits and good classification performance (adjusted R² = 0.55-0.60, AUC = 0.88-0.91); however, the models using the measure ratios perform better (adjusted R² = 0.59-0.75, AUC = 0.88-0.95). The model that employs the ratios of minimum MD and maximum FA achieves the highest sensitivity, specificity and accuracy (Se = 77.8%, Sp = 96.9% and Acc = 90.0%). Joint evaluation of MD and FA diffusion tensor imaging measures is valuable for detecting high GS risk group peripheral zone prostate tumors. However, use of the ratios of the measures improves the accuracy of the detections substantially. Logistic regression modelling provides a favorable solution for the joint evaluations that is easily adoptable in clinical practice. Copyright © 2018 Elsevier Inc. All rights reserved.
Comparison of Survival Models for Analyzing Prognostic Factors in Gastric Cancer Patients
Habibi, Danial; Rafiei, Mohammad; Chehrei, Ali; Shayan, Zahra; Tafaqodi, Soheil
2018-03-27
Objective: There are a number of models for determining risk factors for survival of patients with gastric cancer. This study was conducted to select the model showing the best fit with available data. Methods: Cox regression and parametric models (Exponential, Weibull, Gompertz, Log normal, Log logistic and Generalized Gamma) were utilized in unadjusted and adjusted forms to detect factors influencing mortality of patients. Comparisons were made with the Akaike Information Criterion (AIC) using STATA 13 and R 3.1.3 software. Results: The results of this study indicated that all parametric models outperform the Cox regression model. The Log normal, Log logistic and Generalized Gamma provided the best performance in terms of AIC values (179.2, 179.4 and 181.1, respectively). On unadjusted analysis, the results of the Cox regression and parametric models indicated stage, grade, largest diameter of metastatic nest, largest diameter of LM, number of involved lymph nodes and the largest ratio of metastatic nests to lymph nodes to be variables influencing the survival of patients with gastric cancer. On adjusted analysis, according to the best model (log normal), grade was found to be the significant variable. Conclusion: The results suggested that all parametric models outperform the Cox model. The log normal model provides the best fit and is a good substitute for Cox regression.
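To make the AIC comparison concrete, the sketch below computes AIC from a log-likelihood and parameter count, and then the relative likelihood of each model against the best one, exp((AIC_min - AIC_i)/2). The three AIC values come from the abstract; the log-likelihood and parameter count used in the first call are hypothetical, and the relative-likelihood step is an added illustration that the study itself does not report.

```python
import math


def aic(log_likelihood: float, n_params: int) -> float:
    """Akaike Information Criterion: 2k - 2 ln(L)."""
    return 2 * n_params - 2 * log_likelihood


def relative_likelihoods(aic_values: dict) -> dict:
    """Relative likelihood of each model versus the minimum-AIC model."""
    best = min(aic_values.values())
    return {name: math.exp((best - value) / 2) for name, value in aic_values.items()}


if __name__ == "__main__":
    # Hypothetical fit: log-likelihood -85.0 with 3 estimated parameters.
    print("example AIC:", aic(-85.0, 3))

    # AIC values as reported in the abstract.
    reported = {"log normal": 179.2, "log logistic": 179.4, "generalized gamma": 181.1}
    for name, rel in relative_likelihoods(reported).items():
        print(f"{name}: relative likelihood = {rel:.3f}")
```

Differences as small as 0.2 AIC units correspond to relative likelihoods near 1, so the top two models are hard to tell apart on this criterion alone.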
Austin, Peter C
2010-04-22
Multilevel logistic regression models are increasingly being used to analyze clustered data in medical, public health, epidemiological, and educational research. Procedures for estimating the parameters of such models are available in many statistical software packages. There is currently little evidence on the minimum number of clusters necessary to reliably fit multilevel regression models. We conducted a Monte Carlo study to compare the performance of different statistical software procedures for estimating multilevel logistic regression models when the number of clusters was low. We examined procedures available in BUGS, HLM, R, SAS, and Stata. We found that there were qualitative differences in the performance of different software procedures for estimating multilevel logistic models when the number of clusters was low. Among the likelihood-based procedures, estimation methods based on adaptive Gauss-Hermite approximations to the likelihood (glmer in R and xtlogit in Stata) or adaptive Gaussian quadrature (Proc NLMIXED in SAS) tended to have superior performance for estimating variance components when the number of clusters was small, compared to software procedures based on penalized quasi-likelihood. However, only Bayesian estimation with BUGS allowed for accurate estimation of variance components when there were fewer than 10 clusters. For all statistical software procedures, estimation of variance components tended to be poor when there were only five subjects per cluster, regardless of the number of clusters.
Product unit neural network models for predicting the growth limits of Listeria monocytogenes.
Valero, A; Hervás, C; García-Gimeno, R M; Zurera, G
2007-08-01
A new approach to predict the growth/no growth interface of Listeria monocytogenes as a function of storage temperature, pH, citric acid (CA) and ascorbic acid (AA) is presented. A linear logistic regression procedure was performed and a non-linear model was obtained by adding new variables by means of a Neural Network model based on Product Units (PUNN). The classification efficiency on the training data set and the generalization data of the new Logistic Regression PUNN model (LRPU) were compared with Linear Logistic Regression (LLR) and Polynomial Logistic Regression (PLR) models. The LRPU model correctly classified 92% of the total cases, an improvement on the percentage obtained using the PLR model (90%) and significantly higher than the result obtained with the LLR model (80%). In addition, the predictions of LRPU were closer to the observed data, which permits the design of proper formulations in minimally processed foods. This novel methodology can be applied in predictive microbiology to describe the growth/no growth interface of food-borne microorganisms such as L. monocytogenes. The optimal balance lies in finding models with an acceptable interpretation capacity and a good ability to fit the data at the boundaries of the variable range. The results obtained indicate that these kinds of models might well be a very valuable tool for mathematical modeling.
Evaluating penalized logistic regression models to predict Heat-Related Electric grid stress days
DOE Office of Scientific and Technical Information (OSTI.GOV)
Bramer, L. M.; Rounds, J.; Burleyson, C. D.
Understanding the conditions associated with stress on the electricity grid is important in the development of contingency plans for maintaining reliability during periods when the grid is stressed. In this paper, heat-related grid stress and the relationship with weather conditions is examined using data from the eastern United States. Penalized logistic regression models were developed and applied to predict stress on the electric grid using weather data. The inclusion of other weather variables, such as precipitation, in addition to temperature improved model performance. Several candidate models and datasets were examined. A penalized logistic regression model fit at the operation-zone level was found to provide predictive value and interpretability. Additionally, the importance of different weather variables observed at different time scales was examined. Maximum temperature and precipitation were identified as important across all zones while the importance of other weather variables was zone specific. The methods presented in this work are extensible to other regions and can be used to aid in planning and development of the electrical grid.
Evaluating penalized logistic regression models to predict Heat-Related Electric grid stress days
Bramer, Lisa M.; Rounds, J.; Burleyson, C. D.; ...
2017-09-22
Understanding the conditions associated with stress on the electricity grid is important in the development of contingency plans for maintaining reliability during periods when the grid is stressed. In this paper, heat-related grid stress and the relationship with weather conditions were examined using data from the eastern United States. Penalized logistic regression models were developed and applied to predict stress on the electric grid using weather data. The inclusion of other weather variables, such as precipitation, in addition to temperature improved model performance. Several candidate models and combinations of predictive variables were examined. A penalized logistic regression model which was fit at the operation-zone level was found to provide predictive value and interpretability. Additionally, the importance of different weather variables observed at various time scales was examined. Maximum temperature and precipitation were identified as important across all zones while the importance of other weather variables was zone specific. In conclusion, the methods presented in this work are extensible to other regions and can be used to aid in planning and development of the electrical grid.
ERIC Educational Resources Information Center
White Hughto, Jaclyn M.; Biello, Katie B.; Reisner, Sari L.; Perez-Brumer, Amaya; Heflin, Katherine J.; Mimiaga, Matthew J.
2016-01-01
Background: Differences in sexual health-related outcomes by sexual behavior and identity remain underinvestigated among bisexual female adolescents. Methods: Data from girls (N = 875) who participated in the Massachusetts Youth Risk Behavior Surveillance survey were analyzed. Weighted logistic regression models were fit to examine sexual and…
A Survival Model for Shortleaf Pine Trees Growing in Uneven-Aged Stands
Thomas B. Lynch; Lawrence R. Gering; Michael M. Huebschmann; Paul A. Murphy
1999-01-01
A survival model for shortleaf pine (Pinus echinata Mill.) trees growing in uneven-aged stands was developed using data from permanently established plots maintained by an industrial forestry company in western Arkansas. Parameters were fitted to a logistic regression model with a Bernoulli dependent variable in which "0" represented...
NASA Astrophysics Data System (ADS)
Ceppi, C.; Mancini, F.; Ritrovato, G.
2009-04-01
This study aims at landslide susceptibility mapping within an area of the Daunia (Apulian Apennines, Italy) using a multivariate statistical method and data manipulation in a Geographical Information System (GIS) environment. Among the variety of existing statistical data analysis techniques, logistic regression was chosen to produce a susceptibility map over an area where small settlements have historically been threatened by landslide phenomena. Logistic regression finds the best fit between the presence or absence of landslides (the dependent variable) and the set of independent variables on the basis of a maximum likelihood criterion, yielding estimates of the regression coefficients. The reliability of such an analysis therefore rests on its ability to quantify the proneness to landslide occurrence through the probability level it produces. The inventory of dependent and independent variables was managed in a GIS, where geometric properties and attributes were translated into raster cells in order to carry out the logistic regression with the SPSS (Statistical Package for the Social Sciences) package. A landslide inventory was used to produce the binary dependent variable, whereas the set of independent variables comprised slope, aspect, elevation, curvature, drained area, lithology and land use, after their reduction to dummy variables. The effect of the independent parameters on landslide occurrence was assessed through the corresponding coefficients in the logistic regression function, highlighting a major role played by the land use variable in determining the occurrence and distribution of phenomena. Once the outcomes of the logistic regression were determined, the data were re-introduced into the GIS to produce a map reporting the proneness to landslides as a predicted level of probability. To validate the results and the regression model, a cell-by-cell comparison between the susceptibility map and the initial inventory of landslide events was performed, and agreement at the 75% level was achieved.
Comprehension of texts by deaf elementary school students: The role of grammatical understanding.
Barajas, Carmen; González-Cuenca, Antonia M; Carrero, Francisco
2016-12-01
The aim of this study was to analyze how the reading process of deaf Spanish elementary school students is affected both by those components that explain reading comprehension according to the Simple View of Reading model: decoding and linguistic comprehension (both lexical and grammatical) and by other variables that are external to the reading process: the type of assistive technology used, the age at which it is implanted or fitted, the participant's socioeconomic status and school stage. Forty-seven students aged between 6 and 13 years participated in the study; all presented with profound or severe prelingual bilateral deafness, and all used digital hearing aids or cochlear implants. Students' text comprehension skills, decoding skills and oral comprehension skills (both lexical and grammatical) were evaluated. Logistic regression analysis indicated that neither the type of assistive technology, age at time of fitting or activation, socioeconomic status, nor school stage could predict the presence or absence of difficulties in text comprehension. Furthermore, logistic regression analysis indicated that neither decoding skills, nor lexical age could predict competency in text comprehension; however, grammatical age could explain 41% of the variance. Probing deeper into the effect of grammatical understanding, logistic regression analysis indicated that a participant's understanding of reversible passive object-verb-subject sentences and reversible predicative subject-verb-object sentences accounted for 38% of the variance in text comprehension. Based on these results, we suggest that it might be beneficial to devise and evaluate interventions that focus specifically on grammatical comprehension. Copyright © 2016 Elsevier Ltd. All rights reserved.
Rappole, Catherine; Grier, Tyson; Anderson, Morgan K; Hauschild, Veronique; Jones, Bruce H
2017-11-01
To investigate the effects of age, aerobic fitness, and body mass index (BMI) on injury risk in operational Army soldiers. Retrospective cohort study. Male soldiers from an operational Army brigade were administered electronic surveys regarding personal characteristics, physical fitness, and injuries occurring over the last 12 months. Injury risks were stratified by age, 2-mile run time, and BMI. Analyses included descriptive incidence, a Mantel-Haenszel χ² test to determine trends, a multivariable logistic regression to determine factors associated with injury, and a one-way analysis of variance (ANOVA). Forty-seven percent of 1099 respondents reported at least one injury. A linear trend showed that as age, 2-mile run time, and BMI increased, so did injury risk (p<0.01). When controlling for BMI, the most significant independent injury risk factors were older age (odds ratio (OR) 30-35 years/≤24 years = 1.25, 95% CI: 1.08-2.32; OR ≥36 years/≤24 years = 2.05, 95% CI: 1.36-3.10) and slow run times (OR ≥15.9 min/≤13.9 min = 1.91, 95% CI: 1.28-2.85). An ANOVA showed that both run times and BMI increased with age. The stratified analysis and the multivariable logistic regression suggested that older age and poor aerobic fitness are stronger predictors of injury than BMI. Copyright © 2017 Sports Medicine Australia. All rights reserved.
A computational approach to compare regression modelling strategies in prediction research.
Pajouheshnia, Romin; Pestman, Wiebe R; Teerenstra, Steven; Groenwold, Rolf H H
2016-08-25
It is often unclear which approach to fit, assess and adjust a model will yield the most accurate prediction model. We present an extension of an approach for comparing modelling strategies in linear regression to the setting of logistic regression and demonstrate its application in clinical prediction research. A framework for comparing logistic regression modelling strategies by their likelihoods was formulated using a wrapper approach. Five different strategies for modelling, including simple shrinkage methods, were compared in four empirical data sets to illustrate the concept of a priori strategy comparison. Simulations were performed in both randomly generated data and empirical data to investigate the influence of data characteristics on strategy performance. We applied the comparison framework in a case study setting. Optimal strategies were selected based on the results of a priori comparisons in a clinical data set and the performance of models built according to each strategy was assessed using the Brier score and calibration plots. The performance of modelling strategies was highly dependent on the characteristics of the development data in both linear and logistic regression settings. A priori comparisons in four empirical data sets found that no strategy consistently outperformed the others. The percentage of times that a model adjustment strategy outperformed a logistic model ranged from 3.9 to 94.9 %, depending on the strategy and data set. However, in our case study setting the a priori selection of optimal methods did not result in detectable improvement in model performance when assessed in an external data set. The performance of prediction modelling strategies is a data-dependent process and can be highly variable between data sets within the same clinical domain. A priori strategy comparison can be used to determine an optimal logistic regression modelling strategy for a given data set before selecting a final modelling approach.
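The Brier score used above to assess model performance is the mean squared difference between predicted probabilities and observed 0/1 outcomes, so lower values are better. A minimal sketch follows; the outcomes and predictions are made up, and the function name `brier_score` is an illustrative choice rather than anything from the article.

```python
def brier_score(outcomes, probabilities):
    """Mean squared error between 0/1 outcomes and predicted probabilities."""
    if len(outcomes) != len(probabilities):
        raise ValueError("outcomes and probabilities must have equal length")
    squared_errors = [(p - y) ** 2 for y, p in zip(outcomes, probabilities)]
    return sum(squared_errors) / len(squared_errors)


# Made-up example: two cases, one event and one non-event.
example_score = brier_score([1, 0], [0.8, 0.3])
perfect_score = brier_score([1, 0], [1.0, 0.0])
print(example_score, perfect_score)
```

A model that always predicts 0.5 scores 0.25, which is a common reference point when reading Brier scores for binary outcomes.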
Self-reported physical activity and preaccession fitness testing in U.S. Army applicants.
Gubata, Marlene E; Cowan, David N; Bedno, Sheryl A; Urban, Nadia; Niebuhr, David W
2011-08-01
The Assessment of Recruit Motivation and Strength (ARMS) study evaluated a physical fitness screening test for Army applicants before basic training. This report examines applicants' self-reported physical activity as a predictor of objective fitness measured by ARMS. In 2006, the ARMS study administered a fitness test and physical activity survey to Army applicants during their medical evaluation, using multiple logistic regression for comparison. Among both men and women, "qualified" and "exceeds-body-fat" subjects who met American College of Sports Medicine adult physical activity guidelines were more likely to pass the fitness test. Overall, subjects who met physical activity recommendations, watched less television, and played on sports teams had higher odds of passing the ARMS test after adjustment for age, race, and smoking status. This study demonstrates that self-reported physical activity was associated with physical fitness and may be used to identify those at risk of failing a preaccession fitness test.
Hung, Chien-Ya; Sun, Pei-Lun; Chiang, Shu-Jen; Jaw, Fu-Shan
2014-01-01
Similar clinical appearances prevent accurate diagnosis of two common skin diseases, clavus and verruca. In this study, electrical impedance is employed as a novel tool to generate a predictive model for differentiating these two diseases. We used 29 clavus and 28 verruca lesions. To obtain impedance parameters, an LCR-meter system was applied to measure capacitance (C), resistance (Re), impedance magnitude (Z), and phase angle (θ). These values were combined with lesion thickness (d) to characterize the tissue specimens. The results from clavus and verruca were then fitted to a univariate logistic regression model with the generalized estimating equations (GEE) method. In model generation, log ZSD and θSD were formulated as predictors by fitting a multiple logistic regression model with the same GEE method. The potential nonlinear effects of covariates were detected by fitting generalized additive models (GAM). Moreover, the model was validated by the goodness-of-fit (GOF) assessments. Significant mean differences of the index d, Re, Z, and θ are found between clavus and verruca (p<0.001). A final predictive model is established with Z and θ indices. The model fits the observed data quite well. In GOF evaluation, the area under the receiver operating characteristics (ROC) curve is 0.875 (>0.7), the adjusted generalized R² is 0.512 (>0.3), and the p value of the Hosmer-Lemeshow GOF test is 0.350 (>0.05). This technique promises to provide an approved model for differential diagnosis of clavus and verruca. It could provide a rapid, relatively low-cost, safe and non-invasive screening tool in clinic use.
Hsu, Chiu-Hsieh; Li, Yisheng; Long, Qi; Zhao, Qiuhong; Lance, Peter
2011-01-01
In colorectal polyp prevention trials, estimation of the rate of recurrence of adenomas at the end of the trial may be complicated by dependent censoring, that is, time to follow-up colonoscopy and dropout may be dependent on time to recurrence. Assuming that the auxiliary variables capture the dependence between recurrence and censoring times, we propose to fit two working models with the auxiliary variables as covariates to define risk groups and then extend an existing weighted logistic regression method for independent censoring to each risk group to accommodate potential dependent censoring. In a simulation study, we show that the proposed method results in both a gain in efficiency and reduction in bias for estimating the recurrence rate. We illustrate the methodology by analyzing a recurrent adenoma dataset from a colorectal polyp prevention trial. PMID:22065985
Physical fitness of 9 year olds in England: related factors.
Kikuchi, S; Rona, R J; Chinn, S
1995-04-01
To examine the influence of social factors, passive smoking, and other parental health related factors, as well as anthropometric and other measurements on children's cardiorespiratory fitness. This was a cross sectional study. The analysis was based on 22 health areas in England. The subjects were 299 boys and 282 girls aged 8 to 9 years. Parents did not give positive consent for 15% of the eligible sample. A further 25% of the eligible sample did not participate because the cycle-ergometer broke down, study time was insufficient, or they were excluded from the analysis because they were from ethnic minority groups or had missing data on one continuous variable. Cardiorespiratory fitness was determined using the cycle-ergometer test. It was measured in terms of PWC85%-that is, power output per body weight (watt/kg) assessed at 85% of maximum heart rate. The association between children's fitness and biological and social factors was analysed in two stages. Firstly, multiple logistic analysis was used to examine the factors associated with the children's ability to complete the test for at least four minutes. Secondly, multiple linear regression analysis was used to examine the independent association of the factors with PWC85%. In the logistic analysis, shorter children, children with higher blood pressure, and boys with a larger sibship size had poorer fitness. In the multiple regression analysis, only height (p < 0.001) was positively associated, and the sum of skinfold thicknesses at four sites (p = 0.001) was negatively associated with fitness in both sexes. In girls, a positive association was found with pre-exercise peak expiratory flow rate (p < 0.05), and there were negative associations with systolic blood pressure (p < 0.05) and family history of heart attack (p < 0.05). In boys an association was found with skinfold distribution and fitness (p < 0.05), so that children with relatively less body fat were fitter. 
Social and health behaviour factors such as father's social class, father's employment status, or parents' smoking habits were unrelated to child's fitness. Height and obesity are strongly associated, and systolic blood pressure to a small extent, with children's fitness, but social factors are unrelated.
Garnier, Alain; Gaillet, Bruno
2015-12-01
Few mathematical models of fermentation allow analytical solutions of batch process dynamics. The most widely used is the combination of logistic microbial growth kinetics with the Luedeking-Piret bioproduct synthesis relation. However, the logistic equation is principally based on formalistic similarities and only fits a limited range of fermentation types. In this article, we have developed an analytical solution for the combination of Monod growth kinetics with the Luedeking-Piret relation, which can be identified by linear regression and used to simulate batch fermentation evolution. Two classical examples are used to show the quality of fit and the simplicity of the method proposed. A solution for the combination of the Haldane substrate-limited growth model with the Luedeking-Piret relation is also provided. These models could prove useful for the analysis of fermentation data in industry as well as academia. © 2015 Wiley Periodicals, Inc.
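For orientation, the model combination named above is usually written in the following textbook form. This is a sketch under stated assumptions, not the authors' analytical solution: the symbols (biomass X, substrate S, product P, maximum specific growth rate \mu_{\max}, half-saturation constant K_S, yield Y_{X/S}, and Luedeking-Piret coefficients \alpha and \beta) are conventional choices, and the substrate balance assumes that substrate is consumed for growth only.

```latex
\begin{aligned}
\frac{dX}{dt} &= \mu X, \qquad \mu = \frac{\mu_{\max} S}{K_S + S}
  && \text{(Monod growth kinetics)} \\
\frac{dS}{dt} &= -\frac{1}{Y_{X/S}} \frac{dX}{dt}
  && \text{(assumed substrate balance)} \\
\frac{dP}{dt} &= \alpha \frac{dX}{dt} + \beta X
  && \text{(Luedeking-Piret relation)}
\end{aligned}
```

The \alpha term represents growth-associated product formation and the \beta term non-growth-associated formation.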
Real, Jordi; Forné, Carles; Roso-Llorach, Albert; Martínez-Sánchez, Jose M
2016-05-01
Controlling for confounders is a crucial step in analytical observational studies, and multivariable models are widely used as statistical adjustment techniques. However, the validation of the assumptions of the multivariable regression models (MRMs) should be made clear in scientific reporting. The objective of this study is to review the quality of statistical reporting of the most commonly used MRMs (logistic, linear, and Cox regression) that were applied in analytical observational studies published between 2003 and 2014 by journals indexed in MEDLINE. Review of a representative sample of articles indexed in MEDLINE (n = 428) with observational design and use of MRMs (logistic, linear, and Cox regression). We assessed the quality of reporting about: model assumptions and goodness-of-fit, interactions, sensitivity analysis, crude and adjusted effect estimate, and specification of more than 1 adjusted model. The tests of underlying assumptions or goodness-of-fit of the MRMs used were described in 26.2% (95% CI: 22.0-30.3) of the articles and 18.5% (95% CI: 14.8-22.1) reported the interaction analysis. Reporting of all items assessed was higher in articles published in journals with a higher impact factor. A low percentage of articles indexed in MEDLINE that used multivariable techniques provided information demonstrating rigorous application of the model selected as an adjustment method. Given the importance of these methods to the final results and conclusions of observational studies, greater rigor is required in reporting the use of MRMs in the scientific literature.
Keogh, Ruth H; Mangtani, Punam; Rodrigues, Laura; Nguipdop Djomo, Patrick
2016-01-05
Traditional analyses of standard case-control studies using logistic regression do not allow estimation of time-varying associations between exposures and the outcome. We present two approaches which allow this. The motivation is a study of vaccine efficacy as a function of time since vaccination. Our first approach is to estimate time-varying exposure-outcome associations by fitting a series of logistic regressions within successive time periods, reusing controls across periods. Our second approach treats the case-control sample as a case-cohort study, with the controls forming the subcohort. In the case-cohort analysis, controls contribute information at all times they are at risk. Extensions allow left truncation, frequency matching and, using the case-cohort analysis, time-varying exposures. Simulations are used to investigate the methods. The simulation results show that both methods give correct estimates of time-varying effects of exposures using standard case-control data. Using the logistic approach there are efficiency gains by reusing controls over time and care should be taken over the definition of controls within time periods. However, using the case-cohort analysis there is no ambiguity over the definition of controls. The performance of the two analyses is very similar when controls are used most efficiently under the logistic approach. Using our methods, case-control studies can be used to estimate time-varying exposure-outcome associations where they may not previously have been considered. The case-cohort analysis has several advantages, including that it allows estimation of time-varying associations as a continuous function of time, while the logistic regression approach is restricted to assuming a step function form for the time-varying association.
ERIC Educational Resources Information Center
Liu, Xing
2008-01-01
The proportional odds (PO) model, which is also called the cumulative odds model (Agresti, 1996, 2002; Armstrong & Sloan, 1989; Long, 1997; Long & Freese, 2006; McCullagh, 1980; McCullagh & Nelder, 1989; Powers & Xie, 2000; O'Connell, 2006), is one of the most commonly used models for the analysis of ordinal categorical data and comes from the class…
Fitness, motor competence, and body composition are weakly associated with adolescent back pain.
Perry, Mark; Straker, Leon; O'Sullivan, Peter; Smith, Anne; Hands, Beth
2009-06-01
Cross-sectional survey. To assess the associations between adolescent back pain and fitness, motor competence, and body composition. Although deficits in physical fitness and motor control have been shown to relate to adult back pain, the evidence in adolescents is less clear. In this cross-sectional study, 1608 "Raine" cohort adolescents (mean age, 14 years) answered questions on lifetime, month, and chronic prevalence of back pain, and participated in a range of physical tests assessing aerobic capacity, muscle performance, flexibility, motor competence, and body composition. A history of any diagnosed back pain in the adolescent was obtained from the primary caregiver. After multivariate logistic regression analysis, increased likelihood of back pain in boys was associated with greater aerobic capacity, greater waist girth, and both reduced and greater flexibility. Back pain in girls was associated with greater abdominal endurance, reduced kinesthetic integration, and both reduced and greater back endurance. Lower likelihood of back pain was associated with greater bimanual dexterity in boys and greater lower extremity power in girls. Physical characteristics are commonly cited as important risk factors in back pain development. Although some factors were associated with adolescent back pain, and these differed between boys and girls, they made only a small contribution to logistic regression models for back pain. The results suggest future work should explore the interaction of multiple domains of risk factors (physical, lifestyle, and psychosocial) and subgroups of adolescent back pain, for whom different risk factors may be important.
Bielak, Lawrence F; Whaley, Dana H; Sheedy, Patrick F; Peyser, Patricia A
2010-09-01
The etiology of breast arterial calcification (BAC) is not well understood. We examined reproductive history and cardiovascular disease (CVD) risk factor associations with the presence of detectable BAC in asymptomatic postmenopausal women. Reproductive history and CVD risk factors were obtained in 240 asymptomatic postmenopausal women from a community-based research study who had a screening mammogram within 2 years of their participation in the study. The mammograms were reviewed for the presence of detectable BAC. Age-adjusted logistic regression models were fit to assess the association between each risk factor and the presence of BAC. Multiple variable logistic regression models were used to identify the most parsimonious model for the presence of BAC. The prevalence of BAC increased with increased age (p < 0.0001). The most parsimonious logistic regression model for BAC presence included age at time of examination, increased parity (p = 0.01), earlier age at first birth (p = 0.002), weight, and an age-by-weight interaction term (p = 0.004). Older women with a smaller body size had a higher probability of having BAC than women of the same age with a larger body size. The presence or absence of BAC at mammography may provide an assessment of a postmenopausal woman's lifetime estrogen exposure and indicate women who could be at risk for hormonally related conditions.
Active travel to school and cardiovascular fitness in Danish children and adolescents.
Cooper, Ashley R; Wedderkopp, Niels; Wang, Han; Andersen, Lars Bo; Froberg, Karsten; Page, Angie S
2006-10-01
Active travel to school provides an opportunity for daily physical activity. Previous studies have shown that walking and cycling to school are associated with higher physical activity levels. The purpose of this study was to investigate whether the way that children and adolescents travel to school is associated with level of cardiovascular fitness. Participants were recruited via a proportional, two-stage cluster sample of schools (N = 25) in the region of Odense, Denmark as part of the European Youth Heart Study (EYHS). Nine hundred nineteen participants (529 children, age 9.7 ± 0.5 yr; 390 adolescents, age 15.5 ± 0.4 yr) completed a maximal cycle ergometer test to assess cardiorespiratory fitness (Wmax × kg⁻¹). Mode of travel to school was investigated by questionnaire. Physical activity was measured in 531 participants using an accelerometer. Regression analyses with robust standard errors and adjustment for confounders (gender, age, body composition (skinfolds), pubertal status, and physical activity) and the cluster sampling procedure were used to compare fitness levels for different travel modes. Multinomial logistic regression was applied to assess the odds for belonging to quartiles of fitness. Children and adolescents who cycled to school were significantly more fit than those who walked or traveled by motorized transport and were nearly five times as likely (OR 4.8; 95% CI 2.8-8.4) to be in the top quartile of fitness. Cycling to school may contribute to higher cardiovascular fitness in young people.
[Situation analysis of physical fitness among Chinese Han students in 2014].
Song, Y; Lei, Y T; Hu, P J; Zhang, B; Ma, J
2018-06-18
To analyze the physical fitness of Chinese Han students in 2014, so as to develop physical activity guidelines for the targeted students and to provide a basis for improving students' physical fitness. Subjects were from the 2014 Chinese National Surveys on Students' Constitution and Health (CNSSCH). In this survey, 212 401 Han students aged 7-18 years participated and completed the physical fitness measurements. The qualified rates of the physical fitness indicators were evaluated based on the "National Students Constitutional Health Standards" (2014 revised edition). Logistic regression was used to assess the association between the indicators of pull ups (boys) and endurance run (boys and girls) and influencing factors. In 2014, among the boys, the qualified rates of pull ups and endurance run were 18.7% and 76.6% respectively, while the qualified rate of endurance run was 80.6% among the girls. These two indicators were the weak items of physical fitness among the Chinese Han students. There were regional differences in the qualified rates of physical fitness, and the students in Zhejiang and Jiangsu provinces had higher qualified rates. Logistic regression showed that urban students (OR=0.67) and students with malnutrition (OR=0.76), overweight (OR=0.32) or obesity (OR=0.12) were less likely to qualify in pull ups; students who had more than 1 h of physical activity per day (OR=1.31) were more likely to qualify in pull ups. The influencing factors of endurance run showed a similar pattern; in addition, students with enough physical education (PE) were more likely to qualify in endurance run, while students with "squeezed" or no PE classes were less likely to qualify. Pull ups and endurance run have become the weak items of physical fitness among primary and secondary school students at the national and provincial levels.
Based on ensuring physical exercise time and PE curriculum and class hours, as well as improving students' nutrition, we should also strengthen the rational design of physical exercise and ensure the balanced development of various items so as to improve the overall development of students' physical fitness.
Determination of riverbank erosion probability using Locally Weighted Logistic Regression
NASA Astrophysics Data System (ADS)
Ioannidou, Elena; Flori, Aikaterini; Varouchakis, Emmanouil A.; Giannakis, Georgios; Vozinaki, Anthi Eirini K.; Karatzas, George P.; Nikolaidis, Nikolaos
2015-04-01
Riverbank erosion is a natural geomorphologic process that affects the fluvial environment. The most important issue concerning riverbank erosion is the identification of the vulnerable locations. An alternative to the usual hydrodynamic models to predict vulnerable locations is to quantify the probability of erosion occurrence. This can be achieved by identifying the underlying relations between riverbank erosion and the geomorphological or hydrological variables that prevent or stimulate erosion. Thus, riverbank erosion can be determined by a regression model using independent variables that are considered to affect the erosion process. The impact of such variables may vary spatially, therefore, a non-stationary regression model is preferred instead of a stationary equivalent. Locally Weighted Regression (LWR) is proposed as a suitable choice. This method can be extended to predict the binary presence or absence of erosion based on a series of independent local variables by using the logistic regression model. It is referred to as Locally Weighted Logistic Regression (LWLR). Logistic regression is a type of regression analysis used for predicting the outcome of a categorical dependent variable (e.g. binary response) based on one or more predictor variables. The method can be combined with LWR to assign weights to local independent variables of the dependent one. LWR allows model parameters to vary over space in order to reflect spatial heterogeneity. The probabilities of the possible outcomes are modelled as a function of the independent variables using a logistic function. Logistic regression measures the relationship between a categorical dependent variable and, usually, one or several continuous independent variables by converting the dependent variable to probability scores. Then, a logistic regression is formed, which predicts success or failure of a given binary variable (e.g. erosion presence or absence) for any value of the independent variables. 
The erosion occurrence probability can be calculated in conjunction with the model deviance regarding the independent variables tested. The most straightforward measure for goodness of fit is the G statistic. It is a simple and effective way to study and evaluate the Logistic Regression model efficiency and the reliability of each independent variable. The developed statistical model is applied to the Koiliaris River Basin on the island of Crete, Greece. Two datasets of river bank slope, river cross-section width and indications of erosion were available for the analysis (12 and 8 locations). Two different types of spatial dependence functions, exponential and tricubic, were examined to determine the local spatial dependence of the independent variables at the measurement locations. The results show a significant improvement when the tricubic function is applied as the erosion probability is accurately predicted at all eight validation locations. Results for the model deviance show that cross-section width is more important than bank slope in the estimation of erosion probability along the Koiliaris riverbanks. The proposed statistical model is a useful tool that quantifies the erosion probability along the riverbanks and can be used to assist managing erosion and flooding events. Acknowledgements This work is part of an on-going THALES project (CYBERSENSORS - High Frequency Monitoring System for Integrated Water Resources Management of Rivers). The project has been co-financed by the European Union (European Social Fund - ESF) and Greek national funds through the Operational Program "Education and Lifelong Learning" of the National Strategic Reference Framework (NSRF) - Research Funding Program: THALES. Investing in knowledge society through the European Social Fund.
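The locally weighted logistic regression idea described above can be sketched as follows: observations are weighted by a tricubic kernel of their distance from the target location, and a logistic model is fitted to the weighted data. This is a minimal illustration under stated assumptions, not the authors' implementation. The data, the bandwidth, the single predictor (cross-section width), the gradient-ascent fitting routine, the small ridge penalty, and all function names are hypothetical.

```python
import math


def tricube(distance, bandwidth):
    """Tricubic kernel: (1 - u^3)^3 for u = |d| / h < 1, otherwise 0."""
    u = abs(distance) / bandwidth
    return (1.0 - u ** 3) ** 3 if u < 1.0 else 0.0


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def fit_weighted_logistic(x, y, w, steps=2000, lr=0.1, ridge=1e-3):
    """Weighted logistic regression (intercept + one slope) by gradient ascent.

    The predictor is centred on its weighted mean for numerical stability.
    The small ridge penalty (an assumption of this sketch) keeps the
    estimates finite when the local data are perfectly separable.
    """
    total = sum(w)
    x_bar = sum(wi * xi for wi, xi in zip(w, x)) / total
    a0 = b1 = 0.0
    for _ in range(steps):
        g0 = g1 = 0.0
        for xi, yi, wi in zip(x, y, w):
            resid = wi * (yi - sigmoid(a0 + b1 * (xi - x_bar)))
            g0 += resid
            g1 += resid * (xi - x_bar)
        a0 += lr * (g0 / total - ridge * a0)
        b1 += lr * (g1 / total - ridge * b1)
    # Convert back to the uncentred parameterisation.
    return a0 - b1 * x_bar, b1


def lwlr_probability(target_loc, target_x, locs, x, y, bandwidth):
    """Erosion probability at target_loc for predictor value target_x."""
    w = [tricube(loc - target_loc, bandwidth) for loc in locs]
    b0, b1 = fit_weighted_logistic(x, y, w)
    return sigmoid(b0 + b1 * target_x)


# Hypothetical data: position along the river (km), cross-section width (m),
# and observed erosion (1 = present, 0 = absent).
locs = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0]
width = [2.0, 3.0, 8.0, 9.0, 2.0, 3.0, 8.0, 9.0]
eroded = [0, 0, 1, 1, 0, 0, 1, 1]

p_wide = lwlr_probability(3.5, 9.0, locs, width, eroded, bandwidth=4.0)
p_narrow = lwlr_probability(3.5, 2.0, locs, width, eroded, bandwidth=4.0)
print(p_wide, p_narrow)
```

Because the weights depend on the target location, the fitted coefficients change from place to place, which is how the approach reflects spatial heterogeneity. Swapping `tricube` for an exponential kernel would give the other spatial dependence function mentioned in the abstract.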
Socio-demographic predictors of person-organization fit.
Merecz-Kot, Dorota; Andysz, Aleksandra
2017-02-21
The aim of this study was to explore the relationship between socio-demographic characteristics and the level of complementary and supplementary person-organization fit (P-O fit). The study sample was a group of 600 Polish workers, urban residents aged 19-65. Level of P-O fit was measured using the Subjective Person-Organization Fit Questionnaire by Czarnota-Bojarska. Binomial multivariate logistic regression was applied. The analyses were performed separately for the men and women. Socio-demographic variables explained a small percentage of the outcome variability. Gender differences were found. In the men, shift work decreased complementary and supplementary fit, while long working hours decreased complementary fit. In the women, age was a stimulant of complementary fit, and involuntary job losses predicted both complementary and supplementary misfit. Additionally, relational responsibilities increased the probability of supplementary P-O fit in the men. Going beyond personality and competences as the factors affecting P-O fit will allow development of a more accurate prediction of P-O fit. Int J Occup Med Environ Health 2017;30(1):133-139. This work is available in Open Access model and licensed under a CC BY-NC 3.0 PL license.
Yoshida, Yuko; Kim, Hunkyung; Iwasa, Hajime; Kwon, Jinhee; Sugiura, Miho; Furuna, Taketo; Yoshida, Hideyo; Suzuki, Takao
2007-01-01
We examined the prevalence and characteristics of urinary incontinence in community-dwelling elderly individuals. The participants were 1,783 individuals (768 men and 1,015 women) aged over 70 years who participated in a comprehensive health examination involving a medical examination and interview, plus physical performance tests. Differences in characteristics between individuals with and without urinary incontinence were examined, and multivariate logistic regression models were used to describe the characteristics associated with urinary incontinence. The prevalence of urinary incontinence was 13.4% in men and 23.3% in women. Urinary incontinence was significantly associated with a lower level of physical fitness. Multivariate logistic regression showed that urinary incontinence was significantly associated with a slower walking speed (Odds Ratio (OR) = 0.19, 95% Confidence Intervals (CI) 0.08-0.48) and lower serum albumin level (OR = 0.40, 95% CI 0.16-0.99) in men, and with a slower walking speed (OR = 0.29, 95% CI 0.15-0.56), a higher BMI (OR = 1.09, 95% CI 1.04-1.14), depression (OR = 3.06, 95% CI 1.40-6.69), and lack of physical activity (OR = 0.70, 95% CI 0.50-0.98) in women. The characteristics of urinary incontinence in this cohort of community-dwelling elderly individuals were a low level of physical fitness and poor nutritional state in men, and a low level of physical fitness, a tendency to be obese, a poor mental health state, and lack of physical activity in women.
Cevenini, Gabriele; Barbini, Emanuela; Scolletta, Sabino; Biagioli, Bonizella; Giomarelli, Pierpaolo; Barbini, Paolo
2007-11-22
Popular predictive models for estimating morbidity probability after heart surgery are compared critically in a unitary framework. The study is divided into two parts. In the first part modelling techniques and intrinsic strengths and weaknesses of different approaches were discussed from a theoretical point of view. In this second part the performances of the same models are evaluated in an illustrative example. Eight models were developed: Bayes linear and quadratic models, k-nearest neighbour model, logistic regression model, Higgins and direct scoring systems and two feed-forward artificial neural networks with one and two layers. Cardiovascular, respiratory, neurological, renal, infectious and hemorrhagic complications were defined as morbidity. Training and testing sets each of 545 cases were used. The optimal set of predictors was chosen among a collection of 78 preoperative, intraoperative and postoperative variables by a stepwise procedure. Discrimination and calibration were evaluated by the area under the receiver operating characteristic curve and Hosmer-Lemeshow goodness-of-fit test, respectively. Scoring systems and the logistic regression model required the largest set of predictors, while Bayesian and k-nearest neighbour models were much more parsimonious. In testing data, all models showed acceptable discrimination capacities, however the Bayes quadratic model, using only three predictors, provided the best performance. All models showed satisfactory generalization ability: again the Bayes quadratic model exhibited the best generalization, while artificial neural networks and scoring systems gave the worst results. Finally, poor calibration was obtained when using scoring systems, k-nearest neighbour model and artificial neural networks, while Bayes (after recalibration) and logistic regression models gave adequate results. 
Although all the predictive models showed acceptable discrimination performance in the example considered, the Bayes and logistic regression models seemed better than the others, because they also had good generalization and calibration. The Bayes quadratic model seemed to be a convincing alternative to the much more usual Bayes linear and logistic regression models. It showed its capacity to identify a minimum core of predictors generally recognized as essential to pragmatically evaluate the risk of developing morbidity after heart surgery.
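Discrimination in the comparison above is summarized by the area under the ROC curve, which equals the probability that a randomly chosen case with the outcome receives a higher predicted score than a randomly chosen case without it. A minimal sketch of that pairwise definition follows; the labels and scores are made up, and the function name `roc_auc` is an illustrative choice rather than anything from the article.

```python
def roc_auc(labels, scores):
    """Area under the ROC curve via pairwise comparison of scores.

    Counts the fraction of (positive, negative) pairs in which the
    positive case has the higher score; ties count as one half.
    """
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("need at least one positive and one negative case")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


# Made-up example: four cases, two with the outcome.
auc = roc_auc([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8])
print(auc)
```

A value of 0.5 corresponds to chance-level discrimination and 1.0 to perfect separation. Calibration, which the study assessed with the Hosmer-Lemeshow test, is a separate property that this quantity does not capture.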
Whaley, Dana H.; Sheedy, Patrick F.; Peyser, Patricia A.
2010-01-01
Objective: The etiology of breast arterial calcification (BAC) is not well understood. We examined reproductive history and cardiovascular disease (CVD) risk factor associations with the presence of detectable BAC in asymptomatic postmenopausal women. Methods: Reproductive history and CVD risk factors were obtained in 240 asymptomatic postmenopausal women from a community-based research study who had a screening mammogram within 2 years of their participation in the study. The mammograms were reviewed for the presence of detectable BAC. Age-adjusted logistic regression models were fit to assess the association between each risk factor and the presence of BAC. Multiple variable logistic regression models were used to identify the most parsimonious model for the presence of BAC. Results: The prevalence of BAC increased with increased age (p < 0.0001). The most parsimonious logistic regression model for BAC presence included age at time of examination, increased parity (p = 0.01), earlier age at first birth (p = 0.002), weight, and an age-by-weight interaction term (p = 0.004). Older women with a smaller body size had a higher probability of having BAC than women of the same age with a larger body size. Conclusions: The presence or absence of BAC at mammography may provide an assessment of a postmenopausal woman's lifetime estrogen exposure and indicate women who could be at risk for hormonally related conditions. PMID:20629578
Saucedo-Reyes, Daniela; Carrillo-Salazar, José A; Román-Padilla, Lizbeth; Saucedo-Veloz, Crescenciano; Reyes-Santamaría, María I; Ramírez-Gilly, Mariana; Tecante, Alberto
2018-03-01
High hydrostatic pressure inactivation kinetics of Escherichia coli ATCC 25922 and Salmonella enterica subsp. enterica serovar Typhimurium ATCC 14028 (S. typhimurium) in a low-acid mamey pulp were obtained at four pressure levels (300, 350, 400, and 450 MPa), different exposure times (0-8 min), and a temperature of 25 ± 2 °C. Survival curves showed deviations from linearity in the form of a tail (upward concavity). The primary models tested were the Weibull model, the modified Gompertz equation, and the biphasic model. The Weibull model gave the best goodness of fit (adjusted R² > 0.956, root mean square error < 0.290) and the lowest Akaike information criterion value. Exponential-logistic and exponential decay models, and Bigelow-type and empirical models, were tested as alternative secondary models for the b'(P) and n(P) parameters, respectively. The process validation considered two-step and one-step nonlinear regressions for predicting the survival fraction; both regression types provided an adequate goodness of fit, and the one-step nonlinear regression clearly reduced fitting errors. The best candidate model according to Akaike information theory, with better accuracy and more reliable predictions, was the Weibull model integrated with the exponential-logistic and exponential decay secondary models as a function of time and pressure (two-step procedure) or incorporated as one equation (one-step procedure). Both mathematical expressions were used to determine the t_d parameter, taking d = 5 (t_5) as the criterion for a 5 log10 reduction (5D); the desired reductions in both microorganisms are attainable at 400 MPa in 5.487 ± 0.488 or 5.950 ± 0.329 min for the one-step or two-step nonlinear procedure, respectively.
Míguez, A; Iftimi, A; Montes, F
2016-09-01
Epidemiologists agree that there is a prevailing seasonality in the presentation of epidemic waves of respiratory syncytial virus (RSV) infections and influenza. The aim of this study is to quantify the potential relationship between the activity of RSV, with respect to the influenza virus, in order to use the RSV seasonal curve as a predictor of the evolution of an influenza virus epidemic wave. Two statistical tools, logistic regression and time series, are used for predicting the evolution of influenza. Both logistic models and time series of influenza consider RSV information from previous weeks. Data consist of influenza and confirmed RSV cases reported in Comunitat Valenciana (Spain) during the period from week 40 (2010) to week 8 (2014). Binomial logistic regression models used to predict the two states of influenza wave, basal or peak, result in a rate of correct classification higher than 92% with the validation set. When a finer three-states categorization is established, basal, increasing peak and decreasing peak, the multinomial logistic model performs well in 88% of cases of the validation set. The ARMAX model fits well for influenza waves and shows good performance for short-term forecasts up to 3 weeks. The seasonal evolution of influenza virus can be predicted a minimum of 4 weeks in advance using logistic models based on RSV. It would be necessary to study more inter-pandemic seasons to establish a stronger relationship between the epidemic waves of both viruses.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Hooman, A.; Mohammadzadeh, M
Some medical and epidemiological surveys have been designed to predict a nominal response variable with several levels. With regard to the type of pregnancy there are four possible states: wanted, unwanted by wife, unwanted by husband and unwanted by couple. In this paper, we predicted the type of pregnancy, as well as the factors influencing it, using three different models and compared them. Because the type of pregnancy has several levels, we developed a multinomial logistic regression, a neural network and a flexible discrimination model based on the data and compared their results using two statistical indices: the area under the ROC curve and the kappa coefficient. Based on these two indices, flexible discrimination proved to be a better fit for prediction on the data than the other methods. When the relations among variables are complex, one can use flexible discrimination instead of multinomial logistic regression and neural networks to predict nominal response variables with several levels in order to gain more accurate predictions.
Methods for estimating drought streamflow probabilities for Virginia streams
Austin, Samuel H.
2014-01-01
Maximum likelihood logistic regression model equations used to estimate drought flow probabilities for Virginia streams are presented for 259 hydrologic basins in Virginia. Winter streamflows were used to estimate the likelihood of streamflows during the subsequent drought-prone summer months. The maximum likelihood logistic regression models identify probable streamflows from 5 to 8 months in advance. More than 5 million streamflow daily values collected over the period of record (January 1, 1900 through May 16, 2012) were compiled and analyzed over a minimum 10-year (maximum 112-year) period of record. The analysis yielded the 46,704 equations with statistically significant fit statistics and parameter ranges published in two tables in this report. These model equations produce summer month (July, August, and September) drought flow threshold probabilities as a function of streamflows during the previous winter months (November, December, January, and February). Example calculations are provided, demonstrating how to use the equations to estimate probable streamflows as much as 8 months in advance.
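As a rough illustration of how one fitted logistic equation of this kind might be evaluated, the sketch below maps a winter streamflow to a summer drought probability. The function name, the log10 transform of flow, and the coefficients b0 and b1 are all hypothetical assumptions for illustration; none of them is taken from the report's tables.

```python
import math

def drought_probability(winter_flow, b0, b1):
    """Logistic probability that summer flow falls below a drought threshold.

    b0 and b1 are hypothetical coefficients, and the log10 transform of the
    winter flow is an assumption, not the report's published equation form.
    """
    z = b0 + b1 * math.log10(winter_flow)
    return 1.0 / (1.0 + math.exp(-z))

# Illustrative coefficients only: a negative slope means that higher
# winter flow lowers the estimated probability of a summer drought flow.
p_low_flow = drought_probability(10.0, b0=4.0, b1=-3.0)
p_high_flow = drought_probability(100.0, b0=4.0, b1=-3.0)
```

With these made-up coefficients the lower winter flow yields the higher drought probability, which is the qualitative behaviour the abstract describes.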
Wildfire Risk Mapping over the State of Mississippi: Land Surface Modeling Approach
DOE Office of Scientific and Technical Information (OSTI.GOV)
Cooke, William H.; Mostovoy, Georgy; Anantharaj, Valentine G
2012-01-01
Three fire risk indexes based on soil moisture estimates were applied to simulate wildfire probability over the southern part of Mississippi using the logistic regression approach. The fire indexes were retrieved from: (1) accumulated difference between daily precipitation and potential evapotranspiration (P-E); (2) top 10 cm soil moisture content simulated by the Mosaic land surface model; and (3) the Keetch-Byram drought index (KBDI). The P-E, KBDI, and soil moisture based indexes were estimated from gridded atmospheric and Mosaic-simulated soil moisture data available from the North American Land Data Assimilation System (NLDAS-2). Normalized deviations of these indexes from the 31-year mean (1980-2010) were fitted to the logistic regression model describing the probability of wildfire occurrence as a function of the fire index. It was assumed that such normalization provides a more robust and adequate description of the temporal dynamics of soil moisture anomalies than the original (not normalized) set of indexes. The logistic model parameters were evaluated for 0.25° x 0.25° latitude/longitude cells and for the probability that at least one fire event occurred during 5 consecutive days. A 23-year (1986-2008) forest fire record was used. Two periods were selected and examined (January to mid-June and mid-September to December). The application of the logistic model provides overall good agreement between empirical/observed and model-fitted fire probabilities over the study area during both seasons. The fire risk indexes based on the top 10 cm soil moisture and KBDI have the largest impact on the wildfire odds (increasing them by almost 2 times in response to each unit change of the corresponding fire risk index during the January to mid-June period and by nearly 1.5 times during mid-September to December) observed over 0.25° x 0.25° cells located along the Mississippi coastline. This result suggests a rather strong control of fire risk indexes on fire occurrence probability over this region.
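To make the "odds increase by almost 2 times per unit change" statement concrete, the following minimal sketch converts a baseline probability to odds, applies an odds ratio, and converts back. The baseline probability of 0.05 is a hypothetical value chosen for illustration, not a figure from the study.

```python
def apply_odds_ratio(p, odds_ratio, units=1.0):
    """Return the probability after multiplying the odds by odds_ratio**units."""
    odds = p / (1.0 - p)
    new_odds = odds * odds_ratio ** units
    return new_odds / (1.0 + new_odds)

baseline = 0.05  # hypothetical 5-day fire probability for one grid cell
p_after_one_unit = apply_odds_ratio(baseline, 2.0)   # odds doubled
p_after_two_units = apply_odds_ratio(baseline, 2.0, units=2.0)
```

Doubling the odds does not double the probability exactly; the two are close only when the baseline probability is small.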
Application of Regulatory Focus Theory to Search Advertising.
Mowle, Elyse N; Georgia, Emily J; Doss, Brian D; Updegraff, John A
The purpose of this paper is to test the utility of regulatory focus theory principles in a real-world setting; specifically, Internet hosted text advertisements. Effect of compatibility of the ad text with the regulatory focus of the consumer was examined. Advertisements were created using Google AdWords. Data were collected for the number of views and clicks each ad received. Effect of regulatory fit was measured using logistic regression. Logistic regression analyses demonstrated that there was a strong main effect for keyword, such that users were almost six times as likely to click on a promotion advertisement as a prevention advertisement, as well as a main effect for compatibility, such that users were twice as likely to click on an advertisement with content that was consistent with their keyword. Finally, there was a strong interaction of these two variables, such that the effect of consistent advertisements was stronger for promotion searches than for prevention searches. The effect of ad compatibility had medium to large effect sizes, suggesting that individuals' state may have more influence on advertising response than do individuals' traits (e.g. personality traits). Measurement of regulatory fit was limited by the constraints of Google AdWords. The results of this study provide a possible framework for ad creation for Internet advertisers. This paper is the first study to demonstrate the utility of regulatory focus theory in online advertising.
NASA Astrophysics Data System (ADS)
Ariffin, Syaiba Balqish; Midi, Habshah
2014-06-01
This article is concerned with the performance of the logistic ridge regression estimation technique in the presence of multicollinearity and high leverage points. In logistic regression, multicollinearity exists among predictors and in the information matrix. The maximum likelihood estimator suffers a huge setback in the presence of multicollinearity, which causes regression estimates to have unduly large standard errors. To remedy this problem, a logistic ridge regression estimator is put forward. It is evident that the logistic ridge regression estimator outperforms the maximum likelihood approach for handling multicollinearity. The effect of high leverage points on the performance of the logistic ridge regression estimator is then investigated through a real data set and a simulation study. The findings signify that the logistic ridge regression estimator fails to provide better parameter estimates in the presence of both high leverage points and multicollinearity.
Rodrigues, Luis P; Stodden, David F; Lopes, Vítor P
2016-01-01
To test how different developmental pathways of health-related physical fitness and motor competence tests relate to weight status (overweight and obesity) at the end of primary school. Longitudinal study on growth, health-related physical fitness, and motor competence of 472 primary school children assessed yearly throughout 1st to 4th grade, with an average age of 6.3±0.7 years of age at 1st grade. Children's pathways of change on each of the fitness and motor competence tests were determined along the four years of the study. Participants were divided into three groups according to their rate of change in each test over time: Low Rate of Change, Average Rate of Change, and High Rate of Change. A logistic regression was used to predict the odds ratio of becoming overweight or obese, depending on the developmental pathway of change in fitness and motor competence across childhood. Children with a low or average rate of change in their developmental pathways of fitness and motor competence were several times more prone to become overweight or obese at the end of primary school (OR 2.0 to 6.3), independent of sex and body mass index at baseline. Specifically, a negative developmental pathway (Low Rate of Change) in cardiorespiratory fitness demonstrated over a six-fold elevated risk of being overweight or obese, compared to peers with a positive pathway. Not all children improve their motor competence and fitness levels over time and many actually regress over time. Developing positive fitness and motor competence pathways during childhood protects from obesity and overweight. Copyright © 2015 Sports Medicine Australia. Published by Elsevier Ltd. All rights reserved.
Craven, Stephen; Shirsat, Nishikant; Whelan, Jessica; Glennon, Brian
2013-01-01
A Monod kinetic model, logistic equation model, and statistical regression model were developed for a Chinese hamster ovary cell bioprocess operated under three different modes of operation (batch, bolus fed-batch, and continuous fed-batch) and grown on two different bioreactor scales (3 L bench-top and 15 L pilot-scale). The Monod kinetic model was developed for all modes of operation under study and predicted cell density, glucose glutamine, lactate, and ammonia concentrations well for the bioprocess. However, it was computationally demanding due to the large number of parameters necessary to produce a good model fit. The transferability of the Monod kinetic model structure and parameter set across bioreactor scales and modes of operation was investigated and a parameter sensitivity analysis performed. The experimentally determined parameters had the greatest influence on model performance. They changed with scale and mode of operation, but were easily calculated. The remaining parameters, which were fitted using a differential evolutionary algorithm, were not as crucial. Logistic equation and statistical regression models were investigated as alternatives to the Monod kinetic model. They were less computationally intensive to develop due to the absence of a large parameter set. However, modeling of the nutrient and metabolite concentrations proved to be troublesome due to the logistic equation model structure and the inability of both models to incorporate a feed. The complexity, computational load, and effort required for model development has to be balanced with the necessary level of model sophistication when choosing which model type to develop for a particular application. Copyright © 2012 American Institute of Chemical Engineers (AIChE).
Sample size determination for logistic regression on a logit-normal distribution.
Kim, Seongho; Heath, Elisabeth; Heilbrun, Lance
2017-06-01
Although the sample size for simple logistic regression can be readily determined using currently available methods, the sample size calculation for multiple logistic regression requires some additional information, such as the coefficient of determination (R²) of a covariate of interest with other covariates, which is often unavailable in practice. The response variable of logistic regression follows a logit-normal distribution which can be generated from a logistic transformation of a normal distribution. Using this property of logistic regression, we propose new methods of determining the sample size for simple and multiple logistic regressions using a normal transformation of outcome measures. Simulation studies and a motivating example show several advantages of the proposed methods over the existing methods: (i) no need for R² for multiple logistic regression, (ii) available interim or group-sequential designs, and (iii) much smaller required sample size.
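The logit-normal property mentioned above can be sketched in a few lines: draw from a normal distribution and pass each draw through the logistic function. This only illustrates the distribution itself, not the authors' sample size method; the parameter values and the function name are assumptions.

```python
import math
import random

def sample_logit_normal(mu, sigma, n, seed=0):
    """Draw n logit-normal values by logistic-transforming normal draws."""
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    return [1.0 / (1.0 + math.exp(-rng.gauss(mu, sigma))) for _ in range(n)]

# Hypothetical parameters: a standard normal on the logit scale.
probs = sample_logit_normal(mu=0.0, sigma=1.0, n=1000)
mean_prob = sum(probs) / len(probs)
```

Every transformed value lies strictly between 0 and 1, and with mu = 0 the distribution is symmetric about 0.5.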
Ren, Y Y; Zhou, L C; Yang, L; Liu, P Y; Zhao, B W; Liu, H X
2016-09-01
The paper highlights the use of the logistic regression (LR) method in the construction of statistically significant, robust and predictive models for the classification of chemicals according to their aquatic toxic modes of action. The essentials of a reliable model were all considered carefully. The model predictors were selected by stepwise forward discriminant analysis (LDA) from a combined pool of experimental data and chemical structure-based descriptors calculated by the CODESSA and DRAGON software packages. Model predictive ability was validated both internally and externally. The applicability domain was checked by the leverage approach to verify prediction reliability. The obtained models are simple and easy to interpret. In general, LR performs much better than LDA and seems to be more attractive for the prediction of the more toxic compounds, i.e. compounds that exhibit excess toxicity versus non-polar narcotic compounds and more reactive compounds versus less reactive compounds. In addition, model fit and regression diagnostics were assessed using an influence plot, which reflects the hat-values, studentized residuals, and Cook's distance statistics of each sample. Overdispersion was also checked for the LR model. The relationships between the descriptors and the aquatic toxic behaviour of compounds are also discussed.
On the analysis of Canadian Holstein dairy cow lactation curves using standard growth functions.
López, S; France, J; Odongo, N E; McBride, R A; Kebreab, E; AlZahal, O; McBride, B W; Dijkstra, J
2015-04-01
Six classical growth functions (monomolecular, Schumacher, Gompertz, logistic, Richards, and Morgan) were fitted to individual and average (by parity) cumulative milk production curves of Canadian Holstein dairy cows. The data analyzed consisted of approximately 91,000 daily milk yield records corresponding to 122 first, 99 second, and 92 third parity individual lactation curves. The functions were fitted using nonlinear regression procedures, and their performance was assessed using goodness-of-fit statistics (coefficient of determination, residual mean squares, Akaike information criterion, and the correlation and concordance coefficients between observed and adjusted milk yields at several days in milk). Overall, all the growth functions evaluated showed an acceptable fit to the cumulative milk production curves, with the Richards equation ranking first (smallest Akaike information criterion) followed by the Morgan equation. Differences among the functions in their goodness-of-fit were enlarged when fitted to average curves by parity, where the sigmoidal functions with a variable point of inflection (Richards and Morgan) outperformed the other 4 equations. All the functions provided satisfactory predictions of milk yield (calculated from the first derivative of the functions) at different lactation stages, from early to late lactation. The Richards and Morgan equations provided the most accurate estimates of peak yield and total milk production per 305-d lactation, whereas the least accurate estimates were obtained with the logistic equation. In conclusion, classical growth functions (especially sigmoidal functions with a variable point of inflection) proved to be feasible alternatives to fit cumulative milk production curves of dairy cows, resulting in suitable statistical performance and accurate estimates of lactation traits. Copyright © 2015 American Dairy Science Association. Published by Elsevier Inc. All rights reserved.
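The idea of obtaining daily milk yield as the first derivative of a cumulative growth function can be sketched for the logistic case. The parametrization below (asymptote A, rate k, inflection time t_i) is one common textbook form and is an assumption; the paper's exact parametrization and fitted values are not given in the abstract, so all numbers here are hypothetical.

```python
import math

def logistic_cumulative(t, A, k, t_i):
    """Cumulative production at day t under a logistic growth curve."""
    return A / (1.0 + math.exp(-k * (t - t_i)))

def logistic_daily_yield(t, A, k, t_i):
    """First derivative of the logistic curve: dW/dt = k * W * (1 - W / A)."""
    w = logistic_cumulative(t, A, k, t_i)
    return k * w * (1.0 - w / A)

# Hypothetical parameters: 9000 kg asymptote, inflection at day 120.
A, k, t_i = 9000.0, 0.02, 120.0
peak_yield = logistic_daily_yield(t_i, A, k, t_i)  # maximum slope, A * k / 4
```

In this form the peak yield always falls at the inflection point, where cumulative production is exactly half the asymptote. That fixed inflection is the kind of rigidity that functions with a variable point of inflection, such as Richards, relax.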
The crux of the method: assumptions in ordinary least squares and logistic regression.
Long, Rebecca G
2008-10-01
Logistic regression has increasingly become the tool of choice when analyzing data with a binary dependent variable. While resources relating to the technique are widely available, clear discussions of why logistic regression should be used in place of ordinary least squares regression are difficult to find. The current paper compares and contrasts the assumptions of ordinary least squares with those of logistic regression and explains why logistic regression's looser assumptions make it adept at handling violations of the more important assumptions in ordinary least squares.
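One well-known practical symptom of applying ordinary least squares to a binary outcome is that fitted values can fall outside the 0 to 1 range. The sketch below fits a straight line to a tiny made-up data set to show this; the data and the function name are hypothetical and are not drawn from the paper.

```python
def ols_fit(xs, ys):
    """Closed-form simple linear regression; returns (intercept, slope)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope

# Hypothetical binary outcome that switches from 0 to 1 as x grows.
xs = [0, 1, 2, 3, 4, 5]
ys = [0, 0, 0, 1, 1, 1]
intercept, slope = ols_fit(xs, ys)
pred_at_min = intercept + slope * xs[0]    # falls below 0
pred_at_max = intercept + slope * xs[-1]   # rises above 1
```

A logistic model avoids this because its fitted probabilities are bounded between 0 and 1 by construction.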
NASA Astrophysics Data System (ADS)
Althuwaynee, Omar F.; Pradhan, Biswajeet; Ahmad, Noordin
2014-06-01
This article uses a methodology based on chi-squared automatic interaction detection (CHAID), a multivariate method with an automatic classification capacity, to analyse large numbers of landslide conditioning factors. This new algorithm was developed to overcome the subjectivity of the manual categorization of scale data of landslide conditioning factors, and to produce a rainfall-induced landslide susceptibility map for Kuala Lumpur city and surrounding areas using a geographic information system (GIS). The main objective of this article is to use the CHAID method to find the best classification fit for each conditioning factor and then to combine it with logistic regression (LR). The LR model was used to find the corresponding coefficients of the best fitting function that assess the optimal terminal nodes. A cluster pattern of landslide locations was extracted in a previous study using the nearest neighbor index (NNI), which was then used to identify the range of clustered landslide locations. Clustered locations were used as model training data with 14 landslide conditioning factors such as topographic derived parameters, lithology, NDVI, and land use and land cover maps. The Pearson chi-squared value was used to find the best classification fit between the dependent variable and the conditioning factors. Finally, the relationships between conditioning factors were assessed and the landslide susceptibility map (LSM) was produced. The area under the curve (AUC) was used to test the model reliability and prediction capability with the training and validation landslide locations, respectively. This study demonstrated the efficiency and reliability of the decision tree (DT) model in landslide susceptibility mapping. It also provided a valuable scientific basis for spatial decision making in planning and urban management studies.
Using Dominance Analysis to Determine Predictor Importance in Logistic Regression
ERIC Educational Resources Information Center
Azen, Razia; Traxel, Nicole
2009-01-01
This article proposes an extension of dominance analysis that allows researchers to determine the relative importance of predictors in logistic regression models. Criteria for choosing logistic regression R² analogues were determined and measures were selected that can be used to perform dominance analysis in logistic regression. A…
NASA Astrophysics Data System (ADS)
Erener, Arzu; Sivas, A. Abdullah; Selcuk-Kestel, A. Sevtap; Düzgün, H. Sebnem
2017-07-01
All quantitative landslide susceptibility mapping (QLSM) methods require two basic data types, namely, a landslide inventory and the factors that influence landslide occurrence (landslide influencing factors, LIF). The accuracy of QLSM methods differs depending on the type of landslides, the nature of the triggers and the LIF. Moreover, how to balance the number of 0s (nonoccurrence) and 1s (occurrence) in the training set obtained from the landslide inventory, and how to select which of the 1s and 0s to include in QLSM models, play a critical role in the accuracy of the QLSM. Although the performance of various QLSM methods has been widely investigated in the literature, the challenge of training set construction has not been adequately investigated. To tackle this challenge, this study uses three different training set selection strategies, along with the original data set, to test the performance of three different regression methods, namely Logistic Regression (LR), Bayesian Logistic Regression (BLR) and Fuzzy Logistic Regression (FLR). The first sampling strategy is proportional random sampling (PRS), which takes into account a weighted selection of landslide occurrences in the sample set. The second method, namely non-selective nearby sampling (NNS), includes randomly selected sites and their surrounding neighboring points at certain preselected distances to include the impact of clustering. Selective nearby sampling (SNS) is the third method, which concentrates on the group of 1s and their surrounding neighborhood. A randomly selected group of landslide sites and their neighborhood are considered in the analyses, with parameters similar to those of NNS. It is found that the LR-PRS, FLR-PRS and BLR-Whole Data set-ups, in that order, yield the best fits among the alternatives. The results indicate that in QLSM based on regression models, avoidance of spatial correlation in the data set is critical for the model's performance.
Sperm Retrieval in Patients with Klinefelter Syndrome: A Skewed Regression Model Analysis.
Chehrazi, Mohammad; Rahimiforoushani, Abbas; Sabbaghian, Marjan; Nourijelyani, Keramat; Sadighi Gilani, Mohammad Ali; Hoseini, Mostafa; Vesali, Samira; Yaseri, Mehdi; Alizadeh, Ahad; Mohammad, Kazem; Samani, Reza Omani
2017-01-01
The most common chromosomal abnormality associated with non-obstructive azoospermia (NOA) is Klinefelter syndrome (KS), which occurs in 1-1.72 out of 500-1000 male infants. The probability of retrieving sperm as the outcome could be asymmetrically different between patients with and without KS; therefore, logistic regression analysis is not a well-qualified test for this type of data. This study was designed to evaluate skewed regression model analysis for data collected from microsurgical testicular sperm extraction (micro-TESE) among azoospermic patients with and without non-mosaic KS. This cohort study compared the micro-TESE outcome between 134 men with classic KS and 537 men with NOA and normal karyotype who were referred to Royan Institute between 2009 and 2011. In addition to our main outcome, which was sperm retrieval, we also used logistic and skewed regression analyses to compare the following demographic and hormonal factors between the two groups: age and levels of follicle stimulating hormone (FSH), luteinizing hormone (LH), and testosterone. A comparison of micro-TESE between the KS and control groups showed a success rate of 28.4% (38/134) for the KS group and 22.2% (119/537) for the control group. In the KS group, a significant difference (P<0.001) existed between testosterone levels for the successful sperm retrieval group (3.4 ± 0.48 mg/mL) and the unsuccessful sperm retrieval group (2.33 ± 0.23 mg/mL). The quasi Akaike information criterion (QAIC) was 74 for the skewed model, which was lower than for logistic regression (QAIC=85). According to the results, skewed regression is more efficient in estimating sperm retrieval success when the data from patients with KS are analyzed. This finding should be investigated by conducting additional studies with different data structures.
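The reported success rates can be checked directly from the counts in the abstract, and the same counts give an unadjusted odds ratio. The sketch below does that arithmetic; the function name is an illustrative assumption, and the unadjusted odds ratio is a derived quantity that the abstract itself does not report.

```python
def odds_ratio(success_a, total_a, success_b, total_b):
    """Unadjusted odds ratio of success in group A relative to group B."""
    odds_a = success_a / (total_a - success_a)
    odds_b = success_b / (total_b - success_b)
    return odds_a / odds_b

# Counts taken from the abstract: 38/134 (KS) and 119/537 (control).
rate_ks = 38 / 134
rate_control = 119 / 537
unadjusted_or = odds_ratio(38, 134, 119, 537)
```

In a logistic regression with group membership as the only predictor, the exponentiated group coefficient equals this unadjusted odds ratio.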
The UK Military Experience of Thoracic Injury in the Wars in Iraq and Afghanistan
2013-01-01
investigations including computed tomography (CT), laboratory and blood bank. A Role 4 hospital is a fixed capability in the home nation capable of providing full...not an independent predictor of mortality in our model. Goodness of the logistic regression model fit was demonstrated using a Hosmer and Lemeshow test...of good practice and ethical care; thus we believe the hidden mortality is minimal. It is possible that in some circumstances, the desire to do
Sugihara, Toru; Yasunaga, Hideo; Horiguchi, Hiromasa; Fujimura, Tetsuya; Fushimi, Kiyohide; Yu, Changhong; Kattan, Michael W; Homma, Yukio
2014-12-01
Little is known about the disparity of choices between three urinary diversions after radical cystectomy, focusing on patient and institutional factors. We identified urothelial carcinoma patients who received radical cystectomy with cutaneous ureterostomy, ileal conduit or continent reservoir using the Japanese Diagnosis Procedure Combination database from 2007 to 2012. Data comprised age, sex, comorbidities (converted into the Charlson index), TNM classification (converted into oncological stage), hospitals' academic status, hospital volume, bed volume and geographical region. Multivariate ordinal logistic regression analyses fitted with the proportional odds model were performed to analyze factors affecting urinary diversion choices. For dependent variables, the three diversions were converted into an ordinal variable in order of complexity: cutaneous ureterostomy (reference), ileal conduit and continent reservoir. Geographical variations were also examined by multivariate logistic regression models. A total of 4790 patients (1131 cutaneous ureterostomies [23.6 %], 2970 ileal conduits [62.0 %] and 689 continent reservoirs [14.4 %]) were included. Ordinal logistic regression analyses showed that male sex, lower age, lower Charlson index, early tumor stage, higher hospital volume (≥3.4 cases/year) and larger bed volume (≥450 beds) were significantly associated with the preference of more complex urinary diversion. Significant geographical disparity was also found. Good patient condition and early oncological status, as well as institutional factors, including high hospital volume, large bed volume and specific geographical regions, are independently related to the likelihood of choosing complex diversions. Recognizing this disparity would help reinforce the need for clinical practice uniformity.
Peng, Yong; Peng, Shuangling; Wang, Xinghua; Tan, Shiyang
2018-06-01
This study aims to identify the effects of vehicle, roadway, driver, and environment characteristics on driver fatality in vehicle-fixed object accidents on expressways in the Changsha-Zhuzhou-Xiangtan district of Hunan province in China by developing multinomial logistic regression models. For this purpose, 121 vehicle-fixed object accidents from 2011-2017 are included in the modeling process. First, descriptive statistical analysis is performed to understand the main characteristics of the vehicle-fixed object crashes. Then, 19 explanatory variables are selected, and pairwise correlation analysis is conducted to choose the variables to be included. Finally, five multinomial logistic regression models including different independent variables are compared, and the model with the best fit and prediction capability is chosen as the final model. The results showed that the turning direction in avoiding fixed objects raised the possibility that drivers would die. About 64% of the drivers who died in these accidents were found to have been ejected from the car, of whom 50% were not wearing a seatbelt before the fatal accident. Drivers are likely to die when they encounter bad weather on the expressway. Drivers with less than 10 years of driving experience are more likely to die in these accidents. Fatigue or distracted driving is also a significant factor in driver fatality. Findings from this research provide insight into reducing driver fatality in vehicle-fixed object accidents.
Applying Kaplan-Meier to Item Response Data
ERIC Educational Resources Information Center
McNeish, Daniel
2018-01-01
Some IRT models can be equivalently modeled in alternative frameworks such as logistic regression. Logistic regression can also model time-to-event data, which concerns the probability of an event occurring over time. Using the relation between time-to-event models and logistic regression and the relation between logistic regression and IRT, this…
Cardiorespiratory Fitness, Waist Circumference and Alanine Aminotransferase in Youth
Trilk, Jennifer L.; Ortaglia, Andrew; Blair, Steven N.; Bottai, Matteo; Church, Timothy S.; Pate, Russell R.
2012-01-01
Non-alcoholic fatty liver disease (NAFLD) is considered the liver component of the metabolic syndrome and is strongly associated with cardiometabolic diseases. In adults, cardiorespiratory fitness (CRF) is inversely associated with alanine aminotransferase (ALT), a blood biomarker for NAFLD. However, information regarding these associations is scarce for youth. Purpose To examine associations between CRF, waist circumference (WC) and ALT in youth. Methods Data were obtained from youth (n=2844, 12-19 years) in the National Health and Nutrition Examination Survey (NHANES) 2001-2004. CRF was dichotomized into youth FITNESSGRAM® categories of “low” and “adequate” CRF. Logistic and quantile regression were used for a comprehensive analysis of associations, and variables with previously-reported associations with ALT were a priori included in the models. Results Results from logistic regression suggested that youth with low CRF had 1.5 times the odds of having an ALT>30 than youth with adequate CRF, although the association was not statistically significant (P=0.09). However, quantile regression demonstrated that youth with low CRF had statistically significantly higher ALT (+1.04, +1.05, and +2.57 U/L) at the upper end of the ALT distribution (80th, 85th, and 90th percentiles, respectively) than youth with adequate CRF. For every 1-cm increase in WC, the odds of having an ALT>30 increased by 1.06 (P<0.001), and the strength of this association increased across the ALT distribution. Conclusions Future studies should examine whether interventions to improve CRF can decrease hepatic fat and liver enzyme concentrations in youth with ALT ≥80th percentile or in youth diagnosed with NAFLD. PMID:23190589
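A per-unit odds ratio compounds multiplicatively over larger changes in the predictor. Reading the reported 1.06 as an odds ratio per centimetre of waist circumference (an interpretation of the abstract's wording, stated here as an assumption), the sketch below recovers the implied logit-scale coefficient and the odds ratio for a larger difference. The 10-cm contrast is a hypothetical example.

```python
import math

OR_PER_CM = 1.06                      # per-centimetre odds ratio, read from the abstract
BETA_PER_CM = math.log(OR_PER_CM)     # implied logistic regression coefficient

def odds_ratio_for_change(delta_cm, beta=BETA_PER_CM):
    """Odds ratio for a waist circumference difference of delta_cm."""
    return math.exp(beta * delta_cm)

or_10cm = odds_ratio_for_change(10)   # hypothetical 10-cm difference
```

This assumes the effect is linear on the logit scale, as in a standard logistic model. The quantile regression results in the abstract suggest the association varies across the ALT distribution, so the compounding is only a first approximation.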
Porter, Anna K; Matthews, Krystin J; Salvo, Deborah; Kohl, Harold W
2017-07-01
Most US adolescents do not meet guidelines of at least 60 daily minutes of moderate- to vigorous-intensity physical activity. In addition, sedentary behaviors among this age group are of increasing concern. This study examined the association of movement behaviors with cardiovascular fitness among US adolescents. Data from the 2012 NHANES National Youth Fitness Survey were used to assess the association of movement behaviors (physical activity, sedentary time, screen time) with cardiovascular fitness among adolescent males and females. Multiple logistic regressions were used to test the independent and interactive effects of movement behaviors on cardiovascular fitness. Among females, physical activity was directly associated with cardiovascular fitness; no significant association was observed between sedentary behaviors and cardiovascular fitness. Among males, sedentary time moderated the relationship between physical activity and cardiovascular fitness, such that a significant, direct association was only observed among those with high sedentary time (OR: 5.01; 95% CI: 1.60, 15.70). Results from this cross-sectional analysis suggest that among female US adolescents, physical activity, but not sedentary behavior, is associated with cardiovascular fitness. Among males, the interaction between physical activity and sedentary time seems to be important for cardiovascular fitness. Longitudinal studies are warranted to confirm these findings.
Esserman, Denise A.; Moore, Charity G.; Roth, Mary T.
2009-01-01
Older community dwelling adults often take multiple medications for numerous chronic diseases. Non-adherence to these medications can have a large public health impact. Therefore, the measurement and modeling of medication adherence in the setting of polypharmacy is an important area of research. We apply a variety of different modeling techniques (standard linear regression; weighted linear regression; adjusted linear regression; naïve logistic regression; beta-binomial (BB) regression; generalized estimating equations (GEE)) to binary medication adherence data from a study in a North Carolina based population of older adults, where each medication an individual was taking was classified as adherent or non-adherent. In addition, through simulation we compare these different methods based on Type I error rates, bias, power, empirical 95% coverage, and goodness of fit. We find that estimation and inference using GEE is robust to a wide variety of scenarios and we recommend using this in the setting of polypharmacy when adherence is dichotomously measured for multiple medications per person. PMID:20414358
[Development of the lung cancer diagnostic system].
Lv, You-Jiang; Yu, Shou-Yi
2009-07-01
To develop a lung cancer diagnosis system. A retrospective analysis was conducted in 1883 patients with primary lung cancer or benign pulmonary diseases (pneumonia, tuberculosis, or pneumonia pseudotumor). SPSS11.5 software was used for data processing. For the relevant factors, a non-factor Logistic regression analysis was used followed by establishment of the regression model. Microsoft Visual Studio 2005 system development platform and VB.Net corresponding language were used to develop the lung cancer diagnosis system. The non-factor multi-factor regression model showed a goodness-of-fit (R2) of the model of 0.806, with a diagnostic accuracy for benign lung diseases of 92.8%, a diagnostic accuracy for lung cancer of 89.0%, and an overall accuracy of 90.8%. The model system for early clinical diagnosis of lung cancer has been established.
Amagasa, Takashi; Nakayama, Takeo
2013-08-01
To clarify how long working hours affect the likelihood of current and future depression. Using data from four repeated measurements collected from 218 clerical workers, four models associating work-related factors to the depressive mood scale were established. The final model was constructed after comparing and testing the goodness-of-fit index using structural equation modeling. Multiple logistic regression analysis was also performed. The final model showed the best fit (normed fit index = 0.908; goodness-of-fit index = 0.936; root-mean-square error of approximation = 0.018). Its standardized total effect indicated that long working hours affected depression at the time of evaluation and 1 to 3 years later. The odds ratio for depression risk was 14.7 in employees who were not long-hours overworked according to the initial survey but who were long-hours overworked according to the second survey. Long working hours increase current and future risks of depression.
Jiang, Jun; Lei, Lan; Zhou, Xiaowan; Li, Peng; Wei, Ren
2018-02-20
Recent studies have shown that a low hemoglobin (Hb) level promotes the progression of chronic kidney disease. This study assessed the relationship between Hb level and type 1 diabetic nephropathy (DN) in Anhui Han patients. A total of 236 patients diagnosed with type 1 diabetes mellitus (T1DM) were seen between January 2014 and December 2016 in our centre. Hemoglobin levels in patients with DN were compared with those in patients without DN. The relationship between Hb level and the urinary albumin-creatinine ratio (ACR) was examined by Spearman's correlation analysis and multiple stepwise regression analysis. Binary multivariate logistic regression analysis was performed to identify factors associated with type 1 DN and to calculate odds ratios (OR) and 95% confidence intervals (CI). The predictive value of Hb level for DN was evaluated by the area under the receiver operating characteristic curve (AUROC) for discrimination and the Hosmer-Lemeshow goodness-of-fit test for calibration. The average Hb level in the DN group (116.1 ± 20.8 g/L) was significantly lower than in the non-DN group (131.9 ± 14.4 g/L), P < 0.001. Hb levels were independently correlated with the urinary ACR in multiple stepwise regression analysis. The multivariate logistic regression analysis showed that the Hb level (OR: 0.936, 95% CI: 0.910 to 0.963, P < 0.001) was inversely correlated with DN in patients with T1DM. In sub-analysis, low Hb level (Hb < 120 g/L in females, Hb < 130 g/L in males) was still negatively associated with DN in patients with T1DM. The AUROC was 0.721 (95% CI: 0.655 to 0.787) in assessing the discrimination of the Hb level for DN. The P value was 0.593 in the Hosmer-Lemeshow goodness-of-fit test. In Anhui Han patients with T1DM, the Hb level is inversely correlated with urinary ACR and DN.
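Note on the discrimination measure: the AUROC quoted above can be read as the probability that a randomly chosen case receives a higher risk score than a randomly chosen non-case, with ties counted as one half. The sketch below computes it by direct pairwise comparison in plain Python. The toy scores and labels are hypothetical and are not the study's data; in a setting like this one, where lower Hb indicates higher risk, the score could be the negated Hb value or a fitted probability.

```python
def auroc(scores, labels):
    """Area under the ROC curve via pairwise comparison.

    scores: higher means more likely to be a case.
    labels: 1 for cases, 0 for non-cases.
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one case and one non-case")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0      # case ranked above non-case
            elif p == n:
                wins += 0.5      # ties count half
    return wins / (len(pos) * len(neg))

# Hypothetical toy data for illustration only.
toy_scores = [0.9, 0.8, 0.35, 0.6, 0.4, 0.2]
toy_labels = [1, 1, 1, 0, 0, 0]
toy_auc = auroc(toy_scores, toy_labels)   # 7 of 9 pairs are ordered correctly
```

The pairwise form is quadratic in sample size, which is acceptable for a few hundred patients; rank-based formulas give the same value more efficiently.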
Alcohol-related predictors of adolescent driving: gender differences in crashes and offenses.
Shope, J T; Waller, P F; Lang, S W
1996-11-01
Demographic and alcohol-related data collected from eighth-grade students (age 13 years) were used in logistic regression to predict subsequent first-year driving crashes and offenses (age 17 years). For young men's crashes and offenses, good-fitting models used living situation (both parents or not), parents' attitude about teen drinking (negative or neutral), and the interaction term. Young men who lived with both parents and reported negative parental attitudes regarding teen drinking were less likely to have crashes and offenses. For young women's crashes, a good-fitting model included friends' involvement with alcohol. Young women who reported that their friends were not involved with alcohol were least likely to have crashes. No model predicting young women's offenses emerged.
Diagnostic efficiency of an ability-focused battery.
Miller, Justin B; Fichtenberg, Norman L; Millis, Scott R
2010-05-01
An ability-focused battery (AFB) is a selected group of well-validated neuropsychological measures that assess the conventional range of cognitive domains. This study examined the diagnostic efficiency of an AFB for use in clinical decision making with a mixed sample composed of individuals with neurological brain dysfunction and individuals referred for cognitive assessment without evidence of neurological disorders. Using logistic regression analyses and ROC curve analysis, a five-domain model composed of attention, processing speed, visual-spatial reasoning, language/verbal reasoning, and memory domain scores was fitted that had an AUC of .89 (95% CI = .84-.95). A more parsimonious two-domain model using processing speed and memory was also fitted that had an AUC of .90 (95% CI = .84-.95). A model composed of a global ability score calculated from the mean of the individual domain scores was also fitted with an AUC of .88 (95% CI = .82-.94).
Measures of health, fitness, and functional movement among firefighter recruits.
Cornell, David J; Gnacinski, Stacy L; Zamzow, Aaron; Mims, Jason; Ebersole, Kyle T
2017-06-01
The purpose of this study was to examine the associations between various health and fitness measures and Functional Movement Screen™ (FMS™) scores among 78 firefighter recruits. Relationships between FMS™ scores and age, body mass index (BMI), sit and reach (S&R) distance, estimated maximal aerobic capacity (V̇O2max), estimated one-repetition maximum squat (1RM-Squat max), and plank endurance (%Plank max) were examined. Total FMS™ scores were significantly correlated with BMI (r = -0.231, p = 0.042), estimated 1RM-Squat max (r = 0.302, p = 0.007), and %Plank max (r = 0.320, p = 0.004). Multiple regression analyses indicated that this combination of predictors significantly predicted (F(3, 74) = 5.043, p = 0.003) Total FMS™ score outcomes and accounted for 17% of the total variance (R² = 0.170). In addition, logistic regression analyses indicated that estimated 1RM-Squat max also significantly predicted (χ² = 6.662, df = 1, p = 0.010) FMS™ group membership (≤14 or ≥15). These results suggest that the health and fitness measures of obesity (BMI), bilateral lower extremity strength (estimated 1RM-Squat max), and core muscular endurance (%Plank max) are significantly associated with functional movement patterns among firefighter recruits. Consequently, injury prevention programs implemented among firefighter recruits should target these aspects of health and fitness.
NASA Astrophysics Data System (ADS)
Lin, Yingzhi; Deng, Xiangzheng; Li, Xing; Ma, Enjun
2014-12-01
Spatially explicit simulation of land use change is the basis for estimating the effects of land use and cover change on energy fluxes, ecology and the environment. At the pixel level, logistic regression is one of the most common approaches used in spatially explicit land use allocation models to determine the relationship between land use and its causal factors in driving land use change, and thereby to evaluate land use suitability. However, these models have a drawback in that they do not determine/allocate land use based on the direct relationship between land use change and its driving factors. Consequently, a multinomial logistic regression method was introduced to address this flaw, and thereby, judge the suitability of a type of land use in any given pixel in a case study area of the Jiangxi Province, China. A comparison of the two regression methods indicated that the proportion of correctly allocated pixels using multinomial logistic regression was 92.98%, which was 8.47% higher than that obtained using logistic regression. Paired t-test results also showed that pixels were more clearly distinguished by multinomial logistic regression than by logistic regression. In conclusion, multinomial logistic regression is a more efficient and accurate method for the spatial allocation of land use changes. The application of this method in future land use change studies may improve the accuracy of predicting the effects of land use and cover change on energy fluxes, ecology, and environment.
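Note on the allocation step: a multinomial logistic model assigns each pixel one probability per land use class, and the class with the highest probability can be taken as the allocated use. The Python sketch below shows that step using a softmax over per-class linear predictors. The class names and predictor values are hypothetical, and the allocation rule (highest probability wins) is an assumption for illustration, not necessarily the rule used in the study.

```python
import math

def multinomial_probs(linear_predictors):
    """Softmax: convert one linear predictor per class into probabilities."""
    m = max(linear_predictors)                 # subtract max for numerical stability
    exps = [math.exp(v - m) for v in linear_predictors]
    total = sum(exps)
    return [e / total for e in exps]

def allocate(linear_predictors, classes):
    """Return the most probable class for one pixel and all class probabilities."""
    probs = multinomial_probs(linear_predictors)
    best = max(range(len(classes)), key=lambda i: probs[i])
    return classes[best], probs

# Hypothetical classes and predictors; the reference class is fixed at 0.
classes = ["cropland", "forest", "built-up"]
eta = [0.0, 1.2, -0.4]
label, probs = allocate(eta, classes)
print(label)   # forest
```

By contrast, fitting a separate binary logistic model per class yields suitability scores that do not sum to one across classes, which is one way to read the drawback the abstract describes.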
Application of Regulatory Focus Theory to Search Advertising
Mowle, Elyse N.; Georgia, Emily J.; Doss, Brian D.; Updegraff, John A.
2015-01-01
Purpose: The purpose of this paper is to test the utility of regulatory focus theory principles in a real-world setting; specifically, Internet hosted text advertisements. Effect of compatibility of the ad text with the regulatory focus of the consumer was examined. Design/methodology/approach: Advertisements were created using Google AdWords. Data were collected for the number of views and clicks each ad received. Effect of regulatory fit was measured using logistic regression. Findings: Logistic regression analyses demonstrated that there was a strong main effect for keyword, such that users were almost six times as likely to click on a promotion advertisement as a prevention advertisement, as well as a main effect for compatibility, such that users were twice as likely to click on an advertisement with content that was consistent with their keyword. Finally, there was a strong interaction of these two variables, such that the effect of consistent advertisements was stronger for promotion searches than for prevention searches. Research limitations/implications: The effect of ad compatibility had medium to large effect sizes, suggesting that individuals’ state may have more influence on advertising response than do individuals’ traits (e.g. personality traits). Measurement of regulatory fit was limited by the constraints of Google AdWords. Practical implications: The results of this study provide a possible framework for ad creation for Internet advertisers. Originality/value: This paper is the first study to demonstrate the utility of regulatory focus theory in online advertising. PMID:26430293
Sperandio, Evandro Fornias; Arantes, Rodolfo Leite; da Silva, Rodrigo Pereira; Matheus, Agatha Caveda; Lauria, Vinícius Tonon; Bianchim, Mayara Silveira; Romiti, Marcello; Gagliardi, Antônio Ricardo de Toledo; Dourado, Victor Zuniga
2016-01-01
Accelerometry provides objective measurement of physical activity levels, but is unfeasible in clinical practice. Thus, we aimed to identify physical fitness tests capable of predicting physical inactivity among adults. Diagnostic test study developed at a university laboratory and a diagnostic clinic. 188 asymptomatic subjects underwent assessment of physical activity levels through accelerometry, ergospirometry on treadmill, body composition from bioelectrical impedance, isokinetic muscle function, postural balance on a force platform and six-minute walk test. We conducted descriptive analysis and multiple logistic regression including age, sex, oxygen uptake, body fat, center of pressure, quadriceps peak torque, distance covered in six-minute walk test and steps/day in the model, as predictors of physical inactivity. We also determined sensitivity (S), specificity (Sp) and area under the curve of the main predictors by means of receiver operating characteristic curves. The prevalence of physical inactivity was 14%. The mean number of steps/day (≤ 5357) was the best predictor of physical inactivity (S = 99%; Sp = 82%). The best physical fitness test was a distance in the six-minute walk test and ≤ 96% of predicted values (S = 70%; Sp = 80%). Body fat > 25% was also significant (S = 83%; Sp = 51%). After logistic regression, steps/day and distance in the six-minute walk test remained predictors of physical inactivity. The six-minute walk test should be included in epidemiological studies as a simple and cheap tool for screening for physical inactivity.
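Note on the reported accuracy measures: sensitivity and specificity at a cutoff such as steps/day ≤ 5357 follow from counting how the cutoff classifies inactive and active subjects. The Python sketch below shows the calculation. The step counts and inactivity labels are hypothetical toy values, not the study's data; only the cutoff value is taken from the abstract.

```python
def sens_spec_at_cutoff(values, inactive, cutoff):
    """Sensitivity and specificity when `value <= cutoff` flags inactivity."""
    tp = fn = tn = fp = 0
    for v, is_inactive in zip(values, inactive):
        flagged = v <= cutoff
        if is_inactive and flagged:
            tp += 1          # inactive subject correctly flagged
        elif is_inactive:
            fn += 1          # inactive subject missed
        elif flagged:
            fp += 1          # active subject wrongly flagged
        else:
            tn += 1          # active subject correctly cleared
    return tp / (tp + fn), tn / (tn + fp)

# Hypothetical toy data for illustration only.
steps = [3000, 4800, 5200, 6100, 5300, 7000, 8200, 9000, 10500, 12000]
inactive = [True, True, True, True, False, False, False, False, False, False]
sens, spec = sens_spec_at_cutoff(steps, inactive, cutoff=5357)
```

Sweeping the cutoff over all observed values and plotting sensitivity against 1 − specificity traces the receiver operating characteristic curve mentioned in the abstract.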
Sampaolo, Letizia; Tommaso, Giulia; Gherardi, Bianca; Carrozzi, Giuliano; Freni Sterrantino, Anna; Ottone, Marta; Goldoni, Carlo Alberto; Bertozzi, Nicoletta; Scaringi, Meri; Bolognesi, Lara; Masocco, Maria; Salmaso, Stefania; Lauriola, Paolo
2017-01-01
Objectives: to identify groups of people in relation to their perception of environmental risk and to assess their main characteristics using data collected in the environmental module of the Italian Behavioral Risk Factor Surveillance System (PASSI) surveillance network. Perceptive profiles were identified using latent class analysis; they were then included as the outcome in multinomial logistic regression models to assess the association between environmental risk perception and demographic, health, socio-economic and behavioural variables. The latent class analysis split the sample into "worried", "indifferent", and "positive" people. The multinomial logistic regression model showed that the "worried" profile typically includes people of Italian nationality, living in highly urbanized areas, with a high level of education, and with economic difficulties; they pay special attention to their own health and fitness, but they have a negative perception of their own psychophysical state. Applying advanced statistical analysis to PASSI data makes it possible to characterize the perception of environmental risk and thereby to plan risk-communication interventions.
Valérie Passo Tsamo, Claudine; Andre, Christelle M; Ritter, Christian; Tomekpe, Kodjo; Ngoh Newilah, Gérard; Rogez, Hervé; Larondelle, Yvan
2014-08-27
This study aimed at understanding the contribution of the fruit physicochemical parameters to Musa sp. diversity and plantain ripening stages. A discriminant analysis was first performed on a collection of 35 Musa sp. cultivars, organized in six groups based on the consumption mode (dessert or cooking banana) and the genomic constitution. A principal component analysis reinforced by a logistic regression on plantain cultivars was proposed as an analytical approach to describe the plantain ripening stages. The results of the discriminant analysis showed that edible fraction, peel pH, pulp water content, and pulp total phenolics were among the most contributing attributes for the discrimination of the cultivar groups. With mean values ranging from 65.4 to 247.3 mg of gallic acid equivalents/100 g of fresh weight, the pulp total phenolics strongly differed between interspecific and monospecific cultivars within dessert and nonplantain cooking bananas. The results of the logistic regression revealed that the best models according to fitting parameters involved more than one physicochemical attribute. Interestingly, pulp and peel total phenolic contents contributed in the building up of these models.
Variational dynamic background model for keyword spotting in handwritten documents
NASA Astrophysics Data System (ADS)
Kumar, Gaurav; Wshah, Safwan; Govindaraju, Venu
2013-12-01
We propose a Bayesian framework for keyword spotting in handwritten documents. This work extends our previous work, in which we proposed the dynamic background model (DBM) for keyword spotting, which takes into account local character-level scores and global word-level scores to learn a logistic regression classifier that separates keywords from non-keywords. In this work, we add a Bayesian layer on top of the DBM, called the variational dynamic background model (VDBM). The logistic regression classifier uses the sigmoid function to separate keywords from non-keywords. Because the sigmoid function is neither convex nor concave, exact inference in the VDBM becomes intractable. An expectation maximization step is proposed to do approximate inference. The VDBM has several advantages over the DBM. First, being Bayesian, it prevents over-fitting of data. Second, it provides better modeling of data and an improved prediction of unseen data. VDBM is evaluated on the IAM dataset, and the results show that it outperforms our prior work and other state-of-the-art line-based word spotting systems.
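Note on the sigmoid's shape: the statement that the sigmoid is neither convex nor concave can be checked from its second derivative, σ''(x) = σ(x)(1 − σ(x))(1 − 2σ(x)), which is positive for x < 0 and negative for x > 0. The short Python sketch below evaluates it at two points; the function names are illustrative and not taken from the paper.

```python
import math

def sigmoid(x):
    """Numerically stable logistic sigmoid."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)
    return e / (1.0 + e)

def sigmoid_second_derivative(x):
    """sigma''(x) = sigma(x) * (1 - sigma(x)) * (1 - 2 * sigma(x))."""
    s = sigmoid(x)
    return s * (1.0 - s) * (1.0 - 2.0 * s)

curvature_left = sigmoid_second_derivative(-2.0)   # positive: locally convex
curvature_right = sigmoid_second_derivative(2.0)   # negative: locally concave
```

The sign change at x = 0 is why a single global quadratic bound is not available and why variational treatments typically introduce a per-observation bound instead.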
Occupational exposures and non-Hodgkin's lymphoma: Canadian case-control study.
Karunanayake, Chandima P; McDuffie, Helen H; Dosman, James A; Spinelli, John J; Pahwa, Punam
2008-08-07
The objective was to study the association between Non-Hodgkin's Lymphoma (NHL) and occupational exposures related to long held occupations among males in six provinces of Canada. A population based case-control study was conducted from 1991 to 1994. Males with newly diagnosed NHL (ICD-10) were stratified by province of residence and age group. A total of 513 incident cases and 1506 population based controls were included in the analysis. Conditional logistic regression was conducted to fit statistical models. Based on conditional logistic regression modeling, the following factors independently increased the risk of NHL: farmer and machinist as long held occupations; constant exposure to diesel exhaust fumes; constant exposure to ionizing radiation (radium); and personal history of another cancer. Men who had worked for 20 years or more as farmer and machinist were the most likely to develop NHL. An increased risk of developing NHL is associated with the following: long held occupations of farmer and machinist; exposure to diesel fumes; and exposure to ionizing radiation (radium). The risk of NHL increased with the duration of employment as a farmer or machinist.
What are hierarchical models and how do we analyze them?
Royle, Andy
2016-01-01
In this chapter we provide a basic definition of hierarchical models and introduce the two canonical hierarchical models in this book: site occupancy and N-mixture models. The former is a hierarchical extension of logistic regression and the latter is a hierarchical extension of Poisson regression. We introduce basic concepts of probability modeling and statistical inference including likelihood and Bayesian perspectives. We go through the mechanics of maximizing the likelihood and characterizing the posterior distribution by Markov chain Monte Carlo (MCMC) methods. We give a general perspective on topics such as model selection and assessment of model fit, although we demonstrate these topics in practice in later chapters (especially Chapters 5, 6, 7, and 10).
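Note on maximizing a likelihood: as a much simpler stand-in for the site occupancy and N-mixture models named above, the Python sketch below maximizes the log-likelihood of an intercept-only logistic regression by Newton-Raphson. The data are hypothetical, and the example is an illustration of the mechanics, not the chapter's own code. For this model the maximum has a closed form, the logit of the sample proportion, which makes the result easy to check.

```python
import math

def fit_intercept_only(y, iterations=25):
    """Maximum-likelihood intercept of a logistic model with no covariates."""
    n = len(y)
    successes = sum(y)
    if successes == 0 or successes == n:
        raise ValueError("the MLE is not finite when all outcomes are identical")
    b = 0.0
    for _ in range(iterations):
        p = 1.0 / (1.0 + math.exp(-b))
        gradient = successes - n * p      # first derivative of the log-likelihood
        curvature = n * p * (1.0 - p)     # negative second derivative
        b += gradient / curvature         # Newton-Raphson step
    return b

# Hypothetical binary outcomes: 6 ones out of 10.
outcomes = [1, 0, 0, 1, 1, 0, 1, 1, 0, 1]
b_hat = fit_intercept_only(outcomes)
p_hat = 1.0 / (1.0 + math.exp(-b_hat))    # recovers the sample proportion
```

Hierarchical models add latent states on top of this kind of likelihood, which is where numerical integration or MCMC becomes necessary.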
Threshold altitude resulting in decompression sickness
NASA Technical Reports Server (NTRS)
Kumar, K. V.; Waligora, James M.; Calkins, Dick S.
1990-01-01
A review of case reports, hypobaric chamber training data, and experimental evidence indicated that the threshold for incidence of altitude decompression sickness (DCS) was influenced by various factors such as prior denitrogenation, exercise or rest, and period of exposure, in addition to individual susceptibility. Fitting these data with appropriate statistical models makes it possible to examine the influence of various factors on the threshold for DCS. This approach was illustrated by logistic regression analysis on the incidence of DCS below 9144 m. Estimations using these regressions showed that, under a no-prebreathe, 6-h exposure, simulated EVA profile, the threshold for symptoms occurred at approximately 3353 m; while under a no-prebreathe, 2-h exposure profile with knee-bends exercise, the threshold occurred at 7925 m.
NASA Astrophysics Data System (ADS)
Martínez-Fernández, J.; Chuvieco, E.; Koutsias, N.
2013-02-01
Humans are responsible for most forest fires in Europe, but anthropogenic factors behind these events are still poorly understood. We tried to identify the driving factors of human-caused fire occurrence in Spain by applying two different statistical approaches. Firstly, assuming stationary processes for the whole country, we created models based on multiple linear regression and binary logistic regression to find factors associated with fire density and fire presence, respectively. Secondly, we used geographically weighted regression (GWR) to better understand and explore the local and regional variations of those factors behind human-caused fire occurrence. The number of human-caused fires occurring within a 25-yr period (1983-2007) was computed for each of the 7638 Spanish mainland municipalities, creating a binary variable (fire/no fire) to develop logistic models, and a continuous variable (fire density) to build standard linear regression models. A total of 383 657 fires were registered in the study dataset. The binary logistic model, which estimates the probability of having/not having a fire, successfully classified 76.4% of the total observations, while the ordinary least squares (OLS) regression model explained 53% of the variation of the fire density patterns (adjusted R2 = 0.53). Both approaches confirmed, in addition to forest and climatic variables, the importance of variables related to agrarian activities, land abandonment, rural population exodus and developmental processes as underlying factors of fire occurrence. For the GWR approach, the explanatory power of the GW linear model for fire density using an adaptive bandwidth increased from 53% to 67%, while for the GW logistic model the correctly classified observations improved only slightly, from 76.4% to 78.4%, but significantly according to the corrected Akaike Information Criterion (AICc), from 3451.19 to 3321.19.
The results from GWR indicated a significant spatial variation in the local parameter estimates for all the variables and an important reduction of the autocorrelation in the residuals of the GW linear model. Despite the fitting improvement of local models, GW regression, more than an alternative to "global" or traditional regression modelling, seems to be a valuable complement to explore the non-stationary relationships between the response variable and the explanatory variables. The synergy of global and local modelling provides insights into fire management and policy and helps further our understanding of the fire problem over large areas while at the same time recognizing its local character.
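Note on the AICc comparison: the corrected Akaike Information Criterion adds a small-sample penalty to AIC, and lower values indicate a better trade-off between fit and complexity. The Python sketch below uses the standard form AICc = AIC + 2k(k + 1)/(n − k − 1). This is an assumption for illustration: GWR software typically substitutes an effective number of parameters and may use a variant of the formula, and the abstract does not report the log-likelihoods or parameter counts, so only the reported difference is reproduced here.

```python
def aicc(log_likelihood, k, n):
    """Small-sample corrected AIC for k parameters and n observations."""
    if n - k - 1 <= 0:
        raise ValueError("AICc requires n > k + 1")
    aic = 2 * k - 2 * log_likelihood
    return aic + (2 * k * (k + 1)) / (n - k - 1)

# Values quoted in the abstract above.
global_aicc = 3451.19
gw_aicc = 3321.19
delta = global_aicc - gw_aicc      # about 130 in favor of the GW logistic model
```

A difference of this size is far beyond the few units often treated as meaningful, which is consistent with the abstract calling the improvement significant despite the modest gain in classification accuracy.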
Standards for Standardized Logistic Regression Coefficients
ERIC Educational Resources Information Center
Menard, Scott
2011-01-01
Standardized coefficients in logistic regression analysis have the same utility as standardized coefficients in linear regression analysis. Although there has been no consensus on the best way to construct standardized logistic regression coefficients, there is now sufficient evidence to suggest a single best approach to the construction of a…
Organized sports, overweight, and physical fitness in primary school children in Germany.
Drenowatz, Clemens; Steiner, Ronald P; Brandstetter, Susanne; Klenk, Jochen; Wabitsch, Martin; Steinacker, Jürgen M
2013-01-01
Physical inactivity is associated with poor physical fitness and increased body weight. This study examined the relationship between participation in organized sports and overweight as well as physical fitness in primary school children in southern Germany. Height, weight, and various components of physical fitness were measured in 995 children (7.6 ± 0.4 years). Sports participation and confounding variables such as migration background, parental education, parental body weight, and parental sports participation were assessed via parent questionnaire. Multiple logistic regression as well as multivariate analysis of covariance (MANCOVA) was used to determine associations between physical fitness, participation in organized sports, and body weight. Participation in organized sports less than once a week was prevalent in 29.2%, once or twice in 60.2%, and more often in 10.6% of the children. Overweight was found in 12.4% of the children. Children participating in organized sports more than once per week displayed higher physical fitness and were less likely to be overweight (OR = 0.52, P < 0.01). Even though causality cannot be established, the facilitation of participation in organized sports may be a crucial aspect in public health efforts addressing the growing problems associated with overweight and obesity.
Mapping of the DLQI scores to EQ-5D utility values using ordinal logistic regression.
Ali, Faraz Mahmood; Kay, Richard; Finlay, Andrew Y; Piguet, Vincent; Kupfer, Joerg; Dalgard, Florence; Salek, M Sam
2017-11-01
The Dermatology Life Quality Index (DLQI) and the European Quality of Life-5 Dimension (EQ-5D) are separate measures that may be used to gather health-related quality of life (HRQoL) information from patients. The EQ-5D is a generic measure from which health utility estimates can be derived, whereas the DLQI is a specialty-specific measure to assess HRQoL. To reduce the burden of multiple measures being administered and to enable a more disease-specific calculation of health utility estimates, we explored an established mathematical technique known as ordinal logistic regression (OLR) to develop an appropriate model to map DLQI data to EQ-5D-based health utility estimates. Retrospective data from 4010 patients were randomly divided five times into two groups for the derivation and testing of the mapping model. Split-half cross-validation was utilized resulting in a total of ten ordinal logistic regression models for each of the five EQ-5D dimensions against age, sex, and all ten items of the DLQI. Using Monte Carlo simulation, predicted health utility estimates were derived and compared against those observed. This method was repeated for both OLR and a previously tested mapping methodology based on linear regression. The model was shown to be highly predictive and its repeated fitting demonstrated a stable model using OLR as well as linear regression. The mean differences between OLR-predicted health utility estimates and observed health utility estimates ranged from 0.0024 to 0.0239 across the ten modeling exercises, with an average overall difference of 0.0120 (a 1.6% underestimate, not of clinical importance). The modeling framework developed in this study will enable researchers to calculate EQ-5D health utility estimates from a specialty-specific study population, reducing patient and economic burden.
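Note on how an ordinal logistic model yields response probabilities: in the common proportional-odds form, the cumulative probability of responding at or below level j is a logistic function of a cutpoint minus the linear predictor, and each level's probability is the difference between adjacent cumulative probabilities. The Python sketch below shows this step. The cutpoints, the linear predictor value, and the use of three response levels are hypothetical; the abstract does not state the model parameterization or the EQ-5D version.

```python
import math

def ordinal_probs(eta, cutpoints):
    """Category probabilities under a proportional-odds model.

    P(Y <= j) = 1 / (1 + exp(-(cutpoints[j] - eta))); cutpoints must increase.
    """
    if any(a >= b for a, b in zip(cutpoints, cutpoints[1:])):
        raise ValueError("cutpoints must be strictly increasing")
    cumulative = [1.0 / (1.0 + math.exp(-(c - eta))) for c in cutpoints]
    cumulative.append(1.0)                 # the top level closes the distribution
    probs, previous = [], 0.0
    for c in cumulative:
        probs.append(c - previous)         # difference of adjacent cumulative values
        previous = c
    return probs

# Hypothetical values: two cutpoints give three response levels.
level_probs = ordinal_probs(eta=0.5, cutpoints=[-1.0, 1.5])
```

A Monte Carlo step of the kind the abstract mentions would draw a response level for each dimension from these probabilities and then convert the simulated response profile to a utility value.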
Westreich, Daniel; Lessler, Justin; Funk, Michele Jonsson
2010-01-01
Summary. Objective: Propensity scores for the analysis of observational data are typically estimated using logistic regression. Our objective in this Review was to assess machine learning alternatives to logistic regression which may accomplish the same goals but with fewer assumptions or greater accuracy. Study Design and Setting: We identified alternative methods for propensity score estimation and/or classification from the public health, biostatistics, discrete mathematics, and computer science literature, and evaluated these algorithms for applicability to the problem of propensity score estimation, potential advantages over logistic regression, and ease of use. Results: We identified four techniques as alternatives to logistic regression: neural networks, support vector machines, decision trees (CART), and meta-classifiers (in particular, boosting). Conclusion: While the assumptions of logistic regression are well understood, those assumptions are frequently ignored. All four alternatives have advantages and disadvantages compared with logistic regression. Boosting (meta-classifiers) and to a lesser extent decision trees (particularly CART) appear to be most promising for use in the context of propensity score analysis, but extensive simulation studies are needed to establish their utility in practice. PMID:20630332
Fungible weights in logistic regression.
Jones, Jeff A; Waller, Niels G
2016-06-01
In this article we develop methods for assessing parameter sensitivity in logistic regression models. To set the stage for this work, we first review Waller's (2008) equations for computing fungible weights in linear regression. Next, we describe 2 methods for computing fungible weights in logistic regression. To demonstrate the utility of these methods, we compute fungible logistic regression weights using data from the Centers for Disease Control and Prevention's (2010) Youth Risk Behavior Surveillance Survey, and we illustrate how these alternate weights can be used to evaluate parameter sensitivity. To make our work accessible to the research community, we provide R code (R Core Team, 2015) that will generate both kinds of fungible logistic regression weights.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Kirchhoff, William H.
2012-09-15
The extended logistic function provides a physically reasonable description of interfaces such as depth profiles or line scans of surface topological or compositional features. It describes these interfaces with the minimum number of parameters, namely, position, width, and asymmetry. Logistic Function Profile Fit (LFPF) is a robust, least-squares fitting program in which the nonlinear extended logistic function is linearized by a Taylor series expansion (equivalent to a Newton-Raphson approach) with no apparent introduction of bias in the analysis. The program provides reliable confidence limits for the parameters when systematic errors are minimal and provides a display of the residuals from the fit for the detection of systematic errors. The program will aid researchers in applying ASTM E1636-10, 'Standard practice for analytically describing sputter-depth-profile and linescan-profile data by an extended logistic function,' and may also prove useful in applying ISO 18516: 2006, 'Surface chemical analysis-Auger electron spectroscopy and x-ray photoelectron spectroscopy-determination of lateral resolution.' Examples are given of LFPF fits to a secondary ion mass spectrometry depth profile, an Auger surface line scan, and synthetic data generated to exhibit known systematic errors for examining the significance of such errors to the extrapolation of partial profiles.
Finding the Perfect Match: Factors That Influence Family Medicine Residency Selection.
Wright, Katherine M; Ryan, Elizabeth R; Gatta, John L; Anderson, Lauren; Clements, Deborah S
2016-04-01
Residency program selection is a significant experience for emerging physicians, yet there is limited information about how applicants narrow their list of potential programs. This study examines factors that influence residency program selection among medical students interested in family medicine at the time of application. Medical students with an expressed interest in family medicine were invited to participate in a 37-item, online survey. Students were asked to rate factors that may impact residency selection on a 6-point Likert scale in addition to three open-ended qualitative questions. Mean values were calculated for each survey item and were used to determine a rank order for selection criteria. Logistic regression analysis was performed to identify factors that predict a strong interest in urban, suburban, and rural residency programs. Logistic regression was also used to identify factors that predict a strong interest in academic health center-based residencies, community-based residencies, and community-based residencies with an academic affiliation. A total of 705 medical students from 32 states across the country completed the survey. Location, work/life balance, and program structure (curriculum, schedule) were rated the most important factors for residency selection. Logistic regression analysis was used to refine our understanding of how each factor relates to specific types of residencies. These findings have implications for how to best advise students in selecting a residency, as well as marketing residencies to the right candidates. Refining the recruitment process will ensure a better fit between applicants and potential programs. Limited recruitment resources may be better utilized by focusing on targeted dissemination strategies.
Snyder, Marcia; Freeman, Mary C.; Purucker, S. Thomas; Pringle, Catherine M.
2016-01-01
Freshwater shrimps are an important biotic component of tropical ecosystems. However, they can have a low probability of detection when abundances are low. We sampled 3 of the most common freshwater shrimp species, Macrobrachium olfersii, Macrobrachium carcinus, and Macrobrachium heterochirus, and used occupancy modeling and logistic regression models to improve our limited knowledge of distribution of these cryptic species by investigating both local- and landscape-scale effects at La Selva Biological Station in Costa Rica. Local-scale factors included substrate type and stream size, and landscape-scale factors included presence or absence of regional groundwater inputs. Capture rates for 2 of the sampled species (M. olfersii and M. carcinus) were sufficient to compare the fit of occupancy models. Occupancy models did not converge for M. heterochirus, but M. heterochirus had high enough occupancy rates that logistic regression could be used to model the relationship between occupancy rates and predictors. The best-supported models for M. olfersii and M. carcinus included conductivity, discharge, and substrate parameters. Stream size was positively correlated with occupancy rates of all 3 species. High stream conductivity, which reflects the quantity of regional groundwater input into the stream, was positively correlated with M. olfersii occupancy rates. Boulder substrates increased occupancy rate of M. carcinus and decreased the detection probability of M. olfersii. Our models suggest that shrimp distribution is driven by factors that function at local (substrate and discharge) and landscape (conductivity) scales.
Westreich, Daniel; Lessler, Justin; Funk, Michele Jonsson
2010-08-01
Propensity scores for the analysis of observational data are typically estimated using logistic regression. Our objective in this review was to assess machine learning alternatives to logistic regression, which may accomplish the same goals but with fewer assumptions or greater accuracy. We identified alternative methods for propensity score estimation and/or classification from the public health, biostatistics, discrete mathematics, and computer science literature, and evaluated these algorithms for applicability to the problem of propensity score estimation, potential advantages over logistic regression, and ease of use. We identified four techniques as alternatives to logistic regression: neural networks, support vector machines, decision trees (classification and regression trees [CART]), and meta-classifiers (in particular, boosting). Although the assumptions of logistic regression are well understood, those assumptions are frequently ignored. All four alternatives have advantages and disadvantages compared with logistic regression. Boosting (meta-classifiers) and, to a lesser extent, decision trees (particularly CART), appear to be most promising for use in the context of propensity score analysis, but extensive simulation studies are needed to establish their utility in practice. Copyright (c) 2010 Elsevier Inc. All rights reserved.
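As a point of reference for the baseline that the review compares against, the following is a minimal sketch of propensity score estimation with logistic regression. It assumes a single covariate, a hand-written Newton-Raphson fit, and made-up data; the function name `fit_logistic` and all values are illustrative and do not come from the article. Real analyses would normally rely on established statistical software.

```python
import math


def fit_logistic(x, treated, iterations=25):
    """Fit P(treated = 1 | x) = 1 / (1 + exp(-(b0 + b1 * x)))
    by Newton-Raphson on the log-likelihood (single covariate)."""
    b0, b1 = 0.0, 0.0
    for _ in range(iterations):
        g0 = g1 = h00 = h01 = h11 = 0.0
        for xi, ti in zip(x, treated):
            p = 1.0 / (1.0 + math.exp(-(b0 + b1 * xi)))
            w = p * (1.0 - p)
            g0 += ti - p            # gradient, intercept term
            g1 += (ti - p) * xi     # gradient, slope term
            h00 += w                # entries of the information matrix X'WX
            h01 += w * xi
            h11 += w * xi * xi
        det = h00 * h11 - h01 * h01
        # Solve the 2x2 system (X'WX) * step = gradient and update.
        b0 += (h11 * g0 - h01 * g1) / det
        b1 += (h00 * g1 - h01 * g0) / det
    return b0, b1


# Hypothetical data: one covariate and a 0/1 treatment indicator.
covariate = [1, 2, 3, 4, 5, 6, 7, 8]
treated = [0, 0, 1, 0, 1, 0, 1, 1]

b0, b1 = fit_logistic(covariate, treated)
# The propensity score is the fitted probability of treatment.
propensity = [1.0 / (1.0 + math.exp(-(b0 + b1 * x))) for x in covariate]
print([round(p, 3) for p in propensity])
```

At the maximum-likelihood solution with an intercept, the fitted scores sum to the number of treated units, which is a convenient sanity check on the fit.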
GWAS with longitudinal phenotypes: performance of approximate procedures
Sikorska, Karolina; Montazeri, Nahid Mostafavi; Uitterlinden, André; Rivadeneira, Fernando; Eilers, Paul HC; Lesaffre, Emmanuel
2015-01-01
Analysis of genome-wide association studies with longitudinal data using standard procedures, such as linear mixed model (LMM) fitting, leads to discouragingly long computation times. There is a need to speed up the computations significantly. In our previous work (Sikorska et al: Fast linear mixed model computations for genome-wide association studies with longitudinal data. Stat Med 2012; 32.1: 165–180), we proposed the conditional two-step (CTS) approach as a fast method providing an approximation to the P-value for the longitudinal single-nucleotide polymorphism (SNP) effect. In the first step a reduced conditional LMM is fit, omitting all the SNP terms. In the second step, the estimated random slopes are regressed on SNPs. The CTS has been applied to the bone mineral density data from the Rotterdam Study and proved to work very well even in unbalanced situations. In another article (Sikorska et al: GWAS on your notebook: fast semi-parallel linear and logistic regression for genome-wide association studies. BMC Bioinformatics 2013; 14: 166), we suggested semi-parallel computations, greatly speeding up fitting many linear regressions. Combining CTS with fast linear regression reduces the computation time from several weeks to a few minutes on a single computer. Here, we explore further the properties of the CTS both analytically and by simulations. We investigate the performance of our proposal in comparison with a related but different approach, the two-step procedure. It is analytically shown that for the balanced case, under mild assumptions, the P-value provided by the CTS is the same as from the LMM. For unbalanced data and in realistic situations, simulations show that the CTS method does not inflate the type I error rate and implies only a minimal loss of power. PMID:25712081
Should metacognition be measured by logistic regression?
Rausch, Manuel; Zehetleitner, Michael
2017-03-01
Are logistic regression slopes suitable to quantify metacognitive sensitivity, i.e. the efficiency with which subjective reports differentiate between correct and incorrect task responses? We analytically show that logistic regression slopes are independent from rating criteria in one specific model of metacognition, which assumes (i) that rating decisions are based on sensory evidence generated independently of the sensory evidence used for primary task responses and (ii) that the distributions of evidence are logistic. Given a hierarchical model of metacognition, logistic regression slopes depend on rating criteria. According to all considered models, regression slopes depend on the primary task criterion. A reanalysis of previous data revealed that massive numbers of trials are required to distinguish between hierarchical and independent models with tolerable accuracy. It is argued that researchers who wish to use logistic regression as measure of metacognitive sensitivity need to control the primary task criterion and rating criteria. Copyright © 2017 Elsevier Inc. All rights reserved.
Koseki, Shige; Nonaka, Junko
2012-09-01
The objective of this study was to develop a probabilistic model to predict the end of lag time (λ) during the growth of Bacillus cereus vegetative cells as a function of temperature, pH, and salt concentration using logistic regression. The developed λ model was subsequently combined with a logistic differential equation to simulate bacterial numbers over time. To develop a novel model for λ, we determined whether bacterial growth had begun, i.e., whether λ had ended, at each time point during the growth kinetics. The growth of B. cereus was evaluated by optical density (OD) measurements in culture media for various pHs (5.5 ∼ 7.0) and salt concentrations (0.5 ∼ 2.0%) at static temperatures (10 ∼ 20°C). The probability of the end of λ was modeled using dichotomous judgments obtained at each OD measurement point concerning whether a significant increase had been observed. The probability of the end of λ was described as a function of time, temperature, pH, and salt concentration and showed a high goodness of fit. The λ model was validated with independent data sets of B. cereus growth in culture media and foods, indicating acceptable performance. Furthermore, the λ model, in combination with a logistic differential equation, enabled a simulation of the population of B. cereus in various foods over time at static and/or fluctuating temperatures with high accuracy. Thus, this newly developed modeling procedure enables the description of λ using observable environmental parameters without any conceptual assumptions and the simulation of bacterial numbers over time with the use of a logistic differential equation.
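The combination of a lag time with a logistic differential equation can be illustrated with a minimal sketch. It assumes the standard logistic form dN/dt = r N (1 - N/N_max) and a hard switch at the end of the lag, which simplifies the probabilistic λ model described in the abstract; the function names and all parameter values are hypothetical.

```python
import math


def simulate_growth(n0, n_max, rate, lag, t_end, dt=0.01):
    """Euler integration of dN/dt = rate * N * (1 - N / n_max),
    with growth switched off until the lag time has elapsed."""
    steps = int(round(t_end / dt))
    n = n0
    for i in range(steps):
        t = i * dt
        if t >= lag:
            n += dt * rate * n * (1.0 - n / n_max)
    return n


def logistic_closed_form(n0, n_max, rate, t):
    """Analytic solution of the logistic equation after growth time t."""
    return n_max / (1.0 + (n_max / n0 - 1.0) * math.exp(-rate * t))


# Hypothetical values: 1e3 cells initially, 1e9 maximum, 10 h lag.
no_growth = simulate_growth(1e3, 1e9, rate=0.5, lag=10.0, t_end=10.0)
simulated = simulate_growth(1e3, 1e9, rate=0.5, lag=0.0, t_end=40.0)
exact = logistic_closed_form(1e3, 1e9, rate=0.5, t=40.0)
print(no_growth, simulated, exact)
```

Comparing the numerical result with the closed-form solution is only possible at constant conditions; under fluctuating temperature the rate would change over time and the equation would have to be integrated numerically.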
Pevnick, Joshua M.; Fuller, Garth; Duncan, Ray; Spiegel, Brennan M. R.
2016-01-01
Background Personal fitness trackers (PFT) have substantial potential to improve healthcare. Objective To quantify and characterize early adopters who shared their PFT data with providers. Methods We used bivariate statistics and logistic regression to compare patients who shared any PFT data vs. patients who did not. Results A patient portal was used to invite 79,953 registered portal users to share their data. Of 66,105 users included in our analysis, 499 (0.8%) uploaded data during an initial 37-day study period. Bivariate and regression analysis showed that early adopters were more likely than non-adopters to be younger, male, white, health system employees, and to have higher BMIs. Neither comorbidities nor utilization predicted adoption. Conclusion Our results demonstrate that patients had little intrinsic desire to share PFT data with their providers, and suggest that patients most at risk for poor health outcomes are least likely to share PFT data. Marketing, incentives, and/or cultural change may be needed to induce such data-sharing. PMID:27846287
Endoscopic third ventriculostomy in the treatment of childhood hydrocephalus.
Kulkarni, Abhaya V; Drake, James M; Mallucci, Conor L; Sgouros, Spyros; Roth, Jonathan; Constantini, Shlomi
2009-08-01
To develop a model to predict the probability of endoscopic third ventriculostomy (ETV) success in the treatment for hydrocephalus on the basis of a child's individual characteristics. We analyzed 618 ETVs performed consecutively on children at 12 international institutions to identify predictors of ETV success at 6 months. A multivariable logistic regression model was developed on 70% of the dataset (training set) and validated on 30% of the dataset (validation set). In the training set, 305/455 ETVs (67.0%) were successful. The regression model (containing patient age, cause of hydrocephalus, and previous cerebrospinal fluid shunt) demonstrated good fit (Hosmer-Lemeshow, P = .78) and discrimination (C statistic = 0.70). In the validation set, 105/163 ETVs (64.4%) were successful and the model maintained good fit (Hosmer-Lemeshow, P = .45), discrimination (C statistic = 0.68), and calibration (calibration slope = 0.88). A simplified ETV Success Score was devised that closely approximates the predicted probability of ETV success. Children most likely to succeed with ETV can now be accurately identified and spared the long-term complications of CSF shunting.
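The C statistic reported for discrimination can be computed directly from predicted probabilities and observed outcomes. The sketch below is a generic pairwise implementation, not the authors' code; the outcomes and predictions are invented for illustration.

```python
from itertools import product


def c_statistic(y_true, y_prob):
    """Concordance (C) statistic: the fraction of (event, non-event) pairs
    in which the event case received the higher predicted probability.
    Ties in predicted probability count as half-concordant."""
    events = [p for y, p in zip(y_true, y_prob) if y == 1]
    non_events = [p for y, p in zip(y_true, y_prob) if y == 0]
    if not events or not non_events:
        raise ValueError("need at least one event and one non-event")
    score = 0.0
    for pe, pn in product(events, non_events):
        if pe > pn:
            score += 1.0
        elif pe == pn:
            score += 0.5
    return score / (len(events) * len(non_events))


# Hypothetical outcomes (1 = success) and model-predicted probabilities.
outcomes = [1, 1, 1, 0, 0]
predicted = [0.9, 0.7, 0.4, 0.5, 0.2]
print(c_statistic(outcomes, predicted))
```

A value of 0.5 corresponds to chance-level discrimination and 1.0 to perfect separation, so the reported values of 0.70 and 0.68 sit between the two.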
Logistic models--an odd(s) kind of regression.
Jupiter, Daniel C
2013-01-01
The logistic regression model bears some similarity to the multivariable linear regression with which we are familiar. However, the differences are great enough to warrant a discussion of the need for and interpretation of logistic regression. Copyright © 2013 American College of Foot and Ankle Surgeons. Published by Elsevier Inc. All rights reserved.
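The interpretive difference from linear regression comes down to odds: a logistic regression coefficient is a change in log-odds, so its exponential is an odds ratio. A minimal sketch follows, with hypothetical coefficients `b0` and `b1` that are not taken from the article.

```python
import math


def logistic(z):
    """Map a log-odds value z to a probability."""
    return 1.0 / (1.0 + math.exp(-z))


def probability_to_odds(p):
    """Convert a probability to odds, p / (1 - p)."""
    return p / (1.0 - p)


# Hypothetical fitted model: log-odds = b0 + b1 * x
b0, b1 = -1.0, 0.7

odds_at_0 = probability_to_odds(logistic(b0))        # odds when x = 0
odds_at_1 = probability_to_odds(logistic(b0 + b1))   # odds when x = 1
odds_ratio = odds_at_1 / odds_at_0                   # equals exp(b1)
print(odds_ratio, math.exp(b1))
```

The odds ratio for a one-unit increase in x is the same at every starting value of x, whereas the corresponding change in probability is not.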
Lindström, Paula J; Suni, Jaana H; Nygård, Clas-Håkan
2009-07-01
The importance of neuromuscular-type exercise (NME) has been recognized in recent recommendations for public health. However, the knowledge on associations and dose response of different types of leisure-time physical activity (LTPA) with musculoskeletal fitness and health is incomplete. This study evaluated the validity of the NME recommendation for public health introduced by the Physical Activity Pie. Engagement in LTPA and health-related fitness were assessed in 2 consecutive studies with the same adult population age 30 to 69 years (n = 575). Cross-sectional associations between different LTPA types and motor and musculoskeletal fitness were examined by logistic-regression models. Engagement in NME was associated with good static and dynamic balance and lower extremity strength. The highest odds ratios (OR) were found between brisk NME and static balance (most vs least fit OR = 2.39, moderate vs least fit OR = 1.94) and brisk NME and leg strength (more vs least fit OR = 2.10). Some associations were also found between brisk aerobic exercise and good balance. This cross-sectional study suggests that the recommendation for NME in the Physical Activity Pie is valid in terms of balance and leg strength, the 2 major fitness factors related to mobility functioning, especially among aging adults.
Schubert, Alexandre; Januário, Renata Selvatici B; Casonatto, Juliano; Sonoo, Christi Noriko
2013-01-01
To verify the association between nutritional status, physical fitness, and body image in children and adolescents. This cross-sectional study included 401 students (236 boys and 165 girls) aged between 8 and 16 years who were regularly enrolled in sports at local clubs. Nutritional status was evaluated by body mass index. Students were assessed for satisfaction with body image, abdominal strength resistance, and cardiorespiratory fitness. The variables were assessed on the same day following a standardized order. The chi-square test was used to verify relationships between variables. Binary logistic regression was then applied to identify the magnitude of the associations, considering p<0.05 as significant. Associations were found between body image and body mass index (p=0.001), abdominal strength resistance (p=0.005), and cardiorespiratory fitness (p=0.001). The odds ratios for body image dissatisfaction were 2.14 and 2.42 for those who did not achieve the expected health-criterion values for abdominal strength resistance and cardiorespiratory fitness, respectively, and 2.87 for those with overweight or obesity. Dissatisfaction with body image is associated with body mass index and also with the physical fitness variables abdominal strength resistance and cardiorespiratory fitness.
[Factors associated with low levels of aerobic fitness among adolescents].
Gonçalves, Eliane Cristina de Andrade; Silva, Diego Augusto Santos
2016-06-01
To evaluate the prevalence of low aerobic fitness levels and to analyze the association with sociodemographic factors, lifestyle and excess body fatness among adolescents of southern Brazil. The study included 879 adolescents aged 14 to 19 years from the city of São José/SC, Brazil. Aerobic fitness was assessed by the modified Canadian test of aerobic fitness. Sociodemographic variables (skin color, age, sex, study shift, economic level), sexual maturation and lifestyle (eating habits, screen time, physical activity, consumption of alcohol and tobacco) were assessed by a self-administered questionnaire. Excess body fatness was evaluated by the sum of triceps and subscapular skinfolds. We used logistic regression to estimate odds ratios and 95% confidence intervals. Prevalence of low aerobic fitness level was 87.5%. Girls who spent two hours or more in front of a screen, consumed less than one glass of milk per day, did not smoke and had excess body fatness had a higher chance of having lower levels of aerobic fitness. White boys with low physical activity had a higher chance of having lower levels of aerobic fitness. Eight out of ten adolescents had low aerobic fitness levels. Modifiable lifestyle factors were associated with low levels of aerobic fitness. Interventions that emphasize behavior change are needed. Copyright © 2015 Sociedade de Pediatria de São Paulo. Published by Elsevier Editora Ltda. All rights reserved.
Assessing LULC changes over Chilika Lake watershed in Eastern India using Driving Force Analysis
NASA Astrophysics Data System (ADS)
Jadav, S.; Syed, T. H.
2017-12-01
Rapid population growth and industrial development have brought about significant changes in Land Use Land Cover (LULC) in many developing countries. This study investigates LULC changes in the Chilika Lake watershed of Eastern India for the period of 1988 to 2016. The methodology involves pre-processing and classification of Landsat satellite images using a support vector machine (SVM) supervised classification algorithm. Results reveal that 'Cropland', 'Emergent Vegetation' and 'Settlement' have expanded over the study period by 284.61 km², 106.83 km² and 98.83 km², respectively. Contemporaneously, 'Lake Area', 'Vegetation' and 'Scrub Land' have decreased by 121.62 km², 96.05 km² and 80.29 km², respectively. This study also analyzes five major driving force variables, covering the socio-economic and climatological factors that trigger LULC changes, through a bivariate logistic regression model. The outcome gives a credible relative operating characteristic (ROC) value of 0.76, which indicates the goodness of fit of the logistic regression model. In addition, independent variables such as distance to drainage network and average annual rainfall have negative regression coefficients, which represent a decreased rate of the dependent variable (changed LULC), whereas the independent variables population density, distance to road and distance to railway have positive regression coefficients, indicating an increased rate of changed LULC. Results from this study will be crucial for planning and restoration of this vital lake water body that has major implications for society and the environment at large.
Parameters Estimation of Geographically Weighted Ordinal Logistic Regression (GWOLR) Model
NASA Astrophysics Data System (ADS)
Zuhdi, Shaifudin; Retno Sari Saputro, Dewi; Widyaningsih, Purnami
2017-06-01
A regression model represents the relationship between independent variables and a dependent variable. When the dependent variable is categorical, a logistic regression model is used to calculate odds. When the categories of the dependent variable are ordered levels, the logistic regression model is ordinal. The GWOLR model is an ordinal logistic regression model that is influenced by the geographical location of the observation site. Parameter estimation in the model is needed to determine population values based on a sample. The purpose of this research is to estimate the parameters of the GWOLR model using R software. Parameter estimation uses data on the number of dengue fever patients in Semarang City. The observation units are 144 villages in Semarang City. The research yields a local GWOLR model for each village and the probability of each category of the number of dengue fever patients.
Zhang, Chuanwu; Garrard, Lili; Keighley, John; Carlson, Susan; Gajewski, Byron
2017-01-10
Despite the widely recognized association between the severity of early preterm birth (ePTB) and its related severe diseases, little is known about the potential risk factors of ePTB and the sub-population with high risk of ePTB. Moreover, motivated by a future confirmatory clinical trial to identify whether supplementing pregnant women with docosahexaenoic acid (DHA) has a different effect on the risk subgroup population or not in terms of ePTB prevalence, this study aims to identify potential risk subgroups and risk factors for ePTB, defined as babies born less than 34 weeks of gestation. The analysis data (N = 3,994,872) were obtained from CDC and NCHS' 2014 Natality public data file. The sample was split into independent training and validation cohorts for model generation and model assessment, respectively. Logistic regression and CART models were used to examine potential ePTB risk predictors and their interactions, including mothers' age, nativity, race, Hispanic origin, marital status, education, pre-pregnancy smoking status, pre-pregnancy BMI, pre-pregnancy diabetes status, pre-pregnancy hypertension status, previous preterm birth status, infertility treatment usage status, fertility enhancing drug usage status, and delivery payment source. Both logistic regression models with either 14 or 10 ePTB risk factors produced the same C-index (0.646) based on the training cohort. The C-index of the logistic regression model based on 10 predictors was 0.645 for the validation cohort. Both C-indexes indicated a good discrimination and acceptable model fit. The CART model identified preterm birth history and race as the most important risk factors, and revealed that the subgroup with a preterm birth history and a race designation as Black had the highest risk for ePTB. The c-index and misclassification rate were 0.579 and 0.034 for the training cohort, and 0.578 and 0.034 for the validation cohort, respectively. 
This study revealed 14 maternal characteristic variables that reliably identified risk for ePTB through either logistic regression model and/or a CART model. Moreover, both models efficiently identify risk subgroups for further enrichment clinical trial design.
Zhang, Xinyan; Li, Bingzong; Han, Huiying; Song, Sha; Xu, Hongxia; Hong, Yating; Yi, Nengjun; Zhuang, Wenzhuo
2018-05-10
Multiple myeloma (MM), like other cancers, is caused by the accumulation of genetic abnormalities. Heterogeneity exists in the patients' response to treatments, for example, bortezomib. This urges efforts to identify biomarkers from numerous molecular features and build predictive models for identifying patients that can benefit from a certain treatment scheme. However, previous studies treated the multi-level ordinal drug response as a binary response where only responsive and non-responsive groups are considered. It is desirable to directly analyze the multi-level drug response, rather than combining the response to two groups. In this study, we present a novel method to identify significantly associated biomarkers and then develop ordinal genomic classifier using the hierarchical ordinal logistic model. The proposed hierarchical ordinal logistic model employs the heavy-tailed Cauchy prior on the coefficients and is fitted by an efficient quasi-Newton algorithm. We apply our hierarchical ordinal regression approach to analyze two publicly available datasets for MM with five-level drug response and numerous gene expression measures. Our results show that our method is able to identify genes associated with the multi-level drug response and to generate powerful predictive models for predicting the multi-level response. The proposed method allows us to jointly fit numerous correlated predictors and thus build efficient models for predicting the multi-level drug response. The predictive model for the multi-level drug response can be more informative than the previous approaches. Thus, the proposed approach provides a powerful tool for predicting multi-level drug response and has important impact on cancer studies.
A predictive risk model for medical intractability in epilepsy.
Huang, Lisu; Li, Shi; He, Dake; Bao, Weiqun; Li, Ling
2014-08-01
This study aimed to investigate early predictors (6 months after diagnosis) of medical intractability in epilepsy. All children <12 years of age having two or more unprovoked seizures 24 h apart at Xinhua Hospital between 1992 and 2006 were included. Medical intractability was defined as failure, due to lack of seizure control, of more than 2 antiepileptic drugs at maximum tolerated doses, with an average of more than 1 seizure per month for 24 months and no more than 3 consecutive months of seizure freedom during this interval. Univariate and multivariate logistic regression models were performed to determine the risk factors for developing medical intractability. Receiver operating characteristic curve was applied to fit the best compounded predictive model. A total of 649 patients were identified, out of which 119 (18%) met the study definition of intractable epilepsy at 2 years after diagnosis, and the rate of intractable epilepsy in patients with idiopathic syndromes was 12%. Multivariate logistic regression analysis revealed that neurodevelopmental delay, symptomatic etiology, partial seizures, and more than 10 seizures before diagnosis were significant and independent risk factors for intractable epilepsy. The best model to predict medical intractability in epilepsy comprised neurological physical abnormality, age at onset of epilepsy under 1 year, more than 10 seizures before diagnosis, and partial epilepsy, and the area under receiver operating characteristic curve was 0.7797. This model also fitted best in patients with idiopathic syndromes. A predictive model of medically intractable epilepsy composed of only four characteristics is established. This model is comparatively accurate and simple to apply clinically. Copyright © 2014 Elsevier Inc. All rights reserved.
Ibañez-Sanz, Gemma; Garcia, Montse; Milà, Núria; Rodríguez-Moranta, Francisco; Binefa, Gemma; Gómez-Matas, Javier; Benito, Llúcia; Padrol, Isabel; Barenys, Mercè; Moreno, Victor
2017-09-01
The aim of this study was to analyse false-negative (FN) results of the faecal immunochemical test (FIT) and its determinants in a colorectal cancer screening programme in Catalonia. We carried out a cross-sectional study among 218 screenees with a negative FIT result who agreed to undergo a colonoscopy. A false-negative result was defined as the detection, at colonoscopy, of intermediate/high-risk polyps or colorectal cancer in a patient with a previous negative FIT (<20 µgHb/g). Multivariate logistic regression models were constructed to identify sociodemographic (sex, age) and screening variables (quantitative faecal haemoglobin, colonoscopy findings) related to FN results. Adjusted odds ratios and their 95% confidence intervals were estimated. There were 15.6% FN FIT results. Faecal haemoglobin was undetected in 45.5% of these results and was below 4 µgHb/g in 94.0% of the individuals with a FN result. About 60% of the lesions were located in the proximal colon, whereas the expected percentage was 30%. Decreasing the positivity threshold of FIT does not increase the detection rate of advanced neoplasia, but may increase the costs and potential adverse effects.
Snejdrlova, Michaela; Kalvach, Zdenek; Topinkova, Eva; Vrablik, Michal; Prochazkova, Renata; Kvasilova, Marie; Lanska, Vera; Zlatohlavek, Lukas; Prusikova, Martina; Ceska, Richard
2011-01-01
Life expectancy is determined by a combination of genetic predisposition (~25%) and environmental influences (~75%). Nevertheless, a stronger genetic influence is anticipated in long-living individuals. The apolipoprotein E (APOE) gene is among the most studied candidate genes of longevity. We evaluated the relation of APOE polymorphism and fitness status in the elderly. We examined a total of 128 subjects over 80 years of age. Their fitness status was assessed using a battery of functional tests, and the subjects were stratified into 5 functional categories according to Spirduso's classification. Biochemical analysis was performed by enzymatic methods using automated analyzers. APOE gene polymorphism was analysed using PCR-RFLP. APOE4 allele carriers had significantly worse fitness status compared to non-carriers (p=0.025). Multiple logistic regression analysis showed that APOE4 carriers had a higher risk (p=0.05) of functional unfitness compared to APOE2/E3 individuals. APOE gene polymorphism seems to be an important genetic contributor to frailty development in the elderly. While APOE2 carriers tend to remain functionally fit until a higher age, the functional status of APOE4 carriers deteriorates more rapidly. © 2011 Neuroendocrinology Letters
The purpose of this report is to provide a reference manual that could be used by investigators for making informed use of logistic regression using two methods (standard logistic regression and MARS). The details for analyses of relationships between a dependent binary response ...
Predicting U.S. Army Reserve Unit Manning Using Market Demographics
2015-06-01
develops linear regression, classification tree, and logistic regression models to determine the ability of the location to support manning requirements... logistic regression model delivers predictive results that allow decision-makers to identify locations with a high probability of meeting unit...manning requirements. The recommendation of this thesis is that the USAR implement the logistic regression model.
Logistic Regression: Concept and Application
ERIC Educational Resources Information Center
Cokluk, Omay
2010-01-01
The main focus of logistic regression analysis is classification of individuals in different groups. The aim of the present study is to explain basic concepts and processes of binary logistic regression analysis intended to determine the combination of independent variables which best explain the membership in certain groups called dichotomous…
NASA Astrophysics Data System (ADS)
Pradhan, Biswajeet
2010-05-01
This paper presents the results of the cross-validation of a multivariate logistic regression model using remote sensing data and GIS for landslide hazard analysis on the Penang, Cameron, and Selangor areas in Malaysia. Landslide locations in the study areas were identified by interpreting aerial photographs and satellite images, supported by field surveys. SPOT 5 and Landsat TM satellite imagery were used to map landcover and vegetation index, respectively. Maps of topography, soil type, lineaments and land cover were constructed from the spatial datasets. Ten factors which influence landslide occurrence, i.e., slope, aspect, curvature, distance from drainage, lithology, distance from lineaments, soil type, landcover, rainfall precipitation, and normalized difference vegetation index (NDVI), were extracted from the spatial database and the logistic regression coefficient of each factor was computed. Then the landslide hazard was analysed using the multivariate logistic regression coefficients derived not only from the data for the respective area but also using the logistic regression coefficients calculated from each of the other two areas (nine hazard maps in all) as a cross-validation of the model. For verification of the model, the results of the analyses were then compared with the field-verified landslide locations. Among the three cases of the application of logistic regression coefficients in the same study area, the case of Selangor based on the Selangor logistic regression coefficients showed the highest accuracy (94%), whereas Penang based on the Penang coefficients showed the lowest accuracy (86%). Similarly, among the six cases from the cross application of logistic regression coefficients in the other two areas, the case of Selangor based on the logistic coefficients of Cameron showed the highest (90%) prediction accuracy, whereas the case of Penang based on the Selangor logistic regression coefficients showed the lowest accuracy (79%). Qualitatively, the cross application model yields reasonable results which can be used for preliminary landslide hazard mapping.
Logistic regression applied to natural hazards: rare event logistic regression with replications
NASA Astrophysics Data System (ADS)
Guns, M.; Vanacker, V.
2012-06-01
Statistical analysis of natural hazards needs particular attention, as most of these phenomena are rare events. This study shows that the ordinary rare event logistic regression, as it is now commonly used in geomorphologic studies, does not always lead to a robust detection of controlling factors, as the results can be strongly sample-dependent. In this paper, we introduce some concepts of Monte Carlo simulations in rare event logistic regression. This technique, so-called rare event logistic regression with replications, combines the strength of probabilistic and statistical methods, and allows overcoming some of the limitations of previous developments through robust variable selection. This technique was here developed for the analyses of landslide controlling factors, but the concept is widely applicable for statistical analyses of natural hazards.
Large unbalanced credit scoring using Lasso-logistic regression ensemble.
Wang, Hong; Xu, Qingsong; Zhou, Lifeng
2015-01-01
Recently, various ensemble learning methods with different base classifiers have been proposed for credit scoring problems. However, for various reasons, there has been little research using logistic regression as the base classifier. In this paper, given large unbalanced data, we consider the plausibility of ensemble learning using regularized logistic regression as the base classifier to deal with credit scoring problems. In this research, the data is first balanced and diversified by clustering and bagging algorithms. Then we apply a Lasso-logistic regression learning ensemble to evaluate the credit risks. We show that the proposed algorithm outperforms popular credit scoring models such as decision tree, Lasso-logistic regression and random forests in terms of AUC and F-measure. We also provide two importance measures for the proposed model to identify important variables in the data.
Lyles, Robert H.; Mitchell, Emily M.; Weinberg, Clarice R.; Umbach, David M.; Schisterman, Enrique F.
2016-01-01
Summary Potential reductions in laboratory assay costs afforded by pooling equal aliquots of biospecimens have long been recognized in disease surveillance and epidemiological research and, more recently, have motivated design and analytic developments in regression settings. For example, Weinberg and Umbach (1999, Biometrics 55, 718–726) provided methods for fitting set-based logistic regression models to case-control data when a continuous exposure variable (e.g., a biomarker) is assayed on pooled specimens. We focus on improving estimation efficiency by utilizing available subject-specific information at the pool allocation stage. We find that a strategy that we call “(y,c)-pooling,” which forms pooling sets of individuals within strata defined jointly by the outcome and other covariates, provides more precise estimation of the risk parameters associated with those covariates than does pooling within strata defined only by the outcome. We review the approach to set-based analysis through offsets developed by Weinberg and Umbach in a recent correction to their original paper. We propose a method for variance estimation under this design and use simulations and a real-data example to illustrate the precision benefits of (y,c)-pooling relative to y-pooling. We also note and illustrate that set-based models permit estimation of covariate interactions with exposure. PMID:26964741
Antin, Jonathan F.; Stanley, Laura M.; Guo, Feng
2011-01-01
The purpose of this research effort was to compare older driver and non-driver functional impairment profiles across some 60 assessment metrics in an initial effort to contribute to the development of fitness-to-drive assessment models. Of the metrics evaluated, 21 showed statistically significant differences, almost all favoring the drivers. Also, it was shown that a logistic regression model comprised of five of the assessment scores could completely and accurately separate the two groups. The results of this study imply that older drivers are far less functionally impaired than non-drivers of similar ages, and that a parsimonious model can accurately assign individuals to either group. With such models, any driver classified or diagnosed as a non-driver would be a strong candidate for further investigation and intervention. PMID:22058607
Organized Sports, Overweight, and Physical Fitness in Primary School Children in Germany
Steiner, Ronald P.; Brandstetter, Susanne; Klenk, Jochen; Wabitsch, Martin; Steinacker, Jürgen M.
2013-01-01
Physical inactivity is associated with poor physical fitness and increased body weight. This study examined the relationship between participation in organized sports and overweight as well as physical fitness in primary school children in southern Germany. Height, weight, and various components of physical fitness were measured in 995 children (7.6 ± 0.4 years). Sports participation and confounding variables such as migration background, parental education, parental body weight, and parental sports participation were assessed via parent questionnaire. Multiple logistic regression as well as multivariate analysis of covariance (MANCOVA) was used to determine associations between physical fitness, participation in organized sports, and body weight. Participation in organized sports less than once a week was prevalent in 29.2%, once or twice in 60.2%, and more often in 10.6% of the children. Overweight was found in 12.4% of the children. Children participating in organized sports more than once per week displayed higher physical fitness and were less likely to be overweight (OR = 0.52, P < 0.01). Even though causality cannot be established, the facilitation of participation in organized sports may be a crucial aspect in public health efforts addressing the growing problems associated with overweight and obesity. PMID:23533728
Hootman, J M; Macera, C A; Ainsworth, B E; Martin, M; Addy, C L; Blair, S N
2001-08-01
To help public health practitioners promote physical activities with a low risk of injury, this study determined the relation among type and duration of physical activity, cardiorespiratory fitness, and musculoskeletal injury in a sample of adults enrolled in the Aerobics Center Longitudinal Study. Subjects included 4,034 men and 967 women who underwent a baseline physical examination between 1970 and 1985 and who returned a mailed follow-up survey in 1986. At baseline, a treadmill graded exercise test was used to measure cardiorespiratory fitness. At follow-up, subjects reported injuries and type and duration of physical activity in the preceding 12 months. Polytomous logistic regression was used to estimate the association among physical activity type and duration, cardiorespiratory fitness, and injury. The risk of sustaining an activity-related injury increased with higher duration of physical activity per week and cardiorespiratory fitness levels. Results suggest that cardiorespiratory fitness may be a surrogate for unmeasured components of physical activity, such as exercise intensity. Among walkers, increasing duration of activity per week was not associated with an increased risk of injury. Results suggest that, for most adults, walking is a safe form of physical activity associated with a lower risk of injury than running or sport participation.
Pandey, Anjali; Singh, K K
2015-12-29
There is ample research literature investigating the various facets of contraceptive use behaviors in India, but the use of contraception by married Indian women prior to their first pregnancy has been neglected so far. This study attempts to identify the socio-demographic determinants and differentials of contraceptive use or non-use by a woman in India before she proceeds to have her first child. The analysis was done using data from the third National Family Health Survey (2005-2006), India. This study utilized information from 54,918 ever-married women whose age at the time of the NFHS-3 survey was 15-34 years. To identify the crucial socio-demographic determinants governing this pioneering behavior, logistic regression was used. The Hosmer-Lemeshow test and ROC curve analysis were also performed to check the fit of the logistic regression model to the data under consideration. Of all the explanatory variables considered, religion, caste, education, current age, age at marriage, media exposure and zonal classification were found to significantly affect the study behavior. Place of residence (i.e., urban or rural locality) was not significant in the multivariable logistic regression. In light of sufficient evidence confirming the presence of early marriage and childbearing practices in India, joint efforts are required to address the socio-demographic differentials in contraceptive use by young married women prior to their first pregnancy. Encouraging women to opt for higher education, ensuring marriages only after the legal minimum age at marriage and promoting family planning programs via print and electronic media may address the existing socio-economic barriers. Also, family planning programs should be oriented to take care of the geographical variations in the study behavior.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Robertson, John M., E-mail: jrobertson@beaumont.ed; Soehn, Matthias; Yan Di
Purpose: Understanding the dose-volume relationship of small bowel irradiation and severe acute diarrhea may help reduce the incidence of this side effect during adjuvant treatment for rectal cancer. Methods and Materials: Consecutive patients treated curatively for rectal cancer were reviewed, and the maximum grade of acute diarrhea was determined. The small bowel was outlined on the treatment planning CT scan, and a dose-volume histogram was calculated for the initial pelvic treatment (45 Gy). Logistic regression models were fitted for varying cutoff-dose levels from 5 to 45 Gy in 5-Gy increments. The model with the highest LogLikelihood was used to develop a cutoff-dose normal tissue complication probability (NTCP) model. Results: There were a total of 152 patients (48% preoperative, 47% postoperative, 5% other), predominantly treated prone (95%) with a three-field technique (94%) and a protracted venous infusion of 5-fluorouracil (78%). Acute Grade 3 diarrhea occurred in 21%. The largest LogLikelihood was found for the cutoff-dose logistic regression model with 15 Gy as the cutoff-dose, although the models for 20 Gy and 25 Gy had similar significance. According to this model, highly significant correlations (p < 0.001) between small bowel volumes receiving at least 15 Gy and toxicity exist in the considered patient population. Similar findings applied to both the preoperatively (p = 0.001) and postoperatively irradiated groups (p = 0.001). Conclusion: The incidence of Grade 3 diarrhea was significantly correlated with the volume of small bowel receiving at least 15 Gy using a cutoff-dose NTCP model.
Oppong Asante, Kwaku; Meyer-Weitz, Anna
2017-05-01
This study aimed to determine the prevalence and risk factors associated with suicidal ideations and attempts among a sample of homeless street children and adolescents found in Accra, Ghana. A cross-sectional survey of a convenience sample of 227 (122 male and 105 female) homeless youth was conducted in Ghana. An interviewer-administered questionnaire was used to collect data due to a low level of literacy among the study population. Bivariate and multivariate logistic regressions were fitted to analyse the data. The results indicated that 26.4% and 26.0% of the participants had attempted suicide and reported suicidal ideations respectively. The multivariate logistic regression showed that smoking, past and present use of alcohol, use of marijuana, and engagement in prostitution, were associated with suicidal ideations and suicide attempts. Suicidal ideations were associated with having been physically beaten, robbed, and assaulted with a weapon; while a suicide attempt was predicted by having been robbed and physically beaten. This study increased our understanding of the determinants of suicidal ideations and attempts among homeless youth. These findings suggest urgency to up-skill mental health workers to assess for risk factors and offer pathways to care for this vulnerable group.
Use of multilevel logistic regression to identify the causes of differential item functioning.
Balluerka, Nekane; Gorostiaga, Arantxa; Gómez-Benito, Juana; Hidalgo, María Dolores
2010-11-01
Given that a key function of tests is to serve as evaluation instruments and for decision making in the fields of psychology and education, the possibility that some of their items may show differential behaviour is a major concern for psychometricians. In recent decades, important progress has been made as regards the efficacy of techniques designed to detect this differential item functioning (DIF). However, the findings are scant when it comes to explaining its causes. The present study addresses this problem from the perspective of multilevel analysis. Starting from a case study in the area of transcultural comparisons, multilevel logistic regression is used: 1) to identify the item characteristics associated with the presence of DIF; 2) to estimate the proportion of variation in the DIF coefficients that is explained by these characteristics; and 3) to evaluate alternative explanations of the DIF by comparing the explanatory power or fit of different sequential models. The comparison of these models confirmed one of the two alternatives (familiarity with the stimulus) and rejected the other (the topic area) as being a cause of differential functioning with respect to the compared groups.
Jung, Yoon Suk; Park, Chan Hyuk; Kim, Nam Hee; Park, Jung Ho; Park, Dong Il; Sohn, Chong Il
2018-01-01
The fecal immunochemical test (FIT) has low sensitivity for detecting advanced colorectal neoplasia (ACRN); thus, a considerable portion of FIT-negative persons may have ACRN. We aimed to develop a risk-scoring model for predicting ACRN in FIT-negative persons. We reviewed the records of participants aged ≥40 years who underwent a colonoscopy and FIT during a health check-up. We developed a risk-scoring model for predicting ACRN in FIT-negative persons. Of 11,873 FIT-negative participants, 255 (2.1%) had ACRN. On the basis of the multivariable logistic regression model, point scores were assigned as follows among FIT-negative persons: age (per year from 40 years old), 1 point; current smoker, 10 points; overweight, 5 points; obese, 7 points; hypertension, 6 points; old cerebrovascular attack (CVA), 15 points. Although the proportion of ACRN in FIT-negative persons increased as risk scores increased (from 0.6% in the group with 0-4 points to 8.1% in the group with 35-39 points), it was significantly lower than that in FIT-positive persons (14.9%). However, there was no statistical difference between the proportion of ACRN in FIT-negative persons with ≥40 points and in FIT-positive persons (10.5% vs. 14.9%, P = 0.321). FIT-negative persons may need to undergo screening colonoscopy if they clinically have a high risk of ACRN. The scoring model based on age, smoking habits, overweight or obesity, hypertension, and old CVA may be useful in selecting and prioritizing FIT-negative persons for screening colonoscopy.
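The point assignments quoted in the abstract above can be read as a simple additive score. The following is a minimal sketch in Python, not the authors' implementation. It assumes that age contributes one point per full year above 40 and that overweight and obese are mutually exclusive categories supplied by the caller (the abstract does not state the BMI cutoffs). The function name and argument names are hypothetical.

```python
# Point values are those quoted in the abstract; names and structure are illustrative.
BMI_POINTS = {"normal": 0, "overweight": 5, "obese": 7}


def acrn_risk_score(age, current_smoker, bmi_category, hypertension, old_cva):
    """Additive risk score for a FIT-negative person aged 40 or older."""
    if age < 40:
        raise ValueError("the score is described only for ages >= 40")
    score = int(age) - 40                 # assumption: 1 point per year from age 40
    score += 10 if current_smoker else 0  # current smoker
    score += BMI_POINTS[bmi_category]     # overweight: 5, obese: 7
    score += 6 if hypertension else 0     # hypertension
    score += 15 if old_cva else 0         # old cerebrovascular attack
    return score


# A 55-year-old obese current smoker with hypertension and no old CVA.
print(acrn_risk_score(55, True, "obese", True, False))  # 38
```

Under these assumptions the example person scores 38 points, which falls in the 35-39 band for which the abstract reports an ACRN proportion of 8.1%.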
High injury rates among female army trainees: a function of gender?
Bell, N S; Mangione, T W; Hemenway, D; Amoroso, P J; Jones, B H
2000-04-01
Studies suggest that women are at greater risk than men for sports and training injuries. This study investigated the association between gender and risk of exercise-related injuries among Army basic trainees while controlling for physical fitness and demographics. Eight hundred and sixty-one trainees were followed during their 8-week basic training course. Demographic characteristics, body composition, and physical fitness were measured at the beginning of training. Physical fitness measures were taken again at the end of training. Multivariate logistic regression analysis was used to evaluate the association between gender and risk of injury while controlling for potential confounders. Women experienced twice as many injuries as men (relative risk [RR] = 2.1, 1.78-2.5) and experienced serious time-loss injuries almost 2.5 times more often than men (RR = 2.4, 1.92-3.05). Women entered training at significantly lower levels of physical fitness than men, but made much greater improvements in fitness over the training period. In multivariate analyses, where demographics, body composition, and initial physical fitness were controlled, female gender was no longer a significant predictor of injuries (RR = 1.14, 0.48-2.72). Physical fitness, particularly aerobic fitness, remained significant. The key risk factor for training injuries appears to be physical fitness, particularly cardiovascular fitness. The significant improvement in endurance attained by women suggests that women enter training less physically fit relative to their own fitness potential, as well as to men. Remedial training for less fit soldiers is likely to reduce injuries and decrease the gender differential in risk of injuries.
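The abstract above reports relative risks followed by intervals, which are presumably 95% confidence intervals. As a hedged illustration of how an unadjusted RR and its interval are commonly computed from a 2x2 table, the Python sketch below uses the standard log-normal approximation. The counts are hypothetical and are not taken from the study, and the function name is an assumption. The adjusted estimate in the abstract comes from a multivariate model, which this sketch does not reproduce.

```python
import math


def relative_risk_ci(events_exposed, n_exposed, events_unexposed, n_unexposed, z=1.96):
    """Return (RR, lower, upper) using the log-normal approximation for the interval."""
    risk_exposed = events_exposed / n_exposed
    risk_unexposed = events_unexposed / n_unexposed
    rr = risk_exposed / risk_unexposed
    # Standard error of log(RR) for two independent binomial samples.
    se_log_rr = math.sqrt(
        1 / events_exposed - 1 / n_exposed
        + 1 / events_unexposed - 1 / n_unexposed
    )
    lower = math.exp(math.log(rr) - z * se_log_rr)
    upper = math.exp(math.log(rr) + z * se_log_rr)
    return rr, lower, upper


# Hypothetical counts, NOT from the study: 30 of 100 women and 15 of 100 men injured.
rr, lower, upper = relative_risk_ci(30, 100, 15, 100)
print(f"RR = {rr:.2f} ({lower:.2f}-{upper:.2f})")
```

An interval that includes 1, such as the adjusted 0.48-2.72 in the abstract, is what leads to the conclusion that gender is no longer a significant predictor once fitness is controlled.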
Power and Sample Size Calculations for Logistic Regression Tests for Differential Item Functioning
ERIC Educational Resources Information Center
Li, Zhushan
2014-01-01
Logistic regression is a popular method for detecting uniform and nonuniform differential item functioning (DIF) effects. Theoretical formulas for the power and sample size calculations are derived for likelihood ratio tests and Wald tests based on the asymptotic distribution of the maximum likelihood estimators for the logistic regression model.…
A Methodology for Generating Placement Rules that Utilizes Logistic Regression
ERIC Educational Resources Information Center
Wurtz, Keith
2008-01-01
The purpose of this article is to provide the necessary tools for institutional researchers to conduct a logistic regression analysis and interpret the results. Aspects of the logistic regression procedure that are necessary to evaluate models are presented and discussed with an emphasis on cutoff values and choosing the appropriate number of…
John Hogland; Nedret Billor; Nathaniel Anderson
2013-01-01
Discriminant analysis, referred to as maximum likelihood classification within popular remote sensing software packages, is a common supervised technique used by analysts. Polytomous logistic regression (PLR), also referred to as multinomial logistic regression, is an alternative classification approach that is less restrictive, more flexible, and easy to interpret. To...
Large Unbalanced Credit Scoring Using Lasso-Logistic Regression Ensemble
Wang, Hong; Xu, Qingsong; Zhou, Lifeng
2015-01-01
Recently, various ensemble learning methods with different base classifiers have been proposed for credit scoring problems. However, for various reasons, there has been little research using logistic regression as the base classifier. In this paper, given large unbalanced data, we consider the plausibility of ensemble learning using regularized logistic regression as the base classifier to deal with credit scoring problems. In this research, the data is first balanced and diversified by clustering and bagging algorithms. Then we apply a Lasso-logistic regression learning ensemble to evaluate the credit risks. We show that the proposed algorithm outperforms popular credit scoring models such as decision tree, Lasso-logistic regression and random forests in terms of AUC and F-measure. We also provide two importance measures for the proposed model to identify important variables in the data. PMID:25706988
Hoppe, C B; Oliveira, J A P; Grecca, F S; Haas, A N; Gomes, M S
2017-08-01
To evaluate the association between chronic oral inflammatory burden (OIB) - as the combination of periodontal and endodontic disease load - and physical fitness. One hundred and twelve nonsmoker male police officers who performed a standardized physical fitness test (PFT) were analysed. Participants underwent oral clinical and periapical radiographic examinations. Periodontal disease was assessed by probing depth (PD) and clinical attachment loss (AL). For radiographic analysis, both apical periodontitis (AP) and root canal treatment (RCT) variables were analysed. Endodontic Burden (EB) was calculated merging the total number of teeth with AP and/or RCT per individual. OIB was calculated combining EB and AL. The outcome of physical fitness was dichotomized according to whether the highest PFT score was 'achieved' or 'not-achieved'. Multivariable logistic regression models were adjusted for age, body mass index and frequency of daily exercise. There was no significant association between AP, RCT and EB with physical fitness whereas PD, AL and OIB were significantly associated with low physical fitness (P < 0.05). Multivariate regression analysis revealed that individuals with OIB = EB ≥ 3 and AL ≥ 4 mm had an 81% lower chance of reaching the highest PFT score (OR = 0.19, 95%CI = 0.04-0.87, P = 0.03) compared to individuals with EB < 3 and no AL ≥ 4 mm. Individuals with unfavourable periodontal parameters but with low EB (OIB = EB < 3 & AL ≥ 4 mm) showed no significant differences on the chance to reach the highest PFT score compared to participants with favourable periodontal status and low EB (OIB = EB < 3 & no AL ≥ 4 mm). The OIB - higher levels of EB in periodontal patients - was independently associated with poor physical fitness in males. © 2016 International Endodontic Journal. Published by John Wiley & Sons Ltd.
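The 81% figure in the abstract above follows from the reported odds ratio, since (0.19 - 1) x 100 = -81. Strictly, that is a reduction in the odds, not in the probability. The short Python sketch below shows the arithmetic and, under a hypothetical baseline probability that is not reported in the abstract, how an odds ratio translates into a probability. Both function names are illustrative.

```python
def percent_change_in_odds(odds_ratio):
    """Percent change in the odds implied by an odds ratio (negative means lower odds)."""
    return (odds_ratio - 1.0) * 100.0


def apply_odds_ratio(baseline_prob, odds_ratio):
    """Probability obtained by multiplying the baseline odds by an odds ratio."""
    baseline_odds = baseline_prob / (1.0 - baseline_prob)
    new_odds = baseline_odds * odds_ratio
    return new_odds / (1.0 + new_odds)


# Point estimate and 95% CI limits quoted in the abstract.
for label, value in [("estimate", 0.19), ("lower", 0.04), ("upper", 0.87)]:
    print(label, round(percent_change_in_odds(value)))

# Hypothetical baseline (an assumption): 50% of the reference group reach the top score.
print(round(apply_odds_ratio(0.5, 0.19), 3))
```

With the assumed 50% baseline, the probability drops to about 16%, a relative reduction of roughly 68%, which is why odds ratios and changes in chance should not be read interchangeably when the outcome is common.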
What Are the Odds of that? A Primer on Understanding Logistic Regression
ERIC Educational Resources Information Center
Huang, Francis L.; Moon, Tonya R.
2013-01-01
The purpose of this Methodological Brief is to present a brief primer on logistic regression, a commonly used technique when modeling dichotomous outcomes. Using data from the National Education Longitudinal Study of 1988 (NELS:88), logistic regression techniques were used to investigate student-level variables in eighth grade (i.e., enrolled in a…
Cost-of-illness studies based on massive data: a prevalence-based, top-down regression approach.
Stollenwerk, Björn; Welchowski, Thomas; Vogl, Matthias; Stock, Stephanie
2016-04-01
Despite the increasing availability of routine data, no analysis method has yet been presented for cost-of-illness (COI) studies based on massive data. We aim, first, to present such a method and, second, to assess the relevance of the associated gain in numerical efficiency. We propose a prevalence-based, top-down regression approach consisting of five steps: aggregating the data; fitting a generalized additive model (GAM); predicting costs via the fitted GAM; comparing predicted costs between prevalent and non-prevalent subjects; and quantifying the stochastic uncertainty via error propagation. To demonstrate the method, it was applied to aggregated data in the context of chronic lung disease to German sickness funds data (from 1999), covering over 7.3 million insured. To assess the gain in numerical efficiency, the computational time of the innovative approach has been compared with corresponding GAMs applied to simulated individual-level data. Furthermore, the probability of model failure was modeled via logistic regression. Applying the innovative method was reasonably fast (19 min). In contrast, regarding patient-level data, computational time increased disproportionately by sample size. Furthermore, using patient-level data was accompanied by a substantial risk of model failure (about 80 % for 6 million subjects). The gain in computational efficiency of the innovative COI method seems to be of practical relevance. Furthermore, it may yield more precise cost estimates.
Stylianou, Neophytos; Akbarov, Artur; Kontopantelis, Evangelos; Buchan, Iain; Dunn, Ken W
2015-08-01
Predicting mortality from burn injury has traditionally employed logistic regression models. Alternative machine learning methods have been introduced in some areas of clinical prediction as the necessary software and computational facilities have become accessible. Here we compare logistic regression and machine learning predictions of mortality from burn. An established logistic mortality model was compared to machine learning methods (artificial neural network, support vector machine, random forests and naïve Bayes) using a population-based (England & Wales) case-cohort registry. Predictive evaluation used: area under the receiver operating characteristic curve; sensitivity; specificity; positive predictive value and Youden's index. All methods had comparable discriminatory abilities, similar sensitivities, specificities and positive predictive values. Although some machine learning methods performed marginally better than logistic regression the differences were seldom statistically significant and clinically insubstantial. Random forests were marginally better for high positive predictive value and reasonable sensitivity. Neural networks yielded slightly better prediction overall. Logistic regression gives an optimal mix of performance and interpretability. The established logistic regression model of burn mortality performs well against more complex alternatives. Clinical prediction with a small set of strong, stable, independent predictors is unlikely to gain much from machine learning outside specialist research contexts. Copyright © 2015 Elsevier Ltd and ISBI. All rights reserved.
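Most of the evaluation measures named in the abstract above can be computed directly from a confusion matrix. The Python sketch below is a minimal illustration with hypothetical counts, not the registry data. The area under the ROC curve is omitted because it needs predicted scores, not just counts. The function name is an assumption.

```python
def classification_metrics(tp, fp, tn, fn):
    """Sensitivity, specificity, PPV and Youden's index from confusion-matrix counts."""
    sensitivity = tp / (tp + fn)   # proportion of true deaths that were predicted
    specificity = tn / (tn + fp)   # proportion of true survivors that were predicted
    ppv = tp / (tp + fp)           # proportion of predicted deaths that were correct
    youden = sensitivity + specificity - 1.0
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "ppv": ppv,
        "youden": youden,
    }


# Hypothetical counts for illustration only.
metrics = classification_metrics(tp=40, fp=5, tn=45, fn=10)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

Computing the same dictionary for each candidate model on the same held-out cases gives a like-for-like comparison of the kind the abstract describes.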
Valle, Denis; Lima, Joanna M Tucker; Millar, Justin; Amratia, Punam; Haque, Ubydul
2015-11-04
Logistic regression is a statistical model widely used in cross-sectional and cohort studies to identify and quantify the effects of potential disease risk factors. However, the impact of imperfect tests on adjusted odds ratios (and thus on the identification of risk factors) is under-appreciated. The purpose of this article is to draw attention to the problem associated with modelling imperfect diagnostic tests, and propose simple Bayesian models to adequately address this issue. A systematic literature review was conducted to determine the proportion of malaria studies that appropriately accounted for false-negatives/false-positives in a logistic regression setting. Inference from the standard logistic regression was also compared with that from three proposed Bayesian models using simulations and malaria data from the western Brazilian Amazon. A systematic literature review suggests that malaria epidemiologists are largely unaware of the problem of using logistic regression to model imperfect diagnostic test results. Simulation results reveal that statistical inference can be substantially improved when using the proposed Bayesian models versus the standard logistic regression. Finally, analysis of original malaria data with one of the proposed Bayesian models reveals that microscopy sensitivity is strongly influenced by how long people have lived in the study region, and an important risk factor (i.e., participation in forest extractivism) is identified that would have been missed by standard logistic regression. Given the numerous diagnostic methods employed by malaria researchers and the ubiquitous use of logistic regression to model the results of these diagnostic tests, this paper provides critical guidelines to improve data analysis practice in the presence of misclassification error. Easy-to-use code that can be readily adapted to WinBUGS is provided, enabling straightforward implementation of the proposed Bayesian models.
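The bias described in the abstract above can be seen from the basic misclassification relation: with sensitivity Se and specificity Sp, a true prevalence p is observed as Se*p + (1 - Sp)*(1 - p). The Python sketch below is only an illustration of that relation, not the authors' Bayesian models. All numbers are hypothetical, and it assumes the same Se and Sp in both exposure groups, whereas the abstract notes that sensitivity can itself depend on covariates.

```python
def observed_positive_prob(true_prob, sensitivity, specificity):
    """Probability of a positive test result given the true disease probability."""
    return sensitivity * true_prob + (1.0 - specificity) * (1.0 - true_prob)


def odds_ratio(p_exposed, p_unexposed):
    """Odds ratio comparing two probabilities."""
    return (p_exposed / (1.0 - p_exposed)) / (p_unexposed / (1.0 - p_unexposed))


# Hypothetical true prevalences and test characteristics (assumptions).
true_unexposed, true_exposed = 0.10, 0.30
se, sp = 0.70, 0.95

obs_unexposed = observed_positive_prob(true_unexposed, se, sp)
obs_exposed = observed_positive_prob(true_exposed, se, sp)

true_or = odds_ratio(true_exposed, true_unexposed)
observed_or = odds_ratio(obs_exposed, obs_unexposed)
print(f"true OR = {true_or:.2f}, observed OR = {observed_or:.2f}")
```

In this setup the odds ratio computed from test results is closer to 1 than the true odds ratio, which is one way a real risk factor can be missed by a standard logistic regression on imperfect test data.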
JROTC as a Substitute for PE: Really?
Lounsbery, Monica A. F.; Holt, Kathryn A.; Monnat, Shannon A.; McKenzie, Thomas L.; Funk, Brian
2014-01-01
Purpose: Even though physical education (PE) is an evidence-based strategy for providing and promoting physical activity, alternative programs such as Junior Reserve Officer Training Corps (JROTC) are commonly substituted for PE in many states. The purpose of this study was to compare student physical activity and lesson contexts during high school PE and JROTC sessions. Method: SOFIT (System for Observing Fitness Instruction Time) was used to assess PE and JROTC sessions (N=38 each) in 4 high schools that provided both programs. Data were analyzed using t-tests, negative binomial regression, and logistic regression. Results: Students engaged in significantly more moderate to vigorous physical activity during PE than JROTC sessions and they were significantly less sedentary. Significant differences between the two program types were also found among lesson contexts. Conclusions: PE and JROTC provide substantially different content and contexts and students in them engage in substantially different amounts of moderate to vigorous physical activity. Students in JROTC, and perhaps other alternative programs, are less likely to accrue health-supporting physical activity and engage in fewer opportunities to be physically fit and motorically skilled. Policies and practices for providing substitutions for PE should be carefully examined. PMID:25141093
Predicting Madura cattle growth curve using non-linear model
NASA Astrophysics Data System (ADS)
Widyas, N.; Prastowo, S.; Widi, T. S. M.; Baliarti, E.
2018-03-01
Madura cattle are native to Indonesia. They are a composite breed that has undergone hundreds of years of selection and domestication to reach today's remarkable uniformity. Crossbreeding has reached the isle of Madura, and the Madrasin, a cross between Madura cows and Limousin semen, has emerged. This paper aimed to compare the growth curves of Madrasin and one type of pure Madura cows, the common Madura cattle (Madura), using non-linear models. Madura cattle are kept traditionally, thus reliable records are hardly available. Data were collected from smallholder farmers in Madura. Cows from different age classes (<6 months, 6-12 months, 1-2 years, 2-3 years, 3-5 years and >5 years) were observed, and body measurements (chest girth, body length and wither height) were taken. In total, 63 Madura and 120 Madrasin records were obtained. A linear model was built with cattle sub-population and age as explanatory variables. Body weights were estimated based on the chest girth. Growth curves were built using logistic regression. Results showed that within the same age, Madrasin has a significantly larger body than Madura (p<0.05). The logistic models fit better for the Madura and Madrasin cattle data; the estimated MSEs for these models were 39.09 and 759.28, with prediction accuracies of 99% and 92% for Madura and Madrasin, respectively. Prediction of the growth curve using the logistic regression model performed well in both types of Madura cattle. However, efforts to keep accurate records on Madura cattle are necessary to better characterize and study these cattle.
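The abstract above does not state the functional form of its logistic growth model. A common choice for cattle growth curves is the three-parameter logistic function W(t) = A / (1 + b * exp(-k * t)), where A is the asymptotic mature weight. The Python sketch below evaluates that form under this assumption. The parameter values are hypothetical and are not estimates from the study, and the function names are illustrative.

```python
import math


def logistic_growth(age, asymptote, b, k):
    """Body weight at a given age under a three-parameter logistic growth curve."""
    return asymptote / (1.0 + b * math.exp(-k * age))


def inflection_age(b, k):
    """Age at which growth is fastest; the weight there is half the asymptote."""
    return math.log(b) / k


# Hypothetical parameters (assumptions): mature weight in kg, shape, rate per year.
A, b, k = 300.0, 8.0, 0.9

for age_years in (0.5, 1, 2, 3, 5):
    print(age_years, round(logistic_growth(age_years, A, b, k), 1))

print("inflection age (years):", round(inflection_age(b, k), 2))
```

Fitting A, b and k separately for each sub-population and comparing the fitted asymptotes is one way the size difference between the two groups could be summarised.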
Li, Baoyue; Lingsma, Hester F; Steyerberg, Ewout W; Lesaffre, Emmanuel
2011-05-23
Logistic random effects models are a popular tool to analyze multilevel, also called hierarchical, data with a binary or ordinal outcome. Here, we aim to compare different statistical software implementations of these models. We used individual patient data from 8509 patients in 231 centers with moderate and severe Traumatic Brain Injury (TBI) enrolled in eight Randomized Controlled Trials (RCTs) and three observational studies. We fitted logistic random effects regression models with the 5-point Glasgow Outcome Scale (GOS) as outcome, both dichotomized as well as ordinal, with center and/or trial as random effects, and as covariates age, motor score, pupil reactivity or trial. We then compared the implementations of frequentist and Bayesian methods to estimate the fixed and random effects. Frequentist approaches included R (lme4), Stata (GLLAMM), SAS (GLIMMIX and NLMIXED), MLwiN ([R]IGLS) and MIXOR; Bayesian approaches included WinBUGS, MLwiN (MCMC), R package MCMCglmm and SAS experimental procedure MCMC. Three data sets (the full data set and two sub-datasets) were analysed using basically two logistic random effects models with either one random effect for the center or two random effects for center and trial. For the ordinal outcome in the full data set also a proportional odds model with a random center effect was fitted. The packages gave similar parameter estimates for both the fixed and random effects and for the binary (and ordinal) models for the main study and when based on a relatively large number of level-1 (patient level) data compared to the number of level-2 (hospital level) data. However, when based on a relatively sparse data set, i.e. when the numbers of level-1 and level-2 data units were about the same, the frequentist and Bayesian approaches showed somewhat different results. The software implementations differ considerably in flexibility, computation time, and usability.
There are also differences in the availability of additional tools for model evaluation, such as diagnostic plots. The experimental SAS (version 9.2) procedure MCMC appeared to be inefficient. On relatively large data sets, the different software implementations of logistic random effects regression models produced similar results. Thus, for a large data set there seems to be no explicit preference (of course if there is no preference from a philosophical point of view) for either a frequentist or Bayesian approach (if based on vague priors). The choice for a particular implementation may largely depend on the desired flexibility, and the usability of the package. For small data sets the random effects variances are difficult to estimate. In the frequentist approaches the MLE of this variance was often estimated zero with a standard error that is either zero or could not be determined, while for Bayesian methods the estimates could depend on the chosen "non-informative" prior of the variance parameter. The starting value for the variance parameter may be also critical for the convergence of the Markov chain.
Cha, Jae Myung; Lee, Joung Il; Joo, Kwang Ro; Shin, Hyun Phil; Park, Jae Jun
2011-11-01
Colorectal cancer (CRC) screening with a fecal immunochemical test (FIT) reduces CRC mortality; however, the acceptance rate of a colonoscopy in patients with a positive FIT was not high. The aim of this study was therefore to determine whether a telephone reminder call could increase the acceptance rate of colonoscopy in patients with a positive FIT. We performed FITs for asymptomatic participants aged 50 years or older. For patients with a positive FIT, a colonoscopy was recommended via mailing notification only (control group) or via a telephone reminder call after mailing notification (intervention group). The calls informed patients about the significance of a positive FIT and encouraged a colonoscopy following positive FITs. The FIT results were positive in 90 of 8,318 patients who received FITs. Fifty patients were advised to receive colonoscopy via mailing notification only, and 40 patients were advised via both a telephone reminder call and a mailing notification. The acceptance rate of colonoscopy was significantly higher in the intervention group than in the control group (p = 0.038). The lesion-detection rate for an advanced neoplasia was also significantly higher in the intervention group than in the control group (p = 0.046). According to multivariate logistic regression analysis, a telephone reminder was a significant determinant of colonoscopy acceptance in patients with a positive FIT (OR 4.33; 95% CI, 1.19-15.75; p = 0.026). Telephone reminder calls in addition to mailing notification improved the acceptance rate of colonoscopy in patients with a positive FIT.
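The reported odds ratio and its interval can be sanity-checked. The sketch below assumes the interval is a Wald interval on the log-odds scale (the abstract does not say which method was used) and recovers the implied standard error from the published limits; the function names are illustrative.

```python
import math


def se_from_ci(lower, upper, z=1.96):
    """Standard error of log(OR) implied by a symmetric (Wald) 95% interval."""
    return (math.log(upper) - math.log(lower)) / (2.0 * z)


def wald_ci_from_or(odds_ratio, se_log_or, z=1.96):
    """Wald confidence limits for an odds ratio, computed on the log scale."""
    log_or = math.log(odds_ratio)
    return math.exp(log_or - z * se_log_or), math.exp(log_or + z * se_log_or)


# Values reported in the abstract: OR 4.33; 95% CI, 1.19-15.75.
se = se_from_ci(1.19, 15.75)
lower, upper = wald_ci_from_or(4.33, se)
```

Rebuilding the interval from the point estimate and the implied standard error reproduces the published limits to rounding, which is what one would expect if the interval is symmetric around log(4.33).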
Logistic regression for risk factor modelling in stuttering research.
Reed, Phil; Wu, Yaqionq
2013-06-01
To outline the uses of logistic regression and other statistical methods for risk factor analysis in the context of research on stuttering. The principles underlying the application of a logistic regression are illustrated, and the types of questions to which such a technique has been applied in the stuttering field are outlined. The assumptions and limitations of the technique are discussed with respect to existing stuttering research, and with respect to formulating appropriate research strategies to accommodate these considerations. Finally, some alternatives to the approach are briefly discussed. The way the statistical procedures are employed is demonstrated with some hypothetical data. Research into several practical issues concerning stuttering could benefit if risk factor modelling were used. Important examples are early diagnosis, prognosis (whether a child will recover or persist) and assessment of treatment outcome. After reading this article you will: (a) Summarize the situations in which logistic regression can be applied to a range of issues about stuttering; (b) Follow the steps in performing a logistic regression analysis; (c) Describe the assumptions of the logistic regression technique and the precautions that need to be checked when it is employed; (d) Be able to summarize its advantages over other techniques like estimation of group differences and simple regression. Copyright © 2012 Elsevier Inc. All rights reserved.
Sharma, Jagannath; Golby, Jim; Greeves, Julie; Spears, Iain R
2011-03-01
Medial tibial stress syndrome (MTSS) is a common injury in active populations and has been suggested to be a result of both biomechanical and lifestyle factors. The main aim of this study was to determine prospectively whether gait biomechanics and lifestyle factors can be used as a predictor of MTSS development. British infantry male recruits (n=468) were selected for the study. Plantar pressure variables, lifestyle factors comprising smoking habit and aerobic fitness as measured by a 1.5 mile timed-run were collected on the first day of training. Injury data were collected during the 26 week training period and incidence rate was 7.9% (n=37). A logistic regression model for membership of the MTSS and non-MTSS groups was developed. An imbalance in foot pressure with greater pressure on the medial side than on the lateral side was the primary risk factor. Low aerobic fitness, as deduced from a 1.5 mile timed-run and smoking habit were also important, but were additive risk factors for MTSS. In conclusion, "poor" biomechanics were the strongest predictors of MTSS development but lifestyle factors were also important. The logistic regression model combining all three risk factors was capable of predicting 96.9% of the non-injured group and 67.5% of the MTSS group with an overall accuracy of 87.7%. While the model has yet to be validated against an external sample and limitations exist with regards to the quality of the data collected, it is nonetheless suggested that the combined analysis of biomechanical and lifestyle factors has the potential to improve the prediction of MTSS. Copyright © 2010 Elsevier B.V. All rights reserved.
Developmental trajectories of paediatric headache - sex-specific analyses and predictors.
Isensee, Corinna; Fernandez Castelao, Carolin; Kröner-Herwig, Birgit
2016-01-01
Headache is the most common pain disorder in children and adolescents and is associated with diverse dysfunctions and psychological symptoms. Several studies evidenced sex-specific differences in headache frequency. Until now no study exists that examined sex-specific patterns of change in paediatric headache across time and included pain-related somatic and (socio-)psychological predictors. Latent Class Growth Analysis (LCGA) was used in order to identify different trajectory classes of headache across four annual time points in a population-based sample (n = 3 227; mean age 11.34 years; 51.2 % girls). In multinomial logistic regression analyses the influence of several predictors on the class membership was examined. For girls, a four-class model was identified as the best fitting model. While the majority of girls reported no (30.5 %) or moderate headache frequencies (32.5 %) across time, one class with a high level of headache days (20.8 %) and a class with an increasing headache frequency across time (16.2 %) were identified. For boys a two class model with a 'no headache class' (48.6 %) and 'moderate headache class' (51.4 %) showed the best model fit. Regarding logistic regression analyses, migraine and parental headache proved to be stable predictors across sexes. Depression/anxiety was a significant predictor for all pain classes in girls. Life events, dysfunctional stress coping and school burden were also able to differentiate at least between some classes in both sexes. The identified trajectories reflect sex-specific differences in paediatric headache, as seen in the number and type of classes extracted. The documented risk factors can deliver ideas for preventive actions and considerations for treatment programmes.
Coffee consumption modifies risk of estrogen-receptor negative breast cancer
2011-01-01
Introduction Breast cancer is a complex disease and may be sub-divided into hormone-responsive (estrogen receptor (ER) positive) and non-hormone-responsive subtypes (ER-negative). Some evidence suggests that heterogeneity exists in the associations between coffee consumption and breast cancer risk, according to different estrogen receptor subtypes. We assessed the association between coffee consumption and postmenopausal breast cancer risk in a large population-based study (2,818 cases and 3,111 controls), overall, and stratified by ER tumour subtypes. Methods Odds ratios (OR) and corresponding 95% confidence intervals (CI) were estimated using the multivariate logistic regression models fitted to examine breast cancer risk in a stratified case-control analysis. Heterogeneity among ER subtypes was evaluated in a case-only analysis, by fitting binary logistic regression models, treating ER status as a dependent variable, with coffee consumption included as a covariate. Results In the Swedish study, coffee consumption was associated with a modest decrease in overall breast cancer risk in the age-adjusted model (OR for >5 cups/day compared to ≤1 cup/day: 0.80, 95% CI: 0.64, 0.99, P trend = 0.028). In the stratified case-control analyses, a significant reduction in the risk of ER-negative breast cancer was observed in heavy coffee drinkers (OR for >5 cups/day compared to ≤1 cup/day: 0.43, 95% CI: 0.25, 0.72, P trend = 0.0003) in a multivariate-adjusted model. The breast cancer risk reduction associated with higher coffee consumption was significantly higher for ER-negative compared to ER-positive tumours (P heterogeneity (age-adjusted) = 0.004). Conclusions A high daily intake of coffee was found to be associated with a statistically significant decrease in ER-negative breast cancer among postmenopausal women. PMID:21569535
Robertson, Sam; Woods, Carl; Gastin, Paul
2015-09-01
To develop a physiological performance and anthropometric attribute model to predict Australian Football League draft selection. Cross-sectional observational. Data were obtained (n=4902) from three Under-18 Australian football competitions between 2010 and 2013. Players were allocated into one of the three groups, based on their highest level of selection in their final year of junior football (Australian Football League Drafted, n=292; National Championship, n=293; State-level club, n=4317). Physiological performance (vertical jumps, agility, speed and running endurance) and anthropometric (body mass and height) data were obtained. Hedges' effect sizes were calculated to assess the influence of selection-level and competition on these physical attributes, with logistic regression models constructed to discriminate Australian Football League Drafted and National Championship players. Rule induction analysis was undertaken to determine a set of rules for discriminating selection-level. Effect size comparisons revealed a range of small to moderate differences between State-level club players and both other groups for all attributes, with trivial to small differences between Australian Football League Drafted and National Championship players noted. Logistic regression models showed multistage fitness test, height and 20 m sprint time as the most important attributes in predicting Draft success. Rule induction analysis showed that players displaying multistage fitness test scores of >14.01 and/or 20 m sprint times of <2.99 s were most likely to be recruited. High levels of performance in aerobic and/or speed tests increase the likelihood of elite junior Australian football players being recruited to the highest level of the sport. Copyright © 2014 Sports Medicine Australia. Published by Elsevier Ltd. All rights reserved.
Beal, Eliza W; Tumin, Dmitry; Chakedis, Jeffery; Porter, Erica; Moris, Dimitrios; Zhang, Xu-Feng; Arnold, Mark; Harzman, Alan; Husain, Syed; Schmidt, Carl R; Pawlik, Timothy M
2018-07-01
Given the conflicting nature of reported risk factors for post-discharge venous thromboembolism (VTE) and unclear guidelines for post-discharge pharmacoprophylaxis, we sought to determine risk factors for 30-day post-discharge VTE after colectomy to predict which patients will benefit from post-discharge pharmacoprophylaxis. Patients who underwent colectomy in the American College of Surgeons National Surgical Quality Improvement Project Participant Use Files from 2011 to 2015 were identified. Logistic regression modeling was used. Receiver-operating characteristic curves were used and the best cut-points were determined using Youden's J index (sensitivity + specificity - 1). Hosmer-Lemeshow goodness-of-fit test was used to test model calibration. A random sample of 30% of the cohort was used as a validation set. Among 77,823 cases, the overall incidence of VTE after colectomy was 1.9%, with 0.7% of VTE events occurring in the post-discharge setting. Factors associated with post-discharge VTE risk, including body mass index, preoperative albumin, operation time, hospital length of stay, race, smoking status, inflammatory bowel disease, return to the operating room and postoperative ileus, were included in the logistic regression model. The model demonstrated good calibration (goodness of fit P = 0.7137) and good discrimination (area under the curve (AUC) = 0.68; validation set, AUC = 0.70). A score of ≥-5.00 had the maximum sensitivity and specificity, resulting in 36.63% of patients being treated with prophylaxis for an overall VTE risk of 0.67%. Approximately one-third of post-colectomy VTE events occurred after discharge. Patients with predicted post-discharge VTE risk of ≥-5.00 should be recommended for extended post-discharge VTE prophylaxis.
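Choosing a cut-point by Youden's J index (sensitivity + specificity - 1) can be sketched as follows. The scores and outcomes here are made up for illustration and are not the study's data; the rule "predict positive when score >= cut" is an assumption that matches the "≥" threshold quoted in the abstract.

```python
def youden_best_cutpoint(scores, labels):
    """Return (cut, J, sensitivity, specificity) maximizing Youden's J.

    A case is predicted positive when its score is >= cut.
    Both outcome classes (0 and 1) must be present in `labels`.
    """
    best = None
    for cut in sorted(set(scores)):
        tp = sum(1 for s, y in zip(scores, labels) if s >= cut and y == 1)
        fn = sum(1 for s, y in zip(scores, labels) if s < cut and y == 1)
        tn = sum(1 for s, y in zip(scores, labels) if s < cut and y == 0)
        fp = sum(1 for s, y in zip(scores, labels) if s >= cut and y == 0)
        sensitivity = tp / (tp + fn)
        specificity = tn / (tn + fp)
        j = sensitivity + specificity - 1.0
        if best is None or j > best[1]:
            best = (cut, j, sensitivity, specificity)
    return best


# Hypothetical linear-predictor scores and observed outcomes (1 = event).
scores = [-7.0, -6.5, -6.0, -5.5, -5.0, -4.5, -4.0, -3.5]
labels = [0, 0, 0, 0, 1, 0, 1, 1]
cut, j, sens, spec = youden_best_cutpoint(scores, labels)
```

The loop simply evaluates every observed score as a candidate threshold; for large cohorts a sorted single pass would be more efficient, but the selection criterion is the same.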
Dynamic Dimensionality Selection for Bayesian Classifier Ensembles
2015-03-19
learning of weights in an otherwise generatively learned naive Bayes classifier. WANBIA-C is very competitive with Logistic Regression but much more...classifier, Generative learning, Discriminative learning, Naïve Bayes, Feature selection, Logistic regression, higher order attribute independence 16...discriminative learning of weights in an otherwise generatively learned naive Bayes classifier. WANBIA-C is very competitive with Logistic Regression but
Travis Woolley; David C. Shaw; Lisa M. Ganio; Stephen Fitzgerald
2012-01-01
Logistic regression models used to predict tree mortality are critical to post-fire management, planning prescribed burns and understanding disturbance ecology. We review literature concerning post-fire mortality prediction using logistic regression models for coniferous tree species in the western USA. We include synthesis and review of: methods to develop, evaluate...
Preserving Institutional Privacy in Distributed binary Logistic Regression.
Wu, Yuan; Jiang, Xiaoqian; Ohno-Machado, Lucila
2012-01-01
Privacy is becoming a major concern when sharing biomedical data across institutions. Although methods for protecting privacy of individual patients have been proposed, it is not clear how to protect the institutional privacy, which is many times a critical concern of data custodians. Built upon our previous work, Grid Binary LOgistic REgression (GLORE), we developed an Institutional Privacy-preserving Distributed binary Logistic Regression model (IPDLR) that considers both individual and institutional privacy for building a logistic regression model in a distributed manner. We tested our method using both simulated and clinical data, showing how it is possible to protect the privacy of individuals and of institutions using a distributed strategy.
Covariate Imbalance and Adjustment for Logistic Regression Analysis of Clinical Trial Data
Ciolino, Jody D.; Martin, Reneé H.; Zhao, Wenle; Jauch, Edward C.; Hill, Michael D.; Palesch, Yuko Y.
2014-01-01
In logistic regression analysis for binary clinical trial data, adjusted treatment effect estimates are often not equivalent to unadjusted estimates in the presence of influential covariates. This paper uses simulation to quantify the benefit of covariate adjustment in logistic regression. However, International Conference on Harmonization guidelines suggest that covariate adjustment be pre-specified. Unplanned adjusted analyses should be considered secondary. Results suggest that if adjustment is not possible or unplanned in a logistic setting, balance in continuous covariates can alleviate some (but never all) of the shortcomings of unadjusted analyses. The case of log binomial regression is also explored. PMID:24138438
Differentially private distributed logistic regression using private and public data.
Ji, Zhanglong; Jiang, Xiaoqian; Wang, Shuang; Xiong, Li; Ohno-Machado, Lucila
2014-01-01
Privacy protection is an important issue in medical informatics and differential privacy is a state-of-the-art framework for data privacy research. Differential privacy offers provable privacy against attackers who have auxiliary information, and can be applied to data mining models (for example, logistic regression). However, differentially private methods sometimes introduce too much noise and make outputs less useful. Given available public data in medical research (e.g. from patients who sign open-consent agreements), we can design algorithms that use both public and private data sets to decrease the amount of noise that is introduced. In this paper, we modify the update step in the Newton-Raphson method to propose a differentially private distributed logistic regression model based on both public and private data. We try our algorithm on three different data sets, and show its advantage over: (1) a logistic regression model based solely on public data, and (2) a differentially private distributed logistic regression model based on private data under various scenarios. Logistic regression models built with our new algorithm based on both private and public datasets demonstrate better utility than models trained on private or public datasets alone, without sacrificing the rigorous privacy guarantee.
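For orientation, the unmodified Newton-Raphson update for logistic regression, which the abstract says the authors adapt, can be sketched for a single predictor plus intercept. This is the plain, non-private, single-site baseline only; the noise mechanism and the public/private split described in the abstract are not reproduced here, and the data are hypothetical.

```python
import math


def sigmoid(z):
    """Logistic function mapping a linear predictor to a probability."""
    return 1.0 / (1.0 + math.exp(-z))


def newton_raphson_logistic(x, y, iterations=25):
    """Fit intercept b0 and slope b1 by Newton-Raphson (no privacy noise)."""
    b0, b1 = 0.0, 0.0
    for _ in range(iterations):
        p = [sigmoid(b0 + b1 * xi) for xi in x]
        # Gradient of the log-likelihood: X'(y - p).
        g0 = sum(yi - pi for yi, pi in zip(y, p))
        g1 = sum((yi - pi) * xi for xi, yi, pi in zip(x, y, p))
        # Negative Hessian X'WX with weights p(1 - p).
        w = [pi * (1.0 - pi) for pi in p]
        h00 = sum(w)
        h01 = sum(wi * xi for wi, xi in zip(w, x))
        h11 = sum(wi * xi * xi for wi, xi in zip(w, x))
        det = h00 * h11 - h01 * h01
        # Update step: beta <- beta + (X'WX)^{-1} X'(y - p), 2x2 inverse written out.
        b0 += (h11 * g0 - h01 * g1) / det
        b1 += (h00 * g1 - h01 * g0) / det
    return b0, b1


# Hypothetical, non-separable toy data.
x = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0]
y = [0, 0, 1, 0, 1, 1]
b0, b1 = newton_raphson_logistic(x, y)
fitted = [sigmoid(b0 + b1 * xi) for xi in x]
```

In a distributed setting the gradient and Hessian sums are natural quantities to compute per site and then combine, which is presumably why the update step is the place where a privacy mechanism would be inserted.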
Amini, Payam; Maroufizadeh, Saman; Samani, Reza Omani; Hamidi, Omid; Sepidarkish, Mahdi
2017-06-01
Preterm birth (PTB) is a leading cause of neonatal death and the second biggest cause of death in children under five years of age. The objective of this study was to determine the prevalence of PTB and its associated factors using logistic regression and decision tree classification methods. This cross-sectional study was conducted on 4,415 pregnant women in Tehran, Iran, from July 6-21, 2015. Data were collected by a researcher-developed questionnaire through interviews with mothers and review of their medical records. To evaluate the accuracy of the logistic regression and decision tree methods, several indices such as sensitivity, specificity, and the area under the curve were used. The PTB rate was 5.5% in this study. The logistic regression outperformed the decision tree for the classification of PTB based on risk factors. Logistic regression showed that multiple pregnancies, mothers with preeclampsia, and those who conceived with assisted reproductive technology had an increased risk for PTB ( p < 0.05). Identifying and training mothers at risk as well as improving prenatal care may reduce the PTB rate. We also recommend that statisticians utilize the logistic regression model for the classification of risk groups for PTB.
Tahir, M Ramzan; Tran, Quang X; Nikulin, Mikhail S
2017-05-30
We studied the problem of testing a hypothesized distribution in survival regression models when the data is right censored and survival times are influenced by covariates. A modified chi-squared type test, known as Nikulin-Rao-Robson statistic, is applied for the comparison of accelerated failure time models. This statistic is used to test the goodness-of-fit for hypertabastic survival model and four other unimodal hazard rate functions. The results of simulation study showed that the hypertabastic distribution can be used as an alternative to log-logistic and log-normal distribution. In statistical modeling, because of its flexible shape of hazard functions, this distribution can also be used as a competitor of Birnbaum-Saunders and inverse Gaussian distributions. The results for the real data application are shown. Copyright © 2017 John Wiley & Sons, Ltd.
Logistic regression for dichotomized counts.
Preisser, John S; Das, Kalyan; Benecha, Habtamu; Stamm, John W
2016-12-01
Sometimes there is interest in a dichotomized outcome indicating whether a count variable is positive or zero. Under this scenario, the application of ordinary logistic regression may result in efficiency loss, which is quantifiable under an assumed model for the counts. In such situations, a shared-parameter hurdle model is investigated for more efficient estimation of regression parameters relating to overall effects of covariates on the dichotomous outcome, while handling count data with many zeroes. One model part provides a logistic regression containing marginal log odds ratio effects of primary interest, while an ancillary model part describes the mean count of a Poisson or negative binomial process in terms of nuisance regression parameters. Asymptotic efficiency of the logistic model parameter estimators of the two-part models is evaluated with respect to ordinary logistic regression. Simulations are used to assess the properties of the models with respect to power and Type I error, the latter investigated under both misspecified and correctly specified models. The methods are applied to data from a randomized clinical trial of three toothpaste formulations to prevent incident dental caries in a large population of Scottish schoolchildren. © The Author(s) 2014.
Zhu, K; Lou, Z; Zhou, J; Ballester, N; Kong, N; Parikh, P
2015-01-01
This article is part of the Focus Theme of Methods of Information in Medicine on "Big Data and Analytics in Healthcare". Hospital readmissions raise healthcare costs and cause significant distress to providers and patients. It is, therefore, of great interest to healthcare organizations to predict which patients are at risk of being readmitted to their hospitals. However, current logistic regression based risk prediction models have limited prediction power when applied to hospital administrative data. Meanwhile, although decision trees and random forests have been applied, they tend to be too complex for hospital practitioners to understand. The objective was to explore the use of conditional logistic regression to increase the prediction accuracy. We analyzed an HCUP statewide inpatient discharge record dataset, which includes patient demographics, clinical and care utilization data from California. We extracted records of heart failure Medicare beneficiaries who had inpatient experience during an 11-month period. We corrected the data imbalance issue with under-sampling. In our study, we first applied standard logistic regression and decision tree to obtain influential variables and derive practically meaningful decision rules. We then stratified the original data set accordingly and applied logistic regression on each data stratum. We further explored the effect of interacting variables in the logistic regression modeling. We conducted cross validation to assess the overall prediction performance of conditional logistic regression (CLR) and compared it with standard classification models. The developed CLR models outperformed several standard classification models (e.g., straightforward logistic regression, stepwise logistic regression, random forest, support vector machine). For example, the best CLR model improved the classification accuracy by nearly 20% over the straightforward logistic regression model.
Furthermore, the developed CLR models tend to achieve better sensitivity of more than 10% over the standard classification models, which can be translated to correct labeling of additional 400 - 500 readmissions for heart failure patients in the state of California over a year. Lastly, several key predictors identified from the HCUP data include the disposition location from discharge, the number of chronic conditions, and the number of acute procedures. It would be beneficial to apply simple decision rules obtained from the decision tree in an ad-hoc manner to guide the cohort stratification. It could be potentially beneficial to explore the effect of pairwise interactions between influential predictors when building the logistic regression models for different data strata. Judicious use of the ad-hoc CLR models developed offers insights into future development of prediction models for hospital readmissions, which can lead to better intuition in identifying high-risk patients and developing effective post-discharge care strategies. Lastly, this paper is expected to raise the awareness of collecting data on additional markers and developing necessary database infrastructure for larger-scale exploratory studies on readmission risk prediction.
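The abstract mentions correcting class imbalance with under-sampling but does not describe the scheme. A minimal sketch, assuming simple random under-sampling of every class down to the size of the smallest one (the function name and record layout are hypothetical), might look like this:

```python
import random


def undersample(records, label_of, seed=0):
    """Randomly down-sample each class to the size of the smallest class.

    `label_of` extracts the class label from a record. This is an assumed
    scheme for illustration, not the procedure used in the study.
    """
    rng = random.Random(seed)
    by_class = {}
    for record in records:
        by_class.setdefault(label_of(record), []).append(record)
    target = min(len(rows) for rows in by_class.values())
    balanced = []
    for rows in by_class.values():
        balanced.extend(rng.sample(rows, target))
    rng.shuffle(balanced)
    return balanced


# Hypothetical records: (patient_id, readmitted) with 3 readmissions out of 20.
records = [(i, 1 if i < 3 else 0) for i in range(20)]
balanced = undersample(records, label_of=lambda r: r[1])
```

Under-sampling discards majority-class records, so any model fitted afterwards sees a different event rate than the source population; predicted probabilities would need recalibration before being read as absolute risks.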
Interpretation of commonly used statistical regression models.
Kasza, Jessica; Wolfe, Rory
2014-01-01
A review of some regression models commonly used in respiratory health applications is provided in this article. Simple linear regression, multiple linear regression, logistic regression and ordinal logistic regression are considered. The focus of this article is on the interpretation of the regression coefficients of each model, which are illustrated through the application of these models to a respiratory health research study. © 2013 The Authors. Respirology © 2013 Asian Pacific Society of Respirology.
Assessment and monitoring practices of Australian fitness professionals.
Bennie, Jason A; Wiesner, Glen H; van Uffelen, Jannique G Z; Harvey, Jack T; Craike, Melinda J; Biddle, Stuart J H
2018-04-01
Assessment and monitoring of client health and fitness is a key part of fitness professionals' practices. However, little is known about prevalence of this practice. This study describes the assessment/monitoring practices of a large sample of Australian fitness professionals. Cross-sectional. In 2014, 1206 fitness professionals completed an online survey. Respondents reported their frequency (4 point-scale: [1] 'never' to [4] 'always') of assessment/monitoring of eight health and fitness constructs (e.g. body composition, aerobic fitness). This was classified as: (i) 'high' ('always' assessing/monitoring ≥5 constructs); (ii) 'medium' (1-4 constructs); (iii) 'low' (0 constructs). Classifications are reported by demographic and fitness industry characteristics. The odds of being classified as a 'high assessor/monitor' according to social ecological correlates were examined using a multiple-factor logistic regression model. Mean age of respondents was 39.3 (±11.6) years and 71.6% were female. A total of 15.8% (95% CI: 13.7%-17.9%) were classified as a 'high' assessor/monitor. Constructs with the largest proportion of being 'always' assessed were body composition (47.7%; 95% CI: 45.0%-50.1%) and aerobic fitness (42.5%; 95% CI: 39.6%-45.3%). Those with the lowest proportion of being 'always' assessed were balance (24.0%; 95% CI: 24.7%-26.5%) and mental health (20.2%; 95% CI: 18.1%-29.6%). A perceived lack of client interest and fitness professionals not considering assessing their responsibility were associated with lower odds of being classified as a 'high assessor/monitor'. Most fitness professionals do not routinely assess/monitor client fitness and health. Key factors limiting client health assessment and monitoring include a perceived lack of client interest and professionals not considering this their role. Copyright © 2017. Published by Elsevier Ltd.
Saeedi, Pouya; Mohd Taib, Mohd Nasir; Hazizi, Abu Saad
2012-10-01
Nutritional supplement (NS) use has increased among the general population, athletes, and fitness club participants and has become a widespread and acceptable behavior. The objective of this study was to determine the differences in sociodemographic, health-related, and psychological factors between NS users and nonusers. A case-control study design was used, whereby participants included 147 NS users (cases) and 147 nonusers (controls) age 18 yr and above who exercised at least 3 d/wk in 24 fitness clubs in Tehran. A self-administered pretested and validated questionnaire was used to collect data. The results showed that on average, NS users were younger (29.8 ± 9.5 yr) than nonusers (35.5 ± 12.2 yr). Logistic-regression analysis showed that NS use was significantly associated with moderate or high physical activity level (PAL), smoking, gender, eating attitude, and age. In conclusion, NS users were more likely to be female, younger, and smokers; to have moderate or high PAL; and to be more prone to eating disorders than nonusers.
Amagasa, Takashi; Nakayama, Takeo
2012-07-01
To test the hypothesis that relationship reported between long working hours and depression was inconsistent in previous studies because job demand was treated as a confounder. Structural equation modeling was used to construct five models, using work-related factors and depressive mood scale obtained from 218 clerical workers, to test for goodness of fit and was externally validated with data obtained from 1160 sales workers. Multiple logistic regression analysis was also performed. The model that showed that long working hours increased depression risk when job demand was regarded as an intermediate variable was the best fitted model (goodness-of-fit index/root-mean-square error of approximation: 0.981 to 0.996/0.042 to 0.044). The odds ratio for depression risk with work that was high demand and 60 hours or more per week was estimated at 2 to 4 versus work that was low demand and less than 60 hours per week. Long working hours increased depression risk, with job demand being an intermediate variable.
Guede Rojas, Francisco; Chirosa Ríos, Luis Javier; Fuentealba Urra, Sergio; Vergara Ríos, César; Ulloa Díaz, David; Campos Jara, Christian; Barbosa González, Paola; Cuevas Aburto, Jesualdo
2017-01-01
There is no conclusive evidence about the association between physical fitness (PF) and health related quality of life (HRQOL) in older adults. The aim was to assess the association between PF and HRQOL in non-disabled community-dwelling Chilean older adults. One hundred and sixteen subjects participated in the study. PF was assessed using the Senior Fitness Test (SFT) and hand grip strength (HGS). HRQOL was assessed using eight dimensions provided by the SF-12v2 questionnaire. Binary multivariate logistic regression models were fitted, considering the potential influence of confounding variables. Non-adjusted models indicated that subjects with better performance in the arm curl test (ACT) were more likely to score higher on the vitality dimension (OR > 1) and those with higher HGS were more likely to score higher on physical functioning, bodily pain, vitality and mental health (OR > 1). The adjusted models consistently showed that ACT and HGS predicted a favorable perception of the vitality and mental health dimensions, respectively (OR > 1). HGS and ACT have a predictive value for certain dimensions of HRQOL.
Choi, Seung Hoan; Labadorf, Adam T; Myers, Richard H; Lunetta, Kathryn L; Dupuis, Josée; DeStefano, Anita L
2017-02-06
Next generation sequencing provides a count of RNA molecules in the form of short reads, yielding discrete, often highly non-normally distributed gene expression measurements. Although Negative Binomial (NB) regression has been generally accepted in the analysis of RNA sequencing (RNA-Seq) data, its appropriateness has not been exhaustively evaluated. We explore logistic regression as an alternative method for RNA-Seq studies designed to compare cases and controls, where disease status is modeled as a function of RNA-Seq reads using simulated and Huntington disease data. We evaluate the effect of adjusting for covariates that have an unknown relationship with gene expression. Finally, we incorporate the data adaptive method in order to compare false positive rates. When the sample size is small or the expression levels of a gene are highly dispersed, the NB regression shows inflated Type-I error rates but the Classical logistic and Bayes logistic (BL) regressions are conservative. Firth's logistic (FL) regression performs well or is slightly conservative. Large sample size and low dispersion generally make Type-I error rates of all methods close to nominal alpha levels of 0.05 and 0.01. However, Type-I error rates are controlled after applying the data adaptive method. The NB, BL, and FL regressions gain increased power with large sample size, large log2 fold-change, and low dispersion. The FL regression has comparable power to NB regression. We conclude that implementing the data adaptive method appropriately controls Type-I error rates in RNA-Seq analysis. Firth's logistic regression provides a concise statistical inference process and reduces spurious associations from inaccurately estimated dispersion parameters in the negative binomial framework.
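As background for the Firth's logistic (FL) regression named in the abstract above, the usual textbook form of Firth's penalized log-likelihood is sketched below. The abstract does not state this formula; the symbols (coefficient vector \beta, design matrix X, fitted probabilities p_i, Fisher information I(\beta)) are notation assumed here for illustration.

```latex
\ell^{*}(\beta) = \ell(\beta) + \tfrac{1}{2}\,\log\det I(\beta),
\qquad
\ell(\beta) = \sum_{i=1}^{n}\Bigl[\,y_i \log p_i + (1 - y_i)\log(1 - p_i)\Bigr],
\qquad
I(\beta) = X^{\top} W X,\quad W = \operatorname{diag}\bigl(p_i\,(1 - p_i)\bigr)
```

The penalty term shrinks estimates toward zero, which is the general reason this approach tends to behave well in small samples.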
Differentially private distributed logistic regression using private and public data
2014-01-01
Background Privacy protection is an important issue in medical informatics and differential privacy is a state-of-the-art framework for data privacy research. Differential privacy offers provable privacy against attackers who have auxiliary information, and can be applied to data mining models (for example, logistic regression). However, differentially private methods sometimes introduce too much noise and make outputs less useful. Given available public data in medical research (e.g. from patients who sign open-consent agreements), we can design algorithms that use both public and private data sets to decrease the amount of noise that is introduced. Methodology In this paper, we modify the update step in the Newton-Raphson method to propose a differentially private distributed logistic regression model based on both public and private data. Experiments and results We try our algorithm on three different data sets, and show its advantage over: (1) a logistic regression model based solely on public data, and (2) a differentially private distributed logistic regression model based on private data under various scenarios. Conclusion Logistic regression models built with our new algorithm based on both private and public datasets demonstrate better utility than models trained on private or public datasets alone, without sacrificing the rigorous privacy guarantee. PMID:25079786
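For orientation, the standard (non-private) Newton-Raphson update for logistic regression can be written as follows. This is the textbook form only, not the authors' differentially private variant, whose modification is not described in the abstract; the symbols X, y, p and W are notation assumed here.

```latex
\beta^{(t+1)} = \beta^{(t)} + \left(X^{\top} W^{(t)} X\right)^{-1} X^{\top}\left(y - p^{(t)}\right),
\qquad
p_i^{(t)} = \frac{1}{1 + e^{-x_i^{\top}\beta^{(t)}}},
\qquad
W^{(t)} = \operatorname{diag}\!\left(p_i^{(t)}\bigl(1 - p_i^{(t)}\bigr)\right)
```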
Park, Ji Hyun; Kim, Hyeon-Young; Lee, Hanna; Yun, Eun Kyoung
2015-12-01
This study compares the performance of the logistic regression and decision tree analysis methods for assessing the risk factors for infection in cancer patients undergoing chemotherapy. The subjects were 732 cancer patients who were receiving chemotherapy at K university hospital in Seoul, Korea. The data were collected between March 2011 and February 2013 and were processed for descriptive analysis, logistic regression and decision tree analysis using the IBM SPSS Statistics 19 and Modeler 15.1 programs. The most common risk factors for infection in cancer patients receiving chemotherapy were identified as alkylating agents, vinca alkaloid and underlying diabetes mellitus. The logistic regression explained 66.7% of the variation in the data in terms of sensitivity and 88.9% in terms of specificity. The decision tree analysis accounted for 55.0% of the variation in the data in terms of sensitivity and 89.0% in terms of specificity. As for the overall classification accuracy, the logistic regression explained 88.0% and the decision tree analysis explained 87.2%. The logistic regression analysis showed a higher degree of sensitivity and classification accuracy. Therefore, logistic regression analysis is concluded to be the more effective and useful method for establishing an infection prediction model for patients undergoing chemotherapy.
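The sensitivity, specificity and overall accuracy compared in the abstract above are all derived from a 2x2 classification table. A minimal sketch of that arithmetic follows; the function name and the counts are hypothetical and are not taken from the study.

```python
def classification_metrics(tp, fn, tn, fp):
    """Return (sensitivity, specificity, accuracy) as fractions.

    tp/fn: infected cases predicted correctly / missed.
    tn/fp: non-infected cases predicted correctly / flagged in error.
    """
    sensitivity = tp / (tp + fn)              # share of true cases detected
    specificity = tn / (tn + fp)              # share of non-cases cleared
    accuracy = (tp + tn) / (tp + fn + tn + fp)
    return sensitivity, specificity, accuracy


# Hypothetical counts chosen for illustration only (not the study's data).
sens, spec, acc = classification_metrics(tp=20, fn=10, tn=160, fp=20)
print(f"sensitivity={sens:.3f} specificity={spec:.3f} accuracy={acc:.3f}")
```

Because accuracy weights both classes by their frequency, a model can have high accuracy with modest sensitivity when the outcome is rare, which is why the abstract reports all three figures.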
Yang, Lixue; Chen, Kean
2015-11-01
To improve the design of underwater target recognition systems based on auditory perception, this study compared human listeners with automatic classifiers. Performance measures and strategies in three discrimination experiments, including discriminations between man-made and natural targets, between ships and submarines, and among three types of ships, were used. In the experiments, the subjects were asked to assign a score to each sound based on how confident they were about the category to which it belonged, and logistic regression, which represents linear discriminative models, also completed three similar tasks by utilizing many auditory features. The results indicated that the performance of logistic regression improved as the ratio between inter- and intra-class differences became larger, whereas the performance of the human subjects was limited by their unfamiliarity with the targets. Logistic regression performed better than the human subjects in all tasks but the discrimination between man-made and natural targets, and the strategies employed by excellent human subjects were similar to those of logistic regression. Logistic regression and several human subjects demonstrated similar performances when discriminating man-made and natural targets, but in this case, their strategies were not similar. An appropriate fusion of their strategies led to further improvement in recognition accuracy.
NASA Astrophysics Data System (ADS)
Mei, Zhixiong; Wu, Hao; Li, Shiyun
2018-06-01
The Conversion of Land Use and its Effects at Small regional extent (CLUE-S), which is a widely used model for land-use simulation, utilizes logistic regression to estimate the relationships between land use and its drivers, and thus, predict land-use change probabilities. However, logistic regression disregards possible spatial autocorrelation and self-organization in land-use data. Autologistic regression can depict spatial autocorrelation but cannot address self-organization, while logistic regression that considers only self-organization (NE-logistic regression) fails to capture spatial autocorrelation. Therefore, this study developed a regression method (NE-autologistic regression), which incorporated both spatial autocorrelation and self-organization, to improve CLUE-S. The Zengcheng District of Guangzhou, China was selected as the study area. The land-use data of 2001, 2005, and 2009, as well as 10 typical driving factors, were used to validate the proposed regression method and the improved CLUE-S model. Then, three future land-use scenarios for 2020 (the natural growth scenario, ecological protection scenario, and economic development scenario) were simulated using the improved model. Validation results showed that NE-autologistic regression performed better than logistic regression, autologistic regression, and NE-logistic regression in predicting land-use change probabilities. The spatial allocation accuracy and kappa values of NE-autologistic-CLUE-S were higher than those of logistic-CLUE-S, autologistic-CLUE-S, and NE-logistic-CLUE-S for the simulations of two periods, 2001-2009 and 2005-2009, which proved that the improved CLUE-S model achieved the best simulation and was thereby effective to a certain extent. The scenario simulation results indicated that under all three scenarios, traffic land and residential/industrial land would increase, whereas arable land and unused land would decrease during 2009-2020. Apparent differences also existed in the simulated change sizes and locations of each land-use type under different scenarios. The results not only demonstrate the validity of the improved model but also provide a valuable reference for relevant policy-makers.
Prediction of performance on the RCMP physical ability requirement evaluation.
Stanish, H I; Wood, T M; Campagna, P
1999-08-01
The Royal Canadian Mounted Police use the Physical Ability Requirement Evaluation (PARE) for screening applicants. The purposes of this investigation were to identify those field tests of physical fitness that were associated with PARE performance and determine which most accurately classified successful and unsuccessful PARE performers. The participants were 27 female and 21 male volunteers. Testing included measures of aerobic power, anaerobic power, agility, muscular strength, muscular endurance, and body composition. Multiple regression analysis revealed a three-variable model for males (70-lb bench press, standing long jump, and agility) explaining 79% of the variability in PARE time, whereas a one-variable model (agility) explained 43% of the variability for females. Analysis of the classification accuracy of the males' data was prohibited because 91% of the males passed the PARE. Classification accuracy of the females' data, using logistic regression, produced a two-variable model (agility, 1.5-mile endurance run) with 93% overall classification accuracy.
Unitary Response Regression Models
ERIC Educational Resources Information Center
Lipovetsky, S.
2007-01-01
The dependent variable in a regular linear regression is a numerical variable, and in a logistic regression it is a binary or categorical variable. In these models the dependent variable has varying values. However, there are problems yielding an identity output of a constant value which can also be modelled in a linear or logistic regression with…
Binary logistic regression-Instrument for assessing museum indoor air impact on exhibits.
Bucur, Elena; Danet, Andrei Florin; Lehr, Carol Blaziu; Lehr, Elena; Nita-Lazar, Mihai
2017-04-01
This paper presents a new way to assess the environmental impact on historical artifacts using binary logistic regression. The prediction of the impact on the exhibits during certain pollution scenarios (environmental impact) was calculated by a mathematical model based on binary logistic regression; it allows the identification, from a multitude of possible environmental parameters, of those with a significant impact on exhibits, and ranks them according to the severity of their effect. Air quality (NO2, SO2, O3 and PM2.5) and microclimate parameters (temperature, humidity) monitoring data from a case study conducted within exhibition and storage spaces of the Romanian National Aviation Museum Bucharest have been used for developing and validating the binary logistic regression method and the mathematical model. The logistic regression analysis was performed on 794 data combinations (715 to develop the model and 79 to validate it) using the Statistical Package for the Social Sciences (SPSS 20.0). The results from the binary logistic regression analysis demonstrated that, of the six parameters taken into consideration, four present a significant effect upon exhibits in the following order: O3 > PM2.5 > NO2 > humidity, followed at a significant distance by the effects of SO2 and temperature. The mathematical model developed in this study correctly predicted 95.1% of the cumulated effect of the environmental parameters upon the exhibits. Moreover, this model could also be used in the decisional process regarding the preventive preservation measures that should be implemented within the exhibition space. The mathematical model developed on the environmental parameters analyzed by the binary logistic regression method could be useful in a decision-making process establishing the best measures for pollution reduction and preventive preservation of exhibits.
Are Hemorrhoids Associated with False-Positive Fecal Immunochemical Test Results?
Kim, Nam Hee; Park, Jung Ho; Park, Dong Il; Sohn, Chong Il; Choi, Kyuyong; Jung, Yoon Suk
2017-01-01
False-positive (FP) results of fecal immunochemical tests (FITs) conducted in colorectal cancer (CRC) screening could lead to performing unnecessary colonoscopies. Hemorrhoids are a possible cause of FP FIT results; however, studies on this topic are extremely rare. We investigated whether hemorrhoids are associated with FP FIT results. A retrospective study was conducted at a university hospital in Korea from June 2013 to May 2015. Of the 34547 individuals who underwent FITs, 3946 aged ≥50 years who underwent colonoscopies were analyzed. Logistic regression analysis was performed to determine factors associated with FP FIT results. Among 3946 participants, 704 (17.8%) showed positive FIT results and 1303 (33.0%) had hemorrhoids. Of the 704 participants with positive FIT results, 165 had advanced colorectal neoplasia (ACRN) and 539 had no ACRN (FP results). Of the 1303 participants with hemorrhoids, 291 showed FP results, of whom 81 showed FP results because of hemorrhoids only. Participants with hemorrhoids had a higher rate of FP results than those without hemorrhoids (291/1176, 24.7% vs. 248/2361, 10.5%; p<0.001). Additionally, the participants with hemorrhoids as the only abnormality had a higher rate of FP results than those experiencing no such abnormalities (81/531, 15.3% vs. 38/1173, 3.2%; p<0.001). In multivariate analysis, the presence of hemorrhoids was identified as an independent predictor of FP results (adjusted odds ratio, 2.76; 95% confidence interval, 2.24-3.40; p<0.001). Hemorrhoids are significantly associated with FP FIT results. Their presence seemed to be a non-negligible contributor of FP results in FIT-based CRC screening programs.
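The false-positive rates quoted in the abstract above follow directly from the reported counts, and the same counts give a crude (unadjusted) odds ratio that can be set beside the adjusted value of 2.76. The sketch below only reproduces that arithmetic; the function names are illustrative, and the crude odds ratio is computed here rather than reported by the authors.

```python
def rate(events, total):
    """Proportion of participants with the event."""
    return events / total


def odds_ratio(events_a, total_a, events_b, total_b):
    """Crude odds ratio of group A versus group B from event counts."""
    odds_a = events_a / (total_a - events_a)
    odds_b = events_b / (total_b - events_b)
    return odds_a / odds_b


# Counts as reported in the abstract: FP results / participants in each group.
fp_rate_hem = rate(291, 1176)        # with hemorrhoids
fp_rate_no_hem = rate(248, 2361)     # without hemorrhoids
crude_or = odds_ratio(291, 1176, 248, 2361)

print(f"FP rate with hemorrhoids:    {fp_rate_hem:.1%}")
print(f"FP rate without hemorrhoids: {fp_rate_no_hem:.1%}")
print(f"Crude odds ratio:            {crude_or:.2f}")
```

The crude value is close to the adjusted odds ratio in the abstract, which would be expected if the covariates in the multivariate model changed the hemorrhoid estimate only slightly.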
Is the Nintendo Wii Fit really acceptable to older people?: a discrete choice experiment
2011-01-01
Background Interactive video games such as the Nintendo Wii Fit are increasingly used as a therapeutic tool in health and aged care settings however, their acceptability to older people is unclear. The aim of this study was to determine the acceptability of the Nintendo Wii Fit as a therapy tool for hospitalised older people using a discrete choice experiment (DCE) before and after exposure to the intervention. Methods A DCE was administered to 21 participants in an interview style format prior to, and following several sessions of using the Wii Fit in physiotherapy. The physiotherapist prescribed the Wii Fit activities, supervised and supported the patient during the therapy sessions. Attributes included in the DCE were: mode of therapy (traditional or using the Wii Fit), amount of therapy, cost of therapy program and percentage of recovery made. Data was analysed using conditional (fixed-effects) logistic regression. Results Prior to commencing the therapy program participants were most concerned about therapy time (avoiding programs that were too intensive), and the amount of recovery they would make. Following the therapy program, participants were more concerned with the mode of therapy and preferred traditional therapy programs over programs using the Wii Fit. Conclusions The usefulness of the Wii Fit as a therapy tool with hospitalised older people is limited not only by the small proportion of older people who are able to use it, but by older people's preferences for traditional approaches to therapy. Mainstream media portrayals of the popularity of the Wii Fit with older people may not reflect the true acceptability in the older hospitalised population. PMID:22011360
2011-01-01
Background In adults, there is a substantial body of evidence that physical inactivity or low cardiorespiratory fitness levels are strongly associated with the development of metabolic syndrome. Although this association has been studied extensively in adults, little is known regarding this association in adolescents. The aim of this study was to analyze the association between physical activity and cardiorespiratory fitness levels with metabolic syndrome in Brazilian adolescents. Methods A random sample of 223 girls (mean age, 14.4 ± 1.6 years) and 233 boys (mean age, 14.6 ± 1.6 years) was selected for the study. The level of physical activity was determined by the Bouchard three-day physical activity record. Cardiorespiratory fitness was estimated by the Leger 20-meter shuttle run test. The metabolic syndrome components assessed included waist circumference, blood pressure, HDL-cholesterol, triglycerides, and fasting plasma glucose levels. Independent Student t-tests were used to assess gender differences. The associations between physical activity and cardiorespiratory fitness with the presence of metabolic syndrome were calculated using logistic regression models adjusted for age and gender. Results A high prevalence of metabolic syndrome was observed in inactive adolescents (males, 11.4%; females, 7.2%) and adolescents with low cardiorespiratory fitness levels (males, 13.9%; females, 8.6%). A significant relationship existed between metabolic syndrome and low cardiorespiratory fitness (OR, 3.0 [1.13-7.94]). Conclusion The prevalence of metabolic syndrome is high among adolescents who are inactive and those with low cardiorespiratory fitness. Prevention strategies for metabolic syndrome should concentrate on enhancing fitness levels early in life. PMID:21878095
Fitness, fatness, and academic performance in seventh-grade elementary school students
2014-01-01
Background In addition to the benefits on physical and mental health, cardiorespiratory fitness has shown to have positive effects on cognition. This study aimed to investigate the relationship between cardiorespiratory fitness and body weight status on academic performance among seventh-grade students. Methods Participants included 1531 grade 7 students (787 male, 744 female), ranging in age from 12 to 14 years (Mage = 12.3 ± 0.60), from 3 different cohorts. Academic performance was measured using the marks students had, at the end of their academic year, in mathematics, language (Portuguese), foreign language (English), and sciences. To assess cardiorespiratory fitness the Progressive Aerobic Cardiovascular Endurance Run, from Fitnessgram, was used as the test battery. The relationship between academic achievement and the independent and combined association of cardiorespiratory fitness/weight status was analysed, using multinomial logistic regression. Results Cardiorespiratory fitness and weight status were independently related with academic achievement. Fit students, compared with unfit students had significantly higher odds for having high academic achievement (OR = 2.29, 95% CI: 1.48-3.55, p < 0.001). Likewise, having a normal weight status was also related with high academic achievement (OR = 3.65, 95% CI: 1.82-7.34, p < 0.001). Conclusions Cardiorespiratory fitness and weight status were independently and combined related to academic achievement in seventh-grade students independent of the different cohorts, providing further support that aerobically fit and normal weight students are more likely to have better performance at school regardless of the year that they were born. PMID:25001376
Fitness, fatness, and academic performance in seventh-grade elementary school students.
Sardinha, Luís B; Marques, Adilson; Martins, Sandra; Palmeira, António; Minderico, Cláudia
2014-07-07
In addition to the benefits on physical and mental health, cardiorespiratory fitness has shown to have positive effects on cognition. This study aimed to investigate the relationship between cardiorespiratory fitness and body weight status on academic performance among seventh-grade students. Participants included 1531 grade 7 students (787 male, 744 female), ranging in age from 12 to 14 years (Mage = 12.3 ± 0.60), from 3 different cohorts. Academic performance was measured using the marks students had, at the end of their academic year, in mathematics, language (Portuguese), foreign language (English), and sciences. To assess cardiorespiratory fitness the Progressive Aerobic Cardiovascular Endurance Run, from Fitnessgram, was used as the test battery. The relationship between academic achievement and the independent and combined association of cardiorespiratory fitness/weight status was analysed, using multinomial logistic regression. Cardiorespiratory fitness and weight status were independently related with academic achievement. Fit students, compared with unfit students had significantly higher odds for having high academic achievement (OR = 2.29, 95% CI: 1.48-3.55, p < 0.001). Likewise, having a normal weight status was also related with high academic achievement (OR = 3.65, 95% CI: 1.82-7.34, p < 0.001). Cardiorespiratory fitness and weight status were independently and combined related to academic achievement in seventh-grade students independent of the different cohorts, providing further support that aerobically fit and normal weight students are more likely to have better performance at school regardless of the year that they were born.
Ogunleye, Ayodele A; Sandercock, Gavin R; Voss, Christine; Eisenmann, Joey C; Reed, Katharine
2013-11-01
Cardiorespiratory fitness is known to be cardioprotective and its association with the components of the metabolic syndrome in children is becoming clearer. The aim of the present study was to examine the extent to which cardiorespiratory fitness may offset the weight-related association with mean arterial pressure (MAP) in schoolchildren. Cross-sectional study. Schoolchildren from the East of England, U.K. A total of 5983 (48% females) schoolchildren, 10 to 16 years of age, had height, weight and blood pressure measured by standard procedures and cardiorespiratory fitness assessed by the 20 m shuttle-run test. Participants were classified as fit or unfit using internationally accepted fitness cut-off points; and as normal weight, overweight or obese based on BMI, again using international cut-off points. Age-adjusted ANCOVA was used to determine the main effects and interaction of fitness and BMI on MAP Z-score. Logistic regression models were used to estimate odds ratios of elevated MAP. Prevalence of elevated MAP in schoolchildren was 14.8% overall and 35.7% in those who were obese-unfit. Approximately 21% of participants were overweight and 5% obese, while 23% were classified as unfit. MAP generally increased across BMI categories and was higher in the aerobically unfit participants. Obese-fit males had lower MAP compared with obese-unfit males (P < 0.001); this trend was similar in females (P = 0.05). Increasing fitness level may have a positive impact on the weight-related elevations of MAP seen in obese and overweight schoolchildren.
Physician job satisfaction in Saudi Arabia: insights from a tertiary hospital survey.
Aldrees, Turki; Al-Eissa, Sami; Badri, Motasim; Aljuhayman, Ahmed; Zamakhshary, Mohammed
2015-01-01
Job satisfaction refers to the extent to which people like or dislike their job. Job satisfaction varies across professions. Few studies have explored this issue among physicians in Saudi Arabia. The objective of this study is to determine the level and factors associated with job satisfaction among Saudi and non-Saudi physicians. In this cross-sectional study conducted in a major tertiary hospital in Riyadh, a 5-point Likert scale structured questionnaire was used to collect data on a wide range of socio-demographic, practice environment characteristics and level and consequences of job satisfaction from practicing physicians (consultants or residents) across different medical specialties. Logistic regression models were fitted to determine factors associated with job satisfaction. Of 344 participants, 300 (87.2%) were Saudis, 252 (73%) males, 255 (74%) married, 188 (54.7%) consultants and age [median (IQR)] was 32 (27-42.7) years. Overall, 104 (30%) respondents were dissatisfied with their jobs. Intensive care physicians were the most dissatisfied physicians (50%). In a multiple logistic regression model, income satisfaction (odds ratio [OR]=0.448 95% CI 0.278-0.723, P < .001) was the only factor independently associated with dissatisfaction. Factors adversely associated with physicians job satisfaction identified in this study should be addressed in governmental strategic planning aimed at improving the healthcare system and patient care.
Explaining match outcome in elite Australian Rules football using team performance indicators.
Robertson, Sam; Back, Nicole; Bartlett, Jonathan D
2016-01-01
The relationships between team performance indicators and match outcome have been examined in many team sports, however are limited in Australian Rules football. Using data from the 2013 and 2014 Australian Football League (AFL) regular seasons, this study assessed the ability of commonly reported discrete team performance indicators presented in their relative form (standardised against their opposition for a given match) to explain match outcome (Win/Loss). Logistic regression and decision tree (chi-squared automatic interaction detection (CHAID)) analyses both revealed relative differences between opposing teams for "kicks" and "goal conversion" as the most influential in explaining match outcome, with two models achieving 88.3% and 89.8% classification accuracies, respectively. Models incorporating a smaller performance indicator set displayed a slightly reduced ability to explain match outcome (81.0% and 81.5% for logistic regression and CHAID, respectively). However, both were fit to 2014 data with reduced error in comparison to the full models. Despite performance similarities across the two analysis approaches, the CHAID model revealed multiple winning performance indicator profiles, thereby increasing its comparative feasibility for use in the field. Coaches and analysts may find these results useful in informing strategy and game plan development in Australian Rules football, with the development of team-specific models recommended in future.
NASA Astrophysics Data System (ADS)
Wang, Liang-Jie; Sawada, Kazuhide; Moriguchi, Shuji
2013-01-01
To mitigate the damage caused by landslide disasters, different mathematical models have been applied to predict landslide spatial distribution characteristics. Although some researchers have achieved excellent results around the world, few studies take the spatial resolution of the database into account. Four types of digital elevation model (DEM) ranging from 2 to 20 m derived from light detection and ranging technology to analyze landslide susceptibility in Mizunami City, Gifu Prefecture, Japan, are presented. Fifteen landslide-causative factors are considered using a logistic-regression approach to create models for landslide potential analysis. Pre-existing landslide bodies are used to evaluate the performance of the four models. The results revealed that the 20-m model had the highest classification accuracy (71.9%), whereas the 2-m model had the lowest value (68.7%). In the 2-m model, 89.4% of the landslide bodies fit in the medium to very high categories. For the 20-m model, only 83.3% of the landslide bodies were concentrated in the medium to very high classes. When the cell size decreases from 20 to 2 m, the area under the relative operative characteristic increases from 0.68 to 0.77. Therefore, higher-resolution DEMs would provide better results for landslide-susceptibility mapping.
The Integrative Weaning Index in Elderly ICU Subjects.
Azeredo, Leandro M; Nemer, Sérgio N; Barbas, Carmen Sv; Caldeira, Jefferson B; Noé, Rosângela; Guimarães, Bruno L; Caldas, Célia P
2017-03-01
With increasing life expectancy and ICU admission of elderly patients, mechanical ventilation and weaning trials have increased worldwide. We evaluated a cohort of 479 subjects in the ICU. Patients younger than 18 y, tracheostomized, or with neurologic diseases were excluded, resulting in 331 subjects. Subjects ≥70 y old were considered elderly, whereas those <70 y old were considered non-elderly. Besides the conventional weaning indexes, we evaluated the performance of the integrative weaning index (IWI). The probability of successful weaning was investigated using relative risk and logistic regression. The Hosmer-Lemeshow goodness-of-fit test was used to calibrate and the C statistic was calculated to evaluate the association between predicted probabilities and observed proportions in the logistic regression model. Prevalence of successful weaning in the sample was 83.7%. There was no difference between elderly and non-elderly subjects in mortality (P = .16), in days of mechanical ventilation (P = .22), or in days of weaning (P = .55). In elderly subjects, the IWI was the only respiratory variable associated with mechanical ventilation weaning (P < .001). The IWI was the independent variable found in weaning of elderly subjects that may contribute to the critical moment of this population in intensive care.
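The C statistic mentioned in the abstract above is commonly computed as the proportion of (success, failure) pairs in which the success received the higher predicted probability, with ties counted as one half. A minimal sketch under that definition follows; the function name and the toy data are assumptions for illustration and are unrelated to the study's cohort.

```python
from itertools import product


def c_statistic(y_true, y_prob):
    """Concordance (C) statistic for binary outcomes coded 0/1."""
    pos = [p for y, p in zip(y_true, y_prob) if y == 1]
    neg = [p for y, p in zip(y_true, y_prob) if y == 0]
    if not pos or not neg:
        raise ValueError("both outcome classes must be present")
    concordant = 0.0
    for p_pos, p_neg in product(pos, neg):
        if p_pos > p_neg:
            concordant += 1.0      # case ranked above non-case
        elif p_pos == p_neg:
            concordant += 0.5      # tie counts as half
    return concordant / (len(pos) * len(neg))


# Toy outcomes and predicted probabilities (hypothetical).
outcomes = [1, 1, 1, 0, 0]
predicted = [0.9, 0.8, 0.4, 0.5, 0.2]
c = c_statistic(outcomes, predicted)
print(f"C statistic = {c:.3f}")
```

A value of 0.5 corresponds to ranking no better than chance and 1.0 to perfect separation. The pairwise loop is quadratic in sample size, which is acceptable for a cohort of a few hundred subjects.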
Asher, Anthony L; Devin, Clinton J; McCutcheon, Brandon; Chotai, Silky; Archer, Kristin R; Nian, Hui; Harrell, Frank E; McGirt, Matthew; Mummaneni, Praveen V; Shaffrey, Christopher I; Foley, Kevin; Glassman, Steven D; Bydon, Mohamad
2017-12-01
OBJECTIVE In this analysis the authors compare the characteristics of smokers to nonsmokers using demographic, socioeconomic, and comorbidity variables. They also investigate which of these characteristics are most strongly associated with smoking status. Finally, the authors investigate whether the association between known patient risk factors and disability outcome is differentially modified by patient smoking status for those who have undergone surgery for lumbar degeneration. METHODS A total of 7547 patients undergoing degenerative lumbar surgery were entered into a prospective multicenter registry (Quality Outcomes Database [QOD]). A retrospective analysis of the prospectively collected data was conducted. Patients were dichotomized as smokers (current smokers) and nonsmokers. Multivariable logistic regression analysis fitted for patient smoking status and subsequent measurement of variable importance was performed to identify the strongest patient characteristics associated with smoking status. Multivariable linear regression models fitted for 12-month Oswestry Disability Index (ODI) scores in subsets of smokers and nonsmokers were used to investigate whether differential effects of risk factors by smoking status might be present. RESULTS In total, 18% (n = 1365) of patients were smokers and 82% (n = 6182) were nonsmokers. In a multivariable logistic regression analysis, the factors significantly associated with patients' smoking status were sex (p < 0.0001), age (p < 0.0001), body mass index (p < 0.0001), educational status (p < 0.0001), insurance status (p < 0.001), and employment/occupation (p = 0.0024). Patients with diabetes had lower odds of being a smoker (p = 0.0008), while patients with coronary artery disease had greater odds of being a smoker (p = 0.044).
Patients' propensity for smoking was also significantly associated with higher American Society of Anesthesiologists (ASA) class (p < 0.0001), anterior-alone surgical approach (p = 0.018), greater number of levels (p = 0.0246), decompression only (p = 0.0001), and higher baseline ODI score (p < 0.0001). In a multivariable proportional odds logistic regression model, the adjusted odds ratio of risk factors and direction of improvement in 12-month ODI scores remained similar between the subsets of smokers and nonsmokers. CONCLUSIONS Using a large, national, multiinstitutional registry, the authors described the profile of patients who undergo lumbar spine surgery and its association with their smoking status. Compared with nonsmokers, smokers were younger, male, nondiabetic, nonobese patients presenting with leg pain more so than back pain, with higher ASA classes, higher disability, less education, more likely to be unemployed, and with Medicaid/uninsured insurance status. Smoking status did not affect the association between these risk factors and 12-month ODI outcome, suggesting that interventions for modifiable risk factors are equally efficacious between smokers and nonsmokers.
Tagliaferri, Angela; Love, Thomas E.; Szczotka-Flynn, Loretta
2014-01-01
BACKGROUND Contact lens induced papillary conjunctivitis (CLPC) continues to be a major cause of dropout during contact lens extended wear. This retrospective study explores risk factors for the development of CLPC during silicone hydrogel lens extended wear. METHODS Data from 205 subjects enrolled in the Longitudinal Analysis of Silicone Hydrogel Contact Lens (LASH) study wearing lotrafilcon A silicone hydrogel lenses for up to 30 days of continuous wear were used to determine risk factors for CLPC in this secondary analysis of the main cohort. The main covariates of interest included substantial lens-associated bacterial bioburden, and topographically determined lens base curve-to-cornea fitting relationships. Additional covariates of interest included history of prior adverse events, time of year, race, education level, gender and other subject demographics. Statistical analyses included univariate logistic regression to assess the impact of potential risk factors on the binary CLPC outcome, and Cox proportional hazards regression to describe the impact of those factors on time-to-CLPC diagnosis. RESULTS Across 12 months of follow-up, 52 subjects (25%) experienced CLPC. No associations were found between CLPC development and the presence of bacterial bioburden, lens-to-cornea fitting relationships, history of prior adverse events, gender or race. CLPC development followed the same seasonal trends as the local peaks in environmental allergens. CONCLUSIONS Lens fit and biodeposits, in the form of lens associated bacterial bioburden, were not associated with the development of CLPC during extended wear with lotrafilcon A silicone hydrogel lenses. PMID:24681609
Andreasi, Viviane; Michelin, Edilaine; Rinaldi, Ana Elisa M; Burini, Roberto Carlos
2010-01-01
To analyze associations between health-related physical fitness and the anthropometric and demographic indicators of children at three elementary schools in Botucatu, SP, Brazil. The sample for this cross-sectional study was 988 elementary school students, recruited from the second to ninth grades (an age range of 7 to 15 years). The children underwent anthropometric assessment (weight, height, waist circumference and tricipital and subscapular skin folds) and were tested for health-related physical fitness (flexibility: sit and reach test; abdominal strength/stamina: 1-minute abdominal test; and aerobic stamina: 9-minute running/walking test). Data were analyzed using descriptive statistics plus Student's t test, the chi-square test or Fisher's exact test and logistic regression with a significance level of 5%. The physical fitness levels observed were significantly influenced by age (all levels), sex (abdominal strength/stamina), obesity (all levels), body adiposity (flexibility, abdominal strength/stamina) and abdominal adiposity (abdominal strength/stamina and aerobic stamina). Females were more prone to be unfit in abdominal strength/stamina. Both obesity and excessive abdominal adiposity predisposed children to be unfit in abdominal strength/stamina and aerobic stamina. Excess body adiposity increased the likelihood of poor trunk flexibility. Unhealthy physical fitness levels were related to female sex, obesity and excessive abdominal adiposity. Implementing programs designed to effect lifestyle changes to achieve physical fitness and healthy nutrition in these schools would meet the objectives of promoting healthy body weight and increased physical fitness among these schoolchildren.
Traveling by Private Motorized Vehicle and Physical Fitness in Taiwanese Adults.
Liao, Yung; Tsai, Hsiu-Hua; Wang, Ho-Seng; Lin, Ching-Ping; Wu, Min-Chen; Chen, Jui-Fu
2016-08-01
Although the time spent sitting in motorized vehicles has been determined to be adversely associated with cardiometabolic health, its association with other health indicators remains unclear. This study examined associations between traveling by private motorized vehicle and 4 indicators of physical fitness in adults. Data from 52,114 Taiwanese adults aged 20 to 65 years who participated in the 2013 National Adults Fitness Survey were used. The examined variables were height, body mass, and performance in modified sit-and-reach (flexibility), bent-leg sit-up (abdominal muscular strength and endurance), and a 3-min step test (cardiorespiratory endurance). Participants were asked on how many days they had used a private car or motorcycle for traveling from place to place and categorized as non-, occasional, and daily private motorized vehicle travelers. Logistic and linear regression models were used to examine associations between the categories of using private motorized vehicles to travel and physical fitness performance. After an adjustment for potential demographic and behavioral confounders, daily traveling by private motorized vehicle was associated with a higher probability of overweight (odds ratio = 1.18), lower performance of abdominal muscular strength and endurance (-0.37 times/min), and lower cardiorespiratory fitness (-0.60 physical fitness index) than was traveling that did not involve private motorized vehicles. The results suggest that in addition to unfavorable cardiorespiratory fitness and a risk of overweight, daily traveling by private motorized vehicle is associated with poor performance in abdominal muscular strength and endurance.
Advanced colorectal neoplasia risk stratification by penalized logistic regression.
Lin, Yunzhi; Yu, Menggang; Wang, Sijian; Chappell, Richard; Imperiale, Thomas F
2016-08-01
Colorectal cancer is the second leading cause of death from cancer in the United States. To facilitate the efficiency of colorectal cancer screening, there is a need to stratify risk for colorectal cancer among the 90% of US residents who are considered "average risk." In this article, we investigate such risk stratification rules for advanced colorectal neoplasia (colorectal cancer and advanced, precancerous polyps). We use a recently completed large cohort study of subjects who underwent a first screening colonoscopy. Logistic regression models have been used in the literature to estimate the risk of advanced colorectal neoplasia based on quantifiable risk factors. However, logistic regression may be prone to overfitting and instability in variable selection. Since most of the risk factors in our study have several categories, it was tempting to collapse these categories into fewer risk groups. We propose a penalized logistic regression method that automatically and simultaneously selects variables, groups categories, and estimates their coefficients by penalizing the [Formula: see text]-norm of both the coefficients and their differences. Hence, it encourages sparsity in the categories, i.e. grouping of the categories, and sparsity in the variables, i.e. variable selection. We apply the penalized logistic regression method to our data. The important variables are selected, with close categories simultaneously grouped, by penalized regression models with and without the interaction terms. The models are validated with 10-fold cross-validation. The receiver operating characteristic curves of the penalized regression models dominate the receiver operating characteristic curve of naive logistic regressions, indicating a superior discriminative performance. © The Author(s) 2013.
A multicenter mortality prediction model for patients receiving prolonged mechanical ventilation
Carson, Shannon S.; Kahn, Jeremy M.; Hough, Catherine L.; Seeley, Eric J.; White, Douglas B.; Douglas, Ivor S.; Cox, Christopher E.; Caldwell, Ellen; Bangdiwala, Shrikant I.; Garrett, Joanne M.; Rubenfeld, Gordon D.
2012-01-01
Objective Significant deficiencies exist in the communication of prognosis for patients requiring prolonged mechanical ventilation after acute illness, in part because of clinician uncertainty about long-term outcomes. We sought to refine a mortality prediction model for patients requiring prolonged ventilation using a multicentered study design. Design Cohort study. Setting Five geographically diverse tertiary care medical centers in the United States (California, Colorado, North Carolina, Pennsylvania, Washington). Patients Two hundred sixty adult patients who received at least 21 days of mechanical ventilation after acute illness. Interventions None. Measurements and Main Results For the probability model, we included age, platelet count, and requirement for vasopressors and/or hemodialysis, each measured on day 21 of mechanical ventilation, in a logistic regression model with 1-yr mortality as the outcome variable. We subsequently modified a simplified prognostic scoring rule (ProVent score) by categorizing the risk variables (age 18–49, 50–64, and >65 yrs; platelet count 0–150 and >150; vasopressors; hemodialysis) in another logistic regression model and assigning points to variables according to β coefficient values. Overall mortality at 1 yr was 48%. The area under the curve of the receiver operator characteristic curve for the primary ProVent probability model was 0.79 (95% confidence interval, 0.75–0.81), and the p value for the Hosmer-Lemeshow goodness-of-fit statistic was .89. The area under the curve for the categorical model was 0.77, and the p value for the goodness-of-fit statistic was .34. The area under the curve for the ProVent score was 0.76, and the p value for the Hosmer-Lemeshow goodness-of-fit statistic was .60. For the 50 patients with a ProVent score >2, only one patient was able to be discharged directly home, and 1-yr mortality was 86%. 
Conclusion The ProVent probability model is a simple and reproducible model that can accurately identify patients requiring prolonged mechanical ventilation who are at high risk of 1-yr mortality. PMID:22080643
Rupert, Michael G.; Cannon, Susan H.; Gartner, Joseph E.
2003-01-01
Logistic regression was used to predict the probability of debris flows occurring in areas recently burned by wildland fires. Multiple logistic regression is conceptually similar to multiple linear regression because statistical relations between one dependent variable and several independent variables are evaluated. In logistic regression, however, the dependent variable is transformed to a binary variable (debris flow did or did not occur), and the actual probability of the debris flow occurring is statistically modeled. Data from 399 basins located within 15 wildland fires that burned during 2000-2002 in Colorado, Idaho, Montana, and New Mexico were evaluated. More than 35 independent variables describing the burn severity, geology, land surface gradient, rainfall, and soil properties were evaluated. The models were developed as follows: (1) Basins that did and did not produce debris flows were delineated from National Elevation Data using a Geographic Information System (GIS). (2) Data describing the burn severity, geology, land surface gradient, rainfall, and soil properties were determined for each basin. These data were then downloaded to a statistics software package for analysis using logistic regression. (3) Relations between the occurrence/non-occurrence of debris flows and burn severity, geology, land surface gradient, rainfall, and soil properties were evaluated and several preliminary multivariate logistic regression models were constructed. All possible combinations of independent variables were evaluated to determine which combination produced the most effective model. The multivariate model that best predicted the occurrence of debris flows was selected. (4) The multivariate logistic regression model was entered into a GIS, and a map showing the probability of debris flows was constructed. 
The most effective model incorporates the percentage of each basin with slope greater than 30 percent, percentage of land burned at medium and high burn severity in each basin, particle size sorting, average storm intensity (millimeters per hour), soil organic matter content, soil permeability, and soil drainage. The results of this study demonstrate that logistic regression is a valuable tool for predicting the probability of debris flows occurring in recently-burned landscapes.
Equal Area Logistic Estimation for Item Response Theory
NASA Astrophysics Data System (ADS)
Lo, Shih-Ching; Wang, Kuo-Chang; Chang, Hsin-Li
2009-08-01
Item response theory (IRT) models use logistic functions exclusively as item response functions (IRFs). Applications of IRT models require obtaining the set of values for logistic function parameters that best fit an empirical data set. However, success in obtaining such a set of values does not guarantee that the constructs they represent actually exist, for the adequacy of a model is not sustained by the possibility of estimating parameters. In this study, an equal area based two-parameter logistic model estimation algorithm is proposed. Two theorems are given to prove that the results of the algorithm are equivalent to the results of fitting data with a logistic model. Numerical results are presented to show the stability and accuracy of the algorithm.
Ebrahimzadeh, Farzad; Hajizadeh, Ebrahim; Vahabi, Nasim; Almasian, Mohammad; Bakhteyar, Katayoon
2015-01-01
Background: Unwanted pregnancy not intended by at least one of the parents has undesirable consequences for the family and the society. In the present study, three classification models were used and compared to predict unwanted pregnancies in an urban population. Methods: In this cross-sectional study, 887 pregnant mothers referring to health centers in Khorramabad, Iran, in 2012 were selected by the stratified and cluster sampling; relevant variables were measured and for prediction of unwanted pregnancy, logistic regression, discriminant analysis, and probit regression models and SPSS software version 21 were used. To compare these models, indicators such as sensitivity, specificity, the area under the ROC curve, and the percentage of correct predictions were used. Results: The prevalence of unwanted pregnancies was 25.3%. The logistic and probit regression models indicated that parity and pregnancy spacing, contraceptive methods, household income and number of living male children were related to unwanted pregnancy. The performance of the models based on the area under the ROC curve was 0.735, 0.733, and 0.680 for logistic regression, probit regression, and linear discriminant analysis, respectively. Conclusion: Given the relatively high prevalence of unwanted pregnancies in Khorramabad, it seems necessary to revise family planning programs. Despite the similar accuracy of the models, if the researcher is interested in the interpretability of the results, the use of the logistic regression model is recommended. PMID:26793655
Kempe, P T; van Oppen, P; de Haan, E; Twisk, J W R; Sluis, A; Smit, J H; van Dyck, R; van Balkom, A J L M
2007-09-01
Two methods for predicting remissions in obsessive-compulsive disorder (OCD) treatment are evaluated. Y-BOCS measurements of 88 patients with a primary OCD (DSM-III-R) diagnosis were performed over a 16-week treatment period, and during three follow-ups. Remission at any measurement was defined as a Y-BOCS score lower than thirteen combined with a reduction of seven points when compared with baseline. Logistic regression models were compared with a Cox regression for recurrent events model. Logistic regression yielded different models at different evaluation times. The recurrent events model remained stable when fewer measurements were used. Higher baseline levels of neuroticism and more severe OCD symptoms were associated with a lower chance of remission, early age of onset and more depressive symptoms with a higher chance. Choice of outcome time affects logistic regression prediction models. Recurrent events analysis uses all information on remissions and relapses. Short- and long-term predictors for OCD remission show overlap.
Estimating the exceedance probability of rain rate by logistic regression
NASA Technical Reports Server (NTRS)
Chiu, Long S.; Kedem, Benjamin
1990-01-01
Recent studies have shown that the fraction of an area with rain intensity above a fixed threshold is highly correlated with the area-averaged rain rate. To estimate the fractional rainy area, a logistic regression model, which estimates the conditional probability that rain rate over an area exceeds a fixed threshold given the values of related covariates, is developed. The problem of dependency in the data in the estimation procedure is bypassed by the method of partial likelihood. Analyses of simulated scanning multichannel microwave radiometer and observed electrically scanning microwave radiometer data during the Global Atlantic Tropical Experiment period show that the use of logistic regression in pixel classification is superior to multiple regression in predicting whether rain rate at each pixel exceeds a given threshold, even in the presence of noisy data. The potential of the logistic regression technique in satellite rain rate estimation is discussed.
NASA Astrophysics Data System (ADS)
Cary, Theodore W.; Cwanger, Alyssa; Venkatesh, Santosh S.; Conant, Emily F.; Sehgal, Chandra M.
2012-03-01
This study compares the performance of two proven but very different machine learners, Naïve Bayes and logistic regression, for differentiating malignant and benign breast masses using ultrasound imaging. Ultrasound images of 266 masses were analyzed quantitatively for shape, echogenicity, margin characteristics, and texture features. These features along with patient age, race, and mammographic BI-RADS category were used to train Naïve Bayes and logistic regression classifiers to diagnose lesions as malignant or benign. ROC analysis was performed using all of the features and using only a subset that maximized information gain. Performance was determined by the area under the ROC curve, Az, obtained from leave-one-out cross validation. Naïve Bayes showed significant variation (Az 0.733 +/- 0.035 to 0.840 +/- 0.029, P < 0.002) with the choice of features, but the performance of logistic regression was relatively unchanged under feature selection (Az 0.839 +/- 0.029 to 0.859 +/- 0.028, P = 0.605). Out of 34 features, a subset of 6 gave the highest information gain: brightness difference, margin sharpness, depth-to-width, mammographic BI-RADs, age, and race. The probabilities of malignancy determined by Naïve Bayes and logistic regression after feature selection showed significant correlation (R2= 0.87, P < 0.0001). The diagnostic performance of Naïve Bayes and logistic regression can be comparable, but logistic regression is more robust. Since probability of malignancy cannot be measured directly, high correlation between the probabilities derived from two basic but dissimilar models increases confidence in the predictive power of machine learning models for characterizing solid breast masses on ultrasound.
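The abstract above summarizes performance by the area under the ROC curve. As a minimal sketch of what that quantity measures, the empirical AUC can be computed as the fraction of (malignant, benign) pairs in which the malignant case receives the higher score, with ties counted as one half. The function name and the toy scores are hypothetical; the study's own Az values came from leave-one-out cross validation and may have been estimated differently.

```python
def empirical_auc(scores, labels):
    """Empirical AUC: share of (positive, negative) pairs ranked correctly.

    labels are 1 for the positive class and 0 for the negative class.
    Tied scores contribute one half, as in the Mann-Whitney U statistic.
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical predicted probabilities for four masses (two malignant, two benign).
print(empirical_auc([0.9, 0.3, 0.6, 0.1], [1, 1, 0, 0]))
```

The pairwise loop is quadratic in the number of cases, which is acceptable for a data set of a few hundred masses; a rank-based version would be preferable for large samples.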
Wang, Qingliang; Li, Xiaojie; Hu, Kunpeng; Zhao, Kun; Yang, Peisheng; Liu, Bo
2015-05-12
To explore the risk factors of portal hypertensive gastropathy (PHG) in patients with hepatitis B associated cirrhosis and establish a logistic regression model for noninvasive prediction. The clinical data of 234 hospitalized patients with hepatitis B associated cirrhosis from March 2012 to March 2014 were analyzed retrospectively. The dependent variable was the occurrence of PHG while the independent variables were screened by binary logistic analysis. Multivariate logistic regression was used for further analysis of significant noninvasive independent variables. A logistic regression model was established and the odds ratio was calculated for each factor. The accuracy, sensitivity and specificity of the model were evaluated using the receiver operating characteristic (ROC) curve. According to univariate logistic regression, the risk factors included hepatic dysfunction, albumin (ALB), bilirubin (TB), prothrombin time (PT), platelet (PLT), white blood cell (WBC), portal vein diameter, spleen index, splenic vein diameter, diameter ratio, PLT to spleen volume ratio, esophageal varices (EV) and gastric varices (GV). Multivariate analysis showed that hepatic dysfunction (X1), TB (X2), PLT (X3) and splenic vein diameter (X4) were the major factors associated with the occurrence of PHG. The established regression model was Logit P=-2.667+2.186X1-2.167X2+0.725X3+0.976X4. The accuracy of the model for PHG was 79.1% with a sensitivity of 77.2% and a specificity of 80.8%. Hepatic dysfunction, TB, PLT and splenic vein diameter are risk factors for PHG and the noninvasive predictive logistic regression model was Logit P=-2.667+2.186X1-2.167X2+0.725X3+0.976X4.
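The reported equation gives the log odds of PHG; a predicted probability follows from the inverse logit, P = 1 / (1 + exp(-Logit P)). The sketch below uses the coefficients exactly as printed in the abstract. The abstract does not say how X1 to X4 are coded (for example as 0/1 indicators or as measured values), so the function name and the inputs shown are hypothetical and serve only to illustrate the arithmetic.

```python
import math

# Coefficients as printed in the abstract:
# Logit P = -2.667 + 2.186*X1 - 2.167*X2 + 0.725*X3 + 0.976*X4
INTERCEPT = -2.667
COEFS = (2.186, -2.167, 0.725, 0.976)


def phg_probability(x1, x2, x3, x4):
    """Convert the reported logit into a probability via the inverse logit.

    The coding of x1..x4 is an assumption; the abstract does not specify it.
    """
    logit = INTERCEPT + sum(b * x for b, x in zip(COEFS, (x1, x2, x3, x4)))
    return 1.0 / (1.0 + math.exp(-logit))


# Hypothetical inputs: all predictors at 0, then all at 1.
print(round(phg_probability(0, 0, 0, 0), 4))
print(round(phg_probability(1, 1, 1, 1), 4))
```

With every predictor at zero the logit equals the intercept, so the first call shows the baseline probability implied by the equation.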
Variable Selection in Logistic Regression.
1987-06-01
Z. D. Bai, P. R. Krishnaiah and L. C. Zhao. Center for Multivariate Analysis, University of Pittsburgh. Contract F49620-85-C-0008.
NASA Astrophysics Data System (ADS)
Madhu, B.; Ashok, N. C.; Balasubramanian, S.
2014-11-01
Multinomial logistic regression analysis was used to develop a statistical model that can predict the probability of breast cancer in Southern Karnataka using breast cancer occurrence data from 2007-2011. Independent socio-economic variables describing breast cancer occurrence, such as age, education, occupation, parity, type of family, health insurance coverage, residential locality and socioeconomic status, were obtained for each case. The models were developed as follows: i) Spatial visualization of the urban-rural distribution of breast cancer cases obtained from the Bharat Hospital and Institute of Oncology. ii) Socio-economic risk factors describing the breast cancer occurrences were compiled for each case. These data were then analysed using multinomial logistic regression analysis in SPSS statistical software; relations between the occurrence of breast cancer across socio-economic status and the influence of other socio-economic variables were evaluated, and multinomial logistic regression models were constructed. iii) The model that best predicted the occurrence of breast cancer was identified. This multivariate logistic regression model was entered into a geographic information system, and maps showing the predicted probability of breast cancer occurrence in Southern Karnataka were created. This study demonstrates that multinomial logistic regression is a valuable tool for developing models that predict the probability of breast cancer occurrence in Southern Karnataka.
NASA Astrophysics Data System (ADS)
Kamaruddin, Ainur Amira; Ali, Zalila; Noor, Norlida Mohd.; Baharum, Adam; Ahmad, Wan Muhamad Amir W.
2014-07-01
Logistic regression analysis examines the influence of various factors on a dichotomous outcome by estimating the probability of the event's occurrence. Logistic regression, also called a logit model, is a statistical procedure used to model dichotomous outcomes. In the logit model the log odds of the dichotomous outcome is modeled as a linear combination of the predictor variables. The log odds ratio in logistic regression provides a description of the probabilistic relationship of the variables and the outcome. In conducting logistic regression, selection procedures are used to select important predictor variables; diagnostics are used to check that the assumptions are valid, which include independence of errors, linearity in the logit for continuous variables, absence of multicollinearity, and lack of strongly influential outliers; and a test statistic is calculated to determine the aptness of the model. This study used the binary logistic regression model to investigate overweight and obesity among rural secondary school students on the basis of their demographic profile, medical history, diet and lifestyle. The results indicate that overweight and obesity of students are influenced by obesity in family and the interaction between a student's ethnicity and routine meals intake. The odds of a student being overweight and obese are higher for a student having a family history of obesity and for a non-Malay student who frequently takes routine meals as compared to a Malay student.
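The relationship described above, log odds as a linear combination of predictors, can be illustrated with a short sketch. It shows that raising one predictor by one unit multiplies the odds by exp(beta), which is why exponentiated coefficients are read as odds ratios. The intercept, coefficient, and function names are hypothetical and are not estimates from the study.

```python
import math


def log_odds(intercept, coefs, x):
    """Linear predictor of a logit model: intercept + sum(beta_j * x_j)."""
    return intercept + sum(b * v for b, v in zip(coefs, x))


def probability(lo):
    """Inverse logit: map log odds to a probability in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-lo))


def odds(p):
    """Odds corresponding to a probability p."""
    return p / (1.0 - p)


def odds_ratio(beta):
    """Odds ratio for a one-unit increase in the predictor with coefficient beta."""
    return math.exp(beta)


# Hypothetical model with one binary predictor (e.g. family history: 0 or 1).
intercept, coefs = -1.5, [0.7]
p_without = probability(log_odds(intercept, coefs, [0]))
p_with = probability(log_odds(intercept, coefs, [1]))

# The ratio of the two odds matches exp(beta) up to floating-point error.
print(odds(p_with) / odds(p_without), odds_ratio(0.7))
```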
Understanding logistic regression analysis.
Sperandei, Sandro
2014-01-01
Logistic regression is used to obtain odds ratio in the presence of more than one explanatory variable. The procedure is quite similar to multiple linear regression, with the exception that the response variable is binomial. The result is the impact of each variable on the odds ratio of the observed event of interest. The main advantage is to avoid confounding effects by analyzing the association of all variables together. In this article, we explain the logistic regression procedure using examples to make it as simple as possible. After definition of the technique, the basic interpretation of the results is highlighted and then some special issues are discussed.
Sollerhed, Ann-Christin; Andersson, Ingemar; Ejlertsson, Göran
2013-01-01
As an increase in pain symptoms among children has been shown in the last decades, the aim of this study was to describe perceptions of recurrent pain, measured physical fitness and levels of reported physical activity (PA) in children, and to investigate if any associations between PA, fitness and recurrent pain could be identified. A school-based study comprised 206 Swedish children 8-12 years old, 114 boys, 92 girls. A questionnaire with questions about perceived pain, self-reported PA and lifestyle factors was used. Health-related fitness was assessed by 11 physical tests. A physical index was calculated from these tests as a z score. High physical index indicated high fitness and low physical index indicated low fitness. ANOVA test, chi-square test and logistic regression analysis were used to compare active and inactive children. The prevalence of one pain location (head, abdomen or back) was 26%, two 11% and three 4% (n=206). Female gender, living in single-parent families, low PA and low subjective health were associated with reported recurrent pain. Children reporting high levels of PA had high physical index and reported low prevalence of pain symptoms. The physical index and level of self-reported PA decreased gradually as the number of pain locations increased. Physically active children had higher fitness levels and reported fewer pain symptoms than inactive peers. Coping with pain is an integral part of PA, and active children learn to cope with unpleasant body sensations which together with high fitness may reduce the perception of pain.
[Sarcopenic obesity and physical fitness in octogenarians: the multi-center EXERNET Project].
Muñoz-Arribas, Alberto; Mata, Esmeralda; Pedrero-Chamizo, Raquel; Espino, Luis; Gusi, Narcis; Villa, Gerardo; Gonzalez-Gross, Marcela; Casajús, José Antonio; Ara, Ignacio; Gómez-Cabello, Alba
2013-11-01
The aim of this study was to analyze the usefulness of different fitness tests to detect the risk of sarcopenic obesity (SO) in octogenarian people. 306 subjects (76 men, 230 women) with a mean age of 82.5 ± 2.3 years from the Multi-center EXERNET Project sample fulfilled the inclusion criteria. Body composition was assessed in all subjects by bioelectrical impedance. Four groups were created based on the percentage of fat mass and muscle mass: 1) normal, 2) high fat mass, 3) low muscle mass and 4) SO. Physical fitness was assessed using 8 different tests modified from the batteries “Senior Fitness Test” and Eurofit (EXERNET battery). The risk of suffering SO depending on the fitness level was studied by logistic regression. Among the studied physical fitness tests, those that better predicted the risk of SO were leg strength, arm strength, agility, walking speed and balance in men; 95% CI [(0.606-0.957) (0.496-0.882), (0.020-2.014), (0.17-1.39), (0.913-1.002), all p < 0.05, except balance test (p = 0.07)] and balance test and agility in women; (95% CI [(0.928-1.002) (0.983-1.408), (both p = 0.07)]. Adequate levels of physical fitness are associated with a lower risk of SO. Some easy fitness tests seem to be useful for the detection of SO in those cases where the body-composition methods required for diagnosis are not available. Copyright AULA MEDICA EDICIONES 2013. Published by AULA MEDICA. All rights reserved.
Zhang, Peng; Parenteau, Chantal; Wang, Lu; Holcombe, Sven; Kohoyda-Inglis, Carla; Sullivan, June; Wang, Stewart
2013-11-01
This study resulted in a model-averaging methodology that predicts crash injury risk using vehicle, demographic, and morphomic variables and assesses the importance of individual predictors. The effectiveness of this methodology was illustrated through analysis of occupant chest injuries in frontal vehicle crashes. The crash data were obtained from the International Center for Automotive Medicine (ICAM) database for calendar years 1996 to 2012. The morphomic data are quantitative measurements of variations in human body 3-dimensional anatomy. Morphomics are obtained from imaging records. In this study, morphomics were obtained from chest, abdomen, and spine CT using novel patented algorithms. A NASS-trained crash investigator with over thirty years of experience collected the in-depth crash data. There were 226 cases available with occupants involved in frontal crashes and morphomic measurements. Only cases with complete recorded data were retained for statistical analysis. Logistic regression models were fitted using all possible configurations of vehicle, demographic, and morphomic variables. Different models were ranked by the Akaike Information Criterion (AIC). An averaged logistic regression model approach was used due to the limited sample size relative to the number of variables. This approach is helpful when addressing variable selection, building prediction models, and assessing the importance of individual variables. The final predictive results were developed using this approach, based on the top 100 models in the AIC ranking. Model-averaging minimized model uncertainty, decreased the overall prediction variance, and provided an approach to evaluating the importance of individual variables. There were 17 variables investigated: four vehicle, four demographic, and nine morphomic. More than 130,000 logistic models were investigated in total. The models were characterized into four scenarios to assess individual variable contribution to injury risk. 
Scenario 1 used vehicle variables; Scenario 2, vehicle and demographic variables; Scenario 3, vehicle and morphomic variables; and Scenario 4 used all variables. AIC was used to rank the models and to address over-fitting. In each scenario, the results based on the top three models and the averages of the top 100 models were presented. The AIC and the area under the receiver operating characteristic curve (AUC) were reported in each model. The models were re-fitted after removing each variable one at a time. The increases of AIC and the decreases of AUC were then assessed to measure the contribution and importance of the individual variables in each model. The importance of the individual variables was also determined by their weighted frequencies of appearance in the top 100 selected models. Overall, the AUC was 0.58 in Scenario 1, 0.78 in Scenario 2, 0.76 in Scenario 3 and 0.82 in Scenario 4. The results showed that morphomic variables are as accurate at predicting injury risk as demographic variables. The results of this study emphasize the importance of including morphomic variables when assessing injury risk. The results also highlight the need for morphomic data in the development of human mathematical models when assessing restraint performance in frontal crashes, since morphomic variables are more "tangible" measurements compared to demographic variables such as age and gender. Copyright © 2013 Elsevier Ltd. All rights reserved.
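The AIC-based averaging described in this abstract can be illustrated with a minimal sketch. The abstract does not say how the top models were weighted, so the use of Akaike weights here is an assumption, and the AIC values and predicted risks are hypothetical numbers chosen only for illustration.

```python
import math

def akaike_weights(aic_values):
    """Convert AIC values into normalized Akaike weights (assumed weighting scheme)."""
    best = min(aic_values)
    # Relative likelihood of each model compared with the best (lowest-AIC) model.
    rel = [math.exp(-0.5 * (a - best)) for a in aic_values]
    total = sum(rel)
    return [r / total for r in rel]

def averaged_risk(aic_values, predicted_risks):
    """Weighted average of per-model predicted risks for one occupant."""
    w = akaike_weights(aic_values)
    return sum(wi * p for wi, p in zip(w, predicted_risks))

# Hypothetical AIC values and predicted injury risks from three candidate models.
aics = [210.0, 211.0, 214.0]
risks = [0.30, 0.40, 0.20]
weights = akaike_weights(aics)
avg = averaged_risk(aics, risks)
print(weights, avg)
```

Lower-AIC models receive larger weights, so the averaged prediction stays between the smallest and largest single-model predictions.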
Factors associated with low fitness in adolescents--a mixed methods study.
Charlton, Richard; Gravenor, Michael B; Rees, Anwen; Knox, Gareth; Hill, Rebecca; Rahman, Muhammad A; Jones, Kerina; Christian, Danielle; Baker, Julien S; Stratton, Gareth; Brophy, Sinead
2014-07-29
Fitness and physical activity are important for cardiovascular and mental health but activity and fitness levels are declining especially in adolescents and among girls. This study examines clustering of factors associated with low fitness in adolescents in order to best target public health interventions for young people. 1147 children were assessed for fitness, had blood samples, anthropometric measures and all data were linked with routine electronic data to examine educational achievement, deprivation and health service usage. Factors associated with fitness were examined using logistic regression, conditional trees and data mining cluster analysis. Focus groups were conducted with children in a deprived school to examine barriers and facilitators to activity for children in a deprived community. Unfit adolescents are more likely to be deprived, female, have obesity in the family and not achieve in education. There were 3 main clusters for risk of future heart disease/diabetes (high cholesterol/insulin); children at low risk (not obese, fit, achieving in education), children 'visibly at risk' (overweight, unfit, many hospital/GP visits) and 'invisibly at risk' (unfit but not overweight, failing in academic achievement). Qualitative findings show barriers to physical activity include cost, poor access to activity, lack of core physical literacy skills and limited family support. Low fitness in the non-obese child can reveal a hidden group who have high risk factors for heart disease and diabetes but may not be identified as they are normal weight. In deprived communities low fitness is associated with non-achievement in education but in non-deprived communities low fitness is associated with female gender. Interventions need to target deprived families and schools in deprived areas with community wide campaigns.
Ritter, Anne C; Wagner, Amy K; Szaflarski, Jerzy P; Brooks, Maria M; Zafonte, Ross D; Pugh, Mary Jo V; Fabio, Anthony; Hammond, Flora M; Dreer, Laura E; Bushnik, Tamara; Walker, William C; Brown, Allen W; Johnson-Greene, Doug; Shea, Timothy; Krellman, Jason W; Rosenthal, Joseph A
2016-09-01
Posttraumatic seizures (PTS) are well-recognized acute and chronic complications of traumatic brain injury (TBI). Risk factors have been identified, but considerable variability in who develops PTS remains. Existing PTS prognostic models are not widely adopted for clinical use and do not reflect current trends in injury, diagnosis, or care. We aimed to develop and internally validate preliminary prognostic regression models to predict PTS during acute care hospitalization, and at year 1 and year 2 postinjury. Prognostic models predicting PTS during acute care hospitalization and year 1 and year 2 post-injury were developed using a recent (2011-2014) cohort from the TBI Model Systems National Database. Potential PTS predictors were selected based on previous literature and biologic plausibility. Bivariable logistic regression identified variables with a p-value < 0.20 that were used to fit initial prognostic models. Multivariable logistic regression modeling with backward-stepwise elimination was used to determine reduced prognostic models and to internally validate using 1,000 bootstrap samples. Fit statistics were calculated, correcting for overfitting (optimism). The prognostic models identified sex, craniotomy, contusion load, and pre-injury limitation in learning/remembering/concentrating as significant PTS predictors during acute hospitalization. Significant predictors of PTS at year 1 were subdural hematoma (SDH), contusion load, craniotomy, craniectomy, seizure during acute hospitalization, duration of posttraumatic amnesia, preinjury mental health treatment/psychiatric hospitalization, and preinjury incarceration. Year 2 significant predictors were similar to those of year 1: SDH, intraparenchymal fragment, craniotomy, craniectomy, seizure during acute hospitalization, and preinjury incarceration. Corrected concordance (C) statistics were 0.599, 0.747, and 0.716 for acute hospitalization, year 1, and year 2 models, respectively. 
The prognostic model for PTS during acute hospitalization did not discriminate well. Year 1 and year 2 models showed fair to good predictive validity for PTS. Cranial surgery, although medically necessary, requires ongoing research regarding potential benefits of increased monitoring for signs of epileptogenesis, PTS prophylaxis, and/or rehabilitation/social support. Future studies should externally validate models and determine clinical utility. Wiley Periodicals, Inc. © 2016 International League Against Epilepsy.
Swirski, A L; Pearl, D L; Peregrine, A S; Pintar, K
2016-04-01
The purpose of this study is to determine how demographic and exposure factors related to giardiasis vary between travel and endemic cases. Exposure and demographic data were gathered by public health inspectors from giardiasis cases reported from the Region of Waterloo from 2006 to 2012. Logistic regression models were fit to assess differences in exposure to risk factors for giardiasis between international travel-related cases and Canadian acquired cases while controlling for age and sex. Multinomial regression models were also fit to assess the differences in risk profiles between international and domestic travel-related cases and endemic cases. Travel-related cases (both international and domestic) were more likely to go camping or kayaking, and consume untreated water compared to endemic cases. Domestic travel-related cases were more likely to visit a petting zoo or farm compared to endemic cases, and were more likely to swim in freshwater compared to endemic cases and international travel-related cases. International travellers were more likely to swim in an ocean compared to both domestic travel-related and endemic cases. These findings demonstrate that travel-related and endemic cases have different risk exposure profiles which should be considered for appropriately targeting health promotion campaigns.
Yount, Kathryn M; Krause, Kathleen H
2017-01-01
To provide the first study in Vietnam of how gendered social learning about violence and exposure to non-family institutions influence women's attitudes about a wife's recourse after physical IPV. A probability sample of 532 married women, ages 18-50 years, was surveyed in July-August, 2012 in Mỹ Hào district. We fit a multivariate linear regression model to estimate correlates of favoring recourse in six situations using a validated attitudinal scale. We split attitudes towards recourse into three subscales (disfavor silence, favor informal recourse, favor formal recourse) and fit one multivariate ordinal logistic regression model for each behavior to estimate correlates of favoring recourse. On average, women favored recourse in 2.8 situations. Women who were older and had witnessed physical IPV in childhood had less favorable attitudes about recourse. Women who were hit as children, had completed more schooling, worked outside agriculture, and had sought recourse after IPV had more favorable attitudes about recourse. Normative change among women may require efforts to curb family violence, counsel those exposed to violence in childhood, and enhance women's opportunities for higher schooling and non-agricultural wage work. The state and organizations working on IPV might overcome pockets of unfavorable public opinion by enforcing accountability for IPV rather than seeking to alter ideas about recourse among women.
Jewkes, R; Sikweyiya, Y; Dunkle, K; Morrell, R
2015-07-07
Studies of rape of women seldom distinguish between men's participation in acts of single and multiple perpetrator rape. Multiple perpetrator rape (MPR) occurs globally with serious consequences for women. In South Africa it is a cultural practice with defined circumstances in which it commonly occurs. Prevention requires an understanding of whether it is a context specific intensification of single perpetrator rape, or a distinctly different practice of different men. This paper aims to address this question. We conducted a cross-sectional household study with a multi-stage, randomly selected sample of 1686 men aged 18-49 who completed a questionnaire administered using an Audio-enhanced Personal Digital Assistant. We attempted to fit an ordered logistic regression model for factors associated with rape perpetration. 27.6 % of men had raped and 8.8 % had perpetrated multiple perpetrator rape (MPR). Thus 31.9 % of men who had ever raped had done so with other perpetrators. An ordered regression model was fitted, showing that the same associated factors, albeit at higher prevalence, are associated with SPR and MPR. Multiple perpetrator rape appears as an intensified form of single perpetrator rape, rather than a different form of rape. Prevention approaches need to be mainstreamed among young men.
2011-01-01
Background Logistic random effects models are a popular tool to analyze multilevel (also called hierarchical) data with a binary or ordinal outcome. Here, we aim to compare different statistical software implementations of these models. Methods We used individual patient data from 8509 patients in 231 centers with moderate and severe Traumatic Brain Injury (TBI) enrolled in eight Randomized Controlled Trials (RCTs) and three observational studies. We fitted logistic random effects regression models with the 5-point Glasgow Outcome Scale (GOS) as outcome, both dichotomized as well as ordinal, with center and/or trial as random effects, and as covariates age, motor score, pupil reactivity or trial. We then compared the implementations of frequentist and Bayesian methods to estimate the fixed and random effects. Frequentist approaches included R (lme4), Stata (GLLAMM), SAS (GLIMMIX and NLMIXED), MLwiN ([R]IGLS) and MIXOR; Bayesian approaches included WinBUGS, MLwiN (MCMC), R package MCMCglmm and SAS experimental procedure MCMC. Three data sets (the full data set and two sub-datasets) were analysed using basically two logistic random effects models with either one random effect for the center or two random effects for center and trial. For the ordinal outcome in the full data set also a proportional odds model with a random center effect was fitted. Results The packages gave similar parameter estimates for both the fixed and random effects and for the binary (and ordinal) models for the main study and when based on a relatively large number of level-1 (patient level) data compared to the number of level-2 (hospital level) data. However, when based on a relatively sparse data set, i.e. when the numbers of level-1 and level-2 data units were about the same, the frequentist and Bayesian approaches showed somewhat different results. The software implementations differ considerably in flexibility, computation time, and usability. 
There are also differences in the availability of additional tools for model evaluation, such as diagnostic plots. The experimental SAS (version 9.2) procedure MCMC appeared to be inefficient. Conclusions On relatively large data sets, the different software implementations of logistic random effects regression models produced similar results. Thus, for a large data set there seems to be no explicit preference (of course if there is no preference from a philosophical point of view) for either a frequentist or Bayesian approach (if based on vague priors). The choice for a particular implementation may largely depend on the desired flexibility, and the usability of the package. For small data sets the random effects variances are difficult to estimate. In the frequentist approaches the MLE of this variance was often estimated zero with a standard error that is either zero or could not be determined, while for Bayesian methods the estimates could depend on the chosen "non-informative" prior of the variance parameter. The starting value for the variance parameter may be also critical for the convergence of the Markov chain. PMID:21605357
ERIC Educational Resources Information Center
Koon, Sharon; Petscher, Yaacov
2015-01-01
The purpose of this report was to explicate the use of logistic regression and classification and regression tree (CART) analysis in the development of early warning systems. It was motivated by state education leaders' interest in maintaining high classification accuracy while simultaneously improving practitioner understanding of the rules by…
Park, Seon-Cheol; Lee, Min-Soo; Shinfuku, Naotaka; Sartorius, Norman; Park, Yong Chon
2015-09-01
The purpose of this study was to investigate whether there were gender-specific depressive symptom profiles or gender-specific patterns of psychotropic agent usage in Asian patients with depression. Clinical data from the Research on Asian Psychotropic Prescription Patterns for Antidepressant study (1171 depressed patients) were used to determine gender differences by analysis of covariates for continuous variables and by logistic regression analysis for discrete variables. In addition, a binary logistic regression model was fitted to identify independent clinical correlates of the gender-specific pattern on psychotropic drug usage. Men were more likely than women to have loss of interest (adjusted odds ratio = 1.379, p = 0.009), fatigue (adjusted odds ratio = 1.298, p = 0.033) and concurrent substance abuse (adjusted odds ratio = 3.793, p = 0.008), but gender differences in other symptom profiles and clinical features were not significant. Men were also more likely than women to be prescribed adjunctive therapy with a second-generation antipsychotic (adjusted odds ratio = 1.320, p = 0.044). However, men were less likely than women to have suicidal thoughts/acts (adjusted odds ratio = 0.724, p = 0.028). Binary logistic regression models revealed that lower age (odds ratio = 0.986, p = 0.027) and current hospitalization (odds ratio = 3.348, p < 0.0001) were independent clinical correlates of use of second-generation antipsychotics as adjunctive therapy for treating depressed Asian men. Unique gender-specific symptom profiles and gender-specific patterns of psychotropic drug usage can be identified in Asian patients with depression. Hence, ethnic and cultural influences on the gender preponderance of depression should be considered in the clinical psychiatry of Asian patients. © The Royal Australian and New Zealand College of Psychiatrists 2015.
Xu, Chengcheng; Wang, Wei; Liu, Pan; Zhang, Fangwei
2015-01-01
This study aimed to identify the traffic flow variables contributing to crash risks under different traffic states and to develop a real-time crash risk model incorporating the varying crash mechanisms across different traffic states. The crash, traffic, and geometric data were collected on the I-880N freeway in California in 2008 and 2009. This study considered 4 different traffic states in Wu's 4-phase traffic theory. They are free fluid traffic, bunched fluid traffic, bunched congested traffic, and standing congested traffic. Several different statistical methods were used to accomplish the research objective. The preliminary analysis showed that traffic states significantly affected crash likelihood, collision type, and injury severity. Nonlinear canonical correlation analysis (NLCCA) was conducted to identify the underlying phenomena that made certain traffic states more hazardous than others. The results suggested that different traffic states were associated with various collision types and injury severities. The matching of traffic flow characteristics and crash characteristics in NLCCA revealed how traffic states affected traffic safety. The logistic regression analyses showed that the factors contributing to crash risks were quite different across various traffic states. To incorporate the varying crash mechanisms across different traffic states, random parameters logistic regression was used to develop a real-time crash risk model. Bayesian inference based on Markov chain Monte Carlo simulations was used for model estimation. The parameters of traffic flow variables in the model were allowed to vary across different traffic states. Compared with the standard logistic regression model, the proposed model significantly improved the goodness-of-fit and predictive performance. 
These results can promote a better understanding of the relationship between traffic flow characteristics and crash risks, which is valuable knowledge in the pursuit of improving traffic safety on freeways through the use of dynamic safety management systems.
2017-03-23
PUBLIC RELEASE; DISTRIBUTION UNLIMITED. Using Multiple and Logistic Regression to Estimate the Median Will-Cost and Probability of Cost and... Cost and Probability of Cost and Schedule Overrun for Program Managers. Ryan C. Trudelle. Follow this and additional works at: https://scholar.afit.edu...afit.edu. Recommended Citation: Trudelle, Ryan C., "Using Multiple and Logistic Regression to Estimate the Median Will-Cost and Probability of Cost and
2013-11-01
Ptrend 0.78 0.62 0.75 Unconditional logistic regression was used to estimate odds ratios (OR) and 95 % confidence intervals (CI) for risk of node...Ptrend 0.71 0.67 Unconditional logistic regression was used to estimate odds ratios (OR) and 95 % confidence intervals (CI) for risk of high-grade tumors... logistic regression was used to estimate odds ratios (OR) and 95 % confidence intervals (CI) for the associations between each of the seven SNPs and
Kim, Sun Mi; Kim, Yongdai; Jeong, Kuhwan; Jeong, Heeyeong; Kim, Jiyoung
2018-01-01
The aim of this study was to compare the performance of image analysis for predicting breast cancer using two distinct regression models and to evaluate the usefulness of incorporating clinical and demographic data (CDD) into the image analysis in order to improve the diagnosis of breast cancer. This study included 139 solid masses from 139 patients who underwent an ultrasonography-guided core biopsy and had available CDD between June 2009 and April 2010. Three breast radiologists retrospectively reviewed 139 breast masses and described each lesion using the Breast Imaging Reporting and Data System (BI-RADS) lexicon. We applied and compared two regression methods, stepwise logistic (SL) regression and logistic least absolute shrinkage and selection operator (LASSO) regression, in which the BI-RADS descriptors and CDD were used as covariates. We investigated the performances of these regression methods and the agreement of radiologists in terms of test misclassification error and the area under the curve (AUC) of the tests. Logistic LASSO regression was superior (P<0.05) to SL regression, regardless of whether CDD was included in the covariates, in terms of test misclassification errors (0.234 vs. 0.253, without CDD; 0.196 vs. 0.258, with CDD) and AUC (0.785 vs. 0.759, without CDD; 0.873 vs. 0.735, with CDD). However, it was inferior (P<0.05) to the agreement of three radiologists in terms of test misclassification errors (0.234 vs. 0.168, without CDD; 0.196 vs. 0.088, with CDD) and the AUC without CDD (0.785 vs. 0.844, P<0.001), but was comparable to the AUC with CDD (0.873 vs. 0.880, P=0.141). Logistic LASSO regression based on BI-RADS descriptors and CDD showed better performance than SL in predicting the presence of breast cancer. The use of CDD as a supplement to the BI-RADS descriptors significantly improved the prediction of breast cancer using logistic LASSO regression.
NASA Astrophysics Data System (ADS)
Lombardo, L.; Cama, M.; Maerker, M.; Parisi, L.; Rotigliano, E.
2014-12-01
This study aims at comparing the performances of Binary Logistic Regression (BLR) and Boosted Regression Trees (BRT) methods in assessing landslide susceptibility for multiple-occurrence regional landslide events within the Mediterranean region. A test area was selected in the north-eastern sector of Sicily (southern Italy), corresponding to the catchments of the Briga and the Giampilieri streams both stretching for few kilometres from the Peloritan ridge (eastern Sicily, Italy) to the Ionian sea. This area was struck on the 1st October 2009 by an extreme climatic event resulting in thousands of rapid shallow landslides, mainly of debris flows and debris avalanches types involving the weathered layer of a low to high grade metamorphic bedrock. Exploiting the same set of predictors and the 2009 landslide archive, BLR- and BRT-based susceptibility models were obtained for the two catchments separately, adopting a random partition (RP) technique for validation; besides, the models trained in one of the two catchments (Briga) were tested in predicting the landslide distribution in the other (Giampilieri), adopting a spatial partition (SP) based validation procedure. All the validation procedures were based on multi-folds tests so to evaluate and compare the reliability of the fitting, the prediction skill, the coherence in the predictor selection and the precision of the susceptibility estimates. All the obtained models for the two methods produced very high predictive performances, with a general congruence between BLR and BRT in the predictor importance. In particular, the research highlighted that BRT-models reached a higher prediction performance with respect to BLR-models, for RP based modelling, whilst for the SP-based models the difference in predictive skills between the two methods dropped drastically, converging to an analogous excellent performance. 
However, when looking at the precision of the probability estimates, BLR demonstrated to produce more robust models in terms of selected predictors and coefficients, as well as of dispersion of the estimated probabilities around the mean value for each mapped pixel. The difference in the behaviour could be interpreted as the result of overfitting effects, which heavily affect decision tree classification more than logistic regression techniques.
SPSS macros to compare any two fitted values from a regression model.
Weaver, Bruce; Dubois, Sacha
2012-12-01
In regression models with first-order terms only, the coefficient for a given variable is typically interpreted as the change in the fitted value of Y for a one-unit increase in that variable, with all other variables held constant. Therefore, each regression coefficient represents the difference between two fitted values of Y. But the coefficients represent only a fraction of the possible fitted value comparisons that might be of interest to researchers. For many fitted value comparisons that are not captured by any of the regression coefficients, common statistical software packages do not provide the standard errors needed to compute confidence intervals or carry out statistical tests, particularly in more complex models that include interactions, polynomial terms, or regression splines. We describe two SPSS macros that implement a matrix algebra method for comparing any two fitted values from a regression model. The !OLScomp and !MLEcomp macros are for use with models fitted via ordinary least squares and maximum likelihood estimation, respectively. The output from the macros includes the standard error of the difference between the two fitted values, a 95% confidence interval for the difference, and a corresponding statistical test with its p-value.
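The comparison of two fitted values described above can be sketched with the standard variance formula for a linear combination of coefficients. This is not the macros' own code: the function name `compare_fitted`, the coefficient vector, and the covariance matrix are hypothetical, and the normal critical value 1.96 is a simplifying assumption (an ordinary least squares analysis would normally use a t quantile).

```python
import math

def compare_fitted(b, V, x1, x2, z=1.96):
    """Difference between two fitted values, its standard error, and an approximate 95% CI.

    b  : coefficient estimates
    V  : covariance matrix of the coefficient estimates
    x1 : design row for the first fitted value
    x2 : design row for the second fitted value
    """
    d = [a - c for a, c in zip(x1, x2)]            # contrast vector x1 - x2
    diff = sum(di * bi for di, bi in zip(d, b))     # d'b
    k = len(d)
    var = sum(d[i] * V[i][j] * d[j] for i in range(k) for j in range(k))  # d'Vd
    se = math.sqrt(var)
    return diff, se, (diff - z * se, diff + z * se)

# Hypothetical quadratic model: intercept, x, x^2.
b = [1.0, 2.0, 0.5]
V = [[0.04, 0.0, 0.0],
     [0.0, 0.01, 0.0],
     [0.0, 0.0, 0.0025]]
x1 = [1, 3, 9]   # design row at x = 3
x2 = [1, 1, 1]   # design row at x = 1
diff, se, ci = compare_fitted(b, V, x1, x2)
print(diff, se, ci)
```

Because the model contains a polynomial term, the difference between the fitted values at x = 3 and x = 1 is not equal to any single coefficient, which is the situation the abstract describes.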
Yu, Yuanyuan; Li, Hongkai; Sun, Xiaoru; Su, Ping; Wang, Tingting; Liu, Yi; Yuan, Zhongshang; Liu, Yanxun; Xue, Fuzhong
2017-12-28
Confounders can produce spurious associations between exposure and outcome in observational studies. For the majority of epidemiologists, adjusting for confounders with a logistic regression model is the habitual method, though it has some problems in accuracy and precision. It is, therefore, important to highlight the problems of logistic regression and to search for an alternative method. Four causal diagram models were defined to summarize confounding equivalence. Both theoretical proofs and simulation studies were performed to verify whether conditioning on different confounding equivalence sets had the same bias-reducing potential and then to select the optimum adjusting strategy, in which the logistic regression model and the inverse probability weighting based marginal structural model (IPW-based-MSM) were compared. The "do-calculus" was used to calculate the true causal effect of exposure on outcome; the bias and standard error were then used to evaluate the performance of the different strategies. Adjusting for different sets of confounding equivalence, as judged by identical Markov boundaries, produced different bias-reducing potential in the logistic regression model. For the sets that satisfied G-admissibility, adjusting for the set including all the confounders reduced the bias to the same extent as adjusting for the set containing the parent nodes of the outcome, while the bias after adjusting for the parent nodes of exposure was not equivalent to them. In addition, all causal effect estimations through logistic regression were biased, although the estimation after adjusting for the parent nodes of exposure was nearest to the true causal effect. However, conditioning on different confounding equivalence sets had the same bias-reducing potential under IPW-based-MSM. 
Compared with logistic regression, the IPW-based-MSM could obtain unbiased causal effect estimation when the adjusted confounders satisfied G-admissibility, and the optimal strategy was to adjust for the parent nodes of outcome, which obtained the highest precision. All adjustment strategies through logistic regression were biased for causal effect estimation, while IPW-based-MSM could always obtain unbiased estimation when the adjusted set satisfied G-admissibility. Thus, IPW-based-MSM is recommended for adjusting for a set of confounders.
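Inverse probability weighting, which this abstract compares with logistic regression adjustment, can be illustrated with a small sketch. It is not the study's marginal structural model: it is a normalized IPW estimate of a risk difference with one binary confounder, the propensity is estimated as the exposed share within each confounder stratum, and the data and the function name `ipw_risk_difference` are hypothetical.

```python
from collections import defaultdict

def ipw_risk_difference(records):
    """records: list of (c, a, y) with confounder c, binary exposure a, binary outcome y.

    Assumes every stratum of c contains both exposed and unexposed subjects.
    """
    # Estimate the propensity P(A=1 | C=c) as the exposed share in each stratum.
    n = defaultdict(int)
    n_exposed = defaultdict(int)
    for c, a, y in records:
        n[c] += 1
        n_exposed[c] += a
    num = {0: 0.0, 1: 0.0}
    den = {0: 0.0, 1: 0.0}
    for c, a, y in records:
        p = n_exposed[c] / n[c]
        w = 1.0 / p if a == 1 else 1.0 / (1.0 - p)  # inverse probability of the exposure received
        num[a] += w * y
        den[a] += w
    # Weighted risk among the exposed minus weighted risk among the unexposed.
    return num[1] / den[1] - num[0] / den[0]

def crude_risk_difference(records):
    """Unweighted risk difference, ignoring the confounder."""
    exposed = [y for c, a, y in records if a == 1]
    unexposed = [y for c, a, y in records if a == 0]
    return sum(exposed) / len(exposed) - sum(unexposed) / len(unexposed)

def rows(c, a, y, k):
    return [(c, a, y)] * k

# Hypothetical data: exposure is rare in stratum c=0 and common in stratum c=1.
records = (rows(0, 1, 1, 1) + rows(0, 1, 0, 1) + rows(0, 0, 1, 2) + rows(0, 0, 0, 6)
           + rows(1, 1, 1, 6) + rows(1, 1, 0, 2) + rows(1, 0, 1, 1) + rows(1, 0, 0, 1))
rd_ipw = ipw_risk_difference(records)
rd_crude = crude_risk_difference(records)
print(rd_ipw, rd_crude)
```

In this toy data set the crude difference overstates the effect because the confounder raises both the chance of exposure and the risk of the outcome; weighting removes that imbalance.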
Bennie, J A; Thomas, G; Wiesner, G H; van Uffelen, J G Z; Khan, A; Kolbe-Alexander, T; Vergeer, I; Biddle, S J H
2018-07-01
Fitness industry professionals (personal trainers, group instructors) may have a role in health promotion, particularly when working with subgroups with known health risks (e.g. older adults, obese). The aim of this study is to examine fitness professionals' level of interest in engaging with high-risk populations. Cross-sectional evaluation of a national survey. In 2014, 9100 Australian registered exercise professionals were invited to complete an online survey. Respondents reported their level of interest in engaging with nine health-risk population subgroups. A multivariable logistic regression analysis assessed the odds of being classified as having a 'low level' of interest in training high health-risk subgroups, adjusting for demographic and fitness industry-related factors. Of 1185 respondents (aged 17-72 years), 31.1% reported having a 'high level' of interest in training high health-risk subgroups. The highest level of interest was among 'obese clients' and 'adults (18-64 years) with chronic health conditions'. In the adjusted analysis, males (odds ratio [OR], 1.55, 95% confidence interval [CI]: 1.06-2.25) and those in urban settings (OR, 2.26, 95% CI: 1.54-3.37) were more likely to have a 'low level' of interest. Fitness professionals have a modest level of interest in training high health-risk subgroups. In addition to the development of strategies to increase interest, research should examine whether fitness professionals are able to safely prescribe exercise to high health-risk subgroups. Copyright © 2018. Published by Elsevier Ltd.
Modeling Governance KB with CATPCA to Overcome Multicollinearity in the Logistic Regression
NASA Astrophysics Data System (ADS)
Khikmah, L.; Wijayanto, H.; Syafitri, U. D.
2017-04-01
Multicollinearity is a problem often encountered in logistic regression modeling. When the explanatory variables are multicollinear, the parameter estimates become biased, and the classification is also affected by error. In general, stepwise regression is used to overcome multicollinearity in regression. Another method for overcoming multicollinearity, which involves all variables for prediction, is Principal Component Analysis (PCA). However, classical PCA applies only to numeric data. When the data are categorical, one method to solve the problem is Categorical Principal Component Analysis (CATPCA). The data used in this research were part of the Demographic and Population Survey Indonesia (IDHS) 2012. This research focuses on the characteristics of women using contraceptive methods. Classification results were evaluated using Area Under Curve (AUC) values; the higher the AUC value, the better. Based on AUC values, the classification of contraceptive method use with the stepwise method (58.66%) is better than with the logistic regression model (57.39%) and CATPCA (57.39%). Evaluating the results of logistic regression by sensitivity shows the opposite: the CATPCA method (99.79%) is better than the logistic regression method (92.43%) and stepwise (92.05%). Because this study focuses on classification of the major class (using a contraceptive method), the selected model is CATPCA, since it raises the accuracy of the model for the major class.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Defraene, Gilles, E-mail: gilles.defraene@uzleuven.be; Van den Bergh, Laura; Al-Mamgani, Abrahim
2012-03-01
Purpose: To study the impact of clinical predisposing factors on rectal normal tissue complication probability modeling using the updated results of the Dutch prostate dose-escalation trial. Methods and Materials: Toxicity data of 512 patients (conformally treated to 68 Gy [n = 284] and 78 Gy [n = 228]) with complete follow-up at 3 years after radiotherapy were studied. Scored end points were rectal bleeding, high stool frequency, and fecal incontinence. Two traditional dose-based models (Lyman-Kutcher-Burman (LKB) and Relative Seriality (RS)) and a logistic model were fitted using a maximum likelihood approach. Furthermore, these model fits were improved by including the most significant clinical factors. The area under the receiver operating characteristic curve (AUC) was used to compare the discriminating ability of all fits. Results: Including clinical factors significantly increased the predictive power of the models for all end points. In the optimal LKB, RS, and logistic models for rectal bleeding and fecal incontinence, the first significant (p = 0.011-0.013) clinical factor was 'previous abdominal surgery.' As second significant (p = 0.012-0.016) factor, 'cardiac history' was included in all three rectal bleeding fits, whereas including 'diabetes' was significant (p = 0.039-0.048) in fecal incontinence modeling but only in the LKB and logistic models. High stool frequency fits only benefitted significantly (p = 0.003-0.006) from the inclusion of the baseline toxicity score. For all models rectal bleeding fits had the highest AUC (0.77) where it was 0.63 and 0.68 for high stool frequency and fecal incontinence, respectively. LKB and logistic model fits resulted in similar values for the volume parameter. The steepness parameter was somewhat higher in the logistic model, also resulting in a slightly lower D50. Anal wall DVHs were used for fecal incontinence, whereas anorectal wall dose best described the other two endpoints. 
Conclusions: Comparable prediction models were obtained with LKB, RS, and logistic NTCP models. Including clinical factors improved the predictive power of all models significantly.
Are Hemorrhoids Associated with False-Positive Fecal Immunochemical Test Results?
Kim, Nam Hee; Park, Jung Ho; Park, Dong Il; Sohn, Chong Il; Choi, Kyuyong
2017-01-01
Purpose: False-positive (FP) results of fecal immunochemical tests (FITs) conducted in colorectal cancer (CRC) screening could lead to performing unnecessary colonoscopies. Hemorrhoids are a possible cause of FP FIT results; however, studies on this topic are extremely rare. We investigated whether hemorrhoids are associated with FP FIT results. Materials and Methods: A retrospective study was conducted at a university hospital in Korea from June 2013 to May 2015. Of the 34547 individuals who underwent FITs, 3946 aged ≥50 years who underwent colonoscopies were analyzed. Logistic regression analysis was performed to determine factors associated with FP FIT results. Results: Among 3946 participants, 704 (17.8%) showed positive FIT results and 1303 (33.0%) had hemorrhoids. Of the 704 participants with positive FIT results, 165 had advanced colorectal neoplasia (ACRN) and 539 had no ACRN (FP results). Of the 1303 participants with hemorrhoids, 291 showed FP results, of whom 81 showed FP results because of hemorrhoids only. Participants with hemorrhoids had a higher rate of FP results than those without hemorrhoids (291/1176, 24.7% vs. 248/2361, 10.5%; p<0.001). Additionally, the participants with hemorrhoids as the only abnormality had a higher rate of FP results than those experiencing no such abnormalities (81/531, 15.3% vs. 38/1173, 3.2%; p<0.001). In multivariate analysis, the presence of hemorrhoids was identified as an independent predictor of FP results (adjusted odds ratio, 2.76; 95% confidence interval, 2.24–3.40; p<0.001). Conclusion: Hemorrhoids are significantly associated with FP FIT results. Their presence seemed to be a non-negligible contributor to FP results in FIT-based CRC screening programs. PMID:27873508
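The percentages quoted in this abstract can be re-derived from the counts it reports. The short sketch below does that arithmetic and also computes a crude (unadjusted) odds ratio from the same 2×2 counts. The crude odds ratio is not reported in the abstract and is shown only for comparison with the adjusted value of 2.76; the helper name `percent` is an illustrative choice.

```python
def percent(numerator, denominator):
    """Return a proportion as a percentage rounded to one decimal place."""
    return round(100 * numerator / denominator, 1)

# FP rates with and without hemorrhoids, using the counts from the abstract.
fp_rate_hemorrhoids = percent(291, 1176)
fp_rate_no_hemorrhoids = percent(248, 2361)

# FP rates when hemorrhoids are the only abnormality vs. no abnormality.
fp_rate_hemorrhoids_only = percent(81, 531)
fp_rate_no_abnormality = percent(38, 1173)

# Crude odds ratio from the first comparison (an unadjusted quantity that
# the abstract does not report; the abstract gives only the adjusted OR).
odds_hemorrhoids = 291 / (1176 - 291)
odds_no_hemorrhoids = 248 / (2361 - 248)
crude_odds_ratio = odds_hemorrhoids / odds_no_hemorrhoids

print(fp_rate_hemorrhoids, fp_rate_no_hemorrhoids)
print(fp_rate_hemorrhoids_only, fp_rate_no_abnormality)
print(round(crude_odds_ratio, 2))
```

The crude value lies close to the adjusted odds ratio, which suggests, without proving, that adjustment changed the estimate only slightly.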
Juraschek, Stephen P.; Blaha, Michael J.; Whelton, Seamus P.; Blumenthal, Roger; Jones, Steven R.; Keteyian, Steven J.; Schairer, John; Brawner, Clinton A.; Al‐Mallah, Mouaz H.
2014-01-01
Background: Increased physical fitness is protective against cardiovascular disease. We hypothesized that increased fitness would be inversely associated with hypertension. Methods and Results: We examined the association of fitness with prevalent and incident hypertension in 57 284 participants from The Henry Ford ExercIse Testing (FIT) Project (1991–2009). Fitness was measured during a clinician‐referred treadmill stress test. Incident hypertension was defined as a new diagnosis of hypertension on 3 separate consecutive encounters derived from electronic medical records or administrative claims files. Analyses were performed with logistic regression or Cox proportional hazards models and were adjusted for hypertension risk factors. The mean age overall was 53 years, with 49% women and 29% black. Mean peak metabolic equivalents (METs) achieved was 9.2 (SD, 3.0). Fitness was inversely associated with prevalent hypertension even after adjustment (≥12 METs versus <6 METs; OR: 0.73; 95% CI: 0.67, 0.80). During a median follow‐up period of 4.4 years (interquartile range: 2.2 to 7.7 years), there were 8053 new cases of hypertension (36.4% of 22 109 participants without baseline hypertension). The unadjusted 5‐year cumulative incidences across categories of METs (<6, 6 to 9, 10 to 11, and ≥12) were 49%, 41%, 30%, and 21%. After adjustment, participants achieving ≥12 METs had a 20% lower risk of incident hypertension compared to participants achieving <6 METs (HR: 0.80; 95% CI: 0.72, 0.89). This relationship was preserved across strata of age, sex, race, obesity, resting blood pressure, and diabetes. Conclusions: Higher fitness is associated with a lower probability of prevalent and incident hypertension independent of baseline risk factors. PMID:25520327
Recruit Fitness as a Predictor of Police Academy Graduation.
Shusko, M; Benedetti, L; Korre, M; Eshleman, E J; Farioli, A; Christophi, C A; Kales, S N
2017-10-01
Suboptimal recruit fitness may be a risk factor for poor performance, injury, illness, and lost time during police academy training. To assess the probability of successful completion and graduation from a police academy as a function of recruits' baseline fitness levels at the time of academy entry. Retrospective study where all available records from recruit training courses held (2006-2012) at all Massachusetts municipal police academies were reviewed and analysed. Entry fitness levels were quantified from the following measures, as recorded at the start of each training class: body composition, push-ups, sit-ups, sit-and-reach, and 1.5-mile run-time. The primary outcome of interest was the odds of not successfully graduating from an academy. We used generalized linear mixed models in order to fit logistic regression models with random intercepts for assessing the probability of not graduating, based on entry-level fitness. The primary analyses were restricted to recruits with complete entry-level fitness data. The fitness measures most strongly associated with academy failure were lesser number of push-ups completed (odds ratio [OR] = 5.2, 95% confidence interval [CI] 2.3-11.7, for 20 versus 41-60 push-ups) and slower run times (OR = 3.8, 95% CI 1.8-7.8, [1.5 mile run time of ≥15'20″] versus [12'33″ to 10'37″]). Baseline pushups and 1.5-mile run-time showed the best ability to predict successful academy graduation, especially when considered together. Future research should include prospective validation of entry-level fitness as a predictor of subsequent police academy success. © The Author 2017. Published by Oxford University Press on behalf of the Society of Occupational Medicine.
Sources of practice knowledge among Australian fitness trainers.
Bennie, Jason A; Wiesner, Glen H; van Uffelen, Jannique G Z; Harvey, Jack T; Biddle, Stuart J H
2017-12-01
Few studies have examined the sources of practice knowledge fitness trainers use to inform their training methods and update knowledge. This study aims to describe sources of practice knowledge among Australian fitness trainers. In July 2014, 9100 Australian fitness trainers were invited to complete an online survey. Respondents reported the frequency of use of eight sources of practice knowledge (e.g. fitness magazines, academic texts). In a separate survey, exercise science experts (n = 27) ranked each source as either (1) 'high-quality' or (2) 'low-quality'. Proportions of users of 'high-quality' sources were calculated across demographic (age, sex) and fitness industry-related characteristics (qualification, setting, role). A multivariate logistic regression analysis assessed the odds of being classified as a user of high-quality sources, adjusting for demographic and fitness industry-related factors. Out of 1185 fitness trainers (response rate = 13.0%), aged 17-72 years, 47.6% (95% CI, 44.7-50.4%) were classified as frequent users of high-quality sources of practice knowledge. In the adjusted analysis, compared to trainers aged 17-26 years, those aged ≥61 years (OR, 2.15; 95% CI, 1.05-4.38) and 40-50 years (OR, 1.54; 95% CI, 1.02-2.31) were more likely to be classified as a user of high-quality sources. When compared to trainers working in large centres, those working in outdoor settings (OR, 1.81; 95% CI, 1.23-2.65) and medium centres (OR, 1.59; 95% CI, 1.12-2.29) were more likely to be classified as users of high-quality sources. Our findings suggest that efforts should be made to improve the quality of knowledge acquisition among Australian fitness trainers.
Li, Y.; Graubard, B. I.; Huang, P.; Gastwirth, J. L.
2015-01-01
Determining the extent of a disparity, if any, between groups of people, for example, race or gender, is of interest in many fields, including public health for medical treatment and prevention of disease. An observed difference in the mean outcome between an advantaged group (AG) and disadvantaged group (DG) can be due to differences in the distribution of relevant covariates. The Peters–Belson (PB) method fits a regression model with covariates to the AG to predict, for each DG member, their outcome measure as if they had been from the AG. The difference between the mean predicted and the mean observed outcomes of DG members is the (unexplained) disparity of interest. We focus on applying the PB method to estimate the disparity based on binary/multinomial/proportional odds logistic regression models using data collected from complex surveys with more than one DG. Estimators of the unexplained disparity, an analytic variance–covariance estimator that is based on the Taylor linearization variance–covariance estimation method, as well as a Wald test for testing a joint null hypothesis of zero for unexplained disparities between two or more minority groups and a majority group, are provided. Simulation studies with data selected from simple random sampling and cluster sampling, as well as the analyses of disparity in body mass index in the National Health and Nutrition Examination Survey 1999–2004, are conducted. Empirical results indicate that the Taylor linearization variance–covariance estimation is accurate and that the proposed Wald test maintains the nominal level. PMID:25382235
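The Peters–Belson steps described above (fit on the AG, predict for each DG member, compare mean predicted with mean observed) can be sketched in a few lines. This is a deliberately simplified illustration, not the authors' estimator: it ignores survey weights and variance estimation, and it assumes a single categorical covariate, for which the fitted probabilities of a saturated logistic regression equal the observed AG proportions within each covariate level. All names and the toy data are hypothetical.

```python
from collections import defaultdict

def fit_saturated_model(records):
    """Estimate P(outcome = 1 | level) from (level, outcome) pairs.

    With one categorical covariate, these cell proportions coincide with
    the fitted probabilities of a saturated logistic regression.
    """
    totals = defaultdict(int)
    events = defaultdict(int)
    for level, outcome in records:
        totals[level] += 1
        events[level] += outcome
    return {level: events[level] / totals[level] for level in totals}

def peters_belson_disparity(ag_records, dg_records):
    """Mean predicted (as if AG) minus mean observed outcome among DG members."""
    model = fit_saturated_model(ag_records)
    predicted = [model[level] for level, _ in dg_records]
    observed = [outcome for _, outcome in dg_records]
    return sum(predicted) / len(predicted) - sum(observed) / len(observed)

# Hypothetical toy data: (covariate level, binary outcome).
ag = [("low", 1)] * 2 + [("low", 0)] * 8 + [("high", 1)] * 6 + [("high", 0)] * 4
dg = [("low", 1)] * 1 + [("low", 0)] * 7 + [("high", 0)] * 2

disparity = peters_belson_disparity(ag, dg)
print(round(disparity, 2))
```

In this toy example the DG members would be expected to have an outcome rate of 0.28 had they been treated like AG members with the same covariate levels, against an observed rate of 0.10, so the unexplained disparity is 0.18.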
Mapping Shallow Landslide Slope Instability at Large Scales Using Remote Sensing and GIS
NASA Astrophysics Data System (ADS)
Avalon Cullen, C.; Kashuk, S.; Temimi, M.; Suhili, R.; Khanbilvardi, R.
2015-12-01
Rainfall-induced landslides are one of the most frequent hazards on sloping terrain. They lead to great economic losses and fatalities worldwide. Most factors inducing shallow landslides are local and can only be mapped with high levels of uncertainty at larger scales. This work presents an attempt to determine slope instability at large scales. Buffer and threshold techniques are used to downscale areas and minimize uncertainties. Four static parameters (slope angle, soil type, land cover and elevation) for 261 shallow rainfall-induced landslides in the continental United States are examined. ASTER GDEM is used as the basis for topographical characterization of slope and buffer analysis. Slope angle threshold assessment at the 50, 75, 95, 98, and 99 percentiles is tested locally. Each threshold is then analyzed in relation to the other parameters in a logistic regression environment for the continental U.S. Thresholds below the 95th percentile are found to underestimate slope angles. The best regression fit is achieved when utilizing the 99-threshold slope angle. This model predicts the highest number of cases correctly at 87.0% accuracy. A one-unit rise in the 99-threshold range increases landslide likelihood by 11.8%. The logistic regression model is carried over to ArcGIS where all variables are processed based on their corresponding coefficients. A regional slope instability map for the continental United States is created and analyzed against the available landslide records and their spatial distributions. It is expected that future inclusion of dynamic parameters like precipitation and other proxies like soil moisture into the model will further improve accuracy.
Mocellin, Simone; Thompson, John F; Pasquali, Sandro; Montesco, Maria C; Pilati, Pierluigi; Nitti, Donato; Saw, Robyn P; Scolyer, Richard A; Stretch, Jonathan R; Rossi, Carlo R
2009-12-01
To improve selection for sentinel node (SN) biopsy (SNB) in patients with cutaneous melanoma using statistical models predicting SN status. About 80% of patients currently undergoing SNB are node negative. In the absence of conclusive evidence of an SNB-associated survival benefit, these patients may be over-treated. Here, we tested the efficiency of 4 different models in predicting SN status. The clinicopathologic data (age, gender, tumor thickness, Clark level, regression, ulceration, histologic subtype, and mitotic index) of 1132 melanoma patients who had undergone SNB at institutions in Italy and Australia were analyzed. Logistic regression, classification tree, random forest, and support vector machine models were fitted to the data. The predictive models were built with the aim of maximizing the negative predictive value (NPV) and reducing the rate of SNB procedures while minimizing the error rate. After cross-validation, logistic regression, classification tree, random forest, and support vector machine predictive models obtained clinically relevant NPV (93.6%, 94.0%, 97.1%, and 93.0%, respectively), SNB reduction (27.5%, 29.8%, 18.2%, and 30.1%, respectively), and error rates (1.8%, 1.8%, 0.5%, and 2.1%, respectively). Using commonly available clinicopathologic variables, predictive models can preoperatively identify a proportion of patients (approximately 25%) who might be spared SNB, with an acceptable (1%-2%) error. If validated in large prospective series, these models might be implemented in the clinical setting for improved patient selection, which ultimately would lead to better quality of life for patients and optimization of resource allocation for the health care system.
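The three reported metrics are linked by simple arithmetic if one assumes the following definitions, which the abstract does not state explicitly: "SNB reduction" is the share of all patients predicted SN-negative (and so spared the biopsy), and "error rate" is the share of all patients who are predicted negative but are actually SN-positive. Under those assumed definitions, NPV = 1 − error rate / SNB reduction. The sketch below checks how closely the reported figures follow that relation; the function name is illustrative.

```python
def implied_npv(snb_reduction_pct, error_rate_pct):
    """NPV (in %) implied by the assumed definitions.

    snb_reduction_pct: % of all patients predicted SN-negative
    error_rate_pct:    % of all patients who are false negatives
    """
    return 100 * (1 - error_rate_pct / snb_reduction_pct)

# (reported NPV %, SNB reduction %, error rate %) as given in the abstract.
reported = {
    "logistic regression": (93.6, 27.5, 1.8),
    "classification tree": (94.0, 29.8, 1.8),
    "random forest": (97.1, 18.2, 0.5),
    "support vector machine": (93.0, 30.1, 2.1),
}

differences = {}
for model, (npv, reduction, error) in reported.items():
    differences[model] = abs(implied_npv(reduction, error) - npv)
    print(model, round(implied_npv(reduction, error), 1), npv)
```

The implied and reported NPVs agree to within about 0.2 percentage points for all four models, which is consistent with the assumed definitions given that the published inputs are rounded to one decimal place.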
Logistic regression models of factors influencing the location of bioenergy and biofuels plants
T.M. Young; R.L. Zaretzki; J.H. Perdue; F.M. Guess; X. Liu
2011-01-01
Logistic regression models were developed to identify significant factors that influence the location of existing wood-using bioenergy/biofuels plants and traditional wood-using facilities. Logistic models provided quantitative insight for variables influencing the location of woody biomass-using facilities. Availability of "thinnings to a basal area of 31.7m2/ha...
Discrete post-processing of total cloud cover ensemble forecasts
NASA Astrophysics Data System (ADS)
Hemri, Stephan; Haiden, Thomas; Pappenberger, Florian
2017-04-01
This contribution presents an approach to post-process ensemble forecasts for the discrete and bounded weather variable of total cloud cover. Two methods for discrete statistical post-processing of ensemble predictions are tested. The first approach is based on multinomial logistic regression, the second involves a proportional odds logistic regression model. Applying them to total cloud cover raw ensemble forecasts from the European Centre for Medium-Range Weather Forecasts improves forecast skill significantly. Based on station-wise post-processing of raw ensemble total cloud cover forecasts for a global set of 3330 stations over the period from 2007 to early 2014, the more parsimonious proportional odds logistic regression model proved to slightly outperform the multinomial logistic regression model. Reference Hemri, S., Haiden, T., & Pappenberger, F. (2016). Discrete post-processing of total cloud cover ensemble forecasts. Monthly Weather Review 144, 2565-2577.
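For orientation, the two model families compared above can be written in their standard textbook forms. This is a generic formulation and not necessarily the exact parameterization used by the authors; the notation (Y for the cloud cover category with K ordered levels, x for the predictors derived from the raw ensemble, and the coefficients) is assumed here.

```latex
\begin{align*}
% Multinomial logistic regression: one coefficient vector per non-reference category
\log \frac{P(Y = k \mid x)}{P(Y = K \mid x)} &= \alpha_k + x^\top \beta_k,
  & k &= 1, \dots, K-1, \\
% Proportional odds logistic regression: one coefficient vector shared by all thresholds
\log \frac{P(Y \le k \mid x)}{P(Y > k \mid x)} &= \theta_k - x^\top \beta,
  & k &= 1, \dots, K-1, \quad \theta_1 \le \dots \le \theta_{K-1}.
\end{align*}
```

With p predictors, the multinomial model estimates (K − 1)(p + 1) parameters, whereas the proportional odds model estimates only (K − 1) + p, which is the sense in which it is the more parsimonious of the two.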
Fuzzy multinomial logistic regression analysis: A multi-objective programming approach
NASA Astrophysics Data System (ADS)
Abdalla, Hesham A.; El-Sayed, Amany A.; Hamed, Ramadan
2017-05-01
Parameter estimation for multinomial logistic regression is usually based on maximizing the likelihood function. For large well-balanced datasets, Maximum Likelihood (ML) estimation is a satisfactory approach. Unfortunately, ML can fail completely or at least produce poor results in terms of estimated probabilities and confidence intervals of parameters, especially for small datasets. In this study, a new approach based on fuzzy concepts is proposed to estimate parameters of the multinomial logistic regression. The study assumes that the parameters of multinomial logistic regression are fuzzy. Based on the extension principle stated by Zadeh and Bárdossy's proposition, a multi-objective programming approach is suggested to estimate these fuzzy parameters. A simulation study is used to evaluate the performance of the new approach versus the ML approach. Results show that the newly proposed model outperforms ML in cases of small datasets.
A Primer on Logistic Regression.
ERIC Educational Resources Information Center
Woldbeck, Tanya
This paper introduces logistic regression as a viable alternative when the researcher is faced with variables that are not continuous. If one is to use simple regression, the dependent variable must be measured on a continuous scale. In the behavioral sciences, it may not always be appropriate or possible to have a measured dependent variable on a…
An intervention program to promote health-related physical fitness in nurses.
Yuan, Su-Chuan; Chou, Ming-Chih; Hwu, Lien-Jen; Chang, Yin-O; Hsu, Wen-Hsin; Kuo, Hsien-Wen
2009-05-01
To assess the effects of exercise intervention on nurses' health-related physical fitness. Regular exercise that includes gymnastics or aerobics has a positive effect on fitness. In Taiwan, there are few data assessing the effects of exercise intervention on nurses' health-related physical fitness. Many studies have reported the high incidence of musculoskeletal disorders (MSDs) in nurses. However, there has been limited research on intervention programs that are designed to improve the general physical fitness of nurses. A quasi-experimental study was conducted at a medical centre in central Taiwan. Ninety nurses from five different units of a hospital volunteered to participate in this study and were assigned to an experimental group or a control group. The experimental group engaged in a three-month intervention program consisting of treadmill exercise. Indicators of the health-related physical fitness of both groups were established and assessed before and after the intervention. Before intervention, the control group had significantly better grasp strength, flexibility and durability of abdominal muscles than the experimental group (p < 0.05). After the intervention, logistic regression adjusting for marital status, work duration, regular exercise and workload showed that the experimental group performed significantly better (p < 0.05) on body mass index, grasp strength, flexibility, durability of abdominal and back muscles and cardiopulmonary function. This study demonstrates that the development and implementation of an intervention program can promote and improve the health-related physical fitness of nurses. It is suggested that nurses engage in an exercise program while in the workplace to lower the risk of MSDs and to promote working efficiency.
A Solution to Separation and Multicollinearity in Multiple Logistic Regression
Shen, Jianzhao; Gao, Sujuan
2010-01-01
In dementia screening tests, item selection for shortening an existing screening test can be achieved using multiple logistic regression. However, maximum likelihood estimates for such logistic regression models often experience serious bias or even non-existence because of separation and multicollinearity problems resulting from a large number of highly correlated items. Firth (1993, Biometrika, 80(1), 27–38) proposed a penalized likelihood estimator for generalized linear models and it was shown to reduce bias and the non-existence problems. Ridge regression has been used in logistic regression to stabilize the estimates in cases of multicollinearity. However, neither method solves the problem that the other addresses. In this paper, we propose a double penalized maximum likelihood estimator combining Firth’s penalized likelihood equation with a ridge parameter. We present a simulation study evaluating the empirical performance of the double penalized likelihood estimator in small to moderate sample sizes. We demonstrate the proposed approach using current screening data from a community-based dementia study. PMID:20376286
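As a rough guide to how the two penalties combine, the objective functions can be sketched as follows. The Firth and ridge forms are the standard ones; the combined form is an assumption about how they are joined, and the exact scaling of the ridge term in the paper may differ. Here ℓ(β) denotes the ordinary log-likelihood, I(β) the Fisher information matrix, and λ ≥ 0 the ridge parameter.

```latex
\begin{align*}
\ell_{\text{Firth}}(\beta)  &= \ell(\beta) + \tfrac{1}{2}\log\lvert I(\beta)\rvert
  && \text{(addresses bias and separation)} \\
\ell_{\text{ridge}}(\beta)  &= \ell(\beta) - \lambda \lVert \beta \rVert^{2}
  && \text{(addresses multicollinearity)} \\
\ell_{\text{double}}(\beta) &= \ell(\beta) + \tfrac{1}{2}\log\lvert I(\beta)\rvert
                               - \lambda \lVert \beta \rVert^{2}
  && \text{(assumed combined form)}
\end{align*}
```

Setting λ = 0 recovers Firth's estimator, so the ridge parameter can be read as controlling how much additional shrinkage is applied on top of the bias-reducing penalty.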
A Solution to Separation and Multicollinearity in Multiple Logistic Regression.
Shen, Jianzhao; Gao, Sujuan
2008-10-01
In dementia screening tests, item selection for shortening an existing screening test can be achieved using multiple logistic regression. However, maximum likelihood estimates for such logistic regression models often experience serious bias or even non-existence because of separation and multicollinearity problems resulting from a large number of highly correlated items. Firth (1993, Biometrika, 80(1), 27-38) proposed a penalized likelihood estimator for generalized linear models and it was shown to reduce bias and the non-existence problems. Ridge regression has been used in logistic regression to stabilize the estimates in cases of multicollinearity. However, neither method solves the problem that the other addresses. In this paper, we propose a double penalized maximum likelihood estimator combining Firth's penalized likelihood equation with a ridge parameter. We present a simulation study evaluating the empirical performance of the double penalized likelihood estimator in small to moderate sample sizes. We demonstrate the proposed approach using current screening data from a community-based dementia study.
Ye, Dong-qing; Hu, Yi-song; Li, Xiang-pei; Huang, Fen; Yang, Shi-gui; Hao, Jia-hu; Yin, Jing; Zhang, Guo-qing; Liu, Hui-hui
2004-11-01
To explore the impact of environmental factors, daily lifestyle, psycho-social factors and the interactions between environmental factors and chemokine genes on systemic lupus erythematosus (SLE). A case-control study was carried out and environmental factors for SLE were analyzed by univariate and multivariate unconditional logistic regression. Interactions between environmental factors and chemokine polymorphisms contributing to SLE were also analyzed with a logistic regression model. Nineteen factors were associated with SLE in univariate unconditional logistic regression. However, in multivariate unconditional logistic regression, only five factors were associated with the disease: drinking well water (OR=0.099) was a protective factor for SLE, whereas multiple drug allergy (OR=8.174), over-exposure to sunshine (OR=18.339), taking antibiotics (OR=9.630) and oral contraceptives were risk factors for SLE. The unconditional logistic regression model showed an interaction between eating irritable food and the -2518MCP-1G/G genotype (OR=4.387). No interaction between environmental factors contributing to SLE was found in this study. Many environmental factors were related to SLE, and there was an interaction between the -2518MCP-1G/G genotype and eating irritable food.
Mielniczuk, Jan; Teisseyre, Paweł
2018-03-01
Detection of gene-gene interactions is one of the most important challenges in genome-wide case-control studies. Besides traditional logistic regression analysis, entropy-based methods have recently attracted significant attention. Among entropy-based methods, interaction information is one of the most promising measures, having many desirable properties. Although both logistic regression and interaction information have been used in several genome-wide association studies, the relationship between them has not been thoroughly investigated theoretically. The present paper attempts to fill this gap. We show that although certain connections between the two methods exist, in general they refer to two different concepts of dependence, and looking for interactions in those two senses leads to different approaches to interaction detection. We introduce an ordering between interaction measures and specify conditions for independent and dependent genes under which interaction information is a more discriminative measure than logistic regression. Moreover, we show that for so-called perfect distributions those measures are equivalent. The numerical experiments illustrate the theoretical findings, indicating that interaction information and its modified version are more universal tools for detecting various types of interaction than logistic regression and linkage disequilibrium measures. © 2017 WILEY PERIODICALS, INC.
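The two notions of interaction contrasted above can be written side by side. These are the standard definitions rather than the paper's exact notation: X₁ and X₂ denote two genetic markers (here coded numerically for simplicity, which is an assumption), Y the case-control status, and I(·;·) mutual information.

```latex
\begin{align*}
% Interaction information: information about Y carried jointly by (X_1, X_2)
% beyond what each marker carries on its own
II(X_1; X_2; Y) &= I\bigl((X_1, X_2); Y\bigr) - I(X_1; Y) - I(X_2; Y), \\
% Logistic regression: interaction as a non-zero product-term coefficient
\log \frac{P(Y = 1 \mid x_1, x_2)}{P(Y = 0 \mid x_1, x_2)}
  &= \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \beta_{12}\, x_1 x_2 .
\end{align*}
```

The first is a property of the joint distribution and makes no modelling assumption, whereas the second is defined relative to the logit scale, which is one way to see why the two can disagree about whether an interaction is present.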
Effects of Inventory Bias on Landslide Susceptibility Calculations
NASA Technical Reports Server (NTRS)
Stanley, T. A.; Kirschbaum, D. B.
2017-01-01
Many landslide inventories are known to be biased, especially inventories for large regions such as Oregon's SLIDO or NASA's Global Landslide Catalog. These biases must affect the results of empirically derived susceptibility models to some degree. We evaluated the strength of the susceptibility model distortion from postulated biases by truncating an unbiased inventory. We generated a synthetic inventory from an existing landslide susceptibility map of Oregon, then removed landslides from this inventory to simulate the effects of reporting biases likely to affect inventories in this region, namely population and infrastructure effects. Logistic regression models were fitted to the modified inventories. Then the process of biasing a susceptibility model was repeated with SLIDO data. We evaluated each susceptibility model with qualitative and quantitative methods. Results suggest that the effects of landslide inventory bias on empirical models should not be ignored, even if those models are, in some cases, useful. We suggest fitting models in well-documented areas and extrapolating across the study region as a possible approach to modeling landslide susceptibility with heavily biased inventories.
Effects of Inventory Bias on Landslide Susceptibility Calculations
NASA Technical Reports Server (NTRS)
Stanley, Thomas; Kirschbaum, Dalia B.
2017-01-01
Many landslide inventories are known to be biased, especially inventories for large regions such as Oregon's SLIDO or NASA's Global Landslide Catalog. These biases must affect the results of empirically derived susceptibility models to some degree. We evaluated the strength of the susceptibility model distortion from postulated biases by truncating an unbiased inventory. We generated a synthetic inventory from an existing landslide susceptibility map of Oregon, then removed landslides from this inventory to simulate the effects of reporting biases likely to affect inventories in this region, namely population and infrastructure effects. Logistic regression models were fitted to the modified inventories. Then the process of biasing a susceptibility model was repeated with SLIDO data. We evaluated each susceptibility model with qualitative and quantitative methods. Results suggest that the effects of landslide inventory bias on empirical models should not be ignored, even if those models are, in some cases, useful. We suggest fitting models in well-documented areas and extrapolating across the study region as a possible approach to modelling landslide susceptibility with heavily biased inventories.
ERIC Educational Resources Information Center
Shih, Ching-Lin; Liu, Tien-Hsiang; Wang, Wen-Chung
2014-01-01
The simultaneous item bias test (SIBTEST) method regression procedure and the differential item functioning (DIF)-free-then-DIF strategy are applied to the logistic regression (LR) method simultaneously in this study. These procedures are used to adjust the effects of matching true score on observed score and to better control the Type I error…
Connecting clinical and actuarial prediction with rule-based methods.
Fokkema, Marjolein; Smits, Niels; Kelderman, Henk; Penninx, Brenda W J H
2015-06-01
Meta-analyses comparing the accuracy of clinical versus actuarial prediction have shown actuarial methods to outperform clinical methods, on average. However, actuarial methods are still not widely used in clinical practice, and there has been a call for the development of actuarial prediction methods for clinical practice. We argue that rule-based methods may be more useful than the linear main effect models usually employed in prediction studies, from a data and decision analytic as well as a practical perspective. In addition, decision rules derived with rule-based methods can be represented as fast and frugal trees, which, unlike main effects models, can be used in a sequential fashion, reducing the number of cues that have to be evaluated before making a prediction. We illustrate the usability of rule-based methods by applying RuleFit, an algorithm for deriving decision rules for classification and regression problems, to a dataset on prediction of the course of depressive and anxiety disorders from Penninx et al. (2011). The RuleFit algorithm provided a model consisting of 2 simple decision rules, requiring evaluation of only 2 to 4 cues. Predictive accuracy of the 2-rule model was very similar to that of a logistic regression model incorporating 20 predictor variables, originally applied to the dataset. In addition, the 2-rule model required, on average, evaluation of only 3 cues. Therefore, the RuleFit algorithm appears to be a promising method for creating decision tools that are less time consuming and easier to apply in psychological practice, and with accuracy comparable to traditional actuarial methods. (c) 2015 APA, all rights reserved.
What Counts When it Comes to School Enjoyment and Aspiration in the Middle Grades.
Smith, Megan L; Mann, Michael J; Georgieva, Zornitsa; Curtis, Reagan; Schimmel, Christine J
2016-01-01
Young adolescents, and the middle level educators who work with them, face many exciting but demanding challenges during this key period of development. According to stage-environment fit theory, the degree to which middle grades students perceive a good fit between their school environment and their needs impacts their academic and life outcomes. The authors endeavored to build on middle level research by studying the extent to which students' needs are supported by school environment factors and how this "fit" relates to two academic outcome variables: school enjoyment and aspiration. The sample consisted of middle grades students (N = 1,027) between the ages of 10 and 14. Hierarchical logistic regression analyses were conducted. After controlling for age, ethnicity, and gender, four subscales (Social Skills Needs, Mental Health Needs, Academic and Career Needs, and School Support) were entered as potential predictors. Both models were significant and accounted for ~20% of the variance. This study suggests that middle level educators, counselors, and administrators may benefit from considering ways to enhance the match between students' needs and the middle grades learning environment, especially by considering non-academic factors as a way to provide indirect, but powerful, support for academic and life success.
Access disparities to Magnet hospitals for patients undergoing neurosurgical operations
Missios, Symeon; Bekelis, Kimon
2017-01-01
Background: Centers of excellence focusing on quality improvement have demonstrated superior outcomes for a variety of surgical interventions. We investigated the presence of access disparities to hospitals recognized by the Magnet Recognition Program of the American Nurses Credentialing Center (ANCC) for patients undergoing neurosurgical operations. Methods: We performed a cohort study of all neurosurgery patients who were registered in the New York Statewide Planning and Research Cooperative System (SPARCS) database from 2009–2013. We examined the association of African-American race and lack of insurance with Magnet status hospitalization for neurosurgical procedures. A mixed effects propensity adjusted multivariable regression analysis was used to control for confounding. Results: During the study period, 190,535 neurosurgical patients met the inclusion criteria. Using a multivariable logistic regression, we demonstrate that African-Americans had lower admission rates to Magnet institutions (OR 0.62; 95% CI, 0.58–0.67). This persisted in a mixed effects logistic regression model (OR 0.77; 95% CI, 0.70–0.83) to adjust for clustering at the patient county level, and a propensity score adjusted logistic regression model (OR 0.75; 95% CI, 0.69–0.82). Additionally, lack of insurance was associated with lower admission rates to Magnet institutions (OR 0.71; 95% CI, 0.68–0.73), in a multivariable logistic regression model. This persisted in a mixed effects logistic regression model (OR 0.72; 95% CI, 0.69–0.74), and a propensity score adjusted logistic regression model (OR 0.72; 95% CI, 0.69–0.75). Conclusions: Using a comprehensive all-payer cohort of neurosurgery patients in New York State we identified an association of African-American race and lack of insurance with lower rates of admission to Magnet hospitals. PMID:28684152
Adjusting for Confounding in Early Postlaunch Settings: Going Beyond Logistic Regression Models.
Schmidt, Amand F; Klungel, Olaf H; Groenwold, Rolf H H
2016-01-01
Postlaunch data on medical treatments can be analyzed to explore adverse events or relative effectiveness in real-life settings. These analyses are often complicated by the number of potential confounders and the possibility of model misspecification. We conducted a simulation study to compare the performance of logistic regression, propensity score, disease risk score, and stabilized inverse probability weighting methods to adjust for confounding. Model misspecification was induced in the independent derivation dataset. We evaluated performance using relative bias and confidence interval coverage of the true effect, among other metrics. At low events per coefficient (1.0 and 0.5), the logistic regression estimates had a large relative bias (greater than -100%). Bias of the disease risk score estimates was at most 13.48% and 18.83%. For the propensity score model, this was 8.74% and >100%, respectively. At events per coefficient of 1.0 and 0.5, inverse probability weighting frequently failed or reduced to a crude regression, resulting in biases of -8.49% and 24.55%. Coverage of logistic regression estimates became less than the nominal level at events per coefficient ≤5. For the disease risk score, inverse probability weighting, and propensity score, coverage became less than nominal at events per coefficient ≤2.5, ≤1.0, and ≤1.0, respectively. Bias of misspecified disease risk score models was 16.55%. In settings with low events/exposed subjects per coefficient, disease risk score methods can be useful alternatives to logistic regression models, especially when propensity score models cannot be used. Despite better performance of disease risk score methods than logistic regression and propensity score models in small events per coefficient settings, bias and coverage still deviated from nominal.
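As an illustration of one of the methods compared above, stabilized inverse probability weights divide the marginal probability of the treatment actually received by the propensity of receiving it given covariates. The sketch below is a minimal version under stated assumptions: the function name and the input values are hypothetical, and the propensity scores are taken as given (in practice they would typically be estimated, for example with a logistic regression model). It is not the simulation code used in the study.

```python
def stabilized_weights(treated, propensity):
    """Stabilized inverse probability of treatment weights.

    treated:    list of 0/1 treatment indicators
    propensity: list of P(treated = 1 | covariates), assumed already estimated
    """
    # Marginal probability of treatment, used as the stabilizing numerator.
    p_treated = sum(treated) / len(treated)
    weights = []
    for a, ps in zip(treated, propensity):
        if a == 1:
            weights.append(p_treated / ps)
        else:
            weights.append((1.0 - p_treated) / (1.0 - ps))
    return weights


# Hypothetical toy data, not taken from the study.
example_treated = [1, 0, 0, 0]
example_propensity = [0.25, 0.25, 0.5, 0.5]
example_weights = stabilized_weights(example_treated, example_propensity)
```

Propensity scores close to 0 or 1 produce very large weights, which is one reason weighting can become unstable when events or exposed subjects are scarce.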
Jean, J-S; Guo, H-R; Chen, S-H; Liu, C-C; Chang, W-T; Yang, Y-J; Huang, M-C
2006-12-01
To determine the association between rainfall rate and occurrence of enterovirus infection related to contamination of drinking water. One fatal case and three cases of severe illness were observed during the enterovirus epidemic in a village in southern Taiwan from 16 September to 3 October 1998. Groundwater samples were collected from the public well in the village after heavy rainfall to test for enterovirus using the reverse transcription-polymerase chain reaction (RT-PCR) assay. The RT-PCR assay detected the enterovirus in the groundwater sample collected on 26 September 1998. The logistic regression model also revealed a statistically significant association between the rainfall rate and the observation of cases of enterovirus infection. According to the fitted logistic regression model, the probability of detecting cases of enterovirus infection was greater than 50% at rainfall rates >31 mm h⁻¹. The higher the rainfall rate, the higher the probability of an enterovirus epidemic. Contamination of drinking water by the enterovirus may lead to epidemics that cause deaths and severe illness, and such contamination may be caused by heavy rainfall. The major finding in this study is that the enterovirus could be flushed to groundwater in an unconfined aquifer after a heavy rainfall. This work allows for a warning level so that an action can be taken to minimize future outbreaks and so protect public health.
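The 50% threshold quoted above follows from the form of a one-predictor logistic model: the predicted probability equals 0.5 exactly where the linear predictor is zero, that is, at x = -b0/b1. The sketch below illustrates this with hypothetical coefficients chosen only so that the threshold lands near 31 mm/h; the abstract does not report the fitted values, so `b0` and `b1` are assumptions.

```python
import math


def logistic_probability(intercept, slope, x):
    """Predicted probability from a one-predictor logistic model."""
    return 1.0 / (1.0 + math.exp(-(intercept + slope * x)))


def half_probability_threshold(intercept, slope):
    """Predictor value at which the predicted probability equals 0.5."""
    return -intercept / slope


# Hypothetical coefficients, NOT taken from the study.
b0, b1 = -6.2, 0.2
threshold = half_probability_threshold(b0, b1)  # rainfall rate in mm/h
```

With a positive slope, any rainfall rate above `threshold` gives a predicted probability above 0.5, which matches the direction of the reported association.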
Ahonen, Emily Q; Nebot, Manel; Giménez, Emmanuel
2007-01-01
Poor mental health is a common problem in adolescence. Little information is available, however, about the factors influencing negative mood states in otherwise healthy adolescents. We aimed to describe the mood states and related factors in a sample of adolescents in the city of Barcelona (Spain). We administered a health survey to a sample of 2,727 students from public, subsidized, and private schools in Barcelona, aged approximately 14, 16, and 18 years old. To analyze the associations among moods and related factors, we used bivariate logistic regression, and fitted multivariate logistic regressions using the statistically significant variables from the bivariate analysis. To examine the possible group effects of the school on individual students, we employed multilevel analysis. The frequencies of negative mood states increased with age, with girls consistently reporting more frequent negative mood states than boys. The factors associated with negative mood states were problematic alcohol use, perceived mistreatment or abuse, antisocial behavior, intention to use or current use of illegal drugs (not including cannabis), lower perceived academic performance, and feeling isolated. Mood states are influenced by lifestyle and social factors, about which there is little local information. To plan and implement appropriate public health interventions, more complete information about the possible areas of influence is required. To complement the information obtained from studies such as the present study, longitudinal and qualitative studies would be desirable.
Workie, Demeke Lakew; Zike, Dereje Tesfaye; Fenta, Haile Mekonnen; Mekonnen, Mulusew Admasu
2017-09-01
Unintended pregnancy related to unmet need is a worldwide problem that affects societies. The main objective of this study was to identify the prevalence and determinants of unmet need for family planning among women aged 15-49 in Ethiopia. The Performance Monitoring and Accountability 2020/Ethiopia survey was conducted in April 2016 (round 4) among 7494 women using two-stage stratified sampling. Bi-variable and multi-variable binary logistic regression models with complex sampling design were fitted. The prevalence of unmet need for family planning was 16.2% in Ethiopia. Women aged 15-24 years were 2.266 times more likely to have unmet need for family planning compared to those above 35 years. Women who were currently married were about 8 times more likely to have unmet need for family planning compared to never married women. Women who had no under-five child were 0.125 times less likely to have unmet need for family planning compared to those who had more than two under-five children. The key determinants of unmet need for family planning in Ethiopia were residence, age, marital status, education, household members, birth events and number of under-five children. Thus the Government of Ethiopia should take immediate steps to address the causes of high unmet need for family planning among women.
Pfeiffer, R M; Riedl, R
2015-08-15
We assess the asymptotic bias of estimates of exposure effects conditional on covariates when summary scores of confounders, instead of the confounders themselves, are used to analyze observational data. First, we study regression models for cohort data that are adjusted for summary scores. Second, we derive the asymptotic bias for case-control studies when cases and controls are matched on a summary score, and then analyzed either using conditional logistic regression or by unconditional logistic regression adjusted for the summary score. Two scores, the propensity score (PS) and the disease risk score (DRS) are studied in detail. For cohort analysis, when regression models are adjusted for the PS, the estimated conditional treatment effect is unbiased only for linear models, or at the null for non-linear models. Adjustment of cohort data for DRS yields unbiased estimates only for linear regression; all other estimates of exposure effects are biased. Matching cases and controls on DRS and analyzing them using conditional logistic regression yields unbiased estimates of exposure effect, whereas adjusting for the DRS in unconditional logistic regression yields biased estimates, even under the null hypothesis of no association. Matching cases and controls on the PS yields unbiased estimates only under the null for both conditional and unconditional logistic regression, adjusted for the PS. We study the bias for various confounding scenarios and compare our asymptotic results with those from simulations with limited sample sizes. To create realistic correlations among multiple confounders, we also based simulations on a real dataset. Copyright © 2015 John Wiley & Sons, Ltd.
Nie, Z Q; Ou, Y Q; Zhuang, J; Qu, Y J; Mai, J Z; Chen, J M; Liu, X Q
2016-05-01
Conditional and unconditional logistic regression analyses are commonly used in case-control studies, whereas the Cox proportional hazards model is often used in survival data analysis. Most of the literature refers only to main-effect models; however, generalized linear models differ from general linear models, and interaction comprises multiplicative interaction and additive interaction. The former has only statistical significance, whereas the latter has biological significance. In this paper, macros were written in SAS 9.4 to calculate the contrast ratio, the attributable proportion due to interaction, and the synergy index alongside the interaction terms of logistic and Cox regression, and Wald, delta, and profile likelihood confidence intervals were used to evaluate additive interaction, as a reference for big data analysis in clinical epidemiology and for the analysis of genetic multiplicative and additive interactions.
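The additive interaction measures named above have standard closed forms when expressed in terms of relative risks (or odds ratios used as approximations) for exposure to the first factor only, the second factor only, and both. The sketch below is written in Python rather than SAS and is not the authors' macro; it assumes that the "contrast ratio" in the abstract corresponds to the measure usually called the relative excess risk due to interaction (RERI), and the function name and input values are hypothetical.

```python
def additive_interaction(rr_10, rr_01, rr_11):
    """Point estimates of three additive interaction measures.

    rr_10: relative risk (or odds ratio) for factor A only
    rr_01: relative risk (or odds ratio) for factor B only
    rr_11: relative risk (or odds ratio) for both factors
    All are relative to the group exposed to neither factor.
    """
    # Relative excess risk due to interaction (0 means no additive interaction).
    reri = rr_11 - rr_10 - rr_01 + 1.0
    # Attributable proportion due to interaction (0 means no additive interaction).
    ap = reri / rr_11
    # Synergy index (1 means no additive interaction); undefined when the
    # denominator is zero, which this sketch does not guard against.
    synergy = (rr_11 - 1.0) / ((rr_10 - 1.0) + (rr_01 - 1.0))
    return reri, ap, synergy


# Hypothetical values, not taken from the paper.
reri, ap, synergy = additive_interaction(2.0, 3.0, 7.0)
```

Confidence intervals (Wald, delta method, or profile likelihood, as listed in the abstract) require the covariance matrix of the fitted coefficients and are not covered by this sketch.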
Hidden Connections between Regression Models of Strain-Gage Balance Calibration Data
NASA Technical Reports Server (NTRS)
Ulbrich, Norbert
2013-01-01
Hidden connections between regression models of wind tunnel strain-gage balance calibration data are investigated. These connections become visible whenever balance calibration data is supplied in its design format and both the Iterative and Non-Iterative Method are used to process the data. First, it is shown how the regression coefficients of the fitted balance loads of a force balance can be approximated by using the corresponding regression coefficients of the fitted strain-gage outputs. Then, data from the manual calibration of the Ames MK40 six-component force balance is chosen to illustrate how estimates of the regression coefficients of the fitted balance loads can be obtained from the regression coefficients of the fitted strain-gage outputs. The study illustrates that load predictions obtained by applying the Iterative or the Non-Iterative Method originate from two related regression solutions of the balance calibration data as long as balance loads are given in the design format of the balance, gage outputs behave highly linearly, strict statistical quality metrics are used to assess regression models of the data, and regression model term combinations of the fitted loads and gage outputs can be obtained by a simple variable exchange.
Ingre, Michael; Åkerstedt, Torbjörn; Ekstedt, Mirjam; Kecklund, Göran
2012-07-01
The main objective of the present study was to investigate relative personal fit as the association between rated needs and preferences for work hours, on the one hand, and actual work hours, on the other hand, in three groups (hospital, call-center, and police) working with periodic self-rostering. We also examined the association between personal fit and satisfaction with the work schedule and preference for a fixed and regular shift schedule, respectively. We collected questionnaire data and objective work hour data over 6-12 months from the computerized self-rostering system. The response rate of the questionnaire was 69% at the hospital and call-center and 98% among the police. In total, 29 433 shifts for 285 shift workers were included in the study. Data were analyzed by means of mixed ANOVA, Kendall tau correlations and ordinal (proportional odds) logistic regression. The results show that evening types worked relatively more hours during the evening and night hours compared to morning types as an indication of relative personal fit. Relative personal fit was also found for long shift, short rest, and morning-, evening- and night-shift frequency, but only personal fit related to morning, evening and night-shift was associated with satisfaction with work hours. Reported conflicts at the workplace about work hours and problems with lack of predictability of time for family/leisure activities were associated with poor satisfaction and a preference for a fixed shift schedule. The present study shows that periodic self-rostering is associated with relative personal fit, in particular with respect to night, evening, and morning work. Personal fit seems to be associated with satisfaction with work hours and may be a moderator of tolerance to shift work exposure.
Aerobic Fitness, Micronutrient Status, and Academic Achievement in Indian School-Aged Children
Desai, Ishaan K.; Kurpad, Anura V.; Chomitz, Virginia R.; Thomas, Tinku
2015-01-01
Aerobic fitness has been shown to have several beneficial effects on child health. However, research on its relationship with academic performance has been limited, particularly in developing countries and among undernourished populations. This study examined the association between aerobic fitness and academic achievement in clinically healthy but nutritionally compromised Indian school-aged children and assessed whether micronutrient status affects this association. 273 participants, aged 7 to 10.5 years, were enrolled from three primary schools in Bangalore, India. Data on participants’ aerobic fitness (20-m shuttle test), demographics, anthropometry, diet, physical activity, and micronutrient status were abstracted. School-wide exam scores in mathematics and Kannada language served as indicators of academic performance and were standardized by grade level. The strength of the fitness/achievement association was analyzed using Spearman’s rank correlation, multiple variable logistic regression, and multi-level models. Significant positive correlations between aerobic capacity (VO2 peak) and academic scores in math and Kannada were observed (P < 0.05). After standardizing scores across grade levels and adjusting for school, gender, socioeconomic status, and weight status (BMI Z-score), children with greater aerobic capacities (mL·kg⁻¹·min⁻¹) had greater odds of scoring above average on math and Kannada exams (OR=1.08, 95% CI: 1.02 to 1.15 and OR=1.11, 95% CI: 1.04 to 1.18, respectively). This association remained significant after adjusting for micronutrient deficiencies. These findings provide preliminary evidence of a fitness/achievement association in Indian children. While the mechanisms by which aerobic fitness may be linked to academic achievement require further investigation, the results suggest that educators and policymakers should consider the adequacy of opportunities for physical activity and fitness in schools for both their physical and potential academic benefits. PMID:25806824
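An odds ratio reported per unit of a continuous predictor, such as the OR of 1.08 per mL·kg⁻¹·min⁻¹ above, can be rescaled to a larger difference by exponentiation, because logistic regression is linear on the log-odds scale. The sketch below assumes that this linearity holds over the range considered; the function name and the 5-unit difference are illustrative choices, not values from the study.

```python
def odds_ratio_for_change(or_per_unit, delta):
    """Odds ratio implied for a change of `delta` units in the predictor,
    assuming the log-odds are linear in the predictor."""
    return or_per_unit ** delta


# OR for math scores per 1 mL/kg/min as reported in the abstract;
# the 5-unit difference is a hypothetical example.
or_five_units = odds_ratio_for_change(1.08, 5)
```

The same rescaling applies to each confidence limit, since the limits are also defined on the log-odds scale.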
Sui, Xuemei; Sarzynski, Mark A.; Lee, Duck-chul; Lavie, Carl J.; Zhang, Jiajia; Kokkinos, Peter F.; Payne, Jonathan; Blair, Steven N.
2016-01-01
Background Most of the existing literature has linked either a baseline cardiorespiratory fitness, or change between baseline and one follow-up measurement of cardiorespiratory fitness, to hypertension. The purpose of the study is to assess the association between longitudinal patterns of cardiorespiratory fitness changes with time and incident hypertension in adult men and women. Patients and Methods Participants were aged from 20 to 82 years, free of hypertension during the first three examinations, and received at least four preventive medical examinations at the Cooper Clinic in Dallas, TX, during 1971 – 2006. They were classified into one of five groups based on all of the measured cardiorespiratory fitness values (in metabolic equivalents) during maximal treadmill tests. Logistic regression was used to compute odds ratios and 95% confidence intervals. Results Among a total of 4,932 participants (13% women), 1,954 developed hypertension. After controlling for baseline potential confounders, follow-up duration, and number of follow-up visits, odds ratios (95% confidence intervals) for hypertension were: 1.00 for decreasing group (referent), 0.64 (0.52–0.80) for increasing, 0.89 (0.70–1.12) for Bell-shape, 0.78 (0.62–0.98) for U-shape, and 0.83 (0.69–1.00) for inconsistent group. The general pattern of the association was consistent regardless of participants’ baseline cardiorespiratory fitness or body mass index levels. Conclusion An increasing pattern of cardiorespiratory fitness provides the lowest risk of hypertension in this middle-aged relatively healthy population. Identifying specific pattern(s) of cardiorespiratory fitness change may be important for determining associations with comorbidity such as hypertension. PMID:27986522
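Odds ratios with 95% confidence intervals, as reported above, are usually Wald intervals computed on the log-odds scale, so the standard error of the log odds ratio can be recovered approximately from the published limits. The sketch below assumes a Wald interval was used (the abstract does not say) and applies it to the reported estimate for the increasing-fitness group, 0.64 (0.52–0.80); the function names are hypothetical, and rounding in the published figures limits the precision.

```python
import math

Z_95 = 1.959964  # standard normal quantile for a two-sided 95% interval


def log_or_standard_error(lower, upper, z=Z_95):
    """Approximate SE of log(OR) recovered from reported Wald CI limits."""
    return (math.log(upper) - math.log(lower)) / (2.0 * z)


def wald_interval(odds_ratio, se, z=Z_95):
    """Wald confidence interval for an odds ratio, built on the log scale."""
    log_or = math.log(odds_ratio)
    return math.exp(log_or - z * se), math.exp(log_or + z * se)


# Reported values for the increasing-fitness group: OR 0.64 (0.52-0.80).
se_increasing = log_or_standard_error(0.52, 0.80)
ci_lower, ci_upper = wald_interval(0.64, se_increasing)
```

The reconstructed limits agree with the published ones only up to rounding, which is expected when working back from two-decimal figures.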
Follow-Up of Abnormal Breast and Colorectal Cancer Screening by Race/Ethnicity.
McCarthy, Anne Marie; Kim, Jane J; Beaber, Elisabeth F; Zheng, Yingye; Burnett-Hartman, Andrea; Chubak, Jessica; Ghai, Nirupa R; McLerran, Dale; Breen, Nancy; Conant, Emily F; Geller, Berta M; Green, Beverly B; Klabunde, Carrie N; Inrig, Stephen; Skinner, Celette Sugg; Quinn, Virginia P; Haas, Jennifer S; Schnall, Mitchell; Rutter, Carolyn M; Barlow, William E; Corley, Douglas A; Armstrong, Katrina; Doubeni, Chyke A
2016-10-01
Timely follow-up of abnormal tests is critical to the effectiveness of cancer screening, but may vary by screening test, healthcare system, and sociodemographic group. Timely follow-up of abnormal mammogram and fecal occult blood testing or fecal immunochemical tests (FOBT/FIT) were compared by race/ethnicity using Population-Based Research Optimizing Screening through Personalized Regimens consortium data. Participants were women with an abnormal mammogram (aged 40-75 years) or FOBT/FIT (aged 50-75 years) in 2010-2012. Analyses were performed in 2015. Timely follow-up was defined as colonoscopy ≤3 months following positive FOBT/FIT; additional imaging or biopsy ≤3 months following Breast Imaging Reporting and Data System Category 0, 4, or 5 mammograms; or ≤9 months following Category 3 mammograms. Logistic regression was used to model receipt of timely follow-up adjusting for study site, age, year, insurance, and income. Among 166,602 mammograms, 10.7% were abnormal; among 566,781 FOBT/FITs, 4.3% were abnormal. Nearly 96% of patients with abnormal mammograms received timely follow-up versus 68% with abnormal FOBT/FIT. There was greater variability in receipt of follow-up across healthcare systems for positive FOBT/FIT than for abnormal mammograms. For mammography, black women were less likely than whites to receive timely follow-up (91.8% vs 96.0%, OR=0.71, 95% CI=0.51, 0.97). For FOBT/FIT, Hispanics were more likely than whites to receive timely follow-up (70.0% vs 67.6%, OR=1.12, 95% CI=1.04, 1.21). Timely follow-up among women was more likely for abnormal mammograms than FOBT/FITs, with small variations in follow-up rates by race/ethnicity and larger variation across healthcare systems. Copyright © 2016 American Journal of Preventive Medicine. All rights reserved.
Zhang, Guangnan; Li, Yanyan; King, Mark J; Zhong, Qiaoting
2018-03-21
Motor vehicle overloading is correlated with the possibility of road crash occurrence and severity. Although overloading of motor vehicles is pervasive in developing nations, few empirical analyses have been performed on factors that might influence the occurrence of overloading. This study aims to address this shortcoming by seeking evidence from several years of crash data from Guangdong province, China. Data on overloading and other factors are extracted for crash-involved vehicles from traffic crash records for 2006-2010 provided by the Traffic Management Bureau in Guangdong province. Logistic regression is applied to identify risk factors for overloading in crash-involved vehicles and within these crashes to identify factors contributing to greater crash severity. Driver, vehicle, road and environmental characteristics and violation types are considered in the regression models. In addition to the basic logistic models, association analysis is employed to identify the potential interactions among different risk factors while fitting the logistic models of overloading and severity. Crash-involved vehicles driven by males from rural households and in an unsafe condition are more likely to be overloaded and to be involved in higher severity overloaded vehicle crashes. If overloaded vehicles speed, the risk of severe traffic crash casualties increases. Young drivers (aged under 25 years) in mountainous areas are more likely to be involved in higher severity overloaded vehicle crashes. This study identifies several factors associated with overloading in crash-involved vehicles and with higher severity overloading crashes and provides an important reference for future research on those specific risk factors. © Article author(s) (or their employer(s) unless otherwise stated in the text of the article) 2018. All rights reserved. No commercial use is permitted unless otherwise expressly granted.
Hifinger, Monika; Putrik, Polina; Ramiro, Sofia; Keszei, András P; Hmamouchi, Ihsane; Dougados, Maxime; Gossec, Laure; Boonen, Annelies
2016-04-01
To investigate the relationship between country of residence and fatigue in RA, and to explore which country characteristics are related to fatigue. Data from the multinational COMORA study were analysed. Contribution of country of residence to level of fatigue [0-10 on visual analogue scale (VAS)] and presence of severe fatigue (VAS ≥ 5) was explored in multivariable linear or logistic regression models including first socio-demographics and objective disease outcomes (M1), and then also subjective outcomes (M2). Next, country of residence was replaced by country characteristics: gross domestic product (GDP), human development index (HDI), latitude (as indicator of climate), language and income inequality index (gini-index). Model fit (R²) for linear models was compared. A total of 3920 patients from 17 countries were included, mean age 56 years (s.d. 13), 82% females. Mean fatigue across countries ranged from 1.86 (s.d. 2.46) to 4.99 (s.d. 2.64) and proportion of severe fatigue from 14% (Venezuela) to 65% (Egypt). Objective disease outcomes did not explain much of the variation in fatigue (R² = 0.12), while subjective outcomes had a strong negative impact and partly explained the variation in fatigue (R² = 0.27). Country of residence had a significant additional effect (increasing model fit to R² = 0.20 and R² = 0.36, respectively). Remarkably, higher GDP and better HDI were associated with higher fatigue, and explained a large part of the country effect. Logistic regression confirmed the limited contribution of objective outcomes and the relevant contribution of country of residence. Country of residence has an important influence on fatigue. Paradoxically, patients from wealthier countries had higher fatigue. © The Author 2015. Published by Oxford University Press on behalf of the British Society for Rheumatology. All rights reserved. For Permissions, please email: journals.permissions@oup.com.
Sullivan, Michael C; Yeo, Heather; Roman, Sanziana A; Bell, Richard H; Sosa, Julie A
2013-03-01
To determine how marital status and having children impact US general surgical residents' attitudes toward training and personal life. There is a paucity of research describing how family and children affect the experience of general surgery residents. Cross-sectional survey involving all US categorical general surgery residents. Responses were evaluated by resident/program characteristics. Statistical analysis included the χ² test and hierarchical logistic regression modeling. A total of 4402 residents were included (82.4% response rate) and categorized as married, single, or other (separated/divorced/widowed). Men were more likely to be married (57.8% vs 37.9%, P < 0.001) and have children (31.5% vs 12.0%, P < 0.001). Married residents were most likely to look forward to work (P < 0.001), and report happiness at work (P < 0.001) and a good program fit (P < 0.001). "Other" residents most frequently felt that work hours caused strain on family life (P < 0.001). Residents with children more frequently looked forward to work (P = 0.001), were happy at work (P = 0.001), and reported a good program fit (P = 0.034), but had strain on family life (P < 0.001), and worried about future finances (P = 0.005). On hierarchical logistic regression modeling, having children was predictive of a resident looking forward to work [odds ratio (OR): 1.22, P = 0.035], yet feeling that work caused family strain (OR: 1.66, P < 0.001); being single was associated with less strain (OR: 0.72, P < 0.001). The female gender was negatively associated with looking forward to work (OR: 0.81, P = 0.007). Residents who were married or parents reported greater satisfaction and work-life conflict. The complex effects of family on surgical residents should inform programs to target support mechanisms for their trainees.
Komatsu, Masayo; Nezu, Satoko; Tomioka, Kimiko; Hazaki, Kan; Harano, Akihiro; Morikawa, Masayuki; Takagi, Masahiro; Yamada, Masahiro; Matsumoto, Yoshitaka; Iwamoto, Junko; Ishizuka, Rika; Saeki, Keigo; Okamoto, Nozomi; Kurumatani, Norio
2013-01-01
To investigate factors associated with activities of daily living in independently living elderly persons in a community. The potential subjects were 4,472 individuals aged 65 years and older who voluntarily participated in a large cohort study, the Fujiwara-kyo study. We used self-administered questionnaires consisting of an activities of daily living (ADL) questionnaire with the Physical Fitness Test established by the Ministry of Education, Culture, Sports, Science and Technology (12 ADL items) to determine the index of higher-level physical independence, demographics, Geriatric Depression Scale, and so on. Mini-mental state examination, measurement of physical fitness, and blood tests were also carried out. A lower ADL level was defined as having a total score of the 12 ADL items (range, 12-36 points) that was below the first quartile of the total score for all the subjects. Factors associated with a low ADL level were examined by multiple logistic regression. A total of 4,198 remained as subjects for analysis. The male, female and 5-year age groups showed significant differences in the median score of 12 ADL items between any two groups. The highest odds ratio among factors associated with lower ADL level by multiple logistic regression with mutually adjusted independent variables was 4.49 (95%CI: 2.82-7.17) in the groups of "very sharp pain" or "strong pain" during the last month. Low physical ability, self-awareness of limb weakness, a BMI of over 25, low physical activity, cerebrovascular disorder, depression, low cognitive function, unable "to see normally", unable "to hear someone", and "muscle, bone and joint pain" were independently associated with lower ADL level. Multiple factors are associated with lower ADL level assessed on the basis of the 12 ADL items.
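The outcome definition above (a total ADL score below the first quartile of all subjects) amounts to a simple dichotomization before logistic regression. The sketch below shows one way to derive such a binary outcome using only the Python standard library; the scores are hypothetical, and the quartile method used by the authors is not stated in the abstract, so the default method of `statistics.quantiles` is an assumption.

```python
import statistics


def flag_lower_adl(total_scores):
    """Return True for each subject whose total score is below the first
    quartile of all subjects' total scores."""
    # quantiles(..., n=4) returns the three quartile cut points; [0] is Q1.
    first_quartile = statistics.quantiles(total_scores, n=4)[0]
    return [score < first_quartile for score in total_scores]


# Hypothetical total scores on the 12-item scale (possible range 12-36).
example_scores = [12, 20, 24, 28, 30, 32, 34, 36]
lower_adl_flags = flag_lower_adl(example_scores)
```

The resulting flags could then serve as the binary dependent variable in a multiple logistic regression of the kind described above.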
Zhang, Ying-Ying; Zhou, Xiao-Bin; Wang, Qiu-Zhen; Zhu, Xiao-Yan
2017-05-01
Multivariable logistic regression (MLR) has been increasingly used in Chinese clinical medical research during the past few years. However, few evaluations of the quality of the reporting strategies in these studies are available. To evaluate the reporting quality and model accuracy of MLR used in published work, and to provide related advice for authors, readers, reviewers, and editors. A total of 316 articles published in 5 leading Chinese clinical medical journals with high impact factor from January 2010 to July 2015 were selected for evaluation. Articles were evaluated according to 12 established criteria for proper use and reporting of MLR models. Among the articles, the highest quality score was 9, the lowest 1, and the median 5 (4-5). A total of 85.1% of the articles scored below 6. No significant differences were found among these journals with respect to quality score (χ² = 6.706, P = .15). More than 50% of the articles met the following 5 criteria: complete identification of the statistical software application that was used (97.2%), calculation of the odds ratio and its confidence interval (86.4%), description of sufficient events (>10) per variable, selection of variables, and fitting procedure (78.2%, 69.3%, and 58.5%, respectively). Less than 35% of the articles reported the coding of variables (18.7%). The remaining 5 criteria were not satisfied by a sufficient number of articles: goodness-of-fit (10.1%), interactions (3.8%), checking for outliers (3.2%), collinearity (1.9%), and participation of statisticians and epidemiologists (0.3%). The criterion of conformity with linear gradients was applicable to 186 articles; however, only 7 (3.8%) mentioned or tested it. The reporting quality and model accuracy of MLR in selected articles were not satisfactory. In fact, severe deficiencies were noted. Only 1 article scored 9. We recommend that authors, readers, reviewers, and editors consider MLR models more carefully and cooperate more closely with statisticians and epidemiologists. Journals should develop statistical reporting guidelines concerning MLR.
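One of the criteria above, sufficient events (>10) per variable, is straightforward to check before fitting a model. The sketch below follows the common convention of using the smaller of the event and non-event counts as the limiting sample size; that convention, the function name, and the example counts are assumptions for illustration, and the review's exact scoring rule is not given in the abstract.

```python
def events_per_variable(n_events, n_nonevents, n_predictors):
    """Events per variable, using the less frequent outcome as the limit."""
    limiting_count = min(n_events, n_nonevents)
    return limiting_count / n_predictors


def meets_epv_criterion(n_events, n_nonevents, n_predictors, minimum=10):
    """True if events per variable exceeds the chosen minimum (>10 here)."""
    return events_per_variable(n_events, n_nonevents, n_predictors) > minimum


# Hypothetical study sizes, not taken from the reviewed articles.
adequate = meets_epv_criterion(80, 920, 6)    # about 13.3 events per variable
inadequate = meets_epv_criterion(40, 960, 6)  # about 6.7 events per variable
```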
Watanabe, Hiroyuki; Miyazaki, Hiroyasu
2006-01-01
Over- and/or under-correction of QT intervals for changes in heart rate may lead to misleading conclusions and/or masking the potential of a drug to prolong the QT interval. This study examines a nonparametric regression model (Loess Smoother) to adjust the QT interval for differences in heart rate, with an improved fitness over a wide range of heart rates. 240 sets of (QT, RR) observations collected from each of 8 conscious and non-treated beagle dogs were used as the materials for investigation. The fitness of the nonparametric regression model to the QT-RR relationship was compared with four models (individual linear regression, common linear regression, and Bazett's and Fridericia's correction models) with reference to Akaike's Information Criterion (AIC). Residuals were visually assessed. The bias-corrected AIC of the nonparametric regression model was the best of the models examined in this study. Although the parametric models did not fit, the nonparametric regression model improved the fitting at both fast and slow heart rates. The nonparametric regression model is the more flexible method compared with the parametric method. The mathematical fit for linear regression models was unsatisfactory at both fast and slow heart rates, while the nonparametric regression model showed significant improvement at all heart rates in beagle dogs.
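For reference, the two fixed-exponent corrections used as comparators above have simple closed forms: Bazett's formula divides QT by the square root of the RR interval, and Fridericia's divides by its cube root, with RR expressed in seconds. The sketch below implements only these two textbook formulas; it does not reproduce the Loess-based adjustment studied in the paper, and the example values are hypothetical.

```python
import math


def qtc_bazett(qt_ms, rr_s):
    """Bazett-corrected QT: QT divided by the square root of RR (seconds)."""
    return qt_ms / math.sqrt(rr_s)


def qtc_fridericia(qt_ms, rr_s):
    """Fridericia-corrected QT: QT divided by the cube root of RR (seconds)."""
    return qt_ms / rr_s ** (1.0 / 3.0)


# Hypothetical observation: QT of 200 ms at an RR interval of 0.64 s.
example_bazett = qtc_bazett(200.0, 0.64)
example_fridericia = qtc_fridericia(200.0, 0.64)
```

Both formulas leave QT unchanged at RR = 1 s and diverge from each other as the heart rate moves away from 60 beats per minute, which is where over- or under-correction becomes a concern.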
MODELING SNAKE MICROHABITAT FROM RADIOTELEMETRY STUDIES USING POLYTOMOUS LOGISTIC REGRESSION
Multivariate analysis of snake microhabitat has historically used techniques that were derived under assumptions of normality and common covariance structure (e.g., discriminant function analysis, MANOVA). In this study, polytomous logistic regression (PLR which does not require ...
Overhead longwave infrared hyperspectral material identification using radiometric models
DOE Office of Scientific and Technical Information (OSTI.GOV)
Zelinski, M. E.
Material detection algorithms used in hyperspectral data processing are computationally efficient but can produce relatively high numbers of false positives. Material identification performed as a secondary processing step on detected pixels can help separate true and false positives. This paper presents a material identification processing chain for longwave infrared hyperspectral data of solid materials collected from airborne platforms. The algorithms utilize unwhitened radiance data and an iterative algorithm that determines the temperature, humidity, and ozone of the atmospheric profile. Pixel unmixing is done using constrained linear regression and Bayesian Information Criteria for model selection. The resulting product includes an optimal atmospheric profile and full radiance material model that includes material temperature, abundance values, and several fit statistics. A logistic regression method utilizing all model parameters to improve identification is also presented. This paper details the processing chain and provides justification for the algorithms used. Several examples are provided using modeled data at different noise levels.
Brenn, T; Arnesen, E
1985-01-01
For comparative evaluation, discriminant analysis, logistic regression and Cox's model were used to select risk factors for total and coronary deaths among 6595 men aged 20-49 followed for 9 years. Groups with mortality between 5 and 93 per 1000 were considered. Discriminant analysis selected variable sets only marginally different from the logistic and Cox methods which always selected the same sets. A time-saving option, offered for both the logistic and Cox selection, showed no advantage compared with discriminant analysis. Analysing more than 3800 subjects, the logistic and Cox methods consumed, respectively, 80 and 10 times more computer time than discriminant analysis. When including the same set of variables in non-stepwise analyses, all methods estimated coefficients that in most cases were almost identical. In conclusion, discriminant analysis is advocated for preliminary or stepwise analysis, otherwise Cox's method should be used.
ERIC Educational Resources Information Center
DeMars, Christine E.
2009-01-01
The Mantel-Haenszel (MH) and logistic regression (LR) differential item functioning (DIF) procedures have inflated Type I error rates when there are large mean group differences, short tests, and large sample sizes. When there are large group differences in mean score, groups matched on the observed number-correct score differ on true score,…
Tchetgen Tchetgen, Eric
2011-03-01
This article considers the detection and evaluation of genetic effects incorporating gene-environment interaction and independence. Whereas ordinary logistic regression cannot exploit the assumption of gene-environment independence, the proposed approach makes explicit use of the independence assumption to improve estimation efficiency. This method, which uses both cases and controls, fits a constrained retrospective regression in which the genetic variant plays the role of the response variable, and the disease indicator and the environmental exposure are the independent variables. The regression model constrains the association of the environmental exposure with the genetic variant among the controls to be null, thus explicitly encoding the gene-environment independence assumption, which yields substantial gain in accuracy in the evaluation of genetic effects. The proposed retrospective regression approach has several advantages. It is easy to implement with standard software, and it readily accounts for multiple environmental exposures of a polytomous or of a continuous nature, while easily incorporating extraneous covariates. Unlike the profile likelihood approach of Chatterjee and Carroll (Biometrika. 2005;92:399-418), the proposed method does not require a model for the association of a polytomous or continuous exposure with the disease outcome, and, therefore, it is agnostic to the functional form of such a model and completely robust to its possible misspecification.
Satellite rainfall retrieval by logistic regression
NASA Technical Reports Server (NTRS)
Chiu, Long S.
1986-01-01
The potential use of logistic regression in rainfall estimation from satellite measurements is investigated. Satellite measurements provide covariate information in terms of radiances from different remote sensors. The logistic regression technique can effectively accommodate many covariates and test their significance in the estimation. The outcome from the logistical model is the probability that the rainrate of a satellite pixel is above a certain threshold. By varying the thresholds, a rainrate histogram can be obtained, from which the mean and the variance can be estimated. A logistical model is developed and applied to rainfall data collected during GATE, using as covariates the fractional rain area and a radiance measurement which is deduced from a microwave temperature-rainrate relation. It is demonstrated that the fractional rain area is an important covariate in the model, consistent with the use of the so-called Area Time Integral in estimating total rain volume in other studies. To calibrate the logistical model, simulated rain fields generated by rainfield models with prescribed parameters are needed. A stringent test of the logistical model is its ability to recover the prescribed parameters of simulated rain fields. A rain field simulation model which preserves the fractional rain area and lognormality of rainrates as found in GATE is developed. A stochastic regression model of branching and immigration whose solutions are lognormally distributed in some asymptotic limits has also been developed.
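The step from threshold exceedance probabilities to a rainrate histogram and its moments can be sketched as follows. This is a minimal illustration, not the report's procedure: the function names are hypothetical, each bin's mass is placed at the bin midpoint, mass at or below the lowest threshold is placed at that threshold, and mass above the highest threshold is placed at the highest threshold.

```python
def histogram_from_exceedance(thresholds, exceed_probs):
    """Turn P(R > t_k) at ascending thresholds t_0 < ... < t_K into a
    discrete distribution, returned as (values, probabilities)."""
    if len(thresholds) != len(exceed_probs) or len(thresholds) < 2:
        raise ValueError("need matching lists with at least two thresholds")
    values = [thresholds[0]]
    probs = [1.0 - exceed_probs[0]]          # mass at or below t_0
    for k in range(len(thresholds) - 1):
        values.append(0.5 * (thresholds[k] + thresholds[k + 1]))  # bin midpoint
        probs.append(exceed_probs[k] - exceed_probs[k + 1])       # P(t_k < R <= t_{k+1})
    values.append(thresholds[-1])
    probs.append(exceed_probs[-1])           # mass above t_K (ideally near zero)
    return values, probs


def mean_and_variance(thresholds, exceed_probs):
    """Mean and variance of the discrete approximation built above."""
    values, probs = histogram_from_exceedance(thresholds, exceed_probs)
    mean = sum(v * p for v, p in zip(values, probs))
    second = sum(v * v * p for v, p in zip(values, probs))
    return mean, second - mean * mean


if __name__ == "__main__":
    # Illustrative exceedance probabilities, e.g. outputs of one logistic
    # model per threshold (rainrate in mm/h); not data from GATE.
    print(mean_and_variance([0.0, 2.0, 4.0, 6.0], [0.4, 0.2, 0.1, 0.0]))
```

The exceedance probabilities must be non-increasing in the threshold for the bin masses to be non-negative; separately fitted logistic models do not guarantee this, so a real application would need a check or a monotone adjustment.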
Practical Session: Logistic Regression
NASA Astrophysics Data System (ADS)
Clausel, M.; Grégoire, G.
2014-12-01
An exercise is proposed to illustrate logistic regression. It investigates the different risk factors for the occurrence of coronary heart disease. The exercise was proposed in Chapter 5 of the book by D.G. Kleinbaum and M. Klein, "Logistic Regression", Statistics for Biology and Health, Springer Science Business Media, LLC (2010), and also by D. Chessel and A.B. Dufour in Lyon 1 (see Sect. 6 of http://pbil.univ-lyon1.fr/R/pdf/tdr341.pdf). This example is based on data given in the file evans.txt, available from http://www.sph.emory.edu/dkleinb/logreg3.htm#data.
The cross-validated AUC for MCP-logistic regression with high-dimensional data.
Jiang, Dingfeng; Huang, Jian; Zhang, Ying
2013-10-01
We propose a cross-validated area under the receiver operating characteristic (ROC) curve (CV-AUC) criterion for tuning parameter selection for penalized methods in sparse, high-dimensional logistic regression models. We use this criterion in combination with the minimax concave penalty (MCP) method for variable selection. The CV-AUC criterion is specifically designed for optimizing the classification performance for binary outcome data. To implement the proposed approach, we derive an efficient coordinate descent algorithm to compute the MCP-logistic regression solution surface. Simulation studies are conducted to evaluate the finite sample performance of the proposed method and its comparison with the existing methods including the Akaike information criterion (AIC), Bayesian information criterion (BIC) or Extended BIC (EBIC). The model selected based on the CV-AUC criterion tends to have a larger predictive AUC and smaller classification error than those with tuning parameters selected using the AIC, BIC or EBIC. We illustrate the application of the MCP-logistic regression with the CV-AUC criterion on three microarray datasets from the studies that attempt to identify genes related to cancers. Our simulation studies and data examples demonstrate that the CV-AUC is an attractive method for tuning parameter selection for penalized methods in high-dimensional logistic regression models.
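The general idea of choosing a tuning parameter by cross-validated AUC can be sketched without any particular penalty. In the sketch below, `fit` is a hypothetical caller-supplied function that trains on a fold and returns a scoring function; the MCP coordinate descent algorithm from the abstract is not implemented, and the per-fold averaging and interleaved fold assignment are assumptions rather than the authors' exact definition of CV-AUC.

```python
def auc(scores, labels):
    """Area under the ROC curve as the fraction of (case, control) pairs
    ranked correctly, counting ties as one half."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one case and one control")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def cv_auc(X, y, lam, fit, n_folds=5):
    """Mean held-out AUC over interleaved folds for one tuning value.

    fit(X_train, y_train, lam) must return a function mapping one
    feature row to a score; every fold must contain both classes.
    """
    fold_aucs = []
    for start in range(n_folds):
        test_idx = list(range(start, len(y), n_folds))
        held_out = set(test_idx)
        train_idx = [i for i in range(len(y)) if i not in held_out]
        score = fit([X[i] for i in train_idx], [y[i] for i in train_idx], lam)
        fold_aucs.append(auc([score(X[i]) for i in test_idx],
                             [y[i] for i in test_idx]))
    return sum(fold_aucs) / len(fold_aucs)


def select_by_cv_auc(X, y, lambdas, fit, n_folds=5):
    """Return the candidate tuning value with the largest cross-validated AUC
    (the first such value if several tie)."""
    return max(lambdas, key=lambda lam: cv_auc(X, y, lam, fit, n_folds))
```

Any penalized logistic fitter can be plugged in as `fit`; the selection logic only needs held-out scores, not likelihoods, which is what distinguishes this criterion from AIC- or BIC-based selection.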
The Trend Odds Model for Ordinal Data
Capuano, Ana W.; Dawson, Jeffrey D.
2013-01-01
Ordinal data appear in a wide variety of scientific fields. These data are often analyzed using ordinal logistic regression models that assume proportional odds. When this assumption is not met, it may be possible to capture the lack of proportionality using a constrained structural relationship between the odds and the cut-points of the ordinal values (Peterson and Harrell, 1990). We consider a trend odds version of this constrained model, where the odds parameter increases or decreases in a monotonic manner across the cut-points. We demonstrate algebraically and graphically how this model is related to latent logistic, normal, and exponential distributions. In particular, we find that scale changes in these potential latent distributions are consistent with the trend odds assumption, with the logistic and exponential distributions having odds that increase in a linear or nearly linear fashion. We show how to fit this model using SAS Proc Nlmixed, and perform simulations under proportional odds and trend odds processes. We find that the added complexity of the trend odds model gives improved power over the proportional odds model when there are moderate to severe departures from proportionality. A hypothetical dataset is used to illustrate the interpretation of the trend odds model, and we apply this model to a Swine Influenza example where the proportional odds assumption appears to be violated. PMID:23225520
The trend odds model for ordinal data.
Capuano, Ana W; Dawson, Jeffrey D
2013-06-15
Ordinal data appear in a wide variety of scientific fields. These data are often analyzed using ordinal logistic regression models that assume proportional odds. When this assumption is not met, it may be possible to capture the lack of proportionality using a constrained structural relationship between the odds and the cut-points of the ordinal values. We consider a trend odds version of this constrained model, wherein the odds parameter increases or decreases in a monotonic manner across the cut-points. We demonstrate algebraically and graphically how this model is related to latent logistic, normal, and exponential distributions. In particular, we find that scale changes in these potential latent distributions are consistent with the trend odds assumption, with the logistic and exponential distributions having odds that increase in a linear or nearly linear fashion. We show how to fit this model using SAS Proc NLMIXED and perform simulations under proportional odds and trend odds processes. We find that the added complexity of the trend odds model gives improved power over the proportional odds model when there are moderate to severe departures from proportionality. A hypothetical data set is used to illustrate the interpretation of the trend odds model, and we apply this model to a swine influenza example wherein the proportional odds assumption appears to be violated. Copyright © 2012 John Wiley & Sons, Ltd.
Gonçalves, Reginaldo; Szmuchrowski, Leszek Antony; Damasceno, Vinícius Oliveira; de Medeiros, Marcelo Lemos; Couto, Bruno Pena; Lamounier, Joel Alves
2014-01-01
Objective: To identify the association of both body mass index and aerobic fitness with cardiovascular disease risk factors in children. Methods: Cross-sectional study, carried out in Itaúna-MG, in 2010, with 290 randomly selected school children of both sexes aged 6 to 10 years. Children from schools located in the countryside and those with medical restrictions for physical activity were not included. Blood samples were collected after a 12-hour fasting period. Blood pressure, stature and weight were evaluated in accordance with international standards. The following were considered as cardiovascular risk factors: high blood pressure, high total cholesterol, LDL, triglycerides and insulin levels, and low HDL. The statistical analysis included Spearman's coefficient and logistic regression, with cardiovascular risk factors as dependent variables. Results: Significant correlations were found, in both sexes, of body mass index and aerobic fitness with most of the cardiovascular risk factors. Children of both sexes with body mass index in the fourth quartile demonstrated increased chances of having high blood insulin and clustering cardiovascular risk factors. Moreover, girls with aerobic fitness in the first quartile also demonstrated increased chances of having high blood insulin and clustering cardiovascular risk factors. Conclusion: The significant associations and the increased chances of having cardiovascular risk factors in children with less aerobic fitness and higher levels of body mass index justify the use of these variables for health monitoring in Pediatrics. PMID:25479851
Crosby, Richard A; Stradtman, Lindsay; Collins, Tom; Vanderpool, Robin
2017-09-01
To determine the return rate of community-delivered fecal immunochemical test (FIT) kits in a rural population and to identify significant predictors of returning kits. Residents were recruited in 8 rural Kentucky counties to enroll in the study and receive an FIT kit. Of 345 recruited, 82.0% returned an FIT kit from the point of distribution. These participants were compared to the remainder relative to age, sex, marital status, having an annual income below $15,000, not graduating from high school, not having a regular health care provider, not having health care coverage, being a current smoker, indicating current overweight or obese status, and a scale measure of fatalism pertaining to colorectal cancer. Predictors achieving significance at the bivariate level were entered into a stepwise logistic regression model to calculate adjusted OR and 95% CI. The return rate was 82.0%. In adjusted analyses, those indicating an annual income of less than $15,000 were 2.85 times more likely to return their kits (95% CI: 1.56-5.24; P < .001). Also, those not perceiving themselves to be overweight/obese were 1.95 times more likely to return their kits (95% CI: 1.07-3.55; P = .029). An outreach-based colorectal cancer screening program in a rural population may yield high return rates. People with annual incomes below $15,000 and those not having perceptions of being overweight/obese may be particularly likely to return FIT kits. © 2016 National Rural Health Association.
The relationship between offspring size and fitness: integrating theory and empiricism.
Rollinson, Njal; Hutchings, Jeffrey A
2013-02-01
How parents divide the energy available for reproduction between size and number of offspring has a profound effect on parental reproductive success. Theory indicates that the relationship between offspring size and offspring fitness is of fundamental importance to the evolution of parental reproductive strategies: this relationship predicts the optimal division of resources between size and number of offspring, it describes the fitness consequences for parents that deviate from optimality, and its shape can predict the most viable type of investment strategy in a given environment (e.g., conservative vs. diversified bet-hedging). Many previous attempts to estimate this relationship and the corresponding value of optimal offspring size have been frustrated by a lack of integration between theory and empiricism. In the present study, we draw from C. Smith and S. Fretwell's classic model to explain how a sound estimate of the offspring size-fitness relationship can be derived with empirical data. We evaluate what measures of fitness can be used to model the offspring size-fitness curve and optimal size, as well as which statistical models should and should not be used to estimate offspring size-fitness relationships. To construct the fitness curve, we recommend that offspring fitness be measured as survival up to the age at which the instantaneous rate of offspring mortality becomes random with respect to initial investment. Parental fitness is then expressed in ecologically meaningful, theoretically defensible, and broadly comparable units: the number of offspring surviving to independence. Although logistic and asymptotic regression have been widely used to estimate offspring size-fitness relationships, the former provides relatively unreliable estimates of optimal size when offspring survival and sample sizes are low, and the latter is unreliable under all conditions. We recommend that the Weibull-1 model be used to estimate this curve because it provides modest improvements in prediction accuracy under experimentally relevant conditions.
Fitness but not weight status is associated with projected physical independence in older adults.
Sardinha, Luis B; Cyrino, Edilson S; Santos, Leandro Dos; Ekelund, Ulf; Santos, Diana A
2016-06-01
Obesity and fitness have been associated with older adults' physical independence. We aimed to investigate the independent and combined associations of physical fitness and adiposity, assessed by body mass index (BMI) and waist circumference (WC) with the projected ability for physical independence. A total of 3496 non-institutionalized older adults aged 65 and older (1167 male) were included in the analysis. BMI and WC were assessed and categorized according to established criteria. Physical fitness was evaluated with the Senior Fitness Test and individual test results were expressed as Z-scores. Projected ability for physical independence was assessed with the 12-item composite physical function scale. Logistic regression was used to estimate the odds ratio (OR) for being physically dependent. A total of 30.1 % of participants were classified as at risk for losing physical independence at age 90 years. Combined fitness and fatness analysis demonstrated that unfit older adults had increased odds ratio for being physically dependent in all BMI categories (normal: OR = 9.5, 95 %CI = 6.5-13.8; overweight: OR = 6.0, 95 %CI = 4.3-8.3; obese: OR = 6.7, 95 %CI = 4.6-10.0) and all WC categories (normal: OR = 10.4, 95%CI = 6.5-16.8; middle: OR = 6.2, 95 %CI = 4.1-9.3; upper: OR = 7.0, 95 %CI = 4.8-10.0) compared to fit participants that were of normal weight and fit participants with normal WC, respectively. No increased odds ratio was observed for fit participants that had increased BMI or WC. In conclusion, projected physical independence may be enhanced by a normal weight, a normal WC, or an increased physical fitness. Adiposity measures were not associated with physical independence, whereas fitness is independently related to physical independence. Independent of their weight and WC status, unfit older adults are at increased risk for losing physical independence.
Sanchez-Vaznaugh, Emma V.; Goldman Rosas, Lisa; Fernández-Peña, José Ramón; Baek, Jonggyu; Egerter, Susan; Sánchez, Brisa N.
2017-01-01
Objectives: To investigate the contribution of school neighborhood socioeconomic advantage to the association between school-district physical education policy compliance in California public schools and Latino students’ physical fitness. Methods: Cross-sectional Fitnessgram data for public-school students were linked with school- and district-level information, district-level physical education policy compliance from 2004–2005 and 2005–2006, and 2000 United States Census data. Multilevel logistic regression models examined whether income and education levels in school neighborhoods moderated the effects of district-level physical education policy compliance on Latino fifth-graders’ fitness levels. Results: Physical education compliance data were available for 48 California school districts, which included 64,073 Latino fifth-graders. Fewer than half (23, or 46%) of these districts were found to be in compliance, and only 16% of Latino fifth-graders attended schools in compliant districts. Overall, there was a positive association between district compliance with physical education policy and fitness (OR, 95%CI: 1.38, 1.07, 1.78) adjusted for covariates. There was no significant interaction between school neighborhood socioeconomic advantage and physical education policy compliance (p>.05): there was a positive pattern in the association between school district compliance with physical education policy and student fitness levels across levels of socioeconomic advantage, though the association was not always significant. Conclusions: Across neighborhoods with varying levels of socioeconomic advantage, increasing physical education policy compliance in elementary schools may be an effective strategy for improving fitness among Latino children. PMID:28591139
Song, Hyung Keun; Choi, Ho June; Yang, Kyu Hyun
2016-12-01
The aim of our study was to identify the risk factors for avascular necrosis of the femoral head (AVN) and fixation failure (FF) after screw osteosynthesis in patients with valgus angulated femoral neck fractures. We conducted a retrospective study of 308 patients (mean age, 72.5 years, range, 50-97 years), with a mean follow-up of 21.4 months (range, 12-64 months). The risk for failure in treatment (FIT) associated with patient- and fracture-related factors was evaluated by logistic regression analyses. FIT was identified in 32 cases (10.3%): 22 cases (7.1%) of AVN and 10 cases (3.2%) of FF. Initial valgus tilt>15° (p=0.023), posterior tilt>15° (p=0.012), and screw sliding distance (p=0.037) were significantly associated with FIT. FIT occurred in 7 patients (5.2%) with B1.2.1 fractures and 17 patients (48.6%) with B1.1.2 fractures (p<0.001). The odds of FIT were 17-fold higher in patients with initial valgus and posterior tilts>15° (B1.1.2) compared to patients with <15° of tilt in both planes (B1.2.1). The severity of initial deformity predicts AVN and FF in patients with valgus angulated femoral neck fractures. Patients with an initial valgus and posterior tilt>15° are reasonable candidates for primary arthroplasty due to high risk of FIT. Copyright © 2016 Elsevier Ltd. All rights reserved.
Hulsegge, Gerben; Henschke, Nicholas; McKay, Damien; Chaitow, Jeffrey; West, Kerry; Broderick, Carolyn; Singh-Grewal, Davinder
2015-04-01
To describe fundamental movement skills (FMS), physical fitness and level of physical activity among Australian children with juvenile idiopathic arthritis (JIA) and compare this with healthy peers. Children aged 6-16 years with JIA were recruited from hospital rheumatology clinics and private rheumatology rooms in Sydney, Australia. All children attended an assessment day, where FMS were assessed by a senior paediatric physiotherapist, physical fitness was assessed using the multistage 20-metre shuttle run test, and physical activity and physical and psychosocial well-being were assessed with questionnaires. These results were compared with age- and gender-matched peers from the NSW Schools Physical Activity and Nutrition Survey and the Health of Young Victorians Study using logistic regression analysis. Twenty-eight children with JIA participated in this study. There were no differences in the proportion of children who had mastered FMS between children with JIA and their healthy peers (P > 0.05). However, there was a trend for children with JIA to have poorer physical fitness and be less physically active than healthy peers. Parents of children with JIA indicated more physical and psychosocial impairments among their children and themselves compared with parents of healthy children (P < 0.05). This is the first study in Australia to compare FMS, physical activity and fitness in children with JIA and their peers. While older children with JIA appear to have poorer physical fitness and physical activity levels than their peers, there is no difference in FMS. © 2014 The Authors. Journal of Paediatrics and Child Health © 2014 Paediatrics and Child Health Division (Royal Australasian College of Physicians).
Krause, Kathleen H.
2015-01-01
Objective: To provide the first study in Vietnam of how gendered social learning about violence and exposure to non-family institutions influence women’s attitudes about a wife’s recourse after physical IPV. Method: A probability sample of 532 married women, ages 18–50 years, was surveyed in July–August, 2012 in Mỹ Hào district. We fit a multivariate linear regression model to estimate correlates of favoring recourse in six situations using a validated attitudinal scale. We split attitudes towards recourse into three subscales (disfavor silence, favor informal recourse, favor formal recourse) and fit one multivariate ordinal logistic regression model for each behavior to estimate correlates of favoring recourse. Results: On average, women favored recourse in 2.8 situations. Women who were older and had witnessed physical IPV in childhood had less favorable attitudes about recourse. Women who were hit as children, had completed more schooling, worked outside agriculture, and had sought recourse after IPV had more favorable attitudes about recourse. Conclusions: Normative change among women may require efforts to curb family violence, counsel those exposed to violence in childhood, and enhance women’s opportunities for higher schooling and non-agricultural wage work. The state and organizations working on IPV might overcome pockets of unfavorable public opinion by enforcing accountability for IPV rather than seeking to alter ideas about recourse among women. PMID:28392967
Vaeth, Michael; Skovlund, Eva
2004-06-15
For a given regression problem it is possible to identify a suitably defined equivalent two-sample problem such that the power or sample size obtained for the two-sample problem also applies to the regression problem. For a standard linear regression model the equivalent two-sample problem is easily identified, but for generalized linear models and for Cox regression models the situation is more complicated. An approximately equivalent two-sample problem may, however, also be identified here. In particular, we show that for logistic regression and Cox regression models the equivalent two-sample problem is obtained by selecting two equally sized samples for which the parameters differ by a value equal to the slope times twice the standard deviation of the independent variable and further requiring that the overall expected number of events is unchanged. In a simulation study we examine the validity of this approach to power calculations in logistic regression and Cox regression models. Several different covariate distributions are considered for selected values of the overall response probability and a range of alternatives. For the Cox regression model we consider both constant and non-constant hazard rates. The results show that in general the approach is remarkably accurate even in relatively small samples. Some discrepancies are, however, found in small samples with few events and a highly skewed covariate distribution. Comparison with results based on alternative methods for logistic regression models with a single continuous covariate indicates that the proposed method is at least as good as its competitors. The method is easy to implement and therefore provides a simple way to extend the range of problems that can be covered by the usual formulas for power and sample size determination. Copyright 2004 John Wiley & Sons, Ltd.
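For logistic regression, the recipe described above can be sketched numerically: find two group probabilities whose log-odds differ by the slope times twice the covariate standard deviation while their average equals the overall response probability, then apply a two-sample power formula. Everything named here is an assumption for illustration: the function names, the bisection solver, and the choice of an unpooled normal approximation as the "usual formula" are not taken from the paper.

```python
import math
from statistics import NormalDist


def logit(p: float) -> float:
    return math.log(p / (1.0 - p))


def equivalent_two_sample(p_bar: float, slope: float, sd_x: float):
    """Return (p1, p2) with logit(p2) - logit(p1) = 2 * slope * sd_x and
    (p1 + p2) / 2 = p_bar, so the expected number of events is unchanged."""
    if not 0.0 < p_bar < 1.0:
        raise ValueError("p_bar must lie strictly between 0 and 1")
    delta = 2.0 * slope * sd_x
    target = abs(delta)
    # The smaller probability lies between these bounds; the small offset
    # keeps both probabilities strictly inside (0, 1).
    lo = max(0.0, 2.0 * p_bar - 1.0) + 1e-12
    hi = p_bar
    for _ in range(100):  # bisection on the smaller probability
        mid = 0.5 * (lo + hi)
        if logit(2.0 * p_bar - mid) - logit(mid) > target:
            lo = mid      # log-odds gap too wide: move the groups closer
        else:
            hi = mid
    low = 0.5 * (lo + hi)
    high = 2.0 * p_bar - low
    return (low, high) if delta >= 0 else (high, low)


def two_sample_power(p1: float, p2: float, n_per_group: int, alpha: float = 0.05) -> float:
    """Approximate power of a two-sided test comparing two proportions
    (unpooled normal approximation, far tail ignored)."""
    z_alpha = NormalDist().inv_cdf(1.0 - alpha / 2.0)
    se = math.sqrt((p1 * (1.0 - p1) + p2 * (1.0 - p2)) / n_per_group)
    return NormalDist().cdf(abs(p2 - p1) / se - z_alpha)


if __name__ == "__main__":
    # Illustrative inputs: overall response probability 0.3, slope 0.5 per
    # unit of x, SD of x equal to 1, total sample size 200 (100 per group).
    p1, p2 = equivalent_two_sample(0.3, 0.5, 1.0)
    print(round(p1, 4), round(p2, 4), round(two_sample_power(p1, p2, 100), 3))
```

The two groups are equally sized, so a regression study with total sample size N maps to n = N / 2 per group in this approximation.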
Li, Wen; Zhao, Li-Zhong; Ma, Dong-Wang; Wang, De-Zheng; Shi, Lei; Wang, Hong-Lei; Dong, Mo; Zhang, Shu-Yi; Cao, Lei; Zhang, Wei-Hua; Zhang, Xi-Peng; Zhang, Qing-Huai; Yu, Lin; Qin, Hai; Wang, Xi-Mo; Chen, Sam Li-Sheng
2018-05-01
We aimed to predict colorectal cancer (CRC) based on the demographic features and clinical correlates of personal symptoms and signs from Tianjin community-based CRC screening data. A total of 891,199 residents who were aged 60 to 74 and were screened in 2012 were enrolled. The Lasso logistic regression model was used to identify the predictors for CRC. Predictive validity was assessed by the receiver operating characteristic (ROC) curve. A bootstrapping method was also used to validate this prediction model. CRC was best predicted by a model that included age, sex, education level, occupations, diarrhea, constipation, colon mucosa and bleeding, gallbladder disease, a stressful life event, family history of CRC, and a positive fecal immunochemical test (FIT). The area under curve (AUC) for the questionnaire with a FIT was 84% (95% CI: 82%-86%), followed by 76% (95% CI: 74%-79%) for a FIT alone, and 73% (95% CI: 71%-76%) for the questionnaire alone. With 500 bootstrap replications, the estimated optimism (<0.005) shows good discrimination in validation of the prediction model. A risk prediction model for CRC based on a series of symptoms and signs related to enteric diseases in combination with a FIT was developed from the first round of screening. The results of the current study are useful for increasing the awareness of high-risk subjects and for individual-risk-guided invitations or strategies to achieve mass screening for CRC.
Fernandes, Amanda Paula; Andrade, Amanda Cristina de Souza; Ramos, Cynthia Graciane Carvalho; Friche, Amélia Augusta de Lima; Dias, Maria Angélica de Salles; Xavier, César Coelho; Proietti, Fernando Augusto; Caiaffa, Waleska Teixeira
2015-11-01
This study analyzed leisure-time physical activity among 1,621 adults who were non-users of the Academias da Cidade Program in Belo Horizonte, Minas Gerais State, Brazil, but who lived in the vicinity of a fitness center in operation (exposed Group I) or in the vicinity of two sites reserved for future installation of centers (control Groups II and III). The dependent variable was leisure-time physical activity, and linear distance from the households to the fitness centers was the exposure variable, categorized in radial buffers: < 500m; 500-1,000m; and 1,000-1,500m. Binary logistic regression was performed with the Generalized Estimation Equations method. Residents living within < 500m of the fitness center gave better ratings to the physical environment when compared to those living in the 1,000 and 1,500m buffers and showed higher odds of leisure-time physical activity (OR = 1.16; 95%CI: 1.03-1.30), independently of socio-demographic factors; the same was not observed in the control groups (II and III). The findings suggest the program's potential for influencing physical activity in the population living closer to the fitness center and thus provide a strategic alternative for mitigating inequalities in leisure-time physical activity.
Kesselmeier, Miriam; Lorenzo Bermejo, Justo
2017-11-01
Logistic regression is the most common technique used for genetic case-control association studies. A disadvantage of standard maximum likelihood estimators of the genotype relative risk (GRR) is their strong dependence on outlier subjects, for example, patients diagnosed at unusually young age. Robust methods are available to constrain outlier influence, but they are scarcely used in genetic studies. This article provides a non-intimidating introduction to robust logistic regression, and investigates its benefits and limitations in genetic association studies. We applied the bounded Huber and extended the R package 'robustbase' with the re-descending Hampel functions to down-weight outlier influence. Computer simulations were carried out to assess the type I error rate, mean squared error (MSE) and statistical power according to major characteristics of the genetic study and investigated markers. Simulations were complemented with the analysis of real data. Both standard and robust estimation controlled type I error rates. Standard logistic regression showed the highest power but standard GRR estimates also showed the largest bias and MSE, in particular for associated rare and recessive variants. For illustration, a recessive variant with a true GRR=6.32 and a minor allele frequency=0.05 investigated in a 1000 case/1000 control study by standard logistic regression resulted in power=0.60 and MSE=16.5. The corresponding figures for Huber-based estimation were power=0.51 and MSE=0.53. Overall, Hampel- and Huber-based GRR estimates did not differ much. Robust logistic regression may represent a valuable alternative to standard maximum likelihood estimation when the focus lies on risk prediction rather than identification of susceptibility variants. © The Author 2016. Published by Oxford University Press. All rights reserved. For Permissions, please email: journals.permissions@oup.com.
Sampson, Maureen L; Gounden, Verena; van Deventer, Hendrik E; Remaley, Alan T
2016-02-01
The main drawback of the periodic analysis of quality control (QC) material is that test performance is not monitored in time periods between QC analyses, potentially leading to the reporting of faulty test results. The objective of this study was to develop a patient based QC procedure for the more timely detection of test errors. Results from a Chem-14 panel measured on the Beckman LX20 analyzer were used to develop the model. Each test result was predicted from the other 13 members of the panel by multiple regression, which resulted in correlation coefficients between the predicted and measured result of >0.7 for 8 of the 14 tests. A logistic regression model, which utilized the measured test result, the predicted test result, the day of the week and time of day, was then developed for predicting test errors. The output of the logistic regression was tallied by a daily CUSUM approach and used to predict test errors, with a fixed specificity of 90%. The mean average run length (ARL) before error detection by CUSUM-Logistic Regression (CSLR) was 20 with a mean sensitivity of 97%, which was considerably shorter than the mean ARL of 53 (sensitivity 87.5%) for a simple prediction model that only used the measured result for error detection. A CUSUM-Logistic Regression analysis of patient laboratory data can be an effective approach for the rapid and sensitive detection of clinical laboratory errors. Published by Elsevier Inc.
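As a rough illustration of the tallying step described above, the following sketch applies a one-sided tabular CUSUM to a sequence of per-sample error probabilities of the kind a logistic regression might emit. The function name, the reference value `k`, the decision limit `h`, and the sample probabilities are all hypothetical; the abstract does not state the study's actual CUSUM parameters or reset rules.

```python
def cusum_alarm_index(probabilities, k=0.5, h=1.5):
    """Return the index of the first CUSUM alarm, or None if no alarm occurs.

    One-sided tabular CUSUM: S_t = max(0, S_{t-1} + (p_t - k)).
    k (reference value) and h (decision limit) are illustrative assumptions,
    not values taken from the study.
    """
    s = 0.0
    for i, p in enumerate(probabilities):
        # Accumulate only the excess over the reference value; never go below 0.
        s = max(0.0, s + (p - k))
        if s > h:
            return i
    return None


# Hypothetical model outputs: a stable period followed by a sustained shift.
in_control = [0.10, 0.20, 0.15, 0.10]
shifted = [0.90, 0.95, 0.90, 0.85, 0.90, 0.95]
print(cusum_alarm_index(in_control + shifted))
```

The run length before detection in this toy series is the number of shifted samples consumed before the alarm; a smaller `h` shortens it at the cost of more false alarms.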
Predicting space telerobotic operator training performance from human spatial ability assessment
NASA Astrophysics Data System (ADS)
Liu, Andrew M.; Oman, Charles M.; Galvan, Raquel; Natapoff, Alan
2013-11-01
Our goal was to determine whether existing tests of spatial ability can predict an astronaut's qualification test performance after robotic training. Because training astronauts to be qualified robotics operators is so long and expensive, NASA is interested in tools that can predict robotics performance before training begins. Currently, the Astronaut Office does not have a validated tool to predict robotics ability as part of its astronaut selection or training process. Commonly used tests of human spatial ability may provide such a tool to predict robotics ability. We tested the spatial ability of 50 active astronauts who had completed at least one robotics training course, then used logistic regression models to analyze the correlation between spatial ability test scores and the astronauts' performance in their evaluation test at the end of the training course. The fit of the logistic function to our data is statistically significant for several spatial tests. However, the prediction performance of the logistic model depends on the criterion threshold assumed. To clarify the critical selection issues, we show how the probability of correct classification vs. misclassification varies as a function of the mental rotation test criterion level. Since the costs of misclassification are low, the logistic models of spatial ability and robotic performance are reliable enough only to be used to customize regular and remedial training. We suggest several changes in tracking performance throughout robotics training that could improve the range and reliability of predictive models.
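The dependence of prediction performance on the criterion threshold can be sketched as follows. The scores and pass/fail outcomes are made up for illustration and are not the study's data; `classification_rates` is a hypothetical helper that treats a score at or above the threshold as a predicted pass.

```python
def classification_rates(scores, passed, threshold):
    """Return (fraction correct, fraction misclassified) for one threshold.

    A score >= threshold is treated as a predicted pass.
    """
    correct = sum((s >= threshold) == bool(p) for s, p in zip(scores, passed))
    n = len(scores)
    return correct / n, (n - correct) / n


# Hypothetical spatial-test scores and qualification outcomes (1 = passed).
scores = [12, 15, 18, 20, 22, 25, 27, 30]
passed = [0, 0, 1, 0, 1, 1, 1, 1]

for threshold in (15, 21, 28):
    correct, wrong = classification_rates(scores, passed, threshold)
    print(f"threshold={threshold}: correct={correct:.3f}, misclassified={wrong:.3f}")
```

Sweeping the threshold in this way shows how the same fitted relationship can look strong or weak depending on where the selection criterion is placed.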
Fitting program for linear regressions according to Mahon (1996)
DOE Office of Scientific and Technical Information (OSTI.GOV)
Trappitsch, Reto G.
2018-01-09
This program takes the user's input data and fits a linear regression to it using the prescription presented by Mahon (1996). Compared to the commonly used York fit, this method has the correct prescription for measurement error propagation. This software should facilitate the proper fitting of measurements with a simple interface.
A comparative study on entrepreneurial attitudes modeled with logistic regression and Bayes nets.
López Puga, Jorge; García García, Juan
2012-11-01
Entrepreneurship research is receiving increasing attention in our context, as entrepreneurs are key social agents involved in economic development. We compare the success of the dichotomic logistic regression model and the Bayes simple classifier to predict entrepreneurship, after manipulating the percentage of missing data and the level of categorization in predictors. A sample of undergraduate university students (N = 1230) completed five scales (motivation, attitude towards business creation, obstacles, deficiencies, and training needs) and we found that each of them predicted different aspects of the tendency to business creation. Additionally, our results show that the receiver operating characteristic (ROC) curve is affected by the rate of missing data in both techniques, but logistic regression seems to be more vulnerable when faced with missing data, whereas Bayes nets underperform slightly when categorization has been manipulated. Our study sheds light on the potential entrepreneur profile and we propose to use Bayesian networks as an additional alternative to overcome the weaknesses of logistic regression when missing data are present in applied research.
Campos-Filho, N; Franco, E L
1989-02-01
A frequent procedure in matched case-control studies is to report results from the multivariate unmatched analyses if they do not differ substantially from the ones obtained after conditioning on the matching variables. Although conceptually simple, this rule requires that an extensive series of logistic regression models be evaluated by both the conditional and unconditional maximum likelihood methods. Most computer programs for logistic regression employ only one maximum likelihood method, which requires that the analyses be performed in separate steps. This paper describes a Pascal microcomputer (IBM PC) program that performs multiple logistic regression by both maximum likelihood estimation methods, which obviates the need for switching between programs to obtain relative risk estimates from both matched and unmatched analyses. The program calculates most standard statistics and allows factoring of categorical or continuous variables by two distinct methods of contrast. A built-in, descriptive statistics option allows the user to inspect the distribution of cases and controls across categories of any given variable.
Comparison of cranial sex determination by discriminant analysis and logistic regression.
Amores-Ampuero, Anabel; Alemán, Inmaculada
2016-04-05
Various methods have been proposed for estimating dimorphism. The objective of this study was to compare sex determination results from cranial measurements using discriminant analysis or logistic regression. The study sample comprised 130 individuals (70 males) of known sex, age, and cause of death from San José cemetery in Granada (Spain). Measurements of 19 neurocranial dimensions and 11 splanchnocranial dimensions were subjected to discriminant analysis and logistic regression, and the percentages of correct classification were compared between the sex functions obtained with each method. The discriminant capacity of the selected variables was evaluated with a cross-validation procedure. The percentage accuracy with discriminant analysis was 78.2% for the neurocranium (82.4% in females and 74.6% in males) and 73.7% for the splanchnocranium (79.6% in females and 68.8% in males). These percentages were higher with logistic regression analysis: 85.7% for the neurocranium (in both sexes) and 94.1% for the splanchnocranium (100% in females and 91.7% in males).
Hill, Andrew; Loh, Po-Ru; Bharadwaj, Ragu B.; Pons, Pascal; Shang, Jingbo; Guinan, Eva; Lakhani, Karim; Kilty, Iain
2017-01-01
Abstract Background: The association of differing genotypes with disease-related phenotypic traits offers great potential to both help identify new therapeutic targets and support stratification of patients who would gain the greatest benefit from specific drug classes. Development of low-cost genotyping and sequencing has made collecting large-scale genotyping data routine in population and therapeutic intervention studies. In addition, a range of new technologies is being used to capture numerous new and complex phenotypic descriptors. As a result, genotype and phenotype datasets have grown exponentially. Genome-wide association studies associate genotypes and phenotypes using methods such as logistic regression. As existing tools for association analysis limit the efficiency by which value can be extracted from increasing volumes of data, there is a pressing need for new software tools that can accelerate association analyses on large genotype-phenotype datasets. Results: Using open innovation (OI) and contest-based crowdsourcing, the logistic regression analysis in a leading, community-standard genetics software package (PLINK 1.07) was substantially accelerated. OI allowed us to do this in <6 months by providing rapid access to highly skilled programmers with specialized, difficult-to-find skill sets. Through a crowd-based contest a combination of computational, numeric, and algorithmic approaches was identified that accelerated the logistic regression in PLINK 1.07 by 18- to 45-fold. Combining contest-derived logistic regression code with coarse-grained parallelization, multithreading, and associated changes to data initialization code further developed through distributed innovation, we achieved an end-to-end speedup of 591-fold for a data set size of 6678 subjects by 645 863 variants, compared to PLINK 1.07's logistic regression. This represents a reduction in run time from 4.8 hours to 29 seconds. 
Accelerated logistic regression code developed in this project has been incorporated into the PLINK2 project. Conclusions: Using iterative competition-based OI, we have developed a new, faster implementation of logistic regression for genome-wide association studies analysis. We present lessons learned and recommendations on running a successful OI process for bioinformatics. PMID:28327993
Hill, Andrew; Loh, Po-Ru; Bharadwaj, Ragu B; Pons, Pascal; Shang, Jingbo; Guinan, Eva; Lakhani, Karim; Kilty, Iain; Jelinsky, Scott A
2017-05-01
The association of differing genotypes with disease-related phenotypic traits offers great potential to both help identify new therapeutic targets and support stratification of patients who would gain the greatest benefit from specific drug classes. Development of low-cost genotyping and sequencing has made collecting large-scale genotyping data routine in population and therapeutic intervention studies. In addition, a range of new technologies is being used to capture numerous new and complex phenotypic descriptors. As a result, genotype and phenotype datasets have grown exponentially. Genome-wide association studies associate genotypes and phenotypes using methods such as logistic regression. As existing tools for association analysis limit the efficiency by which value can be extracted from increasing volumes of data, there is a pressing need for new software tools that can accelerate association analyses on large genotype-phenotype datasets. Using open innovation (OI) and contest-based crowdsourcing, the logistic regression analysis in a leading, community-standard genetics software package (PLINK 1.07) was substantially accelerated. OI allowed us to do this in <6 months by providing rapid access to highly skilled programmers with specialized, difficult-to-find skill sets. Through a crowd-based contest a combination of computational, numeric, and algorithmic approaches was identified that accelerated the logistic regression in PLINK 1.07 by 18- to 45-fold. Combining contest-derived logistic regression code with coarse-grained parallelization, multithreading, and associated changes to data initialization code further developed through distributed innovation, we achieved an end-to-end speedup of 591-fold for a data set size of 6678 subjects by 645 863 variants, compared to PLINK 1.07's logistic regression. This represents a reduction in run time from 4.8 hours to 29 seconds. 
Accelerated logistic regression code developed in this project has been incorporated into the PLINK2 project. Using iterative competition-based OI, we have developed a new, faster implementation of logistic regression for genome-wide association studies analysis. We present lessons learned and recommendations on running a successful OI process for bioinformatics. © The Author 2017. Published by Oxford University Press.
Lin, Chao-Cheng; Bai, Ya-Mei; Chen, Jen-Yeu; Hwang, Tzung-Jeng; Chen, Tzu-Ting; Chiu, Hung-Wen; Li, Yu-Chuan
2010-03-01
Metabolic syndrome (MetS) is an important side effect of second-generation antipsychotics (SGAs). However, many SGA-treated patients with MetS remain undetected. In this study, we trained and validated artificial neural network (ANN) and multiple logistic regression models without biochemical parameters to rapidly identify MetS in patients with SGA treatment. A total of 383 patients with a diagnosis of schizophrenia or schizoaffective disorder (DSM-IV criteria) with SGA treatment for more than 6 months were investigated to determine whether they met the MetS criteria according to the International Diabetes Federation. The data for these patients were collected between March 2005 and September 2005. The input variables of ANN and logistic regression were limited to demographic and anthropometric data only. All models were trained by randomly selecting two-thirds of the patient data and were internally validated with the remaining one-third of the data. The models were then externally validated with data from 69 patients from another hospital, collected between March 2008 and June 2008. The area under the receiver operating characteristic curve (AUC) was used to measure the performance of all models. Both the final ANN and logistic regression models had high accuracy (88.3% vs 83.6%), sensitivity (93.1% vs 86.2%), and specificity (86.9% vs 83.8%) to identify MetS in the internal validation set. The mean +/- SD AUC was high for both the ANN and logistic regression models (0.934 +/- 0.033 vs 0.922 +/- 0.035, P = .63). During external validation, high AUC was still obtained for both models. Waist circumference and diastolic blood pressure were the common variables that were left in the final ANN and logistic regression models. Our study developed accurate ANN and logistic regression models to detect MetS in patients with SGA treatment. The models are likely to provide a noninvasive tool for large-scale screening of MetS in this group of patients. 
(c) 2010 Physicians Postgraduate Press, Inc.
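For readers unfamiliar with the AUC used above to compare the two models, a minimal sketch follows. It computes the AUC as the probability that a randomly chosen positive case receives a higher predicted score than a randomly chosen negative case, counting ties as one half. The function name and the scores are hypothetical and unrelated to the study's data.

```python
def auc(scores_pos, scores_neg):
    """AUC via pairwise comparison of positive and negative scores.

    Equivalent to the normalized Mann-Whitney U statistic; ties count 0.5.
    """
    wins = 0.0
    for p in scores_pos:
        for n in scores_neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(scores_pos) * len(scores_neg))


# Hypothetical predicted probabilities for cases with and without the condition.
with_condition = [0.9, 0.8, 0.4]
without_condition = [0.5, 0.3, 0.2]
print(round(auc(with_condition, without_condition), 3))
```

The pairwise form is quadratic in sample size, which is acceptable for a few hundred patients; rank-based implementations are preferable for large datasets.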
Bayesian logistic regression in detection of gene-steroid interaction for cancer at PDLIM5 locus.
Wang, Ke-Sheng; Owusu, Daniel; Pan, Yue; Xie, Changchun
2016-06-01
The PDZ and LIM domain 5 (PDLIM5) gene may play a role in cancer, bipolar disorder, major depression, alcohol dependence and schizophrenia; however, little is known about the interaction effect of steroid and PDLIM5 gene on cancer. This study examined 47 single-nucleotide polymorphisms (SNPs) within the PDLIM5 gene in the Marshfield sample with 716 cancer patients (any diagnosed cancer, excluding minor skin cancer) and 2848 noncancer controls. Multiple logistic regression model in PLINK software was used to examine the association of each SNP with cancer. Bayesian logistic regression in PROC GENMOD in SAS statistical software, ver. 9.4 was used to detect gene-steroid interactions influencing cancer. Single marker analysis using PLINK identified 12 SNPs associated with cancer (P < 0.05); especially, SNP rs6532496 revealed the strongest association with cancer (P = 6.84 × 10⁻³); while the next best signal was rs951613 (P = 7.46 × 10⁻³). Classic logistic regression in PROC GENMOD showed that both rs6532496 and rs951613 revealed strong gene-steroid interaction effects (OR=2.18, 95% CI=1.31-3.63 with P = 2.9 × 10⁻³ for rs6532496 and OR=2.07, 95% CI=1.24-3.45 with P = 5.43 × 10⁻³ for rs951613, respectively). Results from Bayesian logistic regression showed stronger interaction effects (OR=2.26, 95% CI=1.2-3.38 for rs6532496 and OR=2.14, 95% CI=1.14-3.2 for rs951613, respectively). All the 12 SNPs associated with cancer revealed significant gene-steroid interaction effects (P < 0.05); whereas 13 SNPs showed gene-steroid interaction effects without main effect on cancer. SNP rs4634230 revealed the strongest gene-steroid interaction effect (OR=2.49, 95% CI=1.5-4.13 with P = 4.0 × 10⁻⁴ based on the classic logistic regression and OR=2.59, 95% CI=1.4-3.97 from Bayesian logistic regression; respectively). This study provides evidence of common genetic variants within the PDLIM5 gene and interactions between PDLIM5 gene polymorphisms and steroid use influencing cancer.
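The relationship between a reported odds ratio, its confidence interval, and its P value can be checked with a short calculation. The sketch below assumes the interval is a Wald interval on the log-odds scale (an assumption; the abstract does not say how the intervals were computed) and uses the hypothetical helper `wald_from_or` to recover the coefficient, its standard error, the z statistic, and a two-sided normal P value.

```python
import math


def wald_from_or(odds_ratio, ci_low, ci_high, z_crit=1.96):
    """Recover (beta, se, z, p) from an OR and its 95% CI.

    Assumes a symmetric Wald interval on the log scale:
    log(CI) = beta +/- z_crit * se.
    """
    beta = math.log(odds_ratio)
    se = (math.log(ci_high) - math.log(ci_low)) / (2.0 * z_crit)
    z = beta / se
    p = math.erfc(abs(z) / math.sqrt(2.0))  # two-sided normal P value
    return beta, se, z, p


# Classic logistic regression result reported above for rs6532496.
beta, se, z, p = wald_from_or(2.18, 1.31, 3.63)
print(f"beta={beta:.3f}, se={se:.3f}, z={z:.2f}, p={p:.4f}")
```

Under that assumption the recovered P value lands close to the reported P = 2.9 × 10⁻³; small differences are expected because the published OR and interval are rounded.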
Locomotive syndrome is associated not only with physical capacity but also degree of depression.
Ikemoto, Tatsunori; Inoue, Masayuki; Nakata, Masatoshi; Miyagawa, Hirofumi; Shimo, Kazuhiro; Wakabayashi, Toshiko; Arai, Young-Chang P; Ushida, Takahiro
2016-05-01
Reports of locomotive syndrome (LS) have recently been increasing. Although physical performance measures for LS have been well investigated to date, studies including psychiatric assessment are still scarce. Hence, the aim of this study was to investigate both physical and mental parameters in relation to presence and severity of LS using a 25-question geriatric locomotive function scale (GLFS-25) questionnaire. 150 elderly people aged over 60 years who were members of our physical-fitness center and displayed well-being were enrolled in this study. Firstly, using the previously determined GLFS-25 cutoff value (=16 points), subjects were divided into an LS group and a non-LS group in order to compare each parameter (age, grip strength, timed-up-and-go test (TUG), one-leg standing with eyes open, back muscle and leg muscle strength, degree of depression and cognitive impairment) between the groups using the Mann-Whitney U-test followed by multiple logistic regression analysis. Secondly, a multiple linear regression was conducted to determine which variables showed the strongest correlation with severity of LS. Using the GLFS-25 cutoff value, 110 people (73%) were classified as non-LS and 40 as LS. Comparative analysis between LS and non-LS revealed significant differences in age, grip strength, TUG, one-leg standing, back muscle strength and degree of depression (p < 0.006, after Bonferroni correction). Multiple logistic regression revealed that functional decline in grip strength, TUG and one-leg standing and degree of depression were significantly associated with LS. On the other hand, we observed that the significant contributors towards the GLFS-25 score were TUG and degree of depression in multiple linear regression analysis. The results indicate that LS is associated with not only the capacity of physical performance but also the degree of depression, although most participants did not meet the criteria for LS.
Copyright © 2016 The Japanese Orthopaedic Association. Published by Elsevier B.V. All rights reserved.
Knol, Mirjam J; van der Tweel, Ingeborg; Grobbee, Diederick E; Numans, Mattijs E; Geerlings, Mirjam I
2007-10-01
To determine the presence of interaction in epidemiologic research, typically a product term is added to the regression model. In linear regression, the regression coefficient of the product term reflects interaction as departure from additivity. However, in logistic regression it refers to interaction as departure from multiplicativity. Rothman has argued that interaction estimated as departure from additivity better reflects biologic interaction. So far, literature on estimating interaction on an additive scale using logistic regression only focused on dichotomous determinants. The objective of the present study was to provide the methods to estimate interaction between continuous determinants and to illustrate these methods with a clinical example. From the existing literature we derived the formulas to quantify interaction as departure from additivity between one continuous and one dichotomous determinant and between two continuous determinants using logistic regression. Bootstrapping was used to calculate the corresponding confidence intervals. To illustrate the theory with an empirical example, data from the Utrecht Health Project were used, with age and body mass index as risk factors for elevated diastolic blood pressure. The methods and formulas presented in this article are intended to assist epidemiologists to calculate interaction on an additive scale between two variables on a certain outcome. The proposed methods are included in a spreadsheet which is freely available at: http://www.juliuscenter.nl/additive-interaction.xls.
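The distinction between additive and multiplicative interaction can be made concrete with the relative excess risk due to interaction (RERI) for the simpler case of two dichotomous determinants. The sketch below is not the article's formula for continuous determinants; it uses the commonly cited dichotomous form, RERI = OR11 - OR10 - OR01 + 1, with odds ratios standing in for relative risks. The function name and coefficient values are hypothetical.

```python
import math


def reri(b1, b2, b3):
    """RERI from logistic coefficients for two dichotomous determinants.

    b1, b2: main-effect coefficients; b3: product-term coefficient.
    RERI = exp(b1 + b2 + b3) - exp(b1) - exp(b2) + 1; 0 means exact additivity.
    """
    or10 = math.exp(b1)            # exposed to the first determinant only
    or01 = math.exp(b2)            # exposed to the second determinant only
    or11 = math.exp(b1 + b2 + b3)  # exposed to both
    return or11 - or10 - or01 + 1.0


# Hypothetical coefficients: OR10 = 2, OR01 = 3, no product term.
print(round(reri(math.log(2.0), math.log(3.0), 0.0), 6))
```

With a zero product term the model shows no departure from multiplicativity, yet the RERI in this example is positive, which is why the product-term coefficient alone does not describe interaction on the additive scale.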
Numerical scoring for the Classic BILAG index.
Cresswell, Lynne; Yee, Chee-Seng; Farewell, Vernon; Rahman, Anisur; Teh, Lee-Suan; Griffiths, Bridget; Bruce, Ian N; Ahmad, Yasmeen; Prabu, Athiveeraramapandian; Akil, Mohammed; McHugh, Neil; Toescu, Veronica; D'Cruz, David; Khamashta, Munther A; Maddison, Peter; Isenberg, David A; Gordon, Caroline
2009-12-01
To develop an additive numerical scoring scheme for the Classic BILAG index. SLE patients were recruited into this multi-centre cross-sectional study. At every assessment, data were collected on disease activity and therapy. Logistic regression was used to model an increase in therapy, as an indicator of active disease, by the Classic BILAG score in eight systems. As both indicate inactivity, scores of D and E were set to 0 and used as the baseline in the fitted model. The coefficients from the fitted model were used to determine the numerical values for Grades A, B and C. Different scoring schemes were then compared using receiver operating characteristic (ROC) curves. Validation analysis was performed using assessments from a single centre. There were 1510 assessments from 369 SLE patients. The currently used coding scheme (A = 9, B = 3, C = 1 and D/E = 0) did not fit the data well. The regression model suggested three possible numerical scoring schemes: (i) A = 11, B = 6, C = 1 and D/E = 0; (ii) A = 12, B = 6, C = 1 and D/E = 0; and (iii) A = 11, B = 7, C = 1 and D/E = 0. These schemes produced comparable ROC curves. Based on this, A = 12, B = 6, C = 1 and D/E = 0 seemed a reasonable and practical choice. The validation analysis suggested that although the A = 12, B = 6, C = 1 and D/E = 0 coding is still reasonable, a scheme with slightly less weighting for B, such as A = 12, B = 5, C = 1 and D/E = 0, may be more appropriate. A reasonable additive numerical scoring scheme based on treatment decision for the Classic BILAG index is A = 12, B = 5, C = 1, D = 0 and E = 0.
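The additive scheme stated in the conclusion can be sketched as a simple lookup and sum over the eight system grades. The point values come from the abstract (A = 12, B = 5, C = 1, D = 0, E = 0); the function name and the example grades are hypothetical.

```python
# Point values from the scoring scheme stated in the abstract's conclusion.
GRADE_POINTS = {"A": 12, "B": 5, "C": 1, "D": 0, "E": 0}


def bilag_numeric_score(system_grades):
    """Sum the numeric points for a patient's per-system grades."""
    return sum(GRADE_POINTS[grade] for grade in system_grades)


# Hypothetical assessment: one grade for each of the eight systems.
example_grades = ["A", "B", "C", "D", "E", "D", "C", "B"]
print(bilag_numeric_score(example_grades))
```

An unrecognized grade raises a `KeyError`, which is a deliberate choice in this sketch so that data-entry errors are not silently scored as zero.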
Tsuyuki, Kiyomi; Surratt, Hilary L
2015-05-01
Antiretroviral (ARV) medication diversion to the illicit market has been documented in South Florida, and linked to sub-optimal adherence in people living with HIV. ARV diversion reflects an unmet need for care in vulnerable populations that have difficulty engaging in consistent HIV care due to competing needs and co-morbidities. This study applies the Gelberg-Andersen behavioral model of health care utilization for vulnerable populations to understand how social vulnerability is linked to ARV diversion and adherence. Cross-sectional data were collected from a targeted sample of vulnerable people living with HIV in South Florida between 2010 and 2012 (n = 503). Structured interviews collected quantitative data on ARV diversion, access and utilization of care, and ARV adherence. Logistic regression was used to estimate the goodness-of-fit of additive models that test domain fit. Linear regression was used to estimate the effects of social vulnerability and ARV diversion on ARV adherence. The best fitting model to predict ARV diversion identifies having a low monthly income and unstable HIV care as salient enabling factors that promote ARV diversion. Importantly, health care need factors did not protect against ARV diversion, evidence that immediate competing needs are prioritized even in the face of poor health for this sample. We also find that ARV diversion provides a link between social vulnerability and sub-optimal ARV adherence, with ARV diversion and domains from the Behavioral Model explaining 25% of the variation in ARV adherence. Our analyses reveal a great need to improve engagement in HIV care for vulnerable populations by strengthening enabling factors (e.g. patient-provider relationship) to improve retention in HIV care and ARV adherence.
Numerical scoring for the Classic BILAG index
Cresswell, Lynne; Yee, Chee-Seng; Farewell, Vernon; Rahman, Anisur; Teh, Lee-Suan; Griffiths, Bridget; Bruce, Ian N.; Ahmad, Yasmeen; Prabu, Athiveeraramapandian; Akil, Mohammed; McHugh, Neil; Toescu, Veronica; D’Cruz, David; Khamashta, Munther A.; Maddison, Peter; Isenberg, David A.
2009-01-01
Objective. To develop an additive numerical scoring scheme for the Classic BILAG index. Methods. SLE patients were recruited into this multi-centre cross-sectional study. At every assessment, data were collected on disease activity and therapy. Logistic regression was used to model an increase in therapy, as an indicator of active disease, by the Classic BILAG score in eight systems. As both indicate inactivity, scores of D and E were set to 0 and used as the baseline in the fitted model. The coefficients from the fitted model were used to determine the numerical values for Grades A, B and C. Different scoring schemes were then compared using receiver operating characteristic (ROC) curves. Validation analysis was performed using assessments from a single centre. Results. There were 1510 assessments from 369 SLE patients. The currently used coding scheme (A = 9, B = 3, C = 1 and D/E = 0) did not fit the data well. The regression model suggested three possible numerical scoring schemes: (i) A = 11, B = 6, C = 1 and D/E = 0; (ii) A = 12, B = 6, C = 1 and D/E = 0; and (iii) A = 11, B = 7, C = 1 and D/E = 0. These schemes produced comparable ROC curves. Based on this, A = 12, B = 6, C = 1 and D/E = 0 seemed a reasonable and practical choice. The validation analysis suggested that although the A = 12, B = 6, C = 1 and D/E = 0 coding is still reasonable, a scheme with slightly less weighting for B, such as A = 12, B = 5, C = 1 and D/E = 0, may be more appropriate. Conclusions. A reasonable additive numerical scoring scheme based on treatment decision for the Classic BILAG index is A = 12, B = 5, C = 1, D = 0 and E = 0. PMID:19779027
ERIC Educational Resources Information Center
Osborne, Jason W.
2012-01-01
Logistic regression is slowly gaining acceptance in the social sciences, and fills an important niche in the researcher's toolkit: being able to predict important outcomes that are not continuous in nature. While OLS regression is a valuable tool, it cannot routinely be used to predict outcomes that are binary or categorical in nature. These…
Margolis, Lewis H; Mayer, Michelle; Clark, Kathryn A; Farel, Anita M
2011-08-01
To examine the relationship between measures of state economic, political, health services, and Title V capacity and individual level measures of the well-being of CSHCN. We selected five measures of Title V capacity from the Title V Information System and 13 state capacity measures from a variety of data sources, and eight indicators of intermediate health outcomes from the National Survey of Children with Special Health Care Needs. To assess the associations between Title V capacity and health services outcomes, we used stepwise regression to identify significant capacity measures while accounting for the survey design and clustering of observations by state. To assess the associations between economic, political and health systems capacity and health outcomes we fit weighted logistic regression models for each outcome, using a stepwise procedure to reduce the models. Using statistically significant capacity measures from the stepwise models, we fit reduced random effects logistic regression models to account for clustering of observations by state. Few measures of Title V and state capacity were associated with health services outcomes. For health systems measures, a higher percentage of uninsured children was associated with decreased odds of receipt of early intervention services, decreased odds of receipt of professional care coordination, and increased odds of delayed or missed care. Parents in states with higher per capita Medicaid expenditures on children were more likely to report receipt of special education services. Only two state capacity measures were associated explicitly with Title V: states with higher generalist physician to population ratios were associated with a greater likelihood of parent report of having heard of Title V and states with higher per capita gross state product were less likely to be associated with a report of using Title V services, conditional on having heard of Title V. 
The state level measure of family participation in Title V governance was negatively associated with receipt of care coordination and having used Title V services. The measures of state economic, political, health systems, and Title V capacity that we have analyzed are only weakly associated with the well-being of children with special health care needs. If Congress and other policymakers increase the expectations of the states in assuring that the needs of CSHCN and their families are addressed, it is essential to be cognizant of the capacities of the states to undertake that role.
Bar-Gera, H; Musicant, O; Schechtman, E; Ze'evi, T
2016-11-01
The yellow signal driver behavior, reflecting the dilemma zone behavior, is analyzed using naturalistic data from digital enforcement cameras. The key variable in the analysis is the entrance time after the yellow onset, and its distribution. This distribution can assist in determining two critical outcomes: the safety outcome related to red-light-running angle accidents, and the efficiency outcome. The connection to other approaches for evaluating the yellow signal driver behavior is also discussed. The dataset was obtained from 37 digital enforcement cameras at non-urban signalized intersections in Israel, over a period of nearly two years. The data contain more than 200 million vehicle entrances, of which 2.3% (∼5 million vehicles) entered the intersection during the yellow phase. In all non-urban signalized intersections in Israel the green phase ends with 3s of flashing green, followed by 3s of yellow. In most non-urban signalized roads in Israel the posted speed limit is 90km/h. Our analysis focuses on crossings during the yellow phase and the first 1.5s of the red phase. The analysis method consists of two stages. In the first stage we tested whether the frequency of crossings is constant at the beginning of the yellow phase. We found that the pattern was stable (i.e., the frequencies were constant) at 18 intersections, nearly stable at 13 intersections and unstable at 6 intersections. In addition to the 6 intersections with unstable patterns, two other outlying intersections were excluded from subsequent analysis. Logistic regression models were fitted for each of the remaining 29 intersections. We examined both standard (exponential) logistic regression and four-parameter logistic regression. The results show a clear advantage for the former. The estimated parameters show that the time when the frequency of crossing reduces to half ranges from 1.7 to 2.3s after yellow onset. 
The duration of the reduction of the relative frequency from 0.9 to 0.1 ranged from 1.9 to 2.9s. Copyright © 2015 Elsevier Ltd. All rights reserved.
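The two summary quantities reported above (the time at which the relative frequency halves, and the duration of the drop from 0.9 to 0.1) map directly onto the parameters of a decreasing logistic curve. The sketch below assumes the parameterization f(t) = 1 / (1 + exp((t - t50) / scale)), which may differ from the one used in the study; under it, the 0.9-to-0.1 duration equals 2 · ln(9) · scale. The function names and the example values (t50 = 2.0 s, duration = 2.4 s, chosen inside the reported ranges) are illustrative.

```python
import math


def relative_frequency(t, t50, scale):
    """Decreasing logistic curve: 1 at early times, 0.5 at t50, 0 at late times."""
    return 1.0 / (1.0 + math.exp((t - t50) / scale))


def scale_from_duration(duration_90_to_10):
    """Scale parameter implied by the time taken to fall from 0.9 to 0.1."""
    return duration_90_to_10 / (2.0 * math.log(9.0))


# Illustrative values within the ranges reported in the abstract.
t50 = 2.0
duration = 2.4
scale = scale_from_duration(duration)
for t in (t50 - duration / 2.0, t50, t50 + duration / 2.0):
    print(f"t={t:.1f}s -> relative frequency {relative_frequency(t, t50, scale):.2f}")
```

Because the curve is symmetric about t50 in this parameterization, the 0.9 and 0.1 points sit half the duration before and after the halving time.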
Factors influencing hospital high length of stay outliers
2012-01-01
Background The study of length of stay (LOS) outliers is important for the management and financing of hospitals. Our aim was to study variables associated with high LOS outliers and their evolution over time. Methods We used hospital administrative data from inpatient episodes in public acute care hospitals in the Portuguese National Health Service (NHS), with discharges between years 2000 and 2009, together with some hospital characteristics. The dependent variable, LOS outliers, was calculated for each diagnosis related group (DRG) using a trim point defined for each year by the geometric mean plus two standard deviations. Hospitals were classified on the basis of administrative, economic and teaching characteristics. We also studied the influence of comorbidities and readmissions. Logistic regression models, including a multivariable logistic regression, were used in the analysis. All the logistic regressions were fitted using generalized estimating equations (GEE). Results In nearly nine million inpatient episodes analysed we found a proportion of 3.9% high LOS outliers, accounting for 19.2% of total inpatient days. The number of hospital patient discharges increased between years 2000 and 2005 and slightly decreased after that. The proportion of outliers ranged between the lowest value of 3.6% (in years 2001 and 2002) and the highest value of 4.3% in 2009. Teaching hospitals with over 1,000 beds have significantly more outliers than other hospitals, even after adjustment for readmissions and several patient characteristics. Conclusions In recent years both average LOS and high LOS outliers have been increasing in Portuguese NHS hospitals. As high LOS outliers represent an important proportion of the total inpatient days, this should be seen as an important alert for the management of hospitals and for national health policies. As expected, age, type of admission, and hospital type were significantly associated with high LOS outliers. 
The proportion of high outliers does not seem to be related to their financial coverage; they should be studied in order to highlight areas for further investigation. The increasing complexity of both hospitals and patients may be the single most important determinant of high LOS outliers and must therefore be taken into account by health managers when considering hospital costs. PMID:22906386
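The trim-point rule described in the Methods above (geometric mean plus two standard deviations, per DRG and year) can be sketched as follows. This is only an illustrative sketch: the abstract does not say which standard deviation is used, so the arithmetic sample standard deviation is an assumption here, and the function names and LOS values are hypothetical.

```python
import math
import statistics


def trim_point(los_days):
    """Return geometric mean + 2 * SD for a list of positive LOS values.

    Assumption: the SD is the arithmetic sample standard deviation;
    the abstract does not specify this detail.
    """
    # Geometric mean computed as exp(mean(log(x))); requires x > 0.
    geo_mean = math.exp(statistics.fmean(math.log(d) for d in los_days))
    sd = statistics.stdev(los_days)
    return geo_mean + 2 * sd


def high_outliers(los_days):
    """Return the episodes whose LOS exceeds the trim point."""
    cut = trim_point(los_days)
    return [d for d in los_days if d > cut]


# Hypothetical LOS values (days) for one DRG in one year.
los = [2, 3, 3, 4, 4, 5, 5, 6, 7, 60]
print(round(trim_point(los), 2), high_outliers(los))
```

In practice the rule would be applied separately to each DRG and year, as the abstract describes, before the outlier flag is used as the dependent variable.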
Barriers and benefits of a healthy diet in Spain: comparison with other European member states.
Holgado, B; de Irala-Estévez, J; Martínez-González, M A; Gibney, M; Kearney, J; Martínez, J A
2000-06-01
Our purpose was to identify the main barriers and benefits perceived by European citizens in regard to following a healthy diet and to assess the differences in expected benefits and difficulties between Spain and the remaining countries of the European Union. A cross-sectional study in which quota-controlled, nationally representative samples of approximately 1000 adults from each country completed a questionnaire. The survey was carried out between October 1995 and February 1996 in the 15 member states of the European Union. Participants (aged 15 y and older) were selected and interviewed in their homes about their attitudes towards healthy diets. They were asked to select two options from a list of 22 potential barriers to achieving a healthy diet and the benefits derived from a healthy diet. The associations of the perceived benefits or barriers with the sociodemographic variables within Spain and the rest of the European Union were compared with the Pearson chi-squared test and the chi-squared linear trend test. Two multivariate logistic regression models were also fitted to assess the characteristics independently related to the selection of 'Resistance to change' among the main barriers and to the selection of 'Prevent disease/stay healthy' as the main perceived benefit. The barrier most frequently mentioned in Spain was 'Irregular work hours' (29.7%), in contrast with the rest of the European Union, where 'Giving up foods that I like' was the barrier most often chosen (26.2%). In the multivariate logistic regression model studying resistance to change, Spaniards were less resistant to change than the rest of the European Union. The benefit most frequently mentioned across Europe was 'Prevent disease/stay healthy'. In the multivariate logistic regression model, women, older individuals, and people with a higher educational level were more likely to choose this benefit.
It is apparent that there are many barriers to achieve healthy eating, mostly lack of time. For this reason a higher availability of food in line with the nutrition guidelines could be helpful. The population could have a better knowledge of the benefits derived from a healthy diet.
Predicting Social Trust with Binary Logistic Regression
ERIC Educational Resources Information Center
Adwere-Boamah, Joseph; Hufstedler, Shirley
2015-01-01
This study used binary logistic regression to predict social trust with five demographic variables from a national sample of adult individuals who participated in The General Social Survey (GSS) in 2012. The five predictor variables were respondents' highest degree earned, race, sex, general happiness and the importance of personally assisting…
Effect of folic acid on appetite in children: ordinal logistic and fuzzy logistic regressions.
Namdari, Mahshid; Abadi, Alireza; Taheri, S Mahmoud; Rezaei, Mansour; Kalantari, Naser; Omidvar, Nasrin
2014-03-01
Reduced appetite and low food intake are often a concern in preschool children, since it can lead to malnutrition, a leading cause of impaired growth and mortality in childhood. It is occasionally considered that folic acid has a positive effect on appetite enhancement and consequently growth in children. The aim of this study was to assess the effect of folic acid on the appetite of preschool children 3 to 6 y old. The study sample included 127 children ages 3 to 6 who were randomly selected from 20 preschools in the city of Tehran in 2011. Since appetite was measured by linguistic terms, a fuzzy logistic regression was applied for modeling. The obtained results were compared with a statistical ordinal logistic model. After controlling for the potential confounders, in a statistical ordinal logistic model, serum folate showed a significantly positive effect on appetite. A small but positive effect of folate was detected by fuzzy logistic regression. Based on fuzzy regression, the risk for poor appetite in preschool children was related to the employment status of their mothers. In this study, a positive association was detected between the levels of serum folate and improved appetite. For further investigation, a randomized controlled, double-blind clinical trial could be helpful to address causality. Copyright © 2014 Elsevier Inc. All rights reserved.
[Exploratory analysis of work engagement: use of the Utrecht scale in Benin].
Ahanhanzo, Yolaine Glèlè; Kittel, France; Paraïso, Noël Moussiliou; Godin, Isabelle; Wilmet-Dramaix, Michèle; Makoutodé, Michel
2014-01-01
Work engagement, an emerging concept in the field of positive psychology in the workplace, is not well known in developing countries. Defined as a positive and fulfilling mindset related to work, it recalls a positive attitude conducive to performance and needs to be investigated. In the context of the socioeconomic crisis of health workers, and with the chronic issue of poor quality of data, this study was designed to identify the factors associated with work engagement among health workers in charge of data collection in the Benin Routine Health Information System. This study was a cross-sectional and analytical study targeting health workers in charge of data collection in public and private health centres. The dependent variable was work engagement and independent variables were sociodemographic and professional features, personal and professional resources and perception of technical factors. Logistic regression was used. The adequacy of the model was tested with the Hosmer-Lemeshow goodness of fit test. The results indicate that the level of work engagement is similar to that observed in previous studies. Predictors identified in logistic regression are perception of technical factors, location of the job, and personal resources, such as level of effort and overcommitment. This study identified factors associated with work engagement in a developing country, and adds to the knowledge concerning this new concept in Benin. The findings can contribute to research for improvement of human resources management in the health sector to achieve real performance and development.
A framework for evaluating student perceptions of health policy training in medical school.
Patel, Mitesh S; Lypson, Monica L; Miller, D Douglas; Davis, Matthew M
2014-10-01
Nearly half of graduating medical students in the United States report that medical school provides inadequate instruction in topics related to health policy. Although most medical schools report some form of policy education, there is no standard for teaching core concepts and evaluating student satisfaction. Responses to the Association of American Medical Colleges' Medical School Graduation Questionnaire were obtained for the years 2007-2008 and 2011-2012 and mapped to four domains of training in health policy curricula: systems and principles; value and equity; quality and safety; and politics and law. Chi-square tests were used to test differences among unadjusted temporal trends. Multiple logistic regression models were fit to the outcome variables and adjusted for student characteristics, student preferences, and medical school characteristics. Compared with 2007-2008, students' perceptions of training in 2011-2012 increased on a relative basis by 11.7% for components within systems and principles, 2.8% for quality and safety, and 6.8% for value and equity. Components within politics and law had a composite decline of 4.8%. Multiple logistic regression models found higher odds of reporting satisfaction with training over time for all components within the domains of systems and principles, quality and safety, and value and equity (P < .01), with the exception of medical economics. Medical student perceptions of training in health policy improved over time. Causal factors for these trends require further study. Despite improvement, nearly 40% of graduating medical students still report inadequate instruction in health policy.
Risk factors for repetitive strain injuries among school teachers in Thailand.
Chaiklieng, Sunisa; Suggaravetsiri, Pornnapa
2012-01-01
Prolonged posture, static work and repetition have previously been reported as causes of repetitive strain injuries (RSIs) among workers, including teachers. This cross-sectional analytic study aimed to investigate the prevalence and risk factors of RSIs among school teachers. Participants were 452 full-time school teachers in Thailand. Data were collected using structured questionnaires, illuminance measurements and physical fitness tests. Descriptive statistics and inferential statistics (chi-square test and multiple logistic regression analysis) were used. Most teachers in this study were female (57.3%); the mean work experience was 22.6 ± 10.4 years. The six-month prevalence of RSIs was 73.7%. The univariate analysis identified the following risk factors for RSIs: chronic disease (OR=1.8; 95% CI = 1.16-2.73), history of trauma (OR=2.0; 95% CI = 1.02-4.01), a family member with RSIs (OR=2.0; 95% CI = 1.02-4.01), stretching to write on the board (OR=1.7; 95% CI = 1.06-1.70) and high-heeled shoes >2 inches (OR=1.6; 95% CI = 1.03-2.51). Multiple logistic regression analysis showed that chronic disease and high-heeled shoes >2 inches were significantly related to developing RSIs. Poor grip strength and back muscle flexibility significantly affected RSIs in teachers. In conclusion, RSIs were highly prevalent among school teachers, who should be made aware of health promotion measures to prevent RSIs.
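Univariate odds ratios with 95% confidence intervals, as reported in the abstract above, are commonly obtained from a 2x2 table with a Wald interval on the log-odds scale. The sketch below illustrates that standard calculation; the counts are hypothetical and are not taken from the study, and the abstract does not state which interval method the authors used.

```python
import math


def odds_ratio_ci(a, b, c, d, z=1.96):
    """Unadjusted odds ratio and Wald confidence interval from a 2x2 table.

    a: exposed with outcome      b: exposed without outcome
    c: unexposed with outcome    d: unexposed without outcome
    All four cells must be non-zero.
    """
    odds_ratio = (a * d) / (b * c)
    # Standard error of log(OR) for a 2x2 table.
    se_log_or = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(odds_ratio) - z * se_log_or)
    upper = math.exp(math.log(odds_ratio) + z * se_log_or)
    return odds_ratio, lower, upper


# Hypothetical counts, for illustration only.
or_, lo, hi = odds_ratio_ci(a=40, b=20, c=30, d=30)
print(f"OR={or_:.2f}; 95% CI = {lo:.2f}-{hi:.2f}")
```

An interval that includes 1, as in this made-up example, would not indicate a significant association at the 5% level.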
Resilience model for parents of children with cancer in mainland China-An exploratory study.
Ye, Zeng Jie; Qiu, Hong Zhong; Li, Peng Fei; Liang, Mu Zi; Wang, Shu Ni; Quan, Xiao Ming
2017-04-01
Parents have psychosocial functions that are critical for the entire family. Therefore, when their child is diagnosed with cancer, it is important that they exhibit resilience, which is the ability to preserve their emotional and physical well-being in the face of stress. The Resilience Model for Parents of Children with Cancer (RMP-CC) was developed to increase our understanding of how resilience is positively and negatively affected by protective and risk factors, respectively, in Chinese parents with children diagnosed with cancer. To evaluate the RMP-CC, the latent psychosocial variables and demographics of 229 parents were evaluated using exploratory structural equation modeling (SEM) and logistic regression. The majority of goodness-of-fit indices indicate that the SEM of RMP-CC was a good model with a high level of variance in resilience (58%). Logistic regression revealed that two demographics, educational level and clinical classification of cancer, accounted for 12% of this variance. Our results indicate that RMP-CC is an effective structure by which to develop mainland Chinese parent-focused interventions that are grounded in the experiences of the parents as caregivers of children who have been diagnosed with cancer. RMP-CC allows for a better understanding of what these parents experience while their children undergo treatment. Further studies will be needed to confirm the efficiency of the current structure, and would assist in further refinement of its clinical applications. Copyright © 2017 Elsevier Ltd. All rights reserved.
Islam Mondal, Md. Nazrul; Nasir Ullah, Md. Monzur Morshad; Khan, Md. Nuruzzaman; Islam, Mohammad Zamirul; Islam, Md. Nurul; Moni, Sabiha Yasmin; Hoque, Md. Nazrul; Rahman, Md. Mashiur
2015-01-01
Background: Reproductive health (RH) is a critical component of women’s health and overall well-being around the world, especially in developing countries. We examine the factors that determine knowledge of RH care among female university students in Bangladesh. Methods: Data on 300 female students were collected from Rajshahi University, Bangladesh through a structured questionnaire using purposive sampling technique. The data were used for univariate analysis, to carry out the description of the variables; bivariate analysis was used to examine the associations between the variables; and finally, multivariate analysis (binary logistic regression model) was used to examine and fit the model and interpret the parameter estimates, especially in terms of odds ratios. Results: The results revealed that more than one-third (34.3%) respondents do not have sufficient knowledge of RH care. The χ2-test identified the significant (p < 0.05) associations between respondents’ knowledge of RH care with respondents’ age, education, family type, watching television; and knowledge about pregnancy, family planning, and contraceptive use. Finally, the binary logistic regression model identified respondents’ age, education, family type; and knowledge about family planning, and contraceptive use as the significant (p < 0.05) predictors of RH care. Conclusions and Global Health Implications: Knowledge of RH care among female university students was found unsatisfactory. Government and concerned organizations should promote and strengthen various health education programs to focus on RH care especially for the female university students in Bangladesh. PMID:27622005
Patrikar, S R; Bhalwar, R; Datta, A; Basannar, D R
2008-07-01
Male preference has been a well-known phenomenon worldwide since ancient times. A descriptive study was carried out to assess the attitude of women towards the birth of a son, use of contraception methods and sex determination methods in the rural village of Kasurdi in Pune district. Univariate analysis was carried out by considering each factor determining sex preference separately, and a logistic regression model was also used. Adequacy of fit of the model was also tested. Out of 110 respondents interviewed, 62.7% felt that a male child is necessary in the family. Univariate analysis revealed that sex of the first child, concern experienced during the second pregnancy with regard to the sex of the child, number of children in the family and type of family were significant factors contributing to son preference. The analysis under the logistic regression model revealed that sex of the first child and concern experienced in the second pregnancy with respect to the sex of the second child are the most dominant and significant factors in the causation of son preference. The difference between family sizes when compared by sex of the first child was statistically significant, signifying that if the first child is male then it hardly matters whether the second child is male or female, but if the first child is female then the families end up with a larger family size. On average, most of the respondents favour two children with an equal share of male and female children.
Prediction of Fitness to Drive in Patients with Alzheimer's Dementia
Piersma, Dafne; Fuermaier, Anselm B. M.; de Waard, Dick; Davidse, Ragnhild J.; de Groot, Jolieke; Doumen, Michelle J. A.; Bredewoud, Ruud A.; Claesen, René; Lemstra, Afina W.; Vermeeren, Annemiek; Ponds, Rudolf; Verhey, Frans; Brouwer, Wiebo H.; Tucha, Oliver
2016-01-01
The number of patients with Alzheimer’s disease (AD) is increasing and so is the number of patients driving a car. To enable patients to retain their mobility while at the same time not endangering public safety, each patient should be assessed for fitness to drive. The aim of this study is to develop a method to assess fitness to drive in a clinical setting, using three types of assessments, i.e. clinical interviews, neuropsychological assessment and driving simulator rides. The goals are (1) to determine for each type of assessment which combination of measures is most predictive for on-road driving performance, (2) to compare the predictive value of clinical interviews, neuropsychological assessment and driving simulator evaluation and (3) to determine which combination of these assessments provides the best prediction of fitness to drive. Eighty-one patients with AD and 45 healthy individuals participated. All participated in a clinical interview, and were administered a neuropsychological test battery and a driving simulator ride (predictors). The criterion fitness to drive was determined in an on-road driving assessment by experts of the CBR Dutch driving test organisation according to their official protocol. The validity of the predictors to determine fitness to drive was explored by means of logistic regression analyses, discriminant function analyses, as well as receiver operating curve analyses. We found that all three types of assessments are predictive of on-road driving performance. Neuropsychological assessment had the highest classification accuracy followed by driving simulator rides and clinical interviews. However, combining all three types of assessments yielded the best prediction for fitness to drive in patients with AD with an overall accuracy of 92.7%, which makes this method highly valid for assessing fitness to drive in AD. This method may be used to advise patients with AD and their family members about fitness to drive. PMID:26910535
Evaluation of weighted regression and sample size in developing a taper model for loblolly pine
Kenneth L. Cormier; Robin M. Reich; Raymond L. Czaplewski; William A. Bechtold
1992-01-01
A stem profile model, fit using pseudo-likelihood weighted regression, was used to estimate merchantable volume of loblolly pine (Pinus taeda L.) in the southeast. The weighted regression increased model fit marginally, but did not substantially increase model performance. In all cases, the unweighted regression models performed as well as the...
Cheung, Li C; Pan, Qing; Hyun, Noorie; Schiffman, Mark; Fetterman, Barbara; Castle, Philip E; Lorey, Thomas; Katki, Hormuzd A
2017-09-30
For cost-effectiveness and efficiency, many large-scale general-purpose cohort studies are being assembled within large health-care providers who use electronic health records. Two key features of such data are that incident disease is interval-censored between irregular visits and there can be pre-existing (prevalent) disease. Because prevalent disease is not always immediately diagnosed, some disease diagnosed at later visits are actually undiagnosed prevalent disease. We consider prevalent disease as a point mass at time zero for clinical applications where there is no interest in time of prevalent disease onset. We demonstrate that the naive Kaplan-Meier cumulative risk estimator underestimates risks at early time points and overestimates later risks. We propose a general family of mixture models for undiagnosed prevalent disease and interval-censored incident disease that we call prevalence-incidence models. Parameters for parametric prevalence-incidence models, such as the logistic regression and Weibull survival (logistic-Weibull) model, are estimated by direct likelihood maximization or by EM algorithm. Non-parametric methods are proposed to calculate cumulative risks for cases without covariates. We compare naive Kaplan-Meier, logistic-Weibull, and non-parametric estimates of cumulative risk in the cervical cancer screening program at Kaiser Permanente Northern California. Kaplan-Meier provided poor estimates while the logistic-Weibull model was a close fit to the non-parametric. Our findings support our use of logistic-Weibull models to develop the risk estimates that underlie current US risk-based cervical cancer screening guidelines. Published 2017. This article has been contributed to by US Government employees and their work is in the public domain in the USA.
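One way to read the mixture described above is that cumulative risk at time t equals the prevalent fraction (a point mass at time zero) plus the remaining fraction multiplied by the incident-disease distribution. The sketch below assumes a logistic model for prevalence and a Weibull distribution for incidence; the exact parameterization used by the authors is not given in the abstract, and all parameter values here are hypothetical.

```python
import math


def expit(x):
    """Inverse logit: maps a linear predictor to a probability."""
    return 1.0 / (1.0 + math.exp(-x))


def cumulative_risk(t, prevalence, shape, scale):
    """Cumulative risk under a prevalence-incidence mixture (a sketch).

    prevalence: probability of disease already present at time zero.
    shape, scale: Weibull parameters for time to incident disease
                  (assumed parameterization: F(t) = 1 - exp(-(t/scale)**shape)).
    """
    if t < 0:
        raise ValueError("t must be non-negative")
    weibull_cdf = 1.0 - math.exp(-((t / scale) ** shape))
    return prevalence + (1.0 - prevalence) * weibull_cdf


# Hypothetical values: intercept-only logistic model for prevalence.
p0 = expit(-3.0)
risks = [cumulative_risk(t, p0, shape=1.2, scale=40.0) for t in (0, 1, 5, 10)]
print([round(r, 4) for r in risks])
```

Because the risk at time zero equals the prevalence rather than zero, a curve of this form starts above the origin, which is the feature a naive Kaplan-Meier estimate cannot represent.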
Hong, Ickpyo; Coker-Bolt, Patty; Anderson, Kelly R.; Lee, Danbi
2016-01-01
OBJECTIVE. This study examined the relationship between childhood obesity and overweight and functional activity and its enjoyment. METHOD. A cross-sectional design was used to analyze data from the 2012 National Health and Nutrition Examination Survey National Youth Fitness Survey. Multivariate logistic regression models were used. RESULTS. Data for 1,640 children ages 3–15 yr were retrieved. Physical activity was negatively associated with risk of obesity (odds ratio [OR] = 0.93; 95% confidence interval [CI] [0.87, 0.98]). Although children who were obese and overweight were more likely to have functional limitations (ORs = 1.58–1.61), their enjoyment of physical activity participation was not significantly different from that of the healthy-weight group. CONCLUSION. Physical activity lowered the risk of obesity. Children who were obese had functional limitations compared with healthy-weight children, but both groups enjoyed physical activity equally. Future studies are needed to determine barriers to participation among these children in recreation and sporting activities. PMID:27548862
NASA Astrophysics Data System (ADS)
Nandy, Sreyankar; Mostafa, Atahar; Kumavor, Patrick D.; Sanders, Melinda; Brewer, Molly; Zhu, Quing
2016-10-01
A spatial frequency domain imaging (SFDI) system was developed for characterizing ex vivo human ovarian tissue using wide-field absorption and scattering properties and their spatial heterogeneities. Based on the observed differences between absorption and scattering images of different ovarian tissue groups, six parameters were quantitatively extracted. These are the mean absorption and scattering, spatial heterogeneities of both absorption and scattering maps measured by a standard deviation, and a fitting error of a Gaussian model fitted to normalized mean Radon transform of the absorption and scattering maps. A logistic regression model was used for classification of malignant and normal ovarian tissues. A sensitivity of 95%, specificity of 100%, and area under the curve of 0.98 were obtained using six parameters extracted from the SFDI images. The preliminary results demonstrate the diagnostic potential of the SFDI method for quantitative characterization of wide-field optical properties and the spatial distribution heterogeneity of human ovarian tissue. SFDI could be an extremely robust and valuable tool for evaluation of the ovary and detection of neoplastic changes of ovarian cancer.
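Sensitivity and specificity, as quoted above, are simple ratios from a confusion table. The sketch below shows the standard definitions; the counts are hypothetical, chosen only so that the arithmetic gives 95% and 100%, and are not the study's data.

```python
def sensitivity(true_pos, false_neg):
    """Fraction of truly malignant samples classified as malignant."""
    return true_pos / (true_pos + false_neg)


def specificity(true_neg, false_pos):
    """Fraction of truly normal samples classified as normal."""
    return true_neg / (true_neg + false_pos)


# Hypothetical counts, for illustration only.
sens = sensitivity(true_pos=19, false_neg=1)
spec = specificity(true_neg=20, false_pos=0)
print(f"sensitivity={sens:.0%}, specificity={spec:.0%}")
```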
Drawing Nomograms with R: applications to categorical outcome and survival data.
Zhang, Zhongheng; Kattan, Michael W
2017-05-01
Outcome prediction is a major task in clinical medicine. The standard approach to this work is to collect a variety of predictors and build a model of appropriate type. The model is a mathematical equation that connects the outcome of interest with the predictors. A new patient with given clinical characteristics can be predicted for outcome with this model. However, the equation describing the relationship between predictors and outcome is often complex and the computation requires software for practical use. There is another method called nomogram which is a graphical calculating device allowing an approximate graphical computation of a mathematical function. In this article, we describe how to draw nomograms for various outcomes with nomogram() function. Binary outcome is fit by logistic regression model and the outcome of interest is the probability of the event of interest. Ordinal outcome variable is also discussed. Survival analysis can be fit with parametric model to fully describe the distributions of survival time. Statistics such as the median survival time, survival probability up to a specific time point are taken as the outcome of interest.
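For a binary outcome, the "mathematical equation" mentioned above is the logistic model, which turns a weighted sum of predictors into a probability; a nomogram is a graphical way of doing this sum by hand. The sketch below shows that calculation in Python rather than R, with hypothetical coefficient names and values that do not come from the article.

```python
import math


def predict_probability(intercept, coefs, values):
    """Predicted event probability from a fitted logistic regression.

    coefs and values are dicts keyed by predictor name (hypothetical names).
    """
    # Linear predictor: intercept plus the sum of coefficient * value.
    linear_predictor = intercept + sum(coefs[k] * values[k] for k in coefs)
    # Inverse logit converts the linear predictor to a probability.
    return 1.0 / (1.0 + math.exp(-linear_predictor))


# Hypothetical model and patient, for illustration only.
coefs = {"age": 0.05, "smoker": 0.7}
patient = {"age": 60, "smoker": 0}
prob = predict_probability(-3.0, coefs, patient)
print(round(prob, 3))
```

A nomogram assigns each predictor a points scale proportional to its coefficient times its value, so that reading off and adding points reproduces the linear predictor computed here.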
Hong, Ickpyo; Coker-Bolt, Patty; Anderson, Kelly R; Lee, Danbi; Velozo, Craig A
2016-01-01
This study examined the relationship between childhood obesity and overweight and functional activity and its enjoyment. A cross-sectional design was used to analyze data from the 2012 National Health and Nutrition Examination Survey National Youth Fitness Survey. Multivariate logistic regression models were used. Data for 1,640 children ages 3-15 yr were retrieved. Physical activity was negatively associated with risk of obesity (odds ratio [OR] = 0.93; 95% confidence interval [CI] [0.87, 0.98]). Although children who were obese and overweight were more likely to have functional limitations (ORs = 1.58-1.61), their enjoyment of physical activity participation was not significantly different from that of the healthy-weight group. Physical activity lowered the risk of obesity. Children who were obese had functional limitations compared with healthy-weight children, but both groups enjoyed physical activity equally. Future studies are needed to determine barriers to participation among these children in recreation and sporting activities. Copyright © 2016 by the American Occupational Therapy Association, Inc.
Predictive factors for work capacity in patients with musculoskeletal disorders.
Lydell, Marie; Baigi, Amir; Marklund, Bertil; Månsson, Jörgen
2005-09-01
To identify predictive factors for work capacity in patients with musculoskeletal disorders. A descriptive, evaluative, quantitative study. The study was based on 385 patients who participated in a rehabilitation programme. Patients were divided into 2 groups depending on their ability to work. The groups were compared with each other with regard to sociodemographic factors, diagnoses, disability pension and number of sick days. The patient's level of exercise habits, ability to undertake activities, physical capacity, pain and quality of life were compared further using logistic regression analysis. Predictive factors for work capacity, such as ability to undertake activities, quality of life and fitness on exercise, were identified as important independent factors. Other well-known factors, i.e. gender, age, education, pain and earlier sickness certification periods, were also identified. Factors that were not significantly different between the groups were employment status, profession, diagnosis and levels of exercise habits. Identifying predictors for ability to return to work is an essential task for deciding on suitable individual rehabilitation. This study identified new predictive factors, such as ability to undertake activities, quality of life and fitness on exercise.
Clustering performance comparison using K-means and expectation maximization algorithms.
Jung, Yong Gyu; Kang, Min Soo; Heo, Jun
2014-11-14
Clustering is an important means of data mining based on separating data categories by similar features. Unlike the classification algorithm, clustering belongs to the unsupervised type of algorithms. Two representatives of the clustering algorithms are the K-means and the expectation maximization (EM) algorithm. Linear regression analysis was extended to the category-type dependent variable, while logistic regression was achieved using a linear combination of independent variables. To predict the possibility of occurrence of an event, a statistical approach is used. However, the classification of all data by means of logistic regression analysis cannot guarantee the accuracy of the results. In this paper, the logistic regression analysis is applied to EM clusters and the K-means clustering method for quality assessment of red wine, and a method is proposed for ensuring the accuracy of the classification results.
Delva, J; Spencer, M S; Lin, J K
2000-01-01
This article compares estimates of the relative odds of nitrite use obtained from weighted unconditional logistic regression with estimates obtained from conditional logistic regression after post-stratification and matching of cases with controls by neighborhood of residence. We illustrate these methods by comparing the odds associated with nitrite use among adults of four racial/ethnic groups, with and without a high school education. We used aggregated data from the 1994-B through 1996 National Household Survey on Drug Abuse (NHSDA). Difference between the methods and implications for analysis and inference are discussed.
Austin, Peter C; Lee, Douglas S; Steyerberg, Ewout W; Tu, Jack V
2012-01-01
In biomedical research, the logistic regression model is the most commonly used method for predicting the probability of a binary outcome. While many clinical researchers have expressed an enthusiasm for regression trees, this method may have limited accuracy for predicting health outcomes. We aimed to evaluate the improvement that is achieved by using ensemble-based methods, including bootstrap aggregation (bagging) of regression trees, random forests, and boosted regression trees. We analyzed 30-day mortality in two large cohorts of patients hospitalized with either acute myocardial infarction (N = 16,230) or congestive heart failure (N = 15,848) in two distinct eras (1999–2001 and 2004–2005). We found that both the in-sample and out-of-sample prediction of ensemble methods offered substantial improvement in predicting cardiovascular mortality compared to conventional regression trees. However, conventional logistic regression models that incorporated restricted cubic smoothing splines had even better performance. We conclude that ensemble methods from the data mining and machine learning literature increase the predictive performance of regression trees, but may not lead to clear advantages over conventional logistic regression models for predicting short-term mortality in population-based samples of subjects with cardiovascular disease. PMID:22777999
Aerobic fitness does not modify the effect of FTO variation on body composition traits.
Huuskonen, Antti; Lappalainen, Jani; Oksala, Niku; Santtila, Matti; Häkkinen, Keijo; Kyröläinen, Heikki; Atalay, Mustafa
2012-01-01
Poor physical fitness and obesity are risk factors for all-cause morbidity and mortality. We aimed to clarify whether common genetic variants of key energy intake determinants in leptin (LEP), leptin receptor (LEPR), and fat mass and obesity-associated (FTO) are associated with aerobic and neuromuscular performance, and whether aerobic fitness can alter the effect of these genotypes on body composition. 846 healthy Finnish males of Caucasian origin were genotyped for FTO (rs8050136), LEP (rs7799039) and LEPR (rs8179183 and rs1137101) single nucleotide polymorphisms (SNPs), and studied for associations with maximal oxygen consumption, body fat percent, serum leptin levels, waist circumference and maximal force of leg extensor muscles. Genotype AA of the FTO SNP rs8050136 was associated with higher BMI and greater waist circumference compared to the genotype CC. In the general linear model, no significant interaction of FTO genotype with relative VO2max (mL·kg⁻¹·min⁻¹) or of FTO genotype with absolute VO2max (L·min⁻¹) on BMI or waist circumference was found. Main effects of aerobic performance on body composition traits were significant (p<0.001). Logistic regression modelling found no significant interaction between aerobic fitness and FTO genotype. LEP SNP rs7799039 and LEPR SNPs rs8179183 and rs1137101 were not associated with any of the measured variables, and no significant interactions of LEP or LEPR genotype with aerobic fitness were observed. In addition, none of the studied SNPs was associated with aerobic or neuromuscular performance. Aerobic fitness may not modify the effect of FTO variation on body composition traits. However, relative aerobic capacity is associated with lower BMI and waist circumference regardless of the FTO genotype. FTO, LEP and LEPR genotypes are unlikely to be associated with physical performance.
Efficient occupancy model-fitting for extensive citizen-science data.
Dennis, Emily B; Morgan, Byron J T; Freeman, Stephen N; Ridout, Martin S; Brereton, Tom M; Fox, Richard; Powney, Gary D; Roy, David B
2017-01-01
Appropriate large-scale citizen-science data present important new opportunities for biodiversity modelling, due in part to the wide spatial coverage of information. Recently proposed occupancy modelling approaches naturally incorporate random effects in order to account for annual variation in the composition of sites surveyed. In turn this leads to Bayesian analysis and model fitting, which are typically extremely time consuming. Motivated by presence-only records of occurrence from the UK Butterflies for the New Millennium data base, we present an alternative approach, in which site variation is described in a standard way through logistic regression on relevant environmental covariates. This allows efficient occupancy model-fitting using classical inference, which is easily achieved using standard computers. This is especially important when models need to be fitted each year, typically for many different species, as with British butterflies for example. Using both real and simulated data we demonstrate that the two approaches, with and without random effects, can result in similar conclusions regarding trends. There are many advantages to classical model-fitting, including the ability to compare a range of alternative models, identify appropriate covariates and assess model fit, using standard tools of maximum likelihood. In addition, modelling in terms of covariates provides opportunities for understanding the ecological processes that are in operation. We show that there is even greater potential; the classical approach allows us to construct regional indices simply, which indicate how changes in occupancy typically vary over a species' range. In addition we are also able to construct dynamic occupancy maps, which provide a novel, modern tool for examining temporal changes in species distribution. These new developments may be applied to a wide range of taxa, and are valuable at a time of climate change. 
They also have the potential to motivate citizen scientists.
Efficient occupancy model-fitting for extensive citizen-science data
Morgan, Byron J. T.; Freeman, Stephen N.; Ridout, Martin S.; Brereton, Tom M.; Fox, Richard; Powney, Gary D.; Roy, David B.
2017-01-01
Appropriate large-scale citizen-science data present important new opportunities for biodiversity modelling, due in part to the wide spatial coverage of information. Recently proposed occupancy modelling approaches naturally incorporate random effects in order to account for annual variation in the composition of sites surveyed. In turn this leads to Bayesian analysis and model fitting, which are typically extremely time consuming. Motivated by presence-only records of occurrence from the UK Butterflies for the New Millennium data base, we present an alternative approach, in which site variation is described in a standard way through logistic regression on relevant environmental covariates. This allows efficient occupancy model-fitting using classical inference, which is easily achieved using standard computers. This is especially important when models need to be fitted each year, typically for many different species, as with British butterflies for example. Using both real and simulated data we demonstrate that the two approaches, with and without random effects, can result in similar conclusions regarding trends. There are many advantages to classical model-fitting, including the ability to compare a range of alternative models, identify appropriate covariates and assess model fit, using standard tools of maximum likelihood. In addition, modelling in terms of covariates provides opportunities for understanding the ecological processes that are in operation. We show that there is even greater potential; the classical approach allows us to construct regional indices simply, which indicate how changes in occupancy typically vary over a species’ range. In addition we are also able to construct dynamic occupancy maps, which provide a novel, modern tool for examining temporal changes in species distribution. These new developments may be applied to a wide range of taxa, and are valuable at a time of climate change. 
They also have the potential to motivate citizen scientists. PMID:28328937
Association between cardiorespiratory fitness and body fat in girls
Minatto, Giseli; de Sousa, Thiago Ferreira; de Carvalho, Wellington Roberto Gomes; Ribeiro, Roberto Régis; Santos, Keila Donassolo; Petroski, Edio Luiz
2016-01-01
Abstract Objective: To estimate the prevalence of low cardiorespiratory fitness and its association with excess body fat, considering the sexual maturation and economic level in female adolescents. Methods: Cross-sectional, epidemiological study of 1223 adolescents (10-17 years) from the public school system of Cascavel, PR, Brazil, in 2006. We analyzed the self-assessed sexual maturation level (prepubertal, pubertal and post-pubertal), the economic level (high and low) through a questionnaire and body fat (normal and high) through triceps and subscapular skinfolds. The 20-meter back-and-forth test was applied to estimate maximum oxygen consumption. Cardiorespiratory fitness was assessed according to reference criteria and considered low when the minimum health criterion for age and sex was not met. Chi-square test and logistic regression were applied, with a significance level of 5%. Results: The prevalence of low cardiorespiratory fitness was 51.3%, being associated with all study variables (p<0.001). At the crude analysis, adolescents with high body fat were associated with low cardiorespiratory fitness, when compared to those with normal body fat (OR=2.76; 95%CI: 2.17-3.52). After adjustment by sexual maturation, this association remained valid and showed an effect that was 1.8-fold higher (95%CI: 1.39-2.46) and after adjusting by economic level, the effect was 1.9-fold higher (95%CI: 1.45-2.61). Conclusions: Approximately half of the assessed girls showed unsatisfactory levels of cardiorespiratory fitness for health, which was associated with high body fat, regardless of sexual maturation level and economic level. Effective public health measures are needed, with particular attention to high-risk groups. PMID:27131896
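The crude odds ratio and its 95% confidence interval reported above are the kind of quantity that can be computed directly from a 2x2 table. The following is a minimal sketch, assuming a Wald interval on the log-odds scale; the function name and the cell counts are hypothetical and are not taken from the study, whose adjusted estimates came from logistic regression.

```python
import math


def odds_ratio_ci(a, b, c, d, z=1.96):
    """Crude odds ratio with a Wald confidence interval.

    a: exposed, outcome present     b: exposed, outcome absent
    c: unexposed, outcome present   d: unexposed, outcome absent
    """
    odds_ratio = (a * d) / (b * c)
    # Standard error of log(OR) for a 2x2 table.
    se_log_or = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(odds_ratio) - z * se_log_or)
    upper = math.exp(math.log(odds_ratio) + z * se_log_or)
    return odds_ratio, lower, upper


# Hypothetical counts, chosen only for illustration (not the study data).
or_hat, ci_low, ci_high = odds_ratio_ci(60, 40, 30, 70)
print(f"OR={or_hat:.2f}, 95% CI: {ci_low:.2f}-{ci_high:.2f}")
```

An interval that excludes 1, as in the abstract, corresponds to a statistically significant association at the 5% level under this approximation.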
Association between cardiorespiratory fitness and body fat in girls.
Minatto, Giseli; Sousa, Thiago Ferreira de; Carvalho, Wellington Roberto Gomes de; Ribeiro, Roberto Régis; Santos, Keila Donassolo; Petroski, Edio Luiz
2016-12-01
To estimate the prevalence of low cardiorespiratory fitness and its association with excess body fat, considering the sexual maturation and economic level in female adolescents. Cross-sectional, epidemiological study of 1,223 adolescents (10-17 years) from the public school system of Cascavel, PR, Brazil, in 2006. We analyzed the self-assessed sexual maturation level (prepubertal, pubertal and post-pubertal), the Economic Level (EL) (high and low) through a questionnaire and body fat (normal and high) through triceps and subscapular skinfolds. The 20-meter back-and-forth test was applied to estimate maximum oxygen consumption. Cardiorespiratory fitness was assessed according to reference criteria and considered low when the minimum health criterion for age and sex was not met. Chi-square test and logistic regression were applied, with a significance level of 5%. The prevalence of low cardiorespiratory fitness was 51.3%, being associated with all study variables (p<0.001). At the crude analysis, adolescents with high body fat were associated with low cardiorespiratory fitness, when compared to those with normal body fat (OR=2.76; 95%CI: 2.17-3.52). After adjustment by sexual maturation, this association remained valid and showed an effect that was 1.8-fold higher (95%CI: 1.39-2.46) and after adjusting by EL, the effect was 1.9-fold higher (95%CI: 1.45-2.61). Approximately half of the assessed girls showed unsatisfactory levels of cardiorespiratory fitness for health, which was associated with high body fat, regardless of sexual maturation level and EL. Effective public health measures are needed, with particular attention to high-risk groups. Copyright © 2016 Sociedade de Pediatria de São Paulo. Publicado por Elsevier Editora Ltda. All rights reserved.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Daly, Don S.; Anderson, Kevin K.; White, Amanda M.
Background: A microarray of enzyme-linked immunosorbent assays, or ELISA microarray, predicts simultaneously the concentrations of numerous proteins in a small sample. These predictions, however, are uncertain due to processing error and biological variability. Making sound biological inferences as well as improving the ELISA microarray process require both concentration predictions and creditable estimates of their errors. Methods: We present a statistical method based on monotonic spline statistical models, penalized constrained least squares fitting (PCLS) and Monte Carlo simulation (MC) to predict concentrations and estimate prediction errors in ELISA microarray. PCLS restrains the flexible spline to a fit of assay intensity that is a monotone function of protein concentration. With MC, both modeling and measurement errors are combined to estimate prediction error. The spline/PCLS/MC method is compared to a common method using simulated and real ELISA microarray data sets. Results: In contrast to the rigid logistic model, the flexible spline model gave credible fits in almost all test cases including troublesome cases with left and/or right censoring, or other asymmetries. For the real data sets, 61% of the spline predictions were more accurate than their comparable logistic predictions; especially the spline predictions at the extremes of the prediction curve. The relative errors of 50% of comparable spline and logistic predictions differed by less than 20%. Monte Carlo simulation rendered acceptable asymmetric prediction intervals for both spline and logistic models while propagation of error produced symmetric intervals that diverged unrealistically as the standard curves approached horizontal asymptotes. Conclusions: The spline/PCLS/MC method is a flexible, robust alternative to a logistic/NLS/propagation-of-error method to reliably predict protein concentrations and estimate their errors.
The spline method simplifies model selection and fitting, and reliably estimates believable prediction errors. For the 50% of the real data sets fit well by both methods, spline and logistic predictions are practically indistinguishable, varying in accuracy by less than 15%. The spline method may be useful when automated prediction across simultaneous assays of numerous proteins must be applied routinely with minimal user intervention.
Gonçalves, Reginaldo; Szmuchrowski, Leszek Antony; Damasceno, Vinícius Oliveira; de Medeiros, Marcelo Lemos; Couto, Bruno Pena; Lamounier, Joel Alves
2014-09-01
To identify the association of both body mass index and aerobic fitness with cardiovascular disease risk factors in children. Cross-sectional study, carried out in Itaúna-MG, in 2010, with 290 randomly selected school children of both sexes, ranging from 6 to 10 years old. Children from schools located in the countryside and those with medical restrictions for physical activity were not included. Blood samples were collected after a 12-hour fasting period. Blood pressure, stature and weight were evaluated in accordance with international standards. The following were considered as cardiovascular risk factors: high blood pressure, high total cholesterol, LDL, triglycerides and insulin levels, and low HDL. The statistical analysis included Spearman's coefficient and logistic regression, with cardiovascular risk factors as dependent variables. Significant correlations were found, in both sexes, among body mass index and aerobic fitness with most of the cardiovascular risk factors. Children of both sexes with body mass index in the fourth quartile demonstrated increased chances of having high blood insulin and clustering cardiovascular risk factors. Moreover, girls with aerobic fitness in the first quartile also demonstrated increased chances of having high blood insulin and clustering cardiovascular risk factors. The significant associations and the increased chances of having cardiovascular risk factors in children with less aerobic fitness and higher levels of body mass index justify the use of these variables for health monitoring in Pediatrics. Copyright © 2014 Sociedade de Pediatria de São Paulo. Publicado por Elsevier Editora Ltda. All rights reserved.
Prins, R G; Beenackers, M A; Boog, M C; Van Lenthe, F J; Brug, J; Oenema, A
2014-03-01
This study aimed to explore whether individual cognitions and neighbourhood social capital strengthen each other in their relation with engaging in sports at least three times per week. Cross-sectional analyses on data from the last wave of the YouRAction trial (2009-2010, Rotterdam, the Netherlands; baseline response: 98%) were conducted. In total 1129 had data on the last wave questionnaire (93%) and 832 of them had complete data on a self-administered questionnaire on frequency of sports participation, perceived neighbourhood social capital, cognitions (attitude, subjective norm, perceived behavioural control and intention toward sport participation) and demographics. Ecometric methods were used to aggregate perceived neighbourhood social capital to the neighbourhood level. Multilevel logistic regression analyses (neighbourhood and individual as levels) were conducted to examine associations of cognitions, neighbourhood social capital and the social capital by individual cognition interaction with fit norm compliance. If the interaction was significant, simple slopes analyses were conducted to decompose interaction effects. It was found that neighbourhood social capital was significantly associated with fit norm compliance (OR: 5.40; 95% CI: 1.13-25.74). Moreover, neighbourhood social capital moderated the association of attitude, perceived behavioural control and intention with fit norm compliance. The simple slope analyses visualized that the associations of cognitions with fit norm compliance were stronger in case of more neighbourhood social capital. Hence, higher levels of neighbourhood social capital strengthen the associations of attitude, perceived behavioural control and intention in their association with fit norm compliance. Copyright © 2014 Elsevier Ltd. All rights reserved.
ERIC Educational Resources Information Center
Fidalgo, Angel M.; Alavi, Seyed Mohammad; Amirian, Seyed Mohammad Reza
2014-01-01
This study examines three controversial aspects in differential item functioning (DIF) detection by logistic regression (LR) models: first, the relative effectiveness of different analytical strategies for detecting DIF; second, the suitability of the Wald statistic for determining the statistical significance of the parameters of interest; and…
ERIC Educational Resources Information Center
French, Brian F.; Maller, Susan J.
2007-01-01
Two unresolved implementation issues with logistic regression (LR) for differential item functioning (DIF) detection include ability purification and effect size use. Purification is suggested to control inaccuracies in DIF detection as a result of DIF items in the ability estimate. Additionally, effect size use may be beneficial in controlling…
A Note on Three Statistical Tests in the Logistic Regression DIF Procedure
ERIC Educational Resources Information Center
Paek, Insu
2012-01-01
Although logistic regression became one of the well-known methods in detecting differential item functioning (DIF), its three statistical tests, the Wald, likelihood ratio (LR), and score tests, which are readily available under the maximum likelihood, do not seem to be consistently distinguished in DIF literature. This paper provides a clarifying…
ERIC Educational Resources Information Center
West, Lindsey M.; Davis, Telsie A.; Thompson, Martie P.; Kaslow, Nadine J.
2011-01-01
Protective factors for fostering reasons for living were examined among low-income, suicidal, African American women. Bivariate logistic regressions revealed that higher levels of optimism, spiritual well-being, and family social support predicted reasons for living. Multivariate logistic regressions indicated that spiritual well-being showed…
Comparison of Two Approaches for Handling Missing Covariates in Logistic Regression
ERIC Educational Resources Information Center
Peng, Chao-Ying Joanne; Zhu, Jin
2008-01-01
For the past 25 years, methodological advances have been made in missing data treatment. Most published work has focused on missing data in dependent variables under various conditions. The present study seeks to fill the void by comparing two approaches for handling missing data in categorical covariates in logistic regression: the…
Comparison of IRT Likelihood Ratio Test and Logistic Regression DIF Detection Procedures
ERIC Educational Resources Information Center
Atar, Burcu; Kamata, Akihito
2011-01-01
The Type I error rates and the power of IRT likelihood ratio test and cumulative logit ordinal logistic regression procedures in detecting differential item functioning (DIF) for polytomously scored items were investigated in this Monte Carlo simulation study. For this purpose, 54 simulation conditions (combinations of 3 sample sizes, 2 sample…
Multiple Logistic Regression Analysis of Cigarette Use among High School Students
ERIC Educational Resources Information Center
Adwere-Boamah, Joseph
2011-01-01
A binary logistic regression analysis was performed to predict high school students' cigarette smoking behavior from selected predictors from 2009 CDC Youth Risk Behavior Surveillance Survey. The specific target student behavior of interest was frequent cigarette use. Five predictor variables included in the model were: a) race, b) frequency of…
ERIC Educational Resources Information Center
Anderson, Carolyn J.; Verkuilen, Jay; Peyton, Buddy L.
2010-01-01
Survey items with multiple response categories and multiple-choice test questions are ubiquitous in psychological and educational research. We illustrate the use of log-multiplicative association (LMA) models that are extensions of the well-known multinomial logistic regression model for multiple dependent outcome variables to reanalyze a set of…
Propensity Score Estimation with Data Mining Techniques: Alternatives to Logistic Regression
ERIC Educational Resources Information Center
Keller, Bryan S. B.; Kim, Jee-Seon; Steiner, Peter M.
2013-01-01
Propensity score analysis (PSA) is a methodological technique which may correct for selection bias in a quasi-experiment by modeling the selection process using observed covariates. Because logistic regression is well understood by researchers in a variety of fields and easy to implement in a number of popular software packages, it has…
Two-factor logistic regression in pediatric liver transplantation
NASA Astrophysics Data System (ADS)
Uzunova, Yordanka; Prodanova, Krasimira; Spasov, Lyubomir
2017-12-01
Using a two-factor logistic regression analysis an estimate is derived for the probability of absence of infections in the early postoperative period after pediatric liver transplantation. The influence of both the bilirubin level and the international normalized ratio of prothrombin time of blood coagulation at the 5th postoperative day is studied.
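A two-factor logistic model of this kind maps the two predictors to a probability through the logistic function. The sketch below only illustrates that mapping; the function name and the coefficient values are hypothetical placeholders, since the abstract does not report the fitted coefficients or the units used.

```python
import math


def prob_no_infection(bilirubin, inr, b0, b1, b2):
    """Logistic probability from two predictors.

    b0, b1, b2 are the intercept and slopes; any values passed here are
    illustrative assumptions, not estimates from the paper.
    """
    eta = b0 + b1 * bilirubin + b2 * inr  # linear predictor (log-odds)
    return 1.0 / (1.0 + math.exp(-eta))


# Hypothetical coefficients and patient values, for illustration only.
p = prob_no_infection(bilirubin=100.0, inr=1.0, b0=2.0, b1=-0.01, b2=-1.0)
print(round(p, 3))
```

With negative slopes, as assumed here, higher bilirubin or INR values lower the estimated probability of remaining infection-free; the actual signs would have to come from the fitted model.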
ERIC Educational Resources Information Center
Courtney, Jon R.; Prophet, Retta
2011-01-01
Placement instability is often associated with a number of negative outcomes for children. To gain state level contextual knowledge of factors associated with placement stability/instability, logistic regression was applied to selected variables from the New Mexico Adoption and Foster Care Administrative Reporting System dataset. Predictors…
Classifying machinery condition using oil samples and binary logistic regression
NASA Astrophysics Data System (ADS)
Phillips, J.; Cripps, E.; Lau, John W.; Hodkiewicz, M. R.
2015-08-01
The era of big data has resulted in an explosion of condition monitoring information. The result is an increasing motivation to automate the costly and time consuming human elements involved in the classification of machine health. When working with industry it is important to build an understanding and hence some trust in the classification scheme for those who use the analysis to initiate maintenance tasks. Typically, "black box" approaches such as artificial neural networks (ANN) and support vector machines (SVM) are difficult to interpret. In contrast, this paper argues that logistic regression offers easy interpretability to industry experts, providing insight into the drivers of the human classification process and the ramifications of potential misclassification. Of course, accuracy is of foremost importance in any automated classification scheme, so we also provide a comparative study based on predictive performance of logistic regression, ANN and SVM. A real world oil analysis data set from engines on mining trucks is presented and using cross-validation we demonstrate that logistic regression out-performs the ANN and SVM approaches in terms of prediction for healthy/not healthy engines.
Length bias correction in gene ontology enrichment analysis using logistic regression.
Mi, Gu; Di, Yanming; Emerson, Sarah; Cumbie, Jason S; Chang, Jeff H
2012-01-01
When assessing differential gene expression from RNA sequencing data, commonly used statistical tests tend to have greater power to detect differential expression of genes encoding longer transcripts. This phenomenon, called "length bias", will influence subsequent analyses such as Gene Ontology enrichment analysis. In the presence of length bias, Gene Ontology categories that include longer genes are more likely to be identified as enriched. These categories, however, are not necessarily biologically more relevant. We show that one can effectively adjust for length bias in Gene Ontology analysis by including transcript length as a covariate in a logistic regression model. The logistic regression model makes the statistical issue underlying length bias more transparent: transcript length becomes a confounding factor when it correlates with both the Gene Ontology membership and the significance of the differential expression test. The inclusion of the transcript length as a covariate allows one to investigate the direct correlation between the Gene Ontology membership and the significance of testing differential expression, conditional on the transcript length. We present both real and simulated data examples to show that the logistic regression approach is simple, effective, and flexible.
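One way to write such a length-adjusted model is sketched below. The abstract does not state which variable plays the role of the response, so the choice of Gene Ontology membership as the response, and all symbols, are assumptions made for illustration.

```latex
% Assumed notation (not from the abstract):
%   G_i : indicator that gene i belongs to the GO category
%   s_i : a measure of significance of the differential expression test for gene i
%   L_i : transcript length of gene i (possibly log-transformed)
\[
  \operatorname{logit}\,\Pr(G_i = 1 \mid s_i, L_i)
    = \beta_0 + \beta_1 s_i + \beta_2 L_i ,
  \qquad
  \operatorname{logit}(p) = \log\frac{p}{1 - p}.
\]
% Enrichment conditional on transcript length then corresponds to testing
% H_0 : \beta_1 = 0 with L_i held in the model.
```

Under this reading, dropping the term in L_i gives the unadjusted analysis, and a change in the estimate of β1 between the two fits indicates how much of the apparent enrichment is attributable to length.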
Hansson, Lisbeth; Khamis, Harry J
2008-12-01
Simulated data sets are used to evaluate conditional and unconditional maximum likelihood estimation in an individual case-control design with continuous covariates when there are different rates of excluded cases and different levels of other design parameters. The effectiveness of the estimation procedures is measured by method bias, variance of the estimators, root mean square error (RMSE) for logistic regression and the percentage of explained variation. Conditional estimation leads to higher RMSE than unconditional estimation in the presence of missing observations, especially for 1:1 matching. The RMSE is higher for the smaller stratum size, especially for the 1:1 matching. The percentage of explained variation appears to be insensitive to missing data, but is generally higher for the conditional estimation than for the unconditional estimation. It is particularly good for the 1:2 matching design. For minimizing RMSE, a high matching ratio is recommended; in this case, conditional and unconditional logistic regression models yield comparable levels of effectiveness. For maximizing the percentage of explained variation, the 1:2 matching design with the conditional logistic regression model is recommended.
Lee, Seokho; Shin, Hyejin; Lee, Sang Han
2016-12-01
Alzheimer's disease (AD) is usually diagnosed by clinicians through cognitive and functional performance tests, with a potential risk of misdiagnosis. Since the progression of AD is known to cause structural changes in the corpus callosum (CC), the CC thickness can be used as a functional covariate in the AD classification problem for a diagnosis. However, misclassified class labels negatively impact the classification performance. Motivated by AD-CC association studies, we propose a logistic regression for functional data classification that is robust to misdiagnosis or label noise. Specifically, our logistic regression model is constructed by adopting individual intercepts in the functional logistic regression model. This approach makes it possible to indicate which observations are possibly mislabeled and also leads to a robust and efficient classifier. An effective algorithm based on the MM algorithm provides simple closed-form update formulas. We test our method using synthetic datasets to demonstrate its superiority over an existing method, and apply it to differentiating patients with AD from healthy normals based on CC from MRI. © 2016, The International Biometric Society.
Logistic regression for circular data
NASA Astrophysics Data System (ADS)
Al-Daffaie, Kadhem; Khan, Shahjahan
2017-05-01
This paper considers the relationship between a binary response and a circular predictor. It develops the logistic regression model by employing the linear-circular regression approach. The maximum likelihood method is used to estimate the parameters. The Newton-Raphson numerical method is used to find the estimated values of the parameters. A data set from weather records of Toowoomba city is analysed by the proposed methods. Moreover, a simulation study is considered. The R software is used for all computations and simulations.
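The estimation procedure described above can be sketched as follows, assuming the linear-circular form in which the angle enters the linear predictor through its cosine and sine. The paper's computations were done in R; this Python version, its function names, and the simulated data with their true coefficients are illustrative assumptions, not the authors' code or the Toowoomba data.

```python
import math
import random


def sigmoid(t):
    # Numerically stable logistic function.
    if t >= 0:
        return 1.0 / (1.0 + math.exp(-t))
    e = math.exp(t)
    return e / (1.0 + e)


def design(theta):
    # Assumed linear-circular form: intercept, cos(theta), sin(theta).
    return [1.0, math.cos(theta), math.sin(theta)]


def linear_predictor(beta, x):
    return sum(b * v for b, v in zip(beta, x))


def solve(A, b):
    # Gaussian elimination with partial pivoting for a small dense system.
    n = len(b)
    M = [A[i][:] + [b[i]] for i in range(n)]
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[piv] = M[piv], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for k in range(col, n + 1):
                M[r][k] -= f * M[col][k]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = M[i][n] - sum(M[i][k] * x[k] for k in range(i + 1, n))
        x[i] = s / M[i][i]
    return x


def score(beta, thetas, ys):
    # Gradient of the Bernoulli log-likelihood.
    g = [0.0, 0.0, 0.0]
    for th, y in zip(thetas, ys):
        x = design(th)
        p = sigmoid(linear_predictor(beta, x))
        for j in range(3):
            g[j] += (y - p) * x[j]
    return g


def information(beta, thetas):
    # Negative Hessian of the log-likelihood: sum of p(1-p) x x^T.
    H = [[0.0] * 3 for _ in range(3)]
    for th in thetas:
        x = design(th)
        p = sigmoid(linear_predictor(beta, x))
        w = p * (1.0 - p)
        for j in range(3):
            for k in range(3):
                H[j][k] += w * x[j] * x[k]
    return H


def fit_circular_logistic(thetas, ys, max_iter=50, tol=1e-10):
    # Newton-Raphson: beta <- beta + I(beta)^{-1} U(beta).
    beta = [0.0, 0.0, 0.0]
    for _ in range(max_iter):
        step = solve(information(beta, thetas), score(beta, thetas, ys))
        beta = [b + s for b, s in zip(beta, step)]
        if max(abs(s) for s in step) < tol:
            break
    return beta


# Simulated data with hypothetical true coefficients (not from the paper).
random.seed(0)
true_beta = [-0.5, 1.0, 0.5]
thetas = [random.uniform(0.0, 2.0 * math.pi) for _ in range(2000)]
ys = [1 if random.random() < sigmoid(linear_predictor(true_beta, design(t))) else 0
      for t in thetas]
beta_hat = fit_circular_logistic(thetas, ys)
print(beta_hat)
```

Using cosine and sine terms keeps the fitted probability periodic in the angle, so that 0 and 2π give the same prediction; a plain linear term in the angle would not have this property.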
Naval Research Logistics Quarterly. Volume 28. Number 3,
1981-09-01
denotes component-wise maximum. f has antitone (isotone) differences on C x D if for c1 < c2 and d1 < d2, ...or negative correlations and linear or nonlinear regressions. Given are the moments to order two and, for special cases, the regression function and...data sets. We designate this bnb distribution as G - B - N(a, 0, v). The distribution admits only of positive correlation and linear regressions
T2 relaxation time is related to liver fibrosis severity
Siqueira, Luiz; Uppal, Ritika; Alford, Jamu; Fuchs, Bryan C.; Yamada, Suguru; Tanabe, Kenneth; Chung, Raymond T.; Lauwers, Gregory; Chew, Michael L.; Boland, Giles W.; Sahani, Duhyant V.; Vangel, Mark; Hahn, Peter F.; Caravan, Peter
2016-01-01
Background: The grading of liver fibrosis relies on liver biopsy. Imaging techniques, including elastography and relaxometric techniques, have had varying success in diagnosing moderate fibrosis. The goal of this study was to determine if there is a relationship between the T2-relaxation time of hepatic parenchyma and the histologic grade of liver fibrosis in patients with hepatitis C undergoing both routine liver MRI and liver biopsy, and to validate our methodology with phantoms and in a rat model of liver fibrosis. Methods: This study is composed of three parts: (I) 123 patients who underwent both routine clinical liver MRI and biopsy within a 6-month period between July 1999 and January 2010 were enrolled in a retrospective study. MR imaging was performed at 1.5 T using a dual-echo turbo-spin echo equivalent pulse sequence. T2 relaxation time of liver parenchyma in patients was calculated by mono-exponential fit of a region of interest (ROI) within the right lobe correlating to histopathologic grading (Ishak 0–6) and routine serum liver inflammation markers [aspartate aminotransferase (AST) and alanine aminotransferase (ALT)]. Statistical comparison was performed using ordinary logistic and ordinal logistic regression and ANOVA comparing T2 to Ishak fibrosis without and using AST and ALT as covariates; (II) a phantom was prepared using serial dilutions of dextran-coated magnetic iron oxide nanoparticles. T2-weighted imaging was performed by comparing a dual echo fast spin echo sequence to a Carr-Purcell-Meiboom-Gill (CPMG) multi-echo sequence at 1.5 T. Statistical comparison was performed using a paired t-test; (III) male Wistar rats receiving weekly intraperitoneal injections of phosphate buffer solution (PBS) control (n=4 rats); diethylnitrosamine (DEN) for either 5 (n=5 rats) or 8 weeks (n=4 rats) were MR imaged on a Bruker Pharmascan 4.7 T magnet with a home-built bird-cage coil.
T2 was quantified by using a mono-exponential fitting algorithm on multi-slice multi echo T2-weighted data. Statistical comparison was performed using ANOVA. Results: (I) Histopathologic evaluation of both rat and human livers demonstrated no evidence of steatosis or hemochromatosis. There was a monotonic increase in mean T2 value with increasing degree of fibrosis (control 65.4±2.9 ms, n=6 patients); mild (Ishak 1–2) 66.7±1.9 ms (n=30); moderate (Ishak 3–4) 71.6±1.7 ms (n=26); severe (Ishak 5–6) 72.4±1.4 ms (n=61); with relatively low standard error (~2.9 ms). There was a statistically significant difference between degrees of mild (Ishak <4) vs. moderate to severe fibrosis (Ishak >4) (P=0.03) based on logistic regression of T2 and Ishak, which became insignificant (P=0.07) when using inflammatory markers as covariates. Expanding on this model using ordinal logistic regression, there was significance amongst all 4 groups comparing T2 to Ishak (P=0.01), with significance using inflammation as a covariate (P=0.03) and approaching statistical significance amongst all groups by ANOVA (P=0.07); (II) there was a monotonic increase in T2 and statistical significance (ANOVA P<0.0001) between each rat subgroup [phosphate buffer solution (PBS) 25.2±0.8, DEN 5-week (31.1±1.5), and DEN 9-week (49.4±0.4) ms]; (III) the phantoms that had T2 values within the relevant range for the human liver (e.g., 20–100 ms), demonstrated no statistical difference between two point fits on turbo spin echo (TSE) data and multi-echo CPMG data (P=0.9). Conclusions: The finding of increased T2 with liver fibrosis may relate to inflammation that may be an alternative or adjunct to other noninvasive MR imaging based approaches for assessing liver fibrosis. PMID:27190762
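The mono-exponential model behind these T2 estimates is S(TE) = S0·exp(−TE/T2). The sketch below shows a two-point estimate, as would apply to dual-echo data, and a log-linear least-squares estimate for multi-echo data. The function names, echo times and signal values are hypothetical; the abstract does not give the acquisition parameters or the exact fitting routine used.

```python
import math


def t2_two_point(te1, s1, te2, s2):
    """T2 from two echoes under S(TE) = S0 * exp(-TE / T2)."""
    return (te2 - te1) / math.log(s1 / s2)


def t2_log_linear(echo_times, signals):
    """T2 from several echoes by least squares on ln S = ln S0 - TE / T2."""
    n = len(echo_times)
    logs = [math.log(s) for s in signals]
    mean_te = sum(echo_times) / n
    mean_log = sum(logs) / n
    sxx = sum((t - mean_te) ** 2 for t in echo_times)
    sxy = sum((t - mean_te) * (y - mean_log) for t, y in zip(echo_times, logs))
    slope = sxy / sxx  # equals -1 / T2 for noise-free data
    return -1.0 / slope


# Noise-free synthetic signals with a hypothetical T2 of 70 ms and S0 of 1000.
true_t2, s0 = 70.0, 1000.0
tes = [10.0, 20.0, 30.0, 40.0, 50.0, 60.0]  # hypothetical echo times in ms
sig = [s0 * math.exp(-te / true_t2) for te in tes]

t2_dual = t2_two_point(tes[0], sig[0], tes[-1], sig[-1])
t2_multi = t2_log_linear(tes, sig)
print(t2_dual, t2_multi)
```

On noise-free data both estimators recover the same value; with noisy signals the two-point estimate depends entirely on two measurements, which is one reason the phantom comparison against a multi-echo CPMG sequence is informative.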
Asghari, Mehdi Poursheikhali; Hayatshahi, Sayyed Hamed Sadat; Abdolmaleki, Parviz
2012-01-01
From both the structural and functional points of view, β-turns play important biological roles in proteins. In the present study, a novel two-stage hybrid procedure has been developed to identify β-turns in proteins. Binary logistic regression was initially used for the first time to select significant sequence parameters in identification of β-turns due to a re-substitution test procedure. Sequence parameters consisted of 80 amino acid positional occurrences and 20 amino acid percentages in sequence. Among these parameters, the most significant ones selected by the binary logistic regression model were percentages of Gly, Ser and the occurrence of Asn in position i+2, respectively, in sequence. These significant parameters have the highest effect on the constitution of a β-turn sequence. A neural network model was then constructed and fed by the parameters selected by binary logistic regression to build a hybrid predictor. The networks have been trained and tested on a non-homologous dataset of 565 protein chains. Applying a nine-fold cross-validation test on the dataset, the network reached an overall accuracy (Qtotal) of 74, which is comparable with results of the other β-turn prediction methods. In conclusion, this study proves that the parameter selection ability of binary logistic regression together with the prediction capability of neural networks lead to the development of more precise models for identifying β-turns in proteins. PMID:27418910
Crane, Paul K; Gibbons, Laura E; Jolley, Lance; van Belle, Gerald
2006-11-01
We present an ordinal logistic regression model for identification of items with differential item functioning (DIF) and apply this model to a Mini-Mental State Examination (MMSE) dataset. We employ item response theory ability estimation in our models. Three nested ordinal logistic regression models are applied to each item. Model testing begins with examination of the statistical significance of the interaction term between ability and the group indicator, consistent with nonuniform DIF. Then we turn our attention to the coefficient of the ability term in models with and without the group term. If including the group term has a marked effect on that coefficient, we declare that it has uniform DIF. We examined DIF related to language of test administration in addition to self-reported race, Hispanic ethnicity, age, years of education, and sex. We used PARSCALE for IRT analyses and STATA for ordinal logistic regression approaches. We used an iterative technique for adjusting IRT ability estimates on the basis of DIF findings. Five items were found to have DIF related to language. These same items also had DIF related to other covariates. The ordinal logistic regression approach to DIF detection, when combined with IRT ability estimates, provides a reasonable alternative for DIF detection. There appear to be several items with significant DIF related to language of test administration in the MMSE. More attention needs to be paid to the specific criteria used to determine whether an item has DIF, not just the technique used to identify DIF.
Use of generalized ordered logistic regression for the analysis of multidrug resistance data.
Agga, Getahun E; Scott, H Morgan
2015-10-01
Statistical analysis of antimicrobial resistance data largely focuses on individual antimicrobial's binary outcome (susceptible or resistant). However, bacteria are becoming increasingly multidrug resistant (MDR). Statistical analysis of MDR data is mostly descriptive often with tabular or graphical presentations. Here we report the applicability of generalized ordinal logistic regression model for the analysis of MDR data. A total of 1,152 Escherichia coli, isolated from the feces of weaned pigs experimentally supplemented with chlortetracycline (CTC) and copper, were tested for susceptibilities against 15 antimicrobials and were binary classified into resistant or susceptible. The 15 antimicrobial agents tested were grouped into eight different antimicrobial classes. We defined MDR as the number of antimicrobial classes to which E. coli isolates were resistant ranging from 0 to 8. Proportionality of the odds assumption of the ordinal logistic regression model was violated only for the effect of treatment period (pre-treatment, during-treatment and post-treatment); but not for the effect of CTC or copper supplementation. Subsequently, a partially constrained generalized ordinal logistic model was built that allows for the effect of treatment period to vary while constraining the effects of treatment (CTC and copper supplementation) to be constant across the levels of MDR classes. Copper (Proportional Odds Ratio [Prop OR]=1.03; 95% CI=0.73-1.47) and CTC (Prop OR=1.1; 95% CI=0.78-1.56) supplementation were not significantly associated with the level of MDR adjusted for the effect of treatment period. MDR generally declined over the trial period. In conclusion, generalized ordered logistic regression can be used for the analysis of ordinal data such as MDR data when the proportionality assumptions for ordered logistic regression are violated. Published by Elsevier B.V.
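As a hedged illustration of the proportional-odds assumption discussed in the abstract above, the following minimal Python sketch computes category probabilities under a cumulative-logit model of the form logit P(Y <= j | x) = theta_j - beta*x, where one slope beta is shared across all cut-points. The thresholds, the slope, and the function names are hypothetical and are not taken from the study; a generalized (partially constrained) model would instead allow selected slopes to differ across cut-points.

```python
import math


def _sigmoid(z):
    """Logistic function."""
    return 1.0 / (1.0 + math.exp(-z))


def category_probs(thresholds, beta, x):
    """P(Y = j | x) for j = 0..len(thresholds) under proportional odds.

    thresholds must be increasing; logit P(Y <= j | x) = thresholds[j] - beta * x.
    """
    cumulative = [_sigmoid(t - beta * x) for t in thresholds] + [1.0]
    probs, previous = [], 0.0
    for c in cumulative:
        probs.append(c - previous)
        previous = c
    return probs


def cumulative_odds_ratios(thresholds, beta, x0, x1):
    """Odds ratio of P(Y > j) for x1 versus x0 at every cut-point j."""
    ratios = []
    for t in thresholds:
        p0 = 1.0 - _sigmoid(t - beta * x0)
        p1 = 1.0 - _sigmoid(t - beta * x1)
        ratios.append((p1 / (1.0 - p1)) / (p0 / (1.0 - p0)))
    return ratios


# Hypothetical values, for illustration only.
thresholds = [-1.0, 0.5, 2.0]
beta = 0.3
probs = category_probs(thresholds, beta, x=1.0)
ratios = cumulative_odds_ratios(thresholds, beta, x0=0.0, x1=1.0)
```

Under this parameterization every entry of ratios equals exp(beta), which is what "proportional odds" means: the covariate shifts all cumulative odds by the same factor. When that does not hold for a covariate, its slope has to be allowed to vary by cut-point.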
Fitting Item Response Theory Models to Two Personality Inventories: Issues and Insights.
Chernyshenko, O S; Stark, S; Chan, K Y; Drasgow, F; Williams, B
2001-10-01
The present study compared the fit of several IRT models to two personality assessment instruments. Data from 13,059 individuals responding to the US-English version of the Fifth Edition of the Sixteen Personality Factor Questionnaire (16PF) and 1,770 individuals responding to Goldberg's 50 item Big Five Personality measure were analyzed. Various issues pertaining to the fit of the IRT models to personality data were considered. We examined two of the most popular parametric models designed for dichotomously scored items (i.e., the two- and three-parameter logistic models) and a parametric model for polytomous items (Samejima's graded response model). Also examined were Levine's nonparametric maximum likelihood formula scoring models for dichotomous and polytomous data, which were previously found to provide good fits to several cognitive ability tests (Drasgow, Levine, Tsien, Williams, & Mead, 1995). The two- and three-parameter logistic models fit some scales reasonably well but not others; the graded response model generally did not fit well. The nonparametric formula scoring models provided the best fit of the models considered. Several implications of these findings for personality measurement and personnel selection were described.
Fei, Y; Hu, J; Li, W-Q; Wang, W; Zong, G-Q
2017-03-01
Essentials Predicting the occurrence of portosplenomesenteric vein thrombosis (PSMVT) is difficult. We studied 72 patients with acute pancreatitis. Artificial neural networks modeling was more accurate than logistic regression in predicting PSMVT. Additional predictive factors may be incorporated into artificial neural networks. Objective To construct and validate artificial neural networks (ANNs) for predicting the occurrence of portosplenomesenteric venous thrombosis (PSMVT) and compare the predictive ability of the ANNs with that of logistic regression. Methods The ANNs and logistic regression modeling were constructed using simple clinical and laboratory data of 72 acute pancreatitis (AP) patients. The ANNs and logistic modeling were first trained on 48 randomly chosen patients and validated on the remaining 24 patients. The accuracy and the performance characteristics were compared between these two approaches by SPSS17.0 software. Results The training set and validation set did not differ on any of the 11 variables. After training, the back propagation network training error converged to 1 × 10^-20, and it retained excellent pattern recognition ability. When the ANNs model was applied to the validation set, it revealed a sensitivity of 80%, specificity of 85.7%, a positive predictive value of 77.6% and negative predictive value of 90.7%. The accuracy was 83.3%. Differences could be found between ANNs modeling and logistic regression modeling in these parameters (10.0% [95% CI, -14.3 to 34.3%], 14.3% [95% CI, -8.6 to 37.2%], 15.7% [95% CI, -9.9 to 41.3%], 11.8% [95% CI, -8.2 to 31.8%], 22.6% [95% CI, -1.9 to 47.1%], respectively). When ANNs modeling was used to identify PSMVT, the area under receiver operating characteristic curve was 0.849 (95% CI, 0.807-0.901), which demonstrated better overall properties than logistic regression modeling (AUC = 0.716) (95% CI, 0.679-0.761).
Conclusions ANNs modeling was a more accurate tool than logistic regression in predicting the occurrence of PSMVT following AP. More clinical factors or biomarkers may be incorporated into ANNs modeling to improve its predictive ability. © 2016 International Society on Thrombosis and Haemostasis.
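The performance measures named in the abstract above (sensitivity, specificity, positive and negative predictive value, accuracy) all derive from a 2×2 confusion matrix. The sketch below shows the standard definitions in Python; the counts are hypothetical, chosen only for illustration, and are not the study's data.

```python
def classification_metrics(tp, fp, fn, tn):
    """Standard diagnostic metrics from confusion-matrix counts."""
    total = tp + fp + fn + tn
    return {
        "sensitivity": tp / (tp + fn),   # true positives among all actual positives
        "specificity": tn / (tn + fp),   # true negatives among all actual negatives
        "ppv": tp / (tp + fp),           # positive predictive value
        "npv": tn / (tn + fn),           # negative predictive value
        "accuracy": (tp + tn) / total,
    }


# Hypothetical counts for a 24-case validation set (not taken from the paper).
metrics = classification_metrics(tp=8, fp=2, fn=2, tn=12)
```

Predictive values depend on how common the outcome is in the evaluated sample, so they should not be compared across samples with different prevalence without adjustment.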
McLaren, Christine E.; Chen, Wen-Pin; Nie, Ke; Su, Min-Ying
2009-01-01
Rationale and Objectives Dynamic contrast enhanced MRI (DCE-MRI) is a clinical imaging modality for detection and diagnosis of breast lesions. Analytical methods were compared for diagnostic feature selection and performance of lesion classification to differentiate between malignant and benign lesions in patients. Materials and Methods The study included 43 malignant and 28 benign histologically-proven lesions. Eight morphological parameters, ten gray level co-occurrence matrices (GLCM) texture features, and fourteen Laws’ texture features were obtained using automated lesion segmentation and quantitative feature extraction. Artificial neural network (ANN) and logistic regression analysis were compared for selection of the best predictors of malignant lesions among the normalized features. Results Using ANN, the final four selected features were compactness, energy, homogeneity, and Law_LS, with area under the receiver operating characteristic curve (AUC) = 0.82, and accuracy = 0.76. The diagnostic performance of these 4-features computed on the basis of logistic regression yielded AUC = 0.80 (95% CI, 0.688 to 0.905), similar to that of ANN. The analysis also shows that the odds of a malignant lesion decreased by 48% (95% CI, 25% to 92%) for every increase of 1 SD in the Law_LS feature, adjusted for differences in compactness, energy, and homogeneity. Using logistic regression with z-score transformation, a model comprised of compactness, NRL entropy, and gray level sum average was selected, and it had the highest overall accuracy of 0.75 among all models, with AUC = 0.77 (95% CI, 0.660 to 0.880). When logistic modeling of transformations using the Box-Cox method was performed, the most parsimonious model with predictors, compactness and Law_LS, had an AUC of 0.79 (95% CI, 0.672 to 0.898). Conclusion The diagnostic performance of models selected by ANN and logistic regression was similar. 
The analytic methods were found to be roughly equivalent in terms of predictive ability when a small number of variables were chosen. The robust ANN methodology utilizes a sophisticated non-linear model, while logistic regression analysis provides insightful information to enhance interpretation of the model features. PMID:19409817
Depression, self-esteem, diabetes care and self-care behaviors among middle-aged and older Mexicans.
Rivera-Hernandez, Maricruz
2014-07-01
Examine the associations of depression and self-esteem on self-care activities and care received among Mexicans with diabetes. Using data from the Mexican Nutrition and Health Survey 2012, logistic regression models were fit to test the associations between each self-care activity and diabetes care, and self-esteem and depression. People with low self-esteem were less likely to follow a diet, but no other associations were found. Contrary to what was expected, there were no relationships between depression and quality of care received or self-care behaviors. Current findings support the importance of looking at mental health and emotional state among older adults with diabetes. Future studies should explore the relationship between different psychological barriers to proper diabetes management. Published by Elsevier Ireland Ltd.
Depression, self-esteem, diabetes care and self-care behaviors among middle-aged and older Mexicans☆
Rivera-Hernandez, Maricruz
2016-01-01
Aims Examine the associations of depression and self-esteem on self-care activities and care received among Mexicans with diabetes. Methods Using data from the Mexican Nutrition and Health Survey 2012, logistic regression models were fit to test the associations between each self-care activity and diabetes care, and self-esteem and depression. Results People with low self-esteem were less likely to follow a diet, but no other associations were found. Contrary to what was expected, there were no relationships between depression and quality of care received or self-care behaviors. Conclusion Current findings support the importance of looking at mental health and emotional state among older adults with diabetes. Future studies should explore the relationship between different psychological barriers to proper diabetes management. PMID:24846446
Stochastic modeling of sunshine number data
NASA Astrophysics Data System (ADS)
Brabec, Marek; Paulescu, Marius; Badescu, Viorel
2013-11-01
In this paper, we will present a unified statistical modeling framework for estimation and forecasting sunshine number (SSN) data. Sunshine number has been proposed earlier to describe sunshine time series in qualitative terms (Theor Appl Climatol 72 (2002) 127-136) and since then, it was shown to be useful not only for theoretical purposes but also for practical considerations, e.g. those related to the development of photovoltaic energy production. Statistical modeling and prediction of SSN as a binary time series has been a challenging problem, however. Our statistical model for SSN time series is based on an underlying stochastic process formulation of Markov chain type. We will show how its transition probabilities can be efficiently estimated within a logistic regression framework. In fact, our logistic Markovian model can be relatively easily fitted via a maximum likelihood approach. This is optimal in many respects and it also enables us to use formalized statistical inference theory to obtain not only the point estimates of transition probabilities and their functions of interest, but also related uncertainties, as well as to test various hypotheses of practical interest, etc. It is straightforward to deal with non-homogeneous transition probabilities in this framework. Very importantly from both physical and practical points of view, the logistic Markov model class allows us to test hypotheses about how SSN depends on various external covariates (e.g. elevation angle, solar time, etc.) and about details of the dynamic model (order and functional shape of the Markov kernel, etc.). Therefore, using the generalized additive model (GAM) approach, we can fit and compare models of various complexity which insist on keeping physical interpretation of the statistical model and its parts.
After introducing the Markovian model and general approach for identification of its parameters, we will illustrate its use and performance on high resolution SSN data from the Solar Radiation Monitoring Station of the West University of Timisoara.
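To make the idea of estimating Markov transition probabilities through logistic regression concrete, here is a minimal Python sketch. It is an assumption-laden illustration, not the authors' model: it fits a first-order, time-homogeneous chain, logit P(y_t = 1 | y_{t-1}) = b0 + b1*y_{t-1}, by Newton-Raphson maximum likelihood, with no external covariates or smooth (GAM) terms. The series and all function names are hypothetical.

```python
import math


def fit_lag1_logit(series, n_iter=25):
    """Maximum-likelihood fit of logit P(y_t=1 | y_{t-1}) = b0 + b1 * y_{t-1}."""
    prev, curr = series[:-1], series[1:]
    b0 = b1 = 0.0
    for _ in range(n_iter):
        g0 = g1 = 0.0              # gradient of the log-likelihood
        h00 = h01 = h11 = 0.0      # entries of X'WX (negative Hessian)
        for x, y in zip(prev, curr):
            p = 1.0 / (1.0 + math.exp(-(b0 + b1 * x)))
            w = p * (1.0 - p)
            g0 += y - p
            g1 += (y - p) * x
            h00 += w
            h01 += w * x
            h11 += w * x * x
        det = h00 * h11 - h01 * h01
        if det <= 1e-12:           # singular: a state never occurs as predecessor
            break
        # Newton step: beta <- beta + (X'WX)^{-1} * gradient
        b0 += (h11 * g0 - h01 * g1) / det
        b1 += (h00 * g1 - h01 * g0) / det
    return b0, b1


def transition_probs(b0, b1):
    """Convert fitted coefficients into P(0 -> 1) and P(1 -> 1)."""
    def sigmoid(z):
        return 1.0 / (1.0 + math.exp(-z))
    return {"p01": sigmoid(b0), "p11": sigmoid(b0 + b1)}


# Hypothetical binary sunshine-number series (1 = sun visible, 0 = covered).
ssn = [0, 0, 1, 1, 1, 0, 1, 1, 0, 0, 0, 1, 1, 1, 1, 0]
b0, b1 = fit_lag1_logit(ssn)
probs = transition_probs(b0, b1)
```

With only the lagged state as predictor the model is saturated, so the fitted probabilities coincide with the empirical transition frequencies. The logistic formulation becomes useful once covariates such as solar elevation are added to the linear predictor, which is the non-homogeneous case the abstract refers to.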
Stochastic modeling of sunshine number data
DOE Office of Scientific and Technical Information (OSTI.GOV)
Brabec, Marek, E-mail: mbrabec@cs.cas.cz; Paulescu, Marius; Badescu, Viorel
2013-11-13
In this paper, we will present a unified statistical modeling framework for estimation and forecasting sunshine number (SSN) data. Sunshine number has been proposed earlier to describe sunshine time series in qualitative terms (Theor Appl Climatol 72 (2002) 127-136) and since then, it was shown to be useful not only for theoretical purposes but also for practical considerations, e.g. those related to the development of photovoltaic energy production. Statistical modeling and prediction of SSN as a binary time series has been a challenging problem, however. Our statistical model for SSN time series is based on an underlying stochastic process formulation of Markov chain type. We will show how its transition probabilities can be efficiently estimated within a logistic regression framework. In fact, our logistic Markovian model can be relatively easily fitted via a maximum likelihood approach. This is optimal in many respects and it also enables us to use formalized statistical inference theory to obtain not only the point estimates of transition probabilities and their functions of interest, but also related uncertainties, as well as to test various hypotheses of practical interest, etc. It is straightforward to deal with non-homogeneous transition probabilities in this framework. Very importantly from both physical and practical points of view, the logistic Markov model class allows us to test hypotheses about how SSN depends on various external covariates (e.g. elevation angle, solar time, etc.) and about details of the dynamic model (order and functional shape of the Markov kernel, etc.). Therefore, using the generalized additive model (GAM) approach, we can fit and compare models of various complexity which insist on keeping physical interpretation of the statistical model and its parts.
After introducing the Markovian model and general approach for identification of its parameters, we will illustrate its use and performance on high resolution SSN data from the Solar Radiation Monitoring Station of the West University of Timisoara.
Ai, Zi-Sheng; Gao, You-Shui; Sun, Yuan; Liu, Yue; Zhang, Chang-Qing; Jiang, Cheng-Hua
2013-03-01
Risk factors for femoral neck fracture-induced avascular necrosis of the femoral head have not been elucidated clearly in middle-aged and elderly patients. Moreover, the high incidence of screw removal in China and its effect on the fate of the involved femoral head require statistical methods to reflect their intrinsic relationship. Ninety-nine patients older than 45 years with femoral neck fracture were treated by internal fixation between May 1999 and April 2004. Descriptive analysis, interaction analysis between associated factors, single factor logistic regression, multivariate logistic regression, and detailed interaction analysis were employed to explore potential relationships among associated factors. Avascular necrosis of the femoral head was found in 15 cases (15.2 %). Age × the status of implants (removal vs. maintenance) and gender × the timing of reduction were interactive according to two-factor interactive analysis. Age, the displacement of fractures, the quality of reduction, and the status of implants were found to be significant factors in single factor logistic regression analysis. Age, age × the status of implants, and the quality of reduction were found to be significant factors in multivariate logistic regression analysis. In fine interaction analysis after multivariate logistic regression analysis, implant removal was the most important risk factor for avascular necrosis in 56-to-85-year-old patients, with a risk ratio of 26.00 (95 % CI = 3.076-219.747). The middle-aged and elderly have less incidence of avascular necrosis of the femoral head following femoral neck fractures treated by cannulated screws. The removal of cannulated screws can induce a significantly high incidence of avascular necrosis of the femoral head in elderly patients, while a high-quality reduction is helpful to reduce avascular necrosis.
Zhou, Jinzhe; Zhou, Yanbing; Cao, Shougen; Li, Shikuan; Wang, Hao; Niu, Zhaojian; Chen, Dong; Wang, Dongsheng; Lv, Liang; Zhang, Jian; Li, Yu; Jiao, Xuelong; Tan, Xiaojie; Zhang, Jianli; Wang, Haibo; Zhang, Bingyuan; Lu, Yun; Sun, Zhenqing
2016-01-01
Reporting of surgical complications is common, but few reports provide information about severity or estimate risk factors for complications, and those that do lack specificity. We retrospectively analyzed data on 2795 gastric cancer patients who underwent surgery at the Affiliated Hospital of Qingdao University between June 2007 and June 2012, and established a multivariate logistic regression model to predict risk factors related to postoperative complications according to the Clavien-Dindo classification system. Twenty-four of 86 variables were statistically significant in univariate logistic regression analysis; the 11 significant variables that entered multivariate analysis were used to produce the risk model. Liver cirrhosis, diabetes mellitus, Child classification, invasion of neighboring organs, combined resection, intraoperative transfusion, Billroth II anastomosis of reconstruction, malnutrition, surgical volume of surgeons, operating time and age were independent risk factors for postoperative complications after gastrectomy. Based on the logistic regression equation, p = exp(∑BiXi) / (1 + exp(∑BiXi)), a multivariate logistic regression predictive model that calculates the risk of postoperative morbidity was developed: p = 1/(1 + e^(4.810 - 1.287X1 - 0.504X2 - 0.500X3 - 0.474X4 - 0.405X5 - 0.318X6 - 0.316X7 - 0.305X8 - 0.278X9 - 0.255X10 - 0.138X11)). The accuracy, sensitivity and specificity of the model to predict the postoperative complications were 86.7%, 76.2% and 88.6%, respectively. This risk model, based on the Clavien-Dindo system for grading the severity of complications and on logistic regression analysis, can predict severe morbidity specific to an individual patient's risk factors, estimate patients' risks and benefits of gastric surgery as an accurate decision-making tool, and may serve as a template for the development of risk models for other surgical groups.
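The logistic equation quoted in the abstract above can be evaluated directly. The following Python sketch does so using the coefficients exactly as printed; everything else is an assumption. In particular, the abstract does not say which risk factor corresponds to which of X1..X11 or how each predictor is coded, so the input vectors below are purely hypothetical and the function should not be read as a validated clinical calculator.

```python
import math

# Constant and coefficients as printed in the abstract's equation:
# p = 1 / (1 + e^(4.810 - 1.287*X1 - ... - 0.138*X11))
CONSTANT = 4.810
COEFFICIENTS = [1.287, 0.504, 0.500, 0.474, 0.405, 0.318,
                0.316, 0.305, 0.278, 0.255, 0.138]


def predicted_risk(x):
    """Evaluate the printed equation for a predictor vector x = [X1, ..., X11].

    The meaning and coding of each X_i are NOT given in the abstract;
    callers must supply values under their own (labeled) assumptions.
    """
    if len(x) != len(COEFFICIENTS):
        raise ValueError("expected 11 predictor values")
    exponent = CONSTANT - sum(b * xi for b, xi in zip(COEFFICIENTS, x))
    return 1.0 / (1.0 + math.exp(exponent))


# Hypothetical inputs: all predictors zero, then only X1 set to 1.
baseline = predicted_risk([0] * 11)
with_x1 = predicted_risk([1] + [0] * 10)
```

Because every coefficient enters the exponent with a negative sign, increasing any predictor raises the predicted probability, which is consistent with all eleven variables being described as risk factors.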
Motulsky, Harvey J; Brown, Ronald E
2006-01-01
Background Nonlinear regression, like linear regression, assumes that the scatter of data around the ideal curve follows a Gaussian or normal distribution. This assumption leads to the familiar goal of regression: to minimize the sum of the squares of the vertical or Y-value distances between the points and the curve. Outliers can dominate the sum-of-the-squares calculation, and lead to misleading results. However, we know of no practical method for routinely identifying outliers when fitting curves with nonlinear regression. Results We describe a new method for identifying outliers when fitting data with nonlinear regression. We first fit the data using a robust form of nonlinear regression, based on the assumption that scatter follows a Lorentzian distribution. We devised a new adaptive method that gradually becomes more robust as the method proceeds. To define outliers, we adapted the false discovery rate approach to handling multiple comparisons. We then remove the outliers, and analyze the data using ordinary least-squares regression. Because the method combines robust regression and outlier removal, we call it the ROUT method. When analyzing simulated data, where all scatter is Gaussian, our method detects (falsely) one or more outlier in only about 1–3% of experiments. When analyzing data contaminated with one or several outliers, the ROUT method performs well at outlier identification, with an average False Discovery Rate less than 1%. Conclusion Our method, which combines a new method of robust nonlinear regression with a new method of outlier identification, identifies outliers from nonlinear curve fits with reasonable power and few false positives. PMID:16526949
Huang, Sheu-Jen; Hung, Wen-Chi
2016-06-01
This study explored the intertwined effects between the policies and regulations of the companies and personal background on participation in the physical fitness programs and leisure-time activities in financial enterprises. A total of 823 employees were selected as the sample with the multilevel stratification random-sampling technique. The response rate was 52.0%. Data were analyzed with descriptive statistics and hierarchical linear logistic regression. Thirty-two percent and 39% of the employees participated in the physical fitness programs and leisure-time activities, respectively. The factors affecting participation were categorized into intrapersonal factors, interpersonal processes, and primary groups, as well as institutional factors. In the interpersonal processes and primary groups level, higher family social support, more equipment in health promotion was associated with more participation in the programs. With the influence from the institutional level, it was found that health promotion policy amplified the relationship between employees' age and participation, but attenuated the relationship between education level and participation. Health promotion equipment in the institutes attenuated the relationship between colleague social support, family social support, and education level with program participation. Physical activity equipment in the community attenuated the relationship between family social support and program participation. The influential factors of social support and worksite environment could predict the employees' participation in the physical fitness programs and leisure-time physical activities. Health promotion policy and equipment attenuated the negative effects of nonparticipation as well as amplified the positive effects of participation.
DeFina, Laura F; Willis, Benjamin L; Radford, Nina B; Christenson, Robert H; deFilippi, Christopher R; de Lemos, James A
2016-11-28
Cardiorespiratory fitness (CRF) and highly sensitive cardiac troponin T (hs-cTnT) are associated with risk of all-cause and cardiovascular mortality as well as incident heart failure. A link of low CRF to subclinical cardiac injury may explain this association. We hypothesized that CRF would be inversely associated with hs-cTnT measured in healthy adults over age 50. We evaluated 2498 participants (24.7% female, mean age 58.7 years) from the Cooper Center Longitudinal Study between August 2008 and January 2012. Plasma specimens obtained shortly before CRF estimates by Balke treadmill testing were used for hs-cTnT assays. CRF was grouped into low CRF (category 1), moderate CRF (categories 2-3), and high CRF (categories 4-5). Multivariable logistic regression was used to estimate the association of CRF with hs-cTnT. The prevalence of measurable hs-cTnT (≥3 ng/L) was 78.5%. In multivariable analyses, low-fit individuals were significantly more likely than high-fit individuals to have elevated hs-cTnT (≥14 ng/L) (odds ratio 2.47, 95% CI 1.10-5.36). In healthy older adults, CRF is inversely associated with hs-cTnT level adjusted for other risk factors. Prospective studies are needed to evaluate whether improving CRF is effective in preventing subclinical cardiac injury. © 2016 The Authors. Published on behalf of the American Heart Association, Inc., by Wiley Blackwell.
Rank-Optimized Logistic Matrix Regression toward Improved Matrix Data Classification.
Zhang, Jianguang; Jiang, Jianmin
2018-02-01
While existing logistic regression suffers from overfitting and often fails in considering structural information, we propose a novel matrix-based logistic regression to overcome the weakness. In the proposed method, 2D matrices are directly used to learn two groups of parameter vectors along each dimension without vectorization, which allows the proposed method to fully exploit the underlying structural information embedded inside the 2D matrices. Further, we add a joint [Formula: see text]-norm on two parameter matrices, which are organized by aligning each group of parameter vectors in columns. This added co-regularization term has two roles-enhancing the effect of regularization and optimizing the rank during the learning process. With our proposed fast iterative solution, we carried out extensive experiments. The results show that in comparison to both the traditional tensor-based methods and the vector-based regression methods, our proposed solution achieves better performance for matrix data classifications.
Aasa, Ulrika; Lundell, Sara; Aasa, Björn; Westerståhl, Maria
2015-12-01
Longitudinal design. A cohort followed in 3 waves of data collection. The aim of the study was to describe the relationships between the performance of 2 tests of spinal control at the age of 52 years and low back pain, physical activity level, and fitness earlier in life, as well as to describe the cross-sectional relationships between these measures. Altered spinal control has been linked to pain; however, other stimuli may also lead to inability to control the movements of the spine. Participants answered questions about physical activity and low back pain, and performed physical fitness tests at the age of 16, 34, and 52 years. The fitness test battery included tests of endurance in the back and abdominal muscles, a submaximal bicycle ergometer test to estimate maximal oxygen uptake, and measurements of hip flexion, thoracic spine flexibility, and anthropometrics. Two tests were aggregated to a physical fitness index. At the age of 52, also 2 tests of spinal control, the standing Waiter's bow (WB) and the supine double leg lower (LL) were performed. Logistic regression analyses showed that higher back muscle endurance at the age of 34 years could positively predict WB performance at 52 years and higher physical fitness at the age of 34 could positively predict LL performance at 52 years. Regarding cross-sectional relationships, an inability to perform the WB correctly was associated with lower physical fitness, flexibility and physical activity, and larger waist circumference. An inability to correctly perform the LL was associated with lower physical fitness. One-year prevalence of pain was not significantly associated with WB or LL test performance. An active life resulting in higher physical fitness is related to better spinal control in middle-aged men and women. This further strengthens the importance of physical activity throughout the life span. 3.
Multitarget stool DNA testing for colorectal-cancer screening.
Imperiale, Thomas F; Ransohoff, David F; Itzkowitz, Steven H; Levin, Theodore R; Lavin, Philip; Lidgard, Graham P; Ahlquist, David A; Berger, Barry M
2014-04-03
An accurate, noninvasive test could improve the effectiveness of colorectal-cancer screening. We compared a noninvasive, multitarget stool DNA test with a fecal immunochemical test (FIT) in persons at average risk for colorectal cancer. The DNA test includes quantitative molecular assays for KRAS mutations, aberrant NDRG4 and BMP3 methylation, and β-actin, plus a hemoglobin immunoassay. Results were generated with the use of a logistic-regression algorithm, with values of 183 or more considered to be positive. FIT values of more than 100 ng of hemoglobin per milliliter of buffer were considered to be positive. Tests were processed independently of colonoscopic findings. Of the 9989 participants who could be evaluated, 65 (0.7%) had colorectal cancer and 757 (7.6%) had advanced precancerous lesions (advanced adenomas or sessile serrated polyps measuring ≥1 cm in the greatest dimension) on colonoscopy. The sensitivity for detecting colorectal cancer was 92.3% with DNA testing and 73.8% with FIT (P=0.002). The sensitivity for detecting advanced precancerous lesions was 42.4% with DNA testing and 23.8% with FIT (P<0.001). The rate of detection of polyps with high-grade dysplasia was 69.2% with DNA testing and 46.2% with FIT (P=0.004); the rates of detection of serrated sessile polyps measuring 1 cm or more were 42.4% and 5.1%, respectively (P<0.001). Specificities with DNA testing and FIT were 86.6% and 94.9%, respectively, among participants with nonadvanced or negative findings (P<0.001) and 89.8% and 96.4%, respectively, among those with negative results on colonoscopy (P<0.001). The numbers of persons who would need to be screened to detect one cancer were 154 with colonoscopy, 166 with DNA testing, and 208 with FIT. In asymptomatic persons at average risk for colorectal cancer, multitarget stool DNA testing detected significantly more cancers than did FIT but had more false positive results. (Funded by Exact Sciences; ClinicalTrials.gov number, NCT01397747.).
Detecting DIF in Polytomous Items Using MACS, IRT and Ordinal Logistic Regression
ERIC Educational Resources Information Center
Elosua, Paula; Wells, Craig
2013-01-01
The purpose of the present study was to compare the Type I error rate and power of two model-based procedures, the mean and covariance structure model (MACS) and the item response theory (IRT), and an observed-score based procedure, ordinal logistic regression, for detecting differential item functioning (DIF) in polytomous items. A simulation…
ERIC Educational Resources Information Center
Rudner, Lawrence
2016-01-01
In the machine learning literature, it is commonly accepted as fact that as calibration sample sizes increase, Naïve Bayes classifiers initially outperform Logistic Regression classifiers in terms of classification accuracy. Applied to subtests from an on-line final examination and from a highly regarded certification examination, this study shows…
ERIC Educational Resources Information Center
Fan, Xitao; Wang, Lin
The Monte Carlo study compared the performance of predictive discriminant analysis (PDA) and that of logistic regression (LR) for the two-group classification problem. Prior probabilities were used for classification, but the cost of misclassification was assumed to be equal. The study used a fully crossed three-factor experimental design (with…
ERIC Educational Resources Information Center
Nguyen, Phuong L.
2006-01-01
This study examines the effects of parental SES, school quality, and community factors on children's enrollment and achievement in rural areas in Viet Nam, using logistic regression and ordered logistic regression. Multivariate analysis reveals significant differences in educational enrollment and outcomes by level of household expenditures and…
School Exits in the Milwaukee Parental Choice Program: Evidence of a Marketplace?
ERIC Educational Resources Information Center
Ford, Michael
2011-01-01
This article examines whether the large number of school exits from the Milwaukee school voucher program is evidence of a marketplace. Two logistic regression and multinomial logistic regression models tested the relation between the inability to draw large numbers of voucher students and the ability for a private school to remain viable. Data on…
Hierarchical Bayesian Logistic Regression to forecast metabolic control in type 2 DM patients.
Dagliati, Arianna; Malovini, Alberto; Decata, Pasquale; Cogni, Giulia; Teliti, Marsida; Sacchi, Lucia; Cerra, Carlo; Chiovato, Luca; Bellazzi, Riccardo
2016-01-01
In this work we present our efforts in building a model able to forecast patients' changes in clinical conditions when repeated measurements are available. In this case the available risk calculators are typically not applicable. We propose a Hierarchical Bayesian Logistic Regression model, which allows taking into account individual and population variability in model parameters estimate. The model is used to predict metabolic control and its variation in type 2 diabetes mellitus. In particular we have analyzed a population of more than 1000 Italian type 2 diabetic patients, collected within the European project Mosaic. The results obtained in terms of Matthews Correlation Coefficient are significantly better than the ones gathered with standard logistic regression model, based on data pooling.
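For readers unfamiliar with the Matthews Correlation Coefficient used to compare the models above, a minimal sketch of its standard definition follows. The counts in the usage lines are made up for illustration and are not results from the study.

```python
import math


def matthews_corrcoef(tp, fp, tn, fn):
    """Matthews Correlation Coefficient from binary confusion-matrix counts.

    Ranges from -1 (total disagreement) through 0 (chance level)
    to +1 (perfect prediction).
    """
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denom == 0:
        # Common convention when a row or column of the table is empty.
        return 0.0
    return (tp * tn - fp * fn) / denom


# Hypothetical counts, for illustration only.
print(matthews_corrcoef(tp=40, fp=10, tn=90, fn=20))
```

Unlike plain accuracy, the coefficient uses all four cells of the confusion matrix, which makes it less sensitive to class imbalance.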
Li, Ji; Gray, B.R.; Bates, D.M.
2008-01-01
Partitioning the variance of a response by design levels is challenging for binomial and other discrete outcomes. Goldstein (2003) proposed four definitions for variance partitioning coefficients (VPC) under a two-level logistic regression model. In this study, we explicitly derived formulae for multi-level logistic regression models and subsequently studied the distributional properties of the calculated VPCs. Using simulations and a vegetation dataset, we demonstrated associations between different VPC definitions, the importance of methods for estimating VPCs (by comparing VPCs obtained using Laplace and penalized quasi-likelihood methods), and bivariate dependence between VPCs calculated at different levels. Such an empirical study lends immediate support to wider applications of VPC in scientific data analysis.
Chatterjee, Tanaya; Chatterjee, Barun K; Majumdar, Dipanwita; Chakrabarti, Pinak
2015-02-01
An alternative to conventional antibiotics is needed to fight against emerging multiple drug resistant pathogenic bacteria. In this endeavor, the effect of silver nanoparticles (Ag-NP) has been studied quantitatively on two common pathogenic bacteria, Escherichia coli and Staphylococcus aureus, and the growth curves were modeled. The effect of Ag-NP on bacterial growth kinetics was studied by measuring the optical density, and was fitted by non-linear regression using the Logistic and modified Gompertz models. Scanning Electron Microscopy and fluorescence microscopy were used to study the morphological changes of the bacterial cells. Generation of reactive oxygen species in Ag-NP-treated cells was measured by fluorescence emission spectra. The modified Gompertz model, incorporating cell death, fits the observed data better than the Logistic model. With increasing concentration of Ag-NP, the growth kinetics of both bacteria show a decline in growth rate with simultaneous enhancement of death rate constants. The duration of the lag phase was found to increase with Ag-NP concentration. SEM showed morphological changes, while fluorescence microscopy using DAPI showed compaction of DNA for Ag-NP-treated bacterial cells. E. coli was found to be more susceptible to Ag-NP as compared to S. aureus. The modified Gompertz model, using a death term, was found to be useful in explaining the non-monotonic nature of the growth curve. The modified Gompertz model derived here is of general nature and can be used to study any microbial growth kinetics under the influence of antimicrobial agents. Copyright © 2014 Elsevier B.V. All rights reserved.
Li, Hongqun; Yue, Bisong; Lian, Zhenmin; Zhao, Hongfeng; Zhao, Delong; Xiao, Xiangming
2012-09-01
A detailed understanding of the habitat needs of brown eared pheasants (Crossoptilon mantchuricum) is essential for conserving the species. We carried out field surveys in the Huanglong Mountains of Shaanxi Province, China, from March to June in 2007 and 2008. We arrayed a total of 206 grid plots (200 × 200 m) along transects in 2007 and 2008 and quantified a suite of environmental variables for each one. In the optimal logistic regression model, the most important variables for brown eared pheasants were slope degree, tree cover, distance to nearest water, and cover and depth of fallen leaves. Hosmer and Lemeshow goodness-of-fit tests indicated that the logistic models for the species fit the data well. The model suggested that spring habitat selection of the brown eared pheasant was negatively related to distance to nearest water and slope degree, and positively related to cover of trees and to cover and depth of fallen leaves. In addition, the observed detected and undetected grids in 2007 did not show significant differences from predictions based on the model. These results showed that the model could predict the habitat selection of brown eared pheasants well. Based on these predictive models, we suggest that habitat management plans incorporating this new information can now focus more effectively on restrictions on the number of tourists entering the nature reserve, prohibition of firewood collection, livestock grazing, and medicinal plant harvesting by local residents in the core areas, protection of mixed forest and sources of the permanent water in the reserve, and use of alternatives to firewood.
Role of subdural electrocorticography in prediction of long-term seizure outcome in epilepsy surgery
Juhász, Csaba; Shah, Aashit; Sood, Sandeep; Chugani, Harry T.
2009-01-01
Since prediction of long-term seizure outcome using preoperative diagnostic modalities remains suboptimal in epilepsy surgery, we evaluated whether interictal spike frequency measures obtained from extraoperative subdural electrocorticography (ECoG) recording could predict long-term seizure outcome. This study included 61 young patients (age 0.4–23.0 years), who underwent extraoperative ECoG recording prior to cortical resection for alleviation of uncontrolled focal seizures. Patient age, frequency of preoperative seizures, neuroimaging findings, ictal and interictal ECoG measures were preoperatively obtained. The seizure outcome was prospectively measured [follow-up period: 2.5–6.4 years (mean 4.6 years)]. Univariate and multivariate logistic regression analyses determined how well preoperative demographic and diagnostic measures predicted long-term seizure outcome. Following the initial cortical resection, Engel Class I, II, III and IV outcomes were noted in 35, 6, 12 and 7 patients, respectively. One child died due to disseminated intravascular coagulation associated with pseudomonas sepsis 2 days after surgery. Univariate regression analyses revealed that incomplete removal of seizure onset zone, higher interictal spike-frequency in the preserved cortex and incomplete removal of cortical abnormalities on neuroimaging were associated with a greater risk of failing to obtain Class I outcome. Multivariate logistic regression analysis revealed that incomplete removal of seizure onset zone was the only independent predictor of failure to obtain Class I outcome. 
The goodness of regression model fit and the predictive ability of regression model were greatest in the full regression model incorporating both ictal and interictal measures [R2 0.44; Area under the receiver operating characteristic (ROC) curve: 0.81], slightly smaller in the reduced model incorporating ictal but not interictal measures (R2 0.40; Area under the ROC curve: 0.79) and slightly smaller again in the reduced model incorporating interictal but not ictal measures (R2 0.27; Area under the ROC curve: 0.77). Seizure onset zone and interictal spike frequency measures on subdural ECoG recording may both be useful in predicting the long-term seizure outcome of epilepsy surgery. Yet, the additive clinical impact of interictal spike frequency measures to predict long-term surgical outcome may be modest in the presence of ictal ECoG and neuroimaging data. PMID:19286694
Nielsen, Jannie; Bahendeka, Silver K; Whyte, Susan R; Meyrowitsch, Dan W; Bygbjerg, Ib C; Witte, Daniel R
2017-09-21
Prevention of type 2 diabetes (T2D) has been successfully established in randomised clinical trials. However, the best methods for the translation of this evidence into effective population-wide interventions remain unclear. To assess whether households could be a target for T2D prevention and screening, we investigated the resemblance of T2D risk factors at household level and by type of familial dyadic relationship in a rural Ugandan community. This cross-sectional household-based study included 437 individuals ≥13 years of age from 90 rural households in south-western Uganda. Resemblance in glycosylated haemoglobin (HbA1c), anthropometry, blood pressure, fitness status and sitting time was analysed using a general mixed model with random effects (by household or dyad) to calculate household intraclass correlation coefficients (ICCs) and dyadic regression coefficients. Logistic regression with household as a random effect was used to calculate the ORs for individuals having a condition or risk factor if another household member had the same condition. The strongest degree of household member resemblance in T2D risk factors was seen in relation to fitness status (ICC=0.24), HbA1c (ICC=0.18) and systolic blood pressure (ICC=0.11). Regarding dyadic resemblance, the highest standardised regression coefficient was seen in fitness status for spouses (0.54, 95% CI 0.32 to 0.76), parent-offspring (0.41, 95% CI 0.28 to 0.54) and siblings (0.41, 95% CI 0.25 to 0.57). Overall, parent-offspring and sibling pairs were the dyads with strongest resemblance, followed by spouses. The marked degree of resemblance in T2D risk factors at household level and between spouses, parent-offspring and sibling dyads suggests that shared behavioural and environmental factors may influence risk factor levels among cohabiting individuals, which points to the potential of the household setting for screening and prevention of T2D.
© Article author(s) (or their employer(s) unless otherwise stated in the text of the article) 2017. All rights reserved. No commercial use is permitted unless otherwise expressly granted.
Prunier, J G; Colyn, M; Legendre, X; Nimon, K F; Flamand, M C
2015-01-01
Direct gradient analyses in spatial genetics provide unique opportunities to describe the inherent complexity of genetic variation in wildlife species and are the object of many methodological developments. However, multicollinearity among explanatory variables is a systemic issue in multivariate regression analyses and is likely to cause serious difficulties in properly interpreting results of direct gradient analyses, with the risk of erroneous conclusions, misdirected research and inefficient or counterproductive conservation measures. Using simulated data sets along with linear and logistic regressions on distance matrices, we illustrate how commonality analysis (CA), a detailed variance-partitioning procedure that was recently introduced in the field of ecology, can be used to deal with nonindependence among spatial predictors. By decomposing model fit indices into unique and common (or shared) variance components, CA allows identifying the location and magnitude of multicollinearity, revealing spurious correlations and thus thoroughly improving the interpretation of multivariate regressions. Despite a few inherent limitations, especially in the case of resistance model optimization, this review highlights the great potential of CA to account for complex multicollinearity patterns in spatial genetics and identifies future applications and lines of research. We strongly urge spatial geneticists to systematically investigate commonalities when performing direct gradient analyses. © 2014 John Wiley & Sons Ltd.
An early, novel illness severity score to predict outcome after cardiac arrest.
Rittenberger, Jon C; Tisherman, Samuel A; Holm, Margo B; Guyette, Francis X; Callaway, Clifton W
2011-11-01
Illness severity scores are commonly employed in critically ill patients to predict outcome. To date, prior scores for post-cardiac arrest patients rely on some event-related data. We developed an early, novel post-arrest illness severity score to predict survival, good outcome and development of multiple organ failure (MOF) after cardiac arrest. Retrospective review of data from adults treated after in-hospital or out-of-hospital cardiac arrest in a single tertiary care facility between 1/1/2005 and 12/31/2009. In addition to clinical data, initial illness severity was measured using serial organ function assessment (SOFA) scores and full outline of unresponsiveness (FOUR) scores at hospital or intensive care unit arrival. Outcomes were hospital mortality, good outcome (discharge to home or rehabilitation) and development of multiple organ failure (MOF). Single-variable logistic regression followed by Chi-squared automatic interaction detector (CHAID) was used to determine predictors of outcome. Stepwise multivariate logistic regression was used to determine the independent association between predictors and each outcome. The Hosmer-Lemeshow test was used to evaluate goodness of fit. The n-fold method was used to cross-validate each CHAID analysis and the difference between the misclassification risk estimates was used to determine model fit. Complete data from 457/495 (92%) subjects identified distinct categories of illness severity using combined FOUR motor and brainstem subscales, and combined SOFA cardiovascular and respiratory subscales: I. Awake; II. Moderate coma without cardiorespiratory failure; III. Moderate coma with cardiorespiratory failure; and IV. Severe coma. Survival was independently associated with category (I: OR 58.65; 95% CI 27.78, 123.82; II: OR 14.60; 95% CI 7.34, 29.02; III: OR 10.58; 95% CI 4.86, 23.00). Category was also similarly associated with good outcome and development of MOF. 
The proportion of subjects in each category changed over time. Initial illness severity explains much of the variation in cardiac arrest outcome. This model provides prognostic information at hospital arrival and may be used to stratify patients in future studies. Copyright © 2011 Elsevier Ireland Ltd. All rights reserved.
Stol, Ilana S; Ehrenfeld, Jesse M; Epstein, Richard H
2014-03-01
Anesthesia information management systems (AIMS) are electronic health records that automatically import vital signs from patient monitors and allow for computer-assisted creation of the anesthesia record. When most recently surveyed in 2007, it was estimated that at least 16% of U.S. academic hospitals (i.e., with an anesthesia residency program) had installed an AIMS. At least an additional 28% reported that they were in the process of implementing, or searching for an AIMS. In this study, we updated the adoption figures as of May 2013 and examined the historical trend of AIMS deployment in U.S. anesthesia residency programs from the perspective of the theory of diffusion of technologic innovations. Questionnaires were sent by e-mail to program directors or their identified contact individuals at the 130 U.S. anesthesiology residency programs accredited as of June 30, 2012 by the Accreditation Council for Graduate Medical Education. The questionnaires asked whether the department had an AIMS, the year of installation, and, if not present, whether there were plans to install an AIMS within the next 12 months. Follow-up e-mails and phone calls were made until responses were obtained from all programs. Results were collected between February and May 2013. Implementation percentages were determined using the number of accredited anesthesia residency programs at the start of each academic year between 1987 and 2013 and were fit to a logistic regression curve using data through 2012. Responses were received from all 130 programs. Eighty-seven (67%) reported that they currently are using an AIMS. Ten programs without a current AIMS responded that they would be installing an AIMS within 12 months of the survey. The rate of AIMS adoption by year was well fit by a logistic regression curve (P = 0.90). By the end of 2014, approximately 75% of U.S. academic anesthesiology departments will be using an AIMS, with 84% adoption expected between 2018 and 2020. 
Historical adoption of AIMS has followed Rogers' 1962 formulation of the theory of diffusion of innovation.
Abdullah, N; Abdul Murad, N A; Mohd Haniff, E A; Syafruddin, S E; Attia, J; Oldmeadow, C; Kamaruddin, M A; Abd Jalal, N; Ismail, N; Ishak, M; Jamal, R; Scott, R J; Holliday, E G
2017-08-01
Malaysia has a high and rising prevalence of type 2 diabetes (T2D). While environmental (non-genetic) risk factors for the disease are well established, the role of genetic variations and gene-environment interactions remains understudied in this population. This study aimed to estimate the relative contributions of environmental and genetic risk factors to T2D in Malaysia and also to assess evidence for gene-environment interactions that may explain additional risk variation. This was a case-control study including 1604 Malays, 1654 Chinese and 1728 Indians from the Malaysian Cohort Project. The proportion of T2D risk variance explained by known genetic and environmental factors was assessed by fitting multivariable logistic regression models and evaluating McFadden's pseudo R² and the area under the receiver-operating characteristic curve (AUC). Models with and without the genetic risk score (GRS) were compared using the log likelihood ratio Chi-squared test and AUCs. Multiplicative interaction between genetic and environmental risk factors was assessed via logistic regression within and across ancestral groups. Interactions were assessed for the GRS and its 62 constituent variants. The models including environmental risk factors only had pseudo R² values of 16.5-28.3% and AUC of 0.75-0.83. Incorporating a genetic score aggregating 62 T2D-associated risk variants significantly increased the model fit (likelihood ratio P-values from 2.50 × 10⁻⁴ to 4.83 × 10⁻¹²) and increased the pseudo R² by about 1-2% and AUC by 1-3%. None of the gene-environment interactions reached significance after multiple testing adjustment, either for the GRS or individual variants. For individual variants, 33 out of 310 tested associations showed nominal statistical significance with 0.001 < P < 0.05. This study suggests that known genetic risk variants contribute a significant but small amount to overall T2D risk variation in Malaysian population groups.
If gene-environment interactions involving common genetic variants exist, they are likely of small effect, requiring substantially larger samples for detection. Copyright © 2017 The Royal Society for Public Health. All rights reserved.
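The two fit measures named in this abstract, McFadden's pseudo R² and the likelihood ratio test between nested models, follow directly from the fitted log-likelihoods. The sketch below uses made-up log-likelihood values and hypothetical function names; it also assumes that adding the GRS contributes exactly one extra parameter, so the test has one degree of freedom.

```python
import math


def mcfadden_r2(ll_model, ll_null):
    """McFadden's pseudo R-squared: 1 - lnL(model) / lnL(intercept-only)."""
    return 1.0 - ll_model / ll_null


def lr_statistic(ll_reduced, ll_full):
    """Likelihood ratio statistic for nested models: 2 * (lnL_full - lnL_reduced)."""
    return 2.0 * (ll_full - ll_reduced)


def chi2_sf_df1(x):
    """Upper-tail probability of a chi-squared variable with 1 degree of freedom."""
    return math.erfc(math.sqrt(x / 2.0))


# ASSUMPTION: illustrative log-likelihoods, not values from the study.
ll_null = -1000.0      # intercept-only model
ll_env = -800.0        # environmental risk factors only
ll_env_grs = -790.0    # environmental risk factors plus GRS

print("pseudo R2, environment only:", mcfadden_r2(ll_env, ll_null))
print("pseudo R2, environment + GRS:", mcfadden_r2(ll_env_grs, ll_null))

stat = lr_statistic(ll_env, ll_env_grs)
print("LR statistic:", stat, "P-value (1 df):", chi2_sf_df1(stat))
```

The closed form via `math.erfc` holds only for one degree of freedom; a test with more added parameters would need a general chi-squared survival function.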
TU-CD-BRB-01: Normal Lung CT Texture Features Improve Predictive Models for Radiation Pneumonitis
DOE Office of Scientific and Technical Information (OSTI.GOV)
Krafft, S; The University of Texas Graduate School of Biomedical Sciences, Houston, TX; Briere, T
2015-06-15
Purpose: Existing normal tissue complication probability (NTCP) models for radiation pneumonitis (RP) traditionally rely on dosimetric and clinical data but are limited in terms of performance and generalizability. Extraction of pre-treatment image features provides a potential new category of data that can improve NTCP models for RP. We consider quantitative measures of total lung CT intensity and texture in a framework for prediction of RP. Methods: Available clinical and dosimetric data were collected for 198 NSCLC patients treated with definitive radiotherapy. Intensity- and texture-based image features were extracted from the T50 phase of the 4D-CT acquired for treatment planning. A total of 3888 features (15 clinical, 175 dosimetric, and 3698 image features) were gathered and considered candidate predictors for modeling of RP grade≥3. A baseline logistic regression model with mean lung dose (MLD) was first considered. Additionally, a least absolute shrinkage and selection operator (LASSO) logistic regression was applied to the set of clinical and dosimetric features, and subsequently to the full set of clinical, dosimetric, and image features. Model performance was assessed by comparing area under the curve (AUC). Results: A simple logistic fit of MLD was an inadequate model of the data (AUC∼0.5). Including clinical and dosimetric parameters within the framework of the LASSO resulted in improved performance (AUC=0.648). Analysis of the full cohort of clinical, dosimetric, and image features provided further and significant improvement in model performance (AUC=0.727). Conclusions: To achieve significant gains in predictive modeling of RP, new categories of data should be considered in addition to clinical and dosimetric features. We have successfully incorporated CT image features into a framework for modeling RP and have demonstrated improved predictive performance.
Validation and further investigation of CT image features in the context of RP NTCP modeling is warranted. This work was supported by the Rosalie B. Hite Fellowship in Cancer research awarded to SPK.
NASA Astrophysics Data System (ADS)
Yilmaz, Işık
2009-06-01
The purpose of this study is to compare the landslide susceptibility mapping methods of frequency ratio (FR), logistic regression and artificial neural networks (ANN) applied in the Kat County (Tokat, Turkey). A digital elevation model (DEM) was first constructed using GIS software. Landslide-related factors such as geology, faults, drainage system, topographical elevation, slope angle, slope aspect, topographic wetness index (TWI) and stream power index (SPI) were used in the landslide susceptibility analyses. Landslide susceptibility maps were produced from the frequency ratio, logistic regression and neural network models, and they were then compared by means of their validations. Comparison of the susceptibility maps with the known landslide locations showed high accuracy for all three models. The respective area under curve (AUC) values of 0.826, 0.842 and 0.852 for frequency ratio, logistic regression and artificial neural networks showed that, although the map obtained from the ANN model is more accurate than those from the other models, the accuracies of all models can be considered relatively similar. The results obtained in this study also showed that the frequency ratio model can be used as a simple tool in the assessment of landslide susceptibility when a sufficient amount of data is available. Input, calculation and output processes are very simple and can be readily understood in the frequency ratio model, whereas logistic regression and neural networks require the conversion of data to ASCII or other formats. Moreover, it is also very hard to process the large amount of data in the statistical package.
ERIC Educational Resources Information Center
Schumacher, Phyllis; Olinsky, Alan; Quinn, John; Smith, Richard
2010-01-01
The authors extended previous research by 2 of the authors who conducted a study designed to predict the successful completion of students enrolled in an actuarial program. They used logistic regression to determine the probability of an actuarial student graduating in the major or dropping out. They compared the results of this study with those…
Carolyn B. Meyer; Sherri L. Miller; C. John Ralph
2004-01-01
The scale at which habitat variables are measured affects the accuracy of resource selection functions in predicting animal use of sites. We used logistic regression models for a wide-ranging species, the marbled murrelet, (Brachyramphus marmoratus) in a large region in California to address how much changing the spatial or temporal scale of...
ERIC Educational Resources Information Center
Monahan, Patrick O.; McHorney, Colleen A.; Stump, Timothy E.; Perkins, Anthony J.
2007-01-01
Previous methodological and applied studies that used binary logistic regression (LR) for detection of differential item functioning (DIF) in dichotomously scored items either did not report an effect size or did not employ several useful measures of DIF magnitude derived from the LR model. Equations are provided for these effect size indices.…
ERIC Educational Resources Information Center
Magis, David; Raiche, Gilles; Beland, Sebastien; Gerard, Paul
2011-01-01
We present an extension of the logistic regression procedure to identify dichotomous differential item functioning (DIF) in the presence of more than two groups of respondents. Starting from the usual framework of a single focal group, we propose a general approach to estimate the item response functions in each group and to test for the presence…
Risk Factors of Falls in Community-Dwelling Older Adults: Logistic Regression Tree Analysis
ERIC Educational Resources Information Center
Yamashita, Takashi; Noe, Douglas A.; Bailer, A. John
2012-01-01
Purpose of the Study: A novel logistic regression tree-based method was applied to identify fall risk factors and possible interaction effects of those risk factors. Design and Methods: A nationally representative sample of American older adults aged 65 years and older (N = 9,592) in the Health and Retirement Study 2004 and 2006 modules was used.…
ERIC Educational Resources Information Center
Gordovil-Merino, Amalia; Guardia-Olmos, Joan; Pero-Cebollero, Maribel
2012-01-01
In this paper, we used simulations to compare the performance of classical and Bayesian estimations in logistic regression models using small samples. In the performed simulations, conditions were varied, including the type of relationship between independent and dependent variable values (i.e., unrelated and related values), the type of variable…
Ohlmacher, G.C.; Davis, J.C.
2003-01-01
Landslides in the hilly terrain along the Kansas and Missouri rivers in northeastern Kansas have caused millions of dollars in property damage during the last decade. To address this problem, a statistical method called multiple logistic regression has been used to create a landslide-hazard map for Atchison, Kansas, and surrounding areas. Data included digitized geology, slopes, and landslides, manipulated using ArcView GIS. Logistic regression relates predictor variables to the occurrence or nonoccurrence of landslides within geographic cells and uses the relationship to produce a map showing the probability of future landslides, given local slopes and geologic units. Results indicated that slope is the most important variable for estimating landslide hazard in the study area. Geologic units consisting mostly of shale, siltstone, and sandstone were most susceptible to landslides. Soil type and aspect ratio were considered but excluded from the final analysis because these variables did not significantly add to the predictive power of the logistic regression. Soil types were highly correlated with the geologic units, and no significant relationships existed between landslides and slope aspect. © 2003 Elsevier Science B.V. All rights reserved.
A Method for Calculating the Probability of Successfully Completing a Rocket Propulsion Ground Test
NASA Technical Reports Server (NTRS)
Messer, Bradley
2007-01-01
Propulsion ground test facilities face the daily challenge of scheduling multiple customers into limited facility space and successfully completing their propulsion test projects. Over the last decade NASA's propulsion test facilities have performed hundreds of tests, collected thousands of seconds of test data, and exceeded the capabilities of numerous test facility and test article components. A logistic regression mathematical modeling technique has been developed to predict the probability of successfully completing a rocket propulsion test. A logistic regression model is a mathematical modeling approach that can be used to describe the relationship of several independent predictor variables X_1, X_2, ..., X_k to a binary or dichotomous dependent variable Y, where Y can only be one of two possible outcomes, in this case Success or Failure of accomplishing a full duration test. The use of logistic regression modeling is not new; however, modeling propulsion ground test facilities using logistic regression is both a new and unique application of the statistical technique. Results from this type of model provide project managers with insight and confidence into the effectiveness of rocket propulsion ground testing.
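The model form described in this abstract, predictors X_1 through X_k mapped to the probability of a binary outcome, can be sketched as follows. The intercept, coefficients, and predictor values are hypothetical placeholders and are not taken from the NASA report.

```python
import math


def success_probability(intercept, coefficients, predictors):
    """P(Y = Success | X) = 1 / (1 + exp(-(b0 + b1*X_1 + ... + bk*X_k)))."""
    linear = intercept + sum(b * x for b, x in zip(coefficients, predictors))
    # Evaluate the logistic function in a numerically stable way.
    if linear >= 0:
        return 1.0 / (1.0 + math.exp(-linear))
    e = math.exp(linear)
    return e / (1.0 + e)


# ASSUMPTION: illustrative coefficients for two made-up predictors.
b0 = -1.0
b = [0.8, -0.5]
x = [2.0, 1.0]
print(success_probability(b0, b, x))
```

Each coefficient is a change in the log-odds of Success per unit change in its predictor, so exponentiating a coefficient gives the corresponding odds ratio.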
Fei, Yang; Hu, Jian; Gao, Kun; Tu, Jianfeng; Li, Wei-Qin; Wang, Wei
2017-06-01
To construct a radial basis function (RBF) artificial neural networks (ANNs) model to predict the incidence of acute pancreatitis (AP)-induced portal vein thrombosis (PVT). The analysis included 353 patients with AP who were admitted between January 2011 and December 2015. An RBF ANNs model and a logistic regression model were each constructed based on eleven factors relevant to AP. Statistical indexes were used to evaluate the predictive value of the two models. The sensitivity, specificity, positive predictive value, negative predictive value and accuracy of the RBF ANNs model for predicting PVT were 73.3%, 91.4%, 68.8%, 93.0% and 87.7%, respectively. There were significant differences between the RBF ANNs and logistic regression models in these parameters (P<0.05). In addition, a comparison of the area under receiver operating characteristic curves of the two models showed a statistically significant difference (P<0.05). The RBF ANNs model is more likely to predict the occurrence of PVT induced by AP than the logistic regression model. D-dimer, AMY, Hct and PT were important predictive factors for AP-induced PVT. Copyright © 2017 Elsevier Inc. All rights reserved.
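The five statistical indexes reported above all derive from a single 2×2 confusion matrix. A minimal sketch of those definitions follows; the counts are hypothetical, since the abstract does not report the underlying table.

```python
def classification_metrics(tp, fp, tn, fn):
    """Standard diagnostic indexes from binary confusion-matrix counts."""
    return {
        "sensitivity": tp / (tp + fn),   # true positives among all actual positives
        "specificity": tn / (tn + fp),   # true negatives among all actual negatives
        "ppv": tp / (tp + fp),           # positive predictive value
        "npv": tn / (tn + fn),           # negative predictive value
        "accuracy": (tp + tn) / (tp + fp + tn + fn),
    }


# ASSUMPTION: made-up counts, for illustration only.
for name, value in classification_metrics(tp=40, fp=10, tn=90, fn=20).items():
    print(f"{name}: {value:.3f}")
```

Sensitivity and specificity describe the classifier itself, while PPV and NPV also depend on how common the outcome is in the sample.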
Alexander, Adam C; Obong'o, Christopher O; Chavan, Prachi; Vander Weg, Mark W; Ward, Kenneth D
2018-03-21
Drug use remains an important public health concern in the United States, and understanding drug use among young adolescents is vital towards improving the health of the population. This study applied the Problem Behavior Theory (PBT) to lifetime drug use among a cross-sectional sample of Boy Scouts (N = 770). The PBT provides a conceptual framework for identifying risk and protective factors for adolescent problem behaviors, including drug use. Scouts reported their drug use and socio-demographics, and were assessed on several risk and protective factors. For analyses, sociodemographic and risk and protective factors were selected according to the framework provided by PBT, and use of each drug was regressed logistically on these selected factors. Final logistic models were assessed for goodness of fit and discriminatory power. The PBT demonstrated discriminatory power for all drugs (Tjur's R² values ≥ .29), but fell sharply for illicit drug use (Tjur's R² = .20). There were no consistent correlates of drug use. Conclusions/Importance: The PBT had less explanatory power for illicit drug use compared to tobacco, alcohol, and marijuana, which suggests different risk and protective factors were associated with illicit drug use.
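The abstract above reports Tjur's R² as its measure of discriminatory power. Tjur's R² (the coefficient of discrimination) is the mean predicted probability among cases that experienced the outcome minus the mean predicted probability among cases that did not. A minimal sketch follows; the function name and the toy data are hypothetical and are not taken from the study.

```python
def tjur_r2(outcomes, predicted_probs):
    """Tjur's R^2: mean fitted probability for events minus that for non-events."""
    events = [p for y, p in zip(outcomes, predicted_probs) if y == 1]
    non_events = [p for y, p in zip(outcomes, predicted_probs) if y == 0]
    if not events or not non_events:
        raise ValueError("both outcome groups must be present")
    return sum(events) / len(events) - sum(non_events) / len(non_events)

# Toy outcomes and fitted probabilities, for illustration only.
observed = [1, 1, 0, 0]
fitted = [0.9, 0.7, 0.2, 0.4]
print(round(tjur_r2(observed, fitted), 3))
```

A value near 1 means the fitted probabilities separate the two groups almost perfectly, while a value near 0 means the model barely distinguishes them.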
Wang, Shuang; Jiang, Xiaoqian; Wu, Yuan; Cui, Lijuan; Cheng, Samuel; Ohno-Machado, Lucila
2013-01-01
We developed an EXpectation Propagation LOgistic REgRession (EXPLORER) model for distributed privacy-preserving online learning. The proposed framework provides a high level guarantee for protecting sensitive information, since the information exchanged between the server and the client is the encrypted posterior distribution of coefficients. Through experimental results, EXPLORER shows the same performance (e.g., discrimination, calibration, feature selection etc.) as the traditional frequentist Logistic Regression model, but provides more flexibility in model updating. That is, EXPLORER can be updated one point at a time rather than having to retrain the entire data set when new observations are recorded. The proposed EXPLORER supports asynchronized communication, which relieves the participants from coordinating with one another, and prevents service breakdown from the absence of participants or interrupted communications. PMID:23562651
Millard, Steven P; Shofer, Jane; Braff, David; Calkins, Monica; Cadenhead, Kristin; Freedman, Robert; Green, Michael F; Greenwood, Tiffany A; Gur, Raquel; Gur, Ruben; Lazzeroni, Laura C; Light, Gregory A; Olincy, Ann; Nuechterlein, Keith; Seidman, Larry; Siever, Larry; Silverman, Jeremy; Stone, William S; Sprock, Joyce; Sugar, Catherine A; Swerdlow, Neal R; Tsuang, Ming; Turetsky, Bruce; Radant, Allen; Tsuang, Debby W
2016-07-01
Past studies describe numerous endophenotypes associated with schizophrenia (SZ), but many endophenotypes may overlap in information they provide, and few studies have investigated the utility of a multivariate index to improve discrimination between SZ and healthy community comparison subjects (CCS). We investigated 16 endophenotypes from the first phase of the Consortium on the Genetics of Schizophrenia, a large, multi-site family study, to determine whether a subset could distinguish SZ probands and CCS just as well as using all 16. Participants included 345 SZ probands and 517 CCS with a valid measure for at least one endophenotype. We used both logistic regression and random forest models to choose a subset of endophenotypes, adjusting for age, gender, smoking status, site, parent education, and the reading subtest of the Wide Range Achievement Test. As a sensitivity analysis, we re-fit models using multiple imputations to determine the effect of missing values. We identified four important endophenotypes: antisaccade, Continuous Performance Test-Identical Pairs 3-digit version, California Verbal Learning Test, and emotion identification. The logistic regression model that used just these four endophenotypes produced essentially the same results as the model that used all 16 (84% vs. 85% accuracy). While a subset of endophenotypes cannot replace clinical diagnosis nor encompass the complexity of the disease, it can aid in the design of future endophenotypic and genetic studies by reducing study cost and subject burden, simplifying sample enrichment, and improving the statistical power of locating those genetic regions associated with schizophrenia that may be the easiest to identify initially. Published by Elsevier B.V.
Parra-Henao, Gabriel; Quirós-Gómez, Oscar; Jaramillo-O, Nicolas; Cardona, Ángela Segura
2016-04-01
Triatoma dimidiata (Hemiptera: Reduviidae) is a secondary vector of Trypanosoma cruzi in Colombia and represents an important epidemiological risk mainly in the central and oriental regions of the country where it occupies sylvatic, peridomestic, and intradomestic ecotopes, and because of this complex distribution, its distribution and abundance could be conditioned by environmental factors. In this work, we explored the relationship between T. dimidiata distribution and environmental factors in the northwest, northeast, and central zones of Colombia and developed predictive models of infestation in the country. The associations between the presence of T. dimidiata and environmental variables were studied using logistic regression models and ecological niche modeling for a sample of villages in Colombia. The analysis was based on the information collected in field about the presence of T. dimidiata and the environmental data for each village extracted from remote sensing images. The presence of Triatoma dimidiata (Latreille, 1811) was found to be significantly associated with the maximum vegetation index, minimum land surface temperature (LST), and the digital elevation for the statistical model. Temperature seasonality, annual precipitation, and vegetation index were the variables that most influenced the ecological niche model of T. dimidiata distribution. The logistic regression model showed a good fit and predicted suitable habitats in the Andean and Caribbean regions, which agrees with the known distribution of the species, but predicted suitable habitats in the Pacific and Orinoco regions proposing new areas of research. Improved models to predict suitable habitats for T. dimidiata hold promise for spatial targeting of integrated vector management. © The American Society of Tropical Medicine and Hygiene.
Parra-Henao, Gabriel; Quirós-Gómez, Oscar; Jaramillo-O, Nicolas; Cardona, Ángela Segura
2016-01-01
Triatoma dimidiata (Hemiptera: Reduviidae) is a secondary vector of Trypanosoma cruzi in Colombia and represents an important epidemiological risk mainly in the central and oriental regions of the country where it occupies sylvatic, peridomestic, and intradomestic ecotopes, and because of this complex distribution, its distribution and abundance could be conditioned by environmental factors. In this work, we explored the relationship between T. dimidiata distribution and environmental factors in the northwest, northeast, and central zones of Colombia and developed predictive models of infestation in the country. The associations between the presence of T. dimidiata and environmental variables were studied using logistic regression models and ecological niche modeling for a sample of villages in Colombia. The analysis was based on the information collected in field about the presence of T. dimidiata and the environmental data for each village extracted from remote sensing images. The presence of Triatoma dimidiata (Latreille, 1811) was found to be significantly associated with the maximum vegetation index, minimum land surface temperature (LST), and the digital elevation for the statistical model. Temperature seasonality, annual precipitation, and vegetation index were the variables that most influenced the ecological niche model of T. dimidiata distribution. The logistic regression model showed a good fit and predicted suitable habitats in the Andean and Caribbean regions, which agrees with the known distribution of the species, but predicted suitable habitats in the Pacific and Orinoco regions proposing new areas of research. Improved models to predict suitable habitats for T. dimidiata hold promise for spatial targeting of integrated vector management. PMID:26856910
Datamining approaches for modeling tumor control probability.
Naqa, Issam El; Deasy, Joseph O; Mu, Yi; Huang, Ellen; Hope, Andrew J; Lindsay, Patricia E; Apte, Aditya; Alaly, James; Bradley, Jeffrey D
2010-11-01
Tumor control probability (TCP) to radiotherapy is determined by complex interactions between tumor biology, tumor microenvironment, radiation dosimetry, and patient-related variables. The complexity of these heterogeneous variable interactions constitutes a challenge for building predictive models for routine clinical practice. We describe a datamining framework that can unravel the higher order relationships among dosimetric dose-volume prognostic variables, interrogate various radiobiological processes, and generalize to previously unseen data when applied prospectively. Several datamining approaches are discussed that include dose-volume metrics, equivalent uniform dose, mechanistic Poisson model, and model building methods using statistical regression and machine learning techniques. Institutional datasets of non-small cell lung cancer (NSCLC) patients are used to demonstrate these methods. The performance of the different methods was evaluated using bivariate Spearman rank correlations (rs). Over-fitting was controlled via resampling methods. Using a dataset of 56 patients with primary NSCLC tumors and 23 candidate variables, we estimated GTV volume and V75 to be the best model parameters for predicting TCP using statistical resampling and a logistic model. Using these variables, the support vector machine (SVM) kernel method provided superior performance for TCP prediction with an rs=0.68 on leave-one-out testing compared to logistic regression (rs=0.4), Poisson-based TCP (rs=0.33), and cell kill equivalent uniform dose model (rs=0.17). The prediction of treatment response can be improved by utilizing datamining approaches, which are able to unravel important non-linear complex interactions among model variables and have the capacity to predict on unseen data for prospective clinical applications.
Suppression of the oculocephalic reflex (doll's eyes phenomenon) in normal full-term babies.
Snir, Moshe; Hasanreisoglu, Murat; Hasanreisoglue, Murat; Goldenberg-Cohen, Nitza; Friling, Ronit; Katz, Kalman; Nachum, Yoav; Benjamini, Yoav; Herscovici, Zvi; Axer-Siegel, Ruth
2010-05-01
To determine the precise age of suppression of the oculocephalic reflex in infants and its relationship to specific clinical characteristics. The oculocephalic reflex was prospectively tested in 325 healthy full-term babies aged 1 to 32 weeks attending an orthopedic outpatient clinic. Two ophthalmologists raised the baby's head 30 degrees above horizontal and rapidly rotated it in the horizontal and vertical planes while watching the conjugate eye movement. Suppression of the reflex, by observer agreement, was analyzed in relation to gestational age, postpartum age, postconceptional age, birth weight, and current weight. The data were fitted to a logistic regression model to determine the probability of suppression of the reflex according to the clinical variables. The oculocephalic reflex was suppressed in 75% of babies by the age of 11.5 weeks and in more than 95% of babies aged 20 weeks. Although postpartum age had a greater influence than gestational age, both were significantly correlated with suppression of the reflex (p = 0.01 and p = 0.04, respectively; two-sided t-test). Postpartum age was the best single variable explaining absence of the reflex. On logistic regression with cross-validation, the model including postpartum age and current weight yielded the best results; both these factors were highly correlated with suppression of the reflex (r = 0.74). The oculocephalic reflex is suppressed in the vast majority of normal infants by age 11.5 weeks. The disappearance of the reflex occurs gradually and longitudinally and is part of the normal maturation of the visual system.
Lundgren, Pia; Athikarisamy, Sam E; Patole, Sanjay; Lam, Geoffrey C; Smith, Lois E; Simmer, Karen
2018-05-01
This study evaluated the correlation between retinopathy of prematurity (ROP), anaemia and blood transfusions in extremely preterm infants. We included 227 infants born below 28 weeks of gestation at King Edward Memorial Hospital, Perth, Australia, from 2014-2016. Birth characteristics and risk factors for ROP were retrieved, and anaemia and severe anaemia were defined as haemoglobin levels of <110 g/L and <80 g/L, respectively. Logistic regression was used for the analysis. Retinopathy of prematurity treatment was needed in 11% of cases and the mean number of blood transfusions (p < 0.01), and mean number of weeks of anaemia (p < 0.001) and of severe anaemia (p < 0.05), had positive associations with ROP cases warranting treatment. In the multivariate logistic regression analysis, the best-fit model of risk factors included anaemic days during first week of life, with an odds ratio (OR) of 1.46 and a 95% confidence interval (CI) of 1.16-1.83 (p < 0.05), sepsis during the first 4 weeks of life (OR 3.14, 95% CI 1.10-9.00, p < 0.05) and days of ventilation (OR 1.03, 95% CI 1.01-1.06, p < 0.05). The duration of anaemia during the first week of life was an independent risk factor for ROP warranting treatment and preventing early anaemia may decrease this risk. ©2017 Foundation Acta Paediatrica. Published by John Wiley & Sons Ltd.
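Odds ratios and confidence intervals such as those reported above are commonly obtained by exponentiating a logistic regression coefficient and its Wald interval. The sketch below assumes that approach; the abstract does not state how its intervals were computed, and the standard error used here is a hypothetical value back-calculated so that the result lands close to the reported OR of 1.46 (95% CI 1.16-1.83).

```python
import math

def odds_ratio_with_ci(beta, standard_error, z=1.96):
    """Return (OR, lower, upper) from a log-odds coefficient and its standard error.

    Assumes a Wald interval on the log-odds scale: exp(beta +/- z * SE).
    """
    odds_ratio = math.exp(beta)
    lower = math.exp(beta - z * standard_error)
    upper = math.exp(beta + z * standard_error)
    return odds_ratio, lower, upper

# beta is the log of the reported OR; the standard error 0.116 is an assumption.
beta = math.log(1.46)
or_value, ci_low, ci_high = odds_ratio_with_ci(beta, 0.116)
print(f"OR = {or_value:.2f}, 95% CI {ci_low:.2f}-{ci_high:.2f}")
```

Because the interval is symmetric on the log-odds scale, it is asymmetric around the odds ratio itself, which is why the reported upper bound lies further from 1.46 than the lower bound does.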
Bianchi, Valentina; Brambilla, Paolo; Garzitto, Marco; Colombo, Paola; Fornasari, Livia; Bellina, Monica; Bonivento, Carolina; Tesei, Alessandra; Piccin, Sara; Conte, Stefania; Perna, Giampaolo; Frigerio, Alessandra; Castiglioni, Isabella; Fabbro, Franco; Molteni, Massimo; Nobile, Maria
2017-05-01
Researchers' interest has recently moved toward the identification of recurrent psychopathological profiles characterized by concurrent elevations on different behavioural and emotional traits. This new strategy has turned out to be useful in terms of diagnosis and outcome prediction. We used a person-centred statistical approach to examine whether different groups could be identified in a referred sample and in a general-population sample of children and adolescents, and we investigated their relation to DSM-IV diagnoses. A latent class analysis (LCA) was performed on the Child Behaviour Checklist (CBCL) syndrome scales of the referred sample (N = 1225), of the general-population sample (N = 3418), and of the total sample. Models estimating 1-class through 5-class solutions were compared and agreement in the classification of subjects was evaluated. Chi square analyses, a logistic regression, and a multinomial logistic regression analysis were used to investigate the relations between classes and diagnoses. In the two samples and in the total sample, the best-fitting models were 4-class solutions. The identified classes were Internalizing Problems (15.68%), Severe Dysregulated (7.82%), Attention/Hyperactivity (10.19%), and Low Problems (66.32%). Subsequent analyses indicated a significant relationship between diagnoses and classes as well as a main association between the severe dysregulated class and comorbidity. Our data suggested the presence of four different psychopathological profiles related to different outcomes in terms of psychopathological diagnoses. In particular, our results underline the presence of a profile characterized by severe emotional and behavioural dysregulation that is mostly associated with the presence of multiple diagnoses.
Mood state sub-types in adults who stutter: A prospective study.
Tran, Yvonne; Blumgart, Elaine; Craig, Ashley
2018-06-01
Many adults who stutter have elevated negative mood states like anxiety and depressive mood. Little is known about how mood states change over time. The purpose of this study was to determine the trajectories or sub-types of mood states in adults who stutter over a 6 month period, and establish factors that contribute to these sub-types. Participants included 129 adults who stutter who completed an assessment regimen at baseline, including a measure of mood states (Symptom Checklist-90-Revised). Three mood states were assessed (interpersonal sensitivity or IS, depressive mood and anxiety) once a month over 6 months. Latent class growth mixture modeling was used to establish trajectories of change in these mood states over time. Logistic regression was then used to determine factors assessed at baseline that contribute to the IS trajectories. Three-class trajectory models were accepted as the best fit for IS, depressive mood and anxiety mood sub-types. Stable and normal mood state sub-types were found, incorporating around 60% of participants. Up to 40% belonged to sub-types comprising elevated levels of negative mood states. The logistic regression was conducted only with the IS domain, and revealed four factors that significantly contributed to IS mood sub-types. Those with low perceived control, low vitality, elevated social fears and being female were more likely to belong to elevated IS classes. This research revealed mood sub-types in adults who stutter, providing direction for the treatment of stuttering. Clarification of how much stuttering influences mood sub-types versus pre-existing mood is required. Copyright © 2017 Elsevier Inc. All rights reserved.
[Developing a predictive model for the caregiver strain index].
Álvarez-Tello, Margarita; Casado-Mejía, Rosa; Praena-Fernández, Juan Manuel; Ortega-Calvo, Manuel
Home care of patients with multiple morbidities is increasingly common. The caregiver strain index is a tool in the form of a questionnaire designed to measure the perceived burden of those who care for their family members. The aim of this study is to construct a diagnostic nomogram of informal caregiver burden using data from a predictive model. The model was drawn up using binary logistic regression and the questionnaire items as dichotomous factors. The dependent variable was the final score obtained with the questionnaire, categorised in accordance with the literature. Scores between 0 and 6 were labelled as "no" (no caregiver stress) and scores of 7 or greater as "yes". R statistical software, version 3.1.1, was used. To construct confidence intervals for the ROC curve, 2000 bootstrap replicates were used. A sample of 67 caregivers was obtained. A diagnostic nomogram was constructed together with its calibration graph (scaled Brier score = 0.686, Nagelkerke R² = 0.791) and the corresponding ROC curve (area under the curve = 0.962). The predictive model generated using binary logistic regression and the nomogram contain four items (1, 4, 5 and 9) of the questionnaire. R plotting functions provide a very good solution for validating a model like this. The area under the ROC curve (0.96; 95% CI: 0.941-0.994) achieves a high discriminative value. Calibration also shows high goodness of fit values, suggesting that it may be clinically useful in community nursing and geriatric establishments. Copyright © 2015 SEGG. Publicado por Elsevier España, S.L.U. All rights reserved.
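Two steps from the abstract above lend themselves to a short sketch: dichotomising the questionnaire score at 7, and building a bootstrap confidence interval for the area under the ROC curve from 2000 replicates. The authors worked in R; the Python below is not their code. It assumes a simple percentile bootstrap and a rank-based AUC, and all function names and data are hypothetical.

```python
import random

def dichotomize(score, cutoff=7):
    """Label scores of 7 or more as 1 ("yes"), scores from 0 to 6 as 0 ("no")."""
    return 1 if score >= cutoff else 0

def auc(labels, scores):
    """AUC as the share of (positive, negative) pairs ranked correctly; ties count half."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("both classes must be present")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

def bootstrap_auc_ci(labels, scores, replicates=2000, alpha=0.05, seed=0):
    """Percentile bootstrap interval for the AUC (an assumed method, see lead-in)."""
    rng = random.Random(seed)
    n = len(labels)
    stats = []
    while len(stats) < replicates:
        idx = [rng.randrange(n) for _ in range(n)]   # resample cases with replacement
        ys = [labels[i] for i in idx]
        if 0 < sum(ys) < n:                          # skip resamples with one class only
            stats.append(auc(ys, [scores[i] for i in idx]))
    stats.sort()
    lower = stats[int((alpha / 2) * replicates)]
    upper = stats[int((1 - alpha / 2) * replicates) - 1]
    return lower, upper

# Hypothetical questionnaire totals and model-predicted probabilities.
totals = [2, 9, 5, 11, 7, 3, 8, 6]
predicted = [0.10, 0.85, 0.40, 0.95, 0.55, 0.20, 0.35, 0.45]
labels = [dichotomize(t) for t in totals]
print(auc(labels, predicted), bootstrap_auc_ci(labels, predicted))
```

With a sample as small as this toy one, the bootstrap interval is wide and often touches 1.0; the abstract's sample of 67 caregivers would give a more informative interval.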
Potential serum biomarkers from a metabolomics study of autism
Wang, Han; Liang, Shuang; Wang, Maoqing; Gao, Jingquan; Sun, Caihong; Wang, Jia; Xia, Wei; Wu, Shiying; Sumner, Susan J.; Zhang, Fengyu; Sun, Changhao; Wu, Lijie
2016-01-01
Background Early detection and diagnosis are very important for autism. Current diagnosis of autism relies mainly on some observational questionnaires and interview tools that may involve a great variability. We performed a metabolomics analysis of serum to identify potential biomarkers for the early diagnosis and clinical evaluation of autism. Methods We analyzed a discovery cohort of patients with autism and participants without autism in the Chinese Han population using ultra-performance liquid chromatography quadrupole time-of-flight tandem mass spectrometry (UPLC/Q-TOF MS/MS) to detect metabolic changes in serum associated with autism. The potential metabolite candidates for biomarkers were individually validated in an additional independent cohort of cases and controls. We built a multiple logistic regression model to evaluate the validated biomarkers. Results We included 73 patients and 63 controls in the discovery cohort and 100 cases and 100 controls in the validation cohort. Metabolomic analysis of serum in the discovery stage identified 17 metabolites, 11 of which were validated in an independent cohort. A multiple logistic regression model built on the 11 validated metabolites fit well in both cohorts. The model consistently showed that autism was associated with 2 particular metabolites: sphingosine 1-phosphate and docosahexaenoic acid. Limitations While autism is diagnosed predominantly in boys, we were unable to perform the analysis by sex owing to difficulty recruiting enough female patients. Other limitations include the need to perform test–retest assessment within the same individual and the relatively small sample size. Conclusion Two metabolites have potential as biomarkers for the clinical diagnosis and evaluation of autism. PMID:26395811
Montero, Javier; Albaladejo, Alberto; Zalba, José-Ignacio
2014-05-01
To evaluate the influence of dental visiting patterns on the dental status and Oral Health-related Quality of Life (OHQoL) of patients visiting the University Clinic of Salamanca (Spain). This cross-sectional study consisted of a clinical oral examination and a questionnaire-based interview in a consecutive sample of patients seeking a dental examination. Patients were classified as problem-based dental attendees (PB) and regular dental attendees (RB). Clinical and OHQoL (OHIP-14 & OIDP) data were compared between groups. Pair-wise comparisons were performed and a Logistic Regression Model was fitted for predicting the Odds Ratio (OR) of being a PB patient. The sample was composed of 255 patients aged 18 to 87 years (mean age: 63.1 ± 12.7; women: 51.8%). The PB patients had a poorer dental status (i.e. caries, periodontal and prosthetic needs), brushed their teeth less, and were significantly more impaired in their OHQoL according to both instruments. The logistic regression coefficients demonstrated that on average the OR of being a PB patient was high in this dental patient sample, but this OR increased significantly if the patient was a male (OR = 1.1-5.0) or referred pain-related impacts according to the OHIP and, additionally, the OR decreased significantly as a function of the number of healthy fillings and the number of sextants coded as CPI=0. Regular dental check-ups are associated with better dental status and a better OHQoL after controlling for potentially related confounding factors.
A Survey of Musculoskeletal Injuries Associated with Zumba
Nichols, Andrew; Maskarinec, Gregory; Tseng, Chien-Wen
2013-01-01
Zumba is a highly popular Latin-inspired dance fitness program with ∼14 million participants in 150 countries. However, there is little published data on the rates or types of injuries among participants. We surveyed a convenience sample of 49 adults (100% participation) in 5 Zumba classes in Hawai‘i. Participants described any prior Zumba-related injuries. We used t-tests and logistic regression to determine if participant demographics or intensity of Zumba classes were associated with injuries. Participants were mostly female (82%), averaged 43.9 years of age (range 19 to 69 years), and took an average of 3 classes/week (1–2 hours/class) for an average of 11 months. Fourteen participants (29%) reported 21 prior Zumba-related injuries. Half of the 14 injured sought care from medical providers for their injuries. Of the 21 injuries, the most frequently injured sites were knees (42%), ankles (14%), and shoulders (14%). Participants with Zumba-related injuries did not differ significantly in age, months of Zumba, or hours/class compared to those who did not experience injuries. However, participants who reported injuries took significantly more classes/week (3.8 versus 2.7 classes, P = .006) than non-injured participants. In logistic regression, taking more classes/week remained significantly associated with injuries (odds ratio 3.6 [95% confidence interval 1.5 – 8.9, P = .006]) after controlling for age, gender, months of Zumba, and hours/class. Given Zumba's health benefits, our finding that 1 in 4 Zumba participants have experienced injuries indicates the need to improve Zumba routines, instructor training, and health provider counseling to reduce injury risk. PMID:24377078
Millard, Steven P.; Shofer, Jane; Braff, David; Calkins, Monica; Cadenhead, Kristin; Freedman, Robert; Green, Michael F.; Greenwood, Tiffany A.; Gur, Raquel; Gur, Ruben; Lazzeroni, Laura C.; Light, Gregory A.; Olincy, Ann; Nuechterlein, Keith; Seidman, Larry; Siever, Larry; Silverman, Jeremy; Stone, William; Sprock, Joyce; Sugar, Catherine A.; Swerdlow, Neal R.; Tsuang, Ming; Turetsky, Bruce; Radant, Allen; Tsuang, Debby W.
2016-01-01
Past studies describe numerous endophenotypes associated with schizophrenia (SZ), but many endophenotypes may overlap in information they provide, and few studies have investigated the utility of a multivariate index to improve discrimination between SZ and healthy community comparison subjects (CCS). We investigated 16 endophenotypes from the first phase of the Consortium on the Genetics of Schizophrenia, a large, multi-site family study, to determine whether a subset could distinguish SZ probands and CCS just as well as using all 16. Participants included 345 SZ probands and 517 CCS with a valid measure for at least one endophenotype. We used both logistic regression and random forest models to choose a subset of endophenotypes, adjusting for age, gender, smoking status, site, parent education, and the reading subtest of the Wide Range Achievement Test. As a sensitivity analysis, we re-fit models using multiple imputations to determine the effect of missing values. We identified four important endophenotypes: antisaccade, Continuous Performance Test-Identical Pairs 3-digit version, California Verbal Learning Test, and emotion identification. The logistic regression model that used just these four endophenotypes produced essentially the same results as the model that used all 16 (84% vs. 85% accuracy). While a subset of endophenotypes cannot replace clinical diagnosis nor encompass the complexity of the disease, it can aid in the design of future endophenotypic and genetic studies by reducing study cost and subject burden, simplifying sample enrichment, and improving statistical power of locating genetic regions associated with schizophrenia that may be the easiest to identify initially. PMID:27132484
Deriving the Regression Equation without Using Calculus
ERIC Educational Resources Information Center
Gordon, Sheldon P.; Gordon, Florence S.
2004-01-01
Probably the one "new" mathematical topic that is most responsible for modernizing courses in college algebra and precalculus over the last few years is the idea of fitting a function to a set of data in the sense of a least squares fit. Whether it be simple linear regression or nonlinear regression, this topic opens the door to applying the…
Dietary consumption patterns and laryngeal cancer risk.
Vlastarakos, Petros V; Vassileiou, Andrianna; Delicha, Evie; Kikidis, Dimitrios; Protopapas, Dimosthenis; Nikolopoulos, Thomas P
2016-06-01
We conducted a case-control study to investigate the effect of diet on laryngeal carcinogenesis. Our study population was made up of 140 participants-70 patients with laryngeal cancer (LC) and 70 controls with a non-neoplastic condition that was unrelated to diet, smoking, or alcohol. A food-frequency questionnaire determined the mean consumption of 113 different items during the 3 years prior to symptom onset. Total energy intake and cooking mode were also noted. The relative risk, odds ratio (OR), and 95% confidence interval (CI) were estimated by multiple logistic regression analysis. We found that the total energy intake was significantly higher in the LC group (p < 0.001), and that the difference remained statistically significant after logistic regression analysis (p < 0.001; OR: 118.70). Notably, meat consumption was higher in the LC group (p < 0.001), and the difference remained significant after logistic regression analysis (p = 0.029; OR: 1.16). LC patients also consumed significantly more fried food (p = 0.036); this difference also remained significant in the logistic regression model (p = 0.026; OR: 5.45). The LC group also consumed significantly more seafood (p = 0.012); the difference persisted after logistic regression analysis (p = 0.009; OR: 2.48), with the consumption of shrimp proving detrimental (p = 0.049; OR: 2.18). Finally, the intake of zinc was significantly higher in the LC group before and after logistic regression analysis (p = 0.034 and p = 0.011; OR: 30.15, respectively). Cereal consumption (including pastas) was also higher among the LC patients (p = 0.043), with logistic regression analysis showing that their negative effect was possibly associated with the sauces and dressings that traditionally accompany pasta dishes (p = 0.006; OR: 4.78). 
Conversely, a higher consumption of dairy products was found in controls (p < 0.05); logistic regression analysis showed that calcium appeared to be protective at the micronutrient level (p < 0.001; OR: 0.27). We found no difference in the overall consumption of fruits and vegetables between the LC patients and controls; however, the LC patients did have a greater consumption of cooked tomatoes and cooked root vegetables (p = 0.039 for both), and the controls had more consumption of leeks (p = 0.042) and, among controls younger than 65 years, cooked beans (p = 0.037). Lemon (p = 0.037), squeezed fruit juice (p = 0.032), and watermelon (p = 0.018) were also more frequently consumed by the controls. Other differences at the micronutrient level included greater consumption by the LC patients of retinol (p = 0.044), polyunsaturated fats (p = 0.041), and linoleic acid (p = 0.008); LC patients younger than 65 years also had greater intake of riboflavin (p = 0.045). We conclude that the differences in dietary consumption patterns between LC patients and controls indicate a possible role for lifestyle modifications involving nutritional factors as a means of decreasing the risk of laryngeal cancer.
Frison, Eline; Vandenbosch, Laura; Eggermont, Steven
2013-10-01
This study examined whether different types of media affect the use of dietary proteins and amino acid supplements, and intent to use anabolic-androgenic steroids. A random sample of 618 boys aged 11-18 years from eight schools in the Flemish part of Belgium completed standardized questionnaires as part of the Media and Adolescent Health Study. The survey measured exposure to sports media, appearance-focused media, fitness media, use of dietary supplements, and intent to use anabolic-androgenic steroids. Data were analyzed using logistic regressions and are presented as adjusted odds ratios (OR) and 95 % confidence intervals (CI); 8.6 % indicated to have used dietary proteins, 3.9 % indicated to have used amino acid supplements, and 11.8 % would consider using anabolic-androgenic steroids. After adjusting for fitness activity, exposure to fitness media was associated with the use of dietary proteins (OR = 7.24, CI = 2.25-23.28) and amino acid supplements (5.16, 1.21-21.92; 44.30, 8.25-238). Intent to use anabolic-androgenic steroids was associated with exposure to fitness media (2.38, 1.08-5.26; 8.07, 2.55-25.53) and appearance-focused media (6.02, 1.40-25.82; 8.94, 1.78-44.98). Sports media did not correlate with the use of dietary supplements and intent to use anabolic-androgenic steroids. Specific types of media are strong predictors of the use of supplements in adolescent boys. This provides an opportunity for intervention and prevention through the selection of fitness media as a communication channel. Health practitioners should also be aware that the contemporary body culture exerts pressure not only on girls but also on boys.
Gray, DeLeon L
2017-04-01
Education researchers have consistently linked students' perceptions of "fitting in" at school with patterns of motivation and positive emotions. This study proposes that "standing out" is also helpful for producing these outcomes, and that standing out works in concert with perceptions of fitting in. In a sample of 702 high school students nested within 33 classrooms, principal components analysis and confirmatory factor analysis were each conducted on half of the sample. Results support the proposed structure of measures of standing out and fitting in. Multilevel latent profile analysis was then used to classify students into four profiles of standing out while fitting in (SOFI): Unfulfilled, Somewhat Fulfilled, Nearly Fulfilled, and Fulfilled. A multinomial logistic regression revealed that students of color and those who received free or reduced-price lunch were overrepresented in the Unfulfilled and Somewhat Fulfilled profiles. A multilevel path analysis was then performed to assess the direct and indirect associations of profile membership with measures of task value and achievement emotions. Relative to the other profiles, students in the Fulfilled SOFI Profile express greater psychological membership in their classrooms and, in turn, express higher valuing of academic material (i.e., intrinsic value, utility value, and attainment value) and more positive achievement emotions (i.e., more enjoyment and pride; less boredom, hopelessness, and shame). This investigation provides critical insights on the potential benefits of structuring academic learning environments to foster feelings of distinctiveness among adolescents, and has implications for cultivating identities and achievement motivation in academic settings. Copyright © 2017 Society for the Study of School Psychology. Published by Elsevier Ltd. All rights reserved.
Rinsky-Halivni, Lilah; Klebanov, Miriam; Lerman, Yehuda; Paltiel, Ora
2017-05-01
Referral to voice therapy and recommendations for voice rest and microphone use are common interventions in occupational medicine aimed at preserving the working capability of teachers with occupation-related voice problems. Research on the impact of such interventions in terms of employment is lacking. This study examined changes in fitness (ie, ability) to work of dysphonic teachers referred to an occupational clinic and evaluated employment outcomes following voice therapy, voice rest, and microphone use. A historical prospective study was carried out. Of 365 classroom teachers who were first referred to a regional occupational medicine clinic due to dysphonia between January 2007 and December 2012, 156 were sampled and 153 were followed-up for an average of 5 years (range 2-8). Data were collected from medical records and from interviews conducted in 2014 aimed at assessing employment status. Logistic regression models were used to assess associations between interventions and employment outcomes. Survival analyses were performed to evaluate the association between participating in voice therapy and length of retained employment fitness. Thirty-four (22.2%) teachers suffered declines in working capabilities due to dysphonia. Voice therapy was demonstrated as being a protective factor against such declines (odds ratio = 0.05 [0.01-0.27]). Adherence to recommendation of voice therapy was <50%. Most of the decline in working fitness among nonadherent teachers occurred within 20 months after referral. Unlike voice therapy, voice rest and microphone use were not associated with retention of working capabilities. Voice therapy, especially when instituted early, is a strong predictor for retaining fitness for employment among dysphonic teachers. Copyright © 2017 The Voice Foundation. Published by Elsevier Inc. All rights reserved.
Renal function of cancer patients "fit" for Cisplatin chemotherapy: physician perspective.
Montoya, J; Luna, H G; Amparo, J R; Casasola, C; Cristal-Luna, G
2014-07-01
Renal insufficiency is prevalent among cancer patients and hinders the use of cisplatin. We sought to describe the baseline renal function of our patients who were considered "fit" for cisplatin, along with saline hydration and mannitol diuresis, and to determine the occurrence of nephrotoxicity during chemotherapy. A retrospective study of 100 patients who were given cisplatin from 2008 to 2012 was done. Demographic and clinical variables were recorded. Creatinine clearance (CrCl) was calculated using the Cockcroft-Gault formula. Nephrotoxicity was defined as an increase in creatinine of 0.5 mg/dL or more after cisplatin infusion. Descriptive statistics, ANOVA, and logistic regression analysis were performed. A total of 100 patients were "fit" for cisplatin, with a mean age of 52 years, mean creatinine of 0.83 mg/dL, CrCl of 94.14 mL/min, and ECOG performance status of 0-2. Twelve patients had chronic kidney disease (CKD) stage 3, 42 had stage 2, and 46 had stage 1. After cisplatin treatment, mean creatinine increased to 0.95 mg/dL, and mean CrCl decreased to 83.7 mL/min. Nine patients developed nephrotoxicity; all resolved with hydration. Patients with nephrotoxicity differed significantly from those without in terms of weight (p = 0.012). None of the variables were predictors of nephrotoxicity. With hydration and mannitol diuresis, patients with ECOG 2, normal creatinine, CKD stage 3 or better, and CrCl of 50 mL/min and above are "fit" for cisplatin. During the study period, 9% of the patients "fit" for cisplatin developed nephrotoxicity, all of which resolved with conservative management. There was an increase in mean creatinine and a decrease in mean CrCl after cisplatin.
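The two computations named in the abstract above can be sketched briefly. This is an illustrative sketch, not the authors' code: it uses the standard Cockcroft-Gault equation (with the usual 0.85 factor for women) and the abstract's nephrotoxicity definition of a creatinine rise of 0.5 mg/dL or more. The function names and the 60 kg body weight are assumptions made for the example; the abstract does not report mean weight.

```python
def cockcroft_gault(age_years, weight_kg, serum_creatinine_mg_dl, female=False):
    """Estimated creatinine clearance in mL/min (Cockcroft-Gault equation)."""
    crcl = (140 - age_years) * weight_kg / (72.0 * serum_creatinine_mg_dl)
    return crcl * 0.85 if female else crcl


def is_nephrotoxic(baseline_mg_dl, post_mg_dl, threshold_mg_dl=0.5):
    """Apply the abstract's definition: creatinine rise of 0.5 mg/dL or more."""
    return (post_mg_dl - baseline_mg_dl) >= threshold_mg_dl


# Age and creatinine are the cohort means from the abstract;
# the 60 kg weight is a hypothetical value chosen only for illustration.
example_crcl = cockcroft_gault(age_years=52, weight_kg=60, serum_creatinine_mg_dl=0.83)
print(f"Estimated CrCl: {example_crcl:.1f} mL/min")

# The mean rise from 0.83 to 0.95 mg/dL is below the nephrotoxicity threshold.
print(is_nephrotoxic(0.83, 0.95))
```

Plugging cohort means into the formula does not reproduce the reported mean CrCl, because the mean of a ratio is not the ratio of the means; the example only shows the mechanics.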
Association between Physical Fitness and Successful Aging in Taiwanese Older Adults.
Lin, Pay-Shin; Hsieh, Chih-Chin; Cheng, Huey-Shinn; Tseng, Tsai-Jou; Su, Shin-Chang
2016-01-01
Population aging is escalating in numerous countries worldwide; among them is Taiwan, which will soon become an aged society. Thus, aging successfully is an increasing concern. One of the factors for achieving successful aging (SA) is maintaining high physical function. The purpose of this study was to determine the physical fitness factors associated with SA in Taiwanese older adults (OAs), because these factors are intervenable. Community-dwelling OAs aged more than 65 years and residing in Northern Taiwan were recruited in this study. They received a comprehensive geriatric assessment, which includes sociodemographic data, health conditions and behaviors, activities of daily living (ADL) and instrumental ADL (IADL) function, cognitive and depressive status, and quality of life. Physical fitness tests included the grip strength (GS), 30-second sit-to-stand (30s STS), timed up-and-go (TUG), functional reach (FR), one-leg standing, chair sit-and-reach, and reaction time (drop ruler) tests as well as the 6-minute walk test (6MWT). SA status was defined as follows: complete independence in performing ADL and IADL, satisfactory cognitive status (Mini-Mental State Examination ≥ 24), no depression (Geriatric Depression Scale < 5), and favorable social function (SF subscale ≥ 80 in SF-36). Adjusted multiple logistic regression analyses were performed. Among the total recruited OAs (n = 378), 100 (26.5%) met the aforementioned SA criteria. After adjustment for sociodemographic characteristics and health condition and behaviors, some physical fitness tests, namely GS, 30s STS, 6MWT, TUG, and FR tests, were significantly associated with SA individually, but not in the multivariate model. Among the physical fitness variables tested, cardiopulmonary endurance, mobility, muscle strength, and balance were significantly associated with SA in Taiwanese OAs. Early detection of deterioration in the identified functions and corresponding intervention is essential to ensuring SA.
PMID:26963614
Readiness for health behavior changes among low fitness men in a Finnish health promotion campaign.
Kaasalainen, Karoliina S; Kasila, Kirsti; Komulainen, Jyrki; Malvela, Miia; Poskiparta, Marita
2016-12-01
Men have been a hard-to-reach population in health behavior programs and it has been claimed that they are less interested in health issues than women. However, less is known about how ready men are to adopt new health behaviors. This study examined readiness for change in physical activity (PA) and eating behavior (EB) among low fitness and overweight working-aged Finnish men who participated in a PA campaign. Associations among perceived health knowledge, health behaviors, psychosocial factors and readiness for change were studied. Data comprised 362 men aged 18-64. Physical fitness was assessed with a body fitness index constructed on the basis of the Polar OwnIndex Test, a hand grip test and an Inbody 720 body composition analysis. Health behavior information was gathered by questionnaire. Descriptive and comparative analyses were conducted by χ² test and Kruskal-Wallis and Mann-Whitney U tests. Associations between health knowledge and health behaviors were explored with logistic regression analyses. Readiness to increase PA and change EB was positively related to higher scores in psychosocial factors, PA and healthy eating habits. Self-rated knowledge on health issues was not related to PA or readiness to change health behaviors; however, it was positively associated with healthy eating and greater perceived promoters of PA. Participants' self-rated knowledge reflected not only an interest in health but also the differences in age and education. Health programs are needed that target both PA and healthy eating in low-fit men at different ages and motivational stages. © The Author 2015. Published by Oxford University Press. All rights reserved. For Permissions, please email: journals.permissions@oup.com.
Advancing School and Community Engagement Now for Disease Prevention (ASCEND).
Treu, Judith A; Doughty, Kimberly; Reynolds, Jesse S; Njike, Valentine Y; Katz, David L
2017-03-01
To compare two intensity levels (standard vs. enhanced) of a nutrition and physical activity intervention vs. a control (usual programs) on nutrition knowledge, body mass index, fitness, academic performance, behavior, and medication use among elementary school students. Quasi-experimental with three arms. Elementary schools, students' homes, and a supermarket. A total of 1487 third-grade students. The standard intervention (SI) provided daily physical activity in classrooms and a program on making healthful foods, using food labels. The enhanced intervention (EI) provided these plus additional components for students and their families. Body mass index (zBMI), food label literacy, physical fitness, academic performance, behavior, and medication use for asthma or attention-deficit hyperactivity disorder (ADHD). Multivariable generalized linear model and logistic regression to assess change in outcome measures. Both the SI and EI groups gained less weight than the control (p < .001), but zBMI did not differ between groups (p = 1.00). There were no apparent effects on physical fitness or academic performance. Both intervention groups improved significantly but similarly in food label literacy (p = .36). Asthma medication use was reduced significantly in the SI group, and nonsignificantly (p = .10) in the EI group. Use of ADHD medication remained unchanged (p = .34). The standard intervention may improve food label literacy and reduce asthma medication use in elementary school children, but an enhanced version provides no further benefit.
Healthcare access and mammography screening in Michigan: a multilevel cross-sectional study
2012-01-01
Background Breast cancer screening rates have increased over time in the United States. However actual screening rates appear to be lower among black women compared with white women. Purpose To assess determinants of breast cancer screening among women in Michigan USA, focusing on individual and neighborhood socio-economic status and healthcare access. Methods Data from 1163 women ages 50-74 years who participated in the 2008 Michigan Special Cancer Behavioral Risk Factor Survey were analyzed. County-level SES and healthcare access were obtained from the Area Resource File. Multilevel logistic regression models were fit using SAS Proc Glimmix to account for clustering of individual observations by county. Separate models were fit for each of the two outcomes of interest; mammography screening and clinical breast examination. For each outcome, two sequential models were fit; a model including individual level covariates and a model including county level covariates. Results After adjusting for misclassification bias, overall cancer screening rates were lower than reported by survey respondents; black women had lower mammography screening rates but higher clinical breast examination rates than white women. However, after adjusting for other individual level variables, race was not a significant predictor of screening. Having health insurance or a usual healthcare provider were the most important predictors of cancer screening. Discussion Access to healthcare is important to ensuring appropriate cancer screening among women in Michigan. PMID:22436125
Factors associated with low levels of lumbar strength in adolescents in Southern Brazil
Silva, Diego Augusto Santos; Gonçalves, Eliane Cristina de Andrade; Grigollo, Leoberto Ricardo; Petroski, Edio Luiz
2014-01-01
OBJECTIVE: To determine the prevalence and factors associated with low levels of lumbar strength in adolescents. METHOD: This was a cross-sectional study involving 601 adolescents, aged 14 to 17 years, enrolled in public schools in the western region of Santa Catarina State - Southern Brazil. Lumbar strength was analyzed by the lumbar extension test developed by the Canadian Society of Exercise Physiology, which proposes different cutoffs for boys and girls. Independent variables were sex, age, socioeconomic status, dietary habits, alcohol consumption, physical activity, and aerobic fitness. For data analysis, univariate and multivariate logistic regression were used, with significance level of 5%. RESULTS: The prevalence of low levels of lumbar strength was 27.3%. The population subgroups most likely to present low levels of lumbar strength were females (OR: 1.54, 95% CI: 1.06 to 2.23), adolescents with low levels of aerobic fitness (OR: 2.10, 95% CI: 1.41 to 3.11) and the overweight (OR: 2.28, 95% CI: 1.35 to 3.81). CONCLUSION: Almost one-third of the studied students have low levels of lumbar strength. Interventions in the school population should be taken with special attention to female adolescents, those with low levels of aerobic fitness, and those with overweight, as these population subgroups were most likely to demonstrate low levels of lumbar strength. PMID:25511000
Predictors of Membership in Alcoholics Anonymous in a Sample of Successfully Remitted Alcoholics
Krentzman, Amy R.; Robinson, Elizabeth A. R.; Perron, Brian E.; Cranford, James A.
2012-01-01
This study identifies factors associated with Alcoholics Anonymous (AA) membership in a sample of 81 persons who have achieved at least one year of total abstinence from drugs and alcohol. Forty-four were AA members, 37 were not. Logistic regression was used to test the cross-sectional associations of baseline demographic, substance-related, spiritual and religious, and personality variables with AA membership. Significant variables from the bivariate analyses were included in a multivariate model controlling for previous AA involvement. Having more positive views of God and more negative consequences of drinking were significantly associated with AA membership. This information can be used by clinicians to identify clients for whom AA might be a good fit, and can help others overcome obstacles to AA or explore alternative forms of abstinence support. PMID:21615004
ERIC Educational Resources Information Center
Guler, Nese; Penfield, Randall D.
2009-01-01
In this study, we investigate the logistic regression (LR), Mantel-Haenszel (MH), and Breslow-Day (BD) procedures for the simultaneous detection of both uniform and nonuniform differential item functioning (DIF). A simulation study was used to assess and compare the Type I error rate and power of a combined decision rule (CDR), which assesses DIF…
ERIC Educational Resources Information Center
Le, Huy; Marcus, Justin
2012-01-01
This study used Monte Carlo simulation to examine the properties of the overall odds ratio (OOR), which was recently introduced as an index for overall effect size in multiple logistic regression. It was found that the OOR was relatively independent of study base rate and performed better than most commonly used R-square analogs in indexing model…
Predicting Student Success on the Texas Chemistry STAAR Test: A Logistic Regression Analysis
ERIC Educational Resources Information Center
Johnson, William L.; Johnson, Annabel M.; Johnson, Jared
2012-01-01
Background: The context is the new Texas STAAR end-of-course testing program. Purpose: The authors developed a logistic regression model to predict who would pass-or-fail the new Texas chemistry STAAR end-of-course exam. Setting: Robert E. Lee High School (5A) with an enrollment of 2700 students, Tyler, Texas. Date of the study was the 2011-2012…
Susan L. King
2003-01-01
The performance of two classifiers, logistic regression and neural networks, is compared for modeling noncatastrophic individual tree mortality for 21 species of trees in West Virginia. The output of the classifier is usually a continuous number between 0 and 1. A threshold is selected between 0 and 1 and all of the trees below the threshold are classified as...
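The thresholding step described above can be sketched in a few lines. This is a minimal illustration under stated assumptions: the abstract is truncated before it says which class is assigned below the threshold, so the labels 0 and 1 here are placeholders, and the probabilities are made-up values rather than model output from the study.

```python
def classify(probabilities, threshold=0.5):
    """Turn continuous classifier outputs in [0, 1] into binary labels.

    Outputs below the threshold get the placeholder label 0,
    outputs at or above it get the placeholder label 1.
    """
    return [1 if p >= threshold else 0 for p in probabilities]


# Hypothetical classifier outputs for five trees.
outputs = [0.05, 0.30, 0.50, 0.72, 0.95]

# Moving the threshold changes how many cases land in each class.
for cutoff in (0.3, 0.5, 0.7):
    labels = classify(outputs, threshold=cutoff)
    print(cutoff, labels, sum(labels))
```

Because the choice of threshold trades one kind of misclassification for the other, comparisons between classifiers are usually made across a range of thresholds rather than at a single cutoff.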
Logistic regression trees for initial selection of interesting loci in case-control studies
Nickolov, Radoslav Z; Milanov, Valentin B
2007-01-01
Modern genetic epidemiology faces the challenge of dealing with hundreds of thousands of genetic markers. The selection of a small initial subset of interesting markers for further investigation can greatly facilitate genetic studies. In this contribution we suggest the use of a logistic regression tree algorithm known as logistic tree with unbiased selection. Using the simulated data provided for Genetic Analysis Workshop 15, we show how this algorithm, with incorporation of multifactor dimensionality reduction method, can reduce an initial large pool of markers to a small set that includes the interesting markers with high probability. PMID:18466557
Rupert, Michael G.; Cannon, Susan H.; Gartner, Joseph E.; Michael, John A.; Helsel, Dennis R.
2008-01-01
Logistic regression was used to develop statistical models that can be used to predict the probability of debris flows in areas recently burned by wildfires by using data from 14 wildfires that burned in southern California during 2003-2006. Twenty-eight independent variables describing the basin morphology, burn severity, rainfall, and soil properties of 306 drainage basins located within those burned areas were evaluated. The models were developed as follows: (1) Basins that did and did not produce debris flows soon after the 2003 to 2006 fires were delineated from data in the National Elevation Dataset using a geographic information system; (2) Data describing the basin morphology, burn severity, rainfall, and soil properties were compiled for each basin. These data were then input to a statistics software package for analysis using logistic regression; and (3) Relations between the occurrence or absence of debris flows and the basin morphology, burn severity, rainfall, and soil properties were evaluated, and five multivariate logistic regression models were constructed. All possible combinations of independent variables were evaluated to determine which combinations produced the most effective models, and the multivariate models that best predicted the occurrence of debris flows were identified. Percentage of high burn severity and 3-hour peak rainfall intensity were significant variables in all models. Soil organic matter content and soil clay content were significant variables in all models except Model 5. Soil slope was a significant variable in all models except Model 4. The most suitable model can be selected from these five models on the basis of the availability of independent variables in the particular area of interest and field checking of probability maps. 
The multivariate logistic regression models can be entered into a geographic information system, and maps showing the probability of debris flows can be constructed in recently burned areas of southern California. This study demonstrates that logistic regression is a valuable tool for developing models that predict the probability of debris flows occurring in recently burned landscapes.
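A fitted logistic regression model of the kind described above turns basin measurements into a probability through the logistic function. The sketch below shows that mechanic only. The intercept, coefficient values, and variable names are hypothetical placeholders, not the coefficients of any of the five published models; the two predictors are named after the variables the abstract reports as significant in all models.

```python
import math


def debris_flow_probability(intercept, coefficients, predictors):
    """Evaluate P = 1 / (1 + exp(-z)), where z = intercept + sum(b_i * x_i)."""
    z = intercept + sum(coefficients[name] * predictors[name] for name in coefficients)
    return 1.0 / (1.0 + math.exp(-z))


# Hypothetical coefficients, for illustration only.
example_coefficients = {
    "high_burn_severity_pct": 0.03,   # percentage of basin burned at high severity
    "peak_rainfall_3h_mm_per_h": 0.2, # 3-hour peak rainfall intensity
}
example_intercept = -4.0

# Hypothetical basin measurements.
basin = {"high_burn_severity_pct": 40.0, "peak_rainfall_3h_mm_per_h": 12.0}

probability = debris_flow_probability(example_intercept, example_coefficients, basin)
print(f"Estimated probability of a debris flow: {probability:.2f}")
```

Applying the same function to every basin in a study area yields the values that a geographic information system would render as a probability map.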
Hein, R; Abbas, S; Seibold, P; Salazar, R; Flesch-Janys, D; Chang-Claude, J
2012-01-01
Menopausal hormone therapy (MHT) is associated with an increased breast cancer risk in postmenopausal women, with combined estrogen-progestagen therapy posing a greater risk than estrogen monotherapy. However, few studies focused on potential effect modification of MHT-associated breast cancer risk by genetic polymorphisms in the progesterone metabolism. We assessed effect modification of MHT use by five coding single nucleotide polymorphisms (SNPs) in the progesterone metabolizing enzymes AKR1C3 (rs7741), AKR1C4 (rs3829125, rs17134592), and SRD5A1 (rs248793, rs3736316) using a two-center population-based case-control study from Germany with 2,502 postmenopausal breast cancer patients and 4,833 matched controls. An empirical-Bayes procedure that tests for interaction using a weighted combination of the prospective and the retrospective case-control estimators as well as standard prospective logistic regression were applied to assess multiplicative statistical interaction between polymorphisms and duration of MHT use with regard to breast cancer risk assuming a log-additive mode of inheritance. No genetic marginal effects were observed. Breast cancer risk associated with duration of combined therapy was significantly modified by SRD5A1_rs3736316, showing a reduced risk elevation in carriers of the minor allele (p (interaction,empirical-Bayes) = 0.006 using the empirical-Bayes method, p (interaction,logistic regression) = 0.013 using logistic regression). The risk associated with duration of use of monotherapy was increased by AKR1C3_rs7741 in minor allele carriers (p (interaction,empirical-Bayes) = 0.083, p (interaction,logistic regression) = 0.029) and decreased in minor allele carriers of two SNPs in AKR1C4 (rs3829125: p (interaction,empirical-Bayes) = 0.07, p (interaction,logistic regression) = 0.021; rs17134592: p (interaction,empirical-Bayes) = 0.101, p (interaction,logistic regression) = 0.038). 
After Bonferroni correction for multiple testing only SRD5A1_rs3736316 assessed using the empirical-Bayes method remained significant. Postmenopausal breast cancer risk associated with combined therapy may be modified by genetic variation in SRD5A1. Further well-powered studies are, however, required to replicate our finding.
Improvements in Spectrum's fit to program data tool.
Mahiane, Severin G; Marsh, Kimberly; Grantham, Kelsey; Crichlow, Shawna; Caceres, Karen; Stover, John
2017-04-01
The Joint United Nations Program on HIV/AIDS-supported Spectrum software package (Glastonbury, Connecticut, USA) is used by most countries worldwide to monitor the HIV epidemic. In Spectrum, HIV incidence trends among adults (aged 15-49 years) are derived by either fitting to seroprevalence surveillance and survey data or generating curves consistent with program and vital registration data, such as historical trends in the number of newly diagnosed infections or people living with HIV and AIDS related deaths. This article describes development and application of the fit to program data (FPD) tool in Joint United Nations Program on HIV/AIDS' 2016 estimates round. In the FPD tool, HIV incidence trends are described as a simple or double logistic function. Function parameters are estimated from historical program data on newly reported HIV cases, people living with HIV or AIDS-related deaths. Inputs can be adjusted for proportions undiagnosed or misclassified deaths. Maximum likelihood estimation or minimum chi-squared distance methods are used to identify the best fitting curve. Asymptotic properties of the estimators from these fits are used to estimate uncertainty. The FPD tool was used to fit incidence for 62 countries in 2016. Maximum likelihood and minimum chi-squared distance methods gave similar results. A double logistic curve adequately described observed trends in all but four countries where a simple logistic curve performed better. Robust HIV-related program and vital registration data are routinely available in many middle-income and high-income countries, whereas HIV seroprevalence surveillance and survey data may be scarce. In these countries, the FPD tool offers a simpler, improved approach to estimating HIV incidence trends.
Fitting the Rasch Model to Account for Variation in Item Discrimination
ERIC Educational Resources Information Center
Weitzman, R. A.
2009-01-01
Building on the Kelley and Gulliksen versions of classical test theory, this article shows that a logistic model having only a single item parameter can account for varying item discrimination, as well as difficulty, by using item-test correlations to adjust incorrect-correct (0-1) item responses prior to an initial model fit. The fit occurs…
2018-01-01
Introduction The aim of this study was to evaluate different clusters of anthropometric indicators (body mass index [BMI], waist circumference [WC], waist-to-height ratio [WHtR], triceps skinfold [TR SF], subscapular skinfold [SE SF], sum of the triceps and subscapular skinfolds [ΣTR + SE], and sum of the triceps, subscapular and suprailiac folds [ΣTR + SE + SI]) associated with the VO2max levels in adolescents. Methods The study included 1,132 adolescents (aged 14–19 years) enrolled in public schools of São José, Santa Catarina, Brazil, in the 2014 academic year. The dependent variable was the cluster of anthropometric indicators (BMI, WC, WHtR, TR SF, SE SF, SI SF, ΣTR + SE and ΣTR + SE + SI) of excess body fat. The independent variable was maximum oxygen uptake (VO2max), estimated by the modified Canadian aerobic fitness test—mCAFT. Control variables were: age, skin color, economic level, maternal education, physical activity and sexual maturation. Multinomial logistic regression was used for associations between the dependent and independent variables. Binary logistic regression was performed to identify the association between adolescents with all anthropometric indicators in excess and independent variables. Results One in ten adolescents presented all anthropometric indicators of excess body fat. Multinomial regression showed that with each increase of one VO2max unit, the odds of adolescents having three, four, five or more anthropometric indicators of excess body fat decreased by 0.92, 0.85 and 0.73 times, respectively. In the binary regression, this fact was reconfirmed, demonstrating that with each increase of one VO2max unit, the odds of adolescents having simultaneously the eight anthropometric indicators of excess body fat decreased by 0.55.
Conclusion It was concluded that with each increase of one VO2max unit, adolescents decreased the odds of simultaneously presenting three or more anthropometric indicators of excess body fat, regardless of biological, economic and lifestyle factors. In addition, the present study identified that one in ten adolescents had all anthropometric indicators of excess body fat. PMID:29534098
Gonçalves, Eliane Cristina de Andrade; Nunes, Heloyse Elaine Gimenes; Silva, Diego Augusto Santos
2018-01-01
The aim of this study was to evaluate different clusters of anthropometric indicators (body mass index [BMI], waist circumference [WC], waist-to-height ratio [WHtR], triceps skinfold [TR SF], subscapular skinfold [SE SF], sum of the triceps and subscapular skinfolds [ΣTR + SE], and sum of the triceps, subscapular and suprailiac folds [ΣTR + SE + SI]) associated with the VO2max levels in adolescents. The study included 1,132 adolescents (aged 14-19 years) enrolled in public schools of São José, Santa Catarina, Brazil, in the 2014 academic year. The dependent variable was the cluster of anthropometric indicators (BMI, WC, WHtR, TR SF, SE SF, SI SF, ΣTR + SE and ΣTR + SE + SI) of excess body fat. The independent variable was maximum oxygen uptake (VO2max), estimated by the modified Canadian aerobic fitness test-mCAFT. Control variables were: age, skin color, economic level, maternal education, physical activity and sexual maturation. Multinomial logistic regression was used for associations between the dependent and independent variables. Binary logistic regression was performed to identify the association between adolescents with all anthropometric indicators in excess and independent variables. One in ten adolescents presented all anthropometric indicators of excess body fat. Multinomial regression showed that with each increase of one VO2max unit, the odds of adolescents having three, four, five or more anthropometric indicators of excess body fat decreased by 0.92, 0.85 and 0.73 times, respectively. In the binary regression, this fact was reconfirmed, demonstrating that with each increase of one VO2max unit, the odds of adolescents having simultaneously the eight anthropometric indicators of excess body fat decreased by 0.55.
It was concluded that with each increase of one VO2max unit, adolescents decreased the odds of simultaneously presenting three or more anthropometric indicators of excess body fat, regardless of biological, economic and lifestyle factors. In addition, the present study identified that one in ten adolescents had all anthropometric indicators of excess body fat.
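Per-unit effects in logistic regression compound multiplicatively, which a short calculation makes concrete. The sketch below assumes that the reported figures (0.92, 0.85, 0.73, 0.55) are odds ratios per one-unit increase in VO2max; the abstract's wording ("decreased by 0.92 times") is ambiguous on this point, so treat that reading as an assumption. The five-unit difference is an arbitrary illustration, and the function name is hypothetical.

```python
def odds_multiplier(odds_ratio_per_unit, units):
    """Multiplicative change in odds for a given change in the predictor.

    In logistic regression the log-odds are linear in the predictor,
    so an odds ratio of OR per unit becomes OR ** units over several units.
    """
    return odds_ratio_per_unit ** units


# Assumed per-unit odds ratios, read from the abstract.
per_unit = {
    "three indicators": 0.92,
    "four indicators": 0.85,
    "five or more indicators": 0.73,
}

# Odds multiplier for a hypothetical five-unit difference in VO2max.
for outcome, odds_ratio in per_unit.items():
    print(outcome, round(odds_multiplier(odds_ratio, 5), 3))
```

Under this reading, an odds ratio that looks modest per unit can imply a substantial difference in odds across the range of VO2max values seen in a sample.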
Regression Models for Identifying Noise Sources in Magnetic Resonance Images
Zhu, Hongtu; Li, Yimei; Ibrahim, Joseph G.; Shi, Xiaoyan; An, Hongyu; Chen, Yashen; Gao, Wei; Lin, Weili; Rowe, Daniel B.; Peterson, Bradley S.
2009-01-01
Stochastic noise, susceptibility artifacts, magnetic field and radiofrequency inhomogeneities, and other noise components in magnetic resonance images (MRIs) can introduce serious bias into any measurements made with those images. We formally introduce three regression models including a Rician regression model and two associated normal models to characterize stochastic noise in various magnetic resonance imaging modalities, including diffusion-weighted imaging (DWI) and functional MRI (fMRI). Estimation algorithms are introduced to maximize the likelihood function of the three regression models. We also develop a diagnostic procedure for systematically exploring MR images to identify noise components other than simple stochastic noise, and to detect discrepancies between the fitted regression models and MRI data. The diagnostic procedure includes goodness-of-fit statistics, measures of influence, and tools for graphical display. The goodness-of-fit statistics can assess the key assumptions of the three regression models, whereas measures of influence can isolate outliers caused by certain noise components, including motion artifacts. The tools for graphical display permit graphical visualization of the values for the goodness-of-fit statistic and influence measures. Finally, we conduct simulation studies to evaluate performance of these methods, and we analyze a real dataset to illustrate how our diagnostic procedure localizes subtle image artifacts by detecting intravoxel variability that is not captured by the regression models. PMID:19890478
Applications of statistics to medical science, III. Correlation and regression.
Watanabe, Hiroshi
2012-01-01
In this third part of a series surveying medical statistics, the concepts of correlation and regression are reviewed. In particular, methods of linear regression and logistic regression are discussed. Arguments related to survival analysis will be made in a subsequent paper.
Beukinga, Roelof J; Hulshoff, Jan Binne; Mul, Véronique E M; Noordzij, Walter; Kats-Ugurlu, Gursah; Slart, Riemer H J A; Plukker, John T M
2018-06-01
Purpose To assess the value of baseline and restaging fluorine 18 (18F) fluorodeoxyglucose (FDG) positron emission tomography (PET) radiomics in predicting pathologic complete response to neoadjuvant chemotherapy and radiation therapy (NCRT) in patients with locally advanced esophageal cancer. Materials and Methods In this retrospective study, 73 patients with histologic analysis-confirmed T1/N1-3/M0 or T2-4a/N0-3/M0 esophageal cancer were treated with NCRT followed by surgery (Chemoradiotherapy for Esophageal Cancer followed by Surgery Study regimen) between October 2014 and August 2017. Clinical variables and radiomic features from baseline and restaging 18F-FDG PET were selected by univariable logistic regression and least absolute shrinkage and selection operator. The selected variables were used to fit a multivariable logistic regression model, which was internally validated by using bootstrap resampling with 20 000 replicates. The performance of this model was compared with reference prediction models composed of maximum standardized uptake value metrics, clinical variables, and maximum standardized uptake value at baseline NCRT radiomic features. Outcome was defined as complete versus incomplete pathologic response (tumor regression grade 1 vs 2-5 according to the Mandard classification). Results Pathologic response was complete in 16 patients (21.9%) and incomplete in 57 patients (78.1%). A prediction model combining clinical T-stage and restaging NCRT (post-NCRT) joint maximum (quantifying image orderliness) yielded an optimism-corrected area under the receiver operating characteristics curve of 0.81. Post-NCRT joint maximum was replaceable with five other redundant post-NCRT radiomic features that provided equal model performance. All reference prediction models exhibited substantially lower discriminatory accuracy.
Conclusion The combination of clinical T-staging and quantitative assessment of post-NCRT 18F-FDG PET orderliness (joint maximum) provided high discriminatory accuracy in predicting pathologic complete response in patients with esophageal cancer. © RSNA, 2018 Online supplemental material is available for this article.
Schell, Greggory J; Lavieri, Mariel S; Stein, Joshua D; Musch, David C
2013-12-21
Open-angle glaucoma (OAG) is a prevalent, degenerative ocular disease which can lead to blindness without proper clinical management. The tests used to assess disease progression are susceptible to process and measurement noise. The aim of this study was to develop a methodology which accounts for the inherent noise in the data and improves the identification of significant disease progression. Longitudinal observations from the Collaborative Initial Glaucoma Treatment Study (CIGTS) were used to parameterize and validate a Kalman filter model and logistic regression function. The Kalman filter estimates the true value of biomarkers associated with OAG and forecasts future values of these variables. We develop two logistic regression models via generalized estimating equations (GEE) for calculating the probability of experiencing significant OAG progression: one model based on the raw measurements from CIGTS and another model based on the Kalman filter estimates of the CIGTS data. Receiver operating characteristic (ROC) curves and associated area under the ROC curve (AUC) estimates are calculated using cross-fold validation. The logistic regression model developed using Kalman filter estimates as data input achieves higher sensitivity and specificity than the model developed using raw measurements. The mean AUC for the Kalman filter-based model is 0.961 while the mean AUC for the raw measurements model is 0.889. Hence, using the probability function generated via Kalman filter estimates and GEE for logistic regression, we are able to more accurately classify patients and instances as experiencing significant OAG progression. A Kalman filter approach for estimating the true value of OAG biomarkers resulted in data input which improved the accuracy of a logistic regression classification model compared to a model using raw measurements as input.
This methodology accounts for process and measurement noise to enable improved discrimination between progression and nonprogression in chronic diseases.
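The filtering idea described above can be illustrated with a minimal one-dimensional sketch. It assumes a scalar random-walk state model with known noise variances. The function name, the parameters, and the model itself are illustrative assumptions, not the multivariate model fitted to the CIGTS data.

```python
def kalman_filter_1d(measurements, process_var, measurement_var, x0, p0):
    """Scalar Kalman filter for a random-walk state observed with noise.

    Assumed (hypothetical) model:
        x_t = x_{t-1} + w_t,  w_t ~ N(0, process_var)
        z_t = x_t + v_t,      v_t ~ N(0, measurement_var)
    Returns the filtered state estimate after each measurement.
    """
    x, p = x0, p0
    estimates = []
    for z in measurements:
        # Predict: the mean is unchanged under a random walk; uncertainty grows.
        p = p + process_var
        # Update: blend prediction and measurement using the Kalman gain.
        gain = p / (p + measurement_var)
        x = x + gain * (z - x)
        p = (1.0 - gain) * p
        estimates.append(x)
    return estimates


# Noisy readings of a slowly varying quantity (made-up numbers).
noisy = [2.0, 2.4, 1.7, 2.1, 2.2]
print(kalman_filter_1d(noisy, process_var=0.01, measurement_var=0.25, x0=2.0, p0=1.0))
```

In a pipeline like the one described, the filtered estimates rather than the raw readings would then be supplied as predictors to the logistic regression model.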
Computing group cardinality constraint solutions for logistic regression problems.
Zhang, Yong; Kwon, Dongjin; Pohl, Kilian M
2017-01-01
We derive an algorithm to directly solve logistic regression subject to a group cardinality constraint (group sparsity) and use it to classify intra-subject MRI sequences (e.g., cine MRIs) of healthy versus diseased subjects. Group cardinality constraint models are often applied to medical images in order to avoid overfitting of the classifier to the training data. Solutions within these models are generally determined by relaxing the cardinality constraint to a weighted feature selection scheme. However, these solutions relate to the original sparse problem only under specific assumptions, which generally do not hold for medical image applications. In addition, inferring clinical meaning from features weighted by a classifier is an ongoing topic of discussion. To avoid weighting features, we propose to directly solve the group cardinality constraint logistic regression problem by generalizing the Penalty Decomposition method. To do so, we assume that an intra-subject series of images represents repeated samples of the same disease patterns. We model this assumption by combining series of measurements created by a feature across time into a single group. Our algorithm then derives a solution within that model by decoupling the minimization of the logistic regression function from enforcing the group sparsity constraint. The minimum of the smooth and convex logistic regression problem is determined via gradient descent, while we derive a closed-form solution for finding a sparse approximation of that minimum. We apply our method to cine MRI of 38 healthy controls and 44 adult patients who received reconstructive surgery for Tetralogy of Fallot (TOF) during infancy. Our method correctly identifies regions impacted by TOF and generally obtains statistically significantly higher classification accuracy than alternative solutions to this model, i.e., ones relaxing group cardinality constraints. Copyright © 2016 Elsevier B.V. All rights reserved.
Ren, Yilong; Wang, Yunpeng; Wu, Xinkai; Yu, Guizhen; Ding, Chuan
2016-10-01
Red light running (RLR) has become a major safety concern at signalized intersections. To prevent RLR-related crashes, it is critical to identify the factors that significantly affect drivers' RLR behavior and to predict potential RLR in real time. In this research, nine months of RLR events extracted from high-resolution traffic data collected by loop detectors at three signalized intersections were used to identify the factors that significantly affect RLR behavior. The data analysis indicated that occupancy time, time gap, used yellow time, time left to yellow start, whether the preceding vehicle runs through the intersection during yellow, and whether there is a vehicle passing through the intersection on the adjacent lane were significant factors for RLR behavior. Furthermore, because RLR is a rare event, a modified rare events logistic regression model was developed for RLR prediction. The rare events logistic regression method has been applied in many fields for rare events studies and shows impressive performance, but so far no previous research has applied this method to study RLR. The results showed that the rare events logistic regression model performed significantly better than the standard logistic regression model. More importantly, the proposed RLR prediction method is based purely on loop detector data collected from a single advance loop detector located 400 feet from the stop bar. This offers great potential for future field applications of the proposed method, since loops have been widely installed at many intersections and can collect data in real time. This research is expected to contribute significantly to the improvement of intersection safety. Copyright © 2016 Elsevier Ltd. All rights reserved.
Engoren, Milo; Habib, Robert H; Dooner, John J; Schwann, Thomas A
2013-08-01
As many as 14% of patients undergoing coronary artery bypass surgery are readmitted within 30 days. Readmission is usually the result of morbidity and may lead to death. The purpose of this study is to develop and compare statistical and genetic programming models to predict readmission. Patients were divided into separate Construction and Validation populations. Using 88 variables, logistic regression, genetic programs, and artificial neural nets were used to develop predictive models. Models were first constructed and tested on the Construction population, then validated on the Validation population. Areas under the receiver operator characteristic curves (AU ROC) were used to compare the models. Two hundred and two patients (7.6%) in the 2,644 patient Construction group and 216 (8.0%) of the 2,711 patient Validation group were readmitted within 30 days of CABG surgery. Logistic regression predicted readmission with AU ROC = .675 ± .021 in the Construction group. Genetic programs significantly improved the accuracy (AU ROC = .767 ± .001, p < .001). Artificial neural nets were less accurate, with AU ROC = 0.597 ± .001 in the Construction group. Predictive accuracy of all three techniques fell in the Validation group. However, the accuracy of genetic programming (AU ROC = .654 ± .001) was still slightly, but not statistically significantly, better than that of the logistic regression (AU ROC = .644 ± .020, p = .61). Genetic programming and logistic regression provide alternative methods to predict readmission that are similarly accurate.
Eken, Cenker; Bilge, Ugur; Kartal, Mutlu; Eray, Oktay
2009-06-03
Logistic regression is the most common statistical model for processing multivariate data in the medical literature. Artificial intelligence models such as an artificial neural network (ANN) and a genetic algorithm (GA) may also be useful for interpreting medical data. The purpose of this study was to apply artificial intelligence models to a medical data set and compare them with logistic regression. ANN, GA, and logistic regression analyses were carried out on a data set from a previously published article on patients presenting to an emergency department with flank pain suspicious for renal colic. The study population was composed of 227 patients: 176 patients had a diagnosis of urinary stone, while 51 ultimately had no calculus. The GA found two decision rules for predicting urinary stones. Rule 1 consisted of being male, pain not spreading to the back, and no fever. In rule 2, pelvicaliceal dilatation on bedside ultrasonography replaced no fever. ANN, GA rule 1, GA rule 2, and logistic regression had a sensitivity of 94.9, 67.6, 56.8, and 95.5%, a specificity of 78.4, 76.47, 86.3, and 47.1%, a positive likelihood ratio of 4.4, 2.9, 4.1, and 1.8, and a negative likelihood ratio of 0.06, 0.42, 0.5, and 0.09, respectively. The area under the curve was 0.867, 0.720, 0.715, and 0.713, respectively. Data mining techniques such as ANN and GA can be used for predicting renal colic in emergency settings and for constructing clinical decision rules. They may be an alternative to conventional multivariate analysis applications used in biostatistics.
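The likelihood ratios quoted above follow from the reported sensitivities and specificities through the standard definitions LR+ = sensitivity / (1 - specificity) and LR- = (1 - sensitivity) / specificity. The sketch below recomputes them. The function and variable names are illustrative, and small differences from the published figures are expected because the inputs are already rounded.

```python
def likelihood_ratios(sensitivity, specificity):
    """Return (LR+, LR-) for sensitivity and specificity given as fractions in (0, 1)."""
    lr_positive = sensitivity / (1.0 - specificity)
    lr_negative = (1.0 - sensitivity) / specificity
    return lr_positive, lr_negative


# Sensitivity and specificity (percent) as reported in the abstract.
reported = {
    "ANN": (94.9, 78.4),
    "GA rule 1": (67.6, 76.47),
    "GA rule 2": (56.8, 86.3),
    "Logistic regression": (95.5, 47.1),
}

results = {
    name: likelihood_ratios(sens / 100.0, spec / 100.0)
    for name, (sens, spec) in reported.items()
}

for name, (lr_pos, lr_neg) in results.items():
    print(f"{name}: LR+ = {lr_pos:.2f}, LR- = {lr_neg:.3f}")
```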
NASA Astrophysics Data System (ADS)
Duman, T. Y.; Can, T.; Gokceoglu, C.; Nefeslioglu, H. A.; Sonmez, H.
2006-11-01
As a result of industrialization, cities throughout the world have been growing rapidly for the last century. One typical example of these growing cities is Istanbul, the population of which is over 10 million. Due to rapid urbanization, new areas suitable for settlement and engineering structures are necessary. The Cekmece area, located west of the Istanbul metropolitan area, is studied because landslide activity is extensive in this area. The purpose of this study is to develop a model that can be used to characterize landslide susceptibility in map form using logistic regression analysis of an extensive landslide database. A database of landslide activity was constructed using both aerial photography and field studies. About 19.2% of the selected study area is covered by deep-seated landslides. The landslides that occur in the area are primarily located in sandstones with interbedded permeable and impermeable layers such as claystone, siltstone and mudstone. About 31.95% of the total landslide area is located in this unit. To apply logistic regression analyses, a data matrix including 37 variables was constructed. The variables used in the forward stepwise analyses are different measures of slope, aspect, elevation, stream power index (SPI), plan curvature, profile curvature, geology, geomorphology and relative permeability of lithological units. A total of 25 variables were identified as exerting strong influence on landslide occurrence and were included in the logistic regression equation. Wald statistics values indicate that lithology, SPI and slope are more important than the other parameters in the equation. Beta coefficients of the 25 variables included in the logistic regression equation provide a model for landslide susceptibility in the Cekmece area. This model is used to generate a landslide susceptibility map that correctly classified 83.8% of the landslide-prone areas.
Predicting clicks of PubMed articles
Mao, Yuqing; Lu, Zhiyong
2013-01-01
Predicting the popularity or access usage of an article has the potential to improve the quality of PubMed searches. We can model the click trend of each article as its access changes over time by mining the PubMed query logs, which contain the previous access history for all articles. In this article, we examine the access patterns produced by PubMed users in two years (July 2009 to July 2011). We explore the time series of accesses for each article in the query logs, model the trends with regression approaches, and subsequently use the models for prediction. We show that the click trends of PubMed articles are best fitted with a log-normal regression model. This model allows the number of accesses an article receives and the time since it first becomes available in PubMed to be related via quadratic and logistic functions, with the model parameters to be estimated via maximum likelihood. Our experiments predicting the number of accesses for an article based on its past usage demonstrate that the mean absolute error and mean absolute percentage error of our model are 4.0% and 8.1% lower than the power-law regression model, respectively. The log-normal distribution is also shown to perform significantly better than a previous prediction method based on a human memory theory in cognitive science. This work warrants further investigation on the utility of such a log-normal regression approach towards improving information access in PubMed. PMID:24551386
New robust statistical procedures for the polytomous logistic regression models.
Castilla, Elena; Ghosh, Abhik; Martin, Nirian; Pardo, Leandro
2018-05-17
This article derives a new family of estimators, namely the minimum density power divergence estimators, as a robust generalization of the maximum likelihood estimator for the polytomous logistic regression model. Based on these estimators, a family of Wald-type test statistics for linear hypotheses is introduced. Robustness properties of both the proposed estimators and the test statistics are theoretically studied through the classical influence function analysis. Appropriate real-life examples are presented to justify the requirement of suitable robust statistical procedures in place of the likelihood-based inference for the polytomous logistic regression model. The validity of the theoretical results established in the article is further confirmed empirically through suitable simulation studies. Finally, an approach for the data-driven selection of the robustness tuning parameter is proposed with empirical justifications. © 2018, The International Biometric Society.
Staley, Dennis M.; Negri, Jacquelyn A.; Kean, Jason W.; Laber, Jayme L.; Tillery, Anne C.; Youberg, Ann M.
2016-06-30
Wildfire can significantly alter the hydrologic response of a watershed to the extent that even modest rainstorms can generate dangerous flash floods and debris flows. To reduce public exposure to hazard, the U.S. Geological Survey produces post-fire debris-flow hazard assessments for select fires in the western United States. We use publicly available geospatial data describing basin morphology, burn severity, soil properties, and rainfall characteristics to estimate the statistical likelihood that debris flows will occur in response to a storm of a given rainfall intensity. Using an empirical database and refined geospatial analysis methods, we defined new equations for the prediction of debris-flow likelihood using logistic regression methods. We showed that the new logistic regression model outperformed previous models used to predict debris-flow likelihood.
NASA Astrophysics Data System (ADS)
Kneringer, Philipp; Dietz, Sebastian; Mayr, Georg J.; Zeileis, Achim
2017-04-01
Low-visibility conditions have a large impact on aviation safety and economic efficiency of airports and airlines. To support decision makers, we develop a statistical probabilistic nowcasting tool for the occurrence of capacity-reducing operations related to low visibility. The probabilities of four different low visibility classes are predicted with an ordered logistic regression model based on time series of meteorological point measurements. Potential predictor variables for the statistical models are visibility, humidity, temperature and wind measurements at several measurement sites. A stepwise variable selection method indicates that visibility and humidity measurements are the most important model inputs. The forecasts are tested with a 30 minute forecast interval up to two hours, which is a sufficient time span for tactical planning at Vienna Airport. The ordered logistic regression models outperform persistence and are competitive with human forecasters.
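As a rough illustration of how an ordered logistic regression turns a single linear predictor into probabilities for four ordered classes, the sketch below uses the common cumulative-logit (proportional odds) form P(Y <= k) = sigmoid(theta_k - eta). The threshold values and the linear predictor are made-up numbers, and the parameterization is an assumption rather than the one used in the study.

```python
import math


def sigmoid(t):
    """Logistic function."""
    return 1.0 / (1.0 + math.exp(-t))


def ordered_logit_probs(eta, thresholds):
    """Class probabilities for an ordered logit model.

    eta        : linear predictor (e.g. a weighted sum of visibility and humidity)
    thresholds : increasing cut points theta_1 < ... < theta_{K-1}
    Returns K probabilities, one per ordered class.
    """
    cumulative = [sigmoid(theta - eta) for theta in thresholds] + [1.0]
    probs = []
    previous = 0.0
    for c in cumulative:
        probs.append(c - previous)  # P(Y = k) = P(Y <= k) - P(Y <= k - 1)
        previous = c
    return probs


# Three hypothetical thresholds give four classes.
probs = ordered_logit_probs(eta=0.0, thresholds=[-1.0, 0.0, 1.0])
print([round(p, 3) for p in probs])
```

A larger linear predictor shifts probability mass toward the higher classes while the thresholds keep the classes in order.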
Wang, Shuang; Jiang, Xiaoqian; Wu, Yuan; Cui, Lijuan; Cheng, Samuel; Ohno-Machado, Lucila
2013-06-01
We developed an EXpectation Propagation LOgistic REgRession (EXPLORER) model for distributed privacy-preserving online learning. The proposed framework provides a high level guarantee for protecting sensitive information, since the information exchanged between the server and the client is the encrypted posterior distribution of coefficients. Through experimental results, EXPLORER shows the same performance (e.g., discrimination, calibration, feature selection, etc.) as the traditional frequentist logistic regression model, but provides more flexibility in model updating. That is, EXPLORER can be updated one point at a time rather than having to retrain the entire data set when new observations are recorded. The proposed EXPLORER supports asynchronized communication, which relieves the participants from coordinating with one another, and prevents service breakdown from the absence of participants or interrupted communications. Copyright © 2013 Elsevier Inc. All rights reserved.
A Robust Bayesian Random Effects Model for Nonlinear Calibration Problems
Fong, Y.; Wakefield, J.; De Rosa, S.; Frahm, N.
2013-01-01
Summary In the context of a bioassay or an immunoassay, calibration means fitting a curve, usually nonlinear, through the observations collected on a set of samples containing known concentrations of a target substance, and then using the fitted curve and observations collected on samples of interest to predict the concentrations of the target substance in these samples. Recent technological advances have greatly improved our ability to quantify minute amounts of substance from a tiny volume of biological sample. This has in turn led to a need to improve statistical methods for calibration. In this paper, we focus on developing calibration methods robust to dependent outliers. We introduce a novel normal mixture model with dependent error terms to model the experimental noise. In addition, we propose a re-parameterization of the five parameter logistic nonlinear regression model that allows us to better incorporate prior information. We examine the performance of our methods with simulation studies and show that they lead to a substantial increase in performance measured in terms of mean squared error of estimation and a measure of the average prediction accuracy. A real data example from the HIV Vaccine Trials Network Laboratory is used to illustrate the methods. PMID:22551415
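For orientation, the five-parameter logistic (5PL) curve is commonly written in the following standard form. This is the conventional parameterization, given here as an assumption for reference. The re-parameterization proposed in the paper is not reproduced in the abstract and is not shown.

```latex
% Standard 5PL calibration curve (conventional form, not the paper's re-parameterization):
%   x : concentration of the target substance
%   a : response at zero concentration
%   d : response at infinite concentration
%   c : location (mid-range concentration) parameter
%   b : slope parameter
%   g : asymmetry parameter (g = 1 reduces to the four-parameter logistic)
f(x) = d + \frac{a - d}{\left[ 1 + \left( x / c \right)^{b} \right]^{g}}
```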
Relationships between training load, injury, and fitness in sub-elite collision sport athletes.
Gabbett, Tim J; Domrow, Nathan
2007-11-01
The purpose of this study was to develop statistical models that estimate the influence of training load on training injury and physical fitness in collision sport athletes. The incidence of training injuries was studied in 183 rugby league players over two competitive seasons. Participants were assessed for height, body mass, skinfold thickness, vertical jump, 10-m, 20-m and 40-m sprint time, agility, and estimated maximal aerobic power in the off-season, pre-season, mid-season, and end-season. Training load and injury data were summarised into pre-season, early-competition, and late-competition training phases. Individual training load, fitness, and injury data were modelled using a logistic regression model with a binomial distribution and logit link function, while team training load and injury data were modelled using a linear regression model. While physical fitness improved with training, there was no association (P=0.16-0.99) between training load and changes in physical fitness during any of the training phases. However, increases in training load during the early-competition training phase decreased (P= 0.04) agility performance. A relationship (P= 0.01-0.04) was observed between the log of training load and odds of injury during each training phase, resulting in a 1.50 - 2.85 increase in the odds of injury for each arbitrary unit increase in training load. Furthermore, during the pre-season training phase there was a relationship (P= 0.01) between training load and injury incidence within the training load range of 155 and 590 arbitrary units. During the early and late-competition training phases, increases in training load of 175-620 arbitrary units and 145-410 arbitrary units, respectively, resulted in no further increase in injury incidence. These findings demonstrate that increases in training load, particularly during the pre-season training phase, increase the odds of injury in collision sport athletes. 
However, while increases in training load from 175 to 620 arbitrary units during the early-competition training phase result in no further increase in injury incidence, marked reductions in agility performances can occur. These findings suggest that reductions in training load during the early-competition training phase can reduce the odds of injury without compromising agility performances in collision sport athletes.
Novel Stool-Based Protein Biomarkers for Improved Colorectal Cancer Screening: A Case-Control Study.
Bosch, Linda J W; de Wit, Meike; Pham, Thang V; Coupé, Veerle M H; Hiemstra, Annemieke C; Piersma, Sander R; Oudgenoeg, Gideon; Scheffer, George L; Mongera, Sandra; Sive Droste, Jochim Terhaar; Oort, Frank A; van Turenhout, Sietze T; Larbi, Ilhame Ben; Louwagie, Joost; van Criekinge, Wim; van der Hulst, Rene W M; Mulder, Chris J J; Carvalho, Beatriz; Fijneman, Remond J A; Jimenez, Connie R; Meijer, Gerrit A
2017-12-19
Background The fecal immunochemical test (FIT) for detecting hemoglobin is used widely for noninvasive colorectal cancer (CRC) screening, but its sensitivity leaves room for improvement. Objective To identify novel protein biomarkers in stool that outperform or complement hemoglobin in detecting CRC and advanced adenomas. Design Case-control study. Setting Colonoscopy-controlled referral population from several centers. Participants 315 stool samples from one series of 12 patients with CRC and 10 persons without colorectal neoplasia (control samples) and a second series of 81 patients with CRC, 40 with advanced adenomas, and 43 with nonadvanced adenomas, as well as 129 persons without colorectal neoplasia (control samples); 72 FIT samples from a third independent series of 14 patients with CRC, 16 with advanced adenomas, and 18 with nonadvanced adenomas, as well as 24 persons without colorectal neoplasia (control samples). Measurements Stool samples were analyzed by mass spectrometry. Classification and regression tree (CART) analysis and logistic regression analyses were performed to identify protein combinations that differentiated CRC or advanced adenoma from control samples. Antibody-based assays for 4 selected proteins were done on FIT samples. Results In total, 834 human proteins were identified, 29 of which were statistically significantly enriched in CRC versus control stool samples in both series. Combinations of 4 proteins reached sensitivities of 80% and 45% for detecting CRC and advanced adenomas, respectively, at 95% specificity, which was higher than that of hemoglobin alone (P < 0.001 and P = 0.003, respectively). Selected proteins could be measured in small sample volumes used in FIT-based screening programs and discriminated between CRC and control samples (P < 0.001). Limitation Lack of availability of antibodies prohibited validation of the top protein combinations in FIT samples. Conclusion Mass spectrometry of stool samples identified novel candidate protein biomarkers for CRC screening.
Several protein combinations outperformed hemoglobin in discriminating CRC or advanced adenoma from control samples. Proof of concept that such proteins can be detected with antibody-based assays in small sample volumes indicates the potential of these biomarkers to be applied in population screening. Primary Funding Source Center for Translational Molecular Medicine, International Translational Cancer Research Dream Team, Stand Up to Cancer (American Association for Cancer Research and the Dutch Cancer Society), Dutch Digestive Foundation, and VU University Medical Center.
Cakir, Ebru; Kucuk, Ulku; Pala, Emel Ebru; Sezer, Ozlem; Ekin, Rahmi Gokhan; Cakmak, Ozgur
2017-05-01
Conventional cytomorphologic assessment is the first step to establish an accurate diagnosis in urinary cytology. In cytologic preparations, the separation of low-grade urothelial carcinoma (LGUC) from reactive urothelial proliferation (RUP) can be exceedingly difficult. The bladder washing cytologies of 32 LGUC and 29 RUP were reviewed. The cytologic slides were examined for the presence or absence of the 28 cytologic features. The cytologic criteria showing statistical significance in LGUC were increased numbers of monotonous single (non-umbrella) cells, three-dimensional cellular papillary clusters without fibrovascular cores, irregular bordered clusters, atypical single cells, irregular nuclear overlap, cytoplasmic homogeneity, increased N/C ratio, pleomorphism, nuclear border irregularity, nuclear eccentricity, elongated nuclei, and hyperchromasia (p < 0.05), and the cytologic criteria showing statistical significance in RUP were inflammatory background, mixture of small and large urothelial cells, loose monolayer aggregates, and vacuolated cytoplasm (p < 0.05). When these variables were subjected to a stepwise logistic regression analysis, four features were selected to distinguish LGUC from RUP: increased numbers of monotonous single (non-umbrella) cells, increased nuclear cytoplasmic ratio, hyperchromasia, and presence of small and large urothelial cells (p = 0.0001). By this logistic model of the 32 cases with proven LGUC, the stepwise logistic regression analysis correctly predicted 31 (96.9%) patients with this diagnosis, and of the 29 patients with RUP, the logistic model correctly predicted 26 (89.7%) patients as having this disease. There are several cytologic features to separate LGUC from RUP. Stepwise logistic regression analysis is a valuable tool for determining the most useful cytologic criteria to distinguish these entities. © 2017 APMIS. Published by John Wiley & Sons Ltd.
Brophy, Sinead; Rees, Anwen; Knox, Gareth; Baker, Julien; Thomas, Non E.
2012-01-01
Background: This study examines obesity and factors associated with obesity in children aged 11–13 years in the UK. Methods: 1147 children from ten secondary schools participated in a health survey that included blood samples, fitness test and anthropometric measures. Factors associated with obesity were examined using multilevel logistic regression. Findings: Of the children examined (490 male; 657 female) a third were overweight, 1 in 6 had elevated blood pressure, more than 1 in 10 had high cholesterol, 58% consumed more fat than recommended, whilst 37% were classified as unfit. Children in deprived areas had a higher proportion of risk factors; for example, they had higher blood pressure (20% (deprived) compared to 11% (non-deprived), difference: 9.0% (95%CI: 4.7%–13.4%)). Obesity is associated with risk factors for heart disease and diabetes. Maintaining fitness is associated with a reduction in the risk factors for heart disease (high blood pressure and cholesterol) but not in the risk factors for diabetes (insulin levels). In order of importance, the main risk factors for childhood obesity are being unfit, having an obese father, and being large at birth. Conclusion: The high proportion of children with risk factors suggests future interventions need to focus on community and policy change to shift the population norm rather than targeting the behaviour of high-risk individuals. Interventions need to focus on mothers’ lifestyle in pregnancy, fathers’ health, as well as promoting fitness among children. Obesity was not associated with deprivation. Therefore, strategies should be adopted in both deprived and non-deprived areas. PMID:22693553
Growth curves for ostriches (Struthio camelus) in a Brazilian population.
Ramos, S B; Caetano, S L; Savegnago, R P; Nunes, B N; Ramos, A A; Munari, D P
2013-01-01
The objective of this study was to fit growth curves using nonlinear and linear functions to describe the growth of ostriches in a Brazilian population. The data set consisted of 112 animals with BW measurements from hatching to 383 d of age. Two nonlinear growth functions (Gompertz and logistic) and a third-order polynomial function were applied. The parameters for the models were estimated using the least-squares method and Gauss-Newton algorithm. The goodness-of-fit of the models was assessed using R² and the Akaike information criterion. The R² calculated for the logistic growth model was 0.945 for hens and 0.928 for cockerels and for the Gompertz growth model, 0.938 for hens and 0.924 for cockerels. The third-order polynomial fit gave R² of 0.938 for hens and 0.924 for cockerels. Among the Akaike information criterion calculations, the logistic growth model presented the lowest values in this study, both for hens and for cockerels. Nonlinear models are more appropriate for describing the sigmoid nature of ostrich growth.
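The model comparison described in this abstract (R² and the Akaike information criterion for competing growth curves) can be sketched as follows. This is a minimal illustration, not the authors' analysis: the parameterisations of the two curves, the synthetic body weights, and all parameter values are assumptions, and the Gauss-Newton estimation step is skipped by plugging in parameter values directly. The AIC form used is the common least-squares version, n·ln(RSS/n) + 2p, with the error variance counted as a parameter.

```python
import math

def logistic_growth(t, a, b, k):
    """Logistic curve: a = asymptotic weight, b = shape, k = growth rate."""
    return a / (1.0 + b * math.exp(-k * t))

def gompertz_growth(t, a, b, k):
    """Gompertz curve with the same parameter roles."""
    return a * math.exp(-b * math.exp(-k * t))

def rss(model, params, ages, weights):
    """Residual sum of squares of a model evaluated at the given parameters."""
    return sum((w - model(t, *params)) ** 2 for t, w in zip(ages, weights))

def r_squared(model, params, ages, weights):
    """Coefficient of determination, 1 - RSS/TSS."""
    mean_w = sum(weights) / len(weights)
    tss = sum((w - mean_w) ** 2 for w in weights)
    return 1.0 - rss(model, params, ages, weights) / tss

def aic_least_squares(model, params, ages, weights):
    """AIC for a least-squares fit with Gaussian errors."""
    n = len(weights)
    p = len(params) + 1  # +1 for the error variance
    return n * math.log(rss(model, params, ages, weights) / n) + 2 * p

# Hypothetical data: weights generated from a logistic curve plus alternating +/-1 kg noise.
ages = list(range(0, 361, 30))
weights = [logistic_growth(t, 100.0, 50.0, 0.02) + (-1) ** i
           for i, t in enumerate(ages)]

logistic_params = (100.0, 50.0, 0.02)   # assumed, not estimated
gompertz_params = (100.0, 4.0, 0.01)    # assumed, not estimated

r2_logistic = r_squared(logistic_growth, logistic_params, ages, weights)
aic_logistic = aic_least_squares(logistic_growth, logistic_params, ages, weights)
aic_gompertz = aic_least_squares(gompertz_growth, gompertz_params, ages, weights)
print(r2_logistic, aic_logistic, aic_gompertz)
```

Because both curves have the same number of parameters, the AIC comparison here reduces to comparing residual sums of squares; the penalty term matters only when models differ in size, as with the third-order polynomial in the abstract.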
Meacham, Meredith C; Rudolph, Abby E; Strathdee, Steffanie A; Rusch, Melanie L; Brouwer, Kimberly C; Patterson, Thomas L; Vera, Alicia; Rangel, Gudelia; Roesch, Scott C
2015-01-01
Although most people who inject drugs (PWID) in Tijuana, Mexico, primarily inject heroin, injection and non-injection use of methamphetamine and cocaine is common. We examined patterns of polydrug use among heroin injectors to inform prevention and treatment of drug use and its health and social consequences. Participants were PWID residing in Tijuana, aged ≥18 years who reported heroin injection in the past six months and were recruited through respondent-driven sampling (n = 1,025). Latent class analysis was conducted to assign individuals to classes on a probabilistic basis, using four indicators of past six-month polydrug and polyroute use: cocaine injecting, cocaine smoking or snorting, methamphetamine injecting, and methamphetamine smoking or snorting. Latent class membership was regressed onto covariates in a multinomial logistic regression. Latent class analyses testing 1, 2, 3, and 4 classes were fit, with the 3-class solution fitting best. Class 1 was defined by predominantly heroin use (50.2%, n = 515); class 2 by methamphetamine and heroin use (43.7%, n = 448), and class 3 by methamphetamine, cocaine, and heroin use (6.0%, n = 62). Bivariate and multivariate analyses indicated a group of methamphetamine and cocaine users that exhibited higher-risk sexual practices and lower heroin injecting frequency, and a group of methamphetamine users who were younger and more likely to be female. Discrete subtypes of heroin PWID were identified based on methamphetamine and cocaine use patterns. These findings have identified subtypes of heroin injectors who require more tailored interventions to reduce the health and social harms of injecting drug use.
Polydrug use and HIV risk among people who inject heroin in Tijuana, Mexico: A Latent class analysis
Meacham, M.C.; Rudolph, A.E.; Strathdee, S.A.; Rusch, M.L.; Brouwer, K.C.; Patterson, T.L.; Vera, A.; Rangel, G.; Roesch, S.C.
2016-01-01
Background: Although most people who inject drugs (PWID) in Tijuana, Mexico, primarily inject heroin, injection and non-injection use of methamphetamine and cocaine is common. We examined patterns of polydrug use among heroin injectors to inform prevention and treatment of drug use and its health and social consequences. Methods: Participants were PWID residing in Tijuana aged ≥ 18 years who reported heroin injection in the past 6 months and were recruited through respondent-driven sampling (n=1025). Latent class analysis was conducted to assign individuals to classes on a probabilistic basis, using four indicators of past 6-month polydrug and polyroute use: cocaine injecting, cocaine smoking or snorting, methamphetamine injecting, and methamphetamine smoking or snorting. Latent class membership was regressed onto covariates in a multinomial logistic regression. Results: Latent class analyses testing 1, 2, 3, and 4 classes were fit, with the 3-class solution fitting best. Class 1 was defined by predominantly heroin use (50.2%, n=515); class 2 by methamphetamine and heroin use (43.7%, n=448), and class 3 by methamphetamine, cocaine, and heroin use (6.0%, n=62). Bivariate and multivariate analyses indicated a group of methamphetamine and cocaine users that exhibited higher-risk sexual practices and lower heroin injecting frequency, and a group of methamphetamine users who were younger and more likely to be female. Conclusions: Discrete subtypes of heroin PWID were identified based on methamphetamine and cocaine use patterns. These findings have identified subtypes of heroin injectors who require more tailored interventions to reduce the health and social harms of injecting drug use. PMID:26444185
Science of Test Research Consortium: Year Two Final Report
2012-10-02
July 2012. Analysis of an Intervention for Small Unmanned Aerial System ( SUAS ) Accidents, submitted to Quality Engineering, LQEN-2012-0056. Stone... Systems Engineering. Wolf, S. E., R. R. Hill, and J. J. Pignatiello. June 2012. Using Neural Networks and Logistic Regression to Model Small Unmanned ...Human Retina. 6. Wolf, S. E. March 2012. Modeling Small Unmanned Aerial System Mishaps using Logistic Regression and Artificial Neural Networks. 7
ERIC Educational Resources Information Center
Hidalgo, Mª Dolores; Gómez-Benito, Juana; Zumbo, Bruno D.
2014-01-01
The authors analyze the effectiveness of the R² and delta log odds ratio effect size measures when using logistic regression analysis to detect differential item functioning (DIF) in dichotomous items. A simulation study was carried out, and the Type I error rate and power estimates under conditions in which only statistical testing…
Brian S. Cade; Barry R. Noon; Rick D. Scherer; John J. Keane
2017-01-01
Counts of avian fledglings, nestlings, or clutch size that are bounded below by zero and above by some small integer form a discrete random variable distribution that is not approximated well by conventional parametric count distributions such as the Poisson or negative binomial. We developed a logistic quantile regression model to provide estimates of the empirical...
Mohammed, Mohammed A; Manktelow, Bradley N; Hofer, Timothy P
2016-04-01
There is interest in deriving case-mix adjusted standardised mortality ratios so that comparisons between healthcare providers, such as hospitals, can be undertaken in the controversial belief that variability in standardised mortality ratios reflects quality of care. Typically standardised mortality ratios are derived using a fixed effects logistic regression model, without a hospital term in the model. This fails to account for the hierarchical structure of the data - patients nested within hospitals - and so a hierarchical logistic regression model is more appropriate. However, four methods have been advocated for deriving standardised mortality ratios from a hierarchical logistic regression model, but their agreement is not known and neither do we know which is to be preferred. We found significant differences between the four types of standardised mortality ratios because they reflect a range of underlying conceptual issues. The most subtle issue is the distinction between asking how an average patient fares in different hospitals versus how patients at a given hospital fare at an average hospital. Since the answers to these questions are not the same and since the choice between these two approaches is not obvious, the extent to which profiling hospitals on mortality can be undertaken safely and reliably, without resolving these methodological issues, remains questionable. © The Author(s) 2012.
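For orientation, the conventional starting point this abstract criticises (a standardised mortality ratio from a logistic model with no hospital term) can be sketched as observed deaths divided by the sum of predicted risks. The coefficients, the single case-mix covariate, and the patient records below are hypothetical, and the sketch does not implement any of the four hierarchical methods the abstract compares.

```python
import math

def predicted_risk(intercept, coefficients, covariates):
    """Predicted probability of death from a logistic regression model."""
    linear = intercept + sum(b * x for b, x in zip(coefficients, covariates))
    return 1.0 / (1.0 + math.exp(-linear))

def standardised_mortality_ratio(deaths, risks):
    """Observed deaths divided by expected deaths (the sum of predicted risks)."""
    return sum(deaths) / sum(risks)

# Hypothetical patients at one hospital: (case-mix covariates, died?)
patients = [((0.0,), 0), ((1.0,), 1), ((2.0,), 0), ((3.0,), 1)]

# Hypothetical case-mix model: intercept -2.0, one covariate with coefficient 0.7.
risks = [predicted_risk(-2.0, (0.7,), x) for x, _ in patients]
smr = standardised_mortality_ratio([died for _, died in patients], risks)
print(smr)
```

A ratio above 1 means more deaths were observed than the case-mix model expected; as the abstract stresses, whether that reflects quality of care depends on unresolved methodological choices.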
Chan, Siew Foong; Deeks, Jonathan J; Macaskill, Petra; Irwig, Les
2008-01-01
To compare three predictive models based on logistic regression to estimate adjusted likelihood ratios allowing for interdependency between diagnostic variables (tests). This study was a review of the theoretical basis, assumptions, and limitations of published models; and a statistical extension of methods and application to a case study of the diagnosis of obstructive airways disease based on history and clinical examination. Albert's method includes an offset term to estimate an adjusted likelihood ratio for combinations of tests. The Spiegelhalter and Knill-Jones method uses the unadjusted likelihood ratio for each test as a predictor and computes shrinkage factors to allow for interdependence. Knottnerus' method differs from the other methods because it requires sequencing of tests, which limits its application to situations where there are few tests and substantial data. Although parameter estimates differed between the models, predicted "posttest" probabilities were generally similar. Construction of predictive models using logistic regression is preferred to the independence Bayes' approach when it is important to adjust for dependency of test errors. Methods to estimate adjusted likelihood ratios from predictive models should be considered in preference to a standard logistic regression model to facilitate ease of interpretation and application. Albert's method provides the most straightforward approach.
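The role of likelihood ratios in producing a posttest probability can be illustrated with the odds form of Bayes' theorem. The sketch below contrasts the independence Bayes' approach (multiplying unadjusted likelihood ratios) with a shrinkage-style adjustment in which each likelihood ratio is raised to a power below 1. The likelihood ratios, the pretest probability, and the shrinkage factors are hypothetical, and the adjustment shown is only in the spirit of the Spiegelhalter and Knill-Jones idea, not a reproduction of any method in the paper.

```python
def posttest_probability(pretest_prob, likelihood_ratios):
    """Convert a pretest probability to a posttest probability via odds."""
    odds = pretest_prob / (1.0 - pretest_prob)
    for lr in likelihood_ratios:
        odds *= lr
    return odds / (1.0 + odds)

def shrink(likelihood_ratios, shrinkage_factors):
    """Raise each likelihood ratio to its shrinkage factor (a weight on the log LR)."""
    return [lr ** s for lr, s in zip(likelihood_ratios, shrinkage_factors)]

# Hypothetical values for two positive, correlated test results.
pretest = 0.20
unadjusted_lrs = [4.0, 3.0]
shrinkage = [0.8, 0.6]   # below 1 because the tests carry overlapping information

naive = posttest_probability(pretest, unadjusted_lrs)
adjusted = posttest_probability(pretest, shrink(unadjusted_lrs, shrinkage))
print(naive, adjusted)
```

With these numbers the adjusted posttest probability is lower than the naive one, which is the direction expected when positive results from dependent tests are prevented from being counted as fully independent evidence.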
Cameron, Isobel M; Scott, Neil W; Adler, Mats; Reid, Ian C
2014-12-01
It is important for clinical practice and research that measurement scales of well-being and quality of life exhibit only minimal differential item functioning (DIF). DIF occurs where different groups of people endorse items in a scale to different extents after being matched by the intended scale attribute. We investigate the equivalence or otherwise of common methods of assessing DIF. Three methods of measuring age- and sex-related DIF (ordinal logistic regression, Rasch analysis and Mantel χ² procedure) were applied to Hospital Anxiety Depression Scale (HADS) data pertaining to a sample of 1,068 patients consulting primary care practitioners. Three items were flagged by all three approaches as having either age- or sex-related DIF with a consistent direction of effect; a further three items identified did not meet stricter criteria for important DIF using at least one method. When applying strict criteria for significant DIF, ordinal logistic regression was slightly less sensitive. Ordinal logistic regression, Rasch analysis and contingency table methods yielded consistent results when identifying DIF in the HADS depression and HADS anxiety scales. Regardless of methods applied, investigators should use a combination of statistical significance, magnitude of the DIF effect and investigator judgement when interpreting the results.
NASA Astrophysics Data System (ADS)
Cao, Faxian; Yang, Zhijing; Ren, Jinchang; Ling, Wing-Kuen; Zhao, Huimin; Marshall, Stephen
2017-12-01
Although the sparse multinomial logistic regression (SMLR) has provided a useful tool for sparse classification, it suffers from inefficacy in dealing with high dimensional features and manually set initial regressor values. This has significantly constrained its applications for hyperspectral image (HSI) classification. In order to tackle these two drawbacks, an extreme sparse multinomial logistic regression (ESMLR) is proposed for effective classification of HSI. First, the HSI dataset is projected to a new feature space with randomly generated weight and bias. Second, an optimization model is established by the Lagrange multiplier method and the dual principle to automatically determine a good initial regressor for SMLR via minimizing the training error and the regressor value. Furthermore, the extended multi-attribute profiles (EMAPs) are utilized for extracting both the spectral and spatial features. A combinational linear multiple features learning (MFL) method is proposed to further enhance the features extracted by ESMLR and EMAPs. Finally, the logistic regression via the variable splitting and the augmented Lagrangian (LORSAL) is adopted in the proposed framework for reducing the computational time. Experiments are conducted on two well-known HSI datasets, namely the Indian Pines dataset and the Pavia University dataset, which have shown the fast and robust performance of the proposed ESMLR framework.
Latin hypercube approach to estimate uncertainty in ground water vulnerability
Gurdak, J.J.; McCray, J.E.; Thyne, G.; Qi, S.L.
2007-01-01
A methodology is proposed to quantify prediction uncertainty associated with ground water vulnerability models that were developed through an approach that coupled multivariate logistic regression with a geographic information system (GIS). This method uses Latin hypercube sampling (LHS) to illustrate the propagation of input error and estimate uncertainty associated with the logistic regression predictions of ground water vulnerability. Central to the proposed method is the assumption that prediction uncertainty in ground water vulnerability models is a function of input error propagation from uncertainty in the estimated logistic regression model coefficients (model error) and the values of explanatory variables represented in the GIS (data error). Input probability distributions that represent both model and data error sources of uncertainty were simultaneously sampled using a Latin hypercube approach with logistic regression calculations of probability of elevated nonpoint source contaminants in ground water. The resulting probability distribution represents the prediction intervals and associated uncertainty of the ground water vulnerability predictions. The method is illustrated through a ground water vulnerability assessment of the High Plains regional aquifer. Results of the LHS simulations reveal significant prediction uncertainties that vary spatially across the regional aquifer. Additionally, the proposed method enables a spatial deconstruction of the prediction uncertainty that can lead to improved prediction of ground water vulnerability. © 2007 National Ground Water Association.
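The general idea of propagating input error through a logistic regression with Latin hypercube sampling can be sketched as below. This is an illustration under stated assumptions, not the authors' procedure: the single explanatory variable, the normal input distributions, and every numeric value are hypothetical, and the inputs are sampled as if independent.

```python
import math
import random
from statistics import NormalDist

def latin_hypercube(n_samples, n_dims, rng):
    """Return n_samples points in the unit cube with one point per stratum per dimension."""
    columns = []
    for _ in range(n_dims):
        # One uniform draw inside each of n_samples equal-width strata, then shuffled.
        column = [(i + rng.random()) / n_samples for i in range(n_samples)]
        rng.shuffle(column)
        columns.append(column)
    return [tuple(col[i] for col in columns) for i in range(n_samples)]

def logistic_probability(intercept, slope, x):
    """Probability predicted by a one-variable logistic regression."""
    return 1.0 / (1.0 + math.exp(-(intercept + slope * x)))

def prediction_interval(n_samples, seed=0):
    """Approximate 95% interval of the predicted probability at one location."""
    rng = random.Random(seed)
    # Hypothetical input distributions (mean, standard deviation).
    inputs = [
        NormalDist(-2.0, 0.3),  # intercept (model error)
        NormalDist(0.8, 0.1),   # slope (model error)
        NormalDist(1.5, 0.4),   # explanatory variable value (data error)
    ]
    probs = []
    for point in latin_hypercube(n_samples, len(inputs), rng):
        # Clamp away from 0 and 1 so the inverse CDF is always defined.
        u = [min(max(v, 1e-12), 1.0 - 1e-12) for v in point]
        b0, b1, x = (dist.inv_cdf(ui) for dist, ui in zip(inputs, u))
        probs.append(logistic_probability(b0, b1, x))
    probs.sort()
    return probs[int(0.025 * n_samples)], probs[int(0.975 * n_samples) - 1]

lower, upper = prediction_interval(1000)
print(lower, upper)
```

Stratifying each input into equal-probability bins is what distinguishes this from plain Monte Carlo sampling: every part of each input distribution is represented even at modest sample sizes. Repeating the calculation per grid cell would give the kind of spatially varying uncertainty the abstract describes.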