Sample records for calculated logistic regression

  1. Power and Sample Size Calculations for Logistic Regression Tests for Differential Item Functioning

    ERIC Educational Resources Information Center

    Li, Zhushan

    2014-01-01

    Logistic regression is a popular method for detecting uniform and nonuniform differential item functioning (DIF) effects. Theoretical formulas for the power and sample size calculations are derived for likelihood ratio tests and Wald tests based on the asymptotic distribution of the maximum likelihood estimators for the logistic regression model.…

  2. Sample size determination for logistic regression on a logit-normal distribution.

    PubMed

    Kim, Seongho; Heath, Elisabeth; Heilbrun, Lance

    2017-06-01

    Although the sample size for simple logistic regression can be readily determined using currently available methods, the sample size calculation for multiple logistic regression requires some additional information, such as the coefficient of determination ([Formula: see text]) of a covariate of interest with other covariates, which is often unavailable in practice. The response variable of logistic regression follows a logit-normal distribution which can be generated from a logistic transformation of a normal distribution. Using this property of logistic regression, we propose new methods of determining the sample size for simple and multiple logistic regressions using a normal transformation of outcome measures. Simulation studies and a motivating example show several advantages of the proposed methods over the existing methods: (i) no need for [Formula: see text] for multiple logistic regression, (ii) available interim or group-sequential designs, and (iii) much smaller required sample size.

  3. [Application of SAS macro to evaluated multiplicative and additive interaction in logistic and Cox regression in clinical practices].

    PubMed

    Nie, Z Q; Ou, Y Q; Zhuang, J; Qu, Y J; Mai, J Z; Chen, J M; Liu, X Q

    2016-05-01

    Conditional and unconditional logistic regression are commonly used in case-control studies, whereas the Cox proportional hazards model is often used in survival analysis. Most of the literature refers only to main-effect models; however, generalized linear models differ from general linear models, and interaction comprises both multiplicative and additive interaction. The former has only statistical significance, whereas the latter has biological significance. In this paper, macros were written in SAS 9.4 to calculate the contrast ratio, the attributable proportion due to interaction and the synergy index alongside the interaction terms of logistic and Cox regression, and Wald, delta and profile-likelihood confidence intervals were used to evaluate additive interaction, as a reference for big-data analysis in clinical epidemiology and for the analysis of genetic multiplicative and additive interactions.
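
    The three additive-interaction measures named in this abstract can be sketched in Python as follows. This is an illustration under stated assumptions, not the authors' SAS macro: the function and argument names are hypothetical, the model is assumed to contain two binary exposures plus their product term, and the formulas are the standard textbook definitions of the interaction contrast ratio (also written RERI), the attributable proportion (AP) and the synergy index (S). Confidence intervals are not covered.

```python
import math


def additive_interaction(b1: float, b2: float, b3: float) -> dict:
    """Additive-interaction measures from logistic regression coefficients.

    b1, b2: log odds ratios of the two binary exposures (hypothetical names).
    b3:     coefficient of their product term.
    """
    or10 = math.exp(b1)            # exposure 1 only
    or01 = math.exp(b2)            # exposure 2 only
    or11 = math.exp(b1 + b2 + b3)  # both exposures
    reri = or11 - or10 - or01 + 1.0
    ap = reri / or11
    # S is undefined when the two single-exposure excess odds ratios sum to zero.
    s = (or11 - 1.0) / ((or10 - 1.0) + (or01 - 1.0))
    return {"RERI": reri, "AP": ap, "S": s}


if __name__ == "__main__":
    # Illustrative coefficients only; no multiplicative interaction (b3 = 0).
    print(additive_interaction(math.log(2.0), math.log(3.0), 0.0))
```

    With a zero product term the odds ratios combine multiplicatively, yet RERI is still nonzero, which illustrates why the two scales can disagree.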

  4. Parameters Estimation of Geographically Weighted Ordinal Logistic Regression (GWOLR) Model

    NASA Astrophysics Data System (ADS)

    Zuhdi, Shaifudin; Retno Sari Saputro, Dewi; Widyaningsih, Purnami

    2017-06-01

    A regression model represents the relationship between independent variables and a dependent variable. In logistic regression the dependent variable is categorical and the model is used to calculate odds. When the categories of the dependent variable are ordered levels, the logistic regression model is ordinal. The GWOLR model is an ordinal logistic regression model influenced by the geographical location of the observation site. Parameter estimation is needed to infer population values from a sample. The purpose of this research is to estimate the parameters of the GWOLR model using R software. Parameter estimation uses data on the number of dengue fever patients in Semarang City. The observation units are 144 villages in Semarang City. The research yields a local GWOLR model for each village and the probabilities of the categories of the number of dengue fever patients.

  5. Epidemiologic programs for computers and calculators. A microcomputer program for multiple logistic regression by unconditional and conditional maximum likelihood methods.

    PubMed

    Campos-Filho, N; Franco, E L

    1989-02-01

    A frequent procedure in matched case-control studies is to report results from the multivariate unmatched analyses if they do not differ substantially from the ones obtained after conditioning on the matching variables. Although conceptually simple, this rule requires that an extensive series of logistic regression models be evaluated by both the conditional and unconditional maximum likelihood methods. Most computer programs for logistic regression employ only one maximum likelihood method, which requires that the analyses be performed in separate steps. This paper describes a Pascal microcomputer (IBM PC) program that performs multiple logistic regression by both maximum likelihood estimation methods, which obviates the need for switching between programs to obtain relative risk estimates from both matched and unmatched analyses. The program calculates most standard statistics and allows factoring of categorical or continuous variables by two distinct methods of contrast. A built-in, descriptive statistics option allows the user to inspect the distribution of cases and controls across categories of any given variable.

  6. An empirical study of statistical properties of variance partition coefficients for multi-level logistic regression models

    USGS Publications Warehouse

    Li, Ji; Gray, B.R.; Bates, D.M.

    2008-01-01

    Partitioning the variance of a response by design levels is challenging for binomial and other discrete outcomes. Goldstein (2003) proposed four definitions for variance partitioning coefficients (VPC) under a two-level logistic regression model. In this study, we explicitly derived formulae for multi-level logistic regression models and subsequently studied the distributional properties of the calculated VPCs. Using simulations and a vegetation dataset, we demonstrated associations between different VPC definitions, the importance of methods for estimating VPCs (by comparing VPCs obtained using Laplace and penalized quasilikelihood methods), and bivariate dependence between VPCs calculated at different levels. Such an empirical study lends immediate support to wider applications of VPC in scientific data analysis.
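
    As a rough illustration of one commonly used VPC definition, the latent-variable approach treats the level-1 residual as standard logistic with variance pi^2/3 and divides each higher-level variance by the total. The sketch below assumes the random-intercept variances have already been estimated; the function name is hypothetical and the code is not the authors' derivation.

```python
import math


def latent_vpc(level_variances):
    """Share of latent-scale variance attributed to each higher level.

    level_variances: estimated random-intercept variances for levels 2, 3, ...
    """
    residual = math.pi ** 2 / 3.0  # variance of the standard logistic distribution
    total = residual + sum(level_variances)
    return [v / total for v in level_variances]


if __name__ == "__main__":
    # Illustrative variances for a three-level model (made-up values).
    print(latent_vpc([0.8, 0.3]))
```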

  7. [Calculating Pearson residual in logistic regressions: a comparison between SPSS and SAS].

    PubMed

    Xu, Hao; Zhang, Tao; Li, Xiao-song; Liu, Yuan-yuan

    2015-01-01

    To compare the results of Pearson residual calculations in logistic regression models using SPSS and SAS. We reviewed Pearson residual calculation methods, and used two sets of data to test logistic models constructed by SPSS and STATA. One model contained a small number of covariates compared to the number of observations. The other contained a similar number of covariates as the number of observations. The two software packages produced similar Pearson residual estimates when the models contained a similar number of covariates as the number of observations, but the results differed when the number of observations was much greater than the number of covariates. The two software packages produce different results of Pearson residuals, especially when the models contain a small number of covariates. Further studies are warranted.
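
    For orientation, the textbook Pearson residual for a binomial observation can be sketched as below. The function name is hypothetical, and the sketch does not reproduce the internal computation of any particular package; it only shows that the value depends on whether data are treated as individual cases (trials = 1) or as grouped covariate patterns (trials > 1).

```python
import math


def pearson_residual(successes: float, trials: int, fitted_prob: float) -> float:
    """(observed - expected) / sqrt(binomial variance) for one observation or group."""
    expected = trials * fitted_prob
    variance = trials * fitted_prob * (1.0 - fitted_prob)
    return (successes - expected) / math.sqrt(variance)


if __name__ == "__main__":
    # The same fitted probability, viewed case-wise and as a group of 10 (made-up data).
    print(pearson_residual(1, 1, 0.3))
    print(pearson_residual(5, 10, 0.3))
```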

  8. Estimating interaction on an additive scale between continuous determinants in a logistic regression model.

    PubMed

    Knol, Mirjam J; van der Tweel, Ingeborg; Grobbee, Diederick E; Numans, Mattijs E; Geerlings, Mirjam I

    2007-10-01

    To determine the presence of interaction in epidemiologic research, typically a product term is added to the regression model. In linear regression, the regression coefficient of the product term reflects interaction as departure from additivity. However, in logistic regression it refers to interaction as departure from multiplicativity. Rothman has argued that interaction estimated as departure from additivity better reflects biologic interaction. So far, literature on estimating interaction on an additive scale using logistic regression only focused on dichotomous determinants. The objective of the present study was to provide the methods to estimate interaction between continuous determinants and to illustrate these methods with a clinical example. From the existing literature we derived the formulas to quantify interaction as departure from additivity between one continuous and one dichotomous determinant and between two continuous determinants using logistic regression. Bootstrapping was used to calculate the corresponding confidence intervals. To illustrate the theory with an empirical example, data from the Utrecht Health Project were used, with age and body mass index as risk factors for elevated diastolic blood pressure. The methods and formulas presented in this article are intended to assist epidemiologists to calculate interaction on an additive scale between two variables on a certain outcome. The proposed methods are included in a spreadsheet which is freely available at: http://www.juliuscenter.nl/additive-interaction.xls.

  9. A Bayesian goodness of fit test and semiparametric generalization of logistic regression with measurement data.

    PubMed

    Schörgendorfer, Angela; Branscum, Adam J; Hanson, Timothy E

    2013-06-01

    Logistic regression is a popular tool for risk analysis in medical and population health science. With continuous response data, it is common to create a dichotomous outcome for logistic regression analysis by specifying a threshold for positivity. Fitting a linear regression to the nondichotomized response variable assuming a logistic sampling model for the data has been empirically shown to yield more efficient estimates of odds ratios than ordinary logistic regression of the dichotomized endpoint. We illustrate that risk inference is not robust to departures from the parametric logistic distribution. Moreover, the model assumption of proportional odds is generally not satisfied when the condition of a logistic distribution for the data is violated, leading to biased inference from a parametric logistic analysis. We develop novel Bayesian semiparametric methodology for testing goodness of fit of parametric logistic regression with continuous measurement data. The testing procedures hold for any cutoff threshold and our approach simultaneously provides the ability to perform semiparametric risk estimation. Bayes factors are calculated using the Savage-Dickey ratio for testing the null hypothesis of logistic regression versus a semiparametric generalization. We propose a fully Bayesian and a computationally efficient empirical Bayesian approach to testing, and we present methods for semiparametric estimation of risks, relative risks, and odds ratios when parametric logistic regression fails. Theoretical results establish the consistency of the empirical Bayes test. Results from simulated data show that the proposed approach provides accurate inference irrespective of whether parametric assumptions hold or not. Evaluation of risk factors for obesity shows that different inferences are derived from an analysis of a real data set when deviations from a logistic distribution are permissible in a flexible semiparametric framework. 
© 2013, The International Biometric Society.

  10. A simple approach to power and sample size calculations in logistic regression and Cox regression models.

    PubMed

    Vaeth, Michael; Skovlund, Eva

    2004-06-15

    For a given regression problem it is possible to identify a suitably defined equivalent two-sample problem such that the power or sample size obtained for the two-sample problem also applies to the regression problem. For a standard linear regression model the equivalent two-sample problem is easily identified, but for generalized linear models and for Cox regression models the situation is more complicated. An approximately equivalent two-sample problem may, however, also be identified here. In particular, we show that for logistic regression and Cox regression models the equivalent two-sample problem is obtained by selecting two equally sized samples for which the parameters differ by a value equal to the slope times twice the standard deviation of the independent variable and further requiring that the overall expected number of events is unchanged. In a simulation study we examine the validity of this approach to power calculations in logistic regression and Cox regression models. Several different covariate distributions are considered for selected values of the overall response probability and a range of alternatives. For the Cox regression model we consider both constant and non-constant hazard rates. The results show that in general the approach is remarkably accurate even in relatively small samples. Some discrepancies are, however, found in small samples with few events and a highly skewed covariate distribution. Comparison with results based on alternative methods for logistic regression models with a single continuous covariate indicates that the proposed method is at least as good as its competitors. The method is easy to implement and therefore provides a simple way to extend the range of problems that can be covered by the usual formulas for power and sample size determination. Copyright 2004 John Wiley & Sons, Ltd.
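
    The equivalent two-sample construction described in this abstract can be sketched for logistic regression as follows. Several points are assumptions rather than the authors' stated procedure: the "parameters" are taken to be log odds, the two group probabilities are found by bisection so that their mean equals the overall response probability, and the final step uses a common normal-approximation formula for comparing two proportions. All function names are hypothetical.

```python
import math
from statistics import NormalDist


def expit(x: float) -> float:
    return 1.0 / (1.0 + math.exp(-x))


def logit(p: float) -> float:
    return math.log(p / (1.0 - p))


def equivalent_two_sample(p_bar: float, beta: float, sd: float):
    """Find (p1, p2) whose log odds differ by 2*beta*sd and whose mean is p_bar."""
    d = 2.0 * beta * sd
    lo, hi = 1e-12, 1.0 - 1e-12
    for _ in range(200):  # bisection on p1; the group mean is increasing in p1
        mid = (lo + hi) / 2.0
        if (mid + expit(logit(mid) + d)) / 2.0 < p_bar:
            lo = mid
        else:
            hi = mid
    p1 = (lo + hi) / 2.0
    return p1, expit(logit(p1) + d)


def total_sample_size(p_bar, beta, sd, alpha=0.05, power=0.80) -> int:
    """Total n from a standard two-proportion formula (assumed here); beta must be nonzero."""
    p1, p2 = equivalent_two_sample(p_bar, beta, sd)
    z_a = NormalDist().inv_cdf(1.0 - alpha / 2.0)
    z_b = NormalDist().inv_cdf(power)
    num = (z_a * math.sqrt(2.0 * p_bar * (1.0 - p_bar))
           + z_b * math.sqrt(p1 * (1.0 - p1) + p2 * (1.0 - p2)))
    n_per_group = (num / (p2 - p1)) ** 2
    return 2 * math.ceil(n_per_group)


if __name__ == "__main__":
    # Illustrative inputs: 30% overall response, slope 0.4 per unit, covariate SD 1.
    print(total_sample_size(0.30, 0.4, 1.0))
```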

  11. Upgrade Summer Severe Weather Tool

    NASA Technical Reports Server (NTRS)

    Watson, Leela

    2011-01-01

    The goal of this task was to upgrade the existing severe weather database by adding observations from the 2010 warm season, update the verification dataset with results from the 2010 warm season, use statistical logistic regression analysis on the database and develop a new forecast tool. The AMU analyzed 7 stability parameters that showed the possibility of providing guidance in forecasting severe weather, calculated verification statistics for the Total Threat Score (TTS), and calculated warm season verification statistics for the 2010 season. The AMU also performed statistical logistic regression analysis on the 22-year severe weather database. The results indicated that the logistic regression equation did not show an increase in skill over the previously developed TTS. The equation showed less accuracy than TTS at predicting severe weather, little ability to distinguish between severe and non-severe weather days, and worse standard categorical accuracy measures and skill scores over TTS.

  12. Power and sample size for multivariate logistic modeling of unmatched case-control studies.

    PubMed

    Gail, Mitchell H; Haneuse, Sebastien

    2017-01-01

    Sample size calculations are needed to design and assess the feasibility of case-control studies. Although such calculations are readily available for simple case-control designs and univariate analyses, there is limited theory and software for multivariate unconditional logistic analysis of case-control data. Here we outline the theory needed to detect scalar exposure effects or scalar interactions while controlling for other covariates in logistic regression. Both analytical and simulation methods are presented, together with links to the corresponding software.

  13. Determining factors influencing survival of breast cancer by fuzzy logistic regression model.

    PubMed

    Nikbakht, Roya; Bahrampour, Abbas

    2017-01-01

    The fuzzy logistic regression model can be used to determine the influential factors of a disease. This study explores the important predictive factors for survival of patients with breast cancer. We used breast cancer data collected by the cancer registry of Kerman University of Medical Sciences during 2000-2007. Variables such as morphology, grade, age, and treatments (surgery, radiotherapy, and chemotherapy) were entered into the fuzzy logistic regression model. Model performance was assessed by the mean degree of membership (MDM). The results showed that almost 41% of patients were in the neoplasm and malignant group and more than two-thirds of them were still alive after 5-year follow-up. Based on the fuzzy logistic model, the most important factors influencing survival were, in order, chemotherapy, morphology, and radiotherapy. The MDM criterion shows that the fuzzy logistic regression fits the data well (MDM = 0.86). The model showed that chemotherapy is more important than radiotherapy for survival of patients with breast cancer. In addition, the model can calculate possibilistic odds of survival in cancer patients. The results of this study can be applied in clinical research. Few studies have applied fuzzy logistic models, and we recommend using this model in various research areas.

  14. Remote sensing and GIS-based landslide hazard analysis and cross-validation using multivariate logistic regression model on three test areas in Malaysia

    NASA Astrophysics Data System (ADS)

    Pradhan, Biswajeet

    2010-05-01

    This paper presents the results of the cross-validation of a multivariate logistic regression model using remote sensing data and GIS for landslide hazard analysis on the Penang, Cameron, and Selangor areas in Malaysia. Landslide locations in the study areas were identified by interpreting aerial photographs and satellite images, supported by field surveys. SPOT 5 and Landsat TM satellite imagery were used to map landcover and vegetation index, respectively. Maps of topography, soil type, lineaments and land cover were constructed from the spatial datasets. Ten factors which influence landslide occurrence, i.e., slope, aspect, curvature, distance from drainage, lithology, distance from lineaments, soil type, landcover, rainfall precipitation, and normalized difference vegetation index (NDVI), were extracted from the spatial database and the logistic regression coefficient of each factor was computed. Then the landslide hazard was analysed using the multivariate logistic regression coefficients derived not only from the data for the respective area but also using the logistic regression coefficients calculated from each of the other two areas (nine hazard maps in all) as a cross-validation of the model. For verification of the model, the results of the analyses were then compared with the field-verified landslide locations. Among the three cases of the application of logistic regression coefficient in the same study area, the case of Selangor based on the Selangor logistic regression coefficients showed the highest accuracy (94%), whereas Penang based on the Penang coefficients showed the lowest accuracy (86%). Similarly, among the six cases from the cross application of logistic regression coefficient in other two areas, the case of Selangor based on logistic coefficient of Cameron showed highest (90%) prediction accuracy whereas the case of Penang based on the Selangor logistic regression coefficients showed the lowest accuracy (79%). Qualitatively, the cross application model yields reasonable results which can be used for preliminary landslide hazard mapping.

  15. Modelling of binary logistic regression for obesity among secondary students in a rural area of Kedah

    NASA Astrophysics Data System (ADS)

    Kamaruddin, Ainur Amira; Ali, Zalila; Noor, Norlida Mohd.; Baharum, Adam; Ahmad, Wan Muhamad Amir W.

    2014-07-01

    Logistic regression analysis examines the influence of various factors on a dichotomous outcome by estimating the probability of the event's occurrence. Logistic regression, also called a logit model, is a statistical procedure used to model dichotomous outcomes. In the logit model the log odds of the dichotomous outcome is modeled as a linear combination of the predictor variables. The log odds ratio in logistic regression provides a description of the probabilistic relationship of the variables and the outcome. In conducting logistic regression, selection procedures are used in selecting important predictor variables, diagnostics are used to check that assumptions are valid which include independence of errors, linearity in the logit for continuous variables, absence of multicollinearity, and lack of strongly influential outliers and a test statistic is calculated to determine the aptness of the model. This study used the binary logistic regression model to investigate overweight and obesity among rural secondary school students on the basis of their demographics profile, medical history, diet and lifestyle. The results indicate that overweight and obesity of students are influenced by obesity in family and the interaction between a student's ethnicity and routine meals intake. The odds of a student being overweight and obese are higher for a student having a family history of obesity and for a non-Malay student who frequently takes routine meals as compared to a Malay student.

  16. Hierarchical Bayesian Logistic Regression to forecast metabolic control in type 2 DM patients.

    PubMed

    Dagliati, Arianna; Malovini, Alberto; Decata, Pasquale; Cogni, Giulia; Teliti, Marsida; Sacchi, Lucia; Cerra, Carlo; Chiovato, Luca; Bellazzi, Riccardo

    2016-01-01

    In this work we present our efforts in building a model able to forecast patients' changes in clinical conditions when repeated measurements are available. In this case the available risk calculators are typically not applicable. We propose a Hierarchical Bayesian Logistic Regression model, which allows taking into account individual and population variability in model parameters estimate. The model is used to predict metabolic control and its variation in type 2 diabetes mellitus. In particular we have analyzed a population of more than 1000 Italian type 2 diabetic patients, collected within the European project Mosaic. The results obtained in terms of Matthews Correlation Coefficient are significantly better than the ones gathered with standard logistic regression model, based on data pooling.

  17. Filtering data from the collaborative initial glaucoma treatment study for improved identification of glaucoma progression.

    PubMed

    Schell, Greggory J; Lavieri, Mariel S; Stein, Joshua D; Musch, David C

    2013-12-21

    Open-angle glaucoma (OAG) is a prevalent, degenerative ocular disease which can lead to blindness without proper clinical management. The tests used to assess disease progression are susceptible to process and measurement noise. The aim of this study was to develop a methodology which accounts for the inherent noise in the data and improves identification of significant disease progression. Longitudinal observations from the Collaborative Initial Glaucoma Treatment Study (CIGTS) were used to parameterize and validate a Kalman filter model and logistic regression function. The Kalman filter estimates the true value of biomarkers associated with OAG and forecasts future values of these variables. We develop two logistic regression models via generalized estimating equations (GEE) for calculating the probability of experiencing significant OAG progression: one model based on the raw measurements from CIGTS and another model based on the Kalman filter estimates of the CIGTS data. Receiver operating characteristic (ROC) curves and associated area under the ROC curve (AUC) estimates are calculated using cross-fold validation. The logistic regression model developed using Kalman filter estimates as data input achieves higher sensitivity and specificity than the model developed using raw measurements. The mean AUC for the Kalman filter-based model is 0.961 while the mean AUC for the raw measurements model is 0.889. Hence, using the probability function generated via Kalman filter estimates and GEE for logistic regression, we are able to more accurately classify patients and instances as experiencing significant OAG progression. A Kalman filter approach for estimating the true value of OAG biomarkers resulted in data input which improved the accuracy of a logistic regression classification model compared to a model using raw measurements as input. This methodology accounts for process and measurement noise to enable improved discrimination between progression and nonprogression in chronic diseases.

  18. Regression analysis for solving diagnosis problem of children's health

    NASA Astrophysics Data System (ADS)

    Cherkashina, Yu A.; Gerget, O. M.

    2016-04-01

    The paper presents research on the application of statistical techniques, namely regression analysis, to assess the health status of children in the neonatal period based on medical data (hemostatic parameters, blood test parameters, gestational age, vascular-endothelial growth factor) measured at 3-5 days of life. A detailed description of the studied medical data is given, and a binary logistic regression procedure is discussed. The basic results of the research are presented. A classification table of predicted and observed values is shown, and the overall percentage of correct recognition is determined. Regression equation coefficients are calculated, and the general regression equation is written based on them. Based on the results of logistic regression, ROC analysis was performed: the sensitivity and specificity of the model are calculated and ROC curves are constructed. These mathematical techniques allow diagnostics of children's health with a high quality of recognition. The results make a significant contribution to the development of evidence-based medicine and have high practical importance in the professional activity of the author.

  19. A reconnaissance method for delineation of tracts for regional-scale mineral-resource assessment based on geologic-map data

    USGS Publications Warehouse

    Raines, G.L.; Mihalasky, M.J.

    2002-01-01

    The U.S. Geological Survey (USGS) is proposing to conduct a global mineral-resource assessment using geologic maps, significant deposits, and exploration history as minimal data requirements. Using a geologic map and locations of significant pluton-related deposits, the pluton-related-deposit tract maps from the USGS national mineral-resource assessment have been reproduced with GIS-based analysis and modeling techniques. Agreement, kappa, and Jaccard's C correlation statistics between the expert USGS and calculated tract maps of 87%, 40%, and 28%, respectively, have been achieved using a combination of weights-of-evidence and weighted logistic regression methods. Between the experts' and calculated maps, the ranking of states measured by total permissive area correlates at 84%. The disagreement between the experts and calculated results can be explained primarily by tracts defined by geophysical evidence not considered in the calculations, generalization of tracts by the experts, differences in map scales, and the experts' inclusion of large tracts that are arguably not permissive. This analysis shows that tracts for regional mineral-resource assessment approximating those delineated by USGS experts can be calculated using weights of evidence and weighted logistic regression, a geologic map, and the location of significant deposits. Weights of evidence and weighted logistic regression applied to a global geologic map could quickly provide a useful reconnaissance definition of tracts for mineral assessment that is tied to the data and is reproducible. © 2002 International Association for Mathematical Geology.

  20. Predictive landslide susceptibility mapping using spatial information in the Pechabun area of Thailand

    NASA Astrophysics Data System (ADS)

    Oh, Hyun-Joo; Lee, Saro; Chotikasathien, Wisut; Kim, Chang Hwan; Kwon, Ju Hyoung

    2009-04-01

    For predictive landslide susceptibility mapping, this study applied and verified a probability model (the frequency ratio) and a statistical model (logistic regression) at Pechabun, Thailand, using a geographic information system (GIS) and remote sensing. Landslide locations were identified in the study area from interpretation of aerial photographs and field surveys, and maps of the topography, geology and land cover were compiled into a spatial database. The factors that influence landslide occurrence, such as slope gradient, slope aspect, curvature of topography and distance from drainage, were calculated from the topographic database. Lithology and distance from fault were extracted and calculated from the geology database. Land cover was classified from a Landsat TM satellite image. The frequency ratio and logistic regression coefficient were overlaid as each factor's ratings for landslide susceptibility mapping. The landslide susceptibility map was then verified and compared using the existing landslide locations. In the verification, the frequency ratio model showed 76.39% and the logistic regression model showed 70.42% prediction accuracy. The method can be used to reduce hazards associated with landslides and to plan land cover.
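
    The frequency ratio mentioned in this abstract is conventionally the share of landslide cells falling in a factor class divided by the share of the study area occupied by that class. A minimal sketch under that assumption follows; the function name and the cell counts are made up for illustration and are not taken from the study.

```python
def frequency_ratio(landslide_cells: dict, class_cells: dict) -> dict:
    """Frequency ratio per factor class.

    landslide_cells: class label -> number of landslide cells in that class.
    class_cells:     class label -> total number of cells in that class.
    """
    total_landslides = sum(landslide_cells.values())
    total_cells = sum(class_cells.values())
    ratios = {}
    for label, n_cells in class_cells.items():
        landslide_share = landslide_cells.get(label, 0) / total_landslides
        area_share = n_cells / total_cells
        ratios[label] = landslide_share / area_share
    return ratios


if __name__ == "__main__":
    # Hypothetical slope classes; a ratio above 1 marks a class over-represented among landslides.
    print(frequency_ratio({"steep": 30, "flat": 10}, {"steep": 200, "flat": 800}))
```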

  1. [Logistic regression model of noninvasive prediction for portal hypertensive gastropathy in patients with hepatitis B associated cirrhosis].

    PubMed

    Wang, Qingliang; Li, Xiaojie; Hu, Kunpeng; Zhao, Kun; Yang, Peisheng; Liu, Bo

    2015-05-12

    To explore the risk factors of portal hypertensive gastropathy (PHG) in patients with hepatitis B associated cirrhosis and establish a Logistic regression model of noninvasive prediction. The clinical data of 234 hospitalized patients with hepatitis B associated cirrhosis from March 2012 to March 2014 were analyzed retrospectively. The dependent variable was the occurrence of PHG while the independent variables were screened by binary Logistic analysis. Multivariate Logistic regression was used for further analysis of significant noninvasive independent variables. Logistic regression model was established and odds ratio was calculated for each factor. The accuracy, sensitivity and specificity of model were evaluated by the curve of receiver operating characteristic (ROC). According to univariate Logistic regression, the risk factors included hepatic dysfunction, albumin (ALB), bilirubin (TB), prothrombin time (PT), platelet (PLT), white blood cell (WBC), portal vein diameter, spleen index, splenic vein diameter, diameter ratio, PLT to spleen volume ratio, esophageal varices (EV) and gastric varices (GV). Multivariate analysis showed that hepatic dysfunction (X1), TB (X2), PLT (X3) and splenic vein diameter (X4) were the major occurring factors for PHG. The established regression model was Logit P=-2.667+2.186X1-2.167X2+0.725X3+0.976X4. The accuracy of model for PHG was 79.1% with a sensitivity of 77.2% and a specificity of 80.8%. Hepatic dysfunction, TB, PLT and splenic vein diameter are risk factors for PHG and the noninvasive predicted Logistic regression model was Logit P=-2.667+2.186X1-2.167X2+0.725X3+0.976X4.
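
    The reported equation can be turned into a predicted probability by inverting the logit. The sketch below uses the coefficients exactly as printed in the abstract; how X1 to X4 are coded (units, categories) is not stated there, so the inputs are assumed to be already coded as in the original model, and the function name is hypothetical.

```python
import math

# Intercept and coefficients for X1..X4 as reported in the abstract.
INTERCEPT = -2.667
COEFFICIENTS = (2.186, -2.167, 0.725, 0.976)


def phg_probability(x1: float, x2: float, x3: float, x4: float) -> float:
    """Predicted probability from Logit P = intercept + sum(coefficient * X)."""
    logit_p = INTERCEPT + sum(b * x for b, x in zip(COEFFICIENTS, (x1, x2, x3, x4)))
    return 1.0 / (1.0 + math.exp(-logit_p))


if __name__ == "__main__":
    # Illustrative input only: X1 = 1, all other predictors at 0.
    print(phg_probability(1, 0, 0, 0))
```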

  2. Binary logistic regression-Instrument for assessing museum indoor air impact on exhibits.

    PubMed

    Bucur, Elena; Danet, Andrei Florin; Lehr, Carol Blaziu; Lehr, Elena; Nita-Lazar, Mihai

    2017-04-01

    This paper presents a new way to assess the environmental impact on historical artifacts using binary logistic regression. The prediction of the impact on the exhibits during certain pollution scenarios (environmental impact) was calculated by a mathematical model based on the binary logistic regression; it allows the identification of those environmental parameters from a multitude of possible parameters with a significant impact on exhibitions and ranks them according to their severity effect. Air quality (NO2, SO2, O3 and PM2.5) and microclimate parameters (temperature, humidity) monitoring data from a case study conducted within exhibition and storage spaces of the Romanian National Aviation Museum Bucharest have been used for developing and validating the binary logistic regression method and the mathematical model. The logistic regression analysis was used on 794 data combinations (715 to develop the model and 79 to validate it) by a Statistical Package for Social Sciences (SPSS 20.0). The results from the binary logistic regression analysis demonstrated that from six parameters taken into consideration, four of them present a significant effect upon exhibits in the following order: O3 > PM2.5 > NO2 > humidity, followed at a significant distance by the effects of SO2 and temperature. The mathematical model, developed in this study, correctly predicted 95.1% of the cumulated effect of the environmental parameters upon the exhibits. Moreover, this model could also be used in the decisional process regarding the preventive preservation measures that should be implemented within the exhibition space. The mathematical model developed on the environmental parameters analyzed by the binary logistic regression method could be useful in a decision-making process establishing the best measures for pollution reduction and preventive preservation of exhibits.

  3. The Effect of Latent Binary Variables on the Uncertainty of the Prediction of a Dichotomous Outcome Using Logistic Regression Based Propensity Score Matching.

    PubMed

    Szekér, Szabolcs; Vathy-Fogarassy, Ágnes

    2018-01-01

    Logistic regression based propensity score matching is a widely used method in case-control studies to select the individuals of the control group. This method creates a suitable control group if all factors affecting the output variable are known. However, if relevant latent variables exist as well, which are not taken into account during the calculations, the quality of the control group is uncertain. In this paper, we present a statistics-based study in which we try to determine the relationship between the accuracy of the logistic regression model and the uncertainty of the dependent variable of the control group defined by propensity score matching. Our analyses show that there is a linear correlation between the fit of the logistic regression model and the uncertainty of the output variable. In certain cases, a latent binary explanatory variable can result in a relative error of up to 70% in the prediction of the outcome variable. The observed phenomenon calls the attention of analysts to an important point, which must be taken into account when drawing conclusions.
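
For readers unfamiliar with the matching step, here is a rough sketch of greedy nearest-neighbour matching on propensity scores. It assumes the scores have already been estimated by a logistic regression (not shown); the function name, the caliper of 0.05 and the example scores are illustrative and not taken from the paper.

```python
def greedy_match(treated, controls, caliper=0.05):
    """Match each treated score to the nearest unused control score.

    Returns {treated_index: control_index}. Treated units with no control
    within the caliper stay unmatched.
    """
    available = set(range(len(controls)))
    matches = {}
    for i, score in enumerate(treated):
        if not available:
            break
        # Nearest remaining control by absolute score difference.
        j = min(available, key=lambda k: abs(controls[k] - score))
        if abs(controls[j] - score) <= caliper:
            matches[i] = j
            available.remove(j)
    return matches


# Hypothetical propensity scores.
matches = greedy_match([0.30, 0.70], [0.10, 0.32, 0.69, 0.95])
```

If a relevant variable is missing from the score model, two units with near-identical scores can still differ systematically, which is the source of uncertainty the abstract describes.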

  4. Updated logistic regression equations for the calculation of post-fire debris-flow likelihood in the western United States

    USGS Publications Warehouse

    Staley, Dennis M.; Negri, Jacquelyn A.; Kean, Jason W.; Laber, Jayme L.; Tillery, Anne C.; Youberg, Ann M.

    2016-06-30

    Wildfire can significantly alter the hydrologic response of a watershed to the extent that even modest rainstorms can generate dangerous flash floods and debris flows. To reduce public exposure to hazard, the U.S. Geological Survey produces post-fire debris-flow hazard assessments for select fires in the western United States. We use publicly available geospatial data describing basin morphology, burn severity, soil properties, and rainfall characteristics to estimate the statistical likelihood that debris flows will occur in response to a storm of a given rainfall intensity. Using an empirical database and refined geospatial analysis methods, we defined new equations for the prediction of debris-flow likelihood using logistic regression methods. We showed that the new logistic regression model outperformed previous models used to predict debris-flow likelihood.

  5. The alarming problems of confounding equivalence using logistic regression models in the perspective of causal diagrams.

    PubMed

    Yu, Yuanyuan; Li, Hongkai; Sun, Xiaoru; Su, Ping; Wang, Tingting; Liu, Yi; Yuan, Zhongshang; Liu, Yanxun; Xue, Fuzhong

    2017-12-28

    Confounders can produce spurious associations between exposure and outcome in observational studies. For the majority of epidemiologists, adjusting for confounders using a logistic regression model is the habitual method, though it has some problems in accuracy and precision. It is, therefore, important to highlight the problems of logistic regression and search for an alternative method. Four causal diagram models were defined to summarize confounding equivalence. Both theoretical proofs and simulation studies were performed to verify whether conditioning on different confounding equivalence sets had the same bias-reducing potential and then to select the optimum adjusting strategy, in which logistic regression model and inverse probability weighting based marginal structural model (IPW-based-MSM) were compared. The "do-calculus" was used to calculate the true causal effect of exposure on outcome, then the bias and standard error were used to evaluate the performances of different strategies. Adjusting for different sets of confounding equivalence, as judged by identical Markov boundaries, produced different bias-reducing potential in the logistic regression model. For the sets that satisfied G-admissibility, adjusting for the set including all the confounders reduced the equivalent bias to the one containing the parent nodes of the outcome, while the bias after adjusting for the parent nodes of exposure was not equivalent to them. In addition, all causal effect estimations through logistic regression were biased, although the estimation after adjusting for the parent nodes of exposure was nearest to the true causal effect. However, conditioning on different confounding equivalence sets had the same bias-reducing potential under IPW-based-MSM. 
Compared with logistic regression, the IPW-based-MSM could obtain unbiased causal effect estimation when the adjusted confounders satisfied G-admissibility and the optimal strategy was to adjust for the parent nodes of outcome, which obtained the highest precision. All adjustment strategies through logistic regression were biased for causal effect estimation, while IPW-based-MSM could always obtain unbiased estimation when the adjusted set satisfied G-admissibility. Thus, IPW-based-MSM was recommended to adjust for confounders set.

  6. Prediction of spatially explicit rainfall intensity-duration thresholds for post-fire debris-flow generation in the western United States

    NASA Astrophysics Data System (ADS)

    Staley, Dennis; Negri, Jacquelyn; Kean, Jason

    2016-04-01

    Population expansion into fire-prone steeplands has resulted in an increase in post-fire debris-flow risk in the western United States. Logistic regression methods for determining debris-flow likelihood and the calculation of empirical rainfall intensity-duration thresholds for debris-flow initiation represent two common approaches for characterizing hazard and reducing risk. Logistic regression models are currently being used to rapidly assess debris-flow hazard in response to design storms of known intensities (e.g. a 10-year recurrence interval rainstorm). Empirical rainfall intensity-duration thresholds comprise a major component of the United States Geological Survey (USGS) and the National Weather Service (NWS) debris-flow early warning system at a regional scale in southern California. However, these two modeling approaches remain independent, with each approach having limitations that do not allow for synergistic local-scale (e.g. drainage-basin scale) characterization of debris-flow hazard during intense rainfall. The current logistic regression equations consider rainfall a unique independent variable, which prevents the direct calculation of the relation between rainfall intensity and debris-flow likelihood. Regional (e.g. mountain range or physiographic province scale) rainfall intensity-duration thresholds fail to provide insight into the basin-scale variability of post-fire debris-flow hazard and require an extensive database of historical debris-flow occurrence and rainfall characteristics. Here, we present a new approach that combines traditional logistic regression and intensity-duration threshold methodologies. 
This method allows for local characterization of both the likelihood that a debris-flow will occur at a given rainfall intensity, the direct calculation of the rainfall rates that will result in a given likelihood, and the ability to calculate spatially explicit rainfall intensity-duration thresholds for debris-flow generation in recently burned areas. Our approach synthesizes the two methods by incorporating measured rainfall intensity into each model variable (based on measures of topographic steepness, burn severity and surface properties) within the logistic regression equation. This approach provides a more realistic representation of the relation between rainfall intensity and debris-flow likelihood, as likelihood values asymptotically approach zero when rainfall intensity approaches 0 mm/h, and increase with more intense rainfall. Model performance was evaluated by comparing predictions to several existing regional thresholds. The model, based upon training data collected in southern California, USA, has proven to accurately predict rainfall intensity-duration thresholds for other areas in the western United States not included in the original training dataset. In addition, the improved logistic regression model shows promise for emergency planning purposes and real-time, site-specific early warning. With further validation, this model may permit the prediction of spatially-explicit intensity-duration thresholds for debris-flow generation in areas where empirically derived regional thresholds do not exist. This improvement would permit the expansion of the early-warning system into other regions susceptible to post-fire debris flow.
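
The idea of multiplying every model variable by rainfall intensity, and then inverting the equation to obtain a threshold, can be sketched as follows. The functional form below is an assumption based on the abstract's description, and the coefficients and basin values are hypothetical placeholders, not the published ones.

```python
import math


def likelihood(rain, b0, betas, xs):
    """Debris-flow likelihood when rainfall intensity scales every predictor."""
    z = b0 + rain * sum(b * x for b, x in zip(betas, xs))
    return 1.0 / (1.0 + math.exp(-z))


def rainfall_threshold(p, b0, betas, xs):
    """Rainfall intensity at which the modelled likelihood equals p."""
    slope = sum(b * x for b, x in zip(betas, xs))
    if slope <= 0:
        raise ValueError("likelihood does not increase with rainfall")
    return (math.log(p / (1.0 - p)) - b0) / slope


# Hypothetical coefficients and basin characteristics.
B0, BETAS, XS = -3.0, (0.4, 0.6, 0.7), (0.5, 0.3, 0.2)
threshold = rainfall_threshold(0.5, B0, BETAS, XS)
check = likelihood(threshold, B0, BETAS, XS)
```

Because rainfall enters the linear predictor as a single multiplicative factor, solving for it needs only the logit of the target likelihood, which is what makes basin-specific thresholds direct to compute.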

  7. Evaluating the perennial stream using logistic regression in central Taiwan

    NASA Astrophysics Data System (ADS)

    Ruljigaljig, T.; Cheng, Y. S.; Lin, H. I.; Lee, C. H.; Yu, T. T.

    2014-12-01

    This study produces a perennial stream head potential map, based on a logistic regression method with a Geographic Information System (GIS). Perennial stream initiation locations, which indicate the location of the groundwater and surface contact, were identified in the study area from field survey. The perennial stream potential map in central Taiwan was constructed using the relationship between perennial streams and their causative factors, such as catchment area, slope gradient, aspect, elevation, groundwater recharge and precipitation. Here, 272 streams were surveyed in the field within the study area. The area under the curve for the logistic regression method was calculated as 0.87. The results illustrate the importance of catchment area and groundwater recharge as key factors within the model. The results obtained from the model within the GIS were then used to produce a map of perennial streams and estimate the location of perennial stream heads.

  8. A Method for Calculating the Probability of Successfully Completing a Rocket Propulsion Ground Test

    NASA Technical Reports Server (NTRS)

    Messer, Bradley

    2007-01-01

    Propulsion ground test facilities face the daily challenge of scheduling multiple customers into limited facility space and successfully completing their propulsion test projects. Over the last decade NASA's propulsion test facilities have performed hundreds of tests, collected thousands of seconds of test data, and exceeded the capabilities of numerous test facility and test article components. A logistic regression mathematical modeling technique has been developed to predict the probability of successfully completing a rocket propulsion test. A logistic regression model is a mathematical modeling approach that can be used to describe the relationship of several independent predictor variables X_1, X_2, ..., X_k to a binary or dichotomous dependent variable Y, where Y can only be one of two possible outcomes, in this case Success or Failure of accomplishing a full duration test. The use of logistic regression modeling is not new; however, modeling propulsion ground test facilities using logistic regression is both a new and unique application of the statistical technique. Results from this type of model provide project managers with insight and confidence into the effectiveness of rocket propulsion ground testing.

  9. Feature Clustering for Accelerating Parallel Coordinate Descent

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Scherrer, Chad; Tewari, Ambuj; Halappanavar, Mahantesh

    2012-12-06

    We demonstrate an approach for accelerating calculation of the regularization path for L1 sparse logistic regression problems. We show the benefit of feature clustering as a preconditioning step for parallel block-greedy coordinate descent algorithms.

  10. Latin hypercube approach to estimate uncertainty in ground water vulnerability

    USGS Publications Warehouse

    Gurdak, J.J.; McCray, J.E.; Thyne, G.; Qi, S.L.

    2007-01-01

    A methodology is proposed to quantify prediction uncertainty associated with ground water vulnerability models that were developed through an approach that coupled multivariate logistic regression with a geographic information system (GIS). This method uses Latin hypercube sampling (LHS) to illustrate the propagation of input error and estimate uncertainty associated with the logistic regression predictions of ground water vulnerability. Central to the proposed method is the assumption that prediction uncertainty in ground water vulnerability models is a function of input error propagation from uncertainty in the estimated logistic regression model coefficients (model error) and the values of explanatory variables represented in the GIS (data error). Input probability distributions that represent both model and data error sources of uncertainty were simultaneously sampled using a Latin hypercube approach with logistic regression calculations of probability of elevated nonpoint source contaminants in ground water. The resulting probability distribution represents the prediction intervals and associated uncertainty of the ground water vulnerability predictions. The method is illustrated through a ground water vulnerability assessment of the High Plains regional aquifer. Results of the LHS simulations reveal significant prediction uncertainties that vary spatially across the regional aquifer. Additionally, the proposed method enables a spatial deconstruction of the prediction uncertainty that can lead to improved prediction of ground water vulnerability. © 2007 National Ground Water Association.
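
A simplified sketch of the sampling scheme described above: coefficients (model error) and an explanatory variable (data error) are drawn together by Latin hypercube sampling and pushed through a one-variable logistic model. The independent normal input distributions, their means and standard deviations, and the single predictor are all assumptions made for illustration.

```python
import math
import random
from statistics import NormalDist


def latin_hypercube(n, d, rng):
    """Return n points in [0, 1)^d with one point per stratum in each dimension."""
    columns = []
    for _ in range(d):
        strata = list(range(n))
        rng.shuffle(strata)  # pair strata randomly across dimensions
        columns.append([(s + rng.random()) / n for s in strata])
    return list(zip(*columns))


def propagate(means, sds, n=1000, seed=0):
    """Sample (b0, b1, x) jointly by LHS and return the sorted probabilities."""
    rng = random.Random(seed)
    dists = [NormalDist(m, s) for m, s in zip(means, sds)]
    probs = []
    for point in latin_hypercube(n, len(dists), rng):
        # Clamp to the open interval (0, 1) so inv_cdf is always defined.
        b0, b1, x = (
            dist.inv_cdf(min(max(u, 1e-12), 1.0 - 1e-12))
            for dist, u in zip(dists, point)
        )
        probs.append(1.0 / (1.0 + math.exp(-(b0 + b1 * x))))
    return sorted(probs)


# Hypothetical inputs: intercept, slope, and one explanatory variable.
probs = propagate(means=(-1.0, 0.8, 2.0), sds=(0.2, 0.1, 0.5))
lower = probs[int(0.025 * len(probs))]
upper = probs[int(0.975 * len(probs)) - 1]
```

The spread between `lower` and `upper` plays the role of the prediction interval mentioned in the abstract; repeating the calculation per map cell would give a spatial picture of the uncertainty.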

  11. Using a binary logistic regression method and GIS for evaluating and mapping the groundwater spring potential in the Sultan Mountains (Aksehir, Turkey)

    NASA Astrophysics Data System (ADS)

    Ozdemir, Adnan

    2011-07-01

    Summary: The purpose of this study is to produce a groundwater spring potential map of the Sultan Mountains in central Turkey, based on a logistic regression method within a Geographic Information System (GIS) environment. Using field surveys, the locations of the springs (440 springs) were determined in the study area. In this study, 17 spring-related factors were used in the analysis: geology, relative permeability, land use/land cover, precipitation, elevation, slope, aspect, total curvature, plan curvature, profile curvature, wetness index, stream power index, sediment transport capacity index, distance to drainage, distance to fault, drainage density, and fault density map. The coefficients of the predictor variables were estimated using binary logistic regression analysis and were used to calculate the groundwater spring potential for the entire study area. The accuracy of the final spring potential map was evaluated based on the observed springs. The accuracy of the model was evaluated by calculating the relative operating characteristics. The area value of the relative operating characteristic curve model was found to be 0.82. These results indicate that the model is a good estimator of the spring potential in the study area. The spring potential map shows that the areas of very low, low, moderate and high groundwater spring potential classes are 105.586 km² (28.99%), 74.271 km² (19.906%), 101.203 km² (27.14%), and 90.05 km² (24.671%), respectively. The interpretations of the potential map showed that stream power index, relative permeability of lithologies, geology, elevation, aspect, wetness index, plan curvature, and drainage density play major roles in spring occurrence and distribution in the Sultan Mountains. The logistic regression approach has not yet been used to delineate groundwater potential zones. In this study, the logistic regression method was used to locate potential zones for groundwater springs in the Sultan Mountains. 
The evolved model was found to be in strong agreement with the available groundwater spring test data. Hence, this method can be used routinely in groundwater exploration under favourable conditions.

  12. Landslide susceptibility mapping using frequency ratio, logistic regression, artificial neural networks and their comparison: A case study from Kat landslides (Tokat—Turkey)

    NASA Astrophysics Data System (ADS)

    Yilmaz, Işık

    2009-06-01

    The purpose of this study is to compare the landslide susceptibility mapping methods of frequency ratio (FR), logistic regression and artificial neural networks (ANN) applied in the Kat County (Tokat—Turkey). Digital elevation model (DEM) was first constructed using GIS software. Landslide-related factors such as geology, faults, drainage system, topographical elevation, slope angle, slope aspect, topographic wetness index (TWI) and stream power index (SPI) were used in the landslide susceptibility analyses. Landslide susceptibility maps were produced from the frequency ratio, logistic regression and neural networks models, and they were then compared by means of their validations. The higher accuracies of the susceptibility maps for all three models were obtained from the comparison of the landslide susceptibility maps with the known landslide locations. However, respective area under curve (AUC) values of 0.826, 0.842 and 0.852 for frequency ratio, logistic regression and artificial neural networks showed that, although the map obtained from the ANN model is more accurate than those from the other models, the accuracies of all three models can be considered relatively similar. The results obtained in this study also showed that the frequency ratio model can be used as a simple tool in assessment of landslide susceptibility when a sufficient number of data were obtained. Input process, calculations and output process are very simple and can be readily understood in the frequency ratio model; however, logistic regression and neural networks require the conversion of data to ASCII or other formats. Moreover, it is also very hard to process the large amount of data in the statistical package.
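
Several records in this listing compare models by area under the ROC curve (AUC). As a rough illustration, AUC can be computed as the share of (positive, negative) pairs in which the positive case receives the higher score, with ties counted as one half. The scores and labels below are made up and unrelated to the studies' data.

```python
def auc(scores, labels):
    """Area under the ROC curve via pairwise comparison (ties count 0.5)."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


# Hypothetical susceptibility scores and observed outcomes (1 = event).
example_auc = auc([0.9, 0.8, 0.3, 0.2], [1, 0, 1, 0])
```

The pairwise form is quadratic in the number of cases, so it suits small examples; rank-based formulas give the same value more efficiently on large maps.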

  13. Multivariate logistic regression analysis of postoperative complications and risk model establishment of gastrectomy for gastric cancer: A single-center cohort report.

    PubMed

    Zhou, Jinzhe; Zhou, Yanbing; Cao, Shougen; Li, Shikuan; Wang, Hao; Niu, Zhaojian; Chen, Dong; Wang, Dongsheng; Lv, Liang; Zhang, Jian; Li, Yu; Jiao, Xuelong; Tan, Xiaojie; Zhang, Jianli; Wang, Haibo; Zhang, Bingyuan; Lu, Yun; Sun, Zhenqing

    2016-01-01

    Reporting of surgical complications is common, but few reports provide information about the severity of complications or estimate their risk factors, and those that do lack specificity. We retrospectively analyzed data on 2795 gastric cancer patients who underwent a surgical procedure at the Affiliated Hospital of Qingdao University between June 2007 and June 2012, and established a multivariate logistic regression model to predict risk factors related to postoperative complications according to the Clavien-Dindo classification system. Twenty-four out of 86 variables were identified as statistically significant in univariate logistic regression analysis, and the 11 significant variables that entered multivariate analysis were employed to produce the risk model. Liver cirrhosis, diabetes mellitus, Child classification, invasion of neighboring organs, combined resection, intraoperative transfusion, Billroth II anastomosis of reconstruction, malnutrition, surgical volume of surgeons, operating time and age were independent risk factors for postoperative complications after gastrectomy. Based on the logistic regression equation, p = exp(ΣBiXi) / (1 + exp(ΣBiXi)), a multivariate logistic regression predictive model that calculated the risk of postoperative morbidity was developed: p = 1/(1 + e^(4.810 - 1.287X1 - 0.504X2 - 0.500X3 - 0.474X4 - 0.405X5 - 0.318X6 - 0.316X7 - 0.305X8 - 0.278X9 - 0.255X10 - 0.138X11)). The accuracy, sensitivity and specificity of the model to predict the postoperative complications were 86.7%, 76.2% and 88.6%, respectively. This risk model based on Clavien-Dindo grading severity of complications system and logistic regression analysis can predict severe morbidity specific to an individual patient's risk factors, estimate patients' risks and benefits of gastric surgery as an accurate decision-making tool and may serve as a template for the development of risk models for other surgical groups.
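
The two forms of the equation quoted in this abstract are algebraically the same. The short derivation below makes the link explicit, writing z for the linear predictor ΣBiXi; the ordering and coding of X1 to X11 are not given in the abstract and are left unspecified here.

```latex
\[
p \;=\; \frac{e^{z}}{1 + e^{z}} \;=\; \frac{1}{1 + e^{-z}},
\qquad
z \;=\; \sum_i B_i X_i .
\]
Matching the exponent of the fitted model, $-z = 4.810 - 1.287X_1 - \dots - 0.138X_{11}$, gives
\[
z = -4.810 + 1.287X_1 + 0.504X_2 + 0.500X_3 + 0.474X_4 + 0.405X_5
    + 0.318X_6 + 0.316X_7 + 0.305X_8 + 0.278X_9 + 0.255X_{10} + 0.138X_{11}.
\]
With every $X_i = 0$, the predicted risk is $p = 1/(1 + e^{4.810}) \approx 0.008$.
```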

  14. Can shoulder dystocia be reliably predicted?

    PubMed

    Dodd, Jodie M; Catcheside, Britt; Scheil, Wendy

    2012-06-01

    To evaluate factors reported to increase the risk of shoulder dystocia, and to evaluate their predictive value at a population level. The South Australian Pregnancy Outcome Unit's population database from 2005 to 2010 was accessed to determine the occurrence of shoulder dystocia in addition to reported risk factors, including age, parity, self-reported ethnicity, presence of diabetes and infant birth weight. Odds ratios (and 95% confidence intervals) of shoulder dystocia were calculated for each risk factor, which were then incorporated into a logistic regression model. Test characteristics for each variable in predicting shoulder dystocia were calculated. As a proportion of all births, the reported rate of shoulder dystocia increased significantly from 0.95% in 2005 to 1.38% in 2010 (P = 0.0002). Using a logistic regression model, induction of labour and infant birth weight greater than both 4000 and 4500 g were identified as significant independent predictors of shoulder dystocia. The value of risk factors alone and when incorporated into the logistic regression model was poorly predictive of the occurrence of shoulder dystocia. While there are a number of factors associated with an increased risk of shoulder dystocia, none are of sufficient sensitivity or positive predictive value to allow their use clinically to reliably and accurately identify the occurrence of shoulder dystocia. © 2012 The Authors ANZJOG © 2012 The Royal Australian and New Zealand College of Obstetricians and Gynaecologists.
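
An unadjusted odds ratio with a 95% confidence interval can be computed from a 2x2 table as sketched below. The abstract does not say which interval method the authors used; this sketch assumes the common log-based (Woolf) interval, and the counts are hypothetical.

```python
import math


def odds_ratio_ci(a, b, c, d, z=1.96):
    """Odds ratio and Wald-type CI on the log scale for a 2x2 table.

    a: exposed with outcome       b: exposed without outcome
    c: unexposed with outcome     d: unexposed without outcome
    """
    odds_ratio = (a * d) / (b * c)
    se_log = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(odds_ratio) - z * se_log)
    upper = math.exp(math.log(odds_ratio) + z * se_log)
    return odds_ratio, lower, upper


# Hypothetical counts, not taken from the study.
or_est, or_lower, or_upper = odds_ratio_ci(20, 80, 10, 90)
```

For these made-up counts the estimate is 2.25 with an interval that just includes 1, which shows how a seemingly large odds ratio can still be statistically inconclusive.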

  15. A statistical method for predicting seizure onset zones from human single-neuron recordings

    NASA Astrophysics Data System (ADS)

    Valdez, André B.; Hickman, Erin N.; Treiman, David M.; Smith, Kris A.; Steinmetz, Peter N.

    2013-02-01

    Objective. Clinicians often use depth-electrode recordings to localize human epileptogenic foci. To advance the diagnostic value of these recordings, we applied logistic regression models to single-neuron recordings from depth-electrode microwires to predict seizure onset zones (SOZs). Approach. We collected data from 17 epilepsy patients at the Barrow Neurological Institute and developed logistic regression models to calculate the odds of observing SOZs in the hippocampus, amygdala and ventromedial prefrontal cortex, based on statistics such as the burst interspike interval (ISI). Main results. Analysis of these models showed that, for a single-unit increase in burst ISI ratio, the left hippocampus was approximately 12 times more likely to contain a SOZ; and the right amygdala, 14.5 times more likely. Our models were most accurate for the hippocampus bilaterally (at 85% average sensitivity), and performance was comparable with current diagnostics such as electroencephalography. Significance. Logistic regression models can be combined with single-neuron recording to predict likely SOZs in epilepsy patients being evaluated for resective surgery, providing an automated source of clinically useful information.
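
If the "times more likely" figures above are read as odds ratios from the fitted models (an interpretation suggested by the phrase "calculate the odds", not stated outright in the abstract), each one corresponds to a logistic regression coefficient through the exponential:

```latex
\[
\mathrm{OR} = e^{\beta}
\quad\Longleftrightarrow\quad
\beta = \ln(\mathrm{OR}),
\]
\[
\beta_{\text{left hippocampus}} \approx \ln 12 \approx 2.48,
\qquad
\beta_{\text{right amygdala}} \approx \ln 14.5 \approx 2.67 .
\]
```

Each coefficient is the change in log-odds of a SOZ per one-unit increase in burst ISI ratio.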

  16. Measurement of faculty anesthesiologists' quality of clinical supervision has greater reliability when controlling for the leniency of the rating anesthesia resident: a retrospective cohort study.

    PubMed

    Dexter, Franklin; Ledolter, Johannes; Hindman, Bradley J

    2017-06-01

    Our department monitors the quality of anesthesiologists' clinical supervision and provides each anesthesiologist with periodic feedback. We hypothesized that greater differentiation among anesthesiologists' supervision scores could be obtained by adjusting for leniency of the rating resident. From July 1, 2013 to December 31, 2015, our department has utilized the de Oliveira Filho unidimensional nine-item supervision scale to assess the quality of clinical supervision provided by faculty as rated by residents. We examined all 13,664 ratings of the 97 anesthesiologists (ratees) by the 65 residents (raters). Testing for internal consistency among answers to questions (large Cronbach's alpha > 0.90) was performed to rule out that one or two questions accounted for leniency. Mixed-effects logistic regression was used to compare ratees while controlling for rater leniency vs using Student t tests without rater leniency. The mean supervision scale score was calculated for each combination of the 65 raters and nine questions. The Cronbach's alpha was very large (0.977). The mean score was calculated for each of the 3,421 observed combinations of resident and anesthesiologist. The logits of the percentage of scores equal to the maximum value of 4.00 were normally distributed (residents, P = 0.24; anesthesiologists, P = 0.50). There were 20/97 anesthesiologists identified as significant outliers (13 with below average supervision scores and seven with better than average) using the mixed-effects logistic regression with rater leniency entered as a fixed effect but not by Student's t test. In contrast, there were three of 97 anesthesiologists identified as outliers (all three above average) using Student's t tests but not by logistic regression with leniency. The 20 vs 3 was significant (P < 0.001). 
Use of logistic regression with leniency results in greater detection of anesthesiologists with significantly better (or worse) clinical supervision scores than use of Student's t tests (i.e., without adjustment for rater leniency).

  17. Refined ambient PM2.5 exposure surrogates and the risk of myocardial infarction

    EPA Science Inventory

    Using a case-crossover study design and conditional logistic regression, we compared the relative odds of transmural (full-wall) myocardial infarction (MI) calculated using exposure surrogates that account for human activity patterns and the indoor transport of ambient PM2....

  18. The effect of high leverage points on the logistic ridge regression estimator having multicollinearity

    NASA Astrophysics Data System (ADS)

    Ariffin, Syaiba Balqish; Midi, Habshah

    2014-06-01

    This article is concerned with the performance of the logistic ridge regression estimation technique in the presence of multicollinearity and high leverage points. In logistic regression, multicollinearity exists among predictors and in the information matrix. The maximum likelihood estimator suffers a huge setback in the presence of multicollinearity, which causes regression estimates to have unduly large standard errors. To remedy this problem, a logistic ridge regression estimator is put forward. It is evident that the logistic ridge regression estimator outperforms the maximum likelihood approach for handling multicollinearity. The effect of high leverage points on the performance of the logistic ridge regression estimator is then investigated through a real data set and a simulation study. The findings signify that the logistic ridge regression estimator fails to provide better parameter estimates in the presence of both high leverage points and multicollinearity.

  19. Ensemble of trees approaches to risk adjustment for evaluating a hospital's performance.

    PubMed

    Liu, Yang; Traskin, Mikhail; Lorch, Scott A; George, Edward I; Small, Dylan

    2015-03-01

    A commonly used method for evaluating a hospital's performance on an outcome is to compare the hospital's observed outcome rate to the hospital's expected outcome rate given its patient (case) mix and service. The process of calculating the hospital's expected outcome rate given its patient mix and service is called risk adjustment (Iezzoni 1997). Risk adjustment is critical for accurately evaluating and comparing hospitals' performances since we would not want to unfairly penalize a hospital just because it treats sicker patients. The key to risk adjustment is accurately estimating the probability of an outcome given patient characteristics. For cases with binary outcomes, the method that is commonly used in risk adjustment is logistic regression. In this paper, we consider ensemble of trees methods as alternatives for risk adjustment, including random forests and Bayesian additive regression trees (BART). Both random forests and BART are modern machine learning methods that have been shown recently to have excellent performance for prediction of outcomes in many settings. We apply these methods to carry out risk adjustment for the performance of neonatal intensive care units (NICU). We show that these ensemble of trees methods outperform logistic regression in predicting mortality among babies treated in NICU, and provide a superior method of risk adjustment compared to logistic regression.
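
The observed-versus-expected comparison described above can be sketched in a few lines. The predicted probabilities would come from whichever risk model is used (logistic regression, random forest, or BART); the values below are invented for illustration, and the function name is hypothetical.

```python
def observed_to_expected(outcomes, predicted):
    """Return (observed rate, expected rate, O/E ratio) for one hospital.

    outcomes:  0/1 outcome per patient
    predicted: model-based outcome probability per patient
    """
    if len(outcomes) != len(predicted) or not outcomes:
        raise ValueError("need one prediction per patient")
    observed = sum(outcomes) / len(outcomes)
    expected = sum(predicted) / len(predicted)
    return observed, expected, observed / expected


# Hypothetical hospital with four patients.
obs_rate, exp_rate, oe_ratio = observed_to_expected(
    [1, 0, 0, 1], [0.5, 0.25, 0.25, 0.5]
)
```

A ratio above 1 means more events occurred than the case mix would predict, so the quality of the comparison depends directly on how well the underlying model estimates each patient's probability.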

  20. A comparison of Cox and logistic regression for use in genome-wide association studies of cohort and case-cohort design.

    PubMed

    Staley, James R; Jones, Edmund; Kaptoge, Stephen; Butterworth, Adam S; Sweeting, Michael J; Wood, Angela M; Howson, Joanna M M

    2017-06-01

    Logistic regression is often used instead of Cox regression to analyse genome-wide association studies (GWAS) of single-nucleotide polymorphisms (SNPs) and disease outcomes with cohort and case-cohort designs, as it is less computationally expensive. Although Cox and logistic regression models have been compared previously in cohort studies, this work does not completely cover the GWAS setting nor extend to the case-cohort study design. Here, we evaluated Cox and logistic regression applied to cohort and case-cohort genetic association studies using simulated data and genetic data from the EPIC-CVD study. In the cohort setting, there was a modest improvement in power to detect SNP-disease associations using Cox regression compared with logistic regression, which increased as the disease incidence increased. In contrast, logistic regression had more power than (Prentice weighted) Cox regression in the case-cohort setting. Logistic regression yielded inflated effect estimates (assuming the hazard ratio is the underlying measure of association) for both study designs, especially for SNPs with greater effect on disease. Given logistic regression is substantially more computationally efficient than Cox regression in both settings, we propose a two-step approach to GWAS in cohort and case-cohort studies. First to analyse all SNPs with logistic regression to identify associated variants below a pre-defined P-value threshold, and second to fit Cox regression (appropriately weighted in case-cohort studies) to those identified SNPs to ensure accurate estimation of association with disease.

  1. The crux of the method: assumptions in ordinary least squares and logistic regression.

    PubMed

    Long, Rebecca G

    2008-10-01

    Logistic regression has increasingly become the tool of choice when analyzing data with a binary dependent variable. While resources relating to the technique are widely available, clear discussions of why logistic regression should be used in place of ordinary least squares regression are difficult to find. The current paper compares and contrasts the assumptions of ordinary least squares with those of logistic regression and explains why logistic regression's looser assumptions make it adept at handling violations of the more important assumptions in ordinary least squares.

  2. Mapping of the DLQI scores to EQ-5D utility values using ordinal logistic regression.

    PubMed

    Ali, Faraz Mahmood; Kay, Richard; Finlay, Andrew Y; Piguet, Vincent; Kupfer, Joerg; Dalgard, Florence; Salek, M Sam

    2017-11-01

    The Dermatology Life Quality Index (DLQI) and the European Quality of Life-5 Dimension (EQ-5D) are separate measures that may be used to gather health-related quality of life (HRQoL) information from patients. The EQ-5D is a generic measure from which health utility estimates can be derived, whereas the DLQI is a specialty-specific measure to assess HRQoL. To reduce the burden of multiple measures being administered and to enable a more disease-specific calculation of health utility estimates, we explored an established mathematical technique known as ordinal logistic regression (OLR) to develop an appropriate model to map DLQI data to EQ-5D-based health utility estimates. Retrospective data from 4010 patients were randomly divided five times into two groups for the derivation and testing of the mapping model. Split-half cross-validation was utilized, resulting in a total of ten ordinal logistic regression models for each of the five EQ-5D dimensions against age, sex, and all ten items of the DLQI. Using Monte Carlo simulation, predicted health utility estimates were derived and compared against those observed. This method was repeated for both OLR and a previously tested mapping methodology based on linear regression. The model was shown to be highly predictive and its repeated fitting demonstrated a stable model using OLR as well as linear regression. The mean differences between OLR-predicted health utility estimates and observed health utility estimates ranged from 0.0024 to 0.0239 across the ten modeling exercises, with an average overall difference of 0.0120 (a 1.6% underestimate, not of clinical importance). The modeling framework developed in this study will enable researchers to calculate EQ-5D health utility estimates from a specialty-specific study population, reducing patient and economic burden.

  3. Using Dominance Analysis to Determine Predictor Importance in Logistic Regression

    ERIC Educational Resources Information Center

    Azen, Razia; Traxel, Nicole

    2009-01-01

    This article proposes an extension of dominance analysis that allows researchers to determine the relative importance of predictors in logistic regression models. Criteria for choosing logistic regression R² analogues were determined and measures were selected that can be used to perform dominance analysis in logistic regression. A…

  4. Applying Kaplan-Meier to Item Response Data

    ERIC Educational Resources Information Center

    McNeish, Daniel

    2018-01-01

    Some IRT models can be equivalently modeled in alternative frameworks such as logistic regression. Logistic regression can also model time-to-event data, which concerns the probability of an event occurring over time. Using the relation between time-to-event models and logistic regression and the relation between logistic regression and IRT, this…

  5. A Method for Calculating the Probability of Successfully Completing a Rocket Propulsion Ground Test

    NASA Technical Reports Server (NTRS)

    Messer, Bradley P.

    2004-01-01

    Propulsion ground test facilities face the daily challenges of scheduling multiple customers into limited facility space and successfully completing their propulsion test projects. Due to budgetary and schedule constraints, NASA and industry customers are pushing to test more components, for less money, in a shorter period of time. As these new rocket engine component test programs are undertaken, the lack of technology maturity in the test articles, combined with pushing the test facilities' capabilities to their limits, tends to lead to an increase in facility breakdowns and unsuccessful tests. Over the last five years Stennis Space Center's propulsion test facilities have performed hundreds of tests, collected thousands of seconds of test data, and broken numerous test facility and test article parts. While various initiatives have been implemented to provide better propulsion test techniques and improve the quality, reliability, and maintainability of goods and parts used in the propulsion test facilities, unexpected failures during testing still occur quite regularly due to the harsh environment in which the propulsion test facilities operate. Previous attempts at modeling the lifecycle of a propulsion component test project have met with little success. Each of the attempts suffered from incomplete or inconsistent data on which to base the models. By focusing on the actual test phase of the test project rather than the formulation, design or construction phases, the quality and quantity of available data increases dramatically. A logistic regression model has been developed from the data collected over the last five years, allowing the probability of successfully completing a rocket propulsion component test to be calculated. A logistic regression model is a mathematical modeling approach that can be used to describe the relationship of several independent predictor variables X_1, X_2, ..., X_k to a binary or dichotomous dependent variable Y, where Y can only be one of two possible outcomes, in this case Success or Failure. Logistic regression has primarily been used in the fields of epidemiology and biomedical research, but lends itself to many other applications. As indicated, the use of logistic regression is not new; however, modeling propulsion ground test facilities using logistic regression is both a new and unique application of the statistical technique. Results from the models provide project managers with insight into, and confidence in, the effectiveness of rocket engine component ground test projects. The initial success in modeling rocket propulsion ground test projects clears the way for more complex models to be developed in this area.
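
    The relationship described above, between k predictor variables and a binary outcome, can be written as P(Y = Success) = 1 / (1 + exp(-(b_0 + b_1 X_1 + ... + b_k X_k))). The sketch below evaluates that expression. It is a generic illustration of the logistic model, not the model fitted in the report; the function name and the coefficient values in the usage line are hypothetical.

```python
import math


def success_probability(intercept, coefficients, predictors):
    """Return P(Y = Success | X) under a logistic regression model.

    intercept    -- b_0
    coefficients -- [b_1, ..., b_k]
    predictors   -- [X_1, ..., X_k]
    """
    if len(coefficients) != len(predictors):
        raise ValueError("need one coefficient per predictor")
    # Linear predictor b_0 + b_1*X_1 + ... + b_k*X_k
    linear = intercept + sum(b * x for b, x in zip(coefficients, predictors))
    # The logistic function maps the linear predictor to a value in (0, 1).
    return 1.0 / (1.0 + math.exp(-linear))


# Hypothetical coefficients and predictor values, for illustration only.
print(success_probability(-1.0, [0.8, -0.3], [2.0, 1.0]))
```

    Because the logistic function is bounded between 0 and 1, the result can be read directly as a probability of Success for the given predictor values.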

  6. The Outlier Detection for Ordinal Data Using Scalling Technique of Regression Coefficients

    NASA Astrophysics Data System (ADS)

    Adnan, Arisman; Sugiarto, Sigit

    2017-06-01

    The aim of this study is to detect outliers by using the coefficients of Ordinal Logistic Regression (OLR) for the case of k category responses, where the scores range from 1 (the best) to 8 (the worst). We detect them by using the sum of moduli of the ordinal regression coefficients calculated by the jackknife technique. This technique is improved by scaling the regression coefficients to their means. The R language has been used on a set of ordinal data from a reference distribution. Furthermore, we compare this approach with studentised residual plots of the jackknife technique for ANOVA (Analysis of Variance) and OLR. This study shows that the jackknifing technique, along with proper scaling, may reveal outliers in ordinal regression reasonably well.

  7. Investigation of possibility of surface rupture derived from PFDHA and calculation of surface displacement based on dislocation

    NASA Astrophysics Data System (ADS)

    Inoue, N.; Kitada, N.; Irikura, K.

    2013-12-01

    The probability of surface rupture is important for configuring the seismic source, such as area sources or fault models, in a seismic hazard evaluation. In Japan, Takemura (1998) estimated the probability based on historical earthquake data. Kagawa et al. (2004) evaluated the probability based on a numerical simulation of surface displacements. The estimated probability follows a sigmoid curve and increases between Mj = 6.5 and Mj = 7.0, where Mj is the local magnitude defined and calculated by the Japan Meteorological Agency. The probability of surface rupture is also used in probabilistic fault displacement hazard analysis (PFDHA). The probability is determined from the collected earthquake catalog, which is classified into two categories: with surface rupture or without surface rupture. Logistic regression is then performed on the classified earthquake data. Youngs et al. (2003), Ross and Moss (2011) and Petersen et al. (2011) give the logistic curves of the probability of surface rupture for normal, reverse and strike-slip faults, respectively. Takao et al. (2013) show the logistic curve derived from Japanese earthquake data only. The Japanese probability curve shows a sharp increase over a narrow magnitude range compared with the other curves. In this study, we estimated the probability of surface rupture by applying logistic analysis to the surface displacement derived from a surface displacement calculation. A source fault was defined according to the procedure of Kagawa et al. (2004), which determines a seismic moment from a magnitude and estimates the area of the asperity and the amount of slip. Strike-slip and reverse faults were considered as source faults. We applied Wang et al. (2003) for the calculations. The surface displacements for the defined source faults were calculated by varying the depth of the fault. A threshold of 5 cm of surface displacement was used to decide whether or not a rupture reaches the surface. We carried out the logistic regression analysis on the calculated displacements, which were classified by the above threshold. The estimated probability curve showed a trend similar to the result of Takao et al. (2013). The probability for reverse faults is larger than that for strike-slip faults. On the other hand, PFDHA results show different trends: the probability for reverse faults at higher magnitude is lower than that for strike-slip and normal faults. Ross and Moss (2011) suggested that the sediment and/or rock over the fault compresses, so that the displacement does not fully reach the surface. The numerical theory applied in this study cannot deal with a complex initial situation such as topography.
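
    The two steps described above, classifying calculated displacements with the 5 cm threshold and then fitting a logistic curve against magnitude, can be sketched as follows. This is an illustrative sketch only: the magnitudes and displacements are made-up values, not data from the study, and the one-predictor Newton-Raphson fit is an assumed estimation method that the abstract does not specify.

```python
import math


def classify_rupture(displacements_m, threshold_m=0.05):
    """Label each calculated displacement: 1 if it reaches the 5 cm threshold."""
    return [1 if d >= threshold_m else 0 for d in displacements_m]


def fit_logistic_1d(x, y, iterations=25):
    """Fit logit P(y = 1) = b0 + b1 * x by Newton-Raphson (maximum likelihood)."""
    b0 = b1 = 0.0
    for _ in range(iterations):
        g0 = g1 = h00 = h01 = h11 = 0.0
        for xi, yi in zip(x, y):
            p = 1.0 / (1.0 + math.exp(-(b0 + b1 * xi)))
            w = p * (1.0 - p)
            g0 += yi - p            # gradient with respect to b0
            g1 += (yi - p) * xi     # gradient with respect to b1
            h00 += w                # entries of the information matrix
            h01 += w * xi
            h11 += w * xi * xi
        det = h00 * h11 - h01 * h01
        # Newton step: solve the 2x2 system (information matrix) * step = gradient.
        b0 += (h11 * g0 - h01 * g1) / det
        b1 += (h00 * g1 - h01 * g0) / det
    return b0, b1


# Made-up magnitudes and displacements (metres), for illustration only.
magnitudes = [6.0, 6.2, 6.4, 6.6, 6.8, 7.0, 7.2, 7.4]
displacements = [0.01, 0.02, 0.06, 0.03, 0.08, 0.04, 0.12, 0.20]

labels = classify_rupture(displacements)
b0, b1 = fit_logistic_1d(magnitudes, labels)
print(labels, b0, b1)
```

    The illustrative labels overlap in magnitude on purpose: if the two classes were perfectly separated, the maximum likelihood estimate would not exist and the iteration would diverge.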

  8. Comparison of multinomial logistic regression and logistic regression: which is more efficient in allocating land use?

    NASA Astrophysics Data System (ADS)

    Lin, Yingzhi; Deng, Xiangzheng; Li, Xing; Ma, Enjun

    2014-12-01

    Spatially explicit simulation of land use change is the basis for estimating the effects of land use and cover change on energy fluxes, ecology and the environment. At the pixel level, logistic regression is one of the most common approaches used in spatially explicit land use allocation models to determine the relationship between land use and its causal factors in driving land use change, and thereby to evaluate land use suitability. However, these models have a drawback in that they do not determine/allocate land use based on the direct relationship between land use change and its driving factors. Consequently, a multinomial logistic regression method was introduced to address this flaw, and thereby, judge the suitability of a type of land use in any given pixel in a case study area of the Jiangxi Province, China. A comparison of the two regression methods indicated that the proportion of correctly allocated pixels using multinomial logistic regression was 92.98%, which was 8.47% higher than that obtained using logistic regression. Paired t-test results also showed that pixels were more clearly distinguished by multinomial logistic regression than by logistic regression. In conclusion, multinomial logistic regression is a more efficient and accurate method for the spatial allocation of land use changes. The application of this method in future land use change studies may improve the accuracy of predicting the effects of land use and cover change on energy fluxes, ecology, and environment.

  9. Determination of riverbank erosion probability using Locally Weighted Logistic Regression

    NASA Astrophysics Data System (ADS)

    Ioannidou, Elena; Flori, Aikaterini; Varouchakis, Emmanouil A.; Giannakis, Georgios; Vozinaki, Anthi Eirini K.; Karatzas, George P.; Nikolaidis, Nikolaos

    2015-04-01

    Riverbank erosion is a natural geomorphologic process that affects the fluvial environment. The most important issue concerning riverbank erosion is the identification of the vulnerable locations. An alternative to the usual hydrodynamic models to predict vulnerable locations is to quantify the probability of erosion occurrence. This can be achieved by identifying the underlying relations between riverbank erosion and the geomorphological or hydrological variables that prevent or stimulate erosion. Thus, riverbank erosion can be determined by a regression model using independent variables that are considered to affect the erosion process. The impact of such variables may vary spatially; therefore, a non-stationary regression model is preferred instead of a stationary equivalent. Locally Weighted Regression (LWR) is proposed as a suitable choice. This method can be extended to predict the binary presence or absence of erosion based on a series of independent local variables by using the logistic regression model. It is referred to as Locally Weighted Logistic Regression (LWLR). Logistic regression is a type of regression analysis used for predicting the outcome of a categorical dependent variable (e.g. binary response) based on one or more predictor variables. The method can be combined with LWR to assign weights to local independent variables of the dependent one. LWR allows model parameters to vary over space in order to reflect spatial heterogeneity. The probabilities of the possible outcomes are modelled as a function of the independent variables using a logistic function. Logistic regression measures the relationship between a categorical dependent variable and, usually, one or several continuous independent variables by converting the dependent variable to probability scores. Then, a logistic regression is formed, which predicts success or failure of a given binary variable (e.g. erosion presence or absence) for any value of the independent variables. The erosion occurrence probability can be calculated in conjunction with the model deviance regarding the independent variables tested. The most straightforward measure for goodness of fit is the G statistic. It is a simple and effective way to study and evaluate the Logistic Regression model efficiency and the reliability of each independent variable. The developed statistical model is applied to the Koiliaris River Basin on the island of Crete, Greece. Two datasets of river bank slope, river cross-section width and indications of erosion were available for the analysis (12 and 8 locations). Two different types of spatial dependence functions, exponential and tricubic, were examined to determine the local spatial dependence of the independent variables at the measurement locations. The results show a significant improvement when the tricubic function is applied as the erosion probability is accurately predicted at all eight validation locations. Results for the model deviance show that cross-section width is more important than bank slope in the estimation of erosion probability along the Koiliaris riverbanks. The proposed statistical model is a useful tool that quantifies the erosion probability along the riverbanks and can be used to assist managing erosion and flooding events. Acknowledgements: This work is part of an on-going THALES project (CYBERSENSORS - High Frequency Monitoring System for Integrated Water Resources Management of Rivers). The project has been co-financed by the European Union (European Social Fund - ESF) and Greek national funds through the Operational Program "Education and Lifelong Learning" of the National Strategic Reference Framework (NSRF) - Research Funding Program: THALES. Investing in knowledge society through the European Social Fund.
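
    The abstract above names two spatial weighting functions and the G statistic without giving formulas. The sketch below uses commonly used textbook forms, which are assumptions here: a tricube kernel, an exponential distance-decay kernel, and G computed as the null deviance minus the model deviance. The bandwidth parameter, function names, and example values are hypothetical.

```python
import math


def tricube_weight(distance, bandwidth):
    """Tricube kernel: (1 - u^3)^3 for u = |d| / h < 1, and 0 beyond the bandwidth."""
    u = abs(distance) / bandwidth
    return (1.0 - u ** 3) ** 3 if u < 1.0 else 0.0


def exponential_weight(distance, bandwidth):
    """Exponential kernel: weight decays smoothly and never reaches zero."""
    return math.exp(-abs(distance) / bandwidth)


def deviance(outcomes, probabilities):
    """Binomial deviance: -2 times the log-likelihood of 0/1 outcomes."""
    return -2.0 * sum(
        math.log(p if y == 1 else 1.0 - p)
        for y, p in zip(outcomes, probabilities)
    )


def g_statistic(outcomes, fitted_probabilities):
    """G = deviance of the intercept-only model minus deviance of the fitted model."""
    base_rate = sum(outcomes) / len(outcomes)
    null_probabilities = [base_rate] * len(outcomes)
    return deviance(outcomes, null_probabilities) - deviance(outcomes, fitted_probabilities)


# Hypothetical erosion indicators and fitted probabilities, for illustration only.
erosion = [1, 0, 1, 0]
fitted = [0.9, 0.1, 0.8, 0.2]
print(tricube_weight(0.5, 1.0), exponential_weight(0.5, 1.0), g_statistic(erosion, fitted))
```

    The practical difference between the kernels is that the tricube form gives zero weight to locations beyond the bandwidth, whereas the exponential form lets every location contribute to each local fit.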

  10. Standards for Standardized Logistic Regression Coefficients

    ERIC Educational Resources Information Center

    Menard, Scott

    2011-01-01

    Standardized coefficients in logistic regression analysis have the same utility as standardized coefficients in linear regression analysis. Although there has been no consensus on the best way to construct standardized logistic regression coefficients, there is now sufficient evidence to suggest a single best approach to the construction of a…

  11. Seasonal Variation in Physical Activity among Preschool Children in a Northern Canadian City

    ERIC Educational Resources Information Center

    Carson, Valerie; Spence, John C.; Cutumisu, Nicoleta; Boule, Normand; Edwards, Joy

    2010-01-01

    Little research has examined seasonal differences in physical activity (PA) levels among children. Proxy reports of PA were completed by 1,715 parents on their children in Edmonton, Alberta, Canada. Total PA (TPA) minutes were calculated, and each participant was classified as active, somewhat active, or inactive. Logistic regression models were…

  12. Post-fire tree establishment patterns at the alpine treeline ecotone: Mount Rainier National Park, Washington, USA

    Treesearch

    Kirk M. Stueve; Dawna L. Cerney; Regina M. Rochefort; Laurie L. Kurth

    2009-01-01

    We performed classification analysis of 1970 satellite imagery and 2003 aerial photography to delineate establishment. Local site conditions were calculated from a LIDAR-based DEM, ancillary climate data, and 1970 tree locations in a GIS. We used logistic regression on a spatially weighted landscape matrix to rank variables.

  13. Propensity score estimation: machine learning and classification methods as alternatives to logistic regression

    PubMed Central

    Westreich, Daniel; Lessler, Justin; Funk, Michele Jonsson

    2010-01-01

    Summary. Objective: Propensity scores for the analysis of observational data are typically estimated using logistic regression. Our objective in this Review was to assess machine learning alternatives to logistic regression which may accomplish the same goals but with fewer assumptions or greater accuracy. Study Design and Setting: We identified alternative methods for propensity score estimation and/or classification from the public health, biostatistics, discrete mathematics, and computer science literature, and evaluated these algorithms for applicability to the problem of propensity score estimation, potential advantages over logistic regression, and ease of use. Results: We identified four techniques as alternatives to logistic regression: neural networks, support vector machines, decision trees (CART), and meta-classifiers (in particular, boosting). Conclusion: While the assumptions of logistic regression are well understood, those assumptions are frequently ignored. All four alternatives have advantages and disadvantages compared with logistic regression. Boosting (meta-classifiers) and to a lesser extent decision trees (particularly CART) appear to be most promising for use in the context of propensity score analysis, but extensive simulation studies are needed to establish their utility in practice. PMID:20630332

  14. Robust mislabel logistic regression without modeling mislabel probabilities.

    PubMed

    Hung, Hung; Jou, Zhi-Yu; Huang, Su-Yun

    2018-03-01

    Logistic regression is among the most widely used statistical methods for linear discriminant analysis. In many applications, we only observe possibly mislabeled responses. Fitting a conventional logistic regression can then lead to biased estimation. One common resolution is to fit a mislabel logistic regression model, which takes mislabeled responses into consideration. Another common method is to adopt a robust M-estimation by down-weighting suspected instances. In this work, we propose a new robust mislabel logistic regression based on γ-divergence. Our proposal possesses two advantageous features: (1) It does not need to model the mislabel probabilities. (2) The minimum γ-divergence estimation leads to a weighted estimating equation without the need to include any bias correction term, that is, it is automatically bias-corrected. These features make the proposed γ-logistic regression more robust in model fitting and more intuitive for model interpretation through a simple weighting scheme. Our method is also easy to implement, and two types of algorithms are included. Simulation studies and the Pima data application are presented to demonstrate the performance of γ-logistic regression. © 2017, The International Biometric Society.

  15. Fungible weights in logistic regression.

    PubMed

    Jones, Jeff A; Waller, Niels G

    2016-06-01

    In this article we develop methods for assessing parameter sensitivity in logistic regression models. To set the stage for this work, we first review Waller's (2008) equations for computing fungible weights in linear regression. Next, we describe 2 methods for computing fungible weights in logistic regression. To demonstrate the utility of these methods, we compute fungible logistic regression weights using data from the Centers for Disease Control and Prevention's (2010) Youth Risk Behavior Surveillance Survey, and we illustrate how these alternate weights can be used to evaluate parameter sensitivity. To make our work accessible to the research community, we provide R code (R Core Team, 2015) that will generate both kinds of fungible logistic regression weights. (PsycINFO Database Record (c) 2016 APA, all rights reserved).

  16. Factors associated with secondhand smoke exposure in different settings: Results from the German Health Update (GEDA) 2012.

    PubMed

    Fischer, Florian; Kraemer, Alexander

    2016-04-14

    The ubiquity of secondhand smoke (SHS) exposure at home or in private establishments, workplaces and public areas poses several challenges for the reduction of SHS exposure. This study aimed to describe the prevalence of SHS exposure in Germany and key factors associated with exposure. Results were also differentiated by place of exposure. A secondary data analysis based on the public use file of the German Health Update 2012 was conducted (n = 13,933). Only non-smokers were included in the analysis. In a multivariable logistic regression model the factors associated with SHS exposure were calculated. In addition, a further set of multivariable logistic regressions was calculated for factors associated with the place of SHS exposure (workplace, at home, bars/discotheques, restaurants, at the house of a friend). More than a quarter of non-smoking study participants were exposed to SHS. The main area of exposure was the workplace (40.9%). The multivariable logistic regression indicated young age as the most important factor associated with SHS exposure. The odds of SHS exposure were higher in men than in women. The likelihood of SHS exposure decreased with higher education. SHS exposure and the associated factors varied between different places of exposure. Despite several actions to protect non-smokers which were implemented in Germany during the past years, SHS exposure still remains a relevant risk factor at a population level. According to the results of this study, particularly the workplace and other public places such as bars and discotheques have to be taken into account for the development of strategies to reduce SHS exposure.

  17. Validation of use of the International Consultation on Incontinence Questionnaire-Urinary Incontinence-Short Form (ICIQ-UI-SF) for impairment rating: a transversal retrospective study of 120 patients.

    PubMed

    Timmermans, Luc; Falez, Freddy; Mélot, Christian; Wespes, Eric

    2013-09-01

    A urinary incontinence impairment rating must be a highly accurate, non-invasive exploration of the condition using International Classification of Functioning (ICF)-based assessment tools. The objective of this study was to identify the best evaluation test and to determine an impairment rating model of urinary incontinence. In performing a cross-sectional study comparing successive urodynamic tests using both the International Consultation on Incontinence Questionnaire-Urinary Incontinence-Short Form (ICIQ-UI-SF) and the 1-hr pad-weighing test in 120 patients, we performed statistical likelihood ratio analysis and used logistic regression to calculate the probability of urodynamic incontinence using the most significant independent predictors. Subsequently, we created a template that was based on the significant predictors and the probability of urodynamic incontinence. The mean ICIQ-UI-SF score was 13.5 ± 4.6, and the median pad test value was 8 g. The discrimination statistic (receiver operating characteristic) described how well the urodynamic observations matched the ICIQ-UI-SF scores (area under the curve (UDA): 0.689) and the pad test data (UDA: 0.693). Using logistic regression analysis, we demonstrated that the best independent predictors of urodynamic incontinence were the patient's age and the ICIQ-UI-SF score. The logistic regression model permitted us to construct an equation to determine the probability of urodynamic incontinence. Using these tools, we created a template to generate a probability index of urodynamic urinary incontinence. Using this probability index, relative to the patient and to the maximum impairment of the whole person (MIWP) relative to urinary incontinence, we were able to calculate a patient's permanent impairment. Copyright © 2012 Wiley Periodicals, Inc.

  18. Development and validation of a mortality risk model for pediatric sepsis.

    PubMed

    Chen, Mengshi; Lu, Xiulan; Hu, Li; Liu, Pingping; Zhao, Wenjiao; Yan, Haipeng; Tang, Liang; Zhu, Yimin; Xiao, Zhenghui; Chen, Lizhang; Tan, Hongzhuan

    2017-05-01

    Pediatric sepsis is a burdensome public health problem. Assessing the mortality risk of pediatric sepsis patients, offering effective treatment guidance, and improving prognosis to reduce mortality rates, are crucial. We extracted data derived from electronic medical records of pediatric sepsis patients that were collected during the first 24 hours after admission to the pediatric intensive care unit (PICU) of the Hunan Children's Hospital from January 2012 to June 2014. A total of 788 children were randomly divided into a training (592, 75%) and validation group (196, 25%). The risk factors for mortality among these patients were identified by conducting multivariate logistic regression in the training group. Based on the established logistic regression equation, the logit probabilities for all patients (in both groups) were calculated to verify the model's internal and external validities. According to the training group, 6 variables (brain natriuretic peptide, albumin, total bilirubin, D-dimer, lactate levels, and mechanical ventilation in 24 hours) were included in the final logistic regression model. The areas under the curves of the model were 0.854 (0.826, 0.881) and 0.844 (0.816, 0.873) in the training and validation groups, respectively. The Mortality Risk Model for Pediatric Sepsis we established in this study showed acceptable accuracy to predict the mortality risk in pediatric sepsis patients.
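
    The area under the ROC curve reported above equals the probability that a randomly chosen patient who died received a higher predicted risk than a randomly chosen survivor. The sketch below computes it directly from that definition. It is a generic illustration with made-up labels and scores, not the study's data or software, and the function name is hypothetical.

```python
def area_under_curve(labels, scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    labels -- 1 for the event (e.g. death), 0 otherwise
    scores -- predicted probabilities from the logistic regression model
    Ties between a positive and a negative score count as half a win.
    """
    positives = [s for l, s in zip(labels, scores) if l == 1]
    negatives = [s for l, s in zip(labels, scores) if l == 0]
    if not positives or not negatives:
        raise ValueError("need at least one positive and one negative case")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


# Made-up labels and predicted risks, for illustration only.
print(area_under_curve([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8]))  # 0.75
```

    Computing the statistic separately on the training and validation groups, as the study reports, shows whether discrimination holds up on patients the model was not fitted to.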

  19. Development and validation of a mortality risk model for pediatric sepsis

    PubMed Central

    Chen, Mengshi; Lu, Xiulan; Hu, Li; Liu, Pingping; Zhao, Wenjiao; Yan, Haipeng; Tang, Liang; Zhu, Yimin; Xiao, Zhenghui; Chen, Lizhang; Tan, Hongzhuan

    2017-01-01

    Pediatric sepsis is a burdensome public health problem. Assessing the mortality risk of pediatric sepsis patients, offering effective treatment guidance, and improving prognosis to reduce mortality rates, are crucial. We extracted data derived from electronic medical records of pediatric sepsis patients that were collected during the first 24 hours after admission to the pediatric intensive care unit (PICU) of the Hunan Children's Hospital from January 2012 to June 2014. A total of 788 children were randomly divided into a training (592, 75%) and validation group (196, 25%). The risk factors for mortality among these patients were identified by conducting multivariate logistic regression in the training group. Based on the established logistic regression equation, the logit probabilities for all patients (in both groups) were calculated to verify the model's internal and external validities. According to the training group, 6 variables (brain natriuretic peptide, albumin, total bilirubin, D-dimer, lactate levels, and mechanical ventilation in 24 hours) were included in the final logistic regression model. The areas under the curves of the model were 0.854 (0.826, 0.881) and 0.844 (0.816, 0.873) in the training and validation groups, respectively. The Mortality Risk Model for Pediatric Sepsis we established in this study showed acceptable accuracy to predict the mortality risk in pediatric sepsis patients. PMID:28514310

  20. Propensity score estimation: neural networks, support vector machines, decision trees (CART), and meta-classifiers as alternatives to logistic regression.

    PubMed

    Westreich, Daniel; Lessler, Justin; Funk, Michele Jonsson

    2010-08-01

    Propensity scores for the analysis of observational data are typically estimated using logistic regression. Our objective in this review was to assess machine learning alternatives to logistic regression, which may accomplish the same goals but with fewer assumptions or greater accuracy. We identified alternative methods for propensity score estimation and/or classification from the public health, biostatistics, discrete mathematics, and computer science literature, and evaluated these algorithms for applicability to the problem of propensity score estimation, potential advantages over logistic regression, and ease of use. We identified four techniques as alternatives to logistic regression: neural networks, support vector machines, decision trees (classification and regression trees [CART]), and meta-classifiers (in particular, boosting). Although the assumptions of logistic regression are well understood, those assumptions are frequently ignored. All four alternatives have advantages and disadvantages compared with logistic regression. Boosting (meta-classifiers) and, to a lesser extent, decision trees (particularly CART), appear to be most promising for use in the context of propensity score analysis, but extensive simulation studies are needed to establish their utility in practice. Copyright (c) 2010 Elsevier Inc. All rights reserved.

  1. Impact of Colic Pain as a Significant Factor for Predicting the Stone Free Rate of One-Session Shock Wave Lithotripsy for Treating Ureter Stones: A Bayesian Logistic Regression Model Analysis

    PubMed Central

    Chung, Doo Yong; Cho, Kang Su; Lee, Dae Hun; Han, Jang Hee; Kang, Dong Hyuk; Jung, Hae Do; Kown, Jong Kyou; Ham, Won Sik; Choi, Young Deuk; Lee, Joo Yong

    2015-01-01

    Purpose: This study was conducted to evaluate colic pain as a prognostic pretreatment factor that can influence ureter stone clearance and to estimate the probability of stone-free status in shock wave lithotripsy (SWL) patients with a ureter stone. Materials and Methods: We retrospectively reviewed the medical records of 1,418 patients who underwent their first SWL between 2005 and 2013. Among these patients, 551 had a ureter stone measuring 4–20 mm and were thus eligible for our analyses. Colic pain as the chief complaint was defined as subjective flank pain reported during history taking and physical examination. Propensity scores for colic pain were calculated for each patient using multivariate logistic regression based upon the following covariates: age, maximal stone length (MSL), and mean stone density (MSD). Each factor was evaluated as a predictor of stone-free status by Bayesian and non-Bayesian logistic regression models. Results: After propensity-score matching, 217 patients were extracted in each group from the total patient cohort. There were no statistical differences in the variables used in propensity-score matching. One-session success and stone-free rates were higher in the painful group (73.7% and 71.0%, respectively) than in the painless group (63.6% and 60.4%, respectively). In multivariate non-Bayesian and Bayesian logistic regression models, a painful stone, shorter MSL, and lower MSD were significant factors for one-session stone-free status in patients who underwent SWL. Conclusions: Colic pain in patients with ureter calculi was one of the significant predictors, along with MSL and MSD, of one-session stone-free status after SWL. PMID:25902059
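
    The abstract above does not say which matching algorithm was used. As one common possibility, the sketch below performs greedy one-to-one nearest-neighbour matching on the propensity score with a caliper. The algorithm choice, the caliper value, the function name, and the patient identifiers and scores are all assumptions made for illustration.

```python
def greedy_match(treated_scores, control_scores, caliper):
    """Greedy 1:1 nearest-neighbour matching on the propensity score.

    treated_scores, control_scores -- dicts mapping patient id to propensity score
    caliper -- maximum allowed score difference within a matched pair
    Returns a list of (treated_id, control_id) pairs; each control is used once.
    """
    available = dict(control_scores)
    pairs = []
    # Process treated patients in order of increasing score.
    for t_id, t_score in sorted(treated_scores.items(), key=lambda kv: kv[1]):
        if not available:
            break
        # Closest remaining control by absolute score difference.
        c_id = min(available, key=lambda c: abs(available[c] - t_score))
        if abs(available[c_id] - t_score) <= caliper:
            pairs.append((t_id, c_id))
            del available[c_id]  # match without replacement
    return pairs


# Made-up propensity scores, for illustration only.
painful = {"a": 0.30, "b": 0.70}
painless = {"x": 0.32, "y": 0.90, "z": 0.69}
print(greedy_match(painful, painless, caliper=0.05))
```

    Matching without replacement yields equally sized groups, which is consistent with the 217 patients per group reported above, although the study may have used a different procedure.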

  2. Should metacognition be measured by logistic regression?

    PubMed

    Rausch, Manuel; Zehetleitner, Michael

    2017-03-01

    Are logistic regression slopes suitable to quantify metacognitive sensitivity, i.e. the efficiency with which subjective reports differentiate between correct and incorrect task responses? We analytically show that logistic regression slopes are independent from rating criteria in one specific model of metacognition, which assumes (i) that rating decisions are based on sensory evidence generated independently of the sensory evidence used for primary task responses and (ii) that the distributions of evidence are logistic. Given a hierarchical model of metacognition, logistic regression slopes depend on rating criteria. According to all considered models, regression slopes depend on the primary task criterion. A reanalysis of previous data revealed that massive numbers of trials are required to distinguish between hierarchical and independent models with tolerable accuracy. It is argued that researchers who wish to use logistic regression as measure of metacognitive sensitivity need to control the primary task criterion and rating criteria. Copyright © 2017 Elsevier Inc. All rights reserved.

  3. London Measure of Unplanned Pregnancy: guidance for its use as an outcome measure

    PubMed Central

    Hall, Jennifer A; Barrett, Geraldine; Copas, Andrew; Stephenson, Judith

    2017-01-01

    Background The London Measure of Unplanned Pregnancy (LMUP) is a psychometrically validated measure of the degree of intention of a current or recent pregnancy. The LMUP is increasingly being used worldwide, and can be used to evaluate family planning or preconception care programs. However, beyond recommending the use of the full LMUP scale, there is no published guidance on how to use the LMUP as an outcome measure. Ordinal logistic regression has been recommended informally, but studies published to date have all used binary logistic regression and dichotomized the scale at different cut points. There is thus a need for evidence-based guidance to provide a standardized methodology for multivariate analysis and to enable comparison of results. This paper makes recommendations for the regression method for analysis of the LMUP as an outcome measure. Materials and methods Data collected from 4,244 pregnant women in Malawi were used to compare five regression methods: linear, logistic with two cut points, and ordinal logistic with either the full or grouped LMUP score. The recommendations were then tested on the original UK LMUP data. Results There were small but no important differences in the findings across the regression models. Logistic regression resulted in the largest loss of information, and assumptions were violated for the linear and ordinal logistic regression. Consequently, robust standard errors were used for linear regression and a partial proportional odds ordinal logistic regression model attempted. The latter could only be fitted for grouped LMUP score. Conclusion We recommend the linear regression model with robust standard errors to make full use of the LMUP score when analyzed as an outcome measure. Ordinal logistic regression could be considered, but a partial proportional odds model with grouped LMUP score may be required. Logistic regression is the least-favored option, due to the loss of information. For logistic regression, the cut point for un/planned pregnancy should be between nine and ten. These recommendations will standardize the analysis of LMUP data and enhance comparability of results across studies. PMID:28435343

  4. Logistic models--an odd(s) kind of regression.

    PubMed

    Jupiter, Daniel C

    2013-01-01

    The logistic regression model bears some similarity to the multivariable linear regression with which we are familiar. However, the differences are great enough to warrant a discussion of the need for and interpretation of logistic regression. Copyright © 2013 American College of Foot and Ankle Surgeons. Published by Elsevier Inc. All rights reserved.

  5. Prediction of Emergency Department Hospital Admission Based on Natural Language Processing and Neural Networks.

    PubMed

    Zhang, Xingyu; Kim, Joyce; Patzer, Rachel E; Pitts, Stephen R; Patzer, Aaron; Schrager, Justin D

    2017-10-26

    To describe and compare logistic regression and neural network modeling strategies to predict hospital admission or transfer following initial presentation to Emergency Department (ED) triage with and without the addition of natural language processing elements. Using data from the National Hospital Ambulatory Medical Care Survey (NHAMCS), a cross-sectional probability sample of United States EDs from 2012 and 2013 survey years, we developed several predictive models with the outcome being admission to the hospital or transfer vs. discharge home. We included patient characteristics immediately available after the patient has presented to the ED and undergone a triage process. We used this information to construct logistic regression (LR) and multilayer neural network models (MLNN) which included natural language processing (NLP) and principal component analysis from the patient's reason for visit. Ten-fold cross validation was used to test the predictive capacity of each model, and the area under the receiver operating characteristic curve (AUC) was then calculated for each model. Of the 47,200 ED visits from 642 hospitals, 6,335 (13.42%) resulted in hospital admission (or transfer). A total of 48 principal components were extracted by NLP from the reason for visit fields, which explained 75% of the overall variance for hospitalization. In the model including only structured variables, the AUC was 0.824 (95% CI 0.818-0.830) for logistic regression and 0.823 (95% CI 0.817-0.829) for MLNN. Models including only free-text information generated AUC of 0.742 (95% CI 0.731-0.753) for logistic regression and 0.753 (95% CI 0.742-0.764) for MLNN. When both structured variables and free text variables were included, the AUC reached 0.846 (95% CI 0.839-0.853) for logistic regression and 0.844 (95% CI 0.836-0.852) for MLNN. The predictive accuracy of hospital admission or transfer for patients who presented to ED triage overall was good, and was improved with the inclusion of free text data from a patient's reason for visit regardless of modeling approach. Natural language processing and neural networks that incorporate patient-reported outcome free text may increase predictive accuracy for hospital admission.
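
    The AUC values quoted in this record summarize how well a model's predicted probabilities rank admitted visits above discharged ones. A minimal sketch of that rank-based definition follows; the function name `auc` and the toy labels and scores are illustrative assumptions, not data or code from the study.

```python
def auc(labels, scores):
    """Area under the ROC curve via its rank interpretation.

    Returns the share of (positive, negative) pairs in which the positive
    case receives the higher score; tied pairs count as half a win.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical outcomes (1 = admitted) and predicted probabilities.
toy_labels = [0, 0, 1, 1]
toy_scores = [0.10, 0.40, 0.35, 0.80]
toy_auc = auc(toy_labels, toy_scores)
```

    The pairwise loop is quadratic in the number of cases, so it suits small illustrations only; a sort-based version would be preferable for data on the scale of a national survey.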

  6. Comparison of Xenon-Enhanced Area-Detector CT and Krypton Ventilation SPECT/CT for Assessment of Pulmonary Functional Loss and Disease Severity in Smokers.

    PubMed

    Ohno, Yoshiharu; Fujisawa, Yasuko; Takenaka, Daisuke; Kaminaga, Shigeo; Seki, Shinichiro; Sugihara, Naoki; Yoshikawa, Takeshi

    2018-02-01

    The objective of this study was to compare the capability of xenon-enhanced area-detector CT (ADCT) performed with a subtraction technique and coregistered 81mKr-ventilation SPECT/CT for the assessment of pulmonary functional loss and disease severity in smokers. Forty-six consecutive smokers (32 men and 14 women; mean age, 67.0 years) underwent prospective unenhanced and xenon-enhanced ADCT, 81mKr-ventilation SPECT/CT, and pulmonary function tests. Disease severity was evaluated according to the Global Initiative for Chronic Obstructive Lung Disease (GOLD) classification. CT-based functional lung volume (FLV), the percentage of wall area to total airway area (WA%), and ventilated FLV on xenon-enhanced ADCT and SPECT/CT were calculated for each smoker. All indexes were correlated with percentage of forced expiratory volume in 1 second (%FEV1) using step-wise regression analyses, and univariate and multivariate logistic regression analyses were performed. In addition, the diagnostic accuracy of the proposed model was compared with that of each radiologic index by means of McNemar analysis. Multivariate logistic regression showed that %FEV1 was significantly affected (r = 0.77, r² = 0.59) by two factors: the first factor, ventilated FLV on xenon-enhanced ADCT (p < 0.0001); and the second factor, WA% (p = 0.004). Univariate logistic regression analyses indicated that all indexes significantly affected GOLD classification (p < 0.05). Multivariate logistic regression analyses revealed that ventilated FLV on xenon-enhanced ADCT and CT-based FLV significantly influenced GOLD classification (p < 0.0001). The diagnostic accuracy of the proposed model was significantly higher than that of ventilated FLV on SPECT/CT (p = 0.03) and WA% (p = 0.008). Xenon-enhanced ADCT is more effective than 81mKr-ventilation SPECT/CT for the assessment of pulmonary functional loss and disease severity.

  7. Odontological approach to sexual dimorphism in southeastern France.

    PubMed

    Lladeres, Emilie; Saliba-Serre, Bérengère; Sastre, Julien; Foti, Bruno; Tardivo, Delphine; Adalian, Pascal

    2013-01-01

    The aim of this study was to establish a prediction formula to allow for the determination of sex among the southeastern French population using dental measurements. The sample consisted of 105 individuals (57 males and 48 females, aged between 18 and 25 years). Dental measurements were calculated using Euclidean distances, in three-dimensional space, from point coordinates obtained by a Microscribe. A multiple logistic regression analysis was performed to establish the prediction formula. Among 12 selected dental distances, a stepwise logistic regression analysis highlighted the two most significant discriminating predictors of sex: one located at the mandible and the other at the maxilla. A cutpoint was proposed for the prediction of true sex. The prediction formula was then tested on a validation sample (20 males and 34 females, aged between 18 and 62 years and with a history of orthodontics or restorative care) to evaluate the accuracy of the method. © 2012 American Academy of Forensic Sciences.

  8. A Proposal for Phase 4 of the Forest Inventory and Analysis Program

    Treesearch

    Ronald E. McRoberts

    2005-01-01

    Maps of forest cover were constructed using observations from forest inventory plots, Landsat Thematic Mapper satellite imagery, and a logistic regression model. Estimates of mean proportion forest area and the variance of the mean were calculated for circular study areas with radii ranging from 1 km to 15 km. The spatial correlation among pixel predictions was...

  9. Reanalysis of the start of the UK 1967 to 1968 foot-and-mouth disease epidemic to calculate airborne transmission probabilities.

    PubMed

    Sanson, R L; Gloster, J; Burgin, L

    2011-09-24

    The aims of this study were to statistically reassess the likelihood that windborne spread of foot-and-mouth disease (FMD) virus (FMDV) occurred at the start of the UK 1967 to 1968 FMD epidemic at Oswestry, Shropshire, and to derive dose-response probability of infection curves for farms exposed to airborne FMDV. To enable this, data on all farms present in 1967 in the parishes near Oswestry were assembled. Cases were infected premises whose date of appearance of first clinical signs was within 14 days of the depopulation of the index farm. Logistic regression was used to evaluate the association between infection status and distance and direction from the index farm. The UK Met Office's NAME atmospheric dispersion model (ADM) was used to generate plumes for each day that FMDV was excreted from the index farm based on actual historical weather records from October 1967. Daily airborne FMDV exposure rates for all farms in the study area were calculated using a geographical information system. Probit analyses were used to calculate dose-response probability of infection curves to FMDV, using relative exposure rates on case and control farms. Both the logistic regression and probit analyses gave strong statistical support to the hypothesis that airborne spread occurred. There was some evidence that incubation period was inversely proportional to the exposure rate.

  10. Predicting Visual Distraction Using Driving Performance Data

    PubMed Central

    Kircher, Katja; Ahlstrom, Christer

    2010-01-01

    Behavioral variables are often used as performance indicators (PIs) of visual or internal distraction induced by secondary tasks. The objective of this study is to investigate whether visual distraction can be predicted by driving performance PIs in a naturalistic setting. Visual distraction is here defined by a gaze-based real-time distraction detection algorithm called AttenD. Seven drivers used an instrumented vehicle for one month each in a small-scale field operational test. For each of the visual distraction events detected by AttenD, seven PIs such as steering wheel reversal rate and throttle hold were calculated. Corresponding data were also calculated for time periods during which the drivers were classified as attentive. For each PI, means were compared between distracted and attentive states using t-tests for different time-window sizes (2–40 s), and the window width with the smallest resulting p-value was selected as optimal. Based on the optimized PIs, logistic regression was used to predict whether the drivers were attentive or distracted. The logistic regression resulted in predictions which were 76% correct (sensitivity = 77% and specificity = 76%). The conclusion is that there is a relationship between behavioral variables and visual distraction, but the relationship is not strong enough to accurately predict visual driver distraction. Instead, behavioral PIs are probably best suited as complementary to eye-tracking-based algorithms in order to make them more accurate and robust. PMID:21050615
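
    The percent-correct, sensitivity, and specificity figures in this record are all derived from a two-by-two confusion table. The sketch below shows the arithmetic; the counts are hypothetical and were chosen only so that the rates resemble those reported, since the record does not give the underlying table.

```python
def classification_rates(tp, fn, tn, fp):
    """Return (sensitivity, specificity, accuracy) from confusion-table counts.

    tp / fn: distracted episodes classified correctly / incorrectly.
    tn / fp: attentive episodes classified correctly / incorrectly.
    """
    sensitivity = tp / (tp + fn)          # share of true positives detected
    specificity = tn / (tn + fp)          # share of true negatives detected
    accuracy = (tp + tn) / (tp + fn + tn + fp)
    return sensitivity, specificity, accuracy


# Hypothetical counts, not taken from the study.
sens, spec, acc = classification_rates(tp=77, fn=23, tn=76, fp=24)
```

    Accuracy depends on the mix of distracted and attentive episodes, whereas sensitivity and specificity do not, which is one reason all three are usually reported together.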

  11. PARAMETRIC AND NON PARAMETRIC (MARS: MULTIVARIATE ADDITIVE REGRESSION SPLINES) LOGISTIC REGRESSIONS FOR PREDICTION OF A DICHOTOMOUS RESPONSE VARIABLE WITH AN EXAMPLE FOR PRESENCE/ABSENCE OF AMPHIBIANS

    EPA Science Inventory

    The purpose of this report is to provide a reference manual that could be used by investigators for making informed use of logistic regression using two methods (standard logistic regression and MARS). The details for analyses of relationships between a dependent binary response ...

  12. Predicting U.S. Army Reserve Unit Manning Using Market Demographics

    DTIC Science & Technology

    2015-06-01

    develops linear regression, classification tree, and logistic regression models to determine the ability of the location to support manning requirements... logistic regression model delivers predictive results that allow decision-makers to identify locations with a high probability of meeting unit...manning requirements. The recommendation of this thesis is that the USAR implement the logistic regression model.

  13. Analyzing Student Learning Outcomes: Usefulness of Logistic and Cox Regression Models. IR Applications, Volume 5

    ERIC Educational Resources Information Center

    Chen, Chau-Kuang

    2005-01-01

    Logistic and Cox regression methods are practical tools used to model the relationships between certain student learning outcomes and their relevant explanatory variables. The logistic regression model fits an S-shaped curve into a binary outcome with data points of zero and one. The Cox regression model allows investigators to study the duration…
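
    The S-shaped curve mentioned in this record is the logistic function applied to a linear predictor. A minimal sketch, assuming a single explanatory variable; the names `logistic` and `predict_probability` and the coefficient values in the usage line are illustrative, not taken from the article.

```python
import math


def logistic(z):
    """Map any real number to the interval (0, 1) along an S-shaped curve."""
    # Branching on the sign keeps math.exp from overflowing for large |z|.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def predict_probability(x, intercept, slope):
    """Predicted probability of the outcome coded 1 for predictor value x."""
    return logistic(intercept + slope * x)


# Hypothetical coefficients: the probability rises with x and crosses 0.5 at x = 2.
p_at_two = predict_probability(2.0, intercept=-4.0, slope=2.0)
```

    The slope controls how steeply the curve rises, and the ratio of negative intercept to slope gives the predictor value at which the predicted probability equals one half.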

  14. An appraisal of convergence failures in the application of logistic regression model in published manuscripts.

    PubMed

    Yusuf, O B; Bamgboye, E A; Afolabi, R F; Shodimu, M A

    2014-09-01

    The logistic regression model is widely used in health research for descriptive and predictive purposes. Unfortunately, researchers are sometimes not aware that the underlying principles of the technique have failed when the maximum likelihood algorithm does not converge. Young researchers, particularly postgraduate students, may not know why the separation problem, whether quasi or complete, occurs, how to identify it, and how to fix it. This study was designed to critically evaluate convergence issues in articles that employed logistic regression analysis published in the African Journal of Medicine and Medical Sciences between 2004 and 2013. Problems of quasi or complete separation were described and were illustrated with the National Demographic and Health Survey dataset. A critical evaluation of articles that employed logistic regression was conducted. A total of 581 articles were reviewed, of which 40 (6.9%) used binary logistic regression. Twenty-four (60.0%) stated the use of the logistic regression model in the methodology, while none of the articles assessed model fit. Only 3 (12.5%) properly described the procedures. Of the 40 that used the logistic regression model, the problem of convergence occurred in 6 (15.0%) of the articles. Logistic regression tends to be poorly reported in studies published between 2004 and 2013. Our findings showed that the procedure may not be well understood by researchers, since very few described the process in their reports, and that researchers may be totally unaware of the problem of convergence or how to deal with it.
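
    Separation, the cause of non-convergence discussed in this record, can be screened for before fitting. The sketch below checks a single numeric predictor against a binary outcome; it is an illustrative diagnostic under that one-predictor assumption, not the procedure used in the study, and the function name is hypothetical. With several predictors, separation can also arise from a linear combination, which this check would miss.

```python
def separation_status(x, y):
    """Classify separation of a 0/1 outcome y by one numeric predictor x.

    'complete'       : a threshold on x splits the outcomes perfectly.
    'quasi-complete' : the split is perfect except for ties at the threshold.
    'none'           : the predictor ranges of the two groups overlap.
    """
    x0 = [xi for xi, yi in zip(x, y) if yi == 0]
    x1 = [xi for xi, yi in zip(x, y) if yi == 1]
    if not x0 or not x1:
        raise ValueError("both outcome groups must be present")
    if len(set(x)) == 1:
        return "none"  # a constant predictor cannot separate anything
    if max(x0) < min(x1) or max(x1) < min(x0):
        return "complete"
    if max(x0) == min(x1) or max(x1) == min(x0):
        return "quasi-complete"
    return "none"


# Hypothetical data: every outcome of 1 has a larger x than every outcome of 0.
status = separation_status([1, 2, 3, 4], [0, 0, 1, 1])
```

    When either form of separation is present, the maximum likelihood estimate of the slope does not exist as a finite number, which is why iterative fitting routines fail to converge or report very large coefficients and standard errors.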

  15. Logistic Regression: Concept and Application

    ERIC Educational Resources Information Center

    Cokluk, Omay

    2010-01-01

    The main focus of logistic regression analysis is classification of individuals in different groups. The aim of the present study is to explain basic concepts and processes of binary logistic regression analysis intended to determine the combination of independent variables which best explain the membership in certain groups called dichotomous…

  16. An Entropy-Based Measure for Assessing Fuzziness in Logistic Regression

    PubMed Central

    Weiss, Brandi A.; Dardick, William

    2015-01-01

    This article introduces an entropy-based measure of data–model fit that can be used to assess the quality of logistic regression models. Entropy has previously been used in mixture-modeling to quantify how well individuals are classified into latent classes. The current study proposes the use of entropy for logistic regression models to quantify the quality of classification and separation of group membership. Entropy complements preexisting measures of data–model fit and provides unique information not contained in other measures. Hypothetical data scenarios, an applied example, and Monte Carlo simulation results are used to demonstrate the application of entropy in logistic regression. Entropy should be used in conjunction with other measures of data–model fit to assess how well logistic regression models classify cases into observed categories. PMID:29795897
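
    As a rough illustration of how entropy can quantify classification quality, the sketch below computes the relative entropy statistic familiar from mixture modeling, applied to the two outcome categories of a binary logistic regression. Whether the article's index takes exactly this form is an assumption; the function name and the example probabilities are hypothetical.

```python
import math


def relative_entropy(probabilities):
    """Relative entropy index for two classes, on a 0-to-1 scale.

    probabilities: predicted P(y = 1) for each case.
    1 means every case is classified with certainty (probabilities of 0 or 1);
    0 means every case sits at 0.5, i.e. maximal fuzziness.
    """
    n = len(probabilities)
    if n == 0:
        raise ValueError("need at least one predicted probability")
    total = 0.0
    for p in probabilities:
        for q in (p, 1.0 - p):
            if q > 0.0:                 # treat 0 * log(0) as 0
                total -= q * math.log(q)
    return 1.0 - total / (n * math.log(2))


# Hypothetical predicted probabilities from a fitted model.
sharp = relative_entropy([0.02, 0.97, 0.05, 0.99])
fuzzy = relative_entropy([0.45, 0.55, 0.50, 0.60])
```

    In this form the index looks only at the predicted probabilities, not at whether the predictions are correct, which is consistent with the record's advice to use entropy alongside other measures of fit.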

  17. Logistic regression applied to natural hazards: rare event logistic regression with replications

    NASA Astrophysics Data System (ADS)

    Guns, M.; Vanacker, V.

    2012-06-01

    Statistical analysis of natural hazards needs particular attention, as most of these phenomena are rare events. This study shows that the ordinary rare event logistic regression, as it is now commonly used in geomorphologic studies, does not always lead to a robust detection of controlling factors, as the results can be strongly sample-dependent. In this paper, we introduce some concepts of Monte Carlo simulations in rare event logistic regression. This technique, so-called rare event logistic regression with replications, combines the strength of probabilistic and statistical methods, and allows overcoming some of the limitations of previous developments through robust variable selection. This technique was here developed for the analyses of landslide controlling factors, but the concept is widely applicable for statistical analyses of natural hazards.

  18. Large unbalanced credit scoring using Lasso-logistic regression ensemble.

    PubMed

    Wang, Hong; Xu, Qingsong; Zhou, Lifeng

    2015-01-01

    Recently, various ensemble learning methods with different base classifiers have been proposed for credit scoring problems. However, for various reasons, there has been little research using logistic regression as the base classifier. In this paper, given large unbalanced data, we consider the plausibility of ensemble learning using regularized logistic regression as the base classifier to deal with credit scoring problems. In this research, the data is first balanced and diversified by clustering and bagging algorithms. Then we apply a Lasso-logistic regression learning ensemble to evaluate the credit risks. We show that the proposed algorithm outperforms popular credit scoring models such as decision tree, Lasso-logistic regression and random forests in terms of AUC and F-measure. We also provide two importance measures for the proposed model to identify important variables in the data.

  19. An Entropy-Based Measure for Assessing Fuzziness in Logistic Regression.

    PubMed

    Weiss, Brandi A; Dardick, William

    2016-12-01

    This article introduces an entropy-based measure of data-model fit that can be used to assess the quality of logistic regression models. Entropy has previously been used in mixture-modeling to quantify how well individuals are classified into latent classes. The current study proposes the use of entropy for logistic regression models to quantify the quality of classification and separation of group membership. Entropy complements preexisting measures of data-model fit and provides unique information not contained in other measures. Hypothetical data scenarios, an applied example, and Monte Carlo simulation results are used to demonstrate the application of entropy in logistic regression. Entropy should be used in conjunction with other measures of data-model fit to assess how well logistic regression models classify cases into observed categories.

  20. Sample size estimation for alternating logistic regressions analysis of multilevel randomized community trials of under-age drinking.

    PubMed

    Reboussin, Beth A; Preisser, John S; Song, Eun-Young; Wolfson, Mark

    2012-07-01

    Under-age drinking is an enormous public health issue in the USA. Evidence that community level structures may impact on under-age drinking has led to a proliferation of efforts to change the environment surrounding the use of alcohol. Although the focus of these efforts is to reduce drinking by individual youths, environmental interventions are typically implemented at the community level with entire communities randomized to the same intervention condition. A distinct feature of these trials is the tendency of the behaviours of individuals residing in the same community to be more alike than that of others residing in different communities, which is herein called 'clustering'. Statistical analyses and sample size calculations must account for this clustering to avoid type I errors and to ensure an appropriately powered trial. Clustering itself may also be of scientific interest. We consider the alternating logistic regressions procedure within the population-averaged modelling framework to estimate the effect of a law enforcement intervention on the prevalence of under-age drinking behaviours while modelling the clustering at multiple levels, e.g. within communities and within neighbourhoods nested within communities, by using pairwise odds ratios. We then derive sample size formulae for estimating intervention effects when planning a post-test-only or repeated cross-sectional community-randomized trial using the alternating logistic regressions procedure.

  1. Risk factors for lesions of the knee menisci among workers in South Korea's national parks.

    PubMed

    Shin, Donghee; Youn, Kanwoo; Lee, Eunja; Lee, Myeongjun; Chung, Hweemin; Kim, Deokweon

    2016-01-01

    This study was designed to investigate the prevalence of the menisci lesions in national park workers and work factors affecting this prevalence. The study subjects were 698 workers who worked in 20 Korean national parks in 2014. An orthopedist visited each national park and performed physical examinations. Knee MRI was performed if the McMurray test or Apley test was positive and there was a complaint of pain in the knee area. An orthopedist and a radiologist each read these images of the menisci using a grading system based on the MRI signals. To calculate the cumulative intensity of trekking of the workers, the mean trail distance, the difficulty of the trail, the tenure at each national park, and the number of treks per month for each worker from the start of work until the present were investigated. Chi-square tests were performed to see if there were differences in the menisci lesions grade according to the variables. The variables used in the Chi-square test were evaluated using simple logistic regression analysis to get crude odds ratios, and adjusted odds ratios and 95% confidence intervals were calculated using multivariate logistic regression analysis after establishing three different models according to the adjusted variables. According to the MRI signal grades of menisci, 29% were grade 0, 11.3% were grade 1, 46.0% were grade 2, and 13.7% were grade 3. The differences in the MRI signal grades of menisci according to age and the intensity of trekking as calculated by the three different methods were statistically significant. Multiple logistic regression analysis was performed for three models. In model 1, there was no statistically significant factor affecting the menisci lesions. In model 2, among the factors affecting the menisci lesions, the OR of a high cumulative intensity of trekking was 4.08 (95% CI 1.00-16.61), and in model 3, the OR of a high cumulative intensity of trekking was 5.84 (95% CI 1.09-31.26). The factor that most affected the menisci lesions among the workers in Korean national parks was a high cumulative intensity of trekking.
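
    Odds ratios and confidence intervals like those in this record are usually obtained by exponentiating a logistic regression coefficient and its Wald limits. The sketch below shows that relationship and how a standard error can be recovered from a published interval; it assumes a Wald-type interval with z = 1.96, which the record does not state, and the function names are illustrative.

```python
import math


def odds_ratio_ci(beta, se, z=1.96):
    """Return (odds ratio, lower limit, upper limit) for coefficient beta.

    Assumes a Wald interval: exp(beta - z * se) to exp(beta + z * se).
    """
    return math.exp(beta), math.exp(beta - z * se), math.exp(beta + z * se)


def se_from_ci(lower, upper, z=1.96):
    """Recover the coefficient's standard error from a reported interval."""
    return (math.log(upper) - math.log(lower)) / (2.0 * z)


# Values as reported for model 2 in the record above.
se_model2 = se_from_ci(1.00, 16.61)
or_model2, lo_model2, hi_model2 = odds_ratio_ci(math.log(4.08), se_model2)
```

    On the log scale a Wald interval is symmetric around the coefficient, so the reported odds ratio should sit near the geometric mean of the two limits; small discrepancies reflect rounding in the published figures.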

  2. A Methodology for Generating Placement Rules that Utilizes Logistic Regression

    ERIC Educational Resources Information Center

    Wurtz, Keith

    2008-01-01

    The purpose of this article is to provide the necessary tools for institutional researchers to conduct a logistic regression analysis and interpret the results. Aspects of the logistic regression procedure that are necessary to evaluate models are presented and discussed with an emphasis on cutoff values and choosing the appropriate number of…

  3. Comparison of standard maximum likelihood classification and polytomous logistic regression used in remote sensing

    Treesearch

    John Hogland; Nedret Billor; Nathaniel Anderson

    2013-01-01

    Discriminant analysis, referred to as maximum likelihood classification within popular remote sensing software packages, is a common supervised technique used by analysts. Polytomous logistic regression (PLR), also referred to as multinomial logistic regression, is an alternative classification approach that is less restrictive, more flexible, and easy to interpret. To...

  4. Large Unbalanced Credit Scoring Using Lasso-Logistic Regression Ensemble

    PubMed Central

    Wang, Hong; Xu, Qingsong; Zhou, Lifeng

    2015-01-01

    Recently, various ensemble learning methods with different base classifiers have been proposed for credit scoring problems. However, for various reasons, there has been little research using logistic regression as the base classifier. In this paper, given large unbalanced data, we consider the plausibility of ensemble learning using regularized logistic regression as the base classifier to deal with credit scoring problems. In this research, the data is first balanced and diversified by clustering and bagging algorithms. Then we apply a Lasso-logistic regression learning ensemble to evaluate the credit risks. We show that the proposed algorithm outperforms popular credit scoring models such as decision tree, Lasso-logistic regression and random forests in terms of AUC and F-measure. We also provide two importance measures for the proposed model to identify important variables in the data. PMID:25706988

  5. An Entropy-Based Measure for Assessing Fuzziness in Logistic Regression

    ERIC Educational Resources Information Center

    Weiss, Brandi A.; Dardick, William

    2016-01-01

    This article introduces an entropy-based measure of data-model fit that can be used to assess the quality of logistic regression models. Entropy has previously been used in mixture-modeling to quantify how well individuals are classified into latent classes. The current study proposes the use of entropy for logistic regression models to quantify…

  6. What Are the Odds of that? A Primer on Understanding Logistic Regression

    ERIC Educational Resources Information Center

    Huang, Francis L.; Moon, Tonya R.

    2013-01-01

    The purpose of this Methodological Brief is to present a brief primer on logistic regression, a commonly used technique when modeling dichotomous outcomes. Using data from the National Education Longitudinal Study of 1988 (NELS:88), logistic regression techniques were used to investigate student-level variables in eighth grade (i.e., enrolled in a…

  7. Classification of sodium MRI data of cartilage using machine learning.

    PubMed

    Madelin, Guillaume; Poidevin, Frederick; Makrymallis, Antonios; Regatte, Ravinder R

    2015-11-01

    To assess the possible utility of machine learning for classifying subjects with and subjects without osteoarthritis using sodium magnetic resonance imaging data. Theory: Support vector machine, k-nearest neighbors, naïve Bayes, discriminant analysis, linear regression, logistic regression, neural networks, decision tree, and tree bagging were tested. Sodium magnetic resonance imaging with and without fluid suppression by inversion recovery was acquired on the knee cartilage of 19 controls and 28 osteoarthritis patients. Sodium concentrations were measured in regions of interest in the knee for both acquisitions. Mean (MEAN) and standard deviation (STD) of these concentrations were measured in each region of interest, and the minimum, maximum, and mean of these two measurements were calculated over all regions of interest for each subject. The resulting 12 variables per subject were used as predictors for classification. Either Min [STD] alone, or in combination with Mean [MEAN] or Min [MEAN], all from fluid suppressed data, were the best predictors with an accuracy >74%, mainly with linear logistic regression and linear support vector machine. Other good classifiers include discriminant analysis, linear regression, and naïve Bayes. Machine learning is a promising technique for classifying osteoarthritis patients and controls from sodium magnetic resonance imaging data. © 2014 Wiley Periodicals, Inc.

  8. On the Usefulness of a Multilevel Logistic Regression Approach to Person-Fit Analysis

    ERIC Educational Resources Information Center

    Conijn, Judith M.; Emons, Wilco H. M.; van Assen, Marcel A. L. M.; Sijtsma, Klaas

    2011-01-01

    The logistic person response function (PRF) models the probability of a correct response as a function of the item locations. Reise (2000) proposed to use the slope parameter of the logistic PRF as a person-fit measure. He reformulated the logistic PRF model as a multilevel logistic regression model and estimated the PRF parameters from this…

  9. Mortality risk prediction in burn injury: Comparison of logistic regression with machine learning approaches.

    PubMed

    Stylianou, Neophytos; Akbarov, Artur; Kontopantelis, Evangelos; Buchan, Iain; Dunn, Ken W

    2015-08-01

    Predicting mortality from burn injury has traditionally employed logistic regression models. Alternative machine learning methods have been introduced in some areas of clinical prediction as the necessary software and computational facilities have become accessible. Here we compare logistic regression and machine learning predictions of mortality from burn. An established logistic mortality model was compared to machine learning methods (artificial neural network, support vector machine, random forests and naïve Bayes) using a population-based (England & Wales) case-cohort registry. Predictive evaluation used: area under the receiver operating characteristic curve; sensitivity; specificity; positive predictive value and Youden's index. All methods had comparable discriminatory abilities, similar sensitivities, specificities and positive predictive values. Although some machine learning methods performed marginally better than logistic regression the differences were seldom statistically significant and clinically insubstantial. Random forests were marginally better for high positive predictive value and reasonable sensitivity. Neural networks yielded slightly better prediction overall. Logistic regression gives an optimal mix of performance and interpretability. The established logistic regression model of burn mortality performs well against more complex alternatives. Clinical prediction with a small set of strong, stable, independent predictors is unlikely to gain much from machine learning outside specialist research contexts. Copyright © 2015 Elsevier Ltd and ISBI. All rights reserved.

  10. Bias in logistic regression due to imperfect diagnostic test results and practical correction approaches.

    PubMed

    Valle, Denis; Lima, Joanna M Tucker; Millar, Justin; Amratia, Punam; Haque, Ubydul

    2015-11-04

    Logistic regression is a statistical model widely used in cross-sectional and cohort studies to identify and quantify the effects of potential disease risk factors. However, the impact of imperfect tests on adjusted odds ratios (and thus on the identification of risk factors) is under-appreciated. The purpose of this article is to draw attention to the problem associated with modelling imperfect diagnostic tests, and propose simple Bayesian models to adequately address this issue. A systematic literature review was conducted to determine the proportion of malaria studies that appropriately accounted for false-negatives/false-positives in a logistic regression setting. Inference from the standard logistic regression was also compared with that from three proposed Bayesian models using simulations and malaria data from the western Brazilian Amazon. A systematic literature review suggests that malaria epidemiologists are largely unaware of the problem of using logistic regression to model imperfect diagnostic test results. Simulation results reveal that statistical inference can be substantially improved when using the proposed Bayesian models versus the standard logistic regression. Finally, analysis of original malaria data with one of the proposed Bayesian models reveals that microscopy sensitivity is strongly influenced by how long people have lived in the study region, and an important risk factor (i.e., participation in forest extractivism) is identified that would have been missed by standard logistic regression. Given the numerous diagnostic methods employed by malaria researchers and the ubiquitous use of logistic regression to model the results of these diagnostic tests, this paper provides critical guidelines to improve data analysis practice in the presence of misclassification error. Easy-to-use code that can be readily adapted to WinBUGS is provided, enabling straightforward implementation of the proposed Bayesian models.

  11. Logistic regression for risk factor modelling in stuttering research.

    PubMed

    Reed, Phil; Wu, Yaqionq

    2013-06-01

    To outline the uses of logistic regression and other statistical methods for risk factor analysis in the context of research on stuttering. The principles underlying the application of a logistic regression are illustrated, and the types of questions to which such a technique has been applied in the stuttering field are outlined. The assumptions and limitations of the technique are discussed with respect to existing stuttering research, and with respect to formulating appropriate research strategies to accommodate these considerations. Finally, some alternatives to the approach are briefly discussed. The way the statistical procedures are employed is demonstrated with some hypothetical data. Research into several practical issues concerning stuttering could benefit if risk factor modelling were used. Important examples are early diagnosis, prognosis (whether a child will recover or persist) and assessment of treatment outcome. After reading this article you will: (a) Summarize the situations in which logistic regression can be applied to a range of issues about stuttering; (b) Follow the steps in performing a logistic regression analysis; (c) Describe the assumptions of the logistic regression technique and the precautions that need to be checked when it is employed; (d) Be able to summarize its advantages over other techniques like estimation of group differences and simple regression. Copyright © 2012 Elsevier Inc. All rights reserved.

  12. Logistic regression model for diagnosis of transition zone prostate cancer on multi-parametric MRI.

    PubMed

    Dikaios, Nikolaos; Alkalbani, Jokha; Sidhu, Harbir Singh; Fujiwara, Taiki; Abd-Alazeez, Mohamed; Kirkham, Alex; Allen, Clare; Ahmed, Hashim; Emberton, Mark; Freeman, Alex; Halligan, Steve; Taylor, Stuart; Atkinson, David; Punwani, Shonit

    2015-02-01

    We aimed to develop logistic regression (LR) models for classifying prostate cancer within the transition zone on multi-parametric magnetic resonance imaging (mp-MRI). One hundred and fifty-five patients (training cohort, 70 patients; temporal validation cohort, 85 patients) underwent mp-MRI and transperineal-template-prostate-mapping (TPM) biopsy. Positive cores were classified by cancer definitions: (1) any-cancer; (2) definition-1 [≥Gleason 4 + 3 or ≥ 6 mm cancer core length (CCL)] [high risk significant]; and (3) definition-2 (≥Gleason 3 + 4 or ≥ 4 mm CCL) cancer [intermediate-high risk significant]. For each, logistic-regression mp-MRI models were derived from the training cohort and validated internally and with the temporal cohort. Sensitivity/specificity and the area under the receiver operating characteristic (ROC-AUC) curve were calculated. LR model performance was compared to radiologists' performance. Twenty-eight of 70 patients from the training cohort, and 25/85 patients from the temporal validation cohort had significant cancer on TPM. The ROC-AUC of the LR model for classification of cancer was 0.73/0.67 at internal/temporal validation. The radiologist A/B ROC-AUC was 0.65/0.74 (temporal cohort). For patients scored by radiologists as Prostate Imaging Reporting and Data System (Pi-RADS) score 3, sensitivity/specificity of radiologist A 'best guess' and LR model was 0.14/0.54 and 0.71/0.61, respectively; and radiologist B 'best guess' and LR model was 0.40/0.34 and 0.50/0.76, respectively. LR models can improve classification of Pi-RADS score 3 lesions similar to experienced radiologists. • MRI helps find prostate cancer in the anterior of the gland • Logistic regression models based on mp-MRI can classify prostate cancer • Computers can help confirm cancer in areas doctors are uncertain about.

  13. A nonparametric multiple imputation approach for missing categorical data.

    PubMed

    Zhou, Muhan; He, Yulei; Yu, Mandi; Hsu, Chiu-Hsieh

    2017-06-06

    Incomplete categorical variables with more than two categories are common in public health data. However, most of the existing missing-data methods do not use the information from nonresponse (missingness) probabilities. We propose a nearest-neighbour multiple imputation approach to impute a missing at random categorical outcome and to estimate the proportion of each category. The donor set for imputation is formed by measuring distances between each missing value with other non-missing values. The distance function is calculated based on a predictive score, which is derived from two working models: one fits a multinomial logistic regression for predicting the missing categorical outcome (the outcome model) and the other fits a logistic regression for predicting missingness probabilities (the missingness model). A weighting scheme is used to accommodate contributions from two working models when generating the predictive score. A missing value is imputed by randomly selecting one of the non-missing values with the smallest distances. We conduct a simulation to evaluate the performance of the proposed method and compare it with several alternative methods. A real-data application is also presented. The simulation study suggests that the proposed method performs well when missingness probabilities are not extreme under some misspecifications of the working models. However, the calibration estimator, which is also based on two working models, can be highly unstable when missingness probabilities for some observations are extremely high. In this scenario, the proposed method produces more stable and better estimates. In addition, proper weights need to be chosen to balance the contributions from the two working models and achieve optimal results for the proposed method. 
    We conclude that the proposed multiple imputation method is a reasonable approach to dealing with missing categorical outcome data with more than two levels for assessing the distribution of the outcome. In terms of the choices for the working models, we suggest a multinomial logistic regression for predicting the missing outcome and a binary logistic regression for predicting the missingness probability.

  14. Dynamic Dimensionality Selection for Bayesian Classifier Ensembles

    DTIC Science & Technology

    2015-03-19

    learning of weights in an otherwise generatively learned naive Bayes classifier. WANBIA-C is very competitive with Logistic Regression but much more...classifier, Generative learning, Discriminative learning, Naïve Bayes, Feature selection, Logistic regression, higher order attribute independence 16...discriminative learning of weights in an otherwise generatively learned naive Bayes classifier. WANBIA-C is very competitive with Logistic Regression but

  15. A review of logistic regression models used to predict post-fire tree mortality of western North American conifers

    Treesearch

    Travis Woolley; David C. Shaw; Lisa M. Ganio; Stephen Fitzgerald

    2012-01-01

    Logistic regression models used to predict tree mortality are critical to post-fire management, planning prescribed burns and understanding disturbance ecology. We review literature concerning post-fire mortality prediction using logistic regression models for coniferous tree species in the western USA. We include synthesis and review of: methods to develop, evaluate...

  16. Preserving Institutional Privacy in Distributed binary Logistic Regression.

    PubMed

    Wu, Yuan; Jiang, Xiaoqian; Ohno-Machado, Lucila

    2012-01-01

    Privacy is becoming a major concern when sharing biomedical data across institutions. Although methods for protecting the privacy of individual patients have been proposed, it is not clear how to protect institutional privacy, which is often a critical concern of data custodians. Building upon our previous work, Grid Binary LOgistic REgression (GLORE), we developed an Institutional Privacy-preserving Distributed binary Logistic Regression model (IPDLR) that considers both individual and institutional privacy for building a logistic regression model in a distributed manner. We tested our method using both simulated and clinical data, showing how it is possible to protect the privacy of individuals and of institutions using a distributed strategy.

  17. Covariate Imbalance and Adjustment for Logistic Regression Analysis of Clinical Trial Data

    PubMed Central

    Ciolino, Jody D.; Martin, Reneé H.; Zhao, Wenle; Jauch, Edward C.; Hill, Michael D.; Palesch, Yuko Y.

    2014-01-01

    In logistic regression analysis for binary clinical trial data, adjusted treatment effect estimates are often not equivalent to unadjusted estimates in the presence of influential covariates. This paper uses simulation to quantify the benefit of covariate adjustment in logistic regression. However, International Conference on Harmonization guidelines suggest that covariate adjustment be pre-specified. Unplanned adjusted analyses should be considered secondary. Results suggest that if adjustment is not possible or unplanned in a logistic setting, balance in continuous covariates can alleviate some (but never all) of the shortcomings of unadjusted analyses. The case of log binomial regression is also explored. PMID:24138438

  18. Differentially private distributed logistic regression using private and public data.

    PubMed

    Ji, Zhanglong; Jiang, Xiaoqian; Wang, Shuang; Xiong, Li; Ohno-Machado, Lucila

    2014-01-01

    Privacy protection is an important issue in medical informatics and differential privacy is a state-of-the-art framework for data privacy research. Differential privacy offers provable privacy against attackers who have auxiliary information, and can be applied to data mining models (for example, logistic regression). However, differentially private methods sometimes introduce too much noise and make outputs less useful. Given available public data in medical research (e.g. from patients who sign open-consent agreements), we can design algorithms that use both public and private data sets to decrease the amount of noise that is introduced. In this paper, we modify the update step in the Newton-Raphson method to propose a differentially private distributed logistic regression model based on both public and private data. We try our algorithm on three different data sets, and show its advantage over: (1) a logistic regression model based solely on public data, and (2) a differentially private distributed logistic regression model based on private data under various scenarios. Logistic regression models built with our new algorithm based on both private and public datasets demonstrate better utility than models trained on private or public datasets alone, without sacrificing the rigorous privacy guarantee.

  19. Logistic regression analysis of conventional ultrasonography, strain elastosonography, and contrast-enhanced ultrasound characteristics for the differentiation of benign and malignant thyroid nodules

    PubMed Central

    Deng, Yingyuan; Wang, Tianfu; Chen, Siping; Liu, Weixiang

    2017-01-01

    The aim of the study is to screen the significant sonographic features by logistic regression analysis and fit a model to diagnose thyroid nodules. A total of 525 pathological thyroid nodules were retrospectively analyzed. All the nodules underwent conventional ultrasonography (US), strain elastosonography (SE), and contrast-enhanced ultrasound (CEUS). Those nodules’ 12 suspicious sonographic features were used to assess thyroid nodules. The significant features of diagnosing thyroid nodules were picked out by logistic regression analysis. All variables that were statistically related to diagnosis of thyroid nodules, at a level of p < 0.05 were embodied in a logistic regression analysis model. The significant features in the logistic regression model of diagnosing thyroid nodules were calcification, suspected cervical lymph node metastasis, hypoenhancement pattern, margin, shape, vascularity, posterior acoustic, echogenicity, and elastography score. According to the results of logistic regression analysis, the formula that could predict whether or not thyroid nodules are malignant was established. The area under the receiver operating curve (ROC) was 0.930 and the sensitivity, specificity, accuracy, positive predictive value, and negative predictive value were 83.77%, 89.56%, 87.05%, 86.04%, and 87.79% respectively. PMID:29228030

  20. Logistic regression analysis of conventional ultrasonography, strain elastosonography, and contrast-enhanced ultrasound characteristics for the differentiation of benign and malignant thyroid nodules.

    PubMed

    Pang, Tiantian; Huang, Leidan; Deng, Yingyuan; Wang, Tianfu; Chen, Siping; Gong, Xuehao; Liu, Weixiang

    2017-01-01

    The aim of the study is to screen the significant sonographic features by logistic regression analysis and fit a model to diagnose thyroid nodules. A total of 525 pathological thyroid nodules were retrospectively analyzed. All the nodules underwent conventional ultrasonography (US), strain elastosonography (SE), and contrast-enhanced ultrasound (CEUS). Those nodules' 12 suspicious sonographic features were used to assess thyroid nodules. The significant features of diagnosing thyroid nodules were picked out by logistic regression analysis. All variables that were statistically related to diagnosis of thyroid nodules, at a level of p < 0.05 were embodied in a logistic regression analysis model. The significant features in the logistic regression model of diagnosing thyroid nodules were calcification, suspected cervical lymph node metastasis, hypoenhancement pattern, margin, shape, vascularity, posterior acoustic, echogenicity, and elastography score. According to the results of logistic regression analysis, the formula that could predict whether or not thyroid nodules are malignant was established. The area under the receiver operating curve (ROC) was 0.930 and the sensitivity, specificity, accuracy, positive predictive value, and negative predictive value were 83.77%, 89.56%, 87.05%, 86.04%, and 87.79% respectively.

  1. Prevalence and Determinants of Preterm Birth in Tehran, Iran: A Comparison between Logistic Regression and Decision Tree Methods.

    PubMed

    Amini, Payam; Maroufizadeh, Saman; Samani, Reza Omani; Hamidi, Omid; Sepidarkish, Mahdi

    2017-06-01

    Preterm birth (PTB) is a leading cause of neonatal death and the second biggest cause of death in children under five years of age. The objective of this study was to determine the prevalence of PTB and its associated factors using logistic regression and decision tree classification methods. This cross-sectional study was conducted on 4,415 pregnant women in Tehran, Iran, from July 6-21, 2015. Data were collected by a researcher-developed questionnaire through interviews with mothers and review of their medical records. To evaluate the accuracy of the logistic regression and decision tree methods, several indices such as sensitivity, specificity, and the area under the curve were used. The PTB rate was 5.5% in this study. The logistic regression outperformed the decision tree for the classification of PTB based on risk factors. Logistic regression showed that multiple pregnancies, mothers with preeclampsia, and those who conceived with assisted reproductive technology had an increased risk for PTB (p < 0.05). Identifying and training mothers at risk as well as improving prenatal care may reduce the PTB rate. We also recommend that statisticians utilize the logistic regression model for the classification of risk groups for PTB.

  2. Logistic regression models for predicting physical and mental health-related quality of life in rheumatoid arthritis patients.

    PubMed

    Alishiri, Gholam Hossein; Bayat, Noushin; Fathi Ashtiani, Ali; Tavallaii, Seyed Abbas; Assari, Shervin; Moharamzad, Yashar

    2008-01-01

    The aim of this work was to develop two logistic regression models capable of predicting physical and mental health related quality of life (HRQOL) among rheumatoid arthritis (RA) patients. In this cross-sectional study which was conducted during 2006 in the outpatient rheumatology clinic of our university hospital, Short Form 36 (SF-36) was used for HRQOL measurements in 411 RA patients. A cutoff point to define poor versus good HRQOL was calculated using the first quartiles of SF-36 physical and mental component scores (33.4 and 36.8, respectively). Two distinct logistic regression models were used to derive predictive variables including demographic, clinical, and psychological factors. The sensitivity, specificity, and accuracy of each model were calculated. Poor physical HRQOL was positively associated with pain score, disease duration, monthly family income below 300 US$, comorbidity, patient global assessment of disease activity or PGA, and depression (odds ratios: 1.1; 1.004; 15.5; 1.1; 1.02; 2.08, respectively). The variables that entered into the poor mental HRQOL prediction model were monthly family income below 300 US$, comorbidity, PGA, and bodily pain (odds ratios: 6.7; 1.1; 1.01; 1.01, respectively). Optimal sensitivity and specificity were achieved at a cutoff point of 0.39 for the estimated probability of poor physical HRQOL and 0.18 for mental HRQOL. Sensitivity, specificity, and accuracy of the physical and mental models were 73.8, 87, 83.7% and 90.38, 70.36, 75.43%, respectively. The results show that the suggested models can be used to predict poor physical and mental HRQOL separately among RA patients using simple variables with acceptable accuracy. These models can be of use in the clinical decision-making of RA patients and to recognize patients with poor physical or mental HRQOL in advance, for better management.

  3. Geospatial techniques for allocating vulnerability zoning of geohazards along the Karakorum Highway, Gilgit-Baltistan-Pakistan

    NASA Astrophysics Data System (ADS)

    Khan, K. M.; Rashid, S.; Yaseen, M.; Ikram, M.

    2016-12-01

    The Karakoram Highway (KKH), the 'eighth wonder of the world', was constructed by agreement between Pakistan and China and completed in 1979 as a Friendship Highway. It connects Gilgit-Baltistan, a strategically prominent region of Pakistan, with the Xinjiang region of China. Owing to its varied geology and geomorphology, soil formation, steep slopes, and climate change, as well as unsustainable anthropogenic activities, the KKH remains remarkably vulnerable to natural hazards, e.g. land subsidence, landslides, erosion, rock fall, floods, debris flows, cyclical torrential rainfall and snowfall, and lake outbursts. The damaging effects of these geohazards frequently jeopardize life in the region. To ascertain the nature and frequency of the disasters and to delineate vulnerability zones, a rating and management (logistic) analysis was made to investigate the spatiotemporal distribution of the natural hazards. The dynamics of the physiography, geology, geomorphology, soils and climate were carefully examined, while slope, aspect, elevation, profile curvature and rock hardness were calculated by different techniques. To assess the nature and intensity of the hazards, geospatial analyses were conducted and the magnitude of every factor was gauged using logistic regression. Moreover, every relevant variable was integrated in the evaluation process. Logistic regression and geospatial techniques were used to map the geohazard vulnerability zoning (GVZ). The GVZ model findings were checked against documented hazards from recent years, and the accuracy was found to be more than 88.1%. The study validated the model by showing good agreement between the vulnerability mapping and past documented hazards. Evaluated with a receiver operating characteristic curve, the logistic regression model gave satisfactory results. The outcomes will be useful in sustainable land use and infrastructure planning, mainly in high-risk zones, for reducing economic damage and for community betterment.

  4. [Willingness of Patients with Obesity to Use New Media in Rehabilitation Aftercare].

    PubMed

    Dorow, M; Löbner, M; Stein, J; Kind, P; Markert, J; Keller, J; Weidauer, E; Riedel-Heller, S G

    2017-06-01

    Digital media offer new possibilities in rehabilitation aftercare. This study investigates the rehabilitants' willingness to use new media (SMS, internet, social networks) in rehabilitation aftercare and factors that are associated with the willingness to use media-based aftercare. 92 rehabilitants (patients with obesity) filled in a questionnaire on the willingness to use new media in rehabilitation aftercare. In order to identify influencing factors, binary logistic regression models were calculated. Three quarters of the rehabilitants (76.1%) reported that they would be willing to use new media in rehabilitation aftercare. The binary logistic regression model yielded two factors that were associated with the willingness to use media-based aftercare: the possession of a smartphone and the willingness to receive telephone counseling for aftercare. The majority of the rehabilitants were willing to use new media in rehabilitation aftercare. The reasons for refusal of media-based aftercare need to be examined more closely. © Georg Thieme Verlag KG Stuttgart · New York.

  5. Methods for estimating drought streamflow probabilities for Virginia streams

    USGS Publications Warehouse

    Austin, Samuel H.

    2014-01-01

    Maximum likelihood logistic regression model equations used to estimate drought flow probabilities for Virginia streams are presented for 259 hydrologic basins in Virginia. Winter streamflows were used to estimate the likelihood of streamflows during the subsequent drought-prone summer months. The maximum likelihood logistic regression models identify probable streamflows from 5 to 8 months in advance. More than 5 million streamflow daily values collected over the period of record (January 1, 1900 through May 16, 2012) were compiled and analyzed over a minimum 10-year (maximum 112-year) period of record. The analysis yielded the 46,704 equations with statistically significant fit statistics and parameter ranges published in two tables in this report. These model equations produce summer month (July, August, and September) drought flow threshold probabilities as a function of streamflows during the previous winter months (November, December, January, and February). Example calculations are provided, demonstrating how to use the equations to estimate probable streamflows as much as 8 months in advance.
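
    A logistic regression equation of this kind turns a predictor value into a probability through the logistic function. The sketch below shows the generic calculation only: the function name, the intercept and the slope are hypothetical and are not taken from the report's tables, and the report's equations may use a different predictor form (for example, transformed streamflow).

```python
import math


def logistic_probability(intercept, slope, winter_flow):
    """Probability from a one-predictor logistic regression equation."""
    z = intercept + slope * winter_flow   # linear predictor (log-odds)
    return 1.0 / (1.0 + math.exp(-z))     # logistic (inverse logit) transform


# Hypothetical coefficients: higher winter flow lowers the drought probability
p_low_flow = logistic_probability(intercept=2.0, slope=-0.05, winter_flow=40.0)
p_high_flow = logistic_probability(intercept=2.0, slope=-0.05, winter_flow=100.0)

print(f"P(drought flow | winter flow = 40):  {p_low_flow:.3f}")
print(f"P(drought flow | winter flow = 100): {p_high_flow:.3f}")
```

    With these assumed coefficients the linear predictor is exactly zero at a winter flow of 40, so the estimated probability there is 0.5.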

  6. A modified approach to estimating sample size for simple logistic regression with one continuous covariate.

    PubMed

    Novikov, I; Fund, N; Freedman, L S

    2010-01-15

    Different methods for the calculation of sample size for simple logistic regression (LR) with one normally distributed continuous covariate give different results. Sometimes the difference can be large. Furthermore, some methods require the user to specify the prevalence of cases when the covariate equals its population mean, rather than the more natural population prevalence. We focus on two commonly used methods and show through simulations that the power for a given sample size may differ substantially from the nominal value for one method, especially when the covariate effect is large, while the other method performs poorly if the user provides the population prevalence instead of the required parameter. We propose a modification of the method of Hsieh et al. that requires specification of the population prevalence and that employs Schouten's sample size formula for a t-test with unequal variances and group sizes. This approach appears to increase the accuracy of the sample size estimates for LR with one continuous covariate.

  7. A trend analysis of laboratory positive propoxyphene workplace urine drug screens before and after the product recall.

    PubMed

    Price, James

    2015-01-01

    Propoxyphene was withdrawn from the US market in November 2010. This drug is still tested for in the workplace as part of expanded panel nonregulated testing. A convenience sample of urine specimens (n = 7838) was provided by workers from various industries. The percentage of positive specimens with 95% confidence intervals was calculated for each year of the study. Logistic regression was used to assess the impact of the year upon the propoxyphene result. The prevalence of positive propoxyphene tests was much higher before the product's withdrawal from the market. Logistic regression provided evidence of a decreasing linear trend (P < 0.000; β = -0.71). The odds ratio of 0.49 signifies that, for every additional year, the odds of a urine specimen being positive for propoxyphene were multiplied by 0.49. This supports the conclusion that the change in positive propoxyphene drug tests over the years is not due to chance. The conclusion supports no longer performing nonregulated workplace propoxyphene urine drug testing for this population.
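
    The reported odds ratio follows directly from the reported coefficient, since in logistic regression the odds ratio per unit of a predictor is exp(β). The short check below uses only the β = -0.71 given in the abstract; the variable names and the three-year horizon are illustrative choices.

```python
import math

beta = -0.71                      # coefficient for year, as reported in the abstract

odds_ratio = math.exp(beta)       # multiplicative change in odds per additional year
percent_change = (odds_ratio - 1) * 100

# Under the fitted linear trend, the effect compounds over several years
odds_ratio_three_years = math.exp(beta * 3)

print(f"Odds ratio per year: {odds_ratio:.2f}")
print(f"Change in odds per year: {percent_change:.1f}%")
print(f"Odds ratio over three years: {odds_ratio_three_years:.3f}")
```

    The value refers to odds, not probabilities: each additional year multiplies the odds of a positive specimen by about 0.49, which is a reduction of roughly half.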

  8. Risk factors for displaced abomasum or ketosis in Swedish dairy herds.

    PubMed

    Stengärde, L; Hultgren, J; Tråvén, M; Holtenius, K; Emanuelson, U

    2012-03-01

    Risk factors associated with high or low long-term incidence of displaced abomasum (DA) or clinical ketosis were studied in 60 Swedish dairy herds, using multivariable logistic regression modelling. Forty high-incidence herds were included as cases and 20 low-incidence herds as controls. Incidence rates were calculated based on veterinary records of clinical diagnoses. During the 3-year period preceding the herd classification, herds with a high incidence had a disease incidence of DA or clinical ketosis above the 3rd quartile in a national database for disease recordings. Control herds had no cows with DA or clinical ketosis. All herds were visited during the housing period and herdsmen were interviewed about management routines, housing, feeding, milk yield, and herd health. Target groups were heifers in late gestation, dry cows, and cows in early lactation. Univariable logistic regression was used to screen for factors associated with being a high-incidence herd. A multivariable logistic regression model was built using stepwise regression. A higher maximum daily milk yield in multiparous cows and a large herd size (p=0.054 and p=0.066, respectively) tended to be associated with being a high-incidence herd. Not cleaning the heifer feeding platform daily increased the odds of having a high-incidence herd twelvefold (p<0.01). Keeping cows in only one group in the dry period increased the odds of having a high incidence herd eightfold (p=0.03). Herd size was confounded with housing system. Housing system was therefore added to the final logistic regression model. In conclusion, a large herd size, a high maximum daily milk yield, keeping dry cows in one group, and not cleaning the feeding platform daily appear to be important risk factors for a high incidence of DA or clinical ketosis in Swedish dairy herds. 
    These results confirm the importance of housing, management and feeding in the prevention of metabolic disorders in dairy cows around parturition and in early lactation. Copyright © 2011 Elsevier B.V. All rights reserved.

  9. Predicting the aquatic toxicity mode of action using logistic regression and linear discriminant analysis.

    PubMed

    Ren, Y Y; Zhou, L C; Yang, L; Liu, P Y; Zhao, B W; Liu, H X

    2016-09-01

    The paper highlights the use of the logistic regression (LR) method in the construction of acceptable, statistically significant, robust and predictive models for the classification of chemicals according to their aquatic toxic modes of action. Essentials accounting for a reliable model were all considered carefully. The model predictors were selected by stepwise forward discriminant analysis (LDA) from a combined pool of experimental data and chemical structure-based descriptors calculated by the CODESSA and DRAGON software packages. Model predictive ability was validated both internally and externally. The applicability domain was checked by the leverage approach to verify prediction reliability. The obtained models are simple and easy to interpret. In general, LR performs much better than LDA and seems to be more attractive for the prediction of the more toxic compounds, i.e. compounds that exhibit excess toxicity versus non-polar narcotic compounds and more reactive compounds versus less reactive compounds. In addition, model fit and regression diagnostics were assessed through the influence plot, which reflects the hat-values, studentized residuals, and Cook's distance statistics of each sample. Overdispersion was also checked for the LR model. The relationships between the descriptors and the aquatic toxic behaviour of compounds are also discussed.

  10. Learning investment indicators through data extension

    NASA Astrophysics Data System (ADS)

    Dvořák, Marek

    2017-07-01

    Stock prices in the form of time series were analysed using univariate and multivariate statistical methods. After simple data preprocessing in the form of logarithmic differences, we augmented this univariate time series to a multivariate representation. This method makes use of sliding windows to calculate several dozen new variables using simple statistical tools such as first and second moments as well as more complicated statistics, such as auto-regression coefficients and residual analysis, followed by an optional quadratic transformation that was further used for data extension. These were used as explanatory variables in a regularized logistic LASSO regression which tried to estimate the Buy-Sell Index (BSI) from real stock market data.
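
    The data-extension idea described here (logarithmic differences, sliding-window statistics, then a quadratic transformation) can be sketched in a few lines. This is a simplified illustration under stated assumptions, not the author's implementation: the function names, the window width and the price values are hypothetical, only the mean and variance are computed per window, and the LASSO-regularized logistic fit itself is left out.

```python
import math
import statistics
from itertools import combinations_with_replacement


def log_differences(prices):
    """Logarithmic differences (log returns) of a price series."""
    return [math.log(b / a) for a, b in zip(prices, prices[1:])]


def window_features(series, width):
    """Mean and variance (first and second moments) over each sliding window."""
    rows = []
    for start in range(len(series) - width + 1):
        window = series[start:start + width]
        rows.append([statistics.mean(window), statistics.pvariance(window)])
    return rows


def quadratic_extension(row):
    """Append all squares and pairwise products to a feature row."""
    products = [a * b for a, b in combinations_with_replacement(row, 2)]
    return row + products


# Hypothetical prices for illustration (not real market data)
prices = [100.0, 101.0, 99.0, 102.0, 103.0, 101.5]
returns = log_differences(prices)
features = [quadratic_extension(r) for r in window_features(returns, width=3)]

print(len(returns), "returns ->", len(features), "rows of", len(features[0]), "features")
```

    Each extended row could then serve as one observation of explanatory variables for a regularized logistic regression; the quadratic step grows the feature count quickly, which is one reason an L1 (LASSO) penalty is a natural fit.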

  11. Risk Factors for Suicidal Ideation in People at Risk for Huntington's Disease.

    PubMed

    Anderson, Karen E; Eberly, Shirley; Groves, Mark; Kayson, Elise; Marder, Karen; Young, Anne B; Shoulson, Ira

    2016-12-15

    Suicidal ideation (SI) and attempts are increased in Huntington's disease (HD), making risk factor assessment a priority. To determine whether hopelessness, irritability, aggression, anxiety, CAG expansion status, depression, and motor signs/symptoms were associated with SI in those at risk for HD. Behavioral and neurological data were collected from subjects in an observational study. Subject characteristics were calculated by CAG status and SI. Logistic regression models were adjusted for demographics. Separate logistic regressions were used to compare SI and non-SI subjects. A combined logistic regression model, including 4 pre-specified predictors (hopelessness, irritability, aggression, anxiety), was used to assess the relationship of SI to these predictors. Of 801 subjects assessed, 40 were classified as having SI; 6.3% of CAG mutation expansion carriers had SI, compared with 4.3% of non-CAG mutation expansion carriers (p = 0.2275). SI subjects had significantly increased depression (p < 0.0001), hopelessness (p < 0.0001), irritability (p < 0.0001), aggression (p = 0.0089), and anxiety (p < 0.0001), and an elevated motor score (p = 0.0098). Impulsivity, assessed in a subgroup of subjects, was also associated with SI (p = 0.0267). Hopelessness and anxiety remained significant in the combined model (p < 0.001; p < 0.0198, respectively) even when motor score was included. Behavioral symptoms were significantly higher in those reporting SI. Hopelessness and anxiety showed a particularly strong association with SI. Risk identification could assist in assessment of suicidality in this group.

  12. Protective Effect of HLA-DQB1 Alleles Against Alloimmunization in Patients with Sickle Cell Disease

    PubMed Central

    Tatari-Calderone, Zohreh; Gordish-Dressman, Heather; Fasano, Ross; Riggs, Michael; Fortier, Catherine; Andrew; Campbell, D.; Charron, Dominique; Gordeuk, Victor R.; Luban, Naomi L.C.; Vukmanovic, Stanislav; Tamouza, Ryad

    2015-01-01

    Background Alloimmunization, or the development of alloantibodies to Red Blood Cell (RBC) antigens, is considered one of the major complications after RBC transfusions in patients with sickle cell disease (SCD) and can lead to both acute and delayed hemolytic reactions. It has been suggested that polymorphisms in HLA genes may play a role in alloimmunization. We conducted a retrospective study analyzing the influence of HLA-DRB1 and DQB1 genetic diversity on RBC-alloimmunization. Study design Two-hundred four multi-transfused SCD patients with and without RBC-alloimmunization were typed at low/medium resolution by PCR-SSO, using IMGT-HLA Database. HLA-DRB1 and DQB1 allele frequencies were analyzed using logistic regression models, and global p-value was calculated using multiple logistic regression. Results While only trends towards associations between HLA-DR diversity and alloimmunization were observed, analysis of HLA-DQ showed that HLA-DQ2 (p=0.02), -DQ3 (p=0.02) and -DQ5 (p=0.01) alleles were significantly higher in non-alloimmunized patients, likely behaving as protective alleles. In addition, multiple logistic regression analysis showed that both HLA-DQ2/6 (p=0.01) and HLA-DQ5/5 (p=0.03) combinations constitute additional predictors of protective status. Conclusion Our data suggest that particular HLA-DQ alleles influence the clinical course of RBC transfusion in patients with SCD, which could pave the way towards predictive strategies. PMID:26476208

  13. Stata Modules for Calculating Novel Predictive Performance Indices for Logistic Models.

    PubMed

    Barkhordari, Mahnaz; Padyab, Mojgan; Hadaegh, Farzad; Azizi, Fereidoun; Bozorgmanesh, Mohammadreza

    2016-01-01

    Prediction is a fundamental part of prevention of cardiovascular diseases (CVD). The development of prediction algorithms based on multivariate regression models began several decades ago. In parallel with predictive model development, biomarker research emerged on an impressively large scale. The key question is how best to assess and quantify the improvement in risk prediction offered by new biomarkers or, more basically, how to assess the performance of a risk prediction model. Discrimination, calibration, and added predictive value have recently been suggested for comparing the predictive performance of models with and without novel biomarkers. Lack of user-friendly statistical software has restricted implementation of novel model assessment methods while examining novel biomarkers. We intended, thus, to develop user-friendly software that could be used by researchers with few programming skills. We have written a Stata command that is intended to help researchers obtain cut point-free and cut point-based net reclassification improvement index (NRI) and relative and absolute integrated discriminatory improvement index (IDI) for logistic-based regression analyses. We applied the commands to real data on women participating in the Tehran lipid and glucose study (TLGS) to examine if information on a family history of premature CVD, waist circumference, and fasting plasma glucose can improve predictive performance of the Framingham's "general CVD risk" algorithm. The command is addpred for logistic regression models. The Stata package provided herein can encourage the use of novel methods in examining predictive capacity of the ever-emerging plethora of novel biomarkers.
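
    For readers unfamiliar with the two indices, the cut point-free NRI and the absolute IDI can be sketched in a few lines. The Python below is not the addpred Stata command and does not reproduce its options or output; the function names and the four example predictions are hypothetical, and the formulas are the commonly used textbook definitions.

```python
import numpy as np

def continuous_nri(y, p_old, p_new):
    """Cut point-free NRI: net upward moves in events plus net downward moves in non-events."""
    y = np.asarray(y, dtype=bool)
    diff = np.asarray(p_new, dtype=float) - np.asarray(p_old, dtype=float)
    up, down = diff > 0, diff < 0
    nri_events = up[y].mean() - down[y].mean()
    nri_nonevents = down[~y].mean() - up[~y].mean()
    return nri_events + nri_nonevents

def absolute_idi(y, p_old, p_new):
    """Absolute IDI: change in mean predicted risk for events minus that for non-events."""
    y = np.asarray(y, dtype=bool)
    p_old = np.asarray(p_old, dtype=float)
    p_new = np.asarray(p_new, dtype=float)
    gain_events = p_new[y].mean() - p_old[y].mean()
    gain_nonevents = p_new[~y].mean() - p_old[~y].mean()
    return gain_events - gain_nonevents

# Made-up predictions from a baseline model and an extended model.
y = [1, 1, 0, 0]
p_old = [0.6, 0.5, 0.4, 0.5]
p_new = [0.8, 0.4, 0.2, 0.3]
nri = continuous_nri(y, p_old, p_new)
idi = absolute_idi(y, p_old, p_new)
```

    Positive values of either index suggest that the extended model moves predicted risks in the right direction on balance.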

  14. The Joint Effects of Lifestyle Factors and Comorbidities on the Risk of Colorectal Cancer: A Large Chinese Retrospective Case-Control Study

    PubMed Central

    Hu, Hai; Zhou, Yangyang; Ren, Shujuan; Wu, Jiajin; Zhu, Meiying; Chen, Donghui; Yang, Haiyan; Wang, Liwei

    2015-01-01

    Background Colorectal cancer (CRC) is a major cause of cancer morbidity and mortality. In previous epidemiologic studies, the respective correlations of lifestyle factors and comorbidity with CRC have been extensively studied. However, little is known about their joint effects on CRC. Methods We conducted a retrospective case-control study of 1,144 diagnosed CRC patients and 60,549 community controls. A structured questionnaire was administered to the participants about their socio-demographic factors, anthropometric measures, comorbidity history and lifestyle factors. A logistic regression model was used to calculate the odds ratios (ORs) and 95% confidence intervals (95%CIs) for each factor. According to the results from the logistic regression model, we further developed a healthy lifestyle index (HLI) and a comorbidity history index (CHI) to investigate their independent and joint effects on CRC risk. Results Four lifestyle factors (including physical activities, sleep, red meat and vegetable consumption) and four types of comorbidity (including diabetes, hyperlipidemia, history of inflammatory bowel disease and polyps) were found to be independently associated with the risk of CRC in the multivariate logistic regression model. Intriguingly, their combined patterns, HLI and CHI, demonstrated significant correlation with CRC risk independently (OR for HLI: 3.91, 95%CI: 3.13–4.88; OR for CHI: 2.49, 95%CI: 2.11–2.93) and jointly (OR: 10.33, 95%CI: 6.59–16.18). Conclusions There are synergistic effects of lifestyle factors and comorbidity on the risk of colorectal cancer in the Chinese population. PMID:26710070

  15. Logistic regression for dichotomized counts.

    PubMed

    Preisser, John S; Das, Kalyan; Benecha, Habtamu; Stamm, John W

    2016-12-01

    Sometimes there is interest in a dichotomized outcome indicating whether a count variable is positive or zero. Under this scenario, the application of ordinary logistic regression may result in efficiency loss, which is quantifiable under an assumed model for the counts. In such situations, a shared-parameter hurdle model is investigated for more efficient estimation of regression parameters relating to overall effects of covariates on the dichotomous outcome, while handling count data with many zeroes. One model part provides a logistic regression containing marginal log odds ratio effects of primary interest, while an ancillary model part describes the mean count of a Poisson or negative binomial process in terms of nuisance regression parameters. Asymptotic efficiency of the logistic model parameter estimators of the two-part models is evaluated with respect to ordinary logistic regression. Simulations are used to assess the properties of the models with respect to power and Type I error, the latter investigated under both misspecified and correctly specified models. The methods are applied to data from a randomized clinical trial of three toothpaste formulations to prevent incident dental caries in a large population of Scottish schoolchildren. © The Author(s) 2014.

  16. Predicting 30-day Hospital Readmission with Publicly Available Administrative Database. A Conditional Logistic Regression Modeling Approach.

    PubMed

    Zhu, K; Lou, Z; Zhou, J; Ballester, N; Kong, N; Parikh, P

    2015-01-01

    This article is part of the Focus Theme of Methods of Information in Medicine on "Big Data and Analytics in Healthcare". Hospital readmissions raise healthcare costs and cause significant distress to providers and patients. It is, therefore, of great interest to healthcare organizations to predict which patients are at risk of being readmitted to their hospitals. However, current logistic regression based risk prediction models have limited prediction power when applied to hospital administrative data. Meanwhile, although decision trees and random forests have been applied, they tend to be too complex for hospital practitioners to understand. Explore the use of conditional logistic regression to increase the prediction accuracy. We analyzed an HCUP statewide inpatient discharge record dataset, which includes patient demographics, clinical and care utilization data from California. We extracted records of heart failure Medicare beneficiaries who had inpatient experience during an 11-month period. We corrected the data imbalance issue with under-sampling. In our study, we first applied standard logistic regression and decision tree to obtain influential variables and derive practically meaningful decision rules. We then stratified the original data set accordingly and applied logistic regression on each data stratum. We further explored the effect of interacting variables in the logistic regression modeling. We conducted cross validation to assess the overall prediction performance of conditional logistic regression (CLR) and compared it with standard classification models. The developed CLR models outperformed several standard classification models (e.g., straightforward logistic regression, stepwise logistic regression, random forest, support vector machine). For example, the best CLR model improved the classification accuracy by nearly 20% over the straightforward logistic regression model.
Furthermore, the developed CLR models tend to achieve better sensitivity of more than 10% over the standard classification models, which can be translated to correct labeling of an additional 400-500 readmissions for heart failure patients in the state of California over a year. Lastly, several key predictors identified from the HCUP data include the disposition location from discharge, the number of chronic conditions, and the number of acute procedures. It would be beneficial to apply simple decision rules obtained from the decision tree in an ad-hoc manner to guide the cohort stratification. It could be potentially beneficial to explore the effect of pairwise interactions between influential predictors when building the logistic regression models for different data strata. Judicious use of the ad-hoc CLR models developed offers insights into future development of prediction models for hospital readmissions, which can lead to better intuition in identifying high-risk patients and developing effective post-discharge care strategies. Lastly, this paper is expected to raise the awareness of collecting data on additional markers and developing necessary database infrastructure for larger-scale exploratory studies on readmission risk prediction.

  17. Interpretation of commonly used statistical regression models.

    PubMed

    Kasza, Jessica; Wolfe, Rory

    2014-01-01

    A review of some regression models commonly used in respiratory health applications is provided in this article. Simple linear regression, multiple linear regression, logistic regression and ordinal logistic regression are considered. The focus of this article is on the interpretation of the regression coefficients of each model, which are illustrated through the application of these models to a respiratory health research study. © 2013 The Authors. Respirology © 2013 Asian Pacific Society of Respirology.

  18. Evaluation of logistic regression models and effect of covariates for case-control study in RNA-Seq analysis.

    PubMed

    Choi, Seung Hoan; Labadorf, Adam T; Myers, Richard H; Lunetta, Kathryn L; Dupuis, Josée; DeStefano, Anita L

    2017-02-06

    Next generation sequencing provides a count of RNA molecules in the form of short reads, yielding discrete, often highly non-normally distributed gene expression measurements. Although Negative Binomial (NB) regression has been generally accepted in the analysis of RNA sequencing (RNA-Seq) data, its appropriateness has not been exhaustively evaluated. We explore logistic regression as an alternative method for RNA-Seq studies designed to compare cases and controls, where disease status is modeled as a function of RNA-Seq reads using simulated and Huntington disease data. We evaluate the effect of adjusting for covariates that have an unknown relationship with gene expression. Finally, we incorporate the data adaptive method in order to compare false positive rates. When the sample size is small or the expression levels of a gene are highly dispersed, the NB regression shows inflated Type-I error rates but the Classical logistic and Bayes logistic (BL) regressions are conservative. Firth's logistic (FL) regression performs well or is slightly conservative. Large sample size and low dispersion generally make Type-I error rates of all methods close to nominal alpha levels of 0.05 and 0.01. However, Type-I error rates are controlled after applying the data adaptive method. The NB, BL, and FL regressions gain increased power with large sample size, large log2 fold-change, and low dispersion. The FL regression has comparable power to NB regression. We conclude that implementing the data adaptive method appropriately controls Type-I error rates in RNA-Seq analysis. Firth's logistic regression provides a concise statistical inference process and reduces spurious associations from inaccurately estimated dispersion parameters in the negative binomial framework.

  19. Applied Prevalence Ratio estimation with different Regression models: An example from a cross-national study on substance use research.

    PubMed

    Espelt, Albert; Marí-Dell'Olmo, Marc; Penelo, Eva; Bosque-Prous, Marina

    2016-06-14

    To examine the differences between Prevalence Ratio (PR) and Odds Ratio (OR) in a cross-sectional study and to provide tools to calculate PR using two statistical packages widely used in substance use research (STATA and R). We used cross-sectional data from 41,263 participants of 16 European countries participating in the Survey on Health, Ageing and Retirement in Europe (SHARE). The dependent variable, hazardous drinking, was calculated using the Alcohol Use Disorders Identification Test - Consumption (AUDIT-C). The main independent variable was gender. Other variables used were: age, educational level and country of residence. PR of hazardous drinking in men relative to women was estimated using the Mantel-Haenszel method, log-binomial regression models and Poisson regression models with robust variance. These estimations were compared to the OR calculated using logistic regression models. Prevalence of hazardous drinkers varied among countries. Generally, men have higher prevalence of hazardous drinking than women [PR=1.43 (1.38-1.47)]. Estimated PR was identical independently of the method and the statistical package used. However, OR overestimated PR, depending on the prevalence of hazardous drinking in the country. In cross-sectional studies, where comparisons between countries with differences in the prevalence of the disease or condition are made, it is advisable to use PR instead of OR.
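
    The dependence of the OR-versus-PR gap on prevalence can be seen with simple arithmetic on a 2x2 table. The sketch below uses made-up counts, not SHARE data, and the function names are hypothetical; it computes crude ratios only, without the regression adjustment used in the study.

```python
def prevalence_ratio(a, b, c, d):
    """a, b: exposed with/without the outcome; c, d: unexposed with/without the outcome."""
    return (a / (a + b)) / (c / (c + d))

def odds_ratio(a, b, c, d):
    """Cross-product odds ratio for the same 2x2 table."""
    return (a * d) / (b * c)

# Rare outcome: prevalence 2% versus 1%.
rare = (2, 98, 1, 99)
# Common outcome: prevalence 60% versus 30%.
common = (60, 40, 30, 70)

pr_rare, or_rare = prevalence_ratio(*rare), odds_ratio(*rare)
pr_common, or_common = prevalence_ratio(*common), odds_ratio(*common)
```

    Both tables have a prevalence ratio of 2, but the odds ratio stays close to 2 only when the outcome is rare and rises to 3.5 when it is common.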

  20. Differentially private distributed logistic regression using private and public data

    PubMed Central

    2014-01-01

    Background Privacy protection is an important issue in medical informatics and differential privacy is a state-of-the-art framework for data privacy research. Differential privacy offers provable privacy against attackers who have auxiliary information, and can be applied to data mining models (for example, logistic regression). However, differentially private methods sometimes introduce too much noise and make outputs less useful. Given available public data in medical research (e.g. from patients who sign open-consent agreements), we can design algorithms that use both public and private data sets to decrease the amount of noise that is introduced. Methodology In this paper, we modify the update step in the Newton-Raphson method to propose a differentially private distributed logistic regression model based on both public and private data. Experiments and results We try our algorithm on three different data sets, and show its advantage over: (1) a logistic regression model based solely on public data, and (2) a differentially private distributed logistic regression model based on private data under various scenarios. Conclusion Logistic regression models built with our new algorithm based on both private and public datasets demonstrate better utility than models trained on private or public datasets alone, without sacrificing the rigorous privacy guarantee. PMID:25079786
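
    As background for the update step mentioned in the methodology, here is a minimal sketch of the standard, non-private Newton-Raphson iteration for logistic regression. It does not implement the paper's differentially private or distributed modification, which the abstract does not specify; the function name, iteration count, and toy data are assumptions for illustration.

```python
import numpy as np

def newton_logistic(X, y, n_iter=25):
    """Plain Newton-Raphson for logistic regression (no privacy mechanism)."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    beta = np.zeros(X.shape[1])
    for _ in range(n_iter):
        p = 1.0 / (1.0 + np.exp(-X @ beta))             # fitted probabilities
        gradient = X.T @ (y - p)                         # score vector
        hessian = X.T @ (X * (p * (1.0 - p))[:, None])   # observed information
        beta = beta + np.linalg.solve(hessian, gradient) # Newton update
    return beta

# Toy, non-separable data: intercept column plus one covariate.
x = np.arange(6.0)
X = np.column_stack([np.ones_like(x), x])
y = np.array([0, 0, 1, 0, 1, 1])
beta_hat = newton_logistic(X, y)
p_hat = 1.0 / (1.0 + np.exp(-X @ beta_hat))
```

    A private variant would perturb quantities in this update; how the paper does so is not stated in the abstract and is left out here.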

  1. A retrospective analysis to identify the factors affecting infection in patients undergoing chemotherapy.

    PubMed

    Park, Ji Hyun; Kim, Hyeon-Young; Lee, Hanna; Yun, Eun Kyoung

    2015-12-01

    This study compares the performance of the logistic regression and decision tree analysis methods for assessing the risk factors for infection in cancer patients undergoing chemotherapy. The subjects were 732 cancer patients who were receiving chemotherapy at K university hospital in Seoul, Korea. The data were collected between March 2011 and February 2013 and were processed for descriptive analysis, logistic regression and decision tree analysis using the IBM SPSS Statistics 19 and Modeler 15.1 programs. The most common risk factors for infection in cancer patients receiving chemotherapy were identified as alkylating agents, vinca alkaloid and underlying diabetes mellitus. The logistic regression achieved a sensitivity of 66.7% and a specificity of 88.9%. The decision tree analysis achieved a sensitivity of 55.0% and a specificity of 89.0%. The overall classification accuracy was 88.0% for the logistic regression and 87.2% for the decision tree analysis. The logistic regression analysis showed a higher degree of sensitivity and classification accuracy. Therefore, logistic regression analysis is concluded to be the more effective and useful method for establishing an infection prediction model for patients undergoing chemotherapy. Copyright © 2015 Elsevier Ltd. All rights reserved.
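
    The three measures compared in this abstract follow directly from a confusion matrix. The sketch below uses hypothetical counts, not the study's data, and a hypothetical function name.

```python
def classification_metrics(tp, fn, tn, fp):
    """Sensitivity, specificity and overall accuracy from confusion-matrix counts."""
    sensitivity = tp / (tp + fn)   # share of true cases that are detected
    specificity = tn / (tn + fp)   # share of non-cases correctly ruled out
    accuracy = (tp + tn) / (tp + fn + tn + fp)
    return sensitivity, specificity, accuracy

# Made-up counts: 50 infected patients, 100 non-infected patients.
sens, spec, acc = classification_metrics(tp=30, fn=20, tn=90, fp=10)
```

    When non-cases outnumber cases, accuracy is dominated by specificity, so two models can have similar accuracy while differing noticeably in sensitivity.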

  2. Performance and strategy comparisons of human listeners and logistic regression in discriminating underwater targets.

    PubMed

    Yang, Lixue; Chen, Kean

    2015-11-01

    To improve the design of underwater target recognition systems based on auditory perception, this study compared human listeners with automatic classifiers. Performance measures and strategies in three discrimination experiments, including discriminations between man-made and natural targets, between ships and submarines, and among three types of ships, were used. In the experiments, the subjects were asked to assign a score to each sound based on how confident they were about the category to which it belonged, and logistic regression, which represents linear discriminative models, also completed three similar tasks by utilizing many auditory features. The results indicated that the performance of logistic regression improved as the ratio between inter- and intra-class differences became larger, whereas the performance of the human subjects was limited by their unfamiliarity with the targets. Logistic regression performed better than the human subjects in all tasks but the discrimination between man-made and natural targets, and the strategies employed by excellent human subjects were similar to those of logistic regression. Logistic regression and several human subjects demonstrated similar performance when discriminating man-made and natural targets, but in this case, their strategies were not similar. An appropriate fusion of their strategies led to further improvement in recognition accuracy.

  3. Simulating land-use changes by incorporating spatial autocorrelation and self-organization in CLUE-S modeling: a case study in Zengcheng District, Guangzhou, China

    NASA Astrophysics Data System (ADS)

    Mei, Zhixiong; Wu, Hao; Li, Shiyun

    2018-06-01

    The Conversion of Land Use and its Effects at Small regional extent (CLUE-S), which is a widely used model for land-use simulation, utilizes logistic regression to estimate the relationships between land use and its drivers, and thus, predict land-use change probabilities. However, logistic regression disregards possible spatial autocorrelation and self-organization in land-use data. Autologistic regression can depict spatial autocorrelation but cannot address self-organization, while logistic regression that considers only self-organization (NE-logistic regression) fails to capture spatial autocorrelation. Therefore, this study developed a regression (NE-autologistic regression) method, which incorporated both spatial autocorrelation and self-organization, to improve CLUE-S. The Zengcheng District of Guangzhou, China was selected as the study area. The land-use data of 2001, 2005, and 2009, as well as 10 typical driving factors, were used to validate the proposed regression method and the improved CLUE-S model. Then, three future land-use scenarios in 2020 (the natural growth scenario, ecological protection scenario, and economic development scenario) were simulated using the improved model. Validation results showed that NE-autologistic regression performed better than logistic regression, autologistic regression, and NE-logistic regression in predicting land-use change probabilities. The spatial allocation accuracy and kappa values of NE-autologistic-CLUE-S were higher than those of logistic-CLUE-S, autologistic-CLUE-S, and NE-logistic-CLUE-S for the simulations of two periods, 2001-2009 and 2005-2009, which proved that the improved CLUE-S model achieved the best simulation and was thereby effective to a certain extent. The scenario simulation results indicated that under all three scenarios, traffic land and residential/industrial land would increase, whereas arable land and unused land would decrease during 2009-2020.
Apparent differences also existed in the simulated change sizes and locations of each land-use type under different scenarios. The results not only demonstrate the validity of the improved model but also provide a valuable reference for relevant policy-makers.

  4. Development and external multicenter validation of Chinese Prostate Cancer Consortium prostate cancer risk calculator for initial prostate biopsy.

    PubMed

    Chen, Rui; Xie, Liping; Xue, Wei; Ye, Zhangqun; Ma, Lulin; Gao, Xu; Ren, Shancheng; Wang, Fubo; Zhao, Lin; Xu, Chuanliang; Sun, Yinghao

    2016-09-01

    Substantial differences exist in the relationship of prostate cancer (PCa) detection rate and prostate-specific antigen (PSA) level between Western and Asian populations. Classic Western risk calculators, the European Randomized Study for Screening of Prostate Cancer Risk Calculator and the Prostate Cancer Prevention Trial Risk Calculator, were shown to be not applicable in Asian populations. We aimed to develop and validate a risk calculator for predicting the probability of PCa and high-grade PCa (defined as Gleason Score sum 7 or higher) at initial prostate biopsy in Chinese men. Urology outpatients who underwent initial prostate biopsy according to the inclusion criteria were included. The multivariate logistic regression-based Chinese Prostate Cancer Consortium Risk Calculator (CPCC-RC) was constructed with cases from 2 hospitals in Shanghai. Discriminative ability, calibration and decision curve analysis were externally validated in 3 CPCC member hospitals. Of the 1,835 patients involved, PCa was identified in 338/924 (36.6%) and 294/911 (32.3%) men in the development and validation cohort, respectively. Multivariate logistic regression analyses showed that 5 predictors (age, logPSA, logPV, free PSA ratio, and digital rectal examination) were associated with PCa (Model 1) or high-grade PCa (Model 2), respectively. The area under the curve of Model 1 and Model 2 was 0.801 (95% CI: 0.771-0.831) and 0.826 (95% CI: 0.796-0.857), respectively. Both models illustrated good calibration and substantial improvement in decision curve analyses over any single predictor at all threshold probabilities. Higher predicting accuracy, better calibration, and greater clinical benefit were achieved by CPCC-RC, compared with the European Randomized Study for Screening of Prostate Cancer Risk Calculator and the Prostate Cancer Prevention Trial Risk Calculator in predicting PCa.
CPCC-RC performed well in discrimination and calibration and decision curve analysis in external validation compared with Western risk calculators. CPCC-RC may aid in decision-making of prostate biopsy in Chinese or in other Asian populations with similar genetic and environmental backgrounds. Copyright © 2016 Elsevier Inc. All rights reserved.

  5. Unitary Response Regression Models

    ERIC Educational Resources Information Center

    Lipovetsky, S.

    2007-01-01

    The dependent variable in a regular linear regression is a numerical variable, and in a logistic regression it is a binary or categorical variable. In these models the dependent variable has varying values. However, there are problems yielding an identity output of a constant value which can also be modelled in a linear or logistic regression with…

  6. Estimating Contraceptive Prevalence Using Logistics Data for Short-Acting Methods: Analysis Across 30 Countries.

    PubMed

    Cunningham, Marc; Bock, Ariella; Brown, Niquelle; Sacher, Suzy; Hatch, Benjamin; Inglis, Andrew; Aronovich, Dana

    2015-09-01

    Contraceptive prevalence rate (CPR) is a vital indicator used by country governments, international donors, and other stakeholders for measuring progress in family planning programs against country targets and global initiatives as well as for estimating health outcomes. Because of the need for more frequent CPR estimates than population-based surveys currently provide, alternative approaches for estimating CPRs are being explored, including using contraceptive logistics data. Using data from the Demographic and Health Surveys (DHS) in 30 countries, population data from the United States Census Bureau International Database, and logistics data from the Procurement Planning and Monitoring Report (PPMR) and the Pipeline Monitoring and Procurement Planning System (PipeLine), we developed and evaluated 3 models to generate country-level, public-sector contraceptive prevalence estimates for injectable contraceptives, oral contraceptives, and male condoms. Models included: direct estimation through existing couple-years of protection (CYP) conversion factors, bivariate linear regression, and multivariate linear regression. Model evaluation consisted of comparing the referent DHS prevalence rates for each short-acting method with the model-generated prevalence rate using multiple metrics, including mean absolute error and proportion of countries where the modeled prevalence rate for each method was within 1, 2, or 5 percentage points of the DHS referent value. For the methods studied, family planning use estimates from public-sector logistics data were correlated with those from the DHS, validating the quality and accuracy of current public-sector logistics data. Logistics data for oral and injectable contraceptives were significantly associated (P<.05) with the referent DHS values for both bivariate and multivariate models. For condoms, however, that association was only significant for the bivariate model. 
With the exception of the CYP-based model for condoms, models were able to estimate public-sector prevalence rates for each short-acting method to within 2 percentage points in at least 85% of countries. Public-sector contraceptive logistics data are strongly correlated with public-sector prevalence rates for short-acting methods, demonstrating the quality of current logistics data and their ability to provide relatively accurate prevalence estimates. The models provide a starting point for generating interim estimates of contraceptive use when timely survey data are unavailable. All models except the condoms CYP model performed well; the regression models were most accurate but the CYP model offers the simplest calculation method. Future work extending the research to other modern methods, relating subnational logistics data with prevalence rates, and tracking that relationship over time is needed. © Cunningham et al.
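
    The evaluation metrics named in this abstract (mean absolute error and the proportion of countries within a given number of percentage points of the DHS referent value) are simple to compute. The sketch below uses made-up prevalence values and hypothetical function names; it does not reproduce the study's models or data.

```python
import numpy as np

def mean_absolute_error(reference, modeled):
    """Average absolute gap, in percentage points, between modeled and reference values."""
    reference = np.asarray(reference, dtype=float)
    modeled = np.asarray(modeled, dtype=float)
    return np.abs(modeled - reference).mean()

def share_within(reference, modeled, points):
    """Proportion of countries whose modeled value lies within `points` of the reference."""
    reference = np.asarray(reference, dtype=float)
    modeled = np.asarray(modeled, dtype=float)
    return (np.abs(modeled - reference) <= points).mean()

# Made-up prevalence rates (percent) for four hypothetical countries.
survey = [10.0, 20.0, 30.0, 40.0]
model = [11.0, 18.0, 30.5, 46.0]
mae = mean_absolute_error(survey, model)
within_2 = share_within(survey, model, points=2)
```

    Reporting both metrics is useful because a single large miss can inflate the mean absolute error even when most countries are estimated closely.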

  7. Estimating Contraceptive Prevalence Using Logistics Data for Short-Acting Methods: Analysis Across 30 Countries

    PubMed Central

    Cunningham, Marc; Brown, Niquelle; Sacher, Suzy; Hatch, Benjamin; Inglis, Andrew; Aronovich, Dana

    2015-01-01

    Background: Contraceptive prevalence rate (CPR) is a vital indicator used by country governments, international donors, and other stakeholders for measuring progress in family planning programs against country targets and global initiatives as well as for estimating health outcomes. Because of the need for more frequent CPR estimates than population-based surveys currently provide, alternative approaches for estimating CPRs are being explored, including using contraceptive logistics data. Methods: Using data from the Demographic and Health Surveys (DHS) in 30 countries, population data from the United States Census Bureau International Database, and logistics data from the Procurement Planning and Monitoring Report (PPMR) and the Pipeline Monitoring and Procurement Planning System (PipeLine), we developed and evaluated 3 models to generate country-level, public-sector contraceptive prevalence estimates for injectable contraceptives, oral contraceptives, and male condoms. Models included: direct estimation through existing couple-years of protection (CYP) conversion factors, bivariate linear regression, and multivariate linear regression. Model evaluation consisted of comparing the referent DHS prevalence rates for each short-acting method with the model-generated prevalence rate using multiple metrics, including mean absolute error and proportion of countries where the modeled prevalence rate for each method was within 1, 2, or 5 percentage points of the DHS referent value. Results: For the methods studied, family planning use estimates from public-sector logistics data were correlated with those from the DHS, validating the quality and accuracy of current public-sector logistics data. Logistics data for oral and injectable contraceptives were significantly associated (P<.05) with the referent DHS values for both bivariate and multivariate models. For condoms, however, that association was only significant for the bivariate model. 
With the exception of the CYP-based model for condoms, models were able to estimate public-sector prevalence rates for each short-acting method to within 2 percentage points in at least 85% of countries. Conclusions: Public-sector contraceptive logistics data are strongly correlated with public-sector prevalence rates for short-acting methods, demonstrating the quality of current logistics data and their ability to provide relatively accurate prevalence estimates. The models provide a starting point for generating interim estimates of contraceptive use when timely survey data are unavailable. All models except the condoms CYP model performed well; the regression models were most accurate but the CYP model offers the simplest calculation method. Future work extending the research to other modern methods, relating subnational logistics data with prevalence rates, and tracking that relationship over time is needed. PMID:26374805
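
    The two evaluation metrics named in this abstract (mean absolute error and the proportion of countries within 1, 2, or 5 percentage points of the referent value) can be sketched as follows. This is a minimal illustration, not the authors' code; the function names and the four prevalence values are hypothetical.

```python
def mean_absolute_error(reference, modeled):
    """Average absolute gap, in percentage points, between paired values."""
    return sum(abs(r - m) for r, m in zip(reference, modeled)) / len(reference)


def share_within(reference, modeled, tolerance):
    """Fraction of pairs whose modeled value is within `tolerance` points."""
    hits = sum(1 for r, m in zip(reference, modeled) if abs(r - m) <= tolerance)
    return hits / len(reference)


# Hypothetical prevalence rates (percent) for four countries; not study data.
dhs = [10.0, 5.0, 20.0, 8.0]
model = [11.0, 5.5, 17.0, 8.0]

mae = mean_absolute_error(dhs, model)  # 1.125 percentage points
within = {t: share_within(dhs, model, t) for t in (1, 2, 5)}
print(mae, within)
```

    With these made-up values, three of the four countries fall within 1 and 2 points and all four fall within 5 points.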

  8. Mixed conditional logistic regression for habitat selection studies.

    PubMed

    Duchesne, Thierry; Fortin, Daniel; Courbin, Nicolas

    2010-05-01

    1. Resource selection functions (RSFs) are becoming a dominant tool in habitat selection studies. RSF coefficients can be estimated with unconditional (standard) and conditional logistic regressions. While the advantage of mixed-effects models is recognized for standard logistic regression, mixed conditional logistic regression remains largely overlooked in ecological studies. 2. We demonstrate the significance of mixed conditional logistic regression for habitat selection studies. First, we use spatially explicit models to illustrate how mixed-effects RSFs can be useful in the presence of inter-individual heterogeneity in selection and when the assumption of independence from irrelevant alternatives (IIA) is violated. The IIA hypothesis states that the strength of preference for habitat type A over habitat type B does not depend on the other habitat types also available. Secondly, we demonstrate the significance of mixed-effects models to evaluate habitat selection of free-ranging bison Bison bison. 3. When movement rules were homogeneous among individuals and the IIA assumption was respected, fixed-effects RSFs adequately described habitat selection by simulated animals. In situations violating the inter-individual homogeneity and IIA assumptions, however, RSFs were best estimated with mixed-effects regressions, and fixed-effects models could even provide faulty conclusions. 4. Mixed-effects models indicate that bison did not select farmlands, but exhibited strong inter-individual variations in their response to farmlands. Less than half of the bison preferred farmlands over forests. Conversely, the fixed-effect model simply suggested an overall selection for farmlands. 5. Conditional logistic regression is recognized as a powerful approach to evaluate habitat selection when resource availability changes. This regression is increasingly used in ecological studies, but almost exclusively in the context of fixed-effects models. 
Fitness maximization can imply differences in trade-offs among individuals, which can yield inter-individual differences in selection and lead to departure from IIA. These situations are best modelled with mixed-effects models. Mixed-effects conditional logistic regression should become a valuable tool for ecological research.

  9. Advanced colorectal neoplasia risk stratification by penalized logistic regression.

    PubMed

    Lin, Yunzhi; Yu, Menggang; Wang, Sijian; Chappell, Richard; Imperiale, Thomas F

    2016-08-01

    Colorectal cancer is the second leading cause of death from cancer in the United States. To facilitate the efficiency of colorectal cancer screening, there is a need to stratify risk for colorectal cancer among the 90% of US residents who are considered "average risk." In this article, we investigate such risk stratification rules for advanced colorectal neoplasia (colorectal cancer and advanced, precancerous polyps). We use a recently completed large cohort study of subjects who underwent a first screening colonoscopy. Logistic regression models have been used in the literature to estimate the risk of advanced colorectal neoplasia based on quantifiable risk factors. However, logistic regression may be prone to overfitting and instability in variable selection. Since most of the risk factors in our study have several categories, it was tempting to collapse these categories into fewer risk groups. We propose a penalized logistic regression method that automatically and simultaneously selects variables, groups categories, and estimates their coefficients by penalizing the [Formula: see text]-norm of both the coefficients and their differences. Hence, it encourages sparsity in the categories, i.e. grouping of the categories, and sparsity in the variables, i.e. variable selection. We apply the penalized logistic regression method to our data. The important variables are selected, with close categories simultaneously grouped, by penalized regression models with and without the interaction terms. The models are validated with 10-fold cross-validation. The receiver operating characteristic curves of the penalized regression models dominate the receiver operating characteristic curve of naive logistic regressions, indicating a superior discriminative performance. © The Author(s) 2013.

  10. Using Logistic Regression To Predict the Probability of Debris Flows Occurring in Areas Recently Burned By Wildland Fires

    USGS Publications Warehouse

    Rupert, Michael G.; Cannon, Susan H.; Gartner, Joseph E.

    2003-01-01

    Logistic regression was used to predict the probability of debris flows occurring in areas recently burned by wildland fires. Multiple logistic regression is conceptually similar to multiple linear regression because statistical relations between one dependent variable and several independent variables are evaluated. In logistic regression, however, the dependent variable is transformed to a binary variable (debris flow did or did not occur), and the actual probability of the debris flow occurring is statistically modeled. Data from 399 basins located within 15 wildland fires that burned during 2000-2002 in Colorado, Idaho, Montana, and New Mexico were evaluated. More than 35 independent variables describing the burn severity, geology, land surface gradient, rainfall, and soil properties were evaluated. The models were developed as follows: (1) Basins that did and did not produce debris flows were delineated from National Elevation Data using a Geographic Information System (GIS). (2) Data describing the burn severity, geology, land surface gradient, rainfall, and soil properties were determined for each basin. These data were then downloaded to a statistics software package for analysis using logistic regression. (3) Relations between the occurrence/non-occurrence of debris flows and burn severity, geology, land surface gradient, rainfall, and soil properties were evaluated and several preliminary multivariate logistic regression models were constructed. All possible combinations of independent variables were evaluated to determine which combination produced the most effective model. The multivariate model that best predicted the occurrence of debris flows was selected. (4) The multivariate logistic regression model was entered into a GIS, and a map showing the probability of debris flows was constructed. 
The most effective model incorporates the percentage of each basin with slope greater than 30 percent, percentage of land burned at medium and high burn severity in each basin, particle size sorting, average storm intensity (millimeters per hour), soil organic matter content, soil permeability, and soil drainage. The results of this study demonstrate that logistic regression is a valuable tool for predicting the probability of debris flows occurring in recently-burned landscapes.
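
    As an illustration of how a fitted logistic regression model turns basin characteristics into a probability, the sketch below applies the logistic function to a linear combination of predictors. The intercept, coefficients, and predictor values are hypothetical placeholders, not the values estimated in this study.

```python
import math


def predicted_probability(intercept, coefficients, values):
    """Logistic model: p = 1 / (1 + exp(-(b0 + sum(b_i * x_i))))."""
    linear = intercept + sum(b * x for b, x in zip(coefficients, values))
    return 1.0 / (1.0 + math.exp(-linear))


# Hypothetical coefficients for two predictors, e.g. percent of basin with
# steep slope and percent burned at medium or high severity (assumed names).
intercept = -2.0
coefficients = [0.03, 0.02]

gentle_basin = predicted_probability(intercept, coefficients, [10.0, 5.0])
steep_basin = predicted_probability(intercept, coefficients, [50.0, 25.0])
print(gentle_basin, steep_basin)
```

    Because both assumed coefficients are positive, the basin with larger predictor values receives the higher probability; evaluating the function for every basin is what a probability map would display.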

  11. Influence of landscape-scale factors in limiting brook trout populations in Pennsylvania streams

    USGS Publications Warehouse

    Kocovsky, P.M.; Carline, R.F.

    2006-01-01

    Landscapes influence the capacity of streams to produce trout through their effect on water chemistry and other factors at the reach scale. Trout abundance also fluctuates over time; thus, to thoroughly understand how spatial factors at landscape scales affect trout populations, one must assess the changes in populations over time to provide a context for interpreting the importance of spatial factors. We used data from the Pennsylvania Fish and Boat Commission's fisheries management database to investigate spatial factors that affect the capacity of streams to support brook trout Salvelinus fontinalis and to provide models useful for their management. We assessed the relative importance of spatial and temporal variation by calculating variance components and comparing relative standard errors for spatial and temporal variation. We used binary logistic regression to predict the presence of harvestable-length brook trout and multiple linear regression to assess the mechanistic links between landscapes and trout populations and to predict population density. The variance in trout density among streams was equal to or greater than the temporal variation for several streams, indicating that differences among sites affect population density. Logistic regression models correctly predicted the absence of harvestable-length brook trout in 60% of validation samples. The r²-value for the linear regression model predicting density was 0.3, indicating low predictive ability. Both logistic and linear regression models supported buffering capacity against acid episodes as an important mechanistic link between landscapes and trout populations. 
Although our models fail to predict trout densities precisely, their success at elucidating the mechanistic links between landscapes and trout populations, in concert with the importance of spatial variation, increases our understanding of factors affecting brook trout abundance and will help managers and private groups to protect and enhance populations of wild brook trout. © Copyright by the American Fisheries Society 2006.

  12. Predicting No-Shows in Radiology Using Regression Modeling of Data Available in the Electronic Medical Record.

    PubMed

    Harvey, H Benjamin; Liu, Catherine; Ai, Jing; Jaworsky, Cristina; Guerrier, Claude Emmanuel; Flores, Efren; Pianykh, Oleg

    2017-10-01

    To test whether data elements available in the electronic medical record (EMR) can be effectively leveraged to predict failure to attend a scheduled radiology examination. Using data from a large academic medical center, we identified all patients with a diagnostic imaging examination scheduled from January 1, 2016, to April 1, 2016, and determined whether the patient successfully attended the examination. Demographic, clinical, and health services utilization variables available in the EMR potentially relevant to examination attendance were recorded for each patient. We used descriptive statistics and logistic regression models to test whether these data elements could predict failure to attend a scheduled radiology examination. The predictive accuracy of the regression models was determined by calculating the area under the receiver operator curve. Among the 54,652 patient appointments with radiology examinations scheduled during the study period, 6.5% were no-shows. No-show rates were highest for the modalities of mammography and CT and lowest for PET and MRI. Logistic regression indicated that 16 of the 27 demographic, clinical, and health services utilization factors were significantly associated with failure to attend a scheduled radiology examination (P ≤ .05). Stepwise logistic regression analysis demonstrated that previous no-shows, days between scheduling and appointments, modality type, and insurance type were most strongly predictive of no-show. A model considering all 16 data elements had good ability to predict radiology no-shows (area under the receiver operator curve = 0.753). The predictive ability was similar or improved when these models were analyzed by modality. Patient and examination information readily available in the EMR can be successfully used to predict radiology no-shows. 
Moving forward, this information can be proactively leveraged to identify patients who might benefit from additional patient engagement through appointment reminders or other targeted interventions to avoid no-shows. Copyright © 2017 American College of Radiology. Published by Elsevier Inc. All rights reserved.

  13. Prediction of unwanted pregnancies using logistic regression, probit regression and discriminant analysis

    PubMed Central

    Ebrahimzadeh, Farzad; Hajizadeh, Ebrahim; Vahabi, Nasim; Almasian, Mohammad; Bakhteyar, Katayoon

    2015-01-01

    Background: Unwanted pregnancy not intended by at least one of the parents has undesirable consequences for the family and the society. In the present study, three classification models were used and compared to predict unwanted pregnancies in an urban population. Methods: In this cross-sectional study, 887 pregnant mothers referring to health centers in Khorramabad, Iran, in 2012 were selected by the stratified and cluster sampling; relevant variables were measured and for prediction of unwanted pregnancy, logistic regression, discriminant analysis, and probit regression models and SPSS software version 21 were used. To compare these models, indicators such as sensitivity, specificity, the area under the ROC curve, and the percentage of correct predictions were used. Results: The prevalence of unwanted pregnancies was 25.3%. The logistic and probit regression models indicated that parity and pregnancy spacing, contraceptive methods, household income and number of living male children were related to unwanted pregnancy. The performance of the models based on the area under the ROC curve was 0.735, 0.733, and 0.680 for logistic regression, probit regression, and linear discriminant analysis, respectively. Conclusion: Given the relatively high prevalence of unwanted pregnancies in Khorramabad, it seems necessary to revise family planning programs. Despite the similar accuracy of the models, if the researcher is interested in the interpretability of the results, the use of the logistic regression model is recommended. PMID:26793655
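
    The comparison indicators listed in this abstract (sensitivity, specificity, percentage of correct predictions, and area under the ROC curve) can be computed from predicted probabilities as sketched below. The labels and probabilities are hypothetical, the 0.5 cut-off is an assumption, and the AUC is computed with the pairwise (Mann-Whitney) definition rather than by whatever routine the study's software used.

```python
def classification_summary(labels, probabilities, threshold=0.5):
    """Sensitivity, specificity, and percent correct at a given cut-off."""
    tp = fp = tn = fn = 0
    for y, p in zip(labels, probabilities):
        predicted = 1 if p >= threshold else 0
        if y == 1 and predicted == 1:
            tp += 1
        elif y == 1:
            fn += 1
        elif predicted == 1:
            fp += 1
        else:
            tn += 1
    return {
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "percent_correct": 100.0 * (tp + tn) / len(labels),
    }


def roc_auc(labels, probabilities):
    """Probability that a random positive outranks a random negative."""
    pos = [p for y, p in zip(labels, probabilities) if y == 1]
    neg = [p for y, p in zip(labels, probabilities) if y == 0]
    wins = sum(1.0 if a > b else 0.5 if a == b else 0.0 for a in pos for b in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical outcomes (1 = event) and model probabilities; not study data.
labels = [1, 1, 0, 0, 0]
probabilities = [0.9, 0.4, 0.6, 0.2, 0.1]

summary = classification_summary(labels, probabilities)
auc = roc_auc(labels, probabilities)
print(summary, auc)
```

    Sensitivity and specificity depend on the chosen cut-off, whereas the AUC summarizes ranking quality across all cut-offs, which is why it is convenient for comparing different model families.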

  14. Prediction of unwanted pregnancies using logistic regression, probit regression and discriminant analysis.

    PubMed

    Ebrahimzadeh, Farzad; Hajizadeh, Ebrahim; Vahabi, Nasim; Almasian, Mohammad; Bakhteyar, Katayoon

    2015-01-01

    Unwanted pregnancy not intended by at least one of the parents has undesirable consequences for the family and the society. In the present study, three classification models were used and compared to predict unwanted pregnancies in an urban population. In this cross-sectional study, 887 pregnant mothers referring to health centers in Khorramabad, Iran, in 2012 were selected by the stratified and cluster sampling; relevant variables were measured and for prediction of unwanted pregnancy, logistic regression, discriminant analysis, and probit regression models and SPSS software version 21 were used. To compare these models, indicators such as sensitivity, specificity, the area under the ROC curve, and the percentage of correct predictions were used. The prevalence of unwanted pregnancies was 25.3%. The logistic and probit regression models indicated that parity and pregnancy spacing, contraceptive methods, household income and number of living male children were related to unwanted pregnancy. The performance of the models based on the area under the ROC curve was 0.735, 0.733, and 0.680 for logistic regression, probit regression, and linear discriminant analysis, respectively. Given the relatively high prevalence of unwanted pregnancies in Khorramabad, it seems necessary to revise family planning programs. Despite the similar accuracy of the models, if the researcher is interested in the interpretability of the results, the use of the logistic regression model is recommended.

  15. Predictors of course in obsessive-compulsive disorder: logistic regression versus Cox regression for recurrent events.

    PubMed

    Kempe, P T; van Oppen, P; de Haan, E; Twisk, J W R; Sluis, A; Smit, J H; van Dyck, R; van Balkom, A J L M

    2007-09-01

    Two methods for predicting remissions in obsessive-compulsive disorder (OCD) treatment are evaluated. Y-BOCS measurements of 88 patients with a primary OCD (DSM-III-R) diagnosis were performed over a 16-week treatment period, and during three follow-ups. Remission at any measurement was defined as a Y-BOCS score lower than thirteen combined with a reduction of seven points when compared with baseline. Logistic regression models were compared with a Cox regression for recurrent events model. Logistic regression yielded different models at different evaluation times. The recurrent events model remained stable when fewer measurements were used. Higher baseline levels of neuroticism and more severe OCD symptoms were associated with a lower chance of remission, early age of onset and more depressive symptoms with a higher chance. Choice of outcome time affects logistic regression prediction models. Recurrent events analysis uses all information on remissions and relapses. Short- and long-term predictors for OCD remission show overlap.

  16. Estimating the exceedance probability of rain rate by logistic regression

    NASA Technical Reports Server (NTRS)

    Chiu, Long S.; Kedem, Benjamin

    1990-01-01

    Recent studies have shown that the fraction of an area with rain intensity above a fixed threshold is highly correlated with the area-averaged rain rate. To estimate the fractional rainy area, a logistic regression model, which estimates the conditional probability that rain rate over an area exceeds a fixed threshold given the values of related covariates, is developed. The problem of dependency in the data in the estimation procedure is bypassed by the method of partial likelihood. Analyses of simulated scanning multichannel microwave radiometer and observed electrically scanning microwave radiometer data during the Global Atlantic Tropical Experiment period show that the use of logistic regression in pixel classification is superior to multiple regression in predicting whether rain rate at each pixel exceeds a given threshold, even in the presence of noisy data. The potential of the logistic regression technique in satellite rain rate estimation is discussed.

  17. Comparison of naïve Bayes and logistic regression for computer-aided diagnosis of breast masses using ultrasound imaging

    NASA Astrophysics Data System (ADS)

    Cary, Theodore W.; Cwanger, Alyssa; Venkatesh, Santosh S.; Conant, Emily F.; Sehgal, Chandra M.

    2012-03-01

    This study compares the performance of two proven but very different machine learners, Naïve Bayes and logistic regression, for differentiating malignant and benign breast masses using ultrasound imaging. Ultrasound images of 266 masses were analyzed quantitatively for shape, echogenicity, margin characteristics, and texture features. These features along with patient age, race, and mammographic BI-RADS category were used to train Naïve Bayes and logistic regression classifiers to diagnose lesions as malignant or benign. ROC analysis was performed using all of the features and using only a subset that maximized information gain. Performance was determined by the area under the ROC curve, Az, obtained from leave-one-out cross validation. Naïve Bayes showed significant variation (Az 0.733 +/- 0.035 to 0.840 +/- 0.029, P < 0.002) with the choice of features, but the performance of logistic regression was relatively unchanged under feature selection (Az 0.839 +/- 0.029 to 0.859 +/- 0.028, P = 0.605). Out of 34 features, a subset of 6 gave the highest information gain: brightness difference, margin sharpness, depth-to-width, mammographic BI-RADS, age, and race. The probabilities of malignancy determined by Naïve Bayes and logistic regression after feature selection showed significant correlation (R² = 0.87, P < 0.0001). The diagnostic performance of Naïve Bayes and logistic regression can be comparable, but logistic regression is more robust. Since probability of malignancy cannot be measured directly, high correlation between the probabilities derived from two basic but dissimilar models increases confidence in the predictive power of machine learning models for characterizing solid breast masses on ultrasound.

  18. Variable Selection in Logistic Regression.

    DTIC Science & Technology

    1987-06-01

    Authors: Z. D. Bai, P. R. Krishnaiah and L. C. Zhao. Contract or grant number: F49620-85-C-0008. Performing organization: Center for Multivariate Analysis, University of Pittsburgh.

  19. Multinomial Logistic Regression Predicted Probability Map To Visualize The Influence Of Socio-Economic Factors On Breast Cancer Occurrence in Southern Karnataka

    NASA Astrophysics Data System (ADS)

    Madhu, B.; Ashok, N. C.; Balasubramanian, S.

    2014-11-01

    Multinomial logistic regression analysis was used to develop a statistical model that can predict the probability of breast cancer in Southern Karnataka using breast cancer occurrence data from 2007-2011. Independent socio-economic variables describing the breast cancer occurrence, such as age, education, occupation, parity, type of family, health insurance coverage, residential locality and socioeconomic status, were obtained for each case. The models were developed as follows: i) The urban-rural distribution of breast cancer cases obtained from the Bharat Hospital and Institute of Oncology was visualized spatially. ii) Socio-economic risk factors describing the breast cancer occurrences were compiled for each case. These data were then analysed using multinomial logistic regression in SPSS statistical software; relations between the occurrence of breast cancer across socio-economic status and the influence of other socio-economic variables were evaluated, and multinomial logistic regression models were constructed. iii) The model that best predicted the occurrence of breast cancer was identified. This multivariate logistic regression model was entered into a geographic information system, and maps showing the predicted probability of breast cancer occurrence in Southern Karnataka were created. This study demonstrates that multinomial logistic regression is a valuable tool for developing models that predict the probability of breast cancer occurrence in Southern Karnataka.

  20. Comparison of Logistic Regression and Artificial Neural Network in Low Back Pain Prediction: Second National Health Survey

    PubMed Central

    Parsaeian, M; Mohammad, K; Mahmoudi, M; Zeraati, H

    2012-01-01

    Background: The purpose of this investigation was to empirically compare the predictive ability of an artificial neural network with that of logistic regression in the prediction of low back pain. Methods: Data from the second national health survey were considered in this investigation. These data include information on low back pain and its associated risk factors among Iranian people aged 15 years and older. Artificial neural network and logistic regression models were developed using a set of 17294 data and they were validated in a test set of 17295 data. Hosmer and Lemeshow recommendation for model selection was used in fitting the logistic regression. A three-layer perceptron with 9 inputs, 3 hidden and 1 output neurons was employed. The efficiency of the two models was compared by receiver operating characteristic analysis, root mean square and -2 Loglikelihood criteria. Results: The area under the ROC curve (SE), root mean square and -2Loglikelihood of the logistic regression was 0.752 (0.004), 0.3832 and 14769.2, respectively. The area under the ROC curve (SE), root mean square and -2Loglikelihood of the artificial neural network was 0.754 (0.004), 0.3770 and 14757.6, respectively. Conclusions: Based on these three criteria, the artificial neural network would give better performance than logistic regression. Although the difference is statistically significant, it does not seem to be clinically significant. PMID:23113198
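
    Two of the comparison criteria named in this abstract, root mean square and -2 log-likelihood, can be computed from observed outcomes and predicted probabilities as sketched below. The sketch assumes the usual definitions (root mean squared difference between outcome and predicted probability, and the binomial log-likelihood); the abstract does not spell out its formulas, and the example values are hypothetical.

```python
import math


def root_mean_square_error(labels, probabilities):
    """Square root of the mean squared gap between outcome and probability."""
    n = len(labels)
    return math.sqrt(sum((y - p) ** 2 for y, p in zip(labels, probabilities)) / n)


def minus_two_log_likelihood(labels, probabilities):
    """-2 times the binomial log-likelihood of the observed 0/1 outcomes."""
    log_lik = sum(
        math.log(p) if y == 1 else math.log(1.0 - p)
        for y, p in zip(labels, probabilities)
    )
    return -2.0 * log_lik


# Hypothetical outcomes and predictions from two competing models.
labels = [1, 0, 1, 0]
model_a = [0.5, 0.5, 0.5, 0.5]   # uninformative predictions
model_b = [0.8, 0.2, 0.7, 0.4]   # sharper predictions

rmse_a = root_mean_square_error(labels, model_a)
rmse_b = root_mean_square_error(labels, model_b)
dev_a = minus_two_log_likelihood(labels, model_a)
dev_b = minus_two_log_likelihood(labels, model_b)
print(rmse_a, rmse_b, dev_a, dev_b)
```

    Lower values indicate better fit on both criteria, so the sharper hypothetical model scores lower than the uninformative one.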

  1. Comparison of logistic regression and artificial neural network in low back pain prediction: second national health survey.

    PubMed

    Parsaeian, M; Mohammad, K; Mahmoudi, M; Zeraati, H

    2012-01-01

    The purpose of this investigation was to empirically compare the predictive ability of an artificial neural network with that of logistic regression in the prediction of low back pain. Data from the second national health survey were considered in this investigation. These data include information on low back pain and its associated risk factors among Iranian people aged 15 years and older. Artificial neural network and logistic regression models were developed using a set of 17294 data and they were validated in a test set of 17295 data. Hosmer and Lemeshow recommendation for model selection was used in fitting the logistic regression. A three-layer perceptron with 9 inputs, 3 hidden and 1 output neurons was employed. The efficiency of the two models was compared by receiver operating characteristic analysis, root mean square and -2 Loglikelihood criteria. The area under the ROC curve (SE), root mean square and -2Loglikelihood of the logistic regression was 0.752 (0.004), 0.3832 and 14769.2, respectively. The area under the ROC curve (SE), root mean square and -2Loglikelihood of the artificial neural network was 0.754 (0.004), 0.3770 and 14757.6, respectively. Based on these three criteria, the artificial neural network would give better performance than logistic regression. Although the difference is statistically significant, it does not seem to be clinically significant.

  2. Understanding logistic regression analysis.

    PubMed

    Sperandei, Sandro

    2014-01-01

    Logistic regression is used to obtain odds ratio in the presence of more than one explanatory variable. The procedure is quite similar to multiple linear regression, with the exception that the response variable is binomial. The result is the impact of each variable on the odds ratio of the observed event of interest. The main advantage is to avoid confounding effects by analyzing the association of all variables together. In this article, we explain the logistic regression procedure using examples to make it as simple as possible. After definition of the technique, the basic interpretation of the results is highlighted and then some special issues are discussed.
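
    The link between a logistic regression coefficient and an odds ratio can be shown in a few lines: the odds ratio for a one-unit increase in a variable is the exponential of its coefficient, and a Wald-type 95% confidence interval is obtained by exponentiating the coefficient plus or minus 1.96 standard errors. The coefficient and standard error below are hypothetical.

```python
import math


def odds_ratio_with_ci(beta, std_error, z=1.96):
    """Odds ratio exp(beta) with a Wald interval exp(beta -/+ z * SE)."""
    return (
        math.exp(beta),
        math.exp(beta - z * std_error),
        math.exp(beta + z * std_error),
    )


# Hypothetical fitted coefficient and standard error for one predictor.
beta = math.log(2.0)
std_error = 0.25

odds_ratio, lower, upper = odds_ratio_with_ci(beta, std_error)
print(odds_ratio, lower, upper)
```

    A coefficient of zero corresponds to an odds ratio of 1 (no association), and an interval that excludes 1 corresponds to a coefficient that differs significantly from zero at the matching level.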

  3. Phobic Anxiety and Plasma Levels of Global Oxidative Stress in Women.

    PubMed

    Hagan, Kaitlin A; Wu, Tianying; Rimm, Eric B; Eliassen, A Heather; Okereke, Olivia I

    2015-01-01

    Psychological distress has been hypothesized to be associated with adverse biologic states such as higher oxidative stress and inflammation. Yet, little is known about associations between a common form of distress - phobic anxiety - and global oxidative stress. Thus, we related phobic anxiety to plasma fluorescent oxidation products (FlOPs), a global oxidative stress marker. We conducted a cross-sectional analysis among 1,325 women (aged 43-70 years) from the Nurses' Health Study. Phobic anxiety was measured using the Crown-Crisp Index (CCI). Adjusted least-squares mean log-transformed FlOPs were calculated across phobic categories. Logistic regression models were used to calculate odds ratios (OR) comparing the highest CCI category (≥6 points) vs. lower scores, across FlOPs quartiles. No association was found between phobic anxiety categories and mean FlOP levels in multivariable adjusted linear models. Similarly, in multivariable logistic regression models there were no associations between FlOPs quartiles and likelihood of being in the highest phobic category. Comparing women in the highest vs. lowest FlOPs quartiles: FlOP_360: OR=0.68 (95% CI: 0.40-1.15); FlOP_320: OR=0.99 (95% CI: 0.61-1.61); FlOP_400: OR=0.92 (95% CI: 0.52, 1.63). No cross-sectional association was found between phobic anxiety and a plasma measure of global oxidative stress in this sample of middle-aged and older women.

  4. Comparing Methodologies for Developing an Early Warning System: Classification and Regression Tree Model versus Logistic Regression. REL 2015-077

    ERIC Educational Resources Information Center

    Koon, Sharon; Petscher, Yaacov

    2015-01-01

    The purpose of this report was to explicate the use of logistic regression and classification and regression tree (CART) analysis in the development of early warning systems. It was motivated by state education leaders' interest in maintaining high classification accuracy while simultaneously improving practitioner understanding of the rules by…

  5. Finding the Perfect Match: Factors That Influence Family Medicine Residency Selection.

    PubMed

    Wright, Katherine M; Ryan, Elizabeth R; Gatta, John L; Anderson, Lauren; Clements, Deborah S

    2016-04-01

    Residency program selection is a significant experience for emerging physicians, yet there is limited information about how applicants narrow their list of potential programs. This study examines factors that influence residency program selection among medical students interested in family medicine at the time of application. Medical students with an expressed interest in family medicine were invited to participate in a 37-item, online survey. Students were asked to rate factors that may impact residency selection on a 6-point Likert scale in addition to three open-ended qualitative questions. Mean values were calculated for each survey item and were used to determine a rank order for selection criteria. Logistic regression analysis was performed to identify factors that predict a strong interest in urban, suburban, and rural residency programs. Logistic regression was also used to identify factors that predict a strong interest in academic health center-based residencies, community-based residencies, and community-based residencies with an academic affiliation. A total of 705 medical students from 32 states across the country completed the survey. Location, work/life balance, and program structure (curriculum, schedule) were rated the most important factors for residency selection. Logistic regression analysis was used to refine our understanding of how each factor relates to specific types of residencies. These findings have implications for how to best advise students in selecting a residency, as well as marketing residencies to the right candidates. Refining the recruitment process will ensure a better fit between applicants and potential programs. Limited recruitment resources may be better utilized by focusing on targeted dissemination strategies.

  6. Stata Modules for Calculating Novel Predictive Performance Indices for Logistic Models

    PubMed Central

    Barkhordari, Mahnaz; Padyab, Mojgan; Hadaegh, Farzad; Azizi, Fereidoun; Bozorgmanesh, Mohammadreza

    2016-01-01

    Background: Prediction is a fundamental part of the prevention of cardiovascular diseases (CVD). The development of prediction algorithms based on multivariate regression models began several decades ago. In parallel with the development of predictive models, biomarker research emerged on an impressively large scale. The key question is how best to assess and quantify the improvement in risk prediction offered by new biomarkers or, more basically, how to assess the performance of a risk prediction model. Discrimination, calibration, and added predictive value have recently been suggested for comparing the predictive performance of models with and without novel biomarkers. Objectives: A lack of user-friendly statistical software has restricted the implementation of novel model assessment methods when examining novel biomarkers. We therefore intended to develop user-friendly software that could be used by researchers with few programming skills. Materials and Methods: We have written a Stata command that is intended to help researchers obtain the cut point-free and cut point-based net reclassification improvement index (NRI) and the relative and absolute integrated discriminatory improvement index (IDI) for logistic-based regression analyses. We applied the commands to real data on women participating in the Tehran lipid and glucose study (TLGS) to examine whether information on a family history of premature CVD, waist circumference, and fasting plasma glucose can improve the predictive performance of the Framingham "general CVD risk" algorithm. Results: The command is addpred for logistic regression models. Conclusions: The Stata package provided herein can encourage the use of novel methods in examining the predictive capacity of the ever-emerging plethora of novel biomarkers. PMID:27279830
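
    For readers unfamiliar with the indices this record mentions, the sketch below computes the absolute IDI and the cut point-free NRI from the predicted probabilities of a baseline and an extended logistic model, using the commonly cited definitions. It is written in Python for illustration only; it is not the addpred command or its algorithm, and the function names and data are hypothetical.

```python
def _mean(values):
    return sum(values) / len(values)


def absolute_idi(events, p_old, p_new):
    """Gain in mean predicted risk among events minus gain among non-events."""
    ev_gain = _mean([n - o for e, o, n in zip(events, p_old, p_new) if e == 1])
    ne_gain = _mean([n - o for e, o, n in zip(events, p_old, p_new) if e == 0])
    return ev_gain - ne_gain


def cut_point_free_nri(events, p_old, p_new):
    """Net share of events moved up plus net share of non-events moved down."""
    def net_up(flag):
        moves = [n - o for e, o, n in zip(events, p_old, p_new) if e == flag]
        up = sum(1 for m in moves if m > 0)
        down = sum(1 for m in moves if m < 0)
        return (up - down) / len(moves)
    return net_up(1) - net_up(0)


# Hypothetical outcomes and predicted risks without / with a new biomarker.
events = [1, 1, 0, 0]
p_old = [0.6, 0.4, 0.5, 0.3]
p_new = [0.8, 0.5, 0.4, 0.3]

idi = absolute_idi(events, p_old, p_new)
nri = cut_point_free_nri(events, p_old, p_new)
print(idi, nri)
```

    Positive values of either index suggest that the extended model assigns higher risk to subjects who had the event and lower risk to those who did not.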

  7. Differentiating major depressive disorder in youths with attention deficit hyperactivity disorder.

    PubMed

    Diler, Rasim Somer; Daviss, W Burleson; Lopez, Adriana; Axelson, David; Iyengar, Satish; Birmaher, Boris

    2007-09-01

    Youths with attention deficit hyperactivity disorders (ADHD) frequently have comorbid major depressive disorders (MDD) sharing overlapping symptoms. Our objective was to examine which depressive symptoms best discriminate MDD among youths with ADHD. One-hundred-eleven youths with ADHD (5.2-17.8 years old) and their parents completed interviews with the K-SADS-PL and respective versions of the child or the parent Mood and Feelings Questionnaire (MFQ-C, MFQ-P). Controlling for group differences, logistic regression was used to calculate odds ratios reflecting the accuracy with which various depressive symptoms on the MFQ-C or MFQ-P discriminated MDD. Stepwise logistic regression then identified depressive symptoms that best discriminated the groups with and without MDD, using cross-validated misclassification rate as the criterion. Symptoms that discriminated youths with MDD (n=18) from those without MDD (n=93) were 4 of 6 mood/anhedonia symptoms, all 14 depressed cognition symptoms, and only 3 of 11 physical/vegetative symptoms. Mild irritability, miserable/unhappy moods, and symptoms related to sleep, appetite, energy levels and concentration did not discriminate MDD. A stepwise logistic regression correctly classified 89% of the comorbid MDD subjects, with only age, anhedonia at school, thoughts about killing self, thoughts that bad things would happen, and talking more slowly remaining in the final model. Results of this study may not generalize to community samples because subjects were drawn largely from a university-based outpatient psychiatric clinic. These findings stress the importance of social withdrawal, anhedonia, depressive cognitions, suicidal thoughts, and psychomotor retardation when trying to identify MDD among ADHD youths.

  8. Using Multiple and Logistic Regression to Estimate the Median WillCost and Probability of Cost and Schedule Overrun for Program Managers

    DTIC Science & Technology

    2017-03-23

    PUBLIC RELEASE; DISTRIBUTION UNLIMITED Using Multiple and Logistic Regression to Estimate the Median Will- Cost and Probability of Cost and... Cost and Probability of Cost and Schedule Overrun for Program Managers Ryan C. Trudelle Follow this and additional works at: https://scholar.afit.edu...afit.edu. Recommended Citation Trudelle, Ryan C., "Using Multiple and Logistic Regression to Estimate the Median Will- Cost and Probability of Cost and

  9. Expression of Proteins Involved in Epithelial-Mesenchymal Transition as Predictors of Metastasis and Survival in Breast Cancer Patients

    DTIC Science & Technology

    2013-11-01

    Ptrend 0.78 0.62 0.75 Unconditional logistic regression was used to estimate odds ratios (OR) and 95 % confidence intervals (CI) for risk of node...Ptrend 0.71 0.67 Unconditional logistic regression was used to estimate odds ratios (OR) and 95 % confidence intervals (CI) for risk of high-grade tumors... logistic regression was used to estimate odds ratios (OR) and 95 % confidence intervals (CI) for the associations between each of the seven SNPs and

  10. Logistic LASSO regression for the diagnosis of breast cancer using clinical demographic data and the BI-RADS lexicon for ultrasonography.

    PubMed

    Kim, Sun Mi; Kim, Yongdai; Jeong, Kuhwan; Jeong, Heeyeong; Kim, Jiyoung

    2018-01-01

    The aim of this study was to compare the performance of image analysis for predicting breast cancer using two distinct regression models and to evaluate the usefulness of incorporating clinical and demographic data (CDD) into the image analysis in order to improve the diagnosis of breast cancer. This study included 139 solid masses from 139 patients who underwent an ultrasonography-guided core biopsy and had available CDD between June 2009 and April 2010. Three breast radiologists retrospectively reviewed 139 breast masses and described each lesion using the Breast Imaging Reporting and Data System (BI-RADS) lexicon. We applied and compared two regression methods, stepwise logistic (SL) regression and logistic least absolute shrinkage and selection operator (LASSO) regression, in which the BI-RADS descriptors and CDD were used as covariates. We investigated the performances of these regression methods and the agreement of radiologists in terms of test misclassification error and the area under the curve (AUC) of the tests. Logistic LASSO regression was superior (P<0.05) to SL regression, regardless of whether CDD was included in the covariates, in terms of test misclassification errors (0.234 vs. 0.253, without CDD; 0.196 vs. 0.258, with CDD) and AUC (0.785 vs. 0.759, without CDD; 0.873 vs. 0.735, with CDD). However, it was inferior (P<0.05) to the agreement of three radiologists in terms of test misclassification errors (0.234 vs. 0.168, without CDD; 0.196 vs. 0.088, with CDD) and the AUC without CDD (0.785 vs. 0.844, P<0.001), but was comparable to the AUC with CDD (0.873 vs. 0.880, P=0.141). Logistic LASSO regression based on BI-RADS descriptors and CDD showed better performance than SL in predicting the presence of breast cancer. The use of CDD as a supplement to the BI-RADS descriptors significantly improved the prediction of breast cancer using logistic LASSO regression.
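
    The two evaluation criteria used in this abstract, test misclassification error and AUC, can be sketched in a few lines of Python. This is an illustration only: the 0.5 classification threshold, the function names, and the toy labels and scores are assumptions, not details from the study.

```python
def misclassification_error(y_true, p_hat, threshold=0.5):
    """Share of cases whose thresholded prediction disagrees with the label."""
    wrong = sum(1 for y, p in zip(y_true, p_hat) if (p >= threshold) != (y == 1))
    return wrong / len(y_true)


def auc(y_true, p_hat):
    """Probability that a random positive outscores a random negative
    (ties count one half); equivalent to the area under the ROC curve."""
    pos = [p for y, p in zip(y_true, p_hat) if y == 1]
    neg = [p for y, p in zip(y_true, p_hat) if y == 0]
    wins = sum(1.0 if a > b else 0.5 if a == b else 0.0 for a in pos for b in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical held-out labels (1 = malignant) and predicted probabilities.
labels = [1, 1, 1, 0, 0, 0, 0]
scores = [0.9, 0.6, 0.4, 0.7, 0.3, 0.2, 0.1]
print(misclassification_error(labels, scores))
print(auc(labels, scores))
```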

  11. Building and verifying a severity prediction model of acute pancreatitis (AP) based on BISAP, MEWS and routine test indexes.

    PubMed

    Ye, Jiang-Feng; Zhao, Yu-Xin; Ju, Jian; Wang, Wei

    2017-10-01

    To discuss the value of the Bedside Index for Severity in Acute Pancreatitis (BISAP), Modified Early Warning Score (MEWS), serum calcium (hereinafter serum Ca2+), and red cell distribution width (RDW) for predicting the severity grade of acute pancreatitis (AP) and to develop and verify a more accurate scoring system to predict the severity of AP. In 302 patients with AP, we calculated BISAP and MEWS scores and analyzed the relationships of BISAP, RDW, MEWS, and serum Ca2+ with the severity of AP using single-factor logistic regression. The variables with statistical significance in the single-factor logistic regression were used in a multi-factor logistic regression model; forward stepwise regression was used to screen variables and build a multi-factor prediction model. A receiver operating characteristic curve (ROC curve) was constructed, and the ability of the multi- and single-factor prediction models to predict the severity of AP was evaluated using the area under the ROC curve (AUC). The internal validity of the model was verified through bootstrapping. Among 302 patients with AP, 209 had mild acute pancreatitis (MAP) and 93 had severe acute pancreatitis (SAP). According to single-factor logistic regression analysis, we found that BISAP, MEWS and serum Ca2+ are prediction indexes of the severity of AP (P-value<0.001), whereas RDW is not a prediction index of AP severity (P-value>0.05). The multi-factor logistic regression analysis showed that BISAP and serum Ca2+ are independent prediction indexes of AP severity (P-value<0.001), and MEWS is not an independent prediction index of AP severity (P-value>0.05); BISAP is negatively related to serum Ca2+ (r=-0.330, P-value<0.001). The constructed model is as follows: ln[P/(1-P)] = 7.306 + 1.151*BISAP - 4.516*serum Ca2+. The predictive ability of each model for SAP follows the order: combined BISAP and serum Ca2+ prediction model > Ca2+ > BISAP. 
The predictive abilities of BISAP and serum Ca2+ do not differ significantly (P-value>0.05); however, the predictive ability of the newly built prediction model differs significantly from that of BISAP and of serum Ca2+ individually (P-value<0.01). Verification of the internal validity of the models by bootstrapping is favorable. BISAP and serum Ca2+ have high predictive value for the severity of AP. However, the model built by combining BISAP and serum Ca2+ is remarkably superior to BISAP and serum Ca2+ individually. Furthermore, this model is simple, practical and appropriate for clinical use. Copyright © 2016. Published by Elsevier Masson SAS.
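
    The reported equation can be turned into a predicted probability with the inverse logit. The sketch below assumes that the left-hand side of the equation is the log-odds of severe acute pancreatitis and that serum Ca2+ is entered in the units the authors used (the abstract does not state them; the example value of 2.0 is hypothetical). The function name is an illustration, not part of the study.

```python
import math


def sap_probability(bisap, serum_ca):
    """Predicted probability of severe AP from the coefficients in the abstract.

    Assumes logit(P) = 7.306 + 1.151*BISAP - 4.516*Ca2+; the units of
    serum_ca are not given in the abstract.
    """
    logit = 7.306 + 1.151 * bisap - 4.516 * serum_ca
    return 1.0 / (1.0 + math.exp(-logit))


# Hypothetical patient: BISAP score 2, serum Ca2+ of 2.0.
print(round(sap_probability(2, 2.0), 3))
```

    Consistent with the signs of the coefficients, the predicted probability rises with BISAP and falls as serum Ca2+ increases.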

  12. Use and interpretation of logistic regression in habitat-selection studies

    USGS Publications Warehouse

    Keating, Kim A.; Cherry, Steve

    2004-01-01

     Logistic regression is an important tool for wildlife habitat-selection studies, but the method frequently has been misapplied due to an inadequate understanding of the logistic model, its interpretation, and the influence of sampling design. To promote better use of this method, we review its application and interpretation under 3 sampling designs: random, case-control, and use-availability. Logistic regression is appropriate for habitat use-nonuse studies employing random sampling and can be used to directly model the conditional probability of use in such cases. Logistic regression also is appropriate for studies employing case-control sampling designs, but careful attention is required to interpret results correctly. Unless bias can be estimated or probability of use is small for all habitats, results of case-control studies should be interpreted as odds ratios, rather than probability of use or relative probability of use. When data are gathered under a use-availability design, logistic regression can be used to estimate approximate odds ratios if probability of use is small, at least on average. More generally, however, logistic regression is inappropriate for modeling habitat selection in use-availability studies. In particular, using logistic regression to fit the exponential model of Manly et al. (2002:100) does not guarantee maximum-likelihood estimates, valid probabilities, or valid likelihoods. We show that the resource selection function (RSF) commonly used for the exponential model is proportional to a logistic discriminant function. Thus, it may be used to rank habitats with respect to probability of use and to identify important habitat characteristics or their surrogates, but it is not guaranteed to be proportional to probability of use. Other problems associated with the exponential model also are discussed. 
We describe an alternative model based on Lancaster and Imbens (1996) that offers a method for estimating conditional probability of use in use-availability studies. Although promising, this model fails to converge to a unique solution in some important situations. Further work is needed to obtain a robust method that is broadly applicable to use-availability studies.
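
    The point about case-control designs, that odds ratios survive the sampling while probabilities of use do not, can be checked with a small Python sketch. All counts, the 1-in-100 control sampling rate, and the function names are hypothetical and chosen only for illustration.

```python
def odds_ratio(a, b, c, d):
    """OR for a 2x2 table: a = used with feature, b = used without,
    c = unused with feature, d = unused without."""
    return (a * d) / (b * c)


def case_control_sample(pop, control_every=100):
    """Keep every used unit (cases) but only 1 in `control_every` unused units."""
    return {
        "used_with": pop["used_with"],
        "used_without": pop["used_without"],
        "unused_with": pop["unused_with"] // control_every,
        "unused_without": pop["unused_without"] // control_every,
    }


def table_odds_ratio(t):
    return odds_ratio(t["used_with"], t["used_without"],
                      t["unused_with"], t["unused_without"])


def naive_use_probability(t):
    """Share of units with the feature that were used, read straight off the table."""
    return t["used_with"] / (t["used_with"] + t["unused_with"])


# Hypothetical population of habitat units, with and without some habitat feature.
population = {"used_with": 40, "used_without": 60,
              "unused_with": 1000, "unused_without": 8900}
sample = case_control_sample(population)

print(table_odds_ratio(population), table_odds_ratio(sample))            # same odds ratio
print(naive_use_probability(population), naive_use_probability(sample))  # very different
```

    In this toy table the odds ratio is unchanged by the sampling, whereas the naive probability of use is inflated many times over, which is why the abstract recommends interpreting case-control results as odds ratios.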

  13. Modeling Governance KB with CATPCA to Overcome Multicollinearity in the Logistic Regression

    NASA Astrophysics Data System (ADS)

    Khikmah, L.; Wijayanto, H.; Syafitri, U. D.

    2017-04-01

    Multicollinearity is a problem often encountered in logistic regression modeling. Multicollinearity among explanatory variables biases the parameter estimates and also leads to classification errors. Stepwise regression is generally used to overcome multicollinearity in regression. Another method to overcome multicollinearity, which retains all variables for prediction, is Principal Component Analysis (PCA). However, classical PCA applies only to numeric data. When the data are categorical, one method to solve the problem is Categorical Principal Component Analysis (CATPCA). The data used in this research were part of the Demographic and Population Survey Indonesia (IDHS) 2012. This research focuses on the characteristics of women using contraceptive methods. Classification results were evaluated using Area Under Curve (AUC) values; the higher the AUC value, the better. Based on AUC values, classification of contraceptive method use with the stepwise method (58.66%) is better than with the logistic regression model (57.39%) and CATPCA (57.39%). Evaluating the results by sensitivity shows the opposite: the CATPCA method (99.79%) is better than the logistic regression method (92.43%) and stepwise (92.05%). Because this study focuses on classification of the major class (using a contraceptive method), the selected model is CATPCA, since it raises the accuracy for the major class.

  14. Impact of night-shift work on the prevalence of erosive esophagitis in shipyard male workers.

    PubMed

    Chung, Tae Heum; Lee, Jiho; Kim, Moon Chan

    2016-08-01

    Whether night-shift work is a risk factor for gastroesophageal reflux disease is controversial. The aim of this study was to investigate the association between night-shift work and other factors, and erosive esophagitis. A cross-sectional study with 6040 male shipyard workers was performed. Esophagogastroduodenoscopic examination and a survey about night-shift work status, lifestyle, medical history, educational status, and marital status were conducted in all workers. The odds ratios of erosive esophagitis according to night-shift work status were calculated by using the logistic regression model. The prevalence of erosive esophagitis increased in the night-shift workers [odds ratio, 95 % confidence interval: 1.41 (1.03-1.94)]. According to multiple logistic regression models, night-shift work, obesity, smoking, and alcohol consumption of ≥140 g/week were significant risk factors for erosive esophagitis. By contrast, Helicobacter pylori infection was negatively associated with erosive esophagitis. Night-shift work is suggested to be a risk factor for erosive esophagitis. Avoidance of night-shift work and lifestyle modification should be considered for prevention and management of gastroesophageal reflux disease.
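
    An odds ratio with its confidence interval, as reported here, is the exponentiated logistic regression coefficient and its Wald limits. The Python sketch below shows the conversion in both directions; the coefficient and standard error are back-calculated from the reported 1.41 (1.03-1.94) under the assumption of a Wald interval with z = 1.96, and the function names are hypothetical.

```python
import math


def odds_ratio_ci(beta, se, z=1.96):
    """Odds ratio and Wald confidence limits from a logistic coefficient."""
    return math.exp(beta), math.exp(beta - z * se), math.exp(beta + z * se)


def beta_se_from_or_ci(odds_ratio, lower, upper, z=1.96):
    """Recover the coefficient and its standard error from a reported OR and CI."""
    beta = math.log(odds_ratio)
    se = (math.log(upper) - math.log(lower)) / (2 * z)
    return beta, se


# Back-calculate from the reported values, then rebuild the interval.
beta, se = beta_se_from_or_ci(1.41, 1.03, 1.94)
or_hat, lo, hi = odds_ratio_ci(beta, se)
print(round(beta, 3), round(se, 3))
print(round(or_hat, 2), round(lo, 2), round(hi, 2))
```

    The rebuilt limits match the reported ones only up to rounding, because a published interval is rounded and need not be exactly symmetric on the log scale.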

  15. Comparison of patient centeredness of visits to emergency departments, physicians, and dentists for dental problems and injuries.

    PubMed

    Cohen, Leonard A; Bonito, Arthur J; Eicheldinger, Celia; Manski, Richard J; Macek, Mark D; Edwards, Robert R; Khanna, Niharika

    2010-01-01

    Patient-centered care has a positive impact on patient health status. This report compares patient assessments of patient centeredness during treatment in hospital emergency departments (EDs) and physician and dentist offices for dental problems and injuries. Participants included low-income White, Black, and Hispanic adults who had experienced a dental problem or injury during the previous 12 months and who visited an emergency department, physician, or dentist for treatment. A stratified random sample of Maryland households participated in a cross-sectional telephone survey. Interviews were completed with 94.8% (401/423) of eligible individuals. Multivariable logistic regression analyses were performed. The measure of predictive power, the pseudo-R2s, calculated for the logistic regression models ranged from 12% to 18% for the analyses of responses to the measures of patient centeredness (satisfaction with treatment, careful listening, thorough explaining, spending enough time, and treated with courtesy and respect). EDs were less likely than dentists to treat patients with great courtesy and respect. Further research is needed to identify factors that support patient-centered care.

  16. Upper Gastrointestinal Complications and Cardiovascular/Gastrointestinal Risk Calculator in Patients with Myocardial Infarction Treated with Aspirin.

    PubMed

    Wen, Lei

    2017-08-20

    Aspirin has been widely used for the prevention of cardiovascular and cerebrovascular diseases in recent years. However, much attention has been paid to the adverse effects associated with aspirin, such as gastrointestinal bleeding, which raises the question of how to weigh its benefits against its hazards. The current study aimed to assess the feasibility of a cardiovascular/gastrointestinal risk calculator, AsaRiskCalculator, in predicting gastrointestinal events in Chinese patients with myocardial infarction (MI), and to determine unique risk factor(s) for gastrointestinal events to be considered in the calculator. The MI patients who visited Shapingba District People's Hospital between January 2012 and January 2016 were retrospectively reviewed. Based on gastroscopic data, the patients were divided into two groups: gastrointestinal and nongastrointestinal groups. Demographic and clinical data of the patients were then retrieved for statistical analysis. Univariate and multiple logistic regression analyses were used to identify independent risk factors for gastrointestinal events. The receiver operating characteristic (ROC) curves were used to assess the predictive value of AsaRiskCalculator for gastrointestinal events. A total of 400 MI patients meeting the eligibility criteria were analyzed, including 94 and 306 in the gastrointestinal and nongastrointestinal groups, respectively. The data showed that age, male gender, predicted gastrointestinal events, and Helicobacter pylori (HP) infection were positively correlated with gastrointestinal events. In multiple logistic regression analysis, predicted gastrointestinal events and HP infection were identified as risk factors for actual gastrointestinal events. HP infection was highly predictive in Chinese patients; the ROC curve indicated an area under the curve of 0.822 (95% confidence interval: 0.774-0.870). 
The best diagnostic cutoff point of predicted gastrointestinal events was 68.0‰, yielding sensitivity and specificity of 60.6% and 93.1%, respectively, for predicting gastrointestinal events in Chinese patients with MI. AsaRiskCalculator had a predictive value for gastrointestinal events in Chinese patients with MI. HP infection seemed to be an independent risk factor for gastrointestinal events caused by long-term aspirin treatment in Chinese patients with MI, and it should be included in the risk calculator adapted for Chinese patients.
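
    Sensitivity and specificity at a cutoff, and the choice of a "best" cutoff, can be sketched in Python as follows. The abstract does not say how the best cutoff was chosen; the sketch assumes the Youden index (sensitivity + specificity - 1). The counts 57 of 94 and 285 of 306 are back-calculated from the reported percentages and group sizes, and the scores, labels, and function names are hypothetical.

```python
def sensitivity_specificity(tp, fn, tn, fp):
    """Sensitivity = TP/(TP+FN); specificity = TN/(TN+FP)."""
    return tp / (tp + fn), tn / (tn + fp)


def counts_at_cutoff(scores, labels, cutoff):
    """Confusion counts when scores at or above the cutoff are called positive."""
    tp = sum(1 for s, y in zip(scores, labels) if y == 1 and s >= cutoff)
    fn = sum(1 for s, y in zip(scores, labels) if y == 1 and s < cutoff)
    tn = sum(1 for s, y in zip(scores, labels) if y == 0 and s < cutoff)
    fp = sum(1 for s, y in zip(scores, labels) if y == 0 and s >= cutoff)
    return tp, fn, tn, fp


def youden_best_cutoff(scores, labels):
    """Observed score that maximizes sensitivity + specificity - 1 (assumed criterion)."""
    def youden(cutoff):
        sens, spec = sensitivity_specificity(*counts_at_cutoff(scores, labels, cutoff))
        return sens + spec - 1.0

    return max(sorted(set(scores)), key=youden)


# Back-calculated counts (an assumption) that reproduce the reported 60.6% and 93.1%.
sens, spec = sensitivity_specificity(tp=57, fn=37, tn=285, fp=21)
print(round(100 * sens, 1), round(100 * spec, 1))

# Hypothetical predicted risks (per mille) and observed events.
risk = [10, 20, 30, 40, 50, 60, 70, 80]
event = [0, 0, 1, 0, 1, 1, 1, 1]
print(youden_best_cutoff(risk, event))
```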

  17. Blood oxygen level dependent magnetic resonance imaging for detecting pathological patterns in lupus nephritis patients: a preliminary study using a decision tree model.

    PubMed

    Shi, Huilan; Jia, Junya; Li, Dong; Wei, Li; Shang, Wenya; Zheng, Zhenfeng

    2018-02-09

    Precise renal histopathological diagnosis guides therapeutic strategy in patients with lupus nephritis. Blood oxygen level dependent (BOLD) magnetic resonance imaging (MRI) has been an applicable noninvasive technique in renal disease. The current study was performed to explore whether BOLD MRI could contribute to the diagnosis of renal pathological patterns. Adult patients with a renal pathological diagnosis of lupus nephritis were recruited for this study. Renal biopsy tissues were assessed based on the lupus nephritis ISN/RPS 2003 classification. BOLD MRI was used to obtain the functional magnetic resonance parameter, R2* values. Several functions of R2* values were calculated and used to construct algorithmic models for renal pathological patterns. In addition, the algorithmic models were compared as to their diagnostic capability. Both histopathology and BOLD MRI were used to examine a total of twelve patients. Renal pathological patterns included five class III cases (including 3 as class III + V) and seven class IV cases (including 4 as class IV + V). Three algorithmic models, including decision tree, linear discriminant, and logistic regression, were constructed to distinguish the renal pathological pattern of class III and class IV. The sensitivity of the decision tree model was better than that of the linear discriminant model (71.87% vs 59.48%, P < 0.001) and inferior to that of the logistic regression model (71.87% vs 78.71%, P < 0.001). The specificity of the decision tree model was equivalent to that of the linear discriminant model (63.87% vs 63.73%, P = 0.939) and higher than that of the logistic regression model (63.87% vs 38.0%, P < 0.001). The area under the ROC curve (AUROCC) of the decision tree model was greater than that of the linear discriminant model (0.765 vs 0.629, P < 0.001) and logistic regression model (0.765 vs 0.662, P < 0.001). 
BOLD MRI is a useful non-invasive imaging technique for the evaluation of lupus nephritis. Decision tree models constructed using functions of R2* values may facilitate the prediction of renal pathological patterns.

  18. Socioeconomic Factors Associated with Post-Mastectomy Immediate Reconstruction in a Contemporary Cohort of Breast Cancer Survivors.

    PubMed

    Schumacher, Jessica R; Taylor, Lauren J; Tucholka, Jennifer L; Poore, Samuel; Eggen, Amanda; Steiman, Jennifer; Wilke, Lee G; Greenberg, Caprice C; Neuman, Heather B

    2017-10-01

    Post-mastectomy reconstruction is a critical component of high-quality breast cancer care. Prior studies demonstrate socioeconomic disparity in receipt of reconstruction. Our objective was to evaluate trends in receipt of immediate reconstruction and examine socioeconomic factors associated with reconstruction in a contemporary cohort. Using the National Cancer Database, we identified women <75 years of age with stage 0-1 breast cancer treated with mastectomy (n = 297,121). Trends in immediate reconstruction rates (2004-2013) for the overall cohort and stratified by socioeconomic factors were examined using Join-point regression analysis, and annual percentage change (APC) was calculated. We then restricted our sample to a contemporary cohort (2010-2013, n = 145,577). Multivariable logistic regression identified socioeconomic factors associated with immediate reconstruction. Average adjusted predicted probabilities of receiving reconstruction were calculated. Immediate reconstruction rates increased from 27 to 48%. Although absolute rates of reconstruction for each stratification group increased, similar APCs across strata led to persistent gaps in receipt of reconstruction. On multivariable logistic regression using our contemporary cohort, race, income, education, and insurance type were all strongly associated with immediate reconstruction. Patients with the lowest predicted probability of receiving reconstruction were patients with Medicaid who lived in areas with the lowest rates of high-school graduation (Black 42.4% [95% CI 40.5-44.3], White 45.7% [95% CI 43.9-47.4]). Although reconstruction rates have increased dramatically over the past decade, lower rates persist for disadvantaged patients. Understanding how socioeconomic factors influence receipt of reconstruction, and identifying modifiable factors, are critical next steps towards identifying interventions to reduce disparities in breast cancer surgical care.

  19. Logistic regression models of factors influencing the location of bioenergy and biofuels plants

    Treesearch

    T.M. Young; R.L. Zaretzki; J.H. Perdue; F.M. Guess; X. Liu

    2011-01-01

    Logistic regression models were developed to identify significant factors that influence the location of existing wood-using bioenergy/biofuels plants and traditional wood-using facilities. Logistic models provided quantitative insight for variables influencing the location of woody biomass-using facilities. Availability of "thinnings to a basal area of 31.7m2/ha...

  20. Discrete post-processing of total cloud cover ensemble forecasts

    NASA Astrophysics Data System (ADS)

    Hemri, Stephan; Haiden, Thomas; Pappenberger, Florian

    2017-04-01

    This contribution presents an approach to post-process ensemble forecasts for the discrete and bounded weather variable of total cloud cover. Two methods for discrete statistical post-processing of ensemble predictions are tested. The first approach is based on multinomial logistic regression, the second involves a proportional odds logistic regression model. Applying them to total cloud cover raw ensemble forecasts from the European Centre for Medium-Range Weather Forecasts improves forecast skill significantly. Based on station-wise post-processing of raw ensemble total cloud cover forecasts for a global set of 3330 stations over the period from 2007 to early 2014, the more parsimonious proportional odds logistic regression model proved to slightly outperform the multinomial logistic regression model. Reference Hemri, S., Haiden, T., & Pappenberger, F. (2016). Discrete post-processing of total cloud cover ensemble forecasts. Monthly Weather Review 144, 2565-2577.
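
    The proportional odds model mentioned in this abstract assigns category probabilities through cumulative logits that share one linear predictor. The Python sketch below assumes the common parameterization P(Y <= k) = sigmoid(theta_k - eta) with increasing thresholds theta_k; the threshold values, the four categories, and the function names are hypothetical and unrelated to the fitted forecast models.

```python
import math


def _sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def proportional_odds_probs(eta, thresholds):
    """Category probabilities P(Y = k), k = 0..K, from cumulative logits
    P(Y <= k) = sigmoid(theta_k - eta); thresholds must be increasing."""
    cumulative = [_sigmoid(theta - eta) for theta in thresholds] + [1.0]
    probs = [cumulative[0]]
    for k in range(1, len(cumulative)):
        probs.append(cumulative[k] - cumulative[k - 1])
    return probs


# Hypothetical thresholds for four ordered cloud cover categories.
thresholds = [-1.0, 0.0, 1.0]
print(proportional_odds_probs(0.0, thresholds))
print(proportional_odds_probs(2.0, thresholds))  # larger eta shifts mass to higher categories
```

    With K + 1 categories and p predictors, this model needs K thresholds plus p slopes, whereas a multinomial logistic model needs K intercepts and K slope vectors, which is the sense in which the proportional odds model is the more parsimonious of the two.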

  1. Fuzzy multinomial logistic regression analysis: A multi-objective programming approach

    NASA Astrophysics Data System (ADS)

    Abdalla, Hesham A.; El-Sayed, Amany A.; Hamed, Ramadan

    2017-05-01

    Parameter estimation for multinomial logistic regression is usually based on maximizing the likelihood function. For large well-balanced datasets, Maximum Likelihood (ML) estimation is a satisfactory approach. Unfortunately, ML can fail completely or at least produce poor results in terms of estimated probabilities and confidence intervals of parameters, especially for small datasets. In this study, a new approach based on fuzzy concepts is proposed to estimate parameters of the multinomial logistic regression. The study assumes that the parameters of multinomial logistic regression are fuzzy. Based on the extension principle stated by Zadeh and Bárdossy's proposition, a multi-objective programming approach is suggested to estimate these fuzzy parameters. A simulation study is used to evaluate the performance of the new approach versus the Maximum Likelihood (ML) approach. Results show that the newly proposed model outperforms ML in cases of small datasets.

  2. A Primer on Logistic Regression.

    ERIC Educational Resources Information Center

    Woldbeck, Tanya

    This paper introduces logistic regression as a viable alternative when the researcher is faced with variables that are not continuous. If one is to use simple regression, the dependent variable must be measured on a continuous scale. In the behavioral sciences, it may not always be appropriate or possible to have a measured dependent variable on a…

  3. Logistic and Multiple Regression: A Two-Pronged Approach to Accurately Estimate Cost Growth in Major DoD Weapon Systems

    DTIC Science & Technology

    2004-03-01

    Breusch - Pagan test for constant variance of the residuals. Using Microsoft Excel® we calculate a p-value of 0.841237. This high p-value, which is above...our alpha of 0.05, indicates that our residuals indeed pass the Breusch - Pagan test for constant variance. In addition to the assumption tests , we...Wilk Test for Normality – Support (Reduced) Model (OLS) Finally, we perform a Breusch - Pagan test for constant variance of the residuals. Using

  4. A Solution to Separation and Multicollinearity in Multiple Logistic Regression

    PubMed Central

    Shen, Jianzhao; Gao, Sujuan

    2010-01-01

    In dementia screening tests, item selection for shortening an existing screening test can be achieved using multiple logistic regression. However, maximum likelihood estimates for such logistic regression models often experience serious bias or even non-existence because of separation and multicollinearity problems resulting from a large number of highly correlated items. Firth (1993, Biometrika, 80(1), 27–38) proposed a penalized likelihood estimator for generalized linear models and it was shown to reduce bias and the non-existence problems. The ridge regression has been used in logistic regression to stabilize the estimates in cases of multicollinearity. However, neither solves the problems for each other. In this paper, we propose a double penalized maximum likelihood estimator combining Firth’s penalized likelihood equation with a ridge parameter. We present a simulation study evaluating the empirical performance of the double penalized likelihood estimator in small to moderate sample sizes. We demonstrate the proposed approach using a current screening data from a community-based dementia study. PMID:20376286
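
    The idea of combining Firth's penalty with a ridge term can be illustrated on a one-parameter logistic model with perfectly separated data, where the ordinary ML estimate does not exist. The Python sketch below is not the authors' estimator: it assumes the penalized objective l(beta) + 0.5*log I(beta) - lambda*beta^2 (the exact form of the ridge term in the paper is not given in the abstract), uses a crude grid search instead of their algorithm, and all names and data are hypothetical.

```python
import math


def _log_sigmoid(z):
    """Numerically stable log(1 / (1 + exp(-z)))."""
    return -math.log1p(math.exp(-z)) if z >= 0 else z - math.log1p(math.exp(z))


def log_likelihood(beta, xs, ys):
    """Log-likelihood of a no-intercept logistic model P(y=1|x) = sigmoid(beta*x)."""
    return sum(_log_sigmoid(beta * x) if y == 1 else _log_sigmoid(-beta * x)
               for x, y in zip(xs, ys))


def fisher_information(beta, xs):
    """Scalar Fisher information: sum of x^2 * p * (1 - p)."""
    total = 0.0
    for x in xs:
        p = 1.0 / (1.0 + math.exp(-beta * x))
        total += x * x * p * (1.0 - p)
    return total


def double_penalized(beta, xs, ys, lam):
    """Assumed objective: log-likelihood + Firth term - ridge term."""
    return (log_likelihood(beta, xs, ys)
            + 0.5 * math.log(fisher_information(beta, xs))
            - lam * beta * beta)


def maximize_on_grid(f, lo=-10.0, hi=10.0, steps=2000):
    """Crude grid search; enough for a one-dimensional illustration."""
    grid = [lo + (hi - lo) * i / steps for i in range(steps + 1)]
    return max(grid, key=f)


# Perfectly separated toy data: y = 1 exactly when x > 0.
xs = [-2.0, -1.0, 1.0, 2.0]
ys = [0, 0, 1, 1]

ml = maximize_on_grid(lambda b: log_likelihood(b, xs, ys))
firth = maximize_on_grid(lambda b: double_penalized(b, xs, ys, lam=0.0))
double = maximize_on_grid(lambda b: double_penalized(b, xs, ys, lam=0.1))
print(ml, firth, double)
```

    On this toy data the unpenalized likelihood keeps increasing until the edge of the search grid, while both penalized objectives peak at a finite value, and adding the ridge term shrinks the estimate further toward zero.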

  5. A Solution to Separation and Multicollinearity in Multiple Logistic Regression.

    PubMed

    Shen, Jianzhao; Gao, Sujuan

    2008-10-01

    In dementia screening tests, item selection for shortening an existing screening test can be achieved using multiple logistic regression. However, maximum likelihood estimates for such logistic regression models often experience serious bias or even non-existence because of separation and multicollinearity problems resulting from a large number of highly correlated items. Firth (1993, Biometrika, 80(1), 27-38) proposed a penalized likelihood estimator for generalized linear models and it was shown to reduce bias and the non-existence problems. The ridge regression has been used in logistic regression to stabilize the estimates in cases of multicollinearity. However, neither solves the problems for each other. In this paper, we propose a double penalized maximum likelihood estimator combining Firth's penalized likelihood equation with a ridge parameter. We present a simulation study evaluating the empirical performance of the double penalized likelihood estimator in small to moderate sample sizes. We demonstrate the proposed approach using a current screening data from a community-based dementia study.

  6. [Influences of environmental factors and interaction of several chemokines gene-environmental on systemic lupus erythematosus].

    PubMed

    Ye, Dong-qing; Hu, Yi-song; Li, Xiang-pei; Huang, Fen; Yang, Shi-gui; Hao, Jia-hu; Yin, Jing; Zhang, Guo-qing; Liu, Hui-hui

    2004-11-01

    To explore the impact of environmental factors, daily lifestyle, psycho-social factors and the interactions between environmental factors and chemokine genes on systemic lupus erythematosus (SLE). A case-control study was carried out and environmental factors for SLE were analyzed by univariate and multivariate unconditional logistic regression. Interactions between environmental factors and chemokine polymorphisms contributing to SLE were also analyzed by a logistic regression model. Nineteen factors were associated with SLE in univariate unconditional logistic regression. However, in multivariate unconditional logistic regression, only five factors had an impact on the disease: drinking well water (OR=0.099) was a protective factor for SLE, and multiple drug allergy (OR=8.174), over-exposure to sunshine (OR=18.339), taking antibiotics (OR=9.630) and oral contraceptives were risk factors for SLE. The unconditional logistic regression model showed an interaction between eating irritable food and the -2518MCP-1G/G genotype (OR=4.387). No interaction among environmental factors contributing to SLE was found in this study. Many environmental factors were related to SLE, and there was an interaction between the -2518MCP-1G/G genotype and eating irritable food.

  7. A deeper look at two concepts of measuring gene-gene interactions: logistic regression and interaction information revisited.

    PubMed

    Mielniczuk, Jan; Teisseyre, Paweł

    2018-03-01

    Detection of gene-gene interactions is one of the most important challenges in genome-wide case-control studies. Besides traditional logistic regression analysis, entropy-based methods have recently attracted significant attention. Among entropy-based methods, interaction information is one of the most promising measures, having many desirable properties. Although both logistic regression and interaction information have been used in several genome-wide association studies, the relationship between them has not been thoroughly investigated theoretically. The present paper attempts to fill this gap. We show that although certain connections between the two methods exist, in general they refer to two different concepts of dependence, and looking for interactions in those two senses leads to different approaches to interaction detection. We introduce an ordering between interaction measures and specify conditions for independent and dependent genes under which interaction information is a more discriminative measure than logistic regression. Moreover, we show that for so-called perfect distributions those measures are equivalent. The numerical experiments illustrate the theoretical findings, indicating that interaction information and its modified version are more universal tools for detecting various types of interaction than logistic regression and linkage disequilibrium measures. © 2017 WILEY PERIODICALS, INC.
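
    Interaction information can be computed from entropies of observed genotype and phenotype counts. The Python sketch below assumes the convention II(X1; X2; Y) = I((X1, X2); Y) - I(X1; Y) - I(X2; Y) (sign conventions differ in the literature) and uses simple plug-in estimates; the function names and the XOR toy data are hypothetical.

```python
import math
from collections import Counter


def entropy(samples):
    """Plug-in Shannon entropy in bits of a sequence of hashable observations."""
    n = len(samples)
    return -sum(c / n * math.log2(c / n) for c in Counter(samples).values())


def mutual_information(xs, ys):
    """I(X; Y) = H(X) + H(Y) - H(X, Y)."""
    return entropy(xs) + entropy(ys) - entropy(list(zip(xs, ys)))


def interaction_information(x1, x2, y):
    """Assumed convention: I((X1, X2); Y) - I(X1; Y) - I(X2; Y)."""
    joint = list(zip(x1, x2))
    return (mutual_information(joint, y)
            - mutual_information(x1, y)
            - mutual_information(x2, y))


# Toy XOR pattern: neither locus alone is informative, but the pair determines y.
g1 = [0, 0, 1, 1]
g2 = [0, 1, 0, 1]
y = [0, 1, 1, 0]
print(mutual_information(g1, y), mutual_information(g2, y))
print(interaction_information(g1, g2, y))
```

    In the XOR pattern each marginal mutual information is zero while the pair carries one full bit about the outcome, so the interaction information equals one bit; a purely additive main-effects logistic model would find nothing here.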

  8. Controlling Type I Error Rates in Assessing DIF for Logistic Regression Method Combined with SIBTEST Regression Correction Procedure and DIF-Free-Then-DIF Strategy

    ERIC Educational Resources Information Center

    Shih, Ching-Lin; Liu, Tien-Hsiang; Wang, Wen-Chung

    2014-01-01

    The simultaneous item bias test (SIBTEST) method regression procedure and the differential item functioning (DIF)-free-then-DIF strategy are applied to the logistic regression (LR) method simultaneously in this study. These procedures are used to adjust the effects of matching true score on observed score and to better control the Type I error…

  9. Access disparities to Magnet hospitals for patients undergoing neurosurgical operations

    PubMed Central

    Missios, Symeon; Bekelis, Kimon

    2017-01-01

    Background Centers of excellence focusing on quality improvement have demonstrated superior outcomes for a variety of surgical interventions. We investigated the presence of access disparities to hospitals recognized by the Magnet Recognition Program of the American Nurses Credentialing Center (ANCC) for patients undergoing neurosurgical operations. Methods We performed a cohort study of all neurosurgery patients who were registered in the New York Statewide Planning and Research Cooperative System (SPARCS) database from 2009–2013. We examined the association of African-American race and lack of insurance with Magnet status hospitalization for neurosurgical procedures. A mixed effects propensity adjusted multivariable regression analysis was used to control for confounding. Results During the study period, 190,535 neurosurgical patients met the inclusion criteria. Using a multivariable logistic regression, we demonstrate that African-Americans had lower admission rates to Magnet institutions (OR 0.62; 95% CI, 0.58–0.67). This persisted in a mixed effects logistic regression model (OR 0.77; 95% CI, 0.70–0.83) to adjust for clustering at the patient county level, and a propensity score adjusted logistic regression model (OR 0.75; 95% CI, 0.69–0.82). Additionally, lack of insurance was associated with lower admission rates to Magnet institutions (OR 0.71; 95% CI, 0.68–0.73), in a multivariable logistic regression model. This persisted in a mixed effects logistic regression model (OR 0.72; 95% CI, 0.69–0.74), and a propensity score adjusted logistic regression model (OR 0.72; 95% CI, 0.69–0.75). Conclusions Using a comprehensive all-payer cohort of neurosurgery patients in New York State we identified an association of African-American race and lack of insurance with lower rates of admission to Magnet hospitals. PMID:28684152
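    The odds ratios and confidence intervals quoted in records like the one above are usually obtained by exponentiating a logistic regression coefficient and its Wald interval. The following is a minimal sketch of that arithmetic, assuming a Wald interval that is symmetric on the log-odds scale; the function names are hypothetical and the abstract does not state which interval method the authors used.

```python
import math

Z_95 = 1.959964  # standard normal quantile for a two-sided 95% interval


def or_with_wald_ci(beta, se, z=Z_95):
    """Turn a log-odds coefficient and its standard error into (OR, lower, upper)."""
    return math.exp(beta), math.exp(beta - z * se), math.exp(beta + z * se)


def log_or_and_se(odds_ratio, lower, upper, z=Z_95):
    """Approximately recover the coefficient and standard error from a reported OR and CI.

    Assumes the reported interval is a Wald interval (an assumption, not
    something the abstract confirms).
    """
    beta = math.log(odds_ratio)
    se = (math.log(upper) - math.log(lower)) / (2 * z)
    return beta, se


# Example with the first reported estimate: OR 0.62 (95% CI 0.58 to 0.67).
beta, se = log_or_and_se(0.62, 0.58, 0.67)
print(or_with_wald_ci(beta, se))
```

    Because published estimates are rounded, the round trip reproduces the reported bounds only approximately.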

  10. Adjusting for Confounding in Early Postlaunch Settings: Going Beyond Logistic Regression Models.

    PubMed

    Schmidt, Amand F; Klungel, Olaf H; Groenwold, Rolf H H

    2016-01-01

    Postlaunch data on medical treatments can be analyzed to explore adverse events or relative effectiveness in real-life settings. These analyses are often complicated by the number of potential confounders and the possibility of model misspecification. We conducted a simulation study to compare the performance of logistic regression, propensity score, disease risk score, and stabilized inverse probability weighting methods to adjust for confounding. Model misspecification was induced in the independent derivation dataset. We evaluated performance using relative bias and confidence interval coverage of the true effect, among other metrics. At low events per coefficient (1.0 and 0.5), the logistic regression estimates had a large relative bias (greater than -100%). Bias of the disease risk score estimates was at most 13.48% and 18.83%. For the propensity score model, this was 8.74% and >100%, respectively. At events per coefficient of 1.0 and 0.5, inverse probability weighting frequently failed or reduced to a crude regression, resulting in biases of -8.49% and 24.55%. Coverage of logistic regression estimates became less than the nominal level at events per coefficient ≤5. For the disease risk score, inverse probability weighting, and propensity score, coverage became less than nominal at events per coefficient ≤2.5, ≤1.0, and ≤1.0, respectively. Bias of misspecified disease risk score models was 16.55%. In settings with low events/exposed subjects per coefficient, disease risk score methods can be useful alternatives to logistic regression models, especially when propensity score models cannot be used. Despite better performance of disease risk score methods than logistic regression and propensity score models in small events per coefficient settings, bias and coverage still deviated from nominal.

  11. On the use and misuse of scalar scores of confounders in design and analysis of observational studies.

    PubMed

    Pfeiffer, R M; Riedl, R

    2015-08-15

    We assess the asymptotic bias of estimates of exposure effects conditional on covariates when summary scores of confounders, instead of the confounders themselves, are used to analyze observational data. First, we study regression models for cohort data that are adjusted for summary scores. Second, we derive the asymptotic bias for case-control studies when cases and controls are matched on a summary score, and then analyzed either using conditional logistic regression or by unconditional logistic regression adjusted for the summary score. Two scores, the propensity score (PS) and the disease risk score (DRS), are studied in detail. For cohort analysis, when regression models are adjusted for the PS, the estimated conditional treatment effect is unbiased only for linear models, or at the null for non-linear models. Adjustment of cohort data for DRS yields unbiased estimates only for linear regression; all other estimates of exposure effects are biased. Matching cases and controls on DRS and analyzing them using conditional logistic regression yields unbiased estimates of exposure effect, whereas adjusting for the DRS in unconditional logistic regression yields biased estimates, even under the null hypothesis of no association. Matching cases and controls on the PS yields unbiased estimates only under the null for both conditional and unconditional logistic regression, adjusted for the PS. We study the bias for various confounding scenarios and compare our asymptotic results with those from simulations with limited sample sizes. To create realistic correlations among multiple confounders, we also based simulations on a real dataset. Copyright © 2015 John Wiley & Sons, Ltd.

  12. Association of different biomarkers of renal function with D-dimer levels in patients with type 1 diabetes mellitus (renal biomarkers and D-dimer in diabetes).

    PubMed

    Domingueti, Caroline Pereira; Fóscolo, Rodrigo Bastos; Dusse, Luci Maria S; Reis, Janice Sepúlveda; Carvalho, Maria das Graças; Gomes, Karina Braga; Fernandes, Ana Paula

    2018-02-01

    Objective This study aimed to evaluate the association of different renal biomarkers with D-Dimer levels in type 1 diabetes mellitus (DM1) patients, grouped as: low D-Dimer levels (< 318 ng/mL), which included the first and second D-Dimer tertiles, and high D-Dimer levels (≥ 318 ng/mL), which included the third D-Dimer tertile. Materials and methods D-Dimer and cystatin C were measured by ELISA. Creatinine and urea were determined by an enzymatic method. Estimated glomerular filtration rate (eGFR) was calculated using the CKD-EPI equation. Albuminuria was assessed by immunoturbidimetry. Presence of renal disease was evaluated using each renal biomarker: creatinine, urea, cystatin C, eGFR and albuminuria. Bivariate logistic regression analysis was performed to assess which renal biomarkers are associated with high D-Dimer levels, and odds ratios were calculated. Afterwards, multivariate logistic regression analysis was performed to assess which renal biomarkers are associated with high D-Dimer levels (after adjusting for sex and age), and odds ratios were calculated. Results Cystatin C presented a better association [OR of 9.8 (3.8-25.5)] with high D-Dimer levels than albuminuria, creatinine, eGFR and urea [OR of 5.3 (2.2-12.9), 8.4 (2.5-25.4), 9.1 (2.6-31.4) and 3.5 (1.4-8.4), respectively] after adjusting for sex and age. All biomarkers showed a good association with D-Dimer levels, and consequently with hypercoagulability status, and cystatin C showed the best association among them. Conclusion Therefore, cystatin C might be useful to detect patients with incipient diabetic kidney disease who present an increased risk of cardiovascular disease, contributing to an early adoption of reno- and cardioprotective therapies.

  13. Para I Famagu'on-Ta: Fruit and Vegetable Intake, Food Store Environment, and Childhood Overweight/Obesity in the Children's Healthy Living Program on Guam

    PubMed Central

    Matanane, Lenora; Silva, Joshua; Li, Fenfang; Nigg, Claudio; Leon Guerrero, Rachael T; Novotny, Rachel

    2017-01-01

    This cross-sectional study examined the: (1) association between food store environment (FSE), fruit and vegetable (FV) availability and access, and prevalence of early childhood overweight/obesity (COWOB); and (2) influence of young child actual FV intake on the relationship between the FSE and early COWOB prevalence. Anthropometric and socio-demographic data of children (2 to 8 years; N=466) in baseline communities on Guam participating in the Children's Healthy Living (CHL) Program community trial were included. CDC year 2000 growth charts were used to calculate BMI z-scores and categories. FSE factors (fresh FV scores, store type) were assessed using the CX3 Food Availability and Marketing Survey amended for CHL. ArcGIS maps were constructed with geographic coordinates of participant residences and food stores to calculate food store scores within 1 mile of participant's residences. A sub-sample of participants (n = 355) had Food and Activity Log data to calculate FV and energy intakes. Bivariate correlations and logistic regression evaluated associations. Of 111 stores surveyed, 73% were small markets, 16% were convenience stores, and 11% were large grocery/supermarkets. Supermarkets/large grocery stores averaged the highest FV scores. Most participants did not meet FV intake recommendations while nearly half exceeded energy intake recommendations. Living near a small market was negatively correlated with BMI z-score (r = - 0.129, P < .05) while living near a convenience store was positively correlated with BMI z-score (r = 0.092, P < .05). Logistic regression analysis yielded non-significant associations. The high density of small markets may be an opportunity for FSE intervention but further investigation of Guam's FSE influence on health is needed. PMID:28808612

  14. A Formula to Calculate Standard Liver Volume Using Thoracoabdominal Circumference.

    PubMed

    Shaw, Brian I; Burdine, Lyle J; Braun, Hillary J; Ascher, Nancy L; Roberts, John P

    2017-12-01

    With the use of split liver grafts as well as living donor liver transplantation (LDLT) it is imperative to know the minimum graft volume to avoid complications. Most current formulas to predict standard liver volume (SLV) rely on weight-based measures that are likely inaccurate in the setting of cirrhosis. Therefore, we sought to create a formula for estimating SLV without weight-based covariates. LDLT donors underwent computed tomography scan volumetric evaluation of their livers. An optimal formula for calculating SLV using the anthropomorphic measure thoracoabdominal circumference (TAC) was determined using leave-one-out cross-validation. The ability of this formula to correctly predict liver volume was checked against other existing formulas by analysis of variance. The ability of the formula to predict small grafts in LDLT was evaluated by exact logistic regression. The optimal formula using TAC was determined to be SLV = (TAC × 3.5816) - (Age × 3.9844) - (Sex × 109.7386) - 934.5949. When compared to historic formulas, the current formula was the only one which was not significantly different than computed tomography determined liver volumes when compared by analysis of variance with Dunnett posttest. When evaluating the ability of the formula to predict small for size syndrome, many (10/16) of the formulas tested had significant results by exact logistic regression, with our formula predicting small for size syndrome with an odds ratio of 7.94 (95% confidence interval, 1.23-91.36; P = 0.025). We report a formula for calculating SLV that does not rely on weight-based variables that has good ability to predict SLV and identify patients with potentially small grafts.
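    The regression formula reported above can be evaluated directly. This is a small sketch only: the abstract does not state the units of TAC, age or SLV, nor how sex is coded, so the caller has to supply values in whatever units and coding the original paper defines, and the function name is hypothetical.

```python
def standard_liver_volume(tac, age, sex):
    """Evaluate SLV = (TAC x 3.5816) - (Age x 3.9844) - (Sex x 109.7386) - 934.5949.

    tac: thoracoabdominal circumference (units not given in the abstract)
    age: donor age (units not given in the abstract)
    sex: numeric sex indicator (coding not given in the abstract)
    """
    return tac * 3.5816 - age * 3.9844 - sex * 109.7386 - 934.5949
```

    Keeping the coefficients in one place makes it easy to compare this formula with the weight-based alternatives the abstract mentions, once the units and coding have been checked against the paper.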

  15. Para I Famagu'on-Ta: Fruit and Vegetable Intake, Food Store Environment, and Childhood Overweight/Obesity in the Children's Healthy Living Program on Guam.

    PubMed

    Matanane, Lenora; Fialkowski, Marie Kainoa; Silva, Joshua; Li, Fenfang; Nigg, Claudio; Leon Guerrero, Rachael T; Novotny, Rachel

    2017-08-01

    This cross-sectional study examined the: (1) association between food store environment (FSE), fruit and vegetable (FV) availability and access, and prevalence of early childhood overweight/obesity (COWOB); and (2) influence of young child actual FV intake on the relationship between the FSE and early COWOB prevalence. Anthropometric and socio-demographic data of children (2 to 8 years; N=466) in baseline communities on Guam participating in the Children's Healthy Living (CHL) Program community trial were included. CDC year 2000 growth charts were used to calculate BMI z-scores and categories. FSE factors (fresh FV scores, store type) were assessed using the CX3 Food Availability and Marketing Survey amended for CHL. ArcGIS maps were constructed with geographic coordinates of participant residences and food stores to calculate food store scores within 1 mile of participant's residences. A sub-sample of participants (n = 355) had Food and Activity Log data to calculate FV and energy intakes. Bivariate correlations and logistic regression evaluated associations. Of 111 stores surveyed, 73% were small markets, 16% were convenience stores, and 11% were large grocery/supermarkets. Supermarkets/large grocery stores averaged the highest FV scores. Most participants did not meet FV intake recommendations while nearly half exceeded energy intake recommendations. Living near a small market was negatively correlated with BMI z-score (r = - 0.129, P < .05) while living near a convenience store was positively correlated with BMI z-score (r = 0.092, P < .05). Logistic regression analysis yielded non-significant associations. The high density of small markets may be an opportunity for FSE intervention but further investigation of Guam's FSE influence on health is needed.

  16. No rationale for 1 variable per 10 events criterion for binary logistic regression analysis.

    PubMed

    van Smeden, Maarten; de Groot, Joris A H; Moons, Karel G M; Collins, Gary S; Altman, Douglas G; Eijkemans, Marinus J C; Reitsma, Johannes B

    2016-11-24

    Ten events per variable (EPV) is a widely advocated minimal criterion for sample size considerations in logistic regression analysis. Of three previous simulation studies that examined this minimal EPV criterion, only one supports the use of a minimum of 10 EPV. In this paper, we examine the reasons for substantial differences between these extensive simulation studies. The current study uses Monte Carlo simulations to evaluate small sample bias, coverage of confidence intervals and mean square error of logit coefficients. Logistic regression models fitted by maximum likelihood and a modified estimation procedure, known as Firth's correction, are compared. The results show that besides EPV, the problems associated with low EPV depend on other factors such as the total sample size. It is also demonstrated that simulation results can be dominated by even a few simulated data sets for which the prediction of the outcome by the covariates is perfect ('separation'). We reveal that different approaches for identifying and handling separation lead to substantially different simulation results. We further show that Firth's correction can be used to improve the accuracy of regression coefficients and alleviate the problems associated with separation. The current evidence supporting EPV rules for binary logistic regression is weak. Given our findings, there is an urgent need for new research to provide guidance for supporting sample size considerations for binary logistic regression analysis.
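    For readers unfamiliar with the EPV quantity discussed above, a minimal sketch of how it is commonly computed follows. It assumes the usual convention of dividing the count of the less frequent outcome category by the number of estimated regression parameters (excluding the intercept); the abstract itself does not spell out the definition, and the function name is hypothetical.

```python
def events_per_variable(n_events, n_nonevents, n_parameters):
    """Events per variable: size of the rarer outcome group per estimated parameter."""
    if n_parameters <= 0:
        raise ValueError("n_parameters must be positive")
    # The limiting sample size is the smaller of the two outcome groups.
    limiting = min(n_events, n_nonevents)
    return limiting / n_parameters


# Illustrative numbers only: 40 events, 160 non-events, 8 parameters.
print(events_per_variable(40, 160, 8))
```

    As the abstract stresses, this single ratio does not capture other drivers of small-sample problems such as total sample size or separation.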

  17. 4D-Fingerprint Categorical QSAR Models for Skin Sensitization Based on Classification Local Lymph Node Assay Measures

    PubMed Central

    Li, Yi; Tseng, Yufeng J.; Pan, Dahua; Liu, Jianzhong; Kern, Petra S.; Gerberick, G. Frank; Hopfinger, Anton J.

    2008-01-01

    Currently, the only validated methods to identify skin sensitization effects are in vivo models, such as the Local Lymph Node Assay (LLNA) and guinea pig studies. There is a tremendous need, in particular due to novel legislation, to develop animal alternatives, e.g., Quantitative Structure-Activity Relationship (QSAR) models. Here, QSAR models for skin sensitization using LLNA data have been constructed. The descriptors used to generate these models are derived from the 4D-molecular similarity paradigm and are referred to as universal 4D-fingerprints. A training set of 132 structurally diverse compounds and a test set of 15 structurally diverse compounds were used in this study. The statistical methodologies used to build the models are logistic regression (LR) and partial least square coupled logistic regression (PLS-LR), which prove to be effective tools for studying skin sensitization measures expressed in the two categorical terms of sensitizer and non-sensitizer. QSAR models with low values of the Hosmer-Lemeshow goodness-of-fit statistic, χ²_HL, are significant and predictive. For the training set, the cross-validated prediction accuracy of the logistic regression models ranges from 77.3% to 78.0%, while that of PLS-logistic regression models ranges from 87.1% to 89.4%. For the test set, the prediction accuracy of logistic regression models ranges from 80.0% to 86.7%, while that of PLS-logistic regression models ranges from 73.3% to 80.0%. The QSAR models are made up of 4D-fingerprints related to aromatic atoms, hydrogen bond acceptors and negatively partially charged atoms. PMID:17226934

  18. MODELING SNAKE MICROHABITAT FROM RADIOTELEMETRY STUDIES USING POLYTOMOUS LOGISTIC REGRESSION

    EPA Science Inventory

    Multivariate analysis of snake microhabitat has historically used techniques that were derived under assumptions of normality and common covariance structure (e.g., discriminant function analysis, MANOVA). In this study, polytomous logistic regression (PLR which does not require ...

  19. Selecting risk factors: a comparison of discriminant analysis, logistic regression and Cox's regression model using data from the Tromsø Heart Study.

    PubMed

    Brenn, T; Arnesen, E

    1985-01-01

    For comparative evaluation, discriminant analysis, logistic regression and Cox's model were used to select risk factors for total and coronary deaths among 6595 men aged 20-49 followed for 9 years. Groups with mortality between 5 and 93 per 1000 were considered. Discriminant analysis selected variable sets only marginally different from the logistic and Cox methods which always selected the same sets. A time-saving option, offered for both the logistic and Cox selection, showed no advantage compared with discriminant analysis. Analysing more than 3800 subjects, the logistic and Cox methods consumed, respectively, 80 and 10 times more computer time than discriminant analysis. When including the same set of variables in non-stepwise analyses, all methods estimated coefficients that in most cases were almost identical. In conclusion, discriminant analysis is advocated for preliminary or stepwise analysis, otherwise Cox's method should be used.

  20. Phobic Anxiety and Plasma Levels of Global Oxidative Stress in Women

    PubMed Central

    Hagan, Kaitlin A.; Wu, Tianying; Rimm, Eric B.; Eliassen, A. Heather; Okereke, Olivia I.

    2015-01-01

    Background and Objectives Psychological distress has been hypothesized to be associated with adverse biologic states such as higher oxidative stress and inflammation. Yet, little is known about associations between a common form of distress – phobic anxiety – and global oxidative stress. Thus, we related phobic anxiety to plasma fluorescent oxidation products (FlOPs), a global oxidative stress marker. Methods We conducted a cross-sectional analysis among 1,325 women (aged 43-70 years) from the Nurses’ Health Study. Phobic anxiety was measured using the Crown-Crisp Index (CCI). Adjusted least-squares mean log-transformed FlOPs were calculated across phobic categories. Logistic regression models were used to calculate odds ratios (OR) comparing the highest CCI category (≥6 points) vs. lower scores, across FlOPs quartiles. Results No association was found between phobic anxiety categories and mean FlOP levels in multivariable adjusted linear models. Similarly, in multivariable logistic regression models there were no associations between FlOPs quartiles and likelihood of being in the highest phobic category. Comparing women in the highest vs. lowest FlOPs quartiles: FlOP_360: OR=0.68 (95% CI: 0.40-1.15); FlOP_320: OR=0.99 (95% CI: 0.61-1.61); FlOP_400: OR=0.92 (95% CI: 0.52, 1.63). Conclusions No cross-sectional association was found between phobic anxiety and a plasma measure of global oxidative stress in this sample of middle-aged and older women. PMID:26635425

  1. Lapse in embryo transfer training does not negatively affect clinical pregnancy rates for reproductive endocrinology and infertility fellows.

    PubMed

    Kresowik, Jessica; Sparks, Amy; Duran, Eyup H; Shah, Divya K

    2015-03-01

    To compare rates of clinical pregnancy (CPR) and live birth (LBR) following embryo transfer (ET) performed by reproductive endocrinology and infertility (REI) fellows before and after a prolonged lapse in clinical training due to an 18-month research rotation. Retrospective cohort study. Not applicable. All women undergoing in vitro fertilization (IVF) and IVF-intracytoplasmic sperm injection (ICSI) cycles with ET performed by REI fellows from August 2003 to July 2012. Eighteen-month lapse in clinical training of REI fellows. CPR and LBR before and after the lapse in clinical training were calculated and compared per fellow and as a composite group. Alternating logistic regression models were used to calculate the odds of clinical pregnancy and live birth following transfers performed before and after the lapse in training. Unadjusted odds of clinical pregnancy and live birth were similar between the two time periods both for individual fellows and for the composite group. Alternating logistic regression analysis revealed no significant difference in CPR (odds ratio [OR] 0.94, 95% confidence interval [CI] 0.83-1.07) or LBR (OR 1.05, 95% CI 0.94-1.18) after the lapse in training compared with before. A research rotation is common in REI fellowship training programs. This prolonged departure from clinical training does not appear to negatively affect pregnancy outcome following fellow ET. Copyright © 2015 American Society for Reproductive Medicine. Published by Elsevier Inc. All rights reserved.

  2. Calculation of Haem Iron Intake and Its Role in the Development of Iron Deficiency in Young Women from the Australian Longitudinal Study on Women's Health.

    PubMed

    Reeves, Angela J; McEvoy, Mark A; MacDonald-Wicks, Lesley K; Barker, Daniel; Attia, John; Hodge, Allison M; Patterson, Amanda J

    2017-05-19

    Total iron intake is not strongly associated with iron stores, but haem iron intake may be more predictive. Haem iron is not available in most nutrient databases, so experimentally determined haem contents were applied to an Australian Food Frequency Questionnaire (FFQ) to estimate haem iron intake in a representative sample of young women (25-30 years). The association between dietary haem iron intakes and incident self-reported diagnosed iron deficiency over six years of follow-up was examined. Haem iron contents for Australian red meats, fish, and poultry were applied to haem-containing foods in the Dietary Questionnaire for Epidemiological Studies V2 (DQESv2) FFQ. Haem iron intakes were calculated for 9076 women from the Australian Longitudinal Study on Women's Health (ALSWH) using the DQESv2 dietary data from 2003. Logistic regression was used to examine the association between haem iron intake (2003) and the incidence of iron deficiency in 2006 and 2009. Multiple logistic regression showed baseline haem iron intake was a statistically significant predictor of iron deficiency in 2006 (Odds Ratio (OR): 0.91; 95% Confidence Interval (CI): 0.84-0.99; p-value: 0.020) and 2009 (OR: 0.89; 95% CI: 0.82-0.99; p-value: 0.007). Using the energy-adjusted haem intake made little difference to the associations. Higher haem iron intake is associated with reduced odds of iron deficiency developing in young adult Australian women.

  3. Relationship between chemical structure and the occupational asthma hazard of low molecular weight organic compounds

    PubMed Central

    Jarvis, J; Seed, M; Elton, R; Sawyer, L; Agius, R

    2005-01-01

    Aims: To investigate quantitatively, relationships between chemical structure and reported occupational asthma hazard for low molecular weight (LMW) organic compounds; to develop and validate a model linking asthma hazard with chemical substructure; and to generate mechanistic hypotheses that might explain the relationships. Methods: A learning dataset used 78 LMW chemical asthmagens reported in the literature before 1995, and 301 control compounds with recognised occupational exposures and hazards other than respiratory sensitisation. The chemical structures of the asthmagens and control compounds were characterised by the presence of chemical substructure fragments. Odds ratios were calculated for these fragments to determine which were associated with a likelihood of being reported as an occupational asthmagen. Logistic regression modelling was used to identify the independent contribution of these substructures. A post-1995 set of 21 asthmagens and 77 controls were selected to externally validate the model. Results: Nitrogen or oxygen containing functional groups such as isocyanate, amine, acid anhydride, and carbonyl were associated with an occupational asthma hazard, particularly when the functional group was present twice or more in the same molecule. A logistic regression model using only statistically significant independent variables for occupational asthma hazard correctly assigned 90% of the model development set. The external validation showed a sensitivity of 86% and specificity of 99%. Conclusions: Although a wide variety of chemical structures are associated with occupational asthma, bifunctional reactivity is strongly associated with occupational asthma hazard across a range of chemical substructures. This suggests that chemical cross-linking is an important molecular mechanism leading to the development of occupational asthma. The logistic regression model is freely available on the internet and may offer a useful but inexpensive adjunct to the prediction of occupational asthma hazard. PMID:15778257
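    The external-validation figures in the record above (sensitivity and specificity) are simple ratios from a confusion table. The sketch below shows the arithmetic; the counts in the example are hypothetical values chosen only to be consistent with the reported validation set sizes (21 asthmagens, 77 controls) and rounded percentages, and are not reported in the abstract.

```python
def sensitivity(true_pos, false_neg):
    """Fraction of true asthmagens that the model flags as hazardous."""
    return true_pos / (true_pos + false_neg)


def specificity(true_neg, false_pos):
    """Fraction of control compounds that the model correctly leaves unflagged."""
    return true_neg / (true_neg + false_pos)


# Hypothetical counts (assumption): 18 of 21 asthmagens and 76 of 77 controls
# classified correctly.
print(round(100 * sensitivity(18, 3)), round(100 * specificity(76, 1)))
```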

  4. Confirming the validity of the CONUT system for early detection and monitoring of clinical undernutrition: comparison with two logistic regression models developed using SGA as the gold standard.

    PubMed

    González-Madroño, A; Mancha, A; Rodríguez, F J; Culebras, J; de Ulibarri, J I

    2012-01-01

    To ratify previous validations of the CONUT nutritional screening tool by developing two probabilistic models using the parameters included in the CONUT, and to see whether the CONUT's effectiveness could be improved. This is a two-step prospective study. In Step 1, 101 patients were randomly selected and assessed with both SGA and CONUT. With the data obtained, an unconditional logistic regression model was developed, and two variants of CONUT were constructed: Model 1 was built by logistic regression. Model 2 was built by dividing the probabilities of undernutrition obtained in Model 1 into seven regular intervals. In Step 2, 60 patients were selected and underwent the SGA, the original CONUT and the new models. The diagnostic efficacy of the original CONUT and the new models was tested by means of ROC curves. Samples 1 and 2 were pooled to measure the degree of agreement between the original CONUT and SGA, and diagnostic efficacy parameters were calculated. No statistically significant differences were found between samples 1 and 2 regarding age, sex and medical/surgical distribution, and undernutrition rates were similar (over 40%). The AUCs of the ROC curves were 0.862 for the original CONUT, and 0.839 and 0.874 for Models 1 and 2, respectively. The kappa index for the CONUT and SGA was 0.680. The CONUT, with the original scores assigned by the authors, performs as well as the mathematical models and is thus a valuable, highly useful and efficient tool for clinical undernutrition screening.

  5. Prevalence of difficult venous access and associated risk factors in highly complex hospitalised patients.

    PubMed

    Armenteros-Yeguas, Victoria; Gárate-Echenique, Lucía; Tomás-López, Maria Aranzazu; Cristóbal-Domínguez, Estíbaliz; Moreno-de Gusmão, Breno; Miranda-Serrano, Erika; Moraza-Dulanto, Maria Inmaculada

    2017-12-01

    To estimate the prevalence of difficult venous access in complex patients with multimorbidity and to identify associated risk factors. In highly complex patients, factors like ageing, the need for frequent use of irritant medication and multiple venous catheterisations to complete treatment could contribute to exhaustion of venous access. A cross-sectional study was conducted. 'Highly complex' patients (n = 135) were recruited from March 2013-November 2013. The main study variable was the prevalence of difficult venous access, assessed using one of the following criteria: (1) a history of difficulties obtaining venous access based on more than two attempts to insert an intravenous line and (2) no visible or palpable veins. Other factors potentially associated with the risk of difficult access were also measured (age, gender and chronic illnesses). Univariate analysis was performed for each potential risk factor. Factors with p < 0·2 were then included in multivariable logistic regression analysis. Odds ratios were also calculated. The prevalence of difficult venous access was 59·3%. The univariate logistic regression analysis indicated that gender, a history of vascular access complications and osteoarticular disease were significantly associated with difficult venous access. The multivariable logistic regression showed that only gender was an independent risk factor, and the odds ratio was 2·85. The prevalence of difficult venous access is high in this population. Gender (female) is the only independent risk factor associated with this. Previous history of several attempts at catheter insertion is an important criterion in the assessment of difficult venous access. The prevalence of difficult venous access in complex patients is 59·3%. Significant risk factors include being female and a history of complications related to vascular access. © 2017 John Wiley & Sons Ltd.

  6. Predicting Grade 3 Acute Diarrhea During Radiation Therapy for Rectal Cancer Using a Cutoff-Dose Logistic Regression Normal Tissue Complication Probability Model

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Robertson, John M.; Soehn, Matthias; Yan Di

    Purpose: Understanding the dose-volume relationship of small bowel irradiation and severe acute diarrhea may help reduce the incidence of this side effect during adjuvant treatment for rectal cancer. Methods and Materials: Consecutive patients treated curatively for rectal cancer were reviewed, and the maximum grade of acute diarrhea was determined. The small bowel was outlined on the treatment planning CT scan, and a dose-volume histogram was calculated for the initial pelvic treatment (45 Gy). Logistic regression models were fitted for varying cutoff-dose levels from 5 to 45 Gy in 5-Gy increments. The model with the highest LogLikelihood was used to develop a cutoff-dose normal tissue complication probability (NTCP) model. Results: There were a total of 152 patients (48% preoperative, 47% postoperative, 5% other), predominantly treated prone (95%) with a three-field technique (94%) and a protracted venous infusion of 5-fluorouracil (78%). Acute Grade 3 diarrhea occurred in 21%. The largest LogLikelihood was found for the cutoff-dose logistic regression model with 15 Gy as the cutoff-dose, although the models for 20 Gy and 25 Gy had similar significance. According to this model, highly significant correlations (p < 0.001) between small bowel volumes receiving at least 15 Gy and toxicity exist in the considered patient population. Similar findings applied to both the preoperatively (p = 0.001) and postoperatively irradiated groups (p = 0.001). Conclusion: The incidence of Grade 3 diarrhea was significantly correlated with the volume of small bowel receiving at least 15 Gy using a cutoff-dose NTCP model.
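
    The cutoff-dose selection described in this abstract (fit one logistic model per cutoff dose, keep the one with the highest log-likelihood) can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function names, the Newton-Raphson fitting routine, and the toy volumes and outcomes are all assumptions made for the example.

```python
import math

def _log_lik(b0, b1, x, y):
    """Bernoulli log-likelihood of P(y=1) = sigmoid(b0 + b1*x), computed stably."""
    total = 0.0
    for xi, yi in zip(x, y):
        z = b0 + b1 * xi
        softplus = max(z, 0.0) + math.log1p(math.exp(-abs(z)))  # log(1 + e^z)
        total += yi * z - softplus
    return total

def fit_logistic_1d(x, y, n_iter=50, tol=1e-10):
    """Fit a one-covariate logistic model by Newton-Raphson; return (b0, b1, loglik)."""
    b0 = b1 = 0.0
    for _ in range(n_iter):
        g0 = g1 = h00 = h01 = h11 = 0.0
        for xi, yi in zip(x, y):
            z = b0 + b1 * xi
            p = 1.0 / (1.0 + math.exp(-z)) if z >= 0 else math.exp(z) / (1.0 + math.exp(z))
            w = p * (1.0 - p)
            g0 += yi - p              # score, intercept
            g1 += (yi - p) * xi       # score, slope
            h00 += w                  # information matrix entries
            h01 += w * xi
            h11 += w * xi * xi
        det = h00 * h11 - h01 * h01
        if det < 1e-12:               # covariate has (almost) no variation
            break
        d0 = (h11 * g0 - h01 * g1) / det
        d1 = (h00 * g1 - h01 * g0) / det
        b0, b1 = b0 + d0, b1 + d1
        if abs(d0) + abs(d1) < tol:
            break
    return b0, b1, _log_lik(b0, b1, x, y)

def select_cutoff(volumes_by_cutoff, outcomes):
    """Fit one model per cutoff dose and return (best_cutoff, fits)."""
    fits = {c: fit_logistic_1d(v, outcomes) for c, v in volumes_by_cutoff.items()}
    best = max(fits, key=lambda c: fits[c][2])
    return best, fits

# Hypothetical toy data: bowel volume (cc) receiving at least 5, 15, 25 Gy
# for eight patients, and whether Grade 3 diarrhea occurred (1) or not (0).
outcomes = [0, 0, 0, 1, 0, 1, 1, 1]
volumes_by_cutoff = {
    5:  [100, 90, 120, 95, 110, 105, 85, 115],
    15: [10, 20, 30, 40, 50, 60, 70, 80],
    25: [5, 1, 8, 3, 9, 2, 7, 6],
}
best_cutoff, fits = select_cutoff(volumes_by_cutoff, outcomes)
print(best_cutoff, fits[best_cutoff])
```

    In the toy data only the 15 Gy volumes track the outcome, so that cutoff is selected; with real dose-volume histograms the candidate cutoffs would be the 5 to 45 Gy grid named in the abstract.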

  7. Modification of the Mantel-Haenszel and Logistic Regression DIF Procedures to Incorporate the SIBTEST Regression Correction

    ERIC Educational Resources Information Center

    DeMars, Christine E.

    2009-01-01

    The Mantel-Haenszel (MH) and logistic regression (LR) differential item functioning (DIF) procedures have inflated Type I error rates when there are large mean group differences, short tests, and large sample sizes. When there are large group differences in mean score, groups matched on the observed number-correct score differ on true score,…

  8. Satellite rainfall retrieval by logistic regression

    NASA Technical Reports Server (NTRS)

    Chiu, Long S.

    1986-01-01

    The potential use of logistic regression in rainfall estimation from satellite measurements is investigated. Satellite measurements provide covariate information in terms of radiances from different remote sensors. The logistic regression technique can effectively accommodate many covariates and test their significance in the estimation. The outcome from the logistical model is the probability that the rainrate of a satellite pixel is above a certain threshold. By varying the thresholds, a rainrate histogram can be obtained, from which the mean and the variance can be estimated. A logistical model is developed and applied to rainfall data collected during GATE, using as covariates the fractional rain area and a radiance measurement which is deduced from a microwave temperature-rainrate relation. It is demonstrated that the fractional rain area is an important covariate in the model, consistent with the use of the so-called Area Time Integral in estimating total rain volume in other studies. To calibrate the logistical model, simulated rain fields generated by rainfield models with prescribed parameters are needed. A stringent test of the logistical model is its ability to recover the prescribed parameters of simulated rain fields. A rain field simulation model which preserves the fractional rain area and lognormality of rainrates as found in GATE is developed. A stochastic regression model of branching and immigration whose solutions are lognormally distributed in some asymptotic limits has also been developed.
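
    The step from threshold-exceedance probabilities to a rainrate histogram, and from there to a mean and variance, can be sketched as below. This is an illustrative reading of the abstract, not the report's procedure: the function name, the midpoint approximation within each bin, and the example numbers are assumptions.

```python
def histogram_moments(thresholds, exceed_probs):
    """Turn exceedance probabilities P(R > t_k) into bin probabilities,
    then approximate the mean and variance using bin midpoints.

    Assumes thresholds are increasing and that the last exceedance
    probability is 0, so the final threshold bounds the rainrate.
    """
    bins = []
    for k in range(len(thresholds) - 1):
        prob = exceed_probs[k] - exceed_probs[k + 1]   # P(t_k < R <= t_{k+1})
        mid = 0.5 * (thresholds[k] + thresholds[k + 1])
        bins.append((mid, prob))
    mean = sum(m * p for m, p in bins)
    var = sum(p * (m - mean) ** 2 for m, p in bins)
    return [p for _, p in bins], mean, var

# Hypothetical exceedance probabilities, e.g. outputs of logistic models
# fitted at rainrate thresholds of 0, 2, 4 and 6 (arbitrary units).
bin_probs, mean_rate, var_rate = histogram_moments([0, 2, 4, 6], [1.0, 0.5, 0.25, 0.0])
print(bin_probs, mean_rate, var_rate)
```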

  9. Practical Session: Logistic Regression

    NASA Astrophysics Data System (ADS)

    Clausel, M.; Grégoire, G.

    2014-12-01

    An exercise is proposed to illustrate logistic regression. It investigates the different risk factors for the occurrence of coronary heart disease. The exercise was proposed in Chapter 5 of the book by D.G. Kleinbaum and M. Klein, "Logistic Regression", Statistics for Biology and Health, Springer Science Business Media, LLC (2010) and also by D. Chessel and A.B. Dufour in Lyon 1 (see Sect. 6 of http://pbil.univ-lyon1.fr/R/pdf/tdr341.pdf). This example is based on data given in the file evans.txt coming from http://www.sph.emory.edu/dkleinb/logreg3.htm#data.

  10. Multinomial logistic regression modelling of obesity and overweight among primary school students in a rural area of Negeri Sembilan

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Ghazali, Amirul Syafiq Mohd; Ali, Zalila; Noor, Norlida Mohd

    Multinomial logistic regression is widely used to model the outcomes of a polytomous response variable, a categorical dependent variable with more than two categories. The model assumes that the conditional mean of the dependent categorical variables is the logistic function of an affine combination of predictor variables. Its procedure gives a number of logistic regression models that make specific comparisons of the response categories. When there are q categories of the response variable, the model consists of q-1 logit equations which are fitted simultaneously. The model is validated by variable selection procedures, tests of regression coefficients, a significance test of the overall model, goodness-of-fit measures, and validation of predicted probabilities using odds ratio. This study used the multinomial logistic regression model to investigate obesity and overweight among primary school students in a rural area on the basis of their demographic profiles, lifestyles, and diet and food intake. The results indicated that obesity and overweight of students are related to gender, religion, sleep duration, time spent on electronic games, breakfast intake in a week, with whom meals are taken, protein intake, and also, the interaction between breakfast intake in a week and sleep duration, and the interaction between gender and protein intake.

  11. Multinomial logistic regression modelling of obesity and overweight among primary school students in a rural area of Negeri Sembilan

    NASA Astrophysics Data System (ADS)

    Ghazali, Amirul Syafiq Mohd; Ali, Zalila; Noor, Norlida Mohd; Baharum, Adam

    2015-10-01

    Multinomial logistic regression is widely used to model the outcomes of a polytomous response variable, a categorical dependent variable with more than two categories. The model assumes that the conditional mean of the dependent categorical variables is the logistic function of an affine combination of predictor variables. Its procedure gives a number of logistic regression models that make specific comparisons of the response categories. When there are q categories of the response variable, the model consists of q-1 logit equations which are fitted simultaneously. The model is validated by variable selection procedures, tests of regression coefficients, a significance test of the overall model, goodness-of-fit measures, and validation of predicted probabilities using odds ratio. This study used the multinomial logistic regression model to investigate obesity and overweight among primary school students in a rural area on the basis of their demographic profiles, lifestyles, and diet and food intake. The results indicated that obesity and overweight of students are related to gender, religion, sleep duration, time spent on electronic games, breakfast intake in a week, with whom meals are taken, protein intake, and also, the interaction between breakfast intake in a week and sleep duration, and the interaction between gender and protein intake.
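
    To make the "q-1 logit equations" concrete, the sketch below converts q-1 linear predictors into q category probabilities, taking the last category as the reference. The function name, the choice of reference category, and the coefficient values are assumptions for illustration; none of them come from the study.

```python
import math

def multinomial_probs(x, coefs):
    """Category probabilities for a baseline-category multinomial logit model.

    x     : list of predictor values
    coefs : q-1 coefficient lists, each [intercept, slope_1, ..., slope_p];
            the q-th (reference) category has all coefficients fixed at 0.
    """
    etas = [c[0] + sum(b * xi for b, xi in zip(c[1:], x)) for c in coefs]
    etas.append(0.0)                       # reference category
    top = max(etas)                        # subtract the max for numerical stability
    expo = [math.exp(e - top) for e in etas]
    total = sum(expo)
    return [e / total for e in expo]

# Hypothetical three-category model (q = 3, so two logit equations) with one
# predictor whose slopes are zero, which makes the result easy to check by hand.
probs = multinomial_probs([1.0], [[math.log(2), 0.0], [math.log(3), 0.0]])
print(probs)
```

    Each logit equation compares one category with the reference, so exponentiating a slope gives the odds ratio for that category relative to the reference category.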

  12. The cross-validated AUC for MCP-logistic regression with high-dimensional data.

    PubMed

    Jiang, Dingfeng; Huang, Jian; Zhang, Ying

    2013-10-01

    We propose a cross-validated area under the receiver operating characteristic (ROC) curve (CV-AUC) criterion for tuning parameter selection for penalized methods in sparse, high-dimensional logistic regression models. We use this criterion in combination with the minimax concave penalty (MCP) method for variable selection. The CV-AUC criterion is specifically designed for optimizing the classification performance for binary outcome data. To implement the proposed approach, we derive an efficient coordinate descent algorithm to compute the MCP-logistic regression solution surface. Simulation studies are conducted to evaluate the finite sample performance of the proposed method and its comparison with the existing methods including the Akaike information criterion (AIC), Bayesian information criterion (BIC) or Extended BIC (EBIC). The model selected based on the CV-AUC criterion tends to have a larger predictive AUC and smaller classification error than those with tuning parameters selected using the AIC, BIC or EBIC. We illustrate the application of the MCP-logistic regression with the CV-AUC criterion on three microarray datasets from the studies that attempt to identify genes related to cancers. Our simulation studies and data examples demonstrate that the CV-AUC is an attractive method for tuning parameter selection for penalized methods in high-dimensional logistic regression models.

  13. Calculating the individual probability of successful ocriplasmin treatment in eyes with VMT syndrome: a multivariable prediction model from the EXPORT study.

    PubMed

    Paul, Christoph; Heun, Christine; Müller, Hans-Helge; Hoerauf, Hans; Feltgen, Nicolas; Wachtlin, Joachim; Kaymak, Hakan; Mennel, Stefan; Koss, Michael Janusz; Fauser, Sascha; Maier, Mathias M; Schumann, Ricarda G; Mueller, Simone; Chang, Petrus; Schmitz-Valckenberg, Steffen; Kazerounian, Sara; Szurman, Peter; Lommatzsch, Albrecht; Bertelmann, Thomas

    2017-10-31

    To evaluate predictive factors for the treatment success of ocriplasmin and to use these factors to generate a multivariate model to calculate the individual probability of successful treatment. Data were collected in a retrospective, multicentre cohort study. Patients with vitreomacular traction (VMT) syndrome without a full-thickness macular hole were included if they received an intravitreal injection (IVI) of ocriplasmin. Five factors (age, gender, lens status, presence of epiretinal membrane (ERM) formation and horizontal diameter of VMT) were assessed on their association with VMT resolution. A multivariable logistic regression model was employed to further analyse these factors and calculate the individual probability of successful treatment. 167 eyes of 167 patients were included. Univariate analysis revealed a significant correlation to VMT resolution for all analysed factors: age (years) (OR 0.9208; 95% CI 0.8845 to 0.9586; p<0.0001), gender (male) (OR 0.480; 95% CI 0.241 to 0.957; p=0.0371), lens status (phakic) (OR 2.042; 95% CI 1.054 to 3.958; p=0.0344), ERM formation (present) (OR 0.384; 95% CI 0.179 to 0.821; p=0.0136) and horizontal VMT diameter (µm) (OR 0.99812; 95% CI 0.99684 to 0.99941, p=0.0042). A significant multivariable logistic regression model was established with age and VMT diameter. Known predictive factors for VMT resolution after ocriplasmin IVI were confirmed in our study. We were able to combine them into a formula, ultimately allowing the calculation of an individual probability of treatment success with ocriplasmin in patients with VMT syndrome without FTHM. © Article author(s) (or their employer(s) unless otherwise stated in the text of the article) 2017. All rights reserved. No commercial use is permitted unless otherwise expressly granted.

  14. Robust logistic regression to narrow down the winner's curse for rare and recessive susceptibility variants.

    PubMed

    Kesselmeier, Miriam; Lorenzo Bermejo, Justo

    2017-11-01

    Logistic regression is the most common technique used for genetic case-control association studies. A disadvantage of standard maximum likelihood estimators of the genotype relative risk (GRR) is their strong dependence on outlier subjects, for example, patients diagnosed at unusually young age. Robust methods are available to constrain outlier influence, but they are scarcely used in genetic studies. This article provides a non-intimidating introduction to robust logistic regression, and investigates its benefits and limitations in genetic association studies. We applied the bounded Huber and extended the R package 'robustbase' with the re-descending Hampel functions to down-weight outlier influence. Computer simulations were carried out to assess the type I error rate, mean squared error (MSE) and statistical power according to major characteristics of the genetic study and investigated markers. Simulations were complemented with the analysis of real data. Both standard and robust estimation controlled type I error rates. Standard logistic regression showed the highest power but standard GRR estimates also showed the largest bias and MSE, in particular for associated rare and recessive variants. For illustration, a recessive variant with a true GRR=6.32 and a minor allele frequency=0.05 investigated in a 1000 case/1000 control study by standard logistic regression resulted in power=0.60 and MSE=16.5. The corresponding figures for Huber-based estimation were power=0.51 and MSE=0.53. Overall, Hampel- and Huber-based GRR estimates did not differ much. Robust logistic regression may represent a valuable alternative to standard maximum likelihood estimation when the focus lies on risk prediction rather than identification of susceptibility variants. © The Author 2016. Published by Oxford University Press. All rights reserved. For Permissions, please email: journals.permissions@oup.com.

  15. CUSUM-Logistic Regression analysis for the rapid detection of errors in clinical laboratory test results.

    PubMed

    Sampson, Maureen L; Gounden, Verena; van Deventer, Hendrik E; Remaley, Alan T

    2016-02-01

    The main drawback of the periodic analysis of quality control (QC) material is that test performance is not monitored in time periods between QC analyses, potentially leading to the reporting of faulty test results. The objective of this study was to develop a patient based QC procedure for the more timely detection of test errors. Results from a Chem-14 panel measured on the Beckman LX20 analyzer were used to develop the model. Each test result was predicted from the other 13 members of the panel by multiple regression, which resulted in correlation coefficients between the predicted and measured result of >0.7 for 8 of the 14 tests. A logistic regression model, which utilized the measured test result, the predicted test result, the day of the week and time of day, was then developed for predicting test errors. The output of the logistic regression was tallied by a daily CUSUM approach and used to predict test errors, with a fixed specificity of 90%. The mean average run length (ARL) before error detection by CUSUM-Logistic Regression (CSLR) was 20 with a mean sensitivity of 97%, which was considerably shorter than the mean ARL of 53 (sensitivity 87.5%) for a simple prediction model that only used the measured result for error detection. A CUSUM-Logistic Regression analysis of patient laboratory data can be an effective approach for the rapid and sensitive detection of clinical laboratory errors. Published by Elsevier Inc.
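
    The idea of tallying logistic regression outputs with a CUSUM can be illustrated with a generic one-sided CUSUM. The abstract does not give the authors' exact tallying rule, so the update formula, the reference value k, the decision limit h, and the example probabilities below are all assumptions.

```python
def cusum_alarm(error_probs, k=0.5, h=1.0):
    """One-sided CUSUM over predicted error probabilities.

    The statistic accumulates how far each probability exceeds the
    reference value k and is reset to 0 whenever it would go negative.
    Returns the index of the first result at which the statistic
    exceeds the decision limit h, or None if no alarm is raised.
    """
    s = 0.0
    for i, p in enumerate(error_probs):
        s = max(0.0, s + (p - k))
        if s > h:
            return i
    return None

# Hypothetical logistic-regression outputs for consecutive patient results:
# two unremarkable results followed by a run of suspicious ones.
print(cusum_alarm([0.1, 0.1, 0.9, 0.9, 0.9]))
```

    In a sketch like this, a larger h trades slower detection (a longer average run length) for fewer false alarms, which is the kind of trade-off the fixed 90% specificity in the abstract constrains.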

  16. Nonconvex Sparse Logistic Regression With Weakly Convex Regularization

    NASA Astrophysics Data System (ADS)

    Shen, Xinyue; Gu, Yuantao

    2018-06-01

    In this work we propose to fit a sparse logistic regression model by a weakly convex regularized nonconvex optimization problem. The idea is based on the finding that a weakly convex function as an approximation of the $\ell_0$ pseudo norm is able to better induce sparsity than the commonly used $\ell_1$ norm. For a class of weakly convex sparsity inducing functions, we prove the nonconvexity of the corresponding sparse logistic regression problem, and study its local optimality conditions and the choice of the regularization parameter to exclude trivial solutions. Despite the nonconvexity, a method based on proximal gradient descent is used to solve the general weakly convex sparse logistic regression, and its convergence behavior is studied theoretically. Then the general framework is applied to a specific weakly convex function, and a necessary and sufficient local optimality condition is provided. The solution method is instantiated in this case as an iterative firm-shrinkage algorithm, and its effectiveness is demonstrated in numerical experiments by both randomly generated and real datasets.

  17. A comparative study on entrepreneurial attitudes modeled with logistic regression and Bayes nets.

    PubMed

    López Puga, Jorge; García García, Juan

    2012-11-01

    Entrepreneurship research is receiving increasing attention in our context, as entrepreneurs are key social agents involved in economic development. We compare the success of the dichotomic logistic regression model and the Bayes simple classifier to predict entrepreneurship, after manipulating the percentage of missing data and the level of categorization in predictors. A sample of undergraduate university students (N = 1230) completed five scales (motivation, attitude towards business creation, obstacles, deficiencies, and training needs) and we found that each of them predicted different aspects of the tendency to business creation. Additionally, our results show that the receiver operating characteristic (ROC) curve is affected by the rate of missing data in both techniques, but logistic regression seems to be more vulnerable when faced with missing data, whereas Bayes nets underperform slightly when categorization has been manipulated. Our study sheds light on the potential entrepreneur profile and we propose to use Bayesian networks as an additional alternative to overcome the weaknesses of logistic regression when missing data are present in applied research.

  18. Comparison of cranial sex determination by discriminant analysis and logistic regression.

    PubMed

    Amores-Ampuero, Anabel; Alemán, Inmaculada

    2016-04-05

    Various methods have been proposed for estimating dimorphism. The objective of this study was to compare sex determination results from cranial measurements using discriminant analysis or logistic regression. The study sample comprised 130 individuals (70 males) of known sex, age, and cause of death from San José cemetery in Granada (Spain). Measurements of 19 neurocranial dimensions and 11 splanchnocranial dimensions were subjected to discriminant analysis and logistic regression, and the percentages of correct classification were compared between the sex functions obtained with each method. The discriminant capacity of the selected variables was evaluated with a cross-validation procedure. The percentage accuracy with discriminant analysis was 78.2% for the neurocranium (82.4% in females and 74.6% in males) and 73.7% for the splanchnocranium (79.6% in females and 68.8% in males). These percentages were higher with logistic regression analysis: 85.7% for the neurocranium (in both sexes) and 94.1% for the splanchnocranium (100% in females and 91.7% in males).

  19. Stepwise Distributed Open Innovation Contests for Software Development: Acceleration of Genome-Wide Association Analysis

    PubMed Central

    Hill, Andrew; Loh, Po-Ru; Bharadwaj, Ragu B.; Pons, Pascal; Shang, Jingbo; Guinan, Eva; Lakhani, Karim; Kilty, Iain

    2017-01-01

    Abstract Background: The association of differing genotypes with disease-related phenotypic traits offers great potential to both help identify new therapeutic targets and support stratification of patients who would gain the greatest benefit from specific drug classes. Development of low-cost genotyping and sequencing has made collecting large-scale genotyping data routine in population and therapeutic intervention studies. In addition, a range of new technologies is being used to capture numerous new and complex phenotypic descriptors. As a result, genotype and phenotype datasets have grown exponentially. Genome-wide association studies associate genotypes and phenotypes using methods such as logistic regression. As existing tools for association analysis limit the efficiency by which value can be extracted from increasing volumes of data, there is a pressing need for new software tools that can accelerate association analyses on large genotype-phenotype datasets. Results: Using open innovation (OI) and contest-based crowdsourcing, the logistic regression analysis in a leading, community-standard genetics software package (PLINK 1.07) was substantially accelerated. OI allowed us to do this in <6 months by providing rapid access to highly skilled programmers with specialized, difficult-to-find skill sets. Through a crowd-based contest a combination of computational, numeric, and algorithmic approaches was identified that accelerated the logistic regression in PLINK 1.07 by 18- to 45-fold. Combining contest-derived logistic regression code with coarse-grained parallelization, multithreading, and associated changes to data initialization code further developed through distributed innovation, we achieved an end-to-end speedup of 591-fold for a data set size of 6678 subjects by 645 863 variants, compared to PLINK 1.07's logistic regression. This represents a reduction in run time from 4.8 hours to 29 seconds. 
Accelerated logistic regression code developed in this project has been incorporated into the PLINK2 project. Conclusions: Using iterative competition-based OI, we have developed a new, faster implementation of logistic regression for genome-wide association studies analysis. We present lessons learned and recommendations on running a successful OI process for bioinformatics. PMID:28327993

  20. Stepwise Distributed Open Innovation Contests for Software Development: Acceleration of Genome-Wide Association Analysis.

    PubMed

    Hill, Andrew; Loh, Po-Ru; Bharadwaj, Ragu B; Pons, Pascal; Shang, Jingbo; Guinan, Eva; Lakhani, Karim; Kilty, Iain; Jelinsky, Scott A

    2017-05-01

    The association of differing genotypes with disease-related phenotypic traits offers great potential to both help identify new therapeutic targets and support stratification of patients who would gain the greatest benefit from specific drug classes. Development of low-cost genotyping and sequencing has made collecting large-scale genotyping data routine in population and therapeutic intervention studies. In addition, a range of new technologies is being used to capture numerous new and complex phenotypic descriptors. As a result, genotype and phenotype datasets have grown exponentially. Genome-wide association studies associate genotypes and phenotypes using methods such as logistic regression. As existing tools for association analysis limit the efficiency by which value can be extracted from increasing volumes of data, there is a pressing need for new software tools that can accelerate association analyses on large genotype-phenotype datasets. Using open innovation (OI) and contest-based crowdsourcing, the logistic regression analysis in a leading, community-standard genetics software package (PLINK 1.07) was substantially accelerated. OI allowed us to do this in <6 months by providing rapid access to highly skilled programmers with specialized, difficult-to-find skill sets. Through a crowd-based contest a combination of computational, numeric, and algorithmic approaches was identified that accelerated the logistic regression in PLINK 1.07 by 18- to 45-fold. Combining contest-derived logistic regression code with coarse-grained parallelization, multithreading, and associated changes to data initialization code further developed through distributed innovation, we achieved an end-to-end speedup of 591-fold for a data set size of 6678 subjects by 645 863 variants, compared to PLINK 1.07's logistic regression. This represents a reduction in run time from 4.8 hours to 29 seconds. 
Accelerated logistic regression code developed in this project has been incorporated into the PLINK2 project. Using iterative competition-based OI, we have developed a new, faster implementation of logistic regression for genome-wide association studies analysis. We present lessons learned and recommendations on running a successful OI process for bioinformatics. © The Author 2017. Published by Oxford University Press.

  1. Easy and low-cost identification of metabolic syndrome in patients treated with second-generation antipsychotics: artificial neural network and logistic regression models.

    PubMed

    Lin, Chao-Cheng; Bai, Ya-Mei; Chen, Jen-Yeu; Hwang, Tzung-Jeng; Chen, Tzu-Ting; Chiu, Hung-Wen; Li, Yu-Chuan

    2010-03-01

    Metabolic syndrome (MetS) is an important side effect of second-generation antipsychotics (SGAs). However, many SGA-treated patients with MetS remain undetected. In this study, we trained and validated artificial neural network (ANN) and multiple logistic regression models without biochemical parameters to rapidly identify MetS in patients with SGA treatment. A total of 383 patients with a diagnosis of schizophrenia or schizoaffective disorder (DSM-IV criteria) with SGA treatment for more than 6 months were investigated to determine whether they met the MetS criteria according to the International Diabetes Federation. The data for these patients were collected between March 2005 and September 2005. The input variables of ANN and logistic regression were limited to demographic and anthropometric data only. All models were trained by randomly selecting two-thirds of the patient data and were internally validated with the remaining one-third of the data. The models were then externally validated with data from 69 patients from another hospital, collected between March 2008 and June 2008. The area under the receiver operating characteristic curve (AUC) was used to measure the performance of all models. Both the final ANN and logistic regression models had high accuracy (88.3% vs 83.6%), sensitivity (93.1% vs 86.2%), and specificity (86.9% vs 83.8%) to identify MetS in the internal validation set. The mean +/- SD AUC was high for both the ANN and logistic regression models (0.934 +/- 0.033 vs 0.922 +/- 0.035, P = .63). During external validation, high AUC was still obtained for both models. Waist circumference and diastolic blood pressure were the common variables that were left in the final ANN and logistic regression models. Our study developed accurate ANN and logistic regression models to detect MetS in patients with SGA treatment. The models are likely to provide a noninvasive tool for large-scale screening of MetS in this group of patients. 
(c) 2010 Physicians Postgraduate Press, Inc.

  2. Bayesian logistic regression in detection of gene-steroid interaction for cancer at PDLIM5 locus.

    PubMed

    Wang, Ke-Sheng; Owusu, Daniel; Pan, Yue; Xie, Changchun

    2016-06-01

    The PDZ and LIM domain 5 (PDLIM5) gene may play a role in cancer, bipolar disorder, major depression, alcohol dependence and schizophrenia; however, little is known about the interaction effect of steroid and PDLIM5 gene on cancer. This study examined 47 single-nucleotide polymorphisms (SNPs) within the PDLIM5 gene in the Marshfield sample with 716 cancer patients (any diagnosed cancer, excluding minor skin cancer) and 2848 noncancer controls. Multiple logistic regression model in PLINK software was used to examine the association of each SNP with cancer. Bayesian logistic regression in PROC GENMOD in SAS statistical software, ver. 9.4 was used to detect gene-steroid interactions influencing cancer. Single marker analysis using PLINK identified 12 SNPs associated with cancer (P < 0.05); in particular, SNP rs6532496 revealed the strongest association with cancer (P = 6.84 × 10⁻³), while the next best signal was rs951613 (P = 7.46 × 10⁻³). Classic logistic regression in PROC GENMOD showed that both rs6532496 and rs951613 revealed strong gene-steroid interaction effects (OR=2.18, 95% CI=1.31-3.63 with P = 2.9 × 10⁻³ for rs6532496 and OR=2.07, 95% CI=1.24-3.45 with P = 5.43 × 10⁻³ for rs951613, respectively). Results from Bayesian logistic regression showed stronger interaction effects (OR=2.26, 95% CI=1.2-3.38 for rs6532496 and OR=2.14, 95% CI=1.14-3.2 for rs951613, respectively). All the 12 SNPs associated with cancer revealed significant gene-steroid interaction effects (P < 0.05), whereas 13 SNPs showed gene-steroid interaction effects without main effect on cancer. SNP rs4634230 revealed the strongest gene-steroid interaction effect (OR=2.49, 95% CI=1.5-4.13 with P = 4.0 × 10⁻⁴ based on the classic logistic regression and OR=2.59, 95% CI=1.4-3.97 from Bayesian logistic regression, respectively). This study provides evidence of common genetic variants within the PDLIM5 gene and interactions between PDLIM5 gene polymorphisms and steroid use influencing cancer.
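
    Odds ratios and 95% confidence intervals of the kind quoted above are commonly obtained by exponentiating a logistic regression coefficient and its Wald limits. The abstract does not say how its intervals were computed, so the Wald form, the function name, and the example coefficient and standard error below are assumptions for illustration only.

```python
import math

def odds_ratio_ci(beta, se, z=1.96):
    """Odds ratio with a Wald-type confidence interval.

    beta : estimated log-odds coefficient (e.g. of an interaction term)
    se   : its standard error
    z    : normal quantile (1.96 for an approximate 95% interval)
    """
    return math.exp(beta), math.exp(beta - z * se), math.exp(beta + z * se)

# Hypothetical coefficient and standard error, not taken from the study.
or_est, ci_low, ci_high = odds_ratio_ci(0.78, 0.26)
print(round(or_est, 2), round(ci_low, 2), round(ci_high, 2))
```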

  3. A comparison between univariate probabilistic and multivariate (logistic regression) methods for landslide susceptibility analysis: the example of the Febbraro valley (Northern Alps, Italy)

    NASA Astrophysics Data System (ADS)

    Rossi, M.; Apuani, T.; Felletti, F.

    2009-04-01

    The aim of this paper is to compare the results of two statistical methods for landslide susceptibility analysis: 1) a univariate probabilistic method based on the landslide susceptibility index, 2) a multivariate method (logistic regression). The study area is the Febbraro valley, located in the central Italian Alps, where different types of metamorphic rocks crop out. On the eastern part of the studied basin a Quaternary cover represented by colluvial and, secondarily, by glacial deposits is dominant. In this study 110 earth flows, mainly located toward the NE portion of the catchment, were analyzed. They involve only the colluvial deposits and their extension mainly ranges from 36 to 3173 m². Both statistical methods require establishing a spatial database, in which each landslide is described by several parameters that can be assigned using a main scarp central point of the landslide. The spatial database is constructed using a Geographical Information System (GIS). Each landslide is described by several parameters corresponding to the value at the main scarp central point of the landslide. Based on a bibliographic review, a total of 15 predisposing factors were utilized. The width of the intervals into which the maps of the predisposing factors have to be reclassified was defined assuming constant intervals for: elevation (100 m), slope (5°), solar radiation (0.1 MJ/cm²/year), profile curvature (1.2 1/m), tangential curvature (2.2 1/m), drainage density (0.5), lineament density (0.00126). For the other parameters, the results of the probability-probability plot analysis and the statistical indexes of landslide sites were used. In particular: slope length (0 ÷ 2, 2 ÷ 5, 5 ÷ 10, 10 ÷ 20, 20 ÷ 35, 35 ÷ 260), accumulation flow (0 ÷ 1, 1 ÷ 2, 2 ÷ 5, 5 ÷ 12, 12 ÷ 60, 60 ÷ 27265), Topographic Wetness Index (0 ÷ 0.74, 0.74 ÷ 1.94, 1.94 ÷ 2.62, 2.62 ÷ 3.48, 3.48 ÷ 6.00, 6.00 ÷ 9.44), Stream Power Index (0 ÷ 0.64, 0.64 ÷ 1.28, 1.28 ÷ 1.81, 1.81 ÷ 4.20, 4.20 ÷ 9.40). 
Geological and land use maps were also used, considering geological and land use properties as categorical variables. Applying the univariate probabilistic method, the Landslide Susceptibility Index (LSI) is defined as the sum of the ratio Ra/Rb calculated for each predisposing factor, where Ra is the ratio between the number of pixels of the class and the total number of pixels of the study area, and Rb is the ratio between the number of landslides and the number of pixels of the interval area. From the analysis of the Ra/Rb ratio, the relationships between landslide occurrence and predisposing factors were defined. Then the equation of LSI was used in GIS to trace the landslide susceptibility maps. The multivariate method for landslide susceptibility analysis, based on logistic regression, was performed starting from the density maps of the predisposing factors, calculated with the intervals defined above using the equation Rb/Rbtot, where Rbtot is the sum of all Rb values. Using stepwise forward algorithms, the logistic regression was performed in two successive steps: first a univariate logistic regression is used to choose the most significant predisposing factors, then the multivariate logistic regression can be performed. The univariate regression highlighted the importance of the following factors: elevation, accumulation flow, drainage density, lineament density, geology and land use. When the multivariate regression was applied, the number of controlling factors was reduced, neglecting the geological properties. The resulting final susceptibility equation is: P = 1 / (1 + exp(-(6.46 - 22.34*elevation - 5.33*accumulation flow - 7.99*drainage density - 4.47*lineament density - 17.31*land use))) and using this equation the susceptibility maps were obtained. To easily compare the results of the two methodologies, the susceptibility maps were reclassified in five susceptibility intervals (very high, high, moderate, low and very low) using natural breaks. 
Then the maps were validated using two cumulative distribution curves, one related to the landslides (number of landslides in each susceptibility class) and one to the basin (number of pixels covering each class). A comparison of the curves for each method shows that both approaches (univariate and multivariate) are appropriate and provide acceptable results. In both maps the high-susceptibility areas are mainly localized on the left slope of the catchment, in agreement with the field evidence. The two methods were compared by subtracting one map from the other. This operation shows that about 40% of the basin is assigned to the same susceptibility class by both methods. In general, the univariate probabilistic method tends to overestimate the areal extension of the high susceptibility class with respect to the maps obtained by the logistic regression method.
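
    As a minimal sketch of how the susceptibility equation in this abstract could be evaluated for one map pixel, the snippet below reads the printed "exp-(...)" as exp(-(...)) and takes the five inputs to be the normalized density values (Rb/Rbtot) of each factor. The function name, the dictionary keys and the example pixel values are assumptions made for illustration; they do not come from the study.

```python
import math

# Intercept and coefficients copied from the susceptibility equation above.
INTERCEPT = 6.46
COEFFICIENTS = {
    "elevation": -22.34,
    "accumulation_flow": -5.33,
    "drainage_density": -7.99,
    "lineament_density": -4.47,
    "land_use": -17.31,
}


def susceptibility(factors):
    """Return P = 1 / (1 + exp(-z)), with z = intercept + sum(coef * value)."""
    z = INTERCEPT + sum(COEFFICIENTS[name] * factors[name] for name in COEFFICIENTS)
    return 1.0 / (1.0 + math.exp(-z))


# Hypothetical pixel: the density values below are invented for illustration.
pixel = {
    "elevation": 0.10,
    "accumulation_flow": 0.20,
    "drainage_density": 0.15,
    "lineament_density": 0.05,
    "land_use": 0.12,
}
print(f"P = {susceptibility(pixel):.3f}")
```

    Applying such a function to every pixel and then binning the results (the abstract uses natural breaks) would give the five susceptibility classes; the binning step is not shown here.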

  4. Deletion Diagnostics for Alternating Logistic Regressions

    PubMed Central

    Preisser, John S.; By, Kunthel; Perin, Jamie; Qaqish, Bahjat F.

    2013-01-01

    Deletion diagnostics are introduced for the regression analysis of clustered binary outcomes estimated with alternating logistic regressions, an implementation of generalized estimating equations (GEE) that estimates regression coefficients in a marginal mean model and in a model for the intracluster association given by the log odds ratio. The diagnostics are developed within an estimating equations framework that recasts the estimating functions for association parameters based upon conditional residuals into equivalent functions based upon marginal residuals. Extensions of earlier work on GEE diagnostics follow directly, including computational formulae for one-step deletion diagnostics that measure the influence of a cluster of observations on the estimated regression parameters and on the overall marginal mean or association model fit. The diagnostic formulae are evaluated with simulation studies and with an application concerning an assessment of factors associated with health maintenance visits in primary care medical practices. The application and the simulations demonstrate that the proposed cluster-deletion diagnostics for alternating logistic regressions are good approximations of their exact fully iterated counterparts. PMID:22777960

  5. Bicycle Use and Cyclist Safety Following Boston’s Bicycle Infrastructure Expansion, 2009–2012

    PubMed Central

    Angriman, Federico; Bellows, Alexandra L.; Taylor, Kathryn

    2016-01-01

    Objectives. To evaluate changes in bicycle use and cyclist safety in Boston, Massachusetts, following the rapid expansion of its bicycle infrastructure between 2007 and 2014. Methods. We measured bicycle lane mileage, a surrogate for bicycle infrastructure expansion, and quantified total estimated number of commuters. In addition, we calculated the number of reported bicycle accidents from 2009 to 2012. Bicycle accident and injury trends over time were assessed via generalized linear models. Multivariable logistic regression was used to examine factors associated with bicycle injuries. Results. Boston increased its total bicycle lane mileage from 0.034 miles in 2007 to 92.2 miles in 2014 (P < .001). The percentage of bicycle commuters increased from 0.9% in 2005 to 2.4% in 2014 (P = .002) and the total percentage of bicycle accidents involving injuries diminished significantly, from 82.7% in 2009 to 74.6% in 2012. The multivariable logistic regression analysis showed that for every 1-year increase in time from 2009 to 2012, there was a 14% reduction in the odds of being injured in an accident. Conclusions. The expansion of Boston’s bicycle infrastructure was associated with increases in both bicycle use and cyclist safety. PMID:27736203
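
    The per-year odds reduction reported in this abstract can be compounded over several years, since a logistic model with a linear time term multiplies the odds by the same factor for each one-unit step. The sketch below assumes exactly that log-linear time effect; the function name is made up for illustration.

```python
def cumulative_odds_ratio(per_unit_or, units):
    """Odds ratio for a change of `units`, given the odds ratio per one unit."""
    return per_unit_or ** units


# A 14% reduction in the odds per year corresponds to an odds ratio of 0.86.
per_year_or = 1.0 - 0.14
three_year_or = cumulative_odds_ratio(per_year_or, 3)
print(f"Odds ratio over 3 years (2009 to 2012): {three_year_or:.3f}")
```

    Note that this is a statement about odds, not about the injury percentages themselves, which the abstract reports separately.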

  6. Sample size adjustments for varying cluster sizes in cluster randomized trials with binary outcomes analyzed with second-order PQL mixed logistic regression.

    PubMed

    Candel, Math J J M; Van Breukelen, Gerard J P

    2010-06-30

    Adjustments of sample size formulas are given for varying cluster sizes in cluster randomized trials with a binary outcome when testing the treatment effect with mixed effects logistic regression using second-order penalized quasi-likelihood estimation (PQL). Starting from first-order marginal quasi-likelihood (MQL) estimation of the treatment effect, the asymptotic relative efficiency of unequal versus equal cluster sizes is derived. A Monte Carlo simulation study shows this asymptotic relative efficiency to be rather accurate for realistic sample sizes, when employing second-order PQL. An approximate, simpler formula is presented to estimate the efficiency loss due to varying cluster sizes when planning a trial. In many cases sampling 14 per cent more clusters is sufficient to repair the efficiency loss due to varying cluster sizes. Since current closed-form formulas for sample size calculation are based on first-order MQL, planning a trial also requires a conversion factor to obtain the variance of the second-order PQL estimator. In a second Monte Carlo study, this conversion factor turned out to be 1.25 at most. (c) 2010 John Wiley & Sons, Ltd.

  7. Internal exposure levels of typical POPs and their associations with childhood asthma in Shanghai, China.

    PubMed

    Meng, Ge; Feng, Yan; Nie, Zhiqing; Wu, Xiaomeng; Wei, Hongying; Wu, Shaowei; Yin, Yong; Wang, Yan

    2016-04-01

    Polybrominated diphenyl ethers (PBDEs), polychlorinated biphenyls (PCBs) and organochlorine pesticides (OCPs) are common persistent organic pollutants (POPs) that may be associated with childhood asthma. The concentrations of PBDEs, PCBs and OCPs were analyzed in pooled serum samples from both asthmatic and non-asthmatic children. The differences in the internal exposure levels between the case and control groups were tested (p value <0.0012). The associations between the internal exposure concentrations of the POPs and childhood asthma were estimated based on the odds ratios (ORs) calculated using logistic regression models. There were significant differences in three PBDEs, 26 PCBs and seven OCPs between the two groups, with significantly higher levels in the cases. The multiple logistic regression models demonstrated that the internal exposure concentrations of a number of the POPs (23 PCBs, p,p'-DDE and α-HCH) were positively associated with childhood asthma. Some synergistic effects were observed when the children were co-exposed to the chemicals. BDE-209 was positively associated with asthma aggravation. This study indicates the potential relationships between the internal exposure concentrations of particular POPs and the development of childhood asthma. Copyright © 2015 Elsevier Inc. All rights reserved.

  8. Physical Comorbidities in Depression Co-Occurring with Anxiety: A Cross Sectional Study in the Czech Primary Care System

    PubMed Central

    Winkler, Petr; Horáček, Jiří; Weissová, Aneta; Šustr, Martin; Brunovský, Martin

    2015-01-01

    Comorbidities associated with depression have been researched in a number of contexts. However, the epidemiological situation in clinical practice is understudied, especially in the post-Communist Central and Eastern Europe region. The aim of this study was to assess physical comorbidities in depression, and to identify whether there are increased odds of physical comorbidities associated with co-occurring depressive and anxiety disorders. Data on 4264 patients aged 18–98 were collected among medical doctors in the Czech Republic between 2010 and 2011. Descriptive statistics were calculated and multiple logistic regressions were performed to assess comorbidities among patients with depressive disorder. Among patients treated for depression in Czech primary care, 51.29% had a physical comorbidity and 45.5% had a comorbid anxiety disorder. Results of logistic regressions show that the odds of having pain, hypertension or diabetes mellitus are particularly elevated among those who have co-occurring depressive and anxiety disorders. Our findings demonstrate that comorbidities associated with depressive disorders are highly prevalent in primary health care practice, and that physical comorbidities are particularly frequent among those with co-occurring depressive and anxiety disorders. PMID:26690458

  9. Poor anaerobic power/capability and static balance predicted prospective musculoskeletal injuries among Soldiers of the 101st Airborne (Air Assault) Division.

    PubMed

    Nagai, Takashi; Lovalekar, Mita; Wohleber, Meleesa F; Perlsweig, Katherine A; Wirt, Michael D; Beals, Kim

    2017-11-01

    Musculoskeletal injuries have negatively impacted tactical readiness. The identification of prospective and modifiable risk factors of preventable musculoskeletal injuries can guide specific injury prevention strategies for Soldiers and health care providers. To analyze physiological and neuromuscular characteristics as predictors of preventable musculoskeletal injuries. Prospective-cohort study. A total of 491 Soldiers were enrolled and participated in the baseline laboratory testing, including body composition, aerobic capacity, anaerobic power/capacity, muscular strength, flexibility, static balance, and landing biomechanics. After reviewing their medical charts, 275 male Soldiers who met the criteria were divided into two groups: with injuries (INJ) and no injuries (NOI). Simple and multiple logistic regression analyses were used to calculate the odds ratio (OR) and significant predictors of musculoskeletal injuries (p<0.05). The final multiple logistic regression model included the static balance with eyes-closed and peak anaerobic power as predictors of future injuries (p<0.001). The current results highlighted the importance of anaerobic power/capacity and static balance. High intensity training and balance exercise should be incorporated in their physical training as countermeasures. Copyright © 2017 Sports Medicine Australia. All rights reserved.

  10. Which Measurement of Blood Pressure Is More Associated With Albuminuria in Patients With Type 2 Diabetes: Central Blood Pressure or Peripheral Blood Pressure?

    PubMed

    Kitagawa, Noriyuki; Okada, Hiroshi; Tanaka, Muhei; Hashimoto, Yoshitaka; Kimura, Toshihiro; Nakano, Koji; Yamazaki, Masahiro; Hasegawa, Goji; Nakamura, Naoto; Fukui, Michiaki

    2016-08-01

    The aim of this study was to investigate whether central systolic blood pressure (SBP) was associated with albuminuria, defined as urinary albumin excretion (UAE) ≥30 mg/g creatinine, and, if so, whether the relationship of central SBP with albuminuria was stronger than that of peripheral SBP in patients with type 2 diabetes. The authors performed a cross-sectional study in 294 outpatients with type 2 diabetes. The relationship between peripheral SBP or central SBP and UAE using regression analysis was evaluated, and the odds ratios of peripheral SBP or central SBP were calculated to identify albuminuria using logistic regression model. Moreover, the area under the receiver operating characteristic curve (AUC) of central SBP was compared with that of peripheral SBP to identify albuminuria. Multiple regression analysis demonstrated that peripheral SBP (β=0.255, P<.0001) or central SBP (r=0.227, P<.0001) was associated with UAE. Multiple logistic regression analysis demonstrated that peripheral SBP (odds ratio, 1.029; 95% confidence interval, 1.016-1.043) or central SBP (odds ratio, 1.022; 95% confidence interval, 1.011-1.034) was associated with an increased odds of albuminuria. In addition, AUC of peripheral SBP was significantly greater than that of central SBP to identify albuminuria (P=0.035). Peripheral SBP is superior to central SBP in identifying albuminuria, although both peripheral and central SBP are associated with UAE in patients with type 2 diabetes. © 2016 Wiley Periodicals, Inc.

  11. Validation of Metrics as Error Predictors

    NASA Astrophysics Data System (ADS)

    Mendling, Jan

    In this chapter, we test the validity of metrics that were defined in the previous chapter for predicting errors in EPC business process models. In Section 5.1, we provide an overview of how the analysis data is generated. Section 5.2 describes the sample of EPCs from practice that we use for the analysis. Here we discuss a disaggregation by the EPC model group and by error as well as a correlation analysis between metrics and error. Based on this sample, we calculate a logistic regression model for predicting error probability with the metrics as input variables in Section 5.3. In Section 5.4, we then test the regression function for an independent sample of EPC models from textbooks as a cross-validation. Section 5.5 summarizes the findings.

  12. Logits and Tigers and Bears, Oh My! A Brief Look at the Simple Math of Logistic Regression and How It Can Improve Dissemination of Results

    ERIC Educational Resources Information Center

    Osborne, Jason W.

    2012-01-01

    Logistic regression is slowly gaining acceptance in the social sciences, and fills an important niche in the researcher's toolkit: being able to predict important outcomes that are not continuous in nature. While OLS regression is a valuable tool, it cannot routinely be used to predict outcomes that are binary or categorical in nature. These…

  13. Intermediate and advanced topics in multilevel logistic regression analysis

    PubMed Central

    Merlo, Juan

    2017-01-01

    Multilevel data occur frequently in health services, population and public health, and epidemiologic research. In such research, binary outcomes are common. Multilevel logistic regression models allow one to account for the clustering of subjects within clusters of higher‐level units when estimating the effect of subject and cluster characteristics on subject outcomes. A search of the PubMed database demonstrated that the use of multilevel or hierarchical regression models is increasing rapidly. However, our impression is that many analysts simply use multilevel regression models to account for the nuisance of within‐cluster homogeneity that is induced by clustering. In this article, we describe a suite of analyses that can complement the fitting of multilevel logistic regression models. These ancillary analyses permit analysts to estimate the marginal or population‐average effect of covariates measured at the subject and cluster level, in contrast to the within‐cluster or cluster‐specific effects arising from the original multilevel logistic regression model. We describe the interval odds ratio and the proportion of opposed odds ratios, which are summary measures of effect for cluster‐level covariates. We describe the variance partition coefficient and the median odds ratio which are measures of components of variance and heterogeneity in outcomes. These measures allow one to quantify the magnitude of the general contextual effect. We describe an R² measure that allows analysts to quantify the proportion of variation explained by different multilevel logistic regression models. We illustrate the application and interpretation of these measures by analyzing mortality in patients hospitalized with a diagnosis of acute myocardial infarction. © 2017 The Authors. Statistics in Medicine published by John Wiley & Sons Ltd. PMID:28543517

  14. Intermediate and advanced topics in multilevel logistic regression analysis.

    PubMed

    Austin, Peter C; Merlo, Juan

    2017-09-10

    Multilevel data occur frequently in health services, population and public health, and epidemiologic research. In such research, binary outcomes are common. Multilevel logistic regression models allow one to account for the clustering of subjects within clusters of higher-level units when estimating the effect of subject and cluster characteristics on subject outcomes. A search of the PubMed database demonstrated that the use of multilevel or hierarchical regression models is increasing rapidly. However, our impression is that many analysts simply use multilevel regression models to account for the nuisance of within-cluster homogeneity that is induced by clustering. In this article, we describe a suite of analyses that can complement the fitting of multilevel logistic regression models. These ancillary analyses permit analysts to estimate the marginal or population-average effect of covariates measured at the subject and cluster level, in contrast to the within-cluster or cluster-specific effects arising from the original multilevel logistic regression model. We describe the interval odds ratio and the proportion of opposed odds ratios, which are summary measures of effect for cluster-level covariates. We describe the variance partition coefficient and the median odds ratio which are measures of components of variance and heterogeneity in outcomes. These measures allow one to quantify the magnitude of the general contextual effect. We describe an R² measure that allows analysts to quantify the proportion of variation explained by different multilevel logistic regression models. We illustrate the application and interpretation of these measures by analyzing mortality in patients hospitalized with a diagnosis of acute myocardial infarction. © 2017 The Authors. Statistics in Medicine published by John Wiley & Sons Ltd.
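
    Two of the measures named in this abstract have simple closed forms for a two-level random-intercept logistic model. The sketch below uses the commonly cited latent-variable definition of the variance partition coefficient, VPC = σ² / (σ² + π²/3), and the usual expression for the median odds ratio, MOR = exp(√(2σ²) · Φ⁻¹(0.75)), where σ² is the between-cluster variance of the random intercept. These formulas are assumed here rather than quoted from the abstract, and the function names and the example variance are illustrative only.

```python
import math
from statistics import NormalDist


def variance_partition_coefficient(sigma2):
    """Latent-variable VPC: cluster variance over cluster variance plus pi^2 / 3."""
    return sigma2 / (sigma2 + math.pi ** 2 / 3)


def median_odds_ratio(sigma2):
    """MOR = exp(sqrt(2 * sigma2) * z_0.75), z_0.75 being the normal 75th percentile."""
    z75 = NormalDist().inv_cdf(0.75)
    return math.exp(math.sqrt(2.0 * sigma2) * z75)


# Hypothetical between-cluster variance, chosen only to exercise the functions.
sigma2 = 0.5
print(f"VPC = {variance_partition_coefficient(sigma2):.3f}")
print(f"MOR = {median_odds_ratio(sigma2):.3f}")
```

    A MOR of 1 corresponds to no between-cluster variation; larger values indicate more heterogeneity between clusters.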

  15. Predicting Social Trust with Binary Logistic Regression

    ERIC Educational Resources Information Center

    Adwere-Boamah, Joseph; Hufstedler, Shirley

    2015-01-01

    This study used binary logistic regression to predict social trust with five demographic variables from a national sample of adult individuals who participated in The General Social Survey (GSS) in 2012. The five predictor variables were respondents' highest degree earned, race, sex, general happiness and the importance of personally assisting…

  16. Effect of folic acid on appetite in children: ordinal logistic and fuzzy logistic regressions.

    PubMed

    Namdari, Mahshid; Abadi, Alireza; Taheri, S Mahmoud; Rezaei, Mansour; Kalantari, Naser; Omidvar, Nasrin

    2014-03-01

    Reduced appetite and low food intake are often a concern in preschool children, since it can lead to malnutrition, a leading cause of impaired growth and mortality in childhood. It is occasionally considered that folic acid has a positive effect on appetite enhancement and consequently growth in children. The aim of this study was to assess the effect of folic acid on the appetite of preschool children 3 to 6 y old. The study sample included 127 children ages 3 to 6 who were randomly selected from 20 preschools in the city of Tehran in 2011. Since appetite was measured by linguistic terms, a fuzzy logistic regression was applied for modeling. The obtained results were compared with a statistical ordinal logistic model. After controlling for the potential confounders, in a statistical ordinal logistic model, serum folate showed a significantly positive effect on appetite. A small but positive effect of folate was detected by fuzzy logistic regression. Based on fuzzy regression, the risk for poor appetite in preschool children was related to the employment status of their mothers. In this study, a positive association was detected between the levels of serum folate and improved appetite. For further investigation, a randomized controlled, double-blind clinical trial could be helpful to address causality. Copyright © 2014 Elsevier Inc. All rights reserved.

  17. Clustering performance comparison using K-means and expectation maximization algorithms.

    PubMed

    Jung, Yong Gyu; Kang, Min Soo; Heo, Jun

    2014-11-14

    Clustering is an important means of data mining based on separating data categories by similar features. Unlike the classification algorithm, clustering belongs to the unsupervised type of algorithms. Two representatives of the clustering algorithms are the K-means and the expectation maximization (EM) algorithm. Linear regression analysis was extended to the category-type dependent variable, while logistic regression was achieved using a linear combination of independent variables. To predict the possibility of occurrence of an event, a statistical approach is used. However, the classification of all data by means of logistic regression analysis cannot guarantee the accuracy of the results. In this paper, the logistic regression analysis is applied to EM clusters and the K-means clustering method for quality assessment of red wine, and a method is proposed for ensuring the accuracy of the classification results.

  18. Racial/ethnic and educational differences in the estimated odds of recent nitrite use among adult household residents in the United States: an illustration of matching and conditional logistic regression.

    PubMed

    Delva, J; Spencer, M S; Lin, J K

    2000-01-01

    This article compares estimates of the relative odds of nitrite use obtained from weighted unconditional logistic regression with estimates obtained from conditional logistic regression after post-stratification and matching of cases with controls by neighborhood of residence. We illustrate these methods by comparing the odds associated with nitrite use among adults of four racial/ethnic groups, with and without a high school education. We used aggregated data from the 1994-B through 1996 National Household Survey on Drug Abuse (NHSDA). Differences between the methods and implications for analysis and inference are discussed.

  19. [Formulation of combined predictive indicators using logistic regression model in predicting sepsis and prognosis].

    PubMed

    Duan, Liwei; Zhang, Sheng; Lin, Zhaofen

    2017-02-01

    To explore the method and performance of using multiple indices to diagnose sepsis and to predict the prognosis of severely ill patients. Critically ill patients at first admission to the intensive care unit (ICU) of Changzheng Hospital, Second Military Medical University, from January 2014 to September 2015 were enrolled if the following conditions were satisfied: (1) patients were 18-75 years old; (2) the length of ICU stay was more than 24 hours; (3) all records of the patients were available. Patient data were collected by searching the electronic medical record system. A logistic regression model was formulated to create the new combined predictive indicator, and the receiver operating characteristic (ROC) curve for the new predictive indicator was built. The areas under the ROC curve (AUC) for the new indicator and the original ones were compared. The optimal cut-off point was obtained where the Youden index reached its maximum value. Diagnostic parameters such as sensitivity, specificity and predictive accuracy were also calculated for comparison. Finally, individual values were substituted into the equation to test the performance in predicting clinical outcomes. A total of 362 patients (218 males and 144 females) were enrolled in our study and 66 patients died. The average age was (48.3±19.3) years old. (1) For the predictive model containing only categorical covariates [including procalcitonin (PCT), lipopolysaccharide (LPS), infection, white blood cell count (WBC) and fever], increased PCT, increased WBC and fever were demonstrated to be independent risk factors for sepsis in the logistic equation. The AUC for the new combined predictive indicator was higher than that of any other indicator, including PCT, LPS, infection, WBC and fever (0.930 vs. 0.661, 0.503, 0.570, 0.837, 0.800). The optimal cut-off value for the new combined predictive indicator was 0.518. 
Using the new indicator to diagnose sepsis, the sensitivity, specificity and diagnostic accuracy rate were 78.00%, 93.36% and 87.47%, respectively. One patient was randomly selected, and the clinical data were substituted into the probability equation for prediction. The calculated value was 0.015, which was less than the cut-off value (0.518), indicating a prognosis of non-sepsis at an accuracy of 87.47%. (2) For the predictive model containing only continuous covariates, a logistic model that combined the acute physiology and chronic health evaluation II (APACHE II) score and the sequential organ failure assessment (SOFA) score to predict in-hospital death events, both APACHE II score and SOFA score were independent risk factors for death. The AUC for the new predictive indicator was higher than that of APACHE II score and SOFA score (0.834 vs. 0.812, 0.813). The optimal cut-off value for the new combined predictive indicator in predicting in-hospital death events was 0.236, and the corresponding sensitivity, specificity and diagnostic accuracy for the combined predictive indicator were 73.12%, 76.51% and 75.70%, respectively. One patient was randomly selected, and the APACHE II score and SOFA score were substituted into the probability equation for prediction. The calculated value was 0.570, which was higher than the cut-off value (0.236), indicating a prognosis of death at an accuracy of 75.70%. The combined predictive indicator, which is formulated by logistic regression models, is superior to any single indicator in predicting sepsis or in-hospital death events.
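
    The cut-off selection described in this abstract (the point where the Youden index is largest) can be sketched as follows. The Youden index is J = sensitivity + specificity - 1. The function name, the convention that a score at or above the cut-off counts as positive, and the toy scores and labels are all assumptions for illustration; none of them come from the study.

```python
def youden_cutoff(scores, labels):
    """Return (cutoff, J) maximizing J = sensitivity + specificity - 1.

    A case is called positive when its score is >= cutoff.
    `labels` holds 1 for true positives (e.g. sepsis) and 0 otherwise.
    """
    positives = sum(labels)
    negatives = len(labels) - positives
    best_cutoff, best_j = None, -1.0
    for cutoff in sorted(set(scores)):
        tp = sum(1 for s, y in zip(scores, labels) if y == 1 and s >= cutoff)
        tn = sum(1 for s, y in zip(scores, labels) if y == 0 and s < cutoff)
        j = tp / positives + tn / negatives - 1.0
        if j > best_j:  # ties keep the lowest cutoff
            best_cutoff, best_j = cutoff, j
    return best_cutoff, best_j


# Toy predicted probabilities and outcomes, invented for illustration.
scores = [0.1, 0.2, 0.3, 0.6, 0.7, 0.9]
labels = [0, 0, 0, 1, 1, 1]
cutoff, j = youden_cutoff(scores, labels)
print(f"cut-off = {cutoff}, Youden J = {j:.2f}")
```

    A new patient's predicted probability would then be compared with the chosen cut-off, as the abstract does with the values 0.015 and 0.570.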

  20. Menstrual pain and risk of epithelial ovarian cancer: Results from the Ovarian Cancer Association Consortium.

    PubMed

    Babic, Ana; Harris, Holly R; Vitonis, Allison F; Titus, Linda J; Jordan, Susan J; Webb, Penelope M; Risch, Harvey A; Rossing, Mary Anne; Doherty, Jennifer A; Wicklund, Kristine; Goodman, Marc T; Modugno, Francesmary; Moysich, Kirsten B; Ness, Roberta B; Kjaer, Susanne K; Schildkraut, Joellen; Berchuck, Andrew; Pearce, Celeste L; Wu, Anna H; Cramer, Daniel W; Terry, Kathryn L

    2018-02-01

    Menstrual pain, a common gynecological condition, has been associated with increased risk of ovarian cancer in some, but not all studies. Furthermore, potential variations in the association between menstrual pain and ovarian cancer by histologic subtype have not been adequately evaluated due to lack of power. We assessed menstrual pain using either direct questions about having experienced menstrual pain, or indirect questions about menstrual pain as indication for use of hormones or medications. We used multivariate logistic regression to calculate the odds ratio (OR) for the association between severe menstrual pain and ovarian cancer, adjusting for potential confounders and multinomial logistic regression to calculate ORs for specific histologic subtypes. We observed no association between ovarian cancer and menstrual pain assessed by indirect questions. Among studies using direct question, severe pain was associated with a small but significant increase in overall risk of ovarian cancer (OR = 1.07, 95% CI: 1.01-1.13), after adjusting for endometriosis and other potential confounders. The association appeared to be more relevant for clear cell (OR = 1.48, 95% CI: 1.10-1.99) and serous borderline (OR = 1.31, 95% CI: 1.05-1.63) subtypes. In this large international pooled analysis of case-control studies, we observed a small increase in risk of ovarian cancer for women reporting severe menstrual pain. While we observed an increased ovarian cancer risk with severe menstrual pain, the possibility of recall bias and undiagnosed endometriosis cannot be excluded. Future validation in prospective studies with detailed information on endometriosis is needed. © 2017 UICC.

  1. Effectiveness of electronic stability control on single-vehicle accidents.

    PubMed

    Lyckegaard, Allan; Hels, Tove; Bernhoft, Inger Marie

    2015-01-01

    This study aims at evaluating the effectiveness of electronic stability control (ESC) on single-vehicle injury accidents while controlling for a number of confounders influencing the accident risk. Using police-registered injury accidents from 2004 to 2011 in Denmark with cars manufactured in the period 1998 to 2011 and the principle of induced exposure, 2 measures of the effectiveness of ESC were calculated: The crude odds ratio and the adjusted odds ratio, the latter by means of logistic regression. The logistic regression controlled for a number of confounding factors, of which the following were significant. For the driver: Age, gender, driving experience, valid driving license, and seat belt use. For the vehicle: Year of registration, weight, and ESC. For the accident surroundings: Visibility, light, and location. Finally, for the road: Speed limit, surface, and section characteristics. The present study calculated the crude odds ratio for ESC-equipped cars of getting in a single-vehicle injury accident as 0.40 (95% confidence interval [CI], 0.34-0.47) and the adjusted odds ratio as 0.69 (95% CI, 0.54-0.88). No difference was found in the effectiveness of ESC across the injury severity categories (slight, severe, and fatal). In line with previous results, this study concludes that ESC reduces the risk for single-vehicle injury accidents by 31% when controlling for various confounding factors related to the driver, the car, and the accident surroundings. Furthermore, it is concluded that it is important to control for human factors (at a minimum age and gender) in analyses where evaluations of this type are performed.
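
    To make the two reported figures concrete, the sketch below shows how a crude odds ratio with a Wald 95% confidence interval can be computed from a 2x2 table, and how an odds ratio translates into a percent change in odds (0.69 corresponds to the 31% reduction quoted above). The counts in the example table are hypothetical, since the abstract does not give the underlying numbers, and the function names are illustrative.

```python
import math


def odds_ratio_with_ci(a, b, c, d, z=1.96):
    """Crude odds ratio and Wald confidence interval from a 2x2 table.

    a, b: cases and controls among the exposed (e.g. ESC-equipped cars)
    c, d: cases and controls among the unexposed
    """
    odds_ratio = (a * d) / (b * c)
    se_log_or = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(odds_ratio) - z * se_log_or)
    upper = math.exp(math.log(odds_ratio) + z * se_log_or)
    return odds_ratio, lower, upper


def percent_change_in_odds(odds_ratio):
    """Percent change in odds implied by an odds ratio (negative = reduction)."""
    return (odds_ratio - 1.0) * 100.0


# Hypothetical counts, invented only to exercise the function.
print(odds_ratio_with_ci(a=10, b=20, c=20, d=10))

# Adjusted odds ratio reported in the abstract.
print(f"{percent_change_in_odds(0.69):.0f}% change in odds")
```

    The adjusted estimate in the study comes from a logistic regression with confounders, which this 2x2 calculation does not reproduce.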

  2. Calculation of Haem Iron Intake and Its Role in the Development of Iron Deficiency in Young Women from the Australian Longitudinal Study on Women’s Health

    PubMed Central

    Reeves, Angela J.; McEvoy, Mark A.; MacDonald-Wicks, Lesley K.; Barker, Daniel; Attia, John; Hodge, Allison M.; Patterson, Amanda J.

    2017-01-01

    Total iron intake is not strongly associated with iron stores, but haem iron intake may be more predictive. Haem iron is not available in most nutrient databases, so experimentally determined haem contents were applied to an Australian Food Frequency Questionnaire (FFQ) to estimate haem iron intake in a representative sample of young women (25–30 years). The association between dietary haem iron intakes and incident self-reported diagnosed iron deficiency over six years of follow-up was examined. Haem iron contents for Australian red meats, fish, and poultry were applied to haem-containing foods in the Dietary Questionnaire for Epidemiological Studies V2 (DQESv2) FFQ. Haem iron intakes were calculated for 9076 women from the Australian Longitudinal Study on Women’s Health (ALSWH) using the DQESv2 dietary data from 2003. Logistic regression was used to examine the association between haem iron intake (2003) and the incidence of iron deficiency in 2006 and 2009. Multiple logistic regression showed baseline haem iron intake was a statistically significant predictor of iron deficiency in 2006 (Odds Ratio (OR): 0.91; 95% Confidence Interval (CI): 0.84–0.99; p-value: 0.020) and 2009 (OR: 0.89; 95% CI: 0.82–0.99; p-value: 0.007). Using the energy-adjusted haem intake made little difference to the associations. Higher haem iron intake is associated with reduced odds of iron deficiency developing in young adult Australian women. PMID:28534830

  3. [Risk factor analysis of the patients with solitary pulmonary nodules and establishment of a prediction model for the probability of malignancy].

    PubMed

    Wang, X; Xu, Y H; Du, Z Y; Qian, Y J; Xu, Z H; Chen, R; Shi, M H

    2018-02-23

    Objective: This study aims to analyze the relationship among the clinical features, radiologic characteristics and pathological diagnosis in patients with solitary pulmonary nodules (SPNs), and to establish a prediction model for the probability of malignancy. Methods: Clinical data of 372 patients with solitary pulmonary nodules who underwent surgical resection with definite postoperative pathological diagnosis were retrospectively analyzed. In these cases, we collected clinical and radiologic features including gender, age, smoking history, history of tumor, family history of cancer, the location of the lesion, ground-glass opacity, maximum diameter, calcification, vessel convergence sign, vacuole sign, pleural indentation, spiculation and lobulation. The cases were divided into a modeling group (268 cases) and a validation group (104 cases). A new prediction model was established by logistic regression analysis of the data from the modeling group. The data of the validation group were then used to validate the new model and to compare it with three classical models (Mayo model, VA model and LiYun model). With the calculated probability values for each model from the validation group, SPSS 22.0 was used to draw the receiver operating characteristic curve to assess the predictive value of this new model. Results: 112 benign SPNs and 156 malignant SPNs were included in the modeling group. Multivariable logistic regression analysis showed that gender, age, history of tumor, ground-glass opacity, maximum diameter, and spiculation were independent predictors of malignancy in patients with SPN (P < 0.05). We calculated a prediction model for the probability of malignancy as follows: p = e^x / (1 + e^x), x = -4.8029 - 0.743 × gender + 0.057 × age + 1.306 × history of tumor + 1.305 × ground-glass opacity + 0.051 × maximum diameter + 1.043 × spiculation. 
When the data of validation group was added to the four-mathematical prediction model, The area under the curve of our mathematical prediction model was 0.742, which is greater than other models (Mayo 0.696, VA 0.634, LiYun 0.681), while the differences between any two of the four models were not significant ( P >0.05). Conclusions: Age of patient, gender, history of tumor, ground-glass opacity, maximum diameter and speculation are independent predictors of malignancy in patients with solitary pulmonary nodule. This logistic regression prediction mathematic model is not inferior to those classical models in estimating the prognosis of SPNs.
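
    The reported formula can be evaluated directly. The sketch below is only an illustration: the coefficients are copied from the abstract, but the abstract does not say how the predictors are coded, so the 0/1 codings, the age in years, the diameter in millimetres, and the example patient are all assumptions.

```python
import math

# Coefficients as printed in the abstract's formula for x.
INTERCEPT = -4.8029
COEFFICIENTS = {
    "gender": -0.743,               # coding (which sex is 1) is NOT given; assumed 0/1
    "age": 0.057,                   # assumed to be in years
    "history_of_tumor": 1.306,      # assumed 0 = no, 1 = yes
    "ground_glass_opacity": 1.305,  # assumed 0 = no, 1 = yes
    "max_diameter": 0.051,          # assumed to be in millimetres
    "spiculation": 1.043,           # the 1.043 term in the formula; assumed 0/1
}


def malignancy_probability(features):
    """Return p = e^x / (1 + e^x) for a dict of predictor values."""
    x = INTERCEPT + sum(COEFFICIENTS[name] * features[name] for name in COEFFICIENTS)
    # 1 / (1 + e^-x) is algebraically the same as e^x / (1 + e^x).
    return 1.0 / (1.0 + math.exp(-x))


# Hypothetical patient, for illustration only (not taken from the study).
example = {
    "gender": 1,
    "age": 60,
    "history_of_tumor": 0,
    "ground_glass_opacity": 1,
    "max_diameter": 20,
    "spiculation": 1,
}
p = malignancy_probability(example)
print(f"estimated probability of malignancy: {p:.3f}")
```

    Because every coefficient except the one for gender is positive, the estimated probability rises with age, diameter, and each of the binary findings, under whatever coding the authors actually used.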

  4. Regression trees for predicting mortality in patients with cardiovascular disease: What improvement is achieved by using ensemble-based methods?

    PubMed Central

    Austin, Peter C; Lee, Douglas S; Steyerberg, Ewout W; Tu, Jack V

    2012-01-01

    In biomedical research, the logistic regression model is the most commonly used method for predicting the probability of a binary outcome. While many clinical researchers have expressed an enthusiasm for regression trees, this method may have limited accuracy for predicting health outcomes. We aimed to evaluate the improvement that is achieved by using ensemble-based methods, including bootstrap aggregation (bagging) of regression trees, random forests, and boosted regression trees. We analyzed 30-day mortality in two large cohorts of patients hospitalized with either acute myocardial infarction (N = 16,230) or congestive heart failure (N = 15,848) in two distinct eras (1999–2001 and 2004–2005). We found that both the in-sample and out-of-sample prediction of ensemble methods offered substantial improvement in predicting cardiovascular mortality compared to conventional regression trees. However, conventional logistic regression models that incorporated restricted cubic smoothing splines had even better performance. We conclude that ensemble methods from the data mining and machine learning literature increase the predictive performance of regression trees, but may not lead to clear advantages over conventional logistic regression models for predicting short-term mortality in population-based samples of subjects with cardiovascular disease. PMID:22777999

  5. Strategies for Testing Statistical and Practical Significance in Detecting DIF with Logistic Regression Models

    ERIC Educational Resources Information Center

    Fidalgo, Angel M.; Alavi, Seyed Mohammad; Amirian, Seyed Mohammad Reza

    2014-01-01

    This study examines three controversial aspects in differential item functioning (DIF) detection by logistic regression (LR) models: first, the relative effectiveness of different analytical strategies for detecting DIF; second, the suitability of the Wald statistic for determining the statistical significance of the parameters of interest; and…

  6. Iterative Purification and Effect Size Use with Logistic Regression for Differential Item Functioning Detection

    ERIC Educational Resources Information Center

    French, Brian F.; Maller, Susan J.

    2007-01-01

    Two unresolved implementation issues with logistic regression (LR) for differential item functioning (DIF) detection include ability purification and effect size use. Purification is suggested to control inaccuracies in DIF detection as a result of DIF items in the ability estimate. Additionally, effect size use may be beneficial in controlling…

  7. A Note on Three Statistical Tests in the Logistic Regression DIF Procedure

    ERIC Educational Resources Information Center

    Paek, Insu

    2012-01-01

    Although logistic regression became one of the well-known methods in detecting differential item functioning (DIF), its three statistical tests, the Wald, likelihood ratio (LR), and score tests, which are readily available under the maximum likelihood, do not seem to be consistently distinguished in DIF literature. This paper provides a clarifying…

  8. "Let Me Count the Ways:" Fostering Reasons for Living among Low-Income, Suicidal, African American Women

    ERIC Educational Resources Information Center

    West, Lindsey M.; Davis, Telsie A.; Thompson, Martie P.; Kaslow, Nadine J.

    2011-01-01

    Protective factors for fostering reasons for living were examined among low-income, suicidal, African American women. Bivariate logistic regressions revealed that higher levels of optimism, spiritual well-being, and family social support predicted reasons for living. Multivariate logistic regressions indicated that spiritual well-being showed…

  9. Comparison of Two Approaches for Handling Missing Covariates in Logistic Regression

    ERIC Educational Resources Information Center

    Peng, Chao-Ying Joanne; Zhu, Jin

    2008-01-01

    For the past 25 years, methodological advances have been made in missing data treatment. Most published work has focused on missing data in dependent variables under various conditions. The present study seeks to fill the void by comparing two approaches for handling missing data in categorical covariates in logistic regression: the…

  10. Comparison of IRT Likelihood Ratio Test and Logistic Regression DIF Detection Procedures

    ERIC Educational Resources Information Center

    Atar, Burcu; Kamata, Akihito

    2011-01-01

    The Type I error rates and the power of IRT likelihood ratio test and cumulative logit ordinal logistic regression procedures in detecting differential item functioning (DIF) for polytomously scored items were investigated in this Monte Carlo simulation study. For this purpose, 54 simulation conditions (combinations of 3 sample sizes, 2 sample…

  11. Multiple Logistic Regression Analysis of Cigarette Use among High School Students

    ERIC Educational Resources Information Center

    Adwere-Boamah, Joseph

    2011-01-01

    A binary logistic regression analysis was performed to predict high school students' cigarette smoking behavior from selected predictors from 2009 CDC Youth Risk Behavior Surveillance Survey. The specific target student behavior of interest was frequent cigarette use. Five predictor variables included in the model were: a) race, b) frequency of…

  12. Modeling Polytomous Item Responses Using Simultaneously Estimated Multinomial Logistic Regression Models

    ERIC Educational Resources Information Center

    Anderson, Carolyn J.; Verkuilen, Jay; Peyton, Buddy L.

    2010-01-01

    Survey items with multiple response categories and multiple-choice test questions are ubiquitous in psychological and educational research. We illustrate the use of log-multiplicative association (LMA) models that are extensions of the well-known multinomial logistic regression model for multiple dependent outcome variables to reanalyze a set of…

  13. Propensity Score Estimation with Data Mining Techniques: Alternatives to Logistic Regression

    ERIC Educational Resources Information Center

    Keller, Bryan S. B.; Kim, Jee-Seon; Steiner, Peter M.

    2013-01-01

    Propensity score analysis (PSA) is a methodological technique which may correct for selection bias in a quasi-experiment by modeling the selection process using observed covariates. Because logistic regression is well understood by researchers in a variety of fields and easy to implement in a number of popular software packages, it has…

  14. Two-factor logistic regression in pediatric liver transplantation

    NASA Astrophysics Data System (ADS)

    Uzunova, Yordanka; Prodanova, Krasimira; Spasov, Lyubomir

    2017-12-01

    Using a two-factor logistic regression analysis an estimate is derived for the probability of absence of infections in the early postoperative period after pediatric liver transplantation. The influence of both the bilirubin level and the international normalized ratio of prothrombin time of blood coagulation at the 5th postoperative day is studied.

  15. Predictors of Placement Stability at the State Level: The Use of Logistic Regression to Inform Practice

    ERIC Educational Resources Information Center

    Courtney, Jon R.; Prophet, Retta

    2011-01-01

    Placement instability is often associated with a number of negative outcomes for children. To gain state level contextual knowledge of factors associated with placement stability/instability, logistic regression was applied to selected variables from the New Mexico Adoption and Foster Care Administrative Reporting System dataset. Predictors…

  16. Classifying machinery condition using oil samples and binary logistic regression

    NASA Astrophysics Data System (ADS)

    Phillips, J.; Cripps, E.; Lau, John W.; Hodkiewicz, M. R.

    2015-08-01

    The era of big data has resulted in an explosion of condition monitoring information. The result is an increasing motivation to automate the costly and time-consuming human elements involved in the classification of machine health. When working with industry it is important to build an understanding of, and hence some trust in, the classification scheme for those who use the analysis to initiate maintenance tasks. Typically, "black box" approaches such as artificial neural networks (ANN) and support vector machines (SVM) are difficult to interpret. In contrast, this paper argues that logistic regression offers easy interpretability to industry experts, providing insight into the drivers of the human classification process and the ramifications of potential misclassification. Of course, accuracy is of foremost importance in any automated classification scheme, so we also provide a comparative study based on predictive performance of logistic regression, ANN and SVM. A real-world oil analysis data set from engines on mining trucks is presented and, using cross-validation, we demonstrate that logistic regression out-performs the ANN and SVM approaches in terms of prediction for healthy/not healthy engines.

  17. Length bias correction in gene ontology enrichment analysis using logistic regression.

    PubMed

    Mi, Gu; Di, Yanming; Emerson, Sarah; Cumbie, Jason S; Chang, Jeff H

    2012-01-01

    When assessing differential gene expression from RNA sequencing data, commonly used statistical tests tend to have greater power to detect differential expression of genes encoding longer transcripts. This phenomenon, called "length bias", will influence subsequent analyses such as Gene Ontology enrichment analysis. In the presence of length bias, Gene Ontology categories that include longer genes are more likely to be identified as enriched. These categories, however, are not necessarily biologically more relevant. We show that one can effectively adjust for length bias in Gene Ontology analysis by including transcript length as a covariate in a logistic regression model. The logistic regression model makes the statistical issue underlying length bias more transparent: transcript length becomes a confounding factor when it correlates with both the Gene Ontology membership and the significance of the differential expression test. The inclusion of the transcript length as a covariate allows one to investigate the direct correlation between the Gene Ontology membership and the significance of testing differential expression, conditional on the transcript length. We present both real and simulated data examples to show that the logistic regression approach is simple, effective, and flexible.

  18. Matched samples logistic regression in case-control studies with missing values: when to break the matches.

    PubMed

    Hansson, Lisbeth; Khamis, Harry J

    2008-12-01

    Simulated data sets are used to evaluate conditional and unconditional maximum likelihood estimation in an individual case-control design with continuous covariates when there are different rates of excluded cases and different levels of other design parameters. The effectiveness of the estimation procedures is measured by method bias, variance of the estimators, root mean square error (RMSE) for logistic regression and the percentage of explained variation. Conditional estimation leads to higher RMSE than unconditional estimation in the presence of missing observations, especially for 1:1 matching. The RMSE is higher for the smaller stratum size, especially for the 1:1 matching. The percentage of explained variation appears to be insensitive to missing data, but is generally higher for the conditional estimation than for the unconditional estimation. It is particularly good for the 1:2 matching design. For minimizing RMSE, a high matching ratio is recommended; in this case, conditional and unconditional logistic regression models yield comparable levels of effectiveness. For maximizing the percentage of explained variation, the 1:2 matching design with the conditional logistic regression model is recommended.

  19. Label-noise resistant logistic regression for functional data classification with an application to Alzheimer's disease study.

    PubMed

    Lee, Seokho; Shin, Hyejin; Lee, Sang Han

    2016-12-01

    Alzheimer's disease (AD) is usually diagnosed by clinicians through cognitive and functional performance tests, with a potential risk of misdiagnosis. Since the progression of AD is known to cause structural changes in the corpus callosum (CC), the CC thickness can be used as a functional covariate in the AD classification problem for a diagnosis. However, misclassified class labels negatively impact the classification performance. Motivated by AD-CC association studies, we propose a logistic regression for functional data classification that is robust to misdiagnosis or label noise. Specifically, our logistic regression model is constructed by adding individual intercepts to the functional logistic regression model. This approach makes it possible to indicate which observations are possibly mislabeled and also leads to a robust and efficient classifier. An effective algorithm based on the MM algorithm provides simple closed-form update formulas. We test our method using synthetic datasets to demonstrate its superiority over an existing method, and apply it to differentiating patients with AD from healthy normals based on CC from MRI. © 2016, The International Biometric Society.

  20. A VARI-Based Relative Greenness from MODIS Data for Computing the Fire Potential Index

    NASA Technical Reports Server (NTRS)

    Schneider, P.; Roberts, D. A.; Kyriakidis, P. C.

    2008-01-01

    The Fire Potential Index (FPI) relies on relative greenness (RG) estimates from remote sensing data. The Normalized Difference Vegetation Index (NDVI), derived from NOAA Advanced Very High Resolution Radiometer (AVHRR) imagery, is currently used to calculate RG operationally. Here we evaluated an alternate measure of RG using the Visible Atmospheric Resistant Index (VARI) derived from Moderate Resolution Imaging Spectroradiometer (MODIS) data. VARI was chosen because it has previously been shown to have the strongest relationship with Live Fuel Moisture (LFM) out of a wide selection of MODIS-derived indices in southern California shrublands. To compare MODIS-based NDVI-FPI and VARI-FPI, RG was calculated from a 6-year time series of MODIS composites and validated against in-situ observations of LFM as a surrogate for vegetation greenness. RG from both indices was then compared in terms of its performance for computing the FPI using historical wildfire data. Computed RG values were regressed against ground-sampled LFM at 14 sites within Los Angeles County. The results indicate that the VARI-based RG consistently shows a stronger relationship with observed LFM than NDVI-based RG. With an average R² of 0.727 compared to a value of only 0.622 for NDVI-RG, VARI-RG showed stronger relationships at 13 out of 14 sites. Based on these results, daily FPI maps were computed for the years 2001 through 2005 using both NDVI-RG and VARI-RG. These were then validated against 12,490 fire detections from the MODIS active fire product using logistic regression. Deviance of the logistic regression model was 408.8 for NDVI-FPI and 176.2 for VARI-FPI. The c-index was found to be 0.69 and 0.78, respectively. The results show that VARI-FPI outperforms NDVI-FPI in distinguishing between fire and no-fire events for historical wildfire data in southern California for the given time period.
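
    For readers unfamiliar with the c-index quoted above, the following sketch shows how it is commonly computed for a binary outcome: the fraction of (event, non-event) pairs in which the event received the higher predicted score, with ties counted as one half. This is the usual textbook definition (equivalent to the area under the ROC curve), not code from the study; the scores and labels are hypothetical.

```python
def c_index(scores, labels):
    """Concordance index for a binary outcome.

    Counts, over all (event, non-event) pairs, how often the event has the
    higher score; tied scores contribute 0.5.
    """
    events = [s for s, y in zip(scores, labels) if y == 1]
    non_events = [s for s, y in zip(scores, labels) if y == 0]
    if not events or not non_events:
        raise ValueError("need at least one event and one non-event")

    concordant = 0.0
    for e in events:
        for n in non_events:
            if e > n:
                concordant += 1.0
            elif e == n:
                concordant += 0.5
    return concordant / (len(events) * len(non_events))


# Hypothetical predicted scores with fire (1) / no-fire (0) labels.
scores = [0.9, 0.8, 0.3, 0.2]
labels = [1, 0, 1, 0]
print(c_index(scores, labels))
```

    A value of 0.5 corresponds to a model that ranks events no better than chance, and 1.0 to perfect separation, which is why 0.78 is read as better discrimination than 0.69.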

  1. Logistic regression for circular data

    NASA Astrophysics Data System (ADS)

    Al-Daffaie, Kadhem; Khan, Shahjahan

    2017-05-01

    This paper considers the relationship between a binary response and a circular predictor. It develops the logistic regression model by employing the linear-circular regression approach. The maximum likelihood method is used to estimate the parameters. The Newton-Raphson numerical method is used to find the estimated values of the parameters. A data set from weather records of Toowoomba city is analysed by the proposed methods. Moreover, a simulation study is considered. The R software is used for all computations and simulations.
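
    As a rough illustration of the approach described above, the sketch below fits a logistic regression with a circular predictor by Newton-Raphson. It assumes the common linear-circular form logit(p) = b0 + b1 cos(theta) + b2 sin(theta); the abstract does not give the authors' exact parameterization, and the simulated data and all function names here are hypothetical. (The authors used R; Python is used here only for illustration.)

```python
import math
import random


def design_row(theta):
    # Assumed linear-circular embedding: intercept, cos(theta), sin(theta).
    return [1.0, math.cos(theta), math.sin(theta)]


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def solve(a, b):
    """Solve the small linear system a x = b by Gaussian elimination."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = m[i][n] - sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = s / m[i][i]
    return x


def score_and_information(beta, rows, ys):
    """Gradient of the log-likelihood and the observed information matrix."""
    k = len(beta)
    grad = [0.0] * k
    info = [[0.0] * k for _ in range(k)]
    for x, y in zip(rows, ys):
        p = sigmoid(sum(b * xi for b, xi in zip(beta, x)))
        w = p * (1.0 - p)
        for i in range(k):
            grad[i] += (y - p) * x[i]
            for j in range(k):
                info[i][j] += w * x[i] * x[j]
    return grad, info


def fit_newton_raphson(thetas, ys, iterations=25, tol=1e-10):
    """Maximum likelihood estimates via beta <- beta + info^-1 grad."""
    rows = [design_row(t) for t in thetas]
    beta = [0.0, 0.0, 0.0]
    for _ in range(iterations):
        grad, info = score_and_information(beta, rows, ys)
        step = solve(info, grad)
        beta = [b + s for b, s in zip(beta, step)]
        if max(abs(s) for s in step) < tol:
            break
    return beta


# Simulated data (hypothetical): angles in radians and a binary response.
random.seed(0)
true_beta = [-0.5, 1.0, 0.8]
thetas = [random.uniform(0.0, 2.0 * math.pi) for _ in range(2000)]
ys = [
    1 if random.random() < sigmoid(sum(b * x for b, x in zip(true_beta, design_row(t)))) else 0
    for t in thetas
]
beta_hat = fit_newton_raphson(thetas, ys)
print([round(b, 3) for b in beta_hat])
```

    Using cos and sin of the angle keeps the fitted probability periodic, so angles of 0 and 2π give the same prediction, which a plain linear term in the angle would not.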

  2. Naval Research Logistics Quarterly. Volume 28. Number 3,

    DTIC Science & Technology

    1981-09-01

    denotes component-wise maximum. f has antitone (isotone) differences on C x D if for c1 < c2 and d1 < d2, NAVAL RESEARCH LOGISTICS QUARTERLY VOL. 28...or negative correlations and linear or nonlinear regressions. Given are the moments to order two and, for special cases, the regression function and...data sets. We designate this bnb distribution as G - B - N(a, 0, v). The distribution admits only of positive correlation and linear regressions

  3. Regression approaches in the test-negative study design for assessment of influenza vaccine effectiveness.

    PubMed

    Bond, H S; Sullivan, S G; Cowling, B J

    2016-06-01

    Influenza vaccination is the most practical means available for preventing influenza virus infection and is widely used in many countries. Because vaccine components and circulating strains frequently change, it is important to continually monitor vaccine effectiveness (VE). The test-negative design is frequently used to estimate VE. In this design, patients meeting the same clinical case definition are recruited and tested for influenza; those who test positive are the cases and those who test negative form the comparison group. When determining VE in these studies, the typical approach has been to use logistic regression, adjusting for potential confounders. Because vaccine coverage and influenza incidence change throughout the season, time is included among these confounders. While most studies use unconditional logistic regression, adjusting for time, an alternative approach is to use conditional logistic regression, matching on time. Here, we used simulation data to examine the potential for both regression approaches to permit accurate and robust estimates of VE. In situations where vaccine coverage changed during the influenza season, the conditional model and unconditional models adjusting for categorical week and using a spline function for week provided more accurate estimates. We illustrated the two approaches on data from a test-negative study of influenza VE against hospitalization in children in Hong Kong which resulted in the conditional logistic regression model providing the best fit to the data.
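
    In test-negative studies, VE is conventionally reported as one minus the adjusted odds ratio for vaccination. The abstract does not spell this out, so the sketch below shows that standard conversion under stated assumptions: the log odds ratio and its standard error are hypothetical inputs that would come from whichever logistic model (conditional or unconditional) was fitted, and the interval is a simple Wald interval.

```python
import math


def vaccine_effectiveness(log_or, se, z=1.96):
    """Convert a vaccination log odds ratio into VE = 1 - OR with a Wald interval.

    log_or: coefficient of the vaccination term from a logistic model.
    se:     its standard error.
    """
    odds_ratio = math.exp(log_or)
    or_low = math.exp(log_or - z * se)
    or_high = math.exp(log_or + z * se)
    # A higher odds ratio means lower effectiveness, so the bounds swap.
    return {
        "ve": 1.0 - odds_ratio,
        "lower": 1.0 - or_high,
        "upper": 1.0 - or_low,
    }


# Hypothetical estimate: odds ratio 0.4 with standard error 0.2 on the log scale.
result = vaccine_effectiveness(math.log(0.4), 0.2)
print({k: round(v, 3) for k, v in result.items()})
```

    The conversion is the same for both regression approaches compared in the study; they differ only in how the odds ratio itself is estimated.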

  4. Study on a pattern classification method of soil quality based on simplified learning sample dataset

    USGS Publications Warehouse

    Zhang, Jiahua; Liu, S.; Hu, Y.; Tian, Y.

    2011-01-01

    Based on the massive soil information used in current soil quality grade evaluation, this paper constructs an intelligent classification approach for soil quality grade that relies on classical sampling techniques and a disordered multiclassification logistic regression model. As a case study, the learning sample capacity was determined under a given confidence level and estimation accuracy, and the c-means algorithm was used to automatically extract a simplified learning sample dataset from the cultivated soil quality grade evaluation database for the study area, Long chuan county in Guangdong province; a disordered logistic classifier model was then built and the calculation and analysis steps of intelligent soil quality grade classification were given. The results indicated that the soil quality grade can be effectively learned and predicted from the extracted simplified dataset by this method, which changes the traditional method of soil quality grade evaluation. © 2011 IEEE.

  5. A novel hybrid method of beta-turn identification in protein using binary logistic regression and neural network

    PubMed Central

    Asghari, Mehdi Poursheikhali; Hayatshahi, Sayyed Hamed Sadat; Abdolmaleki, Parviz

    2012-01-01

    From both the structural and functional points of view, β-turns play important biological roles in proteins. In the present study, a novel two-stage hybrid procedure has been developed to identify β-turns in proteins. Binary logistic regression was initially used for the first time to select significant sequence parameters in identification of β-turns due to a re-substitution test procedure. The sequence parameters consisted of 80 amino acid positional occurrences and 20 amino acid percentages in sequence. Among these parameters, the most significant ones selected by the binary logistic regression model were the percentages of Gly and Ser and the occurrence of Asn in position i+2, respectively, in sequence. These significant parameters have the highest effect on the constitution of a β-turn sequence. A neural network model was then constructed and fed with the parameters selected by binary logistic regression to build a hybrid predictor. The networks have been trained and tested on a non-homologous dataset of 565 protein chains. Applying a nine-fold cross-validation test on the dataset, the network reached an overall accuracy (Qtotal) of 74, which is comparable with results of the other β-turn prediction methods. In conclusion, this study proves that the parameter selection ability of binary logistic regression together with the prediction capability of neural networks leads to the development of more precise models for identifying β-turns in proteins. PMID:27418910

  6. A novel hybrid method of beta-turn identification in protein using binary logistic regression and neural network.

    PubMed

    Asghari, Mehdi Poursheikhali; Hayatshahi, Sayyed Hamed Sadat; Abdolmaleki, Parviz

    2012-01-01

    From both the structural and functional points of view, β-turns play important biological roles in proteins. In the present study, a novel two-stage hybrid procedure has been developed to identify β-turns in proteins. Binary logistic regression was initially used for the first time to select significant sequence parameters in identification of β-turns due to a re-substitution test procedure. The sequence parameters consisted of 80 amino acid positional occurrences and 20 amino acid percentages in sequence. Among these parameters, the most significant ones selected by the binary logistic regression model were the percentages of Gly and Ser and the occurrence of Asn in position i+2, respectively, in sequence. These significant parameters have the highest effect on the constitution of a β-turn sequence. A neural network model was then constructed and fed with the parameters selected by binary logistic regression to build a hybrid predictor. The networks have been trained and tested on a non-homologous dataset of 565 protein chains. Applying a nine-fold cross-validation test on the dataset, the network reached an overall accuracy (Qtotal) of 74, which is comparable with results of the other β-turn prediction methods. In conclusion, this study proves that the parameter selection ability of binary logistic regression together with the prediction capability of neural networks leads to the development of more precise models for identifying β-turns in proteins.

  7. Differential item functioning analysis with ordinal logistic regression techniques. DIFdetect and difwithpar.

    PubMed

    Crane, Paul K; Gibbons, Laura E; Jolley, Lance; van Belle, Gerald

    2006-11-01

    We present an ordinal logistic regression model for identification of items with differential item functioning (DIF) and apply this model to a Mini-Mental State Examination (MMSE) dataset. We employ item response theory ability estimation in our models. Three nested ordinal logistic regression models are applied to each item. Model testing begins with examination of the statistical significance of the interaction term between ability and the group indicator, consistent with nonuniform DIF. Then we turn our attention to the coefficient of the ability term in models with and without the group term. If including the group term has a marked effect on that coefficient, we declare that it has uniform DIF. We examined DIF related to language of test administration in addition to self-reported race, Hispanic ethnicity, age, years of education, and sex. We used PARSCALE for IRT analyses and STATA for ordinal logistic regression approaches. We used an iterative technique for adjusting IRT ability estimates on the basis of DIF findings. Five items were found to have DIF related to language. These same items also had DIF related to other covariates. The ordinal logistic regression approach to DIF detection, when combined with IRT ability estimates, provides a reasonable alternative for DIF detection. There appear to be several items with significant DIF related to language of test administration in the MMSE. More attention needs to be paid to the specific criteria used to determine whether an item has DIF, not just the technique used to identify DIF.

  8. Conditional Poisson models: a flexible alternative to conditional logistic case cross-over analysis.

    PubMed

    Armstrong, Ben G; Gasparrini, Antonio; Tobias, Aurelio

    2014-11-24

    The time stratified case cross-over approach is a popular alternative to conventional time series regression for analysing associations between time series of environmental exposures (air pollution, weather) and counts of health outcomes. These are almost always analyzed using conditional logistic regression on data expanded to case-control (case crossover) format, but this has some limitations. In particular adjusting for overdispersion and auto-correlation in the counts is not possible. It has been established that a Poisson model for counts with stratum indicators gives identical estimates to those from conditional logistic regression and does not have these limitations, but it is little used, probably because of the overheads in estimating many stratum parameters. The conditional Poisson model avoids estimating stratum parameters by conditioning on the total event count in each stratum, thus simplifying the computing and increasing the number of strata for which fitting is feasible compared with the standard unconditional Poisson model. Unlike the conditional logistic model, the conditional Poisson model does not require expanding the data, and can adjust for overdispersion and auto-correlation. It is available in Stata, R, and other packages. By applying to some real data and using simulations, we demonstrate that conditional Poisson models were simpler to code and shorter to run than are conditional logistic analyses and can be fitted to larger data sets than possible with standard Poisson models. Allowing for overdispersion or autocorrelation was possible with the conditional Poisson model but when not required this model gave identical estimates to those from conditional logistic regression. Conditional Poisson regression models provide an alternative to case crossover analysis of stratified time series data with some advantages. 
The conditional Poisson model can also be used in other contexts in which primary control for confounding is by fine stratification.

  9. Use of generalized ordered logistic regression for the analysis of multidrug resistance data.

    PubMed

    Agga, Getahun E; Scott, H Morgan

    2015-10-01

    Statistical analysis of antimicrobial resistance data largely focuses on individual antimicrobial's binary outcome (susceptible or resistant). However, bacteria are becoming increasingly multidrug resistant (MDR). Statistical analysis of MDR data is mostly descriptive often with tabular or graphical presentations. Here we report the applicability of generalized ordinal logistic regression model for the analysis of MDR data. A total of 1,152 Escherichia coli, isolated from the feces of weaned pigs experimentally supplemented with chlortetracycline (CTC) and copper, were tested for susceptibilities against 15 antimicrobials and were binary classified into resistant or susceptible. The 15 antimicrobial agents tested were grouped into eight different antimicrobial classes. We defined MDR as the number of antimicrobial classes to which E. coli isolates were resistant ranging from 0 to 8. Proportionality of the odds assumption of the ordinal logistic regression model was violated only for the effect of treatment period (pre-treatment, during-treatment and post-treatment); but not for the effect of CTC or copper supplementation. Subsequently, a partially constrained generalized ordinal logistic model was built that allows for the effect of treatment period to vary while constraining the effects of treatment (CTC and copper supplementation) to be constant across the levels of MDR classes. Copper (Proportional Odds Ratio [Prop OR]=1.03; 95% CI=0.73-1.47) and CTC (Prop OR=1.1; 95% CI=0.78-1.56) supplementation were not significantly associated with the level of MDR adjusted for the effect of treatment period. MDR generally declined over the trial period. In conclusion, generalized ordered logistic regression can be used for the analysis of ordinal data such as MDR data when the proportionality assumptions for ordered logistic regression are violated. Published by Elsevier B.V.

  10. Artificial neural networks predict the incidence of portosplenomesenteric venous thrombosis in patients with acute pancreatitis.

    PubMed

    Fei, Y; Hu, J; Li, W-Q; Wang, W; Zong, G-Q

    2017-03-01

    Essentials Predicting the occurrence of portosplenomesenteric vein thrombosis (PSMVT) is difficult. We studied 72 patients with acute pancreatitis. Artificial neural networks modeling was more accurate than logistic regression in predicting PSMVT. Additional predictive factors may be incorporated into artificial neural networks. Objective To construct and validate artificial neural networks (ANNs) for predicting the occurrence of portosplenomesenteric venous thrombosis (PSMVT) and compare the predictive ability of the ANNs with that of logistic regression. Methods The ANNs and logistic regression modeling were constructed using simple clinical and laboratory data of 72 acute pancreatitis (AP) patients. The ANNs and logistic modeling were first trained on 48 randomly chosen patients and validated on the remaining 24 patients. The accuracy and the performance characteristics were compared between these two approaches by SPSS17.0 software. Results The training set and validation set did not differ on any of the 11 variables. After training, the back propagation network training error converged to 1 × 10 -20 , and it retained excellent pattern recognition ability. When the ANNs model was applied to the validation set, it revealed a sensitivity of 80%, specificity of 85.7%, a positive predictive value of 77.6% and negative predictive value of 90.7%. The accuracy was 83.3%. Differences could be found between ANNs modeling and logistic regression modeling in these parameters (10.0% [95% CI, -14.3 to 34.3%], 14.3% [95% CI, -8.6 to 37.2%], 15.7% [95% CI, -9.9 to 41.3%], 11.8% [95% CI, -8.2 to 31.8%], 22.6% [95% CI, -1.9 to 47.1%], respectively). When ANNs modeling was used to identify PSMVT, the area under receiver operating characteristic curve was 0.849 (95% CI, 0.807-0.901), which demonstrated better overall properties than logistic regression modeling (AUC = 0.716) (95% CI, 0.679-0.761). 
Conclusions ANNs modeling was a more accurate tool than logistic regression in predicting the occurrence of PSMVT following AP. More clinical factors or biomarkers may be incorporated into ANNs modeling to improve its predictive ability. © 2016 International Society on Thrombosis and Haemostasis.
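
    The performance measures quoted in this abstract (sensitivity, specificity, predictive values, accuracy) all derive from a 2 × 2 confusion matrix. The sketch below shows the arithmetic only; the counts are hypothetical values for a 24-patient validation set and are not taken from the study.

```python
def diagnostic_metrics(tp, fp, tn, fn):
    """Return standard performance measures from confusion-matrix counts."""
    return {
        "sensitivity": tp / (tp + fn),   # true positives among all diseased
        "specificity": tn / (tn + fp),   # true negatives among all healthy
        "ppv": tp / (tp + fp),           # positive predictive value
        "npv": tn / (tn + fn),           # negative predictive value
        "accuracy": (tp + tn) / (tp + fp + tn + fn),
    }


# Hypothetical counts (assumption for illustration, not the study's data).
metrics = diagnostic_metrics(tp=8, fp=2, tn=12, fn=2)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```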

  11. PREDICTION OF MALIGNANT BREAST LESIONS FROM MRI FEATURES: A COMPARISON OF ARTIFICIAL NEURAL NETWORK AND LOGISTIC REGRESSION TECHNIQUES

    PubMed Central

    McLaren, Christine E.; Chen, Wen-Pin; Nie, Ke; Su, Min-Ying

    2009-01-01

    Rationale and Objectives Dynamic contrast enhanced MRI (DCE-MRI) is a clinical imaging modality for detection and diagnosis of breast lesions. Analytical methods were compared for diagnostic feature selection and performance of lesion classification to differentiate between malignant and benign lesions in patients. Materials and Methods The study included 43 malignant and 28 benign histologically-proven lesions. Eight morphological parameters, ten gray level co-occurrence matrices (GLCM) texture features, and fourteen Laws’ texture features were obtained using automated lesion segmentation and quantitative feature extraction. Artificial neural network (ANN) and logistic regression analysis were compared for selection of the best predictors of malignant lesions among the normalized features. Results Using ANN, the final four selected features were compactness, energy, homogeneity, and Law_LS, with area under the receiver operating characteristic curve (AUC) = 0.82, and accuracy = 0.76. The diagnostic performance of these four features computed on the basis of logistic regression yielded AUC = 0.80 (95% CI, 0.688 to 0.905), similar to that of ANN. The analysis also shows that the odds of a malignant lesion decreased by 48% (95% CI, 25% to 92%) for every increase of 1 SD in the Law_LS feature, adjusted for differences in compactness, energy, and homogeneity. Using logistic regression with z-score transformation, a model comprising compactness, NRL entropy, and gray level sum average was selected, and it had the highest overall accuracy of 0.75 among all models, with AUC = 0.77 (95% CI, 0.660 to 0.880). When logistic modeling of transformations using the Box-Cox method was performed, the most parsimonious model, with predictors compactness and Law_LS, had an AUC of 0.79 (95% CI, 0.672 to 0.898). Conclusion The diagnostic performance of models selected by ANN and logistic regression was similar.
The analytic methods were found to be roughly equivalent in terms of predictive ability when a small number of variables were chosen. The robust ANN methodology utilizes a sophisticated non-linear model, while logistic regression analysis provides insightful information to enhance interpretation of the model features. PMID:19409817
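
    A statement such as "the odds decreased by 48% per 1 SD increase" is the exponentiated logistic regression coefficient expressed as a percentage change. The sketch below illustrates that conversion; the coefficient value is an assumption chosen so that the odds ratio equals 0.52, since the abstract does not report the coefficient itself.

```python
import math


def percent_change_in_odds(beta):
    """Percent change in the odds for a one-unit increase in a predictor."""
    return (math.exp(beta) - 1.0) * 100.0


# Hypothetical coefficient per 1 SD (assumption): ln(0.52) gives OR = 0.52.
beta_per_sd = math.log(0.52)
change = percent_change_in_odds(beta_per_sd)
print(f"odds ratio = {math.exp(beta_per_sd):.2f}, change in odds = {change:.0f}%")
```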

  12. Logistic regression analysis of factors associated with avascular necrosis of the femoral head following femoral neck fractures in middle-aged and elderly patients.

    PubMed

    Ai, Zi-Sheng; Gao, You-Shui; Sun, Yuan; Liu, Yue; Zhang, Chang-Qing; Jiang, Cheng-Hua

    2013-03-01

    Risk factors for femoral neck fracture-induced avascular necrosis of the femoral head have not been elucidated clearly in middle-aged and elderly patients. Moreover, the high incidence of screw removal in China and its effect on the fate of the involved femoral head require statistical methods to reflect their intrinsic relationship. Ninety-nine patients older than 45 years with femoral neck fracture were treated by internal fixation between May 1999 and April 2004. Descriptive analysis, interaction analysis between associated factors, single factor logistic regression, multivariate logistic regression, and detailed interaction analysis were employed to explore potential relationships among associated factors. Avascular necrosis of the femoral head was found in 15 cases (15.2 %). Age × the status of implants (removal vs. maintenance) and gender × the timing of reduction were interactive according to two-factor interactive analysis. Age, the displacement of fractures, the quality of reduction, and the status of implants were found to be significant factors in single factor logistic regression analysis. Age, age × the status of implants, and the quality of reduction were found to be significant factors in multivariate logistic regression analysis. In fine interaction analysis after multivariate logistic regression analysis, implant removal was the most important risk factor for avascular necrosis in 56-to-85-year-old patients, with a risk ratio of 26.00 (95 % CI = 3.076-219.747). The middle-aged and elderly have less incidence of avascular necrosis of the femoral head following femoral neck fractures treated by cannulated screws. The removal of cannulated screws can induce a significantly high incidence of avascular necrosis of the femoral head in elderly patients, while a high-quality reduction is helpful to reduce avascular necrosis.

  13. Prediction of sickness absence: development of a screening instrument

    PubMed Central

    Duijts, S F A; Kant, IJ; Landeweerd, J A; Swaen, G M H

    2006-01-01

    Objectives To develop a concise screening instrument for early identification of employees at risk for sickness absence due to psychosocial health complaints. Methods Data from the Maastricht Cohort Study on “Fatigue at Work” were used to identify items to be associated with an increased risk of sickness absence. The analytical procedures univariate logistic regression, backward stepwise linear regression, and multiple logistic regression were successively applied. For both men and women, sum scores were calculated, and sensitivity and specificity rates of different cut‐off points on the screening instrument were defined. Results In women, results suggested that feeling depressed, having a burnout, being tired, being less interested in work, experiencing obligatory change in working days, and living alone, were strong predictors of sickness absence due to psychosocial health complaints. In men, statistically significant predictors were having a history of sickness absence, compulsive thinking, being mentally fatigued, finding it hard to relax, lack of supervisor support, and having no hobbies. A potential cut‐off point of 10 on the screening instrument resulted in a sensitivity score of 41.7% for women and 38.9% for men, and a specificity score of 91.3% for women and 90.6% for men. Conclusions This study shows that it is possible to identify predictive factors for sickness absence and to develop an instrument for early identification of employees at risk for sickness absence. The results of this study increase the possibility for both employers and policymakers to implement interventions directed at the prevention of sickness absence. PMID:16698807

  14. Development of a Bayesian model to estimate health care outcomes in the severely wounded

    PubMed Central

    Stojadinovic, Alexander; Eberhardt, John; Brown, Trevor S; Hawksworth, Jason S; Gage, Frederick; Tadaki, Douglas K; Forsberg, Jonathan A; Davis, Thomas A; Potter, Benjamin K; Dunne, James R; Elster, E A

    2010-01-01

    Background: Graphical probabilistic models have the ability to provide insights as to how clinical factors are conditionally related. These models can be used to help us understand factors influencing health care outcomes and resource utilization, and to estimate morbidity and clinical outcomes in trauma patient populations. Study design: Thirty-two combat casualties with severe extremity injuries enrolled in a prospective observational study were analyzed using step-wise machine-learned Bayesian belief network (BBN) and step-wise logistic regression (LR). Models were evaluated using 10-fold cross-validation to calculate area-under-the-curve (AUC) from receiver operating characteristics (ROC) curves. Results: Our BBN showed important associations between various factors in our data set that could not be developed using standard regression methods. Cross-validated ROC curve analysis showed that our BBN model was a robust representation of our data domain and that LR models trained on these findings were also robust: hospital-acquired infection (AUC: LR, 0.81; BBN, 0.79), intensive care unit length of stay (AUC: LR, 0.97; BBN, 0.81), and wound healing (AUC: LR, 0.91; BBN, 0.72) showed strong AUC. Conclusions: A BBN model can effectively represent clinical outcomes and biomarkers in patients hospitalized after severe wounding, and is confirmed by 10-fold cross-validation and further confirmed through logistic regression modeling. The method warrants further development and independent validation in other, more diverse patient populations. PMID:21197361

  15. Logistic model analysis of neurological findings in Minamata disease and the predicting index.

    PubMed

    Nakagawa, Masanori; Kodama, Tomoko; Akiba, Suminori; Arimura, Kimiyoshi; Wakamiya, Junji; Futatsuka, Makoto; Kitano, Takao; Osame, Mitsuhiro

    2002-01-01

    To establish a statistical diagnostic method to identify patients with Minamata disease (MD) considering factors of aging and sex, we analyzed the neurological findings in MD patients, inhabitants in a methylmercury polluted (MP) area, and inhabitants in a non-MP area. We compared the neurological findings in MD patients and inhabitants aged more than 40 years in the non-MP area. Based on the different frequencies of the neurological signs in the two groups, we devised the following formula to calculate the predicting index for MD: predicting index = 1/(1 + e^(-x)) × 100. (The value of x was calculated using the regression coefficients of each neurological finding obtained from logistic analysis. The index 100 indicated MD, and 0, non-MD.) Using this method, we found that 100% of male and 98% of female patients with MD (95 cases) gave predicting indices higher than 95. Five percent of the aged inhabitants in the MP area (598 inhabitants) and 0.2% of those in the non-MP area (558 inhabitants) gave predicting indices of 50 or higher. Our statistical diagnostic method for MD was useful in distinguishing MD patients from healthy elders based on their neurological findings.
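
    The predicting index described above is the logistic function of a linear predictor, scaled to 0-100. A minimal sketch follows; the finding names, coefficient values, and intercept are hypothetical placeholders, because the abstract does not list the fitted coefficients.

```python
import math


def predicting_index(findings, coefficients, intercept):
    """Scale the logistic function of the linear predictor x to 0-100."""
    x = intercept + sum(coefficients[name] * value
                        for name, value in findings.items())
    return 100.0 / (1.0 + math.exp(-x))


# Hypothetical coefficients and findings (assumptions, not from the paper).
coefficients = {"sensory_disturbance": 2.0, "ataxia": 1.5}
patient = {"sensory_disturbance": 1, "ataxia": 1}
index = predicting_index(patient, coefficients, intercept=-1.0)
print(f"predicting index = {index:.1f}")
```

    An index of 50 corresponds to x = 0, which is why 50 is a natural reference threshold in the abstract.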

  16. Rank-Optimized Logistic Matrix Regression toward Improved Matrix Data Classification.

    PubMed

    Zhang, Jianguang; Jiang, Jianmin

    2018-02-01

    While existing logistic regression suffers from overfitting and often fails in considering structural information, we propose a novel matrix-based logistic regression to overcome the weakness. In the proposed method, 2D matrices are directly used to learn two groups of parameter vectors along each dimension without vectorization, which allows the proposed method to fully exploit the underlying structural information embedded inside the 2D matrices. Further, we add a joint [Formula: see text]-norm on two parameter matrices, which are organized by aligning each group of parameter vectors in columns. This added co-regularization term has two roles-enhancing the effect of regularization and optimizing the rank during the learning process. With our proposed fast iterative solution, we carried out extensive experiments. The results show that in comparison to both the traditional tensor-based methods and the vector-based regression methods, our proposed solution achieves better performance for matrix data classifications.

  17. Detecting DIF in Polytomous Items Using MACS, IRT and Ordinal Logistic Regression

    ERIC Educational Resources Information Center

    Elosua, Paula; Wells, Craig

    2013-01-01

    The purpose of the present study was to compare the Type I error rate and power of two model-based procedures, the mean and covariance structure model (MACS) and the item response theory (IRT), and an observed-score based procedure, ordinal logistic regression, for detecting differential item functioning (DIF) in polytomous items. A simulation…

  18. Accuracy of Bayes and Logistic Regression Subscale Probabilities for Educational and Certification Tests

    ERIC Educational Resources Information Center

    Rudner, Lawrence

    2016-01-01

    In the machine learning literature, it is commonly accepted as fact that as calibration sample sizes increase, Naïve Bayes classifiers initially outperform Logistic Regression classifiers in terms of classification accuracy. Applied to subtests from an on-line final examination and from a highly regarded certification examination, this study shows…

  19. Comparing Linear Discriminant Function with Logistic Regression for the Two-Group Classification Problem.

    ERIC Educational Resources Information Center

    Fan, Xitao; Wang, Lin

    The Monte Carlo study compared the performance of predictive discriminant analysis (PDA) and that of logistic regression (LR) for the two-group classification problem. Prior probabilities were used for classification, but the cost of misclassification was assumed to be equal. The study used a fully crossed three-factor experimental design (with…

  20. Effects of Social Class and School Conditions on Educational Enrollment and Achievement of Boys and Girls in Rural Viet Nam

    ERIC Educational Resources Information Center

    Nguyen, Phuong L.

    2006-01-01

    This study examines the effects of parental SES, school quality, and community factors on children's enrollment and achievement in rural areas in Viet Nam, using logistic regression and ordered logistic regression. Multivariate analysis reveals significant differences in educational enrollment and outcomes by level of household expenditures and…

  1. School Exits in the Milwaukee Parental Choice Program: Evidence of a Marketplace?

    ERIC Educational Resources Information Center

    Ford, Michael

    2011-01-01

    This article examines whether the large number of school exits from the Milwaukee school voucher program is evidence of a marketplace. Two logistic regression and multinomial logistic regression models tested the relation between the inability to draw large numbers of voucher students and the ability for a private school to remain viable. Data on…

  2. Model building strategy for logistic regression: purposeful selection.

    PubMed

    Zhang, Zhongheng

    2016-03-01

    Logistic regression is one of the most commonly used models to account for confounders in the medical literature. The article shows how to perform the purposeful selection model-building strategy with R. I stress the use of the likelihood ratio test to see whether deleting a variable has a significant impact on model fit. A deleted variable should also be checked for whether it is an important adjustment of the remaining covariates. Interactions should be checked to disentangle complex relationships between covariates and their synergistic effect on the response variable. The model should be checked for goodness of fit (GOF), in other words, how well the fitted model reflects the real data. The Hosmer-Lemeshow GOF test is the most widely used for logistic regression models.
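
    The likelihood ratio test mentioned above compares the log-likelihoods of a full model and a reduced model with one variable deleted. The article itself works in R; the sketch below is a Python illustration of the same arithmetic for the one-degree-of-freedom case, with hypothetical log-likelihood values.

```python
import math


def likelihood_ratio_test_1df(loglik_full, loglik_reduced):
    """LRT for dropping a single parameter (chi-square with 1 df)."""
    stat = 2.0 * (loglik_full - loglik_reduced)
    # For 1 df, P(chi2 > stat) = erfc(sqrt(stat / 2)).
    p_value = math.erfc(math.sqrt(max(stat, 0.0) / 2.0))
    return stat, p_value


# Hypothetical log-likelihoods (assumptions for illustration only).
stat, p_value = likelihood_ratio_test_1df(loglik_full=-100.0,
                                          loglik_reduced=-103.0)
print(f"LR statistic = {stat:.2f}, p = {p_value:.4f}")
```

    A small p-value suggests that the deleted variable contributed to model fit and should be retained.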

  3. Determining delayed admission to intensive care unit for mechanically ventilated patients in the emergency department.

    PubMed

    Hung, Shih-Chiang; Kung, Chia-Te; Hung, Chih-Wei; Liu, Ber-Ming; Liu, Jien-Wei; Chew, Ghee; Chuang, Hung-Yi; Lee, Wen-Huei; Lee, Tzu-Chi

    2014-08-23

    The adverse effects of delayed admission to the intensive care unit (ICU) have been recognized in previous studies. However, the definitions of delayed admission vary across studies. This study proposed a model to define "delayed admission", and explored the effect of ICU-waiting time on patients' outcome. This retrospective cohort study included non-traumatic adult patients on mechanical ventilation in the emergency department (ED), from July 2009 to June 2010. The primary outcome measures were 21-ventilator-day mortality and prolonged hospital stays (over 30 days). Models of Cox regression and logistic regression were used for multivariate analysis. The non-delayed ICU-waiting was defined as a period in which the time effect on mortality was not statistically significant in a Cox regression model. To identify a suitable cut-off point between "delayed" and "non-delayed", subsets from the overall data were made based on ICU-waiting time and the hazard ratio of ICU-waiting hour in each subset was iteratively calculated. The cut-off time was then used to evaluate the impact of delayed ICU admission on mortality and prolonged length of hospital stay. The final analysis included 1,242 patients. The time effect on mortality emerged after 4 hours, thus we deduced ICU-waiting time in ED > 4 hours as delayed. By logistic regression analysis, delayed ICU admission affected the outcomes of 21-ventilator-day mortality and prolonged hospital stay, with odds ratios of 1.41 (95% confidence interval, 1.05 to 1.89) and 1.56 (95% confidence interval, 1.07 to 2.27), respectively. For patients on mechanical ventilation at the ED, delayed ICU admission is associated with higher probability of mortality and additional resource expenditure. A benchmark waiting time of no more than 4 hours for ICU admission is recommended.
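
    Odds ratios with 95% confidence intervals, as reported above, are usually obtained by exponentiating a logistic regression coefficient and its Wald limits. The sketch below shows that calculation; the standard error of 0.15 is an assumption roughly back-calculated from the reported interval, not a value given in the abstract.

```python
import math


def odds_ratio_ci(beta, se, z=1.96):
    """Odds ratio and Wald confidence limits from a coefficient and its SE."""
    return (math.exp(beta),
            math.exp(beta - z * se),
            math.exp(beta + z * se))


# Assumed inputs: beta = ln(1.41); se = 0.15 is a back-calculated guess.
or_point, or_lower, or_upper = odds_ratio_ci(math.log(1.41), 0.15)
print(f"OR = {or_point:.2f} (95% CI, {or_lower:.2f} to {or_upper:.2f})")
```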

  4. Inverse Ising Inference Using All the Data

    NASA Astrophysics Data System (ADS)

    Aurell, Erik; Ekeberg, Magnus

    2012-03-01

    We show that a method based on logistic regression, using all the data, solves the inverse Ising problem far better than mean-field calculations relying only on sample pairwise correlation functions, while still computationally feasible for hundreds of nodes. The largest improvement in reconstruction occurs for strong interactions. Using two examples, a diluted Sherrington-Kirkpatrick model and a two-dimensional lattice, we also show that interaction topologies can be recovered from few samples with good accuracy and that the use of l1 regularization is beneficial in this process, pushing inference abilities further into low-temperature regimes.
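
    The link between the inverse Ising problem and logistic regression is that the conditional probability of one spin, given all the others, is a logistic function of its local field. The sketch below assumes the common convention P(s) ∝ exp(Σ h_i s_i + Σ_{i<j} J_ij s_i s_j) with spins ±1 and the inverse temperature absorbed into the parameters; the abstract does not state the parametrization, so this convention and all names are assumptions.

```python
import math


def prob_spin_up(i, spins, fields, couplings):
    """P(s_i = +1 | all other spins) under the assumed Ising convention."""
    local_field = fields[i] + sum(couplings[i][j] * spins[j]
                                  for j in range(len(spins)) if j != i)
    return 1.0 / (1.0 + math.exp(-2.0 * local_field))


# Two spins with a ferromagnetic coupling J = 1 and no external fields.
fields = [0.0, 0.0]
couplings = [[0.0, 1.0], [1.0, 0.0]]
p_up = prob_spin_up(0, spins=[-1, +1], fields=fields, couplings=couplings)
print(f"P(s_0 = +1 | s_1 = +1) = {p_up:.4f}")
```

    Fitting h_i and the row J_ij for each node then amounts to one logistic regression per node, with the other spins as predictors.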

  5. Association Between Alcohol Calorie Intake and Overweight and Obesity in English Adults

    PubMed Central

    Shelton, Nicola Jane; Knott, Craig S.

    2014-01-01

    We investigated the contribution of alcohol-derived calories to the alcohol–obesity relation. Adult alcohol calorie intake was derived from consumption volume and drink type in the Health Survey for England 2006 (n = 8864). We calculated the odds of obesity with survey-adjusted logistic regression. Mean alcohol calorie consumption was 27% of the recommended daily calorie intake in men and 19% in women on the heaviest drinking day in the last week, with a positive association between alcohol calories and obesity. Alcohol calories may be a significant contributor to the rise in obesity. PMID:24524529

  6. Selected Logistics Models and Techniques.

    DTIC Science & Technology

    1984-09-01

    TI-59 Programmable Calculator LCC...Program; TI-59 Programmable Calculator LCC Model; Unmanned Spacecraft Cost Model; TABLE OF CONTENTS (CONT'D) (Subject Index) LOGISTICS... LOGISTICS ANALYSIS MODEL/TECHNIQUE DATA. MODEL/TECHNIQUE NAME: TI-59 Programmable Calculator LCC Model. TYPE MODEL: Cost Estimating. DEVELOPED BY:

  7. Combination of c-reactive protein and squamous cell carcinoma antigen in predicting postoperative prognosis for patients with squamous cell carcinoma of the esophagus.

    PubMed

    Feng, Ji-Feng; Chen, Sheng; Yang, Xun

    2017-09-08

    We initially proposed a useful and novel prognostic model, named CCS [Combination of c-reactive protein (CRP) and squamous cell carcinoma antigen (SCC)], for predicting the postoperative survival in patients with esophageal squamous cell carcinoma (ESCC). Two hundred and fifty-two patients with resectable ESCC were included in this retrospective study. A logistic regression was performed and yielded a logistic equation. The CCS was calculated from the combined CRP and SCC. The optimal cut-off value for CCS was evaluated by the X-tile program. Univariate and multivariate analyses were used to evaluate the predictive factors. In addition, a novel nomogram model was also constructed to predict the prognosis for patients with ESCC. In the current study, CCS was calculated as CRP + 6.33 SCC according to the logistic equation. The optimal cut-off value was 15.8 for CCS according to the X-tile program. Kaplan-Meier analyses demonstrated that the high CCS group had a significantly poorer 5-year cancer-specific survival (CSS) than the low CCS group (10.3% vs. 47.3%, P < 0.001). According to multivariate analyses, CCS (P = 0.004), but not CRP (P = 0.466) or SCC (P = 0.926), was an independent prognostic factor. A nomogram could be more accurate for CSS (Harrell's c-index: 0.70). The CCS is a useful and independent predictive factor in patients with ESCC.
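
    The CCS score and its cut-off, as stated in the abstract, can be expressed in a few lines. In the sketch below, the function names are hypothetical, the measurement units are whatever the study used (not given in the abstract), and treating values strictly above 15.8 as "high" is an assumption, since the abstract does not say how a value exactly at the cut-off is classified.

```python
def ccs_score(crp, scc):
    """CCS = CRP + 6.33 * SCC, as given in the abstract."""
    return crp + 6.33 * scc


def ccs_group(crp, scc, cutoff=15.8):
    """Classify as 'high' or 'low' CCS (strict '>' is an assumption)."""
    return "high" if ccs_score(crp, scc) > cutoff else "low"


# Hypothetical patient values (assumptions for illustration only).
print(ccs_score(5.0, 1.0), ccs_group(5.0, 1.0))
print(ccs_score(10.0, 1.0), ccs_group(10.0, 1.0))
```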

  8. Non-specific low back pain: occupational or lifestyle consequences?

    PubMed

    Stričević, Jadranka; Papež, Breda Jesenšek

    2015-12-01

    Nursing occupation was identified as a risk occupation for the development of low back pain (LBP). The aim of our study was to find out how much occupational factors influence the development of LBP in hospital nursing personnel. A non-experimental approach with a cross-sectional survey and statistical analysis was used. Nine hundred questionnaires were distributed among nursing personnel, 663 were returned and 659 (73.2 %) were considered for the analysis. Univariate and multivariate statistics for LBP risk were calculated by binary logistic regression. The χ(2), influence factor, 95 % confidence interval and P value were calculated. Multivariate binary logistic regression was calculated by the Wald method to omit insignificant variables. Not performing exercises represented the highest risk for the development of LBP (OR 2.8, 95 % CI 1.7-4.4; p < 0.001). The second and third ranked risk factors were frequent manual lifting > 10 kg (OR 2.4, 95 % CI 1.5-3.8; p < 0.001) and duration of employment ≥ 19 years (OR 2.4, 95 % CI 1.6-3.7; p < 0.001). The fourth ranked risk factor was better physical condition by frequent recreation and sports, which reduced the risk for the development of LBP (OR 0.4, 95 % CI 0.3-0.7; p = 0.001). Work with the computer ≥ 2 h per day, as the last significant risk factor, also reduced the risk for the development of LBP (OR 0.6, 95 % CI 0.4-0.1; p = 0.049). Risk factors for LBP established in our study (exercises, duration of employment, frequent manual lifting, recreation and sports and work with the computer) are not specifically linked to the working environment of the nursing personnel. Rather than focusing on mechanical causes and direct workload in the development of non-specific LBP, the complex approach to LBP including genetics, psychosocial environment, lifestyle and quality of life is coming more to the fore.

  9. Assessing landslide susceptibility by statistical data analysis and GIS: the case of Daunia (Apulian Apennines, Italy)

    NASA Astrophysics Data System (ADS)

    Ceppi, C.; Mancini, F.; Ritrovato, G.

    2009-04-01

    This study aims at landslide susceptibility mapping within an area of Daunia (Apulian Apennines, Italy) by a multivariate statistical method and data manipulation in a Geographical Information System (GIS) environment. Among the variety of existing statistical data analysis techniques, logistic regression was chosen to produce a susceptibility map over an area where small settlements are historically threatened by landslide phenomena. With logistic regression, a best fit between the presence or absence of landslide (dependent variable) and the set of independent variables is performed on the basis of a maximum likelihood criterion, leading to the estimation of regression coefficients. The reliability of such analysis is therefore due to the ability to quantify the proneness to landslide occurrences by the probability level produced by the analysis. The inventory of dependent and independent variables was managed in a GIS, where geometric properties and attributes were translated into raster cells in order to proceed with the logistic regression by means of the SPSS (Statistical Package for the Social Sciences) package. A landslide inventory was used to produce the binary dependent variable, whereas the set of independent variables comprised slope, aspect, elevation, curvature, drained area, lithology and land use after their reduction to dummy variables. The effect of independent parameters on landslide occurrence was assessed by the corresponding coefficient in the logistic regression function, highlighting a major role played by the land use variable in determining occurrence and distribution of phenomena. Once the outcomes of the logistic regression are determined, data are re-introduced in the GIS to produce a map reporting the proneness to landslide as predicted level of probability. To validate the results and the regression model, a cell-by-cell comparison between the susceptibility map and the initial inventory of landslide events was performed, and agreement at the 75% level was achieved.
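
    A cell-by-cell comparison between a predicted probability raster and a landslide inventory can be sketched as follows. The 0.5 probability threshold and the tiny 2 × 2 grids are assumptions for illustration; the abstract does not state how predicted probabilities were converted to presence or absence.

```python
def cell_agreement(probabilities, inventory, threshold=0.5):
    """Fraction of raster cells where the thresholded prediction matches."""
    matches = 0
    total = 0
    for prob_row, inv_row in zip(probabilities, inventory):
        for p, observed in zip(prob_row, inv_row):
            predicted = 1 if p >= threshold else 0
            matches += int(predicted == observed)
            total += 1
    return matches / total


# Hypothetical grids (assumptions): predicted probabilities vs. inventory.
probabilities = [[0.8, 0.2], [0.6, 0.1]]
inventory = [[1, 0], [0, 0]]
agreement = cell_agreement(probabilities, inventory)
print(f"agreement = {agreement:.0%}")
```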

  10. A pre-admission program for underrepresented minority and disadvantaged students: application, acceptance, graduation rates and timeliness of graduating from medical school.

    PubMed

    Strayhorn, G

    2000-04-01

    To determine whether students' performances in a pre-admission program predicted whether participants would (1) apply to medical school, (2) get accepted, and (3) graduate. Using prospectively collected data from participants in the University of North Carolina at Chapel Hill's Medical Education Development Program (MEDP) and data from the Association of American Colleges Student and Applicant Information Management System, the author identified 371 underrepresented minority (URM) students who were full-time participants and completed the program between 1984 and 1989, prior to their acceptance into medical school. Logistic regression analysis was used to determine whether MEDP performance significantly predicted (after statistically controlling for traditional predictors of these outcomes) the proportions of URM participants who applied to medical school and were accepted, the timeliness of graduating, and the proportion graduating. Odds ratios with 95% confidence intervals were calculated to determine the associations between the independent and outcome variables. In separate logistic regression models, MEDP performance predicted the study's outcomes after statistically controlling for traditional predictors with 95% confidence intervals. Pre-admission programs with similar outcomes can improve the diversity of the physician workforce and the access to health care for underrepresented minority and economically disadvantaged populations.

  11. The association between dietary lignans, phytoestrogen-rich foods, and fiber intake and postmenopausal breast cancer risk: a German case-control study.

    PubMed

    Zaineddin, Aida Karina; Buck, Katharina; Vrieling, Alina; Heinz, Judith; Flesch-Janys, Dieter; Linseisen, Jakob; Chang-Claude, Jenny

    2012-01-01

    Phytoestrogens are structurally similar to estrogens and may affect breast cancer risk by mimicking estrogenic/antiestrogenic properties. In Western societies, whole grains and possibly soy foods are rich sources of phytoestrogens. A population-based case-control study in German postmenopausal women was used to evaluate the association of phytoestrogen-rich foods and dietary lignans with breast cancer risk. Dietary data were collected from 2,884 cases and 5,509 controls using a validated food-frequency questionnaire, which included additional questions on phytoestrogen-rich foods. Associations were assessed using conditional logistic regression. All analyses were adjusted for relevant risk and confounding factors. Polytomous logistic regression analysis was performed to evaluate the associations by estrogen receptor (ER) status. High and low consumption of soybeans as well as of sunflower and pumpkin seeds were associated with significantly reduced breast cancer risk compared to no consumption (OR = 0.83, 95% CI = 0.70-0.97; and OR = 0.66, 95% CI = 0.77-0.97, respectively). The observed associations were not differential by ER status. No statistically significant associations were found for dietary intake of plant lignans, fiber, or the calculated enterolignans. Our results provide evidence for a reduced postmenopausal breast cancer risk associated with increased consumption of sunflower and pumpkin seeds and soybeans.

  12. Radiation Exposure and Mortality from Cardiovascular Disease and Cancer in Early NASA Astronauts.

    PubMed

    Elgart, S Robin; Little, Mark P; Chappell, Lori J; Milder, Caitlin M; Shavers, Mark R; Huff, Janice L; Patel, Zarana S

    2018-05-31

    Understanding space radiation health effects is critical due to potential increased morbidity and mortality following spaceflight. We evaluated whether there is evidence for excess cardiovascular disease or cancer mortality in early NASA astronauts and if a correlation exists between space radiation exposure and mortality. Astronauts selected from 1959-1969 were included and followed until death or February 2017, with 39 of 73 individuals still alive at that time. Calculated standardized mortality rates for tested outcomes were significantly below U.S. white male population rates, including all-cardiovascular disease (n = 7, SMR = 33; 95% CI, 14-65) and all-cancer (n = 7, SMR = 43; 95% CI, 18-83), as anticipated in a healthy worker population. Space radiation doses for cohort members ranged from 0-78 mGy. No significant associations between space radiation dose and mortality were found using logistic regression with an internal reference group, adjusting for medical radiation. Statistical power of the logistic regression was <6%, remaining <12% even when expected risk level or observed deaths were assumed to be 10 times higher than currently reported. While no excess radiation-associated cardiovascular or cancer mortality risk was observed, findings must be tempered by the statistical limitations of this cohort; notwithstanding, this small unique cohort provides a foundation for assessment of astronaut health.
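
    A standardized mortality ratio (SMR) is the number of observed deaths divided by the number expected from reference-population rates, here expressed on a scale where 100 means no difference. In the sketch below, the expected count of 21.2 is an assumption chosen so that 7 observed deaths give an SMR near 33; the abstract does not report expected counts.

```python
def standardized_mortality_ratio(observed, expected):
    """SMR on a percentage scale: 100 means observed equals expected."""
    return 100.0 * observed / expected


# observed = 7 is from the abstract; expected = 21.2 is a hypothetical value.
smr = standardized_mortality_ratio(observed=7, expected=21.2)
print(f"SMR = {smr:.0f}")
```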

  13. Exploring the determinants of secular decreases in dental caries among Korean children.

    PubMed

    Lee, Hye-Ju; Han, Dong-Hun

    2015-08-01

    The aim of this study was to determine the contributions of sealant and water fluoridation to the time trends in dental caries from 2003 to 2010. Data were from three waves of the Korean National Oral Health Surveys between 2003 and 2010, including a total of 23 059 children (11 889 boys and 11 170 girls) aged 8, 10, and 12 years. The impacts of sealant and water fluoridation on dental caries were obtained by logistic regression for each age group of children. The contributions of sealant and water fluoridation to the time trends in the prevalence of dental caries were examined by a series of logistic regression models, and changes in the adjusted odds ratios for each survey year were also calculated. Over the past 7 years, the prevalence of dental caries decreased dramatically. Although sealant had a significant impact on dental caries in each survey year, remarkable decreases in dental caries from 2003 to 2010 were not explained by the secular changes in the dental sealant or water fluoridation factor. We observed important population declines in dental caries in Korea in children aged 8-12 years; however, the likely causes for these secular trends remain to be determined. © 2015 John Wiley & Sons A/S. Published by John Wiley & Sons Ltd.

  14. Computed tomography pulmonary embolism index for the assessment of survival in patients with pulmonary embolism.

    PubMed

    Pech, Maciej; Wieners, Gero; Dul, Przemyslaw; Fischbach, Frank; Dudeck, Oliver; Lopez Hänninen, Enrique; Ricke, Jens

    2007-08-01

    This study was an analysis of the correlation between pulmonary embolism (PE) and patient survival. Among 694 consecutive patients referred to our institution with clinical suspicion of acute PE who underwent CT pulmonary angiography, 188 patients comprised the study group: 87 women (46.3%, median age: 60.7; age range: 19-88 years) and 101 men (53.7%, median age: 66.9; age range: 21-97 years). PE was assessed by two radiologists who were blinded to the results from the follow-up. A PE index was derived for each set of images on the basis of the embolus size and location. Results were analyzed using logistic regression, and correlation with risk factors and patient outcome (survival or death) was calculated. We observed no significant correlation between the CTPE index and patient outcome (p = 0.703). The test of logistic regression with the sum of heart and liver disease or presence of cancer was significantly (p < 0.05) correlated with PE and overall patient outcome. Interobserver agreement showed a significant correlation rate for the assessment of the PE index (0.993; p < 0.001). In our study the CT PE index did not translate into patient outcome. Prospective larger scale studies are needed to confirm the predictive value of the index and refine the index criteria.

  15. The Integrative Weaning Index in Elderly ICU Subjects.

    PubMed

    Azeredo, Leandro M; Nemer, Sérgio N; Barbas, Carmen Sv; Caldeira, Jefferson B; Noé, Rosângela; Guimarães, Bruno L; Caldas, Célia P

    2017-03-01

    With increasing life expectancy and ICU admission of elderly patients, mechanical ventilation and weaning trials have increased worldwide. We evaluated a cohort with 479 subjects in the ICU. Patients younger than 18 y, tracheostomized, or with neurologic diseases were excluded, resulting in 331 subjects. Subjects ≥70 y old were considered elderly, whereas those <70 y old were considered non-elderly. Besides the conventional weaning indexes, we evaluated the performance of the integrative weaning index (IWI). The probability of successful weaning was investigated using relative risk and logistic regression. The Hosmer-Lemeshow goodness-of-fit test was used to calibrate and the C statistic was calculated to evaluate the association between predicted probabilities and observed proportions in the logistic regression model. Prevalence of successful weaning in the sample was 83.7%. There was no difference in mortality between elderly and non-elderly subjects (P = .16), in days of mechanical ventilation (P = .22) and days of weaning (P = .55). In elderly subjects, the IWI was the only respiratory variable associated with mechanical ventilation weaning in this population (P < .001). The IWI was the independent variable found in weaning of elderly subjects that may contribute to the critical moment of this population in intensive care. Copyright © 2017 by Daedalus Enterprises.
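
    The C statistic mentioned in this abstract is commonly computed as a concordance probability: the share of (event, non-event) pairs in which the event received the higher predicted probability. The sketch below illustrates that general definition only; the function name and the data are made up for illustration and are not taken from the study.

```python
from itertools import product


def c_statistic(probs, outcomes):
    """Concordance: fraction of (event, non-event) pairs in which the event
    has the higher predicted probability. Ties count as half a pair."""
    events = [p for p, y in zip(probs, outcomes) if y == 1]
    non_events = [p for p, y in zip(probs, outcomes) if y == 0]
    if not events or not non_events:
        raise ValueError("need at least one event and one non-event")
    score = 0.0
    for p_event, p_non in product(events, non_events):
        if p_event > p_non:
            score += 1.0
        elif p_event == p_non:
            score += 0.5
    return score / (len(events) * len(non_events))


# Made-up predicted probabilities of successful weaning and observed outcomes.
probs = [0.9, 0.8, 0.7, 0.4, 0.3]
outcomes = [1, 1, 0, 1, 0]
c = c_statistic(probs, outcomes)  # 5 of 6 pairs are concordant
print(round(c, 3))
```

    A value of 0.5 corresponds to chance-level discrimination and 1.0 to perfect separation of events from non-events.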

  16. Malaria treatment-seeking behaviour and related factors of Wa ethnic minority in Myanmar: a cross-sectional study

    PubMed Central

    2012-01-01

    Background In Southeast Asia, data on malaria treatment-seeking behaviours and related affecting factors are rare. The population of the Wa ethnic in Myanmar has difficulty in accessing formal health care. To understand malaria treatment-seeking behaviour and household-affecting factors of the Wa people, a cross-sectional study was carried out in Shan Special Region II, Myanmar. Methods The two methods, questionnaire-based household surveys to household heads and in-depth interviews to key informants, were carried out independently. The proportion of treatment-seeking patterns was calculated. Logistic regression was used to determine affecting factors of treatment-seeking. Qualitative data were analysed by using Text Analysis Markup System. Results Overall, 87.5% of the febrile population sought treatment, but only 32.0% did so within 24 hours. The proportion accessing the retail sector (79.6%) was statistically significantly higher (P<0.0001) than that accessing the public sector (10.6%). Multivariable logistic regression analysis identified family income, distances from a health facility, family decision and patient characteristics as being independently associated with delayed malaria treatment. Conclusion Malaria treatment-seeking behaviour is not appropriate, and affecting factors include health service systems, social and cultural factors in Wa State of Myanmar. PMID:23237576

  17. Radiation dose does not influence anastomotic complications in patients with esophageal cancer treated with neoadjuvant chemoradiation and transhiatal esophagectomy.

    PubMed

    Koëter, Marijn; van der Sangen, Maurice J C; Hurkmans, Coen W; Luyer, Misha D P; Rutten, Harm J T; Nieuwenhuijzen, Grard A P

    2015-03-06

    Neoadjuvant chemoradiation might increase anastomotic leakage and stenosis in patients with esophageal cancer treated with neoadjuvant chemoradiation and esophagectomy. The aim of this study was to determine the influence of radiation dose on the incidence of leakage and stenosis. Fifty-three patients with esophageal cancer received neoadjuvant chemoradiation (23 × 1.8 Gy) (combined with Paclitaxel and Carboplatin) followed by a transhiatal esophagectomy between 2009 and 2011. On planning CT, the future anastomotic region was determined and the mean radiation dose, V20, V25, V30, V35 and V40 were calculated. Logistic regression analysis was conducted to examine determinants of anastomotic leakage and stenosis. Anastomotic leaks occurred in 13 of 53 patients (25.5%) and anastomotic stenosis occurred in 24 of 53 patients (45.3%). Median follow-up was 20 months. Logistic regression analysis showed that mean dose, V20-V40, age, co-morbidity, method of anastomosis, operating time and interval between last radiotherapy treatment and surgery were not predictors of anastomotic leakage and stenosis. A radiation dose of 23 × 1.8 Gy on the future anastomotic region has no influence on the occurrence of anastomotic leakage and stenosis in patients with esophageal cancer treated with neoadjuvant chemoradiation followed by transhiatal esophagectomy.

  18. Statin, testosterone and phosphodiesterase 5-inhibitor treatments and age related mortality in diabetes

    PubMed Central

    Hackett, Geoffrey; Jones, Peter W; Strange, Richard C; Ramachandran, Sudarshan

    2017-01-01

    AIM To determine how statins, testosterone (T) replacement therapy (TRT) and phosphodiesterase 5-inhibitors (PDE5I) influence age related mortality in diabetic men. METHODS We studied 857 diabetic men screened for the BLAST study, stratifying them (mean follow-up = 3.8 years) into: (1) Normal T levels/untreated (total T > 12 nmol/L and free T > 0.25 nmol/L), Low T/untreated and Low T/treated; (2) PDE5I/untreated and PDE5I/treated; and (3) statin/untreated and statin/treated groups. The relationship between age and mortality, alone and with T/TRT, statin and PDE5I treatment was studied using logistic regression. Mortality probability and 95%CI were calculated from the above models for each individual. RESULTS Age was associated with mortality (logistic regression, OR = 1.10, 95%CI: 1.08-1.13, P < 0.001). With all factors included, age (OR = 1.08, 95%CI: 1.06-1.11, P < 0.001), Low T/treated (OR = 0.38, 95%CI: 0.15-0.92, P = 0.033), PDE5I/treated (OR = 0.17, 95%CI: 0.053-0.56, P = 0.004) and statin/treated (OR = 0.59, 95%CI: 0.36-0.97, P = 0.038) were associated with lower mortality. Age related mortality was as described by Gompertz, r2 = 0.881 when Ln (mortality) was plotted against age. The probability of mortality and 95%CI (from logistic regression) of individuals, treated/untreated with the drugs, alone and in combination was plotted against age. Overlap of 95%CI lines was evident with statins and TRT. No overlap was evident with PDE5I alone and with statins and TRT, this suggesting a change in the relationship between age and mortality. CONCLUSION We show that statins, PDE5I and TRT reduce mortality in diabetes. PDE5I, alone and with the other treatments significantly alter age related mortality in diabetic men. PMID:28344753
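
    The reported odds ratio for age (OR = 1.10) is per one-year increase. Under the usual logistic regression assumption that the log-odds are linear in age, odds ratios compound multiplicatively across years. The sketch below works through that arithmetic; the function names and the 10-year span are illustrative choices, not quantities from the study.

```python
import math


def odds_ratio_over(per_unit_or, units):
    """Odds ratio for a change of `units` in a predictor, given the
    per-unit odds ratio and a model that is linear on the log-odds scale."""
    beta = math.log(per_unit_or)   # regression coefficient on the logit scale
    return math.exp(beta * units)  # equivalent to per_unit_or ** units


def probability_from_odds(odds):
    """Convert odds back to a probability."""
    return odds / (1.0 + odds)


# OR = 1.10 per year of age is taken from the abstract; 10 years is arbitrary.
or_10_years = odds_ratio_over(1.10, 10)
print(round(or_10_years, 2))  # about 2.59
```

    An odds ratio is not a risk ratio: the change in the probability of death also depends on the baseline odds, which is why `probability_from_odds` is kept as a separate step.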

  19. Effectiveness of the CANRISK tool in the identification of dysglycemia in First Nations and Métis in Canada

    PubMed Central

    Agarwal, Gina; Jiang, Ying; Rogers Van Katwyk, Susan; Lemieux, Chantal; Orpana, Heather; Mao, Yang; Hanley, Brandan; Davis, Karen; Leuschen, Laurel; Morrison, Howard

    2018-01-01

    Abstract Introduction: First Nations/Métis populations develop diabetes earlier and at higher rates than other Canadians. The Canadian diabetes risk questionnaire (CANRISK) was developed as a diabetes screening tool for Canadians aged 40 years or over. The primary aim of this paper is to assess the effectiveness of the existing CANRISK tool and risk scores in detecting dysglycemia in First Nations/Métis participants, including among those under the age of 40. A secondary aim was to determine whether alternative waist circumference (WC) and body mass index (BMI) cut-off points improved the predictive ability of logistic regression models using CANRISK variables to predict dysglycemia. Methods: Information from a self-administered CANRISK questionnaire, anthropometric measurements, and results of a standard oral glucose tolerance test (OGTT) were collected from First Nations and Métis participants (n = 1479). Sensitivity and specificity of CANRISK scores using published risk score cut-off points were calculated. Logistic regression was conducted with alternative ethnicity-specific BMI and WC cut-off points to predict dysglycemia using CANRISK variables. Results: Compared with OGTT results, using a CANRISK score cut-off point of 33, the sensitivity and specificity of CANRISK was 68% and 63% among individuals aged 40 or over; it was 27% and 87%, respectively among those under 40. Using a lower cut-off point of 21, the sensitivity for individuals under 40 improved to 77% with a specificity of 44%. Though specificity at this threshold was low, the higher level of sensitivity reflects the importance of the identification of high risk individuals in this population. Despite altered cut-off points of BMI and WC, logistic regression models demonstrated similar predictive ability. 
    Conclusion: CANRISK functioned well as a preliminary step for diabetes screening in a broad age range of First Nations and Métis in Canada, with an adjusted CANRISK cutoff point for individuals under 40, and with no incremental improvement from using alternative BMI/WC cut-off points. PMID:29443485
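
    The trade-off described in this abstract, where lowering the cut-off raises sensitivity and lowers specificity, can be reproduced on any scored data set. The sketch below uses made-up scores and outcomes, and it assumes that a score at or above the cut-off counts as a positive screen; neither the data nor that convention is taken from the paper.

```python
def sensitivity_specificity(scores, has_condition, cutoff):
    """Return (sensitivity, specificity) when score >= cutoff is a positive screen."""
    tp = fn = tn = fp = 0
    for score, diseased in zip(scores, has_condition):
        positive = score >= cutoff
        if diseased and positive:
            tp += 1
        elif diseased:
            fn += 1
        elif positive:
            fp += 1
        else:
            tn += 1
    return tp / (tp + fn), tn / (tn + fp)


# Made-up risk scores and reference-test results (1 = dysglycemia present).
scores = [15, 22, 25, 30, 35, 40, 18, 28, 34, 45]
status = [0, 0, 1, 1, 1, 1, 0, 0, 0, 1]

high_cutoff = sensitivity_specificity(scores, status, 33)
low_cutoff = sensitivity_specificity(scores, status, 21)
print(high_cutoff, low_cutoff)
```

    In this toy data the lower cut-off catches every case at the cost of more false positives, which mirrors the direction of the effect reported for participants under 40.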

  20. A Comparison of Logistic Regression, Neural Networks, and Classification Trees Predicting Success of Actuarial Students

    ERIC Educational Resources Information Center

    Schumacher, Phyllis; Olinsky, Alan; Quinn, John; Smith, Richard

    2010-01-01

    The authors extended previous research by 2 of the authors who conducted a study designed to predict the successful completion of students enrolled in an actuarial program. They used logistic regression to determine the probability of an actuarial student graduating in the major or dropping out. They compared the results of this study with those…

  1. Logistic regression accuracy across different spatial and temporal scales for a wide-ranging species, the marbled murrelet

    Treesearch

    Carolyn B. Meyer; Sherri L. Miller; C. John Ralph

    2004-01-01

    The scale at which habitat variables are measured affects the accuracy of resource selection functions in predicting animal use of sites. We used logistic regression models for a wide-ranging species, the marbled murrelet, (Brachyramphus marmoratus) in a large region in California to address how much changing the spatial or temporal scale of...

  2. Odds Ratio, Delta, ETS Classification, and Standardization Measures of DIF Magnitude for Binary Logistic Regression

    ERIC Educational Resources Information Center

    Monahan, Patrick O.; McHorney, Colleen A.; Stump, Timothy E.; Perkins, Anthony J.

    2007-01-01

    Previous methodological and applied studies that used binary logistic regression (LR) for detection of differential item functioning (DIF) in dichotomously scored items either did not report an effect size or did not employ several useful measures of DIF magnitude derived from the LR model. Equations are provided for these effect size indices.…

  3. A Generalized Logistic Regression Procedure to Detect Differential Item Functioning among Multiple Groups

    ERIC Educational Resources Information Center

    Magis, David; Raiche, Gilles; Beland, Sebastien; Gerard, Paul

    2011-01-01

    We present an extension of the logistic regression procedure to identify dichotomous differential item functioning (DIF) in the presence of more than two groups of respondents. Starting from the usual framework of a single focal group, we propose a general approach to estimate the item response functions in each group and to test for the presence…

  4. Risk Factors of Falls in Community-Dwelling Older Adults: Logistic Regression Tree Analysis

    ERIC Educational Resources Information Center

    Yamashita, Takashi; Noe, Douglas A.; Bailer, A. John

    2012-01-01

    Purpose of the Study: A novel logistic regression tree-based method was applied to identify fall risk factors and possible interaction effects of those risk factors. Design and Methods: A nationally representative sample of American older adults aged 65 years and older (N = 9,592) in the Health and Retirement Study 2004 and 2006 modules was used.…

  5. Estimation of Logistic Regression Models in Small Samples. A Simulation Study Using a Weakly Informative Default Prior Distribution

    ERIC Educational Resources Information Center

    Gordovil-Merino, Amalia; Guardia-Olmos, Joan; Pero-Cebollero, Maribel

    2012-01-01

    In this paper, we used simulations to compare the performance of classical and Bayesian estimations in logistic regression models using small samples. In the performed simulations, conditions were varied, including the type of relationship between independent and dependent variable values (i.e., unrelated and related values), the type of variable…

  6. Using multiple logistic regression and GIS technology to predict landslide hazard in northeast Kansas, USA

    USGS Publications Warehouse

    Ohlmacher, G.C.; Davis, J.C.

    2003-01-01

    Landslides in the hilly terrain along the Kansas and Missouri rivers in northeastern Kansas have caused millions of dollars in property damage during the last decade. To address this problem, a statistical method called multiple logistic regression has been used to create a landslide-hazard map for Atchison, Kansas, and surrounding areas. Data included digitized geology, slopes, and landslides, manipulated using ArcView GIS. Logistic regression relates predictor variables to the occurrence or nonoccurrence of landslides within geographic cells and uses the relationship to produce a map showing the probability of future landslides, given local slopes and geologic units. Results indicated that slope is the most important variable for estimating landslide hazard in the study area. Geologic units consisting mostly of shale, siltstone, and sandstone were most susceptible to landslides. Soil type and aspect ratio were considered but excluded from the final analysis because these variables did not significantly add to the predictive power of the logistic regression. Soil types were highly correlated with the geologic units, and no significant relationships existed between landslides and slope aspect. © 2003 Elsevier Science B.V. All rights reserved.
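
    The mapping step described here, from predictor values in each geographic cell to a probability, follows the standard logistic form p = 1 / (1 + exp(-z)), where z is a linear combination of the predictors. The sketch below shows only that form. The coefficient values, the single binary geologic-unit indicator, and the function names are hypothetical and are not the fitted model from the paper.

```python
import math


def logistic(z):
    """Standard logistic function mapping a linear predictor to (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def landslide_probability(slope_deg, in_susceptible_unit, coef):
    """Probability for one map cell from slope and a binary geology indicator."""
    z = (coef["intercept"]
         + coef["slope"] * slope_deg
         + coef["unit"] * (1 if in_susceptible_unit else 0))
    return logistic(z)


# Hypothetical coefficients, chosen only to make the example readable.
COEF = {"intercept": -6.0, "slope": 0.25, "unit": 1.5}

# Three made-up cells: (slope in degrees, lies in a susceptible geologic unit).
cells = [(5, False), (15, True), (30, True)]
probs = [landslide_probability(s, u, COEF) for s, u in cells]
print([round(p, 3) for p in probs])
```

    Evaluating the function over every cell of a raster yields the kind of probability map the abstract describes.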

  7. Predicting risk for portal vein thrombosis in acute pancreatitis patients: A comparison of radical basis function artificial neural network and logistic regression models.

    PubMed

    Fei, Yang; Hu, Jian; Gao, Kun; Tu, Jianfeng; Li, Wei-Qin; Wang, Wei

    2017-06-01

    To construct a radical basis function (RBF) artificial neural networks (ANNs) model to predict the incidence of acute pancreatitis (AP)-induced portal vein thrombosis (PVT). The analysis included 353 patients with AP who had been admitted between January 2011 and December 2015. An RBF ANNs model and a logistic regression model were each constructed based on eleven factors relevant to AP. Statistical indexes were used to evaluate the value of the prediction in the two models. The sensitivity, specificity, positive predictive value, negative predictive value and accuracy of the RBF ANNs model for PVT were 73.3%, 91.4%, 68.8%, 93.0% and 87.7%, respectively. There were significant differences between the RBF ANNs and logistic regression models in these parameters (P<0.05). In addition, a comparison of the area under receiver operating characteristic curves of the two models showed a statistically significant difference (P<0.05). The RBF ANNs model is more likely to predict the occurrence of PVT induced by AP than the logistic regression model. D-dimer, AMY, Hct and PT were important predictive factors for AP-induced PVT. Copyright © 2017 Elsevier Inc. All rights reserved.

  8. EXpectation Propagation LOgistic REgRession (EXPLORER): Distributed Privacy-Preserving Online Model Learning

    PubMed Central

    Wang, Shuang; Jiang, Xiaoqian; Wu, Yuan; Cui, Lijuan; Cheng, Samuel; Ohno-Machado, Lucila

    2013-01-01

    We developed an EXpectation Propagation LOgistic REgRession (EXPLORER) model for distributed privacy-preserving online learning. The proposed framework provides a high level guarantee for protecting sensitive information, since the information exchanged between the server and the client is the encrypted posterior distribution of coefficients. Through experimental results, EXPLORER shows the same performance (e.g., discrimination, calibration, feature selection etc.) as the traditional frequentist Logistic Regression model, but provides more flexibility in model updating. That is, EXPLORER can be updated one point at a time rather than having to retrain the entire data set when new observations are recorded. The proposed EXPLORER supports asynchronized communication, which relieves the participants from coordinating with one another, and prevents service breakdown from the absence of participants or interrupted communications. PMID:23562651

  9. Dietary consumption patterns and laryngeal cancer risk.

    PubMed

    Vlastarakos, Petros V; Vassileiou, Andrianna; Delicha, Evie; Kikidis, Dimitrios; Protopapas, Dimosthenis; Nikolopoulos, Thomas P

    2016-06-01

    We conducted a case-control study to investigate the effect of diet on laryngeal carcinogenesis. Our study population was made up of 140 participants: 70 patients with laryngeal cancer (LC) and 70 controls with a non-neoplastic condition that was unrelated to diet, smoking, or alcohol. A food-frequency questionnaire determined the mean consumption of 113 different items during the 3 years prior to symptom onset. Total energy intake and cooking mode were also noted. The relative risk, odds ratio (OR), and 95% confidence interval (CI) were estimated by multiple logistic regression analysis. We found that the total energy intake was significantly higher in the LC group (p < 0.001), and that the difference remained statistically significant after logistic regression analysis (p < 0.001; OR: 118.70). Notably, meat consumption was higher in the LC group (p < 0.001), and the difference remained significant after logistic regression analysis (p = 0.029; OR: 1.16). LC patients also consumed significantly more fried food (p = 0.036); this difference also remained significant in the logistic regression model (p = 0.026; OR: 5.45). The LC group also consumed significantly more seafood (p = 0.012); the difference persisted after logistic regression analysis (p = 0.009; OR: 2.48), with the consumption of shrimp proving detrimental (p = 0.049; OR: 2.18). Finally, the intake of zinc was significantly higher in the LC group before and after logistic regression analysis (p = 0.034 and p = 0.011; OR: 30.15, respectively). Cereal consumption (including pastas) was also higher among the LC patients (p = 0.043), with logistic regression analysis showing that their negative effect was possibly associated with the sauces and dressings that traditionally accompany pasta dishes (p = 0.006; OR: 4.78). Conversely, a higher consumption of dairy products was found in controls (p < 0.05); logistic regression analysis showed that calcium appeared to be protective at the micronutrient level (p < 0.001; OR: 0.27). We found no difference in the overall consumption of fruits and vegetables between the LC patients and controls; however, the LC patients did have a greater consumption of cooked tomatoes and cooked root vegetables (p = 0.039 for both), and the controls had more consumption of leeks (p = 0.042) and, among controls younger than 65 years, cooked beans (p = 0.037). Lemon (p = 0.037), squeezed fruit juice (p = 0.032), and watermelon (p = 0.018) were also more frequently consumed by the controls. Other differences at the micronutrient level included greater consumption by the LC patients of retinol (p = 0.044), polyunsaturated fats (p = 0.041), and linoleic acid (p = 0.008); LC patients younger than 65 years also had greater intake of riboflavin (p = 0.045). We conclude that the differences in dietary consumption patterns between LC patients and controls indicate a possible role for lifestyle modifications involving nutritional factors as a means of decreasing the risk of laryngeal cancer.

  10. Associations between dairy cow inter-service interval and probability of conception.

    PubMed

    Remnant, J G; Green, M J; Huxley, J N; Hudson, C D

    2018-07-01

    Recent research has indicated that the interval between inseminations in modern dairy cattle is often longer than the commonly accepted cycle length of 18-24 days. This study analysed 257,396 inseminations in 75,745 cows from 312 herds in England and Wales. The interval between subsequent inseminations in the same cow in the same lactation (inter-service interval, ISI) were calculated and inseminations categorised as successful or unsuccessful depending on whether there was a corresponding calving event. Conception risk was calculated for each individual ISI between 16 and 28 days. A random effects logistic regression model was fitted to the data with pregnancy as the outcome variable and ISI (in days) included in the model as a categorical variable. The modal ISI was 22 days and the peak conception risk was 44% for ISIs of 21 days rising from 27% at 16 days. The logistic regression model revealed significant associations of conception risk with ISI as well as 305 day milk yield, insemination number, parity and days in milk. Predicted conception risk was lower for ISIs of 16, 17 and 18 days and higher for ISIs of 20, 21 and 22 days compared to 25 day ISIs. A mixture model was specified to identify clusters in insemination frequency and conception risk for ISIs between 3 and 50 days. A "high conception risk, high insemination frequency" cluster was identified between 19 and 26 days which indicated that this time period was the true latent distribution for ISI with optimal reproductive outcome. These findings suggest that the period of increased numbers of inseminations around 22 days identified in existing work coincides with the period of increased probability of conception and therefore likely represents true return estrus events. Copyright © 2018 Elsevier Inc. All rights reserved.

  11. Qualitative and Semiquantitative Elastography for the Diagnosis of Intermediate Suspicious Thyroid Nodules Based on the 2015 American Thyroid Association Guidelines.

    PubMed

    Yang, Bo Ra; Kim, Eun-Kyung; Moon, Hee Jung; Yoon, Jung Hyun; Park, Vivian Y; Kwak, Jin Young

    2018-04-01

    To evaluate qualitative and semiquantitative elastography for the diagnosis of intermediate suspicious thyroid nodules based on the 2015 American Thyroid Association (ATA) guidelines. Through a retrospective search of our institutional database, 746 solid thyroid nodules found on grayscale ultrasonography, strain elastography, and ultrasound-guided fine-needle aspiration between June and November 2009 were collected. Among them, 80 nodules from 80 patients with an intermediate suspicion of malignancy based on the 2015 ATA guidelines that were 10 mm or larger were recruited as the final study nodules. Elastographic findings were categorized according to the criteria of Rago et al (J Clin Endocrinol Metab 2007; 92:2917-2922) and Asteria et al (Thyroid 2008; 18:523-531), and strain ratio values were calculated and recorded. The independent 2-sample t test and χ² test (or Fisher exact test) were used to evaluate differences in clinical parameters between benign and malignant thyroid nodules. All variables were compared by univariate and multivariate logistic regression analyses, and odds ratios with 95% confidence intervals were calculated. Of the 80 nodules, 6 (7.5%) were malignant, and 74 (92.5%) were benign. No significant differences were observed in age, sex, nodule size, elasticity score, and strain ratio between benign and malignant nodules. No variables significantly predicted thyroid malignancy on the univariate analysis. On the multivariate logistic regression analysis, there were no independent variables associated with thyroid malignancy, including the elasticity score and strain ratio (all P > .05). Elastographic analysis using the elasticity score and strain ratio has limited ability to characterize the benignity or malignancy of thyroid nodules with an intermediate suspicion of malignancy based on the 2015 ATA guidelines. © 2017 by the American Institute of Ultrasound in Medicine.

  12. Diagnostic accuracy of serum antibodies to human papillomavirus type 16 early antigens in the detection of human papillomavirus-related oropharyngeal cancer.

    PubMed

    Dahlstrom, Kristina R; Anderson, Karen S; Field, Matthew S; Chowell, Diego; Ning, Jing; Li, Nan; Wei, Qingyi; Li, Guojun; Sturgis, Erich M

    2017-12-15

    Because of the current epidemic of human papillomavirus (HPV)-related oropharyngeal cancer (OPC), a screening strategy is urgently needed. The presence of serum antibodies to HPV-16 early (E) antigens is associated with an increased risk for OPC. The purpose of this study was to evaluate the diagnostic accuracy of antibodies to a panel of HPV-16 E antigens in screening for OPC. This case-control study included 378 patients with OPC, 153 patients with nonoropharyngeal head and neck cancer (non-OPC), and 782 healthy control subjects. The tumor HPV status was determined with p16 immunohistochemistry and HPV in situ hybridization. HPV-16 E antibody levels in serum were identified with an enzyme-linked immunosorbent assay. A trained binary logistic regression model based on the combination of all E antigens was predefined and applied to the data set. The sensitivity and specificity of the assay for distinguishing HPV-related OPC from controls were calculated. Logistic regression analysis was used to calculate odds ratios with 95% confidence intervals for the association of head and neck cancer with the antibody status. Of the 378 patients with OPC, 348 had p16-positive OPC. HPV-16 E antibody levels were significantly higher among patients with p16-positive OPC but not among patients with non-OPC or among controls. Serology showed high sensitivity and specificity for HPV-related OPC (binary classifier: 83% sensitivity and 99% specificity for p16-positive OPC). A trained binary classification algorithm that incorporates information about multiple E antibodies has high sensitivity and specificity and may be advantageous for risk stratification in future screening trials. Cancer 2017;123:4886-94. © 2017 American Cancer Society.

  13. Factors associated with developing a fear of falling in subjects with primary open-angle glaucoma.

    PubMed

    Adachi, Sayaka; Yuki, Kenya; Awano-Tanabe, Sachiko; Ono, Takeshi; Shiba, Daisuke; Murata, Hiroshi; Asaoka, Ryo; Tsubota, Kazuo

    2018-02-13

    To investigate the relationship between clinical risk factors, including visual field (VF) defects and visual acuity, and a fear of falling, among patients with primary open-angle glaucoma (POAG). All participants answered the following question at a baseline ophthalmic examination: Are you afraid of falling? The same question was then answered every 12 months for 3 years. A binocular integrated visual field was calculated by merging a patient's monocular Humphrey field analyzer VFs, using the 'best sensitivity' method. The means of total deviation values in the whole, superior peripheral, superior central, inferior central, and inferior peripheral VFs were calculated. The relationship between these mean VF measurements, and various clinical factors, against patients' baseline fear of falling and future fear of falling was analyzed using multiple logistic regression. Among 392 POAG subjects, 342 patients (87.2%) responded to the fear of falling question at least twice in the 3-year study period. The optimal regression model for patients' baseline fear of falling included age, gender, mean of total deviation values in the inferior peripheral VF and number of previous falls. The optimal regression equation for future fear of falling included age, gender, mean of total deviation values in the inferior peripheral VF and number of previous falls. Defects in the inferior peripheral VF area are significantly related to the development of a fear of falling.

  14. A Comparison of the Logistic Regression and Contingency Table Methods for Simultaneous Detection of Uniform and Nonuniform DIF

    ERIC Educational Resources Information Center

    Guler, Nese; Penfield, Randall D.

    2009-01-01

    In this study, we investigate the logistic regression (LR), Mantel-Haenszel (MH), and Breslow-Day (BD) procedures for the simultaneous detection of both uniform and nonuniform differential item functioning (DIF). A simulation study was used to assess and compare the Type I error rate and power of a combined decision rule (CDR), which assesses DIF…

  15. The Overall Odds Ratio as an Intuitive Effect Size Index for Multiple Logistic Regression: Examination of Further Refinements

    ERIC Educational Resources Information Center

    Le, Huy; Marcus, Justin

    2012-01-01

    This study used Monte Carlo simulation to examine the properties of the overall odds ratio (OOR), which was recently introduced as an index for overall effect size in multiple logistic regression. It was found that the OOR was relatively independent of study base rate and performed better than most commonly used R-square analogs in indexing model…

  16. Predicting Student Success on the Texas Chemistry STAAR Test: A Logistic Regression Analysis

    ERIC Educational Resources Information Center

    Johnson, William L.; Johnson, Annabel M.; Johnson, Jared

    2012-01-01

    Background: The context is the new Texas STAAR end-of-course testing program. Purpose: The authors developed a logistic regression model to predict who would pass-or-fail the new Texas chemistry STAAR end-of-course exam. Setting: Robert E. Lee High School (5A) with an enrollment of 2700 students, Tyler, Texas. Date of the study was the 2011-2012…

  17. Using ROC curves to compare neural networks and logistic regression for modeling individual noncatastrophic tree mortality

    Treesearch

    Susan L. King

    2003-01-01

    The performance of two classifiers, logistic regression and neural networks, is compared for modeling noncatastrophic individual tree mortality for 21 species of trees in West Virginia. The output of the classifier is usually a continuous number between 0 and 1. A threshold is selected between 0 and 1 and all of the trees below the threshold are classified as...
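
    An ROC curve is traced by sweeping the threshold across the classifier's output range and recording the false positive rate and true positive rate at each setting. The sketch below illustrates that sweep on made-up data. It assumes that outputs at or above the threshold are labelled as mortality; the truncated abstract does not state which side of the threshold maps to which class, so this is an assumption, as are the function name and the data.

```python
def roc_points(scores, died, thresholds):
    """Return (false positive rate, true positive rate) for each threshold.

    Assumption: score >= threshold is classified as mortality.
    """
    n_pos = sum(died)
    n_neg = len(died) - n_pos
    points = []
    for t in thresholds:
        tp = sum(1 for s, d in zip(scores, died) if d and s >= t)
        fp = sum(1 for s, d in zip(scores, died) if not d and s >= t)
        points.append((fp / n_neg, tp / n_pos))
    return points


# Made-up classifier outputs and observed mortality (1 = tree died).
scores = [0.10, 0.35, 0.40, 0.80, 0.65, 0.20]
died = [0, 0, 1, 1, 1, 0]

points = roc_points(scores, died, thresholds=[0.0, 0.3, 0.5, 1.01])
for fpr, tpr in points:
    print(round(fpr, 2), round(tpr, 2))
```

    Two classifiers can then be compared by plotting their point sets on the same axes or by comparing the areas under the resulting curves.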

  18. Consistency in reporting condom use between husbands and wives in Bangladesh.

    PubMed

    Islam, Mohammad Amirul; Padmadas, Sabu S; Smith, Peter W F

    2010-07-01

    Consistency in reporting contraceptive use between spouses is little understood, especially in developing settings. This research challenges the accuracy of measuring contraceptive prevalence rate, which is traditionally calculated based on women's responses. Multinomial logistic regression techniques are employed on a couple dataset from the 1999-2000 Bangladesh Demographic and Health Survey (DHS) to investigate the consistency in reporting condom use between husbands and wives. The level of inconsistency in reporting condom use was about 46%, of which about 32% was explained by husbands reporting condom use while wives did not, and 14% by wives reporting condom use while husbands did not. Regression analysis showed that couple education and age difference between the spouses are significant determinants of inconsistent reporting behaviour. The findings suggest the need for alternative approaches (questions) in the DHS to ensure consistency in the collection of data related to use of family planning methods.

  19. Logistic regression trees for initial selection of interesting loci in case-control studies

    PubMed Central

    Nickolov, Radoslav Z; Milanov, Valentin B

    2007-01-01

    Modern genetic epidemiology faces the challenge of dealing with hundreds of thousands of genetic markers. The selection of a small initial subset of interesting markers for further investigation can greatly facilitate genetic studies. In this contribution we suggest the use of a logistic regression tree algorithm known as logistic tree with unbiased selection. Using the simulated data provided for Genetic Analysis Workshop 15, we show how this algorithm, with incorporation of multifactor dimensionality reduction method, can reduce an initial large pool of markers to a small set that includes the interesting markers with high probability. PMID:18466557

  20. Using Logistic Regression to Predict the Probability of Debris Flows in Areas Burned by Wildfires, Southern California, 2003-2006

    USGS Publications Warehouse

    Rupert, Michael G.; Cannon, Susan H.; Gartner, Joseph E.; Michael, John A.; Helsel, Dennis R.

    2008-01-01

    Logistic regression was used to develop statistical models that can be used to predict the probability of debris flows in areas recently burned by wildfires by using data from 14 wildfires that burned in southern California during 2003-2006. Twenty-eight independent variables describing the basin morphology, burn severity, rainfall, and soil properties of 306 drainage basins located within those burned areas were evaluated. The models were developed as follows: (1) Basins that did and did not produce debris flows soon after the 2003 to 2006 fires were delineated from data in the National Elevation Dataset using a geographic information system; (2) Data describing the basin morphology, burn severity, rainfall, and soil properties were compiled for each basin. These data were then input to a statistics software package for analysis using logistic regression; and (3) Relations between the occurrence or absence of debris flows and the basin morphology, burn severity, rainfall, and soil properties were evaluated, and five multivariate logistic regression models were constructed. All possible combinations of independent variables were evaluated to determine which combinations produced the most effective models, and the multivariate models that best predicted the occurrence of debris flows were identified. Percentage of high burn severity and 3-hour peak rainfall intensity were significant variables in all models. Soil organic matter content and soil clay content were significant variables in all models except Model 5. Soil slope was a significant variable in all models except Model 4. The most suitable model can be selected from these five models on the basis of the availability of independent variables in the particular area of interest and field checking of probability maps. 
The multivariate logistic regression models can be entered into a geographic information system, and maps showing the probability of debris flows can be constructed in recently burned areas of southern California. This study demonstrates that logistic regression is a valuable tool for developing models that predict the probability of debris flows occurring in recently burned landscapes.
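
    As an illustration of how a fitted logistic regression model of this kind turns basin attributes into a probability, the sketch below evaluates P = 1 / (1 + exp(-x)) for a linear predictor x. The intercept, coefficients, and basin values are hypothetical placeholders chosen for the example; they are not the coefficients of any of the five published models.

```python
import math


def debris_flow_probability(intercept, coefficients, values):
    """Return P = 1 / (1 + exp(-x)), where x = intercept + sum(b_i * v_i)."""
    x = intercept + sum(b * v for b, v in zip(coefficients, values))
    return 1.0 / (1.0 + math.exp(-x))


# Hypothetical coefficients for two predictors named in the abstract:
# percentage of high burn severity and 3-hour peak rainfall intensity.
# These numbers are illustrative assumptions, not published values.
intercept = -4.0
coefficients = [0.03, 0.25]

# Hypothetical basin: 60% high burn severity, rainfall intensity of 8 units.
basin = [60.0, 8.0]
probability = debris_flow_probability(intercept, coefficients, basin)
print(f"Estimated debris-flow probability: {probability:.3f}")
```

    Evaluating the same function for every basin in a geographic information system layer would give the kind of probability map the abstract describes.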

  1. Polymorphism Thr160Thr in SRD5A1, involved in the progesterone metabolism, modifies postmenopausal breast cancer risk associated with menopausal hormone therapy.

    PubMed

    Hein, R; Abbas, S; Seibold, P; Salazar, R; Flesch-Janys, D; Chang-Claude, J

    2012-01-01

    Menopausal hormone therapy (MHT) is associated with an increased breast cancer risk in postmenopausal women, with combined estrogen-progestagen therapy posing a greater risk than estrogen monotherapy. However, few studies focused on potential effect modification of MHT-associated breast cancer risk by genetic polymorphisms in the progesterone metabolism. We assessed effect modification of MHT use by five coding single nucleotide polymorphisms (SNPs) in the progesterone metabolizing enzymes AKR1C3 (rs7741), AKR1C4 (rs3829125, rs17134592), and SRD5A1 (rs248793, rs3736316) using a two-center population-based case-control study from Germany with 2,502 postmenopausal breast cancer patients and 4,833 matched controls. An empirical-Bayes procedure that tests for interaction using a weighted combination of the prospective and the retrospective case-control estimators as well as standard prospective logistic regression were applied to assess multiplicative statistical interaction between polymorphisms and duration of MHT use with regard to breast cancer risk assuming a log-additive mode of inheritance. No genetic marginal effects were observed. Breast cancer risk associated with duration of combined therapy was significantly modified by SRD5A1_rs3736316, showing a reduced risk elevation in carriers of the minor allele (p (interaction,empirical-Bayes) = 0.006 using the empirical-Bayes method, p (interaction,logistic regression) = 0.013 using logistic regression). The risk associated with duration of use of monotherapy was increased by AKR1C3_rs7741 in minor allele carriers (p (interaction,empirical-Bayes) = 0.083, p (interaction,logistic regression) = 0.029) and decreased in minor allele carriers of two SNPs in AKR1C4 (rs3829125: p (interaction,empirical-Bayes) = 0.07, p (interaction,logistic regression) = 0.021; rs17134592: p (interaction,empirical-Bayes) = 0.101, p (interaction,logistic regression) = 0.038). 
After Bonferroni correction for multiple testing only SRD5A1_rs3736316 assessed using the empirical-Bayes method remained significant. Postmenopausal breast cancer risk associated with combined therapy may be modified by genetic variation in SRD5A1. Further well-powered studies are, however, required to replicate our finding.

  2. The comparison of landslide ratio-based and general logistic regression landslide susceptibility models in the Chishan watershed after 2009 Typhoon Morakot

    NASA Astrophysics Data System (ADS)

    WU, Chunhung

    2015-04-01

    The research built the original logistic regression landslide susceptibility model (abbreviated as or-LRLSM) and the landslide ratio-based logistic regression landslide susceptibility model (abbreviated as lr-LRLSM), compared their performance, and explained the error sources of the two models. The research assumes that the performance of the logistic regression model can be better if the distribution of landslide ratio and weighted value of each variable is similar. Landslide ratio is the ratio of landslide area to total area in the specific area and a useful index to evaluate the seriousness of landslide disaster in Taiwan. The research adopted the landslide inventory induced by 2009 Typhoon Morakot in the Chishan watershed, which was the most serious disaster event in the last decade in Taiwan. The research adopted the 20 m grid as the basic unit in building the LRLSM, and six variables, including elevation, slope, aspect, geological formation, accumulated rainfall, and bank erosion, were included in the two models. In building the or-LRLSM, the six variables were divided into continuous variables (elevation, slope, and accumulated rainfall) and categorical variables (aspect, geological formation, and bank erosion), while in building the lr-LRLSM all variables were classified based on landslide ratio and treated as categorical. Because the basic units in the Chishan watershed were too numerous to process with commercial software, the research used random sampling instead of the whole set of basic units. The research adopted equal proportions of landslide and non-landslide units in the logistic regression analysis. The research drew 10 random samples and selected the group with the best Cox & Snell R2 and Nagelkerke R2 values as the database for the following analysis. 
Based on the best result from 10 random sampling groups, the or-LRLSM (lr-LRLSM) is significant at the 1% level with Cox & Snell R2 = 0.190 (0.196) and Nagelkerke R2 = 0.253 (0.260). A unit with a landslide susceptibility value > 0.5 (≦ 0.5) is classified as a predicted landslide unit (non-landslide unit). The AUC, i.e. the area under the relative operating characteristic curve, of or-LRLSM in the Chishan watershed is 0.72, while that of lr-LRLSM is 0.77. Furthermore, the average correct ratio of lr-LRLSM (73.3%) is better than that of or-LRLSM (68.3%). The research analyzed in detail the error sources from the two models. For continuous variables, the landslide ratio-based classification used in the lr-LRLSM makes the distribution of weighted values more similar to the distribution of landslide ratio across the range of each variable than in the or-LRLSM. For categorical variables, the landslide ratio-based classification in the lr-LRLSM groups together the parameters with similar landslide ratios. The mean correct ratio for continuous variables (categorical variables) using the lr-LRLSM is better than that of the or-LRLSM by 0.6 ~ 2.6% (1.7% ~ 6.0%). Building the landslide susceptibility model with landslide ratio-based classification is practical and performs better than the original logistic regression.
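
    The two pseudo-R2 values reported above are linked by a fixed formula, which the sketch below works through. It uses the standard definitions: Cox & Snell R2 = 1 - exp(2(LL_null - LL_model)/n), and Nagelkerke R2 is that value divided by its maximum, 1 - exp(2 LL_null / n). With equal proportions of landslide and non-landslide units, the maximum is 1 - 0.5^2 = 0.75, so 0.190 / 0.75 ≈ 0.253, in line with the reported pair. The sample size n = 1000 is a hypothetical assumption (the ratio does not depend on it), and the model log-likelihood is back-calculated for the example rather than taken from the study.

```python
import math


def null_log_likelihood(n, event_fraction):
    """Log-likelihood of the intercept-only model for n units."""
    p = event_fraction
    return n * (p * math.log(p) + (1.0 - p) * math.log(1.0 - p))


def cox_snell_r2(ll_null, ll_model, n):
    """Cox & Snell pseudo-R2 from null and fitted log-likelihoods."""
    return 1.0 - math.exp(2.0 * (ll_null - ll_model) / n)


def nagelkerke_r2(ll_null, ll_model, n):
    """Cox & Snell R2 rescaled by its maximum attainable value."""
    max_r2 = 1.0 - math.exp(2.0 * ll_null / n)
    return cox_snell_r2(ll_null, ll_model, n) / max_r2


n = 1000  # hypothetical sample size, not taken from the abstract
ll_null = null_log_likelihood(n, 0.5)  # balanced landslide / non-landslide sample

# Back-calculate a model log-likelihood that gives Cox & Snell R2 = 0.190.
ll_model = ll_null - 0.5 * n * math.log(1.0 - 0.190)

r2_cs = cox_snell_r2(ll_null, ll_model, n)
r2_n = nagelkerke_r2(ll_null, ll_model, n)
print(f"Cox & Snell R2 = {r2_cs:.3f}, Nagelkerke R2 = {r2_n:.3f}")
```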

  3. Applications of statistics to medical science, III. Correlation and regression.

    PubMed

    Watanabe, Hiroshi

    2012-01-01

    In this third part of a series surveying medical statistics, the concepts of correlation and regression are reviewed. In particular, methods of linear regression and logistic regression are discussed. Arguments related to survival analysis will be made in a subsequent paper.

  4. Computing group cardinality constraint solutions for logistic regression problems.

    PubMed

    Zhang, Yong; Kwon, Dongjin; Pohl, Kilian M

    2017-01-01

    We derive an algorithm to directly solve logistic regression based on cardinality constraint, group sparsity and use it to classify intra-subject MRI sequences (e.g. cine MRIs) of healthy from diseased subjects. Group cardinality constraint models are often applied to medical images in order to avoid overfitting of the classifier to the training data. Solutions within these models are generally determined by relaxing the cardinality constraint to a weighted feature selection scheme. However, these solutions relate to the original sparse problem only under specific assumptions, which generally do not hold for medical image applications. In addition, inferring clinical meaning from features weighted by a classifier is an ongoing topic of discussion. Avoiding weighing features, we propose to directly solve the group cardinality constraint logistic regression problem by generalizing the Penalty Decomposition method. To do so, we assume that an intra-subject series of images represents repeated samples of the same disease patterns. We model this assumption by combining series of measurements created by a feature across time into a single group. Our algorithm then derives a solution within that model by decoupling the minimization of the logistic regression function from enforcing the group sparsity constraint. The minimum to the smooth and convex logistic regression problem is determined via gradient descent while we derive a closed form solution for finding a sparse approximation of that minimum. We apply our method to cine MRI of 38 healthy controls and 44 adult patients that received reconstructive surgery of Tetralogy of Fallot (TOF) during infancy. Our method correctly identifies regions impacted by TOF and generally obtains statistically significant higher classification accuracy than alternative solutions to this model, i.e., ones relaxing group cardinality constraints. Copyright © 2016 Elsevier B.V. All rights reserved.
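
    The sparse-approximation step mentioned in this abstract can be illustrated with the standard Euclidean projection onto a group cardinality constraint: keep the k groups with the largest L2 norm and zero out the rest. The sketch below shows that generic projection only. It is an assumption that the authors' closed-form step resembles it; the function name, weights, and grouping are hypothetical.

```python
import math


def keep_top_k_groups(weights, groups, k):
    """Zero every group of weights except the k groups with the largest L2 norm.

    weights: flat list of coefficients.
    groups:  list of index lists, one per group (e.g. one feature across time).
    """
    norms = [math.sqrt(sum(weights[i] ** 2 for i in group)) for group in groups]
    ranked = sorted(range(len(groups)), key=lambda j: norms[j], reverse=True)
    sparse = [0.0] * len(weights)
    for j in ranked[:k]:
        for i in groups[j]:
            sparse[i] = weights[i]
    return sparse


# Hypothetical dense solution with three groups of two coefficients each.
dense = [0.1, -0.2, 1.5, 0.5, 0.0, 0.3]
groups = [[0, 1], [2, 3], [4, 5]]
sparse = keep_top_k_groups(dense, groups, k=1)
print(sparse)
```

    In a penalty-decomposition scheme, a projection like this would alternate with gradient steps on the smooth logistic loss.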

  5. Influential factors of red-light running at signalized intersection and prediction using a rare events logistic regression model.

    PubMed

    Ren, Yilong; Wang, Yunpeng; Wu, Xinkai; Yu, Guizhen; Ding, Chuan

    2016-10-01

    Red light running (RLR) has become a major safety concern at signalized intersections. To prevent RLR-related crashes, it is critical to identify the factors that significantly impact drivers' RLR behavior, and to predict potential RLR in real time. In this research, 9 months of RLR events extracted from high-resolution traffic data collected by loop detectors at three signalized intersections were used to identify the factors that significantly affect RLR behaviors. The data analysis indicated that occupancy time, time gap, used yellow time, time left to yellow start, whether the preceding vehicle runs through the intersection during yellow, and whether there is a vehicle passing through the intersection on the adjacent lane were significant factors for RLR behaviors. Furthermore, due to the rare-events nature of RLR, a modified rare events logistic regression model was developed for RLR prediction. The rare events logistic regression method has been applied in many fields for rare events studies and shows impressive performance, but so far no previous research has applied this method to study RLR. The results showed that the rare events logistic regression model performed significantly better than the standard logistic regression model. More importantly, the proposed RLR prediction method is purely based on loop detector data collected from a single advance loop detector located 400 feet away from the stop-bar. This brings great potential for future field applications of the proposed method since loops have been widely implemented in many intersections and can collect data in real time. This research is expected to contribute significantly to the improvement of intersection safety. Copyright © 2016 Elsevier Ltd. All rights reserved.

  6. Use of genetic programming, logistic regression, and artificial neural nets to predict readmission after coronary artery bypass surgery.

    PubMed

    Engoren, Milo; Habib, Robert H; Dooner, John J; Schwann, Thomas A

    2013-08-01

    As many as 14 % of patients undergoing coronary artery bypass surgery are readmitted within 30 days. Readmission is usually the result of morbidity and may lead to death. The purpose of this study is to develop and compare statistical and genetic programming models to predict readmission. Patients were divided into separate Construction and Validation populations. Using 88 variables, logistic regression, genetic programs, and artificial neural nets were used to develop predictive models. Models were first constructed and tested on the Construction populations, then validated on the Validation population. Areas under the receiver operator characteristic curves (AU ROC) were used to compare the models. Two hundred and two patients (7.6 %) in the 2,644 patient Construction group and 216 (8.0 %) of the 2,711 patient Validation group were re-admitted within 30 days of CABG surgery. Logistic regression predicted readmission with AU ROC = .675 ± .021 in the Construction group. Genetic programs significantly improved the accuracy (AU ROC = .767 ± .001, p < .001). Artificial neural nets were less accurate with AU ROC = 0.597 ± .001 in the Construction group. Predictive accuracy of all three techniques fell in the Validation group. However, the accuracy of genetic programming (AU ROC = .654 ± .001) was still slightly, though not statistically significantly, better than that of the logistic regression (AU ROC = .644 ± .020, p = .61). Genetic programming and logistic regression provide alternative methods to predict readmission that are similarly accurate.
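
    The area under the ROC curve used to compare these models has a simple rank interpretation: it is the probability that a randomly chosen readmitted patient receives a higher predicted risk than a randomly chosen non-readmitted patient, with ties counted as one half. The sketch below computes it by direct pairwise comparison. The predicted risks are hypothetical values made up for the example, not data from the study.

```python
def auc_from_scores(positive_scores, negative_scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    Ties between a positive and a negative score count as half a win.
    """
    wins = 0.0
    for pos in positive_scores:
        for neg in negative_scores:
            if pos > neg:
                wins += 1.0
            elif pos == neg:
                wins += 0.5
    return wins / (len(positive_scores) * len(negative_scores))


# Hypothetical predicted readmission risks (illustrative only).
readmitted = [0.30, 0.22, 0.15]
not_readmitted = [0.25, 0.10, 0.08, 0.05]
auc = auc_from_scores(readmitted, not_readmitted)
print(f"AUC = {auc:.3f}")
```

    The pairwise loop is quadratic in the number of patients; for cohorts of the size reported here a rank-based formula or a library routine would be the more practical choice.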

  7. Artificial neural network, genetic algorithm, and logistic regression applications for predicting renal colic in emergency settings.

    PubMed

    Eken, Cenker; Bilge, Ugur; Kartal, Mutlu; Eray, Oktay

    2009-06-03

    Logistic regression is the most common statistical model for processing multivariate data in the medical literature. Artificial intelligence models like an artificial neural network (ANN) and genetic algorithm (GA) may also be useful to interpret medical data. The purpose of this study was to perform artificial intelligence models on a medical data sheet and compare to logistic regression. ANN, GA, and logistic regression analysis were carried out on a data sheet of a previously published article regarding patients presenting to an emergency department with flank pain suspicious for renal colic. The study population was composed of 227 patients: 176 patients had a diagnosis of urinary stone, while 51 ultimately had no calculus. The GA found two decision rules in predicting urinary stones. Rule 1 consisted of being male, pain not spreading to back, and no fever. In rule 2, pelvicaliceal dilatation on bedside ultrasonography replaced no fever. ANN, GA rule 1, GA rule 2, and logistic regression had a sensitivity of 94.9, 67.6, 56.8, and 95.5%, a specificity of 78.4, 76.47, 86.3, and 47.1%, a positive likelihood ratio of 4.4, 2.9, 4.1, and 1.8, and a negative likelihood ratio of 0.06, 0.42, 0.5, and 0.09, respectively. The area under the curve was found to be 0.867, 0.720, 0.715, and 0.713 for all applications, respectively. Data mining techniques such as ANN and GA can be used for predicting renal colic in emergency settings and to constitute clinical decision rules. They may be an alternative to conventional multivariate analysis applications used in biostatistics.
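
    The likelihood ratios quoted above follow directly from sensitivity and specificity: LR+ = sensitivity / (1 - specificity) and LR- = (1 - sensitivity) / specificity. The sketch below recomputes them from the percentages in the abstract. The positive likelihood ratios round to the reported 4.4, 2.9, 4.1, and 1.8; the negative ratios can differ in the last digit because the published sensitivities and specificities are themselves rounded. The function and dictionary names are illustrative.

```python
def likelihood_ratios(sensitivity, specificity):
    """Return (LR+, LR-) for sensitivity and specificity given as fractions."""
    lr_positive = sensitivity / (1.0 - specificity)
    lr_negative = (1.0 - sensitivity) / specificity
    return lr_positive, lr_negative


# Sensitivity and specificity from the abstract, converted to fractions.
models = {
    "ANN": (0.949, 0.784),
    "GA rule 1": (0.676, 0.7647),
    "GA rule 2": (0.568, 0.863),
    "Logistic regression": (0.955, 0.471),
}

results = {name: likelihood_ratios(se, sp) for name, (se, sp) in models.items()}
for name, (lr_pos, lr_neg) in results.items():
    print(f"{name}: LR+ = {lr_pos:.1f}, LR- = {lr_neg:.2f}")
```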

  8. Application of logistic regression for landslide susceptibility zoning of Cekmece Area, Istanbul, Turkey

    NASA Astrophysics Data System (ADS)

    Duman, T. Y.; Can, T.; Gokceoglu, C.; Nefeslioglu, H. A.; Sonmez, H.

    2006-11-01

    As a result of industrialization, throughout the world, cities have been growing rapidly for the last century. One typical example of these growing cities is Istanbul, the population of which is over 10 million. Due to rapid urbanization, new areas suitable for settlement and engineering structures are necessary. The Cekmece area located west of the Istanbul metropolitan area is studied, because the landslide activity is extensive in this area. The purpose of this study is to develop a model that can be used to characterize landslide susceptibility in map form using logistic regression analysis of an extensive landslide database. A database of landslide activity was constructed using both aerial-photography and field studies. About 19.2% of the selected study area is covered by deep-seated landslides. The landslides that occur in the area are primarily located in sandstones with interbedded permeable and impermeable layers such as claystone, siltstone and mudstone. About 31.95% of the total landslide area is located in this unit. To apply logistic regression analyses, a data matrix including 37 variables was constructed. The variables used in the forward stepwise analyses are different measures of slope, aspect, elevation, stream power index (SPI), plan curvature, profile curvature, geology, geomorphology and relative permeability of lithological units. A total of 25 variables were identified as exerting strong influence on landslide occurrence, and were included in the logistic regression equation. Wald statistics values indicate that lithology, SPI and slope are more important than the other parameters in the equation. Beta coefficients of the 25 variables included in the logistic regression equation provide a model for landslide susceptibility in the Cekmece area. This model is used to generate a landslide susceptibility map that correctly classified 83.8% of the landslide-prone areas.

  9. Aortic Curvature Is a Predictor of Late Type Ia Endoleak and Migration After Endovascular Aneurysm Repair.

    PubMed

    Schuurmann, Richte C L; van Noort, Kim; Overeem, Simon P; Ouriel, Kenneth; Jordan, William D; Muhs, Bart E; 't Mannetje, Yannick; Reijnen, Michel; Fioole, Bram; Ünlü, Çağdaş; Brummel, Peter; de Vries, Jean-Paul P M

    2017-06-01

    To evaluate the association between aortic curvature and other preoperative anatomical characteristics and late (>1 year) type Ia endoleak and endograft migration in endovascular aneurysm repair (EVAR) patients. Eight high-volume EVAR centers contributed 116 EVAR patients (mean age 81±7 years; 103 men) to the study: 36 patients (mean age 82±7 years; 31 men) with endograft migration and/or type Ia endoleak diagnosed >1 year after the initial EVAR and 80 controls without early or late complications. Aortic curvature was calculated from the preoperative computed tomography scan as the maximum and average curvature over 5 predefined aortic segments: the entire infrarenal aortic neck, aneurysm sac, and the suprarenal, juxtarenal, and infrarenal aorta. Other morphological characteristics included neck length, neck diameter, mural neck calcification and thrombus, suprarenal and infrarenal angulation, and largest aneurysm sac diameter. Independent risk factors were identified using backward stepwise logistic regression. Relevant cutoff values for each of the variables in the final regression model were determined with the receiver operator characteristic curve. Logistic regression identified maximum curvature over the length of the aneurysm sac (>47 m⁻¹; p=0.023), largest aneurysm sac diameter (>56 mm; p=0.028), and mural neck thrombus (>11° circumference; p<0.001) as independent predictors of late migration and type Ia endoleak. Aortic curvature is a predictor for late type Ia endoleak and endograft migration after EVAR. These findings suggest that aortic curvature is a better parameter than angulation to predict post-EVAR failure and should be included as a hostile neck parameter.

  10. Evaluation of spectral domain optical coherence tomography parameters in ocular hypertension, preperimetric, and early glaucoma.

    PubMed

    Aydogan, Tuğba; Akçay, Betül İlkay Sezgin; Kardeş, Esra; Ergin, Ahmet

    2017-11-01

    The objective of this study is to evaluate the diagnostic ability of retinal nerve fiber layer (RNFL), macular, and optic nerve head (ONH) parameters in healthy subjects, ocular hypertension (OHT), preperimetric glaucoma (PPG), and early glaucoma (EG) patients, to reveal factors affecting the diagnostic ability of spectral domain-optical coherence tomography (SD-OCT) parameters and risk factors for glaucoma. Three hundred and twenty-six eyes (89 healthy, 77 OHT, 94 PPG, and 66 EG eyes) were analyzed. RNFL, macular, and ONH parameters were measured with SD-OCT. The area under the receiver operating characteristic curve (AUC) and sensitivity at 95% specificity were calculated. Logistic regression analysis was used to determine the glaucoma risk factors. Receiver operating characteristic regression analysis was used to evaluate the influence of covariates on the diagnostic ability of parameters. In PPG patients, the parameters that had the largest AUC value were average RNFL thickness (0.83) and rim volume (0.83). In EG patients, the parameter that had the largest AUC value was average RNFL thickness (0.98). The logistic regression analysis showed average RNFL thickness was a risk factor for both PPG and EG. Diagnostic ability of average RNFL and average ganglion cell complex thickness increased as disease severity increased. Signal strength index did not affect diagnostic abilities. Diagnostic ability of average RNFL and rim area increased as disc area increased. When evaluating patients with glaucoma, patients at risk for glaucoma, and healthy controls, RNFL parameters deserve more attention in clinical practice. Further studies are needed to fully understand the influence of covariates on the diagnostic ability of OCT parameters.

  11. Comparison between antegrade and retrograde cerebral perfusion or profound hypothermia as brain protection strategies during repair of type A aortic dissection.

    PubMed

    Stamou, Sotiris C; Rausch, Laura A; Kouchoukos, Nicholas T; Lobdell, Kevin W; Khabbaz, Kamal; Murphy, Edward; Hagberg, Robert C

    2016-07-01

    The goal of this study was to compare early postoperative outcomes and actuarial-free survival between patients who underwent repair of acute type A aortic dissection by the method of cerebral perfusion used. A total of 324 patients from five academic medical centers underwent repair of acute type A aortic dissection between January 2000 and December 2010. Of those, antegrade cerebral perfusion (ACP) was used for 84 patients, retrograde cerebral perfusion (RCP) was used for 55 patients, and deep hypothermic circulatory arrest (DHCA) was used for 184 patients during repair. Major morbidity, operative mortality, and 5-year actuarial survival were compared between groups. Multivariate logistic regression was used to determine predictors of operative mortality and Cox Regression hazard ratios were calculated to determine the predictors of long term mortality. Operative mortality was not influenced by the type of cerebral protection (19% for ACP, 14.5% for RCP and 19.1% for DHCA, P=0.729). In multivariable logistic regression analysis, hemodynamic instability [odds ratio (OR) =19.6, 95% confidence intervals (CI), 0.102-0.414, P<0.001] and CPB time >200 min (OR =4.7, 95% CI, 1.962-1.072, P=0.029) emerged as independent predictors of operative mortality. Actuarial 5-year survival was unchanged by cerebral protection modality (48.8% for ACP, 61.8% for RCP and 66.8% for no cerebral protection, log-rank P=0.844). During surgical repair of type A aortic dissection, ACP, RCP or DHCA are safe strategies for cerebral protection in selected patients with type A aortic dissection.

  12. New robust statistical procedures for the polytomous logistic regression models.

    PubMed

    Castilla, Elena; Ghosh, Abhik; Martin, Nirian; Pardo, Leandro

    2018-05-17

    This article derives a new family of estimators, namely the minimum density power divergence estimators, as a robust generalization of the maximum likelihood estimator for the polytomous logistic regression model. Based on these estimators, a family of Wald-type test statistics for linear hypotheses is introduced. Robustness properties of both the proposed estimators and the test statistics are theoretically studied through the classical influence function analysis. Appropriate real life examples are presented to justify the requirement of suitable robust statistical procedures in place of the likelihood based inference for the polytomous logistic regression model. The validity of the theoretical results established in the article is further confirmed empirically through suitable simulation studies. Finally, an approach for the data-driven selection of the robustness tuning parameter is proposed with empirical justifications. © 2018, The International Biometric Society.

  13. Nowcasting of Low-Visibility Procedure States with Ordered Logistic Regression at Vienna International Airport

    NASA Astrophysics Data System (ADS)

    Kneringer, Philipp; Dietz, Sebastian; Mayr, Georg J.; Zeileis, Achim

    2017-04-01

    Low-visibility conditions have a large impact on aviation safety and economic efficiency of airports and airlines. To support decision makers, we develop a statistical probabilistic nowcasting tool for the occurrence of capacity-reducing operations related to low visibility. The probabilities of four different low visibility classes are predicted with an ordered logistic regression model based on time series of meteorological point measurements. Potential predictor variables for the statistical models are visibility, humidity, temperature and wind measurements at several measurement sites. A stepwise variable selection method indicates that visibility and humidity measurements are the most important model inputs. The forecasts are tested with a 30 minute forecast interval up to two hours, which is a sufficient time span for tactical planning at Vienna Airport. The ordered logistic regression models outperform persistence and are competitive with human forecasters.
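
    To make the ordered logistic regression step concrete, the sketch below shows how a proportional-odds (cumulative logit) model yields probabilities for four ordered classes from three increasing thresholds and one linear predictor: P(Y <= k) = 1 / (1 + exp(-(theta_k - eta))), with class probabilities obtained as differences of consecutive cumulative probabilities. The thresholds and the linear predictor value are hypothetical; the abstract does not report fitted values.

```python
import math


def _logistic(z):
    """Standard logistic function."""
    return 1.0 / (1.0 + math.exp(-z))


def ordered_logit_probabilities(thresholds, linear_predictor):
    """Class probabilities for a cumulative-logit model.

    thresholds must be strictly increasing; K thresholds give K + 1 classes.
    """
    cumulative = [_logistic(t - linear_predictor) for t in thresholds] + [1.0]
    probabilities = []
    previous = 0.0
    for value in cumulative:
        probabilities.append(value - previous)
        previous = value
    return probabilities


# Hypothetical thresholds separating four low-visibility classes, and a
# hypothetical linear predictor built from visibility and humidity inputs.
thresholds = [-1.0, 0.5, 2.0]
eta = 0.3
class_probabilities = ordered_logit_probabilities(thresholds, eta)
print([round(p, 3) for p in class_probabilities])
```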

  14. EXpectation Propagation LOgistic REgRession (EXPLORER): distributed privacy-preserving online model learning.

    PubMed

    Wang, Shuang; Jiang, Xiaoqian; Wu, Yuan; Cui, Lijuan; Cheng, Samuel; Ohno-Machado, Lucila

    2013-06-01

    We developed an EXpectation Propagation LOgistic REgRession (EXPLORER) model for distributed privacy-preserving online learning. The proposed framework provides a high level guarantee for protecting sensitive information, since the information exchanged between the server and the client is the encrypted posterior distribution of coefficients. Through experimental results, EXPLORER shows the same performance (e.g., discrimination, calibration, feature selection, etc.) as the traditional frequentist logistic regression model, but provides more flexibility in model updating. That is, EXPLORER can be updated one point at a time rather than having to retrain the entire data set when new observations are recorded. The proposed EXPLORER supports asynchronized communication, which relieves the participants from coordinating with one another, and prevents service breakdown from the absence of participants or interrupted communications. Copyright © 2013 Elsevier Inc. All rights reserved.

  15. A computational approach to compare regression modelling strategies in prediction research.

    PubMed

    Pajouheshnia, Romin; Pestman, Wiebe R; Teerenstra, Steven; Groenwold, Rolf H H

    2016-08-25

    It is often unclear which approach to fit, assess and adjust a model will yield the most accurate prediction model. We present an extension of an approach for comparing modelling strategies in linear regression to the setting of logistic regression and demonstrate its application in clinical prediction research. A framework for comparing logistic regression modelling strategies by their likelihoods was formulated using a wrapper approach. Five different strategies for modelling, including simple shrinkage methods, were compared in four empirical data sets to illustrate the concept of a priori strategy comparison. Simulations were performed in both randomly generated data and empirical data to investigate the influence of data characteristics on strategy performance. We applied the comparison framework in a case study setting. Optimal strategies were selected based on the results of a priori comparisons in a clinical data set and the performance of models built according to each strategy was assessed using the Brier score and calibration plots. The performance of modelling strategies was highly dependent on the characteristics of the development data in both linear and logistic regression settings. A priori comparisons in four empirical data sets found that no strategy consistently outperformed the others. The percentage of times that a model adjustment strategy outperformed a logistic model ranged from 3.9 to 94.9 %, depending on the strategy and data set. However, in our case study setting the a priori selection of optimal methods did not result in detectable improvement in model performance when assessed in an external data set. The performance of prediction modelling strategies is a data-dependent process and can be highly variable between data sets within the same clinical domain. A priori strategy comparison can be used to determine an optimal logistic regression modelling strategy for a given data set before selecting a final modelling approach.
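
    The Brier score used here to assess model performance is the mean squared difference between predicted probabilities and observed binary outcomes, so lower is better and a constant prediction of 0.5 scores 0.25. A minimal sketch follows; the predictions and outcomes are hypothetical values chosen for the example.

```python
def brier_score(predicted, observed):
    """Mean squared difference between predicted probabilities and 0/1 outcomes."""
    if not predicted or len(predicted) != len(observed):
        raise ValueError("predicted and observed must be non-empty and equal in length")
    return sum((p - y) ** 2 for p, y in zip(predicted, observed)) / len(predicted)


# Hypothetical predicted risks and observed outcomes (illustrative only).
predicted = [0.9, 0.2, 0.6, 0.1]
observed = [1, 0, 0, 0]
score = brier_score(predicted, observed)
print(f"Brier score = {score:.3f}")
```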

  16. Cytopathologic differential diagnosis of low-grade urothelial carcinoma and reactive urothelial proliferation in bladder washings: a logistic regression analysis.

    PubMed

    Cakir, Ebru; Kucuk, Ulku; Pala, Emel Ebru; Sezer, Ozlem; Ekin, Rahmi Gokhan; Cakmak, Ozgur

    2017-05-01

    Conventional cytomorphologic assessment is the first step to establish an accurate diagnosis in urinary cytology. In cytologic preparations, the separation of low-grade urothelial carcinoma (LGUC) from reactive urothelial proliferation (RUP) can be exceedingly difficult. The bladder washing cytologies of 32 LGUC and 29 RUP were reviewed. The cytologic slides were examined for the presence or absence of the 28 cytologic features. The cytologic criteria showing statistical significance in LGUC were increased numbers of monotonous single (non-umbrella) cells, three-dimensional cellular papillary clusters without fibrovascular cores, irregular bordered clusters, atypical single cells, irregular nuclear overlap, cytoplasmic homogeneity, increased N/C ratio, pleomorphism, nuclear border irregularity, nuclear eccentricity, elongated nuclei, and hyperchromasia (p < 0.05), and the cytologic criteria showing statistical significance in RUP were inflammatory background, mixture of small and large urothelial cells, loose monolayer aggregates, and vacuolated cytoplasm (p < 0.05). When these variables were subjected to a stepwise logistic regression analysis, four features were selected to distinguish LGUC from RUP: increased numbers of monotonous single (non-umbrella) cells, increased nuclear cytoplasmic ratio, hyperchromasia, and presence of small and large urothelial cells (p = 0.0001). By this logistic model of the 32 cases with proven LGUC, the stepwise logistic regression analysis correctly predicted 31 (96.9%) patients with this diagnosis, and of the 29 patients with RUP, the logistic model correctly predicted 26 (89.7%) patients as having this disease. There are several cytologic features to separate LGUC from RUP. Stepwise logistic regression analysis is a valuable tool for determining the most useful cytologic criteria to distinguish these entities. © 2017 APMIS. Published by John Wiley & Sons Ltd.

  17. Science of Test Research Consortium: Year Two Final Report

    DTIC Science & Technology

    2012-10-02

    July 2012. Analysis of an Intervention for Small Unmanned Aerial System ( SUAS ) Accidents, submitted to Quality Engineering, LQEN-2012-0056. Stone... Systems Engineering. Wolf, S. E., R. R. Hill, and J. J. Pignatiello. June 2012. Using Neural Networks and Logistic Regression to Model Small Unmanned ...Human Retina. 6. Wolf, S. E. March 2012. Modeling Small Unmanned Aerial System Mishaps using Logistic Regression and Artificial Neural Networks. 7

  18. Binary Logistic Regression Analysis for Detecting Differential Item Functioning: Effectiveness of R[superscript 2] and Delta Log Odds Ratio Effect Size Measures

    ERIC Educational Resources Information Center

    Hidalgo, Mª Dolores; Gómez-Benito, Juana; Zumbo, Bruno D.

    2014-01-01

    The authors analyze the effectiveness of the R[superscript 2] and delta log odds ratio effect size measures when using logistic regression analysis to detect differential item functioning (DIF) in dichotomous items. A simulation study was carried out, and the Type I error rate and power estimates under conditions in which only statistical testing…

  19. Logistic quantile regression provides improved estimates for bounded avian counts: a case study of California Spotted Owl fledgling production

    Treesearch

    Brian S. Cade; Barry R. Noon; Rick D. Scherer; John J. Keane

    2017-01-01

    Counts of avian fledglings, nestlings, or clutch size that are bounded below by zero and above by some small integer form a discrete random variable distribution that is not approximated well by conventional parametric count distributions such as the Poisson or negative binomial. We developed a logistic quantile regression model to provide estimates of the empirical...

  20. Comparison of four methods for deriving hospital standardised mortality ratios from a single hierarchical logistic regression model.

    PubMed

    Mohammed, Mohammed A; Manktelow, Bradley N; Hofer, Timothy P

    2016-04-01

    There is interest in deriving case-mix adjusted standardised mortality ratios so that comparisons between healthcare providers, such as hospitals, can be undertaken in the controversial belief that variability in standardised mortality ratios reflects quality of care. Typically standardised mortality ratios are derived using a fixed effects logistic regression model, without a hospital term in the model. This fails to account for the hierarchical structure of the data - patients nested within hospitals - and so a hierarchical logistic regression model is more appropriate. However, four methods have been advocated for deriving standardised mortality ratios from a hierarchical logistic regression model, but their agreement is not known and neither do we know which is to be preferred. We found significant differences between the four types of standardised mortality ratios because they reflect a range of underlying conceptual issues. The most subtle issue is the distinction between asking how an average patient fares in different hospitals versus how patients at a given hospital fare at an average hospital. Since the answers to these questions are not the same and since the choice between these two approaches is not obvious, the extent to which profiling hospitals on mortality can be undertaken safely and reliably, without resolving these methodological issues, remains questionable. © The Author(s) 2012.

  1. Three methods to construct predictive models using logistic regression and likelihood ratios to facilitate adjustment for pretest probability give similar results.

    PubMed

    Chan, Siew Foong; Deeks, Jonathan J; Macaskill, Petra; Irwig, Les

    2008-01-01

    To compare three predictive models based on logistic regression to estimate adjusted likelihood ratios allowing for interdependency between diagnostic variables (tests). This study was a review of the theoretical basis, assumptions, and limitations of published models; and a statistical extension of methods and application to a case study of the diagnosis of obstructive airways disease based on history and clinical examination. Albert's method includes an offset term to estimate an adjusted likelihood ratio for combinations of tests. Spiegelhalter and Knill-Jones method uses the unadjusted likelihood ratio for each test as a predictor and computes shrinkage factors to allow for interdependence. Knottnerus' method differs from the other methods because it requires sequencing of tests, which limits its application to situations where there are few tests and substantial data. Although parameter estimates differed between the models, predicted "posttest" probabilities were generally similar. Construction of predictive models using logistic regression is preferred to the independence Bayes' approach when it is important to adjust for dependency of tests errors. Methods to estimate adjusted likelihood ratios from predictive models should be considered in preference to a standard logistic regression model to facilitate ease of interpretation and application. Albert's method provides the most straightforward approach.

  2. A comparison of three methods of assessing differential item functioning (DIF) in the Hospital Anxiety Depression Scale: ordinal logistic regression, Rasch analysis and the Mantel chi-square procedure.

    PubMed

    Cameron, Isobel M; Scott, Neil W; Adler, Mats; Reid, Ian C

    2014-12-01

    It is important for clinical practice and research that measurement scales of well-being and quality of life exhibit only minimal differential item functioning (DIF). DIF occurs where different groups of people endorse items in a scale to different extents after being matched by the intended scale attribute. We investigate the equivalence or otherwise of common methods of assessing DIF. Three methods of measuring age- and sex-related DIF (ordinal logistic regression, Rasch analysis and Mantel χ(2) procedure) were applied to Hospital Anxiety Depression Scale (HADS) data pertaining to a sample of 1,068 patients consulting primary care practitioners. Three items were flagged by all three approaches as having either age- or sex-related DIF with a consistent direction of effect; a further three items identified did not meet stricter criteria for important DIF using at least one method. When applying strict criteria for significant DIF, ordinal logistic regression was slightly less sensitive. Ordinal logistic regression, Rasch analysis and contingency table methods yielded consistent results when identifying DIF in the HADS depression and HADS anxiety scales. Regardless of methods applied, investigators should use a combination of statistical significance, magnitude of the DIF effect and investigator judgement when interpreting the results.

  3. Extreme Sparse Multinomial Logistic Regression: A Fast and Robust Framework for Hyperspectral Image Classification

    NASA Astrophysics Data System (ADS)

    Cao, Faxian; Yang, Zhijing; Ren, Jinchang; Ling, Wing-Kuen; Zhao, Huimin; Marshall, Stephen

    2017-12-01

    Although the sparse multinomial logistic regression (SMLR) has provided a useful tool for sparse classification, it suffers from inefficacy in dealing with high dimensional features and manually set initial regressor values. This has significantly constrained its applications for hyperspectral image (HSI) classification. In order to tackle these two drawbacks, an extreme sparse multinomial logistic regression (ESMLR) is proposed for effective classification of HSI. First, the HSI dataset is projected to a new feature space with randomly generated weight and bias. Second, an optimization model is established by the Lagrange multiplier method and the dual principle to automatically determine a good initial regressor for SMLR via minimizing the training error and the regressor value. Furthermore, the extended multi-attribute profiles (EMAPs) are utilized for extracting both the spectral and spatial features. A combinational linear multiple features learning (MFL) method is proposed to further enhance the features extracted by ESMLR and EMAPs. Finally, the logistic regression via the variable splitting and the augmented Lagrangian (LORSAL) is adopted in the proposed framework for reducing the computational time. Experiments are conducted on two well-known HSI datasets, namely the Indian Pines dataset and the Pavia University dataset, which have shown the fast and robust performance of the proposed ESMLR framework.

  4. Improvement of motor function in early Parkinson disease by safinamide.

    PubMed

    Stocchi, F; Arnold, G; Onofrj, M; Kwiecinski, H; Szczudlik, A; Thomas, A; Bonuccelli, U; Van Dijk, A; Cattaneo, C; Sala, P; Fariello, R G

    2004-08-24

    A median safinamide (SAF) dose of 70 mg/day (range 40 to 90 mg/day) increased the percentage of parkinsonian patients improving their motor scores by ≥30% from baseline (responders) after 3 months from 21.4% (placebo) to 37.5% (p < 0.05, calculated by logistic regression analysis). In a subgroup of 101 patients under stable treatment with a single dopamine agonist, addition of SAF magnified the response (47.1% responders, mean 4.7-point motor score decrease; p ≤ 0.05). These results suggest that doses of SAF exerting ion channel block and glutamate release inhibition add to its symptomatic effect and warrant exploration of higher doses.

  5. [Satisfaction with life, victimization, and perception of insecurity in Morelos].

    PubMed

    Martínez-Ferrer, Belén; Ávila-Guerrero, María Elena; Vera-Jiménez, Jesús Alejandro; Bahena-Rivera, Alejandro; Musitu-Ochoa, Gonzalo

    2016-01-01

    To examine the influence of victimization, perceived insecurity, and restrictions on daily routines on life satisfaction. Participants were 7535 people (50.2% men) aged between 12 and 60, selected by proportional stratified sampling. MANOVA and a polytomous logistic regression model were calculated. We found significant differences in victimization, perceived insecurity, and restrictions on daily routines in relation to life satisfaction levels. Also, physical protective measures, control of personal information, perception of insecurity in public areas, and restrictions on daily routines were related to lower levels of satisfaction with life. The lowest levels of satisfaction with life were associated with victimization, perception of insecurity in public areas, and restrictions on daily routines.

  6. No higher risk of problem drinking or mental illness for women in male-dominated occupations.

    PubMed

    Savikko, Annukka; Lanne, Matilda; Spak, Fredrik; Hensing, Gunnel

    2008-07-01

    A sample of 562 women was drawn from the general population study "Women and alcohol in Goteborg" (N = 8335). An initial screening phase was followed by interviews regarding work, alcohol, and mental illness. Data from 1990 and 1995 were analyzed. Logistic regressions were used to calculate odds ratios. Contrary to earlier studies, we found no higher risk for alcohol problems or mental illness among women in male-dominated occupations. Selection and changes in cultural norms are possible explanations. Study limitations included the use of occupations at an aggregated level. The Swedish Council for Working Life and Social Research financially supported the study.

  7. Mortality and nursing care dependency one year after first ischemic stroke: an analysis of German statutory health insurance data.

    PubMed

    Kemper, Claudia; Koller, Daniela; Glaeske, Gerd; van den Bussche, Hendrik

    2011-01-01

    Aphasia, dementia, and depression are important and common neurological and neuropsychological disorders after ischemic stroke. We estimated the frequency of these comorbidities and their impact on mortality and nursing care dependency. Data of a German statutory health insurance were analyzed for people aged 50 years and older with first ischemic stroke. Aphasia, dementia, and depression were defined on the basis of outpatient medical diagnoses within 1 year after stroke. Logistic regression models for mortality and nursing care dependency were calculated and were adjusted for age, sex, and other relevant comorbidity. Of 977 individuals with a first ischemic stroke, 14.8% suffered from aphasia, 12.5% became demented, and 22.4% became depressed. The regression model for mortality showed a significant influence of age, aphasia, and other relevant comorbidity. In the regression model for nursing care dependency, the factors age, aphasia, dementia, depression, and other relevant comorbidity were significant. Aphasia has a high impact on mortality and nursing care dependency after ischemic stroke, while dementia and depression are strongly associated with increasing nursing care dependency.

  8. Beyond logistic regression: structural equations modelling for binary variables and its application to investigating unobserved confounders.

    PubMed

    Kupek, Emil

    2006-03-15

    Structural equation modelling (SEM) has been increasingly used in medical statistics for solving a system of related regression equations. However, a great obstacle for its wider use has been its difficulty in handling categorical variables within the framework of generalised linear models. A large data set with a known structure among two related outcomes and three independent variables was generated to investigate the use of Yule's transformation of odds ratio (OR) into Q-metric by (OR-1)/(OR+1) to approximate Pearson's correlation coefficients between binary variables whose covariance structure can be further analysed by SEM. Percent of correctly classified events and non-events was compared with the classification obtained by logistic regression. The performance of SEM based on Q-metric was also checked on a small (N = 100) random sample of the data generated and on a real data set. SEM successfully recovered the generated model structure. SEM of real data suggested a significant influence of a latent confounding variable which would have not been detectable by standard logistic regression. SEM classification performance was broadly similar to that of the logistic regression. The analysis of binary data can be greatly enhanced by Yule's transformation of odds ratios into estimated correlation matrix that can be further analysed by SEM. The interpretation of results is aided by expressing them as odds ratios which are the most frequently used measure of effect in medical statistics.
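
    The Q-metric mentioned in this abstract, Q = (OR-1)/(OR+1), is simple enough to sketch directly. The helper names and the 2x2 counts below are hypothetical and serve only to illustrate the transformation; they are not taken from the paper. For a 2x2 table with cells a, b, c, d and OR = ad/(bc), the same quantity can be written as (ad - bc)/(ad + bc).

```python
def yule_q(odds_ratio):
    """Yule's Q: maps an odds ratio in [0, inf) onto a correlation-like scale [-1, 1)."""
    if odds_ratio < 0:
        raise ValueError("an odds ratio cannot be negative")
    return (odds_ratio - 1.0) / (odds_ratio + 1.0)


def yule_q_from_table(a, b, c, d):
    """Yule's Q computed directly from 2x2 cell counts (a, b / c, d)."""
    return (a * d - b * c) / (a * d + b * c)


# Hypothetical counts for illustration: OR = (30 * 30) / (10 * 10) = 9.
print(yule_q(9.0))                        # 0.8
print(yule_q_from_table(30, 10, 10, 30))  # 0.8
```

    An odds ratio of 1 (no association) maps to Q = 0, and reciprocal odds ratios map to values of equal size and opposite sign, which is what makes Q usable as a stand-in for a correlation coefficient.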

  9. Predictors of postoperative outcomes of cubital tunnel syndrome treatments using multiple logistic regression analysis.

    PubMed

    Suzuki, Taku; Iwamoto, Takuji; Shizu, Kanae; Suzuki, Katsuji; Yamada, Harumoto; Sato, Kazuki

    2017-05-01

    This retrospective study was designed to investigate prognostic factors for postoperative outcomes for cubital tunnel syndrome (CubTS) using multiple logistic regression analysis with a large number of patients. Eighty-three patients with CubTS who underwent surgeries were enrolled. The following potential prognostic factors for disease severity were selected according to previous reports: sex, age, type of surgery, disease duration, body mass index, cervical lesion, presence of diabetes mellitus, Workers' Compensation status, preoperative severity, and preoperative electrodiagnostic testing. Postoperative severity of disease was assessed 2 years after surgery by Messina's criteria which is an outcome measure specifically for CubTS. Bivariate analysis was performed to select candidate prognostic factors for multiple linear regression analyses. Multiple logistic regression analysis was conducted to identify the association between postoperative severity and selected prognostic factors. Both bivariate and multiple linear regression analysis revealed only preoperative severity as an independent risk factor for poor prognosis, while other factors did not show any significant association. Although conflicting results exist regarding prognosis of CubTS, this study supports evidence from previous studies and concludes early surgical intervention portends the most favorable prognosis. Copyright © 2017 The Japanese Orthopaedic Association. Published by Elsevier B.V. All rights reserved.

  10. Process model comparison and transferability across bioreactor scales and modes of operation for a mammalian cell bioprocess.

    PubMed

    Craven, Stephen; Shirsat, Nishikant; Whelan, Jessica; Glennon, Brian

    2013-01-01

    A Monod kinetic model, logistic equation model, and statistical regression model were developed for a Chinese hamster ovary cell bioprocess operated under three different modes of operation (batch, bolus fed-batch, and continuous fed-batch) and grown on two different bioreactor scales (3 L bench-top and 15 L pilot-scale). The Monod kinetic model was developed for all modes of operation under study and predicted cell density, glucose, glutamine, lactate, and ammonia concentrations well for the bioprocess. However, it was computationally demanding due to the large number of parameters necessary to produce a good model fit. The transferability of the Monod kinetic model structure and parameter set across bioreactor scales and modes of operation was investigated and a parameter sensitivity analysis performed. The experimentally determined parameters had the greatest influence on model performance. They changed with scale and mode of operation, but were easily calculated. The remaining parameters, which were fitted using a differential evolutionary algorithm, were not as crucial. Logistic equation and statistical regression models were investigated as alternatives to the Monod kinetic model. They were less computationally intensive to develop due to the absence of a large parameter set. However, modeling of the nutrient and metabolite concentrations proved to be troublesome due to the logistic equation model structure and the inability of both models to incorporate a feed. The complexity, computational load, and effort required for model development have to be balanced with the necessary level of model sophistication when choosing which model type to develop for a particular application. Copyright © 2012 American Institute of Chemical Engineers (AIChE).

  11. A Logistic Regression Analysis of Turkey's 15-Year-Olds' Scoring above the OECD Average on the PISA'09 Reading Assessment

    ERIC Educational Resources Information Center

    Kasapoglu, Koray

    2014-01-01

    This study aims to investigate which factors are associated with Turkey's 15-year-olds' scoring above the OECD average (493) on the PISA'09 reading assessment. Collected from a total of 4,996 15-year-old students from Turkey, data were analyzed by logistic regression analysis in order to model the data of students who were split into two: (1)…

  12. Estimating the Probability of Rare Events Occurring Using a Local Model Averaging.

    PubMed

    Chen, Jin-Hua; Chen, Chun-Shu; Huang, Meng-Fan; Lin, Hung-Chih

    2016-10-01

    In statistical applications, logistic regression is a popular method for analyzing binary data accompanied by explanatory variables. But when one of the two outcomes is rare, the estimation of model parameters has been shown to be severely biased and hence estimating the probability of rare events occurring based on a logistic regression model would be inaccurate. In this article, we focus on estimating the probability of rare events occurring based on logistic regression models. Instead of selecting a best model, we propose a local model averaging procedure based on a data perturbation technique applied to different information criteria to obtain different probability estimates of rare events occurring. Then an approximately unbiased estimator of Kullback-Leibler loss is used to choose the best one among them. We design complete simulations to show the effectiveness of our approach. For illustration, a necrotizing enterocolitis (NEC) data set is analyzed. © 2016 Society for Risk Analysis.

  13. The use of logistic regression to enhance risk assessment and decision making by mental health administrators.

    PubMed

    Menditto, Anthony A; Linhorst, Donald M; Coleman, James C; Beck, Niels C

    2006-04-01

    Development of policies and procedures to contend with the risks presented by elopement, aggression, and suicidal behaviors are long-standing challenges for mental health administrators. Guidance in making such judgments can be obtained through the use of a multivariate statistical technique known as logistic regression. This procedure can be used to develop a predictive equation that is mathematically formulated to use the best combination of predictors, rather than considering just one factor at a time. This paper presents an overview of logistic regression and its utility in mental health administrative decision making. A case example of its application is presented using data on elopements from Missouri's long-term state psychiatric hospitals. Ultimately, the use of statistical prediction analyses tempered with differential qualitative weighting of classification errors can augment decision-making processes in a manner that provides guidance and flexibility while wrestling with the complex problem of risk assessment and decision making.

  14. An application in identifying high-risk populations in alternative tobacco product use utilizing logistic regression and CART: a heuristic comparison.

    PubMed

    Lei, Yang; Nollen, Nikki; Ahluwahlia, Jasjit S; Yu, Qing; Mayo, Matthew S

    2015-04-09

    Other forms of tobacco use are increasing in prevalence, yet most tobacco control efforts are aimed at cigarettes. In light of this, it is important to identify individuals who are using both cigarettes and alternative tobacco products (ATPs). Most previous studies have used regression models. We fitted a traditional logistic regression model and a classification and regression tree (CART) model to illustrate and discuss the added advantages of using CART in the setting of identifying high-risk subgroups of ATP users among cigarette smokers. The data were collected from an online cross-sectional survey administered by Survey Sampling International between July 5, 2012 and August 15, 2012. Eligible participants self-identified as current smokers, African American, White, or Latino (of any race), were English-speaking, and were at least 25 years old. The study sample included 2,376 participants and was divided into independent training and validation samples for a hold out validation. Logistic regression and CART models were used to examine the important predictors of cigarettes + ATP users. The logistic regression model identified nine important factors: gender, age, race, nicotine dependence, buying cigarettes or borrowing, whether the price of cigarettes influences the brand purchased, whether the participants set limits on cigarettes per day, alcohol use scores, and discrimination frequencies. The c-index of the logistic regression model was 0.74, indicating good discriminatory capability. The model also performed well in the validation cohort, with good discrimination (c-index = 0.73) and excellent calibration (R-square = 0.96 in the calibration regression). The parsimonious CART model identified gender, age, alcohol use score, race, and discrimination frequencies to be the most important factors. It also revealed interesting partial interactions. The c-index was 0.70 for the training sample and 0.69 for the validation sample. The misclassification rate was 0.342 for the training sample and 0.346 for the validation sample. The CART model was easier to interpret and discovered target populations that possess clinical significance. This study suggests that the non-parametric CART model is parsimonious, potentially easier to interpret, and provides additional information in identifying the subgroups at high risk of ATP use among cigarette smokers.
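
    For readers unfamiliar with the c-index reported in this abstract, the following is a minimal sketch of how it can be computed for a binary outcome: it is the share of (event, non-event) pairs in which the event received the higher predicted probability, with ties counted as one half. The function name and the toy data are hypothetical; the study's own data and software are not described in the abstract.

```python
def c_index(y_true, y_score):
    """Concordance index for a binary outcome (equivalent to the area under the ROC curve)."""
    events = [s for y, s in zip(y_true, y_score) if y == 1]
    non_events = [s for y, s in zip(y_true, y_score) if y == 0]
    if not events or not non_events:
        raise ValueError("both outcome classes must be present")
    concordant = 0.0
    for e in events:
        for n in non_events:
            if e > n:
                concordant += 1.0   # event ranked above non-event
            elif e == n:
                concordant += 0.5   # tie counts as half
    return concordant / (len(events) * len(non_events))


# Toy outcomes and predicted probabilities, invented for illustration only.
y = [1, 1, 0, 0, 1, 0]
p = [0.9, 0.6, 0.4, 0.7, 0.8, 0.2]
print(round(c_index(y, p), 3))  # 8 of 9 pairs are concordant
```

    A value of 0.5 corresponds to chance-level ranking and 1.0 to perfect separation, which gives some context for the 0.69 to 0.74 range reported above.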

  15. Investigation of Hydrogen Sulfide Exposure and Lung Function, Asthma and Chronic Obstructive Pulmonary Disease in a Geothermal Area of New Zealand

    PubMed Central

    Bates, Michael N.; Crane, Julian; Balmes, John R.; Garrett, Nick

    2015-01-01

    Background: Results have been conflicting whether long-term ambient hydrogen sulfide (H2S) affects lung function or is a risk factor for asthma or chronic obstructive pulmonary disease (COPD). Rotorua city, New Zealand, has the world’s largest population exposed to ambient H2S, from geothermal sources. Objectives: We investigated associations of H2S with lung function, COPD and asthma in this population. Methods: 1,204 of 1,639 study participants, aged 18–65 years during 2008–2010, provided satisfactory spirometry results. Residences, workplaces and schools over the last 30 years were geocoded. Exposures were estimated from data collected by summer and winter H2S monitoring networks across Rotorua. Four metrics for H2S exposure, representing both current and long-term (last 30 years) exposure, and also time-weighted average and peak exposures, were calculated. Departures from expected values for pre-bronchodilator lung function, calculated from prediction equations, were outcomes for linear regression models using quartiles of the H2S exposure metrics. Separate models examined participants with and without evidence of asthma or COPD, and never- and ever-smokers. Logistic regression was used to investigate associations of COPD (a post-bronchodilator FEV1/FVC < 70% of expected) and asthma (doctor-diagnosed or by FEV1 response to bronchodilator) with H2S exposure quartiles. Results: None of the exposure metrics produced evidence of lung function decrement. The logistic regression analysis showed no evidence that long-term H2S exposure at Rotorua levels was associated with either increased COPD or asthma risk. Some results suggested that recent ambient H2S exposures were beneficially associated with lung function parameters. Conclusions: The study found no evidence of reductions in lung function, or increased risk of COPD or asthma, from recent or long-term H2S exposure at the relatively high ambient concentrations found in Rotorua. Suggestions of improved lung function associated with recent ambient H2S exposures require confirmation in other studies. PMID:25822819

  16. Investigation of hydrogen sulfide exposure and lung function, asthma and chronic obstructive pulmonary disease in a geothermal area of New Zealand.

    PubMed

    Bates, Michael N; Crane, Julian; Balmes, John R; Garrett, Nick

    2015-01-01

    Results have been conflicting whether long-term ambient hydrogen sulfide (H2S) affects lung function or is a risk factor for asthma or chronic obstructive pulmonary disease (COPD). Rotorua city, New Zealand, has the world's largest population exposed to ambient H2S, from geothermal sources. We investigated associations of H2S with lung function, COPD and asthma in this population. 1,204 of 1,639 study participants, aged 18-65 years during 2008-2010, provided satisfactory spirometry results. Residences, workplaces and schools over the last 30 years were geocoded. Exposures were estimated from data collected by summer and winter H2S monitoring networks across Rotorua. Four metrics for H2S exposure, representing both current and long-term (last 30 years) exposure, and also time-weighted average and peak exposures, were calculated. Departures from expected values for pre-bronchodilator lung function, calculated from prediction equations, were outcomes for linear regression models using quartiles of the H2S exposure metrics. Separate models examined participants with and without evidence of asthma or COPD, and never- and ever-smokers. Logistic regression was used to investigate associations of COPD (a post-bronchodilator FEV1/FVC < 70% of expected) and asthma (doctor-diagnosed or by FEV1 response to bronchodilator) with H2S exposure quartiles. None of the exposure metrics produced evidence of lung function decrement. The logistic regression analysis showed no evidence that long-term H2S exposure at Rotorua levels was associated with either increased COPD or asthma risk. Some results suggested that recent ambient H2S exposures were beneficially associated with lung function parameters. The study found no evidence of reductions in lung function, or increased risk of COPD or asthma, from recent or long-term H2S exposure at the relatively high ambient concentrations found in Rotorua. Suggestions of improved lung function associated with recent ambient H2S exposures require confirmation in other studies.

  17. Living near overhead high voltage transmission power lines as a risk factor for childhood acute lymphoblastic leukemia: a case-control study.

    PubMed

    Sohrabi, Mohammad-Reza; Tarjoman, Termeh; Abadi, Alireza; Yavari, Parvin

    2010-01-01

    This study aimed to investigate the association of living near high voltage power lines with the occurrence of childhood acute lymphoblastic leukemia (ALL). In a case-control study, 300 children aged 1-18 years with confirmed ALL were selected from all referral teaching centers for cancer. They were interviewed about their history of living near overhead high voltage power lines during at least the past two years and compared with 300 controls who were individually matched for sex and approximate age. Logistic regression, chi-square, and paired t-tests were used for analysis when appropriate. The case group lived significantly closer to power lines (P<0.001). More than half of the cases were exposed to two or three types of power lines (P<0.02). Using logistic regression, an odds ratio of 2.61 (95% CI: 1.73 to 3.94) was calculated for living less than 600 meters from the nearest line versus more than 600 meters. This ratio was estimated as 9.93 (95% CI: 3.47 to 28.5) for 123 kV, 10.78 (95% CI: 3.75 to 31) for 230 kV, and 2.98 (95% CI: 0.93 to 9.54) for 400 kV lines. Odds of ALL decreased 0.61 for every 600 meters from the nearest power line. This study emphasizes that living close to high voltage power lines is a risk for ALL.
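
    Odds ratios and confidence intervals such as those quoted in this abstract are usually obtained by exponentiating a logistic regression coefficient and its Wald limits. The sketch below shows that relationship in both directions. It assumes a Wald interval that is symmetric on the log-odds scale, which the abstract does not confirm, and the function names are illustrative only.

```python
import math

Z_95 = 1.959964  # standard normal quantile for a two-sided 95% interval


def odds_ratio_ci(beta, se, z=Z_95):
    """Turn a log-odds coefficient and its standard error into (OR, lower, upper)."""
    return math.exp(beta), math.exp(beta - z * se), math.exp(beta + z * se)


def beta_se_from_ci(odds_ratio, lower, upper, z=Z_95):
    """Recover the coefficient and standard error implied by a reported OR and CI."""
    beta = math.log(odds_ratio)
    se = (math.log(upper) - math.log(lower)) / (2.0 * z)
    return beta, se


# Reported in the abstract: OR 2.61 (95% CI 1.73 to 3.94).
beta, se = beta_se_from_ci(2.61, 1.73, 3.94)
or_, lo, hi = odds_ratio_ci(beta, se)
print(f"beta={beta:.3f} se={se:.3f} -> OR={or_:.2f} ({lo:.2f} to {hi:.2f})")
```

    Under that assumption the reported limits round-trip to within rounding error, which suggests (but does not prove) that the published interval was computed on the log-odds scale.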

  18. A 3-Year Study of Predictive Factors for Positive and Negative Appendicectomies.

    PubMed

    Chang, Dwayne T S; Maluda, Melissa; Lee, Lisa; Premaratne, Chandrasiri; Khamhing, Srisongham

    2018-03-06

    Early and accurate identification or exclusion of acute appendicitis is the key to avoid the morbidity of delayed treatment for true appendicitis or unnecessary appendicectomy, respectively. We aim (i) to identify potential predictive factors for positive and negative appendicectomies; and (ii) to analyse the use of ultrasound scans (US) and computed tomography (CT) scans for acute appendicitis. All appendicectomies that took place at our hospital from the 1st of January 2013 to the 31st of December 2015 were retrospectively recorded. Test results of potential predictive factors of acute appendicitis were recorded. Statistical analysis was performed using Fisher exact test, logistic regression analysis, sensitivity, specificity, and positive and negative predictive values calculation. 208 patients were included in this study. 184 patients had histologically proven acute appendicitis. The other 24 patients had either nonappendicitis pathology or normal appendix. Logistic regression analysis showed statistically significant associations between appendicitis and white cell count, neutrophil count, C-reactive protein, and bilirubin. Neutrophil count was the test with the highest sensitivity and negative predictive values, whereas bilirubin was the test with the highest specificity and positive predictive values (PPV). US and CT scans had high sensitivity and PPV for diagnosing appendicitis. No single test was sufficient to diagnose or exclude acute appendicitis by itself. Combining tests with high sensitivity (abnormal neutrophil count, and US and CT scans) and high specificity (raised bilirubin) may predict acute appendicitis more accurately.

  19. Serum total bilirubin levels are negatively correlated with metabolic syndrome in aged Chinese women: a community-based study.

    PubMed

    Zhong, P; Sun, D M; Wu, D H; Li, T M; Liu, X Y; Liu, H Y

    2017-01-26

    We evaluated serum total bilirubin levels as a predictor for metabolic syndrome (MetS) and investigated the relationship between serum total bilirubin levels and MetS prevalence. This cross-sectional study included 1728 participants over 65 years of age from Eastern China. Anthropometric data, lifestyle information, and previous medical history were collected. We then measured serum levels of fasting blood glucose, total cholesterol, triglycerides, and total bilirubin, as well as alanine aminotransferase activity. The prevalence of MetS and of each of its individual components was calculated per quartile of total bilirubin level. Logistic regression was used to assess the correlation between serum total bilirubin levels and MetS. Total bilirubin level in the women who did not have MetS was significantly higher than in those who had MetS (P<0.001). Serum total bilirubin quartiles were linearly and negatively correlated with MetS prevalence and hypertriglyceridemia (HTG) in females (P<0.005). Logistic regression showed that serum total bilirubin was an independent predictor of MetS for females (OR: 0.910, 95%CI: 0.863-0.960; P=0.001). The present study suggests that physiological levels of serum total bilirubin might be an independent risk factor for aged Chinese women, and the prevalence of MetS and HTG are negatively correlated to serum total bilirubin levels.

  20. [Depressive symptoms among medical intern students in a Brazilian public university].

    PubMed

    Costa, Edméa Fontes de Oliva; Santana, Ygo Santos; Santos, Ana Teresa Rodrigues de Abreu; Martins, Luiz Antonio Nogueira; Melo, Enaldo Vieira de; Andrade, Tarcísio Matos de

    2012-01-01

    To estimate, among Medical School intern students, the prevalence of depressive symptoms and their severity, as well as associated factors. Cross-sectional study in May 2008, with a representative sample of medical intern students (n = 84) from Universidade Federal de Sergipe (UFS). Beck Depression Inventory (BDI) and a structured questionnaire containing information on sociodemographic variables, teaching-learning process, and personal aspects were used. The exploratory data analysis was performed by descriptive and inferential statistics. Finally, the analysis of multiple variables by logistic regression and the calculation of simple and adjusted ORs with their respective 95% confidence intervals were performed. The general prevalence was 40.5%, with 1.2% (95% CI: 0.0-6.5) of severe depressive symptoms; 4.8% (95% CI: 1.3-11.7) of moderate depressive symptoms; and 34.5% (95% CI: 24.5-45.7) of mild depressive symptoms. The logistic regression revealed the variables with a major impact associated with the emergence of depressive symptoms: thoughts of dropping out (OR 6.24; p = 0.002); emotional stress (OR 7.43; p = 0.0004); and average academic performance (OR 4.74; p = 0.0001). The high prevalence of depressive symptoms in the study population was associated with variables related to the teaching-learning process and personal aspects, suggesting immediate preemptive measures regarding Medical School graduation and student care are required.

  1. Association between chronic viral hepatitis infection and breast cancer risk: a nationwide population-based case-control study

    PubMed Central

    2011-01-01

    Background: In Taiwan, there is a high incidence of breast cancer and a high prevalence of viral hepatitis. In this case-control study, we used a population-based insurance dataset to evaluate whether breast cancer in women is associated with chronic viral hepatitis infection. Methods: From the claims data, we identified 1,958 patients with newly diagnosed breast cancer during the period 2000-2008. A randomly selected, age-matched cohort of 7,832 subjects without cancer was selected for comparison. Multivariable logistic regression models were constructed to calculate odds ratios of breast cancer associated with viral hepatitis after adjustment for age, residential area, occupation, urbanization, and income. The age-specific (<50 years and ≥50 years) risk of breast cancer was also evaluated. Results: There were no significant differences in the prevalence of hepatitis C virus (HCV) infection, hepatitis B virus (HBV), or the prevalence of combined HCV/HBV infection between breast cancer patients and control subjects (p = 0.48). Multivariable logistic regression analysis, however, revealed that age <50 years was associated with a 2-fold greater risk of developing breast cancer (OR = 2.03, 95% CI = 1.23-3.34). Conclusions: HCV infection, but not HBV infection, appears to be associated with early onset risk of breast cancer in areas endemic for HCV and HBV. This finding needs to be replicated in further studies. PMID:22115285

  2. Association between maternal smoking, gender, and cleft lip and palate.

    PubMed

    Martelli, Daniella Reis Barbosa; Coletta, Ricardo D; Oliveira, Eduardo A; Swerts, Mário Sérgio Oliveira; Rodrigues, Laíse A Mendes; Oliveira, Maria Christina; Martelli Júnior, Hercílio

    2015-01-01

    Cleft lip and/or palate (CL/P) represent the most common congenital anomalies of the face. To assess the relationship between maternal smoking, gender and CL/P. This is an epidemiological cross-sectional study. We interviewed 1519 mothers divided into two groups: mothers of children with CL/P (n=843) and mothers of children without CL/P (n=676). All mothers were classified as smokers or non-smokers during the first trimester of pregnancy. To determine an association among maternal smoking, gender, and CL/P, odds ratios were calculated and the adjustment was made by a logistic regression model. An association between maternal smoking and the presence of cleft was observed. There was also a strong association between male gender and the presence of cleft (OR=3.51; 95% CI 2.83-4.37). By binary logistic regression analysis, it was demonstrated that both variables were independently associated with clefts. In a multivariate analysis, male gender and maternal smoking were associated with a 2.5-fold and a 1.5-fold greater chance of having a cleft, respectively. Our findings are consistent with a positive association between maternal smoking during pregnancy and CL/P in male gender. The results support the importance of smoking prevention and introduction of cessation programs among women with childbearing potential. Copyright © 2015 Associação Brasileira de Otorrinolaringologia e Cirurgia Cérvico-Facial. Published by Elsevier Editora Ltda. All rights reserved.

  3. Associations between dental knowledge, source of dental knowledge and oral health behavior in Japanese university students: A cross-sectional study

    PubMed Central

    Taniguchi-Tabata, Ayano; Mizutani, Shinsuke; Yamane-Takeuchi, Mayu; Kataoka, Kota; Azuma, Tetsuji; Tomofuji, Takaaki; Iwasaki, Yoshiaki; Morita, Manabu

    2017-01-01

    The aim of this study was to investigate the associations between dental knowledge, the source of dental knowledge and oral health behavior in a group of students at a university in Japan. A total of 2,220 university students (1,276 males, 944 females) volunteered to undergo an oral examination and answer a questionnaire. The questionnaire assessed dental knowledge, the source of dental knowledge and oral health behavior (e.g., daily frequency of tooth brushing, use of dental floss and regular dental checkups). The odds ratio and 95% confidence interval for oral health behavior based on dental knowledge and source of dental knowledge were calculated using logistic regression models. Of the participants, 1,266 (57.0%) students obtained dental knowledge from dental clinics, followed by school (39.2%) and television (29.1%). Logistic regression analyses indicated that use of dental floss was significantly associated with source of dental knowledge from dental clinics (P = 0.006). Receiving regular dental checkups was significantly associated with source of dental knowledge; the positive source was dental clinic (P < 0.001) and the negative sources were school (P = 0.004) and television (P = 0.018). Dental clinic was the most common source of dental knowledge and associated with better oral health behavior among the Japanese university students in this study. PMID:28594914

  4. Treatment of the residual cavity during hepatic hydatidosis surgery: a cohort study of capitonnage vs. omentoplasty.

    PubMed

    Manterola, Carlos; Roa, Juan Carlos; Urrutia, Sebastián

    2013-12-01

    To determine the efficacy of omentoplasty (OP) and capitonnage (CA) in residual cavity management during hepatic hydatidosis (HH) surgery in terms of postoperative morbidity. Prospective cohort study. Patients with non-complicated HH treated with subtotal pericystectomy in the Department of Surgery of the Temuco Regional Hospital between 2001 and 2008 were studied. We compared those managed with CA with those managed with OP. A sample size of 40 patients in each group was estimated to be needed to adequately compare the outcomes of the approaches. The primary endpoint was postoperative morbidity. Descriptive statistics, bivariate analyses and logistic regression models were applied. The absolute risk (AR) and relative risk (RR) were calculated. The cohorts comprised 88 patients (CA 40 and OP 48), with a median age of 40 years (15-84), and 62.5% were females. A general postoperative morbidity rate of 11.4% was noted after a median follow-up of 60 months (12-84 months). Significant differences in postoperative morbidity were found (p = 0.044). Logistic regression models verified that there were no confounding variables. The AR of postoperative morbidity for the CA and OP cohorts was 0.025 and 0.1875, respectively, and the RR was 0.13 (95% CI 0.03-0.70). Residual cavity management with CA is associated with a lower postoperative morbidity risk than OP.
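
    The absolute and relative risks reported above can be reproduced with a short calculation. The sketch below assumes event counts of 1 of 40 (CA) and 9 of 48 (OP); these counts are not stated in the abstract and are inferred here from the reported ARs of 0.025 and 0.1875. The function names are illustrative only.

```python
def absolute_risk(events: int, total: int) -> float:
    """Proportion of patients in a cohort who experienced the event."""
    return events / total


def relative_risk(events_a: int, total_a: int, events_b: int, total_b: int) -> float:
    """Ratio of the absolute risk in cohort A to that in cohort B."""
    return absolute_risk(events_a, total_a) / absolute_risk(events_b, total_b)


# Assumed counts, inferred from the reported absolute risks (not given in the abstract).
CA_EVENTS, CA_TOTAL = 1, 40
OP_EVENTS, OP_TOTAL = 9, 48

ar_ca = absolute_risk(CA_EVENTS, CA_TOTAL)
ar_op = absolute_risk(OP_EVENTS, OP_TOTAL)
rr = relative_risk(CA_EVENTS, CA_TOTAL, OP_EVENTS, OP_TOTAL)
overall = absolute_risk(CA_EVENTS + OP_EVENTS, CA_TOTAL + OP_TOTAL)

print(f"AR (CA) = {ar_ca:.4f}, AR (OP) = {ar_op:.4f}")
print(f"RR (CA vs OP) = {rr:.2f}, overall morbidity = {overall:.1%}")
```

    Under these assumed counts the overall morbidity also matches the reported 11.4%, which suggests the inferred counts are at least consistent with the abstract. The confidence interval is not recomputed here because the abstract does not say which method was used.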

  5. Induced abortion: risk factors for adolescent female students, a Brazilian study.

    PubMed

    Correia, Divanise S; Cavalcante, Jairo C; Maia, Eulália M C

    2009-12-16

    The purpose of this study was to analyze risk factors for abortion among female teenagers from 12 to 19 years of age in the city of Maceió, Brazil. This is a cross-sectional study, conducted in ten schools. The sample was calculated by considering the number of admissions for postabortion curettage, obtained from the Information System of Hospitalization. Data were obtained through a semi-structured questionnaire divided into three basic blocks of data: sociodemographic, sexual life, and pregnancy/abortion. To analyze the data, the logistic regression model was used. The Forward Method was chosen to set the final model that minimizes the number of variables and maximizes the accuracy of the model. The significance analysis of the dichotomous variables yielded eight significant variables. Two of them are protective for abortion: the ages 12-14 years and talking with parents about sex. After the logistic regression, the receipt of support for abortion was the most significant variable of all. The adolescent with an active sexual life, a previous pregnancy, who is married, and has received support for an abortion has a 99.74% probability for an abortion. The results of this study demonstrate the importance of the group in adolescence, and having a partner who supports and approves of the pregnancy appears as a statistically significant preventive factor for abortion. This shows the importance of support and companionship for adolescent women.

  6. Analysis of association of clinical aspects and IL1B tagSNPs with severe preeclampsia.

    PubMed

    Leme Galvão, Larissa Paes; Menezes, Filipe Emanuel; Mendonca, Caio; Barreto, Ikaro; Alvim-Pereira, Claudia; Alvim-Pereira, Fabiano; Gurgel, Ricardo

    2016-01-01

    This study investigates the association between IL1B genotypes using a tag SNP (single nucleotide polymorphism) approach, maternal and environmental factors in Brazilian women with severe preeclampsia. A case-control study with a total of 456 patients (169 preeclamptic women and 287 controls) was conducted in the two reference maternity hospitals of Sergipe state, Northeast Brazil. A questionnaire was administered and DNA was extracted to genotype the population for four tag SNPs of the IL1Beta: rs1143643, rs1143633, rs1143634 and rs1143630. Haplotype association analysis and p-values were calculated using the THESIAS test. Odds ratio (OR) estimation, confidence interval (CI) and multivariate logistic regression were performed. High pregestational body mass index (pre-BMI), first gestation, cesarean section, more than six medical visits, low level of consciousness on admission and TC and TT genotype in rs1143630 of IL1Beta showed association with the preeclamptic group in univariate analysis. After multivariate logistic regression, pre-BMI, first gestation and low level of consciousness on admission remained associated. We identified an association between clinical variables and preeclampsia. Univariate analysis suggested that inflammatory process-related genes, such as IL1B, may be involved and should be targeted in further studies. The identification of the genetic background involved in preeclampsia host response modulation is mandatory in order to understand the preeclampsia process.

  7. Cardiovascular comorbidities of pediatric psoriasis among hospitalized children in the United States.

    PubMed

    Kwa, Lauren; Kwa, Michael C; Silverberg, Jonathan I

    2017-12-01

    Psoriasis has been shown to be associated with cardiovascular disease in adults. Little is known about cardiovascular risk in pediatric psoriasis. To determine if there is an association between pediatric psoriasis and cardiovascular comorbidities. Data were analyzed from the 2002-2012 Nationwide Inpatient Sample, which included 4,884,448 hospitalized children aged 0-17 years. Bivariate and multivariate survey logistic regression models were created to calculate the odds of psoriasis on cardiovascular comorbidities. In multivariate survey logistic regression models adjusting for age, sex, and race/ethnicity, pediatric psoriasis was significantly associated with 5 of 10 cardiovascular comorbidities (adjusted odds ratio [95% confidence interval]), including obesity (3.15 [2.46-4.05]), hypertension (2.63 [1.93-3.59]), diabetes (2.90 [1.90-4.42]), arrhythmia (1.39 [1.02-1.88]), and valvular heart disease (1.90 [1.07-3.37]). The highest odds of cardiovascular risk factors occurred in blacks and Hispanics and children ages 0-9 years, but there were no sex differences. The study was limited to hospitalized children. We were unable to assess the impact of psoriasis treatment or family history on cardiovascular risk. Pediatric psoriasis is associated with higher odds of multiple cardiovascular comorbidities among hospitalized patients. Strategies for mitigating excess cardiovascular risk in pediatric psoriasis need to be determined. Copyright © 2017 American Academy of Dermatology, Inc. Published by Elsevier Inc. All rights reserved.

  8. Predictive occurrence models for coastal wetland plant communities: Delineating hydrologic response surfaces with multinomial logistic regression

    NASA Astrophysics Data System (ADS)

    Snedden, Gregg A.; Steyer, Gregory D.

    2013-02-01

    Understanding plant community zonation along estuarine stress gradients is critical for effective conservation and restoration of coastal wetland ecosystems. We related the presence of plant community types to estuarine hydrology at 173 sites across coastal Louisiana. Percent relative cover by species was assessed at each site near the end of the growing season in 2008, and hourly water level and salinity were recorded at each site Oct 2007-Sep 2008. Nine plant community types were delineated with k-means clustering, and indicator species were identified for each of the community types with indicator species analysis. An inverse relation between salinity and species diversity was observed. Canonical correspondence analysis (CCA) effectively segregated the sites across ordination space by community type, and indicated that salinity and tidal amplitude were both important drivers of vegetation composition. Multinomial logistic regression (MLR) and Akaike's Information Criterion (AIC) were used to predict the probability of occurrence of the nine vegetation communities as a function of salinity and tidal amplitude, and probability surfaces obtained from the MLR model corroborated the CCA results. The weighted kappa statistic, calculated from the confusion matrix of predicted versus actual community types, was 0.7 and indicated good agreement between observed community types and model predictions. Our results suggest that models based on a few key hydrologic variables can be valuable tools for predicting vegetation community development when restoring and managing coastal wetlands.

  9. Proximity to sports facilities and sports participation for adolescents in Germany.

    PubMed

    Reimers, Anne K; Wagner, Matthias; Alvanides, Seraphim; Steinmayr, Andreas; Reiner, Miriam; Schmidt, Steffen; Woll, Alexander

    2014-01-01

    To assess the relationship between proximity to specific sports facilities and participation in the corresponding sports activities for adolescents in Germany. A sample of 1,768 adolescents aged 11-17 years old and living in 161 German communities was examined. Distances to the nearest sports facilities were calculated as an indicator of proximity to sports facilities using Geographic Information Systems (GIS). Participation in specific leisure-time sports activities in sports clubs was assessed using a self-report questionnaire and individual-level socio-demographic variables were derived from a parent questionnaire. Community-level socio-demographics as covariates were selected from the INKAR database, in particular from indicators and maps on land development. Logistic regression analyses were conducted to examine associations between proximity to the nearest sports facilities and participation in the corresponding sports activities. The logistic regression analyses showed that girls residing longer distances from the nearest gym were less likely to engage in indoor sports activities; a significant interaction between distances to gyms and level of urbanization was identified. Decomposition of the interaction term showed that for adolescent girls living in rural areas participation in indoor sports activities was positively associated with gym proximity. Proximity to tennis courts and indoor pools was not associated with participation in tennis or water sports, respectively. Improved proximity to gyms is likely to be more important for female adolescents living in rural areas.

  10. Calibration power of the Braden scale in predicting pressure ulcer development.

    PubMed

    Chen, Hong-Lin; Cao, Ying-Juan; Wang, Jing; Huai, Bao-Sha

    2016-11-02

    Calibration is the degree of correspondence between the estimated probability produced by a model and the actual observed probability. The aim of this study was to investigate the calibration power of the Braden scale in predicting pressure ulcer (PU) development. A retrospective analysis was performed among consecutive patients in 2013. The patients were separated into a training group and a validation group. The predicted incidence was calculated using a logistic regression model in the training group and the Hosmer-Lemeshow test was used for assessing the goodness of fit. In the validation cohort, the observed and the predicted incidence were compared by the chi-square (χ²) goodness of fit test for calibration power. We included 2585 patients in the study, of whom 78 (3.0%) developed a PU. Patient characteristics did not differ significantly between the training and validation groups (p>0.05). In the training group, the logistic regression model for predicting pressure ulcer was Logit(P) = -0.433*Braden score + 2.616. The Hosmer-Lemeshow test showed poor goodness of fit (χ²=13.472; p=0.019). In the validation group, the predicted pressure ulcer incidence also did not fit well with the observed incidence (χ²=42.154, p<0.001 by Braden scores; and χ²=17.223, p=0.001 by Braden scale risk classification). The Braden scale has low calibration power in predicting PU formation.
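
    To make the reported model concrete, the following sketch converts Logit(P) = -0.433*Braden score + 2.616 into a predicted probability through the inverse logit. The coefficients are taken from the abstract; the function name and the choice of example scores (6 and 23, the conventional limits of the Braden total) are assumptions made for illustration.

```python
import math

# Coefficients as reported in the abstract for the training group.
INTERCEPT = 2.616
SLOPE = -0.433


def predicted_pu_probability(braden_score: float) -> float:
    """Inverse logit of the reported linear predictor (illustrative helper)."""
    logit = SLOPE * braden_score + INTERCEPT
    return 1.0 / (1.0 + math.exp(-logit))


# Lower Braden scores indicate higher risk, so the probability falls as the score rises.
for score in (6, 12, 18, 23):
    print(f"Braden {score:2d}: predicted P(PU) = {predicted_pu_probability(score):.4f}")
```

    This only evaluates the fitted equation; it says nothing about calibration, which the study assessed separately and found to be poor.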

  11. Cognitive and physical functions related to the level of supervision and dependence in the toileting of stroke patients.

    PubMed

    Sato, Atsushi; Okuda, Yutaka; Fujita, Takaaki; Kimura, Norihiko; Hoshina, Noriyuki; Kato, Sayaka; Tanaka, Shigenari

    2016-01-01

    This study aimed to clarify which cognitive and physical factors are associated with the need for toileting assistance in stroke patients and to calculate cut-off values for discriminating between independent-supervision and dependent toileting ability. This cross-sectional study included 163 first-stroke patients in nine convalescent rehabilitation wards. Based on their FIM® instrument score for toileting, the patients were divided into an independent-supervision group and a dependent group. Multiple logistic regression analysis and receiver operating characteristic analysis were performed to identify factors related to toileting performance. The Mini-Mental State Examination (MMSE); the Stroke Impairment Assessment Set (SIAS) score for the affected lower limb, speech, and visuospatial functions; and the Functional Assessment for Control of Trunk (FACT) were analyzed as independent variables. The multiple logistic regression analysis showed that the FIM® instrument score for toileting was associated with the SIAS score for the affected lower limb function, MMSE, and FACT. On receiver operating characteristic analysis, the SIAS score for the affected lower limb function cut-off value was 8/7 points, the MMSE cut-off value was 25/24 points, and the FACT cut-off value was 14/13 points. Affected lower limb function, cognitive function, and trunk function were related to the need for toileting assistance. These cut-off values may be useful for judging whether toileting assistance is needed in stroke patients.

  12. Determination of osteoporosis risk factors using a multiple logistic regression model in postmenopausal Turkish women.

    PubMed

    Akkus, Zeki; Camdeviren, Handan; Celik, Fatma; Gur, Ali; Nas, Kemal

    2005-09-01

    To determine the risk factors of osteoporosis using a multiple binary logistic regression method and to assess the risk variables for osteoporosis, which is a major and growing health problem in many countries. We presented a case-control study, consisting of 126 postmenopausal healthy women as the control group and 225 postmenopausal osteoporotic women as the case group. The study was carried out in the Department of Physical Medicine and Rehabilitation, Dicle University, Diyarbakir, Turkey between 1999 and 2002. The data from the 351 participants were collected using a standard questionnaire that contains 43 variables. A multiple logistic regression model was then used to evaluate the data and to find the best regression model. We correctly classified 80.1% (281/351) of the participants using the regression model. Furthermore, the specificity of the model was 67% (84/126 of the control group) while the sensitivity was 88% (197/225 of the case group). We found the distribution of residual values standardized for the final model to be exponential using the Kolmogorov-Smirnov test (p=0.193). The receiver operating characteristic curve was found successful in predicting patients at risk for osteoporosis. This study suggests that low levels of dietary calcium intake, physical activity, education, and longer duration of menopause are independent predictors of the risk of low bone density in our population. Adequate dietary calcium intake in combination with maintaining a daily physical activity, increasing educational level, decreasing birth rate, and duration of breast-feeding may contribute to healthy bones and play a role in practical prevention of osteoporosis in Southeast Anatolia. In addition, the findings of the present study indicate that the use of a multivariate statistical method such as multiple logistic regression in osteoporosis, which may be influenced by many variables, is better than univariate statistical evaluation.
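
    The classification figures quoted above follow directly from the counts given in the abstract (197 of 225 cases and 84 of 126 controls classified correctly). A minimal check, with variable names chosen here for illustration:

```python
# Counts taken from the abstract.
CASES, CONTROLS = 225, 126
true_positives = 197                      # cases classified as osteoporotic
true_negatives = 84                       # controls classified as healthy
false_negatives = CASES - true_positives
false_positives = CONTROLS - true_negatives

sensitivity = true_positives / (true_positives + false_negatives)
specificity = true_negatives / (true_negatives + false_positives)
accuracy = (true_positives + true_negatives) / (CASES + CONTROLS)

print(f"sensitivity = {sensitivity:.1%}")
print(f"specificity = {specificity:.1%}")
print(f"accuracy    = {accuracy:.1%}")
```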

  13. Classification and regression tree analysis of acute-on-chronic hepatitis B liver failure: Seeing the forest for the trees.

    PubMed

    Shi, K-Q; Zhou, Y-Y; Yan, H-D; Li, H; Wu, F-L; Xie, Y-Y; Braddock, M; Lin, X-Y; Zheng, M-H

    2017-02-01

    At present, there is no ideal model for predicting the short-term outcome of patients with acute-on-chronic hepatitis B liver failure (ACHBLF). This study aimed to establish and validate a prognostic model by using the classification and regression tree (CART) analysis. A total of 1047 patients from two separate medical centres with suspected ACHBLF were screened in the study, which were recognized as derivation cohort and validation cohort, respectively. CART analysis was applied to predict the 3-month mortality of patients with ACHBLF. The accuracy of the CART model was tested using the area under the receiver operating characteristic curve, which was compared with the model for end-stage liver disease (MELD) score and a new logistic regression model. CART analysis identified four variables as prognostic factors of ACHBLF: total bilirubin, age, serum sodium and INR, and three distinct risk groups: low risk (4.2%), intermediate risk (30.2%-53.2%) and high risk (81.4%-96.9%). The new logistic regression model was constructed with four independent factors, including age, total bilirubin, serum sodium and prothrombin activity by multivariate logistic regression analysis. The performances of the CART model (0.896), similar to the logistic regression model (0.914, P=.382), exceeded that of MELD score (0.667, P<.001). The results were confirmed in the validation cohort. We have developed and validated a novel CART model superior to MELD for predicting three-month mortality of patients with ACHBLF. Thus, the CART model could facilitate medical decision-making and provide clinicians with a validated practical bedside tool for ACHBLF risk stratification. © 2016 John Wiley & Sons Ltd.

  14. Identification of immune correlates of protection in Shigella infection by application of machine learning.

    PubMed

    Arevalillo, Jorge M; Sztein, Marcelo B; Kotloff, Karen L; Levine, Myron M; Simon, Jakub K

    2017-10-01

    Immunologic correlates of protection are important in vaccine development because they give insight into mechanisms of protection, assist in the identification of promising vaccine candidates, and serve as endpoints in bridging clinical vaccine studies. Our goal is the development of a methodology to identify immunologic correlates of protection using the Shigella challenge as a model. The proposed methodology utilizes the Random Forests (RF) machine learning algorithm as well as Classification and Regression Trees (CART) to detect immune markers that predict protection, identify interactions between variables, and define optimal cutoffs. Logistic regression modeling is applied to estimate the probability of protection and the confidence interval (CI) for such a probability is computed by bootstrapping the logistic regression models. The results demonstrate that the combination of Classification and Regression Trees and Random Forests complements the standard logistic regression and uncovers subtle immune interactions. Specific levels of immunoglobulin IgG antibody in blood on the day of challenge predicted protection in 75% (95% CI 67-86). Of those subjects that did not have blood IgG at or above a defined threshold, 100% were protected if they had IgA antibody secreting cells above a defined threshold. Comparison with the results obtained by applying only logistic regression modeling with standard Akaike Information Criterion for model selection shows the usefulness of the proposed method. Given the complexity of the immune system, the use of machine learning methods may enhance traditional statistical approaches. When applied together, they offer a novel way to quantify important immune correlates of protection that may help the development of vaccines. Copyright © 2017 Elsevier Inc. All rights reserved.

  15. The quest for conditional independence in prospectivity modeling: weights-of-evidence, boost weights-of-evidence, and logistic regression

    NASA Astrophysics Data System (ADS)

    Schaeben, Helmut; Semmler, Georg

    2016-09-01

    The objective of prospectivity modeling is prediction of the conditional probability of the presence T = 1 or absence T = 0 of a target T given favorable or prohibitive predictors B, or construction of a two-class (0, 1) classification of T. A special case of logistic regression called weights-of-evidence (WofE) is geologists' favorite method of prospectivity modeling due to its apparent simplicity. However, the numerical simplicity is deceiving as it is implied by the severe mathematical modeling assumption of joint conditional independence of all predictors given the target. General weights of evidence are explicitly introduced which are as simple to estimate as conventional weights, i.e., by counting, but do not require conditional independence. Complementary to the regression view is the classification view on prospectivity modeling. Boosting is the construction of a strong classifier from a set of weak classifiers. From the regression point of view it is closely related to logistic regression. Boost weights-of-evidence (BoostWofE) was introduced into prospectivity modeling to counterbalance violations of the assumption of conditional independence even though relaxation of modeling assumptions with respect to weak classifiers was not the (initial) purpose of boosting. In the original publication of BoostWofE a fabricated dataset was used to "validate" this approach. Using the same fabricated dataset it is shown that BoostWofE cannot generally compensate for a lack of conditional independence, whatever the order in which the predictors are processed. Thus the alleged features of BoostWofE are disproved by way of counterexamples, while theoretical findings are confirmed that logistic regression including interaction terms can exactly compensate for violations of joint conditional independence if the predictors are indicators.
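
    As a small illustration of estimating conventional weights of evidence "by counting", the sketch below computes the positive and negative weights and their contrast for a single binary predictor B and target T from a 2x2 table. The counts are hypothetical and the function name is an assumption; this shows only the standard single-predictor definitions, not the general weights introduced in the paper.

```python
import math


def weights_of_evidence(n11: int, n10: int, n01: int, n00: int):
    """Conventional WofE for one binary predictor.

    n11: B=1, T=1   n10: B=1, T=0
    n01: B=0, T=1   n00: B=0, T=0
    """
    targets = n11 + n01          # cells where T = 1
    non_targets = n10 + n00      # cells where T = 0
    w_plus = math.log((n11 / targets) / (n10 / non_targets))
    w_minus = math.log((n01 / targets) / (n00 / non_targets))
    return w_plus, w_minus, w_plus - w_minus


# Hypothetical counts, chosen only for illustration.
w_plus, w_minus, contrast = weights_of_evidence(n11=30, n10=70, n01=10, n00=890)
log_odds_ratio = math.log((30 * 890) / (70 * 10))

print(f"W+ = {w_plus:.4f}, W- = {w_minus:.4f}, contrast = {contrast:.4f}")
print(f"log odds ratio = {log_odds_ratio:.4f}")
```

    For a single binary predictor the contrast equals the log odds ratio, which is also the slope a logistic regression on that one predictor would estimate; this is one way to see why WofE can be regarded as a special case of logistic regression.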

  16. Separation in Logistic Regression: Causes, Consequences, and Control.

    PubMed

    Mansournia, Mohammad Ali; Geroldinger, Angelika; Greenland, Sander; Heinze, Georg

    2018-04-01

    Separation is encountered in regression models with a discrete outcome (such as logistic regression) where the covariates perfectly predict the outcome. It is most frequent under the same conditions that lead to small-sample and sparse-data bias, such as presence of a rare outcome, rare exposures, highly correlated covariates, or covariates with strong effects. In theory, separation will produce infinite estimates for some coefficients. In practice, however, separation may be unnoticed or mishandled because of software limits in recognizing and handling the problem and in notifying the user. We discuss causes of separation in logistic regression and describe how common software packages deal with it. We then describe methods that remove separation, focusing on the same penalized-likelihood techniques used to address more general sparse-data problems. These methods improve accuracy, avoid software problems, and allow interpretation as Bayesian analyses with weakly informative priors. We discuss likelihood penalties, including some that can be implemented easily with any software package, and their relative advantages and disadvantages. We provide an illustration of ideas and methods using data from a case-control study of contraceptive practices and urinary tract infection.
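
    A simple screening check for the most basic form of the problem, complete separation by a single numeric covariate, can be sketched as follows. The helper name and the toy data are hypothetical, and the check is deliberately limited: separation can also arise from a linear combination of several covariates, which this one-variable test will not detect, and it does not implement any of the penalized-likelihood remedies discussed in the paper.

```python
def completely_separates(x, y) -> bool:
    """Return True if a single covariate x perfectly splits a 0/1 outcome y.

    Complete separation holds when every x value in one outcome group lies
    strictly below every x value in the other group.
    """
    x0 = [xi for xi, yi in zip(x, y) if yi == 0]
    x1 = [xi for xi, yi in zip(x, y) if yi == 1]
    if not x0 or not x1:
        return False  # only one outcome class present; nothing to separate
    return max(x0) < min(x1) or max(x1) < min(x0)


# Toy data: the first covariate separates the outcome, the second overlaps.
outcome = [0, 0, 0, 1, 1, 1]
separating = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
overlapping = [1.0, 4.5, 3.0, 4.0, 2.5, 6.0]

print(completely_separates(separating, outcome))
print(completely_separates(overlapping, outcome))
```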

  17. Modeling the dynamics of urban growth using multinomial logistic regression: a case study of Jiayu County, Hubei Province, China

    NASA Astrophysics Data System (ADS)

    Nong, Yu; Du, Qingyun; Wang, Kun; Miao, Lei; Zhang, Weiwei

    2008-10-01

    Urban growth modeling, one of the most important aspects of land use and land cover change study, has attracted substantial attention because it helps to comprehend the mechanisms of land use change and thus supports the making of relevant policies. This study applied multinomial logistic regression to model urban growth in Jiayu County, Hubei Province, China, to discover the relationship between urban growth and its driving forces, with biophysical and socio-economic factors selected as independent variables. This type of regression is similar to binary logistic regression, but it is more general because the dependent variable is not restricted to two categories, as it was in previous studies. The multinomial model can simulate the process of multiple land use competition between urban land, bare land, cultivated land and orchard land. Taking the land use type Urban as the reference category, parameters could be estimated as odds ratios. A probability map is generated from the model to predict where urban growth will occur as a result of the computation.

  18. On the predictability of outliers in ensemble forecasts

    NASA Astrophysics Data System (ADS)

    Siegert, S.; Bröcker, J.; Kantz, H.

    2012-03-01

    In numerical weather prediction, ensembles are used to retrieve probabilistic forecasts of future weather conditions. We consider events where the verification is smaller than the smallest, or larger than the largest ensemble member of a scalar ensemble forecast. These events are called outliers. In a statistically consistent K-member ensemble, outliers should occur with a base rate of 2/(K+1). In operational ensembles this base rate tends to be higher. We study the predictability of outlier events in terms of the Brier Skill Score and find that forecast probabilities can be calculated which are more skillful than the unconditional base rate. This is shown analytically for statistically consistent ensembles. Using logistic regression, forecast probabilities for outlier events in an operational ensemble are calculated. These probabilities exhibit positive skill which is quantitatively similar to the analytical results. Possible causes of these results as well as their consequences for ensemble interpretation are discussed.
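
    The base rate of 2/(K+1) quoted above can be checked with a small simulation. The following Python sketch is an illustration added here, not the authors' code; it assumes that the K ensemble members and the verification are independent draws from the same distribution, which is one way to model a statistically consistent ensemble. The function names and the choice of a uniform distribution are assumptions.

```python
import random


def expected_outlier_rate(k):
    """Base rate 2 / (K + 1) for a statistically consistent K-member ensemble."""
    return 2 / (k + 1)


def simulated_outlier_rate(k, trials=100_000, seed=0):
    """Fraction of trials in which the verification lies outside the ensemble range.

    Assumption: members and verification are i.i.d. uniform draws.
    """
    rng = random.Random(seed)
    outliers = 0
    for _ in range(trials):
        members = [rng.random() for _ in range(k)]
        verification = rng.random()
        if verification < min(members) or verification > max(members):
            outliers += 1
    return outliers / trials


if __name__ == "__main__":
    for k in (4, 9, 19):
        print(k, expected_outlier_rate(k), simulated_outlier_rate(k))
```

    The formula follows because the verification is equally likely to take any of the K+1 rank positions among the members, and two of those positions (lowest and highest) are outliers.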

  19. Logistic Mixed Models to Investigate Implicit and Explicit Belief Tracking.

    PubMed

    Lages, Martin; Scheel, Anne

    2016-01-01

    We investigated the proposition of a two-systems Theory of Mind in adults' belief tracking. A sample of N = 45 participants predicted the choice of one of two opponent players after observing several rounds in an animated card game. Three matches of this card game were played and initial gaze direction on target and subsequent choice predictions were recorded for each belief task and participant. We conducted logistic regressions with mixed effects on the binary data and developed Bayesian logistic mixed models to infer implicit and explicit mentalizing in true belief and false belief tasks. Although logistic regressions with mixed effects predicted the data well, a Bayesian logistic mixed model with latent task- and subject-specific parameters gave a better account of the data. As expected, explicit choice predictions suggested a clear understanding of true and false beliefs (TB/FB). Surprisingly, however, model parameters for initial gaze direction also indicated belief tracking. We discuss why task-specific parameters for initial gaze directions are different from choice predictions yet reflect second-order perspective taking.

  20. Model selection for logistic regression models

    NASA Astrophysics Data System (ADS)

    Duller, Christine

    2012-09-01

    Model selection for logistic regression models decides which of some given potential regressors have an effect and hence should be included in the final model. The second interesting question is whether a certain factor is heterogeneous among some subsets, i.e. whether the model should include a random intercept or not. In this paper these questions will be answered with classical as well as with Bayesian methods. The applications show some results of recent research projects in medicine and business administration.

  1. Radiomorphometric analysis of frontal sinus for sex determination.

    PubMed

    Verma, Saumya; Mahima, V G; Patil, Karthikeya

    2014-09-01

    Sex determination of unknown individuals carries crucial significance in forensic research, in cases where fragments of skull persist with no likelihood of identification based on dental arch. In these instances sex determination becomes important to rule out a certain number of possibilities instantly and helps in establishing a biological profile of human remains. The aim of the study is to evaluate a mathematical method based on logistic regression analysis capable of ascertaining the sex of individuals in the South Indian population. The study was conducted in the department of Oral Medicine and Radiology. The right and left areas, maximum height, and width of the frontal sinus were determined in 100 Caldwell views of 50 women and 50 men aged 20 years and above, with the help of Vernier callipers and a square grid with 1 square measuring 1 mm² in area. Data were analyzed with Student's t-test and logistic regression analysis. The mean values of variables were greater in men, based on Student's t-test at 5% level of significance. The mathematical model based on logistic regression analysis gave percentage agreement of total area to correctly predict the female gender as 55.2%, of right area as 60.9% and of left area as 55.2%. The areas of the frontal sinus and the logistic regression proved to be unreliable in sex determination. (Logit = 0.924 - 0.00217 × right area).
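
    The reported equation (Logit = 0.924 - 0.00217 × right area) can be turned into a predicted probability with the inverse logit. The Python sketch below is an added illustration, not part of the study; the function names are hypothetical, and the abstract does not state which sex is coded as the modeled outcome or the unit of the area, so the result is only "the probability of the modeled outcome".

```python
import math


def frontal_sinus_logit(right_area):
    """Linear predictor as reported in the abstract."""
    return 0.924 - 0.00217 * right_area


def logit_to_probability(logit):
    """Inverse logit (logistic function)."""
    return 1.0 / (1.0 + math.exp(-logit))


def predicted_probability(right_area):
    """Probability of the modeled outcome for a given right frontal sinus area."""
    return logit_to_probability(frontal_sinus_logit(right_area))


if __name__ == "__main__":
    for area in (200, 400, 600):  # hypothetical area values
        print(area, predicted_probability(area))
```

    Because the coefficient is negative, the predicted probability falls as the right area grows, and it equals 0.5 where the logit is zero, at a right area of about 0.924 / 0.00217 ≈ 426.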

  2. Genetic prediction of type 2 diabetes using deep neural network.

    PubMed

    Kim, J; Kim, J; Kwak, M J; Bajaj, M

    2018-04-01

    Type 2 diabetes (T2DM) has strong heritability but genetic models to explain heritability have been challenging. We tested deep neural network (DNN) to predict T2DM using the nested case-control study of Nurses' Health Study (3326 females, 45.6% T2DM) and Health Professionals Follow-up Study (2502 males, 46.5% T2DM). We selected 96, 214, 399, and 678 single-nucleotide polymorphisms (SNPs) through Fisher's exact test and L1-penalized logistic regression. We split each dataset randomly in 4:1 to train prediction models and test their performance. DNN and logistic regressions showed better area under the curve (AUC) of ROC curves than the clinical model when 399 or more SNPs were included. DNN was superior to logistic regressions in AUC with 399 or more SNPs in males and 678 SNPs in females. Addition of clinical factors consistently increased AUC of DNN but failed to improve logistic regressions with 214 or more SNPs. In conclusion, we show that DNN can be a versatile tool to predict T2DM incorporating large numbers of SNPs and clinical information. Limitations include a relatively small number of the subjects mostly of European ethnicity. Further studies are warranted to confirm and improve performance of genetic prediction models using DNN in different ethnic groups. © 2017 John Wiley & Sons A/S. Published by John Wiley & Sons Ltd.

  3. Unconditional or Conditional Logistic Regression Model for Age-Matched Case-Control Data?

    PubMed

    Kuo, Chia-Ling; Duan, Yinghui; Grady, James

    2018-01-01

    Matching on demographic variables is commonly used in case-control studies to adjust for confounding at the design stage. There is a presumption that matched data need to be analyzed by matched methods. Conditional logistic regression has become a standard for matched case-control data to tackle the sparse data problem. The sparse data problem, however, may not be a concern for loose-matching data when the matching between cases and controls is not unique, and one case can be matched to other controls without substantially changing the association. Data matched on a few demographic variables are clearly loose-matching data, and we hypothesize that unconditional logistic regression is a proper method to perform. To address the hypothesis, we compare unconditional and conditional logistic regression models by precision in estimates and hypothesis testing using simulated matched case-control data. Our results support our hypothesis; however, the unconditional model is not as robust as the conditional model to the matching distortion that the matching process not only makes cases and controls similar for matching variables but also for the exposure status. When the study design involves other complex features or the computational burden is high, matching in loose-matching data can be ignored for negligible loss in testing and estimation if the distributions of matching variables are not extremely different between cases and controls.

  4. Unconditional or Conditional Logistic Regression Model for Age-Matched Case–Control Data?

    PubMed Central

    Kuo, Chia-Ling; Duan, Yinghui; Grady, James

    2018-01-01

    Matching on demographic variables is commonly used in case–control studies to adjust for confounding at the design stage. There is a presumption that matched data need to be analyzed by matched methods. Conditional logistic regression has become a standard for matched case–control data to tackle the sparse data problem. The sparse data problem, however, may not be a concern for loose-matching data when the matching between cases and controls is not unique, and one case can be matched to other controls without substantially changing the association. Data matched on a few demographic variables are clearly loose-matching data, and we hypothesize that unconditional logistic regression is a proper method to perform. To address the hypothesis, we compare unconditional and conditional logistic regression models by precision in estimates and hypothesis testing using simulated matched case–control data. Our results support our hypothesis; however, the unconditional model is not as robust as the conditional model to the matching distortion that the matching process not only makes cases and controls similar for matching variables but also for the exposure status. When the study design involves other complex features or the computational burden is high, matching in loose-matching data can be ignored for negligible loss in testing and estimation if the distributions of matching variables are not extremely different between cases and controls. PMID:29552553

  5. Estimating multilevel logistic regression models when the number of clusters is low: a comparison of different statistical software procedures.

    PubMed

    Austin, Peter C

    2010-04-22

    Multilevel logistic regression models are increasingly being used to analyze clustered data in medical, public health, epidemiological, and educational research. Procedures for estimating the parameters of such models are available in many statistical software packages. There is currently little evidence on the minimum number of clusters necessary to reliably fit multilevel regression models. We conducted a Monte Carlo study to compare the performance of different statistical software procedures for estimating multilevel logistic regression models when the number of clusters was low. We examined procedures available in BUGS, HLM, R, SAS, and Stata. We found that there were qualitative differences in the performance of different software procedures for estimating multilevel logistic models when the number of clusters was low. Among the likelihood-based procedures, estimation methods based on adaptive Gauss-Hermite approximations to the likelihood (glmer in R and xtlogit in Stata) or adaptive Gaussian quadrature (Proc NLMIXED in SAS) tended to have superior performance for estimating variance components when the number of clusters was small, compared to software procedures based on penalized quasi-likelihood. However, only Bayesian estimation with BUGS allowed for accurate estimation of variance components when there were fewer than 10 clusters. For all statistical software procedures, estimation of variance components tended to be poor when there were only five subjects per cluster, regardless of the number of clusters.

  6. Building a Decision Support System for Inpatient Admission Prediction With the Manchester Triage System and Administrative Check-in Variables.

    PubMed

    Zlotnik, Alexander; Alfaro, Miguel Cuchí; Pérez, María Carmen Pérez; Gallardo-Antolín, Ascensión; Martínez, Juan Manuel Montero

    2016-05-01

    The usage of decision support tools in emergency departments, based on predictive models, capable of estimating the probability of admission for patients in the emergency department may give nursing staff the possibility of allocating resources in advance. We present a methodology for developing and building one such system for a large specialized care hospital using a logistic regression and an artificial neural network model using nine routinely collected variables available right at the end of the triage process. A database of 255,668 triaged nonobstetric emergency department presentations from the Ramon y Cajal University Hospital of Madrid, from January 2011 to December 2012, was used to develop and test the models, with 66% of the data used for derivation and 34% for validation, with an ordered nonrandom partition. On the validation dataset areas under the receiver operating characteristic curve were 0.8568 (95% confidence interval, 0.8508-0.8583) for the logistic regression model and 0.8575 (95% confidence interval, 0.8540-0.8610) for the artificial neural network model. χ² values for Hosmer-Lemeshow fixed "deciles of risk" were 65.32 for the logistic regression model and 17.28 for the artificial neural network model. A nomogram was generated upon the logistic regression model and an automated software decision support system with a Web interface was built based on the artificial neural network model.

  7. Product unit neural network models for predicting the growth limits of Listeria monocytogenes.

    PubMed

    Valero, A; Hervás, C; García-Gimeno, R M; Zurera, G

    2007-08-01

    A new approach to predict the growth/no growth interface of Listeria monocytogenes as a function of storage temperature, pH, citric acid (CA) and ascorbic acid (AA) is presented. A linear logistic regression procedure was performed and a non-linear model was obtained by adding new variables by means of a Neural Network model based on Product Units (PUNN). The classification efficiency of the training data set and the generalization data of the new Logistic Regression PUNN model (LRPU) were compared with Linear Logistic Regression (LLR) and Polynomial Logistic Regression (PLR) models. 92% of the total cases from the LRPU model were correctly classified, an improvement on the percentage obtained using the PLR model (90%) and significantly higher than the results obtained with the LLR model, 80%. On the other hand, predictions of LRPU were closer to the observed data, which permits the design of proper formulations in minimally processed foods. This novel methodology can be applied to predictive microbiology for describing the growth/no growth interface of food-borne microorganisms such as L. monocytogenes. The optimal balance lies in finding models with an acceptable interpretation capacity and with good ability to fit the data on the boundaries of the variable range. The results obtained suggest that these kinds of models might well be a very valuable tool for mathematical modeling.

  8. Analysis of a database to predict the result of allergy testing in vivo in patients with chronic nasal symptoms.

    PubMed

    Lacagnina, Valerio; Leto-Barone, Maria S; La Piana, Simona; Seidita, Aurelio; Pingitore, Giuseppe; Di Lorenzo, Gabriele

    2014-01-01

    This article uses the logistic regression model for diagnostic decision making in patients with chronic nasal symptoms. We studied the ability of the logistic regression model, obtained by the evaluation of a database, to detect patients with positive allergy skin-prick test (SPT) and patients with negative SPT. The model developed was validated using the data set obtained from another medical institution. The analysis was performed using a database obtained from a questionnaire administered to the patients with nasal symptoms containing personal data, clinical data, and results of allergy testing (SPT). All variables found to be significantly different between patients with positive and negative SPT (p < 0.05) were selected for the logistic regression models and were analyzed with backward stepwise logistic regression, evaluated with area under the curve of the receiver operating characteristic curve. A second set of patients from another institution was used to prove the model. The accuracy of the model in identifying, over the second set, both patients whose SPT will be positive and negative was high. The model detected 96% of patients with nasal symptoms and positive SPT and classified 94% of those with negative SPT. This study is preliminary to the creation of a software that could help the primary care doctors in a diagnostic decision making process (need of allergy testing) in patients complaining of chronic nasal symptoms.

  9. Comparing machine learning and logistic regression methods for predicting hypertension using a combination of gene expression and next-generation sequencing data.

    PubMed

    Held, Elizabeth; Cape, Joshua; Tintle, Nathan

    2016-01-01

    Machine learning methods continue to show promise in the analysis of data from genetic association studies because of the high number of variables relative to the number of observations. However, few best practices exist for the application of these methods. We extend a recently proposed supervised machine learning approach for predicting disease risk by genotypes to be able to incorporate gene expression data and rare variants. We then apply 2 different versions of the approach (radial and linear support vector machines) to simulated data from Genetic Analysis Workshop 19 and compare performance to logistic regression. Method performance was not radically different across the 3 methods, although the linear support vector machine tended to show small gains in predictive ability relative to a radial support vector machine and logistic regression. Importantly, as the number of genes in the models was increased, even when those genes contained causal rare variants, model predictive ability showed a statistically significant decrease in performance for both the radial support vector machine and logistic regression. The linear support vector machine showed more robust performance to the inclusion of additional genes. Further work is needed to evaluate machine learning approaches on larger samples and to evaluate the relative improvement in model prediction from the incorporation of gene expression data.

  10. [Use of multiple regression models in observational studies (1970-2013) and requirements of the STROBE guidelines in Spanish scientific journals].

    PubMed

    Real, J; Cleries, R; Forné, C; Roso-Llorach, A; Martínez-Sánchez, J M

    In medicine and biomedical research, statistical techniques like logistic, linear, Cox and Poisson regression are widely known. The main objective is to describe the evolution of multivariate techniques used in observational studies indexed in PubMed (1970-2013), and to check the requirements of the STROBE guidelines in the author guidelines in Spanish journals indexed in PubMed. A targeted PubMed search was performed to identify papers that used logistic, linear, Cox and Poisson models. Furthermore, a review was also made of the author guidelines of journals published in Spain and indexed in PubMed and Web of Science. Only 6.1% of the indexed manuscripts included a term related to multivariate analysis, increasing from 0.14% in 1980 to 12.3% in 2013. In 2013, 6.7, 2.5, 3.5, and 0.31% of the manuscripts contained terms related to logistic, linear, Cox and Poisson regression, respectively. On the other hand, 12.8% of journals' author guidelines explicitly recommend following the STROBE guidelines, and 35.9% recommend the CONSORT guideline. A low percentage of Spanish scientific journals indexed in PubMed include the STROBE statement requirement in the author guidelines. Multivariate regression models such as logistic, linear, Cox and Poisson regression are increasingly used in published observational studies, both at international level and in journals published in Spanish. Copyright © 2015 Sociedad Española de Médicos de Atención Primaria (SEMERGEN). Publicado por Elsevier España, S.L.U. All rights reserved.

  11. The microbiological profile and presence of bloodstream infection influence mortality rates in necrotizing fasciitis

    PubMed Central

    2011-01-01

    Introduction: Necrotizing fasciitis (NF) is a life threatening infectious disease with a high mortality rate. We carried out a microbiological characterization of the causative pathogens. We investigated the correlation of mortality in NF with bloodstream infection and with the presence of co-morbidities. Methods: In this retrospective study, we analyzed 323 patients who presented with necrotizing fasciitis at two different institutions. Bloodstream infection (BSI) was defined as a positive blood culture result. The patients were categorized as survivors and non-survivors. Eleven clinically important variables which were statistically significant by univariate analysis were selected for multivariate regression analysis and a stepwise logistic regression model was developed to determine the association between BSI and mortality. Results: Univariate logistic regression analysis showed that patients with hypotension, heart disease, liver disease, presence of Vibrio spp. in wound cultures, presence of fungus in wound cultures, and presence of Streptococcus group A, Aeromonas spp. or Vibrio spp. in blood cultures, had a significantly higher risk of in-hospital mortality. Our multivariate logistic regression analysis showed a higher risk of mortality in patients with pre-existing conditions like hypotension, heart disease, and liver disease. Multivariate logistic regression analysis also showed that presence of Vibrio spp. in wound cultures, and presence of Streptococcus group A in blood cultures were associated with a high risk of mortality while debridement ≥ 3 was associated with improved survival. Conclusions: Mortality in patients with necrotizing fasciitis was significantly associated with the presence of Vibrio in wound cultures and Streptococcus group A in blood cultures. PMID:21693053

  12. Prediction of siRNA potency using sparse logistic regression.

    PubMed

    Hu, Wei; Hu, John

    2014-06-01

    RNA interference (RNAi) can modulate gene expression at post-transcriptional as well as transcriptional levels. Short interfering RNA (siRNA) serves as a trigger for the RNAi gene inhibition mechanism, and therefore is a crucial intermediate step in RNAi. There have been extensive studies to identify the sequence characteristics of potent siRNAs. One such study built a linear model using LASSO (Least Absolute Shrinkage and Selection Operator) to measure the contribution of each siRNA sequence feature. This model is simple and interpretable, but it requires a large number of nonzero weights. We have introduced a novel technique, sparse logistic regression, to build a linear model using single-position specific nucleotide compositions which has the same prediction accuracy of the linear model based on LASSO. The weights in our new model share the same general trend as those in the previous model, but have only 25 nonzero weights out of a total 84 weights, a 54% reduction compared to the previous model. Contrary to the linear model based on LASSO, our model suggests that only a few positions are influential on the efficacy of the siRNA, which are the 5' and 3' ends and the seed region of siRNA sequences. We also employed sparse logistic regression to build a linear model using dual-position specific nucleotide compositions, a task LASSO is not able to accomplish well due to its high dimensional nature. Our results demonstrate the superiority of sparse logistic regression as a technique for both feature selection and regression over LASSO in the context of siRNA design.

  13. Racial and ethnic differences in pediatric obesity-prevention counseling: national prevalence of clinician practices.

    PubMed

    Branner, Christopher M; Koyama, Tatsuki; Jensen, Gordon L

    2008-03-01

    To assess the frequency of clinician-reported delivery of obesity-prevention counseling (OPC) at well-child visits; evaluating for racial/ethnic discrepancies. Combined, weighted well-child visit data from the National Ambulatory Medical Care Survey (NAMCS) and National Hospital Ambulatory Medical Care Survey (NHAMCS) from 2001 to 2004 were analyzed for patients aged 4-18 years. Obesity-prevention counseling was defined as the combined delivery of diet/nutrition and exercise counseling. Patients receiving over- or underweight related diagnoses were excluded. Counseling frequencies were calculated. Multivariate logistic regression models examined the relationship of OPC with race, ethnicity, region, provider, sex, age, and payor type. Of 55,695,554 (weighted) visits, 24.4% included OPC (90.8% of these from NAMCS). 15.4% of Hispanic patients received OPC compared to 28.8% of non-Hispanics. Frequencies were similar between Whites and Blacks (25.0 and 27.1%). Patients with private insurance received more counseling (26.9%) than Medicaid (19.1%) or self-pay (15.1%). In logistic regression models, non-Hispanics were more likely to receive OPC (odds ratio (OR) = 1.94; confidence interval (CI) = 1.13-3.32), and patients in the West were less likely to receive OPC (OR = 0.39; CI = 0.18-0.85). Payor type was not predictive in regression analysis. Patients in hospital-based practices received less OPC (11.9% vs. 25.7% with OR = 0.40; CI = 0.22-0.74). Obesity prevention, like treatment, is a complex and multifactorial process. With the documented racial and ethnic disparities in rates of pediatric obesity, reasons for discrepancies in the provision of OPC must be further investigated as preventive strategies are formulated.

  14. Evaluation of spectral domain optical coherence tomography parameters in ocular hypertension, preperimetric, and early glaucoma

    PubMed Central

    Aydoğan, Tuğba; Akçay, Betül İlkay Sezgin; Kardeş, Esra; Ergin, Ahmet

    2017-01-01

    Purpose: The objective of this study is to evaluate the diagnostic ability of retinal nerve fiber layer (RNFL), macular, optic nerve head (ONH) parameters in healthy subjects, ocular hypertension (OHT), preperimetric glaucoma (PPG), and early glaucoma (EG) patients, to reveal factors affecting the diagnostic ability of spectral domain-optical coherence tomography (SD-OCT) parameters and risk factors for glaucoma. Methods: Three hundred and twenty-six eyes (89 healthy, 77 OHT, 94 PPG, and 66 EG eyes) were analyzed. RNFL, macular, and ONH parameters were measured with SD-OCT. The area under the receiver operating characteristic curve (AUC) and sensitivity at 95% specificity was calculated. Logistic regression analysis was used to determine the glaucoma risk factors. Receiver operating characteristic regression analysis was used to evaluate the influence of covariates on the diagnostic ability of parameters. Results: In PPG patients, parameters that had the largest AUC value were average RNFL thickness (0.83) and rim volume (0.83). In EG patients, parameter that had the largest AUC value was average RNFL thickness (0.98). The logistic regression analysis showed average RNFL thickness was a risk factor for both PPG and EG. Diagnostic ability of average RNFL and average ganglion cell complex thickness increased as disease severity increased. Signal strength index did not affect diagnostic abilities. Diagnostic ability of average RNFL and rim area increased as disc area increased. Conclusion: When evaluating patients with glaucoma, patients at risk for glaucoma, and healthy controls RNFL parameters deserve more attention in clinical practice. Further studies are needed to fully understand the influence of covariates on the diagnostic ability of OCT parameters. PMID:29133640

  15. Role of the Egami score to predict immunoglobulin resistance in Kawasaki disease among a Western Mediterranean population.

    PubMed

    Sánchez-Manubens, Judith; Antón, Jordi; Bou, Rosa; Iglesias, Estíbaliz; Calzada-Hernandez, Joan; Borlan, Sergi; Gimenez-Roca, Clara; Rivera, Josefa

    2016-07-01

    Kawasaki disease is an acute self-limited systemic vasculitis common in childhood. Intravenous immunoglobulin (IVIG) is an effective treatment, and it reduces the incidence of cardiac complications. The Egami score has been validated to identify IVIG non-responder patients in the Japanese population, and it has shown high sensitivity and specificity to identify these non-responder patients. Despite its effectiveness in Japan, the Egami score has been shown to be ineffective in non-Japanese populations. The aim of this study was to apply the Egami score in a Western Mediterranean population in Catalonia (Spain). This was an observational population-based study that included patients from all Pediatric Units in 33 Catalan hospitals, under both public and private management, between January 2004 and March 2014. Sensitivity and specificity for the Egami score were calculated, and a logistic regression analysis of predictors of overall response to IVIG was also developed. Predicting IVIG resistance with a cutoff for Egami score ≥3 obtained 26 % sensitivity and 82 % specificity. Negative predictive value was 85 % and positive predictive value 22 %. This low sensitivity implies that three out of four non-responders will not be identified by the Egami score. Besides, logistic regression models did not find significance for the use of the Egami score to predict IVIG resistance in the Catalan population despite an area under the ROC curve of 0.618 (95 % CI 0.538-0.698, p < 0.001). Although regression models found an area under the ROC curve >0.5 to predict IVIG resistance, the low sensitivity excludes the Egami score as a useful tool to predict IVIG resistance in the Catalan population.
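
    The relationship between the reported sensitivity, specificity and predictive values can be illustrated with Bayes' rule. The Python sketch below is an addition for illustration only. The abstract does not report the proportion of IVIG non-responders; the prevalence of 0.163 used here is an assumption, back-calculated so that the reported 26 % sensitivity and 82 % specificity reproduce predictive values close to the reported 22 % and 85 %. The function name is hypothetical.

```python
def predictive_values(sensitivity, specificity, prevalence):
    """Return (PPV, NPV) from sensitivity, specificity and prevalence via Bayes' rule."""
    true_pos = sensitivity * prevalence
    false_neg = (1 - sensitivity) * prevalence
    true_neg = specificity * (1 - prevalence)
    false_pos = (1 - specificity) * (1 - prevalence)
    ppv = true_pos / (true_pos + false_pos)
    npv = true_neg / (true_neg + false_neg)
    return ppv, npv


if __name__ == "__main__":
    # 0.163 is an assumed, back-calculated prevalence, not a reported figure.
    ppv, npv = predictive_values(sensitivity=0.26, specificity=0.82, prevalence=0.163)
    print(round(ppv, 3), round(npv, 3))
    # Share of non-responders missed by the score: 1 - sensitivity.
    print(1 - 0.26)
```

    The share of missed non-responders is 1 - 0.26 = 0.74, which is the "three out of four" stated in the abstract.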

  16. Simple, validated vaginal birth after cesarean delivery prediction model for use at the time of admission.

    PubMed

    Metz, Torri D; Stoddard, Gregory J; Henry, Erick; Jackson, Marc; Holmgren, Calla; Esplin, Sean

    2013-09-01

    To create a simple tool for predicting the likelihood of successful trial of labor after cesarean delivery (TOLAC) during the pregnancy after a primary cesarean delivery using variables available at the time of admission. Data for all deliveries at 14 regional hospitals over an 8-year period were reviewed. Women with one cesarean delivery and one subsequent delivery were included. Variables associated with successful VBAC were identified using multivariable logistic regression. Points were assigned to these characteristics, with weighting based on the coefficients in the regression model to calculate an integer VBAC score. The VBAC score was correlated with TOLAC success rate and was externally validated in an independent cohort using a logistic regression model. A total of 5,445 women met inclusion criteria. Of those women, 1,170 (21.5%) underwent TOLAC. Of the women who underwent trial of labor, 938 (80%) had a successful VBAC. A VBAC score was generated based on the Bishop score (cervical examination) at the time of admission, with points added for history of vaginal birth, age younger than 35 years, absence of recurrent indication, and body mass index less than 30. Women with a VBAC score less than 10 had a likelihood of TOLAC success less than 50%. Women with a VBAC score more than 16 had a TOLAC success rate more than 85%. The model performed well in an independent cohort with an area under the curve of 0.80 (95% confidence interval 0.76-0.84). Prediction of TOLAC success at the time of admission is highly dependent on the initial cervical examination. This simple VBAC score can be utilized when counseling women considering TOLAC. II.

  17. Effect of plasma homocysteine level and urinary monomethylarsonic acid on the risk of arsenic-associated carotid atherosclerosis

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Wu, M.-M.; Graduate Institute of Medicine, College of Medicine, Fu-Jen Catholic University, Taipei, Taiwan; Chiou, H.-Y.

    2006-10-01

    Arsenic-contaminated well water has been shown to increase the risk of atherosclerosis. Because of involving S-adenosylmethionine, homocysteine may modify the risk by interfering with the biomethylation of ingested arsenic. In this study, we assessed the effect of plasma homocysteine level and urinary monomethylarsonic acid (MMA(V)) on the risk of atherosclerosis associated with arsenic. In total, 163 patients with carotid atherosclerosis and 163 controls were studied. Lifetime cumulative arsenic exposure from well water for study subjects was measured as index of arsenic exposure. Homocysteine level was determined by high-performance liquid chromatography (HPLC). Proportion of MMA(V) (MMA%) was calculated by dividing with total arsenic species in urine, including arsenite, arsenate, MMA(V), and dimethylarsinic acid (DMA(V)). Results of multiple linear regression analysis show a positive correlation of plasma homocysteine levels to the cumulative arsenic exposure after controlling for atherosclerosis status and nutritional factors (P < 0.05). This correlation, however, did not change substantially the effect of arsenic exposure on the risk of atherosclerosis as analyzed in a subsequent logistic regression model. Logistic regression analyses also show that elevated plasma homocysteine levels did not confer an independent risk for developing atherosclerosis in the study population. However, the risk of having atherosclerosis was increased to 5.4-fold (95% CI, 2.0-15.0) for the study subjects with high MMA% (≥16.5%) and high homocysteine levels (≥12.7 μmol/l) as compared to those with low MMA% (<9.9%) and low homocysteine levels (<12.7 μmol/l). Elevated homocysteinemia may exacerbate the formation of atherosclerosis related to arsenic exposure in individuals with high levels of MMA% in urine.

  18. Comparison between antegrade and retrograde cerebral perfusion or profound hypothermia as brain protection strategies during repair of type A aortic dissection

    PubMed Central

    Rausch, Laura A.; Kouchoukos, Nicholas T.; Lobdell, Kevin W.; Khabbaz, Kamal; Murphy, Edward; Hagberg, Robert C.

    2016-01-01

    Background The goal of this study was to compare early postoperative outcomes and actuarial-free survival between patients who underwent repair of acute type A aortic dissection by the method of cerebral perfusion used. Methods A total of 324 patients from five academic medical centers underwent repair of acute type A aortic dissection between January 2000 and December 2010. Of those, antegrade cerebral perfusion (ACP) was used for 84 patients, retrograde cerebral perfusion (RCP) was used for 55 patients, and deep hypothermic circulatory arrest (DHCA) was used for 184 patients during repair. Major morbidity, operative mortality, and 5-year actuarial survival were compared between groups. Multivariate logistic regression was used to determine predictors of operative mortality and Cox regression hazard ratios were calculated to determine the predictors of long term mortality. Results Operative mortality was not influenced by the type of cerebral protection (19% for ACP, 14.5% for RCP and 19.1% for DHCA, P=0.729). In multivariable logistic regression analysis, hemodynamic instability [odds ratio (OR) =19.6, 95% confidence intervals (CI), 0.102–0.414, P<0.001] and CPB time >200 min (OR =4.7, 95% CI, 1.962–1.072, P=0.029) emerged as independent predictors of operative mortality. Actuarial 5-year survival was unchanged by cerebral protection modality (48.8% for ACP, 61.8% for RCP and 66.8% for no cerebral protection, log-rank P=0.844). Conclusions During surgical repair of type A aortic dissection, ACP, RCP or DHCA are safe strategies for cerebral protection in selected patients with type A aortic dissection. PMID:27563545

  19. Content Coding of Psychotherapy Transcripts Using Labeled Topic Models.

    PubMed

    Gaut, Garren; Steyvers, Mark; Imel, Zac E; Atkins, David C; Smyth, Padhraic

    2017-03-01

    Psychotherapy represents a broad class of medical interventions received by millions of patients each year. Unlike most medical treatments, its primary mechanisms are linguistic; i.e., the treatment relies directly on a conversation between a patient and provider. However, the evaluation of patient-provider conversation suffers from critical shortcomings, including intensive labor requirements, coder error, nonstandardized coding systems, and inability to scale up to larger data sets. To overcome these shortcomings, psychotherapy analysis needs a reliable and scalable method for summarizing the content of treatment encounters. We used a publicly available psychotherapy corpus from Alexander Street Press comprising a large collection of transcripts of patient-provider conversations to compare coding performance for two machine learning methods. We used the labeled latent Dirichlet allocation (L-LDA) model to learn associations between text and codes, to predict codes in psychotherapy sessions, and to localize specific passages of within-session text representative of a session code. We compared the L-LDA model to a baseline lasso regression model using predictive accuracy and model generalizability (measured by calculating the area under the curve (AUC) from the receiver operating characteristic curve). The L-LDA model outperforms the lasso logistic regression model at predicting session-level codes with average AUC scores of 0.79, and 0.70, respectively. For fine-grained level coding, L-LDA and logistic regression are able to identify specific talk-turns representative of symptom codes. However, model performance for talk-turn identification is not yet as reliable as human coders. We conclude that the L-LDA model has the potential to be an objective, scalable method for accurate automated coding of psychotherapy sessions that performs better than comparable discriminative methods at session-level coding and can also predict fine-grained codes.
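
    The AUC used to compare the two models can be read as the probability that a randomly chosen positive session receives a higher predicted score than a randomly chosen negative one. The sketch below computes it by direct pairwise comparison; the function name, scores and labels are hypothetical, and the abstract does not state how the authors computed AUC.

```python
def auc(scores, labels):
    """Area under the ROC curve via pairwise comparison.

    Counts the fraction of (positive, negative) pairs in which the
    positive case has the higher score; ties count as half a win.
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical predicted scores and true codes, for illustration only.
scores = [0.9, 0.8, 0.3, 0.1]
labels = [1, 0, 1, 0]
print(auc(scores, labels))
```

    The pairwise form is quadratic in sample size; it is chosen here for clarity, not efficiency.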

  20. An epidemiological survey on road traffic crashes in Iran: application of the two logistic regression models.

    PubMed

    Bakhtiyari, Mahmood; Mehmandar, Mohammad Reza; Mirbagheri, Babak; Hariri, Gholam Reza; Delpisheh, Ali; Soori, Hamid

    2014-01-01

    Risk factors of human-related traffic crashes are the most important and preventable challenges for community health due to their noteworthy burden, particularly in developing countries. The present study aims to investigate the role of human risk factors of road traffic crashes in Iran. Through a cross-sectional study using the COM 114 data collection forms, the police records of almost 600,000 crashes that occurred in 2010 are investigated. The binary logistic regression and proportional odds regression models are used. The odds ratio for each risk factor is calculated. These models are adjusted for known confounding factors including age, sex and driving time. The traffic crash reports of 537,688 men (90.8%) and 54,480 women (9.2%) are analysed. The mean age is 34.1 ± 14 years. Not maintaining eyes on the road (53.7%) and losing control of the vehicle (21.4%) are the main causes of drivers' deaths in traffic crashes within cities. Not maintaining eyes on the road is also the most frequent human risk factor for road traffic crashes out of cities. Sudden lane excursion (OR = 9.9, 95% CI: 8.2-11.9) and seat belt non-compliance (OR = 8.7, CI: 6.7-10.1), exceeding authorised speed (OR = 17.9, CI: 12.7-25.1) and exceeding safe speed (OR = 9.7, CI: 7.2-13.2) are the most significant human risk factors for traffic crashes in Iran. The high mortality rate of 39 people for every 100,000 population emphasises the importance of traffic crashes in Iran. Considering the important role of human risk factors in traffic crashes, strenuous efforts are required to control dangerous driving behaviours such as exceeding speed, illegal overtaking and not maintaining eyes on the road.
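
    The odds ratios in this abstract are adjusted estimates from regression models. As a simpler point of reference, the sketch below computes an unadjusted odds ratio and its Wald (Woolf) 95% confidence interval from a 2x2 table. The function name and counts are hypothetical and do not come from the study.

```python
import math


def odds_ratio_ci(a, b, c, d, z=1.96):
    """Unadjusted odds ratio with a Wald confidence interval.

    a: exposed cases, b: exposed non-cases,
    c: unexposed cases, d: unexposed non-cases.
    """
    or_ = (a * d) / (b * c)
    # Standard error of log(OR) under the Woolf approximation.
    se = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(or_) - z * se)
    upper = math.exp(math.log(or_) + z * se)
    return or_, lower, upper


# Hypothetical counts, for illustration only.
or_, lower, upper = odds_ratio_ci(a=20, b=80, c=10, d=90)
print(f"OR = {or_:.2f}, 95% CI: {lower:.2f}-{upper:.2f}")
```

    An adjusted odds ratio would instead be obtained by exponentiating the fitted coefficient of the risk factor in a model that also includes the confounders.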

  1. Impact of parental-rearing styles on irritable bowel syndrome in adolescents: a school-based study.

    PubMed

    Xing, Zhouxiong; Hou, Xiaohua; Zhou, Kan; Qin, Diyuan; Pan, Wen

    2014-03-01

    A strong association between family function and irritable bowel syndrome (IBS) has been observed. Parental rearing styles, as a comprehensive marker for family function, may provide new clues to the etiology of IBS. This study aimed to explore which dimensions of parental rearing styles were risk factors or protective factors for IBS in adolescents. Two thousand three hundred twenty adolescents were recruited from one middle school and one high school randomly selected from Jiangan District (an urban district in Wuhan City). Data were collected using two Chinese versions of validated self-report questionnaires including the Rome III diagnostic criteria for pediatric IBS and the Egna Minnen Beträffande Uppfostran: One's Memories of Upbringing for perceived parental rearing styles. Ninety-six subjects diagnosed with pediatric IBS were compared with 1618 controls. The IBS patients reported less paternal and maternal emotional warmth (all P < 0.01) and more paternal and maternal punishment, overinterference, rejection, and overprotection (the last only for fathers) (all P < 0.01) than the controls. Furthermore, the IBS patients had higher total scores of parental rearing styles (all P < 0.001) than the controls. With univariate logistic regression, standardized regression coefficients and odds ratios of parental rearing variables were calculated. Multivariate logistic regression revealed that paternal rejection (P = 0.001) and maternal overinterference (P = 0.002) were independent risk factors for IBS in adolescents. Parental emotional warmth is a protective factor for IBS in adolescents and parental punishment, overinterference, rejection, and overprotection are risk factors for IBS in adolescents. © 2013 Journal of Gastroenterology and Hepatology Foundation and Wiley Publishing Asia Pty Ltd.

  2. Content Coding of Psychotherapy Transcripts Using Labeled Topic Models

    PubMed Central

    Gaut, Garren; Steyvers, Mark; Imel, Zac E; Atkins, David C; Smyth, Padhraic

    2016-01-01

    Psychotherapy represents a broad class of medical interventions received by millions of patients each year. Unlike most medical treatments, its primary mechanisms are linguistic; i.e., the treatment relies directly on a conversation between a patient and provider. However, the evaluation of patient-provider conversation suffers from critical shortcomings, including intensive labor requirements, coder error, non-standardized coding systems, and inability to scale up to larger data sets. To overcome these shortcomings, psychotherapy analysis needs a reliable and scalable method for summarizing the content of treatment encounters. We used a publicly-available psychotherapy corpus from Alexander Street Press comprising a large collection of transcripts of patient-provider conversations to compare coding performance for two machine learning methods. We used the Labeled Latent Dirichlet Allocation (L-LDA) model to learn associations between text and codes, to predict codes in psychotherapy sessions, and to localize specific passages of within-session text representative of a session code. We compared the L-LDA model to a baseline lasso regression model using predictive accuracy and model generalizability (measured by calculating the area under the curve (AUC) from the receiver operating characteristic (ROC) curve). The L-LDA model outperforms the lasso logistic regression model at predicting session-level codes with average AUC scores of .79, and .70, respectively. For fine-grained level coding, L-LDA and logistic regression are able to identify specific talk-turns representative of symptom codes. However, model performance for talk-turn identification is not yet as reliable as human coders. We conclude that the L-LDA model has the potential to be an objective, scalable method for accurate automated coding of psychotherapy sessions that performs better than comparable discriminative methods at session-level coding and can also predict fine-grained codes. PMID:26625437

  3. Factors determining the smooth flow and the non-operative time in a one-induction room to one-operating room setting

    PubMed Central

    Mulier, Jan P; De Boeck, Liesje; Meulders, Michel; Beliën, Jeroen; Colpaert, Jan; Sels, Annabel

    2015-01-01

    Rationale, aims and objectives What factors determine the use of an anaesthesia preparation room and shorten non-operative time? Methods A logistic regression is applied to 18 751 surgery records from AZ Sint-Jan Brugge AV, Belgium, where each operating room has its own anaesthesia preparation room. Surgeries in which the patient's induction has already started when the preceding patient's surgery has ended belong to a first group where the preparation room is used as an induction room. Surgeries not fulfilling this property belong to a second group. A logistic regression model tries to predict the probability that a surgery will be classified into a specific group. Non-operative time is calculated as the time between end of the previous surgery and incision of the next surgery. A log-linear regression of this non-operative time is performed. Results It was found that switches in surgeons, being a non-elective surgery as well as the previous surgery being non-elective, increase the probability of being classified into the second group. Only a few surgery types, anaesthesiologists and operating rooms can be found exclusively in one of the two groups. Analysis of variance demonstrates that the first group has significantly lower non-operative times. Switches in surgeons, anaesthesiologists and longer scheduled durations of the previous surgery increase the non-operative time. A switch in both surgeon and anaesthesiologist strengthens this negative effect. Only a few operating rooms and surgery types influence the non-operative time. Conclusion The use of the anaesthesia preparation room shortens the non-operative time and is determined by several human and structural factors. PMID:25496600

  4. Windows of achievement for development milestones of Sri Lankan infants and toddlers: estimation through statistical modelling.

    PubMed

    Thalagala, N

    2015-11-01

    The normative age ranges during which cohorts of children achieve milestones are called windows of achievement. The patterns of these windows of achievement are known to be both genetically and environmentally dependent. This study aimed to determine the windows of achievement for motor, social emotional, language and cognitive development milestones for infants and toddlers in Sri Lanka. A set of 293 milestones identified through a literature review were subjected to content validation using parent and expert reviews, which resulted in the selection of a revised set of 277 milestones. Thereafter, a sample of 1036 children from 2 months to 30 months was examined to see whether or not they had attained the selected milestones. Percentile ages of attaining milestones were determined using a rearranged closed form equation related to the logistic regression. The parameters required for calculations were derived through the logistic regression of milestone achievement statuses against ages of children. These percentile ages were used to define the respective windows of achievement. A set of 178 robust indicators that represent motor, socio emotional, language and cognitive development skills and their windows of achievement relevant to 2 to 24 months of age were determined. Windows of achievement for six gross motor milestones determined in the study were shown to closely overlap a similar set of windows of achievement published by the World Health Organization, indicating the validity of some findings. A methodology combining content validation based on qualitative techniques and age validation based on regression modelling was found to be effective for determining age percentiles for realizing milestones and determining respective windows of achievement. © 2015 John Wiley & Sons Ltd.
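
    The abstract does not print the rearranged equation. Assuming the usual single-predictor model logit(p) = b0 + b1 * age, solving for age gives age_p = (ln(p / (1 - p)) - b0) / b1. The sketch below applies that rearrangement; the function name and the coefficient values are hypothetical.

```python
import math


def percentile_age(b0, b1, p):
    """Age at which a fraction p of children has attained a milestone.

    Assumes logit(p) = b0 + b1 * age, rearranged to solve for age.
    """
    return (math.log(p / (1 - p)) - b0) / b1


# Hypothetical fitted coefficients (age in months), for illustration only.
b0, b1 = -6.0, 0.5
window = [percentile_age(b0, b1, p) for p in (0.10, 0.50, 0.90)]
print([round(age, 1) for age in window])
```

    A window of achievement could then be taken as the interval between a low and a high percentile age, for example the 10th and 90th; which percentiles the study used is not stated in the abstract.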

  5. Family practitioners' diagnostic decision-making processes regarding patients with respiratory tract infections: an observational study.

    PubMed

    Fischer, Thomas; Fischer, Susanne; Himmel, Wolfgang; Kochen, Michael M; Hummers-Pradier, Eva

    2008-01-01

    The influence of patient characteristics on family practitioners' (FPs') diagnostic decision making has mainly been investigated using indirect methods such as vignettes or questionnaires. Direct observation, borrowed from social and cultural anthropology, may be an alternative method for describing FPs' real-life behavior and may help in gaining insight into how FPs diagnose respiratory tract infections, which are frequent in primary care. To clarify FPs' diagnostic processes when treating patients suffering from symptoms of respiratory tract infection. This direct observation study was performed in 30 family practices using a checklist for patient complaints, history taking, physical examination, and diagnoses. The influence of patients' symptoms and complaints on the FPs' physical examination and diagnosis was calculated by logistic regression analyses. Dummy variables based on combinations of symptoms and complaints were constructed and tested against saturated (full) and backward regression models. In total, 273 patients (median age 37 years, 51% women) were included. The median number of symptoms described was 4 per patient, and most information was provided at the patients' own initiative. Multiple logistic regression analysis showed a strong association between patients' complaints and the physical examination. Frequent diagnoses were upper respiratory tract infection (URTI)/common cold (43%), bronchitis (26%), sinusitis (12%), and tonsillitis (11%). There were no significant statistical differences between "simple heuristic" models and saturated regression models in the diagnoses of bronchitis, sinusitis, and tonsillitis, indicating that simple heuristics are probably used by the FPs, whereas "URTI/common cold" was better explained by the full model. FPs tended to make their diagnosis based on a few patient symptoms and a limited physical examination. Simple heuristic models were almost as powerful in explaining most diagnoses as saturated models.
    Direct observation allowed for the study of decision making under real conditions, yielding both quantitative data and "qualitative" information about the FPs' performance. It is important for investigators to be aware of the specific disadvantages of the method (e.g., a possible observer effect).

  6. The relationship between hemoglobin level and the type 1 diabetic nephropathy in Anhui Han's patients.

    PubMed

    Jiang, Jun; Lei, Lan; Zhou, Xiaowan; Li, Peng; Wei, Ren

    2018-02-20

    Recent studies have shown that low hemoglobin (Hb) levels promote the progression of chronic kidney disease. This study assessed the relationship between Hb level and type 1 diabetic nephropathy (DN) in Anhui Han's patients. A total of 236 patients diagnosed with type 1 diabetes mellitus (T1DM) were seen between January 2014 and December 2016 in our centre. Hemoglobin levels in patients with DN were compared with those without DN. The relationship between Hb level and the urinary albumin-creatinine ratio (ACR) was examined by Spearman's correlational analysis and multiple stepwise regression analysis. Binary logistic multivariate regression analysis was performed to analyze the correlated factors for type 1 DN and to calculate the odds ratio (OR) and 95% confidence interval (CI). The predictive value of Hb level for DN was evaluated by the area under the receiver operating characteristic curve (AUROC) for discrimination and the Hosmer-Lemeshow goodness-of-fit test for calibration. The average Hb levels in the DN group (116.1 ± 20.8 g/L) were significantly lower than in the non-DN group (131.9 ± 14.4 g/L), P < 0.001. Hb levels were independently correlated with the urinary ACR in multiple stepwise regression analysis. The logistic multivariate regression analysis showed that the Hb level (OR: 0.936, 95% CI: 0.910 to 0.963, P < 0.001) was inversely correlated with DN in patients with T1DM. In sub-analysis, low Hb level (Hb < 120 g/L in females, Hb < 130 g/L in males) was still negatively associated with DN in patients with T1DM. The AUROC was 0.721 (95% CI: 0.655 to 0.787) in assessing the discrimination of the Hb level for DN. The value of P was 0.593 in the Hosmer-Lemeshow goodness-of-fit test. In Anhui Han's patients with T1DM, the Hb level is inversely correlated with urinary ACR and DN. This article is protected by copyright. All rights reserved.
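
    An odds ratio of 0.936 looks small because it applies to a single unit of Hb, presumably 1 g/L (the abstract does not state the unit of the predictor). Assuming that unit and a linear effect on the log-odds scale, the odds ratio for a larger difference is the per-unit value raised to that power. The sketch below rescales the reported estimate and interval to a 10 g/L difference; the function name is hypothetical.

```python
def rescale_odds_ratio(or_per_unit, units):
    """Odds ratio for a change of `units`, given the per-unit odds ratio.

    Valid only if the predictor enters the model linearly on the
    log-odds scale (an assumption, not stated in the abstract).
    """
    return or_per_unit ** units


# Reported per-unit estimate and 95% CI bounds from the abstract.
estimate, lower, upper = 0.936, 0.910, 0.963
per_10 = [rescale_odds_ratio(v, 10) for v in (estimate, lower, upper)]
print([round(v, 3) for v in per_10])
```

    Under these assumptions a 10 g/L higher Hb corresponds to roughly half the odds of DN, which is easier to interpret clinically than the per-unit figure.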

  7. Knowledge, Attitude, and Practices Regarding Vector-borne Diseases in Western Jamaica.

    PubMed

    Alobuia, Wilson M; Missikpode, Celestin; Aung, Maung; Jolly, Pauline E

    2015-01-01

    Outbreaks of vector-borne diseases (VBDs) such as dengue and malaria can overwhelm health systems in resource-poor countries. Environmental management strategies that reduce or eliminate vector breeding sites combined with improved personal prevention strategies can help to significantly reduce transmission of these infections. The aim of this study was to assess the knowledge, attitudes, and practices (KAPs) of residents in western Jamaica regarding control of mosquito vectors and protection from mosquito bites. A cross-sectional study was conducted between May and August 2010 among patients or family members of patients waiting to be seen at hospitals in western Jamaica. Participants completed an interviewer-administered questionnaire on sociodemographic factors and KAPs regarding VBDs. KAP scores were calculated and categorized as high or low based on the number of correct or positive responses. Logistic regression analyses were conducted to identify predictors of KAP and linear regression analysis conducted to determine if knowledge and attitude scores predicted practice scores. In all, 361 (85 men and 276 women) people participated in the study. Most participants (87%) scored low on knowledge and practice items (78%). Conversely, 78% scored high on attitude items. By multivariate logistic regression, housewives were 82% less likely than laborers to have high attitude scores; homeowners were 65% less likely than renters to have high attitude scores. Participants from households with 1 to 2 children were 3.4 times more likely to have high attitude scores compared with those from households with no children. Participants from households with at least 5 people were 65% less likely than those from households with fewer than 5 people to have high practice scores. By multivariable linear regression knowledge and attitude scores were significant predictors of practice score. 
The study revealed poor knowledge of VBDs and poor prevention practices among participants. It identified specific groups that can be targeted with vector control and personal protection interventions to decrease transmission of the infections. Copyright © 2015 The Authors. Published by Elsevier Inc. All rights reserved.

  8. Combining logistic regression with classification and regression tree to predict quality of care in a home health nursing data set.

    PubMed

    Guo, Huey-Ming; Shyu, Yea-Ing Lotus; Chang, Her-Kun

    2006-01-01

    In this article, the authors provide an overview of a research method to predict quality of care in a home health nursing data set. The results of this study can be visualized through classification and regression tree (CART) graphs. The analysis was more effective, and the results were more informative, when the home health nursing data set was analyzed with a combination of logistic regression and CART, because the two techniques complement each other. The results were also more informative in that more patient characteristics were found to be related to quality of care in home care. The results help home health nurses predict patient outcomes in case management. Improved prediction is needed for interventions to be appropriately targeted for improved patient outcome and quality of care.

  9. Two- and three-dimensional transvaginal ultrasound with power Doppler angiography and gel infusion sonography for diagnosis of endometrial malignancy.

    PubMed

    Dueholm, M; Christensen, J W; Rydbjerg, S; Hansen, E S; Ørtoft, G

    2015-06-01

    To evaluate the diagnostic efficiency of two-dimensional (2D) and three-dimensional (3D) transvaginal ultrasonography, power Doppler angiography (PDA) and gel infusion sonography (GIS) at offline analysis for recognition of malignant endometrium compared with real-time evaluation during scanning, and to determine optimal image parameters at 3D analysis. One hundred and sixty-nine consecutive women with postmenopausal bleeding and endometrial thickness ≥ 5 mm underwent systematic evaluation of endometrial pattern on 2D imaging, and 2D videoclips and 3D volumes were later analyzed offline. Histopathological findings at hysteroscopy or hysterectomy were used as the reference standard. The efficiency of the different techniques for diagnosis of malignancy was calculated and compared. 3D image parameters, endometrial volume and 3D vascular indices were assessed. Optimal 3D image parameters were transformed by logistic regression into a risk of endometrial cancer (REC) score, including scores for body mass index, endometrial thickness and endometrial morphology at gray-scale and PDA and GIS. Offline 2D and 3D analysis were equivalent, but had lower diagnostic performance compared with real-time evaluation during scanning. Their diagnostic performance was not markedly improved by the addition of PDA or GIS, but their efficiency was comparable with that of real-time 2D-GIS in offline examinations of good image quality. On logistic regression, the 3D parameters from the REC-score system had the highest diagnostic efficiency. The area under the curve of the REC-score system at 3D-GIS (0.89) was not improved by inclusion of vascular indices or endometrial volume calculations. Real-time evaluation during scanning is most efficient, but offline 2D and 3D analysis is useful for prediction of endometrial cancer when good image quality can be obtained. 
The diagnostic efficiency at 3D analysis may be improved by use of REC-scoring systems, without the need for calculation of vascular indices or endometrial volume. The optimal imaging modality appears to be real-time 2D-GIS. Copyright © 2014 ISUOG. Published by John Wiley & Sons Ltd.

  10. The American College of Surgeons National Surgical Quality Improvement Program Surgical Risk Calculator Does Not Accurately Predict Risk of 30-Day Complications Among Patients Undergoing Microvascular Head and Neck Reconstruction.

    PubMed

    Arce, Kevin; Moore, Eric J; Lohse, Christine M; Reiland, Matthew D; Yetzer, Jacob G; Ettinger, Kyle S

    2016-09-01

    The American College of Surgeons (ACS) National Surgical Quality Improvement Program (NSQIP) Surgical Risk Calculator (SRC) is a novel universal risk calculator designed to aid in risk stratification of patients undergoing various types of major surgery. The purpose of this study was to assess the validity of the ACS NSQIP SRC in predicting postoperative complications in patients undergoing microvascular head and neck reconstruction. A retrospective cohort study of patients undergoing head and neck microvascular reconstruction with fibular free flaps at a single institution was completed. The NSQIP SRC was used to compute complication risk estimates and length of stay (LOS) estimates for all patients under study. Associations between complication risk estimates generated by the SRC and actual rates of observed complications were evaluated using logistic regression models. Logistic regression models also were used to evaluate the SRC estimates for LOS duration compared with the actual observed LOS after surgery. Of 153 patients under study, 46 (30%) developed a postoperative complication corresponding to those defined by NSQIP SRC. Thirty-eight patients (25%) developed a postoperative complication categorized as severe in the parameters of the NSQIP SRC. None of the SRC complication estimates showed a statistically relevant association with the corresponding observed rates of complications. The mean LOS predicted by the SRC was 8.0 days (median, 7.5 days; interquartile range [IQR], 6.5 to 9; range, 5.0 to 18.5 days). The mean observed LOS for the study group was 9.6 days (median, 7.0 days; IQR, 6 to 9; range, 5 to 67 days). Lin's (Biometrics 45:255, 1989) concordance correlation coefficient to measure agreement between observed and predicted LOS was 0.10, indicating only slight agreement between the 2 values. The ACS NSQIP SRC is not a useful risk-stratifying metric for patients undergoing major head and neck reconstruction with microvascular fibular free flaps. 
The SRC also does not accurately predict hospital LOS for this same patient cohort. Copyright © 2016 American Association of Oral and Maxillofacial Surgeons. Published by Elsevier Inc. All rights reserved.
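
    Lin's concordance correlation coefficient, cited in this abstract for agreement between predicted and observed LOS, penalises both poor correlation and systematic offset: CCC = 2 * s_xy / (s_x^2 + s_y^2 + (mean_x - mean_y)^2). A minimal sketch follows; the function name and the LOS values are hypothetical and are not the study's data.

```python
def concordance_ccc(x, y):
    """Lin's concordance correlation coefficient for paired values.

    Uses 1/n (population) variances and covariance, as in Lin (1989).
    """
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    var_x = sum((a - mean_x) ** 2 for a in x) / n
    var_y = sum((b - mean_y) ** 2 for b in y) / n
    cov_xy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y)) / n
    return 2 * cov_xy / (var_x + var_y + (mean_x - mean_y) ** 2)


# Hypothetical predicted and observed lengths of stay in days.
predicted = [7.0, 8.0, 6.5, 9.0]
observed = [6.0, 12.0, 7.0, 20.0]
print(round(concordance_ccc(predicted, observed), 2))
```

    Unlike the Pearson correlation, the coefficient equals 1 only when the paired values are identical, so a constant bias in predicted LOS lowers it even when the ranking is perfect.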

  11. A general framework for the use of logistic regression models in meta-analysis.

    PubMed

    Simmonds, Mark C; Higgins, Julian Pt

    2016-12-01

    Where individual participant data are available for every randomised trial in a meta-analysis of dichotomous event outcomes, "one-stage" random-effects logistic regression models have been proposed as a way to analyse these data. Such models can also be used even when individual participant data are not available and we have only summary contingency table data. One benefit of this one-stage regression model over conventional meta-analysis methods is that it maximises the correct binomial likelihood for the data and so does not require the common assumption that effect estimates are normally distributed. A second benefit of using this model is that it may be applied, with only minor modification, in a range of meta-analytic scenarios, including meta-regression, network meta-analyses and meta-analyses of diagnostic test accuracy. This single model can potentially replace the variety of often complex methods used in these areas. This paper considers, with a range of meta-analysis examples, how random-effects logistic regression models may be used in a number of different types of meta-analyses. This one-stage approach is compared with widely used meta-analysis methods including Bayesian network meta-analysis and the bivariate and hierarchical summary receiver operating characteristic (ROC) models for meta-analyses of diagnostic test accuracy. © The Author(s) 2014.

  12. Asthma exacerbation and proximity of residence to major roads: a population-based matched case-control study among the pediatric Medicaid population in Detroit, Michigan

    PubMed Central

    2011-01-01

    Background The relationship between asthma and traffic-related pollutants has received considerable attention. The use of individual-level exposure measures, such as residence location or proximity to emission sources, may avoid ecological biases. Method This study focused on the pediatric Medicaid population in Detroit, MI, a high-risk population for asthma-related events. A population-based matched case-control analysis was used to investigate associations between acute asthma outcomes and proximity of residence to major roads, including freeways. Asthma cases were identified as all children who made at least one asthma claim, including inpatient and emergency department visits, during the three-year study period, 2004-06. Individually matched controls were randomly selected from the rest of the Medicaid population on the basis of non-respiratory related illness. We used conditional logistic regression with distance as both categorical and continuous variables, and examined non-linear relationships with distance using polynomial splines. The conditional logistic regression models were then extended by considering multiple asthma states (based on the frequency of acute asthma outcomes) using polychotomous conditional logistic regression. Results Asthma events were associated with proximity to primary roads with an odds ratio of 0.97 (95% CI: 0.94, 0.99) for a 1 km increase in distance using conditional logistic regression, implying that asthma events are less likely as the distance between the residence and a primary road increases. Similar relationships and effect sizes were found using polychotomous conditional logistic regression. Another plausible exposure metric, a reduced form response surface model that represents atmospheric dispersion of pollutants from roads, was not associated under that exposure model. 
Conclusions There is moderately strong evidence of elevated risk of asthma close to major roads based on the results obtained in this population-based matched case-control study. PMID:21513554

  13. Neural network modeling for surgical decisions on traumatic brain injury patients.

    PubMed

    Li, Y C; Liu, L; Chiu, W T; Jian, W S

    2000-01-01

    Computerized medical decision support systems have been a major research topic in recent years. Intelligent computer programs were implemented to aid physicians and other medical professionals in making difficult medical decisions. This report compares three different mathematical models for building a traumatic brain injury (TBI) medical decision support system (MDSS). These models were developed based on a large TBI patient database. This MDSS accepts a set of patient data, such as the type of skull fracture, Glasgow Coma Scale (GCS) score and episodes of convulsion, and returns the chance that a neurosurgeon would recommend open-skull surgery for the patient. The three mathematical models described in this report are a logistic regression model, a multi-layer perceptron (MLP) neural network and a radial-basis-function (RBF) neural network. Of the 12,640 patients selected from the database, a randomly drawn 9480 cases were used as the training group to develop/train our models. The other 3160 cases formed the validation group, which we used to evaluate the performance of these models. We used sensitivity, specificity, area under the receiver-operating characteristic (ROC) curve and calibration curves as indicators of how accurate these models are in predicting a neurosurgeon's decision on open-skull surgery. The results showed that, assuming equal importance of sensitivity and specificity, the logistic regression model had a (sensitivity, specificity) of (73%, 68%), compared to (80%, 80%) from the RBF model and (88%, 80%) from the MLP model. The resultant areas under the ROC curve for logistic regression, RBF and MLP neural networks are 0.761, 0.880 and 0.897, respectively (P < 0.05). Among these models, the logistic regression has noticeably poorer calibration. This study demonstrated the feasibility of applying neural networks as the mechanism for TBI decision support systems based on clinical databases.
The results also suggest that neural networks may be a better solution for complex, non-linear medical decision support systems than conventional statistical techniques such as logistic regression.
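
A minimal sketch of how sensitivity and specificity figures like those quoted above are computed from binary predictions. The function name and the toy labels are illustrative assumptions; they are not the authors' code or data.

```python
def sensitivity_specificity(y_true, y_pred):
    """Return (sensitivity, specificity) for binary labels coded 0/1."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # true positives
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # false negatives
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # true negatives
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false positives
    sensitivity = tp / (tp + fn)  # share of actual positives detected
    specificity = tn / (tn + fp)  # share of actual negatives detected
    return sensitivity, specificity


# Hypothetical labels (1 = surgery recommended), not data from the study.
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 0, 0, 0, 0, 1, 1]
sens, spec = sensitivity_specificity(y_true, y_pred)
print(f"sensitivity={sens:.2f}, specificity={spec:.2f}")
```

Both quantities depend on the probability threshold used to turn model output into a yes/no prediction, which is why the abstract also reports the area under the ROC curve.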

  14. Cluster Analysis of Campylobacter jejuni Genotypes Isolated from Small and Medium-Sized Mammalian Wildlife and Bovine Livestock from Ontario Farms.

    PubMed

    Viswanathan, M; Pearl, D L; Taboada, E N; Parmley, E J; Mutschall, S K; Jardine, C M

    2017-05-01

    Using data collected from a cross-sectional study of 25 farms (eight beef, eight swine and nine dairy) in 2010, we assessed clustering of molecular subtypes of C. jejuni based on Campylobacter-specific 40-gene comparative genomic fingerprinting assay (CGF40) subtypes, using unweighted pair-group method with arithmetic mean (UPGMA) analysis and multiple correspondence analysis. Exact logistic regression was used to determine which genes differentiate wildlife and livestock subtypes in our study population. A total of 33 bovine livestock (17 beef and 16 dairy) and 26 wildlife (20 raccoon (Procyon lotor), five skunk (Mephitis mephitis) and one mouse (Peromyscus spp.)) C. jejuni isolates were subtyped using CGF40. Dendrogram analysis, based on UPGMA, showed distinct branches separating bovine livestock and mammalian wildlife isolates. Furthermore, two-dimensional multiple correspondence analysis was highly concordant with dendrogram analysis, showing clear differentiation between livestock and wildlife CGF40 subtypes. Based on multilevel logistic regression models with a random intercept for farm of origin, we found that isolates in general, and raccoons more specifically, were significantly more likely to be part of the wildlife branch. Exact logistic regression conducted gene by gene revealed 15 genes that were predictive of whether an isolate was of wildlife or bovine livestock origin. Both multiple correspondence analysis and exact logistic regression revealed that in most cases (13 of 15), the presence of a particular gene was associated with an isolate being of livestock rather than wildlife origin. In conclusion, the evidence gained from dendrogram analysis, multiple correspondence analysis and exact logistic regression indicates that mammalian wildlife carry CGF40 subtypes of C. jejuni distinct from those carried by bovine livestock. Future studies focused on source attribution of C. jejuni in human infections will help determine whether wildlife transmit Campylobacter jejuni directly to humans. © 2016 Blackwell Verlag GmbH.

  15. Comparative analysis on the probability of being a good payer

    NASA Astrophysics Data System (ADS)

    Mihova, V.; Pavlov, V.

    2017-10-01

    Credit risk assessment is crucial for the bank industry. The current practice uses various approaches for the calculation of credit risk. The core of these approaches is the use of multiple regression models, applied in order to assess the risk associated with the approval of people applying for certain products (loans, credit cards, etc.). Based on data from the past, these models try to predict what will happen in the future. Different data requires different type of models. This work studies the causal link between the conduct of an applicant upon payment of the loan and the data that he completed at the time of application. A database of 100 borrowers from a commercial bank is used for the purposes of the study. The available data includes information from the time of application and credit history while paying off the loan. Customers are divided into two groups, based on the credit history: Good and Bad payers. Linear and logistic regression are applied in parallel to the data in order to estimate the probability of being good for new borrowers. A variable, which contains value of 1 for Good borrowers and value of 0 for Bad candidates, is modeled as a dependent variable. To decide which of the variables listed in the database should be used in the modelling process (as independent variables), a correlation analysis is made. Due to the results of it, several combinations of independent variables are tested as initial models - both with linear and logistic regression. The best linear and logistic models are obtained after initial transformation of the data and following a set of standard and robust statistical criteria. A comparative analysis between the two final models is made and scorecards are obtained from both models to assess new customers at the time of application. 
A cut-off score, below which applications are rejected and above which they are accepted, has been suggested for both models, applying the strategy of keeping the same Accept Rate as in the current data.
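
One way to read the "same Accept Rate" strategy is to pick the cut-off as the score of the last applicant who would still be accepted when the top share of scores is approved. The sketch below assumes that reading; the function name and the scorecard points are hypothetical, not taken from the paper.

```python
def cutoff_for_accept_rate(scores, accept_rate):
    """Lowest score still accepted when the top `accept_rate` share
    of applicants, ranked by score, is approved."""
    if not 0 < accept_rate <= 1:
        raise ValueError("accept_rate must be in (0, 1]")
    ranked = sorted(scores, reverse=True)
    # Number of applicants to accept, at least one.
    n_accept = max(1, round(accept_rate * len(ranked)))
    return ranked[n_accept - 1]


# Hypothetical scorecard points for ten applicants.
scores = [410, 520, 380, 610, 450, 570, 490, 330, 540, 600]
cutoff = cutoff_for_accept_rate(scores, accept_rate=0.7)
accepted = [s for s in scores if s >= cutoff]
print(cutoff, len(accepted))
```

With tied scores at the cut-off, accepting every applicant at or above it can push the realized accept rate slightly above the target.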

  16. [A case-control study: association between oral hygiene and oral cancer in non-smoking and non-drinking women].

    PubMed

    Wu, J F; Lin, L S; Chen, F; Liu, F Q; Huang, J F; Yan, L J; Liu, F P; Qiu, Y; Zheng, X Y; Cai, L; He, B C

    2017-08-06

    Objective: To evaluate the influence of oral hygiene on risk of oral cancer in non-smoking and non-drinking women. Methods: From September 2010 to February 2016, 242 non-smoking and non-drinking female patients with pathologically confirmed oral cancer were recruited in a hospital in Fuzhou, and another 856 non-smoking and non-drinking healthy women from the health examination center in the same hospital were selected as the control group. Five oral hygiene related variables, including the frequency of teeth brushing, number of teeth lost, poor prosthesis, regular dental visits and recurrent dental ulceration, were used to develop an oral hygiene index model. Unconditional logistic regression was used to calculate odds ratios (OR) and 95% confidence intervals (95% CI). The area under the receiver operating characteristic curve (AUROC) was used to evaluate the predictability of the oral hygiene index model. A multivariate logistic regression model was used to analyze the association between oral hygiene index and the incidence of oral cancer. Results: Teeth brushing less than twice daily, teeth lost ≥5, poor prosthesis, no regular dental visits and recurrent dental ulceration were risk factors for the incidence of oral cancer in non-smoking and non-drinking women; the corresponding OR (95% CI) were 1.50 (1.08-2.09), 1.81 (1.15-2.85), 1.51 (1.03-2.23), 1.73 (1.15-2.59) and 7.30 (4.00-13.30), respectively. The AUROC of the oral hygiene index model was 0.7059, indicating a high predictability. Multivariate logistic regression showed that the oral hygiene index was associated with risk of oral cancer. The higher the score, the higher the observed risk. The corresponding OR (95% CI) of oral hygiene index scores (score 1, score 2, score 3, score 4-5) were 2.51 (0.84-7.53), 4.68 (1.59-13.71), 6.47 (2.18-19.25) and 15.29 (5.08-45.99), respectively.
Conclusion: Oral hygiene could influence the incidence of oral cancer in non-smoking and non-drinking women, and oral hygiene index has a certain significance in assessing the combined effects of oral hygiene.

  17. Factors influencing hospital high length of stay outliers

    PubMed Central

    2012-01-01

    Background The study of length of stay (LOS) outliers is important for the management and financing of hospitals. Our aim was to study variables associated with high LOS outliers and their evolution over time. Methods We used hospital administrative data from inpatient episodes in public acute care hospitals in the Portuguese National Health Service (NHS), with discharges between years 2000 and 2009, together with some hospital characteristics. The dependent variable, LOS outliers, was calculated for each diagnosis related group (DRG) using a trim point defined for each year by the geometric mean plus two standard deviations. Hospitals were classified on the basis of administrative, economic and teaching characteristics. We also studied the influence of comorbidities and readmissions. Logistic regression models, including a multivariable logistic regression, were used in the analysis. All the logistic regressions were fitted using generalized estimating equations (GEE). Results In the nearly nine million inpatient episodes analysed we found a proportion of 3.9% high LOS outliers, accounting for 19.2% of total inpatient days. The number of hospital patient discharges increased between years 2000 and 2005 and slightly decreased after that. The proportion of outliers ranged between the lowest value of 3.6% (in years 2001 and 2002) and the highest value of 4.3% in 2009. Teaching hospitals with over 1,000 beds have significantly more outliers than other hospitals, even after adjustment for readmissions and several patient characteristics. Conclusions In recent years both average LOS and high LOS outliers have been increasing in Portuguese NHS hospitals. As high LOS outliers represent an important proportion of the total inpatient days, this should be seen as an important alert for the management of hospitals and for national health policies. As expected, age, type of admission, and hospital type were significantly associated with high LOS outliers.
The proportion of high outliers does not seem to be related to their financial coverage; they should be studied in order to highlight areas for further investigation. The increasing complexity of both hospitals and patients may be the single most important determinant of high LOS outliers and must therefore be taken into account by health managers when considering hospital costs. PMID:22906386
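
The trim-point rule described in the Methods can be sketched as follows. The abstract does not say whether the standard deviation is taken on the raw or the log scale; this sketch assumes the arithmetic sample standard deviation of the raw LOS values. The function name and the LOS values are hypothetical.

```python
import statistics


def high_los_trim_point(los_days):
    """Trim point for one DRG and year: geometric mean + 2 standard deviations.

    Assumption: the standard deviation is the arithmetic sample SD of the
    raw LOS values (the abstract does not specify the scale).
    """
    gm = statistics.geometric_mean(los_days)  # requires Python 3.8+
    sd = statistics.stdev(los_days)
    return gm + 2 * sd


# Hypothetical LOS values (days) for a single DRG in a single year.
los = [2, 3, 3, 4, 5, 6, 8, 40]
trim = high_los_trim_point(los)
outliers = [d for d in los if d > trim]  # episodes flagged as high LOS outliers
print(round(trim, 1), outliers)
```

The resulting 0/1 outlier flag per episode is the kind of binary dependent variable that the logistic regression models in the abstract take as input.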

  18. Early Change in Stroke Size Performs Best in Predicting Response to Therapy.

    PubMed

    Simpkins, Alexis Nétis; Dias, Christian; Norato, Gina; Kim, Eunhee; Leigh, Richard

    2017-01-01

    Reliable imaging biomarkers of response to therapy in acute stroke are needed. The final infarct volume and percent of early reperfusion have been used for this purpose. Early fluctuation in stroke size is a recognized phenomenon, but its utility as a biomarker for response to therapy has not been established. This study examined the clinical relevance of early change in stroke volume and compared it with the final infarct volume and percent of early reperfusion in identifying early neurologic improvement (ENI). Acute stroke patients, enrolled between 2013 and 2014 with serial magnetic resonance imaging (MRI) scans (pretreatment baseline, 2 h post, and 24 h post), who received thrombolysis were included in the analysis. Early change in stroke volume, infarct volume at 24 h on diffusion, and percent of early reperfusion were calculated from the baseline and 2 h MRI scans and were compared. ENI was defined as a ≥4 point decrease in National Institutes of Health Stroke Scale score within 24 h. Logistic regression models and receiver operating characteristic analysis were used to compare the efficacy of the 3 imaging biomarkers. Serial MRIs of 58 acute stroke patients were analyzed. Early change in stroke volume was significantly associated with ENI by logistic regression analysis (OR 0.93, p = 0.048) and remained significant after controlling for stroke size and severity (OR 0.90, p = 0.032). Thus, for every 1 mL increase in stroke volume, there was a 10% decrease in the odds of ENI, while for every 1 mL decrease in stroke volume, there was a 10% increase in the odds of ENI. Neither infarct volume at 24 h nor percent of early reperfusion was significantly associated with ENI by logistic regression. Receiver operating characteristic analysis identified early change in stroke volume as the only biomarker of the 3 that performed significantly differently from chance (p = 0.03).
Early fluctuations in stroke size may represent a more reliable biomarker for response to therapy than the more traditional measures of final infarct volume and percent of early reperfusion. © 2017 S. Karger AG, Basel.
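
The step from the adjusted odds ratio (OR 0.90) to a "10% decrease in the odds" can be made explicit with a short sketch. The helper name is an assumption for illustration; only the OR value comes from the abstract.

```python
def percent_change_in_odds(odds_ratio, units=1.0):
    """Percent change in the odds for a change of `units` in the predictor.

    In logistic regression the odds ratio acts multiplicatively, so a
    change of k units multiplies the odds by odds_ratio ** k.
    """
    return (odds_ratio ** units - 1.0) * 100.0


# Adjusted OR per 1 mL change in stroke volume, as reported in the abstract.
per_ml_increase = percent_change_in_odds(0.90, units=1)   # roughly -10%
per_ml_decrease = percent_change_in_odds(0.90, units=-1)  # roughly +11%
print(f"{per_ml_increase:.1f}% per 1 mL increase")
print(f"{per_ml_decrease:.1f}% per 1 mL decrease")
```

Because the effect is multiplicative, a 1 mL decrease corresponds to multiplying the odds by 1/0.90, so the symmetric 10% figure in the abstract is best read as an approximation.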

  19. Producing landslide susceptibility maps by utilizing machine learning methods. The case of Finikas catchment basin, North Peloponnese, Greece.

    NASA Astrophysics Data System (ADS)

    Tsangaratos, Paraskevas; Ilia, Ioanna; Loupasakis, Constantinos; Papadakis, Michalis; Karimalis, Antonios

    2017-04-01

    The main objective of the present study was to apply two machine learning methods for the production of a landslide susceptibility map in the Finikas catchment basin, located in North Peloponnese, Greece, and to compare their results. Specifically, Logistic Regression and Random Forest were utilized, based on a database of 40 sites classified into two categories, non-landslide and landslide areas, which were separated into a training dataset (70% of the total data) and a validation dataset (remaining 30%). The identification of the areas was established by analyzing airborne imagery, extensive field investigation and the examination of previous research studies. Six landslide related variables were analyzed, namely: lithology, elevation, slope, aspect, distance to rivers and distance to faults. Within the Finikas catchment basin most of the reported landslides were located along the road network and within the residential complexes, classified as rotational and translational slides, and rockfalls, mainly caused by the physical conditions and the general geotechnical behavior of the geological formations that cover the area. Each landslide susceptibility map was reclassified by applying the Geometric Interval classification technique into five classes, namely: very low susceptibility, low susceptibility, moderate susceptibility, high susceptibility, and very high susceptibility. The comparison and validation of the outcomes of each model were achieved using statistical evaluation measures, the receiver operating characteristic and the area under the success and predictive rate curves. The computation process was carried out using RStudio, an integrated development environment for the R language, and ArcGIS 10.1 for compiling the data and producing the landslide susceptibility maps. From the outcomes of the Logistic Regression analysis it was inferred that the highest b coefficients were those of lithology and slope, 2.8423 and 1.5841, respectively.
From the estimation of the mean decrease in Gini coefficient performed during the application of Random Forest and the mean decrease in accuracy the most important variable is slope followed by lithology, aspect, elevation, distance from river network, and distance from faults, while the most used variables during the training phase were the variable aspect (21.45%), slope (20.53%) and lithology (19.84%). The outcomes of the analysis are consistent with previous studies concerning the area of research, which have indicated the high influence of lithology and slope in the manifestation of landslides. High percentage of landslide occurrence has been observed in Plio-Pleistocene sediments, flysch formations, and Cretaceous limestone. Also the presences of landslides have been associated with the degree of weathering and fragmentation, the orientation of the discontinuities surfaces and the intense morphological relief. The most accurate model was Random Forest which identified correctly 92.00% of the instances during the training phase, followed by the Logistic Regression 89.00%. The same pattern of accuracy was calculated during the validation phase, in which the Random Forest achieved a classification accuracy of 93.00%, while the Logistic Regression model achieved an accuracy of 91.00%. In conclusion, the outcomes of the study could be a useful cartographic product to local authorities and government agencies during the implementation of successful decision-making and land use planning strategies. Keywords: Landslide Susceptibility, Logistic Regression, Random Forest, GIS, Greece.

  20. Development of a statistical model for the determination of the probability of riverbank erosion in a Mediterranean river basin

    NASA Astrophysics Data System (ADS)

    Varouchakis, Emmanouil; Kourgialas, Nektarios; Karatzas, George; Giannakis, Georgios; Lilli, Maria; Nikolaidis, Nikolaos

    2014-05-01

    Riverbank erosion affects the river morphology and the local habitat and results in riparian land loss, damage to property and infrastructures, ultimately weakening flood defences. An important issue concerning riverbank erosion is the identification of the areas vulnerable to erosion, as it allows for predicting changes and assists with stream management and restoration. One way to predict the areas vulnerable to erosion is to determine the erosion probability by identifying the underlying relations between riverbank erosion and the geomorphological and/or hydrological variables that prevent or stimulate erosion. A statistical model for evaluating the probability of erosion based on a series of independent local variables and by using logistic regression is developed in this work. The main variables affecting erosion are vegetation index (stability), the presence or absence of meanders, bank material (classification), stream power, bank height, river bank slope, riverbed slope, cross section width and water velocities (Luppi et al. 2009). In statistics, logistic regression is a type of regression analysis used for predicting the outcome of a categorical dependent variable, e.g. binary response, based on one or more predictor variables (continuous or categorical). The probabilities of the possible outcomes are modelled as a function of independent variables using a logistic function. Logistic regression measures the relationship between a categorical dependent variable and, usually, one or several continuous independent variables by converting the dependent variable to probability scores. Then, a logistic regression is formed, which predicts success or failure of a given binary variable (e.g. 1 = "presence of erosion" and 0 = "no erosion") for any value of the independent variables. The regression coefficients are estimated by using maximum likelihood estimation.
The erosion occurrence probability can be calculated in conjunction with the model deviance regarding the independent variables tested (Atkinson et al. 2003). The developed statistical model is applied to the Koiliaris River Basin on the island of Crete, Greece. The aim is to determine the probability of erosion along the Koiliaris' riverbanks considering a series of independent geomorphological and/or hydrological variables. Data for the river bank slope and for the river cross section width are available at ten locations along the river. The riverbank has indications of erosion at six of the ten locations while four have remained stable. Based on a recent work, measurements for the two independent variables and data regarding bank stability are available at eight different locations along the river. These locations were used as validation points for the proposed statistical model. The results show a very close agreement between the observed erosion indications and the statistical model as the probability of erosion was accurately predicted at seven out of the eight locations. The next step is to apply the model at more locations along the riverbanks. In November 2013, stakes were inserted at selected locations in order to be able to identify the presence or absence of erosion after the winter period. In April 2014 the presence or absence of erosion will be identified and the model results will be compared to the field data. Our intent is to extend the model by increasing the number of independent variables in order to identify the key factors favouring erosion along the Koiliaris River. We aim at developing an easy-to-use statistical tool that will provide a quantified measure of the erosion probability along the riverbanks, which could consequently be used to prevent erosion and flooding events. Atkinson, P. M., German, S. E., Sear, D. A. and Clark, M. J. 2003.
Exploring the relations between riverbank erosion and geomorphological controls using geographically weighted logistic regression. Geographical Analysis, 35 (1), 58-82. Luppi, L., Rinaldi, M., Teruggi, L. B., Darby, S. E. and Nardi, L. 2009. Monitoring and numerical modelling of riverbank erosion processes: A case study along the Cecina River (central Italy). Earth Surface Processes and Landforms, 34 (4), 530-546. Acknowledgements This work is part of an on-going THALES project (CYBERSENSORS - High Frequency Monitoring System for Integrated Water Resources Management of Rivers). The project has been co-financed by the European Union (European Social Fund - ESF) and Greek national funds through the Operational Program "Education and Lifelong Learning" of the National Strategic Reference Framework (NSRF) - Research Funding Program: THALES. Investing in knowledge society through the European Social Fund.
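
The logistic function described in the abstract maps a linear combination of predictors to a probability between 0 and 1. The sketch below shows that mapping for two predictors, bank slope and cross-section width. The function name, the coefficient values and the site measurements are hypothetical and are not estimates from the Koiliaris study.

```python
import math


def erosion_probability(intercept, coefficients, predictors):
    """P(erosion = 1) from a fitted binary logistic regression model."""
    linear = intercept + sum(b * x for b, x in zip(coefficients, predictors))
    return 1.0 / (1.0 + math.exp(-linear))  # logistic function


# Hypothetical coefficients for [bank slope (degrees), cross-section width (m)].
intercept = -1.0
coefficients = [0.08, -0.05]

# Hypothetical measurements at one riverbank location.
p = erosion_probability(intercept, coefficients, [30.0, 12.0])
predicted_erosion = p >= 0.5  # a common, but not mandatory, threshold
print(f"P(erosion) = {p:.3f}, predicted erosion: {predicted_erosion}")
```

In practice the intercept and coefficients would come from maximum likelihood estimation on the observed locations, as the abstract states.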

  1. Randomization at the level of primary care practice: use of pre-intervention data and random effects models.

    PubMed

    Nixon, R M; Duffy, S W; Fender, G R; Day, N E; Prevost, T C

    2001-06-30

    The Anglia menorrhagia education study tests the effectiveness of an education package for the treatment of menorrhagia given to doctors at a primary care level. General practices were randomized to receive or not receive the package. It is hoped that this intervention will reduce the proportion of women suffering from menorrhagia that are referred to hospital. Data are available on the treatment and referral of women in the practices in the education and control groups, both pre- and post-intervention. We define and demonstrate a random effects logistic regression model that includes pre-intervention data for calculating the effectiveness of the intervention. Copyright 2001 John Wiley & Sons, Ltd.

  2. Suicidal ideation and Attempts in North American School-Based Surveys

    PubMed Central

    Saewyc, Elizabeth M.; Skay, Carol L.; Hynds, Patricia; Pettingell, Sandra; Bearinger, Linda H.; Resnick, Michael D.; Reis, Elizabeth

    2008-01-01

    This study explored the prevalence, disparity, and cohort trends in suicidality among bisexual teens vs. heterosexual and gay/lesbian peers in 9 population-based high school surveys in Canada and the U.S. Multivariate logistic regressions were used to calculate age-adjusted odds ratios separately by gender; 95% confidence intervals tested cohort trends where surveys were repeated over multiple years. Results showed remarkable consistency: bisexual youth reported higher odds of recent suicidal ideation and attempts vs. heterosexual peers, with increasing odds in most surveys over the past decade. Results compared to gay and lesbian peers were mixed, with varying gender differences in prevalence and disparity trends in the different regions. PMID:19835039

  3. 2012 Workplace and Gender Relations Survey of Reserve Component Members: Statistical Methodology Report

    DTIC Science & Technology

    2012-09-01

    ...logistic regression model was used to predict the probability of eligibility for the survey (known eligibility vs. unknown eligibility). A second logistic...regression model was used to predict the probability of response among eligible sample members (complete response vs. non-response). CHAID (Chi

  4. Habitat features and predictive habitat modeling for the Colorado chipmunk in southern New Mexico

    USGS Publications Warehouse

    Rivieccio, M.; Thompson, B.C.; Gould, W.R.; Boykin, K.G.

    2003-01-01

    Two subspecies of Colorado chipmunk (state threatened and federal species of concern) occur in southern New Mexico: Tamias quadrivittatus australis in the Organ Mountains and T. q. oscuraensis in the Oscura Mountains. We developed a GIS model of potentially suitable habitat based on vegetation and elevation features, evaluated site classifications of the GIS model, and determined vegetation and terrain features associated with chipmunk occurrence. We compared GIS model classifications with actual vegetation and elevation features measured at 37 sites. At 60 sites we measured 18 habitat variables regarding slope, aspect, tree species, shrub species, and ground cover. We used logistic regression to analyze habitat variables associated with chipmunk presence/absence. All (100%) 37 sample sites (28 predicted suitable, 9 predicted unsuitable) were classified correctly by the GIS model regarding elevation and vegetation. For 28 sites predicted suitable by the GIS model, 18 sites (64%) appeared visually suitable based on habitat variables selected from logistic regression analyses, of which 10 sites (36%) were specifically predicted as suitable habitat via logistic regression. We detected chipmunks at 70% of sites deemed suitable via the logistic regression models. Shrub cover, tree density, plant proximity, presence of logs, and presence of rock outcrop were retained in the logistic model for the Oscura Mountains; litter, shrub cover, and grass cover were retained in the logistic model for the Organ Mountains. Evaluation of predictive models illustrates the need for multi-stage analyses to best judge performance. Microhabitat analyses indicate prospective needs for different management strategies between the subspecies. Sensitivities of each population of the Colorado chipmunk to natural and prescribed fire suggest that partial burnings of areas inhabited by Colorado chipmunks in southern New Mexico may be beneficial. 
These partial burnings may later help avoid a fire that could substantially reduce habitat of chipmunks over a mountain range.

  5. The logistic model for predicting the non-gonoactive Aedes aegypti females.

    PubMed

    Reyes-Villanueva, Filiberto; Rodríguez-Pérez, Mario A

    2004-01-01

    To estimate, using logistic regression, the likelihood of occurrence of a non-gonoactive Aedes aegypti female, previously fed human blood, with relation to body size and collection method. This study was conducted in Monterrey, Mexico, between 1994 and 1996. Ten samplings of 60 mosquitoes of Ae. aegypti females were carried out in three dengue endemic areas: six of biting females, two of emerging mosquitoes, and two of indoor resting females. Gravid females, as well as those with blood in the gut were removed. Mosquitoes were taken to the laboratory and engorged on human blood. After 48 hours, ovaries were dissected to register whether they were gonoactive or non-gonoactive. Wing-length in mm was an indicator for body size. The logistic regression model was used to assess the likelihood of non-gonoactivity, as a binary variable, in relation to wing-length and collection method. Of the 600 females, 164 (27%) remained non-gonoactive, with a wing-length range of 1.9-3.2 mm, almost equal to that of all females (1.8-3.3 mm). The logistic regression model showed a significant likelihood of a female remaining non-gonoactive (Y=1). The collection method did not influence the binary response, but there was an inverse relationship between non-gonoactivity and wing-length. Dengue vector populations from Monterrey, Mexico display a wide-range body size. Logistic regression was a useful tool to estimate the likelihood for an engorged female to remain non-gonoactive. The necessity for a second blood meal is present in any female, but small mosquitoes are more likely to bite again within a 2-day interval, in order to attain egg maturation. The English version of this paper is available too at: http://www.insp.mx/salud/index.html.

  6. The Application of the Cumulative Logistic Regression Model to Automated Essay Scoring

    ERIC Educational Resources Information Center

    Haberman, Shelby J.; Sinharay, Sandip

    2010-01-01

    Most automated essay scoring programs use a linear regression model to predict an essay score from several essay features. This article applied a cumulative logit model instead of the linear regression model to automated essay scoring. Comparison of the performances of the linear regression model and the cumulative logit model was performed on a…
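
For orientation, a cumulative logit model for an ordered score is often written as P(Y ≤ j | x) = logistic(θ_j − η), where η is a linear combination of the features and θ_j are increasing thresholds. The sketch below assumes that common proportional-odds form, which may differ in detail from the article's parameterization; the thresholds and the value of η are hypothetical.

```python
import math


def cumulative_logit_probs(thresholds, eta):
    """Category probabilities P(Y = k), k = 0..len(thresholds).

    Assumes P(Y <= j) = logistic(thresholds[j] - eta) with increasing
    thresholds, one common form of the cumulative logit model.
    """
    cdf = [1.0 / (1.0 + math.exp(-(t - eta))) for t in thresholds] + [1.0]
    probs, previous = [], 0.0
    for c in cdf:
        probs.append(c - previous)  # P(Y = k) = P(Y <= k) - P(Y <= k - 1)
        previous = c
    return probs


# Hypothetical thresholds for a four-point essay score and one essay's eta.
probs = cumulative_logit_probs([-1.0, 0.5, 2.0], eta=0.3)
print([round(p, 3) for p in probs])
```

Unlike a linear regression prediction, the output is a full probability distribution over the discrete score levels.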

  7. Obesity Increases Prevalence of Colonic Adenomas at Screening Colonoscopy: A Canadian Community-Based Study

    PubMed Central

    Chen, Grant I.; Devlin, Tim; Gibbs, Alison; Murray, Iain C.; Tran, Stanley; Weigensberg, Corey

    2017-01-01

    Background and Aims Obesity is a risk factor for colorectal neoplasia. We examined the influence of obesity and metabolic syndrome (MetS) on prevalence of neoplasia at screening colonoscopy. Methods We evaluated 2020 subjects undergoing first screening colonoscopy. Body mass index (BMI) was calculated at enrolment. Hyperlipidemia (HL), hypertension (HT), and diabetes mellitus (DM) were identified. Details of colonoscopy, polypectomy, and histology were recorded. Odds for adenomas (A) and advanced adenomas (ADV) in overweight (BMI 25.1–30) and obese (BMI > 30) subjects were assessed by multinomial regression, adjusted for covariates. Analyses included relationships between HL, HT, DM, age, tobacco usage, and neoplasia. Discriminatory power of HT, HL, DM, and BMI for neoplasia was assessed by binary logistic regression. Odds were calculated for neoplasia in each colonic segment related to BMI. Results A and ADV were commoner in overweight and obese males, obese females, older subjects, and smokers. HL, HT, and DM were associated with increased odds for neoplasia, significantly for A with hypertension. BMI alone predicted neoplasia as well as HT, HL, DM, or combinations thereof. All segments of the colon were affected. Multiple polyps were particularly prevalent in the obese. Conclusions Obesity and MetS are risk factors for colonic neoplasia in a Canadian population. PMID:28781966

  8. Climate change, weather and road deaths.

    PubMed

    Robertson, Leon

    2018-06-01

    In 2015, a 7% increase in road deaths per population in the USA reversed the 35-year downward trend. Here I test the hypothesis that weather influenced the change in trend. I used linear regression to estimate the effect of temperature and precipitation on miles driven per capita in urbanized areas of the USA during 2010. I matched date and county of death with temperature on that date and number of people exposed to that temperature to calculate the risk per persons exposed to specific temperatures. I employed logistic regression analysis of temperature, precipitation and other risk factors prevalent in 2014 to project expected deaths in 2015 among the 100 most populous counties in the USA. Comparison of actual and projected deaths provided an estimate of deaths expected without the temperature increase. © Article author(s) (or their employer(s) unless otherwise stated in the text of the article) 2018. All rights reserved. No commercial use is permitted unless otherwise expressly granted.

  9. Widen NomoGram for multinomial logistic regression: an application to staging liver fibrosis in chronic hepatitis C patients.

    PubMed

    Ardoino, Ilaria; Lanzoni, Monica; Marano, Giuseppe; Boracchi, Patrizia; Sagrini, Elisabetta; Gianstefani, Alice; Piscaglia, Fabio; Biganzoli, Elia M

    2017-04-01

    The interpretation of regression models results can often benefit from the generation of nomograms, 'user friendly' graphical devices especially useful for assisting the decision-making processes. However, in the case of multinomial regression models, whenever categorical responses with more than two classes are involved, nomograms cannot be drawn in the conventional way. Such a difficulty in managing and interpreting the outcome could often result in a limitation of the use of multinomial regression in decision-making support. In the present paper, we illustrate the derivation of a non-conventional nomogram for multinomial regression models, intended to overcome this issue. Although it may appear less straightforward at first sight, the proposed methodology allows an easy interpretation of the results of multinomial regression models and makes them more accessible for clinicians and general practitioners too. Development of prediction model based on multinomial logistic regression and of the pertinent graphical tool is illustrated by means of an example involving the prediction of the extent of liver fibrosis in hepatitis C patients by routinely available markers.

  10. Regularization Paths for Conditional Logistic Regression: The clogitL1 Package.

    PubMed

    Reid, Stephen; Tibshirani, Rob

    2014-07-01

    We apply the cyclic coordinate descent algorithm of Friedman, Hastie, and Tibshirani (2010) to the fitting of a conditional logistic regression model with lasso (ℓ1) and elastic net penalties. The sequential strong rules of Tibshirani, Bien, Hastie, Friedman, Taylor, Simon, and Tibshirani (2012) are also used in the algorithm and it is shown that these offer a considerable speed up over the standard coordinate descent algorithm with warm starts. Once implemented, the algorithm is used in simulation studies to compare the variable selection and prediction performance of the conditional logistic regression model against that of its unconditional (standard) counterpart. We find that the conditional model performs admirably on datasets drawn from a suitable conditional distribution, outperforming its unconditional counterpart at variable selection. The conditional model is also fit to a small real world dataset, demonstrating how we obtain regularization paths for the parameters of the model and how we apply cross validation for this method where natural unconditional prediction rules are hard to come by.

  11. Computational tools for exact conditional logistic regression.

    PubMed

    Corcoran, C; Mehta, C; Patel, N; Senchaudhuri, P

    Logistic regression analyses are often challenged by the inability of unconditional likelihood-based approximations to yield consistent, valid estimates and p-values for model parameters. This can be due to sparseness or separability in the data. Conditional logistic regression, though useful in such situations, can also be computationally unfeasible when the sample size or number of explanatory covariates is large. We review recent developments that allow efficient approximate conditional inference, including Monte Carlo sampling and saddlepoint approximations. We demonstrate through real examples that these methods enable the analysis of significantly larger and more complex data sets. We find in this investigation that for these moderately large data sets Monte Carlo seems a better alternative, as it provides unbiased estimates of the exact results and can be executed in less CPU time than can the single saddlepoint approximation. Moreover, the double saddlepoint approximation, while computationally the easiest to obtain, offers little practical advantage. It produces unreliable results and cannot be computed when a maximum likelihood solution does not exist. Copyright 2001 John Wiley & Sons, Ltd.

  12. Regularization Paths for Conditional Logistic Regression: The clogitL1 Package

    PubMed Central

    Reid, Stephen; Tibshirani, Rob

    2014-01-01

    We apply the cyclic coordinate descent algorithm of Friedman, Hastie, and Tibshirani (2010) to the fitting of a conditional logistic regression model with lasso (ℓ1) and elastic net penalties. The sequential strong rules of Tibshirani, Bien, Hastie, Friedman, Taylor, Simon, and Tibshirani (2012) are also used in the algorithm and it is shown that these offer a considerable speed up over the standard coordinate descent algorithm with warm starts. Once implemented, the algorithm is used in simulation studies to compare the variable selection and prediction performance of the conditional logistic regression model against that of its unconditional (standard) counterpart. We find that the conditional model performs admirably on datasets drawn from a suitable conditional distribution, outperforming its unconditional counterpart at variable selection. The conditional model is also fit to a small real world dataset, demonstrating how we obtain regularization paths for the parameters of the model and how we apply cross validation for this method where natural unconditional prediction rules are hard to come by. PMID:26257587
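
    The coordinate descent step with lasso and elastic net penalties rests on a soft-thresholding operator. The sketch below is a generic illustration and not the clogitL1 implementation: it assumes a standardized predictor and a quadratic approximation to the loss with unit weights, and the names `soft_threshold`, `elastic_net_update`, `z`, `lam`, and `alpha` are hypothetical.

```python
def soft_threshold(z: float, gamma: float) -> float:
    """Soft-thresholding operator: sign(z) * max(|z| - gamma, 0)."""
    if z > gamma:
        return z - gamma
    if z < -gamma:
        return z + gamma
    return 0.0


def elastic_net_update(z: float, lam: float, alpha: float) -> float:
    """One coordinate update under the assumptions stated above.

    z     : partial-residual correlation for the coordinate being updated
    lam   : overall penalty strength (lambda >= 0)
    alpha : mixing weight; alpha = 1 is the pure lasso, alpha = 0 is ridge
    """
    # The lasso part thresholds the coefficient; the ridge part shrinks it.
    return soft_threshold(z, lam * alpha) / (1.0 + lam * (1.0 - alpha))


if __name__ == "__main__":
    for z in (3.0, 0.4, -2.0):
        print(z, elastic_net_update(z, lam=1.0, alpha=0.5))
```

    Coefficients whose `z` falls inside the threshold are set exactly to zero, which is how the lasso part performs variable selection along a regularization path.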

  13. Ordinal logistic regression analysis on the nutritional status of children in KarangKitri village

    NASA Astrophysics Data System (ADS)

    Ohyver, Margaretha; Yongharto, Kimmy Octavian

    2015-09-01

    Ordinal logistic regression is a statistical technique that can be used to describe the relationship between an ordinal response variable and one or more independent variables. This method has been used in various fields, including health. In this research, ordinal logistic regression is used to describe the relationship between the nutritional status of children and age, gender, height, and family status. Nutritional status is divided into four categories: over nutrition, well nutrition, less nutrition, and malnutrition. The purpose of this research is to describe the characteristics of children in KarangKitri village and to determine the factors that influence their nutritional status. The research yields three findings. First, some children are still not categorized as having well nutritional status. Second, some children from families at a sufficient economic level fall into the not-normal status categories. Third, the factors that affect the nutritional level of children are age, family status, and height.
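
    As a rough illustration of how an ordinal logistic model assigns probabilities to ordered categories, the sketch below uses the common proportional-odds (cumulative logit) form, P(Y ≤ k) = sigmoid(θ_k − η). The abstract does not state which parameterization the authors used, so the form, the threshold values, and the function names here are assumptions chosen only for illustration.

```python
import math


def sigmoid(x: float) -> float:
    """Logistic function 1 / (1 + exp(-x))."""
    return 1.0 / (1.0 + math.exp(-x))


def category_probabilities(eta: float, thresholds: list[float]) -> list[float]:
    """Category probabilities under a proportional-odds model.

    eta        : linear predictor (sum of coefficient * covariate)
    thresholds : increasing cut points theta_1 < ... < theta_{K-1}
    Returns K probabilities, one per ordered category.
    """
    # Cumulative probabilities P(Y <= k); the last category closes at 1.
    cumulative = [sigmoid(theta - eta) for theta in thresholds] + [1.0]
    probabilities = []
    previous = 0.0
    for value in cumulative:
        probabilities.append(value - previous)
        previous = value
    return probabilities


if __name__ == "__main__":
    # Three hypothetical cut points give four ordered categories.
    print(category_probabilities(eta=0.0, thresholds=[-1.0, 0.0, 1.0]))
```

    Raising `eta` shifts probability mass toward the higher categories while the cut points stay fixed, which is the sense in which a covariate "affects" the ordinal outcome in this parameterization.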

  14. Analysis of an Environmental Exposure Health Questionnaire in a Metropolitan Minority Population Utilizing Logistic Regression and Support Vector Machines

    PubMed Central

    Chen, Chau-Kuang; Bruce, Michelle; Tyler, Lauren; Brown, Claudine; Garrett, Angelica; Goggins, Susan; Lewis-Polite, Brandy; Weriwoh, Mirabel L; Juarez, Paul D.; Hood, Darryl B.; Skelton, Tyler

    2014-01-01

    The goal of this study was to analyze a 54-item instrument for assessment of perception of exposure to environmental contaminants within the context of the built environment, or exposome. This exposome was defined in five domains to include 1) home and hobby, 2) school, 3) community, 4) occupation, and 5) exposure history. Interviews were conducted with child-bearing-age minority women at Metro Nashville General Hospital at Meharry Medical College. Data were analyzed utilizing DTReg software for Support Vector Machine (SVM) modeling followed by an SPSS package for a logistic regression model. The target (outcome) variable of interest was respondent's residence by ZIP code. The results demonstrate that the rank order of important variables with respect to SVM modeling versus traditional logistic regression models is almost identical. This is the first study documenting that SVM analysis has discriminate power for determination of higher-ordered spatial relationships on an environmental exposure history questionnaire. PMID:23395953

  15. An ultra low power feature extraction and classification system for wearable seizure detection.

    PubMed

    Page, Adam; Pramod Tim Oates, Siddharth; Mohsenin, Tinoosh

    2015-01-01

    In this paper we explore the use of a variety of machine learning algorithms for designing a reliable and low-power, multi-channel EEG feature extractor and classifier for predicting seizures from electroencephalographic data (scalp EEG). Different machine learning classifiers including k-nearest neighbor, support vector machines, naïve Bayes, logistic regression, and neural networks are explored with the goal of maximizing detection accuracy while minimizing power, area, and latency. The input to each machine learning classifier is a 198 feature vector containing 9 features for each of the 22 EEG channels obtained over 1-second windows. All classifiers were able to obtain F1 scores over 80% and onset sensitivity of 100% when tested on 10 patients. Among five different classifiers that were explored, logistic regression (LR) proved to have minimum hardware complexity while providing average F-1 score of 91%. Both ASIC and FPGA implementations of logistic regression are presented and show the smallest area, power consumption, and the lowest latency when compared to the previous work.

  16. The arcsine is asinine: the analysis of proportions in ecology.

    PubMed

    Warton, David I; Hui, Francis K C

    2011-01-01

    The arcsine square root transformation has long been standard procedure when analyzing proportional data in ecology, with applications in data sets containing binomial and non-binomial response variables. Here, we argue that the arcsine transform should not be used in either circumstance. For binomial data, logistic regression has greater interpretability and higher power than analyses of transformed data. However, it is important to check the data for additional unexplained variation, i.e., overdispersion, and to account for it via the inclusion of random effects in the model if found. For non-binomial data, the arcsine transform is undesirable on the grounds of interpretability, and because it can produce nonsensical predictions. The logit transformation is proposed as an alternative approach to address these issues. Examples are presented in both cases to illustrate these advantages, comparing various methods of analyzing proportions including untransformed, arcsine- and logit-transformed linear models and logistic regression (with or without random effects). Simulations demonstrate that logistic regression usually provides a gain in power over other methods.
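
    The two transformations being compared can be written down in a few lines. This is a minimal sketch; the small constant `eps` added to numerator and denominator so that proportions of exactly 0 or 1 stay finite is an assumption made here for illustration, as are the function names.

```python
import math


def arcsine_sqrt(p: float) -> float:
    """Arcsine square root transform of a proportion p in [0, 1]."""
    return math.asin(math.sqrt(p))


def logit(p: float, eps: float = 0.0) -> float:
    """Logit transform log(p / (1 - p)).

    eps is an assumed small adjustment added to numerator and denominator
    so that p = 0 or p = 1 does not produce an infinite value.
    """
    return math.log((p + eps) / (1.0 - p + eps))


if __name__ == "__main__":
    for p in (0.1, 0.5, 0.9):
        print(p, arcsine_sqrt(p), logit(p))
    # Boundary proportion handled with the assumed adjustment.
    print(0.0, arcsine_sqrt(0.0), logit(0.0, eps=0.01))
```

    The arcsine square root is bounded between 0 and π/2, whereas the logit is unbounded and reads directly as log-odds, which is the interpretability argument made in the abstract.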

  17. Analysis of an environmental exposure health questionnaire in a metropolitan minority population utilizing logistic regression and Support Vector Machines.

    PubMed

    Chen, Chau-Kuang; Bruce, Michelle; Tyler, Lauren; Brown, Claudine; Garrett, Angelica; Goggins, Susan; Lewis-Polite, Brandy; Weriwoh, Mirabel L; Juarez, Paul D; Hood, Darryl B; Skelton, Tyler

    2013-02-01

    The goal of this study was to analyze a 54-item instrument for assessment of perception of exposure to environmental contaminants within the context of the built environment, or exposome. This exposome was defined in five domains to include 1) home and hobby, 2) school, 3) community, 4) occupation, and 5) exposure history. Interviews were conducted with child-bearing-age minority women at Metro Nashville General Hospital at Meharry Medical College. Data were analyzed utilizing DTReg software for Support Vector Machine (SVM) modeling followed by an SPSS package for a logistic regression model. The target (outcome) variable of interest was respondent's residence by ZIP code. The results demonstrate that the rank order of important variables with respect to SVM modeling versus traditional logistic regression models is almost identical. This is the first study documenting that SVM analysis has discriminate power for determination of higher-ordered spatial relationships on an environmental exposure history questionnaire.

  18. Prescription-drug-related risk in driving: comparing conventional and lasso shrinkage logistic regressions.

    PubMed

    Avalos, Marta; Adroher, Nuria Duran; Lagarde, Emmanuel; Thiessard, Frantz; Grandvalet, Yves; Contrand, Benjamin; Orriols, Ludivine

    2012-09-01

    Large data sets with many variables provide particular challenges when constructing analytic models. Lasso-related methods provide a useful tool, although one that remains unfamiliar to most epidemiologists. We illustrate the application of lasso methods in an analysis of the impact of prescribed drugs on the risk of a road traffic crash, using a large French nationwide database (PLoS Med 2010;7:e1000366). In the original case-control study, the authors analyzed each exposure separately. We use the lasso method, which can simultaneously perform estimation and variable selection in a single model. We compare point estimates and confidence intervals using (1) a separate logistic regression model for each drug with a Bonferroni correction and (2) lasso shrinkage logistic regression analysis. Shrinkage regression had little effect on (bias corrected) point estimates, but led to less conservative results, noticeably for drugs with moderate levels of exposure. Carbamates, carboxamide derivative and fatty acid derivative antiepileptics, drugs used in opioid dependence, and mineral supplements of potassium showed stronger associations. Lasso is a relevant method in the analysis of databases with large number of exposures and can be recommended as an alternative to conventional strategies.

  19. Spatiotemporal variability of urban growth factors: A global and local perspective on the megacity of Mumbai

    NASA Astrophysics Data System (ADS)

    Shafizadeh-Moghadam, Hossein; Helbich, Marco

    2015-03-01

    The rapid growth of megacities requires special attention among urban planners worldwide, and particularly in Mumbai, India, where growth is very pronounced. To cope with the planning challenges this will bring, developing a retrospective understanding of urban land-use dynamics and the underlying driving-forces behind urban growth is a key prerequisite. This research uses regression-based land-use change models - and in particular non-spatial logistic regression models (LR) and auto-logistic regression models (ALR) - for the Mumbai region over the period 1973-2010, in order to determine the drivers behind spatiotemporal urban expansion. Both global models are complemented by a local, spatial model, the so-called geographically weighted logistic regression (GWLR) model, one that explicitly permits variations in driving-forces across space. The study comes to two main conclusions. First, both global models suggest similar driving-forces behind urban growth over time, revealing that LRs and ALRs result in estimated coefficients with comparable magnitudes. Second, all the local coefficients show distinctive temporal and spatial variations. It is therefore concluded that GWLR aids our understanding of urban growth processes, and so can assist context-related planning and policymaking activities when seeking to secure a sustainable urban future.

  20. Can Predictive Modeling Identify Head and Neck Oncology Patients at Risk for Readmission?

    PubMed

    Manning, Amy M; Casper, Keith A; Peter, Kay St; Wilson, Keith M; Mark, Jonathan R; Collar, Ryan M

    2018-05-01

    Objective Unplanned readmission within 30 days is a contributor to health care costs in the United States. The use of predictive modeling during hospitalization to identify patients at risk for readmission offers a novel approach to quality improvement and cost reduction. Study Design Two-phase study including retrospective analysis of prospectively collected data followed by prospective longitudinal study. Setting Tertiary academic medical center. Subjects and Methods Prospectively collected data for patients undergoing surgical treatment for head and neck cancer from January 2013 to January 2015 were used to build predictive models for readmission within 30 days of discharge using logistic regression, classification and regression tree (CART) analysis, and random forests. One model (logistic regression) was then placed prospectively into the discharge workflow from March 2016 to May 2016 to determine the model's ability to predict which patients would be readmitted within 30 days. Results In total, 174 admissions had descriptive data. Thirty-two were excluded due to incomplete data. Logistic regression, CART, and random forest predictive models were constructed using the remaining 142 admissions. When applied to 106 consecutive prospective head and neck oncology patients at the time of discharge, the logistic regression model predicted readmissions with a specificity of 94%, a sensitivity of 47%, a negative predictive value of 90%, and a positive predictive value of 62% (odds ratio, 14.9; 95% confidence interval, 4.02-55.45). Conclusion Prospectively collected head and neck cancer databases can be used to develop predictive models that can accurately predict which patients will be readmitted. This offers valuable support for quality improvement initiatives and readmission-related cost reduction in head and neck cancer care.
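
    The reported performance figures can be checked against a single 2×2 table. The counts below are not given in the abstract; they are inferred here as one table of 106 patients that happens to reproduce every reported value, and should be read as an assumption rather than the authors' data. The confidence interval uses the standard Wald interval on the log odds ratio with z = 1.96.

```python
import math

# Hypothetical counts inferred from the reported metrics (not from the paper).
tp, fn, fp, tn = 8, 9, 5, 84  # sums to 106 patients

sensitivity = tp / (tp + fn)   # readmitted patients correctly flagged
specificity = tn / (tn + fp)   # non-readmitted patients correctly cleared
ppv = tp / (tp + fp)           # positive predictive value
npv = tn / (tn + fn)           # negative predictive value

# Odds ratio and Wald 95% confidence interval on the log scale.
odds_ratio = (tp * tn) / (fn * fp)
se_log_or = math.sqrt(1 / tp + 1 / fn + 1 / fp + 1 / tn)
ci_low = math.exp(math.log(odds_ratio) - 1.96 * se_log_or)
ci_high = math.exp(math.log(odds_ratio) + 1.96 * se_log_or)

if __name__ == "__main__":
    print(f"sensitivity {sensitivity:.0%}, specificity {specificity:.0%}")
    print(f"PPV {ppv:.0%}, NPV {npv:.0%}")
    print(f"OR {odds_ratio:.1f} (95% CI {ci_low:.2f}-{ci_high:.2f})")
```

    With a sensitivity near 47%, roughly half of the readmissions in such a table would go unflagged, which is worth keeping in mind alongside the high specificity.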

  1. Utility of an Abbreviated Dizziness Questionnaire to Differentiate between Causes of Vertigo and Guide Appropriate Referral: A Multicenter Prospective Blinded Study

    PubMed Central

    Roland, Lauren T.; Kallogjeri, Dorina; Sinks, Belinda C.; Rauch, Steven D.; Shepard, Neil T.; White, Judith A.; Goebel, Joel A.

    2015-01-01

    Objective Test performance of a focused dizziness questionnaire’s ability to discriminate between peripheral and non-peripheral causes of vertigo. Study Design Prospective multi-center Setting Four academic centers with experienced balance specialists Patients New dizzy patients Interventions A 32-question survey was given to participants. Balance specialists were blinded and a diagnosis was established for all participating patients within 6 months. Main outcomes Multinomial logistic regression was used to evaluate questionnaire performance in predicting final diagnosis and differentiating between peripheral and non-peripheral vertigo. Univariate and multivariable stepwise logistic regression were used to identify questions as significant predictors of the ultimate diagnosis. C-index was used to evaluate performance and discriminative power of the multivariable models. Results 437 patients participated in the study. Eight participants without confirmed diagnoses were excluded and 429 were included in the analysis. Multinomial regression revealed that the model had good overall predictive accuracy of 78.5% for the final diagnosis and 75.5% for differentiating between peripheral and non-peripheral vertigo. Univariate logistic regression identified significant predictors of three main categories of vertigo: peripheral, central and other. Predictors were entered into forward stepwise multivariable logistic regression. The discriminative power of the final models for peripheral, central and other causes were considered good as measured by c-indices of 0.75, 0.7 and 0.78, respectively. Conclusions This multicenter study demonstrates a focused dizziness questionnaire can accurately predict diagnosis for patients with chronic/relapsing dizziness referred to outpatient clinics. Additionally, this survey has significant capability to differentiate peripheral from non-peripheral causes of vertigo and may, in the future, serve as a screening tool for specialty referral. Clinical utility of this questionnaire to guide specialty referral is discussed. PMID:26485598

  2. Utility of an Abbreviated Dizziness Questionnaire to Differentiate Between Causes of Vertigo and Guide Appropriate Referral: A Multicenter Prospective Blinded Study.

    PubMed

    Roland, Lauren T; Kallogjeri, Dorina; Sinks, Belinda C; Rauch, Steven D; Shepard, Neil T; White, Judith A; Goebel, Joel A

    2015-12-01

    Test performance of a focused dizziness questionnaire's ability to discriminate between peripheral and nonperipheral causes of vertigo. Prospective multicenter. Four academic centers with experienced balance specialists. New dizzy patients. A 32-question survey was given to participants. Balance specialists were blinded and a diagnosis was established for all participating patients within 6 months. Multinomial logistic regression was used to evaluate questionnaire performance in predicting final diagnosis and differentiating between peripheral and nonperipheral vertigo. Univariate and multivariable stepwise logistic regression were used to identify questions as significant predictors of the ultimate diagnosis. C-index was used to evaluate performance and discriminative power of the multivariable models. In total, 437 patients participated in the study. Eight participants without confirmed diagnoses were excluded and 429 were included in the analysis. Multinomial regression revealed that the model had good overall predictive accuracy of 78.5% for the final diagnosis and 75.5% for differentiating between peripheral and nonperipheral vertigo. Univariate logistic regression identified significant predictors of three main categories of vertigo: peripheral, central, and other. Predictors were entered into forward stepwise multivariable logistic regression. The discriminative power of the final models for peripheral, central, and other causes was considered good as measured by c-indices of 0.75, 0.7, and 0.78, respectively. This multicenter study demonstrates a focused dizziness questionnaire can accurately predict diagnosis for patients with chronic/relapsing dizziness referred to outpatient clinics. Additionally, this survey has significant capability to differentiate peripheral from nonperipheral causes of vertigo and may, in the future, serve as a screening tool for specialty referral. Clinical utility of this questionnaire to guide specialty referral is discussed.
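
    For a binary outcome, the c-index is the proportion of case/non-case pairs in which the case receives the higher predicted probability, with ties counted as one half. The sketch below computes it by direct pairwise comparison; the function name and the toy scores are hypothetical and unrelated to the study data.

```python
def c_index(scores: list[float], labels: list[int]) -> float:
    """Concordance index for a binary outcome (equal to the ROC AUC).

    scores : predicted probabilities or risk scores
    labels : 1 for cases, 0 for non-cases
    """
    positives = [s for s, y in zip(scores, labels) if y == 1]
    negatives = [s for s, y in zip(scores, labels) if y == 0]
    if not positives or not negatives:
        raise ValueError("need at least one case and one non-case")

    concordant = 0.0
    for s_pos in positives:
        for s_neg in negatives:
            if s_pos > s_neg:
                concordant += 1.0   # case ranked above non-case
            elif s_pos == s_neg:
                concordant += 0.5   # tie counts as half
    return concordant / (len(positives) * len(negatives))


if __name__ == "__main__":
    # Toy example: one of the four case/non-case pairs is discordant.
    print(c_index([0.9, 0.8, 0.3, 0.2], [1, 0, 1, 0]))
```

    A value of 0.5 corresponds to chance-level ranking and 1.0 to perfect separation, so indices in the 0.7 to 0.78 range sit between the two.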

  3. Prediction of cold and heat patterns using anthropometric measures based on machine learning.

    PubMed

    Lee, Bum Ju; Lee, Jae Chul; Nam, Jiho; Kim, Jong Yeol

    2018-01-01

    To examine the association of body shape with cold and heat patterns, to determine which anthropometric measure is the best indicator for discriminating between the two patterns, and to investigate whether using a combination of measures can improve the predictive power to diagnose these patterns. Based on a total of 4,859 subjects (3,000 women and 1,859 men), statistical analyses using binary logistic regression were performed to assess the significance of the difference and the predictive power of each anthropometric measure, and binary logistic regression and Naive Bayes with the variable selection technique were used to assess the improvement in the predictive power of the patterns using the combined measures. In women, the strongest indicators for determining the cold and heat patterns among anthropometric measures were body mass index (BMI) and rib circumference; in men, the best indicator was BMI. In experiments using a combination of measures, the values of the area under the receiver operating characteristic curve in women were 0.776 by Naive Bayes and 0.772 by logistic regression, and the values in men were 0.788 by Naive Bayes and 0.779 by logistic regression. Individuals with a higher BMI have a tendency toward a heat pattern in both women and men. The use of a combination of anthropometric measures can slightly improve the diagnostic accuracy. Our findings can provide fundamental information for the diagnosis of cold and heat patterns based on body shape for personalized medicine.

  4. Application of classification tree and logistic regression for the management and health intervention plans in a community-based study.

    PubMed

    Teng, Ju-Hsi; Lin, Kuan-Chia; Ho, Bin-Shenq

    2007-10-01

    A community-based aboriginal study was conducted and analysed to explore the application of classification tree and logistic regression. A total of 1066 aboriginal residents in Yilan County were screened during 2003-2004. The independent variables include demographic characteristics, physical examinations, geographic location, health behaviours, dietary habits and family history of hereditary diseases. Risk factors of cardiovascular diseases were selected as the dependent variables in further analysis. The completion rate for the health interview is 88.9%. The classification tree results find that if body mass index is higher than 25.72 kg/m² and the age is above 51 years, the predicted probability for number of cardiovascular risk factors ≥3 is 73.6% and the population is 322. If body mass index is higher than 26.35 kg/m² and geographical latitude of the village is lower than 24°22.8′, the predicted probability for number of cardiovascular risk factors ≥4 is 60.8% and the population is 74. The logistic regression results indicate that body mass index, drinking habit and menopause are the top three significant independent variables. The classification tree model specifically shows the discrimination paths and interactions between the risk groups. The logistic regression model presents and analyses the statistical independent factors of cardiovascular risks. Applying both models to specific situations will provide a different angle for the design and management of future health intervention plans after community-based study.

  5. Risk factors for pedicled flap necrosis in hand soft tissue reconstruction: a multivariate logistic regression analysis.

    PubMed

    Gong, Xu; Cui, Jianli; Jiang, Ziping; Lu, Laijin; Li, Xiucun

    2018-03-01

    Few clinical retrospective studies have reported the risk factors of pedicled flap necrosis in hand soft tissue reconstruction. The aim of this study was to identify non-technical risk factors associated with pedicled flap perioperative necrosis in hand soft tissue reconstruction via a multivariate logistic regression analysis. For patients with hand soft tissue reconstruction, we carefully reviewed hospital records and identified 163 patients who met the inclusion criteria. The characteristics of these patients, flap transfer procedures and postoperative complications were recorded. Eleven predictors were identified. The correlations between pedicled flap necrosis and risk factors were analysed using a logistic regression model. Of 163 skin flaps, 125 flaps survived completely without any complications. The pedicled flap necrosis rate in hands was 11.04%, which included partial flap necrosis (7.36%) and total flap necrosis (3.68%). Soft tissue defects in fingers were noted in 68.10% of all cases. The logistic regression analysis indicated that the soft tissue defect site (P = 0.046, odds ratio (OR) = 0.079, confidence interval (CI) (0.006, 0.959)), flap size (P = 0.020, OR = 1.024, CI (1.004, 1.045)) and postoperative wound infection (P < 0.001, OR = 17.407, CI (3.821, 79.303)) were statistically significant risk factors for pedicled flap necrosis of the hand. Soft tissue defect site, flap size and postoperative wound infection were risk factors associated with pedicled flap necrosis in hand soft tissue defect reconstruction. © 2017 Royal Australasian College of Surgeons.

  6. A regularization corrected score method for nonlinear regression models with covariate error.

    PubMed

    Zucker, David M; Gorfine, Malka; Li, Yi; Tadesse, Mahlet G; Spiegelman, Donna

    2013-03-01

    Many regression analyses involve explanatory variables that are measured with error, and failing to account for this error is well known to lead to biased point and interval estimates of the regression coefficients. We present here a new general method for adjusting for covariate error. Our method consists of an approximate version of the Stefanski-Nakamura corrected score approach, using the method of regularization to obtain an approximate solution of the relevant integral equation. We develop the theory in the setting of classical likelihood models; this setting covers, for example, linear regression, nonlinear regression, logistic regression, and Poisson regression. The method is extremely general in terms of the types of measurement error models covered, and is a functional method in the sense of not involving assumptions on the distribution of the true covariate. We discuss the theoretical properties of the method and present simulation results in the logistic regression setting (univariate and multivariate). For illustration, we apply the method to data from the Harvard Nurses' Health Study concerning the relationship between physical activity and breast cancer mortality in the period following a diagnosis of breast cancer. Copyright © 2013, The International Biometric Society.

  7. Logistic Mixed Models to Investigate Implicit and Explicit Belief Tracking

    PubMed Central

    Lages, Martin; Scheel, Anne

    2016-01-01

    We investigated the proposition of a two-systems Theory of Mind in adults’ belief tracking. A sample of N = 45 participants predicted the choice of one of two opponent players after observing several rounds in an animated card game. Three matches of this card game were played and initial gaze direction on target and subsequent choice predictions were recorded for each belief task and participant. We conducted logistic regressions with mixed effects on the binary data and developed Bayesian logistic mixed models to infer implicit and explicit mentalizing in true belief and false belief tasks. Although logistic regressions with mixed effects predicted the data well, a Bayesian logistic mixed model with latent task- and subject-specific parameters gave a better account of the data. As expected, explicit choice predictions suggested a clear understanding of true and false beliefs (TB/FB). Surprisingly, however, model parameters for initial gaze direction also indicated belief tracking. We discuss why task-specific parameters for initial gaze directions are different from choice predictions yet reflect second-order perspective taking. PMID:27853440

  8. Association of the Shared Epitope, Smoking and the Interaction Between the Two With the Presence of Autoantibodies (Anti-CCP and FR) in Patients With Rheumatoid Arthritis in a Hospital in Seville, Spain.

    PubMed

    García de Veas Silva, José Luis; González Rodríguez, Concepción; Hernández Cruz, Blanca

    2017-11-01

    To evaluate the association of shared epitope, smoking and their interaction on the presence of autoantibodies (anti-cyclic citrullinated peptide [CCP] antibodies and rheumatoid factor) in patients with rheumatoid arthritis in our geographical area. A descriptive and cross-sectional study was carried out in a cohort of 106 patients diagnosed with RA. Odds ratios (OR) for antibody development were calculated for shared epitope, tobacco exposure and smoking dose. Statistical analysis was performed with univariate and multivariate statistics using ordinal logistic regression. Odds ratios were calculated with 95% confidence interval (95% CI) and a value of P<.05 was considered significant. In univariate analysis, shared epitope (OR=2.68; 95% CI: 1.11-6.46), tobacco exposure (OR=2.79; 95% CI: 1.12-6.97) and heavy smoker (>20 packs/year) (OR=8.93; 95% CI: 1.95-40.82) were associated with the presence of anti-CCP antibodies. For rheumatoid factor, the association was only significant for tobacco exposure (OR=3.89; 95% CI: 1.06-14.28) and smoking dose (OR=8.33; 95% CI: 1.05-66.22). By ordinal logistic regression analysis, an association with high titers of anti-CCP (>200U/mL) was identified with South American mestizos, patients with homozygous shared epitope, positive FR and heavy smokers. Being a South American mestizo, having a shared epitope, rheumatoid factor positivity and a smoking dose>20 packs/year are independent risk factors for the development of rheumatoid arthritis with a high titer of anti-CCP (>200U/mL). In shared epitope-positive rheumatoid arthritis patients, the intensity of smoking is more strongly associated than tobacco exposure with an increased risk of positive anti-CCP. Copyright © 2017 Elsevier España, S.L.U. and Sociedad Española de Reumatología y Colegio Mexicano de Reumatología. All rights reserved.

  9. The Bright Side and Dark Side of Workplace Social Capital: Opposing Effects of Gender on Overweight among Japanese Employees

    PubMed Central

    Kobayashi, Tomoko; Suzuki, Etsuji; Oksanen, Tuula; Kawachi, Ichiro; Takao, Soshi

    2014-01-01

    Background A growing number of studies have sought to examine the health associations of workplace social capital; however, evidence of associations with overweight is sparse. We examined the association between individual perceptions of workplace social capital and overweight among Japanese male and female employees. Methodology/Principal Findings We conducted a cross-sectional survey among full-time employees at a company in Osaka prefecture in February 2012. We used an 8-item measure to assess overall and sub-dimensions of workplace social capital, divided into tertiles. Of 1050 employees, 849 responded, and 750 (624 men and 126 women) could be linked to annual health check-up data in the analysis. Binomial logistic regression models were used to calculate odds ratios and 95% confidence intervals for overweight (body mass index: ≥25 kg/m2, calculated from measured weight and height) separately for men and women. The prevalence of overweight was 24.5% among men and 14.3% among women. Among men, low levels of bonding and linking social capital in the workplace were associated with a nearly 2-fold risk of overweight compared to high corresponding dimensions of social capital when adjusted for age, sleep hours, physiological distress, and lifestyle. In contrast, among women we found lower overall and linking social capital to be associated with lower odds for overweight even after covariate adjustment. Subsequently, we used multinomial logistic regression analyses to assess the relationships between a 1 standard deviation (SD) decrease in mean social capital and odds of underweight/overweight relative to normal weight. Among men, a 1-SD decrease in overall, bonding, and linking social capital was significantly associated with higher odds of overweight, but not with underweight. Among women, no significant associations were found for either overweight or underweight. 
    Conclusions/Significance We found opposite gender relationships between perceived low linking workplace social capital and overweight among Japanese employees. PMID:24498248

  10. The bright side and dark side of workplace social capital: opposing effects of gender on overweight among Japanese employees.

    PubMed

    Kobayashi, Tomoko; Suzuki, Etsuji; Oksanen, Tuula; Kawachi, Ichiro; Takao, Soshi

    2014-01-01

    A growing number of studies have sought to examine the health associations of workplace social capital; however, evidence of associations with overweight is sparse. We examined the association between individual perceptions of workplace social capital and overweight among Japanese male and female employees. We conducted a cross-sectional survey among full-time employees at a company in Osaka prefecture in February 2012. We used an 8-item measure to assess overall and sub-dimensions of workplace social capital, divided into tertiles. Of 1050 employees, 849 responded, and 750 (624 men and 126 women) could be linked to annual health check-up data in the analysis. Binomial logistic regression models were used to calculate odds ratios and 95% confidence intervals for overweight (body mass index: ≥ 25 kg/m(2), calculated from measured weight and height) separately for men and women. The prevalence of overweight was 24.5% among men and 14.3% among women. Among men, low levels of bonding and linking social capital in the workplace were associated with a nearly 2-fold risk of overweight compared to high corresponding dimensions of social capital when adjusted for age, sleep hours, physiological distress, and lifestyle. In contrast, among women we found lower overall and linking social capital to be associated with lower odds for overweight even after covariate adjustment. Subsequently, we used multinomial logistic regression analyses to assess the relationships between a 1 standard deviation (SD) decrease in mean social capital and odds of underweight/overweight relative to normal weight. Among men, a 1-SD decrease in overall, bonding, and linking social capital was significantly associated with higher odds of overweight, but not with underweight. Among women, no significant associations were found for either overweight or underweight. We found opposite gender relationships between perceived low linking workplace social capital and overweight among Japanese employees.

  11. Polycystic Ovary Syndrome, Oligomenorrhea, and Risk of Ovarian Cancer Histotypes: Evidence from the Ovarian Cancer Association Consortium.

    PubMed

    Harris, Holly R; Babic, Ana; Webb, Penelope M; Nagle, Christina M; Jordan, Susan J; Risch, Harvey A; Rossing, Mary Anne; Doherty, Jennifer A; Goodman, Marc T; Modugno, Francesmary; Ness, Roberta B; Moysich, Kirsten B; Kjær, Susanne K; Høgdall, Estrid; Jensen, Allan; Schildkraut, Joellen M; Berchuck, Andrew; Cramer, Daniel W; Bandera, Elisa V; Wentzensen, Nicolas; Kotsopoulos, Joanne; Narod, Steven A; Phelan, Catherine M; McLaughlin, John R; Anton-Culver, Hoda; Ziogas, Argyrios; Pearce, Celeste L; Wu, Anna H; Terry, Kathryn L

    2018-02-01

    Background: Polycystic ovary syndrome (PCOS), and one of its distinguishing characteristics, oligomenorrhea, have both been associated with ovarian cancer risk in some but not all studies. However, these associations have been rarely examined by ovarian cancer histotypes, which may explain the lack of clear associations reported in previous studies. Methods: We analyzed data from 14 case-control studies including 16,594 women with invasive ovarian cancer (n = 13,719) or borderline ovarian disease (n = 2,875) and 17,718 controls. Adjusted study-specific ORs were calculated using logistic regression and combined using random-effects meta-analysis. Pooled histotype-specific ORs were calculated using polytomous logistic regression. Results: Women reporting menstrual cycle length >35 days had decreased risk of invasive ovarian cancer compared with women reporting cycle length ≤35 days [OR = 0.70; 95% confidence interval (CI) = 0.58-0.84]. Decreased risk of invasive ovarian cancer was also observed among women who reported irregular menstrual cycles compared with women with regular cycles (OR = 0.83; 95% CI = 0.76-0.89). No significant association was observed between self-reported PCOS and invasive ovarian cancer risk (OR = 0.87; 95% CI = 0.65-1.15). There was a decreased risk of all individual invasive histotypes for women with menstrual cycle length >35 days, but no association with serous borderline tumors (P heterogeneity = 0.006). Similarly, we observed decreased risks of most invasive histotypes among women with irregular cycles, but an increased risk of borderline serous and mucinous tumors (P heterogeneity < 0.0001). Conclusions: Our results suggest that menstrual cycle characteristics influence ovarian cancer risk differentially based on histotype. Impact: These results highlight the importance of examining ovarian cancer risk factor associations by histologic subtype. Cancer Epidemiol Biomarkers Prev; 27(2); 174-82. ©2017 AACR.
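
The record above combines study-specific odds ratios with a random-effects meta-analysis but does not name the estimator. As a rough illustration only, the sketch below assumes the common DerSimonian-Laird method applied to log odds ratios; the function name and its inputs are hypothetical and are not taken from the study.

```python
import math


def dersimonian_laird(log_ors, variances):
    """Pool log odds ratios with DerSimonian-Laird random-effects weights.

    Assumption: this estimator is chosen for illustration; the abstract
    only says "random-effects meta-analysis".
    Returns (pooled_log_or, standard_error, tau_squared).
    """
    if len(log_ors) != len(variances) or len(log_ors) < 2:
        raise ValueError("need at least two studies with matching variances")
    k = len(log_ors)
    # Fixed-effect (inverse-variance) weights and pooled estimate.
    w = [1.0 / v for v in variances]
    sum_w = sum(w)
    fixed = sum(wi * yi for wi, yi in zip(w, log_ors)) / sum_w
    # Cochran's Q measures between-study heterogeneity.
    q = sum(wi * (yi - fixed) ** 2 for wi, yi in zip(w, log_ors))
    c = sum_w - sum(wi ** 2 for wi in w) / sum_w
    # Between-study variance, truncated at zero.
    tau2 = max(0.0, (q - (k - 1)) / c) if c > 0 else 0.0
    # Random-effects weights add tau2 to each within-study variance.
    w_star = [1.0 / (v + tau2) for v in variances]
    pooled = sum(ws * yi for ws, yi in zip(w_star, log_ors)) / sum(w_star)
    se = math.sqrt(1.0 / sum(w_star))
    return pooled, se, tau2
```

The pooled odds ratio is `math.exp(pooled)`, and an approximate 95% interval is `math.exp(pooled ± 1.96 * se)`.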

  12. Comparative effectiveness of echinocandins versus fluconazole therapy for the treatment of adult candidaemia due to Candida parapsilosis: a retrospective observational cohort study of the Mycoses Study Group (MSG-12).

    PubMed

    Chiotos, Kathleen; Vendetti, Neika; Zaoutis, Theoklis E; Baddley, John; Ostrosky-Zeichner, Luis; Pappas, Peter; Fisher, Brian T

    2016-12-01

    A polymorphism in the gene encoding β-1,3-glucan synthase, the target of the echinocandin class of antifungals, results in increased in vitro MICs of the echinocandins. This has resulted in controversy surrounding use of the echinocandins for treatment of Candida parapsilosis candidaemia. We aimed to compare 30 day mortality in adults with C. parapsilosis candidaemia treated with echinocandins versus fluconazole. This is a retrospective observational cohort study. We used the Premier Perspective Database to identify adult patients with C. parapsilosis candidaemia treated with only fluconazole or only an echinocandin as definitive therapy. The primary outcome was 30 day mortality. Propensity scores were derived to estimate the probability the patient would have received either an echinocandin or fluconazole. Inverse probability of treatment weighting (IPTW) was used in a weighted logistic regression to calculate odds of 30 day mortality. There were 307 unique patients with C. parapsilosis candidaemia. One hundred and twenty-six (41%) received fluconazole and 181 (59%) received an echinocandin. Age, gender, race, year of admission, need for ICU resources in the week prior to candidaemia onset, and receipt of vasopressors on the day of candidaemia onset were included in the propensity score model used to calculate inverse probability of treatment weights. Weighted logistic regression demonstrated no difference in 30 day mortality between patients receiving an echinocandin as compared with fluconazole (OR 0.82, 95% CI 0.33-2.07). Our result supports the 2016 IDSA invasive candidiasis guidelines, which no longer clearly favour treatment with fluconazole over an echinocandin for C. parapsilosis candidaemia. © The Author 2016. Published by Oxford University Press on behalf of the British Society for Antimicrobial Chemotherapy. All rights reserved. For Permissions, please e-mail: journals.permissions@oup.com.
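
The record above derives propensity scores and then applies inverse probability of treatment weighting (IPTW) in a weighted logistic regression. The following is a minimal sketch of the weighting step only, assuming unstabilized weights (the abstract does not say whether weights were stabilized or trimmed); the function names and argument layout are hypothetical.

```python
def iptw_weights(treated, propensity):
    """Return unstabilized IPTW weights.

    treated: 1/True if the patient received the treatment of interest.
    propensity: estimated probability of receiving that treatment.
    Assumption: unstabilized weights, 1/p for treated and 1/(1-p) otherwise.
    """
    weights = []
    for t, p in zip(treated, propensity):
        if not 0.0 < p < 1.0:
            raise ValueError("propensity scores must lie strictly between 0 and 1")
        weights.append(1.0 / p if t else 1.0 / (1.0 - p))
    return weights


def weighted_odds_ratio(treated, died, weights):
    """Odds ratio from the weighted 2x2 table of treatment by outcome.

    For a logistic model whose only covariate is the treatment indicator,
    this equals the point estimate from a weighted logistic regression.
    """
    cells = {(1, 1): 0.0, (1, 0): 0.0, (0, 1): 0.0, (0, 0): 0.0}
    for t, d, w in zip(treated, died, weights):
        cells[(int(bool(t)), int(bool(d)))] += w
    if cells[(1, 0)] == 0.0 or cells[(0, 1)] == 0.0:
        raise ValueError("odds ratio undefined: an off-diagonal cell is empty")
    return (cells[(1, 1)] * cells[(0, 0)]) / (cells[(1, 0)] * cells[(0, 1)])
```

The confidence interval reported in the abstract would additionally require a variance estimate that accounts for the weights (for example a robust sandwich estimator), which this sketch does not attempt.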

  13. Mixture models for undiagnosed prevalent disease and interval-censored incident disease: applications to a cohort assembled from electronic health records.

    PubMed

    Cheung, Li C; Pan, Qing; Hyun, Noorie; Schiffman, Mark; Fetterman, Barbara; Castle, Philip E; Lorey, Thomas; Katki, Hormuzd A

    2017-09-30

    For cost-effectiveness and efficiency, many large-scale general-purpose cohort studies are being assembled within large health-care providers who use electronic health records. Two key features of such data are that incident disease is interval-censored between irregular visits and there can be pre-existing (prevalent) disease. Because prevalent disease is not always immediately diagnosed, some disease diagnosed at later visits is actually undiagnosed prevalent disease. We consider prevalent disease as a point mass at time zero for clinical applications where there is no interest in time of prevalent disease onset. We demonstrate that the naive Kaplan-Meier cumulative risk estimator underestimates risks at early time points and overestimates later risks. We propose a general family of mixture models for undiagnosed prevalent disease and interval-censored incident disease that we call prevalence-incidence models. Parameters for parametric prevalence-incidence models, such as the logistic regression and Weibull survival (logistic-Weibull) model, are estimated by direct likelihood maximization or by EM algorithm. Non-parametric methods are proposed to calculate cumulative risks for cases without covariates. We compare naive Kaplan-Meier, logistic-Weibull, and non-parametric estimates of cumulative risk in the cervical cancer screening program at Kaiser Permanente Northern California. Kaplan-Meier provided poor estimates while the logistic-Weibull model was a close fit to the non-parametric. Our findings support our use of logistic-Weibull models to develop the risk estimates that underlie current US risk-based cervical cancer screening guidelines. Published 2017. This article has been contributed to by US Government employees and their work is in the public domain in the USA.

  14. Carotid artery intima-media complex thickening in patients with relatively long-surviving type 1 diabetes mellitus.

    PubMed

    Distiller, Larry A; Joffe, Barry I; Melville, Vanessa; Welman, Tania; Distiller, Greg B

    2006-01-01

    The factors responsible for premature coronary atherosclerosis in patients with type 1 diabetes are ill defined. We therefore assessed carotid intima-media complex thickness (IMT) in relatively long-surviving patients with type 1 diabetes as a marker of atherosclerosis and correlated this with traditional risk factors. Cross-sectional study of 148 patients with relatively long-surviving (>18 years) type 1 diabetes (76 men and 72 women) attending the Centre for Diabetes and Endocrinology, Johannesburg. The mean common carotid artery IMT and presence or absence of plaque was evaluated by high-resolution B-mode ultrasound. Their median age was 48 years and duration of diabetes 26 years (range 18-59 years). Traditional risk factors (age, duration of diabetes, glycemic control, hypertension, smoking and lipoprotein concentrations) were recorded. Three response variables were defined and modeled. Standard multiple regression was used for a continuous IMT variable, logistic regression for the presence/absence of plaque and ordinal logistic regression to model three categories of "risk." The median common carotid IMT was 0.62 mm (range 0.44-1.23 mm) with plaque detected in 28 cases. The multiple regression model found significant associations between IMT and current age (P=.001), duration of diabetes (P=.033), BMI (P=.008) and diagnosed hypertension (P=.046) with HDL showing a protective effect (P=.022). Current age (P=.001) and diagnosed hypertension (P=.004), smoking (P=.008) and retinopathy (P=.033) were significant in the logistic regression model. Current age was also significant in the ordinal logistic regression model (P<.001), as was total cholesterol/HDL ratio (P<.001) and mean HbA(1c) concentration (P=.073). 
    The major factors influencing common carotid IMT in patients with relatively long-surviving type 1 diabetes are age, duration of diabetes, existing hypertension and HDL (protective) with a relatively minor role ascribed to relatively long-standing glycemic control.

  15. History of spontaneous miscarriage and the risk of diabetes mellitus among middle-aged and older Chinese women.

    PubMed

    Liu, Bingqing; Song, Lulu; Li, Hui; Zheng, Xiaoxuan; Yuan, Jing; Liang, Yuan; Wang, Youjie

    2018-06-01

    Epidemiological studies of the long-term maternal health outcomes of spontaneous miscarriages have been sparse and inconsistent. The objective of our study is to examine the association between spontaneous miscarriages and diabetes among middle-aged and older Chinese women. A total of 19,539 women from the Dongfeng-Tongji cohort study who completed a questionnaire and underwent medical examinations were included in the analysis. History of spontaneous miscarriage was obtained by self-reporting in the first follow-up questionnaire interview. The presence of diabetes was determined by a fasting plasma glucose level, self-reported physician diagnosis and use of antidiabetic medication. A series of multivariate logistic regression models were used to calculate the odds ratios and 95% CI across spontaneous miscarriage categories (0, 1, 2, ≥ 3) after adjustment for potential confounding factors. The prevalence rate of diabetes was 18.8% among the participants. In the fully adjusted logistic regression model, the odds ratios for diabetes among women who had 1, 2 or ≥ 3 spontaneous miscarriages were 0.86 (95% CI 0.68, 1.08), 1.30 (95% CI 0.82, 2.04) and 2.11 (95% CI 1.08, 4.11), respectively, compared with women who had no history of spontaneous miscarriage. There is an increased risk of diabetes among women with a history of a higher number of spontaneous miscarriages. History of multiple spontaneous miscarriages should be taken into consideration when assessing the risk of diabetes.
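
Records such as the one above report odds ratios with 95% confidence intervals from fitted logistic regression models. As a small illustration of how those numbers relate to a model coefficient, the sketch below assumes a Wald interval on the log-odds scale; the function name is hypothetical, and the abstract does not state which interval method the authors used.

```python
import math


def odds_ratio_with_ci(beta, se, z=1.96):
    """Convert a logistic regression coefficient into an odds ratio.

    beta: estimated coefficient (log odds ratio).
    se: its standard error.
    z: normal quantile; 1.96 gives an approximate 95% Wald interval.
    Returns (odds_ratio, lower_bound, upper_bound).
    """
    if se < 0:
        raise ValueError("standard error must be non-negative")
    # Exponentiating the interval endpoints maps them to the odds-ratio scale.
    return math.exp(beta), math.exp(beta - z * se), math.exp(beta + z * se)
```

An interval that contains 1, like two of the three intervals in the record above, indicates that the association is not statistically significant at the corresponding level.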

  16. Validation of statistical predictive models meant to select melanoma patients for sentinel lymph node biopsy.

    PubMed

    Sabel, Michael S; Rice, John D; Griffith, Kent A; Lowe, Lori; Wong, Sandra L; Chang, Alfred E; Johnson, Timothy M; Taylor, Jeremy M G

    2012-01-01

    To identify melanoma patients at sufficiently low risk of nodal metastases who could avoid sentinel lymph node biopsy (SLNB), several statistical models have been proposed based upon patient/tumor characteristics, including logistic regression, classification trees, random forests, and support vector machines. We sought to validate recently published models meant to predict sentinel node status. We queried our comprehensive, prospectively collected melanoma database for consecutive melanoma patients undergoing SLNB. Prediction values were estimated based upon four published models, calculating the same reported metrics: negative predictive value (NPV), rate of negative predictions (RNP), and false-negative rate (FNR). Logistic regression performed comparably with our data when considering NPV (89.4 versus 93.6%); however, the model's specificity was not high enough to significantly reduce the rate of biopsies (SLN reduction rate of 2.9%). When applied to our data, the classification tree produced NPV and reduction in biopsy rates that were lower (87.7 versus 94.1 and 29.8 versus 14.3, respectively). Two published models could not be applied to our data due to model complexity and the use of proprietary software. Published models meant to reduce the SLNB rate among patients with melanoma either underperformed when applied to our larger dataset, or could not be validated. Differences in selection criteria and histopathologic interpretation likely resulted in underperformance. Statistical predictive models must be developed in a clinically applicable manner to allow for both validation and ultimately clinical utility.

  17. Reporting and methodological quality of meta-analyses in urological literature.

    PubMed

    Xia, Leilei; Xu, Jing; Guzzo, Thomas J

    2017-01-01

    To assess the overall quality of published urological meta-analyses and identify predictive factors for high quality. We systematically searched PubMed to identify meta-analyses published from January 1st, 2011 to December 31st, 2015 in 10 predetermined major paper-based urology journals. The characteristics of the included meta-analyses were collected, and their reporting and methodological qualities were assessed by the PRISMA checklist (27 items) and AMSTAR tool (11 items), respectively. Descriptive statistics were used for individual items as a measure of overall compliance, and PRISMA and AMSTAR scores were calculated as the sum of adequately reported domains. Logistic regression was used to identify predictive factors for high quality. A total of 183 meta-analyses were included. The mean PRISMA and AMSTAR scores were 22.74 ± 2.04 and 7.57 ± 1.41, respectively. PRISMA item 5 (protocol and registration), items 15 and 22 (risk of bias across studies), and items 16 and 23 (additional analysis) had less than 50% adherence. AMSTAR item 1 ("a priori" design), item 5 (list of studies) and item 10 (publication bias) had less than 50% adherence. Logistic regression analyses showed that funding support and "a priori" design were associated with superior reporting quality, and that following the PRISMA guideline and "a priori" design were associated with superior methodological quality. Reporting and methodological qualities of recently published meta-analyses in major paper-based urology journals are generally good. Further improvement could potentially be achieved by strictly adhering to the PRISMA guideline and having an "a priori" protocol.

  18. Prevalence and Trend of Overweight and Obesity among Schoolchildren in Ahvaz, Southwest of Iran

    PubMed Central

    Tabesh, Hamed; Hosseiny, Sayyed Mahdi; Kompani, Farshid; Saki, Azadeh; Firoozabadi, Maliheh Saeed; Chenary, Roghayeh; Fard, Mahta Mehrabian

    2014-01-01

    Introduction: Obesity is an important risk factor for some chronic diseases. Since the effect of obesity is long-standing, monitoring childhood obesity should be the first step in the health policy for interventions regarding early prevention of chronic diseases. In this study we aim to determine the prevalence of overweight and obesity among school children in the city of Ahvaz. Methods: A cross-sectional survey was designed. A sample of 5811 children, 2904 (49.97%) boys and 2907 (50.03%) girls, was selected and their heights and weights were measured in the 2012-2013 academic year. Measurements of height and weight were made using calibrated equipment and according to a standardized protocol, with the children wearing light clothes and no shoes. The odds ratios of obesity and overweight adjusted for age and sex were calculated from a multiple logistic regression model. Results: A total of 685 (23.6%) of boys and 561 (19.3%) of girls were overweight, and 190 (6.05%) of boys and 130 (4.5%) of girls were obese. The proportion of overweight and obese boys was significantly higher than that of girls (p<0.001). Logistic regression showed a significant increase in the likelihood of being overweight with increasing age (OR=1.50; 95% CI: 1.43, 1.57). Conclusion: The prevalence of overweight and obesity increased markedly with age. This shows the importance of early prevention through interventions and training from the first year of primary school. PMID:24576363

  19. Prevalence and trend of overweight and obesity among schoolchildren in Ahvaz, Southwest of Iran.

    PubMed

    Tabesh, Hamed; Hosseiny, Sayyed Mahdi; Kompani, Farshid; Saki, Azadeh; Firoozabadi, Maliheh Saeed; Chenary, Roghayeh; Mehrabian Fard, Mahta

    2013-11-26

    Obesity is an important risk factor for some chronic diseases. Since the effect of obesity is long-standing, monitoring childhood obesity should be the first step in the health policy for interventions regarding early prevention of chronic diseases. In this study we aim to determine the prevalence of overweight and obesity among school children in the city of Ahvaz. A cross-sectional survey was designed. A sample of 5811 children, 2904 (49.97%) boys and 2907 (50.03%) girls, was selected and their heights and weights were measured in the 2012-2013 academic year. Measurements of height and weight were made using calibrated equipment and according to a standardized protocol, with the children wearing light clothes and no shoes. The odds ratios of obesity and overweight adjusted for age and sex were calculated from a multiple logistic regression model. A total of 685 (23.6%) of boys and 561 (19.3%) of girls were overweight, and 190 (6.05%) of boys and 130 (4.5%) of girls were obese. The proportion of overweight and obese boys was significantly higher than that of girls (p<0.001). Logistic regression showed a significant increase in the likelihood of being overweight with increasing age (OR=1.50; 95% CI: 1.43, 1.57). The prevalence of overweight and obesity increased markedly with age. This shows the importance of early prevention through interventions and training from the first year of primary school.

  20. Pharmacodynamics and effectiveness of topical nitroglycerin at lowering blood pressure during autonomic dysreflexia.

    PubMed

    Solinsky, R; Bunnell, A E; Linsenmeyer, T A; Svircev, J N; Engle, A; Burns, S P

    2017-10-01

    Secondary analysis of prospectively collected observational data assessing the safety of an autonomic dysreflexia (AD) management protocol. To estimate the time to onset of action, time to full clinical effect (sustained systolic blood pressure (SBP) <160 mm Hg) and effectiveness of nitroglycerin ointment at lowering blood pressure for patients with spinal cord injuries experiencing AD. US Veterans Affairs inpatient spinal cord injury (SCI) unit. Episodes of AD recalcitrant to nonpharmacologic interventions that were given one to two inches of 2% topical nitroglycerin ointment were recorded. Pharmacodynamics as above and predictive characteristics (through a mixed multivariate logistic regression model) were calculated. A total of 260 episodes of pharmacologically managed AD were recorded in 56 individuals. Time to onset of action for nitroglycerin ointment was 9-11 min. Time to full clinical effect was 14-20 min. Topical nitroglycerin controlled SBP <160 mm Hg in 77.3% of pharmacologically treated AD episodes with the remainder requiring additional antihypertensive medications. A multivariate logistic regression model was unable to identify statistically significant factors to predict which patients would respond to nitroglycerin ointment (odds ratios 95% confidence intervals 0.29-4.93). The adverse event rate, entirely attributed to hypotension, was 3.6% with seven of the eight events resolving with close observation alone and one episode requiring normal saline. Nitroglycerin ointment has a rapid onset of action and time to full clinical effect with high efficacy and relatively low adverse event rate for patients with SCI experiencing AD.

  1. Blood cadmium concentrations in Korean adolescents: From the Korea National Health and Nutrition Examination Survey 2010-2013.

    PubMed

    Ahn, Borami; Kim, Shin-Hye; Park, Mi-Jung

    2017-01-01

    To assess blood cadmium levels in Korean adolescents with respect to demographic and lifestyle factors. We analyzed data from the Korea National Health and Nutrition Examination Survey from 2010 to 2013, totaling 1472 adolescents aged 10-18 years. Geometric means of blood cadmium were calculated using a complex samples general linear model to compare blood levels in different demographic and lifestyle groups. Multivariate logistic regression analyses were also used to find predictors for high blood cadmium (>90th percentile). The geometric mean of the blood cadmium concentrations was 0.30μg/L in Korean adolescents. Older age, type of housing (multifamily house and commercial building), smoking and alcohol consumption, and iron deficiency/iron deficiency anemia (IDA) were significantly associated with higher blood cadmium concentrations (P<0.05). Blood cadmium concentrations were not significantly affected by gender, region, body mass index status, or household income. In multivariate logistic regression analysis, independent predictors for higher blood cadmium levels included current smoker (OR=7.77), alcohol consumption (OR=4.31), living in a multifamily house or commercial building (OR=3.11-3.46), and IDA (OR=2.64). Possible associations between blood cadmium levels and type of housing or alcohol consumption in adolescents are suggested for the first time in this study. Further studies are needed to elucidate the mechanism of these findings. Copyright © 2016 Elsevier GmbH. All rights reserved.

  2. Does a birthday predispose to vascular events?

    PubMed

    Saposnik, Gustavo; Baibergenova, Akerke; Dang, Jason; Hachinski, Vladimir

    2006-07-25

    To examine the influence of birthdays on the onset and course of vascular events such as stroke, TIA, and acute myocardial infarction (AMI). This population-based study included all emergency department (ED) admissions due to ischemic stroke, TIA, or AMI from April 2002 to March 2004 in Ontario, Canada. All cases were identified through the National Ambulatory Care Reporting System. Calculations of daily and weekly numbers of events were centered on the patient's birthday and the week of the birthday. Statistical analyses include binomial tests and logistic regression. During the study period, there were 24,315 ED admissions with acute stroke, 16,088 with TIAs, and 29,090 with AMI. The observed number of vascular events during the birthday was higher than the expected daily number of visits for stroke (87 vs 67; p = 0.009), TIA (58 vs 44; p = 0.02), and AMI (97 vs 80; p = 0.027) but not for selected control conditions (asthma, appendicitis, head trauma). Vascular events were more likely to occur on birthday (242 vs 191; odds ratio [OR] = 1.27). No significant differences were observed during the birthday week for any of the conditions. Multivariate logistic regression showed that birthday vascular events were more likely to occur in patients with a history of hypertension (OR = 1.88; 95% CI 1.09 to 3.24). Sensitivity analyses with alternative definitions of birthday week did not alter the results. Stress associated with birthdays may trigger vascular events in patients with predisposing conditions.

  3. An Evidence-Based Approach to Defining Fetal Macrosomia.

    PubMed

    Froehlich, Rosemary; Simhan, Hyagriv N; Larkin, Jacob C

    2016-04-01

    This study aims to determine the risk of adverse outcomes associated with the current diagnostic criteria for fetal macrosomia. We evaluated three techniques for characterizing birth weight as a predictor of shoulder dystocia or third- or fourth-degree laceration in 79,879 vaginal deliveries. First, we compared deliveries with birth weights above or below 4,500 g. We then performed logistic regression using birth weight as a continuous predictor, both with and without fractional polynomial transformation. Finally, we calculated the number of cesarean sections required to prevent one incident of the interrogated outcomes (number needed to treat [NNT]). Rates of adverse intrapartum outcomes increase incrementally with increasing birth weight and are predicted most accurately with logistic regression following fractional polynomial transformation. The NNT for third- or fourth-degree laceration dropped from 14.3 (95% confidence interval [CI], 13.9-14.7) at a birth weight of 3,500 g to 6.4 (95% CI, 6.1-6.8) at 4,500 g and, for shoulder dystocia, from 54.9 (95% CI, 51.5-58.6) at 3,500 g to 5.6 (95% CI, 5.2-6.0) at 4,500 g. The conventional distinction between "normal" and "macrosomic" does not reflect the incremental effect of increasing birth weight on the risk of obstetric morbidity. Outcomes analysis can inform fetal growth standards to better reflect relevant thresholds of risk. Thieme Medical Publishers 333 Seventh Avenue, New York, NY 10001, USA.
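
The record above reports a number needed to treat (NNT) at several birth weights but does not give the formula. A minimal sketch under the standard definition, NNT = 1 / absolute risk reduction, follows; the function name is hypothetical, and how the authors obtained the two risks from their regression model is not stated in the abstract.

```python
def number_needed_to_treat(risk_without, risk_with):
    """Return the NNT as the reciprocal of the absolute risk reduction.

    risk_without: probability of the outcome without the intervention.
    risk_with: probability of the outcome with the intervention.
    """
    for risk in (risk_without, risk_with):
        if not 0.0 <= risk <= 1.0:
            raise ValueError("risks must be probabilities between 0 and 1")
    absolute_risk_reduction = risk_without - risk_with
    if absolute_risk_reduction <= 0.0:
        raise ValueError("the intervention does not reduce risk; NNT is undefined")
    return 1.0 / absolute_risk_reduction
```

For instance, an intervention that lowers a risk from 20% to 10% has an absolute risk reduction of 0.10, so about ten interventions are needed to prevent one event.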

  4. The relationship between the dietary inflammatory index and prevalence of radiographic symptomatic osteoarthritis: data from the Osteoarthritis Initiative.

    PubMed

    Veronese, Nicola; Shivappa, Nitin; Stubbs, Brendon; Smith, Toby; Hébert, James R; Cooper, Cyrus; Guglielmi, Giuseppe; Reginster, Jean-Yves; Rizzoli, Renè; Maggi, Stefania

    2017-12-05

    To investigate whether higher dietary inflammatory index (DII®) scores were associated with higher prevalence of radiographic symptomatic knee osteoarthritis in a large cohort of North American people from the Osteoarthritis Initiative database. A total of 4358 community-dwelling participants (2527 females; mean age 61.2 years) from the Osteoarthritis Initiative were identified. DII® scores were calculated using the validated Block Brief 2000 Food-Frequency Questionnaire and scores were categorized into quartiles. Knee radiographic symptomatic osteoarthritis was diagnosed clinically and radiologically. The strength of association between DII® scores (divided into quartiles) and knee osteoarthritis was investigated through a logistic regression analysis, which adjusted for potential confounders, and results were reported as odds ratios (ORs) with 95% confidence intervals (CIs). Participants with a higher DII® score, indicating a more pro-inflammatory diet, had a significantly higher prevalence of radiographic symptomatic knee osteoarthritis compared to those with lower DII® score (quartile 4: 35.4% vs. quartile 1: 24.0%; p < 0.0001). Using a logistic regression analysis, adjusting for 11 potential confounders, participants with the highest DII® score (quartile 4) had a significantly higher probability of experiencing radiographic symptomatic knee osteoarthritis (OR 1.40; 95% CI 1.14-1.72; p = 0.002) compared to participants with the lowest DII® score (quartile 1). Higher DII® values are associated with higher prevalence of radiographic symptomatic knee osteoarthritis.

  5. Predictive occurrence models for coastal wetland plant communities: delineating hydrologic response surfaces with multinomial logistic regression

    USGS Publications Warehouse

    Snedden, Gregg A.; Steyer, Gregory D.

    2013-01-01

    Understanding plant community zonation along estuarine stress gradients is critical for effective conservation and restoration of coastal wetland ecosystems. We related the presence of plant community types to estuarine hydrology at 173 sites across coastal Louisiana. Percent relative cover by species was assessed at each site near the end of the growing season in 2008, and hourly water level and salinity were recorded at each site Oct 2007–Sep 2008. Nine plant community types were delineated with k-means clustering, and indicator species were identified for each of the community types with indicator species analysis. An inverse relation between salinity and species diversity was observed. Canonical correspondence analysis (CCA) effectively segregated the sites across ordination space by community type, and indicated that salinity and tidal amplitude were both important drivers of vegetation composition. Multinomial logistic regression (MLR) and Akaike's Information Criterion (AIC) were used to predict the probability of occurrence of the nine vegetation communities as a function of salinity and tidal amplitude, and probability surfaces obtained from the MLR model corroborated the CCA results. The weighted kappa statistic, calculated from the confusion matrix of predicted versus actual community types, was 0.7 and indicated good agreement between observed community types and model predictions. Our results suggest that models based on a few key hydrologic variables can be valuable tools for predicting vegetation community development when restoring and managing coastal wetlands.
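
    A weighted kappa such as the one reported above can be computed directly from a confusion matrix. The sketch below uses the common disagreement-weight formulation with linear or quadratic weights. The weighting scheme and the ordering of categories are assumptions, since the record does not say which the authors used, and the example matrix is made up.

```python
def weighted_kappa(confusion, weights="linear"):
    """Weighted Cohen's kappa from a square confusion matrix (rows = actual, cols = predicted).

    Categories are assumed to be ordered; the disagreement weight grows with |i - j|.
    """
    k = len(confusion)
    n = sum(sum(row) for row in confusion)
    row_totals = [sum(row) for row in confusion]
    col_totals = [sum(confusion[i][j] for i in range(k)) for j in range(k)]

    def disagreement(i, j):
        d = abs(i - j) / (k - 1)
        return d if weights == "linear" else d * d

    observed = sum(
        disagreement(i, j) * confusion[i][j] for i in range(k) for j in range(k)
    ) / n
    expected = sum(
        disagreement(i, j) * row_totals[i] * col_totals[j]
        for i in range(k)
        for j in range(k)
    ) / (n * n)
    return 1.0 - observed / expected


# Made-up 3-class confusion matrix, for illustration only.
example = [[20, 3, 1], [4, 15, 2], [0, 5, 10]]
print(round(weighted_kappa(example), 3))
```

    With two categories the weighted and unweighted statistics coincide, which gives a quick sanity check on any implementation.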

  6. Public sector scale-up of zinc and ORS improves coverage in selected districts in Bihar, India.

    PubMed

    Walker, Christa L Fischer; Taneja, Sunita; Lamberti, Laura M; Black, Robert E; Mazumder, Sarmila

    2015-12-01

    In Bihar, India, a new initiative to enhance diarrhea treatment with zinc and ORS in the public sector was rolled out in selected districts. We conducted an external evaluation to measure changes in diarrhea careseeking and treatment in intervention districts. We conducted baseline and endline household surveys among caregivers of children 2-59 months of age. We calculated summary statistics for household characteristics, knowledge, careseeking and treatments given to children with a diarrhea episode in the last 14 days and built logistic regression models to compare baseline and endline values. Caregivers named a public health center as an appropriate source of care for childhood diarrhea more often at endline (71.3%) compared to baseline (38.4%) but did not report increased careseeking to public sector providers for the current diarrhea episode. In logistic regression analyses, the odds of receiving zinc, with or without oral rehydration salts (ORS), increased at endline by more than 2.7 as compared to baseline. Children who were taken to the public sector for care were more likely to receive zinc (odds ratio, OR = 3.93) and zinc in addition to ORS (OR = 6.10) compared to children who were not taken to the public sector. Coverage of zinc and ORS can improve with public sector programs targeted at training and increasing product availability, but demand creation may be needed to increase public sector careseeking in areas where the private sector has historically provided much of the care.

  7. Validation of Statistical Predictive Models Meant to Select Melanoma Patients for Sentinel Lymph Node Biopsy

    PubMed Central

    Sabel, Michael S.; Rice, John D.; Griffith, Kent A.; Lowe, Lori; Wong, Sandra L.; Chang, Alfred E.; Johnson, Timothy M.; Taylor, Jeremy M.G.

    2013-01-01

    Introduction To identify melanoma patients at sufficiently low risk of nodal metastases who could avoid SLN biopsy (SLNB). Several statistical models have been proposed based upon patient/tumor characteristics, including logistic regression, classification trees, random forests and support vector machines. We sought to validate recently published models meant to predict sentinel node status. Methods We queried our comprehensive, prospectively-collected melanoma database for consecutive melanoma patients undergoing SLNB. Prediction values were estimated based upon 4 published models, calculating the same reported metrics: negative predictive value (NPV), rate of negative predictions (RNP), and false negative rate (FNR). Results Logistic regression performed comparably with our data when considering NPV (89.4% vs. 93.6%); however, the model’s specificity was not high enough to significantly reduce the rate of biopsies (SLN reduction rate of 2.9%). When applied to our data, the classification tree produced NPV and reduction in biopsies rates that were lower (87.7% vs. 94.1% and 29.8% vs. 14.3%, respectively). Two published models could not be applied to our data due to model complexity and the use of proprietary software. Conclusions Published models meant to reduce the SLNB rate among patients with melanoma either underperformed when applied to our larger dataset, or could not be validated. Differences in selection criteria and histopathologic interpretation likely resulted in underperformance. Statistical predictive models must be developed in a clinically applicable manner to allow for both validation and ultimately clinical utility. PMID:21822550
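
    The three validation metrics named above can be read off a 2x2 table of predicted versus actual sentinel node status. The sketch below assumes the standard definitions: NPV = TN / (TN + FN), RNP = (TN + FN) / N, and FNR = FN / (FN + TP). The published models may define them slightly differently, and the counts shown are invented.

```python
from typing import NamedTuple


class ValidationMetrics(NamedTuple):
    npv: float  # negative predictive value
    rnp: float  # rate of negative predictions
    fnr: float  # false negative rate


def validation_metrics(tp: int, fp: int, fn: int, tn: int) -> ValidationMetrics:
    """Compute NPV, RNP and FNR from the four cells of a 2x2 confusion table."""
    total = tp + fp + fn + tn
    predicted_negative = tn + fn
    actual_positive = tp + fn
    return ValidationMetrics(
        npv=tn / predicted_negative,
        rnp=predicted_negative / total,
        fnr=fn / actual_positive,
    )


# Invented counts, for illustration only.
m = validation_metrics(tp=30, fp=50, fn=5, tn=15)
print(f"NPV = {m.npv:.3f}, RNP = {m.rnp:.3f}, FNR = {m.fnr:.3f}")
```

    Under these definitions a model can reach a high NPV while predicting very few negatives, which matches the pattern described in the Results: a comparable NPV alongside a small reduction in biopsies.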

  8. Association between red blood cell distribution width (RDW) and carotid artery atherosclerosis (CAS) in patients with primary ischemic stroke.

    PubMed

    Jia, He; Li, Huimian; Zhang, Yan; Li, Che; Hu, Yingyun; Xia, Chunfang

    2015-01-01

    The present study aimed to explore the association between RDW and CAS in patients with ischemic stroke, expecting to find a new and significant diagnosis index for clinical practice. This cross-sectional study involves 432 consecutive patients with primary ischemic stroke (within 72 h). All subjects were confirmed by magnetic resonance imaging, and underwent physical examination, laboratory tests and carotid ultrasonography check. Finally, 392 patients were included according to the exclusion criteria. The odds ratios of independent variables were calculated using stepwise multiple logistic regression. Carotid intimal-medial thickness (IMT) and RDW are both significantly different between CAS group and control group. Univariate analyses show that high-sensitive C-reactive protein (Hs-CRP) and RDW (r=0.436) are both in significantly positive association with IMT. Stepwise multiple logistic regression shows that RDW is an independent protective factor of CAS in patients with ischemic stroke. Compared with the lowest quartile, the odds ratios for the second to fourth quartiles are 1.13 (95% CI: 1.13-3.05), 2.02 (95% CI: 1.66-4.67), and 3.10 (95% CI: 2.46-7.65), respectively. The present study suggested that, among patients with primary ischemic stroke, RDW levels were higher in those with CAS than in those without. Our results facilitated a bridge to connect RDW with ischemic stroke and further confirmed the role of RDW in the progression of the ischemic stroke. Copyright © 2015 Elsevier Ireland Ltd. All rights reserved.

  9. An exploration of the relationship between youth assets and engagement in risky sexual behaviors.

    PubMed

    Evans, Alexandra E; Sanderson, Maureen; Griffin, Sarah F; Reininger, Belinda; Vincent, Murray L; Parra-Medina, Debra; Valois, Robert F; Taylor, Doug

    2004-11-01

    To examine the relationship between specific youth assets and adolescents' engagement in risky sexual behaviors, as measured by an Aggregate Sexual Risk score, and to specifically explore which youth assets and demographic variables were predictive of youth engagement in risky sexual intercourse. A total of 2108 sexually active high school students attending public high schools in a southern state completed a self-report questionnaire that measured youth assets. Based upon responses to items measuring risk behaviors, an Aggregate Sexual Risk score was calculated for each student. Unconditional logistic regression and multivariate logistic regression analyses were conducted to examine the relationships between the assets and the Aggregate Risk Score. Four separate analyses (white females, white males, black females, and black males) were conducted. In general, the patterns in all four groups indicated that students who had an Aggregate Risk Score of ≥ 3 (high risk) possessed fewer of the measured youth assets. The assets that were most significantly associated with engagement in risky sexual behaviors included self peer values regarding risky behaviors, quantity of other adult support, and youths' empathetic relationships. Thus, students who reported not having these assets were significantly more likely to engage in the risky sexual behaviors. Results underscore the relationship of specific youth assets to sexual risk behaviors. Health researchers and practitioners who work to prevent teen pregnancy and sexually transmitted infections among teenagers need to understand and acknowledge these factors within this population so that the assets can be built or strengthened.

  10. Relationship between number of sexual intercourse partners and selected health risk behaviors among public high school adolescents.

    PubMed

    Valois, R F; Oeltmann, J E; Waller, J; Hussey, J R

    1999-11-01

    To examine the relationship between number of sexual partners and selected health risk behaviors in a statewide sample of public high school students. The Centers for Disease Control and Prevention Youth Risk Behavior Survey was used to secure usable sexual risk-taking, substance use, and violence/aggression data from 3805 respondents. Because simple polychotomous logistic regression analysis revealed a significant Race x Gender interaction, subsequent multivariate models were constructed separately for each race-gender group. Odds ratios and 95% confidence intervals were calculated from polychotomous logistic regression models for number of sexual intercourse partners and their potential risk behavior correlates. An increased number of sexual intercourse partners was correlated with a cluster of risk behaviors that place adolescents at risk for unintended pregnancy, human immunodeficiency virus/acquired immunodeficiency syndrome, and other sexually transmitted infections. For Black females, alcohol, tobacco, marijuana use, and dating violence behaviors were the strongest predictors of an increased number of sexual partners; white females had similar predictors with the addition of physical fighting. For white males, alcohol, tobacco, marijuana use, physical fighting, carrying weapons, and dating violence were the strongest predictors of an increased number of sexual intercourse partners. Black males had similar predictors with the addition of binge alcohol use. Prevention of adolescent sexual and other health risk behaviors calls for creative approaches in school and community settings and will require long-term intervention strategies focused on adolescent behavior changes and environmental modifications.

  11. Prediction of Depression in Cancer Patients With Different Classification Criteria, Linear Discriminant Analysis versus Logistic Regression.

    PubMed

    Shayan, Zahra; Mohammad Gholi Mezerji, Naser; Shayan, Leila; Naseri, Parisa

    2015-11-03

    Logistic regression (LR) and linear discriminant analysis (LDA) are two popular statistical models for prediction of group membership. Although they are very similar, the LDA makes more assumptions about the data. When categorical and continuous variables are used simultaneously, the optimal choice between the two models is questionable. In most studies, classification error (CE) is used to discriminate between subjects in several groups, but this index is not suitable to predict the accuracy of the outcome. The present study compared LR and LDA models using classification indices. This cross-sectional study selected 243 cancer patients. Sample sets of different sizes (n = 50, 100, 150, 200, 220) were randomly selected and the CE, B, and Q classification indices were calculated by the LR and LDA models. CE revealed a lack of superiority for one model over the other, but the results showed that LR performed better than LDA for the B and Q indices in all situations. No significant effect for sample size on CE was noted for selection of an optimal model. Assessment of the accuracy of prediction of real data indicated that the B and Q indices are appropriate for selection of an optimal model. The results of this study showed that LR performs better in some cases and LDA in others when based on CE. The CE index is not appropriate for classification, although the B and Q indices performed better and offered more efficient criteria for comparison and discrimination between groups.

  12. Household financial contribution to the health System in Shiraz, Iran in 2012.

    PubMed

    Kavosi, Zahra; Keshtkaran, Ali; Hayati, Ramin; Ravangard, Ramin; Khammarnia, Mohammad

    2014-10-01

    One common challenge to social systems is achieving equity in financial contributions and preventing financial loss. Because of the large and unpredictable nature of some costs, achieving this goal in the health system presents important and unique problems. The present study investigated the Household Financial Contributions (HFCs) to the health system. The study investigated 800 households in Shiraz. The study sample size was selected using stratified sampling and cluster sampling in the urban and rural regions, respectively. The data was collected using the household section of the World Health Survey (WHS) questionnaire. Catastrophic health expenditures were calculated based on the ability of the household to pay and the reasons for the catastrophic health expenditures by a household were specified using logistic regression. The results showed that the fairness financial contribution index was 0.6 and that 14.2% of households were faced with catastrophic health expenditures. Logistic regression analysis revealed that household economic status, the basic and supplementary insurance status of the head of the household, existence of individuals in the household who require chronic medical care, use of dental and hospital care, rural location of residences, frequency of use of outpatient services, and Out-of-Pocket (OOP) payment for physician visits were effective factors for determining the likelihood of experiencing catastrophic health expenditure. It appears that the current method of health financing in Iran does not adequately protect households against catastrophic health expenditure. Consequently, it is essential to reform healthcare financing.

  13. Osteoporosis prediction from the mandible using cone-beam computed tomography

    PubMed Central

    Al Haffar, Iyad; Khattab, Razan

    2014-01-01

    Purpose This study aimed to evaluate the use of dental cone-beam computed tomography (CBCT) in the diagnosis of osteoporosis among menopausal and postmenopausal women by using only a CBCT viewer program. Materials and Methods Thirty-eight menopausal and postmenopausal women who underwent dual-energy X-ray absorptiometry (DXA) examination for hip and lumbar vertebrae were scanned using CBCT (field of view: 13 cm×15 cm; voxel size: 0.25 mm). Slices from the body of the mandible as well as the ramus were selected and some CBCT-derived variables, such as radiographic density (RD), were calculated as gray values. Pearson's correlation, one-way analysis of variance (ANOVA), and accuracy (sensitivity and specificity) evaluation based on linear and logistic regression were performed to choose the variable that best correlated with the lumbar and femoral neck T-scores. Results RD of the whole bone area of the mandible was the variable that best correlated with and predicted both the femoral neck and the lumbar vertebrae T-scores; further, Pearson's correlation coefficients were 0.5/0.6 (p value=0.037/0.009). The sensitivity, specificity, and accuracy based on the logistic regression were 50%, 88.9%, and 78.4%, respectively, for the femoral neck, and 46.2%, 91.3%, and 75%, respectively, for the lumbar vertebrae. Conclusion Lumbar vertebrae and femoral neck osteoporosis can be predicted with high accuracy from the RD value of the body of the mandible by using a CBCT viewer program. PMID:25473633

  14. Association of RTEL1 gene polymorphisms with stroke risk in a Chinese Han population.

    PubMed

    Cai, Yi; Zeng, Chaosheng; Su, Qingjie; Zhou, Jingxia; Li, Pengxiang; Dai, Mingming; Wang, Desheng; Long, Faqing

    2017-12-29

    We investigated the associations between single nucleotide polymorphisms (SNPs) in the regulator of telomere elongation helicase 1 (RTEL1) gene and stroke in the Chinese population. A total of 400 stroke patients and 395 healthy participants were included in this study. Five SNPs in RTEL1 were genotyped and the association with stroke risk was analyzed. Odds ratios (ORs) and 95% confidence intervals (95% CIs) were calculated using unconditional logistic regression analysis. Multivariate logistic regression analysis was used to identify SNPs that correlated with stroke. Rs2297441 was associated with an increased risk of stroke in an allele model (odds ratio [OR] = 1.24, 95% confidence interval [95% CI] = 1.01-1.52, p = 0.043). Rs6089953 was associated with an increased risk of stroke under the genotype model (OR = 1.862, 95% CI = 1.123-3.085, p = 0.016). Rs2297441 was associated with an increased risk of stroke in an additive model (OR = 1.234, 95% CI = 1.005, p = 0.045). Rs6089953, Rs6010620 and Rs6010621 were associated with an increased risk of stroke in the recessive model (Rs6089953: OR = 1.825, 95% CI = 1.121-2.969, p = 0.01546; Rs6010620: OR = 1.64, 95% CI = 1.008-2.669, p = 0.04656; Rs6010621: OR = 1.661, 95% CI = 1.014-2.722, p = 0.04389). Our findings reveal a possible association between SNPs in the RTEL1 gene and stroke risk in the Chinese population.

  15. SCREENING FOR TYPE 2 DIABETES MELLITUS AND PREDIABETES USING POINT-OF-CARE TESTING FOR HBA1C AMONG THAI DENTAL PATIENTS.

    PubMed

    Tantipoj, Chanita; Sakoolnamarka, Serena Siraratna; Supa-amornkul, Sirirak; Lohsoonthorn, Vitool; Deerochanawong, Chaicharn; Khovidhunkit, Siribangon Piboonniyom; Hiransuthikul, Narin

    2017-03-01

    Diabetes mellitus type 2 (DM) is associated with oral diseases. Some studies indicated that patients who seek dental treatment could have an undiagnosed hyperglycemic condition. The aim of this study was to assess the prevalence of undiagnosed hyperglycemia and selected associated factors among Thai dental patients. Dental patients without a history of hyperglycemia were recruited from the Special Clinic, Faculty of Dentistry, Mahidol University, Bangkok, Thailand and His Majesty the King’s Dental Service Unit, Thailand. The patients were randomly selected and a standardized questionnaire was used to collect demographic data from each patient. Blood pressure, body mass index (BMI), and waist circumference were recorded for each subject. The number of missing teeth, periodontal status, and salivary flow rate were also investigated. HbA1c was assessed using a finger prick blood sample and analyzed with a point-of-care testing machine. Hyperglycemia was defined as an HbA1c ≥5.7%. The prevalence of hyperglycemia among participants was calculated and multivariate logistic regression analysis was used to identify risk factors. A total of 724 participants were included in the study; 33.8% had hyperglycemia. On multiple logistic regression analysis, older age, family history of DM, being overweight (BMI ≥23 kg/m²), having central obesity and having severe periodontitis were significantly associated with hyperglycemia. The high prevalence of hyperglycemia in this study of dental patients suggests this setting may be appropriate to screen for patients with hyperglycemia.

  16. Evaluation of select neurophysiological, clinical and psychological tests for burning mouth syndrome.

    PubMed

    Mendak-Ziółko, Magdalena; Konopka, Tomasz; Bogucki, Zdzisław Artur

    2012-09-01

    The objective of this study was to identify, among an array of potential risk factors for burning mouth syndrome (BMS), those that are potentially the most significant in the development of the disease. Sixty-three participants, divided into group I (with BMS: 33 patients ages 41 to 82 years [mean age: 61.5 ± 9.4]) and group II (without BMS: 30 healthy volunteers ages 42-83 years [mean age: 60.5 ± 10.5]) were studied. All underwent a dental examination and psychological tests. Neurological tests (neurophysiological test, electroneurography, and tests of the autonomic nervous system) were performed. Mean parameters were analyzed by Student t test, Kruskal-Wallis test, and χ(2) test, and multifactor analysis was performed with logistic regression and by calculating the odds ratio. In the logistic regression test, 3 factors were significant in the etiopathogenesis of BMS: a value more than 39 μV for the amplitude of the positive peak of the potential induced by stimulating the trigeminal nerve on the left side (P2-L); a value above 5.96 ms for the latency of wave V of the brainstem auditory evoked potentials on the right side (V-R); and a value over 2.35 ms for the latency of the sensory ulnar nerve response. The BMS sufferer was characterized as having mild sensory and autonomic small fiber neuropathy with concomitant central disorders. Copyright © 2012 Elsevier Inc. All rights reserved.

  17. Correlation and simple linear regression.

    PubMed

    Eberly, Lynn E

    2007-01-01

    This chapter highlights important steps in using correlation and simple linear regression to address scientific questions about the association of two continuous variables with each other. These steps include estimation and inference, assessing model fit, the connection between regression and ANOVA, and study design. Examples in microbiology are used throughout. This chapter provides a framework that is helpful in understanding more complex statistical techniques, such as multiple linear regression, linear mixed effects models, logistic regression, and proportional hazards regression.

  18. Multiple Imputation of a Randomly Censored Covariate Improves Logistic Regression Analysis.

    PubMed

    Atem, Folefac D; Qian, Jing; Maye, Jacqueline E; Johnson, Keith A; Betensky, Rebecca A

    2016-01-01

    Randomly censored covariates arise frequently in epidemiologic studies. The most commonly used methods, including complete case and single imputation or substitution, suffer from inefficiency and bias. They make strong parametric assumptions or they consider limit of detection censoring only. We employ multiple imputation, in conjunction with semi-parametric modeling of the censored covariate, to overcome these shortcomings and to facilitate robust estimation. We develop a multiple imputation approach for randomly censored covariates within the framework of a logistic regression model. We use the non-parametric estimate of the covariate distribution or the semiparametric Cox model estimate in the presence of additional covariates in the model. We evaluate this procedure in simulations, and compare its operating characteristics to those from the complete case analysis and a survival regression approach. We apply the procedures to an Alzheimer's study of the association between amyloid positivity and maternal age of onset of dementia. Multiple imputation achieves lower standard errors and higher power than the complete case approach under heavy and moderate censoring and is comparable under light censoring. The survival regression approach achieves the highest power among all procedures, but does not produce interpretable estimates of association. Multiple imputation offers a favorable alternative to complete case analysis and ad hoc substitution methods in the presence of randomly censored covariates within the framework of logistic regression.

  19. Multidimensional Ultrasound Imaging of the Wrist: Changes of Shape and Displacement of the Median Nerve and Tendons in Carpal Tunnel Syndrome

    PubMed Central

    Filius, Anika; Scheltens, Marjan; Bosch, Hans G.; van Doorn, Pieter A.; Stam, Henk J.; Hovius, Steven E.R.; Amadio, Peter C.; Selles, Ruud W.

    2015-01-01

    Dynamics of structures within the carpal tunnel may alter in carpal tunnel syndrome (CTS) due to fibrotic changes and increased carpal tunnel pressure. Ultrasound can visualize these potential changes, making ultrasound potentially an accurate diagnostic tool. To study this, we imaged the carpal tunnel of 113 patients and 42 controls. CTS severity was classified according to validated clinical and nerve conduction study (NCS) classifications. Transversal and longitudinal displacement and shape (changes) were calculated for the median nerve, tendons and surrounding tissue. To predict diagnostic value binary logistic regression modeling was applied. Reduced longitudinal nerve displacement (p≤0.019), increased nerve cross-sectional area (p≤0.006) and perimeter (p≤0.007), and a trend of relatively changed tendon displacements were seen in patients. Changes were more convincing when CTS was classified as more severe. Binary logistic modeling to diagnose CTS using ultrasound showed a sensitivity of 70-71% and specificity of 80-84%. In conclusion, CTS patients have altered dynamics of structures within the carpal tunnel. PMID:25865180

  20. Multinomial logistic regression in workers' health

    NASA Astrophysics Data System (ADS)

    Grilo, Luís M.; Grilo, Helena L.; Gonçalves, Sónia P.; Junça, Ana

    2017-11-01

    In European countries, namely in Portugal, it is common to hear some people mentioning that they are exposed to excessive and continuous psychosocial stressors at work. This is increasing in diverse activity sectors, such as the Services sector. A representative sample was collected from a Portuguese Services' organization by applying an internationally validated survey, whose variables were measured in five ordered categories on a Likert-type scale. A multinomial logistic regression model is used to estimate the probability of each category of the dependent variable, general health perception, where, among other independent variables, burnout appears as statistically significant.

  1. Blastocoele expansion degree predicts live birth after single blastocyst transfer for fresh and vitrified/warmed single blastocyst transfer cycles.

    PubMed

    Du, Qing-Yun; Wang, En-Yin; Huang, Yan; Guo, Xiao-Yi; Xiong, Yu-Jing; Yu, Yi-Ping; Yao, Gui-Dong; Shi, Sen-Lin; Sun, Ying-Pu

    2016-04-01

    To evaluate the independent effects of the degree of blastocoele expansion and re-expansion and the inner cell mass (ICM) and trophectoderm (TE) grades on predicting live birth after fresh and vitrified/warmed single blastocyst transfer. Retrospective study. Reproductive medical center. Women undergoing 844 fresh and 370 vitrified/warmed single blastocyst transfer cycles. None. Live-birth rate correlated with blastocyst morphology parameters by logistic regression analysis and Spearman correlations analysis. The degree of blastocoele expansion and re-expansion was the only blastocyst morphology parameter that exhibited a significant ability to predict live birth in both fresh and vitrified/warmed single blastocyst transfer cycles respectively by multivariate logistic regression and Spearman correlations analysis. Although the ICM grade was significantly related to live birth in fresh cycles according to the univariate model, its effect was not maintained in the multivariate logistic analysis. In vitrified/warmed cycles, neither ICM nor TE grade was correlated with live birth by logistic regression analysis. This study is the first to confirm that the degree of blastocoele expansion and re-expansion is a better predictor of live birth after both fresh and vitrified/warmed single blastocyst transfer cycles than ICM or TE grade. Copyright © 2016. Published by Elsevier Inc.

  2. Characterization of Microbiota in Children with Chronic Functional Constipation.

    PubMed

    de Meij, Tim G J; de Groot, Evelien F J; Eck, Anat; Budding, Andries E; Kneepkens, C M Frank; Benninga, Marc A; van Bodegraven, Adriaan A; Savelkoul, Paul H M

    2016-01-01

    Disruption of the intestinal microbiota is considered an etiological factor in pediatric functional constipation. Scientifically based selection of potential beneficial probiotic strains in functional constipation therapy is not feasible due to insufficient knowledge of microbiota composition in affected subjects. The aim of this study was to describe microbial composition and diversity in children with functional constipation, compared to healthy controls. Fecal samples from 76 children diagnosed with functional constipation according to the Rome III criteria (median age 8.0 years; range 4.2-17.8) were analyzed by IS-pro, a PCR-based microbiota profiling method. Outcome was compared with intestinal microbiota profiles of 61 healthy children (median 8.6 years; range 4.1-17.9). Microbiota dissimilarity was depicted by principal coordinate analysis (PCoA), and diversity was calculated using the Shannon diversity index. To determine the most discriminative species, cross-validated logistic ridge regression was performed. Applying total microbiota profiles (all phyla together) or per phylum analysis, no disease-specific separation was observed by PCoA and by calculation of diversity indices. By ridge regression, however, functional constipation and controls could be discriminated with 82% accuracy. Most discriminative species were Bacteroides fragilis, Bacteroides ovatus, Bifidobacterium longum, Parabacteroides species (increased in functional constipation) and Alistipes finegoldii (decreased in functional constipation). None of the commonly used unsupervised statistical methods allowed for microbiota-based discrimination of children with functional constipation and controls. By ridge regression, however, both groups could be discriminated with 82% accuracy. Optimization of microbiota-based interventions in constipated children warrants further characterization of microbial signatures linked to clinical subgroups of functional constipation.
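
    The Shannon diversity index mentioned above is H = -Σ p_i ln p_i, where p_i is the relative abundance of taxon i in a sample. The sketch below computes it from raw abundance counts. The function name and the example counts are hypothetical, and the natural logarithm is an assumption, since the record does not state the base the authors used.

```python
import math


def shannon_diversity(abundances):
    """Shannon index H = -sum(p_i * ln(p_i)) over taxa with non-zero abundance."""
    total = sum(abundances)
    if total <= 0:
        raise ValueError("at least one taxon must have a positive abundance")
    proportions = [a / total for a in abundances if a > 0]
    return -sum(p * math.log(p) for p in proportions)


# Hypothetical abundance counts for one sample, for illustration only.
print(round(shannon_diversity([40, 30, 20, 10]), 3))
```

    A sample dominated by one taxon scores near zero, and a sample with S equally abundant taxa scores ln S, the maximum for that richness.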

  3. Factor complexity of crash occurrence: An empirical demonstration using boosted regression trees.

    PubMed

    Chung, Yi-Shih

    2013-12-01

    Factor complexity is a characteristic of traffic crashes. This paper proposes a novel method, namely boosted regression trees (BRT), to investigate the complex and nonlinear relationships in high-variance traffic crash data. The Taiwanese 2004-2005 single-vehicle motorcycle crash data are used to demonstrate the utility of BRT. Traditional logistic regression and classification and regression tree (CART) models are also used to compare their estimation results and external validities. Both the in-sample cross-validation and out-of-sample validation results show that an increase in tree complexity provides improved, although declining, classification performance, indicating a limited factor complexity of single-vehicle motorcycle crashes. The effects of crucial variables including geographical, time, and sociodemographic factors explain some fatal crashes. Relatively unique fatal crashes are better approximated by interactive terms, especially combinations of behavioral factors. BRT models generally provide better transferability than conventional logistic regression and CART models. This study also discusses the implications of the results for devising safety policies. Copyright © 2012 Elsevier Ltd. All rights reserved.

  4. Estimating time-varying exposure-outcome associations using case-control data: logistic and case-cohort analyses.

    PubMed

    Keogh, Ruth H; Mangtani, Punam; Rodrigues, Laura; Nguipdop Djomo, Patrick

    2016-01-05

    Traditional analyses of standard case-control studies using logistic regression do not allow estimation of time-varying associations between exposures and the outcome. We present two approaches which allow this. The motivation is a study of vaccine efficacy as a function of time since vaccination. Our first approach is to estimate time-varying exposure-outcome associations by fitting a series of logistic regressions within successive time periods, reusing controls across periods. Our second approach treats the case-control sample as a case-cohort study, with the controls forming the subcohort. In the case-cohort analysis, controls contribute information at all times they are at risk. Extensions allow left truncation, frequency matching and, using the case-cohort analysis, time-varying exposures. Simulations are used to investigate the methods. The simulation results show that both methods give correct estimates of time-varying effects of exposures using standard case-control data. Using the logistic approach there are efficiency gains by reusing controls over time and care should be taken over the definition of controls within time periods. However, using the case-cohort analysis there is no ambiguity over the definition of controls. The performance of the two analyses is very similar when controls are used most efficiently under the logistic approach. Using our methods, case-control studies can be used to estimate time-varying exposure-outcome associations where they may not previously have been considered. The case-cohort analysis has several advantages, including that it allows estimation of time-varying associations as a continuous function of time, while the logistic regression approach is restricted to assuming a step function form for the time-varying association.
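
    The first approach in the abstract above (separate logistic regressions within successive time periods) can be illustrated in its simplest form. The sketch assumes a single binary exposure and no covariates, in which case the fitted odds ratio from a logistic regression equals the cross-product ratio of the 2x2 table for that period. The function names, the table layout, and the counts are hypothetical and do not come from the paper.

```python
def odds_ratio(exposed_cases, unexposed_cases, exposed_controls, unexposed_controls):
    """Cross-product odds ratio for one 2x2 table.

    With one binary exposure and no covariates this equals exp(beta)
    from a logistic regression fitted to the same table.
    """
    if 0 in (exposed_cases, unexposed_cases, exposed_controls, unexposed_controls):
        raise ValueError("all four cells must be nonzero for a finite odds ratio")
    return (exposed_cases * unexposed_controls) / (unexposed_cases * exposed_controls)


def period_odds_ratios(tables):
    """Map each time period to its own odds ratio (a step function over time).

    `tables` maps a period label to a 4-tuple of cell counts; the same
    controls may be counted in several periods, as in the reuse of controls.
    """
    return {period: odds_ratio(*cells) for period, cells in tables.items()}


if __name__ == "__main__":
    # Made-up counts for two periods since vaccination.
    tables = {"0-5 years": (10, 40, 20, 30), "5-10 years": (15, 35, 20, 30)}
    print(period_odds_ratios(tables))
```

    The result is one estimate per period, which is the step-function form that the abstract contrasts with the continuous-time case-cohort analysis.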

  5. Methods for identifying SNP interactions: a review on variations of Logic Regression, Random Forest and Bayesian logistic regression.

    PubMed

    Chen, Carla Chia-Ming; Schwender, Holger; Keith, Jonathan; Nunkesser, Robin; Mengersen, Kerrie; Macrossan, Paula

    2011-01-01

    Due to advancements in computational ability, enhanced technology and a reduction in the price of genotyping, more data are being generated for understanding genetic associations with diseases and disorders. However, with the availability of large data sets comes the inherent challenges of new methods of statistical analysis and modeling. Considering a complex phenotype may be the effect of a combination of multiple loci, various statistical methods have been developed for identifying genetic epistasis effects. Among these methods, logic regression (LR) is an intriguing approach incorporating tree-like structures. Various methods have built on the original LR to improve different aspects of the model. In this study, we review four variations of LR, namely Logic Feature Selection, Monte Carlo Logic Regression, Genetic Programming for Association Studies, and Modified Logic Regression-Gene Expression Programming, and investigate the performance of each method using simulated and real genotype data. We contrast these with another tree-like approach, namely Random Forests, and a Bayesian logistic regression with stochastic search variable selection.

  6. Personality predicts time to remission and clinical status in hypochondriasis during a 6-year follow-up.

    PubMed

    Greeven, Anja; van Balkom, Anton J L M; Spinhoven, Philip

    2014-05-01

    We aimed to investigate whether personality characteristics predict time to remission and psychiatric status. The follow-up was at most 6 years and was performed within the scope of a randomized controlled trial that investigated the efficacy of cognitive behavioral therapy, paroxetine, and placebo in hypochondriasis. The Life Chart Interview was administered to investigate for each year if remission had occurred. Personality was assessed at pretest by the Abbreviated Dutch Temperament and Character Inventory. Cox's regression models for recurrent events were compared with logistic regression models. Sixteen (36.4%) of 44 patients achieved remission during the follow-up period. Cox's regression yielded approximately the same results as the logistic regression. Being less harm avoidant and more cooperative were associated with a shorter time to remission and a remitted state after the follow-up period. Personality variables seem to be relevant for describing patients with a more chronic course of hypochondriacal complaints.

  7. A Hybrid Approach of Stepwise Regression, Logistic Regression, Support Vector Machine, and Decision Tree for Forecasting Fraudulent Financial Statements

    PubMed Central

    Goo, Yeong-Jia James; Shen, Zone-De

    2014-01-01

    As the fraudulent financial statement of an enterprise is increasingly serious with each passing day, establishing a valid forecasting fraudulent financial statement model of an enterprise has become an important question for academic research and financial practice. After screening the important variables using the stepwise regression, the study also matches the logistic regression, support vector machine, and decision tree to construct the classification models to make a comparison. The study adopts financial and nonfinancial variables to assist in establishment of the forecasting fraudulent financial statement model. Research objects are the companies to which the fraudulent and nonfraudulent financial statement happened between years 1998 to 2012. The findings are that financial and nonfinancial information are effectively used to distinguish the fraudulent financial statement, and decision tree C5.0 has the best classification effect 85.71%. PMID:25302338

  8. Modeling brook trout presence and absence from landscape variables using four different analytical methods

    USGS Publications Warehouse

    Steen, Paul J.; Passino-Reader, Dora R.; Wiley, Michael J.

    2006-01-01

    As a part of the Great Lakes Regional Aquatic Gap Analysis Project, we evaluated methodologies for modeling associations between fish species and habitat characteristics at a landscape scale. To do this, we created brook trout Salvelinus fontinalis presence and absence models based on four different techniques: multiple linear regression, logistic regression, neural networks, and classification trees. The models were tested in two ways: by application to an independent validation database and cross-validation using the training data, and by visual comparison of statewide distribution maps with historically recorded occurrences from the Michigan Fish Atlas. Although differences in the accuracy of our models were slight, the logistic regression model predicted with the least error, followed by multiple regression, then classification trees, then the neural networks. These models will provide natural resource managers a way to identify habitats requiring protection for the conservation of fish species.

  9. A hybrid approach of stepwise regression, logistic regression, support vector machine, and decision tree for forecasting fraudulent financial statements.

    PubMed

    Chen, Suduan; Goo, Yeong-Jia James; Shen, Zone-De

    2014-01-01

    As the fraudulent financial statement of an enterprise is increasingly serious with each passing day, establishing a valid forecasting fraudulent financial statement model of an enterprise has become an important question for academic research and financial practice. After screening the important variables using the stepwise regression, the study also matches the logistic regression, support vector machine, and decision tree to construct the classification models to make a comparison. The study adopts financial and nonfinancial variables to assist in establishment of the forecasting fraudulent financial statement model. Research objects are the companies to which the fraudulent and nonfraudulent financial statement happened between years 1998 to 2012. The findings are that financial and nonfinancial information are effectively used to distinguish the fraudulent financial statement, and decision tree C5.0 has the best classification effect 85.71%.

  10. Functional Data Analysis Applied to Modeling of Severe Acute Mucositis and Dysphagia Resulting From Head and Neck Radiation Therapy

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Dean, Jamie A., E-mail: jamie.dean@icr.ac.uk; Wong, Kee H.; Gay, Hiram

    Purpose: Current normal tissue complication probability modeling using logistic regression suffers from bias and high uncertainty in the presence of highly correlated radiation therapy (RT) dose data. This hinders robust estimates of dose-response associations and, hence, optimal normal tissue–sparing strategies from being elucidated. Using functional data analysis (FDA) to reduce the dimensionality of the dose data could overcome this limitation. Methods and Materials: FDA was applied to modeling of severe acute mucositis and dysphagia resulting from head and neck RT. Functional partial least squares regression (FPLS) and functional principal component analysis were used for dimensionality reduction of the dose-volume histogram data. The reduced dose data were input into functional logistic regression models (functional partial least squares–logistic regression [FPLS-LR] and functional principal component–logistic regression [FPC-LR]) along with clinical data. This approach was compared with penalized logistic regression (PLR) in terms of predictive performance and the significance of treatment covariate–response associations, assessed using bootstrapping. Results: The area under the receiver operating characteristic curve for the PLR, FPC-LR, and FPLS-LR models was 0.65, 0.69, and 0.67, respectively, for mucositis (internal validation) and 0.81, 0.83, and 0.83, respectively, for dysphagia (external validation). The calibration slopes/intercepts for the PLR, FPC-LR, and FPLS-LR models were 1.6/−0.67, 0.45/0.47, and 0.40/0.49, respectively, for mucositis (internal validation) and 2.5/−0.96, 0.79/−0.04, and 0.79/0.00, respectively, for dysphagia (external validation). The bootstrapped odds ratios indicated significant associations between RT dose and severe toxicity in the mucositis and dysphagia FDA models. Cisplatin was significantly associated with severe dysphagia in the FDA models.
    None of the covariates was significantly associated with severe toxicity in the PLR models. Dose levels greater than approximately 1.0 Gy/fraction were most strongly associated with severe acute mucositis and dysphagia in the FDA models. Conclusions: FPLS and functional principal component analysis marginally improved predictive performance compared with PLR and provided robust dose-response associations. FDA is recommended for use in normal tissue complication probability modeling.

  11. Functional Data Analysis Applied to Modeling of Severe Acute Mucositis and Dysphagia Resulting From Head and Neck Radiation Therapy.

    PubMed

    Dean, Jamie A; Wong, Kee H; Gay, Hiram; Welsh, Liam C; Jones, Ann-Britt; Schick, Ulrike; Oh, Jung Hun; Apte, Aditya; Newbold, Kate L; Bhide, Shreerang A; Harrington, Kevin J; Deasy, Joseph O; Nutting, Christopher M; Gulliford, Sarah L

    2016-11-15

    Current normal tissue complication probability modeling using logistic regression suffers from bias and high uncertainty in the presence of highly correlated radiation therapy (RT) dose data. This hinders robust estimates of dose-response associations and, hence, optimal normal tissue-sparing strategies from being elucidated. Using functional data analysis (FDA) to reduce the dimensionality of the dose data could overcome this limitation. FDA was applied to modeling of severe acute mucositis and dysphagia resulting from head and neck RT. Functional partial least squares regression (FPLS) and functional principal component analysis were used for dimensionality reduction of the dose-volume histogram data. The reduced dose data were input into functional logistic regression models (functional partial least squares-logistic regression [FPLS-LR] and functional principal component-logistic regression [FPC-LR]) along with clinical data. This approach was compared with penalized logistic regression (PLR) in terms of predictive performance and the significance of treatment covariate-response associations, assessed using bootstrapping. The area under the receiver operating characteristic curve for the PLR, FPC-LR, and FPLS-LR models was 0.65, 0.69, and 0.67, respectively, for mucositis (internal validation) and 0.81, 0.83, and 0.83, respectively, for dysphagia (external validation). The calibration slopes/intercepts for the PLR, FPC-LR, and FPLS-LR models were 1.6/-0.67, 0.45/0.47, and 0.40/0.49, respectively, for mucositis (internal validation) and 2.5/-0.96, 0.79/-0.04, and 0.79/0.00, respectively, for dysphagia (external validation). The bootstrapped odds ratios indicated significant associations between RT dose and severe toxicity in the mucositis and dysphagia FDA models. Cisplatin was significantly associated with severe dysphagia in the FDA models. None of the covariates was significantly associated with severe toxicity in the PLR models. 
    Dose levels greater than approximately 1.0 Gy/fraction were most strongly associated with severe acute mucositis and dysphagia in the FDA models. FPLS and functional principal component analysis marginally improved predictive performance compared with PLR and provided robust dose-response associations. FDA is recommended for use in normal tissue complication probability modeling. Copyright © 2016 The Author(s). Published by Elsevier Inc. All rights reserved.

  12. Methodologic considerations in the design and analysis of nested case-control studies: association between cytokines and postoperative delirium.

    PubMed

    Ngo, Long H; Inouye, Sharon K; Jones, Richard N; Travison, Thomas G; Libermann, Towia A; Dillon, Simon T; Kuchel, George A; Vasunilashorn, Sarinnapha M; Alsop, David C; Marcantonio, Edward R

    2017-06-06

    The nested case-control study (NCC) design within a prospective cohort study is used when outcome data are available for all subjects, but the exposure of interest has not been collected, and is difficult or prohibitively expensive to obtain for all subjects. A NCC analysis with good matching procedures yields estimates that are as efficient and unbiased as estimates from the full cohort study. We present methodological considerations in a matched NCC design and analysis, which include the choice of match algorithms, analysis methods to evaluate the association of exposures of interest with outcomes, and consideration of overmatching. Matched, NCC design within a longitudinal observational prospective cohort study in the setting of two academic hospitals. Study participants are patients aged over 70 years who underwent scheduled major non-cardiac surgery. The primary outcome was postoperative delirium from in-hospital interviews and medical record review. The main exposure was IL-6 concentration (pg/ml) from blood sampled at three time points before delirium occurred. We used nonparametric signed ranked test to test for the median of the paired differences. We used conditional logistic regression to model the risk of IL-6 on delirium incidence. Simulation was used to generate a sample of cohort data on which unconditional multivariable logistic regression was used, and the results were compared to those of the conditional logistic regression. Partial R-square was used to assess the level of overmatching. We found that the optimal match algorithm yielded more matched pairs than the greedy algorithm. The choice of analytic strategy (whether to consider measured cytokine levels as the predictor or outcome) yielded inferences that have different clinical interpretations but similar levels of statistical significance.
    Estimation results from NCC design using conditional logistic regression, and from simulated cohort design using unconditional logistic regression, were similar. We found minimal evidence for overmatching. Using a matched NCC approach introduces methodological challenges into the study design and data analysis. Nonetheless, with careful selection of the match algorithm, match factors, and analysis methods, this design is cost effective and, for our study, yields estimates that are similar to those from a prospective cohort study design.

  13. Discriminating between adaptive and carcinogenic liver hypertrophy in rat studies using logistic ridge regression analysis of toxicogenomic data: The mode of action and predictive models

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Liu, Shujie; Kawamoto, Taisuke; Morita, Osamu

    Chemical exposure often results in liver hypertrophy in animal tests, characterized by increased liver weight, hepatocellular hypertrophy, and/or cell proliferation. While most of these changes are considered adaptive responses, there is concern that they may be associated with carcinogenesis. In this study, we have employed a toxicogenomic approach using a logistic ridge regression model to identify genes responsible for liver hypertrophy and hypertrophic hepatocarcinogenesis and to develop a predictive model for assessing hypertrophy-inducing compounds. Logistic regression models have previously been used in the quantification of epidemiological risk factors. DNA microarray data from the Toxicogenomics Project-Genomics Assisted Toxicity Evaluation System were used to identify hypertrophy-related genes that are expressed differently in hypertrophy induced by carcinogens and non-carcinogens. Data were collected for 134 chemicals (72 non-hypertrophy-inducing chemicals, 27 hypertrophy-inducing non-carcinogenic chemicals, and 15 hypertrophy-inducing carcinogenic compounds). After applying logistic ridge regression analysis, 35 genes for liver hypertrophy (e.g., Acot1 and Abcc3) and 13 genes for hypertrophic hepatocarcinogenesis (e.g., Asns and Gpx2) were selected. The predictive models built using these genes were 94.8% and 82.7% accurate, respectively. Pathway analysis of the genes indicates that, aside from a xenobiotic metabolism-related pathway as an adaptive response for liver hypertrophy, amino acid biosynthesis and oxidative responses appear to be involved in hypertrophic hepatocarcinogenesis. Early detection and toxicogenomic characterization of liver hypertrophy using our models may be useful for predicting carcinogenesis. In addition, the identified genes provide novel insight into discrimination between adverse hypertrophy associated with carcinogenesis and adaptive hypertrophy in risk assessment.
    Highlights: • Hypertrophy (H) and hypertrophic carcinogenesis (C) were studied by toxicogenomics. • Important genes for H and C were selected by logistic ridge regression analysis. • Amino acid biosynthesis and oxidative responses may be involved in C. • Predictive models for H and C provided 94.8% and 82.7% accuracy, respectively. • The identified genes could be useful for assessment of liver hypertrophy.

  14. A comparative analysis of predictive models of morbidity in intensive care unit after cardiac surgery - part II: an illustrative example.

    PubMed

    Cevenini, Gabriele; Barbini, Emanuela; Scolletta, Sabino; Biagioli, Bonizella; Giomarelli, Pierpaolo; Barbini, Paolo

    2007-11-22

    Popular predictive models for estimating morbidity probability after heart surgery are compared critically in a unitary framework. The study is divided into two parts. In the first part modelling techniques and intrinsic strengths and weaknesses of different approaches were discussed from a theoretical point of view. In this second part the performances of the same models are evaluated in an illustrative example. Eight models were developed: Bayes linear and quadratic models, k-nearest neighbour model, logistic regression model, Higgins and direct scoring systems and two feed-forward artificial neural networks with one and two layers. Cardiovascular, respiratory, neurological, renal, infectious and hemorrhagic complications were defined as morbidity. Training and testing sets each of 545 cases were used. The optimal set of predictors was chosen among a collection of 78 preoperative, intraoperative and postoperative variables by a stepwise procedure. Discrimination and calibration were evaluated by the area under the receiver operating characteristic curve and Hosmer-Lemeshow goodness-of-fit test, respectively. Scoring systems and the logistic regression model required the largest set of predictors, while Bayesian and k-nearest neighbour models were much more parsimonious. In testing data, all models showed acceptable discrimination capacities, however the Bayes quadratic model, using only three predictors, provided the best performance. All models showed satisfactory generalization ability: again the Bayes quadratic model exhibited the best generalization, while artificial neural networks and scoring systems gave the worst results. Finally, poor calibration was obtained when using scoring systems, k-nearest neighbour model and artificial neural networks, while Bayes (after recalibration) and logistic regression models gave adequate results. 
    Although all the predictive models showed acceptable discrimination performance in the example considered, the Bayes and logistic regression models seemed better than the others, because they also had good generalization and calibration. The Bayes quadratic model seemed to be a convincing alternative to the much more usual Bayes linear and logistic regression models. It showed its capacity to identify a minimum core of predictors generally recognized as essential to pragmatically evaluate the risk of developing morbidity after heart surgery.
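
    Discrimination in the abstract above is measured by the area under the receiver operating characteristic curve. A minimal sketch of that quantity follows, using its standard interpretation as the probability that a randomly chosen positive case receives a higher score than a randomly chosen negative case, with ties counted as one half. The function name and the example scores are hypothetical; the paper's own software is not described in the abstract.

```python
def auc(scores_pos, scores_neg):
    """Area under the ROC curve via pairwise comparison of scores.

    Counts the fraction of (positive, negative) pairs in which the
    positive case has the higher score; ties contribute 0.5.
    """
    if not scores_pos or not scores_neg:
        raise ValueError("need at least one positive and one negative score")
    wins = 0.0
    for p in scores_pos:
        for n in scores_neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(scores_pos) * len(scores_neg))


if __name__ == "__main__":
    # Made-up predicted probabilities for cases with and without morbidity.
    print(auc([0.9, 0.4], [0.5, 0.1]))
```

    The pairwise loop is quadratic in the number of cases, which is acceptable for samples of a few hundred records such as the 545-case sets mentioned in the abstract.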

  15. Toward a model for improved targeting of aged at risk of institutionalization.

    PubMed Central

    Weissert, W G; Cready, C M

    1989-01-01

    A national sample of institutionalized and noninstitutionalized aged was created by merging the 1977 National Nursing Home Survey and its counterpart, the National Health Interview Survey for the same year. A weighted logistic regression analysis was conducted to identify factors that might be useful in calculating home- and community-based long-term care clients' risk of institutionalization. A model containing patient characteristics, nursing home bed supply, and a climate variable correctly classified 98.2 percent of cases residing in nursing homes or the community. Physical dependency, mental disorder and degenerative disease, lack of spouse, being white, poverty, old age, unoccupied nursing home beds, and climate all appear to be determinants of institutional residency among the aged. PMID:2807934

  16. Independent Prognostic Factors for Acute Organophosphorus Pesticide Poisoning.

    PubMed

    Tang, Weidong; Ruan, Feng; Chen, Qi; Chen, Suping; Shao, Xuebo; Gao, Jianbo; Zhang, Mao

    2016-07-01

    Acute organophosphorus pesticide poisoning (AOPP) is becoming a significant problem and a potential cause of human mortality because of the abuse of organophosphate compounds. This study aims to determine the independent prognostic factors of AOPP by using multivariate logistic regression analysis. The clinical data for 71 subjects with AOPP admitted to our hospital were retrospectively analyzed. This information included the Acute Physiology and Chronic Health Evaluation II (APACHE II) scores, 6-h post-admission blood lactate levels, post-admission 6-h lactate clearance rates, admission blood cholinesterase levels, 6-h post-admission blood cholinesterase levels, cholinesterase activity, blood pH, and other factors. Univariate analysis and multivariate logistic regression analyses were conducted to identify all prognostic factors and independent prognostic factors, respectively. A receiver operating characteristic curve was plotted to analyze the testing power of independent prognostic factors. Twelve of 71 subjects died. Admission blood lactate levels, 6-h post-admission blood lactate levels, post-admission 6-h lactate clearance rates, blood pH, and APACHE II scores were identified as prognostic factors for AOPP according to the univariate analysis, whereas only 6-h post-admission blood lactate levels, post-admission 6-h lactate clearance rates, and blood pH were independent prognostic factors identified by multivariate logistic regression analysis. The receiver operating characteristic analysis suggested that post-admission 6-h lactate clearance rates were of moderate diagnostic value. High 6-h post-admission blood lactate levels, low blood pH, and low post-admission 6-h lactate clearance rates were independent prognostic factors identified by multivariate logistic regression analysis. Copyright © 2016 by Daedalus Enterprises.

  17. A local equation for differential diagnosis of β-thalassemia trait and iron deficiency anemia by logistic regression analysis in Southeast Iran.

    PubMed

    Sargolzaie, Narjes; Miri-Moghaddam, Ebrahim

    2014-01-01

    The most common differential diagnosis of β-thalassemia (β-thal) trait is iron deficiency anemia. Several red blood cell equations were introduced during different studies for differential diagnosis between β-thal trait and iron deficiency anemia. Due to genetic variations in different regions, these equations cannot be useful in all populations. The aim of this study was to determine a native equation with high accuracy for differential diagnosis of β-thal trait and iron deficiency anemia for the Sistan and Baluchestan population by logistic regression analysis. We selected 77 iron deficiency anemia and 100 β-thal trait cases. We used binary logistic regression analysis and determined best equations for probability prediction of β-thal trait against iron deficiency anemia in our population. We compared diagnostic values and receiver operating characteristic (ROC) curve related to this equation and another 10 published equations in discriminating β-thal trait and iron deficiency anemia. The binary logistic regression analysis determined the best equation for best probability prediction of β-thal trait against iron deficiency anemia with area under curve (AUC) 0.998. Based on ROC curves and AUC, Green & King, England & Frazer, and then Sirdah indices, respectively, had the most accuracy after our equation. We suggest that to get the best equation and cut-off in each region, one needs to evaluate specific information of each region, specifically in areas where populations are homogeneous, to provide a specific formula for differentiating between β-thal trait and iron deficiency anemia.

  18. Selenium in irrigated agricultural areas of the western United States

    USGS Publications Warehouse

    Nolan, B.T.; Clark, M.L.

    1997-01-01

    A logistic regression model was developed to predict the likelihood that Se exceeds the USEPA chronic criterion for aquatic life (5 µg/L) in irrigated agricultural areas of the western USA. Preliminary analysis of explanatory variables used in the model indicated that surface-water Se concentration increased with increasing dissolved solids (DS) concentration and with the presence of Upper Cretaceous, mainly marine sediment. The presence or absence of Cretaceous sediment was the major variable affecting Se concentration in surface-water samples from the National Irrigation Water Quality Program. Median Se concentration was 14 µg/L in samples from areas underlain by Cretaceous sediments and < 1 µg/L in samples from areas underlain by non-Cretaceous sediments. Wilcoxon rank sum tests indicated that elevated Se concentrations in samples from areas with Cretaceous sediments, irrigated areas, and from closed lakes and ponds were statistically significant. Spearman correlations indicated that Se was positively correlated with a binary geology variable (0.64) and DS (0.45). Logistic regression models indicated that the concentration of Se in surface water was almost certain to exceed the Environmental Protection Agency aquatic-life chronic criterion of 5 µg/L when DS was greater than 3000 mg/L in areas with Cretaceous sediments. The 'best' logistic regression model correctly predicted Se exceedances and nonexceedances 84.4% of the time, and model sensitivity was 80.7%. A regional map of Cretaceous sediment showed the location of potential problem areas. The map and logistic regression model are tools that can be used to determine the potential for Se contamination of irrigated agricultural areas in the western USA.
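
    The two performance figures quoted in the abstract above (overall correct prediction rate and sensitivity) are both derived from the confusion matrix of predicted versus observed exceedances. The sketch below shows the standard definitions; the function names and the counts are hypothetical and are not the study's data.

```python
def accuracy(tp, fp, tn, fn):
    """Share of all samples classified correctly (exceedances and nonexceedances)."""
    total = tp + fp + tn + fn
    if total == 0:
        raise ValueError("confusion matrix is empty")
    return (tp + tn) / total


def sensitivity(tp, fn):
    """Share of true exceedances that the model flagged as exceedances."""
    if tp + fn == 0:
        raise ValueError("no observed exceedances")
    return tp / (tp + fn)


if __name__ == "__main__":
    # Made-up counts: tp = exceedances predicted correctly, fn = exceedances missed,
    # tn = nonexceedances predicted correctly, fp = false alarms.
    print(accuracy(tp=8, fp=1, tn=9, fn=2))
    print(sensitivity(tp=8, fn=2))
```

    Reporting both numbers matters because a model can score well on overall accuracy while still missing a large share of the exceedances that are of regulatory interest.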

  19. Improving virtual screening predictive accuracy of Human kallikrein 5 inhibitors using machine learning models.

    PubMed

    Fang, Xingang; Bagui, Sikha; Bagui, Subhash

    2017-08-01

    The readily available high throughput screening (HTS) data from the PubChem database provides an opportunity for mining of small molecules in a variety of biological systems using machine learning techniques. From the thousands of available molecular descriptors developed to encode useful chemical information representing the characteristics of molecules, descriptor selection is an essential step in building an optimal quantitative structural-activity relationship (QSAR) model. For the development of a systematic descriptor selection strategy, we need the understanding of the relationship between: (i) the descriptor selection; (ii) the choice of the machine learning model; and (iii) the characteristics of the target bio-molecule. In this work, we employed the Signature descriptor to generate a dataset on the Human kallikrein 5 (hK 5) inhibition confirmatory assay data and compared multiple classification models including logistic regression, support vector machine, random forest and k-nearest neighbor. Under optimal conditions, the logistic regression model provided extremely high overall accuracy (98%) and precision (90%), with good sensitivity (65%) in the cross validation test. In testing the primary HTS screening data with more than 200K molecular structures, the logistic regression model exhibited the capability of eliminating more than 99.9% of the inactive structures. As part of our exploration of the descriptor-model-target relationship, the excellent predictive performance of the combination of the Signature descriptor and the logistic regression model on the assay data of the Human kallikrein 5 (hK 5) target suggested a feasible descriptor/model selection strategy on similar targets. Copyright © 2017 Elsevier Ltd. All rights reserved.

  20. Non-ignorable missingness in logistic regression.

    PubMed

    Wang, Joanna J J; Bartlett, Mark; Ryan, Louise

    2017-08-30

    Nonresponses and missing data are common in observational studies. Ignoring or inadequately handling missing data may lead to biased parameter estimation, incorrect standard errors and, as a consequence, incorrect statistical inference and conclusions. We present a strategy for modelling non-ignorable missingness where the probability of nonresponse depends on the outcome. Using a simple case of logistic regression, we quantify the bias in regression estimates and show the observed likelihood is non-identifiable under non-ignorable missing data mechanism. We then adopt a selection model factorisation of the joint distribution as the basis for a sensitivity analysis to study changes in estimated parameters and the robustness of study conclusions against different assumptions. A Bayesian framework for model estimation is used as it provides a flexible approach for incorporating different missing data assumptions and conducting sensitivity analysis. Using simulated data, we explore the performance of the Bayesian selection model in correcting for bias in a logistic regression. We then implement our strategy using survey data from the 45 and Up Study to investigate factors associated with worsening health from the baseline to follow-up survey. Our findings have practical implications for the use of the 45 and Up Study data to answer important research questions relating to health and quality-of-life. Copyright © 2017 John Wiley & Sons, Ltd.

  1. Prediction model for the return to work of workers with injuries in Hong Kong.

    PubMed

    Xu, Yanwen; Chan, Chetwyn C H; Lo, Karen Hui Yu-Ling; Tang, Dan

    2008-01-01

    This study attempts to formulate a prediction model of return to work for a group of workers in Hong Kong who have been suffering from chronic pain and physical injury while also being out of work. The study used the Case-based Reasoning (CBR) method and compared the results with those of a logistic regression model. The CBR database was composed of 67 cases, which were also used in the logistic regression model. The testing cases were 32 participants who had a similar background and characteristics to those in the database. Constraint setting and a Euclidean distance metric were used in CBR to search for the cases closest to the trial case based on the matrix. The usefulness of the algorithm was tested on 32 new participants, and the accuracy of predicting return to work outcomes was 62.5%, which was no better than the 71.2% accuracy derived from the logistic regression model. The results of the study would enable us to have a better understanding of the CBR applied in the field of occupational rehabilitation by comparing with the conventional regression analysis. The findings would also shed light on the development of relevant interventions for the return-to-work process of these workers.

  2. Factors associated with mouth breathing in children with developmental disabilities.

    PubMed

    de Castilho, Lia Silva; Abreu, Mauro Henrique Nogueira Guimarães; de Oliveira, Renata Batista; Souza E Silva, Maria Elisa; Resende, Vera Lúcia Silva

    2016-01-01

    To investigate the prevalence and factors associated with mouth breathing among patients with developmental disabilities of a dental service. We analyzed 408 dental records. Mouth breathing was reported by the patients' parents and from direct observation. Other variables were as follows: history of asthma, bronchitis, palate shape, pacifier use, thumb sucking, nail biting, use of medications, gastroesophageal reflux, bruxism, gender, age, and diagnosis of the patient. Statistical analysis included descriptive analysis with ratio calculation and multiple logistic regression. Variables with p < 0.25 were included in the model to estimate the adjusted OR (95% CI), calculated by the forward stepwise method. Variables with p < 0.05 were kept in the model. Being male (p = 0.016) and use of centrally acting drugs (p = 0.001) were the variables that remained in the model. Among patients with developmental disabilities, boys and psychotropic drug users had a greater chance of being mouth breathers. © 2016 Special Care Dentistry Association and Wiley Periodicals, Inc.

  3. Prevalence and psychosocial correlates of current smoking among adolescent students in Thailand, 2005.

    PubMed

    McKnight-Eily, Lela; Arrazola, René; Merritt, Robert; Malarcher, Ann; Sirichotiratana, Nithat

    2010-12-01

    This article examines the prevalence of current smoking and associated psychosocial correlates and whether these correlates differ by sex among adolescent students in Thailand. Data were analyzed from the Thailand Global Youth Tobacco Survey (GYTS), a school-based, cross-sectional survey conducted in 2005 and completed by Mathayom 1, 2, and 3 (U.S. seventh through ninth grades) students. Weighted prevalence estimates of the percentage of students who were current smokers (smoked on ≥ 1 day during the past 30 days) and noncurrent smokers were calculated for the sample and for each psychosocial variable. Separate logistic regression models were calculated for males and females to examine the independent association of the psychosocial correlates of current smoking. Significant correlates for both males and females included close peer smoking, secondhand smoke exposure, being offered a free cigarette by a tobacco industry representative, and belief that smoking is not harmful. These correlates are examined in the context of comprehensive tobacco control laws in Thailand.

  4. The use of mifepristone in abortion associated with an increased risk of uterine leiomyomas

    PubMed Central

    Shen, Qi; Shu, Li; Luo, Hui; Hu, Xiaoli; Zhu, Xueqiong

    2017-01-01

    To investigate the association between widespread use of mifepristone in abortions and risk of uterine leiomyomas. We conducted a case-control study of 305 patients with uterine leiomyomas between January 2011 and July 2012; 311 women with ordinary vaginitis were selected as controls during the same period. Data were collected by questionnaires (including past history, life history, menstruation history, reproductive history, abortion history, the use of mifepristone, and uterine leiomyomas risk factors) and analyzed by univariate and multivariate conditional logistic regression; odds ratios and their 95% confidence intervals were calculated to estimate the risk for uterine leiomyomas. Abortion with mifepristone was one of the risk factors for uterine leiomyomas, and the risk increased with increasing frequency of mifepristone use. Family history of uterine leiomyomas, body mass index, age at menarche, number of full-term deliveries, and medical abortion history were also correlated with uterine leiomyomas. The use of mifepristone in abortion will increase the risk of developing uterine leiomyomas. PMID:28445268

  5. Visual Acuity’s Association with Levels of Leisure-Time Physical Activity Among Community-Dwelling Older Adults

    PubMed Central

    Swanson, Mark W; Bodner, Eric; Sawyer, Patricia; Allman, Richard

    2013-01-01

    Little is known about the effect of reduced vision on physical activity in older adults. This study evaluates the association of visual acuity level, self-reported vision and ocular disease conditions with leisure-time physical activity and calculated caloric expenditure. A cross-sectional study of 911 subjects 65 yr and older from the University of Alabama at Birmingham Study of Aging (SOA) cohort was conducted evaluating the association of vision-related variables to weekly kilocalorie expenditure calculated from the 17-item Leisure Time Physical Activity Questionnaire. Ordinal logistic regression was used to evaluate possible associations controlling for potential confounders. In multivariate analyses, each lower step in visual acuity category below 20/50 was significantly associated with reduced odds of having a higher level of physical activity (OR 0.81; 95% CI 0.67, 0.97). Reduced visual acuity appears to be independently associated with lower levels of physical activity among community-dwelling adults. PMID:21945888
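
    To make the reported odds ratio easier to work with, the sketch below converts it to the log-odds scale on which ordinal logistic regression coefficients are estimated. It assumes the interval is a symmetric Wald interval on the log scale (a common convention, but not stated in the abstract) and that the per-step effect compounds multiplicatively across acuity categories. The function name is hypothetical.

```python
import math


def log_odds_from_or(odds_ratio, ci_low, ci_high, z=1.96):
    """Recover a coefficient and its standard error from an OR and 95% CI.

    Assumes a Wald interval: exp(beta - z*se) to exp(beta + z*se).
    """
    beta = math.log(odds_ratio)
    se = (math.log(ci_high) - math.log(ci_low)) / (2 * z)
    return beta, se


# Values quoted in the abstract: OR 0.81, 95% CI 0.67 to 0.97.
beta, se = log_odds_from_or(0.81, 0.67, 0.97)

# Under the compounding assumption, two steps lower in acuity
# multiply the odds by 0.81 twice.
two_step_or = math.exp(2 * beta)
print(f"beta={beta:.4f} se={se:.4f} two-step OR={two_step_or:.4f}")
```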

  6. The use of mifepristone in abortion associated with an increased risk of uterine leiomyomas.

    PubMed

    Shen, Qi; Shu, Li; Luo, Hui; Hu, Xiaoli; Zhu, Xueqiong

    2017-04-01

    To investigate the association between widespread use of mifepristone in abortions and risk of uterine leiomyomas. We conducted a case-control study of 305 patients with uterine leiomyomas between January 2011 and July 2012; 311 women with ordinary vaginitis were selected as controls during the same period. Data were collected by questionnaires (including past history, life history, menstruation history, reproductive history, abortion history, the use of mifepristone, and uterine leiomyomas risk factors) and analyzed by univariate and multivariate conditional logistic regression; odds ratios and their 95% confidence intervals were calculated to estimate the risk for uterine leiomyomas. Abortion with mifepristone was one of the risk factors for uterine leiomyomas, and the risk increased with increasing frequency of mifepristone use. Family history of uterine leiomyomas, body mass index, age at menarche, number of full-term deliveries, and medical abortion history were also correlated with uterine leiomyomas. The use of mifepristone in abortion will increase the risk of developing uterine leiomyomas.

  7. Integration of logistic regression, Markov chain and cellular automata models to simulate urban expansion

    NASA Astrophysics Data System (ADS)

    Jokar Arsanjani, Jamal; Helbich, Marco; Kainz, Wolfgang; Darvishi Boloorani, Ali

    2013-04-01

    This research analyses the suburban expansion in the metropolitan area of Tehran, Iran. A hybrid model consisting of logistic regression model, Markov chain (MC), and cellular automata (CA) was designed to improve the performance of the standard logistic regression model. Environmental and socio-economic variables dealing with urban sprawl were operationalised to create a probability surface of spatiotemporal states of built-up land use for the years 2006, 2016, and 2026. For validation, the model was evaluated by means of relative operating characteristic values for different sets of variables. The approach was calibrated for 2006 by cross comparing of actual and simulated land use maps. The achieved outcomes represent a match of 89% between simulated and actual maps of 2006, which was satisfactory to approve the calibration process. Thereafter, the calibrated hybrid approach was implemented for forthcoming years. Finally, future land use maps for 2016 and 2026 were predicted by means of this hybrid approach. The simulated maps illustrate a new wave of suburban development in the vicinity of Tehran at the western border of the metropolis during the next decades.

  8. Association between cardiovascular risk factors and carotid intima-media thickness in prepubertal Brazilian children.

    PubMed

    Gazolla, Fernanda Mussi; Neves Bordallo, Maria Alice; Madeira, Isabel Rey; de Miranda Carvalho, Cecilia Noronha; Vieira Monteiro, Alexandra Maria; Pinheiro Rodrigues, Nádia Cristina; Borges, Marcos Antonio; Collett-Solberg, Paulo Ferrez; Muniz, Bruna Moreira; de Oliveira, Cecilia Lacroix; Pinheiro, Suellen Martins; de Queiroz Ribeiro, Rebeca Mathias

    2015-05-01

    Early exposure to cardiovascular risk factors creates a chronic inflammatory state that could damage the endothelium followed by thickening of the carotid intima-media. To investigate the association of cardiovascular risk factors and thickening of the carotid intima-media in prepubertal children. In this cross-sectional study, carotid intima-media thickness (cIMT) and cardiovascular risk factors were assessed in 129 prepubertal children aged 5 to 10 years. Association was assessed by simple and multivariate logistic regression analyses. In simple logistic regression analyses, body mass index (BMI) z-score, waist circumference, and systolic blood pressure (SBP) were positively associated with increased left, right, and average cIMT, whereas diastolic blood pressure was positively associated only with increased left and average cIMT (p<0.05). In multivariate logistic regression analyses, increased left cIMT was positively associated with BMI z-score and SBP, and increased average cIMT was positively associated only with SBP (p<0.05). BMI z-score and SBP were the strongest risk factors for increased cIMT.

  9. New machine-learning algorithms for prediction of Parkinson's disease

    NASA Astrophysics Data System (ADS)

    Mandal, Indrajit; Sairam, N.

    2014-03-01

    This article presents an enhanced prediction accuracy of diagnosis of Parkinson's disease (PD) to prevent the delay and misdiagnosis of patients using the proposed robust inference system. New machine-learning methods are proposed and performance comparisons are based on specificity, sensitivity, accuracy and other measurable parameters. The robust methods of treating Parkinson's disease (PD) include sparse multinomial logistic regression, rotation forest ensemble with support vector machines and principal components analysis, artificial neural networks, and boosting methods. A new ensemble method comprising the Bayesian network optimised by Tabu search algorithm as classifier and Haar wavelets as projection filter is used for relevant feature selection and ranking. The highest accuracy obtained by linear logistic regression and sparse multinomial logistic regression is 100% and sensitivity, specificity of 0.983 and 0.996, respectively. All the experiments are conducted over 95% and 99% confidence levels and establish the results with corrected t-tests. This work shows a high degree of advancement in software reliability and quality of the computer-aided diagnosis system and experimentally shows best results with supportive statistical inference.

  10. Landslide Hazard Mapping in Rwanda Using Logistic Regression

    NASA Astrophysics Data System (ADS)

    Piller, A.; Anderson, E.; Ballard, H.

    2015-12-01

    Landslides in the United States cause more than $1 billion in damages and 50 deaths per year (USGS 2014). Globally, figures are much more grave, yet monitoring, mapping and forecasting of these hazards are less than adequate. Seventy-five percent of the population of Rwanda earns a living from farming, mostly subsistence. Loss of farmland, housing, or life to landslides is a very real hazard. Landslides in Rwanda have an impact at the economic, social, and environmental level. In a developing nation that faces challenges in tracking, cataloging, and predicting the numerous landslides that occur each year, satellite imagery and spatial analysis allow for remote study. We have focused on the development of a landslide inventory and a statistical methodology for assessing landslide hazards. Using logistic regression on approximately 30 test variables (e.g., slope, soil type, and land cover) and a sample of over 200 landslides, we determine which variables are statistically most relevant to landslide occurrence in Rwanda. A preliminary predictive hazard map for Rwanda has been produced, using the variables selected from the logistic regression analysis.

  11. Evaluating penalized logistic regression models to predict Heat-Related Electric grid stress days

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Bramer, L. M.; Rounds, J.; Burleyson, C. D.

    Understanding the conditions associated with stress on the electricity grid is important in the development of contingency plans for maintaining reliability during periods when the grid is stressed. In this paper, heat-related grid stress and the relationship with weather conditions is examined using data from the eastern United States. Penalized logistic regression models were developed and applied to predict stress on the electric grid using weather data. The inclusion of other weather variables, such as precipitation, in addition to temperature improved model performance. Several candidate models and datasets were examined. A penalized logistic regression model fit at the operation-zone level was found to provide predictive value and interpretability. Additionally, the importance of different weather variables observed at different time scales was examined. Maximum temperature and precipitation were identified as important across all zones while the importance of other weather variables was zone specific. The methods presented in this work are extensible to other regions and can be used to aid in planning and development of the electrical grid.

  12. Evaluating penalized logistic regression models to predict Heat-Related Electric grid stress days

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Bramer, Lisa M.; Rounds, J.; Burleyson, C. D.

    Understanding the conditions associated with stress on the electricity grid is important in the development of contingency plans for maintaining reliability during periods when the grid is stressed. In this paper, heat-related grid stress and the relationship with weather conditions were examined using data from the eastern United States. Penalized logistic regression models were developed and applied to predict stress on the electric grid using weather data. The inclusion of other weather variables, such as precipitation, in addition to temperature improved model performance. Several candidate models and combinations of predictive variables were examined. A penalized logistic regression model which was fit at the operation-zone level was found to provide predictive value and interpretability. Additionally, the importance of different weather variables observed at various time scales was examined. Maximum temperature and precipitation were identified as important across all zones while the importance of other weather variables was zone specific. In conclusion, the methods presented in this work are extensible to other regions and can be used to aid in planning and development of the electrical grid.

  13. GIS-based rare events logistic regression for mineral prospectivity mapping

    NASA Astrophysics Data System (ADS)

    Xiong, Yihui; Zuo, Renguang

    2018-02-01

    Mineralization is a special type of singularity event, and can be considered as a rare event, because within a specific study area the number of prospective locations (1s) are considerably fewer than the number of non-prospective locations (0s). In this study, GIS-based rare events logistic regression (RELR) was used to map the mineral prospectivity in the southwestern Fujian Province, China. An odds ratio was used to measure the relative importance of the evidence variables with respect to mineralization. The results suggest that formations, granites, and skarn alterations, followed by faults and aeromagnetic anomaly are the most important indicators for the formation of Fe-related mineralization in the study area. The prediction rate and the area under the curve (AUC) values show that areas with higher probability have a strong spatial relationship with the known mineral deposits. Comparing the results with original logistic regression (OLR) demonstrates that the GIS-based RELR performs better than OLR. The prospectivity map obtained in this study benefits the search for skarn Fe-related mineralization in the study area.

  14. [Analysis of rational clinical uses of traditional Chinese medicine injections and factors influencing adverse drug reactions].

    PubMed

    Sun, Shi-Guang; Li, Zi-Feng; Xie, Yan-Ming; Liu, Jian; Lu, Yan; Song, Yi-Fei; Han, Ying-Hua; Liu, Li-Da; Peng, Ting-Ting

    2013-09-01

    Rational clinical use and safety are key issues in the surveillance of traditional Chinese medicine injections (TCMIs). In this 2011 study, 240 medical records of patients who had been discharged following treatment with TCMIs between 1 and 12 months previously were randomly selected from hospital records. Consistency between clinical use and the description of TCMIs was evaluated. Research on drug use and adverse drug reactions/events using logistic regression analysis was carried out. There was poor consistency between clinical use and best practice advised in manuals on TCMIs. Over-dosage and overly concentrated administration of TCMIs occurred, with the outcome of modifying properties of the blood. Logistic regression analysis showed that drug concentration was a valid predictor for both adverse drug reactions/events and benefits associated with TCMIs. Surveillance of rational clinical use and safety of TCMIs finds that clinical use should be consistent with technical drug manual specifications, and drug use should draw on multi-layered logistic regression analysis research to help avoid adverse drug reactions/events.

  15. HEALER: homomorphic computation of ExAct Logistic rEgRession for secure rare disease variants analysis in GWAS

    PubMed Central

    Wang, Shuang; Zhang, Yuchen; Dai, Wenrui; Lauter, Kristin; Kim, Miran; Tang, Yuzhe; Xiong, Hongkai; Jiang, Xiaoqian

    2016-01-01

    Motivation: Genome-wide association studies (GWAS) have been widely used in discovering the association between genotypes and phenotypes. Human genome data contain valuable but highly sensitive information. Unprotected disclosure of such information might put individual’s privacy at risk. It is important to protect human genome data. Exact logistic regression is a bias-reduction method based on a penalized likelihood to discover rare variants that are associated with disease susceptibility. We propose the HEALER framework to facilitate secure rare variants analysis with a small sample size. Results: We target at the algorithm design aiming at reducing the computational and storage costs to learn a homomorphic exact logistic regression model (i.e. evaluate P-values of coefficients), where the circuit depth is proportional to the logarithmic scale of data size. We evaluate the algorithm performance using rare Kawasaki Disease datasets. Availability and implementation: Download HEALER at http://research.ucsd-dbmi.org/HEALER/ Contact: shw070@ucsd.edu Supplementary information: Supplementary data are available at Bioinformatics online. PMID:26446135

  16. Testing Gene-Gene Interactions in the Case-Parents Design

    PubMed Central

    Yu, Zhaoxia

    2011-01-01

    The case-parents design has been widely used to detect genetic associations as it can prevent spurious association that could occur in population-based designs. When examining the effect of an individual genetic locus on a disease, logistic regressions developed by conditioning on parental genotypes provide complete protection from spurious association caused by population stratification. However, when testing gene-gene interactions, it is unknown whether conditional logistic regressions are still robust. Here we evaluate the robustness and efficiency of several gene-gene interaction tests that are derived from conditional logistic regressions. We found that in the presence of SNP genotype correlation due to population stratification or linkage disequilibrium, tests with incorrectly specified main-genetic-effect models can lead to inflated type I error rates. We also found that a test with fully flexible main genetic effects always maintains correct test size and its robustness can be achieved with negligible sacrifice of its power. When testing gene-gene interactions is the focus, the test allowing fully flexible main effects is recommended to be used. PMID:21778736

  17. A logistic regression analysis of factors related to the treatment compliance of infertile patients with polycystic ovary syndrome.

    PubMed

    Li, Saijiao; He, Aiyan; Yang, Jing; Yin, TaiLang; Xu, Wangming

    2011-01-01

    To investigate factors that can affect compliance with treatment of polycystic ovary syndrome (PCOS) in infertile patients and to provide a basis for clinical treatment, specialist consultation and health education. Patient compliance was assessed via a questionnaire based on the Morisky-Green test and the treatment principles of PCOS. Then interviews were conducted with 99 infertile patients diagnosed with PCOS at Renmin Hospital of Wuhan University in China, from March to September 2009. Finally, these data were analyzed using logistic regression analysis. Logistic regression analysis revealed that a total of 23 (25.6%) of the participants showed good compliance. Factors that significantly (p < 0.05) affected compliance with treatment were the patient's body mass index, convenience of medical treatment and concerns about adverse drug reactions. Patients who are obese, experience inconvenient medical treatment or are concerned about adverse drug reactions are more likely to exhibit noncompliance. Treatment education and intervention aimed at these patients should be strengthened in the clinic to improve treatment compliance. Further research is needed to better elucidate the compliance behavior of patients with PCOS.

  18. A general equation to obtain multiple cut-off scores on a test from multinomial logistic regression.

    PubMed

    Bersabé, Rosa; Rivas, Teresa

    2010-05-01

    The authors derive a general equation to compute multiple cut-offs on a total test score in order to classify individuals into more than two ordinal categories. The equation is derived from the multinomial logistic regression (MLR) model, which is an extension of the binary logistic regression (BLR) model to accommodate polytomous outcome variables. From this analytical procedure, cut-off scores are established at the test score (the predictor variable) at which an individual is as likely to be in category j as in category j+1 of an ordinal outcome variable. The application of the complete procedure is illustrated by an example with data from an actual study on eating disorders. In this example, two cut-off scores on the Eating Attitudes Test (EAT-26) scores are obtained in order to classify individuals into three ordinal categories: asymptomatic, symptomatic and eating disorder. Diagnoses were made from the responses to a self-report (Q-EDD) that operationalises DSM-IV criteria for eating disorders. Alternatives to the MLR model to set multiple cut-off scores are discussed.
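
    The crossing-point idea described above can be sketched directly from the baseline-category form of the multinomial logit model, log(P(k)/P(ref)) = a_k + b_k * x. Categories j and j+1 are equally likely where their two logits are equal, which gives x = (a_j - a_{j+1}) / (b_{j+1} - b_j). This is a minimal sketch of that algebra and may differ in form from the authors' published equation. The function name and all coefficients are hypothetical and are not estimates from the EAT-26 study.

```python
def mlr_cutoff(a_j, b_j, a_next, b_next):
    """Score at which categories j and j+1 are equally likely.

    Each category k has a baseline-category logit a_k + b_k * x;
    the reference category has a = 0 and b = 0.
    """
    if b_next == b_j:
        raise ValueError("equal slopes: the two categories never cross")
    return (a_j - a_next) / (b_next - b_j)


# Hypothetical coefficients for three ordered categories
# (reference, middle, highest); illustration only.
ref = (0.0, 0.0)
mid = (-5.0, 0.5)
high = (-15.0, 1.0)

cut_1 = mlr_cutoff(*ref, *mid)    # reference vs. middle category
cut_2 = mlr_cutoff(*mid, *high)   # middle vs. highest category
print(cut_1, cut_2)
```

    For the cut-offs to be usable as an ordered classification rule, the crossing points should come out in increasing order, which is worth checking for any fitted model.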

  19. Sparse Logistic Regression for Diagnosis of Liver Fibrosis in Rat by Using SCAD-Penalized Likelihood

    PubMed Central

    Yan, Fang-Rong; Lin, Jin-Guan; Liu, Yu

    2011-01-01

    The objective of the present study is to find the quantitative relationship between progression of liver fibrosis and the levels of certain serum markers using a mathematical model. We provide the sparse logistic regression by using smoothly clipped absolute deviation (SCAD) penalized function to diagnose the liver fibrosis in rats. Not only does it give a sparse solution with high accuracy, it also provides the users with the precise probabilities of classification with the class information. In the simulative case and the experiment case, the proposed method is comparable to the stepwise linear discriminant analysis (SLDA) and the sparse logistic regression with least absolute shrinkage and selection operator (LASSO) penalty, by using receiver operating characteristic (ROC) with Bayesian bootstrap estimating area under the curve (AUC) diagnostic sensitivity for selected variable. Results show that the new approach provides a good correlation between the serum marker levels and the liver fibrosis induced by thioacetamide (TAA) in rats. Meanwhile, this approach might also be used in predicting the development of liver cirrhosis. PMID:21716672

  20. John Snow, William Farr and the 1849 outbreak of cholera that affected London: a reworking of the data highlights the importance of the water supply.

    PubMed

    Bingham, P; Verlander, N Q; Cheal, M J

    2004-09-01

    This paper examines why Snow's contention that cholera was principally spread by water was not accepted in the 1850s by the medical elite. The consequence of rejection was that hundreds in the UK continued to die. Logistic regression was used to re-analyse data, first published in 1852 by William Farr, consisting of the 1849 mortality rate from cholera and eight potential explanatory variables for the 38 registration districts of London. Logistic regression does not support Farr's original conclusion that a district's elevation above high water was the most important explanatory variable. Elevation above high water, water supply and poor rate each have an independent significant effect on district cholera mortality rate, but in terms of size of effect, it can be argued that water supply most strongly 'invited' further consideration. The science of epidemiology, that Farr helped to found, has continued to advance. Had logistic regression been available to Farr, its application to his 1852 data set would have changed his conclusion.
