Sample records for logistic regression coefficient

  1. Standards for Standardized Logistic Regression Coefficients

    ERIC Educational Resources Information Center

    Menard, Scott

    2011-01-01

    Standardized coefficients in logistic regression analysis have the same utility as standardized coefficients in linear regression analysis. Although there has been no consensus on the best way to construct standardized logistic regression coefficients, there is now sufficient evidence to suggest a single best approach to the construction of a…

  2. Remote sensing and GIS-based landslide hazard analysis and cross-validation using multivariate logistic regression model on three test areas in Malaysia

    NASA Astrophysics Data System (ADS)

    Pradhan, Biswajeet

    2010-05-01

    This paper presents the results of the cross-validation of a multivariate logistic regression model using remote sensing data and GIS for landslide hazard analysis on the Penang, Cameron, and Selangor areas in Malaysia. Landslide locations in the study areas were identified by interpreting aerial photographs and satellite images, supported by field surveys. SPOT 5 and Landsat TM satellite imagery were used to map landcover and vegetation index, respectively. Maps of topography, soil type, lineaments and land cover were constructed from the spatial datasets. Ten factors which influence landslide occurrence, i.e., slope, aspect, curvature, distance from drainage, lithology, distance from lineaments, soil type, landcover, rainfall precipitation, and normalized difference vegetation index (NDVI), were extracted from the spatial database and the logistic regression coefficient of each factor was computed. Then the landslide hazard was analysed using the multivariate logistic regression coefficients derived not only from the data for the respective area but also using the logistic regression coefficients calculated from each of the other two areas (nine hazard maps in all) as a cross-validation of the model. For verification of the model, the results of the analyses were then compared with the field-verified landslide locations. Among the three cases in which the logistic regression coefficients were applied in the same study area, the case of Selangor based on the Selangor logistic regression coefficients showed the highest accuracy (94%), whereas Penang based on the Penang coefficients showed the lowest accuracy (86%). Similarly, among the six cases in which the logistic regression coefficients were cross-applied to the other two areas, the case of Selangor based on the Cameron logistic regression coefficients showed the highest prediction accuracy (90%), whereas the case of Penang based on the Selangor logistic regression coefficients showed the lowest accuracy (79%). Qualitatively, the cross-application model yields reasonable results which can be used for preliminary landslide hazard mapping.
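
    As a rough illustration of how fitted logistic regression coefficients turn factor values into a hazard probability, the sketch below applies the standard logistic function to one raster cell. The factor names, coefficient values and intercept are hypothetical placeholders, not figures from the study; cross-application would simply mean passing one area's coefficients together with another area's cell values.

```python
import math

def landslide_probability(intercept, coefficients, factors):
    """Logistic model: p = 1 / (1 + exp(-z)), with z = b0 + sum(b_i * x_i)."""
    z = intercept + sum(coefficients[name] * factors[name] for name in coefficients)
    return 1.0 / (1.0 + math.exp(-z))

# Hypothetical coefficients (e.g. fitted on one area) and one raster cell
# (e.g. taken from another area); none of these numbers come from the paper.
coeffs = {"slope": 0.05, "ndvi": -1.2}
cell = {"slope": 30.0, "ndvi": 0.4}
p = landslide_probability(-2.0, coeffs, cell)
print(f"predicted hazard probability: {p:.3f}")
```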

  3. Adjusting for Confounding in Early Postlaunch Settings: Going Beyond Logistic Regression Models.

    PubMed

    Schmidt, Amand F; Klungel, Olaf H; Groenwold, Rolf H H

    2016-01-01

    Postlaunch data on medical treatments can be analyzed to explore adverse events or relative effectiveness in real-life settings. These analyses are often complicated by the number of potential confounders and the possibility of model misspecification. We conducted a simulation study to compare the performance of logistic regression, propensity score, disease risk score, and stabilized inverse probability weighting methods to adjust for confounding. Model misspecification was induced in the independent derivation dataset. We evaluated performance using relative bias and confidence interval coverage of the true effect, among other metrics. At low events per coefficient (1.0 and 0.5), the logistic regression estimates had a large relative bias (greater than -100%). Bias of the disease risk score estimates was at most 13.48% and 18.83%. For the propensity score model, this was 8.74% and >100%, respectively. At events per coefficient of 1.0 and 0.5, inverse probability weighting frequently failed or reduced to a crude regression, resulting in biases of -8.49% and 24.55%. Coverage of logistic regression estimates became less than the nominal level at events per coefficient ≤5. For the disease risk score, inverse probability weighting, and propensity score, coverage became less than nominal at events per coefficient ≤2.5, ≤1.0, and ≤1.0, respectively. Bias of misspecified disease risk score models was 16.55%. In settings with low events/exposed subjects per coefficient, disease risk score methods can be useful alternatives to logistic regression models, especially when propensity score models cannot be used. Despite better performance of disease risk score methods than logistic regression and propensity score models in small events per coefficient settings, bias and coverage still deviated from nominal.

  4. Sample size determination for logistic regression on a logit-normal distribution.

    PubMed

    Kim, Seongho; Heath, Elisabeth; Heilbrun, Lance

    2017-06-01

    Although the sample size for simple logistic regression can be readily determined using currently available methods, the sample size calculation for multiple logistic regression requires some additional information, such as the coefficient of determination (R²) of a covariate of interest with other covariates, which is often unavailable in practice. The response variable of logistic regression follows a logit-normal distribution which can be generated from a logistic transformation of a normal distribution. Using this property of logistic regression, we propose new methods of determining the sample size for simple and multiple logistic regressions using a normal transformation of outcome measures. Simulation studies and a motivating example show several advantages of the proposed methods over the existing methods: (i) no need for R² for multiple logistic regression, (ii) available interim or group-sequential designs, and (iii) much smaller required sample size.

  5. Interpretation of commonly used statistical regression models.

    PubMed

    Kasza, Jessica; Wolfe, Rory

    2014-01-01

    A review of some regression models commonly used in respiratory health applications is provided in this article. Simple linear regression, multiple linear regression, logistic regression and ordinal logistic regression are considered. The focus of this article is on the interpretation of the regression coefficients of each model, which are illustrated through the application of these models to a respiratory health research study. © 2013 The Authors. Respirology © 2013 Asian Pacific Society of Respirology.

  6. An empirical study of statistical properties of variance partition coefficients for multi-level logistic regression models

    USGS Publications Warehouse

    Li, Ji; Gray, B.R.; Bates, D.M.

    2008-01-01

    Partitioning the variance of a response by design levels is challenging for binomial and other discrete outcomes. Goldstein (2003) proposed four definitions for variance partitioning coefficients (VPC) under a two-level logistic regression model. In this study, we explicitly derived formulae for the multi-level logistic regression model and subsequently studied the distributional properties of the calculated VPCs. Using simulations and a vegetation dataset, we demonstrated associations between different VPC definitions, the importance of methods for estimating VPCs (by comparing VPCs obtained using Laplace and penalized quasi-likelihood methods), and bivariate dependence between VPCs calculated at different levels. Such an empirical study lends immediate support to wider applications of VPC in scientific data analysis.
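
    The abstract does not spell out the four VPC definitions. As one hedged example, a widely used latent-variable formulation for a two-level random-intercept logistic model treats the level-1 residual as standard logistic, with variance pi^2/3, so the VPC depends only on the level-2 variance. The function name and the input value below are illustrative assumptions, not results from the study.

```python
import math

def vpc_latent(sigma_u2):
    """Latent-variable VPC: level-2 variance over total latent variance.

    Assumes a two-level random-intercept logistic model whose level-1
    residual on the latent scale is standard logistic (variance pi^2 / 3).
    """
    if sigma_u2 < 0:
        raise ValueError("variance must be non-negative")
    return sigma_u2 / (sigma_u2 + math.pi ** 2 / 3)

# Hypothetical level-2 (between-cluster) variance on the logit scale.
print(round(vpc_latent(1.0), 4))
```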

  7. Advanced colorectal neoplasia risk stratification by penalized logistic regression.

    PubMed

    Lin, Yunzhi; Yu, Menggang; Wang, Sijian; Chappell, Richard; Imperiale, Thomas F

    2016-08-01

    Colorectal cancer is the second leading cause of death from cancer in the United States. To facilitate the efficiency of colorectal cancer screening, there is a need to stratify risk for colorectal cancer among the 90% of US residents who are considered "average risk." In this article, we investigate such risk stratification rules for advanced colorectal neoplasia (colorectal cancer and advanced, precancerous polyps). We use a recently completed large cohort study of subjects who underwent a first screening colonoscopy. Logistic regression models have been used in the literature to estimate the risk of advanced colorectal neoplasia based on quantifiable risk factors. However, logistic regression may be prone to overfitting and instability in variable selection. Since most of the risk factors in our study have several categories, it was tempting to collapse these categories into fewer risk groups. We propose a penalized logistic regression method that automatically and simultaneously selects variables, groups categories, and estimates their coefficients by penalizing the [Formula: see text]-norm of both the coefficients and their differences. Hence, it encourages sparsity in the categories, i.e. grouping of the categories, and sparsity in the variables, i.e. variable selection. We apply the penalized logistic regression method to our data. The important variables are selected, with close categories simultaneously grouped, by penalized regression models with and without the interaction terms. The models are validated with 10-fold cross-validation. The receiver operating characteristic curves of the penalized regression models dominate the receiver operating characteristic curve of naive logistic regressions, indicating a superior discriminative performance. © The Author(s) 2013.

  8. No rationale for 1 variable per 10 events criterion for binary logistic regression analysis.

    PubMed

    van Smeden, Maarten; de Groot, Joris A H; Moons, Karel G M; Collins, Gary S; Altman, Douglas G; Eijkemans, Marinus J C; Reitsma, Johannes B

    2016-11-24

    Ten events per variable (EPV) is a widely advocated minimal criterion for sample size considerations in logistic regression analysis. Of three previous simulation studies that examined this minimal EPV criterion, only one supports the use of a minimum of 10 EPV. In this paper, we examine the reasons for substantial differences between these extensive simulation studies. The current study uses Monte Carlo simulations to evaluate small sample bias, coverage of confidence intervals and mean square error of logit coefficients. Logistic regression models fitted by maximum likelihood and a modified estimation procedure, known as Firth's correction, are compared. The results show that besides EPV, the problems associated with low EPV depend on other factors such as the total sample size. It is also demonstrated that simulation results can be dominated by even a few simulated data sets for which the prediction of the outcome by the covariates is perfect ('separation'). We reveal that different approaches for identifying and handling separation lead to substantially different simulation results. We further show that Firth's correction can be used to improve the accuracy of regression coefficients and alleviate the problems associated with separation. The current evidence supporting EPV rules for binary logistic regression is weak. Given our findings, there is an urgent need for new research to provide guidance for supporting sample size considerations for binary logistic regression analysis.
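
    To make the notion of separation concrete, the following minimal sketch checks for complete separation in the simplest setting of a single numeric covariate, where it reduces to the two outcome groups occupying non-overlapping ranges. This is only an illustration under that assumption; it is not the detection procedure used in the study, and the multi-covariate case needs a more general check.

```python
def is_completely_separated(x, y):
    """Return True if one covariate perfectly splits a binary outcome.

    With a single covariate, complete separation means every x with y == 0
    lies strictly on one side of every x with y == 1, in which case the
    maximum likelihood estimate of the slope does not exist (it diverges).
    """
    x0 = [xi for xi, yi in zip(x, y) if yi == 0]
    x1 = [xi for xi, yi in zip(x, y) if yi == 1]
    if not x0 or not x1:
        raise ValueError("both outcome classes must be present")
    return max(x0) < min(x1) or max(x1) < min(x0)

# Hypothetical toy data: the first set is separated, the second overlaps.
print(is_completely_separated([1, 2, 3, 4], [0, 0, 1, 1]))
print(is_completely_separated([1, 3, 2, 4], [0, 0, 1, 1]))
```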

  9. The Outlier Detection for Ordinal Data Using Scalling Technique of Regression Coefficients

    NASA Astrophysics Data System (ADS)

    Adnan, Arisman; Sugiarto, Sigit

    2017-06-01

    The aim of this study is to detect outliers by using the coefficients of Ordinal Logistic Regression (OLR) for the case of k category responses where the score ranges from 1 (the best) to 8 (the worst). We detect them by using the sum of moduli of the ordinal regression coefficients calculated by the jackknife technique. This technique is improved by scaling the regression coefficients to their means. The R language has been used on a set of ordinal data from a reference distribution. Furthermore, we compare this approach by using studentised residual plots of the jackknife technique for ANOVA (Analysis of Variance) and OLR. This study shows that the jackknifing technique along with proper scaling may lead us to reveal outliers in ordinal regression reasonably well.

  10. Testing for gene-environment interaction under exposure misspecification.

    PubMed

    Sun, Ryan; Carroll, Raymond J; Christiani, David C; Lin, Xihong

    2017-11-09

    Complex interplay between genetic and environmental factors characterizes the etiology of many diseases. Modeling gene-environment (GxE) interactions is often challenged by the unknown functional form of the environment term in the true data-generating mechanism. We study the impact of misspecification of the environmental exposure effect on inference for the GxE interaction term in linear and logistic regression models. We first examine the asymptotic bias of the GxE interaction regression coefficient, allowing for confounders as well as arbitrary misspecification of the exposure and confounder effects. For linear regression, we show that under gene-environment independence and some confounder-dependent conditions, when the environment effect is misspecified, the regression coefficient of the GxE interaction can be unbiased. However, inference on the GxE interaction is still often incorrect. In logistic regression, we show that the regression coefficient is generally biased if the genetic factor is associated with the outcome directly or indirectly. Further, we show that the standard robust sandwich variance estimator for the GxE interaction does not perform well in practical GxE studies, and we provide an alternative testing procedure that has better finite sample properties. © 2017, The International Biometric Society.

  11. Differential item functioning analysis with ordinal logistic regression techniques. DIFdetect and difwithpar.

    PubMed

    Crane, Paul K; Gibbons, Laura E; Jolley, Lance; van Belle, Gerald

    2006-11-01

    We present an ordinal logistic regression model for identification of items with differential item functioning (DIF) and apply this model to a Mini-Mental State Examination (MMSE) dataset. We employ item response theory ability estimation in our models. Three nested ordinal logistic regression models are applied to each item. Model testing begins with examination of the statistical significance of the interaction term between ability and the group indicator, consistent with nonuniform DIF. Then we turn our attention to the coefficient of the ability term in models with and without the group term. If including the group term has a marked effect on that coefficient, we declare that it has uniform DIF. We examined DIF related to language of test administration in addition to self-reported race, Hispanic ethnicity, age, years of education, and sex. We used PARSCALE for IRT analyses and STATA for ordinal logistic regression approaches. We used an iterative technique for adjusting IRT ability estimates on the basis of DIF findings. Five items were found to have DIF related to language. These same items also had DIF related to other covariates. The ordinal logistic regression approach to DIF detection, when combined with IRT ability estimates, provides a reasonable alternative for DIF detection. There appear to be several items with significant DIF related to language of test administration in the MMSE. More attention needs to be paid to the specific criteria used to determine whether an item has DIF, not just the technique used to identify DIF.

  12. Spatiotemporal variability of urban growth factors: A global and local perspective on the megacity of Mumbai

    NASA Astrophysics Data System (ADS)

    Shafizadeh-Moghadam, Hossein; Helbich, Marco

    2015-03-01

    The rapid growth of megacities requires special attention among urban planners worldwide, and particularly in Mumbai, India, where growth is very pronounced. To cope with the planning challenges this will bring, developing a retrospective understanding of urban land-use dynamics and the underlying driving-forces behind urban growth is a key prerequisite. This research uses regression-based land-use change models - and in particular non-spatial logistic regression models (LR) and auto-logistic regression models (ALR) - for the Mumbai region over the period 1973-2010, in order to determine the drivers behind spatiotemporal urban expansion. Both global models are complemented by a local, spatial model, the so-called geographically weighted logistic regression (GWLR) model, one that explicitly permits variations in driving-forces across space. The study comes to two main conclusions. First, both global models suggest similar driving-forces behind urban growth over time, revealing that LRs and ALRs result in estimated coefficients with comparable magnitudes. Second, all the local coefficients show distinctive temporal and spatial variations. It is therefore concluded that GWLR aids our understanding of urban growth processes, and so can assist context-related planning and policymaking activities when seeking to secure a sustainable urban future.

  13. Selecting risk factors: a comparison of discriminant analysis, logistic regression and Cox's regression model using data from the Tromsø Heart Study.

    PubMed

    Brenn, T; Arnesen, E

    1985-01-01

    For comparative evaluation, discriminant analysis, logistic regression and Cox's model were used to select risk factors for total and coronary deaths among 6595 men aged 20-49 followed for 9 years. Groups with mortality between 5 and 93 per 1000 were considered. Discriminant analysis selected variable sets only marginally different from the logistic and Cox methods which always selected the same sets. A time-saving option, offered for both the logistic and Cox selection, showed no advantage compared with discriminant analysis. Analysing more than 3800 subjects, the logistic and Cox methods consumed, respectively, 80 and 10 times more computer time than discriminant analysis. When including the same set of variables in non-stepwise analyses, all methods estimated coefficients that in most cases were almost identical. In conclusion, discriminant analysis is advocated for preliminary or stepwise analysis, otherwise Cox's method should be used.

  14. Hierarchical Bayesian Logistic Regression to forecast metabolic control in type 2 DM patients.

    PubMed

    Dagliati, Arianna; Malovini, Alberto; Decata, Pasquale; Cogni, Giulia; Teliti, Marsida; Sacchi, Lucia; Cerra, Carlo; Chiovato, Luca; Bellazzi, Riccardo

    2016-01-01

    In this work we present our efforts in building a model able to forecast patients' changes in clinical conditions when repeated measurements are available. In this case the available risk calculators are typically not applicable. We propose a Hierarchical Bayesian Logistic Regression model, which allows individual and population variability in model parameter estimates to be taken into account. The model is used to predict metabolic control and its variation in type 2 diabetes mellitus. In particular, we have analyzed a population of more than 1000 Italian type 2 diabetic patients, collected within the European project Mosaic. The results obtained in terms of Matthews Correlation Coefficient are significantly better than those obtained with a standard logistic regression model based on data pooling.
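
    For readers unfamiliar with the evaluation metric, the Matthews Correlation Coefficient can be computed from the four cells of a binary confusion matrix, as in the short sketch below. The counts are made up for illustration and are unrelated to the study's data; returning 0 when the denominator vanishes is a common convention adopted here as an assumption.

```python
import math

def matthews_corrcoef(tp, fp, fn, tn):
    """MCC = (TP*TN - FP*FN) / sqrt((TP+FP)(TP+FN)(TN+FP)(TN+FN))."""
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    # Convention (assumption): report 0 when any marginal total is zero.
    return 0.0 if denom == 0 else (tp * tn - fp * fn) / denom

# Hypothetical confusion-matrix counts.
print(round(matthews_corrcoef(tp=40, fp=10, fn=5, tn=45), 3))
```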

  15. Assessing landslide susceptibility by statistical data analysis and GIS: the case of Daunia (Apulian Apennines, Italy)

    NASA Astrophysics Data System (ADS)

    Ceppi, C.; Mancini, F.; Ritrovato, G.

    2009-04-01

    This study aims at landslide susceptibility mapping within an area of the Daunia (Apulian Apennines, Italy) using a multivariate statistical method and data manipulation in a Geographical Information System (GIS) environment. Among the variety of existing statistical data analysis techniques, logistic regression was chosen to produce a susceptibility map over an area where small settlements are historically threatened by landslide phenomena. Logistic regression finds the best fit between the presence or absence of landslides (dependent variable) and the set of independent variables on the basis of a maximum likelihood criterion, leading to the estimation of regression coefficients. The reliability of such analysis is therefore due to the ability to quantify the proneness to landslide occurrences by the probability level produced by the analysis. The inventory of dependent and independent variables was managed in a GIS, where geometric properties and attributes were translated into raster cells in order to proceed with the logistic regression by means of the SPSS (Statistical Package for the Social Sciences) package. A landslide inventory was used to produce the bivariate dependent variable, whereas the set of independent variables comprised slope, aspect, elevation, curvature, drained area, lithology and land use after their reduction to dummy variables. The effect of independent parameters on landslide occurrence was assessed by the corresponding coefficient in the logistic regression function, highlighting a major role played by the land use variable in determining the occurrence and distribution of phenomena. Once the outcomes of the logistic regression are determined, data are re-introduced in the GIS to produce a map reporting the proneness to landslide as a predicted level of probability. As validation of the results and regression model, a cell-by-cell comparison between the susceptibility map and the initial inventory of landslide events was performed, and agreement at the 75% level was achieved.

  16. Deletion Diagnostics for Alternating Logistic Regressions

    PubMed Central

    Preisser, John S.; By, Kunthel; Perin, Jamie; Qaqish, Bahjat F.

    2013-01-01

    Deletion diagnostics are introduced for the regression analysis of clustered binary outcomes estimated with alternating logistic regressions, an implementation of generalized estimating equations (GEE) that estimates regression coefficients in a marginal mean model and in a model for the intracluster association given by the log odds ratio. The diagnostics are developed within an estimating equations framework that recasts the estimating functions for association parameters based upon conditional residuals into equivalent functions based upon marginal residuals. Extensions of earlier work on GEE diagnostics follow directly, including computational formulae for one-step deletion diagnostics that measure the influence of a cluster of observations on the estimated regression parameters and on the overall marginal mean or association model fit. The diagnostic formulae are evaluated with simulation studies and with an application concerning an assessment of factors associated with health maintenance visits in primary care medical practices. The application and the simulations demonstrate that the proposed cluster-deletion diagnostics for alternating logistic regressions are good approximations of their exact fully iterated counterparts. PMID:22777960

  17. EXpectation Propagation LOgistic REgRession (EXPLORER): Distributed Privacy-Preserving Online Model Learning

    PubMed Central

    Wang, Shuang; Jiang, Xiaoqian; Wu, Yuan; Cui, Lijuan; Cheng, Samuel; Ohno-Machado, Lucila

    2013-01-01

    We developed an EXpectation Propagation LOgistic REgRession (EXPLORER) model for distributed privacy-preserving online learning. The proposed framework provides a high level guarantee for protecting sensitive information, since the information exchanged between the server and the client is the encrypted posterior distribution of coefficients. Through experimental results, EXPLORER shows the same performance (e.g., discrimination, calibration, feature selection etc.) as the traditional frequentist Logistic Regression model, but provides more flexibility in model updating. That is, EXPLORER can be updated one point at a time rather than having to retrain the entire data set when new observations are recorded. The proposed EXPLORER supports asynchronized communication, which relieves the participants from coordinating with one another, and prevents service breakdown from the absence of participants or interrupted communications. PMID:23562651

  18. EXpectation Propagation LOgistic REgRession (EXPLORER): distributed privacy-preserving online model learning.

    PubMed

    Wang, Shuang; Jiang, Xiaoqian; Wu, Yuan; Cui, Lijuan; Cheng, Samuel; Ohno-Machado, Lucila

    2013-06-01

    We developed an EXpectation Propagation LOgistic REgRession (EXPLORER) model for distributed privacy-preserving online learning. The proposed framework provides a high level guarantee for protecting sensitive information, since the information exchanged between the server and the client is the encrypted posterior distribution of coefficients. Through experimental results, EXPLORER shows the same performance (e.g., discrimination, calibration, feature selection, etc.) as the traditional frequentist logistic regression model, but provides more flexibility in model updating. That is, EXPLORER can be updated one point at a time rather than having to retrain the entire data set when new observations are recorded. The proposed EXPLORER supports asynchronized communication, which relieves the participants from coordinating with one another, and prevents service breakdown from the absence of participants or interrupted communications. Copyright © 2013 Elsevier Inc. All rights reserved.

  19. Multinomial logistic regression modelling of obesity and overweight among primary school students in a rural area of Negeri Sembilan

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Ghazali, Amirul Syafiq Mohd; Ali, Zalila; Noor, Norlida Mohd

    Multinomial logistic regression is widely used to model the outcomes of a polytomous response variable, a categorical dependent variable with more than two categories. The model assumes that the conditional mean of the dependent categorical variables is the logistic function of an affine combination of predictor variables. Its procedure gives a number of logistic regression models that make specific comparisons of the response categories. When there are q categories of the response variable, the model consists of q-1 logit equations which are fitted simultaneously. The model is validated by variable selection procedures, tests of regression coefficients, a significant test of the overall model, goodness-of-fit measures, and validation of predicted probabilities using odds ratio. This study used the multinomial logistic regression model to investigate obesity and overweight among primary school students in a rural area on the basis of their demographic profiles, lifestyles and on the diet and food intake. The results indicated that obesity and overweight of students are related to gender, religion, sleep duration, time spent on electronic games, breakfast intake in a week, with whom meals are taken, protein intake, and also, the interaction between breakfast intake in a week with sleep duration, and the interaction between gender and protein intake.

  20. Multinomial logistic regression modelling of obesity and overweight among primary school students in a rural area of Negeri Sembilan

    NASA Astrophysics Data System (ADS)

    Ghazali, Amirul Syafiq Mohd; Ali, Zalila; Noor, Norlida Mohd; Baharum, Adam

    2015-10-01

    Multinomial logistic regression is widely used to model the outcomes of a polytomous response variable, a categorical dependent variable with more than two categories. The model assumes that the conditional mean of the dependent categorical variables is the logistic function of an affine combination of predictor variables. Its procedure gives a number of logistic regression models that make specific comparisons of the response categories. When there are q categories of the response variable, the model consists of q-1 logit equations which are fitted simultaneously. The model is validated by variable selection procedures, tests of regression coefficients, a significant test of the overall model, goodness-of-fit measures, and validation of predicted probabilities using odds ratio. This study used the multinomial logistic regression model to investigate obesity and overweight among primary school students in a rural area on the basis of their demographic profiles, lifestyles and on the diet and food intake. The results indicated that obesity and overweight of students are related to gender, religion, sleep duration, time spent on electronic games, breakfast intake in a week, with whom meals are taken, protein intake, and also, the interaction between breakfast intake in a week with sleep duration, and the interaction between gender and protein intake.
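
    The statement that q response categories give q-1 logit equations can be illustrated with a small sketch. Assuming the usual baseline-category parameterisation, each linear predictor is the log odds of one category against a reference category, and the q probabilities follow by normalising. The predictor values below are hypothetical and do not come from the study.

```python
import math

def multinomial_probs(etas):
    """Turn q-1 baseline-category logits into q category probabilities.

    etas[k] is assumed to equal log(P(Y = k) / P(Y = reference)); the
    reference category has an implicit logit of 0 and is returned last.
    """
    exps = [math.exp(e) for e in etas]
    denom = 1.0 + sum(exps)
    return [e / denom for e in exps] + [1.0 / denom]

# Hypothetical logits for two non-reference categories, with a third
# category acting as the reference (q = 3, so q - 1 = 2 equations).
probs = multinomial_probs([0.8, -0.3])
print([round(p, 3) for p in probs])
```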

  1. The intermediate endpoint effect in logistic and probit regression

    PubMed Central

    MacKinnon, DP; Lockwood, CM; Brown, CH; Wang, W; Hoffman, JM

    2010-01-01

    Background: An intermediate endpoint is hypothesized to be in the middle of the causal sequence relating an independent variable to a dependent variable. The intermediate variable is also called a surrogate or mediating variable and the corresponding effect is called the mediated, surrogate endpoint, or intermediate endpoint effect. Clinical studies are often designed to change an intermediate or surrogate endpoint and through this intermediate change influence the ultimate endpoint. In many intermediate endpoint clinical studies the dependent variable is binary, and logistic or probit regression is used. Purpose: The purpose of this study is to describe a limitation of a widely used approach to assessing intermediate endpoint effects and to propose an alternative method, based on products of coefficients, that yields more accurate results. Methods: The intermediate endpoint model for a binary outcome is described for a true binary outcome and for a dichotomization of a latent continuous outcome. Plots of true values and a simulation study are used to evaluate the different methods. Results: Distorted estimates of the intermediate endpoint effect and incorrect conclusions can result from the application of widely used methods to assess the intermediate endpoint effect. The same problem occurs for the proportion of an effect explained by an intermediate endpoint, which has been suggested as a useful measure for identifying intermediate endpoints. A solution to this problem is given based on the relationship between latent variable modeling and logistic or probit regression. Limitations: More complicated intermediate variable models are not addressed in the study, although the methods described in the article can be extended to these more complicated models. Conclusions: Researchers are encouraged to use an intermediate endpoint method based on the product of regression coefficients. A common method based on the difference in coefficients can lead to distorted conclusions regarding the intermediate effect. PMID:17942466
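
    A minimal sketch of the product-of-coefficients idea: if a is the coefficient relating the independent variable to the intermediate variable, and b is the coefficient relating the intermediate variable to the outcome (adjusted for the independent variable), the intermediate endpoint effect is estimated as a*b. The first-order (Sobel-type) standard error used here is an assumption added for illustration, as are the numbers; the abstract does not state which standard error or rescaling the authors recommend.

```python
import math

def product_of_coefficients(a, se_a, b, se_b):
    """Return the mediated effect a*b and a first-order standard error.

    a, se_a: coefficient (and SE) of X in the model for the mediator M.
    b, se_b: coefficient (and SE) of M in the outcome model that also has X.
    The SE is the first-order delta-method approximation (an assumption).
    """
    effect = a * b
    se = math.sqrt(a ** 2 * se_b ** 2 + b ** 2 * se_a ** 2)
    return effect, se

# Hypothetical estimates, not taken from the article.
effect, se = product_of_coefficients(a=0.5, se_a=0.1, b=0.4, se_b=0.2)
print(round(effect, 3), round(se, 4))
```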

  2. Mixed conditional logistic regression for habitat selection studies.

    PubMed

    Duchesne, Thierry; Fortin, Daniel; Courbin, Nicolas

    2010-05-01

    1. Resource selection functions (RSFs) are becoming a dominant tool in habitat selection studies. RSF coefficients can be estimated with unconditional (standard) and conditional logistic regressions. While the advantage of mixed-effects models is recognized for standard logistic regression, mixed conditional logistic regression remains largely overlooked in ecological studies. 2. We demonstrate the significance of mixed conditional logistic regression for habitat selection studies. First, we use spatially explicit models to illustrate how mixed-effects RSFs can be useful in the presence of inter-individual heterogeneity in selection and when the assumption of independence from irrelevant alternatives (IIA) is violated. The IIA hypothesis states that the strength of preference for habitat type A over habitat type B does not depend on the other habitat types also available. Secondly, we demonstrate the significance of mixed-effects models to evaluate habitat selection of free-ranging bison Bison bison. 3. When movement rules were homogeneous among individuals and the IIA assumption was respected, fixed-effects RSFs adequately described habitat selection by simulated animals. In situations violating the inter-individual homogeneity and IIA assumptions, however, RSFs were best estimated with mixed-effects regressions, and fixed-effects models could even provide faulty conclusions. 4. Mixed-effects models indicate that bison did not select farmlands, but exhibited strong inter-individual variations in their response to farmlands. Less than half of the bison preferred farmlands over forests. Conversely, the fixed-effect model simply suggested an overall selection for farmlands. 5. Conditional logistic regression is recognized as a powerful approach to evaluate habitat selection when resource availability changes. This regression is increasingly used in ecological studies, but almost exclusively in the context of fixed-effects models. Fitness maximization can imply differences in trade-offs among individuals, which can yield inter-individual differences in selection and lead to departure from IIA. These situations are best modelled with mixed-effects models. Mixed-effects conditional logistic regression should become a valuable tool for ecological research.
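
    The IIA property described in this abstract can be illustrated with a minimal sketch. It assumes a fixed-effects conditional-logit (softmax) choice rule; the habitat names and selection scores are hypothetical and are not taken from the study. Under that rule, the ratio of choice probabilities for two habitats stays the same when a third habitat is added to the choice set.

```python
import math

def choice_probabilities(utilities):
    """Conditional-logit (softmax) choice probabilities for one choice set."""
    top = max(utilities.values())  # subtract the maximum for numerical stability
    weights = {name: math.exp(u - top) for name, u in utilities.items()}
    total = sum(weights.values())
    return {name: w / total for name, w in weights.items()}

# Hypothetical selection scores, for illustration only.
two = choice_probabilities({"forest": 0.0, "farmland": 0.5})
three = choice_probabilities({"forest": 0.0, "farmland": 0.5, "meadow": 1.2})

# Under IIA the farmland-to-forest ratio ignores the extra alternative.
ratio_two = two["farmland"] / two["forest"]
ratio_three = three["farmland"] / three["forest"]
```

    Both ratios equal exp(0.5) here. A mixed-effects model, in which the scores vary among individuals, would not in general preserve this ratio at the population level.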

  3. Measuring Productivity of Depot-Level Aircraft Maintenance in the Air Force Logistics Command.

    DTIC Science & Technology

    1985-09-01

    [Table-of-contents excerpt] ... of Figures; List of Tables; Abstract ... 6. DEA Efficiency Values (Third DEA Model); 7. DMU 5 Input Efficiencies ... List of Tables: I. DEA ... Regression Results for 20 Months; V. Regression Results for 7 Quarters; VI. Coefficients of Correlation (Using Quarterly Data ...

  4. Estimating interaction on an additive scale between continuous determinants in a logistic regression model.

    PubMed

    Knol, Mirjam J; van der Tweel, Ingeborg; Grobbee, Diederick E; Numans, Mattijs E; Geerlings, Mirjam I

    2007-10-01

    To determine the presence of interaction in epidemiologic research, typically a product term is added to the regression model. In linear regression, the regression coefficient of the product term reflects interaction as departure from additivity. However, in logistic regression it refers to interaction as departure from multiplicativity. Rothman has argued that interaction estimated as departure from additivity better reflects biologic interaction. So far, literature on estimating interaction on an additive scale using logistic regression only focused on dichotomous determinants. The objective of the present study was to provide the methods to estimate interaction between continuous determinants and to illustrate these methods with a clinical example. From the existing literature we derived the formulas to quantify interaction as departure from additivity between one continuous and one dichotomous determinant and between two continuous determinants using logistic regression. Bootstrapping was used to calculate the corresponding confidence intervals. To illustrate the theory with an empirical example, data from the Utrecht Health Project were used, with age and body mass index as risk factors for elevated diastolic blood pressure. The methods and formulas presented in this article are intended to assist epidemiologists to calculate interaction on an additive scale between two variables on a certain outcome. The proposed methods are included in a spreadsheet which is freely available at: http://www.juliuscenter.nl/additive-interaction.xls.
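
    As a point of reference, the sketch below covers only the simpler case of two dichotomous determinants, which the abstract says the earlier literature addressed. It computes the relative excess risk due to interaction (RERI) from the coefficients of a logistic model with a product term. The function and argument names are hypothetical, and this is not the continuous-determinant method the article derives.

```python
import math

def reri(b1, b2, b3):
    """RERI for two dichotomous determinants.

    b1, b2 are the main-effect log odds ratios and b3 is the coefficient
    of the product term in a logistic regression model.
    """
    return math.exp(b1 + b2 + b3) - math.exp(b1) - math.exp(b2) + 1.0

# Hypothetical coefficients: odds ratios of 2 and 3, no product-term effect.
example = reri(math.log(2.0), math.log(3.0), 0.0)
```

    With these numbers the product term is zero, so there is no departure from multiplicativity, yet the RERI is 2, a positive departure from additivity. This is the distinction between the two scales that the abstract draws.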

  5. CUSUM-Logistic Regression analysis for the rapid detection of errors in clinical laboratory test results.

    PubMed

    Sampson, Maureen L; Gounden, Verena; van Deventer, Hendrik E; Remaley, Alan T

    2016-02-01

    The main drawback of the periodic analysis of quality control (QC) material is that test performance is not monitored in time periods between QC analyses, potentially leading to the reporting of faulty test results. The objective of this study was to develop a patient based QC procedure for the more timely detection of test errors. Results from a Chem-14 panel measured on the Beckman LX20 analyzer were used to develop the model. Each test result was predicted from the other 13 members of the panel by multiple regression, which resulted in correlation coefficients between the predicted and measured result of >0.7 for 8 of the 14 tests. A logistic regression model, which utilized the measured test result, the predicted test result, the day of the week and time of day, was then developed for predicting test errors. The output of the logistic regression was tallied by a daily CUSUM approach and used to predict test errors, with a fixed specificity of 90%. The mean average run length (ARL) before error detection by CUSUM-Logistic Regression (CSLR) was 20 with a mean sensitivity of 97%, which was considerably shorter than the mean ARL of 53 (sensitivity 87.5%) for a simple prediction model that only used the measured result for error detection. A CUSUM-Logistic Regression analysis of patient laboratory data can be an effective approach for the rapid and sensitive detection of clinical laboratory errors. Published by Elsevier Inc.
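
    The idea of tallying model output with a CUSUM can be sketched as follows. This is a generic one-sided CUSUM under stated assumptions, not the procedure from the study: the reference value, the alarm threshold and the score sequence are all hypothetical.

```python
def cusum_first_alarm(scores, reference=0.5, threshold=1.9):
    """Return the index of the first CUSUM alarm, or None if there is none.

    scores: per-result error probabilities, e.g. from a logistic model.
    reference: allowance subtracted from each score (assumed value).
    threshold: alarm limit for the cumulative sum (assumed value).
    """
    total = 0.0
    for index, score in enumerate(scores):
        # Accumulate evidence above the reference; never drop below zero.
        total = max(0.0, total + score - reference)
        if total > threshold:
            return index
    return None

# Hypothetical run: three in-control results, then a sustained shift.
alarm_at = cusum_first_alarm([0.1, 0.2, 0.1] + [0.9] * 10)
```

    The number of results processed before the alarm corresponds to the run length that the abstract summarizes as the average run length (ARL).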

  6. Separation in Logistic Regression: Causes, Consequences, and Control.

    PubMed

    Mansournia, Mohammad Ali; Geroldinger, Angelika; Greenland, Sander; Heinze, Georg

    2018-04-01

    Separation is encountered in regression models with a discrete outcome (such as logistic regression) where the covariates perfectly predict the outcome. It is most frequent under the same conditions that lead to small-sample and sparse-data bias, such as presence of a rare outcome, rare exposures, highly correlated covariates, or covariates with strong effects. In theory, separation will produce infinite estimates for some coefficients. In practice, however, separation may be unnoticed or mishandled because of software limits in recognizing and handling the problem and in notifying the user. We discuss causes of separation in logistic regression and describe how common software packages deal with it. We then describe methods that remove separation, focusing on the same penalized-likelihood techniques used to address more general sparse-data problems. These methods improve accuracy, avoid software problems, and allow interpretation as Bayesian analyses with weakly informative priors. We discuss likelihood penalties, including some that can be implemented easily with any software package, and their relative advantages and disadvantages. We provide an illustration of ideas and methods using data from a case-control study of contraceptive practices and urinary tract infection.
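
    The statement that separation produces infinite estimates can be checked numerically with a small sketch. The four data points are hypothetical and perfectly separated, and the model has a single slope and no intercept. A simple ridge (quadratic) penalty stands in for the penalized-likelihood methods the article discusses; it is an assumption chosen for brevity, not the authors' recommended penalty.

```python
import math

def log_sigmoid(z):
    """Numerically stable log of the logistic function."""
    if z >= 0:
        return -math.log1p(math.exp(-z))
    return z - math.log1p(math.exp(z))

def log_likelihood(beta, xs, ys):
    """Log-likelihood of a no-intercept logistic model with slope beta."""
    total = 0.0
    for x, y in zip(xs, ys):
        z = beta * x
        total += log_sigmoid(z) if y == 1 else log_sigmoid(-z)
    return total

def penalized(beta, xs, ys, lam=1.0):
    """Log-likelihood minus a ridge penalty (assumed form, for illustration)."""
    return log_likelihood(beta, xs, ys) - 0.5 * lam * beta ** 2

# Hypothetical, perfectly separated data: y = 1 exactly when x > 0.
xs = [-2.0, -1.0, 1.0, 2.0]
ys = [0, 0, 1, 1]

grid = [0.5 * k for k in range(41)]  # candidate slopes from 0 to 20
plain = [log_likelihood(b, xs, ys) for b in grid]
pen = [penalized(b, xs, ys) for b in grid]
best_penalized_beta = grid[pen.index(max(pen))]
```

    The unpenalized log-likelihood keeps rising along the whole grid, so no finite maximizer exists, whereas the penalized version peaks at a finite slope.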

  7. Intermediate and advanced topics in multilevel logistic regression analysis

    PubMed Central

    Merlo, Juan

    2017-01-01

    Multilevel data occur frequently in health services, population and public health, and epidemiologic research. In such research, binary outcomes are common. Multilevel logistic regression models allow one to account for the clustering of subjects within clusters of higher‐level units when estimating the effect of subject and cluster characteristics on subject outcomes. A search of the PubMed database demonstrated that the use of multilevel or hierarchical regression models is increasing rapidly. However, our impression is that many analysts simply use multilevel regression models to account for the nuisance of within‐cluster homogeneity that is induced by clustering. In this article, we describe a suite of analyses that can complement the fitting of multilevel logistic regression models. These ancillary analyses permit analysts to estimate the marginal or population‐average effect of covariates measured at the subject and cluster level, in contrast to the within‐cluster or cluster‐specific effects arising from the original multilevel logistic regression model. We describe the interval odds ratio and the proportion of opposed odds ratios, which are summary measures of effect for cluster‐level covariates. We describe the variance partition coefficient and the median odds ratio which are measures of components of variance and heterogeneity in outcomes. These measures allow one to quantify the magnitude of the general contextual effect. We describe an R² measure that allows analysts to quantify the proportion of variation explained by different multilevel logistic regression models. We illustrate the application and interpretation of these measures by analyzing mortality in patients hospitalized with a diagnosis of acute myocardial infarction. © 2017 The Authors. Statistics in Medicine published by John Wiley & Sons Ltd. PMID:28543517
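
    Two of the measures named in this abstract have closed forms for a random-intercept logistic model, sketched below. The sketch assumes a single normally distributed random intercept with variance `cluster_variance` and the latent-variable formulation for the variance partition coefficient; the function names are hypothetical.

```python
import math
from statistics import NormalDist

def median_odds_ratio(cluster_variance):
    """Median odds ratio: exp(sqrt(2 * variance) * z_0.75)."""
    z75 = NormalDist().inv_cdf(0.75)  # 75th percentile of the standard normal
    return math.exp(math.sqrt(2.0 * cluster_variance) * z75)

def variance_partition_coefficient(cluster_variance):
    """Latent-variable VPC: the level-1 variance is fixed at pi^2 / 3."""
    return cluster_variance / (cluster_variance + math.pi ** 2 / 3.0)

# Hypothetical between-cluster variance on the log-odds scale.
mor = median_odds_ratio(1.0)
vpc = variance_partition_coefficient(1.0)
```

    A cluster variance of zero gives a median odds ratio of 1 and a variance partition coefficient of 0, that is, no general contextual effect.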

  8. Intermediate and advanced topics in multilevel logistic regression analysis.

    PubMed

    Austin, Peter C; Merlo, Juan

    2017-09-10

    Multilevel data occur frequently in health services, population and public health, and epidemiologic research. In such research, binary outcomes are common. Multilevel logistic regression models allow one to account for the clustering of subjects within clusters of higher-level units when estimating the effect of subject and cluster characteristics on subject outcomes. A search of the PubMed database demonstrated that the use of multilevel or hierarchical regression models is increasing rapidly. However, our impression is that many analysts simply use multilevel regression models to account for the nuisance of within-cluster homogeneity that is induced by clustering. In this article, we describe a suite of analyses that can complement the fitting of multilevel logistic regression models. These ancillary analyses permit analysts to estimate the marginal or population-average effect of covariates measured at the subject and cluster level, in contrast to the within-cluster or cluster-specific effects arising from the original multilevel logistic regression model. We describe the interval odds ratio and the proportion of opposed odds ratios, which are summary measures of effect for cluster-level covariates. We describe the variance partition coefficient and the median odds ratio which are measures of components of variance and heterogeneity in outcomes. These measures allow one to quantify the magnitude of the general contextual effect. We describe an R² measure that allows analysts to quantify the proportion of variation explained by different multilevel logistic regression models. We illustrate the application and interpretation of these measures by analyzing mortality in patients hospitalized with a diagnosis of acute myocardial infarction. © 2017 The Authors. Statistics in Medicine published by John Wiley & Sons Ltd.

  9. Latin hypercube approach to estimate uncertainty in ground water vulnerability

    USGS Publications Warehouse

    Gurdak, J.J.; McCray, J.E.; Thyne, G.; Qi, S.L.

    2007-01-01

    A methodology is proposed to quantify prediction uncertainty associated with ground water vulnerability models that were developed through an approach that coupled multivariate logistic regression with a geographic information system (GIS). This method uses Latin hypercube sampling (LHS) to illustrate the propagation of input error and estimate uncertainty associated with the logistic regression predictions of ground water vulnerability. Central to the proposed method is the assumption that prediction uncertainty in ground water vulnerability models is a function of input error propagation from uncertainty in the estimated logistic regression model coefficients (model error) and the values of explanatory variables represented in the GIS (data error). Input probability distributions that represent both model and data error sources of uncertainty were simultaneously sampled using a Latin hypercube approach with logistic regression calculations of probability of elevated nonpoint source contaminants in ground water. The resulting probability distribution represents the prediction intervals and associated uncertainty of the ground water vulnerability predictions. The method is illustrated through a ground water vulnerability assessment of the High Plains regional aquifer. Results of the LHS simulations reveal significant prediction uncertainties that vary spatially across the regional aquifer. Additionally, the proposed method enables a spatial deconstruction of the prediction uncertainty that can lead to improved prediction of ground water vulnerability. © 2007 National Ground Water Association.
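
    The sampling step can be sketched as a basic Latin hypercube design on the unit cube. This is a generic illustration, not the authors' implementation: the sample size, number of dimensions and seed are hypothetical, and in practice each column would be mapped through the inverse distribution function of the corresponding input (a coefficient or an explanatory variable).

```python
import random

def latin_hypercube(n_samples, n_dims, rng):
    """Draw n_samples points in [0, 1)^n_dims, one per stratum in each dimension."""
    columns = []
    for _ in range(n_dims):
        # One uniform draw inside each of the n_samples equal-width strata.
        points = [(i + rng.random()) / n_samples for i in range(n_samples)]
        rng.shuffle(points)  # pair strata at random across dimensions
        columns.append(points)
    return [tuple(column[i] for column in columns) for i in range(n_samples)]

# Hypothetical design: 5 samples of 2 uncertain inputs.
samples = latin_hypercube(5, 2, random.Random(0))
```

    Each dimension is covered exactly once per stratum, which is what distinguishes this design from plain random sampling.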

  10. Regression analysis for solving diagnosis problem of children's health

    NASA Astrophysics Data System (ADS)

    Cherkashina, Yu A.; Gerget, O. M.

    2016-04-01

    This paper applies statistical techniques, namely regression analysis, to assess the health status of children in the neonatal period based on medical data (hemostatic parameters, parameters of blood tests, the gestational age, vascular-endothelial growth factor) measured at 3-5 days of life. A detailed description of the studied medical data is given, and a binary logistic regression procedure is discussed. Basic results of the research are presented. A classification table of predicted and observed values is shown, and the overall percentage of correct recognition is determined. Regression equation coefficients are calculated, and the general regression equation is written based on them. Based on the results of logistic regression, ROC analysis was performed: the sensitivity and specificity of the model are calculated and ROC curves are constructed. These mathematical techniques allow diagnostics of children's health with a high quality of recognition. The results make a significant contribution to the development of evidence-based medicine and have a high practical importance in the professional activity of the author.

  11. Beyond logistic regression: structural equations modelling for binary variables and its application to investigating unobserved confounders.

    PubMed

    Kupek, Emil

    2006-03-15

    Structural equation modelling (SEM) has been increasingly used in medical statistics for solving a system of related regression equations. However, a great obstacle for its wider use has been its difficulty in handling categorical variables within the framework of generalised linear models. A large data set with a known structure among two related outcomes and three independent variables was generated to investigate the use of Yule's transformation of odds ratio (OR) into Q-metric by (OR-1)/(OR+1) to approximate Pearson's correlation coefficients between binary variables whose covariance structure can be further analysed by SEM. Percent of correctly classified events and non-events was compared with the classification obtained by logistic regression. The performance of SEM based on Q-metric was also checked on a small (N = 100) random sample of the data generated and on a real data set. SEM successfully recovered the generated model structure. SEM of real data suggested a significant influence of a latent confounding variable which would have not been detectable by standard logistic regression. SEM classification performance was broadly similar to that of the logistic regression. The analysis of binary data can be greatly enhanced by Yule's transformation of odds ratios into estimated correlation matrix that can be further analysed by SEM. The interpretation of results is aided by expressing them as odds ratios which are the most frequently used measure of effect in medical statistics.
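
    The transformation quoted in this abstract, Q = (OR-1)/(OR+1), is short enough to write out directly. The function name and the example odds ratios are hypothetical.

```python
def yules_q(odds_ratio):
    """Yule's Q: maps an odds ratio in (0, inf) to the interval (-1, 1)."""
    return (odds_ratio - 1.0) / (odds_ratio + 1.0)

# Hypothetical odds ratios: no association, positive, and the reciprocal.
q_values = [yules_q(1.0), yules_q(3.0), yules_q(1.0 / 3.0)]
```

    An odds ratio of 1 maps to 0, and reciprocal odds ratios map to values of equal size and opposite sign, which is what makes Q usable as a correlation-like input.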

  12. A regularization corrected score method for nonlinear regression models with covariate error.

    PubMed

    Zucker, David M; Gorfine, Malka; Li, Yi; Tadesse, Mahlet G; Spiegelman, Donna

    2013-03-01

    Many regression analyses involve explanatory variables that are measured with error, and failing to account for this error is well known to lead to biased point and interval estimates of the regression coefficients. We present here a new general method for adjusting for covariate error. Our method consists of an approximate version of the Stefanski-Nakamura corrected score approach, using the method of regularization to obtain an approximate solution of the relevant integral equation. We develop the theory in the setting of classical likelihood models; this setting covers, for example, linear regression, nonlinear regression, logistic regression, and Poisson regression. The method is extremely general in terms of the types of measurement error models covered, and is a functional method in the sense of not involving assumptions on the distribution of the true covariate. We discuss the theoretical properties of the method and present simulation results in the logistic regression setting (univariate and multivariate). For illustration, we apply the method to data from the Harvard Nurses' Health Study concerning the relationship between physical activity and breast cancer mortality in the period following a diagnosis of breast cancer. Copyright © 2013, The International Biometric Society.

  13. Predictive landslide susceptibility mapping using spatial information in the Pechabun area of Thailand

    NASA Astrophysics Data System (ADS)

    Oh, Hyun-Joo; Lee, Saro; Chotikasathien, Wisut; Kim, Chang Hwan; Kwon, Ju Hyoung

    2009-04-01

    For predictive landslide susceptibility mapping, this study applied and verified a probability model (the frequency ratio) and a statistical model (logistic regression) at Pechabun, Thailand, using a geographic information system (GIS) and remote sensing. Landslide locations were identified in the study area from interpretation of aerial photographs and field surveys, and maps of the topography, geology and land cover were compiled into a spatial database. The factors that influence landslide occurrence, such as slope gradient, slope aspect, curvature of topography and distance from drainage, were calculated from the topographic database. Lithology and distance from fault were extracted and calculated from the geology database. Land cover was classified from a Landsat TM satellite image. The frequency ratios and logistic regression coefficients were overlaid as each factor's ratings for landslide susceptibility mapping. The landslide susceptibility map was then verified and compared using the existing landslide locations. In the verification, the frequency ratio model showed 76.39% and the logistic regression model 70.42% prediction accuracy. The method can be used to reduce hazards associated with landslides and to plan land cover.

  14. HEALER: homomorphic computation of ExAct Logistic rEgRession for secure rare disease variants analysis in GWAS

    PubMed Central

    Wang, Shuang; Zhang, Yuchen; Dai, Wenrui; Lauter, Kristin; Kim, Miran; Tang, Yuzhe; Xiong, Hongkai; Jiang, Xiaoqian

    2016-01-01

    Motivation: Genome-wide association studies (GWAS) have been widely used in discovering the association between genotypes and phenotypes. Human genome data contain valuable but highly sensitive information. Unprotected disclosure of such information might put individual’s privacy at risk. It is important to protect human genome data. Exact logistic regression is a bias-reduction method based on a penalized likelihood to discover rare variants that are associated with disease susceptibility. We propose the HEALER framework to facilitate secure rare variants analysis with a small sample size. Results: We target at the algorithm design aiming at reducing the computational and storage costs to learn a homomorphic exact logistic regression model (i.e. evaluate P-values of coefficients), where the circuit depth is proportional to the logarithmic scale of data size. We evaluate the algorithm performance using rare Kawasaki Disease datasets. Availability and implementation: Download HEALER at http://research.ucsd-dbmi.org/HEALER/ Contact: shw070@ucsd.edu Supplementary information: Supplementary data are available at Bioinformatics online. PMID:26446135

  15. Application of logistic regression for landslide susceptibility zoning of Cekmece Area, Istanbul, Turkey

    NASA Astrophysics Data System (ADS)

    Duman, T. Y.; Can, T.; Gokceoglu, C.; Nefeslioglu, H. A.; Sonmez, H.

    2006-11-01

    As a result of industrialization, throughout the world, cities have been growing rapidly for the last century. One typical example of these growing cities is Istanbul, the population of which is over 10 million. Due to rapid urbanization, new areas suitable for settlement and engineering structures are necessary. The Cekmece area located west of the Istanbul metropolitan area is studied, because the landslide activity is extensive in this area. The purpose of this study is to develop a model that can be used to characterize landslide susceptibility in map form using logistic regression analysis of an extensive landslide database. A database of landslide activity was constructed using both aerial-photography and field studies. About 19.2% of the selected study area is covered by deep-seated landslides. The landslides that occur in the area are primarily located in sandstones with interbedded permeable and impermeable layers such as claystone, siltstone and mudstone. About 31.95% of the total landslide area is located in this unit. To apply logistic regression analyses, a data matrix including 37 variables was constructed. The variables used in the forward stepwise analyses are different measures of slope, aspect, elevation, stream power index (SPI), plan curvature, profile curvature, geology, geomorphology and relative permeability of lithological units. A total of 25 variables were identified as exerting strong influence on landslide occurrence, and were included in the logistic regression equation. Wald statistics values indicate that lithology, SPI and slope are more important than the other parameters in the equation. Beta coefficients of the 25 variables included in the logistic regression equation provide a model for landslide susceptibility in the Cekmece area. This model is used to generate a landslide susceptibility map that correctly classified 83.8% of the landslide-prone areas.

  16. Assessing risk factors for periodontitis using regression

    NASA Astrophysics Data System (ADS)

    Lobo Pereira, J. A.; Ferreira, Maria Cristina; Oliveira, Teresa

    2013-10-01

    Multivariate statistical analysis is indispensable to assess the associations and interactions between different factors and the risk of periodontitis. Among others, regression analysis is a statistical technique widely used in healthcare to investigate and model the relationship between variables. In our work we study the impact of socio-demographic, medical and behavioral factors on periodontal health. Using linear and logistic regression models, we can assess the relevance, as risk factors for periodontitis disease, of the following independent variables (IVs): Age, Gender, Diabetic Status, Education, Smoking status and Plaque Index. The multiple linear regression analysis model was built to evaluate the influence of IVs on mean Attachment Loss (AL). Thus, the regression coefficients will be obtained along with the respective p-values from the significance tests. The classification of a case (individual) adopted in the logistic model was the extent of the destruction of periodontal tissues defined by an Attachment Loss greater than or equal to 4 mm in 25% (AL≥4mm/≥25%) of sites surveyed. The association measures include the Odds Ratios together with the corresponding 95% confidence intervals.
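
    The odds ratios and 95% confidence intervals mentioned at the end of this abstract are commonly obtained by exponentiating a logistic regression coefficient and its Wald limits. The sketch below shows that calculation; the coefficient and standard error are hypothetical and are not results from the study.

```python
import math

def odds_ratio_with_ci(beta, standard_error, z=1.96):
    """Return (odds ratio, lower limit, upper limit) for a Wald interval.

    z = 1.96 corresponds to an approximate 95% confidence level.
    """
    lower = math.exp(beta - z * standard_error)
    upper = math.exp(beta + z * standard_error)
    return math.exp(beta), lower, upper

# Hypothetical coefficient for a risk factor such as smoking status.
odds_ratio, lower, upper = odds_ratio_with_ci(math.log(2.0), 0.1)
```

    An interval that excludes 1 corresponds to a coefficient that differs from zero at roughly the 5% level.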

  17. [Analysis of risk factors for dry eye syndrome in visual display terminal workers].

    PubMed

    Zhu, Yong; Yu, Wen-lan; Xu, Ming; Han, Lei; Cao, Wen-dong; Zhang, Hong-bing; Zhang, Heng-dong

    2013-08-01

    To analyze the risk factors for dry eye syndrome in visual display terminal (VDT) workers and to provide a scientific basis for protecting the eye health of VDT workers. Questionnaire survey, Schirmer I test, tear break-up time test, and workshop microenvironment evaluation were performed in 185 VDT workers. Multivariate logistic regression analysis was performed to determine the risk factors for dry eye syndrome in VDT workers after adjustment for confounding factors. In the logistic regression model, the regression coefficients of daily mean time of exposure to screen, daily mean time of watching TV, parallel screen-eye angle, upward screen-eye angle, eye-screen distance of less than 20 cm, irregular breaks during screen-exposed work, age, and female gender on the results of Schirmer I test were 0.153, 0.548, 0.400, 0.796, 0.234, 0.516, 0.559, and -0.685, respectively; the regression coefficients of daily mean time of exposure to screen, parallel screen-eye angle, upward screen-eye angle, age, working years, and female gender on tear break-up time were 0.021, 0.625, 2.652, 0.749, 0.403, and 1.481, respectively. Daily mean time of exposure to screen, daily mean time of watching TV, parallel screen-eye angle, upward screen-eye angle, eye-screen distance of less than 20 cm, irregular breaks during screen-exposed work, age, and working years are risk factors for dry eye syndrome in VDT workers.

  18. Assessing LULC changes over Chilika Lake watershed in Eastern India using Driving Force Analysis

    NASA Astrophysics Data System (ADS)

    Jadav, S.; Syed, T. H.

    2017-12-01

    Rapid population growth and industrial development have brought about significant changes in Land Use Land Cover (LULC) of many developing countries in the world. This study investigates LULC changes in the Chilika Lake watershed of Eastern India for the period of 1988 to 2016. The methodology involves pre-processing and classification of Landsat satellite images using a support vector machine (SVM) supervised classification algorithm. Results reveal that 'Cropland', 'Emergent Vegetation' and 'Settlement' have expanded over the study period by 284.61 km², 106.83 km² and 98.83 km² respectively. Contemporaneously, 'Lake Area', 'Vegetation' and 'Scrub Land' have decreased by 121.62 km², 96.05 km² and 80.29 km² respectively. This study also analyzes five major driving force variables of socio-economic and climatological factors triggering LULC changes through a bivariate logistic regression model. The outcome gives a credible relative operating characteristic (ROC) value of 0.76, which indicates the goodness of fit of the logistic regression model. In addition, independent variables like distance to drainage network and average annual rainfall have negative regression coefficient values, which represent a decreased rate of the dependent variable (changed LULC), whereas the other independent variables (population density, distance to road and distance to railway) have positive regression coefficients, which indicate an increased rate of changed LULC. Results from this study will be crucial for planning and restoration of this vital lake water body that has major implications over the society and environment at large.

  19. Development and evaluation of an electromagnetic hypersensitivity questionnaire for Japanese people

    PubMed Central

    Tokiya, Mikiko; Mizuki, Masami; Miyata, Mikio; Kanatani, Kumiko T.; Takagi, Airi; Tsurikisawa, Naomi; Kame, Setsuko; Katoh, Takahiko; Tsujiuchi, Takuya; Kumano, Hiroaki

    2016-01-01

    The purpose of the present study was to evaluate the validity and reliability of a Japanese version of an electromagnetic hypersensitivity (EHS) questionnaire, originally developed by Eltiti et al. in the United Kingdom. Using this Japanese EHS questionnaire, surveys were conducted on 1306 controls and 127 self‐selected EHS subjects in Japan. Principal component analysis of controls revealed eight principal symptom groups, namely, nervous, skin‐related, head‐related, auditory and vestibular, musculoskeletal, allergy‐related, sensory, and heart/chest‐related. The reliability of the Japanese EHS questionnaire was confirmed by high to moderate intraclass correlation coefficients in a test–retest analysis, and high Cronbach's α coefficients (0.853–0.953) from each subscale. A comparison of scores of each subscale between self‐selected EHS subjects and age‐ and sex‐matched controls using bivariate logistic regression analysis, Mann–Whitney U‐ and χ² tests, verified the validity of the questionnaire. This study demonstrated that the Japanese EHS questionnaire is reliable and valid, and can be used for surveillance of EHS individuals in Japan. Furthermore, based on multiple logistic regression and receiver operating characteristic analyses, we propose specific preliminary criteria for screening EHS individuals in Japan. Bioelectromagnetics. 37:353–372, 2016. © 2016 The Authors. Bioelectromagnetics Published by Wiley Periodicals, Inc. PMID:27324106

  20. The association between short interpregnancy interval and preterm birth in Louisiana: a comparison of methods.

    PubMed

    Howard, Elizabeth J; Harville, Emily; Kissinger, Patricia; Xiong, Xu

    2013-07-01

    There is growing interest in the application of propensity scores (PS) in epidemiologic studies, especially within the field of reproductive epidemiology. This retrospective cohort study assesses the impact of a short interpregnancy interval (IPI) on preterm birth and compares the results of the conventional logistic regression analysis with analyses utilizing a PS. The study included 96,378 singleton infants from Louisiana birth certificate data (1995-2007). Five regression models designed for methods comparison are presented. Ten percent (10.17 %) of all births were preterm; 26.83 % of births were from a short IPI. The PS-adjusted model produced a more conservative estimate of the exposure variable compared to the conventional logistic regression method (β-coefficient: 0.21 vs. 0.43), as well as a smaller standard error (0.024 vs. 0.028), odds ratio and 95 % confidence intervals [1.15 (1.09, 1.20) vs. 1.23 (1.17, 1.30)]. The inclusion of more covariate and interaction terms in the PS did not change the estimates of the exposure variable. This analysis indicates that PS-adjusted regression may be appropriate for validation of conventional methods in a large dataset with a fairly common outcome. PS's may be beneficial in producing more precise estimates, especially for models with many confounders and effect modifiers and where conventional adjustment with logistic regression is unsatisfactory. Short intervals between pregnancies are associated with preterm birth in this population, according to either technique. Birth spacing is an issue that women have some control over. Educational interventions, including birth control, should be applied during prenatal visits and following delivery.

  1. Interpreting Multiple Logistic Regression Coefficients in Prospective Observational Studies

    DTIC Science & Technology

    1982-11-01

    TG HDL-C Males T-C 50-80 MRW pɘ.05 pɘ.10 1HDL-C = high density lipoprotein cholesterol MRW...consider a more complete analysis, attempting to uncover the relationship between CHD and TG controlling for covariables such a high density ...for T-C can be reduced, when among older individuals, elevated T-C may increase the capacity to carry cholesterol in the high density lipoprotein

  2. Learning investment indicators through data extension

    NASA Astrophysics Data System (ADS)

    Dvořák, Marek

    2017-07-01

    Stock prices in the form of time series were analysed using univariate and multivariate statistical methods. After simple data preprocessing in the form of logarithmic differences, we augmented this univariate time series to a multivariate representation. This method makes use of sliding windows to calculate several dozen new variables using simple statistical tools like first and second moments as well as more complicated statistics, like auto-regression coefficients and residual analysis, followed by an optional quadratic transformation that was further used for data extension. These were used as explanatory variables in a regularized logistic LASSO regression which tried to estimate the Buy-Sell Index (BSI) from real stock market data.
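
    The preprocessing and data-extension steps can be sketched as below. The sketch covers only logarithmic differences and two window statistics (the mean and the population variance, standing in for the first and second moments). The window width, prices and function names are hypothetical, and the auto-regression, residual and quadratic features from the abstract are omitted.

```python
import math
from statistics import mean, pvariance

def log_differences(prices):
    """Logarithmic differences between consecutive prices."""
    return [math.log(b / a) for a, b in zip(prices, prices[1:])]

def window_features(series, width):
    """One (mean, variance) feature row per sliding window of the given width."""
    rows = []
    for start in range(len(series) - width + 1):
        window = series[start:start + width]
        rows.append((mean(window), pvariance(window)))
    return rows

# Hypothetical prices growing by 10% per step.
returns = log_differences([100.0, 110.0, 121.0, 133.1])
features = window_features(returns, width=2)
```

    Each feature row could then serve as one observation of explanatory variables for a regularized logistic regression.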

  3. Using a binary logistic regression method and GIS for evaluating and mapping the groundwater spring potential in the Sultan Mountains (Aksehir, Turkey)

    NASA Astrophysics Data System (ADS)

    Ozdemir, Adnan

    2011-07-01

    Summary: The purpose of this study is to produce a groundwater spring potential map of the Sultan Mountains in central Turkey, based on a logistic regression method within a Geographic Information System (GIS) environment. Using field surveys, the locations of the springs (440 springs) were determined in the study area. In this study, 17 spring-related factors were used in the analysis: geology, relative permeability, land use/land cover, precipitation, elevation, slope, aspect, total curvature, plan curvature, profile curvature, wetness index, stream power index, sediment transport capacity index, distance to drainage, distance to fault, drainage density, and fault density map. The coefficients of the predictor variables were estimated using binary logistic regression analysis and were used to calculate the groundwater spring potential for the entire study area. The accuracy of the final spring potential map was evaluated based on the observed springs. The accuracy of the model was evaluated by calculating the relative operating characteristics. The area value of the relative operating characteristic curve model was found to be 0.82. These results indicate that the model is a good estimator of the spring potential in the study area. The spring potential map shows that the areas of very low, low, moderate and high groundwater spring potential classes are 105.586 km² (28.99%), 74.271 km² (19.906%), 101.203 km² (27.14%), and 90.05 km² (24.671%), respectively. The interpretations of the potential map showed that stream power index, relative permeability of lithologies, geology, elevation, aspect, wetness index, plan curvature, and drainage density play major roles in spring occurrence and distribution in the Sultan Mountains. The logistic regression approach has not yet been used to delineate groundwater potential zones. In this study, the logistic regression method was used to locate potential zones for groundwater springs in the Sultan Mountains. The evolved model was found to be in strong agreement with the available groundwater spring test data. Hence, this method can be used routinely in groundwater exploration under favourable conditions.
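
    The area under the relative operating characteristic curve quoted in this record (0.82) can be read as the probability that a randomly chosen spring location receives a higher predicted potential than a randomly chosen non-spring location. The following is a minimal Python sketch of that rank-based calculation; the function name and the toy scores are illustrative assumptions, not data or code from the study.

```python
def roc_auc(positive_scores, negative_scores):
    """Rank-based AUC: share of (positive, negative) pairs ordered correctly.

    A tie between a positive and a negative score counts as half a pair.
    """
    wins = 0.0
    for pos in positive_scores:
        for neg in negative_scores:
            if pos > neg:
                wins += 1.0
            elif pos == neg:
                wins += 0.5
    return wins / (len(positive_scores) * len(negative_scores))


# Hypothetical predicted potentials (not taken from the paper).
spring_scores = [0.8, 0.3]
non_spring_scores = [0.5, 0.1]
auc = roc_auc(spring_scores, non_spring_scores)  # 3 of the 4 pairs are ordered correctly
```

    The pairwise loop is quadratic in the number of locations, which is fine for a sketch; a sort-based version would be preferable for a full raster.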

  4. Early warnings for suicide attempt among Chinese rural population.

    PubMed

    Lyu, Juncheng; Wang, Yingying; Shi, Hong; Zhang, Jie

    2018-06-05

    This study aimed to explore the main influencing factors of attempted suicide and establish an early warning model, so as to put forward prevention strategies for attempted suicide. Data came from a large-scale case-control epidemiological survey. A sample of 659 serious suicide attempters was randomly recruited from 13 rural counties in China. Each case was matched by a community control for gender, age, and residence location. Face-to-face interviews were conducted for all the cases and controls with the same structured questionnaire. Univariate logistic regression was applied to screen the factors and multivariate logistic regression was used to identify the predictors. There were no statistical differences between suicide attempters and the community controls in gender, age, and residence location. The Cronbach's coefficients for all the scales used were above 0.675. The multivariate logistic regressions revealed 12 statistically significant variables predicting attempted suicide, including less education, family history of suicide, poor health, mental problem, aspiration strain, hopelessness, impulsivity, depression, negative life events. On the other hand, social support, coping skills, and healthy community protected the rural residents from suicide attempt. The identified warning predictors have significant clinical meaning for clinical psychiatrists. Crisis intervention strategies in rural China should be informed by the findings from this research. Education, social support, healthy community, and strain reduction are all measures to decrease the likelihood of crises. Copyright © 2018. Published by Elsevier B.V.

  5. Semiparametric time varying coefficient model for matched case-crossover studies.

    PubMed

    Ortega-Villa, Ana Maria; Kim, Inyoung; Kim, H

    2017-03-15

    In matched case-crossover studies, it is generally accepted that the covariates on which a case and associated controls are matched cannot exert a confounding effect on independent predictors included in the conditional logistic regression model. This is because any stratum effect is removed by the conditioning on the fixed number of sets of the case and controls in the stratum. Hence, the conditional logistic regression model is not able to detect any effects associated with the matching covariates by stratum. However, some matching covariates such as time often play an important role as an effect modification leading to incorrect statistical estimation and prediction. Therefore, we propose three approaches to evaluate effect modification by time. The first is a parametric approach, the second is a semiparametric penalized approach, and the third is a semiparametric Bayesian approach. Our parametric approach is a two-stage method, which uses conditional logistic regression in the first stage and then estimates polynomial regression in the second stage. Our semiparametric penalized and Bayesian approaches are one-stage approaches developed by using regression splines. Our semiparametric one stage approach allows us to not only detect the parametric relationship between the predictor and binary outcomes, but also evaluate nonparametric relationships between the predictor and time. We demonstrate the advantage of our semiparametric one-stage approaches using both a simulation study and an epidemiological example of a 1-4 bi-directional case-crossover study of childhood aseptic meningitis with drinking water turbidity. We also provide statistical inference for the semiparametric Bayesian approach using Bayes Factors. Copyright © 2016 John Wiley & Sons, Ltd.

  6. Predicting The Type Of Pregnancy Using Flexible Discriminate Analysis And Artificial Neural Networks: A Comparison Study

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Hooman, A.; Mohammadzadeh, M

    Some medical and epidemiological surveys have been designed to predict a nominal response variable with several levels. With regard to the type of pregnancy there are four possible states: wanted, unwanted by wife, unwanted by husband and unwanted by couple. In this paper, we have predicted the type of pregnancy, as well as the factors influencing it using three different models and comparing them. Regarding the type of pregnancy with several levels, we developed a multinomial logistic regression, a neural network and a flexible discrimination based on the data and compared their results using two statistical indices: surface under the curve (ROC) and kappa coefficient. Based on these two indices, flexible discrimination proved to be a better fit for prediction on data in comparison to other methods. When the relations among variables are complex, one can use flexible discrimination instead of multinomial logistic regression and neural network to predict the nominal response variables with several levels in order to gain more accurate predictions.

  7. Evaluating Cancer Patients' Expectations and Barriers Toward Traditional Chinese Medicine Utilization in China: A Patient-Support Group-Based Cross-Sectional Survey.

    PubMed

    Sun, Lingyun; Mao, Jun J; Vertosick, Emily; Seluzicki, Christina; Yang, Yufei

    2018-06-01

    Traditional Chinese medicine (TCM) is widely used among Chinese cancer patients. However, little is known about Chinese patients' expectations and barriers toward using TCM for cancer. We conducted a cross-sectional survey within a patient-support group, the Beijing Anti-Cancer Association. We measured the outcome, Chinese cancer survivors' expectations and barriers toward TCM utilization, using a modified version of ABCAM (Attitudes and Beliefs towards Complementary and Alternative Medicine), the ABTCM (Attitudes and Beliefs towards Traditional Chinese Medicine). We used multivariate models to evaluate the impact of socioeconomic status and clinical factors on their expectations and barriers (including treatment concerns and logistical challenges domain) toward TCM. Among 590 participants, most patients expected TCM to boost their immune system (96%), improve their physical health (96%), and reduce symptoms (94%). Many had logistical challenges (difficulty decocting herbs (58%) and finding a good TCM physician (55%)). A few were concerned that TCM might interfere with conventional treatments (7.6%), and that many TCM treatments are not based on scientific research (9.1%). In the multivariable regression model, age ≤60 years was independently associated with higher expectation score (P = .031). Age ≤60 years (coefficient 5.0, P = .003) and localized disease (coefficient 9.5, P = .001) were both associated with higher treatment concerns. Active employment status (coefficient 9.0, P = .008) and localized disease (coefficient 7.5, P = .030) were related to more logistical challenges. Age and cancer stage were related to Chinese cancer patients' perceived expectations and barriers toward TCM use. Understanding these attitudes is important for reshaping the role that TCM plays in China's patient-centered comprehensive cancer care model.

  8. Prediction models for clustered data: comparison of a random intercept and standard regression model

    PubMed Central

    2013-01-01

    Background When study data are clustered, standard regression analysis is considered inappropriate and analytical techniques for clustered data need to be used. For prediction research in which the interest of predictor effects is on the patient level, random effect regression models are probably preferred over standard regression analysis. It is well known that the random effect parameter estimates and the standard logistic regression parameter estimates are different. Here, we compared random effect and standard logistic regression models for their ability to provide accurate predictions. Methods Using an empirical study on 1642 surgical patients at risk of postoperative nausea and vomiting, who were treated by one of 19 anesthesiologists (clusters), we developed prognostic models either with standard or random intercept logistic regression. External validity of these models was assessed in new patients from other anesthesiologists. We supported our results with simulation studies using intra-class correlation coefficients (ICC) of 5%, 15%, or 30%. Standard performance measures and measures adapted for the clustered data structure were estimated. Results The model developed with random effect analysis showed better discrimination than the standard approach, if the cluster effects were used for risk prediction (standard c-index of 0.69 versus 0.66). In the external validation set, both models showed similar discrimination (standard c-index 0.68 versus 0.67). The simulation study confirmed these results. For datasets with a high ICC (≥15%), model calibration was only adequate in external subjects, if the used performance measure assumed the same data structure as the model development method: standard calibration measures showed good calibration for the standard developed model, calibration measures adapting the clustered data structure showed good calibration for the prediction model with random intercept. 
Conclusion The models with random intercept discriminate better than the standard model only if the cluster effect is used for predictions. The prediction model with random intercept had good calibration within clusters. PMID:23414436

  9. Prediction models for clustered data: comparison of a random intercept and standard regression model.

    PubMed

    Bouwmeester, Walter; Twisk, Jos W R; Kappen, Teus H; van Klei, Wilton A; Moons, Karel G M; Vergouwe, Yvonne

    2013-02-15

    When study data are clustered, standard regression analysis is considered inappropriate and analytical techniques for clustered data need to be used. For prediction research in which the interest of predictor effects is on the patient level, random effect regression models are probably preferred over standard regression analysis. It is well known that the random effect parameter estimates and the standard logistic regression parameter estimates are different. Here, we compared random effect and standard logistic regression models for their ability to provide accurate predictions. Using an empirical study on 1642 surgical patients at risk of postoperative nausea and vomiting, who were treated by one of 19 anesthesiologists (clusters), we developed prognostic models either with standard or random intercept logistic regression. External validity of these models was assessed in new patients from other anesthesiologists. We supported our results with simulation studies using intra-class correlation coefficients (ICC) of 5%, 15%, or 30%. Standard performance measures and measures adapted for the clustered data structure were estimated. The model developed with random effect analysis showed better discrimination than the standard approach, if the cluster effects were used for risk prediction (standard c-index of 0.69 versus 0.66). In the external validation set, both models showed similar discrimination (standard c-index 0.68 versus 0.67). The simulation study confirmed these results. For datasets with a high ICC (≥15%), model calibration was only adequate in external subjects, if the used performance measure assumed the same data structure as the model development method: standard calibration measures showed good calibration for the standard developed model, calibration measures adapting the clustered data structure showed good calibration for the prediction model with random intercept. 
The models with random intercept discriminate better than the standard model only if the cluster effect is used for predictions. The prediction model with random intercept had good calibration within clusters.
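
    The ICC values of 5%, 15%, and 30% used in the simulations can be related to the variance of the random intercept. The abstract does not say which ICC definition was used; the sketch below assumes the common latent-variable definition for a random intercept logistic model, ICC = var_u / (var_u + pi^2/3), where pi^2/3 is the variance of the standard logistic distribution. The function names are illustrative.

```python
import math

# Variance of the standard logistic distribution (level-1 residual variance
# under the latent-variable formulation of logistic regression).
LOGISTIC_RESIDUAL_VARIANCE = math.pi ** 2 / 3


def icc_from_intercept_variance(var_u):
    """ICC implied by a random-intercept variance (latent-variable definition)."""
    return var_u / (var_u + LOGISTIC_RESIDUAL_VARIANCE)


def intercept_variance_from_icc(icc):
    """Random-intercept variance that would produce a given ICC (0 <= icc < 1)."""
    return icc / (1.0 - icc) * LOGISTIC_RESIDUAL_VARIANCE


# Intercept variances implied by the three simulated ICC levels,
# under the assumed definition only.
implied_variances = {icc: intercept_variance_from_icc(icc) for icc in (0.05, 0.15, 0.30)}
```

    Under this assumption, a higher ICC corresponds to a larger between-anesthesiologist variance on the log-odds scale, which is why cluster effects matter more for prediction in the high-ICC scenarios.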

  10. The relationship between social support, shared decision-making and patient's trust in doctors: a cross-sectional survey of 2,197 inpatients using the Cologne Patient Questionnaire.

    PubMed

    Ommen, Oliver; Thuem, Sonja; Pfaff, Holger; Janssen, Christian

    2011-06-01

    Empirical studies have confirmed that a trusting physician-patient interaction promotes patient satisfaction, adherence to treatment and improved health outcomes. The objective of this analysis was to investigate the relationship between social support, shared decision-making and inpatient's trust in physicians in a hospital setting. A written questionnaire was completed by 2,197 patients who were treated in the year 2000 in six hospitals in Germany. Logistic regression was performed with a dichotomized index for patient's trust in physicians. The logistic regression model identified significant relationships (p < 0.05) in terms of emotional support (standardized effect coefficient [sc], 3.65), informational support (sc, 1.70), shared decision-making (sc, 1.40), age (sc, 1.14), socioeconomic status (sc, 1.15) and gender (sc, 1.15). We found no significant relationship between 'tendency to excuse' and trust. The last regression model accounted for 49.1% of Nagelkerke's R-square. Insufficient physician communication skills can lead to extensive negative effects on the trust of patients in their physicians. Thus, it becomes clear that medical support requires not only biomedical, but also psychosocial skills.

  11. The effect of high leverage points on the logistic ridge regression estimator having multicollinearity

    NASA Astrophysics Data System (ADS)

    Ariffin, Syaiba Balqish; Midi, Habshah

    2014-06-01

    This article is concerned with the performance of the logistic ridge regression estimation technique in the presence of multicollinearity and high leverage points. In logistic regression, multicollinearity exists among predictors and in the information matrix. The maximum likelihood estimator suffers a huge setback in the presence of multicollinearity, which causes regression estimates to have unduly large standard errors. To remedy this problem, a logistic ridge regression estimator is put forward. It is evident that the logistic ridge regression estimator outperforms the maximum likelihood approach for handling multicollinearity. The effect of high leverage points on the performance of the logistic ridge regression estimator is then investigated through a real data set and a simulation study. The findings signify that the logistic ridge regression estimator fails to provide better parameter estimates in the presence of both high leverage points and multicollinearity.

  12. Factors affecting the use of postincisional analgesics in dogs and cats by Canadian veterinarians in 2001

    PubMed Central

    Hewson, Caroline J.; Dohoo, Ian R.

    2006-01-01

    Abstract Factors affecting the postincisional use of analgesics for ovariohysterectomy (OVH) in dogs and cats were assessed by using data collected from 280 Canadian veterinarians, as part of a national, randomized mail survey (response rate 57.8%). Predictors of analgesic usage identified by logistic regression included the presence of at least 1 animal health technician (AHT) per 2 veterinarians (OR = 2.3, P = 0.004), and the veterinarians’ perception of the pain caused by surgery without analgesia (OR = 1.5, P < 0.001). Linear regression identified the following predictors of veterinarians’ perception of pain: the presence of more than 1 AHT per 2 veterinarians (coefficient = 0.42, P = 0.048) and the number of years since graduation (coefficient = −0.073, P < 0.001). Some of these risk factors are similar to those identified in 1994. The results suggest that continuing education may help to increase analgesic usage. Other important contributors may be client education and a valid method of pain assessment. PMID:16734371

  13. Multiple regression analysis of anthropometric measurements influencing the cephalic index of male Japanese university students.

    PubMed

    Hossain, Md Golam; Saw, Aik; Alam, Rashidul; Ohtsuki, Fumio; Kamarul, Tunku

    2013-09-01

    Cephalic index (CI), the ratio of head breadth to head length, is widely used to categorise human populations. The aim of this study was to assess the impact of anthropometric measurements on the CI of male Japanese university students. This study included 1,215 male university students from Tokyo and Kyoto, selected using convenience sampling. Multiple regression analysis was used to determine the effect of anthropometric measurements on CI. The variance inflation factor (VIF) showed no evidence of a multicollinearity problem among independent variables. The coefficients of the regression line demonstrated a significant positive relationship between CI and minimum frontal breadth (p < 0.01), bizygomatic breadth (p < 0.01) and head height (p < 0.05), and a negative relationship between CI and morphological facial height (p < 0.01) and head circumference (p < 0.01). Moreover, the coefficient and odds ratio of logistic regression analysis showed a greater likelihood for minimum frontal breadth (p < 0.01) and bizygomatic breadth (p < 0.01) to predict round-headedness, and morphological facial height (p < 0.05) and head circumference (p < 0.01) to predict long-headedness. Stepwise regression analysis revealed bizygomatic breadth, head circumference, minimum frontal breadth, head height and morphological facial height to be the best predictor craniofacial measurements with respect to CI. The results suggest that most of the variables considered in this study appear to influence the CI of adult male Japanese students.

  14. A comparison of Cox and logistic regression for use in genome-wide association studies of cohort and case-cohort design.

    PubMed

    Staley, James R; Jones, Edmund; Kaptoge, Stephen; Butterworth, Adam S; Sweeting, Michael J; Wood, Angela M; Howson, Joanna M M

    2017-06-01

    Logistic regression is often used instead of Cox regression to analyse genome-wide association studies (GWAS) of single-nucleotide polymorphisms (SNPs) and disease outcomes with cohort and case-cohort designs, as it is less computationally expensive. Although Cox and logistic regression models have been compared previously in cohort studies, this work does not completely cover the GWAS setting nor extend to the case-cohort study design. Here, we evaluated Cox and logistic regression applied to cohort and case-cohort genetic association studies using simulated data and genetic data from the EPIC-CVD study. In the cohort setting, there was a modest improvement in power to detect SNP-disease associations using Cox regression compared with logistic regression, which increased as the disease incidence increased. In contrast, logistic regression had more power than (Prentice weighted) Cox regression in the case-cohort setting. Logistic regression yielded inflated effect estimates (assuming the hazard ratio is the underlying measure of association) for both study designs, especially for SNPs with greater effect on disease. Given logistic regression is substantially more computationally efficient than Cox regression in both settings, we propose a two-step approach to GWAS in cohort and case-cohort studies. First to analyse all SNPs with logistic regression to identify associated variants below a pre-defined P-value threshold, and second to fit Cox regression (appropriately weighted in case-cohort studies) to those identified SNPs to ensure accurate estimation of association with disease.

  15. The crux of the method: assumptions in ordinary least squares and logistic regression.

    PubMed

    Long, Rebecca G

    2008-10-01

    Logistic regression has increasingly become the tool of choice when analyzing data with a binary dependent variable. While resources relating to the technique are widely available, clear discussions of why logistic regression should be used in place of ordinary least squares regression are difficult to find. The current paper compares and contrasts the assumptions of ordinary least squares with those of logistic regression and explains why logistic regression's looser assumptions make it adept at handling violations of the more important assumptions in ordinary least squares.

  16. Retinal nerve fibre layer thinning is associated with drug resistance in epilepsy

    PubMed Central

    Balestrini, Simona; Clayton, Lisa M S; Bartmann, Ana P; Chinthapalli, Krishna; Novy, Jan; Coppola, Antonietta; Wandschneider, Britta; Stern, William M; Acheson, James; Bell, Gail S; Sander, Josemir W; Sisodiya, Sanjay M

    2016-01-01

    Objective Retinal nerve fibre layer (RNFL) thickness is related to the axonal anterior visual pathway and is considered a marker of overall white matter ‘integrity’. We hypothesised that RNFL changes would occur in people with epilepsy, independently of vigabatrin exposure, and be related to clinical characteristics of epilepsy. Methods Three hundred people with epilepsy attending specialist clinics and 90 healthy controls were included in this cross-sectional cohort study. RNFL imaging was performed using spectral-domain optical coherence tomography (OCT). Drug resistance was defined as failure of adequate trials of two antiepileptic drugs to achieve sustained seizure freedom. Results The average RNFL thickness and the thickness of each of the 90° quadrants were significantly thinner in people with epilepsy than healthy controls (p<0.001, t test). In a multivariate logistic regression model, drug resistance was the only significant predictor of abnormal RNFL thinning (OR=2.09, 95% CI 1.09 to 4.01, p=0.03). Duration of epilepsy (coefficient −0.16, p=0.004) and presence of intellectual disability (coefficient −4.0, p=0.044) also showed a significant relationship with RNFL thinning in a multivariate linear regression model. Conclusions Our results suggest that people with epilepsy with no previous exposure to vigabatrin have a significantly thinner RNFL than healthy participants. Drug resistance emerged as a significant independent predictor of RNFL borderline attenuation or abnormal thinning in a logistic regression model. As this is easily assessed by OCT, RNFL thickness might be used to better understand the mechanisms underlying drug resistance, and possibly severity. Longitudinal studies are needed to confirm our findings. PMID:25886782
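
    The odds ratio reported in this record (OR=2.09, 95% CI 1.09 to 4.01) can be translated back to the logistic regression coefficient scale. The sketch below assumes a Wald-type interval that is symmetric on the log-odds scale and a normal approximation for the p-value; the abstract does not state how the interval was computed, and the function name is illustrative.

```python
import math


def coefficient_from_odds_ratio(odds_ratio, ci_lower, ci_upper, z_crit=1.959964):
    """Recover the log-odds coefficient, its standard error, z and p.

    Assumes the interval is exp(beta +/- z_crit * se), i.e. a Wald interval.
    """
    beta = math.log(odds_ratio)
    se = (math.log(ci_upper) - math.log(ci_lower)) / (2.0 * z_crit)
    z = beta / se
    # Two-sided p-value from the standard normal distribution.
    p = math.erfc(abs(z) / math.sqrt(2.0))
    return beta, se, z, p


beta, se, z, p = coefficient_from_odds_ratio(2.09, 1.09, 4.01)
# beta is about 0.74 on the log-odds scale; p comes out between 0.02 and 0.03,
# in line with the reported p=0.03 after rounding.
```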

  17. Using Dominance Analysis to Determine Predictor Importance in Logistic Regression

    ERIC Educational Resources Information Center

    Azen, Razia; Traxel, Nicole

    2009-01-01

    This article proposes an extension of dominance analysis that allows researchers to determine the relative importance of predictors in logistic regression models. Criteria for choosing logistic regression R[superscript 2] analogues were determined and measures were selected that can be used to perform dominance analysis in logistic regression. A…

  18. Applying Kaplan-Meier to Item Response Data

    ERIC Educational Resources Information Center

    McNeish, Daniel

    2018-01-01

    Some IRT models can be equivalently modeled in alternative frameworks such as logistic regression. Logistic regression can also model time-to-event data, which concerns the probability of an event occurring over time. Using the relation between time-to-event models and logistic regression and the relation between logistic regression and IRT, this…

  19. Contemporary New Zealand coefficients for the Trauma Injury Severity Score: TRISS(NZ).

    PubMed

    Schluter, Philip J; Cameron, Cate M; Davey, Tamzyn M; Civil, Ian; Orchard, Jodie; Dansey, Rangi; Hamill, James; Naylor, Helen; James, Carolyn; Dorrian, Jenny; Christey, Grant; Pollard, Cliff; McClure, Rod J

    2009-09-11

    To develop local contemporary coefficients for the Trauma Injury Severity Score in New Zealand, TRISS(NZ), and to evaluate their performance at predicting survival against the original TRISS coefficients. Retrospective cohort study of adults who sustained a serious traumatic injury, and who survived until presentation at Auckland City, Middlemore, Waikato, or North Shore Hospitals between 2002 and 2006. Coefficients were estimated using ordinary and multilevel mixed-effects logistic regression models. 1735 eligible patients were identified, 1672 (96%) injured from a blunt mechanism and 63 (4%) from a penetrating mechanism. For blunt mechanism trauma, 1250 (75%) were male and average age was 38 years (range: 15-94 years). TRISS information was available for 1565 patients of whom 204 (13%) died. Area under the Receiver Operating Characteristic (ROC) curves was 0.901 (95%CI: 0.879-0.923) for the TRISS(NZ) model and 0.890 (95% CI: 0.866-0.913) for TRISS (P<0.001). Insufficient data were available to determine coefficients for penetrating mechanism TRISS(NZ) models. Both TRISS models accurately predicted survival for blunt mechanism trauma. However, TRISS(NZ) coefficients were statistically superior to TRISS coefficients. A strong case exists for replacing TRISS coefficients in the New Zealand benchmarking software with these updated TRISS(NZ) estimates.

  20. Psychomotor development index in children younger than 6 years from Argentine provinces.

    PubMed

    Lejarraga, Horacio; Kelmansky, Diana M; Masautis, Alicia; Nunes, Fernando

    2018-04-01

    To obtain a psychomotor development index (PDI) for each Argentine province. Using a national, probabilistic, and stratified sample of 13 323 male and female children younger than 6 years selected for the National Survey on Nutrition and Health (Encuesta Nacional de Nutrición y Salud, ENNyS 2004), we estimated the PDI per province based on compliance with 10 developmental milestones. The median age at attainment (median age) of each milestone was estimated by fitting a logistic regression. The PDI was estimated as 100 × (1 + b), where "b" is the regression coefficient of y = a + b x, where "y" is the median age as per the national reference (x) minus the median age at attainment of a milestone. The theoretical value expected for the PDI was 100. The PDI per province ranged between 72.1 and 106.4. Most provinces showed a negative regression coefficient, which indicated a progressive increase of the delay in the age at attainment of milestones. The correlation coefficient between the PDI per province and infant mortality in 2005 was extremely high: -0.85, suggesting that both indicators share similar biological and social determinants. The correlation was negative because the higher the mortality, the lower the PDI. We now have a positive health indicator available in Argentina: the psychomotor development index, which is a low-cost, easy to collect, and reliable tool that may be used in national health statistics. Sociedad Argentina de Pediatría.
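
    The PDI definition in this record amounts to a simple least-squares slope. The following Python sketch follows the abstract's wording: x is the national reference median age for each milestone, y is the reference median age minus the provincial median age, b is the slope of y on x, and the index is 100 times (1 + b). The function names and the milestone ages are hypothetical and are not taken from the survey.

```python
def ols_slope(x, y):
    """Ordinary least-squares slope of y on x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    return sxy / sxx


def psychomotor_development_index(reference_ages, provincial_ages):
    """PDI = 100 * (1 + b), with b the slope of (reference - provincial) on reference."""
    differences = [ref - obs for ref, obs in zip(reference_ages, provincial_ages)]
    b = ols_slope(reference_ages, differences)
    return 100.0 * (1.0 + b)


# Hypothetical median ages (years) for five milestones.
reference = [0.5, 1.0, 2.0, 3.0, 4.5]
on_schedule = list(reference)                # same ages as the reference
delayed = [1.1 * age for age in reference]   # delay grows with age

pdi_on_schedule = psychomotor_development_index(reference, on_schedule)
pdi_delayed = psychomotor_development_index(reference, delayed)
```

    With these toy inputs, a province that matches the reference has b = 0 and a PDI of 100, while a delay that grows in proportion to age gives a negative b and a PDI below 100, which matches the interpretation given in the abstract.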

  1. Analysis of nonlinear relationships in dual epidemics, and its application to the management of grapevine downy and powdery mildews.

    PubMed

    Savary, Serge; Delbac, Lionel; Rochas, Amélie; Taisant, Guillaume; Willocquet, Laetitia

    2009-08-01

    Dual epidemics are defined as epidemics developing on two or several plant organs in the course of a cropping season. Agricultural pathosystems where such epidemics develop are often very important, because the harvestable part is one of the organs affected. These epidemics also are often difficult to manage, because the linkage between epidemiological components occurring on different organs is poorly understood, and because prediction of the risk toward the harvestable organs is difficult. In the case of downy mildew (DM) and powdery mildew (PM) of grapevine, nonlinear modeling and logistic regression indicated nonlinearity in the foliage-cluster relationships. Nonlinear modeling enabled the parameterization of a transmission coefficient that numerically links the two components, leaves and clusters, in DM and PM epidemics. Logistic regression analysis yielded a series of probabilistic models that enabled predicting preset levels of cluster infection risks based on DM and PM severities on the foliage at successive crop stages. The usefulness of this framework for tactical decision-making for disease control is discussed.

  2. Comparison of multinomial logistic regression and logistic regression: which is more efficient in allocating land use?

    NASA Astrophysics Data System (ADS)

    Lin, Yingzhi; Deng, Xiangzheng; Li, Xing; Ma, Enjun

    2014-12-01

    Spatially explicit simulation of land use change is the basis for estimating the effects of land use and cover change on energy fluxes, ecology and the environment. At the pixel level, logistic regression is one of the most common approaches used in spatially explicit land use allocation models to determine the relationship between land use and its causal factors in driving land use change, and thereby to evaluate land use suitability. However, these models have a drawback in that they do not determine/allocate land use based on the direct relationship between land use change and its driving factors. Consequently, a multinomial logistic regression method was introduced to address this flaw, and thereby, judge the suitability of a type of land use in any given pixel in a case study area of the Jiangxi Province, China. A comparison of the two regression methods indicated that the proportion of correctly allocated pixels using multinomial logistic regression was 92.98%, which was 8.47% higher than that obtained using logistic regression. Paired t-test results also showed that pixels were more clearly distinguished by multinomial logistic regression than by logistic regression. In conclusion, multinomial logistic regression is a more efficient and accurate method for the spatial allocation of land use changes. The application of this method in future land use change studies may improve the accuracy of predicting the effects of land use and cover change on energy fluxes, ecology, and environment.

  3. Differential impact of anxiety symptoms and anxiety disorders on treatment outcome for psychotic depression in the STOP-PD study

    PubMed Central

    Davies, Simon J.C.; Mulsant, Benoit H.; Flint, Alastair J.; Rothschild, Anthony J.; Whyte, Ellen M.; Meyers, Barnett S.

    2014-01-01

    Background There are conflicting results on the impact of anxiety on depression outcomes. The impact of anxiety has not been studied in major depression with psychotic features (“psychotic depression”). Aims We assessed the impact of specific anxiety symptoms and disorders on the outcomes of psychotic depression. Methods We analyzed data from the Study of Pharmacotherapy for Psychotic Depression that randomized 259 younger and older participants to either olanzapine plus placebo or olanzapine plus sertraline. We assessed the impact of specific anxiety symptoms from the Brief Psychiatric Rating Scale (“tension”, “anxiety” and “somatic concerns” and a composite anxiety score) and diagnoses (panic disorder and GAD) on psychotic depression outcomes using linear or logistic regression. Age, gender, education and benzodiazepine use (at baseline and end) were included as covariates. Results Anxiety symptoms at baseline and anxiety disorder diagnoses differentially impacted outcomes. On adjusted linear regression there was an association between improvement in depressive symptoms and both baseline “tension” (coefficient = 0.784; 95% CI: 0.169–1.400; p = 0.013) and the composite anxiety score (regression coefficient = 0.348; 95% CI: 0.064–0.632; p = 0.017). There was an interaction between “tension” and treatment group, with better responses in those randomized to combination treatment if they had high baseline anxiety scores (coefficient = 1.309; 95% CI: 0.105–2.514; p = 0.033). In contrast, panic disorder was associated with worse clinical outcomes (coefficient = −3.858; 95% CI: –7.281 to −0.434; p = 0.027) regardless of treatment. Conclusions Our results suggest that analysis of the impact of anxiety on depression outcome needs to differentiate psychic and somatic symptoms. PMID:24656524

  4. Elder abuse and socioeconomic inequalities: a multilevel study in 7 European countries.

    PubMed

    Fraga, Sílvia; Lindert, Jutta; Barros, Henrique; Torres-González, Francisco; Ioannidi-Kapolou, Elisabeth; Melchiorre, Maria Gabriella; Stankunas, Mindaugas; Soares, Joaquim F

    2014-04-01

    To compare the prevalence of elder abuse using a multilevel approach that takes into account the characteristics of participants as well as socioeconomic indicators at city and country level. In 2009, the project on abuse of elderly in Europe (ABUEL) was conducted in seven cities (Stuttgart, Germany; Ancona, Italy; Kaunas, Lithuania; Stockholm, Sweden; Porto, Portugal; Granada, Spain; Athens, Greece) comprising 4467 individuals aged 60-84 years. We used a 3-level hierarchical structure of data: 1) characteristics of participants; 2) mean of tertiary education of each city; and 3) country inequality indicator (Gini coefficient). Multilevel logistic regression was used and proportional changes in Intraclass Correlation Coefficient (ICC) were inspected to assess explained variance between models. The prevalence of elder abuse showed large variations across sites. Adding tertiary education to the regression model reduced the country level variance for psychological abuse (ICC=3.4%), with no significant decrease in the explained variance for the other types of abuse. When the Gini coefficient was considered, the highest drop in ICC was observed for financial abuse (from 9.5% to 4.3%). There is a societal and community level dimension that adds information to individual variability in explaining country differences in elder abuse, highlighting underlying socioeconomic inequalities leading to such behavior. Copyright © 2014 Elsevier Inc. All rights reserved.
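
    The ICC arithmetic in this abstract can be sketched in a few lines. This is only an illustration: the abstract does not say how the ICC was computed, so the latent-variable formula below (level-1 variance fixed at pi^2/3, two levels only) is an assumption, and the function names are hypothetical. Only the 9.5% and 4.3% figures come from the abstract.

```python
import math

def icc_latent(random_intercept_variance):
    """ICC for a two-level logistic model under the latent-variable
    formulation, where the level-1 residual variance is fixed at pi^2 / 3.
    (Assumed definition; the abstract does not state which one was used.)"""
    level1_variance = math.pi ** 2 / 3
    return random_intercept_variance / (random_intercept_variance + level1_variance)

def proportional_change(icc_before, icc_after):
    """Share of the ICC removed after adding a covariate to the model."""
    return (icc_before - icc_after) / icc_before

# Figures quoted in the abstract for financial abuse (ICC in percent).
drop = proportional_change(9.5, 4.3)
print(f"Proportional change in ICC: {drop:.1%}")  # prints 54.7%
```

    Under these assumptions, adding the Gini coefficient removes a little over half of the between-country ICC for financial abuse.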

  5. [Optimization of diagnosis indicator selection and inspection plan by 3.0T MRI in breast cancer].

    PubMed

    Jiang, Zhongbiao; Wang, Yunhua; He, Zhong; Zhang, Lejun; Zheng, Kai

    2013-08-01

    To optimize 3.0T MRI diagnostic indicators in breast cancer and to select the best MRI scan program. A total of 45 patients with breast cancer were enrolled, and another 35 patients with benign breast tumors served as the control group. All patients underwent 3.0T MRI, including T1-weighted imaging (T1WI), fat-suppressed T2-weighted imaging (T2WI), diffusion weighted imaging (DWI), 1H magnetic resonance spectroscopy (1H-MRS) and dynamic contrast enhanced (DCE) sequence. With surgical pathology results as the gold standard in the diagnosis of breast diseases, the pathological results (benign or malignant) served as the dependent variable, and the diagnostic indicators of MRI were taken as independent variables. We entered all the indicators of MRI examination into Logistic regression analysis, established the Logistic model, and optimized the diagnostic indicators of MRI examination to further improve MRI scanning of breast cancer. By Logistic regression analysis, some indicators were selected in the equation, including the edge feature of the tumor, the time-signal intensity curve (TIC) type and the apparent diffusion coefficient (ADC) value when b=500 s/mm². The regression equation was Logit(P) = -21.936 + 20.478X6 + 3.267X7 + 21.488X3. Valuable indicators in the diagnosis of breast cancer are the edge feature of the tumor, the TIC type and the ADC value when b=500 s/mm². Combining conventional MRI scan, DWI and dynamic enhanced MRI is a better examination program, while MRS is the complementary program when diagnosis is difficult.
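
    A fitted logit equation of this kind is turned into a predicted probability with the inverse logit, P = 1 / (1 + exp(-Logit(P))). The sketch below uses the intercept and coefficients reported in the abstract; the abstract does not state which indicator each of X3, X6 and X7 denotes or how they were coded, so the input values are purely hypothetical and the function names are made up for illustration.

```python
import math

# Intercept and coefficients as reported in the abstract.
INTERCEPT = -21.936
COEFFICIENTS = {"X6": 20.478, "X7": 3.267, "X3": 21.488}

def logit_score(x):
    """Linear predictor: intercept plus the weighted sum of the indicators."""
    return INTERCEPT + sum(COEFFICIENTS[name] * x[name] for name in COEFFICIENTS)

def probability(x):
    """Inverse logit, written to avoid overflow for large |score|."""
    z = logit_score(x)
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)

# Hypothetical coding: every indicator set to 1 (the real coding is not given).
example = {"X6": 1, "X7": 1, "X3": 1}
print(logit_score(example), probability(example))
```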

  6. Predicting the need for muscle flap salvage after open groin vascular procedures: a clinical assessment tool.

    PubMed

    Fischer, John P; Nelson, Jonas A; Shang, Eric K; Wink, Jason D; Wingate, Nicholas A; Woo, Edward Y; Jackson, Benjamin M; Kovach, Stephen J; Kanchwala, Suhail

    2014-12-01

    Groin wound complications after open vascular surgery procedures are common, morbid, and costly. The purpose of this study was to generate a simple, validated, clinically usable risk assessment tool for predicting groin wound morbidity after infra-inguinal vascular surgery. A retrospective review of consecutive patients undergoing groin cutdowns for femoral access between 2005-2011 was performed. Patients necessitating salvage flaps were compared to those who did not, and a stepwise logistic regression was performed and validated using a bootstrap technique. Utilising this analysis, a simplified risk score was developed to predict the risk of developing a wound which would necessitate salvage. A total of 925 patients were included in the study. The salvage flap rate was 11.2% (n = 104). Predictors determined by logistic regression included prior groin surgery (OR = 4.0, p < 0.001), prosthetic graft (OR = 2.7, p < 0.001), coronary artery disease (OR = 1.8, p = 0.019), peripheral arterial disease (OR = 5.0, p < 0.001), and obesity (OR = 1.7, p = 0.039). Based upon the respective logistic coefficients, a simplified scoring system was developed to enable the preoperative risk stratification regarding the likelihood of a significant complication which would require a salvage muscle flap. The c-statistic for the regression demonstrated excellent discrimination at 0.89. This study presents a simple, internally validated risk assessment tool that accurately predicts wound morbidity requiring flap salvage in open groin vascular surgery patients. The preoperatively high-risk patient can be identified and selectively targeted as a candidate for a prophylactic muscle flap.
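
    One common way to derive a point score from logistic coefficients can be sketched as follows. The odds ratios are those quoted in the abstract, and each coefficient is the natural log of its odds ratio. The scaling rule (divide by the smallest coefficient and round) is an assumption chosen for illustration; the abstract does not describe the rule the authors actually used, so the resulting points need not match their published score.

```python
import math

# Odds ratios as reported in the abstract.
odds_ratios = {
    "prior groin surgery": 4.0,
    "prosthetic graft": 2.7,
    "coronary artery disease": 1.8,
    "peripheral arterial disease": 5.0,
    "obesity": 1.7,
}

# A logistic regression coefficient is the natural log of the odds ratio.
coefficients = {name: math.log(value) for name, value in odds_ratios.items()}

# Assumed scaling rule: express each coefficient in units of the smallest
# one and round to the nearest integer.
smallest = min(coefficients.values())
points = {name: round(beta / smallest) for name, beta in coefficients.items()}

def risk_score(present_factors):
    """Sum the points of the risk factors present in a patient."""
    return sum(points[name] for name in present_factors)

print(points)
print(risk_score(["obesity", "prosthetic graft"]))
```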

  7. A Bayesian goodness of fit test and semiparametric generalization of logistic regression with measurement data.

    PubMed

    Schörgendorfer, Angela; Branscum, Adam J; Hanson, Timothy E

    2013-06-01

    Logistic regression is a popular tool for risk analysis in medical and population health science. With continuous response data, it is common to create a dichotomous outcome for logistic regression analysis by specifying a threshold for positivity. Fitting a linear regression to the nondichotomized response variable assuming a logistic sampling model for the data has been empirically shown to yield more efficient estimates of odds ratios than ordinary logistic regression of the dichotomized endpoint. We illustrate that risk inference is not robust to departures from the parametric logistic distribution. Moreover, the model assumption of proportional odds is generally not satisfied when the condition of a logistic distribution for the data is violated, leading to biased inference from a parametric logistic analysis. We develop novel Bayesian semiparametric methodology for testing goodness of fit of parametric logistic regression with continuous measurement data. The testing procedures hold for any cutoff threshold and our approach simultaneously provides the ability to perform semiparametric risk estimation. Bayes factors are calculated using the Savage-Dickey ratio for testing the null hypothesis of logistic regression versus a semiparametric generalization. We propose a fully Bayesian and a computationally efficient empirical Bayesian approach to testing, and we present methods for semiparametric estimation of risks, relative risks, and odds ratios when parametric logistic regression fails. Theoretical results establish the consistency of the empirical Bayes test. Results from simulated data show that the proposed approach provides accurate inference irrespective of whether parametric assumptions hold or not. Evaluation of risk factors for obesity shows that different inferences are derived from an analysis of a real data set when deviations from a logistic distribution are permissible in a flexible semiparametric framework. 
© 2013, The International Biometric Society.

  8. Modified Regression Correlation Coefficient for Poisson Regression Model

    NASA Astrophysics Data System (ADS)

    Kaengthong, Nattacha; Domthong, Uthumporn

    2017-09-01

    This study focuses on widely used indicators of the predictive power of the Generalized Linear Model (GLM), which often have some restrictions. We are interested in the regression correlation coefficient for a Poisson regression model. This is a measure of predictive power, defined by the relationship between the dependent variable (Y) and the expected value of the dependent variable given the independent variables [E(Y|X)] for the Poisson regression model. The dependent variable is distributed as Poisson. The purpose of this research was to modify the regression correlation coefficient for the Poisson regression model. We also compare the proposed modified regression correlation coefficient with the traditional regression correlation coefficient in the case of two or more independent variables and in the presence of multicollinearity among the independent variables. The results show that the proposed regression correlation coefficient is better than the traditional regression correlation coefficient in terms of bias and root mean square error (RMSE).
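
    As a rough illustration of the traditional measure described here, the sketch below computes the Pearson correlation between observed counts Y and the model mean E(Y|X) = exp(b0 + b1*x) of a Poisson regression with a log link. The data and the coefficient values are hypothetical (in practice the coefficients would be estimated), and the modified coefficient proposed in the study is not reproduced because the abstract does not define it.

```python
import math

def pearson(a, b):
    """Pearson correlation between two equal-length sequences."""
    n = len(a)
    mean_a = sum(a) / n
    mean_b = sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return cov / math.sqrt(var_a * var_b)

def poisson_mean(x_row, beta):
    """E(Y|X) under a log link: exp(b0 + b1*x1 + ...)."""
    eta = beta[0] + sum(b * x for b, x in zip(beta[1:], x_row))
    return math.exp(eta)

# Hypothetical data and coefficients, for illustration only.
X = [[0.0], [1.0], [2.0], [3.0], [4.0]]
y = [1, 1, 3, 4, 8]
beta = [0.0, 0.5]

mu = [poisson_mean(row, beta) for row in X]
r = pearson(y, mu)  # regression correlation coefficient: corr(Y, E(Y|X))
print(round(r, 3))
```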

  9. Propensity score estimation: machine learning and classification methods as alternatives to logistic regression

    PubMed Central

    Westreich, Daniel; Lessler, Justin; Funk, Michele Jonsson

    2010-01-01

    Summary Objective Propensity scores for the analysis of observational data are typically estimated using logistic regression. Our objective in this Review was to assess machine learning alternatives to logistic regression which may accomplish the same goals but with fewer assumptions or greater accuracy. Study Design and Setting We identified alternative methods for propensity score estimation and/or classification from the public health, biostatistics, discrete mathematics, and computer science literature, and evaluated these algorithms for applicability to the problem of propensity score estimation, potential advantages over logistic regression, and ease of use. Results We identified four techniques as alternatives to logistic regression: neural networks, support vector machines, decision trees (CART), and meta-classifiers (in particular, boosting). Conclusion While the assumptions of logistic regression are well understood, those assumptions are frequently ignored. All four alternatives have advantages and disadvantages compared with logistic regression. Boosting (meta-classifiers) and to a lesser extent decision trees (particularly CART) appear to be most promising for use in the context of propensity score analysis, but extensive simulation studies are needed to establish their utility in practice. PMID:20630332

  10. Robust mislabel logistic regression without modeling mislabel probabilities.

    PubMed

    Hung, Hung; Jou, Zhi-Yu; Huang, Su-Yun

    2018-03-01

    Logistic regression is among the most widely used statistical methods for linear discriminant analysis. In many applications, we only observe possibly mislabeled responses. Fitting a conventional logistic regression can then lead to biased estimation. One common resolution is to fit a mislabel logistic regression model, which takes mislabeled responses into consideration. Another common method is to adopt a robust M-estimation by down-weighting suspected instances. In this work, we propose a new robust mislabel logistic regression based on γ-divergence. Our proposal possesses two advantageous features: (1) It does not need to model the mislabel probabilities. (2) The minimum γ-divergence estimation leads to a weighted estimating equation without the need to include any bias correction term, that is, it is automatically bias-corrected. These features make the proposed γ-logistic regression more robust in model fitting and more intuitive for model interpretation through a simple weighting scheme. Our method is also easy to implement, and two types of algorithms are included. Simulation studies and the Pima data application are presented to demonstrate the performance of γ-logistic regression. © 2017, The International Biometric Society.

  11. Fungible weights in logistic regression.

    PubMed

    Jones, Jeff A; Waller, Niels G

    2016-06-01

    In this article we develop methods for assessing parameter sensitivity in logistic regression models. To set the stage for this work, we first review Waller's (2008) equations for computing fungible weights in linear regression. Next, we describe 2 methods for computing fungible weights in logistic regression. To demonstrate the utility of these methods, we compute fungible logistic regression weights using data from the Centers for Disease Control and Prevention's (2010) Youth Risk Behavior Surveillance Survey, and we illustrate how these alternate weights can be used to evaluate parameter sensitivity. To make our work accessible to the research community, we provide R code (R Core Team, 2015) that will generate both kinds of fungible logistic regression weights. (PsycINFO Database Record (c) 2016 APA, all rights reserved).

  12. Propensity score estimation: neural networks, support vector machines, decision trees (CART), and meta-classifiers as alternatives to logistic regression.

    PubMed

    Westreich, Daniel; Lessler, Justin; Funk, Michele Jonsson

    2010-08-01

    Propensity scores for the analysis of observational data are typically estimated using logistic regression. Our objective in this review was to assess machine learning alternatives to logistic regression, which may accomplish the same goals but with fewer assumptions or greater accuracy. We identified alternative methods for propensity score estimation and/or classification from the public health, biostatistics, discrete mathematics, and computer science literature, and evaluated these algorithms for applicability to the problem of propensity score estimation, potential advantages over logistic regression, and ease of use. We identified four techniques as alternatives to logistic regression: neural networks, support vector machines, decision trees (classification and regression trees [CART]), and meta-classifiers (in particular, boosting). Although the assumptions of logistic regression are well understood, those assumptions are frequently ignored. All four alternatives have advantages and disadvantages compared with logistic regression. Boosting (meta-classifiers) and, to a lesser extent, decision trees (particularly CART), appear to be most promising for use in the context of propensity score analysis, but extensive simulation studies are needed to establish their utility in practice. Copyright (c) 2010 Elsevier Inc. All rights reserved.

  13. Should metacognition be measured by logistic regression?

    PubMed

    Rausch, Manuel; Zehetleitner, Michael

    2017-03-01

    Are logistic regression slopes suitable to quantify metacognitive sensitivity, i.e. the efficiency with which subjective reports differentiate between correct and incorrect task responses? We analytically show that logistic regression slopes are independent from rating criteria in one specific model of metacognition, which assumes (i) that rating decisions are based on sensory evidence generated independently of the sensory evidence used for primary task responses and (ii) that the distributions of evidence are logistic. Given a hierarchical model of metacognition, logistic regression slopes depend on rating criteria. According to all considered models, regression slopes depend on the primary task criterion. A reanalysis of previous data revealed that massive numbers of trials are required to distinguish between hierarchical and independent models with tolerable accuracy. It is argued that researchers who wish to use logistic regression as measure of metacognitive sensitivity need to control the primary task criterion and rating criteria. Copyright © 2017 Elsevier Inc. All rights reserved.

  14. Investigating bias in squared regression structure coefficients

    PubMed Central

    Nimon, Kim F.; Zientek, Linda R.; Thompson, Bruce

    2015-01-01

    The importance of structure coefficients and analogs of regression weights for analysis within the general linear model (GLM) has been well-documented. The purpose of this study was to investigate bias in squared structure coefficients in the context of multiple regression and to determine if a formula that had been shown to correct for bias in squared Pearson correlation coefficients and coefficients of determination could be used to correct for bias in squared regression structure coefficients. Using data from a Monte Carlo simulation, this study found that squared regression structure coefficients corrected with Pratt's formula produced less biased estimates and might be more accurate and stable estimates of population squared regression structure coefficients than estimates with no such corrections. While our findings are in line with prior literature that identified multicollinearity as a predictor of bias in squared regression structure coefficients but not coefficients of determination, the findings from this study are unique in that the level of predictive power, number of predictors, and sample size were also observed to contribute bias in squared regression structure coefficients. PMID:26217273

  15. The reliability and validity of a short food frequency questionnaire among 9–11-year olds: a multinational study on three middle-income and high-income countries

    PubMed Central

    Saloheimo, T; González, S A; Erkkola, M; Milauskas, D M; Meisel, J D; Champagne, C M; Tudor-Locke, C; Sarmiento, O; Katzmarzyk, P T; Fogelholm, M

    2015-01-01

    Objective: The main aim of this study was to assess the reliability and validity of a food frequency questionnaire with 23 food groups (I-FFQ) among a sample of 9–11-year-old children from three different countries that differ on economical development and income distribution, and to assess differences between country sites. Furthermore, we assessed factors associated with I-FFQ's performance. Methods: This was an ancillary study of the International Study of Childhood Obesity, Lifestyle and the Environment. Reliability (n=321) and validity (n=282) components of this study had the same participants. Participation rates were 95% and 70%, respectively. Participants completed two I-FFQs with a mean interval of 4.9 weeks to assess reliability. A 3-day pre-coded food diary (PFD) was used as the reference method in the validity analyses. Wilcoxon signed-rank tests, intraclass correlation coefficients and cross-classifications were used to assess the reliability of I-FFQ. Spearman correlation coefficients, percentage difference and cross-classifications were used to assess the validity of I-FFQ. A logistic regression model was used to assess the relation of selected variables with the estimate of validity. Analyses based on information in the PFDs were performed to assess how participants interpreted food groups. Results: Reliability correlation coefficients ranged from 0.37 to 0.78 and gross misclassification for all food groups was <5%. Validity correlation coefficients were below 0.5 for 22/23 food groups, and they differed among country sites. For validity, gross misclassification was <5% for 22/23 food groups. Over- or underestimation did not appear for 19/23 food groups. Logistic regression showed that country of participation and parental education were associated (P⩽0.05) with the validity of I-FFQ. Analyses of children's interpretation of food groups suggested that the meaning of most food groups was understood by the children. 
Conclusion: I-FFQ is a moderately reliable method and its validity ranged from low to moderate, depending on food group and country site. PMID:27152180

  16. London Measure of Unplanned Pregnancy: guidance for its use as an outcome measure

    PubMed Central

    Hall, Jennifer A; Barrett, Geraldine; Copas, Andrew; Stephenson, Judith

    2017-01-01

    Background The London Measure of Unplanned Pregnancy (LMUP) is a psychometrically validated measure of the degree of intention of a current or recent pregnancy. The LMUP is increasingly being used worldwide, and can be used to evaluate family planning or preconception care programs. However, beyond recommending the use of the full LMUP scale, there is no published guidance on how to use the LMUP as an outcome measure. Ordinal logistic regression has been recommended informally, but studies published to date have all used binary logistic regression and dichotomized the scale at different cut points. There is thus a need for evidence-based guidance to provide a standardized methodology for multivariate analysis and to enable comparison of results. This paper makes recommendations for the regression method for analysis of the LMUP as an outcome measure. Materials and methods Data collected from 4,244 pregnant women in Malawi were used to compare five regression methods: linear, logistic with two cut points, and ordinal logistic with either the full or grouped LMUP score. The recommendations were then tested on the original UK LMUP data. Results There were small but no important differences in the findings across the regression models. Logistic regression resulted in the largest loss of information, and assumptions were violated for the linear and ordinal logistic regression. Consequently, robust standard errors were used for linear regression and a partial proportional odds ordinal logistic regression model attempted. The latter could only be fitted for grouped LMUP score. Conclusion We recommend the linear regression model with robust standard errors to make full use of the LMUP score when analyzed as an outcome measure. Ordinal logistic regression could be considered, but a partial proportional odds model with grouped LMUP score may be required. Logistic regression is the least-favored option, due to the loss of information. 
For logistic regression, the cut point for un/planned pregnancy should be between nine and ten. These recommendations will standardize the analysis of LMUP data and enhance comparability of results across studies. PMID:28435343

  17. Logistic models--an odd(s) kind of regression.

    PubMed

    Jupiter, Daniel C

    2013-01-01

    The logistic regression model bears some similarity to the multivariable linear regression with which we are familiar. However, the differences are great enough to warrant a discussion of the need for and interpretation of logistic regression. Copyright © 2013 American College of Foot and Ankle Surgeons. Published by Elsevier Inc. All rights reserved.

  18. Association of Perceived Stress with Stressful Life Events, Lifestyle and Sociodemographic Factors: A Large-Scale Community-Based Study Using Logistic Quantile Regression

    PubMed Central

    Feizi, Awat; Aliyari, Roqayeh; Roohafza, Hamidreza

    2012-01-01

    Objective. The present paper aimed at investigating the association between perceived stress and major stressful life events in the Iranian general population. Methods. In a cross-sectional large-scale community-based study, 4583 people aged 19 and older, living in Isfahan, Iran, were investigated. Logistic quantile regression was used for modeling perceived stress, measured by the GHQ questionnaire, as the bounded outcome (dependent) variable and as a function of the most important stressful life events, as the predictor variables, controlling for major lifestyle and sociodemographic factors. This model provides empirical evidence of heterogeneity in the predictors' effects depending on an individual's location in the distribution of perceived stress. Results. The results showed that among four stressful life events, family conflicts and social problems were more correlated with level of perceived stress. Higher levels of education were negatively associated with perceived stress and its coefficients monotonically decrease beyond the 30th percentile. Also, higher levels of physical activity were associated with perception of low levels of stress. The pattern of gender's coefficient over the majority of quantiles implied that females are more affected by stressors. Also high perceived stress was associated with low or middle levels of income. Conclusions. The results of the current research suggested that in a developing society with high prevalence of stress, interventions targeted toward promoting financial and social equalities, social skills training, and healthy lifestyle may have potential benefits for large parts of the population, most notably female and lower educated people. PMID:23091560

  19. Evaluation of keratoconus progression.

    PubMed

    Shajari, Mehdi; Steinwender, Gernot; Herrmann, Kim; Kubiak, Kate Barbara; Pavlovic, Ivana; Plawetzki, Elena; Schmack, Ingo; Kohnen, Thomas

    2018-06-01

    To define variables for the evaluation of keratoconus progression and to determine cut-off values. In this retrospective cohort study (2010-2016), 265 eyes of 165 patients diagnosed with keratoconus underwent two Scheimpflug measurements (Pentacam) that took place 1 year apart ±3 months. Variables used for keratoconus detection were evaluated for progression and a correlation analysis was performed. By logistic regression analysis, a keratoconus progression index (KPI) was defined. Receiver-operating characteristic curve (ROC) analysis was performed and Youden Index calculated to determine cut-off values. Variables used for keratoconus detection showed a weak correlation with each other (eg, correlation r=0.245 between RPImin and Kmax, p<0.001). Therefore, we used parameters that took several variables into consideration (eg, D-index, index of surface variance, index for height asymmetry, KPI). KPI was defined by logistic regression and consisted of a Pachymin coefficient of -0.78 (p=0.001), a maximum elevation of back surface coefficient of 0.27 and coefficient of corneal curvature at the zone 3 mm away from the thinnest point on the posterior corneal surface of -12.44 (both p<0.001). The two variables with the highest Youden Index in the ROC analysis were D-index and KPI: D-index had a cut-off of 0.4175 (70.6% sensitivity) and Youden Index of 0.606. Cut-off for KPI was -0.78196 (84.7% sensitivity) and a Youden Index of 0.747; both 90% specificity. Keratoconus progression should be defined by evaluating parameters that consider several corneal changes; we suggest D-index and KPI to detect progression. © Article author(s) (or their employer(s) unless otherwise stated in the text of the article) 2018. All rights reserved. No commercial use is permitted unless otherwise expressly granted.
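
    The Youden Index values quoted above follow from J = sensitivity + specificity - 1. The short sketch below reproduces that arithmetic from the reported figures and adds a generic cut-off search. The `best_cutoff` helper, its "score at or above the threshold means positive" convention, and the toy data are assumptions for illustration, not the procedure used in the study.

```python
def youden(sensitivity, specificity):
    """Youden Index J = sensitivity + specificity - 1."""
    return sensitivity + specificity - 1.0

# Figures reported in the abstract (both at 90% specificity).
j_d_index = youden(0.706, 0.90)  # D-index, about 0.606
j_kpi = youden(0.847, 0.90)      # KPI, about 0.747

def best_cutoff(scores, labels):
    """Return (threshold, J) maximising the Youden Index.
    Assumes a case is called positive when score >= threshold."""
    positives = sum(labels)
    negatives = len(labels) - positives
    best = (None, -1.0)
    for t in sorted(set(scores)):
        tp = sum(1 for s, l in zip(scores, labels) if l == 1 and s >= t)
        tn = sum(1 for s, l in zip(scores, labels) if l == 0 and s < t)
        j = youden(tp / positives, tn / negatives)
        if j > best[1]:
            best = (t, j)
    return best

# Toy data, hypothetical.
print(best_cutoff([0.1, 0.2, 0.3, 0.8, 0.9], [0, 0, 0, 1, 1]))
```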

  20. Producing landslide susceptibility maps by utilizing machine learning methods. The case of Finikas catchment basin, North Peloponnese, Greece.

    NASA Astrophysics Data System (ADS)

    Tsangaratos, Paraskevas; Ilia, Ioanna; Loupasakis, Constantinos; Papadakis, Michalis; Karimalis, Antonios

    2017-04-01

    The main objective of the present study was to apply two machine learning methods for the production of a landslide susceptibility map in the Finikas catchment basin, located in North Peloponnese, Greece, and to compare their results. Specifically, Logistic Regression and Random Forest were utilized, based on a database of 40 sites classified into two categories, non-landslide and landslide areas, that were separated into a training dataset (70% of the total data) and a validation dataset (remaining 30%). The identification of the areas was established by analyzing airborne imagery, extensive field investigation and the examination of previous research studies. Six landslide related variables were analyzed, namely: lithology, elevation, slope, aspect, distance to rivers and distance to faults. Within the Finikas catchment basin most of the reported landslides were located along the road network and within the residential complexes, classified as rotational and translational slides, and rockfalls, mainly caused by the physical conditions and the general geotechnical behavior of the geological formations that cover the area. Each landslide susceptibility map was reclassified by applying the Geometric Interval classification technique into five classes, namely: very low susceptibility, low susceptibility, moderate susceptibility, high susceptibility, and very high susceptibility. The comparison and validation of the outcomes of each model were achieved using statistical evaluation measures, the receiver operating characteristic and the area under the success and predictive rate curves. The computation process was carried out using RStudio, an integrated development environment for the R language, and ArcGIS 10.1 for compiling the data and producing the landslide susceptibility maps. From the outcomes of the Logistic Regression analysis it was deduced that the highest b coefficients were assigned to lithology and slope, at 2.8423 and 1.5841, respectively. 
From the estimation of the mean decrease in Gini coefficient performed during the application of Random Forest and the mean decrease in accuracy the most important variable is slope followed by lithology, aspect, elevation, distance from river network, and distance from faults, while the most used variables during the training phase were the variable aspect (21.45%), slope (20.53%) and lithology (19.84%). The outcomes of the analysis are consistent with previous studies concerning the area of research, which have indicated the high influence of lithology and slope in the manifestation of landslides. High percentage of landslide occurrence has been observed in Plio-Pleistocene sediments, flysch formations, and Cretaceous limestone. Also the presences of landslides have been associated with the degree of weathering and fragmentation, the orientation of the discontinuities surfaces and the intense morphological relief. The most accurate model was Random Forest which identified correctly 92.00% of the instances during the training phase, followed by the Logistic Regression 89.00%. The same pattern of accuracy was calculated during the validation phase, in which the Random Forest achieved a classification accuracy of 93.00%, while the Logistic Regression model achieved an accuracy of 91.00%. In conclusion, the outcomes of the study could be a useful cartographic product to local authorities and government agencies during the implementation of successful decision-making and land use planning strategies. Keywords: Landslide Susceptibility, Logistic Regression, Random Forest, GIS, Greece.

  1. Parameters Estimation of Geographically Weighted Ordinal Logistic Regression (GWOLR) Model

    NASA Astrophysics Data System (ADS)

    Zuhdi, Shaifudin; Retno Sari Saputro, Dewi; Widyaningsih, Purnami

    2017-06-01

    A regression model represents the relationship between independent variables and a dependent variable. When the dependent variable is categorical, a logistic regression model is used to calculate odds. When the categories of the dependent variable are ordered levels, the logistic regression model is ordinal. The GWOLR model is an ordinal logistic regression model that is influenced by the geographical location of the observation site. Parameter estimation is needed to infer population values from a sample. The purpose of this research is to estimate the parameters of the GWOLR model using R software. Parameter estimation uses data on the number of dengue fever patients in Semarang City. The observation units are 144 villages in Semarang City. The research yields a local GWOLR model for each village and the probability of each category of the number of dengue fever patients.

  2. PARAMETRIC AND NON PARAMETRIC (MARS: MULTIVARIATE ADDITIVE REGRESSION SPLINES) LOGISTIC REGRESSIONS FOR PREDICTION OF A DICHOTOMOUS RESPONSE VARIABLE WITH AN EXAMPLE FOR PRESENCE/ABSENCE OF AMPHIBIANS

    EPA Science Inventory

    The purpose of this report is to provide a reference manual that could be used by investigators for making informed use of logistic regression using two methods (standard logistic regression and MARS). The details for analyses of relationships between a dependent binary response ...

  3. Predicting U.S. Army Reserve Unit Manning Using Market Demographics

    DTIC Science & Technology

    2015-06-01

    develops linear regression, classification tree, and logistic regression models to determine the ability of the location to support manning requirements... logistic regression model delivers predictive results that allow decision-makers to identify locations with a high probability of meeting unit...manning requirements. The recommendation of this thesis is that the USAR implement the logistic regression model.

  4. Analyzing Student Learning Outcomes: Usefulness of Logistic and Cox Regression Models. IR Applications, Volume 5

    ERIC Educational Resources Information Center

    Chen, Chau-Kuang

    2005-01-01

    Logistic and Cox regression methods are practical tools used to model the relationships between certain student learning outcomes and their relevant explanatory variables. The logistic regression model fits an S-shaped curve to a binary outcome with data points of zero and one. The Cox regression model allows investigators to study the duration…

  5. An appraisal of convergence failures in the application of logistic regression model in published manuscripts.

    PubMed

    Yusuf, O B; Bamgboye, E A; Afolabi, R F; Shodimu, M A

    2014-09-01

    The logistic regression model is widely used in health research for descriptive and predictive purposes. Unfortunately, researchers are often not aware that the underlying principles of the technique have failed when the maximum likelihood algorithm does not converge. Young researchers, particularly postgraduate students, may not know why the separation problem (whether quasi or complete) occurs, how to identify it, or how to fix it. This study was designed to critically evaluate convergence issues in articles that employed logistic regression analysis published in the African Journal of Medicine and Medical Sciences between 2004 and 2013. Problems of quasi or complete separation were described and illustrated with the National Demographic and Health Survey dataset. A critical evaluation of articles that employed logistic regression was conducted. A total of 581 articles were reviewed, of which 40 (6.9%) used binary logistic regression. Twenty-four (60.0%) stated the use of the logistic regression model in the methodology, while none of the articles assessed model fit. Only 3 (12.5%) properly described the procedures. Of the 40 that used the logistic regression model, the problem of convergence occurred in 6 (15.0%) of the articles. Logistic regression tends to be poorly reported in studies published between 2004 and 2013. Our findings showed that the procedure may not be well understood by researchers, since very few described the process in their reports, and that they may be totally unaware of the problem of convergence or how to deal with it.
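
    The convergence failure described in this abstract can be reproduced on toy data. The following is a minimal sketch, not the authors' procedure: it fits a one-predictor logistic model by plain gradient ascent (an assumption made for brevity; statistical packages typically use Newton-type iterations) on two invented datasets. Under complete separation the slope keeps growing as more iterations are allowed, whereas with overlapping classes it settles at a finite value. All names and data values below are hypothetical.

```python
import math

def fit_logistic_1d(xs, ys, steps, lr=0.5):
    """Fit p(y=1|x) = sigmoid(b0 + b1*x) by gradient ascent on the mean log-likelihood."""
    b0, b1 = 0.0, 0.0
    n = len(xs)
    for _ in range(steps):
        g0 = g1 = 0.0
        for x, y in zip(xs, ys):
            p = 1.0 / (1.0 + math.exp(-(b0 + b1 * x)))
            g0 += y - p          # gradient w.r.t. intercept
            g1 += (y - p) * x    # gradient w.r.t. slope
        b0 += lr * g0 / n
        b1 += lr * g1 / n
    return b0, b1

# Hypothetical, completely separated data: x < 0 always has y = 0, x > 0 always has y = 1.
separated_x = [-2.0, -1.5, -1.0, 1.0, 1.5, 2.0]
separated_y = [0, 0, 0, 1, 1, 1]

# Hypothetical overlapping data: the classes interleave, so a finite estimate exists.
overlap_x = [-2.0, -1.0, -0.5, 0.5, 1.0, 2.0]
overlap_y = [0, 1, 0, 1, 0, 1]

iteration_budgets = (100, 1000, 10000)
sep_slopes = [fit_logistic_1d(separated_x, separated_y, s)[1] for s in iteration_budgets]
ovl_slopes = [fit_logistic_1d(overlap_x, overlap_y, s)[1] for s in iteration_budgets]

for budget, s_sep, s_ovl in zip(iteration_budgets, sep_slopes, ovl_slopes):
    print(f"steps={budget:6d}  separated slope={s_sep:.3f}  overlapping slope={s_ovl:.3f}")
```

    A slope that drifts upward with the iteration budget, usually alongside very large standard errors in real software output, is a common symptom of separation.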

  6. Age and mortality after injury: is the association linear?

    PubMed

    Friese, R S; Wynne, J; Joseph, B; Hashmi, A; Diven, C; Pandit, V; O'Keeffe, T; Zangbar, B; Kulvatunyou, N; Rhee, P

    2014-10-01

    Multiple studies have demonstrated a linear association between advancing age and mortality after injury. An inflection point, or an age at which outcomes begin to differ, has not been previously described. We hypothesized that the relationship between age and mortality after injury is non-linear and an inflection point exists. We performed a retrospective cohort analysis at our urban level I center from 2007 through 2009. All patients aged 65 years and older with the admission diagnosis of injury were included. Non-parametric logistic regression was used to identify the functional form between mortality and age. Multivariate logistic regression was utilized to explore the association between age and mortality. Age 65 years was used as the reference. Significance was defined as p < 0.05. A total of 1,107 patients were included in the analysis. One-third required intensive care unit (ICU) admission and 48% had traumatic brain injury. 229 patients (20.6%) were 84 years of age or older. The overall mortality was 7.2%. Our model indicates that mortality is a quadratic function of age. After controlling for confounders, age is associated with mortality with a regression coefficient of 1.08 for the linear term (p = 0.02) and a regression coefficient of -0.006 for the quadratic term (p = 0.03). The model identified 84.4 years of age as the inflection point at which mortality rates begin to decline. The risk of death after injury varies linearly with age until 84 years. After 84 years of age, the mortality rates decline. These findings may reflect the varying severity of comorbidities and differences in baseline functional status in elderly trauma patients. Specifically, a proportion of our injured patient population less than 84 years old may be more frail, contributing to increased mortality after trauma, whereas a larger proportion of our injured patients over 84 years old, by virtue of reaching this advanced age, may, in fact, be less frail, contributing to less risk of death.
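
    For a logit that is quadratic in age, the age at which predicted mortality peaks is the vertex of the parabola, -b1 / (2 * b2). The sketch below applies that formula to the coefficients printed in the abstract, assuming age enters the model in uncentered years (the abstract does not say). The rounded coefficients give a peak near 90 years rather than 84.4; the second call uses a hypothetical unrounded quadratic coefficient, chosen only to show how sensitive the vertex is to rounding in the third decimal place.

```python
def turning_point(beta_linear, beta_quadratic):
    """Age at which beta_linear*age + beta_quadratic*age**2 is maximal."""
    if beta_quadratic >= 0:
        raise ValueError("a maximum exists only when the quadratic coefficient is negative")
    return -beta_linear / (2.0 * beta_quadratic)

# Coefficients as printed in the abstract (rounded).
peak_reported = turning_point(1.08, -0.006)

# Hypothetical unrounded coefficient, NOT from the source; illustrates rounding sensitivity.
peak_unrounded = turning_point(1.08, -0.0064)

print(f"peak with reported coefficients:     {peak_reported:.1f} years")
print(f"peak with hypothetical -0.0064 term: {peak_unrounded:.1f} years")
```

    Because the logistic function is monotone, the age that maximizes the logit also maximizes the predicted probability of death.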

  7. An EM-based semi-parametric mixture model approach to the regression analysis of competing-risks data.

    PubMed

    Ng, S K; McLachlan, G J

    2003-04-15

    We consider a mixture model approach to the regression analysis of competing-risks data. Attention is focused on inference concerning the effects of factors on both the probability of occurrence and the hazard rate conditional on each of the failure types. These two quantities are specified in the mixture model using the logistic model and the proportional hazards model, respectively. We propose a semi-parametric mixture method to estimate the logistic and regression coefficients jointly, whereby the component-baseline hazard functions are completely unspecified. Estimation is based on maximum likelihood on the basis of the full likelihood, implemented via an expectation-conditional maximization (ECM) algorithm. Simulation studies are performed to compare the performance of the proposed semi-parametric method with a fully parametric mixture approach. The results show that when the component-baseline hazard is monotonic increasing, the semi-parametric and fully parametric mixture approaches are comparable for mildly and moderately censored samples. When the component-baseline hazard is not monotonic increasing, the semi-parametric method consistently provides less biased estimates than a fully parametric approach and is comparable in efficiency in the estimation of the parameters for all levels of censoring. The methods are illustrated using a real data set of prostate cancer patients treated with different dosages of the drug diethylstilbestrol. Copyright 2003 John Wiley & Sons, Ltd.

  8. Influence of Health Behaviors and Occupational Stress on Prediabetic State among Male Office Workers.

    PubMed

    Ryu, Hosihn; Moon, Jihyeon; Jung, Jiyeon

    2018-06-14

    This study examined the influence of health behaviors and occupational stress on the prediabetic state of male office workers, and identified related risks and influencing factors. The study used a cross-sectional design and performed an integrative analysis on data from regular health checkups, health questionnaires, and a health behavior-related survey of employees of a company, using Spearman’s correlation coefficients and multiple logistic regression analysis. The results showed significant relationships of prediabetic state with health behaviors and occupational stress. Among health behaviors, a diet without vegetables and fruits (Odds Ratio (OR) = 3.74, 95% Confidence Interval (CI) = 1.93-7.66) was associated with a high risk of prediabetic state. In the subscales on occupational stress, organizational system in the 4th quartile (OR = 4.83, 95% CI = 2.40-9.70) was significantly associated with an increased likelihood of prediabetic state. To identify influencing factors of prediabetic state, the multiple logistic regression was performed using regression models. The results showed that dietary habits (β = 1.20, p = 0.002), total occupational stress score (β = 1.33, p = 0.024), and organizational system (β = 1.13, p = 0.009) were significant influencing factors. The present findings indicate that active interventions are needed at workplace for the systematic and comprehensive management of health behaviors and occupational stress that influence prediabetic state of office workers.
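
    Logistic regression coefficients and odds ratios are two views of the same quantity: OR = exp(β). The sketch below converts the β values quoted above into implied odds ratios and backs out an approximate standard error from a reported 95% CI. It assumes the β values are log-odds from the multiple logistic model and that the interval is a Wald-type interval built with z = 1.96; neither is stated in the abstract. The implied ratios need not match the ORs quoted earlier in the abstract, which may come from different model specifications.

```python
import math

def odds_ratio(beta):
    """Odds ratio implied by a logistic regression coefficient."""
    return math.exp(beta)

def se_from_ci(lower, upper, z=1.96):
    """Approximate standard error of log(OR) from a Wald-type confidence interval."""
    return (math.log(upper) - math.log(lower)) / (2.0 * z)

# Coefficients quoted in the abstract.
betas = {
    "dietary habits": 1.20,
    "total occupational stress score": 1.33,
    "organizational system": 1.13,
}
implied_or = {name: odds_ratio(b) for name, b in betas.items()}

# Reported 95% CI for the "diet without vegetables and fruits" odds ratio.
se_diet = se_from_ci(1.93, 7.66)

for name, value in implied_or.items():
    print(f"{name}: exp(beta) = {value:.2f}")
print(f"approximate SE of log(OR) for diet: {se_diet:.3f}")
```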

  9. The implementation of rare events logistic regression to predict the distribution of mesophotic hard corals across the main Hawaiian Islands.

    PubMed

    Veazey, Lindsay M; Franklin, Erik C; Kelley, Christopher; Rooney, John; Frazer, L Neil; Toonen, Robert J

    2016-01-01

    Predictive habitat suitability models are powerful tools for cost-effective, statistically robust assessment of the environmental drivers of species distributions. The aim of this study was to develop predictive habitat suitability models for two genera of scleractinian corals (Leptoseris and Montipora) found within the mesophotic zone across the main Hawaiian Islands. The mesophotic zone (30-180 m) is challenging to reach, and therefore historically understudied, because it falls between the maximum limit of SCUBA divers and the minimum typical working depth of submersible vehicles. Here, we implement a logistic regression with rare events corrections to account for the scarcity of presence observations within the dataset. These corrections reduced the coefficient error and improved overall prediction success (73.6% and 74.3%) for both original regression models. The final models included depth, rugosity, slope, mean current velocity, and wave height as the best environmental covariates for predicting the occurrence of the two genera in the mesophotic zone. Using an objectively selected theta ("presence") threshold, the predicted presence probability values (average of 0.051 for Leptoseris and 0.040 for Montipora) were translated to spatially-explicit habitat suitability maps of the main Hawaiian Islands at 25 m grid cell resolution. Our maps are the first of their kind to use extant presence and absence data to examine the habitat preferences of these two dominant mesophotic coral genera across Hawai'i.

  10. Logistic Regression: Concept and Application

    ERIC Educational Resources Information Center

    Cokluk, Omay

    2010-01-01

    The main focus of logistic regression analysis is classification of individuals in different groups. The aim of the present study is to explain basic concepts and processes of binary logistic regression analysis intended to determine the combination of independent variables which best explain the membership in certain groups called dichotomous…

  11. Analysis of the discriminative methods for diagnosis of benign and malignant solitary pulmonary nodules based on serum markers.

    PubMed

    Wang, Wanping; Liu, Mingyue; Wang, Jing; Tian, Rui; Dong, Junqiang; Liu, Qi; Zhao, Xianping; Wang, Yuanfang

    2014-01-01

    Screening indexes of tumor serum markers for benign and malignant solitary pulmonary nodules (SPNs) were analyzed to find the optimum method for diagnosis. Enzyme-linked immunosorbent assays, an automatic immune analyzer and radioimmunoassay methods were used to examine the levels of 8 serum markers in 164 SPN patients, and the sensitivity for differential diagnosis of malignant or benign SPN was compared for detection using a single plasma marker or a combination of markers. The results for serological indicators that closely relate to benign and malignant SPNs were screened using the Fisher discriminant analysis and a non-conditional logistic regression analysis method, respectively. The results were then verified by the k-means clustering analysis method. The sensitivity when using a combination of serum markers to detect SPN was higher than that using a single marker. By Fisher discriminant analysis, cytokeratin 19 fragments (CYFRA21-1), carbohydrate antigen 125 (CA125), squamous cell carcinoma antigen (SCC) and breast cancer antigen (CA153), which relate to the benign and malignant SPNs, were screened. Through non-conditional logistic regression analysis, CYFRA21-1, SCC and CA153 were obtained. Using the k-means clustering analysis, the cophenetic correlation coefficient (0.940) obtained by the Fisher discriminant analysis was higher than that obtained with logistic regression analysis (0.875). This study indicated that the Fisher discriminant analysis functioned better in screening out serum markers to recognize the benign and malignant SPN. The combined detection of CYFRA21-1, CA125, SCC and CA153 is an effective way to distinguish benign and malignant SPN, and will find an important clinical application in the early diagnosis of SPN. © 2014 S. Karger GmbH, Freiburg.

  12. The impact of the 2008 financial crisis on food security and food expenditures in Mexico: a disproportionate effect on the vulnerable.

    PubMed

    Vilar-Compte, Mireya; Sandoval-Olascoaga, Sebastian; Bernal-Stuart, Ana; Shimoga, Sandhya; Vargas-Bustamante, Arturo

    2015-11-01

    The present paper investigated the impact of the 2008 financial crisis on food security in Mexico and how it disproportionally affected vulnerable households. A generalized ordered logistic regression was estimated to assess the impact of the crisis on households' food security status. An ordinary least squares and a quantile regression were estimated to evaluate the effect of the financial crisis on a continuous proxy measure of food security defined as the share of a household's current income devoted to food expenditures. Both analyses were performed using pooled cross-sectional data from the Mexican National Household Income and Expenditure Survey 2008 and 2010. The analytical sample included 29,468 households in 2008 and 27,654 in 2010. The generalized ordered logistic model showed that the financial crisis significantly (P<0·05) decreased the probability of being food secure, mildly or moderately food insecure, compared with being severely food insecure (OR=0·74). A similar but smaller effect was found when comparing severely and moderately food-insecure households with mildly food-insecure and food-secure households (OR=0·81). The ordinary least squares model showed that the crisis significantly (P<0·05) increased the share of total income spent on food (β coefficient of 0·02). The quantile regression confirmed the findings suggested by the generalized ordered logistic model, showing that the effects of the crisis were more profound among poorer households. The results suggest that households that were more vulnerable before the financial crisis saw a worsened effect in terms of food insecurity with the crisis. Findings were consistent with both measures of food security--one based on self-reported experience and the other based on food spending.

  13. Heritability of Respiratory Infection with Pseudomonas aeruginosa in Cystic Fibrosis

    PubMed Central

    Green, Deanna M.; Collaco, J. Michael; McDougal, Kathryn E.; Naughton, Kathleen M.; Blackman, Scott M.; Cutting, Garry R.

    2013-01-01

    Objective To quantify the relative contribution of factors other than cystic fibrosis transmembrane conductance regulator genotype and environment on the acquisition of Pseudomonas aeruginosa (Pa) by patients with cystic fibrosis. Study design Lung infection with Pa and mucoid Pa was assessed using a co-twin study design of 44 monozygous (MZ) and 17 dizygous (DZ) twin pairs. Two definitions were used to establish infection: first positive culture and persistent positive culture. Genetic contribution to infection (ie, heritability) was estimated based on concordance analysis, logistic regression, and age at onset of infection through comparison of intraclass correlation coefficients. Results Concordance for persistent Pa infection was higher in MZ (0.83; 25 of 30 pairs) than DZ twins (0.45; 5 of 11 pairs), generating a heritability of 0.76. Logistic regression adjusted for age corroborated genetic control of persistent Pa infection. The correlation for age at persistent Pa infection was higher in MZ twins (0.589; 95% CI, 0.222-0.704) than in DZ twins (0.162; 95% CI, −0.352 to 0.607), generating a heritability of 0.85. Conclusion Genetic modifiers play a significant role in the establishment and timing of persistent Pa infection in individuals with cystic fibrosis. PMID:22364820
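
    The heritability figures in this abstract are consistent with Falconer's formula, h² = 2 × (r_MZ − r_DZ), applied to the twin concordances and to the intraclass correlations. The abstract does not name the formula, so treating it as the authors' method is an assumption; the sketch below only checks that the arithmetic reproduces the reported values.

```python
def falconer_heritability(r_mz, r_dz):
    """Falconer's estimate: twice the difference between MZ and DZ twin similarity."""
    return 2.0 * (r_mz - r_dz)

# Concordance for persistent Pa infection: 25 of 30 MZ pairs, 5 of 11 DZ pairs.
h2_concordance = falconer_heritability(25 / 30, 5 / 11)

# Intraclass correlations for age at persistent Pa infection.
h2_age = falconer_heritability(0.589, 0.162)

print(f"heritability from concordance:      {h2_concordance:.2f}")
print(f"heritability from age correlations: {h2_age:.2f}")
```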

  14. Use of multilevel logistic regression to identify the causes of differential item functioning.

    PubMed

    Balluerka, Nekane; Gorostiaga, Arantxa; Gómez-Benito, Juana; Hidalgo, María Dolores

    2010-11-01

    Given that a key function of tests is to serve as evaluation instruments and for decision making in the fields of psychology and education, the possibility that some of their items may show differential behaviour is a major concern for psychometricians. In recent decades, important progress has been made as regards the efficacy of techniques designed to detect this differential item functioning (DIF). However, the findings are scant when it comes to explaining its causes. The present study addresses this problem from the perspective of multilevel analysis. Starting from a case study in the area of transcultural comparisons, multilevel logistic regression is used: 1) to identify the item characteristics associated with the presence of DIF; 2) to estimate the proportion of variation in the DIF coefficients that is explained by these characteristics; and 3) to evaluate alternative explanations of the DIF by comparing the explanatory power or fit of different sequential models. The comparison of these models confirmed one of the two alternatives (familiarity with the stimulus) and rejected the other (the topic area) as being a cause of differential functioning with respect to the compared groups.

  15. An Entropy-Based Measure for Assessing Fuzziness in Logistic Regression

    PubMed Central

    Weiss, Brandi A.; Dardick, William

    2015-01-01

    This article introduces an entropy-based measure of data–model fit that can be used to assess the quality of logistic regression models. Entropy has previously been used in mixture-modeling to quantify how well individuals are classified into latent classes. The current study proposes the use of entropy for logistic regression models to quantify the quality of classification and separation of group membership. Entropy complements preexisting measures of data–model fit and provides unique information not contained in other measures. Hypothetical data scenarios, an applied example, and Monte Carlo simulation results are used to demonstrate the application of entropy in logistic regression. Entropy should be used in conjunction with other measures of data–model fit to assess how well logistic regression models classify cases into observed categories. PMID:29795897
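
    As a rough illustration of how entropy can summarize classification quality, the sketch below computes the normalized entropy index familiar from mixture modeling, adapted to the two-class case: 1 minus the average binary entropy of the predicted probabilities, scaled by ln 2. This is an assumption about the general form only; the article's exact definition may differ. The predicted probabilities are invented for the example.

```python
import math

def binary_entropy(p):
    """Entropy (in nats) of a Bernoulli(p) prediction; 0 at p = 0 or p = 1."""
    if p <= 0.0 or p >= 1.0:
        return 0.0
    return -(p * math.log(p) + (1.0 - p) * math.log(1.0 - p))

def normalized_entropy_index(probabilities):
    """1.0 means crisp classification; 0.0 means every prediction is 0.5."""
    n = len(probabilities)
    total = sum(binary_entropy(p) for p in probabilities)
    return 1.0 - total / (n * math.log(2))

# Hypothetical predicted probabilities from two fitted models.
crisp = normalized_entropy_index([0.02, 0.97, 0.01, 0.99])
fuzzy = normalized_entropy_index([0.45, 0.55, 0.50, 0.60])

print(f"crisp model index: {crisp:.3f}")
print(f"fuzzy model index: {fuzzy:.3f}")
```

    The index uses predicted probabilities only, so it describes how decisively the model separates cases rather than how often it is correct.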

  16. Logistic regression applied to natural hazards: rare event logistic regression with replications

    NASA Astrophysics Data System (ADS)

    Guns, M.; Vanacker, V.

    2012-06-01

    Statistical analysis of natural hazards needs particular attention, as most of these phenomena are rare events. This study shows that the ordinary rare event logistic regression, as it is now commonly used in geomorphologic studies, does not always lead to a robust detection of controlling factors, as the results can be strongly sample-dependent. In this paper, we introduce some concepts of Monte Carlo simulations in rare event logistic regression. This technique, so-called rare event logistic regression with replications, combines the strength of probabilistic and statistical methods, and allows overcoming some of the limitations of previous developments through robust variable selection. This technique was here developed for the analyses of landslide controlling factors, but the concept is widely applicable for statistical analyses of natural hazards.

  17. Large unbalanced credit scoring using Lasso-logistic regression ensemble.

    PubMed

    Wang, Hong; Xu, Qingsong; Zhou, Lifeng

    2015-01-01

    Recently, various ensemble learning methods with different base classifiers have been proposed for credit scoring problems. However, for various reasons, there has been little research using logistic regression as the base classifier. In this paper, given large unbalanced data, we consider the plausibility of ensemble learning using regularized logistic regression as the base classifier to deal with credit scoring problems. In this research, the data is first balanced and diversified by clustering and bagging algorithms. Then we apply a Lasso-logistic regression learning ensemble to evaluate the credit risks. We show that the proposed algorithm outperforms popular credit scoring models such as decision tree, Lasso-logistic regression and random forests in terms of AUC and F-measure. We also provide two importance measures for the proposed model to identify important variables in the data.

  18. An Entropy-Based Measure for Assessing Fuzziness in Logistic Regression.

    PubMed

    Weiss, Brandi A; Dardick, William

    2016-12-01

    This article introduces an entropy-based measure of data-model fit that can be used to assess the quality of logistic regression models. Entropy has previously been used in mixture-modeling to quantify how well individuals are classified into latent classes. The current study proposes the use of entropy for logistic regression models to quantify the quality of classification and separation of group membership. Entropy complements preexisting measures of data-model fit and provides unique information not contained in other measures. Hypothetical data scenarios, an applied example, and Monte Carlo simulation results are used to demonstrate the application of entropy in logistic regression. Entropy should be used in conjunction with other measures of data-model fit to assess how well logistic regression models classify cases into observed categories.

  19. Evaluating the High Risk Groups for Suicide: A Comparison of Logistic Regression, Support Vector Machine, Decision Tree and Artificial Neural Network

    PubMed Central

    AMINI, Payam; AHMADINIA, Hasan; POOROLAJAL, Jalal; MOQADDASI AMIRI, Mohammad

    2016-01-01

    Background: We aimed to assess the high-risk group for suicide using different classification methods including logistic regression (LR), decision tree (DT), artificial neural network (ANN), and support vector machine (SVM). Methods: We used the dataset of a study conducted to predict risk factors of completed suicide in Hamadan Province, the west of Iran, in 2010. To evaluate the high-risk groups for suicide, LR, SVM, DT and ANN were performed. The applied methods were compared using sensitivity, specificity, positive predicted value, negative predicted value, accuracy and the area under curve. Cochran-Q test was applied to check differences in proportion among methods. To assess the association between the observed and predicted values, Ø coefficient, contingency coefficient, and Kendall tau-b were calculated. Results: Gender, age, and job were the most important risk factors for fatal suicide attempts in common for four methods. SVM method showed the highest accuracy 0.68 and 0.67 for training and testing sample, respectively. However, this method resulted in the highest specificity (0.67 for training and 0.68 for testing sample) and the highest sensitivity for training sample (0.85), but the lowest sensitivity for the testing sample (0.53). Cochran-Q test resulted in differences between proportions in different methods (P<0.001). The association of SVM predictions and observed values, Ø coefficient, contingency coefficient, and Kendall tau-b were 0.239, 0.232 and 0.239, respectively. Conclusion: SVM had the best performance to classify fatal suicide attempts comparing to DT, LR and ANN. PMID:27957463
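
    The comparison measures listed in this abstract all derive from a 2x2 confusion matrix. The sketch below computes sensitivity, specificity, positive and negative predictive value, accuracy, and the Ø (phi) coefficient from the four cell counts. The counts are hypothetical and are not taken from the study.

```python
import math

def classification_metrics(tp, fp, fn, tn):
    """Standard binary-classification measures from confusion-matrix counts."""
    phi_denominator = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return {
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "ppv": tp / (tp + fp),
        "npv": tn / (tn + fn),
        "accuracy": (tp + tn) / (tp + fp + fn + tn),
        "phi": (tp * tn - fp * fn) / phi_denominator,
    }

# Hypothetical counts for illustration only.
example = classification_metrics(tp=40, fp=10, fn=20, tn=30)

for name, value in example.items():
    print(f"{name}: {value:.3f}")
```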

  20. Power and Sample Size Calculations for Logistic Regression Tests for Differential Item Functioning

    ERIC Educational Resources Information Center

    Li, Zhushan

    2014-01-01

    Logistic regression is a popular method for detecting uniform and nonuniform differential item functioning (DIF) effects. Theoretical formulas for the power and sample size calculations are derived for likelihood ratio tests and Wald tests based on the asymptotic distribution of the maximum likelihood estimators for the logistic regression model.…

  1. A Methodology for Generating Placement Rules that Utilizes Logistic Regression

    ERIC Educational Resources Information Center

    Wurtz, Keith

    2008-01-01

    The purpose of this article is to provide the necessary tools for institutional researchers to conduct a logistic regression analysis and interpret the results. Aspects of the logistic regression procedure that are necessary to evaluate models are presented and discussed with an emphasis on cutoff values and choosing the appropriate number of…

  2. Comparison of standard maximum likelihood classification and polytomous logistic regression used in remote sensing

    Treesearch

    John Hogland; Nedret Billor; Nathaniel Anderson

    2013-01-01

    Discriminant analysis, referred to as maximum likelihood classification within popular remote sensing software packages, is a common supervised technique used by analysts. Polytomous logistic regression (PLR), also referred to as multinomial logistic regression, is an alternative classification approach that is less restrictive, more flexible, and easy to interpret. To...

  3. Response Error in Reporting Dental Coverage by Older Americans in the Health and Retirement Study

    PubMed Central

    Manski, Richard J.; Mathiowetz, Nancy A.; Campbell, Nancy; Pepper, John V.

    2014-01-01

    The aim of this research was to analyze the inconsistency in responses to survey questions within the Health and Retirement Study (HRS) regarding insurance coverage of dental services. Self-reports of dental coverage in the dental services section were compared with those in the insurance section of the 2002 HRS to identify inconsistent responses. Logistic regression identified characteristics of persons reporting discrepancies and assessed the effect of measurement error on dental coverage coefficient estimates in dental utilization models. In 18% of cases, data reported in the insurance section contradicted data reported in the dental use section of the HRS by those who said insurance at least partially covered (or would have covered) their (hypothetical) dental use. Additional findings included distinct characteristics of persons with potential reporting errors and a downward bias to the regression coefficient for coverage in a dental use model without controls for inconsistent self-reports of coverage. This study offers evidence for the need to validate self-reports of dental insurance coverage among a survey population of older Americans to obtain more accurate estimates of coverage and its impact on dental utilization. PMID:25428430

  4. Large Unbalanced Credit Scoring Using Lasso-Logistic Regression Ensemble

    PubMed Central

    Wang, Hong; Xu, Qingsong; Zhou, Lifeng

    2015-01-01

    Recently, various ensemble learning methods with different base classifiers have been proposed for credit scoring problems. However, for various reasons, there has been little research using logistic regression as the base classifier. In this paper, given large unbalanced data, we consider the plausibility of ensemble learning using regularized logistic regression as the base classifier to deal with credit scoring problems. In this research, the data is first balanced and diversified by clustering and bagging algorithms. Then we apply a Lasso-logistic regression learning ensemble to evaluate the credit risks. We show that the proposed algorithm outperforms popular credit scoring models such as decision tree, Lasso-logistic regression and random forests in terms of AUC and F-measure. We also provide two importance measures for the proposed model to identify important variables in the data. PMID:25706988

  5. Serum Liver Fibrosis Markers in the Prognosis of Liver Cirrhosis: A Prospective Observational Study.

    PubMed

    Qi, Xingshun; Liu, Xu; Zhang, Yongguo; Hou, Yue; Ren, Linan; Wu, Chunyan; Chen, Jiang; Xia, Chunlian; Zhao, Jiajun; Wang, Di; Zhang, Yanlin; Zhang, Xia; Lin, Hao; Wang, Hezhi; Wang, Jinling; Cui, Zhongmin; Li, Xueyan; Deng, Han; Hou, Feifei; Peng, Ying; Wang, Xueying; Shao, Xiaodong; Li, Hongyu; Guo, Xiaozhong

    2016-08-02

    BACKGROUND The prognostic role of serum liver fibrosis markers in cirrhotic patients remains unclear. We performed a prospective observational study to evaluate the effect of amino-terminal pro-peptide of type III pro-collagen (PIIINP), collagen IV (CIV), laminin (LN), and hyaluronic acid (HA) on the prognosis of liver cirrhosis. MATERIAL AND METHODS All patients who were diagnosed with liver cirrhosis and admitted to our department were prospectively enrolled. PIIINP, CIV, LN, and HA levels were tested. RESULTS Overall, 108 cirrhotic patients were included. Correlation analysis demonstrated that CIV (coefficient r: 0.658, p<0.001; coefficient r: 0.368, p<0.001), LN (coefficient r: 0.450, p<0.001; coefficient r: 0.343, p<0.001), and HA (coefficient r: 0.325, p=0.001; coefficient r: 0.282, p=0.004) levels, but not PIIINP level (coefficient r: 0.081, p=0.414; coefficient r: 0.090, p=0.363), significantly correlated with Child-Pugh and MELD scores. Logistic regression analysis demonstrated that HA (odds ratio=1.00003, 95% confidence interval [CI]=1.000004-1.000056, p=0.022) was significantly associated with the 6-month mortality. Receiver operating characteristics analysis demonstrated that the area under the curve (AUC) of HA for predicting the 6-month mortality was 0.612 (95%CI=0.508-0.709, p=0.1531). CONCLUSIONS CIV, LN, and HA levels were significantly associated with the severity of liver dysfunction, but might be inappropriate for the prognostic assessment of liver cirrhosis.

  6. An Entropy-Based Measure for Assessing Fuzziness in Logistic Regression

    ERIC Educational Resources Information Center

    Weiss, Brandi A.; Dardick, William

    2016-01-01

    This article introduces an entropy-based measure of data-model fit that can be used to assess the quality of logistic regression models. Entropy has previously been used in mixture-modeling to quantify how well individuals are classified into latent classes. The current study proposes the use of entropy for logistic regression models to quantify…

  7. What Are the Odds of that? A Primer on Understanding Logistic Regression

    ERIC Educational Resources Information Center

    Huang, Francis L.; Moon, Tonya R.

    2013-01-01

    The purpose of this Methodological Brief is to present a brief primer on logistic regression, a commonly used technique when modeling dichotomous outcomes. Using data from the National Education Longitudinal Study of 1988 (NELS:88), logistic regression techniques were used to investigate student-level variables in eighth grade (i.e., enrolled in a…

  8. On the Usefulness of a Multilevel Logistic Regression Approach to Person-Fit Analysis

    ERIC Educational Resources Information Center

    Conijn, Judith M.; Emons, Wilco H. M.; van Assen, Marcel A. L. M.; Sijtsma, Klaas

    2011-01-01

    The logistic person response function (PRF) models the probability of a correct response as a function of the item locations. Reise (2000) proposed to use the slope parameter of the logistic PRF as a person-fit measure. He reformulated the logistic PRF model as a multilevel logistic regression model and estimated the PRF parameters from this…

  9. Mortality risk prediction in burn injury: Comparison of logistic regression with machine learning approaches.

    PubMed

    Stylianou, Neophytos; Akbarov, Artur; Kontopantelis, Evangelos; Buchan, Iain; Dunn, Ken W

    2015-08-01

    Predicting mortality from burn injury has traditionally employed logistic regression models. Alternative machine learning methods have been introduced in some areas of clinical prediction as the necessary software and computational facilities have become accessible. Here we compare logistic regression and machine learning predictions of mortality from burn. An established logistic mortality model was compared to machine learning methods (artificial neural network, support vector machine, random forests and naïve Bayes) using a population-based (England & Wales) case-cohort registry. Predictive evaluation used: area under the receiver operating characteristic curve; sensitivity; specificity; positive predictive value and Youden's index. All methods had comparable discriminatory abilities, similar sensitivities, specificities and positive predictive values. Although some machine learning methods performed marginally better than logistic regression the differences were seldom statistically significant and clinically insubstantial. Random forests were marginally better for high positive predictive value and reasonable sensitivity. Neural networks yielded slightly better prediction overall. Logistic regression gives an optimal mix of performance and interpretability. The established logistic regression model of burn mortality performs well against more complex alternatives. Clinical prediction with a small set of strong, stable, independent predictors is unlikely to gain much from machine learning outside specialist research contexts. Copyright © 2015 Elsevier Ltd and ISBI. All rights reserved.
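
    The threshold-based evaluation measures named in this abstract can all be computed from a 2x2 confusion table. A minimal sketch follows; the counts are hypothetical and are not taken from the study.

```python
def classification_metrics(tp, fp, fn, tn):
    """Return diagnostic metrics from the cells of a 2x2 confusion table."""
    sensitivity = tp / (tp + fn)  # proportion of true deaths predicted
    specificity = tn / (tn + fp)  # proportion of survivors predicted
    ppv = tp / (tp + fp)          # positive predictive value
    youden = sensitivity + specificity - 1  # Youden's index
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "ppv": ppv,
        "youden": youden,
    }


# Hypothetical counts chosen only to illustrate the calculation.
metrics = classification_metrics(tp=80, fp=30, fn=20, tn=870)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```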

  10. Bias in logistic regression due to imperfect diagnostic test results and practical correction approaches.

    PubMed

    Valle, Denis; Lima, Joanna M Tucker; Millar, Justin; Amratia, Punam; Haque, Ubydul

    2015-11-04

    Logistic regression is a statistical model widely used in cross-sectional and cohort studies to identify and quantify the effects of potential disease risk factors. However, the impact of imperfect tests on adjusted odds ratios (and thus on the identification of risk factors) is under-appreciated. The purpose of this article is to draw attention to the problem associated with modelling imperfect diagnostic tests, and propose simple Bayesian models to adequately address this issue. A systematic literature review was conducted to determine the proportion of malaria studies that appropriately accounted for false-negatives/false-positives in a logistic regression setting. Inference from the standard logistic regression was also compared with that from three proposed Bayesian models using simulations and malaria data from the western Brazilian Amazon. A systematic literature review suggests that malaria epidemiologists are largely unaware of the problem of using logistic regression to model imperfect diagnostic test results. Simulation results reveal that statistical inference can be substantially improved when using the proposed Bayesian models versus the standard logistic regression. Finally, analysis of original malaria data with one of the proposed Bayesian models reveals that microscopy sensitivity is strongly influenced by how long people have lived in the study region, and an important risk factor (i.e., participation in forest extractivism) is identified that would have been missed by standard logistic regression. Given the numerous diagnostic methods employed by malaria researchers and the ubiquitous use of logistic regression to model the results of these diagnostic tests, this paper provides critical guidelines to improve data analysis practice in the presence of misclassification error. Easy-to-use code that can be readily adapted to WinBUGS is provided, enabling straightforward implementation of the proposed Bayesian models.

  11. The association of coffee intake with liver cancer risk is mediated by biomarkers of inflammation and hepatocellular injury: data from the European Prospective Investigation into Cancer and Nutrition.

    PubMed

    Aleksandrova, Krasimira; Bamia, Christina; Drogan, Dagmar; Lagiou, Pagona; Trichopoulou, Antonia; Jenab, Mazda; Fedirko, Veronika; Romieu, Isabelle; Bueno-de-Mesquita, H Bas; Pischon, Tobias; Tsilidis, Kostas; Overvad, Kim; Tjønneland, Anne; Bouton-Ruault, Marie-Christine; Dossus, Laure; Racine, Antoine; Kaaks, Rudolf; Kühn, Tilman; Tsironis, Christos; Papatesta, Eleni-Maria; Saitakis, George; Palli, Domenico; Panico, Salvatore; Grioni, Sara; Tumino, Rosario; Vineis, Paolo; Peeters, Petra H; Weiderpass, Elisabete; Lukic, Marko; Braaten, Tonje; Quirós, J Ramón; Luján-Barroso, Leila; Sánchez, María-José; Chilarque, Maria-Dolores; Ardanas, Eva; Dorronsoro, Miren; Nilsson, Lena Maria; Sund, Malin; Wallström, Peter; Ohlsson, Bodil; Bradbury, Kathryn E; Khaw, Kay-Tee; Wareham, Nick; Stepien, Magdalena; Duarte-Salles, Talita; Assi, Nada; Murphy, Neil; Gunter, Marc J; Riboli, Elio; Boeing, Heiner; Trichopoulos, Dimitrios

    2015-12-01

    Higher coffee intake has been purportedly related to a lower risk of liver cancer. However, it remains unclear whether this association may be accounted for by specific biological mechanisms. We aimed to evaluate the potential mediating roles of inflammatory, metabolic, liver injury, and iron metabolism biomarkers on the association between coffee intake and the primary form of liver cancer, hepatocellular carcinoma (HCC). We conducted a prospective nested case-control study within the European Prospective Investigation into Cancer and Nutrition among 125 incident HCC cases matched to 250 controls using an incidence-density sampling procedure. The association of coffee intake with HCC risk was evaluated by using multivariable-adjusted conditional logistic regression that accounted for smoking, alcohol consumption, hepatitis infection, and other established liver cancer risk factors. The mediating effects of 21 biomarkers were evaluated on the basis of percentage changes and associated 95% CIs in the estimated regression coefficients of models with and without adjustment for biomarkers individually and in combination. The multivariable-adjusted RR of having ≥4 cups (600 mL) coffee/d compared with <2 cups (300 mL)/d was 0.25 (95% CI: 0.11, 0.62; P-trend = 0.006). A statistically significant attenuation of the association between coffee intake and HCC risk and thereby suspected mediation was confirmed for the inflammatory biomarker IL-6 and for the biomarkers of hepatocellular injury glutamate dehydrogenase, alanine aminotransferase, aspartate aminotransferase (AST), γ-glutamyltransferase (GGT), and total bilirubin, which, in combination, attenuated the regression coefficients by 72% (95% CI: 7%, 239%). Of the investigated biomarkers, IL-6, AST, and GGT produced the highest change in the regression coefficients: 40%, 56%, and 60%, respectively. These data suggest that the inverse association of coffee intake with HCC risk was partly accounted for by biomarkers of inflammation and hepatocellular injury.

  12. The association of coffee intake with liver cancer risk is mediated by biomarkers of inflammation and hepatocellular injury: data from the European Prospective Investigation into Cancer and Nutrition

    PubMed Central

    Aleksandrova, Krasimira; Bamia, Christina; Drogan, Dagmar; Lagiou, Pagona; Trichopoulou, Antonia; Jenab, Mazda; Fedirko, Veronika; Romieu, Isabelle; Bueno-de-Mesquita, H Bas; Pischon, Tobias; Tsilidis, Kostas; Overvad, Kim; Tjønneland, Anne; Bouton-Ruault, Marie-Christine; Dossus, Laure; Racine, Antoine; Kaaks, Rudolf; Kühn, Tilman; Tsironis, Christos; Papatesta, Eleni-Maria; Saitakis, George; Palli, Domenico; Panico, Salvatore; Grioni, Sara; Tumino, Rosario; Vineis, Paolo; Peeters, Petra H; Weiderpass, Elisabete; Lukic, Marko; Braaten, Tonje; Quirós, J Ramón; Luján-Barroso, Leila; Sánchez, María-José; Chilarque, Maria-Dolores; Ardanas, Eva; Dorronsoro, Miren; Nilsson, Lena Maria; Sund, Malin; Wallström, Peter; Ohlsson, Bodil; Bradbury, Kathryn E; Khaw, Kay-Tee; Wareham, Nick; Stepien, Magdalena; Duarte-Salles, Talita; Assi, Nada; Murphy, Neil; Gunter, Marc J; Riboli, Elio; Boeing, Heiner; Trichopoulos, Dimitrios

    2015-01-01

    Background: Higher coffee intake has been purportedly related to a lower risk of liver cancer. However, it remains unclear whether this association may be accounted for by specific biological mechanisms. Objective: We aimed to evaluate the potential mediating roles of inflammatory, metabolic, liver injury, and iron metabolism biomarkers on the association between coffee intake and the primary form of liver cancer—hepatocellular carcinoma (HCC). Design: We conducted a prospective nested case-control study within the European Prospective Investigation into Cancer and Nutrition among 125 incident HCC cases matched to 250 controls using an incidence-density sampling procedure. The association of coffee intake with HCC risk was evaluated by using multivariable-adjusted conditional logistic regression that accounted for smoking, alcohol consumption, hepatitis infection, and other established liver cancer risk factors. The mediating effects of 21 biomarkers were evaluated on the basis of percentage changes and associated 95% CIs in the estimated regression coefficients of models with and without adjustment for biomarkers individually and in combination. Results: The multivariable-adjusted RR of having ≥4 cups (600 mL) coffee/d compared with <2 cups (300 mL)/d was 0.25 (95% CI: 0.11, 0.62; P-trend = 0.006). A statistically significant attenuation of the association between coffee intake and HCC risk and thereby suspected mediation was confirmed for the inflammatory biomarker IL-6 and for the biomarkers of hepatocellular injury glutamate dehydrogenase, alanine aminotransferase, aspartate aminotransferase (AST), γ-glutamyltransferase (GGT), and total bilirubin, which—in combination—attenuated the regression coefficients by 72% (95% CI: 7%, 239%). Of the investigated biomarkers, IL-6, AST, and GGT produced the highest change in the regression coefficients: 40%, 56%, and 60%, respectively. 
Conclusion: These data suggest that the inverse association of coffee intake with HCC risk was partly accounted for by biomarkers of inflammation and hepatocellular injury. PMID:26561631

  13. Logistic regression for risk factor modelling in stuttering research.

    PubMed

    Reed, Phil; Wu, Yaqionq

    2013-06-01

    To outline the uses of logistic regression and other statistical methods for risk factor analysis in the context of research on stuttering. The principles underlying the application of a logistic regression are illustrated, and the types of questions to which such a technique has been applied in the stuttering field are outlined. The assumptions and limitations of the technique are discussed with respect to existing stuttering research, and with respect to formulating appropriate research strategies to accommodate these considerations. Finally, some alternatives to the approach are briefly discussed. The way the statistical procedures are employed is demonstrated with some hypothetical data. Research into several practical issues concerning stuttering could benefit if risk factor modelling were used. Important examples are early diagnosis, prognosis (whether a child will recover or persist) and assessment of treatment outcome. After reading this article, the reader will be able to: (a) summarize the situations in which logistic regression can be applied to a range of issues about stuttering; (b) follow the steps in performing a logistic regression analysis; (c) describe the assumptions of the logistic regression technique and the precautions that need to be checked when it is employed; (d) summarize its advantages over other techniques like estimation of group differences and simple regression. Copyright © 2012 Elsevier Inc. All rights reserved.

  14. Dietary intake in adults at risk for Huntington disease: analysis of PHAROS research participants.

    PubMed

    Marder, K; Zhao, H; Eberly, S; Tanner, C M; Oakes, D; Shoulson, I

    2009-08-04

    To examine caloric intake, dietary composition, and body mass index (BMI) in participants in the Prospective Huntington At Risk Observational Study (PHAROS). Caloric intake and macronutrient composition were measured using the National Cancer Institute Food Frequency Questionnaire (FFQ) in 652 participants at risk for Huntington disease (HD) who did not meet clinical criteria for HD. Logistic regression was used to examine the relationship between macronutrients, BMI, caloric intake, and genetic status (CAG <37 vs CAG ≥37), adjusting for age, gender, and education. Linear regression was used to determine the relationship between caloric intake, BMI, and CAG repeat length. A total of 435 participants with CAG <37 and 217 with CAG ≥37 completed the FFQ. Individuals in the CAG ≥37 group had a twofold odds of being represented in the second, third, or fourth quartile of caloric intake compared to the lowest quartile adjusted for age, gender, education, and BMI. This relationship was attenuated in the highest quartile when additionally adjusted for total motor score. In subjects with CAG ≥37, higher caloric intake, but not BMI, was associated with both higher CAG repeat length (adjusted regression coefficient = 0.26, p = 0.032) and 5-year probability of onset of HD (adjusted regression coefficient = 0.024; p = 0.013). Adjusted analyses showed no differences in macronutrient composition between groups. Increased caloric intake may be necessary to maintain body mass index in clinically unaffected individuals with CAG repeat length ≥37. This may be related to increased energy expenditure due to subtle motor impairment or a hypermetabolic state.

  15. Dynamic Dimensionality Selection for Bayesian Classifier Ensembles

    DTIC Science & Technology

    2015-03-19

    learning of weights in an otherwise generatively learned naive Bayes classifier. WANBIA-C is very competitive to Logistic Regression but much more...classifier, Generative learning, Discriminative learning, Naïve Bayes, Feature selection, Logistic regression, higher order attribute independence 16...discriminative learning of weights in an otherwise generatively learned naive Bayes classifier. WANBIA-C is very competitive to Logistic Regression but

  16. A review of logistic regression models used to predict post-fire tree mortality of western North American conifers

    Treesearch

    Travis Woolley; David C. Shaw; Lisa M. Ganio; Stephen Fitzgerald

    2012-01-01

    Logistic regression models used to predict tree mortality are critical to post-fire management, planning prescribed burns and understanding disturbance ecology. We review literature concerning post-fire mortality prediction using logistic regression models for coniferous tree species in the western USA. We include synthesis and review of: methods to develop, evaluate...

  17. Preserving Institutional Privacy in Distributed binary Logistic Regression.

    PubMed

    Wu, Yuan; Jiang, Xiaoqian; Ohno-Machado, Lucila

    2012-01-01

    Privacy is becoming a major concern when sharing biomedical data across institutions. Although methods for protecting privacy of individual patients have been proposed, it is not clear how to protect the institutional privacy, which is often a critical concern of data custodians. Built upon our previous work, Grid Binary LOgistic REgression (GLORE), we developed an Institutional Privacy-preserving Distributed binary Logistic Regression model (IPDLR) that considers both individual and institutional privacy for building a logistic regression model in a distributed manner. We tested our method using both simulated and clinical data, showing how it is possible to protect the privacy of individuals and of institutions using a distributed strategy.

  18. Covariate Imbalance and Adjustment for Logistic Regression Analysis of Clinical Trial Data

    PubMed Central

    Ciolino, Jody D.; Martin, Reneé H.; Zhao, Wenle; Jauch, Edward C.; Hill, Michael D.; Palesch, Yuko Y.

    2014-01-01

    In logistic regression analysis for binary clinical trial data, adjusted treatment effect estimates are often not equivalent to unadjusted estimates in the presence of influential covariates. This paper uses simulation to quantify the benefit of covariate adjustment in logistic regression. However, International Conference on Harmonization guidelines suggest that covariate adjustment be pre-specified. Unplanned adjusted analyses should be considered secondary. Results suggest that if adjustment is not possible or unplanned in a logistic setting, balance in continuous covariates can alleviate some (but never all) of the shortcomings of unadjusted analyses. The case of log binomial regression is also explored. PMID:24138438

  19. Differentially private distributed logistic regression using private and public data.

    PubMed

    Ji, Zhanglong; Jiang, Xiaoqian; Wang, Shuang; Xiong, Li; Ohno-Machado, Lucila

    2014-01-01

    Privacy protection is an important issue in medical informatics and differential privacy is a state-of-the-art framework for data privacy research. Differential privacy offers provable privacy against attackers who have auxiliary information, and can be applied to data mining models (for example, logistic regression). However, differentially private methods sometimes introduce too much noise and make outputs less useful. Given available public data in medical research (e.g. from patients who sign open-consent agreements), we can design algorithms that use both public and private data sets to decrease the amount of noise that is introduced. In this paper, we modify the update step in Newton-Raphson method to propose a differentially private distributed logistic regression model based on both public and private data. We try our algorithm on three different data sets, and show its advantage over: (1) a logistic regression model based solely on public data, and (2) a differentially private distributed logistic regression model based on private data under various scenarios. Logistic regression models built with our new algorithm based on both private and public datasets demonstrate better utility than models trained on private or public datasets alone without sacrificing the rigorous privacy guarantee.

  20. Logistic regression analysis of conventional ultrasonography, strain elastosonography, and contrast-enhanced ultrasound characteristics for the differentiation of benign and malignant thyroid nodules

    PubMed Central

    Deng, Yingyuan; Wang, Tianfu; Chen, Siping; Liu, Weixiang

    2017-01-01

    The aim of the study is to screen the significant sonographic features by logistic regression analysis and fit a model to diagnose thyroid nodules. A total of 525 pathological thyroid nodules were retrospectively analyzed. All the nodules underwent conventional ultrasonography (US), strain elastosonography (SE), and contrast-enhanced ultrasound (CEUS). Those nodules’ 12 suspicious sonographic features were used to assess thyroid nodules. The significant features of diagnosing thyroid nodules were picked out by logistic regression analysis. All variables that were statistically related to diagnosis of thyroid nodules, at a level of p < 0.05, were included in a logistic regression analysis model. The significant features in the logistic regression model of diagnosing thyroid nodules were calcification, suspected cervical lymph node metastasis, hypoenhancement pattern, margin, shape, vascularity, posterior acoustic, echogenicity, and elastography score. According to the results of logistic regression analysis, the formula that could predict whether or not thyroid nodules are malignant was established. The area under the receiver operating characteristic (ROC) curve was 0.930 and the sensitivity, specificity, accuracy, positive predictive value, and negative predictive value were 83.77%, 89.56%, 87.05%, 86.04%, and 87.79% respectively. PMID:29228030

  1. Logistic regression analysis of conventional ultrasonography, strain elastosonography, and contrast-enhanced ultrasound characteristics for the differentiation of benign and malignant thyroid nodules.

    PubMed

    Pang, Tiantian; Huang, Leidan; Deng, Yingyuan; Wang, Tianfu; Chen, Siping; Gong, Xuehao; Liu, Weixiang

    2017-01-01

    The aim of the study is to screen the significant sonographic features by logistic regression analysis and fit a model to diagnose thyroid nodules. A total of 525 pathological thyroid nodules were retrospectively analyzed. All the nodules underwent conventional ultrasonography (US), strain elastosonography (SE), and contrast-enhanced ultrasound (CEUS). Those nodules' 12 suspicious sonographic features were used to assess thyroid nodules. The significant features of diagnosing thyroid nodules were picked out by logistic regression analysis. All variables that were statistically related to diagnosis of thyroid nodules, at a level of p < 0.05, were included in a logistic regression analysis model. The significant features in the logistic regression model of diagnosing thyroid nodules were calcification, suspected cervical lymph node metastasis, hypoenhancement pattern, margin, shape, vascularity, posterior acoustic, echogenicity, and elastography score. According to the results of logistic regression analysis, the formula that could predict whether or not thyroid nodules are malignant was established. The area under the receiver operating characteristic (ROC) curve was 0.930 and the sensitivity, specificity, accuracy, positive predictive value, and negative predictive value were 83.77%, 89.56%, 87.05%, 86.04%, and 87.79% respectively.

  2. Prevalence and Determinants of Preterm Birth in Tehran, Iran: A Comparison between Logistic Regression and Decision Tree Methods.

    PubMed

    Amini, Payam; Maroufizadeh, Saman; Samani, Reza Omani; Hamidi, Omid; Sepidarkish, Mahdi

    2017-06-01

    Preterm birth (PTB) is a leading cause of neonatal death and the second biggest cause of death in children under five years of age. The objective of this study was to determine the prevalence of PTB and its associated factors using logistic regression and decision tree classification methods. This cross-sectional study was conducted on 4,415 pregnant women in Tehran, Iran, from July 6-21, 2015. Data were collected by a researcher-developed questionnaire through interviews with mothers and review of their medical records. To evaluate the accuracy of the logistic regression and decision tree methods, several indices such as sensitivity, specificity, and the area under the curve were used. The PTB rate was 5.5% in this study. The logistic regression outperformed the decision tree for the classification of PTB based on risk factors. Logistic regression showed that multiple pregnancies, mothers with preeclampsia, and those who conceived with assisted reproductive technology had an increased risk for PTB (p < 0.05). Identifying and training mothers at risk as well as improving prenatal care may reduce the PTB rate. We also recommend that statisticians utilize the logistic regression model for the classification of risk groups for PTB.

  3. Modeling the rheological behavior of thermosonic extracted guava, pomelo, and soursop juice concentrates at different concentration and temperature using a new combination model

    PubMed Central

    Abdullah, Norazlin; Yusof, Yus A.; Talib, Rosnita A.

    2017-01-01

    Abstract: This study has modeled the rheological behavior of thermosonic extracted pink‐fleshed guava, pink‐fleshed pomelo, and soursop juice concentrates at different concentrations and temperatures. The effects of concentration on consistency coefficient (K) and flow behavior index (n) of the fruit juice concentrates were modeled using a master curve which utilized the concentration‐temperature shifting to allow a general prediction of rheological behaviors covering a wide concentration. For modeling the effects of temperature on K and n, the integration of two functions from the Arrhenius and logistic sigmoidal growth equations has provided a new model which gave better description of the properties. It also alleviated the problems of negative region when using the Arrhenius model alone. The fitted regression using this new model has improved coefficient of determination, with R² values above 0.9792, as compared to using the Arrhenius and logistic sigmoidal models alone, which presented minimum R² of 0.6243 and 0.9440, respectively. Practical applications: In general, juice concentrate is a better form of food for transportation, preservation, and ingredient. Models are necessary to predict the effects of processing factors such as concentration and temperature on the rheological behavior of juice concentrates. The modeling approach allows prediction of behaviors and determination of processing parameters. The master curve model introduced in this study simplifies and generalizes rheological behavior of juice concentrates over a wide range of concentration when temperature factor is insignificant. The proposed new mathematical model from the combination of the Arrhenius and logistic sigmoidal growth models has improved and extended description of rheological properties of fruit juice concentrates. It also solved problems of negative values of consistency coefficient and flow behavior index prediction using existing model, the Arrhenius equation. This rheological data modeling provides good information for the juice processing and equipment manufacturing needs. PMID:29479123

  4. Ridge: a computer program for calculating ridge regression estimates

    Treesearch

    Donald E. Hilt; Donald W. Seegrist

    1977-01-01

    Least-squares coefficients for multiple-regression models may be unstable when the independent variables are highly correlated. Ridge regression is a biased estimation procedure that produces stable estimates of the coefficients. Ridge regression is discussed, and a computer program for calculating the ridge coefficients is presented.
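
    The instability described in this abstract can be seen in a small example. The sketch below is not the program from the report; it is a minimal closed-form illustration for two centered predictors without an intercept, with made-up data and a ridge constant k chosen arbitrarily.

```python
def ridge_two_predictors(x1, x2, y, k):
    """Ridge estimates for a two-predictor model without intercept.

    Solves (X'X + k*I) b = X'y in closed form; k = 0 gives least squares.
    """
    s11 = sum(a * a for a in x1) + k
    s22 = sum(b * b for b in x2) + k
    s12 = sum(a * b for a, b in zip(x1, x2))
    g1 = sum(a * c for a, c in zip(x1, y))
    g2 = sum(b * c for b, c in zip(x2, y))
    det = s11 * s22 - s12 * s12
    b1 = (s22 * g1 - s12 * g2) / det
    b2 = (s11 * g2 - s12 * g1) / det
    return b1, b2


# Hypothetical centered data: x1 and x2 are almost collinear.
x1 = [-2.0, -1.0, 0.0, 1.0, 2.0]
x2 = [-2.1, -0.9, 0.1, 0.9, 2.0]
y = [-4.2, -1.8, 0.3, 1.7, 4.0]

ols = ridge_two_predictors(x1, x2, y, k=0.0)    # unstable least squares
ridge = ridge_two_predictors(x1, x2, y, k=1.0)  # shrunken, more stable
print("least squares:", ols)
print("ridge (k=1):  ", ridge)
```

    With these data the least-squares coefficients take opposite signs even though both predictors move with y, while the ridge coefficients are close to each other and smaller in overall magnitude.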

  5. The Predictive Effects of Protection Motivation Theory on Intention and Behaviour of Physical Activity in Patients with Type 2 Diabetes.

    PubMed

    Ali Morowatisharifabad, Mohammad; Abdolkarimi, Mahdi; Asadpour, Mohammad; Fathollahi, Mahmood Sheikh; Balaee, Parisa

    2018-04-15

    Theory-based education tailored to target behaviour and group can be effective in promoting physical activity. The purpose of this study was to examine the predictive power of Protection Motivation Theory on intent and behaviour of Physical Activity in Patients with Type 2 Diabetes. This descriptive study was conducted on 250 patients in Rafsanjan, Iran. To examine the scores of protection motivation theory structures, a researcher-made questionnaire was used. Its validity and reliability were confirmed. The level of physical activity was also measured by the International Short-form Physical Activity Inventory. Its validity and reliability were also approved. Data were analysed by statistical tests including correlation coefficient, chi-square, logistic regression and linear regression. The results revealed that there was a significant correlation between all the protection motivation theory constructs and the intention to do physical activity. The results showed that the Theory structures were able to predict 60% of the variance of physical activity intention. The results of logistic regression demonstrated that an increase in the score of physical activity intent and self-efficacy increased the chance of a higher level of physical activity by 3.4 and 1.5 times, respectively (OR = 3.39 and 1.54). Considering the ability of protection motivation theory structures to explain the physical activity behaviour, interventional designs are suggested based on the structures of this theory, especially to improve self-efficacy as the most powerful factor in predicting physical activity intention and behaviour.

  6. [Job retention and nursing practice environment of hospital nurses in Japan applying the Japanese version of the Practice Environment Scale of the Nursing Work Index (PES-NWI)].

    PubMed

    Ogata, Yasuko; Nagano, Midori; Fukuda, Takashi; Hashimoto, Michio

    2011-06-01

    The purpose of this study was to examine how the nursing practice environment affects job retention and the turnover rate among hospital nurses. The Practice Environment Scale of the Nursing Work Index (PES-NWI) was applied to investigate the nurse working environment from the viewpoint of hospital nurses in Japan. A postal mail survey was conducted using the PES-NWI questionnaire targeting 2,211 nurses who were working at 91 wards in 5 hospitals situated in the Tokyo metropolitan area from February to March in 2008. In the questionnaire, hospital nurses were asked about characteristics such as sex, age and work experience as a nurse, whether they would work at the same hospital in the next year, the 31 items of the PES-NWI and job satisfaction. Nurse managers were asked to provide staff numbers to calculate the turnover rate of each ward. Logistic regression analyses were carried out, with "intention to remain at or leave the workplace next year" as the dependent variable, with composite and 5 sub-scale scores of the PES-NWI and nurse characteristics as independent variables. Correlation coefficients were calculated to investigate the relationship between nurse turnover rates and nursing practice environments. A total of 1,067 full-time nurses (48.3%) from 5 hospitals responded. Almost all of them were women (95.9%), with an average age of 29.2 years old. They had an average of 7.0 years total work experience in hospitals and 5.8 years of experience at their current hospital. Cronbach's alpha coefficients were 0.75 for composite of the PES-NWI, and 0.77-0.85 for the sub-scales. All correlation coefficients between PES-NWI and job satisfaction were significant (P < 0.01). In the logistic regression analysis, a composite of PES-NWI, "Nurse Manager's Ability, Leadership, and Support of Nurses" and "Staffing and Resource Adequacy" among the 5 sub-scales correlated with the intention of nurses to stay on (P < 0.05). The means for turnover rate were 10.4% for nurses and 17.6% for newly hired nurses. These rates were significantly correlated to the composite and some sub-scales of the PES-NWI. The working environment for nurses is important in retaining nurses working at hospitals. We confirmed the reliability and the validity of the PES-NWI scale based on the magnitude of the Cronbach's alpha coefficient and correlation coefficient between the PES-NWI scale and job satisfaction in this study.

  7. Logistic regression for dichotomized counts.

    PubMed

    Preisser, John S; Das, Kalyan; Benecha, Habtamu; Stamm, John W

    2016-12-01

    Sometimes there is interest in a dichotomized outcome indicating whether a count variable is positive or zero. Under this scenario, the application of ordinary logistic regression may result in efficiency loss, which is quantifiable under an assumed model for the counts. In such situations, a shared-parameter hurdle model is investigated for more efficient estimation of regression parameters relating to overall effects of covariates on the dichotomous outcome, while handling count data with many zeroes. One model part provides a logistic regression containing marginal log odds ratio effects of primary interest, while an ancillary model part describes the mean count of a Poisson or negative binomial process in terms of nuisance regression parameters. Asymptotic efficiency of the logistic model parameter estimators of the two-part models is evaluated with respect to ordinary logistic regression. Simulations are used to assess the properties of the models with respect to power and Type I error, the latter investigated under both misspecified and correctly specified models. The methods are applied to data from a randomized clinical trial of three toothpaste formulations to prevent incident dental caries in a large population of Scottish schoolchildren. © The Author(s) 2014.
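
    The link between a count model and the dichotomized outcome can be made concrete in the simplest case. The sketch below assumes a plain Poisson count with mean mu, which is a simplification of the shared-parameter hurdle model in the abstract; the function names are illustrative only.

```python
import math


def prob_positive_poisson(mu):
    """P(count > 0) when the count is Poisson with mean mu."""
    return 1.0 - math.exp(-mu)


def log_odds_positive_poisson(mu):
    """Log odds of a positive count under the same Poisson assumption.

    odds = (1 - exp(-mu)) / exp(-mu) = exp(mu) - 1
    """
    return math.log(math.expm1(mu))


# With mu = ln 2, a zero count and a positive count are equally likely.
mu = math.log(2.0)
p = prob_positive_poisson(mu)
log_odds = log_odds_positive_poisson(mu)
print(p, log_odds)
```

    Because the log odds of a positive count are a fixed function of the mean count under this assumption, information in the counts can in principle sharpen estimates for the binary outcome.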

  8. Predicting 30-day Hospital Readmission with Publicly Available Administrative Database. A Conditional Logistic Regression Modeling Approach.

    PubMed

    Zhu, K; Lou, Z; Zhou, J; Ballester, N; Kong, N; Parikh, P

    2015-01-01

    This article is part of the Focus Theme of Methods of Information in Medicine on "Big Data and Analytics in Healthcare". Hospital readmissions raise healthcare costs and cause significant distress to providers and patients. It is, therefore, of great interest to healthcare organizations to predict which patients are at risk of being readmitted to their hospitals. However, current logistic regression based risk prediction models have limited prediction power when applied to hospital administrative data. Meanwhile, although decision trees and random forests have been applied, they tend to be too complex for hospital practitioners to understand. This study explores the use of conditional logistic regression to increase the prediction accuracy. We analyzed an HCUP statewide inpatient discharge record dataset, which includes patient demographics, clinical and care utilization data from California. We extracted records of heart failure Medicare beneficiaries who had inpatient experience during an 11-month period. We corrected the data imbalance issue with under-sampling. In our study, we first applied standard logistic regression and decision tree to obtain influential variables and derive practically meaningful decision rules. We then stratified the original data set accordingly and applied logistic regression on each data stratum. We further explored the effect of interacting variables in the logistic regression modeling. We conducted cross validation to assess the overall prediction performance of conditional logistic regression (CLR) and compared it with standard classification models. The developed CLR models outperformed several standard classification models (e.g., straightforward logistic regression, stepwise logistic regression, random forest, support vector machine). For example, the best CLR model improved the classification accuracy by nearly 20% over the straightforward logistic regression model. 
Furthermore, the developed CLR models tend to achieve more than 10% better sensitivity than the standard classification models, which translates to correctly labeling an additional 400 - 500 readmissions for heart failure patients in the state of California over a year. Lastly, several key predictors identified from the HCUP data include the disposition location from discharge, the number of chronic conditions, and the number of acute procedures. It would be beneficial to apply simple decision rules obtained from the decision tree in an ad-hoc manner to guide the cohort stratification. It could be potentially beneficial to explore the effect of pairwise interactions between influential predictors when building the logistic regression models for different data strata. Judicious use of the ad-hoc CLR models developed offers insights into future development of prediction models for hospital readmissions, which can lead to better intuition in identifying high-risk patients and developing effective post-discharge care strategies. Lastly, this paper is expected to raise the awareness of collecting data on additional markers and developing necessary database infrastructure for larger-scale exploratory studies on readmission risk prediction.

  9. Next day discharge rate has little use as a quality measure for individual physician performance.

    PubMed

    Inabnit, Christopher; Markwell, Stephen; Gruwell, Jack; Jaeger, Cassie; Millburg, Lance; Griffen, David

    2018-06-18

    Emergency Department (ED) physicians' next day discharge rate (NDDR), the percentage of patients who were admitted from the ED and subsequently discharged within the next calendar day was hypothesized as a potential measure for unnecessary admissions. The objective was to determine if NDDR has validity as a measure for quality of individual ED physician performance. Hospital admission data was obtained for thirty-six ED physicians for calendar year 2015. Funnel plots were used to identify NDDR outliers beyond 95% control limits. A mixed model logistic regression was built to investigate factors contributing to NDDR. To determine yearly variation, data from calendar years 2014 and 2016 were analyzed, again by funnel plots and logistic regression. Intraclass correlation coefficient was used to estimate the percent of total variation in NDDR attributable to individual ED physicians. NDDR varied significantly among ED physicians. Individual ED physician outliers in NDDR varied year to year. Individual ED physician contribution to NDDR variation was minimal, accounting for 1%. Years of experience in Emergency Medicine practice was not correlated with NDDR. NDDR does not appear to be a reliable independent quality measure for individual ED physician performance. The percent of variance attributable to the ED physician was 1%. Copyright © 2018. Published by Elsevier Inc.
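
    The abstract above reports an intraclass correlation coefficient of about 1% from a mixed model logistic regression but does not say how it was computed. The sketch below uses one common convention for random-intercept logistic models, the latent-variable ICC, in which the level-1 residual variance is fixed at pi^2/3. The function names are hypothetical, and the choice of this convention is an assumption, not the authors' stated method.

```python
import math

# Variance of the standard logistic distribution, used as the level-1
# residual variance in the latent-variable formulation (an assumption here).
LOGISTIC_RESIDUAL_VARIANCE = math.pi ** 2 / 3


def latent_icc(random_intercept_variance):
    """Share of latent-scale variance attributable to the grouping factor."""
    return random_intercept_variance / (
        random_intercept_variance + LOGISTIC_RESIDUAL_VARIANCE
    )


def variance_for_icc(icc):
    """Random-intercept variance that would yield a given latent-scale ICC."""
    return icc * LOGISTIC_RESIDUAL_VARIANCE / (1.0 - icc)


# Under this convention, an ICC of 0.01 corresponds to a small
# physician-level variance of roughly 0.033 on the logit scale.
implied_variance = variance_for_icc(0.01)
print(f"implied random-intercept variance: {implied_variance:.3f}")
```

    Read this way, a 1% ICC means the between-physician variance is tiny relative to the residual variance, which is consistent with the abstract's conclusion that individual physicians contribute little to NDDR variation.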

  10. Assessment of Differential Item Functioning in Health-Related Outcomes: A Simulation and Empirical Analysis with Hierarchical Polytomous Data

    PubMed Central

    Sharafi, Zahra

    2017-01-01

    Background The purpose of this study was to evaluate the effectiveness of two methods of detecting differential item functioning (DIF) in the presence of multilevel data and polytomously scored items. The assessment of DIF with multilevel data (e.g., patients nested within hospitals, hospitals nested within districts) from large-scale assessment programs has received considerable attention but very few studies evaluated the effect of hierarchical structure of data on DIF detection for polytomously scored items. Methods The ordinal logistic regression (OLR) and hierarchical ordinal logistic regression (HOLR) were utilized to assess DIF in simulated and real multilevel polytomous data. Six factors (DIF magnitude, grouping variable, intraclass correlation coefficient, number of clusters, number of participants per cluster, and item discrimination parameter) with a fully crossed design were considered in the simulation study. Furthermore, data of Pediatric Quality of Life Inventory™ (PedsQL™) 4.0 collected from 576 healthy school children were analyzed. Results Overall, results indicate that both methods performed equivalently in terms of controlling Type I error and detection power rates. Conclusions The current study showed negligible difference between OLR and HOLR in detecting DIF with polytomously scored items in a hierarchical structure. Implications and considerations while analyzing real data were also discussed. PMID:29312463

  11. Assessment of Differential Item Functioning in Health-Related Outcomes: A Simulation and Empirical Analysis with Hierarchical Polytomous Data.

    PubMed

    Sharafi, Zahra; Mousavi, Amin; Ayatollahi, Seyyed Mohammad Taghi; Jafari, Peyman

    2017-01-01

    The purpose of this study was to evaluate the effectiveness of two methods of detecting differential item functioning (DIF) in the presence of multilevel data and polytomously scored items. The assessment of DIF with multilevel data (e.g., patients nested within hospitals, hospitals nested within districts) from large-scale assessment programs has received considerable attention but very few studies evaluated the effect of hierarchical structure of data on DIF detection for polytomously scored items. The ordinal logistic regression (OLR) and hierarchical ordinal logistic regression (HOLR) were utilized to assess DIF in simulated and real multilevel polytomous data. Six factors (DIF magnitude, grouping variable, intraclass correlation coefficient, number of clusters, number of participants per cluster, and item discrimination parameter) with a fully crossed design were considered in the simulation study. Furthermore, data of Pediatric Quality of Life Inventory™ (PedsQL™) 4.0 collected from 576 healthy school children were analyzed. Overall, results indicate that both methods performed equivalently in terms of controlling Type I error and detection power rates. The current study showed negligible difference between OLR and HOLR in detecting DIF with polytomously scored items in a hierarchical structure. Implications and considerations while analyzing real data were also discussed.

  12. Molecular properties of steroids involved in their effects on the biophysical state of membranes.

    PubMed

    Wenz, Jorge J

    2015-10-01

    The activity of steroids on membranes was studied in relation to their ordering, rigidifying, condensing and/or raft promoting ability. The structures of 82 steroids were modeled by a semi-empirical procedure (AM1) and 245 molecular descriptors were next computed on the optimized energy conformations. Principal component analysis, mean contrasting and logistic regression were used to correlate the molecular properties with 212 cases of documented activities. It was possible to group steroids based on their properties and activities, indicating that steroids having similar molecular properties have similar activities on membranes. Steroids having high values of area, partition coefficient, volume, number of rotatable bonds, molar refractivity, polarizability or mass displayed ordering, rigidifying, condensing and/or raft promoting activity on membranes higher than those steroids having low values in such molecular properties. After a variable selection procedure circumventing correlation problems among descriptors, area and log P were found as the most relevant properties in governing and predicting the activity of steroids on membranes. A logistic regression model as a function of the area and log P of the steroids is proposed, which is able to predict correctly 92.5% of the cases. A rationale of the findings is discussed. Copyright © 2015 Elsevier B.V. All rights reserved.

  13. Subjective vs objective evaluations of smile esthetics.

    PubMed

    Schabel, Brian J; Franchi, Lorenzo; Baccetti, Tiziano; McNamara, James A

    2009-04-01

    The aim of this study was to analyze the relationships between subjective evaluations of posttreatment smiles captured with clinical photography and rated by a panel of orthodontists and parents of orthodontic patients, and objective evaluations of the same smiles from the Smile Mesh program (TDG Computing, Philadelphia, Pa). The clinical photographs of 48 orthodontically treated patients were rated by a panel of 25 experienced orthodontists and 20 parents of patients. Independent samples t tests were used to test whether objective measurements were significantly different between subjects with "attractive" and "unattractive" smiles, and those with the "most attractive" and "least attractive" smiles. Additionally, logistic regression was performed to evaluate whether the measurements could predict whether a smile captured with clinical photography would be attractive or unattractive. The comparison between groups showed no significant differences for any measurement. Subjects with the "most unattractive" smiles had a significantly greater distance between the incisal edge of the maxillary central incisors and the lower lip during smiling, and a significantly smaller smile index than did those with the "most attractive" smiles. As shown by the coefficients of logistic regression, smile attractiveness could not be predicted by any objectively gathered measurement. No objective measure of the smile could predict attractive or unattractive smiles as judged subjectively.

  14. Predicting the probability of elevated nitrate concentrations in the Puget Sound Basin: Implications for aquifer susceptibility and vulnerability

    USGS Publications Warehouse

    Tesoriero, A.J.; Voss, F.D.

    1997-01-01

    The occurrence and distribution of elevated nitrate concentrations (≥ 3 mg/l) in ground water in the Puget Sound Basin, Washington, were determined by examining existing data from more than 3000 wells. Models that estimate the probability that a well has an elevated nitrate concentration were constructed by relating the occurrence of elevated nitrate concentrations to both natural and anthropogenic variables using logistic regression. The variables that best explain the occurrence of elevated nitrate concentrations were well depth, surficial geology, and the percentage of urban and agricultural land within a radius of 3.2 kilometers of the well. From these relations, logistic regression models were developed to assess aquifer susceptibility (relative ease with which contaminants will reach aquifer) and ground-water vulnerability (relative ease with which contaminants will reach aquifer for a given set of land-use practices). Both models performed well at predicting the probability of elevated nitrate concentrations in an independent data set. This approach to assessing aquifer susceptibility and ground-water vulnerability has the advantages of having both model variables and coefficient values determined on the basis of existing water quality information and does not depend on the assignment of variables and weighting factors based on qualitative criteria.
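
    As an illustration of how a fitted logistic regression of this kind turns well characteristics into a probability, the sketch below applies the logistic function to a linear predictor. The intercept, the coefficient values and signs, and the variable names are all hypothetical placeholders chosen for the example; they are not the coefficients reported by the study, and the categorical surficial geology term is omitted for brevity.

```python
import math


def elevated_nitrate_probability(intercept, coefficients, values):
    """Logistic model: p = 1 / (1 + exp(-(b0 + sum_k b_k * x_k)))."""
    linear_predictor = intercept + sum(
        coefficients[name] * values[name] for name in coefficients
    )
    return 1.0 / (1.0 + math.exp(-linear_predictor))


# Hypothetical coefficients, for illustration only.
coefficients = {
    "well_depth_m": -0.02,        # assumed: deeper wells are less exposed
    "pct_agricultural": 0.03,     # assumed: more agricultural land, higher odds
    "pct_urban": 0.02,            # assumed: more urban land, higher odds
}
intercept = -1.0

shallow_well = {"well_depth_m": 10.0, "pct_agricultural": 40.0, "pct_urban": 10.0}
deep_well = {"well_depth_m": 100.0, "pct_agricultural": 40.0, "pct_urban": 10.0}

p_shallow = elevated_nitrate_probability(intercept, coefficients, shallow_well)
p_deep = elevated_nitrate_probability(intercept, coefficients, deep_well)
print(f"shallow well: {p_shallow:.2f}, deep well: {p_deep:.2f}")
```

    Because the coefficients enter on the log-odds scale, each one describes a multiplicative change in the odds of an elevated concentration, which is why coefficient values estimated from existing water quality data can replace qualitative weighting factors.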

  15. Evaluation of logistic regression models and effect of covariates for case-control study in RNA-Seq analysis.

    PubMed

    Choi, Seung Hoan; Labadorf, Adam T; Myers, Richard H; Lunetta, Kathryn L; Dupuis, Josée; DeStefano, Anita L

    2017-02-06

    Next generation sequencing provides a count of RNA molecules in the form of short reads, yielding discrete, often highly non-normally distributed gene expression measurements. Although Negative Binomial (NB) regression has been generally accepted in the analysis of RNA sequencing (RNA-Seq) data, its appropriateness has not been exhaustively evaluated. We explore logistic regression as an alternative method for RNA-Seq studies designed to compare cases and controls, where disease status is modeled as a function of RNA-Seq reads using simulated and Huntington disease data. We evaluate the effect of adjusting for covariates that have an unknown relationship with gene expression. Finally, we incorporate the data adaptive method in order to compare false positive rates. When the sample size is small or the expression levels of a gene are highly dispersed, the NB regression shows inflated Type-I error rates but the Classical logistic and Bayes logistic (BL) regressions are conservative. Firth's logistic (FL) regression performs well or is slightly conservative. Large sample size and low dispersion generally make Type-I error rates of all methods close to nominal alpha levels of 0.05 and 0.01. However, Type-I error rates are controlled after applying the data adaptive method. The NB, BL, and FL regressions gain increased power with large sample size, large log2 fold-change, and low dispersion. The FL regression has comparable power to NB regression. We conclude that implementing the data adaptive method appropriately controls Type-I error rates in RNA-Seq analysis. Firth's logistic regression provides a concise statistical inference process and reduces spurious associations from inaccurately estimated dispersion parameters in the negative binomial framework.

  16. Differentially private distributed logistic regression using private and public data

    PubMed Central

    2014-01-01

    Background Privacy protection is an important issue in medical informatics and differential privacy is a state-of-the-art framework for data privacy research. Differential privacy offers provable privacy against attackers who have auxiliary information, and can be applied to data mining models (for example, logistic regression). However, differentially private methods sometimes introduce too much noise and make outputs less useful. Given available public data in medical research (e.g. from patients who sign open-consent agreements), we can design algorithms that use both public and private data sets to decrease the amount of noise that is introduced. Methodology In this paper, we modify the update step in Newton-Raphson method to propose a differentially private distributed logistic regression model based on both public and private data. Experiments and results We try our algorithm on three different data sets, and show its advantage over: (1) a logistic regression model based solely on public data, and (2) a differentially private distributed logistic regression model based on private data under various scenarios. Conclusion Logistic regression models built with our new algorithm based on both private and public datasets demonstrate better utility than models trained on private or public datasets alone without sacrificing the rigorous privacy guarantee. PMID:25079786
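
    For orientation, the sketch below shows the ordinary Newton-Raphson update for logistic regression, which is the step the abstract says the authors modify. It is a plain, non-private, single-site version for one predictor plus an intercept; it adds no noise and does not reproduce the paper's algorithm. The function names and the toy data are hypothetical.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def newton_logistic(xs, ys, iterations=25):
    """Fit intercept b0 and slope b1 by Newton-Raphson (no privacy noise)."""
    b0, b1 = 0.0, 0.0
    for _ in range(iterations):
        # Accumulate the score (gradient of the log-likelihood) and the
        # entries of the information matrix X^T W X, with W = diag(p(1-p)).
        g0 = g1 = h00 = h01 = h11 = 0.0
        for x, y in zip(xs, ys):
            p = sigmoid(b0 + b1 * x)
            w = p * (1.0 - p)
            g0 += y - p
            g1 += (y - p) * x
            h00 += w
            h01 += w * x
            h11 += w * x * x
        # Newton step: beta <- beta + (X^T W X)^{-1} * score,
        # using the closed-form inverse of the 2x2 matrix.
        det = h00 * h11 - h01 * h01
        b0 += (h11 * g0 - h01 * g1) / det
        b1 += (h00 * g1 - h01 * g0) / det
    return b0, b1


# Toy, non-separable data (hypothetical) so that the estimate is finite.
xs = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0]
ys = [0, 0, 1, 0, 1, 0, 1, 1]
b0, b1 = newton_logistic(xs, ys)
print(f"intercept = {b0:.3f}, slope = {b1:.3f}")
```

    The score and information matrix are sums over records, so in a distributed setting each site could in principle compute its own partial sums. How noise is added to those quantities, and how public data reduce it, is specific to the paper and is not shown here.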

  17. A retrospective analysis to identify the factors affecting infection in patients undergoing chemotherapy.

    PubMed

    Park, Ji Hyun; Kim, Hyeon-Young; Lee, Hanna; Yun, Eun Kyoung

    2015-12-01

    This study compares the performance of the logistic regression and decision tree analysis methods for assessing the risk factors for infection in cancer patients undergoing chemotherapy. The subjects were 732 cancer patients who were receiving chemotherapy at K university hospital in Seoul, Korea. The data were collected between March 2011 and February 2013 and were processed for descriptive analysis, logistic regression and decision tree analysis using the IBM SPSS Statistics 19 and Modeler 15.1 programs. The most common risk factors for infection in cancer patients receiving chemotherapy were identified as alkylating agents, vinca alkaloid and underlying diabetes mellitus. The logistic regression explained 66.7% of the variation in the data in terms of sensitivity and 88.9% in terms of specificity. The decision tree analysis accounted for 55.0% of the variation in the data in terms of sensitivity and 89.0% in terms of specificity. As for the overall classification accuracy, the logistic regression explained 88.0% and the decision tree analysis explained 87.2%. The logistic regression analysis showed a higher degree of sensitivity and classification accuracy. Therefore, logistic regression analysis is concluded to be the more effective and useful method for establishing an infection prediction model for patients undergoing chemotherapy. Copyright © 2015 Elsevier Ltd. All rights reserved.
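
    The sensitivity, specificity, and overall classification accuracy compared in the abstract above are all derived from a classifier's confusion matrix. The sketch below shows the standard definitions. The counts are hypothetical and are not taken from the study, which does not report its confusion matrices in the abstract.

```python
def classification_metrics(tp, fn, tn, fp):
    """Return (sensitivity, specificity, accuracy) from confusion-matrix counts."""
    sensitivity = tp / (tp + fn)   # infected patients correctly flagged
    specificity = tn / (tn + fp)   # non-infected patients correctly cleared
    accuracy = (tp + tn) / (tp + fn + tn + fp)
    return sensitivity, specificity, accuracy


# Hypothetical counts, for illustration only.
sens, spec, acc = classification_metrics(tp=30, fn=20, tn=170, fp=30)
print(f"sensitivity = {sens:.3f}, specificity = {spec:.3f}, accuracy = {acc:.3f}")
# prints: sensitivity = 0.600, specificity = 0.850, accuracy = 0.800
```

    When the outcome is uncommon, accuracy is dominated by specificity, so two models can have similar accuracy while differing noticeably in sensitivity, as the figures quoted in the abstract suggest.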

  18. Performance and strategy comparisons of human listeners and logistic regression in discriminating underwater targets.

    PubMed

    Yang, Lixue; Chen, Kean

    2015-11-01

    To improve the design of underwater target recognition systems based on auditory perception, this study compared human listeners with automatic classifiers. Performance measures and strategies in three discrimination experiments, including discriminations between man-made and natural targets, between ships and submarines, and among three types of ships, were used. In the experiments, the subjects were asked to assign a score to each sound based on how confident they were about the category to which it belonged, and logistic regression, which represents linear discriminative models, also completed three similar tasks by utilizing many auditory features. The results indicated that the performances of logistic regression improved as the ratio between inter- and intra-class differences became larger, whereas the performances of the human subjects were limited by their unfamiliarity with the targets. Logistic regression performed better than the human subjects in all tasks but the discrimination between man-made and natural targets, and the strategies employed by excellent human subjects were similar to those of logistic regression. Logistic regression and several human subjects demonstrated similar performances when discriminating man-made and natural targets, but in this case, their strategies were not similar. An appropriate fusion of their strategies led to further improvement in recognition accuracy.

  19. Simulating land-use changes by incorporating spatial autocorrelation and self-organization in CLUE-S modeling: a case study in Zengcheng District, Guangzhou, China

    NASA Astrophysics Data System (ADS)

    Mei, Zhixiong; Wu, Hao; Li, Shiyun

    2018-06-01

    The Conversion of Land Use and its Effects at Small regional extent (CLUE-S), which is a widely used model for land-use simulation, utilizes logistic regression to estimate the relationships between land use and its drivers, and thus, predict land-use change probabilities. However, logistic regression disregards possible spatial autocorrelation and self-organization in land-use data. Autologistic regression can depict spatial autocorrelation but cannot address self-organization, while logistic regression by considering only self-organization (NE-logistic regression) fails to capture spatial autocorrelation. Therefore, this study developed a regression (NE-autologistic regression) method, which incorporated both spatial autocorrelation and self-organization, to improve CLUE-S. The Zengcheng District of Guangzhou, China was selected as the study area. The land-use data of 2001, 2005, and 2009, as well as 10 typical driving factors, were used to validate the proposed regression method and the improved CLUE-S model. Then, three future land-use scenarios in 2020: the natural growth scenario, ecological protection scenario, and economic development scenario, were simulated using the improved model. Validation results showed that NE-autologistic regression performed better than logistic regression, autologistic regression, and NE-logistic regression in predicting land-use change probabilities. The spatial allocation accuracy and kappa values of NE-autologistic-CLUE-S were higher than those of logistic-CLUE-S, autologistic-CLUE-S, and NE-logistic-CLUE-S for the simulations of two periods, 2001-2009 and 2005-2009, which proved that the improved CLUE-S model achieved the best simulation and was thereby effective to a certain extent. The scenario simulation results indicated that under all three scenarios, traffic land and residential/industrial land would increase, whereas arable land and unused land would decrease during 2009-2020. 
Apparent differences also existed in the simulated change sizes and locations of each land-use type under different scenarios. The results not only demonstrate the validity of the improved model but also provide a valuable reference for relevant policy-makers.

  20. Religiosity and decreased risk of substance use disorders: is the effect mediated by social support or mental health status?

    PubMed Central

    Harris, Katherine M.; Koenig, Harold G.; Han, Xiaotong; Sullivan, Greer; Mattox, Rhonda; Tang, Lingqi

    2009-01-01

    Objective The negative association between religiosity (religious beliefs and church attendance) and the likelihood of substance use disorders is well established, but the mechanism(s) remain poorly understood. We investigated whether this association was mediated by social support or mental health status. Method We utilized cross-sectional data from the 2002 National Survey on Drug Use and Health (n = 36,370). We first used logistic regression to regress any alcohol use in the past year on sociodemographic and religiosity variables. Then, among individuals who drank in the past year, we regressed past year alcohol abuse/dependence on sociodemographic and religiosity variables. To investigate whether social support mediated the association between religiosity and alcohol use and alcohol abuse/dependence we repeated the above models, adding the social support variables. To the extent that these added predictors modified the magnitude of the effect of the religiosity variables, we interpreted social support as a possible mediator. We also formally tested for mediation using path analysis. We investigated the possible mediating role of mental health status analogously. Parallel sets of analyses were conducted for any drug use, and drug abuse/dependence among those using any drugs as the dependent variables. Results The addition of social support and mental health status variables to logistic regression models had little effect on the magnitude of the religiosity coefficients in any of the models. While some of the tests of mediation were significant in the path analyses, the results were not always in the expected direction, and the magnitude of the effects was small. Conclusions The association between religiosity and decreased likelihood of a substance use disorder does not appear to be substantively mediated by either social support or mental health status. PMID:19714282

  1. Mindfulness, Physical Activity and Avoidance of Secondhand Smoke: A Study of College Students in Shanghai.

    PubMed

    Gao, Yu; Shi, Lu

    2015-08-21

    To better understand the documented link between mindfulness and longevity, we examine the association between mindfulness and conscious avoidance of secondhand smoke (SHS), as well as the association between mindfulness and physical activity. In Shanghai University of Finance and Economics (SUFE) we surveyed a convenience sample of 1516 college freshmen. We measured mindfulness, weekly physical activity, and conscious avoidance of secondhand smoke, along with demographic and behavioral covariates. We used a multilevel logistic regression to test the association between mindfulness and conscious avoidance of secondhand smoke, and used a Tobit regression model to test the association between mindfulness and metabolic equivalent hours per week. In both models the home province of the student respondent was used as the cluster variable, and we adjusted for demographic and behavioral covariates such as age, gender, smoking history, household registration status (urban vs. rural), the perceived smog frequency in their home towns, and asthma diagnosis. The logistic regression of consciously avoiding SHS shows that a higher level of mindfulness was associated with an increase in the odds of conscious SHS avoidance (logged odds: 0.22, standard error: 0.07, p < 0.01). The Tobit regression shows that a higher level of mindfulness was associated with more metabolic equivalent hours per week (Tobit coefficient: 4.09, standard error: 1.13, p < 0.001). This study is an innovative attempt to study the behavioral issue of secondhand smoke from the perspective of the potential victim, rather than the active smoker. The observed associational patterns here are consistent with previous findings that mindfulness is associated with healthier behaviors in obesity prevention and substance use. Research designs with interventions are needed to test the causal link between mindfulness and these healthy behaviors.
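
    The abstract above reports the mindfulness effect on the log-odds scale (0.22, standard error 0.07). A coefficient on that scale converts to an odds ratio by exponentiation. The sketch below does the conversion and adds an approximate 95% Wald interval. The normal-approximation interval and the function name are assumptions made for illustration; the abstract reports only the coefficient, its standard error, and the p-value.

```python
import math


def odds_ratio_with_ci(log_odds, std_err, z=1.96):
    """Return (odds ratio, lower, upper) using a Wald interval on the log-odds scale."""
    lower = log_odds - z * std_err
    upper = log_odds + z * std_err
    return math.exp(log_odds), math.exp(lower), math.exp(upper)


odds_ratio, lower, upper = odds_ratio_with_ci(0.22, 0.07)
print(f"OR = {odds_ratio:.2f}, 95% CI = ({lower:.2f}, {upper:.2f})")
# prints: OR = 1.25, 95% CI = (1.09, 1.43)
```

    Under these assumptions, a one-unit increase in the mindfulness measure corresponds to roughly 25% higher odds of consciously avoiding secondhand smoke, and the interval excludes 1, in line with the reported p < 0.01.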

  2. Mindfulness, Physical Activity and Avoidance of Secondhand Smoke: A Study of College Students in Shanghai

    PubMed Central

    Gao, Yu; Shi, Lu

    2015-01-01

    Introduction: To better understand the documented link between mindfulness and longevity, we examine the association between mindfulness and conscious avoidance of secondhand smoke (SHS), as well as the association between mindfulness and physical activity. Method: In Shanghai University of Finance and Economics (SUFE) we surveyed a convenience sample of 1516 college freshmen. We measured mindfulness, weekly physical activity, and conscious avoidance of secondhand smoke, along with demographic and behavioral covariates. We used a multilevel logistic regression to test the association between mindfulness and conscious avoidance of secondhand smoke, and used a Tobit regression model to test the association between mindfulness and metabolic equivalent hours per week. In both models the home province of the student respondent was used as the cluster variable, and we adjusted for demographic and behavioral covariates such as age, gender, smoking history, household registration status (urban vs. rural), the perceived smog frequency in their home towns, and asthma diagnosis. Results: The logistic regression of consciously avoiding SHS shows that a higher level of mindfulness was associated with an increase in the odds of conscious SHS avoidance (logged odds: 0.22, standard error: 0.07, p < 0.01). The Tobit regression shows that a higher level of mindfulness was associated with more metabolic equivalent hours per week (Tobit coefficient: 4.09, standard error: 1.13, p < 0.001). Discussion: This study is an innovative attempt to study the behavioral issue of secondhand smoke from the perspective of the potential victim, rather than the active smoker. The observed associational patterns here are consistent with previous findings that mindfulness is associated with healthier behaviors in obesity prevention and substance use. Research designs with interventions are needed to test the causal link between mindfulness and these healthy behaviors. PMID:26308029

  3. Serum Liver Fibrosis Markers in the Prognosis of Liver Cirrhosis: A Prospective Observational Study

    PubMed Central

    Qi, Xingshun; Liu, Xu; Zhang, Yongguo; Hou, Yue; Ren, Linan; Wu, Chunyan; Chen, Jiang; Xia, Chunlian; Zhao, Jiajun; Wang, Di; Zhang, Yanlin; Zhang, Xia; Lin, Hao; Wang, Hezhi; Wang, Jinling; Cui, Zhongmin; Li, Xueyan; Deng, Han; Hou, Feifei; Peng, Ying; Wang, Xueying; Shao, Xiaodong; Li, Hongyu; Guo, Xiaozhong

    2016-01-01

    Background The prognostic role of serum liver fibrosis markers in cirrhotic patients remains unclear. We performed a prospective observational study to evaluate the effect of amino-terminal pro-peptide of type III pro-collagen (PIIINP), collagen IV (CIV), laminin (LN), and hyaluronic acid (HA) on the prognosis of liver cirrhosis. Material/Methods All patients who were diagnosed with liver cirrhosis and admitted to our department were prospectively enrolled. PIIINP, CIV, LN, and HA levels were tested. Results Overall, 108 cirrhotic patients were included. Correlation analysis demonstrated that CIV (coefficient r: 0.658, p<0.001; coefficient r: 0.368, p<0.001), LN (coefficient r: 0.450, p<0.001; coefficient r: 0.343, p<0.001), and HA (coefficient r: 0.325, p=0.001; coefficient r: 0.282, p=0.004) levels, but not PIIINP level (coefficient r: 0.081, p=0.414; coefficient r: 0.090, p=0.363), significantly correlated with Child-Pugh and MELD scores. Logistic regression analysis demonstrated that HA (odds ratio=1.00003, 95% confidence interval [CI]=1.000004–1.000056, p=0.022) was significantly associated with the 6-month mortality. Receiver operating characteristics analysis demonstrated that the area under the curve (AUC) of HA for predicting the 6-month mortality was 0.612 (95%CI=0.508–0.709, p=0.1531). Conclusions CIV, LN, and HA levels were significantly associated with the severity of liver dysfunction, but might be inappropriate for the prognostic assessment of liver cirrhosis. PMID:27480906

  4. Unitary Response Regression Models

    ERIC Educational Resources Information Center

    Lipovetsky, S.

    2007-01-01

    The dependent variable in a regular linear regression is a numerical variable, and in a logistic regression it is a binary or categorical variable. In these models the dependent variable has varying values. However, there are problems yielding an identity output of a constant value which can also be modelled in a linear or logistic regression with…

  5. Binary logistic regression-Instrument for assessing museum indoor air impact on exhibits.

    PubMed

    Bucur, Elena; Danet, Andrei Florin; Lehr, Carol Blaziu; Lehr, Elena; Nita-Lazar, Mihai

    2017-04-01

    This paper presents a new way to assess the environmental impact on historical artifacts using binary logistic regression. The prediction of the impact on the exhibits during certain pollution scenarios (environmental impact) was calculated by a mathematical model based on the binary logistic regression; it allows the identification of those environmental parameters from a multitude of possible parameters with a significant impact on exhibitions and ranks them according to the severity of their effect. Air quality (NO2, SO2, O3 and PM2.5) and microclimate parameters (temperature, humidity) monitoring data from a case study conducted within exhibition and storage spaces of the Romanian National Aviation Museum Bucharest have been used for developing and validating the binary logistic regression method and the mathematical model. The logistic regression analysis was used on 794 data combinations (715 to develop the model and 79 to validate it) using the Statistical Package for the Social Sciences (SPSS 20.0). The results from the binary logistic regression analysis demonstrated that of the six parameters taken into consideration, four have a significant effect upon exhibits in the following order: O3 > PM2.5 > NO2 > humidity, followed at a significant distance by the effects of SO2 and temperature. The mathematical model developed in this study correctly predicted 95.1% of the cumulated effect of the environmental parameters upon the exhibits. Moreover, this model could also be used in the decisional process regarding the preventive preservation measures that should be implemented within the exhibition space. The paper presents a new way to assess the environmental impact on historical artifacts using binary logistic regression. 
The mathematical model developed on the environmental parameters analyzed by the binary logistic regression method could be useful in a decision-making process establishing the best measures for pollution reduction and preventive preservation of exhibits.

  6. Determining factors influencing survival of breast cancer by fuzzy logistic regression model.

    PubMed

    Nikbakht, Roya; Bahrampour, Abbas

    2017-01-01

    A fuzzy logistic regression model can be used to determine the influential factors of a disease. This study explores the important predictive factors for survival of breast cancer patients. We used breast cancer data collected by the cancer registry of Kerman University of Medical Sciences during the period 2000-2007. Variables such as morphology, grade, age, and treatments (surgery, radiotherapy, and chemotherapy) were entered into the fuzzy logistic regression model. Model performance was assessed in terms of the mean degree of membership (MDM). The results showed that almost 41% of patients were in the neoplasm and malignant group and more than two-thirds of them were still alive after 5-year follow-up. Based on the fuzzy logistic model, the most important factors influencing survival were chemotherapy, morphology, and radiotherapy, in that order. Furthermore, the MDM criterion shows that the fuzzy logistic regression model fits the data well (MDM = 0.86). The model showed that chemotherapy is more important than radiotherapy in the survival of patients with breast cancer. In addition, the model can calculate possibilistic odds of survival in cancer patients. The results of this study can be applied in clinical research. Few studies have applied fuzzy logistic models, and we recommend using this model in various research areas.

  7. Cross-sectional study of variables associated with length of stay and ICU need in open Roux-En-Y gastric bypass surgery for morbid obese patients: an exploratory analysis based on the Public Health System administrative database (Datasus) in Brazil.

    PubMed

    Asano, Elio Fernando; Rasera, Irineu; Shiraga, Elisabete Cristina

    2012-12-01

    This is an exploratory analysis of potential variables associated with open Roux-en-Y gastric bypass (RYGB) surgery hospitalization resource use pattern. Cross-sectional study based on an administrative database (DATASUS) records. Inclusion criteria were adult patients undergoing RYGB between Jan/2008 and Jun/2011. Dependent variables were length of stay (LoS) and ICU need. Independent variables were: gender, age, region, hospital volume, surgery at certified center of excellence (CoE) by the Surgical Review Corporation (SRC), teaching hospital, and year of hospitalization. Univariate and multivariate analysis (logistic regression for ICU need and linear regression for length of stay) were performed. Data from 13,069 surgeries were analyzed. In crude analysis, hospital volume was the most impactful variable associated with log-transformed LoS (1.312 ± 0.302 high volume vs. 1.670 ± 0.581 low volume, p < 0.001), whereas for ICU need it was certified CoE (odds ratio (OR), 0.016; 95% confidence interval (CI), 0.010-0.026). After adjustment by logistic regression, certified CoE remained as the strongest predictor of ICU need (OR, 0.011; 95% CI, 0.007-0.018), followed by hospital volume (OR, 3.096; 95% CI, 2.861-3.350). Age group, male gender, and teaching hospital were also significantly associated (p < 0.001). For log-transformed LoS, final model includes hospital volume (coefficient, -0.223; 95% CI, -0.250 to -0.196) and teaching hospital (coefficient, 0.375; 95% CI, 0.351-0.398). Region of Brazil was not associated with any of the outcomes. High-volume hospital was the strongest predictor for shorter LoS, whereas SRC certification was the strongest predictor of lower ICU need. Public health policies targeting an increase of efficiency and patient access to the procedure should take into account these results.
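
    The odds ratios quoted in this abstract can be mapped back to the coefficient (log-odds) scale. The following is a minimal sketch, not the authors' procedure: it assumes the reported 95% intervals are Wald intervals, which are symmetric on the log scale, and the function name `odds_ratio_to_logit` is hypothetical.

```python
import math


def odds_ratio_to_logit(or_value, ci_low, ci_high, z=1.96):
    """Convert an odds ratio and its CI to the log-odds (coefficient) scale.

    Assumes a Wald interval: CI = exp(beta +/- z * SE), so the interval
    width on the log scale equals 2 * z * SE.
    """
    beta = math.log(or_value)
    beta_low = math.log(ci_low)
    beta_high = math.log(ci_high)
    se = (beta_high - beta_low) / (2 * z)
    return beta, beta_low, beta_high, se


# Adjusted odds ratios for ICU need, as quoted in the abstract.
coe = odds_ratio_to_logit(0.011, 0.007, 0.018)     # certified CoE
volume = odds_ratio_to_logit(3.096, 2.861, 3.350)  # hospital volume

print("CoE coefficient:    %.3f (approx. SE %.3f)" % (coe[0], coe[3]))
print("Volume coefficient: %.3f (approx. SE %.3f)" % (volume[0], volume[3]))
```

    An odds ratio below 1 corresponds to a negative coefficient (lower odds of ICU need) and an odds ratio above 1 to a positive one; the standard errors are only as good as the Wald assumption and the rounding of the published figures.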

  8. Using Logistic Regression To Predict the Probability of Debris Flows Occurring in Areas Recently Burned By Wildland Fires

    USGS Publications Warehouse

    Rupert, Michael G.; Cannon, Susan H.; Gartner, Joseph E.

    2003-01-01

    Logistic regression was used to predict the probability of debris flows occurring in areas recently burned by wildland fires. Multiple logistic regression is conceptually similar to multiple linear regression because statistical relations between one dependent variable and several independent variables are evaluated. In logistic regression, however, the dependent variable is transformed to a binary variable (debris flow did or did not occur), and the actual probability of the debris flow occurring is statistically modeled. Data from 399 basins located within 15 wildland fires that burned during 2000-2002 in Colorado, Idaho, Montana, and New Mexico were evaluated. More than 35 independent variables describing the burn severity, geology, land surface gradient, rainfall, and soil properties were evaluated. The models were developed as follows: (1) Basins that did and did not produce debris flows were delineated from National Elevation Data using a Geographic Information System (GIS). (2) Data describing the burn severity, geology, land surface gradient, rainfall, and soil properties were determined for each basin. These data were then downloaded to a statistics software package for analysis using logistic regression. (3) Relations between the occurrence/non-occurrence of debris flows and burn severity, geology, land surface gradient, rainfall, and soil properties were evaluated and several preliminary multivariate logistic regression models were constructed. All possible combinations of independent variables were evaluated to determine which combination produced the most effective model. The multivariate model that best predicted the occurrence of debris flows was selected. (4) The multivariate logistic regression model was entered into a GIS, and a map showing the probability of debris flows was constructed. 
The most effective model incorporates the percentage of each basin with slope greater than 30 percent, percentage of land burned at medium and high burn severity in each basin, particle size sorting, average storm intensity (millimeters per hour), soil organic matter content, soil permeability, and soil drainage. The results of this study demonstrate that logistic regression is a valuable tool for predicting the probability of debris flows occurring in recently-burned landscapes.

  9. Prediction of unwanted pregnancies using logistic regression, probit regression and discriminant analysis

    PubMed Central

    Ebrahimzadeh, Farzad; Hajizadeh, Ebrahim; Vahabi, Nasim; Almasian, Mohammad; Bakhteyar, Katayoon

    2015-01-01

    Background: Unwanted pregnancy not intended by at least one of the parents has undesirable consequences for the family and the society. In the present study, three classification models were used and compared to predict unwanted pregnancies in an urban population. Methods: In this cross-sectional study, 887 pregnant mothers referring to health centers in Khorramabad, Iran, in 2012 were selected by the stratified and cluster sampling; relevant variables were measured and for prediction of unwanted pregnancy, logistic regression, discriminant analysis, and probit regression models and SPSS software version 21 were used. To compare these models, indicators such as sensitivity, specificity, the area under the ROC curve, and the percentage of correct predictions were used. Results: The prevalence of unwanted pregnancies was 25.3%. The logistic and probit regression models indicated that parity and pregnancy spacing, contraceptive methods, household income and number of living male children were related to unwanted pregnancy. The performance of the models based on the area under the ROC curve was 0.735, 0.733, and 0.680 for logistic regression, probit regression, and linear discriminant analysis, respectively. Conclusion: Given the relatively high prevalence of unwanted pregnancies in Khorramabad, it seems necessary to revise family planning programs. Despite the similar accuracy of the models, if the researcher is interested in the interpretability of the results, the use of the logistic regression model is recommended. PMID:26793655
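
    The comparison indicators named in this abstract (sensitivity, specificity, percentage of correct predictions, and area under the ROC curve) can be computed from predicted probabilities as in the sketch below. The labels and scores are invented toy values, not data from the study, and the 0.5 classification threshold is an assumption.

```python
def confusion_rates(y_true, y_score, threshold=0.5):
    """Return (sensitivity, specificity, accuracy) at a given threshold."""
    tp = fn = tn = fp = 0
    for y, s in zip(y_true, y_score):
        pred = 1 if s >= threshold else 0
        if y == 1 and pred == 1:
            tp += 1
        elif y == 1 and pred == 0:
            fn += 1
        elif y == 0 and pred == 0:
            tn += 1
        else:
            fp += 1
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    accuracy = (tp + tn) / len(y_true)
    return sensitivity, specificity, accuracy


def roc_auc(y_true, y_score):
    """AUC as the probability that a random positive outscores a random
    negative (ties count one half); equivalent to the Mann-Whitney U form."""
    pos = [s for y, s in zip(y_true, y_score) if y == 1]
    neg = [s for y, s in zip(y_true, y_score) if y == 0]
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Toy example: 1 = event, 0 = no event; scores are predicted probabilities.
y_true = [1, 1, 1, 0, 0, 0, 0, 1]
y_score = [0.9, 0.8, 0.4, 0.3, 0.6, 0.2, 0.1, 0.7]

sens, spec, acc = confusion_rates(y_true, y_score)
auc = roc_auc(y_true, y_score)
print(sens, spec, acc, auc)
```

    Unlike sensitivity and specificity, the AUC does not depend on the chosen threshold, which is one reason it is a common basis for comparing classifiers such as those in this study.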

  10. Prediction of unwanted pregnancies using logistic regression, probit regression and discriminant analysis.

    PubMed

    Ebrahimzadeh, Farzad; Hajizadeh, Ebrahim; Vahabi, Nasim; Almasian, Mohammad; Bakhteyar, Katayoon

    2015-01-01

    Unwanted pregnancy not intended by at least one of the parents has undesirable consequences for the family and the society. In the present study, three classification models were used and compared to predict unwanted pregnancies in an urban population. In this cross-sectional study, 887 pregnant mothers referring to health centers in Khorramabad, Iran, in 2012 were selected by the stratified and cluster sampling; relevant variables were measured and for prediction of unwanted pregnancy, logistic regression, discriminant analysis, and probit regression models and SPSS software version 21 were used. To compare these models, indicators such as sensitivity, specificity, the area under the ROC curve, and the percentage of correct predictions were used. The prevalence of unwanted pregnancies was 25.3%. The logistic and probit regression models indicated that parity and pregnancy spacing, contraceptive methods, household income and number of living male children were related to unwanted pregnancy. The performance of the models based on the area under the ROC curve was 0.735, 0.733, and 0.680 for logistic regression, probit regression, and linear discriminant analysis, respectively. Given the relatively high prevalence of unwanted pregnancies in Khorramabad, it seems necessary to revise family planning programs. Despite the similar accuracy of the models, if the researcher is interested in the interpretability of the results, the use of the logistic regression model is recommended.

  11. Predictors of course in obsessive-compulsive disorder: logistic regression versus Cox regression for recurrent events.

    PubMed

    Kempe, P T; van Oppen, P; de Haan, E; Twisk, J W R; Sluis, A; Smit, J H; van Dyck, R; van Balkom, A J L M

    2007-09-01

    Two methods for predicting remissions in obsessive-compulsive disorder (OCD) treatment are evaluated. Y-BOCS measurements of 88 patients with a primary OCD (DSM-III-R) diagnosis were performed over a 16-week treatment period, and during three follow-ups. Remission at any measurement was defined as a Y-BOCS score lower than thirteen combined with a reduction of seven points when compared with baseline. Logistic regression models were compared with a Cox regression for recurrent events model. Logistic regression yielded different models at different evaluation times. The recurrent events model remained stable when fewer measurements were used. Higher baseline levels of neuroticism and more severe OCD symptoms were associated with a lower chance of remission, early age of onset and more depressive symptoms with a higher chance. Choice of outcome time affects logistic regression prediction models. Recurrent events analysis uses all information on remissions and relapses. Short- and long-term predictors for OCD remission show overlap.
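
    The remission rule stated in this abstract can be written as a small predicate. This is a sketch under one stated assumption: "a reduction of seven points" is read as a drop of at least seven points from baseline. The function name and the example scores are hypothetical.

```python
def is_remission(baseline, current, cutoff=13, min_drop=7):
    """Remission: Y-BOCS score below the cutoff AND a drop from baseline
    of at least `min_drop` points (the 'at least' reading is an assumption)."""
    return current < cutoff and (baseline - current) >= min_drop


# Hypothetical patient: baseline score followed by four later measurements.
baseline = 26
follow_ups = [20, 12, 15, 11]
flags = [is_remission(baseline, score) for score in follow_ups]
print(flags)  # remission status at each measurement
```

    Because the flag can switch on and off across measurements, a single-time-point logistic model sees only one of these values, whereas a recurrent-events analysis can use the whole sequence of remissions and relapses.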

  12. SU-F-R-22: Malignancy Classification for Small Pulmonary Nodules with Radiomics and Logistic Regression

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Huang, W; Tu, S

    Purpose: We conducted a retrospective study of Radiomics research for classifying malignancy of small pulmonary nodules. A machine learning algorithm of logistic regression and open research platform of Radiomics, IBEX (Imaging Biomarker Explorer), were used to evaluate the classification accuracy. Methods: The training set included 100 CT image series from cancer patients with small pulmonary nodules where the average diameter is 1.10 cm. These patients registered at Chang Gung Memorial Hospital and received a CT-guided operation of lung cancer lobectomy. The specimens were classified by experienced pathologists with a B (benign) or M (malignant). CT images with slice thickness of 0.625 mm were acquired from a GE BrightSpeed 16 scanner. The study was formally approved by our institutional internal review board. Nodules were delineated and 374 feature parameters were extracted from IBEX. We first used the t-test and p-value criteria to study which feature can differentiate between group B and M. Then we implemented a logistic regression algorithm to perform nodule malignancy classification. 10-fold cross-validation and the receiver operating characteristic curve (ROC) were used to evaluate the classification accuracy. Finally hierarchical clustering analysis, Spearman rank correlation coefficient, and clustering heat map were used to further study correlation characteristics among different features. Results: 238 features were found differentiable between group B and M based on whether their statistical p-values were less than 0.05. A forward search algorithm was used to select an optimal combination of features for the best classification and 9 features were identified. Our study found the best accuracy of classifying malignancy was 0.79±0.01 with the 10-fold cross-validation. The area under the ROC curve was 0.81±0.02. Conclusion: Benign nodules may be treated as a malignant tumor in low-dose CT and patients may undergo unnecessary surgeries or treatments. Our study may help radiologists to differentiate nodule malignancy for low-dose CT.

  13. The impact of the 2008 financial crisis on food security and food expenditures in Mexico: a disproportionate effect on the vulnerable

    PubMed Central

    Vilar-Compte, Mireya; Sandoval-Olascoaga, Sebastian; Bernal-Stuart, Ana; Shimoga, Sandhya; Vargas-Bustamante, Arturo

    2015-01-01

    Objective The present paper investigated the impact of the 2008 financial crisis on food security in Mexico and how it disproportionally affected vulnerable households. Design A generalized ordered logistic regression was estimated to assess the impact of the crisis on households’ food security status. An ordinary least squares and a quantile regression were estimated to evaluate the effect of the financial crisis on a continuous proxy measure of food security defined as the share of a household’s current income devoted to food expenditures. Setting Both analyses were performed using pooled cross-sectional data from the Mexican National Household Income and Expenditure Survey 2008 and 2010. Subjects The analytical sample included 29 468 households in 2008 and 27 654 in 2010. Results The generalized ordered logistic model showed that the financial crisis significantly (P < 0·05) decreased the probability of being food secure, mildly or moderately food insecure, compared with being severely food insecure (OR = 0·74). A similar but smaller effect was found when comparing severely and moderately food-insecure households with mildly food-insecure and food-secure households (OR = 0·81). The ordinary least squares model showed that the crisis significantly (P < 0·05) increased the share of total income spent on food (β coefficient of 0·02). The quantile regression confirmed the findings suggested by the generalized ordered logistic model, showing that the effects of the crisis were more profound among poorer households. Conclusions The results suggest that households that were more vulnerable before the financial crisis saw a worsened effect in terms of food insecurity with the crisis. Findings were consistent with both measures of food security – one based on self-reported experience and the other based on food spending. PMID:25428800

  14. Estimating the exceedance probability of rain rate by logistic regression

    NASA Technical Reports Server (NTRS)

    Chiu, Long S.; Kedem, Benjamin

    1990-01-01

    Recent studies have shown that the fraction of an area with rain intensity above a fixed threshold is highly correlated with the area-averaged rain rate. To estimate the fractional rainy area, a logistic regression model, which estimates the conditional probability that rain rate over an area exceeds a fixed threshold given the values of related covariates, is developed. The problem of dependency in the data in the estimation procedure is bypassed by the method of partial likelihood. Analyses of simulated scanning multichannel microwave radiometer and observed electrically scanning microwave radiometer data during the Global Atlantic Tropical Experiment period show that the use of logistic regression in pixel classification is superior to multiple regression in predicting whether rain rate at each pixel exceeds a given threshold, even in the presence of noisy data. The potential of the logistic regression technique in satellite rain rate estimation is discussed.

  15. Comparison of naïve Bayes and logistic regression for computer-aided diagnosis of breast masses using ultrasound imaging

    NASA Astrophysics Data System (ADS)

    Cary, Theodore W.; Cwanger, Alyssa; Venkatesh, Santosh S.; Conant, Emily F.; Sehgal, Chandra M.

    2012-03-01

    This study compares the performance of two proven but very different machine learners, Naïve Bayes and logistic regression, for differentiating malignant and benign breast masses using ultrasound imaging. Ultrasound images of 266 masses were analyzed quantitatively for shape, echogenicity, margin characteristics, and texture features. These features along with patient age, race, and mammographic BI-RADS category were used to train Naïve Bayes and logistic regression classifiers to diagnose lesions as malignant or benign. ROC analysis was performed using all of the features and using only a subset that maximized information gain. Performance was determined by the area under the ROC curve, Az, obtained from leave-one-out cross validation. Naïve Bayes showed significant variation (Az 0.733 +/- 0.035 to 0.840 +/- 0.029, P < 0.002) with the choice of features, but the performance of logistic regression was relatively unchanged under feature selection (Az 0.839 +/- 0.029 to 0.859 +/- 0.028, P = 0.605). Out of 34 features, a subset of 6 gave the highest information gain: brightness difference, margin sharpness, depth-to-width, mammographic BI-RADs, age, and race. The probabilities of malignancy determined by Naïve Bayes and logistic regression after feature selection showed significant correlation (R2= 0.87, P < 0.0001). The diagnostic performance of Naïve Bayes and logistic regression can be comparable, but logistic regression is more robust. Since probability of malignancy cannot be measured directly, high correlation between the probabilities derived from two basic but dissimilar models increases confidence in the predictive power of machine learning models for characterizing solid breast masses on ultrasound.

  16. [Logistic regression model of noninvasive prediction for portal hypertensive gastropathy in patients with hepatitis B associated cirrhosis].

    PubMed

    Wang, Qingliang; Li, Xiaojie; Hu, Kunpeng; Zhao, Kun; Yang, Peisheng; Liu, Bo

    2015-05-12

    To explore the risk factors of portal hypertensive gastropathy (PHG) in patients with hepatitis B associated cirrhosis and establish a Logistic regression model of noninvasive prediction. The clinical data of 234 hospitalized patients with hepatitis B associated cirrhosis from March 2012 to March 2014 were analyzed retrospectively. The dependent variable was the occurrence of PHG while the independent variables were screened by binary Logistic analysis. Multivariate Logistic regression was used for further analysis of significant noninvasive independent variables. Logistic regression model was established and odds ratio was calculated for each factor. The accuracy, sensitivity and specificity of model were evaluated by the curve of receiver operating characteristic (ROC). According to univariate Logistic regression, the risk factors included hepatic dysfunction, albumin (ALB), bilirubin (TB), prothrombin time (PT), platelet (PLT), white blood cell (WBC), portal vein diameter, spleen index, splenic vein diameter, diameter ratio, PLT to spleen volume ratio, esophageal varices (EV) and gastric varices (GV). Multivariate analysis showed that hepatic dysfunction (X1), TB (X2), PLT (X3) and splenic vein diameter (X4) were the major occurring factors for PHG. The established regression model was Logit P=-2.667+2.186X1-2.167X2+0.725X3+0.976X4. The accuracy of model for PHG was 79.1% with a sensitivity of 77.2% and a specificity of 80.8%. Hepatic dysfunction, TB, PLT and splenic vein diameter are risk factors for PHG and the noninvasive predicted Logistic regression model was Logit P=-2.667+2.186X1-2.167X2+0.725X3+0.976X4.
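
    The published equation can be turned into a predicted probability with the inverse logit. The sketch below uses only the coefficients quoted in the abstract; how X1 to X4 are coded (units, cut-offs, 0/1 indicators) is not stated there, so the inputs shown are purely hypothetical and the function names are illustrative.

```python
import math

# Coefficients as quoted: Logit P = -2.667 + 2.186*X1 - 2.167*X2 + 0.725*X3 + 0.976*X4
INTERCEPT = -2.667
COEFFS = (2.186, -2.167, 0.725, 0.976)  # X1 hepatic dysfunction, X2 TB, X3 PLT, X4 splenic vein diameter


def phg_logit(x1, x2, x3, x4):
    """Linear predictor (log-odds) of the reported model."""
    return INTERCEPT + sum(b * x for b, x in zip(COEFFS, (x1, x2, x3, x4)))


def phg_probability(x1, x2, x3, x4):
    """Inverse logit: P = 1 / (1 + exp(-logit))."""
    return 1.0 / (1.0 + math.exp(-phg_logit(x1, x2, x3, x4)))


# Hypothetical inputs, assuming 0/1 coding for illustration only.
p_reference = phg_probability(0, 0, 0, 0)
p_x1_only = phg_probability(1, 0, 0, 0)
print(round(p_reference, 3), round(p_x1_only, 3))
```

    A predicted probability is converted to a yes/no prediction by choosing a cut-off; the accuracy, sensitivity and specificity reported in the abstract correspond to one such cut-off on the ROC curve, which the abstract does not state.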

  17. Study on Spatial Spillover Effects of Logistics Industry Development for Economic Growth in the Yangtze River Delta City Cluster Based on Spatial Durbin Model

    PubMed Central

    Xu, Xinxing

    2017-01-01

    The overall entropy method is used to evaluate the development level of the logistics industry in the city based on a mechanism analysis of the spillover effect of the development of the logistics industry on economic growth, according to the panel data of 26 cities in the Yangtze River delta. On this basis, the paper uses the spatial durbin model to study the direct impact of the development of the logistics industry on economic growth and the spatial spillover effect. The results show that the direct impact coefficient of the development of the logistics industry in the Yangtze River Delta urban agglomeration on local economic growth is 0.092, and the significant spatial spillover effect on the economic growth in the surrounding area is 0.197. Compared with the labor force input, capital investment and the degree of opening to the world, and government functions, the logistics industry’s direct impact coefficient is the largest, other than capital investment; the coefficient of the spillover effect is higher than other control variables, making it a “strong engine” of the Yangtze River Delta urban agglomeration economic growth. PMID:29207555

  18. Study on Spatial Spillover Effects of Logistics Industry Development for Economic Growth in the Yangtze River Delta City Cluster Based on Spatial Durbin Model.

    PubMed

    Xu, Xinxing; Wang, Yuhong

    2017-12-04

    The overall entropy method is used to evaluate the development level of the logistics industry in the city based on a mechanism analysis of the spillover effect of the development of the logistics industry on economic growth, according to the panel data of 26 cities in the Yangtze River delta. On this basis, the paper uses the spatial durbin model to study the direct impact of the development of the logistics industry on economic growth and the spatial spillover effect. The results show that the direct impact coefficient of the development of the logistics industry in the Yangtze River Delta urban agglomeration on local economic growth is 0.092, and the significant spatial spillover effect on the economic growth in the surrounding area is 0.197. Compared with the labor force input, capital investment and the degree of opening to the world, and government functions, the logistics industry's direct impact coefficient is the largest, other than capital investment; the coefficient of the spillover effect is higher than other control variables, making it a "strong engine" of the Yangtze River Delta urban agglomeration economic growth.

  19. Variable Selection in Logistic Regression.

    DTIC Science & Technology

    1987-06-01

    Variable Selection in Logistic Regression. Z. D. Bai, P. R. Krishnaiah and L. C. Zhao, Center for Multivariate Analysis, University of Pittsburgh. Contract or grant number F49620-85-C-0008.

  20. Multinomial Logistic Regression Predicted Probability Map To Visualize The Influence Of Socio-Economic Factors On Breast Cancer Occurrence in Southern Karnataka

    NASA Astrophysics Data System (ADS)

    Madhu, B.; Ashok, N. C.; Balasubramanian, S.

    2014-11-01

    Multinomial logistic regression analysis was used to develop a statistical model that can predict the probability of breast cancer in Southern Karnataka using breast cancer occurrence data from 2007-2011. Independent socio-economic variables describing breast cancer occurrence, such as age, education, occupation, parity, type of family, health insurance coverage, residential locality and socioeconomic status, were obtained for each case. The models were developed as follows: i) The urban-rural distribution of breast cancer cases obtained from the Bharat Hospital and Institute of Oncology was visualized spatially. ii) Socio-economic risk factors describing the breast cancer occurrences were compiled for each case. These data were then analysed using multinomial logistic regression in SPSS statistical software; relations between the occurrence of breast cancer across socio-economic status and the influence of other socio-economic variables were evaluated, and multinomial logistic regression models were constructed. iii) The model that best predicted the occurrence of breast cancer was identified. This multivariate logistic regression model was entered into a geographic information system, and maps showing the predicted probability of breast cancer occurrence in Southern Karnataka were created. This study demonstrates that multinomial logistic regression is a valuable tool for developing models that predict the probability of breast cancer occurrence in Southern Karnataka.

  1. Comparison of Logistic Regression and Artificial Neural Network in Low Back Pain Prediction: Second National Health Survey

    PubMed Central

    Parsaeian, M; Mohammad, K; Mahmoudi, M; Zeraati, H

    2012-01-01

    Background: The purpose of this investigation was to empirically compare the predictive ability of an artificial neural network with that of logistic regression in the prediction of low back pain. Methods: Data from the second national health survey were considered in this investigation. These data include information on low back pain and its associated risk factors among Iranian people aged 15 years and older. Artificial neural network and logistic regression models were developed using a set of 17294 observations and were validated in a test set of 17295 observations. The Hosmer and Lemeshow recommendation for model selection was used in fitting the logistic regression. A three-layer perceptron with 9 input, 3 hidden and 1 output neurons was employed. The efficiency of the two models was compared by receiver operating characteristic analysis, root mean square and -2 Loglikelihood criteria. Results: The area under the ROC curve (SE), root mean square and -2 Loglikelihood of the logistic regression were 0.752 (0.004), 0.3832 and 14769.2, respectively. The area under the ROC curve (SE), root mean square and -2 Loglikelihood of the artificial neural network were 0.754 (0.004), 0.3770 and 14757.6, respectively. Conclusions: Based on these three criteria, the artificial neural network would give better performance than logistic regression. Although the difference is statistically significant, it does not seem to be clinically significant. PMID:23113198

  2. Comparison of logistic regression and artificial neural network in low back pain prediction: second national health survey.

    PubMed

    Parsaeian, M; Mohammad, K; Mahmoudi, M; Zeraati, H

    2012-01-01

    The purpose of this investigation was to empirically compare the predictive ability of an artificial neural network with that of logistic regression in the prediction of low back pain. Data from the second national health survey were considered in this investigation. These data include information on low back pain and its associated risk factors among Iranian people aged 15 years and older. Artificial neural network and logistic regression models were developed using a set of 17294 observations and were validated in a test set of 17295 observations. The Hosmer and Lemeshow recommendation for model selection was used in fitting the logistic regression. A three-layer perceptron with 9 input, 3 hidden and 1 output neurons was employed. The efficiency of the two models was compared by receiver operating characteristic analysis, root mean square and -2 Loglikelihood criteria. The area under the ROC curve (SE), root mean square and -2 Loglikelihood of the logistic regression were 0.752 (0.004), 0.3832 and 14769.2, respectively. The area under the ROC curve (SE), root mean square and -2 Loglikelihood of the artificial neural network were 0.754 (0.004), 0.3770 and 14757.6, respectively. Based on these three criteria, the artificial neural network would give better performance than logistic regression. Although the difference is statistically significant, it does not seem to be clinically significant.

  3. Modelling of binary logistic regression for obesity among secondary students in a rural area of Kedah

    NASA Astrophysics Data System (ADS)

    Kamaruddin, Ainur Amira; Ali, Zalila; Noor, Norlida Mohd.; Baharum, Adam; Ahmad, Wan Muhamad Amir W.

    2014-07-01

    Logistic regression analysis examines the influence of various factors on a dichotomous outcome by estimating the probability of the event's occurrence. Logistic regression, also called a logit model, is a statistical procedure used to model dichotomous outcomes. In the logit model the log odds of the dichotomous outcome is modeled as a linear combination of the predictor variables. The log odds ratio in logistic regression provides a description of the probabilistic relationship of the variables and the outcome. In conducting logistic regression, selection procedures are used in selecting important predictor variables, diagnostics are used to check that assumptions are valid which include independence of errors, linearity in the logit for continuous variables, absence of multicollinearity, and lack of strongly influential outliers and a test statistic is calculated to determine the aptness of the model. This study used the binary logistic regression model to investigate overweight and obesity among rural secondary school students on the basis of their demographics profile, medical history, diet and lifestyle. The results indicate that overweight and obesity of students are influenced by obesity in family and the interaction between a student's ethnicity and routine meals intake. The odds of a student being overweight and obese are higher for a student having a family history of obesity and for a non-Malay student who frequently takes routine meals as compared to a Malay student.
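
    The logit model described in this abstract can be written out explicitly. The notation below is the standard textbook form and is an addition, not taken from the paper: p is the probability of the outcome (here, being overweight or obese), x_1, ..., x_k are the predictors, and the beta terms are the fitted coefficients.

```latex
% Log odds as a linear combination of the predictors
\log\!\left(\frac{p}{1-p}\right) = \beta_0 + \beta_1 x_1 + \cdots + \beta_k x_k

% Equivalent expression for the probability itself
p = \frac{1}{1 + \exp\!\bigl(-(\beta_0 + \beta_1 x_1 + \cdots + \beta_k x_k)\bigr)}

% Odds ratio for a one-unit increase in x_j, other predictors held fixed
\mathrm{OR}_j = e^{\beta_j}
```

    When the model contains an interaction term, such as the ethnicity by meal-intake interaction reported here, the odds ratio for one variable depends on the level of the other, so a single exponentiated coefficient no longer summarizes its effect.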

  4. Understanding logistic regression analysis.

    PubMed

    Sperandei, Sandro

    2014-01-01

    Logistic regression is used to obtain odds ratio in the presence of more than one explanatory variable. The procedure is quite similar to multiple linear regression, with the exception that the response variable is binomial. The result is the impact of each variable on the odds ratio of the observed event of interest. The main advantage is to avoid confounding effects by analyzing the association of all variables together. In this article, we explain the logistic regression procedure using examples to make it as simple as possible. After definition of the technique, the basic interpretation of the results is highlighted and then some special issues are discussed.

  5. Binary Logistic Regression Versus Boosted Regression Trees in Assessing Landslide Susceptibility for Multiple-Occurring Regional Landslide Events: Application to the 2009 Storm Event in Messina (Sicily, southern Italy).

    NASA Astrophysics Data System (ADS)

    Lombardo, L.; Cama, M.; Maerker, M.; Parisi, L.; Rotigliano, E.

    2014-12-01

    This study aims to compare the performance of Binary Logistic Regression (BLR) and Boosted Regression Trees (BRT) methods in assessing landslide susceptibility for multiple-occurrence regional landslide events within the Mediterranean region. A test area was selected in the north-eastern sector of Sicily (southern Italy), corresponding to the catchments of the Briga and the Giampilieri streams, both stretching for a few kilometres from the Peloritan ridge (eastern Sicily, Italy) to the Ionian Sea. This area was struck on 1 October 2009 by an extreme climatic event resulting in thousands of rapid shallow landslides, mainly of the debris flow and debris avalanche types, involving the weathered layer of a low- to high-grade metamorphic bedrock. Exploiting the same set of predictors and the 2009 landslide archive, BLR- and BRT-based susceptibility models were obtained for the two catchments separately, adopting a random partition (RP) technique for validation; in addition, the models trained in one of the two catchments (Briga) were tested in predicting the landslide distribution in the other (Giampilieri), adopting a spatial partition (SP) based validation procedure. All the validation procedures were based on multi-fold tests so as to evaluate and compare the reliability of the fitting, the prediction skill, the coherence in the predictor selection and the precision of the susceptibility estimates. All the models obtained with the two methods produced very high predictive performances, with a general congruence between BLR and BRT in the predictor importance. In particular, the research highlighted that BRT models reached a higher prediction performance than BLR models for RP-based modelling, whilst for the SP-based models the difference in predictive skill between the two methods dropped drastically, converging to an analogous excellent performance. However, when looking at the precision of the probability estimates, BLR proved to produce more robust models in terms of selected predictors and coefficients, as well as of dispersion of the estimated probabilities around the mean value for each mapped pixel. The difference in behaviour could be interpreted as the result of overfitting effects, which affect decision tree classification more heavily than logistic regression techniques.

  6. Comparing Methodologies for Developing an Early Warning System: Classification and Regression Tree Model versus Logistic Regression. REL 2015-077

    ERIC Educational Resources Information Center

    Koon, Sharon; Petscher, Yaacov

    2015-01-01

    The purpose of this report was to explicate the use of logistic regression and classification and regression tree (CART) analysis in the development of early warning systems. It was motivated by state education leaders' interest in maintaining high classification accuracy while simultaneously improving practitioner understanding of the rules by…

  7. A Note on the Relationship between the Number of Indicators and Their Reliability in Detecting Regression Coefficients in Latent Regression Analysis

    ERIC Educational Resources Information Center

    Dolan, Conor V.; Wicherts, Jelte M.; Molenaar, Peter C. M.

    2004-01-01

    We consider the question of how variation in the number and reliability of indicators affects the power to reject the hypothesis that the regression coefficients are zero in latent linear regression analysis. We show that power remains constant as long as the coefficient of determination remains unchanged. Any increase in the number of indicators…

  8. The Predictive Effects of Protection Motivation Theory on Intention and Behaviour of Physical Activity in Patients with Type 2 Diabetes

    PubMed Central

    Ali Morowatisharifabad, Mohammad; Abdolkarimi, Mahdi; Asadpour, Mohammad; Fathollahi, Mahmood Sheikh; Balaee, Parisa

    2018-01-01

    INTRODUCTION: Theory-based education tailored to the target behaviour and group can be effective in promoting physical activity. AIM: The purpose of this study was to examine the predictive power of Protection Motivation Theory on intention and behaviour of physical activity in patients with type 2 diabetes. METHODS: This descriptive study was conducted on 250 patients in Rafsanjan, Iran. To examine the scores of the protection motivation theory constructs, a researcher-made questionnaire was used. Its validity and reliability were confirmed. The level of physical activity was measured by the International Short-form Physical Activity Inventory. Its validity and reliability were also confirmed. Data were analysed by statistical tests including correlation coefficient, chi-square, logistic regression and linear regression. RESULTS: There was a significant correlation between all the protection motivation theory constructs and the intention to do physical activity. The theory constructs were able to predict 60% of the variance of physical activity intention. The results of logistic regression demonstrated that an increase in the score of physical activity intention and of self-efficacy increased the chance of a higher level of physical activity by 3.4 and 1.5 times, respectively (OR = 3.39 and 1.54). CONCLUSION: Considering the ability of the protection motivation theory constructs to explain physical activity behaviour, interventional designs based on the constructs of this theory are suggested, especially to improve self-efficacy as the most powerful factor in predicting physical activity intention and behaviour. PMID:29731945

  9. Mapping Shallow Landslide Slope Instability at Large Scales Using Remote Sensing and GIS

    NASA Astrophysics Data System (ADS)

    Avalon Cullen, C.; Kashuk, S.; Temimi, M.; Suhili, R.; Khanbilvardi, R.

    2015-12-01

    Rainfall-induced landslides are one of the most frequent hazards on sloping terrain. They lead to great economic losses and fatalities worldwide. Most factors inducing shallow landslides are local and can only be mapped with high levels of uncertainty at larger scales. This work presents an attempt to determine slope instability at large scales. Buffer and threshold techniques are used to downscale areas and minimize uncertainties. Four static parameters (slope angle, soil type, land cover and elevation) for 261 shallow rainfall-induced landslides in the continental United States are examined. ASTER GDEM is used as the basis for topographical characterization of slope and buffer analysis. Slope angle threshold assessment at the 50th, 75th, 95th, 98th, and 99th percentiles is tested locally. Each threshold is further analysed in relation to the other parameters in a logistic regression environment for the continental U.S. It is determined that thresholds lower than the 95th percentile underestimate slope angles. The best regression fit is achieved when utilizing the 99th-percentile threshold slope angle. This model predicts the highest number of cases correctly, at 87.0% accuracy. A one-unit rise in the 99th-percentile threshold range increases landslide likelihood by 11.8%. The logistic regression model is carried over to ArcGIS, where all variables are processed based on their corresponding coefficients. A regional slope instability map for the continental United States is created and analyzed against the available landslide records and their spatial distributions. It is expected that future inclusion of dynamic parameters like precipitation and other proxies like soil moisture in the model will further improve accuracy.
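
    The reported 11.8% increase per unit can be related to a logistic regression coefficient with a small calculation. The sketch below assumes that "likelihood" in the abstract refers to the odds of a landslide, so that the 11.8% figure corresponds to an odds ratio of 1.118; the abstract does not state this explicitly, and the function names are hypothetical.

```python
import math

def coefficient_from_percent_change(percent_change):
    """Coefficient whose one-unit effect multiplies the odds by (1 + percent/100)."""
    return math.log(1.0 + percent_change / 100.0)

def percent_change_from_coefficient(coefficient, units=1.0):
    """Percent change in the odds for a change of `units` in the predictor."""
    return (math.exp(coefficient * units) - 1.0) * 100.0

# Assumption: the 11.8% increase is an increase in the odds (odds ratio 1.118).
beta = coefficient_from_percent_change(11.8)
print(round(beta, 4))

# Effects compound multiplicatively, so five units is not simply 5 * 11.8%.
print(round(percent_change_from_coefficient(beta, units=5), 1))
```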

  10. Using Multiple and Logistic Regression to Estimate the Median WillCost and Probability of Cost and Schedule Overrun for Program Managers

    DTIC Science & Technology

    2017-03-23

    PUBLIC RELEASE; DISTRIBUTION UNLIMITED Using Multiple and Logistic Regression to Estimate the Median Will-Cost and Probability of Cost and... Cost and Probability of Cost and Schedule Overrun for Program Managers Ryan C. Trudelle Follow this and additional works at: https://scholar.afit.edu...afit.edu. Recommended Citation Trudelle, Ryan C., "Using Multiple and Logistic Regression to Estimate the Median Will-Cost and Probability of Cost and

  11. Expression of Proteins Involved in Epithelial-Mesenchymal Transition as Predictors of Metastasis and Survival in Breast Cancer Patients

    DTIC Science & Technology

    2013-11-01

    Ptrend 0.78 0.62 0.75 Unconditional logistic regression was used to estimate odds ratios (OR) and 95% confidence intervals (CI) for risk of node...Ptrend 0.71 0.67 Unconditional logistic regression was used to estimate odds ratios (OR) and 95% confidence intervals (CI) for risk of high-grade tumors... logistic regression was used to estimate odds ratios (OR) and 95% confidence intervals (CI) for the associations between each of the seven SNPs and

  12. Logistic LASSO regression for the diagnosis of breast cancer using clinical demographic data and the BI-RADS lexicon for ultrasonography.

    PubMed

    Kim, Sun Mi; Kim, Yongdai; Jeong, Kuhwan; Jeong, Heeyeong; Kim, Jiyoung

    2018-01-01

    The aim of this study was to compare the performance of image analysis for predicting breast cancer using two distinct regression models and to evaluate the usefulness of incorporating clinical and demographic data (CDD) into the image analysis in order to improve the diagnosis of breast cancer. This study included 139 solid masses from 139 patients who underwent an ultrasonography-guided core biopsy and had available CDD between June 2009 and April 2010. Three breast radiologists retrospectively reviewed the 139 breast masses and described each lesion using the Breast Imaging Reporting and Data System (BI-RADS) lexicon. We applied and compared two regression methods, stepwise logistic (SL) regression and logistic least absolute shrinkage and selection operator (LASSO) regression, in which the BI-RADS descriptors and CDD were used as covariates. We investigated the performances of these regression methods and the agreement of radiologists in terms of test misclassification error and the area under the curve (AUC) of the tests. Logistic LASSO regression was superior (P<0.05) to SL regression, regardless of whether CDD was included in the covariates, in terms of test misclassification errors (0.234 vs. 0.253, without CDD; 0.196 vs. 0.258, with CDD) and AUC (0.785 vs. 0.759, without CDD; 0.873 vs. 0.735, with CDD). However, it was inferior (P<0.05) to the agreement of three radiologists in terms of test misclassification errors (0.234 vs. 0.168, without CDD; 0.196 vs. 0.088, with CDD) and the AUC without CDD (0.785 vs. 0.844, P<0.001), but was comparable to the AUC with CDD (0.873 vs. 0.880, P=0.141). Logistic LASSO regression based on BI-RADS descriptors and CDD showed better performance than SL in predicting the presence of breast cancer. The use of CDD as a supplement to the BI-RADS descriptors significantly improved the prediction of breast cancer using logistic LASSO regression.
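
    To illustrate how a LASSO penalty selects covariates in logistic regression, the following sketch fits an L1-penalized model by proximal gradient descent. It is not the study's implementation: the solver, the function names, the tiny dataset and the penalty values are all assumptions made for illustration. The key step is soft-thresholding, which sets small coefficients exactly to zero.

```python
import math

def soft_threshold(value, threshold):
    """Proximal operator of the L1 penalty: shrink toward zero, clip at zero."""
    if value > threshold:
        return value - threshold
    if value < -threshold:
        return value + threshold
    return 0.0

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def fit_logistic_lasso(rows, labels, penalty, step=0.1, iterations=2000):
    """Minimize mean negative log-likelihood + penalty * sum(|coef|).
    The intercept is left unpenalized."""
    n, p = len(rows), len(rows[0])
    intercept, coefs = 0.0, [0.0] * p
    for _ in range(iterations):
        # Residuals (predicted probability minus label) at the current estimate.
        residuals = [
            sigmoid(intercept + sum(c * x for c, x in zip(coefs, row))) - y
            for row, y in zip(rows, labels)
        ]
        intercept -= step * sum(residuals) / n
        for j in range(p):
            gradient_j = sum(r * row[j] for r, row in zip(residuals, rows)) / n
            # Gradient step followed by soft-thresholding.
            coefs[j] = soft_threshold(coefs[j] - step * gradient_j, step * penalty)
    return intercept, coefs

# Hypothetical toy data: two covariates, binary outcome.
rows = [[-1.0, 0.2], [-0.5, -0.4], [0.5, 0.3], [1.0, -0.1], [-0.8, 0.1], [0.9, 0.0]]
labels = [0, 0, 1, 1, 1, 0]

print(fit_logistic_lasso(rows, labels, penalty=0.05))
print(fit_logistic_lasso(rows, labels, penalty=10.0))  # heavy penalty removes all covariates
```

    In practice the penalty is usually chosen by cross-validation; the values above are arbitrary.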

  13. The alarming problems of confounding equivalence using logistic regression models in the perspective of causal diagrams.

    PubMed

    Yu, Yuanyuan; Li, Hongkai; Sun, Xiaoru; Su, Ping; Wang, Tingting; Liu, Yi; Yuan, Zhongshang; Liu, Yanxun; Xue, Fuzhong

    2017-12-28

    Confounders can produce spurious associations between exposure and outcome in observational studies. For the majority of epidemiologists, adjusting for confounders using a logistic regression model is the habitual method, though it has some problems in accuracy and precision. It is, therefore, important to highlight the problems of logistic regression and to search for an alternative method. Four causal diagram models were defined to summarize confounding equivalence. Both theoretical proofs and simulation studies were performed to verify whether conditioning on different confounding equivalence sets had the same bias-reducing potential and then to select the optimum adjusting strategy, in which the logistic regression model and the inverse probability weighting based marginal structural model (IPW-based-MSM) were compared. The "do-calculus" was used to calculate the true causal effect of exposure on outcome, then the bias and standard error were used to evaluate the performances of different strategies. Adjusting for different sets of confounding equivalence, as judged by identical Markov boundaries, produced different bias-reducing potential in the logistic regression model. For the sets that satisfied G-admissibility, adjusting for the set including all the confounders reduced bias to the same extent as adjusting for the set containing the parent nodes of the outcome, while the bias after adjusting for the parent nodes of exposure was not equivalent to them. In addition, all causal effect estimations through logistic regression were biased, although the estimation after adjusting for the parent nodes of exposure was nearest to the true causal effect. However, conditioning on different confounding equivalence sets had the same bias-reducing potential under IPW-based-MSM. Compared with logistic regression, the IPW-based-MSM could obtain unbiased causal effect estimation when the adjusted confounders satisfied G-admissibility, and the optimal strategy was to adjust for the parent nodes of outcome, which obtained the highest precision. All adjustment strategies through logistic regression were biased for causal effect estimation, while IPW-based-MSM could always obtain unbiased estimation when the adjusted set satisfied G-admissibility. Thus, IPW-based-MSM was recommended for adjusting for confounder sets.

  14. Use and interpretation of logistic regression in habitat-selection studies

    USGS Publications Warehouse

    Keating, Kim A.; Cherry, Steve

    2004-01-01

     Logistic regression is an important tool for wildlife habitat-selection studies, but the method frequently has been misapplied due to an inadequate understanding of the logistic model, its interpretation, and the influence of sampling design. To promote better use of this method, we review its application and interpretation under 3 sampling designs: random, case-control, and use-availability. Logistic regression is appropriate for habitat use-nonuse studies employing random sampling and can be used to directly model the conditional probability of use in such cases. Logistic regression also is appropriate for studies employing case-control sampling designs, but careful attention is required to interpret results correctly. Unless bias can be estimated or probability of use is small for all habitats, results of case-control studies should be interpreted as odds ratios, rather than probability of use or relative probability of use. When data are gathered under a use-availability design, logistic regression can be used to estimate approximate odds ratios if probability of use is small, at least on average. More generally, however, logistic regression is inappropriate for modeling habitat selection in use-availability studies. In particular, using logistic regression to fit the exponential model of Manly et al. (2002:100) does not guarantee maximum-likelihood estimates, valid probabilities, or valid likelihoods. We show that the resource selection function (RSF) commonly used for the exponential model is proportional to a logistic discriminant function. Thus, it may be used to rank habitats with respect to probability of use and to identify important habitat characteristics or their surrogates, but it is not guaranteed to be proportional to probability of use. Other problems associated with the exponential model also are discussed. 
We describe an alternative model based on Lancaster and Imbens (1996) that offers a method for estimating conditional probability of use in use-availability studies. Although promising, this model fails to converge to a unique solution in some important situations. Further work is needed to obtain a robust method that is broadly applicable to use-availability studies.

  15. Modeling Governance KB with CATPCA to Overcome Multicollinearity in the Logistic Regression

    NASA Astrophysics Data System (ADS)

    Khikmah, L.; Wijayanto, H.; Syafitri, U. D.

    2017-04-01

    A problem often encountered in logistic regression modeling is multicollinearity. Multicollinearity between explanatory variables biases the parameter estimates and also causes classification errors. In general, stepwise regression is used to overcome multicollinearity in regression. Another method of overcoming multicollinearity, which retains all variables for prediction, is Principal Component Analysis (PCA). However, classical PCA applies only to numeric data. When the data are categorical, one method to solve the problem is Categorical Principal Component Analysis (CATPCA). The data used in this research were part of the 2012 Demographic and Population Survey Indonesia (IDHS). This research focuses on the characteristics of women using contraceptive methods. Classification results were evaluated using Area Under Curve (AUC) values; the higher the AUC value, the better. Based on AUC values, classification of contraceptive method use with the stepwise method (58.66%) is better than with the logistic regression model (57.39%) and CATPCA (57.39%). Evaluating the results using sensitivity shows the opposite: the CATPCA method (99.79%) is better than the logistic regression method (92.43%) and stepwise (92.05%). Because this study focuses on classification of the major class (using a contraceptive method), the selected model is CATPCA, since it raises the accuracy for the major class.

  16. Estimation of diffusion coefficients from voltammetric signals by support vector and gaussian process regression

    PubMed Central

    2014-01-01

    Background: Support vector regression (SVR) and Gaussian process regression (GPR) were used for the analysis of electroanalytical experimental data to estimate diffusion coefficients. Results: For simulated cyclic voltammograms based on the EC, Eqr, and EqrC mechanisms these regression algorithms in combination with nonlinear kernel/covariance functions yielded diffusion coefficients with higher accuracy as compared to the standard approach of calculating diffusion coefficients relying on the Nicholson-Shain equation. The level of accuracy achieved by SVR and GPR is virtually independent of the rate constants governing the respective reaction steps. Further, the reduction of high-dimensional voltammetric signals by manual selection of typical voltammetric peak features decreased the performance of both regression algorithms compared to a reduction by downsampling or principal component analysis. After training on simulated data sets, diffusion coefficients were estimated by the regression algorithms for experimental data comprising voltammetric signals for three organometallic complexes. Conclusions: Estimated diffusion coefficients closely matched the values determined by the parameter fitting method, but reduced the required computational time considerably for one of the reaction mechanisms. The automated processing of voltammograms according to the regression algorithms yields better results than the conventional analysis of peak-related data. PMID:24987463

  17. Prediction of random-regression coefficient for daily milk yield after 305 days in milk by using the regression-coefficient estimates from the first 305 days.

    PubMed

    Yamazaki, Takeshi; Takeda, Hisato; Hagiya, Koichi; Yamaguchi, Satoshi; Sasaki, Osamu

    2018-03-13

    Because lactation periods in dairy cows lengthen with increasing total milk production, it is important to predict individual productivities after 305 days in milk (DIM) to determine the optimal lactation period. We therefore examined whether the random regression (RR) coefficient from 306 to 450 DIM (M2) can be predicted from those during the first 305 DIM (M1) by using a random regression model. We analyzed test-day milk records from 85690 Holstein cows in their first lactations and 131727 cows in their later (second to fifth) lactations. Data in M1 and M2 were analyzed separately by using different single-trait RR animal models. We then performed a multiple regression analysis of the RR coefficients of M2 on those of M1 during the first and later lactations. The first-order Legendre polynomials were practical covariates of random regression for the milk yields of M2. All RR coefficients for the additive genetic (AG) effect and the intercept for the permanent environmental (PE) effect of M2 had moderate to strong correlations with the intercept for the AG effect of M1. The coefficients of determination for multiple regression of the combined intercepts for the AG and PE effects of M2 on the coefficients for the AG effect of M1 were moderate to high. The daily milk yields of M2 predicted by using the RR coefficients for the AG effect of M1 were highly correlated with those obtained by using the coefficients of M2. Milk production after 305 DIM can be predicted by using the RR coefficient estimates of the AG effect during the first 305 DIM.

  18. Interpreting Regression Results: beta Weights and Structure Coefficients are Both Important.

    ERIC Educational Resources Information Center

    Thompson, Bruce

    Various realizations have led to less frequent use of the "OVA" methods (analysis of variance--ANOVA--among others) and to more frequent use of general linear model approaches such as regression. However, too few researchers understand all the various coefficients produced in regression. This paper explains these coefficients and their…

  19. Biases and Standard Errors of Standardized Regression Coefficients

    ERIC Educational Resources Information Center

    Yuan, Ke-Hai; Chan, Wai

    2011-01-01

    The paper obtains consistent standard errors (SE) and biases of order O(1/n) for the sample standardized regression coefficients with both random and given predictors. Analytical results indicate that the formulas for SEs given in popular text books are consistent only when the population value of the regression coefficient is zero. The sample…

  20. Logistic regression models of factors influencing the location of bioenergy and biofuels plants

    Treesearch

    T.M. Young; R.L. Zaretzki; J.H. Perdue; F.M. Guess; X. Liu

    2011-01-01

    Logistic regression models were developed to identify significant factors that influence the location of existing wood-using bioenergy/biofuels plants and traditional wood-using facilities. Logistic models provided quantitative insight for variables influencing the location of woody biomass-using facilities. Availability of "thinnings to a basal area of 31.7m2/ha...

  1. Discrete post-processing of total cloud cover ensemble forecasts

    NASA Astrophysics Data System (ADS)

    Hemri, Stephan; Haiden, Thomas; Pappenberger, Florian

    2017-04-01

    This contribution presents an approach to post-process ensemble forecasts for the discrete and bounded weather variable of total cloud cover. Two methods for discrete statistical post-processing of ensemble predictions are tested. The first approach is based on multinomial logistic regression, the second involves a proportional odds logistic regression model. Applying them to total cloud cover raw ensemble forecasts from the European Centre for Medium-Range Weather Forecasts improves forecast skill significantly. Based on station-wise post-processing of raw ensemble total cloud cover forecasts for a global set of 3330 stations over the period from 2007 to early 2014, the more parsimonious proportional odds logistic regression model proved to slightly outperform the multinomial logistic regression model. Reference Hemri, S., Haiden, T., & Pappenberger, F. (2016). Discrete post-processing of total cloud cover ensemble forecasts. Monthly Weather Review 144, 2565-2577.
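
    The proportional odds model mentioned in this abstract can be sketched as follows. The code is an illustrative assumption, not the authors' implementation: it supposes nine ordered cloud cover categories (for example oktas 0 to 8), a single predictor such as an ensemble mean, and hypothetical threshold and slope values. The model uses one shared slope and one threshold per category boundary, which is what makes it more parsimonious than a multinomial model with separate coefficients per category.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def proportional_odds_probabilities(thresholds, slope, predictor):
    """Category probabilities from cumulative logits:
    P(Y <= k | x) = sigmoid(threshold_k - slope * x), thresholds increasing."""
    cumulative = [sigmoid(t - slope * predictor) for t in thresholds] + [1.0]
    probabilities = [cumulative[0]]
    for lower, upper in zip(cumulative, cumulative[1:]):
        probabilities.append(upper - lower)
    return probabilities

# Hypothetical parameters: 8 increasing thresholds give 9 ordered categories.
thresholds = [-2.0, -1.5, -1.0, -0.5, 0.0, 0.5, 1.0, 2.0]
slope = 0.4

forecast = proportional_odds_probabilities(thresholds, slope, predictor=6.0)
for category, prob in enumerate(forecast):
    print(category, round(prob, 3))
```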

  2. Fuzzy multinomial logistic regression analysis: A multi-objective programming approach

    NASA Astrophysics Data System (ADS)

    Abdalla, Hesham A.; El-Sayed, Amany A.; Hamed, Ramadan

    2017-05-01

    Parameter estimation for multinomial logistic regression is usually based on maximizing the likelihood function. For large well-balanced datasets, Maximum Likelihood (ML) estimation is a satisfactory approach. Unfortunately, ML can fail completely or at least produce poor results in terms of estimated probabilities and confidence intervals of parameters, especially for small datasets. In this study, a new approach based on fuzzy concepts is proposed to estimate the parameters of multinomial logistic regression. The study assumes that the parameters of multinomial logistic regression are fuzzy. Based on the extension principle stated by Zadeh and Bárdossy's proposition, a multi-objective programming approach is suggested to estimate these fuzzy parameters. A simulation study is used to evaluate the performance of the new approach versus the ML approach. Results show that the proposed model outperforms ML for small datasets.

  3. A Primer on Logistic Regression.

    ERIC Educational Resources Information Center

    Woldbeck, Tanya

    This paper introduces logistic regression as a viable alternative when the researcher is faced with variables that are not continuous. If one is to use simple regression, the dependent variable must be measured on a continuous scale. In the behavioral sciences, it may not always be appropriate or possible to have a measured dependent variable on a…

  4. [Effects of carbon components of fine particulate matter (PM2.5) on atherogenic index of plasma].

    PubMed

    Fan, Jiao; Qin, Xiaolei; Xue, Xiaodan; Han, Bin; Bai, Zhipeng; Tang, Naijun; Zhang, Liwen

    2014-01-01

    To evaluate associations between carbon constituents of fine particulate matter (PM2.5) and atherogenic index of plasma (AIP). We selected subjects from two communities by systematic sampling, and 112 people aged over 60 years without cardiovascular disease were recruited. The levels of total cholesterol (TC), triglycerides (TG), high-density lipoprotein cholesterol (HDL-C) and low-density lipoprotein cholesterol (LDL-C) of the subjects, and personal exposure to PM2.5, were measured in December 2011. Total carbon (TC), organic carbon (OC) and elemental carbon (EC) of PM2.5 were detected and AIP was calculated according to its definition. The value of AIP among the 112 subjects was 0.05 ± 0.26. Personal exposure concentrations of PM2.5 and its carbon components (TC, OC and EC) were (164.75 ± 110.67), (53.86 ± 29.65), (44.93 ± 26.37) and (9.49 ± 5.75) µg/m³, respectively. The Pearson analysis showed linear relationships between TC, OC, EC and AIP, all with significant positive correlations. The correlation coefficients were TC (r = 0.307, P < 0.05), OC (r = 0.287, P < 0.05) and EC (r = 0.252, P < 0.05), respectively. The multiple logistic regression analysis showed that when the AIP risk categories were selected as the dependent variable and the low risk group as the reference group, the regression coefficients of TC, OC and EC were 1.03 (95%CI: 1.01-1.05), 1.03 (95%CI: 1.01-1.05) and 1.12 (95%CI: 1.02-1.22), respectively, in the high risk group, while the regression coefficient and OR were not statistically significant in the middle risk group. There were stable associations between the carbon constituents (TC, OC and EC) of fine particulate matter (PM2.5) and AIP. The findings suggested that carbon components of PM2.5 should be considered as risk factors for atherogenesis.

  5. A Solution to Separation and Multicollinearity in Multiple Logistic Regression

    PubMed Central

    Shen, Jianzhao; Gao, Sujuan

    2010-01-01

    In dementia screening tests, item selection for shortening an existing screening test can be achieved using multiple logistic regression. However, maximum likelihood estimates for such logistic regression models often experience serious bias or even non-existence because of separation and multicollinearity problems resulting from a large number of highly correlated items. Firth (1993, Biometrika, 80(1), 27–38) proposed a penalized likelihood estimator for generalized linear models and it was shown to reduce bias and the non-existence problems. Ridge regression has been used in logistic regression to stabilize the estimates in cases of multicollinearity. However, neither method solves the problem that the other addresses. In this paper, we propose a double penalized maximum likelihood estimator combining Firth's penalized likelihood equation with a ridge parameter. We present a simulation study evaluating the empirical performance of the double penalized likelihood estimator in small to moderate sample sizes. We demonstrate the proposed approach using current screening data from a community-based dementia study. PMID:20376286
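
    For orientation, the two penalties named in this abstract can be written side by side. The first two lines are the standard logistic log-likelihood and Firth's penalized version (equivalent to a Jeffreys prior). The third line is only a plausible form of the combined criterion, assuming the ridge term enters as a squared-norm penalty with tuning parameter \lambda; the exact scaling and notation used in the paper are not given in the abstract.

```latex
\ell(\beta) = \sum_{i=1}^{n} \left[ y_i \log \pi_i + (1 - y_i) \log (1 - \pi_i) \right],
\qquad \pi_i = \frac{1}{1 + e^{-x_i^{\top}\beta}}

\ell_{\mathrm{Firth}}(\beta) = \ell(\beta) + \tfrac{1}{2} \log \left| I(\beta) \right|,
\qquad I(\beta) = X^{\top} W X, \quad W = \mathrm{diag}\{\pi_i (1 - \pi_i)\}

\ell_{\mathrm{double}}(\beta) = \ell(\beta) + \tfrac{1}{2} \log \left| I(\beta) \right| - \lambda \sum_{j} \beta_j^{2}
```

    The Firth term keeps the estimates finite under separation, while the ridge term shrinks coefficients of highly correlated items toward zero.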

  6. A Solution to Separation and Multicollinearity in Multiple Logistic Regression.

    PubMed

    Shen, Jianzhao; Gao, Sujuan

    2008-10-01

    In dementia screening tests, item selection for shortening an existing screening test can be achieved using multiple logistic regression. However, maximum likelihood estimates for such logistic regression models often experience serious bias or even non-existence because of separation and multicollinearity problems resulting from a large number of highly correlated items. Firth (1993, Biometrika, 80(1), 27-38) proposed a penalized likelihood estimator for generalized linear models and it was shown to reduce bias and the non-existence problems. Ridge regression has been used in logistic regression to stabilize the estimates in cases of multicollinearity. However, neither method solves the problem that the other addresses. In this paper, we propose a double penalized maximum likelihood estimator combining Firth's penalized likelihood equation with a ridge parameter. We present a simulation study evaluating the empirical performance of the double penalized likelihood estimator in small to moderate sample sizes. We demonstrate the proposed approach using current screening data from a community-based dementia study.

  7. Novel risk score of contrast-induced nephropathy after percutaneous coronary intervention.

    PubMed

    Ji, Ling; Su, XiaoFeng; Qin, Wei; Mi, XuHua; Liu, Fei; Tang, XiaoHong; Li, Zi; Yang, LiChuan

    2015-08-01

    Contrast-induced nephropathy (CIN) after percutaneous coronary intervention (PCI) is a major cause of acute kidney injury. In this study, we established a comprehensive risk score model to assess the risk of CIN after a PCI procedure, which could be easily used in a clinical environment. A total of 805 PCI patients, divided into an analysis cohort (70%) and a validation cohort (30%), were enrolled retrospectively in this study. Risk factors for CIN were identified using univariate analysis and multivariate logistic regression in the analysis cohort. The risk score model was developed based on multiple regression coefficients. The sensitivity and specificity of the new risk score system were validated in the validation cohort. The new risk score model was compared with previously reported models. The incidence of post-PCI CIN in the analysis cohort (n = 565) was 12%. A considerably high CIN incidence (50%) was observed in patients with chronic kidney disease (CKD). Age >75, body mass index (BMI) >25, myoglobin level, cardiac function level, hypoalbuminaemia, history of CKD, intra-aortic balloon pump (IABP) and peripheral vascular disease (PVD) were identified as independent risk factors for post-PCI CIN. A novel risk score model was established using multivariate regression coefficients, which showed the highest sensitivity and specificity (0.917, 95%CI 0.877-0.957) compared with previous models. A new post-PCI CIN risk score model was developed based on a retrospective study of 805 patients. Application of this model might be helpful to predict CIN in patients undergoing a PCI procedure. © 2015 Asian Pacific Society of Nephrology.

  8. [Influences of environmental factors and interaction of several chemokines gene-environmental on systemic lupus erythematosus].

    PubMed

    Ye, Dong-qing; Hu, Yi-song; Li, Xiang-pei; Huang, Fen; Yang, Shi-gui; Hao, Jia-hu; Yin, Jing; Zhang, Guo-qing; Liu, Hui-hui

    2004-11-01

    To explore the impact of environmental factors, daily lifestyle, psychosocial factors and the interactions between environmental factors and chemokine genes on systemic lupus erythematosus (SLE). A case-control study was carried out, and environmental factors for SLE were analyzed by univariate and multivariate unconditional logistic regression. Interactions between environmental factors and chemokine polymorphisms contributing to SLE were also analyzed with a logistic regression model. Nineteen factors were associated with SLE in univariate unconditional logistic regression. However, in multivariate unconditional logistic regression, only five factors had an impact on the disease: drinking well water (OR=0.099) was a protective factor for SLE, and multiple drug allergy (OR=8.174), over-exposure to sunshine (OR=18.339), taking antibiotics (OR=9.630) and oral contraceptives were risk factors for SLE. The unconditional logistic regression model showed an interaction between eating irritating food and the -2518MCP-1G/G genotype (OR=4.387). No interaction among environmental factors contributing to SLE was found in this study. Many environmental factors were related to SLE, and there was an interaction between the -2518MCP-1G/G genotype and eating irritating food.
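
Because an odds ratio is the exponential of a logistic regression coefficient, the reported ORs can be mapped back to the coefficient scale. A minimal Python sketch using the ORs quoted in the abstract (the dictionary keys are labels chosen here for illustration):

```python
import math

# Odds ratios as quoted in the abstract above.
odds_ratios = {
    "drinking_well_water": 0.099,
    "multiple_drug_allergy": 8.174,
    "sunshine_overexposure": 18.339,
    "taking_antibiotics": 9.630,
}

# beta = ln(OR): negative for protective factors, positive for risk factors.
coefficients = {name: math.log(value) for name, value in odds_ratios.items()}

for name, beta in coefficients.items():
    print(f"{name}: beta = {beta:.3f}")
```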

  9. A deeper look at two concepts of measuring gene-gene interactions: logistic regression and interaction information revisited.

    PubMed

    Mielniczuk, Jan; Teisseyre, Paweł

    2018-03-01

    Detection of gene-gene interactions is one of the most important challenges in genome-wide case-control studies. Besides traditional logistic regression analysis, entropy-based methods have recently attracted significant attention. Among entropy-based methods, interaction information is one of the most promising measures, having many desirable properties. Although both logistic regression and interaction information have been used in several genome-wide association studies, the relationship between them has not been thoroughly investigated theoretically. The present paper attempts to fill this gap. We show that although certain connections between the two methods exist, in general they refer to two different concepts of dependence, and looking for interactions in those two senses leads to different approaches to interaction detection. We introduce an ordering between interaction measures and specify conditions for independent and dependent genes under which interaction information is a more discriminative measure than logistic regression. Moreover, we show that for so-called perfect distributions those measures are equivalent. The numerical experiments illustrate the theoretical findings, indicating that interaction information and its modified version are more universal tools for detecting various types of interaction than logistic regression and linkage disequilibrium measures. © 2017 WILEY PERIODICALS, INC.
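
As a rough illustration of the entropy-based measure, the Python sketch below computes interaction information from empirical frequencies, assuming the usual definition II(X1; X2; Y) = I((X1, X2); Y) - I(X1; Y) - I(X2; Y). The function names and the XOR toy data are invented here and do not come from the paper.

```python
from collections import Counter
from math import log2


def mutual_information(pairs):
    """Empirical mutual information (in bits) between the two coordinates of pairs."""
    n = len(pairs)
    joint = Counter(pairs)
    left = Counter(a for a, _ in pairs)
    right = Counter(b for _, b in pairs)
    return sum(c / n * log2(c * n / (left[a] * right[b])) for (a, b), c in joint.items())


def interaction_information(x1, x2, y):
    """II(X1; X2; Y) = I((X1, X2); Y) - I(X1; Y) - I(X2; Y)."""
    both = mutual_information(list(zip(zip(x1, x2), y)))
    return both - mutual_information(list(zip(x1, y))) - mutual_information(list(zip(x2, y)))


# Toy data: the outcome is the XOR of two binary "genes".
x1 = [0, 0, 1, 1]
x2 = [0, 1, 0, 1]
y = [a ^ b for a, b in zip(x1, x2)]
ii = interaction_information(x1, x2, y)
print(ii)
```

In the XOR case neither variable alone carries information about the outcome, while the pair determines it completely, so the whole dependence shows up as interaction.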

  10. Controlling Type I Error Rates in Assessing DIF for Logistic Regression Method Combined with SIBTEST Regression Correction Procedure and DIF-Free-Then-DIF Strategy

    ERIC Educational Resources Information Center

    Shih, Ching-Lin; Liu, Tien-Hsiang; Wang, Wen-Chung

    2014-01-01

    The simultaneous item bias test (SIBTEST) method regression procedure and the differential item functioning (DIF)-free-then-DIF strategy are applied to the logistic regression (LR) method simultaneously in this study. These procedures are used to adjust the effects of matching true score on observed score and to better control the Type I error…

  11. Analysis of quality of life and influencing factors in 197 Chinese patients with port-wine stains

    PubMed Central

    Wang, Juan; Zhu, Yu-you; Wang, Zhong-ying; Yao, Xiu-hua; Zhang, Lan-fang; Lv, Hong; Zhang, Si-ping; Hu, Bai

    2017-01-01

    Abstract Port-wine stains (PWS) are congenital capillary malformations, usually occurring on the face, neck, and other exposed parts of the skin, that have a serious psychological and social impact on the patient. Most researchers focus on the treatment of PWS, but the quality of life (QoL) of PWS patients is seldom researched. The objective of this study is to evaluate the QoL of patients with PWS on exposed parts and explore the factors influencing the QoL of PWS patients. The QoL of 197 cases with PWS on exposed parts was prospectively studied using the Dermatology Life Quality Index questionnaire (DLQI), and the factors influencing the patients' QoL were analyzed by single-factor analysis and multiple-factor logistic regression analysis. The reliability and validity of the DLQI for assessing the QoL of PWS patients were then evaluated. A total of 197 valid questionnaires were collected. The DLQI scores in PWS cases ranged from 2 to 16, with 2 to 5 in 52.29% (103/197), 6 to 10 in 42.13% (83/197), and 11 to 20 in 5.58% (11/197). The main score elements of the DLQI focused on symptoms and feelings, daily activities, and social entertainment. Single-factor analysis and multiple-factor logistic regression analysis showed that the main influencing factors were female sex, skin hypertrophy, and lesion area >30 cm². The inter-item correlation averaged 47.46% and the Cronbach α was 0.740, indicating high internal consistency. Correlation of the 6 dimensions of the DLQI questionnaires with the total scores showed that the Spearman correlation coefficient r ranged from 0.550 to 0.782 (P < .001), with symptoms and feelings having a correlation coefficient of 0.782 and a high correlation with total scores. This study shows that PWS has a mild to moderate influence on the QoL of most patients, mainly on daily activities, social entertainment, and feelings. PMID:29390578
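
For readers unfamiliar with the internal-consistency statistic, the Python sketch below computes Cronbach's α from a respondents-by-items score table using the standard formula α = k/(k-1) · (1 - Σ item variances / variance of total scores). The toy responses are invented for illustration and are unrelated to the study data.

```python
from statistics import pvariance


def cronbach_alpha(rows):
    """rows: one list of item scores per respondent (all rows the same length)."""
    k = len(rows[0])
    item_variances = [pvariance([row[i] for row in rows]) for i in range(k)]
    total_variance = pvariance([sum(row) for row in rows])
    return k / (k - 1) * (1 - sum(item_variances) / total_variance)


# Invented responses: three respondents, two perfectly consistent items.
toy_rows = [[1, 1], [2, 2], [3, 3]]
alpha = cronbach_alpha(toy_rows)
print(alpha)
```

Population and sample variances give the same α because the normalizing factor cancels in the ratio.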

  12. Comparing performances of logistic regression and neural networks for predicting melatonin excretion patterns in the rat exposed to ELF magnetic fields.

    PubMed

    Jahandideh, Samad; Abdolmaleki, Parviz; Movahedi, Mohammad Mehdi

    2010-02-01

    Various studies have been reported on the bioeffects of magnetic field exposure; however, no consensus or guideline is yet available for experimental designs relating to exposure conditions. In this study, logistic regression (LR) and artificial neural networks (ANNs) were used to analyze and predict the melatonin excretion patterns in the rat exposed to extremely low frequency magnetic fields (ELF-MF). Subsequently, on a database containing 33 experiments, the performances of LR and ANNs were compared through resubstitution and jackknife tests. The predictor variables were the more effective parameters and included frequency, polarization, exposure duration, and strength of magnetic fields. Five performance measures, namely accuracy, sensitivity, specificity, Matthews correlation coefficient (MCC) and normalized percentage better than random (S), were used to evaluate the performance of the models. LR, as a conventional model, obtained poor prediction performance. Nonetheless, LR distinguished the duration of magnetic fields as a statistically significant parameter. Also, horizontal polarization of magnetic fields, which had the largest logit coefficient (or parameter estimate) with a negative sign, was found to be the strongest indicator for experimental designs relating to exposure conditions. This means that each experiment with horizontal polarization of magnetic fields has a higher probability of resulting in a "not changed melatonin level" pattern. On the other hand, ANNs, a more powerful model that had not previously been introduced for predicting melatonin excretion patterns in the rat exposed to ELF-MF, showed high performance measure values and higher reliability, notably obtaining an MCC of 0.55 in jackknife tests. The results showed that such predictor models are promising and may play a useful role in defining guidelines for experimental designs relating to exposure conditions. In conclusion, analysis of the bioelectromagnetic data could result in finding a relationship between electromagnetic fields and different biological processes. (c) 2009 Wiley-Liss, Inc.
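
The Matthews correlation coefficient mentioned above can be computed directly from a 2x2 confusion matrix. A minimal Python sketch (the counts in the usage lines are made up, and returning 0.0 for an empty margin is a convention chosen here):

```python
from math import sqrt


def mcc(tp, tn, fp, fn):
    """Matthews correlation coefficient from confusion-matrix counts."""
    denominator = sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    # Convention assumed here: return 0.0 when any margin of the table is empty.
    return (tp * tn - fp * fn) / denominator if denominator else 0.0


print(mcc(tp=10, tn=10, fp=0, fn=0))  # perfect agreement
print(mcc(tp=5, tn=5, fp=5, fn=5))    # no better than chance
```

MCC ranges from -1 to 1, which makes it easier to read than accuracy when the two outcome classes are unbalanced.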

  13. DTI measures identify mild and moderate TBI cases among patients with complex health problems: A receiver operating characteristic analysis of U.S. veterans.

    PubMed

    Main, Keith L; Soman, Salil; Pestilli, Franco; Furst, Ansgar; Noda, Art; Hernandez, Beatriz; Kong, Jennifer; Cheng, Jauhtai; Fairchild, Jennifer K; Taylor, Joy; Yesavage, Jerome; Wesson Ashford, J; Kraemer, Helena; Adamson, Maheen M

    2017-01-01

    Standard MRI methods are often inadequate for identifying mild traumatic brain injury (TBI). Advances in diffusion tensor imaging now provide potential biomarkers of TBI among white matter fascicles (tracts). However, it is still unclear which tracts are most pertinent to TBI diagnosis. This study ranked fiber tracts on their ability to discriminate patients with and without TBI. We acquired diffusion tensor imaging data from military veterans admitted to a polytrauma clinic (Overall n = 109; Age: M = 47.2, SD = 11.3; Male: 88%; TBI: 67%). TBI diagnosis was based on self-report and neurological examination. Fiber tractography analysis produced 20 fiber tracts per patient. Each tract yielded four clinically relevant measures (fractional anisotropy, mean diffusivity, radial diffusivity, and axial diffusivity). We applied receiver operating characteristic (ROC) analyses to identify the most diagnostic tract for each measure. The analyses produced an optimal cutpoint for each tract. We then used kappa coefficients to rate the agreement of each cutpoint with the neurologist's diagnosis. The tract with the highest kappa was most diagnostic. As a check on the ROC results, we performed a stepwise logistic regression on each measure using all 20 tracts as predictors. We also bootstrapped the ROC analyses to compute the 95% confidence intervals for sensitivity, specificity, and the highest kappa coefficients. The ROC analyses identified two fiber tracts as most diagnostic of TBI: the left cingulum (LCG) and the left inferior fronto-occipital fasciculus (LIF). Like ROC, logistic regression identified LCG as most predictive for the FA measure but identified the right anterior thalamic tract (RAT) for the MD, RD, and AD measures. These findings are potentially relevant to the development of TBI biomarkers. Our methods also demonstrate how ROC analysis may be used to identify clinically relevant variables in the TBI population.
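
The agreement step can be illustrated with Cohen's kappa for a 2x2 table of cutpoint classification against the reference diagnosis. The Python sketch below is a generic illustration, not the authors' code, and the counts are invented.

```python
def cohen_kappa(tp, fp, fn, tn):
    """Cohen's kappa for a binary test versus a binary reference diagnosis."""
    n = tp + fp + fn + tn
    observed = (tp + tn) / n
    # Chance agreement from the row and column margins of the 2x2 table.
    expected = ((tp + fp) * (tp + fn) + (fn + tn) * (fp + tn)) / n ** 2
    return (observed - expected) / (1 - expected)


# Invented counts for one candidate cutpoint.
kappa = cohen_kappa(tp=40, fp=10, fn=5, tn=45)
print(round(kappa, 3))
```

Repeating this for each tract's cutpoint and keeping the largest kappa mirrors the ranking rule described in the abstract.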

  14. Relationships between Family Levels of Socioeconomic Status and Distribution of Breast Cancer Risk Factors

    PubMed Central

    Mohaghegh, Pegah; Yavari, Parvin; Akbari, Mohammad Esmaeil; Abadi, Alireza; Ahmadi, Farzaneh; Shormeij, Zeinab

    2015-01-01

    Background Not only the expanding development of knowledge for reducing risk factors, but also improvements in early diagnosis and treatment of cancer, as well as socioeconomic inequalities, could affect cancer incidence, diagnosis stage, and mortality. The aim of this study was to investigate the relationships between family levels of socioeconomic status and the distribution of breast cancer risk factors. Methods This descriptive cross-sectional study was conducted on 526 patients with breast cancer who were registered in the Cancer Research Center of Shahid Beheshti University of Medical Sciences from March 2008 to December 2013. A reliable and valid questionnaire about family levels of socioeconomic status was completed by interviewing the patients by phone. For data analysis, multinomial logistic regression, the Kendall tau-b correlation coefficient and contingency coefficient tests were run in SPSS 19. Results The mean age of the patients was 48.30 (SD=11.41). According to the results of this study, there was a significant relationship between family socioeconomic status and patient's age at diagnosis of breast cancer (p value<0.001). Also, the relationships between socioeconomic status and number of pregnancies, and duration of breast feeding were significant (p value> 0.001). In the multiple logistic regressions, the relationship between excellent socioeconomic status and number of abortions was significant (p value> 0.007). Furthermore, the relationships between moderate and good socioeconomic statuses and smoking were significant (p value=0.05 and p value=0.02, respectively). Conclusion The results indicated that among patients with better socioeconomic status, age at cancer diagnosis, number of pregnancies and duration of breast feeding were lower, and the number of abortions was higher than among the others. According to the results of this study, it is important to focus on family socioeconomic status as a critical and effective variable for breast cancer risk factors among Iranian women. PMID:25821572

  15. Relationships between Family Levels of Socioeconomic Status and Distribution of Breast Cancer Risk Factors.

    PubMed

    Mohaghegh, Pegah; Yavari, Parvin; Akbari, Mohammad Esmaeil; Abadi, Alireza; Ahmadi, Farzaneh; Shormeij, Zeinab

    2015-01-01

    Not only the expanding development of knowledge for reducing risk factors, but also improvements in early diagnosis and treatment of cancer, as well as socioeconomic inequalities, could affect cancer incidence, diagnosis stage, and mortality. The aim of this study was to investigate the relationships between family levels of socioeconomic status and the distribution of breast cancer risk factors. This descriptive cross-sectional study was conducted on 526 patients with breast cancer who were registered in the Cancer Research Center of Shahid Beheshti University of Medical Sciences from March 2008 to December 2013. A reliable and valid questionnaire about family levels of socioeconomic status was completed by interviewing the patients by phone. For data analysis, multinomial logistic regression, the Kendall tau-b correlation coefficient and contingency coefficient tests were run in SPSS 19. The mean age of the patients was 48.30 (SD=11.41). According to the results of this study, there was a significant relationship between family socioeconomic status and patient's age at diagnosis of breast cancer (p value<0.001). Also, the relationships between socioeconomic status and number of pregnancies, and duration of breast feeding were significant (p value> 0.001). In the multiple logistic regressions, the relationship between excellent socioeconomic status and number of abortions was significant (p value> 0.007). Furthermore, the relationships between moderate and good socioeconomic statuses and smoking were significant (p value=0.05 and p value=0.02, respectively). The results indicated that among patients with better socioeconomic status, age at cancer diagnosis, number of pregnancies and duration of breast feeding were lower, and the number of abortions was higher than among the others. According to the results of this study, it is important to focus on family socioeconomic status as a critical and effective variable for breast cancer risk factors among Iranian women.

  16. Differentiation of orbital lymphoma and idiopathic orbital inflammatory pseudotumor: combined diagnostic value of conventional MRI and histogram analysis of ADC maps.

    PubMed

    Ren, Jiliang; Yuan, Ying; Wu, Yingwei; Tao, Xiaofeng

    2018-05-02

    The overlap of morphological features and mean ADC values has restricted the clinical application of MRI in the differential diagnosis of orbital lymphoma and idiopathic orbital inflammatory pseudotumor (IOIP). In this paper, we aimed to retrospectively evaluate the combined diagnostic value of conventional magnetic resonance imaging (MRI) and whole-tumor histogram analysis of apparent diffusion coefficient (ADC) maps in the differentiation of the two lesions. In total, 18 patients with orbital lymphoma and 22 patients with IOIP, who underwent both conventional MRI and diffusion weighted imaging before treatment, were included. Conventional MRI features and histogram parameters derived from ADC maps, including mean ADC (ADCmean), median ADC (ADCmedian), skewness, kurtosis, and the 10th, 25th, 75th and 90th percentiles of ADC (ADC10, ADC25, ADC75, ADC90), were evaluated and compared between orbital lymphoma and IOIP. Multivariate logistic regression analysis was used to identify the most valuable variables for discrimination. A differential model was built upon the selected variables, and receiver operating characteristic (ROC) analysis was performed to determine the differential ability of the model. Multivariate logistic regression showed that ADC10 (P = 0.023) and involvement of the orbital preseptal space (P = 0.029) were the most promising indexes in the discrimination of orbital lymphoma and IOIP. The logistic model defined by ADC10 and involvement of the orbital preseptal space achieved an AUC of 0.939, with a sensitivity of 77.30% and a specificity of 94.40%. The conventional MRI feature of involvement of the orbital preseptal space and the ADC histogram parameter ADC10 are valuable in the differential diagnosis of orbital lymphoma and IOIP.
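
The AUC reported for a logistic model can be read as the probability that a randomly chosen positive case receives a higher predicted score than a randomly chosen negative case. The Python sketch below computes it that way for small samples; the scores are invented and the function name is chosen here for illustration.

```python
def auc(scores_pos, scores_neg):
    """Share of positive/negative pairs ranked correctly (ties count one half)."""
    wins = sum((p > n) + 0.5 * (p == n) for p in scores_pos for n in scores_neg)
    return wins / (len(scores_pos) * len(scores_neg))


# Invented model scores for two positive and two negative cases.
example_auc = auc([0.9, 0.4], [0.6, 0.1])
print(example_auc)
```

This pairwise form is equivalent to the area under the empirical ROC curve and avoids choosing a threshold, whereas sensitivity and specificity always refer to one specific cutoff.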

  17. Comparative multivariate analyses of transient otoacoustic emissions and distorsion products in normal and impaired hearing.

    PubMed

    Stamate, Mirela Cristina; Todor, Nicolae; Cosgarea, Marcel

    2015-01-01

    The clinical utility of otoacoustic emissions as a noninvasive objective test of cochlear function has been long studied. Both transient otoacoustic emissions and distorsion products can be used to identify hearing loss, but to what extent they can be used as predictors for hearing loss is still debated. Most studies agree that multivariate analyses have better test performances than univariate analyses. The aim of the study was to determine transient otoacoustic emissions and distorsion products performance in identifying normal and impaired hearing loss, using the pure tone audiogram as a gold standard procedure and different multivariate statistical approaches. The study included 105 adult subjects with normal hearing and hearing loss who underwent the same test battery: pure-tone audiometry, tympanometry, otoacoustic emission tests. We chose to use the logistic regression as a multivariate statistical technique. Three logistic regression models were developed to characterize the relations between different risk factors (age, sex, tinnitus, demographic features, cochlear status defined by otoacoustic emissions) and hearing status defined by pure-tone audiometry. The multivariate analyses allow the calculation of the logistic score, which is a combination of the inputs, weighted by coefficients, calculated within the analyses. The accuracy of each model was assessed using receiver operating characteristics curve analysis. We used the logistic score to generate receivers operating curves and to estimate the areas under the curves in order to compare different multivariate analyses. We compared the performance of each otoacoustic emission (transient, distorsion product) using three different multivariate analyses for each ear, when multi-frequency gold standards were used. We demonstrated that all multivariate analyses provided high values of the area under the curve proving the performance of the otoacoustic emissions. 
Each otoacoustic emission test presented high values of area under the curve, suggesting that implementing a multivariate approach to evaluate the performances of each otoacoustic emission test would serve to increase the accuracy in identifying the normal and impaired ears. We encountered the highest area under the curve value for the combined multivariate analysis suggesting that both otoacoustic emission tests should be used in assessing hearing status. Our multivariate analyses revealed that age is a constant predictor factor of the auditory status for both ears, but the presence of tinnitus was the most important predictor for the hearing level, only for the left ear. Age presented similar coefficients, but tinnitus coefficients, by their high value, produced the highest variations of the logistic scores, only for the left ear group, thus increasing the risk of hearing loss. We did not find gender differences between ears for any otoacoustic emission tests, but studies still debate this question as the results are contradictory. Neither gender, nor environment origin had any predictive value for the hearing status, according to the results of our study. Like any other audiological test, using otoacoustic emissions to identify hearing loss is not without error. Even when applying multivariate analysis, perfect test performance is never achieved. Although most studies demonstrated the benefit of using the multivariate analysis, it has not been incorporated into clinical decisions maybe because of the idiosyncratic nature of multivariate solutions or because of the lack of the validation studies.
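
The "logistic score" described above is the linear predictor of the model: an intercept plus the inputs weighted by their coefficients. A minimal Python sketch follows; the intercept, coefficients and subject values are hypothetical and are not estimates from the study.

```python
import math


def logistic_score(intercept, coefs, features):
    """Linear predictor: intercept plus coefficient-weighted inputs."""
    return intercept + sum(coefs[name] * features[name] for name in coefs)


def probability(score):
    """Map the logistic score to a predicted probability."""
    return 1.0 / (1.0 + math.exp(-score))


# Hypothetical values for illustration only.
coefs = {"age": 0.05, "tinnitus": 1.5}
subject = {"age": 60, "tinnitus": 1}
score = logistic_score(-3.0, coefs, subject)
print(score, probability(score))
```

Ranking subjects by this score, rather than by any single input, is what allows one ROC curve to be drawn per multivariate model.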

  18. Comparative multivariate analyses of transient otoacoustic emissions and distorsion products in normal and impaired hearing

    PubMed Central

    STAMATE, MIRELA CRISTINA; TODOR, NICOLAE; COSGAREA, MARCEL

    2015-01-01

    Background and aim The clinical utility of otoacoustic emissions as a noninvasive objective test of cochlear function has been long studied. Both transient otoacoustic emissions and distorsion products can be used to identify hearing loss, but to what extent they can be used as predictors for hearing loss is still debated. Most studies agree that multivariate analyses have better test performances than univariate analyses. The aim of the study was to determine transient otoacoustic emissions and distorsion products performance in identifying normal and impaired hearing loss, using the pure tone audiogram as a gold standard procedure and different multivariate statistical approaches. Methods The study included 105 adult subjects with normal hearing and hearing loss who underwent the same test battery: pure-tone audiometry, tympanometry, otoacoustic emission tests. We chose to use the logistic regression as a multivariate statistical technique. Three logistic regression models were developed to characterize the relations between different risk factors (age, sex, tinnitus, demographic features, cochlear status defined by otoacoustic emissions) and hearing status defined by pure-tone audiometry. The multivariate analyses allow the calculation of the logistic score, which is a combination of the inputs, weighted by coefficients, calculated within the analyses. The accuracy of each model was assessed using receiver operating characteristics curve analysis. We used the logistic score to generate receivers operating curves and to estimate the areas under the curves in order to compare different multivariate analyses. Results We compared the performance of each otoacoustic emission (transient, distorsion product) using three different multivariate analyses for each ear, when multi-frequency gold standards were used. We demonstrated that all multivariate analyses provided high values of the area under the curve proving the performance of the otoacoustic emissions. 
Each otoacoustic emission test presented high values of area under the curve, suggesting that implementing a multivariate approach to evaluate the performances of each otoacoustic emission test would serve to increase the accuracy in identifying the normal and impaired ears. We encountered the highest area under the curve value for the combined multivariate analysis suggesting that both otoacoustic emission tests should be used in assessing hearing status. Our multivariate analyses revealed that age is a constant predictor factor of the auditory status for both ears, but the presence of tinnitus was the most important predictor for the hearing level, only for the left ear. Age presented similar coefficients, but tinnitus coefficients, by their high value, produced the highest variations of the logistic scores, only for the left ear group, thus increasing the risk of hearing loss. We did not find gender differences between ears for any otoacoustic emission tests, but studies still debate this question as the results are contradictory. Neither gender, nor environment origin had any predictive value for the hearing status, according to the results of our study. Conclusion Like any other audiological test, using otoacoustic emissions to identify hearing loss is not without error. Even when applying multivariate analysis, perfect test performance is never achieved. Although most studies demonstrated the benefit of using the multivariate analysis, it has not been incorporated into clinical decisions maybe because of the idiosyncratic nature of multivariate solutions or because of the lack of the validation studies. PMID:26733749

  19. Access disparities to Magnet hospitals for patients undergoing neurosurgical operations

    PubMed Central

    Missios, Symeon; Bekelis, Kimon

    2017-01-01

    Background Centers of excellence focusing on quality improvement have demonstrated superior outcomes for a variety of surgical interventions. We investigated the presence of access disparities to hospitals recognized by the Magnet Recognition Program of the American Nurses Credentialing Center (ANCC) for patients undergoing neurosurgical operations. Methods We performed a cohort study of all neurosurgery patients who were registered in the New York Statewide Planning and Research Cooperative System (SPARCS) database from 2009–2013. We examined the association of African-American race and lack of insurance with Magnet status hospitalization for neurosurgical procedures. A mixed effects propensity adjusted multivariable regression analysis was used to control for confounding. Results During the study period, 190,535 neurosurgical patients met the inclusion criteria. Using a multivariable logistic regression, we demonstrate that African-Americans had lower admission rates to Magnet institutions (OR 0.62; 95% CI, 0.58–0.67). This persisted in a mixed effects logistic regression model (OR 0.77; 95% CI, 0.70–0.83) to adjust for clustering at the patient county level, and a propensity score adjusted logistic regression model (OR 0.75; 95% CI, 0.69–0.82). Additionally, lack of insurance was associated with lower admission rates to Magnet institutions (OR 0.71; 95% CI, 0.68–0.73), in a multivariable logistic regression model. This persisted in a mixed effects logistic regression model (OR 0.72; 95% CI, 0.69–0.74), and a propensity score adjusted logistic regression model (OR 0.72; 95% CI, 0.69–0.75). Conclusions Using a comprehensive all-payer cohort of neurosurgery patients in New York State we identified an association of African-American race and lack of insurance with lower rates of admission to Magnet hospitals. PMID:28684152
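
When only an odds ratio and its 95% confidence interval are reported, the underlying coefficient and its standard error can be approximately recovered, assuming a Wald-type interval that is symmetric on the log scale. A Python sketch using the first OR quoted in the abstract (the function name is chosen here for illustration):

```python
import math


def log_or_and_se(odds_ratio, ci_low, ci_high, z=1.96):
    """Return (beta, standard error), assuming a Wald interval on the log-odds scale."""
    beta = math.log(odds_ratio)
    se = (math.log(ci_high) - math.log(ci_low)) / (2 * z)
    return beta, se


# OR 0.62 (95% CI, 0.58 to 0.67) as quoted in the abstract.
beta, se = log_or_and_se(0.62, 0.58, 0.67)
print(round(beta, 3), round(se, 3))
```

Because published ORs and limits are rounded, the recovered standard error is only approximate.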

  20. On the Occurrence of Standardized Regression Coefficients Greater than One.

    ERIC Educational Resources Information Center

    Deegan, John, Jr.

    1978-01-01

    It is demonstrated here that standardized regression coefficients greater than one can legitimately occur. Furthermore, the relationship between the occurrence of such coefficients and the extent of multicollinearity present among the set of predictor variables in an equation is examined. Comments on the interpretation of these coefficients are…
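
A two-predictor example makes the point concrete. With standardized variables, the coefficients follow from the correlations as β1 = (r1y - r12·r2y)/(1 - r12²), and similarly for β2. The correlations in the Python sketch below are invented, but they form a valid correlation matrix and yield a coefficient above one.

```python
def standardized_betas(r1y, r2y, r12):
    """Standardized coefficients for two predictors from their correlations."""
    denominator = 1 - r12 ** 2
    return (r1y - r12 * r2y) / denominator, (r2y - r12 * r1y) / denominator


# Invented correlations: the two predictors are highly collinear (r12 = 0.9).
b1, b2 = standardized_betas(r1y=0.8, r2y=0.5, r12=0.9)
r_squared = b1 * 0.8 + b2 * 0.5
print(round(b1, 3), round(b2, 3), round(r_squared, 3))
```

The first coefficient exceeds one while R² stays below one, which is consistent with the abstract's link between such coefficients and multicollinearity.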

  1. On the use and misuse of scalar scores of confounders in design and analysis of observational studies.

    PubMed

    Pfeiffer, R M; Riedl, R

    2015-08-15

    We assess the asymptotic bias of estimates of exposure effects conditional on covariates when summary scores of confounders, instead of the confounders themselves, are used to analyze observational data. First, we study regression models for cohort data that are adjusted for summary scores. Second, we derive the asymptotic bias for case-control studies when cases and controls are matched on a summary score, and then analyzed either using conditional logistic regression or by unconditional logistic regression adjusted for the summary score. Two scores, the propensity score (PS) and the disease risk score (DRS) are studied in detail. For cohort analysis, when regression models are adjusted for the PS, the estimated conditional treatment effect is unbiased only for linear models, or at the null for non-linear models. Adjustment of cohort data for DRS yields unbiased estimates only for linear regression; all other estimates of exposure effects are biased. Matching cases and controls on DRS and analyzing them using conditional logistic regression yields unbiased estimates of exposure effect, whereas adjusting for the DRS in unconditional logistic regression yields biased estimates, even under the null hypothesis of no association. Matching cases and controls on the PS yield unbiased estimates only under the null for both conditional and unconditional logistic regression, adjusted for the PS. We study the bias for various confounding scenarios and compare our asymptotic results with those from simulations with limited sample sizes. To create realistic correlations among multiple confounders, we also based simulations on a real dataset. Copyright © 2015 John Wiley & Sons, Ltd.

  2. Estimating regression coefficients from clustered samples: Sampling errors and optimum sample allocation

    NASA Technical Reports Server (NTRS)

    Kalton, G.

    1983-01-01

    A number of surveys were conducted to study the relationship between the level of aircraft or traffic noise exposure experienced by people living in a particular area and their annoyance with it. These surveys generally employ a clustered sample design which affects the precision of the survey estimates. Regression analysis of annoyance on noise measures and other variables is often an important component of the survey analysis. Formulae are presented for estimating the standard errors of regression coefficients and ratio of regression coefficients that are applicable with a two- or three-stage clustered sample design. Using a simple cost function, they also determine the optimum allocation of the sample across the stages of the sample design for the estimation of a regression coefficient.

  3. Simple, Efficient Estimators of Treatment Effects in Randomized Trials Using Generalized Linear Models to Leverage Baseline Variables

    PubMed Central

    Rosenblum, Michael; van der Laan, Mark J.

    2010-01-01

    Models, such as logistic regression and Poisson regression models, are often used to estimate treatment effects in randomized trials. These models leverage information in variables collected before randomization, in order to obtain more precise estimates of treatment effects. However, there is the danger that model misspecification will lead to bias. We show that certain easy to compute, model-based estimators are asymptotically unbiased even when the working model used is arbitrarily misspecified. Furthermore, these estimators are locally efficient. As a special case of our main result, we consider a simple Poisson working model containing only main terms; in this case, we prove the maximum likelihood estimate of the coefficient corresponding to the treatment variable is an asymptotically unbiased estimator of the marginal log rate ratio, even when the working model is arbitrarily misspecified. This is the log-linear analog of ANCOVA for linear models. Our results demonstrate one application of targeted maximum likelihood estimation. PMID:20628636

  4. [Application of SAS macro to evaluated multiplicative and additive interaction in logistic and Cox regression in clinical practices].

    PubMed

    Nie, Z Q; Ou, Y Q; Zhuang, J; Qu, Y J; Mai, J Z; Chen, J M; Liu, X Q

    2016-05-01

    Conditional and unconditional logistic regression analyses are commonly used in case-control studies, while the Cox proportional hazards model is often used in survival data analysis. Most of the literature refers only to main-effect models; however, generalized linear models differ from general linear models, and interaction comprises multiplicative interaction and additive interaction. The former has only statistical significance, whereas the latter has biological significance. In this paper, macros were written in SAS 9.4 to calculate the contrast ratio, the attributable proportion due to interaction and the synergy index alongside the interaction terms of logistic and Cox regression, and Wald, delta and profile-likelihood confidence intervals were used to evaluate additive interaction, as a reference for big data analysis in clinical epidemiology and for the analysis of genetic multiplicative and additive interactions.
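
    The three additive-interaction measures named in this abstract can be sketched in a few lines. This is not the authors' SAS macro; it is a minimal Python illustration of the standard textbook definitions, and it assumes that "contrast ratio" means the relative excess risk due to interaction (RERI). The function names and all coefficient values are hypothetical.

```python
import math

def relative_risks_from_logit(b1, b2, b3):
    """Odds ratios (used as relative-risk approximations) implied by a
    logistic model with two binary exposures and their product term:
    logit(p) = b0 + b1*A + b2*B + b3*A*B."""
    rr10 = math.exp(b1)            # exposed to A only
    rr01 = math.exp(b2)            # exposed to B only
    rr11 = math.exp(b1 + b2 + b3)  # exposed to both
    return rr10, rr01, rr11

def additive_interaction(rr10, rr01, rr11):
    """Additive-interaction measures relative to the doubly unexposed group."""
    reri = rr11 - rr10 - rr01 + 1.0                   # relative excess risk due to interaction
    ap = reri / rr11                                  # attributable proportion due to interaction
    s = (rr11 - 1.0) / ((rr10 - 1.0) + (rr01 - 1.0))  # synergy index
    return reri, ap, s

# Hypothetical effect estimates, for illustration only.
reri, ap, s = additive_interaction(rr10=2.0, rr01=3.0, rr11=7.0)
print(reri, round(ap, 3), s)
```

    RERI = 0, AP = 0 and S = 1 correspond to no additive interaction. Confidence intervals (Wald, delta or profile likelihood, as in the abstract) are not covered by this sketch.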

  5. An order insertion scheduling model of logistics service supply chain considering capacity and time factors.

    PubMed

    Liu, Weihua; Yang, Yi; Wang, Shuqing; Liu, Yang

    2014-01-01

    Order insertion often occurs in the scheduling process of logistics service supply chain (LSSC), which disturbs normal time scheduling especially in the environment of mass customization logistics service. This study analyses order similarity coefficient and order insertion operation process and then establishes an order insertion scheduling model of LSSC with service capacity and time factors considered. This model aims to minimize the average unit volume operation cost of logistics service integrator and maximize the average satisfaction degree of functional logistics service providers. In order to verify the viability and effectiveness of our model, a specific example is numerically analyzed. Some interesting conclusions are obtained. First, along with the increase of completion time delay coefficient permitted by customers, the possible inserting order volume first increases and then trends to be stable. Second, supply chain performance reaches the best when the volume of inserting order is equal to the surplus volume of the normal operation capacity in mass service process. Third, the larger the normal operation capacity in mass service process is, the bigger the possible inserting order's volume will be. Moreover, compared to increasing the completion time delay coefficient, improving the normal operation capacity of mass service process is more useful.

  6. Predicting location of recurrence using FDG, FLT, and Cu-ATSM PET in canine sinonasal tumors treated with radiotherapy

    NASA Astrophysics Data System (ADS)

    Bradshaw, Tyler; Fu, Rau; Bowen, Stephen; Zhu, Jun; Forrest, Lisa; Jeraj, Robert

    2015-07-01

    Dose painting relies on the ability of functional imaging to identify resistant tumor subvolumes to be targeted for additional boosting. This work assessed the ability of FDG, FLT, and Cu-ATSM PET imaging to predict the locations of residual FDG PET in canine tumors following radiotherapy. Nineteen canines with spontaneous sinonasal tumors underwent PET/CT imaging with radiotracers FDG, FLT, and Cu-ATSM prior to hypofractionated radiotherapy. Therapy consisted of 10 fractions of 4.2 Gy to the sinonasal cavity with or without an integrated boost of 0.8 Gy to the GTV. Patients had an additional FLT PET/CT scan after fraction 2, a Cu-ATSM PET/CT scan after fraction 3, and follow-up FDG PET/CT scans after radiotherapy. Following image registration, simple and multiple linear and logistic voxel regressions were performed to assess how well pre- and mid-treatment PET imaging predicted post-treatment FDG uptake. R² and pseudo R² were used to assess the goodness of fits. For simple linear regression models, regression coefficients for all pre- and mid-treatment PET images were significantly positive across the population (P < 0.05). However, there was large variability among patients in goodness of fits: R² ranged from 0.00 to 0.85, with a median of 0.12. Results for logistic regression models were similar. Multiple linear regression models resulted in better fits (median R² = 0.31), but there was still large variability between patients in R². The R² from regression models for different predictor variables were highly correlated across patients (R ≈ 0.8), indicating tumors that were poorly predicted with one tracer were also poorly predicted by other tracers. In conclusion, the high inter-patient variability in goodness of fits indicates that PET was able to predict locations of residual tumor in some patients, but not others. This suggests not all patients would be good candidates for dose painting based on a single biological target.

  7. Predicting location of recurrence using FDG, FLT, and Cu-ATSM PET in canine sinonasal tumors treated with radiotherapy.

    PubMed

    Bradshaw, Tyler; Fu, Rau; Bowen, Stephen; Zhu, Jun; Forrest, Lisa; Jeraj, Robert

    2015-07-07

    Dose painting relies on the ability of functional imaging to identify resistant tumor subvolumes to be targeted for additional boosting. This work assessed the ability of FDG, FLT, and Cu-ATSM PET imaging to predict the locations of residual FDG PET in canine tumors following radiotherapy. Nineteen canines with spontaneous sinonasal tumors underwent PET/CT imaging with radiotracers FDG, FLT, and Cu-ATSM prior to hypofractionated radiotherapy. Therapy consisted of 10 fractions of 4.2 Gy to the sinonasal cavity with or without an integrated boost of 0.8 Gy to the GTV. Patients had an additional FLT PET/CT scan after fraction 2, a Cu-ATSM PET/CT scan after fraction 3, and follow-up FDG PET/CT scans after radiotherapy. Following image registration, simple and multiple linear and logistic voxel regressions were performed to assess how well pre- and mid-treatment PET imaging predicted post-treatment FDG uptake. R(2) and pseudo R(2) were used to assess the goodness of fits. For simple linear regression models, regression coefficients for all pre- and mid-treatment PET images were significantly positive across the population (P < 0.05). However, there was large variability among patients in goodness of fits: R(2) ranged from 0.00 to 0.85, with a median of 0.12. Results for logistic regression models were similar. Multiple linear regression models resulted in better fits (median R(2) = 0.31), but there was still large variability between patients in R(2). The R(2) from regression models for different predictor variables were highly correlated across patients (R ≈ 0.8), indicating tumors that were poorly predicted with one tracer were also poorly predicted by other tracers. In conclusion, the high inter-patient variability in goodness of fits indicates that PET was able to predict locations of residual tumor in some patients, but not others. This suggests not all patients would be good candidates for dose painting based on a single biological target.

  8. An empirical comparison of methods for analyzing correlated data from a discrete choice survey to elicit patient preference for colorectal cancer screening

    PubMed Central

    2012-01-01

    Background A discrete choice experiment (DCE) is a preference survey which asks participants to make a choice among product portfolios comparing the key product characteristics by performing several choice tasks. Analyzing DCE data needs to account for within-participant correlation because choices from the same participant are likely to be similar. In this study, we empirically compared some commonly-used statistical methods for analyzing DCE data while accounting for within-participant correlation based on a survey of patient preference for colorectal cancer (CRC) screening tests conducted in Hamilton, Ontario, Canada in 2002. Methods A two-stage DCE design was used to investigate the impact of six attributes on participants' preferences for CRC screening test and willingness to undertake the test. We compared six models for clustered binary outcomes (logistic and probit regressions using cluster-robust standard error (SE), random-effects and generalized estimating equation approaches) and three models for clustered nominal outcomes (multinomial logistic and probit regressions with cluster-robust SE and random-effects multinomial logistic model). We also fitted a bivariate probit model with cluster-robust SE treating the choices from two stages as two correlated binary outcomes. The rank of relative importance between attributes and the estimates of β coefficient within attributes were used to assess the model robustness. Results In total 468 participants with each completing 10 choices were analyzed. Similar results were reported for the rank of relative importance and β coefficients across models for stage-one data on evaluating participants' preferences for the test. The six attributes ranked from high to low as follows: cost, specificity, process, sensitivity, preparation and pain. However, the results differed across models for stage-two data on evaluating participants' willingness to undertake the tests. 
    Little within-patient correlation (ICC ≈ 0) was found in stage-one data, but substantial within-patient correlation existed (ICC = 0.659) in stage-two data. Conclusions When a small clustering effect was present in DCE data, results remained robust across statistical models. However, results varied when a larger clustering effect was present. Therefore, it is important to assess the robustness of the estimates via sensitivity analysis using different models for analyzing clustered data from DCE studies. PMID:22348526

  9. 4D-Fingerprint Categorical QSAR Models for Skin Sensitization Based on Classification Local Lymph Node Assay Measures

    PubMed Central

    Li, Yi; Tseng, Yufeng J.; Pan, Dahua; Liu, Jianzhong; Kern, Petra S.; Gerberick, G. Frank; Hopfinger, Anton J.

    2008-01-01

    Currently, the only validated methods to identify skin sensitization effects are in vivo models, such as the Local Lymph Node Assay (LLNA) and guinea pig studies. There is a tremendous need, in particular due to novel legislation, to develop animal alternatives, e.g., Quantitative Structure-Activity Relationship (QSAR) models. Here, QSAR models for skin sensitization using LLNA data have been constructed. The descriptors used to generate these models are derived from the 4D-molecular similarity paradigm and are referred to as universal 4D-fingerprints. A training set of 132 structurally diverse compounds and a test set of 15 structurally diverse compounds were used in this study. The statistical methodologies used to build the models are logistic regression (LR), and partial least square coupled logistic regression (PLS-LR), which prove to be effective tools for studying skin sensitization measures expressed in the two categorical terms of sensitizer and non-sensitizer. QSAR models with low values of the Hosmer-Lemeshow goodness-of-fit statistic, χ²_HL, are significant and predictive. For the training set, the cross-validated prediction accuracy of the logistic regression models ranges from 77.3% to 78.0%, while that of PLS-logistic regression models ranges from 87.1% to 89.4%. For the test set, the prediction accuracy of logistic regression models ranges from 80.0% to 86.7%, while that of PLS-logistic regression models ranges from 73.3% to 80.0%. The QSAR models are made up of 4D-fingerprints related to aromatic atoms, hydrogen bond acceptors and negatively partially charged atoms. PMID:17226934

  10. Comparison of intravesical prostatic protrusion, prostate volume and serum prostatic-specific antigen in the evaluation of bladder outlet obstruction.

    PubMed

    Lim, Kok Bin; Ho, Henry; Foo, Keong Tatt; Wong, Michael Yuet Chen; Fook-Chong, Stephanie

    2006-12-01

    The aims of this study were to define the relationship between intravesical prostatic protrusion (IPP), prostate-specific antigen (PSA) and prostate volume (PV) and to determine which one of them is the best predictor of bladder outlet obstruction (BOO) due to benign prostatic enlargement. A prospective study of 114 male patients older than 50 years examined between November 2001 and 2002 was performed. They were evaluated with digital rectal examination, International Prostate Symptoms Score, PSA, uroflowmetry, postvoid residual urine measurement, IPP and PV using transabdominal ultrasound scan. Statistical analysis included scatter plots with Spearman's correlation coefficients and nominal logistic regression. Prostate volume, IPP and PSA showed parallel correlation. Although all three indices had good correlation with BOO index, IPP was the best. The Spearman rho correlation coefficients were 0.314, 0.408 and 0.507 for PV, PSA and IPP, respectively. Using receiver-operator characteristic curves, the areas under the curve for PV, PSA and IPP were 0.637, 0.703 and 0.772, respectively. The positive predictive values of PV, PSA and IPP were 65%, 68% and 72%, respectively. Using a nominal regression model, IPP remained the most significant independent index to determine BOO. All three non-invasive indices correlate with one another. The study showed that IPP is a better predictor for BOO than PSA or PV.

  11. Classifying Volcanic Activity Using an Empirical Decision Making Algorithm

    NASA Astrophysics Data System (ADS)

    Junek, W. N.; Jones, W. L.; Woods, M. T.

    2012-12-01

    Detection and classification of developing volcanic activity is vital to eruption forecasting. Timely information regarding an impending eruption would aid civil authorities in determining the proper response to a developing crisis. In this presentation, volcanic activity is characterized using an event tree classifier and a suite of empirical statistical models derived through logistic regression. Forecasts are reported in terms of the United States Geological Survey (USGS) volcano alert level system. The algorithm employs multidisciplinary data (e.g., seismic, GPS, InSAR) acquired by various volcano monitoring systems and source modeling information to forecast the likelihood that an eruption, with a volcanic explosivity index (VEI) > 1, will occur within a quantitatively constrained area. Logistic models are constructed from a sparse and geographically diverse dataset assembled from a collection of historic volcanic unrest episodes. Bootstrapping techniques are applied to the training data to allow for the estimation of robust logistic model coefficients. Cross validation produced a series of receiver operating characteristic (ROC) curves with areas ranging between 0.78-0.81, which indicates the algorithm has good predictive capabilities. The ROC curves also allowed for the determination of a false positive rate and optimum detection for each stage of the algorithm. Forecasts for historic volcanic unrest episodes in North America and Iceland were computed and are consistent with the actual outcome of the events.

  12. MODELING SNAKE MICROHABITAT FROM RADIOTELEMETRY STUDIES USING POLYTOMOUS LOGISTIC REGRESSION

    EPA Science Inventory

    Multivariate analysis of snake microhabitat has historically used techniques that were derived under assumptions of normality and common covariance structure (e.g., discriminant function analysis, MANOVA). In this study, polytomous logistic regression (PLR which does not require ...

  13. Bayesian Estimation of Multivariate Latent Regression Models: Gauss versus Laplace

    ERIC Educational Resources Information Center

    Culpepper, Steven Andrew; Park, Trevor

    2017-01-01

    A latent multivariate regression model is developed that employs a generalized asymmetric Laplace (GAL) prior distribution for regression coefficients. The model is designed for high-dimensional applications where an approximate sparsity condition is satisfied, such that many regression coefficients are near zero after accounting for all the model…

  14. Modification of the Mantel-Haenszel and Logistic Regression DIF Procedures to Incorporate the SIBTEST Regression Correction

    ERIC Educational Resources Information Center

    DeMars, Christine E.

    2009-01-01

    The Mantel-Haenszel (MH) and logistic regression (LR) differential item functioning (DIF) procedures have inflated Type I error rates when there are large mean group differences, short tests, and large sample sizes. When there are large group differences in mean score, groups matched on the observed number-correct score differ on true score,…

  15. Satellite rainfall retrieval by logistic regression

    NASA Technical Reports Server (NTRS)

    Chiu, Long S.

    1986-01-01

    The potential use of logistic regression in rainfall estimation from satellite measurements is investigated. Satellite measurements provide covariate information in terms of radiances from different remote sensors. The logistic regression technique can effectively accommodate many covariates and test their significance in the estimation. The outcome from the logistic model is the probability that the rainrate of a satellite pixel is above a certain threshold. By varying the thresholds, a rainrate histogram can be obtained, from which the mean and the variance can be estimated. A logistic model is developed and applied to rainfall data collected during GATE, using as covariates the fractional rain area and a radiance measurement which is deduced from a microwave temperature-rainrate relation. It is demonstrated that the fractional rain area is an important covariate in the model, consistent with the use of the so-called Area Time Integral in estimating total rain volume in other studies. To calibrate the logistic model, simulated rain fields generated by rainfield models with prescribed parameters are needed. A stringent test of the logistic model is its ability to recover the prescribed parameters of simulated rain fields. A rain field simulation model which preserves the fractional rain area and lognormality of rainrates as found in GATE is developed. A stochastic regression model of branching and immigration whose solutions are lognormally distributed in some asymptotic limits has also been developed.
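
    The step from threshold probabilities to a rainrate histogram, and then to a mean and variance, can be sketched as follows. This is only an illustration of the idea described in the abstract, not the report's procedure: it assumes the exceedance probabilities have already been produced by fitted logistic models, it places each bin's mass at the bin midpoint, and the thresholds and probabilities shown are hypothetical.

```python
def histogram_from_exceedance(p_exceed):
    """Convert exceedance probabilities P(R > t_k), given for ascending
    thresholds t_0 < t_1 < ... < t_K, into probabilities for the bins
    (t_k, t_{k+1}]."""
    return [p_exceed[k] - p_exceed[k + 1] for k in range(len(p_exceed) - 1)]

def mean_and_variance(thresholds, bin_probs):
    """Approximate mean and variance of the rainrate, assuming (as a
    simplification) that each bin's probability sits at its midpoint."""
    mids = [(a + b) / 2.0 for a, b in zip(thresholds, thresholds[1:])]
    total = sum(bin_probs)
    mean = sum(m * p for m, p in zip(mids, bin_probs)) / total
    var = sum(p * (m - mean) ** 2 for m, p in zip(mids, bin_probs)) / total
    return mean, var

# Hypothetical thresholds (mm/h) and exceedance probabilities for one pixel.
thresholds = [0.0, 2.0, 4.0, 8.0]
p_exceed = [1.0, 0.5, 0.25, 0.0]

bin_probs = histogram_from_exceedance(p_exceed)
mean, var = mean_and_variance(thresholds, bin_probs)
print(bin_probs, mean, var)
```

    A finer threshold grid reduces the error introduced by the midpoint assumption.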

  16. Viability estimation of pepper seeds using time-resolved photothermal signal characterization

    NASA Astrophysics Data System (ADS)

    Kim, Ghiseok; Kim, Geon-Hee; Lohumi, Santosh; Kang, Jum-Soon; Cho, Byoung-Kwan

    2014-11-01

    We used an infrared thermal signal measurement system and photothermal signal and image reconstruction techniques for viability estimation of pepper seeds. Photothermal signals from healthy and aged seeds were measured for seven periods (24, 48, 72, 96, 120, 144, and 168 h) using an infrared camera and analyzed by a regression method. The photothermal signals were regressed using a two-term exponential decay curve with two amplitudes and two time variables (lifetimes) as regression coefficients. The regression coefficients of the fitted curve showed significant differences for each seed group, depending on the aging times. In addition, the viability of a single seed was estimated by imaging of its regression coefficient, which was reconstructed from the measured photothermal signals. The time-resolved photothermal characteristics, along with the regression coefficient images, can be used to discriminate the aged or dead pepper seeds from the healthy seeds.

  17. Practical Session: Logistic Regression

    NASA Astrophysics Data System (ADS)

    Clausel, M.; Grégoire, G.

    2014-12-01

    An exercise is proposed to illustrate the logistic regression. One investigates the different risk factors in the apparition of coronary heart disease. It has been proposed in Chapter 5 of the book of D.G. Kleinbaum and M. Klein, "Logistic Regression", Statistics for Biology and Health, Springer Science Business Media, LLC (2010) and also by D. Chessel and A.B. Dufour in Lyon 1 (see Sect. 6 of http://pbil.univ-lyon1.fr/R/pdf/tdr341.pdf). This example is based on data given in the file evans.txt coming from http://www.sph.emory.edu/dkleinb/logreg3.htm#data.

  18. The cross-validated AUC for MCP-logistic regression with high-dimensional data.

    PubMed

    Jiang, Dingfeng; Huang, Jian; Zhang, Ying

    2013-10-01

    We propose a cross-validated area under the receiving operator characteristic (ROC) curve (CV-AUC) criterion for tuning parameter selection for penalized methods in sparse, high-dimensional logistic regression models. We use this criterion in combination with the minimax concave penalty (MCP) method for variable selection. The CV-AUC criterion is specifically designed for optimizing the classification performance for binary outcome data. To implement the proposed approach, we derive an efficient coordinate descent algorithm to compute the MCP-logistic regression solution surface. Simulation studies are conducted to evaluate the finite sample performance of the proposed method and its comparison with the existing methods including the Akaike information criterion (AIC), Bayesian information criterion (BIC) or Extended BIC (EBIC). The model selected based on the CV-AUC criterion tends to have a larger predictive AUC and smaller classification error than those with tuning parameters selected using the AIC, BIC or EBIC. We illustrate the application of the MCP-logistic regression with the CV-AUC criterion on three microarray datasets from the studies that attempt to identify genes related to cancers. Our simulation studies and data examples demonstrate that the CV-AUC is an attractive method for tuning parameter selection for penalized methods in high-dimensional logistic regression models.

  19. Clinical management provided by board-certificated physiatrists in early rehabilitation is a significant determinant of functional improvement in acute stroke patients: a retrospective analysis of Japan rehabilitation database.

    PubMed

    Kinoshita, Shoji; Kakuda, Wataru; Momosaki, Ryo; Yamada, Naoki; Sugawara, Hidekazu; Watanabe, Shu; Abo, Masahiro

    2015-05-01

    Early rehabilitation for acute stroke patients is widely recommended. We tested the hypothesis that clinical outcome of stroke patients who receive early rehabilitation managed by board-certificated physiatrists (BCP) is generally better than that provided by other medical specialties. Data of stroke patients who underwent early rehabilitation in 19 acute hospitals between January 2005 and December 2013 were collected from the Japan Rehabilitation Database and analyzed retrospectively. Multivariate linear regression analysis using generalized estimating equations method was performed to assess the association between Functional Independence Measure (FIM) effectiveness and management provided by BCP in early rehabilitation. In addition, multivariate logistic regression analysis was also performed to assess the impact of management provided by BCP in acute phase on discharge destination. After setting the inclusion criteria, data of 3838 stroke patients were eligible for analysis. BCP provided early rehabilitation in 814 patients (21.2%). Both the duration of daily exercise time and the frequency of regular conferencing were significantly higher for patients managed by BCP than by other specialties. Although the mortality rate was not different, multivariate regression analysis showed that FIM effectiveness correlated significantly and positively with the management provided by BCP (coefficient, .35; 95% confidence interval [CI], .012-.059; P < .005). In addition, multivariate logistic analysis identified clinical management by BCP as a significant determinant of home discharge (odds ratio, 1.24; 95% CI, 1.08-1.44; P < .005). Our retrospective cohort study demonstrated that clinical management provided by BCP in early rehabilitation can lead to functional recovery of acute stroke. Copyright © 2015 National Stroke Association. Published by Elsevier Inc. All rights reserved.
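
    The reported odds ratio for home discharge can be related back to a logistic regression coefficient with a short calculation. The sketch below assumes the interval is a Wald interval that is symmetric on the log-odds scale, which the abstract does not state; the function name is hypothetical, and the inputs are the rounded figures quoted above.

```python
import math

def coef_from_odds_ratio(odds_ratio, ci_low, ci_high, z=1.959964):
    """Back out a logistic coefficient and its standard error from an odds
    ratio and its 95% CI, assuming a Wald interval on the log-odds scale."""
    beta = math.log(odds_ratio)
    se = (math.log(ci_high) - math.log(ci_low)) / (2.0 * z)
    return beta, se

# Odds ratio 1.24 with 95% CI 1.08-1.44, as quoted in the abstract.
beta, se = coef_from_odds_ratio(1.24, 1.08, 1.44)

wald_z = beta / se
# Two-sided p-value from the standard normal distribution.
p_value = math.erfc(abs(wald_z) / math.sqrt(2.0))
print(round(beta, 3), round(se, 3), round(wald_z, 2), p_value < 0.005)
```

    Because the published figures are rounded, the recovered coefficient and standard error are approximate.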

  20. Simple, validated vaginal birth after cesarean delivery prediction model for use at the time of admission.

    PubMed

    Metz, Torri D; Stoddard, Gregory J; Henry, Erick; Jackson, Marc; Holmgren, Calla; Esplin, Sean

    2013-09-01

    To create a simple tool for predicting the likelihood of successful trial of labor after cesarean delivery (TOLAC) during the pregnancy after a primary cesarean delivery using variables available at the time of admission. Data for all deliveries at 14 regional hospitals over an 8-year period were reviewed. Women with one cesarean delivery and one subsequent delivery were included. Variables associated with successful VBAC were identified using multivariable logistic regression. Points were assigned to these characteristics, with weighting based on the coefficients in the regression model to calculate an integer VBAC score. The VBAC score was correlated with TOLAC success rate and was externally validated in an independent cohort using a logistic regression model. A total of 5,445 women met inclusion criteria. Of those women, 1,170 (21.5%) underwent TOLAC. Of the women who underwent trial of labor, 938 (80%) had a successful VBAC. A VBAC score was generated based on the Bishop score (cervical examination) at the time of admission, with points added for history of vaginal birth, age younger than 35 years, absence of recurrent indication, and body mass index less than 30. Women with a VBAC score less than 10 had a likelihood of TOLAC success less than 50%. Women with a VBAC score more than 16 had a TOLAC success rate more than 85%. The model performed well in an independent cohort with an area under the curve of 0.80 (95% confidence interval 0.76-0.84). Prediction of TOLAC success at the time of admission is highly dependent on the initial cervical examination. This simple VBAC score can be utilized when counseling women considering TOLAC. II.

  1. Relationships among personality traits, metabolic syndrome, and metabolic syndrome scores: The Kakegawa cohort study.

    PubMed

    Ohseto, Hisashi; Ishikuro, Mami; Kikuya, Masahiro; Obara, Taku; Igarashi, Yuko; Takahashi, Satomi; Kikuchi, Daisuke; Shigihara, Michiko; Yamanaka, Chizuru; Miyashita, Masako; Mizuno, Satoshi; Nagai, Masato; Matsubara, Hiroko; Sato, Yuki; Metoki, Hirohito; Tachibana, Hirofumi; Maeda-Yamamoto, Mari; Kuriyama, Shinichi

    2018-04-01

    Metabolic syndrome and the presence of metabolic syndrome components are risk factors for cardiovascular disease (CVD). However, the association between personality traits and metabolic syndrome remains controversial, and few studies have been conducted in East Asian populations. We measured personality traits using the Japanese version of the Eysenck Personality Questionnaire (Revised Short Form) and five metabolic syndrome components (elevated waist circumference, elevated triglycerides, reduced high-density lipoprotein cholesterol, elevated blood pressure, and elevated fasting glucose) in 1322 participants aged 51.1 ± 12.7 years from Kakegawa city, Japan. Metabolic syndrome score (MS score) was defined as the number of metabolic syndrome components present, and metabolic syndrome as having an MS score of 3 or higher. We performed multiple logistic regression analyses to examine the relationship between personality traits and metabolic syndrome components and multiple regression analyses to examine the relationship between personality traits and MS scores adjusted for age, sex, education, income, smoking status, alcohol use, and family history of CVD and diabetes mellitus. We also examined the relationship between personality traits and metabolic syndrome presence by multiple logistic regression analyses. "Extraversion" scores were higher in those with metabolic syndrome components (elevated waist circumference: P=0.001; elevated triglycerides: P=0.01; elevated blood pressure: P=0.004; elevated fasting glucose: P=0.002). "Extraversion" was associated with the MS score (coefficient=0.12, P=0.0003). No personality trait was significantly associated with the presence of metabolic syndrome. Higher "extraversion" scores were related to higher MS scores, but no personality trait was significantly associated with the presence of metabolic syndrome. Copyright © 2018 Elsevier Inc. All rights reserved.
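
    The two outcome definitions used in this abstract (the MS score as a count of components, and metabolic syndrome as a score of 3 or higher) translate directly into code. The sketch below is illustrative only; the field names and the example participant are hypothetical, and the clinical cut-offs that decide whether each component is present are not given in the abstract and are not modelled here.

```python
# Hypothetical field names for the five components listed in the abstract.
COMPONENTS = (
    "elevated_waist_circumference",
    "elevated_triglycerides",
    "reduced_hdl_cholesterol",
    "elevated_blood_pressure",
    "elevated_fasting_glucose",
)

def ms_score(participant):
    """Number of metabolic syndrome components present (0 to 5)."""
    return sum(1 for name in COMPONENTS if participant[name])

def has_metabolic_syndrome(participant):
    """Metabolic syndrome is defined as an MS score of 3 or higher."""
    return ms_score(participant) >= 3

# Made-up participant with three of the five components present.
example = {
    "elevated_waist_circumference": True,
    "elevated_triglycerides": True,
    "reduced_hdl_cholesterol": False,
    "elevated_blood_pressure": True,
    "elevated_fasting_glucose": False,
}
print(ms_score(example), has_metabolic_syndrome(example))
```

    The count serves as the outcome of the multiple regression and the binary indicator as the outcome of the logistic regression.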

  2. Impact of parental-rearing styles on irritable bowel syndrome in adolescents: a school-based study.

    PubMed

    Xing, Zhouxiong; Hou, Xiaohua; Zhou, Kan; Qin, Diyuan; Pan, Wen

    2014-03-01

    A strong association between family function and irritable bowel syndrome (IBS) has been observed. Parental rearing styles, as a comprehensive mark for family function, may provide new clues to the etiology of IBS. This study aimed to explore which dimensions of parental rearing styles were risk factors or protective factors for IBS in adolescents. Two thousand three hundred twenty adolescents were recruited from one middle school and one high school randomly selected from Jiangan District (an urban district in Wuhan City). Data were collected using two Chinese versions of validated self-report questionnaires including the Rome III diagnostic criteria for pediatric IBS and the Egna Minnen Beträffande Uppfostran: One's Memories of Upbringing for perceived parental rearing styles. Ninety-six subjects diagnosed as pediatric IBS were compared with 1618 controls. The IBS patients reported less both paternal and maternal emotional warmth (all P < 0.01) and more both paternal and maternal punishment, overinterference, rejection, and overprotection (only for father) (all P < 0.01) than the controls. Furthermore, the IBS patients had higher total scores of parental rearing styles (all P < 0.001) than the controls. With univariate logistic regression, standardized regression coefficients and odds ratios of parental rearing variables were calculated. Multivariate logistic regression revealed that paternal rejection (P = 0.001) and maternal overinterference (P = 0.002) were independent risk factors for IBS in adolescents. Parental emotional warmth is a protective factor for IBS in adolescents and parental punishment, overinterference, rejection, and overprotection are risk factors for IBS in adolescents. © 2013 Journal of Gastroenterology and Hepatology Foundation and Wiley Publishing Asia Pty Ltd.

  3. A simple approach to power and sample size calculations in logistic regression and Cox regression models.

    PubMed

    Vaeth, Michael; Skovlund, Eva

    2004-06-15

    For a given regression problem it is possible to identify a suitably defined equivalent two-sample problem such that the power or sample size obtained for the two-sample problem also applies to the regression problem. For a standard linear regression model the equivalent two-sample problem is easily identified, but for generalized linear models and for Cox regression models the situation is more complicated. An approximately equivalent two-sample problem may, however, also be identified here. In particular, we show that for logistic regression and Cox regression models the equivalent two-sample problem is obtained by selecting two equally sized samples for which the parameters differ by a value equal to the slope times twice the standard deviation of the independent variable and further requiring that the overall expected number of events is unchanged. In a simulation study we examine the validity of this approach to power calculations in logistic regression and Cox regression models. Several different covariate distributions are considered for selected values of the overall response probability and a range of alternatives. For the Cox regression model we consider both constant and non-constant hazard rates. The results show that in general the approach is remarkably accurate even in relatively small samples. Some discrepancies are, however, found in small samples with few events and a highly skewed covariate distribution. Comparison with results based on alternative methods for logistic regression models with a single continuous covariate indicates that the proposed method is at least as good as its competitors. The method is easy to implement and therefore provides a simple way to extend the range of problems that can be covered by the usual formulas for power and sample size determination. Copyright 2004 John Wiley & Sons, Ltd.
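
    For logistic regression, the construction described above can be sketched numerically: find two equally sized groups whose log odds differ by the slope times twice the standard deviation of the covariate while the average event probability stays at the overall response probability, then apply a two-sample formula. The code is an illustration under stated assumptions, not the authors' implementation: the bisection solver, the choice of a common normal-approximation sample size formula for two proportions, the function names and the input values are all assumptions made here.

```python
import math
from statistics import NormalDist

def expit(x):
    return 1.0 / (1.0 + math.exp(-x))

def equivalent_two_sample(p_overall, slope, sd_x):
    """Return (p1, p2) for two equal groups whose log odds differ by
    2 * slope * sd_x and whose mean event probability equals p_overall."""
    d = 2.0 * slope * sd_x
    lo, hi = -30.0, 30.0
    for _ in range(200):  # bisection; the mean probability increases with a
        a = (lo + hi) / 2.0
        if (expit(a) + expit(a + d)) / 2.0 < p_overall:
            lo = a
        else:
            hi = a
    a = (lo + hi) / 2.0
    return expit(a), expit(a + d)

def n_per_group(p1, p2, alpha=0.05, power=0.80):
    """A common normal-approximation sample size for comparing two proportions."""
    z_a = NormalDist().inv_cdf(1.0 - alpha / 2.0)
    z_b = NormalDist().inv_cdf(power)
    p_bar = (p1 + p2) / 2.0
    num = (z_a * math.sqrt(2.0 * p_bar * (1.0 - p_bar))
           + z_b * math.sqrt(p1 * (1.0 - p1) + p2 * (1.0 - p2)))
    return math.ceil((num / (p2 - p1)) ** 2)

# Hypothetical design: 30% overall response, log odds ratio 0.5 per unit, SD 1.
p1, p2 = equivalent_two_sample(p_overall=0.30, slope=0.5, sd_x=1.0)
total_n = 2 * n_per_group(p1, p2)
print(round(p1, 4), round(p2, 4), total_n)
```

    The total of the two groups is then read as the sample size for the regression problem.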

  4. Predictors of High Profit and High Deficit Outliers under SwissDRG of a Tertiary Care Center

    PubMed Central

    Mehra, Tarun; Müller, Christian Thomas Benedikt; Volbracht, Jörk; Seifert, Burkhardt; Moos, Rudolf

    2015-01-01

    Principles: Case weights of Diagnosis Related Groups (DRGs) are determined by the average cost of cases from a previous billing period. However, a significant amount of cases are largely over- or underfunded. We therefore decided to analyze earning outliers of our hospital as to search for predictors enabling a better grouping under SwissDRG. Methods: 28,893 inpatient cases without additional private insurance discharged from our hospital in 2012 were included in our analysis. Outliers were defined by the interquartile range method. Predictors for deficit and profit outliers were determined with logistic regressions. Predictors were shortlisted with the LASSO regularized logistic regression method and compared to results of Random forest analysis. 10 of these parameters were selected for quantile regression analysis as to quantify their impact on earnings. Results: Psychiatric diagnosis and admission as an emergency case were significant predictors for higher deficit with negative regression coefficients for all analyzed quantiles (p<0.001). Admission from an external health care provider was a significant predictor for a higher deficit in all but the 90% quantile (p<0.001 for Q10, Q20, Q50, Q80 and p = 0.0017 for Q90). Burns predicted higher earnings for cases which were favorably remunerated (p<0.001 for the 90% quantile). Osteoporosis predicted a higher deficit in the most underfunded cases, but did not predict differences in earnings for balanced or profitable cases (Q10 and Q20: p<0.00, Q50: p = 0.10, Q80: p = 0.88 and Q90: p = 0.52). ICU stay, mechanical and patient clinical complexity level score (PCCL) predicted higher losses at the 10% quantile but also higher profits at the 90% quantile (p<0.001). Conclusion: We suggest considering psychiatric diagnosis, admission as an emergency case and admission from an external health care provider as DRG split criteria as they predict large, consistent and significant losses. PMID:26517545

  5. Predictors of High Profit and High Deficit Outliers under SwissDRG of a Tertiary Care Center.

    PubMed

    Mehra, Tarun; Müller, Christian Thomas Benedikt; Volbracht, Jörk; Seifert, Burkhardt; Moos, Rudolf

    2015-01-01

    Case weights of Diagnosis Related Groups (DRGs) are determined by the average cost of cases from a previous billing period. However, a significant amount of cases are largely over- or underfunded. We therefore decided to analyze earning outliers of our hospital as to search for predictors enabling a better grouping under SwissDRG. 28,893 inpatient cases without additional private insurance discharged from our hospital in 2012 were included in our analysis. Outliers were defined by the interquartile range method. Predictors for deficit and profit outliers were determined with logistic regressions. Predictors were shortlisted with the LASSO regularized logistic regression method and compared to results of Random forest analysis. 10 of these parameters were selected for quantile regression analysis as to quantify their impact on earnings. Psychiatric diagnosis and admission as an emergency case were significant predictors for higher deficit with negative regression coefficients for all analyzed quantiles (p<0.001). Admission from an external health care provider was a significant predictor for a higher deficit in all but the 90% quantile (p<0.001 for Q10, Q20, Q50, Q80 and p = 0.0017 for Q90). Burns predicted higher earnings for cases which were favorably remunerated (p<0.001 for the 90% quantile). Osteoporosis predicted a higher deficit in the most underfunded cases, but did not predict differences in earnings for balanced or profitable cases (Q10 and Q20: p<0.00, Q50: p = 0.10, Q80: p = 0.88 and Q90: p = 0.52). ICU stay, mechanical and patient clinical complexity level score (PCCL) predicted higher losses at the 10% quantile but also higher profits at the 90% quantile (p<0.001). We suggest considering psychiatric diagnosis, admission as an emergency case and admission from an external health care provider as DRG split criteria as they predict large, consistent and significant losses.

  6. Physician burnout, work engagement and the quality of patient care.

    PubMed

    Loerbroks, A; Glaser, J; Vu-Eickmann, P; Angerer, P

    2017-07-01

    Research suggests that burnout in physicians is associated with poorer patient care, but evidence is inconclusive. More recently, the concept of work engagement has emerged (i.e. the beneficial counterpart of burnout) and has been associated with better care. Evidence remains markedly sparse however. To examine the associations of burnout and work engagement with physicians' self-perceived quality of care. We drew on cross-sectional data from physicians in Germany. We used a six-item version of the Maslach Burnout Inventory measuring exhaustion and depersonalization. We employed the nine-item Utrecht Work Engagement Scale to assess work engagement and its subcomponents: vigour, dedication and absorption. We measured physicians' own perceptions of their quality of care by a six-item instrument covering practices and attitudes. We used continuous and categorized dependent and independent variables in linear and logistic regression analyses. There were 416 participants. In multivariable linear regression analyses, increasing burnout total scores were associated with poorer perceived quality of care [unstandardized regression coefficient (b) = 0.45, 95% confidence interval (CI) 0.37, 0.54]. This association was stronger for depersonalization (b = 0.37, 95% CI 0.29, 0.44) than for exhaustion (b = 0.26, 95% CI 0.18, 0.33). Increasing work engagement was associated with higher perceived quality care (b for the total score = -0.20, 95% CI -0.28, -0.11). This was confirmed for each subcomponent with stronger associations for vigour (b = -0.21, 95% CI -0.29, -0.13) and dedication (b = -0.16, 95% CI -0.24, -0.09) than for absorption (b = -0.12, 95% CI -0.20, -0.04). Logistic regression analyses yielded comparable results. Physician burnout was associated with self-perceived poorer patient care, while work engagement related to self-reported better care. Studies are needed to corroborate these findings, particularly for work engagement. © The Author 2017. 
Published by Oxford University Press on behalf of the Society of Occupational Medicine. All rights reserved.

  7. Penalized spline estimation for functional coefficient regression models.

    PubMed

    Cao, Yanrong; Lin, Haiqun; Wu, Tracy Z; Yu, Yan

    2010-04-01

    The functional coefficient regression models assume that the regression coefficients vary with some "threshold" variable, providing appreciable flexibility in capturing the underlying dynamics in data and avoiding the so-called "curse of dimensionality" in multivariate nonparametric estimation. We first investigate the estimation, inference, and forecasting for the functional coefficient regression models with dependent observations via penalized splines. The P-spline approach, as a direct ridge regression shrinkage type global smoothing method, is computationally efficient and stable. With established fixed-knot asymptotics, inference is readily available. Exact inference can be obtained for fixed smoothing parameter λ, which is most appealing for finite samples. Our penalized spline approach gives an explicit model expression, which also enables multi-step-ahead forecasting via simulations. Furthermore, we examine different methods of choosing the important smoothing parameter λ: modified multi-fold cross-validation (MCV), generalized cross-validation (GCV), and an extension of empirical bias bandwidth selection (EBBS) to P-splines. In addition, we implement smoothing parameter selection using mixed model framework through restricted maximum likelihood (REML) for P-spline functional coefficient regression models with independent observations. The P-spline approach also easily allows different smoothness for different functional coefficients, which is enabled by assigning different penalty λ accordingly. We demonstrate the proposed approach by both simulation examples and a real data application.

  8. Robust logistic regression to narrow down the winner's curse for rare and recessive susceptibility variants.

    PubMed

    Kesselmeier, Miriam; Lorenzo Bermejo, Justo

    2017-11-01

    Logistic regression is the most common technique used for genetic case-control association studies. A disadvantage of standard maximum likelihood estimators of the genotype relative risk (GRR) is their strong dependence on outlier subjects, for example, patients diagnosed at unusually young age. Robust methods are available to constrain outlier influence, but they are scarcely used in genetic studies. This article provides a non-intimidating introduction to robust logistic regression, and investigates its benefits and limitations in genetic association studies. We applied the bounded Huber and extended the R package 'robustbase' with the re-descending Hampel functions to down-weight outlier influence. Computer simulations were carried out to assess the type I error rate, mean squared error (MSE) and statistical power according to major characteristics of the genetic study and investigated markers. Simulations were complemented with the analysis of real data. Both standard and robust estimation controlled type I error rates. Standard logistic regression showed the highest power but standard GRR estimates also showed the largest bias and MSE, in particular for associated rare and recessive variants. For illustration, a recessive variant with a true GRR=6.32 and a minor allele frequency=0.05 investigated in a 1000 case/1000 control study by standard logistic regression resulted in power=0.60 and MSE=16.5. The corresponding figures for Huber-based estimation were power=0.51 and MSE=0.53. Overall, Hampel- and Huber-based GRR estimates did not differ much. Robust logistic regression may represent a valuable alternative to standard maximum likelihood estimation when the focus lies on risk prediction rather than identification of susceptibility variants. © The Author 2016. Published by Oxford University Press. All rights reserved.

  9. Nonconvex Sparse Logistic Regression With Weakly Convex Regularization

    NASA Astrophysics Data System (ADS)

    Shen, Xinyue; Gu, Yuantao

    2018-06-01

    In this work we propose to fit a sparse logistic regression model by a weakly convex regularized nonconvex optimization problem. The idea is based on the finding that a weakly convex function as an approximation of the $\ell_0$ pseudo norm is able to better induce sparsity than the commonly used $\ell_1$ norm. For a class of weakly convex sparsity inducing functions, we prove the nonconvexity of the corresponding sparse logistic regression problem, and study its local optimality conditions and the choice of the regularization parameter to exclude trivial solutions. Despite the nonconvexity, a method based on proximal gradient descent is used to solve the general weakly convex sparse logistic regression, and its convergence behavior is studied theoretically. Then the general framework is applied to a specific weakly convex function, and a necessary and sufficient local optimality condition is provided. The solution method is instantiated in this case as an iterative firm-shrinkage algorithm, and its effectiveness is demonstrated in numerical experiments by both randomly generated and real datasets.
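
    To make the idea of an iterative firm-shrinkage scheme concrete, here is a minimal sketch of proximal-gradient-style updates for logistic regression in which each gradient step is followed by a firm-thresholding operator (zero below a lower threshold, identity above an upper threshold, linear in between). The thresholds lam and mu, the step size, the toy data, and all function names are hypothetical choices for illustration, and the parameterization is not claimed to match the one used by the authors.

```python
import math


def firm_shrink(v, lam, mu):
    """Firm thresholding: 0 for |v| <= lam, v for |v| >= mu, linear in between."""
    if not 0.0 < lam < mu:
        raise ValueError("need 0 < lam < mu")
    a = abs(v)
    if a <= lam:
        return 0.0
    if a >= mu:
        return v
    return math.copysign(mu * (a - lam) / (mu - lam), v)


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def fit_sparse_logistic(X, y, lam=0.05, mu=0.5, step=0.5, iters=500):
    """Gradient step on the average logistic loss, then firm shrinkage."""
    n, d = len(X), len(X[0])
    w = [0.0] * d
    for _ in range(iters):
        resid = [sigmoid(sum(wj * xj for wj, xj in zip(w, row))) - yi
                 for row, yi in zip(X, y)]
        grad = [sum(r * row[j] for r, row in zip(resid, X)) / n for j in range(d)]
        w = [firm_shrink(wj - step * gj, lam, mu) for wj, gj in zip(w, grad)]
    return w


# Toy data (made up): the first feature carries signal, the second is tiny noise.
X = [[-2.0, 0.1], [-1.5, -0.1], [-1.0, 0.1], [-0.5, -0.1],
     [0.5, 0.1], [1.0, -0.1], [1.5, 0.1], [2.0, -0.1]]
y = [0, 0, 0, 1, 0, 1, 1, 1]
w = fit_sparse_logistic(X, y)
```

    On this toy input the weight of the noise feature is held at exactly zero, while the informative feature keeps a positive weight that is no longer shrunk once it exceeds the upper threshold.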

  10. A comparative study on entrepreneurial attitudes modeled with logistic regression and Bayes nets.

    PubMed

    López Puga, Jorge; García García, Juan

    2012-11-01

    Entrepreneurship research is receiving increasing attention in our context, as entrepreneurs are key social agents involved in economic development. We compare the success of the dichotomic logistic regression model and the Bayes simple classifier to predict entrepreneurship, after manipulating the percentage of missing data and the level of categorization in predictors. A sample of undergraduate university students (N = 1230) completed five scales (motivation, attitude towards business creation, obstacles, deficiencies, and training needs) and we found that each of them predicted different aspects of the tendency to business creation. Additionally, our results show that the receiver operating characteristic (ROC) curve is affected by the rate of missing data in both techniques, but logistic regression seems to be more vulnerable when faced with missing data, whereas Bayes nets underperform slightly when categorization has been manipulated. Our study sheds light on the potential entrepreneur profile and we propose to use Bayesian networks as an additional alternative to overcome the weaknesses of logistic regression when missing data are present in applied research.

  11. Epidemiologic programs for computers and calculators. A microcomputer program for multiple logistic regression by unconditional and conditional maximum likelihood methods.

    PubMed

    Campos-Filho, N; Franco, E L

    1989-02-01

    A frequent procedure in matched case-control studies is to report results from the multivariate unmatched analyses if they do not differ substantially from the ones obtained after conditioning on the matching variables. Although conceptually simple, this rule requires that an extensive series of logistic regression models be evaluated by both the conditional and unconditional maximum likelihood methods. Most computer programs for logistic regression employ only one maximum likelihood method, which requires that the analyses be performed in separate steps. This paper describes a Pascal microcomputer (IBM PC) program that performs multiple logistic regression by both maximum likelihood estimation methods, which obviates the need for switching between programs to obtain relative risk estimates from both matched and unmatched analyses. The program calculates most standard statistics and allows factoring of categorical or continuous variables by two distinct methods of contrast. A built-in, descriptive statistics option allows the user to inspect the distribution of cases and controls across categories of any given variable.

  12. Comparison of cranial sex determination by discriminant analysis and logistic regression.

    PubMed

    Amores-Ampuero, Anabel; Alemán, Inmaculada

    2016-04-05

    Various methods have been proposed for estimating dimorphism. The objective of this study was to compare sex determination results from cranial measurements using discriminant analysis or logistic regression. The study sample comprised 130 individuals (70 males) of known sex, age, and cause of death from San José cemetery in Granada (Spain). Measurements of 19 neurocranial dimensions and 11 splanchnocranial dimensions were subjected to discriminant analysis and logistic regression, and the percentages of correct classification were compared between the sex functions obtained with each method. The discriminant capacity of the selected variables was evaluated with a cross-validation procedure. The percentage accuracy with discriminant analysis was 78.2% for the neurocranium (82.4% in females and 74.6% in males) and 73.7% for the splanchnocranium (79.6% in females and 68.8% in males). These percentages were higher with logistic regression analysis: 85.7% for the neurocranium (in both sexes) and 94.1% for the splanchnocranium (100% in females and 91.7% in males).

  13. An Order Insertion Scheduling Model of Logistics Service Supply Chain Considering Capacity and Time Factors

    PubMed Central

    Yang, Yi; Wang, Shuqing; Liu, Yang

    2014-01-01

    Order insertion often occurs in the scheduling process of logistics service supply chain (LSSC), which disturbs normal time scheduling especially in the environment of mass customization logistics service. This study analyses order similarity coefficient and order insertion operation process and then establishes an order insertion scheduling model of LSSC with service capacity and time factors considered. This model aims to minimize the average unit volume operation cost of logistics service integrator and maximize the average satisfaction degree of functional logistics service providers. In order to verify the viability and effectiveness of our model, a specific example is numerically analyzed. Some interesting conclusions are obtained. First, along with the increase of completion time delay coefficient permitted by customers, the possible inserting order volume first increases and then trends to be stable. Second, supply chain performance reaches the best when the volume of inserting order is equal to the surplus volume of the normal operation capacity in mass service process. Third, the larger the normal operation capacity in mass service process is, the bigger the possible inserting order's volume will be. Moreover, compared to increasing the completion time delay coefficient, improving the normal operation capacity of mass service process is more useful. PMID:25276851

  14. Differentiating Tumor Progression from Pseudoprogression in Patients with Glioblastomas Using Diffusion Tensor Imaging and Dynamic Susceptibility Contrast MRI.

    PubMed

    Wang, S; Martinez-Lage, M; Sakai, Y; Chawla, S; Kim, S G; Alonso-Basanta, M; Lustig, R A; Brem, S; Mohan, S; Wolf, R L; Desai, A; Poptani, H

    2016-01-01

    Early assessment of treatment response is critical in patients with glioblastomas. A combination of DTI and DSC perfusion imaging parameters was evaluated to distinguish glioblastomas with true progression from mixed response and pseudoprogression. Forty-one patients with glioblastomas exhibiting enhancing lesions within 6 months after completion of chemoradiation therapy were retrospectively studied. All patients underwent surgery after MR imaging and were histologically classified as having true progression (>75% tumor), mixed response (25%-75% tumor), or pseudoprogression (<25% tumor). Mean diffusivity, fractional anisotropy, linear anisotropy coefficient, planar anisotropy coefficient, spheric anisotropy coefficient, and maximum relative cerebral blood volume values were measured from the enhancing tissue. A multivariate logistic regression analysis was used to determine the best model for classification of true progression from mixed response or pseudoprogression. Significantly elevated maximum relative cerebral blood volume, fractional anisotropy, linear anisotropy coefficient, and planar anisotropy coefficient and decreased spheric anisotropy coefficient were observed in true progression compared with pseudoprogression (P < .05). There were also significant differences in maximum relative cerebral blood volume, fractional anisotropy, planar anisotropy coefficient, and spheric anisotropy coefficient measurements between mixed response and true progression groups. The best model to distinguish true progression from non-true progression (pseudoprogression and mixed) consisted of fractional anisotropy, linear anisotropy coefficient, and maximum relative cerebral blood volume, resulting in an area under the curve of 0.905. This model also differentiated true progression from mixed response with an area under the curve of 0.901. 
A combination of fractional anisotropy and maximum relative cerebral blood volume differentiated pseudoprogression from nonpseudoprogression (true progression and mixed) with an area under the curve of 0.807. DTI and DSC perfusion imaging can improve accuracy in assessing treatment response and may aid in individualized treatment of patients with glioblastomas. © 2016 by American Journal of Neuroradiology.

  15. Hidden Connections between Regression Models of Strain-Gage Balance Calibration Data

    NASA Technical Reports Server (NTRS)

    Ulbrich, Norbert

    2013-01-01

    Hidden connections between regression models of wind tunnel strain-gage balance calibration data are investigated. These connections become visible whenever balance calibration data is supplied in its design format and both the Iterative and Non-Iterative Method are used to process the data. First, it is shown how the regression coefficients of the fitted balance loads of a force balance can be approximated by using the corresponding regression coefficients of the fitted strain-gage outputs. Then, data from the manual calibration of the Ames MK40 six-component force balance is chosen to illustrate how estimates of the regression coefficients of the fitted balance loads can be obtained from the regression coefficients of the fitted strain-gage outputs. The study illustrates that load predictions obtained by applying the Iterative or the Non-Iterative Method originate from two related regression solutions of the balance calibration data as long as balance loads are given in the design format of the balance, gage outputs behave highly linear, strict statistical quality metrics are used to assess regression models of the data, and regression model term combinations of the fitted loads and gage outputs can be obtained by a simple variable exchange.

  16. Stepwise Distributed Open Innovation Contests for Software Development: Acceleration of Genome-Wide Association Analysis

    PubMed Central

    Hill, Andrew; Loh, Po-Ru; Bharadwaj, Ragu B.; Pons, Pascal; Shang, Jingbo; Guinan, Eva; Lakhani, Karim; Kilty, Iain

    2017-01-01

    Abstract Background: The association of differing genotypes with disease-related phenotypic traits offers great potential to both help identify new therapeutic targets and support stratification of patients who would gain the greatest benefit from specific drug classes. Development of low-cost genotyping and sequencing has made collecting large-scale genotyping data routine in population and therapeutic intervention studies. In addition, a range of new technologies is being used to capture numerous new and complex phenotypic descriptors. As a result, genotype and phenotype datasets have grown exponentially. Genome-wide association studies associate genotypes and phenotypes using methods such as logistic regression. As existing tools for association analysis limit the efficiency by which value can be extracted from increasing volumes of data, there is a pressing need for new software tools that can accelerate association analyses on large genotype-phenotype datasets. Results: Using open innovation (OI) and contest-based crowdsourcing, the logistic regression analysis in a leading, community-standard genetics software package (PLINK 1.07) was substantially accelerated. OI allowed us to do this in <6 months by providing rapid access to highly skilled programmers with specialized, difficult-to-find skill sets. Through a crowd-based contest a combination of computational, numeric, and algorithmic approaches was identified that accelerated the logistic regression in PLINK 1.07 by 18- to 45-fold. Combining contest-derived logistic regression code with coarse-grained parallelization, multithreading, and associated changes to data initialization code further developed through distributed innovation, we achieved an end-to-end speedup of 591-fold for a data set size of 6678 subjects by 645 863 variants, compared to PLINK 1.07's logistic regression. This represents a reduction in run time from 4.8 hours to 29 seconds. 
Accelerated logistic regression code developed in this project has been incorporated into the PLINK2 project. Conclusions: Using iterative competition-based OI, we have developed a new, faster implementation of logistic regression for genome-wide association studies analysis. We present lessons learned and recommendations on running a successful OI process for bioinformatics. PMID:28327993

  17. Stepwise Distributed Open Innovation Contests for Software Development: Acceleration of Genome-Wide Association Analysis.

    PubMed

    Hill, Andrew; Loh, Po-Ru; Bharadwaj, Ragu B; Pons, Pascal; Shang, Jingbo; Guinan, Eva; Lakhani, Karim; Kilty, Iain; Jelinsky, Scott A

    2017-05-01

    The association of differing genotypes with disease-related phenotypic traits offers great potential to both help identify new therapeutic targets and support stratification of patients who would gain the greatest benefit from specific drug classes. Development of low-cost genotyping and sequencing has made collecting large-scale genotyping data routine in population and therapeutic intervention studies. In addition, a range of new technologies is being used to capture numerous new and complex phenotypic descriptors. As a result, genotype and phenotype datasets have grown exponentially. Genome-wide association studies associate genotypes and phenotypes using methods such as logistic regression. As existing tools for association analysis limit the efficiency by which value can be extracted from increasing volumes of data, there is a pressing need for new software tools that can accelerate association analyses on large genotype-phenotype datasets. Using open innovation (OI) and contest-based crowdsourcing, the logistic regression analysis in a leading, community-standard genetics software package (PLINK 1.07) was substantially accelerated. OI allowed us to do this in <6 months by providing rapid access to highly skilled programmers with specialized, difficult-to-find skill sets. Through a crowd-based contest a combination of computational, numeric, and algorithmic approaches was identified that accelerated the logistic regression in PLINK 1.07 by 18- to 45-fold. Combining contest-derived logistic regression code with coarse-grained parallelization, multithreading, and associated changes to data initialization code further developed through distributed innovation, we achieved an end-to-end speedup of 591-fold for a data set size of 6678 subjects by 645 863 variants, compared to PLINK 1.07's logistic regression. This represents a reduction in run time from 4.8 hours to 29 seconds. 
Accelerated logistic regression code developed in this project has been incorporated into the PLINK2 project. Using iterative competition-based OI, we have developed a new, faster implementation of logistic regression for genome-wide association studies analysis. We present lessons learned and recommendations on running a successful OI process for bioinformatics. © The Author 2017. Published by Oxford University Press.

  18. Easy and low-cost identification of metabolic syndrome in patients treated with second-generation antipsychotics: artificial neural network and logistic regression models.

    PubMed

    Lin, Chao-Cheng; Bai, Ya-Mei; Chen, Jen-Yeu; Hwang, Tzung-Jeng; Chen, Tzu-Ting; Chiu, Hung-Wen; Li, Yu-Chuan

    2010-03-01

    Metabolic syndrome (MetS) is an important side effect of second-generation antipsychotics (SGAs). However, many SGA-treated patients with MetS remain undetected. In this study, we trained and validated artificial neural network (ANN) and multiple logistic regression models without biochemical parameters to rapidly identify MetS in patients with SGA treatment. A total of 383 patients with a diagnosis of schizophrenia or schizoaffective disorder (DSM-IV criteria) with SGA treatment for more than 6 months were investigated to determine whether they met the MetS criteria according to the International Diabetes Federation. The data for these patients were collected between March 2005 and September 2005. The input variables of ANN and logistic regression were limited to demographic and anthropometric data only. All models were trained by randomly selecting two-thirds of the patient data and were internally validated with the remaining one-third of the data. The models were then externally validated with data from 69 patients from another hospital, collected between March 2008 and June 2008. The area under the receiver operating characteristic curve (AUC) was used to measure the performance of all models. Both the final ANN and logistic regression models had high accuracy (88.3% vs 83.6%), sensitivity (93.1% vs 86.2%), and specificity (86.9% vs 83.8%) to identify MetS in the internal validation set. The mean +/- SD AUC was high for both the ANN and logistic regression models (0.934 +/- 0.033 vs 0.922 +/- 0.035, P = .63). During external validation, high AUC was still obtained for both models. Waist circumference and diastolic blood pressure were the common variables that were left in the final ANN and logistic regression models. Our study developed accurate ANN and logistic regression models to detect MetS in patients with SGA treatment. The models are likely to provide a noninvasive tool for large-scale screening of MetS in this group of patients. 
(c) 2010 Physicians Postgraduate Press, Inc.

  19. Bayesian logistic regression in detection of gene-steroid interaction for cancer at PDLIM5 locus.

    PubMed

    Wang, Ke-Sheng; Owusu, Daniel; Pan, Yue; Xie, Changchun

    2016-06-01

    The PDZ and LIM domain 5 (PDLIM5) gene may play a role in cancer, bipolar disorder, major depression, alcohol dependence and schizophrenia; however, little is known about the interaction effect of steroid and PDLIM5 gene on cancer. This study examined 47 single-nucleotide polymorphisms (SNPs) within the PDLIM5 gene in the Marshfield sample with 716 cancer patients (any diagnosed cancer, excluding minor skin cancer) and 2848 noncancer controls. Multiple logistic regression model in PLINK software was used to examine the association of each SNP with cancer. Bayesian logistic regression in PROC GENMOD in SAS statistical software, ver. 9.4 was used to detect gene-steroid interactions influencing cancer. Single marker analysis using PLINK identified 12 SNPs associated with cancer (P < 0.05); especially, SNP rs6532496 revealed the strongest association with cancer (P = 6.84 × 10⁻³); while the next best signal was rs951613 (P = 7.46 × 10⁻³). Classic logistic regression in PROC GENMOD showed that both rs6532496 and rs951613 revealed strong gene-steroid interaction effects (OR=2.18, 95% CI=1.31-3.63 with P = 2.9 × 10⁻³ for rs6532496 and OR=2.07, 95% CI=1.24-3.45 with P = 5.43 × 10⁻³ for rs951613, respectively). Results from Bayesian logistic regression showed stronger interaction effects (OR=2.26, 95% CI=1.2-3.38 for rs6532496 and OR=2.14, 95% CI=1.14-3.2 for rs951613, respectively). All the 12 SNPs associated with cancer revealed significant gene-steroid interaction effects (P < 0.05); whereas 13 SNPs showed gene-steroid interaction effects without main effect on cancer. SNP rs4634230 revealed the strongest gene-steroid interaction effect (OR=2.49, 95% CI=1.5-4.13 with P = 4.0 × 10⁻⁴ based on the classic logistic regression and OR=2.59, 95% CI=1.4-3.97 from Bayesian logistic regression, respectively). This study provides evidence of common genetic variants within the PDLIM5 gene and interactions between PDLIM5 gene polymorphisms and steroid use influencing cancer.

  20. Evaluation of Quality of Life and Safety of Seniors in Golestan Province, Iran

    PubMed Central

    Foroushani, Abbas Rahimi; Badakhshan, Abbas; Gholipour, Mahin; Hosseini, Masoumeh

    2015-01-01

    This study evaluated the criteria for quality of life (QoL) using standardized short-form health survey with only 36 questions (SF-36; Version 2.0) and Consumer Product Safety Commission (CPSC) questionnaires to study the relationship between QoL and living conditions of seniors in Golestan province in Iran. This was an analytical cross-sectional study with descriptive and analytical parts. The population was individuals above 65 years of age in Golestan province in Iran. The sample size was calculated based on the correlation coefficient; a correlation of .2 or greater was considered statistically significant at 80% for the power of the test at the 95% confidence level. The data on QoL of seniors were collected by interview and observation using the CPSC questionnaire for nursing homes and the SF-36 for QoL health indicators. The reliability of the CPSC questionnaire was estimated using Cronbach’s alpha with a coefficient of .838. The SF-36 questionnaire was validated with Cronbach’s alpha with a coefficient of .95. Chi-square and logistic regression were used to interpret the probability of abnormal QoL between levels of independent predictors. The percentage of seniors in overall poor health as a binary outcome was 43.5, and the percentage of unsafe conditions was 49.8. PMID:28138463
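
    The abstract does not say which formula was used for the sample size. One common choice for detecting a correlation is the Fisher z approximation, sketched below for a two-sided test; the function name and the use of this particular approximation are assumptions for illustration only.

```python
import math
from statistics import NormalDist


def n_for_correlation(r, alpha=0.05, power=0.80):
    """Approximate sample size to detect correlation r (two-sided, Fisher z)."""
    z_alpha = NormalDist().inv_cdf(1.0 - alpha / 2.0)  # critical value of the test
    z_beta = NormalDist().inv_cdf(power)               # quantile for the target power
    return math.ceil(((z_alpha + z_beta) / math.atanh(r)) ** 2 + 3)


print(n_for_correlation(0.2))  # prints 194
```

    Under this approximation, a correlation of .2 at 80% power and the 95% confidence level calls for roughly 194 participants; the sample size actually used in the study is not stated in the abstract.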

  1. Wrong Signs in Regression Coefficients

    NASA Technical Reports Server (NTRS)

    McGee, Holly

    1999-01-01

    When using parametric cost estimation, it is important to note the possibility of the regression coefficients having the wrong sign. A wrong sign is defined as a sign on the regression coefficient opposite to the researcher's intuition and experience. Some possible causes for the wrong sign discussed in this paper are a small range of x's, leverage points, missing variables, multicollinearity, and computational error. Additionally, techniques for determining the cause of the wrong sign are given.

  2. Logits and Tigers and Bears, Oh My! A Brief Look at the Simple Math of Logistic Regression and How It Can Improve Dissemination of Results

    ERIC Educational Resources Information Center

    Osborne, Jason W.

    2012-01-01

    Logistic regression is slowly gaining acceptance in the social sciences, and fills an important niche in the researcher's toolkit: being able to predict important outcomes that are not continuous in nature. While OLS regression is a valuable tool, it cannot routinely be used to predict outcomes that are binary or categorical in nature. These…

  3. Racial residential segregation and preterm birth: built environment as a mediator.

    PubMed

    Anthopolos, Rebecca; Kaufman, Jay S; Messer, Lynne C; Miranda, Marie Lynn

    2014-05-01

    Racial residential segregation has been associated with preterm birth. Few studies have examined mediating pathways, in part because, with binary outcomes, indirect effects estimated from multiplicative models generally lack causal interpretation. We develop a method to estimate additive-scale natural direct and indirect effects from logistic regression. We then evaluate whether segregation operates through poor-quality built environment to affect preterm birth. To estimate natural direct and indirect effects, we derive risk differences from logistic regression coefficients. Birth records (2000-2008) for Durham, North Carolina, were linked to neighborhood-level measures of racial isolation and a composite construct of poor-quality built environment. We decomposed the total effect of racial isolation on preterm birth into direct and indirect effects. The adjusted total effect of an interquartile increase in racial isolation on preterm birth was an extra 27 preterm events per 1000 births (risk difference = 0.027 [95% confidence interval = 0.007 to 0.047]). With poor-quality built environment held at the level it would take under isolation at the 25th percentile, the direct effect of an interquartile increase in isolation was 0.022 (-0.001 to 0.042). Poor-quality built environment accounted for 35% (11% to 65%) of the total effect. Our methodology facilitates the estimation of additive-scale natural effects with binary outcomes. In this study, the total effect of racial segregation on preterm birth was partially mediated by poor-quality built environment.
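
    The basic step of turning logistic regression coefficients into an additive-scale risk difference can be sketched as follows. This shows only the conversion from coefficients to predicted risks for a single exposure; it is not the authors' mediation method, and the intercept and slope values are hypothetical.

```python
import math

def expit(x):
    """Inverse logit: map a log-odds value to a probability."""
    return 1.0 / (1.0 + math.exp(-x))

def risk_difference(intercept, beta, x_low, x_high):
    """Difference in predicted risk when the exposure moves from x_low to x_high."""
    return expit(intercept + beta * x_high) - expit(intercept + beta * x_low)

# Hypothetical coefficients, not taken from the paper.
rd = risk_difference(intercept=-2.0, beta=0.5, x_low=0.0, x_high=1.0)
print(f"risk difference = {rd:.3f}")
```

    Unlike an odds ratio, the risk difference depends on the baseline risk, so the same coefficient yields different additive effects at different intercepts or covariate values.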

  4. Predicting Future Suicide Attempts among Depressed Suicide Ideators: A 10-year Longitudinal Study

    PubMed Central

    May, Alexis M.; Klonsky, E. David; Klein, Daniel N.

    2012-01-01

    Suicidal ideation and attempts are a major public health problem. Research has identified many risk factors for suicidality; however, most fail to identify which suicide ideators are at greatest risk of progressing to a suicide attempt. Thus, the present study identified predictors of future suicide attempts in a sample of psychiatric patients reporting suicidal ideation. The sample comprised 49 individuals who met full DSM-IV criteria for major depressive disorder and/or dysthymic disorder and reported suicidal ideation at baseline. Participants were followed for 10 years. Demographic, psychological, personality, and psychosocial risk factors were assessed using validated questionnaires and structured interviews. Phi coefficients and point-biserial correlations were used to identify prospective predictors of attempts, and logistic regressions were used to identify which variables predicted future attempts over and above past suicide attempts. Six significant predictors of future suicide attempts were identified – cluster A personality disorder, cluster B personality disorder, lifetime substance abuse, baseline anxiety disorder, poor maternal relationship, and poor social adjustment. Finally, exploratory logistic regressions were used to examine the unique contribution of each significant predictor controlling for the others. Co-morbid cluster B personality disorder emerged as the only robust, unique predictor of future suicide attempts among depressed suicide ideators. Future research should continue to identify variables that predict transition from suicidal thoughts to suicide attempts, as such work will enhance clinical assessment of suicide risk as well as theoretical models of suicide. PMID:22575331

  5. Comparative Performance Analysis of Support Vector Machine, Random Forest, Logistic Regression and k-Nearest Neighbours in Rainbow Trout (Oncorhynchus Mykiss) Classification Using Image-Based Features

    PubMed Central

    Císař, Petr; Labbé, Laurent; Souček, Pavel; Pelissier, Pablo; Kerneis, Thierry

    2018-01-01

    The main aim of this study was to develop a new objective method for evaluating the impacts of different diets on the live fish skin using image-based features. In total, one-hundred and sixty rainbow trout (Oncorhynchus mykiss) were fed either a fish-meal based diet (80 fish) or a 100% plant-based diet (80 fish) and photographed using a consumer-grade digital camera. Twenty-three colour features and four texture features were extracted. Four different classification methods were used to evaluate fish diets including Random forest (RF), Support vector machine (SVM), Logistic regression (LR) and k-Nearest neighbours (k-NN). The SVM with radial based kernel provided the best classifier with correct classification rate (CCR) of 82% and Kappa coefficient of 0.65. Although both the LR and RF methods were less accurate than SVM, they achieved good classification with CCR 75% and 70% respectively. The k-NN was the least accurate (40%) classification model. Overall, it can be concluded that consumer-grade digital cameras could be employed as the fast, accurate and non-invasive sensor for classifying rainbow trout based on their diets. Furthermore, there was a close association between image-based features and fish diet received during cultivation. These procedures can be used as non-invasive, accurate and precise approaches for monitoring fish status during the cultivation by evaluating diet’s effects on fish skin. PMID:29596375
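
    For readers unfamiliar with the two metrics reported here, the sketch below computes the correct classification rate and Cohen's kappa from a confusion matrix. The matrix is hypothetical: it assumes 80 fish per diet and was picked only so that the results land near the reported figures. The paper's actual confusion matrix is not given in the abstract.

```python
def ccr_and_kappa(matrix):
    """Return (correct classification rate, Cohen's kappa).

    matrix[i][j] is the number of items of true class i predicted as class j.
    """
    total = sum(sum(row) for row in matrix)
    observed = sum(matrix[i][i] for i in range(len(matrix))) / total
    row_totals = [sum(row) for row in matrix]
    col_totals = [sum(col) for col in zip(*matrix)]
    # Agreement expected by chance from the marginal totals.
    expected = sum(r * c for r, c in zip(row_totals, col_totals)) / total**2
    kappa = (observed - expected) / (1.0 - expected)
    return observed, kappa

# Hypothetical confusion matrix for two diets with 80 fish each.
matrix = [[66, 14],
          [14, 66]]
ccr, kappa = ccr_and_kappa(matrix)
print(f"CCR = {ccr:.3f}, kappa = {kappa:.2f}")
```

    With balanced classes, chance agreement is 0.5, so kappa rescales the accuracy above chance onto a 0 to 1 range.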

  6. Comparative Performance Analysis of Support Vector Machine, Random Forest, Logistic Regression and k-Nearest Neighbours in Rainbow Trout (Oncorhynchus Mykiss) Classification Using Image-Based Features.

    PubMed

    Saberioon, Mohammadmehdi; Císař, Petr; Labbé, Laurent; Souček, Pavel; Pelissier, Pablo; Kerneis, Thierry

    2018-03-29

    The main aim of this study was to develop a new objective method for evaluating the impacts of different diets on the live fish skin using image-based features. In total, one-hundred and sixty rainbow trout (Oncorhynchus mykiss) were fed either a fish-meal based diet (80 fish) or a 100% plant-based diet (80 fish) and photographed using a consumer-grade digital camera. Twenty-three colour features and four texture features were extracted. Four different classification methods were used to evaluate fish diets including Random forest (RF), Support vector machine (SVM), Logistic regression (LR) and k-Nearest neighbours (k-NN). The SVM with radial based kernel provided the best classifier with correct classification rate (CCR) of 82% and Kappa coefficient of 0.65. Although both the LR and RF methods were less accurate than SVM, they achieved good classification with CCR 75% and 70% respectively. The k-NN was the least accurate (40%) classification model. Overall, it can be concluded that consumer-grade digital cameras could be employed as the fast, accurate and non-invasive sensor for classifying rainbow trout based on their diets. Furthermore, there was a close association between image-based features and fish diet received during cultivation. These procedures can be used as non-invasive, accurate and precise approaches for monitoring fish status during the cultivation by evaluating diet's effects on fish skin.

  7. Osteoporosis prediction from the mandible using cone-beam computed tomography

    PubMed Central

    Al Haffar, Iyad; Khattab, Razan

    2014-01-01

    Purpose: This study aimed to evaluate the use of dental cone-beam computed tomography (CBCT) in the diagnosis of osteoporosis among menopausal and postmenopausal women by using only a CBCT viewer program. Materials and Methods: Thirty-eight menopausal and postmenopausal women who underwent dual-energy X-ray absorptiometry (DXA) examination for hip and lumbar vertebrae were scanned using CBCT (field of view: 13 cm×15 cm; voxel size: 0.25 mm). Slices from the body of the mandible as well as the ramus were selected and some CBCT-derived variables, such as radiographic density (RD), were calculated as gray values. Pearson's correlation, one-way analysis of variance (ANOVA), and accuracy (sensitivity and specificity) evaluation based on linear and logistic regression were performed to choose the variable that best correlated with the lumbar and femoral neck T-scores. Results: RD of the whole bone area of the mandible was the variable that best correlated with and predicted both the femoral neck and the lumbar vertebrae T-scores; further, Pearson's correlation coefficients were 0.5/0.6 (p value=0.037/0.009). The sensitivity, specificity, and accuracy based on the logistic regression were 50%, 88.9%, and 78.4%, respectively, for the femoral neck, and 46.2%, 91.3%, and 75%, respectively, for the lumbar vertebrae. Conclusion: Lumbar vertebrae and femoral neck osteoporosis can be predicted with high accuracy from the RD value of the body of the mandible by using a CBCT viewer program. PMID:25473633
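
    Sensitivity, specificity, and accuracy of a logistic regression classifier all come from the same four confusion-matrix counts, as the sketch below shows. The counts are illustrative values that happen to reproduce the reported femoral-neck percentages; they are not taken from the paper, whose classification table is not given in the abstract.

```python
def diagnostic_metrics(tp, fn, tn, fp):
    """Return (sensitivity, specificity, accuracy) from confusion-matrix counts."""
    sensitivity = tp / (tp + fn)   # share of true cases detected
    specificity = tn / (tn + fp)   # share of non-cases correctly ruled out
    accuracy = (tp + tn) / (tp + fn + tn + fp)
    return sensitivity, specificity, accuracy

# Illustrative counts only (assumed, not reported by the authors).
sens, spec, acc = diagnostic_metrics(tp=5, fn=5, tn=24, fp=3)
print(f"sensitivity = {sens:.1%}, specificity = {spec:.1%}, accuracy = {acc:.1%}")
```

    A fairly high overall accuracy can coexist with a sensitivity of only 50% when non-cases outnumber cases, because accuracy is then driven mostly by specificity.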

  8. Mortality prediction of head Abbreviated Injury Score and Glasgow Coma Scale: analysis of 7,764 head injuries.

    PubMed

    Demetriades, Demetrios; Kuncir, Eric; Murray, James; Velmahos, George C; Rhee, Peter; Chan, Linda

    2004-08-01

    We assessed the prognostic value and limitations of Glasgow Coma Scale (GCS) and head Abbreviated Injury Score (AIS) and correlated head AIS with GCS. We studied 7,764 patients with head injuries. Bivariate analysis was performed to examine the relationship of GCS, head AIS, age, gender, and mechanism of injury with mortality. Stepwise logistic regression analysis was used to identify the independent risk factors associated with mortality. The overall mortality in the group of head injury patients with no other major extracranial injuries and no hypotension on admission was 9.3%. Logistic regression analysis identified head AIS, GCS, age, and mechanism of injury as significant independent risk factors of death. The prognostic value of GCS and head AIS was significantly affected by the mechanism of injury and the age of the patient. Patients with similar GCS or head AIS but different mechanisms of injury or ages had significantly different outcomes. The adjusted odds ratio of death in penetrating trauma was 5.2 (3.9, 7.0), p < 0.0001, and in the age group ≥ 55 years the adjusted odds ratio was 3.4 (2.6, 4.6), p < 0.0001. There was no correlation between head AIS and GCS (correlation coefficient -0.31). Mechanism of injury and age have a major effect in the predictive value of GCS and head AIS. There is no good correlation between GCS and head AIS.

  9. Sperm function and assisted reproduction technology

    PubMed Central

    MAAß, GESA; BÖDEKER, ROLF‐HASSO; SCHEIBELHUT, CHRISTINE; STALF, THOMAS; MEHNERT, CLAAS; SCHUPPE, HANS‐CHRISTIAN; JUNG, ANDREAS; SCHILL, WOLF‐BERNHARD

    2005-01-01

    The evaluation of different functional sperm parameters has become a tool in andrological diagnosis. These assays determine the sperm's capability to fertilize an oocyte. It also appears that sperm functions and semen parameters are interrelated and interdependent. Therefore, the question arose whether a given laboratory test or a battery of tests can predict the outcome in in vitro fertilization (IVF). One‐hundred and sixty‐one patients who underwent an IVF treatment were selected from a database of 4178 patients who had been examined for male infertility 3 months before or after IVF. Sperm concentration, motility, acrosin activity, acrosome reaction, sperm morphology, maternal age, number of transferred embryos, embryo score, fertilization rate and pregnancy rate were determined. In addition, logistic regression models to describe fertilization rate and pregnancy were developed. All the parameters in the models were dichotomized and intra‐ and interindividual variability of the parameters were assessed. Although the sperm parameters showed good correlations with IVF when correlated separately, the only essential parameter in the multivariate model was morphology. The enormous intra‐ and interindividual variability of the values was striking. In conclusion, our data indicate that the andrological status at the end of the respective treatment does not necessarily represent the status at the time of IVF. Despite a relatively low correlation coefficient in the logistic regression model, it appears that among the parameters tested, the most reliable parameter to predict fertilization is normal sperm morphology. (Reprod Med Biol 2005; 4: 7–30) PMID:29699207

  10. A Two-Stage Method to Determine Optimal Product Sampling considering Dynamic Potential Market

    PubMed Central

    Hu, Zhineng; Lu, Wei; Han, Bing

    2015-01-01

    This paper develops an optimization model for the diffusion effects of free samples under dynamic changes in potential market based on the characteristics of independent product and presents a two-stage method to figure out the sampling level. The impact analysis of the key factors on the sampling level shows that the increase of the external coefficient or internal coefficient has a negative influence on the sampling level. And the changing rate of the potential market has no significant influence on the sampling level whereas the repeat purchase has a positive one. Using logistic analysis and regression analysis, the global sensitivity analysis gives a whole analysis of the interaction of all parameters, which provides a two-stage method to estimate the impact of the relevant parameters in the case of inaccuracy of the parameters and to be able to construct a 95% confidence interval for the predicted sampling level. Finally, the paper provides the operational steps to improve the accuracy of the parameter estimation and an innovational way to estimate the sampling level. PMID:25821847

  11. Noninvasive spectral imaging of skin chromophores based on multiple regression analysis aided by Monte Carlo simulation

    NASA Astrophysics Data System (ADS)

    Nishidate, Izumi; Wiswadarma, Aditya; Hase, Yota; Tanaka, Noriyuki; Maeda, Takaaki; Niizeki, Kyuichi; Aizu, Yoshihisa

    2011-08-01

    In order to visualize melanin and blood concentrations and oxygen saturation in human skin tissue, a simple imaging technique based on multispectral diffuse reflectance images acquired at six wavelengths (500, 520, 540, 560, 580 and 600nm) was developed. The technique utilizes multiple regression analysis aided by Monte Carlo simulation for diffuse reflectance spectra. Using the absorbance spectrum as a response variable and the extinction coefficients of melanin, oxygenated hemoglobin, and deoxygenated hemoglobin as predictor variables, multiple regression analysis provides regression coefficients. Concentrations of melanin and total blood are then determined from the regression coefficients using conversion vectors that are deduced numerically in advance, while oxygen saturation is obtained directly from the regression coefficients. Experiments with a tissue-like agar gel phantom validated the method. In vivo experiments with human skin of the human hand during upper limb occlusion and of the inner forearm exposed to UV irradiation demonstrated the ability of the method to evaluate physiological reactions of human skin tissue.

  12. Predicting Social Trust with Binary Logistic Regression

    ERIC Educational Resources Information Center

    Adwere-Boamah, Joseph; Hufstedler, Shirley

    2015-01-01

    This study used binary logistic regression to predict social trust with five demographic variables from a national sample of adult individuals who participated in The General Social Survey (GSS) in 2012. The five predictor variables were respondents' highest degree earned, race, sex, general happiness and the importance of personally assisting…

  13. Effect of folic acid on appetite in children: ordinal logistic and fuzzy logistic regressions.

    PubMed

    Namdari, Mahshid; Abadi, Alireza; Taheri, S Mahmoud; Rezaei, Mansour; Kalantari, Naser; Omidvar, Nasrin

    2014-03-01

    Reduced appetite and low food intake are often a concern in preschool children, since it can lead to malnutrition, a leading cause of impaired growth and mortality in childhood. It is occasionally considered that folic acid has a positive effect on appetite enhancement and consequently growth in children. The aim of this study was to assess the effect of folic acid on the appetite of preschool children 3 to 6 y old. The study sample included 127 children ages 3 to 6 who were randomly selected from 20 preschools in the city of Tehran in 2011. Since appetite was measured by linguistic terms, a fuzzy logistic regression was applied for modeling. The obtained results were compared with a statistical ordinal logistic model. After controlling for the potential confounders, in a statistical ordinal logistic model, serum folate showed a significantly positive effect on appetite. A small but positive effect of folate was detected by fuzzy logistic regression. Based on fuzzy regression, the risk for poor appetite in preschool children was related to the employment status of their mothers. In this study, a positive association was detected between the levels of serum folate and improved appetite. For further investigation, a randomized controlled, double-blind clinical trial could be helpful to address causality. Copyright © 2014 Elsevier Inc. All rights reserved.

  14. Association between Personality Traits and Sleep Quality in Young Korean Women

    PubMed Central

    Kim, Han-Na; Cho, Juhee; Chang, Yoosoo; Ryu, Seungho

    2015-01-01

    Personality is a trait that affects behavior and lifestyle, and sleep quality is an important component of a healthy life. We analyzed the association between personality traits and sleep quality in a cross-section of 1,406 young women (from 18 to 40 years of age) who were not reporting clinically meaningful depression symptoms. Surveys were carried out from December 2011 to February 2012, using the Revised NEO Personality Inventory and the Pittsburgh Sleep Quality Index (PSQI). All analyses were adjusted for demographic and behavioral variables. We considered beta weights, structure coefficients, unique effects, and common effects when evaluating the importance of sleep quality predictors in multiple linear regression models. Neuroticism was the most important contributor to PSQI global scores in the multiple regression models. By contrast, despite being strongly correlated with sleep quality, conscientiousness had a near-zero beta weight in linear regression models, because most variance was shared with other personality traits. However, conscientiousness was the most noteworthy predictor of poor sleep quality status (PSQI≥6) in logistic regression models and individuals high in conscientiousness were least likely to have poor sleep quality, which is consistent with an OR of 0.813, with conscientiousness being protective against poor sleep quality. Personality may be a factor in poor sleep quality and should be considered in sleep interventions targeting young women. PMID:26030141

  15. Clustering performance comparison using K-means and expectation maximization algorithms.

    PubMed

    Jung, Yong Gyu; Kang, Min Soo; Heo, Jun

    2014-11-14

    Clustering is an important means of data mining based on separating data categories by similar features. Unlike the classification algorithm, clustering belongs to the unsupervised type of algorithms. Two representatives of the clustering algorithms are the K-means and the expectation maximization (EM) algorithm. Linear regression analysis was extended to the category-type dependent variable, while logistic regression was achieved using a linear combination of independent variables. To predict the possibility of occurrence of an event, a statistical approach is used. However, the classification of all data by means of logistic regression analysis cannot guarantee the accuracy of the results. In this paper, the logistic regression analysis is applied to EM clusters and the K-means clustering method for quality assessment of red wine, and a method is proposed for ensuring the accuracy of the classification results.

  16. Racial/ethnic and educational differences in the estimated odds of recent nitrite use among adult household residents in the United States: an illustration of matching and conditional logistic regression.

    PubMed

    Delva, J; Spencer, M S; Lin, J K

    2000-01-01

    This article compares estimates of the relative odds of nitrite use obtained from weighted unconditional logistic regression with estimates obtained from conditional logistic regression after post-stratification and matching of cases with controls by neighborhood of residence. We illustrate these methods by comparing the odds associated with nitrite use among adults of four racial/ethnic groups, with and without a high school education. We used aggregated data from the 1994-B through 1996 National Household Survey on Drug Abuse (NHSDA). Difference between the methods and implications for analysis and inference are discussed.

  17. Aldosterone and glomerular filtration--observations in the general population.

    PubMed

    Hannemann, Anke; Rettig, Rainer; Dittmann, Kathleen; Völzke, Henry; Endlich, Karlhans; Nauck, Matthias; Wallaschofski, Henri

    2014-03-10

    Increasing evidence suggests that aldosterone promotes renal damage. Since data on the association between aldosterone and renal function in the general population are sparse, we chose to address this issue. We investigated the associations between the plasma aldosterone concentration (PAC) or the aldosterone-to-renin ratio (ARR) and the estimated glomerular filtration rate (eGFR) in a sample of adult men and women from Northeast Germany. A study population of 1921 adult men and women who participated in the first follow-up of the Study of Health in Pomerania was selected. None of the subjects used drugs that alter PAC or ARR. The eGFR was calculated according to the four-variable Modification of Diet in Renal Disease formula. Chronic kidney disease (CKD) was defined as an eGFR < 60 ml/min/1.73 m². Linear regression models, adjusted for sex, age, waist circumference, diabetes mellitus, smoking status, systolic and diastolic blood pressures, serum triglyceride concentrations and time of blood sampling revealed inverse associations of PAC or ARR with eGFR (β-coefficient for log-transformed PAC -3.12, p < 0.001; β-coefficient for log-transformed ARR -3.36, p < 0.001). Logistic regression models revealed increased odds for CKD with increasing PAC (odds ratio for a one standard deviation increase in PAC: 1.35, 95% confidence interval: 1.06-1.71). There was no statistically significant association between ARR and CKD. Our study demonstrates that PAC and ARR are inversely associated with the glomerular filtration rate in the general population.

  18. Partial F-tests with multiply imputed data in the linear regression framework via coefficient of determination.

    PubMed

    Chaurasia, Ashok; Harel, Ofer

    2015-02-10

    Tests for regression coefficients such as global, local, and partial F-tests are common in applied research. In the framework of multiple imputation, there are several papers addressing tests for regression coefficients. However, for simultaneous hypothesis testing, the existing methods are computationally intensive because they involve calculation with vectors and (inversion of) matrices. In this paper, we propose a simple method based on the scalar entity, coefficient of determination, to perform (global, local, and partial) F-tests with multiply imputed data. The proposed method is evaluated using simulated data and applied to suicide prevention data. Copyright © 2014 John Wiley & Sons, Ltd.
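
    As background for the idea of basing F-tests on the coefficient of determination, the sketch below uses the standard complete-data identity that expresses a partial F statistic through the R² values of a full and a reduced linear model. How the authors pool quantities across imputed data sets is not described in the abstract and is not attempted here. The function name and the numeric inputs are hypothetical.

```python
def partial_f(r2_full, r2_reduced, n, p_full, q):
    """Partial F statistic for dropping q of the p_full predictors.

    n is the sample size; p_full counts predictors in the full model,
    excluding the intercept. Returns (F, numerator df, denominator df).
    """
    df_denominator = n - p_full - 1
    f_stat = ((r2_full - r2_reduced) / q) / ((1.0 - r2_full) / df_denominator)
    return f_stat, q, df_denominator

# Hypothetical values for a complete (non-imputed) data set.
f_stat, df1, df2 = partial_f(r2_full=0.5, r2_reduced=0.4, n=103, p_full=2, q=1)
print(f"F({df1}, {df2}) = {f_stat:.2f}")
```

    Setting `r2_reduced` to 0 and `q` equal to `p_full` gives the global F-test of all predictors, so one scalar formula covers the global and partial cases.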

  19. Regression trees for predicting mortality in patients with cardiovascular disease: What improvement is achieved by using ensemble-based methods?

    PubMed Central

    Austin, Peter C; Lee, Douglas S; Steyerberg, Ewout W; Tu, Jack V

    2012-01-01

    In biomedical research, the logistic regression model is the most commonly used method for predicting the probability of a binary outcome. While many clinical researchers have expressed an enthusiasm for regression trees, this method may have limited accuracy for predicting health outcomes. We aimed to evaluate the improvement that is achieved by using ensemble-based methods, including bootstrap aggregation (bagging) of regression trees, random forests, and boosted regression trees. We analyzed 30-day mortality in two large cohorts of patients hospitalized with either acute myocardial infarction (N = 16,230) or congestive heart failure (N = 15,848) in two distinct eras (1999–2001 and 2004–2005). We found that both the in-sample and out-of-sample prediction of ensemble methods offered substantial improvement in predicting cardiovascular mortality compared to conventional regression trees. However, conventional logistic regression models that incorporated restricted cubic smoothing splines had even better performance. We conclude that ensemble methods from the data mining and machine learning literature increase the predictive performance of regression trees, but may not lead to clear advantages over conventional logistic regression models for predicting short-term mortality in population-based samples of subjects with cardiovascular disease. PMID:22777999

  20. Strategies for Testing Statistical and Practical Significance in Detecting DIF with Logistic Regression Models

    ERIC Educational Resources Information Center

    Fidalgo, Angel M.; Alavi, Seyed Mohammad; Amirian, Seyed Mohammad Reza

    2014-01-01

    This study examines three controversial aspects in differential item functioning (DIF) detection by logistic regression (LR) models: first, the relative effectiveness of different analytical strategies for detecting DIF; second, the suitability of the Wald statistic for determining the statistical significance of the parameters of interest; and…

  1. Iterative Purification and Effect Size Use with Logistic Regression for Differential Item Functioning Detection

    ERIC Educational Resources Information Center

    French, Brian F.; Maller, Susan J.

    2007-01-01

    Two unresolved implementation issues with logistic regression (LR) for differential item functioning (DIF) detection include ability purification and effect size use. Purification is suggested to control inaccuracies in DIF detection as a result of DIF items in the ability estimate. Additionally, effect size use may be beneficial in controlling…

  2. A Note on Three Statistical Tests in the Logistic Regression DIF Procedure

    ERIC Educational Resources Information Center

    Paek, Insu

    2012-01-01

    Although logistic regression became one of the well-known methods in detecting differential item functioning (DIF), its three statistical tests, the Wald, likelihood ratio (LR), and score tests, which are readily available under the maximum likelihood, do not seem to be consistently distinguished in DIF literature. This paper provides a clarifying…

  3. "Let Me Count the Ways:" Fostering Reasons for Living among Low-Income, Suicidal, African American Women

    ERIC Educational Resources Information Center

    West, Lindsey M.; Davis, Telsie A.; Thompson, Martie P.; Kaslow, Nadine J.

    2011-01-01

    Protective factors for fostering reasons for living were examined among low-income, suicidal, African American women. Bivariate logistic regressions revealed that higher levels of optimism, spiritual well-being, and family social support predicted reasons for living. Multivariate logistic regressions indicated that spiritual well-being showed…

  4. Comparison of Two Approaches for Handling Missing Covariates in Logistic Regression

    ERIC Educational Resources Information Center

    Peng, Chao-Ying Joanne; Zhu, Jin

    2008-01-01

    For the past 25 years, methodological advances have been made in missing data treatment. Most published work has focused on missing data in dependent variables under various conditions. The present study seeks to fill the void by comparing two approaches for handling missing data in categorical covariates in logistic regression: the…

  5. Comparison of IRT Likelihood Ratio Test and Logistic Regression DIF Detection Procedures

    ERIC Educational Resources Information Center

    Atar, Burcu; Kamata, Akihito

    2011-01-01

    The Type I error rates and the power of IRT likelihood ratio test and cumulative logit ordinal logistic regression procedures in detecting differential item functioning (DIF) for polytomously scored items were investigated in this Monte Carlo simulation study. For this purpose, 54 simulation conditions (combinations of 3 sample sizes, 2 sample…

  6. Multiple Logistic Regression Analysis of Cigarette Use among High School Students

    ERIC Educational Resources Information Center

    Adwere-Boamah, Joseph

    2011-01-01

    A binary logistic regression analysis was performed to predict high school students' cigarette smoking behavior from selected predictors from 2009 CDC Youth Risk Behavior Surveillance Survey. The specific target student behavior of interest was frequent cigarette use. Five predictor variables included in the model were: a) race, b) frequency of…

  7. Modeling Polytomous Item Responses Using Simultaneously Estimated Multinomial Logistic Regression Models

    ERIC Educational Resources Information Center

    Anderson, Carolyn J.; Verkuilen, Jay; Peyton, Buddy L.

    2010-01-01

    Survey items with multiple response categories and multiple-choice test questions are ubiquitous in psychological and educational research. We illustrate the use of log-multiplicative association (LMA) models that are extensions of the well-known multinomial logistic regression model for multiple dependent outcome variables to reanalyze a set of…

  8. Propensity Score Estimation with Data Mining Techniques: Alternatives to Logistic Regression

    ERIC Educational Resources Information Center

    Keller, Bryan S. B.; Kim, Jee-Seon; Steiner, Peter M.

    2013-01-01

    Propensity score analysis (PSA) is a methodological technique which may correct for selection bias in a quasi-experiment by modeling the selection process using observed covariates. Because logistic regression is well understood by researchers in a variety of fields and easy to implement in a number of popular software packages, it has…

  9. Two-factor logistic regression in pediatric liver transplantation

    NASA Astrophysics Data System (ADS)

    Uzunova, Yordanka; Prodanova, Krasimira; Spasov, Lyubomir

    2017-12-01

    Using a two-factor logistic regression analysis an estimate is derived for the probability of absence of infections in the early postoperative period after pediatric liver transplantation. The influence of both the bilirubin level and the international normalized ratio of prothrombin time of blood coagulation at the 5th postoperative day is studied.
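
    A two-factor logistic model of this kind yields a probability estimate by passing a linear combination of the two predictors through the logistic function. The sketch below shows that general form only. The coefficient values and the example inputs are hypothetical, since the abstract reports no fitted estimates.

```python
import math

def prob_no_infection(bilirubin, inr, b0, b1, b2):
    """Estimated probability from a two-factor logistic model.

    p = 1 / (1 + exp(-(b0 + b1 * bilirubin + b2 * inr)))
    """
    linear_predictor = b0 + b1 * bilirubin + b2 * inr
    return 1.0 / (1.0 + math.exp(-linear_predictor))

# Hypothetical coefficients and day-5 measurements, for illustration only.
p = prob_no_infection(bilirubin=60.0, inr=1.4, b0=3.0, b1=-0.02, b2=-1.0)
print(f"estimated probability = {p:.3f}")
```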

  10. Predictors of Placement Stability at the State Level: The Use of Logistic Regression to Inform Practice

    ERIC Educational Resources Information Center

    Courtney, Jon R.; Prophet, Retta

    2011-01-01

    Placement instability is often associated with a number of negative outcomes for children. To gain state level contextual knowledge of factors associated with placement stability/instability, logistic regression was applied to selected variables from the New Mexico Adoption and Foster Care Administrative Reporting System dataset. Predictors…

  11. Classifying machinery condition using oil samples and binary logistic regression

    NASA Astrophysics Data System (ADS)

    Phillips, J.; Cripps, E.; Lau, John W.; Hodkiewicz, M. R.

    2015-08-01

    The era of big data has resulted in an explosion of condition monitoring information. The result is an increasing motivation to automate the costly and time-consuming human elements involved in the classification of machine health. When working with industry it is important to build an understanding and hence some trust in the classification scheme for those who use the analysis to initiate maintenance tasks. Typically, "black box" approaches such as artificial neural networks (ANN) and support vector machines (SVM) can be difficult to interpret. In contrast, this paper argues that logistic regression offers easy interpretability to industry experts, providing insight into the drivers of the human classification process and into the ramifications of potential misclassification. Of course, accuracy is of foremost importance in any automated classification scheme, so we also provide a comparative study based on predictive performance of logistic regression, ANN and SVM. A real-world oil analysis data set from engines on mining trucks is presented, and using cross-validation we demonstrate that logistic regression out-performs the ANN and SVM approaches in terms of prediction for healthy/not healthy engines.

  12. Length bias correction in gene ontology enrichment analysis using logistic regression.

    PubMed

    Mi, Gu; Di, Yanming; Emerson, Sarah; Cumbie, Jason S; Chang, Jeff H

    2012-01-01

    When assessing differential gene expression from RNA sequencing data, commonly used statistical tests tend to have greater power to detect differential expression of genes encoding longer transcripts. This phenomenon, called "length bias", will influence subsequent analyses such as Gene Ontology enrichment analysis. In the presence of length bias, Gene Ontology categories that include longer genes are more likely to be identified as enriched. These categories, however, are not necessarily biologically more relevant. We show that one can effectively adjust for length bias in Gene Ontology analysis by including transcript length as a covariate in a logistic regression model. The logistic regression model makes the statistical issue underlying length bias more transparent: transcript length becomes a confounding factor when it correlates with both the Gene Ontology membership and the significance of the differential expression test. The inclusion of the transcript length as a covariate allows one to investigate the direct correlation between the Gene Ontology membership and the significance of testing differential expression, conditional on the transcript length. We present both real and simulated data examples to show that the logistic regression approach is simple, effective, and flexible.

  13. Matched samples logistic regression in case-control studies with missing values: when to break the matches.

    PubMed

    Hansson, Lisbeth; Khamis, Harry J

    2008-12-01

    Simulated data sets are used to evaluate conditional and unconditional maximum likelihood estimation in an individual case-control design with continuous covariates when there are different rates of excluded cases and different levels of other design parameters. The effectiveness of the estimation procedures is measured by method bias, variance of the estimators, root mean square error (RMSE) for logistic regression and the percentage of explained variation. Conditional estimation leads to higher RMSE than unconditional estimation in the presence of missing observations, especially for 1:1 matching. The RMSE is higher for the smaller stratum size, especially for the 1:1 matching. The percentage of explained variation appears to be insensitive to missing data, but is generally higher for the conditional estimation than for the unconditional estimation. It is particularly good for the 1:2 matching design. For minimizing RMSE, a high matching ratio is recommended; in this case, conditional and unconditional logistic regression models yield comparable levels of effectiveness. For maximizing the percentage of explained variation, the 1:2 matching design with the conditional logistic regression model is recommended.

  14. Label-noise resistant logistic regression for functional data classification with an application to Alzheimer's disease study.

    PubMed

    Lee, Seokho; Shin, Hyejin; Lee, Sang Han

    2016-12-01

    Alzheimer's disease (AD) is usually diagnosed by clinicians through cognitive and functional performance tests, with a potential risk of misdiagnosis. Since the progression of AD is known to cause structural changes in the corpus callosum (CC), the CC thickness can be used as a functional covariate in the AD classification problem for a diagnosis. However, misclassified class labels negatively impact the classification performance. Motivated by AD-CC association studies, we propose a logistic regression for functional data classification that is robust to misdiagnosis or label noise. Specifically, our logistic regression model is constructed by adding individual intercepts to the functional logistic regression model. This approach indicates which observations are possibly mislabeled and also leads to a robust and efficient classifier. An effective algorithm based on the MM algorithm provides simple closed-form update formulas. We test our method using synthetic datasets to demonstrate its superiority over an existing method, and apply it to differentiating patients with AD from healthy normals based on CC from MRI. © 2016, The International Biometric Society.

  15. The Effect of Latent Binary Variables on the Uncertainty of the Prediction of a Dichotomous Outcome Using Logistic Regression Based Propensity Score Matching.

    PubMed

    Szekér, Szabolcs; Vathy-Fogarassy, Ágnes

    2018-01-01

    Logistic regression based propensity score matching is a widely used method in case-control studies to select the individuals of the control group. This method creates a suitable control group if all factors affecting the output variable are known. However, if relevant latent variables exist that are not taken into account during the calculations, the quality of the control group is uncertain. In this paper, we present a statistics-based study in which we try to determine the relationship between the accuracy of the logistic regression model and the uncertainty of the dependent variable of the control group defined by propensity score matching. Our analyses show that there is a linear correlation between the fit of the logistic regression model and the uncertainty of the output variable. In certain cases, a latent binary explanatory variable can result in a relative error of up to 70% in the prediction of the outcome variable. The observed phenomenon draws the attention of analysts to an important point, which must be taken into account when drawing conclusions.

  16. Logistic regression for circular data

    NASA Astrophysics Data System (ADS)

    Al-Daffaie, Kadhem; Khan, Shahjahan

    2017-05-01

    This paper considers the relationship between a binary response and a circular predictor. It develops the logistic regression model by employing the linear-circular regression approach. The maximum likelihood method is used to estimate the parameters. The Newton-Raphson numerical method is used to find the estimated values of the parameters. A data set from weather records of Toowoomba city is analysed by the proposed methods. Moreover, a simulation study is considered. The R software is used for all computations and simulations.
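
The abstract above names two ingredients: a logistic model for a binary response with a circular predictor, and Newton-Raphson maximum likelihood estimation. A minimal sketch of that combination follows. It assumes the common linear-circular form logit(p) = b0 + b1*cos(theta) + b2*sin(theta), which may differ from the paper's exact parameterization, and it is written in Python rather than the R used by the authors. The function names and the toy data are hypothetical, chosen only for illustration.

```python
import math

def design_row(theta):
    # Assumed linear-circular form: intercept, cos(theta), sin(theta); theta in radians.
    return [1.0, math.cos(theta), math.sin(theta)]

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def solve(a, b):
    # Gaussian elimination with partial pivoting for a small dense linear system.
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        rest = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - rest) / m[i][i]
    return x

def fit_newton_raphson(thetas, ys, iterations=25, tol=1e-10):
    # Maximizes the Bernoulli log-likelihood: beta <- beta + (X'WX)^-1 X'(y - p).
    rows = [design_row(t) for t in thetas]
    k = 3
    beta = [0.0] * k
    for _ in range(iterations):
        grad = [0.0] * k
        info = [[0.0] * k for _ in range(k)]
        for x, y in zip(rows, ys):
            p = sigmoid(sum(b * xi for b, xi in zip(beta, x)))
            w = p * (1.0 - p)
            for i in range(k):
                grad[i] += (y - p) * x[i]
                for j in range(k):
                    info[i][j] += w * x[i] * x[j]
        step = solve(info, grad)
        beta = [b + s for b, s in zip(beta, step)]
        if max(abs(s) for s in step) < tol:
            break
    return beta

# Hypothetical toy data: 4 observations at each of 8 directions,
# with more positive responses near theta = 0.
angles_deg = [0, 45, 90, 135, 180, 225, 270, 315]
ones_per_angle = [3, 3, 2, 1, 1, 1, 2, 3]
thetas, ys = [], []
for deg, ones in zip(angles_deg, ones_per_angle):
    for k in range(4):
        thetas.append(math.radians(deg))
        ys.append(1 if k < ones else 0)

beta = fit_newton_raphson(thetas, ys)
```

Using cosine and sine terms keeps the fitted probability periodic in theta, so 0 and 2*pi give the same prediction; that is the main reason a raw angle is not entered as an ordinary linear predictor.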

  17. Naval Research Logistics Quarterly. Volume 28. Number 3,

    DTIC Science & Technology

    1981-09-01

    denotes component-wise maximum. f has antitone (isotone) differences on C x D if for c1 < c2 and d1 < d2, ...or negative correlations and linear or nonlinear regressions. Given are the moments to order two and, for special cases, the regression function and...data sets. We designate this bnb distribution as G - B - N(a, 0, v). The distribution admits only of positive correlation and linear regressions

  18. Household and familial resemblance in risk factors for type 2 diabetes and related cardiometabolic diseases in rural Uganda: a cross-sectional community sample.

    PubMed

    Nielsen, Jannie; Bahendeka, Silver K; Whyte, Susan R; Meyrowitsch, Dan W; Bygbjerg, Ib C; Witte, Daniel R

    2017-09-21

    Prevention of type 2 diabetes (T2D) has been successfully established in randomised clinical trials. However, the best methods for the translation of this evidence into effective population-wide interventions remain unclear. To assess whether households could be a target for T2D prevention and screening, we investigated the resemblance of T2D risk factors at household level and by type of familial dyadic relationship in a rural Ugandan community. This cross-sectional household-based study included 437 individuals ≥13 years of age from 90 rural households in south-western Uganda. Resemblance in glycosylated haemoglobin (HbA1c), anthropometry, blood pressure, fitness status and sitting time was analysed using a general mixed model with random effects (by household or dyad) to calculate household intraclass correlation coefficients (ICCs) and dyadic regression coefficients. Logistic regression with household as a random effect was used to calculate the ORs for individuals having a condition or risk factor if another household member had the same condition. The strongest degree of household member resemblances in T2D risk factors was seen in relation to fitness status (ICC=0.24), HbA1c (ICC=0.18) and systolic blood pressure (ICC=0.11). Regarding dyadic resemblance, the highest standardised regression coefficient was seen in fitness status for spouses (0.54, 95% CI 0.32 to 0.76), parent-offspring (0.41, 95% CI 0.28 to 0.54) and siblings (0.41, 95% CI 0.25 to 0.57). Overall, parent-offspring and sibling pairs were the dyads with strongest resemblance, followed by spouses. The marked degree of resemblance in T2D risk factors at household level and between spouses, parent-offspring and sibling dyads suggests that shared behavioural and environmental factors may influence risk factor levels among cohabiting individuals, which points to the potential of the household setting for screening and prevention of T2D.
© Article author(s) (or their employer(s) unless otherwise stated in the text of the article) 2017. All rights reserved. No commercial use is permitted unless otherwise expressly granted.

  19. Regression approaches in the test-negative study design for assessment of influenza vaccine effectiveness.

    PubMed

    Bond, H S; Sullivan, S G; Cowling, B J

    2016-06-01

    Influenza vaccination is the most practical means available for preventing influenza virus infection and is widely used in many countries. Because vaccine components and circulating strains frequently change, it is important to continually monitor vaccine effectiveness (VE). The test-negative design is frequently used to estimate VE. In this design, patients meeting the same clinical case definition are recruited and tested for influenza; those who test positive are the cases and those who test negative form the comparison group. When determining VE in these studies, the typical approach has been to use logistic regression, adjusting for potential confounders. Because vaccine coverage and influenza incidence change throughout the season, time is included among these confounders. While most studies use unconditional logistic regression, adjusting for time, an alternative approach is to use conditional logistic regression, matching on time. Here, we used simulation data to examine the potential for both regression approaches to permit accurate and robust estimates of VE. In situations where vaccine coverage changed during the influenza season, the conditional model and unconditional models adjusting for categorical week and using a spline function for week provided more accurate estimates. We illustrated the two approaches on data from a test-negative study of influenza VE against hospitalization in children in Hong Kong which resulted in the conditional logistic regression model providing the best fit to the data.

  20. Neonatal brain structure on MRI and diffusion tensor imaging, sex, and neurodevelopment in very-low-birthweight preterm children.

    PubMed

    Rose, Jessica; Butler, Erin E; Lamont, Lauren E; Barnes, Patrick D; Atlas, Scott W; Stevenson, David K

    2009-07-01

    The neurological basis of an increased incidence of cerebral palsy (CP) in preterm males is unknown. This study examined neonatal brain structure on magnetic resonance imaging (MRI) and diffusion tensor imaging (DTI) at term-equivalent age, sex, and neurodevelopment at 1 year 6 months on the basis of the Amiel-Tison neurological examination, Gross Motor Function Classification System, and Bayley Scales of Infant Development in 78 very-low-birthweight preterm children (41 males, 37 females; mean gestational age 27.6 wks, SD 2.5; mean birthweight 1021 g, SD 339). Brain abnormalities on MRI and DTI were not different between males and females except in the splenium of the corpus callosum, where males had lower DTI fractional anisotropy (p=0.025) and a higher apparent diffusion coefficient (p=0.013), indicating delayed splenium development. In the 26 infants who were at higher risk on the basis of DTI, males had more abnormalities on MRI (p=0.034) and had lower fractional anisotropy and a higher apparent diffusion coefficient in the splenium (p=0.049; p=0.025) and right posterior limb of the internal capsule (PLIC; p=0.003; p=0.033). Abnormal neurodevelopment was more common in males (n=9) than in females (n=2; p=0.036). Children with abnormal neurodevelopment had more abnormalities on MRI (p=0.014) and reduced splenium and right PLIC fractional anisotropy (p=0.001; p=0.035). In children with abnormal neurodevelopment, right PLIC fractional anisotropy was lower than left (p=0.035), whereas in those with normal neurodevelopment right PLIC fractional anisotropy was higher than left (p=0.001). Right PLIC fractional anisotropy correlated to neurodevelopment (rho=0.371, p=0.002). Logistic regression predicted neurodevelopment with 94% accuracy; only right PLIC fractional anisotropy was a significant logistic coefficient. Results indicate that the higher incidence of abnormal neurodevelopment in preterm males relates to greater incidence and severity of brain abnormalities, including reduced PLIC and splenium development.

  1. A novel hybrid method of beta-turn identification in protein using binary logistic regression and neural network

    PubMed Central

    Asghari, Mehdi Poursheikhali; Hayatshahi, Sayyed Hamed Sadat; Abdolmaleki, Parviz

    2012-01-01

    From both the structural and functional points of view, β-turns play important biological roles in proteins. In the present study, a novel two-stage hybrid procedure has been developed to identify β-turns in proteins. Binary logistic regression was used, for the first time, to select significant sequence parameters for the identification of β-turns using a re-substitution test procedure. Sequence parameters consisted of 80 amino acid positional occurrences and 20 amino acid percentages in sequence. Among these parameters, the most significant ones selected by the binary logistic regression model were the percentages of Gly and Ser and the occurrence of Asn in position i+2, respectively, in sequence. These significant parameters have the highest effect on the constitution of a β-turn sequence. A neural network model was then constructed and fed with the parameters selected by binary logistic regression to build a hybrid predictor. The networks have been trained and tested on a non-homologous dataset of 565 protein chains. Applying a nine-fold cross-validation test to the dataset, the network reached an overall accuracy (Qtotal) of 74, which is comparable with results of the other β-turn prediction methods. In conclusion, this study proves that the parameter selection ability of binary logistic regression together with the prediction capability of neural networks leads to the development of more precise models for identifying β-turns in proteins. PMID:27418910

  2. A novel hybrid method of beta-turn identification in protein using binary logistic regression and neural network.

    PubMed

    Asghari, Mehdi Poursheikhali; Hayatshahi, Sayyed Hamed Sadat; Abdolmaleki, Parviz

    2012-01-01

    From both the structural and functional points of view, β-turns play important biological roles in proteins. In the present study, a novel two-stage hybrid procedure has been developed to identify β-turns in proteins. Binary logistic regression was used, for the first time, to select significant sequence parameters for the identification of β-turns using a re-substitution test procedure. Sequence parameters consisted of 80 amino acid positional occurrences and 20 amino acid percentages in sequence. Among these parameters, the most significant ones selected by the binary logistic regression model were the percentages of Gly and Ser and the occurrence of Asn in position i+2, respectively, in sequence. These significant parameters have the highest effect on the constitution of a β-turn sequence. A neural network model was then constructed and fed with the parameters selected by binary logistic regression to build a hybrid predictor. The networks have been trained and tested on a non-homologous dataset of 565 protein chains. Applying a nine-fold cross-validation test to the dataset, the network reached an overall accuracy (Qtotal) of 74, which is comparable with results of the other β-turn prediction methods. In conclusion, this study proves that the parameter selection ability of binary logistic regression together with the prediction capability of neural networks leads to the development of more precise models for identifying β-turns in proteins.

  3. Conditional Poisson models: a flexible alternative to conditional logistic case cross-over analysis.

    PubMed

    Armstrong, Ben G; Gasparrini, Antonio; Tobias, Aurelio

    2014-11-24

    The time stratified case cross-over approach is a popular alternative to conventional time series regression for analysing associations between time series of environmental exposures (air pollution, weather) and counts of health outcomes. These are almost always analyzed using conditional logistic regression on data expanded to case-control (case crossover) format, but this has some limitations. In particular, adjusting for overdispersion and auto-correlation in the counts is not possible. It has been established that a Poisson model for counts with stratum indicators gives identical estimates to those from conditional logistic regression and does not have these limitations, but it is little used, probably because of the overheads in estimating many stratum parameters. The conditional Poisson model avoids estimating stratum parameters by conditioning on the total event count in each stratum, thus simplifying the computing and increasing the number of strata for which fitting is feasible compared with the standard unconditional Poisson model. Unlike the conditional logistic model, the conditional Poisson model does not require expanding the data, and can adjust for overdispersion and auto-correlation. It is available in Stata, R, and other packages. By applying the models to some real data and using simulations, we demonstrate that conditional Poisson models were simpler to code and shorter to run than conditional logistic analyses and can be fitted to larger data sets than is possible with standard Poisson models. Allowing for overdispersion or autocorrelation was possible with the conditional Poisson model, but when not required this model gave identical estimates to those from conditional logistic regression. Conditional Poisson regression models provide an alternative to case crossover analysis of stratified time series data with some advantages. The conditional Poisson model can also be used in other contexts in which primary control for confounding is by fine stratification.

  4. Use of generalized ordered logistic regression for the analysis of multidrug resistance data.

    PubMed

    Agga, Getahun E; Scott, H Morgan

    2015-10-01

    Statistical analysis of antimicrobial resistance data largely focuses on individual antimicrobial's binary outcome (susceptible or resistant). However, bacteria are becoming increasingly multidrug resistant (MDR). Statistical analysis of MDR data is mostly descriptive often with tabular or graphical presentations. Here we report the applicability of generalized ordinal logistic regression model for the analysis of MDR data. A total of 1,152 Escherichia coli, isolated from the feces of weaned pigs experimentally supplemented with chlortetracycline (CTC) and copper, were tested for susceptibilities against 15 antimicrobials and were binary classified into resistant or susceptible. The 15 antimicrobial agents tested were grouped into eight different antimicrobial classes. We defined MDR as the number of antimicrobial classes to which E. coli isolates were resistant ranging from 0 to 8. Proportionality of the odds assumption of the ordinal logistic regression model was violated only for the effect of treatment period (pre-treatment, during-treatment and post-treatment); but not for the effect of CTC or copper supplementation. Subsequently, a partially constrained generalized ordinal logistic model was built that allows for the effect of treatment period to vary while constraining the effects of treatment (CTC and copper supplementation) to be constant across the levels of MDR classes. Copper (Proportional Odds Ratio [Prop OR]=1.03; 95% CI=0.73-1.47) and CTC (Prop OR=1.1; 95% CI=0.78-1.56) supplementation were not significantly associated with the level of MDR adjusted for the effect of treatment period. MDR generally declined over the trial period. In conclusion, generalized ordered logistic regression can be used for the analysis of ordinal data such as MDR data when the proportionality assumptions for ordered logistic regression are violated. Published by Elsevier B.V.

  5. Artificial neural networks predict the incidence of portosplenomesenteric venous thrombosis in patients with acute pancreatitis.

    PubMed

    Fei, Y; Hu, J; Li, W-Q; Wang, W; Zong, G-Q

    2017-03-01

    Essentials: Predicting the occurrence of portosplenomesenteric vein thrombosis (PSMVT) is difficult. We studied 72 patients with acute pancreatitis. Artificial neural networks modeling was more accurate than logistic regression in predicting PSMVT. Additional predictive factors may be incorporated into artificial neural networks. Objective: To construct and validate artificial neural networks (ANNs) for predicting the occurrence of portosplenomesenteric venous thrombosis (PSMVT) and compare the predictive ability of the ANNs with that of logistic regression. Methods: The ANNs and logistic regression modeling were constructed using simple clinical and laboratory data of 72 acute pancreatitis (AP) patients. The ANNs and logistic modeling were first trained on 48 randomly chosen patients and validated on the remaining 24 patients. The accuracy and the performance characteristics were compared between these two approaches using SPSS 17.0 software. Results: The training set and validation set did not differ on any of the 11 variables. After training, the back propagation network training error converged to 1 × 10^-20, and it retained excellent pattern recognition ability. When the ANNs model was applied to the validation set, it revealed a sensitivity of 80%, specificity of 85.7%, a positive predictive value of 77.6% and negative predictive value of 90.7%. The accuracy was 83.3%. Differences could be found between ANNs modeling and logistic regression modeling in these parameters (10.0% [95% CI, -14.3 to 34.3%], 14.3% [95% CI, -8.6 to 37.2%], 15.7% [95% CI, -9.9 to 41.3%], 11.8% [95% CI, -8.2 to 31.8%], 22.6% [95% CI, -1.9 to 47.1%], respectively). When ANNs modeling was used to identify PSMVT, the area under receiver operating characteristic curve was 0.849 (95% CI, 0.807-0.901), which demonstrated better overall properties than logistic regression modeling (AUC = 0.716) (95% CI, 0.679-0.761). Conclusions: ANNs modeling was a more accurate tool than logistic regression in predicting the occurrence of PSMVT following AP. More clinical factors or biomarkers may be incorporated into ANNs modeling to improve its predictive ability. © 2016 International Society on Thrombosis and Haemostasis.
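
The performance measures quoted in the abstract above (sensitivity, specificity, positive and negative predictive value, accuracy) are all ratios of confusion-matrix counts. The sketch below shows the standard definitions. The function name is an assumption, and the counts are hypothetical values for a 24-patient validation set; they are not taken from the study.

```python
def diagnostic_metrics(tp, fp, tn, fn):
    """Standard classifier performance measures from confusion-matrix counts."""
    return {
        "sensitivity": tp / (tp + fn),   # true positives among all actual positives
        "specificity": tn / (tn + fp),   # true negatives among all actual negatives
        "ppv": tp / (tp + fp),           # positive predictive value
        "npv": tn / (tn + fn),           # negative predictive value
        "accuracy": (tp + tn) / (tp + fp + tn + fn),
    }

# Hypothetical counts for illustration only (24 patients in total).
metrics = diagnostic_metrics(tp=8, fp=2, tn=12, fn=2)
```

Sensitivity and specificity depend only on how the classifier treats each true class, while the predictive values also depend on how common the condition is in the validation set, so the two pairs can move in different directions between studies.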

  6. PREDICTION OF MALIGNANT BREAST LESIONS FROM MRI FEATURES: A COMPARISON OF ARTIFICIAL NEURAL NETWORK AND LOGISTIC REGRESSION TECHNIQUES

    PubMed Central

    McLaren, Christine E.; Chen, Wen-Pin; Nie, Ke; Su, Min-Ying

    2009-01-01

    Rationale and Objectives: Dynamic contrast enhanced MRI (DCE-MRI) is a clinical imaging modality for detection and diagnosis of breast lesions. Analytical methods were compared for diagnostic feature selection and performance of lesion classification to differentiate between malignant and benign lesions in patients. Materials and Methods: The study included 43 malignant and 28 benign histologically-proven lesions. Eight morphological parameters, ten gray level co-occurrence matrices (GLCM) texture features, and fourteen Laws’ texture features were obtained using automated lesion segmentation and quantitative feature extraction. Artificial neural network (ANN) and logistic regression analysis were compared for selection of the best predictors of malignant lesions among the normalized features. Results: Using ANN, the final four selected features were compactness, energy, homogeneity, and Law_LS, with area under the receiver operating characteristic curve (AUC) = 0.82, and accuracy = 0.76. The diagnostic performance of these four features computed on the basis of logistic regression yielded AUC = 0.80 (95% CI, 0.688 to 0.905), similar to that of ANN. The analysis also shows that the odds of a malignant lesion decreased by 48% (95% CI, 25% to 92%) for every increase of 1 SD in the Law_LS feature, adjusted for differences in compactness, energy, and homogeneity. Using logistic regression with z-score transformation, a model comprised of compactness, NRL entropy, and gray level sum average was selected, and it had the highest overall accuracy of 0.75 among all models, with AUC = 0.77 (95% CI, 0.660 to 0.880). When logistic modeling of transformations using the Box-Cox method was performed, the most parsimonious model with predictors, compactness and Law_LS, had an AUC of 0.79 (95% CI, 0.672 to 0.898). Conclusion: The diagnostic performance of models selected by ANN and logistic regression was similar. The analytic methods were found to be roughly equivalent in terms of predictive ability when a small number of variables were chosen. The robust ANN methodology utilizes a sophisticated non-linear model, while logistic regression analysis provides insightful information to enhance interpretation of the model features. PMID:19409817
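
The abstract above reports a 48% decrease in the odds of malignancy per 1 SD increase in Law_LS. The arithmetic that links a logistic regression coefficient to such a percentage can be sketched as follows. The coefficient used here is back-calculated from the reported 48% figure and is an assumption for illustration; the abstract does not give the fitted coefficient itself, and the function name is hypothetical.

```python
import math

def percent_change_in_odds(beta, delta=1.0):
    """Percent change in the odds when a predictor increases by `delta` units.

    In logistic regression, exp(beta * delta) is the odds ratio for that increase.
    """
    return (math.exp(beta * delta) - 1.0) * 100.0

# Back-calculated, not reported: an odds ratio of 0.52 per 1 SD corresponds to
# a coefficient of ln(0.52) on the standardized feature.
beta_law_ls = math.log(1.0 - 0.48)

change_per_sd = percent_change_in_odds(beta_law_ls)         # one SD increase
change_per_2sd = percent_change_in_odds(beta_law_ls, 2.0)   # two SD increase
```

Because the effect is multiplicative on the odds scale, a 2 SD increase does not double the percentage: the odds ratio is 0.52 squared, not 2 times 0.52.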

  7. Logistic regression analysis of factors associated with avascular necrosis of the femoral head following femoral neck fractures in middle-aged and elderly patients.

    PubMed

    Ai, Zi-Sheng; Gao, You-Shui; Sun, Yuan; Liu, Yue; Zhang, Chang-Qing; Jiang, Cheng-Hua

    2013-03-01

    Risk factors for femoral neck fracture-induced avascular necrosis of the femoral head have not been elucidated clearly in middle-aged and elderly patients. Moreover, the high incidence of screw removal in China and its effect on the fate of the involved femoral head require statistical methods to reflect their intrinsic relationship. Ninety-nine patients older than 45 years with femoral neck fracture were treated by internal fixation between May 1999 and April 2004. Descriptive analysis, interaction analysis between associated factors, single factor logistic regression, multivariate logistic regression, and detailed interaction analysis were employed to explore potential relationships among associated factors. Avascular necrosis of the femoral head was found in 15 cases (15.2%). Age × the status of implants (removal vs. maintenance) and gender × the timing of reduction were interactive according to two-factor interactive analysis. Age, the displacement of fractures, the quality of reduction, and the status of implants were found to be significant factors in single factor logistic regression analysis. Age, age × the status of implants, and the quality of reduction were found to be significant factors in multivariate logistic regression analysis. In fine interaction analysis after multivariate logistic regression analysis, implant removal was the most important risk factor for avascular necrosis in 56-to-85-year-old patients, with a risk ratio of 26.00 (95% CI = 3.076-219.747). The middle-aged and elderly have a lower incidence of avascular necrosis of the femoral head following femoral neck fractures treated by cannulated screws. The removal of cannulated screws can induce a significantly high incidence of avascular necrosis of the femoral head in elderly patients, while a high-quality reduction is helpful to reduce avascular necrosis.

  8. Multivariate logistic regression analysis of postoperative complications and risk model establishment of gastrectomy for gastric cancer: A single-center cohort report.

    PubMed

    Zhou, Jinzhe; Zhou, Yanbing; Cao, Shougen; Li, Shikuan; Wang, Hao; Niu, Zhaojian; Chen, Dong; Wang, Dongsheng; Lv, Liang; Zhang, Jian; Li, Yu; Jiao, Xuelong; Tan, Xiaojie; Zhang, Jianli; Wang, Haibo; Zhang, Bingyuan; Lu, Yun; Sun, Zhenqing

    2016-01-01

    Reports of surgical complications are common, but few provide information about severity or estimate the risk factors for complications, and those that do lack specificity. We retrospectively analyzed data on 2795 gastric cancer patients who underwent surgery at the Affiliated Hospital of Qingdao University between June 2007 and June 2012 and established a multivariate logistic regression model to predict risk factors related to postoperative complications according to the Clavien-Dindo classification system. Twenty-four of 86 variables were identified as statistically significant in univariate logistic regression analysis; the 11 significant variables that entered the multivariate analysis were used to produce the risk model. Liver cirrhosis, diabetes mellitus, Child classification, invasion of neighboring organs, combined resection, intraoperative transfusion, Billroth II anastomosis of reconstruction, malnutrition, surgical volume of surgeons, operating time and age were independent risk factors for postoperative complications after gastrectomy. Based on the logistic regression equation, p = exp(ΣBiXi) / (1 + exp(ΣBiXi)), a multivariate logistic regression predictive model that calculates the risk of postoperative morbidity was developed: p = 1 / (1 + e^(4.810 - 1.287X1 - 0.504X2 - 0.500X3 - 0.474X4 - 0.405X5 - 0.318X6 - 0.316X7 - 0.305X8 - 0.278X9 - 0.255X10 - 0.138X11)). The accuracy, sensitivity and specificity of the model in predicting postoperative complications were 86.7%, 76.2% and 88.6%, respectively. This risk model, based on the Clavien-Dindo system for grading the severity of complications and on logistic regression analysis, can predict severe morbidity specific to an individual patient's risk factors, estimate patients' risks and benefits of gastric surgery as an accurate decision-making tool, and may serve as a template for the development of risk models for other surgical groups.
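
The risk equation in the abstract above can be evaluated directly. The sketch below copies the intercept and the eleven coefficients from that equation. The abstract does not state which coefficient belongs to which risk factor or how each variable is coded, so the input vector is treated here as an already-coded list X1..X11; the function name and the 0/1 example inputs are assumptions for illustration, not a clinical tool.

```python
import math

# Taken from the abstract's equation p = 1 / (1 + e^(4.810 - 1.287*X1 - ... - 0.138*X11)),
# rewritten as a linear predictor: -4.810 + 1.287*X1 + ... + 0.138*X11.
INTERCEPT = -4.810
COEFFICIENTS = [1.287, 0.504, 0.500, 0.474, 0.405, 0.318,
                0.316, 0.305, 0.278, 0.255, 0.138]

def predicted_risk(x):
    """Predicted probability of postoperative complications for coded inputs X1..X11."""
    if len(x) != len(COEFFICIENTS):
        raise ValueError("expected 11 coded predictor values")
    linear = INTERCEPT + sum(b * xi for b, xi in zip(COEFFICIENTS, x))
    return 1.0 / (1.0 + math.exp(-linear))

# Hypothetical coded inputs: no risk factors present, and all coded as 1.
risk_none = predicted_risk([0] * 11)
risk_all = predicted_risk([1] * 11)
```

Writing the model as intercept plus a weighted sum makes the sign convention explicit: every coefficient is positive, so each factor coded as present raises the predicted risk.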

  9. Comparing Regression Coefficients between Nested Linear Models for Clustered Data with Generalized Estimating Equations

    ERIC Educational Resources Information Center

    Yan, Jun; Aseltine, Robert H., Jr.; Harel, Ofer

    2013-01-01

    Comparing regression coefficients between models when one model is nested within another is of great practical interest when two explanations of a given phenomenon are specified as linear models. The statistical problem is whether the coefficients associated with a given set of covariates change significantly when other covariates are added into…

  10. Rank-Optimized Logistic Matrix Regression toward Improved Matrix Data Classification.

    PubMed

    Zhang, Jianguang; Jiang, Jianmin

    2018-02-01

    While existing logistic regression suffers from overfitting and often fails to consider structural information, we propose a novel matrix-based logistic regression to overcome the weakness. In the proposed method, 2D matrices are directly used to learn two groups of parameter vectors along each dimension without vectorization, which allows the proposed method to fully exploit the underlying structural information embedded inside the 2D matrices. Further, we add a joint [Formula: see text]-norm on two parameter matrices, which are organized by aligning each group of parameter vectors in columns. This added co-regularization term has two roles: enhancing the effect of regularization and optimizing the rank during the learning process. With our proposed fast iterative solution, we carried out extensive experiments. The results show that in comparison to both the traditional tensor-based methods and the vector-based regression methods, our proposed solution achieves better performance for matrix data classifications.

  11. Detecting DIF in Polytomous Items Using MACS, IRT and Ordinal Logistic Regression

    ERIC Educational Resources Information Center

    Elosua, Paula; Wells, Craig

    2013-01-01

    The purpose of the present study was to compare the Type I error rate and power of two model-based procedures, the mean and covariance structure model (MACS) and the item response theory (IRT), and an observed-score based procedure, ordinal logistic regression, for detecting differential item functioning (DIF) in polytomous items. A simulation…

  12. Accuracy of Bayes and Logistic Regression Subscale Probabilities for Educational and Certification Tests

    ERIC Educational Resources Information Center

    Rudner, Lawrence

    2016-01-01

    In the machine learning literature, it is commonly accepted as fact that as calibration sample sizes increase, Naïve Bayes classifiers initially outperform Logistic Regression classifiers in terms of classification accuracy. Applied to subtests from an on-line final examination and from a highly regarded certification examination, this study shows…

  13. Comparing Linear Discriminant Function with Logistic Regression for the Two-Group Classification Problem.

    ERIC Educational Resources Information Center

    Fan, Xitao; Wang, Lin

    The Monte Carlo study compared the performance of predictive discriminant analysis (PDA) and that of logistic regression (LR) for the two-group classification problem. Prior probabilities were used for classification, but the cost of misclassification was assumed to be equal. The study used a fully crossed three-factor experimental design (with…

  14. Effects of Social Class and School Conditions on Educational Enrollment and Achievement of Boys and Girls in Rural Viet Nam

    ERIC Educational Resources Information Center

    Nguyen, Phuong L.

    2006-01-01

    This study examines the effects of parental SES, school quality, and community factors on children's enrollment and achievement in rural areas in Viet Nam, using logistic regression and ordered logistic regression. Multivariate analysis reveals significant differences in educational enrollment and outcomes by level of household expenditures and…

  15. School Exits in the Milwaukee Parental Choice Program: Evidence of a Marketplace?

    ERIC Educational Resources Information Center

    Ford, Michael

    2011-01-01

    This article examines whether the large number of school exits from the Milwaukee school voucher program is evidence of a marketplace. Two logistic regression and multinomial logistic regression models tested the relation between the inability to draw large numbers of voucher students and the ability for a private school to remain viable. Data on…

  16. Model building strategy for logistic regression: purposeful selection.

    PubMed

    Zhang, Zhongheng

    2016-03-01

    Logistic regression is one of the most commonly used models to account for confounders in the medical literature. The article introduces how to perform the purposeful selection model-building strategy with R. I stress the use of the likelihood ratio test to see whether deleting a variable will have a significant impact on model fit. A deleted variable should also be checked for whether it is an important adjustment of the remaining covariates. Interactions should be checked to disentangle complex relationships between covariates and their synergistic effect on the response variable. The model should be checked for goodness-of-fit (GOF), in other words, how well the fitted model reflects the real data. The Hosmer-Lemeshow GOF test is the most widely used for logistic regression models.
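
    The likelihood ratio test mentioned in this abstract compares the log-likelihoods of two nested models. The article itself works in R; the sketch below uses Python only to illustrate the arithmetic, with hypothetical function names and made-up log-likelihood values. It assumes the usual large-sample chi-square reference distribution.

```python
import math


def chi2_sf(x, df):
    """Upper-tail probability of a chi-square variable with integer df >= 1."""
    if x <= 0:
        return 1.0
    y = x / 2.0
    if df % 2 == 0:
        # Even df: exp(-y) * sum_{k=0}^{df/2-1} y^k / k!
        term, total = 1.0, 1.0
        for k in range(1, df // 2):
            term *= y / k
            total += term
        return math.exp(-y) * total
    # Odd df: erfc(sqrt(y)) + exp(-y) * sum_{k=1}^{(df-1)/2} y^(k-1/2) / Gamma(k+1/2)
    total = math.erfc(math.sqrt(y))
    term = math.sqrt(y) / math.gamma(1.5)
    for k in range(1, (df - 1) // 2 + 1):
        if k > 1:
            term *= y / (k - 0.5)
        total += math.exp(-y) * term
    return total


def likelihood_ratio_test(loglik_reduced, loglik_full, df):
    """Return (statistic, p-value); df = number of parameters dropped."""
    statistic = max(0.0, 2.0 * (loglik_full - loglik_reduced))
    return statistic, chi2_sf(statistic, df)


# Hypothetical values: dropping one covariate lowers the log-likelihood by 5.
stat, p_value = likelihood_ratio_test(-105.0, -100.0, df=1)
```

    A small p-value suggests that the deleted variable contributed to model fit and is a candidate for being kept.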

  17. Tools to Support Interpreting Multiple Regression in the Face of Multicollinearity

    PubMed Central

    Kraha, Amanda; Turner, Heather; Nimon, Kim; Zientek, Linda Reichwein; Henson, Robin K.

    2012-01-01

    While multicollinearity may increase the difficulty of interpreting multiple regression (MR) results, it should not cause undue problems for the knowledgeable researcher. In the current paper, we argue that rather than using one technique to investigate regression results, researchers should consider multiple indices to understand the contributions that predictors make not only to a regression model, but to each other as well. Some of the techniques to interpret MR effects include, but are not limited to, correlation coefficients, beta weights, structure coefficients, all possible subsets regression, commonality coefficients, dominance weights, and relative importance weights. This article will review a set of techniques to interpret MR effects, identify the elements of the data on which the methods focus, and identify statistical software to support such analyses. PMID:22457655

  18. Tools to support interpreting multiple regression in the face of multicollinearity.

    PubMed

    Kraha, Amanda; Turner, Heather; Nimon, Kim; Zientek, Linda Reichwein; Henson, Robin K

    2012-01-01

    While multicollinearity may increase the difficulty of interpreting multiple regression (MR) results, it should not cause undue problems for the knowledgeable researcher. In the current paper, we argue that rather than using one technique to investigate regression results, researchers should consider multiple indices to understand the contributions that predictors make not only to a regression model, but to each other as well. Some of the techniques to interpret MR effects include, but are not limited to, correlation coefficients, beta weights, structure coefficients, all possible subsets regression, commonality coefficients, dominance weights, and relative importance weights. This article will review a set of techniques to interpret MR effects, identify the elements of the data on which the methods focus, and identify statistical software to support such analyses.

  19. Meta-analytical synthesis of regression coefficients under different categorization scheme of continuous covariates.

    PubMed

    Yoneoka, Daisuke; Henmi, Masayuki

    2017-11-30

    Recently, the number of clinical prediction models sharing the same regression task has increased in the medical literature. However, evidence synthesis methodologies that use the results of these regression models have not been sufficiently studied, particularly in meta-analysis settings where only regression coefficients are available. One of the difficulties lies in the differences between the categorization schemes of continuous covariates across different studies. In general, categorization methods using cutoff values are study specific across available models, even if they focus on the same covariates of interest. Differences in the categorization of covariates could lead to serious bias in the estimated regression coefficients and thus in subsequent syntheses. To tackle this issue, we developed synthesis methods for linear regression models with different categorization schemes of covariates. A 2-step approach to aggregate the regression coefficient estimates is proposed. The first step is to estimate the joint distribution of covariates by introducing a latent sampling distribution, which uses one set of individual participant data to estimate the marginal distribution of covariates with categorization. The second step is to use a nonlinear mixed-effects model with correction terms for the bias due to categorization to estimate the overall regression coefficients. Especially in terms of precision, numerical simulations show that our approach outperforms conventional methods, which only use studies with common covariates or ignore the differences between categorization schemes. The method developed in this study is also applied to a series of WHO epidemiologic studies on white blood cell counts. Copyright © 2017 John Wiley & Sons, Ltd.

  20. Development of a statistical model for the determination of the probability of riverbank erosion in a Mediterranean river basin

    NASA Astrophysics Data System (ADS)

    Varouchakis, Emmanouil; Kourgialas, Nektarios; Karatzas, George; Giannakis, Georgios; Lilli, Maria; Nikolaidis, Nikolaos

    2014-05-01

    Riverbank erosion affects the river morphology and the local habitat and results in riparian land loss and damage to property and infrastructure, ultimately weakening flood defences. An important issue concerning riverbank erosion is the identification of the areas vulnerable to erosion, as it allows for predicting changes and assists with stream management and restoration. One way to predict the areas vulnerable to erosion is to determine the erosion probability by identifying the underlying relations between riverbank erosion and the geomorphological and/or hydrological variables that prevent or stimulate erosion. A statistical model for evaluating the probability of erosion based on a series of independent local variables and by using logistic regression is developed in this work. The main variables affecting erosion are vegetation index (stability), the presence or absence of meanders, bank material (classification), stream power, bank height, river bank slope, riverbed slope, cross section width and water velocities (Luppi et al. 2009). In statistics, logistic regression is a type of regression analysis used for predicting the outcome of a categorical dependent variable, e.g. binary response, based on one or more predictor variables (continuous or categorical). The probabilities of the possible outcomes are modelled as a function of independent variables using a logistic function. Logistic regression measures the relationship between a categorical dependent variable and, usually, one or several continuous independent variables by converting the dependent variable to probability scores. Then, a logistic regression is formed, which predicts success or failure of a given binary variable (e.g. 1 = "presence of erosion" and 0 = "no erosion") for any value of the independent variables. The regression coefficients are estimated by using maximum likelihood estimation. 
The erosion occurrence probability can be calculated in conjunction with the model deviance regarding the independent variables tested (Atkinson et al. 2003). The developed statistical model is applied to the Koiliaris River Basin on the island of Crete, Greece. The aim is to determine the probability of erosion along the Koiliaris' riverbanks considering a series of independent geomorphological and/or hydrological variables. Data for the river bank slope and for the river cross section width are available at ten locations along the river. The riverbank has indications of erosion at six of the ten locations while four have remained stable. Based on a recent work, measurements for the two independent variables and data regarding bank stability are available at eight different locations along the river. These locations were used as validation points for the proposed statistical model. The results show a very close agreement between the observed erosion indications and the statistical model as the probability of erosion was accurately predicted at seven out of the eight locations. The next step is to apply the model at more locations along the riverbanks. In November 2013, stakes were inserted at selected locations in order to be able to identify the presence or absence of erosion after the winter period. In April 2014 the presence or absence of erosion will be identified and the model results will be compared to the field data. Our intent is to extend the model by increasing the number of independent variables in order to identify the key factors favouring erosion along the Koiliaris River. We aim at developing an easy to use statistical tool that will provide a quantified measure of the erosion probability along the riverbanks, which could consequently be used to prevent erosion and flooding events. Atkinson, P. M., German, S. E., Sear, D. A. and Clark, M. J. 2003. 
Exploring the relations between riverbank erosion and geomorphological controls using geographically weighted logistic regression. Geographical Analysis, 35 (1), 58-82. Luppi, L., Rinaldi, M., Teruggi, L. B., Darby, S. E. and Nardi, L. 2009. Monitoring and numerical modelling of riverbank erosion processes: A case study along the Cecina River (central Italy). Earth Surface Processes and Landforms, 34 (4), 530-546. Acknowledgements This work is part of an on-going THALES project (CYBERSENSORS - High Frequency Monitoring System for Integrated Water Resources Management of Rivers). The project has been co-financed by the European Union (European Social Fund - ESF) and Greek national funds through the Operational Program "Education and Lifelong Learning" of the National Strategic Reference Framework (NSRF) - Research Funding Program: THALES. Investing in knowledge society through the European Social Fund.
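
    The abstract above describes a binary logistic regression whose coefficients are estimated by maximum likelihood. As a minimal sketch of that idea, the code below fits a single-predictor model with Newton-Raphson iterations. The optimizer choice, the function names and the data are all assumptions made for illustration; the abstract does not say which algorithm or software the authors used, and the numbers are not the Koiliaris measurements.

```python
import math


def sigmoid(z):
    """Logistic function mapping a linear predictor to a probability."""
    return 1.0 / (1.0 + math.exp(-z))


def fit_logistic(xs, ys, iterations=25):
    """Maximum likelihood fit of p = sigmoid(b0 + b1 * x) by Newton-Raphson."""
    b0 = b1 = 0.0
    for _ in range(iterations):
        g0 = g1 = h00 = h01 = h11 = 0.0
        for x, y in zip(xs, ys):
            p = sigmoid(b0 + b1 * x)
            w = p * (1.0 - p)
            g0 += y - p            # gradient of the log-likelihood w.r.t. b0
            g1 += (y - p) * x      # gradient w.r.t. b1
            h00 += w               # entries of the (negated) Hessian
            h01 += w * x
            h11 += w * x * x
        det = h00 * h11 - h01 * h01
        # Solve the 2x2 system (negated Hessian) * step = gradient.
        b0 += (h11 * g0 - h01 * g1) / det
        b1 += (h00 * g1 - h01 * g0) / det
    return b0, b1


# Hypothetical data: a predictor (e.g. bank slope) and 1 = erosion, 0 = no erosion.
slopes = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
eroded = [0, 0, 0, 1, 0, 1, 0, 1, 1, 1]
b0, b1 = fit_logistic(slopes, eroded)
probability_at_6 = sigmoid(b0 + b1 * 6)
```

    Note that Newton-Raphson diverges when the two classes are perfectly separated by the predictor, which is why the made-up outcomes overlap.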

  1. Growth models of Rhizophora mangle L. seedlings in tropical southwestern Atlantic

    NASA Astrophysics Data System (ADS)

    Lima, Karen Otoni de Oliveira; Tognella, Mônica Maria Pereira; Cunha, Simone Rabelo; Andrade, Humber Agrelli de

    2018-07-01

    The present study selected and compared regression models that best describe the growth curves of Rhizophora mangle seedlings based on the height (cm) and time (days) variables. The Linear, Exponential, Power Law, Monomolecular, Logistic, and Gompertz models were fitted with non-linear formulations by minimizing the sum of the squared residuals. The Akaike Information Criterion was used to select the best model for each seedling. After this selection, the coefficient of determination, which evaluates how well a model describes height variation as a function of time, was inspected. Differing from the classic population ecology studies, the Monomolecular, Three-parameter Logistic, and Gompertz models presented the best performance in describing growth, suggesting they are the most adequate options for long-term studies. The different growth curves reflect the complexity of stem growth at the seedling stage for R. mangle. The analysis of the joint distribution of the parameters initial height, growth rate, and asymptotic size made it possible to study the species' ecological attributes and to observe its intraspecific variability in each model. Our results provide a basis for interpreting the dynamics of seedling growth during establishment in a mature forest, as well as its regeneration processes.

  2. Logistic model analysis of neurological findings in Minamata disease and the predicting index.

    PubMed

    Nakagawa, Masanori; Kodama, Tomoko; Akiba, Suminori; Arimura, Kimiyoshi; Wakamiya, Junji; Futatsuka, Makoto; Kitano, Takao; Osame, Mitsuhiro

    2002-01-01

    To establish a statistical diagnostic method to identify patients with Minamata disease (MD) considering factors of aging and sex, we analyzed the neurological findings in MD patients, inhabitants in a methylmercury polluted (MP) area, and inhabitants in a non-MP area. We compared the neurological findings in MD patients and inhabitants aged more than 40 years in the non-MP area. Based on the different frequencies of the neurological signs in the two groups, we devised the following formula to calculate the predicting index for MD: predicting index = $1/(1+e^{-x}) \times 100$ (the value of $x$ was calculated using the regression coefficients of each neurological finding obtained from logistic analysis; an index of 100 indicated MD, and 0, non-MD). Using this method, we found that 100% of male and 98% of female patients with MD (95 cases) gave predicting indices higher than 95. Five percent of the aged inhabitants in the MP area (598 inhabitants) and 0.2% of those in the non-MP area (558 inhabitants) gave predicting indices of 50 or higher. Our statistical diagnostic method for MD was useful in distinguishing MD patients from healthy elders based on their neurological findings.
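
    The predicting index in this abstract is the logistic function scaled to 0-100. The sketch below shows that arithmetic only; the regression coefficients are not reported in the abstract, so the helper for the linear predictor takes hypothetical inputs and the function names are illustrative.

```python
import math


def predicting_index(x):
    """Return 100 / (1 + exp(-x)), arranged to avoid overflow for large |x|."""
    if x >= 0:
        return 100.0 / (1.0 + math.exp(-x))
    ex = math.exp(x)
    return 100.0 * ex / (1.0 + ex)


def linear_predictor(intercept, coefficients, findings):
    """x = intercept + sum(coefficient * finding); all values are hypothetical."""
    return intercept + sum(b * f for b, f in zip(coefficients, findings))


# The thresholds quoted in the abstract correspond to simple cut-offs on x:
# an index of 50 is reached at x = 0, and an index of 95 at x = ln(19).
x_for_50 = 0.0
x_for_95 = math.log(19.0)
```

    Under this formula, an index above 95 is equivalent to a linear predictor above ln(19), roughly 2.94.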

  3. Quasi-Likelihood Techniques in a Logistic Regression Equation for Identifying Simulium damnosum s.l. Larval Habitats Intra-cluster Covariates in Togo.

    PubMed

    Jacob, Benjamin G; Novak, Robert J; Toe, Laurent; Sanfo, Moussa S; Afriyie, Abena N; Ibrahim, Mohammed A; Griffith, Daniel A; Unnasch, Thomas R

    2012-01-01

    The standard methods for regression analyses of clustered riverine larval habitat data of Simulium damnosum s.l., a major black-fly vector of onchocerciasis, postulate models relating observational ecological-sampled parameter estimators to prolific habitats without accounting for residual intra-cluster error correlation effects. Generally, this correlation comes from two sources: (1) the design of the random effects and their assumed covariance from the multiple levels within the regression model; and, (2) the correlation structure of the residuals. Unfortunately, inconspicuous errors in residual intra-cluster correlation estimates can overstate precision in forecasted S. damnosum s.l. riverine larval habitat explanatory attributes regardless of how they are treated (e.g., independent, autoregressive, Toeplitz, etc.). In this research, the geographical locations for multiple riverine-based S. damnosum s.l. larval ecosystem habitats sampled from 2 pre-established epidemiological sites in Togo were identified and recorded from July 2009 to June 2010. Initially the data was aggregated into PROC GENMOD. An agglomerative hierarchical residual cluster-based analysis was then performed. The sampled clustered study site data was then analyzed for statistical correlations using Monthly Biting Rates (MBR). Euclidean distance measurements and terrain-related geomorphological statistics were then generated in ArcGIS. A digital overlay was then performed also in ArcGIS using the georeferenced ground coordinates of high and low density clusters stratified by Annual Biting Rates (ABR). This data was overlain onto multitemporal sub-meter pixel resolution satellite data (i.e., QuickBird 0.61 m wavebands). Orthogonal spatial filter eigenvectors were then generated in SAS/GIS. 
Univariate and non-linear regression-based models (i.e., Logistic, Poisson and Negative Binomial) were also employed to determine probability distributions and to identify statistically significant parameter estimators from the sampled data. Thereafter, Durbin-Watson test statistics were used to test the null hypothesis that the regression residuals were not autocorrelated against the alternative that the residuals followed an autoregressive process in AUTOREG. Bayesian uncertainty matrices were also constructed employing normal priors for each of the sampled estimators in PROC MCMC. The residuals revealed both spatially structured and unstructured error effects in the high and low ABR-stratified clusters. The analyses also revealed that the estimators, levels of turbidity and presence of rocks were statistically significant for the high-ABR-stratified clusters, while the estimators distance between habitats and floating vegetation were important for the low-ABR-stratified cluster. Varying and constant coefficient regression models, ABR-stratified GIS-generated clusters, sub-meter resolution satellite imagery, a robust residual intra-cluster diagnostic test, MBR-based histograms, eigendecomposition spatial filter algorithms and Bayesian matrices can enable accurate autoregressive estimation of latent uncertainty effects and other residual error probabilities (i.e., heteroskedasticity) for testing correlations between georeferenced S. damnosum s.l. riverine larval habitat estimators. The asymptotic distribution of the resulting residual adjusted intra-cluster predictor error autocovariate coefficients can thereafter be established while estimates of the asymptotic variance can lead to the construction of approximate confidence intervals for accurately targeting productive S. damnosum s.l. habitats based on spatiotemporal field-sampled count data.

  4. Determination of riverbank erosion probability using Locally Weighted Logistic Regression

    NASA Astrophysics Data System (ADS)

    Ioannidou, Elena; Flori, Aikaterini; Varouchakis, Emmanouil A.; Giannakis, Georgios; Vozinaki, Anthi Eirini K.; Karatzas, George P.; Nikolaidis, Nikolaos

    2015-04-01

    Riverbank erosion is a natural geomorphologic process that affects the fluvial environment. The most important issue concerning riverbank erosion is the identification of the vulnerable locations. An alternative to the usual hydrodynamic models to predict vulnerable locations is to quantify the probability of erosion occurrence. This can be achieved by identifying the underlying relations between riverbank erosion and the geomorphological or hydrological variables that prevent or stimulate erosion. Thus, riverbank erosion can be determined by a regression model using independent variables that are considered to affect the erosion process. The impact of such variables may vary spatially, therefore, a non-stationary regression model is preferred instead of a stationary equivalent. Locally Weighted Regression (LWR) is proposed as a suitable choice. This method can be extended to predict the binary presence or absence of erosion based on a series of independent local variables by using the logistic regression model. It is referred to as Locally Weighted Logistic Regression (LWLR). Logistic regression is a type of regression analysis used for predicting the outcome of a categorical dependent variable (e.g. binary response) based on one or more predictor variables. The method can be combined with LWR to assign weights to local independent variables of the dependent one. LWR allows model parameters to vary over space in order to reflect spatial heterogeneity. The probabilities of the possible outcomes are modelled as a function of the independent variables using a logistic function. Logistic regression measures the relationship between a categorical dependent variable and, usually, one or several continuous independent variables by converting the dependent variable to probability scores. Then, a logistic regression is formed, which predicts success or failure of a given binary variable (e.g. erosion presence or absence) for any value of the independent variables. 
The erosion occurrence probability can be calculated in conjunction with the model deviance regarding the independent variables tested. The most straightforward measure for goodness of fit is the G statistic. It is a simple and effective way to study and evaluate the Logistic Regression model efficiency and the reliability of each independent variable. The developed statistical model is applied to the Koiliaris River Basin on the island of Crete, Greece. Two datasets of river bank slope, river cross-section width and indications of erosion were available for the analysis (12 and 8 locations). Two different types of spatial dependence functions, exponential and tricubic, were examined to determine the local spatial dependence of the independent variables at the measurement locations. The results show a significant improvement when the tricubic function is applied, as the erosion probability is accurately predicted at all eight validation locations. Results for the model deviance show that cross-section width is more important than bank slope in the estimation of erosion probability along the Koiliaris riverbanks. The proposed statistical model is a useful tool that quantifies the erosion probability along the riverbanks and can be used to assist in managing erosion and flooding events. Acknowledgements This work is part of an on-going THALES project (CYBERSENSORS - High Frequency Monitoring System for Integrated Water Resources Management of Rivers). The project has been co-financed by the European Union (European Social Fund - ESF) and Greek national funds through the Operational Program "Education and Lifelong Learning" of the National Strategic Reference Framework (NSRF) - Research Funding Program: THALES. Investing in knowledge society through the European Social Fund.
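
    The abstract above names two spatial dependence functions, exponential and tricubic, but does not give their formulas. The sketch below uses forms that are common in locally weighted regression; these specific expressions, the bandwidth parameter and the function names are assumptions and may differ from what the authors implemented.

```python
import math


def exponential_weight(distance, bandwidth):
    """Weight that decays smoothly with distance and never reaches zero."""
    return math.exp(-abs(distance) / bandwidth)


def tricube_weight(distance, bandwidth):
    """Weight (1 - (d/h)^3)^3 inside the bandwidth h, and zero outside it."""
    u = abs(distance) / bandwidth
    if u >= 1.0:
        return 0.0
    return (1.0 - u ** 3) ** 3


# Hypothetical distances (same units as the bandwidth) from a prediction point
# to three measurement locations.
distances = [0.0, 1.0, 2.5]
exp_weights = [exponential_weight(d, 2.0) for d in distances]
tri_weights = [tricube_weight(d, 2.0) for d in distances]
```

    The practical difference is that the tricube form gives exactly zero weight to observations beyond the bandwidth, whereas the exponential form lets every observation contribute a little.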

  5. [Relationship between the ankle-arm index determined by Doppler ultrasonography and cardiovascular outcomes and amputations, in a group of patients with type 2 diabetes mellitus from the Instituto Nacional de Ciencias Médicas y Nutrición Salvador Zubirán].

    PubMed

    Miranda Garduño, Luis Miguel; Bermúdez Rocha, Rocío; Gómez Pérez, Francisco J; Aguilar Salinas, Carlos A

    2011-01-01

    An ankle/arm index < 0.90 or ≥ 1.41 is considered abnormal. This study aimed to investigate the prevalence of peripheral arterial disease by determining the ankle/arm index using Doppler ultrasound, and the possible association between a pathological ankle/arm index and the micro- and macrovascular complications of diabetes and amputation. The ankle/arm index was determined in outpatient type 2 diabetic subjects. The variables recorded were age and cardiovascular outcomes. To find whether the ankle/arm index is related to cardiovascular outcomes or to the presence of micro- or macrovascular complications, we determined the Pearson correlation coefficient and also used logistic regression methods to analyze the association between the ankle/arm index and the categorical variables. We calculated the ankle/arm index in 242 patients. The prevalence of ischemic ankle/arm index (< 0.90) was 13.6%. The Pearson correlation coefficient between pathological ankle/arm index and cardiovascular outcomes was 0.180 (p = 0.005), amputation 0.130 (p < 0.05), retinopathy 0.132 (p < 0.05), and nephropathy 0.158 (p = 0.01). In logistic regression analysis, the factors associated with pathological ankle/arm index were age > 51 years, cardiovascular outcomes, and amputation. With the Mann-Whitney U test we found that a relationship exists between pathological ankle/arm index and amputation (p < 0.05). Diabetic patients have a high prevalence of pathological ankle/arm index.

  6. A Predictive Score for Bronchopleural Fistula Established Using the French Database Epithor.

    PubMed

    Pforr, Arnaud; Pagès, Pierre-Benoit; Baste, Jean-Marc; Thomas, Pascal; Falcoz, Pierre-Emmanuel; Lepimpec Barthes, Francoise; Dahan, Marcel; Bernard, Alain

    2016-01-01

    Bronchopleural fistula (BPF) remains a rare but fatal complication of thoracic surgery. The aim of this study was to develop and validate a predictive model of BPF after pulmonary resection and to identify patients at high risk for BPF. From January 2005 to December 2012, 34,000 patients underwent major pulmonary resection (lobectomy, bilobectomy, or pneumonectomy) and were entered into the French National database Epithor. The primary outcome was the occurrence of postoperative BPF at 30 days. The logistic regression model was built using a backward stepwise variable selection. Bronchopleural fistula occurred in 318 patients (0.94%); its prevalence was 0.5% for lobectomy (n = 139), 2.2% for bilobectomy (n = 39), and 3% for pneumonectomy (n = 140). The mortality rate was 25.9% for lobectomy (n = 36), 16.7% for bilobectomy (n = 6), and 20% for pneumonectomy (n = 28). In the final model, nine variables were selected: sex, body mass index, dyspnea score, number of comorbidities per patient, bilobectomy, pneumonectomy, emergency surgery, sleeve resection, and the side of the resection. In the development data set, the C-index was 0.8 (95% confidence interval: 0.78 to 0.82). This model was well calibrated because the Hosmer-Lemeshow test was not significant (χ² = 10.5, p = 0.23). We then calculated the logistic regression coefficients to build the predictive score for BPF. This strong model could be easily used by surgeons to identify patients at high risk for BPF. This score needs to be confirmed prospectively in an independent cohort. Copyright © 2016 The Society of Thoracic Surgeons. Published by Elsevier Inc. All rights reserved.

  7. A Multicenter Analysis of Factors Associated With Apixaban-Related Bleeding in Hospitalized Patients With End-Stage Renal Disease on Hemodialysis.

    PubMed

    Steuber, Taylor D; Shiltz, Dane L; Cairns, Alex C; Ding, Qian; Binger, Katie J; Courtney, Julia R

    2017-11-01

    In 2014, the United States Food and Drug Administration approved a labeling change for apixaban to include recommendations for patients with severe renal impairment and patients with end-stage renal disease (ESRD) on hemodialysis (HD), though these recommendations are largely based on pharmacokinetic and pharmacodynamic data. Identify variables associated with bleeding events in hospitalized patients with ESRD on HD receiving apixaban. This retrospective, multicenter cohort study evaluated hospitalized patients with ESRD on HD receiving apixaban from January 1, 2013, through March 31, 2016. Correlational analysis and logistic regression were completed to identify factors associated with bleeding. A total of 114 adults were included in the analysis. The median length of stay (LOS) was 6.2 (interquartile range = 3.8-11.9) days and bleeding events occurred in a total of 17 patients (15%). A weak correlation was identified for higher cumulative apixaban exposure, increased number of HD sessions while receiving apixaban, and increased hospital LOS (P < 0.05; correlation coefficient < 0.40). When controlling for confounders, logistic regression revealed that composite bleeding events were independently increased by continuation of outpatient apixaban (odds ratio = 13.07; 95% CI = 1.54-110.54; P = 0.018), increased total daily dose of apixaban (odds ratio = 1.72; 95% CI = 1.20-2.48; P = 0.003), and total HD sessions while receiving apixaban (odds ratio = 2.04; 95% CI = 1.06-3.92; P = 0.033). The association between these factors and increased bleeding should prompt concern for long-term anticoagulation with apixaban in patients with ESRD receiving chronic HD.
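
    Odds ratios such as those reported above are exponentiated logistic regression coefficients, so a coefficient and an approximate standard error can be recovered from a published odds ratio and its confidence interval. The sketch below assumes a Wald-type 95% interval that is symmetric on the log-odds scale; the abstract does not state how its intervals were computed, and the function names are illustrative.

```python
import math

# Standard normal quantile for a two-sided 95% interval.
Z_95 = 1.959963984540054


def odds_ratio_with_ci(beta, se):
    """Convert a coefficient and its standard error to (OR, lower, upper)."""
    return (math.exp(beta),
            math.exp(beta - Z_95 * se),
            math.exp(beta + Z_95 * se))


def coefficient_from_or(odds_ratio, ci_low, ci_high):
    """Recover (beta, se) from an odds ratio and its 95% CI (Wald assumption)."""
    beta = math.log(odds_ratio)
    se = (math.log(ci_high) - math.log(ci_low)) / (2.0 * Z_95)
    return beta, se


# Values taken from the abstract: total daily dose of apixaban,
# odds ratio = 1.72, 95% CI = 1.20-2.48.
beta_dose, se_dose = coefficient_from_or(1.72, 1.20, 2.48)
```

    Converting back with `odds_ratio_with_ci(beta_dose, se_dose)` reproduces the published interval only up to rounding, since the reported limits are rounded to two decimals.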

  8. Predicting multi-level drug response with gene expression profile in multiple myeloma using hierarchical ordinal regression.

    PubMed

    Zhang, Xinyan; Li, Bingzong; Han, Huiying; Song, Sha; Xu, Hongxia; Hong, Yating; Yi, Nengjun; Zhuang, Wenzhuo

    2018-05-10

    Multiple myeloma (MM), like other cancers, is caused by the accumulation of genetic abnormalities. Heterogeneity exists in the patients' response to treatments, for example, bortezomib. This urges efforts to identify biomarkers from numerous molecular features and build predictive models for identifying patients that can benefit from a certain treatment scheme. However, previous studies treated the multi-level ordinal drug response as a binary response where only responsive and non-responsive groups are considered. It is desirable to directly analyze the multi-level drug response, rather than combining the response to two groups. In this study, we present a novel method to identify significantly associated biomarkers and then develop ordinal genomic classifier using the hierarchical ordinal logistic model. The proposed hierarchical ordinal logistic model employs the heavy-tailed Cauchy prior on the coefficients and is fitted by an efficient quasi-Newton algorithm. We apply our hierarchical ordinal regression approach to analyze two publicly available datasets for MM with five-level drug response and numerous gene expression measures. Our results show that our method is able to identify genes associated with the multi-level drug response and to generate powerful predictive models for predicting the multi-level response. The proposed method allows us to jointly fit numerous correlated predictors and thus build efficient models for predicting the multi-level drug response. The predictive model for the multi-level drug response can be more informative than the previous approaches. Thus, the proposed approach provides a powerful tool for predicting multi-level drug response and has important impact on cancer studies.

  9. Influence of the usual motivation for dental attendance on dental status and oral health-related quality of life.

    PubMed

    Montero, Javier; Albaladejo, Alberto; Zalba, José-Ignacio

    2014-05-01

    To evaluate the influence of dental visiting patterns on the dental status and Oral Health-related Quality of Life (OHQoL) of patients visiting the University Clinic of Salamanca (Spain). This cross-sectional study consisted of a clinical oral examination and a questionnaire-based interview in a consecutive sample of patients seeking a dental examination. Patients were classified as problem-based dental attendees (PB) and regular dental attendees (RB). Clinical and OHQoL (OHIP-14 & OIDP) data were compared between groups. Pair-wise comparisons were performed and a Logistic Regression Model was fitted for predicting the Odds Ratio (OR) of being a PB patient. The sample was composed of 255 patients aged 18 to 87 years (mean age: 63.1 ± 12.7; women: 51.8%). The PB patients had a poorer dental status (i.e. caries, periodontal and prosthetic needs), brushed their teeth less, and were significantly more impaired in their OHQoL according to both instruments. The logistic regression coefficients demonstrated that on average the OR of being a PB patient was high in this dental patient sample, but this OR increased significantly if the patient was a male (OR = 1.1-5.0) or referred pain-related impacts according to the OHIP and, additionally, the OR decreased significantly as a function of the number of healthy fillings and the number of sextants coded as CPI=0. Regular dental check-ups are associated with better dental status and a better OHQoL after controlling for potentially related confounding factors.

  10. [Predictors of hospitalization for alcohol use disorder in Korean men].

    PubMed

    Hong, Hae-Sook; Park, Jeong-Eun; Park, Wan-Ju

    2014-10-01

    This study was done to identify the patterns and significant predictors influencing hospitalization of Korean men for alcohol use disorder. A descriptive study design was utilized. Data were collected using self-report questionnaires from 143 inpatients who met the DSM-5 alcohol use disorder criteria and were receiving treatment and 157 social drinkers living in the community. The questionnaires included Alcohol Use Disorders Identification Test (AUDIT), Alcohol Problems, Alcohol Expectancy Questionnaire (AEQ), Life Position, and The Korean version of the Children of Alcoholics Screening Test (CAST-K). Data were analyzed using descriptive statistics, t-test, χ²-test, F-test, Pearson correlation coefficients, and forward stepwise logistic regression. AUDIT had significant correlations with alcohol problems, alcohol expectancy, and parents' alcoholism. In logistic regression, factors significantly affecting hospitalization were being divorced (OR=4.18, 95% CI: 1.28-13.71), graduation from elementary school (OR=28.50, 95% CI: 8.07-100.69), middle school (OR=6.66, 95% CI: 2.21-20.09), high school (OR=6.31, 95% CI: 2.59-15.36), drinking alone (OR=9.07, 95% CI: 1.78-46.17), family history of alcoholism (OR=2.41, 95% CI: 1.11-5.25), interpersonal relationship problems (OR=1.28, 95% CI: 1.17-1.41), and sexual enhancement of alcohol expectancy (OR=0.83, 95% CI: 0.72-0.94), which accounted for 53% of the variance. Results suggest that interpersonal relationship programs and customized cognitive programs for social drinkers in the community are needed to decrease alcohol-related hospitalization in Korean men.
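
    The odds ratios and confidence intervals reported above relate to the underlying logistic regression coefficients through exponentiation. The sketch below is illustrative only: it assumes Wald-type intervals with z = 1.96 (the abstract does not say how the intervals were computed), and the function names are hypothetical.

```python
import math


def odds_ratio_with_ci(beta, se, z=1.96):
    """Turn a logistic regression coefficient and its standard error
    into an odds ratio with a Wald-type confidence interval.
    z = 1.96 (a 95% interval) is an assumption, not taken from the abstract."""
    return math.exp(beta), math.exp(beta - z * se), math.exp(beta + z * se)


def coef_from_or_ci(odds_ratio, lower, upper, z=1.96):
    """Approximately recover the coefficient and standard error
    from a reported odds ratio and its confidence limits."""
    beta = math.log(odds_ratio)
    se = (math.log(upper) - math.log(lower)) / (2.0 * z)
    return beta, se


# Reported values for "family history of alcoholism": OR=2.41, 95% CI 1.11-5.25
beta, se = coef_from_or_ci(2.41, 1.11, 5.25)
or_, lo, hi = odds_ratio_with_ci(beta, se)
```

    The round trip reproduces the reported interval only approximately, because published limits are rounded.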

  11. Inequality in the hepatitis B awareness level in rural residents from 7 provinces in China.

    PubMed

    Zheng, Juan; Li, Quan; Wang, Jian; Zhang, Guojie; Wangen, Knut R

    2017-05-04

    The hepatitis B (HB) awareness level is an important factor affecting the rates of HB virus vaccination. To better understand income-related inequalities in the HB awareness level, it is imperative to identify the sources of inequalities and assess the contribution rates of these influential factors. This study analyzed the unequal distribution of the HB awareness level and the contributions of various influential factors. We performed a cross-sectional household survey with questionnaire-based, face-to-face interviews in 7 Chinese provinces. Responses from 7271 respondents were used in this analysis. Multinomial logistic regression was used for the analysis of contributing factors, and the concentration index was used as a measure of HB awareness inequalities. The HB awareness level varied across participants with different characteristics. Multinomial logistic regression of the explanatory factors of the HB awareness level showed that several estimated coefficients and relative risk ratios were statistically significant for middle- and high-level awareness, except for sex, occupation, and household income. The concentration index of the HB knowledge score was 0.140, indicating inequality gradients disadvantageous to the poor. The contribution rate of socioeconomic factors was the largest (60.8%), followed by demographic characteristics (29.0%) and geographic factors (4.3%). Demographic, socioeconomic, and geographic factors are associated with the HB awareness inequality. Therefore, to reduce inequality, HB-related health education targeting individuals with low socioeconomic status should be performed. Less-developed provinces, especially with high proportions of poor residents, warrant particular attention. Our findings may be beneficial to improve the HB virus vaccination rate for individuals with low socioeconomic status.

  12. Factors explaining priority setting at community mental health centres: a quantitative analysis of referral assessments.

    PubMed

    Grepperud, Sverre; Holman, Per Arne; Wangen, Knut Reidar

    2014-12-14

    Clinicians at Norwegian community mental health centres assess referrals from general practitioners and classify them into three priority groups (high priority, low priority, and refusal) according to need where need is defined by three prioritization criteria (severity, effect, and cost-effectiveness). In this study, we seek to operationalize the three criteria and analyze to what extent they have an effect on clinical-level priority setting after controlling for clinician characteristics and organisational factors. Twenty anonymous referrals were rated by 42 admission team members employed at 14 community mental health centres in the South-East Health Region of Norway. Intra-class correlation coefficients were calculated and logistic regressions were performed. Variation in clinicians' assessments of the three criteria was highest for effect and cost-effectiveness. An ordered logistic regression model showed that all three criteria for prioritization, three clinician characteristics (education, being a manager or not, and "guideline awareness"), and the centres themselves (fixed effects), explained priority decisions. The relative importance of the explanatory factors, however, depended on the priority decision studied. For the classification of all admitted patients into high- and low-priority groups, all clinician characteristics became insignificant. For the classification of patients, into those admitted and non-admitted, one criterion (effect) and "being a manager or not" became insignificant, while profession ("being a psychiatrist") became significant. Our findings suggest that variation in priority decisions can be reduced by: (i) reducing the disagreement in clinicians' assessments of cost-effectiveness and effect, and (ii) restricting priority decisions to clinicians with a similar background (education, being a manager or not, and "guideline awareness").

  13. Landslide susceptibility mapping using frequency ratio, logistic regression, artificial neural networks and their comparison: A case study from Kat landslides (Tokat—Turkey)

    NASA Astrophysics Data System (ADS)

    Yilmaz, Işık

    2009-06-01

    The purpose of this study is to compare the landslide susceptibility mapping methods of frequency ratio (FR), logistic regression and artificial neural networks (ANN) applied in the Kat County (Tokat—Turkey). A digital elevation model (DEM) was first constructed using GIS software. Landslide-related factors such as geology, faults, drainage system, topographical elevation, slope angle, slope aspect, topographic wetness index (TWI) and stream power index (SPI) were used in the landslide susceptibility analyses. Landslide susceptibility maps were produced from the frequency ratio, logistic regression and neural network models, and they were then compared by means of their validations. The higher accuracies of the susceptibility maps for all three models were obtained from the comparison of the landslide susceptibility maps with the known landslide locations. Although the respective area under curve (AUC) values of 0.826, 0.842 and 0.852 for frequency ratio, logistic regression and artificial neural networks showed that the map obtained from the ANN model is more accurate than those from the other models, the accuracies of all models can be regarded as relatively similar. The results obtained in this study also showed that the frequency ratio model can be used as a simple tool in the assessment of landslide susceptibility when a sufficient amount of data is available. The input process, calculations and output process are very simple and can be readily understood in the frequency ratio model; however, logistic regression and neural networks require the conversion of data to ASCII or other formats. Moreover, it is also very hard to process the large amount of data in the statistical package.
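
    To illustrate why the frequency ratio model is described as simple, here is a minimal sketch using the usual definition: for each class of a factor, the share of landslide cells falling in that class divided by the share of all cells in that class. The class labels and cell counts are made up for illustration and do not come from the study.

```python
def frequency_ratios(class_cells, landslide_cells):
    """Frequency ratio per factor class:
    (landslide cells in class / all landslide cells)
    divided by (cells in class / all cells).
    Values above 1 suggest the class is over-represented among landslides."""
    total_cells = sum(class_cells.values())
    total_slides = sum(landslide_cells.values())
    ratios = {}
    for name, cells in class_cells.items():
        slide_share = landslide_cells.get(name, 0) / total_slides
        area_share = cells / total_cells
        ratios[name] = slide_share / area_share
    return ratios


# Hypothetical slope-angle classes (degrees) with invented cell counts
slope_cells = {"0-10": 500, "10-20": 300, ">20": 200}
slope_slides = {"0-10": 5, "10-20": 15, ">20": 30}
fr = frequency_ratios(slope_cells, slope_slides)
```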

  14. Using the Coefficient of Determination R² to Test the Significance of Multiple Linear Regression

    ERIC Educational Resources Information Center

    Quinino, Roberto C.; Reis, Edna A.; Bessegato, Lupercio F.

    2013-01-01

    This article proposes the use of the coefficient of determination as a statistic for hypothesis testing in multiple linear regression based on distributions acquired by beta sampling. (Contains 3 figures.)
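
    As a rough illustration of the idea, the sketch below relies on a standard result: under the null hypothesis that all slopes are zero (normal errors, intercept included), R² follows a Beta(k/2, (n-k-1)/2) distribution, which is equivalent to the usual overall F test. Whether the article uses exactly this parameterization is an assumption; the function names, the number of draws, and the seed are hypothetical.

```python
import random


def r2_f_statistic(r2, n, k):
    """Overall F statistic implied by R^2 with n observations and k predictors."""
    return (r2 / k) / ((1.0 - r2) / (n - k - 1))


def r2_null_p_value(r2_obs, n, k, draws=20000, seed=0):
    """Monte Carlo p-value: fraction of R^2 values sampled under the null,
    from Beta(k/2, (n-k-1)/2), that are at least as large as the observed R^2."""
    rng = random.Random(seed)
    a, b = k / 2.0, (n - k - 1) / 2.0
    hits = sum(rng.betavariate(a, b) >= r2_obs for _ in range(draws))
    return hits / draws


# Hypothetical example: R^2 = 0.5 from n = 24 observations and k = 3 predictors
f_stat = r2_f_statistic(0.5, 24, 3)
p_value = r2_null_p_value(0.5, 24, 3)
```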

  15. A Comparison of Logistic Regression, Neural Networks, and Classification Trees Predicting Success of Actuarial Students

    ERIC Educational Resources Information Center

    Schumacher, Phyllis; Olinsky, Alan; Quinn, John; Smith, Richard

    2010-01-01

    The authors extended previous research by 2 of the authors who conducted a study designed to predict the successful completion of students enrolled in an actuarial program. They used logistic regression to determine the probability of an actuarial student graduating in the major or dropping out. They compared the results of this study with those…

  16. Logistic regression accuracy across different spatial and temporal scales for a wide-ranging species, the marbled murrelet

    Treesearch

    Carolyn B. Meyer; Sherri L. Miller; C. John Ralph

    2004-01-01

    The scale at which habitat variables are measured affects the accuracy of resource selection functions in predicting animal use of sites. We used logistic regression models for a wide-ranging species, the marbled murrelet, (Brachyramphus marmoratus) in a large region in California to address how much changing the spatial or temporal scale of...

  17. Odds Ratio, Delta, ETS Classification, and Standardization Measures of DIF Magnitude for Binary Logistic Regression

    ERIC Educational Resources Information Center

    Monahan, Patrick O.; McHorney, Colleen A.; Stump, Timothy E.; Perkins, Anthony J.

    2007-01-01

    Previous methodological and applied studies that used binary logistic regression (LR) for detection of differential item functioning (DIF) in dichotomously scored items either did not report an effect size or did not employ several useful measures of DIF magnitude derived from the LR model. Equations are provided for these effect size indices.…

  18. A Generalized Logistic Regression Procedure to Detect Differential Item Functioning among Multiple Groups

    ERIC Educational Resources Information Center

    Magis, David; Raiche, Gilles; Beland, Sebastien; Gerard, Paul

    2011-01-01

    We present an extension of the logistic regression procedure to identify dichotomous differential item functioning (DIF) in the presence of more than two groups of respondents. Starting from the usual framework of a single focal group, we propose a general approach to estimate the item response functions in each group and to test for the presence…

  19. Risk Factors of Falls in Community-Dwelling Older Adults: Logistic Regression Tree Analysis

    ERIC Educational Resources Information Center

    Yamashita, Takashi; Noe, Douglas A.; Bailer, A. John

    2012-01-01

    Purpose of the Study: A novel logistic regression tree-based method was applied to identify fall risk factors and possible interaction effects of those risk factors. Design and Methods: A nationally representative sample of American older adults aged 65 years and older (N = 9,592) in the Health and Retirement Study 2004 and 2006 modules was used.…

  20. Estimation of Logistic Regression Models in Small Samples. A Simulation Study Using a Weakly Informative Default Prior Distribution

    ERIC Educational Resources Information Center

    Gordovil-Merino, Amalia; Guardia-Olmos, Joan; Pero-Cebollero, Maribel

    2012-01-01

    In this paper, we used simulations to compare the performance of classical and Bayesian estimations in logistic regression models using small samples. In the performed simulations, conditions were varied, including the type of relationship between independent and dependent variable values (i.e., unrelated and related values), the type of variable…

  1. Using multiple logistic regression and GIS technology to predict landslide hazard in northeast Kansas, USA

    USGS Publications Warehouse

    Ohlmacher, G.C.; Davis, J.C.

    2003-01-01

    Landslides in the hilly terrain along the Kansas and Missouri rivers in northeastern Kansas have caused millions of dollars in property damage during the last decade. To address this problem, a statistical method called multiple logistic regression has been used to create a landslide-hazard map for Atchison, Kansas, and surrounding areas. Data included digitized geology, slopes, and landslides, manipulated using ArcView GIS. Logistic regression relates predictor variables to the occurrence or nonoccurrence of landslides within geographic cells and uses the relationship to produce a map showing the probability of future landslides, given local slopes and geologic units. Results indicated that slope is the most important variable for estimating landslide hazard in the study area. Geologic units consisting mostly of shale, siltstone, and sandstone were most susceptible to landslides. Soil type and aspect ratio were considered but excluded from the final analysis because these variables did not significantly add to the predictive power of the logistic regression. Soil types were highly correlated with the geologic units, and no significant relationships existed between landslides and slope aspect. © 2003 Elsevier Science B.V. All rights reserved.

  2. A Method for Calculating the Probability of Successfully Completing a Rocket Propulsion Ground Test

    NASA Technical Reports Server (NTRS)

    Messer, Bradley

    2007-01-01

    Propulsion ground test facilities face the daily challenge of scheduling multiple customers into limited facility space and successfully completing their propulsion test projects. Over the last decade NASA's propulsion test facilities have performed hundreds of tests, collected thousands of seconds of test data, and exceeded the capabilities of numerous test facility and test article components. A logistic regression mathematical modeling technique has been developed to predict the probability of successfully completing a rocket propulsion test. A logistic regression model is a mathematical modeling approach that can be used to describe the relationship of several independent predictor variables X_1, X_2, ..., X_k to a binary or dichotomous dependent variable Y, where Y can only be one of two possible outcomes, in this case Success or Failure of accomplishing a full duration test. The use of logistic regression modeling is not new; however, modeling propulsion ground test facilities using logistic regression is both a new and unique application of the statistical technique. Results from this type of model provide project managers with insight and confidence into the effectiveness of rocket propulsion ground testing.
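
    The relationship between predictors X_1, ..., X_k and a binary outcome Y described above can be sketched as follows. This is the generic logistic model, not the report's fitted model; the intercept, coefficients, and predictor values are invented placeholders.

```python
import math


def success_probability(intercept, coefs, x):
    """P(Y = 1 | x) under a logistic model:
    1 / (1 + exp(-(b0 + b1*x1 + ... + bk*xk)))."""
    eta = intercept + sum(b * xi for b, xi in zip(coefs, x))
    # Evaluate the logistic function in a numerically stable way
    if eta >= 0:
        return 1.0 / (1.0 + math.exp(-eta))
    e = math.exp(eta)
    return e / (1.0 + e)


# Hypothetical coefficients and predictor values, for illustration only
p = success_probability(intercept=-1.0, coefs=[0.8, -0.3], x=[2.0, 1.0])
```

    Each coefficient shifts the log-odds of success linearly, so the predicted probability always stays between 0 and 1.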

  3. Predicting risk for portal vein thrombosis in acute pancreatitis patients: A comparison of radical basis function artificial neural network and logistic regression models.

    PubMed

    Fei, Yang; Hu, Jian; Gao, Kun; Tu, Jianfeng; Li, Wei-Qin; Wang, Wei

    2017-06-01

    To construct a radical basis function (RBF) artificial neural networks (ANNs) model to predict the incidence of acute pancreatitis (AP)-induced portal vein thrombosis. The analysis included 353 patients with AP who had admitted between January 2011 and December 2015. RBF ANNs model and logistic regression model were constructed based on eleven factors relevant to AP respectively. Statistical indexes were used to evaluate the value of the prediction in two models. The predict sensitivity, specificity, positive predictive value, negative predictive value and accuracy by RBF ANNs model for PVT were 73.3%, 91.4%, 68.8%, 93.0% and 87.7%, respectively. There were significant differences between the RBF ANNs and logistic regression models in these parameters (P<0.05). In addition, a comparison of the area under receiver operating characteristic curves of the two models showed a statistically significant difference (P<0.05). The RBF ANNs model is more likely to predict the occurrence of PVT induced by AP than logistic regression model. D-dimer, AMY, Hct and PT were important prediction factors of approval for AP-induced PVT. Copyright © 2017 Elsevier Inc. All rights reserved.

  4. SPSS macros to compare any two fitted values from a regression model.

    PubMed

    Weaver, Bruce; Dubois, Sacha

    2012-12-01

    In regression models with first-order terms only, the coefficient for a given variable is typically interpreted as the change in the fitted value of Y for a one-unit increase in that variable, with all other variables held constant. Therefore, each regression coefficient represents the difference between two fitted values of Y. But the coefficients represent only a fraction of the possible fitted value comparisons that might be of interest to researchers. For many fitted value comparisons that are not captured by any of the regression coefficients, common statistical software packages do not provide the standard errors needed to compute confidence intervals or carry out statistical tests-particularly in more complex models that include interactions, polynomial terms, or regression splines. We describe two SPSS macros that implement a matrix algebra method for comparing any two fitted values from a regression model. The !OLScomp and !MLEcomp macros are for use with models fitted via ordinary least squares and maximum likelihood estimation, respectively. The output from the macros includes the standard error of the difference between the two fitted values, a 95% confidence interval for the difference, and a corresponding statistical test with its p-value.
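
    A minimal sketch of the kind of matrix algebra involved: if d is the difference between the two design-matrix rows being compared, the difference in fitted values is d'b and its variance is d'Vd, where b is the coefficient vector and V its covariance matrix. Whether the macros compute exactly this is an assumption, as are the function name and the example numbers. For models fitted by maximum likelihood, the comparison is on the scale of the linear predictor.

```python
import math


def fitted_value_difference(beta, cov, x_a, x_b, z=1.96):
    """Difference between two fitted values, with its standard error
    and a Wald-type confidence interval (z = 1.96 gives roughly 95%).
    x_a and x_b are full design rows, including the leading 1 for the
    intercept and any polynomial or interaction terms."""
    d = [a - b for a, b in zip(x_a, x_b)]
    diff = sum(di * bi for di, bi in zip(d, beta))
    n = len(d)
    var = sum(d[i] * cov[i][j] * d[j] for i in range(n) for j in range(n))
    se = math.sqrt(var)
    return diff, se, (diff - z * se, diff + z * se)


# Hypothetical quadratic model y = b0 + b1*x + b2*x^2; compare x = 3 with x = 1
beta = [1.0, 2.0, 0.5]
cov = [[0.01, 0.0, 0.0], [0.0, 0.01, 0.0], [0.0, 0.0, 0.01]]
diff, se, ci = fitted_value_difference(beta, cov, [1.0, 3.0, 9.0], [1.0, 1.0, 1.0])
```

    Because x and x² change together here, no single coefficient captures this comparison, which is the situation the abstract describes.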

  5. Implementations of geographically weighted lasso in spatial data with multicollinearity (Case study: Poverty modeling of Java Island)

    NASA Astrophysics Data System (ADS)

    Setiyorini, Anis; Suprijadi, Jadi; Handoko, Budhi

    2017-03-01

    Geographically Weighted Regression (GWR) is a regression model that takes into account the spatial heterogeneity effect. In the application of the GWR, inference on regression coefficients is often of interest, as is estimation and prediction of the response variable. Empirical research and studies have demonstrated that local correlation between explanatory variables can lead to estimated regression coefficients in GWR that are strongly correlated, a condition named multicollinearity. It later results on a large standard error on estimated regression coefficients, and, hence, problematic for inference on relationships between variables. Geographically Weighted Lasso (GWL) is a method which capable to deal with spatial heterogeneity and local multicollinearity in spatial data sets. GWL is a further development of GWR method, which adds a LASSO (Least Absolute Shrinkage and Selection Operator) constraint in parameter estimation. In this study, GWL will be applied by using fixed exponential kernel weights matrix to establish a poverty modeling of Java Island, Indonesia. The results of applying the GWL to poverty datasets show that this method stabilizes regression coefficients in the presence of multicollinearity and produces lower prediction and estimation error of the response variable than GWR does.

  6. An improved multiple linear regression and data analysis computer program package

    NASA Technical Reports Server (NTRS)

    Sidik, S. M.

    1972-01-01

    NEWRAP, an improved version of a previous multiple linear regression program called RAPIER, CREDUC, and CRSPLT, allows for a complete regression analysis including cross plots of the independent and dependent variables, correlation coefficients, regression coefficients, analysis of variance tables, t-statistics and their probability levels, rejection of independent variables, plots of residuals against the independent and dependent variables, and a canonical reduction of quadratic response functions useful in optimum seeking experimentation. A major improvement over RAPIER is that all regression calculations are done in double precision arithmetic.

  7. Female Labor Supply and Fertility in Iran: A Comparison Between Developed, Semi Developed and Less Developed Regions.

    PubMed

    Emamgholipour Sefiddashti, Sara; Homaie Rad, Enayatollah; Arab, Mohamad; Bordbar, Shima

    2016-02-01

    Female labor supply has changed dramatically in recent years. In this study, we examined the effects of development on the relationship between fertility and female labor supply (FLS). We used data from the population and housing census of Iran and estimated three separate models, employing Logistic Regressions (BLR). The estimation results of our study showed that there was a negative relationship between fertility rate and female labor supply, with some differences in this relationship across the three models. When the fertility rate increases, FLS decreases. In addition, for higher fertility rates, the woman might be forced to work more because of the economic conditions of her family; and negative coefficients of the fertility rate effects on FLS would increase with a diminishing rate.

  8. Dietary consumption patterns and laryngeal cancer risk.

    PubMed

    Vlastarakos, Petros V; Vassileiou, Andrianna; Delicha, Evie; Kikidis, Dimitrios; Protopapas, Dimosthenis; Nikolopoulos, Thomas P

    2016-06-01

    We conducted a case-control study to investigate the effect of diet on laryngeal carcinogenesis. Our study population was made up of 140 participants-70 patients with laryngeal cancer (LC) and 70 controls with a non-neoplastic condition that was unrelated to diet, smoking, or alcohol. A food-frequency questionnaire determined the mean consumption of 113 different items during the 3 years prior to symptom onset. Total energy intake and cooking mode were also noted. The relative risk, odds ratio (OR), and 95% confidence interval (CI) were estimated by multiple logistic regression analysis. We found that the total energy intake was significantly higher in the LC group (p < 0.001), and that the difference remained statistically significant after logistic regression analysis (p < 0.001; OR: 118.70). Notably, meat consumption was higher in the LC group (p < 0.001), and the difference remained significant after logistic regression analysis (p = 0.029; OR: 1.16). LC patients also consumed significantly more fried food (p = 0.036); this difference also remained significant in the logistic regression model (p = 0.026; OR: 5.45). The LC group also consumed significantly more seafood (p = 0.012); the difference persisted after logistic regression analysis (p = 0.009; OR: 2.48), with the consumption of shrimp proving detrimental (p = 0.049; OR: 2.18). Finally, the intake of zinc was significantly higher in the LC group before and after logistic regression analysis (p = 0.034 and p = 0.011; OR: 30.15, respectively). Cereal consumption (including pastas) was also higher among the LC patients (p = 0.043), with logistic regression analysis showing that their negative effect was possibly associated with the sauces and dressings that traditionally accompany pasta dishes (p = 0.006; OR: 4.78). 
    Conversely, a higher consumption of dairy products was found in controls (p < 0.05); logistic regression analysis showed that calcium appeared to be protective at the micronutrient level (p < 0.001; OR: 0.27). We found no difference in the overall consumption of fruits and vegetables between the LC patients and controls; however, the LC patients did have a greater consumption of cooked tomatoes and cooked root vegetables (p = 0.039 for both), and the controls had more consumption of leeks (p = 0.042) and, among controls younger than 65 years, cooked beans (p = 0.037). Lemon (p = 0.037), squeezed fruit juice (p = 0.032), and watermelon (p = 0.018) were also more frequently consumed by the controls. Other differences at the micronutrient level included greater consumption by the LC patients of retinol (p = 0.044), polyunsaturated fats (p = 0.041), and linoleic acid (p = 0.008); LC patients younger than 65 years also had greater intake of riboflavin (p = 0.045). We conclude that the differences in dietary consumption patterns between LC patients and controls indicate a possible role for lifestyle modifications involving nutritional factors as a means of decreasing the risk of laryngeal cancer.

  9. Utility of Clinical Parameters and Multiparametric MRI as Predictive Factors for Differentiating Uterine Sarcoma From Atypical Leiomyoma.

    PubMed

    Bi, Qiu; Xiao, Zhibo; Lv, Fajin; Liu, Yao; Zou, Chunxia; Shen, Yiqing

    2018-02-05

    The objective of this study was to find clinical parameters and qualitative and quantitative magnetic resonance imaging (MRI) features for differentiating uterine sarcoma from atypical leiomyoma (ALM) preoperatively and to calculate predictive values for uterine sarcoma. Data from 60 patients with uterine sarcoma and 88 patients with ALM confirmed by surgery and pathology were collected. Clinical parameters, qualitative MRI features, diffusion-weighted imaging with apparent diffusion coefficient values, and quantitative parameters of dynamic contrast-enhanced MRI of these two tumor types were compared. Predictive values for uterine sarcoma were calculated using multivariable logistic regression. Patient clinical manifestations, tumor locations, margins, T2-weighted imaging signals, mean apparent diffusion coefficient values, minimum apparent diffusion coefficient values, and time-signal intensity curves of solid tumor components were obvious significant parameters for distinguishing between uterine sarcoma and ALM (all P < .001). Abnormal vaginal bleeding, tumors located mainly in the uterine cavity, ill-defined tumor margins, and mean apparent diffusion coefficient values of <1.272 × 10⁻³ mm²/s were significant preoperative predictors of uterine sarcoma. When the overall scores of these four predictors were greater than or equal to 7 points, the sensitivity, the specificity, the accuracy, and the positive and negative predictive values were 88.9%, 99.9%, 95.7%, 97.0%, and 95.1%, respectively. The use of clinical parameters and multiparametric MRI as predictive factors was beneficial for diagnosing uterine sarcoma preoperatively. These findings could be helpful for guiding treatment decisions. Copyright © 2018 The Association of University Radiologists. Published by Elsevier Inc. All rights reserved.

  10. Association between sarcopenia and osteoporosis in chronic liver disease.

    PubMed

    Hayashi, Manabu; Abe, Kazumichi; Fujita, Masashi; Okai, Ken; Takahashi, Atsushi; Ohira, Hiromasa

    2018-05-07

    Sarcopenia and osteoporosis are important complications in chronic liver disease (CLD). The aim of this study was to investigate the relationship between sarcopenia and osteoporosis in patients with CLD. We retrospectively investigated the relationship between sarcopenia and osteoporosis in 112 CLD patients (57 males and 55 females), including 40 cirrhotic patients (36%), by measuring the appendicular skeletal muscle mass index (ASMI) using bio-impedance analysis. Bone mineral density (BMD) was measured by dual-energy X-ray absorptiometry. The sarcopenia rate was 13% (14/112), and the osteoporosis and osteopenia rates were 17% (19/112) and 65% (73/112), respectively. The rate of osteoporosis was significant and high in patients with sarcopenia or cirrhosis. In linear regression analysis, sarcopenia was significantly associated with the BMD of the lumbar spine (Coefficient = -0.149, P = 0.014) and the femur neck (Coefficient = -0.110, P = 0.003). Cirrhosis was also significantly associated with low BMD of the lumbar spine (Coefficient = -0.160, P < 0.001) and the femur neck (Coefficient = -0.066, P = 0.015). In the logistic analysis, sarcopenia (odds ratio = 6.16, P = 0.039) and cirrhosis (odds ratio = 15.8, P = 0.002) were independent risk factors for osteoporosis. The ASMI cut-off values for osteoporosis were 7.33 kg/m² in males and 5.71 kg/m² in females. Sarcopenia was closely associated with osteoporosis, and a low ASMI was a potential predictor of osteoporosis in CLD patients. Screening for BMD may be required to detect osteoporosis in cirrhotic patients. This article is protected by copyright. All rights reserved.

  11. Parent-Child Resemblance in Weight Status and Its Correlates in the United States

    PubMed Central

    Liang, Lan; Wang, Youfa

    2013-01-01

    Background: Few studies have examined parent-child resemblance in body weight status using nationally representative data for the US. Design: We analyzed Body Mass Index (BMI), weight status, and related correlates for 4,846 boys, 4,725 girls, and their parents based on US nationally representative data from the 2006 and 2007 Medical Expenditure Panel Survey (MEPS). Pearson partial correlation coefficients, percent agreement, weighted kappa coefficients, and binary and multinomial logistic regression were used to examine parent-child resemblance, adjusted for complex sampling design. Results: Pearson partial correlation coefficients between parent and child’s BMI measures were 0.15 for father-son pairs, 0.17 for father-daughter pairs, 0.20 for mother-son pairs, and 0.23 for mother-daughter pairs. The weighted kappa coefficients between BMI quintiles of parent and child ranged from −0.02 to 0.25. Odds ratio analyses found children were 2.1 (95% confidence interval (CI): 1.6, 2.8) times more likely to be obese if only their father was obese, 1.9 (95% CI: 1.5, 2.4) times more likely if only their mother was obese, and 3.2 (95% CI: 2.5, 4.2) times more likely if both parents were obese. Conclusions: Parent-child resemblance in BMI appears weak and may vary across parent-child dyad types in the US population. However, parental obesity status is associated with children’s obesity status. Use of different measures of parent-child resemblance in body weight status can lead to different conclusions. PMID:23762352

  12. Probability and amounts of yogurt intake are differently affected by sociodemographic, economic, and lifestyle factors in adults and the elderly-results from a population-based study.

    PubMed

    Possa, Gabriela; de Castro, Michelle Alessandra; Marchioni, Dirce Maria Lobo; Fisberg, Regina Mara; Fisberg, Mauro

    2015-08-01

    The aim of this population-based cross-sectional health survey (N = 532) was to investigate the factors associated with the probability and amounts of yogurt intake in Brazilian adults and the elderly. A structured questionnaire was used to obtain data on demographics, socioeconomic information, presence of morbidities and lifestyle and anthropometric characteristics. Food intake was evaluated using two nonconsecutive 24-hour dietary recalls and a Food Frequency Questionnaire. Approximately 60% of the subjects were classified as yogurt consumers. In the logistic regression model, yogurt intake was associated with smoking (odds ratio [OR], 1.98), female sex (OR, 2.12), and age 20 to 39 years (OR, 3.11). Per capita family income and being a nonsmoker were factors positively associated with the amount of yogurt consumption (coefficients, 0.61 and 3.73, respectively), whereas the level of education of the head of household was inversely associated (coefficient, 0.61). In this study, probability and amounts of yogurt intake are differently affected by demographic, socioeconomic, and lifestyle factors in adults and the elderly. Copyright © 2015 Elsevier Inc. All rights reserved.

  13. A hybrid PSO-SVM-based method for predicting the friction coefficient between aircraft tire and coating

    NASA Astrophysics Data System (ADS)

    Zhan, Liwei; Li, Chengwei

    2017-02-01

    A hybrid PSO-SVM-based model is proposed to predict the friction coefficient between aircraft tire and coating. The presented hybrid model combines a support vector machine (SVM) with the particle swarm optimization (PSO) technique. SVM has been adopted to solve regression problems successfully. Its regression accuracy depends strongly on optimizing parameters such as the regularization constant C, the RBF kernel parameter γ, and the epsilon parameter ε in the SVM training procedure. However, SVM-based prediction of the friction coefficient between aircraft tire and coating has yet to be explored. The experiment reveals that drop height and tire rotational speed are the factors affecting the friction coefficient. With this in mind, the friction coefficient can be predicted using the hybrid PSO-SVM-based model from the measured friction coefficient between aircraft tire and coating. To compare regression accuracy, a grid search (GS) method and a genetic algorithm (GA) are also used to optimize the relevant parameters (C, γ and ε). The regression accuracy is reflected by the coefficient of determination (R²). The result shows that the hybrid PSO-RBF-SVM-based model has better accuracy compared with the GS-RBF-SVM- and GA-RBF-SVM-based models. The agreement of this model (PSO-RBF-SVM) with experiment data confirms its good performance.
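
    The coefficient of determination used above to compare the optimizers can be computed as one minus the ratio of the residual sum of squares to the total sum of squares. A minimal sketch; the function name and the friction values are made up for illustration and are not data from the study:

```python
def r_squared(observed, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_obs = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - ss_res / ss_tot

# Made-up friction coefficients (illustrative only).
observed = [0.30, 0.35, 0.40, 0.45]
predicted = [0.31, 0.34, 0.41, 0.44]
r2 = r_squared(observed, predicted)
print(f"R^2 = {r2:.3f}")
```

    A value close to 1 means the predictions track the measurements closely; comparing this quantity across models fitted to the same data is what ranks the PSO, GS, and GA variants.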

  14. Parameter estimation in Cox models with missing failure indicators and the OPPERA study.

    PubMed

    Brownstein, Naomi C; Cai, Jianwen; Slade, Gary D; Bair, Eric

    2015-12-30

    In a prospective cohort study, examining all participants for incidence of the condition of interest may be prohibitively expensive. For example, the "gold standard" for diagnosing temporomandibular disorder (TMD) is a physical examination by a trained clinician. In large studies, examining all participants in this manner is infeasible. Instead, it is common to use questionnaires to screen for incidence of TMD and perform the "gold standard" examination only on participants who screen positively. Unfortunately, some participants may leave the study before receiving the "gold standard" examination. Within the framework of survival analysis, this results in missing failure indicators. Motivated by the Orofacial Pain: Prospective Evaluation and Risk Assessment (OPPERA) study, a large cohort study of TMD, we propose a method for parameter estimation in survival models with missing failure indicators. We estimate the probability of being an incident case for those lacking a "gold standard" examination using logistic regression. These estimated probabilities are used to generate multiple imputations of case status for each missing examination that are combined with observed data in appropriate regression models. The variance introduced by the procedure is estimated using multiple imputation. The method can be used to estimate both regression coefficients in Cox proportional hazard models as well as incidence rates using Poisson regression. We simulate data with missing failure indicators and show that our method performs as well as or better than competing methods. Finally, we apply the proposed method to data from the OPPERA study. Copyright © 2015 John Wiley & Sons, Ltd.

  15. A Comparison of the Logistic Regression and Contingency Table Methods for Simultaneous Detection of Uniform and Nonuniform DIF

    ERIC Educational Resources Information Center

    Guler, Nese; Penfield, Randall D.

    2009-01-01

    In this study, we investigate the logistic regression (LR), Mantel-Haenszel (MH), and Breslow-Day (BD) procedures for the simultaneous detection of both uniform and nonuniform differential item functioning (DIF). A simulation study was used to assess and compare the Type I error rate and power of a combined decision rule (CDR), which assesses DIF…

  16. The Overall Odds Ratio as an Intuitive Effect Size Index for Multiple Logistic Regression: Examination of Further Refinements

    ERIC Educational Resources Information Center

    Le, Huy; Marcus, Justin

    2012-01-01

    This study used Monte Carlo simulation to examine the properties of the overall odds ratio (OOR), which was recently introduced as an index for overall effect size in multiple logistic regression. It was found that the OOR was relatively independent of study base rate and performed better than most commonly used R-square analogs in indexing model…

  17. Predicting Student Success on the Texas Chemistry STAAR Test: A Logistic Regression Analysis

    ERIC Educational Resources Information Center

    Johnson, William L.; Johnson, Annabel M.; Johnson, Jared

    2012-01-01

    Background: The context is the new Texas STAAR end-of-course testing program. Purpose: The authors developed a logistic regression model to predict who would pass-or-fail the new Texas chemistry STAAR end-of-course exam. Setting: Robert E. Lee High School (5A) with an enrollment of 2700 students, Tyler, Texas. Date of the study was the 2011-2012…

  18. Using ROC curves to compare neural networks and logistic regression for modeling individual noncatastrophic tree mortality

    Treesearch

    Susan L. King

    2003-01-01

    The performance of two classifiers, logistic regression and neural networks, are compared for modeling noncatastrophic individual tree mortality for 21 species of trees in West Virginia. The output of the classifier is usually a continuous number between 0 and 1. A threshold is selected between 0 and 1 and all of the trees below the threshold are classified as...

  19. Logistic regression trees for initial selection of interesting loci in case-control studies

    PubMed Central

    Nickolov, Radoslav Z; Milanov, Valentin B

    2007-01-01

    Modern genetic epidemiology faces the challenge of dealing with hundreds of thousands of genetic markers. The selection of a small initial subset of interesting markers for further investigation can greatly facilitate genetic studies. In this contribution we suggest the use of a logistic regression tree algorithm known as logistic tree with unbiased selection. Using the simulated data provided for Genetic Analysis Workshop 15, we show how this algorithm, with incorporation of multifactor dimensionality reduction method, can reduce an initial large pool of markers to a small set that includes the interesting markers with high probability. PMID:18466557

  20. Predicting out-of-office blood pressure level using repeated measurements in the clinic: an observational cohort study

    PubMed Central

    Sheppard, James P.; Holder, Roger; Nichols, Linda; Bray, Emma; Hobbs, F.D. Richard; Mant, Jonathan; Little, Paul; Williams, Bryan; Greenfield, Sheila; McManus, Richard J.

    2014-01-01

    Objectives: Identification of people with lower (white-coat effect) or higher (masked effect) blood pressure at home compared to the clinic usually requires ambulatory or home monitoring. This study assessed whether changes in SBP with repeated measurement at a single clinic predict subsequent differences between clinic and home measurements. Methods: This study used an observational cohort design and included 220 individuals aged 35–84 years, receiving treatment for hypertension, but whose SBP was not controlled. The characteristics of change in SBP over six clinic readings were defined as the SBP drop, the slope and the quadratic coefficient using polynomial regression modelling. The predictive abilities of these characteristics for lower or higher home SBP readings were investigated with logistic regression and receiver operating characteristic analysis. Results: The single clinic SBP drop was predictive of the white-coat effect with a sensitivity of 90%, specificity of 50%, positive predictive value of 56% and negative predictive value of 88%. Predictive values for the masked effect and those of the slope and quadratic coefficient were slightly lower, but when the slope and quadratic variables were combined, the sensitivity, specificity, positive and negative predictive values for the masked effect were improved to 91, 48, 24 and 97%, respectively. Conclusion: Characteristics obtainable from multiple SBP measurements in a single clinic in patients with treated hypertension appear to reasonably predict those unlikely to have a large white-coat or masked effect, potentially allowing better targeting of out-of-office monitoring in routine clinical practice. PMID:25144295

  1. Aldosterone and glomerular filtration – observations in the general population

    PubMed Central

    2014-01-01

    Background Increasing evidence suggests that aldosterone promotes renal damage. Since data on the association between aldosterone and renal function in the general population are sparse, we chose to address this issue. We investigated the associations between the plasma aldosterone concentration (PAC) or the aldosterone-to-renin ratio (ARR) and the estimated glomerular filtration rate (eGFR) in a sample of adult men and women from Northeast Germany. Methods A study population of 1921 adult men and women who participated in the first follow-up of the Study of Health in Pomerania was selected. None of the subjects used drugs that alter PAC or ARR. The eGFR was calculated according to the four-variable Modification of Diet in Renal Disease formula. Chronic kidney disease (CKD) was defined as an eGFR <60 ml/min/1.73 m². Results Linear regression models, adjusted for sex, age, waist circumference, diabetes mellitus, smoking status, systolic and diastolic blood pressures, serum triglyceride concentrations and time of blood sampling revealed inverse associations of PAC or ARR with eGFR (β-coefficient for log-transformed PAC −3.12, p < 0.001; β-coefficient for log-transformed ARR −3.36, p < 0.001). Logistic regression models revealed increased odds for CKD with increasing PAC (odds ratio for a one standard deviation increase in PAC: 1.35, 95% confidence interval: 1.06-1.71). There was no statistically significant association between ARR and CKD. Conclusion Our study demonstrates that PAC and ARR are inversely associated with the glomerular filtration rate in the general population. PMID:24612948
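
    The odds ratio reported above can be related back to the underlying logistic regression coefficient, since the odds ratio is the exponential of the coefficient. A minimal sketch, assuming the reported interval is a symmetric Wald interval on the log-odds scale (the abstract does not say how the interval was computed); the function name is hypothetical:

```python
import math

def coef_from_odds_ratio(odds_ratio, ci_low, ci_high, z=1.96):
    """Recover a log-odds coefficient and its standard error from an
    odds ratio and its 95% confidence interval (Wald interval assumed)."""
    beta = math.log(odds_ratio)
    # A Wald interval spans beta +/- z * SE on the log scale.
    se = (math.log(ci_high) - math.log(ci_low)) / (2 * z)
    return beta, se

# Odds ratio and interval quoted in the abstract for a 1-SD increase in PAC.
beta, se = coef_from_odds_ratio(1.35, 1.06, 1.71)
print(f"beta = {beta:.3f}, SE = {se:.3f}")
```

    Because the interval is only assumed to be of Wald type, the recovered standard error is an approximation rather than the value estimated in the study.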

  2. Cotton dust and endotoxin exposure-response relationships in cotton textile workers

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Kennedy, S.M.; Christiani, D.C.; Eisen, E.A.

    Endotoxin exposure has been implicated in the etiology of lung disease in cotton workers. We investigated this potential relationship in 443 cotton workers from 2 factories in Shanghai and 439 control subjects from a nearby silk mill. A respiratory questionnaire was administered and pre- and postshift forced vital capacity (FVC) and forced expiratory volume in one second (FEV1) were determined for each worker. Multiple area air samples were analyzed for total elutriated dust concentration (range: 0.15 to 2.5 mg/m3) and endotoxin (range: 0.002 to 0.55 microgram U.S. Reference Endotoxin/m3). The cotton worker population was stratified by current and cumulative dust or endotoxin exposure. Groups were compared for FEV1, FVC, FEV1/FVC%, % change in FEV1 over the shift (delta FEV1%), and prevalences of chronic bronchitis and byssinosis, and linear and logistic regression models were constructed. No dose-response relationships were demonstrated comparing dust concentration to any pulmonary function or symptom variable. A dose-response trend was seen with the current endotoxin level and FEV1, delta FEV1%, and the prevalence of byssinosis and chronic bronchitis, except for the highest exposure level group in which a reversal of the trend was seen. The regression coefficients for current endotoxin exposure were significant (p < 0.05) in the models for FEV1 and chronic bronchitis but not in the models for delta FEV1% (i.e., acute change in FEV1) or byssinosis prevalence. The coefficient for dust level was never significant in the models.

  3. Using Logistic Regression to Predict the Probability of Debris Flows in Areas Burned by Wildfires, Southern California, 2003-2006

    USGS Publications Warehouse

    Rupert, Michael G.; Cannon, Susan H.; Gartner, Joseph E.; Michael, John A.; Helsel, Dennis R.

    2008-01-01

    Logistic regression was used to develop statistical models that can be used to predict the probability of debris flows in areas recently burned by wildfires by using data from 14 wildfires that burned in southern California during 2003-2006. Twenty-eight independent variables describing the basin morphology, burn severity, rainfall, and soil properties of 306 drainage basins located within those burned areas were evaluated. The models were developed as follows: (1) Basins that did and did not produce debris flows soon after the 2003 to 2006 fires were delineated from data in the National Elevation Dataset using a geographic information system; (2) Data describing the basin morphology, burn severity, rainfall, and soil properties were compiled for each basin. These data were then input to a statistics software package for analysis using logistic regression; and (3) Relations between the occurrence or absence of debris flows and the basin morphology, burn severity, rainfall, and soil properties were evaluated, and five multivariate logistic regression models were constructed. All possible combinations of independent variables were evaluated to determine which combinations produced the most effective models, and the multivariate models that best predicted the occurrence of debris flows were identified. Percentage of high burn severity and 3-hour peak rainfall intensity were significant variables in all models. Soil organic matter content and soil clay content were significant variables in all models except Model 5. Soil slope was a significant variable in all models except Model 4. The most suitable model can be selected from these five models on the basis of the availability of independent variables in the particular area of interest and field checking of probability maps. 
The multivariate logistic regression models can be entered into a geographic information system, and maps showing the probability of debris flows can be constructed in recently burned areas of southern California. This study demonstrates that logistic regression is a valuable tool for developing models that predict the probability of debris flows occurring in recently burned landscapes.
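
    How a fitted multivariate logistic regression model turns basin attributes into a probability can be sketched as follows. The intercept, coefficients, units, and variable names below are hypothetical placeholders chosen for illustration; they are not the values published in the study:

```python
import math

# Hypothetical coefficients for illustration only; not the study's fitted values.
INTERCEPT = -4.0
COEFFICIENTS = {
    "pct_high_burn_severity": 0.03,      # percent of basin burned at high severity
    "peak_3h_rainfall_intensity": 0.40,  # hypothetical units
}

def debris_flow_probability(basin):
    """Return P = 1 / (1 + exp(-(b0 + sum(b_i * x_i)))) for one basin."""
    linear = INTERCEPT + sum(COEFFICIENTS[k] * basin[k] for k in COEFFICIENTS)
    return 1.0 / (1.0 + math.exp(-linear))

basin = {"pct_high_burn_severity": 50.0, "peak_3h_rainfall_intensity": 5.0}
p = debris_flow_probability(basin)
print(f"estimated probability: {p:.3f}")
```

    Evaluating such a function for every basin (or raster cell) in a geographic information system is what yields a probability map.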

  4. Polymorphism Thr160Thr in SRD5A1, involved in the progesterone metabolism, modifies postmenopausal breast cancer risk associated with menopausal hormone therapy.

    PubMed

    Hein, R; Abbas, S; Seibold, P; Salazar, R; Flesch-Janys, D; Chang-Claude, J

    2012-01-01

    Menopausal hormone therapy (MHT) is associated with an increased breast cancer risk in postmenopausal women, with combined estrogen-progestagen therapy posing a greater risk than estrogen monotherapy. However, few studies focused on potential effect modification of MHT-associated breast cancer risk by genetic polymorphisms in the progesterone metabolism. We assessed effect modification of MHT use by five coding single nucleotide polymorphisms (SNPs) in the progesterone metabolizing enzymes AKR1C3 (rs7741), AKR1C4 (rs3829125, rs17134592), and SRD5A1 (rs248793, rs3736316) using a two-center population-based case-control study from Germany with 2,502 postmenopausal breast cancer patients and 4,833 matched controls. An empirical-Bayes procedure that tests for interaction using a weighted combination of the prospective and the retrospective case-control estimators as well as standard prospective logistic regression were applied to assess multiplicative statistical interaction between polymorphisms and duration of MHT use with regard to breast cancer risk assuming a log-additive mode of inheritance. No genetic marginal effects were observed. Breast cancer risk associated with duration of combined therapy was significantly modified by SRD5A1_rs3736316, showing a reduced risk elevation in carriers of the minor allele (p (interaction,empirical-Bayes) = 0.006 using the empirical-Bayes method, p (interaction,logistic regression) = 0.013 using logistic regression). The risk associated with duration of use of monotherapy was increased by AKR1C3_rs7741 in minor allele carriers (p (interaction,empirical-Bayes) = 0.083, p (interaction,logistic regression) = 0.029) and decreased in minor allele carriers of two SNPs in AKR1C4 (rs3829125: p (interaction,empirical-Bayes) = 0.07, p (interaction,logistic regression) = 0.021; rs17134592: p (interaction,empirical-Bayes) = 0.101, p (interaction,logistic regression) = 0.038). 
After Bonferroni correction for multiple testing only SRD5A1_rs3736316 assessed using the empirical-Bayes method remained significant. Postmenopausal breast cancer risk associated with combined therapy may be modified by genetic variation in SRD5A1. Further well-powered studies are, however, required to replicate our finding.

  5. [Correlation coefficient-based classification method of hydrological dependence variability: With auto-regression model as example].

    PubMed

    Zhao, Yu Xi; Xie, Ping; Sang, Yan Fang; Wu, Zi Yi

    2018-04-01

    Hydrological process evaluation is temporally dependent. Hydrological time series that include dependence components do not meet the data consistency assumption for hydrological computation. Both factors cause great difficulty for water research. Given the existence of hydrological dependence variability, we proposed a correlation coefficient-based method for evaluating the significance of hydrological dependence, based on the auto-regression model. By calculating the correlation coefficient between the original series and its dependence component and selecting reasonable thresholds of the correlation coefficient, this method divided the significance degree of dependence into no variability, weak variability, mid variability, strong variability, and drastic variability. By deducing the relationship between the correlation coefficient and the auto-correlation coefficient at each order of the series, we found that the correlation coefficient was mainly determined by the magnitude of the auto-correlation coefficients from order 1 to order p, which clarified the theoretical basis of this method. With the first-order and second-order auto-regression models as examples, the reasonability of the deduced formula was verified through Monte Carlo experiments to classify the relationship between the correlation coefficient and the auto-correlation coefficient. This method was used to analyze three observed hydrological time series. The results indicated the coexistence of stochastic and dependence characteristics in hydrological processes.
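
    For the first-order case, the idea can be illustrated with a small Monte Carlo sketch. It assumes an AR(1) series x_t = phi * x_{t-1} + e_t with Gaussian noise and takes phi * x_{t-1} as the dependence component; under these assumptions the correlation between the series and its dependence component should be close to |phi|. The function names and the value of phi are illustrative and not taken from the paper:

```python
import random
import statistics

def simulate_ar1(phi, n, seed=0):
    """Simulate x_t = phi * x_{t-1} + e_t with standard normal noise."""
    rng = random.Random(seed)
    x = [0.0]
    for _ in range(n - 1):
        x.append(phi * x[-1] + rng.gauss(0.0, 1.0))
    return x

def pearson(a, b):
    """Pearson correlation coefficient of two equal-length sequences."""
    ma, mb = statistics.fmean(a), statistics.fmean(b)
    cov = sum((u - ma) * (v - mb) for u, v in zip(a, b))
    var_a = sum((u - ma) ** 2 for u in a)
    var_b = sum((v - mb) ** 2 for v in b)
    return cov / (var_a * var_b) ** 0.5

phi = 0.6
series = simulate_ar1(phi, n=20000)
original = series[1:]                        # x_t
dependence = [phi * v for v in series[:-1]]  # phi * x_{t-1}
r = pearson(original, dependence)
print(f"correlation with dependence component: {r:.3f}")
```

    Thresholds on this correlation would then separate weak from strong dependence; the threshold values themselves are not given in the abstract and are left out here.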

  6. On the analysis of Canadian Holstein dairy cow lactation curves using standard growth functions.

    PubMed

    López, S; France, J; Odongo, N E; McBride, R A; Kebreab, E; AlZahal, O; McBride, B W; Dijkstra, J

    2015-04-01

    Six classical growth functions (monomolecular, Schumacher, Gompertz, logistic, Richards, and Morgan) were fitted to individual and average (by parity) cumulative milk production curves of Canadian Holstein dairy cows. The data analyzed consisted of approximately 91,000 daily milk yield records corresponding to 122 first, 99 second, and 92 third parity individual lactation curves. The functions were fitted using nonlinear regression procedures, and their performance was assessed using goodness-of-fit statistics (coefficient of determination, residual mean squares, Akaike information criterion, and the correlation and concordance coefficients between observed and adjusted milk yields at several days in milk). Overall, all the growth functions evaluated showed an acceptable fit to the cumulative milk production curves, with the Richards equation ranking first (smallest Akaike information criterion) followed by the Morgan equation. Differences among the functions in their goodness-of-fit were enlarged when fitted to average curves by parity, where the sigmoidal functions with a variable point of inflection (Richards and Morgan) outperformed the other 4 equations. All the functions provided satisfactory predictions of milk yield (calculated from the first derivative of the functions) at different lactation stages, from early to late lactation. The Richards and Morgan equations provided the most accurate estimates of peak yield and total milk production per 305-d lactation, whereas the least accurate estimates were obtained with the logistic equation. In conclusion, classical growth functions (especially sigmoidal functions with a variable point of inflection) proved to be feasible alternatives to fit cumulative milk production curves of dairy cows, resulting in suitable statistical performance and accurate estimates of lactation traits. Copyright © 2015 American Dairy Science Association. Published by Elsevier Inc. All rights reserved.

  7. Applications of statistics to medical science, III. Correlation and regression.

    PubMed

    Watanabe, Hiroshi

    2012-01-01

    In this third part of a series surveying medical statistics, the concepts of correlation and regression are reviewed. In particular, methods of linear regression and logistic regression are discussed. Arguments related to survival analysis will be made in a subsequent paper.

  8. Filtering data from the collaborative initial glaucoma treatment study for improved identification of glaucoma progression.

    PubMed

    Schell, Greggory J; Lavieri, Mariel S; Stein, Joshua D; Musch, David C

    2013-12-21

    Open-angle glaucoma (OAG) is a prevalent, degenerative ocular disease which can lead to blindness without proper clinical management. The tests used to assess disease progression are susceptible to process and measurement noise. The aim of this study was to develop a methodology which accounts for the inherent noise in the data and improves the identification of significant disease progression. Longitudinal observations from the Collaborative Initial Glaucoma Treatment Study (CIGTS) were used to parameterize and validate a Kalman filter model and logistic regression function. The Kalman filter estimates the true value of biomarkers associated with OAG and forecasts future values of these variables. We develop two logistic regression models via generalized estimating equations (GEE) for calculating the probability of experiencing significant OAG progression: one model based on the raw measurements from CIGTS and another model based on the Kalman filter estimates of the CIGTS data. Receiver operating characteristic (ROC) curves and associated area under the ROC curve (AUC) estimates are calculated using cross-fold validation. The logistic regression model developed using Kalman filter estimates as data input achieves higher sensitivity and specificity than the model developed using raw measurements. The mean AUC for the Kalman filter-based model is 0.961 while the mean AUC for the raw measurements model is 0.889. Hence, using the probability function generated via Kalman filter estimates and GEE for logistic regression, we are able to more accurately classify patients and instances as experiencing significant OAG progression. A Kalman filter approach for estimating the true value of OAG biomarkers resulted in data input which improved the accuracy of a logistic regression classification model compared to a model using raw measurements as input. 
This methodology accounts for process and measurement noise to enable improved discrimination between progression and nonprogression in chronic diseases.
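
    The filtering step can be illustrated with a scalar Kalman filter. This is a minimal sketch assuming a random-walk state model with known process and measurement variances; the CIGTS model is not specified in the abstract, so every name and number here is hypothetical:

```python
def kalman_filter_1d(measurements, process_var, meas_var, x0=0.0, p0=1.0):
    """Scalar Kalman filter for x_t = x_{t-1} + w_t, z_t = x_t + v_t."""
    x, p = x0, p0
    estimates = []
    for z in measurements:
        # Predict: a random walk keeps the state, but uncertainty grows.
        p = p + process_var
        # Update: blend prediction and measurement using the Kalman gain.
        k = p / (p + meas_var)
        x = x + k * (z - x)
        p = (1.0 - k) * p
        estimates.append(x)
    return estimates

# Made-up noisy readings of a slowly varying quantity (illustrative only).
noisy = [10.2, 9.7, 10.4, 9.9, 10.1, 10.3, 9.8]
smoothed = kalman_filter_1d(noisy, process_var=0.01, meas_var=0.25, x0=noisy[0])
print([round(v, 2) for v in smoothed])
```

    The filtered estimates, rather than the raw readings, would then serve as inputs to the logistic regression model.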

  9. Computing group cardinality constraint solutions for logistic regression problems.

    PubMed

    Zhang, Yong; Kwon, Dongjin; Pohl, Kilian M

    2017-01-01

    We derive an algorithm to directly solve logistic regression based on cardinality constraint, group sparsity and use it to classify intra-subject MRI sequences (e.g. cine MRIs) of healthy from diseased subjects. Group cardinality constraint models are often applied to medical images in order to avoid overfitting of the classifier to the training data. Solutions within these models are generally determined by relaxing the cardinality constraint to a weighted feature selection scheme. However, these solutions relate to the original sparse problem only under specific assumptions, which generally do not hold for medical image applications. In addition, inferring clinical meaning from features weighted by a classifier is an ongoing topic of discussion. Avoiding weighing features, we propose to directly solve the group cardinality constraint logistic regression problem by generalizing the Penalty Decomposition method. To do so, we assume that an intra-subject series of images represents repeated samples of the same disease patterns. We model this assumption by combining series of measurements created by a feature across time into a single group. Our algorithm then derives a solution within that model by decoupling the minimization of the logistic regression function from enforcing the group sparsity constraint. The minimum to the smooth and convex logistic regression problem is determined via gradient descent while we derive a closed form solution for finding a sparse approximation of that minimum. We apply our method to cine MRI of 38 healthy controls and 44 adult patients that received reconstructive surgery of Tetralogy of Fallot (TOF) during infancy. Our method correctly identifies regions impacted by TOF and generally obtains statistically significant higher classification accuracy than alternative solutions to this model, i.e., ones relaxing group cardinality constraints. Copyright © 2016 Elsevier B.V. All rights reserved.
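
    The closed-form sparse approximation step mentioned above can be sketched as a projection onto a group cardinality constraint: keep the k groups with the largest Euclidean norm and set the rest to zero. This is a generic illustration of such a projection, assumed here for exposition; the grouping, the value of k, and the function name are hypothetical, and the paper's exact update may differ:

```python
import math

def project_group_cardinality(weights, groups, k):
    """Keep the k groups of weights with the largest L2 norm; zero all others.

    weights: list of floats; groups: list of index lists partitioning weights.
    """
    norms = [math.sqrt(sum(weights[i] ** 2 for i in g)) for g in groups]
    keep = sorted(range(len(groups)), key=lambda j: norms[j], reverse=True)[:k]
    projected = [0.0] * len(weights)
    for j in keep:
        for i in groups[j]:
            projected[i] = weights[i]
    return projected

# Six weights in three groups, e.g. one feature measured at two time points each.
w = [0.1, -0.2, 3.0, 4.0, 0.5, 0.5]
groups = [[0, 1], [2, 3], [4, 5]]
sparse_w = project_group_cardinality(w, groups, k=1)
print(sparse_w)
```

    In a penalty decomposition scheme, a step like this would alternate with gradient descent on the smooth logistic loss, so the two subproblems are handled separately.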

  10. Influential factors of red-light running at signalized intersection and prediction using a rare events logistic regression model.

    PubMed

    Ren, Yilong; Wang, Yunpeng; Wu, Xinkai; Yu, Guizhen; Ding, Chuan

    2016-10-01

    Red light running (RLR) has become a major safety concern at signalized intersections. To prevent RLR-related crashes, it is critical to identify the factors that significantly affect drivers' RLR behavior, and to predict potential RLR in real time. In this research, 9 months of RLR events extracted from high-resolution traffic data collected by loop detectors at three signalized intersections were used to identify the factors that significantly affect RLR behavior. The data analysis indicated that occupancy time, time gap, used yellow time, time left to yellow start, whether the preceding vehicle runs through the intersection during yellow, and whether there is a vehicle passing through the intersection on the adjacent lane were significant factors for RLR behavior. Furthermore, due to the rare-events nature of RLR, a modified rare events logistic regression model was developed for RLR prediction. The rare events logistic regression method has been applied in many fields for rare events studies and shows impressive performance, but so far no previous research has applied this method to study RLR. The results showed that the rare events logistic regression model performed significantly better than the standard logistic regression model. More importantly, the proposed RLR prediction method is purely based on loop detector data collected from a single advance loop detector located 400 feet away from the stop bar. This brings great potential for future field applications of the proposed method, since loops have been widely implemented at many intersections and can collect data in real time. This research is expected to contribute significantly to the improvement of intersection safety. Copyright © 2016 Elsevier Ltd. All rights reserved.

  11. Use of genetic programming, logistic regression, and artificial neural nets to predict readmission after coronary artery bypass surgery.

    PubMed

    Engoren, Milo; Habib, Robert H; Dooner, John J; Schwann, Thomas A

    2013-08-01

    As many as 14% of patients undergoing coronary artery bypass surgery are readmitted within 30 days. Readmission is usually the result of morbidity and may lead to death. The purpose of this study is to develop and compare statistical and genetic programming models to predict readmission. Patients were divided into separate Construction and Validation populations. Using 88 variables, logistic regression, genetic programs, and artificial neural nets were used to develop predictive models. Models were first constructed and tested on the Construction population, then validated on the Validation population. Areas under the receiver operator characteristic curves (AU ROC) were used to compare the models. Two hundred and two patients (7.6%) in the 2,644-patient Construction group and 216 (8.0%) of the 2,711-patient Validation group were readmitted within 30 days of CABG surgery. Logistic regression predicted readmission with AU ROC = .675 ± .021 in the Construction group. Genetic programs significantly improved the accuracy (AU ROC = .767 ± .001, p < .001). Artificial neural nets were less accurate, with AU ROC = 0.597 ± .001 in the Construction group. Predictive accuracy of all three techniques fell in the Validation group. However, the accuracy of genetic programming (AU ROC = .654 ± .001) was still slightly, though not statistically significantly, better than that of logistic regression (AU ROC = .644 ± .020, p = .61). Genetic programming and logistic regression provide alternative methods to predict readmission that are similarly accurate.

  12. Artificial neural network, genetic algorithm, and logistic regression applications for predicting renal colic in emergency settings.

    PubMed

    Eken, Cenker; Bilge, Ugur; Kartal, Mutlu; Eray, Oktay

    2009-06-03

    Logistic regression is the most common statistical model for processing multivariate data in the medical literature. Artificial intelligence models such as an artificial neural network (ANN) and a genetic algorithm (GA) may also be useful for interpreting medical data. The purpose of this study was to apply artificial intelligence models to a medical data sheet and compare them with logistic regression. ANN, GA, and logistic regression analysis were carried out on a data sheet of a previously published article regarding patients presenting to an emergency department with flank pain suspicious for renal colic. The study population was composed of 227 patients: 176 patients had a diagnosis of urinary stone, while 51 ultimately had no calculus. The GA found two decision rules in predicting urinary stones. Rule 1 consisted of being male, pain not spreading to back, and no fever. In rule 2, pelvicaliceal dilatation on bedside ultrasonography replaced no fever. ANN, GA rule 1, GA rule 2, and logistic regression had a sensitivity of 94.9, 67.6, 56.8, and 95.5%, a specificity of 78.4, 76.47, 86.3, and 47.1%, a positive likelihood ratio of 4.4, 2.9, 4.1, and 1.8, and a negative likelihood ratio of 0.06, 0.42, 0.5, and 0.09, respectively. The area under the curve was found to be 0.867, 0.720, 0.715, and 0.713 for all applications, respectively. Data mining techniques such as ANN and GA can be used for predicting renal colic in emergency settings and to constitute clinical decision rules. They may be an alternative to conventional multivariate analysis applications used in biostatistics.
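
    The likelihood ratios quoted above follow from sensitivity and specificity: LR+ = sensitivity / (1 - specificity) and LR- = (1 - sensitivity) / specificity. A short sketch using the sensitivity and specificity reported for the ANN; the function name is illustrative:

```python
def likelihood_ratios(sensitivity, specificity):
    """Return (LR+, LR-) from sensitivity and specificity given as fractions."""
    lr_pos = sensitivity / (1.0 - specificity)
    lr_neg = (1.0 - sensitivity) / specificity
    return lr_pos, lr_neg

# ANN figures from the abstract: sensitivity 94.9%, specificity 78.4%.
lr_pos, lr_neg = likelihood_ratios(0.949, 0.784)
print(f"LR+ = {lr_pos:.2f}, LR- = {lr_neg:.3f}")
```

    Values recomputed from the rounded percentages may differ in the last digit from those in the abstract, which were presumably derived from the unrounded counts.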

  13. Detection of Cutting Tool Wear using Statistical Analysis and Regression Model

    NASA Astrophysics Data System (ADS)

    Ghani, Jaharah A.; Rizal, Muhammad; Nuawi, Mohd Zaki; Haron, Che Hassan Che; Ramli, Rizauddin

    2010-10-01

    This study presents a new method for detecting cutting tool wear based on measured cutting force signals. A statistics-based method, the Integrated Kurtosis-based Algorithm for Z-Filter technique (I-kaz), was used for developing a regression model and a 3D graphic presentation of the I-kaz 3D coefficient during the machining process. The machining tests were carried out using a CNC turning machine Colchester Master Tornado T4 in dry cutting condition. A Kistler 9255B dynamometer was used to measure the cutting force signals, which were transmitted, analyzed, and displayed in the DasyLab software. Various force signals from the machining operation were analyzed, and each has its own I-kaz 3D coefficient. This coefficient was examined and its relationship with flank wear lands (VB) was determined. A regression model was developed based on this relationship, and the results of the regression model show that the I-kaz 3D coefficient value decreases as tool wear increases. The result is then used for real-time tool wear monitoring.

  14. New robust statistical procedures for the polytomous logistic regression models.

    PubMed

    Castilla, Elena; Ghosh, Abhik; Martin, Nirian; Pardo, Leandro

    2018-05-17

    This article derives a new family of estimators, namely the minimum density power divergence estimators, as a robust generalization of the maximum likelihood estimator for the polytomous logistic regression model. Based on these estimators, a family of Wald-type test statistics for linear hypotheses is introduced. Robustness properties of both the proposed estimators and the test statistics are theoretically studied through the classical influence function analysis. Appropriate real life examples are presented to justify the requirement of suitable robust statistical procedures in place of the likelihood based inference for the polytomous logistic regression model. The validity of the theoretical results established in the article is further confirmed empirically through suitable simulation studies. Finally, an approach for the data-driven selection of the robustness tuning parameter is proposed with empirical justifications. © 2018, The International Biometric Society.

  15. Updated logistic regression equations for the calculation of post-fire debris-flow likelihood in the western United States

    USGS Publications Warehouse

    Staley, Dennis M.; Negri, Jacquelyn A.; Kean, Jason W.; Laber, Jayme L.; Tillery, Anne C.; Youberg, Ann M.

    2016-06-30

    Wildfire can significantly alter the hydrologic response of a watershed to the extent that even modest rainstorms can generate dangerous flash floods and debris flows. To reduce public exposure to hazard, the U.S. Geological Survey produces post-fire debris-flow hazard assessments for select fires in the western United States. We use publicly available geospatial data describing basin morphology, burn severity, soil properties, and rainfall characteristics to estimate the statistical likelihood that debris flows will occur in response to a storm of a given rainfall intensity. Using an empirical database and refined geospatial analysis methods, we defined new equations for the prediction of debris-flow likelihood using logistic regression methods. We showed that the new logistic regression model outperformed previous models used to predict debris-flow likelihood.

  16. Nowcasting of Low-Visibility Procedure States with Ordered Logistic Regression at Vienna International Airport

    NASA Astrophysics Data System (ADS)

    Kneringer, Philipp; Dietz, Sebastian; Mayr, Georg J.; Zeileis, Achim

    2017-04-01

    Low-visibility conditions have a large impact on aviation safety and economic efficiency of airports and airlines. To support decision makers, we develop a statistical probabilistic nowcasting tool for the occurrence of capacity-reducing operations related to low visibility. The probabilities of four different low visibility classes are predicted with an ordered logistic regression model based on time series of meteorological point measurements. Potential predictor variables for the statistical models are visibility, humidity, temperature and wind measurements at several measurement sites. A stepwise variable selection method indicates that visibility and humidity measurements are the most important model inputs. The forecasts are tested with a 30 minute forecast interval up to two hours, which is a sufficient time span for tactical planning at Vienna Airport. The ordered logistic regression models outperform persistence and are competitive with human forecasters.
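
    An ordered logistic regression of the kind described above turns one linear predictor and a set of increasing cut points into a probability for each ordered class. The sketch below illustrates the general cumulative-logit form only; the cut points and predictor value are hypothetical, not fitted values from the study, and the function name `ordered_logit_probs` is an assumption.

```python
import math

def ordered_logit_probs(eta, thresholds):
    """Class probabilities for an ordered logit model.

    Uses the cumulative form P(Y <= k) = 1 / (1 + exp(-(theta_k - eta))),
    where `thresholds` holds the increasing cut points theta_k.
    """
    cumulative = [1.0 / (1.0 + math.exp(-(theta - eta))) for theta in thresholds]
    cumulative.append(1.0)  # the last class absorbs the remaining probability
    probs = [cumulative[0]]
    for k in range(1, len(cumulative)):
        probs.append(cumulative[k] - cumulative[k - 1])
    return probs

# Hypothetical values: three cut points give four ordered classes.
probs = ordered_logit_probs(eta=0.4, thresholds=[-1.0, 0.5, 2.0])
print([round(p, 3) for p in probs])
```

    Four low-visibility classes would need three cut points, as in this example; a larger linear predictor shifts probability toward the higher classes.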

  17. A computational approach to compare regression modelling strategies in prediction research.

    PubMed

    Pajouheshnia, Romin; Pestman, Wiebe R; Teerenstra, Steven; Groenwold, Rolf H H

    2016-08-25

    It is often unclear which approach to fit, assess and adjust a model will yield the most accurate prediction model. We present an extension of an approach for comparing modelling strategies in linear regression to the setting of logistic regression and demonstrate its application in clinical prediction research. A framework for comparing logistic regression modelling strategies by their likelihoods was formulated using a wrapper approach. Five different strategies for modelling, including simple shrinkage methods, were compared in four empirical data sets to illustrate the concept of a priori strategy comparison. Simulations were performed in both randomly generated data and empirical data to investigate the influence of data characteristics on strategy performance. We applied the comparison framework in a case study setting. Optimal strategies were selected based on the results of a priori comparisons in a clinical data set and the performance of models built according to each strategy was assessed using the Brier score and calibration plots. The performance of modelling strategies was highly dependent on the characteristics of the development data in both linear and logistic regression settings. A priori comparisons in four empirical data sets found that no strategy consistently outperformed the others. The percentage of times that a model adjustment strategy outperformed a logistic model ranged from 3.9 to 94.9 %, depending on the strategy and data set. However, in our case study setting the a priori selection of optimal methods did not result in detectable improvement in model performance when assessed in an external data set. The performance of prediction modelling strategies is a data-dependent process and can be highly variable between data sets within the same clinical domain. A priori strategy comparison can be used to determine an optimal logistic regression modelling strategy for a given data set before selecting a final modelling approach.
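
    The Brier score used above to assess model performance is the mean squared difference between predicted probabilities and observed binary outcomes, with lower values indicating better predictions. A minimal sketch with made-up predictions (the data and the helper name `brier_score` are hypothetical):

```python
def brier_score(predicted_probs, outcomes):
    """Mean squared error between predicted probabilities and 0/1 outcomes."""
    if len(predicted_probs) != len(outcomes):
        raise ValueError("inputs must have the same length")
    squared_errors = [(p - y) ** 2 for p, y in zip(predicted_probs, outcomes)]
    return sum(squared_errors) / len(squared_errors)

# Hypothetical predictions for four patients and their observed outcomes
predicted = [0.9, 0.2, 0.7, 0.1]
observed = [1, 0, 1, 0]
print(f"Brier score = {brier_score(predicted, observed):.4f}")
```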

  18. SCI model structure determination program (OSR) user's guide. [optimal subset regression]

    NASA Technical Reports Server (NTRS)

    1979-01-01

    The computer program, OSR (Optimal Subset Regression) which estimates models for rotorcraft body and rotor force and moment coefficients is described. The technique used is based on the subset regression algorithm. Given time histories of aerodynamic coefficients, aerodynamic variables, and control inputs, the program computes correlation between various time histories. The model structure determination is based on these correlations. Inputs and outputs of the program are given.

  19. Cytopathologic differential diagnosis of low-grade urothelial carcinoma and reactive urothelial proliferation in bladder washings: a logistic regression analysis.

    PubMed

    Cakir, Ebru; Kucuk, Ulku; Pala, Emel Ebru; Sezer, Ozlem; Ekin, Rahmi Gokhan; Cakmak, Ozgur

    2017-05-01

    Conventional cytomorphologic assessment is the first step to establish an accurate diagnosis in urinary cytology. In cytologic preparations, the separation of low-grade urothelial carcinoma (LGUC) from reactive urothelial proliferation (RUP) can be exceedingly difficult. The bladder washing cytologies of 32 LGUC and 29 RUP were reviewed. The cytologic slides were examined for the presence or absence of the 28 cytologic features. The cytologic criteria showing statistical significance in LGUC were increased numbers of monotonous single (non-umbrella) cells, three-dimensional cellular papillary clusters without fibrovascular cores, irregular bordered clusters, atypical single cells, irregular nuclear overlap, cytoplasmic homogeneity, increased N/C ratio, pleomorphism, nuclear border irregularity, nuclear eccentricity, elongated nuclei, and hyperchromasia (p < 0.05), and the cytologic criteria showing statistical significance in RUP were inflammatory background, mixture of small and large urothelial cells, loose monolayer aggregates, and vacuolated cytoplasm (p < 0.05). When these variables were subjected to a stepwise logistic regression analysis, four features were selected to distinguish LGUC from RUP: increased numbers of monotonous single (non-umbrella) cells, increased nuclear cytoplasmic ratio, hyperchromasia, and presence of small and large urothelial cells (p = 0.0001). By this logistic model of the 32 cases with proven LGUC, the stepwise logistic regression analysis correctly predicted 31 (96.9%) patients with this diagnosis, and of the 29 patients with RUP, the logistic model correctly predicted 26 (89.7%) patients as having this disease. There are several cytologic features to separate LGUC from RUP. Stepwise logistic regression analysis is a valuable tool for determining the most useful cytologic criteria to distinguish these entities. © 2017 APMIS. Published by John Wiley & Sons Ltd.

  20. Ecotoxicology of phenylphosphonothioates.

    PubMed Central

    Francis, B M; Hansen, L G; Fukuto, T R; Lu, P Y; Metcalf, R L

    1980-01-01

    The phenylphosphonothioate insecticides EPN and leptophos, and several analogs, were evaluated with respect to their delayed neurotoxic effects in hens and their environmental behavior in a terrestrial-aquatic model ecosystem. Acute toxicity to insects was highly correlated with Σσ of the substituted phenyl group (regression coefficient r = -0.91) while acute toxicity to mammals was slightly less well correlated (regression coefficient r = -0.71), and neurotoxicity was poorly correlated with Σσ (regression coefficient r = -0.35). Both EPN and leptophos were markedly more persistent and bioaccumulative in the model ecosystem than parathion. Desbromoleptophos, a contaminant and metabolite of leptophos, was seen to be a highly stable and persistent terminal residue of leptophos. PMID:6159210

  1. Science of Test Research Consortium: Year Two Final Report

    DTIC Science & Technology

    2012-10-02

    July 2012. Analysis of an Intervention for Small Unmanned Aerial System ( SUAS ) Accidents, submitted to Quality Engineering, LQEN-2012-0056. Stone... Systems Engineering. Wolf, S. E., R. R. Hill, and J. J. Pignatiello. June 2012. Using Neural Networks and Logistic Regression to Model Small Unmanned ...Human Retina. 6. Wolf, S. E. March 2012. Modeling Small Unmanned Aerial System Mishaps using Logistic Regression and Artificial Neural Networks. 7

  2. Binary Logistic Regression Analysis for Detecting Differential Item Functioning: Effectiveness of R[superscript 2] and Delta Log Odds Ratio Effect Size Measures

    ERIC Educational Resources Information Center

    Hidalgo, Mª Dolores; Gómez-Benito, Juana; Zumbo, Bruno D.

    2014-01-01

    The authors analyze the effectiveness of the R[superscript 2] and delta log odds ratio effect size measures when using logistic regression analysis to detect differential item functioning (DIF) in dichotomous items. A simulation study was carried out, and the Type I error rate and power estimates under conditions in which only statistical testing…

  3. Logistic quantile regression provides improved estimates for bounded avian counts: a case study of California Spotted Owl fledgling production

    Treesearch

    Brian S. Cade; Barry R. Noon; Rick D. Scherer; John J. Keane

    2017-01-01

    Counts of avian fledglings, nestlings, or clutch size that are bounded below by zero and above by some small integer form a discrete random variable distribution that is not approximated well by conventional parametric count distributions such as the Poisson or negative binomial. We developed a logistic quantile regression model to provide estimates of the empirical...

  4. Comparison of four methods for deriving hospital standardised mortality ratios from a single hierarchical logistic regression model.

    PubMed

    Mohammed, Mohammed A; Manktelow, Bradley N; Hofer, Timothy P

    2016-04-01

    There is interest in deriving case-mix adjusted standardised mortality ratios so that comparisons between healthcare providers, such as hospitals, can be undertaken in the controversial belief that variability in standardised mortality ratios reflects quality of care. Typically standardised mortality ratios are derived using a fixed effects logistic regression model, without a hospital term in the model. This fails to account for the hierarchical structure of the data - patients nested within hospitals - and so a hierarchical logistic regression model is more appropriate. However, four methods have been advocated for deriving standardised mortality ratios from a hierarchical logistic regression model, but their agreement is not known and neither do we know which is to be preferred. We found significant differences between the four types of standardised mortality ratios because they reflect a range of underlying conceptual issues. The most subtle issue is the distinction between asking how an average patient fares in different hospitals versus how patients at a given hospital fare at an average hospital. Since the answers to these questions are not the same and since the choice between these two approaches is not obvious, the extent to which profiling hospitals on mortality can be undertaken safely and reliably, without resolving these methodological issues, remains questionable. © The Author(s) 2012.
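
    The conventional derivation mentioned above divides the deaths observed at a hospital by the deaths expected under the case-mix model, where the expected count is the sum of the patients' predicted risks. The sketch below shows only that basic ratio, not any of the four hierarchical methods the paper compares; the patient data and the function name are hypothetical.

```python
def standardised_mortality_ratio(deaths, predicted_risks):
    """Observed deaths divided by expected deaths for one hospital.

    deaths: 0/1 outcome per patient.
    predicted_risks: model-predicted probability of death per patient.
    """
    observed = sum(deaths)
    expected = sum(predicted_risks)
    if expected == 0:
        raise ValueError("expected deaths must be positive")
    return observed / expected

# Hypothetical hospital with five patients
deaths = [1, 0, 0, 1, 0]
predicted_risks = [0.40, 0.10, 0.20, 0.50, 0.05]
print(f"SMR = {standardised_mortality_ratio(deaths, predicted_risks):.2f}")
```

    A ratio above 1 means more deaths were observed than the case-mix model expected.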

  5. Three methods to construct predictive models using logistic regression and likelihood ratios to facilitate adjustment for pretest probability give similar results.

    PubMed

    Chan, Siew Foong; Deeks, Jonathan J; Macaskill, Petra; Irwig, Les

    2008-01-01

    To compare three predictive models based on logistic regression to estimate adjusted likelihood ratios allowing for interdependency between diagnostic variables (tests). This study was a review of the theoretical basis, assumptions, and limitations of published models; and a statistical extension of methods and application to a case study of the diagnosis of obstructive airways disease based on history and clinical examination. Albert's method includes an offset term to estimate an adjusted likelihood ratio for combinations of tests. Spiegelhalter and Knill-Jones method uses the unadjusted likelihood ratio for each test as a predictor and computes shrinkage factors to allow for interdependence. Knottnerus' method differs from the other methods because it requires sequencing of tests, which limits its application to situations where there are few tests and substantial data. Although parameter estimates differed between the models, predicted "posttest" probabilities were generally similar. Construction of predictive models using logistic regression is preferred to the independence Bayes' approach when it is important to adjust for dependency of tests errors. Methods to estimate adjusted likelihood ratios from predictive models should be considered in preference to a standard logistic regression model to facilitate ease of interpretation and application. Albert's method provides the most straightforward approach.
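
    The adjustment for pretest probability discussed above rests on the odds form of Bayes' theorem: posttest odds = pretest odds × likelihood ratio. The sketch below applies it under the independence Bayes assumption, in which the likelihood ratios of separate tests are simply multiplied; this is the simple approach the abstract contrasts with logistic regression, not Albert's, Spiegelhalter and Knill-Jones', or Knottnerus' method. All numbers and the function name are hypothetical.

```python
def posttest_probability(pretest_prob, likelihood_ratios):
    """Posttest probability after applying one or more likelihood ratios.

    Assumption: tests are conditionally independent, so their
    likelihood ratios multiply (independence Bayes).
    """
    odds = pretest_prob / (1.0 - pretest_prob)
    for lr in likelihood_ratios:
        odds *= lr
    return odds / (1.0 + odds)

# Hypothetical example: pretest probability 20 %, two positive findings
print(f"{posttest_probability(0.20, [4.0, 2.0]):.3f}")
```

    When test errors are dependent, multiplying unadjusted likelihood ratios in this way tends to overstate the evidence, which is the motivation for the adjusted methods compared in the paper.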

  6. A comparison of three methods of assessing differential item functioning (DIF) in the Hospital Anxiety Depression Scale: ordinal logistic regression, Rasch analysis and the Mantel chi-square procedure.

    PubMed

    Cameron, Isobel M; Scott, Neil W; Adler, Mats; Reid, Ian C

    2014-12-01

    It is important for clinical practice and research that measurement scales of well-being and quality of life exhibit only minimal differential item functioning (DIF). DIF occurs where different groups of people endorse items in a scale to different extents after being matched by the intended scale attribute. We investigate the equivalence or otherwise of common methods of assessing DIF. Three methods of measuring age- and sex-related DIF (ordinal logistic regression, Rasch analysis and Mantel χ² procedure) were applied to Hospital Anxiety Depression Scale (HADS) data pertaining to a sample of 1,068 patients consulting primary care practitioners. Three items were flagged by all three approaches as having either age- or sex-related DIF with a consistent direction of effect; a further three items identified did not meet stricter criteria for important DIF using at least one method. When applying strict criteria for significant DIF, ordinal logistic regression was slightly less sensitive. Ordinal logistic regression, Rasch analysis and contingency table methods yielded consistent results when identifying DIF in the HADS depression and HADS anxiety scales. Regardless of methods applied, investigators should use a combination of statistical significance, magnitude of the DIF effect and investigator judgement when interpreting the results.

  7. Extreme Sparse Multinomial Logistic Regression: A Fast and Robust Framework for Hyperspectral Image Classification

    NASA Astrophysics Data System (ADS)

    Cao, Faxian; Yang, Zhijing; Ren, Jinchang; Ling, Wing-Kuen; Zhao, Huimin; Marshall, Stephen

    2017-12-01

    Although the sparse multinomial logistic regression (SMLR) has provided a useful tool for sparse classification, it suffers from inefficacy in dealing with high dimensional features and manually set initial regressor values. This has significantly constrained its applications for hyperspectral image (HSI) classification. In order to tackle these two drawbacks, an extreme sparse multinomial logistic regression (ESMLR) is proposed for effective classification of HSI. First, the HSI dataset is projected to a new feature space with randomly generated weight and bias. Second, an optimization model is established by the Lagrange multiplier method and the dual principle to automatically determine a good initial regressor for SMLR via minimizing the training error and the regressor value. Furthermore, the extended multi-attribute profiles (EMAPs) are utilized for extracting both the spectral and spatial features. A combinational linear multiple features learning (MFL) method is proposed to further enhance the features extracted by ESMLR and EMAPs. Finally, the logistic regression via the variable splitting and the augmented Lagrangian (LORSAL) is adopted in the proposed framework for reducing the computational time. Experiments are conducted on two well-known HSI datasets, namely the Indian Pines dataset and the Pavia University dataset, which have shown the fast and robust performance of the proposed ESMLR framework.

  8. Adherence to preferable behavior for lipid control by high-risk dyslipidemic Japanese patients under pravastatin treatment: the APPROACH-J study.

    PubMed

    Kitagawa, Yasuhisa; Teramoto, Tamio; Daida, Hiroyuki

    2012-01-01

    We evaluated the impact of adherence to preferable behavior on serum lipid control assessed by a self-reported questionnaire in high-risk patients taking pravastatin for primary prevention of coronary artery disease. High-risk patients taking pravastatin were followed for 2 years. Questionnaire surveys comprising 21 questions, including 18 questions concerning awareness of health, and current status of diet, exercise, and drug therapy, were conducted at baseline and after 1 year. Potential domains were established by factor analysis from the results of questionnaires, and adherence scores were calculated in each domain. The relationship between adherence scores and lipid values during the 1-year treatment period was analyzed by each domain using multiple regression analysis. A total of 5,792 patients taking pravastatin were included in the analysis. Multiple regression analysis showed a significant correlation in terms of "Intake of high fat/cholesterol/sugar foods" (regression coefficient -0.58, p=0.0105) and "Adherence to instructions for drug therapy" (regression coefficient -6.61, p<0.0001). Low-density lipoprotein cholesterol (LDL-C) values were significantly lower in patients who had an increase in the adherence score in the "Awareness of health" domain compared with those with a decreased score. There was a significant correlation between high-density lipoprotein (HDL-C) values and "Awareness of health" (regression coefficient 0.26; p= 0.0037), "Preferable dietary behaviors" (regression coefficient 0.75; p<0.0001), and "Exercise" (regression coefficient 0.73; p= 0.0002). Similar relations were seen with triglycerides. In patients who have a high awareness of their health, a positive attitude toward lipid-lowering treatment including diet, exercise, and high adherence to drug therapy, is related with favorable overall lipid control even in patients under treatment with pravastatin.

  9. Choline in anxiety and depression: the Hordaland Health Study.

    PubMed

    Bjelland, Ingvar; Tell, Grethe S; Vollset, Stein E; Konstantinova, Svetlana; Ueland, Per M

    2009-10-01

    Despite its importance in the central nervous system as a precursor for acetylcholine and membrane phosphatidylcholine, the role of choline in mental illness has been little studied. We examined the cross-sectional association between plasma choline concentrations and scores of anxiety and depression symptoms in a general population sample. We studied a subsample (n = 5918) of the Hordaland Health Study, including both sexes and 2 age groups of 46-49 and 70-74 y who had valid information on plasma choline concentrations and symptoms of anxiety and depression measured by the Hospital Anxiety and Depression Scale--the latter 2 as continuous measures and dichotomized at a score ≥8 for both subscales. The lowest choline quintile was significantly associated with high anxiety levels (odds ratio: 1.33; 95% CI: 1.06, 1.69) in the fully adjusted (age group, sex, time since last meal, educational level, and smoking habits) logistic regression model. Also, the trend test in the anxiety model was significant (P = 0.007). In the equivalent fully adjusted linear regression model, a significant inverse association was found between choline quintiles and anxiety levels (standardized regression coefficient = -0.027, P = 0.045). We found no significant associations in the corresponding analyses of the relation between plasma choline and depression symptoms. In this large population-based study, choline concentrations were negatively associated with anxiety symptoms but not with depression symptoms.
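
    An odds ratio from a logistic regression is the exponential of the corresponding coefficient, so the reported odds ratio and confidence interval can be mapped back to the log-odds scale. The sketch below does this for the anxiety result above; it assumes a Wald-type 95% interval that is symmetric on the log scale, which the abstract does not state, and the function name is hypothetical.

```python
import math

def log_odds_from_or(odds_ratio, ci_low, ci_high, z_crit=1.959964):
    """Recover a logistic regression coefficient and its standard error.

    Assumption: the interval is a Wald interval, exp(beta +/- z_crit * SE).
    """
    beta = math.log(odds_ratio)
    se = (math.log(ci_high) - math.log(ci_low)) / (2.0 * z_crit)
    return beta, se

# Lowest choline quintile vs. high anxiety: OR 1.33 (95% CI 1.06, 1.69)
beta, se = log_odds_from_or(1.33, 1.06, 1.69)
z = beta / se
p = math.erfc(abs(z) / math.sqrt(2))  # two-sided normal p-value
print(f"beta = {beta:.3f}, SE = {se:.3f}, z = {z:.2f}, p = {p:.3f}")
```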

  10. Social inequalities in depression and suicidal ideation among older primary care patients

    PubMed Central

    Gilman, Stephen E.; Bruce, Martha L.; Have, Thomas Ten; Alexopoulos, George S.; Mulsant, Benoit H.; Reynolds, Charles F.; Cohen, Alex

    2012-01-01

    Purpose: Depression and suicide are major public health concerns, and are often unrecognized among the elderly. This study investigated social inequalities in depressive symptoms and suicidal ideation among older adults. Methods: Data come from 1,226 participants in PROSPECT (Prevention of Suicide in Primary Care Elderly: Collaborative Trial), a large primary care-based intervention trial for late-life depression. Linear and logistic regressions were used to analyze depressive symptoms and suicidal ideation over the two-year follow-up period. Results: Mean Hamilton Depression Rating Scale (HDRS) scores were significantly higher among participants in financial strain (regression coefficient (b)=1.78, 95% confidence interval (CI)=0.67–2.89) and with annual incomes below $20,000 (b=1.67, CI=0.34–3.00). Financial strain was also associated with a higher risk of suicidal ideation (odds ratio=2.35, CI=1.38–3.98). Conclusions: There exist marked social inequalities in depressive symptoms and suicidal ideation among older adults attending primary care practices, the setting in which depression is most commonly treated. Our results justify continued efforts to understand the mechanisms generating such inequalities, and to recognize and provide effective treatments for depression among high-risk populations. PMID:22948560

  11. Use of the NASA Giovanni Data System for Geospatial Public Health Research: Example of Weather-Influenza Connection

    NASA Technical Reports Server (NTRS)

    Acker, James G.; Soebiyanto, Radina; Kiang, Richard; Kempler, Steve

    2014-01-01

    The NASA Giovanni data analysis system has been recognized as a useful tool to access and analyze many different types of remote sensing data. The variety of environmental data types has allowed the use of Giovanni for different application areas, such as agriculture, hydrology, and air quality research. The use of Giovanni for researching connections between public health issues and Earth's environment and climate, potentially exacerbated by anthropogenic influence, has been increasingly demonstrated. In this communication, the pertinence of several different data parameters to public health will be described. This communication also provides a case study of the use of remote sensing data from Giovanni in assessing the associations between seasonal influenza and meteorological parameters. In this study, logistic regression was employed with precipitation, temperature and specific humidity as predictors. Specific humidity was found to be associated (p < 0.05) with influenza activity in both temperate and tropical climates. In the two temperate locations studied, specific humidity was negatively correlated with influenza; conversely, in the three tropical locations, specific humidity was positively correlated with influenza. Influenza prediction using the regression models showed good agreement with the observed data (correlation coefficient of 0.5-0.83).

  12. Use of a tracing task to assess visuomotor performance for evidence of concussion and recuperation.

    PubMed

    Kelty-Stephen, Damian G; Qureshi Ahmad, Mona; Stirling, Leia

    2015-12-01

    The likelihood of suffering a concussion while playing a contact sport ranges from 15-45% per year of play. These rates are highly variable as athletes seldom report concussive symptoms, or do not recognize their symptoms. We performed a prospective cohort study (n = 206, aged 10-17) to examine visuomotor tracing to determine the sensitivity for detecting neuromotor components of concussion. Tracing variability measures were investigated for a mean shift with presentation of concussion-related symptoms and a linear return toward baseline over subsequent return visits. Furthermore, previous research relating brain injury to the dissociation of smooth movements into "submovements" led to the expectation that cumulative micropause duration, a measure of motion continuity, might detect likelihood of injury. Separate linear mixed effects regressions of tracing measures indicated that 4 of the 5 tracing measures captured both short-term effects of injury and longer-term effects of recovery with subsequent visits. Cumulative micropause duration has a positive relationship with likelihood of participants having had a concussion. The present results suggest that future research should evaluate how well the coefficients for the tracing parameter in the logistic regression help to detect concussion in novel cases. (c) 2015 APA, all rights reserved.

  13. The association between season of pregnancy and birth-sex among Chinese.

    PubMed

    Xu, Tan; Lin, Dongdong; Liang, Hui; Chen, Mei; Tong, Weijun; Mu, Yongping; Feng, Cindy Xin; Gao, Yongqing; Zheng, Yumei; Sun, Wenjie

    2014-08-11

    Although numerous studies have reported the association between birth season and sex ratio, few studies have been conducted in subtropical regions in a non-Western setting. The present study assessed the effects of pregnancy season on birth sex ratio in China. We conducted a national population-based retrospective study from 2006-2008 with 3175 child-parent pairs enrolled in the Northeast regions of China. Demographics and data relating to pregnancy and birth were collected and analyzed. A multiple logistic regression model was fitted to estimate the regression coefficient and 95% confidence interval (CI) of refractive error for mother pregnancy season, adjusting for potential confounders. After adjusting for parental age (cut-off point was 30 years), region, nationality, mother education level, and mother miscarriage history, there was a statistically significant effect of the mother's pregnancy season on birth sex. Compared with mothers who were pregnant in spring, those pregnant in summer or winter had a higher probability of delivering girls (p < 0.05). The birth-sex ratio varied with months. Our results suggested that mothers pregnant in summer and winter were more likely to deliver girls, compared with those pregnant in spring. Pregnancy season may play an important role in birth sex.

  14. Neutropenia is independently associated with sub-therapeutic serum concentration of vancomycin.

    PubMed

    Choi, Min Hyuk; Choe, Yeon Hwa; Lee, Sang-Guk; Jeong, Seok Hoon; Kim, Jeong-Ho

    2017-02-01

    We aimed to identify the impact of the presence of neutropenia on serum vancomycin concentration (SVC). A retrospective study was conducted from January 2005 to December 2015. The study population comprised adult patients whose serum vancomycin concentration was measured. Patients with renal failure or using non-conventional dosages of vancomycin were excluded. A total of 1307 adult patients were included in this study, of whom 163 (12.4%) were neutropenic. Patients with neutropenia presented significantly lower SVCs than non-neutropenic patients (P<0.0001). Multiple linear regression showed a significant association between neutropenia and trough SVC (beta coefficient, -2.351; P=0.004). Multiple logistic regression analysis also revealed a significant association between sub-therapeutic vancomycin concentrations (trough SVC values <10 mg/l) and neutropenia (odds ratio, 1.75; P=0.029). CONCLUSIONS: The presence of neutropenia is significantly associated with low SVC, even after adjusting for other variables. Therefore, neutropenic patients had a higher risk of sub-therapeutic SVC compared with non-neutropenic patients. We recommend that vancomycin therapy be monitored with TDM-guided optimization of dosage and intervals, especially in neutropenic patients. Copyright © 2016 Elsevier B.V. All rights reserved.

  15. Is a Baccalaureate in Nursing Worth It? The Return to Education, 2000–2008

    PubMed Central

    Spetz, Joanne; Bates, Timothy

    2013-01-01

    Objective. A registered nurse (RN) license can be obtained by completing a baccalaureate degree (BSN), an associate degree (AD), or a diploma program. The aim of this article is to examine the return to baccalaureate education from the perspective of the nurse. Data Sources. National Sample Survey of Registered Nurses, 2000, 2004, and 2008. Study Design. The effect of education on RN wages is estimated using multivariate regression, both for initial education and for completing a second degree. The coefficients are used to calculate lifetime expected earnings. Multinomial logistic regression is used to examine the relationship between education and job title. Principal Findings. Lifetime earnings for nurses whose initial education is the BSN are higher than those of AD nurses only if the AD program requires 3 years and the discount rate is 2 percent. For individuals who enter nursing with an AD, lifetime earnings are higher if they complete a BSN. The BSN is associated with higher likelihood of being an advanced practice registered nurse, having an academic title, and having a management title. Conclusions. Because baccalaureate education confers benefits both for RNs and their patients, policies to encourage the pursuit of BSN degrees need to be supported. PMID:24102422

  16. [Habitat suitability index of larval Japanese Halfbeak (Hyporhamphus sajori) in Bohai Sea based on geographically weighted regression].

    PubMed

    Zhao, Yang; Zhang, Xue Qing; Bian, Xiao Dong

    2018-01-01

    To investigate the early supplementary processes of fishery resources in the Bohai Sea, geographically weighted regression (GWR) was introduced to the habitat suitability index (HSI) model. The Bohai Sea larval Japanese Halfbeak HSI GWR model was established with four environmental variables: sea surface temperature (SST), sea surface salinity (SSS), water depth (DEP), and chlorophyll a concentration (Chl a). Results of the simulation showed that the four variables performed differently in August 2015. SST and Chl a were global variables and had little impact on HSI, with regression coefficients of -0.027 and 0.006, respectively. SSS and DEP were local variables and had a larger impact on HSI; the mean absolute values of their regression coefficients were 0.075 and 0.129, respectively. In the central Bohai Sea, SSS showed a negative correlation with HSI, and the most negative correlation coefficient was -0.3. In contrast, SSS was correlated positively but weakly with HSI in the three bays of Bohai Sea, and the largest correlation coefficient was 0.1. In particular, DEP and HSI were negatively correlated in the entire Bohai Sea, while they were more negatively correlated in the three bays of Bohai than in the central Bohai Sea, and the most negative correlation coefficient was -0.16 in the three bays. The Poisson regression coefficient of the HSI GWR model was 0.705, consistent with field measurements. Therefore, it could provide a new method for the research on fish habitats in the future.

  17. Predictors of postoperative outcomes of cubital tunnel syndrome treatments using multiple logistic regression analysis.

    PubMed

    Suzuki, Taku; Iwamoto, Takuji; Shizu, Kanae; Suzuki, Katsuji; Yamada, Harumoto; Sato, Kazuki

    2017-05-01

    This retrospective study was designed to investigate prognostic factors for postoperative outcomes for cubital tunnel syndrome (CubTS) using multiple logistic regression analysis with a large number of patients. Eighty-three patients with CubTS who underwent surgeries were enrolled. The following potential prognostic factors for disease severity were selected according to previous reports: sex, age, type of surgery, disease duration, body mass index, cervical lesion, presence of diabetes mellitus, Workers' Compensation status, preoperative severity, and preoperative electrodiagnostic testing. Postoperative severity of disease was assessed 2 years after surgery by Messina's criteria which is an outcome measure specifically for CubTS. Bivariate analysis was performed to select candidate prognostic factors for multiple linear regression analyses. Multiple logistic regression analysis was conducted to identify the association between postoperative severity and selected prognostic factors. Both bivariate and multiple linear regression analysis revealed only preoperative severity as an independent risk factor for poor prognosis, while other factors did not show any significant association. Although conflicting results exist regarding prognosis of CubTS, this study supports evidence from previous studies and concludes early surgical intervention portends the most favorable prognosis. Copyright © 2017 The Japanese Orthopaedic Association. Published by Elsevier B.V. All rights reserved.

  18. Agile Combat Support Doctrine and Logistics Officer Training: Do We Need an Integrated Logistics School for the Expeditionary Air and Space Force?

    DTIC Science & Technology

    2003-02-01

    Rank-Order Correlation Coefficients statistical analysis via SPSS 8.0. Interview informants’ perceptions and perspectives are combined with...logistics training in facilitating the employment of doctrinal tenets in a deployed environment. Statistical Correlations: Confirmed Relationships...integration of technology and cross-functional training for the tactical practitioners. Statistical Correlations: Confirmed Relationships on the Need

  19. [The relationship between depressive symptoms and family functioning in institutionalized elderly].

    PubMed

    de Oliveira, Simone Camargo; dos Santos, Ariene Angelini; Pavarini, Sofia Cristina Iost

    2014-02-01

    The present study aimed to investigate the relationship between family functioning and depressive symptoms among institutionalized elderly. This is a descriptive, cross-sectional study of quantitative character. A total of 107 institutionalized elderly were assessed using a sociodemographic questionnaire, the Geriatric Depression Scale (to track depressive symptoms) and the Family APGAR (to assess family functioning). Pearson's correlation coefficient, the chi-square test, and crude and adjusted logistic regression were used in the data analysis, with a significance level of 5%. The institutionalized elderly with depressive symptoms were predominantly women and in the age group of 80 years and older. Regarding family functioning, most elderly had high family dysfunctioning (57%). Family dysfunctioning was higher among the elderly with depressive symptoms. There was a significant correlation between family functioning and depressive symptoms. The conclusion is that institutionalized elderly with dysfunctional families are more likely to have depressive symptoms.

  20. Performance of the likelihood ratio difference (G2 Diff) test for detecting unidimensionality in applications of the multidimensional Rasch model.

    PubMed

    Harrell-Williams, Leigh; Wolfe, Edward W

    2014-01-01

    Previous research has investigated the influence of sample size, model misspecification, test length, ability distribution offset, and generating model on the likelihood ratio difference test in applications of item response models. This study extended that research to the evaluation of dimensionality using the multidimensional random coefficients multinomial logit model (MRCMLM). Logistic regression analysis of simulated data reveals that sample size and test length have a large effect on the capacity of the LR difference test to correctly identify unidimensionality, with shorter tests and smaller sample sizes leading to smaller Type I error rates. Higher levels of simulated misfit resulted in fewer incorrect decisions than data with no or little misfit. However, Type I error rates indicate that the likelihood ratio difference test is not suitable under any of the simulated conditions for evaluating dimensionality in applications of the MRCMLM.

  1. Modeling of urban growth using cellular automata (CA) optimized by Particle Swarm Optimization (PSO)

    NASA Astrophysics Data System (ADS)

    Khalilnia, M. H.; Ghaemirad, T.; Abbaspour, R. A.

    2013-09-01

    In this paper, two satellite images of Tehran, the capital city of Iran, which were taken by TM and ETM+ for years 1988 and 2010 are used as the base information layers to study the changes in urban patterns of this metropolis. The patterns of urban growth for the city of Tehran are extracted in a period of twelve years using cellular automata setting the logistic regression functions as transition functions. Furthermore, the weighting coefficients of parameters affecting the urban growth, i.e. distance from urban centers, distance from rural centers, distance from agricultural centers, and neighborhood effects were selected using PSO. In order to evaluate the results of the prediction, the percent correct match index is calculated. According to the results, by combining optimization techniques with cellular automata model, the urban growth patterns can be predicted with accuracy up to 75 %.

  2. Body mass index, waist circumference, and arterial hypertension in students.

    PubMed

    Guilherme, Flávio Ricardo; Molena-Fernandes, Carlos Alexandre; Guilherme, Vânia Renata; Fávero, Maria Teresa Martins; dos Reis, Eliane Josefa Barbosa; Rinaldi, Wilson

    2015-01-01

    To investigate which anthropometric measure is the best predictor of arterial hypertension among private school students. This was a cross-sectional study with 286 students between the ages of 10 and 14 from two private schools in the city of Paranavaí, Paraná, Brazil. The following variables were analyzed: body mass index, waist circumference and blood pressure. Statistical analysis was conducted with Pearson's partial correlation test and multivariate logistic regression, with p<0.05. Both anthropometric indicators displayed weak correlation with systolic and diastolic levels, with coefficients (r) ranging from 0.27 to 0.36 (p < 0.001). Multivariate analysis showed that the only anthropometric indicator associated with arterial hypertension was waist circumference (OR = 2.3; 95% CI: 1.1-4.5), regardless of age or gender. In this age group, waist circumference appeared to be a better predictor for arterial hypertension than body mass index.

  3. A Logistic Regression Analysis of Turkey's 15-Year-Olds' Scoring above the OECD Average on the PISA'09 Reading Assessment

    ERIC Educational Resources Information Center

    Kasapoglu, Koray

    2014-01-01

    This study aims to investigate which factors are associated with Turkey's 15-year-olds' scoring above the OECD average (493) on the PISA'09 reading assessment. Collected from a total of 4,996 15-year-old students from Turkey, data were analyzed by logistic regression analysis in order to model the data of students who were split into two: (1)…

  4. Upgrade Summer Severe Weather Tool

    NASA Technical Reports Server (NTRS)

    Watson, Leela

    2011-01-01

    The goal of this task was to upgrade the existing severe weather database by adding observations from the 2010 warm season, update the verification dataset with results from the 2010 warm season, use statistical logistic regression analysis on the database and develop a new forecast tool. The AMU analyzed 7 stability parameters that showed the possibility of providing guidance in forecasting severe weather, calculated verification statistics for the Total Threat Score (TTS), and calculated warm season verification statistics for the 2010 season. The AMU also performed statistical logistic regression analysis on the 22-year severe weather database. The results indicated that the logistic regression equation did not show an increase in skill over the previously developed TTS. The equation showed less accuracy than TTS at predicting severe weather, little ability to distinguish between severe and non-severe weather days, and worse standard categorical accuracy measures and skill scores than TTS.

  5. Estimating the Probability of Rare Events Occurring Using a Local Model Averaging.

    PubMed

    Chen, Jin-Hua; Chen, Chun-Shu; Huang, Meng-Fan; Lin, Hung-Chih

    2016-10-01

    In statistical applications, logistic regression is a popular method for analyzing binary data accompanied by explanatory variables. But when one of the two outcomes is rare, the estimation of model parameters has been shown to be severely biased and hence estimating the probability of rare events occurring based on a logistic regression model would be inaccurate. In this article, we focus on estimating the probability of rare events occurring based on logistic regression models. Instead of selecting a best model, we propose a local model averaging procedure based on a data perturbation technique applied to different information criteria to obtain different probability estimates of rare events occurring. Then an approximately unbiased estimator of Kullback-Leibler loss is used to choose the best one among them. We design complete simulations to show the effectiveness of our approach. For illustration, a necrotizing enterocolitis (NEC) data set is analyzed. © 2016 Society for Risk Analysis.

  6. Evaluating the perennial stream using logistic regression in central Taiwan

    NASA Astrophysics Data System (ADS)

    Ruljigaljig, T.; Cheng, Y. S.; Lin, H. I.; Lee, C. H.; Yu, T. T.

    2014-12-01

    This study produces a perennial stream head potential map based on a logistic regression method with a Geographic Information System (GIS). Perennial stream initiation locations, which indicate where groundwater meets the surface, were identified in the study area by field survey. The perennial stream potential map in central Taiwan was constructed using the relationship between perennial streams and their causative factors, such as catchment area, slope gradient, aspect, elevation, groundwater recharge and precipitation. Field surveys covered 272 streams in the study area. The area under the curve for the logistic regression method was calculated as 0.87. The results illustrate the importance of catchment area and groundwater recharge as key factors within the model. The results obtained from the model within the GIS were then used to produce a map of perennial streams and estimate the location of perennial stream heads.

  7. The use of logistic regression to enhance risk assessment and decision making by mental health administrators.

    PubMed

    Menditto, Anthony A; Linhorst, Donald M; Coleman, James C; Beck, Niels C

    2006-04-01

    Development of policies and procedures to contend with the risks presented by elopement, aggression, and suicidal behaviors are long-standing challenges for mental health administrators. Guidance in making such judgments can be obtained through the use of a multivariate statistical technique known as logistic regression. This procedure can be used to develop a predictive equation that is mathematically formulated to use the best combination of predictors, rather than considering just one factor at a time. This paper presents an overview of logistic regression and its utility in mental health administrative decision making. A case example of its application is presented using data on elopements from Missouri's long-term state psychiatric hospitals. Ultimately, the use of statistical prediction analyses tempered with differential qualitative weighting of classification errors can augment decision-making processes in a manner that provides guidance and flexibility while wrestling with the complex problem of risk assessment and decision making.
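
    The "predictive equation" mentioned in this abstract has a standard form: a weighted sum of predictors passed through the logistic function. The sketch below is only an illustration of that form. The function name, intercept, and coefficients are hypothetical and are not taken from the paper or its Missouri elopement data.

```python
import math


def predicted_probability(intercept, coefficients, predictors):
    """Return the logistic-regression probability for a single case."""
    # Linear predictor: intercept plus the weighted sum of predictor values.
    linear = intercept + sum(b * x for b, x in zip(coefficients, predictors))
    # The logistic function maps the linear predictor onto the interval (0, 1).
    return 1.0 / (1.0 + math.exp(-linear))


# Hypothetical values for two illustrative predictors (not from the paper).
example_intercept = -2.0
example_coefficients = [0.8, 0.5]
probability = predicted_probability(example_intercept, example_coefficients, [1.0, 2.0])
print(round(probability, 3))
```

    Because every predictor enters the same sum, the equation weighs all factors jointly rather than one at a time, which is the property the abstract emphasizes.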

  8. Reporting quality of multivariable logistic regression in selected Indian medical journals.

    PubMed

    Kumar, R; Indrayan, A; Chhabra, P

    2012-01-01

    Use of multivariable logistic regression (MLR) modeling has steeply increased in the medical literature over the past few years. Testing of model assumptions and adequate reporting of MLR allow the reader to interpret results more accurately. To review the fulfillment of assumptions and reporting quality of MLR in selected Indian medical journals using established criteria. Analysis of published literature. Medknow.com publishes 68 Indian medical journals with open access. Eight of these journals had at least five articles using MLR between the years 1994 to 2008. Articles from each of these journals were evaluated according to the previously established 10-point quality criteria for reporting and to test the MLR model assumptions. SPSS 17 software and non-parametric test (Kruskal-Wallis H, Mann Whitney U, Spearman Correlation). One hundred and nine articles were finally found using MLR for analyzing the data in the selected eight journals. The number of such articles gradually increased after year 2003, but quality score remained almost similar over time. P value, odds ratio, and 95% confidence interval for coefficients in MLR was reported in 75.2% and sufficient cases (>10) per covariate of limiting sample size were reported in the 58.7% of the articles. No article reported the test for conformity of linear gradient for continuous covariates. Total score was not significantly different across the journals. However, involvement of statistician or epidemiologist as a co-author improved the average quality score significantly (P=0.014). Reporting of MLR in many Indian journals is incomplete. Only one article managed to score 8 out of 10 among 109 articles under review. All others scored less. Appropriate guidelines in instructions to authors, and pre-publication review of articles using MLR by a qualified statistician may improve quality of reporting.

  9. Personality traits and coping styles explain anxiety in lung cancer patients to a greater extent than other factors.

    PubMed

    Shimizu, Ken; Nakaya, Naoki; Saito-Nakaya, Kumi; Akechi, Tatsuo; Ogawa, Asao; Fujisawa, Daisuke; Sone, Toshimasa; Yoshiuchi, Kazuhiro; Goto, Koichi; Iwasaki, Motoki; Tsugane, Shoichiro; Uchitomi, Yosuke

    2015-05-01

    Although various factors are thought to be correlated with anxiety in cancer patients, the relative importance of each factor was unknown. We tested our hypothesis that personality traits and coping styles explain anxiety in lung cancer patients to a greater extent than other factors. A total of 1334 consecutively recruited lung cancer patients were selected, and data on cancer-related variables, demographic characteristics, health behaviors, physical symptoms and psychological factors consisting of personality traits and coping styles were obtained. The participants were divided into groups with or without significant anxiety using the Hospital Anxiety and Depression Scale-Anxiety, and a binary logistic regression analysis was used to identify factors correlated with significant anxiety using a multivariate model. Among the recruited patients, 440 (33.0%) had significant anxiety. The binary logistic regression analysis revealed a coefficient of determination (overall R²) of 39.0%, and the explanation for psychological factors was much higher (30.7%) than those for cancer-related variables (1.1%), demographic characteristics (2.1%), health behaviors (0.8%) and physical symptoms (4.3%). Four specific factors remained significant in a multivariate model. A neurotic personality trait, a coping style of helplessness/hopelessness, and female sex were positively correlated with significant anxiety, while a coping style of fatalism was negatively correlated. Our hypothesis was supported, and anxiety was strongly linked with personality trait and coping style. As a clinical implication, the use of screening instruments to identify these factors and intervention for psychological crisis may be needed. © The Author 2015. Published by Oxford University Press. All rights reserved. For Permissions, please email: journals.permissions@oup.com.

  10. Body mass index and waist circumference are better predictors of insulin resistance than total body fat percentage in middle-aged and elderly Taiwanese.

    PubMed

    Cheng, Yiu-Hua; Tsao, Yu-Chung; Tzeng, I-Shiang; Chuang, Hai-Hua; Li, Wen-Cheng; Tung, Tao-Hsin; Chen, Jau-Yuan

    2017-09-01

    The incidence of diabetes mellitus is rising worldwide, and prediabetic screening for insulin resistance (IR) has become ever more essential. This study aimed to investigate whether body mass index (BMI), waist circumference (WC), or body fat percentage (BF%) could be a better predictor of IR in a middle-aged and elderly population. In this cross-sectional, community-based study, 394 individuals (97 with IR and 297 without IR) were enrolled in the analysis. IR was measured by homeostasis model assessment (HOMA-IR), and subjects with HOMA-IR value ≧75th percentile were defined as being IR. Associations between IR and BMI, WC and BF% were evaluated by t test, chi square, Pearson correlation, logistic regression, and receiver operating characteristic (ROC) curves. A total of 394 community-dwelling, middle-aged, and elderly persons were enrolled; 138 (35%) were male, and 256 were female (65%). The mean age was 64.41 ± 8.46 years. A significant association was identified between BMI, WC, BF%, and IR, with Pearson correlation coefficients of 0.437 (P < .001), 0.412 (P < .001), and 0.361 (P < .001), respectively. Multivariate logistic regression revealed BMI (OR = 1.31; 95% CI = 1.20-1.42), WC (OR = 1.13; 95% CI = 1.08-1.17), and BF% (OR = 1.17; 95% CI = 1.11-1.23) to be independent predictors of IR. The area under curves of BMI and WC, 0.749 and 0.745 respectively, are greater than that of BF% 0.687. BMI and WC were more strongly associated with IR than was BF%. Excess body weight and body fat distribution were more important than total body fat in predicting IR.

  11. Patient safety in out-of-hours primary care: a review of patient records.

    PubMed

    Smits, Marleen; Huibers, Linda; Kerssemeijer, Brian; de Feijter, Eimert; Wensing, Michel; Giesen, Paul

    2010-12-10

    Most patients receive healthcare in primary care settings, but relatively little is known about patient safety. Out-of-hours contacts are of particular importance to patient safety. Our aim was to examine the incidence, types, causes, and consequences of patient safety incidents at general practice cooperatives for out-of-hours primary care and to examine which factors were associated with the occurrence of patient safety incidents. A retrospective study of 1,145 medical records concerning patient contacts with four general practice cooperatives. Reviewers identified records with evidence of a potential patient safety incident; a physician panel determined whether a patient safety incident had indeed occurred. In addition, the panel determined the type, causes, and consequences of the incidents. Factors associated with incidents were examined in a random coefficient logistic regression analysis. In 1,145 patient records, 27 patient safety incidents were identified, an incident rate of 2.4% (95% CI: 1.5% to 3.2%). The most frequent incident type was treatment (56%). All incidents had at least partly been caused by failures in clinical reasoning. The majority of incidents did not result in patient harm (70%). Eight incidents had consequences for the patient, such as additional interventions or hospitalisation. The panel assessed that most incidents were unlikely to result in patient harm in the long term (89%). Logistic regression analysis showed that age was significantly related to incident occurrence: the likelihood of an incident increased by a factor of 1.03 for each one-year increase in age (95% CI: 1.01 to 1.04). Patient safety incidents occur in out-of-hours primary care, but most do not result in harm to patients. As clinical reasoning played an important part in these incidents, a better understanding of clinical reasoning and guideline adherence at GP cooperatives could contribute to patient safety.
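
    The two headline figures in this abstract can be checked with a few lines of arithmetic. The sketch below assumes a normal-approximation (Wald) interval for the incident rate and reads the reported 1.03 as a per-year odds ratio. Both are assumptions, since the abstract does not state the interval method or the exact effect measure.

```python
import math

incidents = 27
records = 1145

rate = incidents / records
# Wald (normal-approximation) standard error of a proportion; assumed method.
standard_error = math.sqrt(rate * (1 - rate) / records)
lower = rate - 1.96 * standard_error
upper = rate + 1.96 * standard_error
print(f"rate = {rate:.1%}, 95% CI {lower:.1%} to {upper:.1%}")

# If 1.03 is an odds ratio per year of age, effects multiply across years,
# so ten additional years correspond to 1.03 raised to the tenth power.
odds_ratio_per_year = 1.03
odds_ratio_per_decade = odds_ratio_per_year ** 10
print(f"odds ratio per decade = {odds_ratio_per_decade:.2f}")
```

    Under these assumptions the interval agrees with the reported 1.5% to 3.2%, and a small per-year ratio compounds to roughly a one-third increase in odds over a decade.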

  12. Predictors of chemoradiation related febrile neutropenia prophylaxis in older adults - Experience from a limited resource setting.

    PubMed

    Gangopadhyay, Aparna

    2018-01-01

    To identify risk factors that lower efficacy of antibiotic prophylaxis of febrile neutropenia among older patients on chemoradiation. Audit of institutional data showed that older adults are at higher risk of febrile neutropenia during chemoradiation. In limited resource settings widespread use of Granulocyte-Colony Stimulating Factor (G-CSF) is not economically feasible and antibiotics are used commonly. Despite compliance with antibiotics, prophylaxis is inadequate in many patients owing to patient and tumor related factors. Data from records of 219 older patients receiving antibiotic prophylaxis during chemoradiation were studied. Baseline assessment data and predisposing factors for febrile neutropenia were recorded. All patients received prophylactic fluoroquinolones. Incidence of febrile neutropenia and association with predisposing factors at baseline was analyzed by multiple logistic regression. 38.4% developed febrile neutropenia despite compliance. Multiple logistic regression revealed geriatric assessment (G8) score and tumor stage to be significant predictors of febrile neutropenia while on antibiotics (p < 0.0001). Odds ratios for two significant predictors G8 score and tumor stage, respectively, were 2.9 (95% CI 1.8036-4.6815) and 2.7 (95% CI 1.7501-4.1318). Correlation between these two significant predictors was found to be low in our cohort (Spearman's coefficient of rank correlation (rho) - 0.431, p < 0.0001). G8 score and tumor burden are significant predictors of efficacy of antibiotic prophylaxis among older adults receiving chemoradiation. In older patients having poor G8 scores and advanced tumors, antibiotic prophylaxis is unsuitable. Interestingly, co-morbidities and poor performance status did not impact efficacy of antibiotic prophylaxis among our elderly patients.

  13. Development of a Multicomponent Prediction Model for Acute Esophagitis in Lung Cancer Patients Receiving Chemoradiotherapy

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    De Ruyck, Kim, E-mail: kim.deruyck@UGent.be; Sabbe, Nick; Oberije, Cary

    2011-10-01

    Purpose: To construct a model for the prediction of acute esophagitis in lung cancer patients receiving chemoradiotherapy by combining clinical data, treatment parameters, and genotyping profile. Patients and Methods: Data were available for 273 lung cancer patients treated with curative chemoradiotherapy. Clinical data included gender, age, World Health Organization performance score, nicotine use, diabetes, chronic disease, tumor type, tumor stage, lymph node stage, tumor location, and medical center. Treatment parameters included chemotherapy, surgery, radiotherapy technique, tumor dose, mean fractionation size, mean and maximal esophageal dose, and overall treatment time. A total of 332 genetic polymorphisms were considered in 112 candidate genes. The predicting model was achieved by lasso logistic regression for predictor selection, followed by classic logistic regression for unbiased estimation of the coefficients. Performance of the model was expressed as the area under the curve of the receiver operating characteristic and as the false-negative rate in the optimal point on the receiver operating characteristic curve. Results: A total of 110 patients (40%) developed acute esophagitis Grade ≥2 (Common Terminology Criteria for Adverse Events v3.0). The final model contained chemotherapy treatment, lymph node stage, mean esophageal dose, gender, overall treatment time, radiotherapy technique, rs2302535 (EGFR), rs16930129 (ENG), rs1131877 (TRAF3), and rs2230528 (ITGB2). The area under the curve was 0.87, and the false-negative rate was 16%. Conclusion: Prediction of acute esophagitis can be improved by combining clinical, treatment, and genetic factors. A multicomponent prediction model for acute esophagitis with a sensitivity of 84% was constructed with two clinical parameters, four treatment parameters, and four genetic polymorphisms.

  14. Clinical utility of calf front hoof circumference and maternal intrapelvic area in predicting dystocia in 103 late gestation Holstein-Friesian heifers and cows.

    PubMed

    Hiew, Mark W H; Megahed, Ameer A; Townsend, Jonathan R; Singleton, Wayne L; Constable, Peter D

    2016-02-01

    The objective of this study was to determine the clinical utility of measuring calf front hoof circumference, maternal intrapelvic area, and selected morphometric values in predicting dystocia in dairy cattle. An observational study using a convenience sample of 103 late-gestation Holstein-Friesian heifers and cows was performed. Intrapelvic height and width of the dam were measured using a pelvimeter, and the intrapelvic area was calculated. Calf front hoof circumference and birth weight were also measured. Data were analyzed using Spearman's correlation coefficient (rs), Mann-Whitney U test, and binary or ordered logistic regression; P < 0.05 was significant. The calving difficulty score (1-5) was greater in heifers (median, 3.0) than in cows (median, 1.0). Median intrapelvic area immediately before parturition was smaller in heifers (268 cm²) than in cows (332 cm²), whereas front hoof circumference and birth weight of the calf were similar in both groups. The calving difficulty score was positively associated with calf birth weight in heifers (rs = 0.39) and cows (rs = 0.24). Binary logistic regression using both dam and calf data indicated that the ratio of front hoof circumference of the calf to the maternal intrapelvic area provided the best predictor of dystocia (calving difficulty score = 4 or 5), with sensitivity = 0.50 and specificity = 0.93 at the optimal cutpoint for the ratio (>0.068 cm/cm²). Determining the ratio of calf front hoof circumference to maternal intrapelvic area has clinical utility in predicting the calving difficulty score in Holstein-Friesian cattle. Copyright © 2016 Elsevier Inc. All rights reserved.
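
    Sensitivity and specificity at a cutpoint, as reported in this abstract, come from a simple count of correct and incorrect calls. The sketch below shows the calculation for the rule "ratio > 0.068 cm/cm² predicts dystocia". The ratios and outcomes are made-up illustrative values, not study data, and the function name is hypothetical.

```python
def sensitivity_specificity(ratios, had_dystocia, cutpoint):
    """Classify each animal as predicted-positive when ratio > cutpoint."""
    true_pos = false_pos = true_neg = false_neg = 0
    for ratio, positive in zip(ratios, had_dystocia):
        predicted = ratio > cutpoint
        if predicted and positive:
            true_pos += 1
        elif predicted and not positive:
            false_pos += 1
        elif not predicted and positive:
            false_neg += 1
        else:
            true_neg += 1
    # Sensitivity: share of true dystocia cases flagged by the rule.
    sensitivity = true_pos / (true_pos + false_neg)
    # Specificity: share of non-dystocia cases correctly left unflagged.
    specificity = true_neg / (true_neg + false_pos)
    return sensitivity, specificity


# Hypothetical hoof-circumference / intrapelvic-area ratios (cm/cm^2).
example_ratios = [0.060, 0.072, 0.065, 0.070, 0.055, 0.069, 0.058, 0.075]
example_outcomes = [False, True, True, True, False, False, False, True]
sens, spec = sensitivity_specificity(example_ratios, example_outcomes, 0.068)
print(sens, spec)
```

    Raising the cutpoint generally trades sensitivity for specificity, which is why a single "optimal" cutpoint can pair a modest sensitivity with a high specificity.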

  15. A score to estimate the likelihood of detecting advanced colorectal neoplasia at colonoscopy

    PubMed Central

    Kaminski, Michal F; Polkowski, Marcin; Kraszewska, Ewa; Rupinski, Maciej; Butruk, Eugeniusz; Regula, Jaroslaw

    2014-01-01

    Objective This study aimed to develop and validate a model to estimate the likelihood of detecting advanced colorectal neoplasia in Caucasian patients. Design We performed a cross-sectional analysis of database records for 40-year-old to 66-year-old patients who entered a national primary colonoscopy-based screening programme for colorectal cancer in 73 centres in Poland in the year 2007. We used multivariate logistic regression to investigate the associations between clinical variables and the presence of advanced neoplasia in a randomly selected test set, and confirmed the associations in a validation set. We used model coefficients to develop a risk score for detection of advanced colorectal neoplasia. Results Advanced colorectal neoplasia was detected in 2544 of the 35 918 included participants (7.1%). In the test set, a logistic-regression model showed that independent risk factors for advanced colorectal neoplasia were: age, sex, family history of colorectal cancer, cigarette smoking (p<0.001 for these four factors), and Body Mass Index (p=0.033). In the validation set, the model was well calibrated (ratio of expected to observed risk of advanced neoplasia: 1.00 (95% CI 0.95 to 1.06)) and had moderate discriminatory power (c-statistic 0.62). We developed a score that estimated the likelihood of detecting advanced neoplasia in the validation set, from 1.32% for patients scoring 0, to 19.12% for patients scoring 7–8. Conclusions Developed and internally validated score consisting of simple clinical factors successfully estimates the likelihood of detecting advanced colorectal neoplasia in asymptomatic Caucasian patients. Once externally validated, it may be useful for counselling or designing primary prevention studies. PMID:24385598
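
    The c-statistic quoted in this abstract measures discrimination: the share of (case, non-case) pairs in which the case receives the higher score, with ties counted as half. The sketch below computes it by brute force over all pairs. The scores and outcomes are invented for illustration and do not come from the screening programme.

```python
def c_statistic(scores, outcomes):
    """Concordance between risk scores and binary outcomes (ties count 0.5)."""
    cases = [s for s, y in zip(scores, outcomes) if y]
    non_cases = [s for s, y in zip(scores, outcomes) if not y]
    concordant = 0.0
    for case_score in cases:
        for other_score in non_cases:
            if case_score > other_score:
                concordant += 1.0
            elif case_score == other_score:
                concordant += 0.5
    # Divide by the number of (case, non-case) pairs compared.
    return concordant / (len(cases) * len(non_cases))


# Hypothetical integer risk scores and outcomes (1 = advanced neoplasia found).
example_scores = [0, 1, 2, 3, 3, 5, 7]
example_outcomes = [0, 0, 1, 0, 0, 1, 1]
print(round(c_statistic(example_scores, example_outcomes), 3))
```

    A value of 0.5 means the score ranks cases no better than chance and 1.0 means perfect ranking, so the reported 0.62 sits toward the lower end of that range.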

  16. A score to estimate the likelihood of detecting advanced colorectal neoplasia at colonoscopy.

    PubMed

    Kaminski, Michal F; Polkowski, Marcin; Kraszewska, Ewa; Rupinski, Maciej; Butruk, Eugeniusz; Regula, Jaroslaw

    2014-07-01

    This study aimed to develop and validate a model to estimate the likelihood of detecting advanced colorectal neoplasia in Caucasian patients. We performed a cross-sectional analysis of database records for 40-year-old to 66-year-old patients who entered a national primary colonoscopy-based screening programme for colorectal cancer in 73 centres in Poland in the year 2007. We used multivariate logistic regression to investigate the associations between clinical variables and the presence of advanced neoplasia in a randomly selected test set, and confirmed the associations in a validation set. We used model coefficients to develop a risk score for detection of advanced colorectal neoplasia. Advanced colorectal neoplasia was detected in 2544 of the 35,918 included participants (7.1%). In the test set, a logistic-regression model showed that independent risk factors for advanced colorectal neoplasia were: age, sex, family history of colorectal cancer, cigarette smoking (p<0.001 for these four factors), and Body Mass Index (p=0.033). In the validation set, the model was well calibrated (ratio of expected to observed risk of advanced neoplasia: 1.00 (95% CI 0.95 to 1.06)) and had moderate discriminatory power (c-statistic 0.62). We developed a score that estimated the likelihood of detecting advanced neoplasia in the validation set, from 1.32% for patients scoring 0, to 19.12% for patients scoring 7-8. Developed and internally validated score consisting of simple clinical factors successfully estimates the likelihood of detecting advanced colorectal neoplasia in asymptomatic Caucasian patients. Once externally validated, it may be useful for counselling or designing primary prevention studies. Published by the BMJ Publishing Group Limited. For permission to use (where not already granted under a licence) please go to http://group.bmj.com/group/rights-licensing/permissions.

  17. Inequality in the hepatitis B awareness level in rural residents from 7 provinces in China

    PubMed Central

    Zheng, Juan; Li, Quan; Wang, Jian; Zhang, Guojie; Wangen, Knut R.

    2017-01-01

    The hepatitis B (HB) awareness level is an important factor affecting the rates of HB virus vaccination. To better understand income-related inequalities in the HB awareness level, it is imperative to identify the sources of inequalities and assess the contribution rates of these influential factors. This study analyzed the unequal distribution of the HB awareness level and the contributions of various influential factors. We performed a cross-sectional household survey with questionnaire-based, face-to-face interviews in 7 Chinese provinces. Responses from 7271 respondents were used in this analysis. Multinomial logistic regression was used for the analysis of contributing factors, and the concentration index was used as a measure of HB awareness inequalities. The HB awareness level varied across participants with different characteristics. Multinomial logistic regression of the explanatory factors of the HB awareness level showed that several estimated coefficients and relative risk ratios were statistically significant for middle- and high-level awareness, except for sex, occupation, and household income. The concentration index of the HB knowledge score was 0.140, indicating inequality gradients disadvantageous to the poor. The contribution rate of socioeconomic factors was the largest (60.8%), followed by demographic characteristics (29.0%) and geographic factors (4.3%). Demographic, socioeconomic, and geographic factors are associated with the HB awareness inequality. Therefore, to reduce inequality, HB-related health education targeting individuals with low socioeconomic status should be performed. Less-developed provinces, especially with high proportions of poor residents, warrant particular attention. Our findings may be beneficial to improve the HB virus vaccination rate for individuals with low socioeconomic status. PMID:28277091
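
    A concentration index such as the 0.140 reported here is commonly computed with the covariance formula C = 2 cov(h, r) / mean(h), where h is the outcome (here, a knowledge score) and r is each respondent's fractional rank in the income distribution. The Python sketch below assumes that formula and unweighted data; the abstract does not state which estimator the authors used, and the function name and scores are hypothetical.

```python
def concentration_index(outcomes_by_income):
    """Concentration index for outcomes ordered from poorest to richest.

    Uses C = 2 * cov(h, r) / mean(h) with fractional ranks r_i = (i - 0.5) / n.
    """
    n = len(outcomes_by_income)
    mean_outcome = sum(outcomes_by_income) / n
    ranks = [(i + 0.5) / n for i in range(n)]  # i is zero-based here
    mean_rank = sum(ranks) / n
    covariance = sum(
        (h - mean_outcome) * (r - mean_rank)
        for h, r in zip(outcomes_by_income, ranks)
    ) / n
    return 2.0 * covariance / mean_outcome


# Hypothetical knowledge scores, poorest respondent first.
scores = [1, 2, 3, 4]
print(concentration_index(scores))
```

    A positive value means the outcome is concentrated among richer respondents, which matches the abstract's reading of 0.140 as a gradient disadvantageous to the poor.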

  18. Prevalence of kidney stones and associated risk factors in the Shunyi District of Beijing, China.

    PubMed

    Jiang, Y G; He, L H; Luo, G T; Zhang, X D

    2017-10-01

    Kidney stone formation is a multifactorial condition that involves interaction of environmental and genetic factors. Presence of kidney stones is strongly related to other diseases, which may result in a heavy economic and social burden. Clinical data on the prevalence and influencing factors in kidney stone disease in the north of China are scarce. In this study, we explored the prevalence of kidney stone and potentially associated risk factors in the Shunyi District of Beijing, China. A population-based cross-sectional study was conducted from December 2011 to November 2012 in a northern area of China. Participants were interviewed in randomly selected towns. Univariate analysis of continuous and categorical variables was first performed by calculation of Spearman's correlation coefficient and Pearson Chi squared value, respectively. Variables with statistical significance were further analysed by multivariate logistic regression to explore the potential influencing factors. A total of 3350 participants (1091 males and 2259 females) completed the survey and the response rate was 99.67%. Among the participants, 3.61% were diagnosed with kidney stone. Univariate analysis showed that significant differences were evident in 31 variables. Blood and urine tests were performed in 100 randomly selected patients with kidney stone and 100 healthy controls. Serum creatinine, calcium, and uric acid were significantly different between the patients with kidney stone and healthy controls. Multivariate logistic regression revealed that being male (odds ratio=102.681; 95% confidence interval, 1.062-9925.797), daily intake of white spirits (6.331; 1.204-33.282), and a history of urolithiasis (1797.775; 24.228-133 396.982) were factors potentially associated with kidney stone prevalence. Male gender, drinking white spirits, and a history of urolithiasis are potentially associated with kidney stone formation.

  19. Interaction among general practitioners age and patient load in the prediction of job strain, decision latitude and perception of job demands. A cross-sectional study.

    PubMed

    Vanagas, Giedrius; Bihari-Axelsson, Susanna

    2004-12-07

    It is widely recognized and accepted that job strain adversely impacts the workforce. Individual responses to stressful situations can vary greatly and it has been shown that certain people are more likely to experience high levels of stress in their job than others. Studies have highlighted that there can be age differences in job strain perception. Cross-sectional postal survey of 300 Lithuanian general practitioners. Psychosocial stress was investigated with a questionnaire based on the Reeder scale. Job demands were investigated with the Karasek scale. The analysis included descriptive statistics and logistic regression beta coefficients to identify predictors and interactions between characteristics and predictors. The response rate was 66% (N = 197). Logistic regression identified the following significant predictors: for job strain, duration of work in primary care; for job demands, age and duration of work in primary care; for decision latitude, age and patient load. The interactions with regard to job strain showed that GP age and job strain are negatively associated at a low patient load. Lower decision latitude among older GPs is strongly related to higher patient load. Job demands and GP age are slightly positively related at low patient load. Lithuanian GPs have a high patient load and are at risk of stress; they have high job demands and low decision latitude. Older GPs perceive less strain, lower job demands and higher decision latitude when patient load is low. Young GPs' decision latitude has a weak association with patient load. Younger GPs are more sensitive to changes in patient load, perceiving them as changes in job demands.

  20. Landslide susceptibility mapping using decision-tree based CHi-squared automatic interaction detection (CHAID) and Logistic regression (LR) integration

    NASA Astrophysics Data System (ADS)

    Althuwaynee, Omar F.; Pradhan, Biswajeet; Ahmad, Noordin

    2014-06-01

    This article uses methodology based on chi-squared automatic interaction detection (CHAID), as a multivariate method that has an automatic classification capacity to analyse large numbers of landslide conditioning factors. This new algorithm was developed to overcome the subjectivity of the manual categorization of scale data of landslide conditioning factors, and to predict rainfall-induced susceptibility map in Kuala Lumpur city and surrounding areas using geographic information system (GIS). The main objective of this article is to use CHi-squared automatic interaction detection (CHAID) method to perform the best classification fit for each conditioning factor, then, combining it with logistic regression (LR). LR model was used to find the corresponding coefficients of best fitting function that assess the optimal terminal nodes. A cluster pattern of landslide locations was extracted in previous study using nearest neighbor index (NNI), which were then used to identify the clustered landslide locations range. Clustered locations were used as model training data with 14 landslide conditioning factors such as; topographic derived parameters, lithology, NDVI, land use and land cover maps. Pearson chi-squared value was used to find the best classification fit between the dependent variable and conditioning factors. Finally the relationship between conditioning factors were assessed and the landslide susceptibility map (LSM) was produced. An area under the curve (AUC) was used to test the model reliability and prediction capability with the training and validation landslide locations respectively. This study proved the efficiency and reliability of decision tree (DT) model in landslide susceptibility mapping. Also it provided a valuable scientific basis for spatial decision making in planning and urban management studies.

  1. Utilisation of maternal health care in western rural China under a new rural health insurance system (New Co-operative Medical System).

    PubMed

    Long, Qian; Zhang, Tuohong; Xu, Ling; Tang, Shenglan; Hemminki, Elina

    2010-10-01

    To investigate factors influencing maternal health care utilisation in western rural China and its relation to income before (2002) and after (2007) introducing a new rural health insurance system (NCMS). Data from cross-sectional household-based health surveys carried out in ten western rural provinces of China in 2003 and 2008 were used in the study. The study population comprised women giving birth in 2002 or 2007, with 917 and 809 births, respectively. Correlations between outcomes and explanatory variables were studied by logistic regression models and a log-linear model. Between 2002 and 2007, the proportion of women with no pre-natal visit decreased from 25% to 12% (difference 13%, 95% CI 10-17%); facility-based delivery increased from 45% to 80% (difference 35%, 95% CI 29-37%); and differences in using pre-natal and delivery care between the income groups narrowed. In a logistic regression analysis, women with lower education, from minority groups, or high parity were less likely to use pre-natal and delivery care in 2007. The expenditure for facility-based delivery increased over the period, but the out-of-pocket expenditure for delivery as a percentage of the annual household income decreased. In 2007, it was 14% in the low-income group. NCMS participation was associated with lower out-of-pocket expenditure for facility-based delivery (coefficient -1.14, P < 0.05) in 2007. Facility-based delivery greatly increased between 2002 and 2007, coinciding with the introduction of the NCMS. The rural poor were still facing substantial payment for facility-based delivery, although NCMS participation reduced the out-of-pocket expenditure on average. © 2010 Blackwell Publishing Ltd.

  2. Additional value of anaerobic threshold in a general mortality prediction model in a urban patient cohort with Chagas cardiomyopathy.

    PubMed

    Silva, Roberto Ribeiro da; Reis, Michel Silva; Pereira, Basílio de Bragança; Nascimento, Emilia Matos do; Pedrosa, Roberto Coury

    2017-12-01

    Anaerobic threshold (AT) is recognized as an objective and direct measurement that reflects variations in the metabolism of skeletal muscles during exercise. Its prognostic value in heart diseases of non-chagasic etiology is well established. The risk of death in Chagas heart disease, in turn, is relatively well assessed by the Rassi score, but the added value that AT can bring to the Rassi score has not yet been studied. To assess whether AT presents an additional effect to the Rassi score in patients with chronic Chagas' heart disease. Prospective dynamic cohort study based on a review of 150 patient medical records. Forty-five medical records of patients who underwent cardiopulmonary exercise testing between 1996 and 1997 and were followed until September 2015 were selected for the cohort. Associations between the studied variables were analysed using a logistic regression model. The suitability of the models was verified using ROC curves and the coefficient of determination R². Eight patients (17.78%) died by September 2015, 7 of them (87.5%) from cardiovascular causes, of which 4 (57.14%) were considered at high risk by the Rassi score. With the Rassi score as the independent variable and death as the outcome, we obtained an area under the curve (AUC) of 0.711, with R²=0.214. With AT as the independent variable, we found AUC=0.706, with R²=0.078. With both the Rassi score and AT as independent variables, we obtained AUC=0.800 and R²=0.263. When AT is included in the logistic regression, it increases the explanation (R²) of the death estimation by 5%. Copyright © 2017 Sociedade Portuguesa de Cardiologia. Publicado por Elsevier España, S.L.U. All rights reserved.

  3. Empirical study of seven data mining algorithms on different characteristics of datasets for biomedical classification applications.

    PubMed

    Zhang, Yiyan; Xin, Yi; Li, Qin; Ma, Jianshe; Li, Shuai; Lv, Xiaodan; Lv, Weiqi

    2017-11-02

    Various kinds of data mining algorithms are continuously proposed with the development of related disciplines. The applicable scopes and performances of these algorithms differ. Hence, finding a suitable algorithm for a dataset is becoming an important emphasis for biomedical researchers to solve practical problems promptly. In this paper, seven kinds of sophisticated active algorithms, namely, C4.5, support vector machine, AdaBoost, k-nearest neighbor, naïve Bayes, random forest, and logistic regression, were selected as the research objects. The seven algorithms were applied to the 12 top-click UCI public datasets with the task of classification, and their performances were compared through induction and analysis. The sample size, number of attributes, number of missing values, sample size of each class, correlation coefficients between variables, class entropy of the task variable, and the ratio of the sample size of the largest class to that of the smallest class were calculated to characterize the 12 research datasets. The two ensemble algorithms reach high accuracy of classification on most datasets. Moreover, random forest performs better than AdaBoost on the unbalanced dataset of the multi-class task. Simple algorithms, such as the naïve Bayes and logistic regression model, are suitable for a small dataset with high correlation between the task and other non-task attribute variables. K-nearest neighbor and C4.5 decision tree algorithms perform well on binary- and multi-class task datasets. Support vector machine is more adept on the balanced small dataset of the binary-class task. No algorithm can maintain the best performance in all datasets. The applicability of the seven data mining algorithms on the datasets with different characteristics was summarized to provide a reference for biomedical researchers or beginners in different fields.

  4. Beyond the hype: deep neural networks outperform established methods using a ChEMBL bioactivity benchmark set.

    PubMed

    Lenselink, Eelke B; Ten Dijke, Niels; Bongers, Brandon; Papadatos, George; van Vlijmen, Herman W T; Kowalczyk, Wojtek; IJzerman, Adriaan P; van Westen, Gerard J P

    2017-08-14

    The increase of publicly available bioactivity data in recent years has fueled and catalyzed research in chemogenomics, data mining, and modeling approaches. As a direct result, over the past few years a multitude of different methods have been reported and evaluated, such as target fishing, nearest neighbor similarity-based methods, and Quantitative Structure Activity Relationship (QSAR)-based protocols. However, such studies are typically conducted on different datasets, using different validation strategies, and different metrics. In this study, different methods were compared using one single standardized dataset obtained from ChEMBL, which is made available to the public, using standardized metrics (BEDROC and Matthews Correlation Coefficient). Specifically, the performance of Naïve Bayes, Random Forests, Support Vector Machines, Logistic Regression, and Deep Neural Networks was assessed using QSAR and proteochemometric (PCM) methods. All methods were validated using both a random split validation and a temporal validation, with the latter being a more realistic benchmark of expected prospective execution. Deep Neural Networks are the top performing classifiers, highlighting the added value of Deep Neural Networks over other more conventional methods. Moreover, the best method ('DNN_PCM') performed significantly better at almost one standard deviation higher than the mean performance. Furthermore, Multi-task and PCM implementations were shown to improve performance over single task Deep Neural Networks. Conversely, target prediction performed almost two standard deviations under the mean performance. Random Forests, Support Vector Machines, and Logistic Regression performed around mean performance. Finally, using an ensemble of DNNs, alongside additional tuning, enhanced the relative performance by another 27% (compared with unoptimized 'DNN_PCM'). 
    Here, a standardized set to test and evaluate different machine learning algorithms in the context of multi-task learning is offered by providing the data and the protocols.
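
    One of the two metrics named in this abstract, the Matthews Correlation Coefficient (MCC), is computed from the four cells of a binary confusion matrix. The Python sketch below shows the standard formula; the function name and the counts are hypothetical and are not taken from the benchmark, and returning 0.0 for an empty margin is a common convention assumed here.

```python
import math


def matthews_corrcoef(tp, tn, fp, fn):
    """MCC = (TP*TN - FP*FN) / sqrt((TP+FP)(TP+FN)(TN+FP)(TN+FN))."""
    denominator = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denominator == 0:
        # Convention assumed here: undefined MCC is reported as 0.0.
        return 0.0
    return (tp * tn - fp * fn) / denominator


# Hypothetical confusion-matrix counts, for illustration only.
print(round(matthews_corrcoef(tp=90, tn=80, fp=20, fn=10), 4))
```

    MCC ranges from -1 to 1 and, unlike plain accuracy, stays informative when active and inactive compounds are heavily imbalanced.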

  5. Metabolic phenotyping of urine for discriminating alcohol-dependent from social drinkers and alcohol-naive subjects.

    PubMed

    Mostafa, Hamza; Amin, Arwa M; Teh, Chin-Hoe; Murugaiyah, Vikneswaran; Arif, Nor Hayati; Ibrahim, Baharudin

    2016-12-01

    Alcohol-dependence (AD) is a ravaging public health and social problem. AD diagnosis depends on questionnaires and some biomarkers, which, however, lack specificity and sensitivity, often leading to a less precise diagnosis and delayed treatment. This represents a great burden, not only on AD individuals but also on their families. Metabolomics using nuclear magnetic resonance spectroscopy (NMR) can provide novel techniques for the identification of novel biomarkers of AD. These putative biomarkers can facilitate early diagnosis of AD. To identify novel biomarkers able to discriminate between alcohol-dependent, non-AD alcohol drinkers and controls using metabolomics. Urine samples were collected from 30 alcohol-dependent persons who had not yet started AD treatment, 54 social drinkers and 60 controls, and were then analysed using NMR. Data analysis was done using multivariate analysis including principal component analysis (PCA) and orthogonal partial least squares-discriminant analysis (OPLS-DA), followed by univariate and multivariate logistic regression to develop the discriminatory model. Reproducibility was assessed using the intraclass correlation coefficient (ICC). The OPLS-DA revealed significant discrimination between AD and other groups with sensitivity 86.21%, specificity 97.25% and accuracy 94.93%. Six biomarkers were significantly associated with AD in the multivariate logistic regression model. These biomarkers were cis-aconitic acid, citric acid, alanine, lactic acid, 1,2-propanediol and 2-hydroxyisovaleric acid. The reproducibility of all biomarkers was excellent (0.81-1.0). This study revealed that metabolomics analysis of urine using NMR identified novel AD biomarkers which can discriminate AD from social drinkers and controls with high accuracy. Copyright © 2016 Elsevier Ireland Ltd. All rights reserved.

  6. Relationships between stressful life events and impaired fasting glucose among left-behind farmers in rural China.

    PubMed

    Liang, Han; Cheng, Jing; Shen, Xingrong; Chen, Penglai; Tong, Guixian; Chai, Jing; Li, Kaichun; Xie, Shaoyu; Shi, Yong; Wang, Debin; Sun, Yehuan

    2015-02-01

    This study aims at examining the effects of stressful life events on risk of impaired fasting glucose among left-behind farmers in rural China. The study collected data about stressful life events, family history of diabetes, lifestyle, demographics and minimum anthropometrics from left-behind farmers aged 40-70 years. A calculated life event index was applied to assess the combined effects of stressful life events experienced by the left-behind farmers, and its association with impaired fasting glucose was estimated using binary logistic regression models. The prevalence of abnormal fasting glucose was 61.4% by American Diabetes Association (ADA) standard and 32.4% by World Health Organization (WHO) standard. Binary logistic regression analysis revealed a coefficient of 0.033 (P<.001) by ADA standard or 0.028 (P<.001) by WHO standard between impaired fasting glucose and life event index. The overall odds ratios of impaired glucose for the second, third and fourth (highest) versus the first (lowest) quartile of life event index were 1.419 [95% CI=(1.173, 1.717)], 1.711 [95% CI=(1.413, 2.071)] and 1.957 [95% CI=(1.606, 2.385)] respectively by ADA standard. When more and more confounding factors were controlled for, these odds ratios remained statistically significant though decreased to a small extent. The left-behind farmers showed a prevalence of pre-diabetes more than twice the national average, and their risk of impaired fasting glucose was positively associated with stressful life events in a dose-dependent way. Both the population studied and their life events merit special attention. Copyright © 2014 Elsevier Inc. All rights reserved.
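
    A logistic regression coefficient is a change in log-odds, so exponentiating it gives an odds ratio. The Python sketch below applies this to the two coefficients reported in this abstract, assuming each is expressed per one-unit increase in the life event index (the abstract does not state the unit); the function name `odds_ratio` and the 10-unit example are illustrative assumptions.

```python
import math


def odds_ratio(coefficient, increase=1.0):
    """Odds ratio implied by a logistic coefficient for a given increase in the predictor."""
    return math.exp(coefficient * increase)


# Coefficients reported in the abstract (ADA and WHO standards).
for label, beta in [("ADA", 0.033), ("WHO", 0.028)]:
    per_unit = odds_ratio(beta)
    per_ten = odds_ratio(beta, increase=10)  # hypothetical 10-unit increase
    print(f"{label}: OR per unit = {per_unit:.4f}, OR per 10 units = {per_ten:.3f}")
```

    Because the effect is multiplicative on the odds, a small per-unit coefficient can still imply a sizeable odds ratio across the full range of the index.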

  7. An application in identifying high-risk populations in alternative tobacco product use utilizing logistic regression and CART: a heuristic comparison.

    PubMed

    Lei, Yang; Nollen, Nikki; Ahluwahlia, Jasjit S; Yu, Qing; Mayo, Matthew S

    2015-04-09

    Other forms of tobacco use are increasing in prevalence, yet most tobacco control efforts are aimed at cigarettes. In light of this, it is important to identify individuals who are using both cigarettes and alternative tobacco products (ATPs). Most previous studies have used regression models. We conducted a traditional logistic regression model and a classification and regression tree (CART) model to illustrate and discuss the added advantages of using CART in the setting of identifying high-risk subgroups of ATP users among cigarette smokers. The data were collected from an online cross-sectional survey administered by Survey Sampling International between July 5, 2012 and August 15, 2012. Eligible participants self-identified as current smokers, African American, White, or Latino (of any race), were English-speaking, and were at least 25 years old. The study sample included 2,376 participants and was divided into independent training and validation samples for a hold out validation. Logistic regression and CART models were used to examine the important predictors of cigarettes + ATP users. The logistic regression model identified nine important factors: gender, age, race, nicotine dependence, buying cigarettes or borrowing, whether the price of cigarettes influences the brand purchased, whether the participants set limits on cigarettes per day, alcohol use scores, and discrimination frequencies. The C-index of the logistic regression model was 0.74, indicating good discriminatory capability. The model performed well in the validation cohort also with good discrimination (c-index = 0.73) and excellent calibration (R-square = 0.96 in the calibration regression). The parsimonious CART model identified gender, age, alcohol use score, race, and discrimination frequencies to be the most important factors. It also revealed interesting partial interactions. The c-index is 0.70 for the training sample and 0.69 for the validation sample. The misclassification rate was 0.342 for the training sample and 0.346 for the validation sample. The CART model was easier to interpret and discovered target populations that possess clinical significance. This study suggests that the non-parametric CART model is parsimonious, potentially easier to interpret, and provides additional information in identifying the subgroups at high risk of ATP use among cigarette smokers.

  8. A method for assigning species into groups based on generalized Mahalanobis distance between habitat model coefficients

    USGS Publications Warehouse

    Williams, C.J.; Heglund, P.J.

    2009-01-01

    Habitat association models are commonly developed for individual animal species using generalized linear modeling methods such as logistic regression. We considered the issue of grouping species based on their habitat use so that management decisions can be based on sets of species rather than individual species. This research was motivated by a study of western landbirds in northern Idaho forests. The method we examined was to separately fit models to each species and to use a generalized Mahalanobis distance between coefficient vectors to create a distance matrix among species. Clustering methods were used to group species from the distance matrix, and multidimensional scaling methods were used to visualize the relations among species groups. Methods were also discussed for evaluating the sensitivity of the conclusions because of outliers or influential data points. We illustrate these methods with data from the landbird study conducted in northern Idaho. Simulation results are presented to compare the success of this method to alternative methods using Euclidean distance between coefficient vectors and to methods that do not use habitat association models. These simulations demonstrate that our Mahalanobis-distance-based method was nearly always better than Euclidean-distance-based methods or methods not based on habitat association models. The methods used to develop candidate species groups are easily explained to other scientists and resource managers since they mainly rely on classical multivariate statistical methods. © 2008 Springer Science+Business Media, LLC.

  9. EPIBLASTER-fast exhaustive two-locus epistasis detection strategy using graphical processing units

    PubMed Central

    Kam-Thong, Tony; Czamara, Darina; Tsuda, Koji; Borgwardt, Karsten; Lewis, Cathryn M; Erhardt-Lehmann, Angelika; Hemmer, Bernhard; Rieckmann, Peter; Daake, Markus; Weber, Frank; Wolf, Christiane; Ziegler, Andreas; Pütz, Benno; Holsboer, Florian; Schölkopf, Bernhard; Müller-Myhsok, Bertram

    2011-01-01

    Detection of epistatic interaction between loci has been postulated to provide a more in-depth understanding of the complex biological and biochemical pathways underlying human diseases. Studying the interaction between two loci is the natural progression following traditional and well-established single locus analysis. However, the added costs and time duration required for the computation involved have thus far deterred researchers from pursuing a genome-wide analysis of epistasis. In this paper, we propose a method allowing such analysis to be conducted very rapidly. The method, dubbed EPIBLASTER, is applicable to case–control studies and consists of a two-step process in which the difference in Pearson's correlation coefficients is computed between controls and cases across all possible SNP pairs as an indication of significant interaction warranting further analysis. For the subset of interactions deemed potentially significant, a second-stage analysis is performed using the likelihood ratio test from the logistic regression to obtain the P-value for the estimated coefficients of the individual effects and the interaction term. The algorithm is implemented using the parallel computational capability of commercially available graphical processing units to greatly reduce the computation time involved. In the current setup and example data sets (211 cases, 222 controls, 299468 SNPs; and 601 cases, 825 controls, 291095 SNPs), this coefficient evaluation stage can be completed in roughly 1 day. Our method allows for exhaustive and rapid detection of significant SNP pair interactions without imposing significant marginal effects of the single loci involved in the pair. PMID:21150885
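
    The first screening stage described in this abstract compares, for each SNP pair, the Pearson correlation among cases with the correlation among controls. The plain-Python sketch below illustrates that idea for a single pair; it assumes genotypes coded as 0/1/2 minor-allele counts, uses made-up data, runs on the CPU, and is not the EPIBLASTER implementation. All names are hypothetical.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


def correlation_difference(cases_a, cases_b, controls_a, controls_b):
    """Correlation of SNP A with SNP B in cases minus the same correlation in controls."""
    return pearson(cases_a, cases_b) - pearson(controls_a, controls_b)


# Hypothetical genotypes (0/1/2 allele counts) for one SNP pair.
cases_a, cases_b = [0, 1, 2, 0, 1, 2], [0, 1, 2, 0, 1, 2]
controls_a, controls_b = [0, 1, 2, 0, 1, 2], [2, 1, 0, 2, 1, 0]
diff = correlation_difference(cases_a, cases_b, controls_a, controls_b)
print(diff)
```

    Pairs with a large absolute difference would then be passed to the second stage, where the logistic regression likelihood ratio test is applied; how the cut-off is chosen is not stated in the abstract.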

  10. Determination of osteoporosis risk factors using a multiple logistic regression model in postmenopausal Turkish women.

    PubMed

    Akkus, Zeki; Camdeviren, Handan; Celik, Fatma; Gur, Ali; Nas, Kemal

    2005-09-01

    To determine the risk factors of osteoporosis using a multiple binary logistic regression method and to assess the risk variables for osteoporosis, which is a major and growing health problem in many countries. We present a case-control study, consisting of 126 postmenopausal healthy women as the control group and 225 postmenopausal osteoporotic women as the case group. The study was carried out in the Department of Physical Medicine and Rehabilitation, Dicle University, Diyarbakir, Turkey between 1999-2002. The data from the 351 participants were collected using a standard questionnaire that contains 43 variables. A multiple logistic regression model was then used to evaluate the data and to find the best regression model. The regression model correctly classified 80.1% (281/351) of the participants. Furthermore, the specificity of the model was 67% (84/126) in the control group, while the sensitivity was 88% (197/225) in the case group. We found the distribution of residual values standardized for the final model to be exponential using the Kolmogorov-Smirnov test (p=0.193). The receiver operating characteristic curve was found to be successful in predicting patients at risk of osteoporosis. This study suggests that low levels of dietary calcium intake, physical activity, education, and longer duration of menopause are independent predictors of the risk of low bone density in our population. Adequate dietary calcium intake in combination with maintaining a daily physical activity, increasing educational level, decreasing birth rate, and duration of breast-feeding may contribute to healthy bones and play a role in practical prevention of osteoporosis in Southeast Anatolia. In addition, the findings of the present study indicate that the use of a multivariate statistical method such as multiple logistic regression in osteoporosis, which may be influenced by many variables, is better than univariate statistical evaluation.
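
    The sensitivity, specificity, and overall classification rate quoted in this abstract follow directly from the reported counts. The short Python sketch below reproduces that arithmetic; the function name is a hypothetical helper, while the counts are those given in the abstract.

```python
def classification_rates(true_pos, n_cases, true_neg, n_controls):
    """Return (sensitivity, specificity, accuracy) from classification counts."""
    sensitivity = true_pos / n_cases
    specificity = true_neg / n_controls
    accuracy = (true_pos + true_neg) / (n_cases + n_controls)
    return sensitivity, specificity, accuracy


# Counts from the abstract: 197 of 225 cases and 84 of 126 controls classified correctly.
sens, spec, acc = classification_rates(197, 225, 84, 126)
print(f"sensitivity={sens:.1%} specificity={spec:.1%} accuracy={acc:.1%}")
```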

  11. Classification and regression tree analysis of acute-on-chronic hepatitis B liver failure: Seeing the forest for the trees.

    PubMed

    Shi, K-Q; Zhou, Y-Y; Yan, H-D; Li, H; Wu, F-L; Xie, Y-Y; Braddock, M; Lin, X-Y; Zheng, M-H

    2017-02-01

    At present, there is no ideal model for predicting the short-term outcome of patients with acute-on-chronic hepatitis B liver failure (ACHBLF). This study aimed to establish and validate a prognostic model by using the classification and regression tree (CART) analysis. A total of 1047 patients from two separate medical centres with suspected ACHBLF were screened in the study, which were recognized as derivation cohort and validation cohort, respectively. CART analysis was applied to predict the 3-month mortality of patients with ACHBLF. The accuracy of the CART model was tested using the area under the receiver operating characteristic curve, which was compared with the model for end-stage liver disease (MELD) score and a new logistic regression model. CART analysis identified four variables as prognostic factors of ACHBLF: total bilirubin, age, serum sodium and INR, and three distinct risk groups: low risk (4.2%), intermediate risk (30.2%-53.2%) and high risk (81.4%-96.9%). The new logistic regression model was constructed with four independent factors, including age, total bilirubin, serum sodium and prothrombin activity by multivariate logistic regression analysis. The performances of the CART model (0.896), similar to the logistic regression model (0.914, P=.382), exceeded that of MELD score (0.667, P<.001). The results were confirmed in the validation cohort. We have developed and validated a novel CART model superior to MELD for predicting three-month mortality of patients with ACHBLF. Thus, the CART model could facilitate medical decision-making and provide clinicians with a validated practical bedside tool for ACHBLF risk stratification. © 2016 John Wiley & Sons Ltd.

  12. Identification of immune correlates of protection in Shigella infection by application of machine learning.

    PubMed

    Arevalillo, Jorge M; Sztein, Marcelo B; Kotloff, Karen L; Levine, Myron M; Simon, Jakub K

    2017-10-01

    Immunologic correlates of protection are important in vaccine development because they give insight into mechanisms of protection, assist in the identification of promising vaccine candidates, and serve as endpoints in bridging clinical vaccine studies. Our goal is the development of a methodology to identify immunologic correlates of protection using the Shigella challenge as a model. The proposed methodology utilizes the Random Forests (RF) machine learning algorithm as well as Classification and Regression Trees (CART) to detect immune markers that predict protection, identify interactions between variables, and define optimal cutoffs. Logistic regression modeling is applied to estimate the probability of protection and the confidence interval (CI) for such a probability is computed by bootstrapping the logistic regression models. The results demonstrate that the combination of Classification and Regression Trees and Random Forests complements the standard logistic regression and uncovers subtle immune interactions. Specific levels of immunoglobulin IgG antibody in blood on the day of challenge predicted protection in 75% (95% CI 67-86). Of those subjects that did not have blood IgG at or above a defined threshold, 100% were protected if they had IgA antibody secreting cells above a defined threshold. Comparison with the results obtained by applying only logistic regression modeling with standard Akaike Information Criterion for model selection shows the usefulness of the proposed method. Given the complexity of the immune system, the use of machine learning methods may enhance traditional statistical approaches. When applied together, they offer a novel way to quantify important immune correlates of protection that may help the development of vaccines. Copyright © 2017 Elsevier Inc. All rights reserved.

  13. The quest for conditional independence in prospectivity modeling: weights-of-evidence, boost weights-of-evidence, and logistic regression

    NASA Astrophysics Data System (ADS)

    Schaeben, Helmut; Semmler, Georg

    2016-09-01

    The objective of prospectivity modeling is prediction of the conditional probability of the presence T = 1 or absence T = 0 of a target T given favorable or prohibitive predictors B, or construction of a two classes 0,1 classification of T. A special case of logistic regression called weights-of-evidence (WofE) is geologists' favorite method of prospectivity modeling due to its apparent simplicity. However, the numerical simplicity is deceiving as it is implied by the severe mathematical modeling assumption of joint conditional independence of all predictors given the target. General weights of evidence are explicitly introduced which are as simple to estimate as conventional weights, i.e., by counting, but do not require conditional independence. Complementary to the regression view is the classification view on prospectivity modeling. Boosting is the construction of a strong classifier from a set of weak classifiers. From the regression point of view it is closely related to logistic regression. Boost weights-of-evidence (BoostWofE) was introduced into prospectivity modeling to counterbalance violations of the assumption of conditional independence even though relaxation of modeling assumptions with respect to weak classifiers was not the (initial) purpose of boosting. In the original publication of BoostWofE a fabricated dataset was used to "validate" this approach. Using the same fabricated dataset it is shown that BoostWofE cannot generally compensate lacking conditional independence whatever the consecutively processing order of predictors. Thus the alleged features of BoostWofE are disproved by way of counterexamples, while theoretical findings are confirmed that logistic regression including interaction terms can exactly compensate violations of joint conditional independence if the predictors are indicators.
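
    As an illustration of estimating weights "by counting", the sketch below computes conventional weights of evidence for a single binary predictor B and target T from a 2x2 table of counts. The function names and the counts are hypothetical, and the formulas are the conventional WofE definitions rather than the generalized weights proposed in the paper, which the abstract does not spell out.

```python
import math

def weights_of_evidence(n11, n10, n01, n00):
    """Conventional weights for one binary predictor B and target T.

    n11: B=1, T=1   n10: B=1, T=0
    n01: B=0, T=1   n00: B=0, T=0
    """
    t1 = n11 + n01  # cells with the target present
    t0 = n10 + n00  # cells with the target absent
    w_plus = math.log((n11 / t1) / (n10 / t0))   # ln P(B=1|T=1) / P(B=1|T=0)
    w_minus = math.log((n01 / t1) / (n00 / t0))  # ln P(B=0|T=1) / P(B=0|T=0)
    return w_plus, w_minus

def posterior_probability(prior_logit, weights):
    """Add weights to the prior logit; with several predictors this sum is
    only valid under joint conditional independence given the target."""
    logit = prior_logit + sum(weights)
    return 1.0 / (1.0 + math.exp(-logit))

# Hypothetical counts, for illustration only.
w_plus, w_minus = weights_of_evidence(30, 20, 10, 40)
contrast = w_plus - w_minus          # equals the log odds ratio of the table
prior_logit = math.log(40 / 60)      # 40 target cells out of 100
print(w_plus, w_minus, contrast)
print(posterior_probability(prior_logit, [w_plus]))  # P(T=1 | B=1)
```

    With one predictor the result is exact; the conditional independence assumption discussed in the abstract only matters once weights from several predictors are summed.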

  14. Resemblance in dietary intakes between urban low-income African-American adolescents and their mothers: the healthy eating and active lifestyles from school to home for kids study.

    PubMed

    Wang, Youfa; Li, Ji; Caballero, Benjamin

    2009-01-01

    To examine the association and predictors of dietary intake resemblance between urban low-income African-American adolescents and their mothers. Detailed dietary data collected from 121 child-parent pairs in Chicago during fall 2003 were used. The association was assessed using correlation coefficients, kappa, and percentage of agreement, as well as logistic regression models. Overall, the association was weak as indicated by correlations and other measures. None of the mother-son correlations for nutrients and food groups were greater than 0.20. Mother-daughter pairs had stronger correlations (0.26 for energy and 0.30 for fat). The association was stronger in normal-weight mothers than in mothers with overweight or obesity. Logistic models showed that mother being a current smoker, giving child more pocket money, and allowing child to eat or purchase snacks without parental permission or presence predicted a higher probability of resemblance in undesirable eating patterns, such as high-energy, high-fat, and high-snack intakes (P<0.05). Mother-child diet association was generally weak, and varied considerably across groups and intake variables in this homogenous population. Some maternal characteristics seem to affect the association.

  15. Modeling the dynamics of urban growth using multinomial logistic regression: a case study of Jiayu County, Hubei Province, China

    NASA Astrophysics Data System (ADS)

    Nong, Yu; Du, Qingyun; Wang, Kun; Miao, Lei; Zhang, Weiwei

    2008-10-01

    Urban growth modeling, one of the most important aspects of land use and land cover change study, has attracted substantial attention because it helps to comprehend the mechanisms of land use change and thus helps in making relevant policies. This study applied multinomial logistic regression to model urban growth in Jiayu County, Hubei Province, China, to discover the relationship between urban growth and its driving forces, for which biophysical and socio-economic factors were selected as independent variables. This type of regression is similar to binary logistic regression, but it is more general because the dependent variable is not restricted to two categories, as it was in previous studies. The multinomial model can simulate the process of multiple land use competition between urban land, bare land, cultivated land and orchard land. Taking the urban land use type as the reference category, parameters could be estimated with odds ratios. A probability map is generated from the model to predict where urban growth will occur as a result of the computation.

  16. Confidence Intervals for Squared Semipartial Correlation Coefficients: The Effect of Nonnormality

    ERIC Educational Resources Information Center

    Algina, James; Keselman, H. J.; Penfield, Randall D.

    2010-01-01

    The increase in the squared multiple correlation coefficient (ΔR²) associated with a variable in a regression equation is a commonly used measure of importance in regression analysis. Algina, Keselman, and Penfield found that intervals based on asymptotic principles were typically very inaccurate, even though the sample size…

  17. Estimation of octanol/water partition coefficients using LSER parameters

    USGS Publications Warehouse

    Luehrs, Dean C.; Hickey, James P.; Godbole, Kalpana A.; Rogers, Tony N.

    1998-01-01

    The logarithms of octanol/water partition coefficients, logKow, were regressed against the linear solvation energy relationship (LSER) parameters for a training set of 981 diverse organic chemicals. The standard deviation for logKow was 0.49. The regression equation was then used to estimate logKow for a test of 146 chemicals which included pesticides and other diverse polyfunctional compounds. Thus the octanol/water partition coefficient may be estimated by LSER parameters without elaborate software but only moderate accuracy should be expected.

  18. Sparse representation of multi parametric DCE-MRI features using K-SVD for classifying gene expression based breast cancer recurrence risk

    NASA Astrophysics Data System (ADS)

    Mahrooghy, Majid; Ashraf, Ahmed B.; Daye, Dania; Mies, Carolyn; Rosen, Mark; Feldman, Michael; Kontos, Despina

    2014-03-01

    We evaluate the prognostic value of sparse representation-based features by applying the K-SVD algorithm on multiparametric kinetic, textural, and morphologic features in breast dynamic contrast-enhanced magnetic resonance imaging (DCE-MRI). K-SVD is an iterative dimensionality reduction method that optimally reduces the initial feature space by updating the dictionary columns jointly with the sparse representation coefficients. Therefore, by using K-SVD, we not only provide a sparse representation of the features and condense the information in a few coefficients but also reduce the dimensionality. The extracted K-SVD features are evaluated by a machine learning algorithm including a logistic regression classifier for the task of classifying high versus low breast cancer recurrence risk as determined by a validated gene expression assay. The features are evaluated using ROC curve analysis and leave-one-out cross-validation for different sparse representation and dimensionality reduction numbers. Optimal sparse representation is obtained when the number of dictionary elements is 4 (K=4) and the maximum number of non-zero coefficients is 2 (L=2). We compare K-SVD with ANOVA-based feature selection for the same prognostic features. The ROC results show that the AUCs of the K-SVD-based (K=4, L=2), the ANOVA-based, and the original features (i.e., no dimensionality reduction) are 0.78, 0.71, and 0.68, respectively. From the results, it can be inferred that by using a sparse representation of the originally extracted multi-parametric, high-dimensional data, we can condense the information into a few coefficients with the highest predictive value. In addition, the dimensionality reduction introduced by K-SVD can prevent models from over-fitting.

  19. Thermal requirements of Dermanyssus gallinae (De Geer, 1778) (Acari: Dermanyssidae).

    PubMed

    Tucci, Edna Clara; do Prado, Angelo P; de Araújo, Raquel Pires

    2008-01-01

    The thermal requirements for development of Dermanyssus gallinae were studied under laboratory conditions at 15, 20, 25, 30 and 35 °C, a 12 h photoperiod and 60-85% RH. The thermal requirements for D. gallinae were as follows. Preoviposition: base temperature 3.4 °C, thermal constant (k) 562.85 degree-hours, determination coefficient (R²) 0.59, regression equation: Y = -0.006035 + 0.001777x. Egg: base temperature 10.60 °C, thermal constant (k) 689.65 degree-hours, determination coefficient (R²) 0.94, regression equation: Y = -0.015367 + 0.001450x. Larva: base temperature 9.82 °C, thermal constant (k) 464.91 degree-hours, determination coefficient (R²) 0.87, regression equation: Y = -0.021123 + 0.002151x. Protonymph: base temperature 10.17 °C, thermal constant (k) 504.49 degree-hours, determination coefficient (R²) 0.90, regression equation: Y = -0.020152 + 0.001982x. Deutonymph: base temperature 11.80 °C, thermal constant (k) 501.11 degree-hours, determination coefficient (R²) 0.99, regression equation: Y = -0.023555 + 0.001996x. The results obtained showed that 15 to 42 generations of Dermanyssus gallinae may occur during the year in the State of São Paulo, as estimated based on isotherm charts. Dermanyssus gallinae may develop continually in the State of São Paulo, with a population decrease in the winter. There were differences between the developmental stages of D. gallinae in relation to thermal requirements.
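
    The reported base temperatures and thermal constants appear to follow from the regression equations. The sketch below assumes the usual linear development-rate model, in which Y is the development rate per hour, x is temperature in °C, the base temperature is where the line crosses zero (-a/b), and the thermal constant is 1/b. The abstract does not state these formulas, so they are an assumption; the function name is hypothetical.

```python
def thermal_requirements(a, b):
    """For a rate line Y = a + b*x, return (base temperature, thermal constant).

    Assumed model: rate is zero at the base temperature, so x0 = -a/b,
    and the thermal constant is the reciprocal of the slope, k = 1/b.
    """
    if b <= 0:
        raise ValueError("slope must be positive for a development-rate line")
    return -a / b, 1.0 / b

# Intercepts and slopes as reported in the abstract.
stages = {
    "preoviposition": (-0.006035, 0.001777),
    "egg": (-0.015367, 0.001450),
    "larva": (-0.021123, 0.002151),
    "protonymph": (-0.020152, 0.001982),
    "deutonymph": (-0.023555, 0.001996),
}

results = {name: thermal_requirements(a, b) for name, (a, b) in stages.items()}
for name, (base_temp, k) in results.items():
    print(f"{name}: base temperature {base_temp:.2f} °C, k = {k:.2f} degree-hours")
```

    Under this assumption the computed base temperatures match the reported ones, and the thermal constants agree to within the rounding of the published slopes.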

  20. Logistic Mixed Models to Investigate Implicit and Explicit Belief Tracking.

    PubMed

    Lages, Martin; Scheel, Anne

    2016-01-01

    We investigated the proposition of a two-systems Theory of Mind in adults' belief tracking. A sample of N = 45 participants predicted the choice of one of two opponent players after observing several rounds in an animated card game. Three matches of this card game were played and initial gaze direction on target and subsequent choice predictions were recorded for each belief task and participant. We conducted logistic regressions with mixed effects on the binary data and developed Bayesian logistic mixed models to infer implicit and explicit mentalizing in true belief and false belief tasks. Although logistic regressions with mixed effects predicted the data well a Bayesian logistic mixed model with latent task- and subject-specific parameters gave a better account of the data. As expected explicit choice predictions suggested a clear understanding of true and false beliefs (TB/FB). Surprisingly, however, model parameters for initial gaze direction also indicated belief tracking. We discuss why task-specific parameters for initial gaze directions are different from choice predictions yet reflect second-order perspective taking.

  1. Model selection for logistic regression models

    NASA Astrophysics Data System (ADS)

    Duller, Christine

    2012-09-01

    Model selection for logistic regression models decides which of some given potential regressors have an effect and hence should be included in the final model. The second interesting question is whether a certain factor is heterogeneous among some subsets, i.e. whether the model should include a random intercept or not. In this paper these questions will be answered with classical as well as with Bayesian methods. The applications show some results of recent research projects in medicine and business administration.

  2. Biostatistics Series Module 6: Correlation and Linear Regression.

    PubMed

    Hazra, Avijit; Gogtay, Nithya

    2016-01-01

    Correlation and linear regression are the most commonly used techniques for quantifying the association between two numeric variables. Correlation quantifies the strength of the linear relationship between paired variables, expressing this as a correlation coefficient. If both variables x and y are normally distributed, we calculate Pearson's correlation coefficient (r). If normality assumption is not met for one or both variables in a correlation analysis, a rank correlation coefficient, such as Spearman's rho (ρ) may be calculated. A hypothesis test of correlation tests whether the linear relationship between the two variables holds in the underlying population, in which case it returns a P < 0.05. A 95% confidence interval of the correlation coefficient can also be calculated for an idea of the correlation in the population. The value r² denotes the proportion of the variability of the dependent variable y that can be attributed to its linear relation with the independent variable x and is called the coefficient of determination. Linear regression is a technique that attempts to link two correlated variables x and y in the form of a mathematical equation (y = a + bx), such that given the value of one variable the other may be predicted. In general, the method of least squares is applied to obtain the equation of the regression line. Correlation and linear regression analysis are based on certain assumptions pertaining to the data sets. If these assumptions are not met, misleading conclusions may be drawn. The first assumption is that of linear relationship between the two variables. A scatter plot is essential before embarking on any correlation-regression analysis to show that this is indeed the case. Outliers or clustering within data sets can distort the correlation coefficient value. Finally, it is vital to remember that though strong correlation can be a pointer toward causation, the two are not synonymous.
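
    The quantities named in this abstract can be computed directly. The following is a minimal sketch using only the Python standard library; the function names and the five data points are made up for illustration and do not come from the article.

```python
import math

def pearson_r(x, y):
    """Pearson's correlation coefficient r for paired samples."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    sxx = sum((xi - mx) ** 2 for xi in x)
    syy = sum((yi - my) ** 2 for yi in y)
    return sxy / math.sqrt(sxx * syy)

def least_squares_line(x, y):
    """Intercept a and slope b of the least-squares line y = a + b*x."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    sxx = sum((xi - mx) ** 2 for xi in x)
    b = sxy / sxx
    a = my - b * mx
    return a, b

# Hypothetical paired observations.
x = [1, 2, 3, 4, 5]
y = [2.1, 3.9, 6.2, 7.8, 10.1]

r = pearson_r(x, y)
a, b = least_squares_line(x, y)
print(f"r = {r:.4f}, r^2 = {r * r:.4f}")   # r^2 is the coefficient of determination
print(f"y = {a:.2f} + {b:.2f}x")
```

    A scatter plot of x against y would still be the first step in practice, as the abstract stresses; the numbers alone do not confirm linearity.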

  3. Biostatistics Series Module 6: Correlation and Linear Regression

    PubMed Central

    Hazra, Avijit; Gogtay, Nithya

    2016-01-01

    Correlation and linear regression are the most commonly used techniques for quantifying the association between two numeric variables. Correlation quantifies the strength of the linear relationship between paired variables, expressing this as a correlation coefficient. If both variables x and y are normally distributed, we calculate Pearson's correlation coefficient (r). If normality assumption is not met for one or both variables in a correlation analysis, a rank correlation coefficient, such as Spearman's rho (ρ) may be calculated. A hypothesis test of correlation tests whether the linear relationship between the two variables holds in the underlying population, in which case it returns a P < 0.05. A 95% confidence interval of the correlation coefficient can also be calculated for an idea of the correlation in the population. The value r² denotes the proportion of the variability of the dependent variable y that can be attributed to its linear relation with the independent variable x and is called the coefficient of determination. Linear regression is a technique that attempts to link two correlated variables x and y in the form of a mathematical equation (y = a + bx), such that given the value of one variable the other may be predicted. In general, the method of least squares is applied to obtain the equation of the regression line. Correlation and linear regression analysis are based on certain assumptions pertaining to the data sets. If these assumptions are not met, misleading conclusions may be drawn. The first assumption is that of linear relationship between the two variables. A scatter plot is essential before embarking on any correlation-regression analysis to show that this is indeed the case. Outliers or clustering within data sets can distort the correlation coefficient value. Finally, it is vital to remember that though strong correlation can be a pointer toward causation, the two are not synonymous. PMID:27904175

  4. Radiomorphometric analysis of frontal sinus for sex determination.

    PubMed

    Verma, Saumya; Mahima, V G; Patil, Karthikeya

    2014-09-01

    Sex determination of unknown individuals carries crucial significance in forensic research, in cases where fragments of skull persist with no likelihood of identification based on dental arch. In these instances sex determination becomes important to rule out a certain number of possibilities instantly and helps in establishing a biological profile of human remains. The aim of the study is to evaluate a mathematical method based on logistic regression analysis capable of ascertaining the sex of individuals in the South Indian population. The study was conducted in the department of Oral Medicine and Radiology. The right and left areas, maximum height, and width of the frontal sinus were determined in 100 Caldwell views of 50 women and 50 men aged 20 years and above, with the help of Vernier callipers and a square grid with 1 square measuring 1 mm² in area. Student's t-test and logistic regression analysis were used. The mean values of variables were greater in men, based on Student's t-test at 5% level of significance. The mathematical model based on logistic regression analysis gave percentage agreement of total area to correctly predict the female gender as 55.2%, of right area as 60.9% and of left area as 55.2%. The areas of the frontal sinus and the logistic regression proved to be unreliable in sex determination. (Logit = 0.924 - 0.00217 × right area).
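
    The reported equation can be turned into a predicted probability with the inverse logit. The sketch below assumes the right area is measured in mm² (suggested by the 1 mm² grid) and that the modeled outcome is female sex (suggested by the negative slope and the larger mean areas in men); neither is stated explicitly in the abstract. The function name is hypothetical.

```python
import math

INTERCEPT = 0.924            # from the reported equation
SLOPE_RIGHT_AREA = -0.00217  # per unit of right frontal sinus area

def predicted_probability(right_area):
    """Inverse logit of the reported linear predictor."""
    logit = INTERCEPT + SLOPE_RIGHT_AREA * right_area
    return 1.0 / (1.0 + math.exp(-logit))

# The predicted probability equals 0.5 where the logit is zero.
cutoff_area = INTERCEPT / -SLOPE_RIGHT_AREA

for area in (200, cutoff_area, 600):
    print(f"right area {area:7.1f}: p = {predicted_probability(area):.3f}")
```

    The small slope means the probability changes slowly across plausible areas, which is consistent with the modest agreement percentages reported.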

  5. Genetic prediction of type 2 diabetes using deep neural network.

    PubMed

    Kim, J; Kim, J; Kwak, M J; Bajaj, M

    2018-04-01

    Type 2 diabetes (T2DM) has strong heritability but genetic models to explain heritability have been challenging. We tested a deep neural network (DNN) to predict T2DM using the nested case-control study of Nurses' Health Study (3326 females, 45.6% T2DM) and Health Professionals Follow-up Study (2502 males, 46.5% T2DM). We selected 96, 214, 399, and 678 single-nucleotide polymorphisms (SNPs) through Fisher's exact test and L1-penalized logistic regression. We split each dataset randomly in 4:1 to train prediction models and test their performance. DNN and logistic regressions showed better area under the curve (AUC) of ROC curves than the clinical model when 399 or more SNPs were included. DNN was superior to logistic regressions in AUC with 399 or more SNPs in males and 678 SNPs in females. Addition of clinical factors consistently increased AUC of DNN but failed to improve logistic regressions with 214 or more SNPs. In conclusion, we show that DNN can be a versatile tool to predict T2DM incorporating large numbers of SNPs and clinical information. Limitations include a relatively small number of subjects, mostly of European ethnicity. Further studies are warranted to confirm and improve performance of genetic prediction models using DNN in different ethnic groups. © 2017 John Wiley & Sons A/S. Published by John Wiley & Sons Ltd.

  6. Unconditional or Conditional Logistic Regression Model for Age-Matched Case-Control Data?

    PubMed

    Kuo, Chia-Ling; Duan, Yinghui; Grady, James

    2018-01-01

    Matching on demographic variables is commonly used in case-control studies to adjust for confounding at the design stage. There is a presumption that matched data need to be analyzed by matched methods. Conditional logistic regression has become a standard for matched case-control data to tackle the sparse data problem. The sparse data problem, however, may not be a concern for loose-matching data when the matching between cases and controls is not unique, and one case can be matched to other controls without substantially changing the association. Data matched on a few demographic variables are clearly loose-matching data, and we hypothesize that unconditional logistic regression is a proper method to perform. To address the hypothesis, we compare unconditional and conditional logistic regression models by precision in estimates and hypothesis testing using simulated matched case-control data. Our results support our hypothesis; however, the unconditional model is not as robust as the conditional model to the matching distortion that the matching process not only makes cases and controls similar for matching variables but also for the exposure status. When the study design involves other complex features or the computational burden is high, matching in loose-matching data can be ignored for negligible loss in testing and estimation if the distributions of matching variables are not extremely different between cases and controls.

  7. Unconditional or Conditional Logistic Regression Model for Age-Matched Case–Control Data?

    PubMed Central

    Kuo, Chia-Ling; Duan, Yinghui; Grady, James

    2018-01-01

    Matching on demographic variables is commonly used in case–control studies to adjust for confounding at the design stage. There is a presumption that matched data need to be analyzed by matched methods. Conditional logistic regression has become a standard for matched case–control data to tackle the sparse data problem. The sparse data problem, however, may not be a concern for loose-matching data when the matching between cases and controls is not unique, and one case can be matched to other controls without substantially changing the association. Data matched on a few demographic variables are clearly loose-matching data, and we hypothesize that unconditional logistic regression is a proper method to perform. To address the hypothesis, we compare unconditional and conditional logistic regression models by precision in estimates and hypothesis testing using simulated matched case–control data. Our results support our hypothesis; however, the unconditional model is not as robust as the conditional model to the matching distortion that the matching process not only makes cases and controls similar for matching variables but also for the exposure status. When the study design involves other complex features or the computational burden is high, matching in loose-matching data can be ignored for negligible loss in testing and estimation if the distributions of matching variables are not extremely different between cases and controls. PMID:29552553

  8. Estimating multilevel logistic regression models when the number of clusters is low: a comparison of different statistical software procedures.

    PubMed

    Austin, Peter C

    2010-04-22

    Multilevel logistic regression models are increasingly being used to analyze clustered data in medical, public health, epidemiological, and educational research. Procedures for estimating the parameters of such models are available in many statistical software packages. There is currently little evidence on the minimum number of clusters necessary to reliably fit multilevel regression models. We conducted a Monte Carlo study to compare the performance of different statistical software procedures for estimating multilevel logistic regression models when the number of clusters was low. We examined procedures available in BUGS, HLM, R, SAS, and Stata. We found that there were qualitative differences in the performance of different software procedures for estimating multilevel logistic models when the number of clusters was low. Among the likelihood-based procedures, estimation methods based on adaptive Gauss-Hermite approximations to the likelihood (glmer in R and xtlogit in Stata) or adaptive Gaussian quadrature (Proc NLMIXED in SAS) tended to have superior performance for estimating variance components when the number of clusters was small, compared to software procedures based on penalized quasi-likelihood. However, only Bayesian estimation with BUGS allowed for accurate estimation of variance components when there were fewer than 10 clusters. For all statistical software procedures, estimation of variance components tended to be poor when there were only five subjects per cluster, regardless of the number of clusters.

  9. Building a Decision Support System for Inpatient Admission Prediction With the Manchester Triage System and Administrative Check-in Variables.

    PubMed

    Zlotnik, Alexander; Alfaro, Miguel Cuchí; Pérez, María Carmen Pérez; Gallardo-Antolín, Ascensión; Martínez, Juan Manuel Montero

    2016-05-01

    The usage of decision support tools in emergency departments, based on predictive models, capable of estimating the probability of admission for patients in the emergency department may give nursing staff the possibility of allocating resources in advance. We present a methodology for developing and building one such system for a large specialized care hospital using a logistic regression and an artificial neural network model using nine routinely collected variables available right at the end of the triage process. A database of 255,668 triaged nonobstetric emergency department presentations from the Ramon y Cajal University Hospital of Madrid, from January 2011 to December 2012, was used to develop and test the models, with 66% of the data used for derivation and 34% for validation, with an ordered nonrandom partition. On the validation dataset, areas under the receiver operating characteristic curve were 0.8568 (95% confidence interval, 0.8508-0.8583) for the logistic regression model and 0.8575 (95% confidence interval, 0.8540-0.8610) for the artificial neural network model. χ² values for Hosmer-Lemeshow fixed "deciles of risk" were 65.32 for the logistic regression model and 17.28 for the artificial neural network model. A nomogram was generated upon the logistic regression model and an automated software decision support system with a Web interface was built based on the artificial neural network model.
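
    For readers unfamiliar with the area under the ROC curve reported here, the sketch below computes it as the fraction of (admitted, not admitted) pairs in which the admitted patient receives the higher predicted probability, counting ties as half. This is the standard rank-based definition, not necessarily the software routine the authors used, and the scores and labels are hypothetical.

```python
def auc(scores, labels):
    """Area under the ROC curve via pairwise comparison.

    scores: predicted probabilities; labels: 1 = event, 0 = no event.
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("both classes must be present")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0      # correctly ordered pair
            elif p == n:
                wins += 0.5      # tie counts as half
    return wins / (len(pos) * len(neg))

# Hypothetical predicted admission probabilities and observed outcomes.
scores = [0.9, 0.8, 0.35, 0.6, 0.4, 0.2]
labels = [1, 1, 1, 0, 0, 0]
print(auc(scores, labels))  # 7 of 9 pairs are correctly ordered
```

    The pairwise loop is quadratic in the sample size, so for a dataset as large as the one described a rank-based implementation would be preferable; the loop is used here only because it makes the definition explicit.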

  10. Product unit neural network models for predicting the growth limits of Listeria monocytogenes.

    PubMed

    Valero, A; Hervás, C; García-Gimeno, R M; Zurera, G

    2007-08-01

    A new approach to predict the growth/no growth interface of Listeria monocytogenes as a function of storage temperature, pH, citric acid (CA) and ascorbic acid (AA) is presented. A linear logistic regression procedure was performed and a non-linear model was obtained by adding new variables by means of a Neural Network model based on Product Units (PUNN). The classification efficiency of the training data set and the generalization data of the new Logistic Regression PUNN model (LRPU) were compared with Linear Logistic Regression (LLR) and Polynomial Logistic Regression (PLR) models. 92% of the total cases from the LRPU model were correctly classified, an improvement on the percentage obtained using the PLR model (90%) and significantly higher than the results obtained with the LLR model (80%). In addition, predictions of LRPU were closer to the observed data, which permits the design of proper formulations in minimally processed foods. This novel methodology can be applied to predictive microbiology for describing the growth/no growth interface of food-borne microorganisms such as L. monocytogenes. The optimal balance lies in finding models with an acceptable interpretation capacity and with good ability to fit the data on the boundaries of the variable range. The results indicate that these kinds of models might well be a very valuable tool for mathematical modeling.

  11. Analysis of a database to predict the result of allergy testing in vivo in patients with chronic nasal symptoms.

    PubMed

    Lacagnina, Valerio; Leto-Barone, Maria S; La Piana, Simona; Seidita, Aurelio; Pingitore, Giuseppe; Di Lorenzo, Gabriele

    2014-01-01

    This article uses the logistic regression model for diagnostic decision making in patients with chronic nasal symptoms. We studied the ability of the logistic regression model, obtained by the evaluation of a database, to detect patients with positive allergy skin-prick test (SPT) and patients with negative SPT. The model developed was validated using the data set obtained from another medical institution. The analysis was performed using a database obtained from a questionnaire administered to the patients with nasal symptoms containing personal data, clinical data, and results of allergy testing (SPT). All variables found to be significantly different between patients with positive and negative SPT (p < 0.05) were selected for the logistic regression models and were analyzed with backward stepwise logistic regression, evaluated with area under the curve of the receiver operating characteristic curve. A second set of patients from another institution was used to prove the model. The accuracy of the model in identifying, over the second set, both patients whose SPT will be positive and negative was high. The model detected 96% of patients with nasal symptoms and positive SPT and classified 94% of those with negative SPT. This study is preliminary to the creation of a software that could help the primary care doctors in a diagnostic decision making process (need of allergy testing) in patients complaining of chronic nasal symptoms.

  12. Comparing machine learning and logistic regression methods for predicting hypertension using a combination of gene expression and next-generation sequencing data.

    PubMed

    Held, Elizabeth; Cape, Joshua; Tintle, Nathan

    2016-01-01

    Machine learning methods continue to show promise in the analysis of data from genetic association studies because of the high number of variables relative to the number of observations. However, few best practices exist for the application of these methods. We extend a recently proposed supervised machine learning approach for predicting disease risk by genotypes to be able to incorporate gene expression data and rare variants. We then apply 2 different versions of the approach (radial and linear support vector machines) to simulated data from Genetic Analysis Workshop 19 and compare performance to logistic regression. Method performance was not radically different across the 3 methods, although the linear support vector machine tended to show small gains in predictive ability relative to a radial support vector machine and logistic regression. Importantly, as the number of genes in the models was increased, even when those genes contained causal rare variants, model predictive ability showed a statistically significant decrease in performance for both the radial support vector machine and logistic regression. The linear support vector machine showed more robust performance to the inclusion of additional genes. Further work is needed to evaluate machine learning approaches on larger samples and to evaluate the relative improvement in model prediction from the incorporation of gene expression data.

  13. Why medical students choose psychiatry - a 20 country cross-sectional survey.

    PubMed

    Farooq, Kitty; Lydall, Gregory J; Malik, Amit; Ndetei, David M; Bhugra, Dinesh

    2014-01-15

    Recruitment to psychiatry is insufficient to meet projected mental health service needs world-wide. We report on the career plans of final year medical students from 20 countries, investigating factors identified from the literature which influence psychiatric career choice. Cross sectional electronic or paper survey. Subjects were final year medical students at 46 medical schools in participating countries. We assessed students' career intentions, motivations, medical school teaching and exposure to psychiatry. We assessed students' attitudes and personality factors. The main outcome measure was likelihood of specializing in psychiatry. Multilevel logistic regression was used to examine the joint effect of factors upon the main outcome. 2198 of 9135 (24%) of students responded (range 4 to 91%) across the countries. Internationally 4.5% of students definitely considered psychiatry as a career (range 1 to 12%). 19% of students (range 0 to 33%) were "quite likely", and 25% were "definitely not" considering psychiatry. Female gender, experience of mental/physical illness, media portrayal of doctors, and positive attitudes to psychiatry, but not personality factors, were associated with choosing psychiatry. Quality of psychiatric placement (correlation coefficient = 0.22, p < 0.001) and number of placements (correlation coefficient =0.21, p < 0.001) were associated with higher ATP scores. 
During medical school, experience of psychiatric enrichment activities (special studies modules and university psychiatry clubs), experience of acutely unwell patients and perceived clinical responsibility were all associated with choice of psychiatry. Multilevel logistic regression revealed six factors associated with students choosing psychiatry: importance of own vocation, odds ratio (OR) 3.01 (95% CI 1.61 to 5.91, p < 0.001); interest in psychiatry before medical school, OR 10.8 (5.38 to 21.8, p < 0.001); undertaking a psychiatry special study module, OR 1.45 (1.05 to 2.01, p = 0.03) or elective, OR 4.28 (2.87 to 6.38, p < 0.001); membership of a university psychiatry club, OR 3.25 (2.87 to 6.38, p < 0.001); and exposure to didactic teaching, OR 0.54 (0.40 to 0.72, p < 0.001). We report factors relevant to medical student selection and psychiatry teaching which affect career choice. Addressing these factors may improve recruitment to psychiatry internationally.

  14. [Research on the reliability and validity of postural workload assessment method and the relation to work-related musculoskeletal disorders of workers].

    PubMed

    Qin, D L; Jin, X N; Wang, S J; Wang, J J; Mamat, N; Wang, F J; Wang, Y; Shen, Z A; Sheng, L G; Forsman, M; Yang, L Y; Wang, S; Zhang, Z B; He, L H

    2018-06-18

    To form a new assessment method to evaluate postural workload comprehensively analyzing the dynamic and static postural workload for workers during their work process to analyze the reliability and validity, and to study the relation between workers' postural workload and work-related musculoskeletal disorders (WMSDs). In the study, 844 workers from electronic and railway vehicle manufacturing factories were selected as subjects investigated by using the China Musculoskeletal Questionnaire (CMQ) to form the postural workload comprehensive assessment method. The Cronbach's α, cluster analysis and factor analysis were used to assess the reliability and validity of the new assessment method. Non-conditional Logistic regression was used to analyze the relation between workers' postural workload and WMSDs. Reliability of the assessment method for postural workload: internal consistency analysis results showed that Cronbach's α was 0.934 and the results of split-half reliability indicated that Spearman-Brown coefficient was 0.881 and the correlation coefficient between the first part and the second was 0.787. Validity of the assessment method for postural workload: the results of cluster analysis indicated that square Euclidean distance between dynamic and static postural workload assessment in the same part or work posture was the shortest. The results of factor analysis showed that 2 components were extracted and the cumulative percentage of variance achieved 65.604%. The postural workload score of the different occupational workers showed significant difference (P<0.05) by covariance analysis. The results of nonconditional Logistic regression indicated that alcohol intake (OR=2.141, 95%CI 1.337-3.428) and obesity (OR=3.408, 95%CI 1.629-7.130) were risk factors for WMSDs. The risk for WMSDs would rise as workers' postural workload rose (OR=1.035, 95%CI 1.022-1.048). 
There were significant differences in the risk of WMSDs among groups of workers distinguished by work type, gender and age. Female workers exhibited a higher prevalence of WMSDs (OR=2.626, 95%CI 1.414-4.879), as did workers between 30 and 40 years of age (OR=1.909, 95%CI 1.237-2.946) compared with those under 30. This method for comprehensively assessing postural workload is reliable and effective when used with assembly workers, and postural workload is related to WMSDs.

  15. [Application of Bayes Probability Model in Differentiation of Yin and Yang Jaundice Syndromes in Neonates].

    PubMed

    Mu, Chun-sun; Zhang, Ping; Kong, Chun-yan; Li, Yang-ning

    2015-09-01

    To study the application of a Bayes probability model in differentiating yin and yang jaundice syndromes in neonates. A total of 107 jaundiced neonates admitted to hospital within 10 days after birth were assigned to two groups according to syndrome differentiation, 68 in the yang jaundice syndrome group and 39 in the yin jaundice syndrome group. Data collected for the neonates were factors related to jaundice before, during and after birth. Routine blood tests, liver and renal function, and myocardial enzymes were tested on the admission day or the next day. A logistic regression model and Bayes discriminant analysis were used to screen factors important for yin and yang jaundice syndrome differentiation. Finally, a Bayes probability model for yin and yang jaundice syndromes was established and assessed. Factors important for yin and yang jaundice syndrome differentiation screened by the logistic regression model and Bayes discriminant analysis included mother's age, maternal gestational diabetes mellitus (GDM), gestational age, asphyxia, ABO hemolytic disease, red blood cell distribution width (RDW-SD), platelet-large cell ratio (P-LCR), serum direct bilirubin (DBIL), alkaline phosphatase (ALP), and cholinesterase (CHE). Bayes discriminant analysis was performed in SPSS to obtain the discriminant function coefficients, from which the Bayes discriminant functions were established. Yang jaundice syndrome: y1 = -21.701 + 2.589 × mother's age + 1.037 × GDM - 17.175 × asphyxia + 13.876 × gestational age + 6.303 × ABO hemolytic disease + 2.116 × RDW-SD + 0.831 × DBIL + 0.012 × ALP + 1.697 × LCR + 0.001 × CHE. Yin jaundice syndrome: y2 = -33.511 + 2.991 × mother's age + 3.960 × GDM - 12.877 × asphyxia + 11.848 × gestational age + 1.820 × ABO hemolytic disease + 2.231 × RDW-SD + 0.999 × DBIL + 0.023 × ALP + 1.916 × LCR + 0.002 × CHE. Hypothesis testing of the Bayes discriminant function gave Wilks' λ = 0.393 (P = 0.000). 
The Bayes discriminant function was therefore statistically significant. When the Bayes probability model was checked for discriminating yin and yang jaundice syndromes, the coincidence rates for both syndromes exceeded 90%. Yin and yang jaundice syndromes in neonates could be accurately distinguished using the Bayes discriminant functions.
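
    The two discriminant functions in this record can be evaluated directly. What follows is a minimal Python sketch, assuming the usual rule for classification functions of this kind: a case is assigned to the group whose function gives the larger score. The dictionary keys, the function names and the coding of each predictor (for example 0/1 for GDM and asphyxia) are illustrative assumptions, because the abstract states neither units nor coding.

```python
# Coefficients transcribed from the abstract. Predictor names, units and
# coding are assumptions made for illustration only.
YANG = {
    "intercept": -21.701, "mother_age": 2.589, "gdm": 1.037,
    "asphyxia": -17.175, "gestational_age": 13.876, "abo_hemolytic": 6.303,
    "rdw_sd": 2.116, "dbil": 0.831, "alp": 0.012, "lcr": 1.697, "che": 0.001,
}
YIN = {
    "intercept": -33.511, "mother_age": 2.991, "gdm": 3.960,
    "asphyxia": -12.877, "gestational_age": 11.848, "abo_hemolytic": 1.820,
    "rdw_sd": 2.231, "dbil": 0.999, "alp": 0.023, "lcr": 1.916, "che": 0.002,
}

# Every key except the intercept is a predictor.
PREDICTORS = [name for name in YANG if name != "intercept"]


def discriminant_score(coefs, case):
    """Linear score: intercept plus the weighted sum of the predictors."""
    return coefs["intercept"] + sum(coefs[p] * case[p] for p in PREDICTORS)


def classify(case):
    """Return the label with the larger score, plus both scores.

    Ties are assigned to "yin"; this tie-break is an arbitrary choice.
    """
    y1 = discriminant_score(YANG, case)
    y2 = discriminant_score(YIN, case)
    return ("yang" if y1 > y2 else "yin"), y1, y2
```

    A case is passed as a dictionary with one value per entry in PREDICTORS; any values used to try the sketch are hypothetical and carry no clinical meaning.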

  16. [Use of multiple regression models in observational studies (1970-2013) and requirements of the STROBE guidelines in Spanish scientific journals].

    PubMed

    Real, J; Cleries, R; Forné, C; Roso-Llorach, A; Martínez-Sánchez, J M

    In medicine and biomedical research, statistical techniques like logistic, linear, Cox and Poisson regression are widely known. The main objective is to describe the evolution of multivariate techniques used in observational studies indexed in PubMed (1970-2013), and to check the requirements of the STROBE guidelines in the author guidelines in Spanish journals indexed in PubMed. A targeted PubMed search was performed to identify papers that used logistic, linear, Cox and Poisson models. Furthermore, a review was also made of the author guidelines of journals published in Spain and indexed in PubMed and Web of Science. Only 6.1% of the indexed manuscripts included a term related to multivariate analysis, increasing from 0.14% in 1980 to 12.3% in 2013. In 2013, 6.7, 2.5, 3.5, and 0.31% of the manuscripts contained terms related to logistic, linear, Cox and Poisson regression, respectively. On the other hand, 12.8% of journals' author guidelines explicitly recommend following the STROBE guidelines, and 35.9% recommend the CONSORT guideline. A low percentage of Spanish scientific journals indexed in PubMed include the STROBE statement requirement in the author guidelines. Multivariate regression models such as logistic, linear, Cox and Poisson regression are increasingly used in published observational studies, both at international level and in journals published in Spanish. Copyright © 2015 Sociedad Española de Médicos de Atención Primaria (SEMERGEN). Published by Elsevier España, S.L.U. All rights reserved.

  17. The microbiological profile and presence of bloodstream infection influence mortality rates in necrotizing fasciitis

    PubMed Central

    2011-01-01

    Introduction Necrotizing fasciitis (NF) is a life threatening infectious disease with a high mortality rate. We carried out a microbiological characterization of the causative pathogens. We investigated the correlation of mortality in NF with bloodstream infection and with the presence of co-morbidities. Methods In this retrospective study, we analyzed 323 patients who presented with necrotizing fasciitis at two different institutions. Bloodstream infection (BSI) was defined as a positive blood culture result. The patients were categorized as survivors and non-survivors. Eleven clinically important variables which were statistically significant by univariate analysis were selected for multivariate regression analysis and a stepwise logistic regression model was developed to determine the association between BSI and mortality. Results Univariate logistic regression analysis showed that patients with hypotension, heart disease, liver disease, presence of Vibrio spp. in wound cultures, presence of fungus in wound cultures, and presence of Streptococcus group A, Aeromonas spp. or Vibrio spp. in blood cultures, had a significantly higher risk of in-hospital mortality. Our multivariate logistic regression analysis showed a higher risk of mortality in patients with pre-existing conditions like hypotension, heart disease, and liver disease. Multivariate logistic regression analysis also showed that presence of Vibrio spp. in wound cultures, and presence of Streptococcus group A in blood cultures were associated with a high risk of mortality while debridement ≥ 3 was associated with improved survival. Conclusions Mortality in patients with necrotizing fasciitis was significantly associated with the presence of Vibrio in wound cultures and Streptococcus group A in blood cultures. PMID:21693053

  18. Prediction of siRNA potency using sparse logistic regression.

    PubMed

    Hu, Wei; Hu, John

    2014-06-01

    RNA interference (RNAi) can modulate gene expression at post-transcriptional as well as transcriptional levels. Short interfering RNA (siRNA) serves as a trigger for the RNAi gene inhibition mechanism, and therefore is a crucial intermediate step in RNAi. There have been extensive studies to identify the sequence characteristics of potent siRNAs. One such study built a linear model using LASSO (Least Absolute Shrinkage and Selection Operator) to measure the contribution of each siRNA sequence feature. This model is simple and interpretable, but it requires a large number of nonzero weights. We have introduced a novel technique, sparse logistic regression, to build a linear model using single-position specific nucleotide compositions which has the same prediction accuracy of the linear model based on LASSO. The weights in our new model share the same general trend as those in the previous model, but have only 25 nonzero weights out of a total 84 weights, a 54% reduction compared to the previous model. Contrary to the linear model based on LASSO, our model suggests that only a few positions are influential on the efficacy of the siRNA, which are the 5' and 3' ends and the seed region of siRNA sequences. We also employed sparse logistic regression to build a linear model using dual-position specific nucleotide compositions, a task LASSO is not able to accomplish well due to its high dimensional nature. Our results demonstrate the superiority of sparse logistic regression as a technique for both feature selection and regression over LASSO in the context of siRNA design.

  19. Chronic obstructive pulmonary disease in Welsh slate miners.

    PubMed

    Reynolds, C J; MacNeill, S J; Williams, J; Hodges, N G; Campbell, M J; Newman Taylor, A J; Cullinan, P

    2017-01-01

    Exposure to respirable crystalline silica (RCS) causes emphysema, airflow limitation and chronic obstructive pulmonary disease (COPD). Slate miners are exposed to slate dust containing RCS but their COPD risk has not previously been studied. To study the cumulative effect of mining on lung function and risk of COPD in a cohort of Welsh slate miners and whether these were independent of smoking and pneumoconiosis. The study was based on a secondary analysis of Medical Research Council (MRC) survey data. COPD was defined as forced expiratory volume in 1 s/forced vital capacity (FEV1/FVC) ratio <0.7. We created multivariable models to assess the association between mining and lung function after adjusting for age and smoking status. We used linear regression models for FEV1 and FVC and logistic regression for COPD. In the original MRC study, 1255 men participated (726 slate miners, 529 unexposed non-miners). COPD was significantly more common in miners (n = 213, 33%) than non-miners (n = 120, 26%), P < 0.05. There was no statistically significant difference in risk of COPD between miners and non-miners when analysis was limited to non-smokers or those without radiographic evidence of pneumoconiosis. After adjustment for smoking, slate mining was associated with a reduction in % predicted FEV1 [β coefficient = -3.97, 95% confidence interval (CI) -6.65, -1.29] and FVC (β coefficient = -2.32, 95% CI -4.31, -0.33) and increased risk of COPD (odds ratio: 1.38, 95% CI 1.06, 1.81). Slate mining may reduce lung function and increase the incidence of COPD independently of smoking and pneumoconiosis. © The Author 2016. Published by Oxford University Press on behalf of the Society of Occupational Medicine. All rights reserved. For Permissions, please email: journals.permissions@oup.com.

  20. Vitamin D Deficiency and Its Relationship with Child-Pugh Class in Patients with Chronic Liver Disease

    PubMed Central

    Jamil, Zubia; Arif, Sharmin; Khan, Anum; Durrani, Asghar Aurangzeb; Yaqoob, Nayyar

    2018-01-01

    Abstract Background and Aims: Skeletal manifestation in liver diseases represents the minimally scrutinized part of the disease spectrum. Vitamin D deficiency has a central role in developing hepatic osteodystrophy in patients with chronic liver disease. This study aimed to investigate vitamin D levels and their relationship with disease advancement in these patients. Methods: Vitamin D levels were checked in 125 chronic liver disease patients. The patients were classified in three stages according to Child-Pugh score: A, B and C. The relationship of vitamin D levels with Child-Pugh score and other variables in the study was assessed by the contingency coefficient. Correlation and logistic regression analyses were also carried out to find additional predictors of low vitamin D levels. Results: Among the patients, 88% had either insufficient or deficient stores of vitamin D, while only 12% had sufficient vitamin D levels (p >0.05). Vitamin D levels were notably related to Child-Pugh class (contingency coefficient = 0.5, p <0.05). On univariate and multinomial regression analyses, age, female sex, MELD and Child-Pugh class were predictors of low vitamin D levels. Age, model of end-stage liver disease score and Child-Pugh score were negatively correlated to vitamin D levels (p <0.05). Conclusions: Vitamin D deficiency is notably related to age, female sex and model of end-stage liver disease score, in addition to Child-Pugh class of liver cirrhosis. Vitamin D levels should be routinely checked in patients with advanced liver cirrhosis (Child-Pugh class B and C) and this deficiency must be addressed in a timely manner to improve general well-being of cirrhotic patients.

  1. Women's empowerment and ideal family size: an examination of DHS empowerment measures in Sub-Saharan Africa.

    PubMed

    Upadhyay, Ushma D; Karasek, Deborah

    2012-06-01

    The Demographic and Health Survey (DHS) program collects data on women's empowerment, but little is known about how these measures perform in Sub-Saharan African countries. It is important to understand whether women's empowerment is associated with their ideal number of children and ability to limit fertility to that ideal number in the Sub-Saharan African context. The analysis used couples data from DHS surveys in four Sub-Saharan African countries: Guinea, Mali, Namibia and Zambia. Women's empowerment was measured by participation in household decision making, attitudes toward wife beating and attitudes toward refusing sex with one's husband. Multivariable linear regression was used to model women's ideal number of children, and multivariable logistic regression was used to model women's odds of having more children than their ideal. In Guinea and Zambia, negative attitudes toward wife beating were associated with having a smaller ideal number of children (beta coefficients, -0.5 and -0.3, respectively). Greater household decision making was associated with a smaller ideal number of children only in Guinea (beta coefficient, -0.3). Additionally, household decision making and positive attitudes toward women's right to refuse sex were associated with elevated odds of having more children than desired in Namibia and Zambia, respectively (odds ratios, 2.3 and 1.4); negative attitudes toward wife beating were associated with reduced odds of the outcome in Mali (0.4). Women's empowerment--as assessed using currently available measures--is not consistently associated with a desire for smaller families or the ability to achieve desired fertility in these Sub-Saharan African countries. Further research is needed to determine what measures are most applicable for these contexts.

  2. Medication adherence and visit-to-visit variability of systolic blood pressure in African Americans with chronic kidney disease in the AASK trial.

    PubMed

    Hong, K; Muntner, P; Kronish, I; Shilane, D; Chang, T I

    2016-01-01

    Lower adherence to antihypertensive medications may increase visit-to-visit variability of blood pressure (VVV of BP), a risk factor for cardiovascular events and death. We used data from the African American Study of Kidney Disease and Hypertension (AASK) trial to examine whether lower medication adherence is associated with higher systolic VVV of BP in African Americans with hypertensive chronic kidney disease (CKD). Determinants of VVV of BP were also explored. AASK participants (n=988) were categorized by self-report or pill count as having perfect (100%), moderately high (75-99%), moderately low (50-74%) or low (<50%) proportion of study visits with high medication adherence over a 1-year follow-up period. We used multinomial logistic regression to examine determinants of medication adherence, and multivariable-adjusted linear regression to examine the association between medication adherence and systolic VVV of BP, defined as the coefficient of variation or the average real variability (ARV). Participants with lower self-reported adherence were generally younger and had a higher prevalence of comorbid conditions. Compared with perfect adherence, moderately high, moderately low and low adherence was associated with 0.65% (±0.31%), 0.99% (±0.31%) and 1.29% (±0.32%) higher systolic VVV of BP (defined as the coefficient of variation) in fully adjusted models. Results were qualitatively similar when using ARV or when using pill counts as the measure of adherence. Lower medication adherence is associated with higher systolic VVV of BP in African Americans with hypertensive CKD; efforts to improve medication adherence in this population may reduce systolic VVV of BP.
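
    The two variability measures named in this record have simple definitions. The sketch below is a minimal Python illustration, assuming the coefficient of variation is the sample standard deviation divided by the mean (as a percentage) and the average real variability (ARV) is the mean absolute difference between consecutive visits; the function names and the blood pressure readings are hypothetical and are not taken from the AASK data.

```python
from statistics import mean, stdev


def coefficient_of_variation(readings):
    """Sample standard deviation expressed as a percentage of the mean."""
    return 100.0 * stdev(readings) / mean(readings)


def average_real_variability(readings):
    """Mean absolute difference between consecutive visits."""
    diffs = [abs(b - a) for a, b in zip(readings, readings[1:])]
    return mean(diffs)


# Hypothetical systolic readings (mmHg) from four consecutive visits.
sbp = [120, 130, 125, 135]
cv = coefficient_of_variation(sbp)
arv = average_real_variability(sbp)
```

    Unlike the coefficient of variation, ARV depends on the order of the visits, so the two measures can rank the same patients differently.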

  3. Household food insecurity is associated with less physical activity among children and adults in the U.S. population.

    PubMed

    To, Quyen G; Frongillo, Edward A; Gallegos, Danielle; Moore, Justin B

    2014-11-01

    Household food insecurity and physical activity are each important public-health concerns in the United States, but the relation between them has not been investigated thoroughly. This study aimed to examine the association between food insecurity and physical activity in the U.S. population. Physical activity measured by accelerometry (PAM) and physical activity measured by questionnaire (PAQ) data from the NHANES 2003-2006 were used. Individuals aged <6 y or >65 y, pregnant women, individuals with physical limitations, and individuals with family income >350% of the poverty line were excluded. Food insecurity was measured by the USDA Household Food Security Survey Module. Adjusted ORs were calculated from logistic regression to identify the association between food insecurity and adherence to the physical-activity guidelines. Adjusted coefficients were obtained from linear regression to identify the association between food insecurity with sedentary/physical-activity minutes. In children, food insecurity was not associated with adherence to physical-activity guidelines measured via PAM or PAQ and with sedentary minutes (P > 0.05). Food-insecure children did less moderate to vigorous physical activity than food-secure children (adjusted coefficient = -5.24, P = 0.02). In adults, food insecurity was significantly associated with adherence to physical-activity guidelines (adjusted OR = 0.72, P = 0.03 for PAM; and OR = 0.84, P < 0.01 for PAQ) but was not associated with sedentary minutes (P > 0.05). Food-insecure children did less moderate to vigorous physical activity, and food-insecure adults were less likely to adhere to the physical-activity guidelines than those without food insecurity. © 2014 American Society for Nutrition.

  4. Factor Scores, Structure Coefficients, and Communality Coefficients

    ERIC Educational Resources Information Center

    Goodwyn, Fara

    2012-01-01

    This paper presents heuristic explanations of factor scores, structure coefficients, and communality coefficients. Common misconceptions regarding these topics are clarified. In addition, (a) the regression (b) Bartlett, (c) Anderson-Rubin, and (d) Thompson methods for calculating factor scores are reviewed. Syntax necessary to execute all four…

  5. Combining logistic regression with classification and regression tree to predict quality of care in a home health nursing data set.

    PubMed

    Guo, Huey-Ming; Shyu, Yea-Ing Lotus; Chang, Her-Kun

    2006-01-01

    In this article, the authors provide an overview of a research method to predict quality of care in a home health nursing data set. The results of this study can be visualized through classification and regression tree (CART) graphs. The analysis was more effective, and the results were more informative, because the home health nursing data set was analyzed with a combination of logistic regression and CART; the two techniques complement each other. The results were also more informative in that more patient characteristics were found to be related to quality of care in home care. The results help home health nurses predict patient outcomes in case management. Improved prediction is needed for interventions to be appropriately targeted for improved patient outcome and quality of care.

  6. A general framework for the use of logistic regression models in meta-analysis.

    PubMed

    Simmonds, Mark C; Higgins, Julian Pt

    2016-12-01

    Where individual participant data are available for every randomised trial in a meta-analysis of dichotomous event outcomes, "one-stage" random-effects logistic regression models have been proposed as a way to analyse these data. Such models can also be used even when individual participant data are not available and we have only summary contingency table data. One benefit of this one-stage regression model over conventional meta-analysis methods is that it maximises the correct binomial likelihood for the data and so does not require the common assumption that effect estimates are normally distributed. A second benefit of using this model is that it may be applied, with only minor modification, in a range of meta-analytic scenarios, including meta-regression, network meta-analyses and meta-analyses of diagnostic test accuracy. This single model can potentially replace the variety of often complex methods used in these areas. This paper considers, with a range of meta-analysis examples, how random-effects logistic regression models may be used in a number of different types of meta-analyses. This one-stage approach is compared with widely used meta-analysis methods including Bayesian network meta-analysis and the bivariate and hierarchical summary receiver operating characteristic (ROC) models for meta-analyses of diagnostic test accuracy. © The Author(s) 2014.

  7. Asthma exacerbation and proximity of residence to major roads: a population-based matched case-control study among the pediatric Medicaid population in Detroit, Michigan

    PubMed Central

    2011-01-01

    Background The relationship between asthma and traffic-related pollutants has received considerable attention. The use of individual-level exposure measures, such as residence location or proximity to emission sources, may avoid ecological biases. Method This study focused on the pediatric Medicaid population in Detroit, MI, a high-risk population for asthma-related events. A population-based matched case-control analysis was used to investigate associations between acute asthma outcomes and proximity of residence to major roads, including freeways. Asthma cases were identified as all children who made at least one asthma claim, including inpatient and emergency department visits, during the three-year study period, 2004-06. Individually matched controls were randomly selected from the rest of the Medicaid population on the basis of non-respiratory related illness. We used conditional logistic regression with distance as both categorical and continuous variables, and examined non-linear relationships with distance using polynomial splines. The conditional logistic regression models were then extended by considering multiple asthma states (based on the frequency of acute asthma outcomes) using polychotomous conditional logistic regression. Results Asthma events were associated with proximity to primary roads with an odds ratio of 0.97 (95% CI: 0.94, 0.99) for a 1 km increase in distance using conditional logistic regression, implying that asthma events are less likely as the distance between the residence and a primary road increases. Similar relationships and effect sizes were found using polychotomous conditional logistic regression. Another plausible exposure metric, a reduced form response surface model that represents atmospheric dispersion of pollutants from roads, was not associated under that exposure model. 
Conclusions There is moderately strong evidence of elevated risk of asthma close to major roads based on the results obtained in this population-based matched case-control study. PMID:21513554
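
    The reported odds ratio of 0.97 per 1 km can be rescaled to other distances. The sketch below is a minimal Python illustration, assuming distance enters the model as a linear term on the log-odds scale (the abstract also mentions spline models, to which this does not apply) and that the interval is a Wald-type interval, so its bounds rescale the same way. The function name and the 5 km distance are hypothetical choices.

```python
def rescale_odds_ratio(or_per_unit, units):
    """Odds ratio for a change of `units`, given the OR for one unit.

    Valid when the predictor is linear on the log-odds scale, so that
    OR(units) = exp(beta * units) = OR(1) ** units.
    """
    return or_per_unit ** units


# Estimate and 95% CI per 1 km, as reported in the abstract.
per_km = {"or": 0.97, "lower": 0.94, "upper": 0.99}

# Hypothetical 5 km difference in distance, chosen only for illustration.
per_5km = {key: rescale_odds_ratio(value, 5) for key, value in per_km.items()}
```

    Under these assumptions, a residence 5 km farther from a primary road corresponds to an odds ratio of roughly 0.86, with bounds of roughly 0.73 and 0.95.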

  8. Neural network modeling for surgical decisions on traumatic brain injury patients.

    PubMed

    Li, Y C; Liu, L; Chiu, W T; Jian, W S

    2000-01-01

    Computerized medical decision support systems have been a major research topic in recent years. Intelligent computer programs were implemented to aid physicians and other medical professionals in making difficult medical decisions. This report compares three different mathematical models for building a traumatic brain injury (TBI) medical decision support system (MDSS). These models were developed based on a large TBI patient database. This MDSS accepts a set of patient data such as the types of skull fracture, Glasgow Coma Scale (GCS) and episodes of convulsion, and returns the chance that a neurosurgeon would recommend an open-skull surgery for this patient. The three mathematical models described in this report include a logistic regression model, a multi-layer perceptron (MLP) neural network and a radial-basis-function (RBF) neural network. Of the 12,640 patients selected from the database, a randomly drawn 9480 cases were used as the training group to develop/train our models. The other 3160 cases were in the validation group which we used to evaluate the performance of these models. We used sensitivity, specificity, areas under receiver-operating characteristics (ROC) curve and calibration curves as the indicators of how accurate these models are in predicting a neurosurgeon's decision on open-skull surgery. The results showed that, assuming equal importance of sensitivity and specificity, the logistic regression model had a (sensitivity, specificity) of (73%, 68%), compared to (80%, 80%) from the RBF model and (88%, 80%) from the MLP model. The resultant areas under ROC curve for logistic regression, RBF and MLP neural networks are 0.761, 0.880 and 0.897, respectively (P < 0.05). Among these models, the logistic regression has noticeably poorer calibration. This study demonstrated the feasibility of applying neural networks as the mechanism for TBI decision support systems based on clinical databases. 
The results also suggest that neural networks may be a better solution for complex, non-linear medical decision support systems than conventional statistical techniques such as logistic regression.

  9. Cluster Analysis of Campylobacter jejuni Genotypes Isolated from Small and Medium-Sized Mammalian Wildlife and Bovine Livestock from Ontario Farms.

    PubMed

    Viswanathan, M; Pearl, D L; Taboada, E N; Parmley, E J; Mutschall, S K; Jardine, C M

    2017-05-01

    Using data collected from a cross-sectional study of 25 farms (eight beef, eight swine and nine dairy) in 2010, we assessed clustering of C. jejuni molecular subtypes, determined with a Campylobacter-specific 40-gene comparative genomic fingerprinting assay (CGF40), using unweighted pair-group method with arithmetic mean (UPGMA) analysis and multiple correspondence analysis. Exact logistic regression was used to determine which genes differentiate wildlife and livestock subtypes in our study population. A total of 33 bovine livestock (17 beef and 16 dairy) and 26 wildlife (20 raccoon (Procyon lotor), five skunk (Mephitis mephitis) and one mouse (Peromyscus spp.)) C. jejuni isolates were subtyped using CGF40. Dendrogram analysis, based on UPGMA, showed distinct branches separating bovine livestock and mammalian wildlife isolates. Furthermore, two-dimensional multiple correspondence analysis was highly concordant with dendrogram analysis showing clear differentiation between livestock and wildlife CGF40 subtypes. Based on multilevel logistic regression models with a random intercept for farm of origin, we found that isolates in general, and raccoons more specifically, were significantly more likely to be part of the wildlife branch. Exact logistic regression conducted gene by gene revealed 15 genes that were predictive of whether an isolate was of wildlife or bovine livestock origin. Both multiple correspondence analysis and exact logistic regression revealed that in most cases, the presence of a particular gene (13 of 15) was associated with an isolate being of livestock rather than wildlife origin. In conclusion, the evidence gained from dendrogram analysis, multiple correspondence analysis and exact logistic regression indicates that mammalian wildlife carry CGF40 subtypes of C. jejuni distinct from those carried by bovine livestock. Future studies focused on source attribution of C. 
jejuni in human infections will help determine whether wildlife transmit Campylobacter jejuni directly to humans. © 2016 Blackwell Verlag GmbH.

  10. Individual-tree probability of survival model for the Northeastern United States

    Treesearch

    Richard M. Teck; Donald E. Hilt

    1990-01-01

    Describes a distance-independent individual-tree probability of survival model for the Northeastern United States. Survival is predicted using a six-parameter logistic function with species-specific coefficients. Coefficients are presented for 28 species groups. The model accounts for variability in annual survival due to species, tree size, site quality, and the tree...

  11. Multiple linear regression analysis

    NASA Technical Reports Server (NTRS)

    Edwards, T. R.

    1980-01-01

    Program rapidly selects best-suited set of coefficients. User supplies only vectors of independent and dependent data and specifies confidence level required. Program uses stepwise statistical procedure for relating minimal set of variables to set of observations; final regression contains only most statistically significant coefficients. Program is written in FORTRAN IV for batch execution and has been implemented on NOVA 1200.

  12. The Importance of Gestational Sac Size of Ectopic Pregnancy in Response to Single-Dose Methotrexate

    PubMed Central

    Kimiaei, Parichehr; Khani, Zahra; Marefian, Azadeh; Gholampour Ghavamabadi, Maryam; Salimnejad, Maryam

    2013-01-01

    This retrospective cohort study was designed in a selective group of 185 patients diagnosed with and treated for ectopic pregnancy (EP). A single dose of methotrexate (MTX; 50 mg/m²) was administered intramuscularly in order to identify predictors of failure of, or resistance to, treatment necessitating surgical intervention. After treatment with a single dose of MTX, 20 patients (10.8%) failed to respond, of whom 6 (30%) showed side effects of MTX and rupture of the ectopic pregnancy. The remaining cases (n = 14) showed resistance to the drug; the level of β-hCG did not fall by at least 15% during the 7 days after treatment, and laparotomy was required. In backward stepwise multiple logistic regression analysis of various predictor factors, size of gestational sac (coefficient = 1.91, OR = 6.78, 95% confidence interval = 3.18–8.22) and baseline β-hCG level (coefficient = 1.60, OR = 5.0, 95% confidence interval = 4.26–6.72) were significantly associated with failure of EP patients to respond to MTX. This study suggests that further investigation of relative contraindications to MTX treatment in women with EP should focus on gestational sac size, because other variables are in the causal pathway of this variable. PMID:23762575
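
    The reported odds ratios can be checked against the coefficients, because in logistic regression the odds ratio for a one-unit increase in a predictor is the exponential of its coefficient. The following is a minimal Python sketch of that check; the dictionary keys are illustrative names, not taken from the paper, and the small gaps from the published ORs presumably reflect rounding of the printed coefficients.

```python
import math

# Coefficients as reported in the abstract above (key names are illustrative).
coefficients = {"gestational_sac_size": 1.91, "baseline_beta_hcg": 1.60}

# Odds ratio for a one-unit increase in a predictor = exp(coefficient).
odds_ratios = {name: math.exp(b) for name, b in coefficients.items()}

for name, value in odds_ratios.items():
    print(f"{name}: OR = {value:.2f}")
```

    The computed values are about 6.75 and 4.95, close to the reported 6.78 and 5.0.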

  13. A three-parameter model for classifying anurans into four genera based on advertisement calls.

    PubMed

    Gingras, Bruno; Fitch, William Tecumseh

    2013-01-01

    The vocalizations of anurans are innate in structure and may therefore contain indicators of phylogenetic history. Thus, advertisement calls of species which are more closely related phylogenetically are predicted to be more similar than those of distant species. This hypothesis was evaluated by comparing several widely used machine-learning algorithms. Recordings of advertisement calls from 142 species belonging to four genera were analyzed. A logistic regression model, using mean values for dominant frequency, coefficient of variation of root-mean square energy, and spectral flux, correctly classified advertisement calls with regard to genus with an accuracy above 70%. Similar accuracy rates were obtained using these parameters with a support vector machine model, a K-nearest neighbor algorithm, and a multivariate Gaussian distribution classifier, whereas a Gaussian mixture model performed slightly worse. In contrast, models based on mel-frequency cepstral coefficients did not fare as well. Comparable accuracy levels were obtained on out-of-sample recordings from 52 of the 142 original species. The results suggest that a combination of low-level acoustic attributes is sufficient to discriminate efficiently between the vocalizations of these four genera, thus supporting the initial premise and validating the use of high-throughput algorithms on animal vocalizations to evaluate phylogenetic hypotheses.

  14. 2012 Workplace and Gender Relations Survey of Reserve Component Members: Statistical Methodology Report

    DTIC Science & Technology

    2012-09-01

    ...logistic regression model was used to predict the probability of eligibility for the survey (known eligibility vs. unknown eligibility). A second logistic...regression model was used to predict the probability of response among eligible sample members (complete response vs. non-response). CHAID (Chi

  15. Habitat features and predictive habitat modeling for the Colorado chipmunk in southern New Mexico

    USGS Publications Warehouse

    Rivieccio, M.; Thompson, B.C.; Gould, W.R.; Boykin, K.G.

    2003-01-01

    Two subspecies of Colorado chipmunk (state threatened and federal species of concern) occur in southern New Mexico: Tamias quadrivittatus australis in the Organ Mountains and T. q. oscuraensis in the Oscura Mountains. We developed a GIS model of potentially suitable habitat based on vegetation and elevation features, evaluated site classifications of the GIS model, and determined vegetation and terrain features associated with chipmunk occurrence. We compared GIS model classifications with actual vegetation and elevation features measured at 37 sites. At 60 sites we measured 18 habitat variables regarding slope, aspect, tree species, shrub species, and ground cover. We used logistic regression to analyze habitat variables associated with chipmunk presence/absence. All (100%) 37 sample sites (28 predicted suitable, 9 predicted unsuitable) were classified correctly by the GIS model regarding elevation and vegetation. For 28 sites predicted suitable by the GIS model, 18 sites (64%) appeared visually suitable based on habitat variables selected from logistic regression analyses, of which 10 sites (36%) were specifically predicted as suitable habitat via logistic regression. We detected chipmunks at 70% of sites deemed suitable via the logistic regression models. Shrub cover, tree density, plant proximity, presence of logs, and presence of rock outcrop were retained in the logistic model for the Oscura Mountains; litter, shrub cover, and grass cover were retained in the logistic model for the Organ Mountains. Evaluation of predictive models illustrates the need for multi-stage analyses to best judge performance. Microhabitat analyses indicate prospective needs for different management strategies between the subspecies. Sensitivities of each population of the Colorado chipmunk to natural and prescribed fire suggest that partial burnings of areas inhabited by Colorado chipmunks in southern New Mexico may be beneficial. 
These partial burnings may later help avoid a fire that could substantially reduce habitat of chipmunks over a mountain range.

  16. The Associations of Serum Lipids with Vitamin D Status.

    PubMed

    Wang, Ying; Si, Shaoyan; Liu, Junli; Wang, Zongye; Jia, Haiying; Feng, Kai; Sun, Lili; Song, Shu Jun

    2016-01-01

    Vitamin D deficiency has been associated with some disorders including cardiovascular diseases. Dyslipidemia is a major risk factor for cardiovascular diseases. However, data about the relationships between vitamin D and lipids are inconsistent. The relationship of vitamin D and Atherogenic Index of Plasma (AIP), as an excellent predictor of level of small and dense LDL, has not been reported. The objective of this study was to investigate the effects of vitamin D status on serum lipids in Chinese adults. The study was carried out using 1475 participants from the Center for Physical Examination, 306 Hospital of PLA in Beijing, China. Fasting blood samples were collected and serum concentrations of 25(OH)D, total cholesterol (TC), triglyceride (TG), high density lipoprotein cholesterol (HDL-C) and low density lipoprotein cholesterol (LDL-C) were measured. AIP was calculated based on the formula: log [TG/HDL-C]. Multiple linear regression analysis was used to estimate the associations between serum 25(OH)D and lipids. The association between the occurrences of dyslipidemias and vitamin D levels was assessed by multiple logistic regression analysis. Confounding factors, age and BMI, were used for the adjustment. The median of serum 25(OH)D concentration was 47 (27-92.25) nmol/L in all subjects. The overall percentage of 25(OH)D ≦ 50 nmol/L was 58.5% (males 54.4%, females 63.7%). The serum 25(OH)D levels were inversely associated with TG (β coefficient = -0.24, p < 0.001) and LDL-C (β coefficient = -0.34, p < 0.001) and positively associated with TC (β coefficient = 0.35, p < 0.002) in men. The associations between serum 25(OH)D and LDL-C (β coefficient = -0.25, p = 0.01) and TC (β coefficient = 0.39, p = 0.001) also existed in women. The serum 25(OH)D concentrations were negatively associated with AIP in men (r = -0.111, p < 0.01) but not in women. In addition, vitamin D deficient men had higher AIP values than vitamin D sufficient men. 
Furthermore, the occurrences of dyslipidemias (reduced HDL-C, elevated TG and elevated AIP) correlated with lower 25(OH)D levels in men, whereas higher TC and LDL-C were associated with higher 25(OH)D levels in women. It seems that serum 25(OH)D levels are closely associated with serum lipids and AIP. Vitamin D deficiency may be associated with an increased risk of dyslipidemias, especially in men. The association between vitamin D status and serum lipids may differ by gender.
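
    The AIP formula quoted in the abstract, log [TG/HDL-C], can be written as a short Python function. This is a sketch under two assumptions that the abstract does not state: the logarithm is taken to base 10, and both concentrations are in mmol/L. The function name and the input values are hypothetical.

```python
import math


def atherogenic_index(tg_mmol_l: float, hdl_c_mmol_l: float) -> float:
    """AIP = log10(TG / HDL-C).

    Base 10 and mmol/L units are assumptions; the abstract only says "log".
    """
    if tg_mmol_l <= 0 or hdl_c_mmol_l <= 0:
        raise ValueError("concentrations must be positive")
    return math.log10(tg_mmol_l / hdl_c_mmol_l)


# Hypothetical lipid values, not taken from the study.
aip = atherogenic_index(tg_mmol_l=1.7, hdl_c_mmol_l=1.0)
print(round(aip, 3))
```

    Equal TG and HDL-C concentrations give an AIP of zero, and the index rises as TG increases relative to HDL-C.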

  17. The logistic model for predicting the non-gonoactive Aedes aegypti females.

    PubMed

    Reyes-Villanueva, Filiberto; Rodríguez-Pérez, Mario A

    2004-01-01

    To estimate, using logistic regression, the likelihood of occurrence of a non-gonoactive Aedes aegypti female, previously fed human blood, with relation to body size and collection method. This study was conducted in Monterrey, Mexico, between 1994 and 1996. Ten samplings of 60 mosquitoes of Ae. aegypti females were carried out in three dengue endemic areas: six of biting females, two of emerging mosquitoes, and two of indoor resting females. Gravid females, as well as those with blood in the gut were removed. Mosquitoes were taken to the laboratory and engorged on human blood. After 48 hours, ovaries were dissected to register whether they were gonoactive or non-gonoactive. Wing-length in mm was an indicator for body size. The logistic regression model was used to assess the likelihood of non-gonoactivity, as a binary variable, in relation to wing-length and collection method. Of the 600 females, 164 (27%) remained non-gonoactive, with a wing-length range of 1.9-3.2 mm, almost equal to that of all females (1.8-3.3 mm). The logistic regression model showed a significant likelihood of a female remaining non-gonoactive (Y=1). The collection method did not influence the binary response, but there was an inverse relationship between non-gonoactivity and wing-length. Dengue vector populations from Monterrey, Mexico display a wide-range body size. Logistic regression was a useful tool to estimate the likelihood for an engorged female to remain non-gonoactive. The necessity for a second blood meal is present in any female, but small mosquitoes are more likely to bite again within a 2-day interval, in order to attain egg maturation. The English version of this paper is available too at: http://www.insp.mx/salud/index.html.

  18. The Application of the Cumulative Logistic Regression Model to Automated Essay Scoring

    ERIC Educational Resources Information Center

    Haberman, Shelby J.; Sinharay, Sandip

    2010-01-01

    Most automated essay scoring programs use a linear regression model to predict an essay score from several essay features. This article applied a cumulative logit model instead of the linear regression model to automated essay scoring. Comparison of the performances of the linear regression model and the cumulative logit model was performed on a…

  19. Widen NomoGram for multinomial logistic regression: an application to staging liver fibrosis in chronic hepatitis C patients.

    PubMed

    Ardoino, Ilaria; Lanzoni, Monica; Marano, Giuseppe; Boracchi, Patrizia; Sagrini, Elisabetta; Gianstefani, Alice; Piscaglia, Fabio; Biganzoli, Elia M

    2017-04-01

    The interpretation of regression models results can often benefit from the generation of nomograms, 'user friendly' graphical devices especially useful for assisting the decision-making processes. However, in the case of multinomial regression models, whenever categorical responses with more than two classes are involved, nomograms cannot be drawn in the conventional way. Such a difficulty in managing and interpreting the outcome could often result in a limitation of the use of multinomial regression in decision-making support. In the present paper, we illustrate the derivation of a non-conventional nomogram for multinomial regression models, intended to overcome this issue. Although it may appear less straightforward at first sight, the proposed methodology allows an easy interpretation of the results of multinomial regression models and makes them more accessible for clinicians and general practitioners too. Development of prediction model based on multinomial logistic regression and of the pertinent graphical tool is illustrated by means of an example involving the prediction of the extent of liver fibrosis in hepatitis C patients by routinely available markers.

  20. On Using the Average Intercorrelation Among Predictor Variables and Eigenvector Orientation to Choose a Regression Solution.

    ERIC Educational Resources Information Center

    Mugrage, Beverly; And Others

    Three ridge regression solutions are compared with ordinary least squares regression and with principal components regression using all components. Ridge regression, particularly the Lawless-Wang solution, out-performed ordinary least squares regression and the principal components solution on the criteria of stability of coefficient and closeness…

  1. Regularization Paths for Conditional Logistic Regression: The clogitL1 Package.

    PubMed

    Reid, Stephen; Tibshirani, Rob

    2014-07-01

    We apply the cyclic coordinate descent algorithm of Friedman, Hastie, and Tibshirani (2010) to the fitting of a conditional logistic regression model with lasso (ℓ1) and elastic net penalties. The sequential strong rules of Tibshirani, Bien, Hastie, Friedman, Taylor, Simon, and Tibshirani (2012) are also used in the algorithm and it is shown that these offer a considerable speed up over the standard coordinate descent algorithm with warm starts. Once implemented, the algorithm is used in simulation studies to compare the variable selection and prediction performance of the conditional logistic regression model against that of its unconditional (standard) counterpart. We find that the conditional model performs admirably on datasets drawn from a suitable conditional distribution, outperforming its unconditional counterpart at variable selection. The conditional model is also fit to a small real world dataset, demonstrating how we obtain regularization paths for the parameters of the model and how we apply cross validation for this method where natural unconditional prediction rules are hard to come by.
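
    The core of coordinate descent under lasso and elastic net penalties is a soft-thresholding step applied to one coefficient at a time. The sketch below shows that step for a single standardized predictor in the plain, unweighted least-squares case; it is not the clogitL1 implementation, and the function and parameter names (`z`, `lam`, `alpha`) are illustrative choices. In a logistic model the same step would be applied inside a weighted least-squares approximation of the likelihood.

```python
def soft_threshold(z: float, gamma: float) -> float:
    """S(z, gamma) = sign(z) * max(|z| - gamma, 0)."""
    if z > gamma:
        return z - gamma
    if z < -gamma:
        return z + gamma
    return 0.0


def elastic_net_update(z: float, lam: float, alpha: float) -> float:
    """One coordinate update for a standardized predictor.

    z is the partial-residual inner product for this coordinate,
    lam the overall penalty, alpha the lasso/ridge mix
    (alpha = 1 is the lasso, alpha = 0 is pure ridge shrinkage).
    """
    return soft_threshold(z, lam * alpha) / (1.0 + lam * (1.0 - alpha))


print(soft_threshold(3.0, 1.0))           # shrunk toward zero by 1
print(soft_threshold(-0.5, 1.0))          # inside the threshold: exactly zero
print(elastic_net_update(3.0, 1.0, 0.5))  # thresholded, then ridge-shrunk
```

    The exact zero returned inside the threshold is what lets the lasso drop variables from the model rather than merely shrinking them.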

  2. Computational tools for exact conditional logistic regression.

    PubMed

    Corcoran, C; Mehta, C; Patel, N; Senchaudhuri, P

    Logistic regression analyses are often challenged by the inability of unconditional likelihood-based approximations to yield consistent, valid estimates and p-values for model parameters. This can be due to sparseness or separability in the data. Conditional logistic regression, though useful in such situations, can also be computationally unfeasible when the sample size or number of explanatory covariates is large. We review recent developments that allow efficient approximate conditional inference, including Monte Carlo sampling and saddlepoint approximations. We demonstrate through real examples that these methods enable the analysis of significantly larger and more complex data sets. We find in this investigation that for these moderately large data sets Monte Carlo seems a better alternative, as it provides unbiased estimates of the exact results and can be executed in less CPU time than can the single saddlepoint approximation. Moreover, the double saddlepoint approximation, while computationally the easiest to obtain, offers little practical advantage. It produces unreliable results and cannot be computed when a maximum likelihood solution does not exist. Copyright 2001 John Wiley & Sons, Ltd.

  3. Regularization Paths for Conditional Logistic Regression: The clogitL1 Package

    PubMed Central

    Reid, Stephen; Tibshirani, Rob

    2014-01-01

    We apply the cyclic coordinate descent algorithm of Friedman, Hastie, and Tibshirani (2010) to the fitting of a conditional logistic regression model with lasso (ℓ1) and elastic net penalties. The sequential strong rules of Tibshirani, Bien, Hastie, Friedman, Taylor, Simon, and Tibshirani (2012) are also used in the algorithm and it is shown that these offer a considerable speed up over the standard coordinate descent algorithm with warm starts. Once implemented, the algorithm is used in simulation studies to compare the variable selection and prediction performance of the conditional logistic regression model against that of its unconditional (standard) counterpart. We find that the conditional model performs admirably on datasets drawn from a suitable conditional distribution, outperforming its unconditional counterpart at variable selection. The conditional model is also fit to a small real world dataset, demonstrating how we obtain regularization paths for the parameters of the model and how we apply cross validation for this method where natural unconditional prediction rules are hard to come by. PMID:26257587

  4. Ordinal logistic regression analysis on the nutritional status of children in KarangKitri village

    NASA Astrophysics Data System (ADS)

    Ohyver, Margaretha; Yongharto, Kimmy Octavian

    2015-09-01

    Ordinal logistic regression is a statistical technique that can be used to describe the relationship between an ordinal response variable and one or more independent variables. This method has been used in various fields, including health. In this research, ordinal logistic regression is used to describe the relationship between the nutritional status of children and age, gender, height, and family status. Nutritional status in this research is divided into over nutrition, well nutrition, less nutrition, and malnutrition. The purpose of this research is to describe the characteristics of children in KarangKitri village and to determine the factors that influence their nutritional status. Three findings were obtained. First, there are still children who are not categorized as having well-nourished status. Second, some children from families of sufficient economic level fall into the not-normal status. Third, the factors that affect the nutritional level of children are age, family status, and height.

  5. Analysis of an Environmental Exposure Health Questionnaire in a Metropolitan Minority Population Utilizing Logistic Regression and Support Vector Machines

    PubMed Central

    Chen, Chau-Kuang; Bruce, Michelle; Tyler, Lauren; Brown, Claudine; Garrett, Angelica; Goggins, Susan; Lewis-Polite, Brandy; Weriwoh, Mirabel L; Juarez, Paul D.; Hood, Darryl B.; Skelton, Tyler

    2014-01-01

    The goal of this study was to analyze a 54-item instrument for assessment of perception of exposure to environmental contaminants within the context of the built environment, or exposome. This exposome was defined in five domains to include 1) home and hobby, 2) school, 3) community, 4) occupation, and 5) exposure history. Interviews were conducted with child-bearing-age minority women at Metro Nashville General Hospital at Meharry Medical College. Data were analyzed utilizing DTReg software for Support Vector Machine (SVM) modeling followed by an SPSS package for a logistic regression model. The target (outcome) variable of interest was respondent's residence by ZIP code. The results demonstrate that the rank order of important variables with respect to SVM modeling versus traditional logistic regression models is almost identical. This is the first study documenting that SVM analysis has discriminate power for determination of higher-ordered spatial relationships on an environmental exposure history questionnaire. PMID:23395953

  6. An ultra low power feature extraction and classification system for wearable seizure detection.

    PubMed

    Page, Adam; Pramod, Siddharth; Oates, Tim; Mohsenin, Tinoosh

    2015-01-01

    In this paper we explore the use of a variety of machine learning algorithms for designing a reliable and low-power, multi-channel EEG feature extractor and classifier for predicting seizures from electroencephalographic data (scalp EEG). Different machine learning classifiers including k-nearest neighbor, support vector machines, naïve Bayes, logistic regression, and neural networks are explored with the goal of maximizing detection accuracy while minimizing power, area, and latency. The input to each machine learning classifier is a 198-feature vector containing 9 features for each of the 22 EEG channels obtained over 1-second windows. All classifiers were able to obtain F1 scores over 80% and onset sensitivity of 100% when tested on 10 patients. Among the five classifiers explored, logistic regression (LR) proved to have the minimum hardware complexity while providing an average F1 score of 91%. Both ASIC and FPGA implementations of logistic regression are presented and show the smallest area, the lowest power consumption, and the lowest latency when compared to previous work.

  7. The arcsine is asinine: the analysis of proportions in ecology.

    PubMed

    Warton, David I; Hui, Francis K C

    2011-01-01

    The arcsine square root transformation has long been standard procedure when analyzing proportional data in ecology, with applications in data sets containing binomial and non-binomial response variables. Here, we argue that the arcsine transform should not be used in either circumstance. For binomial data, logistic regression has greater interpretability and higher power than analyses of transformed data. However, it is important to check the data for additional unexplained variation, i.e., overdispersion, and to account for it via the inclusion of random effects in the model if found. For non-binomial data, the arcsine transform is undesirable on the grounds of interpretability, and because it can produce nonsensical predictions. The logit transformation is proposed as an alternative approach to address these issues. Examples are presented in both cases to illustrate these advantages, comparing various methods of analyzing proportions including untransformed, arcsine- and logit-transformed linear models and logistic regression (with or without random effects). Simulations demonstrate that logistic regression usually provides a gain in power over other methods.
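
    The contrast between the two transformations can be illustrated numerically. The sketch below is an illustration only, not the authors' analysis: it back-transforms a hypothetical fitted value from each scale. Any real number on the logit scale maps back into (0, 1), whereas a linear model on the arcsine scale can produce a value beyond π/2 (the image of a proportion of 1), which corresponds to no proportion at all. The function names and the two fitted values are assumptions.

```python
import math


def arcsine_sqrt(p: float) -> float:
    """Arcsine square root transform; maps [0, 1] onto [0, pi/2]."""
    return math.asin(math.sqrt(p))


def inv_arcsine_sqrt(y: float) -> float:
    """Naive back-transform; only meaningful for y in [0, pi/2]."""
    return math.sin(y) ** 2


def logit(p: float) -> float:
    """Log-odds; undefined at exactly p = 0 or p = 1."""
    return math.log(p / (1.0 - p))


def inv_logit(x: float) -> float:
    """Inverse logit; maps any real number into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


# Hypothetical fitted values from linear models on each transformed scale.
logit_prediction = 6.0
arcsine_prediction = 1.8  # larger than pi/2, so outside the valid range

p_logit = inv_logit(logit_prediction)
p_arcsine = inv_arcsine_sqrt(arcsine_prediction)
print(p_logit, p_arcsine)
```

    Here the naive arcsine back-transform folds the out-of-range value back below 1, so a larger fitted value yields a smaller proportion, while the logit back-transform stays monotone and inside the unit interval.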

  8. Analysis of an environmental exposure health questionnaire in a metropolitan minority population utilizing logistic regression and Support Vector Machines.

    PubMed

    Chen, Chau-Kuang; Bruce, Michelle; Tyler, Lauren; Brown, Claudine; Garrett, Angelica; Goggins, Susan; Lewis-Polite, Brandy; Weriwoh, Mirabel L; Juarez, Paul D; Hood, Darryl B; Skelton, Tyler

    2013-02-01

    The goal of this study was to analyze a 54-item instrument for assessment of perception of exposure to environmental contaminants within the context of the built environment, or exposome. This exposome was defined in five domains to include 1) home and hobby, 2) school, 3) community, 4) occupation, and 5) exposure history. Interviews were conducted with child-bearing-age minority women at Metro Nashville General Hospital at Meharry Medical College. Data were analyzed utilizing DTReg software for Support Vector Machine (SVM) modeling followed by an SPSS package for a logistic regression model. The target (outcome) variable of interest was respondent's residence by ZIP code. The results demonstrate that the rank order of important variables with respect to SVM modeling versus traditional logistic regression models is almost identical. This is the first study documenting that SVM analysis has discriminate power for determination of higher-ordered spatial relationships on an environmental exposure history questionnaire.

  9. Building a new predictor for multiple linear regression technique-based corrective maintenance turnaround time.

    PubMed

    Cruz, Antonio M; Barr, Cameron; Puñales-Pozo, Elsa

    2008-01-01

    This research's main goals were to build a predictor for a turnaround time (TAT) indicator for estimating its values and use a numerical clustering technique for finding possible causes of undesirable TAT values. The following stages were used: domain understanding, data characterisation and sample reduction and insight characterisation. Building the TAT indicator multiple linear regression predictor and clustering techniques were used for improving corrective maintenance task efficiency in a clinical engineering department (CED). The indicator being studied was turnaround time (TAT). Multiple linear regression was used for building a predictive TAT value model. The variables contributing to such model were clinical engineering department response time (CE(rt), 0.415 positive coefficient), stock service response time (Stock(rt), 0.734 positive coefficient), priority level (0.21 positive coefficient) and service time (0.06 positive coefficient). The regression process showed heavy reliance on Stock(rt), CE(rt) and priority, in that order. Clustering techniques revealed the main causes of high TAT values. This examination has provided a means for analysing current technical service quality and effectiveness. In doing so, it has demonstrated a process for identifying areas and methods of improvement and a model against which to analyse these methods' effectiveness.

  10. Prescription-drug-related risk in driving: comparing conventional and lasso shrinkage logistic regressions.

    PubMed

    Avalos, Marta; Adroher, Nuria Duran; Lagarde, Emmanuel; Thiessard, Frantz; Grandvalet, Yves; Contrand, Benjamin; Orriols, Ludivine

    2012-09-01

    Large data sets with many variables provide particular challenges when constructing analytic models. Lasso-related methods provide a useful tool, although one that remains unfamiliar to most epidemiologists. We illustrate the application of lasso methods in an analysis of the impact of prescribed drugs on the risk of a road traffic crash, using a large French nationwide database (PLoS Med 2010;7:e1000366). In the original case-control study, the authors analyzed each exposure separately. We use the lasso method, which can simultaneously perform estimation and variable selection in a single model. We compare point estimates and confidence intervals using (1) a separate logistic regression model for each drug with a Bonferroni correction and (2) lasso shrinkage logistic regression analysis. Shrinkage regression had little effect on (bias corrected) point estimates, but led to less conservative results, noticeably for drugs with moderate levels of exposure. Carbamates, carboxamide derivative and fatty acid derivative antiepileptics, drugs used in opioid dependence, and mineral supplements of potassium showed stronger associations. Lasso is a relevant method in the analysis of databases with large number of exposures and can be recommended as an alternative to conventional strategies.

  11. Shrinkage regression-based methods for microarray missing value imputation.

    PubMed

    Wang, Hsiuying; Chiu, Chia-Chun; Wu, Yi-Ching; Wu, Wei-Sheng

    2013-01-01

    Missing values commonly occur in the microarray data, which usually contain more than 5% missing values with up to 90% of genes affected. Inaccurate missing value estimation results in reducing the power of downstream microarray data analyses. Many types of methods have been developed to estimate missing values. Among them, the regression-based methods are very popular and have been shown to perform better than the other types of methods in many testing microarray datasets. To further improve the performances of the regression-based methods, we propose shrinkage regression-based methods. Our methods take the advantage of the correlation structure in the microarray data and select similar genes for the target gene by Pearson correlation coefficients. Besides, our methods incorporate the least squares principle, utilize a shrinkage estimation approach to adjust the coefficients of the regression model, and then use the new coefficients to estimate missing values. Simulation results show that the proposed methods provide more accurate missing value estimation in six testing microarray datasets than the existing regression-based methods do. Imputation of missing values is a very important aspect of microarray data analyses because most of the downstream analyses require a complete dataset. Therefore, exploring accurate and efficient methods for estimating missing values has become an essential issue. Since our proposed shrinkage regression-based methods can provide accurate missing value estimation, they are competitive alternatives to the existing regression-based methods.

  12. Can Predictive Modeling Identify Head and Neck Oncology Patients at Risk for Readmission?

    PubMed

    Manning, Amy M; Casper, Keith A; Peter, Kay St; Wilson, Keith M; Mark, Jonathan R; Collar, Ryan M

    2018-05-01

    Objective: Unplanned readmission within 30 days is a contributor to health care costs in the United States. The use of predictive modeling during hospitalization to identify patients at risk for readmission offers a novel approach to quality improvement and cost reduction. Study Design: Two-phase study including retrospective analysis of prospectively collected data followed by prospective longitudinal study. Setting: Tertiary academic medical center. Subjects and Methods: Prospectively collected data for patients undergoing surgical treatment for head and neck cancer from January 2013 to January 2015 were used to build predictive models for readmission within 30 days of discharge using logistic regression, classification and regression tree (CART) analysis, and random forests. One model (logistic regression) was then placed prospectively into the discharge workflow from March 2016 to May 2016 to determine the model's ability to predict which patients would be readmitted within 30 days. Results: In total, 174 admissions had descriptive data. Thirty-two were excluded due to incomplete data. Logistic regression, CART, and random forest predictive models were constructed using the remaining 142 admissions. When applied to 106 consecutive prospective head and neck oncology patients at the time of discharge, the logistic regression model predicted readmissions with a specificity of 94%, a sensitivity of 47%, a negative predictive value of 90%, and a positive predictive value of 62% (odds ratio, 14.9; 95% confidence interval, 4.02-55.45). Conclusion: Prospectively collected head and neck cancer databases can be used to develop predictive models that can accurately predict which patients will be readmitted. This offers valuable support for quality improvement initiatives and readmission-related cost reduction in head and neck cancer care.
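
    The performance measures quoted in the results all derive from a 2×2 table of predicted versus observed readmissions. The abstract does not report that table, so the counts in the Python sketch below are hypothetical: they were chosen only because they sum to 106 and reproduce the published percentages, and the actual table may differ. The confidence interval uses the standard Wald formula on the log odds ratio, which is also an assumption about the method used.

```python
import math

# Hypothetical 2x2 counts (NOT reported in the abstract), summing to 106:
# tp = predicted and readmitted, fn = missed readmissions,
# fp = predicted but not readmitted, tn = correctly predicted no readmission.
tp, fn, fp, tn = 8, 9, 5, 84

sensitivity = tp / (tp + fn)   # share of readmissions that were flagged
specificity = tn / (tn + fp)   # share of non-readmissions left unflagged
ppv = tp / (tp + fp)           # share of flagged patients who were readmitted
npv = tn / (tn + fn)           # share of unflagged patients not readmitted

# Diagnostic odds ratio with a Wald 95% confidence interval.
odds_ratio = (tp * tn) / (fp * fn)
se_log_or = math.sqrt(1 / tp + 1 / fn + 1 / fp + 1 / tn)
ci_low = math.exp(math.log(odds_ratio) - 1.96 * se_log_or)
ci_high = math.exp(math.log(odds_ratio) + 1.96 * se_log_or)

print(f"sensitivity={sensitivity:.2f} specificity={specificity:.2f}")
print(f"PPV={ppv:.2f} NPV={npv:.2f}")
print(f"OR={odds_ratio:.1f} (95% CI {ci_low:.2f}-{ci_high:.2f})")
```

    With these counts the measures round to 47%, 94%, 62% and 90%, and the odds ratio to 14.9 with an interval of about 4.02 to 55.45, in line with the reported figures.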

  13. Utility of an Abbreviated Dizziness Questionnaire to Differentiate between Causes of Vertigo and Guide Appropriate Referral: A Multicenter Prospective Blinded Study

    PubMed Central

    Roland, Lauren T.; Kallogjeri, Dorina; Sinks, Belinda C.; Rauch, Steven D.; Shepard, Neil T.; White, Judith A.; Goebel, Joel A.

    2015-01-01

    Objective: Test performance of a focused dizziness questionnaire’s ability to discriminate between peripheral and non-peripheral causes of vertigo. Study Design: Prospective multi-center. Setting: Four academic centers with experienced balance specialists. Patients: New dizzy patients. Interventions: A 32-question survey was given to participants. Balance specialists were blinded and a diagnosis was established for all participating patients within 6 months. Main outcomes: Multinomial logistic regression was used to evaluate questionnaire performance in predicting final diagnosis and differentiating between peripheral and non-peripheral vertigo. Univariate and multivariable stepwise logistic regression were used to identify questions as significant predictors of the ultimate diagnosis. C-index was used to evaluate performance and discriminative power of the multivariable models. Results: 437 patients participated in the study. Eight participants without confirmed diagnoses were excluded and 429 were included in the analysis. Multinomial regression revealed that the model had good overall predictive accuracy of 78.5% for the final diagnosis and 75.5% for differentiating between peripheral and non-peripheral vertigo. Univariate logistic regression identified significant predictors of three main categories of vertigo: peripheral, central and other. Predictors were entered into forward stepwise multivariable logistic regression. The discriminative power of the final models for peripheral, central and other causes was considered good as measured by c-indices of 0.75, 0.7 and 0.78, respectively. Conclusions: This multicenter study demonstrates a focused dizziness questionnaire can accurately predict diagnosis for patients with chronic/relapsing dizziness referred to outpatient clinics. Additionally, this survey has significant capability to differentiate peripheral from non-peripheral causes of vertigo and may, in the future, serve as a screening tool for specialty referral. Clinical utility of this questionnaire to guide specialty referral is discussed. PMID:26485598

  14. Utility of an Abbreviated Dizziness Questionnaire to Differentiate Between Causes of Vertigo and Guide Appropriate Referral: A Multicenter Prospective Blinded Study.

    PubMed

    Roland, Lauren T; Kallogjeri, Dorina; Sinks, Belinda C; Rauch, Steven D; Shepard, Neil T; White, Judith A; Goebel, Joel A

    2015-12-01

    Test performance of a focused dizziness questionnaire's ability to discriminate between peripheral and nonperipheral causes of vertigo. Prospective multicenter. Four academic centers with experienced balance specialists. New dizzy patients. A 32-question survey was given to participants. Balance specialists were blinded and a diagnosis was established for all participating patients within 6 months. Multinomial logistic regression was used to evaluate questionnaire performance in predicting final diagnosis and differentiating between peripheral and nonperipheral vertigo. Univariate and multivariable stepwise logistic regression were used to identify questions as significant predictors of the ultimate diagnosis. C-index was used to evaluate performance and discriminative power of the multivariable models. In total, 437 patients participated in the study. Eight participants without confirmed diagnoses were excluded and 429 were included in the analysis. Multinomial regression revealed that the model had good overall predictive accuracy of 78.5% for the final diagnosis and 75.5% for differentiating between peripheral and nonperipheral vertigo. Univariate logistic regression identified significant predictors of three main categories of vertigo: peripheral, central, and other. Predictors were entered into forward stepwise multivariable logistic regression. The discriminative power of the final models for peripheral, central, and other causes was considered good as measured by c-indices of 0.75, 0.7, and 0.78, respectively. This multicenter study demonstrates a focused dizziness questionnaire can accurately predict diagnosis for patients with chronic/relapsing dizziness referred to outpatient clinics. Additionally, this survey has significant capability to differentiate peripheral from nonperipheral causes of vertigo and may, in the future, serve as a screening tool for specialty referral. Clinical utility of this questionnaire to guide specialty referral is discussed.
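
    The c-index used in this record to judge discriminative power can be read as the share of (case, non-case) pairs in which the model gives the case the higher predicted probability. The following is a minimal sketch of that definition, not the authors' code; the function name `c_index` and the six example patients are hypothetical.

```python
def c_index(scores, labels):
    """Share of (positive, negative) pairs ranked correctly; ties count as half."""
    positives = [s for s, y in zip(scores, labels) if y == 1]
    negatives = [s for s, y in zip(scores, labels) if y == 0]
    if not positives or not negatives:
        raise ValueError("need at least one positive and one negative case")
    concordant = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                concordant += 1.0   # positive case ranked above negative case
            elif p == n:
                concordant += 0.5   # tie contributes half a concordant pair
    return concordant / (len(positives) * len(negatives))


# Hypothetical predicted probabilities of peripheral vertigo for six patients
predicted = [0.91, 0.72, 0.65, 0.40, 0.33, 0.10]
peripheral = [1, 1, 0, 1, 0, 0]  # 1 = confirmed peripheral cause
print(round(c_index(predicted, peripheral), 3))
```

    A value of 0.5 corresponds to chance-level ranking and 1.0 to perfect separation, which is the scale on which the reported 0.75, 0.7, and 0.78 are judged.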

  15. Prediction of cold and heat patterns using anthropometric measures based on machine learning.

    PubMed

    Lee, Bum Ju; Lee, Jae Chul; Nam, Jiho; Kim, Jong Yeol

    2018-01-01

    To examine the association of body shape with cold and heat patterns, to determine which anthropometric measure is the best indicator for discriminating between the two patterns, and to investigate whether using a combination of measures can improve the predictive power to diagnose these patterns. Based on a total of 4,859 subjects (3,000 women and 1,859 men), statistical analyses using binary logistic regression were performed to assess the significance of the difference and the predictive power of each anthropometric measure, and binary logistic regression and Naive Bayes with the variable selection technique were used to assess the improvement in the predictive power of the patterns using the combined measures. In women, the strongest indicators for determining the cold and heat patterns among anthropometric measures were body mass index (BMI) and rib circumference; in men, the best indicator was BMI. In experiments using a combination of measures, the values of the area under the receiver operating characteristic curve in women were 0.776 by Naive Bayes and 0.772 by logistic regression, and the values in men were 0.788 by Naive Bayes and 0.779 by logistic regression. Individuals with a higher BMI have a tendency toward a heat pattern in both women and men. The use of a combination of anthropometric measures can slightly improve the diagnostic accuracy. Our findings can provide fundamental information for the diagnosis of cold and heat patterns based on body shape for personalized medicine.

  16. Application of classification tree and logistic regression for the management and health intervention plans in a community-based study.

    PubMed

    Teng, Ju-Hsi; Lin, Kuan-Chia; Ho, Bin-Shenq

    2007-10-01

    A community-based aboriginal study was conducted and analysed to explore the application of classification tree and logistic regression. A total of 1066 aboriginal residents in Yilan County were screened during 2003-2004. The independent variables include demographic characteristics, physical examinations, geographic location, health behaviours, dietary habits and family history of hereditary diseases. Risk factors of cardiovascular diseases were selected as the dependent variables in further analysis. The completion rate for the health interview is 88.9%. The classification tree results find that if body mass index is higher than 25.72 kg/m² and the age is above 51 years, the predicted probability for number of cardiovascular risk factors ≥3 is 73.6% and the population is 322. If body mass index is higher than 26.35 kg/m² and geographical latitude of the village is lower than 24°22.8′, the predicted probability for number of cardiovascular risk factors ≥4 is 60.8% and the population is 74. The logistic regression results indicate that body mass index, drinking habit and menopause are the top three significant independent variables. The classification tree model specifically shows the discrimination paths and interactions between the risk groups. The logistic regression model presents and analyses the statistically independent factors of cardiovascular risks. Applying both models to specific situations will provide a different angle for the design and management of future health intervention plans after community-based study.

  17. Risk factors for pedicled flap necrosis in hand soft tissue reconstruction: a multivariate logistic regression analysis.

    PubMed

    Gong, Xu; Cui, Jianli; Jiang, Ziping; Lu, Laijin; Li, Xiucun

    2018-03-01

    Few clinical retrospective studies have reported the risk factors of pedicled flap necrosis in hand soft tissue reconstruction. The aim of this study was to identify non-technical risk factors associated with pedicled flap perioperative necrosis in hand soft tissue reconstruction via a multivariate logistic regression analysis. For patients with hand soft tissue reconstruction, we carefully reviewed hospital records and identified 163 patients who met the inclusion criteria. The characteristics of these patients, flap transfer procedures and postoperative complications were recorded. Eleven predictors were identified. The correlations between pedicled flap necrosis and risk factors were analysed using a logistic regression model. Of 163 skin flaps, 125 flaps survived completely without any complications. The pedicled flap necrosis rate in hands was 11.04%, which included partial flap necrosis (7.36%) and total flap necrosis (3.68%). Soft tissue defects in fingers were noted in 68.10% of all cases. The logistic regression analysis indicated that the soft tissue defect site (P = 0.046, odds ratio (OR) = 0.079, confidence interval (CI) (0.006, 0.959)), flap size (P = 0.020, OR = 1.024, CI (1.004, 1.045)) and postoperative wound infection (P < 0.001, OR = 17.407, CI (3.821, 79.303)) were statistically significant risk factors for pedicled flap necrosis of the hand. Soft tissue defect site, flap size and postoperative wound infection were risk factors associated with pedicled flap necrosis in hand soft tissue defect reconstruction. © 2017 Royal Australasian College of Surgeons.
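
    The odds ratios and confidence intervals in this record can be mapped back to the logistic regression coefficient scale. The sketch below assumes the intervals are 95% Wald intervals (coefficient ± 1.96 standard errors on the log-odds scale), which the abstract does not state; the function name is hypothetical.

```python
import math


def wald_from_odds_ratio(odds_ratio, ci_low, ci_high, z_crit=1.96):
    """Recover coefficient, standard error, z and two-sided p from an OR and its CI.

    Assumes a symmetric Wald interval on the log-odds scale.
    """
    beta = math.log(odds_ratio)                                   # log-odds coefficient
    se = (math.log(ci_high) - math.log(ci_low)) / (2 * z_crit)    # half-width / z
    z = beta / se
    p = math.erfc(abs(z) / math.sqrt(2))                          # two-sided normal p-value
    return beta, se, z, p


# Flap size, as reported in the abstract: OR = 1.024, CI (1.004, 1.045)
beta, se, z, p = wald_from_odds_ratio(1.024, 1.004, 1.045)
print(f"beta={beta:.4f} se={se:.4f} z={z:.2f} p={p:.3f}")
```

    Under that assumption, the back-calculated p-value for flap size should land close to the reported P = 0.020. Rounding of the published interval limits how closely such a check can match, especially for very wide intervals.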

  18. The microcomputer scientific software series 2: general linear model--regression.

    Treesearch

    Harold M. Rauscher

    1983-01-01

    The general linear model regression (GLMR) program provides the microcomputer user with a sophisticated regression analysis capability. The output provides a regression ANOVA table, estimators of the regression model coefficients, their confidence intervals, confidence intervals around the predicted Y-values, residuals for plotting, a check for multicollinearity, a...

  19. [Predicting the probability of development and progression of primary open angle glaucoma by regression modeling].

    PubMed

    Likhvantseva, V G; Sokolov, V A; Levanova, O N; Kovelenova, I V

    2018-01-01

    Prediction of the clinical course of primary open-angle glaucoma (POAG) is one of the main directions in solving the problem of vision loss prevention and stabilization of the pathological process. Simple statistical methods of correlation analysis show the extent of each risk factor's impact, but do not indicate the total impact of these factors in personalized combinations. The relationships between the risk factors are subject to correlation and regression analysis. The regression equation represents the dependence of the mathematical expectation of the resulting sign on the combination of factor signs. To develop a technique for predicting the probability of development and progression of primary open-angle glaucoma based on a personalized combination of risk factors by linear multivariate regression analysis. The study included 66 patients (23 female and 43 male; 132 eyes) with newly diagnosed primary open-angle glaucoma. The control group consisted of 14 patients (8 male and 6 female). Standard ophthalmic examination was supplemented with biochemical study of lacrimal fluid. Concentration of matrix metalloproteinase MMP-2 and MMP-9 in tear fluid in both eyes was determined using 'sandwich' enzyme-linked immunosorbent assay (ELISA) method. The study resulted in the development of regression equations and step-by-step multivariate logistic models that can help calculate the risk of development and progression of POAG. Those models are based on expert evaluation of clinical and instrumental indicators of hydrodynamic disturbances (coefficient of outflow ease - C, volume of intraocular fluid secretion - F, fluctuation of intraocular pressure), as well as personalized morphometric parameters of the retina (central retinal thickness in the macular area) and concentration of MMP-2 and MMP-9 in the tear film. The newly developed regression equations are highly informative and can be a reliable tool for studying the influence vector and assessing the pathogenic potential of the independent risk factors in specific personalized combinations.

  20. Serum lipid level and lifestyles are associated with carotid femoral pulse wave velocity among adults: 4.4-year prospectively longitudinal follow-up of a clinical trial.

    PubMed

    Zhao, XiaoXiao; Wang, Hongyu; Bo, LiuJin; Zhao, Hongwei; Li, Lihong; Zhou, Yingyan

    2018-01-01

    Lifestyle modifications are recommended as the initial treatment for high blood pressure. The influence of dyslipidemia might be via moderate arterial stiffness, which results in hypertension and cardiovascular disease. We used data from a subgroup of the lifestyle, level of serum lipids/carotid femoral-pulse wave velocity (CF-PWV) Susceptibility BEST Study, a population-based study of community-dwelling adults aged 45-75 years. The serum lipid level and CF-PWV were measured at baseline, and lifestyle such as smoking status, sleeping habits, and the level of oil or salt intake was determined with the use of a validated questionnaire during follow-up. Arterial stiffness was determined as CF-PWV using an electrocardiogram after a mean follow-up of 4.4 years. Regression coefficients (95% CIs), adjusted for demographics, risk factors, cholesterol, and triglycerides (TGs), were calculated by linear regression. Logistic regression analysis was used to identify the association between the variables with CF-PWV independently. In the results, glucose and total cholesterol (TC) were associated with higher CF-PWV (p = 0.000) and low-density lipoprotein was associated with lower CF-PWV (p = 0.001) after adjustments for age, sex, mean arterial pressure, and heart rate. There were significant associations observed for current salt intake in relation to CF-PWV (p-trend = 0.038) without adjustment. This association was retained after adjustments for covariates and had statistical significance (p-trend = 0.048) in model 3, which adjusted for age, sex, baseline CF-PWV, mean arterial pressure, heart rate, waist circumference, education, smoking status, physical activity, diabetes mellitus (DM), heart disease, high-density lipoprotein (HDL) cholesterol, low-density lipoprotein (LDL) cholesterol, TGs, antihypertensive medicine, nitrate medicine, and antiplatelet medicine. Linear regression showed statistically significant associations between LDL and CF-PWV in the fully adjusted models (model 1 p = 0.010, model 2 p = 0.020, model 3 p = 0.017). Logistic regression analysis showed that CF-PWV was independently associated with age (p = 0.000), TC (p = 0.000), TGs (p = 0.000), and homocysteine (p = 0.000), and their odds ratios were 0.781, 3.424, 0.075, and 1.046, respectively. Our results showed a positive association between LDL and arterial stiffness, and suggested that less smoking, fewer sleeping disorders, and lower salt intake were associated with less arterial stiffness.

  1. Logistic Mixed Models to Investigate Implicit and Explicit Belief Tracking

    PubMed Central

    Lages, Martin; Scheel, Anne

    2016-01-01

    We investigated the proposition of a two-systems Theory of Mind in adults’ belief tracking. A sample of N = 45 participants predicted the choice of one of two opponent players after observing several rounds in an animated card game. Three matches of this card game were played and initial gaze direction on target and subsequent choice predictions were recorded for each belief task and participant. We conducted logistic regressions with mixed effects on the binary data and developed Bayesian logistic mixed models to infer implicit and explicit mentalizing in true belief and false belief tasks. Although logistic regressions with mixed effects predicted the data well, a Bayesian logistic mixed model with latent task- and subject-specific parameters gave a better account of the data. As expected, explicit choice predictions suggested a clear understanding of true and false beliefs (TB/FB). Surprisingly, however, model parameters for initial gaze direction also indicated belief tracking. We discuss why task-specific parameters for initial gaze directions are different from choice predictions yet reflect second-order perspective taking. PMID:27853440

  2. A risk score to predict the incidence of prolonged air leak after video-assisted thoracoscopic lobectomy: An analysis from the European Society of Thoracic Surgeons database.

    PubMed

    Pompili, Cecilia; Falcoz, Pierre Emmanuel; Salati, Michele; Szanto, Zalan; Brunelli, Alessandro

    2017-04-01

    The study objective was to develop an aggregate risk score for predicting the occurrence of prolonged air leak after video-assisted thoracoscopic lobectomy from patients registered in the European Society of Thoracic Surgeons database. A total of 5069 patients who underwent video-assisted thoracoscopic lobectomy (July 2007 to August 2015) were analyzed. Exclusion criteria included sublobar resections or pneumonectomies, lung resection associated with chest wall or diaphragm resections, sleeve resections, and need for postoperative assisted mechanical ventilation. Prolonged air leak was defined as an air leak lasting more than 5 days. Several baseline and surgical variables were tested for a possible association with prolonged air leak using univariable and logistic regression analyses, determined by bootstrap resampling. Predictors were proportionally weighed according to their regression estimates (assigning 1 point to the smallest coefficient). Prolonged air leak was observed in 504 patients (9.9%). Three variables were found associated with prolonged air leak after logistic regression: male gender (P < .0001, score = 1), forced expiratory volume in 1 second less than 80% (P < .0001, score = 1), and body mass index less than 18.5 kg/m² (P < .0001, score = 2). The aggregate prolonged air leak risk score was calculated for each patient by summing the individual scores assigned to each variable (range, 0-4). Patients were then grouped into 4 classes with an incremental risk of prolonged air leak (P < .0001): class A (score 0 points, 1493 patients) 6.3% with prolonged air leak, class B (score 1 point, 2240 patients) 10% with prolonged air leak, class C (score 2 points, 1219 patients) 13% with prolonged air leak, and class D (score >2 points, 117 patients) 25% with prolonged air leak. An aggregate risk score was created to stratify the incidence of prolonged air leak after video-assisted thoracoscopic lobectomy. The score can be used for patient counseling and to identify those patients who can benefit from additional intraoperative preventative measures. Copyright © 2016 The American Association for Thoracic Surgery. Published by Elsevier Inc. All rights reserved.
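
    The scoring scheme in this record (points proportional to regression estimates, summed per patient, then binned into classes) can be sketched as follows. The point values and class cut-offs are taken from the abstract; the coefficient values in `example_coefs` are hypothetical placeholders, since the abstract does not report them, and all function names are illustrative.

```python
# Points per predictor as reported in the abstract
POINTS = {"male": 1, "fev1_below_80": 1, "bmi_below_18_5": 2}


def points_from_coefficients(coefs):
    """Weight each predictor by its coefficient relative to the smallest one.

    Assumes all coefficients are positive; the smallest receives 1 point.
    """
    smallest = min(coefs.values())
    return {name: round(value / smallest) for name, value in coefs.items()}


def risk_score(male, fev1_below_80, bmi_below_18_5):
    """Sum the points of the risk factors that are present (range 0-4)."""
    present = {"male": male, "fev1_below_80": fev1_below_80,
               "bmi_below_18_5": bmi_below_18_5}
    return sum(POINTS[name] for name, flag in present.items() if flag)


def risk_class(score):
    """Map the aggregate score to the four classes described in the abstract."""
    if score == 0:
        return "A"
    if score == 1:
        return "B"
    if score == 2:
        return "C"
    return "D"  # score > 2


# Hypothetical coefficients, chosen only to illustrate the weighting step
example_coefs = {"male": 0.42, "fev1_below_80": 0.47, "bmi_below_18_5": 0.88}
print(points_from_coefficients(example_coefs))
print(risk_class(risk_score(male=True, fev1_below_80=False, bmi_below_18_5=True)))
```

    Rounding the coefficient ratios to integers trades some precision for a score that can be added up at the bedside, which is the usual motivation for this kind of point system.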

  3. Description of Aspergillus flavus growth under the influence of different factors (water activity, incubation temperature, protein and fat concentration, pH, and cinnamon essential oil concentration) by kinetic, probability of growth, and time-to-detection models.

    PubMed

    Kosegarten, Carlos E; Ramírez-Corona, Nelly; Mani-López, Emma; Palou, Enrique; López-Malo, Aurelio

    2017-01-02

    A Box-Behnken design was used to determine the effect of protein concentration (0, 5, or 10 g of casein/100 g), fat (0, 3, or 6 g of corn oil/100 g), a_w (0.900, 0.945, or 0.990), pH (3.5, 5.0, or 6.5), concentration of cinnamon essential oil (CEO, 0, 200, or 400 μL/kg) and incubation temperature (15, 25, or 35 °C) on the growth of Aspergillus flavus during 50 days of incubation. Mold response under the evaluated conditions was modeled by the modified Gompertz equation, logistic regression, and time-to-detection model. The obtained polynomial regression models allow the significant coefficients (p < 0.05) for linear, quadratic and interaction effects for the Gompertz equation's parameters to be identified, which adequately described (R² > 0.967) the studied mold responses. After 50 days of incubation, every tested model system was classified according to the observed response as 1 (growth) or 0 (no growth), then a binary logistic regression was utilized to model the A. flavus growth interface, allowing prediction of the probability of mold growth under selected combinations of tested factors. The time-to-detection model was utilized to estimate the time at which A. flavus visible growth begins. Water activity, temperature, and CEO concentration were the most important factors affecting fungal growth. It was observed that there is a range of possible combinations that may induce growth, such that incubation conditions and the amount of essential oil necessary for fungal growth inhibition strongly depend on protein and fat concentrations as well as on the pH of studied model systems. The probabilistic model and the time-to-detection models constitute another option to determine appropriate storage/processing conditions and accurately predict the probability and/or the time at which A. flavus growth occurs. Copyright © 2016 Elsevier B.V. All rights reserved.
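
    Two of the model types named in this record can be sketched briefly. The abstract does not give the exact form of its modified Gompertz equation, so the code assumes the widely used Zwietering reparameterization (asymptote, maximum growth rate, lag time); the parameter values and the linear predictor are hypothetical.

```python
import math


def modified_gompertz(t, a, mu_max, lag):
    """Zwietering-style modified Gompertz curve (assumed form).

    a      : asymptotic response (e.g. maximum colony diameter)
    mu_max : maximum growth rate
    lag    : lag time before growth starts
    """
    return a * math.exp(-math.exp(mu_max * math.e / a * (lag - t) + 1.0))


def growth_probability(linear_predictor):
    """Binary logistic model: probability of growth (1) versus no growth (0)."""
    return 1.0 / (1.0 + math.exp(-linear_predictor))


# Hypothetical parameters: 40 mm asymptote, 3 mm/day maximum rate, 5-day lag
for day in (0, 5, 10, 20, 50):
    print(day, round(modified_gompertz(day, a=40.0, mu_max=3.0, lag=5.0), 2))

# Hypothetical linear predictor built from factor levels and fitted coefficients
print(round(growth_probability(1.2), 3))
```

    In a study of this kind, the Gompertz parameters and the logistic linear predictor would each be expressed as polynomial functions of the design factors; the sketch only shows the outer functional forms.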

  4. Innovating patient care delivery: DSRIP's interrupted time series analysis paradigm.

    PubMed

    Shenoy, Amrita G; Begley, Charles E; Revere, Lee; Linder, Stephen H; Daiger, Stephen P

    2017-12-08

    Adoption of the Medicaid Section 1115 waiver is one of many ways of innovating the healthcare delivery system. The Delivery System Reform Incentive Payment (DSRIP) pool, one of the two funding pools of the waiver, has four categories: infrastructure development, program innovation and redesign, quality improvement reporting, and population health improvement. A metric of the fourth category, preventable hospitalization (PH) rate, was analyzed in the context of eight conditions for two time periods, pre-reporting years (2010-2012) and post-reporting years (2013-2015), for two hospital cohorts, DSRIP participating and non-participating hospitals. The study explains how DSRIP impacted PH rates of eight conditions for both hospital cohorts within two time periods. Eight PH rates were regressed as the dependent variable with time, intervention and post-DSRIP intervention as independent variables. PH rates of eight conditions were then consolidated into one rate for regressing with the above independent variables to evaluate the overall impact of DSRIP. An interrupted time series regression was performed after accounting for auto-correlation, stationarity and seasonality in the dataset. In the individual regression model, PH rates showed statistically significant coefficients for seven out of eight conditions in DSRIP participating hospitals. In the combined regression model, the coefficient of the PH rate showed a statistically significant decrease with negative p-values for regression coefficients in DSRIP participating hospitals compared to positive/increased p-values for regression coefficients in DSRIP non-participating hospitals. Several macro- and micro-level factors may have contributed to DSRIP hospitals outperforming DSRIP non-participating hospitals. Healthcare organization/provider collaboration, support from healthcare professionals, DSRIP's design, state reimbursement and coordination in care delivery methods may have led to the likely success of DSRIP. IV, a retrospective cohort study based on longitudinal data. Copyright © 2017 Elsevier Inc. All rights reserved.
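
    The three independent variables named in this record (time, intervention, post-intervention time) match the usual segmented-regression coding of an interrupted time series. The sketch below assumes that coding and fits it by ordinary least squares on a hypothetical, noise-free series; it leaves out the adjustments for autocorrelation, stationarity, and seasonality that the abstract mentions, and every name and number in it is illustrative.

```python
def design_row(t, start):
    """One row of the segmented-regression design matrix.

    Columns: intercept, time, intervention indicator, time since intervention.
    """
    post = 1.0 if t >= start else 0.0
    return [1.0, float(t), post, post * (t - start)]


def solve(a, b):
    """Solve the linear system a x = b by Gauss-Jordan elimination with pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                factor = m[r][col] / m[col][col]
                m[r] = [x - factor * y for x, y in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]


def fit_its(y, start):
    """Ordinary least squares via the normal equations (X'X) beta = X'y."""
    rows = [design_row(t, start) for t in range(len(y))]
    k = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * yi for r, yi in zip(rows, y)) for i in range(k)]
    return solve(xtx, xty)


# Hypothetical rate series: 6 pre-intervention and 6 post-intervention periods
true_beta = [10.0, 0.5, -2.0, -0.3]  # baseline, trend, level change, slope change
series = [sum(b * x for b, x in zip(true_beta, design_row(t, 6))) for t in range(12)]
print([round(b, 3) for b in fit_its(series, start=6)])
```

    The third coefficient estimates the immediate level change at the intervention and the fourth the change in slope afterwards; comparing these between participating and non-participating cohorts is the kind of contrast the record describes.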

  5. Carotid artery intima-media complex thickening in patients with relatively long-surviving type 1 diabetes mellitus.

    PubMed

    Distiller, Larry A; Joffe, Barry I; Melville, Vanessa; Welman, Tania; Distiller, Greg B

    2006-01-01

    The factors responsible for premature coronary atherosclerosis in patients with type 1 diabetes are ill defined. We therefore assessed carotid intima-media complex thickness (IMT) in relatively long-surviving patients with type 1 diabetes as a marker of atherosclerosis and correlated this with traditional risk factors. Cross-sectional study of 148 patients with relatively long-surviving (>18 years) type 1 diabetes (76 men and 72 women) attending the Centre for Diabetes and Endocrinology, Johannesburg. The mean common carotid artery IMT and presence or absence of plaque was evaluated by high-resolution B-mode ultrasound. Their median age was 48 years and duration of diabetes 26 years (range 18-59 years). Traditional risk factors (age, duration of diabetes, glycemic control, hypertension, smoking and lipoprotein concentrations) were recorded. Three response variables were defined and modeled. Standard multiple regression was used for a continuous IMT variable, logistic regression for the presence/absence of plaque and ordinal logistic regression to model three categories of "risk." The median common carotid IMT was 0.62 mm (range 0.44-1.23 mm) with plaque detected in 28 cases. The multiple regression model found significant associations between IMT and current age (P=.001), duration of diabetes (P=.033), BMI (P=.008) and diagnosed hypertension (P=.046) with HDL showing a protective effect (P=.022). Current age (P=.001) and diagnosed hypertension (P=.004), smoking (P=.008) and retinopathy (P=.033) were significant in the logistic regression model. Current age was also significant in the ordinal logistic regression model (P<.001), as was total cholesterol/HDL ratio (P<.001) and mean HbA1c concentration (P=.073). The major factors influencing common carotid IMT in patients with relatively long-surviving type 1 diabetes are age, duration of diabetes, existing hypertension and HDL (protective) with a relatively minor role ascribed to relatively long-standing glycemic control.

  6. Evaluation of a Microbiological Multi-Residue System on the detection of antibacterial substances in ewe milk.

    PubMed

    Althaus, Rafael; Berruga, Maria Isabel; Montero, Ana; Roca, Marta; Molina, Maria Pilar

    2009-01-19

    To protect both public health and the dairy industry from the presence of antibiotic residues in milk, control programmes have been established, which include the needed screening tests. This work focuses on the application of a Microbiological Multi-Residue System in ewe milk, a method based on the use of six different plates, each seeded with one of the following bacteria: Geobacillus stearothermophilus var. calidolactis (beta-lactams), Bacillus subtilis at pH 8.0 (aminoglycosides), Kocuria rhizophila (macrolides), Escherichia coli (quinolones), B. cereus (tetracyclines) and B. subtilis at pH 7.0 (sulphonamides), respectively. Twenty-three antimicrobial substances were analysed and a logistic regression was established for each substance assayed to relate the antibiotic concentration and the zone of microbial growth inhibition. Great linearity in the response was observed (regression coefficients of over 0.97). This fact suggests the possibility of establishing a decision level of antibiotic concentrations near the Maximum Residue Limits (MRL). Zones of inhibition were suggested as proposed action levels for the different antimicrobial groups (diameters of inhibition of 18 mm for the aminoglycoside, beta-lactam and sulphonamide plates; 19 mm for the tetracycline plate, 21 mm for the macrolide plate, and 24 mm for the quinolone plate). Specificity and cross-reactivity were also assayed.

  7. Determinants of the lethality of climate-related disasters in the Caribbean Community (CARICOM): a cross-country analysis

    PubMed Central

    Andrewin, Aisha N.; Rodriguez-Llanes, Jose M.; Guha-Sapir, Debarati

    2015-01-01

    Floods and storms are climate-related hazards posing high mortality risk to Caribbean Community (CARICOM) nations. However, risk factors for their lethality remain untested. We conducted an ecological study investigating risk factors for flood and storm lethality in CARICOM nations for the period 1980–2012. Lethality (deaths versus no deaths per disaster event) was the outcome. We examined biophysical and social vulnerability proxies and a decadal effect as predictors. We developed our regression model via multivariate analysis using a generalized logistic regression model with quasi-binomial distribution, removal of multi-collinear variables, and backward elimination. Robustness was checked through subset analysis. We found significant positive associations between lethality and both the percentage of total land dedicated to agriculture (odds ratio [OR] 1.032; 95% CI: 1.013–1.053) and the percentage urban population (OR 1.029, 95% CI 1.003–1.057). Deaths were more likely in the 2000–2012 period versus 1980–1989 (OR 3.708, 95% CI 1.615–8.737). Robustness checks revealed similar coefficients and directions of association. Population health in CARICOM nations is being increasingly impacted by climate-related disasters connected to increasing urbanization and land use patterns. Our findings support the evidence base for setting sustainable development goals (SDG). PMID:26153115
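
    Odds ratios close to 1 for a percentage-scale predictor can look negligible until they are rescaled. The sketch below assumes the reported OR of 1.032 refers to a one-percentage-point increase in agricultural land (the abstract does not state the unit) and that the effect is linear on the log-odds scale; the function name is hypothetical.

```python
import math


def rescale_odds_ratio(odds_ratio, units):
    """Odds ratio for a change of `units` in the predictor, given the per-unit OR.

    Works on the log-odds scale: exp(units * ln(OR)).
    """
    return math.exp(units * math.log(odds_ratio))


# Reported per-unit OR and 95% CI for percentage of land dedicated to agriculture
per_unit = {"or": 1.032, "low": 1.013, "high": 1.053}
per_ten = {key: rescale_odds_ratio(value, 10) for key, value in per_unit.items()}
print({key: round(value, 3) for key, value in per_ten.items()})
```

    Because the rescaling is a monotone transformation, applying it to the interval limits gives the interval for the rescaled odds ratio under the same assumptions.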

  8. Association of serum uric acid with high-sensitivity C-reactive protein in postmenopausal women.

    PubMed

    Raeisi, A; Ostovar, A; Vahdat, K; Rezaei, P; Darabi, H; Moshtaghi, D; Nabipour, I

    2017-02-01

    To explore the independent correlation between serum uric acid and low-grade inflammation (measured by high-sensitivity C-reactive protein, hs-CRP) in postmenopausal women. A total of 378 healthy Iranian postmenopausal women were randomly selected in a population-based study. Circulating hs-CRP levels were measured by a highly specific enzyme-linked immunosorbent assay method, and an enzymatic colorimetric method was used to measure serum levels of uric acid. Pearson correlation coefficient, multiple linear regression and logistic regression models were used to analyze the association between uric acid and hs-CRP levels. A statistically significant correlation was seen between serum levels of uric acid and log-transformed circulating hs-CRP (r = 0.25, p < 0.001). After adjustment for age and cardiovascular risk factors (according to NCEP ATP III criteria), circulating hs-CRP levels were significantly associated with serum uric acid levels (β = 0.20, p < 0.001). After adjustment for age and cardiovascular risk factors, hs-CRP levels ≥3 mg/l were significantly associated with higher uric acid levels (odds ratio = 1.52, 95% confidence interval 1.18-1.96). Higher serum uric acid levels were positively and independently associated with circulating hs-CRP in healthy postmenopausal women.

  9. Sensory impairments of the lower limb after stroke: a pooled analysis of individual patient data.

    PubMed

    Tyson, Sarah F; Crow, J Lesley; Connell, Louise; Winward, Charlotte; Hillier, Susan

    2013-01-01

    To obtain more generalizable information on the frequency and factors influencing sensory impairment after stroke and their relationship to mobility and function. A pooled analysis of individual data of stroke survivors (N = 459); mean (SD) age = 67.2 (14.8) years, 54% male, mean (SD) time since stroke = 22.33 (63.1) days, 50% left-sided weakness. Where different measurement tools were used, data were recoded. Descriptive statistics described frequency of sensory impairments, kappa coefficients investigated relationships between sensory modalities, binary logistic regression explored the factors influencing sensory impairments, and linear regression assessed the impact of sensory impairments on activity limitations. Most patients' sensation was intact (55%), and individual sensory modalities were highly associated (κ = 0.60, P < .001). Weakness and neglect influenced sensory impairment (P < .001), but demographics, stroke pathology, and spasticity did not. Sensation influenced independence in activities of daily living, mobility, and balance but less strongly than weakness. Pooled individual data analysis showed sensation of the lower limb is grossly preserved in most stroke survivors but, when impairment is present, it affects function. Sensory modalities are highly interrelated; interventions that treat the motor system during functional tasks may be as effective at treating the sensory system as sensory retraining alone.

  10. Prevention of motor‐vehicle deaths by changing vehicle factors

    PubMed Central

    Robertson, Leon S

    2007-01-01

    Objective: To estimate the effect of changing vehicle factors to reduce mortality in a comprehensive study. Design/methods: Odds of death in the United States during 2000–2005 were analyzed, involving specific makes and models of 1999–2005 model year cars, minivans, and sport utility vehicles using logistic regression after selection of factors to be included by examination of least‐squares correlations of vehicle factors to maximize independence of predictors. Based on the regression coefficients, percentages of deaths preventable by changes in selected factors were calculated. Correlations of vehicle characteristics to environmental and behavioral risk factors were also examined to assess any potential confounding. Results: Deaths in the studied vehicles would have been 42% lower had all had electronic stability control (ESC) systems. Improved crashworthiness as measured by offset frontal and side crash tests would have produced an additional 28% reduction, and static stability improvement would have reduced the deaths 11%. Although weight–power that reduces fuel economy is associated with lower risk to drivers, it increases risk of deaths to pedestrians and bicyclists but has an overall minor effect compared to the other factors. Conclusion: A large majority of motor‐vehicle‐related fatalities could be avoided by universal adoption of the most effective technologies. PMID:17916886

  11. Correlation and simple linear regression.

    PubMed

    Eberly, Lynn E

    2007-01-01

    This chapter highlights important steps in using correlation and simple linear regression to address scientific questions about the association of two continuous variables with each other. These steps include estimation and inference, assessing model fit, the connection between regression and ANOVA, and study design. Examples in microbiology are used throughout. This chapter provides a framework that is helpful in understanding more complex statistical techniques, such as multiple linear regression, linear mixed effects models, logistic regression, and proportional hazards regression.
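
    As an illustration of the estimation step this chapter describes, the sketch below computes the Pearson correlation and the least-squares slope and intercept for two continuous variables. The function name and the five data points are hypothetical; the chapter's own microbiology examples are not reproduced here.

```python
import math


def pearson_and_ols(x, y):
    """Return (r, slope, intercept) for paired samples x and y."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two samples of equal length, at least 2 points")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sums of squares and cross-products around the means.
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    syy = sum((yi - mean_y) ** 2 for yi in y)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    if sxx == 0 or syy == 0:
        raise ValueError("both variables must vary")
    r = sxy / math.sqrt(sxx * syy)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return r, slope, intercept


# Hypothetical measurements of two continuous variables.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.1, 3.9, 6.2, 7.8, 10.1]
r, slope, intercept = pearson_and_ols(x, y)
print(f"r = {r:.3f}, slope = {slope:.2f}, intercept = {intercept:.2f}")
```

    In simple linear regression the squared correlation equals the proportion of variance in y explained by the fitted line, which is one way the two techniques are connected.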

  12. Multiple Imputation of a Randomly Censored Covariate Improves Logistic Regression Analysis.

    PubMed

    Atem, Folefac D; Qian, Jing; Maye, Jacqueline E; Johnson, Keith A; Betensky, Rebecca A

    2016-01-01

    Randomly censored covariates arise frequently in epidemiologic studies. The most commonly used methods, including complete case and single imputation or substitution, suffer from inefficiency and bias. They make strong parametric assumptions or they consider limit of detection censoring only. We employ multiple imputation, in conjunction with semi-parametric modeling of the censored covariate, to overcome these shortcomings and to facilitate robust estimation. We develop a multiple imputation approach for randomly censored covariates within the framework of a logistic regression model. We use the non-parametric estimate of the covariate distribution or the semiparametric Cox model estimate in the presence of additional covariates in the model. We evaluate this procedure in simulations, and compare its operating characteristics to those from the complete case analysis and a survival regression approach. We apply the procedures to an Alzheimer's study of the association between amyloid positivity and maternal age of onset of dementia. Multiple imputation achieves lower standard errors and higher power than the complete case approach under heavy and moderate censoring and is comparable under light censoring. The survival regression approach achieves the highest power among all procedures, but does not produce interpretable estimates of association. Multiple imputation offers a favorable alternative to complete case analysis and ad hoc substitution methods in the presence of randomly censored covariates within the framework of logistic regression.
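
    Multiple imputation produces one coefficient estimate per imputed data set, and these must be combined into a single estimate and standard error. The sketch below uses Rubin's rules, the standard pooling method; it is an assumption that the authors pooled this way, and the three estimates and variances shown are hypothetical.

```python
import math
from statistics import mean, variance


def pool_rubin(estimates, variances):
    """Pool per-imputation estimates and squared standard errors.

    Returns (pooled_estimate, pooled_standard_error).
    """
    m = len(estimates)
    if m < 2 or m != len(variances):
        raise ValueError("need at least two imputations with matching variances")
    pooled = mean(estimates)
    within = mean(variances)       # average sampling variance
    between = variance(estimates)  # sample variance across imputations (divisor m - 1)
    total = within + (1 + 1 / m) * between
    return pooled, math.sqrt(total)


# Hypothetical log-odds estimates from three imputed data sets.
estimates = [0.50, 0.60, 0.40]
variances = [0.04, 0.04, 0.04]
pooled, se = pool_rubin(estimates, variances)
print(f"pooled log-odds = {pooled:.2f}, SE = {se:.3f}")
```

    The between-imputation term inflates the standard error to reflect uncertainty about the censored values, which complete-case analysis and single substitution do not account for.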

  13. Comparison between light scattering and gravimetric samplers for PM10 mass concentration in poultry and pig houses

    NASA Astrophysics Data System (ADS)

    Cambra-López, María; Winkel, Albert; Mosquera, Julio; Ogink, Nico W. M.; Aarnink, André J. A.

    2015-06-01

    The objective of this study was to compare co-located real-time light scattering devices and equivalent gravimetric samplers in poultry and pig houses for PM10 mass concentration, and to develop animal-specific calibration factors for light scattering samplers. These results will contribute to evaluating the comparability of different sampling instruments for PM10 concentrations. A paired DustTrak light scattering device (DustTrak aerosol monitor, TSI, U.S.) and a PM10 gravimetric cyclone sampler were used for measuring PM10 mass concentrations during 24 h periods (from noon to noon) inside animal houses. Sampling was conducted in 32 animal houses in the Netherlands, including broilers, broiler breeders, layers in floor and in aviary system, turkeys, piglets, growing-finishing pigs in traditional and low emission housing with dry and liquid feed, and sows in individual and group housing. A total of 119 pairs of 24 h measurements (55 for poultry and 64 for pigs) were recorded and analyzed using linear regression analysis. Deviations between samplers were calculated and discussed. In poultry, cyclone sampler and DustTrak data fitted well to a linear regression, with a regression coefficient equal to 0.41, an intercept of 0.16 mg m⁻³ and a correlation coefficient of 0.91 (excluding turkeys). Results in turkeys showed a regression coefficient equal to 1.1 (P = 0.49), an intercept of 0.06 mg m⁻³ (P < 0.0001) and a correlation coefficient of 0.98. In pigs, we found a regression coefficient equal to 0.61, an intercept of 0.05 mg m⁻³ and a correlation coefficient of 0.84. PM10 concentrations measured using DustTraks were clearly underestimated (by a factor of approximately 2) in both poultry and pig housing systems compared with cyclone pre-separators. Absolute, relative, and random deviations increased with concentration. DustTrak light scattering devices should be self-calibrated to investigate PM10 mass concentrations accurately in animal houses.
    We recommend linear regression equations as animal-specific calibration factors for DustTraks instead of manufacturer calibration factors, especially in heavily dusty environments such as animal houses.
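
    A regression equation of this kind can be inverted to correct a light scattering reading. The sketch below assumes the fitted model has the form DustTrak = slope × gravimetric + intercept, which is consistent with the reported underestimation by roughly a factor of 2 but is not stated explicitly in the abstract; the function name and the example readings are hypothetical.

```python
# Coefficients quoted in the abstract as (slope, intercept in mg/m^3).
# ASSUMPTION: the DustTrak reading is the dependent variable.
POULTRY = (0.41, 0.16)  # excluding turkeys
PIGS = (0.61, 0.05)


def calibrate(dusttrak_mg_m3, slope, intercept):
    """Estimate the gravimetric PM10 concentration from a DustTrak reading."""
    if slope == 0:
        raise ValueError("slope must be non-zero")
    return (dusttrak_mg_m3 - intercept) / slope


# Hypothetical DustTrak readings in mg/m^3.
poultry_estimate = calibrate(0.57, *POULTRY)
pig_estimate = calibrate(0.66, *PIGS)
print(f"poultry: {poultry_estimate:.2f} mg/m^3, pigs: {pig_estimate:.2f} mg/m^3")
```

    If the regression was instead fitted with the gravimetric value as the dependent variable, the correction would be applied directly rather than inverted, so the direction of the fit should be checked against the full paper before use.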

  14. Associations between parental BMI, socioeconomic factors, family structure and overweight in Finnish children: a path model approach.

    PubMed

    Parikka, Suvi; Mäki, Päivi; Levälahti, Esko; Lehtinen-Jacks, Susanna; Martelin, Tuija; Laatikainen, Tiina

    2015-03-19

    The aim of this study was to assess the less studied interrelationships and pathways between parental BMI, socioeconomic factors, family structure and childhood overweight. The cross-sectional LATE-study was carried out in Finland in 2007-2009. The data for the analyses was classified into four categories: younger boys and girls (ca 3-8 years) (n = 2573) and older boys and girls (ca 11-16 years) (n = 1836). Associations between parental BMI, education, labor market status, self-perceived income sufficiency, family structure and childhood overweight were first examined by logistic regression analyses. As parental BMI and education had the most consistent associations with childhood overweight, the direct and indirect (mediated by parental BMI) associations of maternal and paternal education with childhood overweight were further assessed using a path model. Parental BMI and education were the strongest determinants of childhood overweight. Children of overweight parents had an increased risk of being overweight. In younger boys, maternal and paternal education had both direct (b-coefficient paternal -0.21, 95% CI -0.34 to -0.09; maternal -0.17, 95% CI -0.28 to -0.07) and indirect (b-coefficient paternal -0.04, 95% CI -0.07 to -0.02; maternal -0.04, 95% CI -0.06 to -0.02) inverse associations with overweight. Among the older boys, paternal education had both direct (b-coefficient -0.12, 95% CI -0.24 to -0.01) and indirect (b-coefficient -0.03, 95% CI -0.06 to -0.01) inverse associations with overweight, but maternal education had only an indirect association (b-coefficient -0.04, 95% CI -0.07 to -0.02). Among older girls, only an indirect association of maternal education with childhood overweight was found (b-coefficient -0.03, 95% CI -0.06 to -0.01). In younger girls, parental education was not associated with childhood overweight. 
    The observed pathways between parental BMI and education and childhood overweight emphasize a need for evidence-based health promotion interventions tailored for families identified with parental overweight and low level of education.

  15. Poor methodological quality and reporting standards of systematic reviews in burn care management.

    PubMed

    Wasiak, Jason; Tyack, Zephanie; Ware, Robert; Goodwin, Nicholas; Faggion, Clovis M

    2017-10-01

    The methodological and reporting quality of burn-specific systematic reviews has not been established. The aim of this study was to evaluate the methodological quality of systematic reviews in burn care management. Computerised searches were performed in Ovid MEDLINE, Ovid EMBASE and The Cochrane Library through to February 2016 for systematic reviews relevant to burn care using medical subject and free-text terms such as 'burn', 'systematic review' or 'meta-analysis'. Additional studies were identified by hand-searching five discipline-specific journals. Two authors independently screened papers, extracted and evaluated methodological quality using the 11-item A Measurement Tool to Assess Systematic Reviews (AMSTAR) tool and reporting quality using the 27-item Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) checklist. Characteristics of systematic reviews associated with methodological and reporting quality were identified. Descriptive statistics and linear regression identified features associated with improved methodological quality. A total of 60 systematic reviews met the inclusion criteria. Six of the 11 AMSTAR items reporting on 'a priori' design, duplicate study selection, grey literature, included/excluded studies, publication bias and conflict of interest were reported in less than 50% of the systematic reviews. Of the 27 items listed for PRISMA, 13 items reporting on introduction, methods, results and the discussion were addressed in less than 50% of systematic reviews. 
    Multivariable analyses showed that systematic reviews associated with higher methodological or reporting quality incorporated a meta-analysis (AMSTAR regression coefficient 2.1; 95% CI: 1.1, 3.1; PRISMA regression coefficient 6.3; 95% CI: 3.8, 8.7), were published in the Cochrane Library (AMSTAR regression coefficient 2.9; 95% CI: 1.6, 4.2; PRISMA regression coefficient 6.1; 95% CI: 3.1, 9.2), and included a randomised controlled trial (AMSTAR regression coefficient 1.4; 95% CI: 0.4, 2.4; PRISMA regression coefficient 3.4; 95% CI: 0.9, 5.8). The methodological and reporting quality of systematic reviews in burn care requires further improvement with stricter adherence by authors to the PRISMA checklist and AMSTAR tool. © 2016 Medicalhelplines.com Inc and John Wiley & Sons Ltd.

  16. Multinomial logistic regression in workers' health

    NASA Astrophysics Data System (ADS)

    Grilo, Luís M.; Grilo, Helena L.; Gonçalves, Sónia P.; Junça, Ana

    2017-11-01

    In European countries, particularly in Portugal, workers commonly report being exposed to excessive and continuous psychosocial stressors at work. This is increasing in diverse activity sectors, such as the services sector. A representative sample was collected from a Portuguese services organization by applying an internationally validated survey whose variables were measured in five ordered categories on a Likert-type scale. A multinomial logistic regression model is used to estimate the probability of each category of the dependent variable, general health perception, where, among other independent variables, burnout appears as statistically significant.
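
    A multinomial logistic model turns one linear predictor per non-reference category into a set of category probabilities that sum to 1. The sketch below shows that step for a five-category outcome with a single predictor; the function name, the coefficients, and the burnout score are all hypothetical and are not estimates from the study.

```python
import math


def category_probabilities(x, intercepts, coefs):
    """Probabilities for the reference category followed by each other category.

    intercepts[k] and coefs[k] belong to the k-th non-reference category;
    the reference category has its linear predictor fixed at 0.
    """
    if len(intercepts) != len(coefs):
        raise ValueError("one intercept and one coefficient list per category")
    scores = [0.0]
    for b0, b in zip(intercepts, coefs):
        if len(b) != len(x):
            raise ValueError("coefficient list must match the predictors")
        scores.append(b0 + sum(bj * xj for bj, xj in zip(b, x)))
    # Subtract the maximum before exponentiating for numerical stability.
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


# Hypothetical model: five health-perception categories, one predictor (burnout).
intercepts = [0.2, 0.5, 0.1, -0.4]
coefs = [[-0.3], [-0.6], [-0.9], [-1.2]]
probs = category_probabilities([2.0], intercepts, coefs)
print([round(p, 3) for p in probs])
```

    This treats the five categories as unordered, which is what a multinomial model does; an ordinal model would be an alternative for Likert-type outcomes.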

  17. Solid harmonic wavelet scattering for predictions of molecule properties

    NASA Astrophysics Data System (ADS)

    Eickenberg, Michael; Exarchakis, Georgios; Hirn, Matthew; Mallat, Stéphane; Thiry, Louis

    2018-06-01

    We present a machine learning algorithm for the prediction of molecule properties inspired by ideas from density functional theory (DFT). Using Gaussian-type orbital functions, we create surrogate electronic densities of the molecule from which we compute invariant "solid harmonic scattering coefficients" that account for different types of interactions at different scales. Multilinear regressions of various physical properties of molecules are computed from these invariant coefficients. Numerical experiments show that these regressions have near state-of-the-art performance, even with relatively few training examples. Predictions over small sets of scattering coefficients can reach a DFT precision while being interpretable.

  18. Blastocoele expansion degree predicts live birth after single blastocyst transfer for fresh and vitrified/warmed single blastocyst transfer cycles.

    PubMed

    Du, Qing-Yun; Wang, En-Yin; Huang, Yan; Guo, Xiao-Yi; Xiong, Yu-Jing; Yu, Yi-Ping; Yao, Gui-Dong; Shi, Sen-Lin; Sun, Ying-Pu

    2016-04-01

    Objective: To evaluate the independent effects of the degree of blastocoele expansion and re-expansion and the inner cell mass (ICM) and trophectoderm (TE) grades on predicting live birth after fresh and vitrified/warmed single blastocyst transfer. Design: Retrospective study. Setting: Reproductive medical center. Patients: Women undergoing 844 fresh and 370 vitrified/warmed single blastocyst transfer cycles. Interventions: None. Main outcome measures: Live-birth rate correlated with blastocyst morphology parameters by logistic regression analysis and Spearman correlation analysis. Results: The degree of blastocoele expansion and re-expansion was the only blastocyst morphology parameter that exhibited a significant ability to predict live birth in both fresh and vitrified/warmed single blastocyst transfer cycles, by multivariate logistic regression and Spearman correlation analysis. Although the ICM grade was significantly related to live birth in fresh cycles according to the univariate model, its effect was not maintained in the multivariate logistic analysis. In vitrified/warmed cycles, neither ICM nor TE grade was correlated with live birth by logistic regression analysis. Conclusions: This study is the first to confirm that the degree of blastocoele expansion and re-expansion is a better predictor of live birth after both fresh and vitrified/warmed single blastocyst transfer cycles than ICM or TE grade. Copyright © 2016. Published by Elsevier Inc.

  19. Factor complexity of crash occurrence: An empirical demonstration using boosted regression trees.

    PubMed

    Chung, Yi-Shih

    2013-12-01

    Factor complexity is a characteristic of traffic crashes. This paper proposes a novel method, namely boosted regression trees (BRT), to investigate the complex and nonlinear relationships in high-variance traffic crash data. The Taiwanese 2004-2005 single-vehicle motorcycle crash data are used to demonstrate the utility of BRT. Traditional logistic regression and classification and regression tree (CART) models are also used to compare their estimation results and external validities. Both the in-sample cross-validation and out-of-sample validation results show that an increase in tree complexity provides improved, although diminishing, classification performance, indicating a limited factor complexity of single-vehicle motorcycle crashes. The effects of crucial variables including geographical, time, and sociodemographic factors explain some fatal crashes. Relatively unique fatal crashes are better approximated by interactive terms, especially combinations of behavioral factors. BRT models generally provide better transferability than conventional logistic regression and CART models. This study also discusses the implications of the results for devising safety policies. Copyright © 2012 Elsevier Ltd. All rights reserved.

  20. Decomposing Racial/Ethnic Disparities in Influenza Vaccination among the Elderly

    PubMed Central

    Yoo, Byung-Kwang; Hasebe, Takuya; Szilagyi, Peter G.

    2015-01-01

    While persistent racial/ethnic disparities in influenza vaccination have been reported among the elderly, characteristics contributing to disparities are poorly understood. This study aimed to assess characteristics associated with racial/ethnic disparities in influenza vaccination using a nonlinear Oaxaca-Blinder decomposition method. We performed cross-sectional multivariable logistic regression analyses for which the dependent variable was self-reported receipt of influenza vaccine during the 2010–2011 season among community-dwelling non-Hispanic African-American (AA), non-Hispanic White (W), English-speaking Hispanic (EH) and Spanish-speaking Hispanic (SH) elderly, enrolled in the 2011 Medicare Current Beneficiary Survey (MCBS) (un-weighted/weighted N = 6,095/19.2 million). Using the nonlinear Oaxaca-Blinder decomposition method, we assessed the relative contribution of seventeen covariates (including socio-demographic characteristics, health status, insurance, access, preference regarding healthcare, and geographic regions) to disparities in influenza vaccination. Unadjusted racial/ethnic disparities in influenza vaccination were 14.1 percentage points (pp) (W-AA disparity, p<.001), 25.7 pp (W-SH disparity, p<.001) and 0.6 pp (W-EH disparity, p>.8). The Oaxaca-Blinder decomposition method estimated that the unadjusted W-AA and W-SH disparities in vaccination could be reduced by only 45% even if AA and SH groups became equivalent to Whites in all covariates in multivariable regression models. The remaining 55% of disparities were attributed to (a) racial/ethnic differences in the estimated coefficients (e.g., odds ratios) in the regression models and (b) characteristics not included in the regression models. Our analysis found that only about 45% of racial/ethnic disparities in influenza vaccination among the elderly could be reduced by equalizing recognized characteristics among racial/ethnic groups.
    Future studies are needed to identify additional modifiable characteristics causing disparities in influenza vaccination. PMID:25900133
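
    The split into a part explained by characteristics and a remainder can be sketched for a logistic model as follows. This is a simplified twofold decomposition that uses the reference group's coefficients for the counterfactual; it is an assumption that the study's nonlinear Oaxaca-Blinder procedure follows the same outline, and the function names, coefficients, and covariate rows are hypothetical.

```python
import math


def mean_probability(rows, beta):
    """Average predicted probability; beta[0] is the intercept."""
    def predict(row):
        z = beta[0] + sum(b * v for b, v in zip(beta[1:], row))
        return 1.0 / (1.0 + math.exp(-z))
    return sum(predict(row) for row in rows) / len(rows)


def decompose(rows_ref, beta_ref, rows_cmp, beta_cmp):
    """Return (gap, explained, unexplained) in predicted probability."""
    p_ref = mean_probability(rows_ref, beta_ref)
    p_cmp = mean_probability(rows_cmp, beta_cmp)
    # Counterfactual: comparison group's covariates, reference group's coefficients.
    p_counterfactual = mean_probability(rows_cmp, beta_ref)
    explained = p_ref - p_counterfactual    # due to differing characteristics
    unexplained = p_counterfactual - p_cmp  # due to differing coefficients
    return p_ref - p_cmp, explained, unexplained


# Hypothetical data: two covariates per person, one row per person.
rows_ref = [[1.0, 0.0], [1.0, 1.0], [0.0, 1.0]]
rows_cmp = [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0]]
beta_ref = [0.2, 0.8, 0.5]
beta_cmp = [-0.1, 0.6, 0.5]
gap, explained, unexplained = decompose(rows_ref, beta_ref, rows_cmp, beta_cmp)
print(f"gap = {gap:.3f}, explained share = {explained / gap:.0%}")
```

    The explained share depends on which group's coefficients serve as the reference, so a different choice of reference can shift the split between the two components.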
