Li, Ji; Gray, B.R.; Bates, D.M.
2008-01-01
Partitioning the variance of a response by design levels is challenging for binomial and other discrete outcomes. Goldstein (2003) proposed four definitions for variance partitioning coefficients (VPC) under a two-level logistic regression model. In this study, we explicitly derived formulae for multi-level logistic regression models and subsequently studied the distributional properties of the calculated VPCs. Using simulations and a vegetation dataset, we demonstrated associations between different VPC definitions, the importance of methods for estimating VPCs (by comparing VPCs obtained using Laplace and penalized quasi-likelihood methods), and bivariate dependence between VPCs calculated at different levels. Such an empirical study lends immediate support to wider applications of VPC in scientific data analysis.
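As a rough illustration of one VPC definition, the latent-variable approach treats the level-1 residual of a logistic model as having variance pi^2/3 on the logit scale. The sketch below assumes that definition and extends it to several nested levels by adding the random-effect variances; the function name, the level names, and the variance values are hypothetical and are not taken from the paper.

```python
import math

# Variance of the standard logistic distribution (latent-variable level-1 residual).
LOGISTIC_RESIDUAL_VARIANCE = math.pi ** 2 / 3


def latent_vpc(random_effect_variances):
    """Return the latent-variable VPC for each level.

    random_effect_variances maps a level name to its random-intercept
    variance on the logit scale (hypothetical inputs).
    """
    total = sum(random_effect_variances.values()) + LOGISTIC_RESIDUAL_VARIANCE
    return {level: var / total for level, var in random_effect_variances.items()}


# Hypothetical three-level example: observations within sites within regions.
vpcs = latent_vpc({"site": 1.0, "region": 0.5})
for level, value in vpcs.items():
    print(f"{level}: {value:.3f}")
```

Under this definition a cluster variance equal to pi^2/3 gives a VPC of exactly one half in the two-level case, which is a convenient sanity check.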
Austin, Peter C
2010-04-22
Multilevel logistic regression models are increasingly being used to analyze clustered data in medical, public health, epidemiological, and educational research. Procedures for estimating the parameters of such models are available in many statistical software packages. There is currently little evidence on the minimum number of clusters necessary to reliably fit multilevel regression models. We conducted a Monte Carlo study to compare the performance of different statistical software procedures for estimating multilevel logistic regression models when the number of clusters was low. We examined procedures available in BUGS, HLM, R, SAS, and Stata. We found that there were qualitative differences in the performance of different software procedures for estimating multilevel logistic models when the number of clusters was low. Among the likelihood-based procedures, estimation methods based on adaptive Gauss-Hermite approximations to the likelihood (glmer in R and xtlogit in Stata) or adaptive Gaussian quadrature (Proc NLMIXED in SAS) tended to have superior performance for estimating variance components when the number of clusters was small, compared to software procedures based on penalized quasi-likelihood. However, only Bayesian estimation with BUGS allowed for accurate estimation of variance components when there were fewer than 10 clusters. For all statistical software procedures, estimation of variance components tended to be poor when there were only five subjects per cluster, regardless of the number of clusters.
Intermediate and advanced topics in multilevel logistic regression analysis
Merlo, Juan
2017-01-01
Multilevel data occur frequently in health services, population and public health, and epidemiologic research. In such research, binary outcomes are common. Multilevel logistic regression models allow one to account for the clustering of subjects within clusters of higher‐level units when estimating the effect of subject and cluster characteristics on subject outcomes. A search of the PubMed database demonstrated that the use of multilevel or hierarchical regression models is increasing rapidly. However, our impression is that many analysts simply use multilevel regression models to account for the nuisance of within‐cluster homogeneity that is induced by clustering. In this article, we describe a suite of analyses that can complement the fitting of multilevel logistic regression models. These ancillary analyses permit analysts to estimate the marginal or population‐average effect of covariates measured at the subject and cluster level, in contrast to the within‐cluster or cluster‐specific effects arising from the original multilevel logistic regression model. We describe the interval odds ratio and the proportion of opposed odds ratios, which are summary measures of effect for cluster‐level covariates. We describe the variance partition coefficient and the median odds ratio, which are measures of components of variance and heterogeneity in outcomes. These measures allow one to quantify the magnitude of the general contextual effect. We describe an R² measure that allows analysts to quantify the proportion of variation explained by different multilevel logistic regression models. We illustrate the application and interpretation of these measures by analyzing mortality in patients hospitalized with a diagnosis of acute myocardial infarction. © 2017 The Authors. Statistics in Medicine published by John Wiley & Sons Ltd. PMID:28543517
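The median odds ratio mentioned in this abstract is commonly computed from the cluster-level random-intercept variance as exp(sqrt(2 * variance) * z), where z is the 75th percentile of the standard normal distribution. The sketch below assumes that standard formula; the function name and the example variance are illustrative only.

```python
import math
from statistics import NormalDist


def median_odds_ratio(cluster_variance):
    """Median odds ratio for a random-intercept variance on the logit scale."""
    # 75th percentile of the standard normal distribution (about 0.6745).
    z75 = NormalDist().inv_cdf(0.75)
    return math.exp(math.sqrt(2.0 * cluster_variance) * z75)


# Hypothetical cluster variance of 1.0 on the logit scale.
print(round(median_odds_ratio(1.0), 2))
```

A variance of zero gives a median odds ratio of 1, meaning no between-cluster heterogeneity in the odds of the outcome.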
Intermediate and advanced topics in multilevel logistic regression analysis.
Austin, Peter C; Merlo, Juan
2017-09-10
Multilevel data occur frequently in health services, population and public health, and epidemiologic research. In such research, binary outcomes are common. Multilevel logistic regression models allow one to account for the clustering of subjects within clusters of higher-level units when estimating the effect of subject and cluster characteristics on subject outcomes. A search of the PubMed database demonstrated that the use of multilevel or hierarchical regression models is increasing rapidly. However, our impression is that many analysts simply use multilevel regression models to account for the nuisance of within-cluster homogeneity that is induced by clustering. In this article, we describe a suite of analyses that can complement the fitting of multilevel logistic regression models. These ancillary analyses permit analysts to estimate the marginal or population-average effect of covariates measured at the subject and cluster level, in contrast to the within-cluster or cluster-specific effects arising from the original multilevel logistic regression model. We describe the interval odds ratio and the proportion of opposed odds ratios, which are summary measures of effect for cluster-level covariates. We describe the variance partition coefficient and the median odds ratio, which are measures of components of variance and heterogeneity in outcomes. These measures allow one to quantify the magnitude of the general contextual effect. We describe an R² measure that allows analysts to quantify the proportion of variation explained by different multilevel logistic regression models. We illustrate the application and interpretation of these measures by analyzing mortality in patients hospitalized with a diagnosis of acute myocardial infarction. © 2017 The Authors. Statistics in Medicine published by John Wiley & Sons Ltd.
Cade, Brian S.; Noon, Barry R.; Scherer, Rick D.; Keane, John J.
2017-01-01
Counts of avian fledglings, nestlings, or clutch size that are bounded below by zero and above by some small integer form a discrete random variable distribution that is not approximated well by conventional parametric count distributions such as the Poisson or negative binomial. We developed a logistic quantile regression model to provide estimates of the empirical conditional distribution of a bounded discrete random variable. The logistic quantile regression model requires that counts are randomly jittered to a continuous random variable, logit transformed to bound them between specified lower and upper values, then estimated in conventional linear quantile regression, repeating the 3 steps and averaging estimates. Back-transformation to the original discrete scale relies on the fact that quantiles are equivariant to monotonic transformations. We demonstrate this statistical procedure by modeling 20 years of California Spotted Owl fledgling production (0−3 per territory) on the Lassen National Forest, California, USA, as related to climate, demographic, and landscape habitat characteristics at territories. Spotted Owl fledgling counts increased nonlinearly with decreasing precipitation in the early nesting period, in the winter prior to nesting, and in the prior growing season; with increasing minimum temperatures in the early nesting period; with adult compared to subadult parents; when there was no fledgling production in the prior year; and when percentage of the landscape surrounding nesting sites (202 ha) with trees ≥25 m height increased. Changes in production were primarily driven by changes in the proportion of territories with 2 or 3 fledglings. Average variances of the discrete cumulative distributions of the estimated fledgling counts indicated that temporal changes in climate and parent age class explained 18% of the annual variance in owl fledgling production, which was 34% of the total variance. 
Prior fledgling production explained as much of the variance in the fledgling counts as climate, parent age class, and landscape habitat predictors. Our logistic quantile regression model can be used for any discrete response variables with fixed upper and lower bounds.
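The transformation steps described above (jitter, logit transform, back-transform) can be sketched as follows. This is only an illustration of why back-transformation works: it uses simulated counts, hypothetical bounds of -0.001 and 4.001, and a simple order-statistic quantile in place of the linear quantile regression fit, and it omits the repeat-and-average step.

```python
import math
import random


def logit_transform(z, lower, upper):
    """Map a value in (lower, upper) to the real line."""
    return math.log((z - lower) / (upper - z))


def inverse_logit_transform(h, lower, upper):
    """Map a value on the real line back to (lower, upper)."""
    return (upper * math.exp(h) + lower) / (1.0 + math.exp(h))


def order_statistic_quantile(values, tau):
    """Sample quantile taken as an order statistic (no interpolation)."""
    ordered = sorted(values)
    index = max(0, math.ceil(tau * len(ordered)) - 1)
    return ordered[index]


rng = random.Random(0)
counts = [rng.choice([0, 1, 2, 3]) for _ in range(200)]  # simulated counts, 0-3
lower, upper = -0.001, 4.001  # hypothetical bounds just outside the jittered range

# Step 1: jitter the counts to a continuous variable in [y, y + 1).
jittered = [y + rng.random() for y in counts]
# Step 2: logit transform between the bounds.
transformed = [logit_transform(z, lower, upper) for z in jittered]

# Step 3: estimate a quantile on the transformed scale, then back-transform.
tau = 0.5
q_z = order_statistic_quantile(jittered, tau)
q_h = order_statistic_quantile(transformed, tau)
back = inverse_logit_transform(q_h, lower, upper)

# Return to the discrete scale by undoing the jitter.
discrete_median = math.ceil(back - 1)
print(discrete_median)
```

Because the logit transform is monotonic, the quantile estimated on the transformed scale maps back to the quantile of the jittered variable, which is the equivariance property the abstract relies on.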
The use of generalized estimating equations in the analysis of motor vehicle crash data.
Hutchings, Caroline B; Knight, Stacey; Reading, James C
2003-01-01
The purpose of this study was to determine if it is necessary to use generalized estimating equations (GEEs) in the analysis of seat belt effectiveness in preventing injuries in motor vehicle crashes. The 1992 Utah crash dataset was used, excluding crash participants where seat belt use was not appropriate (n=93,633). The model used in the 1996 Report to Congress [Report to congress on benefits of safety belts and motorcycle helmets, based on data from the Crash Outcome Data Evaluation System (CODES). National Center for Statistics and Analysis, NHTSA, Washington, DC, February 1996] was analyzed for all occupants with logistic regression, one level of nesting (occupants within crashes), and two levels of nesting (occupants within vehicles within crashes) to compare the use of GEEs with logistic regression. When using one level of nesting compared to logistic regression, 13 of 16 variance estimates changed more than 10%, and eight of 16 parameter estimates changed more than 10%. In addition, three of the independent variables changed from significant to insignificant (alpha=0.05). With the use of two levels of nesting, two of 16 variance estimates and three of 16 parameter estimates changed more than 10% from the variance and parameter estimates in one level of nesting. One of the independent variables changed from insignificant to significant (alpha=0.05) in the two levels of nesting model; therefore, only two of the independent variables changed from significant to insignificant when the logistic regression model was compared to the two levels of nesting model. The odds ratio of seat belt effectiveness in preventing injuries was 12% lower when a one-level nested model was used. Based on these results, we stress the need to use a nested model and GEEs when analyzing motor vehicle crash data.
Tangen, C M; Koch, G G
1999-03-01
In the randomized clinical trial setting, controlling for covariates is expected to produce variance reduction for the treatment parameter estimate and to adjust for random imbalances of covariates between the treatment groups. However, for the logistic regression model, variance reduction is not obviously obtained. This can lead to concerns about the assumptions of the logistic model. We introduce a complementary nonparametric method for covariate adjustment. It provides results that are usually compatible with expectations for analysis of covariance. The only assumptions required are based on randomization and sampling arguments. The resulting treatment parameter is a (unconditional) population average log-odds ratio that has been adjusted for random imbalance of covariates. Data from a randomized clinical trial are used to compare results from the traditional maximum likelihood logistic method with those from the nonparametric logistic method. We examine treatment parameter estimates, corresponding standard errors, and significance levels in models with and without covariate adjustment. In addition, we discuss differences between unconditional population average treatment parameters and conditional subpopulation average treatment parameters. Additional features of the nonparametric method, including stratified (multicenter) and multivariate (multivisit) analyses, are illustrated. Extensions of this methodology to the proportional odds model are also made.
2004-03-01
Breusch-Pagan test for constant variance of the residuals. Using Microsoft Excel® we calculate a p-value of 0.841237. This high p-value, which is above...our alpha of 0.05, indicates that our residuals indeed pass the Breusch-Pagan test for constant variance. In addition to the assumption tests, we...Wilk Test for Normality – Support (Reduced) Model (OLS) Finally, we perform a Breusch-Pagan test for constant variance of the residuals. Using
Hansson, Lisbeth; Khamis, Harry J
2008-12-01
Simulated data sets are used to evaluate conditional and unconditional maximum likelihood estimation in an individual case-control design with continuous covariates when there are different rates of excluded cases and different levels of other design parameters. The effectiveness of the estimation procedures is measured by method bias, variance of the estimators, root mean square error (RMSE) for logistic regression and the percentage of explained variation. Conditional estimation leads to higher RMSE than unconditional estimation in the presence of missing observations, especially for 1:1 matching. The RMSE is higher for the smaller stratum size, especially for the 1:1 matching. The percentage of explained variation appears to be insensitive to missing data, but is generally higher for the conditional estimation than for the unconditional estimation. It is particularly good for the 1:2 matching design. For minimizing RMSE, a high matching ratio is recommended; in this case, conditional and unconditional logistic regression models yield comparable levels of effectiveness. For maximizing the percentage of explained variation, the 1:2 matching design with the conditional logistic regression model is recommended.
2004-03-01
constant variance via an analysis of the residuals, as well as the Breusch-Pagan test (see Figure 3 below). As a result, we follow the footsteps of...reasonably normal, which ensures that our residuals meet the assumption of constant variance by passing the Breusch-Pagan test (see Figure 4 below...sections for Research and Development, Test and Evaluation (RDT&E), procurement and military construction (Jarvaise, 1996:3). While differing
Estimating Procurement Cost Growth Using Logistic and Multiple Regression
2003-03-01
Figure 4). The plots fail to pass the visual inspection for constant variance as well as the Breusch-Pagan test (Neter, 1996: 112) at an alpha level...plots fail to pass the visual inspection for constant variance as well as the Breusch-Pagan test at an alpha level of 0.05. Based on these findings...amount of cost growth a program will have once model A deems that the program will incur cost growth. Sipple conducts validation testing on
Getting Answers to Natural Language Questions on the Web.
ERIC Educational Resources Information Center
Radev, Dragomir R.; Libner, Kelsey; Fan, Weiguo
2002-01-01
Describes a study that investigated the use of natural language questions on Web search engines. Highlights include query languages; differences in search engine syntax; and results of logistic regression and analysis of variance that showed aspects of questions that predicted significantly different performances, including the number of words,…
Factor complexity of crash occurrence: An empirical demonstration using boosted regression trees.
Chung, Yi-Shih
2013-12-01
Factor complexity is a characteristic of traffic crashes. This paper proposes a novel method, namely boosted regression trees (BRT), to investigate the complex and nonlinear relationships in high-variance traffic crash data. The Taiwanese 2004-2005 single-vehicle motorcycle crash data are used to demonstrate the utility of BRT. Traditional logistic regression and classification and regression tree (CART) models are also used to compare their estimation results and external validities. Both the in-sample cross-validation and out-of-sample validation results show that an increase in tree complexity provides improved, although declining, classification performance, indicating a limited factor complexity of single-vehicle motorcycle crashes. The effects of crucial variables including geographical, time, and sociodemographic factors explain some fatal crashes. Relatively unique fatal crashes are better approximated by interactive terms, especially combinations of behavioral factors. BRT models generally provide better transferability than conventional logistic regression and CART models. This study also discusses the implications of the results for devising safety policies. Copyright © 2012 Elsevier Ltd. All rights reserved.
Logistic regression of family data from retrospective study designs.
Whittemore, Alice S; Halpern, Jerry
2003-11-01
We wish to study the effects of genetic and environmental factors on disease risk, using data from families ascertained because they contain multiple cases of the disease. To do so, we must account for the way participants were ascertained, and for within-family correlations in both disease occurrences and covariates. We model the joint probability distribution of the covariates of ascertained family members, given family disease occurrence and pedigree structure. We describe two such covariate models: the random effects model and the marginal model. Both models assume a logistic form for the distribution of one person's covariates that involves a vector beta of regression parameters. The components of beta in the two models have different interpretations, and they differ in magnitude when the covariates are correlated within families. We describe ascertainment assumptions needed to estimate consistently the parameters beta(RE) in the random effects model and the parameters beta(M) in the marginal model. Under the ascertainment assumptions for the random effects model, we show that conditional logistic regression (CLR) of matched family data gives a consistent estimate beta-hat(RE) for beta(RE) and a consistent estimate for the covariance matrix of beta-hat(RE). Under the ascertainment assumptions for the marginal model, we show that unconditional logistic regression (ULR) gives a consistent estimate for beta(M), and we give a consistent estimator for its covariance matrix. The random effects/CLR approach is simple to use and to interpret, but it can use data only from families containing both affected and unaffected members. The marginal/ULR approach uses data from all individuals, but its variance estimates require special computations. A C program to compute these variance estimates is available at http://www.stanford.edu/dept/HRP/epidemiology. 
We illustrate these pros and cons by application to data on the effects of parity on ovarian cancer risk in mother/daughter pairs, and use simulations to study the performance of the estimates. Copyright 2003 Wiley-Liss, Inc.
Comprehension of texts by deaf elementary school students: The role of grammatical understanding.
Barajas, Carmen; González-Cuenca, Antonia M; Carrero, Francisco
2016-12-01
The aim of this study was to analyze how the reading process of deaf Spanish elementary school students is affected both by those components that explain reading comprehension according to the Simple View of Reading model: decoding and linguistic comprehension (both lexical and grammatical) and by other variables that are external to the reading process: the type of assistive technology used, the age at which it is implanted or fitted, the participant's socioeconomic status and school stage. Forty-seven students aged between 6 and 13 years participated in the study; all presented with profound or severe prelingual bilateral deafness, and all used digital hearing aids or cochlear implants. Students' text comprehension skills, decoding skills and oral comprehension skills (both lexical and grammatical) were evaluated. Logistic regression analysis indicated that neither the type of assistive technology, age at time of fitting or activation, socioeconomic status, nor school stage could predict the presence or absence of difficulties in text comprehension. Furthermore, logistic regression analysis indicated that neither decoding skills, nor lexical age could predict competency in text comprehension; however, grammatical age could explain 41% of the variance. Probing deeper into the effect of grammatical understanding, logistic regression analysis indicated that a participant's understanding of reversible passive object-verb-subject sentences and reversible predicative subject-verb-object sentences accounted for 38% of the variance in text comprehension. Based on these results, we suggest that it might be beneficial to devise and evaluate interventions that focus specifically on grammatical comprehension. Copyright © 2016 Elsevier Ltd. All rights reserved.
ERIC Educational Resources Information Center
Hess, Brian; Olejnik, Stephen; Huberty, Carl J.
2001-01-01
Studied the efficacy of two improvement-over-chance or "I" effect sizes derived from predictive discriminant analysis and logistic regression analysis for two-group univariate mean comparisons through simulation. Discusses the ways in which the usefulness of each of the indices depends on the population characteristics. (SLD)
Influence of landscape-scale factors in limiting brook trout populations in Pennsylvania streams
Kocovsky, P.M.; Carline, R.F.
2006-01-01
Landscapes influence the capacity of streams to produce trout through their effect on water chemistry and other factors at the reach scale. Trout abundance also fluctuates over time; thus, to thoroughly understand how spatial factors at landscape scales affect trout populations, one must assess the changes in populations over time to provide a context for interpreting the importance of spatial factors. We used data from the Pennsylvania Fish and Boat Commission's fisheries management database to investigate spatial factors that affect the capacity of streams to support brook trout Salvelinus fontinalis and to provide models useful for their management. We assessed the relative importance of spatial and temporal variation by calculating variance components and comparing relative standard errors for spatial and temporal variation. We used binary logistic regression to predict the presence of harvestable-length brook trout and multiple linear regression to assess the mechanistic links between landscapes and trout populations and to predict population density. The variance in trout density among streams was equal to or greater than the temporal variation for several streams, indicating that differences among sites affect population density. Logistic regression models correctly predicted the absence of harvestable-length brook trout in 60% of validation samples. The r² value for the linear regression model predicting density was 0.3, indicating low predictive ability. Both logistic and linear regression models supported buffering capacity against acid episodes as an important mechanistic link between landscapes and trout populations. 
Although our models fail to predict trout densities precisely, their success at elucidating the mechanistic links between landscapes and trout populations, in concert with the importance of spatial variation, increases our understanding of factors affecting brook trout abundance and will help managers and private groups to protect and enhance populations of wild brook trout. © Copyright by the American Fisheries Society 2006.
A Proposal for Phase 4 of the Forest Inventory and Analysis Program
Ronald E. McRoberts
2005-01-01
Maps of forest cover were constructed using observations from forest inventory plots, Landsat Thematic Mapper satellite imagery, and a logistic regression model. Estimates of mean proportion forest area and the variance of the mean were calculated for circular study areas with radii ranging from 1 km to 15 km. The spatial correlation among pixel predictions was...
The Outlier Detection for Ordinal Data Using Scalling Technique of Regression Coefficients
NASA Astrophysics Data System (ADS)
Adnan, Arisman; Sugiarto, Sigit
2017-06-01
The aim of this study is to detect outliers by using the coefficients of Ordinal Logistic Regression (OLR) for the case of k category responses, where the score ranges from 1 (the best) to 8 (the worst). We detect them by using the sum of moduli of the ordinal regression coefficients calculated by the jackknife technique. This technique is improved by scaling the regression coefficients to their means. The R language has been used on a set of ordinal data from a reference distribution. Furthermore, we compare this approach with studentised residual plots of the jackknife technique for ANOVA (Analysis of Variance) and OLR. This study shows that the jackknifing technique along with proper scaling may reveal outliers in ordinal regression reasonably well.
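One possible reading of the scaled sum-of-moduli statistic can be sketched as below. The abstract uses R; Python is used here only for illustration. The sketch assumes the leave-one-out coefficient vectors have already been obtained from some ordinal regression fit, and it interprets "scaling to their means" as dividing each coefficient by its mean across the jackknife fits. Both the interpretation and the toy numbers are assumptions, not the authors' procedure.

```python
def scaled_modulus_sums(loo_coefficients):
    """Sum of absolute mean-scaled coefficients for each leave-one-out fit.

    loo_coefficients[i] is the (hypothetical) coefficient vector obtained
    when observation i is left out of the fit.
    """
    n = len(loo_coefficients)
    p = len(loo_coefficients[0])
    # Mean of each coefficient across the jackknife fits (assumed nonzero).
    means = [sum(row[j] for row in loo_coefficients) / n for j in range(p)]
    return [sum(abs(row[j] / means[j]) for j in range(p)) for row in loo_coefficients]


# Toy leave-one-out coefficients; dropping observation 3 shifts the fit the most.
loo = [[1.0, 2.0], [1.1, 2.1], [0.9, 1.9], [3.0, 6.0]]
sums = scaled_modulus_sums(loo)
candidate = sums.index(max(sums))
print(candidate)
```

Observations whose removal produces an unusually large (or small) statistic are the candidates one would then inspect as possible outliers.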
Testing for gene-environment interaction under exposure misspecification.
Sun, Ryan; Carroll, Raymond J; Christiani, David C; Lin, Xihong
2017-11-09
Complex interplay between genetic and environmental factors characterizes the etiology of many diseases. Modeling gene-environment (GxE) interactions is often challenged by the unknown functional form of the environment term in the true data-generating mechanism. We study the impact of misspecification of the environmental exposure effect on inference for the GxE interaction term in linear and logistic regression models. We first examine the asymptotic bias of the GxE interaction regression coefficient, allowing for confounders as well as arbitrary misspecification of the exposure and confounder effects. For linear regression, we show that under gene-environment independence and some confounder-dependent conditions, when the environment effect is misspecified, the regression coefficient of the GxE interaction can be unbiased. However, inference on the GxE interaction is still often incorrect. In logistic regression, we show that the regression coefficient is generally biased if the genetic factor is associated with the outcome directly or indirectly. Further, we show that the standard robust sandwich variance estimator for the GxE interaction does not perform well in practical GxE studies, and we provide an alternative testing procedure that has better finite sample properties. © 2017, The International Biometric Society.
Li, Baoyue; Lingsma, Hester F; Steyerberg, Ewout W; Lesaffre, Emmanuel
2011-05-23
Logistic random effects models are a popular tool to analyze multilevel (also called hierarchical) data with a binary or ordinal outcome. Here, we aim to compare different statistical software implementations of these models. We used individual patient data from 8509 patients in 231 centers with moderate and severe Traumatic Brain Injury (TBI) enrolled in eight Randomized Controlled Trials (RCTs) and three observational studies. We fitted logistic random effects regression models with the 5-point Glasgow Outcome Scale (GOS) as outcome, both dichotomized as well as ordinal, with center and/or trial as random effects, and as covariates age, motor score, pupil reactivity or trial. We then compared the implementations of frequentist and Bayesian methods to estimate the fixed and random effects. Frequentist approaches included R (lme4), Stata (GLLAMM), SAS (GLIMMIX and NLMIXED), MLwiN ([R]IGLS) and MIXOR; Bayesian approaches included WinBUGS, MLwiN (MCMC), R package MCMCglmm and SAS experimental procedure MCMC. Three data sets (the full data set and two sub-datasets) were analysed using basically two logistic random effects models with either one random effect for the center or two random effects for center and trial. For the ordinal outcome in the full data set also a proportional odds model with a random center effect was fitted. The packages gave similar parameter estimates for both the fixed and random effects and for the binary (and ordinal) models for the main study and when based on a relatively large number of level-1 (patient level) data compared to the number of level-2 (hospital level) data. However, when based on a relatively sparse data set, i.e. when the numbers of level-1 and level-2 data units were about the same, the frequentist and Bayesian approaches showed somewhat different results. The software implementations differ considerably in flexibility, computation time, and usability. 
There are also differences in the availability of additional tools for model evaluation, such as diagnostic plots. The experimental SAS (version 9.2) procedure MCMC appeared to be inefficient. On relatively large data sets, the different software implementations of logistic random effects regression models produced similar results. Thus, for a large data set there seems to be no explicit preference (of course if there is no preference from a philosophical point of view) for either a frequentist or Bayesian approach (if based on vague priors). The choice for a particular implementation may largely depend on the desired flexibility, and the usability of the package. For small data sets the random effects variances are difficult to estimate. In the frequentist approaches the MLE of this variance was often estimated as zero with a standard error that is either zero or could not be determined, while for Bayesian methods the estimates could depend on the chosen "non-informative" prior of the variance parameter. The starting value for the variance parameter may also be critical for the convergence of the Markov chain.
Factors Affecting the Clinical Success Rate of Miniscrew Implants for Orthodontic Treatment.
Jing, Zheng; Wu, Yeke; Jiang, Wenlu; Zhao, Lixing; Jing, Dian; Zhang, Nian; Cao, Xiaoqing; Xu, Zhenrui; Zhao, Zhihe
2016-01-01
The purpose of this study was to evaluate the various factors that influence the success rate of miniscrew implants used as orthodontic anchorage. Potential confounding variables examined were sex, age, vertical (FMA) and sagittal (ANB) skeletal facial pattern, site of placement (labial and buccal, palatal, and retromandibular triangle), arch of placement (maxilla and mandible), placement soft tissue type, oral hygiene, diameter and length of miniscrew implants, insertion method (predrilled or drill-free), angle of placement, onset and strength of force application, and clinical purpose. The correlations between success rate and overall variables were investigated by logistic regression analysis, and the effect of each variable on the success rate was assessed by variance analysis. One hundred fourteen patients were included with a total of 253 miniscrew implants. The overall success rate was 88.54% with an average loading period of 9.5 months in successful cases. Age, oral hygiene, vertical skeletal facial pattern (FMA), and general placement sites (maxillary and mandibular) presented significant differences in success rates both by logistic regression analysis and variance analysis (P < .05). To minimize the failure of miniscrew implants, proper oral hygiene instruction and effective supervision should be given for patients, especially young (< 12 years) high-angle patients with miniscrew implants placed in the mandible.
Smith, Tyler C; Smith, Besa; Corbeil, Thomas E; Riddle, James R; Ryan, Margaret A K
2004-08-01
There is much concern over the potential for short- and long-term adverse mental health effects caused by the terrorist attacks on September 11, 2001. This analysis used data from the Millennium Cohort Study to identify subgroups of US military members who enrolled in the cohort and reported their mental health status before the traumatic events of September 11 and soon after September 11. While adjusting for confounding, multivariable logistic regression, analysis of variance, and multivariate ordinal, or polychotomous logistic regression were used to compare 18 self-reported mental health measures in US military members who enrolled in the cohort before September 11, 2001 with those military personnel who enrolled after September 11, 2001. In contrast to studies of other populations, military respondents reported fewer mental health problems in the months immediately after September 11, 2001.
2011-01-01
Background: Logistic random effects models are a popular tool to analyze multilevel (also called hierarchical) data with a binary or ordinal outcome. Here, we aim to compare different statistical software implementations of these models. Methods: We used individual patient data from 8509 patients in 231 centers with moderate and severe Traumatic Brain Injury (TBI) enrolled in eight Randomized Controlled Trials (RCTs) and three observational studies. We fitted logistic random effects regression models with the 5-point Glasgow Outcome Scale (GOS) as outcome, both dichotomized as well as ordinal, with center and/or trial as random effects, and as covariates age, motor score, pupil reactivity or trial. We then compared the implementations of frequentist and Bayesian methods to estimate the fixed and random effects. Frequentist approaches included R (lme4), Stata (GLLAMM), SAS (GLIMMIX and NLMIXED), MLwiN ([R]IGLS) and MIXOR; Bayesian approaches included WinBUGS, MLwiN (MCMC), R package MCMCglmm and SAS experimental procedure MCMC. Three data sets (the full data set and two sub-datasets) were analysed using basically two logistic random effects models with either one random effect for the center or two random effects for center and trial. For the ordinal outcome in the full data set also a proportional odds model with a random center effect was fitted. Results: The packages gave similar parameter estimates for both the fixed and random effects and for the binary (and ordinal) models for the main study and when based on a relatively large number of level-1 (patient level) data compared to the number of level-2 (hospital level) data. However, when based on a relatively sparse data set, i.e. when the numbers of level-1 and level-2 data units were about the same, the frequentist and Bayesian approaches showed somewhat different results. The software implementations differ considerably in flexibility, computation time, and usability.
There are also differences in the availability of additional tools for model evaluation, such as diagnostic plots. The experimental SAS (version 9.2) procedure MCMC appeared to be inefficient. Conclusions: On relatively large data sets, the different software implementations of logistic random effects regression models produced similar results. Thus, for a large data set there seems to be no explicit preference (of course if there is no preference from a philosophical point of view) for either a frequentist or Bayesian approach (if based on vague priors). The choice for a particular implementation may largely depend on the desired flexibility, and the usability of the package. For small data sets the random effects variances are difficult to estimate. In the frequentist approaches the MLE of this variance was often estimated zero with a standard error that is either zero or could not be determined, while for Bayesian methods the estimates could depend on the chosen "non-informative" prior of the variance parameter. The starting value for the variance parameter may also be critical for the convergence of the Markov chain. PMID:21605357
Fong, Youyi; Yu, Xuesong
2016-01-01
Many modern serial dilution assays are based on fluorescence intensity (FI) readouts. We study optimal transformation model choice for fitting five parameter logistic curves (5PL) to FI-based serial dilution assay data. We first develop a generalized least squares-pseudolikelihood type algorithm for fitting heteroscedastic logistic models. Next we show that the 5PL and log 5PL functions can approximate each other well. We then compare four 5PL models with different choices of log transformation and variance modeling through a Monte Carlo study and real data. Our findings are that the optimal choice depends on the intended use of the fitted curves. PMID:27642502
Austin, Peter C; Steyerberg, Ewout W
2012-06-20
When outcomes are binary, the c-statistic (equivalent to the area under the Receiver Operating Characteristic curve) is a standard measure of the predictive accuracy of a logistic regression model. An analytical expression was derived under the assumption that a continuous explanatory variable follows a normal distribution in those with and without the condition. We then conducted an extensive set of Monte Carlo simulations to examine whether the expressions derived under the assumption of binormality allowed for accurate prediction of the empirical c-statistic when the explanatory variable followed a normal distribution in the combined sample of those with and without the condition. We also examined the accuracy of the predicted c-statistic when the explanatory variable followed a gamma, log-normal or uniform distribution in the combined sample of those with and without the condition. Under the assumption of binormality with equality of variances, the c-statistic follows a standard normal cumulative distribution function with dependence on the product of the standard deviation of the normal components (reflecting more heterogeneity) and the log-odds ratio (reflecting larger effects). Under the assumption of binormality with unequal variances, the c-statistic follows a standard normal cumulative distribution function with dependence on the standardized difference of the explanatory variable in those with and without the condition. In our Monte Carlo simulations, we found that these expressions allowed for reasonably accurate prediction of the empirical c-statistic when the distribution of the explanatory variable was normal, gamma, log-normal, and uniform in the entire sample of those with and without the condition. The discriminative ability of a continuous explanatory variable cannot be judged by its odds ratio alone, but always needs to be considered in relation to the heterogeneity of the population.
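The binormal relationships described in this abstract can be sketched numerically. The abstract does not print the expressions, so the formulas below are the standard binormal results as we understand them and should be read as an assumption: c = Φ(σβ/√2) for a common standard deviation σ and log-odds ratio β, and c = Φ((μ1 − μ0)/√(σ0² + σ1²)) for unequal variances. The function names are hypothetical.

```python
from math import sqrt
from statistics import NormalDist

STD_NORMAL = NormalDist()  # standard normal, mean 0 and sd 1


def c_stat_equal_var(sigma, log_odds_ratio):
    """Predicted c-statistic under binormality with a common SD (assumed formula)."""
    return STD_NORMAL.cdf(sigma * log_odds_ratio / sqrt(2))


def c_stat_unequal_var(mu0, sd0, mu1, sd1):
    """Predicted c-statistic under binormality with group-specific SDs (assumed formula)."""
    return STD_NORMAL.cdf((mu1 - mu0) / sqrt(sd0 ** 2 + sd1 ** 2))


if __name__ == "__main__":
    # Same odds ratio, increasing heterogeneity: the predicted c-statistic rises.
    for sigma in (0.5, 1.0, 2.0):
        print(sigma, round(c_stat_equal_var(sigma, log_odds_ratio=0.5), 3))
```

The loop illustrates the abstract's closing point: with the log-odds ratio held fixed, the predicted discrimination still changes with the spread of the explanatory variable.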
Novikov, I; Fund, N; Freedman, L S
2010-01-15
Different methods for the calculation of sample size for simple logistic regression (LR) with one normally distributed continuous covariate give different results. Sometimes the difference can be large. Furthermore, some methods require the user to specify the prevalence of cases when the covariate equals its population mean, rather than the more natural population prevalence. We focus on two commonly used methods and show through simulations that the power for a given sample size may differ substantially from the nominal value for one method, especially when the covariate effect is large, while the other method performs poorly if the user provides the population prevalence instead of the required parameter. We propose a modification of the method of Hsieh et al. that requires specification of the population prevalence and that employs Schouten's sample size formula for a t-test with unequal variances and group sizes. This approach appears to increase the accuracy of the sample size estimates for LR with one continuous covariate.
Pearl, D L; Louie, M; Chui, L; Doré, K; Grimsrud, K M; Martin, S W; Michel, P; Svenson, L W; McEwen, S A
2008-04-01
Using multivariable models, we compared whether there were significant differences between reported outbreak and sporadic cases in terms of their sex, age, and mode and site of disease transmission. We also determined the potential role of administrative, temporal, and spatial factors within these models. We compared a variety of approaches to account for clustering of cases in outbreaks including weighted logistic regression, random effects models, generalized estimating equations, robust variance estimates, and the random selection of one case from each outbreak. Age and mode of transmission were the only epidemiologically and statistically significant covariates in our final models using the above approaches. Weighting observations in a logistic regression model by the inverse of their outbreak size appeared to be a relatively robust and valid means for modelling these data. Some analytical techniques, designed to account for clustering, had difficulty converging or producing realistic measures of association.
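The inverse-outbreak-size weighting mentioned in this abstract can be illustrated with a minimal sketch. It assumes each case carries an outbreak identifier and that sporadic cases get their own unique identifier (so their weight is 1); the function name and the example identifiers are hypothetical, not taken from the study.

```python
from collections import Counter


def inverse_outbreak_weights(outbreak_ids):
    """Return one weight per case equal to 1 / (number of cases sharing its outbreak id)."""
    sizes = Counter(outbreak_ids)
    return [1.0 / sizes[oid] for oid in outbreak_ids]


if __name__ == "__main__":
    # Three cases from outbreak "A", one from "B", one sporadic case with a unique id.
    ids = ["A", "A", "A", "B", "sporadic-1"]
    print(inverse_outbreak_weights(ids))
```

With these weights every outbreak contributes a total weight of 1 to a weighted logistic regression, regardless of how many cases it contains.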
Collinearity and Causal Diagrams: A Lesson on the Importance of Model Specification.
Schisterman, Enrique F; Perkins, Neil J; Mumford, Sunni L; Ahrens, Katherine A; Mitchell, Emily M
2017-01-01
Correlated data are ubiquitous in epidemiologic research, particularly in nutritional and environmental epidemiology where mixtures of factors are often studied. Our objectives are to demonstrate how highly correlated data arise in epidemiologic research and provide guidance, using a directed acyclic graph approach, on how to proceed analytically when faced with highly correlated data. We identified three fundamental structural scenarios in which high correlation between a given variable and the exposure can arise: intermediates, confounders, and colliders. For each of these scenarios, we evaluated the consequences of increasing correlation between the given variable and the exposure on the bias and variance for the total effect of the exposure on the outcome using unadjusted and adjusted models. We derived closed-form solutions for continuous outcomes using linear regression and empirically present our findings for binary outcomes using logistic regression. For models properly specified, total effect estimates remained unbiased even when there was almost perfect correlation between the exposure and a given intermediate, confounder, or collider. In general, as the correlation increased, the variance of the parameter estimate for the exposure in the adjusted models increased, while in the unadjusted models, the variance increased to a lesser extent or decreased. Our findings highlight the importance of considering the causal framework under study when specifying regression models. Strategies that do not take into consideration the causal structure may lead to biased effect estimation for the original question of interest, even under high correlation.
Zhang, Xingyu; Kim, Joyce; Patzer, Rachel E; Pitts, Stephen R; Patzer, Aaron; Schrager, Justin D
2017-10-26
To describe and compare logistic regression and neural network modeling strategies to predict hospital admission or transfer following initial presentation to Emergency Department (ED) triage with and without the addition of natural language processing elements. Using data from the National Hospital Ambulatory Medical Care Survey (NHAMCS), a cross-sectional probability sample of United States EDs from 2012 and 2013 survey years, we developed several predictive models with the outcome being admission to the hospital or transfer vs. discharge home. We included patient characteristics immediately available after the patient has presented to the ED and undergone a triage process. We used this information to construct logistic regression (LR) and multilayer neural network models (MLNN) which included natural language processing (NLP) and principal component analysis from the patient's reason for visit. Ten-fold cross validation was used to test the predictive capacity of each model and the area under the receiver operating characteristic curve (AUC) was then calculated for each model. Of the 47,200 ED visits from 642 hospitals, 6,335 (13.42%) resulted in hospital admission (or transfer). A total of 48 principal components were extracted by NLP from the reason for visit fields, which explained 75% of the overall variance for hospitalization. In the model including only structured variables, the AUC was 0.824 (95% CI 0.818-0.830) for logistic regression and 0.823 (95% CI 0.817-0.829) for MLNN. Models including only free-text information generated AUC of 0.742 (95% CI 0.731-0.753) for logistic regression and 0.753 (95% CI 0.742-0.764) for MLNN. When both structured variables and free text variables were included, the AUC reached 0.846 (95% CI 0.839-0.853) for logistic regression and 0.844 (95% CI 0.836-0.852) for MLNN.
The predictive accuracy of hospital admission or transfer for patients who presented to ED triage overall was good, and was improved with the inclusion of free text data from a patient's reason for visit regardless of modeling approach. Natural language processing and neural networks that incorporate patient-reported outcome free text may increase predictive accuracy for hospital admission.
Elder abuse and socioeconomic inequalities: a multilevel study in 7 European countries.
Fraga, Sílvia; Lindert, Jutta; Barros, Henrique; Torres-González, Francisco; Ioannidi-Kapolou, Elisabeth; Melchiorre, Maria Gabriella; Stankunas, Mindaugas; Soares, Joaquim F
2014-04-01
To compare the prevalence of elder abuse using a multilevel approach that takes into account the characteristics of participants as well as socioeconomic indicators at city and country level. In 2009, the project on abuse of elderly in Europe (ABUEL) was conducted in seven cities (Stuttgart, Germany; Ancona, Italy; Kaunas, Lithuania; Stockholm, Sweden; Porto, Portugal; Granada, Spain; Athens, Greece) comprising 4467 individuals aged 60-84 years. We used a 3-level hierarchical structure of data: 1) characteristics of participants; 2) mean of tertiary education of each city; and 3) country inequality indicator (Gini coefficient). Multilevel logistic regression was used and proportional changes in Intraclass Correlation Coefficient (ICC) were inspected to assess explained variance between models. The prevalence of elder abuse showed large variations across sites. Adding tertiary education to the regression model reduced the country level variance for psychological abuse (ICC=3.4%), with no significant decrease in the explained variance for the other types of abuse. When the Gini coefficient was considered, the highest drop in ICC was observed for financial abuse (from 9.5% to 4.3%). There is a societal and community level dimension that adds information to individual variability in explaining country differences in elder abuse, highlighting underlying socioeconomic inequalities leading to such behavior. Copyright © 2014 Elsevier Inc. All rights reserved.
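The ICC comparison described in this abstract can be sketched as follows. The abstract does not say how the ICC was computed; the sketch assumes the common latent-variable definition for multilevel logistic models, in which the individual-level variance is fixed at π²/3. The function names are hypothetical.

```python
from math import pi

# Assumed level-1 variance: variance of the standard logistic distribution.
LOGISTIC_RESIDUAL_VAR = pi ** 2 / 3


def icc_latent(level_variance, other_random_variances=0.0):
    """ICC for one level: its variance over total (all random effects + pi^2/3)."""
    total = level_variance + other_random_variances + LOGISTIC_RESIDUAL_VAR
    return level_variance / total


def proportional_change(before, after):
    """Proportional reduction in ICC (or variance) between two nested models."""
    return (before - after) / before


if __name__ == "__main__":
    # Financial abuse figures quoted in the abstract: ICC fell from 9.5% to 4.3%.
    print(round(proportional_change(9.5, 4.3), 3))  # about 0.547
```

Read this way, the drop from 9.5% to 4.3% corresponds to a proportional change of roughly 55% in the country-level ICC after adding the Gini coefficient.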
Lindström, Martin; Lindström, Christine; Moghaddassi, Mahnaz; Merlo, Juan
2006-12-01
The aim of this study was to investigate the influence of contextual (social capital and neo-materialist) and individual factors on sense of insecurity in the neighbourhood. The 2000 public health survey in Scania is a cross-sectional study. A total of 13,715 persons answered a postal questionnaire, which is 59% of the random sample. A multilevel logistic regression model, with individuals at the first level and municipalities at the second, was performed. The effect (median odds ratios, intra-class correlation, cross-level modification and odds ratios) of individual and municipality/city quarter (social capital and police district) factors on sense of insecurity was analysed. The crude variance between municipalities/city quarters was not affected by individual factors. The introduction of administrative police district in the model reduced the municipality variance, although some of the significant variance between municipalities remained. The introduction of social capital did not affect the municipality variance. This study suggests that the neo-materialist factor administrative police district may partly explain the individual's sense of insecurity in the neighbourhood.
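The median odds ratio reported in this abstract translates an area-level variance onto the odds-ratio scale. A minimal sketch, assuming the usual definition MOR = exp(√(2σ²) · Φ⁻¹(0.75)), where σ² is the municipality-level variance on the log-odds scale; the function name and the example variances are hypothetical.

```python
from math import exp, sqrt
from statistics import NormalDist


def median_odds_ratio(area_variance):
    """Median odds ratio for a random-intercept variance on the log-odds scale."""
    z75 = NormalDist().inv_cdf(0.75)  # 75th percentile of the standard normal
    return exp(sqrt(2.0 * area_variance) * z75)


if __name__ == "__main__":
    for variance in (0.0, 0.1, 0.5, 1.0):  # illustrative values only
        print(variance, round(median_odds_ratio(variance), 2))
```

A MOR of 1 means no variation between municipalities; larger values mean that moving between two randomly chosen municipalities changes the median odds of reporting insecurity by that factor.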
Bureau, Alexandre; Duchesne, Thierry
2015-12-01
Splitting extended families into their component nuclear families to apply a genetic association method designed for nuclear families is a widespread practice in familial genetic studies. Dependence among genotypes and phenotypes of nuclear families from the same extended family arises because of genetic linkage of the tested marker with a risk variant or because of familial specificity of genetic effects due to gene-environment interaction. This raises concerns about the validity of inference conducted under the assumption of independence of the nuclear families. We indeed prove theoretically that, in a conditional logistic regression analysis applicable to disease cases and their genotyped parents, the naive model-based estimator of the variance of the coefficient estimates underestimates the true variance. However, simulations with realistic effect sizes of risk variants and variation of this effect from family to family reveal that the underestimation is negligible. The simulations also show the greater efficiency of the model-based variance estimator compared to a robust empirical estimator. Our recommendation is therefore to use the model-based estimator of variance for inference on effects of genetic variants.
Resilience model for parents of children with cancer in mainland China-An exploratory study.
Ye, Zeng Jie; Qiu, Hong Zhong; Li, Peng Fei; Liang, Mu Zi; Wang, Shu Ni; Quan, Xiao Ming
2017-04-01
Parents have psychosocial functions that are critical for the entire family. Therefore, when their child is diagnosed with cancer, it is important that they exhibit resilience, which is the ability to preserve their emotional and physical well-being in the face of stress. The Resilience Model for Parents of Children with Cancer (RMP-CC) was developed to increase our understanding of how resilience is positively and negatively affected by protective and risk factors, respectively, in Chinese parents with children diagnosed with cancer. To evaluate the RMP-CC, the latent psychosocial variables and demographics of 229 parents were evaluated using exploratory structural equation modeling (SEM) and logistic regression. The majority of goodness-of-fit indices indicate that the SEM of RMP-CC was a good model with a high level of variance in resilience (58%). Logistic regression revealed that two demographics, educational level and clinical classification of cancer, accounted for 12% of this variance. Our results indicate that RMP-CC is an effective structure by which to develop mainland Chinese parent-focused interventions that are grounded in the experiences of the parents as caregivers of children who have been diagnosed with cancer. RMP-CC allows for a better understanding of what these parents experience while their children undergo treatment. Further studies will be needed to confirm the efficiency of the current structure, and would assist in further refinement of its clinical applications. Copyright © 2017 Elsevier Ltd. All rights reserved.
Low, Ashley; Dixon, Shannan; Higgs, Amanda; Joines, Jessica; Hippman, Catriona
2018-02-01
Mental illness is extremely common and genetic counselors frequently see patients with mental illness. Genetic counselors report discomfort in providing psychiatric genetic counseling (GC), suggesting the need to look critically at training for psychiatric GC. This study aimed to investigate psychiatric GC training and its impact on perceived preparedness to provide psychiatric GC (preparedness). Current students and recent graduates were invited to complete an anonymous survey evaluating psychiatric GC training and outcomes. Bivariate correlations (p<.10) identified variables for inclusion in a logistic regression model to predict preparedness. Data were checked for assumptions underlying logistic regression. The logistic regression model for the 286 respondents [χ²(8)=84.87, p<.001] explained between 37.1% (Cox & Snell R²=.371) and 49.7% (Nagelkerke R²=.497) of the variance in preparedness scores. More frequent psychiatric GC instruction (OR=5.13), more active methods for practicing risk assessment (OR=4.43), and education on providing resources for mental illness (OR=4.99) made uniquely significant contributions to the model (p<.001). Responses to open-ended questions revealed interest in further psychiatric GC training, particularly enabling "hands on" experience. This exploratory study suggests that enriching GC training through more frequent psychiatric GC instruction and more active opportunities to practice psychiatric GC skills will support students in feeling more prepared to provide psychiatric GC after graduation.
Seligman, D A; Pullinger, A G
2000-01-01
Confusion about the relationship of occlusion to temporomandibular disorders (TMD) persists. This study attempted to identify occlusal and attrition factors plus age that would characterize asymptomatic normal female subjects. A total of 124 female patients with intracapsular TMD were compared with 47 asymptomatic female controls for associations to 9 occlusal factors, 3 attrition severity measures, and age using classification tree, multiple stepwise logistic regression, and univariate analyses. Models were tested for accuracy (sensitivity and specificity) and total contribution to the variance. The classification tree model had 4 terminal nodes that used only anterior attrition and age. "Normals" were mainly characterized by low attrition levels, whereas patients had higher attrition and tended to be younger. The tree model was only moderately useful (sensitivity 63%, specificity 94%) in predicting normals. The logistic regression model incorporated unilateral posterior crossbite and mediotrusive attrition severity in addition to the 2 factors in the tree, but was slightly less accurate than the tree (sensitivity 51%, specificity 90%). When only occlusal factors were considered in the analysis, normals were additionally characterized by a lack of anterior open bite, smaller overjet, and smaller RCP-ICP slides. The log likelihood accounted for was similar for both the tree (pseudo R² = 29.38%; mean deviance = 0.95) and the multiple logistic regression (Cox-Snell R² = 30.3%, mean deviance = 0.84) models. The occlusal and attrition factors studied were only moderately useful in differentiating normals from TMD patients.
Inferring microhabitat preferences of Lilium catesbaei (Liliaceae).
Sommers, Kristen Penney; Elswick, Michael; Herrick, Gabriel I; Fox, Gordon A
2011-05-01
Microhabitat studies use varied statistical methods, some treating site occupancy as a dependent and others as an independent variable. Using the rare Lilium catesbaei as an example, we show why approaches to testing hypotheses of differences between occupied and unoccupied sites can lead to erroneous conclusions about habitat preferences. Predictive approaches like logistic regression can better lead to understanding of habitat requirements. Using 32 lily locations and 30 random locations >2 m from a lily (complete data: 31 lily and 28 random spots), we measured physical conditions--photosynthetically active radiation (PAR), canopy cover, litter depth, distance to and height of nearest shrub, and soil moisture--and number and identity of neighboring plants. Twelve lilies were used to estimate a photosynthetic assimilation curve. Analyses used logistic regression, discriminant function analysis (DFA), (multivariate) analysis of variance, and resampled Wilcoxon tests. Logistic regression and DFA found identical predictors of presence (PAR, canopy cover, distance to shrub, litter), but hypothesis tests pointed to a different set (PAR, litter, canopy cover, height of nearest shrub). Lilies are mainly in high-PAR spots, often close to light saturation. By contrast, PAR in random spots was often near the lily light compensation point. Lilies were near Serenoa repens less than at random; otherwise, neighbor identity had no significant effect. Predictive methods are more useful in this context than the hypothesis tests. Light availability plays a big role in lily presence, which may help to explain increases in flowering and emergence after fire and roller-chopping.
Prunier, J G; Colyn, M; Legendre, X; Nimon, K F; Flamand, M C
2015-01-01
Direct gradient analyses in spatial genetics provide unique opportunities to describe the inherent complexity of genetic variation in wildlife species and are the object of many methodological developments. However, multicollinearity among explanatory variables is a systemic issue in multivariate regression analyses and is likely to cause serious difficulties in properly interpreting results of direct gradient analyses, with the risk of erroneous conclusions, misdirected research and inefficient or counterproductive conservation measures. Using simulated data sets along with linear and logistic regressions on distance matrices, we illustrate how commonality analysis (CA), a detailed variance-partitioning procedure that was recently introduced in the field of ecology, can be used to deal with nonindependence among spatial predictors. By decomposing model fit indices into unique and common (or shared) variance components, CA allows identifying the location and magnitude of multicollinearity, revealing spurious correlations and thus thoroughly improving the interpretation of multivariate regressions. Despite a few inherent limitations, especially in the case of resistance model optimization, this review highlights the great potential of CA to account for complex multicollinearity patterns in spatial genetics and identifies future applications and lines of research. We strongly urge spatial geneticists to systematically investigate commonalities when performing direct gradient analyses. © 2014 John Wiley & Sons Ltd.
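For the simplest case of two predictors, the commonality decomposition described in this abstract reduces to arithmetic on three R² values. The sketch below assumes the standard two-predictor identities (unique effect = full-model R² minus the R² of the model without that predictor; common effect = the remainder); the function name and the example R² values are hypothetical.

```python
def commonality_two_predictors(r2_x1, r2_x2, r2_both):
    """Split the full-model R^2 into unique and common components for two predictors."""
    unique_x1 = r2_both - r2_x2          # fit gained by adding x1 to a model with x2
    unique_x2 = r2_both - r2_x1          # fit gained by adding x2 to a model with x1
    common = r2_x1 + r2_x2 - r2_both     # fit the two predictors share
    return {"unique_x1": unique_x1, "unique_x2": unique_x2, "common": common}


if __name__ == "__main__":
    # Illustrative fit indices from three regressions: y~x1, y~x2, y~x1+x2.
    parts = commonality_two_predictors(r2_x1=0.30, r2_x2=0.25, r2_both=0.40)
    print(parts)
```

The three components always sum to the full-model R². A negative common component is possible and is usually read as a sign of suppression rather than as an error.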
Item Response Theory Modeling of the Philadelphia Naming Test.
Fergadiotis, Gerasimos; Kellough, Stacey; Hula, William D
2015-06-01
In this study, we investigated the fit of the Philadelphia Naming Test (PNT; Roach, Schwartz, Martin, Grewal, & Brecher, 1996) to an item-response-theory measurement model, estimated the precision of the resulting scores and item parameters, and provided a theoretical rationale for the interpretation of PNT overall scores by relating explanatory variables to item difficulty. This article describes the statistical model underlying the computer adaptive PNT presented in a companion article (Hula, Kellough, & Fergadiotis, 2015). Using archival data, we evaluated the fit of the PNT to 1- and 2-parameter logistic models and examined the precision of the resulting parameter estimates. We regressed the item difficulty estimates on three predictor variables: word length, age of acquisition, and contextual diversity. The 2-parameter logistic model demonstrated marginally better fit, but the fit of the 1-parameter logistic model was adequate. Precision was excellent for both person ability and item difficulty estimates. Word length, age of acquisition, and contextual diversity all independently contributed to variance in item difficulty. Item-response-theory methods can be productively used to analyze and quantify anomia severity in aphasia. Regression of item difficulty on lexical variables supported the validity of the PNT and interpretation of anomia severity scores in the context of current word-finding models.
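The 1- and 2-parameter logistic models named in this abstract differ only in whether each item has its own discrimination. A minimal sketch, assuming the standard logistic item response function P(correct) = 1 / (1 + exp(−a(θ − b))), with ability θ, item difficulty b, and discrimination a (fixed at 1 for the 1-parameter model here); the function names are hypothetical and nothing below reproduces the PNT estimates.

```python
from math import exp


def p_correct_2pl(theta, difficulty, discrimination=1.0):
    """Probability of a correct response under the 2-parameter logistic model."""
    return 1.0 / (1.0 + exp(-discrimination * (theta - difficulty)))


def p_correct_1pl(theta, difficulty):
    """1-parameter logistic model: the 2PL with a common discrimination of 1."""
    return p_correct_2pl(theta, difficulty, discrimination=1.0)


if __name__ == "__main__":
    # A person of average ability facing an easy, a matched, and a hard item.
    for b in (-1.0, 0.0, 1.0):
        print(b, round(p_correct_1pl(theta=0.0, difficulty=b), 3))
```

When ability equals difficulty the probability is 0.5, which is why regressing item difficulty on lexical variables, as the abstract describes, speaks directly to what makes an item hard to name.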
Personality traits and psychotic symptoms in recent onset of psychosis patients.
Sevilla-Llewellyn-Jones, Julia; Cano-Domínguez, Pablo; de-Luis-Matilla, Antonia; Peñuelas-Calvo, Inmaculada; Espina-Eizaguirre, Alberto; Moreno-Kustner, Berta; Ochoa, Susana
2017-04-01
Personality in patients with psychosis, and particularly its relation to psychotic symptoms in recent onset of psychosis (ROP) patients, is understudied. The aims of this research were to study the relation between dimensional and categorical clinical personality traits and symptoms, as well as the effects that symptoms, sex and age have on clinically significant personality traits. Data for these analyses were obtained from 94 ROP patients. The Millon Clinical Multiaxial Inventory and the Positive and Negative Syndrome Scale were used to assess personality and symptoms. Correlational analysis, the Mann-Whitney test, and, finally, logistic regression were carried out. The negative dimension was higher in patients with schizoid traits. The excited dimension was lower for those with avoidant and depressive traits. The anxiety and depression dimension was higher for patients with dependent traits. The positive dimension was lower for patients with histrionic and higher for patients with compulsive traits. Logistic regression demonstrated that gender and the positive and negative dimensions explained 35.9% of the variance of the schizoid trait. The excited dimension explained 9.1% of the variance of the avoidant trait. The anxiety and depression dimension and age explained 31.3% of the dependent trait. Gender explained 11.6% of the histrionic trait, 14.5% of the narcissistic trait and 11.6% of the paranoid trait. Finally, gender and the positive dimension explained 16.1% of the compulsive trait. The study highlights the importance of studying personality in patients with psychosis as it broadens understanding of the patients themselves and the symptoms suffered. Copyright © 2017 Elsevier Inc. All rights reserved.
A note on variance estimation in random effects meta-regression.
Sidik, Kurex; Jonkman, Jeffrey N
2005-01-01
For random effects meta-regression inference, variance estimation for the parameter estimates is discussed. Because estimated weights are used for meta-regression analysis in practice, the assumed or estimated covariance matrix used in meta-regression is not strictly correct, due to possible errors in estimating the weights. Therefore, this note investigates the use of a robust variance estimation approach for obtaining variances of the parameter estimates in random effects meta-regression inference. This method treats the assumed covariance matrix of the effect measure variables as a working covariance matrix. Using an example of meta-analysis data from clinical trials of a vaccine, the robust variance estimation approach is illustrated in comparison with two other methods of variance estimation. A simulation study is presented, comparing the three methods of variance estimation in terms of bias and coverage probability. We find that, despite the seeming suitability of the robust estimator for random effects meta-regression, the improved variance estimator of Knapp and Hartung (2003) yields the best performance among the three estimators, and thus may provide the best protection against errors in the estimated weights.
Myer, Gregory D; Ford, Kevin R; Khoury, Jane; Succop, Paul; Hewett, Timothy E
2014-01-01
Objective Knee abduction moment (KAM) during landing predicts non-contact anterior cruciate ligament (ACL) injury risk with high sensitivity and specificity in female athletes. The purpose of this study was to employ sensitive laboratory (lab-based) tools to determine predictive mechanisms that underlie increased KAM during landing. Methods Female basketball and soccer players (N=744) from a single county public school district were recruited to participate in testing of anthropometrics, maturation, laxity/flexibility, strength and landing biomechanics. Linear regression was used to model KAM, and logistic regression was used to examine high (>25.25 Nm of KAM) versus low KAM as surrogate for ACL injury risk. Results The most parsimonious model included independent predictors (β±1 SE) (1) peak knee abduction angle (1.78±0.05; p<0.001), (2) peak knee extensor moment (0.17±0.01; p<0.001), (3) knee flexion range of motion (0.15±0.03; p<0.01), (4) body mass index (BMI) Z-score (−1.67±0.36; p<0.001) and (5) tibia length (−0.50±0.14; p<0.001) and accounted for 78% of the variance in KAM during landing. The logistic regression model that employed these same variables predicted high KAM status with 85% sensitivity and 93% specificity and a C-statistic of 0.96. Conclusions Increased knee abduction angle, quadriceps recruitment, tibia length and BMI with decreased knee flexion account for 80% of the measured variance in KAM during a drop vertical jump. Clinical relevance Females who demonstrate increased KAM are more responsive and more likely to benefit from neuromuscular training. These findings should significantly enhance the identification of those at increased risk and facilitate neuromuscular training targeted to this important risk factor (high KAM) for ACL injury. PMID:20558526
2014-01-01
Background Meta-regression is becoming increasingly used to model study level covariate effects. However this type of statistical analysis presents many difficulties and challenges. Here two methods for calculating confidence intervals for the magnitude of the residual between-study variance in random effects meta-regression models are developed. A further suggestion for calculating credible intervals using informative prior distributions for the residual between-study variance is presented. Methods Two recently proposed and, under the assumptions of the random effects model, exact methods for constructing confidence intervals for the between-study variance in random effects meta-analyses are extended to the meta-regression setting. The use of Generalised Cochran heterogeneity statistics is extended to the meta-regression setting and a Newton-Raphson procedure is developed to implement the Q profile method for meta-analysis and meta-regression. WinBUGS is used to implement informative priors for the residual between-study variance in the context of Bayesian meta-regressions. Results Results are obtained for two contrasting examples, where the first example involves a binary covariate and the second involves a continuous covariate. Intervals for the residual between-study variance are wide for both examples. Conclusions Statistical methods, and R computer software, are available to compute exact confidence intervals for the residual between-study variance under the random effects model for meta-regression. These frequentist methods are almost as easily implemented as their established counterparts for meta-analysis. Bayesian meta-regressions are also easily performed by analysts who are comfortable using WinBUGS. Estimates of the residual between-study variance in random effects meta-regressions should be routinely reported and accompanied by some measure of their uncertainty. Confidence and/or credible intervals are well-suited to this purpose. PMID:25196829
Are prescription drug insurance choices consistent with expected utility theory?
Bundorf, M Kate; Mata, Rui; Schoenbaum, Michael; Bhattacharya, Jay
2013-09-01
To determine the extent to which people make choices inconsistent with expected utility theory when choosing among prescription drug insurance plans and whether tabular or graphical presentation format influences the consistency of their choices. Members of an Internet-enabled panel chose between two Medicare prescription drug plans. The "low variance" plan required higher out-of-pocket payments for the drugs respondents usually took but lower out-of-pocket payments for the drugs they might need if they developed a new health condition than the "high variance" plan. The probability of a change in health varied within subjects and the presentation format (text vs. graphical) and the affective salience of the clinical condition (abstract vs. risk related to specific clinical condition) varied between subjects. Respondents were classified based on whether they consistently chose either the low or high variance plan. Logistic regression models were estimated to examine the relationship between decision outcomes and task characteristics. The majority of respondents consistently chose either the low or high variance plan, consistent with expected utility theory. Half of respondents consistently chose the low variance plan. Respondents were less likely to make discrepant choices when information was presented in graphical format. Many people, although not all, make choices consistent with expected utility theory when they have information on differences among plans in the variance of out-of-pocket spending. Medicare beneficiaries would benefit from information on the extent to which prescription drug plans provide risk protection. PsycINFO Database Record (c) 2013 APA, all rights reserved.
Griffiths, Mark D.; Sinha, Rajita; Hetland, Jørn
2016-01-01
Despite the large number of studies examining workaholism, large-scale studies have been lacking. The present study utilized an open web-based cross-sectional survey assessing symptoms of psychiatric disorders and workaholism among 16,426 workers (Mage = 37.3 years, SD = 11.4, range = 16–75 years). Participants were administered the Adult ADHD Self-Report Scale, the Obsessive-Compulsive Inventory-Revised, the Hospital Anxiety and Depression Scale, and the Bergen Work Addiction Scale, along with additional questions examining demographic and work-related variables. Correlations between workaholism and all psychiatric disorder symptoms were positive and significant. Workaholism comprised the dependent variable in a three-step linear multiple hierarchical regression analysis. Basic demographics (age, gender, relationship status, and education) explained 1.2% of the variance in workaholism, whereas work demographics (work status, position, sector, and annual income) explained an additional 5.4% of the variance. Age (inversely) and managerial positions (positively) were of most importance. The psychiatric symptoms (ADHD, OCD, anxiety, and depression) explained 17.0% of the variance. ADHD and anxiety contributed considerably. The prevalence rate of workaholism status was 7.8% of the present sample. In an adjusted logistic regression analysis, all psychiatric symptoms were positively associated with being a workaholic. The independent variables explained between 6.1% and 14.4% in total of the variance in workaholism cases. Although most effect sizes were relatively small, the study’s findings expand our understanding of possible psychiatric predictors of workaholism, and particularly shed new insight into the reality of adult ADHD in work life. The study’s implications, strengths, and shortcomings are also discussed. PMID:27192149
Candel, Math J J M; Van Breukelen, Gerard J P
2010-06-30
Adjustments of sample size formulas are given for varying cluster sizes in cluster randomized trials with a binary outcome when testing the treatment effect with mixed effects logistic regression using second-order penalized quasi-likelihood estimation (PQL). Starting from first-order marginal quasi-likelihood (MQL) estimation of the treatment effect, the asymptotic relative efficiency of unequal versus equal cluster sizes is derived. A Monte Carlo simulation study shows this asymptotic relative efficiency to be rather accurate for realistic sample sizes, when employing second-order PQL. An approximate, simpler formula is presented to estimate the efficiency loss due to varying cluster sizes when planning a trial. In many cases sampling 14 per cent more clusters is sufficient to repair the efficiency loss due to varying cluster sizes. Since current closed-form formulas for sample size calculation are based on first-order MQL, planning a trial also requires a conversion factor to obtain the variance of the second-order PQL estimator. In a second Monte Carlo study, this conversion factor turned out to be 1.25 at most. (c) 2010 John Wiley & Sons, Ltd.
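The rule of thumb quoted above (sampling 14 per cent more clusters) is simple arithmetic, and the following minimal sketch shows one way to apply it at the planning stage. The function name and the integer-percentage interface are hypothetical and do not come from the paper; the sketch covers only the cluster-count inflation, not the first-order MQL to second-order PQL conversion factor.

```python
def inflate_clusters(planned_clusters, extra_percent=14):
    """Round up the cluster count after adding extra_percent more clusters.

    planned_clusters: number of clusters from an equal-cluster-size formula.
    extra_percent: integer percentage of extra clusters (14 is the figure
    quoted in the abstract for "many cases"; other designs may need more).
    Integer arithmetic avoids floating-point rounding surprises.
    """
    return (planned_clusters * (100 + extra_percent) + 99) // 100
```

For a trial planned with 50 equally sized clusters, this gives 57 clusters once varying cluster sizes are allowed for.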
Laboratory test variables useful for distinguishing upper from lower gastrointestinal bleeding.
Tomizawa, Minoru; Shinozaki, Fuminobu; Hasegawa, Rumiko; Shirai, Yoshinori; Motoyoshi, Yasufumi; Sugiyama, Takao; Yamamoto, Shigenori; Ishige, Naoki
2015-05-28
To distinguish upper from lower gastrointestinal (GI) bleeding. Patient records between April 2011 and March 2014 were analyzed retrospectively (3296 upper endoscopy, and 1520 colonoscopy). Seventy-six patients had upper GI bleeding (Upper group) and 65 had lower GI bleeding (Lower group). Variables were compared between the groups using one-way analysis of variance. Logistic regression was performed to identify variables significantly associated with the diagnosis of upper vs lower GI bleeding. Receiver-operator characteristic (ROC) analysis was performed to determine the threshold value that could distinguish upper from lower GI bleeding. Hemoglobin (P = 0.023), total protein (P = 0.0002), and lactate dehydrogenase (P = 0.009) were significantly lower in the Upper group than in the Lower group. Blood urea nitrogen (BUN) was higher in the Upper group than in the Lower group (P = 0.0065). Logistic regression analysis revealed that BUN was most strongly associated with the diagnosis of upper vs lower GI bleeding. ROC analysis revealed a threshold BUN value of 21.0 mg/dL, with a specificity of 93.0%. The threshold BUN value for distinguishing upper from lower GI bleeding was 21.0 mg/dL.
Laboratory test variables useful for distinguishing upper from lower gastrointestinal bleeding
Tomizawa, Minoru; Shinozaki, Fuminobu; Hasegawa, Rumiko; Shirai, Yoshinori; Motoyoshi, Yasufumi; Sugiyama, Takao; Yamamoto, Shigenori; Ishige, Naoki
2015-01-01
AIM: To distinguish upper from lower gastrointestinal (GI) bleeding. METHODS: Patient records between April 2011 and March 2014 were analyzed retrospectively (3296 upper endoscopy, and 1520 colonoscopy). Seventy-six patients had upper GI bleeding (Upper group) and 65 had lower GI bleeding (Lower group). Variables were compared between the groups using one-way analysis of variance. Logistic regression was performed to identify variables significantly associated with the diagnosis of upper vs lower GI bleeding. Receiver-operator characteristic (ROC) analysis was performed to determine the threshold value that could distinguish upper from lower GI bleeding. RESULTS: Hemoglobin (P = 0.023), total protein (P = 0.0002), and lactate dehydrogenase (P = 0.009) were significantly lower in the Upper group than in the Lower group. Blood urea nitrogen (BUN) was higher in the Upper group than in the Lower group (P = 0.0065). Logistic regression analysis revealed that BUN was most strongly associated with the diagnosis of upper vs lower GI bleeding. ROC analysis revealed a threshold BUN value of 21.0 mg/dL, with a specificity of 93.0%. CONCLUSION: The threshold BUN value for distinguishing upper from lower GI bleeding was 21.0 mg/dL. PMID:26034359
A Formula to Calculate Standard Liver Volume Using Thoracoabdominal Circumference.
Shaw, Brian I; Burdine, Lyle J; Braun, Hillary J; Ascher, Nancy L; Roberts, John P
2017-12-01
With the use of split liver grafts as well as living donor liver transplantation (LDLT) it is imperative to know the minimum graft volume to avoid complications. Most current formulas to predict standard liver volume (SLV) rely on weight-based measures that are likely inaccurate in the setting of cirrhosis. Therefore, we sought to create a formula for estimating SLV without weight-based covariates. LDLT donors underwent computed tomography scan volumetric evaluation of their livers. An optimal formula for calculating SLV using the anthropomorphic measure thoracoabdominal circumference (TAC) was determined using leave-one-out cross-validation. The ability of this formula to correctly predict liver volume was checked against other existing formulas by analysis of variance. The ability of the formula to predict small grafts in LDLT was evaluated by exact logistic regression. The optimal formula using TAC was determined to be SLV = (TAC × 3.5816) - (Age × 3.9844) - (Sex × 109.7386) - 934.5949. When compared to historic formulas, the current formula was the only one which was not significantly different than computed tomography determined liver volumes when compared by analysis of variance with Dunnett posttest. When evaluating the ability of the formula to predict small for size syndrome, many (10/16) of the formulas tested had significant results by exact logistic regression, with our formula predicting small for size syndrome with an odds ratio of 7.94 (95% confidence interval, 1.23-91.36; P = 0.025). We report a formula for calculating SLV that does not rely on weight-based variables that has good ability to predict SLV and identify patients with potentially small grafts.
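The reported formula can be evaluated directly, as in the minimal sketch below. The abstract does not state the units of TAC and age or how Sex is coded, so the caller must supply values in whatever units and coding the original study used; the function name is hypothetical.

```python
def standard_liver_volume(tac, age, sex):
    """Evaluate the SLV formula quoted in the abstract.

    Assumption: tac, age and sex are given in the units and coding used
    by the original study, which the abstract does not specify.
    """
    return tac * 3.5816 - age * 3.9844 - sex * 109.7386 - 934.5949
```

The coefficients are copied verbatim from the abstract; only the argument names are illustrative.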
Toldson, Ivory A; Ray, Kilynda; Hatcher, Schnavia Smith; Louis, Laura Straughn
2011-01-01
This study examines disparities in the long-term health, emotional well-being, and economic consequences of the 2005 Gulf Coast hurricanes. Researchers analyzed the responses of 216 Black and 508 White Hurricane Katrina survivors who participated in the ABC News Hurricane Katrina Anniversary Poll in 2006. Self-reported data of the long-term negative impact of the hurricane on personal health, emotional well-being, and finances were regressed on race, income, and measures of loss, injury, family mortality, anxiety, and confidence in the government. Descriptive analyses, stepwise logistic regression, and analyses of variance revealed that Black hurricane survivors more frequently reported hurricane-related problems with personal health, emotional well-being, and finances. In addition, Blacks were more likely than Whites to report the loss of friends, relatives, and personal property.
Parker, Kristin M; Wilson, Mark G; Vandenberg, Robert J; DeJoy, David M; Orpinas, Pamela
2009-10-01
This study tests the hypothesis that employees with comorbid physical health conditions and mental health symptoms are less productive than other employees. Self-reported health status and productivity measures were collected from 1723 employees of a national retail organization. Chi-square tests, analysis of variance, and linear contrast analyses were conducted to evaluate whether health status groups differed on productivity measures. Multivariate linear regression and multinomial logistic regression analyses were conducted to analyze how predictive health status was of productivity. Those with comorbidities were significantly less productive on all productivity measures compared with all other health status groups and those with only physical health conditions or mental health symptoms. Health status also significantly predicted levels of employee productivity. These findings provide evidence for the relationship between health statuses and productivity, which has potential programmatic implications.
Hinton, Devon E; Chhean, Dara; Fama, Jeanne M; Pollack, Mark H; McNally, Richard J
2007-01-01
Among Cambodian refugees attending a psychiatric clinic, we assessed psychopathology associated with gastrointestinal panic (GIP), and investigated possible causal mechanisms, including "fear of fear" and GIP-associated flashbacks and catastrophic cognitions. GIP (n=46) patients had greater psychopathology (Clinician-Administered PTSD Scale [CAPS] and Symptom Checklist-90-R [SCL]) and "fear of fear" (Anxiety Sensitivity Index [ASI]) than did non-GIP patients (n=84). Logistic regression revealed that general psychopathology (SCL; odds ratio=4.1) and fear of anxiety-related sensations (ASI; odds ratio=2.4) predicted the presence of GIP. Among GIP patients, a hierarchical regression revealed that GIP-associated trauma recall and catastrophic cognitions explained variance in GIP severity beyond a measure of general psychopathology (SCL). A mediational analysis indicated that SCL's effect on GIP severity was mediated by GIP-associated flashbacks and catastrophic cognitions.
NASA Astrophysics Data System (ADS)
Ariffin, Syaiba Balqish; Midi, Habshah
2014-06-01
This article is concerned with the performance of the logistic ridge regression estimation technique in the presence of multicollinearity and high leverage points. In logistic regression, multicollinearity exists among predictors and in the information matrix. The maximum likelihood estimator suffers a huge setback in the presence of multicollinearity, which causes regression estimates to have unduly large standard errors. To remedy this problem, a logistic ridge regression estimator is put forward. It is evident that the logistic ridge regression estimator outperforms the maximum likelihood approach for handling multicollinearity. The effect of high leverage points on the performance of the logistic ridge regression estimator is then investigated through a real data set and a simulation study. The findings signify that the logistic ridge regression estimator fails to provide better parameter estimates in the presence of both high leverage points and multicollinearity.
Lindström, Martin; Moghaddassi, Mahnaz; Bolin, Kristian; Lindgren, Björn; Merlo, Juan
2003-01-01
The aim of this study was to investigate the influence of contextual and individual factors on daily tobacco smoking. The public-health survey in Malmö 1994 is a cross-sectional study. A total of 5600 individuals aged 20-80 years were invited to answer a postal questionnaire. The participation rate was 71%. A multilevel logistic regression model, with individuals at the first level and neighbourhoods at the second, was performed. We analysed the effect (intra-area correlation, cross-level modification and odds ratios) of individual and neighbourhood factors on smoking after adjustment for individual factors. Neighbourhood factors accounted for 2.5% of the crude total variance in daily tobacco smoking. This effect was significantly reduced when the individual factors such as education were included in the model. However, individual social capital, measured by social participation, only marginally affected the total neighbourhood variance in daily tobacco smoking. In fact, no significant variance in daily tobacco smoking remained after the introduction of the individual factors other than individual social capital in the model. In Malmö, the neighbourhood variance in daily tobacco smoking is mainly affected by individual factors other than individual social capital, especially socioeconomic status measured as level of education.
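One common way to express the neighbourhood share of variance in a two-level logistic model is the latent-variable intraclass correlation, which fixes the individual-level variance at π²/3 (the variance of the standard logistic distribution). The abstract does not say which definition the authors used, so the sketch below is an illustration under that assumption, and both function names are hypothetical.

```python
import math

# Variance of the standard logistic distribution, used as the
# level-1 (individual) variance in the latent-variable formulation.
LOGISTIC_RESIDUAL_VARIANCE = math.pi ** 2 / 3


def latent_icc(sigma2_u):
    """Share of latent variance attributable to the level-2 units."""
    return sigma2_u / (sigma2_u + LOGISTIC_RESIDUAL_VARIANCE)


def between_variance_for_icc(icc):
    """Level-2 variance on the logit scale implied by a given share."""
    return icc * LOGISTIC_RESIDUAL_VARIANCE / (1 - icc)
```

Under this definition, a share of 2.5% corresponds to a neighbourhood-level variance of roughly 0.084 on the logit scale.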
Lyles, Robert H.; Mitchell, Emily M.; Weinberg, Clarice R.; Umbach, David M.; Schisterman, Enrique F.
2016-01-01
Summary Potential reductions in laboratory assay costs afforded by pooling equal aliquots of biospecimens have long been recognized in disease surveillance and epidemiological research and, more recently, have motivated design and analytic developments in regression settings. For example, Weinberg and Umbach (1999, Biometrics 55, 718–726) provided methods for fitting set-based logistic regression models to case-control data when a continuous exposure variable (e.g., a biomarker) is assayed on pooled specimens. We focus on improving estimation efficiency by utilizing available subject-specific information at the pool allocation stage. We find that a strategy that we call “(y,c)-pooling,” which forms pooling sets of individuals within strata defined jointly by the outcome and other covariates, provides more precise estimation of the risk parameters associated with those covariates than does pooling within strata defined only by the outcome. We review the approach to set-based analysis through offsets developed by Weinberg and Umbach in a recent correction to their original paper. We propose a method for variance estimation under this design and use simulations and a real-data example to illustrate the precision benefits of (y,c)-pooling relative to y-pooling. We also note and illustrate that set-based models permit estimation of covariate interactions with exposure. PMID:26964741
Hu, Wen
2017-06-01
In November 2010 and October 2013, Utah increased speed limits on sections of rural interstates from 75 to 80mph. Effects on vehicle speeds and speed variance were examined. Speeds were measured in May 2010 and May 2014 within the new 80mph zones, and at a nearby spillover site and at more distant control sites where speed limits remained 75mph. Log-linear regression models estimated percentage changes in speed variance and mean speeds for passenger vehicles and large trucks associated with the speed limit increase. Logistic regression models estimated effects on the probability of passenger vehicles exceeding 80, 85, or 90mph and large trucks exceeding 80mph. Within the 80mph zones and at the spillover location in 2014, mean passenger vehicle speeds were significantly higher (4.1% and 3.5%, respectively), as were the probabilities that passenger vehicles exceeded 80mph (122.3% and 88.5%, respectively), than would have been expected without the speed limit increase. Probabilities that passenger vehicles exceeded 85 and 90mph were non-significantly higher than expected within the 80mph zones. For large trucks, the mean speed and probability of exceeding 80mph were higher than expected within the 80mph zones. Only the increase in mean speed was significant. Raising the speed limit was associated with non-significant increases in speed variance. The study adds to the wealth of evidence that increasing speed limits leads to higher travel speeds and an increased probability of exceeding the new speed limit. Results moreover contradict the claim that increasing speed limits reduces speed variance. Although the estimated increases in mean vehicle speeds may appear modest, prior research suggests such increases would be associated with substantial increases in fatal or injury crashes. This should be considered by lawmakers considering increasing speed limits. Copyright © 2017 Elsevier Ltd and National Safety Council. All rights reserved.
Sample size determination for logistic regression on a logit-normal distribution.
Kim, Seongho; Heath, Elisabeth; Heilbrun, Lance
2017-06-01
Although the sample size for simple logistic regression can be readily determined using currently available methods, the sample size calculation for multiple logistic regression requires some additional information, such as the coefficient of determination ([Formula: see text]) of a covariate of interest with other covariates, which is often unavailable in practice. The response variable of logistic regression follows a logit-normal distribution which can be generated from a logistic transformation of a normal distribution. Using this property of logistic regression, we propose new methods of determining the sample size for simple and multiple logistic regressions using a normal transformation of outcome measures. Simulation studies and a motivating example show several advantages of the proposed methods over the existing methods: (i) no need for [Formula: see text] for multiple logistic regression, (ii) available interim or group-sequential designs, and (iii) much smaller required sample size.
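The property the abstract relies on, that a logit-normal variable is a logistic transformation of a normal variable, can be illustrated with a few lines of code. This is only a sketch of the distributional fact, not the authors' sample size method; the function name, parameters, and seed are hypothetical.

```python
import math
import random


def sample_logit_normal(mu, sigma, n, seed=0):
    """Draw n logit-normal values by transforming normal draws.

    Each value is expit(z) = 1 / (1 + exp(-z)) with z ~ Normal(mu, sigma),
    so every draw lies between 0 and 1.
    """
    rng = random.Random(seed)
    return [1.0 / (1.0 + math.exp(-rng.gauss(mu, sigma))) for _ in range(n)]
```

Applying the inverse transformation, log(p / (1 - p)), to these draws recovers values on the normal scale, which is the scale on which the proposed sample size calculations are described as operating.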
Staley, James R; Jones, Edmund; Kaptoge, Stephen; Butterworth, Adam S; Sweeting, Michael J; Wood, Angela M; Howson, Joanna M M
2017-06-01
Logistic regression is often used instead of Cox regression to analyse genome-wide association studies (GWAS) of single-nucleotide polymorphisms (SNPs) and disease outcomes with cohort and case-cohort designs, as it is less computationally expensive. Although Cox and logistic regression models have been compared previously in cohort studies, this work does not completely cover the GWAS setting nor extend to the case-cohort study design. Here, we evaluated Cox and logistic regression applied to cohort and case-cohort genetic association studies using simulated data and genetic data from the EPIC-CVD study. In the cohort setting, there was a modest improvement in power to detect SNP-disease associations using Cox regression compared with logistic regression, which increased as the disease incidence increased. In contrast, logistic regression had more power than (Prentice weighted) Cox regression in the case-cohort setting. Logistic regression yielded inflated effect estimates (assuming the hazard ratio is the underlying measure of association) for both study designs, especially for SNPs with greater effect on disease. Given logistic regression is substantially more computationally efficient than Cox regression in both settings, we propose a two-step approach to GWAS in cohort and case-cohort studies. First to analyse all SNPs with logistic regression to identify associated variants below a pre-defined P-value threshold, and second to fit Cox regression (appropriately weighted in case-cohort studies) to those identified SNPs to ensure accurate estimation of association with disease.
Bejaei, M; Wiseman, K; Cheng, K M
2015-01-01
Consumers' interest in specialty eggs appears to be growing in Europe and North America. The objective of this research was to develop logistic regression models that utilise purchaser attributes and demographics to predict the probability of a consumer purchasing a specific type of table egg including regular (white and brown), non-caged (free-run, free-range and organic) or nutrient-enhanced eggs. These purchase prediction models, together with the purchasers' attributes, can be used to assess market opportunities of different egg types specifically in British Columbia (BC). An online survey was used to gather data for the models. A total of 702 completed questionnaires were submitted by BC residents. Selected independent variables were included in the logistic regression to develop models for different egg types to predict the probability of a consumer purchasing a specific type of table egg. The variables used in the model accounted for 54% and 49% of variances in the purchase of regular and non-caged eggs, respectively. Research results indicate that consumers of different egg types exhibit a set of unique and statistically significant characteristics and/or demographics. For example, consumers of regular eggs were less educated, older, price sensitive, major chain store buyers, and store flyer users, and had lower awareness about different types of eggs and less concern regarding animal welfare issues. However, most of the non-caged egg consumers were less concerned about price, had higher awareness about different types of table eggs, purchased their eggs from local/organic grocery stores, farm gates or farmers markets, and they were more concerned about care and feeding of hens compared to consumers of other egg types.
The crux of the method: assumptions in ordinary least squares and logistic regression.
Long, Rebecca G
2008-10-01
Logistic regression has increasingly become the tool of choice when analyzing data with a binary dependent variable. While resources relating to the technique are widely available, clear discussions of why logistic regression should be used in place of ordinary least squares regression are difficult to find. The current paper compares and contrasts the assumptions of ordinary least squares with those of logistic regression and explains why logistic regression's looser assumptions make it adept at handling violations of the more important assumptions in ordinary least squares.
Relationships of Measurement Error and Prediction Error in Observed-Score Regression
ERIC Educational Resources Information Center
Moses, Tim
2012-01-01
The focus of this paper is assessing the impact of measurement errors on the prediction error of an observed-score regression. Measures are presented and described for decomposing the linear regression's prediction error variance into parts attributable to the true score variance and the error variances of the dependent variable and the predictor…
Using Dominance Analysis to Determine Predictor Importance in Logistic Regression
ERIC Educational Resources Information Center
Azen, Razia; Traxel, Nicole
2009-01-01
This article proposes an extension of dominance analysis that allows researchers to determine the relative importance of predictors in logistic regression models. Criteria for choosing logistic regression R² analogues were determined and measures were selected that can be used to perform dominance analysis in logistic regression. A…
Social support and support groups among people with HIV/AIDS in Ghana.
Abrefa-Gyan, Tina; Wu, Liyun; Lewis, Marilyn W
2016-01-01
HIV/AIDS, a chronic burden in Ghana, poses social and health outcome concerns to those infected. Examining the Medical Outcome Study Social Support Survey (MOS-SSS) instrument among 300 Ghanaians from a cross-sectional design, Principal Component Analysis yielded four factors (positive interaction, trust building, information giving, and essential support), which accounted for 85.73% of the total variance in the MOS-SSS. A logistic regression analysis showed that essential support was the strongest predictor of the length of time an individual stayed in the support group, whereas positive interaction indicated negative association. The study's implications for policy, research, and practice were discussed.
Applying Kaplan-Meier to Item Response Data
ERIC Educational Resources Information Center
McNeish, Daniel
2018-01-01
Some IRT models can be equivalently modeled in alternative frameworks such as logistic regression. Logistic regression can also model time-to-event data, which concerns the probability of an event occurring over time. Using the relation between time-to-event models and logistic regression and the relation between logistic regression and IRT, this…
Mental Health Correlates of Cigarette Use in LGBT Individuals in the Southeastern United States.
Drescher, Christopher F; Lopez, Eliot J; Griffin, James A; Toomey, Thomas M; Eldridge, Elizabeth D; Stepleman, Lara M
2018-05-12
Smoking prevalence for lesbian, gay, bisexual, and transgender (LGBT) individuals is higher than for heterosexual, cisgender individuals. Elevated smoking rates have been linked to psychiatric comorbidities, substance use, poverty, low education levels, and stress. This study examined mental health (MH) correlates of cigarette use in LGBT individuals residing in a metropolitan area in the southeastern United States. Participants were 335 individuals from an LGBT health needs assessment (mean age 34.7; SD = 13.5; 63% gay/lesbian; 66% Caucasian; 81% cisgender). Demographics, current/past psychiatric diagnoses, number of poor MH days in the last 30, the Patient Health Questionnaire (PHQ) 2 depression screener, the Three-Item Loneliness Scale, and frequency of cigarette use were included. Analyses included bivariate correlations, analysis of variance (ANOVA), and regression. Multiple demographic and MH factors were associated with smoker status and frequency of smoking. A logistic regression indicated that lower education and bipolar disorder were most strongly associated with being a smoker. For smokers, a hierarchical regression model including demographic and MH variables accounted for 17.6% of the variance in frequency of cigarette use. Only education, bipolar disorder, and the number of poor MH days were significant contributors in the overall model. Conclusions/Importance: Less education, bipolar disorder, and recurrent poor MH increase LGBT vulnerability to cigarette use. Access to LGBT-competent MH providers who can address culturally specific factors in tobacco cessation is crucial to reducing this health disparity.
Predictors of non- hookah smoking among high-school students based on prototype/willingness model.
Abedini, Sedigheh; MorowatiSharifabad, MohammadAli; Chaleshgar Kordasiabi, Mosharafeh; Ghanbarnejad, Amin
2014-01-01
The aim of the study was to determine predictors of refraining from hookah smoking among high-school students in Bandar Abbas, southern Iran, based on the Prototype/Willingness model. This cross-sectional study with an analytic approach was performed on 240 high-school students selected by cluster random sampling. Data on demographics and Prototype-Willingness Model constructs were acquired via a self-administered questionnaire. Data were analyzed using means, frequencies, correlation, and linear and logistic regression. Statistically significant determinants of the intention to refrain from hookah smoking were subjective norms, willingness, and attitude. The regression model indicated that the three items together explained 46.9% of the variance in the intention to refrain from hookah smoking. Attitude and subjective norms predicted 36.0% of that variance. There was a significant relationship between the participants' negative prototype of hookah smokers and the willingness to avoid hookah smoking (P=0.002). Willingness also predicted refraining from hookah smoking better than intention did (P<0.001). Designing interventions that strengthen the negative prototype of hookah smokers and reduce situations and conditions that facilitate hookah smoking, such as easy access to tobacco products in cafés and on beaches, can be useful for preventing hookah smoking among adolescents.
Lindström, Martin; Moghaddassi, Mahnaz; Merlo, Juan
2004-07-01
The influence of neighbourhood and individual factors on self-reported health was investigated. The public health survey in Malmö 1994 is a cross-sectional study. A total of 3,602 individuals aged 20-80 living in 75 neighbourhoods answered a postal questionnaire. The participation rate was 71%. A multilevel logistic regression model, with individuals at the first level and neighbourhoods at the second, was performed. We analysed the effect (intra-area correlation, cross-level modification and odds ratios) of neighbourhood on self-reported health after adjustment for individual factors. The neighbourhoods accounted for 2.8% of the crude total variance in self-reported health status. This effect was significantly reduced when individual factors such as country of origin, education and social participation were included in the model. In fact, no significant variance in self-reported health remained after the introduction of the individual factors in the model. In Malmö, the neighbourhood variance in self-reported health is mainly affected by individual factors, especially country of origin, socioeconomic status measured as level of education and individual social participation. Copyright 2004 The Institute for Cancer Prevention and Elsevier Inc.
The Use of Online Modules and the Effect on Student Outcomes in a High School Chemistry Class
NASA Astrophysics Data System (ADS)
Lamb, Richard L.; Annetta, Len
2013-10-01
The purpose of the study was to review the efficacy of online chemistry simulations in a high school chemistry class and provide discussion of the factors that may affect student learning. The sample consisted of 351 high school students exposed to online simulations. Researchers administered a pretest, intermediate test and posttest to measure chemistry content knowledge acquired during the use of online chemistry laboratory simulations. The authors also analyzed student journal entries as an attitudinal measure of chemistry during the simulation experience. The four analyses conducted were Repeated Time Measures Analysis of Variance, a three-way Analysis of Variance, Logistic Regression and Multiple Analysis of Variance. Each of these analyses provides for a slightly different aspect of factors regarding student attitudes and outcomes. Results indicate that there is a statistically significant main effect across grouping type (experimental versus control, p = 0.042, α = 0.05). Analysis of student journal entries suggests that attitudinal factors may affect student outcomes concerning the use of online supplemental instruction. Implications for this study show that the use of online simulations promotes increased understanding of chemistry content through open-ended and interactive questioning.
MIXOR: a computer program for mixed-effects ordinal regression analysis.
Hedeker, D; Gibbons, R D
1996-03-01
MIXOR provides maximum marginal likelihood estimates for mixed-effects ordinal probit, logistic, and complementary log-log regression models. These models can be used for analysis of dichotomous and ordinal outcomes from either a clustered or longitudinal design. For clustered data, the mixed-effects model assumes that data within clusters are dependent. The degree of dependency is jointly estimated with the usual model parameters, thus adjusting for dependence resulting from clustering of the data. Similarly, for longitudinal data, the mixed-effects approach can allow for individual-varying intercepts and slopes across time, and can estimate the degree to which these time-related effects vary in the population of individuals. MIXOR uses marginal maximum likelihood estimation, utilizing a Fisher-scoring solution. For the scoring solution, the Cholesky factor of the random-effects variance-covariance matrix is estimated, along with the effects of model covariates. Examples illustrating usage and features of MIXOR are provided.
Molecular markers of neuropsychological functioning and Alzheimer's disease.
Edwards, Melissa; Balldin, Valerie Hobson; Hall, James; O'Bryant, Sid
2015-03-01
The current project sought to examine molecular markers of neuropsychological functioning among elders with and without Alzheimer's disease (AD) and determine the predictive ability of combined molecular markers and select neuropsychological tests in detecting disease presence. Data were analyzed from 300 participants (n = 150, AD and n = 150, controls) enrolled in the Texas Alzheimer's Research and Care Consortium. Linear regression models were created to examine the link between the top five molecular markers from our AD blood profile and neuropsychological test scores. Logistic regressions were used to predict AD presence using serum biomarkers in combination with select neuropsychological measures. Using the neuropsychological test with the least amount of variance overlap with the molecular markers, the combined neuropsychological test and molecular markers were highly accurate in detecting AD presence. This work provides the foundation for the generation of a point-of-care device that can be used to screen for AD.
Adolescent Characters and Alcohol Use Scenes in Brazilian Movies, 2000-2008.
Castaldelli-Maia, João Mauricio; de Andrade, Arthur Guerra; Lotufo-Neto, Francisco; Bhugra, Dinesh
2016-04-01
Quantitative structured assessment of 193 scenes depicting substance use from a convenience sample of 50 Brazilian movies was performed. Logistic regression and analysis of variance or multivariate analysis of variance models were employed to test for two different types of outcome regarding alcohol appearance: the mean length of alcohol scenes in seconds and the prevalence of alcohol use scenes. The presence of adolescent characters was associated with a higher prevalence of alcohol use scenes compared to nonalcohol use scenes. The presence of adolescents was also associated with a higher than average length of alcohol use scenes compared to the nonalcohol use scenes. Alcohol use was negatively associated with cannabis, cocaine, and other drug use. However, when the use of cannabis, cocaine, or other drugs was present in the alcohol use scenes, a higher average length was found. This may mean that the most vulnerable group may see drinking as a more attractive option, leading to higher alcohol use. © The Author(s) 2016.
Characterizing nonconstant instrumental variance in emerging miniaturized analytical techniques.
Noblitt, Scott D; Berg, Kathleen E; Cate, David M; Henry, Charles S
2016-04-07
Measurement variance is a crucial aspect of quantitative chemical analysis. Variance directly affects important analytical figures of merit, including detection limit, quantitation limit, and confidence intervals. Most reported analyses for emerging analytical techniques implicitly assume constant variance (homoskedasticity) by using unweighted regression calibrations. Despite the assumption of constant variance, it is known that most instruments exhibit heteroskedasticity, where variance changes with signal intensity. Ignoring nonconstant variance results in suboptimal calibrations, invalid uncertainty estimates, and incorrect detection limits. Three techniques where homoskedasticity is often assumed were covered in this work to evaluate if heteroskedasticity had a significant quantitative impact: naked-eye, distance-based detection using paper-based analytical devices (PADs), cathodic stripping voltammetry (CSV) with disposable carbon-ink electrode devices, and microchip electrophoresis (MCE) with conductivity detection. Despite these techniques representing a wide range of chemistries and precision, heteroskedastic behavior was confirmed for each. The general variance forms were analyzed, and recommendations for accounting for nonconstant variance were discussed. Monte Carlo simulations of instrument responses were performed to quantify the benefits of weighted regression, and the sensitivity to uncertainty in the variance function was tested. Results show that heteroskedasticity should be considered during development of new techniques; even moderate uncertainty (30%) in the variance function still results in weighted regression outperforming unweighted regressions. We recommend utilizing the power model of variance because it is easy to apply, requires little additional experimentation, and produces higher-precision results and more reliable uncertainty estimates than assuming homoskedasticity. Copyright © 2016 Elsevier B.V. All rights reserved.
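As a rough illustration of the weighted calibration discussed in the abstract above, the following Python sketch fits a straight-line calibration with weights equal to the inverse of a power-model variance. The variance form var(x) = a * x**b, the function names, and the parameter values in the usage note are assumptions made here for illustration; they are not taken from the paper.

```python
def power_model_weights(x, a, b):
    """Return weights 1 / var(x), assuming var(x) = a * x**b (hypothetical form).

    All x values must be strictly positive so that the variance is defined.
    """
    return [1.0 / (a * xi ** b) for xi in x]


def weighted_line_fit(x, y, weights):
    """Weighted least-squares fit of y = slope * x + intercept."""
    sw = sum(weights)
    # Weighted means of x and y.
    xbar = sum(w * xi for w, xi in zip(weights, x)) / sw
    ybar = sum(w * yi for w, yi in zip(weights, y)) / sw
    # Weighted sums of squares and cross-products about the means.
    sxx = sum(w * (xi - xbar) ** 2 for w, xi in zip(weights, x))
    sxy = sum(w * (xi - xbar) * (yi - ybar)
              for w, xi, yi in zip(weights, x, y))
    slope = sxy / sxx
    intercept = ybar - slope * xbar
    return slope, intercept
```

With calibration standards `x` and responses `y`, a call such as `weighted_line_fit(x, y, power_model_weights(x, a=0.5, b=1.5))` gives low-signal points, where the assumed variance is small, more influence on the fit than an unweighted regression would.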
Chen, Cong; Zhang, Guohui; Liu, Xiaoyue Cathy; Ci, Yusheng; Huang, Helai; Ma, Jianming; Chen, Yanyan; Guan, Hongzhi
2016-12-01
There is a high potential of severe injury outcomes in traffic crashes on rural interstate highways due to the significant amount of high speed traffic on these corridors. Hierarchical Bayesian models are capable of incorporating between-crash variance and within-crash correlations into traffic crash data analysis and are increasingly utilized in traffic crash severity analysis. This paper applies a hierarchical Bayesian logistic model to examine the significant factors at crash and vehicle/driver levels and their heterogeneous impacts on driver injury severity in rural interstate highway crashes. Analysis results indicate that the majority of the total variance is induced by the between-crash variance, showing the appropriateness of the utilized hierarchical modeling approach. Three crash-level variables and six vehicle/driver-level variables are found significant in predicting driver injury severities: road curve, maximum vehicle damage in a crash, number of vehicles in a crash, wet road surface, vehicle type, driver age, driver gender, driver seatbelt use and driver alcohol or drug involvement. Among these variables, road curve, functional and disabled vehicle damage in crash, single-vehicle crashes, female drivers, senior drivers, motorcycles and driver alcohol or drug involvement tend to increase the odds of drivers being incapably injured or killed in rural interstate crashes, while wet road surface, male drivers and driver seatbelt use are more likely to decrease the probability of severe driver injuries. The developed methodology and estimation results provide insightful understanding of the internal mechanism of rural interstate crashes and beneficial references for developing effective countermeasures for rural interstate crash prevention. Copyright © 2016 Elsevier Ltd. All rights reserved.
NASA Astrophysics Data System (ADS)
Lin, Yingzhi; Deng, Xiangzheng; Li, Xing; Ma, Enjun
2014-12-01
Spatially explicit simulation of land use change is the basis for estimating the effects of land use and cover change on energy fluxes, ecology and the environment. At the pixel level, logistic regression is one of the most common approaches used in spatially explicit land use allocation models to determine the relationship between land use and its causal factors in driving land use change, and thereby to evaluate land use suitability. However, these models have a drawback in that they do not determine/allocate land use based on the direct relationship between land use change and its driving factors. Consequently, a multinomial logistic regression method was introduced to address this flaw, and thereby, judge the suitability of a type of land use in any given pixel in a case study area of the Jiangxi Province, China. A comparison of the two regression methods indicated that the proportion of correctly allocated pixels using multinomial logistic regression was 92.98%, which was 8.47% higher than that obtained using logistic regression. Paired t-test results also showed that pixels were more clearly distinguished by multinomial logistic regression than by logistic regression. In conclusion, multinomial logistic regression is a more efficient and accurate method for the spatial allocation of land use changes. The application of this method in future land use change studies may improve the accuracy of predicting the effects of land use and cover change on energy fluxes, ecology, and environment.
Austin, Peter C; Wagner, Philippe; Merlo, Juan
2017-03-15
Multilevel data occurs frequently in many research areas like health services research and epidemiology. A suitable way to analyze such data is through the use of multilevel regression models (MLRM). MLRM incorporate cluster-specific random effects which allow one to partition the total individual variance into between-cluster variation and between-individual variation. Statistically, MLRM account for the dependency of the data within clusters and provide correct estimates of uncertainty around regression coefficients. Substantively, the magnitude of the effect of clustering provides a measure of the General Contextual Effect (GCE). When outcomes are binary, the GCE can also be quantified by measures of heterogeneity like the Median Odds Ratio (MOR) calculated from a multilevel logistic regression model. Time-to-event outcomes within a multilevel structure occur commonly in epidemiological and medical research. However, the Median Hazard Ratio (MHR) that corresponds to the MOR in multilevel (i.e., 'frailty') Cox proportional hazards regression is rarely used. Analogously to the MOR, the MHR is the median relative change in the hazard of the occurrence of the outcome when comparing identical subjects from two randomly selected different clusters that are ordered by risk. We illustrate the application and interpretation of the MHR in a case study analyzing the hazard of mortality in patients hospitalized for acute myocardial infarction at hospitals in Ontario, Canada. We provide R code for computing the MHR. The MHR is a useful and intuitive measure for expressing cluster heterogeneity in the outcome and, thereby, estimating general contextual effects in multilevel survival analysis. © 2016 The Authors. Statistics in Medicine published by John Wiley & Sons Ltd.
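A minimal sketch of how the MOR and MHR described in the abstract above can be computed from an estimated cluster-level variance. It assumes normally distributed cluster random effects on the log-odds (or log-hazard) scale, in which case the commonly used closed form is exp(sqrt(2 * variance) * z), with z the 75th percentile of the standard normal distribution. The function name is hypothetical, and this is not the R code the authors provide.

```python
from math import exp, sqrt
from statistics import NormalDist


def median_ratio(cluster_variance: float) -> float:
    """Median odds (or hazard) ratio implied by a normal random-intercept variance.

    Assumes the random effects are normal on the log-odds / log-hazard scale.
    """
    # 75th percentile of the standard normal distribution (about 0.6745).
    z75 = NormalDist().inv_cdf(0.75)
    return exp(sqrt(2.0 * cluster_variance) * z75)
```

A variance of zero gives a ratio of exactly 1 (no between-cluster heterogeneity), and the ratio grows as the cluster variance grows, which is what makes it readable on the same scale as an ordinary odds or hazard ratio.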
Wagner, Philippe; Merlo, Juan
2016-01-01
Multilevel data occurs frequently in many research areas like health services research and epidemiology. A suitable way to analyze such data is through the use of multilevel regression models (MLRM). MLRM incorporate cluster‐specific random effects which allow one to partition the total individual variance into between‐cluster variation and between‐individual variation. Statistically, MLRM account for the dependency of the data within clusters and provide correct estimates of uncertainty around regression coefficients. Substantively, the magnitude of the effect of clustering provides a measure of the General Contextual Effect (GCE). When outcomes are binary, the GCE can also be quantified by measures of heterogeneity like the Median Odds Ratio (MOR) calculated from a multilevel logistic regression model. Time‐to‐event outcomes within a multilevel structure occur commonly in epidemiological and medical research. However, the Median Hazard Ratio (MHR) that corresponds to the MOR in multilevel (i.e., ‘frailty’) Cox proportional hazards regression is rarely used. Analogously to the MOR, the MHR is the median relative change in the hazard of the occurrence of the outcome when comparing identical subjects from two randomly selected different clusters that are ordered by risk. We illustrate the application and interpretation of the MHR in a case study analyzing the hazard of mortality in patients hospitalized for acute myocardial infarction at hospitals in Ontario, Canada. We provide R code for computing the MHR. The MHR is a useful and intuitive measure for expressing cluster heterogeneity in the outcome and, thereby, estimating general contextual effects in multilevel survival analysis. © 2016 The Authors. Statistics in Medicine published by John Wiley & Sons Ltd. PMID:27885709
Ali Morowatisharifabad, Mohammad; Abdolkarimi, Mahdi; Asadpour, Mohammad; Fathollahi, Mahmood Sheikh; Balaee, Parisa
2018-04-15
Theory-based education tailored to target behaviour and group can be effective in promoting physical activity. The purpose of this study was to examine the predictive power of Protection Motivation Theory on intent and behaviour of Physical Activity in Patients with Type 2 Diabetes. This descriptive study was conducted on 250 patients in Rafsanjan, Iran. To examine the scores of protection motivation theory structures, a researcher-made questionnaire was used. Its validity and reliability were confirmed. The level of physical activity was also measured by the International Short-form Physical Activity Inventory. Its validity and reliability were also approved. Data were analysed by statistical tests including correlation coefficient, chi-square, logistic regression and linear regression. The results revealed that there was a significant correlation between all the protection motivation theory constructs and the intention to do physical activity. The results showed that the Theory structures were able to predict 60% of the variance of physical activity intention. The results of logistic regression demonstrated that an increase in the scores of physical activity intent and self-efficacy increased the chance of a higher level of physical activity by 3.4 and 1.5 times, respectively (OR = 3.39 and 1.54). Considering the ability of protection motivation theory structures to explain the physical activity behaviour, interventional designs are suggested based on the structures of this theory, especially to improve self-efficacy as the most powerful factor in predicting physical activity intention and behaviour.
Li, Feiming; Gimpel, John R; Arenson, Ethan; Song, Hao; Bates, Bruce P; Ludwin, Fredric
2014-04-01
Few studies have investigated how well scores from the Comprehensive Osteopathic Medical Licensing Examination-USA (COMLEX-USA) series predict resident outcomes, such as performance on board certification examinations. To determine how well COMLEX-USA predicts performance on the American Osteopathic Board of Emergency Medicine (AOBEM) Part I certification examination. The target study population was first-time examinees who took AOBEM Part I in 2011 and 2012 with matched performances on COMLEX-USA Level 1, Level 2-Cognitive Evaluation (CE), and Level 3. Pearson correlations were computed between AOBEM Part I first-attempt scores and COMLEX-USA performances to measure the association between these examinations. Stepwise linear regression analysis was conducted to predict AOBEM Part I scores by the 3 COMLEX-USA scores. An independent t test was conducted to compare mean COMLEX-USA performances between candidates who passed and who failed AOBEM Part I, and a stepwise logistic regression analysis was used to predict the log-odds of passing AOBEM Part I on the basis of COMLEX-USA scores. Scores from AOBEM Part I had the highest correlation with COMLEX-USA Level 3 scores (.57) and slightly lower correlation with COMLEX-USA Level 2-CE scores (.53). The lowest correlation was between AOBEM Part I and COMLEX-USA Level 1 scores (.47). According to the stepwise regression model, COMLEX-USA Level 1 and Level 2-CE scores, which residency programs often use as selection criteria, together explained 30% of variance in AOBEM Part I scores. Adding Level 3 scores explained 37% of variance. The independent t test indicated that the 397 examinees passing AOBEM Part I performed significantly better than the 54 examinees failing AOBEM Part I in all 3 COMLEX-USA levels (P<.001 for all 3 levels). The logistic regression model showed that COMLEX-USA Level 1 and Level 3 scores predicted the log-odds of passing AOBEM Part I (P=.03 and P<.001, respectively). 
The present study empirically supported the predictive and discriminant validities of the COMLEX-USA series in relation to the AOBEM Part I certification examination. Although residency programs may use COMLEX-USA Level 1 and Level 2-CE scores as partial criteria in selecting residents, Level 3 scores, though typically not available at the time of application, are actually the most statistically related to performances on AOBEM Part I.
NASA Astrophysics Data System (ADS)
Priya, Mallika; Rao, Bola Sadashiva Satish; Chandra, Subhash; Ray, Satadru; Mathew, Stanley; Datta, Anirbit; Nayak, Subramanya G.; Mahato, Krishna Kishore
2016-02-01
In spite of many efforts for early detection of breast cancer, there is still a lack of technology for immediate implementation. In the present study, the potential of photoacoustic spectroscopy was evaluated in discriminating breast cancer from normal, involving blood serum samples seeking early detection. Three photoacoustic spectra in time domain were recorded from each of 20 normal and 20 malignant samples at 281 nm pulsed laser excitations and a total of 120 spectra were generated. The time domain spectra were then Fast Fourier Transformed into frequency domain and the 116.5625-206.875 kHz region was selected for further analysis using a combinational approach of wavelet, PCA and logistic regression. Initially, wavelet analysis was performed on the FFT data and seven features (mean, median, area under the curve, variance, standard deviation, skewness and kurtosis) from each were extracted. PCA was then performed on the feature matrix (7 × 120) for discriminating malignant samples from the normal by plotting a decision boundary using logistic regression analysis. The unsupervised mode of classification used in the present study yielded specificity and sensitivity values of 100% each, with a ROC-AUC value of 1. The results obtained have clearly demonstrated the capability of photoacoustic spectroscopy in discriminating cancer from the normal, suggesting its possible clinical implications.
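The seven summary features named in the abstract above can be sketched in plain Python as follows. This is only an illustration of the feature-extraction step: the function name, the use of population (rather than sample) moments, the trapezoidal rule for the area, and the uniform `spacing` parameter are all assumptions made here, and the wavelet, PCA and logistic regression stages of the study are not reproduced.

```python
from statistics import fmean, median, pvariance


def summary_features(values, spacing=1.0):
    """Seven summary features of one spectrum (hypothetical helper).

    Uses population moments; `values` needs at least two distinct numbers
    so that the standard deviation is non-zero.
    """
    n = len(values)
    m = fmean(values)
    var = pvariance(values, mu=m)          # population variance
    sd = var ** 0.5
    # Standardised third and fourth central moments.
    skew = sum((v - m) ** 3 for v in values) / (n * sd ** 3)
    kurt = sum((v - m) ** 4 for v in values) / (n * sd ** 4)
    # Trapezoidal area under the curve for equally spaced samples.
    area = spacing * (sum(values) - 0.5 * (values[0] + values[-1]))
    return {
        "mean": m,
        "median": median(values),
        "area": area,
        "variance": var,
        "std": sd,
        "skewness": skew,
        "kurtosis": kurt,
    }
```

Applying such a function to each of the 120 spectra would yield one seven-element feature vector per spectrum, which is the shape of input the abstract describes feeding into PCA.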
Next day discharge rate has little use as a quality measure for individual physician performance.
Inabnit, Christopher; Markwell, Stephen; Gruwell, Jack; Jaeger, Cassie; Millburg, Lance; Griffen, David
2018-06-18
Emergency Department (ED) physicians' next day discharge rate (NDDR), the percentage of patients who were admitted from the ED and subsequently discharged within the next calendar day was hypothesized as a potential measure for unnecessary admissions. The objective was to determine if NDDR has validity as a measure for quality of individual ED physician performance. Hospital admission data was obtained for thirty-six ED physicians for calendar year 2015. Funnel plots were used to identify NDDR outliers beyond 95% control limits. A mixed model logistic regression was built to investigate factors contributing to NDDR. To determine yearly variation, data from calendar years 2014 and 2016 were analyzed, again by funnel plots and logistic regression. Intraclass correlation coefficient was used to estimate the percent of total variation in NDDR attributable to individual ED physicians. NDDR varied significantly among ED physicians. Individual ED physician outliers in NDDR varied year to year. Individual ED physician contribution to NDDR variation was minimal, accounting for 1%. Years of experience in Emergency Medicine practice was not correlated with NDDR. NDDR does not appear to be a reliable independent quality measure for individual ED physician performance. The percent of variance attributable to the ED physician was 1%. Copyright © 2018. Published by Elsevier Inc.
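For a mixed logistic model with a random intercept per physician, one common way to express the share of variation attributable to the clustering level is the latent-variable intraclass correlation, which treats the individual-level residual as a standard logistic variable with variance pi^2 / 3. The abstract above does not state which ICC definition was used, so the sketch below is an assumption, and the function name is hypothetical.

```python
from math import pi


def latent_icc(cluster_variance: float) -> float:
    """Latent-scale ICC for a random-intercept logistic model.

    cluster_variance is the estimated random-intercept variance on the
    log-odds scale; pi**2 / 3 is the variance of the standard logistic
    distribution, used as the level-1 residual variance.
    """
    return cluster_variance / (cluster_variance + pi ** 2 / 3.0)
```

Under this definition, an ICC of about 1% corresponds to a random-intercept variance that is very small relative to pi^2 / 3 (roughly 3.29).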
Constructive thinking, rational intelligence and irritable bowel syndrome.
Rey, Enrique; Moreno Ortega, Marta; Garcia Alonso, Monica-Olga; Diaz-Rubio, Manuel
2009-07-07
To evaluate rational and experiential intelligence in irritable bowel syndrome (IBS) sufferers. We recruited 100 subjects with IBS as per Rome II criteria (50 consulters and 50 non-consulters) and 100 healthy controls, matched by age, sex and educational level. Cases and controls completed a clinical questionnaire (including symptom characteristics and medical consultation) and the following tests: rational-intelligence (Wechsler Adult Intelligence Scale, 3rd edition); experiential-intelligence (Constructive Thinking Inventory); personality (NEO personality inventory); psychopathology (MMPI-2), anxiety (state-trait anxiety inventory) and life events (social readjustment rating scale). Analysis of variance was used to compare the test results of IBS-sufferers and controls, and a logistic regression model was then constructed and adjusted for age, sex and educational level to evaluate any possible association with IBS. No differences were found between IBS cases and controls in terms of IQ (102.0 +/- 10.8 vs 102.8 +/- 12.6), but IBS sufferers scored significantly lower in global constructive thinking (43.7 +/- 9.4 vs 49.6 +/- 9.7). In the logistic regression model, global constructive thinking score was independently linked to suffering from IBS [OR 0.92 (0.87-0.97)], without significant OR for total IQ. IBS subjects do not show lower rational intelligence than controls, but lower experiential intelligence is nevertheless associated with IBS.
Sauzet, Odile; Peacock, Janet L
2017-07-20
The analysis of perinatal outcomes often involves datasets with some multiple births. These are datasets mostly formed of independent observations and a limited number of clusters of size two (twins) and maybe of size three or more. This non-independence needs to be accounted for in the statistical analysis. Using simulated data based on a dataset of preterm infants we have previously investigated the performance of several approaches to the analysis of continuous outcomes in the presence of some clusters of size two. Mixed models have been developed for binomial outcomes but very little is known about their reliability when only a limited number of small clusters are present. Using simulated data based on a dataset of preterm infants we investigated the performance of several approaches to the analysis of binomial outcomes in the presence of some clusters of size two. Logistic models, several methods of estimation for the logistic random intercept models and generalised estimating equations were compared. The presence of even a small percentage of twins means that a logistic regression model will underestimate all parameters but a logistic random intercept model fails to estimate the correlation between siblings if the percentage of twins is too small and will provide similar estimates to logistic regression. The method which seems to provide the best balance between estimation of the standard error and the parameter for any percentage of twins is the generalised estimating equations. This study has shown that the number of covariates or the level two variance do not necessarily affect the performance of the various methods used to analyse datasets containing twins but when the percentage of small clusters is too small, mixed models cannot capture the dependence between siblings.
Standards for Standardized Logistic Regression Coefficients
ERIC Educational Resources Information Center
Menard, Scott
2011-01-01
Standardized coefficients in logistic regression analysis have the same utility as standardized coefficients in linear regression analysis. Although there has been no consensus on the best way to construct standardized logistic regression coefficients, there is now sufficient evidence to suggest a single best approach to the construction of a…
Schörgendorfer, Angela; Branscum, Adam J; Hanson, Timothy E
2013-06-01
Logistic regression is a popular tool for risk analysis in medical and population health science. With continuous response data, it is common to create a dichotomous outcome for logistic regression analysis by specifying a threshold for positivity. Fitting a linear regression to the nondichotomized response variable assuming a logistic sampling model for the data has been empirically shown to yield more efficient estimates of odds ratios than ordinary logistic regression of the dichotomized endpoint. We illustrate that risk inference is not robust to departures from the parametric logistic distribution. Moreover, the model assumption of proportional odds is generally not satisfied when the condition of a logistic distribution for the data is violated, leading to biased inference from a parametric logistic analysis. We develop novel Bayesian semiparametric methodology for testing goodness of fit of parametric logistic regression with continuous measurement data. The testing procedures hold for any cutoff threshold and our approach simultaneously provides the ability to perform semiparametric risk estimation. Bayes factors are calculated using the Savage-Dickey ratio for testing the null hypothesis of logistic regression versus a semiparametric generalization. We propose a fully Bayesian and a computationally efficient empirical Bayesian approach to testing, and we present methods for semiparametric estimation of risks, relative risks, and odds ratios when parametric logistic regression fails. Theoretical results establish the consistency of the empirical Bayes test. Results from simulated data show that the proposed approach provides accurate inference irrespective of whether parametric assumptions hold or not. Evaluation of risk factors for obesity shows that different inferences are derived from an analysis of a real data set when deviations from a logistic distribution are permissible in a flexible semiparametric framework. 
© 2013, The International Biometric Society.
Socioeconomic Inequalities in Hearing Loss in a Healthy Population Sample: The HUNT Study
Krokstad, Steinar; Tambs, Kristian
2009-01-01
We assessed socioeconomic position and hearing loss in a Norwegian population of 17 593 men and women aged 30–54 years in 1984–1986 who were followed for 11 years. We used analysis of variance, logistic regression, and population-attributable fraction analyses to examine associations. Significant socioeconomic inequalities in hearing loss were found among men. Adjusted odds ratios for hearing loss were approximately 1.3 to 1.9 for semi- and unskilled manual workers compared with participants with high occupational class; the population-attributable fraction of the prevalence of hearing loss over the cutpoint in the high-frequency (3, 4, 6, and 8 kHz) range was 35%. PMID:19542048
Westreich, Daniel; Lessler, Justin; Funk, Michele Jonsson
2010-01-01
Objective: Propensity scores for the analysis of observational data are typically estimated using logistic regression. Our objective in this Review was to assess machine learning alternatives to logistic regression which may accomplish the same goals but with fewer assumptions or greater accuracy. Study Design and Setting: We identified alternative methods for propensity score estimation and/or classification from the public health, biostatistics, discrete mathematics, and computer science literature, and evaluated these algorithms for applicability to the problem of propensity score estimation, potential advantages over logistic regression, and ease of use. Results: We identified four techniques as alternatives to logistic regression: neural networks, support vector machines, decision trees (CART), and meta-classifiers (in particular, boosting). Conclusion: While the assumptions of logistic regression are well understood, those assumptions are frequently ignored. All four alternatives have advantages and disadvantages compared with logistic regression. Boosting (meta-classifiers) and to a lesser extent decision trees (particularly CART) appear to be most promising for use in the context of propensity score analysis, but extensive simulation studies are needed to establish their utility in practice. PMID:20630332
Robust mislabel logistic regression without modeling mislabel probabilities.
Hung, Hung; Jou, Zhi-Yu; Huang, Su-Yun
2018-03-01
Logistic regression is among the most widely used statistical methods for linear discriminant analysis. In many applications, we only observe possibly mislabeled responses. Fitting a conventional logistic regression can then lead to biased estimation. One common resolution is to fit a mislabel logistic regression model, which takes mislabeled responses into consideration. Another common method is to adopt a robust M-estimation by down-weighting suspected instances. In this work, we propose a new robust mislabel logistic regression based on γ-divergence. Our proposal possesses two advantageous features: (1) It does not need to model the mislabel probabilities. (2) The minimum γ-divergence estimation leads to a weighted estimating equation without the need to include any bias correction term, that is, it is automatically bias-corrected. These features make the proposed γ-logistic regression more robust in model fitting and more intuitive for model interpretation through a simple weighting scheme. Our method is also easy to implement, and two types of algorithms are included. Simulation studies and the Pima data application are presented to demonstrate the performance of γ-logistic regression. © 2017, The International Biometric Society.
Fungible weights in logistic regression.
Jones, Jeff A; Waller, Niels G
2016-06-01
In this article we develop methods for assessing parameter sensitivity in logistic regression models. To set the stage for this work, we first review Waller's (2008) equations for computing fungible weights in linear regression. Next, we describe 2 methods for computing fungible weights in logistic regression. To demonstrate the utility of these methods, we compute fungible logistic regression weights using data from the Centers for Disease Control and Prevention's (2010) Youth Risk Behavior Surveillance Survey, and we illustrate how these alternate weights can be used to evaluate parameter sensitivity. To make our work accessible to the research community, we provide R code (R Core Team, 2015) that will generate both kinds of fungible logistic regression weights. (PsycINFO Database Record (c) 2016 APA, all rights reserved).
Westreich, Daniel; Lessler, Justin; Funk, Michele Jonsson
2010-08-01
Propensity scores for the analysis of observational data are typically estimated using logistic regression. Our objective in this review was to assess machine learning alternatives to logistic regression, which may accomplish the same goals but with fewer assumptions or greater accuracy. We identified alternative methods for propensity score estimation and/or classification from the public health, biostatistics, discrete mathematics, and computer science literature, and evaluated these algorithms for applicability to the problem of propensity score estimation, potential advantages over logistic regression, and ease of use. We identified four techniques as alternatives to logistic regression: neural networks, support vector machines, decision trees (classification and regression trees [CART]), and meta-classifiers (in particular, boosting). Although the assumptions of logistic regression are well understood, those assumptions are frequently ignored. All four alternatives have advantages and disadvantages compared with logistic regression. Boosting (meta-classifiers) and, to a lesser extent, decision trees (particularly CART), appear to be most promising for use in the context of propensity score analysis, but extensive simulation studies are needed to establish their utility in practice. Copyright (c) 2010 Elsevier Inc. All rights reserved.
ERIC Educational Resources Information Center
Lee, Wan-Fung; Bulcock, Jeffrey Wilson
The purposes of this study are: (1) to demonstrate the superiority of simple ridge regression over ordinary least squares regression through theoretical argument and empirical example; (2) to modify ridge regression through use of the variance normalization criterion; and (3) to demonstrate the superiority of simple ridge regression based on the…
Should metacognition be measured by logistic regression?
Rausch, Manuel; Zehetleitner, Michael
2017-03-01
Are logistic regression slopes suitable to quantify metacognitive sensitivity, i.e. the efficiency with which subjective reports differentiate between correct and incorrect task responses? We analytically show that logistic regression slopes are independent from rating criteria in one specific model of metacognition, which assumes (i) that rating decisions are based on sensory evidence generated independently of the sensory evidence used for primary task responses and (ii) that the distributions of evidence are logistic. Given a hierarchical model of metacognition, logistic regression slopes depend on rating criteria. According to all considered models, regression slopes depend on the primary task criterion. A reanalysis of previous data revealed that massive numbers of trials are required to distinguish between hierarchical and independent models with tolerable accuracy. It is argued that researchers who wish to use logistic regression as measure of metacognitive sensitivity need to control the primary task criterion and rating criteria. Copyright © 2017 Elsevier Inc. All rights reserved.
Genetic and environmental transmission of body mass index fluctuation.
Bergin, Jocilyn E; Neale, Michael C; Eaves, Lindon J; Martin, Nicholas G; Heath, Andrew C; Maes, Hermine H
2012-11-01
This study sought to determine the relationship between body mass index (BMI) fluctuation and cardiovascular disease phenotypes, diabetes, and depression and the role of genetic and environmental factors in individual differences in BMI fluctuation using the extended twin-family model (ETFM). This study included 14,763 twins and their relatives. Health and Lifestyle Questionnaires were obtained from 28,492 individuals from the Virginia 30,000 dataset including twins, parents, siblings, spouses, and children of twins. Self-report cardiovascular disease, diabetes, and depression data were available. From self-reported height and weight, BMI fluctuation was calculated as the difference between highest and lowest BMI after age 18, for individuals 18-80 years. Logistic regression analyses were used to determine the relationship between BMI fluctuation and disease status. The ETFM was used to estimate the significance and contribution of genetic and environmental factors, cultural transmission, and assortative mating components to BMI fluctuation, while controlling for age. We tested sex differences in additive and dominant genetic effects, parental, non-parental, twin, and unique environmental effects. BMI fluctuation was highly associated with disease status, independent of BMI. Genetic effects accounted for ~34 % of variance in BMI fluctuation in males and ~43 % of variance in females. The majority of the variance was accounted for by environmental factors, about a third of which were shared among twins. Assortative mating, and cultural transmission accounted for only a small proportion of variance in this phenotype. Since there are substantial health risks associated with BMI fluctuation and environmental components of BMI fluctuation account for over 60 % of variance in males and over 50 % of variance in females, environmental risk factors may be appropriate targets to reduce BMI fluctuation.
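The fluctuation measure described above (highest minus lowest adult BMI) is simple enough to sketch directly. The weights and height below are hypothetical, and holding adult height constant across measurements is an assumption made only for this illustration.

```python
def bmi(weight_kg: float, height_m: float) -> float:
    """Body mass index in kg/m^2."""
    if weight_kg <= 0 or height_m <= 0:
        raise ValueError("weight and height must be positive")
    return weight_kg / height_m ** 2


def bmi_fluctuation(adult_weights_kg, height_m: float) -> float:
    """Difference between the highest and lowest BMI over the reported adult weights."""
    values = [bmi(w, height_m) for w in adult_weights_kg]
    if not values:
        raise ValueError("at least one weight is required")
    return max(values) - min(values)


# Hypothetical self-reported adult weights (kg) for a person 2.0 m tall.
example = bmi_fluctuation([80.0, 100.0, 90.0], 2.0)
print(example)
```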
Lee, Bum Ju; Kim, Keun Ho; Ku, Boncho; Jang, Jun-Su; Kim, Jong Yeol
2013-05-01
The body mass index (BMI) provides essential medical information related to body weight for the treatment and prognosis prediction of diseases such as cardiovascular disease, diabetes, and stroke. We propose a method for the prediction of normal, overweight, and obese classes based only on the combination of voice features that are associated with BMI status, independently of weight and height measurements. A total of 1568 subjects were divided into 4 groups according to age and gender differences. We performed statistical analyses by analysis of variance (ANOVA) and Scheffe test to find significant features in each group. We predicted BMI status (normal, overweight, and obese) by a logistic regression algorithm and two ensemble classification algorithms (bagging and random forests) based on statistically significant features. In the Female-2030 group (females aged 20-40 years), classification experiments using an imbalanced (original) data set gave area under the receiver operating characteristic curve (AUC) values of 0.569-0.731 by logistic regression, whereas experiments using a balanced data set gave AUC values of 0.893-0.994 by random forests. AUC values in Female-4050 (females aged 41-60 years), Male-2030 (males aged 20-40 years), and Male-4050 (males aged 41-60 years) groups by logistic regression in imbalanced data were 0.585-0.654, 0.581-0.614, and 0.557-0.653, respectively. AUC values in Female-4050, Male-2030, and Male-4050 groups in balanced data were 0.629-0.893 by bagging, 0.707-0.916 by random forests, and 0.695-0.854 by bagging, respectively. In each group, we found discriminatory features showing statistical differences among normal, overweight, and obese classes. The results showed that the classification models built by logistic regression in imbalanced data were better than those built by the other two algorithms, and significant features differed according to age and gender groups. 
Our results could support the development of BMI diagnosis tools for real-time monitoring; such tools are considered helpful in improving automated BMI status diagnosis in remote healthcare or telemedicine and are expected to have applications in forensic and medical science. Copyright © 2013 Elsevier B.V. All rights reserved.
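The AUC values quoted above can be read as the probability that a randomly chosen positive case receives a higher score than a randomly chosen negative case. The following is a minimal sketch of that pairwise definition for a binary problem. The abstract describes three classes, so applying it there would need a one-vs-rest or similar scheme, which is an assumption and not something the abstract specifies. The labels and scores are made up.

```python
def auc(labels, scores):
    """Pairwise AUC: share of (positive, negative) pairs ranked correctly; ties count 0.5."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("both classes must be present")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical predicted probabilities for four subjects.
labels = [0, 0, 1, 1]
scores = [0.1, 0.4, 0.35, 0.8]
result = auc(labels, scores)
print(result)
```

An AUC of 0.5 corresponds to chance-level ranking, which puts the lower end of the logistic regression range reported above (0.557 to 0.585) into perspective.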
London Measure of Unplanned Pregnancy: guidance for its use as an outcome measure
Hall, Jennifer A; Barrett, Geraldine; Copas, Andrew; Stephenson, Judith
2017-01-01
Background: The London Measure of Unplanned Pregnancy (LMUP) is a psychometrically validated measure of the degree of intention of a current or recent pregnancy. The LMUP is increasingly being used worldwide, and can be used to evaluate family planning or preconception care programs. However, beyond recommending the use of the full LMUP scale, there is no published guidance on how to use the LMUP as an outcome measure. Ordinal logistic regression has been recommended informally, but studies published to date have all used binary logistic regression and dichotomized the scale at different cut points. There is thus a need for evidence-based guidance to provide a standardized methodology for multivariate analysis and to enable comparison of results. This paper makes recommendations for the regression method for analysis of the LMUP as an outcome measure. Materials and methods: Data collected from 4,244 pregnant women in Malawi were used to compare five regression methods: linear, logistic with two cut points, and ordinal logistic with either the full or grouped LMUP score. The recommendations were then tested on the original UK LMUP data. Results: There were small but no important differences in the findings across the regression models. Logistic regression resulted in the largest loss of information, and assumptions were violated for the linear and ordinal logistic regression. Consequently, robust standard errors were used for linear regression and a partial proportional odds ordinal logistic regression model was attempted. The latter could only be fitted for grouped LMUP score. Conclusion: We recommend the linear regression model with robust standard errors to make full use of the LMUP score when analyzed as an outcome measure. Ordinal logistic regression could be considered, but a partial proportional odds model with grouped LMUP score may be required. Logistic regression is the least-favored option, due to the loss of information. 
For logistic regression, the cut point for un/planned pregnancy should be between nine and ten. These recommendations will standardize the analysis of LMUP data and enhance comparability of results across studies. PMID:28435343
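The recommended cut point between nine and ten can be written as a one-line rule. This sketch assumes the standard LMUP scoring range of 0 to 12, which the abstract itself does not state; the function name and labels are illustrative only.

```python
def dichotomize_lmup(score: int) -> str:
    """Classify an LMUP score using a cut point between 9 and 10.

    Assumes the usual 0-12 scoring range (not stated in the abstract).
    """
    if not 0 <= score <= 12:
        raise ValueError("LMUP score is assumed to lie between 0 and 12")
    return "planned" if score >= 10 else "unplanned"


for s in (9, 10):
    print(s, dichotomize_lmup(s))
```

Collapsing thirteen possible scores into two groups is the loss of information that makes logistic regression the least-favored option in the recommendations.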
Logistic models--an odd(s) kind of regression.
Jupiter, Daniel C
2013-01-01
The logistic regression model bears some similarity to the multivariable linear regression with which we are familiar. However, the differences are great enough to warrant a discussion of the need for and interpretation of logistic regression. Copyright © 2013 American College of Foot and Ankle Surgeons. Published by Elsevier Inc. All rights reserved.
On the design of classifiers for crop inventories
NASA Technical Reports Server (NTRS)
Heydorn, R. P.; Takacs, H. C.
1986-01-01
Crop proportion estimators that use classifications of satellite data to correct, in an additive way, a given estimate acquired from ground observations are discussed. A linear version of these estimators is optimal, in terms of minimum variance, when the regression of the ground observations onto the satellite observations is linear. When this regression is not linear, but the reverse regression (satellite observations onto ground observations) is linear, the estimator is suboptimal but still has certain appealing variance properties. In this paper expressions are derived for those regressions which relate the intercepts and slopes to conditional classification probabilities. These expressions are then used to discuss the question of classifier designs that can lead to low-variance crop proportion estimates. Variance expressions for these estimates in terms of classifier omission and commission errors are also derived.
Berardi, Cecilia; Larson, Nicholas B.; Decker, Paul A.; Wassel, Christina L.; Kirsch, Phillip S.; Pankow, James S.; Sale, Michele M.; de Andrade, Mariza; Sicotte, Hugues; Tang, Weihong; Hanson, Naomi Q.; Tsai, Michael Y.; da Chen, Yii-Der I; Bielinski, Suzette J.
2015-01-01
L-selectin is constitutively expressed on leukocytes and mediates their interaction with endothelial cells during inflammation. Previous studies on the association of soluble L-selectin (sL-selectin) with cardiovascular disease (CVD) are inconsistent. Genetic variants associated with sL-selectin levels may be a better surrogate of levels over a lifetime. We explored the association of genetic variants and sL-selectin levels in a race/ethnicity stratified random sample of 2,403 participants in the Multi-Ethnic Study of Atherosclerosis (MESA). Through a genome-wide analysis with additive linear regression models, we found that rs12938 on the SELL gene accounted for a significant portion of the protein level variance across all four races/ethnicities. To evaluate potential additional associations, elastic net models were used for variants located in the SELL/SELP/SELE genetic region and an additional two SNPs, rs3917768 and rs4987361, were associated with sL-selectin levels in African Americans. These variants accounted for a portion of protein variance that ranged from 4% in Hispanic to 14% in African Americans. To investigate the relationship of these variants with CVD, 6,317 subjects were used. No significant association was found between any of the identified SNPs and carotid intima-media thickness or presence of carotid plaque using linear and logistic regression, respectively. Similarly no significant results were found for coronary artery calcium or coronary heart disease events. In conclusion, we found that variants within the SELL gene are associated with sL-selectin levels. Despite accounting for a significant portion of the protein level variance, none of the variants was associated with clinical or subclinical CVD. PMID:25576479
Peres, Maria Fernanda Tourinho; Azeredo, Catarina Machado; de Rezende, Leandro Fórnias Machado; Zucchi, Eliana Miura; Franca-Junior, Ivan; Luiz, Olinda do Carmo; Levy, Renata Bertazzi
2018-06-08
To investigate the association of personal, relational and school factors with involvement in fights with weapons (IFW) among Brazilian school-age youth. Using data from the Adolescent School-Based Health Survey 2015 (n = 102,072), we fitted multilevel logistic regression models. IFW was negatively associated with female sex (OR = 0.45) and positively associated with older age (OR = 1.15), previous involvement in physical violence (OR = 2.05), history of peer verbal (OR = 1.14) and domestic victimization (OR = 2.11), alcohol use (OR = 2.42) and drug use (OR = 3.23). The relational variables (e.g., parental supervision) were mostly negatively associated with IFW. At the school level, attending public school and attending schools in violent surroundings were both positively associated with IFW. The intraclass correlation coefficient estimated in the empty model showed that 5.77% of the variance of IFW was at school level. When all individual- and school-level variables were included in the model, the proportional changes in variance were 61.7 and 71.55%, respectively. IFW is associated with personal, relational and school factors. Part of the variance in IFW by school is explained by characteristics of the school context.
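One common way to obtain an intraclass correlation from a multilevel logistic model is the latent-variable definition, ICC = σ²_school / (σ²_school + π²/3). Whether the authors used this definition is not stated, so the sketch below is an assumption. It back-calculates the school-level variance implied by the reported 5.77% and shows how a proportional change in variance is computed; the variances passed to the last function are hypothetical.

```python
import math

# Variance of the standard logistic distribution, used as the level-1 variance.
LOGISTIC_RESIDUAL_VARIANCE = math.pi ** 2 / 3


def latent_icc(var_school: float) -> float:
    """Latent-variable ICC for a two-level random-intercept logistic model."""
    return var_school / (var_school + LOGISTIC_RESIDUAL_VARIANCE)


def school_variance_from_icc(icc: float) -> float:
    """Invert the latent-variable ICC to recover the school-level variance."""
    if not 0.0 <= icc < 1.0:
        raise ValueError("ICC must lie in [0, 1)")
    return icc * LOGISTIC_RESIDUAL_VARIANCE / (1.0 - icc)


def proportional_change_in_variance(var_empty: float, var_adjusted: float) -> float:
    """Share of the empty-model variance removed after adding covariates."""
    return (var_empty - var_adjusted) / var_empty


implied = school_variance_from_icc(0.0577)
print(f"implied school-level variance: {implied:.3f}")
print(f"PCV for hypothetical variances: {proportional_change_in_variance(0.2, 0.05):.2f}")
```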
Peralta, Victor; Cuesta, Manuel J
2005-11-15
The objective was to ascertain the underlying factor structure of alternative definitions of schizophrenia, and to examine the distribution of schizophrenia-related variables against the resulting factor solution. Twenty-three diagnostic schemes of schizophrenia were applied to 660 patients presenting with psychotic symptoms regardless of the specific diagnosis of psychotic disorder. Factor analysis of the 23 diagnostic schemes yielded three interpretable factors explaining 58% of the variance, the first factor (general schizophrenia factor) accounting for most of the variance (36%). On the basis of the general schizophrenia factor score, the sample was divided in quintile groups representing 5 levels of schizophrenia definition (absent, doubtful, very broad, broad and narrow) and the distribution of a number of schizophrenia-related variables was examined across the groups. This grouping procedure was used for examining the comparative validity of alternative levels of categorically defined schizophrenia and an ordinal (i.e. dimensional) definition. Overall, schizophrenia-related variables displayed a dose-response relationship with level of schizophrenia definition. Logistic regression analyses revealed that the dimensional definition explained more variance in the schizophrenia-related variables than the alternative levels for defining schizophrenia categorically. These results are consistent with a unitary and dimensional construct of schizophrenia with no clear "points of rarity" at its boundaries, thus supporting the continuum hypothesis of the psychotic illness.
Handling nonnormality and variance heterogeneity for quantitative sublethal toxicity tests.
Ritz, Christian; Van der Vliet, Leana
2009-09-01
The advantages of using regression-based techniques to derive endpoints from environmental toxicity data are clear, and slowly, this superior analytical technique is gaining acceptance. As use of regression-based analysis becomes more widespread, some of the associated nuances and potential problems come into sharper focus. Looking at data sets that cover a broad spectrum of standard test species, we noticed that some model fits to data failed to meet two key assumptions (variance homogeneity and normality) that are necessary for correct statistical analysis via regression-based techniques. Failure to meet these assumptions often is caused by reduced variance at the concentrations showing severe adverse effects. Although commonly used with linear regression analysis, transformation of the response variable only is not appropriate when fitting data using nonlinear regression techniques. Through analysis of sample data sets, including Lemna minor, Eisenia andrei (terrestrial earthworm), and algae, we show that both the so-called Box-Cox transformation and use of the Poisson distribution can help to correct variance heterogeneity and nonnormality and so allow nonlinear regression analysis to be implemented. Both the Box-Cox transformation and the Poisson distribution can be readily implemented into existing protocols for statistical analysis. By correcting for nonnormality and variance heterogeneity, these two statistical tools can be used to encourage the transition to regression-based analysis and the depreciation of less-desirable and less-flexible analytical techniques, such as linear interpolation.
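The Box-Cox transformation named above has a standard closed form: (y^λ - 1)/λ for λ ≠ 0 and ln(y) for λ = 0. Because the abstract warns against transforming the response only, the sketch below also applies the same transformation to the model prediction (a "transform both sides" residual). That reading, and all numeric values, are assumptions for illustration and not the authors' code.

```python
import math


def box_cox(y: float, lam: float) -> float:
    """Box-Cox transformation of a strictly positive value."""
    if y <= 0:
        raise ValueError("Box-Cox requires a positive value")
    if lam == 0:
        return math.log(y)
    return (y ** lam - 1.0) / lam


def transformed_residual(observed: float, predicted: float, lam: float) -> float:
    """Residual after transforming both the observation and the model prediction."""
    return box_cox(observed, lam) - box_cox(predicted, lam)


# Hypothetical response and fitted value from a dose-response curve.
print(box_cox(4.0, 0.5))
print(transformed_residual(4.0, 9.0, 0.5))
```

Transforming both sides keeps the fitted curve's parameters on their original scale while changing only the error structure, which is why it suits nonlinear dose-response models.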
Parameters Estimation of Geographically Weighted Ordinal Logistic Regression (GWOLR) Model
NASA Astrophysics Data System (ADS)
Zuhdi, Shaifudin; Retno Sari Saputro, Dewi; Widyaningsih, Purnami
2017-06-01
A regression model represents the relationship between independent variables and a dependent variable. When the dependent variable is categorical, a logistic regression model is used to calculate odds. When the categories of the dependent variable are ordered levels, the logistic regression model is ordinal. The GWOLR model is an ordinal logistic regression model that is influenced by the geographical location of the observation site. Parameter estimation is needed to infer population values from a sample. The purpose of this research is to estimate the parameters of the GWOLR model using R software. Parameter estimation uses data on the number of dengue fever patients in Semarang City. The observation units are 144 villages in Semarang City. The research yields a local GWOLR model for each village and the probabilities of the categories of dengue fever patient numbers.
The purpose of this report is to provide a reference manual that could be used by investigators for making informed use of logistic regression using two methods (standard logistic regression and MARS). The details for analyses of relationships between a dependent binary response ...
Predicting U.S. Army Reserve Unit Manning Using Market Demographics
2015-06-01
develops linear regression, classification tree, and logistic regression models to determine the ability of the location to support manning requirements... logistic regression model delivers predictive results that allow decision-makers to identify locations with a high probability of meeting unit...manning requirements. The recommendation of this thesis is that the USAR implement the logistic regression model.
ERIC Educational Resources Information Center
Chen, Chau-Kuang
2005-01-01
Logistic and Cox regression methods are practical tools used to model the relationships between certain student learning outcomes and their relevant explanatory variables. The logistic regression model fits an S-shaped curve into a binary outcome with data points of zero and one. The Cox regression model allows investigators to study the duration…
Yusuf, O B; Bamgboye, E A; Afolabi, R F; Shodimu, M A
2014-09-01
The logistic regression model is widely used in health research for descriptive and predictive purposes. Unfortunately, many researchers are not aware that the underlying principles of the technique have failed when the maximum likelihood algorithm does not converge. Young researchers, particularly postgraduate students, may not know why the separation problem (whether quasi or complete) occurs, how to identify it, and how to fix it. This study was designed to critically evaluate convergence issues in articles that employed logistic regression analysis published in the African Journal of Medicine and Medical Sciences between 2004 and 2013. Problems of quasi or complete separation were described and illustrated with the National Demographic and Health Survey dataset. A critical evaluation of articles that employed logistic regression was conducted. A total of 581 articles were reviewed, of which 40 (6.9%) used binary logistic regression. Twenty-four (60.0%) stated the use of the logistic regression model in the methodology, while none of the articles assessed model fit. Only 3 (12.5%) properly described the procedures. Of the 40 that used the logistic regression model, the problem of convergence occurred in 6 (15.0%). Logistic regression tended to be poorly reported in studies published between 2004 and 2013. Our findings showed that the procedure may not be well understood by researchers, since very few described the process in their reports, and that researchers may be totally unaware of the problem of convergence or how to deal with it.
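For a single numeric predictor, complete and quasi-complete separation can be spotted by comparing the predictor's range in the two outcome groups. The following is a minimal sketch of that check. It covers only the one-predictor case, does not examine linear combinations of several predictors, and is not the procedure used in the study; the data are made up.

```python
def separation_status(x, y):
    """Classify separation for one predictor x and a binary outcome y (0/1).

    Returns "complete", "quasi-complete", or "none".
    """
    x0 = [xi for xi, yi in zip(x, y) if yi == 0]
    x1 = [xi for xi, yi in zip(x, y) if yi == 1]
    if not x0 or not x1:
        raise ValueError("both outcome classes must be present")
    orderings = ((x0, x1), (x1, x0))
    # Complete separation: one group lies strictly below the other.
    for low, high in orderings:
        if max(low) < min(high):
            return "complete"
    # Quasi-complete: the groups touch at one boundary value only.
    for low, high in orderings:
        if max(low) == min(high) and min(low) < max(high):
            return "quasi-complete"
    return "none"


print(separation_status([1, 2, 3, 4], [0, 0, 1, 1]))
print(separation_status([1, 2, 2, 3], [0, 0, 1, 1]))
print(separation_status([1, 3, 2, 4], [0, 0, 1, 1]))
```

When the check reports separation, the maximum likelihood estimate of the slope does not exist (it diverges), which is what shows up as non-convergence or implausibly large coefficients and standard errors.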
Ali Morowatisharifabad, Mohammad; Abdolkarimi, Mahdi; Asadpour, Mohammad; Fathollahi, Mahmood Sheikh; Balaee, Parisa
2018-01-01
INTRODUCTION: Theory-based education tailored to target behaviour and group can be effective in promoting physical activity. AIM: The purpose of this study was to examine the predictive power of Protection Motivation Theory on intent and behaviour of physical activity in patients with type 2 diabetes. METHODS: This descriptive study was conducted on 250 patients in Rafsanjan, Iran. A researcher-made questionnaire, whose validity and reliability were confirmed, was used to score the protection motivation theory constructs. The level of physical activity was measured by the International Short-form Physical Activity Inventory, whose validity and reliability were also approved. Data were analysed by statistical tests including correlation coefficient, chi-square, logistic regression and linear regression. RESULTS: There was a significant correlation between all the protection motivation theory constructs and the intention to do physical activity. The theory constructs were able to predict 60% of the variance of physical activity intention. Logistic regression showed that increases in the scores for physical activity intent and self-efficacy increased the odds of a higher level of physical activity by 3.4 and 1.5 times, respectively (OR = 3.39 and 1.54). CONCLUSION: Considering the ability of the protection motivation theory constructs to explain physical activity behaviour, interventional designs based on the constructs of this theory are suggested, especially to improve self-efficacy as the most powerful factor in predicting physical activity intention and behaviour. PMID:29731945
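An odds ratio multiplies the odds, not the probability, so its effect on the probability depends on the starting point. The sketch below converts the reported OR of 3.39 back to a logistic coefficient and applies it to a baseline probability of 0.2, which is a hypothetical value chosen for illustration.

```python
import math


def coefficient_from_odds_ratio(odds_ratio: float) -> float:
    """Logistic regression coefficient corresponding to an odds ratio."""
    if odds_ratio <= 0:
        raise ValueError("odds ratio must be positive")
    return math.log(odds_ratio)


def apply_odds_ratio(baseline_probability: float, odds_ratio: float) -> float:
    """Probability after multiplying the baseline odds by the odds ratio."""
    if not 0.0 < baseline_probability < 1.0:
        raise ValueError("baseline probability must lie strictly between 0 and 1")
    odds = baseline_probability / (1.0 - baseline_probability)
    new_odds = odds * odds_ratio
    return new_odds / (1.0 + new_odds)


beta = coefficient_from_odds_ratio(3.39)
shifted = apply_odds_ratio(0.2, 3.39)
print(f"beta = {beta:.3f}, probability 0.20 -> {shifted:.3f}")
```

In this example the probability rises from 0.20 to about 0.46, a factor of roughly 2.3 and not 3.4, which is why the result is better described as a change in odds.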
Logistic Regression: Concept and Application
ERIC Educational Resources Information Center
Cokluk, Omay
2010-01-01
The main focus of logistic regression analysis is classification of individuals in different groups. The aim of the present study is to explain basic concepts and processes of binary logistic regression analysis intended to determine the combination of independent variables which best explain the membership in certain groups called dichotomous…
NASA Astrophysics Data System (ADS)
Pradhan, Biswajeet
2010-05-01
This paper presents the results of the cross-validation of a multivariate logistic regression model using remote sensing data and GIS for landslide hazard analysis on the Penang, Cameron, and Selangor areas in Malaysia. Landslide locations in the study areas were identified by interpreting aerial photographs and satellite images, supported by field surveys. SPOT 5 and Landsat TM satellite imagery were used to map landcover and vegetation index, respectively. Maps of topography, soil type, lineaments and land cover were constructed from the spatial datasets. Ten factors which influence landslide occurrence, i.e., slope, aspect, curvature, distance from drainage, lithology, distance from lineaments, soil type, landcover, rainfall precipitation, and normalized difference vegetation index (NDVI), were extracted from the spatial database and the logistic regression coefficient of each factor was computed. Then the landslide hazard was analysed using the multivariate logistic regression coefficients derived not only from the data for the respective area but also using the logistic regression coefficients calculated from each of the other two areas (nine hazard maps in all) as a cross-validation of the model. For verification of the model, the results of the analyses were then compared with the field-verified landslide locations. Among the three cases of the application of logistic regression coefficients in the same study area, the case of Selangor based on the Selangor logistic regression coefficients showed the highest accuracy (94%), whereas Penang based on the Penang coefficients showed the lowest accuracy (86%). Similarly, among the six cases from the cross application of logistic regression coefficients in the other two areas, the case of Selangor based on the logistic coefficients of Cameron showed the highest (90%) prediction accuracy, whereas the case of Penang based on the Selangor logistic regression coefficients showed the lowest accuracy (79%). 
Qualitatively, the cross application model yields reasonable results which can be used for preliminary landslide hazard mapping.
An Entropy-Based Measure for Assessing Fuzziness in Logistic Regression
Weiss, Brandi A.; Dardick, William
2015-01-01
This article introduces an entropy-based measure of data–model fit that can be used to assess the quality of logistic regression models. Entropy has previously been used in mixture-modeling to quantify how well individuals are classified into latent classes. The current study proposes the use of entropy for logistic regression models to quantify the quality of classification and separation of group membership. Entropy complements preexisting measures of data–model fit and provides unique information not contained in other measures. Hypothetical data scenarios, an applied example, and Monte Carlo simulation results are used to demonstrate the application of entropy in logistic regression. Entropy should be used in conjunction with other measures of data–model fit to assess how well logistic regression models classify cases into observed categories. PMID:29795897
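To give a feel for how entropy can summarize classification quality, the sketch below computes the normalized (relative) entropy index familiar from mixture modeling, applied to predicted probabilities from a binary logistic model: 1 minus the total Shannon entropy divided by its maximum, N ln 2. The article's exact index may be defined differently, so this form is an assumption, and the probabilities are hypothetical.

```python
import math


def relative_entropy_index(probabilities):
    """Normalized entropy of binary predicted probabilities.

    1.0 means every case is classified with certainty; 0.0 means every
    predicted probability equals 0.5 (maximal fuzziness).
    """
    if not probabilities:
        raise ValueError("at least one probability is required")
    total = 0.0
    for p in probabilities:
        if not 0.0 <= p <= 1.0:
            raise ValueError("probabilities must lie in [0, 1]")
        for q in (p, 1.0 - p):
            if q > 0.0:  # the limit of q * ln(q) as q -> 0 is 0
                total -= q * math.log(q)
    return 1.0 - total / (len(probabilities) * math.log(2))


print(relative_entropy_index([0.5, 0.5]))
print(relative_entropy_index([0.0, 1.0]))
print(round(relative_entropy_index([0.9, 0.2, 0.7]), 3))
```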
Logistic regression applied to natural hazards: rare event logistic regression with replications
NASA Astrophysics Data System (ADS)
Guns, M.; Vanacker, V.
2012-06-01
Statistical analysis of natural hazards needs particular attention, as most of these phenomena are rare events. This study shows that the ordinary rare event logistic regression, as it is now commonly used in geomorphologic studies, does not always lead to a robust detection of controlling factors, as the results can be strongly sample-dependent. In this paper, we introduce some concepts of Monte Carlo simulations in rare event logistic regression. This technique, so-called rare event logistic regression with replications, combines the strength of probabilistic and statistical methods, and allows overcoming some of the limitations of previous developments through robust variable selection. This technique was here developed for the analyses of landslide controlling factors, but the concept is widely applicable for statistical analyses of natural hazards.
Large unbalanced credit scoring using Lasso-logistic regression ensemble.
Wang, Hong; Xu, Qingsong; Zhou, Lifeng
2015-01-01
Recently, various ensemble learning methods with different base classifiers have been proposed for credit scoring problems. However, for various reasons, there has been little research using logistic regression as the base classifier. In this paper, given large unbalanced data, we consider the plausibility of ensemble learning using regularized logistic regression as the base classifier to deal with credit scoring problems. In this research, the data is first balanced and diversified by clustering and bagging algorithms. Then we apply a Lasso-logistic regression learning ensemble to evaluate the credit risks. We show that the proposed algorithm outperforms popular credit scoring models such as decision tree, Lasso-logistic regression and random forests in terms of AUC and F-measure. We also provide two importance measures for the proposed model to identify important variables in the data.
An Entropy-Based Measure for Assessing Fuzziness in Logistic Regression.
Weiss, Brandi A; Dardick, William
2016-12-01
This article introduces an entropy-based measure of data-model fit that can be used to assess the quality of logistic regression models. Entropy has previously been used in mixture-modeling to quantify how well individuals are classified into latent classes. The current study proposes the use of entropy for logistic regression models to quantify the quality of classification and separation of group membership. Entropy complements preexisting measures of data-model fit and provides unique information not contained in other measures. Hypothetical data scenarios, an applied example, and Monte Carlo simulation results are used to demonstrate the application of entropy in logistic regression. Entropy should be used in conjunction with other measures of data-model fit to assess how well logistic regression models classify cases into observed categories.
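To make the idea concrete, here is a minimal sketch of a normalized entropy index computed from the predicted probabilities of a binary logistic model. It borrows the relative-entropy statistic commonly reported for mixture models (one minus the average entropy divided by ln 2 for two categories). Applying that formula here is an assumption, and it may differ from the exact measure the article proposes. The function name `normalized_entropy` and the example probabilities are hypothetical.

```python
import math

def normalized_entropy(probs):
    """Return 1 - (mean binary entropy / ln 2) for predicted P(y = 1).

    Values near 1 mean predictions sit close to 0 or 1 (crisp classification);
    values near 0 mean predictions sit close to 0.5 (fuzzy classification).
    Assumed formula, borrowed from mixture-model practice.
    """
    if not probs:
        raise ValueError("probs must be non-empty")
    total = 0.0
    for p in probs:
        for q in (p, 1.0 - p):
            if q > 0.0:  # treat 0 * ln(0) as 0
                total -= q * math.log(q)
    return 1.0 - total / (len(probs) * math.log(2))

# Hypothetical predicted probabilities from two fitted models.
crisp = normalized_entropy([0.01, 0.99, 0.02, 0.98])
fuzzy = normalized_entropy([0.45, 0.55, 0.50, 0.52])
print(f"crisp model: {crisp:.3f}, fuzzy model: {fuzzy:.3f}")
```

An index like this says nothing about whether the predictions are correct, only how sharply cases are separated, which is why the abstract recommends pairing it with other fit measures.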
Specification of variables predictive of victories in the sport of boxing.
Warnick, Jason E; Warnick, Kyla
2007-08-01
Compared to other sports, very little research has been conducted on which variables can predict victory in the sport of boxing. This investigation examined whether boxers' age, weight change from their preceding contest, country of origin, total number of wins, total number of losses, performance in their preceding contest, or the possession of a championship title was predictive of a winning performance in a given bout. A 1-mo. sample of male professional boxing records for all contests held in the USA (N = 400) was collected from the BoxRec online database. Logistic regression analysis indicated that only boxers' age, total number of wins and losses, and the performance in the preceding contest predicted significant variance in outcome.
Constructive thinking, rational intelligence and irritable bowel syndrome
Rey, Enrique; Ortega, Marta Moreno; Alonso, Monica Olga Garcia; Diaz-Rubio, Manuel
2009-01-01
AIM: To evaluate rational and experiential intelligence in irritable bowel syndrome (IBS) sufferers. METHODS: We recruited 100 subjects with IBS as per Rome II criteria (50 consulters and 50 non-consulters) and 100 healthy controls, matched by age, sex and educational level. Cases and controls completed a clinical questionnaire (including symptom characteristics and medical consultation) and the following tests: rational-intelligence (Wechsler Adult Intelligence Scale, 3rd edition); experiential-intelligence (Constructive Thinking Inventory); personality (NEO personality inventory); psychopathology (MMPI-2), anxiety (state-trait anxiety inventory) and life events (social readjustment rating scale). Analysis of variance was used to compare the test results of IBS-sufferers and controls, and a logistic regression model was then constructed and adjusted for age, sex and educational level to evaluate any possible association with IBS. RESULTS: No differences were found between IBS cases and controls in terms of IQ (102.0 ± 10.8 vs 102.8 ± 12.6), but IBS sufferers scored significantly lower in global constructive thinking (43.7 ± 9.4 vs 49.6 ± 9.7). In the logistic regression model, global constructive thinking score was independently linked to suffering from IBS [OR 0.92 (0.87-0.97)], without significant OR for total IQ. CONCLUSION: IBS subjects do not show lower rational intelligence than controls, but lower experiential intelligence is nevertheless associated with IBS. PMID:19575489
Dong, Mei-Xue; Hu, Ling; Huang, Yuan-Jun; Xu, Xiao-Min; Liu, Yang; Wei, You-Dong
2017-07-01
To determine cerebrovascular risk factors for patients with cerebral watershed infarction (CWI) from Southwest China. Patients suffering from acute ischemic stroke were categorized into internal CWI (I-CWI), external CWI (E-CWI), or non-CWI (patients without CWI) groups. Clinical data were collected and degrees of steno-occlusion of all cerebral arteries were scored. Arteries associated with the circle of Willis were also assessed. Data were compared using Pearson chi-squared tests for categorical data and 1-way analysis of variance with Bonferroni post hoc tests for continuous data, as appropriate. Multivariate binary logistic regression analysis was performed to determine independent cerebrovascular risk factors for CWI. Compared with non-CWI, I-CWI had higher degrees of steno-occlusion of the ipsilateral middle cerebral artery, ipsilateral carotid artery, and contralateral middle cerebral artery. E-CWI showed no significant differences. All 3 arteries were confirmed as independent cerebrovascular risk factors for I-CWI by multivariate binary logistic regression analysis. I-CWI had higher degrees of steno-occlusion of the ipsilateral middle cerebral artery compared with E-CWI. No significant differences were found among arteries associated with the circle of Willis. The ipsilateral middle cerebral artery, carotid artery, and contralateral middle cerebral artery were independent cerebrovascular risk factors for I-CWI. No cerebrovascular risk factor was identified for E-CWI.
Power and Sample Size Calculations for Logistic Regression Tests for Differential Item Functioning
ERIC Educational Resources Information Center
Li, Zhushan
2014-01-01
Logistic regression is a popular method for detecting uniform and nonuniform differential item functioning (DIF) effects. Theoretical formulas for the power and sample size calculations are derived for likelihood ratio tests and Wald tests based on the asymptotic distribution of the maximum likelihood estimators for the logistic regression model.…
A Methodology for Generating Placement Rules that Utilizes Logistic Regression
ERIC Educational Resources Information Center
Wurtz, Keith
2008-01-01
The purpose of this article is to provide the necessary tools for institutional researchers to conduct a logistic regression analysis and interpret the results. Aspects of the logistic regression procedure that are necessary to evaluate models are presented and discussed with an emphasis on cutoff values and choosing the appropriate number of…
John Hogland; Nedret Billor; Nathaniel Anderson
2013-01-01
Discriminant analysis, referred to as maximum likelihood classification within popular remote sensing software packages, is a common supervised technique used by analysts. Polytomous logistic regression (PLR), also referred to as multinomial logistic regression, is an alternative classification approach that is less restrictive, more flexible, and easy to interpret. To...
Large Unbalanced Credit Scoring Using Lasso-Logistic Regression Ensemble
Wang, Hong; Xu, Qingsong; Zhou, Lifeng
2015-01-01
Recently, various ensemble learning methods with different base classifiers have been proposed for credit scoring problems. However, for various reasons, there has been little research using logistic regression as the base classifier. In this paper, given large unbalanced data, we consider the plausibility of ensemble learning using regularized logistic regression as the base classifier to deal with credit scoring problems. In this research, the data is first balanced and diversified by clustering and bagging algorithms. Then we apply a Lasso-logistic regression learning ensemble to evaluate the credit risks. We show that the proposed algorithm outperforms popular credit scoring models such as decision tree, Lasso-logistic regression and random forests in terms of AUC and F-measure. We also provide two importance measures for the proposed model to identify important variables in the data. PMID:25706988
Estimating integrated variance in the presence of microstructure noise using linear regression
NASA Astrophysics Data System (ADS)
Holý, Vladimír
2017-07-01
Using financial high-frequency data for estimation of integrated variance of asset prices is beneficial but with increasing number of observations so-called microstructure noise occurs. This noise can significantly bias the realized variance estimator. We propose a method for estimation of the integrated variance robust to microstructure noise as well as for testing the presence of the noise. Our method utilizes linear regression in which realized variances estimated from different data subsamples act as dependent variable while the number of observations act as explanatory variable. We compare proposed estimator with other methods on simulated data for several microstructure noise structures.
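The regression idea can be sketched as follows. Under the common assumption of i.i.d. noise that is independent of the price (an assumption not stated in the abstract), the expected realized variance from n returns is roughly the integrated variance plus 2n times the noise variance. The intercept of a line fitted to realized variance against the number of observations then estimates the integrated variance, and half the slope estimates the noise variance. This is an illustration of that textbook relationship, not the paper's exact estimator; all function names are hypothetical.

```python
def realized_variance(log_prices, step):
    """Sum of squared log-price increments using every `step`-th observation."""
    sub = log_prices[::step]
    return sum((b - a) ** 2 for a, b in zip(sub, sub[1:]))

def ols_line(x, y):
    """Ordinary least squares fit of y = intercept + slope * x."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return my - slope * mx, slope

def integrated_variance_estimate(log_prices, steps):
    """Regress subsampled realized variances on the number of returns used.

    Returns (intercept, slope / 2): under the i.i.d.-noise assumption these
    estimate the integrated variance and the noise variance respectively.
    """
    counts, rvs = [], []
    for step in steps:
        counts.append(len(log_prices[::step]) - 1)  # number of returns
        rvs.append(realized_variance(log_prices, step))
    intercept, slope = ols_line(counts, rvs)
    return intercept, slope / 2.0
```

With real data, `steps` would be a range of sampling intervals (for example every 1st, 2nd, 5th and 10th tick), and a slope close to zero would suggest that noise is negligible at those frequencies.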
Hossain, Md Golam; Saw, Aik; Alam, Rashidul; Ohtsuki, Fumio; Kamarul, Tunku
2013-09-01
Cephalic index (CI), the ratio of head breadth to head length, is widely used to categorise human populations. The aim of this study was to assess the impact of anthropometric measurements on the CI of male Japanese university students. This study included 1,215 male university students from Tokyo and Kyoto, selected using convenience sampling. Multiple regression analysis was used to determine the effect of anthropometric measurements on CI. The variance inflation factor (VIF) showed no evidence of a multicollinearity problem among independent variables. The coefficients of the regression line demonstrated a significant positive relationship between CI and minimum frontal breadth (p < 0.01), bizygomatic breadth (p < 0.01) and head height (p < 0.05), and a negative relationship between CI and morphological facial height (p < 0.01) and head circumference (p < 0.01). Moreover, the coefficient and odds ratio of logistic regression analysis showed a greater likelihood for minimum frontal breadth (p < 0.01) and bizygomatic breadth (p < 0.01) to predict round-headedness, and morphological facial height (p < 0.05) and head circumference (p < 0.01) to predict long-headedness. Stepwise regression analysis revealed bizygomatic breadth, head circumference, minimum frontal breadth, head height and morphological facial height to be the best predictor craniofacial measurements with respect to CI. The results suggest that most of the variables considered in this study appear to influence the CI of adult male Japanese students.
An Entropy-Based Measure for Assessing Fuzziness in Logistic Regression
ERIC Educational Resources Information Center
Weiss, Brandi A.; Dardick, William
2016-01-01
This article introduces an entropy-based measure of data-model fit that can be used to assess the quality of logistic regression models. Entropy has previously been used in mixture-modeling to quantify how well individuals are classified into latent classes. The current study proposes the use of entropy for logistic regression models to quantify…
What Are the Odds of that? A Primer on Understanding Logistic Regression
ERIC Educational Resources Information Center
Huang, Francis L.; Moon, Tonya R.
2013-01-01
The purpose of this Methodological Brief is to present a brief primer on logistic regression, a commonly used technique when modeling dichotomous outcomes. Using data from the National Education Longitudinal Study of 1988 (NELS:88), logistic regression techniques were used to investigate student-level variables in eighth grade (i.e., enrolled in a…
On the Usefulness of a Multilevel Logistic Regression Approach to Person-Fit Analysis
ERIC Educational Resources Information Center
Conijn, Judith M.; Emons, Wilco H. M.; van Assen, Marcel A. L. M.; Sijtsma, Klaas
2011-01-01
The logistic person response function (PRF) models the probability of a correct response as a function of the item locations. Reise (2000) proposed to use the slope parameter of the logistic PRF as a person-fit measure. He reformulated the logistic PRF model as a multilevel logistic regression model and estimated the PRF parameters from this…
Stylianou, Neophytos; Akbarov, Artur; Kontopantelis, Evangelos; Buchan, Iain; Dunn, Ken W
2015-08-01
Predicting mortality from burn injury has traditionally employed logistic regression models. Alternative machine learning methods have been introduced in some areas of clinical prediction as the necessary software and computational facilities have become accessible. Here we compare logistic regression and machine learning predictions of mortality from burn. An established logistic mortality model was compared to machine learning methods (artificial neural network, support vector machine, random forests and naïve Bayes) using a population-based (England & Wales) case-cohort registry. Predictive evaluation used: area under the receiver operating characteristic curve; sensitivity; specificity; positive predictive value and Youden's index. All methods had comparable discriminatory abilities, similar sensitivities, specificities and positive predictive values. Although some machine learning methods performed marginally better than logistic regression the differences were seldom statistically significant and clinically insubstantial. Random forests were marginally better for high positive predictive value and reasonable sensitivity. Neural networks yielded slightly better prediction overall. Logistic regression gives an optimal mix of performance and interpretability. The established logistic regression model of burn mortality performs well against more complex alternatives. Clinical prediction with a small set of strong, stable, independent predictors is unlikely to gain much from machine learning outside specialist research contexts. Copyright © 2015 Elsevier Ltd and ISBI. All rights reserved.
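The threshold-based evaluation measures named in the abstract all derive from a 2×2 confusion table. The sketch below shows the standard definitions; the counts are hypothetical and are not taken from the burn registry.

```python
def diagnostic_metrics(tp, fp, fn, tn):
    """Standard 2x2 confusion-table summaries for a binary classifier."""
    sensitivity = tp / (tp + fn)   # true positive rate
    specificity = tn / (tn + fp)   # true negative rate
    ppv = tp / (tp + fp)           # positive predictive value
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "ppv": ppv,
        "youden_j": sensitivity + specificity - 1.0,  # Youden's index
    }

# Hypothetical counts: 100 deaths and 900 survivors, one decision threshold.
metrics = diagnostic_metrics(tp=80, fp=30, fn=20, tn=870)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

Because every one of these quantities depends on the chosen probability cutoff, comparisons between models are usually reported alongside a threshold-free measure such as the area under the ROC curve, as the study did.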
Valle, Denis; Lima, Joanna M Tucker; Millar, Justin; Amratia, Punam; Haque, Ubydul
2015-11-04
Logistic regression is a statistical model widely used in cross-sectional and cohort studies to identify and quantify the effects of potential disease risk factors. However, the impact of imperfect tests on adjusted odds ratios (and thus on the identification of risk factors) is under-appreciated. The purpose of this article is to draw attention to the problem associated with modelling imperfect diagnostic tests, and propose simple Bayesian models to adequately address this issue. A systematic literature review was conducted to determine the proportion of malaria studies that appropriately accounted for false-negatives/false-positives in a logistic regression setting. Inference from the standard logistic regression was also compared with that from three proposed Bayesian models using simulations and malaria data from the western Brazilian Amazon. A systematic literature review suggests that malaria epidemiologists are largely unaware of the problem of using logistic regression to model imperfect diagnostic test results. Simulation results reveal that statistical inference can be substantially improved when using the proposed Bayesian models versus the standard logistic regression. Finally, analysis of original malaria data with one of the proposed Bayesian models reveals that microscopy sensitivity is strongly influenced by how long people have lived in the study region, and an important risk factor (i.e., participation in forest extractivism) is identified that would have been missed by standard logistic regression. Given the numerous diagnostic methods employed by malaria researchers and the ubiquitous use of logistic regression to model the results of these diagnostic tests, this paper provides critical guidelines to improve data analysis practice in the presence of misclassification error. Easy-to-use code that can be readily adapted to WinBUGS is provided, enabling straightforward implementation of the proposed Bayesian models.
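A small numerical sketch shows why an imperfect test matters. It assumes the usual misclassification relationship, in which the probability of a positive test equals sensitivity times the true probability plus (1 − specificity) times its complement. The abstract does not spell out the proposed Bayesian models, so this only illustrates the bias that such models are meant to correct; the coefficients, sensitivity and specificity are hypothetical.

```python
import math

def inv_logit(eta):
    """Logistic function mapping a linear predictor to a probability."""
    return 1.0 / (1.0 + math.exp(-eta))

def prob_test_positive(p_true, sensitivity, specificity):
    """Probability of a positive result from an imperfect test."""
    return sensitivity * p_true + (1.0 - specificity) * (1.0 - p_true)

def odds_ratio(p1, p0):
    """Odds ratio comparing probability p1 (exposed) with p0 (unexposed)."""
    return (p1 / (1.0 - p1)) / (p0 / (1.0 - p0))

# Hypothetical true model: 10% infection when unexposed, true OR of 2.25.
b0, b1 = math.log(1.0 / 9.0), math.log(2.25)
p_unexposed, p_exposed = inv_logit(b0), inv_logit(b0 + b1)

# Hypothetical test performance.
se, sp = 0.80, 0.95
obs_unexposed = prob_test_positive(p_unexposed, se, sp)
obs_exposed = prob_test_positive(p_exposed, se, sp)

true_or = odds_ratio(p_exposed, p_unexposed)
apparent_or = odds_ratio(obs_exposed, obs_unexposed)
print(f"true OR = {true_or:.2f}, apparent OR = {apparent_or:.2f}")
```

In this setup the odds ratio computed from test results is pulled toward 1, which is how a genuine risk factor can be missed by a standard logistic regression on imperfect diagnostic outcomes.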
Logistic regression for risk factor modelling in stuttering research.
Reed, Phil; Wu, Yaqionq
2013-06-01
To outline the uses of logistic regression and other statistical methods for risk factor analysis in the context of research on stuttering. The principles underlying the application of a logistic regression are illustrated, and the types of questions to which such a technique has been applied in the stuttering field are outlined. The assumptions and limitations of the technique are discussed with respect to existing stuttering research, and with respect to formulating appropriate research strategies to accommodate these considerations. Finally, some alternatives to the approach are briefly discussed. The way the statistical procedures are employed is demonstrated with some hypothetical data. Research into several practical issues concerning stuttering could benefit if risk factor modelling were used. Important examples are early diagnosis, prognosis (whether a child will recover or persist) and assessment of treatment outcome. After reading this article you will: (a) Summarize the situations in which logistic regression can be applied to a range of issues about stuttering; (b) Follow the steps in performing a logistic regression analysis; (c) Describe the assumptions of the logistic regression technique and the precautions that need to be checked when it is employed; (d) Be able to summarize its advantages over other techniques like estimation of group differences and simple regression. Copyright © 2012 Elsevier Inc. All rights reserved.
Examination of premenstrual symptoms as a risk factor for depression in postpartum women.
Buttner, Melissa M; Mott, Sarah L; Pearlstein, Teri; Stuart, Scott; Zlotnick, Caron; O'Hara, Michael W
2013-06-01
Postpartum depression (PPD) is a significant public health concern with prevalence of major and minor depressions reaching 20 % in the first three postpartum months. Sociodemographic and psychopathology correlates of PPD are well established; however, information on the relationship between premenstrual disorders and the development of PPD is less well established. Thus, the aim of this study was to examine the role of premenstrual syndrome (PMS)/premenstrual dysphoric disorder (PMDD) as a risk factor for PPD. Premenstrual symptoms were assessed retrospectively using the premenstrual symptoms screening tool (PSST) and depression was diagnosed according to the Diagnostic and Statistical Manual of Mental Disorders, Fourth Edition (DSM-IV) criteria and assessed using the Hamilton Depression Rating Scale (HDRS). A two-stage screening procedure was applied. In the first stage, the Patient Health Questionnaire (PHQ-9) was employed. In the second stage, women endorsing ≥5 symptoms on the PHQ-9 were administered the Structured Clinical Interview for DSM-IV, HDRS, and PSST. Hierarchical linear regression showed that history of depression and PMS/PMDD contributed an additional 2 % of the variance (p < 0.001), beyond that of sociodemographic factor effects. The full model accounted for 13 % of the variance in postpartum depressive symptoms. Using logistic regression, a significant association also emerged between PMS/PMDD and PPD (OR = 1.97). The findings of this study suggest that PMS/PMDD is an important risk factor for PPD. Women endorsing a history of PMS/PMDD should be monitored during the perinatal period.
Factors Associated With Surgery Clerkship Performance and Subsequent USMLE Step Scores.
Dong, Ting; Copeland, Annesley; Gangidine, Matthew; Schreiber-Gregory, Deanna; Ritter, E Matthew; Durning, Steven J
2018-03-12
We conducted an in-depth empirical investigation to achieve a better understanding of the surgery clerkship from multiple perspectives, including the influence of clerkship sequence on performance, the relationship between self-logged work hours and performance, as well as the association between surgery clerkship performance and subsequent USMLE Step exam scores. The study cohort consisted of medical students graduating between 2015 and 2018 (n = 687). The primary measures of interest were clerkship sequence (internal medicine clerkship before or after surgery clerkship), self-logged work hours during surgery clerkship, surgery NBME subject exam score, surgery clerkship overall grade, and Step 1, Step 2 CK, and Step 3 exam scores. We reported the descriptive statistics and conducted correlation analysis, stepwise linear regression analysis, and variable selection analysis of logistic regression to answer the research questions. Students who completed internal medicine clerkship prior to surgery clerkship had better performance on surgery subject exam. The subject exam score explained an additional 28% of the variance of the Step 2 CK score, and the clerkship overall score accounted for an additional 24% of the variance after the MCAT scores and undergraduate GPA were controlled. Our finding suggests that the clerkship sequence does matter when it comes to performance on the surgery NBME subject exam. Performance on the surgery subject exam is predictive of subsequent performance on future USMLE Step exams. Copyright © 2018 Association of Program Directors in Surgery. Published by Elsevier Inc. All rights reserved.
Dynamic Dimensionality Selection for Bayesian Classifier Ensembles
2015-03-19
learning of weights in an otherwise generatively learned naive Bayes classifier. WANBIA-C is very competitive with Logistic Regression but much more...classifier, Generative learning, Discriminative learning, Naïve Bayes, Feature selection, Logistic regression, higher order attribute independence...discriminative learning of weights in an otherwise generatively learned naive Bayes classifier. WANBIA-C is very competitive with Logistic Regression but
Travis Woolley; David C. Shaw; Lisa M. Ganio; Stephen Fitzgerald
2012-01-01
Logistic regression models used to predict tree mortality are critical to post-fire management, planning prescribed burns and understanding disturbance ecology. We review literature concerning post-fire mortality prediction using logistic regression models for coniferous tree species in the western USA. We include synthesis and review of: methods to develop, evaluate...
Preserving Institutional Privacy in Distributed binary Logistic Regression.
Wu, Yuan; Jiang, Xiaoqian; Ohno-Machado, Lucila
2012-01-01
Privacy is becoming a major concern when sharing biomedical data across institutions. Although methods for protecting privacy of individual patients have been proposed, it is not clear how to protect the institutional privacy, which is often a critical concern of data custodians. Built upon our previous work, Grid Binary LOgistic REgression (GLORE), we developed an Institutional Privacy-preserving Distributed binary Logistic Regression model (IPDLR) that considers both individual and institutional privacy for building a logistic regression model in a distributed manner. We tested our method using both simulated and clinical data, showing how it is possible to protect the privacy of individuals and of institutions using a distributed strategy.
Covariate Imbalance and Adjustment for Logistic Regression Analysis of Clinical Trial Data
Ciolino, Jody D.; Martin, Reneé H.; Zhao, Wenle; Jauch, Edward C.; Hill, Michael D.; Palesch, Yuko Y.
2014-01-01
In logistic regression analysis for binary clinical trial data, adjusted treatment effect estimates are often not equivalent to unadjusted estimates in the presence of influential covariates. This paper uses simulation to quantify the benefit of covariate adjustment in logistic regression. However, International Conference on Harmonization guidelines suggest that covariate adjustment be pre-specified. Unplanned adjusted analyses should be considered secondary. Results suggest that if adjustment is not possible or unplanned in a logistic setting, balance in continuous covariates can alleviate some (but never all) of the shortcomings of unadjusted analyses. The case of log binomial regression is also explored. PMID:24138438
Differentially private distributed logistic regression using private and public data.
Ji, Zhanglong; Jiang, Xiaoqian; Wang, Shuang; Xiong, Li; Ohno-Machado, Lucila
2014-01-01
Privacy protection is an important issue in medical informatics and differential privacy is a state-of-the-art framework for data privacy research. Differential privacy offers provable privacy against attackers who have auxiliary information, and can be applied to data mining models (for example, logistic regression). However, differentially private methods sometimes introduce too much noise and make outputs less useful. Given available public data in medical research (e.g. from patients who sign open-consent agreements), we can design algorithms that use both public and private data sets to decrease the amount of noise that is introduced. In this paper, we modify the update step in the Newton-Raphson method to propose a differentially private distributed logistic regression model based on both public and private data. We try our algorithm on three different data sets, and show its advantage over: (1) a logistic regression model based solely on public data, and (2) a differentially private distributed logistic regression model based on private data under various scenarios. Logistic regression models built with our new algorithm based on both private and public datasets demonstrate better utility than models trained on private or public datasets alone, without sacrificing the rigorous privacy guarantee.
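For orientation, the following sketch shows the plain Newton-Raphson update for logistic regression in a distributed form, where each site contributes only aggregated gradient and Hessian sums rather than patient-level rows. It deliberately omits any noise addition or other privacy mechanism, so it is not the differentially private algorithm described above. The one-predictor model, the function names and the two small data sets are all hypothetical.

```python
import math

def site_summaries(beta, rows):
    """Gradient and information sums from one site's rows.

    rows: list of (x, y) with y in {0, 1}; model: logit P(y=1) = b0 + b1 * x.
    Only these aggregates, not the rows themselves, leave the site.
    """
    g0 = g1 = h00 = h01 = h11 = 0.0
    for x, y in rows:
        p = 1.0 / (1.0 + math.exp(-(beta[0] + beta[1] * x)))
        w = p * (1.0 - p)
        g0 += y - p
        g1 += (y - p) * x
        h00 += w
        h01 += w * x
        h11 += w * x * x
    return (g0, g1), (h00, h01, h11)

def newton_step(beta, sites):
    """One Newton-Raphson update using summed per-site aggregates."""
    g0 = g1 = h00 = h01 = h11 = 0.0
    for rows in sites:
        (a0, a1), (b00, b01, b11) = site_summaries(beta, rows)
        g0, g1 = g0 + a0, g1 + a1
        h00, h01, h11 = h00 + b00, h01 + b01, h11 + b11
    det = h00 * h11 - h01 * h01
    # beta_new = beta + I^{-1} g, with I the 2x2 information matrix X'WX.
    d0 = (h11 * g0 - h01 * g1) / det
    d1 = (h00 * g1 - h01 * g0) / det
    return (beta[0] + d0, beta[1] + d1)

def fit(sites, iterations=25):
    """Run a fixed number of Newton-Raphson updates starting from zero."""
    beta = (0.0, 0.0)
    for _ in range(iterations):
        beta = newton_step(beta, sites)
    return beta

# Two hypothetical sites with overlapping (non-separable) outcomes.
site_a = [(0, 0), (1, 0), (2, 1), (3, 0)]
site_b = [(1, 1), (2, 0), (3, 1), (4, 1)]
beta_hat = fit([site_a, site_b])
print(f"intercept = {beta_hat[0]:.3f}, slope = {beta_hat[1]:.3f}")
```

Because the gradient and information matrix are sums over observations, fitting on the per-site aggregates gives the same estimates as pooling the rows. A private variant would have to perturb these aggregates or the update itself, which is the step the paper modifies.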
Deng, Yingyuan; Wang, Tianfu; Chen, Siping; Liu, Weixiang
2017-01-01
The aim of the study is to screen the significant sonographic features by logistic regression analysis and fit a model to diagnose thyroid nodules. A total of 525 pathological thyroid nodules were retrospectively analyzed. All the nodules underwent conventional ultrasonography (US), strain elastosonography (SE), and contrast-enhanced ultrasound (CEUS). Those nodules’ 12 suspicious sonographic features were used to assess thyroid nodules. The significant features of diagnosing thyroid nodules were picked out by logistic regression analysis. All variables that were statistically related to diagnosis of thyroid nodules, at a level of p < 0.05, were embodied in a logistic regression analysis model. The significant features in the logistic regression model of diagnosing thyroid nodules were calcification, suspected cervical lymph node metastasis, hypoenhancement pattern, margin, shape, vascularity, posterior acoustic, echogenicity, and elastography score. According to the results of logistic regression analysis, the formula that could predict whether or not thyroid nodules are malignant was established. The area under the receiver operating characteristic (ROC) curve was 0.930 and the sensitivity, specificity, accuracy, positive predictive value, and negative predictive value were 83.77%, 89.56%, 87.05%, 86.04%, and 87.79% respectively. PMID:29228030
Pang, Tiantian; Huang, Leidan; Deng, Yingyuan; Wang, Tianfu; Chen, Siping; Gong, Xuehao; Liu, Weixiang
2017-01-01
The aim of the study is to screen the significant sonographic features by logistic regression analysis and fit a model to diagnose thyroid nodules. A total of 525 pathological thyroid nodules were retrospectively analyzed. All the nodules underwent conventional ultrasonography (US), strain elastosonography (SE), and contrast-enhanced ultrasound (CEUS). Those nodules' 12 suspicious sonographic features were used to assess thyroid nodules. The significant features of diagnosing thyroid nodules were picked out by logistic regression analysis. All variables that were statistically related to diagnosis of thyroid nodules, at a level of p < 0.05, were embodied in a logistic regression analysis model. The significant features in the logistic regression model of diagnosing thyroid nodules were calcification, suspected cervical lymph node metastasis, hypoenhancement pattern, margin, shape, vascularity, posterior acoustic, echogenicity, and elastography score. According to the results of logistic regression analysis, the formula that could predict whether or not thyroid nodules are malignant was established. The area under the receiver operating characteristic (ROC) curve was 0.930 and the sensitivity, specificity, accuracy, positive predictive value, and negative predictive value were 83.77%, 89.56%, 87.05%, 86.04%, and 87.79% respectively.
Amini, Payam; Maroufizadeh, Saman; Samani, Reza Omani; Hamidi, Omid; Sepidarkish, Mahdi
2017-06-01
Preterm birth (PTB) is a leading cause of neonatal death and the second biggest cause of death in children under five years of age. The objective of this study was to determine the prevalence of PTB and its associated factors using logistic regression and decision tree classification methods. This cross-sectional study was conducted on 4,415 pregnant women in Tehran, Iran, from July 6-21, 2015. Data were collected by a researcher-developed questionnaire through interviews with mothers and review of their medical records. To evaluate the accuracy of the logistic regression and decision tree methods, several indices such as sensitivity, specificity, and the area under the curve were used. The PTB rate was 5.5% in this study. The logistic regression outperformed the decision tree for the classification of PTB based on risk factors. Logistic regression showed that multiple pregnancies, mothers with preeclampsia, and those who conceived with assisted reproductive technology had an increased risk for PTB (p < 0.05). Identifying and training mothers at risk as well as improving prenatal care may reduce the PTB rate. We also recommend that statisticians utilize the logistic regression model for the classification of risk groups for PTB.
Individual and contextual factors associated with verbal bullying among Brazilian adolescents.
Azeredo, Catarina Machado; Levy, Renata Bertazzi; Araya, Ricardo; Menezes, Paulo Rossi
2015-05-01
Few studies have been carried out in low- and middle-income countries assessing contextual characteristics associated with bullying. This study aimed to assess the relative importance of contextual (school and city) and individual-level factors to explain the variance in verbal bullying among a nationally representative sample of Brazilian adolescents. 59,348 students from 1,453 schools and 26 state capitals and the Federal District participated in the National Survey of School Health among 9th Grade Students (PeNSE, 2009). We performed multilevel logistic regression in a three level model (individual, school and city). The 30-day prevalence of verbal bullying among these students was 14.2%. We found that 1.8% and 0.3% of the total variance in bullying occurred at school-level and city-level, respectively, and 97.9% at individual-level. At city-level, all factors included failed to demonstrate a significant association with bullying (p < 0.05) whereas at school-level, private schools presented more bullying than public schools (OR = 1.17, CI 1.04-1.31). At individual-level, male gender, younger age, not living with both parents, exposed to domestic violence, under or overweight were all associated with bullying. All socioeconomic indicators assessed contributed little to explain the variance in bullying at individual, school or city-level. Population subgroups at risk identified according to their individual profile could be targeted in future interventions in Brazil.
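Variance shares of this kind are often obtained with the latent-variable definition of the variance partition coefficient, in which the individual-level variance of a logistic model is fixed at π²/3. The abstract does not say which definition the authors used, so the sketch below is an assumption. The random-intercept variances are hypothetical values chosen only to land near the reported magnitudes and are not estimates from the study.

```python
import math

def latent_vpc(level_variances):
    """Share of total latent-scale variance at each level.

    level_variances: random-intercept variances by higher level (logit scale).
    The individual-level variance is taken as pi^2 / 3, the variance of the
    standard logistic distribution.
    """
    residual = math.pi ** 2 / 3.0
    total = residual + sum(level_variances.values())
    shares = {level: v / total for level, v in level_variances.items()}
    shares["individual"] = residual / total
    return shares

# Hypothetical variances for a three-level model (individual, school, city).
shares = latent_vpc({"school": 0.06, "city": 0.01})
for level, share in shares.items():
    print(f"{level}: {100 * share:.1f}%")
```

Shares this small at the school and city levels indicate that nearly all variation in the outcome lies between individuals, which fits the abstract's conclusion that interventions should target individual risk profiles.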
Kim, Sung Han; Oh, Shin Ah; Oh, Seung-June
2014-02-01
To identify the voiding characteristics of bladder pain syndrome/interstitial cystitis and overactive bladder. Between September 2005 and June 2010, 3-day voiding diaries of 49 consecutive bladder pain syndrome/interstitial cystitis patients and 301 overactive bladder patients were prospectively collected at an outpatient clinic and retrospectively analyzed. The characteristics of the two groups were not significantly different. However, all voiding variables including volume and frequency were significantly different except for the total voided volume: patients with bladder pain syndrome/interstitial cystitis showed significantly higher voiding frequencies, smaller maximal and mean voided volume, and more constant and narrower ranges of voided volume compared with overactive bladder patients (P < 0.005). Furthermore, mean intervals between voiding in bladder pain syndrome/interstitial cystitis were shorter and more consistent during the day and night (P < 0.001), although mean night-time variances were greater than daytime variances. Logistic regression analysis showed that total night-time frequency, maximal night-time voided volume and mean variance of daytime voiding intervals most significantly differentiated the two groups. Some voiding characteristics of bladder pain syndrome/interstitial cystitis and overactive bladder patients differ significantly according to 3-day voiding diary records. These findings provide additional information regarding the differences between these two diseases in the outpatient clinical setting. © 2013 The Japanese Urological Association.
Predictors of healthcare utilization among older Mexican Americans.
Al Snih, Soham; Markides, Kyriakos S; Ray, Laura A; Freeman, Jean L; Ostir, Glenn V; Goodwin, James S
2006-01-01
To examine the effects of predisposing, enabling, and need factors on physician and hospital use among older Mexican Americans. A two-year prospective cohort study. Five Southwestern states: Texas, New Mexico, Colorado, Arizona, and California. A population-based sample of 1987 non-institutionalized Mexican American men and women aged ≥65 years. Physician and hospital utilization. Predictor variables included predisposing, enabling, and need factors. Ordinary least squares and logistic regression analyses were used to model the effects of predictor factors specified in the Andersen model of health service use on physician and hospital use. After two years of follow-up, predisposing and enabling factors accounted for <5% of the variance in physician and hospital use. Need factors explained 21% of the variance in physician use and 7% of the variance in hospital use. Older age; being female; insurance coverage; having arthritis, diabetes, heart attack, hypertension, stroke, or cancer; and number of medications were factors associated with higher physician utilization. Subjects with arthritis, diabetes, hip fracture, high depressive symptoms, activities of daily living (ADL) disability, or a high number of medications had increased odds of having any hospitalization. Subjects with diabetes, heart attack, hip fracture, ADL disability, and a high number of medications had a greater number of hospital nights than their counterparts. Older age, female sex, insurance coverage, and prevalent medical conditions are determinants of healthcare use among older Mexican Americans.
Doyle, F; McGee, H M; Conroy, R M; Delaney, M
2011-05-01
Depression is associated with increased cardiovascular risk in acute coronary syndrome (ACS) patients, but some argue that elevated depression is actually a marker of cardiovascular disease severity. Therefore, disease indices should better predict depression than established theoretical causes of depression (interpersonal life events, reinforcing events, cognitive distortions, type D personality). However, little theory-based research has been conducted in this area. In a cross-sectional design, ACS patients (n = 336) completed questionnaires assessing depression and psychosocial vulnerabilities. Nested logistic regression assessed the relative contribution of demographic or vulnerability factors, or disease indices or vulnerabilities to depression. In multivariate analysis, all vulnerabilities were independent significant predictors of depression (scoring above threshold on any scale, 48%). Demographic variables accounted for <1% of the variance of depression status, with vulnerabilities accounting for significantly more (pseudo R² = 0.16, χ²(change) = 150.9, df = 4, p < 0.001). Disease indices accounted for 7% of the variance in depression (pseudo R² = 0.07, χ² = 137.9, p < 0.001). However, adding the vulnerabilities increased the overall variance explained to 22% (pseudo R² = 0.22, χ² = 58.6, df = 4, p < 0.001). Theoretical vulnerabilities predicted depression status better than did either demographic or disease indices. The presence of these proximal causes of depression suggests that depression in ACS patients is not simply a result of cardiovascular disease severity.
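The nested comparison described above rests on two quantities: a pseudo R² for each model and a likelihood-ratio χ² for the change between nested models. The sketch below assumes McFadden's pseudo R² (the abstract does not say which variant was used) and uses hypothetical log-likelihoods; the function names are illustrative. The p-value uses the closed form of the χ² survival function, which holds only for even degrees of freedom.

```python
import math

def mcfadden_r2(ll_model, ll_null):
    """McFadden pseudo R^2: 1 - LL(model) / LL(intercept-only model)."""
    return 1.0 - ll_model / ll_null

def lr_chi2(ll_reduced, ll_full):
    """Likelihood-ratio statistic for a reduced model nested in a full one."""
    return 2.0 * (ll_full - ll_reduced)

def chi2_sf_even_df(x, df):
    """Upper-tail probability of a chi-square variable; even df only."""
    if df % 2 != 0:
        raise ValueError("closed form requires even degrees of freedom")
    half = x / 2.0
    terms = sum(half ** k / math.factorial(k) for k in range(df // 2))
    return math.exp(-half) * terms

# Hypothetical log-likelihoods, not taken from the study.
ll_null = -232.0      # intercept only
ll_disease = -216.0   # disease indices only
ll_full = -181.0      # disease indices plus four vulnerabilities

change = lr_chi2(ll_disease, ll_full)
print(f"pseudo R2 (disease): {mcfadden_r2(ll_disease, ll_null):.3f}")
print(f"pseudo R2 (full):    {mcfadden_r2(ll_full, ll_null):.3f}")
print(f"chi2 change = {change:.1f}, df = 4, p = {chi2_sf_even_df(change, 4):.2e}")
```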
Mulier, Jan P; De Boeck, Liesje; Meulders, Michel; Beliën, Jeroen; Colpaert, Jan; Sels, Annabel
2015-01-01
Rationale, aims and objectives: What factors determine the use of an anaesthesia preparation room and shorten non-operative time? Methods: A logistic regression is applied to 18 751 surgery records from AZ Sint-Jan Brugge AV, Belgium, where each operating room has its own anaesthesia preparation room. Surgeries in which the patient's induction has already started when the preceding patient's surgery has ended belong to a first group, where the preparation room is used as an induction room. Surgeries not fulfilling this property belong to a second group. A logistic regression model tries to predict the probability that a surgery will be classified into a specific group. Non-operative time is calculated as the time between end of the previous surgery and incision of the next surgery. A log-linear regression of this non-operative time is performed. Results: It was found that switches in surgeons, being a non-elective surgery, and the previous surgery being non-elective increase the probability of being classified into the second group. Only a few surgery types, anaesthesiologists and operating rooms can be found exclusively in one of the two groups. Analysis of variance demonstrates that the first group has significantly lower non-operative times. Switches in surgeons or anaesthesiologists and longer scheduled durations of the previous surgery increase the non-operative time. A switch in both surgeon and anaesthesiologist strengthens this negative effect. Only a few operating rooms and surgery types influence the non-operative time. Conclusion: The use of the anaesthesia preparation room shortens the non-operative time and is determined by several human and structural factors. PMID:25496600
Logistic regression for dichotomized counts.
Preisser, John S; Das, Kalyan; Benecha, Habtamu; Stamm, John W
2016-12-01
Sometimes there is interest in a dichotomized outcome indicating whether a count variable is positive or zero. Under this scenario, the application of ordinary logistic regression may result in efficiency loss, which is quantifiable under an assumed model for the counts. In such situations, a shared-parameter hurdle model is investigated for more efficient estimation of regression parameters relating to overall effects of covariates on the dichotomous outcome, while handling count data with many zeroes. One model part provides a logistic regression containing marginal log odds ratio effects of primary interest, while an ancillary model part describes the mean count of a Poisson or negative binomial process in terms of nuisance regression parameters. Asymptotic efficiency of the logistic model parameter estimators of the two-part models is evaluated with respect to ordinary logistic regression. Simulations are used to assess the properties of the models with respect to power and Type I error, the latter investigated under both misspecified and correctly specified models. The methods are applied to data from a randomized clinical trial of three toothpaste formulations to prevent incident dental caries in a large population of Scottish schoolchildren. © The Author(s) 2014.
Zhu, K; Lou, Z; Zhou, J; Ballester, N; Kong, N; Parikh, P
2015-01-01
This article is part of the Focus Theme of Methods of Information in Medicine on "Big Data and Analytics in Healthcare". Hospital readmissions raise healthcare costs and cause significant distress to providers and patients. It is, therefore, of great interest to healthcare organizations to predict which patients are at risk of being readmitted to their hospitals. However, current logistic regression based risk prediction models have limited prediction power when applied to hospital administrative data. Meanwhile, although decision trees and random forests have been applied, they tend to be too complex for hospital practitioners to understand. Explore the use of conditional logistic regression to increase the prediction accuracy. We analyzed an HCUP statewide inpatient discharge record dataset, which includes patient demographics, clinical and care utilization data from California. We extracted records of heart failure Medicare beneficiaries who had inpatient experience during an 11-month period. We corrected the data imbalance issue with under-sampling. In our study, we first applied standard logistic regression and decision tree to obtain influential variables and derive practically meaningful decision rules. We then stratified the original data set accordingly and applied logistic regression on each data stratum. We further explored the effect of interacting variables in the logistic regression modeling. We conducted cross validation to assess the overall prediction performance of conditional logistic regression (CLR) and compared it with standard classification models. The developed CLR models outperformed several standard classification models (e.g., straightforward logistic regression, stepwise logistic regression, random forest, support vector machine). For example, the best CLR model improved the classification accuracy by nearly 20% over the straightforward logistic regression model.
Furthermore, the developed CLR models tend to achieve better sensitivity of more than 10% over the standard classification models, which can be translated to correct labeling of an additional 400 - 500 readmissions for heart failure patients in the state of California over a year. Lastly, several key predictors identified from the HCUP data include the disposition location from discharge, the number of chronic conditions, and the number of acute procedures. It would be beneficial to apply simple decision rules obtained from the decision tree in an ad-hoc manner to guide the cohort stratification. It could be potentially beneficial to explore the effect of pairwise interactions between influential predictors when building the logistic regression models for different data strata. Judicious use of the ad-hoc CLR models developed offers insights into future development of prediction models for hospital readmissions, which can lead to better intuition in identifying high-risk patients and developing effective post-discharge care strategies. Lastly, this paper is expected to raise the awareness of collecting data on additional markers and developing necessary database infrastructure for larger-scale exploratory studies on readmission risk prediction.
Artes, Paul H; Crabb, David P
2010-01-01
To investigate why the specificity of the Moorfields Regression Analysis (MRA) of the Heidelberg Retina Tomograph (HRT) varies with disc size, and to derive accurate normative limits for neuroretinal rim area to address this problem. Two datasets from healthy subjects (Manchester, UK, n = 88; Halifax, Nova Scotia, Canada, n = 75) were used to investigate the physiological relationship between the optic disc and neuroretinal rim area. Normative limits for rim area were derived by quantile regression (QR) and compared with those of the MRA (derived by linear regression). Logistic regression analyses were performed to quantify the association between disc size and positive classifications with the MRA, as well as with the QR-derived normative limits. In both datasets, the specificity of the MRA depended on optic disc size. The odds of observing a borderline or outside-normal-limits classification increased by approximately 10% for each 0.1 mm² increase in disc area (P < 0.1). The lower specificity of the MRA with large optic discs could be explained by the failure of linear regression to model the extremes of the rim area distribution (observations far from the mean). In comparison, the normative limits predicted by QR were larger for smaller discs (less specific, more sensitive), and smaller for larger discs, such that false-positive rates became independent of optic disc size. Normative limits derived by quantile regression appear to remove the size-dependence of specificity with the MRA. Because quantile regression does not rely on the restrictive assumptions of standard linear regression, it may be a more appropriate method for establishing normative limits in other clinical applications where the underlying distributions are nonnormal or have nonconstant variance.
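An odds ratio reported per 0.1 mm² of disc area can be re-expressed for other increments, because logistic regression is linear on the log-odds scale. The following is a small worked sketch; the helper name `rescale_odds_ratio` is hypothetical, and 1.10 is the approximate per-step odds ratio implied by the "approximately 10%" figure above.

```python
import math

def rescale_odds_ratio(odds_ratio, from_step, to_step):
    """Convert an odds ratio for one predictor increment to another.

    The log-odds coefficient per unit is log(OR) / from_step; the odds
    ratio for a different increment is exp(coefficient * to_step).
    """
    beta_per_unit = math.log(odds_ratio) / from_step
    return math.exp(beta_per_unit * to_step)

# Odds ratio of about 1.10 per 0.1 mm^2, re-expressed per 0.5 mm^2.
or_half_mm2 = rescale_odds_ratio(1.10, from_step=0.1, to_step=0.5)
print(f"odds ratio per 0.5 mm^2: {or_half_mm2:.2f}")
```

The effect compounds multiplicatively, so five steps of 10% give roughly a 61% increase in odds, not 50%.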
Impact of multicollinearity on small sample hydrologic regression models
NASA Astrophysics Data System (ADS)
Kroll, Charles N.; Song, Peter
2013-06-01
Often hydrologic regression models are developed with ordinary least squares (OLS) procedures. The use of OLS with highly correlated explanatory variables produces multicollinearity, which creates highly sensitive parameter estimators with inflated variances and improper model selection. It is not clear how to best address multicollinearity in hydrologic regression models. Here a Monte Carlo simulation is developed to compare four techniques to address multicollinearity: OLS, OLS with variance inflation factor screening (VIF), principal component regression (PCR), and partial least squares regression (PLS). The performance of these four techniques was observed for varying sample sizes, correlation coefficients between the explanatory variables, and model error variances consistent with hydrologic regional regression models. The negative effects of multicollinearity are magnified at smaller sample sizes, higher correlations between the variables, and larger model error variances (smaller R²). The Monte Carlo simulation indicates that if the true model is known, multicollinearity is present, and the estimation and statistical testing of regression parameters are of interest, then PCR or PLS should be employed. If the model is unknown, or if the interest is solely on model predictions, it is recommended that OLS be employed since using more complicated techniques did not produce any improvement in model performance. A leave-one-out cross-validation case study was also performed using low-streamflow data sets from the eastern United States. Results indicate that OLS with stepwise selection generally produces models across study regions with varying levels of multicollinearity that are as good as biased regression techniques such as PCR and PLS.
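Variance inflation factor screening can be illustrated for the simplest case of two explanatory variables, where the VIF of either variable reduces to 1 / (1 - r²), with r their Pearson correlation. This is a minimal sketch rather than the study's procedure; the function names and the basin data are hypothetical.

```python
import math

def pearson_r(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def vif_two_predictors(x1, x2):
    """VIF shared by both predictors in a two-predictor regression."""
    r = pearson_r(x1, x2)
    return 1.0 / (1.0 - r * r)

# Hypothetical basin characteristics that are nearly collinear.
drainage_area = [12.0, 25.0, 31.0, 47.0, 58.0, 66.0]
stream_length = [5.1, 9.8, 13.2, 18.9, 22.5, 27.0]
vif = vif_two_predictors(drainage_area, stream_length)
print(f"VIF = {vif:.1f}")
```

A common rule of thumb flags predictors with a VIF above 5 or 10, although the threshold is a convention and not something stated in the abstract.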
Interpretation of commonly used statistical regression models.
Kasza, Jessica; Wolfe, Rory
2014-01-01
A review of some regression models commonly used in respiratory health applications is provided in this article. Simple linear regression, multiple linear regression, logistic regression and ordinal logistic regression are considered. The focus of this article is on the interpretation of the regression coefficients of each model, which are illustrated through the application of these models to a respiratory health research study. © 2013 The Authors. Respirology © 2013 Asian Pacific Society of Respirology.
Using Robust Variance Estimation to Combine Multiple Regression Estimates with Meta-Analysis
ERIC Educational Resources Information Center
Williams, Ryan
2013-01-01
The purpose of this study was to explore the use of robust variance estimation for combining commonly specified multiple regression models and for combining sample-dependent focal slope estimates from diversely specified models. The proposed estimator obviates traditionally required information about the covariance structure of the dependent…
Choi, Seung Hoan; Labadorf, Adam T; Myers, Richard H; Lunetta, Kathryn L; Dupuis, Josée; DeStefano, Anita L
2017-02-06
Next generation sequencing provides a count of RNA molecules in the form of short reads, yielding discrete, often highly non-normally distributed gene expression measurements. Although Negative Binomial (NB) regression has been generally accepted in the analysis of RNA sequencing (RNA-Seq) data, its appropriateness has not been exhaustively evaluated. We explore logistic regression as an alternative method for RNA-Seq studies designed to compare cases and controls, where disease status is modeled as a function of RNA-Seq reads using simulated and Huntington disease data. We evaluate the effect of adjusting for covariates that have an unknown relationship with gene expression. Finally, we incorporate the data adaptive method in order to compare false positive rates. When the sample size is small or the expression levels of a gene are highly dispersed, the NB regression shows inflated Type-I error rates but the Classical logistic and Bayes logistic (BL) regressions are conservative. Firth's logistic (FL) regression performs well or is slightly conservative. Large sample size and low dispersion generally make Type-I error rates of all methods close to nominal alpha levels of 0.05 and 0.01. However, Type-I error rates are controlled after applying the data adaptive method. The NB, BL, and FL regressions gain increased power with large sample size, large log2 fold-change, and low dispersion. The FL regression has comparable power to NB regression. We conclude that implementing the data adaptive method appropriately controls Type-I error rates in RNA-Seq analysis. Firth's logistic regression provides a concise statistical inference process and reduces spurious associations from inaccurately estimated dispersion parameters in the negative binomial framework.
Kim, Tae Kyung; Lee, H-C; Lee, S G; Han, K-T; Park, E-C
2017-01-01
Introduction Reports of sexual harassment are becoming more frequent in Republic of Korea (ROK) Armed Forces. This study aimed to analyse the impact of sexual harassment on mental health among female military personnel of the ROK Armed Forces. Methods Data from the 2014 Military Health Survey were used. Instances of sexual harassment were recorded as ‘yes’ or ‘no’. Analysis of variance (ANOVA) was carried out to compare Kessler Psychological Distress Scale 10 (K-10) scores. Multiple logistic regression analysis was performed to identify associations between sexual harassment and K-10 scores. Results Among 228 female military personnel, 13 (5.7%) individuals experienced sexual harassment. Multiple logistic regression analysis revealed that sexual harassment had a significantly negative impact on K-10 scores (3.486, p<0.04). Higher K-10 scores among individuals experiencing sexual harassment were identified in the unmarried (including never-married) group (6.761, p<0.04), the short-term military service group (12.014, p<0.03) and the group whose length of service was <2 years (11.067, p<0.02). Conclusions Sexual harassment has a negative impact on mental health. Factors associated with worse mental health scores included service classification and length of service. The results provide helpful information with which to develop measures for minimising the negative psychological effects from sexual harassment and promoting sexual harassment prevention policy. PMID:27084842
What kind of sexual dysfunction is most common among overweight and obese women in reproductive age?
Rabiepoor, S; Khalkhali, H R; Sadeghi, E
2017-03-01
The aim of this study was to investigate the association between body mass index (BMI) and sexual health and determine what kind of sexual dysfunction is most common among overweight and obese women in reproductive age from Iran. A cross-sectional descriptive design was adopted. The data of 198 women who were referred to health centers during 2014-2015 in Iran were collected through convenience sampling. Data were collected using a demographic questionnaire and female sexual function and sexual satisfaction indexes. Participants' heights and weights were recorded in centimeters and kilograms. Data were analyzed applying descriptive statistics, one-way analysis of variance, logistic regression analysis and the χ² test. P-values<0.05 were considered significant. The mean age of women was 29.89±7.01 and ages ranged from 17 to 45 years. 85.9% of the participants had sexual dysfunction, and 69.7% had dissatisfaction and low satisfaction. According to our evaluations, orgasm dysfunction had the highest frequency; on the other hand, desire dysfunction and pain dysfunction had the lowest frequency among overweight and obese women, respectively. Using logistic regression analysis, we have shown that BMI affected sexual satisfaction, but BMI was not significantly associated with sexual function. This article concludes that all women, especially women with overweight and obesity, should be counseled about health outcomes related to sexual activity.
Zeng, Zhi; Lu, Liming; Rao, Zhanhong; Han, Lu; Shi, Jingrong; Ling, Li
2014-04-01
To investigate the current supply and use of personal protective equipment (PPE) among rural-to-urban migrant workers in small and medium enterprises (SMEs) in Zhongshan and Shenzhen, China and the influential factors for the use of PPE, and to provide a basis for better occupational health services and ensuring the health of migrant workers. Multi-stage sampling was used to select 856 migrant workers from 27 SMEs in Zhongshan and Shenzhen, and a face-to-face questionnaire survey was conducted among these subjects. Statistical analysis was performed by one-way analysis of variance, chi-square test, and logistic regression. Of all migrant workers, 38.67% were supplied with free PPE by the factory, and this rate varied across industries (furniture industry: 45.81%; electronic industry: 31.46%) and SMEs (medium enterprises: 42.13%; small enterprises: 39.20%; micro enterprises: 22.16%); 22.43% insisted on the use of PPE. The logistic regression analysis showed that factors associated with the use of PPE included sex, age, awareness of occupational health knowledge, and the size of enterprise. The rates of supply and use of PPE among migrant workers are low. The larger the enterprise, the better the supply of PPE. Male gender, older age, and a high occupational health knowledge score were favorable factors for the use of PPE, while small enterprise size was an unfavorable factor for the use of PPE.
Rappole, Catherine; Grier, Tyson; Anderson, Morgan K; Hauschild, Veronique; Jones, Bruce H
2017-11-01
To investigate the effects of age, aerobic fitness, and body mass index (BMI) on injury risk in operational Army soldiers. Retrospective cohort study. Male soldiers from an operational Army brigade were administered electronic surveys regarding personal characteristics, physical fitness, and injuries occurring over the last 12 months. Injury risks were stratified by age, 2-mile run time, and BMI. Analyses included descriptive incidence, a Mantel-Haenszel χ² test to determine trends, a multivariable logistic regression to determine factors associated with injury, and a one-way analysis of variance (ANOVA). Forty-seven percent of 1099 respondents reported at least one injury. A linear trend showed that as age, 2-mile run time, and BMI increased, so did injury risk (p<0.01). When controlling for BMI, the most significant independent injury risk factors were older age (odds ratio (OR) 30years-35years/≤24years=1.25, 95%CI: 1.08-2.32), (OR≥36years/≤24years=2.05, 95%CI: 1.36-3.10), and slow run times (OR≥15.9min/≤13.9min=1.91, 95%CI: 1.28-2.85). An ANOVA showed that both run times and BMI increased with age. The stratified analysis and the multivariable logistic regression suggested that older age and poor aerobic fitness are stronger predictors of injury than BMI. Copyright © 2017 Sports Medicine Australia. All rights reserved.
Tang, Catherine So-kum
2006-08-01
This study aimed to examine rates and associated factors of parent-to-child corporal punishment and physical maltreatment in Hong Kong Chinese families. Cross-sectional and randomized household interviews were conducted with 1,662 Chinese parents to collect information on demographic characteristics of parents and children, marital satisfaction, perceived social support, evaluation of child problem behaviors, and reactions to conflicts with children. Descriptive statistics, analyses of variance, and logistic regression analyses were conducted. The rates of parent-to-child physical aggression were 57.5% for corporal punishment and 4.5% for physical maltreatment. Mothers as compared to fathers reported higher rates and more frequent use of corporal punishment on their children, but this parental gender effect was insignificant among older parents and those with adolescent children. Boys as compared to girls were more likely to experience higher rates and more frequent parental corporal punishment, especially in middle childhood at ages 5-12. Furthermore, parents perpetrated more frequent physical maltreatment on younger as compared to older children. Results from logistic regression analyses indicated that significant correlates of parental corporal punishment were: children's young age, male gender, and externalizing behaviors as well as parents' young age, non-employment, and marital dissatisfaction. For parent-to-child physical maltreatment, significant correlates were externalizing behaviors of children and parental marital dissatisfaction. Hong Kong Chinese parents commonly used corporal punishment on their children, which was associated with characteristics of children, parents, and family.
Osteoporosis prediction from the mandible using cone-beam computed tomography
Al Haffar, Iyad; Khattab, Razan
2014-01-01
Purpose This study aimed to evaluate the use of dental cone-beam computed tomography (CBCT) in the diagnosis of osteoporosis among menopausal and postmenopausal women by using only a CBCT viewer program. Materials and Methods Thirty-eight menopausal and postmenopausal women who underwent dual-energy X-ray absorptiometry (DXA) examination for hip and lumbar vertebrae were scanned using CBCT (field of view: 13 cm×15 cm; voxel size: 0.25 mm). Slices from the body of the mandible as well as the ramus were selected and some CBCT-derived variables, such as radiographic density (RD), were calculated as gray values. Pearson's correlation, one-way analysis of variance (ANOVA), and accuracy (sensitivity and specificity) evaluation based on linear and logistic regression were performed to choose the variable that best correlated with the lumbar and femoral neck T-scores. Results RD of the whole bone area of the mandible was the variable that best correlated with and predicted both the femoral neck and the lumbar vertebrae T-scores; further, Pearson's correlation coefficients were 0.5/0.6 (p value=0.037/0.009). The sensitivity, specificity, and accuracy based on the logistic regression were 50%, 88.9%, and 78.4%, respectively, for the femoral neck, and 46.2%, 91.3%, and 75%, respectively, for the lumbar vertebrae. Conclusion Lumbar vertebrae and femoral neck osteoporosis can be predicted with high accuracy from the RD value of the body of the mandible by using a CBCT viewer program. PMID:25473633
Factors Associated With Peer Victimization Among Adolescents in Taiwan.
Huang, Hui-Wen; Chen, Jyu-Lin; Wang, Ruey-Hsia
2018-02-01
Adolescents who have experienced peer victimization face a higher risk of negative health outcomes. However, little is known about the factors that are associated with peer victimization among adolescents in Taiwan. The aim of this study was to examine the factors related to peer victimization among Taiwanese adolescents. A cross-sectional design was employed. Three hundred seventy-seven adolescents aged 13-16 years from seven middle schools in southern Taiwan were recruited as participants. Validated, self-reported questionnaires were used to gather data on demographic characteristics, resilience, peer relationship, parental monitoring, school connectedness, social support, and peer victimization. Logistic regression analysis was used to examine the factors that were related to peer victimization. About 17% (n = 64) of the participants experienced peer victimization during the previous 1-year period. Logistic regression analysis indicated that parental monitoring of daily life, school connectedness, and peer support were significant predictors of a reduced risk of peer victimization. The final model explained 23.1% of the total variance in less peer victimization and predicted 80.1% of peer victimization. School connectedness and peer support were identified as important factors facilitating the avoidance of peer victimization among adolescents in Taiwan. Healthcare providers and school personnel should consider school-based programs to improve school connectedness and to build an atmosphere of peer support to reduce peer victimization. Educating parents to monitor their adolescents' daily activities is also encouraged in concert with these school-based programs.
Differentially private distributed logistic regression using private and public data
2014-01-01
Background Privacy protection is an important issue in medical informatics and differential privacy is a state-of-the-art framework for data privacy research. Differential privacy offers provable privacy against attackers who have auxiliary information, and can be applied to data mining models (for example, logistic regression). However, differentially private methods sometimes introduce too much noise and make outputs less useful. Given available public data in medical research (e.g. from patients who sign open-consent agreements), we can design algorithms that use both public and private data sets to decrease the amount of noise that is introduced. Methodology In this paper, we modify the update step in the Newton-Raphson method to propose a differentially private distributed logistic regression model based on both public and private data. Experiments and results We try our algorithm on three different data sets, and show its advantage over: (1) a logistic regression model based solely on public data, and (2) a differentially private distributed logistic regression model based on private data under various scenarios. Conclusion Logistic regression models built with our new algorithm based on both private and public datasets demonstrate better utility than models trained on private or public datasets alone, without sacrificing the rigorous privacy guarantee. PMID:25079786
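For orientation, the standard (non-private) Newton-Raphson update for logistic regression is sketched below for an intercept plus one feature. This is only the textbook step that the paper says it modifies; the differentially private modification itself is not described in the abstract and is not reproduced here. The function names and the toy data are hypothetical.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def newton_step(beta, xs, ys):
    """One Newton-Raphson update for logistic regression (intercept + slope).

    Accumulates the score vector g = X^T (y - p) and the information
    matrix H = X^T W X with W = diag(p * (1 - p)), then returns
    beta + H^{-1} g using the closed-form inverse of a 2x2 matrix.
    """
    b0, b1 = beta
    g0 = g1 = h00 = h01 = h11 = 0.0
    for x, y in zip(xs, ys):
        p = sigmoid(b0 + b1 * x)
        w = p * (1.0 - p)
        g0 += y - p
        g1 += (y - p) * x
        h00 += w
        h01 += w * x
        h11 += w * x * x
    det = h00 * h11 - h01 * h01
    d0 = (h11 * g0 - h01 * g1) / det
    d1 = (h00 * g1 - h01 * g0) / det
    return (b0 + d0, b1 + d1)

# Hypothetical, non-separable toy data.
xs = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0]
ys = [0, 0, 1, 0, 1, 1]

beta = (0.0, 0.0)
for _ in range(25):
    beta = newton_step(beta, xs, ys)
print(f"intercept = {beta[0]:.4f}, slope = {beta[1]:.4f}")
```

In a distributed setting, each site could compute its own contributions to the score and information sums and share only those aggregates, which is the point at which noise would typically be introduced; how the paper does this is not specified in the abstract.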
Park, Ji Hyun; Kim, Hyeon-Young; Lee, Hanna; Yun, Eun Kyoung
2015-12-01
This study compares the performance of the logistic regression and decision tree analysis methods for assessing the risk factors for infection in cancer patients undergoing chemotherapy. The subjects were 732 cancer patients who were receiving chemotherapy at K university hospital in Seoul, Korea. The data were collected between March 2011 and February 2013 and were processed for descriptive analysis, logistic regression and decision tree analysis using the IBM SPSS Statistics 19 and Modeler 15.1 programs. The most common risk factors for infection in cancer patients receiving chemotherapy were identified as alkylating agents, vinca alkaloid and underlying diabetes mellitus. The logistic regression achieved a sensitivity of 66.7% and a specificity of 88.9%. The decision tree analysis achieved a sensitivity of 55.0% and a specificity of 89.0%. The overall classification accuracy was 88.0% for the logistic regression and 87.2% for the decision tree analysis. The logistic regression analysis showed a higher degree of sensitivity and classification accuracy. Therefore, logistic regression analysis is concluded to be the more effective and useful method for establishing an infection prediction model for patients undergoing chemotherapy. Copyright © 2015 Elsevier Ltd. All rights reserved.
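Sensitivity, specificity and overall accuracy, the three measures used above to compare the two classifiers, all come from the same confusion matrix. The sketch below shows the arithmetic; the function name and the counts are hypothetical and are not the study's data.

```python
def classification_metrics(tp, fn, tn, fp):
    """Return (sensitivity, specificity, accuracy) from confusion-matrix counts.

    tp/fn count the truly infected patients that the model flags or misses;
    tn/fp count the non-infected patients it clears or wrongly flags.
    """
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    accuracy = (tp + tn) / (tp + fn + tn + fp)
    return sensitivity, specificity, accuracy

# Hypothetical counts for illustration only.
sens, spec, acc = classification_metrics(tp=20, fn=10, tn=80, fp=10)
print(f"sensitivity = {sens:.1%}, specificity = {spec:.1%}, accuracy = {acc:.1%}")
```

When infections are rare, accuracy is dominated by specificity, which is why two models can have similar accuracy while differing noticeably in sensitivity.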
Yang, Lixue; Chen, Kean
2015-11-01
To improve the design of underwater target recognition systems based on auditory perception, this study compared human listeners with automatic classifiers. Performance measures and strategies in three discrimination experiments, including discriminations between man-made and natural targets, between ships and submarines, and among three types of ships, were used. In the experiments, the subjects were asked to assign a score to each sound based on how confident they were about the category to which it belonged, and logistic regression, which represents linear discriminative models, also completed three similar tasks by utilizing many auditory features. The results indicated that the performance of logistic regression improved as the ratio between inter- and intra-class differences became larger, whereas the performance of the human subjects was limited by their unfamiliarity with the targets. Logistic regression performed better than the human subjects in all tasks but the discrimination between man-made and natural targets, and the strategies employed by excellent human subjects were similar to those of logistic regression. Logistic regression and several human subjects demonstrated similar performance when discriminating man-made and natural targets, but in this case, their strategies were not similar. An appropriate fusion of their strategies led to further improvement in recognition accuracy.
NASA Astrophysics Data System (ADS)
Mei, Zhixiong; Wu, Hao; Li, Shiyun
2018-06-01
The Conversion of Land Use and its Effects at Small regional extent (CLUE-S), which is a widely used model for land-use simulation, utilizes logistic regression to estimate the relationships between land use and its drivers, and thus, predict land-use change probabilities. However, logistic regression disregards possible spatial autocorrelation and self-organization in land-use data. Autologistic regression can depict spatial autocorrelation but cannot address self-organization, while logistic regression that considers only self-organization (NE-logistic regression) fails to capture spatial autocorrelation. Therefore, this study developed a regression (NE-autologistic regression) method, which incorporated both spatial autocorrelation and self-organization, to improve CLUE-S. The Zengcheng District of Guangzhou, China was selected as the study area. The land-use data of 2001, 2005, and 2009, as well as 10 typical driving factors, were used to validate the proposed regression method and the improved CLUE-S model. Then, three future land-use scenarios in 2020 (the natural growth scenario, ecological protection scenario, and economic development scenario) were simulated using the improved model. Validation results showed that NE-autologistic regression performed better than logistic regression, autologistic regression, and NE-logistic regression in predicting land-use change probabilities. The spatial allocation accuracy and kappa values of NE-autologistic-CLUE-S were higher than those of logistic-CLUE-S, autologistic-CLUE-S, and NE-logistic-CLUE-S for the simulations of two periods, 2001-2009 and 2005-2009, which proved that the improved CLUE-S model achieved the best simulation and was thereby effective to a certain extent. The scenario simulation results indicated that under all three scenarios, traffic land and residential/industrial land would increase, whereas arable land and unused land would decrease during 2009-2020.
Apparent differences also existed in the simulated change sizes and locations of each land-use type under different scenarios. The results not only demonstrate the validity of the improved model but also provide a valuable reference for relevant policy-makers.
Unitary Response Regression Models
ERIC Educational Resources Information Center
Lipovetsky, S.
2007-01-01
The dependent variable in a regular linear regression is a numerical variable, and in a logistic regression it is a binary or categorical variable. In these models the dependent variable has varying values. However, there are problems yielding an identity output of a constant value which can also be modelled in a linear or logistic regression with…
Binary logistic regression-Instrument for assessing museum indoor air impact on exhibits.
Bucur, Elena; Danet, Andrei Florin; Lehr, Carol Blaziu; Lehr, Elena; Nita-Lazar, Mihai
2017-04-01
This paper presents a new way to assess the environmental impact on historical artifacts using binary logistic regression. The prediction of the impact on the exhibits during certain pollution scenarios (environmental impact) was calculated by a mathematical model based on the binary logistic regression; it allows the identification of those environmental parameters from a multitude of possible parameters with a significant impact on exhibitions and ranks them according to the severity of their effect. Air quality (NO2, SO2, O3 and PM2.5) and microclimate parameters (temperature, humidity) monitoring data from a case study conducted within exhibition and storage spaces of the Romanian National Aviation Museum Bucharest have been used for developing and validating the binary logistic regression method and the mathematical model. The logistic regression analysis was used on 794 data combinations (715 to develop the model and 79 to validate it) with the Statistical Package for Social Sciences (SPSS 20.0). The results from the binary logistic regression analysis demonstrated that from six parameters taken into consideration, four of them present a significant effect upon exhibits in the following order: O3 > PM2.5 > NO2 > humidity, followed at a significant distance by the effects of SO2 and temperature. The mathematical model, developed in this study, correctly predicted 95.1% of the cumulated effect of the environmental parameters upon the exhibits. Moreover, this model could also be used in the decisional process regarding the preventive preservation measures that should be implemented within the exhibition space. The paper presents a new way to assess the environmental impact on historical artifacts using binary logistic regression.
The mathematical model developed on the environmental parameters analyzed by the binary logistic regression method could be useful in a decision-making process establishing the best measures for pollution reduction and preventive preservation of exhibits.
Determining factors influencing survival of breast cancer by fuzzy logistic regression model.
Nikbakht, Roya; Bahrampour, Abbas
2017-01-01
The fuzzy logistic regression model can be used for determining influential factors of disease. This study explores the important predictive factors of survival in patients with breast cancer. We used breast cancer data collected by the cancer registry of Kerman University of Medical Sciences during the period of 2000-2007. Variables such as morphology, grade, age, and treatments (surgery, radiotherapy, and chemotherapy) were applied in the fuzzy logistic regression model. Performance of the model was determined in terms of mean degree of membership (MDM). The study results showed that almost 41% of patients were in the neoplasm and malignant group and more than two-thirds of them were still alive after 5-year follow-up. Based on the fuzzy logistic model, the most important factors influencing survival were chemotherapy, morphology, and radiotherapy, respectively. Furthermore, the MDM criterion shows that the fuzzy logistic regression has a good fit on the data (MDM = 0.86). The fuzzy logistic regression model showed that chemotherapy is more important than radiotherapy in survival of patients with breast cancer. In addition, another ability of this model is calculating possibilistic odds of survival in cancer patients. The results of this study can be applied in clinical research. Because few studies have applied fuzzy logistic models, we recommend using this model in various research areas.
Bignardi, A B; El Faro, L; Cardoso, V L; Machado, P F; Albuquerque, L G
2009-09-01
The objective of the present study was to estimate milk yield genetic parameters applying random regression models and parametric correlation functions combined with a variance function to model animal permanent environmental effects. A total of 152,145 test-day milk yields from 7,317 first lactations of Holstein cows belonging to herds located in the southeastern region of Brazil were analyzed. Test-day milk yields were divided into 44 weekly classes of days in milk. Contemporary groups were defined by herd-test-day comprising a total of 2,539 classes. The model included direct additive genetic, permanent environmental, and residual random effects. The following fixed effects were considered: contemporary group, age of cow at calving (linear and quadratic regressions), and the population average lactation curve modeled by fourth-order orthogonal Legendre polynomial. Additive genetic effects were modeled by random regression on orthogonal Legendre polynomials of days in milk, whereas permanent environmental effects were estimated using a stationary or nonstationary parametric correlation function combined with a variance function of different orders. The structure of residual variances was modeled using a step function containing 6 variance classes. The genetic parameter estimates obtained with the model using a stationary correlation function associated with a variance function to model permanent environmental effects were similar to those obtained with models employing orthogonal Legendre polynomials for the same effect. A model using a sixth-order polynomial for additive effects and a stationary parametric correlation function associated with a seventh-order variance function to model permanent environmental effects would be sufficient for data fitting.
Vavougios, George D; George D, George; Pastaka, Chaido; Zarogiannis, Sotirios G; Gourgoulianis, Konstantinos I
2016-02-01
Phenotyping obstructive sleep apnea syndrome's comorbidity has been attempted for the first time only recently. The aim of our study was to determine phenotypes of comorbidity in obstructive sleep apnea syndrome patients employing a data-driven approach. Data from 1472 consecutive patient records were recovered from our hospital's database. Categorical principal component analysis and two-step clustering were employed to detect distinct clusters in the data. Univariate comparisons between clusters included one-way analysis of variance with Bonferroni correction and chi-square tests. Predictors of pairwise cluster membership were determined via a binary logistic regression model. The analyses revealed six distinct clusters: A, 'healthy, reporting sleeping related symptoms'; B, 'mild obstructive sleep apnea syndrome without significant comorbidities'; C1: 'moderate obstructive sleep apnea syndrome, obesity, without significant comorbidities'; C2: 'moderate obstructive sleep apnea syndrome with severe comorbidity, obesity and the exclusive inclusion of stroke'; D1: 'severe obstructive sleep apnea syndrome and obesity without comorbidity and a 33.8% prevalence of hypertension'; and D2: 'severe obstructive sleep apnea syndrome with severe comorbidities, along with the highest Epworth Sleepiness Scale score and highest body mass index'. Clusters differed significantly in apnea-hypopnea index, oxygen desaturation index; arousal index; age, body mass index, minimum oxygen saturation and daytime oxygen saturation (one-way analysis of variance P < 0.0001). Binary logistic regression indicated that older age, greater body mass index, lower daytime oxygen saturation and hypertension were associated independently with an increased risk of belonging in a comorbid cluster. Six distinct phenotypes of obstructive sleep apnea syndrome and its comorbidities were identified. 
Mapping the heterogeneity of the obstructive sleep apnea syndrome may help the early identification of at-risk groups. Finally, determining predictors of comorbidity for the moderate and severe strata of these phenotypes implies a need to take these factors into account when considering obstructive sleep apnea syndrome treatment options. © 2015 The Authors. Journal of Sleep Research published by John Wiley & Sons Ltd on behalf of European Sleep Research Society.
Lindström, Martin; Moghaddassi, Mahnaz; Merlo, Juan
2006-01-01
To investigate the influence of contextual and individual factors on self-reported psychological health. The 2000 public health survey in Scania is a cross-sectional postal questionnaire study with a 59% participation rate. A total of 13,715 persons aged 18-80 answered the questionnaire. A multilevel logistic regression model, with individuals at the first level and municipalities/city quarters at the second, was performed. The effect (intra-class correlation, cross-level modification, and odds ratios) of individual and municipality/city quarter factors on self-reported psychological health was analysed. The crude variance between municipalities/city quarters was small but significant. It was particularly affected and lowered by individual civil status, country of origin, economic stress, and social participation. The inclusion of all individual factors age, sex, civil status, country of origin, education, economic stress, and social participation lowered the between municipality variance to not-significant levels, which is the reason why no contextual variables were included in the calculations. The results of this study suggest that poor self-reported psychological health is affected mainly by individual characteristics of the population and not by contextual factors at the municipality/city quarter level.
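The intra-class correlation mentioned above has no single definition for a binary outcome. The abstract does not state which one was used, so the sketch below assumes the common latent-variable formulation, in which the individual-level residual follows a standard logistic distribution with variance π²/3. The function name and the example variance are hypothetical.

```python
import math


def latent_icc(between_variance):
    """Latent-variable ICC for a two-level random-intercept logistic model.

    between_variance: estimated variance of the area-level random intercept
    (on the logit scale). The level-1 variance is fixed at pi^2 / 3, the
    variance of the standard logistic distribution.
    """
    level1_variance = math.pi ** 2 / 3
    return between_variance / (between_variance + level1_variance)


# Hypothetical between-area variance; not a figure from the survey.
print(round(latent_icc(0.05), 4))
```

Under this definition a small between-area variance gives an ICC close to zero, which is the pattern the abstract describes once individual factors are included.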
The importance of family support in pediatrics and its impact on healthcare satisfaction.
Sigurdardottir, Anna Olafia; Garwick, Ann W; Svavarsdottir, Erla Kolbrun
2017-06-01
To evaluate predictors of healthcare satisfaction for parents whose children received hospital-based healthcare services at the Children's hospital at Landspitali University Hospital. In this cross-sectional study, data on perceived family support, family quality of life, expressive family functioning, coping strategies and healthcare satisfaction were collected from 159 mothers and 60 fathers (N = 177 families) of children and adolescents from 2011 to 2012. Logistic regression analysis revealed that, for mothers, 38.8% of the variance in satisfaction with healthcare services was predicted by perceived family support and their coping strategies, while for fathers, 59.9% of the variance of their satisfaction with healthcare service was predicted by perceived family support, family quality of life and whether the child had been hospitalised before. Perceived family support was the one factor that was found to predict both the mothers' and the fathers' satisfaction with healthcare services. Knowing which factors predict satisfaction with health care among parents of hospitalised children with different chronic illnesses and health issues can inform the delivery of effective family-focused interventions and evidence-based practice to families. © 2016 Nordic College of Caring Science.
Eynon, Michael John; O'Donnell, Christopher; Williams, Lynn
2017-10-01
Given the mixed findings concerning self-determination theory in explaining adherence to exercise referral schemes (ERS), the present study attempted to examine whether autonomous motivation and psychological need satisfaction could predict ERS adherence. Participants referred to an 8-week ERS completed self-report measures grounded in self-determination theory and basic needs theory at baseline (N = 124), mid-scheme (N = 58), and at the end of the scheme (N = 40). Logistic regressions were used to analyse the data. Autonomous motivation measured at mid-scheme explained between 12 and 16% of the variance in ERS adherence. Autonomy, relatedness and competence measured at mid-scheme explained between 18 and 26% of the variance in ERS adherence. This model also explained between 18 and 25% when measured at the end of the scheme. The study found limited evidence for the role of autonomous motivation in explaining ERS adherence. Stronger support was found for the satisfaction of the three needs for autonomy, relatedness and competence in predicting ERS adherence. Future research should tap into the satisfaction of all three needs collectively to help foster ERS adherence.
Influence diagnostics in meta-regression model.
Shi, Lei; Zuo, ShanShan; Yu, Dalei; Zhou, Xiaohua
2017-09-01
This paper studies influence diagnostics in the meta-regression model, including case deletion diagnostics and local influence analysis. We derive the subset deletion formulae for the estimation of the regression coefficient and heterogeneity variance and obtain the corresponding influence measures. The DerSimonian and Laird estimation and maximum likelihood estimation methods in meta-regression are considered, respectively, to derive the results. Internal and external residual and leverage measures are defined. The local influence analyses based on the case-weights perturbation scheme, responses perturbation scheme, covariate perturbation scheme, and within-variance perturbation scheme are explored. We introduce a method that simultaneously perturbs responses, covariates, and within-variance to obtain the local influence measure, which has the advantage of being able to compare the influence magnitude of influential studies from different perturbations. An example is used to illustrate the proposed methodology. Copyright © 2017 John Wiley & Sons, Ltd.
Chow, Esther O W; Ho, Henry C Y
2012-01-01
The rapidly ageing population in Hong Kong has led to a major concern in providing care for the elderly. Due to the current social changes in Hong Kong, such as smaller family size, longer life spans, and increasing employment demands, spouses increasingly serve as the primary caregivers for older adults. To explore the mental health of older spousal caregivers, this study investigated the relationships between psychological resources, social resources, and depression. One hundred fifty-eight spousal caregivers aged 55 and above were recruited from 13 caregiver resource centres in Hong Kong. Data were collected using structured questionnaires. Hierarchical regression analysis revealed that the number of duties and psychological resources including purpose in life, caregiver burden, and personal wellbeing explained 56% of the variance in depression. Logistic regression analysis further indicated that purpose in life predicted the likelihood of depression reported by caregivers. Social resources did not significantly predict depression. Results suggest that mental health enhancement programs should be developed for Chinese spousal caregivers with a focus on purpose in life, burden, and personal wellbeing.
Sensitivity and specificity of memory and naming tests for identifying left temporal-lobe epilepsy.
Umfleet, Laura Glass; Janecek, Julie K; Quasney, Erin; Sabsevitz, David S; Ryan, Joseph J; Binder, Jeffrey R; Swanson, Sara J
2015-01-01
The sensitivity and specificity of the Selective Reminding Test (SRT) Delayed Recall, Wechsler Memory Scale (WMS) Logical Memory, the Boston Naming Test (BNT), and two nonverbal memory measures for detecting lateralized dysfunction in association with side of seizure focus was examined in a sample of 143 patients with left or right temporal-lobe epilepsy (TLE). Scores on the SRT and BNT were statistically significantly lower in the left TLE group compared with the right TLE group, whereas no group differences emerged on the Logical Memory subtest. No significant group differences were found with nonverbal memory measures. When the SRT and BNT were both entered as predictors in a logistic regression, the BNT, although significant, added minimal value to the model beyond the variance accounted for by the SRT Delayed Recall. Both variables emerged as significant predictors of side of seizure focus when entered into separate regressions. Sensitivity and specificity of the SRT and BNT ranged from 56% to 65%. The WMS Logical Memory and nonverbal memory measures were not significant predictors of the side of seizure focus.
Association between Personality Traits and Sleep Quality in Young Korean Women
Kim, Han-Na; Cho, Juhee; Chang, Yoosoo; Ryu, Seungho
2015-01-01
Personality is a trait that affects behavior and lifestyle, and sleep quality is an important component of a healthy life. We analyzed the association between personality traits and sleep quality in a cross-section of 1,406 young women (from 18 to 40 years of age) who were not reporting clinically meaningful depression symptoms. Surveys were carried out from December 2011 to February 2012, using the Revised NEO Personality Inventory and the Pittsburgh Sleep Quality Index (PSQI). All analyses were adjusted for demographic and behavioral variables. We considered beta weights, structure coefficients, unique effects, and common effects when evaluating the importance of sleep quality predictors in multiple linear regression models. Neuroticism was the most important contributor to PSQI global scores in the multiple regression models. By contrast, despite being strongly correlated with sleep quality, conscientiousness had a near-zero beta weight in linear regression models, because most variance was shared with other personality traits. However, conscientiousness was the most noteworthy predictor of poor sleep quality status (PSQI≥6) in logistic regression models and individuals high in conscientiousness were least likely to have poor sleep quality, which is consistent with an OR of 0.813, with conscientiousness being protective against poor sleep quality. Personality may be a factor in poor sleep quality and should be considered in sleep interventions targeting young women. PMID:26030141
Espelt, Albert; Marí-Dell'Olmo, Marc; Penelo, Eva; Bosque-Prous, Marina
2016-06-14
To examine the differences between Prevalence Ratio (PR) and Odds Ratio (OR) in a cross-sectional study and to provide tools to calculate PR using two statistical packages widely used in substance use research (Stata and R). We used cross-sectional data from 41,263 participants of 16 European countries participating in the Survey on Health, Ageing and Retirement in Europe (SHARE). The dependent variable, hazardous drinking, was calculated using the Alcohol Use Disorders Identification Test - Consumption (AUDIT-C). The main independent variable was gender. Other variables used were: age, educational level and country of residence. PR of hazardous drinking in men relative to women was estimated using the Mantel-Haenszel method, log-binomial regression models and Poisson regression models with robust variance. These estimations were compared to the OR calculated using logistic regression models. Prevalence of hazardous drinkers varied among countries. Generally, men had a higher prevalence of hazardous drinking than women [PR=1.43 (1.38-1.47)]. The estimated PR was identical regardless of the method and the statistical package used. However, OR overestimated PR, depending on the prevalence of hazardous drinking in the country. In cross-sectional studies, where comparisons between countries with differences in the prevalence of the disease or condition are made, it is advisable to use PR instead of OR.
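The dependence of the OR on prevalence can be seen directly from a 2x2 table. The sketch below computes crude (unadjusted) PR and OR for two hypothetical tables that share the same PR but differ in prevalence; the counts and function names are invented for illustration and are not SHARE data.

```python
def prevalence_ratio(a, b, c, d):
    """Crude PR from a 2x2 table.

    a, b: exposed group with / without the condition
    c, d: unexposed group with / without the condition
    """
    return (a / (a + b)) / (c / (c + d))


def odds_ratio(a, b, c, d):
    """Crude OR from the same 2x2 table (cross-product ratio)."""
    return (a * d) / (b * c)


# Hypothetical tables: prevalence 2% vs 1% (rare) and 60% vs 30% (common).
rare = (20, 980, 10, 990)
common = (600, 400, 300, 700)

for name, table in (("rare", rare), ("common", common)):
    print(name, round(prevalence_ratio(*table), 3), round(odds_ratio(*table), 3))
```

In both tables the PR is 2, but the OR is close to 2 only when the condition is rare; with a common condition it rises to 3.5, which is the overestimation the abstract warns about.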
Zhao, Lei; Li, Weizheng; Su, Zhihong; Liu, Yong; Zhu, Liyong; Zhu, Shaihong
2018-05-29
This study investigated the role of preoperative fasting C-peptide (FCP) levels in predicting diabetic outcomes in low-BMI Chinese patients following Roux-en-Y gastric bypass (RYGB) by comparing the metabolic outcomes of patients with FCP > 1 ng/ml versus FCP ≤ 1 ng/ml. The study sample included 78 type 2 diabetes mellitus patients with an average BMI < 30 kg/m^2 at baseline. Patients' parameters were analyzed before and after surgery, with a 2-year follow-up. A univariate logistic regression analysis and multivariate analysis of variance between the remission and improvement groups were performed to determine factors that were associated with type 2 diabetes remission after RYGB. Linear correlation analyses between FCP and metabolic parameters were performed. Patients were divided into two groups, FCP > 1 ng/ml and FCP ≤ 1 ng/ml, and measured parameters were compared between the groups. Patients' fasting plasma glucose, 2-h postprandial plasma glucose, FCP, and HbA1c improved significantly after surgery (p < 0.05). Factors associated with type 2 diabetes remission in the univariate logistic regression analysis were BMI, 2hINS, and FCP (p < 0.05). A multivariate logistic regression analysis then showed that the outcome was most strongly related to FCP (OR = 2.39). FCP showed a significant linear correlation with fasting insulin and BMI (p < 0.05). There was a significant difference in remission rate between the FCP > 1 ng/ml and FCP ≤ 1 ng/ml groups (p = 0.01). The parameters of patients with FCP > 1 ng/ml, including BMI, plasma glucose, HbA1c, and plasma insulin, decreased markedly after surgery (p < 0.05). FCP level is a significant predictor of diabetes outcomes after RYGB in low-BMI Chinese patients. An FCP level of 1 ng/ml may be a useful threshold for predicting surgical prognosis, with FCP > 1 ng/ml predicting better clinical outcomes following RYGB.
Simulation of Autonomic Logistics System (ALS) Sortie Generation
2003-03-01
84 Appendix B. ANOVA Assumptions Mission Capable Rate ANOVA Assumptions Constant Variance SSR # X cols SSE n Breusch-Pagan Chi-square 3.57E...85 Flying Scheduling Effectiveness ANOVA Assumptions Constant Variance SSR # X cols SSE n Breusch-Pagan Chi-square 2.12E-10 3 0.000816 270...Constant Variance SSR # X cols SSE n Breusch-Pagan Chi-square 1.86E-09 3 0.003758 270 3.20308814 0.9556957 Independence Durbin-Watson
Predicting Cost and Schedule Growth for Military and Civil Space Systems
2008-03-01
the Shapiro-Wilk Test, and testing the residuals for constant variance using the Breusch-Pagan test. For logistic models, diagnostics include...the Breusch-Pagan Test. With this test, a p-value below 0.05 rejects the null hypothesis that the residuals have constant variance. Thus, similar...to the Shapiro-Wilk Test, because the optimal model will have constant variance of its residuals, this requires Breusch-Pagan p-values over 0.05
Mixed conditional logistic regression for habitat selection studies.
Duchesne, Thierry; Fortin, Daniel; Courbin, Nicolas
2010-05-01
1. Resource selection functions (RSFs) are becoming a dominant tool in habitat selection studies. RSF coefficients can be estimated with unconditional (standard) and conditional logistic regressions. While the advantage of mixed-effects models is recognized for standard logistic regression, mixed conditional logistic regression remains largely overlooked in ecological studies. 2. We demonstrate the significance of mixed conditional logistic regression for habitat selection studies. First, we use spatially explicit models to illustrate how mixed-effects RSFs can be useful in the presence of inter-individual heterogeneity in selection and when the assumption of independence from irrelevant alternatives (IIA) is violated. The IIA hypothesis states that the strength of preference for habitat type A over habitat type B does not depend on the other habitat types also available. Secondly, we demonstrate the significance of mixed-effects models to evaluate habitat selection of free-ranging bison Bison bison. 3. When movement rules were homogeneous among individuals and the IIA assumption was respected, fixed-effects RSFs adequately described habitat selection by simulated animals. In situations violating the inter-individual homogeneity and IIA assumptions, however, RSFs were best estimated with mixed-effects regressions, and fixed-effects models could even provide faulty conclusions. 4. Mixed-effects models indicate that bison did not select farmlands, but exhibited strong inter-individual variations in their response to farmlands. Less than half of the bison preferred farmlands over forests. Conversely, the fixed-effect model simply suggested an overall selection for farmlands. 5. Conditional logistic regression is recognized as a powerful approach to evaluate habitat selection when resource availability changes. This regression is increasingly used in ecological studies, but almost exclusively in the context of fixed-effects models. 
Fitness maximization can imply differences in trade-offs among individuals, which can yield inter-individual differences in selection and lead to departure from IIA. These situations are best modelled with mixed-effects models. Mixed-effects conditional logistic regression should become a valuable tool for ecological research.
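The IIA property described in this abstract follows from the form of the fixed-effects conditional logistic model, in which the probability of selecting an alternative is its exponentiated score divided by the sum over all alternatives available in the choice set. The sketch below illustrates this with hypothetical habitat scores; it does not reproduce the authors' models, and the names are invented.

```python
import math


def choice_probabilities(scores):
    """Conditional-logit selection probabilities within one choice set.

    scores: dict mapping each available alternative to its linear
    predictor (covariates times coefficients). Values are hypothetical.
    """
    top = max(scores.values())  # subtract the maximum for numerical stability
    weights = {name: math.exp(value - top) for name, value in scores.items()}
    total = sum(weights.values())
    return {name: weight / total for name, weight in weights.items()}


# Two habitat types available, then a third one added.
two = choice_probabilities({"A": 1.0, "B": 0.0})
three = choice_probabilities({"A": 1.0, "B": 0.0, "C": 0.5})

# Under fixed coefficients the preference for A over B is unchanged by C.
print(two["A"] / two["B"], three["A"] / three["B"])
```

With fixed coefficients the A-to-B ratio is the same in both choice sets. Letting the coefficients vary among individuals, as in a mixed-effects model, removes this constraint at the population level.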
Advanced colorectal neoplasia risk stratification by penalized logistic regression.
Lin, Yunzhi; Yu, Menggang; Wang, Sijian; Chappell, Richard; Imperiale, Thomas F
2016-08-01
Colorectal cancer is the second leading cause of death from cancer in the United States. To facilitate the efficiency of colorectal cancer screening, there is a need to stratify risk for colorectal cancer among the 90% of US residents who are considered "average risk." In this article, we investigate such risk stratification rules for advanced colorectal neoplasia (colorectal cancer and advanced, precancerous polyps). We use a recently completed large cohort study of subjects who underwent a first screening colonoscopy. Logistic regression models have been used in the literature to estimate the risk of advanced colorectal neoplasia based on quantifiable risk factors. However, logistic regression may be prone to overfitting and instability in variable selection. Since most of the risk factors in our study have several categories, it was tempting to collapse these categories into fewer risk groups. We propose a penalized logistic regression method that automatically and simultaneously selects variables, groups categories, and estimates their coefficients by penalizing the [Formula: see text]-norm of both the coefficients and their differences. Hence, it encourages sparsity in the categories, i.e. grouping of the categories, and sparsity in the variables, i.e. variable selection. We apply the penalized logistic regression method to our data. The important variables are selected, with close categories simultaneously grouped, by penalized regression models with and without the interactions terms. The models are validated with 10-fold cross-validation. The receiver operating characteristic curves of the penalized regression models dominate the receiver operating characteristic curve of naive logistic regressions, indicating a superior discriminative performance. © The Author(s) 2013.
Rupert, Michael G.; Cannon, Susan H.; Gartner, Joseph E.
2003-01-01
Logistic regression was used to predict the probability of debris flows occurring in areas recently burned by wildland fires. Multiple logistic regression is conceptually similar to multiple linear regression because statistical relations between one dependent variable and several independent variables are evaluated. In logistic regression, however, the dependent variable is transformed to a binary variable (debris flow did or did not occur), and the actual probability of the debris flow occurring is statistically modeled. Data from 399 basins located within 15 wildland fires that burned during 2000-2002 in Colorado, Idaho, Montana, and New Mexico were evaluated. More than 35 independent variables describing the burn severity, geology, land surface gradient, rainfall, and soil properties were evaluated. The models were developed as follows: (1) Basins that did and did not produce debris flows were delineated from National Elevation Data using a Geographic Information System (GIS). (2) Data describing the burn severity, geology, land surface gradient, rainfall, and soil properties were determined for each basin. These data were then downloaded to a statistics software package for analysis using logistic regression. (3) Relations between the occurrence/non-occurrence of debris flows and burn severity, geology, land surface gradient, rainfall, and soil properties were evaluated and several preliminary multivariate logistic regression models were constructed. All possible combinations of independent variables were evaluated to determine which combination produced the most effective model. The multivariate model that best predicted the occurrence of debris flows was selected. (4) The multivariate logistic regression model was entered into a GIS, and a map showing the probability of debris flows was constructed. 
The most effective model incorporates the percentage of each basin with slope greater than 30 percent, percentage of land burned at medium and high burn severity in each basin, particle size sorting, average storm intensity (millimeters per hour), soil organic matter content, soil permeability, and soil drainage. The results of this study demonstrate that logistic regression is a valuable tool for predicting the probability of debris flows occurring in recently-burned landscapes.
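Once a multivariate logistic model has been fitted, the probability for a basin is obtained by passing the linear combination of its predictors through the logistic function, which is the calculation applied cell by cell when the model is entered into a GIS. The sketch below shows that step only; the intercept, coefficients, and predictor names are hypothetical placeholders, not the fitted values from the report.

```python
import math


def debris_flow_probability(intercept, coefficients, predictors):
    """Probability from a fitted logistic model: 1 / (1 + exp(-(b0 + sum(b_i * x_i))))."""
    linear = intercept + sum(
        coefficients[name] * predictors[name] for name in coefficients
    )
    return 1.0 / (1.0 + math.exp(-linear))


# Hypothetical coefficients and basin values, for illustration only.
coefficients = {"pct_slope_over_30": 0.03, "pct_burned_med_high": 0.02}
basin = {"pct_slope_over_30": 50.0, "pct_burned_med_high": 40.0}

print(round(debris_flow_probability(-2.0, coefficients, basin), 3))
```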
Key factors in children's competence to consent to clinical research.
Hein, Irma M; Troost, Pieter W; Lindeboom, Robert; Benninga, Marc A; Zwaan, C Michel; van Goudoever, Johannes B; Lindauer, Ramón J L
2015-10-24
Although law is established on a strong presumption that persons younger than a certain age are not competent to consent, statutory age limits for asking children's consent to clinical research differ widely internationally. From a clinical perspective, competence is assumed to involve many factors including the developmental stage, the influence of parents and peers, and life experience. We examined potential determining factors for children's competence to consent to clinical research and to what extent they explain the variation in competence judgments. From January 1, 2012 through January 1, 2014, pediatric patients aged 6 to 18 years, eligible for clinical research studies were enrolled prospectively at various in- and outpatient pediatric departments. Children's competence to consent was assessed by the MacArthur Competence Assessment Tool for Clinical Research. Potential determining child variables included age, gender, intelligence, disease experience, ethnicity and socio-economic status (SES). We used logistic regression analysis and change in explained variance in competence judgments to quantify the contribution of a child variable to the total explained variance. Contextual factors included risk and complexity of the decision to participate, parental competence judgment and the child's or parents' decision to participate. Out of 209 eligible patients, 161 were included (mean age, 10.6 years, 47.2% male). Age, SES, intelligence, ethnicity, complexity, parental competence judgment and trial participation were univariately associated with competence (P < 0.05). Total explained variance in competence judgments was 71.5%. Only age and intelligence significantly and independently explained the variance in competence judgments, explaining 56.6% and 12.7% of the total variance respectively. SES, male gender, disease experience and ethnicity each explained less than 1% of the variance in competence judgments.
Contextual factors together explained an extra 2.8% (P > 0.05). Age is the factor that explains most of the variance in children's competence to consent, followed by intelligence. Experience with disease did not affect competence in this study, nor did other variables. Development and use of a standardized instrument for assessing children's competence to consent in drug trials: Are legally established age limits valid?, NTR3918.
Ebrahimzadeh, Farzad; Hajizadeh, Ebrahim; Vahabi, Nasim; Almasian, Mohammad; Bakhteyar, Katayoon
2015-01-01
Background: Unwanted pregnancy not intended by at least one of the parents has undesirable consequences for the family and the society. In the present study, three classification models were used and compared to predict unwanted pregnancies in an urban population. Methods: In this cross-sectional study, 887 pregnant mothers referring to health centers in Khorramabad, Iran, in 2012 were selected by the stratified and cluster sampling; relevant variables were measured and for prediction of unwanted pregnancy, logistic regression, discriminant analysis, and probit regression models and SPSS software version 21 were used. To compare these models, indicators such as sensitivity, specificity, the area under the ROC curve, and the percentage of correct predictions were used. Results: The prevalence of unwanted pregnancies was 25.3%. The logistic and probit regression models indicated that parity and pregnancy spacing, contraceptive methods, household income and number of living male children were related to unwanted pregnancy. The performance of the models based on the area under the ROC curve was 0.735, 0.733, and 0.680 for logistic regression, probit regression, and linear discriminant analysis, respectively. Conclusion: Given the relatively high prevalence of unwanted pregnancies in Khorramabad, it seems necessary to revise family planning programs. Despite the similar accuracy of the models, if the researcher is interested in the interpretability of the results, the use of the logistic regression model is recommended. PMID:26793655
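The comparison criteria named in this abstract (sensitivity, specificity, percentage of correct predictions, area under the ROC curve) can be sketched in a few lines. This is a minimal illustration only: the four labels and scores are made-up toy values, not the study's data, and the function names are hypothetical.

```python
def confusion_counts(labels, scores, threshold=0.5):
    """Count (tp, fp, tn, fn), calling a case positive when score >= threshold."""
    tp = fp = tn = fn = 0
    for y, s in zip(labels, scores):
        predicted = 1 if s >= threshold else 0
        if y == 1 and predicted == 1:
            tp += 1
        elif y == 0 and predicted == 1:
            fp += 1
        elif y == 0 and predicted == 0:
            tn += 1
        else:
            fn += 1
    return tp, fp, tn, fn


def roc_auc(labels, scores):
    """AUC as the chance that a random positive outscores a random negative.

    Ties between a positive and a negative score count as half a win.
    """
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


# Toy data (assumed): 1 = outcome present, scores = predicted probabilities.
labels = [1, 1, 0, 0]
scores = [0.9, 0.4, 0.6, 0.1]

tp, fp, tn, fn = confusion_counts(labels, scores)
sensitivity = tp / (tp + fn)          # share of true positives detected
specificity = tn / (tn + fp)          # share of true negatives detected
accuracy = (tp + tn) / len(labels)    # percentage of correct predictions
auc = roc_auc(labels, scores)
```

Sensitivity, specificity and accuracy depend on the chosen threshold, whereas the AUC summarizes ranking quality over all thresholds, which is presumably why the abstract reports the AUC as the headline comparison.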
Kempe, P T; van Oppen, P; de Haan, E; Twisk, J W R; Sluis, A; Smit, J H; van Dyck, R; van Balkom, A J L M
2007-09-01
Two methods for predicting remissions in obsessive-compulsive disorder (OCD) treatment are evaluated. Y-BOCS measurements of 88 patients with a primary OCD (DSM-III-R) diagnosis were performed over a 16-week treatment period, and during three follow-ups. Remission at any measurement was defined as a Y-BOCS score lower than thirteen combined with a reduction of seven points when compared with baseline. Logistic regression models were compared with a Cox regression for recurrent events model. Logistic regression yielded different models at different evaluation times. The recurrent events model remained stable when fewer measurements were used. Higher baseline levels of neuroticism and more severe OCD symptoms were associated with a lower chance of remission, early age of onset and more depressive symptoms with a higher chance. Choice of outcome time affects logistic regression prediction models. Recurrent events analysis uses all information on remissions and relapses. Short- and long-term predictors for OCD remission show overlap.
Estimating the exceedance probability of rain rate by logistic regression
NASA Technical Reports Server (NTRS)
Chiu, Long S.; Kedem, Benjamin
1990-01-01
Recent studies have shown that the fraction of an area with rain intensity above a fixed threshold is highly correlated with the area-averaged rain rate. To estimate the fractional rainy area, a logistic regression model, which estimates the conditional probability that rain rate over an area exceeds a fixed threshold given the values of related covariates, is developed. The problem of dependency in the data in the estimation procedure is bypassed by the method of partial likelihood. Analyses of simulated scanning multichannel microwave radiometer and observed electrically scanning microwave radiometer data during the Global Atlantic Tropical Experiment period show that the use of logistic regression in pixel classification is superior to multiple regression in predicting whether rain rate at each pixel exceeds a given threshold, even in the presence of noisy data. The potential of the logistic regression technique in satellite rain rate estimation is discussed.
Kim, Minjung; Lamont, Andrea E.; Jaki, Thomas; Feaster, Daniel; Howe, George; Van Horn, M. Lee
2015-01-01
Regression mixture models are a novel approach for modeling heterogeneous effects of predictors on an outcome. In the model building process residual variances are often disregarded and simplifying assumptions made without thorough examination of the consequences. This simulation study investigated the impact of an equality constraint on the residual variances across latent classes. We examine the consequence of constraining the residual variances on class enumeration (finding the true number of latent classes) and parameter estimates under a number of different simulation conditions meant to reflect the type of heterogeneity likely to exist in applied analyses. Results showed that bias in class enumeration increased as the difference in residual variances between the classes increased. Also, an inappropriate equality constraint on the residual variances greatly impacted estimated class sizes and showed the potential to greatly impact parameter estimates in each class. Results suggest that it is important to make assumptions about residual variances with care and to carefully report what assumptions were made. PMID:26139512
NASA Astrophysics Data System (ADS)
Cary, Theodore W.; Cwanger, Alyssa; Venkatesh, Santosh S.; Conant, Emily F.; Sehgal, Chandra M.
2012-03-01
This study compares the performance of two proven but very different machine learners, Naïve Bayes and logistic regression, for differentiating malignant and benign breast masses using ultrasound imaging. Ultrasound images of 266 masses were analyzed quantitatively for shape, echogenicity, margin characteristics, and texture features. These features along with patient age, race, and mammographic BI-RADS category were used to train Naïve Bayes and logistic regression classifiers to diagnose lesions as malignant or benign. ROC analysis was performed using all of the features and using only a subset that maximized information gain. Performance was determined by the area under the ROC curve, Az, obtained from leave-one-out cross validation. Naïve Bayes showed significant variation (Az 0.733 +/- 0.035 to 0.840 +/- 0.029, P < 0.002) with the choice of features, but the performance of logistic regression was relatively unchanged under feature selection (Az 0.839 +/- 0.029 to 0.859 +/- 0.028, P = 0.605). Out of 34 features, a subset of 6 gave the highest information gain: brightness difference, margin sharpness, depth-to-width, mammographic BI-RADs, age, and race. The probabilities of malignancy determined by Naïve Bayes and logistic regression after feature selection showed significant correlation (R2= 0.87, P < 0.0001). The diagnostic performance of Naïve Bayes and logistic regression can be comparable, but logistic regression is more robust. Since probability of malignancy cannot be measured directly, high correlation between the probabilities derived from two basic but dissimilar models increases confidence in the predictive power of machine learning models for characterizing solid breast masses on ultrasound.
Wang, Qingliang; Li, Xiaojie; Hu, Kunpeng; Zhao, Kun; Yang, Peisheng; Liu, Bo
2015-05-12
To explore the risk factors of portal hypertensive gastropathy (PHG) in patients with hepatitis B associated cirrhosis and establish a Logistic regression model of noninvasive prediction. The clinical data of 234 hospitalized patients with hepatitis B associated cirrhosis from March 2012 to March 2014 were analyzed retrospectively. The dependent variable was the occurrence of PHG while the independent variables were screened by binary Logistic analysis. Multivariate Logistic regression was used for further analysis of significant noninvasive independent variables. Logistic regression model was established and odds ratio was calculated for each factor. The accuracy, sensitivity and specificity of model were evaluated by the curve of receiver operating characteristic (ROC). According to univariate Logistic regression, the risk factors included hepatic dysfunction, albumin (ALB), bilirubin (TB), prothrombin time (PT), platelet (PLT), white blood cell (WBC), portal vein diameter, spleen index, splenic vein diameter, diameter ratio, PLT to spleen volume ratio, esophageal varices (EV) and gastric varices (GV). Multivariate analysis showed that hepatic dysfunction (X1), TB (X2), PLT (X3) and splenic vein diameter (X4) were the major occurring factors for PHG. The established regression model was Logit P=-2.667+2.186X1-2.167X2+0.725X3+0.976X4. The accuracy of model for PHG was 79.1% with a sensitivity of 77.2% and a specificity of 80.8%. Hepatic dysfunction, TB, PLT and splenic vein diameter are risk factors for PHG and the noninvasive predicted Logistic regression model was Logit P=-2.667+2.186X1-2.167X2+0.725X3+0.976X4.
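The fitted equation reported above can be turned into a predicted probability with the inverse logit. The sketch below copies the coefficients from the abstract; the abstract does not say how X1 to X4 are coded or scaled, so the function names and the example inputs are assumptions for illustration only.

```python
import math

# Coefficients as printed in the abstract:
# Logit P = -2.667 + 2.186*X1 - 2.167*X2 + 0.725*X3 + 0.976*X4
INTERCEPT = -2.667
COEFFICIENTS = (2.186, -2.167, 0.725, 0.976)  # X1 (hepatic dysfunction), X2 (TB), X3 (PLT), X4 (splenic vein diameter)


def phg_logit(x1, x2, x3, x4):
    """Linear predictor (log odds) of PHG for one patient."""
    values = (x1, x2, x3, x4)
    return INTERCEPT + sum(b * x for b, x in zip(COEFFICIENTS, values))


def phg_probability(x1, x2, x3, x4):
    """Inverse logit: convert the log odds into a probability between 0 and 1."""
    return 1.0 / (1.0 + math.exp(-phg_logit(x1, x2, x3, x4)))


# Hypothetical patient with X1 = 1 and X4 = 1 under an assumed 0/1 coding.
example_logit = phg_logit(1, 0, 0, 1)
example_probability = phg_probability(1, 0, 0, 1)
```

Each coefficient is a change in log odds per unit of its variable, so `math.exp(2.186)` would be the odds ratio attached to X1 under whatever coding the authors used.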
Variable Selection in Logistic Regression.
1987-06-01
Z. D. Bai, P. R. Krishnaiah and L. C. Zhao. Variable Selection in Logistic Regression. Center for Multivariate Analysis, University of Pittsburgh. Contract or grant number F49620-85-C-0008.
NASA Astrophysics Data System (ADS)
Madhu, B.; Ashok, N. C.; Balasubramanian, S.
2014-11-01
Multinomial logistic regression analysis was used to develop a statistical model that can predict the probability of breast cancer in Southern Karnataka using breast cancer occurrence data from 2007-2011. Independent socio-economic variables describing breast cancer occurrence, such as age, education, occupation, parity, type of family, health insurance coverage, residential locality and socioeconomic status, were obtained for each case. The models were developed as follows: i) Spatial visualization of the urban-rural distribution of breast cancer cases obtained from the Bharat Hospital and Institute of Oncology. ii) Socio-economic risk factors describing the breast cancer occurrences were compiled for each case. These data were then analysed using multinomial logistic regression analysis in SPSS statistical software; relations between the occurrence of breast cancer across socio-economic status and the influence of other socio-economic variables were evaluated, and multinomial logistic regression models were constructed. iii) The model that best predicted the occurrence of breast cancer was identified. This multivariate logistic regression model was entered into a geographic information system, and maps showing the predicted probability of breast cancer occurrence in Southern Karnataka were created. This study demonstrates that multinomial logistic regression is a valuable tool for developing models that predict the probability of breast cancer occurrence in Southern Karnataka.
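For readers unfamiliar with the multinomial form, category probabilities follow from one linear predictor per non-reference category. The sketch below shows that standard baseline-category formula; the function name and the example predictor values are hypothetical and are not taken from the study.

```python
import math


def multinomial_probabilities(linear_predictors):
    """Baseline-category multinomial logit.

    linear_predictors holds eta_k for each non-reference category; the
    reference category has eta fixed at 0. Returns the probabilities in the
    order [reference, category 1, category 2, ...].
    """
    etas = [0.0] + list(linear_predictors)
    largest = max(etas)                        # subtract the max for numerical stability
    exponentials = [math.exp(eta - largest) for eta in etas]
    total = sum(exponentials)
    return [value / total for value in exponentials]


# Hypothetical linear predictors for two non-reference outcome categories.
probabilities = multinomial_probabilities([1.2, -0.7])
```

With all linear predictors at zero every category is equally likely, and each `exp(eta_k)` is the odds of category k relative to the reference category.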
Parsaeian, M; Mohammad, K; Mahmoudi, M; Zeraati, H
2012-01-01
Background: The purpose of this investigation was to empirically compare the predictive ability of an artificial neural network with that of logistic regression in the prediction of low back pain. Methods: Data from the second national health survey were considered in this investigation. These data include information on low back pain and its associated risk factors among Iranian people aged 15 years and older. Artificial neural network and logistic regression models were developed using a set of 17294 records and validated in a test set of 17295 records. The Hosmer and Lemeshow recommendation for model selection was used in fitting the logistic regression. A three-layer perceptron with 9 input, 3 hidden and 1 output neurons was employed. The efficiency of the two models was compared by receiver operating characteristic analysis, root mean square and -2 Loglikelihood criteria. Results: The area under the ROC curve (SE), root mean square and -2 Loglikelihood of the logistic regression were 0.752 (0.004), 0.3832 and 14769.2, respectively. The area under the ROC curve (SE), root mean square and -2 Loglikelihood of the artificial neural network were 0.754 (0.004), 0.3770 and 14757.6, respectively. Conclusions: Based on these three criteria, the artificial neural network would give better performance than logistic regression. Although the difference is statistically significant, it does not seem to be clinically significant. PMID:23113198
NASA Astrophysics Data System (ADS)
Kamaruddin, Ainur Amira; Ali, Zalila; Noor, Norlida Mohd.; Baharum, Adam; Ahmad, Wan Muhamad Amir W.
2014-07-01
Logistic regression analysis examines the influence of various factors on a dichotomous outcome by estimating the probability of the event's occurrence. Logistic regression, also called a logit model, is a statistical procedure used to model dichotomous outcomes. In the logit model the log odds of the dichotomous outcome is modeled as a linear combination of the predictor variables. The log odds ratio in logistic regression provides a description of the probabilistic relationship of the variables and the outcome. In conducting logistic regression, selection procedures are used in selecting important predictor variables, diagnostics are used to check that assumptions are valid which include independence of errors, linearity in the logit for continuous variables, absence of multicollinearity, and lack of strongly influential outliers and a test statistic is calculated to determine the aptness of the model. This study used the binary logistic regression model to investigate overweight and obesity among rural secondary school students on the basis of their demographics profile, medical history, diet and lifestyle. The results indicate that overweight and obesity of students are influenced by obesity in family and the interaction between a student's ethnicity and routine meals intake. The odds of a student being overweight and obese are higher for a student having a family history of obesity and for a non-Malay student who frequently takes routine meals as compared to a Malay student.
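The statement that the log odds are a linear combination of the predictors can be made concrete as follows. The intercept and coefficient below are invented for illustration (the abstract reports no estimates), and the helper names are hypothetical.

```python
import math


def log_odds(intercept, coefficients, predictors):
    """Linear predictor: intercept plus the sum of coefficient * predictor value."""
    return intercept + sum(b * x for b, x in zip(coefficients, predictors))


def probability_from_log_odds(z):
    """Inverse logit (logistic function)."""
    return 1.0 / (1.0 + math.exp(-z))


def log_odds_from_probability(p):
    """Logit: the log of the odds p / (1 - p)."""
    return math.log(p / (1.0 - p))


# Assumed values: intercept -1.0 and a coefficient of 0.8 for a 0/1
# "family history of obesity" indicator.
z_with_history = log_odds(-1.0, [0.8], [1])
z_without_history = log_odds(-1.0, [0.8], [0])
odds_ratio = math.exp(z_with_history - z_without_history)  # equals exp(0.8)
```

An interaction such as ethnicity by routine meal intake would enter the same function as one more predictor whose value is the product of the two indicator variables.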
Understanding logistic regression analysis.
Sperandei, Sandro
2014-01-01
Logistic regression is used to obtain odds ratio in the presence of more than one explanatory variable. The procedure is quite similar to multiple linear regression, with the exception that the response variable is binomial. The result is the impact of each variable on the odds ratio of the observed event of interest. The main advantage is to avoid confounding effects by analyzing the association of all variables together. In this article, we explain the logistic regression procedure using examples to make it as simple as possible. After definition of the technique, the basic interpretation of the results is highlighted and then some special issues are discussed.
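For a single binary explanatory variable, the exponentiated logistic regression coefficient equals the cross-product odds ratio of the 2x2 table, which gives a quick way to check intuition. The counts below are invented, and the confidence interval uses the usual large-sample log (Woolf) approximation; this is a sketch, not the article's worked example.

```python
import math


def odds_ratio(a, b, c, d):
    """Odds ratio from a 2x2 table.

    a: exposed with event      b: exposed without event
    c: unexposed with event    d: unexposed without event
    """
    return (a * d) / (b * c)


def odds_ratio_ci(a, b, c, d, z=1.96):
    """Approximate 95% interval on the log scale (Woolf method)."""
    log_or = math.log(odds_ratio(a, b, c, d))
    standard_error = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    return math.exp(log_or - z * standard_error), math.exp(log_or + z * standard_error)


# Hypothetical counts: 20 of 100 exposed and 10 of 100 unexposed had the event.
estimate = odds_ratio(20, 80, 10, 90)
lower, upper = odds_ratio_ci(20, 80, 10, 90)
```

With more than one explanatory variable the table shortcut no longer applies, and the adjusted odds ratios have to come from the fitted model, which is the situation the article addresses.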
Pütter, Carolin; Pechlivanis, Sonali; Nöthen, Markus M; Jöckel, Karl-Heinz; Wichmann, Heinz-Erich; Scherag, André
2011-01-01
Genome-wide association studies have identified robust associations between single nucleotide polymorphisms and complex traits. As the proportion of phenotypic variance explained is still limited for most of the traits, larger and larger meta-analyses are being conducted to detect additional associations. Here we investigate the impact of the study design and the underlying assumption about the true genetic effect in a bimodal mixture situation on the power to detect associations. We performed simulations of quantitative phenotypes analysed by standard linear regression and dichotomized case-control data sets from the extremes of the quantitative trait analysed by standard logistic regression. Using linear regression, markers with an effect in the extremes of the traits were almost undetectable, whereas analysing extremes by case-control design had superior power even for much smaller sample sizes. Two real data examples are provided to support our theoretical findings and to explore our mixture and parameter assumption. Our findings support the idea to re-analyse the available meta-analysis data sets to detect new loci in the extremes. Moreover, our investigation offers an explanation for discrepant findings when analysing quantitative traits in the general population and in the extremes. Copyright © 2011 S. Karger AG, Basel.
On the equivalence of case-crossover and time series methods in environmental epidemiology.
Lu, Yun; Zeger, Scott L
2007-04-01
The case-crossover design was introduced in epidemiology 15 years ago as a method for studying the effects of a risk factor on a health event using only cases. The idea is to compare a case's exposure immediately prior to or during the case-defining event with that same person's exposure at otherwise similar "reference" times. An alternative approach to the analysis of daily exposure and case-only data is time series analysis. Here, log-linear regression models express the expected total number of events on each day as a function of the exposure level and potential confounding variables. In time series analyses of air pollution, smooth functions of time and weather are the main confounders. Time series and case-crossover methods are often viewed as competing methods. In this paper, we show that case-crossover using conditional logistic regression is a special case of time series analysis when there is a common exposure such as in air pollution studies. This equivalence provides computational convenience for case-crossover analyses and a better understanding of time series models. Time series log-linear regression accounts for overdispersion of the Poisson variance, while case-crossover analyses typically do not. This equivalence also permits model checking for case-crossover data using standard log-linear model diagnostics.
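The conditional logistic likelihood used in case-crossover analysis compares the exposure in the case period with the exposures in that person's reference periods. A minimal sketch of the log-likelihood for a single exposure coefficient follows; the exposure values and function names are illustrative assumptions, not taken from the paper.

```python
import math


def stratum_log_likelihood(beta, case_exposure, reference_exposures):
    """Log-likelihood contribution of one case and its reference periods.

    The contribution is exp(beta * x_case) divided by the sum of
    exp(beta * x) over every period in the stratum, case period included.
    """
    exposures = [case_exposure] + list(reference_exposures)
    largest = max(beta * x for x in exposures)   # log-sum-exp for stability
    log_denominator = largest + math.log(
        sum(math.exp(beta * x - largest) for x in exposures)
    )
    return beta * case_exposure - log_denominator


def conditional_log_likelihood(beta, strata):
    """Sum the contributions over strata given as (case_exposure, references)."""
    return sum(
        stratum_log_likelihood(beta, case, references) for case, references in strata
    )


# Hypothetical pollutant levels: one case day and three reference days per person.
strata = [(35.0, [20.0, 25.0, 30.0]), (18.0, [22.0, 15.0, 19.0])]
null_value = conditional_log_likelihood(0.0, strata)
```

At `beta = 0` every period in a stratum is equally likely to be the case period, so each stratum contributes `log(1/4)` here. Person-level characteristics that do not change across periods cancel out of the ratio, which is why only cases are needed.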
Stolz, Erwin; Fux, Beat; Mayerl, Hannes; Rásky, Éva; Freidl, Wolfgang
2016-09-01
Passive suicide ideation (PSI) is common among older adults, but prevalences have been reported to vary considerably across European countries. The goal of this study was to assess the role of individual-level risk factors and societal contextual factors associated with PSI in old age. We analyzed longitudinal data from the Survey of Health, Ageing, and Retirement in Europe (SHARE) on 6,791 community-dwelling respondents (75+) from 12 countries. Bayesian logistic multilevel regression models were used to assess variance components, individual-level and country-level risk factors. About 4% of the total variance of PSI was located at the country level, a third of which was attributable to compositional effects of individual-level predictors. Predictors for the development of PSI at the individual level were female gender, depression, older age, poor health, smaller social network size, loneliness, nonreligiosity, and low perceived control (R² = 25.8%). At the country level, cultural acceptance of suicide, religiosity, and intergenerational cohabitation were associated with the rates of PSI. Cross-national variation in old-age PSI is mostly attributable to individual-level determinants and compositional differences, but there is also evidence for contextual effects of country-level characteristics. Suicide prevention programs should be intensified in high-risk countries and attitudes toward suicide should be addressed in information campaigns. © The Author 2016. Published by Oxford University Press on behalf of The Gerontological Society of America.
ERIC Educational Resources Information Center
Koon, Sharon; Petscher, Yaacov
2015-01-01
The purpose of this report was to explicate the use of logistic regression and classification and regression tree (CART) analysis in the development of early warning systems. It was motivated by state education leaders' interest in maintaining high classification accuracy while simultaneously improving practitioner understanding of the rules by…
Wiegerink, Diana J H G; Stam, Henk J; Ketelaar, Marjolijn; Cohen-Kettenis, Peggy T; Roebroeck, Marij E
2012-01-01
To study determinants of romantic relationships and sexual activity of young adults with cerebral palsy (CP), focusing on personal and environmental factors. A cohort study was performed with 74 young adults (46 men; 28 women) aged 20-25 years (SD 1.4) with CP (49% unilateral CP, 76% GMFCS level I, 85% MACS level I). All participants were of normal intelligence. Romantic relationships, sexual activity (outcome measures), personal and environmental factors (associated factors) were assessed. Associations were analyzed using logistic regression analyses. More females than males with CP were in a current romantic relationship. Self-esteem, sexual esteem and feelings of competence regarding self-efficacy contributed positively to having current romantic relationships. A negative parenting style contributed negatively. Age and gross motor functioning explained 20% of the variance in experience with intercourse. In addition, sexual esteem and taking initiative contributed significantly to intercourse experience. For young adults with CP, personal factors (20-35% explained variances) seem to contribute more than environmental factors (9-12% explained variances) to current romantic relationships and sexual experiences. We advise parents and professionals to focus on self-efficacy, self-esteem and sexual self-esteem in the development of young adults with CP. The severity of gross motor functioning contributed somewhat to sexual activities, but not to romantic relationships. High self-efficacy, self-esteem and sexual self-esteem can facilitate involvement in romantic and sexual relationships for young adults with CP.
Estimation of variance in Cox's regression model with shared gamma frailties.
Andersen, P K; Klein, J P; Knudsen, K M; Tabanera y Palacios, R
1997-12-01
The Cox regression model with a shared frailty factor allows for unobserved heterogeneity or for statistical dependence between the observed survival times. Estimation in this model when the frailties are assumed to follow a gamma distribution is reviewed, and we address the problem of obtaining variance estimates for regression coefficients, frailty parameter, and cumulative baseline hazards using the observed nonparametric information matrix. A number of examples are given comparing this approach with fully parametric inference in models with piecewise constant baseline hazards.
2017-03-23
Trudelle, Ryan C., "Using Multiple and Logistic Regression to Estimate the Median Will-Cost and Probability of Cost and Schedule Overrun for Program Managers". Public release; distribution unlimited. Available at: https://scholar.afit.edu...afit.edu.
2013-11-01
Ptrend 0.78 0.62 0.75 Unconditional logistic regression was used to estimate odds ratios (OR) and 95 % confidence intervals (CI) for risk of node...Ptrend 0.71 0.67 Unconditional logistic regression was used to estimate odds ratios (OR) and 95 % confidence intervals (CI) for risk of high-grade tumors... logistic regression was used to estimate odds ratios (OR) and 95 % confidence intervals (CI) for the associations between each of the seven SNPs and
Kim, Sun Mi; Kim, Yongdai; Jeong, Kuhwan; Jeong, Heeyeong; Kim, Jiyoung
2018-01-01
The aim of this study was to compare the performance of image analysis for predicting breast cancer using two distinct regression models and to evaluate the usefulness of incorporating clinical and demographic data (CDD) into the image analysis in order to improve the diagnosis of breast cancer. This study included 139 solid masses from 139 patients who underwent an ultrasonography-guided core biopsy and had available CDD between June 2009 and April 2010. Three breast radiologists retrospectively reviewed 139 breast masses and described each lesion using the Breast Imaging Reporting and Data System (BI-RADS) lexicon. We applied and compared two regression methods, stepwise logistic (SL) regression and logistic least absolute shrinkage and selection operator (LASSO) regression, in which the BI-RADS descriptors and CDD were used as covariates. We investigated the performances of these regression methods and the agreement of radiologists in terms of test misclassification error and the area under the curve (AUC) of the tests. Logistic LASSO regression was superior (P<0.05) to SL regression, regardless of whether CDD was included in the covariates, in terms of test misclassification errors (0.234 vs. 0.253, without CDD; 0.196 vs. 0.258, with CDD) and AUC (0.785 vs. 0.759, without CDD; 0.873 vs. 0.735, with CDD). However, it was inferior (P<0.05) to the agreement of three radiologists in terms of test misclassification errors (0.234 vs. 0.168, without CDD; 0.196 vs. 0.088, with CDD) and the AUC without CDD (0.785 vs. 0.844, P<0.001), but was comparable to the AUC with CDD (0.873 vs. 0.880, P=0.141). Logistic LASSO regression based on BI-RADS descriptors and CDD showed better performance than SL in predicting the presence of breast cancer. The use of CDD as a supplement to the BI-RADS descriptors significantly improved the prediction of breast cancer using logistic LASSO regression.
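Logistic LASSO differs from stepwise selection in that an L1 penalty shrinks coefficients and sets some exactly to zero during fitting. The sketch below uses plain proximal gradient descent (ISTA) with a soft-thresholding step; it is an assumed, simplified illustration on toy data, not the software or tuning procedure the authors used, and all names are hypothetical.

```python
import math


def soft_threshold(value, amount):
    """Shrink value toward zero by amount; values within [-amount, amount] become 0."""
    if value > amount:
        return value - amount
    if value < -amount:
        return value + amount
    return 0.0


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def fit_logistic_lasso(rows, labels, penalty, step=0.1, iterations=500):
    """Minimize mean logistic loss + penalty * sum(|weights|); intercept unpenalized."""
    n, p = len(rows), len(rows[0])
    weights, intercept = [0.0] * p, 0.0
    for _ in range(iterations):
        residuals = [
            sigmoid(intercept + sum(w * x for w, x in zip(weights, row))) - y
            for row, y in zip(rows, labels)
        ]
        gradient_intercept = sum(residuals) / n
        gradient_weights = [
            sum(r * row[j] for r, row in zip(residuals, rows)) / n for j in range(p)
        ]
        intercept -= step * gradient_intercept
        # Gradient step followed by the L1 proximal (soft-thresholding) step.
        weights = [
            soft_threshold(w - step * g, step * penalty)
            for w, g in zip(weights, gradient_weights)
        ]
    return intercept, weights


# Toy covariates scaled to [-1, 1] (assumed); 1 = malignant, 0 = benign.
rows = [[1.0, 0.2], [0.8, -0.5], [-0.9, 0.4], [-1.0, -0.3]]
labels = [1, 1, 0, 0]
_, heavily_penalized = fit_logistic_lasso(rows, labels, penalty=5.0)
_, lightly_penalized = fit_logistic_lasso(rows, labels, penalty=0.05)
```

A large penalty removes every covariate, and lowering it lets covariates enter one by one; in practice the penalty would be chosen by cross-validation rather than fixed by hand as here.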
Yu, Yuanyuan; Li, Hongkai; Sun, Xiaoru; Su, Ping; Wang, Tingting; Liu, Yi; Yuan, Zhongshang; Liu, Yanxun; Xue, Fuzhong
2017-12-28
Confounders can produce spurious associations between exposure and outcome in observational studies. For the majority of epidemiologists, adjusting for confounders with a logistic regression model is the habitual method, though it has some problems in accuracy and precision. It is, therefore, important to highlight the problems of logistic regression and to search for an alternative method. Four causal diagram models were defined to summarize confounding equivalence. Both theoretical proofs and simulation studies were performed to verify whether conditioning on different confounding equivalence sets had the same bias-reducing potential and then to select the optimum adjusting strategy, in which the logistic regression model and the inverse probability weighting based marginal structural model (IPW-based-MSM) were compared. The "do-calculus" was used to calculate the true causal effect of exposure on outcome; the bias and standard error were then used to evaluate the performance of the different strategies. Adjusting for different sets of confounding equivalence, as judged by identical Markov boundaries, produced different bias-reducing potential in the logistic regression model. For the sets that satisfied G-admissibility, adjusting for the set including all the confounders yielded bias equivalent to that of the set containing the parent nodes of the outcome, while the bias after adjusting for the parent nodes of exposure was not equivalent to them. In addition, all causal effect estimations through logistic regression were biased, although the estimation after adjusting for the parent nodes of exposure was nearest to the true causal effect. However, conditioning on different confounding equivalence sets had the same bias-reducing potential under IPW-based-MSM.
Compared with logistic regression, the IPW-based-MSM could obtain unbiased causal effect estimation when the adjusted confounders satisfied G-admissibility, and the optimal strategy was to adjust for the parent nodes of outcome, which obtained the highest precision. All adjustment strategies through logistic regression were biased for causal effect estimation, while IPW-based-MSM could always obtain unbiased estimation when the adjusted set satisfied G-admissibility. Thus, IPW-based-MSM was recommended for adjusting for the confounder set.
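The inverse probability weighting idea can be illustrated with a single binary confounder. The sketch below estimates the propensity score as the treated fraction within each confounder stratum and uses normalized weights to estimate a marginal risk difference; the counts are invented, the names are hypothetical, and this is a simplification of, not a substitute for, the marginal structural model in the paper. It assumes every stratum contains both exposed and unexposed subjects.

```python
def ipw_risk_difference(records):
    """Marginal risk difference from (confounder, exposure, outcome) records.

    Exposed subjects are weighted by 1 / e(z), unexposed by 1 / (1 - e(z)),
    where e(z) is the exposed fraction in confounder stratum z.
    """
    totals, exposed = {}, {}
    for z, a, _ in records:
        totals[z] = totals.get(z, 0) + 1
        exposed[z] = exposed.get(z, 0) + a
    propensity = {z: exposed[z] / totals[z] for z in totals}

    weighted = {0: [0.0, 0.0], 1: [0.0, 0.0]}  # arm -> [sum of w * y, sum of w]
    for z, a, y in records:
        e = propensity[z]
        weight = 1.0 / e if a == 1 else 1.0 / (1.0 - e)
        weighted[a][0] += weight * y
        weighted[a][1] += weight
    risk_exposed = weighted[1][0] / weighted[1][1]
    risk_unexposed = weighted[0][0] / weighted[0][1]
    return risk_exposed - risk_unexposed


def crude_risk_difference(records):
    """Unadjusted difference in outcome proportions between exposure arms."""
    exposed_outcomes = [y for _, a, y in records if a == 1]
    unexposed_outcomes = [y for _, a, y in records if a == 0]
    return (sum(exposed_outcomes) / len(exposed_outcomes)
            - sum(unexposed_outcomes) / len(unexposed_outcomes))


def expand(z, a, events, n):
    """Build n records for stratum z and arm a, of which `events` have y = 1."""
    return [(z, a, 1)] * events + [(z, a, 0)] * (n - events)


# Invented counts: the confounder z raises both exposure and outcome probability.
records = (expand(0, 1, 2, 10) + expand(0, 0, 3, 30)
           + expand(1, 1, 20, 40) + expand(1, 0, 6, 20))
adjusted = ipw_risk_difference(records)
crude = crude_risk_difference(records)
```

In this toy table the crude difference overstates the effect because exposed subjects are concentrated in the high-risk stratum, and the weighting rebalances the two arms to the same confounder distribution.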
Use and interpretation of logistic regression in habitat-selection studies
Keating, Kim A.; Cherry, Steve
2004-01-01
Logistic regression is an important tool for wildlife habitat-selection studies, but the method frequently has been misapplied due to an inadequate understanding of the logistic model, its interpretation, and the influence of sampling design. To promote better use of this method, we review its application and interpretation under 3 sampling designs: random, case-control, and use-availability. Logistic regression is appropriate for habitat use-nonuse studies employing random sampling and can be used to directly model the conditional probability of use in such cases. Logistic regression also is appropriate for studies employing case-control sampling designs, but careful attention is required to interpret results correctly. Unless bias can be estimated or probability of use is small for all habitats, results of case-control studies should be interpreted as odds ratios, rather than probability of use or relative probability of use. When data are gathered under a use-availability design, logistic regression can be used to estimate approximate odds ratios if probability of use is small, at least on average. More generally, however, logistic regression is inappropriate for modeling habitat selection in use-availability studies. In particular, using logistic regression to fit the exponential model of Manly et al. (2002:100) does not guarantee maximum-likelihood estimates, valid probabilities, or valid likelihoods. We show that the resource selection function (RSF) commonly used for the exponential model is proportional to a logistic discriminant function. Thus, it may be used to rank habitats with respect to probability of use and to identify important habitat characteristics or their surrogates, but it is not guaranteed to be proportional to probability of use. Other problems associated with the exponential model also are discussed. 
We describe an alternative model based on Lancaster and Imbens (1996) that offers a method for estimating conditional probability of use in use-availability studies. Although promising, this model fails to converge to a unique solution in some important situations. Further work is needed to obtain a robust method that is broadly applicable to use-availability studies.
Pian, Wenjing; Khoo, Christopher SG
2017-01-01
Background Users searching for health information on the Internet may be searching for their own health issue, searching for someone else’s health issue, or browsing with no particular health issue in mind. Previous research has found that these three categories of users focus on different types of health information. However, most health information websites provide static content for all users. If the three types of user health information need contexts can be identified by the Web application, the search results or information offered to the user can be customized to increase its relevance or usefulness to the user. Objective The aim of this study was to investigate the possibility of identifying the three user health information contexts (searching for self, searching for others, or browsing with no particular health issue in mind) using just hyperlink clicking behavior; using eye-tracking information; and using a combination of eye-tracking, demographic, and urgency information. Predictive models are developed using multinomial logistic regression. Methods A total of 74 participants (39 females and 35 males) who were mainly staff and students of a university were asked to browse a health discussion forum, Healthboards.com. An eye tracker recorded their examining (eye fixation) and skimming (quick eye movement) behaviors on 2 types of screens: summary result screen displaying a list of post headers, and detailed post screen. The following three types of predictive models were developed using logistic regression analysis: model 1 used only the time spent in scanning the summary result screen and reading the detailed post screen, which can be determined from the user’s mouse clicks; model 2 used the examining and skimming durations on each screen, recorded by an eye tracker; and model 3 added user demographic and urgency information to model 2. 
Results An analysis of variance (ANOVA) found that users’ browsing durations were significantly different for the three health information contexts (P<.001). The logistic regression model 3 was able to predict the user’s type of health information context with a 10-fold cross validation mean accuracy of 84% (62/74), followed by model 2 at 73% (54/74) and model 1 at 71% (52/78). In addition, correlation analysis found that particular browsing durations were highly correlated with users’ age, education level, and the urgency of their information need. Conclusions A user’s type of health information need context (ie, searching for self, for others, or with no health issue in mind) can be identified with reasonable accuracy using just user mouse clicks that can easily be detected by Web applications. Higher accuracy can be obtained using Google Glass or future computing devices with eye tracking function. PMID:29269342
Modeling Governance KB with CATPCA to Overcome Multicollinearity in the Logistic Regression
NASA Astrophysics Data System (ADS)
Khikmah, L.; Wijayanto, H.; Syafitri, U. D.
2017-04-01
Multicollinearity is a problem often encountered in logistic regression modeling. Multicollinearity between explanatory variables biases the parameter estimates and also causes classification errors. Stepwise regression is generally used to overcome multicollinearity in regression. Another method, which involves all variables in prediction, is Principal Component Analysis (PCA). However, classical PCA applies only to numeric data. When the data are categorical, one method to solve the problem is Categorical Principal Component Analysis (CATPCA). The data used in this research were part of the Demographic and Population Survey Indonesia (IDHS) 2012. This research focuses on the characteristics of women using contraceptive methods. Classification results were evaluated using Area Under Curve (AUC) values; the higher the AUC value, the better. Based on AUC values, the classification of contraceptive method use by the stepwise method (58.66%) is better than that of the logistic regression model (57.39%) and CATPCA (57.39%). Evaluating the results using sensitivity shows the opposite: the CATPCA method (99.79%) is better than the logistic regression method (92.43%) and stepwise (92.05%). Because this study focuses on classification of the major class (using a contraceptive method), the selected model is CATPCA, as it raises the accuracy for the major class.
Radiographic Outcomes of Volar Locked Plating for Distal Radius Fractures
Mignemi, Megan E.; Byram, Ian R.; Wolfe, Carmen C.; Fan, Kang-Hsien; Koehler, Elizabeth A.; Block, John J.; Jordanov, Martin I.; Watson, Jeffry T.; Weikert, Douglas R.; Lee, Donald H.
2013-01-01
Purpose To assess the ability of volar locked plating to achieve and maintain normal radiographic parameters for articular stepoff, volar tilt, radial inclination, ulnar variance, and radial height in distal radius fractures. Methods We performed a retrospective review of 185 distal radius fractures that underwent volar locked plating with a single plate design over a 5-year period. We reviewed radiographs and recorded measurements for volar tilt, radial inclination, ulnar variance, radial height, and articular stepoff. We used logistic regression to determine the association between return to radiographic standard norms and fracture type. Results At the first and final postoperative follow-up visits, we observed articular congruence less than 2 mm in 92% of fractures at both times. Normal volar tilt (11°) was restored in 46% at the first follow-up and 48% at the final one. Radial inclination (22°) was achieved in 44% at the first follow-up and 43% at the final one, and ulnar variance (01 ± 2 mm) was achieved in 53% at the first follow-up and 53% at the final one. In addition, radial height (14 ± 1mm) was restored in 14% at the first follow-up and 12% at the final one. More complex, intra-articular fractures (AO class B and C and Frykman types 3, 4, 7, and 8) were less likely to be restored to normal radiographic parameters. However, because of the small sample size for some fracture types, it was difficult to discover significant associations between fracture type and radiographic outcome. Conclusions Volar locked plating for distal radius fractures achieved articular stepoff less than 2 mm in most fractures but only restored and maintained normal radiographic measurements for volar tilt, radial inclination, and ulnar variance in 50% of fractures. The ability of volar locked plating to restore and maintain ulnar variance and volar tilt decreased with more complex intra-articular fracture types. PMID:23218558
Adding a Parameter Increases the Variance of an Estimated Regression Function
ERIC Educational Resources Information Center
Withers, Christopher S.; Nadarajah, Saralees
2011-01-01
The linear regression model is one of the most popular models in statistics. It is also one of the simplest models in statistics. It has received applications in almost every area of science, engineering and medicine. In this article, the authors show that adding a predictor to a linear model increases the variance of the estimated regression…
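For ordinary least squares the statement can be checked with a standard block-inverse argument. The sketch below is not necessarily the authors' proof; the symbols $X$, $z$, $x_0$, $z_0$, $M$, $s$ and $d$ are notation introduced here, and $z$ is assumed not to lie in the column space of $X$.

```latex
% Base model: y = X\beta + \varepsilon, Var(\varepsilon) = \sigma^2 I; fitted value at x_0
\operatorname{Var}\!\left(x_0^\top \hat\beta\right)
  = \sigma^2\, x_0^\top (X^\top X)^{-1} x_0 .

% Augmented design [X, z], evaluated at the point (x_0, z_0)
M = I - X (X^\top X)^{-1} X^\top, \qquad
s = z^\top M z > 0, \qquad
d = z_0 - x_0^\top (X^\top X)^{-1} X^\top z .

% Variance of the fitted value under the augmented model
\operatorname{Var}_{\mathrm{aug}}
  = \sigma^2 \left[ x_0^\top (X^\top X)^{-1} x_0 + \frac{d^2}{s} \right]
  \;\ge\; \operatorname{Var}\!\left(x_0^\top \hat\beta\right).
```

The extra term $d^2/s$ is non-negative, so the variance cannot decrease; it is unchanged only when $d = 0$, that is, when $z_0$ equals the value predicted for it by regressing $z$ on $X$.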
NASA Astrophysics Data System (ADS)
Reis, D. S.; Stedinger, J. R.; Martins, E. S.
2005-10-01
This paper develops a Bayesian approach to analysis of a generalized least squares (GLS) regression model for regional analyses of hydrologic data. The new approach allows computation of the posterior distributions of the parameters and the model error variance using a quasi-analytic approach. Two regional skew estimation studies illustrate the value of the Bayesian GLS approach for regional statistical analysis of a shape parameter and demonstrate that regional skew models can be relatively precise with effective record lengths in excess of 60 years. With Bayesian GLS the marginal posterior distribution of the model error variance and the corresponding mean and variance of the parameters can be computed directly, thereby providing a simple but important extension of the regional GLS regression procedures popularized by Tasker and Stedinger (1989), which is sensitive to the likely values of the model error variance when it is small relative to the sampling error in the at-site estimator.
Logistic regression models of factors influencing the location of bioenergy and biofuels plants
T.M. Young; R.L. Zaretzki; J.H. Perdue; F.M. Guess; X. Liu
2011-01-01
Logistic regression models were developed to identify significant factors that influence the location of existing wood-using bioenergy/biofuels plants and traditional wood-using facilities. Logistic models provided quantitative insight for variables influencing the location of woody biomass-using facilities. Availability of "thinnings to a basal area of 31.7m2/ha...
Discrete post-processing of total cloud cover ensemble forecasts
NASA Astrophysics Data System (ADS)
Hemri, Stephan; Haiden, Thomas; Pappenberger, Florian
2017-04-01
This contribution presents an approach to post-process ensemble forecasts for the discrete and bounded weather variable of total cloud cover. Two methods for discrete statistical post-processing of ensemble predictions are tested. The first approach is based on multinomial logistic regression, the second involves a proportional odds logistic regression model. Applying them to total cloud cover raw ensemble forecasts from the European Centre for Medium-Range Weather Forecasts improves forecast skill significantly. Based on station-wise post-processing of raw ensemble total cloud cover forecasts for a global set of 3330 stations over the period from 2007 to early 2014, the more parsimonious proportional odds logistic regression model proved to slightly outperform the multinomial logistic regression model. Reference Hemri, S., Haiden, T., & Pappenberger, F. (2016). Discrete post-processing of total cloud cover ensemble forecasts. Monthly Weather Review 144, 2565-2577.
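The difference in parsimony between the two models can be made explicit. The notation below ($K$ ordered categories, $p$ predictors collected in $x$, intercepts $\alpha_k$ and $\theta_k$) is introduced here for illustration and is not taken from the abstract; the forms shown are the textbook versions of each model.

```latex
% Multinomial logistic regression, category K as reference:
P(Y = k \mid x) =
  \frac{\exp(\alpha_k + x^\top \beta_k)}
       {1 + \sum_{j=1}^{K-1} \exp(\alpha_j + x^\top \beta_j)},
  \qquad k = 1, \dots, K-1 .
% Number of free parameters: (K-1)(p+1)

% Proportional odds logistic regression:
\operatorname{logit} P(Y \le k \mid x) = \theta_k - x^\top \beta,
  \qquad k = 1, \dots, K-1, \quad \theta_1 < \dots < \theta_{K-1} .
% Number of free parameters: (K-1) + p
```

The proportional odds model shares one slope vector $\beta$ across all category thresholds, which is why it needs far fewer parameters when $K$ is large.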
Fuzzy multinomial logistic regression analysis: A multi-objective programming approach
NASA Astrophysics Data System (ADS)
Abdalla, Hesham A.; El-Sayed, Amany A.; Hamed, Ramadan
2017-05-01
Parameter estimation for multinomial logistic regression is usually based on maximizing the likelihood function. For large well-balanced datasets, Maximum Likelihood (ML) estimation is a satisfactory approach. Unfortunately, ML can fail completely or at least produce poor results in terms of estimated probabilities and confidence intervals of parameters, especially for small datasets. In this study, a new approach based on fuzzy concepts is proposed to estimate parameters of the multinomial logistic regression. The study assumes that the parameters of multinomial logistic regression are fuzzy. Based on the extension principle stated by Zadeh and Bárdossy's proposition, a multi-objective programming approach is suggested to estimate these fuzzy parameters. A simulation study is used to evaluate the performance of the new approach versus the Maximum Likelihood (ML) approach. Results show that the new proposed model outperforms ML in cases of small datasets.
A Primer on Logistic Regression.
ERIC Educational Resources Information Center
Woldbeck, Tanya
This paper introduces logistic regression as a viable alternative when the researcher is faced with variables that are not continuous. If one is to use simple regression, the dependent variable must be measured on a continuous scale. In the behavioral sciences, it may not always be appropriate or possible to have a measured dependent variable on a…
A Solution to Separation and Multicollinearity in Multiple Logistic Regression
Shen, Jianzhao; Gao, Sujuan
2010-01-01
In dementia screening tests, item selection for shortening an existing screening test can be achieved using multiple logistic regression. However, maximum likelihood estimates for such logistic regression models often experience serious bias or even non-existence because of separation and multicollinearity problems resulting from a large number of highly correlated items. Firth (1993, Biometrika, 80(1), 27–38) proposed a penalized likelihood estimator for generalized linear models and it was shown to reduce bias and the non-existence problems. The ridge regression has been used in logistic regression to stabilize the estimates in cases of multicollinearity. However, neither solves the problems for each other. In this paper, we propose a double penalized maximum likelihood estimator combining Firth’s penalized likelihood equation with a ridge parameter. We present a simulation study evaluating the empirical performance of the double penalized likelihood estimator in small to moderate sample sizes. We demonstrate the proposed approach using a current screening data from a community-based dementia study. PMID:20376286
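The two penalties named in the abstract can be written side by side. Firth's penalty is standard; the combined form below is one natural way to add a ridge term and is an assumption made here. The paper's exact scaling of the ridge parameter $\lambda$ (for example $\lambda$ versus $\lambda/2$) and its treatment of the intercept may differ.

```latex
% Logistic log-likelihood
\ell(\beta) = \sum_{i=1}^{n} \left[ y_i \log \pi_i + (1 - y_i) \log (1 - \pi_i) \right],
  \qquad \pi_i = \frac{1}{1 + \exp(-x_i^\top \beta)} .

% Firth's penalized log-likelihood, with Fisher information I(\beta)
\ell_{F}(\beta) = \ell(\beta) + \tfrac{1}{2} \log \left| I(\beta) \right|,
  \qquad I(\beta) = X^\top W X, \quad
  W = \operatorname{diag}\!\left\{ \pi_i (1 - \pi_i) \right\} .

% Ridge-penalized log-likelihood
\ell_{R}(\beta) = \ell(\beta) - \lambda \lVert \beta \rVert^2 .

% Double penalized log-likelihood (assumed form)
\ell_{DP}(\beta) = \ell(\beta) + \tfrac{1}{2} \log \left| I(\beta) \right|
  - \lambda \lVert \beta \rVert^2 .
```

The Firth term keeps estimates finite under separation, while the ridge term shrinks coefficients of highly correlated items; neither term alone addresses both problems, which is the motivation stated in the abstract.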
A Solution to Separation and Multicollinearity in Multiple Logistic Regression.
Shen, Jianzhao; Gao, Sujuan
2008-10-01
In dementia screening tests, item selection for shortening an existing screening test can be achieved using multiple logistic regression. However, maximum likelihood estimates for such logistic regression models often experience serious bias or even non-existence because of separation and multicollinearity problems resulting from a large number of highly correlated items. Firth (1993, Biometrika, 80(1), 27-38) proposed a penalized likelihood estimator for generalized linear models and it was shown to reduce bias and the non-existence problems. The ridge regression has been used in logistic regression to stabilize the estimates in cases of multicollinearity. However, neither solves the problems for each other. In this paper, we propose a double penalized maximum likelihood estimator combining Firth's penalized likelihood equation with a ridge parameter. We present a simulation study evaluating the empirical performance of the double penalized likelihood estimator in small to moderate sample sizes. We demonstrate the proposed approach using a current screening data from a community-based dementia study.
Ye, Dong-qing; Hu, Yi-song; Li, Xiang-pei; Huang, Fen; Yang, Shi-gui; Hao, Jia-hu; Yin, Jing; Zhang, Guo-qing; Liu, Hui-hui
2004-11-01
To explore the impact of environmental factors, daily lifestyle, psycho-social factors and the interactions between environmental factors and chemokine genes on systemic lupus erythematosus (SLE). A case-control study was carried out and environmental factors for SLE were analyzed by univariate and multivariate unconditional logistic regression. Interactions between environmental factors and chemokine polymorphisms contributing to systemic lupus erythematosus were also analyzed by a logistic regression model. Nineteen factors were associated with SLE when univariate unconditional logistic regression was used. However, when multivariate unconditional logistic regression was used, only five factors showed an impact on the disease: drinking well water (OR=0.099) was a protective factor for SLE, and multiple drug allergy (OR=8.174), over-exposure to sunshine (OR=18.339), taking antibiotics (OR=9.630) and oral contraceptives were risk factors for SLE. When the unconditional logistic regression model was used, results showed an interaction between eating irritable food and the -2518MCP-1G/G genotype (OR=4.387). No interaction between environmental factors contributing to SLE was found in this study. Many environmental factors were related to SLE, and there was an interaction between the -2518MCP-1G/G genotype and eating irritable food.
Mielniczuk, Jan; Teisseyre, Paweł
2018-03-01
Detection of gene-gene interactions is one of the most important challenges in genome-wide case-control studies. Besides traditional logistic regression analysis, entropy-based methods have recently attracted significant attention. Among entropy-based methods, interaction information is one of the most promising measures, having many desirable properties. Although both logistic regression and interaction information have been used in several genome-wide association studies, the relationship between them has not been thoroughly investigated theoretically. The present paper attempts to fill this gap. We show that although certain connections between the two methods exist, in general they refer to two different concepts of dependence, and looking for interactions in those two senses leads to different approaches to interaction detection. We introduce an ordering between interaction measures and specify conditions for independent and dependent genes under which interaction information is a more discriminative measure than logistic regression. Moreover, we show that for so-called perfect distributions those measures are equivalent. The numerical experiments illustrate the theoretical findings, indicating that interaction information and its modified version are more universal tools for detecting various types of interaction than logistic regression and linkage disequilibrium measures.
Kim, Tae Kyung; Lee, H-C; Lee, S G; Han, K-T; Park, E-C
2017-04-01
Reports of sexual harassment are becoming more frequent in Republic of Korea (ROK) Armed Forces. This study aimed to analyse the impact of sexual harassment on mental health among female military personnel of the ROK Armed Forces. Data from the 2014 Military Health Survey were used. Instances of sexual harassment were recorded as 'yes' or 'no'. Analysis of variance (ANOVA) was carried out to compare Kessler Psychological Distress Scale 10 (K-10) scores. Multiple logistic regression analysis was performed to identify associations between sexual harassment and K-10 scores. Among 228 female military personnel, 13 (5.7%) individuals experienced sexual harassment. Multiple logistic regression analysis revealed that sexual harassment had a significantly negative impact on K-10 scores (3.486, p<0.04). Higher K-10 scores among individuals experiencing sexual harassment were identified in the unmarried (including never-married) group (6.761, p<0.04), the short-term military service group (12.014, p<0.03) and the group whose length of service was <2 years (11.067, p<0.02). Sexual harassment has a negative impact on mental health. Factors associated with worse mental health scores included service classification and length of service. The results provide helpful information with which to develop measures for minimising the negative psychological effects from sexual harassment and promoting sexual harassment prevention policy.
Zhao, Ying; Kane, Irene; Mao, Liping; Shi, Shenxun; Wang, Jing; Lin, Qiping; Luo, Jianfeng
2016-06-01
The psychological status of Chinese pregnant women who present with obstetrical complications is concerning to Chinese health professionals. This study aimed to investigate the prevalence of antenatal depression and analyzed related risk factors in a population of high-risk Chinese women. A large sample size, cross-sectional study. A total of 842 pregnant women with complications completed the Chinese version of the Postpartum Depression Screen Scale (PDSS) in this cross-sectional study. t-Test, ANOVA and Binary logistic regression tests were used in data analysis of antenatal depression and risk factors. The prevalence of major or minor depression in high-risk Chinese pregnant women during antenatal period was 8.3% and 28.9%, respectively. Independent-sample t-test and two-way analysis of variance (ANOVA) indicated significant differences in age, education, occupation and the number of complications (P<0.05). Binary logistic regression analysis indicated a significant negative association between depression and education (P<0.01) with lower educational level (OR: 0.590; 95% CI: 0.424-0.820) associated with a higher risk for depression. A significant positive association was observed between depression and age (P<0.05) with higher age (OR: 1.338; 95% CI: 1.008-1.774) correlated with a higher risk for depression. Women who experienced obstetric complications presented with higher PDSS depression scores. Screening for antenatal depression in high-risk pregnant women to promote early detection of depression and reduce health risks for universal health promotion is recommended.
Nayeri, Arash; Bhatia, Nirmanmoh; Holmes, Benjamin; Borges, Nyal; Armstrong, William; Xu, Meng; Farber-Eger, Eric; Wells, Quinn S; McPherson, John A
2017-06-01
Recent studies on comatose survivors of cardiac arrest undergoing targeted temperature management (TTM) have shown similar outcomes at multiple target temperatures. However, details regarding core temperature variability during TTM and its prognostic implications remain largely unknown. We sought to assess the association between core temperature variability and neurological outcomes in patients undergoing TTM following cardiac arrest. We analyzed a prospectively collected cohort of 242 patients treated with TTM following cardiac arrest at a tertiary care hospital between 2007 and 2014. Core temperature variability was defined as the statistical variance (i.e. standard deviation squared) amongst all core temperature recordings during the maintenance phase of TTM. Poor neurological outcome at hospital discharge, defined as a Cerebral Performance Category (CPC) score>2, was the primary outcome. Death prior to hospital discharge was assessed as the secondary outcome. Multivariable logistic regression was used to examine the association between temperature variability and neurological outcome or death at hospital discharge. A poor neurological outcome was observed in 147 (61%) patients and 136 (56%) patients died prior to hospital discharge. In multivariable logistic regression, increased core temperature variability was not associated with increased odds of poor neurological outcomes (OR 0.38, 95% CI 0.11-1.38, p=0.142) or death (OR 0.43, 95% CI 0.12-1.53, p=0.193) at hospital discharge. In this study, individual core temperature variability during TTM was not associated with poor neurological outcomes or death at hospital discharge.
An Exploratory Analysis of Work Engagement, Satisfaction, and Depression in Psychiatry Residents.
Agarwal, Gaurava; Karpouzian, Tatiana
2016-02-01
This exploratory study aims to measure work engagement levels in psychiatry residents at three psychiatry residency programs using the Utrecht Work Engagement Scale (UWES). In addition, the study investigates the relationship between total engagement and its subscales, resident satisfaction, and a depression screen. Recruitment of 53/79 residents from three psychiatry residency programs in Illinois was completed. The residents were administered a questionnaire consisting of the UWES, the Primary Care Evaluation of Mental Disorders (Prime-MD) depression screen, and a residency satisfaction scale. Statistical analysis using independent samples t test and a one-way analysis of variance was used to assess differences on engagement total score and subscales and satisfaction scale. A logistic regression was used with the engagement subscales and the satisfaction scale as predictors of belonging to the depressed or non-depressed group. Psychiatry residents scored in the high range for total engagement and all its subscales except for vigor which was in the moderate range. Residents who screened positive for depression reported lower total engagement than those who were negative on the depression screen. Vigor was the only significant predictor (p = .004) of being in the depressed group after logistic regression. Total engagement and the subscale of dedication significantly predicted overall residency satisfaction (β = .473, p = .016). Higher total UWES-15 and its subscales of vigor and dedication are correlated with a lower rate of screening positive for depression and higher residency satisfaction. This exploratory study lends support for further study of this psychological construct in medical training programs, but replication is needed.
Zhang, Shun; McGoy, Shanell L.; Dawes, Daniel; Fransua, Mesfin; Rust, George; Satcher, David
2014-01-01
Objectives The purpose of this study was to explore the racial and ethnic disparities in initiation of antiretroviral treatment (ARV treatment or ART) among HIV-infected Medicaid enrollees 18–64 years of age in 14 southern states which have high prevalence of HIV/AIDS and high racial disparities in HIV treatment access and mortality. Methods We used Medicaid claims data from 2005 to 2007 for a retrospective cohort study. We compared frequency variances of HIV treatment uptake among persons of different racial-ethnic groups using univariate and multivariate methods. The unadjusted odds ratio was estimated through multinomial logistic regression. The multinomial logistic regression model was repeated with adjustment for multiple covariates. Results Of the 23,801 Medicaid enrollees who met criteria for initiation of ARV treatment, only one third (34.6%) received ART consistent with national guideline treatment protocols, and 21.5% received some ARV medication, but with sub-optimal treatment profiles. There was no significant difference in the proportion of people who received ARV treatment between black (35.8%) and non-Hispanic whites (35.7%), but Hispanic/Latino persons (26%) were significantly less likely to receive ARV treatment. Conclusions Overall ARV treatment levels for all segments of the population are less than optimal. Among the Medicaid population there are no racial HIV treatment disparities between Black and White persons living with HIV, which suggests the potential relevance of Medicaid to currently uninsured populations, and the potential to achieve similar levels of equality within Medicaid for Hispanic/Latino enrollees and other segments of the Medicaid population. PMID:24769625
Third and Fourth Degree Perineal Injury After Vaginal Delivery: Does Race Make a Difference?
de Silva, Kanoe-Lehua; Tsai, Pai-Jong Stacy; Kon, Leanne M; Kessel, Bruce; Seto, Todd; Kaneshiro, Bliss
2014-01-01
Severe perineal injury (third and fourth degree laceration) at the time of vaginal delivery increases the risk of fecal incontinence, chronic perineal pain, and dyspareunia.1–5 Studies suggest the prevalence of severe perineal injury may vary by racial group.6 The purpose of the current study was to examine rates of severe perineal injury in different Asian and Pacific Islander subgroups. A retrospective cohort study was performed among all patients who had a vaginal delivery at Queens Medical Center in Honolulu, Hawai‘i between January 1, 2002 and December 31, 2003. Demographic and health related variables were obtained for each participant. Maternal race/ethnicity (Japanese, Filipino, Chinese, other Asian, Part-Hawaiian/Hawaiian, Micronesian, other Pacific Islander, Caucasian, multiracial [non-Hawaiian], and other) was self-reported by the patient at the time of admission. The significance of associations between racial/ethnic groups and demographic and health related variables was determined using chi-square tests for categorical variables and analysis of variance for continuous factors. Multiple logistic regression was performed to adjust for potential confounders when examining severe laceration rates. A total of 1842 subjects met inclusion criteria. The proportion of severe perineal lacerations did not differ significantly between racial groups. In the multiple logistic regression analysis, operative vaginal delivery was related to both race and severe perineal laceration. However, despite adjusting for this variable, race was not associated with an increased risk of having a severe laceration (P = .70). The results of this study indicate the risk of severe perineal laceration does not differ based on maternal race/ethnicity. PMID:24660124
Gower, Amy L; Rider, G Nicole; Coleman, Eli; Brown, Camille; McMorris, Barbara J; Eisenberg, Marla E
2018-06-19
As measures of birth-assigned sex, gender identity, and perceived gender presentation are increasingly included in large-scale research studies, data analysis approaches incorporating such measures are needed. Large samples capable of demonstrating variation within the transgender and gender diverse (TGD) community can inform intervention efforts to improve health equity. A population-based sample of TGD youth was used to examine associations between perceived gender presentation, bullying victimization, and emotional distress using two data analysis approaches. Secondary data analysis of the Minnesota Student Survey included 2168 9th and 11th graders who identified as "transgender, genderqueer, genderfluid, or unsure about their gender identity." Youth reported their biological sex, how others perceived their gender presentation, experiences of four forms of bullying victimization, and four measures of emotional distress. Logistic regression and multifactor analysis of variance (ANOVA) were used to compare and contrast two analysis approaches. Logistic regressions indicated that TGD youth perceived as more gender incongruent had higher odds of bullying victimization and emotional distress relative to those perceived as very congruent with their biological sex. Multifactor ANOVAs demonstrated more variable patterns and allowed for comparisons of each perceived presentation group with all other groups, reflecting nuances that exist within TGD youth. Researchers should adopt data analysis strategies that allow for comparisons of all perceived gender presentation categories rather than assigning a reference group. Those working with TGD youth should be particularly attuned to youth perceived as gender incongruent as they may be more likely to experience bullying victimization and emotional distress.
Reducing the number of reconstructions needed for estimating channelized observer performance
NASA Astrophysics Data System (ADS)
Pineda, Angel R.; Miedema, Hope; Brenner, Melissa; Altaf, Sana
2018-03-01
A challenge for task-based optimization is the time required for each reconstructed image in applications where reconstructions are time consuming. Our goal is to reduce the number of reconstructions needed to estimate the area under the receiver operating characteristic curve (AUC) of the infinitely-trained optimal channelized linear observer. We explore the use of classifiers which either do not invert the channel covariance matrix or do feature selection. We also study the assumption that multiple low contrast signals in the same image of a non-linear reconstruction do not significantly change the estimate of the AUC. We compared the AUC of several classifiers (Hotelling, logistic regression, logistic regression using Firth bias reduction and the least absolute shrinkage and selection operator (LASSO)) with a small number of observations both for normal simulated data and images from a total variation reconstruction in magnetic resonance imaging (MRI). We used 10 Laguerre-Gauss channels and the Mann-Whitney estimator for AUC. For this data, our results show that at small sample sizes feature selection using the LASSO technique can decrease bias of the AUC estimation with increased variance and that for large sample sizes the difference between these classifiers is small. We also compared the use of multiple signals in a single reconstructed image to reduce the number of reconstructions in a total variation reconstruction for accelerated imaging in MRI. We found that AUC estimation using multiple low contrast signals in the same image resulted in similar AUC estimates as doing a single reconstruction per signal leading to a 13x reduction in the number of reconstructions needed.
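The Mann-Whitney estimator of AUC mentioned in the abstract is the fraction of (signal, noise) score pairs that the classifier ranks correctly, with ties counted as one half. The following is a minimal sketch; the function name and the example scores are hypothetical and do not come from the study.

```python
def mann_whitney_auc(signal_scores, noise_scores):
    """Estimate AUC as P(signal score > noise score) + 0.5 * P(tie)."""
    if not signal_scores or not noise_scores:
        raise ValueError("both score lists must be non-empty")
    wins = 0.0
    for s in signal_scores:
        for n in noise_scores:
            if s > n:
                wins += 1.0   # correctly ordered pair
            elif s == n:
                wins += 0.5   # tie counts as half
    return wins / (len(signal_scores) * len(noise_scores))


# Hypothetical classifier outputs for signal-present and signal-absent images.
signal = [0.9, 0.8, 0.4]
noise = [0.5, 0.3, 0.2]
print(mann_whitney_auc(signal, noise))
```

The double loop costs time proportional to the product of the two sample sizes, which is acceptable for the small numbers of observations the abstract is concerned with; a rank-based version would scale better for large samples.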
Tang, Yi; Sorenson, Jeff; Lanspa, Michael; Grissom, Colin K; Mathews, V J; Brown, Samuel M
2017-06-17
Severe sepsis and septic shock are often lethal syndromes, in which the autonomic nervous system may fail to maintain adequate blood pressure. Heart rate variability has been associated with outcomes in sepsis. Whether systolic blood pressure (SBP) variability is associated with clinical outcomes in septic patients is unknown. The purpose of this study is to determine whether variability in SBP correlates with vasopressor independence and mortality among septic patients. We prospectively studied patients with severe sepsis or septic shock, admitted to an intensive care unit (ICU) with an arterial catheter. We analyzed SBP variability on the first 5-min window immediately following ICU admission. We performed principal component analysis of multidimensional complexity, and used the first principal component (PC 1) as input for Firth logistic regression, controlling for mean systolic pressure (SBP) in the primary analyses, and Acute Physiology and Chronic Health Evaluation (APACHE) II score or NEE dose in the ancillary analyses. Prespecified outcomes were vasopressor independence at 24 h (primary), and 28-day mortality (secondary). We studied 51 patients, 51% of whom achieved vasopressor independence at 24 h. Ten percent died at 28 days. PC 1 represented 26% of the variance in complexity measures. PC 1 was not associated with vasopressor independence on Firth logistic regression (OR 1.04; 95% CI: 0.93-1.16; p = 0.54), but was associated with 28-day mortality (OR 1.16, 95% CI: 1.01-1.35, p = 0.040). Early SBP variability appears to be associated with 28-day mortality in patients with severe sepsis and septic shock.
Tull, Eugene S; Taylor, Jerome
2014-01-01
This investigation among Afro-Caribbean adults in the United States Virgin Islands (USVI) examined whether acculturation and preference for dining out accounted for variation by nativity in the frequency of fast food restaurant use, and assessed the relationship of fast food restaurant use to body weight and insulin resistance. A randomly selected sample of 679 Afro-Caribbean adults (aged ≥ 20 years), including 436 who were foreign-born and 243 who were native-born, were recruited on the island of St. Croix, USVI. Information on demographic characteristics, level of acculturation and dietary practices were obtained from participants by questionnaire. Fasting blood samples, which were measured for glucose and insulin, and anthropometric measurements were also collected from participants. Insulin resistance was estimated by the homeostasis model assessment (HOMA). Relationships between variables were assessed with analysis of variance and logistic regression analyses. In bivariate analyses, birth in the USVI, younger age, being single, greater preference for dining out and higher levels of education and acculturation were significantly (P < .05) associated with fast food restaurant use. In multivariate logistic regression analyses, birth in the USVI, younger age and preference for dining out were independently associated with frequent (≥ 2 days/week) fast food restaurant use. The mean level of HOMA insulin resistance among participants increased significantly with more frequent use of fast food restaurants. Among Afro-Caribbean adults in the USVI, fast food restaurant use is positively associated with insulin resistance and varies by nativity, but acculturation does not account for this variation.
[Predictors of hospitalization for alcohol use disorder in Korean men].
Hong, Hae-Sook; Park, Jeong-Eun; Park, Wan-Ju
2014-10-01
This study was done to identify the patterns and significant predictors influencing hospitalization of Korean men for alcohol use disorder. A descriptive study design was utilized. Data were collected using self-report questionnaires from 143 inpatients who met the DSM-5 alcohol use disorder criteria and were receiving treatment and 157 social drinkers living in the community. The questionnaires included Alcohol Use Disorders Identification Test (AUDIT), Alcohol Problems, Alcohol Expectancy Questionnaire (AEQ), Life Position, and The Korean version of the Children of Alcoholics Screening Test (CAST-K). Data were analyzed using descriptive statistics, t-test, χ²-test, F-test, Pearson correlation coefficients, and logistic regression with forward stepwise selection. AUDIT had significant correlations with alcohol problems, alcohol expectancy, and parents' alcoholism. In logistic regression, factors significantly affecting hospitalization were being divorced (OR=4.18, 95% CI: 1.28-13.71), graduation from elementary school (OR=28.50, 95% CI: 8.07-100.69), middle school (OR=6.66, 95% CI: 2.21-20.09), high school (OR=6.31, 95% CI: 2.59-15.36), drinking alone (OR=9.07, 95% CI: 1.78-46.17), family history of alcoholism (OR=2.41, 95% CI: 1.11-5.25), interpersonal relationship problems (OR=1.28, 95% CI: 1.17-1.41), and sexual enhancement of alcohol expectancy (OR=0.83, 95% CI: 0.72-0.94), which accounted for 53% of the variance. Results suggest that interpersonal relationship programs and customized cognitive programs for social drinkers in the community are needed to decrease alcohol-related hospitalization in Korean men.
Kim, Minjung; Lamont, Andrea E; Jaki, Thomas; Feaster, Daniel; Howe, George; Van Horn, M Lee
2016-06-01
Regression mixture models are a novel approach to modeling the heterogeneous effects of predictors on an outcome. In the model-building process, often residual variances are disregarded and simplifying assumptions are made without thorough examination of the consequences. In this simulation study, we investigated the impact of an equality constraint on the residual variances across latent classes. We examined the consequences of constraining the residual variances on class enumeration (finding the true number of latent classes) and on the parameter estimates, under a number of different simulation conditions meant to reflect the types of heterogeneity likely to exist in applied analyses. The results showed that bias in class enumeration increased as the difference in residual variances between the classes increased. Also, an inappropriate equality constraint on the residual variances greatly impacted on the estimated class sizes and showed the potential to greatly affect the parameter estimates in each class. These results suggest that it is important to make assumptions about residual variances with care and to carefully report what assumptions are made.
ERIC Educational Resources Information Center
Shih, Ching-Lin; Liu, Tien-Hsiang; Wang, Wen-Chung
2014-01-01
The simultaneous item bias test (SIBTEST) method regression procedure and the differential item functioning (DIF)-free-then-DIF strategy are applied to the logistic regression (LR) method simultaneously in this study. These procedures are used to adjust the effects of matching true score on observed score and to better control the Type I error…
Smeers, Inge; Decorte, Ronny; Van de Voorde, Wim; Bekaert, Bram
2018-05-01
DNA methylation is a promising biomarker for forensic age prediction. A challenge that has emerged in recent studies is the fact that prediction errors become larger with increasing age due to interindividual differences in epigenetic ageing rates. This phenomenon of non-constant variance or heteroscedasticity violates an assumption of the often used method of ordinary least squares (OLS) regression. The aim of this study was to evaluate alternative statistical methods that do take heteroscedasticity into account in order to provide more accurate, age-dependent prediction intervals. A weighted least squares (WLS) regression is proposed as well as a quantile regression model. Their performances were compared against an OLS regression model based on the same dataset. Both models provided age-dependent prediction intervals which account for the increasing variance with age, but WLS regression performed better in terms of success rate in the current dataset. However, quantile regression might be a preferred method when dealing with a variance that is not only non-constant, but also not normally distributed. Ultimately the choice of which model to use should depend on the observed characteristics of the data. Copyright © 2018 Elsevier B.V. All rights reserved.
Access disparities to Magnet hospitals for patients undergoing neurosurgical operations
Missios, Symeon; Bekelis, Kimon
2017-01-01
Background Centers of excellence focusing on quality improvement have demonstrated superior outcomes for a variety of surgical interventions. We investigated the presence of access disparities to hospitals recognized by the Magnet Recognition Program of the American Nurses Credentialing Center (ANCC) for patients undergoing neurosurgical operations. Methods We performed a cohort study of all neurosurgery patients who were registered in the New York Statewide Planning and Research Cooperative System (SPARCS) database from 2009–2013. We examined the association of African-American race and lack of insurance with Magnet status hospitalization for neurosurgical procedures. A mixed effects propensity adjusted multivariable regression analysis was used to control for confounding. Results During the study period, 190,535 neurosurgical patients met the inclusion criteria. Using a multivariable logistic regression, we demonstrate that African-Americans had lower admission rates to Magnet institutions (OR 0.62; 95% CI, 0.58–0.67). This persisted in a mixed effects logistic regression model (OR 0.77; 95% CI, 0.70–0.83) to adjust for clustering at the patient county level, and a propensity score adjusted logistic regression model (OR 0.75; 95% CI, 0.69–0.82). Additionally, lack of insurance was associated with lower admission rates to Magnet institutions (OR 0.71; 95% CI, 0.68–0.73), in a multivariable logistic regression model. This persisted in a mixed effects logistic regression model (OR 0.72; 95% CI, 0.69–0.74), and a propensity score adjusted logistic regression model (OR 0.72; 95% CI, 0.69–0.75). Conclusions Using a comprehensive all-payer cohort of neurosurgery patients in New York State we identified an association of African-American race and lack of insurance with lower rates of admission to Magnet hospitals. PMID:28684152
Adjusting for Confounding in Early Postlaunch Settings: Going Beyond Logistic Regression Models.
Schmidt, Amand F; Klungel, Olaf H; Groenwold, Rolf H H
2016-01-01
Postlaunch data on medical treatments can be analyzed to explore adverse events or relative effectiveness in real-life settings. These analyses are often complicated by the number of potential confounders and the possibility of model misspecification. We conducted a simulation study to compare the performance of logistic regression, propensity score, disease risk score, and stabilized inverse probability weighting methods to adjust for confounding. Model misspecification was induced in the independent derivation dataset. We evaluated performance using relative bias and confidence interval coverage of the true effect, among other metrics. At low events per coefficient (1.0 and 0.5), the logistic regression estimates had a large relative bias (greater than -100%). Bias of the disease risk score estimates was at most 13.48% and 18.83%. For the propensity score model, this was 8.74% and >100%, respectively. At events per coefficient of 1.0 and 0.5, inverse probability weighting frequently failed or reduced to a crude regression, resulting in biases of -8.49% and 24.55%. Coverage of logistic regression estimates became less than the nominal level at events per coefficient ≤5. For the disease risk score, inverse probability weighting, and propensity score, coverage became less than nominal at events per coefficient ≤2.5, ≤1.0, and ≤1.0, respectively. Bias of misspecified disease risk score models was 16.55%. In settings with low events/exposed subjects per coefficient, disease risk score methods can be useful alternatives to logistic regression models, especially when propensity score models cannot be used. Despite better performance of disease risk score methods than logistic regression and propensity score models in small events per coefficient settings, bias and coverage still deviated from nominal.
Pfeiffer, R M; Riedl, R
2015-08-15
We assess the asymptotic bias of estimates of exposure effects conditional on covariates when summary scores of confounders, instead of the confounders themselves, are used to analyze observational data. First, we study regression models for cohort data that are adjusted for summary scores. Second, we derive the asymptotic bias for case-control studies when cases and controls are matched on a summary score, and then analyzed either using conditional logistic regression or by unconditional logistic regression adjusted for the summary score. Two scores, the propensity score (PS) and the disease risk score (DRS) are studied in detail. For cohort analysis, when regression models are adjusted for the PS, the estimated conditional treatment effect is unbiased only for linear models, or at the null for non-linear models. Adjustment of cohort data for DRS yields unbiased estimates only for linear regression; all other estimates of exposure effects are biased. Matching cases and controls on DRS and analyzing them using conditional logistic regression yields unbiased estimates of exposure effect, whereas adjusting for the DRS in unconditional logistic regression yields biased estimates, even under the null hypothesis of no association. Matching cases and controls on the PS yield unbiased estimates only under the null for both conditional and unconditional logistic regression, adjusted for the PS. We study the bias for various confounding scenarios and compare our asymptotic results with those from simulations with limited sample sizes. To create realistic correlations among multiple confounders, we also based simulations on a real dataset. Copyright © 2015 John Wiley & Sons, Ltd.
Nie, Z Q; Ou, Y Q; Zhuang, J; Qu, Y J; Mai, J Z; Chen, J M; Liu, X Q
2016-05-01
Conditional and unconditional logistic regression analyses are commonly used in case-control studies, whereas the Cox proportional hazards model is often used in survival data analysis. Most of the literature refers only to main-effect models; however, generalized linear models differ from general linear models, and interaction comprises both multiplicative and additive interaction. The former has only statistical significance, whereas the latter has biological significance. In this paper, macros were written in SAS 9.4 to calculate the contrast ratio, the attributable proportion due to interaction, and the synergy index alongside the interaction terms of logistic and Cox regression, and Wald, delta, and profile likelihood confidence intervals were used to evaluate additive interaction, as a reference for big data analysis in clinical epidemiology and for the analysis of genetic multiplicative and additive interactions.
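The three additive-interaction measures named in the abstract above can be sketched as follows. This is a minimal illustration and not the authors' SAS macros. It assumes that the "contrast ratio" corresponds to the relative excess risk due to interaction (RERI), and that the inputs are relative risks (or odds or hazard ratios used as approximations) for exposure to the first factor only, the second factor only, and both, each relative to the doubly unexposed group. The function name and the example values are hypothetical.

```python
def additive_interaction(rr10, rr01, rr11):
    """Point estimates of three additive-interaction measures.

    rr10: relative risk for factor A only
    rr01: relative risk for factor B only
    rr11: relative risk for A and B together
    All are relative to the group exposed to neither factor.
    """
    # Relative excess risk due to interaction; 0 means no additive interaction.
    reri = rr11 - rr10 - rr01 + 1.0
    # Attributable proportion due to interaction; 0 means no additive interaction.
    ap = reri / rr11
    # Synergy index; 1 means no additive interaction.
    # Undefined when rr10 + rr01 == 2 (division by zero).
    s = (rr11 - 1.0) / ((rr10 - 1.0) + (rr01 - 1.0))
    return reri, ap, s


# Hypothetical ratios taken from a fitted model with an interaction term.
reri, ap, s = additive_interaction(rr10=2.0, rr01=3.0, rr11=7.0)
print(reri, ap, s)
```

Confidence intervals (Wald, delta method, or profile likelihood, as in the abstract) require the covariance of the fitted coefficients and are outside the scope of this sketch.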
Parameter estimation in Cox models with missing failure indicators and the OPPERA study.
Brownstein, Naomi C; Cai, Jianwen; Slade, Gary D; Bair, Eric
2015-12-30
In a prospective cohort study, examining all participants for incidence of the condition of interest may be prohibitively expensive. For example, the "gold standard" for diagnosing temporomandibular disorder (TMD) is a physical examination by a trained clinician. In large studies, examining all participants in this manner is infeasible. Instead, it is common to use questionnaires to screen for incidence of TMD and perform the "gold standard" examination only on participants who screen positively. Unfortunately, some participants may leave the study before receiving the "gold standard" examination. Within the framework of survival analysis, this results in missing failure indicators. Motivated by the Orofacial Pain: Prospective Evaluation and Risk Assessment (OPPERA) study, a large cohort study of TMD, we propose a method for parameter estimation in survival models with missing failure indicators. We estimate the probability of being an incident case for those lacking a "gold standard" examination using logistic regression. These estimated probabilities are used to generate multiple imputations of case status for each missing examination that are combined with observed data in appropriate regression models. The variance introduced by the procedure is estimated using multiple imputation. The method can be used to estimate both regression coefficients in Cox proportional hazard models as well as incidence rates using Poisson regression. We simulate data with missing failure indicators and show that our method performs as well as or better than competing methods. Finally, we apply the proposed method to data from the OPPERA study. Copyright © 2015 John Wiley & Sons, Ltd.
2011-01-01
Background To facilitate access to the prevention of mother-to-child HIV transmission (PMTCT) services, HIV counselling and testing are offered routinely in antenatal care settings. Focusing on a cohort of pregnant women attending public and private antenatal care facilities, this study applied an extended version of the Theory of Planned Behaviour (TPB) to explain intended and actual HIV testing. Methods A sequential exploratory mixed methods study was conducted in Addis Ababa in 2009. The study involved first-time antenatal attendees from public and private health care facilities. Three Focus Group Discussions were conducted to inform the TPB questionnaire. A total of 3033 women completed the baseline TPB interviews, including attitudes, subjective norms, perceived behavioural control and intention with respect to HIV testing, whereas 2928 completed actual HIV testing at follow up. Data were analysed using descriptive statistics, Chi-square tests, Fisher's Exact tests, internal consistency reliability, Pearson's correlation, linear regression, logistic regression and epidemiological indices. P-values < 0.05 were considered significant and a 95% Confidence Interval (CI) was used for the odds ratio. Results The TPB explained 9.2% and 16.4% of the variance in intention among public and private health facility attendees. Intention and perceived barriers explained 2.4% and external variables explained 7% of the total variance in HIV testing. Positive and negative predictive values of intention were 96% and 6% respectively. Across both groups, subjective norm explained a substantial amount of variance in intention, followed by attitudes. Women intended to test for HIV if they perceived social support and anticipated positive consequences following test performance. Type of counselling did not modify the link between intended and actual HIV testing.
Conclusion The TPB explained a substantial amount of variance in intention to test but was less sufficient in explaining actual HIV testing. This low explanatory power of TPB was mainly due to the large proportion of low intenders that ended up being tested contrary to their intention before entering the antenatal clinic. PMTCT programs should strengthen women's intention through social approval and information that testing will provide positive consequences for them. However, women's rights to opt-out should be emphasized in any attempt to improve the PMTCT programs. PMID:21851613
NASA Astrophysics Data System (ADS)
Rock, N. M. S.; Duffy, T. R.
REGRES allows a range of regression equations to be calculated for paired sets of data values in which both variables are subject to error (i.e. neither is the "independent" variable). Nonparametric regressions, based on medians of all possible pairwise slopes and intercepts, are treated in detail. Estimated slopes and intercepts are output, along with confidence limits, Spearman and Kendall rank correlation coefficients. Outliers can be rejected with user-determined stringency. Parametric regressions can be calculated for any value of λ (the ratio of the variances of the random errors for y and x), including: (1) major axis (λ = 1); (2) reduced major axis (λ = variance of y/variance of x); (3) Y on X (λ = infinity); or (4) X on Y (λ = 0) solutions. Pearson linear correlation coefficients also are output. REGRES provides an alternative to conventional isochron assessment techniques where bivariate normal errors cannot be assumed, or weighting methods are inappropriate.
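The four parametric cases listed in the abstract above can be reproduced with a single closed-form slope. The following is a minimal sketch and not the REGRES program itself. It assumes the standard errors-in-variables (Deming) slope with λ defined, as in the abstract, as the ratio of the error variance of y to that of x. The function name and the data are hypothetical.

```python
import math


def errors_in_variables_fit(x, y, lam):
    """Slope and intercept of a line when both x and y carry error.

    lam is the ratio (error variance of y) / (error variance of x).
    lam = 1 gives the major axis, lam = var(y)/var(x) the reduced
    major axis, lam = inf ordinary Y-on-X regression, and lam = 0
    X-on-Y regression. Requires a non-zero sample covariance.
    """
    n = len(x)
    mx = sum(x) / n
    my = sum(y) / n
    sxx = sum((a - mx) ** 2 for a in x) / (n - 1)
    syy = sum((b - my) ** 2 for b in y) / (n - 1)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y)) / (n - 1)
    if math.isinf(lam):
        # Limit of the general formula: all error is assigned to y.
        slope = sxy / sxx
    else:
        d = syy - lam * sxx
        slope = (d + math.sqrt(d * d + 4.0 * lam * sxy * sxy)) / (2.0 * sxy)
    intercept = my - slope * mx
    return slope, intercept


# Hypothetical paired measurements.
xs = [1.0, 2.0, 3.0, 4.0]
ys = [2.1, 3.9, 6.2, 7.8]
for lam in (0.0, 1.0, float("inf")):
    print(lam, errors_in_variables_fit(xs, ys, lam))
```

The nonparametric option described in the abstract (medians of all pairwise slopes) is a separate technique and is not covered here.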
No rationale for 1 variable per 10 events criterion for binary logistic regression analysis.
van Smeden, Maarten; de Groot, Joris A H; Moons, Karel G M; Collins, Gary S; Altman, Douglas G; Eijkemans, Marinus J C; Reitsma, Johannes B
2016-11-24
Ten events per variable (EPV) is a widely advocated minimal criterion for sample size considerations in logistic regression analysis. Of three previous simulation studies that examined this minimal EPV criterion only one supports the use of a minimum of 10 EPV. In this paper, we examine the reasons for substantial differences between these extensive simulation studies. The current study uses Monte Carlo simulations to evaluate small sample bias, coverage of confidence intervals and mean square error of logit coefficients. Logistic regression models fitted by maximum likelihood and a modified estimation procedure, known as Firth's correction, are compared. The results show that besides EPV, the problems associated with low EPV depend on other factors such as the total sample size. It is also demonstrated that simulation results can be dominated by even a few simulated data sets for which the prediction of the outcome by the covariates is perfect ('separation'). We reveal that different approaches for identifying and handling separation leads to substantially different simulation results. We further show that Firth's correction can be used to improve the accuracy of regression coefficients and alleviate the problems associated with separation. The current evidence supporting EPV rules for binary logistic regression is weak. Given our findings, there is an urgent need for new research to provide guidance for supporting sample size considerations for binary logistic regression analysis.
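Two quantities discussed in the abstract above, EPV and separation, can be made concrete with a short sketch. This is an illustration only and not the authors' simulation code. It assumes EPV is the size of the smaller outcome group divided by the number of estimated coefficients (excluding the intercept), and it checks complete separation for a single continuous covariate only. Both function names are hypothetical.

```python
def events_per_variable(outcomes, n_coefficients):
    """EPV for a binary outcome coded 0/1.

    Assumption: 'events' means the smaller of the two outcome groups.
    """
    events = sum(outcomes)
    non_events = len(outcomes) - events
    return min(events, non_events) / n_coefficients


def is_completely_separated(x, outcomes):
    """True if one covariate perfectly splits the 0 and 1 outcomes.

    In that case the maximum likelihood logit coefficient does not
    exist (it diverges), which is the 'separation' problem.
    """
    x0 = [xi for xi, yi in zip(x, outcomes) if yi == 0]
    x1 = [xi for xi, yi in zip(x, outcomes) if yi == 1]
    if not x0 or not x1:
        raise ValueError("both outcome classes must be present")
    return max(x0) < min(x1) or max(x1) < min(x0)


# Hypothetical small sample: 2 events, 3 non-events, 2 coefficients.
y = [0, 0, 0, 1, 1]
print(events_per_variable(y, 2))
print(is_completely_separated([1, 2, 3, 4, 5], y))
```

With several covariates, separation can occur along a linear combination of them, which this single-covariate check would not detect.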
Li, Yi; Tseng, Yufeng J.; Pan, Dahua; Liu, Jianzhong; Kern, Petra S.; Gerberick, G. Frank; Hopfinger, Anton J.
2008-01-01
Currently, the only validated methods to identify skin sensitization effects are in vivo models, such as the Local Lymph Node Assay (LLNA) and guinea pig studies. There is a tremendous need, in particular due to novel legislation, to develop animal alternatives, e.g. Quantitative Structure-Activity Relationship (QSAR) models. Here, QSAR models for skin sensitization using LLNA data have been constructed. The descriptors used to generate these models are derived from the 4D-molecular similarity paradigm and are referred to as universal 4D-fingerprints. A training set of 132 structurally diverse compounds and a test set of 15 structurally diverse compounds were used in this study. The statistical methodologies used to build the models are logistic regression (LR), and partial least square coupled logistic regression (PLS-LR), which prove to be effective tools for studying skin sensitization measures expressed in the two categorical terms of sensitizer and non-sensitizer. QSAR models with low values of the Hosmer-Lemeshow goodness-of-fit statistic, χ²_HL, are significant and predictive. For the training set, the cross-validated prediction accuracy of the logistic regression models ranges from 77.3% to 78.0%, while that of PLS-logistic regression models ranges from 87.1% to 89.4%. For the test set, the prediction accuracy of logistic regression models ranges from 80.0% to 86.7%, while that of PLS-logistic regression models ranges from 73.3% to 80.0%. The QSAR models are made up of 4D-fingerprints related to aromatic atoms, hydrogen bond acceptors and negatively partially charged atoms. PMID:17226934
MODELING SNAKE MICROHABITAT FROM RADIOTELEMETRY STUDIES USING POLYTOMOUS LOGISTIC REGRESSION
Multivariate analysis of snake microhabitat has historically used techniques that were derived under assumptions of normality and common covariance structure (e.g., discriminant function analysis, MANOVA). In this study, polytomous logistic regression (PLR which does not require ...
Age-related risk factors with nonfatal traffic accidents in urban areas in Maringá, Paraná, Brazil.
de Melo, Willian Augusto; Alarcão, Ana Carolina Jacinto; de Oliveira, Analice Paula Rocha; Pelloso, Sandra Marisa; Carvalho, Maria Dalva de Barros
2017-02-17
The present study aimed to analyze the factors associated with the occurrence of nonfatal traffic accidents regarding age. A retrospective, transversal, and analytical study was carried out in the municipality of Maringá, Paraná, Brazil, based on data from Boletins de Ocorrência de Acidente de Trânsito ("Police Occurrence Bulletins"; BOATs). Following probability sampling, the sociodemographic aspects, logistics, environmental conditions, and time of occurrence of 418 cases of accidents were analyzed. The age of the victims was considered to be the dependent variable. The data were analyzed using descriptive statistics and bivariate, multivariate, and variance analysis, considering a confidence interval of 95% and a significance level of 5% (P <.05). Results revealed that young people (15-29 years) were twice as likely to be hospitalized due to severe injuries. Young motorcyclists had a 2.5 times greater chance of suffering accidents (P <.001); the use of other vehicles such as cars, bicycles, buses, and trucks represented a protective factor for this group (P <.05). Multiple logistic regression revealed that the main predictors for the occurrence of accidents were being single, having over 8 years of education, having had a driver's license for less than 3 years, roads with low luminosity, and driving at night. Demographic, environmental, and logistical factors were associated with morbidity due to traffic accidents among young people. These results challenge society and policy makers to create more effective strategies to minimize this serious public health problem.
[Parenting styles and their relationship with hyperactivity].
Raya Trenas, Antonio Félix; Herreruzo Cabrera, Javier; Pino Osuna, María José
2008-11-01
The present study aims to determine the relationship among factors that make up the parenting styles according to the PCRI (Parent-Child Relationship Inventory) and hyperactivity reported by parents through the BASC (Behaviour Assessment System for Children). We selected a sample of 32 children between 3 and 14 years old (23 male and 9 female) with risk scores in hyperactivity and another similar group with low scores in hyperactivity. After administering both instruments to the parents, we carried out a binomial logistic regression analysis which resulted in a prediction model for 84.4% of the sample, made up of the PCRI factors: fathers' involvement, communication and role orientation, mothers' parental support, and both parents' limit-setting and autonomy. Moreover, our analysis of the variance produced significant differences in the support perceived by the fathers and mothers of both groups. Lastly, the utility of results to propose intervention strategies within the family based on an authoritative style is discussed.
Low iron stores: a risk factor for excessive hair loss in non-menopausal women.
Deloche, Claire; Bastien, Philippe; Chadoutaud, Stéphanie; Galan, Pilar; Bertrais, Sandrine; Hercberg, Serge; de Lacharrière, Olivier
2007-01-01
Iron deficiency has been suspected to represent one of the possible causes of excessive hair loss in women. The aim of our study was to assess this relationship in a very large population of 5110 women aged between 35 and 60 years. Hair loss was evaluated using a standardized questionnaire sent to all volunteers. The iron status was assessed by a serum ferritin assay carried out in each volunteer. Multivariate analysis allowed us to identify three categories: "absence of hair loss" (43%), "moderate hair loss" (48%) and "excessive hair loss" (9%). Among the women affected by excessive hair loss, a larger proportion of women (59%) had low iron stores (< 40 microg/L) compared to the remainder of the population (48%). Analysis of variance and logistic regression show that a low iron store represents a risk factor for hair loss in non-menopausal women.
Regional patterns and correlates of HIV voluntary counselling and testing among youths in Nigeria.
Nwachukwu, Chukwuemeka E; Odimegwu, Clifford
2011-06-01
Prevalence of Voluntary Counselling and Testing (VCT) for HIV among young people in Nigeria is low, with implications for epidemic control. Using the 2003 Nigerian National Demographic and Health Survey, we examined the regional prevalence, pattern and correlates of VCT for HIV among youths aged 15 to 24 in Nigeria. Analysis was based on 3573 (out of 11,050) observations, using a logistic regression model to estimate the effects of identified predictors of volunteering for HIV testing. Results show that national prevalence of VCT is low (2.6%), with regional variations. Generally, the critical factors associated with VCT uptake are age, sex, education, wealth index and risk perception, with the factors differing between the North (sex, education, religion, occupation and risk perception) and the South (age and education). It is recommended that Nigerian HIV programmers should introduce evidence-based youth programmes to increase the uptake of VCT, with differing approaches across the regions.
Yeung, Pui-Sze; Ho, Connie Suk-Han; Chan, David Wai-Ock; Chung, Kevin Kien-Hoa
2014-05-01
To identify the indicators of persistent reading difficulties among Chinese readers in early elementary grades, the performance of three groups of Chinese children with different reading trajectories ('persistent poor word readers', 'improved poor word readers' and 'skilled word readers') in reading-related measures was analysed in a 3-year longitudinal study. The three groups were classified according to their performance in a standardized Chinese word reading test in Grade 1 and Grade 4. Results of analysis of variance and logistic regression on the reading-related measures revealed that rapid naming and syntactic skills were important indicators of early word reading difficulty. Syntactic skills and morphological awareness were possible markers of persistent reading problems. Chinese persistent poor readers did not differ significantly from skilled readers on the measures of phonological skills. Copyright © 2014 John Wiley & Sons, Ltd.
Coutinho, Letícia Maria Silva; Matijasevich, Alícia; Scazufca, Márcia; Menezes, Paulo Rossi
2014-09-01
Social context can play an important role in the etiology and prevalence of mental disorders. The aim of the present study was to investigate risk factors for common mental disorders (CMD), considering different contextual levels: individual, household, and census tract. The study used a population-based sample of 2,366 respondents from the São Paulo Ageing & Health Study. Presence of CMD was identified by the SRQ-20. Sex, age, education, and occupation were individual characteristics associated with prevalence of CMD. Multilevel logistic regression models showed that part of the variance in prevalence of CMD was associated with the household level, showing associations between crowding, family income, and CMD, even after controlling for individual characteristics. These results suggest that characteristics of the environment where people live can influence their mental health status.
Duncan, Amie W; Bishop, Somer L
2015-01-01
Daily living skills standard scores on the Vineland Adaptive Behavior Scales-2nd edition were examined in 417 adolescents from the Simons Simplex Collection. All participants had at least average intelligence and a diagnosis of autism spectrum disorder. Descriptive statistics and binary logistic regressions were used to examine the prevalence and predictors of a "daily living skills deficit," defined as below average daily living skills in the context of average intelligence quotient. Approximately half of the adolescents were identified as having a daily living skills deficit. Autism symptomatology, intelligence quotient, maternal education, age, and sex accounted for only 10% of the variance in predicting a daily living skills deficit. Identifying factors associated with better or worse daily living skills may help shed light on the variability in adult outcome in individuals with autism spectrum disorder with average intelligence. © The Author(s) 2013.
Influence of perceived and actual neighbourhood disorder on common mental illness.
Polling, C; Khondoker, M; Hatch, S L; Hotopf, M
2014-06-01
Fear of crime and perceived neighbourhood disorder have been linked to common mental illness (CMI). However, few UK studies have also considered the experience of crime at the individual and neighbourhood level. This study aims to identify individual and local area factors associated with increased perceived neighbourhood disorder and test associations between CMI and individuals' perceptions of disorder in their neighbourhoods, personal experiences of crime and neighbourhood crime rates. A cross-sectional survey was conducted of 1,698 adults living in 1,075 households in Lambeth and Southwark, London. CMI was assessed using the Revised Clinical Interview Schedule. Data were analysed using multilevel logistic regression with neighbourhood defined as lower super output area. Individuals who reported neighbourhood disorder were more likely to suffer CMI (OR 2.12) as were those with individual experience of crime. These effects remained significant when individual characteristics were controlled for. While 14 % of the variance in perceived neighbourhood disorder occurred at the neighbourhood level, there was no significant variance at this level for CMI. Perceived neighbourhood disorder is more common in income-deprived areas and individuals who are unemployed. Worry about one's local area and individual experience of crime are strongly and independently associated with CMI, but neighbourhood crime rates do not appear to impact on mental health.
Vina, Andres; Peters, Albert J.; Ji, Lei
2003-01-01
There is a global concern about the increase in atmospheric concentrations of greenhouse gases. One method being discussed to encourage greenhouse gas mitigation efforts is based on a trading system whereby carbon emitters can buy effective mitigation efforts from farmers implementing conservation tillage practices. These practices sequester carbon from the atmosphere, and such a trading system would require a low-cost and accurate method of verification. Remote sensing technology can offer such a verification technique. This paper is focused on the use of standard image processing procedures applied to a multispectral Ikonos image, to determine whether it is possible to validate that farmers have complied with agreements to implement conservation tillage practices. A principal component analysis (PCA) was performed in order to isolate image variance in cropped fields. Analyses of variance (ANOVA) statistical procedures were used to evaluate the capability of each Ikonos band and each principal component to discriminate between conventional and conservation tillage practices. A logistic regression model was implemented on the principal component most effective in discriminating between conventional and conservation tillage, in order to produce a map of the probability of conventional tillage. The Ikonos imagery, in combination with ground-reference information, proved to be a useful tool for verification of conservation tillage practices.
Statistical Methods for Generalized Linear Models with Covariates Subject to Detection Limits.
Bernhardt, Paul W; Wang, Huixia J; Zhang, Daowen
2015-05-01
Censored observations are a common occurrence in biomedical data sets. Although a large amount of research has been devoted to estimation and inference for data with censored responses, very little research has focused on proper statistical procedures when predictors are censored. In this paper, we consider statistical methods for dealing with multiple predictors subject to detection limits within the context of generalized linear models. We investigate and adapt several conventional methods and develop a new multiple imputation approach for analyzing data sets with predictors censored due to detection limits. We establish the consistency and asymptotic normality of the proposed multiple imputation estimator and suggest a computationally simple and consistent variance estimator. We also demonstrate that the conditional mean imputation method often leads to inconsistent estimates in generalized linear models, while several other methods are either computationally intensive or lead to parameter estimates that are biased or more variable compared to the proposed multiple imputation estimator. In an extensive simulation study, we assess the bias and variability of different approaches within the context of a logistic regression model and compare variance estimation methods for the proposed multiple imputation estimator. Lastly, we apply several methods to analyze the data set from a recently-conducted GenIMS study.
Guček, Nena Kopčavar; Selič, Polona
2018-01-26
This multi-centre cross-sectional study explored associations between prevalence of depression and exposure to intimate partner violence (IPV) at any time in patients' adult life in 471 participants of a previous IPV study. In 2016, 174 interviews were performed, using the Short Form Domestic Violence Exposure Questionnaire, the Zung Scale and questions about behavioural patterns of exposure to IPV. Family doctors reviewed patients' medical charts for the period from 2012 to 2016, using the Domestic Violence Exposure Medical Chart Check List, for conditions which persisted for at least three years. Depression was found to be associated with any exposure to IPV in adult life and was more likely to affect women. In multivariable logistic regression modelling, factors associated with self-rated depression were identified (p < 0.05). Exposure to emotional and physical violence was identified as a risk factor in the first model, explaining 23% of the variance. The second model explained 66% of the variance; past divorce, dysfunctional family relationships and a history of incapacity to work increased the likelihood of depression in patients. Family doctors should consider IPV exposure when detecting depression, since lifetime IPV exposure was found to be 40.4% and 36.9% of depressed patients revealed it.
Wagner, Philippe; Ghith, Nermin; Leckie, George
2016-01-01
Background and Aim Many multilevel logistic regression analyses of “neighbourhood and health” focus on interpreting measures of associations (e.g., odds ratio, OR). In contrast, multilevel analysis of variance is rarely considered. We propose an original stepwise analytical approach that distinguishes between “specific” (measures of association) and “general” (measures of variance) contextual effects. Performing two empirical examples we illustrate the methodology, interpret the results and discuss the implications of this kind of analysis in public health. Methods We analyse 43,291 individuals residing in 218 neighbourhoods in the city of Malmö, Sweden in 2006. We study two individual outcomes (psychotropic drug use and choice of private vs. public general practitioner, GP) for which the relative importance of neighbourhood as a source of individual variation differs substantially. In Step 1 of the analysis, we evaluate the OR and the area under the receiver operating characteristic (AUC) curve for individual-level covariates (i.e., age, sex and individual low income). In Step 2, we assess general contextual effects using the AUC. Finally, in Step 3 the OR for a specific neighbourhood characteristic (i.e., neighbourhood income) is interpreted jointly with the proportional change in variance (i.e., PCV) and the proportion of ORs in the opposite direction (POOR) statistics. Results For both outcomes, information on individual characteristics (Step 1) provided a low discriminatory accuracy (AUC = 0.616 for psychotropic drugs; = 0.600 for choosing a private GP). Accounting for neighbourhood of residence (Step 2) only improved the AUC for choosing a private GP (+0.295 units). High neighbourhood income (Step 3) was strongly associated with choosing a private GP (OR = 3.50) but the PCV was only 11% and the POOR 33%.
Conclusion Applying an innovative stepwise multilevel analysis, we observed that, in Malmö, the neighbourhood context per se had a negligible influence on individual use of psychotropic drugs, but appears to strongly condition individual choice of a private GP. However, the latter was only modestly explained by the socioeconomic circumstances of the neighbourhoods. Our analyses are based on real data and provide useful information for understanding neighbourhood level influences in general and on individual use of psychotropic drugs and choice of GP in particular. However, our primary aim is to illustrate how to perform and interpret a multilevel analysis of individual heterogeneity in social epidemiology and public health. Our study shows that neighbourhood “effects” are not properly quantified by reporting differences between neighbourhood averages but rather by measuring the share of the individual heterogeneity that exists at the neighbourhood level. PMID:27120054
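The proportional change in variance (PCV) reported in Step 3 can be illustrated with a short calculation. The sketch below assumes the common definition PCV = (V_ref - V_adj) / V_ref, where V_ref is the neighbourhood-level variance before the contextual covariate is added and V_adj the variance after. The abstract gives neither the formula nor the variances, so the function name and the input numbers are hypothetical, chosen only so that the result matches the reported 11%.

```python
def proportional_change_in_variance(var_reference: float, var_adjusted: float) -> float:
    """Share of the reference cluster-level variance removed by the added covariate.

    Assumed definition (not stated in the abstract):
        PCV = (var_reference - var_adjusted) / var_reference
    """
    if var_reference <= 0:
        raise ValueError("reference variance must be positive")
    return (var_reference - var_adjusted) / var_reference


if __name__ == "__main__":
    # Hypothetical neighbourhood variances on the logit scale.
    pcv = proportional_change_in_variance(var_reference=1.00, var_adjusted=0.89)
    print(f"PCV = {pcv:.0%}")
```

A small PCV alongside a large OR, as in the abstract, means the covariate shifts the average odds but leaves most between-neighbourhood variance unexplained.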
Brenn, T; Arnesen, E
1985-01-01
For comparative evaluation, discriminant analysis, logistic regression and Cox's model were used to select risk factors for total and coronary deaths among 6595 men aged 20-49 followed for 9 years. Groups with mortality between 5 and 93 per 1000 were considered. Discriminant analysis selected variable sets only marginally different from the logistic and Cox methods which always selected the same sets. A time-saving option, offered for both the logistic and Cox selection, showed no advantage compared with discriminant analysis. Analysing more than 3800 subjects, the logistic and Cox methods consumed, respectively, 80 and 10 times more computer time than discriminant analysis. When including the same set of variables in non-stepwise analyses, all methods estimated coefficients that in most cases were almost identical. In conclusion, discriminant analysis is advocated for preliminary or stepwise analysis, otherwise Cox's method should be used.
ERIC Educational Resources Information Center
DeMars, Christine E.
2009-01-01
The Mantel-Haenszel (MH) and logistic regression (LR) differential item functioning (DIF) procedures have inflated Type I error rates when there are large mean group differences, short tests, and large sample sizes. When there are large group differences in mean score, groups matched on the observed number-correct score differ on true score,…
Reed, Margot O.; Jakubovski, Ewgeni; Johnson, Jessica A.
2017-01-01
Abstract Objective: To explore predictors of 8-year school-based behavioral outcomes in attention-deficit/hyperactivity disorder (ADHD). Methods: We examined potential baseline predictors of school-based behavioral outcomes in children who completed the 8-year follow-up in the multimodal treatment study of children with ADHD. Stepwise logistic regression and receiver operating characteristic (ROC) analysis identified baseline predictors that were associated with a higher risk of truancy, school discipline, and in-school fights. Results: Stepwise regression analysis explained between 8.1% (in-school fights) and 12.0% (school discipline) of the total variance in school-based behavioral outcomes. Logistic regression identified several baseline characteristics that were associated with school-based behavioral difficulties 8 years later, including being male (associated with truancy and school discipline), African American (school discipline, in-school fights), increased conduct disorder (CD) symptoms (truancy), decreased affection from parents (school discipline), ADHD severity (in-school fights), and study site (truancy and school discipline). ROC analyses identified the most discriminative predictors of truancy, school discipline, and in-school fights, which were Aggression and Conduct Problem Scale Total score, family income, and race, respectively. Conclusions: A modest, but nontrivial portion of school-based behavioral outcomes, was predicted by baseline childhood characteristics. Exploratory analyses identified modifiable (lack of paternal involvement, lower parental knowledge of behavioral principles, and parental use of physical punishment), somewhat modifiable (income and having comorbid CD), and nonmodifiable (African American and male) factors that were associated with school-based behavioral difficulties. 
Future research should confirm that the associations between earlier specific parenting behaviors and poor subsequent school-based behavioral outcomes are, indeed, causally related and independent of co-occurring childhood psychopathology. Future research might target increasing paternal involvement and parental knowledge of behavioral principles and reducing use of physical punishment to improve school-based behavioral outcomes in children with ADHD. PMID:28253029
Reed, Margot O; Jakubovski, Ewgeni; Johnson, Jessica A; Bloch, Michael H
2017-05-01
To explore predictors of 8-year school-based behavioral outcomes in attention-deficit/hyperactivity disorder (ADHD). We examined potential baseline predictors of school-based behavioral outcomes in children who completed the 8-year follow-up in the multimodal treatment study of children with ADHD. Stepwise logistic regression and receiver operating characteristic (ROC) analysis identified baseline predictors that were associated with a higher risk of truancy, school discipline, and in-school fights. Stepwise regression analysis explained between 8.1% (in-school fights) and 12.0% (school discipline) of the total variance in school-based behavioral outcomes. Logistic regression identified several baseline characteristics that were associated with school-based behavioral difficulties 8 years later, including being male (associated with truancy and school discipline), African American (school discipline, in-school fights), increased conduct disorder (CD) symptoms (truancy), decreased affection from parents (school discipline), ADHD severity (in-school fights), and study site (truancy and school discipline). ROC analyses identified the most discriminative predictors of truancy, school discipline, and in-school fights, which were Aggression and Conduct Problem Scale Total score, family income, and race, respectively. A modest, but nontrivial portion of school-based behavioral outcomes, was predicted by baseline childhood characteristics. Exploratory analyses identified modifiable (lack of paternal involvement, lower parental knowledge of behavioral principles, and parental use of physical punishment), somewhat modifiable (income and having comorbid CD), and nonmodifiable (African American and male) factors that were associated with school-based behavioral difficulties. 
Future research should confirm that the associations between earlier specific parenting behaviors and poor subsequent school-based behavioral outcomes are, indeed, causally related and independent of co-occurring childhood psychopathology. Future research might target increasing paternal involvement and parental knowledge of behavioral principles and reducing use of physical punishment to improve school-based behavioral outcomes in children with ADHD.
Satellite rainfall retrieval by logistic regression
NASA Technical Reports Server (NTRS)
Chiu, Long S.
1986-01-01
The potential use of logistic regression in rainfall estimation from satellite measurements is investigated. Satellite measurements provide covariate information in terms of radiances from different remote sensors. The logistic regression technique can effectively accommodate many covariates and test their significance in the estimation. The outcome from the logistical model is the probability that the rainrate of a satellite pixel is above a certain threshold. By varying the thresholds, a rainrate histogram can be obtained, from which the mean and the variance can be estimated. A logistical model is developed and applied to rainfall data collected during GATE, using as covariates the fractional rain area and a radiance measurement which is deduced from a microwave temperature-rainrate relation. It is demonstrated that the fractional rain area is an important covariate in the model, consistent with the use of the so-called Area Time Integral in estimating total rain volume in other studies. To calibrate the logistical model, simulated rain fields generated by rainfield models with prescribed parameters are needed. A stringent test of the logistical model is its ability to recover the prescribed parameters of simulated rain fields. A rain field simulation model which preserves the fractional rain area and lognormality of rainrates as found in GATE is developed. A stochastic regression model of branching and immigration whose solutions are lognormally distributed in some asymptotic limits has also been developed.
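The step from threshold exceedance probabilities to a rainrate histogram and its moments can be sketched as follows. This is only an illustration under stated assumptions: the exceedance probabilities are taken as given (in the abstract they would come from the fitted logistic model), each bin is represented by its midpoint, the top bin is closed at an assumed `upper_bound`, and probability mass at or below the lowest threshold is placed at that threshold. The function name and all numbers are hypothetical.

```python
def histogram_moments(thresholds, exceed_probs, upper_bound):
    """Estimate mean and variance of rainrate from P(rate > threshold) values.

    thresholds   -- ascending rainrate thresholds t_0 < t_1 < ... < t_k
    exceed_probs -- exceed_probs[j] = P(rate > thresholds[j]), non-increasing
    upper_bound  -- assumed upper edge closing the last bin
    """
    if len(thresholds) != len(exceed_probs):
        raise ValueError("thresholds and exceed_probs must have equal length")
    edges = list(thresholds) + [upper_bound]
    # Mass at or below the lowest threshold (e.g. no rain when t_0 = 0).
    p_low = 1.0 - exceed_probs[0]
    mean = p_low * thresholds[0]
    second_moment = p_low * thresholds[0] ** 2
    for j in range(len(thresholds)):
        next_prob = exceed_probs[j + 1] if j + 1 < len(exceed_probs) else 0.0
        bin_prob = exceed_probs[j] - next_prob      # P(t_j < rate <= t_{j+1})
        midpoint = 0.5 * (edges[j] + edges[j + 1])  # bin represented by midpoint
        mean += bin_prob * midpoint
        second_moment += bin_prob * midpoint ** 2
    return mean, second_moment - mean ** 2


if __name__ == "__main__":
    # Hypothetical thresholds (mm/h) and exceedance probabilities.
    m, v = histogram_moments([0.0, 2.0, 4.0], [0.5, 0.25, 0.0], upper_bound=6.0)
    print(m, v)
```

Finer threshold spacing reduces the midpoint approximation error, at the cost of fitting more logistic models.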
MEASUREMENT OF WIND SPEED FROM COOLING LAKE THERMAL IMAGERY
DOE Office of Scientific and Technical Information (OSTI.GOV)
Garrett, A; Kurzeja, R; Villa-Aleman, E
2009-01-20
The Savannah River National Laboratory (SRNL) collected thermal imagery and ground truth data at two commercial power plant cooling lakes to investigate the applicability of laboratory empirical correlations between surface heat flux and wind speed, and statistics derived from thermal imagery. SRNL demonstrated in a previous paper [1] that a linear relationship exists between the standard deviation of image temperature and surface heat flux. In this paper, SRNL will show that the skewness of the temperature distribution derived from cooling lake thermal images correlates with instantaneous wind speed measured at the same location. SRNL collected thermal imagery, surface meteorology and water temperatures from helicopters and boats at the Comanche Peak and H. B. Robinson nuclear power plant cooling lakes. SRNL found that decreasing skewness correlated with increasing wind speed, as was the case for the laboratory experiments. Simple linear and orthogonal regression models both explained about 50% of the variance in the skewness - wind speed plots. A nonlinear (logistic) regression model produced a better fit to the data, apparently because the thermal convection and resulting skewness are related to wind speed in a highly nonlinear way in nearly calm and in windy conditions.
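The image statistic discussed above, the skewness of a temperature distribution, can be computed as in the following minimal sketch. It assumes the moment-based (population) skewness m3 / m2^1.5 over a flat list of pixel temperatures; the report does not say which estimator was used, and the function name and sample values are hypothetical.

```python
def sample_skewness(values):
    """Moment-based skewness g1 = m3 / m2**1.5 of a sequence of numbers."""
    n = len(values)
    if n < 3:
        raise ValueError("need at least three values")
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n  # second central moment
    m3 = sum((v - mean) ** 3 for v in values) / n  # third central moment
    if m2 == 0:
        raise ValueError("skewness is undefined for constant data")
    return m3 / m2 ** 1.5


if __name__ == "__main__":
    # Hypothetical pixel temperatures (deg C) with a warm tail.
    print(sample_skewness([30.1, 30.2, 30.2, 30.3, 31.5]))
```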
Support for smoke-free policies in the Cyprus hospitality industry.
Lazuras, Lambros; Savva, Christos S; Talias, Michael A; Soteriades, Elpidoforos S
2015-12-01
The present study used attitudinal and behavioural indicators to measure support for smoke-free policies among employers and employees in the hospitality industry in Cyprus. A representative sample of 600 participants (95 % response rate) completed anonymous structured questionnaires on demographic variables, smoking status, exposure to second-hand smoke at work and related health beliefs, social norms, and smoke-free policy support. Participants were predominantly males (68.3 %), with a mean age of 40 years (SD = 12.69), and 39.7 % were employers/owners of the hospitality venue. Analysis of variance showed that employers and smokers were less supportive of smoke-free policies, as compared to employees and non-smokers. Linear regression models showed that attitudes towards smoke-free policy were predicted by smoking status, SHS exposure and related health beliefs, and social norm variables. Logistic regression analysis showed that willingness to confront a policy violator was predicted by SHS exposure, perceived prevalence of smoker clients, and smoke-free policy attitudes. SHS exposure and related health beliefs, and normative factors should be targeted by interventions aiming to promote policy support in the hospitality industry in Cyprus.
Practical Session: Logistic Regression
NASA Astrophysics Data System (ADS)
Clausel, M.; Grégoire, G.
2014-12-01
An exercise is proposed to illustrate logistic regression. It investigates the different risk factors for the occurrence of coronary heart disease. The exercise was proposed in Chapter 5 of the book by D.G. Kleinbaum and M. Klein, "Logistic Regression", Statistics for Biology and Health, Springer Science Business Media, LLC (2010) and also by D. Chessel and A.B. Dufour in Lyon 1 (see Sect. 6 of http://pbil.univ-lyon1.fr/R/pdf/tdr341.pdf). This example is based on data given in the file evans.txt coming from http://www.sph.emory.edu/dkleinb/logreg3.htm#data.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Ghazali, Amirul Syafiq Mohd; Ali, Zalila; Noor, Norlida Mohd
Multinomial logistic regression is widely used to model the outcomes of a polytomous response variable, a categorical dependent variable with more than two categories. The model assumes that the conditional mean of the dependent categorical variables is the logistic function of an affine combination of predictor variables. Its procedure gives a number of logistic regression models that make specific comparisons of the response categories. When there are q categories of the response variable, the model consists of q-1 logit equations which are fitted simultaneously. The model is validated by variable selection procedures, tests of regression coefficients, a significant test of the overall model, goodness-of-fit measures, and validation of predicted probabilities using odds ratio. This study used the multinomial logistic regression model to investigate obesity and overweight among primary school students in a rural area on the basis of their demographic profiles, lifestyles and on the diet and food intake. The results indicated that obesity and overweight of students are related to gender, religion, sleep duration, time spent on electronic games, breakfast intake in a week, with whom meals are taken, protein intake, and also, the interaction between breakfast intake in a week with sleep duration, and the interaction between gender and protein intake.
NASA Astrophysics Data System (ADS)
Ghazali, Amirul Syafiq Mohd; Ali, Zalila; Noor, Norlida Mohd; Baharum, Adam
2015-10-01
Multinomial logistic regression is widely used to model the outcomes of a polytomous response variable, a categorical dependent variable with more than two categories. The model assumes that the conditional mean of the dependent categorical variables is the logistic function of an affine combination of predictor variables. Its procedure gives a number of logistic regression models that make specific comparisons of the response categories. When there are q categories of the response variable, the model consists of q-1 logit equations which are fitted simultaneously. The model is validated by variable selection procedures, tests of regression coefficients, a significant test of the overall model, goodness-of-fit measures, and validation of predicted probabilities using odds ratio. This study used the multinomial logistic regression model to investigate obesity and overweight among primary school students in a rural area on the basis of their demographic profiles, lifestyles and on the diet and food intake. The results indicated that obesity and overweight of students are related to gender, religion, sleep duration, time spent on electronic games, breakfast intake in a week, with whom meals are taken, protein intake, and also, the interaction between breakfast intake in a week with sleep duration, and the interaction between gender and protein intake.
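The statement that q response categories lead to q-1 logit equations can be made concrete with a small sketch. It assumes the usual baseline-category parameterisation, in which category 0 is the reference with linear predictor fixed at zero and each remaining category k has its own intercept and slopes, so that log(p_k / p_0) equals the k-th linear predictor. The function name and coefficient values are hypothetical and are not taken from the study.

```python
import math


def baseline_category_probabilities(x, coefs):
    """Category probabilities from q-1 logit equations.

    x     -- list of predictor values
    coefs -- list of q-1 (intercept, slopes) pairs, one per non-reference
             category; the reference category has linear predictor 0
    Returns a list of q probabilities, reference category first.
    """
    etas = [b0 + sum(b * xi for b, xi in zip(slopes, x)) for b0, slopes in coefs]
    shift = max([0.0] + etas)  # subtract the maximum for numerical stability
    exps = [math.exp(0.0 - shift)] + [math.exp(eta - shift) for eta in etas]
    total = sum(exps)
    return [e / total for e in exps]


if __name__ == "__main__":
    # Three hypothetical categories (q = 3), hence two logit equations.
    probs = baseline_category_probabilities(
        x=[2.0], coefs=[(0.5, [1.0]), (-1.0, [0.25])]
    )
    print(probs, sum(probs))
```

Fitting the q-1 equations simultaneously, as the abstract describes, guarantees that the resulting probabilities sum to one for every observation.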
The cross-validated AUC for MCP-logistic regression with high-dimensional data.
Jiang, Dingfeng; Huang, Jian; Zhang, Ying
2013-10-01
We propose a cross-validated area under the receiver operating characteristic (ROC) curve (CV-AUC) criterion for tuning parameter selection for penalized methods in sparse, high-dimensional logistic regression models. We use this criterion in combination with the minimax concave penalty (MCP) method for variable selection. The CV-AUC criterion is specifically designed for optimizing the classification performance for binary outcome data. To implement the proposed approach, we derive an efficient coordinate descent algorithm to compute the MCP-logistic regression solution surface. Simulation studies are conducted to evaluate the finite sample performance of the proposed method and its comparison with the existing methods including the Akaike information criterion (AIC), Bayesian information criterion (BIC) or Extended BIC (EBIC). The model selected based on the CV-AUC criterion tends to have a larger predictive AUC and smaller classification error than those with tuning parameters selected using the AIC, BIC or EBIC. We illustrate the application of the MCP-logistic regression with the CV-AUC criterion on three microarray datasets from the studies that attempt to identify genes related to cancers. Our simulation studies and data examples demonstrate that the CV-AUC is an attractive method for tuning parameter selection for penalized methods in high-dimensional logistic regression models.
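Two ingredients named above, the minimax concave penalty and the AUC, can be written down compactly. The sketch below uses the standard MCP form (lam*|b| - b^2/(2*gamma) for |b| <= gamma*lam, and gamma*lam^2/2 beyond that) and the pairwise-comparison form of the AUC. Averaging held-out AUC over folds is an assumption about how a cross-validated AUC could be formed, not the paper's exact definition. All function names are hypothetical, and the coordinate descent fit itself is not shown.

```python
def mcp_penalty(beta, lam, gamma):
    """Minimax concave penalty for one coefficient (lam >= 0, gamma > 0)."""
    if lam < 0 or gamma <= 0:
        raise ValueError("require lam >= 0 and gamma > 0")
    a = abs(beta)
    if a <= gamma * lam:
        return lam * a - a * a / (2.0 * gamma)
    return 0.5 * gamma * lam * lam  # penalty is flat for large coefficients


def auc(scores, labels):
    """AUC as the fraction of (positive, negative) pairs ranked correctly."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def mean_fold_auc(fold_scores, fold_labels):
    """Average held-out AUC across folds (assumed form of a CV-AUC)."""
    values = [auc(s, y) for s, y in zip(fold_scores, fold_labels)]
    return sum(values) / len(values)
```

In a tuning loop, one would compute `mean_fold_auc` for each candidate penalty level and keep the level with the largest value.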
Yoo, Jinho; Kim, Bo-Hyung; Kim, Soo-Hwan; Kim, Yangseok; Yim, Sung-Vin
2016-05-01
The study aimed to identify single nucleotide polymorphisms (SNPs) that significantly influenced the level of improvement of two kinds of training responses, including maximal O2 uptake (V'O2max) and knee peak torque of healthy adults participating in the high intensity training (HIT) program. The study also aimed to use these SNPs to develop prediction models for individual training responses. Seventy-nine healthy volunteers participated in the HIT program. A genome-wide association study, based on 2,391,739 SNPs, was performed to identify SNPs that were significantly associated with gains in V'O2max and knee peak torque, following 9 weeks of the HIT program. To predict two training responses, two independent SNPs sets were determined using linear regression and iterative binary logistic regression analysis. False discovery rate analysis and permutation tests were performed to avoid false-positive findings. To predict gains in V'O2max, 7 SNPs were identified. These SNPs accounted for 26.0 % of the variance in the increment of V'O2max, and discriminated the subjects into three subgroups, non-responders, medium responders, and high responders, with prediction accuracy of 86.1 %. For the knee peak torque, 6 SNPs were identified, and accounted for 27.5 % of the variance in the increment of knee peak torque. The prediction accuracy discriminating the subjects into the three subgroups was estimated as 77.2 %. Novel SNPs found in this study could explain and predict inter-individual variability in gains of V'O2max and knee peak torque. Furthermore, with these genetic markers, the methodology suggested in this study provides a sound approach for personalized training programs.
Toh, Sengwee; Reichman, Marsha E; Houstoun, Monika; Ding, Xiao; Fireman, Bruce H; Gravel, Eric; Levenson, Mark; Li, Lingling; Moyneur, Erick; Shoaibi, Azadeh; Zornberg, Gwen; Hennessy, Sean
2013-11-01
It is increasingly necessary to analyze data from multiple sources when conducting public health safety surveillance or comparative effectiveness research. However, security, privacy, proprietary, and legal concerns often reduce data holders' willingness to share highly granular information. We describe and compare two approaches that do not require sharing of patient-level information to adjust for confounding in multi-site studies. We estimated the risks of angioedema associated with angiotensin-converting enzyme inhibitors (ACEIs), angiotensin receptor blockers (ARBs), and aliskiren in comparison with beta-blockers within Mini-Sentinel, which has created a distributed data system of 18 health plans. To obtain the adjusted hazard ratios (HRs) and 95% confidence intervals (CIs), we performed (i) a propensity score-stratified case-centered logistic regression analysis, a method identical to a stratified Cox regression analysis but needing only aggregated risk set data, and (ii) an inverse variance-weighted meta-analysis, which requires only the site-specific HR and variance. We also performed simulations to further compare the two methods. Compared with beta-blockers, the adjusted HR was 3.04 (95% CI: 2.81, 3.27) for ACEIs, 1.16 (1.00, 1.34) for ARBs, and 2.85 (1.34, 6.04) for aliskiren in the case-centered analysis. The corresponding HRs were 2.98 (2.76, 3.21), 1.15 (1.00, 1.33), and 2.86 (1.35, 6.04) in the meta-analysis. Simulations suggested that the two methods may produce different results under certain analytic scenarios. The case-centered analysis and the meta-analysis produced similar results without the need to share patient-level data across sites in our empirical study, but may provide different results in other study settings. Copyright © 2013 John Wiley & Sons, Ltd.
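The second approach, inverse variance-weighted meta-analysis, needs only each site's HR and its variance, which a short sketch makes explicit. It assumes a fixed-effect pooling on the log-HR scale, with each variance being the variance of the site's log HR. The abstract does not say whether a fixed-effect or random-effects model was used, and the function name and site values below are hypothetical.

```python
import math


def pool_hazard_ratios(hazard_ratios, log_hr_variances, z=1.96):
    """Fixed-effect inverse variance-weighted pooling of site-specific HRs.

    hazard_ratios    -- site-specific hazard ratios
    log_hr_variances -- variance of each site's log hazard ratio
    Returns (pooled HR, lower CI bound, upper CI bound).
    """
    if len(hazard_ratios) != len(log_hr_variances):
        raise ValueError("one variance is required per hazard ratio")
    weights = [1.0 / v for v in log_hr_variances]  # precision weights
    total_weight = sum(weights)
    pooled_log = sum(w * math.log(hr) for w, hr in zip(weights, hazard_ratios))
    pooled_log /= total_weight
    se = math.sqrt(1.0 / total_weight)  # standard error of the pooled log HR
    return (
        math.exp(pooled_log),
        math.exp(pooled_log - z * se),
        math.exp(pooled_log + z * se),
    )


if __name__ == "__main__":
    # Three hypothetical sites; no patient-level data are exchanged.
    print(pool_hazard_ratios([2.9, 3.2, 2.7], [0.010, 0.025, 0.040]))
```

Sites with smaller variances (typically larger sites) dominate the pooled estimate, which is why only two numbers per site are needed.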
Mark, Kristen P; Janssen, Erick; Milhausen, Robin R
2011-10-01
This study aimed to assess the relative importance of demographic, interpersonal, and personality factors in predicting sexual infidelity in heterosexual couples. A total of 506 men (M age = 32.86 years, SD = 10.60) and 412 women (M age = 27.66 years, SD = 8.93), who indicated they were in a monogamous sexual relationship, completed a series of questionnaires, including the Sexual Excitation/Inhibition (SES/SIS) scales and the Mood and Sexuality Questionnaire, and answered questions about, among others, religiosity, education, income, relationship and sexual satisfaction, and sexual compatibility. Almost one-quarter of men (23.2%) and 19.2% of women indicated that they had "cheated" during their current relationship (i.e., engaged in sexual interactions with someone other than their partner that could jeopardize, or hurt, their relationship). Among men, a logistic regression analysis, explaining 17% of the variance, revealed that a higher propensity of sexual excitation (SES) and sexual inhibition due to "the threat of performance concerns" (SIS1), a lower propensity for sexual inhibition due to "the threat of performance consequences" (SIS2), and an increased tendency to engage in regretful sexual behavior during negative affective states were all significant predictors of infidelity. In women, a similar regression analysis explained 21% of the variance in engaging in infidelity. In addition to SIS1 and SIS2, for which the same patterns were found as for men, low relationship happiness and low compatibility in terms of sexual attitudes and values were predictive of infidelity. The findings of this study suggest that, for both men and women, sexual personality characteristics and, for women, relationship factors are more relevant to the prediction of sexual infidelity than demographic variables such as marital status and religiosity.
Wickizer, Thomas M; Franklin, Gary; Fulton-Kehoe, Deborah; Turner, Judith A; Mootz, Robert; Smith-Weller, Terri
2004-01-01
Objective To determine what aspects of patient satisfaction are most important in explaining the variance in patients' overall treatment experience and to evaluate the relationship between treatment experience and subsequent outcomes. Data Sources and Setting Data from a population-based survey of 804 randomly selected injured workers in Washington State filing a workers' compensation claim between November 1999 and February 2000 were combined with insurance claims data indicating whether survey respondents were receiving disability compensation payments for being out of work at 6 or 12 months after claim filing. Study Design We conducted a two-step analysis. In the first step, we tested a multiple linear regression model to assess the relationship of satisfaction measures to patients' overall treatment experience. In the second step, we used logistic regression to assess the relationship of treatment experience to subsequent outcomes. Principal Findings Among injured workers who had ongoing follow-up care after their initial treatment (n=681), satisfaction with interpersonal and technical aspects of care and with care coordination was strongly and positively associated with overall treatment experience (p<0.001). As a group, the satisfaction measures explained 38 percent of the variance in treatment experience after controlling for demographics, satisfaction with medical care prior to injury, job satisfaction, type of injury, and provider type. Injured workers who reported less-favorable treatment experience were 3.54 times as likely (95 percent confidence interval, 1.20–10.95, p=.021) to be receiving time-loss compensation for inability to work due to injury 6 or 12 months after filing a claim, compared to patients whose treatment experience was more positive. PMID:15230925
Sethuraman, Kavita; Lansdown, Richard; Sullivan, Keith
2006-06-01
Moderate malnutrition continues to affect 46% of children under five years of age and 47% of rural women in India. Women's lack of empowerment is believed to be an important factor in the persistent prevalence of malnutrition. In India, women's empowerment often varies by community, with tribes sometimes being the most progressive. To explore the relationship between women's empowerment, maternal nutritional status, and the nutritional status of their children aged 6 to 24 months in rural and tribal communities. This study in rural Karnataka, India, included tribal and rural subjects and used both qualitative and quantitative methods of data collection. Structured interviews with mothers were performed and anthropometric measurements were obtained for 820 mother-child pairs. The data were analyzed by multivariate and logistic regression. Some degree of malnutrition was seen in 83.5% of children and 72.4% of mothers in the sample. Biological variables explained most of the variance in nutritional status, followed by health-care seeking and women's empowerment variables; socioeconomic variables explained the least amount of variance. Women's empowerment variables were significantly associated with child nutrition and explained 5.6% of the variance in the sample. Maternal experience of psychological abuse and sexual coercion increased the risk of malnutrition in mothers and children. Domestic violence was experienced by 34% of mothers in the sample. In addition to the known investments needed to reduce malnutrition, improving women's nutrition, promoting gender equality, empowering women, and ending violence against women could further reduce the prevalence of malnutrition in this segment of the Indian population.
Netz, Yael; Dunsky, Ayelet; Zach, Sima; Goldsmith, Rebecca; Shimony, Tal; Goldbourt, Uri; Zeev, Aviva
2012-12-01
Official health organizations have established the dose of physical activity needed for preserving both physical and psychological health in old age. The objective of this study was to explore whether adherence to the recommended criterion of physical activity accounted for better psychological functioning in older adults in Israel. A random sample of 1,663 (799 men) Israelis reported their physical activity routine, and based on official guidelines were divided into sufficiently active, insufficiently active, and inactive groups. The General Health Questionnaire (GHQ) was used for assessing mental health and the Mini-Mental State Examination (MMSE) for assessing cognitive functioning. Factor analysis performed on the GHQ yielded two factors - positive and negative. Logistic regressions for the GHQ factors and for the MMSE were conducted for explaining their variance, with demographic variables entered first, followed by health and then physical activity. The explained variance in the three steps was Cox and Snell R2 = 0.022, 0.023, 0.039 for the positive factor, 0.066, 0.093, 0.101 for the negative factor, and 0.204, 0.206, 0.209 for the MMSE. Adherence to the recommended dose of physical activity accounted for better psychological functioning beyond demographic and health variables; however, the additional explained variance was small. More specific guidelines of physical activity may elucidate a stronger relationship, but only randomized controlled trials can reveal cause-effect relationship between physical activity and psychological functioning. More studies are needed focusing on the positive factor of psychological functioning.
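The Cox and Snell R2 values reported above can be related to model log-likelihoods with a short sketch. The function below uses the standard definition, R2 = 1 - exp(2 (ll_null - ll_model) / n); the function names and the log-likelihood values in the usage note are illustrative assumptions, not figures from the study.

```python
import math


def cox_snell_r2(loglik_null, loglik_model, n):
    """Cox and Snell R^2 from the null and fitted log-likelihoods and sample size n."""
    return 1.0 - math.exp(2.0 * (loglik_null - loglik_model) / n)


def blockwise_increments(r2_by_step):
    """Gain in R^2 contributed by each block of predictors entered in sequence."""
    gains = [r2_by_step[0]]
    for previous, current in zip(r2_by_step, r2_by_step[1:]):
        gains.append(current - previous)
    return gains
```

For example, `blockwise_increments([0.022, 0.023, 0.039])` separates the positive-factor values quoted in the abstract into the share attributable to each entry step (demographics, then health, then physical activity), while `cox_snell_r2(-100.0, -90.0, 200)` evaluates the statistic for hypothetical log-likelihoods.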
Alarcón, G S; Bastian, H M; Beasley, T M; Roseman, J M; Tan, F K; Fessler, B J; Vilá, L M; McGwin, G
2006-01-01
Renal involvement in systemic lupus erythematosus (SLE) is more frequent in minorities. We examined whether genetic factors or socioeconomic status (SES) explain these disparities in a large multiethnic (Hispanics from Texas and Puerto Rico, African Americans and Caucasians) SLE cohort. Renal involvement was defined as WHO Class II-V and/or proteinuria (> 0.5 g/24 h or 3+) attributable to SLE and/or abnormal urinary sediment, proteinuria 2+, elevated serum creatinine/decreased creatinine clearance twice, 6 months apart present any time over the course of the disease. Ancestry informative markers (AIMS) were used to define the admixture proportions in each patient and group. Logistic regression models were examined to determine the percentage variance (R2) in renal involvement related to ethnicity that is explained by socio-economic status (SES) and admixture (adjusting for age, gender and disease duration, basic model). Four hundred and fifty-nine (out of 575) patients were included; renal involvement occurred in 44.6% Texas Hispanics, 11.3% Puerto Rico Hispanics, 45.8% African Americans, 18.3% Caucasians. SES accounted for 14.5% of the variance due to ethnicity (after adjusting for basic model variables), admixture 36.8% and both, 12.2%; 45.9% of the variance remained unexplained. Alternative models for decreased glomerular filtration rate and end-stage renal disease were comparable in the distribution of the explanatory variables. Our data indicate that genetic factors appear to be more important than SES in explaining the ethnic disparities in the occurrence of renal involvement.
Commitment to personal values and guilt feelings in dementia caregivers.
Gallego-Alberto, Laura; Losada, Andrés; Márquez-González, María; Romero-Moreno, Rosa; Vara, Carlos
2017-01-01
Caregivers' commitment to personal values is linked to caregivers' well-being, although the effects of personal values on caregivers' guilt have not been explored to date. The goal of this study is to analyze the relationship between caregivers' commitment to personal values and guilt feelings. Participants were 179 dementia family caregivers. Face-to-face interviews were carried out to describe sociodemographic variables and assess stressors, caregivers' commitment to personal values and guilt feelings. Commitment to values was conceptualized as two factors (commitment to own values and commitment to family values) and 12 specific individual values (e.g. education, family or caregiving role). Hierarchical regressions were performed controlling for sociodemographic variables and stressors, and introducing the two commitment factors (in a first regression) or the commitment to individual/specific values (in a second regression) as predictors of guilt. In terms of the commitment to values factors, the analyzed regression model explained 21% of the variance of guilt feelings. Only the factor commitment to family values contributed significantly to the model, explaining 7% of variance. With regard to the regression analyzing the contribution of specific values to caregivers' guilt, commitment to the caregiving role and to leisure contributed negatively and significantly to the explanation of caregivers' guilt. Commitment to work contributed positively to guilt feelings. The full model explained 30% of guilt feelings variance. The specific values explained 16% of the variance. Our findings suggest that commitment to personal values is a relevant variable to understand guilt feelings in caregivers.
Vaeth, Michael; Skovlund, Eva
2004-06-15
For a given regression problem it is possible to identify a suitably defined equivalent two-sample problem such that the power or sample size obtained for the two-sample problem also applies to the regression problem. For a standard linear regression model the equivalent two-sample problem is easily identified, but for generalized linear models and for Cox regression models the situation is more complicated. An approximately equivalent two-sample problem may, however, also be identified here. In particular, we show that for logistic regression and Cox regression models the equivalent two-sample problem is obtained by selecting two equally sized samples for which the parameters differ by a value equal to the slope times twice the standard deviation of the independent variable and further requiring that the overall expected number of events is unchanged. In a simulation study we examine the validity of this approach to power calculations in logistic regression and Cox regression models. Several different covariate distributions are considered for selected values of the overall response probability and a range of alternatives. For the Cox regression model we consider both constant and non-constant hazard rates. The results show that in general the approach is remarkably accurate even in relatively small samples. Some discrepancies are, however, found in small samples with few events and a highly skewed covariate distribution. Comparison with results based on alternative methods for logistic regression models with a single continuous covariate indicates that the proposed method is at least as good as its competitors. The method is easy to implement and therefore provides a simple way to extend the range of problems that can be covered by the usual formulas for power and sample size determination. Copyright 2004 John Wiley & Sons, Ltd.
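The equivalent two-sample construction described above for logistic regression can be sketched as follows. The two group probabilities are chosen so that their log odds differ by the slope times twice the standard deviation of the covariate, while their average equals the overall response probability. The sample-size step then applies a common normal-approximation formula for comparing two proportions; that particular formula, the function names, and the default alpha and power are assumptions made for illustration and are not taken from the paper.

```python
import math
from statistics import NormalDist


def logit(p):
    return math.log(p / (1.0 - p))


def expit(x):
    return 1.0 / (1.0 + math.exp(-x))


def equivalent_two_sample(beta, sd_x, p_bar):
    """Return (p1, p2) with logit(p2) - logit(p1) = 2 * beta * sd_x and mean p_bar."""
    delta = 2.0 * beta * sd_x
    lo, hi = 1e-12, 1.0 - 1e-12
    # The group average is increasing in p1, so bisection finds the unique root.
    for _ in range(200):
        mid = (lo + hi) / 2.0
        average = (mid + expit(logit(mid) + delta)) / 2.0
        if average < p_bar:
            lo = mid
        else:
            hi = mid
    p1 = (lo + hi) / 2.0
    return p1, expit(logit(p1) + delta)


def total_sample_size(beta, sd_x, p_bar, alpha=0.05, power=0.80):
    """Total n for a two-sided test, via the two-proportion normal approximation.

    beta must be nonzero, otherwise the two groups coincide.
    """
    p1, p2 = equivalent_two_sample(beta, sd_x, p_bar)
    z_alpha = NormalDist().inv_cdf(1.0 - alpha / 2.0)
    z_power = NormalDist().inv_cdf(power)
    numerator = (
        z_alpha * math.sqrt(2.0 * p_bar * (1.0 - p_bar))
        + z_power * math.sqrt(p1 * (1.0 - p1) + p2 * (1.0 - p2))
    ) ** 2
    return 2 * math.ceil(numerator / (p1 - p2) ** 2)
```

A call such as `total_sample_size(beta=0.5, sd_x=1.0, p_bar=0.3)` gives an approximate total sample size for detecting a log odds ratio of 0.5 per unit of a covariate with unit standard deviation; halving the slope increases the required size.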
Li, Shengjie; Gao, Yanting; Shao, Mingxi; Tang, Binghua; Cao, Wenjun; Sun, Xinghuai
2017-11-04
To evaluate the association between coagulation function and patients with primary angle closure glaucoma (PACG). A retrospective, hospital-based, case-control study. Shanghai, China. A total of 1778 subjects were recruited from the Eye & ENT Hospital of Fudan University from January 2010 to December 2015, including patients with PACG (male=296; female=569) and control subjects (male=290; female=623). Sociodemographic data and clinical data were collected. The one-way analysis of variance test was used to compare the levels of laboratory parameters among the mild, moderate and severe PACG groups. Multivariate logistic regression analyses were performed to identify the independent risk factors for PACG. The nomogram was constructed based on the logistic regression model using the R project for statistical computing (R V.3.3.2). The activated partial thromboplastin time (APTT) of the PACG group was approximately 4% shorter (p<0.001) than that of the control group. The prothrombin time (PT) was approximately 2.40% shorter (p<0.001) in patients with PACG compared with the control group. The thrombin time was also approximately 2.14% shorter (p<0.001) in patients with PACG compared with the control group. The level of D-dimer was significantly higher (p=0.042) in patients with PACG. Moreover, the mean platelet volume (MPV) of the PACG group was significantly higher (p=0.013) than that of the control group. A similar trend was observed when coagulation parameters were compared between the PACG and control groups with respect to gender and/or age. Multiple logistic regression analyses revealed that APTT (OR=1.032, 95% CI 1.000 to 1.026), PT (OR=1.249, 95% CI 1.071 to 1.457) and MPV (OR=1.185, 95% CI 1.081 to 1.299) were independently associated with PACG. Patients with PACG had a shorter coagulation time. Our results suggest that coagulation function is significantly associated with patients with PACG and may play an important role in the onset and development of PACG. 
Kesselmeier, Miriam; Lorenzo Bermejo, Justo
2017-11-01
Logistic regression is the most common technique used for genetic case-control association studies. A disadvantage of standard maximum likelihood estimators of the genotype relative risk (GRR) is their strong dependence on outlier subjects, for example, patients diagnosed at unusually young age. Robust methods are available to constrain outlier influence, but they are scarcely used in genetic studies. This article provides a non-intimidating introduction to robust logistic regression, and investigates its benefits and limitations in genetic association studies. We applied the bounded Huber and extended the R package 'robustbase' with the re-descending Hampel functions to down-weight outlier influence. Computer simulations were carried out to assess the type I error rate, mean squared error (MSE) and statistical power according to major characteristics of the genetic study and investigated markers. Simulations were complemented with the analysis of real data. Both standard and robust estimation controlled type I error rates. Standard logistic regression showed the highest power but standard GRR estimates also showed the largest bias and MSE, in particular for associated rare and recessive variants. For illustration, a recessive variant with a true GRR=6.32 and a minor allele frequency=0.05 investigated in a 1000 case/1000 control study by standard logistic regression resulted in power=0.60 and MSE=16.5. The corresponding figures for Huber-based estimation were power=0.51 and MSE=0.53. Overall, Hampel- and Huber-based GRR estimates did not differ much. Robust logistic regression may represent a valuable alternative to standard maximum likelihood estimation when the focus lies on risk prediction rather than identification of susceptibility variants. © The Author 2016. Published by Oxford University Press. All rights reserved. For Permissions, please email: journals.permissions@oup.com.
Sampson, Maureen L; Gounden, Verena; van Deventer, Hendrik E; Remaley, Alan T
2016-02-01
The main drawback of the periodic analysis of quality control (QC) material is that test performance is not monitored in time periods between QC analyses, potentially leading to the reporting of faulty test results. The objective of this study was to develop a patient-based QC procedure for the more timely detection of test errors. Results from a Chem-14 panel measured on the Beckman LX20 analyzer were used to develop the model. Each test result was predicted from the other 13 members of the panel by multiple regression, which resulted in correlation coefficients between the predicted and measured result of >0.7 for 8 of the 14 tests. A logistic regression model, which utilized the measured test result, the predicted test result, the day of the week and time of day, was then developed for predicting test errors. The output of the logistic regression was tallied by a daily CUSUM approach and used to predict test errors, with a fixed specificity of 90%. The mean average run length (ARL) before error detection by CUSUM-Logistic Regression (CSLR) was 20 with a mean sensitivity of 97%, which was considerably shorter than the mean ARL of 53 (sensitivity 87.5%) for a simple prediction model that only used the measured result for error detection. A CUSUM-Logistic Regression analysis of patient laboratory data can be an effective approach for the rapid and sensitive detection of clinical laboratory errors. Published by Elsevier Inc.
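The idea of tallying logistic regression output with a CUSUM can be illustrated with a minimal one-sided CUSUM over a stream of predicted error probabilities. The abstract does not give the authors' exact tallying rule, so the reference value, the alarm threshold, and the function name below are hypothetical choices for illustration only.

```python
def cusum_first_alarm(scores, reference=0.5, threshold=1.0):
    """Return the index of the first alarm of a one-sided CUSUM, or None.

    scores: predicted error probabilities, e.g. from a logistic regression.
    reference: value subtracted from each score (hypothetical allowance).
    threshold: cumulative sum above which an alarm is raised (hypothetical).
    """
    cumulative = 0.0
    for index, score in enumerate(scores):
        # Accumulate evidence above the reference; never drop below zero.
        cumulative = max(0.0, cumulative + (score - reference))
        if cumulative > threshold:
            return index
    return None
```

The number of results processed before the first alarm corresponds to a run length; averaging it over many simulated error episodes would give an average run length comparable in spirit to the ARL quoted above.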
Nonconvex Sparse Logistic Regression With Weakly Convex Regularization
NASA Astrophysics Data System (ADS)
Shen, Xinyue; Gu, Yuantao
2018-06-01
In this work we propose to fit a sparse logistic regression model by a weakly convex regularized nonconvex optimization problem. The idea is based on the finding that a weakly convex function as an approximation of the $\ell_0$ pseudo norm is able to better induce sparsity than the commonly used $\ell_1$ norm. For a class of weakly convex sparsity inducing functions, we prove the nonconvexity of the corresponding sparse logistic regression problem, and study its local optimality conditions and the choice of the regularization parameter to exclude trivial solutions. Despite the nonconvexity, a method based on proximal gradient descent is used to solve the general weakly convex sparse logistic regression, and its convergence behavior is studied theoretically. Then the general framework is applied to a specific weakly convex function, and a necessary and sufficient local optimality condition is provided. The solution method is instantiated in this case as an iterative firm-shrinkage algorithm, and its effectiveness is demonstrated in numerical experiments by both randomly generated and real datasets.
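An iterative firm-shrinkage scheme of the kind mentioned above can be sketched as a gradient step on the logistic loss followed by a firm-thresholding step. The abstract gives no parameter settings, so the sketch treats the two thresholds `lam` and `mu`, the step size, and the iteration count as free, hypothetical parameters rather than deriving them from the paper's regularizer; it should be read as an illustration of the general pattern, not as the authors' algorithm.

```python
import numpy as np


def firm_shrinkage(x, lam, mu):
    """Firm thresholding: zero up to lam, identity beyond mu, linear in between.

    Requires mu > lam > 0.
    """
    x = np.asarray(x, dtype=float)
    magnitude = np.abs(x)
    middle = np.sign(x) * mu * (magnitude - lam) / (mu - lam)
    return np.where(magnitude <= lam, 0.0, np.where(magnitude <= mu, middle, x))


def logistic_gradient(w, X, y):
    """Gradient of the mean negative log-likelihood for labels y in {0, 1}."""
    probabilities = 1.0 / (1.0 + np.exp(-(X @ w)))
    return X.T @ (probabilities - y) / len(y)


def iterative_firm_shrinkage(X, y, lam=0.05, mu=1.0, step=0.5, iters=500):
    """Alternate a gradient step on the logistic loss with firm shrinkage."""
    w = np.zeros(X.shape[1])
    for _ in range(iters):
        w = firm_shrinkage(w - step * logistic_gradient(w, X, y), lam, mu)
    return w
```

Unlike soft thresholding, the firm operator leaves coefficients larger than `mu` unshrunk, which is one way to see why such penalties can induce sparsity with less bias on large coefficients than the $\ell_1$ norm.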
A comparative study on entrepreneurial attitudes modeled with logistic regression and Bayes nets.
López Puga, Jorge; García García, Juan
2012-11-01
Entrepreneurship research is receiving increasing attention in our context, as entrepreneurs are key social agents involved in economic development. We compare the success of the dichotomic logistic regression model and the Bayes simple classifier to predict entrepreneurship, after manipulating the percentage of missing data and the level of categorization in predictors. A sample of undergraduate university students (N = 1230) completed five scales (motivation, attitude towards business creation, obstacles, deficiencies, and training needs) and we found that each of them predicted different aspects of the tendency to business creation. Additionally, our results show that the receiver operating characteristic (ROC) curve is affected by the rate of missing data in both techniques, but logistic regression seems to be more vulnerable when faced with missing data, whereas Bayes nets underperform slightly when categorization has been manipulated. Our study sheds light on the potential entrepreneur profile and we propose to use Bayesian networks as an additional alternative to overcome the weaknesses of logistic regression when missing data are present in applied research.
Campos-Filho, N; Franco, E L
1989-02-01
A frequent procedure in matched case-control studies is to report results from the multivariate unmatched analyses if they do not differ substantially from the ones obtained after conditioning on the matching variables. Although conceptually simple, this rule requires that an extensive series of logistic regression models be evaluated by both the conditional and unconditional maximum likelihood methods. Most computer programs for logistic regression employ only one maximum likelihood method, which requires that the analyses be performed in separate steps. This paper describes a Pascal microcomputer (IBM PC) program that performs multiple logistic regression by both maximum likelihood estimation methods, which obviates the need for switching between programs to obtain relative risk estimates from both matched and unmatched analyses. The program calculates most standard statistics and allows factoring of categorical or continuous variables by two distinct methods of contrast. A built-in, descriptive statistics option allows the user to inspect the distribution of cases and controls across categories of any given variable.
Comparison of cranial sex determination by discriminant analysis and logistic regression.
Amores-Ampuero, Anabel; Alemán, Inmaculada
2016-04-05
Various methods have been proposed for estimating dimorphism. The objective of this study was to compare sex determination results from cranial measurements using discriminant analysis or logistic regression. The study sample comprised 130 individuals (70 males) of known sex, age, and cause of death from San José cemetery in Granada (Spain). Measurements of 19 neurocranial dimensions and 11 splanchnocranial dimensions were subjected to discriminant analysis and logistic regression, and the percentages of correct classification were compared between the sex functions obtained with each method. The discriminant capacity of the selected variables was evaluated with a cross-validation procedure. The percentage accuracy with discriminant analysis was 78.2% for the neurocranium (82.4% in females and 74.6% in males) and 73.7% for the splanchnocranium (79.6% in females and 68.8% in males). These percentages were higher with logistic regression analysis: 85.7% for the neurocranium (in both sexes) and 94.1% for the splanchnocranium (100% in females and 91.7% in males).
Applied Multiple Linear Regression: A General Research Strategy
ERIC Educational Resources Information Center
Smith, Brandon B.
1969-01-01
Illustrates some of the basic concepts and procedures for using regression analysis in experimental design, analysis of variance, analysis of covariance, and curvilinear regression. Applications to evaluation of instruction and vocational education programs are illustrated. (GR)
Huxley, Peter John; Chan, Kara; Chiu, Marcus; Ma, Yanni; Gaze, Sarah; Evans, Sherrill
2016-03-01
China's future major health problem will be the management of chronic diseases - of which mental health is a major one. An instrument is needed to measure mental health inclusion outcomes for mental health services in Hong Kong and mainland China as they strive to promote a more inclusive society for their citizens and particular disadvantaged groups. To report on the analysis of structural equivalence and item differentiation in two mentally unhealthy and one healthy sample in the United Kingdom and Hong Kong. The mental health sample in Hong Kong was made up of non-governmental organisation (NGO) referrals meeting the selection/exclusion criteria (being well enough to be interviewed, having a formal psychiatric diagnosis and living in the community). A similar sample in the United Kingdom meeting the same selection criteria was obtained from a community mental health organisation, equivalent to the NGOs in Hong Kong. Exploratory factor analysis and logistic regression were conducted. The single-variable, self-rated 'overall social inclusion' differs significantly between all of the samples, in the way we would expect from previous research, with the healthy population feeling more included than the serious mental illness (SMI) groups. In the exploratory factor analysis, the first two factors explain between a third and half of the variance, and the single variable which enters into all the analyses in the first factor is having friends to visit the home. All the regression models were significant; however, in the Hong Kong sample, only one-fifth of the total variance is explained. The structural findings imply that the social and community opportunities profile-Chinese version (SCOPE-C) gives similar results when applied to another culture. As only one-fifth of the variance of 'overall inclusion' was explained in the Hong Kong sample, it may be that the instrument needs to be refined using different or additional items within the structural domains of inclusion.
Hill, Andrew; Loh, Po-Ru; Bharadwaj, Ragu B.; Pons, Pascal; Shang, Jingbo; Guinan, Eva; Lakhani, Karim; Kilty, Iain
2017-01-01
Background: The association of differing genotypes with disease-related phenotypic traits offers great potential to both help identify new therapeutic targets and support stratification of patients who would gain the greatest benefit from specific drug classes. Development of low-cost genotyping and sequencing has made collecting large-scale genotyping data routine in population and therapeutic intervention studies. In addition, a range of new technologies is being used to capture numerous new and complex phenotypic descriptors. As a result, genotype and phenotype datasets have grown exponentially. Genome-wide association studies associate genotypes and phenotypes using methods such as logistic regression. As existing tools for association analysis limit the efficiency by which value can be extracted from increasing volumes of data, there is a pressing need for new software tools that can accelerate association analyses on large genotype-phenotype datasets. Results: Using open innovation (OI) and contest-based crowdsourcing, the logistic regression analysis in a leading, community-standard genetics software package (PLINK 1.07) was substantially accelerated. OI allowed us to do this in <6 months by providing rapid access to highly skilled programmers with specialized, difficult-to-find skill sets. Through a crowd-based contest a combination of computational, numeric, and algorithmic approaches was identified that accelerated the logistic regression in PLINK 1.07 by 18- to 45-fold. Combining contest-derived logistic regression code with coarse-grained parallelization, multithreading, and associated changes to data initialization code further developed through distributed innovation, we achieved an end-to-end speedup of 591-fold for a data set size of 6678 subjects by 645 863 variants, compared to PLINK 1.07's logistic regression. This represents a reduction in run time from 4.8 hours to 29 seconds.
Accelerated logistic regression code developed in this project has been incorporated into the PLINK2 project. Conclusions: Using iterative competition-based OI, we have developed a new, faster implementation of logistic regression for genome-wide association studies analysis. We present lessons learned and recommendations on running a successful OI process for bioinformatics. PMID:28327993
Hill, Andrew; Loh, Po-Ru; Bharadwaj, Ragu B; Pons, Pascal; Shang, Jingbo; Guinan, Eva; Lakhani, Karim; Kilty, Iain; Jelinsky, Scott A
2017-05-01
The association of differing genotypes with disease-related phenotypic traits offers great potential to both help identify new therapeutic targets and support stratification of patients who would gain the greatest benefit from specific drug classes. Development of low-cost genotyping and sequencing has made collecting large-scale genotyping data routine in population and therapeutic intervention studies. In addition, a range of new technologies is being used to capture numerous new and complex phenotypic descriptors. As a result, genotype and phenotype datasets have grown exponentially. Genome-wide association studies associate genotypes and phenotypes using methods such as logistic regression. As existing tools for association analysis limit the efficiency by which value can be extracted from increasing volumes of data, there is a pressing need for new software tools that can accelerate association analyses on large genotype-phenotype datasets. Using open innovation (OI) and contest-based crowdsourcing, the logistic regression analysis in a leading, community-standard genetics software package (PLINK 1.07) was substantially accelerated. OI allowed us to do this in <6 months by providing rapid access to highly skilled programmers with specialized, difficult-to-find skill sets. Through a crowd-based contest a combination of computational, numeric, and algorithmic approaches was identified that accelerated the logistic regression in PLINK 1.07 by 18- to 45-fold. Combining contest-derived logistic regression code with coarse-grained parallelization, multithreading, and associated changes to data initialization code further developed through distributed innovation, we achieved an end-to-end speedup of 591-fold for a data set size of 6678 subjects by 645 863 variants, compared to PLINK 1.07's logistic regression. This represents a reduction in run time from 4.8 hours to 29 seconds. 
Accelerated logistic regression code developed in this project has been incorporated into the PLINK2 project. Using iterative competition-based OI, we have developed a new, faster implementation of logistic regression for genome-wide association studies analysis. We present lessons learned and recommendations on running a successful OI process for bioinformatics. © The Author 2017. Published by Oxford University Press.
Lin, Chao-Cheng; Bai, Ya-Mei; Chen, Jen-Yeu; Hwang, Tzung-Jeng; Chen, Tzu-Ting; Chiu, Hung-Wen; Li, Yu-Chuan
2010-03-01
Metabolic syndrome (MetS) is an important side effect of second-generation antipsychotics (SGAs). However, many SGA-treated patients with MetS remain undetected. In this study, we trained and validated artificial neural network (ANN) and multiple logistic regression models without biochemical parameters to rapidly identify MetS in patients with SGA treatment. A total of 383 patients with a diagnosis of schizophrenia or schizoaffective disorder (DSM-IV criteria) with SGA treatment for more than 6 months were investigated to determine whether they met the MetS criteria according to the International Diabetes Federation. The data for these patients were collected between March 2005 and September 2005. The input variables of ANN and logistic regression were limited to demographic and anthropometric data only. All models were trained by randomly selecting two-thirds of the patient data and were internally validated with the remaining one-third of the data. The models were then externally validated with data from 69 patients from another hospital, collected between March 2008 and June 2008. The area under the receiver operating characteristic curve (AUC) was used to measure the performance of all models. Both the final ANN and logistic regression models had high accuracy (88.3% vs 83.6%), sensitivity (93.1% vs 86.2%), and specificity (86.9% vs 83.8%) to identify MetS in the internal validation set. The mean +/- SD AUC was high for both the ANN and logistic regression models (0.934 +/- 0.033 vs 0.922 +/- 0.035, P = .63). During external validation, high AUC was still obtained for both models. Waist circumference and diastolic blood pressure were the common variables that were left in the final ANN and logistic regression models. Our study developed accurate ANN and logistic regression models to detect MetS in patients with SGA treatment. The models are likely to provide a noninvasive tool for large-scale screening of MetS in this group of patients. 
Bayesian logistic regression in detection of gene-steroid interaction for cancer at PDLIM5 locus.
Wang, Ke-Sheng; Owusu, Daniel; Pan, Yue; Xie, Changchun
2016-06-01
The PDZ and LIM domain 5 (PDLIM5) gene may play a role in cancer, bipolar disorder, major depression, alcohol dependence and schizophrenia; however, little is known about the interaction effect of steroid and PDLIM5 gene on cancer. This study examined 47 single-nucleotide polymorphisms (SNPs) within the PDLIM5 gene in the Marshfield sample with 716 cancer patients (any diagnosed cancer, excluding minor skin cancer) and 2848 noncancer controls. Multiple logistic regression model in PLINK software was used to examine the association of each SNP with cancer. Bayesian logistic regression in PROC GENMOD in SAS statistical software, ver. 9.4 was used to detect gene-steroid interactions influencing cancer. Single marker analysis using PLINK identified 12 SNPs associated with cancer (P < 0.05); especially, SNP rs6532496 revealed the strongest association with cancer (P = 6.84 × 10⁻³); while the next best signal was rs951613 (P = 7.46 × 10⁻³). Classic logistic regression in PROC GENMOD showed that both rs6532496 and rs951613 revealed strong gene-steroid interaction effects (OR=2.18, 95% CI=1.31-3.63 with P = 2.9 × 10⁻³ for rs6532496 and OR=2.07, 95% CI=1.24-3.45 with P = 5.43 × 10⁻³ for rs951613, respectively). Results from Bayesian logistic regression showed stronger interaction effects (OR=2.26, 95% CI=1.2-3.38 for rs6532496 and OR=2.14, 95% CI=1.14-3.2 for rs951613, respectively). All the 12 SNPs associated with cancer revealed significant gene-steroid interaction effects (P < 0.05); whereas 13 SNPs showed gene-steroid interaction effects without main effect on cancer. SNP rs4634230 revealed the strongest gene-steroid interaction effect (OR=2.49, 95% CI=1.5-4.13 with P = 4.0 × 10⁻⁴ based on the classic logistic regression and OR=2.59, 95% CI=1.4-3.97 from Bayesian logistic regression; respectively). This study provides evidence of common genetic variants within the PDLIM5 gene and interactions between PDLIM5 gene polymorphisms and steroid use influencing cancer.
Gender differences in psychosocial predictors of texting while driving.
Struckman-Johnson, Cindy; Gaster, Samuel; Struckman-Johnson, Dave; Johnson, Melissa; May-Shinagle, Gabby
2015-01-01
A sample of 158 male and 357 female college students at a midwestern university participated in an on-line study of psychosocial motives for texting while driving. Men and women did not differ in self-reported ratings of how often they texted while driving. However, more women sent texts of less than a sentence while more men sent texts of 1-5 sentences. More women than men said they would quit texting while driving due to police warnings, receiving information about texting dangers, being shown graphic pictures of texting accidents, and being in a car accident. A hierarchical regression for men's data revealed that lower levels of feeling distracted by texting while driving (20% of the variance), higher levels of cell phone dependence (11.5% of the variance), risky behavioral tendencies (6.5% of the variance) and impulsivity (2.3% of the variance) were significantly associated with more texting while driving (total model variance=42%). A separate regression for women revealed that higher levels of cell phone dependence (10.4% of the variance), risky behavioral tendencies (9.9% of the variance), texting distractibility (6.2%), crash risk estimates (2.2% of the variance) and driving confidence (1.3% of the variance) were significantly associated with more texting while driving (total model variance=31%). Friendship potential and need for intimacy were not related to men's or women's texting while driving. Implications of the results for gender-specific prevention strategies are discussed. Copyright © 2014 Elsevier Ltd. All rights reserved.
Deletion Diagnostics for Alternating Logistic Regressions
Preisser, John S.; By, Kunthel; Perin, Jamie; Qaqish, Bahjat F.
2013-01-01
Deletion diagnostics are introduced for the regression analysis of clustered binary outcomes estimated with alternating logistic regressions, an implementation of generalized estimating equations (GEE) that estimates regression coefficients in a marginal mean model and in a model for the intracluster association given by the log odds ratio. The diagnostics are developed within an estimating equations framework that recasts the estimating functions for association parameters based upon conditional residuals into equivalent functions based upon marginal residuals. Extensions of earlier work on GEE diagnostics follow directly, including computational formulae for one-step deletion diagnostics that measure the influence of a cluster of observations on the estimated regression parameters and on the overall marginal mean or association model fit. The diagnostic formulae are evaluated with simulation studies and with an application concerning an assessment of factors associated with health maintenance visits in primary care medical practices. The application and the simulations demonstrate that the proposed cluster-deletion diagnostics for alternating logistic regressions are good approximations of their exact fully iterated counterparts. PMID:22777960
Pian, Wenjing; Khoo, Christopher Sg; Chi, Jianxing
2017-12-21
Users searching for health information on the Internet may be searching for their own health issue, searching for someone else's health issue, or browsing with no particular health issue in mind. Previous research has found that these three categories of users focus on different types of health information. However, most health information websites provide static content for all users. If the three types of user health information need contexts can be identified by the Web application, the search results or information offered to the user can be customized to increase its relevance or usefulness to the user. The aim of this study was to investigate the possibility of identifying the three user health information contexts (searching for self, searching for others, or browsing with no particular health issue in mind) using just hyperlink clicking behavior; using eye-tracking information; and using a combination of eye-tracking, demographic, and urgency information. Predictive models are developed using multinomial logistic regression. A total of 74 participants (39 females and 35 males) who were mainly staff and students of a university were asked to browse a health discussion forum, Healthboards.com. An eye tracker recorded their examining (eye fixation) and skimming (quick eye movement) behaviors on 2 types of screens: summary result screen displaying a list of post headers, and detailed post screen. The following three types of predictive models were developed using logistic regression analysis: model 1 used only the time spent in scanning the summary result screen and reading the detailed post screen, which can be determined from the user's mouse clicks; model 2 used the examining and skimming durations on each screen, recorded by an eye tracker; and model 3 added user demographic and urgency information to model 2. 
An analysis of variance (ANOVA) found that users' browsing durations were significantly different for the three health information contexts (P<.001). The logistic regression model 3 was able to predict the user's type of health information context with a 10-fold cross validation mean accuracy of 84% (62/74), followed by model 2 at 73% (54/74) and model 1 at 71% (52/78). In addition, correlation analysis found that particular browsing durations were highly correlated with users' age, education level, and the urgency of their information need. A user's type of health information need context (ie, searching for self, for others, or with no health issue in mind) can be identified with reasonable accuracy using just user mouse clicks that can easily be detected by Web applications. Higher accuracy can be obtained using Google Glass or future computing devices with an eye tracking function. ©Wenjing Pian, Christopher SG Khoo, Jianxing Chi. Originally published in the Journal of Medical Internet Research (http://www.jmir.org), 21.12.2017.
Knol, Mirjam J; van der Tweel, Ingeborg; Grobbee, Diederick E; Numans, Mattijs E; Geerlings, Mirjam I
2007-10-01
To determine the presence of interaction in epidemiologic research, typically a product term is added to the regression model. In linear regression, the regression coefficient of the product term reflects interaction as departure from additivity. However, in logistic regression it refers to interaction as departure from multiplicativity. Rothman has argued that interaction estimated as departure from additivity better reflects biologic interaction. So far, the literature on estimating interaction on an additive scale using logistic regression has focused only on dichotomous determinants. The objective of the present study was to provide the methods to estimate interaction between continuous determinants and to illustrate these methods with a clinical example. From the existing literature we derived the formulas to quantify interaction as departure from additivity between one continuous and one dichotomous determinant and between two continuous determinants using logistic regression. Bootstrapping was used to calculate the corresponding confidence intervals. To illustrate the theory with an empirical example, data from the Utrecht Health Project were used, with age and body mass index as risk factors for elevated diastolic blood pressure. The methods and formulas presented in this article are intended to assist epidemiologists in calculating interaction on an additive scale between two variables on a certain outcome. The proposed methods are included in a spreadsheet which is freely available at: http://www.juliuscenter.nl/additive-interaction.xls.
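The contrast between additive and multiplicative interaction can be made concrete with the relative excess risk due to interaction (RERI), a standard additive-scale measure computed from logistic regression coefficients. The sketch below is an illustration only, not the article's spreadsheet: it assumes a model with main-effect coefficients b1 and b2 and a product-term coefficient b3, and the coefficient values are hypothetical. For continuous determinants, the same expression is read as the contrast for a one-unit increase in each variable.

```python
import math

def reri(b1, b2, b3):
    """Relative excess risk due to interaction from logistic coefficients.

    RERI = OR(both) - OR(first only) - OR(second only) + 1,
    where OR(both) = exp(b1 + b2 + b3).
    """
    return math.exp(b1 + b2 + b3) - math.exp(b1) - math.exp(b2) + 1.0

# Hypothetical coefficients: odds ratios of 2 and 3, no product-term effect.
b1, b2, b3 = math.log(2.0), math.log(3.0), 0.0
value = reri(b1, b2, b3)
print(f"RERI = {value:.3f}")
```

With these hypothetical values the product term is zero, so there is no departure from multiplicativity, yet the RERI is about 2, indicating a departure from additivity. This is the distinction the abstract draws between the two scales.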
ERIC Educational Resources Information Center
Osborne, Jason W.
2012-01-01
Logistic regression is slowly gaining acceptance in the social sciences, and fills an important niche in the researcher's toolkit: being able to predict important outcomes that are not continuous in nature. While OLS regression is a valuable tool, it cannot routinely be used to predict outcomes that are binary or categorical in nature. These…
NASA Astrophysics Data System (ADS)
Delgado, Cesar
2013-06-01
Following a sociocultural perspective, this study investigates how students who have grown up using the SI (Système International d'Unités) (metric) or US customary (USC) systems of units for everyday use differ in their knowledge of scale and measurement. Student groups were similar in terms of socioeconomic status, curriculum, native language transparency of number word structure, type of school, and makeup by gender and grade level, while varying by native system of measurement. Their performance on several tasks was compared using binary logistic regression, ordinal logistic regression, and analysis of variance, with gender and grade level as covariates. Participants included 17 USC-native and 89 SI-native students in a school in Mexico, and 31 USC-native students in a school in the Midwestern USA. SI-native students performed at a significantly higher level estimating the length of a metre and a conceptual task (coordinating relative size and absolute size). No statistically significant differences were found on tasks involving factual knowledge about objects or units, scale construction, or estimation of other units. USC-native students in the US school performed at a higher level on smallest known object. These findings suggest that the more transparent SI system better supports conceptual thinking about scale and measurement than the idiosyncratic USC system. Greater emphasis on the SI system and more complete adoption of the SI system for everyday life may improve understanding among US students. Advancing sociocultural theory, systems of units were found to mediate learner's understanding of scale and measurement, much as number words mediate counting and problem solving.
Missed opportunities for concurrent HIV-STD testing in an academic emergency department.
Klein, Pamela W; Martin, Ian B K; Quinlivan, Evelyn B; Gay, Cynthia L; Leone, Peter A
2014-01-01
We evaluated emergency department (ED) provider adherence to guidelines for concurrent HIV-sexually transmitted disease (STD) testing within an expanded HIV testing program and assessed demographic and clinical factors associated with concurrent HIV-STD testing. We examined concurrent HIV-STD testing in a suburban academic ED with a targeted, expanded HIV testing program. Patients aged 18-64 years who were tested for syphilis, gonorrhea, or chlamydia in 2009 were evaluated for concurrent HIV testing. We analyzed demographic and clinical factors associated with concurrent HIV-STD testing using multivariate logistic regression with a robust variance estimator or, where applicable, exact logistic regression. Only 28.3% of patients tested for syphilis, 3.8% tested for gonorrhea, and 3.8% tested for chlamydia were concurrently tested for HIV during an ED visit. Concurrent HIV-syphilis testing was more likely among younger patients aged 25-34 years (adjusted odds ratio [AOR] = 0.36, 95% confidence interval [CI] 0.78, 2.10) and patients with STD-related chief complaints at triage (AOR=11.47, 95% CI 5.49, 25.06). Concurrent HIV-gonorrhea/chlamydia testing was more likely among men (gonorrhea: AOR=3.98, 95% CI 2.25, 7.02; chlamydia: AOR=3.25, 95% CI 1.80, 5.86) and less likely among patients with STD-related chief complaints at triage (gonorrhea: AOR=0.31, 95% CI 0.13, 0.82; chlamydia: AOR=0.21, 95% CI 0.09, 0.50). Concurrent HIV-STD testing in an academic ED remains low. Systematic interventions that remove the decision-making burden of ordering an HIV test from providers may increase HIV testing in this high-risk population of suspected STD patients.
Malhotra, Prashant; Luka, Arthur; McWilliams, Carla S; Poeth, Kaitlin G; Schwartz, Rebecca; Elfekey, Mohammed; Balwan, Sandy
2016-08-01
Respiratory viral illnesses (RVI) are reliably diagnosed by respiratory viral panel using polymerase chain reaction (RVP-PCR); however, owing to the scant data, clinical presentation alone is unreliable in establishing viral etiology. The primary objective of this study was to characterize signs and symptoms of RVI among inpatients in a major tertiary care hospital. Between 2013 and 2015, adult inpatients with RVI undergoing RVP-PCR were prospectively enrolled in our study. Clinical data were collected by interviews and electronic medical record reviews. Data analysis was performed using χ² testing, analysis of variance for continuous variables, and logistic regression modeling. Of 421 patients analyzed, 175 (41.7%) had a positive RVP-PCR. Patients were evenly matched at baseline except for renal disease. Multivariate logistic regression modeling demonstrated the following positive correlations: positive RVP-PCR with renal disease (odds ratio [OR] 2.08), cough (OR 2.28), and wheezing (OR 1.8); influenza with cough (OR 5.04), and renal disease (OR 2.17); metapneumovirus with age older than 65 (OR 3.24); respiratory syncytial viruses with wheezing (OR 3.42) and immunosuppression (OR 3.11); and parainfluenza with smoking (OR 3.16). Negative correlations included influenza with anosmia (OR 0.41); rhinovirus/enterovirus with feeling confined to bed (OR 0.3); metapneumovirus with smoking (OR 0.29); and parainfluenza with male sex (OR 0.22). In this descriptive study, we noted specific viral associations with clinical signs and symptoms among 421 inpatients with RVIs. With increasing RVP-PCR use, studies similar to ours may be able to better define the clinical presentation of RVIs and lead to evidence-based, clinical presentation-guided diagnostic and management algorithms.
Gleason, Jessica L; Beck, Kenneth H
2017-05-01
The purpose of this study was to determine how frequent permanent change of station moves and turnover in primary care providers are associated with continuity of care and patient satisfaction in military spouses. These domains have been studied extensively in civilian populations, but this study seeks to begin filling a gap in the literature surrounding military spouses and their experiences with the military health system. Spouses were recruited via social media to complete a brief online questionnaire to examine factors related to continuity of care and satisfaction with military health care. Results were analyzed using analysis of variance and χ² tests, and through logistic regression. Continuity of care scores were significantly lower as the number of moves and providers increased. Patient satisfaction was also significantly associated with continuity. In logistic regression analyses, patient-provider relationship and health status were the only significant predictors across two measures of patient satisfaction. Respondents with higher relationship scores were nearly two times more likely to report being satisfied than those with lower scores. Qualitative results indicated that the majority of dissatisfied spouses were unhappy with their military providers, which supported quantitative findings related to patient-provider relationship. No studies have previously been conducted to determine why military health system beneficiaries are less satisfied with care than their civilian counterparts. Discontinuous care is an ongoing issue for military families, which can impact satisfaction and potentially lead to poorer health outcomes. Although the military culture may not allow for fewer relocations, these results indicate that taking steps to promote enduring, trusting relationships with primary care providers may improve patient satisfaction. Reprint & Copyright © 2017 Association of Military Surgeons of the U.S.
Prediction of first episode of panic attack among white-collar workers.
Watanabe, Akira; Nakao, Kazuhisa; Tokuyama, Madoka; Takeda, Masatoshi
2005-04-01
The purpose of the present study was to elucidate a longitudinal matrix of the etiology for first-episode panic attack among white-collar workers. A path model was designed for this purpose. A 5-year, open-cohort study was carried out in a Japanese company. To evaluate the risk factors associated with the onset of a first episode of panic attack, the odds ratios of a new episode of panic attack were calculated by logistic regression. The path model contained five predictor variables: gender difference, overprotection, neuroticism, lifetime history of major depression, and recent stressful life events. The logistic regression analysis indicated that a person with a lifetime history of major depression and recent stressful life events had a fivefold and a threefold higher risk of panic attacks at follow up, respectively. The path model for the prediction of a first episode of panic attack fitted the data well. However, this model presented low accountability for the variance in the ultimate dependent variables, the first episode of panic attack. Three predictors (neuroticism, lifetime history of major depression, and recent stressful life events) had a direct effect on the risk for a first episode of panic attack, whereas gender difference and overprotection had no direct effect. The present model could not fully predict first episodes of panic attack in white-collar workers. To make a path model for the prediction of the first episode of panic attack, other strong predictor variables, which were not surveyed in the present study, are needed. It is suggested that genetic variables are among the other strong predictor variables. A new path model containing genetic variables (e.g. family history etc.) will be needed to predict the first episode of panic attack.
Chang, Chun-Jen; Pei, Dee; Wu, Chien-Chih; Palmer, Mary H; Su, Ching-Chieh; Kuo, Shu-Fen; Liao, Yuan-Mei
2017-07-01
To explore correlates of nocturia, compare sleep quality and glycemic control for women with and without nocturia, and examine relationships of nocturia with sleep quality and glycemic control in women with diabetes. This study was a cross-sectional, correlational study with data collected from 275 women with type 2 diabetes. Data were collected using a structured questionnaire. Multivariate logistic regression analyses were used to identify correlates. Chi-squared tests were used to identify candidate variables for the first logistic regression model. A one-way analysis of variance was used to compare sleep quality and glycemic control for women with and those without nocturia. Pearson correlations were used to examine the relationships of nocturia with sleep quality and glycemic control. Of the 275 participants, 124 (45.1%) had experienced nocturia (at least two voids per night). Waist circumference, parity, time since diagnosis of diabetes, sleep quality, and increased daytime urinary frequency were correlated with nocturia after adjusting for age. Compared to women without nocturia, women who had nocturia reported poorer sleep quality. A significant correlation was found between the number of nocturnal episodes and sleep quality. Nocturia and poor sleep are common among women with diabetes. The multifactorial nature of nocturia supports the delivered management and treatments being targeted to underlying etiologies in order to optimize women's symptom management. Interventions aimed at modifiable correlates may include maintaining a normal body weight and regular physical exercise for maintaining a normal waist circumference, and decreasing caffeine consumption, implementing feasible modifications in sleeping environments and maintaining sleep hygiene to improve sleep quality. 
Healthcare professionals should screen for nocturia and poor sleep and offer appropriate nonpharmacological lifestyle management, behavioral interventions, or pharmacotherapy for women with diabetes. © 2017 Sigma Theta Tau International.
Wang, Kesheng; Liu, Ying; Ouedraogo, Youssoufou; Wang, Nianyang; Xie, Xin; Xu, Chun; Luo, Xingguang
2018-05-01
Early alcohol, tobacco and drug use prior to 18 years old are comorbid and correlated. This study included 6239 adults with major depressive disorder (MDD) in the past year and 72,010 controls from the combined data of the 2013 and 2014 National Survey on Drug Use and Health (NSDUH). To deal with multicollinearity among 17 variables related to early alcohol, tobacco and drug use prior to 18 years old, we used principal component analysis (PCA) to infer PC scores and then used weighted multiple logistic regression analyses to estimate the associations of potential factors and PC scores with MDD. The odds ratios (ORs) with 95% confidence intervals (CIs) were estimated. The overall prevalence of MDD was 6.7%. The first four PCs could explain 57% of the total variance. Weighted multiple logistic regression showed that PC 1 (a measure of psychotherapeutic drugs and illicit drugs other than marijuana use), PC 2 (a measure of cocaine and hallucinogens), PC 3 (a measure of early alcohol, cigarettes, and marijuana use), and PC 4 (a measure of cigar, smokeless tobacco use and illicit drugs use) revealed significant associations with MDD (OR = 1.12, 95% CI = 1.08-1.16, OR = 1.08, 95% CI = 1.04-1.12, OR = 1.13, 95% CI = 1.07-1.18, and OR = 1.15, 95% CI = 1.09-1.21, respectively). In conclusion, PCA can be used to reduce the indicators in complex survey data. Early alcohol, tobacco and drug use prior to 18 years old were found to be associated with increased odds of adult MDD. Copyright © 2018 Elsevier Ltd. All rights reserved.
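The PCA step described above (replacing correlated indicators with a few uncorrelated component scores before regression) can be sketched as follows. This is a minimal illustration on simulated data, assuming NumPy is available; the sample size, number of indicators, and noise level are hypothetical and unrelated to the NSDUH data, and survey weights are not modeled.

```python
import numpy as np

rng = np.random.default_rng(0)

# Simulate 6 correlated indicators driven by 2 latent factors (hypothetical data).
n, p = 500, 6
latent = rng.normal(size=(n, 2))
loadings = rng.normal(size=(2, p))
X = latent @ loadings + 0.5 * rng.normal(size=(n, p))

# Standardize each indicator, then obtain principal components via SVD.
Z = (X - X.mean(axis=0)) / X.std(axis=0, ddof=1)
U, s, Vt = np.linalg.svd(Z, full_matrices=False)

# Share of total variance explained by each component.
explained = s**2 / np.sum(s**2)

# Scores on the first k components; these are mutually uncorrelated and
# could enter a logistic regression in place of the original indicators.
k = 2
scores = Z @ Vt[:k].T

print("variance explained:", np.round(explained, 3))
print("score correlation:", round(float(np.corrcoef(scores.T)[0, 1]), 6))
```

Because the component scores are orthogonal by construction, the multicollinearity present among the original indicators does not carry over to the regression on the scores.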
Doostvandi, Tayebeh; Bahadoran, Zahra; Mozaffari-Khosravi, Hassan; Tahmasebinejad, Zhaleh; Mirmiran, Parvin; Azizi, Fereidoun
2017-05-01
The aim of this study was to investigate the relationship between major dietary patterns and the risk of insulin resistance (IR) among an urban Iranian population. In this longitudinal study, 802 adult men and women were studied within the framework of the Tehran Lipid and Glucose Study. Fasting serum insulin and glucose were measured at baseline and again after 3 years of follow-up. The usual dietary intakes were assessed using a validated 168-item semi-quantitative food frequency questionnaire, and major dietary patterns were obtained using principal component analysis. Logistic regression models were used to estimate the occurrence of IR across tertiles of dietary patterns with adjustment for potential confounding variables. Mean age of participants was 39.0±11.2 years and 45.5% were men. Three major dietary patterns (Western, traditional, and healthy) were extracted, which explained 25.3% of total variance in food intake. The healthy dietary pattern, loaded heavily on intake of vegetable oils, fresh and dried fruits, low-fat dairy, nuts and seeds, was associated with a reduced risk of insulin resistance by 51% (OR=0.49, 95% CI=0.30-0.81) and 81% (OR=0.19, 95% CI=0.10-0.36) in the second and third tertile, respectively (p trend=0.001). In the presence of all dietary pattern scores in the logistic regression model, a 45% reduced risk of IR was observed per 1 unit increase in healthy dietary pattern score. These findings confirmed the protective effect of a plant-based, low-fat dietary pattern against the development of insulin resistance as a main risk factor of type 2 diabetes and metabolic disorders.
2013-01-01
Background: Malnutrition is one of the principal causes of child mortality in developing countries including Bangladesh. To our knowledge, most of the available studies that addressed the issue of malnutrition among under-five children considered categorical (dichotomous/polychotomous) outcome variables and applied logistic regression (binary/multinomial) to find their predictors. In this study the malnutrition variable (i.e. the outcome) is defined as the number of under-five malnourished children in a family, which is a non-negative count variable. The purposes of the study are (i) to demonstrate the applicability of the generalized Poisson regression (GPR) model as an alternative to other statistical methods and (ii) to find some predictors of this outcome variable. Methods: The data are extracted from the Bangladesh Demographic and Health Survey (BDHS) 2007. Briefly, this survey employs a nationally representative sample which is based on a two-stage stratified sample of households. A total of 4,460 under-five children are analysed using various statistical techniques, namely the Chi-square test and the GPR model. Results: The GPR model (as compared to the standard Poisson regression and negative binomial regression) is found to be justified for studying the above-mentioned outcome variable because of its under-dispersion (variance < mean) property. Our study also identifies several significant predictors of the outcome variable, namely mother’s education, father’s education, wealth index, sanitation status, source of drinking water, and total number of children ever born to a woman. Conclusions: Consistencies of our findings in light of many other studies suggest that the GPR model is an ideal alternative to other statistical models for analysing the number of under-five malnourished children in a family. Strategies based on significant predictors may improve the nutritional status of children in Bangladesh. PMID:23297699
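The under-dispersion criterion mentioned in the Results (variance < mean) can be checked directly with a dispersion index, the ratio of the sample variance to the sample mean. The sketch below uses a small set of hypothetical counts, not the BDHS data. A standard Poisson model implies an index near 1, so a value clearly below 1 suggests a model that allows under-dispersion, such as the generalized Poisson.

```python
from statistics import mean, variance

def dispersion_index(counts):
    """Sample variance divided by sample mean; about 1 under a Poisson model."""
    return variance(counts) / mean(counts)

# Hypothetical counts of malnourished under-five children per family.
counts = [0, 1, 1, 1, 2, 1, 0, 1, 2, 1, 1, 1]

index = dispersion_index(counts)
label = "under-dispersed" if index < 1 else "equi- or over-dispersed"
print(f"mean={mean(counts):.2f}  variance={variance(counts):.3f}  "
      f"index={index:.3f} ({label})")
```

This is only a marginal check; in a regression setting the dispersion is assessed conditional on the covariates.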
Predicting Social Trust with Binary Logistic Regression
ERIC Educational Resources Information Center
Adwere-Boamah, Joseph; Hufstedler, Shirley
2015-01-01
This study used binary logistic regression to predict social trust with five demographic variables from a national sample of adult individuals who participated in The General Social Survey (GSS) in 2012. The five predictor variables were respondents' highest degree earned, race, sex, general happiness and the importance of personally assisting…
Effect of folic acid on appetite in children: ordinal logistic and fuzzy logistic regressions.
Namdari, Mahshid; Abadi, Alireza; Taheri, S Mahmoud; Rezaei, Mansour; Kalantari, Naser; Omidvar, Nasrin
2014-03-01
Reduced appetite and low food intake are often a concern in preschool children, since it can lead to malnutrition, a leading cause of impaired growth and mortality in childhood. It is occasionally considered that folic acid has a positive effect on appetite enhancement and consequently growth in children. The aim of this study was to assess the effect of folic acid on the appetite of preschool children 3 to 6 y old. The study sample included 127 children ages 3 to 6 who were randomly selected from 20 preschools in the city of Tehran in 2011. Since appetite was measured by linguistic terms, a fuzzy logistic regression was applied for modeling. The obtained results were compared with a statistical ordinal logistic model. After controlling for the potential confounders, in a statistical ordinal logistic model, serum folate showed a significantly positive effect on appetite. A small but positive effect of folate was detected by fuzzy logistic regression. Based on fuzzy regression, the risk for poor appetite in preschool children was related to the employment status of their mothers. In this study, a positive association was detected between the levels of serum folate and improved appetite. For further investigation, a randomized controlled, double-blind clinical trial could be helpful to address causality. Copyright © 2014 Elsevier Inc. All rights reserved.
Silva, F G; Torres, R A; Brito, L F; Euclydes, R F; Melo, A L P; Souza, N O; Ribeiro, J I; Rodrigues, M T
2013-12-11
The objective of this study was to identify the best random regression model using Legendre orthogonal polynomials to evaluate Alpine goats genetically and to estimate the parameters for test day milk yield. We analyzed 20,710 test day milk yield records of 667 goats from the Goat Sector of the Universidade Federal de Viçosa. The evaluated models combined distinct fitting orders for the fixed curve (2-5), the random additive genetic curve (1-7), and the permanent environmental curve (1-7), and different numbers of classes for residual variance (2, 4, 5, and 6). WOMBAT software was used for all genetic analyses. The best random regression model using Legendre orthogonal polynomials for genetic evaluation of test day milk yield of Alpine goats had a fixed curve of order 4, a curve of additive genetic effects of order 2, a curve of permanent environmental effects of order 7, and a minimum of 5 classes of residual variance, because it was the most parsimonious model among those that were equivalent to the complete model by the likelihood ratio test. Phenotypic variance and heritability were higher at the end of the lactation period, indicating that the length of lactation has more genetic components in relation to the production peak and persistence. It is very important that the evaluation utilizes the best combination of fixed, additive genetic and permanent environmental regressions, and number of classes of heterogeneous residual variance for genetic evaluation using random regression models, thereby enhancing the precision and accuracy of the estimates of parameters and prediction of genetic values.
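Random regression models of this kind use Legendre polynomials evaluated at a standardized time variable as covariates. The sketch below shows how such a basis can be built, assuming days in milk are rescaled to [-1, 1] and the polynomials are normalized by sqrt((2n + 1)/2). The lactation range of 5 to 305 days is a hypothetical choice, and the code is not taken from WOMBAT. Note that "order" in this literature sometimes counts coefficients rather than polynomial degree, so the argument here is named by degree to avoid ambiguity.

```python
import math

def standardize(t, t_min, t_max):
    """Map a time point in [t_min, t_max] onto [-1, 1]."""
    return -1.0 + 2.0 * (t - t_min) / (t_max - t_min)

def legendre_basis(x, max_degree):
    """Normalized Legendre polynomials of degree 0..max_degree evaluated at x."""
    # Bonnet recurrence: (n + 1) P_{n+1}(x) = (2n + 1) x P_n(x) - n P_{n-1}(x)
    p = [1.0, x]
    for n in range(1, max_degree):
        p.append(((2 * n + 1) * x * p[n] - n * p[n - 1]) / (n + 1))
    p = p[: max_degree + 1]
    return [math.sqrt((2 * n + 1) / 2.0) * p[n] for n in range(max_degree + 1)]

# Covariates for a test day at 155 days in milk (hypothetical range 5-305 days).
x = standardize(155, 5, 305)
row = legendre_basis(x, 3)
print([round(v, 4) for v in row])
```

Each test day record contributes one such row to the design matrices for the fixed, additive genetic, and permanent environmental curves, with a possibly different maximum degree for each.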
Naserkheil, Masoumeh; Miraie-Ashtiani, Seyed Reza; Nejati-Javaremi, Ardeshir; Son, Jihyun; Lee, Deukhwan
2016-01-01
The objective of this study was to estimate the genetic parameters of milk protein yields in Iranian Holstein dairy cattle. A total of 1,112,082 test-day milk protein yield records of 167,269 first lactation Holstein cows, calved from 1990 to 2010, were analyzed. Estimates of the variance components, heritability, and genetic correlations for milk protein yields were obtained using a random regression test-day model. Milking times, herd, age of recording, year, and month of recording were included as fixed effects in the model. Additive genetic and permanent environmental random effects for the lactation curve were taken into account by applying orthogonal Legendre polynomials of the fourth order in the model. The lowest and highest additive genetic variances were estimated at the beginning and end of lactation, respectively. Permanent environmental variance was higher at both extremes. Residual variance was lowest at the middle of the lactation and contrarily, heritability increased during this period. Maximum heritability was found during the 12th lactation stage (0.213±0.007). Genetic, permanent, and phenotypic correlations among test-days decreased as the interval between consecutive test-days increased. A relatively large data set was used in this study; therefore, the estimated (co)variance components for random regression coefficients could be used for national genetic evaluation of dairy cattle in Iran. PMID:26954192
Estimation of Additive, Dominance, and Imprinting Genetic Variance Using Genomic Data
Lopes, Marcos S.; Bastiaansen, John W. M.; Janss, Luc; Knol, Egbert F.; Bovenhuis, Henk
2015-01-01
Traditionally, exploration of genetic variance in humans, plants, and livestock species has been limited mostly to the use of additive effects estimated using pedigree data. However, with the development of dense panels of single-nucleotide polymorphisms (SNPs), the exploration of genetic variation of complex traits is moving from quantifying the resemblance between family members to the dissection of genetic variation at individual loci. With SNPs, we were able to quantify the contribution of additive, dominance, and imprinting variance to the total genetic variance by using a SNP regression method. The method was validated in simulated data and applied to three traits (number of teats, backfat, and lifetime daily gain) in three purebred pig populations. In simulated data, the estimates of additive, dominance, and imprinting variance were very close to the simulated values. In real data, dominance effects account for a substantial proportion of the total genetic variance (up to 44%) for these traits in these populations. The contribution of imprinting to the total phenotypic variance of the evaluated traits was relatively small (1–3%). Our results indicate a strong relationship between additive variance explained per chromosome and chromosome length, which has been described previously for other traits in other species. We also show that a similar linear relationship exists for dominance and imprinting variance. These novel results improve our understanding of the genetic architecture of the evaluated traits and show promise for applying the SNP regression method to other traits and species, including human diseases. PMID:26438289
Clustering performance comparison using K-means and expectation maximization algorithms.
Jung, Yong Gyu; Kang, Min Soo; Heo, Jun
2014-11-14
Clustering is an important means of data mining based on separating data categories by similar features. Unlike the classification algorithm, clustering belongs to the unsupervised type of algorithms. Two representatives of the clustering algorithms are the K-means and the expectation maximization (EM) algorithm. Linear regression analysis was extended to the category-type dependent variable, while logistic regression was achieved using a linear combination of independent variables. To predict the possibility of occurrence of an event, a statistical approach is used. However, the classification of all data by means of logistic regression analysis cannot guarantee the accuracy of the results. In this paper, the logistic regression analysis is applied to EM clusters and the K-means clustering method for quality assessment of red wine, and a method is proposed for ensuring the accuracy of the classification results.
Delva, J; Spencer, M S; Lin, J K
2000-01-01
This article compares estimates of the relative odds of nitrite use obtained from weighted unconditional logistic regression with estimates obtained from conditional logistic regression after post-stratification and matching of cases with controls by neighborhood of residence. We illustrate these methods by comparing the odds associated with nitrite use among adults of four racial/ethnic groups, with and without a high school education. We used aggregated data from the 1994-B through 1996 National Household Survey on Drug Abuse (NHSDA). Difference between the methods and implications for analysis and inference are discussed.
Mauer, Michael; Caramori, Maria Luiza; Fioretto, Paola; Najafian, Behzad
2015-06-01
Studies of structural-functional relationships have improved understanding of the natural history of diabetic nephropathy (DN). However, in order to consider structural end points for clinical trials, the robustness of the resultant models needs to be verified. This study examined whether structural-functional relationship models derived from a large cohort of type 1 diabetic (T1D) patients with a wide range of renal function are robust. The predictability of models derived from multiple regression analysis and piecewise linear regression analysis was also compared. T1D patients (n = 161) with research renal biopsies were divided into two equal groups matched for albumin excretion rate (AER). Models to explain AER and glomerular filtration rate (GFR) by classical DN lesions in one group (T1D-model, or T1D-M) were applied to the other group (T1D-test, or T1D-T) and regression analyses were performed. T1D-M-derived models explained 70 and 63% of AER variance and 32 and 21% of GFR variance in T1D-M and T1D-T, respectively, supporting the substantial robustness of the models. Piecewise linear regression analyses substantially improved predictability of the models with 83% of AER variance and 66% of GFR variance explained by classical DN glomerular lesions alone. These studies demonstrate that DN structural-functional relationship models are robust, and if appropriate models are used, glomerular lesions alone explain a major proportion of AER and GFR variance in T1D patients. © The Author 2014. Published by Oxford University Press on behalf of ERA-EDTA. All rights reserved.
Austin, Peter C; Lee, Douglas S; Steyerberg, Ewout W; Tu, Jack V
2012-01-01
In biomedical research, the logistic regression model is the most commonly used method for predicting the probability of a binary outcome. While many clinical researchers have expressed an enthusiasm for regression trees, this method may have limited accuracy for predicting health outcomes. We aimed to evaluate the improvement that is achieved by using ensemble-based methods, including bootstrap aggregation (bagging) of regression trees, random forests, and boosted regression trees. We analyzed 30-day mortality in two large cohorts of patients hospitalized with either acute myocardial infarction (N = 16,230) or congestive heart failure (N = 15,848) in two distinct eras (1999–2001 and 2004–2005). We found that both the in-sample and out-of-sample prediction of ensemble methods offered substantial improvement in predicting cardiovascular mortality compared to conventional regression trees. However, conventional logistic regression models that incorporated restricted cubic smoothing splines had even better performance. We conclude that ensemble methods from the data mining and machine learning literature increase the predictive performance of regression trees, but may not lead to clear advantages over conventional logistic regression models for predicting short-term mortality in population-based samples of subjects with cardiovascular disease. PMID:22777999
NASA Technical Reports Server (NTRS)
MCKissick, Burnell T. (Technical Monitor); Plassman, Gerald E.; Mall, Gerald H.; Quagliano, John R.
2005-01-01
Linear multivariable regression models for predicting day and night Eddy Dissipation Rate (EDR) from available meteorological data sources are defined and validated. Model definition is based on a combination of 1997-2000 Dallas/Fort Worth (DFW) data sources, EDR from Aircraft Vortex Spacing System (AVOSS) deployment data, and regression variables primarily from corresponding Automated Surface Observation System (ASOS) data. Model validation is accomplished through EDR predictions on a similar combination of 1994-1995 Memphis (MEM) AVOSS and ASOS data. Model forms include an intercept plus a single term of fixed optimal power for each of these regression variables; 30-minute forward averaged mean and variance of near-surface wind speed and temperature, variance of wind direction, and a discrete cloud cover metric. Distinct day and night models, regressing on EDR and the natural log of EDR respectively, yield best performance and avoid model discontinuity over day/night data boundaries.
ERIC Educational Resources Information Center
Fidalgo, Angel M.; Alavi, Seyed Mohammad; Amirian, Seyed Mohammad Reza
2014-01-01
This study examines three controversial aspects in differential item functioning (DIF) detection by logistic regression (LR) models: first, the relative effectiveness of different analytical strategies for detecting DIF; second, the suitability of the Wald statistic for determining the statistical significance of the parameters of interest; and…
ERIC Educational Resources Information Center
French, Brian F.; Maller, Susan J.
2007-01-01
Two unresolved implementation issues with logistic regression (LR) for differential item functioning (DIF) detection include ability purification and effect size use. Purification is suggested to control inaccuracies in DIF detection as a result of DIF items in the ability estimate. Additionally, effect size use may be beneficial in controlling…
A Note on Three Statistical Tests in the Logistic Regression DIF Procedure
ERIC Educational Resources Information Center
Paek, Insu
2012-01-01
Although logistic regression became one of the well-known methods in detecting differential item functioning (DIF), its three statistical tests, the Wald, likelihood ratio (LR), and score tests, which are readily available under the maximum likelihood, do not seem to be consistently distinguished in DIF literature. This paper provides a clarifying…
ERIC Educational Resources Information Center
West, Lindsey M.; Davis, Telsie A.; Thompson, Martie P.; Kaslow, Nadine J.
2011-01-01
Protective factors for fostering reasons for living were examined among low-income, suicidal, African American women. Bivariate logistic regressions revealed that higher levels of optimism, spiritual well-being, and family social support predicted reasons for living. Multivariate logistic regressions indicated that spiritual well-being showed…
Comparison of Two Approaches for Handling Missing Covariates in Logistic Regression
ERIC Educational Resources Information Center
Peng, Chao-Ying Joanne; Zhu, Jin
2008-01-01
For the past 25 years, methodological advances have been made in missing data treatment. Most published work has focused on missing data in dependent variables under various conditions. The present study seeks to fill the void by comparing two approaches for handling missing data in categorical covariates in logistic regression: the…
Comparison of IRT Likelihood Ratio Test and Logistic Regression DIF Detection Procedures
ERIC Educational Resources Information Center
Atar, Burcu; Kamata, Akihito
2011-01-01
The Type I error rates and the power of IRT likelihood ratio test and cumulative logit ordinal logistic regression procedures in detecting differential item functioning (DIF) for polytomously scored items were investigated in this Monte Carlo simulation study. For this purpose, 54 simulation conditions (combinations of 3 sample sizes, 2 sample…
Multiple Logistic Regression Analysis of Cigarette Use among High School Students
ERIC Educational Resources Information Center
Adwere-Boamah, Joseph
2011-01-01
A binary logistic regression analysis was performed to predict high school students' cigarette smoking behavior from selected predictors from 2009 CDC Youth Risk Behavior Surveillance Survey. The specific target student behavior of interest was frequent cigarette use. Five predictor variables included in the model were: a) race, b) frequency of…
ERIC Educational Resources Information Center
Anderson, Carolyn J.; Verkuilen, Jay; Peyton, Buddy L.
2010-01-01
Survey items with multiple response categories and multiple-choice test questions are ubiquitous in psychological and educational research. We illustrate the use of log-multiplicative association (LMA) models that are extensions of the well-known multinomial logistic regression model for multiple dependent outcome variables to reanalyze a set of…
Propensity Score Estimation with Data Mining Techniques: Alternatives to Logistic Regression
ERIC Educational Resources Information Center
Keller, Bryan S. B.; Kim, Jee-Seon; Steiner, Peter M.
2013-01-01
Propensity score analysis (PSA) is a methodological technique which may correct for selection bias in a quasi-experiment by modeling the selection process using observed covariates. Because logistic regression is well understood by researchers in a variety of fields and easy to implement in a number of popular software packages, it has…
Two-factor logistic regression in pediatric liver transplantation
NASA Astrophysics Data System (ADS)
Uzunova, Yordanka; Prodanova, Krasimira; Spasov, Lyubomir
2017-12-01
Using a two-factor logistic regression analysis an estimate is derived for the probability of absence of infections in the early postoperative period after pediatric liver transplantation. The influence of both the bilirubin level and the international normalized ratio of prothrombin time of blood coagulation at the 5th postoperative day is studied.
ERIC Educational Resources Information Center
Courtney, Jon R.; Prophet, Retta
2011-01-01
Placement instability is often associated with a number of negative outcomes for children. To gain state level contextual knowledge of factors associated with placement stability/instability, logistic regression was applied to selected variables from the New Mexico Adoption and Foster Care Administrative Reporting System dataset. Predictors…
Classifying machinery condition using oil samples and binary logistic regression
NASA Astrophysics Data System (ADS)
Phillips, J.; Cripps, E.; Lau, John W.; Hodkiewicz, M. R.
2015-08-01
The era of big data has resulted in an explosion of condition monitoring information. The result is an increasing motivation to automate the costly and time-consuming human elements involved in the classification of machine health. When working with industry it is important to build an understanding of, and hence some trust in, the classification scheme for those who use the analysis to initiate maintenance tasks. Typically, "black box" approaches such as artificial neural networks (ANN) and support vector machines (SVM) can be difficult to interpret. In contrast, this paper argues that logistic regression offers easy interpretability to industry experts, providing insight into the drivers of the human classification process and into the ramifications of potential misclassification. Of course, accuracy is of foremost importance in any automated classification scheme, so we also provide a comparative study based on predictive performance of logistic regression, ANN and SVM. A real-world oil analysis data set from engines on mining trucks is presented and, using cross-validation, we demonstrate that logistic regression out-performs the ANN and SVM approaches in terms of prediction for healthy/not healthy engines.
Length bias correction in gene ontology enrichment analysis using logistic regression.
Mi, Gu; Di, Yanming; Emerson, Sarah; Cumbie, Jason S; Chang, Jeff H
2012-01-01
When assessing differential gene expression from RNA sequencing data, commonly used statistical tests tend to have greater power to detect differential expression of genes encoding longer transcripts. This phenomenon, called "length bias", will influence subsequent analyses such as Gene Ontology enrichment analysis. In the presence of length bias, Gene Ontology categories that include longer genes are more likely to be identified as enriched. These categories, however, are not necessarily biologically more relevant. We show that one can effectively adjust for length bias in Gene Ontology analysis by including transcript length as a covariate in a logistic regression model. The logistic regression model makes the statistical issue underlying length bias more transparent: transcript length becomes a confounding factor when it correlates with both the Gene Ontology membership and the significance of the differential expression test. The inclusion of the transcript length as a covariate allows one to investigate the direct correlation between the Gene Ontology membership and the significance of testing differential expression, conditional on the transcript length. We present both real and simulated data examples to show that the logistic regression approach is simple, effective, and flexible.
Lee, Seokho; Shin, Hyejin; Lee, Sang Han
2016-12-01
Alzheimer's disease (AD) is usually diagnosed by clinicians through cognitive and functional performance tests, with a potential risk of misdiagnosis. Since the progression of AD is known to cause structural changes in the corpus callosum (CC), the CC thickness can be used as a functional covariate in the AD classification problem for a diagnosis. However, misclassified class labels negatively impact the classification performance. Motivated by AD-CC association studies, we propose a logistic regression for functional data classification that is robust to misdiagnosis or label noise. Specifically, our logistic regression model is constructed by adding individual intercepts to the functional logistic regression model. This approach makes it possible to indicate which observations are possibly mislabeled and also leads to a robust and efficient classifier. An effective algorithm based on the MM algorithm provides simple closed-form update formulas. We test our method using synthetic datasets to demonstrate its superiority over an existing method, and apply it to differentiating patients with AD from healthy normals based on CC from MRI. © 2016, The International Biometric Society.
Szekér, Szabolcs; Vathy-Fogarassy, Ágnes
2018-01-01
Logistic regression based propensity score matching is a widely used method in case-control studies to select the individuals of the control group. This method creates a suitable control group if all factors affecting the output variable are known. However, if relevant latent variables exist as well, which are not taken into account during the calculations, the quality of the control group is uncertain. In this paper, we present a statistics-based study in which we try to determine the relationship between the accuracy of the logistic regression model and the uncertainty of the dependent variable of the control group defined by propensity score matching. Our analyses show that there is a linear correlation between the fit of the logistic regression model and the uncertainty of the output variable. In certain cases, a latent binary explanatory variable can result in a relative error of up to 70% in the prediction of the outcome variable. The observed phenomenon calls the attention of analysts to an important point, which must be taken into account when drawing conclusions.
Logistic regression for circular data
NASA Astrophysics Data System (ADS)
Al-Daffaie, Kadhem; Khan, Shahjahan
2017-05-01
This paper considers the relationship between a binary response and a circular predictor. It develops the logistic regression model by employing the linear-circular regression approach. The maximum likelihood method is used to estimate the parameters. The Newton-Raphson numerical method is used to find the estimated values of the parameters. A data set from weather records of Toowoomba city is analysed by the proposed methods. Moreover, a simulation study is considered. The R software is used for all computations and simulations.
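One way to picture the approach described above is a logistic model whose linear predictor uses the cosine and sine of the angle, fitted by Newton-Raphson. The sketch below assumes that parameterization, logit(p) = b0 + b1 cos(theta) + b2 sin(theta); the abstract does not give the exact form, and the function names `solve` and `fit_circular_logistic` are hypothetical. The paper itself reports using R.

```python
import math

def solve(a, b):
    """Solve the small linear system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        x[i] = (m[i][n] - sum(m[i][j] * x[j] for j in range(i + 1, n))) / m[i][i]
    return x

def fit_circular_logistic(theta, y, iters=25):
    """Newton-Raphson for the assumed model logit(p) = b0 + b1*cos(theta) + b2*sin(theta)."""
    X = [[1.0, math.cos(t), math.sin(t)] for t in theta]
    beta = [0.0, 0.0, 0.0]
    for _ in range(iters):
        p = [1.0 / (1.0 + math.exp(-sum(b * v for b, v in zip(beta, row)))) for row in X]
        # Score vector X'(y - p) and information matrix X'WX with W = diag(p(1 - p)).
        grad = [sum(row[j] * (yi - pi) for row, yi, pi in zip(X, y, p)) for j in range(3)]
        info = [[sum(row[j] * row[k] * pi * (1 - pi) for row, pi in zip(X, p))
                 for k in range(3)] for j in range(3)]
        step = solve(info, grad)
        beta = [b + s for b, s in zip(beta, step)]
    return beta
```

The sketch runs a fixed number of iterations and has no safeguard against separated data, where the maximum likelihood estimate does not exist; a production fit would add a convergence check.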
Naval Research Logistics Quarterly. Volume 28. Number 3,
1981-09-01
denotes component-wise maximum. f has antitone (isotone) differences on C × D if for c1 < c2 and d1 < d2, ...or negative correlations and linear or nonlinear regressions. Given are the moments to order two and, for special cases, the regression function and...data sets. We designate this bnb distribution as G - B - N(a, 0, v). The distribution admits only of positive correlation and linear regressions
Bond, H S; Sullivan, S G; Cowling, B J
2016-06-01
Influenza vaccination is the most practical means available for preventing influenza virus infection and is widely used in many countries. Because vaccine components and circulating strains frequently change, it is important to continually monitor vaccine effectiveness (VE). The test-negative design is frequently used to estimate VE. In this design, patients meeting the same clinical case definition are recruited and tested for influenza; those who test positive are the cases and those who test negative form the comparison group. When determining VE in these studies, the typical approach has been to use logistic regression, adjusting for potential confounders. Because vaccine coverage and influenza incidence change throughout the season, time is included among these confounders. While most studies use unconditional logistic regression, adjusting for time, an alternative approach is to use conditional logistic regression, matching on time. Here, we used simulation data to examine the potential for both regression approaches to permit accurate and robust estimates of VE. In situations where vaccine coverage changed during the influenza season, the conditional model and unconditional models adjusting for categorical week and using a spline function for week provided more accurate estimates. We illustrated the two approaches on data from a test-negative study of influenza VE against hospitalization in children in Hong Kong which resulted in the conditional logistic regression model providing the best fit to the data.
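In the test-negative design, VE is conventionally reported as (1 − odds ratio) × 100%. The sketch below computes a crude, unadjusted estimate from a 2×2 table with a Wald interval on the log odds ratio. It is only an illustration of the arithmetic: the study estimates the odds ratio by logistic regression with adjustment for or matching on time, and the function name and argument names are assumptions.

```python
import math

def vaccine_effectiveness(vacc_cases, unvacc_cases, vacc_controls, unvacc_controls):
    """Crude VE (%) from a test-negative 2x2 table, with a Wald 95% CI.

    Cases are test-positive patients; controls are test-negative patients.
    Returns (ve, ve_lower, ve_upper).
    """
    odds_ratio = (vacc_cases * unvacc_controls) / (unvacc_cases * vacc_controls)
    se = math.sqrt(1 / vacc_cases + 1 / unvacc_cases
                   + 1 / vacc_controls + 1 / unvacc_controls)
    or_low = math.exp(math.log(odds_ratio) - 1.96 * se)
    or_high = math.exp(math.log(odds_ratio) + 1.96 * se)
    # VE = (1 - OR) * 100, so the upper OR bound gives the lower VE bound.
    return (1 - odds_ratio) * 100, (1 - or_high) * 100, (1 - or_low) * 100
```

With hypothetical counts of 20 vaccinated and 80 unvaccinated cases against 40 vaccinated and 60 unvaccinated controls, the odds ratio is 0.375 and the crude VE is 62.5%.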
Asghari, Mehdi Poursheikhali; Hayatshahi, Sayyed Hamed Sadat; Abdolmaleki, Parviz
2012-01-01
From both the structural and functional points of view, β-turns play important biological roles in proteins. In the present study, a novel two-stage hybrid procedure has been developed to identify β-turns in proteins. Binary logistic regression was used, for the first time, to select significant sequence parameters for the identification of β-turns by means of a re-substitution test procedure. Sequence parameters consisted of 80 amino acid positional occurrences and 20 amino acid percentages in sequence. Among these parameters, the most significant ones selected by the binary logistic regression model were the percentages of Gly and Ser and the occurrence of Asn in position i+2, respectively, in sequence. These significant parameters have the highest effect on the constitution of a β-turn sequence. A neural network model was then constructed and fed by the parameters selected by binary logistic regression to build a hybrid predictor. The networks have been trained and tested on a non-homologous dataset of 565 protein chains. Applying a nine-fold cross-validation test on the dataset, the network reached an overall accuracy (Qtotal) of 74, which is comparable with results of the other β-turn prediction methods. In conclusion, this study shows that the parameter selection ability of binary logistic regression together with the prediction capability of neural networks leads to the development of more precise models for identifying β-turns in proteins. PMID:27418910
Crane, Paul K; Gibbons, Laura E; Jolley, Lance; van Belle, Gerald
2006-11-01
We present an ordinal logistic regression model for identification of items with differential item functioning (DIF) and apply this model to a Mini-Mental State Examination (MMSE) dataset. We employ item response theory ability estimation in our models. Three nested ordinal logistic regression models are applied to each item. Model testing begins with examination of the statistical significance of the interaction term between ability and the group indicator, consistent with nonuniform DIF. Then we turn our attention to the coefficient of the ability term in models with and without the group term. If including the group term has a marked effect on that coefficient, we declare that it has uniform DIF. We examined DIF related to language of test administration in addition to self-reported race, Hispanic ethnicity, age, years of education, and sex. We used PARSCALE for IRT analyses and STATA for ordinal logistic regression approaches. We used an iterative technique for adjusting IRT ability estimates on the basis of DIF findings. Five items were found to have DIF related to language. These same items also had DIF related to other covariates. The ordinal logistic regression approach to DIF detection, when combined with IRT ability estimates, provides a reasonable alternative for DIF detection. There appear to be several items with significant DIF related to language of test administration in the MMSE. More attention needs to be paid to the specific criteria used to determine whether an item has DIF, not just the technique used to identify DIF.
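The three nested models described above can be written out as follows. The notation is an assumption for illustration, not taken from the paper: Y is the ordinal item response, θ the IRT ability estimate, g the group indicator, and the cumulative-logit (proportional odds) form is assumed.

```latex
\begin{align}
\text{Model 1:}\quad \operatorname{logit} P(Y \ge k) &= \alpha_k + \beta_1\,\theta + \beta_2\,g + \beta_3\,(\theta \times g) \\
\text{Model 2:}\quad \operatorname{logit} P(Y \ge k) &= \alpha_k + \beta_1\,\theta + \beta_2\,g \\
\text{Model 3:}\quad \operatorname{logit} P(Y \ge k) &= \alpha_k + \beta_1\,\theta
\end{align}
```

Under this reading, nonuniform DIF corresponds to a statistically significant interaction coefficient β3 in Model 1, and uniform DIF to a marked change in the ability coefficient β1 between Model 3 and Model 2.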
Conditional Poisson models: a flexible alternative to conditional logistic case cross-over analysis.
Armstrong, Ben G; Gasparrini, Antonio; Tobias, Aurelio
2014-11-24
The time stratified case cross-over approach is a popular alternative to conventional time series regression for analysing associations between time series of environmental exposures (air pollution, weather) and counts of health outcomes. These are almost always analyzed using conditional logistic regression on data expanded to case-control (case crossover) format, but this has some limitations. In particular adjusting for overdispersion and auto-correlation in the counts is not possible. It has been established that a Poisson model for counts with stratum indicators gives identical estimates to those from conditional logistic regression and does not have these limitations, but it is little used, probably because of the overheads in estimating many stratum parameters. The conditional Poisson model avoids estimating stratum parameters by conditioning on the total event count in each stratum, thus simplifying the computing and increasing the number of strata for which fitting is feasible compared with the standard unconditional Poisson model. Unlike the conditional logistic model, the conditional Poisson model does not require expanding the data, and can adjust for overdispersion and auto-correlation. It is available in Stata, R, and other packages. By applying the models to some real data and using simulations, we demonstrate that conditional Poisson models are simpler to code and faster to run than conditional logistic analyses and can be fitted to larger data sets than is possible with standard Poisson models. Allowing for overdispersion or autocorrelation was possible with the conditional Poisson model, but when this was not required the model gave identical estimates to those from conditional logistic regression. Conditional Poisson regression models provide an alternative to case crossover analysis of stratified time series data with some advantages.
The conditional Poisson model can also be used in other contexts in which primary control for confounding is by fine stratification.
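The strata in a time-stratified design are commonly defined by year, month, and day of the week; the abstract does not state the stratification used, so that choice is an assumption here. The sketch below lists the referent days that share a stratum with a given event day, which is the expansion step that conditional logistic analysis requires and the conditional Poisson model avoids. The function name `referent_days` is hypothetical.

```python
from datetime import date, timedelta

def referent_days(event_day):
    """Return the other days in the same year, month, and weekday stratum as event_day."""
    days = []
    d = date(event_day.year, event_day.month, 1)
    # Walk through the calendar month and keep matching weekdays other than the event day.
    while d.month == event_day.month:
        if d.weekday() == event_day.weekday() and d != event_day:
            days.append(d)
        d += timedelta(days=1)
    return days
```

For a conditional Poisson fit, the same year-month-weekday key would simply be attached to each daily count as the stratum identifier, with no expansion to case and referent rows.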
Use of generalized ordered logistic regression for the analysis of multidrug resistance data.
Agga, Getahun E; Scott, H Morgan
2015-10-01
Statistical analysis of antimicrobial resistance data largely focuses on individual antimicrobial's binary outcome (susceptible or resistant). However, bacteria are becoming increasingly multidrug resistant (MDR). Statistical analysis of MDR data is mostly descriptive often with tabular or graphical presentations. Here we report the applicability of generalized ordinal logistic regression model for the analysis of MDR data. A total of 1,152 Escherichia coli, isolated from the feces of weaned pigs experimentally supplemented with chlortetracycline (CTC) and copper, were tested for susceptibilities against 15 antimicrobials and were binary classified into resistant or susceptible. The 15 antimicrobial agents tested were grouped into eight different antimicrobial classes. We defined MDR as the number of antimicrobial classes to which E. coli isolates were resistant ranging from 0 to 8. Proportionality of the odds assumption of the ordinal logistic regression model was violated only for the effect of treatment period (pre-treatment, during-treatment and post-treatment); but not for the effect of CTC or copper supplementation. Subsequently, a partially constrained generalized ordinal logistic model was built that allows for the effect of treatment period to vary while constraining the effects of treatment (CTC and copper supplementation) to be constant across the levels of MDR classes. Copper (Proportional Odds Ratio [Prop OR]=1.03; 95% CI=0.73-1.47) and CTC (Prop OR=1.1; 95% CI=0.78-1.56) supplementation were not significantly associated with the level of MDR adjusted for the effect of treatment period. MDR generally declined over the trial period. In conclusion, generalized ordered logistic regression can be used for the analysis of ordinal data such as MDR data when the proportionality assumptions for ordered logistic regression are violated. Published by Elsevier B.V.
Fei, Y; Hu, J; Li, W-Q; Wang, W; Zong, G-Q
2017-03-01
Essentials Predicting the occurrence of portosplenomesenteric vein thrombosis (PSMVT) is difficult. We studied 72 patients with acute pancreatitis. Artificial neural networks modeling was more accurate than logistic regression in predicting PSMVT. Additional predictive factors may be incorporated into artificial neural networks. Objective To construct and validate artificial neural networks (ANNs) for predicting the occurrence of portosplenomesenteric venous thrombosis (PSMVT) and compare the predictive ability of the ANNs with that of logistic regression. Methods The ANNs and logistic regression modeling were constructed using simple clinical and laboratory data of 72 acute pancreatitis (AP) patients. The ANNs and logistic modeling were first trained on 48 randomly chosen patients and validated on the remaining 24 patients. The accuracy and the performance characteristics were compared between these two approaches using SPSS 17.0 software. Results The training set and validation set did not differ on any of the 11 variables. After training, the back propagation network training error converged to 1 × 10⁻²⁰, and it retained excellent pattern recognition ability. When the ANNs model was applied to the validation set, it revealed a sensitivity of 80%, specificity of 85.7%, a positive predictive value of 77.6% and negative predictive value of 90.7%. The accuracy was 83.3%. Differences could be found between ANNs modeling and logistic regression modeling in these parameters (10.0% [95% CI, -14.3 to 34.3%], 14.3% [95% CI, -8.6 to 37.2%], 15.7% [95% CI, -9.9 to 41.3%], 11.8% [95% CI, -8.2 to 31.8%], 22.6% [95% CI, -1.9 to 47.1%], respectively). When ANNs modeling was used to identify PSMVT, the area under receiver operating characteristic curve was 0.849 (95% CI, 0.807-0.901), which demonstrated better overall properties than logistic regression modeling (AUC = 0.716) (95% CI, 0.679-0.761).
Conclusions ANNs modeling was a more accurate tool than logistic regression in predicting the occurrence of PSMVT following AP. More clinical factors or biomarkers may be incorporated into ANNs modeling to improve its predictive ability. © 2016 International Society on Thrombosis and Haemostasis.
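The performance measures quoted above all derive from a 2×2 confusion matrix. A minimal sketch of the definitions follows; the function name is an assumption, and any counts passed to it are the reader's own, since the abstract does not report the underlying table.

```python
def diagnostic_metrics(tp, fp, fn, tn):
    """Standard diagnostic measures from confusion-matrix counts."""
    return {
        "sensitivity": tp / (tp + fn),  # true positives among all with the condition
        "specificity": tn / (tn + fp),  # true negatives among all without the condition
        "ppv": tp / (tp + fp),          # positive predictive value
        "npv": tn / (tn + fn),          # negative predictive value
        "accuracy": (tp + tn) / (tp + fp + fn + tn),
    }
```

As a purely hypothetical example on 24 patients, counts of 8 true positives, 2 false positives, 2 false negatives, and 12 true negatives give a sensitivity of 80%, a specificity of 85.7%, and an accuracy of 83.3%.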
McLaren, Christine E.; Chen, Wen-Pin; Nie, Ke; Su, Min-Ying
2009-01-01
Rationale and Objectives Dynamic contrast enhanced MRI (DCE-MRI) is a clinical imaging modality for detection and diagnosis of breast lesions. Analytical methods were compared for diagnostic feature selection and performance of lesion classification to differentiate between malignant and benign lesions in patients. Materials and Methods The study included 43 malignant and 28 benign histologically-proven lesions. Eight morphological parameters, ten gray level co-occurrence matrices (GLCM) texture features, and fourteen Laws’ texture features were obtained using automated lesion segmentation and quantitative feature extraction. Artificial neural network (ANN) and logistic regression analysis were compared for selection of the best predictors of malignant lesions among the normalized features. Results Using ANN, the final four selected features were compactness, energy, homogeneity, and Law_LS, with area under the receiver operating characteristic curve (AUC) = 0.82, and accuracy = 0.76. The diagnostic performance of these 4-features computed on the basis of logistic regression yielded AUC = 0.80 (95% CI, 0.688 to 0.905), similar to that of ANN. The analysis also shows that the odds of a malignant lesion decreased by 48% (95% CI, 25% to 92%) for every increase of 1 SD in the Law_LS feature, adjusted for differences in compactness, energy, and homogeneity. Using logistic regression with z-score transformation, a model comprised of compactness, NRL entropy, and gray level sum average was selected, and it had the highest overall accuracy of 0.75 among all models, with AUC = 0.77 (95% CI, 0.660 to 0.880). When logistic modeling of transformations using the Box-Cox method was performed, the most parsimonious model with predictors, compactness and Law_LS, had an AUC of 0.79 (95% CI, 0.672 to 0.898). Conclusion The diagnostic performance of models selected by ANN and logistic regression was similar. 
The analytic methods were found to be roughly equivalent in terms of predictive ability when a small number of variables were chosen. The robust ANN methodology utilizes a sophisticated non-linear model, while logistic regression analysis provides insightful information to enhance interpretation of the model features. PMID:19409817
Ai, Zi-Sheng; Gao, You-Shui; Sun, Yuan; Liu, Yue; Zhang, Chang-Qing; Jiang, Cheng-Hua
2013-03-01
Risk factors for femoral neck fracture-induced avascular necrosis of the femoral head have not been elucidated clearly in middle-aged and elderly patients. Moreover, the high incidence of screw removal in China and its effect on the fate of the involved femoral head require statistical methods to reflect their intrinsic relationship. Ninety-nine patients older than 45 years with femoral neck fracture were treated by internal fixation between May 1999 and April 2004. Descriptive analysis, interaction analysis between associated factors, single factor logistic regression, multivariate logistic regression, and detailed interaction analysis were employed to explore potential relationships among associated factors. Avascular necrosis of the femoral head was found in 15 cases (15.2%). Age × the status of implants (removal vs. maintenance) and gender × the timing of reduction were interactive according to two-factor interactive analysis. Age, the displacement of fractures, the quality of reduction, and the status of implants were found to be significant factors in single factor logistic regression analysis. Age, age × the status of implants, and the quality of reduction were found to be significant factors in multivariate logistic regression analysis. In fine interaction analysis after multivariate logistic regression analysis, implant removal was the most important risk factor for avascular necrosis in 56-to-85-year-old patients, with a risk ratio of 26.00 (95% CI = 3.076-219.747). The middle-aged and elderly have a lower incidence of avascular necrosis of the femoral head following femoral neck fractures treated by cannulated screws. The removal of cannulated screws can induce a significantly high incidence of avascular necrosis of the femoral head in elderly patients, while a high-quality reduction helps to reduce avascular necrosis.
Zhou, Jinzhe; Zhou, Yanbing; Cao, Shougen; Li, Shikuan; Wang, Hao; Niu, Zhaojian; Chen, Dong; Wang, Dongsheng; Lv, Liang; Zhang, Jian; Li, Yu; Jiao, Xuelong; Tan, Xiaojie; Zhang, Jianli; Wang, Haibo; Zhang, Bingyuan; Lu, Yun; Sun, Zhenqing
2016-01-01
Surgical complications are commonly reported, but few reports describe their severity or estimate their risk factors, and those that do lack specificity. We retrospectively analyzed data on 2795 gastric cancer patients who underwent surgery at the Affiliated Hospital of Qingdao University between June 2007 and June 2012 and established a multivariate logistic regression model to predict risk factors for postoperative complications graded according to the Clavien-Dindo classification system. Twenty-four of 86 variables were statistically significant in univariate logistic regression analysis; the 11 significant variables that entered multivariate analysis were used to produce the risk model. Liver cirrhosis, diabetes mellitus, Child classification, invasion of neighboring organs, combined resection, intraoperative transfusion, Billroth II anastomosis of reconstruction, malnutrition, surgical volume of surgeons, operating time and age were independent risk factors for postoperative complications after gastrectomy. Based on the logistic regression equation, p = exp(ΣBiXi) / (1 + exp(ΣBiXi)), a multivariate logistic regression predictive model that calculates the risk of postoperative morbidity was developed: p = 1 / (1 + exp(4.810 - 1.287X1 - 0.504X2 - 0.500X3 - 0.474X4 - 0.405X5 - 0.318X6 - 0.316X7 - 0.305X8 - 0.278X9 - 0.255X10 - 0.138X11)). The accuracy, sensitivity and specificity of the model in predicting postoperative complications were 86.7%, 76.2% and 88.6%, respectively. This risk model, based on the Clavien-Dindo severity grading of complications and logistic regression analysis, can predict severe morbidity specific to an individual patient's risk factors, can estimate patients' risks and benefits of gastric surgery as an accurate decision-making tool, and may serve as a template for the development of risk models for other surgical groups.
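As a minimal sketch of how the published equation could be evaluated, the Python function below plugs the reported coefficients into the logistic form. The abstract does not say which risk factor each of X1 to X11 denotes or how the covariates are coded, so the function name `predicted_risk` and the 0/1 inputs in the usage lines are assumptions made purely for illustration.

```python
import math

# Coefficients as printed in the abstract:
# p = 1 / (1 + exp(4.810 - sum(b_i * x_i))), i.e. intercept -4.810 on the logit scale.
INTERCEPT = -4.810
COEFFICIENTS = [1.287, 0.504, 0.500, 0.474, 0.405, 0.318,
                0.316, 0.305, 0.278, 0.255, 0.138]


def predicted_risk(x):
    """Return the model probability for a list of 11 covariate values.

    The coding of each covariate (assumed 0/1 here) is not given in the
    abstract, so the inputs are hypothetical.
    """
    if len(x) != len(COEFFICIENTS):
        raise ValueError("expected 11 covariate values")
    linear_predictor = INTERCEPT + sum(b * xi for b, xi in zip(COEFFICIENTS, x))
    return 1.0 / (1.0 + math.exp(-linear_predictor))


baseline = predicted_risk([0] * 11)     # hypothetical patient with no factor present
all_present = predicted_risk([1] * 11)  # hypothetical patient with every factor present
```

Under the assumed 0/1 coding, the intercept alone gives a predicted risk below 1%, and each additional factor moves the probability upward because every coefficient enters the logit with a positive sign.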
Cucciare, Michael A; Gray, Heather; Azar, Armin; Jimenez, Daniel; Gallagher-Thompson, Dolores
2010-04-01
The present study examined the relationship between self-reported physical health, depressive symptoms, and the occurrence of depression diagnosis in Hispanic female dementia caregivers. Participants were 89 Hispanic female dementia caregivers. This study used a cross-sectional design. Baseline depression and physical health data were collected from participants enrolled in the 'Reducing Stress in Hispanic Anglo Dementia Caregivers' study sponsored by the National Institute on Aging. Physical health was assessed using the Medical Outcome Study Short Form-36 (SF-36), a one-item self-report health rating, body mass index, and the presence or history of self-reported physical illness. Depressive symptoms were assessed using the Center for Epidemiologic Studies-Depression Scale (CES-D). The occurrence of depression diagnosis was assessed using the Clinical Interview for DSM-IV Axis I Disorders (SCID). Multiple linear and logistic regression analysis was used to examine the extent to which indices of physical health and depressive symptoms accounted for variance in participants' depressive symptoms and depressive diagnoses. Self-reported indices of health (e.g., SF-36) accounted for a significant portion of variance in both CES-D scores and SCID diagnoses. Caregivers who reported worsened health tended to report increased symptoms of depression on the CES-D and increased likelihood of an SCID diagnosis of a depressive disorder. Self-reported health indices are helpful in identifying Hispanic dementia caregivers at risk for clinical levels of depression.
NASA Astrophysics Data System (ADS)
Varghese, Bino; Hwang, Darryl; Mohamed, Passant; Cen, Steven; Deng, Christopher; Chang, Michael; Duddalwar, Vinay
2017-11-01
Purpose: To evaluate the potential use of wavelet analysis in discriminating benign from malignant renal masses (RM). Materials and Methods: Regions of interest of the whole lesion were manually segmented and co-registered from multiphase CT acquisitions of 144 patients (98 malignant RM: renal cell carcinoma (RCC) and 46 benign RM: oncocytoma, lipid-poor angiomyolipoma). Here, the Haar wavelet was used to analyze the grayscale images of the largest segmented tumor in the axial direction. Six metrics (energy, entropy, homogeneity, contrast, standard deviation (SD) and variance), each derived from 3 levels of image decomposition in 3 directions (horizontal, vertical and diagonal), were used to quantify tumor texture. The independent t-test or the Wilcoxon rank sum test, depending on data normality, was used for exploratory univariate analysis. Stepwise logistic regression and receiver operating characteristic (ROC) curve analysis were used to select predictors and assess prediction accuracy, respectively. Results: Consistently, 5 out of 6 wavelet-based texture measures (all except homogeneity) were higher for malignant than for benign tumors when accounting for individual texture direction. Homogeneity was consistently lower in malignant than in benign tumors irrespective of direction. SD and variance measured in the diagonal direction on the corticomedullary phase showed a significant (p<0.05) difference between benign and malignant tumors. The multivariate model with variance (3 directions) and SD (vertical direction), extracted from the excretory and pre-contrast phases, respectively, showed an area under the ROC curve (AUC) of 0.78 (p < 0.05) in discriminating malignant from benign masses. Conclusion: Wavelet analysis is a valuable texture evaluation tool to add to a radiomics platform geared toward reliably characterizing and stratifying renal masses.
Middeldorp, C M; de Moor, M H M; McGrath, L M; Gordon, S D; Blackwood, D H; Costa, P T; Terracciano, A; Krueger, R F; de Geus, E J C; Nyholt, D R; Tanaka, T; Esko, T; Madden, P A F; Derringer, J; Amin, N; Willemsen, G; Hottenga, J-J; Distel, M A; Uda, M; Sanna, S; Spinhoven, P; Hartman, C A; Ripke, S; Sullivan, P F; Realo, A; Allik, J; Heath, A C; Pergadia, M L; Agrawal, A; Lin, P; Grucza, R A; Widen, E; Cousminer, D L; Eriksson, J G; Palotie, A; Barnett, J H; Lee, P H; Luciano, M; Tenesa, A; Davies, G; Lopez, L M; Hansell, N K; Medland, S E; Ferrucci, L; Schlessinger, D; Montgomery, G W; Wright, M J; Aulchenko, Y S; Janssens, A C J W; Oostra, B A; Metspalu, A; Abecasis, G R; Deary, I J; Räikkönen, K; Bierut, L J; Martin, N G; Wray, N R; van Duijn, C M; Smoller, J W; Penninx, B W J H; Boomsma, D I
2011-01-01
The relationship between major depressive disorder (MDD) and bipolar disorder (BD) remains controversial. Previous research has reported differences and similarities in risk factors for MDD and BD, such as predisposing personality traits. For example, high neuroticism is related to both disorders, whereas openness to experience is specific for BD. This study examined the genetic association between personality and MDD and BD by applying polygenic scores for neuroticism, extraversion, openness to experience, agreeableness and conscientiousness to both disorders. Polygenic scores reflect the weighted sum of multiple single-nucleotide polymorphism alleles associated with the trait for an individual and were based on a meta-analysis of genome-wide association studies for personality traits including 13 835 subjects. Polygenic scores were tested for MDD in the combined Genetic Association Information Network (GAIN-MDD) and MDD2000+ samples (N=8921) and for BD in the combined Systematic Treatment Enhancement Program for Bipolar Disorder and Wellcome Trust Case–Control Consortium samples (N=6329) using logistic regression analyses. At the phenotypic level, personality dimensions were associated with MDD and BD. Polygenic neuroticism scores were significantly positively associated with MDD, whereas polygenic extraversion scores were significantly positively associated with BD. The explained variance of MDD and BD, ∼0.1%, was highly comparable to the variance explained by the polygenic personality scores in the corresponding personality traits themselves (between 0.1 and 0.4%). This indicates that the proportions of variance explained in mood disorders are at the upper limit of what could have been expected. This study suggests shared genetic risk factors for neuroticism and MDD on the one hand and for extraversion and BD on the other. PMID:22833196
Mehta, Hemalkumar B.; Parmar, Abhishek D.; Adhikari, Deepak; Tamirisa, Nina P.; Dimou, Francesca; Jupiter, Daniel; Riall, Taylor S.
2016-01-01
Background Surgeon and hospital volume are both known to affect outcomes for patients undergoing pancreatic resection. The objective was to evaluate the relative effects of surgeon and hospital volume on 30-day mortality and 30-day complications after pancreatic resection among older patients. Materials and Methods The study used Texas Medicare data (2000–2012), identifying high-volume surgeons as those performing ≥4 pancreatic resections/year, and high-volume hospitals as those performing ≥11 pancreatic resections/year, on Medicare patients. Three-level hierarchical logistic regression models were used to evaluate the relative effects of surgeon and hospital volumes on mortality and complications, after adjusting for case mix differences. Results There were 2,453 pancreatic resections performed by 490 surgeons operating in 138 hospitals. 4.5% of surgeons and 6.5% of hospitals were high-volume. The overall 30-day mortality was 9.0%, and the 30-day complication rate was 40.6%. Overall, 8.9% of the variance in 30-day mortality was attributed to surgeon factors and 9.8% to hospital factors. For 30-day complications, 4.7% of the variance was attributed to surgeon factors and 1.2% to hospital factors. After adjusting for patient, surgeon and hospital characteristics, high surgeon volume (OR 0.54, 95% CI 0.33–0.87) and high hospital volume (OR 0.52, 95% CI 0.30–0.92) were associated with lower risk of mortality; high surgeon volume (OR 0.71, 95% CI 0.55–0.93) was also associated with lower risk of 30-day complications. Conclusions Both hospital and surgeon factors contributed significantly to the observed variance in mortality, but only surgeon factors impacted complications. PMID:27565068
Steel, Jennifer L; Dunlavy, Andrea C; Harding, Collette E; Theorell, Töres
2017-06-01
Over 50 million people have been displaced, some as a result of conflict, exposure to which can lead to psychiatric sequelae. The aims of this study were to provide estimates of pre-emigration trauma, post-migration stress, and psychological sequelae of immigrants and refugees from predominantly Sub-Saharan Africa who immigrated to Sweden. We also examined the predictors of the psychiatric sequelae as well as acculturation within the host country. A total of 420 refugees and immigrants were enrolled using stratified quota sampling. A battery of questionnaires including the Harvard Trauma Questionnaire, the Post-Migration Living Difficulties Scale, the Cultural Lifestyle Questionnaire, and the Hopkins Checklist was administered. Descriptive statistics, Chi square analyses, Pearson correlations, analysis of variance, and logistic and linear regression were performed to test the aims of the study. Eighty-nine percent of participants reported at least one traumatic experience prior to emigration. Forty-seven percent of refugees reported clinically significant PTSD and 20% reported clinically significant depressive symptoms. Males reported a significantly greater number of traumatic events [F(1, 198) = 14.5, p < 0.001] and greater post-migration stress than females [F(1, 414) = 5.3, p = 0.02], particularly on the financial, discrimination, and healthcare subscales. Females reported a higher prevalence of depressive symptoms than males [F(1, 419) = 3.9, p = 0.05]. Those with a shorter duration in Sweden reported higher rates of PTSD [F(63, 419) = 1.7, p < 0.001]. A greater number of traumatic events was significantly associated with the severity of PTSD symptoms [F(34, 419) = 9.6, p < 0.001]. Using regression analysis, 82 and 83% of the variance in anxiety and depression, respectively, was explained by gender, education, religion, PTSD and post-migration stress.
Sixty-nine percent of the variance in PTSD was accounted for by a model that included education, number of traumatic events, depressive symptoms and post-migration stress. Forty-seven percent of the variance in acculturation was accounted for by a model that included age, education, duration in Sweden, anxiety, depression, and post-migration stress. These predictors were also significant for employment status, with the exception of depressive symptoms. Multidimensional interventions that provide treatments to improve psychiatric symptoms in combination with advocacy and support to reduce stress (e.g., financial, access to health care) are recommended. The focus of the intervention may also be modified based on the gender of the participants.
Saunders, Christina T; Blume, Jeffrey D
2017-10-26
Mediation analysis explores the degree to which an exposure's effect on an outcome is diverted through a mediating variable. We describe a classical regression framework for conducting mediation analyses in which estimates of causal mediation effects and their variance are obtained from the fit of a single regression model. The vector of changes in exposure pathway coefficients, which we named the essential mediation components (EMCs), is used to estimate standard causal mediation effects. Because these effects are often simple functions of the EMCs, an analytical expression for their model-based variance follows directly. Given this formula, it is instructive to revisit the performance of routinely used variance approximations (e.g., delta method and resampling methods). Requiring the fit of only one model reduces the computation time required for complex mediation analyses and permits the use of a rich suite of regression tools that are not easily implemented on a system of three equations, as would be required in the Baron-Kenny framework. Using data from the BRAIN-ICU study, we provide examples to illustrate the advantages of this framework and compare it with the existing approaches. © The Author 2017. Published by Oxford University Press.
Rank-Optimized Logistic Matrix Regression toward Improved Matrix Data Classification.
Zhang, Jianguang; Jiang, Jianmin
2018-02-01
Because existing logistic regression suffers from overfitting and often fails to consider structural information, we propose a novel matrix-based logistic regression to overcome these weaknesses. In the proposed method, 2D matrices are directly used to learn two groups of parameter vectors along each dimension without vectorization, which allows the method to fully exploit the underlying structural information embedded inside the 2D matrices. Further, we add a joint [Formula: see text]-norm on two parameter matrices, which are organized by aligning each group of parameter vectors in columns. This added co-regularization term has two roles: enhancing the effect of regularization and optimizing the rank during the learning process. With our proposed fast iterative solution, we carried out extensive experiments. The results show that, in comparison to both the traditional tensor-based methods and the vector-based regression methods, our proposed solution achieves better performance for matrix data classification.
Bohmanova, J; Miglior, F; Jamrozik, J; Misztal, I; Sullivan, P G
2008-09-01
A random regression model with both random and fixed regressions fitted by Legendre polynomials of order 4 was compared with 3 alternative models fitting linear splines with 4, 5, or 6 knots. The effects common for all models were a herd-test-date effect, fixed regressions on days in milk (DIM) nested within region-age-season of calving class, and random regressions for additive genetic and permanent environmental effects. Data were test-day milk, fat and protein yields, and SCS recorded from 5 to 365 DIM during the first 3 lactations of Canadian Holstein cows. A random sample of 50 herds consisting of 96,756 test-day records was generated to estimate variance components within a Bayesian framework via Gibbs sampling. Two sets of genetic evaluations were subsequently carried out to investigate performance of the 4 models. Models were compared by graphical inspection of variance functions, goodness of fit, error of prediction of breeding values, and stability of estimated breeding values. Models with splines gave lower estimates of variances at extremes of lactations than the model with Legendre polynomials. Differences among models in goodness of fit measured by percentages of squared bias, correlations between predicted and observed records, and residual variances were small. The deviance information criterion favored the spline model with 6 knots. Smaller error of prediction and higher stability of estimated breeding values were achieved by using spline models with 5 and 6 knots compared with the model with Legendre polynomials. In general, the spline model with 6 knots had the best overall performance based upon the considered model comparison criteria.
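To make the regression covariates concrete, here is a minimal sketch of how days in milk could be mapped to Legendre polynomial covariates. It assumes that "order 4" means polynomials up to degree 4, that DIM is rescaled linearly from the 5 to 365 range onto [-1, 1], and that unnormalized polynomials are used; the abstract specifies none of these details, and the function name `legendre_covariates` is hypothetical.

```python
def legendre_covariates(dim, order=4, dim_min=5, dim_max=365):
    """Legendre polynomials P_0..P_order evaluated at a standardized day in milk.

    Uses unnormalized polynomials built with Bonnet's recurrence; test-day
    models often use a normalized variant, which is not reproduced here.
    """
    # Map DIM from [dim_min, dim_max] onto [-1, 1].
    x = -1.0 + 2.0 * (dim - dim_min) / (dim_max - dim_min)
    values = [1.0, x]
    for n in range(1, order):
        # (n + 1) P_{n+1}(x) = (2n + 1) x P_n(x) - n P_{n-1}(x)
        values.append(((2 * n + 1) * x * values[n] - n * values[n - 1]) / (n + 1))
    return values[: order + 1]


start = legendre_covariates(5)     # x = -1
middle = legendre_covariates(185)  # x = 0
end = legendre_covariates(365)     # x = +1
```

Each test-day record would then contribute one row of such covariates to the fixed and random regression parts of the model; a linear-spline alternative would replace these columns with knot-based interpolation weights.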
NASA Astrophysics Data System (ADS)
Oguntunde, Philip G.; Lischeid, Gunnar; Dietrich, Ottfried
2018-03-01
This study examines the variations of climate variables and rice yield and quantifies the relationships among them using multiple linear regression, principal component analysis, and support vector machine (SVM) analysis in southwest Nigeria. The climate and yield data covered a period of 36 years between 1980 and 2015. Similar to the observed decrease (P < 0.001) in rice yield, pan evaporation, solar radiation, and wind speed declined significantly. Eight principal components exhibited an eigenvalue > 1 and explained 83.1% of the total variance of the predictor variables. The SVM regression function using the scores of the first principal component explained about 75% of the variance in rice yield data, and linear regression about 64%. SVM regression between annual solar radiation values and yield explained 67% of the variance. Only the first component of the principal component analysis (PCA) exhibited a clear long-term trend and sometimes short-term variance similar to that of rice yield. Short-term fluctuations of the PC1 scores are closely coupled to those of rice yield during the 1986-1993 and the 2006-2013 periods, thereby revealing the inter-annual sensitivity of rice production to climate variability. Solar radiation stands out as the climate variable with the highest influence on rice yield, and the influence was especially strong during the monsoon and post-monsoon periods, which correspond to the vegetative, booting, flowering, and grain filling stages in the study area. The outcome is expected to provide a more in-depth, region-specific climate-rice linkage for screening better cultivars that can respond positively to future climate fluctuations, as well as information that may help optimize planting dates for improved radiation use efficiency in the study area.
On the impact of relatedness on SNP association analysis.
Gross, Arnd; Tönjes, Anke; Scholz, Markus
2017-12-06
When testing for SNP (single nucleotide polymorphism) associations in related individuals, observations are not independent. Simple linear regression assuming independent normally distributed residuals results in an increased type I error, and the power of the test is also affected in a more complicated manner. Inflation of type I error is often successfully corrected by genomic control. However, this reduces the power of the test when relatedness is of concern. In the present paper, we derive explicit formulae to investigate how heritability and strength of relatedness contribute to variance inflation of the effect estimate of the linear model. Further, we study the consequences of variance inflation on hypothesis testing and compare the results with those of genomic control correction. We apply the developed theory to the publicly available HapMap trio data (N=129), the Sorbs (a self-contained population with N=977 characterised by a cryptic relatedness structure) and synthetic family studies with different sample sizes (ranging from N=129 to N=999) and different degrees of relatedness. We derive explicit and easy-to-apply approximation formulae to estimate the impact of relatedness on the variance of the effect estimate of the linear regression model. Variance inflation increases with increasing heritability. Relatedness structure also impacts the degree of variance inflation, as shown for example family structures. Variance inflation is smallest for HapMap trios, followed by a synthetic family study corresponding to the trio data but with a larger sample size than HapMap. The next strongest inflation is observed for the Sorbs, and finally, for a synthetic family study with a more extreme relatedness structure but with a sample size similar to that of the Sorbs. Type I error increases rapidly with increasing inflation. However, for smaller significance levels, power increases with increasing inflation while the opposite holds for larger significance levels.
When genomic control is applied, type I error is preserved while power decreases rapidly with increasing variance inflation. Stronger relatedness as well as higher heritability result in increased variance of the effect estimate of simple linear regression analysis. While type I error rates are generally inflated, the behaviour of power is more complex, since power can be increased or reduced depending on relatedness and the heritability of the phenotype. Genomic control cannot be recommended to deal with inflation due to relatedness. Although it preserves type I error, the loss in power can be considerable. We provide a simple formula for estimating variance inflation given the relatedness structure and the heritability of a trait of interest. As a rule of thumb, variance inflation below 1.05 does not require correction and simple linear regression analysis is still appropriate.
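For readers unfamiliar with the genomic control correction that the abstract compares against, the following is a minimal sketch of the conventional procedure: estimate an inflation factor as the median of the observed 1-df chi-square statistics divided by the theoretical median, then divide every statistic by it. This is the textbook form of genomic control, not the authors' variance inflation formula (which the abstract does not reproduce); the function names and the example statistics are hypothetical.

```python
import statistics

# Median of a chi-square distribution with 1 degree of freedom.
CHI2_1DF_MEDIAN = 0.4549364


def inflation_factor(chi2_stats):
    """Estimate the genomic inflation factor lambda from 1-df test statistics."""
    return statistics.median(chi2_stats) / CHI2_1DF_MEDIAN


def genomic_control(chi2_stats):
    """Deflate the statistics by lambda; by convention lambda is floored at 1."""
    lam = max(inflation_factor(chi2_stats), 1.0)
    return [stat / lam for stat in chi2_stats]


# Hypothetical test statistics whose median implies lambda of about 1.2.
observed = [0.2, 0.54592368, 3.0]
lam = inflation_factor(observed)
corrected = genomic_control(observed)
```

Because every statistic is shrunk by the same factor, true associations are deflated along with null ones, which is one way to see the loss of power that the abstract reports.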
[How medical students perform academically by admission types?].
Kim, Se-Hoon; Lee, Keumho; Hur, Yera; Kim, Ji-Ha
2013-09-01
Despite the importance of selecting students who are capable of completing medical education and becoming good doctors, few studies have addressed this topic. This study analyzed differences in medical students' academic performance (grade point average, GPA) and in failure and dropout rates by admission type. From 2004 to 2010, we gathered admission data for 369 Konyang University College of Medicine students and analyzed the relationship between admission type and academic achievement, as well as differences in failure and dropout rates. Analysis of variance (ANOVA), ordinary least squares, and logistic regression were used. Rolling-admission students showed higher academic achievement from year 1 to 3 than regular-admission students (p < 0.01). A multiple regression model with admission type as a control variable gave similar results. Unlike the ANOVA results, however, GPA differences by admission type appeared not only in the lower academic years but also in year 6 (p < 0.01). In the regression analysis of failure and dropout rates by admission type, regular-admission students showed a higher dropout rate than rolling-admission students, which indicates that admission type has a significant effect on failure and dropout rates in medical students (p < 0.01). Rolling-admission students tend to show lower failure and dropout rates and to perform better academically. This implies that selecting students primarily by the Korean College Scholastic Ability Test does not guarantee their academic success in medical education. We therefore suggest a more in-depth, comprehensive method of selecting students who fit each medical school's educational goals.
Detecting DIF in Polytomous Items Using MACS, IRT and Ordinal Logistic Regression
ERIC Educational Resources Information Center
Elosua, Paula; Wells, Craig
2013-01-01
The purpose of the present study was to compare the Type I error rate and power of two model-based procedures, the mean and covariance structure model (MACS) and the item response theory (IRT), and an observed-score based procedure, ordinal logistic regression, for detecting differential item functioning (DIF) in polytomous items. A simulation…
ERIC Educational Resources Information Center
Rudner, Lawrence
2016-01-01
In the machine learning literature, it is commonly accepted as fact that as calibration sample sizes increase, Naïve Bayes classifiers initially outperform Logistic Regression classifiers in terms of classification accuracy. Applied to subtests from an on-line final examination and from a highly regarded certification examination, this study shows…
ERIC Educational Resources Information Center
Fan, Xitao; Wang, Lin
The Monte Carlo study compared the performance of predictive discriminant analysis (PDA) and that of logistic regression (LR) for the two-group classification problem. Prior probabilities were used for classification, but the cost of misclassification was assumed to be equal. The study used a fully crossed three-factor experimental design (with…
ERIC Educational Resources Information Center
Nguyen, Phuong L.
2006-01-01
This study examines the effects of parental SES, school quality, and community factors on children's enrollment and achievement in rural areas in Viet Nam, using logistic regression and ordered logistic regression. Multivariate analysis reveals significant differences in educational enrollment and outcomes by level of household expenditures and…
School Exits in the Milwaukee Parental Choice Program: Evidence of a Marketplace?
ERIC Educational Resources Information Center
Ford, Michael
2011-01-01
This article examines whether the large number of school exits from the Milwaukee school voucher program is evidence of a marketplace. Two logistic regression and multinomial logistic regression models tested the relation between the inability to draw large numbers of voucher students and the ability for a private school to remain viable. Data on…
Peñacoba, Cecilia; Rodríguez, Laura; Carmona, Javier; Marín, Dolores
2018-02-01
Agreeableness is associated with good mental health during pregnancy. Although different studies have indicated that agreeableness is related to adaptive coping, this relation has scarcely been studied in pregnant women. The aim of this study was to analyze the possible differences between high and low agreeableness in relation to coping strategies and psychiatric symptoms in pregnant women. We conducted a longitudinal prospective study between October 2009 and January 2013. Pregnant women (n = 285) were assessed in the first trimester of pregnancy, and 122 of them were assessed during the third. Data were collected using the Coping Strategies Questionnaire, the Symptom Check List 90-R, and the agreeableness subscale of the NEO-FFI. Using the SPSS 21 statistics package, binary logistic regression, two-way mixed analysis of variance, and multiple regression analyses and a Sobel test were conducted. Higher levels of agreeableness were associated with positive reappraisal and problem-solving, and lower levels of agreeableness were associated with overt emotional expression and negative self-focused coping. Women with low agreeableness had poorer mental health, especially in the first trimester. These findings should be taken into account to improve women's experiences during pregnancy. Nevertheless, given the scarcity of data, additional studies are needed.
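The Sobel test mentioned above checks whether an indirect (mediated) effect differs from zero. A minimal sketch of the standard first-order formula follows; `a` is the coefficient of the predictor on the mediator and `b` the coefficient of the mediator on the outcome, each with its standard error. The function name and the numbers in the usage line are illustrative assumptions, not values from the study.

```python
import math


def sobel_test(a, se_a, b, se_b):
    """Return the Sobel z statistic and two-sided p-value for the indirect effect a*b."""
    # First-order (delta method) standard error of the product a*b.
    standard_error = math.sqrt(b ** 2 * se_a ** 2 + a ** 2 * se_b ** 2)
    z = (a * b) / standard_error
    # Two-sided p-value from the standard normal distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2.0))
    return z, p_value


# Hypothetical path coefficients and standard errors.
z, p = sobel_test(a=0.5, se_a=0.1, b=0.4, se_b=0.1)
```

The test relies on the product a*b being approximately normal, which is a known weakness in small samples; resampling-based intervals are a common alternative.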
Fall prevention strategy in an emergency department.
Muray, Mwali; Bélanger, Charles H; Razmak, Jamil
2018-02-12
Purpose The purpose of this paper is to document the need for implementing a fall prevention strategy in an emergency department (ED). The paper also spells out the research process that led to approving an assessment tool for use in hospital outpatient services. Design/methodology/approach The fall risk assessment tool was based on the Morse Fall Scale. Gender mix and age above 65 and 80 years were assessed on six risk assessment variables using χ² analyses. A logistic regression analysis and model were used to test predictor strength and relationships among variables. Findings In total, 5,371 (56.5 percent) geriatric outpatients were deemed to be at fall risk during the study. Women have a higher falls incidence in young and old age categories. Being on medications for patients above 80 years exposed both genders to equal fall risks. Regression analysis explained 73-98 percent of the variance in the six-variable tool. Originality/value Canadian quality and safe healthcare accreditation standards require that hospital staff develop and adhere to fall prevention policies. Anticipated physiological falls can be prevented by healthcare interventions, particularly with older people known to bear higher risk factors. An aging population is increasing healthcare volumes and medical challenges. Precautionary measures for patients with a vulnerable cognitive and physical status are essential for quality care.
Hierarchical Bayesian Logistic Regression to forecast metabolic control in type 2 DM patients.
Dagliati, Arianna; Malovini, Alberto; Decata, Pasquale; Cogni, Giulia; Teliti, Marsida; Sacchi, Lucia; Cerra, Carlo; Chiovato, Luca; Bellazzi, Riccardo
2016-01-01
In this work we present our efforts in building a model able to forecast changes in patients' clinical conditions when repeated measurements are available. In this case the available risk calculators are typically not applicable. We propose a Hierarchical Bayesian Logistic Regression model, which allows individual and population variability to be taken into account when estimating model parameters. The model is used to predict metabolic control and its variation in type 2 diabetes mellitus. In particular, we analyzed a population of more than 1000 Italian type 2 diabetic patients, collected within the European project Mosaic. The results obtained in terms of Matthews Correlation Coefficient are significantly better than those obtained with a standard logistic regression model based on data pooling.
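The Matthews Correlation Coefficient used as the evaluation metric can be computed directly from a 2×2 confusion matrix. The sketch below shows the standard formula; the function name and the counts in the usage line are made up for illustration and are not results from the study.

```python
import math


def matthews_corrcoef(tp, tn, fp, fn):
    """Matthews Correlation Coefficient from confusion-matrix counts.

    Returns 0.0 when any marginal total is zero, a common convention for
    the otherwise undefined case.
    """
    denominator = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denominator == 0:
        return 0.0
    return (tp * tn - fp * fn) / denominator


# Hypothetical counts: 90 true positives, 80 true negatives,
# 20 false positives, 10 false negatives.
score = matthews_corrcoef(tp=90, tn=80, fp=20, fn=10)
```

The coefficient ranges from -1 to 1 and, unlike plain accuracy, stays informative when the two outcome classes are unbalanced, which is a plausible reason for preferring it in a clinical forecasting setting.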
Model building strategy for logistic regression: purposeful selection.
Zhang, Zhongheng
2016-03-01
Logistic regression is one of the most commonly used models to account for confounders in medical literature. The article introduces how to perform the purposeful selection model building strategy with R. I stress the use of the likelihood ratio test to see whether deleting a variable will have a significant impact on model fit. A deleted variable should also be checked for whether it is an important adjustment of the remaining covariates. Interactions should be checked to disentangle complex relationships between covariates and their synergistic effect on the response variable. The model should be checked for goodness-of-fit (GOF), that is, how well the fitted model reflects the real data. The Hosmer-Lemeshow GOF test is the most widely used for logistic regression models.
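The likelihood ratio test emphasized in this abstract can be sketched as follows. This is a minimal illustration in Python rather than the R workflow the article describes; the function name and the log-likelihood values are hypothetical, and the p-value shortcut assumes that exactly one parameter is dropped (one degree of freedom).

```python
import math


def likelihood_ratio_test_1df(loglik_full, loglik_reduced):
    """Likelihood ratio test for dropping ONE parameter from a nested model.

    Returns (statistic, p_value). Assumes the reduced model is nested in the
    full model and differs from it by exactly one parameter (1 df).
    """
    statistic = 2.0 * (loglik_full - loglik_reduced)
    statistic = max(statistic, 0.0)  # guard against tiny negatives from rounding
    # For a chi-square variable with 1 df, P(X > x) = erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(statistic / 2.0))
    return statistic, p_value


# Hypothetical log-likelihoods, for illustration only.
stat, p = likelihood_ratio_test_1df(loglik_full=-100.0, loglik_reduced=-101.92)
print(f"LR statistic = {stat:.2f}, p = {p:.3f}")
```

A small p-value would suggest that deleting the variable worsens the fit; with more than one dropped parameter the chi-square reference distribution needs the matching degrees of freedom, which this sketch does not cover.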
ERIC Educational Resources Information Center
Bulcock, J. W.
The problem of model estimation when the data are collinear was examined. Though the ridge regression (RR) outperforms ordinary least squares (OLS) regression in the presence of acute multicollinearity, it is not a problem free technique for reducing the variance of the estimates. It is a stochastic procedure when it should be nonstochastic and it…
Multi-objective Optimization of Solar Irradiance and Variance at Pertinent Inclination Angles
NASA Astrophysics Data System (ADS)
Jain, Dhanesh; Lalwani, Mahendra
2018-05-01
The performance of photovoltaic panel gets highly affected by change in atmospheric conditions and angle of inclination. This article evaluates the optimum tilt angle and orientation angle (surface azimuth angle) for solar photovoltaic array in order to get maximum solar irradiance and to reduce variance of radiation at different sets or subsets of time periods. Non-linear regression and adaptive neural fuzzy interference system (ANFIS) methods are used for predicting the solar radiation. The results of ANFIS are more accurate in comparison to non-linear regression. These results are further used for evaluating the correlation and applied for estimating the optimum combination of tilt angle and orientation angle with the help of general algebraic modelling system and multi-objective genetic algorithm. The hourly average solar irradiation is calculated at different combinations of tilt angle and orientation angle with the help of horizontal surface radiation data of Jodhpur (Rajasthan, India). The hourly average solar irradiance is calculated for three cases: zero variance, with actual variance and with double variance at different time scenarios. It is concluded that monthly collected solar radiation produces better result as compared to bimonthly, seasonally, half-yearly and yearly collected solar radiation. The profit obtained for monthly varying angle is 4.6% more with zero variance and 3.8% more with actual variance, than the annually fixed angle.
Memory complaints in epilepsy: An examination of the role of mood and illness perceptions.
Tinson, Deborah; Crockford, Christopher; Gharooni, Sara; Russell, Helen; Zoeller, Sophie; Leavy, Yvonne; Lloyd, Rachel; Duncan, Susan
2018-03-01
The study examined the role of mood and illness perceptions in explaining the variance in the memory complaints of patients with epilepsy. Forty-four patients from an outpatient tertiary care center and 43 volunteer controls completed a formal assessment of memory and a verbal fluency test, as well as validated self-report questionnaires on memory complaints, mood, and illness perceptions. In hierarchical multiple regression analyses, objective memory test performance and verbal fluency did not contribute significantly to the variance in memory complaints for either patients or controls. In patients, illness perceptions and mood were highly correlated. Illness perceptions correlated more highly with memory complaints than mood and were therefore added to the multiple regression analysis. This accounted for an additional 25% of the variance, after controlling for objective memory test performance and verbal fluency, and the model was significant (model B). In order to compare with other studies, mood was added to a second model, instead of illness perceptions. This accounted for an additional 24% of the variance, which was again significant (model C). In controls, low mood accounted for 11% of the variance in memory complaints (model C2). A measure of illness perceptions was more highly correlated with the memory complaints of patients with epilepsy than with a measure of mood. In a hierarchical multiple regression model, illness perceptions accounted for 25% of the variance in memory complaints. Illness perceptions could provide useful information in a clinical investigation into the self-reported memory complaints of patients with epilepsy, alongside the assessment of mood and formal memory testing. Copyright © 2017 Elsevier Inc. All rights reserved.
NASA Astrophysics Data System (ADS)
Ceppi, C.; Mancini, F.; Ritrovato, G.
2009-04-01
This study aims at landslide susceptibility mapping within an area of the Daunia (Apulian Apennines, Italy) by a multivariate statistical method and data manipulation in a Geographical Information System (GIS) environment. Among the variety of existing statistical data analysis techniques, logistic regression was chosen to produce a susceptibility map all over an area where small settlements are historically threatened by landslide phenomena. By logistic regression a best fit between the presence or absence of landslide (dependent variable) and the set of independent variables is performed on the basis of a maximum likelihood criterion, leading to the estimation of regression coefficients. The reliability of such analysis is therefore due to the ability to quantify the proneness to landslide occurrences by the probability level produced by the analysis. The inventory of dependent and independent variables was managed in a GIS, where geometric properties and attributes have been translated into raster cells in order to proceed with the logistic regression by means of the SPSS (Statistical Package for the Social Sciences) package. A landslide inventory was used to produce the bivariate dependent variable, whereas the set of independent variables concerned slope, aspect, elevation, curvature, drained area, lithology and land use after their reduction to dummy variables. The effect of independent parameters on landslide occurrence was assessed by the corresponding coefficient in the logistic regression function, highlighting a major role played by the land use variable in determining occurrence and distribution of phenomena. Once the outcomes of the logistic regression are determined, data are re-introduced in the GIS to produce a map reporting the proneness to landslide as predicted level of probability. As validation of results and regression model, a cell-by-cell comparison between the susceptibility map and the initial inventory of landslide events was performed and an agreement at the 75% level was achieved.
Poortinga, Ernest; Lemmen, Craig; Jibson, Michael D
2006-01-01
We examined the clinical, criminal, and sociodemographic characteristics of all white-collar crime defendants referred to the evaluation unit of a state center for forensic psychiatry. With 29,310 evaluations in a 12-year period, we found 70 defendants charged with embezzlement, 3 with health care fraud, and no other white-collar defendants (based on the eight crimes widely accepted as white-collar offenses). In a case-control study design, the 70 embezzlement cases were compared with 73 defendants charged with other forms of nonviolent theft. White-collar defendants were found to have a higher likelihood of white race (adjusted odds ratio (adj. OR) = 4.51), more years of education (adj. OR = 3471), and a lower likelihood of substance abuse (adj. OR = .28) than control defendants. Logistic regression modeling showed that the variance in the relationship between unipolar depression and white-collar crime was more economically accounted for by education, race, and substance abuse.
Mulcahey, M J; Merenda, Lisa; Tian, Feng; Kozin, Scott; James, Michelle; Gogola, Gloria; Ni, Pengsheng
2013-01-01
This study examined the psychometric properties of item pools relevant to upper-extremity function and activity performance and evaluated simulated 5-, 10-, and 15-item computer adaptive tests (CATs). In a multicenter, cross-sectional study of 200 children and youth with brachial plexus birth palsy (BPBP), parents responded to upper-extremity (n = 52) and activity (n = 34) items using a 5-point response scale. We used confirmatory and exploratory factor analysis, ordinal logistic regression, item maps, and standard errors to evaluate the psychometric properties of the item banks. Validity was evaluated using analysis of variance and Pearson correlation coefficients. Results show that the two item pools have acceptable model fit, scaled well for children and youth with BPBP, and had good validity, content range, and precision. Simulated CATs performed comparably to the full item banks, suggesting that a reduced number of items provide similar information to the entire set of items. Copyright © 2013 by the American Occupational Therapy Association, Inc.
Altruism, Helping, and Volunteering: Pathways to Well-Being in Late Life
Kahana, Eva; Bhatta, Tirth; Lovegreen, Loren D.; Kahana, Boaz; Midlarsky, Elizabeth
2013-01-01
Objectives We examined the influence of prosocial orientations including altruism, volunteering, and informal helping on positive and negative well-being outcomes among retirement community dwelling elders. Method We utilize data from 2 waves, 3 years apart, of a panel study of successful aging (N = 585). Psychosocial well-being outcomes measured include life satisfaction, positive affect, negative affect, and depressive symptomatology. Results Ordinal logistic regression results indicate that altruistic attitudes, volunteering, and informal helping behaviors make unique contributions to the maintenance of life satisfaction, positive affect and other well being outcomes considered in this research. Predictors explain variance primarily in the positive indicators of psychological well-being, but are not significantly associated with the negative outcomes. Female gender and functional limitations are also associated with diminished psychological well-being. Discussion Our findings underscore the value of altruistic attitudes as important additional predictors, along with prosocial behaviors in fostering life satisfaction and positive affect in old age. PMID:23324536
Impact of Non-Suicidal Self-Injury Scale: Initial Psychometric Validation
Burke, Taylor A.; Ammerman, Brooke A.; Hamilton, Jessica L.; Alloy, Lauren B.
2017-01-01
The current study examined the psychometric properties of the Impact of Non-Suicidal Self-Injury Scale (INS), a scale developed to assess the social, behavioral, and emotional consequences of engaging in non-suicidal self-injury (NSSI). University students (N=128) who endorsed a history of NSSI were administered the INS, as well as measures of hypothesized convergent and divergent validity. Results suggested that the INS is best conceptualized as a one-factor scale, and internal consistency analyses indicated excellent reliability. The INS was significantly correlated with well-known measures of NSSI severity (i.e., NSSI frequency, NSSI recency), and measures of suicide attempt history and emotional reactivity. Logistic regression analyses indicated that the INS contributed unique variance to the prediction of physical disfigurement (i.e., NSSI scarring) and clinically significant social anxiety, even after taking into account NSSI frequency. Furthermore, the INS demonstrated divergent validity. Implications for research on NSSI disorder and clinical practice are discussed. PMID:28824214
Jordan, Jennifer; McIntosh, Virginia V W; Carter, Frances A; Joyce, Peter R; Frampton, Christopher M A; Luty, Suzanne E; McKenzie, Janice M; Carter, Janet D; Bulik, Cynthia M
2017-08-01
Failure to complete treatment for anorexia nervosa (AN) is common, clinically concerning but difficult to predict. This study examines whether therapy-related factors (patient-rated pretreatment credibility and early therapeutic alliance) predict subsequent premature termination of treatment (PTT) alongside self-transcendence (a previously identified clinical predictor) in women with AN. 56 women aged 17-40 years participating in a randomized outpatient psychotherapy trial for AN. Treatment completion was defined as attending 15/20 planned sessions. Measures were the Treatment Credibility, Temperament and Character Inventory, Vanderbilt Therapeutic Alliance Scale and the Vanderbilt Psychotherapy Process Scale. Statistics were univariate tests, correlations, and logistic regression. Treatment credibility and certain early patient and therapist alliance/process subscales predicted PTT. Lower self-transcendence and lower early process accounted for 33% of the variance in predicting PTT. Routine assessment of treatment credibility and early process (comprehensively assessed from multiple perspectives) may help clinicians reduce PTT thereby enhancing treatment outcomes. © 2017 Wiley Periodicals, Inc.
Somma, Antonella; Borroni, Serena; Maffei, Cesare; Giarolli, Laura E; Markon, Kristian E; Krueger, Robert F; Fossati, Andrea
2017-10-01
In order to assess the reliability, factorial validity, and criterion validity of the Personality Inventory for DSM-5 (PID-5) among adolescents, 1,264 Italian high school students were administered the PID-5. Participants were also administered the Questionnaire on Relationships and Substance Use as a criterion measure. In the full sample, McDonald's ω values were adequate for the PID-5 scales (median ω = .85, SD = .06), except for Suspiciousness. However, all PID-5 scales showed average inter-item correlation values in the .20-.55 range. Exploratory structural equation modeling analyses provided moderate support for the a priori model of PID-5 trait scales. Ordinal logistic regression analyses showed that selected PID-5 trait scales predicted a significant, albeit moderate (Cox & Snell R² values ranged from .08 to .15, all ps < .001) amount of variance in Questionnaire on Relationships and Substance Use variables.
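For readers unfamiliar with the Cox & Snell R² reported above, the statistic is usually defined from the log-likelihoods of the intercept-only and fitted models. The sketch below uses that standard definition; the function names and example values are hypothetical and are not taken from the study.

```python
import math


def cox_snell_r2(loglik_null, loglik_model, n_obs):
    """Cox & Snell pseudo-R^2: 1 - (L_null / L_model) ** (2 / n)."""
    return 1.0 - math.exp(2.0 * (loglik_null - loglik_model) / n_obs)


def cox_snell_max(loglik_null, n_obs):
    """Largest value the statistic can reach for a discrete outcome.

    The bound is 1 - L_null ** (2 / n), which is below 1, so Cox & Snell
    values are often modest even for well-fitting models.
    """
    return 1.0 - math.exp(2.0 * loglik_null / n_obs)


# Hypothetical values, for illustration only.
r2 = cox_snell_r2(loglik_null=-69.3, loglik_model=-60.0, n_obs=100)
print(f"Cox & Snell R^2 = {r2:.3f} (upper bound {cox_snell_max(-69.3, 100):.3f})")
```

Because the upper bound is below 1, values in the .08 to .15 range are not directly comparable to an ordinary least squares R².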
Social capital influence in illicit drug use among racial/ethnic groups in the United States.
Reynoso-Vallejo, Humberto
2011-01-01
Data from the 2003 National Survey on Drug Use and Health was utilized to elucidate the relationship between individual-level social capital and illicit drug use among racial/ethnic groups. Analysis of variance indicated that Whites had different perceptions of social capital compared to other groups, in measures of social participation, neighborhood cohesion, trust, and norms of reciprocity. Logistic regression analysis showed that individual-level social capital, measured by trust and norms of reciprocity, was weakly associated with illicit drug use. However, individuals with higher social participation were less likely to have used illicit drugs ever or during the month prior to the interview. The association between social capital and illicit drug use is discussed, as well as the role of social participation in illicit drug use. Rather than an individual-level measure of social capital, future research should employ a neighborhood-level measure of social capital that aggregates neighborhood cohesion, trust, norms of reciprocity, and social participation. Copyright © Taylor & Francis Group, LLC
Bayesian multivariate hierarchical transformation models for ROC analysis.
O'Malley, A James; Zou, Kelly H
2006-02-15
A Bayesian multivariate hierarchical transformation model (BMHTM) is developed for receiver operating characteristic (ROC) curve analysis based on clustered continuous diagnostic outcome data with covariates. Two special features of this model are that it incorporates non-linear monotone transformations of the outcomes and that multiple correlated outcomes may be analysed. The mean, variance, and transformation components are all modelled parametrically, enabling a wide range of inferences. The general framework is illustrated by focusing on two problems: (1) analysis of the diagnostic accuracy of a covariate-dependent univariate test outcome requiring a Box-Cox transformation within each cluster to map the test outcomes to a common family of distributions; (2) development of an optimal composite diagnostic test using multivariate clustered outcome data. In the second problem, the composite test is estimated using discriminant function analysis and compared to the test derived from logistic regression analysis where the gold standard is a binary outcome. The proposed methodology is illustrated on prostate cancer biopsy data from a multi-centre clinical trial.
Bayesian multivariate hierarchical transformation models for ROC analysis
O'Malley, A. James; Zou, Kelly H.
2006-01-01
SUMMARY A Bayesian multivariate hierarchical transformation model (BMHTM) is developed for receiver operating characteristic (ROC) curve analysis based on clustered continuous diagnostic outcome data with covariates. Two special features of this model are that it incorporates non-linear monotone transformations of the outcomes and that multiple correlated outcomes may be analysed. The mean, variance, and transformation components are all modelled parametrically, enabling a wide range of inferences. The general framework is illustrated by focusing on two problems: (1) analysis of the diagnostic accuracy of a covariate-dependent univariate test outcome requiring a Box–Cox transformation within each cluster to map the test outcomes to a common family of distributions; (2) development of an optimal composite diagnostic test using multivariate clustered outcome data. In the second problem, the composite test is estimated using discriminant function analysis and compared to the test derived from logistic regression analysis where the gold standard is a binary outcome. The proposed methodology is illustrated on prostate cancer biopsy data from a multi-centre clinical trial. PMID:16217836
American grandparents providing extensive child care to their grandchildren: prevalence and profile.
Fuller-Thomson, E; Minkler, M
2001-04-01
This study sought to determine the prevalence and profile of grandparents providing extensive care for a grandchild (grandparents who provide 30+ hours per week or 90+ nights per year of child care, yet are not the primary caregiver of the grandchild). Secondary analysis of the 3,260 grandparent respondents in the 1992-94 National Survey of Families and Households (NSFH). Extensively caregiving grandparents were compared with custodial grandparents (those with primary responsibility for raising a grandchild for 6+ months), noncaregivers, occasional caregivers (<10 hours per week), and intermediate caregivers using chi-square tests, one-way analysis of variance tests, and logistic regression analyses. Close to 7% of all grandparents provided extensive caregiving, as did 14.9% of those who had provided any grandchild care in the last month. Extensive caregivers most closely resembled custodial caregivers and had least in common with those grandparents who never provided child care. Areas for future research, policy, and practice are highlighted, including the potential impact of welfare reform legislation on extensively caregiving grandparents.
Work Related Psychosocial and Organizational Factors for Neck Pain in Workers in the United States
Yang, Haiou; Hitchcock, Edward; Haldeman, Scott; Swanson, Naomi; Lu, Ming-Lun; Choi, BongKyoo; Nakata, Akinori; Baker, Dean
2016-01-01
Background Neck pain is a prevalent musculoskeletal condition among workers in the United States. This study explores a set of workplace psychosocial and organization-related factors for neck pain. Methods Data used for this study come from the 2010 National Health Interview Survey which provides a representative sample of the US population. To account for the complex sampling design, the Taylor linearized variance estimation method was used. Logistic regression models were constructed to measure the associations. Results This study demonstrated significant associations between neck pain and a set of workplace risk factors including work-family imbalance, exposure to a hostile work environment and job insecurity, non-standard work arrangements, multiple jobs and long work hours. Conclusion Workers with neck pain may benefit from intervention programs that address issues related to these workplace risk factors. Future studies exploring both psychosocial risk factors and physical risk factors with a longitudinal design will be important. PMID:27184340
Women Firefighters and Workplace Harassment: Associated Suicidality and Mental Health Sequelae.
Hom, Melanie A; Stanley, Ian H; Spencer-Thomas, Sally; Joiner, Thomas E
2017-12-01
This cross-sectional study investigated the association between harassment, career suicidality, and psychiatric symptoms among women firefighters. Women firefighters (n = 290) completed self-report measures of experiences with harassment on the job, career suicidality, and various psychiatric symptoms. Logistic regression analyses and one-way analyses of variance were used to address study aims. Of the sample, 21.7% reported having experienced sexual harassment and 20.3% reported having been threatened or harassed in another way on their firefighting job. Sexual harassment and other threats/harassment on the job were both significantly associated with a greater likelihood of reporting career suicidal ideation, as well as reporting more severe psychiatric symptoms. Harassment and threats experienced on the job may be associated with increased suicide risk and more severe psychiatric symptoms among women firefighters. Efforts are needed to reduce the occurrence of harassment and threats within the fire service and provide support for women firefighters who have been harassed or threatened.
Mahmood, S; Basarab, J A; Dixon, W T; Bruce, H L
2016-11-01
Previous research has suggested that cattle predisposed to dark cutting can be identified from live animal or carcass characteristics. This hypothesis was tested using production and phenotype data from an existing data set collected from heifers (n=467) on study at three farms. Carcasses in the data set graded Canada AAA (n=136), AA (n=296), A (n=14), and B4 (dark cutting, n=21). Farm was identified as significant (P=0.0268) by CATMOD analysis and slaughter weight and carcass weight accounted for the variation in dark cutting frequency across the farms. Analysis of variance indicated that dark cutting heifers had reduced weight at weaning (P<0.0001) and at slaughter (P<0.0001), and produced reduced weight carcasses (P<0.0001). Results of logistic regression indicated that the probability of dark cutting was decreased in heifers slaughtered at live weight greater than 550kg and in carcasses weighing greater than 325kg. Copyright © 2016 Elsevier Ltd. All rights reserved.
Vroling, Maartje S; Wiersma, Femke E; Lammers, Mirjam W; Noorthoorn, Eric O
2016-11-01
Dropout rates in binge eating disorder (BED) treatment are high (17-30%), and predictors of dropout are unknown. Participants were 376 patients following an intensive outpatient cognitive behavioural therapy programme for BED, 82 of whom (21.8%) dropped out of treatment. An exploratory logistic regression was performed using eating disorder variables, general psychopathology, personality and demographics to identify predictors of dropout. Binge eating pathology, preoccupations with eating, shape and weight, social adjustment, agreeableness, and social embedding appeared to be significant predictors of dropout. Also, education showed an association to dropout. This is one of the first studies investigating pre-treatment predictors for dropout in BED treatment. The total explained variance of the prediction model was low, yet the model correctly classified 80.6% of cases, which is comparable to other dropout studies in eating disorders. Copyright © 2016 John Wiley & Sons, Ltd and Eating Disorders Association.
Long, D Adam; Reed, Roger W; Duncan, Ian
2016-07-01
Evaluate a large employer's wellness intervention by studying outcomes across the value chain, and testing Health Engagement's (HE) dose-response relationship to outcomes. Evaluation included 37 measures across eight outcomes domains (OD) using repeated measures, analysis of variance and logistic regression. Participants with higher HE had better pre-post percent changes than control: 1.7% higher for Motivation (OD1), 3.4% for Behavior (OD2), 1.0% for Emotion (OD3), 5.8% for Biometrics (OD4), 6.3% for Compliance (OD5), and 5.2% for Claims (OD6). They also had 0.5% less Productivity loss (OD7), and odds of Turnover (OD8) one-quarter to one-half that of control. A dose-response relationship with degrees of HE was also shown. Three outcomes domains (OD6 to OD8) can be monetized for cost-benefit analysis. Authors recommend, however, staying focused on driving HE and using metrics from all OD to assess value.
Workplace psychosocial and organizational factors for neck pain in workers in the United States.
Yang, Haiou; Hitchcock, Edward; Haldeman, Scott; Swanson, Naomi; Lu, Ming-Lun; Choi, BongKyoo; Nakata, Akinori; Baker, Dean
2016-07-01
Neck pain is a prevalent musculoskeletal condition among workers in the United States. This study explores a set of workplace psychosocial and organization-related factors for neck pain. Data used for this study come from the 2010 National Health Interview Survey which provides a representative sample of the US population. To account for the complex sampling design, the Taylor linearized variance estimation method was used. Logistic regression models were constructed to measure the associations. This study demonstrated significant associations between neck pain and a set of workplace risk factors, including work-family imbalance, exposure to a hostile work environment and job insecurity, non-standard work arrangements, multiple jobs, and long work hours. Workers with neck pain may benefit from intervention programs that address issues related to these workplace risk factors. Future studies exploring both psychosocial risk factors and physical risk factors with a longitudinal design will be important. Am. J. Ind. Med. 59:549-560, 2016. © 2016 Wiley Periodicals, Inc.
Rongen, Anne; Robroek, Suzan J W; Schaufeli, Wilmar; Burdorf, Alex
2014-08-01
To investigate whether work engagement influences self-perceived health, work ability, and sickness absence beyond health behaviors and work-related characteristics. Employees of two organizations participated in a 6-month longitudinal study (n = 733). Using questionnaires, information was collected on health behaviors, work-related characteristics, and work engagement at baseline, and self-perceived health, work ability, and sickness absence at 6-month follow-up. Associations between baseline and follow-up variables were studied using multivariate and multinomial logistic regression analyses and changes in R² were calculated. Low work engagement was related to low work ability (odds ratio: 3.68; 95% confidence interval: 2.15 to 6.30) and long-term sickness absence (odds ratio: 1.84; 95% confidence interval: 1.04 to 3.27). Work engagement increased the explained variance in work ability and sickness absence by 4.1% and 0.5%, respectively. Work engagement contributes to work ability beyond known health behaviors and work-related characteristics.
Briken, Peer; Habermann, Niels; Berner, Wolfgang; Hill, Andreas
2005-09-01
The aim of this study was to investigate the number and type of brain abnormalities and their influence on psychosocial development, criminal history and paraphilias in sexual murderers. We analyzed psychiatric court reports of 166 sexual murderers and compared a group with notable signs of brain abnormalities (N = 50) with those without any signs (N = 116). Sexual murderers with brain abnormalities suffered more from early behavior problems. They were less likely to cohabitate with the victim at the time of the homicide and had more victims at the age of six years or younger. Psychiatric diagnoses revealed a higher total number of paraphilias: Transvestic fetishism and paraphilias not otherwise specified were more frequent in offenders with brain abnormalities. A binary logistic regression identified five predictors that accounted for 46.8% of the variance explaining the presence of brain abnormalities. Our results suggest the importance of a comprehensive neurological and psychological examination of this special offender group.
Late recognition of pregnancy as a predictor of adverse birth outcomes.
Ayoola, Adejoke B; Stommel, Manfred; Nettleman, Mary D
2009-08-01
We examined the relationship between the time of recognition of pregnancy and birth outcomes, such as premature births, low birthweight (LBW), admission to the neonatal intensive care unit (NICU), and infant mortality. A secondary analysis was performed using the Pregnancy Risk Assessment and Monitoring System (PRAMS) multistate data from 2000-2004. The sample consisted of 136,373 women who had a live childbirth. Analysis involved multiple logistic regression models, appropriately weighted for point and variance estimation to reflect the complex survey design of the PRAMS using STATA 9.2 (Stata Corp, College Station, TX). Approximately 27.6% recognized their pregnancy late (after 6 weeks of gestation). Late recognition was significantly associated with an increased odds of having premature births (odds ratio [OR], 1.09; 99% confidence interval [CI], 1.01-1.19), LBW (OR, 1.08; 99% CI, 1.01-1.15), and NICU admissions (OR, 1.12; 99% CI, 1.03-1.21). These results provide a rationale and an impetus for developing interventions that promote early recognition of pregnancy.
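The odds ratios and 99% confidence intervals quoted above come from survey-weighted logistic regression, which cannot be reproduced from the abstract. As a much simpler illustration of how an odds ratio and a Wald interval relate, the sketch below works from an unweighted 2x2 table; the function name and the counts are hypothetical, and the calculation ignores the complex survey design that the study accounted for.

```python
import math
from statistics import NormalDist


def odds_ratio_wald_ci(a, b, c, d, level=0.99):
    """Odds ratio and Wald confidence interval from a 2x2 table.

    a: exposed with outcome      b: exposed without outcome
    c: unexposed with outcome    d: unexposed without outcome
    All four counts must be positive. Unweighted: no survey design correction.
    """
    odds_ratio = (a * d) / (b * c)
    # Standard error of log(OR) under the usual large-sample approximation.
    se_log_or = math.sqrt(1.0 / a + 1.0 / b + 1.0 / c + 1.0 / d)
    z = NormalDist().inv_cdf(0.5 + level / 2.0)  # about 2.576 for a 99% interval
    lower = math.exp(math.log(odds_ratio) - z * se_log_or)
    upper = math.exp(math.log(odds_ratio) + z * se_log_or)
    return odds_ratio, lower, upper


# Hypothetical counts, for illustration only.
or_, lo, hi = odds_ratio_wald_ci(20, 80, 10, 90, level=0.99)
print(f"OR = {or_:.2f}, 99% CI {lo:.2f} to {hi:.2f}")
```

An interval that excludes 1, as in the reported results, indicates an association at the chosen confidence level; a 99% interval is wider than the more common 95% interval for the same data.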
Breeding population density and habitat use of Swainson's warblers in a Georgia floodplain forest
Wright, E.A.
2002-01-01
I examined density and habitat use of a Swainson's Warbler (Limnothlypis swainsonii) breeding population in Georgia. This songbird species is inadequately monitored, and may be declining due to anthropogenic alteration of floodplain forest breeding habitats. I used distance sampling methods to estimate density, finding 9.4 singing males/ha (CV = 0.298). Individuals were encountered too infrequently to produce a low-variance estimate, and distance sampling thus may be impracticable for monitoring this relatively rare species. I developed a set of multivariate habitat models using binary logistic regression techniques, based on measurement of 22 variables in 56 plots occupied by Swainson's Warblers and 110 unoccupied plots. Occupied areas were characterized by high stem density of cane (Arundinaria gigantea) and other shrub layer vegetation, and presence of abundant and accessible leaf litter. I recommend two habitat models, which correctly classified 87-89% of plots in cross-validation runs, for potential use in habitat assessment at other locations.
Mahrooghy, Majid; Ashraf, Ahmed B; Daye, Dania; Mies, Carolyn; Feldman, Michael; Rosen, Mark; Kontos, Despina
2013-01-01
Breast tumors are heterogeneous lesions. Intra-tumor heterogeneity presents a major challenge for cancer diagnosis and treatment. Few studies have worked on capturing tumor heterogeneity from imaging. Most studies to date consider aggregate measures for tumor characterization. In this work we capture tumor heterogeneity by partitioning tumor pixels into subregions and extracting heterogeneity wavelet kinetic (HetWave) features from breast dynamic contrast-enhanced magnetic resonance imaging (DCE-MRI) to obtain the spatiotemporal patterns of the wavelet coefficients and contrast agent uptake from each partition. Using a genetic algorithm for feature selection, and a logistic regression classifier with leave-one-out cross validation, we tested our proposed HetWave features for the task of classifying breast cancer recurrence risk. The classifier based on our features gave an ROC AUC of 0.78, outperforming previously proposed kinetic, texture, and spatial enhancement variance features which give AUCs of 0.69, 0.64, and 0.65, respectively.
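The ROC AUC values compared above can be read as the probability that a randomly chosen positive case receives a higher classifier score than a randomly chosen negative case. The sketch below computes the AUC directly from that pairwise definition; the function name and the toy scores are hypothetical, and under leave-one-out cross validation the scores would be the held-out predictions, which this sketch simply takes as given.

```python
def roc_auc(labels, scores):
    """ROC AUC as the fraction of (positive, negative) pairs ranked correctly.

    labels: iterable of 0/1 class labels
    scores: iterable of classifier scores, higher meaning "more likely positive"
    Tied pairs count as half a correct ranking.
    """
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("need at least one positive and one negative case")
    correct = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                correct += 1.0
            elif p == n:
                correct += 0.5
    return correct / (len(positives) * len(negatives))


# Toy held-out scores, for illustration only.
print(roc_auc([1, 1, 0, 0], [0.9, 0.4, 0.5, 0.1]))
```

The double loop is quadratic in the number of cases, which is acceptable for small imaging cohorts; a rank-based formula gives the same value more efficiently on large data.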
Racial differences in sexual dysfunction among postdeployed Iraq and Afghanistan veterans
Monawar Hosain, G. M.; Latini, David M.; Kauth, Michael R.; Goltz, Heather Honoré; Helmer, Drew A.
2015-01-01
This study examined racial/ethnic differences in the prevalence of and risk factors for sexual dysfunction among postdeployed Iraq/Afghanistan veterans. A total of 3,962 recently deployed veterans were recruited from the Houston Veterans Affairs medical center. The authors examined sociodemographic, medical, mental-health, and lifestyle-related variables. Sexual dysfunction was diagnosed by ICD9-CM code and/or medicines prescribed for sexual dysfunction. Analyses included chi-square, analysis of variance, and multivariate logistic regression. Sexual dysfunction was observed in 4.7% of Whites, 7.9% of African Americans, and 6.3% of Hispanics. Age, marital status, smoking, and hypertension were risk factors for Whites, whereas age, marital status, posttraumatic stress disorder, and hypertension were significant for African Americans. For Hispanics, only age and posttraumatic stress disorder were significant. This study identified that risk factors for sexual dysfunction varied by race/ethnicity. All postdeployed veterans should be screened, and psychosocial support and educational materials should address race/ethnicity-specific risk factors. PMID:23300201
Marriage, Work, and Racial Inequalities in Poverty: Evidence from the U.S.
Thiede, Brian; Kim, Hyojung; Slack, Tim
2017-10-01
This paper explores recent racial and ethnic inequalities in poverty, estimating the share of racial poverty differentials that can be explained by variation in family structure and workforce participation. The authors use logistic regression to estimate the association between poverty and race, family structure, and workforce participation. They then decompose between-race differences in poverty risk to quantify how racial disparities in marriage and work explain observed inequalities in the log odds of poverty. They estimate that 47.7-48.9% of black-white differences in poverty risk can be explained by between-group variance in these two factors, while only 4.3-4.5% of the Hispanic-white differential in poverty risk can be explained by these variables. These findings underscore the continued association between racial disparities in poverty and those in labor and marriage markets. However, clear racial differences in the origin of poverty suggest that family- and work-related policy interventions will not have uniformly effective or evenly distributed impacts on poverty reduction.
Improving size estimates of open animal populations by incorporating information on age
Manly, Bryan F.J.; McDonald, Trent L.; Amstrup, Steven C.; Regehr, Eric V.
2003-01-01
Around the world, a great deal of effort is expended each year to estimate the sizes of wild animal populations. Unfortunately, population size has proven to be one of the most intractable parameters to estimate. The capture-recapture estimation models most commonly used (of the Jolly-Seber type) are complicated and require numerous, sometimes questionable, assumptions. The derived estimates usually have large variances and lack consistency over time. In capture–recapture studies of long-lived animals, the ages of captured animals can often be determined with great accuracy and relative ease. We show how to incorporate age information into size estimates for open populations, where the size changes through births, deaths, immigration, and emigration. The proposed method allows more precise estimates of population size than the usual models, and it can provide these estimates from two sample occasions rather than the three usually required. Moreover, this method does not require specialized programs for capture-recapture data; researchers can derive their estimates using the logistic regression module in any standard statistical package.
Yang, Gai; Leicht, Anthony S; Lago, Carlos; Gómez, Miguel-Ángel
2018-01-01
The aim of this study was to identify the key physical and technical performance variables related to team quality in the Chinese Super League (CSL). Teams' performance variables were collected from 240 matches and analysed via analysis of variance between end-of-season-ranked groups and multinomial logistic regression. Significant physical performance differences between groups were identified for sprinting (top-ranked group vs. upper-middle-ranked group) and total distance covered without possession (upper and upper-middle-ranked groups and lower-ranked group). For technical performance, teams in the top-ranked group exhibited a significantly greater amount of possession in opponent's half, number of entry passes in the final 1/3 of the field and the Penalty Area, and 50-50 challenges than lower-ranked teams. Finally, time of possession increased the probability of a win compared with a draw. The current study identified key performance indicators that differentiated end-season team quality within the CSL.
The Variance Normalization Method of Ridge Regression Analysis.
ERIC Educational Resources Information Center
Bulcock, J. W.; And Others
The testing of contemporary sociological theory often calls for the application of structural-equation models to data which are inherently collinear. It is shown that simple ridge regression, which is commonly used for controlling the instability of ordinary least squares regression estimates in ill-conditioned data sets, is not a legitimate…
NASA Technical Reports Server (NTRS)
Alston, D. W.
1981-01-01
The considered research had the objective to design a statistical model that could perform an error analysis of curve fits of wind tunnel test data using analysis of variance and regression analysis techniques. Four related subproblems were defined, and by solving each of these a solution to the general research problem was obtained. The capabilities of the evolved true statistical model are considered. The least squares fit is used to determine the nature of the force, moment, and pressure data. The order of the curve fit is increased in order to delete the quadratic effect in the residuals. The analysis of variance is used to determine the magnitude and effect of the error factor associated with the experimental data.
Determination of riverbank erosion probability using Locally Weighted Logistic Regression
NASA Astrophysics Data System (ADS)
Ioannidou, Elena; Flori, Aikaterini; Varouchakis, Emmanouil A.; Giannakis, Georgios; Vozinaki, Anthi Eirini K.; Karatzas, George P.; Nikolaidis, Nikolaos
2015-04-01
Riverbank erosion is a natural geomorphologic process that affects the fluvial environment. The most important issue concerning riverbank erosion is the identification of the vulnerable locations. An alternative to the usual hydrodynamic models to predict vulnerable locations is to quantify the probability of erosion occurrence. This can be achieved by identifying the underlying relations between riverbank erosion and the geomorphological or hydrological variables that prevent or stimulate erosion. Thus, riverbank erosion can be determined by a regression model using independent variables that are considered to affect the erosion process. The impact of such variables may vary spatially; therefore, a non-stationary regression model is preferred over a stationary equivalent. Locally Weighted Regression (LWR) is proposed as a suitable choice. This method can be extended to predict the binary presence or absence of erosion based on a series of independent local variables by using the logistic regression model. It is referred to as Locally Weighted Logistic Regression (LWLR). Logistic regression is a type of regression analysis used for predicting the outcome of a categorical dependent variable (e.g. binary response) based on one or more predictor variables. The method can be combined with LWR to assign weights to local independent variables of the dependent one. LWR allows model parameters to vary over space in order to reflect spatial heterogeneity. The probabilities of the possible outcomes are modelled as a function of the independent variables using a logistic function. Logistic regression measures the relationship between a categorical dependent variable and, usually, one or several continuous independent variables by converting the dependent variable to probability scores. Then, a logistic regression is formed, which predicts success or failure of a given binary variable (e.g. erosion presence or absence) for any value of the independent variables. 
The erosion occurrence probability can be calculated in conjunction with the model deviance regarding the independent variables tested. The most straightforward measure for goodness of fit is the G statistic. It is a simple and effective way to study and evaluate the Logistic Regression model efficiency and the reliability of each independent variable. The developed statistical model is applied to the Koiliaris River Basin on the island of Crete, Greece. Two datasets of river bank slope, river cross-section width and indications of erosion were available for the analysis (12 and 8 locations). Two different types of spatial dependence functions, exponential and tricubic, were examined to determine the local spatial dependence of the independent variables at the measurement locations. The results show a significant improvement when the tricubic function is applied as the erosion probability is accurately predicted at all eight validation locations. Results for the model deviance show that cross-section width is more important than bank slope in the estimation of erosion probability along the Koiliaris riverbanks. The proposed statistical model is a useful tool that quantifies the erosion probability along the riverbanks and can be used to assist managing erosion and flooding events. Acknowledgements This work is part of an on-going THALES project (CYBERSENSORS - High Frequency Monitoring System for Integrated Water Resources Management of Rivers). The project has been co-financed by the European Union (European Social Fund - ESF) and Greek national funds through the Operational Program "Education and Lifelong Learning" of the National Strategic Reference Framework (NSRF) - Research Funding Program: THALES. Investing in knowledge society through the European Social Fund.
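The locally weighted logistic regression described in this abstract can be sketched roughly as follows. This is a minimal illustration and not the authors' implementation: the function names, the bandwidth argument, the small ridge penalty, and the Newton-Raphson fitting loop are all assumptions. Only the tricubic weighting function and the logistic model itself come from the abstract.

```python
import numpy as np

def tricube_weights(dist, bandwidth):
    """Tricubic kernel: (1 - (d/h)^3)^3 for d < h, and 0 beyond the bandwidth h."""
    u = np.clip(np.asarray(dist, dtype=float) / bandwidth, 0.0, 1.0)
    return (1.0 - u**3) ** 3

def fit_weighted_logistic(X, y, w, n_iter=50, ridge=1e-3):
    """Weighted logistic regression fitted by Newton-Raphson.

    The ridge term (an assumption, applied to all coefficients including the
    intercept) only keeps the Hessian invertible on very small samples.
    """
    X = np.asarray(X, dtype=float)
    if X.ndim == 1:
        X = X[:, None]
    Xd = np.column_stack([np.ones(X.shape[0]), X])  # prepend intercept column
    y = np.asarray(y, dtype=float)
    w = np.asarray(w, dtype=float)
    beta = np.zeros(Xd.shape[1])
    for _ in range(n_iter):
        p = 1.0 / (1.0 + np.exp(-(Xd @ beta)))
        grad = Xd.T @ (w * (y - p)) - ridge * beta
        hess = (Xd * (w * p * (1.0 - p))[:, None]).T @ Xd + ridge * np.eye(Xd.shape[1])
        step = np.linalg.solve(hess, grad)
        beta = beta + step
        if np.max(np.abs(step)) < 1e-8:
            break
    return beta

def lwlr_predict(coords, X, y, target_coord, target_x, bandwidth):
    """Erosion probability at one location from a model weighted towards nearby sites."""
    dist = np.linalg.norm(np.asarray(coords, dtype=float) - target_coord, axis=1)
    beta = fit_weighted_logistic(X, y, tricube_weights(dist, bandwidth))
    eta = beta[0] + beta[1:] @ np.atleast_1d(np.asarray(target_x, dtype=float))
    return 1.0 / (1.0 + np.exp(-eta))
```

Because a separate model is fitted for every prediction location, the coefficients are free to vary over space, which is the non-stationarity the abstract refers to. An exponential kernel could be swapped in for `tricube_weights` to compare the two spatial dependence functions.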
NASA Astrophysics Data System (ADS)
Yilmaz, Işık
2009-06-01
The purpose of this study is to compare the landslide susceptibility mapping methods of frequency ratio (FR), logistic regression and artificial neural networks (ANN) applied in the Kat County (Tokat, Turkey). A digital elevation model (DEM) was first constructed using GIS software. Landslide-related factors such as geology, faults, drainage system, topographical elevation, slope angle, slope aspect, topographic wetness index (TWI) and stream power index (SPI) were used in the landslide susceptibility analyses. Landslide susceptibility maps were produced from the frequency ratio, logistic regression and neural networks models, and they were then compared by means of their validations. The accuracies of the susceptibility maps for all three models were obtained by comparing the landslide susceptibility maps with the known landslide locations. The respective area under curve (AUC) values of 0.826, 0.842 and 0.852 for frequency ratio, logistic regression and artificial neural networks showed that, although the map obtained from the ANN model is more accurate than those from the other models, the accuracies of all three models can be considered relatively similar. The results obtained in this study also showed that the frequency ratio model can be used as a simple tool in the assessment of landslide susceptibility when sufficient data are available. The input process, calculations and output process are very simple and can be readily understood in the frequency ratio model; however, logistic regression and neural networks require the conversion of data to ASCII or other formats. Moreover, it is also very hard to process the large amount of data in the statistical package.
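The simplicity of the frequency ratio model mentioned in this abstract can be illustrated with a short sketch. The abstract does not give the formula, so the code assumes the definition commonly used in landslide susceptibility work: the share of landslide cells that fall in a factor class divided by the share of all cells in that class. The function and argument names are hypothetical.

```python
import numpy as np

def frequency_ratio(class_ids, landslide_mask):
    """Frequency ratio per factor class (assumed standard definition).

    class_ids      : class label of each grid cell (e.g. a slope-angle bin)
    landslide_mask : True where a known landslide occupies the cell
    """
    class_ids = np.asarray(class_ids)
    landslide_mask = np.asarray(landslide_mask, dtype=bool)
    total_cells = class_ids.size
    total_slides = landslide_mask.sum()
    ratios = {}
    for c in np.unique(class_ids):
        in_class = class_ids == c
        slide_share = landslide_mask[in_class].sum() / total_slides
        area_share = in_class.sum() / total_cells
        ratios[c.item()] = float(slide_share / area_share)
    return ratios
```

Under this definition, a value above 1 means landslides are over-represented in the class relative to its area, and summing the ratios of each cell's classes across all factors gives a susceptibility index for that cell.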
ERIC Educational Resources Information Center
Schumacher, Phyllis; Olinsky, Alan; Quinn, John; Smith, Richard
2010-01-01
The authors extended previous research by 2 of the authors who conducted a study designed to predict the successful completion of students enrolled in an actuarial program. They used logistic regression to determine the probability of an actuarial student graduating in the major or dropping out. They compared the results of this study with those…
Carolyn B. Meyer; Sherri L. Miller; C. John Ralph
2004-01-01
The scale at which habitat variables are measured affects the accuracy of resource selection functions in predicting animal use of sites. We used logistic regression models for a wide-ranging species, the marbled murrelet, (Brachyramphus marmoratus) in a large region in California to address how much changing the spatial or temporal scale of...
ERIC Educational Resources Information Center
Monahan, Patrick O.; McHorney, Colleen A.; Stump, Timothy E.; Perkins, Anthony J.
2007-01-01
Previous methodological and applied studies that used binary logistic regression (LR) for detection of differential item functioning (DIF) in dichotomously scored items either did not report an effect size or did not employ several useful measures of DIF magnitude derived from the LR model. Equations are provided for these effect size indices.…
ERIC Educational Resources Information Center
Magis, David; Raiche, Gilles; Beland, Sebastien; Gerard, Paul
2011-01-01
We present an extension of the logistic regression procedure to identify dichotomous differential item functioning (DIF) in the presence of more than two groups of respondents. Starting from the usual framework of a single focal group, we propose a general approach to estimate the item response functions in each group and to test for the presence…
Risk Factors of Falls in Community-Dwelling Older Adults: Logistic Regression Tree Analysis
ERIC Educational Resources Information Center
Yamashita, Takashi; Noe, Douglas A.; Bailer, A. John
2012-01-01
Purpose of the Study: A novel logistic regression tree-based method was applied to identify fall risk factors and possible interaction effects of those risk factors. Design and Methods: A nationally representative sample of American older adults aged 65 years and older (N = 9,592) in the Health and Retirement Study 2004 and 2006 modules was used.…
ERIC Educational Resources Information Center
Gordovil-Merino, Amalia; Guardia-Olmos, Joan; Pero-Cebollero, Maribel
2012-01-01
In this paper, we used simulations to compare the performance of classical and Bayesian estimations in logistic regression models using small samples. In the performed simulations, conditions were varied, including the type of relationship between independent and dependent variable values (i.e., unrelated and related values), the type of variable…
Ohlmacher, G.C.; Davis, J.C.
2003-01-01
Landslides in the hilly terrain along the Kansas and Missouri rivers in northeastern Kansas have caused millions of dollars in property damage during the last decade. To address this problem, a statistical method called multiple logistic regression has been used to create a landslide-hazard map for Atchison, Kansas, and surrounding areas. Data included digitized geology, slopes, and landslides, manipulated using ArcView GIS. Logistic regression relates predictor variables to the occurrence or nonoccurrence of landslides within geographic cells and uses the relationship to produce a map showing the probability of future landslides, given local slopes and geologic units. Results indicated that slope is the most important variable for estimating landslide hazard in the study area. Geologic units consisting mostly of shale, siltstone, and sandstone were most susceptible to landslides. Soil type and aspect ratio were considered but excluded from the final analysis because these variables did not significantly add to the predictive power of the logistic regression. Soil types were highly correlated with the geologic units, and no significant relationships existed between landslides and slope aspect. © 2003 Elsevier Science B.V. All rights reserved.
A Method for Calculating the Probability of Successfully Completing a Rocket Propulsion Ground Test
NASA Technical Reports Server (NTRS)
Messer, Bradley
2007-01-01
Propulsion ground test facilities face the daily challenge of scheduling multiple customers into limited facility space and successfully completing their propulsion test projects. Over the last decade NASA's propulsion test facilities have performed hundreds of tests, collected thousands of seconds of test data, and exceeded the capabilities of numerous test facility and test article components. A logistic regression mathematical modeling technique has been developed to predict the probability of successfully completing a rocket propulsion test. A logistic regression model is a mathematical modeling approach that can be used to describe the relationship of several independent predictor variables X_1, X_2, ..., X_k to a binary or dichotomous dependent variable Y, where Y can only be one of two possible outcomes, in this case Success or Failure of accomplishing a full duration test. The use of logistic regression modeling is not new; however, modeling propulsion ground test facilities using logistic regression is both a new and unique application of the statistical technique. Results from this type of model provide project managers with insight and confidence into the effectiveness of rocket propulsion ground testing.
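For reference, the general form of a logistic regression model with k predictors and a binary outcome can be written as below. The coefficients \beta_0, ..., \beta_k are notation assumed here; the abstract names neither the coefficients nor the specific predictors used.

```latex
\[
P(Y = \text{Success} \mid X_1, \dots, X_k)
  = \frac{1}{1 + \exp\!\left[-\left(\beta_0 + \sum_{j=1}^{k} \beta_j X_j\right)\right]}
\]
```

Equivalently, the log odds of success are linear in the predictors: \log\frac{P}{1-P} = \beta_0 + \sum_{j} \beta_j X_j.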
Fei, Yang; Hu, Jian; Gao, Kun; Tu, Jianfeng; Li, Wei-Qin; Wang, Wei
2017-06-01
To construct a radial basis function (RBF) artificial neural networks (ANNs) model to predict the incidence of acute pancreatitis (AP)-induced portal vein thrombosis (PVT). The analysis included 353 patients with AP who were admitted between January 2011 and December 2015. An RBF ANNs model and a logistic regression model were each constructed from eleven factors relevant to AP. Statistical indexes were used to evaluate the predictive value of the two models. The sensitivity, specificity, positive predictive value, negative predictive value and accuracy of the RBF ANNs model for predicting PVT were 73.3%, 91.4%, 68.8%, 93.0% and 87.7%, respectively. There were significant differences between the RBF ANNs and logistic regression models in these parameters (P<0.05). In addition, a comparison of the area under receiver operating characteristic curves of the two models showed a statistically significant difference (P<0.05). The RBF ANNs model predicted the occurrence of AP-induced PVT better than the logistic regression model. D-dimer, AMY, Hct and PT were important predictors of AP-induced PVT. Copyright © 2017 Elsevier Inc. All rights reserved.
Wu, Robert; Glen, Peter; Ramsay, Tim; Martel, Guillaume
2014-06-28
Observational studies dominate the surgical literature. Statistical adjustment is an important strategy to account for confounders in observational studies. Research has shown that published articles are often poor in statistical quality, which may jeopardize their conclusions. The Statistical Analyses and Methods in the Published Literature (SAMPL) guidelines have been published to help establish standards for statistical reporting. This study will seek to determine whether the quality of statistical adjustment and the reporting of these methods are adequate in surgical observational studies. We hypothesize that incomplete reporting will be found in all surgical observational studies, and that the quality and reporting of these methods will be of lower quality in surgical journals when compared with medical journals. Finally, this work will seek to identify predictors of high-quality reporting. This work will examine the top five general surgical and medical journals, based on a 5-year impact factor (2007-2012). All observational studies investigating an intervention related to an essential component area of general surgery (defined by the American Board of Surgery), with an exposure, outcome, and comparator, will be included in this systematic review. Essential elements related to statistical reporting and quality were extracted from the SAMPL guidelines and include domains such as intent of analysis, primary analysis, multiple comparisons, numbers and descriptive statistics, association and correlation analyses, linear regression, logistic regression, Cox proportional hazard analysis, analysis of variance, survival analysis, propensity analysis, and independent and correlated analyses. Each article will be scored as a proportion based on fulfilling criteria in relevant analyses used in the study. A logistic regression model will be built to identify variables associated with high-quality reporting. 
A comparison will be made between the scores of surgical observational studies published in medical versus surgical journals. Secondary outcomes will pertain to individual domains of analysis. Sensitivity analyses will be conducted. This study will explore the reporting and quality of statistical analyses in surgical observational studies published in the most referenced surgical and medical journals in 2013 and examine whether variables (including the type of journal) can predict high-quality reporting.
Testing Interaction Effects without Discarding Variance.
ERIC Educational Resources Information Center
Lopez, Kay A.
Analysis of variance (ANOVA) and multiple regression are two of the most commonly used methods of data analysis in behavioral science research. Although ANOVA was intended for use with experimental designs, educational researchers have used ANOVA extensively in aptitude-treatment interaction (ATI) research. This practice tends to make researchers…
Wang, Shuang; Jiang, Xiaoqian; Wu, Yuan; Cui, Lijuan; Cheng, Samuel; Ohno-Machado, Lucila
2013-01-01
We developed an EXpectation Propagation LOgistic REgRession (EXPLORER) model for distributed privacy-preserving online learning. The proposed framework provides a high level guarantee for protecting sensitive information, since the information exchanged between the server and the client is the encrypted posterior distribution of coefficients. Through experimental results, EXPLORER shows the same performance (e.g., discrimination, calibration, feature selection etc.) as the traditional frequentist Logistic Regression model, but provides more flexibility in model updating. That is, EXPLORER can be updated one point at a time rather than having to retrain the entire data set when new observations are recorded. The proposed EXPLORER supports asynchronized communication, which relieves the participants from coordinating with one another, and prevents service breakdown from the absence of participants or interrupted communications. PMID:23562651
Brügemann, K; Gernand, E; von Borstel, U U; König, S
2011-08-01
Data used in the present study included 1,095,980 first-lactation test-day records for protein yield of 154,880 Holstein cows housed on 196 large-scale dairy farms in Germany. Data were recorded between 2002 and 2009 and merged with meteorological data from public weather stations. The maximum distance between each farm and its corresponding weather station was 50 km. Hourly temperature-humidity indexes (THI) were calculated using the mean of hourly measurements of dry bulb temperature and relative humidity. On the phenotypic scale, an increase in THI was generally associated with a decrease in daily protein yield. For genetic analyses, a random regression model was applied using time-dependent (d in milk, DIM) and THI-dependent covariates. Additive genetic and permanent environmental effects were fitted with this random regression model and Legendre polynomials of order 3 for DIM and THI. In addition, the fixed curve was modeled with Legendre polynomials of order 3. Heterogeneous residuals were fitted by dividing DIM into 5 classes, and by dividing THI into 4 classes, resulting in 20 different classes. Additive genetic variances for daily protein yield decreased with increasing degrees of heat stress and were lowest at the beginning of lactation and at extreme THI. Due to higher additive genetic variances, slightly higher permanent environment variances, and similar residual variances, heritabilities were highest for low THI in combination with DIM at the end of lactation. Genetic correlations among individual values for THI were generally >0.90. These trends from the complex random regression model were verified by applying relatively simple bivariate animal models for protein yield measured in 2 THI environments; that is, defining a THI value of 60 as a threshold. These high correlations indicate the absence of any substantial genotype × environment interaction for protein yield. 
However, heritabilities and additive genetic variances from the random regression model tended to be slightly higher in the THI range corresponding to cows' comfort zone. Selecting such superior environments for progeny testing can contribute to an accurate genetic differentiation among selection candidates. Copyright © 2011 American Dairy Science Association. Published by Elsevier Inc. All rights reserved.
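The Legendre polynomial covariates used in such random regression models can be sketched as follows. This is an illustration under stated assumptions and not the authors' code: the days-in-milk range of 5 to 305, the reading of "order 3" as polynomial degree 3 (four columns including the constant), and the use of unnormalized polynomials are all guesses, and the function name is hypothetical.

```python
import numpy as np
from numpy.polynomial import legendre

def legendre_covariates(values, lower, upper, degree=3):
    """Map a covariate (e.g. DIM or THI) onto [-1, 1] and evaluate Legendre polynomials.

    Returns an array of shape (n, degree + 1) with columns P_0(x), ..., P_degree(x).
    """
    x = 2.0 * (np.asarray(values, dtype=float) - lower) / (upper - lower) - 1.0
    return legendre.legvander(x, degree)

# Usage with an assumed lactation window of 5 to 305 days in milk.
dim_basis = legendre_covariates([5, 155, 305], lower=5, upper=305)
```

Each row of the result would multiply an animal's random regression coefficients, so that genetic and permanent environmental effects become smooth functions of DIM (and, with a second basis, of THI).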
Dietary consumption patterns and laryngeal cancer risk.
Vlastarakos, Petros V; Vassileiou, Andrianna; Delicha, Evie; Kikidis, Dimitrios; Protopapas, Dimosthenis; Nikolopoulos, Thomas P
2016-06-01
We conducted a case-control study to investigate the effect of diet on laryngeal carcinogenesis. Our study population was made up of 140 participants: 70 patients with laryngeal cancer (LC) and 70 controls with a non-neoplastic condition that was unrelated to diet, smoking, or alcohol. A food-frequency questionnaire determined the mean consumption of 113 different items during the 3 years prior to symptom onset. Total energy intake and cooking mode were also noted. The relative risk, odds ratio (OR), and 95% confidence interval (CI) were estimated by multiple logistic regression analysis. We found that the total energy intake was significantly higher in the LC group (p < 0.001), and that the difference remained statistically significant after logistic regression analysis (p < 0.001; OR: 118.70). Notably, meat consumption was higher in the LC group (p < 0.001), and the difference remained significant after logistic regression analysis (p = 0.029; OR: 1.16). LC patients also consumed significantly more fried food (p = 0.036); this difference also remained significant in the logistic regression model (p = 0.026; OR: 5.45). The LC group also consumed significantly more seafood (p = 0.012); the difference persisted after logistic regression analysis (p = 0.009; OR: 2.48), with the consumption of shrimp proving detrimental (p = 0.049; OR: 2.18). Finally, the intake of zinc was significantly higher in the LC group before and after logistic regression analysis (p = 0.034 and p = 0.011; OR: 30.15, respectively). Cereal consumption (including pastas) was also higher among the LC patients (p = 0.043), with logistic regression analysis showing that their negative effect was possibly associated with the sauces and dressings that traditionally accompany pasta dishes (p = 0.006; OR: 4.78). 
Conversely, a higher consumption of dairy products was found in controls (p < 0.05); logistic regression analysis showed that calcium appeared to be protective at the micronutrient level (p < 0.001; OR: 0.27). We found no difference in the overall consumption of fruits and vegetables between the LC patients and controls; however, the LC patients did have a greater consumption of cooked tomatoes and cooked root vegetables (p = 0.039 for both), and the controls had more consumption of leeks (p = 0.042) and, among controls younger than 65 years, cooked beans (p = 0.037). Lemon (p = 0.037), squeezed fruit juice (p = 0.032), and watermelon (p = 0.018) were also more frequently consumed by the controls. Other differences at the micronutrient level included greater consumption by the LC patients of retinol (p = 0.044), polyunsaturated fats (p = 0.041), and linoleic acid (p = 0.008); LC patients younger than 65 years also had greater intake of riboflavin (p = 0.045). We conclude that the differences in dietary consumption patterns between LC patients and controls indicate a possible role for lifestyle modifications involving nutritional factors as a means of decreasing the risk of laryngeal cancer.
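Odds ratios and confidence intervals of the kind reported in this abstract are obtained from logistic regression coefficients by exponentiation. The sketch below assumes a Wald-type interval, which the abstract does not confirm; the function name and the numbers in the usage line are hypothetical and are not taken from the study.

```python
import math

def odds_ratio_ci(beta, se, z=1.96):
    """Odds ratio and Wald confidence limits from a coefficient and its standard error.

    z = 1.96 gives an approximate 95% interval.
    """
    return math.exp(beta), math.exp(beta - z * se), math.exp(beta + z * se)

# Hypothetical coefficient and standard error, for illustration only.
or_, lower, upper = odds_ratio_ci(beta=0.9, se=0.35)
```

An interval that excludes 1 corresponds to a coefficient that differs significantly from zero at the matching level, which is how the p-values and ORs quoted in the abstract relate to each other.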
Interpreting Regression Results: beta Weights and Structure Coefficients are Both Important.
ERIC Educational Resources Information Center
Thompson, Bruce
Various realizations have led to less frequent use of the "OVA" methods (analysis of variance--ANOVA--among others) and to more frequent use of general linear model approaches such as regression. However, too few researchers understand all the various coefficients produced in regression. This paper explains these coefficients and their…
Transforming RNA-Seq Data to Improve the Performance of Prognostic Gene Signatures
Zwiener, Isabella; Frisch, Barbara; Binder, Harald
2014-01-01
Gene expression measurements have successfully been used for building prognostic signatures, i.e., for identifying a short list of important genes that can predict patient outcome. Mostly microarray measurements have been considered, and there is little advice available for building multivariable risk prediction models from RNA-Seq data. We specifically consider penalized regression techniques, such as the lasso and componentwise boosting, which can simultaneously consider all measurements and provide both multivariable regression models for prediction and automated variable selection. However, they might be affected by the typical skewness, mean-variance dependency or extreme values of RNA-Seq covariates and therefore could benefit from transformations of the latter. In an analytical part, we highlight preferential selection of covariates with large variances, which is problematic due to the mean-variance dependency of RNA-Seq data. In a simulation study, we compare different transformations of RNA-Seq data for potentially improving detection of important genes. Specifically, we consider standardization, the log transformation, a variance-stabilizing transformation, the Box-Cox transformation, and rank-based transformations. In addition, the prediction performance for real data from patients with kidney cancer and acute myeloid leukemia is considered. We show that signature size, identification performance, and prediction performance critically depend on the choice of a suitable transformation. Rank-based transformations perform well in all scenarios and can even outperform complex variance-stabilizing approaches. Generally, the results illustrate that the distribution and potential transformations of RNA-Seq data need to be considered as a critical step when building risk prediction models by penalized regression techniques. PMID:24416353
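Three of the transformations named in this abstract can be sketched as follows. These are generic textbook versions and not necessarily the exact variants the authors evaluated: the pseudo-count of 1, the base-2 logarithm, and the scaling of ranks to the unit interval are assumptions, as are the function names. Each function expects a samples-by-genes array and works column by column.

```python
import numpy as np

def log_transform(counts):
    """log2(count + 1); the pseudo-count avoids taking the log of zero."""
    return np.log2(np.asarray(counts, dtype=float) + 1.0)

def standardize(x):
    """Centre each gene to mean 0 and scale to unit sample variance.

    Assumes no gene is constant across samples (otherwise division by zero).
    """
    x = np.asarray(x, dtype=float)
    return (x - x.mean(axis=0)) / x.std(axis=0, ddof=1)

def rank_transform(x):
    """Replace each value by its within-gene rank, scaled to lie in (0, 1).

    Ties are broken by sample order in this simple version.
    """
    x = np.asarray(x, dtype=float)
    ranks = x.argsort(axis=0).argsort(axis=0) + 1
    return ranks / (x.shape[0] + 1)
```

After a rank-based transformation every gene has the same spread, which removes the preferential selection of high-variance covariates that the abstract identifies as a problem for penalized regression.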
ERIC Educational Resources Information Center
Stapleton, Laura M.
2008-01-01
This article discusses replication sampling variance estimation techniques that are often applied in analyses using data from complex sampling designs: jackknife repeated replication, balanced repeated replication, and bootstrapping. These techniques are used with traditional analyses such as regression, but are currently not used with structural…
ERIC Educational Resources Information Center
Guler, Nese; Penfield, Randall D.
2009-01-01
In this study, we investigate the logistic regression (LR), Mantel-Haenszel (MH), and Breslow-Day (BD) procedures for the simultaneous detection of both uniform and nonuniform differential item functioning (DIF). A simulation study was used to assess and compare the Type I error rate and power of a combined decision rule (CDR), which assesses DIF…
ERIC Educational Resources Information Center
Le, Huy; Marcus, Justin
2012-01-01
This study used Monte Carlo simulation to examine the properties of the overall odds ratio (OOR), which was recently introduced as an index for overall effect size in multiple logistic regression. It was found that the OOR was relatively independent of study base rate and performed better than most commonly used R-square analogs in indexing model…
Predicting Student Success on the Texas Chemistry STAAR Test: A Logistic Regression Analysis
ERIC Educational Resources Information Center
Johnson, William L.; Johnson, Annabel M.; Johnson, Jared
2012-01-01
Background: The context is the new Texas STAAR end-of-course testing program. Purpose: The authors developed a logistic regression model to predict who would pass-or-fail the new Texas chemistry STAAR end-of-course exam. Setting: Robert E. Lee High School (5A) with an enrollment of 2700 students, Tyler, Texas. Date of the study was the 2011-2012…
Susan L. King
2003-01-01
The performance of two classifiers, logistic regression and neural networks, is compared for modeling noncatastrophic individual tree mortality for 21 species of trees in West Virginia. The output of the classifier is usually a continuous number between 0 and 1. A threshold is selected between 0 and 1 and all of the trees below the threshold are classified as...
Logistic regression trees for initial selection of interesting loci in case-control studies
Nickolov, Radoslav Z; Milanov, Valentin B
2007-01-01
Modern genetic epidemiology faces the challenge of dealing with hundreds of thousands of genetic markers. The selection of a small initial subset of interesting markers for further investigation can greatly facilitate genetic studies. In this contribution we suggest the use of a logistic regression tree algorithm known as logistic tree with unbiased selection. Using the simulated data provided for Genetic Analysis Workshop 15, we show how this algorithm, with incorporation of multifactor dimensionality reduction method, can reduce an initial large pool of markers to a small set that includes the interesting markers with high probability. PMID:18466557
Rupert, Michael G.; Cannon, Susan H.; Gartner, Joseph E.; Michael, John A.; Helsel, Dennis R.
2008-01-01
Logistic regression was used to develop statistical models that can be used to predict the probability of debris flows in areas recently burned by wildfires by using data from 14 wildfires that burned in southern California during 2003-2006. Twenty-eight independent variables describing the basin morphology, burn severity, rainfall, and soil properties of 306 drainage basins located within those burned areas were evaluated. The models were developed as follows: (1) Basins that did and did not produce debris flows soon after the 2003 to 2006 fires were delineated from data in the National Elevation Dataset using a geographic information system; (2) Data describing the basin morphology, burn severity, rainfall, and soil properties were compiled for each basin. These data were then input to a statistics software package for analysis using logistic regression; and (3) Relations between the occurrence or absence of debris flows and the basin morphology, burn severity, rainfall, and soil properties were evaluated, and five multivariate logistic regression models were constructed. All possible combinations of independent variables were evaluated to determine which combinations produced the most effective models, and the multivariate models that best predicted the occurrence of debris flows were identified. Percentage of high burn severity and 3-hour peak rainfall intensity were significant variables in all models. Soil organic matter content and soil clay content were significant variables in all models except Model 5. Soil slope was a significant variable in all models except Model 4. The most suitable model can be selected from these five models on the basis of the availability of independent variables in the particular area of interest and field checking of probability maps. 
The multivariate logistic regression models can be entered into a geographic information system, and maps showing the probability of debris flows can be constructed in recently burned areas of southern California. This study demonstrates that logistic regression is a valuable tool for developing models that predict the probability of debris flows occurring in recently burned landscapes.
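A logistic regression model of this kind turns a linear combination of basin predictors into a probability through the logistic function. The sketch below shows that step only. The intercept, coefficients, and predictor values are hypothetical placeholders chosen for illustration; they are not taken from the report's five models.

```python
import math


def logistic_probability(intercept, coefficients, predictors):
    """Return P = 1 / (1 + exp(-x)) for the linear predictor x = b0 + sum(b_i * v_i)."""
    x = intercept + sum(b * v for b, v in zip(coefficients, predictors))
    return 1.0 / (1.0 + math.exp(-x))


# Hypothetical values, for illustration only (not the published coefficients).
intercept = -3.0
coefficients = [0.04, 0.5]   # assumed weights for burn severity (%) and rainfall intensity
predictors = [50.0, 4.0]     # assumed basin values in matching, arbitrary units
p = logistic_probability(intercept, coefficients, predictors)
```

Evaluating such a function for every basin in a geographic information system layer is one way the probability maps described above could be produced.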
Hein, R; Abbas, S; Seibold, P; Salazar, R; Flesch-Janys, D; Chang-Claude, J
2012-01-01
Menopausal hormone therapy (MHT) is associated with an increased breast cancer risk in postmenopausal women, with combined estrogen-progestagen therapy posing a greater risk than estrogen monotherapy. However, few studies focused on potential effect modification of MHT-associated breast cancer risk by genetic polymorphisms in the progesterone metabolism. We assessed effect modification of MHT use by five coding single nucleotide polymorphisms (SNPs) in the progesterone metabolizing enzymes AKR1C3 (rs7741), AKR1C4 (rs3829125, rs17134592), and SRD5A1 (rs248793, rs3736316) using a two-center population-based case-control study from Germany with 2,502 postmenopausal breast cancer patients and 4,833 matched controls. An empirical-Bayes procedure that tests for interaction using a weighted combination of the prospective and the retrospective case-control estimators as well as standard prospective logistic regression were applied to assess multiplicative statistical interaction between polymorphisms and duration of MHT use with regard to breast cancer risk assuming a log-additive mode of inheritance. No genetic marginal effects were observed. Breast cancer risk associated with duration of combined therapy was significantly modified by SRD5A1_rs3736316, showing a reduced risk elevation in carriers of the minor allele (p (interaction,empirical-Bayes) = 0.006 using the empirical-Bayes method, p (interaction,logistic regression) = 0.013 using logistic regression). The risk associated with duration of use of monotherapy was increased by AKR1C3_rs7741 in minor allele carriers (p (interaction,empirical-Bayes) = 0.083, p (interaction,logistic regression) = 0.029) and decreased in minor allele carriers of two SNPs in AKR1C4 (rs3829125: p (interaction,empirical-Bayes) = 0.07, p (interaction,logistic regression) = 0.021; rs17134592: p (interaction,empirical-Bayes) = 0.101, p (interaction,logistic regression) = 0.038). 
After Bonferroni correction for multiple testing only SRD5A1_rs3736316 assessed using the empirical-Bayes method remained significant. Postmenopausal breast cancer risk associated with combined therapy may be modified by genetic variation in SRD5A1. Further well-powered studies are, however, required to replicate our finding.
Biostatistics Series Module 10: Brief Overview of Multivariate Methods.
Hazra, Avijit; Gogtay, Nithya
2017-01-01
Multivariate analysis refers to statistical techniques that simultaneously look at three or more variables in relation to the subjects under investigation with the aim of identifying or clarifying the relationships between them. These techniques have been broadly classified as dependence techniques, which explore the relationship between one or more dependent variables and their independent predictors, and interdependence techniques, that make no such distinction but treat all variables equally in a search for underlying relationships. Multiple linear regression models a situation where a single numerical dependent variable is to be predicted from multiple numerical independent variables. Logistic regression is used when the outcome variable is dichotomous in nature. The log-linear technique models count type of data and can be used to analyze cross-tabulations where more than two variables are included. Analysis of covariance is an extension of analysis of variance (ANOVA), in which an additional independent variable of interest, the covariate, is brought into the analysis. It tries to examine whether a difference persists after "controlling" for the effect of the covariate that can impact the numerical dependent variable of interest. Multivariate analysis of variance (MANOVA) is a multivariate extension of ANOVA used when multiple numerical dependent variables have to be incorporated in the analysis. Interdependence techniques are more commonly applied to psychometrics, social sciences and market research. Exploratory factor analysis and principal component analysis are related techniques that seek to extract from a larger number of metric variables, a smaller number of composite factors or components, which are linearly related to the original variables. Cluster analysis aims to identify, in a large number of cases, relatively homogeneous groups called clusters, without prior information about the groups. 
The calculation intensive nature of multivariate analysis has so far precluded most researchers from using these techniques routinely. The situation is now changing with wider availability, and increasing sophistication of statistical software and researchers should no longer shy away from exploring the applications of multivariate methods to real-life data sets.
Correa, Katharina; Lhorente, Jean P; López, María E; Bassini, Liane; Naswa, Sudhir; Deeb, Nader; Di Genova, Alex; Maass, Alejandro; Davidson, William S; Yáñez, José M
2015-10-24
Piscirickettsia salmonis is the causal agent of Salmon Rickettsial Syndrome (SRS), which affects salmon species and causes severe economic losses. Selective breeding for disease resistance represents one approach for controlling SRS in farmed Atlantic salmon. Knowledge concerning the architecture of the resistance trait is needed before deciding on the most appropriate approach to enhance artificial selection for P. salmonis resistance in Atlantic salmon. The purpose of the study was to dissect the genetic variation in the resistance to this pathogen in Atlantic salmon. A total of 2,601 Atlantic salmon smolts were experimentally challenged with P. salmonis by means of intra-peritoneal injection. These smolts were the progeny of 40 sires and 118 dams from a Chilean breeding population. Mortalities were recorded daily and the experiment ended at day 40 post-inoculation. Fish were genotyped using a 50K Affymetrix® Axiom® myDesign™ Single Nucleotide Polymorphism (SNP) Genotyping Array. A Genome Wide Association Analysis was performed on data from the challenged fish. Linear regression and logistic regression models were tested. Genome Wide Association Analysis indicated that resistance to P. salmonis is a moderately polygenic trait. There were five SNPs in chromosomes Ssa01 and Ssa17 significantly associated with the traits analysed. The proportion of the phenotypic variance explained by each marker is small, ranging from 0.007 to 0.045. Candidate genes including interleukin receptors and fucosyltransferase have been found to be physically linked with these genetic markers and may play an important role in the differential immune response against this pathogen. Due to the small amount of variance explained by each significant marker we conclude that genetic resistance to this pathogen can be more efficiently improved with the implementation of genetic evaluations incorporating genotype information from a dense SNP array.
van der Wees, Philip J; Hendriks, Erik JM; Jansen, Mariette J; van Beers, Hans; de Bie, Rob A; Dekker, Joost
2007-01-01
Background Clinical guidelines are considered important instruments to improve quality in health care. In physiotherapy, insight into adherence to guidelines is limited. Knowledge of adherence is important to identify barriers and to enhance implementation. The purpose of this study was to investigate adherence to the recommendations of the guideline Acute ankle injury, and to identify patient characteristics that determine adherence to the guideline. Methods Twenty-two physiotherapists collected data on 174 patients in a prospective cohort study, in which the course of treatment was systematically registered. Indicators were used to investigate adherence to recommendations. Patient characteristics were used to identify prognostic factors that may determine adherence to the guideline. Correlation between patient characteristics and adherence to outcome indicators (treatment sessions, functioning of patient, accomplished goals) was calculated using univariate logistic regression. To calculate explained variance of combined patient characteristics, multivariate analysis was performed. Results Adherence to individual recommendations varied from 71% to 100%. In 99 patients (57%) the physiotherapists showed adherence to all indicators. Adherence to the preset maximum of six treatment sessions for patients with severe ankle injury was 81% (132 patients). The odds of receiving more than six sessions were statistically significantly increased for three patient characteristics: female sex (OR: 3.89; 95% CI: 1.41–10.72), recurrent sprain (OR: 6.90; 95% CI: 2.34–20.37), and co-morbidity (OR: 25.92; 95% CI: 6.79–98.93). All factors together explained 40% of the variance. Inclusion of physiotherapist characteristics in the regression model showed that work experience reduced the odds of receiving more than six sessions (OR: 0.2; 95% CI: 0.06–0.77), and increased explained variance to 45%.
Conclusion Adherence to the clinical guideline Acute ankle sprain showed that the guideline is applicable in daily practice. Adherence to the guideline, even in a group of physiotherapists familiar with the guideline, showed possibilities for improvement. The necessity to exceed the expected number of treatment sessions may be explained by co-morbidity and recurrent sprains. It is not clear why female patients were treated with more sessions. Experience of the physiotherapist reduced the number of treatment sessions. Quality indicators may be used for audit and feedback as part of the implementation strategy. PMID:17519040
Applications of statistics to medical science, III. Correlation and regression.
Watanabe, Hiroshi
2012-01-01
In this third part of a series surveying medical statistics, the concepts of correlation and regression are reviewed. In particular, methods of linear regression and logistic regression are discussed. Arguments related to survival analysis will be made in a subsequent paper.
Schell, Greggory J; Lavieri, Mariel S; Stein, Joshua D; Musch, David C
2013-12-21
Open-angle glaucoma (OAG) is a prevalent, degenerative ocular disease which can lead to blindness without proper clinical management. The tests used to assess disease progression are susceptible to process and measurement noise. The aim of this study was to develop a methodology which accounts for the inherent noise in the data and improves the identification of significant disease progression. Longitudinal observations from the Collaborative Initial Glaucoma Treatment Study (CIGTS) were used to parameterize and validate a Kalman filter model and logistic regression function. The Kalman filter estimates the true value of biomarkers associated with OAG and forecasts future values of these variables. We develop two logistic regression models via generalized estimating equations (GEE) for calculating the probability of experiencing significant OAG progression: one model based on the raw measurements from CIGTS and another model based on the Kalman filter estimates of the CIGTS data. Receiver operating characteristic (ROC) curves and associated area under the ROC curve (AUC) estimates are calculated using cross-fold validation. The logistic regression model developed using Kalman filter estimates as data input achieves higher sensitivity and specificity than the model developed using raw measurements. The mean AUC for the Kalman filter-based model is 0.961 while the mean AUC for the raw measurements model is 0.889. Hence, using the probability function generated via Kalman filter estimates and GEE for logistic regression, we are able to more accurately classify patients and instances as experiencing significant OAG progression. A Kalman filter approach for estimating the true value of OAG biomarkers resulted in data input which improved the accuracy of a logistic regression classification model compared to a model using raw measurements as input.
This methodology accounts for process and measurement noise to enable improved discrimination between progression and nonprogression in chronic diseases.
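To make the filtering step concrete, here is a minimal scalar Kalman filter sketch. It assumes a random-walk state model with a single noisy measurement per visit, which is a deliberate simplification and not the model fitted to the CIGTS data; the function name and all parameter values are hypothetical.

```python
def kalman_filter_1d(measurements, process_var, measurement_var, x0, p0):
    """Scalar Kalman filter for a random-walk state observed with additive noise.

    x0 and p0 are the assumed initial state estimate and its variance.
    Returns the filtered state estimate after each measurement.
    """
    x, p = x0, p0
    estimates = []
    for z in measurements:
        # Predict: under a random walk the mean is unchanged and uncertainty grows.
        p = p + process_var
        # Update: blend prediction and measurement using the Kalman gain.
        gain = p / (p + measurement_var)
        x = x + gain * (z - x)
        p = (1.0 - gain) * p
        estimates.append(x)
    return estimates


# Hypothetical noisy readings of a slowly changing biomarker (illustration only).
readings = [20.0, 23.0, 19.0, 22.0, 21.0]
filtered = kalman_filter_1d(readings, process_var=0.1, measurement_var=4.0, x0=20.0, p0=1.0)
```

The filtered values, rather than the raw readings, would then be the inputs to a downstream classifier such as the logistic regression described above.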
Computing group cardinality constraint solutions for logistic regression problems.
Zhang, Yong; Kwon, Dongjin; Pohl, Kilian M
2017-01-01
We derive an algorithm to directly solve logistic regression based on cardinality constraint, group sparsity and use it to classify intra-subject MRI sequences (e.g. cine MRIs) of healthy from diseased subjects. Group cardinality constraint models are often applied to medical images in order to avoid overfitting of the classifier to the training data. Solutions within these models are generally determined by relaxing the cardinality constraint to a weighted feature selection scheme. However, these solutions relate to the original sparse problem only under specific assumptions, which generally do not hold for medical image applications. In addition, inferring clinical meaning from features weighted by a classifier is an ongoing topic of discussion. To avoid weighting features, we propose to directly solve the group cardinality constraint logistic regression problem by generalizing the Penalty Decomposition method. To do so, we assume that an intra-subject series of images represents repeated samples of the same disease patterns. We model this assumption by combining series of measurements created by a feature across time into a single group. Our algorithm then derives a solution within that model by decoupling the minimization of the logistic regression function from enforcing the group sparsity constraint. The minimum of the smooth and convex logistic regression problem is determined via gradient descent while we derive a closed form solution for finding a sparse approximation of that minimum. We apply our method to cine MRI of 38 healthy controls and 44 adult patients that received reconstructive surgery of Tetralogy of Fallot (TOF) during infancy. Our method correctly identifies regions impacted by TOF and generally obtains statistically significantly higher classification accuracy than alternative solutions to this model, i.e., ones relaxing group cardinality constraints. Copyright © 2016 Elsevier B.V. All rights reserved.
Ren, Yilong; Wang, Yunpeng; Wu, Xinkai; Yu, Guizhen; Ding, Chuan
2016-10-01
Red light running (RLR) has become a major safety concern at signalized intersections. To prevent RLR-related crashes, it is critical to identify the factors that significantly affect drivers' RLR behavior, and to predict potential RLR in real time. In this research, nine months of RLR events extracted from high-resolution traffic data collected by loop detectors at three signalized intersections were used to identify the factors that significantly affect RLR behavior. The data analysis indicated that occupancy time, time gap, used yellow time, time left to yellow start, whether the preceding vehicle runs through the intersection during yellow, and whether there is a vehicle passing through the intersection in the adjacent lane were significant factors for RLR behavior. Furthermore, due to the rare-events nature of RLR, a modified rare events logistic regression model was developed for RLR prediction. The rare events logistic regression method has been applied in many fields for rare events studies and shows impressive performance, but so far no previous research has applied this method to study RLR. The results showed that the rare events logistic regression model performed significantly better than the standard logistic regression model. More importantly, the proposed RLR prediction method is based purely on data collected from a single advance loop detector located 400 feet from the stop bar. This brings great potential for future field applications of the proposed method since loops have been widely implemented at many intersections and can collect data in real time. This research is expected to contribute significantly to the improvement of intersection safety. Copyright © 2016 Elsevier Ltd. All rights reserved.
Engoren, Milo; Habib, Robert H; Dooner, John J; Schwann, Thomas A
2013-08-01
As many as 14 % of patients undergoing coronary artery bypass surgery are readmitted within 30 days. Readmission is usually the result of morbidity and may lead to death. The purpose of this study is to develop and compare statistical and genetic programming models to predict readmission. Patients were divided into separate Construction and Validation populations. Using 88 variables, logistic regression, genetic programs, and artificial neural nets were used to develop predictive models. Models were first constructed and tested on the Construction population, then validated on the Validation population. Areas under the receiver operating characteristic curves (AU ROC) were used to compare the models. Two hundred and two patients (7.6 %) in the 2,644 patient Construction group and 216 (8.0 %) of the 2,711 patient Validation group were readmitted within 30 days of CABG surgery. Logistic regression predicted readmission with AU ROC = .675 ± .021 in the Construction group. Genetic programs significantly improved the accuracy (AU ROC = .767 ± .001, p < .001). Artificial neural nets were less accurate with AU ROC = 0.597 ± .001 in the Construction group. Predictive accuracy of all three techniques fell in the Validation group. However, the accuracy of genetic programming (AU ROC = .654 ± .001) was still slightly, but not statistically significantly, better than that of the logistic regression (AU ROC = .644 ± .020, p = .61). Genetic programming and logistic regression provide alternative methods to predict readmission that are similarly accurate.
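For readers unfamiliar with the comparison metric, the area under the ROC curve can be computed as the fraction of (positive, negative) pairs in which the positive case receives the higher predicted score, counting ties as one half. The sketch below implements that pairwise definition. The function name `auc` and the toy scores are hypothetical, and the abstract does not state which estimator the authors used.

```python
def auc(scores, labels):
    """Area under the ROC curve via pairwise comparison (Mann-Whitney form).

    labels are 1 for the event (e.g. readmitted) and 0 otherwise.
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = 0.0
    for sp in pos:
        for sn in neg:
            if sp > sn:
                wins += 1.0      # positive case ranked above negative case
            elif sp == sn:
                wins += 0.5      # ties count as half a win
    return wins / (len(pos) * len(neg))


# Hypothetical predicted risks and outcomes (illustration only).
toy_scores = [0.10, 0.40, 0.35, 0.80]
toy_labels = [0, 0, 1, 1]
toy_auc = auc(toy_scores, toy_labels)  # 3 of 4 pairs ordered correctly -> 0.75
```

An AUC of 0.5 corresponds to chance-level ranking and 1.0 to perfect ranking, which gives a scale for reading the values of roughly .60 to .77 reported above.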
Eken, Cenker; Bilge, Ugur; Kartal, Mutlu; Eray, Oktay
2009-06-03
Logistic regression is the most common statistical model for processing multivariate data in the medical literature. Artificial intelligence models like an artificial neural network (ANN) and genetic algorithm (GA) may also be useful to interpret medical data. The purpose of this study was to perform artificial intelligence models on a medical data sheet and compare to logistic regression. ANN, GA, and logistic regression analysis were carried out on a data sheet of a previously published article regarding patients presenting to an emergency department with flank pain suspicious for renal colic. The study population was composed of 227 patients: 176 patients had a diagnosis of urinary stone, while 51 ultimately had no calculus. The GA found two decision rules in predicting urinary stones. Rule 1 consisted of being male, pain not spreading to back, and no fever. In rule 2, pelvicaliceal dilatation on bedside ultrasonography replaced no fever. ANN, GA rule 1, GA rule 2, and logistic regression had a sensitivity of 94.9, 67.6, 56.8, and 95.5%, a specificity of 78.4, 76.47, 86.3, and 47.1%, a positive likelihood ratio of 4.4, 2.9, 4.1, and 1.8, and a negative likelihood ratio of 0.06, 0.42, 0.5, and 0.09, respectively. The area under the curve was found to be 0.867, 0.720, 0.715, and 0.713 for all applications, respectively. Data mining techniques such as ANN and GA can be used for predicting renal colic in emergency settings and to constitute clinical decision rules. They may be an alternative to conventional multivariate analysis applications used in biostatistics.
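The likelihood ratios quoted above follow from sensitivity and specificity by the standard definitions LR+ = sensitivity / (1 - specificity) and LR- = (1 - sensitivity) / specificity. The sketch below applies them to the ANN figures from the abstract; the function name is hypothetical, and small differences from the reported ratios may reflect rounding of the published percentages.

```python
def likelihood_ratios(sensitivity, specificity):
    """Return (LR+, LR-) from sensitivity and specificity given as proportions."""
    lr_pos = sensitivity / (1.0 - specificity)
    lr_neg = (1.0 - sensitivity) / specificity
    return lr_pos, lr_neg


# ANN sensitivity 94.9% and specificity 78.4%, as reported in the abstract.
ann_lr_pos, ann_lr_neg = likelihood_ratios(0.949, 0.784)
# ann_lr_pos is about 4.39 and ann_lr_neg is about 0.065
```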
NASA Astrophysics Data System (ADS)
Duman, T. Y.; Can, T.; Gokceoglu, C.; Nefeslioglu, H. A.; Sonmez, H.
2006-11-01
As a result of industrialization, throughout the world, cities have been growing rapidly for the last century. One typical example of these growing cities is Istanbul, the population of which is over 10 million. Due to rapid urbanization, new areas suitable for settlement and engineering structures are necessary. The Cekmece area located west of the Istanbul metropolitan area is studied, because the landslide activity is extensive in this area. The purpose of this study is to develop a model that can be used to characterize landslide susceptibility in map form using logistic regression analysis of an extensive landslide database. A database of landslide activity was constructed using both aerial photography and field studies. About 19.2% of the selected study area is covered by deep-seated landslides. The landslides that occur in the area are primarily located in sandstones with interbedded permeable and impermeable layers such as claystone, siltstone and mudstone. About 31.95% of the total landslide area is located in this unit. To apply logistic regression analyses, a data matrix including 37 variables was constructed. The variables used in the forward stepwise analyses are different measures of slope, aspect, elevation, stream power index (SPI), plan curvature, profile curvature, geology, geomorphology and relative permeability of lithological units. A total of 25 variables were identified as exerting strong influence on landslide occurrence, and were included in the logistic regression equation. Wald statistics values indicate that lithology, SPI and slope are more important than the other parameters in the equation. Beta coefficients of the 25 variables included in the logistic regression equation provide a model for landslide susceptibility in the Cekmece area. This model is used to generate a landslide susceptibility map that correctly classified 83.8% of the landslide-prone areas.
Bhowmick, Amiya Ranjan; Bandyopadhyay, Subhadip; Rana, Sourav; Bhattacharya, Sabyasachi
2016-01-01
The stochastic versions of the logistic and extended logistic growth models are applied successfully to explain many real-life population dynamics and share a central body of literature in stochastic modeling of ecological systems. To understand the randomness in the population dynamics of the underlying processes completely, it is important to have a clear idea about the quasi-equilibrium distribution and its moments. Bartlett et al. (1960) made a pioneering attempt to estimate the moments of the quasi-equilibrium distribution of the stochastic logistic model. Matis and Kiffe (1996) obtained a set of more accurate and elegant approximations for the mean, variance and skewness of the quasi-equilibrium distribution of the same model using the cumulant truncation method. The method was extended to the stochastic power law logistic family by the same and several other authors (Nasell, 2003; Singh and Hespanha, 2007). Cumulant truncation and some alternative methods, e.g. saddle point approximation and the derivative matching approach, can be applied if the powers involved in the extended logistic setup are integers, although plenty of evidence is available for non-integer powers in many practical situations (Sibly et al., 2005). In this paper, we develop a set of new approximations for the mean, variance and skewness of the quasi-equilibrium distribution under a more general family of growth curves, which is applicable for both integer and non-integer powers. The deterministic counterpart of this family of models captures both monotonic and non-monotonic behavior of the per capita growth rate, of which theta-logistic is a special case. The approximations accurately estimate the first three moments of the quasi-equilibrium distribution. The proposed method is illustrated with simulated data and real data from the Global Population Dynamics Database. Copyright © 2015 Elsevier Inc. All rights reserved.
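As background for the special case named above, the deterministic theta-logistic model has per capita growth rate r(1 - (N/K)^θ), where θ may be a non-integer power. The sketch below evaluates that rate and integrates the model with a simple forward Euler step. It covers only the deterministic theta-logistic case, not the paper's more general family or its stochastic moment approximations, and all parameter values are hypothetical.

```python
def per_capita_growth(n, r, k, theta):
    """Theta-logistic per capita growth rate: r * (1 - (n / k) ** theta)."""
    return r * (1.0 - (n / k) ** theta)


def simulate(n0, r, k, theta, dt, steps):
    """Forward Euler integration of dN/dt = N * per_capita_growth(N)."""
    n = n0
    path = [n]
    for _ in range(steps):
        n = n + dt * n * per_capita_growth(n, r, k, theta)
        path.append(n)
    return path


# Hypothetical parameters with a non-integer power (illustration only).
path = simulate(n0=10.0, r=0.5, k=100.0, theta=1.5, dt=0.1, steps=2000)
```

With these settings the trajectory rises from 10 and settles at the carrying capacity K = 100, the deterministic equilibrium around which a quasi-equilibrium distribution would be centred in the stochastic version.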
Possibility of modifying the growth trajectory in Raeini Cashmere goat.
Ghiasi, Heydar; Mokhtari, M S
2018-03-27
The objective of this study was to investigate the possibility of modifying the growth trajectory in the Raeini Cashmere goat breed. In total, 13,193 records on live body weight collected from 4788 Raeini Cashmere goats were used. According to Akaike's information criterion (AIC), the single-trait random regression model that included fourth-order Legendre polynomials for direct and maternal genetic effects and for maternal and individual permanent environmental effects was the best model for estimating (co)variance components. The matrices of eigenvectors for (co)variances between random regression coefficients of direct additive genetic effects were used to calculate eigenfunctions, and different eigenvector indices were also constructed. The results showed that the first eigenvalue explained 79.90% of total genetic variance. Therefore, changes in body weight obtained by applying the first eigenfunction will be achieved rapidly. Selection based on the first eigenvector will cause favorable positive genetic gains for all body weights considered from birth to 12 months of age. For modifying the growth trajectory in Raeini Cashmere goat, the selection should be based on the second eigenfunction. The second eigenvalue accounted for 14.41% of total genetic variance for body weights, which is low in comparison with the genetic variance explained by the first eigenvalue. Complex patterns of genetic change in growth trajectory were observed under the third and fourth eigenfunctions, and a low amount of genetic variance was explained by the third and fourth eigenvalues.
New robust statistical procedures for the polytomous logistic regression models.
Castilla, Elena; Ghosh, Abhik; Martin, Nirian; Pardo, Leandro
2018-05-17
This article derives a new family of estimators, namely the minimum density power divergence estimators, as a robust generalization of the maximum likelihood estimator for the polytomous logistic regression model. Based on these estimators, a family of Wald-type test statistics for linear hypotheses is introduced. Robustness properties of both the proposed estimators and the test statistics are theoretically studied through the classical influence function analysis. Appropriate real life examples are presented to justify the requirement of suitable robust statistical procedures in place of the likelihood based inference for the polytomous logistic regression model. The validity of the theoretical results established in the article is further confirmed empirically through suitable simulation studies. Finally, an approach for the data-driven selection of the robustness tuning parameter is proposed with empirical justifications. © 2018, The International Biometric Society.
Staley, Dennis M.; Negri, Jacquelyn A.; Kean, Jason W.; Laber, Jayme L.; Tillery, Anne C.; Youberg, Ann M.
2016-06-30
Wildfire can significantly alter the hydrologic response of a watershed to the extent that even modest rainstorms can generate dangerous flash floods and debris flows. To reduce public exposure to hazard, the U.S. Geological Survey produces post-fire debris-flow hazard assessments for select fires in the western United States. We use publicly available geospatial data describing basin morphology, burn severity, soil properties, and rainfall characteristics to estimate the statistical likelihood that debris flows will occur in response to a storm of a given rainfall intensity. Using an empirical database and refined geospatial analysis methods, we defined new equations for the prediction of debris-flow likelihood using logistic regression methods. We showed that the new logistic regression model outperformed previous models used to predict debris-flow likelihood.
NASA Astrophysics Data System (ADS)
Kneringer, Philipp; Dietz, Sebastian; Mayr, Georg J.; Zeileis, Achim
2017-04-01
Low-visibility conditions have a large impact on aviation safety and economic efficiency of airports and airlines. To support decision makers, we develop a statistical probabilistic nowcasting tool for the occurrence of capacity-reducing operations related to low visibility. The probabilities of four different low visibility classes are predicted with an ordered logistic regression model based on time series of meteorological point measurements. Potential predictor variables for the statistical models are visibility, humidity, temperature and wind measurements at several measurement sites. A stepwise variable selection method indicates that visibility and humidity measurements are the most important model inputs. The forecasts are tested with a 30 minute forecast interval up to two hours, which is a sufficient time span for tactical planning at Vienna Airport. The ordered logistic regression models outperform persistence and are competitive with human forecasters.
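A minimal sketch of how an ordered logistic regression turns one linear predictor into probabilities for four ordered classes, assuming the common proportional-odds (cumulative logit) form. The function name, the cutpoints, and the predictor value are hypothetical and are not taken from the study.

```python
import math

def ordered_logit_probs(eta, cutpoints):
    """Class probabilities under a cumulative-logit model.

    Assumes P(Y <= k) = sigmoid(c_k - eta) with increasing cutpoints c_k.
    """
    cumulative = [1.0 / (1.0 + math.exp(-(c - eta))) for c in cutpoints]
    bounds = [0.0] + cumulative + [1.0]
    # Each class probability is the difference of adjacent cumulative probabilities.
    return [bounds[k + 1] - bounds[k] for k in range(len(bounds) - 1)]

# Hypothetical values: three cutpoints give four ordered classes.
probs = ordered_logit_probs(eta=0.5, cutpoints=[-1.0, 0.0, 2.0])
```

Because the cutpoints are increasing, the four probabilities are non-negative and sum to one by construction.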
Wang, Shuang; Jiang, Xiaoqian; Wu, Yuan; Cui, Lijuan; Cheng, Samuel; Ohno-Machado, Lucila
2013-06-01
We developed an EXpectation Propagation LOgistic REgRession (EXPLORER) model for distributed privacy-preserving online learning. The proposed framework provides a high level guarantee for protecting sensitive information, since the information exchanged between the server and the client is the encrypted posterior distribution of coefficients. Through experimental results, EXPLORER shows the same performance (e.g., discrimination, calibration, feature selection, etc.) as the traditional frequentist logistic regression model, but provides more flexibility in model updating. That is, EXPLORER can be updated one point at a time rather than having to retrain the entire data set when new observations are recorded. The proposed EXPLORER supports asynchronized communication, which relieves the participants from coordinating with one another, and prevents service breakdown from the absence of participants or interrupted communications. Copyright © 2013 Elsevier Inc. All rights reserved.
A computational approach to compare regression modelling strategies in prediction research.
Pajouheshnia, Romin; Pestman, Wiebe R; Teerenstra, Steven; Groenwold, Rolf H H
2016-08-25
It is often unclear which approach to fit, assess and adjust a model will yield the most accurate prediction model. We present an extension of an approach for comparing modelling strategies in linear regression to the setting of logistic regression and demonstrate its application in clinical prediction research. A framework for comparing logistic regression modelling strategies by their likelihoods was formulated using a wrapper approach. Five different strategies for modelling, including simple shrinkage methods, were compared in four empirical data sets to illustrate the concept of a priori strategy comparison. Simulations were performed in both randomly generated data and empirical data to investigate the influence of data characteristics on strategy performance. We applied the comparison framework in a case study setting. Optimal strategies were selected based on the results of a priori comparisons in a clinical data set and the performance of models built according to each strategy was assessed using the Brier score and calibration plots. The performance of modelling strategies was highly dependent on the characteristics of the development data in both linear and logistic regression settings. A priori comparisons in four empirical data sets found that no strategy consistently outperformed the others. The percentage of times that a model adjustment strategy outperformed a logistic model ranged from 3.9 to 94.9 %, depending on the strategy and data set. However, in our case study setting the a priori selection of optimal methods did not result in detectable improvement in model performance when assessed in an external data set. The performance of prediction modelling strategies is a data-dependent process and can be highly variable between data sets within the same clinical domain. A priori strategy comparison can be used to determine an optimal logistic regression modelling strategy for a given data set before selecting a final modelling approach.
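The Brier score mentioned above is the mean squared difference between predicted probabilities and observed binary outcomes. A small sketch follows, with made-up outcomes and predictions used only for illustration.

```python
def brier_score(outcomes, predicted_probs):
    """Mean squared error between 0/1 outcomes and predicted probabilities."""
    if len(outcomes) != len(predicted_probs):
        raise ValueError("outcomes and predictions must have the same length")
    squared_errors = [(p - y) ** 2 for y, p in zip(outcomes, predicted_probs)]
    return sum(squared_errors) / len(squared_errors)

# Hypothetical data: an uninformative model that always predicts 0.5.
uninformative = brier_score([1, 0, 1, 0], [0.5, 0.5, 0.5, 0.5])
# Hypothetical data: a model that predicts every outcome perfectly.
perfect = brier_score([1, 0, 1, 0], [1.0, 0.0, 1.0, 0.0])
```

Lower values indicate better predictions; a constant prediction of 0.5 scores 0.25 regardless of the outcomes.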
Cakir, Ebru; Kucuk, Ulku; Pala, Emel Ebru; Sezer, Ozlem; Ekin, Rahmi Gokhan; Cakmak, Ozgur
2017-05-01
Conventional cytomorphologic assessment is the first step to establish an accurate diagnosis in urinary cytology. In cytologic preparations, the separation of low-grade urothelial carcinoma (LGUC) from reactive urothelial proliferation (RUP) can be exceedingly difficult. The bladder washing cytologies of 32 LGUC and 29 RUP were reviewed. The cytologic slides were examined for the presence or absence of the 28 cytologic features. The cytologic criteria showing statistical significance in LGUC were increased numbers of monotonous single (non-umbrella) cells, three-dimensional cellular papillary clusters without fibrovascular cores, irregular bordered clusters, atypical single cells, irregular nuclear overlap, cytoplasmic homogeneity, increased N/C ratio, pleomorphism, nuclear border irregularity, nuclear eccentricity, elongated nuclei, and hyperchromasia (p < 0.05), and the cytologic criteria showing statistical significance in RUP were inflammatory background, mixture of small and large urothelial cells, loose monolayer aggregates, and vacuolated cytoplasm (p < 0.05). When these variables were subjected to a stepwise logistic regression analysis, four features were selected to distinguish LGUC from RUP: increased numbers of monotonous single (non-umbrella) cells, increased nuclear cytoplasmic ratio, hyperchromasia, and presence of small and large urothelial cells (p = 0.0001). By this logistic model of the 32 cases with proven LGUC, the stepwise logistic regression analysis correctly predicted 31 (96.9%) patients with this diagnosis, and of the 29 patients with RUP, the logistic model correctly predicted 26 (89.7%) patients as having this disease. There are several cytologic features to separate LGUC from RUP. Stepwise logistic regression analysis is a valuable tool for determining the most useful cytologic criteria to distinguish these entities. © 2017 APMIS. Published by John Wiley & Sons Ltd.
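The two classification rates quoted above follow directly from the reported counts. A short arithmetic check is sketched below; the helper name is hypothetical and only the counts come from the abstract.

```python
def percent_correct(correct, total):
    """Percentage of cases in a group that the model classified correctly."""
    return 100.0 * correct / total

# Counts reported in the abstract: 31 of 32 LGUC and 26 of 29 RUP cases.
lguc_rate = percent_correct(31, 32)
rup_rate = percent_correct(26, 29)
overall_rate = percent_correct(31 + 26, 32 + 29)
```

Rounded to one decimal, the first two values reproduce the 96.9% and 89.7% given in the abstract. The overall rate is computed here from the same counts and is not a figure stated by the authors.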
Image encryption based on a delayed fractional-order chaotic logistic system
NASA Astrophysics Data System (ADS)
Wang, Zhen; Huang, Xia; Li, Ning; Song, Xiao-Na
2012-05-01
A new image encryption scheme is proposed based on a delayed fractional-order chaotic logistic system. In the process of generating a key stream, the time-varying delay and fractional derivative are embedded in the proposed scheme to improve the security. Such a scheme is described in detail with security analyses including correlation analysis, information entropy analysis, run statistic analysis, mean-variance gray value analysis, and key sensitivity analysis. Experimental results show that the newly proposed image encryption scheme possesses high security.
Science of Test Research Consortium: Year Two Final Report
2012-10-02
July 2012. Analysis of an Intervention for Small Unmanned Aerial System ( SUAS ) Accidents, submitted to Quality Engineering, LQEN-2012-0056. Stone... Systems Engineering. Wolf, S. E., R. R. Hill, and J. J. Pignatiello. June 2012. Using Neural Networks and Logistic Regression to Model Small Unmanned ...Human Retina. 6. Wolf, S. E. March 2012. Modeling Small Unmanned Aerial System Mishaps using Logistic Regression and Artificial Neural Networks. 7
ERIC Educational Resources Information Center
Hidalgo, Mª Dolores; Gómez-Benito, Juana; Zumbo, Bruno D.
2014-01-01
The authors analyze the effectiveness of the R² and delta log odds ratio effect size measures when using logistic regression analysis to detect differential item functioning (DIF) in dichotomous items. A simulation study was carried out, and the Type I error rate and power estimates under conditions in which only statistical testing…
Brian S. Cade; Barry R. Noon; Rick D. Scherer; John J. Keane
2017-01-01
Counts of avian fledglings, nestlings, or clutch size that are bounded below by zero and above by some small integer form a discrete random variable distribution that is not approximated well by conventional parametric count distributions such as the Poisson or negative binomial. We developed a logistic quantile regression model to provide estimates of the empirical...
Mohammed, Mohammed A; Manktelow, Bradley N; Hofer, Timothy P
2016-04-01
There is interest in deriving case-mix adjusted standardised mortality ratios so that comparisons between healthcare providers, such as hospitals, can be undertaken in the controversial belief that variability in standardised mortality ratios reflects quality of care. Typically standardised mortality ratios are derived using a fixed effects logistic regression model, without a hospital term in the model. This fails to account for the hierarchical structure of the data - patients nested within hospitals - and so a hierarchical logistic regression model is more appropriate. However, four methods have been advocated for deriving standardised mortality ratios from a hierarchical logistic regression model, but their agreement is not known and neither do we know which is to be preferred. We found significant differences between the four types of standardised mortality ratios because they reflect a range of underlying conceptual issues. The most subtle issue is the distinction between asking how an average patient fares in different hospitals versus how patients at a given hospital fare at an average hospital. Since the answers to these questions are not the same and since the choice between these two approaches is not obvious, the extent to which profiling hospitals on mortality can be undertaken safely and reliably, without resolving these methodological issues, remains questionable. © The Author(s) 2012.
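A minimal sketch of the conventional standardised mortality ratio described above, assuming it is taken as observed deaths divided by the deaths expected from a patient-level risk model with no hospital term. The patient outcomes and predicted risks are invented for illustration; the four hierarchical variants compared in the study are not reproduced here.

```python
def standardised_mortality_ratio(deaths, predicted_risks):
    """Observed deaths divided by expected deaths for one hospital.

    deaths: 0/1 outcome per patient.
    predicted_risks: case-mix adjusted death probability per patient,
    assumed to come from a previously fitted logistic regression model.
    """
    observed = sum(deaths)
    expected = sum(predicted_risks)
    return observed / expected

# Hypothetical hospital: four patients, each with a predicted risk of 0.25.
smr = standardised_mortality_ratio([1, 0, 1, 0], [0.25, 0.25, 0.25, 0.25])
```

A ratio above 1 means more deaths were observed than the case-mix model expected; whether that reflects quality of care is exactly the point the abstract questions.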
Chan, Siew Foong; Deeks, Jonathan J; Macaskill, Petra; Irwig, Les
2008-01-01
To compare three predictive models based on logistic regression to estimate adjusted likelihood ratios allowing for interdependency between diagnostic variables (tests). This study was a review of the theoretical basis, assumptions, and limitations of published models; and a statistical extension of methods and application to a case study of the diagnosis of obstructive airways disease based on history and clinical examination. Albert's method includes an offset term to estimate an adjusted likelihood ratio for combinations of tests. The Spiegelhalter and Knill-Jones method uses the unadjusted likelihood ratio for each test as a predictor and computes shrinkage factors to allow for interdependence. Knottnerus' method differs from the other methods because it requires sequencing of tests, which limits its application to situations where there are few tests and substantial data. Although parameter estimates differed between the models, predicted "posttest" probabilities were generally similar. Construction of predictive models using logistic regression is preferred to the independence Bayes' approach when it is important to adjust for dependency of test errors. Methods to estimate adjusted likelihood ratios from predictive models should be considered in preference to a standard logistic regression model to facilitate ease of interpretation and application. Albert's method provides the most straightforward approach.
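For readers unfamiliar with how a likelihood ratio yields a "posttest" probability, the standard odds form of Bayes' theorem can be sketched as follows. The pretest probability and likelihood ratio are hypothetical, and the adjusted estimators compared in the study are not implemented here.

```python
def posttest_probability(pretest_prob, likelihood_ratio):
    """Convert a pretest probability to a posttest probability.

    Uses posttest odds = pretest odds * likelihood ratio.
    """
    pretest_odds = pretest_prob / (1.0 - pretest_prob)
    posttest_odds = pretest_odds * likelihood_ratio
    return posttest_odds / (1.0 + posttest_odds)

# Hypothetical values: 50% pretest probability and a positive test with LR = 3.
p_post = posttest_probability(0.5, 3.0)
```

Multiplying several unadjusted likelihood ratios in this way assumes the tests are independent, which is the assumption the adjusted methods in the abstract are designed to relax.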
Cameron, Isobel M; Scott, Neil W; Adler, Mats; Reid, Ian C
2014-12-01
It is important for clinical practice and research that measurement scales of well-being and quality of life exhibit only minimal differential item functioning (DIF). DIF occurs where different groups of people endorse items in a scale to different extents after being matched by the intended scale attribute. We investigate the equivalence or otherwise of common methods of assessing DIF. Three methods of measuring age- and sex-related DIF (ordinal logistic regression, Rasch analysis and Mantel χ² procedure) were applied to Hospital Anxiety Depression Scale (HADS) data pertaining to a sample of 1,068 patients consulting primary care practitioners. Three items were flagged by all three approaches as having either age- or sex-related DIF with a consistent direction of effect; a further three items identified did not meet stricter criteria for important DIF using at least one method. When applying strict criteria for significant DIF, ordinal logistic regression was slightly less sensitive. Ordinal logistic regression, Rasch analysis and contingency table methods yielded consistent results when identifying DIF in the HADS depression and HADS anxiety scales. Regardless of methods applied, investigators should use a combination of statistical significance, magnitude of the DIF effect and investigator judgement when interpreting the results.
NASA Astrophysics Data System (ADS)
Cao, Faxian; Yang, Zhijing; Ren, Jinchang; Ling, Wing-Kuen; Zhao, Huimin; Marshall, Stephen
2017-12-01
Although the sparse multinomial logistic regression (SMLR) has provided a useful tool for sparse classification, it suffers from inefficacy in dealing with high dimensional features and manually set initial regressor values. This has significantly constrained its applications for hyperspectral image (HSI) classification. In order to tackle these two drawbacks, an extreme sparse multinomial logistic regression (ESMLR) is proposed for effective classification of HSI. First, the HSI dataset is projected to a new feature space with randomly generated weight and bias. Second, an optimization model is established by the Lagrange multiplier method and the dual principle to automatically determine a good initial regressor for SMLR via minimizing the training error and the regressor value. Furthermore, the extended multi-attribute profiles (EMAPs) are utilized for extracting both the spectral and spatial features. A combinational linear multiple features learning (MFL) method is proposed to further enhance the features extracted by ESMLR and EMAPs. Finally, the logistic regression via the variable splitting and the augmented Lagrangian (LORSAL) is adopted in the proposed framework for reducing the computational time. Experiments are conducted on two well-known HSI datasets, namely the Indian Pines dataset and the Pavia University dataset, which have shown the fast and robust performance of the proposed ESMLR framework.
Latin hypercube approach to estimate uncertainty in ground water vulnerability
Gurdak, J.J.; McCray, J.E.; Thyne, G.; Qi, S.L.
2007-01-01
A methodology is proposed to quantify prediction uncertainty associated with ground water vulnerability models that were developed through an approach that coupled multivariate logistic regression with a geographic information system (GIS). This method uses Latin hypercube sampling (LHS) to illustrate the propagation of input error and estimate uncertainty associated with the logistic regression predictions of ground water vulnerability. Central to the proposed method is the assumption that prediction uncertainty in ground water vulnerability models is a function of input error propagation from uncertainty in the estimated logistic regression model coefficients (model error) and the values of explanatory variables represented in the GIS (data error). Input probability distributions that represent both model and data error sources of uncertainty were simultaneously sampled using a Latin hypercube approach with logistic regression calculations of probability of elevated nonpoint source contaminants in ground water. The resulting probability distribution represents the prediction intervals and associated uncertainty of the ground water vulnerability predictions. The method is illustrated through a ground water vulnerability assessment of the High Plains regional aquifer. Results of the LHS simulations reveal significant prediction uncertainties that vary spatially across the regional aquifer. Additionally, the proposed method enables a spatial deconstruction of the prediction uncertainty that can lead to improved prediction of ground water vulnerability. © 2007 National Ground Water Association.
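The general idea of propagating model and data error through a logistic regression with Latin hypercube sampling can be sketched as below. This is an illustration under stated assumptions, not the authors' implementation: the normal input distributions, their parameters, the single explanatory variable, and the function names are all hypothetical.

```python
import math
import random
from statistics import NormalDist

def latin_hypercube(n_samples, n_dims, rng):
    """Return n_samples points in [0, 1)^n_dims with one point per stratum in each dimension."""
    columns = []
    for _ in range(n_dims):
        # One uniform draw inside each of n_samples equal-width strata, then shuffled.
        column = [(i + rng.random()) / n_samples for i in range(n_samples)]
        rng.shuffle(column)
        columns.append(column)
    return [tuple(column[i] for column in columns) for i in range(n_samples)]

def prediction_interval(n_samples=200, seed=0):
    """Approximate 95% interval of a predicted probability under input uncertainty."""
    rng = random.Random(seed)
    # Hypothetical inputs: intercept and slope (model error), one explanatory variable (data error).
    inputs = [NormalDist(-2.0, 0.3), NormalDist(0.8, 0.1), NormalDist(1.5, 0.4)]
    probabilities = []
    for point in latin_hypercube(n_samples, len(inputs), rng):
        # Map each uniform coordinate to its input distribution; clamp to keep inv_cdf in range.
        b0, b1, x = (
            dist.inv_cdf(min(max(u, 1e-12), 1.0 - 1e-12))
            for dist, u in zip(inputs, point)
        )
        probabilities.append(1.0 / (1.0 + math.exp(-(b0 + b1 * x))))
    probabilities.sort()
    lower = probabilities[int(0.025 * n_samples)]
    upper = probabilities[int(0.975 * n_samples) - 1]
    return lower, upper

lower, upper = prediction_interval()
```

Stratifying each input guarantees that the full range of every distribution is covered with far fewer draws than simple random sampling would need, which is the usual motivation for the Latin hypercube design.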
Fischer, A; Friggens, N C; Berry, D P; Faverdin, P
2018-07-01
The ability to properly assess and accurately phenotype true differences in feed efficiency among dairy cows is key to the development of breeding programs for improving feed efficiency. The variability among individuals in feed efficiency is commonly characterised by the residual intake approach. Residual feed intake is represented by the residuals of a linear regression of intake on the corresponding quantities of the biological functions that consume (or release) energy. However, the residuals include both model fitting and measurement errors, as well as any variability in cow efficiency. The objective of this study was to isolate the individual animal variability in feed efficiency from the residual component. Two separate models were fitted. In one, the standard residual energy intake (REI) was calculated as the residual of a multiple linear regression of lactation average net energy intake (NEI) on lactation average milk energy output, average metabolic BW, as well as lactation loss and gain of body condition score. In the other, a linear mixed model was used to simultaneously fit fixed linear regressions and random cow levels on the biological traits and intercept using fortnight repeated measures for the variables. This method split the predicted NEI into two parts: one quantifying the population mean intercept and coefficients, and one quantifying cow-specific deviations in the intercept and coefficients. The cow-specific part of predicted NEI was assumed to isolate true differences in feed efficiency among cows. NEI and associated energy expenditure phenotypes were available for the first 17 fortnights of lactation from 119 Holstein cows, all fed a constant energy-rich diet. Mixed models fitting cow-specific intercept and coefficients to different combinations of the aforementioned energy expenditure traits, calculated on a fortnightly basis, were compared. The variance of REI estimated with the lactation average model represented only 8% of the variance of measured NEI. Among all compared mixed models, the variance of the cow-specific part of predicted NEI represented between 53% and 59% of the variance of REI estimated from the lactation average model or between 4% and 5% of the variance of measured NEI. The remaining 41% to 47% of the variance of REI estimated with the lactation average model may therefore reflect model fitting errors or measurement errors. In conclusion, the use of a mixed model framework with cow-specific random regressions seems to be a promising method to isolate the cow-specific component of REI in dairy cows.
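The two ways of expressing the cow-specific variance are linked by simple multiplication, which the following check makes explicit. It uses only the rounded percentages quoted in the abstract, so the results are approximate; the function name is hypothetical.

```python
def share_of_nei_variance(rei_share_of_nei, cow_share_of_rei):
    """Cow-specific variance as a fraction of measured NEI variance."""
    return rei_share_of_nei * cow_share_of_rei

# REI variance is about 8% of NEI variance; the cow-specific part is 53% to 59% of REI variance.
low = share_of_nei_variance(0.08, 0.53)
high = share_of_nei_variance(0.08, 0.59)
```

Both products fall between 0.04 and 0.05, consistent with the "between 4% and 5%" of measured NEI variance reported above.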
Cool, Geneviève; Lebel, Alexandre; Sadiq, Rehan; Rodriguez, Manuel J
2015-12-01
The regional variability of the probability of occurrence of high total trihalomethane (TTHM) levels was assessed using multilevel logistic regression models that incorporate environmental and infrastructure characteristics. The models were structured in a three-level hierarchical configuration: samples (first level), drinking water utilities (DWUs, second level) and natural regions, an ecological hierarchical division from the Quebec ecological framework of reference (third level). They considered six independent variables: precipitation, temperature, source type, seasons, treatment type and pH. The average probability of TTHM concentrations exceeding the targeted threshold was 18.1%. The probability was influenced by seasons, treatment type, precipitations and temperature. The variance at all levels was significant, showing that the probability of TTHM concentrations exceeding the threshold is most likely to be similar if located within the same DWU and within the same natural region. However, most of the variance initially attributed to natural regions was explained by treatment types and clarified by spatial aggregation on treatment types. Nevertheless, even after controlling for treatment type, there was still significant regional variability of the probability of TTHM concentrations exceeding the threshold. Regional variability was particularly important for DWUs using chlorination alone since they lack the appropriate treatment required to reduce the amount of natural organic matter (NOM) in source water prior to disinfection. Results presented herein could be of interest to authorities in identifying regions with specific needs regarding drinking water quality and for epidemiological studies identifying geographical variations in population exposure to disinfection by-products (DBPs).
Lindström, Martin; Merlo, Juan; Ostergren, Per-Olof
2002-06-01
The aim of this study was to analyse the impact of neighbourhood on individual social capital (measured as social participation). The study population consisted of 14,390 individuals aged 45-73 that participated in the Malmö diet and cancer study in 1992-1994, residing in 90 neighbourhoods of Malmö, Sweden (population 250,000). A multilevel logistic regression model, with individuals at the first level and neighbourhoods at the second level, was performed. The study analysed the effect (intra-area correlation and cross-level modification) of the neighbourhood on individual social capital after adjustment for compositional factors (e.g. age, sex, educational level, occupational status, disability pension, living alone, sick leave, unemployment) and, finally, one contextual migration factor. The prevalence of low social participation varied from 23.0% to 39.7% in the first and third neighbourhood quartiles, respectively. Neighbourhood factors accounted for 6.3% of the total variance in social participation, and this effect was reduced but not eliminated when adjusting for all studied variables (-73%), especially the occupational composition of the neighbourhoods (-58%). The contextual migration variable further reduced the variance in social participation at the neighbourhood level to some extent. Our study supports Putnam's notion that social capital, which is suggested to be an important factor for population health and possibly for health equity, is an aspect that is partly contextual in its nature.
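A share of variance at the neighbourhood level, such as the 6.3% above, is often computed in multilevel logistic regression with the latent-variable approximation, in which the individual-level variance is fixed at π²/3. The sketch below assumes that approximation; the abstract does not say which method the authors used, and the function names are hypothetical.

```python
import math

# Variance of the standard logistic distribution, used as the level-1 variance.
LOGISTIC_VARIANCE = math.pi ** 2 / 3

def intraclass_correlation(area_variance):
    """Share of total latent variance attributable to the area level."""
    return area_variance / (area_variance + LOGISTIC_VARIANCE)

def area_variance_from_icc(icc):
    """Invert the formula: area-level variance implied by a given share."""
    return icc * LOGISTIC_VARIANCE / (1.0 - icc)

# Area-level variance implied by a 6.3% share under this approximation.
implied_variance = area_variance_from_icc(0.063)
```

Under this assumption a 6.3% share corresponds to a neighbourhood-level variance of roughly 0.22 on the logit scale.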
Dysphagia after nonsurgical head and neck cancer treatment: patients' perspectives.
Wilson, Janet A; Carding, Paul N; Patterson, Joanne M
2011-11-01
Assess patients' perspectives on the severity, time course, and relative importance of swallowing deficit before and after (chemo)radiotherapy for head and neck cancer. Before-and-after cohort study. Head and neck cancer UK multidisciplinary clinic. A total of 167 patients with a primary cancer, mostly laryngopharyngeal, completed the MD Anderson Dysphagia Index (MDADI) and the University of Washington Quality of Life Questionnaire (UWQOL) before treatment and at 3, 6, and 12 months. Pretreatment swallowing, age, gender, and tumor site and stage were assessed. Statistical methods used were Mann-Whitney, analysis of variance, and logistic regression. There was a sharp deterioration in swallowing on average by 18%, from before treatment to 3 months post treatment (mean difference in MDADI score = 14.5; P < .001). Treatment schedule, pretreatment score, and age accounted for 37% of the variance in 3-month posttreatment MDADI scores. There was then little improvement from 3 to 12 months. Patients treated with only 50-Gy radiotherapy reported significantly less dysphagia at 1 year than patients receiving higher doses or combined chemoradiation (P < .001). Swallowing was the most commonly prioritized of the 12 UWQOL domains both before and after therapy. The MDADI and UWQOL scores were strongly correlated: ρ > 0.69. Swallowing is a top priority before and after treatment for the vast majority of patients with head and neck cancer. Swallowing deteriorates significantly posttreatment (P < .001). Treatment intensity, younger age, and lower pretreatment scores predict long-term dysphagia. After chemoradiation, there is little improvement from 3 to 12 months.
Demographic, medical, and psychiatric factors in work and marital status after mild head injury.
Vanderploeg, Rodney D; Curtiss, Glenn; Duchnick, Jennifer J; Luis, Cheryl A
2003-01-01
To explore factors associated with long-term outcomes of work and marital status in individuals who had experienced a mild head injury (MHI), as well as those who had not. Population-based study using logistic regression analyses to investigate the impact of preinjury characteristics on work and marital status. Two groups of Vietnam-era Army veterans: 626 who had experienced a MHI an average of 8 years before examination, and 3,896 who had not. Demographic characteristics, concurrent medical conditions, early life psychiatric problems, loss of consciousness (LOC), and interactions among these variables were used to predict current work and marital status. Multiple variables were associated with work and marital status in the sample with MHI, accounting for approximately 23% and 17% of the variance in these two outcome variables, respectively. In contrast, the same factors accounted for significantly less variance in outcome in the sample without a head injury: 13.3% and 9.4% for work and marital status, respectively. These findings suggest a more potent role for and increased vulnerability to the influence of demographic, medical, and psychiatric factors on outcomes after a MHI. That is, MHI itself moderates the influence of preinjury characteristics on work and marital status. In addition, in those who had a MHI, moderator relationships were found between education and LOC for both work and marital status. Similarly, complex moderator relationships among race, region of residence, and LOC were found for both work and marital status outcomes.
Accessibility of Catering Service Venues and Adolescent Drinking in Beijing, China.
Lu, Shijun; Du, Songming; Ren, Zhoupeng; Zhao, Jing; Chambers, Christina; Wang, Jinfeng; Ma, Guansheng
2015-06-26
This study assessed the association between accessibility of catering service venues and adolescents' alcohol use over the previous 30 days. The data were collected in 2014 through cross-sectional surveys of 2223 students at 27 high schools in Chaoyang and Xicheng districts, Beijing, using self-administered questionnaires to collect the adolescents' information on socio-demographic characteristics and recent alcohol experiences. The accessibility of, and proximity to, catering service venues were summarized by weights, which were calculated by multiplication of the type-weight and the distance-weight. All sampled schools were categorized into three subgroups (low, middle, and high geographic density) based on the tertile of nearby catering service venues, and a multi-level logistic regression analysis was performed to explore variance between the school levels. Considering the setting characteristics, the catering service venues weighted value was found to account for 8.6% of the school level variance of adolescent alcohol use. The odds ratios (OR) and 95% confidence intervals (CI) of drinking over the past 30 days among adolescents with medium and high accessibility of catering service venues were 1.17 (0.86, 1.57) and 1.47 (1.06, 2.02), respectively (p < 0.001 for trend test). This study addressed a gap in knowledge of how catering service venues around schools influence adolescent drinking in China. Results suggest that the greater accessibility of catering service venues around schools is associated with a growing risk of recent drinking.
Accessibility of Catering Service Venues and Adolescent Drinking in Beijing, China
Lu, Shijun; Du, Songming; Ren, Zhoupeng; Zhao, Jing; Chambers, Christina; Wang, Jinfeng; Ma, Guansheng
2015-01-01
This study assessed the association between accessibility of catering service venues and adolescents’ alcohol use over the previous 30 days. The data were collected in 2014 through cross-sectional surveys of 2223 students at 27 high schools in Chaoyang and Xicheng districts, Beijing, using self-administered questionnaires to collect the adolescents’ information on socio-demographic characteristics and recent alcohol experiences. The accessibility of, and proximity to, catering service venues were summarized by weights, which were calculated by multiplication of the type-weight and the distance-weight. All sampled schools were categorized into three subgroups (low, middle, and high geographic density) based on the tertile of nearby catering service venues, and a multi-level logistic regression analysis was performed to explore variance between the school levels. Considering the setting characteristics, the catering service venues weighted value was found to account for 8.6% of the school level variance of adolescent alcohol use. The odds ratios (OR) and 95% confidence intervals (CI) of drinking over the past 30 days among adolescents with medium and high accessibility of catering service venues were 1.17 (0.86, 1.57) and 1.47 (1.06, 2.02), respectively (p < 0.001 for trend test). This study addressed a gap in knowledge of how catering service venues around schools influence adolescent drinking in China. Results suggest that the greater accessibility of catering service venues around schools is associated with a growing risk of recent drinking. PMID:26132475
Strober, Lauren B; Chiaravalloti, Nancy; DeLuca, John
2018-01-01
Rates of unemployment among individuals with multiple sclerosis (MS) are as high as 80%. While several factors for such high rates of unemployment have been identified, they do not account for the majority of the variance. This study examines person-specific factors such as personality and coping, which may better account for individuals leaving the workforce. Forty individuals with MS (20 considering reducing work hours or leaving the workforce and 20 remaining employed) were matched on age, gender, education, disease duration, and disease course, and administered a comprehensive survey of factors purported to be related to employment status. Based on multiple logistic regression analyses, certain disease factors and person-specific factors differentiate those who are considering leaving work or reducing work hours from those staying employed. In particular, those expressing the need to reduce work hours or leave the workforce reported more fatigue, anxiety, depression, and use of behavioral disengagement as a means of coping. In contrast, those staying employed reported greater levels of extraversion, self-efficacy, and use of humor as a means of coping. Together, fatigue, use of humor, and use of behavioral disengagement as a means of coping were the most significant factors, accounting for 44% of the variance. Findings suggest that greater consideration be given to these factors and that interventions tailored to address these factors may assist individuals with MS in staying employed and/or making appropriate accommodations.
Hjerpe, Per; Boström, Kristina Bengtsson; Lindblad, Ulf; Merlo, Juan
2012-12-01
To investigate the impact on ICD coding behaviour of a new case-mix reimbursement system based on coded patient diagnoses. The main hypothesis was that after the introduction of the new system the coding of chronic diseases like hypertension and cancer would increase and the variance in propensity for coding would decrease on both physician and health care centre (HCC) levels. Cross-sectional multilevel logistic regression analyses were performed in periods covering the time before and after the introduction of the new reimbursement system. Skaraborg primary care, Sweden. All patients (n = 76 546 to 79 826) 50 years of age and older visiting 468 to 627 physicians at the 22 public HCCs in five consecutive time periods of one year each. Registered codes for hypertension and cancer diseases in Skaraborg primary care database (SPCD). After the introduction of the new reimbursement system the adjusted prevalence of hypertension and cancer in SPCD increased from 17.4% to 32.2% and from 0.79% to 2.32%, respectively, probably partly due to an increased diagnosis coding of indirect patient contacts. The total variance in the propensity for coding declined simultaneously at the physician level for both diagnosis groups. Changes in the healthcare reimbursement system may directly influence the contents of a research database that retrieves data from clinical practice. This should be taken into account when using such a database for research purposes, and the data should be validated for each diagnosis.
ERIC Educational Resources Information Center
Krus, David J.; Krus, Patricia H.
1978-01-01
The conceptual differences between coded regression analysis and traditional analysis of variance are discussed. Also, a modification of several SPSS routines is proposed which allows for direct interpretation of ANOVA and ANCOVA results in a form stressing the strength and significance of scrutinized relationships. (Author)
Morlett-Paredes, Alejandra; Perrin, Paul B; Olivera, Silvia Leonor; Rogers, Heather L; Perdomo, Jose Libardo; Arango, Jose Anselmo; Arango-Lasprilla, Juan Carlos
2014-01-01
The purpose of this study was to examine the influence of appraisal, belonging, and tangible social support on the mental health (depression, satisfaction with life, anxiety, and burden) of Colombian spinal cord injury (SCI) caregivers. Forty SCI caregivers from Neiva, Colombia completed questionnaires assessing their perceived social support and mental health. Four multiple regressions found that the three social support variables explained 42.8% of the variance in caregiver depression, 22.3% of the variance in satisfaction with life, 24.1% of the variance in anxiety, and 16.5% of the variance in burden, although the effect on burden was marginally significant. Within these regressions, higher belonging social support was uniquely associated with lower depression, and higher tangible social support was uniquely associated with higher caregiver satisfaction with life. Social support may have a particularly important influence on SCI caregiver mental health in Colombia, due in part to the high levels of collectivism and strong family values shown to exist in Latin America, and may therefore be an important target for SCI caregiver interventions in this region.
High-Dimensional Heteroscedastic Regression with an Application to eQTL Data Analysis
Daye, Z. John; Chen, Jinbo; Li, Hongzhe
2011-01-01
Summary We consider the problem of high-dimensional regression under non-constant error variances. Despite being a common phenomenon in biological applications, heteroscedasticity has, so far, been largely ignored in high-dimensional analysis of genomic data sets. We propose a new methodology that allows non-constant error variances for high-dimensional estimation and model selection. Our method incorporates heteroscedasticity by simultaneously modeling both the mean and variance components via a novel doubly regularized approach. Extensive Monte Carlo simulations indicate that our proposed procedure can result in better estimation and variable selection than existing methods when heteroscedasticity arises from the presence of predictors explaining error variances and outliers. Further, we demonstrate the presence of heteroscedasticity in and apply our method to an expression quantitative trait loci (eQTLs) study of 112 yeast segregants. The new procedure can automatically account for heteroscedasticity in identifying the eQTLs that are associated with gene expression variations and lead to smaller prediction errors. These results demonstrate the importance of considering heteroscedasticity in eQTL data analysis. PMID:22547833
NASA Astrophysics Data System (ADS)
Wheeler, David C.; Waller, Lance A.
2009-03-01
In this paper, we compare and contrast a Bayesian spatially varying coefficient process (SVCP) model with a geographically weighted regression (GWR) model for the estimation of the potentially spatially varying regression effects of alcohol outlets and illegal drug activity on violent crime in Houston, Texas. In addition, we focus on the inherent coefficient shrinkage properties of the Bayesian SVCP model as a way to address increased coefficient variance that follows from collinearity in GWR models. We outline the advantages of the Bayesian model in terms of reducing inflated coefficient variance, enhanced model flexibility, and more formal measuring of model uncertainty for prediction. We find spatially varying effects for alcohol outlets and drug violations, but the amount of variation depends on the type of model used. For the Bayesian model, this variation is controllable through the amount of prior influence placed on the variance of the coefficients. For example, the spatial pattern of coefficients is similar for the GWR and Bayesian models when a relatively large prior variance is used in the Bayesian model.
Placebo influences on dyskinesia in Parkinson's disease.
Goetz, Christopher G; Laska, Eugene; Hicking, Christine; Damier, Philippe; Müller, Thomas; Nutt, John; Warren Olanow, C; Rascol, Olivier; Russ, Hermann
2008-04-15
Clinical features that are prognostic indicators of placebo response among dyskinetic Parkinson's disease patients were determined. Placebo-associated improvements occur in Parkinsonism, but responses in dyskinesia have not been studied. Placebo data from two multicenter studies with identical design comparing sarizotan to placebo for treating dyskinesia were accessed. Sarizotan (2 mg/day) failed to improve dyskinesia compared with placebo, but both treatments improved dyskinesia compared with baseline. Stepwise regression identified baseline characteristics that influenced dyskinesia response to placebo, and these factors were entered into a logistic regression model to quantify their influence on placebo-related dyskinesia improvements and worsening. Because placebo-associated improvements in Parkinsonism have been attributed to heightened dopaminergic activity, we also examined the association between changes in Parkinsonism and dyskinesia. Four hundred eighty-four subjects received placebo treatment; 178 met criteria for placebo-associated dyskinesia improvement and 37 for dyskinesia worsening. Older age, lower baseline Parkinsonism score, and lower total daily levodopa doses were associated with placebo-associated improvement, whereas lower baseline dyskinesia score was associated with placebo-associated worsening. Placebo-associated dyskinesia changes were not correlated with Parkinsonism changes, and all effects in the sarizotan group were statistically explained by the placebo-effect regression model. Dyskinesias are affected by placebo treatment. The absence of correlation between placebo-induced changes in dyskinesia and Parkinsonism argues against a dopaminergic activation mechanism to explain placebo-associated improvements in dyskinesia. The magnitude and variance of placebo-related changes and the factors that influence them can be helpful in the design of future clinical trials of antidyskinetic agents. 2007 Movement Disorder Society
Placebo Influences on Dyskinesia in Parkinson's Disease
Goetz, Christopher G.; Laska, Eugene; Hicking, Christine; Damier, Philippe; Müller, Thomas; Nutt, John; Olanow, C. Warren; Rascol, Olivier; Russ, Hermann
2009-01-01
Clinical features that are prognostic indicators of placebo response among dyskinetic Parkinson's disease patients were determined. Placebo-associated improvements occur in Parkinsonism, but responses in dyskinesia have not been studied. Placebo data from two multicenter studies with identical design comparing sarizotan to placebo for treating dyskinesia were accessed. Sarizotan (2 mg/day) failed to improve dyskinesia compared with placebo, but both treatments improved dyskinesia compared with baseline. Stepwise regression identified baseline characteristics that influenced dyskinesia response to placebo, and these factors were entered into a logistic regression model to quantify their influence on placebo-related dyskinesia improvements and worsening. Because placebo-associated improvements in Parkinsonism have been attributed to heightened dopaminergic activity, we also examined the association between changes in Parkinsonism and dyskinesia. Four hundred eighty-four subjects received placebo treatment; 178 met criteria for placebo-associated dyskinesia improvement and 37 for dyskinesia worsening. Older age, lower baseline Parkinsonism score, and lower total daily levodopa doses were associated with placebo-associated improvement, whereas lower baseline dyskinesia score was associated with placebo-associated worsening. Placebo-associated dyskinesia changes were not correlated with Parkinsonism changes, and all effects in the sarizotan group were statistically explained by the placebo-effect regression model. Dyskinesias are affected by placebo treatment. The absence of correlation between placebo-induced changes in dyskinesia and Parkinsonism argues against a dopaminergic activation mechanism to explain placebo-associated improvements in dyskinesia. The magnitude and variance of placebo-related changes and the factors that influence them can be helpful in the design of future clinical trials of antidyskinetic agents. PMID:18175337
Li, Liang; Mao, Huzhang; Ishwaran, Hemant; Rajeswaran, Jeevanantham; Ehrlinger, John; Blackstone, Eugene H.
2016-01-01
Atrial fibrillation (AF) is an abnormal heart rhythm characterized by rapid and irregular heartbeat, with or without perceivable symptoms. In clinical practice, the electrocardiogram (ECG) is often used for diagnosis of AF. Since the AF often arrives as recurrent episodes of varying frequency and duration and only the episodes that occur at the time of ECG can be detected, the AF is often underdiagnosed when a limited number of repeated ECGs are used. In studies evaluating the efficacy of AF ablation surgery, each patient undergoes multiple ECGs and the AF status at the time of ECG is recorded. The objective of this paper is to estimate the marginal proportions of patients with or without AF in a population, which are important measures of the efficacy of the treatment. The underdiagnosis problem is addressed by a three-class mixture regression model in which a patient’s probability of having no AF, paroxysmal AF, and permanent AF is modeled by auxiliary baseline covariates in a nested logistic regression. A binomial regression model is specified conditional on a subject being in the paroxysmal AF group. The model parameters are estimated by the EM algorithm. These parameters are themselves nuisance parameters for the purpose of this research, but the estimators of the marginal proportions of interest can be expressed as functions of the data and these nuisance parameters and their variances can be estimated by the sandwich method. We examine the performance of the proposed methodology in simulations and two real data applications. PMID:27983754
Occlusal factors are not related to self-reported bruxism.
Manfredini, Daniele; Visscher, Corine M; Guarda-Nardini, Luca; Lobbezoo, Frank
2012-01-01
To estimate the contribution of various occlusal features of the natural dentition that may identify self-reported bruxers compared to nonbruxers. Two age- and sex-matched groups of self-reported bruxers (n = 67) and self-reported nonbruxers (n = 75) took part in the study. For each patient, the following occlusal features were clinically assessed: retruded contact position (RCP) to intercuspal contact position (ICP) slide length (< 2 mm was considered normal), vertical overlap (< 0 mm was considered an anterior open bite; > 4 mm, a deep bite), horizontal overlap (> 4 mm was considered a large horizontal overlap), incisor dental midline discrepancy (< 2 mm was considered normal), and the presence of a unilateral posterior crossbite, mediotrusive interferences, and laterotrusive interferences. A multiple logistic regression model was used to identify the significant associations between the assessed occlusal features (independent variables) and self-reported bruxism (dependent variable). Accuracy values to predict self-reported bruxism were unacceptable for all occlusal variables. The only variable remaining in the final regression model was laterotrusive interferences (P = .030). The percentage of explained variance for bruxism by the final multiple regression model was 4.6%. This model including only one occlusal factor showed low positive (58.1%) and negative predictive values (59.7%), thus showing a poor accuracy to predict the presence of self-reported bruxism (59.2%). This investigation suggested that the contribution of occlusion to the differentiation between bruxers and nonbruxers is negligible. This finding supports theories that advocate a much diminished role for peripheral anatomical-structural factors in the pathogenesis of bruxism.
Li, Liang; Mao, Huzhang; Ishwaran, Hemant; Rajeswaran, Jeevanantham; Ehrlinger, John; Blackstone, Eugene H
2017-03-01
Atrial fibrillation (AF) is an abnormal heart rhythm characterized by rapid and irregular heartbeat, with or without perceivable symptoms. In clinical practice, the electrocardiogram (ECG) is often used for diagnosis of AF. Since the AF often arrives as recurrent episodes of varying frequency and duration and only the episodes that occur at the time of ECG can be detected, the AF is often underdiagnosed when a limited number of repeated ECGs are used. In studies evaluating the efficacy of AF ablation surgery, each patient undergoes multiple ECGs and the AF status at the time of ECG is recorded. The objective of this paper is to estimate the marginal proportions of patients with or without AF in a population, which are important measures of the efficacy of the treatment. The underdiagnosis problem is addressed by a three-class mixture regression model in which a patient's probability of having no AF, paroxysmal AF, and permanent AF is modeled by auxiliary baseline covariates in a nested logistic regression. A binomial regression model is specified conditional on a subject being in the paroxysmal AF group. The model parameters are estimated by the Expectation-Maximization (EM) algorithm. These parameters are themselves nuisance parameters for the purpose of this research, but the estimators of the marginal proportions of interest can be expressed as functions of the data and these nuisance parameters and their variances can be estimated by the sandwich method. We examine the performance of the proposed methodology in simulations and two real data applications. © 2016 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim.
Kupek, Emil
2006-03-15
Structural equation modelling (SEM) has been increasingly used in medical statistics for solving a system of related regression equations. However, a great obstacle for its wider use has been its difficulty in handling categorical variables within the framework of generalised linear models. A large data set with a known structure among two related outcomes and three independent variables was generated to investigate the use of Yule's transformation of odds ratio (OR) into Q-metric by (OR-1)/(OR+1) to approximate Pearson's correlation coefficients between binary variables whose covariance structure can be further analysed by SEM. Percent of correctly classified events and non-events was compared with the classification obtained by logistic regression. The performance of SEM based on Q-metric was also checked on a small (N = 100) random sample of the data generated and on a real data set. SEM successfully recovered the generated model structure. SEM of real data suggested a significant influence of a latent confounding variable which would have not been detectable by standard logistic regression. SEM classification performance was broadly similar to that of the logistic regression. The analysis of binary data can be greatly enhanced by Yule's transformation of odds ratios into estimated correlation matrix that can be further analysed by SEM. The interpretation of results is aided by expressing them as odds ratios which are the most frequently used measure of effect in medical statistics.
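The transformation named in the abstract above, Q = (OR-1)/(OR+1), can be sketched in a few lines. This is only an illustration of the formula as quoted; the function names and the example 2x2 counts are hypothetical and are not taken from the study.

```python
def odds_ratio_from_table(a: float, b: float, c: float, d: float) -> float:
    """Odds ratio of a 2x2 table laid out as [[a, b], [c, d]].

    The cell layout is an assumption made for this sketch.
    """
    if b == 0 or c == 0:
        raise ValueError("odds ratio is undefined when b or c is zero")
    return (a * d) / (b * c)


def yules_q(odds_ratio: float) -> float:
    """Yule's Q, computed as (OR - 1) / (OR + 1) as quoted in the abstract."""
    if odds_ratio < 0:
        raise ValueError("an odds ratio cannot be negative")
    return (odds_ratio - 1.0) / (odds_ratio + 1.0)


# Hypothetical counts, for illustration only.
example_or = odds_ratio_from_table(30, 10, 10, 30)
print(example_or, yules_q(example_or))
```

An odds ratio of 1 (no association) maps to Q = 0, and Q approaches 1 as the odds ratio grows, which is what lets Q stand in for a correlation-like quantity.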
Suzuki, Taku; Iwamoto, Takuji; Shizu, Kanae; Suzuki, Katsuji; Yamada, Harumoto; Sato, Kazuki
2017-05-01
This retrospective study was designed to investigate prognostic factors for postoperative outcomes for cubital tunnel syndrome (CubTS) using multiple logistic regression analysis with a large number of patients. Eighty-three patients with CubTS who underwent surgeries were enrolled. The following potential prognostic factors for disease severity were selected according to previous reports: sex, age, type of surgery, disease duration, body mass index, cervical lesion, presence of diabetes mellitus, Workers' Compensation status, preoperative severity, and preoperative electrodiagnostic testing. Postoperative severity of disease was assessed 2 years after surgery by Messina's criteria which is an outcome measure specifically for CubTS. Bivariate analysis was performed to select candidate prognostic factors for multiple linear regression analyses. Multiple logistic regression analysis was conducted to identify the association between postoperative severity and selected prognostic factors. Both bivariate and multiple linear regression analysis revealed only preoperative severity as an independent risk factor for poor prognosis, while other factors did not show any significant association. Although conflicting results exist regarding prognosis of CubTS, this study supports evidence from previous studies and concludes early surgical intervention portends the most favorable prognosis. Copyright © 2017 The Japanese Orthopaedic Association. Published by Elsevier B.V. All rights reserved.
Caraviello, D Z; Weigel, K A; Gianola, D
2004-05-01
Predicted transmitting abilities (PTA) of US Jersey sires for daughter longevity were calculated using a Weibull proportional hazards sire model and compared with predictions from a conventional linear animal model. Culling data from 268,008 Jersey cows with first calving from 1981 to 2000 were used. The proportional hazards model included time-dependent effects of herd-year-season contemporary group and parity by stage of lactation interaction, as well as time-independent effects of sire and age at first calving. Sire variances and parameters of the Weibull distribution were estimated, providing heritability estimates of 4.7% on the log scale and 18.0% on the original scale. The PTA of each sire was expressed as the expected risk of culling relative to daughters of an average sire. Risk ratios (RR) ranged from 0.7 to 1.3, indicating that the risk of culling for daughters of the best sires was 30% lower than for daughters of average sires and nearly 50% lower than for daughters of the poorest sires. Sire PTA from the proportional hazards model were compared with PTA from a linear model similar to that used for routine national genetic evaluation of length of productive life (PL) using cross-validation in independent samples of herds. Models were compared using logistic regression of daughters' stayability to second, third, fourth, or fifth lactation on their sires' PTA values, with alternative approaches for weighting the contribution of each sire. Models were also compared using logistic regression of daughters' stayability to 36, 48, 60, 72, and 84 mo of life. The proportional hazards model generally yielded more accurate predictions according to these criteria, but differences in predictive ability between methods were smaller when using a Kullback-Leibler distance than with other approaches. Results of this study suggest that survival analysis methodology may provide more accurate predictions of genetic merit for longevity than conventional linear models.
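The percentages quoted in the abstract above can be checked from the reported risk-ratio range alone. The sketch below assumes the average sire has a risk ratio of 1.0 (as implied by "relative to daughters of an average sire"); the function name is hypothetical.

```python
def relative_reduction(rr_low: float, rr_reference: float) -> float:
    """Proportional reduction in risk of rr_low compared with rr_reference."""
    return 1.0 - rr_low / rr_reference


best, average, poorest = 0.7, 1.0, 1.3  # 0.7 and 1.3 are from the abstract

print(round(relative_reduction(best, average), 3))  # best vs. average sires
print(round(relative_reduction(best, poorest), 3))  # best vs. poorest sires
```

The second value is about 0.46, which is consistent with the abstract's wording of "nearly 50% lower".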
Evaluation of keratoconus progression.
Shajari, Mehdi; Steinwender, Gernot; Herrmann, Kim; Kubiak, Kate Barbara; Pavlovic, Ivana; Plawetzki, Elena; Schmack, Ingo; Kohnen, Thomas
2018-06-01
To define variables for the evaluation of keratoconus progression and to determine cut-off values. In this retrospective cohort study (2010-2016), 265 eyes of 165 patients diagnosed with keratoconus underwent two Scheimpflug measurements (Pentacam) that took place 1 year apart ±3 months. Variables used for keratoconus detection were evaluated for progression and a correlation analysis was performed. By logistic regression analysis, a keratoconus progression index (KPI) was defined. Receiver-operating characteristic curve (ROC) analysis was performed and Youden Index calculated to determine cut-off values. Variables used for keratoconus detection showed a weak correlation with each other (eg, correlation r=0.245 between RPImin and Kmax, p<0.001). Therefore, we used parameters that took several variables into consideration (eg, D-index, index of surface variance, index for height asymmetry, KPI). KPI was defined by logistic regression and consisted of a Pachymin coefficient of -0.78 (p=0.001), a maximum elevation of back surface coefficient of 0.27 and coefficient of corneal curvature at the zone 3 mm away from the thinnest point on the posterior corneal surface of -12.44 (both p<0.001). The two variables with the highest Youden Index in the ROC analysis were D-index and KPI: D-index had a cut-off of 0.4175 (70.6% sensitivity) and Youden Index of 0.606. Cut-off for KPI was -0.78196 (84.7% sensitivity) and a Youden Index of 0.747; both 90% specificity. Keratoconus progression should be defined by evaluating parameters that consider several corneal changes; we suggest D-index and KPI to detect progression. © Article author(s) (or their employer(s) unless otherwise stated in the text of the article) 2018. All rights reserved. No commercial use is permitted unless otherwise expressly granted.
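The Youden Index values reported above can be reproduced from the stated sensitivities and the shared 90% specificity, using the standard definition J = sensitivity + specificity - 1. The function name is illustrative only.

```python
def youden_index(sensitivity: float, specificity: float) -> float:
    """Youden's J statistic: sensitivity + specificity - 1."""
    for value in (sensitivity, specificity):
        if not 0.0 <= value <= 1.0:
            raise ValueError("sensitivity and specificity must lie in [0, 1]")
    return sensitivity + specificity - 1.0


# Sensitivity and specificity as reported in the abstract.
print(round(youden_index(0.706, 0.90), 3))  # D-index
print(round(youden_index(0.847, 0.90), 3))  # KPI
```

Both results agree with the abstract's figures of 0.606 for the D-index and 0.747 for the KPI.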
Silva-Fernández, Lucía; Pérez-Vicente, Sabina; Martín-Martínez, María Auxiliadora; López-González, Ruth
2015-06-01
To describe the variability in the prescription of non-biologic disease-modifying antirheumatic drugs (nbDMARDs) for the treatment of spondyloarthritis (SpA) in Spain and to explore which factors relating to the disease, patient, physician, and/or center contribute to these variations. A retrospective medical record review was performed using a probabilistic sample of 1168 patients with SpA from 45 centers distributed in 15/19 regions in Spain. The sociodemographic and clinical features and the use of drugs were recorded following a standardized protocol. Logistic regression, with nbDMARDs prescriptions as the dependent variable, was used for bivariable analysis. A multilevel logistic regression model was used to study variability. The probability of receiving an nbDMARD was higher in female patients [OR = 1.548; 95% confidence interval (CI): 1.208-1.984], in those with elevated C-reactive protein (OR = 1.039; 95% CI: 1.012-1.066) and erythrocyte sedimentation rate (OR = 1.012; 95% CI: 1.003-1.021), in those with a higher number of affected peripheral joints (OR = 12.921; 95% CI: 2.911-57.347), and in patients with extra-articular manifestations like dactylitis (OR = 2.997; 95% CI: 1.868-4.809), psoriasis (OR = 2.601; 95% CI: 1.870-3.617), and enthesitis (OR = 1.717; 95% CI: 1.224-2.410). There was a marked variability in the prescription of nbDMARDs for SpA patients, depending on the center (14.3%; variance 0.549; standard error 0.161; median odds ratio 2.366; p < 0.001). After adjusting for patient and center variables, this variability fell to 3.8%. A number of factors affecting variability in clinical practice, and which are independent of disease characteristics, are associated with the probability of SpA patients receiving nbDMARDs in Spain. Copyright © 2015 Elsevier Inc. All rights reserved.
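The 14.3% center-level variability reported above appears consistent with the latent-variable variance partition coefficient commonly used for multilevel logistic models, in which the individual-level variance is fixed at π²/3. The abstract does not state which definition the authors used, so the sketch below is an assumption, and the function name is hypothetical.

```python
import math


def latent_vpc(cluster_variance: float) -> float:
    """Latent-variable VPC for a two-level logistic model.

    Assumes the level-1 residual follows a standard logistic
    distribution, whose variance is pi^2 / 3.
    """
    if cluster_variance < 0:
        raise ValueError("a variance cannot be negative")
    level1_variance = math.pi ** 2 / 3.0
    return cluster_variance / (cluster_variance + level1_variance)


# Center-level variance as reported in the abstract.
print(round(latent_vpc(0.549), 3))
```

With the reported variance of 0.549 this gives roughly 0.143, matching the 14.3% in the text.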
[Relationship between dietary vitamin C and Type 2 diabetes].
Li, Xiaoxiao; Wang, Xinliang; Wei, Jie; Yang, Tubao
2015-10-01
To examine the correlation between dietary vitamin C intake and Type 2 diabetes. A total of 5 168 participants from Xiangya Hospital, Central South University were randomly selected. According to the vitamin C intake, the participants were divided into 5 groups: a Q1 group (n=1 033), a Q2 group (n=1 034), a Q3 group (n=1 034), a Q4 group (n=1 034) and a Q5 group (n=1 033). They were also divided into a Type 2 diabetes group (n=502) and a non-diabetes group (n=4 666). The height, weight, and blood pressure were measured, and vitamin C intake and other dairy consumption were evaluated using a food frequency questionnaire and fasting plasma glucose (FPG). The analysis of variance (ANOVA), Chi-square test, Mann-Whitney U test and logistic regression model were used to analyze the relationship between dietary vitamin C and Type 2 diabetes. The univariate analysis showed that there were significant differences in the vitamin C consumption in energy intake, activity level, dietary fiber intake, nutritional supplementation status, drinking or not drinking, education level among the different vitamin C intake groups (all P<0.05). There were also significant differences in age, sex, body mass index (BMI), smoking status and vitamin C intake between the Type 2 diabetes group and the non-diabetes group (all P<0.05). After the adjustment for age, gender, hypertension, energy intake or smoking status, the multiple logistic regression model found that the multivariable adjusted OR was 0.610 (95% CI 0.428-0.870) for the highest level of vitamin C intake (>154.78 mg/d) in comparison with the lowest level (≤ 63.26 mg/d). The results suggested that the vitamin C intake was inversely associated with the Type 2 diabetes (r=-0.029, P<0.05). There is a significant negative correlation between the dietary vitamin C intake and the risk of Type 2 diabetes.
Gomar, Jesus J; Bobes-Bascaran, Maria T; Conejero-Goldberg, Concepcion; Davies, Peter; Goldberg, Terry E
2011-09-01
Biomarkers have become increasingly important in understanding neurodegenerative processes associated with Alzheimer disease. Markers include regional brain volumes, cerebrospinal fluid measures of pathological Aβ1-42 and total tau, cognitive measures, and individual risk factors. To determine the discriminative utility of different classes of biomarkers and cognitive markers by examining their ability to predict a change in diagnostic status from mild cognitive impairment to Alzheimer disease. Longitudinal study. We analyzed the Alzheimer's Disease Neuroimaging Initiative database to study patients with mild cognitive impairment who converted to Alzheimer disease (n = 116) and those who did not convert (n = 204) within a 2-year period. We determined the predictive utility of 25 variables from all classes of markers, biomarkers, and risk factors in a series of logistic regression models and effect size analyses. The Alzheimer's Disease Neuroimaging Initiative public database. Primary outcome measures were odds ratios, pseudo-R² values, and effect sizes. In comprehensive stepwise logistic regression models that thus included variables from all classes of markers, the following baseline variables predicted conversion within a 2-year period: 2 measures of delayed verbal memory and middle temporal lobe cortical thickness. In an effect size analysis that examined rates of decline, change scores for biomarkers were modest for 2 years, but a change in an everyday functional activities measure (Functional Assessment Questionnaire) was considerably larger. Decline in scores on the Functional Assessment Questionnaire and Trail Making Test, part B, accounted for approximately 50% of the predictive variance in conversion from mild cognitive impairment to Alzheimer disease. Cognitive markers at baseline were more robust predictors of conversion than most biomarkers. Longitudinal analyses suggested that conversion appeared to be driven less by changes in the neurobiologic trajectory of the disease than by a sharp decline in functional ability and, to a lesser extent, by declines in executive function.
Ivanovic, D; Del P Rodríguez, M; Pérez, H; Alvear, J; Díaz, N; Leyton, B; Almagià, A; Toro, T; Urrutia, M S; Ivanovic, R
2008-01-01
To determine the impact of nutritional status in a multicausal approach of socio-economic, socio-cultural, family, intellectual, educational and demographic variables at the onset of elementary school in 1987 on the educational situation of these children in 1998, when they should have graduated from high school. Chile's Metropolitan Region. Prospective, observational and 12-year follow-up study. A representative sample of 813 elementary first grade school-age children was randomly chosen in 1987. The sample was assessed in two cross-sectional studies. The first cross-sectional study was carried out at the onset of elementary school in 1987 and the second was carried out in 1998, 12 years later, when they should have been graduating from high school. In 1998, 632 adolescent students were located and their educational situation was registered (dropout, delayed, graduated and not located). At the onset of elementary school, nutritional status, socio-economic status (SES), family characteristics, intellectual ability (IA), scholastic achievement (SA) and demographic variables were determined. Statistical analysis included variance tests and Scheffe's test was used for comparison of means. Pearson correlation coefficients and logistic regression were used to establish the most important independent variables at the onset of elementary school in 1987 that affect the educational situation in 1998. Data were analysed using the statistical analysis system (SAS). Logistic regression revealed that SES, IA, SA and head circumference-for-age Z score at the onset of elementary school in 1987 were the independent variables with the greatest explanatory power in the educational situation of school-age children in 1998. These parameters at an early school age are good predictors of the educational situation later and these results can be useful for nutrition and educational planning in early childhood.
Vadlin, Sofia; Åslund, Cecilia; Nilsson, Kent W
2018-04-01
The aims of this study were to investigate the long-term stability of problematic gaming among adolescents and whether problematic gaming at wave 1 (W1) was associated with problem gambling at wave 2 (W2), three years later. Data from the SALVe cohort, including adolescents in Västmanland born in 1997 and 1999, were accessed and analyzed in two waves (W2, N = 1576; 914 (58%) girls). At W1, the adolescents were 13 and 15 years old, and at W2, they were 16 and 18 years old. Adolescents self-rated on the Gaming Addiction Identification Test (GAIT), Problem Gambling Severity Index (PGSI), and gambling frequencies. Stability of gaming was determined using Gamma correlation, Spearman's rho, and McNemar. Logistic regression analysis and general linear model (GLM) analysis were performed and adjusted for sex, age, and ethnicity, frequency of gambling activities and gaming time at W1, with PGSI as the dependent variable, and GAIT as the independent variable, to investigate associations between problematic gaming and problem gambling. Problematic gaming was relatively stable over time, γ = 0.739, p ≤ .001, ρ = 0.555, p ≤ .001, and McNemar p ≤ .001. Furthermore, problematic gaming at W1 increased the probability of having problem gambling three years later, logistic regression OR = 1.886 (95% CI 1.125-3.161), p = .016, GLM F = 10.588, η² = 0.007, p = .001. Problematic gaming seems to be relatively stable over time. Although associations between problematic gaming and later problem gambling were found, the low explained variance indicates that problematic gaming is an unlikely predictor for problem gambling within this sample.
Foda, Hussein D; Brehm, Anthony; Goldsteen, Karen; Edelman, Norman H
2017-01-01
Background Prescriber disagreement is among the reasons for poor adherence to COPD treatment guidelines; it is not yet clear whether this leads to adverse outcomes. We tested whether undertreatment according to the original Global Initiative for Chronic Obstructive Lung Disease (GOLD) guidelines led to increased exacerbations. Methods Records of 878 patients with spirometrically confirmed COPD who were followed from 2005 to 2010 at one Veterans Administration (VA) Medical Center were analyzed. Analysis of variance was performed to assess differences in exacerbation rates between severity groups. Logistic regression analysis was performed to assess the relationship between noncompliance with guidelines and exacerbation rates. Findings About 19% were appropriately treated by guidelines; 14% overtreated, 44% under-treated, and in 23% treatment did not follow any guideline. Logistic regression revealed a strong inverse relationship between undertreatment and exacerbation rate when severity of obstruction was held constant. Exacerbations per year by GOLD stage were significantly different from each other: mild 0.15, moderate 0.27, severe 0.38, very severe 0.72, and substantially fewer than previously reported. Interpretation The guidelines were largely not followed. Undertreatment predominated but, contrary to expectations, was associated with fewer exacerbations. Thus, clinicians were likely advancing therapy primarily based upon exacerbation rates as was subsequently recommended in revised GOLD and other more recent guidelines. In retrospect, a substantial lack of prescriber adherence to treatment guidelines may have been a signal that they required re-evaluation. This is likely to be a general principle regarding therapeutic guidelines.
The identification of fewer exacerbations in this cohort than has been generally reported probably reflects the comprehensive nature of the VA system, which is more likely to identify relatively asymptomatic (ie, nonexacerbating) COPD patients. Accordingly, these rates may better reflect those in the general population. In addition, the lower rates may reflect the more complete preventive care provided by the VA. PMID:28123293
Coswig, Victor Silveira; Miarka, Bianca; Pires, Daniel Alvarez; da Silva, Levy Mendes; Bartel, Charles; Del Vecchio, Fabrício Boscolo
2018-05-14
We aimed to describe the nutritional and behavioural strategies for rapid weight loss (RWL), investigate the effects of RWL and weight regain (WRG) in winners and losers and verify mood state and technical-tactical/time-motion parameters in Mixed Martial Arts (MMA). The sample consisted of MMA athletes after a single real match and was separated into two groups: Winners (n=8, age: 25.4±6.1yo., height: 173.9±0.2cm, habitual body mass (BM): 89.9±17.3kg) and Losers (n=7, age: 24.4±6.8yo., height: 178.4±0.9cm, habitual BM: 90.8±19.5kg). Both groups exhibited RWL and WRG, verified their macronutrient intake, underwent weight and height assessments and completed two questionnaires (POMS and RWL) at i) 24 h before weigh-in, ii) weigh-in, iii) post-bout and iv) during a validated time-motion and technical-tactical analysis during the bout. Variance analysis, repeated measures and a logistic regression analysis were used. The main results showed significant differences between the time points in terms of total caloric intake as well as carbohydrate, protein and lipid ingestion. Statistical differences in combat analysis were observed between the winners and losers in terms of high-intensity relative time [58(10;98) s and 32(1;60) s, respectively], lower limb sequences [3.5(1.0;7.5) sequences and 1.0(0.0;1.0) sequences, respectively], and ground and pound actions [2.5(0.0;4.5) actions and 0.0(0.0;0.5) actions, respectively], and logistic regression confirmed the importance of high-intensity relative time and lower limb sequences on MMA performance. RWL and WRG strategies were related to technical-tactical and time-motion patterns as well as match outcomes. Weight management should be carefully supervised by specialized professionals to reduce health risks and raise competitive performance.
Accurate Diabetes Risk Stratification Using Machine Learning: Role of Missing Value and Outliers.
Maniruzzaman, Md; Rahman, Md Jahanur; Al-MehediHasan, Md; Suri, Harman S; Abedin, Md Menhazul; El-Baz, Ayman; Suri, Jasjit S
2018-04-10
Diabetes mellitus is a group of metabolic diseases in which blood sugar levels are too high. About 8.8% of the world's population was diabetic in 2017. It is projected that this will reach nearly 10% by 2045. The major challenge is that machine learning-based classifiers applied to such data sets for risk stratification yield lower performance. Thus, our objective is to develop an optimized and robust machine learning (ML) system under the assumption that missing values or outliers, if replaced by a median configuration, will yield higher risk stratification accuracy. This ML-based risk stratification is designed, optimized and evaluated, where: (i) the features are extracted and optimized from the six feature selection techniques (random forest, logistic regression, mutual information, principal component analysis, analysis of variance, and Fisher discriminant ratio) and combined with ten different types of classifiers (linear discriminant analysis, quadratic discriminant analysis, naïve Bayes, Gaussian process classification, support vector machine, artificial neural network, Adaboost, logistic regression, decision tree, and random forest) under the hypothesis that both missing values and outliers when replaced by computed medians will improve the risk stratification accuracy. The Pima Indian diabetic dataset (768 patients: 268 diabetic and 500 controls) was used. Our results demonstrate that replacing the missing values and outliers by group median and median values, respectively, and further using the combination of random forest feature selection and random forest classification yields an accuracy, sensitivity, specificity, positive predictive value, negative predictive value and area under the curve of 92.26%, 95.96%, 79.72%, 91.14%, 91.20%, and 0.93, respectively. This is an improvement of 10% over previously developed techniques published in the literature. The system was validated for its stability and reliability.
The RF-based model showed the best performance when outliers were replaced by median values.
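The two preprocessing steps named above (group-median imputation of missing values, median replacement of outliers) might be sketched as follows. The abstract does not define "outlier", so the Tukey fence rule (1.5 × IQR) used here is an assumption, and the function names and sample values are hypothetical.

```python
from statistics import median, quantiles

def impute_group_median(values, labels):
    """Replace None entries with the median of the observed values in the same class."""
    observed = {}
    for v, y in zip(values, labels):
        if v is not None:
            observed.setdefault(y, []).append(v)
    group_median = {y: median(vs) for y, vs in observed.items()}
    return [group_median[y] if v is None else v for v, y in zip(values, labels)]

def replace_outliers_with_median(values, k=1.5):
    """Replace values outside [Q1 - k*IQR, Q3 + k*IQR] with the overall median.

    The fence rule is an assumption; the source does not state its outlier criterion.
    """
    q1, _, q3 = quantiles(values, n=4)
    low, high = q1 - k * (q3 - q1), q3 + k * (q3 - q1)
    med = median(values)
    return [med if (v < low or v > high) else v for v in values]

# Hypothetical feature column with missing entries and binary class labels.
imputed = impute_group_median([148, None, 183, 89, None, 116], [1, 0, 1, 0, 0, 0])
cleaned = replace_outliers_with_median([10, 12, 11, 13, 12, 95])
print(imputed)
print(cleaned)
```

Imputing within the class, as opposed to using the overall median, uses the outcome label during preprocessing; in a validation setting the medians would normally be computed on the training split only.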
McClendon, Jamal; Smith, Timothy R; Sugrue, Patrick A; Thompson, Sara E; O'Shaughnessy, Brian A; Koski, Tyler R
2016-11-01
To evaluate spinal implant density and proximal junctional kyphosis (PJK) in adult spinal deformity (ASD). Consecutive patients with ASD receiving ≥5 level fusions were retrospectively analyzed between 2007 and 2010. Inclusion criteria: ASD, elective fusions, minimum 2-year follow-up. Exclusion criteria: age <18 years, neuromuscular or congenital scoliosis, cervical or cervicothoracic fusions, nonelective conditions (infection, tumor, trauma). Instrumented fusions were classified by the Scoliosis Research Society-Schwab ASD classification. Statistical analysis consisted of descriptives (measures of central tendency, dispersion, frequencies), independent Student t tests, χ², analysis of variance, and logistic regression to determine association of implant density [(number of screws + number of hooks)/surgical levels of fusion] and PJK. Mean and median follow-up was 2.8 and 2.7 years, respectively. Eighty-three patients (17 male, 66 female) with a mean age of 59.7 years (standard deviation, 10.3) were analyzed. Mean body mass index (BMI) was 29.5 kg/m² (range, 18-56 kg/m²) with mean preoperative Oswestry Disability Index of 48.67 (range, 6-86) and mean preoperative sagittal vertical axis of 8.42. The mean number of levels fused was 9.95, and 54 surgeries included interbody fusion. PJK prevalence was 21.7%, and pseudoarthrosis was 19.3%. Mean postoperative Oswestry Disability Index was 27.4 (range, 0-74). Independent Student t tests showed that PJK was not significant for age, gender, BMI, rod type, mean postoperative sagittal vertical axis, or Scoliosis Research Society-Schwab ASD classification; but iliac fixation approached significance (P = 0.077). Implant density and postoperative lumbar lordosis (LL) were predictors for PJK (P = 0.018 and 0.045, respectively). Controlling for age, BMI, and gender, postoperative LL (not implant density) continued to show significance in the multivariate logistic regression model.
PJK, although influenced by a multitude of factors, may be statistically related to implant density and LL. Copyright © 2016. Published by Elsevier Inc.
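The implant density definition quoted in the abstract, (number of screws + number of hooks) / surgical levels of fusion, can be written out directly. The function name and the example counts below are hypothetical and are not taken from the study.

```python
def implant_density(n_screws, n_hooks, levels_fused):
    """Implant density = (screws + hooks) / surgical levels of fusion."""
    if levels_fused <= 0:
        raise ValueError("levels_fused must be positive")
    return (n_screws + n_hooks) / levels_fused

# Hypothetical construct: 16 screws and 2 hooks across a 10-level fusion.
density = implant_density(16, 2, 10)
print(density)
```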
ERIC Educational Resources Information Center
Kasapoglu, Koray
2014-01-01
This study aims to investigate which factors are associated with Turkey's 15-year-olds' scoring above the OECD average (493) on the PISA'09 reading assessment. Collected from a total of 4,996 15-year-old students from Turkey, data were analyzed by logistic regression analysis in order to model the data of students who were split into two: (1)…
Upgrade Summer Severe Weather Tool
NASA Technical Reports Server (NTRS)
Watson, Leela
2011-01-01
The goal of this task was to upgrade the existing severe weather database by adding observations from the 2010 warm season, update the verification dataset with results from the 2010 warm season, use statistical logistic regression analysis on the database and develop a new forecast tool. The AMU analyzed 7 stability parameters that showed the possibility of providing guidance in forecasting severe weather, calculated verification statistics for the Total Threat Score (TTS), and calculated warm season verification statistics for the 2010 season. The AMU also performed statistical logistic regression analysis on the 22-year severe weather database. The results indicated that the logistic regression equation did not show an increase in skill over the previously developed TTS. The equation showed less accuracy than TTS at predicting severe weather, little ability to distinguish between severe and non-severe weather days, and worse standard categorical accuracy measures and skill scores than TTS.
Estimating the Probability of Rare Events Occurring Using a Local Model Averaging.
Chen, Jin-Hua; Chen, Chun-Shu; Huang, Meng-Fan; Lin, Hung-Chih
2016-10-01
In statistical applications, logistic regression is a popular method for analyzing binary data accompanied by explanatory variables. But when one of the two outcomes is rare, the estimation of model parameters has been shown to be severely biased and hence estimating the probability of rare events occurring based on a logistic regression model would be inaccurate. In this article, we focus on estimating the probability of rare events occurring based on logistic regression models. Instead of selecting a best model, we propose a local model averaging procedure based on a data perturbation technique applied to different information criteria to obtain different probability estimates of rare events occurring. Then an approximately unbiased estimator of Kullback-Leibler loss is used to choose the best one among them. We design complete simulations to show the effectiveness of our approach. For illustration, a necrotizing enterocolitis (NEC) data set is analyzed. © 2016 Society for Risk Analysis.
Evaluating the perennial stream using logistic regression in central Taiwan
NASA Astrophysics Data System (ADS)
Ruljigaljig, T.; Cheng, Y. S.; Lin, H. I.; Lee, C. H.; Yu, T. T.
2014-12-01
This study produces a perennial stream head potential map based on a logistic regression method with a Geographic Information System (GIS). Perennial stream initiation locations, which indicate where groundwater contacts the surface, were identified in the study area by field survey. The perennial stream potential map for central Taiwan was constructed using the relationship between perennial streams and their causative factors, such as catchment area, slope gradient, aspect, elevation, groundwater recharge and precipitation. Field surveys of 272 streams were conducted in the study area. The area under the curve for the logistic regression method was 0.87. The results illustrate the importance of catchment area and groundwater recharge as key factors within the model. The results obtained from the model within the GIS were then used to produce a map of perennial streams and to estimate the locations of perennial stream heads.
Menditto, Anthony A; Linhorst, Donald M; Coleman, James C; Beck, Niels C
2006-04-01
Development of policies and procedures to contend with the risks presented by elopement, aggression, and suicidal behaviors are long-standing challenges for mental health administrators. Guidance in making such judgments can be obtained through the use of a multivariate statistical technique known as logistic regression. This procedure can be used to develop a predictive equation that is mathematically formulated to use the best combination of predictors, rather than considering just one factor at a time. This paper presents an overview of logistic regression and its utility in mental health administrative decision making. A case example of its application is presented using data on elopements from Missouri's long-term state psychiatric hospitals. Ultimately, the use of statistical prediction analyses tempered with differential qualitative weighting of classification errors can augment decision-making processes in a manner that provides guidance and flexibility while wrestling with the complex problem of risk assessment and decision making.
Moghtadaei, Motahareh; Hashemi Golpayegani, Mohammad Reza; Malekzadeh, Reza
2013-02-07
Identification of squamous dysplasia and esophageal squamous cell carcinoma (ESCC) is of great importance in prevention of cancer incidence. Computer aided algorithms can be very useful for identification of people with higher risks of squamous dysplasia and ESCC. Such a method can limit clinical screening to people with higher risks. Different regression methods have been used to predict ESCC and dysplasia. In this paper, a Fuzzy Neural Network (FNN) model is selected for ESCC and dysplasia prediction. The inputs to the classifier are the risk factors. Since the relation between risk factors in the tumor system has a complex nonlinear behavior, in comparison to most ordinary data, the cost function of its model can have more local optima. Thus the need for global optimization methods is more pronounced. The proposed method in this paper is a Chaotic Optimization Algorithm (COA) proceeding by the common Error Back Propagation (EBP) local method. Since the model has many parameters, we use a strategy to reduce the dependency among parameters caused by the chaotic series generator. This dependency was not considered in the previous COA methods. The algorithm is compared with a logistic regression model, the latest successful method of ESCC and dysplasia prediction. The results show more precise prediction, with lower mean and variance of error. Copyright © 2012 Elsevier Ltd. All rights reserved.
Moreno-Martínez, F Javier; Goñi-Imízcoz, Miguel; Spitznagel, Mary Beth
2011-10-01
Category specific semantic impairment (e.g. living versus nonliving things) has been reported in association with various pathologies, including herpes simplex encephalitis and semantic dementia. However, evidence is inconsistent regarding whether this effect exists in diseases progressively impacting diverse cortical regions, such as Alzheimer's disease (AD). Ceiling effects producing non-Gaussian distributions and poor control for confounds such as nuisance variables (e.g. familiarity) may contribute to this discrepancy. Fourteen AD patients were longitudinally studied examining category effects on three semantic tasks (picture naming, naming to description and word to picture matching) matched across domain on all known nuisance variables (NV). To address non-Gaussian distributions, we ran bootstrap analyses to determine whether NV, semantic domain or control performance best predicted AD patient performance. Multiple hierarchical regression analyses revealed that, whilst NV accounted for most of the explained variance in patients in the three tasks, the influence of semantic domain was substantially lower. Individual logistic regression demonstrated a significant category effect in only a few patients and healthy controls. No significant qualitative changes were observed in patients over time. Our results confirm the importance of NVs as predictors of AD patient performance, suggesting that semantic domain is not a useful predictor of the progressive deterioration in AD. Copyright © 2011 Elsevier Inc. All rights reserved.
Measures of health, fitness, and functional movement among firefighter recruits.
Cornell, David J; Gnacinski, Stacy L; Zamzow, Aaron; Mims, Jason; Ebersole, Kyle T
2017-06-01
The purpose of this study was to examine the associations between various health and fitness measures and Functional Movement Screen™ (FMS™) scores among 78 firefighter recruits. Relationships between FMS™ scores and age, body mass index (BMI), sit and reach (S&R) distance, estimated maximal aerobic capacity (V̇O2max), estimated one-repetition maximum squat (1RM-Squat max), and plank endurance (%Plank max) were examined. Total FMS™ scores were significantly correlated with BMI (r = -0.231, p = 0.042), estimated 1RM-Squat max (r = 0.302, p = 0.007), and %Plank max (r = 0.320, p = 0.004). Multiple regression analyses indicated that this combination of predictors significantly predicted (F(3, 74) = 5.043, p = 0.003) Total FMS™ score outcomes and accounted for 17% of the total variance (R² = 0.170). In addition, logistic regression analyses indicated that estimated 1RM-Squat max also significantly predicted (χ² = 6.662, df = 1, p = 0.010) FMS™ group membership (≤14 or ≥15). These results suggest that the health and fitness measures of obesity (BMI), bilateral lower extremity strength (estimated 1RM-Squat max), and core muscular endurance (%Plank max) are significantly associated with functional movement patterns among firefighter recruits. Consequently, injury prevention programs implemented among firefighter recruits should target these aspects of health and fitness.
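The overall F statistic of a multiple regression follows from R², the number of predictors k, and the sample size n. The sketch below applies the standard identity F = (R²/k) / ((1 − R²)/(n − k − 1)) to the figures quoted above, assuming an ordinary least squares model with an intercept; the function name is illustrative.

```python
def f_from_r2(r2, k, n):
    """Overall F statistic and degrees of freedom for an OLS model with intercept."""
    df1, df2 = k, n - k - 1
    f_stat = (r2 / df1) / ((1 - r2) / df2)
    return f_stat, df1, df2

# R^2 = 0.170 with 3 predictors and 78 recruits, as reported in the abstract.
f_stat, df1, df2 = f_from_r2(0.170, 3, 78)
print(f"F({df1}, {df2}) = {f_stat:.2f}")
```

The degrees of freedom come out as (3, 74), and the F value falls near the reported 5.043; the small gap is what one would expect from R² being rounded to three decimals.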
Severity of specific language impairment predicts delayed development in number skills
Durkin, Kevin; Mok, Pearl L. H.; Conti-Ramsden, Gina
2013-01-01
The extent to which mathematical development is dependent upon language is controversial. This longitudinal study investigates the role of language ability in children's development of number skills. Participants were 229 children with specific language impairment (SLI) who were assessed initially at age 7 and again 1 year later. All participants completed measures of psycholinguistic development (expressive and receptive), performance IQ, and the Basic Number Skills subtest of the British Ability Scales. Number skills data for this sample were compared with normative population data. Consistent with predictions that language impairment would impact on numerical development, average standard scores were more than 1 SD below the population mean at both ages. Although the children showed improvements in raw scores at the second wave of the study, the discrepancy between their scores and the population data nonetheless increased over time. Regression analyses showed that, after controlling for the effect of PIQ, language skills explained an additional 19 and 17% of the variance in number skills for ages 7 and 8, respectively. Furthermore, logistic regression analyses revealed that less improvement in the child's language ability over the course of the year was associated with a greater odds of a drop in performance in basic number skills from 7 to 8 years. The results are discussed in relation to the interaction of linguistic and cognitive factors in numerical development and the implications for mathematical education. PMID:24027548
Lei, Yang; Nollen, Nikki; Ahluwahlia, Jasjit S; Yu, Qing; Mayo, Matthew S
2015-04-09
Other forms of tobacco use are increasing in prevalence, yet most tobacco control efforts are aimed at cigarettes. In light of this, it is important to identify individuals who are using both cigarettes and alternative tobacco products (ATPs). Most previous studies have used regression models. We conducted a traditional logistic regression model and a classification and regression tree (CART) model to illustrate and discuss the added advantages of using CART in the setting of identifying high-risk subgroups of ATP users among cigarettes smokers. The data were collected from an online cross-sectional survey administered by Survey Sampling International between July 5, 2012 and August 15, 2012. Eligible participants self-identified as current smokers, African American, White, or Latino (of any race), were English-speaking, and were at least 25 years old. The study sample included 2,376 participants and was divided into independent training and validation samples for a hold out validation. Logistic regression and CART models were used to examine the important predictors of cigarettes + ATP users. The logistic regression model identified nine important factors: gender, age, race, nicotine dependence, buying cigarettes or borrowing, whether the price of cigarettes influences the brand purchased, whether the participants set limits on cigarettes per day, alcohol use scores, and discrimination frequencies. The C-index of the logistic regression model was 0.74, indicating good discriminatory capability. The model performed well in the validation cohort also with good discrimination (c-index = 0.73) and excellent calibration (R-square = 0.96 in the calibration regression). The parsimonious CART model identified gender, age, alcohol use score, race, and discrimination frequencies to be the most important factors. It also revealed interesting partial interactions. The c-index is 0.70 for the training sample and 0.69 for the validation sample. 
The misclassification rate was 0.342 for the training sample and 0.346 for the validation sample. The CART model was easier to interpret and discovered target populations that possess clinical significance. This study suggests that the non-parametric CART model is parsimonious, potentially easier to interpret, and provides additional information in identifying the subgroups at high risk of ATP use among cigarette smokers.
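For a binary outcome, the c-index reported above is the proportion of (case, non-case) pairs in which the case receives the higher predicted risk, with ties counted as one half. A minimal sketch, using hypothetical scores and labels and not the study's data:

```python
def c_index(scores, outcomes):
    """Concordance index for binary outcomes (1 = event, 0 = no event)."""
    cases = [s for s, y in zip(scores, outcomes) if y == 1]
    controls = [s for s, y in zip(scores, outcomes) if y == 0]
    if not cases or not controls:
        raise ValueError("need at least one case and one control")
    concordant = 0.0
    for s_case in cases:
        for s_ctrl in controls:
            if s_case > s_ctrl:
                concordant += 1.0
            elif s_case == s_ctrl:
                concordant += 0.5  # ties count as half-concordant
    return concordant / (len(cases) * len(controls))

# Hypothetical predicted probabilities and observed outcomes.
c = c_index([0.9, 0.8, 0.4, 0.35, 0.2], [1, 0, 1, 0, 0])
print(round(c, 3))
```

A value of 0.5 corresponds to chance-level ranking and 1.0 to perfect separation, which is the scale on which the 0.74 and 0.70 figures above are read.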
Moderation analysis using a two-level regression model.
Yuan, Ke-Hai; Cheng, Ying; Maxwell, Scott
2014-10-01
Moderation analysis is widely used in social and behavioral research. The most commonly used model for moderation analysis is moderated multiple regression (MMR) in which the explanatory variables of the regression model include product terms, and the model is typically estimated by least squares (LS). This paper argues for a two-level regression model in which the regression coefficients of a criterion variable on predictors are further regressed on moderator variables. An algorithm for estimating the parameters of the two-level model by normal-distribution-based maximum likelihood (NML) is developed. Formulas for the standard errors (SEs) of the parameter estimates are provided and studied. Results indicate that, when heteroscedasticity exists, NML with the two-level model gives more efficient and more accurate parameter estimates than the LS analysis of the MMR model. When error variances are homoscedastic, NML with the two-level model leads to essentially the same results as LS with the MMR model. Most importantly, the two-level regression model permits estimating the percentage of variance of each regression coefficient that is due to moderator variables. When applied to data from General Social Surveys 1991, NML with the two-level model identified a significant moderation effect of race on the regression of job prestige on years of education while LS with the MMR model did not. An R package is also developed and documented to facilitate the application of the two-level model.
Akkus, Zeki; Camdeviren, Handan; Celik, Fatma; Gur, Ali; Nas, Kemal
2005-09-01
To determine the risk factors of osteoporosis using a multiple binary logistic regression method and to assess the risk variables for osteoporosis, which is a major and growing health problem in many countries. We conducted a case-control study, consisting of 126 postmenopausal healthy women as the control group and 225 postmenopausal osteoporotic women as the case group. The study was carried out in the Department of Physical Medicine and Rehabilitation, Dicle University, Diyarbakir, Turkey between 1999 and 2002. The data from the 351 participants were collected using a standard questionnaire that contains 43 variables. A multiple logistic regression model was then used to evaluate the data and to find the best regression model. The regression model correctly classified 80.1% (281/351) of the participants. Furthermore, the specificity of the model was 67% (84/126) in the control group, while the sensitivity was 88% (197/225) in the case group. We found the distribution of the standardized residuals for the final model to be exponential using the Kolmogorov-Smirnov test (p=0.193). The receiver operating characteristic curve was found successful in predicting patients at risk for osteoporosis. This study suggests that low levels of dietary calcium intake, physical activity, education, and longer duration of menopause are independent predictors of the risk of low bone density in our population. Adequate dietary calcium intake in combination with maintaining daily physical activity, increasing educational level, decreasing birth rate, and duration of breast-feeding may contribute to healthy bones and play a role in practical prevention of osteoporosis in Southeast Anatolia. In addition, the findings of the present study indicate that the use of a multivariate statistical method such as multiple logistic regression in osteoporosis, which may be influenced by many variables, is better than univariate statistical evaluation.
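The classification figures quoted above follow from the four cells of a 2 × 2 confusion table. The sketch below reconstructs those cells from the counts in the abstract (197 of 225 cases and 84 of 126 controls correctly classified); the variable names are illustrative.

```python
# Counts taken from the abstract: 225 cases, 126 controls.
true_pos = 197               # osteoporotic women classified as cases
false_neg = 225 - true_pos   # cases the model missed
true_neg = 84                # healthy women classified as controls
false_pos = 126 - true_neg   # controls flagged as cases

sensitivity = true_pos / (true_pos + false_neg)
specificity = true_neg / (true_neg + false_pos)
accuracy = (true_pos + true_neg) / (true_pos + false_neg + true_neg + false_pos)

print(f"sensitivity = {sensitivity:.1%}")
print(f"specificity = {specificity:.1%}")
print(f"accuracy    = {accuracy:.1%}")
```

The overall accuracy is a weighted mix of the two rates, so with cases outnumbering controls it sits closer to the sensitivity than to the specificity.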
Shi, K-Q; Zhou, Y-Y; Yan, H-D; Li, H; Wu, F-L; Xie, Y-Y; Braddock, M; Lin, X-Y; Zheng, M-H
2017-02-01
At present, there is no ideal model for predicting the short-term outcome of patients with acute-on-chronic hepatitis B liver failure (ACHBLF). This study aimed to establish and validate a prognostic model by using the classification and regression tree (CART) analysis. A total of 1047 patients from two separate medical centres with suspected ACHBLF were screened in the study, which were recognized as derivation cohort and validation cohort, respectively. CART analysis was applied to predict the 3-month mortality of patients with ACHBLF. The accuracy of the CART model was tested using the area under the receiver operating characteristic curve, which was compared with the model for end-stage liver disease (MELD) score and a new logistic regression model. CART analysis identified four variables as prognostic factors of ACHBLF: total bilirubin, age, serum sodium and INR, and three distinct risk groups: low risk (4.2%), intermediate risk (30.2%-53.2%) and high risk (81.4%-96.9%). The new logistic regression model was constructed with four independent factors, including age, total bilirubin, serum sodium and prothrombin activity by multivariate logistic regression analysis. The performances of the CART model (0.896), similar to the logistic regression model (0.914, P=.382), exceeded that of MELD score (0.667, P<.001). The results were confirmed in the validation cohort. We have developed and validated a novel CART model superior to MELD for predicting three-month mortality of patients with ACHBLF. Thus, the CART model could facilitate medical decision-making and provide clinicians with a validated practical bedside tool for ACHBLF risk stratification. © 2016 John Wiley & Sons Ltd.
Arevalillo, Jorge M; Sztein, Marcelo B; Kotloff, Karen L; Levine, Myron M; Simon, Jakub K
2017-10-01
Immunologic correlates of protection are important in vaccine development because they give insight into mechanisms of protection, assist in the identification of promising vaccine candidates, and serve as endpoints in bridging clinical vaccine studies. Our goal is the development of a methodology to identify immunologic correlates of protection using the Shigella challenge as a model. The proposed methodology utilizes the Random Forests (RF) machine learning algorithm as well as Classification and Regression Trees (CART) to detect immune markers that predict protection, identify interactions between variables, and define optimal cutoffs. Logistic regression modeling is applied to estimate the probability of protection and the confidence interval (CI) for such a probability is computed by bootstrapping the logistic regression models. The results demonstrate that the combination of Classification and Regression Trees and Random Forests complements the standard logistic regression and uncovers subtle immune interactions. Specific levels of immunoglobulin IgG antibody in blood on the day of challenge predicted protection in 75% (95% CI 67-86). Of those subjects that did not have blood IgG at or above a defined threshold, 100% were protected if they had IgA antibody secreting cells above a defined threshold. Comparison with the results obtained by applying only logistic regression modeling with standard Akaike Information Criterion for model selection shows the usefulness of the proposed method. Given the complexity of the immune system, the use of machine learning methods may enhance traditional statistical approaches. When applied together, they offer a novel way to quantify important immune correlates of protection that may help the development of vaccines. Copyright © 2017 Elsevier Inc. All rights reserved.
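The abstract describes bootstrapping fitted logistic regression models to obtain a confidence interval for the probability of protection. As a simplified stand-in, the sketch below applies the same percentile-bootstrap idea to a plain proportion instead of a refitted model; the data (30 protected out of 40), the seed, and the function name are all hypothetical.

```python
import random

def bootstrap_ci(outcomes, n_boot=2000, alpha=0.05, seed=0):
    """Percentile bootstrap CI for the mean of a list of 0/1 outcomes."""
    rng = random.Random(seed)
    n = len(outcomes)
    # Resample with replacement and recompute the proportion each time.
    estimates = sorted(sum(rng.choices(outcomes, k=n)) / n for _ in range(n_boot))
    lower = estimates[int((alpha / 2) * n_boot)]
    upper = estimates[int((1 - alpha / 2) * n_boot) - 1]
    return lower, upper

# Hypothetical challenge study: 1 = protected, 0 = not protected.
outcomes = [1] * 30 + [0] * 10
point = sum(outcomes) / len(outcomes)
lower, upper = bootstrap_ci(outcomes)
print(f"protected: {point:.2f} (95% CI {lower:.2f}-{upper:.2f})")
```

In the model-based version, each resample would be used to refit the logistic regression and recompute the predicted probability at the covariate values of interest; the percentile step is unchanged.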
NASA Astrophysics Data System (ADS)
Schaeben, Helmut; Semmler, Georg
2016-09-01
The objective of prospectivity modeling is prediction of the conditional probability of the presence T = 1 or absence T = 0 of a target T given favorable or prohibitive predictors B, or construction of a two classes 0,1 classification of T. A special case of logistic regression called weights-of-evidence (WofE) is geologists' favorite method of prospectivity modeling due to its apparent simplicity. However, the numerical simplicity is deceiving as it is implied by the severe mathematical modeling assumption of joint conditional independence of all predictors given the target. General weights of evidence are explicitly introduced which are as simple to estimate as conventional weights, i.e., by counting, but do not require conditional independence. Complementary to the regression view is the classification view on prospectivity modeling. Boosting is the construction of a strong classifier from a set of weak classifiers. From the regression point of view it is closely related to logistic regression. Boost weights-of-evidence (BoostWofE) was introduced into prospectivity modeling to counterbalance violations of the assumption of conditional independence even though relaxation of modeling assumptions with respect to weak classifiers was not the (initial) purpose of boosting. In the original publication of BoostWofE a fabricated dataset was used to "validate" this approach. Using the same fabricated dataset it is shown that BoostWofE cannot generally compensate lacking conditional independence whatever the consecutively processing order of predictors. Thus the alleged features of BoostWofE are disproved by way of counterexamples, while theoretical findings are confirmed that logistic regression including interaction terms can exactly compensate violations of joint conditional independence if the predictors are indicators.
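Conventional weights of evidence for a single indicator predictor B can be estimated by counting, as the abstract notes. The sketch below uses the usual definitions W⁺ = ln[P(B=1|T=1)/P(B=1|T=0)] and W⁻ = ln[P(B=0|T=1)/P(B=0|T=0)]; the cell counts are hypothetical. With several predictors, summing their weights is only justified under the joint conditional independence assumption that the abstract criticizes.

```python
import math

def weights_of_evidence(n11, n10, n01, n00):
    """Weights for one indicator B from counts n[b][t] (n11: B=1 and T=1, etc.)."""
    p_b1_given_t1 = n11 / (n11 + n01)
    p_b1_given_t0 = n10 / (n10 + n00)
    w_plus = math.log(p_b1_given_t1 / p_b1_given_t0)
    w_minus = math.log((1 - p_b1_given_t1) / (1 - p_b1_given_t0))
    return w_plus, w_minus

def posterior_probability(prior_logit, weights):
    """Add weights to the prior logit and map back to a probability."""
    logit = prior_logit + sum(weights)
    return 1.0 / (1.0 + math.exp(-logit))

# Hypothetical counts over 1000 cells: 50 contain the target, 100 have B present.
n11, n10, n01, n00 = 30, 70, 20, 880
w_plus, w_minus = weights_of_evidence(n11, n10, n01, n00)
prior_logit = math.log((n11 + n01) / (n10 + n00))
p_present = posterior_probability(prior_logit, [w_plus])
print(f"W+ = {w_plus:.3f}, W- = {w_minus:.3f}, P(T=1 | B=1) = {p_present:.3f}")
```

With a single predictor the result reproduces the empirical conditional frequency n11 / (n11 + n10) exactly; differences between WofE and logistic regression with interaction terms arise only once several dependent predictors are combined.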
ERIC Educational Resources Information Center
Berenson, Mark L.
2013-01-01
There is consensus in the statistical literature that severe departures from its assumptions invalidate the use of regression modeling for purposes of inference. The assumptions of regression modeling are usually evaluated subjectively through visual, graphic displays in a residual analysis but such an approach, taken alone, may be insufficient…
Vandenplas, J; Bastin, C; Gengler, N; Mulder, H A
2013-09-01
Animals that are robust to environmental changes are desirable in the current dairy industry. Genetic differences in micro-environmental sensitivity can be studied through heterogeneity of residual variance between animals. However, residual variance between animals is usually assumed to be homogeneous in traditional genetic evaluations. The aim of this study was to investigate genetic heterogeneity of residual variance by estimating variance components in residual variance for milk yield, somatic cell score, contents in milk (g/dL) of 2 groups of milk fatty acids (i.e., saturated and unsaturated fatty acids), and the content in milk of one individual fatty acid (i.e., oleic acid, C18:1 cis-9), for first-parity Holstein cows in the Walloon Region of Belgium. A total of 146,027 test-day records from 26,887 cows in 747 herds were available. All cows had at least 3 records and a known sire. These sires had at least 10 cows with records and each herd × test-day had at least 5 cows. The 5 traits were analyzed separately based on fixed lactation curve and random regression test-day models for the mean. Estimation of variance components was performed by iteratively running an expectation maximization-REML algorithm implemented through double hierarchical generalized linear models. Based on fixed lactation curve test-day mean models, heritability for residual variances ranged between 1.01×10⁻³ and 4.17×10⁻³ for all traits. The genetic standard deviation in residual variance (i.e., approximately the genetic coefficient of variation of residual variance) ranged between 0.12 and 0.17. Therefore, some genetic variance in micro-environmental sensitivity existed in the Walloon Holstein dairy cattle for the 5 studied traits. The standard deviations due to herd × test-day and permanent environment in residual variance ranged between 0.36 and 0.45 for herd × test-day effect and between 0.55 and 0.97 for permanent environmental effect.
Therefore, nongenetic effects also contributed substantially to micro-environmental sensitivity. Addition of random regressions to the mean model did not reduce heterogeneity in residual variance, indicating that genetic heterogeneity of residual variance was not simply an effect of an incomplete mean model. Copyright © 2013 American Dairy Science Association. Published by Elsevier Inc. All rights reserved.
Separation in Logistic Regression: Causes, Consequences, and Control.
Mansournia, Mohammad Ali; Geroldinger, Angelika; Greenland, Sander; Heinze, Georg
2018-04-01
Separation is encountered in regression models with a discrete outcome (such as logistic regression) where the covariates perfectly predict the outcome. It is most frequent under the same conditions that lead to small-sample and sparse-data bias, such as presence of a rare outcome, rare exposures, highly correlated covariates, or covariates with strong effects. In theory, separation will produce infinite estimates for some coefficients. In practice, however, separation may be unnoticed or mishandled because of software limits in recognizing and handling the problem and in notifying the user. We discuss causes of separation in logistic regression and describe how common software packages deal with it. We then describe methods that remove separation, focusing on the same penalized-likelihood techniques used to address more general sparse-data problems. These methods improve accuracy, avoid software problems, and allow interpretation as Bayesian analyses with weakly informative priors. We discuss likelihood penalties, including some that can be implemented easily with any software package, and their relative advantages and disadvantages. We provide an illustration of ideas and methods using data from a case-control study of contraceptive practices and urinary tract infection.
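Why separation yields infinite estimates can be illustrated with a small sketch. The toy data, the no-intercept model, and the function names below are assumptions made for illustration and do not come from the case-control study. When a covariate perfectly splits the outcomes, the log-likelihood keeps rising as the coefficient grows, so no finite maximum exists.

```python
import math

def log_likelihood(beta, xs, ys):
    """Log-likelihood of a no-intercept logistic model, P(y=1|x) = 1/(1+exp(-beta*x))."""
    total = 0.0
    for x, y in zip(xs, ys):
        z = beta * x
        # Numerically stable log(sigmoid(z)).
        if z >= 0:
            log_p1 = -math.log1p(math.exp(-z))
        else:
            log_p1 = z - math.log1p(math.exp(z))
        log_p0 = log_p1 - z  # log(1 - sigmoid(z))
        total += log_p1 if y == 1 else log_p0
    return total

def is_completely_separated(xs, ys):
    """True if every x with y=0 lies strictly below every x with y=1."""
    x0 = [x for x, y in zip(xs, ys) if y == 0]
    x1 = [x for x, y in zip(xs, ys) if y == 1]
    return max(x0) < min(x1)

# Hypothetical toy data: the sign of x perfectly predicts y.
xs = [-2.0, -1.0, -0.5, 0.5, 1.0, 2.0]
ys = [0, 0, 0, 1, 1, 1]

betas = [1.0, 5.0, 25.0, 125.0]
lls = [log_likelihood(b, xs, ys) for b in betas]
print(is_completely_separated(xs, ys), lls)
```

The log-likelihood approaches zero from below as the coefficient increases, which is why an iterative fitting routine may drift toward very large estimates or stop at an arbitrary point. A likelihood penalty adds a term that eventually outweighs this slow gain and so restores a finite maximum.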
NASA Astrophysics Data System (ADS)
Nong, Yu; Du, Qingyun; Wang, Kun; Miao, Lei; Zhang, Weiwei
2008-10-01
Urban growth modeling, one of the most important aspects of land use and land cover change study, has attracted substantial attention because it helps explain the mechanisms of land use change and thus supports the making of relevant policies. This study applied multinomial logistic regression to model urban growth in the Jiayu county of Hubei province, China, to discover the relationship between urban growth and its driving forces, with biophysical and socioeconomic factors selected as independent variables. This type of regression is similar to binary logistic regression, but it is more general because the dependent variable is not restricted to two categories, as it was in previous studies. The multinomial model can simulate the process of multiple land use competition between urban land, bare land, cultivated land and orchard land. Taking the urban land use type as the reference category, parameters could be estimated as odds ratios. A probability map is generated from the model to predict where urban growth will occur as a result of the computation.
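How a multinomial logit with a reference category turns coefficients into per-class probabilities and odds ratios can be sketched as below. The coefficients and the two predictors (distance to road, slope) are hypothetical placeholders, not values from the study. The only structure taken from the abstract is the four land use classes and the choice of urban land as reference.

```python
import math

def multinomial_probs(x, coefs, reference):
    """Class probabilities for a multinomial logit with one reference category.

    coefs maps each non-reference category to (intercept, [slopes]);
    the reference category has its linear predictor fixed at 0.
    """
    etas = {reference: 0.0}
    for category, (intercept, slopes) in coefs.items():
        etas[category] = intercept + sum(b * xi for b, xi in zip(slopes, x))
    shift = max(etas.values())  # subtract the maximum for numerical stability
    weights = {c: math.exp(eta - shift) for c, eta in etas.items()}
    total = sum(weights.values())
    return {c: w / total for c, w in weights.items()}

# Hypothetical coefficients for predictors [distance_to_road_km, slope_deg].
coefs = {
    "bare": (-0.5, [0.30, 0.02]),
    "cultivated": (0.2, [0.45, -0.05]),
    "orchard": (-1.0, [0.25, 0.04]),
}
cell = [2.0, 5.0]  # one hypothetical map cell
probs = multinomial_probs(cell, coefs, reference="urban")

# exp(slope) is the odds ratio of a category versus urban per unit of a predictor.
odds_ratios = {c: [math.exp(b) for b in slopes] for c, (_, slopes) in coefs.items()}
print(probs, odds_ratios)
```

Evaluating the urban probability for every cell of a raster would give a probability map of the kind the abstract describes.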
Gender Performance Differences in Biochemistry
ERIC Educational Resources Information Center
Rauschenberger, Matthew M.; Sweeder, Ryan D.
2010-01-01
This study examined the historical performance of students at Michigan State University in a two-part biochemistry series Biochem I (n = 5,900) and Biochem II (n = 5,214) for students enrolled from 1997 to 2009. Multiple linear regressions predicted 54.9-87.5% of the variance in student from Biochem I grade and 53.8-76.1% of the variance in…
Tay, Cheryl Sihui; Sterzing, Thorsten; Lim, Chen Yen; Ding, Rui; Kong, Pui Wah
2017-05-01
This study examined (a) the strength of four individual footwear perception factors to influence the overall preference of running shoes and (b) whether these perception factors satisfied the nonmulticollinear assumption in a regression model. Running footwear must fulfill multiple functional criteria to satisfy its potential users. Footwear perception factors, such as fit and cushioning, are commonly used to guide shoe design and development, but it is unclear whether running-footwear users are able to differentiate one factor from another. One hundred casual runners assessed four running shoes on a 15-cm visual analogue scale for four footwear perception factors (fit, cushioning, arch support, and stability) as well as for overall preference during a treadmill running protocol. Diagnostic tests showed an absence of multicollinearity between factors, where values for tolerance ranged from .36 to .72, corresponding to variance inflation factors of 2.8 to 1.4. The multiple regression model of these four footwear perception variables accounted for 77.7% to 81.6% of variance in overall preference, with each factor explaining a unique part of the total variance. Casual runners were able to rate each footwear perception factor separately, thus assigning each factor a true potential to improve overall preference for the users. The results also support the use of a multiple regression model of footwear perception factors to predict overall running shoe preference. Regression modeling is a useful tool for running-shoe manufacturers to more precisely evaluate how individual factors contribute to the subjective assessment of running footwear.
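The correspondence between the reported tolerance values and variance inflation factors follows from the standard definitions, tolerance = 1 - R² and VIF = 1 / tolerance, where R² comes from regressing one predictor on the others. A small sketch, with an illustrative function name:

```python
def vif_from_tolerance(tolerance):
    """Variance inflation factor as the reciprocal of tolerance (1 - R^2)."""
    if not 0 < tolerance <= 1:
        raise ValueError("tolerance must lie in (0, 1]")
    return 1.0 / tolerance

# Tolerance range reported in the abstract.
for tol in (0.36, 0.72):
    implied_r2 = 1.0 - tol
    print(tol, round(implied_r2, 2), round(vif_from_tolerance(tol), 1))
```

The lowest tolerance (.36) gives the highest VIF (about 2.8) and the highest tolerance (.72) the lowest (about 1.4), which is why the abstract lists the VIF range in descending order.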
Logistic Mixed Models to Investigate Implicit and Explicit Belief Tracking.
Lages, Martin; Scheel, Anne
2016-01-01
We investigated the proposition of a two-systems Theory of Mind in adults' belief tracking. A sample of N = 45 participants predicted the choice of one of two opponent players after observing several rounds in an animated card game. Three matches of this card game were played, and initial gaze direction on target and subsequent choice predictions were recorded for each belief task and participant. We conducted logistic regressions with mixed effects on the binary data and developed Bayesian logistic mixed models to infer implicit and explicit mentalizing in true belief and false belief tasks. Although logistic regressions with mixed effects predicted the data well, a Bayesian logistic mixed model with latent task- and subject-specific parameters gave a better account of the data. As expected, explicit choice predictions suggested a clear understanding of true and false beliefs (TB/FB). Surprisingly, however, model parameters for initial gaze direction also indicated belief tracking. We discuss why task-specific parameters for initial gaze directions are different from choice predictions yet reflect second-order perspective taking.
Model selection for logistic regression models
NASA Astrophysics Data System (ADS)
Duller, Christine
2012-09-01
Model selection for logistic regression models decides which of some given potential regressors have an effect and hence should be included in the final model. The second interesting question is whether a certain factor is heterogeneous among some subsets, i.e. whether the model should include a random intercept or not. In this paper these questions will be answered with classical as well as with Bayesian methods. The applications show some results of recent research projects in medicine and business administration.
Radiomorphometric analysis of frontal sinus for sex determination.
Verma, Saumya; Mahima, V G; Patil, Karthikeya
2014-09-01
Sex determination of unknown individuals carries crucial significance in forensic research, in cases where fragments of skull persist with no likelihood of identification based on the dental arch. In these instances sex determination becomes important to rule out a certain number of possibilities instantly and helps in establishing a biological profile of human remains. The aim of the study is to evaluate a mathematical method based on logistic regression analysis capable of ascertaining the sex of individuals in the South Indian population. The study was conducted in the department of Oral Medicine and Radiology. The right and left areas, maximum height, and width of the frontal sinus were determined in 100 Caldwell views of 50 women and 50 men aged 20 years and above, with the help of Vernier callipers and a square grid with 1 square measuring 1 mm² in area. Statistical analysis used Student's t-test and logistic regression analysis. The mean values of variables were greater in men, based on Student's t-test at 5% level of significance. The mathematical model based on logistic regression analysis gave percentage agreement of total area to correctly predict the female gender as 55.2%, of right area as 60.9% and of left area as 55.2%. The areas of the frontal sinus and the logistic regression proved to be unreliable in sex determination. (Logit = 0.924 - 0.00217 × right area).
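The reported equation can be turned into a predicted probability with the inverse logit. The sketch below makes two assumptions: that the right area is in mm² (following the grid description), and that the modeled event is one of the two sexes. The abstract does not say which sex is coded as the event, so the output is labeled only as the probability of the modeled category.

```python
import math

INTERCEPT = 0.924   # from the abstract
SLOPE = -0.00217    # per unit of right frontal sinus area, from the abstract

def predicted_probability(right_area):
    """Inverse logit of the reported linear predictor.

    Which sex the probability refers to is not stated in the abstract.
    """
    logit = INTERCEPT + SLOPE * right_area
    return 1.0 / (1.0 + math.exp(-logit))

# Area at which the logit is zero, i.e. the predicted probability is 0.5.
crossover_area = INTERCEPT / -SLOPE

# Hypothetical areas, chosen only to show the direction of the effect.
for area in (200.0, crossover_area, 600.0):
    print(round(area, 1), round(predicted_probability(area), 3))
```

The crossover lies near 426 area units. Predicted probabilities change only gradually with area around that point, which is in line with the modest agreement percentages reported.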