Sample size determination for logistic regression on a logit-normal distribution.
Kim, Seongho; Heath, Elisabeth; Heilbrun, Lance
2017-06-01
Although the sample size for simple logistic regression can be readily determined using currently available methods, the sample size calculation for multiple logistic regression requires some additional information, such as the coefficient of determination (R²) of a covariate of interest with other covariates, which is often unavailable in practice. The response variable of logistic regression follows a logit-normal distribution, which can be generated from a logistic transformation of a normal distribution. Using this property of logistic regression, we propose new methods of determining the sample size for simple and multiple logistic regressions using a normal transformation of outcome measures. Simulation studies and a motivating example show several advantages of the proposed methods over the existing methods: (i) no need for R² for multiple logistic regression, (ii) available interim or group-sequential designs, and (iii) much smaller required sample size.
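The logit-normal construction mentioned in this abstract can be illustrated with a minimal sketch. It is not the authors' sample size method: it only shows that applying the logistic function to normal draws gives values in (0, 1), and that the logit maps them back to the normal scale. The function names and the parameter values (mu = 0.5, sigma = 1.0) are assumptions chosen for illustration.

```python
import math
import random


def sample_logit_normal(mu, sigma, n, seed=0):
    """Draw n logit-normal values by passing N(mu, sigma^2) draws
    through the logistic (inverse-logit) function."""
    rng = random.Random(seed)
    return [1.0 / (1.0 + math.exp(-rng.gauss(mu, sigma))) for _ in range(n)]


def logit(p):
    """Inverse of the logistic function: maps (0, 1) back to the real line."""
    return math.log(p / (1.0 - p))


# Hypothetical parameters, chosen only for illustration.
draws = sample_logit_normal(mu=0.5, sigma=1.0, n=10000)

# Transforming back with the logit recovers values on the normal scale,
# so their sample mean should be close to mu.
back_transformed = [logit(p) for p in draws]
mean_back = sum(back_transformed) / len(back_transformed)
print(min(draws), max(draws), mean_back)
```

Working on the back-transformed (normal) scale is what lets normal-theory tools be applied to the outcome, which is the property the abstract refers to.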
MODELING SNAKE MICROHABITAT FROM RADIOTELEMETRY STUDIES USING POLYTOMOUS LOGISTIC REGRESSION
Multivariate analysis of snake microhabitat has historically used techniques that were derived under assumptions of normality and common covariance structure (e.g., discriminant function analysis, MANOVA). In this study, polytomous logistic regression (PLR which does not require ...
Two-factor logistic regression in pediatric liver transplantation
NASA Astrophysics Data System (ADS)
Uzunova, Yordanka; Prodanova, Krasimira; Spasov, Lyubomir
2017-12-01
Using a two-factor logistic regression analysis an estimate is derived for the probability of absence of infections in the early postoperative period after pediatric liver transplantation. The influence of both the bilirubin level and the international normalized ratio of prothrombin time of blood coagulation at the 5th postoperative day is studied.
NASA Astrophysics Data System (ADS)
Pradhan, Biswajeet
2010-05-01
This paper presents the results of the cross-validation of a multivariate logistic regression model using remote sensing data and GIS for landslide hazard analysis on the Penang, Cameron, and Selangor areas in Malaysia. Landslide locations in the study areas were identified by interpreting aerial photographs and satellite images, supported by field surveys. SPOT 5 and Landsat TM satellite imagery were used to map landcover and vegetation index, respectively. Maps of topography, soil type, lineaments and land cover were constructed from the spatial datasets. Ten factors which influence landslide occurrence, i.e., slope, aspect, curvature, distance from drainage, lithology, distance from lineaments, soil type, landcover, rainfall precipitation, and normalized difference vegetation index (NDVI), were extracted from the spatial database and the logistic regression coefficient of each factor was computed. Then the landslide hazard was analysed using the multivariate logistic regression coefficients derived not only from the data for the respective area but also using the logistic regression coefficients calculated from each of the other two areas (nine hazard maps in all) as a cross-validation of the model. For verification of the model, the results of the analyses were then compared with the field-verified landslide locations. Among the three cases of the application of logistic regression coefficients in the same study area, the case of Selangor based on the Selangor logistic regression coefficients showed the highest accuracy (94%), whereas Penang based on the Penang coefficients showed the lowest accuracy (86%). Similarly, among the six cases from the cross application of logistic regression coefficients in the other two areas, the case of Selangor based on the logistic coefficients of Cameron showed the highest (90%) prediction accuracy, whereas the case of Penang based on the Selangor logistic regression coefficients showed the lowest accuracy (79%).
Qualitatively, the cross application model yields reasonable results which can be used for preliminary landslide hazard mapping.
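The cross application described in this abstract, where coefficients fitted in one area are applied to the factor values of another, can be sketched as follows. This is only an illustration of the logistic formula p = 1 / (1 + exp(-z)); the intercept, the two coefficients and the cell values are hypothetical and are not taken from the paper.

```python
import math


def landslide_probability(intercept, coefs, factors):
    """Logistic model: z = b0 + sum(b_i * x_i), p = 1 / (1 + exp(-z))."""
    z = intercept + sum(coefs[name] * factors[name] for name in coefs)
    return 1.0 / (1.0 + math.exp(-z))


# Hypothetical coefficients "fitted" in area A (illustrative values only).
intercept_a = -2.0
coefs_a = {"slope": 0.05, "ndvi": -1.2}

# Hypothetical factor values for one grid cell in area B.
cell_b = {"slope": 30.0, "ndvi": 0.4}

# Cross application: area A coefficients evaluated on an area B cell.
p = landslide_probability(intercept_a, coefs_a, cell_b)
print(round(p, 3))
```

Repeating this for every cell of an area, with each of the three coefficient sets, would give the nine hazard maps the abstract mentions.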
Seligman, D A; Pullinger, A G
2000-01-01
Confusion about the relationship of occlusion to temporomandibular disorders (TMD) persists. This study attempted to identify occlusal and attrition factors plus age that would characterize asymptomatic normal female subjects. A total of 124 female patients with intracapsular TMD were compared with 47 asymptomatic female controls for associations to 9 occlusal factors, 3 attrition severity measures, and age using classification tree, multiple stepwise logistic regression, and univariate analyses. Models were tested for accuracy (sensitivity and specificity) and total contribution to the variance. The classification tree model had 4 terminal nodes that used only anterior attrition and age. "Normals" were mainly characterized by low attrition levels, whereas patients had higher attrition and tended to be younger. The tree model was only moderately useful (sensitivity 63%, specificity 94%) in predicting normals. The logistic regression model incorporated unilateral posterior crossbite and mediotrusive attrition severity in addition to the 2 factors in the tree, but was slightly less accurate than the tree (sensitivity 51%, specificity 90%). When only occlusal factors were considered in the analysis, normals were additionally characterized by a lack of anterior open bite, smaller overjet, and smaller RCP-ICP slides. The log likelihood accounted for was similar for both the tree (pseudo R² = 29.38%; mean deviance = 0.95) and the multiple logistic regression (Cox Snell R² = 30.3%, mean deviance = 0.84) models. The occlusal and attrition factors studied were only moderately useful in differentiating normals from TMD patients.
Choi, Seung Hoan; Labadorf, Adam T; Myers, Richard H; Lunetta, Kathryn L; Dupuis, Josée; DeStefano, Anita L
2017-02-06
Next generation sequencing provides a count of RNA molecules in the form of short reads, yielding discrete, often highly non-normally distributed gene expression measurements. Although Negative Binomial (NB) regression has been generally accepted in the analysis of RNA sequencing (RNA-Seq) data, its appropriateness has not been exhaustively evaluated. We explore logistic regression as an alternative method for RNA-Seq studies designed to compare cases and controls, where disease status is modeled as a function of RNA-Seq reads using simulated and Huntington disease data. We evaluate the effect of adjusting for covariates that have an unknown relationship with gene expression. Finally, we incorporate the data adaptive method in order to compare false positive rates. When the sample size is small or the expression levels of a gene are highly dispersed, the NB regression shows inflated Type-I error rates but the Classical logistic and Bayes logistic (BL) regressions are conservative. Firth's logistic (FL) regression performs well or is slightly conservative. Large sample size and low dispersion generally make Type-I error rates of all methods close to nominal alpha levels of 0.05 and 0.01. However, Type-I error rates are controlled after applying the data adaptive method. The NB, BL, and FL regressions gain increased power with large sample size, large log2 fold-change, and low dispersion. The FL regression has comparable power to NB regression. We conclude that implementing the data adaptive method appropriately controls Type-I error rates in RNA-Seq analysis. Firth's logistic regression provides a concise statistical inference process and reduces spurious associations from inaccurately estimated dispersion parameters in the negative binomial framework.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Dean, Jamie A., E-mail: jamie.dean@icr.ac.uk; Wong, Kee H.; Gay, Hiram
Purpose: Current normal tissue complication probability modeling using logistic regression suffers from bias and high uncertainty in the presence of highly correlated radiation therapy (RT) dose data. This hinders robust estimates of dose-response associations and, hence, optimal normal tissue–sparing strategies from being elucidated. Using functional data analysis (FDA) to reduce the dimensionality of the dose data could overcome this limitation. Methods and Materials: FDA was applied to modeling of severe acute mucositis and dysphagia resulting from head and neck RT. Functional partial least squares regression (FPLS) and functional principal component analysis were used for dimensionality reduction of the dose-volume histogram data. The reduced dose data were input into functional logistic regression models (functional partial least squares–logistic regression [FPLS-LR] and functional principal component–logistic regression [FPC-LR]) along with clinical data. This approach was compared with penalized logistic regression (PLR) in terms of predictive performance and the significance of treatment covariate–response associations, assessed using bootstrapping. Results: The area under the receiver operating characteristic curve for the PLR, FPC-LR, and FPLS-LR models was 0.65, 0.69, and 0.67, respectively, for mucositis (internal validation) and 0.81, 0.83, and 0.83, respectively, for dysphagia (external validation). The calibration slopes/intercepts for the PLR, FPC-LR, and FPLS-LR models were 1.6/−0.67, 0.45/0.47, and 0.40/0.49, respectively, for mucositis (internal validation) and 2.5/−0.96, 0.79/−0.04, and 0.79/0.00, respectively, for dysphagia (external validation). The bootstrapped odds ratios indicated significant associations between RT dose and severe toxicity in the mucositis and dysphagia FDA models. Cisplatin was significantly associated with severe dysphagia in the FDA models.
None of the covariates was significantly associated with severe toxicity in the PLR models. Dose levels greater than approximately 1.0 Gy/fraction were most strongly associated with severe acute mucositis and dysphagia in the FDA models. Conclusions: FPLS and functional principal component analysis marginally improved predictive performance compared with PLR and provided robust dose-response associations. FDA is recommended for use in normal tissue complication probability modeling.
Dean, Jamie A; Wong, Kee H; Gay, Hiram; Welsh, Liam C; Jones, Ann-Britt; Schick, Ulrike; Oh, Jung Hun; Apte, Aditya; Newbold, Kate L; Bhide, Shreerang A; Harrington, Kevin J; Deasy, Joseph O; Nutting, Christopher M; Gulliford, Sarah L
2016-11-15
Current normal tissue complication probability modeling using logistic regression suffers from bias and high uncertainty in the presence of highly correlated radiation therapy (RT) dose data. This hinders robust estimates of dose-response associations and, hence, optimal normal tissue-sparing strategies from being elucidated. Using functional data analysis (FDA) to reduce the dimensionality of the dose data could overcome this limitation. FDA was applied to modeling of severe acute mucositis and dysphagia resulting from head and neck RT. Functional partial least squares regression (FPLS) and functional principal component analysis were used for dimensionality reduction of the dose-volume histogram data. The reduced dose data were input into functional logistic regression models (functional partial least squares-logistic regression [FPLS-LR] and functional principal component-logistic regression [FPC-LR]) along with clinical data. This approach was compared with penalized logistic regression (PLR) in terms of predictive performance and the significance of treatment covariate-response associations, assessed using bootstrapping. The area under the receiver operating characteristic curve for the PLR, FPC-LR, and FPLS-LR models was 0.65, 0.69, and 0.67, respectively, for mucositis (internal validation) and 0.81, 0.83, and 0.83, respectively, for dysphagia (external validation). The calibration slopes/intercepts for the PLR, FPC-LR, and FPLS-LR models were 1.6/-0.67, 0.45/0.47, and 0.40/0.49, respectively, for mucositis (internal validation) and 2.5/-0.96, 0.79/-0.04, and 0.79/0.00, respectively, for dysphagia (external validation). The bootstrapped odds ratios indicated significant associations between RT dose and severe toxicity in the mucositis and dysphagia FDA models. Cisplatin was significantly associated with severe dysphagia in the FDA models. None of the covariates was significantly associated with severe toxicity in the PLR models. 
Dose levels greater than approximately 1.0 Gy/fraction were most strongly associated with severe acute mucositis and dysphagia in the FDA models. FPLS and functional principal component analysis marginally improved predictive performance compared with PLR and provided robust dose-response associations. FDA is recommended for use in normal tissue complication probability modeling. Copyright © 2016 The Author(s). Published by Elsevier Inc. All rights reserved.
Identifying the Factors That Influence Change in SEBD Using Logistic Regression Analysis
ERIC Educational Resources Information Center
Camilleri, Liberato; Cefai, Carmel
2013-01-01
Multiple linear regression and ANOVA models are widely used in applications since they provide effective statistical tools for assessing the relationship between a continuous dependent variable and several predictors. However these models rely heavily on linearity and normality assumptions and they do not accommodate categorical dependent…
Lee, Seokho; Shin, Hyejin; Lee, Sang Han
2016-12-01
Alzheimer's disease (AD) is usually diagnosed by clinicians through cognitive and functional performance tests, with a potential risk of misdiagnosis. Since the progression of AD is known to cause structural changes in the corpus callosum (CC), the CC thickness can be used as a functional covariate in the AD classification problem for diagnosis. However, misclassified class labels negatively impact the classification performance. Motivated by AD-CC association studies, we propose a logistic regression for functional data classification that is robust to misdiagnosis or label noise. Specifically, our logistic regression model is constructed by adopting individual intercepts in the functional logistic regression model. This approach makes it possible to indicate which observations are possibly mislabeled and also leads to a robust and efficient classifier. An effective algorithm based on the MM algorithm provides simple closed-form update formulas. We test our method using synthetic datasets to demonstrate its superiority over an existing method, and apply it to differentiating patients with AD from healthy normals based on CC from MRI. © 2016, The International Biometric Society.
Comparison of Survival Models for Analyzing Prognostic Factors in Gastric Cancer Patients
Habibi, Danial; Rafiei, Mohammad; Chehrei, Ali; Shayan, Zahra; Tafaqodi, Soheil
2018-03-27
Objective: There are a number of models for determining risk factors for survival of patients with gastric cancer. This study was conducted to select the model showing the best fit with available data. Methods: Cox regression and parametric models (Exponential, Weibull, Gompertz, Log normal, Log logistic and Generalized Gamma) were utilized in unadjusted and adjusted forms to detect factors influencing mortality of patients. Comparisons were made with the Akaike Information Criterion (AIC) using STATA 13 and R 3.1.3 software. Results: The results of this study indicated that all parametric models outperform the Cox regression model. The Log normal, Log logistic and Generalized Gamma provided the best performance in terms of AIC values (179.2, 179.4 and 181.1, respectively). On unadjusted analysis, the results of the Cox regression and parametric models indicated stage, grade, largest diameter of metastatic nest, largest diameter of LM, number of involved lymph nodes and the largest ratio of metastatic nests to lymph nodes, to be variables influencing the survival of patients with gastric cancer. On adjusted analysis, according to the best model (log normal), grade was found to be the significant variable. Conclusion: The results suggested that all parametric models outperform the Cox model. The log normal model provides the best fit and is a good substitute for Cox regression.
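The AIC comparison in this abstract can be sketched in a few lines. The formula AIC = 2k - 2 ln(L) is the standard definition; the three AIC values are the ones reported in the abstract, while the log-likelihood and parameter count passed to `aic` are hypothetical, since the abstract does not report them.

```python
def aic(log_likelihood, n_params):
    """Akaike Information Criterion: 2k - 2 ln(L). Lower is better."""
    return 2 * n_params - 2 * log_likelihood


# Hypothetical inputs, only to show the formula at work.
example_aic = aic(log_likelihood=-100.0, n_params=3)

# AIC values as reported in the abstract for the three best models.
reported_aic = {
    "Log normal": 179.2,
    "Log logistic": 179.4,
    "Generalized Gamma": 181.1,
}

# The preferred model is the one with the smallest AIC.
best_model = min(reported_aic, key=reported_aic.get)

# Differences from the best model show how close the candidates are.
delta_aic = {
    name: round(value - reported_aic[best_model], 1)
    for name, value in reported_aic.items()
}
print(best_model, delta_aic)
```

The small differences between the three reported values suggest the models fit almost equally well, which is consistent with the abstract listing all three as the best performers.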
Ordinal logistic regression analysis on the nutritional status of children in KarangKitri village
NASA Astrophysics Data System (ADS)
Ohyver, Margaretha; Yongharto, Kimmy Octavian
2015-09-01
Ordinal logistic regression is a statistical technique that can be used to describe the relationship between an ordinal response variable and one or more independent variables. This method has been used in various fields, including the health field. In this research, ordinal logistic regression is used to describe the relationship between the nutritional status of children and age, gender, height, and family status. Nutritional status of children in this research is divided into over nutrition, well nutrition, less nutrition, and malnutrition. The purpose of this research is to describe the characteristics of children in KarangKitri village and to determine the factors that influence their nutritional status. Three findings were obtained from this research. First, there are still children who are not categorized as having well nutritional status. Second, there are children from families at a sufficient economic level who are not in the normal status. Third, the factors that affect the nutritional level of children are age, family status, and height.
NASA Astrophysics Data System (ADS)
Priya, Mallika; Rao, Bola Sadashiva Satish; Chandra, Subhash; Ray, Satadru; Mathew, Stanley; Datta, Anirbit; Nayak, Subramanya G.; Mahato, Krishna Kishore
2016-02-01
In spite of many efforts toward early detection of breast cancer, there is still a lack of technology for immediate implementation. In the present study, the potential of photoacoustic spectroscopy was evaluated for discriminating breast cancer from normal using blood serum samples, with a view to early detection. Three photoacoustic spectra in the time domain were recorded from each of 20 normal and 20 malignant samples at 281 nm pulsed laser excitation, and a total of 120 spectra were generated. The time domain spectra were then Fast Fourier Transformed into the frequency domain, and the 116.5625 - 206.875 kHz region was selected for further analysis using a combinational approach of wavelet, PCA and logistic regression. Initially, wavelet analysis was performed on the FFT data and seven features (mean, median, area under the curve, variance, standard deviation, skewness and kurtosis) were extracted from each. PCA was then performed on the feature matrix (7x120) for discriminating malignant samples from the normal by plotting a decision boundary using logistic regression analysis. The unsupervised mode of classification used in the present study yielded specificity and sensitivity values of 100% each, with a ROC - AUC value of 1. The results obtained have clearly demonstrated the capability of photoacoustic spectroscopy in discriminating cancer from the normal, suggesting its possible clinical implications.
A general framework for the use of logistic regression models in meta-analysis.
Simmonds, Mark C; Higgins, Julian Pt
2016-12-01
Where individual participant data are available for every randomised trial in a meta-analysis of dichotomous event outcomes, "one-stage" random-effects logistic regression models have been proposed as a way to analyse these data. Such models can also be used even when individual participant data are not available and we have only summary contingency table data. One benefit of this one-stage regression model over conventional meta-analysis methods is that it maximises the correct binomial likelihood for the data and so does not require the common assumption that effect estimates are normally distributed. A second benefit of using this model is that it may be applied, with only minor modification, in a range of meta-analytic scenarios, including meta-regression, network meta-analyses and meta-analyses of diagnostic test accuracy. This single model can potentially replace the variety of often complex methods used in these areas. This paper considers, with a range of meta-analysis examples, how random-effects logistic regression models may be used in a number of different types of meta-analyses. This one-stage approach is compared with widely used meta-analysis methods including Bayesian network meta-analysis and the bivariate and hierarchical summary receiver operating characteristic (ROC) models for meta-analyses of diagnostic test accuracy. © The Author(s) 2014.
Smith, Vanessa; Riccieri, Valeria; Pizzorni, Carmen; Decuman, Saskia; Deschepper, Ellen; Bonroy, Carolien; Sulli, Alberto; Piette, Yves; De Keyser, Filip; Cutolo, Maurizio
2013-12-01
Assessment of associations of nailfold videocapillaroscopy (NVC) scleroderma (systemic sclerosis; SSc) patterns ("early," "active," and "late") with novel future severe clinical involvement in 2 independent cohorts. Sixty-six consecutive Belgian and 82 Italian patients with SSc underwent NVC at baseline. Images were blindly assessed and classified into normal, early, active, or late NVC pattern. Clinical evaluation was performed for 9 organ systems (general, peripheral vascular, skin, joint, muscle, gastrointestinal tract, lung, heart, and kidney) according to the Medsger disease severity scale (DSS) at baseline and in the future (18-24 months of follow-up). Severe clinical involvement was defined as category 2 to 4 per organ of the DSS. Logistic regression analysis (continuous NVC predictor variable) was performed. The OR to develop novel future severe organ involvement was stronger according to more severe NVC patterns and similar in both cohorts. In simple logistic regression analysis the OR in the Belgian/Italian cohort was 2.16 (95% CI 1.19-4.47, p = 0.010)/2.33 (95% CI 1.36-4.22, p = 0.002) for the early NVC SSc pattern, 4.68/5.42 for the active pattern, and 10.14/12.63 for the late pattern versus the normal pattern. In multiple logistic regression analysis, adjusting for disease duration, subset, and vasoactive medication, the OR was 2.99 (95% CI 1.31-8.82, p = 0.007)/1.88 (95% CI 1.00-3.71, p = 0.050) for the early NVC SSc pattern, 8.93/3.54 for the active pattern, and 26.69/6.66 for the late pattern versus the normal pattern. Capillaroscopy may be predictive of novel future severe organ involvement in SSc, as attested by 2 independent cohorts.
McLaren, Christine E.; Chen, Wen-Pin; Nie, Ke; Su, Min-Ying
2009-01-01
Rationale and Objectives: Dynamic contrast enhanced MRI (DCE-MRI) is a clinical imaging modality for detection and diagnosis of breast lesions. Analytical methods were compared for diagnostic feature selection and performance of lesion classification to differentiate between malignant and benign lesions in patients. Materials and Methods: The study included 43 malignant and 28 benign histologically-proven lesions. Eight morphological parameters, ten gray level co-occurrence matrices (GLCM) texture features, and fourteen Laws’ texture features were obtained using automated lesion segmentation and quantitative feature extraction. Artificial neural network (ANN) and logistic regression analysis were compared for selection of the best predictors of malignant lesions among the normalized features. Results: Using ANN, the final four selected features were compactness, energy, homogeneity, and Law_LS, with area under the receiver operating characteristic curve (AUC) = 0.82, and accuracy = 0.76. The diagnostic performance of these 4 features computed on the basis of logistic regression yielded AUC = 0.80 (95% CI, 0.688 to 0.905), similar to that of ANN. The analysis also shows that the odds of a malignant lesion decreased by 48% (95% CI, 25% to 92%) for every increase of 1 SD in the Law_LS feature, adjusted for differences in compactness, energy, and homogeneity. Using logistic regression with z-score transformation, a model comprised of compactness, NRL entropy, and gray level sum average was selected, and it had the highest overall accuracy of 0.75 among all models, with AUC = 0.77 (95% CI, 0.660 to 0.880). When logistic modeling of transformations using the Box-Cox method was performed, the most parsimonious model with predictors, compactness and Law_LS, had an AUC of 0.79 (95% CI, 0.672 to 0.898). Conclusion: The diagnostic performance of models selected by ANN and logistic regression was similar.
The analytic methods were found to be roughly equivalent in terms of predictive ability when a small number of variables were chosen. The robust ANN methodology utilizes a sophisticated non-linear model, while logistic regression analysis provides insightful information to enhance interpretation of the model features. PMID:19409817
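The statement that the odds of malignancy decreased by 48% per 1 SD increase in a feature follows from the usual reading of a logistic coefficient on a z-scored predictor: the odds ratio is exp(beta) and the percent change in odds is (exp(beta) - 1) x 100. The sketch below shows that arithmetic. The coefficient is back-calculated from the reported 48% purely for illustration, and the feature values are hypothetical.

```python
import math
import statistics


def z_scores(values):
    """Standardize values to mean 0 and sample standard deviation 1."""
    mean = statistics.mean(values)
    sd = statistics.stdev(values)
    return [(v - mean) / sd for v in values]


def percent_change_in_odds(beta):
    """Percent change in odds for a one-unit increase in the predictor."""
    return (math.exp(beta) - 1.0) * 100.0


# Hypothetical raw feature values, standardized so that one unit = 1 SD.
raw_feature = [0.8, 1.1, 1.4, 0.9, 1.6, 1.2]
standardized = z_scores(raw_feature)

# A 48% decrease in odds corresponds to an odds ratio of 0.52,
# that is, a coefficient of ln(0.52) on the standardized feature.
beta_per_sd = math.log(0.52)
change = percent_change_in_odds(beta_per_sd)
print(round(beta_per_sd, 3), round(change, 1))
```

Because the predictor is standardized first, the coefficient describes a 1 SD change regardless of the original measurement units, which makes features on different scales comparable.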
Concentration of folate in colorectal tissue biopsies predicts prevalence of adenomatous polyps
USDA-ARS's Scientific Manuscript database
Background and aims: Folate has been implicated as a potential aetiological factor for colorectal cancer. Previous research has not adequately exploited concentrations of folate in normal colonic mucosal biopsies to examine the issue. Methods: Logistic regression models were used to estimate ORs ...
Lee, Bum Ju; Kim, Keun Ho; Ku, Boncho; Jang, Jun-Su; Kim, Jong Yeol
2013-05-01
The body mass index (BMI) provides essential medical information related to body weight for the treatment and prognosis prediction of diseases such as cardiovascular disease, diabetes, and stroke. We propose a method for the prediction of normal, overweight, and obese classes based only on the combination of voice features that are associated with BMI status, independently of weight and height measurements. A total of 1568 subjects were divided into 4 groups according to age and gender differences. We performed statistical analyses by analysis of variance (ANOVA) and Scheffe test to find significant features in each group. We predicted BMI status (normal, overweight, and obese) by a logistic regression algorithm and two ensemble classification algorithms (bagging and random forests) based on statistically significant features. In the Female-2030 group (females aged 20-40 years), classification experiments using an imbalanced (original) data set gave area under the receiver operating characteristic curve (AUC) values of 0.569-0.731 by logistic regression, whereas experiments using a balanced data set gave AUC values of 0.893-0.994 by random forests. AUC values in Female-4050 (females aged 41-60 years), Male-2030 (males aged 20-40 years), and Male-4050 (males aged 41-60 years) groups by logistic regression in imbalanced data were 0.585-0.654, 0.581-0.614, and 0.557-0.653, respectively. AUC values in Female-4050, Male-2030, and Male-4050 groups in balanced data were 0.629-0.893 by bagging, 0.707-0.916 by random forests, and 0.695-0.854 by bagging, respectively. In each group, we found discriminatory features showing statistical differences among normal, overweight, and obese classes. The results showed that the classification models built by logistic regression in imbalanced data were better than those built by the other two algorithms, and significant features differed according to age and gender groups. 
Our results could support the development of BMI diagnosis tools for real-time monitoring; such tools are considered helpful in improving automated BMI status diagnosis in remote healthcare or telemedicine and are expected to have applications in forensic and medical science. Copyright © 2013 Elsevier B.V. All rights reserved.
Cao, Xia; Xie, Xiumei; Xu, Guo; Yuan, Hong; Chen, Zhiheng
2014-06-01
To investigate the relationship between high-normal blood pressure and chronic kidney disease (CKD) in an occupational physical examination population in Changsha. Using a convenience sampling method, a cross-sectional survey of a representative sample of 11 274 white collar workers was conducted in a large comprehensive hospital in Changsha between March 2011 and May 2011. All subjects were assigned to 4 groups: a normal blood pressure group, a high-normal blood pressure group, an undiagnosed hypertension group, and a diagnosed hypertension group. Anthropometry, blood pressure, blood samples and urine samples were measured with standard instruments and methodology for all the subjects. Multiple logistic regression analysis was used to identify risk factors for CKD. The prevalence of CKD in the normal blood pressure, high-normal blood pressure, undiagnosed hypertension, and diagnosed hypertension groups was 3.31%, 6.60%, 11.78%, and 17.35%, respectively. The prevalence of CKD in males was significantly higher than that in females (P<0.01). For males with high-normal blood pressure, the CKD risk was significantly greater (OR, 1.30; 95% CI: 1.03 - 1.63) than for those with optimal blood pressure. The logistic regression analysis showed that there was an additive effect of hyperuricemia on CKD risk in men with high-normal blood pressure compared with men with optimal blood pressure (OR, 2.25; 95% CI, 1.59 - 3.19; P<0.05). The prevalence of CKD in people with high-normal blood pressure is 6.60% in the occupational physical examination population in Changsha. Men with high-normal blood pressure are at high risk of CKD, and hyperuricemia is an independent risk factor.
Austin, Peter C; Steyerberg, Ewout W
2012-06-20
When outcomes are binary, the c-statistic (equivalent to the area under the Receiver Operating Characteristic curve) is a standard measure of the predictive accuracy of a logistic regression model. An analytical expression was derived under the assumption that a continuous explanatory variable follows a normal distribution in those with and without the condition. We then conducted an extensive set of Monte Carlo simulations to examine whether the expressions derived under the assumption of binormality allowed for accurate prediction of the empirical c-statistic when the explanatory variable followed a normal distribution in the combined sample of those with and without the condition. We also examined the accuracy of the predicted c-statistic when the explanatory variable followed a gamma, log-normal or uniform distribution in the combined sample of those with and without the condition. Under the assumption of binormality with equality of variances, the c-statistic follows a standard normal cumulative distribution function with dependence on the product of the standard deviation of the normal components (reflecting more heterogeneity) and the log-odds ratio (reflecting larger effects). Under the assumption of binormality with unequal variances, the c-statistic follows a standard normal cumulative distribution function with dependence on the standardized difference of the explanatory variable in those with and without the condition. In our Monte Carlo simulations, we found that these expressions allowed for reasonably accurate prediction of the empirical c-statistic when the distribution of the explanatory variable was normal, gamma, log-normal, and uniform in the entire sample of those with and without the condition. The discriminative ability of a continuous explanatory variable cannot be judged by its odds ratio alone, but always needs to be considered in relation to the heterogeneity of the population.
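The equal-variance case described in this abstract can be checked numerically. If the explanatory variable is N(mu0, sigma²) in those without the condition and N(mu1, sigma²) in those with it, the log-odds ratio per unit is beta = (mu1 - mu0) / sigma², and the c-statistic is Phi(sigma x beta / sqrt(2)). The sketch below assumes that standard binormal result and compares it with an empirical c-statistic from simulated data; the sample sizes and the values sigma = 1, beta = 1 are hypothetical.

```python
import math
import random


def phi(z):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def predicted_c(sigma, beta):
    """c-statistic under binormality with equal variances."""
    return phi(sigma * beta / math.sqrt(2.0))


def empirical_c(cases, controls):
    """Share of case-control pairs in which the case has the larger value
    (ties count one half)."""
    wins = 0.0
    for x1 in cases:
        for x0 in controls:
            if x1 > x0:
                wins += 1.0
            elif x1 == x0:
                wins += 0.5
    return wins / (len(cases) * len(controls))


# Hypothetical settings, chosen only for illustration.
sigma, beta, mu0 = 1.0, 1.0, 0.0
mu1 = mu0 + beta * sigma ** 2  # implied mean in those with the condition

rng = random.Random(42)
controls = [rng.gauss(mu0, sigma) for _ in range(500)]
cases = [rng.gauss(mu1, sigma) for _ in range(500)]

c_formula = predicted_c(sigma, beta)
c_simulated = empirical_c(cases, controls)
print(round(c_formula, 3), round(c_simulated, 3))
```

Doubling sigma while holding beta fixed raises the predicted c-statistic, which illustrates the abstract's closing point that an odds ratio alone does not determine discrimination.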
Classification of Effective Soil Depth by Using Multinomial Logistic Regression Analysis
NASA Astrophysics Data System (ADS)
Chang, C. H.; Chan, H. C.; Chen, B. A.
2016-12-01
Classification of effective soil depth is a task in determining the slopeland utilizable limitation in Taiwan. The "Slopeland Conservation and Utilization Act" categorizes slopeland into agriculture and husbandry land, land suitable for forestry and land for enhanced conservation according to factors including average slope, effective soil depth, soil erosion and parental rock. However, site investigation of the effective soil depth requires costly field work. This research aimed to classify the effective soil depth by using multinomial logistic regression with environmental factors. The Wen-Shui Watershed, located in central Taiwan, was selected as the study area. The multinomial logistic regression analysis was performed with the assistance of a Geographic Information System (GIS). The effective soil depth was categorized into four levels: deeper, deep, shallow and shallower. The environmental factors of slope, aspect, digital elevation model (DEM), curvature and normalized difference vegetation index (NDVI) were selected for classifying the soil depth. An error matrix was then used to assess the model accuracy. The results showed an overall accuracy of 75%. Finally, a map of effective soil depth was produced to help planners and decision makers in determining the slopeland utilizable limitation in the study area.
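Overall accuracy from an error matrix is the sum of the diagonal (correctly classified samples) divided by the total number of samples. The sketch below shows that calculation for the four depth classes named in the abstract. The counts are hypothetical; they were chosen so that the result equals the reported 75% and are not the study's data.

```python
CLASSES = ["deeper", "deep", "shallow", "shallower"]

# Hypothetical error matrix: rows are reference classes, columns are
# predicted classes, in the order given by CLASSES.
error_matrix = [
    [15, 3, 1, 1],
    [2, 20, 3, 0],
    [1, 4, 22, 3],
    [0, 1, 6, 18],
]


def overall_accuracy(matrix):
    """Correctly classified samples (diagonal) over all samples."""
    correct = sum(matrix[i][i] for i in range(len(matrix)))
    total = sum(sum(row) for row in matrix)
    return correct / total


def per_class_accuracy(matrix, names):
    """Share of each reference class that was classified correctly."""
    return {name: matrix[i][i] / sum(matrix[i]) for i, name in enumerate(names)}


accuracy = overall_accuracy(error_matrix)
by_class = per_class_accuracy(error_matrix, CLASSES)
print(accuracy, by_class)
```

The per-class values show how an overall figure can hide weaker performance in individual depth classes, which is why error matrices are usually reported in full.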
NASA Astrophysics Data System (ADS)
Yilmaz, Isik; Keskin, Inan; Marschalko, Marian; Bednarik, Martin
2010-05-01
This study compares GIS-based collapse susceptibility mapping methods, namely conditional probability (CP), logistic regression (LR) and artificial neural networks (ANN), applied to gypsum rock masses in the Sivas basin (Turkey). A Digital Elevation Model (DEM) was first constructed using GIS software. Collapse-related factors, directly or indirectly related to the causes of collapse occurrence, such as distance from faults, slope angle and aspect, topographical elevation, distance from drainage, topographic wetness index (TWI), stream power index (SPI), Normalized Difference Vegetation Index (NDVI) as a measure of vegetation cover, and distance from roads and settlements, were used in the collapse susceptibility analyses. In the last stage of the analyses, collapse susceptibility maps were produced from the CP, LR and ANN models, and they were then compared by means of their validations. Area Under Curve (AUC) values obtained from all three methodologies showed that the map obtained from the ANN model appears more accurate than those from the other models, and the results also showed that artificial neural networks are a useful tool in the preparation of collapse susceptibility maps and are highly compatible with GIS operating features. Key words: Collapse; doline; susceptibility map; gypsum; GIS; conditional probability; logistic regression; artificial neural networks.
Mostafa, Kamal S M
2011-04-01
Malnutrition among under-five children is a chronic problem in developing countries. This study explores the socio-economic determinants of severe and moderate stunting among under-five children of rural Bangladesh. The study used data from the 2007 Bangladesh Demographic and Health Survey. Cross-sectional and multinomial logistic regression analyses were used to assess the effect of the socio-demographic variables on moderate and severe stunting over normal among the children. Findings revealed that over two-fifths of the children were stunted, of which 26.3% were moderately stunted and 15.1% were severely stunted. The multivariate multinomial logistic regression analysis yielded significantly increased risk of severe stunting (OR=2.53, 95% CI=1.34-4.79) and moderate stunting (OR=2.37, 95% CI=1.47-3.83) over normal among children with a thinner mother. Region, father's education, toilet facilities, child's age, birth order of children and wealth index were also important determinants of children's nutritional status. Development and poverty alleviation programmes should focus on the disadvantaged rural segments of people to improve their nutritional status.
NASA Astrophysics Data System (ADS)
Ariffin, Syaiba Balqish; Midi, Habshah
2014-06-01
This article is concerned with the performance of the logistic ridge regression estimation technique in the presence of multicollinearity and high leverage points. In logistic regression, multicollinearity exists among predictors and in the information matrix. The maximum likelihood estimator suffers a huge setback in the presence of multicollinearity, which causes regression estimates to have unduly large standard errors. To remedy this problem, a logistic ridge regression estimator is put forward. It is evident that the logistic ridge regression estimator outperforms the maximum likelihood approach for handling multicollinearity. The effect of high leverage points on the performance of the logistic ridge regression estimator is then investigated through a real data set and a simulation study. The findings signify that the logistic ridge regression estimator fails to provide better parameter estimates in the presence of both high leverage points and multicollinearity.
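To make the contrast between maximum likelihood and ridge estimation concrete, here is a minimal sketch of a logistic ridge estimator fitted by Newton-Raphson on two nearly collinear predictors. The penalty form (ridge/2 times the squared coefficient norm), the simulated data, and the omission of an intercept are simplifying assumptions for illustration; they are not taken from the article.

```python
import numpy as np

def fit_logistic(X, y, ridge=0.0, n_iter=50):
    """Newton-Raphson for logistic regression with penalty (ridge / 2) * ||beta||^2.

    ridge=0.0 gives the ordinary maximum likelihood estimate.
    """
    beta = np.zeros(X.shape[1])
    for _ in range(n_iter):
        p = 1.0 / (1.0 + np.exp(-(X @ beta)))
        w = p * (1.0 - p)
        gradient = X.T @ (y - p) - ridge * beta
        hessian = X.T @ (X * w[:, None]) + ridge * np.eye(X.shape[1])
        step = np.linalg.solve(hessian, gradient)
        beta = beta + step
        if np.max(np.abs(step)) < 1e-10:
            break
    return beta

# Simulated data with two almost identical predictors (hypothetical values).
rng = np.random.default_rng(0)
n = 500
x1 = rng.normal(size=n)
x2 = x1 + 0.01 * rng.normal(size=n)               # near-perfect collinearity
X = np.column_stack([x1, x2])
true_prob = 1.0 / (1.0 + np.exp(-(0.5 * x1 + 0.5 * x2)))
y = (rng.random(n) < true_prob).astype(float)

beta_ml = fit_logistic(X, y)                      # unpenalized estimate
beta_ridge = fit_logistic(X, y, ridge=1.0)        # ridge estimate
print("ML coefficients:   ", beta_ml)
print("ridge coefficients:", beta_ridge)
```

The ridge penalty shrinks the coefficient vector, which stabilizes the poorly determined contrast between the two collinear predictors. This sketch does not address high leverage points, where the abstract reports that the ridge estimator loses its advantage.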
Genomic-Enabled Prediction of Ordinal Data with Bayesian Logistic Ordinal Regression.
Montesinos-López, Osval A; Montesinos-López, Abelardo; Crossa, José; Burgueño, Juan; Eskridge, Kent
2015-08-18
Most genomic-enabled prediction models developed so far assume that the response variable is continuous and normally distributed. The exception is the probit model, developed for ordered categorical phenotypes. In statistical applications, because of the easy implementation of the Bayesian probit ordinal regression (BPOR) model, Bayesian logistic ordinal regression (BLOR) is implemented rarely in the context of genomic-enabled prediction [sample size (n) is much smaller than the number of parameters (p)]. For this reason, in this paper we propose a BLOR model using the Pólya-Gamma data augmentation approach that produces a Gibbs sampler with similar full conditional distributions of the BPOR model and with the advantage that the BPOR model is a particular case of the BLOR model. We evaluated the proposed model by using simulation and two real data sets. Results indicate that our BLOR model is a good alternative for analyzing ordinal data in the context of genomic-enabled prediction with the probit or logit link. Copyright © 2015 Montesinos-López et al.
Staley, James R; Jones, Edmund; Kaptoge, Stephen; Butterworth, Adam S; Sweeting, Michael J; Wood, Angela M; Howson, Joanna M M
2017-06-01
Logistic regression is often used instead of Cox regression to analyse genome-wide association studies (GWAS) of single-nucleotide polymorphisms (SNPs) and disease outcomes with cohort and case-cohort designs, as it is less computationally expensive. Although Cox and logistic regression models have been compared previously in cohort studies, this work does not completely cover the GWAS setting nor extend to the case-cohort study design. Here, we evaluated Cox and logistic regression applied to cohort and case-cohort genetic association studies using simulated data and genetic data from the EPIC-CVD study. In the cohort setting, there was a modest improvement in power to detect SNP-disease associations using Cox regression compared with logistic regression, which increased as the disease incidence increased. In contrast, logistic regression had more power than (Prentice weighted) Cox regression in the case-cohort setting. Logistic regression yielded inflated effect estimates (assuming the hazard ratio is the underlying measure of association) for both study designs, especially for SNPs with greater effect on disease. Given logistic regression is substantially more computationally efficient than Cox regression in both settings, we propose a two-step approach to GWAS in cohort and case-cohort studies. First to analyse all SNPs with logistic regression to identify associated variants below a pre-defined P-value threshold, and second to fit Cox regression (appropriately weighted in case-cohort studies) to those identified SNPs to ensure accurate estimation of association with disease.
The crux of the method: assumptions in ordinary least squares and logistic regression.
Long, Rebecca G
2008-10-01
Logistic regression has increasingly become the tool of choice when analyzing data with a binary dependent variable. While resources relating to the technique are widely available, clear discussions of why logistic regression should be used in place of ordinary least squares regression are difficult to find. The current paper compares and contrasts the assumptions of ordinary least squares with those of logistic regression and explains why logistic regression's looser assumptions make it adept at handling violations of the more important assumptions in ordinary least squares.
Using Dominance Analysis to Determine Predictor Importance in Logistic Regression
ERIC Educational Resources Information Center
Azen, Razia; Traxel, Nicole
2009-01-01
This article proposes an extension of dominance analysis that allows researchers to determine the relative importance of predictors in logistic regression models. Criteria for choosing logistic regression R[superscript 2] analogues were determined and measures were selected that can be used to perform dominance analysis in logistic regression. A…
Wildfire Risk Mapping over the State of Mississippi: Land Surface Modeling Approach
DOE Office of Scientific and Technical Information (OSTI.GOV)
Cooke, William H.; Mostovoy, Georgy; Anantharaj, Valentine G
2012-01-01
Three fire risk indexes based on soil moisture estimates were applied to simulate wildfire probability over the southern part of Mississippi using the logistic regression approach. The fire indexes were retrieved from: (1) accumulated difference between daily precipitation and potential evapotranspiration (P-E); (2) top 10 cm soil moisture content simulated by the Mosaic land surface model; and (3) the Keetch-Byram drought index (KBDI). The P-E, KBDI, and soil moisture based indexes were estimated from gridded atmospheric and Mosaic-simulated soil moisture data available from the North American Land Data Assimilation System (NLDAS-2). Normalized deviations of these indexes from the 31-year mean (1980-2010) were fitted into the logistic regression model describing the probability of wildfire occurrence as a function of the fire index. It was assumed that such normalization provides a more robust and adequate description of the temporal dynamics of soil moisture anomalies than the original (not normalized) set of indexes. The logistic model parameters were evaluated for 0.25 x 0.25 latitude/longitude cells and for the probability that at least one fire event occurred during 5 consecutive days. A 23-year (1986-2008) forest fire record was used. Two periods were selected and examined (January to mid-June and mid-September to December). The application of the logistic model provides overall good agreement between empirical/observed and model-fitted fire probabilities over the study area during both seasons. The fire risk indexes based on the top 10 cm soil moisture and KBDI have the largest impact on the wildfire odds (increasing them by almost 2 times in response to each unit change of the corresponding fire risk index during the January to mid-June period and by nearly 1.5 times during mid-September to December) observed over 0.25 x 0.25 cells located along the Mississippi coastline. This result suggests a rather strong control of fire risk indexes on fire occurrence probability over this region.
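The statement that a unit change in a fire index roughly doubles the wildfire odds corresponds to a logistic slope near ln 2. The sketch below shows that link. The intercept and slope are hypothetical numbers chosen only to reproduce a factor of 2; they are not fitted values from the study.

```python
import math

def wildfire_probability(intercept, slope, index):
    """Logistic model: probability of at least one fire given a normalized fire index."""
    return 1.0 / (1.0 + math.exp(-(intercept + slope * index)))

def odds(probability):
    return probability / (1.0 - probability)

# Hypothetical coefficients: a slope of ln(2) doubles the odds per unit of the index.
intercept = -3.0
slope = math.log(2.0)

p_at_0 = wildfire_probability(intercept, slope, 0.0)
p_at_1 = wildfire_probability(intercept, slope, 1.0)
odds_ratio = odds(p_at_1) / odds(p_at_0)

print(f"P(fire | index=0) = {p_at_0:.4f}")
print(f"P(fire | index=1) = {p_at_1:.4f}")
print(f"odds ratio per unit index = {odds_ratio:.3f}")
```

Doubling the odds is not the same as doubling the probability: when the baseline probability is small the two are close, but they diverge as the probability grows.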
Vázquez-Nava, Francisco; Treviño-Garcia-Manzo, Norberto; Vázquez-Rodríguez, Carlos F; Vázquez-Rodríguez, Eliza M
2013-01-01
To determine the association between family structure, maternal education level, and maternal employment with sedentary lifestyle in primary school-age children. Data were obtained from 897 children aged 6 to 12 years. A questionnaire was used to collect information. Body mass index (BMI) was determined using the age- and gender-specific Centers for Disease Control and Prevention definition. Children were categorized as: normal weight (5th percentile ≤ BMI < 85th percentile), at risk for overweight (85th percentile ≤ BMI < 95th percentile), overweight (BMI ≥ 95th percentile). For the analysis, overweight was defined as BMI at or above the 85th percentile for each gender. Adjusted odds ratios (adjusted ORs) for physical inactivity were determined using a logistic regression model. The prevalence of overweight was 40.7%, and of sedentary lifestyle, 57.2%. The percentage of non-intact families was 23.5%. Approximately 48.7% of the mothers had a non-acceptable educational level, and 38.8% of the mothers worked outside of the home. The logistic regression model showed that living in a non-intact family household (adjusted OR=1.67; 95% CI=1.04-2.66) is associated with sedentary lifestyle in overweight children. In the group of normal weight children, logistic regression analysis showed that living in a non-intact family, having a mother with a non-acceptable education level, and having a mother who works outside of the home were not associated with sedentary lifestyle. Living in a non-intact family, more than low maternal educational level and having a working mother, appears to be associated with sedentary lifestyle in overweight primary school-age children. Copyright © 2013 Sociedade Brasileira de Pediatria. Published by Elsevier Editora Ltda. All rights reserved.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Robertson, John M.; Soehn, Matthias; Yan Di
Purpose: Understanding the dose-volume relationship of small bowel irradiation and severe acute diarrhea may help reduce the incidence of this side effect during adjuvant treatment for rectal cancer. Methods and Materials: Consecutive patients treated curatively for rectal cancer were reviewed, and the maximum grade of acute diarrhea was determined. The small bowel was outlined on the treatment planning CT scan, and a dose-volume histogram was calculated for the initial pelvic treatment (45 Gy). Logistic regression models were fitted for varying cutoff-dose levels from 5 to 45 Gy in 5-Gy increments. The model with the highest log-likelihood was used to develop a cutoff-dose normal tissue complication probability (NTCP) model. Results: There were a total of 152 patients (48% preoperative, 47% postoperative, 5% other), predominantly treated prone (95%) with a three-field technique (94%) and a protracted venous infusion of 5-fluorouracil (78%). Acute Grade 3 diarrhea occurred in 21%. The largest log-likelihood was found for the cutoff-dose logistic regression model with 15 Gy as the cutoff dose, although the models for 20 Gy and 25 Gy had similar significance. According to this model, highly significant correlations (p < 0.001) between small bowel volumes receiving at least 15 Gy and toxicity exist in the considered patient population. Similar findings applied to both the preoperatively (p = 0.001) and postoperatively irradiated groups (p = 0.001). Conclusion: The incidence of Grade 3 diarrhea was significantly correlated with the volume of small bowel receiving at least 15 Gy using a cutoff-dose NTCP model.
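The cutoff-dose selection described above (fit one logistic model per cutoff, keep the one with the highest log-likelihood) can be sketched as follows. Everything numerical here is an assumption for illustration: the dose-volume values are simulated, the toy outcome is generated from the 15 Gy volume, and `fit_logistic_loglik` is a hypothetical helper rather than the authors' software.

```python
import numpy as np

def fit_logistic_loglik(x, y, n_iter=50):
    """Fit logit P(y=1) = a + b*x by Newton-Raphson; return the maximized log-likelihood."""
    z = (x - x.mean()) / x.std()                  # rescaling leaves the log-likelihood unchanged
    X = np.column_stack([np.ones_like(z), z])
    beta = np.zeros(2)
    for _ in range(n_iter):
        eta = np.clip(X @ beta, -30.0, 30.0)      # guard against overflow
        p = 1.0 / (1.0 + np.exp(-eta))
        w = p * (1.0 - p)
        step = np.linalg.solve(X.T @ (X * w[:, None]), X.T @ (y - p))
        beta = beta + step
        if np.max(np.abs(step)) < 1e-10:
            break
    eta = X @ beta
    return float(np.sum(y * eta - np.logaddexp(0.0, eta)))

# Simulated cohort (hypothetical): volume of small bowel receiving at least d Gy.
rng = np.random.default_rng(1)
n_patients = 152
cutoffs = list(range(5, 50, 5))                   # 5, 10, ..., 45 Gy
v5 = rng.uniform(100.0, 600.0, n_patients)
decay = rng.uniform(0.02, 0.10, n_patients)
volumes = {d: v5 * np.exp(-decay * (d - 5)) for d in cutoffs}

# Toy outcome generated from the 15 Gy volume (an assumption of this sketch).
p_true = 1.0 / (1.0 + np.exp(-(-3.5 + 0.01 * volumes[15])))
toxicity = (rng.random(n_patients) < p_true).astype(float)

log_liks = {d: fit_logistic_loglik(volumes[d], toxicity) for d in cutoffs}
best_cutoff = max(log_liks, key=log_liks.get)
for d in cutoffs:
    print(f"V{d:>2} Gy: log-likelihood = {log_liks[d]:.2f}")
print("selected cutoff dose:", best_cutoff, "Gy")
```

Because volumes at neighbouring cutoffs are strongly correlated, adjacent models tend to have similar log-likelihoods, which is consistent with the abstract's remark that the 20 Gy and 25 Gy models performed similarly.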
Guerrero-Romero, Fernando; Flores-García, Araceli; Saldaña-Guerrero, Stephanie; Simental-Mendía, Luis E; Rodríguez-Morán, Martha
2016-10-01
Whether low serum magnesium is an epiphenomenon related to obesity or whether obesity per se is a cause of hypomagnesemia remains to be clarified. To examine the relationship between body weight status and hypomagnesemia in apparently healthy subjects. A total of 681 healthy individuals aged 30 to 65 years were enrolled in a cross-sectional study. Extreme exercise, chronic diarrhea, alcohol intake, use of diuretics, smoking, oral magnesium supplementation, diabetes, malnutrition, hypertension, liver disease, thyroid disorders, and renal damage were exclusion criteria. Based on the Body Mass Index (BMI), body weight status was defined as follows: normal weight (BMI <25 kg/m2); overweight (BMI ≥25 and <30 kg/m2); and obesity (BMI ≥30 kg/m2). Hypomagnesemia was defined by serum magnesium concentration ≤0.74 mmol/L. A multiple logistic regression analysis was used to compute the odds ratio (OR) between body weight status (independent variables) and hypomagnesemia (dependent variable). The multivariate logistic regression analysis showed that dietary magnesium intake (OR 2.11; 95%CI 1.4-5.7) but not obesity (OR 1.53; 95%CI 0.9-2.5), overweight (OR 1.40; 95%CI 0.8-2.4), or normal weight (OR 0.78; 95%CI 0.6-2.09) was associated with hypomagnesemia. A subsequent logistic regression analysis adjusted by body mass index, waist circumference, total body fat, systolic and diastolic blood pressure, and triglycerides levels showed that hyperglycemia (2.19; 95%CI 1.1-7.0) and dietary magnesium intake (2.21; 95%CI 1.1-8.9) remained associated with hypomagnesemia. Our results show that body weight status is not associated with hypomagnesemia and that, irrespective of obesity, hyperglycemia is a cause of hypomagnesemia in non-diabetic individuals. Copyright © 2016 European Federation of Internal Medicine. Published by Elsevier B.V. All rights reserved.
Applying Kaplan-Meier to Item Response Data
ERIC Educational Resources Information Center
McNeish, Daniel
2018-01-01
Some IRT models can be equivalently modeled in alternative frameworks such as logistic regression. Logistic regression can also model time-to-event data, which concerns the probability of an event occurring over time. Using the relation between time-to-event models and logistic regression and the relation between logistic regression and IRT, this…
Misperception among rural diabetic residents: a cross-sectional descriptive study.
Huang, Tzu-Ting; Guo, Su-Er; Chang, Chia-Hao; Huang, Jui-Chu; Lin, Ming-Shyan; Lee, Chia-Mou; Chen, Mei-Yen
2013-04-01
To evaluate the self-perception of diabetes control associated with physical indicators and with practicing exercise and a healthy diet, among rural residents. It remains unclear whether a subject's self-perception of diabetes control increases its deleterious effects. Cross-sectional, correlational. We recruited 715 participants from 18 primary healthcare centres in the rural regions of Chiayi County, Taiwan. Data were collected between 1 January 2009 and 30 June 2010. Logistic regression was conducted to identify the determinant factors associated with perceptions of diabetes control. A high percentage of participants overestimated their fasting blood glucose and HbA1c status. Total cholesterol, triglyceride, low density lipoprotein cholesterol, blood pressure, and waist circumference exceeded the medical standard in the 'feel good' group, and many did not adopt a healthy diet and undertake physical activity. The final logistic regression model demonstrated that residents with diabetes who exercised frequently, had normal fasting glucose, and had normal HbA1c tended to perceive their control as 'feel good'. Misperception and unawareness of diabetes control were prevalent among rural residents. Addressing misperceptions among rural residents with diabetes and increasing their knowledge of professional advice could be important steps in improving diabetes control in an elder population. © 2012 Blackwell Publishing Ltd.
Association between peer relationship problems and childhood overweight/obesity.
Boneberger, Anja; von Kries, Rüdiger; Milde-Busch, Astrid; Bolte, Gabriele; Rochat, Mascha K; Rückinger, Simon
2009-12-01
To assess the association between peer relationship problems and childhood overweight and obesity. Data on 4718 preschool children were obtained at the obligatory school entry health examination in Bavaria. Parentally reported peer relationship problems ('normal', 'borderline' or 'abnormal') were assessed from the Strengths and Difficulties Questionnaire. Overweight and obesity were defined according to age- and gender-specific BMI cut-off points. Multivariate logistic regression analysis was performed to control for potential confounders. The prevalence of overweight and obesity was higher among children with 'borderline' or 'abnormal' peer relationship problems compared to 'normal' children. The association of 'abnormal' peer relationship problems was still significant in the final logistic regression model for girls [odds ratio (OR) for overweight 2.0; 95% confidence interval (CI): 1.4-3.0; OR for obesity 2.6; 95% CI: 1.3-5.0]. Among boys the adjusted odds ratios were lower and no longer significant. The significantly increased prevalence of overweight and obesity among preschool children with peer relationship problems could not be explained by confounding. It seems evident that there is a relevant co-morbidity of peer relationship problems and obesity in preschool children, pointing to the need for interventions focusing on both physical and psychosocial health.
Prayer at Midlife is Associated with Reduced Risk of Cognitive Decline in Arabic Women
Inzelberg, Rivka; Afgin, Anne E; Massarwa, Magda; Schechtman, Edna; Israeli-Korn, Simon D.; Strugatsky, Rosa; Abuful, Amin; Kravitz, Efrat; Farrer, Lindsay A.; Friedland, Robert P.
2013-01-01
Midlife habits may be important for the later development of Alzheimer's disease (AD). We estimated the contribution of midlife prayer to the development of cognitive decline. In a door-to-door survey, residents aged ≥65 years were systematically evaluated in Arabic including medical history, neurological, cognitive examination, and a midlife leisure-activities questionnaire. Praying was assessed by the number of monthly praying hours at midlife. Stepwise logistic regression models were used to evaluate the effect of prayer on the odds of mild cognitive impairment (MCI) and AD versus cognitively normal individuals. Of 935 individuals that were approached, 778 [normal controls (n=448), AD (n=92) and MCI (n=238)] were evaluated. A higher proportion of cognitively normal individuals engaged in prayer at midlife [(87%) versus MCI (71%) or AD (69%) (p<0.0001)]. Since 94% of males engaged in prayer, the effect on cognitive decline could not be assessed in men. Among women, stepwise logistic regression adjusted for age and education, showed that prayer was significantly associated with reduced risk of MCI (p=0.027, OR=0.55, 95% CI 0.33-0.94), but not AD. Among individuals endorsing prayer activity, the amount of prayer was not associated with MCI or AD in either gender. Praying at midlife is associated with lower risk of mild cognitive impairment in women. PMID:23116476
Prayer at midlife is associated with reduced risk of cognitive decline in Arabic women.
Inzelberg, Rivka; Afgin, Anne E; Massarwa, Magda; Schechtman, Edna; Israeli-Korn, Simon D; Strugatsky, Rosa; Abuful, Amin; Kravitz, Efrat; Farrer, Lindsay A; Friedland, Robert P
2013-03-01
Midlife habits may be important for the later development of Alzheimer's disease (AD). We estimated the contribution of midlife prayer to the development of cognitive decline. In a door-to-door survey, residents aged ≥65 years were systematically evaluated in Arabic including medical history, neurological, cognitive examination, and a midlife leisure-activities questionnaire. Praying was assessed by the number of monthly praying hours at midlife. Stepwise logistic regression models were used to evaluate the effect of prayer on the odds of mild cognitive impairment (MCI) and AD versus cognitively normal individuals. Of 935 individuals that were approached, 778 [normal controls (n=448), AD (n=92) and MCI (n=238)] were evaluated. A higher proportion of cognitively normal individuals engaged in prayer at midlife [(87%) versus MCI (71%) or AD (69%) (p<0.0001)]. Since 94% of males engaged in prayer, the effect on cognitive decline could not be assessed in men. Among women, stepwise logistic regression adjusted for age and education, showed that prayer was significantly associated with reduced risk of MCI (p=0.027, OR=0.55, 95% CI 0.33-0.94), but not AD. Among individuals endorsing prayer activity, the amount of prayer was not associated with MCI or AD in either gender. Praying at midlife is associated with lower risk of mild cognitive impairment in women.
Detecting Anomalies in Process Control Networks
NASA Astrophysics Data System (ADS)
Rrushi, Julian; Kang, Kyoung-Don
This paper presents the estimation-inspection algorithm, a statistical algorithm for anomaly detection in process control networks. The algorithm determines if the payload of a network packet that is about to be processed by a control system is normal or abnormal based on the effect that the packet will have on a variable stored in control system memory. The estimation part of the algorithm uses logistic regression integrated with maximum likelihood estimation in an inductive machine learning process to estimate a series of statistical parameters; these parameters are used in conjunction with logistic regression formulas to form a probability mass function for each variable stored in control system memory. The inspection part of the algorithm uses the probability mass functions to estimate the normalcy probability of a specific value that a network packet writes to a variable. Experimental results demonstrate that the algorithm is very effective at detecting anomalies in process control networks.
NASA Astrophysics Data System (ADS)
Lin, Yingzhi; Deng, Xiangzheng; Li, Xing; Ma, Enjun
2014-12-01
Spatially explicit simulation of land use change is the basis for estimating the effects of land use and cover change on energy fluxes, ecology and the environment. At the pixel level, logistic regression is one of the most common approaches used in spatially explicit land use allocation models to determine the relationship between land use and its causal factors in driving land use change, and thereby to evaluate land use suitability. However, these models have a drawback in that they do not determine/allocate land use based on the direct relationship between land use change and its driving factors. Consequently, a multinomial logistic regression method was introduced to address this flaw, and thereby, judge the suitability of a type of land use in any given pixel in a case study area of the Jiangxi Province, China. A comparison of the two regression methods indicated that the proportion of correctly allocated pixels using multinomial logistic regression was 92.98%, which was 8.47% higher than that obtained using logistic regression. Paired t-test results also showed that pixels were more clearly distinguished by multinomial logistic regression than by logistic regression. In conclusion, multinomial logistic regression is a more efficient and accurate method for the spatial allocation of land use changes. The application of this method in future land use change studies may improve the accuracy of predicting the effects of land use and cover change on energy fluxes, ecology, and environment.
2012-01-01
Background: When outcomes are binary, the c-statistic (equivalent to the area under the Receiver Operating Characteristic curve) is a standard measure of the predictive accuracy of a logistic regression model. Methods: An analytical expression was derived under the assumption that a continuous explanatory variable follows a normal distribution in those with and without the condition. We then conducted an extensive set of Monte Carlo simulations to examine whether the expressions derived under the assumption of binormality allowed for accurate prediction of the empirical c-statistic when the explanatory variable followed a normal distribution in the combined sample of those with and without the condition. We also examined the accuracy of the predicted c-statistic when the explanatory variable followed a gamma, log-normal or uniform distribution in the combined sample of those with and without the condition. Results: Under the assumption of binormality with equality of variances, the c-statistic follows a standard normal cumulative distribution function with dependence on the product of the standard deviation of the normal components (reflecting more heterogeneity) and the log-odds ratio (reflecting larger effects). Under the assumption of binormality with unequal variances, the c-statistic follows a standard normal cumulative distribution function with dependence on the standardized difference of the explanatory variable in those with and without the condition. In our Monte Carlo simulations, we found that these expressions allowed for reasonably accurate prediction of the empirical c-statistic when the distribution of the explanatory variable was normal, gamma, log-normal, or uniform in the entire sample of those with and without the condition. Conclusions: The discriminative ability of a continuous explanatory variable cannot be judged by its odds ratio alone, but always needs to be considered in relation to the heterogeneity of the population. PMID:22716998
Standards for Standardized Logistic Regression Coefficients
ERIC Educational Resources Information Center
Menard, Scott
2011-01-01
Standardized coefficients in logistic regression analysis have the same utility as standardized coefficients in linear regression analysis. Although there has been no consensus on the best way to construct standardized logistic regression coefficients, there is now sufficient evidence to suggest a single best approach to the construction of a…
Schörgendorfer, Angela; Branscum, Adam J; Hanson, Timothy E
2013-06-01
Logistic regression is a popular tool for risk analysis in medical and population health science. With continuous response data, it is common to create a dichotomous outcome for logistic regression analysis by specifying a threshold for positivity. Fitting a linear regression to the nondichotomized response variable assuming a logistic sampling model for the data has been empirically shown to yield more efficient estimates of odds ratios than ordinary logistic regression of the dichotomized endpoint. We illustrate that risk inference is not robust to departures from the parametric logistic distribution. Moreover, the model assumption of proportional odds is generally not satisfied when the condition of a logistic distribution for the data is violated, leading to biased inference from a parametric logistic analysis. We develop novel Bayesian semiparametric methodology for testing goodness of fit of parametric logistic regression with continuous measurement data. The testing procedures hold for any cutoff threshold and our approach simultaneously provides the ability to perform semiparametric risk estimation. Bayes factors are calculated using the Savage-Dickey ratio for testing the null hypothesis of logistic regression versus a semiparametric generalization. We propose a fully Bayesian and a computationally efficient empirical Bayesian approach to testing, and we present methods for semiparametric estimation of risks, relative risks, and odds ratios when parametric logistic regression fails. Theoretical results establish the consistency of the empirical Bayes test. Results from simulated data show that the proposed approach provides accurate inference irrespective of whether parametric assumptions hold or not. Evaluation of risk factors for obesity shows that different inferences are derived from an analysis of a real data set when deviations from a logistic distribution are permissible in a flexible semiparametric framework. 
© 2013, The International Biometric Society.
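The connection between a logistic sampling model for the continuous response and logistic regression on the dichotomized endpoint can be written out explicitly. The notation below (response Y, covariate x, scale s, threshold c) is assumed for illustration and is not taken from the article.

```latex
\begin{align*}
Y &= \beta_0 + \beta_1 x + s\,\varepsilon,
  \qquad \varepsilon \sim \mathrm{Logistic}(0,1),
  \qquad F(t) = \frac{1}{1 + e^{-t}}, \\
P(Y > c \mid x) &= 1 - F\!\left(\frac{c - \beta_0 - \beta_1 x}{s}\right)
  = \frac{1}{1 + \exp\{-(\beta_0 - c + \beta_1 x)/s\}}, \\
\operatorname{logit} P(Y > c \mid x) &= \frac{\beta_0 - c}{s} + \frac{\beta_1}{s}\,x,
  \qquad \mathrm{OR}_{\text{per unit } x} = \exp(\beta_1 / s).
\end{align*}
```

Under this model the threshold c changes only the intercept, so the odds ratio is the same for every cutoff. That is the proportional odds property, and it need not hold once the error distribution departs from the logistic form.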
Color normalization for robust evaluation of microscopy images
NASA Astrophysics Data System (ADS)
Švihlík, Jan; Kybic, Jan; Habart, David
2015-09-01
This paper deals with color normalization of microscopy images of Langerhans islets in order to increase robustness of the islet segmentation to illumination changes. The main application is automatic quantitative evaluation of the islet parameters, useful for determining the feasibility of islet transplantation in diabetes. First, background illumination inhomogeneity is compensated and a preliminary foreground/background segmentation is performed. The color normalization itself is done in either lαβ or logarithmic RGB color spaces, by comparison with a reference image. The color-normalized images are segmented using color-based features and pixel-wise logistic regression, trained on manually labeled images. Finally, relevant statistics such as the total islet area are evaluated in order to determine the success likelihood of the transplantation.
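One simple way to normalize colors against a reference image in logarithmic RGB space is to match per-channel means and standard deviations. The sketch below uses that statistic-matching scheme as an assumption; the abstract states only that normalization is done by comparison with a reference image, so this is not necessarily the authors' procedure. The function name, the offset `eps`, and the random test images are all illustrative.

```python
import numpy as np

def normalize_log_rgb(image, reference, eps=1.0):
    """Match per-channel mean and SD of log(image + eps) to those of the reference.

    Returns a float array in the original intensity scale; callers may clip it
    to the valid range (for example 0..255) before saving.
    """
    img_log = np.log(image.astype(np.float64) + eps)
    ref_log = np.log(reference.astype(np.float64) + eps)
    out_log = np.empty_like(img_log)
    for channel in range(3):
        src = img_log[..., channel]
        ref = ref_log[..., channel]
        scale = ref.std() / max(src.std(), 1e-12)   # avoid division by zero
        out_log[..., channel] = (src - src.mean()) * scale + ref.mean()
    return np.exp(out_log) - eps

# Hypothetical images: the second is a dimmer, tinted version of the first.
rng = np.random.default_rng(3)
reference = rng.integers(20, 236, size=(32, 32, 3)).astype(np.float64)
image = reference * np.array([0.5, 0.6, 0.8])       # simulated illumination change

normalized = normalize_log_rgb(image, reference)
ref_means = np.log(reference + 1.0).mean(axis=(0, 1))
new_means = np.log(normalized + 1.0).mean(axis=(0, 1))
print("reference log-channel means: ", ref_means)
print("normalized log-channel means:", new_means)
```

Working in log space turns a multiplicative illumination change into an additive shift, which is why mean and variance matching is a natural first choice there.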
Westreich, Daniel; Lessler, Justin; Funk, Michele Jonsson
2010-01-01
Summary Objective Propensity scores for the analysis of observational data are typically estimated using logistic regression. Our objective in this Review was to assess machine learning alternatives to logistic regression which may accomplish the same goals but with fewer assumptions or greater accuracy. Study Design and Setting We identified alternative methods for propensity score estimation and/or classification from the public health, biostatistics, discrete mathematics, and computer science literature, and evaluated these algorithms for applicability to the problem of propensity score estimation, potential advantages over logistic regression, and ease of use. Results We identified four techniques as alternatives to logistic regression: neural networks, support vector machines, decision trees (CART), and meta-classifiers (in particular, boosting). Conclusion While the assumptions of logistic regression are well understood, those assumptions are frequently ignored. All four alternatives have advantages and disadvantages compared with logistic regression. Boosting (meta-classifiers) and to a lesser extent decision trees (particularly CART) appear to be most promising for use in the context of propensity score analysis, but extensive simulation studies are needed to establish their utility in practice. PMID:20630332
Robust mislabel logistic regression without modeling mislabel probabilities.
Hung, Hung; Jou, Zhi-Yu; Huang, Su-Yun
2018-03-01
Logistic regression is among the most widely used statistical methods for linear discriminant analysis. In many applications, we only observe possibly mislabeled responses. Fitting a conventional logistic regression can then lead to biased estimation. One common resolution is to fit a mislabel logistic regression model, which takes mislabeled responses into consideration. Another common method is to adopt a robust M-estimation by down-weighting suspected instances. In this work, we propose a new robust mislabel logistic regression based on γ-divergence. Our proposal possesses two advantageous features: (1) It does not need to model the mislabel probabilities. (2) The minimum γ-divergence estimation leads to a weighted estimating equation without the need to include any bias correction term, that is, it is automatically bias-corrected. These features make the proposed γ-logistic regression more robust in model fitting and more intuitive for model interpretation through a simple weighting scheme. Our method is also easy to implement, and two types of algorithms are included. Simulation studies and the Pima data application are presented to demonstrate the performance of γ-logistic regression. © 2017, The International Biometric Society.
Fungible weights in logistic regression.
Jones, Jeff A; Waller, Niels G
2016-06-01
In this article we develop methods for assessing parameter sensitivity in logistic regression models. To set the stage for this work, we first review Waller's (2008) equations for computing fungible weights in linear regression. Next, we describe 2 methods for computing fungible weights in logistic regression. To demonstrate the utility of these methods, we compute fungible logistic regression weights using data from the Centers for Disease Control and Prevention's (2010) Youth Risk Behavior Surveillance Survey, and we illustrate how these alternate weights can be used to evaluate parameter sensitivity. To make our work accessible to the research community, we provide R code (R Core Team, 2015) that will generate both kinds of fungible logistic regression weights. (PsycINFO Database Record (c) 2016 APA, all rights reserved).
Westreich, Daniel; Lessler, Justin; Funk, Michele Jonsson
2010-08-01
Propensity scores for the analysis of observational data are typically estimated using logistic regression. Our objective in this review was to assess machine learning alternatives to logistic regression, which may accomplish the same goals but with fewer assumptions or greater accuracy. We identified alternative methods for propensity score estimation and/or classification from the public health, biostatistics, discrete mathematics, and computer science literature, and evaluated these algorithms for applicability to the problem of propensity score estimation, potential advantages over logistic regression, and ease of use. We identified four techniques as alternatives to logistic regression: neural networks, support vector machines, decision trees (classification and regression trees [CART]), and meta-classifiers (in particular, boosting). Although the assumptions of logistic regression are well understood, those assumptions are frequently ignored. All four alternatives have advantages and disadvantages compared with logistic regression. Boosting (meta-classifiers) and, to a lesser extent, decision trees (particularly CART), appear to be most promising for use in the context of propensity score analysis, but extensive simulation studies are needed to establish their utility in practice. Copyright (c) 2010 Elsevier Inc. All rights reserved.
Should metacognition be measured by logistic regression?
Rausch, Manuel; Zehetleitner, Michael
2017-03-01
Are logistic regression slopes suitable to quantify metacognitive sensitivity, i.e. the efficiency with which subjective reports differentiate between correct and incorrect task responses? We analytically show that logistic regression slopes are independent of rating criteria in one specific model of metacognition, which assumes (i) that rating decisions are based on sensory evidence generated independently of the sensory evidence used for primary task responses and (ii) that the distributions of evidence are logistic. Given a hierarchical model of metacognition, logistic regression slopes depend on rating criteria. According to all considered models, regression slopes depend on the primary task criterion. A reanalysis of previous data revealed that massive numbers of trials are required to distinguish between hierarchical and independent models with tolerable accuracy. It is argued that researchers who wish to use logistic regression as a measure of metacognitive sensitivity need to control the primary task criterion and rating criteria. Copyright © 2017 Elsevier Inc. All rights reserved.
Mameli, Chiara; Krakauer, Nir Y; Krakauer, Jesse C; Bosetti, Alessandra; Ferrari, Chiara Matilde; Moiana, Norma; Schneider, Laura; Borsani, Barbara; Genoni, Teresa; Zuccotti, Gianvincenzo
2018-01-01
A Body Shape Index (ABSI) and normalized hip circumference (Hip Index, HI) have been recently shown to be strong risk factors for mortality and for cardiovascular disease in adults. We conducted an observational cross-sectional study to evaluate the relationship between ABSI, HI and cardiometabolic risk factors and obesity-related comorbidities in overweight and obese children and adolescents aged 2-18 years. We performed multivariate linear and logistic regression analyses with BMI, ABSI, and HI age and sex normalized z scores as predictors to examine the association with cardiometabolic risk markers (systolic and diastolic blood pressure, fasting glucose and insulin, total cholesterol and its components, transaminases, fat mass % detected by bioelectrical impedance analysis) and obesity-related conditions (including hepatic steatosis and metabolic syndrome). We recruited 217 patients (114 males), mean age 11.3 years. Multivariate linear regression showed a significant association of ABSI z score with 10 out of 15 risk markers expressed as continuous variables, while BMI z score showed a significant correlation with 9 and HI only with 1. In multivariate logistic regression to predict occurrence of obesity-related conditions and above-threshold values of risk factors, BMI z score was significantly correlated to 7 out of 12, ABSI to 5, and HI to 1. Overall, ABSI is an independent anthropometric index that was significantly associated with cardiometabolic risk markers in a pediatric population affected by overweight and obesity.
London Measure of Unplanned Pregnancy: guidance for its use as an outcome measure
Hall, Jennifer A; Barrett, Geraldine; Copas, Andrew; Stephenson, Judith
2017-01-01
Background: The London Measure of Unplanned Pregnancy (LMUP) is a psychometrically validated measure of the degree of intention of a current or recent pregnancy. The LMUP is increasingly being used worldwide, and can be used to evaluate family planning or preconception care programs. However, beyond recommending the use of the full LMUP scale, there is no published guidance on how to use the LMUP as an outcome measure. Ordinal logistic regression has been recommended informally, but studies published to date have all used binary logistic regression and dichotomized the scale at different cut points. There is thus a need for evidence-based guidance to provide a standardized methodology for multivariate analysis and to enable comparison of results. This paper makes recommendations for the regression method for analysis of the LMUP as an outcome measure. Materials and methods: Data collected from 4,244 pregnant women in Malawi were used to compare five regression methods: linear, logistic with two cut points, and ordinal logistic with either the full or grouped LMUP score. The recommendations were then tested on the original UK LMUP data. Results: There were small but no important differences in the findings across the regression models. Logistic regression resulted in the largest loss of information, and assumptions were violated for the linear and ordinal logistic regression. Consequently, robust standard errors were used for linear regression and a partial proportional odds ordinal logistic regression model attempted. The latter could only be fitted for grouped LMUP score. Conclusion: We recommend the linear regression model with robust standard errors to make full use of the LMUP score when analyzed as an outcome measure. Ordinal logistic regression could be considered, but a partial proportional odds model with grouped LMUP score may be required. Logistic regression is the least-favored option, due to the loss of information.
For logistic regression, the cut point for un/planned pregnancy should be between nine and ten. These recommendations will standardize the analysis of LMUP data and enhance comparability of results across studies. PMID:28435343
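The loss of information from dichotomizing can be illustrated with a minimal sketch. It assumes that higher LMUP scores indicate a more planned pregnancy, so a cut point between nine and ten maps scores of 10 or more to "planned"; the function name and the example scores are hypothetical and not taken from the study data.

```python
def dichotomize_lmup(score, cut=10):
    """Collapse an LMUP score to a binary label.

    Assumption: scores at or above `cut` count as 'planned',
    i.e. the cut point lies between nine and ten.
    """
    return "planned" if score >= cut else "unplanned"


# Hypothetical scores, for illustration only.
scores = [3, 9, 10, 12]
labels = [dichotomize_lmup(s) for s in scores]

for s, label in zip(scores, labels):
    print(s, label)
```

Scores of 3 and 9 receive the same label even though they differ by six points, which is the kind of information a linear or ordinal model on the full score would retain.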
Logistic models--an odd(s) kind of regression.
Jupiter, Daniel C
2013-01-01
The logistic regression model bears some similarity to the multivariable linear regression with which we are familiar. However, the differences are great enough to warrant a discussion of the need for and interpretation of logistic regression. Copyright © 2013 American College of Foot and Ankle Surgeons. Published by Elsevier Inc. All rights reserved.
Parameters Estimation of Geographically Weighted Ordinal Logistic Regression (GWOLR) Model
NASA Astrophysics Data System (ADS)
Zuhdi, Shaifudin; Retno Sari Saputro, Dewi; Widyaningsih, Purnami
2017-06-01
A regression model represents the relationship between independent variables and a dependent variable. When the dependent variable is categorical, a logistic regression model is used to calculate the odds; when its categories are ordered levels, the model is an ordinal logistic regression model. The GWOLR model is an ordinal logistic regression model that is influenced by the geographical location of the observation site. Parameter estimation is needed to determine population values from a sample. The purpose of this research is to estimate the parameters of the GWOLR model using R software. The estimation uses data on the number of dengue fever patients in Semarang City, with 144 villages in Semarang City as the observation units. The results are a local GWOLR model for each village and the probabilities of the categories of the number of dengue fever patients.
The purpose of this report is to provide a reference manual that could be used by investigators for making informed use of logistic regression using two methods (standard logistic regression and MARS). The details for analyses of relationships between a dependent binary response ...
Predicting U.S. Army Reserve Unit Manning Using Market Demographics
2015-06-01
develops linear regression, classification tree, and logistic regression models to determine the ability of the location to support manning requirements... logistic regression model delivers predictive results that allow decision-makers to identify locations with a high probability of meeting unit...manning requirements. The recommendation of this thesis is that the USAR implement the logistic regression model.
ERIC Educational Resources Information Center
Chen, Chau-Kuang
2005-01-01
Logistic and Cox regression methods are practical tools used to model the relationships between certain student learning outcomes and their relevant explanatory variables. The logistic regression model fits an S-shaped curve into a binary outcome with data points of zero and one. The Cox regression model allows investigators to study the duration…
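The S-shaped curve mentioned above is the logistic function applied to a linear predictor. The sketch below is illustrative only: the intercept `b0` and slope `b1` are hypothetical values, not estimates from any study listed here.

```python
import math


def logistic(z):
    """Map a linear predictor z to a probability strictly between 0 and 1."""
    return 1.0 / (1.0 + math.exp(-z))


# Hypothetical intercept and slope, chosen only for illustration.
b0, b1 = -3.0, 1.5

# Probabilities rise slowly, then steeply around z = 0, then level off.
for x in [0, 1, 2, 3, 4]:
    p = logistic(b0 + b1 * x)
    print(f"x={x}  P(y=1)={p:.3f}")
```

With these values the linear predictor is zero at x = 2, so the fitted probability there is exactly one half, the midpoint of the curve.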
Yusuf, O B; Bamgboye, E A; Afolabi, R F; Shodimu, M A
2014-09-01
The logistic regression model is widely used in health research for descriptive and predictive purposes. Unfortunately, most researchers are not always aware that the underlying principles of the technique have failed when the maximum likelihood algorithm does not converge. Young researchers, particularly postgraduate students, may not know why the separation problem, whether quasi or complete, occurs, how to identify it, and how to fix it. This study was designed to critically evaluate convergence issues in articles that employed logistic regression analysis published in the African Journal of Medicine and Medical Sciences between 2004 and 2013. Problems of quasi or complete separation were described and were illustrated with the National Demographic and Health Survey dataset. A critical evaluation of articles that employed logistic regression was conducted. A total of 581 articles were reviewed, of which 40 (6.9%) used binary logistic regression. Twenty-four (60.0%) stated the use of the logistic regression model in the methodology, while none of the articles assessed model fit. Only 3 (12.5%) properly described the procedures. Of the 40 that used the logistic regression model, the problem of convergence occurred in 6 (15.0%) of the articles. Logistic regression tends to be poorly reported in studies published between 2004 and 2013. Our findings showed that the procedure may not be well understood by researchers, since very few described the process in their reports, and that researchers may be totally unaware of the problem of convergence or how to deal with it.
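Why complete separation prevents convergence can be seen in a small sketch. It assumes a no-intercept model with one covariate and a toy dataset (both hypothetical, not the survey data used in the article) in which every negative x has y = 0 and every positive x has y = 1. The log-likelihood keeps increasing as the coefficient grows, so no finite maximum likelihood estimate exists.

```python
import math


def log_likelihood(beta, xs, ys):
    """Log-likelihood of the model P(y=1 | x) = 1 / (1 + exp(-beta * x))."""
    total = 0.0
    for x, y in zip(xs, ys):
        z = beta * x
        # log(1 + exp(-z)), written in a numerically stable form
        log1pexp_neg = math.log1p(math.exp(-abs(z))) + max(-z, 0.0)
        log_p1 = -log1pexp_neg       # log P(y = 1)
        log_p0 = -z - log1pexp_neg   # log P(y = 0)
        total += log_p1 if y == 1 else log_p0
    return total


# Toy data with complete separation at x = 0 (hypothetical values).
xs = [-2.0, -1.0, 1.0, 2.0]
ys = [0, 0, 1, 1]

betas = [1.0, 5.0, 10.0, 20.0]
lls = [log_likelihood(b, xs, ys) for b in betas]
for b, ll in zip(betas, lls):
    print(f"beta={b:5.1f}  log-likelihood={ll:.6g}")
```

Each larger coefficient fits the separated data better, which is why an iterative fitting routine keeps enlarging the estimate instead of settling on a value.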
Logistic Regression: Concept and Application
ERIC Educational Resources Information Center
Cokluk, Omay
2010-01-01
The main focus of logistic regression analysis is classification of individuals in different groups. The aim of the present study is to explain basic concepts and processes of binary logistic regression analysis intended to determine the combination of independent variables which best explain the membership in certain groups called dichotomous…
Novikov, I; Fund, N; Freedman, L S
2010-01-15
Different methods for the calculation of sample size for simple logistic regression (LR) with one normally distributed continuous covariate give different results. Sometimes the difference can be large. Furthermore, some methods require the user to specify the prevalence of cases when the covariate equals its population mean, rather than the more natural population prevalence. We focus on two commonly used methods and show through simulations that the power for a given sample size may differ substantially from the nominal value for one method, especially when the covariate effect is large, while the other method performs poorly if the user provides the population prevalence instead of the required parameter. We propose a modification of the method of Hsieh et al. that requires specification of the population prevalence and that employs Schouten's sample size formula for a t-test with unequal variances and group sizes. This approach appears to increase the accuracy of the sample size estimates for LR with one continuous covariate.
An Entropy-Based Measure for Assessing Fuzziness in Logistic Regression
Weiss, Brandi A.; Dardick, William
2015-01-01
This article introduces an entropy-based measure of data–model fit that can be used to assess the quality of logistic regression models. Entropy has previously been used in mixture-modeling to quantify how well individuals are classified into latent classes. The current study proposes the use of entropy for logistic regression models to quantify the quality of classification and separation of group membership. Entropy complements preexisting measures of data–model fit and provides unique information not contained in other measures. Hypothetical data scenarios, an applied example, and Monte Carlo simulation results are used to demonstrate the application of entropy in logistic regression. Entropy should be used in conjunction with other measures of data–model fit to assess how well logistic regression models classify cases into observed categories. PMID:29795897
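As a rough illustration of the idea, the sketch below computes the normalized entropy index commonly reported in mixture modeling, applied here to fitted probabilities from a binary model. Whether this matches the exact measure defined in the article is an assumption; the function name and the example probabilities are hypothetical.

```python
import math


def normalized_entropy(probs):
    """Return 1 - (total binary entropy) / (n * ln 2).

    Values near 1 mean cases are classified crisply (probabilities near
    0 or 1); values near 0 mean fuzzy classification (probabilities near 0.5).
    """
    n = len(probs)
    total = 0.0
    for p in probs:
        for q in (p, 1.0 - p):
            if q > 0.0:  # treat 0 * log(0) as 0
                total += -q * math.log(q)
    return 1.0 - total / (n * math.log(2))


# Hypothetical fitted probabilities for two models.
crisp = [0.02, 0.97, 0.05, 0.99]
fuzzy = [0.45, 0.55, 0.50, 0.60]

print(round(normalized_entropy(crisp), 3))
print(round(normalized_entropy(fuzzy), 3))
```

The index is a summary of how decisively the fitted probabilities separate the groups, so it would sit alongside, not replace, other fit measures.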
Logistic regression applied to natural hazards: rare event logistic regression with replications
NASA Astrophysics Data System (ADS)
Guns, M.; Vanacker, V.
2012-06-01
Statistical analysis of natural hazards needs particular attention, as most of these phenomena are rare events. This study shows that the ordinary rare event logistic regression, as it is now commonly used in geomorphologic studies, does not always lead to a robust detection of controlling factors, as the results can be strongly sample-dependent. In this paper, we introduce some concepts of Monte Carlo simulations in rare event logistic regression. This technique, so-called rare event logistic regression with replications, combines the strength of probabilistic and statistical methods, and allows overcoming some of the limitations of previous developments through robust variable selection. This technique was here developed for the analyses of landslide controlling factors, but the concept is widely applicable for statistical analyses of natural hazards.
An Entropy-Based Measure for Assessing Fuzziness in Logistic Regression.
Weiss, Brandi A; Dardick, William
2016-12-01
This article introduces an entropy-based measure of data-model fit that can be used to assess the quality of logistic regression models. Entropy has previously been used in mixture-modeling to quantify how well individuals are classified into latent classes. The current study proposes the use of entropy for logistic regression models to quantify the quality of classification and separation of group membership. Entropy complements preexisting measures of data-model fit and provides unique information not contained in other measures. Hypothetical data scenarios, an applied example, and Monte Carlo simulation results are used to demonstrate the application of entropy in logistic regression. Entropy should be used in conjunction with other measures of data-model fit to assess how well logistic regression models classify cases into observed categories.
Sperm Retrieval in Patients with Klinefelter Syndrome: A Skewed Regression Model Analysis.
Chehrazi, Mohammad; Rahimiforoushani, Abbas; Sabbaghian, Marjan; Nourijelyani, Keramat; Sadighi Gilani, Mohammad Ali; Hoseini, Mostafa; Vesali, Samira; Yaseri, Mehdi; Alizadeh, Ahad; Mohammad, Kazem; Samani, Reza Omani
2017-01-01
The most common chromosomal abnormality due to non-obstructive azoospermia (NOA) is Klinefelter syndrome (KS) which occurs in 1-1.72 out of 500-1000 male infants. The probability of retrieving sperm as the outcome could be asymmetrically different between patients with and without KS; therefore, logistic regression analysis is not a well-qualified test for this type of data. This study has been designed to evaluate skewed regression model analysis for data collected from microsurgical testicular sperm extraction (micro-TESE) among azoospermic patients with and without non-mosaic KS syndrome. This cohort study compared the micro-TESE outcome between 134 men with classic KS and 537 men with NOA and normal karyotype who were referred to Royan Institute between 2009 and 2011. In addition to our main outcome, which was sperm retrieval, we also used logistic and skewed regression analyses to compare the following demographic and hormonal factors: age, level of follicle stimulating hormone (FSH), luteinizing hormone (LH), and testosterone between the two groups. A comparison of the micro-TESE between the KS and control groups showed a success rate of 28.4% (38/134) for the KS group and 22.2% (119/537) for the control group. In the KS group, a significant difference (P<0.001) existed between testosterone levels for the successful sperm retrieval group (3.4 ± 0.48 mg/mL) compared to the unsuccessful sperm retrieval group (2.33 ± 0.23 mg/mL). The index for quasi Akaike information criterion (QAIC) had a goodness of fit of 74 for the skewed model which was lower than logistic regression (QAIC=85). According to the results, skewed regression is more efficient in estimating sperm retrieval success when the data from patients with KS are analyzed. This finding should be investigated by conducting additional studies with different data structures.
Dexter, Franklin; Ledolter, Johannes; Hindman, Bradley J
2017-06-01
Our department monitors the quality of anesthesiologists' clinical supervision and provides each anesthesiologist with periodic feedback. We hypothesized that greater differentiation among anesthesiologists' supervision scores could be obtained by adjusting for leniency of the rating resident. From July 1, 2013 to December 31, 2015, our department has utilized the de Oliveira Filho unidimensional nine-item supervision scale to assess the quality of clinical supervision provided by faculty as rated by residents. We examined all 13,664 ratings of the 97 anesthesiologists (ratees) by the 65 residents (raters). Testing for internal consistency among answers to questions (large Cronbach's alpha > 0.90) was performed to rule out that one or two questions accounted for leniency. Mixed-effects logistic regression was used to compare ratees while controlling for rater leniency vs using Student t tests without rater leniency. The mean supervision scale score was calculated for each combination of the 65 raters and nine questions. The Cronbach's alpha was very large (0.977). The mean score was calculated for each of the 3,421 observed combinations of resident and anesthesiologist. The logits of the percentage of scores equal to the maximum value of 4.00 were normally distributed (residents, P = 0.24; anesthesiologists, P = 0.50). There were 20/97 anesthesiologists identified as significant outliers (13 with below average supervision scores and seven with better than average) using the mixed-effects logistic regression with rater leniency entered as a fixed effect but not by Student's t test. In contrast, there were three of 97 anesthesiologists identified as outliers (all three above average) using Student's t tests but not by logistic regression with leniency. The 20 vs 3 was significant (P < 0.001). 
Use of logistic regression with leniency results in greater detection of anesthesiologists with significantly better (or worse) clinical supervision scores than use of Student's t tests (i.e., without adjustment for rater leniency).
Olson, Scott A.; Brouillette, Michael C.
2006-01-01
A logistic regression equation was developed for estimating the probability of a stream flowing intermittently at unregulated, rural stream sites in Vermont. These determinations can be used for a wide variety of regulatory and planning efforts at the Federal, State, regional, county and town levels, including such applications as assessing fish and wildlife habitats, wetlands classifications, recreational opportunities, water-supply potential, waste-assimilation capacities, and sediment transport. The equation will be used to create a derived product for the Vermont Hydrography Dataset having the streamflow characteristic of 'intermittent' or 'perennial.' The Vermont Hydrography Dataset is Vermont's implementation of the National Hydrography Dataset and was created at a scale of 1:5,000 based on statewide digital orthophotos. The equation was developed by relating field-verified perennial or intermittent status of a stream site during normal summer low-streamflow conditions in the summer of 2005 to selected basin characteristics of naturally flowing streams in Vermont. The database used to develop the equation included 682 stream sites with drainage areas ranging from 0.05 to 5.0 square miles. When the 682 sites were observed, 126 were intermittent (had no flow at the time of the observation) and 556 were perennial (had flowing water at the time of the observation). The results of the logistic regression analysis indicate that the probability of a stream having intermittent flow in Vermont is a function of drainage area, elevation of the site, the ratio of basin relief to basin perimeter, and the areal percentage of well- and moderately well-drained soils in the basin. Using a probability cutpoint (a lower probability indicates the site has perennial flow and a higher probability indicates the site has intermittent flow) of 0.5, the logistic regression equation correctly predicted the perennial or intermittent status of 116 test sites 85 percent of the time.
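The cutpoint rule described above can be sketched as follows. The probabilities and field observations are made-up values for illustration, not data from the Vermont study, and the handling of a probability exactly equal to the cutpoint (treated here as perennial) is an assumption, since the abstract does not specify it.

```python
def classify(prob_intermittent, cutpoint=0.5):
    """Label a site from its predicted probability of intermittent flow.

    Assumption: a probability exactly equal to the cutpoint is
    labeled 'perennial'.
    """
    return "intermittent" if prob_intermittent > cutpoint else "perennial"


# Hypothetical predicted probabilities and field-verified statuses.
probs = [0.10, 0.80, 0.45, 0.62, 0.30]
observed = ["perennial", "intermittent", "intermittent",
            "intermittent", "perennial"]

predicted = [classify(p) for p in probs]
accuracy = sum(p == o for p, o in zip(predicted, observed)) / len(observed)

print(predicted)
print(f"fraction correct: {accuracy:.2f}")
```

The fraction correct on held-out sites is the same kind of summary as the 85 percent figure reported for the 116 test sites.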
Power and Sample Size Calculations for Logistic Regression Tests for Differential Item Functioning
ERIC Educational Resources Information Center
Li, Zhushan
2014-01-01
Logistic regression is a popular method for detecting uniform and nonuniform differential item functioning (DIF) effects. Theoretical formulas for the power and sample size calculations are derived for likelihood ratio tests and Wald tests based on the asymptotic distribution of the maximum likelihood estimators for the logistic regression model.…
A Methodology for Generating Placement Rules that Utilizes Logistic Regression
ERIC Educational Resources Information Center
Wurtz, Keith
2008-01-01
The purpose of this article is to provide the necessary tools for institutional researchers to conduct a logistic regression analysis and interpret the results. Aspects of the logistic regression procedure that are necessary to evaluate models are presented and discussed with an emphasis on cutoff values and choosing the appropriate number of…
John Hogland; Nedret Billor; Nathaniel Anderson
2013-01-01
Discriminant analysis, referred to as maximum likelihood classification within popular remote sensing software packages, is a common supervised technique used by analysts. Polytomous logistic regression (PLR), also referred to as multinomial logistic regression, is an alternative classification approach that is less restrictive, more flexible, and easy to interpret. To...
Large Unbalanced Credit Scoring Using Lasso-Logistic Regression Ensemble
Wang, Hong; Xu, Qingsong; Zhou, Lifeng
2015-01-01
Recently, various ensemble learning methods with different base classifiers have been proposed for credit scoring problems. However, for various reasons, there has been little research using logistic regression as the base classifier. In this paper, given large unbalanced data, we consider the plausibility of ensemble learning using regularized logistic regression as the base classifier to deal with credit scoring problems. In this research, the data is first balanced and diversified by clustering and bagging algorithms. Then we apply a Lasso-logistic regression learning ensemble to evaluate the credit risks. We show that the proposed algorithm outperforms popular credit scoring models such as decision tree, Lasso-logistic regression and random forests in terms of AUC and F-measure. We also provide two importance measures for the proposed model to identify important variables in the data. PMID:25706988
Suppression of the oculocephalic reflex (doll's eyes phenomenon) in normal full-term babies.
Snir, Moshe; Hasanreisoglu, Murat; Hasanreisoglue, Murat; Goldenberg-Cohen, Nitza; Friling, Ronit; Katz, Kalman; Nachum, Yoav; Benjamini, Yoav; Herscovici, Zvi; Axer-Siegel, Ruth
2010-05-01
To determine the precise age of suppression of the oculocephalic reflex in infants and its relationship to specific clinical characteristics. The oculocephalic reflex was prospectively tested in 325 healthy full-term babies aged 1 to 32 weeks attending an orthopedic outpatient clinic. Two ophthalmologists raised the baby's head 30 degrees above horizontal and rapidly rotated it in the horizontal and vertical planes while watching the conjugate eye movement. Suppression of the reflex, by observer agreement, was analyzed in relation to gestational age, postpartum age, postconceptional age, birth weight, and current weight. The data were fitted to a logistic regression model to determine the probability of suppression of the reflex according to the clinical variables. The oculocephalic reflex was suppressed in 75% of babies by the age of 11.5 weeks and in more than 95% of babies aged 20 weeks. Although postpartum age had a greater influence than gestational age, both were significantly correlated with suppression of the reflex (p = 0.01 and p = 0.04, respectively; two-sided t-test). Postpartum age was the best single variable explaining absence of the reflex. On logistic regression with cross-validation, the model including postpartum age and current weight yielded the best results; both these factors were highly correlated with suppression of the reflex (r = 0.74). The oculocephalic reflex is suppressed in the vast majority of normal infants by age 11.5 weeks. The disappearance of the reflex occurs gradually and longitudinally and is part of the normal maturation of the visual system.
Association Between Inpatient Sleep Loss and Hyperglycemia of Hospitalization
DePietro, Regina H.; Knutson, Kristen L.; Spampinato, Lisa; Anderson, Samantha L.; Meltzer, David O.; Van Cauter, Eve
2017-01-01
OBJECTIVE To determine whether inpatient sleep duration and efficiency are associated with a greater risk of hyperglycemia in hospitalized patients with and without diabetes. RESEARCH DESIGN AND METHODS In this retrospective analysis of a prospective cohort study, medical inpatients ≥50 years of age were interviewed, and their charts were reviewed to obtain demographic data and diagnosis. Using World Health Organization criteria, patients were categorized as having normal blood glucose, impaired fasting blood glucose, or hyperglycemia based on morning glucose from the electronic health record. Wrist actigraphy measured sleep. Multivariable ordinal logistic regression models, controlling for subject random effects, tested the association between inpatient sleep duration and proportional odds of hyperglycemia versus impaired fasting blood glucose or impaired fasting blood glucose versus normal blood glucose in hospitalized adults. RESULTS A total of 212 patients (60% female and 74% African American) were enrolled. Roughly one-third (73, 34%) had diabetes. Objective inpatient sleep measures did not differ between patients with or without diabetes. In ordinal logistic regression models, each additional hour of in-hospital sleep was associated with an 11% (odds ratio 0.89 [95% CI 0.80, 0.99]; P = 0.043) lower proportional odds of a higher glucose category the next morning (hyperglycemia vs. elevated and elevated vs. normal). Every 10% increase in sleep efficiency was associated with an 18% lower proportional odds of a higher glucose category (odds ratio 0.82 [95% CI 0.74, 0.89]; P < 0.001). CONCLUSIONS Among medical inpatients, both shorter sleep duration and worse sleep efficiency were independently associated with greater proportional odds of hyperglycemia and impaired fasting glucose. PMID:27903614
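The percentages quoted above follow directly from the odds ratios: an odds ratio of 0.89 corresponds to (0.89 - 1) × 100 ≈ -11%, and 0.82 to about -18%. The short sketch below reproduces that arithmetic. The two-hour calculation assumes the per-hour effect is multiplicative on the odds scale, as in a proportional odds model, and is an illustration rather than a result reported in the abstract.

```python
def percent_change_in_odds(odds_ratio):
    """Convert an odds ratio to a percentage change in the odds."""
    return (odds_ratio - 1.0) * 100.0


or_per_hour = 0.89          # per additional hour of sleep (from the abstract)
or_per_10pct_eff = 0.82     # per 10% increase in sleep efficiency (from the abstract)

print(round(percent_change_in_odds(or_per_hour)))       # -11
print(round(percent_change_in_odds(or_per_10pct_eff)))  # -18

# Illustration only: two additional hours under a multiplicative effect.
or_two_hours = or_per_hour ** 2
print(round(percent_change_in_odds(or_two_hours)))      # -21
```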
An Entropy-Based Measure for Assessing Fuzziness in Logistic Regression
ERIC Educational Resources Information Center
Weiss, Brandi A.; Dardick, William
2016-01-01
This article introduces an entropy-based measure of data-model fit that can be used to assess the quality of logistic regression models. Entropy has previously been used in mixture-modeling to quantify how well individuals are classified into latent classes. The current study proposes the use of entropy for logistic regression models to quantify…
What Are the Odds of that? A Primer on Understanding Logistic Regression
ERIC Educational Resources Information Center
Huang, Francis L.; Moon, Tonya R.
2013-01-01
The purpose of this Methodological Brief is to present a brief primer on logistic regression, a commonly used technique when modeling dichotomous outcomes. Using data from the National Education Longitudinal Study of 1988 (NELS:88), logistic regression techniques were used to investigate student-level variables in eighth grade (i.e., enrolled in a…
On the Usefulness of a Multilevel Logistic Regression Approach to Person-Fit Analysis
ERIC Educational Resources Information Center
Conijn, Judith M.; Emons, Wilco H. M.; van Assen, Marcel A. L. M.; Sijtsma, Klaas
2011-01-01
The logistic person response function (PRF) models the probability of a correct response as a function of the item locations. Reise (2000) proposed to use the slope parameter of the logistic PRF as a person-fit measure. He reformulated the logistic PRF model as a multilevel logistic regression model and estimated the PRF parameters from this…
Stylianou, Neophytos; Akbarov, Artur; Kontopantelis, Evangelos; Buchan, Iain; Dunn, Ken W
2015-08-01
Predicting mortality from burn injury has traditionally employed logistic regression models. Alternative machine learning methods have been introduced in some areas of clinical prediction as the necessary software and computational facilities have become accessible. Here we compare logistic regression and machine learning predictions of mortality from burn. An established logistic mortality model was compared to machine learning methods (artificial neural network, support vector machine, random forests and naïve Bayes) using a population-based (England & Wales) case-cohort registry. Predictive evaluation used: area under the receiver operating characteristic curve; sensitivity; specificity; positive predictive value and Youden's index. All methods had comparable discriminatory abilities, similar sensitivities, specificities and positive predictive values. Although some machine learning methods performed marginally better than logistic regression the differences were seldom statistically significant and clinically insubstantial. Random forests were marginally better for high positive predictive value and reasonable sensitivity. Neural networks yielded slightly better prediction overall. Logistic regression gives an optimal mix of performance and interpretability. The established logistic regression model of burn mortality performs well against more complex alternatives. Clinical prediction with a small set of strong, stable, independent predictors is unlikely to gain much from machine learning outside specialist research contexts. Copyright © 2015 Elsevier Ltd and ISBI. All rights reserved.
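The evaluation measures named in this abstract (sensitivity, specificity, positive predictive value, Youden's index) all derive from a 2x2 confusion matrix. A minimal Python sketch follows; the counts are hypothetical and are not taken from the burn registry.

```python
def diagnostic_metrics(tp: int, fp: int, tn: int, fn: int) -> dict:
    """Standard measures computed from confusion-matrix counts."""
    sensitivity = tp / (tp + fn)   # true positive rate
    specificity = tn / (tn + fp)   # true negative rate
    ppv = tp / (tp + fp)           # positive predictive value
    youden = sensitivity + specificity - 1.0  # Youden's index J
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "ppv": ppv,
        "youden": youden,
    }


# Hypothetical counts for illustration only.
metrics = diagnostic_metrics(tp=80, fp=30, tn=870, fn=20)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

The area under the ROC curve is not shown because it requires predicted scores rather than a single thresholded table.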
Valle, Denis; Lima, Joanna M Tucker; Millar, Justin; Amratia, Punam; Haque, Ubydul
2015-11-04
Logistic regression is a statistical model widely used in cross-sectional and cohort studies to identify and quantify the effects of potential disease risk factors. However, the impact of imperfect tests on adjusted odds ratios (and thus on the identification of risk factors) is under-appreciated. The purpose of this article is to draw attention to the problem associated with modelling imperfect diagnostic tests, and propose simple Bayesian models to adequately address this issue. A systematic literature review was conducted to determine the proportion of malaria studies that appropriately accounted for false-negatives/false-positives in a logistic regression setting. Inference from the standard logistic regression was also compared with that from three proposed Bayesian models using simulations and malaria data from the western Brazilian Amazon. A systematic literature review suggests that malaria epidemiologists are largely unaware of the problem of using logistic regression to model imperfect diagnostic test results. Simulation results reveal that statistical inference can be substantially improved when using the proposed Bayesian models versus the standard logistic regression. Finally, analysis of original malaria data with one of the proposed Bayesian models reveals that microscopy sensitivity is strongly influenced by how long people have lived in the study region, and an important risk factor (i.e., participation in forest extractivism) is identified that would have been missed by standard logistic regression. Given the numerous diagnostic methods employed by malaria researchers and the ubiquitous use of logistic regression to model the results of these diagnostic tests, this paper provides critical guidelines to improve data analysis practice in the presence of misclassification error. Easy-to-use code that can be readily adapted to WinBUGS is provided, enabling straightforward implementation of the proposed Bayesian models.
Chuang, Jung-Fang; Rau, Cheng-Shyuan; Kuo, Pao-Jen; Chen, Yi-Chun; Hsu, Shiun-Yuan; Hsieh, Hsiao-Yun; Hsieh, Ching-Hua
2016-03-18
The adverse impact of obesity has been extensively studied in the general population; however, the added risk of obesity on trauma-related mortality remains controversial. This study investigated and compared mortality as well as injury patterns and length of stay (LOS) in obese and normal-weight patients hospitalized for trauma in the hospital and intensive care unit (ICU) of a Level I trauma center in southern Taiwan. Detailed data of 880 obese adult patients with body mass index (BMI) ≥ 30 kg/m² and 5391 normal-weight adult patients (25 > BMI ≥ 18.5 kg/m²) who had sustained a trauma injury between January 1, 2009 and December 31, 2013 were retrieved from the Trauma Registry System. Pearson's chi-squared, Fisher's exact, and independent Student's t-tests were used to compare differences between groups. Propensity score matching with logistic regression was used to evaluate the effect of obesity on mortality. In this study, obese patients were more often men, motorcycle riders and pedestrians, and had a lower proportion of alcohol intoxication compared to normal-weight patients. Analysis of Abbreviated Injury Scale scores revealed that obese trauma patients presented with a higher rate of injury to the thorax, but a lower rate of facial injuries than normal-weight patients. No significant differences were found between obese and normal-weight patients regarding Injury Severity Score (ISS), Trauma-Injury Severity Score (TRISS), mortality, the proportion of patients admitted to the ICU, or LOS in ICU. After propensity score matching, logistic regression of 66 well-matched pairs did not show a significant influence of obesity on mortality (odds ratio: 1.51, 95% confidence interval: 0.54-4.23; p = 0.438). However, significantly longer hospital LOS (10.6 vs. 9.5 days, respectively, p = 0.044) was observed in obese patients than in normal-weight patients, particularly obese patients with pelvic, tibial, or fibular fractures. Compared to normal-weight patients, obese patients presented with different injury characteristics and bodily injury patterns but no difference in mortality.
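A reported odds ratio and its confidence interval can be checked for internal consistency with the reported p-value. The Python sketch below assumes the interval is a Wald interval, symmetric on the log-odds scale; the study may have used a different method, so the recovered p-value is only expected to be approximately equal to the published one. The function name is hypothetical.

```python
import math
from statistics import NormalDist


def wald_p_from_ci(odds_ratio: float, lower: float, upper: float,
                   level: float = 0.95) -> tuple:
    """Recover the log-odds standard error, z statistic, and two-sided p-value
    from an odds ratio and its confidence interval (Wald assumption)."""
    z_crit = NormalDist().inv_cdf(0.5 + level / 2.0)
    se = (math.log(upper) - math.log(lower)) / (2.0 * z_crit)
    z = math.log(odds_ratio) / se
    p = 2.0 * (1.0 - NormalDist().cdf(abs(z)))
    return se, z, p


# Values quoted in the abstract above (after propensity score matching).
se, z, p = wald_p_from_ci(1.51, 0.54, 4.23)
print(f"SE(log OR) = {se:.3f}, z = {z:.3f}, p = {p:.3f}")
```

Under these assumptions the recovered p-value falls in the low 0.4s, in the same range as the reported p = 0.438; small differences are expected because the published figures are rounded.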
2004-03-01
constant variance via an analysis of the residuals, as well as the Breusch-Pagan test (see Figure 3 below). As a result, we follow the footsteps of...reasonably normal, which ensures that our residuals meet the assumption of constant variance by passing the Breusch-Pagan test (see Figure 4 below...sections for Research and Development, Test and Evaluation (RDT&E), procurement and military construction (Jarvaise, 1996:3). While differing
The validity of self-reported vs. measured body weight and height and the effect of self-perception.
Gokler, Mehmet Enes; Bugrul, Necati; Sarı, Ahu Ozturk; Metintas, Selma
2018-01-01
The objective was to assess the validity of self-reported body weight and height and the possible influence of self-perception of body mass index (BMI) status on the actual BMI during the adolescent period. This cross-sectional study was conducted on 3918 high school students. Accurate BMI perception occurred when the student's self-perception of their BMI status did not differ from their actual BMI based on measured height and weight. Agreement between the measured and self-reported body height and weight and BMI values was determined using the Bland-Altman method. To determine the effects of "a good level of agreement", hierarchical logistic regression models were used. Among male students who reported their BMI in the normal region, 2.8% were measured as overweight while 0.6% of them were measured as obese. For females in the same group, these percentages were 1.3% and 0.4% respectively. Among male students who perceived their BMI in the normal region, 8.5% were measured as overweight while 0.4% of them were measured as obese. For females these percentages were 25.6% and 1.8% respectively. According to logistic regression analysis, residence and accurate BMI perception were significantly associated with "good agreement" (p ≤ 0.001). The results of this study demonstrated that in determining obesity and overweight statuses, non-accurate weight perception is a potential risk for students.
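The Bland-Altman method mentioned in this abstract summarizes agreement between two measurements by the mean difference (bias) and the 95% limits of agreement, bias ± 1.96 SD of the differences. A minimal Python sketch follows; the weights are hypothetical and the sign convention (self-reported minus measured) is an assumption, not taken from the study.

```python
from statistics import mean, stdev


def bland_altman(measured: list, reported: list) -> tuple:
    """Return (bias, lower limit, upper limit) for paired measurements.

    Differences are reported minus measured; limits are bias +/- 1.96 * SD.
    """
    diffs = [r - m for m, r in zip(measured, reported)]
    bias = mean(diffs)
    sd = stdev(diffs)  # sample standard deviation of the differences
    return bias, bias - 1.96 * sd, bias + 1.96 * sd


# Hypothetical body weights in kg, for illustration only.
measured_kg = [60.0, 72.5, 55.0, 81.0, 67.5]
reported_kg = [59.0, 71.0, 55.5, 79.0, 67.0]

bias, lower, upper = bland_altman(measured_kg, reported_kg)
print(f"bias = {bias:.2f} kg, limits of agreement = [{lower:.2f}, {upper:.2f}] kg")
```

A negative bias under this convention would indicate that self-reported values tend to fall below measured ones.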
Mimenza-Alvarado, Alberto; Aguilar-Navarro, Sara G; Yeverino-Castro, Sara; Mendoza-Franco, César; Ávila-Funes, José Alberto; Román, Gustavo C
2018-01-01
Cerebral small-vessel disease (SVD) represents the most frequent type of vascular brain lesions, often coexisting with Alzheimer disease (AD). By quantifying white matter hyperintensities (WMH) and hippocampal and parietal atrophy, we aimed to describe the prevalence and severity of SVD among older adults with normal cognition (NC), mild cognitive impairment (MCI), and probable AD and to describe associated risk factors. This study included 105 older adults evaluated with magnetic resonance imaging and clinical and neuropsychological tests. We used the Fazekas scale (FS) for quantification of WMH, the Scheltens scale (SS) for hippocampal atrophy, and the Koedam scale (KS) for parietal atrophy. Logistic regression models were performed to determine the association between FS, SS, and KS scores and the presence of NC, MCI, or probable AD. Compared to NC subjects, SVD was more prevalent in MCI and probable AD subjects. After adjusting for confounding factors, logistic regression showed a positive association between higher scores on the FS and probable AD (OR = 7.6, 95% CI 2.7-20, p < 0.001). With the use of the SS and KS (OR = 4.5, 95% CI 3.5-58, p = 0.003 and OR = 8.9, 95% CI 1-72, p = 0.04, respectively), the risk also remained significant for probable AD. These results suggest an association between severity of vascular brain lesions and neurodegeneration.
Prevalence of Neuropsychiatric Symptoms in CIND and Its Subtypes: The Cache County Study
Peters, ME; Rosenberg, P; Steinberg, M; Tschanz, J; Norton, MC; Welsh-Bohmer, KA; Hayden, KM; Breitner, JCS; Lyketsos, CG
2011-01-01
Objectives 1) To report rates of neuropsychiatric symptoms (NPS) in cognitive impairment, no dementia (CIND). 2) To compare the 30-day prevalence of NPS in CIND with that in dementia and cognitively normal individuals. 3) To compare the prevalence of NPS in amnestic MCI (aMCI) with other predementia syndromes. Design Comparison of prevalence proportions among several defined groups. Setting Population-based study. Participants A subsample of the permanent residents of Cache County, Utah, aged 65 years or older in January 1995 (N = 5092) and who had completed clinical assessments and had an informant-completed Neuropsychiatric Inventory. Measurements Chi-square statistics, tests for trend, and logistic regression models were used to analyze the three objectives listed earlier. Results The most prevalent NPS in those with CIND were depression (16.9%), irritability (9.8%), nighttime behaviors (7.6%), apathy (6.9%), and anxiety (5.4%). Trend analyses confirmed that the CIND group had NPS prevalence rates that fell between the normal and dementia groups for most NPS. Logistic regression models showed no significant difference between aMCI and other CIND participants in the prevalence of any NPS (lowest p: 0.316). Conclusions These data confirm the relatively high prevalence of NPS in CIND reported by other studies, especially for affective symptoms. No differences in NPS prevalence were found between aMCI and other types of CIND. PMID:22522960
Logistic regression for risk factor modelling in stuttering research.
Reed, Phil; Wu, Yaqionq
2013-06-01
To outline the uses of logistic regression and other statistical methods for risk factor analysis in the context of research on stuttering. The principles underlying the application of a logistic regression are illustrated, and the types of questions to which such a technique has been applied in the stuttering field are outlined. The assumptions and limitations of the technique are discussed with respect to existing stuttering research, and with respect to formulating appropriate research strategies to accommodate these considerations. Finally, some alternatives to the approach are briefly discussed. The way the statistical procedures are employed is demonstrated with some hypothetical data. Research into several practical issues concerning stuttering could benefit if risk factor modelling were used. Important examples are early diagnosis, prognosis (whether a child will recover or persist) and assessment of treatment outcome. After reading this article you will: (a) Summarize the situations in which logistic regression can be applied to a range of issues about stuttering; (b) Follow the steps in performing a logistic regression analysis; (c) Describe the assumptions of the logistic regression technique and the precautions that need to be checked when it is employed; (d) Be able to summarize its advantages over other techniques like estimation of group differences and simple regression. Copyright © 2012 Elsevier Inc. All rights reserved.
Dynamic Dimensionality Selection for Bayesian Classifier Ensembles
2015-03-19
learning of weights in an otherwise generatively learned naive Bayes classifier. WANBIA-C is very competitive to Logistic Regression but much more...classifier, Generative learning, Discriminative learning, Naïve Bayes, Feature selection, Logistic regression, higher order attribute independence 16...discriminative learning of weights in an otherwise generatively learned naive Bayes classifier. WANBIA-C is very competitive to Logistic Regression but
Travis Woolley; David C. Shaw; Lisa M. Ganio; Stephen Fitzgerald
2012-01-01
Logistic regression models used to predict tree mortality are critical to post-fire management, planning prescribed burns and understanding disturbance ecology. We review literature concerning post-fire mortality prediction using logistic regression models for coniferous tree species in the western USA. We include synthesis and review of: methods to develop, evaluate...
Preserving Institutional Privacy in Distributed binary Logistic Regression.
Wu, Yuan; Jiang, Xiaoqian; Ohno-Machado, Lucila
2012-01-01
Privacy is becoming a major concern when sharing biomedical data across institutions. Although methods for protecting privacy of individual patients have been proposed, it is not clear how to protect institutional privacy, which is often a critical concern of data custodians. Building upon our previous work, Grid Binary LOgistic REgression (GLORE), we developed an Institutional Privacy-preserving Distributed binary Logistic Regression model (IPDLR) that considers both individual and institutional privacy for building a logistic regression model in a distributed manner. We tested our method using both simulated and clinical data, showing how it is possible to protect the privacy of individuals and of institutions using a distributed strategy.
Covariate Imbalance and Adjustment for Logistic Regression Analysis of Clinical Trial Data
Ciolino, Jody D.; Martin, Reneé H.; Zhao, Wenle; Jauch, Edward C.; Hill, Michael D.; Palesch, Yuko Y.
2014-01-01
In logistic regression analysis for binary clinical trial data, adjusted treatment effect estimates are often not equivalent to unadjusted estimates in the presence of influential covariates. This paper uses simulation to quantify the benefit of covariate adjustment in logistic regression. However, International Conference on Harmonization guidelines suggest that covariate adjustment be pre-specified. Unplanned adjusted analyses should be considered secondary. Results suggest that if adjustment is not possible or unplanned in a logistic setting, balance in continuous covariates can alleviate some (but never all) of the shortcomings of unadjusted analyses. The case of log binomial regression is also explored. PMID:24138438
Differentially private distributed logistic regression using private and public data.
Ji, Zhanglong; Jiang, Xiaoqian; Wang, Shuang; Xiong, Li; Ohno-Machado, Lucila
2014-01-01
Privacy protection is an important issue in medical informatics and differential privacy is a state-of-the-art framework for data privacy research. Differential privacy offers provable privacy against attackers who have auxiliary information, and can be applied to data mining models (for example, logistic regression). However, differentially private methods sometimes introduce too much noise and make outputs less useful. Given available public data in medical research (e.g. from patients who sign open-consent agreements), we can design algorithms that use both public and private data sets to decrease the amount of noise that is introduced. In this paper, we modify the update step in the Newton-Raphson method to propose a differentially private distributed logistic regression model based on both public and private data. We try our algorithm on three different data sets, and show its advantage over: (1) a logistic regression model based solely on public data, and (2) a differentially private distributed logistic regression model based on private data under various scenarios. Logistic regression models built with our new algorithm based on both private and public datasets demonstrate better utility than models trained on private or public datasets alone, without sacrificing the rigorous privacy guarantee.
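For orientation, the sketch below shows the standard, non-private Newton-Raphson iteration for logistic regression with an intercept and one predictor. It is not the modified, differentially private update proposed in the paper, whose details are not given in the abstract; the data and function names are hypothetical.

```python
import math


def sigmoid(z: float) -> float:
    return 1.0 / (1.0 + math.exp(-z))


def fit_logistic_newton(x: list, y: list, iterations: int = 25) -> tuple:
    """Fit P(y=1) = sigmoid(b0 + b1*x) by Newton-Raphson (no noise, no privacy)."""
    b0, b1 = 0.0, 0.0
    for _ in range(iterations):
        g0 = g1 = 0.0           # gradient of the log-likelihood
        h00 = h01 = h11 = 0.0   # entries of X^T W X (negative Hessian)
        for xi, yi in zip(x, y):
            p = sigmoid(b0 + b1 * xi)
            w = p * (1.0 - p)
            g0 += yi - p
            g1 += (yi - p) * xi
            h00 += w
            h01 += w * xi
            h11 += w * xi * xi
        det = h00 * h11 - h01 * h01
        # Newton step: beta <- beta + (X^T W X)^{-1} * gradient (2x2 inverse).
        b0 += (h11 * g0 - h01 * g1) / det
        b1 += (-h01 * g0 + h00 * g1) / det
    return b0, b1


# Hypothetical, non-separable toy data.
x_demo = [0, 1, 2, 3, 4, 5, 6, 7]
y_demo = [0, 0, 1, 0, 1, 1, 0, 1]
b0, b1 = fit_logistic_newton(x_demo, y_demo)
print(f"intercept = {b0:.3f}, slope = {b1:.3f}")
```

The toy data are deliberately non-separable; with perfectly separated classes the maximum likelihood estimate does not exist and this iteration would diverge.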
Pang, Tiantian; Huang, Leidan; Deng, Yingyuan; Wang, Tianfu; Chen, Siping; Gong, Xuehao; Liu, Weixiang
2017-01-01
The aim of the study is to screen the significant sonographic features by logistic regression analysis and fit a model to diagnose thyroid nodules. A total of 525 pathological thyroid nodules were retrospectively analyzed. All the nodules underwent conventional ultrasonography (US), strain elastosonography (SE), and contrast-enhanced ultrasound (CEUS). Those nodules' 12 suspicious sonographic features were used to assess thyroid nodules. The significant features of diagnosing thyroid nodules were picked out by logistic regression analysis. All variables that were statistically related to diagnosis of thyroid nodules, at a level of p < 0.05 were embodied in a logistic regression analysis model. The significant features in the logistic regression model of diagnosing thyroid nodules were calcification, suspected cervical lymph node metastasis, hypoenhancement pattern, margin, shape, vascularity, posterior acoustic, echogenicity, and elastography score. According to the results of logistic regression analysis, the formula that could predict whether or not thyroid nodules are malignant was established. The area under the receiver operating curve (ROC) was 0.930 and the sensitivity, specificity, accuracy, positive predictive value, and negative predictive value were 83.77%, 89.56%, 87.05%, 86.04%, and 87.79% respectively. PMID:29228030
Amini, Payam; Maroufizadeh, Saman; Samani, Reza Omani; Hamidi, Omid; Sepidarkish, Mahdi
2017-06-01
Preterm birth (PTB) is a leading cause of neonatal death and the second biggest cause of death in children under five years of age. The objective of this study was to determine the prevalence of PTB and its associated factors using logistic regression and decision tree classification methods. This cross-sectional study was conducted on 4,415 pregnant women in Tehran, Iran, from July 6-21, 2015. Data were collected by a researcher-developed questionnaire through interviews with mothers and review of their medical records. To evaluate the accuracy of the logistic regression and decision tree methods, several indices such as sensitivity, specificity, and the area under the curve were used. The PTB rate was 5.5% in this study. The logistic regression outperformed the decision tree for the classification of PTB based on risk factors. Logistic regression showed that multiple pregnancies, mothers with preeclampsia, and those who conceived with assisted reproductive technology had an increased risk for PTB (p < 0.05). Identifying and training mothers at risk as well as improving prenatal care may reduce the PTB rate. We also recommend that statisticians utilize the logistic regression model for the classification of risk groups for PTB.
Predicting clicks of PubMed articles
Mao, Yuqing; Lu, Zhiyong
2013-01-01
Predicting the popularity or access usage of an article has the potential to improve the quality of PubMed searches. We can model the click trend of each article as its access changes over time by mining the PubMed query logs, which contain the previous access history for all articles. In this article, we examine the access patterns produced by PubMed users in two years (July 2009 to July 2011). We explore the time series of accesses for each article in the query logs, model the trends with regression approaches, and subsequently use the models for prediction. We show that the click trends of PubMed articles are best fitted with a log-normal regression model. This model allows the number of accesses an article receives and the time since it first becomes available in PubMed to be related via quadratic and logistic functions, with the model parameters to be estimated via maximum likelihood. Our experiments predicting the number of accesses for an article based on its past usage demonstrate that the mean absolute error and mean absolute percentage error of our model are 4.0% and 8.1% lower than the power-law regression model, respectively. The log-normal distribution is also shown to perform significantly better than a previous prediction method based on a human memory theory in cognitive science. This work warrants further investigation on the utility of such a log-normal regression approach towards improving information access in PubMed. PMID:24551386
Logistic regression for dichotomized counts.
Preisser, John S; Das, Kalyan; Benecha, Habtamu; Stamm, John W
2016-12-01
Sometimes there is interest in a dichotomized outcome indicating whether a count variable is positive or zero. Under this scenario, the application of ordinary logistic regression may result in efficiency loss, which is quantifiable under an assumed model for the counts. In such situations, a shared-parameter hurdle model is investigated for more efficient estimation of regression parameters relating to overall effects of covariates on the dichotomous outcome, while handling count data with many zeroes. One model part provides a logistic regression containing marginal log odds ratio effects of primary interest, while an ancillary model part describes the mean count of a Poisson or negative binomial process in terms of nuisance regression parameters. Asymptotic efficiency of the logistic model parameter estimators of the two-part models is evaluated with respect to ordinary logistic regression. Simulations are used to assess the properties of the models with respect to power and Type I error, the latter investigated under both misspecified and correctly specified models. The methods are applied to data from a randomized clinical trial of three toothpaste formulations to prevent incident dental caries in a large population of Scottish schoolchildren. © The Author(s) 2014.
Zhu, K; Lou, Z; Zhou, J; Ballester, N; Kong, N; Parikh, P
2015-01-01
This article is part of the Focus Theme of Methods of Information in Medicine on "Big Data and Analytics in Healthcare". Hospital readmissions raise healthcare costs and cause significant distress to providers and patients. It is, therefore, of great interest to healthcare organizations to predict which patients are at risk of being readmitted to their hospitals. However, current logistic regression based risk prediction models have limited prediction power when applied to hospital administrative data. Meanwhile, although decision trees and random forests have been applied, they tend to be too complex for hospital practitioners to understand. Explore the use of conditional logistic regression to increase the prediction accuracy. We analyzed an HCUP statewide inpatient discharge record dataset, which includes patient demographics, clinical and care utilization data from California. We extracted records of heart failure Medicare beneficiaries who had inpatient experience during an 11-month period. We corrected the data imbalance issue with under-sampling. In our study, we first applied standard logistic regression and decision tree to obtain influential variables and derive practically meaningful decision rules. We then stratified the original data set accordingly and applied logistic regression on each data stratum. We further explored the effect of interacting variables in the logistic regression modeling. We conducted cross validation to assess the overall prediction performance of conditional logistic regression (CLR) and compared it with standard classification models. The developed CLR models outperformed several standard classification models (e.g., straightforward logistic regression, stepwise logistic regression, random forest, support vector machine). For example, the best CLR model improved the classification accuracy by nearly 20% over the straightforward logistic regression model. Furthermore, the developed CLR models tend to achieve better sensitivity of more than 10% over the standard classification models, which can be translated to correct labeling of an additional 400-500 readmissions for heart failure patients in the state of California over a year. Lastly, several key predictors identified from the HCUP data include the disposition location from discharge, the number of chronic conditions, and the number of acute procedures. It would be beneficial to apply simple decision rules obtained from the decision tree in an ad-hoc manner to guide the cohort stratification. It could be potentially beneficial to explore the effect of pairwise interactions between influential predictors when building the logistic regression models for different data strata. Judicious use of the ad-hoc CLR models developed offers insights into future development of prediction models for hospital readmissions, which can lead to better intuition in identifying high-risk patients and developing effective post-discharge care strategies. Lastly, this paper is expected to raise the awareness of collecting data on additional markers and developing necessary database infrastructure for larger-scale exploratory studies on readmission risk prediction.
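The abstract states that class imbalance was corrected with under-sampling but does not say which variant was used. The Python sketch below shows one common form, random under-sampling of the majority class to the size of the minority class; the record layout, the `readmitted` field, and the function name are hypothetical.

```python
import random


def undersample(records: list, label_key: str = "readmitted", seed: int = 0) -> list:
    """Randomly drop majority-class records until both classes have equal size."""
    rng = random.Random(seed)
    positives = [r for r in records if r[label_key] == 1]
    negatives = [r for r in records if r[label_key] == 0]
    # Identify which class is smaller; sample the larger one down to that size.
    minority, majority = sorted([positives, negatives], key=len)
    balanced = minority + rng.sample(majority, len(minority))
    rng.shuffle(balanced)
    return balanced


# Hypothetical discharge records: 10 readmissions among 100 stays.
records = [{"id": i, "readmitted": 1 if i < 10 else 0} for i in range(100)]
balanced = undersample(records)
print(len(balanced), sum(r["readmitted"] for r in balanced))
```

Under-sampling discards data and changes the class prior, so predicted probabilities from a model trained on the balanced set generally need recalibration before being read as readmission risks.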
Interpretation of commonly used statistical regression models.
Kasza, Jessica; Wolfe, Rory
2014-01-01
A review of some regression models commonly used in respiratory health applications is provided in this article. Simple linear regression, multiple linear regression, logistic regression and ordinal logistic regression are considered. The focus of this article is on the interpretation of the regression coefficients of each model, which are illustrated through the application of these models to a respiratory health research study. © 2013 The Authors. Respirology © 2013 Asian Pacific Society of Respirology.
Sensitivity of Raman spectroscopy to normal patient variability
NASA Astrophysics Data System (ADS)
Vargis, Elizabeth; Byrd, Teresa; Logan, Quinisha; Khabele, Dineo; Mahadevan-Jansen, Anita
2011-11-01
Many groups have used Raman spectroscopy for diagnosing cervical dysplasia; however, there have been few studies looking at the effect of normal physiological variations on Raman spectra. We assess four patient variables that may affect normal Raman spectra: Race/ethnicity, body mass index (BMI), parity, and socioeconomic status. Raman spectra were acquired from a diverse population of 75 patients undergoing routine screening for cervical dysplasia. Classification of Raman spectra from patients with a normal cervix is performed using sparse multinomial logistic regression (SMLR) to determine if any of these variables has a significant effect. Results suggest that BMI and parity have the greatest impact, whereas race/ethnicity and socioeconomic status have a limited effect. Incorporating BMI and obstetric history into classification algorithms may increase sensitivity and specificity rates of disease classification using Raman spectroscopy. Studies are underway to assess the effect of these variables on disease.
2004-03-01
Breusch - Pagan test for constant variance of the residuals. Using Microsoft Excel® we calculate a p-value of 0.841237. This high p-value, which is above...our alpha of 0.05, indicates that our residuals indeed pass the Breusch - Pagan test for constant variance. In addition to the assumption tests , we...Wilk Test for Normality – Support (Reduced) Model (OLS) Finally, we perform a Breusch - Pagan test for constant variance of the residuals. Using
Heser, Kathrin; Bleckwenn, Markus; Wiese, Birgitt; Mamone, Silke; Riedel-Heller, Steffi G; Stein, Janine; Lühmann, Dagmar; Posselt, Tina; Fuchs, Angela; Pentzek, Michael; Weyerer, Siegfried; Werle, Jochen; Weeg, Dagmar; Bickel, Horst; Brettschneider, Christian; König, Hans-Helmut; Maier, Wolfgang; Scherer, Martin; Wagner, Michael
2016-08-01
Late-life depression is frequently accompanied by cognitive impairments. Whether these impairments indicate a prodromal state of dementia, or are a symptomatic expression of depression per se is not well-studied. In a cohort of very old initially non-demented primary care patients (n = 2,709, mean age = 81.1 y), cognitive performance was compared between groups of participants with or without elevated depressive symptoms and with or without subsequent dementia using ANCOVA (adjusted for age, sex, and education). Logistic regression analyses were computed to predict subsequent dementia over up to six years of follow-up. The same analytical approach was performed for lifetime major depression. Participants with elevated depressive symptoms without subsequent dementia showed only small to medium cognitive deficits. In contrast, participants with depressive symptoms with subsequent dementia showed medium to very large cognitive deficits. In adjusted logistic regression models, learning and memory deficits predicted the risk for subsequent dementia in participants with depressive symptoms. Participants with a lifetime history of major depression without subsequent dementia showed no cognitive deficits. However, in adjusted logistic regression models, learning and orientation deficits predicted the risk for subsequent dementia also in participants with lifetime major depression. Marked cognitive impairments in old age depression should not be dismissed as "depressive pseudodementia", but require clinical attention as a possible sign of incipient dementia. Non-depressed elderly with a lifetime history of major depression, who remained free of dementia during follow-up, had largely normal cognitive performance.
Guo, L W; Liu, S Z; Zhang, M; Chen, Q; Zhang, S K; Sun, X B
2017-12-10
Objective: To investigate the effect of fried food intake on the pathogenesis of esophageal cancer and precancerous lesions. Methods: From 2005 to 2013, all residents aged 40-69 years from 11 counties (cities) in rural areas of Henan province where screening for upper gastrointestinal cancer had been conducted were recruited as study subjects. Information on demography and lifestyle was collected. The residents under study were screened with iodine staining endoscopic examination, and biopsy samples were diagnosed pathologically under standardized criteria. Subjects at high risk were divided into groups according to pathological grade. Multivariate ordinal logistic regression analysis was used to analyze the relationship between the frequency of fried food intake and esophageal cancer and precancerous lesions. Results: A total of 8 792 cases with normal esophagus, 3 680 with mild hyperplasia, 972 with moderate hyperplasia, 413 with severe hyperplasia carcinoma in situ, and 336 cases of esophageal cancer were recruited. Results from multivariate logistic regression analysis showed that, compared with those who did not eat fried food, the intake of fried food (<2 times/week: OR =1.60, 95% CI : 1.40-1.83; ≥2 times/week: OR =2.58, 95% CI : 1.98-3.37) appeared to be a risk factor for both esophageal cancer and precancerous lesions after adjustment for age, sex, marital status, educational level, body mass index, smoking and alcohol intake. Conclusion: The intake of fried food appeared to be a risk factor for both esophageal cancer and precancerous lesions.
Iturriaga, H; Hirsch, S; Bunout, D; Díaz, M; Kelly, M; Silva, G; de la Maza, M P; Petermann, M; Ugarte, G
1993-04-01
Looking for a noninvasive method to predict liver histologic alterations in alcoholic patients without clinical signs of liver failure, we studied 187 recently abstinent chronic alcoholics, divided into 2 series. In the model series (n = 94) several clinical variables and results of common laboratory tests were compared with the findings of liver biopsies. These were classified into 3 groups: 1. Normal liver; 2. Moderate alterations; 3. Marked alterations, including alcoholic hepatitis and cirrhosis. The multivariate methods used were logistic regression analysis and a classification and regression tree (CART). Both methods entered gamma-glutamyltransferase (GGT), aspartate-aminotransferase (AST), weight and age as significant and independent variables. Univariate analyses with GGT and AST at different cutoffs were also performed. To predict the presence of any kind of damage (Groups 2 and 3), CART and AST > 30 IU showed the highest sensitivity, specificity and correct prediction, both in the model and validation series. For prediction of marked liver damage, a score based on logistic regression and GGT > 110 IU had the highest efficiencies. It is concluded that GGT and AST are good markers of alcoholic liver damage and that, using sample cutoffs, histologic diagnosis can be correctly predicted in 80% of recently abstinent asymptomatic alcoholics.
Differentially private distributed logistic regression using private and public data
2014-01-01
Background: Privacy protection is an important issue in medical informatics, and differential privacy is a state-of-the-art framework for data privacy research. Differential privacy offers provable privacy against attackers who have auxiliary information, and can be applied to data mining models (for example, logistic regression). However, differentially private methods sometimes introduce too much noise and make outputs less useful. Given available public data in medical research (e.g. from patients who sign open-consent agreements), we can design algorithms that use both public and private data sets to decrease the amount of noise that is introduced. Methodology: In this paper, we modify the update step in the Newton-Raphson method to propose a differentially private distributed logistic regression model based on both public and private data. Experiments and results: We try our algorithm on three different data sets, and show its advantage over: (1) a logistic regression model based solely on public data, and (2) a differentially private distributed logistic regression model based on private data under various scenarios. Conclusion: Logistic regression models built with our new algorithm based on both private and public datasets demonstrate better utility than models trained on private or public datasets alone, without sacrificing the rigorous privacy guarantee. PMID:25079786
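For orientation, the sketch below shows the plain, non-private Newton-Raphson iteration for a logistic regression with an intercept and one predictor, which is the kind of update step the abstract says is modified. The abstract does not describe the privacy mechanism or how public and private data are combined, so neither is reproduced here; the function name and the toy data are hypothetical.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def newton_logistic(xs, ys, iterations=25):
    """Fit logit P(y=1) = b0 + b1*x by Newton-Raphson (no privacy noise)."""
    b0, b1 = 0.0, 0.0
    for _ in range(iterations):
        g0 = g1 = 0.0          # score vector (gradient of log-likelihood)
        h00 = h01 = h11 = 0.0  # information matrix X'WX (negative Hessian)
        for x, y in zip(xs, ys):
            p = sigmoid(b0 + b1 * x)
            residual = y - p
            weight = p * (1.0 - p)
            g0 += residual
            g1 += residual * x
            h00 += weight
            h01 += weight * x
            h11 += weight * x * x
        det = h00 * h11 - h01 * h01
        # Newton step: solve the 2x2 system (X'WX) * delta = score.
        b0 += (h11 * g0 - h01 * g1) / det
        b1 += (h00 * g1 - h01 * g0) / det
    return b0, b1


# Hypothetical, non-separable toy data.
XS = [0, 0, 1, 1, 2, 2, 3, 3]
YS = [0, 0, 0, 1, 0, 1, 1, 1]
B0, B1 = newton_logistic(XS, YS)
```

A differentially private variant would perturb quantities inside this loop; where and how much noise is added is exactly what the paper's method specifies and is left out of this sketch.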
Park, Ji Hyun; Kim, Hyeon-Young; Lee, Hanna; Yun, Eun Kyoung
2015-12-01
This study compares the performance of the logistic regression and decision tree analysis methods for assessing the risk factors for infection in cancer patients undergoing chemotherapy. The subjects were 732 cancer patients who were receiving chemotherapy at K university hospital in Seoul, Korea. The data were collected between March 2011 and February 2013 and were processed for descriptive analysis, logistic regression and decision tree analysis using the IBM SPSS Statistics 19 and Modeler 15.1 programs. The most common risk factors for infection in cancer patients receiving chemotherapy were identified as alkylating agents, vinca alkaloid and underlying diabetes mellitus. The logistic regression explained 66.7% of the variation in the data in terms of sensitivity and 88.9% in terms of specificity. The decision tree analysis accounted for 55.0% of the variation in the data in terms of sensitivity and 89.0% in terms of specificity. As for the overall classification accuracy, the logistic regression explained 88.0% and the decision tree analysis explained 87.2%. The logistic regression analysis showed a higher degree of sensitivity and classification accuracy. Therefore, logistic regression analysis is concluded to be the more effective and useful method for establishing an infection prediction model for patients undergoing chemotherapy. Copyright © 2015 Elsevier Ltd. All rights reserved.
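The sensitivity, specificity, and overall accuracy figures compared above are proportions of correctly classified cases. A minimal sketch of how they are computed from a confusion matrix follows; the counts are hypothetical and are not the study's data.

```python
def classification_metrics(tp, fn, tn, fp):
    """Return (sensitivity, specificity, accuracy) from confusion-matrix counts.

    tp: infected patients predicted as infected
    fn: infected patients predicted as not infected
    tn: non-infected patients predicted as not infected
    fp: non-infected patients predicted as infected
    """
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    accuracy = (tp + tn) / (tp + fn + tn + fp)
    return sensitivity, specificity, accuracy


# Hypothetical counts, for illustration only.
SENSITIVITY, SPECIFICITY, ACCURACY = classification_metrics(
    tp=30, fn=10, tn=50, fp=10
)
```

When the positive class is rare, accuracy is dominated by specificity, so two models can have similar accuracy while differing noticeably in sensitivity.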
Yang, Lixue; Chen, Kean
2015-11-01
To improve the design of underwater target recognition systems based on auditory perception, this study compared human listeners with automatic classifiers. Performance measures and strategies in three discrimination experiments, including discriminations between man-made and natural targets, between ships and submarines, and among three types of ships, were used. In the experiments, the subjects were asked to assign a score to each sound based on how confident they were about the category to which it belonged, and logistic regression, which represents linear discriminative models, also completed three similar tasks by utilizing many auditory features. The results indicated that the performance of logistic regression improved as the ratio between inter- and intra-class differences became larger, whereas the performance of the human subjects was limited by their unfamiliarity with the targets. Logistic regression performed better than the human subjects in all tasks but the discrimination between man-made and natural targets, and the strategies employed by excellent human subjects were similar to those of logistic regression. Logistic regression and several human subjects demonstrated similar performance when discriminating man-made and natural targets, but in this case, their strategies were not similar. An appropriate fusion of their strategies led to further improvement in recognition accuracy.
Factors associated with young adults' knowledge regarding family history of stroke
Lima, Maria Jose Melo Ramos; Moreira, Thereza Maria Magalhães; Florêncio, Raquel Sampaio; Braga, Predro
2016-01-01
ABSTRACT Objective: to analyze the factors associated with young adults' knowledge regarding family history of stroke. Method: an analytical transversal study, with 579 young adults from state schools, with collection of sociodemographic, clinical and risk factor-related variables, analyzed using logistic regression (backward elimination). Results: a statistical association was detected between age, civil status, and classification of arterial blood pressure and abdominal circumference with knowledge of family history of stroke. In the final logistic regression model, a statistical association was observed between knowledge regarding family history of stroke and the civil status of having a partner (ORa=1.61[1.07-2.42]; p=0.023), abdominal circumference (ORa=0.98[0.96-0.99]; p=0.012) and normal arterial blood pressure (ORa=2.56[1.19-5.52]; p=0.016). Conclusion: an association was observed between socioeconomic factors and risk factors for stroke and knowledge of family history of stroke, suggesting the need for health education or even educational programs on this topic for the clientele in question. PMID:27878217
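The adjusted odds ratios above are reported with confidence intervals and p-values. As a rough consistency check, a reader can recover the log odds ratio, its standard error, and an approximate two-sided p-value from a reported interval. The sketch below assumes the interval is a 95% Wald interval that is symmetric on the log scale, which the abstract does not state; the function name is hypothetical.

```python
import math


def wald_from_or(or_point, ci_low, ci_high, z_crit=1.959964):
    """Recover log-OR, standard error, z and two-sided p from an OR and its CI.

    Assumes a Wald interval: log(OR) +/- z_crit * SE.
    """
    log_or = math.log(or_point)
    se = (math.log(ci_high) - math.log(ci_low)) / (2.0 * z_crit)
    z = log_or / se
    # Two-sided normal p-value: 2 * (1 - Phi(|z|)) = erfc(|z| / sqrt(2)).
    p_two_sided = math.erfc(abs(z) / math.sqrt(2.0))
    return log_or, se, z, p_two_sided


# Values reported in the abstract for having a partner: ORa = 1.61 [1.07-2.42].
LOG_OR, SE, Z, P = wald_from_or(1.61, 1.07, 2.42)
```

Under these assumptions the recovered p-value lands close to the reported p = 0.023; small differences are expected because the published figures are rounded.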
NASA Astrophysics Data System (ADS)
Mei, Zhixiong; Wu, Hao; Li, Shiyun
2018-06-01
The Conversion of Land Use and its Effects at Small regional extent (CLUE-S), which is a widely used model for land-use simulation, utilizes logistic regression to estimate the relationships between land use and its drivers, and thus, predict land-use change probabilities. However, logistic regression disregards possible spatial autocorrelation and self-organization in land-use data. Autologistic regression can depict spatial autocorrelation but cannot address self-organization, while logistic regression that considers only self-organization (NE-logistic regression) fails to capture spatial autocorrelation. Therefore, this study developed a regression method (NE-autologistic regression) that incorporates both spatial autocorrelation and self-organization to improve CLUE-S. The Zengcheng District of Guangzhou, China was selected as the study area. The land-use data of 2001, 2005, and 2009, as well as 10 typical driving factors, were used to validate the proposed regression method and the improved CLUE-S model. Then, three future land-use scenarios for 2020 (the natural growth scenario, the ecological protection scenario, and the economic development scenario) were simulated using the improved model. Validation results showed that NE-autologistic regression performed better than logistic regression, autologistic regression, and NE-logistic regression in predicting land-use change probabilities. The spatial allocation accuracy and kappa values of NE-autologistic-CLUE-S were higher than those of logistic-CLUE-S, autologistic-CLUE-S, and NE-logistic-CLUE-S for the simulations of two periods, 2001-2009 and 2005-2009, which indicated that the improved CLUE-S model achieved the best simulation and was thereby effective to a certain extent. The scenario simulation results indicated that under all three scenarios, traffic land and residential/industrial land would increase, whereas arable land and unused land would decrease during 2009-2020.
Apparent differences also existed in the simulated change sizes and locations of each land-use type under different scenarios. The results not only demonstrate the validity of the improved model but also provide a valuable reference for relevant policy-makers.
Unitary Response Regression Models
ERIC Educational Resources Information Center
Lipovetsky, S.
2007-01-01
The dependent variable in a regular linear regression is a numerical variable, and in a logistic regression it is a binary or categorical variable. In these models the dependent variable has varying values. However, there are problems yielding an identity output of a constant value which can also be modelled in a linear or logistic regression with…
Binary logistic regression-Instrument for assessing museum indoor air impact on exhibits.
Bucur, Elena; Danet, Andrei Florin; Lehr, Carol Blaziu; Lehr, Elena; Nita-Lazar, Mihai
2017-04-01
This paper presents a new way to assess the environmental impact on historical artifacts using binary logistic regression. The prediction of the impact on the exhibits during certain pollution scenarios (environmental impact) was calculated by a mathematical model based on binary logistic regression; it allows the identification, from a multitude of possible environmental parameters, of those with a significant impact on exhibits and ranks them according to the severity of their effect. Air quality (NO 2 , SO 2 , O 3 and PM 2.5 ) and microclimate parameters (temperature, humidity) monitoring data from a case study conducted within exhibition and storage spaces of the Romanian National Aviation Museum Bucharest have been used for developing and validating the binary logistic regression method and the mathematical model. The logistic regression analysis was applied to 794 data combinations (715 to develop the model and 79 to validate it) using the Statistical Package for Social Sciences (SPSS 20.0). The results from the binary logistic regression analysis demonstrated that, of the six parameters taken into consideration, four have a significant effect upon exhibits in the following order: O 3 >PM 2.5 >NO 2 >humidity, followed at a significant distance by the effects of SO 2 and temperature. The mathematical model developed in this study correctly predicted 95.1 % of the cumulated effect of the environmental parameters upon the exhibits. Moreover, this model could also be used in the decisional process regarding the preventive preservation measures that should be implemented within the exhibition space. The paper presents a new way to assess the environmental impact on historical artifacts using binary logistic regression.
The mathematical model developed on the environmental parameters analyzed by the binary logistic regression method could be useful in a decision-making process establishing the best measures for pollution reduction and preventive preservation of exhibits.
Determining factors influencing survival of breast cancer by fuzzy logistic regression model.
Nikbakht, Roya; Bahrampour, Abbas
2017-01-01
A fuzzy logistic regression model can be used for determining influential factors of disease. This study explores the important predictive factors for survival of patients with breast cancer. We used breast cancer data collected by the cancer registry of Kerman University of Medical Sciences during the period 2000-2007. Variables such as morphology, grade, age, and treatments (surgery, radiotherapy, and chemotherapy) were applied in the fuzzy logistic regression model. Performance of the model was determined in terms of mean degree of membership (MDM). The study results showed that almost 41% of patients were in neoplasm and malignant group and more than two-thirds of them were still alive after 5-year follow-up. Based on the fuzzy logistic model, the most important factors influencing survival were chemotherapy, morphology, and radiotherapy, respectively. Furthermore, the MDM criterion shows that the fuzzy logistic regression fits the data well (MDM = 0.86). The fuzzy logistic regression model showed that chemotherapy is more important than radiotherapy in the survival of patients with breast cancer. In addition, this model can calculate possibilistic odds of survival in cancer patients. The results of this study can be applied in clinical research. Few studies have applied fuzzy logistic models, and we recommend using this model in various research areas.
Excess adiposity, inflammation, and iron-deficiency in female adolescents.
Tussing-Humphreys, Lisa M; Liang, Huifang; Nemeth, Elizabeta; Freels, Sally; Braunschweig, Carol A
2009-02-01
Iron deficiency is more prevalent in overweight children and adolescents but the mechanisms that underlie this condition remain unclear. The purpose of this cross-sectional study was to assess the relationship between iron status and excess adiposity, inflammation, menarche, diet, physical activity, and poverty status in female adolescents included in the National Health and Nutrition Examination Survey 2003-2004 dataset. Descriptive and simple comparative statistics (t test, χ²) were used to assess differences between normal-weight (5th ≤ body mass index [BMI] percentile < 85th) and heavier-weight girls (≥ 85th percentile for BMI) for demographic, biochemical, dietary, and physical activity variables. In addition, logistic regression analyses predicting iron deficiency and linear regression predicting serum iron levels were performed. Heavier-weight girls had an increased prevalence of iron deficiency compared to those with normal weight. Dietary iron, age of and time since first menarche, poverty status, and physical activity were similar between the two groups and were not independent predictors of iron deficiency or log serum iron levels. Logistic modeling predicting iron deficiency revealed that having a BMI ≥ 85th percentile, and each 1 mg/dL increase in C-reactive protein, more than doubled the odds of iron deficiency. The best-fit linear model to predict serum iron levels included both serum transferrin receptor and C-reactive protein following log-transformation for normalization of these variables. Findings indicate that heavier-weight female adolescents are at greater risk for iron deficiency and that inflammation stemming from excess adipose tissue contributes to this phenomenon. Food and nutrition professionals should consider elevated BMI as an additional risk factor for iron deficiency in female adolescents.
Mixed conditional logistic regression for habitat selection studies.
Duchesne, Thierry; Fortin, Daniel; Courbin, Nicolas
2010-05-01
1. Resource selection functions (RSFs) are becoming a dominant tool in habitat selection studies. RSF coefficients can be estimated with unconditional (standard) and conditional logistic regressions. While the advantage of mixed-effects models is recognized for standard logistic regression, mixed conditional logistic regression remains largely overlooked in ecological studies. 2. We demonstrate the significance of mixed conditional logistic regression for habitat selection studies. First, we use spatially explicit models to illustrate how mixed-effects RSFs can be useful in the presence of inter-individual heterogeneity in selection and when the assumption of independence from irrelevant alternatives (IIA) is violated. The IIA hypothesis states that the strength of preference for habitat type A over habitat type B does not depend on the other habitat types also available. Secondly, we demonstrate the significance of mixed-effects models to evaluate habitat selection of free-ranging bison Bison bison. 3. When movement rules were homogeneous among individuals and the IIA assumption was respected, fixed-effects RSFs adequately described habitat selection by simulated animals. In situations violating the inter-individual homogeneity and IIA assumptions, however, RSFs were best estimated with mixed-effects regressions, and fixed-effects models could even provide faulty conclusions. 4. Mixed-effects models indicate that bison did not select farmlands, but exhibited strong inter-individual variations in their response to farmlands. Less than half of the bison preferred farmlands over forests. Conversely, the fixed-effect model simply suggested an overall selection for farmlands. 5. Conditional logistic regression is recognized as a powerful approach to evaluate habitat selection when resource availability changes. This regression is increasingly used in ecological studies, but almost exclusively in the context of fixed-effects models. 
Fitness maximization can imply differences in trade-offs among individuals, which can yield inter-individual differences in selection and lead to departure from IIA. These situations are best modelled with mixed-effects models. Mixed-effects conditional logistic regression should become a valuable tool for ecological research.
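To make the IIA property concrete, the sketch below computes fixed-effects conditional logit choice probabilities over a set of available habitat types and checks that the preference ratio between two habitats is unchanged when a third becomes available. The habitat scores and the "meadow" alternative are hypothetical. The mixed-effects extension with individual-level random coefficients, which is the paper's subject, is not reproduced.

```python
import math


def choice_probabilities(scores):
    """Conditional logit probabilities over the available alternatives.

    scores maps each habitat type to its linear predictor (x . beta).
    """
    highest = max(scores.values())
    # Subtracting the maximum keeps exp() numerically stable.
    weights = {k: math.exp(v - highest) for k, v in scores.items()}
    total = sum(weights.values())
    return {k: w / total for k, w in weights.items()}


# Hypothetical selection scores, for illustration only.
two_habitats = choice_probabilities({"forest": 0.2, "farmland": 0.9})
three_habitats = choice_probabilities(
    {"forest": 0.2, "farmland": 0.9, "meadow": 1.5}
)

# Under IIA, farmland:forest preference is the same in both choice sets.
ratio_two = two_habitats["farmland"] / two_habitats["forest"]
ratio_three = three_habitats["farmland"] / three_habitats["forest"]
```

With a single shared coefficient vector the ratio is always exp(score difference). Letting coefficients vary among individuals is what allows the population-level pattern to depart from IIA.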
Advanced colorectal neoplasia risk stratification by penalized logistic regression.
Lin, Yunzhi; Yu, Menggang; Wang, Sijian; Chappell, Richard; Imperiale, Thomas F
2016-08-01
Colorectal cancer is the second leading cause of death from cancer in the United States. To improve the efficiency of colorectal cancer screening, there is a need to stratify risk for colorectal cancer among the 90% of US residents who are considered "average risk." In this article, we investigate such risk stratification rules for advanced colorectal neoplasia (colorectal cancer and advanced, precancerous polyps). We use a recently completed large cohort study of subjects who underwent a first screening colonoscopy. Logistic regression models have been used in the literature to estimate the risk of advanced colorectal neoplasia based on quantifiable risk factors. However, logistic regression may be prone to overfitting and instability in variable selection. Since most of the risk factors in our study have several categories, it was tempting to collapse these categories into fewer risk groups. We propose a penalized logistic regression method that automatically and simultaneously selects variables, groups categories, and estimates their coefficients by penalizing the [Formula: see text]-norm of both the coefficients and their differences. Hence, it encourages sparsity in the categories, i.e. grouping of the categories, and sparsity in the variables, i.e. variable selection. We apply the penalized logistic regression method to our data. The important variables are selected, with close categories simultaneously grouped, by penalized regression models with and without the interaction terms. The models are validated with 10-fold cross-validation. The receiver operating characteristic curves of the penalized regression models dominate the receiver operating characteristic curve of naive logistic regressions, indicating a superior discriminative performance. © The Author(s) 2013.
Rupert, Michael G.; Cannon, Susan H.; Gartner, Joseph E.
2003-01-01
Logistic regression was used to predict the probability of debris flows occurring in areas recently burned by wildland fires. Multiple logistic regression is conceptually similar to multiple linear regression because statistical relations between one dependent variable and several independent variables are evaluated. In logistic regression, however, the dependent variable is transformed to a binary variable (debris flow did or did not occur), and the actual probability of the debris flow occurring is statistically modeled. Data from 399 basins located within 15 wildland fires that burned during 2000-2002 in Colorado, Idaho, Montana, and New Mexico were evaluated. More than 35 independent variables describing the burn severity, geology, land surface gradient, rainfall, and soil properties were evaluated. The models were developed as follows: (1) Basins that did and did not produce debris flows were delineated from National Elevation Data using a Geographic Information System (GIS). (2) Data describing the burn severity, geology, land surface gradient, rainfall, and soil properties were determined for each basin. These data were then downloaded to a statistics software package for analysis using logistic regression. (3) Relations between the occurrence/non-occurrence of debris flows and burn severity, geology, land surface gradient, rainfall, and soil properties were evaluated and several preliminary multivariate logistic regression models were constructed. All possible combinations of independent variables were evaluated to determine which combination produced the most effective model. The multivariate model that best predicted the occurrence of debris flows was selected. (4) The multivariate logistic regression model was entered into a GIS, and a map showing the probability of debris flows was constructed. 
The most effective model incorporates the percentage of each basin with slope greater than 30 percent, percentage of land burned at medium and high burn severity in each basin, particle size sorting, average storm intensity (millimeters per hour), soil organic matter content, soil permeability, and soil drainage. The results of this study demonstrate that logistic regression is a valuable tool for predicting the probability of debris flows occurring in recently-burned landscapes.
Ebrahimzadeh, Farzad; Hajizadeh, Ebrahim; Vahabi, Nasim; Almasian, Mohammad; Bakhteyar, Katayoon
2015-01-01
Background: Unwanted pregnancy not intended by at least one of the parents has undesirable consequences for the family and the society. In the present study, three classification models were used and compared to predict unwanted pregnancies in an urban population. Methods: In this cross-sectional study, 887 pregnant mothers referring to health centers in Khorramabad, Iran, in 2012 were selected by the stratified and cluster sampling; relevant variables were measured and for prediction of unwanted pregnancy, logistic regression, discriminant analysis, and probit regression models and SPSS software version 21 were used. To compare these models, indicators such as sensitivity, specificity, the area under the ROC curve, and the percentage of correct predictions were used. Results: The prevalence of unwanted pregnancies was 25.3%. The logistic and probit regression models indicated that parity and pregnancy spacing, contraceptive methods, household income and number of living male children were related to unwanted pregnancy. The performance of the models based on the area under the ROC curve was 0.735, 0.733, and 0.680 for logistic regression, probit regression, and linear discriminant analysis, respectively. Conclusion: Given the relatively high prevalence of unwanted pregnancies in Khorramabad, it seems necessary to revise family planning programs. Despite the similar accuracy of the models, if the researcher is interested in the interpretability of the results, the use of the logistic regression model is recommended. PMID:26793655
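The area under the ROC curve used to compare the three models equals the probability that a randomly chosen positive case receives a higher predicted score than a randomly chosen negative case, with ties counted as one half. A minimal sketch of that pairwise computation follows; the scores and labels are hypothetical, and the quadratic loop is meant for clarity rather than speed.

```python
def auc(scores, labels):
    """Area under the ROC curve via pairwise comparison (ties count 0.5)."""
    positives = [s for s, y in zip(scores, labels) if y == 1]
    negatives = [s for s, y in zip(scores, labels) if y == 0]
    if not positives or not negatives:
        raise ValueError("need at least one positive and one negative case")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


# Hypothetical predicted probabilities and true outcomes.
SCORES = [0.1, 0.4, 0.35, 0.8]
LABELS = [0, 0, 1, 1]
EXAMPLE_AUC = auc(SCORES, LABELS)
```

Because the AUC depends only on how cases are ranked, logistic and probit models fitted to the same predictors usually give nearly identical values, in line with the 0.735 and 0.733 reported above.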
Kunimatsu-Sanuki, Shiho; Iwase, Aiko; Araie, Makoto; Aoki, Yuki; Hara, Takeshi; Fukuchi, Takeo; Udagawa, Sachiko; Ohkubo, Shinji; Sugiyama, Kazuhisa; Matsumoto, Chota; Nakazawa, Toru; Yamaguchi, Takuhiro; Ono, Hiroshi
2017-01-01
Background/aims: To assess the role of specific visual subfields in collisions with oncoming cars during simulated driving in patients with advanced glaucoma. Methods: Normal subjects and patients with glaucoma with mean deviation <–12 dB in both eyes (Humphrey Field Analyzer 24-2 SITA-S program) used a driving simulator (DS; Honda Motor, Tokyo). Two scenarios in which oncoming cars turned right crossing the driver's path were chosen. We compared the binocular integrated visual field (IVF) in the patients who were involved in collisions and those who were not. We performed a multivariate logistic regression analysis; the dependent parameter was collision involvement, and the independent parameters were age, visual acuity and mean sensitivity of the IVF subfields. Results: The study included 43 normal subjects and 100 patients with advanced glaucoma. Five of the 100 patients with advanced glaucoma experienced simulator sickness during the main test and were thus excluded. In total, 95 patients with advanced glaucoma and 43 normal subjects completed the main test of DS. Patients with advanced glaucoma had significantly more collisions than normal subjects in one or both DS scenarios (p<0.001). The patients with advanced glaucoma who were involved in collisions were older (p=0.050), had worse visual acuity in the better eye (p<0.001), and had lower mean IVF sensitivity in the inferior hemifield, both 0°–12° and 13°–24°, in comparison with those who were not involved in collisions (p=0.012 and p=0.034). A logistic regression analysis revealed that collision involvement was significantly associated with decreased inferior IVF mean sensitivity from 13° to 24° (p=0.041), in addition to older age and lower visual acuity (p=0.018 and p<0.001). Conclusions: Our data suggest that the inferior hemifield was associated with the incidence of motor vehicle collisions with oncoming cars in patients with advanced glaucoma. PMID:28400370
Kempe, P T; van Oppen, P; de Haan, E; Twisk, J W R; Sluis, A; Smit, J H; van Dyck, R; van Balkom, A J L M
2007-09-01
Two methods for predicting remissions in obsessive-compulsive disorder (OCD) treatment are evaluated. Y-BOCS measurements of 88 patients with a primary OCD (DSM-III-R) diagnosis were performed over a 16-week treatment period, and during three follow-ups. Remission at any measurement was defined as a Y-BOCS score lower than thirteen combined with a reduction of seven points when compared with baseline. Logistic regression models were compared with a Cox regression for recurrent events model. Logistic regression yielded different models at different evaluation times. The recurrent events model remained stable when fewer measurements were used. Higher baseline levels of neuroticism and more severe OCD symptoms were associated with a lower chance of remission, early age of onset and more depressive symptoms with a higher chance. Choice of outcome time affects logistic regression prediction models. Recurrent events analysis uses all information on remissions and relapses. Short- and long-term predictors for OCD remission show overlap.
Estimating the exceedance probability of rain rate by logistic regression
NASA Technical Reports Server (NTRS)
Chiu, Long S.; Kedem, Benjamin
1990-01-01
Recent studies have shown that the fraction of an area with rain intensity above a fixed threshold is highly correlated with the area-averaged rain rate. To estimate the fractional rainy area, a logistic regression model, which estimates the conditional probability that rain rate over an area exceeds a fixed threshold given the values of related covariates, is developed. The problem of dependency in the data in the estimation procedure is bypassed by the method of partial likelihood. Analyses of simulated scanning multichannel microwave radiometer and observed electrically scanning microwave radiometer data during the Global Atlantic Tropical Experiment period show that the use of logistic regression in pixel classification is superior to multiple regression in predicting whether rain rate at each pixel exceeds a given threshold, even in the presence of noisy data. The potential of the logistic regression technique in satellite rain rate estimation is discussed.
NASA Astrophysics Data System (ADS)
Cary, Theodore W.; Cwanger, Alyssa; Venkatesh, Santosh S.; Conant, Emily F.; Sehgal, Chandra M.
2012-03-01
This study compares the performance of two proven but very different machine learners, Naïve Bayes and logistic regression, for differentiating malignant and benign breast masses using ultrasound imaging. Ultrasound images of 266 masses were analyzed quantitatively for shape, echogenicity, margin characteristics, and texture features. These features along with patient age, race, and mammographic BI-RADS category were used to train Naïve Bayes and logistic regression classifiers to diagnose lesions as malignant or benign. ROC analysis was performed using all of the features and using only a subset that maximized information gain. Performance was determined by the area under the ROC curve, Az, obtained from leave-one-out cross validation. Naïve Bayes showed significant variation (Az 0.733 +/- 0.035 to 0.840 +/- 0.029, P < 0.002) with the choice of features, but the performance of logistic regression was relatively unchanged under feature selection (Az 0.839 +/- 0.029 to 0.859 +/- 0.028, P = 0.605). Out of 34 features, a subset of 6 gave the highest information gain: brightness difference, margin sharpness, depth-to-width, mammographic BI-RADS, age, and race. The probabilities of malignancy determined by Naïve Bayes and logistic regression after feature selection showed significant correlation (R² = 0.87, P < 0.0001). The diagnostic performance of Naïve Bayes and logistic regression can be comparable, but logistic regression is more robust. Since probability of malignancy cannot be measured directly, high correlation between the probabilities derived from two basic but dissimilar models increases confidence in the predictive power of machine learning models for characterizing solid breast masses on ultrasound.
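The area under the ROC curve used above can be read as the probability that a randomly chosen malignant mass receives a higher score than a randomly chosen benign one. The following is a minimal sketch of that rank-based computation; the function name `auc` and the scores are hypothetical illustrations, not data or code from the study.

```python
def auc(scores_pos, scores_neg):
    """Rank-based AUC: fraction of (positive, negative) pairs in which the
    positive case scores higher; ties count as half a win."""
    wins = 0.0
    for p in scores_pos:
        for n in scores_neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(scores_pos) * len(scores_neg))


# Hypothetical predicted probabilities of malignancy (illustration only).
malignant = [0.9, 0.8, 0.35]
benign = [0.7, 0.3, 0.2, 0.1]
print(auc(malignant, benign))  # 11 of the 12 pairs are ordered correctly
```

The pairwise loop is quadratic in the number of cases, which is fine for a few hundred masses; a sort-based rank computation would be the usual choice for larger data.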
Wang, Qingliang; Li, Xiaojie; Hu, Kunpeng; Zhao, Kun; Yang, Peisheng; Liu, Bo
2015-05-12
To explore the risk factors of portal hypertensive gastropathy (PHG) in patients with hepatitis B associated cirrhosis and establish a logistic regression model for noninvasive prediction. The clinical data of 234 hospitalized patients with hepatitis B associated cirrhosis from March 2012 to March 2014 were analyzed retrospectively. The dependent variable was the occurrence of PHG, while the independent variables were screened by binary logistic analysis. Multivariate logistic regression was used for further analysis of the significant noninvasive independent variables. A logistic regression model was established and the odds ratio was calculated for each factor. The accuracy, sensitivity and specificity of the model were evaluated by the receiver operating characteristic (ROC) curve. According to univariate logistic regression, the risk factors included hepatic dysfunction, albumin (ALB), bilirubin (TB), prothrombin time (PT), platelet (PLT), white blood cell (WBC), portal vein diameter, spleen index, splenic vein diameter, diameter ratio, PLT to spleen volume ratio, esophageal varices (EV) and gastric varices (GV). Multivariate analysis showed that hepatic dysfunction (X1), TB (X2), PLT (X3) and splenic vein diameter (X4) were the major factors associated with the occurrence of PHG. The established regression model was Logit P = -2.667 + 2.186X1 - 2.167X2 + 0.725X3 + 0.976X4. The accuracy of the model for PHG was 79.1%, with a sensitivity of 77.2% and a specificity of 80.8%. Hepatic dysfunction, TB, PLT and splenic vein diameter are risk factors for PHG, and the noninvasive predictive logistic regression model was Logit P = -2.667 + 2.186X1 - 2.167X2 + 0.725X3 + 0.976X4.
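A reported logit equation such as the one above converts to a predicted probability through the inverse logit, P = 1 / (1 + exp(-Logit P)). The sketch below uses the coefficients exactly as printed in the abstract; how X1 to X4 are coded (units, categories, cut-offs) is not stated there, so the input values are purely hypothetical and the function name `phg_probability` is an assumption for illustration.

```python
import math

# Coefficients as reported in the abstract:
# Logit P = -2.667 + 2.186*X1 - 2.167*X2 + 0.725*X3 + 0.976*X4
INTERCEPT = -2.667
COEFS = (2.186, -2.167, 0.725, 0.976)


def phg_probability(x1, x2, x3, x4):
    """Inverse-logit of the linear predictor. The coding of X1..X4 is an
    assumption here; the abstract does not specify it."""
    logit = INTERCEPT + sum(b * x for b, x in zip(COEFS, (x1, x2, x3, x4)))
    return 1.0 / (1.0 + math.exp(-logit))


# Hypothetical inputs, only to show the mechanics of the calculation.
print(phg_probability(0, 0, 0, 0))
print(phg_probability(1, 0, 0, 0))
```

Each coefficient is a log odds ratio, so `math.exp(2.186)` would be the odds ratio attached to a one-unit increase in X1 under whatever coding the authors used.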
Dean, J A; Welsh, L C; Wong, K H; Aleksic, A; Dunne, E; Islam, M R; Patel, A; Patel, P; Petkar, I; Phillips, I; Sham, J; Schick, U; Newbold, K L; Bhide, S A; Harrington, K J; Nutting, C M; Gulliford, S L
2017-04-01
A normal tissue complication probability (NTCP) model of severe acute mucositis would be highly useful to guide clinical decision making and inform radiotherapy planning. We aimed to improve upon our previous model by using a novel oral mucosal surface organ at risk (OAR) in place of an oral cavity OAR. Predictive models of severe acute mucositis were generated using radiotherapy dose to the oral cavity OAR or mucosal surface OAR and clinical data. Penalised logistic regression and random forest classification (RFC) models were generated for both OARs and compared. Internal validation was carried out with 100-iteration stratified shuffle split cross-validation, using multiple metrics to assess different aspects of model performance. Associations between treatment covariates and severe mucositis were explored using RFC feature importance. Penalised logistic regression and RFC models using the oral cavity OAR performed at least as well as the models using mucosal surface OAR. Associations between dose metrics and severe mucositis were similar between the mucosal surface and oral cavity models. The volumes of oral cavity or mucosal surface receiving intermediate and high doses were most strongly associated with severe mucositis. The simpler oral cavity OAR should be preferred over the mucosal surface OAR for NTCP modelling of severe mucositis. We recommend minimising the volume of mucosa receiving intermediate and high doses, where possible. Copyright © 2016 The Royal College of Radiologists. Published by Elsevier Ltd. All rights reserved.
The status of diabetes control in Kurdistan province, west of Iran.
Esmailnasab, Nader; Afkhamzadeh, Abdorrahim; Roshani, Daem; Moradi, Ghobad
2013-09-17
According to some estimates, more than two million people in Iran are affected by Type 2 diabetes. The present study was designed to evaluate the status of diabetes control among Type 2 diabetes patients in Kurdistan, west of Iran, and its associated factors. In our cross-sectional study conducted in 2010, 411 Type 2 diabetes patients were randomly recruited from Sanandaj, the capital of Kurdistan. The chi-square test was used in univariate analysis to assess the association between HgA1c and FBS status and other variables. The significant results from the univariate analysis were entered into a multivariate analysis using a multinomial logistic regression model. In 38% of patients, FBS was in the normal range (70-130), and in 47% HgA1c was <7%, which is the normal range for HgA1c. In univariate analysis, FBS level was associated with educational level (P=0.001), referral style (P=0.001), referral time (P=0.009), and insulin injection (P=0.016). In addition, HgA1c was associated with sex (P=0.023), age (P=0.035), education (P=0.001), referral style (P=0.001), and insulin injection (P=0.008). After multinomial logistic regression was applied to the significant results of the univariate analysis, FBS was found to be significantly associated with referral style, and HgA1c was significantly associated with referral style and insulin injection. Although some patients were covered by specialized care, their diabetes was not properly controlled.
Biocultural Predictors of Motor Coordination Among Prepubertal Boys and Girls.
Luz, Leonardo G O; Valente-Dos-Santos, João; Luz, Tatiana D D; Sousa-E-Silva, Paulo; Duarte, João P; Machado-Rodrigues, Aristides; Seabra, André; Santos, Rute; Cumming, Sean P; Coelho-E-Silva, Manuel J
2018-02-01
This study aimed to predict motor coordination from a matrix of biocultural factors for 173 children (89 boys, 84 girls) aged 7-9 years who were assessed with the Körperkoordinationtest für Kinder test battery. Socioeconomic variables included built environment, area of residence, mother's educational level, and mother's physical activity level (using the International Physical Activity Questionnaire [short version]). The behavioral domain was marked by participation in organized sports and habitual physical activity measured by accelerometers (ActiGraph GT1M). Indicators of biological development included somatic maturation and body mass index. Among males, the best logistic regression model to explain motor coordination (Nagelkerke R² = 50.8; χ² = 41.166; p < .001) emerged from age-group (odds ratio [OR]: 0.007-0.065), late maturation (OR = 0.174), normal body weight status (OR = 0.116), mother's educational level (OR = 0.129), and urban area of residence (OR = 0.236). Among girls, the best logistic regression to explain motor coordination (Nagelkerke R² = 40.8; χ² = 29.933; p < .01) derived from age (OR: 0.091-0.384), normal body mass index (OR = 0.142), participation in organized sport (OR = 0.121), and mother's physical activity level (OR = 0.183). This sex-specific, ecological approach to motor coordination proficiency may help promote physical activity during prepubertal years through familiar determinants.
Early Menarche and Gestational Diabetes Mellitus at First Live Birth.
Shen, Yun; Hu, Hui; D Taylor, Brandie; Kan, Haidong; Xu, Xiaohui
2017-03-01
To examine the association between early menarche and gestational diabetes mellitus (GDM). Data from the National Health and Nutrition Examination Survey 2007-2012 were used to investigate the association between age at menarche and the risk of GDM at first birth among 5914 women. A growth mixture model was used to detect distinctive menarche onset patterns based on self-reported age at menarche. Logistic regression models were then used to examine the associations between menarche initiation patterns and GDM after adjusting for sociodemographic factors, family history of diabetes mellitus, lifetime greatest body mass index, smoking status, and physical activity level. Among the 5914 first-time mothers, 3.4% had self-reported GDM. We detected three groups with heterogeneous menarche onset patterns: the Early, Normal, and Late Menarche Groups. The regression model shows that, compared to the Normal Menarche Group, the Early Menarche Group had 1.75 (95% CI 1.10, 2.79) times the odds of having GDM. No statistically significant difference was observed between the Normal and the Late Menarche Groups. This study suggests that early menarche may be a risk factor for GDM. Future studies are warranted to examine and confirm this finding.
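An odds ratio with its confidence interval, such as the 1.75 (95% CI 1.10, 2.79) reported above, can be mapped back to the underlying logistic coefficient and an approximate standard error. This is a sketch that assumes the interval is a symmetric Wald interval on the log-odds scale with z = 1.96; the abstract does not say how the interval was computed (survey-weighted analyses may differ), and the function name is hypothetical.

```python
import math


def coef_from_or_ci(or_hat, lower, upper, z=1.96):
    """Recover the log-odds coefficient and an approximate standard error
    from an odds ratio and its CI, assuming a Wald interval."""
    beta = math.log(or_hat)
    se = (math.log(upper) - math.log(lower)) / (2.0 * z)
    return beta, se


beta, se = coef_from_or_ci(1.75, 1.10, 2.79)
print(round(beta, 3), round(se, 3))  # roughly 0.560 and 0.237
print(round(beta / se, 2))           # approximate Wald z statistic
```

A quick consistency check is that the geometric mean of the interval limits, sqrt(1.10 * 2.79), is close to the reported odds ratio, which is what a symmetric interval on the log scale implies.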
Variable Selection in Logistic Regression.
1987-06-01
Authors: Z. D. Bai, P. R. Krishnaiah and L. C. Zhao. Contract or grant number: F49620-85-C-0008. Performing organization: Center for Multivariate Analysis, University of Pittsburgh.
NASA Astrophysics Data System (ADS)
Madhu, B.; Ashok, N. C.; Balasubramanian, S.
2014-11-01
Multinomial logistic regression analysis was used to develop a statistical model that can predict the probability of breast cancer in Southern Karnataka using breast cancer occurrence data from 2007-2011. Independent socio-economic variables describing breast cancer occurrence, such as age, education, occupation, parity, type of family, health insurance coverage, residential locality and socioeconomic status, were obtained for each case. The models were developed as follows: i) Spatial visualization of the urban-rural distribution of breast cancer cases obtained from the Bharat Hospital and Institute of Oncology. ii) Socio-economic risk factors describing the breast cancer occurrences were compiled for each case. These data were then analysed using multinomial logistic regression in SPSS statistical software; relations between the occurrence of breast cancer across socio-economic status and the influence of other socio-economic variables were evaluated, and multinomial logistic regression models were constructed. iii) The model that best predicted the occurrence of breast cancer was identified. This multivariate logistic regression model was entered into a geographic information system, and maps showing the predicted probability of breast cancer occurrence in Southern Karnataka were created. This study demonstrates that multinomial logistic regression is a valuable tool for developing models that predict the probability of breast cancer occurrence in Southern Karnataka.
Parsaeian, M; Mohammad, K; Mahmoudi, M; Zeraati, H
2012-01-01
Background: The purpose of this investigation was to empirically compare the predictive ability of an artificial neural network with that of logistic regression in the prediction of low back pain. Methods: Data from the second national health survey were considered in this investigation. These data include information on low back pain and its associated risk factors among Iranian people aged 15 years and older. Artificial neural network and logistic regression models were developed using a set of 17294 observations and validated in a test set of 17295 observations. The Hosmer and Lemeshow recommendation for model selection was used in fitting the logistic regression. A three-layer perceptron with 9 input, 3 hidden and 1 output neurons was employed. The efficiency of the two models was compared by receiver operating characteristic analysis, root mean square and -2 Loglikelihood criteria. Results: The area under the ROC curve (SE), root mean square and -2Loglikelihood of the logistic regression were 0.752 (0.004), 0.3832 and 14769.2, respectively. The area under the ROC curve (SE), root mean square and -2Loglikelihood of the artificial neural network were 0.754 (0.004), 0.3770 and 14757.6, respectively. Conclusions: Based on these three criteria, the artificial neural network gives better performance than logistic regression. Although the difference is statistically significant, it does not seem to be clinically significant. PMID:23113198
NASA Astrophysics Data System (ADS)
Kamaruddin, Ainur Amira; Ali, Zalila; Noor, Norlida Mohd.; Baharum, Adam; Ahmad, Wan Muhamad Amir W.
2014-07-01
Logistic regression analysis examines the influence of various factors on a dichotomous outcome by estimating the probability of the event's occurrence. Logistic regression, also called a logit model, is a statistical procedure used to model dichotomous outcomes. In the logit model the log odds of the dichotomous outcome is modeled as a linear combination of the predictor variables. The log odds ratio in logistic regression provides a description of the probabilistic relationship of the variables and the outcome. In conducting logistic regression, selection procedures are used to select important predictor variables; diagnostics are used to check that the assumptions are valid, which include independence of errors, linearity in the logit for continuous variables, absence of multicollinearity, and lack of strongly influential outliers; and a test statistic is calculated to determine the aptness of the model. This study used the binary logistic regression model to investigate overweight and obesity among rural secondary school students on the basis of their demographic profile, medical history, diet and lifestyle. The results indicate that overweight and obesity of students are influenced by obesity in the family and by the interaction between a student's ethnicity and routine meal intake. The odds of a student being overweight and obese are higher for a student with a family history of obesity and for a non-Malay student who frequently takes routine meals as compared to a Malay student.
Understanding logistic regression analysis.
Sperandei, Sandro
2014-01-01
Logistic regression is used to obtain odds ratios in the presence of more than one explanatory variable. The procedure is quite similar to multiple linear regression, with the exception that the response variable is binomial. The result is the impact of each variable on the odds ratio of the observed event of interest. The main advantage is to avoid confounding effects by analyzing the association of all variables together. In this article, we explain the logistic regression procedure using examples to make it as simple as possible. After definition of the technique, the basic interpretation of the results is highlighted and then some special issues are discussed.
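The link between a fitted logistic coefficient and an odds ratio can be made concrete with a small sketch. It fits logit P(y=1) = b0 + b1*x by Newton-Raphson for a single binary predictor and checks that exp(b1) matches the cross-product odds ratio of the 2x2 table. The counts and the function name `fit_logistic` are hypothetical and chosen only for illustration; this is not code from the article.

```python
import math


def fit_logistic(xs, ys, iters=25):
    """Newton-Raphson for logit P(y=1) = b0 + b1*x with one predictor."""
    b0 = b1 = 0.0
    for _ in range(iters):
        g0 = g1 = h00 = h01 = h11 = 0.0
        for x, y in zip(xs, ys):
            p = 1.0 / (1.0 + math.exp(-(b0 + b1 * x)))
            w = p * (1.0 - p)
            g0 += y - p            # score for the intercept
            g1 += (y - p) * x      # score for the slope
            h00 += w               # entries of the information matrix X'WX
            h01 += w * x
            h11 += w * x * x
        det = h00 * h11 - h01 * h01
        b0 += (h11 * g0 - h01 * g1) / det
        b1 += (h00 * g1 - h01 * g0) / det
    return b0, b1


# Hypothetical 2x2 table: exposed 30 events / 20 non-events,
# unexposed 10 events / 40 non-events.
xs = [1] * 50 + [0] * 50
ys = [1] * 30 + [0] * 20 + [1] * 10 + [0] * 40
b0, b1 = fit_logistic(xs, ys)
print(math.exp(b1))           # odds ratio from the model
print((30 * 40) / (20 * 10))  # cross-product odds ratio from the table
```

With several explanatory variables the same exponentiation applies to each coefficient, and each odds ratio is then adjusted for the other variables in the model.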
ERIC Educational Resources Information Center
Koon, Sharon; Petscher, Yaacov
2015-01-01
The purpose of this report was to explicate the use of logistic regression and classification and regression tree (CART) analysis in the development of early warning systems. It was motivated by state education leaders' interest in maintaining high classification accuracy while simultaneously improving practitioner understanding of the rules by…
2017-03-23
Public release; distribution unlimited. Using Multiple and Logistic Regression to Estimate the Median Will-Cost and Probability of Cost and Schedule Overrun for Program Managers. Ryan C. Trudelle. https://scholar.afit.edu...
2013-11-01
P-trend 0.78, 0.62, 0.75. Unconditional logistic regression was used to estimate odds ratios (OR) and 95% confidence intervals (CI) for risk of node... P-trend 0.71, 0.67. Unconditional logistic regression was used to estimate odds ratios (OR) and 95% confidence intervals (CI) for risk of high-grade tumors... logistic regression was used to estimate odds ratios (OR) and 95% confidence intervals (CI) for the associations between each of the seven SNPs and
Kim, Sun Mi; Kim, Yongdai; Jeong, Kuhwan; Jeong, Heeyeong; Kim, Jiyoung
2018-01-01
The aim of this study was to compare the performance of image analysis for predicting breast cancer using two distinct regression models and to evaluate the usefulness of incorporating clinical and demographic data (CDD) into the image analysis in order to improve the diagnosis of breast cancer. This study included 139 solid masses from 139 patients who underwent an ultrasonography-guided core biopsy and had available CDD between June 2009 and April 2010. Three breast radiologists retrospectively reviewed 139 breast masses and described each lesion using the Breast Imaging Reporting and Data System (BI-RADS) lexicon. We applied and compared two regression methods, stepwise logistic (SL) regression and logistic least absolute shrinkage and selection operator (LASSO) regression, in which the BI-RADS descriptors and CDD were used as covariates. We investigated the performances of these regression methods and the agreement of radiologists in terms of test misclassification error and the area under the curve (AUC) of the tests. Logistic LASSO regression was superior (P<0.05) to SL regression, regardless of whether CDD was included in the covariates, in terms of test misclassification errors (0.234 vs. 0.253, without CDD; 0.196 vs. 0.258, with CDD) and AUC (0.785 vs. 0.759, without CDD; 0.873 vs. 0.735, with CDD). However, it was inferior (P<0.05) to the agreement of three radiologists in terms of test misclassification errors (0.234 vs. 0.168, without CDD; 0.196 vs. 0.088, with CDD) and the AUC without CDD (0.785 vs. 0.844, P<0.001), but was comparable to the AUC with CDD (0.873 vs. 0.880, P=0.141). Logistic LASSO regression based on BI-RADS descriptors and CDD showed better performance than SL in predicting the presence of breast cancer. The use of CDD as a supplement to the BI-RADS descriptors significantly improved the prediction of breast cancer using logistic LASSO regression.
Yu, Yuanyuan; Li, Hongkai; Sun, Xiaoru; Su, Ping; Wang, Tingting; Liu, Yi; Yuan, Zhongshang; Liu, Yanxun; Xue, Fuzhong
2017-12-28
Confounders can produce spurious associations between exposure and outcome in observational studies. For the majority of epidemiologists, adjusting for confounders using a logistic regression model is the habitual method, though it has some problems in accuracy and precision. It is, therefore, important to highlight the problems of logistic regression and to search for an alternative method. Four causal diagram models were defined to summarize confounding equivalence. Both theoretical proofs and simulation studies were performed to verify whether conditioning on different confounding equivalence sets had the same bias-reducing potential and then to select the optimum adjusting strategy, in which the logistic regression model and the inverse probability weighting based marginal structural model (IPW-based-MSM) were compared. The "do-calculus" was used to calculate the true causal effect of exposure on outcome, then the bias and standard error were used to evaluate the performances of the different strategies. Adjusting for different sets of confounding equivalence, as judged by identical Markov boundaries, produced different bias-reducing potential in the logistic regression model. For the sets that satisfied G-admissibility, adjusting for the set including all the confounders reduced bias equivalently to adjusting for the set containing the parent nodes of the outcome, while the bias after adjusting for the parent nodes of exposure was not equivalent to them. In addition, all causal effect estimations through logistic regression were biased, although the estimation after adjusting for the parent nodes of exposure was nearest to the true causal effect. However, conditioning on different confounding equivalence sets had the same bias-reducing potential under IPW-based-MSM.
Compared with logistic regression, the IPW-based-MSM could obtain unbiased causal effect estimation when the adjusted confounders satisfied G-admissibility, and the optimal strategy was to adjust for the parent nodes of outcome, which obtained the highest precision. All adjustment strategies through logistic regression were biased for causal effect estimation, while IPW-based-MSM could always obtain unbiased estimation when the adjusted set satisfied G-admissibility. Thus, IPW-based-MSM is recommended for adjusting for a confounder set.
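The idea behind inverse probability weighting can be illustrated with a deliberately small sketch. It uses a single binary confounder, a nonparametric propensity score, and a risk difference as the effect measure; these are simplifying assumptions for illustration and not the causal diagrams, marginal structural models, or simulations of the study. All counts and function names are hypothetical.

```python
# (confounder c, exposure a) -> (number of subjects, number of events)
# Hypothetical counts: within each confounder stratum, exposure raises
# the risk by exactly 0.10.
counts = {
    (0, 1): (20, 4),
    (0, 0): (80, 8),
    (1, 1): (80, 40),
    (1, 0): (20, 8),
}


def propensity(c):
    """P(A=1 | C=c), estimated from the counts."""
    n1 = counts[(c, 1)][0]
    n0 = counts[(c, 0)][0]
    return n1 / (n1 + n0)


def crude_risk(a):
    """Unadjusted risk among subjects with exposure level a."""
    n = sum(counts[(c, a)][0] for c in (0, 1))
    events = sum(counts[(c, a)][1] for c in (0, 1))
    return events / n


def ipw_risk(a):
    """Risk under exposure level a after weighting each subject by the
    inverse probability of the exposure actually received."""
    num = den = 0.0
    for c in (0, 1):
        n, events = counts[(c, a)]
        ps = propensity(c)
        w = 1.0 / ps if a == 1 else 1.0 / (1.0 - ps)
        num += w * events
        den += w * n
    return num / den


print(crude_risk(1) - crude_risk(0))  # confounded risk difference
print(ipw_risk(1) - ipw_risk(0))      # weighted risk difference
```

In this toy table the crude difference is inflated because exposed subjects are concentrated in the high-risk stratum, while the weighted difference recovers the stratum-specific difference of 0.10.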
Use and interpretation of logistic regression in habitat-selection studies
Keating, Kim A.; Cherry, Steve
2004-01-01
Logistic regression is an important tool for wildlife habitat-selection studies, but the method frequently has been misapplied due to an inadequate understanding of the logistic model, its interpretation, and the influence of sampling design. To promote better use of this method, we review its application and interpretation under 3 sampling designs: random, case-control, and use-availability. Logistic regression is appropriate for habitat use-nonuse studies employing random sampling and can be used to directly model the conditional probability of use in such cases. Logistic regression also is appropriate for studies employing case-control sampling designs, but careful attention is required to interpret results correctly. Unless bias can be estimated or probability of use is small for all habitats, results of case-control studies should be interpreted as odds ratios, rather than probability of use or relative probability of use. When data are gathered under a use-availability design, logistic regression can be used to estimate approximate odds ratios if probability of use is small, at least on average. More generally, however, logistic regression is inappropriate for modeling habitat selection in use-availability studies. In particular, using logistic regression to fit the exponential model of Manly et al. (2002:100) does not guarantee maximum-likelihood estimates, valid probabilities, or valid likelihoods. We show that the resource selection function (RSF) commonly used for the exponential model is proportional to a logistic discriminant function. Thus, it may be used to rank habitats with respect to probability of use and to identify important habitat characteristics or their surrogates, but it is not guaranteed to be proportional to probability of use. Other problems associated with the exponential model also are discussed. 
We describe an alternative model based on Lancaster and Imbens (1996) that offers a method for estimating conditional probability of use in use-availability studies. Although promising, this model fails to converge to a unique solution in some important situations. Further work is needed to obtain a robust method that is broadly applicable to use-availability studies.
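The caution about interpreting case-control results as odds ratios rather than probabilities follows from a standard result, sketched here as a worked equation. The notation (sampling fractions \pi_1 for used sites and \pi_0 for unused sites, assumed not to depend on the covariates x) is introduced for illustration and is not taken from the paper.

```latex
% Population model for probability of use:
\[
  \operatorname{logit} P(Y = 1 \mid x) = \beta_0 + \beta^{\top} x .
\]
% Sampling used sites with probability \pi_1 and unused sites with
% probability \pi_0 (both independent of x), Bayes' rule gives
\[
  P(Y = 1 \mid x, \text{sampled})
  = \frac{\pi_1\, P(Y = 1 \mid x)}
         {\pi_1\, P(Y = 1 \mid x) + \pi_0\, P(Y = 0 \mid x)} ,
\]
% and therefore
\[
  \operatorname{logit} P(Y = 1 \mid x, \text{sampled})
  = \beta_0 + \log\frac{\pi_1}{\pi_0} + \beta^{\top} x .
\]
```

Only the intercept is shifted, by the usually unknown quantity log(\pi_1/\pi_0), so the slopes and hence the odds ratios are preserved while fitted probabilities are not, unless the sampling fractions can be estimated.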
Modeling Governance KB with CATPCA to Overcome Multicollinearity in the Logistic Regression
NASA Astrophysics Data System (ADS)
Khikmah, L.; Wijayanto, H.; Syafitri, U. D.
2017-04-01
Multicollinearity is a problem often encountered in logistic regression modeling. Multicollinearity among explanatory variables biases the parameter estimates and also leads to classification errors. Stepwise regression is generally used to overcome multicollinearity. Another approach, which retains all variables for prediction, is principal component analysis (PCA). However, classical PCA applies only to numeric data. When the data are categorical, one way to solve the problem is categorical principal component analysis (CATPCA). The data used in this research were part of the Demographic and Population Survey Indonesia (IDHS) 2012. This research focuses on the characteristics of women using contraceptive methods. Classification results were evaluated using area under the curve (AUC) values; the higher the AUC value, the better. Based on AUC values, classification of contraceptive method use with the stepwise method (58.66%) is better than with the logistic regression model (57.39%) and CATPCA (57.39%). Evaluating the logistic regression results by sensitivity shows the opposite: the CATPCA method (99.79%) is better than the logistic regression method (92.43%) and the stepwise method (92.05%). Because this study focuses on classification of the major class (using a contraceptive method), the selected model is CATPCA, as it raises the accuracy for the major class.
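The contrast between a near-identical AUC and a very high sensitivity is easier to see with the definitions written out. The sketch below computes sensitivity, specificity and accuracy from a confusion matrix; the counts are hypothetical and are not the IDHS results, and the function name is an assumption for illustration.

```python
def classification_metrics(tp, fn, fp, tn):
    """Sensitivity, specificity and accuracy from confusion-matrix counts."""
    return {
        "sensitivity": tp / (tp + fn),   # share of actual positives detected
        "specificity": tn / (tn + fp),   # share of actual negatives detected
        "accuracy": (tp + tn) / (tp + fn + fp + tn),
    }


# Hypothetical counts for an imbalanced problem in which the classifier
# labels almost everything as the major (positive) class.
m = classification_metrics(tp=950, fn=50, fp=400, tn=100)
print(m)
```

A model can reach a sensitivity near 100% simply by assigning nearly every case to the major class, which is why sensitivity is usually read together with specificity or AUC.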
Söhn, Matthias; Alber, Markus; Yan, Di
2007-09-01
The variability of dose-volume histogram (DVH) shapes in a patient population can be quantified using principal component analysis (PCA). We applied this to rectal DVHs of prostate cancer patients and investigated the correlation of the PCA parameters with late bleeding. PCA was applied to the rectal wall DVHs of 262 patients, who had been treated with a four-field box, conformal adaptive radiotherapy technique. The correlated changes in the DVH pattern were revealed as "eigenmodes," which were ordered by their importance to represent data set variability. Each DVH is uniquely characterized by its principal components (PCs). The correlation of the first three PCs and chronic rectal bleeding of Grade 2 or greater was investigated with uni- and multivariate logistic regression analyses. Rectal wall DVHs in four-field conformal RT can primarily be represented by the first two or three PCs, which describe approximately 94% or 96% of the DVH shape variability, respectively. The first eigenmode models the total irradiated rectal volume; thus, PC1 correlates to the mean dose. Mode 2 describes the interpatient differences of the relative rectal volume in the two- or four-field overlap region. Mode 3 reveals correlations of volumes with intermediate doses (approximately 40-45 Gy) and volumes with doses >70 Gy; thus, PC3 is associated with the maximal dose. According to univariate logistic regression analysis, only PC2 correlated significantly with toxicity. However, multivariate logistic regression analysis with the first two or three PCs revealed an increased probability of bleeding for DVHs with more than one large PC. PCA can reveal the correlation structure of DVHs for a patient population as imposed by the treatment technique and provide information about its relationship to toxicity. It proves useful for augmenting normal tissue complication probability modeling approaches.
Logistic regression models of factors influencing the location of bioenergy and biofuels plants
T.M. Young; R.L. Zaretzki; J.H. Perdue; F.M. Guess; X. Liu
2011-01-01
Logistic regression models were developed to identify significant factors that influence the location of existing wood-using bioenergy/biofuels plants and traditional wood-using facilities. Logistic models provided quantitative insight for variables influencing the location of woody biomass-using facilities. Availability of "thinnings to a basal area of 31.7m2/ha...
Discrete post-processing of total cloud cover ensemble forecasts
NASA Astrophysics Data System (ADS)
Hemri, Stephan; Haiden, Thomas; Pappenberger, Florian
2017-04-01
This contribution presents an approach to post-process ensemble forecasts for the discrete and bounded weather variable of total cloud cover. Two methods for discrete statistical post-processing of ensemble predictions are tested. The first approach is based on multinomial logistic regression, the second involves a proportional odds logistic regression model. Applying them to total cloud cover raw ensemble forecasts from the European Centre for Medium-Range Weather Forecasts improves forecast skill significantly. Based on station-wise post-processing of raw ensemble total cloud cover forecasts for a global set of 3330 stations over the period from 2007 to early 2014, the more parsimonious proportional odds logistic regression model proved to slightly outperform the multinomial logistic regression model. Reference Hemri, S., Haiden, T., & Pappenberger, F. (2016). Discrete post-processing of total cloud cover ensemble forecasts. Monthly Weather Review 144, 2565-2577.
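As a rough illustration of the proportional odds model named above, the sketch below turns cumulative logits into category probabilities. The thresholds, the coefficient `beta`, and the single covariate `x` are invented for illustration and are not taken from the study; the function name `proportional_odds_probs` is likewise hypothetical.

```python
import math

def sigmoid(z):
    """Logistic function."""
    return 1.0 / (1.0 + math.exp(-z))

def proportional_odds_probs(thresholds, beta, x):
    """Category probabilities under a proportional odds model.

    Assumes P(Y <= k | x) = sigmoid(thresholds[k] - beta * x) with
    strictly increasing thresholds, so one slope is shared by all cut points.
    Returns len(thresholds) + 1 probabilities.
    """
    cumulative = [sigmoid(t - beta * x) for t in thresholds] + [1.0]
    probs = [cumulative[0]]
    for k in range(1, len(cumulative)):
        # Probability of category k is the difference of adjacent cumulative values.
        probs.append(cumulative[k] - cumulative[k - 1])
    return probs

# Hypothetical numbers: three thresholds give four ordered classes.
probs = proportional_odds_probs([-1.0, 0.0, 1.5], beta=0.8, x=0.5)
```

A multinomial logistic model would instead carry a separate coefficient per category, which is why the proportional odds form is the more parsimonious of the two.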
Fuzzy multinomial logistic regression analysis: A multi-objective programming approach
NASA Astrophysics Data System (ADS)
Abdalla, Hesham A.; El-Sayed, Amany A.; Hamed, Ramadan
2017-05-01
Parameter estimation for multinomial logistic regression is usually based on maximizing the likelihood function. For large well-balanced datasets, Maximum Likelihood (ML) estimation is a satisfactory approach. Unfortunately, ML can fail completely or at least produce poor results in terms of estimated probabilities and confidence intervals of parameters, especially for small datasets. In this study, a new approach based on fuzzy concepts is proposed to estimate parameters of the multinomial logistic regression. The study assumes that the parameters of multinomial logistic regression are fuzzy. Based on the extension principle stated by Zadeh and Bárdossy's proposition, a multi-objective programming approach is suggested to estimate these fuzzy parameters. A simulation study is used to evaluate the performance of the new approach versus the ML approach. Results show that the new proposed model outperforms ML in cases of small datasets.
A Primer on Logistic Regression.
ERIC Educational Resources Information Center
Woldbeck, Tanya
This paper introduces logistic regression as a viable alternative when the researcher is faced with variables that are not continuous. If one is to use simple regression, the dependent variable must be measured on a continuous scale. In the behavioral sciences, it may not always be appropriate or possible to have a measured dependent variable on a…
Min, Seung Nam; Park, Se Jin; Kim, Dong Joon; Subramaniyam, Murali; Lee, Kyung-Sun
2018-01-01
Stroke is the second leading cause of death worldwide and remains an important health burden both for the individuals and for the national healthcare systems. Potentially modifiable risk factors for stroke include hypertension, cardiac disease, diabetes, and dysregulation of glucose metabolism, atrial fibrillation, and lifestyle factors. We aimed to derive a model equation for developing a stroke pre-diagnosis algorithm with the potentially modifiable risk factors. We used logistic regression for model derivation, together with data from the database of the Korea National Health Insurance Service (NHIS). We reviewed the NHIS records of 500,000 enrollees. For the regression analysis, data regarding 367 stroke patients were selected. The control group consisted of 500 patients followed up for 2 consecutive years and with no history of stroke. We developed a logistic regression model based on information regarding several well-known modifiable risk factors. The developed model could correctly discriminate between normal subjects and stroke patients in 65% of cases. The model developed in the present study can be applied in the clinical setting to estimate the probability of stroke in a year and thus improve the stroke prevention strategies in high-risk patients. The approach used to develop the stroke prevention algorithm can be applied for developing similar models for the pre-diagnosis of other diseases. © 2018 S. Karger AG, Basel.
A Solution to Separation and Multicollinearity in Multiple Logistic Regression
Shen, Jianzhao; Gao, Sujuan
2010-01-01
In dementia screening tests, item selection for shortening an existing screening test can be achieved using multiple logistic regression. However, maximum likelihood estimates for such logistic regression models often experience serious bias or even non-existence because of separation and multicollinearity problems resulting from a large number of highly correlated items. Firth (1993, Biometrika, 80(1), 27–38) proposed a penalized likelihood estimator for generalized linear models and it was shown to reduce bias and the non-existence problems. Ridge regression has been used in logistic regression to stabilize the estimates in cases of multicollinearity. However, neither method solves the problem that the other addresses. In this paper, we propose a double penalized maximum likelihood estimator combining Firth’s penalized likelihood equation with a ridge parameter. We present a simulation study evaluating the empirical performance of the double penalized likelihood estimator in small to moderate sample sizes. We demonstrate the proposed approach using current screening data from a community-based dementia study. PMID:20376286
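One way to picture a double penalized objective is sketched below for an intercept plus a single covariate. This is an assumption-laden illustration, not the authors' estimator: it adds the Firth term, half the log determinant of the Fisher information, to the log-likelihood and subtracts a ridge term `0.5 * lam * beta1**2` on the slope only. The scaling of the ridge term, the choice to leave the intercept unpenalized, and the function name are all assumptions.

```python
import math

def log1pexp(z):
    """Numerically stable log(1 + exp(z))."""
    return z + math.log1p(math.exp(-z)) if z > 0 else math.log1p(math.exp(z))

def double_penalized_loglik(beta0, beta1, x, y, lam):
    """Log-likelihood + Firth penalty - ridge penalty (illustrative form).

    x, y: lists of covariate values and 0/1 outcomes; lam: ridge parameter.
    """
    loglik = 0.0
    i00 = i01 = i11 = 0.0  # entries of the 2x2 Fisher information X'WX
    for xi, yi in zip(x, y):
        eta = beta0 + beta1 * xi
        loglik += yi * eta - log1pexp(eta)
        e = math.exp(-abs(eta))
        w = e / (1.0 + e) ** 2  # p * (1 - p), computed stably
        i00 += w
        i01 += w * xi
        i11 += w * xi * xi
    det = i00 * i11 - i01 * i01
    firth = 0.5 * math.log(det) if det > 0 else float("-inf")
    ridge = 0.5 * lam * beta1 ** 2
    return loglik + firth - ridge

# Completely separated toy data: the plain likelihood keeps growing with beta1,
# but the Firth term pulls the penalized objective back down.
x = [-2.0, -1.0, 1.0, 2.0]
y = [0, 0, 1, 1]
at_small_slope = double_penalized_loglik(0.0, 1.0, x, y, lam=0.0)
at_large_slope = double_penalized_loglik(0.0, 20.0, x, y, lam=0.0)
```

On this toy data the objective is larger at a slope of 1 than at a slope of 20, which is the sense in which the penalty restores a finite maximizer under separation.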
A Solution to Separation and Multicollinearity in Multiple Logistic Regression.
Shen, Jianzhao; Gao, Sujuan
2008-10-01
In dementia screening tests, item selection for shortening an existing screening test can be achieved using multiple logistic regression. However, maximum likelihood estimates for such logistic regression models often experience serious bias or even non-existence because of separation and multicollinearity problems resulting from a large number of highly correlated items. Firth (1993, Biometrika, 80(1), 27-38) proposed a penalized likelihood estimator for generalized linear models and it was shown to reduce bias and the non-existence problems. Ridge regression has been used in logistic regression to stabilize the estimates in cases of multicollinearity. However, neither method solves the problem that the other addresses. In this paper, we propose a double penalized maximum likelihood estimator combining Firth's penalized likelihood equation with a ridge parameter. We present a simulation study evaluating the empirical performance of the double penalized likelihood estimator in small to moderate sample sizes. We demonstrate the proposed approach using current screening data from a community-based dementia study.
Ye, Dong-qing; Hu, Yi-song; Li, Xiang-pei; Huang, Fen; Yang, Shi-gui; Hao, Jia-hu; Yin, Jing; Zhang, Guo-qing; Liu, Hui-hui
2004-11-01
To explore the impact of environmental factors, daily lifestyle, psychosocial factors, and the interactions between environmental factors and chemokine genes on systemic lupus erythematosus (SLE). A case-control study was carried out, and environmental factors for SLE were analyzed by univariate and multivariate unconditional logistic regression. Interactions between environmental factors and chemokine polymorphisms contributing to SLE were also analyzed with a logistic regression model. Nineteen factors were associated with SLE in univariate unconditional logistic regression. In multivariate unconditional logistic regression, however, only five factors were associated with the disease: drinking well water (OR=0.099) was a protective factor, whereas multiple drug allergy (OR=8.174), over-exposure to sunshine (OR=18.339), taking antibiotics (OR=9.630), and oral contraceptives were risk factors. The unconditional logistic regression model showed an interaction between eating irritable food and the -2518MCP-1G/G genotype (OR=4.387). No interaction among environmental factors contributing to SLE was found in this study. Many environmental factors were related to SLE, and there was an interaction between the -2518MCP-1G/G genotype and eating irritable food.
Mielniczuk, Jan; Teisseyre, Paweł
2018-03-01
Detection of gene-gene interactions is one of the most important challenges in genome-wide case-control studies. Besides traditional logistic regression analysis, entropy-based methods have recently attracted significant attention. Among entropy-based methods, interaction information is one of the most promising measures, having many desirable properties. Although both logistic regression and interaction information have been used in several genome-wide association studies, the relationship between them has not been thoroughly investigated theoretically. The present paper attempts to fill this gap. We show that although certain connections between the two methods exist, in general they refer to two different concepts of dependence, and looking for interactions in those two senses leads to different approaches to interaction detection. We introduce an ordering between interaction measures and specify conditions for independent and dependent genes under which interaction information is a more discriminative measure than logistic regression. Moreover, we show that for so-called perfect distributions those measures are equivalent. The numerical experiments illustrate the theoretical findings, indicating that interaction information and its modified version are more universal tools for detecting various types of interaction than logistic regression and linkage disequilibrium measures. © 2017 WILEY PERIODICALS, INC.
TU-CD-BRB-01: Normal Lung CT Texture Features Improve Predictive Models for Radiation Pneumonitis
DOE Office of Scientific and Technical Information (OSTI.GOV)
Krafft, S; The University of Texas Graduate School of Biomedical Sciences, Houston, TX; Briere, T
2015-06-15
Purpose: Existing normal tissue complication probability (NTCP) models for radiation pneumonitis (RP) traditionally rely on dosimetric and clinical data but are limited in terms of performance and generalizability. Extraction of pre-treatment image features provides a potential new category of data that can improve NTCP models for RP. We consider quantitative measures of total lung CT intensity and texture in a framework for prediction of RP. Methods: Available clinical and dosimetric data was collected for 198 NSCLC patients treated with definitive radiotherapy. Intensity- and texture-based image features were extracted from the T50 phase of the 4D-CT acquired for treatment planning. A total of 3888 features (15 clinical, 175 dosimetric, and 3698 image features) were gathered and considered candidate predictors for modeling of RP grade≥3. A baseline logistic regression model with mean lung dose (MLD) was first considered. Additionally, a least absolute shrinkage and selection operator (LASSO) logistic regression was applied to the set of clinical and dosimetric features, and subsequently to the full set of clinical, dosimetric, and image features. Model performance was assessed by comparing area under the curve (AUC). Results: A simple logistic fit of MLD was an inadequate model of the data (AUC∼0.5). Including clinical and dosimetric parameters within the framework of the LASSO resulted in improved performance (AUC=0.648). Analysis of the full cohort of clinical, dosimetric, and image features provided further and significant improvement in model performance (AUC=0.727). Conclusions: To achieve significant gains in predictive modeling of RP, new categories of data should be considered in addition to clinical and dosimetric features. We have successfully incorporated CT image features into a framework for modeling RP and have demonstrated improved predictive performance. Validation and further investigation of CT image features in the context of RP NTCP modeling is warranted. This work was supported by the Rosalie B. Hite Fellowship in Cancer research awarded to SPK.
ERIC Educational Resources Information Center
Shih, Ching-Lin; Liu, Tien-Hsiang; Wang, Wen-Chung
2014-01-01
The simultaneous item bias test (SIBTEST) method regression procedure and the differential item functioning (DIF)-free-then-DIF strategy are applied to the logistic regression (LR) method simultaneously in this study. These procedures are used to adjust the effects of matching true score on observed score and to better control the Type I error…
Hay, Peter D; Smith, Julie; O'Connor, Richard A
2016-02-01
The aim of this study was to evaluate the benefits to SPECT bone scan image quality when applying resolution recovery (RR) during image reconstruction using software provided by a third-party supplier. Bone SPECT data from 90 clinical studies were reconstructed retrospectively using software supplied independent of the gamma camera manufacturer. The current clinical datasets contain 120×10 s projections and are reconstructed using an iterative method with a Butterworth postfilter. Five further reconstructions were created with the following characteristics: 10 s projections with a Butterworth postfilter (to assess intraobserver variation); 10 s projections with a Gaussian postfilter with and without RR; and 5 s projections with a Gaussian postfilter with and without RR. Two expert observers were asked to rate image quality on a five-point scale relative to our current clinical reconstruction. Datasets were anonymized and presented in random order. The benefits of RR on image scores were evaluated using ordinal logistic regression (visual grading regression). The application of RR during reconstruction increased the probability that both observers scored image quality as better than the current clinical reconstruction, even where the dataset contained half the normal counts. Type of reconstruction and observer were both statistically significant variables in the ordinal logistic regression model. Visual grading regression was found to be a useful method for validating the local introduction of technological developments in nuclear medicine imaging. RR, as implemented by the independent software supplier, improved bone SPECT image quality when applied during image reconstruction. In the majority of clinical cases, acquisition times for bone SPECT intended for the purposes of localization can safely be halved (from 10 s projections to 5 s) when RR is applied.
Díaz Villegas, Gregory Mishell; Runzer Colmenares, Fernando
2015-01-01
To evaluate the association between calf circumference and gait speed in patients aged 65 years or older at the geriatric day clinic of the Peruvian Centro Médico Naval. Cross-sectional, retrospective study. We assessed 139 participants aged 65 years or older at the Peruvian Centro Médico Naval, recording calf circumference, gait speed, and the Short Physical Performance Battery. We used bivariate analyses and a logistic regression model to test for associations between variables. The mean age was 79.37 years (SD: 8.71); 59.71% were male, 30.97% had a slow walking speed, and the mean calf circumference was 33.42 cm (SD: 5.61). In bivariate analysis, mean calf circumference was 30.35 cm (SD: 3.74) in the slow gait speed group and 33.51 cm (SD: 3.26) in the normal gait speed group, a significant difference. Logistic regression for slow gait speed gave statistically significant results after adjusting the model for disability and age. Low calf circumference is associated with slow gait speed in people over 65 years old. Copyright © 2014. Published by Elsevier Espana.
Rainfall-induced Landslide Susceptibility assessment at the Longnan county
NASA Astrophysics Data System (ADS)
Hong, Haoyuan; Zhang, Ying
2017-04-01
Landslides are a serious hazard in Longnan county, China, so landslide susceptibility assessment is a useful tool for government decision making. The main objective of this study is to investigate and compare the frequency ratio, support vector machine, and logistic regression methods. Longnan county (Jiangxi province, China) was selected as the case study. First, a landslide inventory map with 354 landslide locations was constructed, and the landslide locations were randomly divided in a 70/30 ratio for training and validating the models. Second, fourteen landslide conditioning factors were prepared: slope, aspect, altitude, topographic wetness index (TWI), stream power index (SPI), sediment transport index (STI), plan curvature, lithology, distance to faults, distance to rivers, distance to roads, land use, normalized difference vegetation index (NDVI), and rainfall. Using the frequency ratio, support vector machines, and logistic regression, three landslide susceptibility models were constructed. Finally, the overall performance of the resulting models was assessed and compared using the receiver operating characteristic (ROC) curve technique. The support vector machine model performed best in the study area, with a success rate of 88.39% and a prediction rate of 84.06%.
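ROC-based comparisons like the one above usually come down to the area under the curve (AUC). A minimal sketch, assuming the AUC is computed as the probability that a randomly chosen positive case receives a higher score than a randomly chosen negative case; the scores and labels below are made up and `roc_auc` is a hypothetical helper name.

```python
def roc_auc(scores, labels):
    """AUC as the fraction of positive/negative pairs ranked correctly (ties count 0.5)."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))

# Hypothetical susceptibility scores for three landslide and three non-landslide cells.
scores = [0.9, 0.8, 0.35, 0.6, 0.2, 0.1]
labels = [1, 1, 1, 0, 0, 0]
auc = roc_auc(scores, labels)
```

Evaluating the same function on the training split and on the held-out split gives two separate figures, one describing fit and one describing predictive ability.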
Access disparities to Magnet hospitals for patients undergoing neurosurgical operations
Missios, Symeon; Bekelis, Kimon
2017-01-01
Background Centers of excellence focusing on quality improvement have demonstrated superior outcomes for a variety of surgical interventions. We investigated the presence of access disparities to hospitals recognized by the Magnet Recognition Program of the American Nurses Credentialing Center (ANCC) for patients undergoing neurosurgical operations. Methods We performed a cohort study of all neurosurgery patients who were registered in the New York Statewide Planning and Research Cooperative System (SPARCS) database from 2009–2013. We examined the association of African-American race and lack of insurance with Magnet status hospitalization for neurosurgical procedures. A mixed effects propensity adjusted multivariable regression analysis was used to control for confounding. Results During the study period, 190,535 neurosurgical patients met the inclusion criteria. Using a multivariable logistic regression, we demonstrate that African-Americans had lower admission rates to Magnet institutions (OR 0.62; 95% CI, 0.58–0.67). This persisted in a mixed effects logistic regression model (OR 0.77; 95% CI, 0.70–0.83) to adjust for clustering at the patient county level, and a propensity score adjusted logistic regression model (OR 0.75; 95% CI, 0.69–0.82). Additionally, lack of insurance was associated with lower admission rates to Magnet institutions (OR 0.71; 95% CI, 0.68–0.73), in a multivariable logistic regression model. This persisted in a mixed effects logistic regression model (OR 0.72; 95% CI, 0.69–0.74), and a propensity score adjusted logistic regression model (OR 0.72; 95% CI, 0.69–0.75). Conclusions Using a comprehensive all-payer cohort of neurosurgery patients in New York State we identified an association of African-American race and lack of insurance with lower rates of admission to Magnet hospitals. PMID:28684152
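The odds ratios and intervals reported above relate to logistic regression coefficients through exponentiation. The sketch below assumes the intervals are Wald intervals (an assumption; the abstract does not say how they were computed) and backs out the coefficient and standard error implied by the first reported result, OR 0.62 (95% CI, 0.58–0.67). Function names are hypothetical.

```python
import math

Z95 = 1.959964  # standard normal quantile for a two-sided 95% interval

def odds_ratio_ci(beta, se, z=Z95):
    """Odds ratio and Wald confidence limits from a logit coefficient and its standard error."""
    return math.exp(beta), math.exp(beta - z * se), math.exp(beta + z * se)

def coef_from_or(odds_ratio, lower, upper, z=Z95):
    """Coefficient and standard error implied by a reported OR and Wald interval."""
    beta = math.log(odds_ratio)
    se = (math.log(upper) - math.log(lower)) / (2.0 * z)
    return beta, se

# Round trip using the first association reported in the abstract.
beta, se = coef_from_or(0.62, 0.58, 0.67)
or_hat, lo, hi = odds_ratio_ci(beta, se)
```

Because published limits are rounded, the recovered standard error is only approximate.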
Adjusting for Confounding in Early Postlaunch Settings: Going Beyond Logistic Regression Models.
Schmidt, Amand F; Klungel, Olaf H; Groenwold, Rolf H H
2016-01-01
Postlaunch data on medical treatments can be analyzed to explore adverse events or relative effectiveness in real-life settings. These analyses are often complicated by the number of potential confounders and the possibility of model misspecification. We conducted a simulation study to compare the performance of logistic regression, propensity score, disease risk score, and stabilized inverse probability weighting methods to adjust for confounding. Model misspecification was induced in the independent derivation dataset. We evaluated performance using relative bias and confidence interval coverage of the true effect, among other metrics. At low events per coefficient (1.0 and 0.5), the logistic regression estimates had a large relative bias (greater than -100%). Bias of the disease risk score estimates was at most 13.48% and 18.83%. For the propensity score model, this was 8.74% and >100%, respectively. At events per coefficient of 1.0 and 0.5, inverse probability weighting frequently failed or reduced to a crude regression, resulting in biases of -8.49% and 24.55%. Coverage of logistic regression estimates became less than the nominal level at events per coefficient ≤5. For the disease risk score, inverse probability weighting, and propensity score, coverage became less than nominal at events per coefficient ≤2.5, ≤1.0, and ≤1.0, respectively. Bias of misspecified disease risk score models was 16.55%. In settings with low events/exposed subjects per coefficient, disease risk score methods can be useful alternatives to logistic regression models, especially when propensity score models cannot be used. Despite better performance of disease risk score methods than logistic regression and propensity score models in small events per coefficient settings, bias and coverage still deviated from nominal.
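For readers unfamiliar with stabilized inverse probability weighting, a minimal sketch follows. It assumes the propensity scores have already been estimated (for example by a logistic regression of treatment on confounders) and uses the usual definition: marginal treatment probability divided by the subject's probability of the treatment actually received. The data and the function name are hypothetical, and this is not the simulation code from the study.

```python
def stabilized_weights(treated, propensity):
    """Stabilized inverse probability of treatment weights.

    treated: list of 0/1 indicators.
    propensity: estimated P(treated = 1 | covariates) for each subject.
    """
    p_treated = sum(treated) / len(treated)  # marginal treatment probability
    weights = []
    for a, ps in zip(treated, propensity):
        if not 0.0 < ps < 1.0:
            raise ValueError("propensity scores must lie strictly between 0 and 1")
        if a == 1:
            weights.append(p_treated / ps)
        else:
            weights.append((1.0 - p_treated) / (1.0 - ps))
    return weights

# Hypothetical four-subject example.
treated = [1, 1, 0, 0]
propensity = [0.8, 0.5, 0.5, 0.2]
w = stabilized_weights(treated, propensity)
```

With few events per coefficient, estimated propensities can sit close to 0 or 1, which makes these weights unstable.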
Pfeiffer, R M; Riedl, R
2015-08-15
We assess the asymptotic bias of estimates of exposure effects conditional on covariates when summary scores of confounders, instead of the confounders themselves, are used to analyze observational data. First, we study regression models for cohort data that are adjusted for summary scores. Second, we derive the asymptotic bias for case-control studies when cases and controls are matched on a summary score, and then analyzed either using conditional logistic regression or by unconditional logistic regression adjusted for the summary score. Two scores, the propensity score (PS) and the disease risk score (DRS) are studied in detail. For cohort analysis, when regression models are adjusted for the PS, the estimated conditional treatment effect is unbiased only for linear models, or at the null for non-linear models. Adjustment of cohort data for DRS yields unbiased estimates only for linear regression; all other estimates of exposure effects are biased. Matching cases and controls on DRS and analyzing them using conditional logistic regression yields unbiased estimates of exposure effect, whereas adjusting for the DRS in unconditional logistic regression yields biased estimates, even under the null hypothesis of no association. Matching cases and controls on the PS yield unbiased estimates only under the null for both conditional and unconditional logistic regression, adjusted for the PS. We study the bias for various confounding scenarios and compare our asymptotic results with those from simulations with limited sample sizes. To create realistic correlations among multiple confounders, we also based simulations on a real dataset. Copyright © 2015 John Wiley & Sons, Ltd.
Nie, Z Q; Ou, Y Q; Zhuang, J; Qu, Y J; Mai, J Z; Chen, J M; Liu, X Q
2016-05-01
Conditional and unconditional logistic regression analyses are commonly used in case-control studies, whereas the Cox proportional hazards model is often used in survival data analysis. Most of the literature refers only to the main effect model; however, the generalized linear model differs from the general linear model, and interaction comprises multiplicative interaction and additive interaction. The former has only statistical significance, whereas the latter has biological significance. In this paper, macros were written in SAS 9.4 to calculate the contrast ratio, the attributable proportion due to interaction, and the synergy index alongside the interaction terms of logistic and Cox regression, and Wald, delta, and profile likelihood confidence intervals were used to evaluate additive interaction, as a reference for big data analysis in clinical epidemiology and for the analysis of genetic multiplicative and additive interactions.
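The three additive interaction measures can be computed directly from odds ratios, as sketched below in Python rather than SAS. This assumes the standard definitions and that the abstract's "contrast ratio" corresponds to the relative excess risk due to interaction (RERI); `or10`, `or01`, and `or11` are hypothetical odds ratios for exposure to the first factor only, the second only, and both, each relative to the doubly unexposed group.

```python
def additive_interaction(or10, or01, or11):
    """Return (RERI, attributable proportion, synergy index) from three odds ratios.

    The odds ratios stand in for relative risks, a common approximation
    when the outcome is rare.
    """
    reri = or11 - or10 - or01 + 1.0        # relative excess risk due to interaction
    ap = reri / or11                       # attributable proportion due to interaction
    denom = (or10 - 1.0) + (or01 - 1.0)
    s = (or11 - 1.0) / denom if denom != 0 else float("nan")  # synergy index
    return reri, ap, s

# Hypothetical odds ratios.
reri, ap, s = additive_interaction(or10=2.0, or01=3.0, or11=7.0)
```

No additive interaction corresponds to RERI = 0, AP = 0, and S = 1. Confidence intervals for these quantities need the covariance of the estimated coefficients, which this sketch leaves out.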
Multiple imputation for handling missing outcome data when estimating the relative risk.
Sullivan, Thomas R; Lee, Katherine J; Ryan, Philip; Salter, Amy B
2017-09-06
Multiple imputation is a popular approach to handling missing data in medical research, yet little is known about its applicability for estimating the relative risk. Standard methods for imputing incomplete binary outcomes involve logistic regression or an assumption of multivariate normality, whereas relative risks are typically estimated using log binomial models. It is unclear whether misspecification of the imputation model in this setting could lead to biased parameter estimates. Using simulated data, we evaluated the performance of multiple imputation for handling missing data prior to estimating adjusted relative risks from a correctly specified multivariable log binomial model. We considered an arbitrary pattern of missing data in both outcome and exposure variables, with missing data induced under missing at random mechanisms. Focusing on standard model-based methods of multiple imputation, missing data were imputed using multivariate normal imputation or fully conditional specification with a logistic imputation model for the outcome. Multivariate normal imputation performed poorly in the simulation study, consistently producing estimates of the relative risk that were biased towards the null. Despite outperforming multivariate normal imputation, fully conditional specification also produced somewhat biased estimates, with greater bias observed for higher outcome prevalences and larger relative risks. Deleting imputed outcomes from analysis datasets did not improve the performance of fully conditional specification. Both multivariate normal imputation and fully conditional specification produced biased estimates of the relative risk, presumably since both use a misspecified imputation model. Based on simulation results, we recommend researchers use fully conditional specification rather than multivariate normal imputation and retain imputed outcomes in the analysis when estimating relative risks. 
However, fully conditional specification is not without its shortcomings, and so further research is needed to identify optimal approaches for relative risk estimation within the multiple imputation framework.
Chen, Chen; Xie, Yuanchang
2014-12-01
Driving hours and rest breaks are closely related to driver fatigue, which is a major contributor to truck crashes. This study investigates the effects of driving hours and rest breaks on commercial truck driver safety. A discrete-time logistic regression model is used to evaluate the crash odds ratios of driving hours and rest breaks. Driving time is divided into 11 one hour intervals. These intervals and rest breaks are modeled as dummy variables. In addition, a Cox proportional hazards regression model with time-dependent covariates is used to assess the transient effects of rest breaks, which consists of a fixed effect and a variable effect. Data collected from two national truckload carriers in 2009 and 2010 are used. The discrete-time logistic regression result indicates that only the crash odds ratio of the 11th driving hour is statistically significant. Taking one, two, and three rest breaks can reduce drivers' crash odds by 68%, 83%, and 85%, respectively, compared to drivers who did not take any rest breaks. The Cox regression result shows clear transient effects for rest breaks. It also suggests that drivers may need some time to adjust themselves to normal driving tasks after a rest break. Overall, the third rest break's safety benefit is very limited based on the results of both models. The findings of this research can help policy makers better understand the impact of driving time and rest breaks and develop more effective rules to improve commercial truck safety. Copyright © 2014 National Safety Council and Elsevier Ltd. All rights reserved.
No rationale for 1 variable per 10 events criterion for binary logistic regression analysis.
van Smeden, Maarten; de Groot, Joris A H; Moons, Karel G M; Collins, Gary S; Altman, Douglas G; Eijkemans, Marinus J C; Reitsma, Johannes B
2016-11-24
Ten events per variable (EPV) is a widely advocated minimal criterion for sample size considerations in logistic regression analysis. Of three previous simulation studies that examined this minimal EPV criterion only one supports the use of a minimum of 10 EPV. In this paper, we examine the reasons for substantial differences between these extensive simulation studies. The current study uses Monte Carlo simulations to evaluate small sample bias, coverage of confidence intervals and mean square error of logit coefficients. Logistic regression models fitted by maximum likelihood and a modified estimation procedure, known as Firth's correction, are compared. The results show that besides EPV, the problems associated with low EPV depend on other factors such as the total sample size. It is also demonstrated that simulation results can be dominated by even a few simulated data sets for which the prediction of the outcome by the covariates is perfect ('separation'). We reveal that different approaches for identifying and handling separation lead to substantially different simulation results. We further show that Firth's correction can be used to improve the accuracy of regression coefficients and alleviate the problems associated with separation. The current evidence supporting EPV rules for binary logistic regression is weak. Given our findings, there is an urgent need for new research to provide guidance for supporting sample size considerations for binary logistic regression analysis.
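Two quantities from this abstract are easy to make concrete. The sketch below computes EPV, taking the smaller outcome class as the "events" (a common convention, assumed here), and checks a single covariate for complete separation. Both helper names are hypothetical, and the check covers only the one-covariate case; detecting separation across several covariates generally needs a linear programming or model-based check.

```python
def events_per_variable(outcomes, n_variables):
    """EPV: size of the smaller outcome class divided by the number of candidate variables."""
    events = sum(outcomes)
    minority = min(events, len(outcomes) - events)
    return minority / n_variables

def completely_separated(x, y):
    """True if a single covariate x perfectly splits the 0/1 outcomes y at some cut point."""
    x0 = [xi for xi, yi in zip(x, y) if yi == 0]
    x1 = [xi for xi, yi in zip(x, y) if yi == 1]
    if not x0 or not x1:
        raise ValueError("both outcome classes must be present")
    return max(x0) < min(x1) or max(x1) < min(x0)

# Toy data: the first covariate separates the outcome, the second does not.
y = [0, 0, 1, 1]
separated = completely_separated([1.0, 2.0, 3.0, 4.0], y)
overlapping = completely_separated([1.0, 3.0, 2.0, 4.0], y)
epv = events_per_variable(y, n_variables=2)
```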
Li, Yi; Tseng, Yufeng J.; Pan, Dahua; Liu, Jianzhong; Kern, Petra S.; Gerberick, G. Frank; Hopfinger, Anton J.
2008-01-01
Currently, the only validated methods to identify skin sensitization effects are in vivo models, such as the Local Lymph Node Assay (LLNA) and guinea pig studies. There is a tremendous need, in particular due to novel legislation, to develop animal alternatives, e.g., Quantitative Structure-Activity Relationship (QSAR) models. Here, QSAR models for skin sensitization using LLNA data have been constructed. The descriptors used to generate these models are derived from the 4D-molecular similarity paradigm and are referred to as universal 4D-fingerprints. A training set of 132 structurally diverse compounds and a test set of 15 structurally diverse compounds were used in this study. The statistical methodologies used to build the models are logistic regression (LR), and partial least square coupled logistic regression (PLS-LR), which prove to be effective tools for studying skin sensitization measures expressed in the two categorical terms of sensitizer and non-sensitizer. QSAR models with low values of the Hosmer-Lemeshow goodness-of-fit statistic, χ²_HL, are significant and predictive. For the training set, the cross-validated prediction accuracy of the logistic regression models ranges from 77.3% to 78.0%, while that of PLS-logistic regression models ranges from 87.1% to 89.4%. For the test set, the prediction accuracy of logistic regression models ranges from 80.0% to 86.7%, while that of PLS-logistic regression models ranges from 73.3% to 80.0%. The QSAR models are made up of 4D-fingerprints related to aromatic atoms, hydrogen bond acceptors and negatively partially charged atoms. PMID:17226934
Hestetun, Ingebjørg; Svendsen, Martin Veel; Oellingrath, Inger Margaret
2015-03-01
Overweight and mental health problems represent two major challenges related to child and adolescent health. More knowledge of a possible relationship between the two problems and the influence of peer problems on the mental health of overweight children is needed. It has previously been hypothesized that peer problems may be an underlying factor in the association between overweight and mental health problems. The purpose of the present study was to investigate the associations between overweight, peer problems, and indications of mental health problems in a sample of 12-13-year-old Norwegian schoolchildren. Children aged 12-13 years were recruited from the seventh grade of primary schools in Telemark County, Norway. Parents gave information about mental health and peer problems by completing the extended version of the Strength and Difficulties Questionnaire (SDQ). Height and weight were objectively measured. Complete data were obtained for 744 children. Fisher's exact probability test and multiple logistic regressions were used. Most children had normal good mental health. Multiple logistic regression analysis showed that overweight children were more likely to have indications of psychiatric disorders (adjusted OR: 1.8, CI: 1.0-3.2) and peer problems (adjusted OR: 2.6, CI: 1.6-4.2) than normal-weight children, when adjusted for relevant background variables. When adjusted for peer problems, the association between overweight and indications of any psychiatric disorder was no longer significant. The results support the hypothesis that peer problems may be an important underlying factor for mental health problems in overweight children.
Sun, Z W; Shi, T T; Fu, P X
2017-02-01
To explore the characteristics of homicidal behavior in patients with schizophrenia and the factors influencing the assessment of their criminal capacity. Indicators such as demographic and clinical data, characteristics of criminal behavior and criminal capacity were collected with a self-designed investigation form from homicide suspects diagnosed by forensic psychiatry as having schizophrenia (n = 110) or as mentally normal (n = 70), and the two groups were compared. The factors influencing the assessment of criminal capacity in the suspects diagnosed with schizophrenia were also analyzed using logistic regression analysis. There were no statistically significant differences between the schizophrenic group and the mentally normal group in age, gender, education and marital status (P > 0.05). There were statistically significant differences between the two groups in thought disorder, emotional state and social function before the crime (P < 0.05), and in some characteristics of the case such as aggressive history (P < 0.05), cue, trigger, plan, criminal incentives, object of crime, circumstance cognition and self-protection (P < 0.05). Multivariate logistic regression analysis suggested that thought disorder, emotional state, social function, criminal incentives, plan and self-protection before the crime in the schizophrenic group were positively correlated with criminal capacity (P < 0.05). The relevant influences of psychopathology and crime characteristics should be considered comprehensively to improve the accuracy of criminal capacity evaluation in suspects diagnosed with schizophrenia who have committed homicide. Copyright© by the Editorial Department of Journal of Forensic Medicine
Barberio, Amanda M; Hosein, F Shaun; Quiñonez, Carlos; McLaren, Lindsay
2017-01-01
Background There are concerns that altered thyroid functioning could be the result of ingesting too much fluoride. Community water fluoridation (CWF) is an important source of fluoride exposure. Our objectives were to examine the association between fluoride exposure and (1) diagnosis of a thyroid condition and (2) indicators of thyroid functioning among a national population-based sample of Canadians. Methods We analysed data from Cycles 2 and 3 of the Canadian Health Measures Survey (CHMS). Logistic regression was used to assess associations between fluoride from urine and tap water samples and the diagnosis of a thyroid condition. Multinomial logistic regression was used to examine the relationship between fluoride exposure and thyroid-stimulating hormone (TSH) level (low/normal/high). Other available variables permitted additional exploratory analyses among the subset of participants for whom we could discern some fluoride exposure from drinking water and/or dental products. Results There was no evidence of a relationship between fluoride exposure (from urine and tap water) and the diagnosis of a thyroid condition. There was no statistically significant association between fluoride exposure and abnormal (low or high) TSH levels relative to normal TSH levels. Rerunning the models with the sample constrained to the subset of participants for whom we could discern some source(s) of fluoride exposure from drinking water and/or dental products revealed no significant associations. Conclusion These analyses suggest that, at the population level, fluoride exposure is not associated with impaired thyroid functioning in a time and place where multiple sources of fluoride exposure, including CWF, exist. PMID:28839078
Brenn, T; Arnesen, E
1985-01-01
For comparative evaluation, discriminant analysis, logistic regression and Cox's model were used to select risk factors for total and coronary deaths among 6595 men aged 20-49 followed for 9 years. Groups with mortality between 5 and 93 per 1000 were considered. Discriminant analysis selected variable sets only marginally different from the logistic and Cox methods which always selected the same sets. A time-saving option, offered for both the logistic and Cox selection, showed no advantage compared with discriminant analysis. Analysing more than 3800 subjects, the logistic and Cox methods consumed, respectively, 80 and 10 times more computer time than discriminant analysis. When including the same set of variables in non-stepwise analyses, all methods estimated coefficients that in most cases were almost identical. In conclusion, discriminant analysis is advocated for preliminary or stepwise analysis, otherwise Cox's method should be used.
ERIC Educational Resources Information Center
DeMars, Christine E.
2009-01-01
The Mantel-Haenszel (MH) and logistic regression (LR) differential item functioning (DIF) procedures have inflated Type I error rates when there are large mean group differences, short tests, and large sample sizes. When there are large group differences in mean score, groups matched on the observed number-correct score differ on true score,…
[A study of Rutter behavior problems in school aged children with cleft lip and/or palate].
Wu, Zheng-yi; Zhang, Yong; Chen, Li-qin
2008-08-01
To study differences in behavior problems in school-aged children with cleft lip and/or palate. The Rutter Children Behavior Parent Checklist was administered to 100 school-aged children with cleft lip and/or palate and 135 normal school-aged children in Shanghai. The questionnaires were completed and analyzed with the chi-square test and logistic regression using the SPSS 10.0 software package. The positive rate of Rutter behavior problems in school-aged children with cleft lip and/or palate was significantly higher than that in normal children (P<0.05). The positive rate of behavior problems and of Rutter behavior problem A in boys was significantly higher than in girls. Primary mental health intervention is necessary to promote psychiatric health.
Satellite rainfall retrieval by logistic regression
NASA Technical Reports Server (NTRS)
Chiu, Long S.
1986-01-01
The potential use of logistic regression in rainfall estimation from satellite measurements is investigated. Satellite measurements provide covariate information in terms of radiances from different remote sensors. The logistic regression technique can effectively accommodate many covariates and test their significance in the estimation. The outcome from the logistical model is the probability that the rainrate of a satellite pixel is above a certain threshold. By varying the thresholds, a rainrate histogram can be obtained, from which the mean and the variance can be estimated. A logistical model is developed and applied to rainfall data collected during GATE, using as covariates the fractional rain area and a radiance measurement which is deduced from a microwave temperature-rainrate relation. It is demonstrated that the fractional rain area is an important covariate in the model, consistent with the use of the so-called Area Time Integral in estimating total rain volume in other studies. To calibrate the logistical model, simulated rain fields generated by rainfield models with prescribed parameters are needed. A stringent test of the logistical model is its ability to recover the prescribed parameters of simulated rain fields. A rain field simulation model which preserves the fractional rain area and lognormality of rainrates as found in GATE is developed. A stochastic regression model of branching and immigration whose solutions are lognormally distributed in some asymptotic limits has also been developed.
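The threshold-varying step described in this abstract can be sketched as follows. This is a minimal illustration, not the author's procedure: the function names are hypothetical, the exceedance probabilities are assumed to come from separately fitted logistic models, and placing each bin's mass at its midpoint is an assumption of the sketch. Probability mass above the last threshold is ignored.

```python
def histogram_from_exceedance(thresholds, exceed_probs):
    """Convert P(rainrate > t_k), given at increasing thresholds t_k,
    into a list of (lower, upper, probability) bins."""
    bins = []
    for k in range(len(thresholds) - 1):
        # Mass between two thresholds is the drop in exceedance probability.
        prob = exceed_probs[k] - exceed_probs[k + 1]
        bins.append((thresholds[k], thresholds[k + 1], prob))
    return bins


def mean_and_variance(bins):
    """Approximate the mean and variance by placing each bin's mass
    at the bin midpoint (an assumption of this sketch)."""
    total = sum(p for _, _, p in bins)
    mean = sum(0.5 * (lo + hi) * p for lo, hi, p in bins) / total
    var = sum((0.5 * (lo + hi) - mean) ** 2 * p for lo, hi, p in bins) / total
    return mean, var
```

With thresholds 0, 2 and 4 and exceedance probabilities 1.0, 0.5 and 0.0, the two bins each carry half the mass, so the midpoint approximation gives a mean of 2 and a variance of 1.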
Quantitative Analysis of Land Loss in Coastal Louisiana Using Remote Sensing
NASA Astrophysics Data System (ADS)
Wales, P. M.; Kuszmaul, J.; Roberts, C.
2005-12-01
For the past thirty-five years the land loss along the Louisiana Coast has been recognized as a growing problem. One of the clearest indicators of this land loss is that in 2000 smooth cordgrass (Spartina alterniflora) was turning brown well before its normal hibernation period. Over 100,000 acres of marsh were affected by the 2000 browning. In 2001 data were collected using low-altitude helicopter-based transects of the coast, with 7,400 data points being collected by researchers at the USGS, National Wetlands Research Center, and Louisiana Department of Natural Resources. The surveys contained data describing the characteristics of the marsh, including latitude, longitude, marsh condition, marsh color, percent vegetated, and marsh die-back. The ultimate goal of the study is a model that combines remote sensing images, field data, and statistical analysis to develop a methodology for estimating the margin of error in measurements of coastal land loss (erosion). A model was successfully created using a series of band combinations (used as predictive variables). The most successful band combinations or predictive variables were the braud value [(Sum Visible TM Bands - Sum Infrared TM Bands)/(Sum Visible TM Bands + Sum Infrared TM Bands)], TM band 7/TM band 2, brightness, NDVI, wetness, vegetation index, and a 7x7 autocovariate nearest neighbor floating window. The model values were used to generate the logistic regression model. A new image was created based on the logistic regression probability equation, where each pixel represents the probability of finding water or non-water at that location in each image. Pixels within each image that have a high probability of representing water have a value close to 1 and pixels with a low probability of representing water have a value close to 0. A logistic regression model is proposed that uses seven independent variables. This model yields an accurate classification for 86.5% of the 1997 and 2001 survey locations. When the logistic regression model was applied to the satellite imagery of the entire Louisiana Coast study area, the statewide loss from 1997 to 2001 was estimated at 358 mi² to 368 mi², using two different methods for estimating land loss.
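The bracketed band-ratio formula and the per-pixel probability image can be illustrated with a short sketch. The function names, coefficients and intercept are hypothetical and are not the fitted values from the study; only the normalized-difference formula and the logistic form are taken from the abstract.

```python
import math


def band_ratio_index(visible_bands, infrared_bands):
    """(sum visible - sum infrared) / (sum visible + sum infrared),
    the normalized difference quoted in the abstract."""
    vis = sum(visible_bands)
    ir = sum(infrared_bands)
    return (vis - ir) / (vis + ir)


def water_probability(predictors, coefficients, intercept):
    """Logistic probability for one pixel. The coefficients and intercept
    are placeholders supplied by the caller, not values from the study."""
    z = intercept + sum(b * x for b, x in zip(coefficients, predictors))
    # Numerically stable logistic function for large |z|.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)
```

Applying `water_probability` to every pixel's predictor vector yields an image whose values lie between 0 and 1, as described above.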
Practical Session: Logistic Regression
NASA Astrophysics Data System (ADS)
Clausel, M.; Grégoire, G.
2014-12-01
An exercise is proposed to illustrate the logistic regression. One investigates the different risk factors in the apparition of coronary heart disease. It has been proposed in Chapter 5 of the book of D.G. Kleinbaum and M. Klein, "Logistic Regression", Statistics for Biology and Health, Springer Science Business Media, LLC (2010) and also by D. Chessel and A.B. Dufour in Lyon 1 (see Sect. 6 of http://pbil.univ-lyon1.fr/R/pdf/tdr341.pdf). This example is based on data given in the file evans.txt coming from http://www.sph.emory.edu/dkleinb/logreg3.htm#data.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Ghazali, Amirul Syafiq Mohd; Ali, Zalila; Noor, Norlida Mohd
Multinomial logistic regression is widely used to model the outcomes of a polytomous response variable, a categorical dependent variable with more than two categories. The model assumes that the conditional mean of the dependent categorical variables is the logistic function of an affine combination of predictor variables. Its procedure gives a number of logistic regression models that make specific comparisons of the response categories. When there are q categories of the response variable, the model consists of q-1 logit equations which are fitted simultaneously. The model is validated by variable selection procedures, tests of regression coefficients, a significance test of the overall model, goodness-of-fit measures, and validation of predicted probabilities using odds ratios. This study used the multinomial logistic regression model to investigate obesity and overweight among primary school students in a rural area on the basis of their demographic profiles, lifestyles and on the diet and food intake. The results indicated that obesity and overweight of students are related to gender, religion, sleep duration, time spent on electronic games, breakfast intake in a week, with whom meals are taken, protein intake, and also, the interaction between breakfast intake in a week with sleep duration, and the interaction between gender and protein intake.
NASA Astrophysics Data System (ADS)
Ghazali, Amirul Syafiq Mohd; Ali, Zalila; Noor, Norlida Mohd; Baharum, Adam
2015-10-01
Multinomial logistic regression is widely used to model the outcomes of a polytomous response variable, a categorical dependent variable with more than two categories. The model assumes that the conditional mean of the dependent categorical variables is the logistic function of an affine combination of predictor variables. Its procedure gives a number of logistic regression models that make specific comparisons of the response categories. When there are q categories of the response variable, the model consists of q-1 logit equations which are fitted simultaneously. The model is validated by variable selection procedures, tests of regression coefficients, a significance test of the overall model, goodness-of-fit measures, and validation of predicted probabilities using odds ratios. This study used the multinomial logistic regression model to investigate obesity and overweight among primary school students in a rural area on the basis of their demographic profiles, lifestyles and on the diet and food intake. The results indicated that obesity and overweight of students are related to gender, religion, sleep duration, time spent on electronic games, breakfast intake in a week, with whom meals are taken, protein intake, and also, the interaction between breakfast intake in a week with sleep duration, and the interaction between gender and protein intake.
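The statement that q response categories lead to q-1 logit equations can be made concrete with a small sketch of the baseline-category form. The function name and the coefficient layout (one row per non-baseline category, intercept first) are assumptions for illustration; no coefficients from the study are used.

```python
import math


def multinomial_probabilities(x, coefficient_rows):
    """Return [P(baseline), P(category 1), ..., P(category q-1)].

    coefficient_rows holds q-1 rows, each [intercept, b1, ..., bp],
    giving the log-odds of that category against the baseline."""
    logits = [row[0] + sum(b * xi for b, xi in zip(row[1:], x))
              for row in coefficient_rows]
    # Shift by the largest logit (baseline logit is 0) to avoid overflow.
    shift = max(0.0, max(logits))
    exps = [math.exp(value - shift) for value in logits]
    base = math.exp(-shift)
    denom = base + sum(exps)
    return [base / denom] + [e / denom for e in exps]
```

The baseline category needs no equation of its own because its probability is fixed by the requirement that all q probabilities sum to one.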
The cross-validated AUC for MCP-logistic regression with high-dimensional data.
Jiang, Dingfeng; Huang, Jian; Zhang, Ying
2013-10-01
We propose a cross-validated area under the receiver operating characteristic (ROC) curve (CV-AUC) criterion for tuning parameter selection for penalized methods in sparse, high-dimensional logistic regression models. We use this criterion in combination with the minimax concave penalty (MCP) method for variable selection. The CV-AUC criterion is specifically designed for optimizing the classification performance for binary outcome data. To implement the proposed approach, we derive an efficient coordinate descent algorithm to compute the MCP-logistic regression solution surface. Simulation studies are conducted to evaluate the finite sample performance of the proposed method and its comparison with the existing methods including the Akaike information criterion (AIC), Bayesian information criterion (BIC) or Extended BIC (EBIC). The model selected based on the CV-AUC criterion tends to have a larger predictive AUC and smaller classification error than those with tuning parameters selected using the AIC, BIC or EBIC. We illustrate the application of the MCP-logistic regression with the CV-AUC criterion on three microarray datasets from the studies that attempt to identify genes related to cancers. Our simulation studies and data examples demonstrate that the CV-AUC is an attractive method for tuning parameter selection for penalized methods in high-dimensional logistic regression models.
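The idea of choosing a tuning parameter by cross-validated AUC can be sketched as below. This is only an illustration under stated assumptions: the function names are hypothetical, the held-out scores are assumed to come from models fitted elsewhere, and averaging the per-fold AUCs is an assumption because the abstract does not give the exact definition of CV-AUC.

```python
def auc(scores, labels):
    """AUC as the fraction of (positive, negative) pairs ranked correctly,
    counting ties as one half."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def select_tuning_parameter(cv_predictions):
    """cv_predictions maps each candidate tuning value to a list of
    (scores, labels) pairs, one pair per held-out fold.
    Returns the value with the highest mean fold AUC and that AUC."""
    best_value, best_auc = None, -1.0
    for value, folds in cv_predictions.items():
        mean_auc = sum(auc(s, y) for s, y in folds) / len(folds)
        if mean_auc > best_auc:
            best_value, best_auc = value, mean_auc
    return best_value, best_auc
```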
Vaeth, Michael; Skovlund, Eva
2004-06-15
For a given regression problem it is possible to identify a suitably defined equivalent two-sample problem such that the power or sample size obtained for the two-sample problem also applies to the regression problem. For a standard linear regression model the equivalent two-sample problem is easily identified, but for generalized linear models and for Cox regression models the situation is more complicated. An approximately equivalent two-sample problem may, however, also be identified here. In particular, we show that for logistic regression and Cox regression models the equivalent two-sample problem is obtained by selecting two equally sized samples for which the parameters differ by a value equal to the slope times twice the standard deviation of the independent variable and further requiring that the overall expected number of events is unchanged. In a simulation study we examine the validity of this approach to power calculations in logistic regression and Cox regression models. Several different covariate distributions are considered for selected values of the overall response probability and a range of alternatives. For the Cox regression model we consider both constant and non-constant hazard rates. The results show that in general the approach is remarkably accurate even in relatively small samples. Some discrepancies are, however, found in small samples with few events and a highly skewed covariate distribution. Comparison with results based on alternative methods for logistic regression models with a single continuous covariate indicates that the proposed method is at least as good as its competitors. The method is easy to implement and therefore provides a simple way to extend the range of problems that can be covered by the usual formulas for power and sample size determination. Copyright 2004 John Wiley & Sons, Ltd.
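For logistic regression, the equivalent two-sample problem described above can be sketched numerically. The sketch reads "parameters" as log-odds, so the two group probabilities differ on the logit scale by the slope times twice the standard deviation of the covariate, and their average equals the overall response probability. The function names are hypothetical, and the sample-size step uses a textbook normal-approximation formula for two proportions, which is an assumption and not necessarily the formula used in the paper. The bisection is intended for moderate effect sizes.

```python
import math
from statistics import NormalDist


def logit(p):
    return math.log(p / (1.0 - p))


def equivalent_two_sample(slope, sd_x, overall_p):
    """Find (p1, p2) with logit(p2) - logit(p1) = 2 * slope * sd_x
    and (p1 + p2) / 2 = overall_p, by bisection on the smaller one."""
    delta = 2.0 * slope * sd_x
    lo = max(0.0, 2.0 * overall_p - 1.0)
    hi = overall_p
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        gap = logit(2.0 * overall_p - mid) - logit(mid)
        if gap > abs(delta):
            lo = mid   # probabilities too far apart: move toward overall_p
        else:
            hi = mid
    p_low = 0.5 * (lo + hi)
    p_high = 2.0 * overall_p - p_low
    return (p_low, p_high) if delta >= 0 else (p_high, p_low)


def per_group_sample_size(p1, p2, alpha=0.05, power=0.80):
    """Textbook two-sided normal-approximation sample size per group."""
    z_a = NormalDist().inv_cdf(1.0 - alpha / 2.0)
    z_b = NormalDist().inv_cdf(power)
    p_bar = 0.5 * (p1 + p2)
    pooled = math.sqrt(2.0 * p_bar * (1.0 - p_bar))
    separate = math.sqrt(p1 * (1.0 - p1) + p2 * (1.0 - p2))
    return math.ceil(((z_a * pooled + z_b * separate) / (p2 - p1)) ** 2)
```

Under this reading, twice the per-group size would serve as the approximate total sample size for the regression problem.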
Kesselmeier, Miriam; Lorenzo Bermejo, Justo
2017-11-01
Logistic regression is the most common technique used for genetic case-control association studies. A disadvantage of standard maximum likelihood estimators of the genotype relative risk (GRR) is their strong dependence on outlier subjects, for example, patients diagnosed at unusually young age. Robust methods are available to constrain outlier influence, but they are scarcely used in genetic studies. This article provides a non-intimidating introduction to robust logistic regression, and investigates its benefits and limitations in genetic association studies. We applied the bounded Huber and extended the R package 'robustbase' with the re-descending Hampel functions to down-weight outlier influence. Computer simulations were carried out to assess the type I error rate, mean squared error (MSE) and statistical power according to major characteristics of the genetic study and investigated markers. Simulations were complemented with the analysis of real data. Both standard and robust estimation controlled type I error rates. Standard logistic regression showed the highest power but standard GRR estimates also showed the largest bias and MSE, in particular for associated rare and recessive variants. For illustration, a recessive variant with a true GRR=6.32 and a minor allele frequency=0.05 investigated in a 1000 case/1000 control study by standard logistic regression resulted in power=0.60 and MSE=16.5. The corresponding figures for Huber-based estimation were power=0.51 and MSE=0.53. Overall, Hampel- and Huber-based GRR estimates did not differ much. Robust logistic regression may represent a valuable alternative to standard maximum likelihood estimation when the focus lies on risk prediction rather than identification of susceptibility variants. © The Author 2016. Published by Oxford University Press. All rights reserved. For Permissions, please email: journals.permissions@oup.com.
Regression Analysis of Optical Coherence Tomography Disc Variables for Glaucoma Diagnosis.
Richter, Grace M; Zhang, Xinbo; Tan, Ou; Francis, Brian A; Chopra, Vikas; Greenfield, David S; Varma, Rohit; Schuman, Joel S; Huang, David
2016-08-01
To report diagnostic accuracy of optical coherence tomography (OCT) disc variables using both time-domain (TD) and Fourier-domain (FD) OCT, and to improve the use of OCT disc variable measurements for glaucoma diagnosis through regression analyses that adjust for optic disc size and axial length-based magnification error. Observational, cross-sectional. In total, 180 normal eyes of 112 participants and 180 eyes of 138 participants with perimetric glaucoma from the Advanced Imaging for Glaucoma Study. Diagnostic variables evaluated from TD-OCT and FD-OCT were: disc area, rim area, rim volume, optic nerve head volume, vertical cup-to-disc ratio (CDR), and horizontal CDR. These were compared with overall retinal nerve fiber layer thickness and ganglion cell complex. Regression analyses were performed that corrected for optic disc size and axial length. Area-under-receiver-operating curves (AUROC) were used to assess diagnostic accuracy before and after the adjustments. An index based on multiple logistic regression that combined optic disc variables with axial length was also explored with the aim of improving diagnostic accuracy of disc variables. Comparison of diagnostic accuracy of disc variables, as measured by AUROC. The unadjusted disc variables with the highest diagnostic accuracies were: rim volume for TD-OCT (AUROC=0.864) and vertical CDR (AUROC=0.874) for FD-OCT. Magnification correction significantly worsened diagnostic accuracy for rim variables, and while optic disc size adjustments partially restored diagnostic accuracy, the adjusted AUROCs were still lower. Axial length adjustments to disc variables in the form of multiple logistic regression indices led to a slight but insignificant improvement in diagnostic accuracy. Our various regression approaches were not able to significantly improve disc-based OCT glaucoma diagnosis. However, disc rim area and vertical CDR had very high diagnostic accuracy, and these disc variables can serve to complement additional OCT measurements for diagnosis of glaucoma.
Associations between food insecurity and healthy behaviors among Korean adults
Chun, In-Ae; Park, Jong; Ro, Hee-Kyung; Han, Mi-Ah
2015-01-01
BACKGROUND/OBJECTIVES Food insecurity has been suggested as being negatively associated with healthy behaviors and health status. This study was performed to identify the associations between food insecurity and healthy behaviors among Korean adults. SUBJECTS/METHODS The data used were the 2011 Community Health Survey, cross-sectional representative samples of 253 communities in Korea. Food insecurity was defined as when participants reported that their family sometimes or often did not get enough food to eat in the past year. Healthy behaviors were considered as non-smoking, non-high risk drinking, participation in physical activities, eating a regular breakfast, and maintaining a normal weight. Multiple logistic regression and multinomial logistic regression analyses were used to identify the association between food insecurity and healthy behaviors. RESULTS The prevalence of food insecurity was 4.4% (men 3.9%, women 4.9%). Men with food insecurity had lower odds ratios (ORs) for non-smoking, 0.75 (95% CI: 0.68-0.82), participation in physical activities, 0.82 (95% CI: 0.76-0.90), and eating a regular breakfast, 0.66 (95% CI: 0.59-0.74), whereas they had a higher OR for maintaining a normal weight, 1.19 (95% CI: 1.09-1.30), than men with food security. Women with food insecurity had lower ORs for non-smoking, 0.77 (95% CI: 0.66-0.89), and eating a regular breakfast, 0.79 (95% CI: 0.72-0.88). For men, ORs for obesity were 0.78 (95% CI: 0.70-0.87) for overweight and 0.56 (95% CI: 0.39-0.82) for mild obesity. For women, the OR for moderate obesity was 2.04 (95% CI: 1.14-3.63) as compared with normal weight. CONCLUSIONS Food insecurity has a different impact on healthy behaviors. Provision of coping strategies for food insecurity might be critical to improve healthy behaviors among the population. PMID:26244083
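Odds ratios with 95% confidence intervals, as reported in this abstract, are conventionally obtained from a logistic regression coefficient and its standard error. The sketch below shows that standard Wald-type conversion; the function name is hypothetical and no estimates from the survey are reproduced.

```python
import math
from statistics import NormalDist


def odds_ratio_with_ci(beta, std_error, level=0.95):
    """Return (OR, lower, upper) for a logistic regression coefficient,
    using the Wald interval exp(beta +/- z * SE)."""
    z = NormalDist().inv_cdf(0.5 + level / 2.0)
    return (math.exp(beta),
            math.exp(beta - z * std_error),
            math.exp(beta + z * std_error))
```

An interval that excludes 1 corresponds to a coefficient that differs from zero at the chosen level, which is how the lower and higher ORs above are read.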
Esmaeili Nadimi, Ali; Pour Amiri, Farah; Sheikh Fathollahi, Mahmood; Hassanshahi, Gholamhossien; Ahmadi, Zahra; Sayadi, Ahmad Reza
2016-09-15
Approximately 20% to 30% of patients who undergo coronary angiography for assessment of typical cardiac chest pain display microvascular coronary dysfunction (MCD). This study aimed to determine potential relationships between baseline clinical characteristics and likelihood of MCD diagnosis in a large group of patients with stable angina symptoms, positive exercise test and angiographically normal epicardial coronary arteries. This cross-sectional study included 250 Iranian patients with documented evidence of cardiac ischemia on exercise testing, class I or II indication for coronary angiography, and either: (1) angiographically normal coronary arteries and diagnosis of MCD with slow-flow phenomenon, or (2) normal angiogram and no evidence of MCD. All patients completed a questionnaire designed to capture key data including clinical demographics, past medical history, and social factors. Data were evaluated using single and multivariable logistic regression models to identify potential individual patient factors that might help to predict a diagnosis of MCD. 125 (11.2% of total) patients were subsequently diagnosed with MCD. 125 consecutive control subjects were selected for comparison. The mean age was similar among the two groups (52.38 vs. 53.26%, p=ns), but there was a higher proportion of men in the study group compared to control (42.4 vs. 27.2%, p=0.012). No significant relationships were observed between traditional cardiovascular risk factors (diabetes, hypertension, and dyslipidemia) or body mass index (BMI), and likelihood of MCD diagnosis. However, opium addiction was found to be an independent predictor of MCD in single and multivariable logistic regression models (OR=3.575, 95%CI: 1.418-9.016; p=0.0069). We observed a significant relationship between opium addiction and microvascular angina. This novel finding provides a potential mechanistic insight into the pathogenesis of MCD with slow-flow phenomenon. Copyright © 2016 Elsevier Ireland Ltd. All rights reserved.
Obese Chinese Primary-School Students and Low Self-Esteem: A Cross-Sectional Study
Xue-Yan, Zhang; Dong-Mei, Li; Dan-Dan, Xu; Le-Shan, Zhou
2016-01-01
Objectives The aim of this study was to examine several factors related to low self-esteem among obese Chinese primary-school students. Methods A cross-sectional study was conducted between June 2009 and June 2010. A total of 1,410 primary-school students (China grades 4 - 6) in Changsha city were divided into normal weight (n = 1,084), overweight (n = 211), and obese groups (n = 115) according to world health organization (WHO) growth standards for body mass index (BMI). The students were assessed using the self-esteem scale (SES) and a general situation questionnaire. Caregivers completed questionnaires about their child’s weight status. Self-esteem levels were explored; any factors related to low self-esteem were analyzed using logistic regression analysis. Results The average self-esteem score among overweight or obese primary-school students was found to be lower than that of normal-weight students. The proportion of students with low self-esteem in the obese group was more than that in the normal-weight and overweight groups. Multiple logistic regression analysis showed that obesity status (odds ratio [OR], 3.74; 95% confidence interval [CI], 2.25 - 6.22), overweight status (OR, 2.60; 95% CI, 1.71 - 3.95), obesity considered by children’s grandparents (OR, 1.76; 95% CI, 1.05 - 2.96), dissatisfaction with height (OR, 1.55; 95% CI, 1.11 - 2.18), and dissatisfaction with weight (OR, 1.45; 95% CI, 1.05 - 2.01) were the risk factors for low self-esteem for primary-school students, while satisfaction with academic performance was a protective factor (OR, 0.22; 95% CI, 0.07 - 0.71). Conclusions For Chinese primary-school students, low self-esteem is associated with higher weight status and self-perceived body shape and academic performance. In addition, grandparental opinion of a child’s weight also contributes to low self-esteem. PMID:27713806
Lee, Si Hyung; Kim, Gyu Ah; Lee, Wonseok; Bae, Hyoung Won; Seong, Gong Je; Kim, Chan Yun
2017-11-01
To assess the associations between vascular and metabolic comorbidities and the prevalence of open-angle glaucoma (OAG) with low-teen and high-teen intraocular pressure (IOP) in Korea. Cross-sectional data from the Korean National Health and Nutrition Examination Survey from 2008 to 2012 were analysed. Participants diagnosed with OAG with normal IOP were further classified into low-teen IOP (IOP ≤ 15 mmHg) and high-teen IOP (15 mmHg < IOP ≤ 21 mmHg) groups. Using multiple logistic regression analyses, the associations between vascular and metabolic comorbidities and the prevalence of glaucoma were investigated for the low- and high-teen IOP groups. The prevalences of hypertension, hyperlipidemia, ischaemic heart disease, stroke and metabolic syndrome were significantly higher among subjects with low-teen OAG compared with normal subjects, while only the prevalences of hypertension and stroke were higher among subjects with high-teen OAG compared with normal subjects. In multivariate logistic regression models adjusted for confounding factors, low-teen OAG was significantly associated with hypertension (OR, 1.68; 95% CI, 1.30-2.18), hyperlipidemia (OR, 1.49; 95% CI, 1.07-2.08), ischaemic heart disease (OR, 1.83; 95% CI, 1.07-3.11), stroke (OR, 1.91; 95% CI, 1.12-3.25) and metabolic syndrome (OR, 1.46; 95% CI, 1.12-1.90). High-teen OAG was only associated with stroke (OR, 2.58; 95% CI, 1.20-5.53). Various vascular and metabolic comorbidities were significantly associated with low-teen OAG, but not with high-teen OAG. These data support the hypothesis that vascular factors play a more significant role in the pathogenesis of OAG with low-teen baseline IOP. © 2017 Acta Ophthalmologica Scandinavica Foundation. Published by John Wiley & Sons Ltd.
Obese Chinese Primary-School Students and Low Self-Esteem: A Cross-Sectional Study.
Xue-Yan, Zhang; Dong-Mei, Li; Dan-Dan, Xu; Le-Shan, Zhou
2016-08-01
The aim of this study was to examine several factors related to low self-esteem among obese Chinese primary-school students. A cross-sectional study was conducted between June 2009 and June 2010. A total of 1,410 primary-school students (China grades 4 - 6) in Changsha city were divided into normal weight (n = 1,084), overweight (n = 211), and obese groups (n = 115) according to world health organization (WHO) growth standards for body mass index (BMI). The students were assessed using the self-esteem scale (SES) and a general situation questionnaire. Caregivers completed questionnaires about their child's weight status. Self-esteem levels were explored; any factors related to low self-esteem were analyzed using logistic regression analysis. The average self-esteem score among overweight or obese primary-school students was found to be lower than that of normal-weight students. The proportion of students with low self-esteem in the obese group was more than that in the normal-weight and overweight groups. Multiple logistic regression analysis showed that obesity status (odds ratio [OR], 3.74; 95% confidence interval [CI], 2.25 - 6.22), overweight status (OR, 2.60; 95% CI, 1.71 - 3.95), obesity considered by children's grandparents (OR, 1.76; 95% CI, 1.05 - 2.96), dissatisfaction with height (OR, 1.55; 95% CI, 1.11 - 2.18), and dissatisfaction with weight (OR, 1.45; 95% CI, 1.05 - 2.01) were the risk factors for low self-esteem for primary-school students, while satisfaction with academic performance was a protective factor (OR, 0.22; 95% CI, 0.07 - 0.71). For Chinese primary-school students, low self-esteem is associated with higher weight status and self-perceived body shape and academic performance. In addition, grandparental opinion of a child's weight also contributes to low self-esteem.
Sampson, Maureen L; Gounden, Verena; van Deventer, Hendrik E; Remaley, Alan T
2016-02-01
The main drawback of the periodic analysis of quality control (QC) material is that test performance is not monitored in time periods between QC analyses, potentially leading to the reporting of faulty test results. The objective of this study was to develop a patient based QC procedure for the more timely detection of test errors. Results from a Chem-14 panel measured on the Beckman LX20 analyzer were used to develop the model. Each test result was predicted from the other 13 members of the panel by multiple regression, which resulted in correlation coefficients between the predicted and measured result of >0.7 for 8 of the 14 tests. A logistic regression model, which utilized the measured test result, the predicted test result, the day of the week and time of day, was then developed for predicting test errors. The output of the logistic regression was tallied by a daily CUSUM approach and used to predict test errors, with a fixed specificity of 90%. The mean average run length (ARL) before error detection by CUSUM-Logistic Regression (CSLR) was 20 with a mean sensitivity of 97%, which was considerably shorter than the mean ARL of 53 (sensitivity 87.5%) for a simple prediction model that only used the measured result for error detection. A CUSUM-Logistic Regression analysis of patient laboratory data can be an effective approach for the rapid and sensitive detection of clinical laboratory errors. Published by Elsevier Inc.
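The tallying of logistic regression output by a CUSUM can be illustrated with a one-sided CUSUM over predicted error probabilities. This is a sketch under assumptions: the function name, the reference value and the decision limit are hypothetical, and the abstract does not state how the authors parameterized their CUSUM.

```python
def cusum_first_alarm(error_probs, reference=0.5, limit=2.0):
    """Accumulate (probability - reference), never dropping below zero,
    and return the index of the first result at which the running sum
    exceeds the limit, or None if it never does."""
    running = 0.0
    for index, prob in enumerate(error_probs):
        running = max(0.0, running + (prob - reference))
        if running > limit:
            return index
    return None
```

The number of results processed before the first alarm plays the role of the run length discussed in the abstract; a sequence of low probabilities keeps the sum at zero, while a sustained run of high probabilities pushes it over the limit.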
NASA Astrophysics Data System (ADS)
Inoue, N.; Kitada, N.; Irikura, K.
2013-12-01
The probability of surface rupture is important for configuring the seismic source, such as area sources or fault models, in a seismic hazard evaluation. In Japan, Takemura (1998) estimated the probability based on historical earthquake data. Kagawa et al. (2004) evaluated the probability based on a numerical simulation of surface displacements. The estimated probability follows a sigmoid curve and increases between Mj (the local magnitude defined and calculated by the Japan Meteorological Agency) = 6.5 and Mj = 7.0. The probability of surface rupture is also used in probabilistic fault displacement hazard analysis (PFDHA). The probability is determined from collected earthquake catalogs, whose events are classified into two categories: with surface rupture or without surface rupture. Logistic regression is then performed on the classified earthquake data. Youngs et al. (2003), Ross and Moss (2011) and Petersen et al. (2011) present logistic curves of the probability of surface rupture for normal, reverse and strike-slip faults, respectively. Takao et al. (2013) show the logistic curve derived from Japanese earthquake data only. The Japanese probability curve rises sharply within a narrower magnitude range than the other curves. In this study, we estimated the probability of surface rupture by applying logistic analysis to surface displacements derived from a surface displacement calculation. A source fault was defined according to the procedure of Kagawa et al. (2004), which determines a seismic moment from a magnitude and estimates the area of the asperity and the amount of slip. Strike-slip and reverse faults were considered as source faults. We applied the method of Wang et al. (2003) for the calculations. The surface displacements for the defined source faults were calculated by varying the depth of the fault. A surface displacement threshold of 5 cm was used to decide whether or not a rupture reaches the surface.
We applied logistic regression analysis to the calculated displacements, which were classified by the above threshold. The estimated probability curve showed a trend similar to the result of Takao et al. (2013). The probability for reverse faults is larger than that for strike-slip faults. On the other hand, PFDHA results show different trends. The probability for reverse faults at higher magnitude is lower than that for strike-slip and normal faults. Ross and Moss (2011) suggested that the sediment and/or rock overlying the fault compresses, so that the displacement does not fully reach the surface. The numerical theory applied in this study cannot deal with a complex initial situation such as topography.
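The classify-then-fit procedure described above can be sketched as follows. The 5 cm threshold comes from the abstract; the magnitude and displacement pairs, the centering constant, and the gradient-ascent settings are hypothetical and stand in for the study's computed displacements.

```python
from math import exp


def sigmoid(z):
    return 1.0 / (1.0 + exp(-z))


def fit_logistic(x, y, lr=0.5, iters=5000):
    """Fit P(y=1) = sigmoid(b0 + b1*x) by gradient ascent on the mean log-likelihood."""
    b0, b1 = 0.0, 0.0
    n = len(x)
    for _ in range(iters):
        g0 = g1 = 0.0
        for xi, yi in zip(x, y):
            r = yi - sigmoid(b0 + b1 * xi)  # residual on the probability scale
            g0 += r
            g1 += r * xi
        b0 += lr * g0 / n
        b1 += lr * g1 / n
    return b0, b1


# Hypothetical (magnitude, surface displacement in cm) pairs; the same magnitude
# can give different displacements because the fault depth is varied.
samples = [(6.0, 0.5), (6.2, 1.0), (6.4, 6.0), (6.5, 2.0), (6.6, 3.0),
           (6.8, 8.0), (7.0, 4.0), (7.0, 12.0), (7.2, 20.0), (7.4, 35.0)]

THRESHOLD_CM = 5.0  # threshold stated in the abstract
CENTER = 6.7        # centering constant chosen only for numerical convenience

x = [m - CENTER for m, _ in samples]
y = [1 if d >= THRESHOLD_CM else 0 for _, d in samples]
b0, b1 = fit_logistic(x, y)


def rupture_probability(magnitude):
    """Estimated probability that the rupture reaches the surface."""
    return sigmoid(b0 + b1 * (magnitude - CENTER))
```

With data of this kind the fitted slope is positive, so the curve rises with magnitude in the sigmoid shape the abstract describes.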
Kunimatsu-Sanuki, Shiho; Iwase, Aiko; Araie, Makoto; Aoki, Yuki; Hara, Takeshi; Fukuchi, Takeo; Udagawa, Sachiko; Ohkubo, Shinji; Sugiyama, Kazuhisa; Matsumoto, Chota; Nakazawa, Toru; Yamaguchi, Takuhiro; Ono, Hiroshi
2017-07-01
To assess the role of specific visual subfields in collisions with oncoming cars during simulated driving in patients with advanced glaucoma. Normal subjects and patients with glaucoma with mean deviation <-12 dB in both eyes (Humphrey Field Analyzer 24-2 SITA-S program) used a driving simulator (DS; Honda Motor, Tokyo). Two scenarios in which oncoming cars turned right crossing the driver's path were chosen. We compared the binocular integrated visual field (IVF) in the patients who were involved in collisions and those who were not. We performed a multivariate logistic regression analysis; the dependent parameter was collision involvement, and the independent parameters were age, visual acuity and mean sensitivity of the IVF subfields. The study included 43 normal subjects and 100 patients with advanced glaucoma. Five of the 100 patients with advanced glaucoma experienced simulator sickness during the main test and were thus excluded. In total, 95 patients with advanced glaucoma and 43 normal subjects completed the main test of DS. Advanced glaucoma patients had significantly more collisions than normal subjects in one or both DS scenarios (p<0.001). The patients with advanced glaucoma who were involved in collisions were older (p=0.050), had worse visual acuity in the better eye (p<0.001), and had lower mean IVF sensitivity in the inferior hemifield, both 0°-12° and 13°-24°, in comparison with those who were not involved in collisions (p=0.012 and p=0.034). A logistic regression analysis revealed that collision involvement was significantly associated with decreased inferior IVF mean sensitivity from 13° to 24° (p=0.041), in addition to older age and lower visual acuity (p=0.018 and p<0.001). Our data suggest that the inferior hemifield was associated with the incidence of motor vehicle collisions with oncoming cars in patients with advanced glaucoma. Published by the BMJ Publishing Group Limited.
Dodge, Hiroko H; Mattek, Nora; Gregor, Mattie; Bowman, Molly; Seelye, Adriana; Ybarra, Oscar; Asgari, Meysam; Kaye, Jeffrey A
2015-01-01
Detecting early signs of Alzheimer's disease (AD) and mild cognitive impairment (MCI) during the pre-symptomatic phase is becoming increasingly important for cost-effective clinical trials and also for deriving maximum benefit from currently available treatment strategies. However, distinguishing early signs of MCI from normal cognitive aging is difficult. Biomarkers have been extensively examined as early indicators of the pathological process for AD, but assessing these biomarkers is expensive and challenging to apply widely among pre-symptomatic community dwelling older adults. Here we propose assessment of social markers, which could provide an alternative or complementary and ecologically valid strategy for identifying the pre-symptomatic phase leading to MCI and AD. The data came from a larger randomized controlled clinical trial (RCT), where we examined whether daily conversational interactions using remote video telecommunications software could improve cognitive functions of older adult participants. We assessed the proportion of words generated by participants out of total words produced by both participants and staff interviewers using transcribed conversations during the intervention trial as an indicator of how two people (participants and interviewers) interact with each other in one-on-one conversations. We examined whether the proportion differed between those with intact cognition and MCI, using first, generalized estimating equations with the proportion as outcome, and second, logistic regression models with cognitive status as outcome in order to estimate the area under ROC curve (ROC AUC). Compared to those with normal cognitive function, MCI participants generated a greater proportion of words out of the total number of words during the timed conversation sessions (p=0.01). This difference remained after controlling for participant age, gender, interviewer and time of assessment (p=0.03).
The logistic regression models showed the ROC AUC of identifying MCI (vs. normals) was 0.71 (95% Confidence Interval: 0.54 - 0.89) when average proportion of word counts spoken by subjects was included univariately into the model. An ecologically valid social marker such as the proportion of spoken words produced during spontaneous conversations may be sensitive to transitions from normal cognition to MCI.
Shi, Xiaolei; Peng, Yonghan; Li, Ling; Li, Xiao; Wang, Qi; Zhang, Wei; Dong, Hao; Shen, Rong; Lu, Chaoyue; Liu, Min; Gao, Xiaofeng; Sun, Yinghao
2018-05-26
To evaluate renal function changes and risk factors for acute kidney injury (AKI) after percutaneous nephrolithotomy (PCNL) in patients with renal calculi with a solitary kidney (SK) or normal bilateral kidneys (BKs). Between 2012 and 2016, 859 patients undergoing PCNL were retrospectively reviewed at Changhai Hospital. In all, 53 patients with a SK were paired with 53 patients with normal BKs via a propensity score-matched analysis. Data for the following variables were collected: age, sex, body mass index, stone size, distribution, operation time, perioperative outcomes, and complications. The complications were graded according to the modified Clavien-Dindo system. Univariable and multivariable logistic regression models were constructed to evaluate risk factors for predicting AKI. The SK and BKs groups were comparable in terms of age, sex ratio, stone size, stone location distribution, comorbidities, and American Society of Anesthesiologists Physical Status classification. The initial and final stone-free rates were comparable between the SK and BKs groups (initial: 52.83% vs 58.49%, P = 0.696; final: 84.91% vs 92.45%, P = 0.359). There was no difference between the two groups for complications, according to the Clavien-Dindo grades. The estimated glomerular filtration rate (eGFR) increased dramatically after the stone burden was immediately relieved, and during the 6-month follow-up eGFR was lower in the SK group compared with the BKs group. We found a modest improvement in renal function immediately after PCNL in the BKs group, and renal function gain was delayed in the SK group. Through logistic regression analysis, we discovered that a SK, preoperative creatinine and diabetes were independent risk factors for predicting AKI after PCNL. Considering the overall complication rates, PCNL is generally a safe procedure for treating renal calculi amongst patients with a SK or normal BKs. 
Follow-up renal function analysis showed a modest improvement in patients of both groups. Compared to patients with normal BKs, patients with a SK were more likely to develop AKI after PCNL. © 2018 The Authors BJU International © 2018 BJU International Published by John Wiley & Sons Ltd.
Nonconvex Sparse Logistic Regression With Weakly Convex Regularization
NASA Astrophysics Data System (ADS)
Shen, Xinyue; Gu, Yuantao
2018-06-01
In this work we propose to fit a sparse logistic regression model by a weakly convex regularized nonconvex optimization problem. The idea is based on the finding that a weakly convex function as an approximation of the $\ell_0$ pseudo norm is able to better induce sparsity than the commonly used $\ell_1$ norm. For a class of weakly convex sparsity inducing functions, we prove the nonconvexity of the corresponding sparse logistic regression problem, and study its local optimality conditions and the choice of the regularization parameter to exclude trivial solutions. Despite the nonconvexity, a method based on proximal gradient descent is used to solve the general weakly convex sparse logistic regression, and its convergence behavior is studied theoretically. Then the general framework is applied to a specific weakly convex function, and a necessary and sufficient local optimality condition is provided. The solution method is instantiated in this case as an iterative firm-shrinkage algorithm, and its effectiveness is demonstrated in numerical experiments by both randomly generated and real datasets.
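To make the firm-shrinkage iteration more concrete, here is a minimal sketch of one proximal-gradient-style step for logistic loss, using the standard firm thresholding operator in place of soft thresholding. The thresholds `lam` and `mu`, the step size, and the function names are hypothetical; the abstract does not say how the authors tie these quantities to their regularization parameter.

```python
from math import exp


def firm_shrink(x, lam, mu):
    """Firm thresholding with thresholds 0 < lam < mu:
    zero below lam, identity above mu, linear interpolation in between."""
    a = abs(x)
    if a <= lam:
        return 0.0
    if a <= mu:
        sign = 1.0 if x > 0 else -1.0
        return sign * mu * (a - lam) / (mu - lam)
    return x


def sigmoid(z):
    return 1.0 / (1.0 + exp(-z))


def firm_shrinkage_step(w, X, y, step, lam, mu):
    """One iteration: gradient step on the mean logistic loss, then firm shrinkage."""
    n = len(X)
    grad = [0.0] * len(w)
    for xi, yi in zip(X, y):
        r = sigmoid(sum(wj * xj for wj, xj in zip(w, xi))) - yi
        for j, xj in enumerate(xi):
            grad[j] += r * xj / n
    return [firm_shrink(wj - step * gj, lam, mu) for wj, gj in zip(w, grad)]


# Toy data (hypothetical): only the first feature carries signal.
X = [[1.0, 0.0], [-1.0, 0.0]]
y = [1, 0]
w_next = firm_shrinkage_step([0.0, 0.0], X, y, step=1.0, lam=0.1, mu=0.4)
```

Unlike soft thresholding, the firm operator leaves large coefficients unshrunk, which is one common motivation for nonconvex penalties of this type.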
A comparative study on entrepreneurial attitudes modeled with logistic regression and Bayes nets.
López Puga, Jorge; García García, Juan
2012-11-01
Entrepreneurship research is receiving increasing attention in our context, as entrepreneurs are key social agents involved in economic development. We compare the success of the dichotomic logistic regression model and the Bayes simple classifier to predict entrepreneurship, after manipulating the percentage of missing data and the level of categorization in predictors. A sample of undergraduate university students (N = 1230) completed five scales (motivation, attitude towards business creation, obstacles, deficiencies, and training needs) and we found that each of them predicted different aspects of the tendency to business creation. Additionally, our results show that the receiver operating characteristic (ROC) curve is affected by the rate of missing data in both techniques, but logistic regression seems to be more vulnerable when faced with missing data, whereas Bayes nets underperform slightly when categorization has been manipulated. Our study sheds light on the potential entrepreneur profile and we propose to use Bayesian networks as an additional alternative to overcome the weaknesses of logistic regression when missing data are present in applied research.
Campos-Filho, N; Franco, E L
1989-02-01
A frequent procedure in matched case-control studies is to report results from the multivariate unmatched analyses if they do not differ substantially from the ones obtained after conditioning on the matching variables. Although conceptually simple, this rule requires that an extensive series of logistic regression models be evaluated by both the conditional and unconditional maximum likelihood methods. Most computer programs for logistic regression employ only one maximum likelihood method, which requires that the analyses be performed in separate steps. This paper describes a Pascal microcomputer (IBM PC) program that performs multiple logistic regression by both maximum likelihood estimation methods, which obviates the need for switching between programs to obtain relative risk estimates from both matched and unmatched analyses. The program calculates most standard statistics and allows factoring of categorical or continuous variables by two distinct methods of contrast. A built-in, descriptive statistics option allows the user to inspect the distribution of cases and controls across categories of any given variable.
Comparison of cranial sex determination by discriminant analysis and logistic regression.
Amores-Ampuero, Anabel; Alemán, Inmaculada
2016-04-05
Various methods have been proposed for estimating dimorphism. The objective of this study was to compare sex determination results from cranial measurements using discriminant analysis or logistic regression. The study sample comprised 130 individuals (70 males) of known sex, age, and cause of death from San José cemetery in Granada (Spain). Measurements of 19 neurocranial dimensions and 11 splanchnocranial dimensions were subjected to discriminant analysis and logistic regression, and the percentages of correct classification were compared between the sex functions obtained with each method. The discriminant capacity of the selected variables was evaluated with a cross-validation procedure. The percentage accuracy with discriminant analysis was 78.2% for the neurocranium (82.4% in females and 74.6% in males) and 73.7% for the splanchnocranium (79.6% in females and 68.8% in males). These percentages were higher with logistic regression analysis: 85.7% for the neurocranium (in both sexes) and 94.1% for the splanchnocranium (100% in females and 91.7% in males).
Castelo, Paula Midori; Gavião, Maria Beatriz Duarte; Pereira, Luciano José; Bonjardim, Leonardo Rigoldi
2010-01-01
The maintenance of normal conditions of the masticatory function is determinant for the correct growth and development of its structures. Thus, the aims of this study were to evaluate the influence of sucking habits on the presence of crossbite and its relationship with maximal bite force, facial morphology and body variables in 67 children of both genders (3.5-7 years) with primary or early mixed dentition. The children were divided into four groups: primary-normocclusion (PN, n=19), primary-crossbite (PC, n=19), mixed-normocclusion (MN, n=13), and mixed-crossbite (MC, n=16). Bite force was measured with a pressurized tube, and facial morphology was determined by standardized frontal photographs: AFH (anterior face height) and BFW (bizygomatic facial width). It was observed that the MC group showed lower bite force than MN, and AFH/BFW was significantly smaller in PN than PC (t-test). Weight and height were only significantly correlated with bite force in the PC group (Pearson's correlation test). In the primary dentition, AFH/BFW and breast-feeding (at least six months) were positively and negatively associated with crossbite, respectively (multiple logistic regression). In the mixed dentition, breast-feeding and bite force showed negative associations with crossbite (univariate regression), while nonnutritive sucking (up to 3 years) associated significantly with crossbite in all groups (multiple logistic regression). In the studied sample, sucking habits played an important role in the etiology of crossbite, which was associated with lower bite force and long-face tendency.
Hill, Andrew; Loh, Po-Ru; Bharadwaj, Ragu B.; Pons, Pascal; Shang, Jingbo; Guinan, Eva; Lakhani, Karim; Kilty, Iain
2017-01-01
Background: The association of differing genotypes with disease-related phenotypic traits offers great potential to both help identify new therapeutic targets and support stratification of patients who would gain the greatest benefit from specific drug classes. Development of low-cost genotyping and sequencing has made collecting large-scale genotyping data routine in population and therapeutic intervention studies. In addition, a range of new technologies is being used to capture numerous new and complex phenotypic descriptors. As a result, genotype and phenotype datasets have grown exponentially. Genome-wide association studies associate genotypes and phenotypes using methods such as logistic regression. As existing tools for association analysis limit the efficiency by which value can be extracted from increasing volumes of data, there is a pressing need for new software tools that can accelerate association analyses on large genotype-phenotype datasets. Results: Using open innovation (OI) and contest-based crowdsourcing, the logistic regression analysis in a leading, community-standard genetics software package (PLINK 1.07) was substantially accelerated. OI allowed us to do this in <6 months by providing rapid access to highly skilled programmers with specialized, difficult-to-find skill sets. Through a crowd-based contest a combination of computational, numeric, and algorithmic approaches was identified that accelerated the logistic regression in PLINK 1.07 by 18- to 45-fold. Combining contest-derived logistic regression code with coarse-grained parallelization, multithreading, and associated changes to data initialization code further developed through distributed innovation, we achieved an end-to-end speedup of 591-fold for a data set size of 6678 subjects by 645 863 variants, compared to PLINK 1.07's logistic regression. This represents a reduction in run time from 4.8 hours to 29 seconds.
Accelerated logistic regression code developed in this project has been incorporated into the PLINK2 project. Conclusions: Using iterative competition-based OI, we have developed a new, faster implementation of logistic regression for genome-wide association studies analysis. We present lessons learned and recommendations on running a successful OI process for bioinformatics. PMID:28327993
Hill, Andrew; Loh, Po-Ru; Bharadwaj, Ragu B; Pons, Pascal; Shang, Jingbo; Guinan, Eva; Lakhani, Karim; Kilty, Iain; Jelinsky, Scott A
2017-05-01
The association of differing genotypes with disease-related phenotypic traits offers great potential to both help identify new therapeutic targets and support stratification of patients who would gain the greatest benefit from specific drug classes. Development of low-cost genotyping and sequencing has made collecting large-scale genotyping data routine in population and therapeutic intervention studies. In addition, a range of new technologies is being used to capture numerous new and complex phenotypic descriptors. As a result, genotype and phenotype datasets have grown exponentially. Genome-wide association studies associate genotypes and phenotypes using methods such as logistic regression. As existing tools for association analysis limit the efficiency by which value can be extracted from increasing volumes of data, there is a pressing need for new software tools that can accelerate association analyses on large genotype-phenotype datasets. Using open innovation (OI) and contest-based crowdsourcing, the logistic regression analysis in a leading, community-standard genetics software package (PLINK 1.07) was substantially accelerated. OI allowed us to do this in <6 months by providing rapid access to highly skilled programmers with specialized, difficult-to-find skill sets. Through a crowd-based contest a combination of computational, numeric, and algorithmic approaches was identified that accelerated the logistic regression in PLINK 1.07 by 18- to 45-fold. Combining contest-derived logistic regression code with coarse-grained parallelization, multithreading, and associated changes to data initialization code further developed through distributed innovation, we achieved an end-to-end speedup of 591-fold for a data set size of 6678 subjects by 645 863 variants, compared to PLINK 1.07's logistic regression. This represents a reduction in run time from 4.8 hours to 29 seconds. 
Accelerated logistic regression code developed in this project has been incorporated into the PLINK2 project. Using iterative competition-based OI, we have developed a new, faster implementation of logistic regression for genome-wide association studies analysis. We present lessons learned and recommendations on running a successful OI process for bioinformatics. © The Author 2017. Published by Oxford University Press.
Lin, Chao-Cheng; Bai, Ya-Mei; Chen, Jen-Yeu; Hwang, Tzung-Jeng; Chen, Tzu-Ting; Chiu, Hung-Wen; Li, Yu-Chuan
2010-03-01
Metabolic syndrome (MetS) is an important side effect of second-generation antipsychotics (SGAs). However, many SGA-treated patients with MetS remain undetected. In this study, we trained and validated artificial neural network (ANN) and multiple logistic regression models without biochemical parameters to rapidly identify MetS in patients with SGA treatment. A total of 383 patients with a diagnosis of schizophrenia or schizoaffective disorder (DSM-IV criteria) with SGA treatment for more than 6 months were investigated to determine whether they met the MetS criteria according to the International Diabetes Federation. The data for these patients were collected between March 2005 and September 2005. The input variables of ANN and logistic regression were limited to demographic and anthropometric data only. All models were trained by randomly selecting two-thirds of the patient data and were internally validated with the remaining one-third of the data. The models were then externally validated with data from 69 patients from another hospital, collected between March 2008 and June 2008. The area under the receiver operating characteristic curve (AUC) was used to measure the performance of all models. Both the final ANN and logistic regression models had high accuracy (88.3% vs 83.6%), sensitivity (93.1% vs 86.2%), and specificity (86.9% vs 83.8%) to identify MetS in the internal validation set. The mean +/- SD AUC was high for both the ANN and logistic regression models (0.934 +/- 0.033 vs 0.922 +/- 0.035, P = .63). During external validation, high AUC was still obtained for both models. Waist circumference and diastolic blood pressure were the common variables that were left in the final ANN and logistic regression models. Our study developed accurate ANN and logistic regression models to detect MetS in patients with SGA treatment. The models are likely to provide a noninvasive tool for large-scale screening of MetS in this group of patients. 
(c) 2010 Physicians Postgraduate Press, Inc.
Bayesian logistic regression in detection of gene-steroid interaction for cancer at PDLIM5 locus.
Wang, Ke-Sheng; Owusu, Daniel; Pan, Yue; Xie, Changchun
2016-06-01
The PDZ and LIM domain 5 (PDLIM5) gene may play a role in cancer, bipolar disorder, major depression, alcohol dependence and schizophrenia; however, little is known about the interaction effect of steroid and PDLIM5 gene on cancer. This study examined 47 single-nucleotide polymorphisms (SNPs) within the PDLIM5 gene in the Marshfield sample with 716 cancer patients (any diagnosed cancer, excluding minor skin cancer) and 2848 noncancer controls. Multiple logistic regression model in PLINK software was used to examine the association of each SNP with cancer. Bayesian logistic regression in PROC GENMOD in SAS statistical software, ver. 9.4 was used to detect gene-steroid interactions influencing cancer. Single marker analysis using PLINK identified 12 SNPs associated with cancer (P < 0.05); especially, SNP rs6532496 revealed the strongest association with cancer (P = 6.84 × 10⁻³); while the next best signal was rs951613 (P = 7.46 × 10⁻³). Classic logistic regression in PROC GENMOD showed that both rs6532496 and rs951613 revealed strong gene-steroid interaction effects (OR=2.18, 95% CI=1.31-3.63 with P = 2.9 × 10⁻³ for rs6532496 and OR=2.07, 95% CI=1.24-3.45 with P = 5.43 × 10⁻³ for rs951613, respectively). Results from Bayesian logistic regression showed stronger interaction effects (OR=2.26, 95% CI=1.2-3.38 for rs6532496 and OR=2.14, 95% CI=1.14-3.2 for rs951613, respectively). All the 12 SNPs associated with cancer revealed significant gene-steroid interaction effects (P < 0.05); whereas 13 SNPs showed gene-steroid interaction effects without main effect on cancer. SNP rs4634230 revealed the strongest gene-steroid interaction effect (OR=2.49, 95% CI=1.5-4.13 with P = 4.0 × 10⁻⁴ based on the classic logistic regression and OR=2.59, 95% CI=1.4-3.97 from Bayesian logistic regression; respectively). This study provides evidence of common genetic variants within the PDLIM5 gene and interactions between PDLIM5 gene polymorphisms and steroid use influencing cancer.
Deletion Diagnostics for Alternating Logistic Regressions
Preisser, John S.; By, Kunthel; Perin, Jamie; Qaqish, Bahjat F.
2013-01-01
Deletion diagnostics are introduced for the regression analysis of clustered binary outcomes estimated with alternating logistic regressions, an implementation of generalized estimating equations (GEE) that estimates regression coefficients in a marginal mean model and in a model for the intracluster association given by the log odds ratio. The diagnostics are developed within an estimating equations framework that recasts the estimating functions for association parameters based upon conditional residuals into equivalent functions based upon marginal residuals. Extensions of earlier work on GEE diagnostics follow directly, including computational formulae for one-step deletion diagnostics that measure the influence of a cluster of observations on the estimated regression parameters and on the overall marginal mean or association model fit. The diagnostic formulae are evaluated with simulation studies and with an application concerning an assessment of factors associated with health maintenance visits in primary care medical practices. The application and the simulations demonstrate that the proposed cluster-deletion diagnostics for alternating logistic regressions are good approximations of their exact fully iterated counterparts. PMID:22777960
Knol, Mirjam J; van der Tweel, Ingeborg; Grobbee, Diederick E; Numans, Mattijs E; Geerlings, Mirjam I
2007-10-01
To determine the presence of interaction in epidemiologic research, typically a product term is added to the regression model. In linear regression, the regression coefficient of the product term reflects interaction as departure from additivity. However, in logistic regression it refers to interaction as departure from multiplicativity. Rothman has argued that interaction estimated as departure from additivity better reflects biologic interaction. So far, literature on estimating interaction on an additive scale using logistic regression only focused on dichotomous determinants. The objective of the present study was to provide the methods to estimate interaction between continuous determinants and to illustrate these methods with a clinical example. From the existing literature we derived the formulas to quantify interaction as departure from additivity between one continuous and one dichotomous determinant and between two continuous determinants using logistic regression. Bootstrapping was used to calculate the corresponding confidence intervals. To illustrate the theory with an empirical example, data from the Utrecht Health Project were used, with age and body mass index as risk factors for elevated diastolic blood pressure. The methods and formulas presented in this article are intended to assist epidemiologists to calculate interaction on an additive scale between two variables on a certain outcome. The proposed methods are included in a spreadsheet which is freely available at: http://www.juliuscenter.nl/additive-interaction.xls.
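As a point of reference for the additive-scale measure discussed above, the following sketch computes the relative excess risk due to interaction (RERI) for the familiar case of two dichotomous determinants, where the logistic model is logit(p) = b0 + b1*A + b2*B + b3*A*B. The coefficient values are hypothetical, and the article's formulas for continuous determinants are not reproduced here because the abstract does not give them.

```python
from math import exp, log


def reri(b1, b2, b3):
    """RERI for two dichotomous determinants A and B.

    OR11 = exp(b1 + b2 + b3), OR10 = exp(b1), OR01 = exp(b2);
    RERI = OR11 - OR10 - OR01 + 1, and RERI = 0 means exact additivity.
    """
    return exp(b1 + b2 + b3) - exp(b1) - exp(b2) + 1.0


# Hypothetical coefficients: OR10 = 2, OR01 = 3, no product term (b3 = 0).
# The joint odds ratio is then 2 * 3 = 6: multiplicative, hence more than additive.
reri_multiplicative = reri(log(2.0), log(3.0), 0.0)

# A negative product term chosen so that OR11 = 2 + 3 - 1 = 4 gives exact additivity.
reri_additive = reri(log(2.0), log(3.0), log(4.0 / 6.0))
```

The first case shows why a zero product term in logistic regression does not imply absence of interaction on the additive scale, which is the distinction the abstract draws.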
Determinants of spirometric abnormalities among silicotic patients in Hong Kong.
Leung, Chi C; Chang, Kwok C; Law, Wing S; Yew, Wing W; Tam, Cheuk M; Chan, Chi K; Wong, Man Y
2005-09-01
Silicosis is the second commonest notified occupational disease in Hong Kong. To characterize the determinants of spirometric abnormalities in silicosis. The spirometric patterns of consecutive silicotic patients on confirmation by the Pneumoconiosis Medical Board from 1991 to 2002 were correlated with demographic characteristics, occupational history, smoking history, tuberculosis (TB) history and radiographic features by univariate and multiple regression analyses. Of 1576 silicotic patients included, 55.6% showed normal spirometry, 28.5% normal forced vital capacity (FVC ≥ 80% predicted) but reduced forced expiratory ratio (FER < 70%), 7.6% reduced FVC but normal FER, and 8.4% reduced both FVC and FER. Age, ever-smoking, cigarette pack-years, industry, job type, history of TB, size of lung nodules and progressive massive fibrosis (PMF) were all significantly associated with airflow limitation on univariate analysis (all P<0.05), while sex and profusion of nodules were not. Only age, cigarette pack-years, history of TB, size of lung nodules and PMF remained as significant independent predictors of airflow obstruction in multiple logistic regression analysis. After controlling for airflow obstruction, only shorter exposure duration, history of TB and profusion of nodules were significant independent predictors of reduced FVC. As well as age, history of TB, cigarette pack-years, PMF and nodule size contributed comparable effects to airflow obstruction in multiple linear regression analyses, while profusion of nodules was the strongest factor for reduced vital capacity. In an occupational compensation setting, disease indices and history of tuberculosis are independent predictors of both airflow obstruction and reduced vital capacity for silicotic patients.
ERIC Educational Resources Information Center
Osborne, Jason W.
2012-01-01
Logistic regression is slowly gaining acceptance in the social sciences, and fills an important niche in the researcher's toolkit: being able to predict important outcomes that are not continuous in nature. While OLS regression is a valuable tool, it cannot routinely be used to predict outcomes that are binary or categorical in nature. These…
Intermediate and advanced topics in multilevel logistic regression analysis
Merlo, Juan
2017-01-01
Multilevel data occur frequently in health services, population and public health, and epidemiologic research. In such research, binary outcomes are common. Multilevel logistic regression models allow one to account for the clustering of subjects within clusters of higher‐level units when estimating the effect of subject and cluster characteristics on subject outcomes. A search of the PubMed database demonstrated that the use of multilevel or hierarchical regression models is increasing rapidly. However, our impression is that many analysts simply use multilevel regression models to account for the nuisance of within‐cluster homogeneity that is induced by clustering. In this article, we describe a suite of analyses that can complement the fitting of multilevel logistic regression models. These ancillary analyses permit analysts to estimate the marginal or population‐average effect of covariates measured at the subject and cluster level, in contrast to the within‐cluster or cluster‐specific effects arising from the original multilevel logistic regression model. We describe the interval odds ratio and the proportion of opposed odds ratios, which are summary measures of effect for cluster‐level covariates. We describe the variance partition coefficient and the median odds ratio which are measures of components of variance and heterogeneity in outcomes. These measures allow one to quantify the magnitude of the general contextual effect. We describe an R² measure that allows analysts to quantify the proportion of variation explained by different multilevel logistic regression models. We illustrate the application and interpretation of these measures by analyzing mortality in patients hospitalized with a diagnosis of acute myocardial infarction. © 2017 The Authors. Statistics in Medicine published by John Wiley & Sons Ltd. PMID:28543517
Intermediate and advanced topics in multilevel logistic regression analysis.
Austin, Peter C; Merlo, Juan
2017-09-10
Multilevel data occur frequently in health services, population and public health, and epidemiologic research. In such research, binary outcomes are common. Multilevel logistic regression models allow one to account for the clustering of subjects within clusters of higher-level units when estimating the effect of subject and cluster characteristics on subject outcomes. A search of the PubMed database demonstrated that the use of multilevel or hierarchical regression models is increasing rapidly. However, our impression is that many analysts simply use multilevel regression models to account for the nuisance of within-cluster homogeneity that is induced by clustering. In this article, we describe a suite of analyses that can complement the fitting of multilevel logistic regression models. These ancillary analyses permit analysts to estimate the marginal or population-average effect of covariates measured at the subject and cluster level, in contrast to the within-cluster or cluster-specific effects arising from the original multilevel logistic regression model. We describe the interval odds ratio and the proportion of opposed odds ratios, which are summary measures of effect for cluster-level covariates. We describe the variance partition coefficient and the median odds ratio, which are measures of components of variance and heterogeneity in outcomes. These measures allow one to quantify the magnitude of the general contextual effect. We describe an R² measure that allows analysts to quantify the proportion of variation explained by different multilevel logistic regression models. We illustrate the application and interpretation of these measures by analyzing mortality in patients hospitalized with a diagnosis of acute myocardial infarction. © 2017 The Authors. Statistics in Medicine published by John Wiley & Sons Ltd.
Liang, H; Zhang, W Y; Li, X T
2017-03-25
Objective: To investigate the influence of gestational weight gain (GWG) on the incidence of macrosomia, and to establish reference ranges of GWG based on the incidence of macrosomia. Methods: A multicenter, cross-sectional study was conducted. In total, 112 485 women were recruited from 39 hospitals in 14 provinces in China, of whom 61 149 were eligible, with singleton pregnancies and non-premature deliveries. The associations of pre-pregnancy body mass index (BMI), GWG, newborn gender and gestational diabetes with macrosomia were analyzed with logistic regression. The normal GWG ranges were calculated in all maternal BMI subgroups, with the normal incidence of macrosomia set at 5.0% to 10.0%. Results: In this study, the incidence of macrosomia was 7.46% (4 563/61 149). Macrosomia was significantly and positively related to maternal height, delivery week, pre-pregnancy BMI, GWG, gestational diabetes, primiparity, and male babies (P<0.05) in both unadjusted and adjusted logistic regression. The normal ranges of GWG were 20.0-25.0, 10.0-20.0, 0-10.0 and 0-5.0 kg in the subgroups of underweight (pre-pregnancy BMI <18.5 kg/m²), normal (18.5-24.9 kg/m²), overweight (25.0-29.9 kg/m²) and obese (≥30.0 kg/m²), respectively. Conclusion: Reference ranges of GWG in China based on the incidence of macrosomia are established.
Speech prosody impairment predicts cognitive decline in Parkinson's disease.
Rektorova, Irena; Mekyska, Jiri; Janousova, Eva; Kostalova, Milena; Eliasova, Ilona; Mrackova, Martina; Berankova, Dagmar; Necasova, Tereza; Smekal, Zdenek; Marecek, Radek
2016-08-01
Impairment of speech prosody is characteristic for Parkinson's disease (PD) and does not respond well to dopaminergic treatment. We assessed whether baseline acoustic parameters, alone or in combination with other predominantly non-dopaminergic symptoms may predict global cognitive decline as measured by the Addenbrooke's cognitive examination (ACE-R) and/or worsening of cognitive status as assessed by a detailed neuropsychological examination. Forty-four consecutive non-depressed PD patients underwent clinical and cognitive testing, and acoustic voice analysis at baseline and at the two-year follow-up. Influence of speech and other clinical parameters on worsening of the ACE-R and of the cognitive status was analyzed using linear and logistic regression. The cognitive status (classified as normal cognition, mild cognitive impairment and dementia) deteriorated in 25% of patients during the follow-up. The multivariate linear regression model consisted of the variation in range of the fundamental voice frequency (F0VR) and the REM Sleep Behavioral Disorder Screening Questionnaire (RBDSQ). These parameters explained 37.2% of the variability of the change in ACE-R. The most significant predictors in the univariate logistic regression were the speech index of rhythmicity (SPIR; p = 0.012), disease duration (p = 0.019), and the RBDSQ (p = 0.032). The multivariate regression analysis revealed that SPIR alone led to 73.2% accuracy in predicting a change in cognitive status. Combining SPIR with RBDSQ improved the prediction accuracy of SPIR alone by 7.3%. Impairment of speech prosody together with symptoms of RBD predicted rapid cognitive decline and worsening of PD cognitive status during a two-year period. Copyright © 2016 Elsevier Ltd. All rights reserved.
Predicting Social Trust with Binary Logistic Regression
ERIC Educational Resources Information Center
Adwere-Boamah, Joseph; Hufstedler, Shirley
2015-01-01
This study used binary logistic regression to predict social trust with five demographic variables from a national sample of adult individuals who participated in The General Social Survey (GSS) in 2012. The five predictor variables were respondents' highest degree earned, race, sex, general happiness and the importance of personally assisting…
Effect of folic acid on appetite in children: ordinal logistic and fuzzy logistic regressions.
Namdari, Mahshid; Abadi, Alireza; Taheri, S Mahmoud; Rezaei, Mansour; Kalantari, Naser; Omidvar, Nasrin
2014-03-01
Reduced appetite and low food intake are often a concern in preschool children, since it can lead to malnutrition, a leading cause of impaired growth and mortality in childhood. It is occasionally considered that folic acid has a positive effect on appetite enhancement and consequently growth in children. The aim of this study was to assess the effect of folic acid on the appetite of preschool children 3 to 6 y old. The study sample included 127 children ages 3 to 6 who were randomly selected from 20 preschools in the city of Tehran in 2011. Since appetite was measured by linguistic terms, a fuzzy logistic regression was applied for modeling. The obtained results were compared with a statistical ordinal logistic model. After controlling for the potential confounders, in a statistical ordinal logistic model, serum folate showed a significantly positive effect on appetite. A small but positive effect of folate was detected by fuzzy logistic regression. Based on fuzzy regression, the risk for poor appetite in preschool children was related to the employment status of their mothers. In this study, a positive association was detected between the levels of serum folate and improved appetite. For further investigation, a randomized controlled, double-blind clinical trial could be helpful to address causality. Copyright © 2014 Elsevier Inc. All rights reserved.
Trauma injury in adult underweight patients
Hsieh, Ching-Hua; Lai, Wei-Hung; Wu, Shao-Chun; Chen, Yi-Chun; Kuo, Pao-Jen; Hsu, Shiun-Yuan; Hsieh, Hsiao-Yun
2017-01-01
Abstract The aim of this study was to investigate and compare the injury characteristics, severity, and outcome between underweight and normal-weight patients hospitalized for the treatment of all kinds of trauma injury. This study was based at a level I trauma center in Taiwan. The detailed data of 640 underweight adult trauma patients with a body mass index (BMI) of <18.5 kg/m² and 6497 normal-weight adult patients (25 > BMI ≥ 18.5 kg/m²) were retrieved from the Trauma Registry System between January 1, 2009, and December 31, 2014. Pearson's chi-square test, Fisher's exact test, and independent Student's t-test were performed to compare the differences. Propensity score matching with logistic regression was used to evaluate the effect of underweight on mortality. Underweight patients presented a different bodily injury pattern and a significantly higher rate of admittance to the intensive care unit (ICU) than did normal-weight patients; however, no significant differences in the Glasgow Coma Scale (GCS) score, injury severity score (ISS), in-hospital mortality, and hospital length of stay were found between the two groups. However, further analysis of the patients stratified by two major injury mechanisms (motorcycle accident and fall injury) revealed that underweight patients had significantly lower GCS scores (13.8 ± 3.0 vs 14.5 ± 2.0, P = 0.020), but higher ISS (10.1 ± 6.9 vs 8.4 ± 5.9, P = 0.005), in-hospital mortality (odds ratio, 4.4; 95% confidence interval, 1.69–11.35; P = 0.006), and ICU admittance rate (24.1% vs 14.3%, P = 0.007) than normal-weight patients in the fall accident group, but not in the motorcycle accident group. However, after propensity score matching, logistic regression analysis of well-matched pairs of patients with either all trauma, motorcycle accident, or fall injury did not show a significant influence of underweight on mortality.
Exploratory data analysis revealed that underweight patients presented a different bodily injury pattern from that of normal-weight patients, specifically a higher incidence of pneumothorax in those with penetrating injuries and of femoral fracture in those with struck on/against injuries; however, the injury severity and outcome of underweight patients varied depending on the injury mechanism. PMID:28272241
Kasztelan-Szczerbinska, Beata; Slomka, Maria; Celinski, Krzysztof; Szczerbinski, Mariusz
2013-01-01
Determination of risk factors relevant to 90-day prognosis in AH. Comparison of the conventional prognostic models such as Maddrey's modified discriminant function (mDF) and Child-Pugh-Turcotte (CPT) score with newer ones: the Glasgow Alcoholic Hepatitis Score (GAHS); Age, Bilirubin, INR, Creatinine (ABIC) score, Model for End-Stage Liver Disease (MELD), and MELD-Na in the death prediction. The clinical and laboratory variables obtained at admission were assessed. The mDF, CPT, GAHS, ABIC, MELD, and MELD-Na scores' different areas under the curve (AUCs) and the best threshold values were compared. Logistic regression was used to assess predictors of the 90-day outcome. One hundred sixteen pts fulfilled the inclusion criteria. Twenty (17.4%) pts died and one underwent orthotopic liver transplantation (OLT) within 90 days of follow-up. No statistically significant differences in the models' performances were found. Multivariate logistic regression identified CPT score, alkaline phosphatase (AP) level higher than 1.5 times the upper limit of normal (ULN), and corticosteroids (CS) nonresponse as independent predictors of mortality. The CPT score, AP > 1.5 ULN, and the CS nonresponse had an independent impact on the 90-day survival in AH. Accuracy of all studied scoring systems was comparable.
Marqueta de Salas, María; Rodríguez Gómez, Lorena; Enjuto Martínez, Diego; Juárez Soto, José Juan; Martín-Ramiro, José Javier
2017-03-01
Obesity is a public health problem worldwide. The aim of the present study was to determine the association of the type of working schedule and the hours of sleep per day with obesity and overweight. This was a cross-sectional study of the 2012 National Health Survey. We conducted a multinomial logistic regression analysis and estimated the odds of obesity and overweight relative to normal weight in relation to the type of working schedule and sleeping hours. The prevalence of obesity was 17.50% among those who worked at night and 17.92% among those with irregular work schedules. The prevalence of overweight was 40.81% among those who worked part-time and 39.17% among night workers. Among those who slept less than six hours a day, the prevalence of obesity and overweight was 24.42% and 40.99%, respectively. Logistic regression analysis showed OR=1.83 (95% CI 1.15-1.75) for irregular work schedules and OR=1.83 (95% CI 1.59-2.11) for people who slept less than six hours. A positive association of overweight and obesity with irregular work schedules and short sleep was found, but statistical significance was lost when the OR was adjusted for confounding factors.
[Risk factors for anorexia in children].
Liu, Wei-Xiao; Lang, Jun-Feng; Zhang, Qin-Feng
2016-11-01
To investigate the risk factors for anorexia in children, and to reduce the prevalence of anorexia in children. A questionnaire survey and a case-control study were used to collect the general information of 150 children with anorexia (case group) and 150 normal children (control group). Univariate analysis and multivariate logistic stepwise regression analysis were performed to identify the risk factors for anorexia in children. The results of the univariate analysis showed significant differences between the case and control groups in the age in months when supplementary food were added, feeding pattern, whether they liked meat, vegetables and salty food, whether they often took snacks and beverages, whether they liked to play while eating, and whether their parents asked them to eat food on time (P<0.05). The results of the multivariate logistic regression analysis showed that late addition of supplementary food (OR=5.408), high frequency of taking snacks and/or drinks (OR=11.813), and eating while playing (OR=6.654) were major risk factors for anorexia in children. Liking of meat (OR=0.093) and vegetables (OR=0.272) and eating on time required by parents (OR=0.079) were protective factors against anorexia in children. Timely addition of supplementary food, a proper diet, and development of children's proper eating and living habits can reduce the incidence of anorexia in children.
Lee, Tsair-Fwu; Liou, Ming-Hsiang; Huang, Yu-Jie; Chao, Pei-Ju; Ting, Hui-Min; Lee, Hsiao-Yi
2014-01-01
To predict the incidence of moderate-to-severe patient-reported xerostomia among head and neck squamous cell carcinoma (HNSCC) and nasopharyngeal carcinoma (NPC) patients treated with intensity-modulated radiotherapy (IMRT). Multivariable normal tissue complication probability (NTCP) models were developed by using quality of life questionnaire datasets from 152 patients with HNSCC and 84 patients with NPC. The primary endpoint was defined as moderate-to-severe xerostomia after IMRT. The number of predictive factors for a multivariable logistic regression model was determined using the least absolute shrinkage and selection operator (LASSO) with a bootstrapping technique. Four predictive models were obtained by LASSO, each with the smallest number of factors that preserved predictive value with higher AUC performance. For all models, the dosimetric factors for the mean dose given to the contralateral and ipsilateral parotid gland were selected as the most significant predictors. These were followed by different clinical and socio-economic factors, namely age, financial status, T stage, and education, which were selected for different models. The predicted incidence of xerostomia for HNSCC and NPC patients can be improved by using multivariable logistic regression models with the LASSO technique. The predictive model developed in HNSCC cannot be generalized to an NPC cohort treated with IMRT without validation, and vice versa. PMID:25163814
Gamble, Abigail; Mendy, Vincent
2013-01-01
Introduction Cardiovascular disease is a leading cause of death and health disparities in Mississippi. Identifying populations with poor cardiovascular health may help direct interventions toward those populations disproportionately affected, which may ultimately increase cardiovascular health and decrease prominent disparities. Our objective was to assess racial differences in the prevalence of cardiovascular health metrics among Mississippi adults. Methods We used data from the 2009 Mississippi Behavioral Risk Factor Surveillance System to determine age-standardized prevalence estimates and 95% confidence intervals of cardiovascular health metrics among 2,003 black and 5,125 white adults. Logistic regression models were used to evaluate the relationship between race and cardiovascular health metrics. The mean cardiovascular metrics score and percentage of the population with ideal and poor cardiovascular health were calculated by subgroup. Results Approximately 1.3% of blacks and 2.6% of whites exhibited ideal levels of all 7 cardiovascular health metrics. The prevalence of 4 of the 7 cardiovascular health metrics was significantly lower among the total population of blacks than among whites, including a normal body mass index (20.8% vs 32.3%, P < .001), no history of diabetes (85.1% vs 91.3%, P < .001), no history of hypertension (53.9% vs 67.9%, P < .001), and physical activity (52.8% vs 62.2%, P < .001). The logistic regression models revealed significant race-by-sex interactions; differences between blacks and whites for normal body mass index, no history of diabetes mellitus, and no current smoking were found among women but not among men. Conclusion Cardiovascular health is poor among Mississippi adults overall, and racial differences exist. PMID:24262026
Hidese, Shinsuke; Asano, Shinya; Saito, Kenji; Sasayama, Daimei; Kunugi, Hiroshi
2018-07-01
Body mass index (BMI) and lifestyle-related physical illnesses have been implicated in the pathology of depression. We aimed to investigate the association of depression with BMI classification (i.e., underweight, normal, overweight, and obese), metabolic disease, and lifestyle using a web-based survey in a large cohort. Participants were 1000 individuals who have had depression (mean age: 41.4 ± 12.3 years, 501 men) and 10,876 population-based controls (45.1 ± 13.6 years, 5691 men). The six-item Kessler scale (K6) test was used as a psychological distress scale. Compared with the controls, obesity and hyperlipidemia were more common and the frequency of snack or night meal consumption was higher in the patients, whereas the frequencies of breakfast consumption and of vigorous and moderate physical activities were lower. K6 test scores were higher for underweight or obese people compared to normal or overweight people. A logistic regression analysis showed that the K6 test cut-off score was positively associated with being underweight, hyperlipidemia, and the frequency of snack or night meal consumption, whereas it was negatively associated with the frequency of breakfast consumption in the patients. Logistic regression analyses showed that self-reported depression was positively associated with metabolic diseases and the frequency of snack or night meal consumption, whereas it was negatively associated with the frequency of breakfast consumption. The observed associations of depression with BMI classification, metabolic disease, and lifestyle suggest that lifestyle and related physical conditions are involved in at least a portion of depressive disorders. Copyright © 2018 Elsevier Ltd. All rights reserved.
Barberio, Amanda M; Hosein, F Shaun; Quiñonez, Carlos; McLaren, Lindsay
2017-10-01
There are concerns that altered thyroid functioning could be the result of ingesting too much fluoride. Community water fluoridation (CWF) is an important source of fluoride exposure. Our objectives were to examine the association between fluoride exposure and (1) diagnosis of a thyroid condition and (2) indicators of thyroid functioning among a national population-based sample of Canadians. We analysed data from Cycles 2 and 3 of the Canadian Health Measures Survey (CHMS). Logistic regression was used to assess associations between fluoride from urine and tap water samples and the diagnosis of a thyroid condition. Multinomial logistic regression was used to examine the relationship between fluoride exposure and thyroid-stimulating hormone (TSH) level (low/normal/high). Other available variables permitted additional exploratory analyses among the subset of participants for whom we could discern some fluoride exposure from drinking water and/or dental products. There was no evidence of a relationship between fluoride exposure (from urine and tap water) and the diagnosis of a thyroid condition. There was no statistically significant association between fluoride exposure and abnormal (low or high) TSH levels relative to normal TSH levels. Rerunning the models with the sample constrained to the subset of participants for whom we could discern some source(s) of fluoride exposure from drinking water and/or dental products revealed no significant associations. These analyses suggest that, at the population level, fluoride exposure is not associated with impaired thyroid functioning in a time and place where multiple sources of fluoride exposure, including CWF, exist. © Article author(s) (or their employer(s) unless otherwise stated in the text of the article) 2017. All rights reserved. No commercial use is permitted unless otherwise expressly granted.
Chang, Chun-Jen; Pei, Dee; Wu, Chien-Chih; Palmer, Mary H; Su, Ching-Chieh; Kuo, Shu-Fen; Liao, Yuan-Mei
2017-07-01
To explore correlates of nocturia, compare sleep quality and glycemic control for women with and without nocturia, and examine relationships of nocturia with sleep quality and glycemic control in women with diabetes. This study was a cross-sectional, correlational study with data collected from 275 women with type 2 diabetes. Data were collected using a structured questionnaire. Multivariate logistic regression analyses were used to identify correlates. Chi-squared tests were used to identify candidate variables for the first logistic regression model. A one-way analysis of variance was used to compare sleep quality and glycemic control for women with and those without nocturia. Pearson correlations were used to examine the relationships of nocturia with sleep quality and glycemic control. Of the 275 participants, 124 (45.1%) had experienced nocturia (at least two voids per night). Waist circumference, parity, time since diagnosis of diabetes, sleep quality, and increased daytime urinary frequency were correlated with nocturia after adjusting for age. Compared to women without nocturia, women who had nocturia reported poorer sleep quality. A significant correlation was found between the number of nocturnal episodes and sleep quality. Nocturia and poor sleep are common among women with diabetes. The multifactorial nature of nocturia supports the delivered management and treatments being targeted to underlying etiologies in order to optimize women's symptom management. Interventions aimed at modifiable correlates may include maintaining a normal body weight and regular physical exercise for maintaining a normal waist circumference, and decreasing caffeine consumption, implementing feasible modifications in sleeping environments and maintaining sleep hygiene to improve sleep quality. 
Healthcare professionals should screen for nocturia and poor sleep and offer appropriate nonpharmacological lifestyle management, behavioral interventions, or pharmacotherapy for women with diabetes. © 2017 Sigma Theta Tau International.
Association between obesity and dental caries in a group of preschool children in Mexico.
Vázquez-Nava, Francisco; Vázquez-Rodríguez, Eliza Mireya; Saldívar-González, Atenógenes Humberto; Lin-Ochoa, Dolores; Martinez-Perales, Gerardo Manuel; Joffre-Velázquez, Víctor Manuel
2010-01-01
The aim of this study was to determine the association between obesity and caries by utilizing the data of a cohort of preschool children aged 4-5 years. Data were obtained from a cohort of 1,160 children. Dental caries detection was performed according to the World Health Organization criteria. The caries index was measured as the number of decayed (d), extracted (e), and filled (f) teeth (t) (deft), or surfaces (defs). The body mass index (BMI) in units of kg/m² was determined, and children were categorized according to age- and gender-specific criteria as normal weight (5th-85th percentile), at-risk overweight (≥85th to <95th percentile), and overweight (≥95th percentile). Odds ratios were determined for at-risk overweight and overweight children using logistic regression. The prevalence of dental caries was 17.9 percent. A slightly higher percentage of dental caries was found in boys (19.6 percent) than in girls (16.4 percent). In the total sample, the mean BMI was 17.10 ± 3.83. Approximately 53.7 percent of children were classified as normal weight, 14.2 percent as at-risk overweight, and 32.1 percent as overweight. The proportion of at-risk overweight children was higher among girls (17.1 percent) than among boys (11.3 percent). When adjusted for covariates, the logistic regression model showed that there was a significant association between at-risk overweight children (P < 0.001), overweight children (P < 0.001), and caries in the primary dentition. Mean (SD) deft value of the sample was 1.08 (2.34), while the corresponding defs value was 1.43 (3.29). Obesity appears to be associated with dental caries in the primary dentition of preschool Mexican children.
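The weight categories and the caries index described in the abstract above can be written down as a small sketch. The percentile cut-offs come from the abstract. The function names are hypothetical, and the handling of children below the 5th percentile is an assumption, since the abstract does not say how they were classified.

```python
def weight_category(bmi_percentile: float) -> str:
    """Map an age- and gender-specific BMI percentile to a category.

    Cut-offs follow the abstract: normal weight (5th-85th), at-risk
    overweight (>=85th to <95th), overweight (>=95th). The label for
    values below the 5th percentile is an assumption.
    """
    if not 0 <= bmi_percentile <= 100:
        raise ValueError("percentile must lie between 0 and 100")
    if bmi_percentile >= 95:
        return "overweight"
    if bmi_percentile >= 85:
        return "at-risk overweight"
    if bmi_percentile >= 5:
        return "normal weight"
    return "below 5th percentile (not categorized in the abstract)"


def deft_index(decayed: int, extracted: int, filled: int) -> int:
    """deft: number of decayed, extracted, and filled primary teeth."""
    return decayed + extracted + filled


if __name__ == "__main__":
    # Hypothetical child, not taken from the study data.
    print(weight_category(88.0), deft_index(decayed=1, extracted=0, filled=1))
```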
2014-01-01
Introduction Current practice in the delivery of caloric intake (DCI) in patients with severe acute kidney injury (AKI) receiving renal replacement therapy (RRT) is unknown. We aimed to describe calorie administration in patients enrolled in the Randomized Evaluation of Normal vs. Augmented Level of Replacement Therapy (RENAL) study and to assess the association between DCI and clinical outcomes. Methods We performed a secondary analysis in 1456 patients from the RENAL trial. We measured the dose and evolution of DCI during treatment and analyzed its association with major clinical outcomes using multivariable logistic regression, Cox proportional hazards models, and time adjusted models. Results Overall, mean DCI during treatment in ICU was low at only 10.9 ± 9 Kcal/kg/day for non-survivors and 11 ± 9 Kcal/kg/day for survivors. Among patients with a lower DCI (below the median) 334 of 729 (45.8%) had died at 90-days after randomization compared with 316 of 727 (43.3%) patients with a higher DCI (above the median) (P = 0.34). On multivariable logistic regression analysis, mean DCI carried an odds ratio of 0.95 (95% confidence interval (CI): 0.91-1.00; P = 0.06) per 100 Kcal increase for 90-day mortality. DCI was not associated with significant differences in renal replacement (RRT) free days, mechanical ventilation free days, ICU free days and hospital free days. These findings remained essentially unaltered after time adjusted analysis and Cox proportional hazards modeling. Conclusions In the RENAL study, mean DCI was low. Within the limits of such low caloric intake, greater DCI was not associated with improved clinical outcomes. Trial registration ClinicalTrials.gov number, NCT00221013 PMID:24629036
Clustering performance comparison using K-means and expectation maximization algorithms.
Jung, Yong Gyu; Kang, Min Soo; Heo, Jun
2014-11-14
Clustering is an important means of data mining based on separating data categories by similar features. Unlike the classification algorithm, clustering belongs to the unsupervised type of algorithms. Two representatives of the clustering algorithms are the K -means and the expectation maximization (EM) algorithm. Linear regression analysis was extended to the category-type dependent variable, while logistic regression was achieved using a linear combination of independent variables. To predict the possibility of occurrence of an event, a statistical approach is used. However, the classification of all data by means of logistic regression analysis cannot guarantee the accuracy of the results. In this paper, the logistic regression analysis is applied to EM clusters and the K -means clustering method for quality assessment of red wine, and a method is proposed for ensuring the accuracy of the classification results.
Stamate, Mirela Cristina; Todor, Nicolae; Cosgarea, Marcel
2015-01-01
The clinical utility of otoacoustic emissions as a noninvasive objective test of cochlear function has been long studied. Both transient otoacoustic emissions and distortion products can be used to identify hearing loss, but to what extent they can be used as predictors for hearing loss is still debated. Most studies agree that multivariate analyses have better test performances than univariate analyses. The aim of the study was to determine the performance of transient otoacoustic emissions and distortion products in identifying normal and impaired hearing, using the pure tone audiogram as a gold standard procedure and different multivariate statistical approaches. The study included 105 adult subjects with normal hearing and hearing loss who underwent the same test battery: pure-tone audiometry, tympanometry, otoacoustic emission tests. We chose to use logistic regression as the multivariate statistical technique. Three logistic regression models were developed to characterize the relations between different risk factors (age, sex, tinnitus, demographic features, cochlear status defined by otoacoustic emissions) and hearing status defined by pure-tone audiometry. The multivariate analyses allow the calculation of the logistic score, which is a combination of the inputs, weighted by coefficients, calculated within the analyses. The accuracy of each model was assessed using receiver operating characteristic curve analysis. We used the logistic score to generate receiver operating characteristic curves and to estimate the areas under the curves in order to compare different multivariate analyses. We compared the performance of each otoacoustic emission (transient, distortion product) using three different multivariate analyses for each ear, when multi-frequency gold standards were used. We demonstrated that all multivariate analyses provided high values of the area under the curve, proving the performance of the otoacoustic emissions. 
Each otoacoustic emission test presented high values of area under the curve, suggesting that implementing a multivariate approach to evaluate the performances of each otoacoustic emission test would serve to increase the accuracy in identifying the normal and impaired ears. We encountered the highest area under the curve value for the combined multivariate analysis suggesting that both otoacoustic emission tests should be used in assessing hearing status. Our multivariate analyses revealed that age is a constant predictor factor of the auditory status for both ears, but the presence of tinnitus was the most important predictor for the hearing level, only for the left ear. Age presented similar coefficients, but tinnitus coefficients, by their high value, produced the highest variations of the logistic scores, only for the left ear group, thus increasing the risk of hearing loss. We did not find gender differences between ears for any otoacoustic emission tests, but studies still debate this question as the results are contradictory. Neither gender, nor environment origin had any predictive value for the hearing status, according to the results of our study. Like any other audiological test, using otoacoustic emissions to identify hearing loss is not without error. Even when applying multivariate analysis, perfect test performance is never achieved. Although most studies demonstrated the benefit of using the multivariate analysis, it has not been incorporated into clinical decisions maybe because of the idiosyncratic nature of multivariate solutions or because of the lack of the validation studies.
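The logistic score and the area under the ROC curve mentioned in the abstract above can be illustrated with a minimal sketch. It assumes the score is the usual linear predictor (intercept plus coefficient-weighted inputs) and computes the AUC as the probability that a randomly chosen positive case scores higher than a randomly chosen negative one, which is a standard equivalent of the area under the empirical ROC curve. All function names, coefficients, and data below are hypothetical and are not taken from the study.

```python
import math
from typing import Sequence


def logistic_score(coefs: Sequence[float], intercept: float,
                   x: Sequence[float]) -> float:
    """Linear predictor: intercept + sum of coefficient * input."""
    return intercept + sum(b * v for b, v in zip(coefs, x))


def predicted_probability(score: float) -> float:
    """Inverse-logit transform of the logistic score."""
    return 1.0 / (1.0 + math.exp(-score))


def auc(scores: Sequence[float], labels: Sequence[int]) -> float:
    """AUC as P(score_pos > score_neg), counting ties as 1/2."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


if __name__ == "__main__":
    # Hypothetical model: inputs are (age in years, tinnitus 0/1).
    coefs, intercept = [0.05, 1.2], -3.0
    subjects = [(25, 0), (40, 0), (55, 1), (70, 1)]
    hearing_loss = [0, 0, 1, 1]  # made-up gold-standard labels
    scores = [logistic_score(coefs, intercept, s) for s in subjects]
    print([round(predicted_probability(s), 3) for s in scores])
    print("AUC:", auc(scores, hearing_loss))
```

Because the AUC depends only on the ranking of the scores, it is the same whether it is computed from the raw logistic score or from the predicted probability.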
STAMATE, MIRELA CRISTINA; TODOR, NICOLAE; COSGAREA, MARCEL
2015-01-01
Background and aim The clinical utility of otoacoustic emissions as a noninvasive objective test of cochlear function has long been studied. Both transient otoacoustic emissions and distortion products can be used to identify hearing loss, but the extent to which they can be used as predictors of hearing loss is still debated. Most studies agree that multivariate analyses have better test performance than univariate analyses. The aim of the study was to determine the performance of transient otoacoustic emissions and distortion products in identifying normal and impaired hearing, using the pure-tone audiogram as the gold standard and different multivariate statistical approaches. Methods The study included 105 adult subjects with normal hearing and hearing loss who underwent the same test battery: pure-tone audiometry, tympanometry, and otoacoustic emission tests. We chose logistic regression as the multivariate statistical technique. Three logistic regression models were developed to characterize the relations between different risk factors (age, sex, tinnitus, demographic features, cochlear status defined by otoacoustic emissions) and hearing status defined by pure-tone audiometry. The multivariate analyses allow the calculation of the logistic score, which is a combination of the inputs, weighted by coefficients calculated within the analyses. The accuracy of each model was assessed using receiver operating characteristic curve analysis. We used the logistic score to generate receiver operating characteristic curves and to estimate the areas under the curves in order to compare different multivariate analyses. Results We compared the performance of each otoacoustic emission (transient, distortion product) using three different multivariate analyses for each ear, when multi-frequency gold standards were used. We demonstrated that all multivariate analyses provided high values of the area under the curve, supporting the performance of the otoacoustic emissions.
Each otoacoustic emission test presented high values of area under the curve, suggesting that implementing a multivariate approach to evaluate the performance of each otoacoustic emission test would increase the accuracy in identifying normal and impaired ears. We encountered the highest area under the curve value for the combined multivariate analysis, suggesting that both otoacoustic emission tests should be used in assessing hearing status. Our multivariate analyses revealed that age is a constant predictor of auditory status for both ears, but the presence of tinnitus was the most important predictor of hearing level only for the left ear. Age presented similar coefficients, but tinnitus coefficients, by their high value, produced the highest variations of the logistic scores, only for the left ear group, thus increasing the risk of hearing loss. We did not find gender differences between ears for any otoacoustic emission tests, but studies still debate this question as the results are contradictory. Neither gender nor environment of origin had any predictive value for hearing status, according to the results of our study. Conclusion Like any other audiological test, using otoacoustic emissions to identify hearing loss is not without error. Even when applying multivariate analysis, perfect test performance is never achieved. Although most studies demonstrated the benefit of using multivariate analysis, it has not been incorporated into clinical decisions, perhaps because of the idiosyncratic nature of multivariate solutions or because of the lack of validation studies. PMID:26733749
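The logistic score and area-under-the-curve comparison described in this abstract can be sketched as follows. This is only an illustration: the coefficients, the three inputs (age, tinnitus, emission present) and the eight ears are hypothetical values, not figures from the study, and the AUC is computed with the standard pairwise (Mann-Whitney) definition rather than whatever software the authors used.

```python
import math

def logistic_score(coefs, intercept, inputs):
    """Logistic score: intercept plus the coefficient-weighted sum of the inputs."""
    return intercept + sum(b * v for b, v in zip(coefs, inputs))

def probability(score):
    """Map a logistic score to a predicted probability of hearing loss."""
    return 1.0 / (1.0 + math.exp(-score))

def auc(scores, labels):
    """AUC as the share of (impaired, normal) pairs ranked correctly; ties count 1/2."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

# Hypothetical coefficients for: age (years), tinnitus (0/1), emission present (0/1).
coefs = [0.05, 1.2, -2.0]
intercept = -2.5

# Hypothetical ears: (inputs, hearing loss on the audiogram: 1 = impaired, 0 = normal).
ears = [
    ([30, 0, 1], 0), ([45, 0, 1], 0), ([55, 1, 1], 0), ([75, 1, 1], 0),
    ([60, 1, 0], 1), ([70, 0, 0], 1), ([65, 1, 0], 1), ([50, 0, 0], 1),
]
scores = [logistic_score(coefs, intercept, x) for x, _ in ears]
labels = [y for _, y in ears]
print("AUC:", auc(scores, labels))
```

Comparing two models then amounts to computing this AUC from each model's scores on the same ears.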
Stapel, Sandra N; Looijaard, Wilhelmus G P M; Dekker, Ingeborg M; Girbes, Armand R J; Weijs, Peter J M; Oudemans-van Straaten, Heleen M
2018-05-11
A low bioelectrical impedance analysis (BIA)-derived phase angle (PA) predicts morbidity and mortality in different patient groups. An association between PA and long-term mortality in ICU patients has not been demonstrated before. The purpose of the present study was to determine whether PA on ICU admission independently predicts 90-day mortality. This prospective observational study was performed in a mixed university ICU. BIA was performed in 196 patients within 24 h of ICU admission. To test the independent association between PA and 90-day mortality, logistic regression analysis was performed using the APACHE IV predicted mortality as a confounder. The optimal cutoff value of PA for mortality prediction was determined by ROC curve analysis. Using this cutoff value, patients were categorized into a low or normal PA group and the association with 90-day mortality was tested again. The PA of survivors was higher than that of non-survivors (5.0° ± 1.3° vs. 4.1° ± 1.2°, p < 0.001). The area under the ROC curve of PA for 90-day mortality was 0.70 (CI 0.59-0.80). PA was associated with 90-day mortality (OR = 0.56, CI: 0.38-0.77, p = 0.001) on univariate logistic regression analysis and also after adjusting for BMI, gender, age, and APACHE IV on multivariable logistic regression (OR = 0.65, CI: 0.44-0.96, p = 0.031). A PA < 4.8° was an independent predictor of 90-day mortality (adjusted OR = 3.65, CI: 1.34-9.93, p = 0.011). Phase angle at ICU admission is an independent predictor of 90-day mortality. This biological marker can aid in long-term mortality risk assessment of critically ill patients.
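The link between a logistic regression coefficient and the odds ratios quoted above can be sketched as below. The abstract does not state how its intervals were computed; the sketch assumes a 95% Wald interval, OR = exp(β) with limits exp(β ± 1.96·SE), and uses the adjusted result (OR = 0.65, CI: 0.44-0.96) only to back out an approximate β and SE. The function names are illustrative.

```python
import math

def odds_ratio_ci(beta, se, z=1.96):
    """Odds ratio and Wald confidence limits from a coefficient and its standard error."""
    return math.exp(beta), math.exp(beta - z * se), math.exp(beta + z * se)

def beta_se_from_ci(lower, upper, z=1.96):
    """Recover coefficient and standard error from Wald limits (assumes symmetry on the log scale)."""
    beta = (math.log(lower) + math.log(upper)) / 2.0
    se = (math.log(upper) - math.log(lower)) / (2.0 * z)
    return beta, se

# Reported adjusted interval for phase angle: 0.44 to 0.96.
beta, se = beta_se_from_ci(0.44, 0.96)
odds_ratio, lower, upper = odds_ratio_ci(beta, se)
print(f"beta = {beta:.3f}, SE = {se:.3f}, OR = {odds_ratio:.2f} ({lower:.2f} to {upper:.2f})")
```

The midpoint of the log limits gives an odds ratio of about 0.65, which matches the reported point estimate and is consistent with an interval that is symmetric on the log-odds scale.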
Hackett, Geoffrey; Jones, Peter W; Strange, Richard C; Ramachandran, Sudarshan
2017-01-01
AIM To determine how statins, testosterone (T) replacement therapy (TRT) and phosphodiesterase 5-inhibitors (PDE5I) influence age related mortality in diabetic men. METHODS We studied 857 diabetic men screened for the BLAST study, stratifying them (mean follow-up = 3.8 years) into: (1) Normal T levels/untreated (total T > 12 nmol/L and free T > 0.25 nmol/L), Low T/untreated and Low T/treated; (2) PDE5I/untreated and PDE5I/treated; and (3) statin/untreated and statin/treated groups. The relationship between age and mortality, alone and with T/TRT, statin and PDE5I treatment, was studied using logistic regression. Mortality probability and 95%CI were calculated from the above models for each individual. RESULTS Age was associated with mortality (logistic regression, OR = 1.10, 95%CI: 1.08-1.13, P < 0.001). With all factors included, age (OR = 1.08, 95%CI: 1.06-1.11, P < 0.001), Low T/treated (OR = 0.38, 95%CI: 0.15-0.92, P = 0.033), PDE5I/treated (OR = 0.17, 95%CI: 0.053-0.56, P = 0.004) and statin/treated (OR = 0.59, 95%CI: 0.36-0.97, P = 0.038) were associated with lower mortality. Age related mortality was as described by Gompertz, r² = 0.881 when Ln (mortality) was plotted against age. The probability of mortality and 95%CI (from logistic regression) of individuals, treated/untreated with the drugs, alone and in combination, was plotted against age. Overlap of 95%CI lines was evident with statins and TRT. No overlap was evident with PDE5I alone and with statins and TRT, suggesting a change in the relationship between age and mortality. CONCLUSION We show that statins, PDE5I and TRT reduce mortality in diabetes. PDE5I, alone and with the other treatments, significantly alter age related mortality in diabetic men. PMID:28344753
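The Gompertz check mentioned in the results (plotting Ln(mortality) against age and reporting r²) can be sketched with an ordinary least-squares line. The ages and mortality proportions below are hypothetical and chosen only so that mortality roughly doubles every eight years; they are not data from the BLAST cohort.

```python
import math

def fit_line(xs, ys):
    """Least-squares slope, intercept and r-squared for y regressed on x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    r_squared = sxy ** 2 / (sxx * syy)
    return slope, intercept, r_squared

# Hypothetical age bands and mortality proportions (illustration only).
ages = [50, 55, 60, 65, 70, 75, 80]
mortality = [0.010, 0.016, 0.024, 0.037, 0.058, 0.088, 0.135]

# Gompertz law: ln(mortality) is approximately linear in age.
log_mortality = [math.log(m) for m in mortality]
slope, intercept, r_squared = fit_line(ages, log_mortality)
doubling_time = math.log(2) / slope
print(f"slope = {slope:.4f} per year, r^2 = {r_squared:.3f}, doubling time = {doubling_time:.1f} years")
```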
Delva, J; Spencer, M S; Lin, J K
2000-01-01
This article compares estimates of the relative odds of nitrite use obtained from weighted unconditional logistic regression with estimates obtained from conditional logistic regression after post-stratification and matching of cases with controls by neighborhood of residence. We illustrate these methods by comparing the odds associated with nitrite use among adults of four racial/ethnic groups, with and without a high school education. We used aggregated data from the 1994-B through 1996 National Household Survey on Drug Abuse (NHSDA). Differences between the methods and implications for analysis and inference are discussed.
Austin, Peter C; Lee, Douglas S; Steyerberg, Ewout W; Tu, Jack V
2012-01-01
In biomedical research, the logistic regression model is the most commonly used method for predicting the probability of a binary outcome. While many clinical researchers have expressed an enthusiasm for regression trees, this method may have limited accuracy for predicting health outcomes. We aimed to evaluate the improvement that is achieved by using ensemble-based methods, including bootstrap aggregation (bagging) of regression trees, random forests, and boosted regression trees. We analyzed 30-day mortality in two large cohorts of patients hospitalized with either acute myocardial infarction (N = 16,230) or congestive heart failure (N = 15,848) in two distinct eras (1999–2001 and 2004–2005). We found that both the in-sample and out-of-sample prediction of ensemble methods offered substantial improvement in predicting cardiovascular mortality compared to conventional regression trees. However, conventional logistic regression models that incorporated restricted cubic smoothing splines had even better performance. We conclude that ensemble methods from the data mining and machine learning literature increase the predictive performance of regression trees, but may not lead to clear advantages over conventional logistic regression models for predicting short-term mortality in population-based samples of subjects with cardiovascular disease. PMID:22777999
ERIC Educational Resources Information Center
Fidalgo, Angel M.; Alavi, Seyed Mohammad; Amirian, Seyed Mohammad Reza
2014-01-01
This study examines three controversial aspects in differential item functioning (DIF) detection by logistic regression (LR) models: first, the relative effectiveness of different analytical strategies for detecting DIF; second, the suitability of the Wald statistic for determining the statistical significance of the parameters of interest; and…
ERIC Educational Resources Information Center
French, Brian F.; Maller, Susan J.
2007-01-01
Two unresolved implementation issues with logistic regression (LR) for differential item functioning (DIF) detection include ability purification and effect size use. Purification is suggested to control inaccuracies in DIF detection as a result of DIF items in the ability estimate. Additionally, effect size use may be beneficial in controlling…
A Note on Three Statistical Tests in the Logistic Regression DIF Procedure
ERIC Educational Resources Information Center
Paek, Insu
2012-01-01
Although logistic regression became one of the well-known methods in detecting differential item functioning (DIF), its three statistical tests, the Wald, likelihood ratio (LR), and score tests, which are readily available under the maximum likelihood, do not seem to be consistently distinguished in DIF literature. This paper provides a clarifying…
ERIC Educational Resources Information Center
West, Lindsey M.; Davis, Telsie A.; Thompson, Martie P.; Kaslow, Nadine J.
2011-01-01
Protective factors for fostering reasons for living were examined among low-income, suicidal, African American women. Bivariate logistic regressions revealed that higher levels of optimism, spiritual well-being, and family social support predicted reasons for living. Multivariate logistic regressions indicated that spiritual well-being showed…
Comparison of Two Approaches for Handling Missing Covariates in Logistic Regression
ERIC Educational Resources Information Center
Peng, Chao-Ying Joanne; Zhu, Jin
2008-01-01
For the past 25 years, methodological advances have been made in missing data treatment. Most published work has focused on missing data in dependent variables under various conditions. The present study seeks to fill the void by comparing two approaches for handling missing data in categorical covariates in logistic regression: the…
Comparison of IRT Likelihood Ratio Test and Logistic Regression DIF Detection Procedures
ERIC Educational Resources Information Center
Atar, Burcu; Kamata, Akihito
2011-01-01
The Type I error rates and the power of IRT likelihood ratio test and cumulative logit ordinal logistic regression procedures in detecting differential item functioning (DIF) for polytomously scored items were investigated in this Monte Carlo simulation study. For this purpose, 54 simulation conditions (combinations of 3 sample sizes, 2 sample…
Multiple Logistic Regression Analysis of Cigarette Use among High School Students
ERIC Educational Resources Information Center
Adwere-Boamah, Joseph
2011-01-01
A binary logistic regression analysis was performed to predict high school students' cigarette smoking behavior from selected predictors from 2009 CDC Youth Risk Behavior Surveillance Survey. The specific target student behavior of interest was frequent cigarette use. Five predictor variables included in the model were: a) race, b) frequency of…
ERIC Educational Resources Information Center
Anderson, Carolyn J.; Verkuilen, Jay; Peyton, Buddy L.
2010-01-01
Survey items with multiple response categories and multiple-choice test questions are ubiquitous in psychological and educational research. We illustrate the use of log-multiplicative association (LMA) models that are extensions of the well-known multinomial logistic regression model for multiple dependent outcome variables to reanalyze a set of…
Propensity Score Estimation with Data Mining Techniques: Alternatives to Logistic Regression
ERIC Educational Resources Information Center
Keller, Bryan S. B.; Kim, Jee-Seon; Steiner, Peter M.
2013-01-01
Propensity score analysis (PSA) is a methodological technique which may correct for selection bias in a quasi-experiment by modeling the selection process using observed covariates. Because logistic regression is well understood by researchers in a variety of fields and easy to implement in a number of popular software packages, it has…
ERIC Educational Resources Information Center
Courtney, Jon R.; Prophet, Retta
2011-01-01
Placement instability is often associated with a number of negative outcomes for children. To gain state level contextual knowledge of factors associated with placement stability/instability, logistic regression was applied to selected variables from the New Mexico Adoption and Foster Care Administrative Reporting System dataset. Predictors…
Classifying machinery condition using oil samples and binary logistic regression
NASA Astrophysics Data System (ADS)
Phillips, J.; Cripps, E.; Lau, John W.; Hodkiewicz, M. R.
2015-08-01
The era of big data has resulted in an explosion of condition monitoring information. The result is an increasing motivation to automate the costly and time-consuming human elements involved in the classification of machine health. When working with industry it is important to build an understanding and hence some trust in the classification scheme for those who use the analysis to initiate maintenance tasks. Typically, "black box" approaches such as artificial neural networks (ANN) and support vector machines (SVM) are difficult to interpret. In contrast, this paper argues that logistic regression offers easy interpretability to industry experts, providing insight into the drivers of the human classification process and the ramifications of potential misclassification. Of course, accuracy is of foremost importance in any automated classification scheme, so we also provide a comparative study based on predictive performance of logistic regression, ANN and SVM. A real-world oil analysis data set from engines on mining trucks is presented and, using cross-validation, we demonstrate that logistic regression outperforms the ANN and SVM approaches in terms of prediction for healthy/not healthy engines.
Length bias correction in gene ontology enrichment analysis using logistic regression.
Mi, Gu; Di, Yanming; Emerson, Sarah; Cumbie, Jason S; Chang, Jeff H
2012-01-01
When assessing differential gene expression from RNA sequencing data, commonly used statistical tests tend to have greater power to detect differential expression of genes encoding longer transcripts. This phenomenon, called "length bias", will influence subsequent analyses such as Gene Ontology enrichment analysis. In the presence of length bias, Gene Ontology categories that include longer genes are more likely to be identified as enriched. These categories, however, are not necessarily biologically more relevant. We show that one can effectively adjust for length bias in Gene Ontology analysis by including transcript length as a covariate in a logistic regression model. The logistic regression model makes the statistical issue underlying length bias more transparent: transcript length becomes a confounding factor when it correlates with both the Gene Ontology membership and the significance of the differential expression test. The inclusion of the transcript length as a covariate allows one to investigate the direct correlation between the Gene Ontology membership and the significance of testing differential expression, conditional on the transcript length. We present both real and simulated data examples to show that the logistic regression approach is simple, effective, and flexible.
Hansson, Lisbeth; Khamis, Harry J
2008-12-01
Simulated data sets are used to evaluate conditional and unconditional maximum likelihood estimation in an individual case-control design with continuous covariates when there are different rates of excluded cases and different levels of other design parameters. The effectiveness of the estimation procedures is measured by method bias, variance of the estimators, root mean square error (RMSE) for logistic regression and the percentage of explained variation. Conditional estimation leads to higher RMSE than unconditional estimation in the presence of missing observations, especially for 1:1 matching. The RMSE is higher for the smaller stratum size, especially for the 1:1 matching. The percentage of explained variation appears to be insensitive to missing data, but is generally higher for the conditional estimation than for the unconditional estimation. It is particularly good for the 1:2 matching design. For minimizing RMSE, a high matching ratio is recommended; in this case, conditional and unconditional logistic regression models yield comparable levels of effectiveness. For maximizing the percentage of explained variation, the 1:2 matching design with the conditional logistic regression model is recommended.
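The three effectiveness measures named here are tied together by the identity RMSE² = bias² + variance. A minimal sketch, using a hypothetical true coefficient and five hypothetical simulated estimates rather than anything from the paper:

```python
import math

def bias_variance_rmse(estimates, true_value):
    """Bias, population variance and RMSE of simulated estimates around the true value."""
    n = len(estimates)
    mean_est = sum(estimates) / n
    bias = mean_est - true_value
    variance = sum((e - mean_est) ** 2 for e in estimates) / n
    rmse = math.sqrt(sum((e - true_value) ** 2 for e in estimates) / n)
    return bias, variance, rmse

# Hypothetical log-odds estimates from five simulated case-control data sets.
estimates = [1.1, 1.3, 0.9, 1.2, 1.0]
bias, variance, rmse = bias_variance_rmse(estimates, true_value=1.0)
print(f"bias = {bias:.3f}, variance = {variance:.3f}, RMSE = {rmse:.3f}")
```

An estimator can therefore have a higher RMSE either because it is more biased or because it is more variable, which is why the abstract reports the components separately.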
Szekér, Szabolcs; Vathy-Fogarassy, Ágnes
2018-01-01
Logistic regression based propensity score matching is a widely used method in case-control studies to select the individuals of the control group. This method creates a suitable control group if all factors affecting the output variable are known. However, if relevant latent variables exist that are not taken into account during the calculations, the quality of the control group is uncertain. In this paper, we present a statistics-based study in which we try to determine the relationship between the accuracy of the logistic regression model and the uncertainty of the dependent variable of the control group defined by propensity score matching. Our analyses show that there is a linear correlation between the fit of the logistic regression model and the uncertainty of the output variable. In certain cases, a latent binary explanatory variable can result in a relative error of up to 70% in the prediction of the outcome variable. The observed phenomenon draws the attention of analysts to an important point, which must be taken into account when drawing conclusions.
Dong, Guangheng; Wang, Jiangyang; Yang, Xuelong; Zhou, Hui
2013-12-01
As the world's fastest growing "addiction", Internet addiction is still controversial. The present study aimed to examine the potential personality predictors of Internet addiction. Eight hundred and sixty-eight students were tested using the Eysenck Personality Questionnaire after they had just entered university. Two years later, 49 were found to be addicted to the Internet as defined by high Internet addiction test scores. Comparisons of means and logistic regression analysis were used to explore their relationship. Students addicted to the Internet showed higher Neuroticism/Stability scores, higher Psychoticism/Socialization scores, and lower Lie scores than their normal peers before their addiction. Regression results showed that Internet addiction was accounted for by three independent variables: Neuroticism/Stability, Psychoticism/Socialization, and Lie. These results suggest that the risk personality traits of Internet addiction include neuroticism, psychoticism, and immaturity. Copyright © 2012 Wiley Publishing Asia Pty Ltd.
Logistic regression for circular data
NASA Astrophysics Data System (ADS)
Al-Daffaie, Kadhem; Khan, Shahjahan
2017-05-01
This paper considers the relationship between a binary response and a circular predictor. It develops the logistic regression model by employing the linear-circular regression approach. The maximum likelihood method is used to estimate the parameters. The Newton-Raphson numerical method is used to find the estimated values of the parameters. A data set from weather records of Toowoomba city is analysed by the proposed methods. Moreover, a simulation study is considered. The R software is used for all computations and simulations.
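A binary response with a circular predictor can be sketched as below. The abstract does not give the model's exact form, so this assumes a common linear-circular parameterization, logit(p) = β0 + β1·cos(θ) + β2·sin(θ), fitted by Newton-Raphson on the log-likelihood. The paper used R; Python is used here only for illustration, and the simulated angles and "true" coefficients are hypothetical.

```python
import math
import random

def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (m[i][n] - sum(m[i][j] * x[j] for j in range(i + 1, n))) / m[i][i]
    return x

def fit_logistic(rows, y, iterations=25):
    """Maximum likelihood by Newton-Raphson: beta += (X'WX)^-1 X'(y - p)."""
    k = len(rows[0])
    beta = [0.0] * k
    for _ in range(iterations):
        grad = [0.0] * k
        hess = [[0.0] * k for _ in range(k)]
        for xi, yi in zip(rows, y):
            eta = sum(b * v for b, v in zip(beta, xi))
            p = 1.0 / (1.0 + math.exp(-eta))
            w = p * (1.0 - p)
            for a in range(k):
                grad[a] += (yi - p) * xi[a]
                for c in range(k):
                    hess[a][c] += w * xi[a] * xi[c]
        step = solve(hess, grad)
        beta = [b + s for b, s in zip(beta, step)]
    return beta

def circular_design(theta):
    """Design row for an angle in radians: intercept, cosine and sine terms (assumed form)."""
    return [1.0, math.cos(theta), math.sin(theta)]

# Simulate hypothetical data, e.g. wind direction versus a binary weather event.
random.seed(0)
true_beta = [-0.5, 1.5, -1.0]
thetas = [random.uniform(0.0, 2.0 * math.pi) for _ in range(2000)]
rows = [circular_design(t) for t in thetas]
y = []
for xi in rows:
    p = 1.0 / (1.0 + math.exp(-sum(b * v for b, v in zip(true_beta, xi))))
    y.append(1 if random.random() < p else 0)

beta_hat = fit_logistic(rows, y)
print("estimated coefficients:", [round(b, 2) for b in beta_hat])
```

Using the cosine and sine of the angle keeps the fitted probability periodic, so 0 and 2π give the same prediction, which a raw angle term would not.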
Naval Research Logistics Quarterly. Volume 28. Number 3,
1981-09-01
denotes component-wise maximum. f has antitone (isotone) differences on C x D if for c1 < c2 and d1 < d2, ...or negative correlations and linear or nonlinear regressions. Given are the moments to order two and, for special cases, the regression function and...data sets. We designate this bnb distribution as G - B - N(a, 0, v). The distribution admits only of positive correlation and linear regressions
Bond, H S; Sullivan, S G; Cowling, B J
2016-06-01
Influenza vaccination is the most practical means available for preventing influenza virus infection and is widely used in many countries. Because vaccine components and circulating strains frequently change, it is important to continually monitor vaccine effectiveness (VE). The test-negative design is frequently used to estimate VE. In this design, patients meeting the same clinical case definition are recruited and tested for influenza; those who test positive are the cases and those who test negative form the comparison group. When determining VE in these studies, the typical approach has been to use logistic regression, adjusting for potential confounders. Because vaccine coverage and influenza incidence change throughout the season, time is included among these confounders. While most studies use unconditional logistic regression, adjusting for time, an alternative approach is to use conditional logistic regression, matching on time. Here, we used simulation data to examine the potential for both regression approaches to permit accurate and robust estimates of VE. In situations where vaccine coverage changed during the influenza season, the conditional model and unconditional models adjusting for categorical week and using a spline function for week provided more accurate estimates. We illustrated the two approaches on data from a test-negative study of influenza VE against hospitalization in children in Hong Kong which resulted in the conditional logistic regression model providing the best fit to the data.
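In the test-negative design, vaccine effectiveness is conventionally estimated as VE = (1 − OR) × 100%, where OR is the odds ratio of vaccination among test-positive versus test-negative patients. The sketch below computes only the crude, unadjusted version from a 2×2 table; the counts are hypothetical, and the adjusted analyses discussed in the abstract would take the OR from a fitted (conditional or unconditional) logistic model instead.

```python
def crude_odds_ratio(vacc_cases, unvacc_cases, vacc_controls, unvacc_controls):
    """Odds of vaccination among test-positives divided by odds among test-negatives."""
    return (vacc_cases / unvacc_cases) / (vacc_controls / unvacc_controls)

def vaccine_effectiveness(odds_ratio):
    """VE as a proportion: 1 minus the odds ratio."""
    return 1.0 - odds_ratio

# Hypothetical counts: influenza-positive (cases) and influenza-negative (controls).
odds_ratio = crude_odds_ratio(vacc_cases=20, unvacc_cases=80,
                              vacc_controls=50, unvacc_controls=100)
ve = vaccine_effectiveness(odds_ratio)
print(f"OR = {odds_ratio:.2f}, VE = {ve:.0%}")
```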
Asghari, Mehdi Poursheikhali; Hayatshahi, Sayyed Hamed Sadat; Abdolmaleki, Parviz
2012-01-01
From both the structural and functional points of view, β-turns play important biological roles in proteins. In the present study, a novel two-stage hybrid procedure has been developed to identify β-turns in proteins. Binary logistic regression was used, for the first time, to select significant sequence parameters for the identification of β-turns through a re-substitution test procedure. Sequence parameters consisted of 80 amino acid positional occurrences and 20 amino acid percentages in the sequence. Among these parameters, the most significant ones selected by the binary logistic regression model were the percentages of Gly and Ser and the occurrence of Asn in position i+2. These significant parameters have the highest effect on the constitution of a β-turn sequence. A neural network model was then constructed and fed with the parameters selected by binary logistic regression to build a hybrid predictor. The networks were trained and tested on a non-homologous dataset of 565 protein chains. Applying a nine-fold cross-validation test on the dataset, the network reached an overall accuracy (Qtotal) of 74, which is comparable with the results of other β-turn prediction methods. In conclusion, this study shows that the parameter selection ability of binary logistic regression together with the prediction capability of neural networks lead to the development of more precise models for identifying β-turns in proteins. PMID:27418910
Crane, Paul K; Gibbons, Laura E; Jolley, Lance; van Belle, Gerald
2006-11-01
We present an ordinal logistic regression model for identification of items with differential item functioning (DIF) and apply this model to a Mini-Mental State Examination (MMSE) dataset. We employ item response theory ability estimation in our models. Three nested ordinal logistic regression models are applied to each item. Model testing begins with examination of the statistical significance of the interaction term between ability and the group indicator, consistent with nonuniform DIF. Then we turn our attention to the coefficient of the ability term in models with and without the group term. If including the group term has a marked effect on that coefficient, we declare that it has uniform DIF. We examined DIF related to language of test administration in addition to self-reported race, Hispanic ethnicity, age, years of education, and sex. We used PARSCALE for IRT analyses and STATA for ordinal logistic regression approaches. We used an iterative technique for adjusting IRT ability estimates on the basis of DIF findings. Five items were found to have DIF related to language. These same items also had DIF related to other covariates. The ordinal logistic regression approach to DIF detection, when combined with IRT ability estimates, provides a reasonable alternative for DIF detection. There appear to be several items with significant DIF related to language of test administration in the MMSE. More attention needs to be paid to the specific criteria used to determine whether an item has DIF, not just the technique used to identify DIF.
Conditional Poisson models: a flexible alternative to conditional logistic case cross-over analysis.
Armstrong, Ben G; Gasparrini, Antonio; Tobias, Aurelio
2014-11-24
The time stratified case cross-over approach is a popular alternative to conventional time series regression for analysing associations between time series of environmental exposures (air pollution, weather) and counts of health outcomes. These are almost always analyzed using conditional logistic regression on data expanded to case-control (case crossover) format, but this has some limitations. In particular adjusting for overdispersion and auto-correlation in the counts is not possible. It has been established that a Poisson model for counts with stratum indicators gives identical estimates to those from conditional logistic regression and does not have these limitations, but it is little used, probably because of the overheads in estimating many stratum parameters. The conditional Poisson model avoids estimating stratum parameters by conditioning on the total event count in each stratum, thus simplifying the computing and increasing the number of strata for which fitting is feasible compared with the standard unconditional Poisson model. Unlike the conditional logistic model, the conditional Poisson model does not require expanding the data, and can adjust for overdispersion and auto-correlation. It is available in Stata, R, and other packages. By applying to some real data and using simulations, we demonstrate that conditional Poisson models were simpler to code and shorter to run than are conditional logistic analyses and can be fitted to larger data sets than possible with standard Poisson models. Allowing for overdispersion or autocorrelation was possible with the conditional Poisson model but when not required this model gave identical estimates to those from conditional logistic regression. Conditional Poisson regression models provide an alternative to case crossover analysis of stratified time series data with some advantages. 
The conditional Poisson model can also be used in other contexts in which primary control for confounding is by fine stratification.
Use of generalized ordered logistic regression for the analysis of multidrug resistance data.
Agga, Getahun E; Scott, H Morgan
2015-10-01
Statistical analysis of antimicrobial resistance data largely focuses on individual antimicrobial's binary outcome (susceptible or resistant). However, bacteria are becoming increasingly multidrug resistant (MDR). Statistical analysis of MDR data is mostly descriptive often with tabular or graphical presentations. Here we report the applicability of generalized ordinal logistic regression model for the analysis of MDR data. A total of 1,152 Escherichia coli, isolated from the feces of weaned pigs experimentally supplemented with chlortetracycline (CTC) and copper, were tested for susceptibilities against 15 antimicrobials and were binary classified into resistant or susceptible. The 15 antimicrobial agents tested were grouped into eight different antimicrobial classes. We defined MDR as the number of antimicrobial classes to which E. coli isolates were resistant ranging from 0 to 8. Proportionality of the odds assumption of the ordinal logistic regression model was violated only for the effect of treatment period (pre-treatment, during-treatment and post-treatment); but not for the effect of CTC or copper supplementation. Subsequently, a partially constrained generalized ordinal logistic model was built that allows for the effect of treatment period to vary while constraining the effects of treatment (CTC and copper supplementation) to be constant across the levels of MDR classes. Copper (Proportional Odds Ratio [Prop OR]=1.03; 95% CI=0.73-1.47) and CTC (Prop OR=1.1; 95% CI=0.78-1.56) supplementation were not significantly associated with the level of MDR adjusted for the effect of treatment period. MDR generally declined over the trial period. In conclusion, generalized ordered logistic regression can be used for the analysis of ordinal data such as MDR data when the proportionality assumptions for ordered logistic regression are violated. Published by Elsevier B.V.
Detection of chewing from piezoelectric film sensor signals using ensemble classifiers.
Farooq, Muhammad; Sazonov, Edward
2016-08-01
Selection and use of pattern recognition algorithms is application dependent. In this work, we explored the use of several ensembles of weak classifiers to classify signals captured from a wearable sensor system to detect food intake based on chewing. Three sensor signals (Piezoelectric sensor, accelerometer, and hand to mouth gesture) were collected from 12 subjects in free-living conditions for 24 hrs. Sensor signals were divided into 10 seconds epochs and for each epoch combination of time and frequency domain features were computed. In this work, we present a comparison of three different ensemble techniques: boosting (AdaBoost), bootstrap aggregation (bagging) and stacking, each trained with 3 different weak classifiers (Decision Trees, Linear Discriminant Analysis (LDA) and Logistic Regression). Type of feature normalization used can also impact the classification results. For each ensemble method, three feature normalization techniques: (no-normalization, z-score normalization, and minmax normalization) were tested. A 12 fold cross-validation scheme was used to evaluate the performance of each model where the performance was evaluated in terms of precision, recall, and accuracy. Best results achieved here show an improvement of about 4% over our previous algorithms.
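The two feature normalization schemes named above can be sketched in a few lines. This is a minimal illustration, not the authors' pipeline: the function names are hypothetical, and the use of the population standard deviation for the z-score is an assumption, since the abstract does not say which estimator was used.

```python
import statistics

def z_score(values):
    """Centre on the mean and scale by the population standard deviation.

    Assumption: population (not sample) standard deviation.
    """
    mean = statistics.fmean(values)
    sd = statistics.pstdev(values)
    if sd == 0:
        return [0.0 for _ in values]  # constant feature carries no information
    return [(v - mean) / sd for v in values]

def min_max(values):
    """Rescale linearly so the smallest value maps to 0 and the largest to 1."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]

# Hypothetical feature values for three epochs.
feature = [2.0, 4.0, 6.0]
z = z_score(feature)
mm = min_max(feature)
```

Both functions operate on one feature at a time; in a cross-validated setting the mean, standard deviation, minimum and maximum would normally be estimated on the training folds only.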
Fei, Y; Hu, J; Li, W-Q; Wang, W; Zong, G-Q
2017-03-01
Essentials Predicting the occurrence of portosplenomesenteric vein thrombosis (PSMVT) is difficult. We studied 72 patients with acute pancreatitis. Artificial neural networks modeling was more accurate than logistic regression in predicting PSMVT. Additional predictive factors may be incorporated into artificial neural networks. Objective To construct and validate artificial neural networks (ANNs) for predicting the occurrence of portosplenomesenteric venous thrombosis (PSMVT) and compare the predictive ability of the ANNs with that of logistic regression. Methods The ANNs and logistic regression modeling were constructed using simple clinical and laboratory data of 72 acute pancreatitis (AP) patients. The ANNs and logistic modeling were first trained on 48 randomly chosen patients and validated on the remaining 24 patients. The accuracy and the performance characteristics were compared between these two approaches by SPSS17.0 software. Results The training set and validation set did not differ on any of the 11 variables. After training, the back propagation network training error converged to 1 × 10^-20, and it retained excellent pattern recognition ability. When the ANNs model was applied to the validation set, it revealed a sensitivity of 80%, specificity of 85.7%, a positive predictive value of 77.6% and negative predictive value of 90.7%. The accuracy was 83.3%. Differences could be found between ANNs modeling and logistic regression modeling in these parameters (10.0% [95% CI, -14.3 to 34.3%], 14.3% [95% CI, -8.6 to 37.2%], 15.7% [95% CI, -9.9 to 41.3%], 11.8% [95% CI, -8.2 to 31.8%], 22.6% [95% CI, -1.9 to 47.1%], respectively). When ANNs modeling was used to identify PSMVT, the area under receiver operating characteristic curve was 0.849 (95% CI, 0.807-0.901), which demonstrated better overall properties than logistic regression modeling (AUC = 0.716) (95% CI, 0.679-0.761).
Conclusions ANNs modeling was a more accurate tool than logistic regression in predicting the occurrence of PSMVT following AP. More clinical factors or biomarkers may be incorporated into ANNs modeling to improve its predictive ability.
Ai, Zi-Sheng; Gao, You-Shui; Sun, Yuan; Liu, Yue; Zhang, Chang-Qing; Jiang, Cheng-Hua
2013-03-01
Risk factors for femoral neck fracture-induced avascular necrosis of the femoral head have not been elucidated clearly in middle-aged and elderly patients. Moreover, the high incidence of screw removal in China and its effect on the fate of the involved femoral head require statistical methods to reflect their intrinsic relationship. Ninety-nine patients older than 45 years with femoral neck fracture were treated by internal fixation between May 1999 and April 2004. Descriptive analysis, interaction analysis between associated factors, single factor logistic regression, multivariate logistic regression, and detailed interaction analysis were employed to explore potential relationships among associated factors. Avascular necrosis of the femoral head was found in 15 cases (15.2 %). Age × the status of implants (removal vs. maintenance) and gender × the timing of reduction were interactive according to two-factor interactive analysis. Age, the displacement of fractures, the quality of reduction, and the status of implants were found to be significant factors in single factor logistic regression analysis. Age, age × the status of implants, and the quality of reduction were found to be significant factors in multivariate logistic regression analysis. In fine interaction analysis after multivariate logistic regression analysis, implant removal was the most important risk factor for avascular necrosis in 56-to-85-year-old patients, with a risk ratio of 26.00 (95 % CI = 3.076-219.747). The middle-aged and elderly have less incidence of avascular necrosis of the femoral head following femoral neck fractures treated by cannulated screws. The removal of cannulated screws can induce a significantly high incidence of avascular necrosis of the femoral head in elderly patients, while a high-quality reduction is helpful to reduce avascular necrosis.
Zhou, Jinzhe; Zhou, Yanbing; Cao, Shougen; Li, Shikuan; Wang, Hao; Niu, Zhaojian; Chen, Dong; Wang, Dongsheng; Lv, Liang; Zhang, Jian; Li, Yu; Jiao, Xuelong; Tan, Xiaojie; Zhang, Jianli; Wang, Haibo; Zhang, Bingyuan; Lu, Yun; Sun, Zhenqing
2016-01-01
Surgical complications are commonly reported, but few reports describe their severity or estimate their risk factors, and those that do lack specificity. We retrospectively analyzed data on 2795 gastric cancer patients who underwent surgery at the Affiliated Hospital of Qingdao University between June 2007 and June 2012, and established a multivariate logistic regression model to identify risk factors for postoperative complications graded according to the Clavien-Dindo classification system. Twenty-four of 86 variables were statistically significant in univariate logistic regression analysis; 11 significant variables were entered into the multivariate analysis and used to produce the risk model. Liver cirrhosis, diabetes mellitus, Child classification, invasion of neighboring organs, combined resection, intraoperative transfusion, Billroth II anastomosis of reconstruction, malnutrition, surgical volume of surgeons, operating time and age were independent risk factors for postoperative complications after gastrectomy. Based on the logistic regression equation, p = exp(∑BiXi) / (1 + exp(∑BiXi)), a multivariate logistic regression predictive model that calculates the risk of postoperative morbidity was developed: p = 1 / (1 + exp(4.810 - 1.287X1 - 0.504X2 - 0.500X3 - 0.474X4 - 0.405X5 - 0.318X6 - 0.316X7 - 0.305X8 - 0.278X9 - 0.255X10 - 0.138X11)). The accuracy, sensitivity and specificity of the model in predicting postoperative complications were 86.7%, 76.2% and 88.6%, respectively. This risk model, based on the Clavien-Dindo system for grading the severity of complications and on logistic regression analysis, can predict severe morbidity specific to an individual patient's risk factors, estimate patients' risks and benefits of gastric surgery as an accurate decision-making tool, and may serve as a template for the development of risk models for other surgical groups.
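The reported risk equation can be evaluated directly. The sketch below uses the intercept and coefficients exactly as printed in the abstract; everything else is an assumption for illustration. In particular, the abstract does not say which predictor corresponds to which of X1 to X11 or how each is coded, so the 0/1 coding used in the example inputs is hypothetical, as is the function name.

```python
import math

# Intercept and coefficients as printed in the abstract:
# p = 1 / (1 + exp(4.810 - 1.287*X1 - ... - 0.138*X11))
INTERCEPT = 4.810
COEFFS = [1.287, 0.504, 0.500, 0.474, 0.405, 0.318,
          0.316, 0.305, 0.278, 0.255, 0.138]

def complication_risk(x):
    """Return the predicted probability for a list of 11 predictor values."""
    if len(x) != len(COEFFS):
        raise ValueError("expected 11 predictor values")
    exponent = INTERCEPT - sum(b * xi for b, xi in zip(COEFFS, x))
    return 1.0 / (1.0 + math.exp(exponent))

# Hypothetical inputs, assuming 0/1 indicator coding for every predictor.
baseline = complication_risk([0] * 11)        # no risk factor present
with_x1 = complication_risk([1] + [0] * 10)   # only X1 present
```

Because every coefficient enters the exponent with a negative sign, any increase in a predictor raises the predicted probability, which is consistent with all eleven variables being described as risk factors.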
Rank-Optimized Logistic Matrix Regression toward Improved Matrix Data Classification.
Zhang, Jianguang; Jiang, Jianmin
2018-02-01
While existing logistic regression suffers from overfitting and often fails in considering structural information, we propose a novel matrix-based logistic regression to overcome the weakness. In the proposed method, 2D matrices are directly used to learn two groups of parameter vectors along each dimension without vectorization, which allows the proposed method to fully exploit the underlying structural information embedded inside the 2D matrices. Further, we add a joint [Formula: see text]-norm on two parameter matrices, which are organized by aligning each group of parameter vectors in columns. This added co-regularization term has two roles-enhancing the effect of regularization and optimizing the rank during the learning process. With our proposed fast iterative solution, we carried out extensive experiments. The results show that in comparison to both the traditional tensor-based methods and the vector-based regression methods, our proposed solution achieves better performance for matrix data classifications.
Detecting DIF in Polytomous Items Using MACS, IRT and Ordinal Logistic Regression
ERIC Educational Resources Information Center
Elosua, Paula; Wells, Craig
2013-01-01
The purpose of the present study was to compare the Type I error rate and power of two model-based procedures, the mean and covariance structure model (MACS) and the item response theory (IRT), and an observed-score based procedure, ordinal logistic regression, for detecting differential item functioning (DIF) in polytomous items. A simulation…
ERIC Educational Resources Information Center
Rudner, Lawrence
2016-01-01
In the machine learning literature, it is commonly accepted as fact that as calibration sample sizes increase, Naïve Bayes classifiers initially outperform Logistic Regression classifiers in terms of classification accuracy. Applied to subtests from an on-line final examination and from a highly regarded certification examination, this study shows…
ERIC Educational Resources Information Center
Fan, Xitao; Wang, Lin
The Monte Carlo study compared the performance of predictive discriminant analysis (PDA) and that of logistic regression (LR) for the two-group classification problem. Prior probabilities were used for classification, but the cost of misclassification was assumed to be equal. The study used a fully crossed three-factor experimental design (with…
ERIC Educational Resources Information Center
Nguyen, Phuong L.
2006-01-01
This study examines the effects of parental SES, school quality, and community factors on children's enrollment and achievement in rural areas in Viet Nam, using logistic regression and ordered logistic regression. Multivariate analysis reveals significant differences in educational enrollment and outcomes by level of household expenditures and…
School Exits in the Milwaukee Parental Choice Program: Evidence of a Marketplace?
ERIC Educational Resources Information Center
Ford, Michael
2011-01-01
This article examines whether the large number of school exits from the Milwaukee school voucher program is evidence of a marketplace. Two logistic regression and multinomial logistic regression models tested the relation between the inability to draw large numbers of voucher students and the ability for a private school to remain viable. Data on…
Hierarchical Bayesian Logistic Regression to forecast metabolic control in type 2 DM patients.
Dagliati, Arianna; Malovini, Alberto; Decata, Pasquale; Cogni, Giulia; Teliti, Marsida; Sacchi, Lucia; Cerra, Carlo; Chiovato, Luca; Bellazzi, Riccardo
2016-01-01
In this work we present our efforts in building a model able to forecast patients' changes in clinical conditions when repeated measurements are available. In this case the available risk calculators are typically not applicable. We propose a Hierarchical Bayesian Logistic Regression model, which allows taking into account individual and population variability in model parameters estimate. The model is used to predict metabolic control and its variation in type 2 diabetes mellitus. In particular we have analyzed a population of more than 1000 Italian type 2 diabetic patients, collected within the European project Mosaic. The results obtained in terms of Matthews Correlation Coefficient are significantly better than the ones gathered with standard logistic regression model, based on data pooling.
Li, Ji; Gray, B.R.; Bates, D.M.
2008-01-01
Partitioning the variance of a response by design levels is challenging for binomial and other discrete outcomes. Goldstein (2003) proposed four definitions for variance partitioning coefficients (VPC) under a two-level logistic regression model. In this study, we explicitly derived formulae for multi-level logistic regression models and subsequently studied the distributional properties of the calculated VPCs. Using simulations and a vegetation dataset, we demonstrated associations between different VPC definitions, the importance of methods for estimating VPCs (by comparing VPCs obtained using Laplace and penalized quasilikelihood methods), and bivariate dependence between VPCs calculated at different levels. Such an empirical study lends immediate support to wider applications of VPC in scientific data analysis.
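As an illustration of what a VPC looks like for a two-level logistic model, the sketch below implements the latent-variable formulation, in which the level-1 residual variance is fixed at that of the standard logistic distribution, pi^2/3. This is one commonly used definition; the abstract does not list its four definitions or its multi-level formulae, so the function name and the example variance are assumptions.

```python
import math

def vpc_latent(sigma2_u):
    """Latent-variable VPC for a two-level random-intercept logistic model.

    sigma2_u: variance of the level-2 random intercept (on the logit scale).
    The level-1 variance is taken as pi^2 / 3 (standard logistic distribution).
    """
    if sigma2_u < 0:
        raise ValueError("variance must be non-negative")
    return sigma2_u / (sigma2_u + math.pi ** 2 / 3.0)

# Hypothetical random-intercept variance.
vpc = vpc_latent(1.0)
```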
Model building strategy for logistic regression: purposeful selection.
Zhang, Zhongheng
2016-03-01
Logistic regression is one of the most commonly used models to account for confounders in the medical literature. The article introduces how to perform the purposeful selection model building strategy with R. I stress the use of the likelihood ratio test to see whether deleting a variable will have a significant impact on model fit. A deleted variable should also be checked for whether it is an important adjustment of the remaining covariates. Interactions should be checked to disentangle complex relationships between covariates and their synergistic effect on the response variable. The model should be checked for goodness-of-fit (GOF), that is, how well the fitted model reflects the real data. The Hosmer-Lemeshow GOF test is the most widely used for logistic regression models.
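The likelihood ratio test mentioned above compares the maximized log-likelihoods of two nested models. The article itself works in R; the sketch below is a language-neutral illustration in Python, restricted to dropping a single parameter so that the chi-square tail probability can be written with the standard library alone. The function name and the log-likelihood values are hypothetical.

```python
import math

def likelihood_ratio_test_1df(loglik_full, loglik_reduced):
    """LRT for dropping ONE parameter from a nested logistic model.

    Returns (G, p_value). For 1 degree of freedom the chi-square
    survival function equals erfc(sqrt(G / 2)).
    """
    g = 2.0 * (loglik_full - loglik_reduced)
    g = max(g, 0.0)  # guard against tiny negative values from rounding
    p_value = math.erfc(math.sqrt(g / 2.0))
    return g, p_value

# Hypothetical log-likelihoods, for illustration only.
g_stat, p = likelihood_ratio_test_1df(loglik_full=-120.0, loglik_reduced=-123.0)
```

A small p-value suggests that the dropped variable contributed to model fit and should be retained; dropping several parameters at once requires the chi-square distribution with the corresponding degrees of freedom, which this sketch does not cover.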
Color vision impairment in multiple sclerosis points to retinal ganglion cell damage.
Lampert, E J; Andorra, M; Torres-Torres, R; Ortiz-Pérez, S; Llufriu, S; Sepúlveda, M; Sola, N; Saiz, A; Sánchez-Dalmau, B; Villoslada, P; Martínez-Lapiscina, Elena H
2015-11-01
Multiple Sclerosis (MS) results in color vision impairment regardless of optic neuritis (ON). The exact location of injury remains undefined. The objective of this study is to identify the region leading to dyschromatopsia in MS patients' NON-eyes. We evaluated Spearman correlations between color vision and measures of different regions in the afferent visual pathway in 106 MS patients. Regions with significant correlations were included in logistic regression models to assess their independent role in dyschromatopsia. We evaluated color vision with Hardy-Rand-Rittler plates and retinal damage using Optical Coherence Tomography. We ran SIENAX to measure Normalized Brain Parenchymal Volume (NBPV), FIRST for thalamus volume and Freesurfer for visual cortex areas. We found moderate, significant correlations between color vision and macular retinal nerve fiber layer (rho = 0.289, p = 0.003), ganglion cell complex (GCC = GCIP) (rho = 0.353, p < 0.001), thalamus (rho = 0.361, p < 0.001), and lesion volume within the optic radiations (rho = -0.230, p = 0.030). Only GCC thickness remained significant (p = 0.023) in the logistic regression model. In the final model including lesion load and NBPV as markers of diffuse neuroaxonal damage, GCC remained associated with dyschromatopsia [OR = 0.88 95 % CI (0.80-0.97) p = 0.016]. This association remained significant when we also added sex, age, and disease duration as covariates in the regression model. Dyschromatopsia in NON-eyes is due to damage of retinal ganglion cells (RGC) in MS. Color vision can serve as a marker of RGC damage in MS.
CASTELO, Paula Midori; GAVIÃO, Maria Beatriz Duarte; PEREIRA, Luciano José; BONJARDIM, Leonardo Rigoldi
2010-01-01
Objective The maintenance of normal conditions of the masticatory function is determinant for the correct growth and development of its structures. Thus, the aims of this study were to evaluate the influence of sucking habits on the presence of crossbite and its relationship with maximal bite force, facial morphology and body variables in 67 children of both genders (3.5-7 years) with primary or early mixed dentition. Material and methods The children were divided in four groups: primary-normocclusion (PN, n=19), primary-crossbite (PC, n=19), mixed-normocclusion (MN, n=13), and mixed-crossbite (MC, n=16). Bite force was measured with a pressurized tube, and facial morphology was determined by standardized frontal photographs: AFH (anterior face height) and BFW (bizygomatic facial width). Results It was observed that MC group showed lower bite force than MN, and AFH/ BFW was significantly smaller in PN than PC (t-test). Weight and height were only significantly correlated with bite force in PC group (Pearson’s correlation test). In the primary dentition, AFH/BFW and breast-feeding (at least six months) were positive and negatively associated with crossbite, respectively (multiple logistic regression). In the mixed dentition, breastfeeding and bite force showed negative associations with crossbite (univariate regression), while nonnutritive sucking (up to 3 years) associated significantly with crossbite in all groups (multiple logistic regression). Conclusions In the studied sample, sucking habits played an important role in the etiology of crossbite, which was associated with lower bite force and long-face tendency. PMID:20485925
NASA Astrophysics Data System (ADS)
Ceppi, C.; Mancini, F.; Ritrovato, G.
2009-04-01
This study aims at landslide susceptibility mapping within an area of the Daunia (Apulian Apennines, Italy) using a multivariate statistical method and data manipulation in a Geographical Information System (GIS) environment. Among the variety of existing statistical data analysis techniques, logistic regression was chosen to produce a susceptibility map over an area where small settlements are historically threatened by landslide phenomena. Logistic regression finds the best fit between the presence or absence of landslides (dependent variable) and the set of independent variables on the basis of a maximum likelihood criterion, leading to the estimation of regression coefficients. The reliability of such analysis is therefore due to the ability to quantify the proneness to landslide occurrences by the probability level produced by the analysis. The inventory of dependent and independent variables was managed in a GIS, where geometric properties and attributes were translated into raster cells in order to proceed with the logistic regression by means of the SPSS (Statistical Package for the Social Sciences) package. A landslide inventory was used to produce the binary dependent variable, whereas the set of independent variables comprised slope, aspect, elevation, curvature, drained area, lithology and land use after their reduction to dummy variables. The effect of the independent parameters on landslide occurrence was assessed by the corresponding coefficients in the logistic regression function, highlighting a major role played by the land use variable in determining the occurrence and distribution of phenomena. Once the outcomes of the logistic regression are determined, the data are re-introduced into the GIS to produce a map reporting the proneness to landslides as a predicted level of probability.
To validate the results and the regression model, a cell-by-cell comparison between the susceptibility map and the initial inventory of landslide events was performed, and agreement at the 75% level was achieved.
Finding Bayesian Optimal Designs for Nonlinear Models: A Semidefinite Programming-Based Approach.
Duarte, Belmiro P M; Wong, Weng Kee
2015-08-01
This paper uses semidefinite programming (SDP) to construct Bayesian optimal design for nonlinear regression models. The setup here extends the formulation of the optimal designs problem as an SDP problem from linear to nonlinear models. Gaussian quadrature formulas (GQF) are used to compute the expectation in the Bayesian design criterion, such as D-, A- or E-optimality. As an illustrative example, we demonstrate the approach using the power-logistic model and compare results in the literature. Additionally, we investigate how the optimal design is impacted by different discretising schemes for the design space, different amounts of uncertainty in the parameter values, different choices of GQF and different prior distributions for the vector of model parameters, including normal priors with and without correlated components. Further applications to find Bayesian D-optimal designs with two regressors for a logistic model and a two-variable generalised linear model with a gamma distributed response are discussed, and some limitations of our approach are noted.
Diniz, Maria de Fátima Haueisen Sander; Beleigoli, Alline Maria Rezende; Ribeiro, Antônio Luiz P.; Vidigal, Pedro Guatimosim; Bensenor, Isabela M.; Lotufo, Paulo A.; Duncan, Bruce B.; Schmidt, Maria Inês; Barreto, Sandhi Maria
2016-01-01
Abstract The primary aim of this study was to evaluate metabolically healthy status (MHS) among participants in obesity, overweight, and normal weight groups and characteristics associated with this phenotype using baseline data of Brazilian Longitudinal Study of Adult Health (ELSA-Brasil). The secondary aim was to investigate agreement among 4 different MHS criteria. This cross-sectional study included 14,545 participants aged 35 to 74 years with a small majority (54.1%) being women. Of all participants, 22.7% (n = 3298) were obese, 40.8% (n = 5934) were overweight, and 37.5% (n = 5313) were of normal weight. Socio-demographic, behavioral, and anthropometric factors related to MHS were ascertained. Logistic regression models estimated the odds of associations. We used 4 different criteria separately and in combination to define MHS: the National Health and Nutrition Examination Survey (NHANES), the National Cholesterol Education Program (NCEP-ATPIII), the International Diabetes Federation (IDF) and comorbidities, and the agreement between them were evaluated by Cohen-kappa coefficient. MHS was present among 12.0% (n = 396) of obese, 25.5% (n = 1514) of overweight, and 48.6% (n = 2582) of normal weight participants according to the combination of the 4 criteria. The agreement between all the 4 MHS criteria was strong (kappa 0.73 P < 0.001). In final logistic models, MHS was associated with lower age, female sex, lower body mass index (BMI), and weight change from age 20 within all BMI categories. This study showed that, despite differences in prevalence among the 4 criteria, MHS was associated with common characteristics at every BMI category. PMID:27399079
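Agreement between two classification criteria can be quantified with Cohen's kappa, which corrects the observed agreement for the agreement expected by chance. The sketch below is a pairwise illustration only: the abstract does not state how a single kappa was obtained across all four criteria, and the function name and example labels are hypothetical.

```python
def cohen_kappa(a, b):
    """Cohen's kappa for two equal-length lists of categorical labels."""
    if len(a) != len(b) or not a:
        raise ValueError("inputs must be non-empty and of equal length")
    n = len(a)
    labels = set(a) | set(b)
    # Proportion of items on which the two criteria agree.
    observed = sum(1 for x, y in zip(a, b) if x == y) / n
    # Agreement expected by chance from the marginal label frequencies.
    expected = sum((a.count(l) / n) * (b.count(l) / n) for l in labels)
    if expected == 1.0:
        return 1.0  # both criteria used a single identical label
    return (observed - expected) / (1.0 - expected)

# Hypothetical classifications (1 = metabolically healthy) by two criteria.
crit_a = [1, 1, 0, 0, 1, 0, 1, 0, 0, 0]
crit_b = [1, 1, 0, 0, 0, 0, 1, 0, 0, 1]
kappa = cohen_kappa(crit_a, crit_b)
```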
Determination of riverbank erosion probability using Locally Weighted Logistic Regression
NASA Astrophysics Data System (ADS)
Ioannidou, Elena; Flori, Aikaterini; Varouchakis, Emmanouil A.; Giannakis, Georgios; Vozinaki, Anthi Eirini K.; Karatzas, George P.; Nikolaidis, Nikolaos
2015-04-01
Riverbank erosion is a natural geomorphologic process that affects the fluvial environment. The most important issue concerning riverbank erosion is the identification of the vulnerable locations. An alternative to the usual hydrodynamic models to predict vulnerable locations is to quantify the probability of erosion occurrence. This can be achieved by identifying the underlying relations between riverbank erosion and the geomorphological or hydrological variables that prevent or stimulate erosion. Thus, riverbank erosion can be determined by a regression model using independent variables that are considered to affect the erosion process. The impact of such variables may vary spatially, therefore, a non-stationary regression model is preferred instead of a stationary equivalent. Locally Weighted Regression (LWR) is proposed as a suitable choice. This method can be extended to predict the binary presence or absence of erosion based on a series of independent local variables by using the logistic regression model. It is referred to as Locally Weighted Logistic Regression (LWLR). Logistic regression is a type of regression analysis used for predicting the outcome of a categorical dependent variable (e.g. binary response) based on one or more predictor variables. The method can be combined with LWR to assign weights to local independent variables of the dependent one. LWR allows model parameters to vary over space in order to reflect spatial heterogeneity. The probabilities of the possible outcomes are modelled as a function of the independent variables using a logistic function. Logistic regression measures the relationship between a categorical dependent variable and, usually, one or several continuous independent variables by converting the dependent variable to probability scores. Then, a logistic regression is formed, which predicts success or failure of a given binary variable (e.g. erosion presence or absence) for any value of the independent variables. 
The erosion occurrence probability can be calculated in conjunction with the model deviance regarding the independent variables tested. The most straightforward measure for goodness of fit is the G statistic. It is a simple and effective way to study and evaluate the Logistic Regression model efficiency and the reliability of each independent variable. The developed statistical model is applied to the Koiliaris River Basin on the island of Crete, Greece. Two datasets of river bank slope, river cross-section width and indications of erosion were available for the analysis (12 and 8 locations). Two different types of spatial dependence functions, exponential and tricubic, were examined to determine the local spatial dependence of the independent variables at the measurement locations. The results show a significant improvement when the tricubic function is applied as the erosion probability is accurately predicted at all eight validation locations. Results for the model deviance show that cross-section width is more important than bank slope in the estimation of erosion probability along the Koiliaris riverbanks. The proposed statistical model is a useful tool that quantifies the erosion probability along the riverbanks and can be used to assist managing erosion and flooding events. Acknowledgements This work is part of an on-going THALES project (CYBERSENSORS - High Frequency Monitoring System for Integrated Water Resources Management of Rivers). The project has been co-financed by the European Union (European Social Fund - ESF) and Greek national funds through the Operational Program "Education and Lifelong Learning" of the National Strategic Reference Framework (NSRF) - Research Funding Program: THALES. Investing in knowledge society through the European Social Fund.
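The two ingredients of a locally weighted logistic regression, a distance-based weight for each observation and a weighted Bernoulli log-likelihood, can be sketched as follows. The kernel formulas are the common textbook forms of the tricube and exponential functions; the abstract does not give the exact expressions, bandwidths or fitting procedure used in the study, so all function names, the bandwidth `h` and the example distances are assumptions.

```python
import math

def tricube_weight(d, h):
    """Tricube kernel: (1 - (d/h)^3)^3 inside the bandwidth h, 0 outside."""
    u = abs(d) / h
    return (1.0 - u ** 3) ** 3 if u < 1.0 else 0.0

def exponential_weight(d, h):
    """Exponential decay kernel: exp(-d/h)."""
    return math.exp(-abs(d) / h)

def logistic(z):
    """Logistic function mapping a linear predictor to a probability."""
    return 1.0 / (1.0 + math.exp(-z))

def weighted_log_likelihood(beta, xs, ys, weights):
    """Locally weighted Bernoulli log-likelihood for a single target site.

    beta = (b0, b1, ...); xs = list of predictor tuples; ys = 0/1 outcomes.
    """
    total = 0.0
    for x, y, w in zip(xs, ys, weights):
        z = beta[0] + sum(b * xi for b, xi in zip(beta[1:], x))
        p = logistic(z)
        total += w * (y * math.log(p) + (1 - y) * math.log(1.0 - p))
    return total

# Hypothetical distances from a target site to three observations.
distances = [0.0, 1.0, 3.0]
w_tri = [tricube_weight(d, h=2.0) for d in distances]
w_exp = [exponential_weight(d, h=2.0) for d in distances]
```

Maximizing `weighted_log_likelihood` over `beta` separately for each target location yields coefficients that vary over space. The tricube kernel gives observations beyond the bandwidth exactly zero weight, whereas the exponential kernel never fully discards them.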
NASA Astrophysics Data System (ADS)
Yilmaz, Işık
2009-06-01
The purpose of this study is to compare the landslide susceptibility mapping methods of frequency ratio (FR), logistic regression and artificial neural networks (ANN) applied in the Kat County (Tokat, Turkey). A digital elevation model (DEM) was first constructed using GIS software. Landslide-related factors such as geology, faults, drainage system, topographical elevation, slope angle, slope aspect, topographic wetness index (TWI) and stream power index (SPI) were used in the landslide susceptibility analyses. Landslide susceptibility maps were produced from the frequency ratio, logistic regression and neural network models, and they were then compared by means of their validations. The comparison of the landslide susceptibility maps with the known landslide locations yielded high accuracies for all three models. Although the respective area under curve (AUC) values of 0.826, 0.842 and 0.852 for frequency ratio, logistic regression and artificial neural networks showed that the map obtained from the ANN model is more accurate than those of the other models, the accuracies of all models can be considered relatively similar. The results obtained in this study also showed that the frequency ratio model can be used as a simple tool in the assessment of landslide susceptibility when a sufficient amount of data is available. The input process, calculations and output process are very simple and can be readily understood in the frequency ratio model; however, logistic regression and neural networks require the conversion of data to ASCII or other formats. Moreover, it is also very hard to process the large amount of data in the statistical package.
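The AUC values used to compare the three maps can be computed from predicted susceptibility scores and known landslide locations. The sketch below uses the pairwise (Mann-Whitney) definition of the area under the ROC curve; it is a generic illustration rather than the study's validation procedure, and the function name and the six example cells are hypothetical.

```python
def auc_from_scores(scores, labels):
    """Area under the ROC curve via the Mann-Whitney pairwise comparison.

    scores: predicted susceptibility values; labels: 1 = landslide, 0 = none.
    Ties between a positive and a negative count as half a win.
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative cell")
    wins = 0.0
    for p in pos:
        for q in neg:
            if p > q:
                wins += 1.0
            elif p == q:
                wins += 0.5
    return wins / (len(pos) * len(neg))

# Hypothetical susceptibility scores for six map cells.
scores = [0.9, 0.8, 0.35, 0.6, 0.3, 0.1]
labels = [1, 1, 1, 0, 0, 0]
auc = auc_from_scores(scores, labels)
```

The pairwise loop is quadratic in the number of cells, which is fine for a small illustration; a rank-based formulation would be preferable for a full raster.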
ERIC Educational Resources Information Center
Schumacher, Phyllis; Olinsky, Alan; Quinn, John; Smith, Richard
2010-01-01
The authors extended previous research by 2 of the authors who conducted a study designed to predict the successful completion of students enrolled in an actuarial program. They used logistic regression to determine the probability of an actuarial student graduating in the major or dropping out. They compared the results of this study with those…
Carolyn B. Meyer; Sherri L. Miller; C. John Ralph
2004-01-01
The scale at which habitat variables are measured affects the accuracy of resource selection functions in predicting animal use of sites. We used logistic regression models for a wide-ranging species, the marbled murrelet, (Brachyramphus marmoratus) in a large region in California to address how much changing the spatial or temporal scale of...
ERIC Educational Resources Information Center
Monahan, Patrick O.; McHorney, Colleen A.; Stump, Timothy E.; Perkins, Anthony J.
2007-01-01
Previous methodological and applied studies that used binary logistic regression (LR) for detection of differential item functioning (DIF) in dichotomously scored items either did not report an effect size or did not employ several useful measures of DIF magnitude derived from the LR model. Equations are provided for these effect size indices.…
ERIC Educational Resources Information Center
Magis, David; Raiche, Gilles; Beland, Sebastien; Gerard, Paul
2011-01-01
We present an extension of the logistic regression procedure to identify dichotomous differential item functioning (DIF) in the presence of more than two groups of respondents. Starting from the usual framework of a single focal group, we propose a general approach to estimate the item response functions in each group and to test for the presence…
Risk Factors of Falls in Community-Dwelling Older Adults: Logistic Regression Tree Analysis
ERIC Educational Resources Information Center
Yamashita, Takashi; Noe, Douglas A.; Bailer, A. John
2012-01-01
Purpose of the Study: A novel logistic regression tree-based method was applied to identify fall risk factors and possible interaction effects of those risk factors. Design and Methods: A nationally representative sample of American older adults aged 65 years and older (N = 9,592) in the Health and Retirement Study 2004 and 2006 modules was used.…
ERIC Educational Resources Information Center
Gordovil-Merino, Amalia; Guardia-Olmos, Joan; Pero-Cebollero, Maribel
2012-01-01
In this paper, we used simulations to compare the performance of classical and Bayesian estimations in logistic regression models using small samples. In the performed simulations, conditions were varied, including the type of relationship between independent and dependent variable values (i.e., unrelated and related values), the type of variable…
Ohlmacher, G.C.; Davis, J.C.
2003-01-01
Landslides in the hilly terrain along the Kansas and Missouri rivers in northeastern Kansas have caused millions of dollars in property damage during the last decade. To address this problem, a statistical method called multiple logistic regression has been used to create a landslide-hazard map for Atchison, Kansas, and surrounding areas. Data included digitized geology, slopes, and landslides, manipulated using ArcView GIS. Logistic regression relates predictor variables to the occurrence or nonoccurrence of landslides within geographic cells and uses the relationship to produce a map showing the probability of future landslides, given local slopes and geologic units. Results indicated that slope is the most important variable for estimating landslide hazard in the study area. Geologic units consisting mostly of shale, siltstone, and sandstone were most susceptible to landslides. Soil type and aspect ratio were considered but excluded from the final analysis because these variables did not significantly add to the predictive power of the logistic regression. Soil types were highly correlated with the geologic units, and no significant relationships existed between landslides and slope aspect. © 2003 Elsevier Science B.V. All rights reserved.
A Method for Calculating the Probability of Successfully Completing a Rocket Propulsion Ground Test
NASA Technical Reports Server (NTRS)
Messer, Bradley
2007-01-01
Propulsion ground test facilities face the daily challenge of scheduling multiple customers into limited facility space and successfully completing their propulsion test projects. Over the last decade NASA's propulsion test facilities have performed hundreds of tests, collected thousands of seconds of test data, and exceeded the capabilities of numerous test facility and test article components. A logistic regression mathematical modeling technique has been developed to predict the probability of successfully completing a rocket propulsion test. A logistic regression model is a mathematical modeling approach that can be used to describe the relationship of several independent predictor variables X_1, X_2, ..., X_k to a binary or dichotomous dependent variable Y, where Y can only be one of two possible outcomes, in this case Success or Failure of accomplishing a full duration test. The use of logistic regression modeling is not new; however, modeling propulsion ground test facilities using logistic regression is both a new and unique application of the statistical technique. Results from this type of model provide project managers with insight and confidence into the effectiveness of rocket propulsion ground testing.
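As a rough illustration of the model form described in this abstract, the sketch below computes the probability of a binary outcome from predictors X_1, ..., X_k using the standard logistic function. The function name, coefficients, and predictor values are hypothetical and are not taken from the report.

```python
import math

def success_probability(intercept, coefficients, predictors):
    """Return P(Y = 1 | X) under a logistic regression model."""
    # Linear predictor: b0 + b1*x1 + ... + bk*xk
    linear = intercept + sum(b * x for b, x in zip(coefficients, predictors))
    # The logistic function maps the linear predictor into (0, 1).
    return 1.0 / (1.0 + math.exp(-linear))

# Hypothetical coefficients and predictor values, for illustration only.
p = success_probability(-1.0, [0.8, -0.5], [2.0, 1.0])
print(round(p, 3))
```

In practice the coefficients would be estimated from historical test outcomes rather than chosen by hand.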
Fei, Yang; Hu, Jian; Gao, Kun; Tu, Jianfeng; Li, Wei-Qin; Wang, Wei
2017-06-01
To construct a radial basis function (RBF) artificial neural network (ANN) model to predict the incidence of acute pancreatitis (AP)-induced portal vein thrombosis (PVT). The analysis included 353 patients with AP who had been admitted between January 2011 and December 2015. An RBF ANN model and a logistic regression model were each constructed based on eleven factors relevant to AP. Statistical indexes were used to evaluate the predictive value of the two models. The sensitivity, specificity, positive predictive value, negative predictive value and accuracy of the RBF ANN model for predicting PVT were 73.3%, 91.4%, 68.8%, 93.0% and 87.7%, respectively. There were significant differences between the RBF ANN and logistic regression models in these parameters (P<0.05). In addition, a comparison of the areas under the receiver operating characteristic curves of the two models showed a statistically significant difference (P<0.05). The RBF ANN model is more likely than the logistic regression model to predict the occurrence of PVT induced by AP. D-dimer, AMY, Hct and PT were important predictors of AP-induced PVT. Copyright © 2017 Elsevier Inc. All rights reserved.
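The five indexes reported in this abstract all follow from the four cells of a confusion matrix. A minimal sketch, assuming hypothetical counts (the study's actual cell counts are not given in the abstract):

```python
def diagnostic_metrics(tp, fp, fn, tn):
    """Compute standard diagnostic indexes from confusion-matrix counts."""
    return {
        "sensitivity": tp / (tp + fn),   # true positives among all with the condition
        "specificity": tn / (tn + fp),   # true negatives among all without it
        "ppv": tp / (tp + fp),           # positive predictive value
        "npv": tn / (tn + fn),           # negative predictive value
        "accuracy": (tp + tn) / (tp + fp + fn + tn),
    }

# Hypothetical counts, for illustration only.
metrics = diagnostic_metrics(tp=40, fp=10, fn=10, tn=140)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```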
Uhler, Kristin M; Baca, Rosalinda; Dudas, Emily; Fredrickson, Tammy
2015-01-01
Speech perception measures have long been considered an integral piece of the audiological assessment battery. Currently, a prelinguistic, standardized measure of speech perception is missing in the clinical assessment battery for infants and young toddlers. Such a measure would allow systematic assessment of speech perception abilities of infants as well as the potential to investigate the impact early identification of hearing loss and early fitting of amplification have on the auditory pathways. To investigate the impact of sensation level (SL) on the ability of infants with normal hearing (NH) to discriminate /a-i/ and /ba-da/ and to determine if performance on the two contrasts are significantly different in predicting the discrimination criterion. The design was based on a survival analysis model for event occurrence and a repeated measures logistic model for binary outcomes. The outcome for survival analysis was the minimum SL for criterion and the outcome for the logistic regression model was the presence/absence of achieving the criterion. Criterion achievement was designated when an infant's proportion correct score was >0.75 on the discrimination performance task. Twenty-two infants with NH sensitivity participated in this study. There were 9 males and 13 females, aged 6-14 mo. Testing took place over two to three sessions. The first session consisted of a hearing test, threshold assessment of the two speech sounds (/a/ and /i/), and if time and attention allowed, visual reinforcement infant speech discrimination (VRISD). The second session consisted of VRISD assessment for the two test contrasts (/a-i/ and /ba-da/). The presentation level started at 50 dBA. If the infant was unable to successfully achieve criterion (>0.75) at 50 dBA, the presentation level was increased to 70 dBA followed by 60 dBA. Data examination included an event analysis, which provided the probability of criterion distribution across SL. 
The second stage of the analysis was a repeated measures logistic regression where SL and contrast were used to predict the likelihood of speech discrimination criterion. Infants were able to reach criterion for the /a-i/ contrast at statistically lower SLs when compared to /ba-da/. There were six infants who never reached criterion for /ba-da/ and one never reached criterion for /a-i/. The conditional probability of not reaching criterion by 70 dB SL was 0% for /a-i/ and 21% for /ba-da/. The predictive logistic regression model showed that children were more likely to discriminate the /a-i/ even when controlling for SL. Nearly all normal-hearing infants can demonstrate discrimination criterion of a vowel contrast at 60 dB SL, while a level of ≥70 dB SL may be needed to allow all infants to demonstrate discrimination criterion of a difficult consonant contrast. American Academy of Audiology.
Wang, Shuang; Jiang, Xiaoqian; Wu, Yuan; Cui, Lijuan; Cheng, Samuel; Ohno-Machado, Lucila
2013-01-01
We developed an EXpectation Propagation LOgistic REgRession (EXPLORER) model for distributed privacy-preserving online learning. The proposed framework provides a high level guarantee for protecting sensitive information, since the information exchanged between the server and the client is the encrypted posterior distribution of coefficients. Through experimental results, EXPLORER shows the same performance (e.g., discrimination, calibration, feature selection etc.) as the traditional frequentist Logistic Regression model, but provides more flexibility in model updating. That is, EXPLORER can be updated one point at a time rather than having to retrain the entire data set when new observations are recorded. The proposed EXPLORER supports asynchronized communication, which relieves the participants from coordinating with one another, and prevents service breakdown from the absence of participants or interrupted communications. PMID:23562651
Dietary consumption patterns and laryngeal cancer risk.
Vlastarakos, Petros V; Vassileiou, Andrianna; Delicha, Evie; Kikidis, Dimitrios; Protopapas, Dimosthenis; Nikolopoulos, Thomas P
2016-06-01
We conducted a case-control study to investigate the effect of diet on laryngeal carcinogenesis. Our study population was made up of 140 participants: 70 patients with laryngeal cancer (LC) and 70 controls with a non-neoplastic condition that was unrelated to diet, smoking, or alcohol. A food-frequency questionnaire determined the mean consumption of 113 different items during the 3 years prior to symptom onset. Total energy intake and cooking mode were also noted. The relative risk, odds ratio (OR), and 95% confidence interval (CI) were estimated by multiple logistic regression analysis. We found that the total energy intake was significantly higher in the LC group (p < 0.001), and that the difference remained statistically significant after logistic regression analysis (p < 0.001; OR: 118.70). Notably, meat consumption was higher in the LC group (p < 0.001), and the difference remained significant after logistic regression analysis (p = 0.029; OR: 1.16). LC patients also consumed significantly more fried food (p = 0.036); this difference also remained significant in the logistic regression model (p = 0.026; OR: 5.45). The LC group also consumed significantly more seafood (p = 0.012); the difference persisted after logistic regression analysis (p = 0.009; OR: 2.48), with the consumption of shrimp proving detrimental (p = 0.049; OR: 2.18). Finally, the intake of zinc was significantly higher in the LC group before and after logistic regression analysis (p = 0.034 and p = 0.011; OR: 30.15, respectively). Cereal consumption (including pastas) was also higher among the LC patients (p = 0.043), with logistic regression analysis showing that their negative effect was possibly associated with the sauces and dressings that traditionally accompany pasta dishes (p = 0.006; OR: 4.78). 
Conversely, a higher consumption of dairy products was found in controls (p < 0.05); logistic regression analysis showed that calcium appeared to be protective at the micronutrient level (p < 0.001; OR: 0.27). We found no difference in the overall consumption of fruits and vegetables between the LC patients and controls; however, the LC patients did have a greater consumption of cooked tomatoes and cooked root vegetables (p = 0.039 for both), and the controls had more consumption of leeks (p = 0.042) and, among controls younger than 65 years, cooked beans (p = 0.037). Lemon (p = 0.037), squeezed fruit juice (p = 0.032), and watermelon (p = 0.018) were also more frequently consumed by the controls. Other differences at the micronutrient level included greater consumption by the LC patients of retinol (p = 0.044), polyunsaturated fats (p = 0.041), and linoleic acid (p = 0.008); LC patients younger than 65 years also had greater intake of riboflavin (p = 0.045). We conclude that the differences in dietary consumption patterns between LC patients and controls indicate a possible role for lifestyle modifications involving nutritional factors as a means of decreasing the risk of laryngeal cancer.
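For readers unfamiliar with how an odds ratio and its 95% confidence interval arise in a case-control design, the sketch below computes an unadjusted OR with a Wald interval from a 2x2 table. This is a simplification: the study above reports ORs from multiple logistic regression, which adjust for other covariates. The function name and all counts are hypothetical.

```python
import math

def odds_ratio_wald_ci(exposed_cases, unexposed_cases,
                       exposed_controls, unexposed_controls, z=1.96):
    """Unadjusted odds ratio with a Wald confidence interval (z=1.96 for 95%)."""
    odds_ratio = (exposed_cases * unexposed_controls) / (
        unexposed_cases * exposed_controls)
    # Standard error of log(OR) from the four cell counts.
    se_log_or = math.sqrt(1 / exposed_cases + 1 / unexposed_cases
                          + 1 / exposed_controls + 1 / unexposed_controls)
    lower = math.exp(math.log(odds_ratio) - z * se_log_or)
    upper = math.exp(math.log(odds_ratio) + z * se_log_or)
    return odds_ratio, lower, upper

# Hypothetical counts for 70 cases and 70 controls, for illustration only.
or_value, ci_low, ci_high = odds_ratio_wald_ci(30, 40, 15, 55)
print(f"OR = {or_value:.2f}, 95% CI = ({ci_low:.2f}, {ci_high:.2f})")
```

An OR above 1 with an interval excluding 1 suggests a positive association with the outcome; an OR below 1, as reported for calcium, suggests a protective association.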
A 3-Year Study of Predictive Factors for Positive and Negative Appendicectomies.
Chang, Dwayne T S; Maluda, Melissa; Lee, Lisa; Premaratne, Chandrasiri; Khamhing, Srisongham
2018-03-06
Early and accurate identification or exclusion of acute appendicitis is the key to avoid the morbidity of delayed treatment for true appendicitis or unnecessary appendicectomy, respectively. We aim (i) to identify potential predictive factors for positive and negative appendicectomies; and (ii) to analyse the use of ultrasound scans (US) and computed tomography (CT) scans for acute appendicitis. All appendicectomies that took place at our hospital from the 1st of January 2013 to the 31st of December 2015 were retrospectively recorded. Test results of potential predictive factors of acute appendicitis were recorded. Statistical analysis was performed using Fisher exact test, logistic regression analysis, sensitivity, specificity, and positive and negative predictive values calculation. 208 patients were included in this study. 184 patients had histologically proven acute appendicitis. The other 24 patients had either nonappendicitis pathology or normal appendix. Logistic regression analysis showed statistically significant associations between appendicitis and white cell count, neutrophil count, C-reactive protein, and bilirubin. Neutrophil count was the test with the highest sensitivity and negative predictive values, whereas bilirubin was the test with the highest specificity and positive predictive values (PPV). US and CT scans had high sensitivity and PPV for diagnosing appendicitis. No single test was sufficient to diagnose or exclude acute appendicitis by itself. Combining tests with high sensitivity (abnormal neutrophil count, and US and CT scans) and high specificity (raised bilirubin) may predict acute appendicitis more accurately.
Wang, Jian; Shete, Sanjay
2011-11-01
We recently proposed a bias correction approach to evaluate accurate estimation of the odds ratio (OR) of genetic variants associated with a secondary phenotype, in which the secondary phenotype is associated with the primary disease, based on the original case-control data collected for the purpose of studying the primary disease. As reported in this communication, we further investigated the type I error probabilities and powers of the proposed approach, and compared the results to those obtained from logistic regression analysis (with or without adjustment for the primary disease status). We performed a simulation study based on a frequency-matching case-control study with respect to the secondary phenotype of interest. We examined the empirical distribution of the natural logarithm of the corrected OR obtained from the bias correction approach and found it to be normally distributed under the null hypothesis. On the basis of the simulation study results, we found that the logistic regression approaches that adjust or do not adjust for the primary disease status had low power for detecting secondary phenotype associated variants and highly inflated type I error probabilities, whereas our approach was more powerful for identifying the SNP-secondary phenotype associations and had better-controlled type I error probabilities. © 2011 Wiley Periodicals, Inc.
Agnesi, Roberto; Valentini, Flavio; Fedeli, Ugo; Rylander, Ragnar; Meneghetti, Maurizia; Fadda, Emanuela; Buja, Alessandra; Mastrangelo, Giuseppe
2011-06-01
In a district of Veneto (North-east Italy) where numerous females of childbearing age were occupationally exposed to organic solvents in nearly 400 shoe factories, a case-control study found significant associations between maternal exposures (from occupation and risky behavior) and spontaneous abortion (SAB). Thereafter, a health education campaign was undertaken to increase awareness of risk factors for pregnancy in the population. To evaluate the effects of this campaign, maternal exposures and SAB risks were compared before and after the campaign. Hospital records were collected from a local hospital for SAB cases and age- and residence-matched controls with normal deliveries. Information on solvent exposure, coffee and alcohol consumption, smoking and the use of medication was collected using a questionnaire. Before-and-after differences were tested through a modified Chi-square test and linear and logistic regressions for survey data. Odds ratios (ORs) with 95% confidence intervals (CIs) were estimated using logistic regression models. The consumption of coffee (P = 0.003) and alcohol (P < 0.001) was lower after than before the campaign, controlling for age at pregnancy and level of education. There were no differences in reported solvent exposure or smoking (smokers were few). The previously detected increased risks of SAB in relation to solvent exposure and coffee consumption were no longer present. The results suggest that health education campaigns might reduce harmful maternal exposures and the risk of SAB.
Factors associated with abnormal eating attitudes among Greek adolescents.
Bilali, Aggeliki; Galanis, Petros; Velonakis, Emmanuel; Katostaras, Theofanis
2010-01-01
To estimate the prevalence of abnormal eating attitudes among Greek adolescents and identify possible risk factors associated with these attitudes. Cross-sectional, school-based study. Six randomly selected schools in Patras, southern Greece. The study population consisted of 540 Greek students aged 13-18 years, and the response rate was 97%. The dependent variable was scores on the Eating Attitudes Test-26, with scores ≥ 20 indicating abnormal eating attitudes. Bivariate analysis included independent Student t test, chi-square test, and Fisher's exact test. Multivariate logistic regression analysis was applied for the identification of the predictive factors, which were associated independently with abnormal eating attitudes. A 2-sided P value of less than .05 was considered statistically significant. The prevalence of abnormal eating attitudes was 16.7%. Multivariate logistic regression analysis demonstrated that females, urban residents, and those with a body mass index outside normal range, a perception of being overweight, body dissatisfaction, and a family member on a diet were independently related to abnormal eating attitudes. The results indicate that a proportion of Greek adolescents report abnormal eating attitudes and suggest that multiple factors contribute to the development of these attitudes. These findings are useful for further research into this topic and would be valuable in designing preventive interventions. Copyright 2010 Society for Nutrition Education. Published by Elsevier Inc. All rights reserved.
Meel-van den Abeelen, Aisha S.S.; Simpson, David M.; Wang, Lotte J.Y.; Slump, Cornelis H.; Zhang, Rong; Tarumi, Takashi; Rickards, Caroline A.; Payne, Stephen; Mitsis, Georgios D.; Kostoglou, Kyriaki; Marmarelis, Vasilis; Shin, Dae; Tzeng, Yu-Chieh; Ainslie, Philip N.; Gommer, Erik; Müller, Martin; Dorado, Alexander C.; Smielewski, Peter; Yelicich, Bernardo; Puppo, Corina; Liu, Xiuyun; Czosnyka, Marek; Wang, Cheng-Yen; Novak, Vera; Panerai, Ronney B.; Claassen, Jurgen A.H.R.
2014-01-01
Transfer function analysis (TFA) is a frequently used method to assess dynamic cerebral autoregulation (CA) using spontaneous oscillations in blood pressure (BP) and cerebral blood flow velocity (CBFV). However, controversies and variations exist in how research groups utilise TFA, causing high variability in interpretation. The objective of this study was to evaluate between-centre variability in TFA outcome metrics. 15 centres analysed the same 70 BP and CBFV datasets from healthy subjects (n = 50 rest; n = 20 during hypercapnia); 10 additional datasets were computer-generated. Each centre used their in-house TFA methods; however, certain parameters were specified to reduce a priori between-centre variability. Hypercapnia was used to assess discriminatory performance and synthetic data to evaluate effects of parameter settings. Results were analysed using the Mann–Whitney test and logistic regression. A large non-homogeneous variation was found in TFA outcome metrics between the centres. Logistic regression demonstrated that 11 centres were able to distinguish between normal and impaired CA with an AUC > 0.85. Further analysis identified TFA settings that are associated with large variation in outcome measures. These results indicate the need for standardisation of TFA settings in order to reduce between-centre variability and to allow accurate comparison between studies. Suggestions on optimal signal processing methods are proposed. PMID:24725709
Hu, Lihua; Huang, Xiao; You, Chunjiao; Li, Juxiang; Hong, Kui; Li, Ping; Wu, Yanqing; Wu, Qinhua; Wang, Zengwu; Gao, Runlin; Bao, Huihui; Cheng, Xiaoshu
2017-01-01
Objectives The purpose of this study is to assess the prevalence of overweight/obesity, abdominal obesity and obesity-related risk factors in southern China. Methods A cross-sectional survey of 15,364 participants aged 15 years and older was conducted from November 2013 to August 2014 in Jiangxi Province, China, using questionnaire forms and physical measurements. The physical measurements included body height, weight, waist circumference (WC), body fat percentage (BFP) and visceral adipose index (VAI). Multivariate logistic regression analysis was performed to evaluate the risk factors for overweight/obesity and abdominal obesity. Results The prevalence of overweight was 25.8% (25.9% in males and 25.7% in females), while that of obesity was 7.9% (8.4% in males and 7.6% in females). The prevalence of abdominal obesity was 10.2% (8.6% in males and 11.3% in females). The prevalence of overweight/obesity was 37.1% in urban residents and 30.2% in rural residents, and this difference was significant (P < 0.001). Urban residents had a significantly higher prevalence of abdominal obesity than rural residents (11.6% vs 8.7%, P < 0.001). Among the participants with an underweight/normal body mass index (BMI), 1.3% still had abdominal obesity, 16.1% had a high BFP and 1.0% had a high VAI. Moreover, among obese participants, 9.7% had a low/normal WC, 0.8% had a normal BFP and 15.9% had a normal VAI. Meanwhile, the partial correlation analysis indicated that the correlation coefficients between VAI and BMI, VAI and WC, and BMI and WC were 0.700, 0.666, and 0.721, respectively. A multivariate logistic regression analysis indicated that being female and having a high BFP and a high VAI were significantly associated with an increased risk of overweight/obesity and abdominal obesity. In addition, living in an urban area and older age correlated with overweight/obesity. 
Conclusion This study revealed that obesity and abdominal obesity, which differed by gender and age, are epidemic in southern China. Moreover, there was a very high, significant, positive correlation between WC, BMI and VAI. However, further studies are needed to explore which indicator of body fat could be used as the best marker to indirectly reflect cardiometabolic risk. PMID:28910301
Cruz-Martinez, R; Savchev, S; Cruz-Lemini, M; Mendez, A; Gratacos, E; Figueras, F
2015-03-01
To assess the clinical value of third-trimester uterine artery (UtA) Doppler ultrasound in the prediction of hemodynamic deterioration and adverse perinatal outcome in term small-for-gestational-age (SGA) fetuses. UtA Doppler parameters, cerebroplacental ratio (CPR) and fetal middle cerebral artery (MCA) pulsatility index (PI) were evaluated weekly, starting from the time of SGA diagnosis until 24 h before induction of labor, in a cohort of 327 SGA fetuses with normal umbilical artery PI (< 95th centile), delivered at > 37 weeks' gestation. Differences in the sequence of CPR and MCA-PI changes < 5th centile, between the group with normal UtA Doppler indices at diagnosis and those with abnormal UtA indices, were analyzed by survival analysis. In addition, the use of UtA Doppler value, alone or in combination with a brain Doppler scan before delivery, to predict the risk of Cesarean section, Cesarean section for non-reassuring fetal status (NRFS), neonatal acidosis and neonatal hospitalization was evaluated by logistic regression analysis, adjusted for gestational age at birth and birth-weight percentile. Abnormal UtA Doppler at diagnosis of SGA was associated with a higher risk of developing abnormal brain Doppler indices before induction of labor than in those with a normal UtA at diagnosis (62.7% vs 34.6%, respectively; P < 0.01). Compared to those with normal UtA Doppler indices, those with abnormal UtA Doppler findings were associated with a higher risk of intrapartum Cesarean section (52.2% vs 37.3%, respectively; P = 0.03), Cesarean section for NRFS (35.8% vs 23.1%, respectively; P = 0.03), neonatal acidosis (10.4% vs 7.7%, respectively; P = 0.47) and neonatal hospitalization (23.9% vs 16.5%, respectively; P = 0.16). Logistic regression analysis indicated that UtA Doppler findings were not significantly associated with adverse perinatal outcome independent of brain Doppler findings. 
UtA Doppler indices predict adverse perinatal outcome, but do not help to improve the predictive value of brain Doppler indices. However, at the time of SGA diagnosis they identify the subgroup of fetuses at highest risk of progression to abnormal brain Doppler findings. Copyright © 2014 ISUOG. Published by John Wiley & Sons Ltd.
Yu, X D; Yu, J C; Wu, Q F; Chen, J Y; Wang, Y C; Yan, D; Teng, S W; Zhao, Y T; Cao, J P; Li, S Q; Yan, Y Q; Gong, J; Yao, K; Zhou, H; Wang, Z Z
2017-03-06
Objective: To investigate the relationship among depression, anxiety, stress and addictive substance use behavior in secondary vocational students. Methods: A cluster sampling method and the Adolescent Health-related Behaviors Questionnaire were used to collect demographic characteristics, psychological symptoms, and addictive substance use among 5,935 students in nine vocational schools in Chongqing, Zhaoqing, Ningbo, and Taiyuan. Multivariate logistic regression analysis was used to analyze the relationship between addictive substance use behavior and psychological factors. Results: The detection rates of depression, anxiety and stress were 46.5% (n = 2,762), 58.7% (n = 3,483), and 29.8% (n = 1,770), respectively. The prevalence of addictive substance use was 74.8% (n = 4,440); of traditional drug use, 0.8% (n = 50); of new drug use, 2.8% (n = 166); and of other addictive drug use, 4.1% (n = 241). Multivariate logistic regression analysis showed that, compared with students in a normal psychological state, the OR for alcohol and tobacco use among students with a mild depressive tendency was 1.45; the ORs for mild, moderate, severe and very severe anxiety were 1.46, 1.46, 1.71, and 1.83, respectively; for traditional drug use, the ORs for severe and very severe anxiety were 5.51 and 2.61, respectively. Compared with students in a normal psychological state, the ORs for severe and very severe anxiety were 2.56 and 2.66, respectively. Compared with students in a normal psychological state, the ORs for mild, moderate, severe, and very severe anxiety were 2.14, 2.47, 2.39, and 3.45, respectively; all P values <0.05. 
Conclusion: Anxiety and mild depression were risk factors for tobacco and alcohol use in secondary vocational students; severe or very severe anxiety was a risk factor for drug use; and anxiety was a risk factor for other addictive drug use.
ERIC Educational Resources Information Center
Guler, Nese; Penfield, Randall D.
2009-01-01
In this study, we investigate the logistic regression (LR), Mantel-Haenszel (MH), and Breslow-Day (BD) procedures for the simultaneous detection of both uniform and nonuniform differential item functioning (DIF). A simulation study was used to assess and compare the Type I error rate and power of a combined decision rule (CDR), which assesses DIF…
ERIC Educational Resources Information Center
Le, Huy; Marcus, Justin
2012-01-01
This study used Monte Carlo simulation to examine the properties of the overall odds ratio (OOR), which was recently introduced as an index for overall effect size in multiple logistic regression. It was found that the OOR was relatively independent of study base rate and performed better than most commonly used R-square analogs in indexing model…
Predicting Student Success on the Texas Chemistry STAAR Test: A Logistic Regression Analysis
ERIC Educational Resources Information Center
Johnson, William L.; Johnson, Annabel M.; Johnson, Jared
2012-01-01
Background: The context is the new Texas STAAR end-of-course testing program. Purpose: The authors developed a logistic regression model to predict who would pass-or-fail the new Texas chemistry STAAR end-of-course exam. Setting: Robert E. Lee High School (5A) with an enrollment of 2700 students, Tyler, Texas. Date of the study was the 2011-2012…
Susan L. King
2003-01-01
The performance of two classifiers, logistic regression and neural networks, is compared for modeling noncatastrophic individual tree mortality for 21 species of trees in West Virginia. The output of the classifier is usually a continuous number between 0 and 1. A threshold is selected between 0 and 1 and all of the trees below the threshold are classified as...
Logistic regression trees for initial selection of interesting loci in case-control studies
Nickolov, Radoslav Z; Milanov, Valentin B
2007-01-01
Modern genetic epidemiology faces the challenge of dealing with hundreds of thousands of genetic markers. The selection of a small initial subset of interesting markers for further investigation can greatly facilitate genetic studies. In this contribution we suggest the use of a logistic regression tree algorithm known as logistic tree with unbiased selection. Using the simulated data provided for Genetic Analysis Workshop 15, we show how this algorithm, with incorporation of multifactor dimensionality reduction method, can reduce an initial large pool of markers to a small set that includes the interesting markers with high probability. PMID:18466557
Rupert, Michael G.; Cannon, Susan H.; Gartner, Joseph E.; Michael, John A.; Helsel, Dennis R.
2008-01-01
Logistic regression was used to develop statistical models that can be used to predict the probability of debris flows in areas recently burned by wildfires by using data from 14 wildfires that burned in southern California during 2003-2006. Twenty-eight independent variables describing the basin morphology, burn severity, rainfall, and soil properties of 306 drainage basins located within those burned areas were evaluated. The models were developed as follows: (1) Basins that did and did not produce debris flows soon after the 2003 to 2006 fires were delineated from data in the National Elevation Dataset using a geographic information system; (2) Data describing the basin morphology, burn severity, rainfall, and soil properties were compiled for each basin. These data were then input to a statistics software package for analysis using logistic regression; and (3) Relations between the occurrence or absence of debris flows and the basin morphology, burn severity, rainfall, and soil properties were evaluated, and five multivariate logistic regression models were constructed. All possible combinations of independent variables were evaluated to determine which combinations produced the most effective models, and the multivariate models that best predicted the occurrence of debris flows were identified. Percentage of high burn severity and 3-hour peak rainfall intensity were significant variables in all models. Soil organic matter content and soil clay content were significant variables in all models except Model 5. Soil slope was a significant variable in all models except Model 4. The most suitable model can be selected from these five models on the basis of the availability of independent variables in the particular area of interest and field checking of probability maps. 
The multivariate logistic regression models can be entered into a geographic information system, and maps showing the probability of debris flows can be constructed in recently burned areas of southern California. This study demonstrates that logistic regression is a valuable tool for developing models that predict the probability of debris flows occurring in recently burned landscapes.
Hein, R; Abbas, S; Seibold, P; Salazar, R; Flesch-Janys, D; Chang-Claude, J
2012-01-01
Menopausal hormone therapy (MHT) is associated with an increased breast cancer risk in postmenopausal women, with combined estrogen-progestagen therapy posing a greater risk than estrogen monotherapy. However, few studies focused on potential effect modification of MHT-associated breast cancer risk by genetic polymorphisms in the progesterone metabolism. We assessed effect modification of MHT use by five coding single nucleotide polymorphisms (SNPs) in the progesterone metabolizing enzymes AKR1C3 (rs7741), AKR1C4 (rs3829125, rs17134592), and SRD5A1 (rs248793, rs3736316) using a two-center population-based case-control study from Germany with 2,502 postmenopausal breast cancer patients and 4,833 matched controls. An empirical-Bayes procedure that tests for interaction using a weighted combination of the prospective and the retrospective case-control estimators as well as standard prospective logistic regression were applied to assess multiplicative statistical interaction between polymorphisms and duration of MHT use with regard to breast cancer risk assuming a log-additive mode of inheritance. No genetic marginal effects were observed. Breast cancer risk associated with duration of combined therapy was significantly modified by SRD5A1_rs3736316, showing a reduced risk elevation in carriers of the minor allele (p (interaction,empirical-Bayes) = 0.006 using the empirical-Bayes method, p (interaction,logistic regression) = 0.013 using logistic regression). The risk associated with duration of use of monotherapy was increased by AKR1C3_rs7741 in minor allele carriers (p (interaction,empirical-Bayes) = 0.083, p (interaction,logistic regression) = 0.029) and decreased in minor allele carriers of two SNPs in AKR1C4 (rs3829125: p (interaction,empirical-Bayes) = 0.07, p (interaction,logistic regression) = 0.021; rs17134592: p (interaction,empirical-Bayes) = 0.101, p (interaction,logistic regression) = 0.038). 
After Bonferroni correction for multiple testing only SRD5A1_rs3736316 assessed using the empirical-Bayes method remained significant. Postmenopausal breast cancer risk associated with combined therapy may be modified by genetic variation in SRD5A1. Further well-powered studies are, however, required to replicate our finding.
A comprehensive prediction and evaluation method of pilot workload
Feng, Chuanyan; Wanyan, Xiaoru; Yang, Kun; Zhuang, Damin; Wu, Xu
2018-01-01
BACKGROUND: The prediction and evaluation of pilot workload is a key problem in the human-factors airworthiness of cockpits. OBJECTIVE: A pilot traffic pattern task was designed in a flight simulation environment in order to predict pilot workload and improve the evaluation method. METHODS: Predictions for typical flight subtasks and dynamic workloads (cruise, approach, and landing) were built on multiple resource theory, and good validity was shown by correlation analysis between sensitive physiological data and the predicted values. RESULTS: Statistical analysis indicated that eye movement indices (fixation frequency, mean fixation time, saccade frequency, mean saccade time, and mean pupil diameter), electrocardiogram indices (mean normal-to-normal interval and the ratio between low frequency and sum of low frequency and high frequency), and electrodermal activity indices (mean tonic and mean phasic) were all sensitive to typical workloads of subjects. CONCLUSION: A multinomial logistic regression model based on a combination of physiological indices (fixation frequency, mean normal-to-normal interval, the ratio between low frequency and sum of low frequency and high frequency, and mean tonic) was constructed, and its discrimination accuracy was relatively good, at 84.85%. PMID:29710742
Bruxism in craniocervical dystonia: a prospective study.
Borie, Laetitia; Langbour, Nicolas; Guehl, Dominique; Burbaud, Pierre; Ella, Bruno
2016-09-01
Bruxism pathophysiology remains unclear, and its occurrence has been poorly investigated in movement disorders. The aim of this study was to compare the frequency of bruxism in patients with craniocervical dystonia vs. normal controls and to determine its associated clinical features. This is a prospective-control study. A total of 114 dystonic subjects (45 facial dystonia, 69 cervical dystonia) and 182 controls were included. Bruxism was diagnosed using a hetero-questionnaire and a clinical examination performed by trained dentists. Occurrence of bruxism was compared between the different study populations. A binomial logistic regression analysis was used to determine which clinical features influenced bruxism occurrence in each population. The frequency of bruxism was significantly higher in the dystonic group than in normal controls but there was no difference between facial and cervical dystonia. It was also higher in women than in men. Bruxism features were similar between normal controls and dystonic patients except for a higher score of temporomandibular jaw pain in the dystonic group. The higher frequency of bruxism in dystonic patients suggests that bruxism is increased in patients with basal ganglia dysfunction but that its nature does not differ from that seen in bruxers from the normal population.
Errors in radiographic interpretation made by veterinary students.
Lamb, C R; Pfeiffer, D U; Mantis, P
2007-01-01
As a means of identifying student weaknesses in radiographic interpretation that could be used as foci for teaching, a cohort of 96 students joining the final-year radiology rotation were randomly allocated to one of three radiographic interpretation quizzes, each based on radiographs of small-animal patients together with the signalment and a brief, relevant history. Students' quiz scores were analyzed by multiple logistic regression, using an outcome variable with the score for each item as numerator and maximum possible mark as denominator. Students' median quiz score was 49% of the maximum (range 23-80%). Students were more likely to gain a mark for items based on abnormal radiographs than for those based on normal radiographs (odds ratio 3.4, p < 0.001). Skeletal radiographs were associated with lower scores (OR 0.75, p = 0.03). The fewest marks were awarded for interpretation of a radiograph of a normal canine stifle and interpretation of a radiograph of a normal canine pelvis; these items were misinterpreted as abnormal by 86% and 80% of the students, respectively. Students' tendency to over-interpret normal radiographs may reflect a lack of knowledge of radiographic anatomy or an unrealistically high expectation that the radiographs are abnormal.
Applications of statistics to medical science, III. Correlation and regression.
Watanabe, Hiroshi
2012-01-01
In this third part of a series surveying medical statistics, the concepts of correlation and regression are reviewed. In particular, methods of linear regression and logistic regression are discussed. Arguments related to survival analysis will be made in a subsequent paper.
Schell, Greggory J; Lavieri, Mariel S; Stein, Joshua D; Musch, David C
2013-12-21
Open-angle glaucoma (OAG) is a prevalent, degenerate ocular disease which can lead to blindness without proper clinical management. The tests used to assess disease progression are susceptible to process and measurement noise. The aim of this study was to develop a methodology which accounts for the inherent noise in the data and improve significant disease progression identification. Longitudinal observations from the Collaborative Initial Glaucoma Treatment Study (CIGTS) were used to parameterize and validate a Kalman filter model and logistic regression function. The Kalman filter estimates the true value of biomarkers associated with OAG and forecasts future values of these variables. We develop two logistic regression models via generalized estimating equations (GEE) for calculating the probability of experiencing significant OAG progression: one model based on the raw measurements from CIGTS and another model based on the Kalman filter estimates of the CIGTS data. Receiver operating characteristic (ROC) curves and associated area under the ROC curve (AUC) estimates are calculated using cross-fold validation. The logistic regression model developed using Kalman filter estimates as data input achieves higher sensitivity and specificity than the model developed using raw measurements. The mean AUC for the Kalman filter-based model is 0.961 while the mean AUC for the raw measurements model is 0.889. Hence, using the probability function generated via Kalman filter estimates and GEE for logistic regression, we are able to more accurately classify patients and instances as experiencing significant OAG progression. A Kalman filter approach for estimating the true value of OAG biomarkers resulted in data input which improved the accuracy of a logistic regression classification model compared to a model using raw measurements as input. 
This methodology accounts for process and measurement noise to enable improved discrimination between progression and nonprogression in chronic diseases.
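The filtering step described in the abstract above can be illustrated with a minimal sketch. It assumes a single biomarker that follows a random walk, with hypothetical noise variances and readings; the model fitted to the CIGTS data is not specified in the abstract and is presumably richer than this. The function name `kalman_filter_1d` and all numeric values are illustrative only.

```python
def kalman_filter_1d(measurements, process_var, measurement_var,
                     initial_estimate, initial_var):
    """Scalar Kalman filter for a random-walk state observed with noise.

    Returns the filtered state estimate after each measurement.
    """
    estimate, variance = initial_estimate, initial_var
    filtered = []
    for z in measurements:
        # Predict: the random-walk mean carries over, uncertainty grows.
        variance += process_var
        # Update: blend prediction and measurement using the Kalman gain.
        gain = variance / (variance + measurement_var)
        estimate += gain * (z - estimate)
        variance *= (1.0 - gain)
        filtered.append(estimate)
    return filtered


# Hypothetical noisy readings of one biomarker (not CIGTS data).
readings = [14.2, 15.1, 13.8, 14.9, 14.4, 15.3, 14.0]
filtered = kalman_filter_1d(readings, process_var=0.05, measurement_var=1.0,
                            initial_estimate=14.0, initial_var=1.0)
print([round(x, 2) for x in filtered])
```

In a pipeline like the one the abstract describes, the filtered estimates rather than the raw readings would be passed to the logistic regression model as covariates.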
Computing group cardinality constraint solutions for logistic regression problems.
Zhang, Yong; Kwon, Dongjin; Pohl, Kilian M
2017-01-01
We derive an algorithm to directly solve logistic regression based on cardinality constraint, group sparsity and use it to classify intra-subject MRI sequences (e.g. cine MRIs) of healthy from diseased subjects. Group cardinality constraint models are often applied to medical images in order to avoid overfitting of the classifier to the training data. Solutions within these models are generally determined by relaxing the cardinality constraint to a weighted feature selection scheme. However, these solutions relate to the original sparse problem only under specific assumptions, which generally do not hold for medical image applications. In addition, inferring clinical meaning from features weighted by a classifier is an ongoing topic of discussion. Avoiding weighing features, we propose to directly solve the group cardinality constraint logistic regression problem by generalizing the Penalty Decomposition method. To do so, we assume that an intra-subject series of images represents repeated samples of the same disease patterns. We model this assumption by combining series of measurements created by a feature across time into a single group. Our algorithm then derives a solution within that model by decoupling the minimization of the logistic regression function from enforcing the group sparsity constraint. The minimum to the smooth and convex logistic regression problem is determined via gradient descent while we derive a closed form solution for finding a sparse approximation of that minimum. We apply our method to cine MRI of 38 healthy controls and 44 adult patients that received reconstructive surgery of Tetralogy of Fallot (TOF) during infancy. Our method correctly identifies regions impacted by TOF and generally obtains statistically significant higher classification accuracy than alternative solutions to this model, i.e., ones relaxing group cardinality constraints. Copyright © 2016 Elsevier B.V. All rights reserved.
Ren, Yilong; Wang, Yunpeng; Wu, Xinkai; Yu, Guizhen; Ding, Chuan
2016-10-01
Red light running (RLR) has become a major safety concern at signalized intersections. To prevent RLR-related crashes, it is critical to identify the factors that significantly affect drivers' RLR behavior and to predict potential RLR in real time. In this research, nine months of RLR events extracted from high-resolution traffic data collected by loop detectors at three signalized intersections were used to identify the factors that significantly affect RLR behavior. The data analysis indicated that occupancy time, time gap, used yellow time, time left to yellow start, whether the preceding vehicle runs through the intersection during yellow, and whether a vehicle is passing through the intersection in the adjacent lane were significant factors for RLR behavior. Furthermore, because RLR is a rare event, a modified rare events logistic regression model was developed for RLR prediction. The rare events logistic regression method has been applied to rare-event studies in many fields and shows impressive performance, but no previous research has applied it to RLR. The results showed that the rare events logistic regression model performed significantly better than the standard logistic regression model. More importantly, the proposed RLR prediction method relies solely on data collected by a single advance loop detector located 400 feet from the stop bar. This gives the method great potential for field application, since loop detectors are already installed at many intersections and can collect data in real time. This research is expected to contribute significantly to the improvement of intersection safety. Copyright © 2016 Elsevier Ltd. All rights reserved.
Engoren, Milo; Habib, Robert H; Dooner, John J; Schwann, Thomas A
2013-08-01
As many as 14% of patients undergoing coronary artery bypass surgery are readmitted within 30 days. Readmission is usually the result of morbidity and may lead to death. The purpose of this study is to develop and compare statistical and genetic programming models to predict readmission. Patients were divided into separate Construction and Validation populations. Using 88 variables, logistic regression, genetic programs, and artificial neural nets were used to develop predictive models. Models were first constructed and tested on the Construction population, then validated on the Validation population. Areas under the receiver operator characteristic curves (AU ROC) were used to compare the models. Two hundred and two patients (7.6%) in the 2,644-patient Construction group and 216 (8.0%) in the 2,711-patient Validation group were readmitted within 30 days of CABG surgery. Logistic regression predicted readmission with AU ROC = .675 ± .021 in the Construction group. Genetic programs significantly improved the accuracy (AU ROC = .767 ± .001, p < .001). Artificial neural nets were less accurate, with AU ROC = 0.597 ± .001 in the Construction group. Predictive accuracy of all three techniques fell in the Validation group. However, the accuracy of genetic programming (AU ROC = .654 ± .001) was still slightly, though not statistically significantly, better than that of logistic regression (AU ROC = .644 ± .020, p = .61). Genetic programming and logistic regression provide alternative methods to predict readmission that are similarly accurate.
Eken, Cenker; Bilge, Ugur; Kartal, Mutlu; Eray, Oktay
2009-06-03
Logistic regression is the most common statistical model for processing multivariate data in the medical literature. Artificial intelligence models such as artificial neural networks (ANN) and genetic algorithms (GA) may also be useful for interpreting medical data. The purpose of this study was to apply artificial intelligence models to a medical data set and compare them with logistic regression. ANN, GA, and logistic regression analyses were carried out on the data set of a previously published article on patients presenting to an emergency department with flank pain suspicious for renal colic. The study population was composed of 227 patients: 176 patients had a diagnosis of urinary stone, while 51 ultimately had no calculus. The GA found two decision rules for predicting urinary stones. Rule 1 consisted of being male, pain not spreading to the back, and no fever. In rule 2, pelvicaliceal dilatation on bedside ultrasonography replaced no fever. ANN, GA rule 1, GA rule 2, and logistic regression had a sensitivity of 94.9, 67.6, 56.8, and 95.5%, a specificity of 78.4, 76.47, 86.3, and 47.1%, a positive likelihood ratio of 4.4, 2.9, 4.1, and 1.8, and a negative likelihood ratio of 0.06, 0.42, 0.5, and 0.09, respectively. The area under the curve was 0.867, 0.720, 0.715, and 0.713 for the four methods, respectively. Data mining techniques such as ANN and GA can be used to predict renal colic in emergency settings and to derive clinical decision rules. They may be an alternative to the conventional multivariate analyses used in biostatistics.
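The likelihood ratios quoted in the abstract above follow from the reported sensitivities and specificities through the standard definitions LR+ = sensitivity / (1 - specificity) and LR- = (1 - sensitivity) / specificity. The short sketch below recomputes them from the reported percentages; the function name `likelihood_ratios` is illustrative, and the results should agree with the quoted figures only up to rounding.

```python
def likelihood_ratios(sensitivity, specificity):
    """Return (LR+, LR-) for sensitivity and specificity given as fractions."""
    lr_pos = sensitivity / (1.0 - specificity)
    lr_neg = (1.0 - sensitivity) / specificity
    return lr_pos, lr_neg


# Sensitivity and specificity in percent, as reported in the abstract.
reported = {
    "ANN": (94.9, 78.4),
    "GA rule 1": (67.6, 76.47),
    "GA rule 2": (56.8, 86.3),
    "Logistic regression": (95.5, 47.1),
}

ratios = {
    name: likelihood_ratios(se / 100.0, sp / 100.0)
    for name, (se, sp) in reported.items()
}
for name, (lr_pos, lr_neg) in ratios.items():
    print(f"{name}: LR+ = {lr_pos:.2f}, LR- = {lr_neg:.2f}")
```

Read this way, the low LR+ for logistic regression reflects its low specificity (47.1%) rather than any weakness in sensitivity.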
NASA Astrophysics Data System (ADS)
Duman, T. Y.; Can, T.; Gokceoglu, C.; Nefeslioglu, H. A.; Sonmez, H.
2006-11-01
As a result of industrialization, cities throughout the world have been growing rapidly for the last century. One typical example of these growing cities is Istanbul, the population of which is over 10 million. Due to rapid urbanization, new areas suitable for settlement and engineering structures are necessary. The Cekmece area, located west of the Istanbul metropolitan area, is studied because landslide activity is extensive there. The purpose of this study is to develop a model that can be used to characterize landslide susceptibility in map form using logistic regression analysis of an extensive landslide database. A database of landslide activity was constructed using both aerial photography and field studies. About 19.2% of the selected study area is covered by deep-seated landslides. The landslides that occur in the area are primarily located in sandstones with interbedded permeable and impermeable layers such as claystone, siltstone and mudstone. About 31.95% of the total landslide area is located in this unit. To apply logistic regression analyses, a data matrix including 37 variables was constructed. The variables used in the forward stepwise analyses are different measures of slope, aspect, elevation, stream power index (SPI), plan curvature, profile curvature, geology, geomorphology and relative permeability of lithological units. A total of 25 variables were identified as exerting strong influence on landslide occurrence and were included in the logistic regression equation. Wald statistics values indicate that lithology, SPI and slope are more important than the other parameters in the equation. Beta coefficients of the 25 variables included in the logistic regression equation provide a model for landslide susceptibility in the Cekmece area. This model is used to generate a landslide susceptibility map that correctly classified 83.8% of the landslide-prone areas.
Ivanovic, Jugoslav; Larsson, Pål G; Østby, Ylva; Hald, John; Krossnes, Bård K; Fjeld, Jan G; Pripp, Are H; Alfstad, Kristin Å; Egge, Arild; Stanisic, Milo
2017-05-01
Seizure outcome following surgery in pharmacoresistant temporal lobe epilepsy patients with normal magnetic resonance imaging and normal or non-specific histopathology is not sufficiently presented in the literature. In a retrospective design, we reviewed data of 263 patients who had undergone temporal lobe epilepsy surgery and identified 26 (9.9%) who met the inclusion criteria. Seizure outcomes were determined at 2-year follow-up. Potential predictors of Engel class I (satisfactory outcome) were identified by logistic regression analyses. Engel class I outcome was achieved in 61.5% of patients, 50% being completely seizure free (Engel class IA outcome). The strongest predictors of satisfactory outcome were typical ictal seizure semiology (p = 0.048) and localised ictal discharges on scalp EEG (p = 0.036). Surgery might be an effective treatment choice for the majority of these patients, although outcomes are less favourable than in patients with magnetic resonance imaging-defined lesional temporal lobe epilepsy. Typical ictal seizure semiology and localised ictal discharges on scalp EEG were predictors of Engel class I outcome.
New robust statistical procedures for the polytomous logistic regression models.
Castilla, Elena; Ghosh, Abhik; Martin, Nirian; Pardo, Leandro
2018-05-17
This article derives a new family of estimators, namely the minimum density power divergence estimators, as a robust generalization of the maximum likelihood estimator for the polytomous logistic regression model. Based on these estimators, a family of Wald-type test statistics for linear hypotheses is introduced. Robustness properties of both the proposed estimators and the test statistics are theoretically studied through the classical influence function analysis. Appropriate real life examples are presented to justify the requirement of suitable robust statistical procedures in place of the likelihood based inference for the polytomous logistic regression model. The validity of the theoretical results established in the article is further confirmed empirically through suitable simulation studies. Finally, an approach for the data-driven selection of the robustness tuning parameter is proposed with empirical justifications. © 2018, The International Biometric Society.
Staley, Dennis M.; Negri, Jacquelyn A.; Kean, Jason W.; Laber, Jayme L.; Tillery, Anne C.; Youberg, Ann M.
2016-06-30
Wildfire can significantly alter the hydrologic response of a watershed to the extent that even modest rainstorms can generate dangerous flash floods and debris flows. To reduce public exposure to hazard, the U.S. Geological Survey produces post-fire debris-flow hazard assessments for select fires in the western United States. We use publicly available geospatial data describing basin morphology, burn severity, soil properties, and rainfall characteristics to estimate the statistical likelihood that debris flows will occur in response to a storm of a given rainfall intensity. Using an empirical database and refined geospatial analysis methods, we defined new equations for the prediction of debris-flow likelihood using logistic regression methods. We showed that the new logistic regression model outperformed previous models used to predict debris-flow likelihood.
NASA Astrophysics Data System (ADS)
Kneringer, Philipp; Dietz, Sebastian; Mayr, Georg J.; Zeileis, Achim
2017-04-01
Low-visibility conditions have a large impact on aviation safety and economic efficiency of airports and airlines. To support decision makers, we develop a statistical probabilistic nowcasting tool for the occurrence of capacity-reducing operations related to low visibility. The probabilities of four different low visibility classes are predicted with an ordered logistic regression model based on time series of meteorological point measurements. Potential predictor variables for the statistical models are visibility, humidity, temperature and wind measurements at several measurement sites. A stepwise variable selection method indicates that visibility and humidity measurements are the most important model inputs. The forecasts are tested with a 30 minute forecast interval up to two hours, which is a sufficient time span for tactical planning at Vienna Airport. The ordered logistic regression models outperform persistence and are competitive with human forecasters.
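As a rough illustration of how an ordered logistic regression such as the one in the abstract above turns a single linear predictor into probabilities for several ordered classes, the sketch below uses the common proportional-odds form P(Y <= j) = sigmoid(theta_j - eta). The sign convention, the threshold values and the value of the linear predictor are assumptions chosen for illustration; they are not taken from the Vienna Airport model.

```python
import math


def sigmoid(z):
    """Logistic function."""
    return 1.0 / (1.0 + math.exp(-z))


def ordered_logit_probs(eta, thresholds):
    """Class probabilities for an ordered logit model.

    eta is the linear predictor (x . beta); thresholds must be strictly
    increasing. With K thresholds the model has K + 1 ordered classes.
    """
    # Cumulative probabilities P(Y <= j); the last class closes at 1.
    cumulative = [sigmoid(t - eta) for t in thresholds] + [1.0]
    probs = []
    previous = 0.0
    for c in cumulative:
        probs.append(c - previous)
        previous = c
    return probs


# Three hypothetical thresholds give four ordered classes.
probs = ordered_logit_probs(eta=0.4, thresholds=[-1.0, 0.5, 2.0])
print([round(p, 3) for p in probs])
```

Because the thresholds are increasing, the cumulative probabilities are increasing too, so every class probability is positive and the probabilities sum to one.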
Wang, Shuang; Jiang, Xiaoqian; Wu, Yuan; Cui, Lijuan; Cheng, Samuel; Ohno-Machado, Lucila
2013-06-01
We developed an EXpectation Propagation LOgistic REgRession (EXPLORER) model for distributed privacy-preserving online learning. The proposed framework provides a high level guarantee for protecting sensitive information, since the information exchanged between the server and the client is the encrypted posterior distribution of coefficients. Through experimental results, EXPLORER shows the same performance (e.g., discrimination, calibration, feature selection, etc.) as the traditional frequentist logistic regression model, but provides more flexibility in model updating. That is, EXPLORER can be updated one point at a time rather than having to retrain the entire data set when new observations are recorded. The proposed EXPLORER supports asynchronized communication, which relieves the participants from coordinating with one another, and prevents service breakdown from the absence of participants or interrupted communications. Copyright © 2013 Elsevier Inc. All rights reserved.
A computational approach to compare regression modelling strategies in prediction research.
Pajouheshnia, Romin; Pestman, Wiebe R; Teerenstra, Steven; Groenwold, Rolf H H
2016-08-25
It is often unclear which approach to fit, assess and adjust a model will yield the most accurate prediction model. We present an extension of an approach for comparing modelling strategies in linear regression to the setting of logistic regression and demonstrate its application in clinical prediction research. A framework for comparing logistic regression modelling strategies by their likelihoods was formulated using a wrapper approach. Five different strategies for modelling, including simple shrinkage methods, were compared in four empirical data sets to illustrate the concept of a priori strategy comparison. Simulations were performed in both randomly generated data and empirical data to investigate the influence of data characteristics on strategy performance. We applied the comparison framework in a case study setting. Optimal strategies were selected based on the results of a priori comparisons in a clinical data set and the performance of models built according to each strategy was assessed using the Brier score and calibration plots. The performance of modelling strategies was highly dependent on the characteristics of the development data in both linear and logistic regression settings. A priori comparisons in four empirical data sets found that no strategy consistently outperformed the others. The percentage of times that a model adjustment strategy outperformed a logistic model ranged from 3.9 to 94.9 %, depending on the strategy and data set. However, in our case study setting the a priori selection of optimal methods did not result in detectable improvement in model performance when assessed in an external data set. The performance of prediction modelling strategies is a data-dependent process and can be highly variable between data sets within the same clinical domain. A priori strategy comparison can be used to determine an optimal logistic regression modelling strategy for a given data set before selecting a final modelling approach.
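The Brier score mentioned in the abstract above is the mean squared difference between predicted probabilities and observed binary outcomes, so lower values indicate better predictions. A minimal sketch follows, with hypothetical predictions and outcomes that are not drawn from the study.

```python
def brier_score(probabilities, outcomes):
    """Mean squared error between predicted probabilities and 0/1 outcomes."""
    if len(probabilities) != len(outcomes):
        raise ValueError("probabilities and outcomes must have equal length")
    if not probabilities:
        raise ValueError("at least one prediction is required")
    squared_errors = [(p - y) ** 2 for p, y in zip(probabilities, outcomes)]
    return sum(squared_errors) / len(squared_errors)


# Hypothetical predicted risks and observed outcomes.
predicted = [0.9, 0.2, 0.7, 0.1]
observed = [1, 0, 1, 0]
score = brier_score(predicted, observed)
print(round(score, 4))
```

A model that always predicts 0.5 scores 0.25 on any binary outcome, which gives a convenient baseline when comparing modelling strategies.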
Cakir, Ebru; Kucuk, Ulku; Pala, Emel Ebru; Sezer, Ozlem; Ekin, Rahmi Gokhan; Cakmak, Ozgur
2017-05-01
Conventional cytomorphologic assessment is the first step to establish an accurate diagnosis in urinary cytology. In cytologic preparations, the separation of low-grade urothelial carcinoma (LGUC) from reactive urothelial proliferation (RUP) can be exceedingly difficult. The bladder washing cytologies of 32 LGUC and 29 RUP were reviewed. The cytologic slides were examined for the presence or absence of the 28 cytologic features. The cytologic criteria showing statistical significance in LGUC were increased numbers of monotonous single (non-umbrella) cells, three-dimensional cellular papillary clusters without fibrovascular cores, irregular bordered clusters, atypical single cells, irregular nuclear overlap, cytoplasmic homogeneity, increased N/C ratio, pleomorphism, nuclear border irregularity, nuclear eccentricity, elongated nuclei, and hyperchromasia (p < 0.05), and the cytologic criteria showing statistical significance in RUP were inflammatory background, mixture of small and large urothelial cells, loose monolayer aggregates, and vacuolated cytoplasm (p < 0.05). When these variables were subjected to a stepwise logistic regression analysis, four features were selected to distinguish LGUC from RUP: increased numbers of monotonous single (non-umbrella) cells, increased nuclear cytoplasmic ratio, hyperchromasia, and presence of small and large urothelial cells (p = 0.0001). By this logistic model of the 32 cases with proven LGUC, the stepwise logistic regression analysis correctly predicted 31 (96.9%) patients with this diagnosis, and of the 29 patients with RUP, the logistic model correctly predicted 26 (89.7%) patients as having this disease. There are several cytologic features to separate LGUC from RUP. Stepwise logistic regression analysis is a valuable tool for determining the most useful cytologic criteria to distinguish these entities. © 2017 APMIS. Published by John Wiley & Sons Ltd.
Determinants of amikacin first peak concentration in critically ill patients.
Boidin, Clément; Jenck, Sophie; Bourguignon, Laurent; Torkmani, Sejad; Roussey-Jean, Aurore; Ledochowski, Stanislas; Marry, Lucie; Ammenouche, Nacim; Dupont, Hervé; Marçon, Frédéric; Allaouchiche, Bernard; Bohé, Julien; Lepape, Alain; Goutelle, Sylvain; Friggeri, Arnaud
2018-04-16
The antimicrobial effect of amikacin has been correlated with the ratio of the peak concentration (Cmax) to the minimum inhibitory concentration. A target Cmax ≥ 60-80 mg/L has been suggested. It has been shown that such a target is not achieved in a large proportion of critically ill patients in intensive care units. A retrospective analysis was performed to examine the determinants of Cmax ≥ 80 mg/L on the first peak in 339 critically ill patients treated with amikacin. The influence of available variables on Cmax target attainment was analyzed using a classification and regression tree (CART) and logistic regression. Mean Cmax in the 339 patients was 73.0 ± 23.9 mg/L, with a target attainment rate (TAR, Cmax ≥ 80 mg/L) of 37.5%. In CART analysis, the strongest predictor of amikacin target peak attainment was dose per kilogram of lean body weight (dose/LBW). TAR was 60.1% in patients with dose/LBW ≥ 37.8 mg/kg vs. 19.9% in patients with lower dose/LBW (OR = 6.0 (95% CI: 3.6-10.2)). Renal function was a secondary predictor of Cmax. Logistic regression analysis identified dose per kilogram of ideal body weight (OR = 1.13 (95% CI: 1.09-1.17)) and creatinine clearance (OR = 0.993 (95% CI: 0.988-0.998)) as predictors of target peak achievement. Based on our results, an amikacin dose ≥ 37.8 mg/kg of LBW should be used to optimize the attainment of Cmax ≥ 80 mg/L after the first dose in critically ill patients. An even higher dose may be necessary in patients with normal renal function. © 2018 Société Française de Pharmacologie et de Thérapeutique.
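As a quick arithmetic check on the figures quoted in this abstract, the odds ratio for the CART split can be recomputed from the two target attainment rates (60.1% vs. 19.9%). The sketch below is illustrative only: the function names are hypothetical, and the rescaling of the per-unit odds ratio assumes that 1.13 refers to a 1 mg/kg increase in dose, which the abstract does not state explicitly.

```python
def odds(p):
    """Convert a probability to odds."""
    return p / (1.0 - p)


def odds_ratio(p_exposed, p_reference):
    """Odds ratio comparing two attainment probabilities."""
    return odds(p_exposed) / odds(p_reference)


def scale_odds_ratio(or_per_unit, delta):
    """Odds ratio implied by a logistic model for a change of `delta` units.

    Assumes the log-odds are linear in the predictor, so ORs multiply.
    """
    return or_per_unit ** delta


# Target attainment rates reported in the abstract.
tar_high_dose = 0.601   # dose/LBW >= 37.8
tar_low_dose = 0.199    # lower dose/LBW

or_cart_split = odds_ratio(tar_high_dose, tar_low_dose)
print(f"OR from the two attainment rates: {or_cart_split:.2f}")

# Hypothetical illustration: OR for a 5-unit dose increase,
# assuming the reported OR of 1.13 is per 1 mg/kg.
print(f"OR for +5 units: {scale_odds_ratio(1.13, 5):.2f}")
```

The recomputed value is close to the reported OR of 6.0, which suggests the published odds ratio is the unadjusted one for the CART split.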
Science of Test Research Consortium: Year Two Final Report
2012-10-02
July 2012. Analysis of an Intervention for Small Unmanned Aerial System ( SUAS ) Accidents, submitted to Quality Engineering, LQEN-2012-0056. Stone... Systems Engineering. Wolf, S. E., R. R. Hill, and J. J. Pignatiello. June 2012. Using Neural Networks and Logistic Regression to Model Small Unmanned ...Human Retina. 6. Wolf, S. E. March 2012. Modeling Small Unmanned Aerial System Mishaps using Logistic Regression and Artificial Neural Networks. 7
ERIC Educational Resources Information Center
Hidalgo, Mª Dolores; Gómez-Benito, Juana; Zumbo, Bruno D.
2014-01-01
The authors analyze the effectiveness of the R[superscript 2] and delta log odds ratio effect size measures when using logistic regression analysis to detect differential item functioning (DIF) in dichotomous items. A simulation study was carried out, and the Type I error rate and power estimates under conditions in which only statistical testing…
Brian S. Cade; Barry R. Noon; Rick D. Scherer; John J. Keane
2017-01-01
Counts of avian fledglings, nestlings, or clutch size that are bounded below by zero and above by some small integer form a discrete random variable distribution that is not approximated well by conventional parametric count distributions such as the Poisson or negative binomial. We developed a logistic quantile regression model to provide estimates of the empirical...
Mohammed, Mohammed A; Manktelow, Bradley N; Hofer, Timothy P
2016-04-01
There is interest in deriving case-mix adjusted standardised mortality ratios so that comparisons between healthcare providers, such as hospitals, can be undertaken in the controversial belief that variability in standardised mortality ratios reflects quality of care. Typically standardised mortality ratios are derived using a fixed effects logistic regression model, without a hospital term in the model. This fails to account for the hierarchical structure of the data - patients nested within hospitals - and so a hierarchical logistic regression model is more appropriate. However, four methods have been advocated for deriving standardised mortality ratios from a hierarchical logistic regression model, but their agreement is not known and neither do we know which is to be preferred. We found significant differences between the four types of standardised mortality ratios because they reflect a range of underlying conceptual issues. The most subtle issue is the distinction between asking how an average patient fares in different hospitals versus how patients at a given hospital fare at an average hospital. Since the answers to these questions are not the same and since the choice between these two approaches is not obvious, the extent to which profiling hospitals on mortality can be undertaken safely and reliably, without resolving these methodological issues, remains questionable. © The Author(s) 2012.
Chan, Siew Foong; Deeks, Jonathan J; Macaskill, Petra; Irwig, Les
2008-01-01
To compare three predictive models based on logistic regression to estimate adjusted likelihood ratios allowing for interdependency between diagnostic variables (tests). This study was a review of the theoretical basis, assumptions, and limitations of published models; and a statistical extension of methods and application to a case study of the diagnosis of obstructive airways disease based on history and clinical examination. Albert's method includes an offset term to estimate an adjusted likelihood ratio for combinations of tests. Spiegelhalter and Knill-Jones method uses the unadjusted likelihood ratio for each test as a predictor and computes shrinkage factors to allow for interdependence. Knottnerus' method differs from the other methods because it requires sequencing of tests, which limits its application to situations where there are few tests and substantial data. Although parameter estimates differed between the models, predicted "posttest" probabilities were generally similar. Construction of predictive models using logistic regression is preferred to the independence Bayes' approach when it is important to adjust for dependency of tests errors. Methods to estimate adjusted likelihood ratios from predictive models should be considered in preference to a standard logistic regression model to facilitate ease of interpretation and application. Albert's method provides the most straightforward approach.
Cameron, Isobel M; Scott, Neil W; Adler, Mats; Reid, Ian C
2014-12-01
It is important for clinical practice and research that measurement scales of well-being and quality of life exhibit only minimal differential item functioning (DIF). DIF occurs where different groups of people endorse items in a scale to different extents after being matched by the intended scale attribute. We investigate the equivalence or otherwise of common methods of assessing DIF. Three methods of measuring age- and sex-related DIF (ordinal logistic regression, Rasch analysis and Mantel χ(2) procedure) were applied to Hospital Anxiety Depression Scale (HADS) data pertaining to a sample of 1,068 patients consulting primary care practitioners. Three items were flagged by all three approaches as having either age- or sex-related DIF with a consistent direction of effect; a further three items identified did not meet stricter criteria for important DIF using at least one method. When applying strict criteria for significant DIF, ordinal logistic regression was slightly less sensitive. Ordinal logistic regression, Rasch analysis and contingency table methods yielded consistent results when identifying DIF in the HADS depression and HADS anxiety scales. Regardless of methods applied, investigators should use a combination of statistical significance, magnitude of the DIF effect and investigator judgement when interpreting the results.
NASA Astrophysics Data System (ADS)
Cao, Faxian; Yang, Zhijing; Ren, Jinchang; Ling, Wing-Kuen; Zhao, Huimin; Marshall, Stephen
2017-12-01
Although the sparse multinomial logistic regression (SMLR) has provided a useful tool for sparse classification, it suffers from inefficacy in dealing with high dimensional features and manually set initial regressor values. This has significantly constrained its applications for hyperspectral image (HSI) classification. In order to tackle these two drawbacks, an extreme sparse multinomial logistic regression (ESMLR) is proposed for effective classification of HSI. First, the HSI dataset is projected to a new feature space with randomly generated weight and bias. Second, an optimization model is established by the Lagrange multiplier method and the dual principle to automatically determine a good initial regressor for SMLR via minimizing the training error and the regressor value. Furthermore, the extended multi-attribute profiles (EMAPs) are utilized for extracting both the spectral and spatial features. A combinational linear multiple features learning (MFL) method is proposed to further enhance the features extracted by ESMLR and EMAPs. Finally, the logistic regression via the variable splitting and the augmented Lagrangian (LORSAL) is adopted in the proposed framework for reducing the computational time. Experiments are conducted on two well-known HSI datasets, namely the Indian Pines dataset and the Pavia University dataset, which have shown the fast and robust performance of the proposed ESMLR framework.
Latin hypercube approach to estimate uncertainty in ground water vulnerability
Gurdak, J.J.; McCray, J.E.; Thyne, G.; Qi, S.L.
2007-01-01
A methodology is proposed to quantify prediction uncertainty associated with ground water vulnerability models that were developed through an approach that coupled multivariate logistic regression with a geographic information system (GIS). This method uses Latin hypercube sampling (LHS) to illustrate the propagation of input error and estimate uncertainty associated with the logistic regression predictions of ground water vulnerability. Central to the proposed method is the assumption that prediction uncertainty in ground water vulnerability models is a function of input error propagation from uncertainty in the estimated logistic regression model coefficients (model error) and the values of explanatory variables represented in the GIS (data error). Input probability distributions that represent both model and data error sources of uncertainty were simultaneously sampled using a Latin hypercube approach with logistic regression calculations of probability of elevated nonpoint source contaminants in ground water. The resulting probability distribution represents the prediction intervals and associated uncertainty of the ground water vulnerability predictions. The method is illustrated through a ground water vulnerability assessment of the High Plains regional aquifer. Results of the LHS simulations reveal significant prediction uncertainties that vary spatially across the regional aquifer. Additionally, the proposed method enables a spatial deconstruction of the prediction uncertainty that can lead to improved prediction of ground water vulnerability. © 2007 National Ground Water Association.
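The general idea described in this abstract, sampling model coefficients and explanatory-variable values jointly with a Latin hypercube and pushing each draw through the logistic function, can be sketched as follows. This is a minimal illustration and not the authors' implementation: the coefficient and variable ranges are hypothetical, uniform input distributions are assumed for simplicity, and only the Python standard library is used.

```python
import math
import random


def latin_hypercube(n_samples, n_dims, rng):
    """Return n_samples points in [0, 1)^n_dims.

    Each dimension is cut into n_samples equal strata and receives
    exactly one point per stratum, in shuffled order.
    """
    columns = []
    for _ in range(n_dims):
        col = [(i + rng.random()) / n_samples for i in range(n_samples)]
        rng.shuffle(col)
        columns.append(col)
    return [tuple(col[i] for col in columns) for i in range(n_samples)]


def logistic(z):
    """Inverse logit: maps a linear predictor to a probability."""
    return 1.0 / (1.0 + math.exp(-z))


def scale(u, lo, hi):
    """Map a unit-interval draw onto the range [lo, hi)."""
    return lo + u * (hi - lo)


# Hypothetical uncertainty ranges (assumptions, not values from the study).
B0_RANGE = (-2.5, -1.5)   # intercept (model error)
B1_RANGE = (0.6, 1.0)     # slope (model error)
X_RANGE = (1.0, 3.0)      # explanatory variable at one GIS cell (data error)

rng = random.Random(0)
samples = latin_hypercube(200, 3, rng)

# Propagate each joint draw through the logistic model.
probs = sorted(
    logistic(scale(u0, *B0_RANGE) + scale(u1, *B1_RANGE) * scale(u2, *X_RANGE))
    for u0, u1, u2 in samples
)

# Approximate 95% prediction interval from the empirical distribution.
lower = probs[int(0.025 * len(probs))]
upper = probs[int(0.975 * len(probs)) - 1]
print(f"approximate 95% interval for P(elevated contaminant): [{lower:.3f}, {upper:.3f}]")
```

Repeating this calculation for every GIS cell would give a map of interval widths, which is one way to read the "spatial deconstruction" of uncertainty mentioned in the abstract.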
The Trend Odds Model for Ordinal Data
Capuano, Ana W.; Dawson, Jeffrey D.
2013-01-01
Ordinal data appear in a wide variety of scientific fields. These data are often analyzed using ordinal logistic regression models that assume proportional odds. When this assumption is not met, it may be possible to capture the lack of proportionality using a constrained structural relationship between the odds and the cut-points of the ordinal values (Peterson and Harrell, 1990). We consider a trend odds version of this constrained model, where the odds parameter increases or decreases in a monotonic manner across the cut-points. We demonstrate algebraically and graphically how this model is related to latent logistic, normal, and exponential distributions. In particular, we find that scale changes in these potential latent distributions are consistent with the trend odds assumption, with the logistic and exponential distributions having odds that increase in a linear or nearly linear fashion. We show how to fit this model using SAS Proc Nlmixed, and perform simulations under proportional odds and trend odds processes. We find that the added complexity of the trend odds model gives improved power over the proportional odds model when there are moderate to severe departures from proportionality. A hypothetical dataset is used to illustrate the interpretation of the trend odds model, and we apply this model to a Swine Influenza example where the proportional odds assumption appears to be violated. PMID:23225520
The trend odds model for ordinal data.
Capuano, Ana W; Dawson, Jeffrey D
2013-06-15
Ordinal data appear in a wide variety of scientific fields. These data are often analyzed using ordinal logistic regression models that assume proportional odds. When this assumption is not met, it may be possible to capture the lack of proportionality using a constrained structural relationship between the odds and the cut-points of the ordinal values. We consider a trend odds version of this constrained model, wherein the odds parameter increases or decreases in a monotonic manner across the cut-points. We demonstrate algebraically and graphically how this model is related to latent logistic, normal, and exponential distributions. In particular, we find that scale changes in these potential latent distributions are consistent with the trend odds assumption, with the logistic and exponential distributions having odds that increase in a linear or nearly linear fashion. We show how to fit this model using SAS Proc NLMIXED and perform simulations under proportional odds and trend odds processes. We find that the added complexity of the trend odds model gives improved power over the proportional odds model when there are moderate to severe departures from proportionality. A hypothetical data set is used to illustrate the interpretation of the trend odds model, and we apply this model to a swine influenza example wherein the proportional odds assumption appears to be violated. Copyright © 2012 John Wiley & Sons, Ltd.
Kupek, Emil
2006-03-15
Structural equation modelling (SEM) has been increasingly used in medical statistics for solving a system of related regression equations. However, a great obstacle for its wider use has been its difficulty in handling categorical variables within the framework of generalised linear models. A large data set with a known structure among two related outcomes and three independent variables was generated to investigate the use of Yule's transformation of odds ratio (OR) into Q-metric by (OR-1)/(OR+1) to approximate Pearson's correlation coefficients between binary variables whose covariance structure can be further analysed by SEM. Percent of correctly classified events and non-events was compared with the classification obtained by logistic regression. The performance of SEM based on Q-metric was also checked on a small (N = 100) random sample of the data generated and on a real data set. SEM successfully recovered the generated model structure. SEM of real data suggested a significant influence of a latent confounding variable which would have not been detectable by standard logistic regression. SEM classification performance was broadly similar to that of the logistic regression. The analysis of binary data can be greatly enhanced by Yule's transformation of odds ratios into estimated correlation matrix that can be further analysed by SEM. The interpretation of results is aided by expressing them as odds ratios which are the most frequently used measure of effect in medical statistics.
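Yule's transformation quoted in this abstract, Q = (OR-1)/(OR+1), is easy to verify numerically. The sketch below uses hypothetical function names and a made-up 2x2 table; it shows only the transformation itself, not the subsequent SEM step.

```python
def odds_ratio_from_table(a, b, c, d):
    """Odds ratio for a 2x2 table [[a, b], [c, d]] with non-zero b and c."""
    return (a * d) / (b * c)


def yules_q(odds_ratio):
    """Yule's Q: maps an odds ratio in [0, inf) onto the interval [-1, 1)."""
    if odds_ratio < 0:
        raise ValueError("an odds ratio cannot be negative")
    return (odds_ratio - 1.0) / (odds_ratio + 1.0)


# Made-up counts for two binary variables (illustration only).
a, b, c, d = 30, 10, 10, 30
or_hat = odds_ratio_from_table(a, b, c, d)
q = yules_q(or_hat)
print(f"OR = {or_hat:.1f}, Yule's Q = {q:.2f}")
```

An odds ratio of 1 (no association) maps to Q = 0, and reciprocal odds ratios map to Q values of equal size and opposite sign, which is what makes Q behave like a correlation-type measure.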
Suzuki, Taku; Iwamoto, Takuji; Shizu, Kanae; Suzuki, Katsuji; Yamada, Harumoto; Sato, Kazuki
2017-05-01
This retrospective study was designed to investigate prognostic factors for postoperative outcomes for cubital tunnel syndrome (CubTS) using multiple logistic regression analysis with a large number of patients. Eighty-three patients with CubTS who underwent surgeries were enrolled. The following potential prognostic factors for disease severity were selected according to previous reports: sex, age, type of surgery, disease duration, body mass index, cervical lesion, presence of diabetes mellitus, Workers' Compensation status, preoperative severity, and preoperative electrodiagnostic testing. Postoperative severity of disease was assessed 2 years after surgery by Messina's criteria which is an outcome measure specifically for CubTS. Bivariate analysis was performed to select candidate prognostic factors for multiple linear regression analyses. Multiple logistic regression analysis was conducted to identify the association between postoperative severity and selected prognostic factors. Both bivariate and multiple linear regression analysis revealed only preoperative severity as an independent risk factor for poor prognosis, while other factors did not show any significant association. Although conflicting results exist regarding prognosis of CubTS, this study supports evidence from previous studies and concludes early surgical intervention portends the most favorable prognosis. Copyright © 2017 The Japanese Orthopaedic Association. Published by Elsevier B.V. All rights reserved.
Ranasinghe, Priyanga; Perera, Yashasvi S; Lamabadusuriya, Dilusha A; Kulatunga, Supun; Jayawardana, Naveen; Rajapakse, Senaka; Katulanda, Prasad
2011-08-04
Complaints of arms, neck and shoulders (CANS) are common among computer office workers. We evaluated an aetiological model with physical/psychosocial risk-factors. We invited 2,500 computer office workers for the study. Data on prevalence and risk-factors of CANS were collected by the validated Maastricht-Upper-extremity-Questionnaire. Workstations were evaluated by the Occupational Safety and Health Administration (OSHA) Visual-Display-Terminal workstation-checklist. Participants' knowledge and awareness were evaluated by a set of expert-validated questions. A binary logistic regression analysis investigated relationships/correlations between risk-factors and symptoms. Sample size was 2,210. Mean age was 30.8 ± 8.1 years, and 50.8% were males. The 1-year prevalence of CANS was 56.9%; the commonest region of complaint was forearm/hand (42.6%), followed by neck (36.7%) and shoulder/arm (32.0%). Of those with CANS, 22.7% had taken treatment from a health care professional, and in only 1.1% of those seeking medical advice had an occupation-related injury been suspected/diagnosed. In addition, 9.3% reported CANS-related absenteeism from work, while 15.4% reported CANS causing disruption of normal activities. A majority of evaluated workstations, both among all participants (88.4%) and among those with CANS (91.9%), were OSHA non-compliant. In the binary logistic regression analyses, female gender, daily computer usage, incorrect body posture, bad work-habits, work overload, poor social support and poor ergonomic knowledge were associated with CANS and its severity. In a multiple logistic regression analysis controlling for age, gender and duration of occupation, incorrect body posture, bad work-habits and daily computer usage were significant independent predictors of CANS. The prevalence of work-related CANS among computer office workers in Sri Lanka, a developing South Asian country, is high and comparable to prevalence in developed countries. Work-related physical factors, psychosocial factors and lack of awareness were all important associations of CANS, and effective preventive strategies need to address all three areas.
Cawley, Gavin C; Talbot, Nicola L C
2006-10-01
Gene selection algorithms for cancer classification, based on the expression of a small number of biomarker genes, have been the subject of considerable research in recent years. Shevade and Keerthi propose a gene selection algorithm based on sparse logistic regression (SLogReg) incorporating a Laplace prior to promote sparsity in the model parameters, and provide a simple but efficient training procedure. The degree of sparsity obtained is determined by the value of a regularization parameter, which must be carefully tuned in order to optimize performance. This normally involves a model selection stage, based on a computationally intensive search for the minimizer of the cross-validation error. In this paper, we demonstrate that a simple Bayesian approach can be taken to eliminate this regularization parameter entirely, by integrating it out analytically using an uninformative Jeffreys prior. The improved algorithm (BLogReg) is then typically two or three orders of magnitude faster than the original algorithm, as there is no longer a need for a model selection step. The BLogReg algorithm is also free from selection bias in performance estimation, a common pitfall in the application of machine learning algorithms in cancer classification. The SLogReg, BLogReg and Relevance Vector Machine (RVM) gene selection algorithms are evaluated over the well-studied colon cancer and leukaemia benchmark datasets. The leave-one-out estimates of the probability of test error and cross-entropy of the BLogReg and SLogReg algorithms are very similar; however, the BLogReg algorithm is found to be considerably faster than the original SLogReg algorithm. Using nested cross-validation to avoid selection bias, performance estimation for SLogReg on the leukaemia dataset takes almost 48 h, whereas the corresponding result for BLogReg is obtained in only 1 min 24 s, making BLogReg by far the more practical algorithm. 
BLogReg also demonstrates better estimates of conditional probability than the RVM, which are of great importance in medical applications, with similar computational expense. A MATLAB implementation of the sparse logistic regression algorithm with Bayesian regularization (BLogReg) is available from http://theoval.cmp.uea.ac.uk/~gcc/cbl/blogreg/
Tumor necrosis factor- α, adiponectin and their ratio in gestational diabetes mellitus
Khosrowbeygi, Ali; Rezvanfar, Mohammad Reza; Ahmadvand, Hassan
2018-01-01
Background: It has been suggested that inflammation might be implicated in the gestational diabetes mellitus (GDM) complications, including insulin resistance. The aims of the current study were to explore maternal circulating values of TNF-α, adiponectin and the adiponectin/TNF-α ratio in women with GDM compared with normal pregnancy and their relationships with metabolic syndrome biomarkers. Methods: Forty women with GDM and 40 normal pregnant women were included in the study. Commercially available enzyme-linked immunosorbent assay methods were used to measure serum levels of TNF-α and total adiponectin. Results: Women with GDM had higher values of TNF-α (225.08±27.35 vs 115.68±12.64 pg/ml, p<0.001) and lower values of adiponectin (4.50±0.38 vs 6.37±0.59 µg/ml, p=0.003) and the adiponectin/TNF-α ratio (4.31±0.05 vs 4.80±0.07, P<0.001) than normal pregnant women. The adiponectin/TNF-α ratio showed negative correlations with insulin resistance (r=-0.68, p<0.001) and triglyceride (r=-0.39, p=0.014) and a positive correlation with insulin sensitivity (r=0.69, p<0.001). Multiple linear regression analysis showed that values of the adiponectin /TNF-α ratio were independently associated with insulin resistance. Binary logistic regression analysis showed that GDM was negatively associated with adiponectin /TNF-α ratio. Conclusions: In summary, the adiponectin/TNF-α ratio decreased significantly in GDM compared with normal pregnancy. The ratio might be an informative biomarker for assessment of pregnant women at high risk of insulin resistance and dyslipidemia and for diagnosis and therapeutic monitoring aims in GDM. PMID:29387323
Sydó, Nóra; Sydó, Tibor; Gonzalez Carta, Karina A; Hussain, Nasir; Merkely, Béla; Murphy, Joseph G; Squires, Ray W; Lopez-Jimenez, Francisco; Allison, Thomas G
2018-05-15
A decrease in diastolic blood pressure (DBP) with exercise is considered normal, but the significance of an increase in DBP has not been validated. Our aim was to determine the relationship of DBP increasing on a stress test regarding comorbidities and mortality. Our database was reviewed from 1993-2010 using the first stress test of a patient. Non-Minnesota residence, baseline CV disease, rest DBP <60 or >100 mmHg, and age <30 or ≥80 were exclusion criteria. DBP response was classified Normal if peak DBP-rest DBP <0, Borderline 0-9, Abnormal ≥10mmHg. Mortality was determined from Mayo Clinic records and Minnesota Death Index. Logistic regression was used to determine the relationship of DBP response to presence of comorbidities. Cox regression was used to determine total and CV mortality risk by DBP response. All analyses were adjusted for age, sex and resting DBP. 20760 patients were included (51±11 years, female n=7314). Rest/peak averaged DBP 82±8/69 ±15 mmHg in normal vs 79±9/82±9 mmHg in borderline vs 76±9/92±11 mmHg in abnormal DBP response. There were 1582 deaths (8%) with 557 (3%) CV deaths over 12±5 years of follow-up. In patients with borderline and abnormal DBP response, odds ratios for obesity, hypertension, diabetes and current smoking were significant, while hazard ratios for total and CV death were not significant compared to patients with normal DBP response. DBP response to exercise is significantly associated with important comorbidities at the time of the stress test but does not add to the prognostic yield of stress test.
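The three-way DBP response classification defined in this abstract (peak DBP minus rest DBP: Normal if below 0, Borderline 0-9, Abnormal 10 mmHg or more) can be written as a small function. The function name is hypothetical, and treating non-integer changes between 9 and 10 mmHg as Borderline is an assumption, since the abstract only gives integer cut-offs.

```python
def classify_dbp_response(rest_dbp, peak_dbp):
    """Classify the exercise DBP response from rest and peak values in mmHg."""
    change = peak_dbp - rest_dbp
    if change < 0:
        return "Normal"
    if change < 10:  # assumption: values in (9, 10) count as Borderline
        return "Borderline"
    return "Abnormal"


# Group mean rest/peak values reported in the abstract.
for rest, peak in [(82, 69), (79, 82), (76, 92)]:
    print(rest, peak, classify_dbp_response(rest, peak))
```

Applied to the reported group means, the function returns the same category as the group each pair of means came from.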
Rode, Line; Kjærgaard, Hanne; Ottesen, Bent; Damm, Peter; Hegaard, Hanne K
2012-02-01
Our aim was to investigate the association between gestational weight gain (GWG) and postpartum weight retention (PWR) in pre-pregnancy underweight, normal weight, overweight or obese women, with emphasis on the American Institute of Medicine (IOM) recommendations. We performed secondary analyses on data based on questionnaires from 1,898 women from the "Smoke-free Newborn Study" conducted 1996-1999 at Hvidovre Hospital, Denmark. Relationship between GWG and PWR was examined according to BMI as a continuous variable and in four groups. Association between PWR and GWG according to IOM recommendations was tested by linear regression analysis and the association between PWR ≥ 5 kg (11 lbs) and GWG by logistic regression analysis. Mean GWG and mean PWR were constant for all BMI units until 26-27 kg/m². After this cut-off mean GWG and mean PWR decreased with increasing BMI. Nearly 40% of normal weight, 60% of overweight and 50% of obese women gained more than recommended during pregnancy. For normal weight and overweight women with GWG above recommendations the OR of retaining ≥ 5 kg (11 lbs) 1 year postpartum was 2.8 (95% CI 2.0-4.0) and 2.8 (95% CI 1.3-6.2), respectively, compared to women with GWG within recommendations. GWG above IOM recommendations significantly increases normal weight, overweight and obese women's risk of retaining weight 1 year after delivery. Health personnel face a challenge in prenatal counseling as 40-60% of these women gain more weight than recommended for their BMI. As GWG is potentially modifiable, our study should be followed by intervention studies focusing on GWG.
Lin, Wen-Li; Chi, Hsin; Huang, Fu-Yuan; Huang, Daniel Tsung-Ning; Chiu, Nan-Chang
2016-10-01
Cerebrospinal fluid (CSF) cell count and biochemical examinations and cultures form the basis for the diagnosis of bacterial meningitis. However, some patients do not have typical findings and are at a higher risk of being missed or having delayed treatment. To better understand the correlation between CSF results and outcomes, we evaluated CSF data focusing on the patients with atypical findings. This study enrolled CSF culture-proven bacterial meningitis patients aged from 1 month to 18 years in a medical center. The patients were divided into "normal" and "abnormal" groups for each laboratory result and in combination. The correlations between the laboratory results and the outcomes were analyzed. A total of 175 children with confirmed bacterial meningitis were enrolled. In CSF examinations, 16.2% of patients had normal white blood cell counts, 29.5% had normal glucose levels, 24.5% had normal protein levels, 10.2% had normal results in two items, and 8.6% had normal results in all three items. In logistic regression analysis, a normal CSF leukocyte count and increased CSF protein level were related to poor outcomes. Patients with meningitis caused by Streptococcus pneumoniae and hyponatremia were at a higher risk of mortality and the development of sequelae. In children with bacterial meningitis, nontypical CSF findings and, in particular, normal CSF leukocyte count and increased protein level may indicate a worse prognosis. Copyright © 2014. Published by Elsevier B.V.
Lilje, Stina C; Skillgate, Eva; Anderberg, Peter; Berglund, Johan
2015-07-01
Pain is one of the most frequent reasons for seeking health care, and is thus a public health problem. Although there is a progressive increase in pain and impaired physical function with age, few studies are performed on older adults. The aim of this study was to investigate if there are associations between musculoskeletal pain interfering with normal life in older adults and physical and psychosocial workloads through life. The association of heavy physical workload and negative psychosocial workload and musculoskeletal pain interfering with normal life (SF 12) was analyzed by multiple logistic regression. The model was adjusted for eight background covariates: age, gender, growing-up environment, educational level, if living alone or not, obesity, smoking, and leisure physical activity. Negative psychosocial and heavy physical workloads were independently associated with musculoskeletal pain interfering with normal life (adjusted OR: 4.44, 95% CI: 2.84-6.92), and (adjusted OR: 1.88, 95% CI: 1.20-2.93), respectively. The background covariates female gender and higher education were also associated with musculoskeletal pain interfering with normal life, and physical leisure activity was inversely associated. The findings suggest that negative psychosocial and heavy physical workloads are strongly associated with musculoskeletal pain interfering with normal life in older adults. © 2015 the Nordic Societies of Public Health.
Liu, Weihua; Yang, Yi; Wang, Shuqing; Liu, Yang
2014-01-01
Order insertion often occurs in the scheduling process of logistics service supply chain (LSSC), which disturbs normal time scheduling especially in the environment of mass customization logistics service. This study analyses order similarity coefficient and order insertion operation process and then establishes an order insertion scheduling model of LSSC with service capacity and time factors considered. This model aims to minimize the average unit volume operation cost of logistics service integrator and maximize the average satisfaction degree of functional logistics service providers. In order to verify the viability and effectiveness of our model, a specific example is numerically analyzed. Some interesting conclusions are obtained. First, along with the increase of completion time delay coefficient permitted by customers, the possible inserting order volume first increases and then trends to be stable. Second, supply chain performance reaches the best when the volume of inserting order is equal to the surplus volume of the normal operation capacity in mass service process. Third, the larger the normal operation capacity in mass service process is, the bigger the possible inserting order's volume will be. Moreover, compared to increasing the completion time delay coefficient, improving the normal operation capacity of mass service process is more useful.
Artes, Paul H; Crabb, David P
2010-01-01
To investigate why the specificity of the Moorfields Regression Analysis (MRA) of the Heidelberg Retina Tomograph (HRT) varies with disc size, and to derive accurate normative limits for neuroretinal rim area to address this problem. Two datasets from healthy subjects (Manchester, UK, n = 88; Halifax, Nova Scotia, Canada, n = 75) were used to investigate the physiological relationship between the optic disc and neuroretinal rim area. Normative limits for rim area were derived by quantile regression (QR) and compared with those of the MRA (derived by linear regression). Logistic regression analyses were performed to quantify the association between disc size and positive classifications with the MRA, as well as with the QR-derived normative limits. In both datasets, the specificity of the MRA depended on optic disc size. The odds of observing a borderline or outside-normal-limits classification increased by approximately 10% for each 0.1 mm(2) increase in disc area (P < 0.1). The lower specificity of the MRA with large optic discs could be explained by the failure of linear regression to model the extremes of the rim area distribution (observations far from the mean). In comparison, the normative limits predicted by QR were larger for smaller discs (less specific, more sensitive), and smaller for larger discs, such that false-positive rates became independent of optic disc size. Normative limits derived by quantile regression appear to remove the size-dependence of specificity with the MRA. Because quantile regression does not rely on the restrictive assumptions of standard linear regression, it may be a more appropriate method for establishing normative limits in other clinical applications where the underlying distributions are nonnormal or have nonconstant variance.
ERIC Educational Resources Information Center
Kasapoglu, Koray
2014-01-01
This study aims to investigate which factors are associated with Turkey's 15-year-olds' scoring above the OECD average (493) on the PISA'09 reading assessment. Collected from a total of 4,996 15-year-old students from Turkey, data were analyzed by logistic regression analysis in order to model the data of students who were split into two: (1)…
Patregnani, Jason T; Borgman, Matthew A; Maegele, Marc; Wade, Charles E; Blackbourne, Lorne H; Spinella, Philip C
2012-05-01
In adults, early traumatic coagulopathy and shock are both common and independently associated with mortality. There are little data regarding both the incidence and association of early coagulopathy and shock on outcomes in pediatric patients with traumatic injuries. Our objective was to determine whether coagulopathy and shock on admission are independently associated with mortality in children with traumatic injuries. A retrospective review of the Joint Theater Trauma Registry from U.S. combat support hospitals in Iraq and Afghanistan from 2002 to 2009 was performed. Coagulopathy was defined as an international normalized ratio of ≥1.5 and shock as a base deficit of ≥6. Laboratory values were measured on admission. Primary outcome was inhospital mortality. Univariate analyses were performed on all admission variables followed by reverse stepwise multivariate logistic regression to determine independent associations. Combat support hospitals in Iraq and Afghanistan. Patients <18 yrs of age with Injury Severity Score, international normalized ratio, base deficit, and inhospital mortality were included. Of 1998 in the cohort, 744 (37%) had a complete set of data for analysis. None. The incidence of early coagulopathy and shock were 27% and 38.3% and associated with mortality of 22% and 16.8%, respectively. After multivariate logistic regression, early coagulopathy had an odds ratio of 2.2 (95% confidence interval 1.1-4.5) and early shock had an odds ratio of 3.0 (95% confidence interval 1.2-7.5) for mortality. Patients with coagulopathy and shock had an odds ratio of 3.8 (95% confidence interval 2.0-7.4) for mortality. In children with traumatic injuries treated at combat support hospitals, coagulopathy and shock on admission are common and independently associated with a high incidence of inhospital mortality. 
Future studies are needed to determine whether more rapid and accurate methods of measuring coagulopathy and shock, as well as early goal-directed treatment of these states, can improve outcomes in children.
Upgrade Summer Severe Weather Tool
NASA Technical Reports Server (NTRS)
Watson, Leela
2011-01-01
The goal of this task was to upgrade the existing severe weather database by adding observations from the 2010 warm season, update the verification dataset with results from the 2010 warm season, apply statistical logistic regression analysis to the database, and develop a new forecast tool. The AMU analyzed 7 stability parameters that showed the possibility of providing guidance in forecasting severe weather, calculated verification statistics for the Total Threat Score (TTS), and calculated warm season verification statistics for the 2010 season. The AMU also performed statistical logistic regression analysis on the 22-year severe weather database. The results indicated that the logistic regression equation did not show an increase in skill over the previously developed TTS. The equation was less accurate than the TTS at predicting severe weather, showed little ability to distinguish between severe and non-severe weather days, and had worse standard categorical accuracy measures and skill scores than the TTS.
Estimating the Probability of Rare Events Occurring Using a Local Model Averaging.
Chen, Jin-Hua; Chen, Chun-Shu; Huang, Meng-Fan; Lin, Hung-Chih
2016-10-01
In statistical applications, logistic regression is a popular method for analyzing binary data accompanied by explanatory variables. But when one of the two outcomes is rare, the estimation of model parameters has been shown to be severely biased and hence estimating the probability of rare events occurring based on a logistic regression model would be inaccurate. In this article, we focus on estimating the probability of rare events occurring based on logistic regression models. Instead of selecting a best model, we propose a local model averaging procedure based on a data perturbation technique applied to different information criteria to obtain different probability estimates of rare events occurring. Then an approximately unbiased estimator of Kullback-Leibler loss is used to choose the best one among them. We design complete simulations to show the effectiveness of our approach. For illustration, a necrotizing enterocolitis (NEC) data set is analyzed. © 2016 Society for Risk Analysis.
Evaluating the perennial stream using logistic regression in central Taiwan
NASA Astrophysics Data System (ADS)
Ruljigaljig, T.; Cheng, Y. S.; Lin, H. I.; Lee, C. H.; Yu, T. T.
2014-12-01
This study produces a perennial stream head potential map based on a logistic regression method with a Geographic Information System (GIS). Perennial stream initiation locations, which indicate where groundwater meets the surface, were identified in the study area by field survey. The perennial stream potential map for central Taiwan was constructed using the relationship between perennial streams and their causative factors, such as catchment area, slope gradient, aspect, elevation, groundwater recharge and precipitation. Field surveys of 272 streams were carried out in the study area. The area under the curve for the logistic regression method was 0.87. The results illustrate the importance of catchment area and groundwater recharge as key factors within the model. The results obtained from the model within the GIS were then used to produce a map of perennial streams and to estimate the location of perennial stream heads.
Menditto, Anthony A; Linhorst, Donald M; Coleman, James C; Beck, Niels C
2006-04-01
Development of policies and procedures to contend with the risks presented by elopement, aggression, and suicidal behaviors are long-standing challenges for mental health administrators. Guidance in making such judgments can be obtained through the use of a multivariate statistical technique known as logistic regression. This procedure can be used to develop a predictive equation that is mathematically formulated to use the best combination of predictors, rather than considering just one factor at a time. This paper presents an overview of logistic regression and its utility in mental health administrative decision making. A case example of its application is presented using data on elopements from Missouri's long-term state psychiatric hospitals. Ultimately, the use of statistical prediction analyses tempered with differential qualitative weighting of classification errors can augment decision-making processes in a manner that provides guidance and flexibility while wrestling with the complex problem of risk assessment and decision making.
Lei, Yang; Nollen, Nikki; Ahluwahlia, Jasjit S; Yu, Qing; Mayo, Matthew S
2015-04-09
Other forms of tobacco use are increasing in prevalence, yet most tobacco control efforts are aimed at cigarettes. In light of this, it is important to identify individuals who are using both cigarettes and alternative tobacco products (ATPs). Most previous studies have used regression models. We fitted a traditional logistic regression model and a classification and regression tree (CART) model to illustrate and discuss the added advantages of using CART to identify high-risk subgroups of ATP users among cigarette smokers. The data were collected from an online cross-sectional survey administered by Survey Sampling International between July 5, 2012 and August 15, 2012. Eligible participants self-identified as current smokers, African American, White, or Latino (of any race), were English-speaking, and were at least 25 years old. The study sample included 2,376 participants and was divided into independent training and validation samples for a hold-out validation. Logistic regression and CART models were used to examine the important predictors of cigarettes + ATP users. The logistic regression model identified nine important factors: gender, age, race, nicotine dependence, buying cigarettes or borrowing, whether the price of cigarettes influences the brand purchased, whether the participants set limits on cigarettes per day, alcohol use scores, and discrimination frequencies. The c-index of the logistic regression model was 0.74, indicating good discriminatory capability. The model also performed well in the validation cohort, with good discrimination (c-index = 0.73) and excellent calibration (R-square = 0.96 in the calibration regression). The parsimonious CART model identified gender, age, alcohol use score, race, and discrimination frequencies as the most important factors. It also revealed interesting partial interactions. The c-index was 0.70 for the training sample and 0.69 for the validation sample.
The misclassification rate was 0.342 for the training sample and 0.346 for the validation sample. The CART model was easier to interpret and discovered target populations that possess clinical significance. This study suggests that the non-parametric CART model is parsimonious, potentially easier to interpret, and provides additional information in identifying the subgroups at high risk of ATP use among cigarette smokers.
Akkus, Zeki; Camdeviren, Handan; Celik, Fatma; Gur, Ali; Nas, Kemal
2005-09-01
To determine the risk factors of osteoporosis using a multiple binary logistic regression method and to assess the risk variables for osteoporosis, which is a major and growing health problem in many countries. We present a case-control study consisting of 126 postmenopausal healthy women as the control group and 225 postmenopausal osteoporotic women as the case group. The study was carried out in the Department of Physical Medicine and Rehabilitation, Dicle University, Diyarbakir, Turkey between 1999 and 2002. The data from the 351 participants were collected using a standard questionnaire that contains 43 variables. A multiple logistic regression model was then used to evaluate the data and to find the best regression model. The regression model correctly classified 80.1% (281/351) of the participants. The specificity of the model was 67% (84/126 of the control group) and the sensitivity was 88% (197/225 of the case group). We found the distribution of the standardized residuals of the final model to be exponential using the Kolmogorov-Smirnov test (p=0.193). The receiver operating characteristic curve was found to be successful in predicting patients at risk for osteoporosis. This study suggests that low levels of dietary calcium intake, physical activity, education, and longer duration of menopause are independent predictors of the risk of low bone density in our population. Adequate dietary calcium intake in combination with maintaining daily physical activity, increasing educational level, decreasing birth rate, and duration of breast-feeding may contribute to healthy bones and play a role in practical prevention of osteoporosis in Southeast Anatolia. In addition, the findings of the present study indicate that, for osteoporosis, which may be influenced by many variables, a multivariate statistical method such as multiple logistic regression is better than univariate statistical evaluation.
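The classification figures quoted in this abstract can be checked directly from the reported counts. The sketch below is illustrative only; the function name `classification_summary` is hypothetical, and the false-negative and false-positive counts are derived by subtraction from the group sizes given in the abstract.

```python
def classification_summary(tp: int, fn: int, tn: int, fp: int) -> dict:
    """Sensitivity, specificity and overall accuracy from a 2x2 confusion table."""
    total = tp + fn + tn + fp
    return {
        "sensitivity": tp / (tp + fn),  # correctly classified cases
        "specificity": tn / (tn + fp),  # correctly classified controls
        "accuracy": (tp + tn) / total,  # correctly classified overall
    }


# Counts from the abstract: 197 of 225 cases and 84 of 126 controls classified correctly.
summary = classification_summary(tp=197, fn=225 - 197, tn=84, fp=126 - 84)
for name, value in summary.items():
    print(f"{name}: {value:.3f}")
```

The three reported percentages (88%, 67%, 80.1%) are mutually consistent, since 197 + 84 = 281 correct classifications out of 351 participants.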
Shi, K-Q; Zhou, Y-Y; Yan, H-D; Li, H; Wu, F-L; Xie, Y-Y; Braddock, M; Lin, X-Y; Zheng, M-H
2017-02-01
At present, there is no ideal model for predicting the short-term outcome of patients with acute-on-chronic hepatitis B liver failure (ACHBLF). This study aimed to establish and validate a prognostic model by using the classification and regression tree (CART) analysis. A total of 1047 patients from two separate medical centres with suspected ACHBLF were screened in the study, which were recognized as derivation cohort and validation cohort, respectively. CART analysis was applied to predict the 3-month mortality of patients with ACHBLF. The accuracy of the CART model was tested using the area under the receiver operating characteristic curve, which was compared with the model for end-stage liver disease (MELD) score and a new logistic regression model. CART analysis identified four variables as prognostic factors of ACHBLF: total bilirubin, age, serum sodium and INR, and three distinct risk groups: low risk (4.2%), intermediate risk (30.2%-53.2%) and high risk (81.4%-96.9%). The new logistic regression model was constructed with four independent factors, including age, total bilirubin, serum sodium and prothrombin activity by multivariate logistic regression analysis. The performances of the CART model (0.896), similar to the logistic regression model (0.914, P=.382), exceeded that of MELD score (0.667, P<.001). The results were confirmed in the validation cohort. We have developed and validated a novel CART model superior to MELD for predicting three-month mortality of patients with ACHBLF. Thus, the CART model could facilitate medical decision-making and provide clinicians with a validated practical bedside tool for ACHBLF risk stratification. © 2016 John Wiley & Sons Ltd.
Arevalillo, Jorge M; Sztein, Marcelo B; Kotloff, Karen L; Levine, Myron M; Simon, Jakub K
2017-10-01
Immunologic correlates of protection are important in vaccine development because they give insight into mechanisms of protection, assist in the identification of promising vaccine candidates, and serve as endpoints in bridging clinical vaccine studies. Our goal is the development of a methodology to identify immunologic correlates of protection using the Shigella challenge as a model. The proposed methodology utilizes the Random Forests (RF) machine learning algorithm as well as Classification and Regression Trees (CART) to detect immune markers that predict protection, identify interactions between variables, and define optimal cutoffs. Logistic regression modeling is applied to estimate the probability of protection and the confidence interval (CI) for such a probability is computed by bootstrapping the logistic regression models. The results demonstrate that the combination of Classification and Regression Trees and Random Forests complements the standard logistic regression and uncovers subtle immune interactions. Specific levels of immunoglobulin IgG antibody in blood on the day of challenge predicted protection in 75% (95% CI 67-86). Of those subjects that did not have blood IgG at or above a defined threshold, 100% were protected if they had IgA antibody secreting cells above a defined threshold. Comparison with the results obtained by applying only logistic regression modeling with standard Akaike Information Criterion for model selection shows the usefulness of the proposed method. Given the complexity of the immune system, the use of machine learning methods may enhance traditional statistical approaches. When applied together, they offer a novel way to quantify important immune correlates of protection that may help the development of vaccines. Copyright © 2017 Elsevier Inc. All rights reserved.
NASA Astrophysics Data System (ADS)
Schaeben, Helmut; Semmler, Georg
2016-09-01
The objective of prospectivity modeling is prediction of the conditional probability of the presence T = 1 or absence T = 0 of a target T given favorable or prohibitive predictors B, or construction of a two-class (0/1) classification of T. A special case of logistic regression called weights-of-evidence (WofE) is geologists' favorite method of prospectivity modeling due to its apparent simplicity. However, the numerical simplicity is deceptive, as it is implied by the severe mathematical modeling assumption of joint conditional independence of all predictors given the target. General weights of evidence are explicitly introduced which are as simple to estimate as conventional weights, i.e., by counting, but do not require conditional independence. Complementary to the regression view is the classification view on prospectivity modeling. Boosting is the construction of a strong classifier from a set of weak classifiers. From the regression point of view it is closely related to logistic regression. Boost weights-of-evidence (BoostWofE) was introduced into prospectivity modeling to counterbalance violations of the assumption of conditional independence, even though relaxation of modeling assumptions with respect to weak classifiers was not the (initial) purpose of boosting. In the original publication of BoostWofE a fabricated dataset was used to "validate" this approach. Using the same fabricated dataset it is shown that BoostWofE cannot generally compensate for a lack of conditional independence, whatever the order in which the predictors are processed. Thus the alleged features of BoostWofE are disproved by way of counterexamples, while theoretical findings are confirmed that logistic regression including interaction terms can exactly compensate for violations of joint conditional independence if the predictors are indicators.
Separation in Logistic Regression: Causes, Consequences, and Control.
Mansournia, Mohammad Ali; Geroldinger, Angelika; Greenland, Sander; Heinze, Georg
2018-04-01
Separation is encountered in regression models with a discrete outcome (such as logistic regression) where the covariates perfectly predict the outcome. It is most frequent under the same conditions that lead to small-sample and sparse-data bias, such as presence of a rare outcome, rare exposures, highly correlated covariates, or covariates with strong effects. In theory, separation will produce infinite estimates for some coefficients. In practice, however, separation may be unnoticed or mishandled because of software limits in recognizing and handling the problem and in notifying the user. We discuss causes of separation in logistic regression and describe how common software packages deal with it. We then describe methods that remove separation, focusing on the same penalized-likelihood techniques used to address more general sparse-data problems. These methods improve accuracy, avoid software problems, and allow interpretation as Bayesian analyses with weakly informative priors. We discuss likelihood penalties, including some that can be implemented easily with any software package, and their relative advantages and disadvantages. We provide an illustration of ideas and methods using data from a case-control study of contraceptive practices and urinary tract infection.
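The behaviour described in this abstract, where separation drives coefficient estimates toward infinity and a likelihood penalty restores a finite estimate, can be illustrated on a toy dataset. This is a minimal sketch under stated assumptions: the four data points are invented, the model has a single slope and no intercept, and a simple ridge (quadratic) penalty stands in for the penalized-likelihood methods the article discusses, which may differ. The function name `fit_slope` is hypothetical.

```python
import math


def fit_slope(x, y, penalty=0.0, steps=2000, lr=0.5):
    """Gradient ascent on the (optionally ridge-penalized) logistic log-likelihood.

    Model: P(y = 1 | x) = 1 / (1 + exp(-beta * x)), single slope, no intercept.
    """
    beta = 0.0
    for _ in range(steps):
        # Score function: sum over observations of (y_i - p_i) * x_i.
        grad = sum(
            (yi - 1.0 / (1.0 + math.exp(-beta * xi))) * xi for xi, yi in zip(x, y)
        )
        beta += lr * (grad - penalty * beta)  # ridge term shrinks beta toward 0
    return beta


# Invented, perfectly separated data: every x < 0 has y = 0, every x > 0 has y = 1.
x = [-2.0, -1.0, 1.0, 2.0]
y = [0, 0, 1, 1]

# Without a penalty the estimate keeps growing as iterations are added.
print(fit_slope(x, y, steps=200), fit_slope(x, y, steps=2000))
# With a penalty the estimate settles at a finite value.
print(fit_slope(x, y, penalty=1.0, steps=200), fit_slope(x, y, penalty=1.0, steps=2000))
```

In the unpenalized case the score is always positive, so the iteration never converges; software that stops after a fixed number of iterations will report a large but arbitrary coefficient, which is one way separation can go unnoticed.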
NASA Astrophysics Data System (ADS)
Nong, Yu; Du, Qingyun; Wang, Kun; Miao, Lei; Zhang, Weiwei
2008-10-01
Urban growth modeling, one of the most important aspects of land use and land cover change study, has attracted substantial attention because it helps to explain the mechanisms of land use change and thus supports relevant policy making. This study applied multinomial logistic regression to model urban growth in Jiayu county, Hubei province, China, to discover the relationship between urban growth and its driving forces, with biophysical and socio-economic factors selected as independent variables. This type of regression is similar to binary logistic regression, but it is more general because the dependent variable is not restricted to two categories, as it was in previous studies. The multinomial model can simulate the process of land use competition among urban land, bare land, cultivated land and orchard land. Taking the Urban land use type as the reference category, parameters could be estimated and expressed as odds ratios. A probability map is generated from the model to predict where urban growth will occur.
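A multinomial logistic model with a reference category turns one linear predictor per non-reference class into a full set of class probabilities. The sketch below shows that step only, with Urban as the reference as in the abstract; the linear-predictor values and the function name `multinomial_probs` are hypothetical and are not taken from the study.

```python
import math


def multinomial_probs(linear_predictors: dict, reference: str = "urban") -> dict:
    """Class probabilities for a multinomial logit with one reference category.

    linear_predictors maps each non-reference class k to eta_k = x . beta_k,
    the log-odds of class k versus the reference (whose eta is fixed at 0).
    """
    exp_eta = {k: math.exp(v) for k, v in linear_predictors.items()}
    denom = 1.0 + sum(exp_eta.values())
    probs = {k: v / denom for k, v in exp_eta.items()}
    probs[reference] = 1.0 / denom
    return probs


# Hypothetical log-odds (versus urban) for one map cell.
cell = {"bare": -0.4, "cultivated": 0.9, "orchard": -1.2}
for land_use, p in multinomial_probs(cell).items():
    print(f"{land_use}: {p:.3f}")
```

Exponentiating a fitted coefficient gives the odds ratio of that land use type relative to the reference category for a one-unit change in the corresponding driver.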
Occlusal factors are not related to self-reported bruxism.
Manfredini, Daniele; Visscher, Corine M; Guarda-Nardini, Luca; Lobbezoo, Frank
2012-01-01
To estimate the contribution of various occlusal features of the natural dentition that may identify self-reported bruxers compared to nonbruxers. Two age- and sex-matched groups of self-reported bruxers (n = 67) and self-reported nonbruxers (n = 75) took part in the study. For each patient, the following occlusal features were clinically assessed: retruded contact position (RCP) to intercuspal contact position (ICP) slide length (< 2 mm was considered normal), vertical overlap (< 0 mm was considered an anterior open bite; > 4 mm, a deep bite), horizontal overlap (> 4 mm was considered a large horizontal overlap), incisor dental midline discrepancy (< 2 mm was considered normal), and the presence of a unilateral posterior crossbite, mediotrusive interferences, and laterotrusive interferences. A multiple logistic regression model was used to identify the significant associations between the assessed occlusal features (independent variables) and self-reported bruxism (dependent variable). Accuracy values to predict self-reported bruxism were unacceptable for all occlusal variables. The only variable remaining in the final regression model was laterotrusive interferences (P = .030). The percentage of explained variance for bruxism by the final multiple regression model was 4.6%. This model including only one occlusal factor showed low positive (58.1%) and negative predictive values (59.7%), thus showing a poor accuracy to predict the presence of self-reported bruxism (59.2%). This investigation suggested that the contribution of occlusion to the differentiation between bruxers and nonbruxers is negligible. This finding supports theories that advocate a much diminished role for peripheral anatomical-structural factors in the pathogenesis of bruxism.
Logistic Approximation to the Normal: The KL Rationale
ERIC Educational Resources Information Center
Savalei, Victoria
2006-01-01
A rationale is proposed for approximating the normal distribution with a logistic distribution using a scaling constant based on minimizing the Kullback-Leibler (KL) information, that is, the expected amount of information available in a sample to distinguish between two competing distributions using a likelihood ratio (LR) test, assuming one of…
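To make the idea of a scaling constant concrete, the sketch below compares the standard normal CDF with a rescaled logistic CDF on a grid. It uses the classical constant 1.702, which is chosen to minimize the maximum CDF difference; this is an assumption for illustration and is not the KL-based constant the article derives. The helper names are hypothetical.

```python
import math


def normal_cdf(x: float) -> float:
    """Standard normal CDF via the error function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def logistic_cdf(x: float, scale: float) -> float:
    """Logistic CDF evaluated at scale * x."""
    return 1.0 / (1.0 + math.exp(-scale * x))


def max_abs_gap(scale: float, lo: float = -6.0, hi: float = 6.0, n: int = 2401) -> float:
    """Largest absolute CDF difference on an evenly spaced grid over [lo, hi]."""
    step = (hi - lo) / (n - 1)
    return max(
        abs(normal_cdf(lo + i * step) - logistic_cdf(lo + i * step, scale))
        for i in range(n)
    )


print(max_abs_gap(1.702))  # classical minimax-style constant
print(max_abs_gap(1.0))    # unscaled logistic, for comparison
```

A different criterion, such as the KL information used in the article, generally leads to a different constant because it weights discrepancies in the densities rather than the worst-case gap in the CDFs.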
Stability of Early EEG Background Patterns After Pediatric Cardiac Arrest.
Abend, Nicholas S; Xiao, Rui; Kessler, Sudha Kilaru; Topjian, Alexis A
2018-05-01
We aimed to determine whether EEG background characteristics remain stable across discrete time periods during the acute period after resuscitation from pediatric cardiac arrest. Children resuscitated from cardiac arrest underwent continuous conventional EEG monitoring. The EEG was scored in 12-hour epochs for up to 72 hours after return of circulation by an electroencephalographer using a Background Category with 4 levels (normal, slow-disorganized, discontinuous/burst-suppression, or attenuated-featureless) or 2 levels (normal/slow-disorganized or discontinuous/burst-suppression/attenuated-featureless). Survival analyses and mixed-effects ordinal logistic regression models evaluated whether the EEG remained stable across epochs. EEG monitoring was performed in 89 consecutive children. When EEG was assessed as the 4-level Background Category, 30% of subjects changed category over time. Based on initial Background Category, one quarter of the subjects changed EEG category by 24 hours if the initial EEG was attenuated-featureless, by 36 hours if the initial EEG was discontinuous or burst-suppression, by 48 hours if the initial EEG was slow-disorganized, and never if the initial EEG was normal. However, regression modeling for the 4-level Background Category indicated that the EEG did not change over time (odds ratio = 1.06, 95% confidence interval = 0.96-1.17, P = 0.26). Similarly, when EEG was assessed as the 2-level Background Category, 8% of subjects changed EEG category over time. However, regression modeling for the 2-level category indicated that the EEG did not change over time (odds ratio = 1.02, 95% confidence interval = 0.91-1.13, P = 0.75). The EEG Background Category changes over time whether analyzed as 4 levels (30% of subjects) or 2 levels (8% of subjects), although regression analyses indicated that no significant changes occurred over time for the full cohort. 
These data indicate that the Background Category is often stable during the acute 72 hours after pediatric cardiac arrest and thus may be a useful EEG assessment metric in future studies, but that some subjects do have EEG changes over time and therefore serial EEG assessments may be informative.
Childhood growth and development associated with need for full-time special education at school age.
Mannerkoski, Minna; Aberg, Laura; Hoikkala, Marianne; Sarna, Seppo; Kaski, Markus; Autti, Taina; Heiskala, Hannu
2009-01-01
To explore how growth measurements and attainment of developmental milestones in early childhood reflect the need for full-time special education (SE). After stratification in this population-based study, 900 pupils in full-time SE groups (age-range 7-16 years, mean 12 years 8 months) at three levels and 301 pupils in mainstream education (age-range 7-16, mean 12 years 9 months) provided data on height and weight from birth to age 7 years and head circumference to age 1 year. Developmental screening was evaluated from age 1 month to 48 months. Statistical methods included a general linear model (growth measurements), binary logistic regression analysis (odds ratios for growth), and multinomial logistic regression analysis (odds ratios for developmental milestones). At 1 year, a 1 standard deviation score (SDS) decrease in height raised the probability of SE placement by 40%, and a 1 SDS decrease in head size by 28%. In developmental screening, during the first months of life the gross motor milestones, especially head support, differentiated the children at levels 0-3. Thereafter, the fine motor milestones and those related to speech and social skills became more important. Children whose growth is mildly impaired, though in the normal range, and who fail to attain certain developmental milestones have an increased probability for SE and thus a need for special attention when toddlers age. Similar to the growth curves, these children seem to have consistent developmental curves (patterns).
Solinsky, R; Bunnell, A E; Linsenmeyer, T A; Svircev, J N; Engle, A; Burns, S P
2017-10-01
Secondary analysis of prospectively collected observational data assessing the safety of an autonomic dysreflexia (AD) management protocol. To estimate the time to onset of action, time to full clinical effect (sustained systolic blood pressure (SBP) <160 mm Hg) and effectiveness of nitroglycerin ointment at lowering blood pressure for patients with spinal cord injuries experiencing AD. US Veterans Affairs inpatient spinal cord injury (SCI) unit. Episodes of AD recalcitrant to nonpharmacologic interventions that were given one to two inches of 2% topical nitroglycerin ointment were recorded. Pharmacodynamics as above and predictive characteristics (through a mixed multivariate logistic regression model) were calculated. A total of 260 episodes of pharmacologically managed AD were recorded in 56 individuals. Time to onset of action for nitroglycerin ointment was 9-11 min. Time to full clinical effect was 14-20 min. Topical nitroglycerin controlled SBP <160 mm Hg in 77.3% of pharmacologically treated AD episodes with the remainder requiring additional antihypertensive medications. A multivariate logistic regression model was unable to identify statistically significant factors to predict which patients would respond to nitroglycerin ointment (odds ratios 95% confidence intervals 0.29-4.93). The adverse event rate, entirely attributed to hypotension, was 3.6% with seven of the eight events resolving with close observation alone and one episode requiring normal saline. Nitroglycerin ointment has a rapid onset of action and time to full clinical effect with high efficacy and relatively low adverse event rate for patients with SCI experiencing AD.
An Evidence-Based Approach to Defining Fetal Macrosomia.
Froehlich, Rosemary; Simhan, Hyagriv N; Larkin, Jacob C
2016-04-01
This study aims to determine the risk of adverse outcomes associated with the current diagnostic criteria for fetal macrosomia. We evaluated three techniques for characterizing birth weight as a predictor of shoulder dystocia or third- or fourth-degree laceration in 79,879 vaginal deliveries. First, we compared deliveries with birth weights above or below 4,500 g. We then performed logistic regression using birth weight as a continuous predictor, both with and without fractional polynomial transformation. Finally, we calculated the number of cesarean sections required to prevent one incident of the interrogated outcomes (number needed to treat [NNT]). Rates of adverse intrapartum outcomes increase incrementally with increasing birth weight and are predicted most accurately with logistic regression following fractional polynomial transformation. The NNT for third- or fourth-degree laceration dropped from 14.3 (95% confidence interval [CI], 13.9-14.7) at a birth weight of 3,500 g to 6.4 (95% CI, 6.1-6.8) at 4,500 g and, for shoulder dystocia, from 54.9 (95% CI, 51.5-58.6) at 3,500 g to 5.6 (95% CI, 5.2-6.0) at 4,500 g. The conventional distinction between "normal" and "macrosomic" does not reflect the incremental effect of increasing birth weight on the risk of obstetric morbidity. Outcomes analysis can inform fetal growth standards to better reflect relevant thresholds of risk.
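The number needed to treat is the reciprocal of the absolute risk reduction, so the NNT values quoted above can be converted back into implied risk differences. The sketch below shows that arithmetic only; the function names are hypothetical, and it makes no claim about how the authors estimated risks from their regression models.

```python
def nnt(risk_without: float, risk_with: float) -> float:
    """Number needed to treat, from risks without and with the intervention."""
    arr = risk_without - risk_with  # absolute risk reduction
    if arr <= 0:
        raise ValueError("intervention does not reduce risk; NNT is undefined")
    return 1.0 / arr


def implied_arr(nnt_value: float) -> float:
    """Absolute risk reduction implied by a reported NNT."""
    return 1.0 / nnt_value


# Shoulder dystocia NNTs reported in the abstract at 3,500 g and 4,500 g.
for weight_g, reported_nnt in [(3500, 54.9), (4500, 5.6)]:
    print(f"{weight_g} g: implied ARR = {implied_arr(reported_nnt):.3f}")
```

Read this way, the drop in NNT from 54.9 to 5.6 corresponds to the implied absolute risk reduction rising from about 1.8% to about 17.9%.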
Boonvisudhi, Thummaporn; Kuladee, Sanchai
2017-01-01
To study the extent of Internet addiction (IA) and its association with depression in Thai medical students. A cross-sectional study was conducted at the Faculty of Medicine, Ramathibodi Hospital. Participants were first- to fifth-year medical students who agreed to participate in this study. Demographic characteristics and stress-related factors were derived from self-rated questionnaires. Depression was assessed using the Thai version of the Patient Health Questionnaire (PHQ-9). A total score of five or greater on the Thai version of the Young Diagnostic Questionnaire for Internet Addiction was classified as "possible IA". The chi-square test and logistic regression were then used to evaluate the associations between possible IA, depression and associated factors. Of 705 participants, 24.4% had possible IA and 28.8% had depression. There was a statistically significant association between possible IA and depression (odds ratio (OR) 1.92, 95% confidence interval (CI): 1.34-2.77, P-value <0.001). Logistic regression analysis showed that the odds of depression in the possible IA group were 1.58 times those in the group with normal Internet use (95% CI: 1.04-2.38, P-value = 0.031). Academic problems were found to be a significant predictor of both possible IA and depression. IA was likely to be a common psychiatric problem among Thai medical students. The research has also shown that possible IA was associated with depression and academic problems. We suggest that surveillance of IA should be considered in medical schools.
Panic anxiety, under the weather?
NASA Astrophysics Data System (ADS)
Bulbena, A.; Pailhez, G.; Aceña, R.; Cunillera, J.; Rius, A.; Garcia-Ribera, C.; Gutiérrez, J.; Rojo, C.
2005-03-01
The relationship between weather conditions and psychiatric disorders has been a continuous subject of speculation due to contradictory findings. This study attempts to further clarify this relationship by focussing on specific conditions such as panic attacks and non-panic anxiety in relation to specific meteorological variables. All psychiatric emergencies attended at a general hospital in Barcelona (Spain) during 2002 with anxiety as the main complaint were classified as panic or non-panic anxiety according to strict independent and retrospective criteria. Both groups were assessed and compared with meteorological data (wind speed and direction, daily rainfall, temperature, humidity and solar radiation). Seasons and weekend days were also included as independent variables. Non-parametric statistics were used throughout since most variables do not follow a normal distribution. Logistic regression models were applied to predict days with and without the clinical condition. Episodes of panic were three times more common with the poniente wind (hot wind), half as common with rainfall, and one and a half times more common in autumn than in other seasons. These three trends (hot wind, rainfall and autumn) were cumulative for panic episodes in a logistic regression formula. A significant reduction of episodes on weekends was found only for non-panic episodes. Panic attacks, unlike other anxiety episodes, in a psychiatric emergency department in Barcelona seem to show significant meteorotropism. Assessing specific disorders instead of overall emergencies or other variables of a more general quality could shed new light on the relationship between weather conditions and behaviour.
Sperm function and assisted reproduction technology
MAAß, GESA; BÖDEKER, ROLF‐HASSO; SCHEIBELHUT, CHRISTINE; STALF, THOMAS; MEHNERT, CLAAS; SCHUPPE, HANS‐CHRISTIAN; JUNG, ANDREAS; SCHILL, WOLF‐BERNHARD
2005-01-01
The evaluation of different functional sperm parameters has become a tool in andrological diagnosis. These assays determine the sperm's capability to fertilize an oocyte. It also appears that sperm functions and semen parameters are interrelated and interdependent. Therefore, the question arose whether a given laboratory test or a battery of tests can predict the outcome in in vitro fertilization (IVF). One‐hundred and sixty‐one patients who underwent an IVF treatment were selected from a database of 4178 patients who had been examined for male infertility 3 months before or after IVF. Sperm concentration, motility, acrosin activity, acrosome reaction, sperm morphology, maternal age, number of transferred embryos, embryo score, fertilization rate and pregnancy rate were determined. In addition, logistic regression models to describe fertilization rate and pregnancy were developed. All the parameters in the models were dichotomized and intra‐ and interindividual variability of the parameters were assessed. Although the sperm parameters showed good correlations with IVF when correlated separately, the only essential parameter in the multivariate model was morphology. The enormous intra‐ and interindividual variability of the values was striking. In conclusion, our data indicate that the andrological status at the end of the respective treatment does not necessarily represent the status at the time of IVF. Despite a relatively low correlation coefficient in the logistic regression model, it appears that among the parameters tested, the most reliable parameter to predict fertilization is normal sperm morphology. (Reprod Med Biol 2005; 4: 7–30) PMID:29699207
Association of sarcopenia with functional decline in community-dwelling elderly subjects in Japan.
Tanimoto, Yoshimi; Watanabe, Misuzu; Sun, Wei; Tanimoto, Keiji; Shishikura, Kanako; Sugiura, Yumiko; Kusabiraki, Toshiyuki; Kono, Koichi
2013-10-01
The present study aimed to determine the association of sarcopenia, defined by muscle mass, muscle strength and physical performance, with functional disability from a 2-year cohort study of community-dwelling elderly Japanese people. Participants were 743 community-dwelling elderly Japanese people aged 65 years or older. We used bioelectrical impedance analysis (BIA) to measure muscle mass, grip strength to measure muscle strength, and usual walking speed to measure physical performance in a baseline study. Functional disability was defined using an activities of daily living (ADL) scale and instrumental activities of daily living (IADL) scale at baseline and during follow-up examinations 2 years later. Logistic regression analysis, adjusted for age and body mass index, was used to examine the association between sarcopenia and the occurrence of functional disability. In the present study, 7.8% of men and 10.2% of women were classified as having sarcopenia. Among sarcopenia patients in the baseline study, 36.8% of men and 18.8% of women became dependent in ADL at 2-year follow up. From the logistic regression analysis adjusted by age and body mass index, sarcopenia was significantly associated with the occurrences of physical disability compared with normal subjects in both men and women. Sarcopenia, defined by muscle mass, muscle strength and physical performance, was associated with functional decline over a 2-year period in elderly Japanese. Interventions to prevent sarcopenia are very important to prevent functional decline among elderly individuals. © 2013 Japan Geriatrics Society.
Logistic Mixed Models to Investigate Implicit and Explicit Belief Tracking.
Lages, Martin; Scheel, Anne
2016-01-01
We investigated the proposition of a two-systems Theory of Mind in adults' belief tracking. A sample of N = 45 participants predicted the choice of one of two opponent players after observing several rounds in an animated card game. Three matches of this card game were played and initial gaze direction on target and subsequent choice predictions were recorded for each belief task and participant. We conducted logistic regressions with mixed effects on the binary data and developed Bayesian logistic mixed models to infer implicit and explicit mentalizing in true belief and false belief tasks. Although logistic regressions with mixed effects predicted the data well a Bayesian logistic mixed model with latent task- and subject-specific parameters gave a better account of the data. As expected explicit choice predictions suggested a clear understanding of true and false beliefs (TB/FB). Surprisingly, however, model parameters for initial gaze direction also indicated belief tracking. We discuss why task-specific parameters for initial gaze directions are different from choice predictions yet reflect second-order perspective taking.
Arthritis and Risk of Cognitive and Functional Impairment in Older Mexican Adults.
Veeranki, Sreenivas P; Downer, Brian; Jupiter, Daniel; Wong, Rebeca
2017-04-01
This study investigated the risk of cognitive and functional impairment in older Mexicans diagnosed with arthritis. Participants included 2,681 Mexicans, aged ≥60 years, enrolled in the Mexican Health and Aging Study cohort. Participants were categorized into arthritis and no arthritis exposure groups. Primary outcome included participants categorized into "cognitively impaired" or "cognitively normal" groups. Secondary outcomes included participants categorized into Normal, Functionally Impaired only, Cognitively Impaired only, or Dementia (both cognitively and functionally impaired) groups. Multivariable logistic and multinomial regression models were used to assess the relationships. Overall, 16% and 7% were diagnosed with cognitive impairment and dementia, respectively. Compared with older Mexicans without arthritis, those who were diagnosed with arthritis had significantly increased risk of functional impairment (adjusted odds ratio [OR] 1.82, 95% confidence interval [CI] = [1.45, 2.29]), but not of dementia. Arthritis is associated with increased risk of functional impairment, but not with dementia after 11 years in older Mexicans.
Colabianchi, Natalie; Ievers-Landis, Carolyn E; Borawski, Elaine A
2006-09-01
To examine the association between observer ratings of physical attractiveness and weight preoccupation for female adolescents, and to explore any ethnic differences between Caucasian, African-American, and Hispanic females. Normal-weight female adolescents who had participated in the National Longitudinal Study of Adolescent Health in-home Wave II survey were included (n = 4,324). Physical attractiveness ratings were made in vivo by interviewers. Using logistic regression models stratified by ethnicity, the associations between observer-rated attractiveness and weight preoccupation were examined after controlling for demographics, measured body mass index (BMI) and psychosocial factors. Caucasian female adolescents perceived as being more attractive reported significantly greater weight preoccupation compared with those rated as being less attractive. Observed attractiveness did not relate to weight preoccupation among African-American or Hispanic youth when controlling for other factors. For Caucasian female adolescents, being perceived by others as more attractive may be a risk factor for disordered eating.
Model selection for logistic regression models
NASA Astrophysics Data System (ADS)
Duller, Christine
2012-09-01
Model selection for logistic regression models decides which of some given potential regressors have an effect and hence should be included in the final model. The second interesting question is whether a certain factor is heterogeneous among some subsets, i.e. whether the model should include a random intercept or not. In this paper these questions will be answered with classical as well as with Bayesian methods. The applications show some results of recent research projects in medicine and business administration.
Radiomorphometric analysis of frontal sinus for sex determination.
Verma, Saumya; Mahima, V G; Patil, Karthikeya
2014-09-01
Sex determination of unknown individuals carries crucial significance in forensic research, in cases where fragments of skull persist with no likelihood of identification based on dental arch. In these instances sex determination becomes important to rule out a certain number of possibilities instantly and helps in establishing a biological profile of human remains. The aim of the study is to evaluate a mathematical method based on logistic regression analysis capable of ascertaining the sex of individuals in the South Indian population. The study was conducted in the department of Oral Medicine and Radiology. The right and left areas, maximum height, and width of the frontal sinus were determined in 100 Caldwell views of 50 women and 50 men aged 20 years and above, with the help of Vernier callipers and a square grid with 1 square measuring 1 mm² in area. Statistical analysis used Student's t-test and logistic regression analysis. The mean values of variables were greater in men, based on Student's t-test at 5% level of significance. The mathematical model based on logistic regression analysis gave percentage agreement of total area to correctly predict the female gender as 55.2%, of right area as 60.9% and of left area as 55.2%. The areas of the frontal sinus and the logistic regression proved to be unreliable in sex determination. (Logit = 0.924 - 0.00217 × right area).
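The fitted equation quoted at the end of this abstract can be turned into a predicted probability with the inverse logit. The sketch below is only an illustration of that arithmetic; it assumes the modelled event is "female" (suggested by the abstract's wording but not stated outright) and that the right area is in mm². The function name is hypothetical.

```python
import math

def predicted_probability(right_area_mm2):
    """Inverse-logit of the quoted equation: Logit = 0.924 - 0.00217 * right area."""
    logit = 0.924 - 0.00217 * right_area_mm2
    return 1.0 / (1.0 + math.exp(-logit))

# Area at which the model is indifferent (logit = 0, probability = 0.5).
crossover_area = 0.924 / 0.00217

for area in (100.0, crossover_area, 800.0):
    print(f"right area {area:7.1f} mm^2 -> probability {predicted_probability(area):.3f}")
```

Because the slope is small, the predicted probability changes slowly across plausible areas, which is in line with the abstract's conclusion that the model discriminates poorly.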
Genetic prediction of type 2 diabetes using deep neural network.
Kim, J; Kim, J; Kwak, M J; Bajaj, M
2018-04-01
Type 2 diabetes (T2DM) has strong heritability but genetic models to explain heritability have been challenging. We tested a deep neural network (DNN) to predict T2DM using the nested case-control study of Nurses' Health Study (3326 females, 45.6% T2DM) and Health Professionals Follow-up Study (2502 males, 46.5% T2DM). We selected 96, 214, 399, and 678 single-nucleotide polymorphisms (SNPs) through Fisher's exact test and L1-penalized logistic regression. We split each dataset randomly in 4:1 to train prediction models and test their performance. DNN and logistic regressions showed better area under the curve (AUC) of ROC curves than the clinical model when 399 or more SNPs were included. DNN was superior to logistic regressions in AUC with 399 or more SNPs in males and 678 SNPs in females. Addition of clinical factors consistently increased AUC of DNN but failed to improve logistic regressions with 214 or more SNPs. In conclusion, we show that DNN can be a versatile tool to predict T2DM incorporating large numbers of SNPs and clinical information. Limitations include a relatively small number of the subjects mostly of European ethnicity. Further studies are warranted to confirm and improve performance of genetic prediction models using DNN in different ethnic groups. © 2017 John Wiley & Sons A/S. Published by John Wiley & Sons Ltd.
Unconditional or Conditional Logistic Regression Model for Age-Matched Case–Control Data?
Kuo, Chia-Ling; Duan, Yinghui; Grady, James
2018-01-01
Matching on demographic variables is commonly used in case–control studies to adjust for confounding at the design stage. There is a presumption that matched data need to be analyzed by matched methods. Conditional logistic regression has become a standard for matched case–control data to tackle the sparse data problem. The sparse data problem, however, may not be a concern for loose-matching data when the matching between cases and controls is not unique, and one case can be matched to other controls without substantially changing the association. Data matched on a few demographic variables are clearly loose-matching data, and we hypothesize that unconditional logistic regression is a proper method to perform. To address the hypothesis, we compare unconditional and conditional logistic regression models by precision in estimates and hypothesis testing using simulated matched case–control data. Our results support our hypothesis; however, the unconditional model is not as robust as the conditional model to the matching distortion that the matching process not only makes cases and controls similar for matching variables but also for the exposure status. When the study design involves other complex features or the computational burden is high, matching in loose-matching data can be ignored for negligible loss in testing and estimation if the distributions of matching variables are not extremely different between cases and controls. PMID:29552553
Austin, Peter C
2010-04-22
Multilevel logistic regression models are increasingly being used to analyze clustered data in medical, public health, epidemiological, and educational research. Procedures for estimating the parameters of such models are available in many statistical software packages. There is currently little evidence on the minimum number of clusters necessary to reliably fit multilevel regression models. We conducted a Monte Carlo study to compare the performance of different statistical software procedures for estimating multilevel logistic regression models when the number of clusters was low. We examined procedures available in BUGS, HLM, R, SAS, and Stata. We found that there were qualitative differences in the performance of different software procedures for estimating multilevel logistic models when the number of clusters was low. Among the likelihood-based procedures, estimation methods based on adaptive Gauss-Hermite approximations to the likelihood (glmer in R and xtlogit in Stata) or adaptive Gaussian quadrature (Proc NLMIXED in SAS) tended to have superior performance for estimating variance components when the number of clusters was small, compared to software procedures based on penalized quasi-likelihood. However, only Bayesian estimation with BUGS allowed for accurate estimation of variance components when there were fewer than 10 clusters. For all statistical software procedures, estimation of variance components tended to be poor when there were only five subjects per cluster, regardless of the number of clusters.
Zlotnik, Alexander; Alfaro, Miguel Cuchí; Pérez, María Carmen Pérez; Gallardo-Antolín, Ascensión; Martínez, Juan Manuel Montero
2016-05-01
The usage of decision support tools in emergency departments, based on predictive models, capable of estimating the probability of admission for patients in the emergency department may give nursing staff the possibility of allocating resources in advance. We present a methodology for developing and building one such system for a large specialized care hospital using a logistic regression and an artificial neural network model using nine routinely collected variables available right at the end of the triage process. A database of 255,668 triaged nonobstetric emergency department presentations from the Ramon y Cajal University Hospital of Madrid, from January 2011 to December 2012, was used to develop and test the models, with 66% of the data used for derivation and 34% for validation, with an ordered nonrandom partition. On the validation dataset areas under the receiver operating characteristic curve were 0.8568 (95% confidence interval, 0.8508-0.8583) for the logistic regression model and 0.8575 (95% confidence interval, 0.8540-0.8610) for the artificial neural network model. χ² values for Hosmer-Lemeshow fixed "deciles of risk" were 65.32 for the logistic regression model and 17.28 for the artificial neural network model. A nomogram was generated upon the logistic regression model and an automated software decision support system with a Web interface was built based on the artificial neural network model.
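The validation scheme described above (an ordered, nonrandom 66%/34% partition followed by an area-under-the-ROC-curve comparison) can be sketched in a few lines. This is a minimal illustration, not the authors' software: the helper names are hypothetical, the records are assumed to be already sorted by arrival time, and the AUC is computed with the pairwise (Mann-Whitney) definition, which is quadratic in the number of cases and only suitable for small examples.

```python
def ordered_split(records, train_percent=66):
    """Split time-ordered records into derivation and validation sets without shuffling."""
    cut = len(records) * train_percent // 100
    return records[:cut], records[cut:]

def auc(labels, scores):
    """Probability that a random admitted case scores higher than a random
    non-admitted case; ties count one half."""
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))

# Toy data: 1 = admitted, 0 = discharged, with made-up predicted probabilities.
toy_labels = [0, 0, 1, 1]
toy_scores = [0.10, 0.40, 0.35, 0.80]
derivation, validation = ordered_split(list(range(100)))
print(len(derivation), len(validation), auc(toy_labels, toy_scores))
```

Keeping the partition ordered means the validation set contains only presentations later than those used for derivation, which mimics how such a model would be deployed.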
Product unit neural network models for predicting the growth limits of Listeria monocytogenes.
Valero, A; Hervás, C; García-Gimeno, R M; Zurera, G
2007-08-01
A new approach to predict the growth/no growth interface of Listeria monocytogenes as a function of storage temperature, pH, citric acid (CA) and ascorbic acid (AA) is presented. A linear logistic regression procedure was performed and a non-linear model was obtained by adding new variables by means of a Neural Network model based on Product Units (PUNN). The classification efficiency of the training data set and the generalization data of the new Logistic Regression PUNN model (LRPU) were compared with Linear Logistic Regression (LLR) and Polynomial Logistic Regression (PLR) models. 92% of the total cases from the LRPU model were correctly classified, an improvement on the percentage obtained using the PLR model (90%) and significantly higher than the results obtained with the LLR model (80%). In addition, predictions of LRPU were closer to the observed data, which permits the design of proper formulations in minimally processed foods. This novel methodology can be applied to predictive microbiology for describing the growth/no growth interface of food-borne microorganisms such as L. monocytogenes. The optimal balance lies in finding models with an acceptable interpretation capacity and with good ability to fit the data on the boundaries of the variable range. The results obtained suggest that these kinds of models might well be a very valuable tool for mathematical modeling.
Lacagnina, Valerio; Leto-Barone, Maria S; La Piana, Simona; Seidita, Aurelio; Pingitore, Giuseppe; Di Lorenzo, Gabriele
2014-01-01
This article uses the logistic regression model for diagnostic decision making in patients with chronic nasal symptoms. We studied the ability of the logistic regression model, obtained by the evaluation of a database, to detect patients with positive allergy skin-prick test (SPT) and patients with negative SPT. The model developed was validated using the data set obtained from another medical institution. The analysis was performed using a database obtained from a questionnaire administered to the patients with nasal symptoms containing personal data, clinical data, and results of allergy testing (SPT). All variables found to be significantly different between patients with positive and negative SPT (p < 0.05) were selected for the logistic regression models and were analyzed with backward stepwise logistic regression, evaluated with area under the curve of the receiver operating characteristic curve. A second set of patients from another institution was used to validate the model. The accuracy of the model in identifying, over the second set, both patients whose SPT will be positive and negative was high. The model detected 96% of patients with nasal symptoms and positive SPT and classified 94% of those with negative SPT. This study is preliminary to the creation of software that could help primary care doctors in the diagnostic decision-making process (need of allergy testing) in patients complaining of chronic nasal symptoms.
Held, Elizabeth; Cape, Joshua; Tintle, Nathan
2016-01-01
Machine learning methods continue to show promise in the analysis of data from genetic association studies because of the high number of variables relative to the number of observations. However, few best practices exist for the application of these methods. We extend a recently proposed supervised machine learning approach for predicting disease risk by genotypes to be able to incorporate gene expression data and rare variants. We then apply 2 different versions of the approach (radial and linear support vector machines) to simulated data from Genetic Analysis Workshop 19 and compare performance to logistic regression. Method performance was not radically different across the 3 methods, although the linear support vector machine tended to show small gains in predictive ability relative to a radial support vector machine and logistic regression. Importantly, as the number of genes in the models was increased, even when those genes contained causal rare variants, model predictive ability showed a statistically significant decrease in performance for both the radial support vector machine and logistic regression. The linear support vector machine showed more robust performance to the inclusion of additional genes. Further work is needed to evaluate machine learning approaches on larger samples and to evaluate the relative improvement in model prediction from the incorporation of gene expression data.
Hansson, Lena M; Näslund, Erik; Rasmussen, Finn
2010-08-01
We examined whether men and women with obesity reported different types of discrimination to a greater extent than those with normal weight, and explored whether these associations were modified by socioeconomic position. National representative sample of men and women, with normal weight (n = 2,000), moderate obesity (n = 2,461) and severe obesity (n = 557). Participants were identified in a yearly population-based survey (1996-2006) and data on perceived discrimination and potential confounding factors were measured in 2008. Logistic regression models tested whether obesity was associated with perceived lifetime, workplace, healthcare and interpersonal discrimination. The overall response rate was 56%. For men, moderate obesity was associated with workplace discrimination, while severely obese women were more likely to report this sort of discrimination than normal weight women. Severely obese individuals were twice as likely to report healthcare discrimination than normal weight individuals. Women, regardless of weight status group, were in turn twice as likely to report healthcare discrimination as men. Women with severe obesity were significantly more likely to report interpersonal discrimination compared with normal weight women. Socioeconomic position modified the association between weight status and healthcare discrimination. Highly educated individuals with moderate and severe obesity were more likely to report healthcare discrimination than their normal weight counterparts, whereas low educated individuals with normal weight, moderate and severe obesity were equally likely to report discrimination. In this large, population-based study, discrimination was more likely to be reported by obese individuals compared with those of normal weight. The associations, however, varied according to gender and socioeconomic position.
Real, J; Cleries, R; Forné, C; Roso-Llorach, A; Martínez-Sánchez, J M
In medicine and biomedical research, statistical techniques like logistic, linear, Cox and Poisson regression are widely known. The main objective is to describe the evolution of multivariate techniques used in observational studies indexed in PubMed (1970-2013), and to check the requirements of the STROBE guidelines in the author guidelines in Spanish journals indexed in PubMed. A targeted PubMed search was performed to identify papers that used logistic, linear, Cox and Poisson models. Furthermore, a review was also made of the author guidelines of journals published in Spain and indexed in PubMed and Web of Science. Only 6.1% of the indexed manuscripts included a term related to multivariate analysis, increasing from 0.14% in 1980 to 12.3% in 2013. In 2013, 6.7, 2.5, 3.5, and 0.31% of the manuscripts contained terms related to logistic, linear, Cox and Poisson regression, respectively. On the other hand, 12.8% of journals' author guidelines explicitly recommend following the STROBE guidelines, and 35.9% recommend the CONSORT guideline. A low percentage of Spanish scientific journals indexed in PubMed include the STROBE statement requirement in the author guidelines. Multivariate regression models such as logistic, linear, Cox and Poisson regression are increasingly used in published observational studies, both at the international level and in journals published in Spanish. Copyright © 2015 Sociedad Española de Médicos de Atención Primaria (SEMERGEN). Publicado por Elsevier España, S.L.U. All rights reserved.
2011-01-01
Introduction Necrotizing fasciitis (NF) is a life threatening infectious disease with a high mortality rate. We carried out a microbiological characterization of the causative pathogens. We investigated the correlation of mortality in NF with bloodstream infection and with the presence of co-morbidities. Methods In this retrospective study, we analyzed 323 patients who presented with necrotizing fasciitis at two different institutions. Bloodstream infection (BSI) was defined as a positive blood culture result. The patients were categorized as survivors and non-survivors. Eleven clinically important variables which were statistically significant by univariate analysis were selected for multivariate regression analysis and a stepwise logistic regression model was developed to determine the association between BSI and mortality. Results Univariate logistic regression analysis showed that patients with hypotension, heart disease, liver disease, presence of Vibrio spp. in wound cultures, presence of fungus in wound cultures, and presence of Streptococcus group A, Aeromonas spp. or Vibrio spp. in blood cultures, had a significantly higher risk of in-hospital mortality. Our multivariate logistic regression analysis showed a higher risk of mortality in patients with pre-existing conditions like hypotension, heart disease, and liver disease. Multivariate logistic regression analysis also showed that presence of Vibrio spp. in wound cultures, and presence of Streptococcus group A in blood cultures were associated with a high risk of mortality while debridement ≥ 3 was associated with improved survival. Conclusions Mortality in patients with necrotizing fasciitis was significantly associated with the presence of Vibrio in wound cultures and Streptococcus group A in blood cultures. PMID:21693053
Prediction of siRNA potency using sparse logistic regression.
Hu, Wei; Hu, John
2014-06-01
RNA interference (RNAi) can modulate gene expression at post-transcriptional as well as transcriptional levels. Short interfering RNA (siRNA) serves as a trigger for the RNAi gene inhibition mechanism, and therefore is a crucial intermediate step in RNAi. There have been extensive studies to identify the sequence characteristics of potent siRNAs. One such study built a linear model using LASSO (Least Absolute Shrinkage and Selection Operator) to measure the contribution of each siRNA sequence feature. This model is simple and interpretable, but it requires a large number of nonzero weights. We have introduced a novel technique, sparse logistic regression, to build a linear model using single-position specific nucleotide compositions which has the same prediction accuracy of the linear model based on LASSO. The weights in our new model share the same general trend as those in the previous model, but have only 25 nonzero weights out of a total 84 weights, a 54% reduction compared to the previous model. Contrary to the linear model based on LASSO, our model suggests that only a few positions are influential on the efficacy of the siRNA, which are the 5' and 3' ends and the seed region of siRNA sequences. We also employed sparse logistic regression to build a linear model using dual-position specific nucleotide compositions, a task LASSO is not able to accomplish well due to its high dimensional nature. Our results demonstrate the superiority of sparse logistic regression as a technique for both feature selection and regression over LASSO in the context of siRNA design.
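The abstract does not say which formulation or solver the authors used for "sparse logistic regression". One common reading is logistic regression with an L1 penalty on the weights, and the sketch below illustrates that idea with proximal gradient descent (a gradient step on the log-loss followed by soft-thresholding). All names, the step size, and the toy data are assumptions made for illustration only.

```python
import math

def soft_threshold(value, threshold):
    """Proximal operator of the L1 penalty: shrink toward zero, clip small values to 0."""
    if value > threshold:
        return value - threshold
    if value < -threshold:
        return value + threshold
    return 0.0

def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)

def fit_sparse_logistic(X, y, lam, lr=0.1, n_iter=500):
    """Minimise mean log-loss + lam * ||w||_1; the intercept b is not penalised."""
    n, d = len(X), len(X[0])
    w, b = [0.0] * d, 0.0
    for _ in range(n_iter):
        grad_w, grad_b = [0.0] * d, 0.0
        for xi, yi in zip(X, y):
            err = sigmoid(b + sum(wj * xj for wj, xj in zip(w, xi))) - yi
            grad_b += err / n
            for j in range(d):
                grad_w[j] += err * xi[j] / n
        b -= lr * grad_b
        # Gradient step on the smooth part, then shrinkage from the L1 term.
        w = [soft_threshold(wj - lr * gj, lr * lam) for wj, gj in zip(w, grad_w)]
    return w, b

# Toy data: feature 0 determines the label, feature 1 is uninformative.
X_toy = [[1.0, 1.0], [1.0, -1.0], [-1.0, 1.0], [-1.0, -1.0]]
y_toy = [1, 1, 0, 0]
w_fit, b_fit = fit_sparse_logistic(X_toy, y_toy, lam=0.05)
print(w_fit, b_fit)
```

The soft-thresholding step is what produces exact zeros, which is the property the abstract highlights when it reports that only 25 of 84 weights remain nonzero.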
[The relationship of halitosis and Helicobacter pylori].
Chen, Xi; Tao, Dan-ying; Li, Qing; Feng, Xi-ping
2007-06-01
The aim of the study was to investigate the relationship between halitosis and Helicobacter pylori infection in the stomach. Fifty subjects without periodontal diseases and systemic disease (excluding gastrointestinal diseases) were included. Infection with H. pylori was diagnosed by biopsy and ¹⁴C-urea breath test. The SPSS 11.5 software package was used to analyze the data. All the subjects were periodontally healthy according to the periodontal index. The prevalence of H. pylori infection in halitosis subjects was significantly higher than that in the normal subjects (57.1% vs 18.2%, P<0.01). Logistic regression analysis showed that H. pylori was the only significant variable in the equation (P<0.05). H. pylori in the stomach may be involved in the presence of halitosis in periodontally healthy subjects.
Yang, Yi; Wang, Shuqing; Liu, Yang
2014-01-01
Order insertion often occurs in the scheduling process of logistics service supply chain (LSSC), which disturbs normal time scheduling especially in the environment of mass customization logistics service. This study analyses order similarity coefficient and order insertion operation process and then establishes an order insertion scheduling model of LSSC with service capacity and time factors considered. This model aims to minimize the average unit volume operation cost of logistics service integrator and maximize the average satisfaction degree of functional logistics service providers. In order to verify the viability and effectiveness of our model, a specific example is numerically analyzed. Some interesting conclusions are obtained. First, along with the increase of completion time delay coefficient permitted by customers, the possible inserting order volume first increases and then tends to stabilize. Second, supply chain performance reaches the best when the volume of inserting order is equal to the surplus volume of the normal operation capacity in mass service process. Third, the larger the normal operation capacity in mass service process is, the bigger the possible inserting order's volume will be. Moreover, compared to increasing the completion time delay coefficient, improving the normal operation capacity of mass service process is more useful. PMID:25276851
Guo, Huey-Ming; Shyu, Yea-Ing Lotus; Chang, Her-Kun
2006-01-01
In this article, the authors provide an overview of a research method to predict quality of care in a home health nursing data set. The results of this study can be visualized through classification and regression tree (CART) graphs. The analysis was more effective, and the results were more informative, because the home health nursing data set was analyzed with a combination of logistic regression and CART; the two techniques complement each other. The combined results were also more informative in that more patient characteristics were found to be related to quality of care in home care. The results help home health nurses predict patient outcomes in case management. Improved prediction is needed for interventions to be appropriately targeted for improved patient outcome and quality of care.
2011-01-01
Background The relationship between asthma and traffic-related pollutants has received considerable attention. The use of individual-level exposure measures, such as residence location or proximity to emission sources, may avoid ecological biases. Method This study focused on the pediatric Medicaid population in Detroit, MI, a high-risk population for asthma-related events. A population-based matched case-control analysis was used to investigate associations between acute asthma outcomes and proximity of residence to major roads, including freeways. Asthma cases were identified as all children who made at least one asthma claim, including inpatient and emergency department visits, during the three-year study period, 2004-06. Individually matched controls were randomly selected from the rest of the Medicaid population on the basis of non-respiratory related illness. We used conditional logistic regression with distance as both categorical and continuous variables, and examined non-linear relationships with distance using polynomial splines. The conditional logistic regression models were then extended by considering multiple asthma states (based on the frequency of acute asthma outcomes) using polychotomous conditional logistic regression. Results Asthma events were associated with proximity to primary roads with an odds ratio of 0.97 (95% CI: 0.94, 0.99) for a 1 km increase in distance using conditional logistic regression, implying that asthma events are less likely as the distance between the residence and a primary road increases. Similar relationships and effect sizes were found using polychotomous conditional logistic regression. Another plausible exposure metric, a reduced form response surface model that represents atmospheric dispersion of pollutants from roads, was not associated under that exposure model. 
Conclusions There is moderately strong evidence of elevated risk of asthma close to major roads based on the results obtained in this population-based matched case-control study. PMID:21513554
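The reported odds ratio of 0.97 per 1 km can be rescaled to other distance increments. The sketch below assumes the effect is linear on the log-odds scale, which is the assumption behind a single per-kilometre odds ratio (the abstract also explored non-linear spline fits, which this does not cover). The function name `rescale_odds_ratio` is hypothetical.

```python
import math


def rescale_odds_ratio(or_per_unit: float, units: float) -> float:
    """Odds ratio for a change of `units`, assuming a log-linear effect.

    The log odds ratio (the regression coefficient) scales linearly
    with the size of the change, so OR(units) = OR(1) ** units.
    """
    beta = math.log(or_per_unit)  # coefficient per 1 unit (here, 1 km)
    return math.exp(beta * units)


# Odds ratio for living 5 km farther from a primary road,
# under the log-linearity assumption stated above.
or_5km = rescale_odds_ratio(0.97, 5)
print(round(or_5km, 3))
```

Under this assumption, a small per-kilometre effect compounds over larger distances, which is why the choice of unit matters when comparing odds ratios across studies.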
Neural network modeling for surgical decisions on traumatic brain injury patients.
Li, Y C; Liu, L; Chiu, W T; Jian, W S
2000-01-01
Computerized medical decision support systems have been a major research topic in recent years. Intelligent computer programs were implemented to aid physicians and other medical professionals in making difficult medical decisions. This report compares three different mathematical models for building a traumatic brain injury (TBI) medical decision support system (MDSS). These models were developed based on a large TBI patient database. This MDSS accepts a set of patient data, such as the type of skull fracture, Glasgow Coma Scale (GCS) score and episodes of convulsion, and returns the chance that a neurosurgeon would recommend an open-skull surgery for this patient. The three mathematical models described in this report are a logistic regression model, a multi-layer perceptron (MLP) neural network and a radial-basis-function (RBF) neural network. Of the 12,640 patients selected from the database, a randomly drawn 9480 cases were used as the training group to develop/train our models. The other 3160 cases formed the validation group, which we used to evaluate the performance of these models. We used sensitivity, specificity, area under the receiver-operating characteristic (ROC) curve and calibration curves as indicators of how accurate these models are in predicting a neurosurgeon's decision on open-skull surgery. The results showed that, assuming equal importance of sensitivity and specificity, the logistic regression model had a (sensitivity, specificity) of (73%, 68%), compared to (80%, 80%) from the RBF model and (88%, 80%) from the MLP model. The resultant areas under the ROC curve for logistic regression, RBF and MLP neural networks are 0.761, 0.880 and 0.897, respectively (P < 0.05). Among these models, the logistic regression has noticeably poorer calibration. This study demonstrated the feasibility of applying neural networks as the mechanism for TBI decision support systems based on clinical databases.
The results also suggest that neural networks may be a better solution for complex, non-linear medical decision support systems than conventional statistical techniques such as logistic regression.
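For readers unfamiliar with the metrics compared above, here is a minimal sketch of how sensitivity and specificity are computed from a confusion matrix. The counts in the example are hypothetical and are not taken from the study; only the definitions are standard.

```python
def sensitivity_specificity(tp: int, fn: int, tn: int, fp: int) -> tuple[float, float]:
    """Return (sensitivity, specificity) from confusion-matrix counts.

    sensitivity = TP / (TP + FN): share of actual positives detected.
    specificity = TN / (TN + FP): share of actual negatives detected.
    """
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    return sensitivity, specificity


# Hypothetical validation counts, chosen only for illustration.
sens, spec = sensitivity_specificity(tp=73, fn=27, tn=68, fp=32)
print(sens, spec)
```

Both quantities depend on the decision threshold applied to the model's predicted probability, which is why the abstract also reports the threshold-free area under the ROC curve.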
Viswanathan, M; Pearl, D L; Taboada, E N; Parmley, E J; Mutschall, S K; Jardine, C M
2017-05-01
Using data collected from a cross-sectional study of 25 farms (eight beef, eight swine and nine dairy) in 2010, we assessed clustering of molecular subtypes of C. jejuni based on a Campylobacter-specific 40 gene comparative genomic fingerprinting assay (CGF40), using unweighted pair-group method with arithmetic mean (UPGMA) analysis and multiple correspondence analysis. Exact logistic regression was used to determine which genes differentiate wildlife and livestock subtypes in our study population. A total of 33 bovine livestock (17 beef and 16 dairy) and 26 wildlife (20 raccoon (Procyon lotor), five skunk (Mephitis mephitis) and one mouse (Peromyscus spp.)) C. jejuni isolates were subtyped using CGF40. Dendrogram analysis, based on UPGMA, showed distinct branches separating bovine livestock and mammalian wildlife isolates. Furthermore, two-dimensional multiple correspondence analysis was highly concordant with dendrogram analysis, showing clear differentiation between livestock and wildlife CGF40 subtypes. Based on multilevel logistic regression models with a random intercept for farm of origin, we found that isolates in general, and raccoons more specifically, were significantly more likely to be part of the wildlife branch. Exact logistic regression conducted gene by gene revealed 15 genes that were predictive of whether an isolate was of wildlife or bovine livestock origin. Both multiple correspondence analysis and exact logistic regression revealed that in most cases (13 of 15), the presence of a particular gene was associated with an isolate being of livestock rather than wildlife origin. In conclusion, the evidence gained from dendrogram analysis, multiple correspondence analysis and exact logistic regression indicates that mammalian wildlife carry CGF40 subtypes of C. jejuni distinct from those carried by bovine livestock. Future studies focused on source attribution of C. jejuni in human infections will help determine whether wildlife transmit Campylobacter jejuni directly to humans. © 2016 Blackwell Verlag GmbH.
Functional Data Analysis in NTCP Modeling: A New Method to Explore the Radiation Dose-Volume Effects
DOE Office of Scientific and Technical Information (OSTI.GOV)
Benadjaoud, Mohamed Amine, E-mail: mohamedamine.benadjaoud@gustaveroussy.fr; Université Paris sud, Le Kremlin-Bicêtre; Institut Gustave Roussy, Villejuif
2014-11-01
Purpose/Objective(s): To describe a novel method to explore radiation dose-volume effects. Functional data analysis is used to investigate the information contained in differential dose-volume histograms. The method is applied to the normal tissue complication probability modeling of rectal bleeding (RB) for patients irradiated in the prostatic bed by 3-dimensional conformal radiation therapy. Methods and Materials: Kernel density estimation was used to estimate the individual probability density functions from each of the 141 rectum differential dose-volume histograms. Functional principal component analysis was performed on the estimated probability density functions to explore the variation modes in the dose distribution. The functional principal components were then tested for association with RB using logistic regression adapted to functional covariates (FLR). For comparison, 3 other normal tissue complication probability models were considered: the Lyman-Kutcher-Burman model, logistic model based on standard dosimetric parameters (LM), and logistic model based on multivariate principal component analysis (PCA). Results: The incidence rate of grade ≥2 RB was 14%. V65Gy was the most predictive factor for the LM (P=.058). The best fit for the Lyman-Kutcher-Burman model was obtained with n=0.12, m = 0.17, and TD50 = 72.6 Gy. In PCA and FLR, the components that describe the interdependence between the relative volumes exposed at intermediate and high doses were the most correlated to the complication. The FLR parameter function leads to a better understanding of the volume effect by including the treatment specificity in the delivered mechanistic information. For RB grade ≥2, patients with advanced age are significantly at risk (odds ratio, 1.123; 95% confidence interval, 1.03-1.22), and the fits of the LM, PCA, and functional principal component analysis models are significantly improved by including this clinical factor.
Conclusion: Functional data analysis provides an attractive method for flexibly estimating the dose-volume effect for normal tissues in external radiation therapy.
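To make the Lyman-Kutcher-Burman parameters quoted above concrete, the following is a minimal sketch of the standard LKB formulation: a generalized equivalent uniform dose (gEUD) computed from a differential dose-volume histogram, passed through a probit curve. It is not the authors' code; the function names and the example histogram are hypothetical, and only n = 0.12, m = 0.17 and TD50 = 72.6 Gy come from the abstract.

```python
import math


def geud(doses_gy, rel_volumes, n):
    """Generalized equivalent uniform dose from a differential DVH.

    gEUD = (sum_i v_i * D_i ** (1 / n)) ** n, with v_i normalized to sum to 1.
    """
    total = sum(rel_volumes)
    return sum((v / total) * d ** (1.0 / n)
               for d, v in zip(doses_gy, rel_volumes)) ** n


def lkb_ntcp(doses_gy, rel_volumes, n, m, td50):
    """LKB complication probability: standard normal CDF of t,
    where t = (gEUD - TD50) / (m * TD50)."""
    t = (geud(doses_gy, rel_volumes, n) - td50) / (m * td50)
    return 0.5 * (1.0 + math.erf(t / math.sqrt(2.0)))


# Hypothetical three-bin rectal DVH (dose in Gy, relative volume).
doses = [30.0, 55.0, 70.0]
volumes = [0.5, 0.3, 0.2]
print(lkb_ntcp(doses, volumes, n=0.12, m=0.17, td50=72.6))
```

A small n, as fitted here, weights the high-dose bins heavily, so the gEUD sits close to the maximum dose; that is the usual reading of a "serial-like" volume effect.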
2012-09-01
3,435 10,461 9.1 3.1 63 Unmarried with Children+ Unmarried without Children 439,495 0.01 10,350 43,870 10.1 2.2 64 Married with Children+ Married ...logistic regression model was used to predict the probability of eligibility for the survey (known eligibility vs. unknown eligibility). A second logistic...regression model was used to predict the probability of response among eligible sample members (complete response vs. non-response). CHAID (Chi
Weight misperception and psychosocial health in normal weight Chinese adolescents.
Lo, Wing-Sze; Ho, Sai-Yin; Mak, Kwok-Kei; Lai, Hak-Kan; Lai, Yuen-Kwan; Lam, Tai-Hing
2011-06-01
To investigate the association between weight misperception and psychosocial health problems among normal weight Chinese adolescent boys and girls. In the Youth Smoking Survey 2003-04, 20 677 normal weight students aged 11-18 years from 85 randomly selected schools throughout Hong Kong were analysed. Students who perceived themselves as very thin, thin, fat or very fat were classified as having weight misperception in contrast to the reference group who correctly perceived themselves as normal weight. Psychosocial health outcomes included headache, feeling stressful, feeling depressed, poorer appetite, sleepless at night, having nightmares and less confidence in getting along with friends. Logistic regression yielded adjusted odds ratios (ORs) for each outcome by weight misperception in boys and girls separately. In girls, misperceived fatness was associated with all outcomes, while misperceived thinness was associated with poorer appetite and less confidence. Boys who misperceived themselves as very thin or fat had greater odds of all outcomes except having nightmares. In general, greater ORs were observed for misperceived fatness than thinness in girls, but similar ORs were observed in boys. Misperceived thinness and fatness accounted for 0.6% to 45.1% of the psychosocial health problems in adolescents. Normal weight adolescents with weight misperception were more likely to have psychosocial health problems, and the associations were stronger for extreme misperceptions (i.e., very fat or very thin) in both boys and girls.
Huang, Hui; Zhu, Zheng-Qiu; Zhou, Zheng-Guo; Chen, Ling-Shan; Zhao, Ming; Zhang, Yang; Li, Hong-Bo; Yin, Li-Ping
2016-12-08
To assess the role of time-intensity curves (TICs) of the normal peripheral zone (PZ) in the identification of biopsy-proven prostate nodules using contrast-enhanced transrectal ultrasound (CETRUS). This study included 132 patients with 134 prostate PZ nodules. Arrival time (AT), peak intensity (PI), mean transit time (MTT), area under the curve (AUC), time from peak to one half (TPH), wash in slope (WIS) and time to peak (TTP) were analyzed using multivariate linear logistic regression and receiver operating characteristic (ROC) curves to assess whether combining nodule TICs with normal PZ TICs improved the prediction of prostate cancer (PCa) aggressiveness. The PI, AUC (p < 0.001 for both), MTT and TPH (p = 0.011 and 0.040 respectively) values of the malignant nodules were significantly higher than those of the benign nodules. Incorporating the PI and AUC values (both, p < 0.001) of the normal PZ TIC, but not the MTT and TPH values (p = 0.076 and 0.159 respectively), significantly improved the AUC for prediction of malignancy (PI: 0.784-0.923; AUC: 0.758-0.891) and assessment of cancer aggressiveness (p < 0.001). Thus, all these findings indicate that incorporating normal PZ TICs with nodule TICs in CETRUS readings can improve the diagnostic accuracy for PCa and cancer aggressiveness assessment.
Habitat features and predictive habitat modeling for the Colorado chipmunk in southern New Mexico
Rivieccio, M.; Thompson, B.C.; Gould, W.R.; Boykin, K.G.
2003-01-01
Two subspecies of Colorado chipmunk (state threatened and federal species of concern) occur in southern New Mexico: Tamias quadrivittatus australis in the Organ Mountains and T. q. oscuraensis in the Oscura Mountains. We developed a GIS model of potentially suitable habitat based on vegetation and elevation features, evaluated site classifications of the GIS model, and determined vegetation and terrain features associated with chipmunk occurrence. We compared GIS model classifications with actual vegetation and elevation features measured at 37 sites. At 60 sites we measured 18 habitat variables regarding slope, aspect, tree species, shrub species, and ground cover. We used logistic regression to analyze habitat variables associated with chipmunk presence/absence. All (100%) 37 sample sites (28 predicted suitable, 9 predicted unsuitable) were classified correctly by the GIS model regarding elevation and vegetation. For 28 sites predicted suitable by the GIS model, 18 sites (64%) appeared visually suitable based on habitat variables selected from logistic regression analyses, of which 10 sites (36%) were specifically predicted as suitable habitat via logistic regression. We detected chipmunks at 70% of sites deemed suitable via the logistic regression models. Shrub cover, tree density, plant proximity, presence of logs, and presence of rock outcrop were retained in the logistic model for the Oscura Mountains; litter, shrub cover, and grass cover were retained in the logistic model for the Organ Mountains. Evaluation of predictive models illustrates the need for multi-stage analyses to best judge performance. Microhabitat analyses indicate prospective needs for different management strategies between the subspecies. Sensitivities of each population of the Colorado chipmunk to natural and prescribed fire suggest that partial burnings of areas inhabited by Colorado chipmunks in southern New Mexico may be beneficial. 
These partial burnings may later help avoid a fire that could substantially reduce habitat of chipmunks over a mountain range.
Takase, Bonpei; Masaki, Nobuyuki; Hattori, Hidemi; Ishihara, Masayuki; Kurita, Akira
2009-06-01
The electrocardiographic index of QT dispersion (QTd) is related to the occurrence of arrhythmia. In patients with suspected or known coronary artery disease, QTd may be affected by exercise. We investigated whether QTd that is automatically calculated by a newly developed computer system could be used as a marker of exercise-induced myocardial ischemia. The design of this study was prospective and observational. Eighty-three consecutive patients were enrolled in this study. Their QTd was measured at rest and after 3 min of exercise during exercise-stress Thallium-201 scintigraphy and compared with conventional ST-segment changes. The patients were classified into 4 groups (normal group, redistribution group, fixed defect group, redistribution with fixed defect group) based on the result of single photon emission computed tomography. As statistical analysis, one-way ANOVA with post-hoc Scheffe's method, receiver-operating characteristics (ROC) and multiple logistic regression analysis were performed. At rest, QTd was significantly greater (p<0.05) in the fixed defect group (52±21 ms) and the redistribution with fixed defect group (53±20 ms) than in the normal group (32±14 ms) and the redistribution group (31±16 ms). However, QTd tended to increase after exercise in the redistribution group, while QTd tended to decrease in the normal group, the fixed defect group, and the redistribution with fixed defect group (QTd after exercise: normal group, 28±17 ms; redistribution group, 35±19 ms; fixed defect group, 43±25 ms; redistribution with fixed defect group, 49±27 ms). Exercise significantly increased QTcd (RR interval-corrected QT dispersion) in the redistribution group. The best cut-off values of QTd and QTcd obtained from ROC curves for exercise-induced myocardial ischemia were 41.6 ms and 40.4 ms, respectively (QTd: AUC 0.68, 95% CI 0.53-0.83; QTcd: AUC 0.67, 95% CI 0.55-0.80).
Using these values as cut-offs, QTd, QTcd, and conventional ST-segment change had comparable sensitivities and specificities for detecting exercise-induced myocardial ischemia (sensitivity: 60%, 58% and 49%, respectively; specificity: 78%, 80% and 83%, respectively). In addition, multiple logistic regression analysis showed that QTd (OR=2.01, 95% CI 1.15-4.10, p<0.05), QTcd (OR=2.12, 95% CI 1.02-4.30, p<0.05) and ST-segment change (OR=1.89, 95% CI 1.03-3.40, p<0.05) were significantly associated with exercise-induced myocardial ischemia. QT dispersion and/or QTcd after exercise could be a useful marker for exercise-induced myocardial ischemia in routine clinical practice.
Bramley, E; Costa, N D; Fulkerson, W J; Lean, I J
2013-11-01
To investigate associations between ruminal acidosis and body condition score (BCS), prevalence of poor rumen fill, diarrhoea and lameness in dairy cows in New South Wales and Victoria, Australia. This was a cross-sectional study conducted in 100 dairy herds in five regions of Australia. Feeding practices, diets and management practices of herds were assessed. Lactating cows within herds were sampled for rumen biochemistry (n = 8 per herd) and scored for body condition, rumen fill and locomotion (n = 15 per herd). The consistency of faecal pats (n = 20 per herd) from the lactating herd was also scored. A perineal faecal staining score was given to each herd. Herds were classified as subclinically acidotic (ACID), suboptimal (SO) and non-acidotic (Normal) when ≥3/8 cows per herd were allocated to previously defined categories based on rumen biochemical measures. Multivariate logistic regression models were used to examine associations between the prevalence of conditions within a herd and explanatory variables. Median BCS and perineal staining score were not associated with herd category (p >0.05). In the multivariate models, herds with a high prevalence of low rumen fill scores (≤2/5) were more likely to be categorised Normal than SO with an associated increased risk of 69% (p = 0.05). Herds that had a greater prevalence of lame cows (locomotion scores ≥3/5), had 103% higher risk of being categorised as ACID than SO (p = 0.034). In a multivariate logistic regression model, with herd modelled as a random effect, an increase of 1% of pasture in the diet was associated with a 5.5% increase in risk of high faecal scores (≥4/5) indicating diarrhoea (p = 0.001). This study confirmed that herd categories based on rumen function are associated with biological outcomes consistent with acidosis. Herds that had a higher risk of lameness also had a much higher risk of being categorised ACID than SO. 
Herds with a high prevalence of low rumen scores were more likely to be categorised Normal than SO. The findings indicate that differences in rumen metabolism identified for herd categories ACID, SO and Normal were associated with differences in disease risk and physiology. The study also identified an association between pasture feeding and higher faecal scores. This study suggests that there is a challenge for farmers seeking to increase milk production of cows on pasture to maintain the health of cattle.
Peng, Tingting; Yue, Fujuan; Wang, Fang; Feng, Yongliang; Wu, Weiwei; Wang, Suping; Zhang, Yawei; Yang, Hailan
2015-06-01
To investigate the relationship between maternal pre-pregnancy body mass index, weight gain during pregnancy and small for gestational age (SGA) birth so as to provide evidence for the development of comprehensive prevention programs on SGA birth. Between March 2012 and July 2014, 4 754 pregnant women at the First Affiliated Hospital of Shanxi Medical University were asked to fill in questionnaires. Data related to general demographic characteristics, pregnancy and health status of those pregnant women were collected, and maternal pre-pregnancy body mass index and maternal weight gain were calculated. Subjects were divided into different groups before the effects of maternal pre-pregnancy body mass index and weight gain during pregnancy on SGA birth were estimated. The overall incidence of SGA birth was 9.26% (440/4 754). Proportions of SGA birth in the pre-pregnancy underweight, normal weight, and overweight or obese groups were 9.85%, 8.54% and 9.45%, respectively. Results from multi-factor logistic regression analyses showed that, after adjusting for confounding factors such as age and pregnancy history, women with high pre-pregnancy BMI showed a lower incidence of SGA than those with normal pre-pregnancy BMI (OR = 0.714, 95% CI: 0.535-0.953). The incidence of SGA differed significantly across weight gain categories during pregnancy (χ(2) = 8.811, P = 0.012). The incidence of SGA birth among women whose weight gain was below the range recommended in the 2009 Institute of Medicine Guidelines (12.20%) was higher than among those within (9.23%) or beyond (8.45%) the recommended range. Results from the multi-factor logistic regression analyses showed that, after adjusting for confounding factors such as age and pregnancy history, weight gain below the recommended level could increase the risk of SGA (OR = 1.999, 95% CI: 1.487-2.685).
In the underweight, normal weight, and overweight or obese groups, weight gain during pregnancy below the recommended range was associated with an increased incidence of SGA (OR = 2.558, 95% CI: 1.313-4.981; OR = 1.804, 95% CI: 1.258-2.587; OR = 3.108, 95% CI: 1.237-7.811, respectively). Interaction analysis showed no additive or multiplicative interaction between these two factors. Women with high pre-pregnancy BMI presented a lower incidence of SGA than those within the normal range. Insufficient weight gain during pregnancy could increase the risk of SGA delivery. These findings call for attention to be paid to gestational weight gain in order to decrease the risk of SGA.
The logistic model for predicting the non-gonoactive Aedes aegypti females.
Reyes-Villanueva, Filiberto; Rodríguez-Pérez, Mario A
2004-01-01
To estimate, using logistic regression, the likelihood of occurrence of a non-gonoactive Aedes aegypti female, previously fed human blood, with relation to body size and collection method. This study was conducted in Monterrey, Mexico, between 1994 and 1996. Ten samplings of 60 mosquitoes of Ae. aegypti females were carried out in three dengue endemic areas: six of biting females, two of emerging mosquitoes, and two of indoor resting females. Gravid females, as well as those with blood in the gut were removed. Mosquitoes were taken to the laboratory and engorged on human blood. After 48 hours, ovaries were dissected to register whether they were gonoactive or non-gonoactive. Wing-length in mm was an indicator for body size. The logistic regression model was used to assess the likelihood of non-gonoactivity, as a binary variable, in relation to wing-length and collection method. Of the 600 females, 164 (27%) remained non-gonoactive, with a wing-length range of 1.9-3.2 mm, almost equal to that of all females (1.8-3.3 mm). The logistic regression model showed a significant likelihood of a female remaining non-gonoactive (Y=1). The collection method did not influence the binary response, but there was an inverse relationship between non-gonoactivity and wing-length. Dengue vector populations from Monterrey, Mexico display a wide-range body size. Logistic regression was a useful tool to estimate the likelihood for an engorged female to remain non-gonoactive. The necessity for a second blood meal is present in any female, but small mosquitoes are more likely to bite again within a 2-day interval, in order to attain egg maturation. The English version of this paper is available too at: http://www.insp.mx/salud/index.html.
The Application of the Cumulative Logistic Regression Model to Automated Essay Scoring
ERIC Educational Resources Information Center
Haberman, Shelby J.; Sinharay, Sandip
2010-01-01
Most automated essay scoring programs use a linear regression model to predict an essay score from several essay features. This article applied a cumulative logit model instead of the linear regression model to automated essay scoring. Comparison of the performances of the linear regression model and the cumulative logit model was performed on a…
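As a rough illustration of the cumulative logit model named above, the sketch below turns a linear predictor into probabilities for ordered score categories. It uses one common parameterization, P(Y ≤ k) = logistic(θ_k − η), which is an assumption here since the abstract does not state the authors' sign convention; the thresholds and predictor value are hypothetical.

```python
import math


def logistic(z: float) -> float:
    return 1.0 / (1.0 + math.exp(-z))


def cumulative_logit_probs(eta: float, thresholds: list[float]) -> list[float]:
    """Category probabilities under P(Y <= k) = logistic(theta_k - eta).

    `thresholds` must be strictly increasing; K thresholds give K + 1
    ordered categories. `eta` is the linear predictor x'beta.
    """
    cumulative = [logistic(t - eta) for t in thresholds] + [1.0]
    probs, previous = [], 0.0
    for c in cumulative:
        probs.append(c - previous)  # P(Y = k) = P(Y <= k) - P(Y <= k - 1)
        previous = c
    return probs


# Hypothetical essay with linear predictor 0.8 and four score levels.
print(cumulative_logit_probs(0.8, [-1.0, 0.5, 2.0]))
```

Unlike linear regression, this yields a full probability distribution over the discrete scores, so predicted scores can never fall outside the scale.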
Association of abnormal plasma bilirubin with aggressive HCC phenotype
Carr, Brian I.; Guerra, Vito; Giannini, Edoardo G.; Farinati, Fabio; Ciccarese, Francesca; Rapaccini, Gian Ludovico; Marco, Maria Di; Benvegnù, Luisa; Zoli, Marco; Borzio, Franco; Caturelli, Eugenio; Chiaramonte, Maria; Trevisani, Franco
2014-01-01
Background Cirrhosis-related abnormal liver function is associated with predisposition to HCC, features in several HCC classification systems and is an HCC prognostic factor. Aims To examine the phenotypic tumor differences in HCC patients with normal or abnormal plasma bilirubin levels. Methods A 2,416 patient HCC cohort was studied and dichotomized into normal and abnormal plasma bilirubin groups. Their HCC characteristics were compared for tumor aggressiveness features, namely blood AFP levels, tumor size, presence of PVT and tumor multifocality. Results In the total cohort, elevated bilirubin levels were associated with higher AFP levels, increased PVT and multifocality and lower survival, despite similar tumor sizes. When different tumor size terciles were compared, similar results were found, even for small tumor size patients. A multiple logistic regression model for PVT or tumor multifocality showed increased odds ratios for elevated levels of GGTP, bilirubin and AFP and for larger tumor sizes. Conclusions HCC patients with abnormal bilirubin levels had worse prognosis than patients with normal bilirubin. They also had increased incidence of PVT and tumor multifocality and higher AFP levels, in patients with both small and larger tumors. The results show an association between bilirubin levels and indices of HCC aggressiveness. PMID:24787296
Influence of obesity on mortality of drivers in severe motor vehicle crashes.
Jehle, Dietrich; Gemme, Seth; Jehle, Christopher
2012-01-01
The purpose of the study was to investigate the relationship between obesity and mortality of drivers in severe motor vehicle crashes involving at least one fatality. Fatalities were selected from 155,584 drivers included in the 2000-2005 Fatality Analysis Reporting System. Drivers were stratified by body mass index, confounders were adjusted for, and multiple logistic regression was used to determine the odds ratio (OR) of death in each body mass index class compared with normal weight. The adjusted risk of death from lowest to highest, reported as the OR of death compared with normal weight with 95% confidence intervals, was as follows: (1) overweight (OR, 0.952; 0.911-0.995; P = .0293), (2) slightly obese (OR, 0.996; 0.966-1.026; P = .7758), (3) normal weight, (4) underweight (OR, 1.115; 1.035-1.201; P = .0043), (5) moderately obese (OR, 1.212; 1.128-1.302; P < .0001), and (6) morbidly obese (OR, 1.559; 1.402-1.734; P < .0001). There is an increased risk of death for moderately obese, morbidly obese, and underweight drivers and a decreased risk in overweight drivers. Copyright © 2012 Elsevier Inc. All rights reserved.
[Investigation of psychological state and its influencing factors in children with epilepsy].
Zhao, Jin-Hua; Zhou, Hui; Xu, Ming; Lu, Sheng-Li; Hong, Fei
2015-06-01
To evaluate the psychological state of children with epilepsy and analyze its influencing factors. The Mental Health Scale for Child and Adolescent was used to survey 113 children with epilepsy and 114 normal children to evaluate and compare their psychological state. Questionnaires were used to investigate the general status of all subjects and the disease condition and treatment of children with epilepsy. The possible influencing factors for the psychological state of children with epilepsy were analyzed. The mental health status of children with epilepsy was poorer than that of normal children in cognition, thinking, emotion, will-behavior, and personality traits (P<0.05). Multivariate logistic regression analysis showed that family education, family relations, seizure frequency, seizure duration, EEG epileptiform discharges in the last six months, and number of types of antiepileptic drugs were correlated with the psychological state of children with epilepsy. There is a wider range of psychological health problems in children with epilepsy than in normal children. Poor family living environment, poor seizure control, and use of many antiepileptic drugs are the risk factors affecting the psychological state of children with epilepsy. Improving family living environment, controlling seizures, and monotherapy help to improve the psychological state of children with epilepsy.
Ling, Ziyu; Wang, Jianmin; Li, Xia; Zhong, Yan; Qin, Yuanyuan; Xie, Shengnan; Yang, Senbei; Zhang, Jing
2015-09-01
To explore the relationship between mothers' body mass index (BMI) before pregnancy or weight gain during pregnancy and autism in children. From 2013 to 2014, 181 children with autism and 181 healthy children matched by sex and age from the same area were included in this study. According to mothers' BMI before pregnancy, the selected cases were divided into 3 groups: low, normal and high. Each group was then divided into 3 subgroups based on mothers' weight gain during pregnancy (low, normal and high), according to the recommendations of the Institute of Medicine. Logistic regression analysis and the χ(2) test were conducted with SPSS 18.0 software to analyze the relationship between mothers' BMI before pregnancy or weight gain during pregnancy and autism in children. The age and sex distributions of the case group and control group were consistent (χ(2)=0.434, P>0.05). The mothers' BMI before pregnancy was higher in the case group than in the control group (χ(2)=9.580, P<0.05), at (21.28±3.80) kg/m(2) for the case group and (19.87±2.83) kg/m(2) for the control group. The proportion of mothers in the high BMI group was much higher in the case group (10.5%) than in the control group (2.8%). The risk of autism in children in the high BMI group was 3.7 times that in the normal BMI group (OR=3.71, 95% CI: 1.34-10.24). In the normal BMI group, the proportion of mothers who had excessive weight gain during pregnancy was higher in the case group (44.1%) than in the control group (33.9%). In the high BMI group, the proportion of mothers who had excessive weight gain was higher in the case group (52.6%) than in the control group (20.0%). In the normal BMI group (χ(2)=8.690, P<0.05) and the high BMI group (χ(2)=4.775, P<0.05), weight gain during pregnancy was associated with autism in children.
Logistic regression analysis showed that mothers' BMI before pregnancy (unadjusted OR=1.89, 95% CI: 1.26-2.85, adjusted OR=1.52, 95% CI: 1.19-2.27) and weight gain during pregnancy were the risk factors for autism in children (unadjusted OR=1.63, 95% CI: 1.08-1.25, adjusted OR=1.64, 95% CI: 1.21-2.21). Overweight or obesity before pregnancy and excessive weight gain during pregnancy were associated with autism in children, suggesting that women who plan to be pregnant should pay attention to body weight control.
Ardoino, Ilaria; Lanzoni, Monica; Marano, Giuseppe; Boracchi, Patrizia; Sagrini, Elisabetta; Gianstefani, Alice; Piscaglia, Fabio; Biganzoli, Elia M
2017-04-01
The interpretation of regression model results can often benefit from the generation of nomograms, 'user friendly' graphical devices especially useful for assisting decision-making processes. However, in the case of multinomial regression models, whenever categorical responses with more than two classes are involved, nomograms cannot be drawn in the conventional way. Such a difficulty in managing and interpreting the outcome could often limit the use of multinomial regression in decision-making support. In the present paper, we illustrate the derivation of a non-conventional nomogram for multinomial regression models, intended to overcome this issue. Although it may appear less straightforward at first sight, the proposed methodology allows an easy interpretation of the results of multinomial regression models and makes them more accessible for clinicians and general practitioners too. Development of a prediction model based on multinomial logistic regression and of the pertinent graphical tool is illustrated by means of an example involving the prediction of the extent of liver fibrosis in hepatitis C patients by routinely available markers.
Regularization Paths for Conditional Logistic Regression: The clogitL1 Package.
Reid, Stephen; Tibshirani, Rob
2014-07-01
We apply the cyclic coordinate descent algorithm of Friedman, Hastie, and Tibshirani (2010) to the fitting of a conditional logistic regression model with lasso (ℓ1) and elastic net penalties. The sequential strong rules of Tibshirani, Bien, Hastie, Friedman, Taylor, Simon, and Tibshirani (2012) are also used in the algorithm and it is shown that these offer a considerable speed up over the standard coordinate descent algorithm with warm starts. Once implemented, the algorithm is used in simulation studies to compare the variable selection and prediction performance of the conditional logistic regression model against that of its unconditional (standard) counterpart. We find that the conditional model performs admirably on datasets drawn from a suitable conditional distribution, outperforming its unconditional counterpart at variable selection. The conditional model is also fit to a small real world dataset, demonstrating how we obtain regularization paths for the parameters of the model and how we apply cross validation for this method where natural unconditional prediction rules are hard to come by.
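As a rough illustration of the coordinate-wise update behind such algorithms, the sketch below runs cyclic coordinate descent with soft-thresholding on an elastic net penalized least-squares problem. This is a deliberate simplification and not the clogitL1 implementation: the conditional logistic likelihood, the strong rules and the warm starts are all omitted, and the names `soft_threshold` and `elastic_net_cd` are hypothetical.

```python
def soft_threshold(z, gamma):
    """Soft-thresholding operator S(z, gamma) = sign(z) * max(|z| - gamma, 0)."""
    if z > gamma:
        return z - gamma
    if z < -gamma:
        return z + gamma
    return 0.0


def elastic_net_cd(X, y, lam, alpha=1.0, n_sweeps=100):
    """Cyclic coordinate descent (illustrative sketch) for
    (1/2n)*||y - X b||^2 + lam * (alpha*|b|_1 + (1 - alpha)/2 * |b|_2^2).

    Assumes no intercept and no all-zero columns in X (list of rows).
    """
    n, p = len(X), len(X[0])
    beta = [0.0] * p
    resid = list(y)  # residual y - X beta, starting from beta = 0
    col_ss = [sum(X[i][j] ** 2 for i in range(n)) / n for j in range(p)]
    for _ in range(n_sweeps):
        for j in range(p):
            # Correlation of column j with the partial residual (beta_j removed).
            rho = sum(X[i][j] * resid[i] for i in range(n)) / n + col_ss[j] * beta[j]
            new = soft_threshold(rho, lam * alpha) / (col_ss[j] + lam * (1.0 - alpha))
            delta = new - beta[j]
            if delta != 0.0:
                for i in range(n):
                    resid[i] -= X[i][j] * delta
                beta[j] = new
    return beta


# Hypothetical toy data with orthogonal columns: the weak second
# coefficient is shrunk exactly to zero by the lasso penalty.
X = [[1.0, 0.0], [-1.0, 0.0], [0.0, 1.0], [0.0, -1.0]]
y = [2.0, -2.0, 0.1, -0.1]
beta = elastic_net_cd(X, y, lam=0.1, alpha=1.0)
```

Setting `alpha=1.0` gives the lasso; values between 0 and 1 mix in the ridge term, which shrinks coefficients without setting them to zero.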
Computational tools for exact conditional logistic regression.
Corcoran, C; Mehta, C; Patel, N; Senchaudhuri, P
Logistic regression analyses are often challenged by the inability of unconditional likelihood-based approximations to yield consistent, valid estimates and p-values for model parameters. This can be due to sparseness or separability in the data. Conditional logistic regression, though useful in such situations, can also be computationally unfeasible when the sample size or number of explanatory covariates is large. We review recent developments that allow efficient approximate conditional inference, including Monte Carlo sampling and saddlepoint approximations. We demonstrate through real examples that these methods enable the analysis of significantly larger and more complex data sets. We find in this investigation that for these moderately large data sets Monte Carlo seems a better alternative, as it provides unbiased estimates of the exact results and can be executed in less CPU time than can the single saddlepoint approximation. Moreover, the double saddlepoint approximation, while computationally the easiest to obtain, offers little practical advantage. It produces unreliable results and cannot be computed when a maximum likelihood solution does not exist. Copyright 2001 John Wiley & Sons, Ltd.
Chen, Chau-Kuang; Bruce, Michelle; Tyler, Lauren; Brown, Claudine; Garrett, Angelica; Goggins, Susan; Lewis-Polite, Brandy; Weriwoh, Mirabel L; Juarez, Paul D.; Hood, Darryl B.; Skelton, Tyler
2014-01-01
The goal of this study was to analyze a 54-item instrument for assessment of perception of exposure to environmental contaminants within the context of the built environment, or exposome. This exposome was defined in five domains to include 1) home and hobby, 2) school, 3) community, 4) occupation, and 5) exposure history. Interviews were conducted with child-bearing-age minority women at Metro Nashville General Hospital at Meharry Medical College. Data were analyzed utilizing DTReg software for Support Vector Machine (SVM) modeling followed by an SPSS package for a logistic regression model. The target (outcome) variable of interest was respondent's residence by ZIP code. The results demonstrate that the rank order of important variables with respect to SVM modeling versus traditional logistic regression models is almost identical. This is the first study documenting that SVM analysis has discriminatory power for determination of higher-ordered spatial relationships on an environmental exposure history questionnaire. PMID:23395953
An ultra low power feature extraction and classification system for wearable seizure detection.
Page, Adam; Pramod Tim Oates, Siddharth; Mohsenin, Tinoosh
2015-01-01
In this paper we explore the use of a variety of machine learning algorithms for designing a reliable and low-power, multi-channel EEG feature extractor and classifier for predicting seizures from electroencephalographic data (scalp EEG). Different machine learning classifiers including k-nearest neighbor, support vector machines, naïve Bayes, logistic regression, and neural networks are explored with the goal of maximizing detection accuracy while minimizing power, area, and latency. The input to each machine learning classifier is a 198-feature vector containing 9 features for each of the 22 EEG channels obtained over 1-second windows. All classifiers were able to obtain F1 scores over 80% and onset sensitivity of 100% when tested on 10 patients. Among the five classifiers that were explored, logistic regression (LR) proved to have minimum hardware complexity while providing an average F1 score of 91%. Both ASIC and FPGA implementations of logistic regression are presented and show the smallest area, the lowest power consumption, and the lowest latency when compared to previous work.
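The input layout and the F1 metric mentioned above can be sketched as follows. The channel and feature counts come from the abstract; the function names, the placeholder feature values and the example counts are hypothetical, and the actual feature definitions are not given in the abstract.

```python
N_CHANNELS = 22  # EEG channels, as stated in the abstract
N_FEATURES = 9   # features per channel, as stated in the abstract


def flatten_features(per_channel):
    """Concatenate per-channel feature lists into one classifier input vector."""
    assert len(per_channel) == N_CHANNELS
    assert all(len(f) == N_FEATURES for f in per_channel)
    return [value for channel in per_channel for value in channel]


def f1_score(tp, fp, fn):
    """F1 = harmonic mean of precision and recall, from raw counts."""
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2.0 * precision * recall / (precision + recall)


# Placeholder values for one 1-second window (hypothetical data).
window = [[0.0] * N_FEATURES for _ in range(N_CHANNELS)]
vector = flatten_features(window)  # length 22 * 9 = 198
```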
The arcsine is asinine: the analysis of proportions in ecology.
Warton, David I; Hui, Francis K C
2011-01-01
The arcsine square root transformation has long been standard procedure when analyzing proportional data in ecology, with applications in data sets containing binomial and non-binomial response variables. Here, we argue that the arcsine transform should not be used in either circumstance. For binomial data, logistic regression has greater interpretability and higher power than analyses of transformed data. However, it is important to check the data for additional unexplained variation, i.e., overdispersion, and to account for it via the inclusion of random effects in the model if found. For non-binomial data, the arcsine transform is undesirable on the grounds of interpretability, and because it can produce nonsensical predictions. The logit transformation is proposed as an alternative approach to address these issues. Examples are presented in both cases to illustrate these advantages, comparing various methods of analyzing proportions including untransformed, arcsine- and logit-transformed linear models and logistic regression (with or without random effects). Simulations demonstrate that logistic regression usually provides a gain in power over other methods.
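One way to see how back-transformed arcsine predictions can become nonsensical is sketched below. A linear model fitted on the arcsine square root scale can produce fitted values above π/2, where the back-transform sin² starts to decrease again, whereas the inverse logit stays monotone and inside (0, 1). The function names and the fitted value 1.8 are illustrative assumptions, not taken from the paper.

```python
import math


def arcsine_sqrt(p):
    """Arcsine square root transform of a proportion p in [0, 1]."""
    return math.asin(math.sqrt(p))


def inv_arcsine_sqrt(t):
    """Back-transform from the arcsine square root scale."""
    return math.sin(t) ** 2


def logit(p):
    """Logit transform; undefined at p = 0 and p = 1."""
    return math.log(p / (1.0 - p))


def inv_logit(x):
    """Inverse logit; maps any real number into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


# A hypothetical fitted value beyond pi/2 on the arcsine scale maps back to
# a smaller proportion than the value at pi/2 itself.
at_max = inv_arcsine_sqrt(math.pi / 2)
beyond = inv_arcsine_sqrt(1.8)
```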
Okuyama, Mayumi; Nishida, Masumi
2016-01-01
The aim of the present study was to examine the association between impending dehydration among elderly people in nursing homes and physical signs, including the axillary skin temperature, humidity, intraoral moisture content, and salivary components. The study included 78 elderly individuals who required long-term care in a nursing home (11 men and 67 women; average age, 86.6±7.3 years). The elderly subjects were classified in two groups according to their serum osmolality levels: those with levels between the upper limit reference value (292 mOsm/kg H2O) and the diagnostic reference value of dehydration (300 mOsm/kg H2O) were classified into the boundary zone group and those with levels of <292 mOsm/kg H2O were classified into the normal range group. The following parameters were measured: basic attributes (age, gender and level of care required), body mass index, diet, daily fluid intake per kilogram of body weight, physiological indicators (blood pressure, pulse rate, body temperature, axillary skin temperature, humidity, total body water, body water rate, internal liquid rate, external solution rate, blood components, intraoral water amount, and salivary components), and the indoor environment (room temperature and humidity). We then performed a statistical analysis to compare the boundary zone group with the normal range group. After adjusting for age and the daily fluid intake per kilogram of body weight (<25 ml/≥25 ml), we performed a logistic regression analysis (the boundary zone group was used as an independent variable) for variables that had significance levels of <0.05 (except for blood components). The univariate analysis revealed significant differences in the following parameters: the serum sodium, chloride, and creatinine levels; the blood sugar level; the urea nitrogen/creatinine ratio; the axillary skin temperature; and room humidity. 
Only the axillary skin temperature showed a significant association in the final model of the logistic regression analysis (odds ratio, 3.664; 95% confidence interval, 1.101-12.197; p = 0.034). For each 1°C increase in axillary skin temperature, the odds of being classified into the boundary zone group rather than the normal range group increased about 3.7-fold. Thus, the axillary skin temperature was associated with impending dehydration.
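The per-degree odds ratio can be rescaled to other temperature differences and, given a baseline probability, converted to a probability. The sketch below assumes the fitted model is linear in temperature on the log-odds scale; the baseline probability of 0.20 and the function names are hypothetical. It also shows why an odds ratio should not be read as a risk ratio.

```python
import math

OR_PER_DEGREE = 3.664  # reported odds ratio per 1 degree C


def odds_ratio_for_change(or_per_unit, delta):
    """Odds ratio for a change of `delta` units, assuming log-odds are linear."""
    return math.exp(math.log(or_per_unit) * delta)


def probability_after(p0, odds_ratio):
    """Probability obtained by multiplying the baseline odds by `odds_ratio`."""
    odds = p0 / (1.0 - p0) * odds_ratio
    return odds / (1.0 + odds)


beta = math.log(OR_PER_DEGREE)                    # implied regression coefficient
or_two_degrees = odds_ratio_for_change(OR_PER_DEGREE, 2.0)
p1 = probability_after(0.20, OR_PER_DEGREE)       # 0.20 is a hypothetical baseline
risk_ratio = p1 / 0.20                            # smaller than the odds ratio
```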
Avalos, Marta; Adroher, Nuria Duran; Lagarde, Emmanuel; Thiessard, Frantz; Grandvalet, Yves; Contrand, Benjamin; Orriols, Ludivine
2012-09-01
Large data sets with many variables provide particular challenges when constructing analytic models. Lasso-related methods provide a useful tool, although one that remains unfamiliar to most epidemiologists. We illustrate the application of lasso methods in an analysis of the impact of prescribed drugs on the risk of a road traffic crash, using a large French nationwide database (PLoS Med 2010;7:e1000366). In the original case-control study, the authors analyzed each exposure separately. We use the lasso method, which can simultaneously perform estimation and variable selection in a single model. We compare point estimates and confidence intervals using (1) a separate logistic regression model for each drug with a Bonferroni correction and (2) lasso shrinkage logistic regression analysis. Shrinkage regression had little effect on (bias corrected) point estimates, but led to less conservative results, noticeably for drugs with moderate levels of exposure. Carbamates, carboxamide derivative and fatty acid derivative antiepileptics, drugs used in opioid dependence, and mineral supplements of potassium showed stronger associations. Lasso is a relevant method in the analysis of databases with large number of exposures and can be recommended as an alternative to conventional strategies.
NASA Astrophysics Data System (ADS)
Shafizadeh-Moghadam, Hossein; Helbich, Marco
2015-03-01
The rapid growth of megacities requires special attention among urban planners worldwide, and particularly in Mumbai, India, where growth is very pronounced. To cope with the planning challenges this will bring, developing a retrospective understanding of urban land-use dynamics and the underlying driving-forces behind urban growth is a key prerequisite. This research uses regression-based land-use change models - and in particular non-spatial logistic regression models (LR) and auto-logistic regression models (ALR) - for the Mumbai region over the period 1973-2010, in order to determine the drivers behind spatiotemporal urban expansion. Both global models are complemented by a local, spatial model, the so-called geographically weighted logistic regression (GWLR) model, one that explicitly permits variations in driving-forces across space. The study comes to two main conclusions. First, both global models suggest similar driving-forces behind urban growth over time, revealing that LRs and ALRs result in estimated coefficients with comparable magnitudes. Second, all the local coefficients show distinctive temporal and spatial variations. It is therefore concluded that GWLR aids our understanding of urban growth processes, and so can assist context-related planning and policymaking activities when seeking to secure a sustainable urban future.
Can Predictive Modeling Identify Head and Neck Oncology Patients at Risk for Readmission?
Manning, Amy M; Casper, Keith A; Peter, Kay St; Wilson, Keith M; Mark, Jonathan R; Collar, Ryan M
2018-05-01
Objective: Unplanned readmission within 30 days is a contributor to health care costs in the United States. The use of predictive modeling during hospitalization to identify patients at risk for readmission offers a novel approach to quality improvement and cost reduction. Study Design: Two-phase study including retrospective analysis of prospectively collected data followed by prospective longitudinal study. Setting: Tertiary academic medical center. Subjects and Methods: Prospectively collected data for patients undergoing surgical treatment for head and neck cancer from January 2013 to January 2015 were used to build predictive models for readmission within 30 days of discharge using logistic regression, classification and regression tree (CART) analysis, and random forests. One model (logistic regression) was then placed prospectively into the discharge workflow from March 2016 to May 2016 to determine the model's ability to predict which patients would be readmitted within 30 days. Results: In total, 174 admissions had descriptive data. Thirty-two were excluded due to incomplete data. Logistic regression, CART, and random forest predictive models were constructed using the remaining 142 admissions. When applied to 106 consecutive prospective head and neck oncology patients at the time of discharge, the logistic regression model predicted readmissions with a specificity of 94%, a sensitivity of 47%, a negative predictive value of 90%, and a positive predictive value of 62% (odds ratio, 14.9; 95% confidence interval, 4.02-55.45). Conclusion: Prospectively collected head and neck cancer databases can be used to develop predictive models that can accurately predict which patients will be readmitted. This offers valuable support for quality improvement initiatives and readmission-related cost reduction in head and neck cancer care.
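The reported test characteristics all derive from a single 2x2 table of predictions against outcomes. The abstract does not report the cell counts; the counts below are an assumption, chosen because they sum to 106 and reproduce the published figures, and the Wald interval with a 1.96 critical value is likewise an assumed method.

```python
import math

# Assumed cell counts (not stated in the abstract), totalling 106 patients.
TP, FN, FP, TN = 8, 9, 5, 84

sensitivity = TP / (TP + FN)   # readmitted patients correctly flagged
specificity = TN / (TN + FP)   # non-readmitted patients correctly cleared
ppv = TP / (TP + FP)           # flagged patients who were readmitted
npv = TN / (TN + FN)           # cleared patients who were not readmitted

odds_ratio = (TP * TN) / (FN * FP)

# Wald 95% confidence interval on the log odds ratio scale.
se_log_or = math.sqrt(1 / TP + 1 / FN + 1 / FP + 1 / TN)
ci_low = math.exp(math.log(odds_ratio) - 1.96 * se_log_or)
ci_high = math.exp(math.log(odds_ratio) + 1.96 * se_log_or)
```

With a low sensitivity and a high specificity, a model of this kind is better suited to ruling patients in for extra follow-up than to ruling readmission out.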
Roland, Lauren T.; Kallogjeri, Dorina; Sinks, Belinda C.; Rauch, Steven D.; Shepard, Neil T.; White, Judith A.; Goebel, Joel A.
2015-01-01
Objective: Test performance of a focused dizziness questionnaire's ability to discriminate between peripheral and non-peripheral causes of vertigo. Study Design: Prospective multi-center. Setting: Four academic centers with experienced balance specialists. Patients: New dizzy patients. Interventions: A 32-question survey was given to participants. Balance specialists were blinded and a diagnosis was established for all participating patients within 6 months. Main outcomes: Multinomial logistic regression was used to evaluate questionnaire performance in predicting final diagnosis and differentiating between peripheral and non-peripheral vertigo. Univariate and multivariable stepwise logistic regression were used to identify questions as significant predictors of the ultimate diagnosis. C-index was used to evaluate performance and discriminative power of the multivariable models. Results: 437 patients participated in the study. Eight participants without confirmed diagnoses were excluded and 429 were included in the analysis. Multinomial regression revealed that the model had good overall predictive accuracy of 78.5% for the final diagnosis and 75.5% for differentiating between peripheral and non-peripheral vertigo. Univariate logistic regression identified significant predictors of three main categories of vertigo: peripheral, central and other. Predictors were entered into forward stepwise multivariable logistic regression. The discriminative power of the final models for peripheral, central and other causes were considered good as measured by c-indices of 0.75, 0.7 and 0.78, respectively. Conclusions: This multicenter study demonstrates a focused dizziness questionnaire can accurately predict diagnosis for patients with chronic/relapsing dizziness referred to outpatient clinics. Additionally, this survey has significant capability to differentiate peripheral from non-peripheral causes of vertigo and may, in the future, serve as a screening tool for specialty referral. Clinical utility of this questionnaire to guide specialty referral is discussed. PMID:26485598
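For a binary outcome, the c-index is the proportion of case/non-case pairs in which the case receives the higher predicted probability, with ties counted as one half. A minimal sketch follows; the function name and the toy scores are hypothetical and are not data from the study.

```python
def c_index(scores, labels):
    """Concordance index for a binary outcome (labels are 1 or 0)."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    concordant = 0.0
    for sp in pos:
        for sn in neg:
            if sp > sn:
                concordant += 1.0   # case ranked above non-case
            elif sp == sn:
                concordant += 0.5   # tie counts as half
    return concordant / (len(pos) * len(neg))


# Hypothetical predicted probabilities for four patients.
example = c_index([0.9, 0.8, 0.4, 0.3], [1, 0, 1, 0])
```

A value of 0.5 corresponds to chance-level ranking and 1.0 to perfect separation of the two groups.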
Prediction of cold and heat patterns using anthropometric measures based on machine learning.
Lee, Bum Ju; Lee, Jae Chul; Nam, Jiho; Kim, Jong Yeol
2018-01-01
To examine the association of body shape with cold and heat patterns, to determine which anthropometric measure is the best indicator for discriminating between the two patterns, and to investigate whether using a combination of measures can improve the predictive power to diagnose these patterns. Based on a total of 4,859 subjects (3,000 women and 1,859 men), statistical analyses using binary logistic regression were performed to assess the significance of the difference and the predictive power of each anthropometric measure, and binary logistic regression and Naive Bayes with the variable selection technique were used to assess the improvement in the predictive power of the patterns using the combined measures. In women, the strongest indicators for determining the cold and heat patterns among anthropometric measures were body mass index (BMI) and rib circumference; in men, the best indicator was BMI. In experiments using a combination of measures, the values of the area under the receiver operating characteristic curve in women were 0.776 by Naive Bayes and 0.772 by logistic regression, and the values in men were 0.788 by Naive Bayes and 0.779 by logistic regression. Individuals with a higher BMI have a tendency toward a heat pattern in both women and men. The use of a combination of anthropometric measures can slightly improve the diagnostic accuracy. Our findings can provide fundamental information for the diagnosis of cold and heat patterns based on body shape for personalized medicine.
Teng, Ju-Hsi; Lin, Kuan-Chia; Ho, Bin-Shenq
2007-10-01
A community-based aboriginal study was conducted and analysed to explore the application of classification tree and logistic regression. A total of 1066 aboriginal residents in Yilan County were screened during 2003-2004. The independent variables include demographic characteristics, physical examinations, geographic location, health behaviours, dietary habits and family hereditary diseases history. Risk factors of cardiovascular diseases were selected as the dependent variables in further analysis. The completion rate for the health interview was 88.9%. The classification tree results show that if body mass index is higher than 25.72 kg m(-2) and age is above 51 years, the predicted probability for number of cardiovascular risk factors ≥3 is 73.6% and the population is 322. If body mass index is higher than 26.35 kg m(-2) and geographical latitude of the village is lower than 24 degrees 22.8', the predicted probability for number of cardiovascular risk factors ≥4 is 60.8% and the population is 74. The logistic regression results indicate that body mass index, drinking habit and menopause are the top three significant independent variables. The classification tree model specifically shows the discrimination paths and interactions between the risk groups. The logistic regression model presents and analyses the statistical independent factors of cardiovascular risks. Applying both models to specific situations will provide a different angle for the design and management of future health intervention plans after community-based study.
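The two discrimination paths quoted above can be written as explicit rules, which is what makes a classification tree easy to read. The sketch below encodes only the two leaves reported in the abstract; the function names are hypothetical, and returning `None` for any other combination is an assumption, since the rest of each tree is not described.

```python
BMI_CUT_3PLUS = 25.72          # kg/m^2, path for >= 3 risk factors
AGE_CUT = 51                   # years
BMI_CUT_4PLUS = 26.35          # kg/m^2, path for >= 4 risk factors
LAT_CUT = 24 + 22.8 / 60       # 24 degrees 22.8 minutes, in decimal degrees


def prob_three_or_more(bmi, age):
    """Reported leaf probability of >= 3 risk factors, or None outside that leaf."""
    if bmi > BMI_CUT_3PLUS and age > AGE_CUT:
        return 0.736
    return None


def prob_four_or_more(bmi, latitude_deg):
    """Reported leaf probability of >= 4 risk factors, or None outside that leaf."""
    if bmi > BMI_CUT_4PLUS and latitude_deg < LAT_CUT:
        return 0.608
    return None
```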
Gong, Xu; Cui, Jianli; Jiang, Ziping; Lu, Laijin; Li, Xiucun
2018-03-01
Few clinical retrospective studies have reported the risk factors of pedicled flap necrosis in hand soft tissue reconstruction. The aim of this study was to identify non-technical risk factors associated with pedicled flap perioperative necrosis in hand soft tissue reconstruction via a multivariate logistic regression analysis. For patients with hand soft tissue reconstruction, we carefully reviewed hospital records and identified 163 patients who met the inclusion criteria. The characteristics of these patients, flap transfer procedures and postoperative complications were recorded. Eleven predictors were identified. The correlations between pedicled flap necrosis and risk factors were analysed using a logistic regression model. Of 163 skin flaps, 125 flaps survived completely without any complications. The pedicled flap necrosis rate in hands was 11.04%, which included partial flap necrosis (7.36%) and total flap necrosis (3.68%). Soft tissue defects in fingers were noted in 68.10% of all cases. The logistic regression analysis indicated that the soft tissue defect site (P = 0.046, odds ratio (OR) = 0.079, confidence interval (CI) (0.006, 0.959)), flap size (P = 0.020, OR = 1.024, CI (1.004, 1.045)) and postoperative wound infection (P < 0.001, OR = 17.407, CI (3.821, 79.303)) were statistically significant risk factors for pedicled flap necrosis of the hand. Soft tissue defect site, flap size and postoperative wound infection were risk factors associated with pedicled flap necrosis in hand soft tissue defect reconstruction. © 2017 Royal Australasian College of Surgeons.
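Reported odds ratios and confidence intervals can be checked against their P values by recovering the Wald statistic. The sketch below assumes 95% Wald intervals (critical value 1.96) and a two-sided normal test; neither is stated in the abstract, the function name is hypothetical, and rounding in the published figures limits the precision of the result.

```python
import math


def wald_from_ci(odds_ratio, ci_low, ci_high, z_crit=1.96):
    """Recover coefficient, standard error, z and two-sided p from an OR and its CI."""
    beta = math.log(odds_ratio)
    se = (math.log(ci_high) - math.log(ci_low)) / (2.0 * z_crit)
    z = beta / se
    p = math.erfc(abs(z) / math.sqrt(2.0))  # two-sided normal tail probability
    return beta, se, z, p


# Values reported in the abstract.
_, _, z_size, p_size = wald_from_ci(1.024, 1.004, 1.045)            # flap size
_, _, z_infect, p_infect = wald_from_ci(17.407, 3.821, 79.303)      # wound infection
```

Under these assumptions the recovered P values agree with the reported ones for flap size (about 0.020) and for postoperative wound infection (below 0.001).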
A regularization corrected score method for nonlinear regression models with covariate error.
Zucker, David M; Gorfine, Malka; Li, Yi; Tadesse, Mahlet G; Spiegelman, Donna
2013-03-01
Many regression analyses involve explanatory variables that are measured with error, and failing to account for this error is well known to lead to biased point and interval estimates of the regression coefficients. We present here a new general method for adjusting for covariate error. Our method consists of an approximate version of the Stefanski-Nakamura corrected score approach, using the method of regularization to obtain an approximate solution of the relevant integral equation. We develop the theory in the setting of classical likelihood models; this setting covers, for example, linear regression, nonlinear regression, logistic regression, and Poisson regression. The method is extremely general in terms of the types of measurement error models covered, and is a functional method in the sense of not involving assumptions on the distribution of the true covariate. We discuss the theoretical properties of the method and present simulation results in the logistic regression setting (univariate and multivariate). For illustration, we apply the method to data from the Harvard Nurses' Health Study concerning the relationship between physical activity and breast cancer mortality in the period following a diagnosis of breast cancer. Copyright © 2013, The International Biometric Society.
Logistic Mixed Models to Investigate Implicit and Explicit Belief Tracking
Lages, Martin; Scheel, Anne
2016-01-01
We investigated the proposition of a two-systems Theory of Mind in adults’ belief tracking. A sample of N = 45 participants predicted the choice of one of two opponent players after observing several rounds in an animated card game. Three matches of this card game were played and initial gaze direction on target and subsequent choice predictions were recorded for each belief task and participant. We conducted logistic regressions with mixed effects on the binary data and developed Bayesian logistic mixed models to infer implicit and explicit mentalizing in true belief and false belief tasks. Although logistic regressions with mixed effects predicted the data well, a Bayesian logistic mixed model with latent task- and subject-specific parameters gave a better account of the data. As expected, explicit choice predictions suggested a clear understanding of true and false beliefs (TB/FB). Surprisingly, however, model parameters for initial gaze direction also indicated belief tracking. We discuss why task-specific parameters for initial gaze directions are different from choice predictions yet reflect second-order perspective taking. PMID:27853440
Distiller, Larry A; Joffe, Barry I; Melville, Vanessa; Welman, Tania; Distiller, Greg B
2006-01-01
The factors responsible for premature coronary atherosclerosis in patients with type 1 diabetes are ill defined. We therefore assessed carotid intima-media complex thickness (IMT) in relatively long-surviving patients with type 1 diabetes as a marker of atherosclerosis and correlated this with traditional risk factors. Cross-sectional study of 148 patients with relatively long-surviving (>18 years) type 1 diabetes (76 men and 72 women) attending the Centre for Diabetes and Endocrinology, Johannesburg. The mean common carotid artery IMT and presence or absence of plaque was evaluated by high-resolution B-mode ultrasound. Their median age was 48 years and duration of diabetes 26 years (range 18-59 years). Traditional risk factors (age, duration of diabetes, glycemic control, hypertension, smoking and lipoprotein concentrations) were recorded. Three response variables were defined and modeled. Standard multiple regression was used for a continuous IMT variable, logistic regression for the presence/absence of plaque and ordinal logistic regression to model three categories of "risk." The median common carotid IMT was 0.62 mm (range 0.44-1.23 mm) with plaque detected in 28 cases. The multiple regression model found significant associations between IMT and current age (P=.001), duration of diabetes (P=.033), BMI (P=.008) and diagnosed hypertension (P=.046) with HDL showing a protective effect (P=.022). Current age (P=.001) and diagnosed hypertension (P=.004), smoking (P=.008) and retinopathy (P=.033) were significant in the logistic regression model. Current age was also significant in the ordinal logistic regression model (P<.001), as was total cholesterol/HDL ratio (P<.001) and mean HbA(1c) concentration (P=.073). 
The major factors influencing common carotid IMT in patients with relatively long-surviving type 1 diabetes are age, duration of diabetes, existing hypertension and HDL (protective) with a relatively minor role ascribed to relatively long-standing glycemic control.
Pre-Pregnancy Body Mass Index, Gestational Weight Gain, and Birth Weight: A Cohort Study in China.
Yang, Shaoping; Peng, Anna; Wei, Sheng; Wu, Jing; Zhao, Jinzhu; Zhang, Yiming; Wang, Jing; Lu, Yuan; Yu, Yuzhen; Zhang, Bin
2015-01-01
To assess whether pre-pregnancy body mass index (BMI) modifies the relationship between gestational weight gain (GWG) and child birth weight (specifically, presence or absence of low birth weight (LBW) or presence or absence of macrosomia), and estimates of the relative risk of macrosomia and LBW based on pre-pregnancy BMI were controlled in Wuhan, China. From June 30, 2011 to June 30, 2013, all data were collected and available from the perinatal health care system. Logistic regression models were used to estimate the independent association among pregnancy weight gain, LBW, normal birth weight, and macrosomia within different pre-pregnancy BMI groups. We built different logistic models for the 2009 Institute of Medicine (IOM) Guidelines and for the Chinese-recommended GWG, which was derived from this sample. The Chinese-recommended GWG was derived from the quartile values (25th-75th percentiles) of weight gain at the time of delivery in the subjects who comprised our sample. For LBW children, using the recommended weight gain of the IOM and Chinese women as a reference, the OR for a pregnancy weight gain below recommendations resulted in a positive relationship for lean and normal weight women, but not for overweight and obese women. For macrosomia, considering the IOM's recommended weight gain as a reference, the OR magnitude for pregnancy weight gain above recommendations resulted in a positive correlation for all women. The OR for a pregnancy weight gain below recommendations resulted in a negative relationship for normal BMI and lean women, but not for overweight and obese women based on the IOM recommendations, significant based on the recommended pregnancy weight gain for Chinese women. Of normal weight children, 56.6% were above the GWG based on IOM recommendations, but 26.97% of normal weight children were above the GWG based on Chinese recommendations. A GWG above IOM recommendations might not be helpful for Chinese women.
We need unified criteria to classify adult BMI and to expand the sample size to improve representation and to elucidate the relationship between GWG and related outcomes for developing a Chinese GWG recommendation.
Pre-Pregnancy Body Mass Index, Gestational Weight Gain, and Birth Weight: A Cohort Study in China
Wei, Sheng; Wu, Jing; Zhao, Jinzhu; Zhang, Yiming; Wang, Jing; Lu, Yuan; Yu, Yuzhen; Zhang, Bin
2015-01-01
Objective To assess whether pre-pregnancy body mass index (BMI) modifies the relationship between gestational weight gain (GWG) and child birth weight (specifically, presence or absence of low birth weight (LBW) or presence or absence of macrosomia), and to estimate the relative risk of macrosomia and LBW based on pre-pregnancy BMI, in Wuhan, China. Methods All data were collected from June 30, 2011 to June 30, 2013 and were available from the perinatal health care system. Logistic regression models were used to estimate the independent association among pregnancy weight gain, LBW, normal birth weight, and macrosomia within different pre-pregnancy BMI groups. We built different logistic models for the 2009 Institute of Medicine (IOM) Guidelines and the Chinese-recommended GWG, which was derived from this sample. The Chinese-recommended GWG was derived from the quartile values (25th-75th percentiles) of weight gain at the time of delivery in the subjects who comprised our sample. Results For LBW children, using the recommended weight gain of the IOM and Chinese women as a reference, the OR for a pregnancy weight gain below recommendations indicated a positive relationship for lean and normal weight women, but not for overweight and obese women. For macrosomia, considering the IOM’s recommended weight gain as a reference, the OR for pregnancy weight gain above recommendations indicated a positive association for all women. The OR for a pregnancy weight gain below recommendations indicated a negative relationship for normal BMI and lean women, but not for overweight and obese women, based on the IOM recommendations; it was significant based on the recommended pregnancy weight gain for Chinese women. Of normal weight children, 56.6% were above the GWG based on IOM recommendations, but 26.97% of normal weight children were above the GWG based on Chinese recommendations. Conclusions A GWG above IOM recommendations might not be helpful for Chinese women. We need unified criteria to classify adult BMI and to expand the sample size to improve representation and to elucidate the relationship between GWG and related outcomes for developing a Chinese GWG recommendation. PMID:26115015
Correlation and simple linear regression.
Eberly, Lynn E
2007-01-01
This chapter highlights important steps in using correlation and simple linear regression to address scientific questions about the association of two continuous variables with each other. These steps include estimation and inference, assessing model fit, the connection between regression and ANOVA, and study design. Examples in microbiology are used throughout. This chapter provides a framework that is helpful in understanding more complex statistical techniques, such as multiple linear regression, linear mixed effects models, logistic regression, and proportional hazards regression.
Multiple Imputation of a Randomly Censored Covariate Improves Logistic Regression Analysis.
Atem, Folefac D; Qian, Jing; Maye, Jacqueline E; Johnson, Keith A; Betensky, Rebecca A
2016-01-01
Randomly censored covariates arise frequently in epidemiologic studies. The most commonly used methods, including complete case and single imputation or substitution, suffer from inefficiency and bias. They make strong parametric assumptions or they consider limit of detection censoring only. We employ multiple imputation, in conjunction with semi-parametric modeling of the censored covariate, to overcome these shortcomings and to facilitate robust estimation. We develop a multiple imputation approach for randomly censored covariates within the framework of a logistic regression model. We use the non-parametric estimate of the covariate distribution or the semiparametric Cox model estimate in the presence of additional covariates in the model. We evaluate this procedure in simulations, and compare its operating characteristics to those from the complete case analysis and a survival regression approach. We apply the procedures to an Alzheimer's study of the association between amyloid positivity and maternal age of onset of dementia. Multiple imputation achieves lower standard errors and higher power than the complete case approach under heavy and moderate censoring and is comparable under light censoring. The survival regression approach achieves the highest power among all procedures, but does not produce interpretable estimates of association. Multiple imputation offers a favorable alternative to complete case analysis and ad hoc substitution methods in the presence of randomly censored covariates within the framework of logistic regression.
Shani, Michal; Vinker, Shlomo; Dinour, Dganit; Leiba, Merav; Twig, Gilad; Holtzman, Eliezer J; Leiba, Adi
2016-10-01
The risk associated with serum uric acid (SUA) levels within the normal range is unknown, especially among lean and apparently healthy adults. We evaluated whether high-normal SUA levels (6.8 mg/dL and below) are associated with an increased diabetes risk compared with low-normal SUA. This was a cohort study with 10 years of follow-up involving all clinics of the largest nationally distributed Health Maintenance Organization in Israel. Participants included 469,947 examinees, 40-70 years old at baseline, who had their SUA measured during 2002. We excluded examinees who had hyperuricemia (SUA > 6.8 mg/dL), impaired fasting glucose, overweight or obesity and chronic cardiovascular or renal disorders. The final cohort was composed of 30,302 participants. Participants were followed up to a new diagnosis of diabetes during the study period. The odds of developing diabetes among participants with high-normal baseline SUA were compared with those among participants with low-normal SUA (2 ≤ uric acid < 3 and 3 ≤ uric acid < 4 in women and men, respectively). In a logistic regression model adjusted for age, body mass index, socioeconomic status, smoking, baseline estimated glomerular filtration rate, and baseline glucose, SUA levels of 4-5 mg/dL for women were associated with 61% increased risk for incident diabetes (95% confidence interval, 1.1-2.3). At the highest normal levels for women (SUA, 5-6 mg/dL) the odds ratio was 2.7 (1.8-4.0), whereas men had comparable diabetes risk at values of 6-6.8 mg/dL (hazard ratio, 1.35; 95% confidence interval, 0.9-2.1). SUA levels within the normal range are associated with an increased risk for new-onset diabetes among healthy lean women when compared with those with low-normal values.
Epstein, Daniel S; Mitra, Biswadev; Cameron, Peter A; Fitzgerald, Mark; Rosenfeld, Jeffrey V
2016-07-01
Acute traumatic coagulopathy (ATC) has been reported in the setting of isolated traumatic brain injury (iTBI) and is associated with poor outcomes. We aimed to evaluate the effectiveness of procoagulant agents administered to patients with ATC and iTBI during resuscitation, hypothesizing that timely normalization of coagulopathy may be associated with a decrease in mortality. A retrospective review of the Alfred Hospital trauma registry, Australia, was conducted and patients with iTBI (head Abbreviated Injury Score [AIS] ⩾3 and all other body AIS <3) and coagulopathy (international normalized ratio [INR] ⩾1.3) were selected for analysis. Data on procoagulant agents used (fresh frozen plasma, platelets, cryoprecipitate, prothrombin complex concentrates, tranexamic acid, vitamin K) were extracted. Among patients who had achieved normalization of INR or survived beyond 24 hours and were not taking oral anticoagulants, the association of normalization of INR and death at hospital discharge was analyzed using multivariable logistic regression analysis. There were 157 patients with ATC, of whom 68 (43.3%) received procoagulant products within 24 hours of presentation. The median time to delivery of first products was 182.5 (interquartile range [IQR] 115-375) minutes, and following administration of coagulants, time to normalization of INR was 605 (IQR 274-1146) minutes. Normalization of INR was independently associated with significantly lower mortality (adjusted odds ratio 0.10; 95% confidence interval 0.03-0.38). Normalization of INR was associated with improved mortality in patients with ATC in the setting of iTBI. As there was a substantial time lag between delivery of products and eventual normalization of coagulation, specific management of coagulopathy should be implemented as early as possible. Copyright © 2016 Elsevier Ltd. All rights reserved.
Simulation of urban land surface temperature based on sub-pixel land cover in a coastal city
NASA Astrophysics Data System (ADS)
Zhao, Xiaofeng; Deng, Lei; Feng, Huihui; Zhao, Yanchuang
2014-11-01
Sub-pixel urban land cover has been shown to correlate clearly with land surface temperature (LST). Yet these relationships have seldom been used to simulate LST. In this study we provide a new approach to urban LST simulation based on sub-pixel land cover modeling. Landsat TM/ETM+ images of Xiamen city, China, from January 2002 and January 2007 were used to acquire land cover and then extract the transformation rule using logistic regression. The transformation probability was taken as its percentage in the same pixel after normalization. Cellular automata were then used to obtain simulated sub-pixel land cover for 2007 and 2017. In addition, the correlations between retrieved LST and sub-pixel land cover obtained by spectral mixture analysis in 2002 were examined and a regression model was built. The regression model was then applied to the simulated 2007 land cover to model the LST of 2007. Finally, the LST of 2017 was simulated for urban planning and management. The results showed that our method is useful in LST simulation. Although the simulation accuracy is not quite satisfactory, it provides an important idea and a good start in the modeling of urban LST.
Multinomial logistic regression in workers' health
NASA Astrophysics Data System (ADS)
Grilo, Luís M.; Grilo, Helena L.; Gonçalves, Sónia P.; Junça, Ana
2017-11-01
In European countries, in particular in Portugal, it is common to hear people say that they are exposed to excessive and continuous psychosocial stressors at work. This is increasing in diverse activity sectors, such as the Services sector. A representative sample was collected from a Portuguese Services organization by applying an internationally validated survey whose variables were measured in five ordered categories on a Likert-type scale. A multinomial logistic regression model is used to estimate the probability of each category of the dependent variable, general health perception, where, among other independent variables, burnout appears as statistically significant.
Du, Qing-Yun; Wang, En-Yin; Huang, Yan; Guo, Xiao-Yi; Xiong, Yu-Jing; Yu, Yi-Ping; Yao, Gui-Dong; Shi, Sen-Lin; Sun, Ying-Pu
2016-04-01
To evaluate the independent effects of the degree of blastocoele expansion and re-expansion and the inner cell mass (ICM) and trophectoderm (TE) grades on predicting live birth after fresh and vitrified/warmed single blastocyst transfer. Retrospective study. Reproductive medical center. Women undergoing 844 fresh and 370 vitrified/warmed single blastocyst transfer cycles. None. Live-birth rate correlated with blastocyst morphology parameters by logistic regression analysis and Spearman correlation analysis. The degree of blastocoele expansion and re-expansion was the only blastocyst morphology parameter that exhibited a significant ability to predict live birth in both fresh and vitrified/warmed single blastocyst transfer cycles by multivariate logistic regression and Spearman correlation analysis. Although the ICM grade was significantly related to live birth in fresh cycles according to the univariate model, its effect was not maintained in the multivariate logistic analysis. In vitrified/warmed cycles, neither ICM nor TE grade was correlated with live birth by logistic regression analysis. This study is the first to confirm that the degree of blastocoele expansion and re-expansion is a better predictor of live birth after both fresh and vitrified/warmed single blastocyst transfer cycles than ICM or TE grade. Copyright © 2016. Published by Elsevier Inc.
Notes on power of normality tests of error terms in regression models
DOE Office of Scientific and Technical Information (OSTI.GOV)
Střelec, Luboš
2015-03-10
Normality is one of the basic assumptions in applying statistical procedures. For example, in linear regression most of the inferential procedures are based on the assumption of normality, i.e. the disturbance vector is assumed to be normally distributed. Failure to assess non-normality of the error terms may lead to incorrect results of usual statistical inference techniques such as the t-test or F-test. Thus, error terms should be normally distributed in order to allow us to make exact inferences. As a consequence, normally distributed stochastic errors are necessary in order to avoid misleading inferences, which explains the necessity and importance of robust tests of normality. Therefore, the aim of this contribution is to discuss normality testing of error terms in regression models. In this contribution, we introduce the general RT class of robust tests for normality, and present and discuss the trade-off between power and robustness of selected classical and robust normality tests of error terms in regression models.
Factor complexity of crash occurrence: An empirical demonstration using boosted regression trees.
Chung, Yi-Shih
2013-12-01
Factor complexity is a characteristic of traffic crashes. This paper proposes a novel method, namely boosted regression trees (BRT), to investigate the complex and nonlinear relationships in high-variance traffic crash data. The Taiwanese 2004-2005 single-vehicle motorcycle crash data are used to demonstrate the utility of BRT. Traditional logistic regression and classification and regression tree (CART) models are also used to compare their estimation results and external validities. Both the in-sample cross-validation and out-of-sample validation results show that an increase in tree complexity provides improved, although declining, classification performance, indicating a limited factor complexity of single-vehicle motorcycle crashes. The effects of crucial variables including geographical, time, and sociodemographic factors explain some fatal crashes. Relatively unique fatal crashes are better approximated by interactive terms, especially combinations of behavioral factors. BRT models generally provide better transferability than conventional logistic regression and CART models. This study also discusses the implications of the results for devising safety policies. Copyright © 2012 Elsevier Ltd. All rights reserved.
Zhan, L.; Liu, Y.; Zhou, J.; Ye, J.; Thompson, P.M.
2015-01-01
Mild cognitive impairment (MCI) is an intermediate stage between normal aging and Alzheimer's disease (AD), and around 10-15% of people with MCI develop AD each year. More recently, MCI has been further subdivided into early and late stages, and there is interest in identifying sensitive brain imaging biomarkers that help to differentiate stages of MCI. Here, we focused on anatomical brain networks computed from diffusion MRI and proposed a new feature extraction and classification framework based on higher order singular value decomposition and sparse logistic regression. In tests on publicly available data from the Alzheimer's Disease Neuroimaging Initiative, our proposed framework showed promise in detecting brain network differences that help in classifying early versus late MCI. PMID:26413202
Keogh, Ruth H; Mangtani, Punam; Rodrigues, Laura; Nguipdop Djomo, Patrick
2016-01-05
Traditional analyses of standard case-control studies using logistic regression do not allow estimation of time-varying associations between exposures and the outcome. We present two approaches which allow this. The motivation is a study of vaccine efficacy as a function of time since vaccination. Our first approach is to estimate time-varying exposure-outcome associations by fitting a series of logistic regressions within successive time periods, reusing controls across periods. Our second approach treats the case-control sample as a case-cohort study, with the controls forming the subcohort. In the case-cohort analysis, controls contribute information at all times they are at risk. Extensions allow left truncation, frequency matching and, using the case-cohort analysis, time-varying exposures. Simulations are used to investigate the methods. The simulation results show that both methods give correct estimates of time-varying effects of exposures using standard case-control data. Using the logistic approach there are efficiency gains by reusing controls over time and care should be taken over the definition of controls within time periods. However, using the case-cohort analysis there is no ambiguity over the definition of controls. The performance of the two analyses is very similar when controls are used most efficiently under the logistic approach. Using our methods, case-control studies can be used to estimate time-varying exposure-outcome associations where they may not previously have been considered. The case-cohort analysis has several advantages, including that it allows estimation of time-varying associations as a continuous function of time, while the logistic regression approach is restricted to assuming a step function form for the time-varying association.
Macdonald, Jonathan; Porter, Victoria; Scott, Neil W; McNamara, Deirdre
2010-10-01
Small bowel angiodysplasia accounts for 30 to 40% of cases of obscure gastrointestinal bleeding and is associated with significant morbidity and mortality. Identifying lesions can be difficult. Small bowel capsule endoscopy (SBCE) is a significant advance on earlier diagnostic techniques. The cause of angiodysplasia is unknown and the natural history poorly understood. Many lesions are thought to arise from a degenerative process associated with ageing, local vascular anomalies, and tissue hypoxia. Nonpathologic lymphangiectasias are commonly seen throughout the small bowel and are considered a normal finding. To determine whether there is an association between lymphangiectasias, angiodysplasia, and atherosclerosis related conditions. Relevant information was collected from a dedicated SBCE database. Logistic regression analysis was used to examine associations between angiodysplasia, lymphangiectasia, patient demographics, and comorbidity. In all, 180 patients underwent SBCE during the study period, 46 (25%) had angiodysplasia and 47 (26%) lymphangiectasia. Lymphangiectasia were seen in 24 (52%) of 46 with angiodysplasia, in 16 (19%) of 84 with obscure gastrointestinal bleeding without angiodysplasia and in 7 (14%) of 50 without gastrointestinal bleeding. Logistic regression analysis confirmed a strong positive association between angiodysplasia and lymphangiectasia; odds ratio 4.42, P<0.003. Angiodysplasias were also associated with increasing age; odds ratio 1.1. There was no correlation with any other patient characteristic. Lymphangiectasia are strongly associated with the presence of small intestinal angiodysplasia and may represent a useful clinical marker for this condition. Angiodysplasia are also associated with increasing age. Conditions associated with systemic atherosclerosis did not increase the risk of angiodysplasia.
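The counts reported in the abstract above are enough to work out a crude (unadjusted) odds ratio by hand, which may help in reading the reported figure. The sketch below is illustrative only: the function name `odds_ratio` is hypothetical, and the 2x2 table is assembled on the assumption that the two non-angiodysplasia groups (84 and 50 patients) can be pooled. The abstract's odds ratio of 4.42 comes from the authors' logistic regression model, so the crude value computed here is not expected to match it.

```python
def odds_ratio(a, b, c, d):
    """Crude odds ratio for a 2x2 table.

    a: exposed with outcome      b: exposed without outcome
    c: unexposed with outcome    d: unexposed without outcome
    """
    return (a * d) / (b * c)


# Counts taken from the abstract; pooling the two comparison groups
# is an assumption made for this illustration.
with_angio_lymph = 24                 # lymphangiectasia among 46 with angiodysplasia
with_angio_no_lymph = 46 - 24         # 22
without_angio_lymph = 16 + 7          # 23 of the pooled 84 + 50 = 134
without_angio_no_lymph = (84 + 50) - (16 + 7)  # 111

crude_or = odds_ratio(
    with_angio_lymph,
    with_angio_no_lymph,
    without_angio_lymph,
    without_angio_no_lymph,
)
print(f"crude odds ratio: {crude_or:.2f}")
```

The difference between a crude ratio like this and a model-based estimate is one reason abstracts usually state which analysis an odds ratio came from.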
Venta, Kimberly; Baker, Erin; Fidopiastis, Cali; Stanney, Kay
2017-12-01
The purpose of this study was to investigate the potential of developing an EHR-based model of physician competency, named the Skill Deficiency Evaluation Toolkit for Eliminating Competency-loss Trends (Skill-DETECT), which presents the opportunity to use EHR-based models to inform selection of Continued Medical Education (CME) opportunities specifically targeted at maintaining proficiency. The IBM Explorys platform provided outpatient Electronic Health Records (EHRs) representing 76 physicians with over 5000 patients combined. These data were used to develop the Skill-DETECT model, a predictive hybrid model composed of a rule-based model, logistic regression model, and a thresholding model, which predicts cognitive clinical skill deficiencies in internal medicine physicians. A three-phase approach was then used to statistically validate the model performance. Subject Matter Expert (SME) panel reviews resulted in a 100% overall approval rate of the rule based model. Area under the receiver-operating characteristic curves calculated for each logistic regression curve resulted in values between 0.76 and 0.92, which indicated exceptional performance. Normality, skewness, and kurtosis were determined and confirmed that the distribution of values output from the thresholding model were unimodal and peaked, which confirmed effectiveness and generalizability. The validation has confirmed that the Skill-DETECT model has a strong ability to evaluate EHR data and support the identification of internal medicine cognitive clinical skills that are deficient or are of higher likelihood of becoming deficient and thus require remediation, which will allow both physician and medical organizations to fine tune training efforts. Copyright © 2017 Elsevier B.V. All rights reserved.
Reducing the number of reconstructions needed for estimating channelized observer performance
NASA Astrophysics Data System (ADS)
Pineda, Angel R.; Miedema, Hope; Brenner, Melissa; Altaf, Sana
2018-03-01
A challenge for task-based optimization is the time required for each reconstructed image in applications where reconstructions are time consuming. Our goal is to reduce the number of reconstructions needed to estimate the area under the receiver operating characteristic curve (AUC) of the infinitely-trained optimal channelized linear observer. We explore the use of classifiers which either do not invert the channel covariance matrix or do feature selection. We also study the assumption that multiple low contrast signals in the same image of a non-linear reconstruction do not significantly change the estimate of the AUC. We compared the AUC of several classifiers (Hotelling, logistic regression, logistic regression using Firth bias reduction and the least absolute shrinkage and selection operator (LASSO)) with a small number of observations both for normal simulated data and images from a total variation reconstruction in magnetic resonance imaging (MRI). We used 10 Laguerre-Gauss channels and the Mann-Whitney estimator for AUC. For this data, our results show that at small sample sizes feature selection using the LASSO technique can decrease bias of the AUC estimation with increased variance and that for large sample sizes the difference between these classifiers is small. We also compared the use of multiple signals in a single reconstructed image to reduce the number of reconstructions in a total variation reconstruction for accelerated imaging in MRI. We found that AUC estimation using multiple low contrast signals in the same image resulted in similar AUC estimates as doing a single reconstruction per signal leading to a 13x reduction in the number of reconstructions needed.
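The Mann-Whitney estimator of AUC mentioned in the abstract above has a simple pairwise form: the fraction of (signal-present, signal-absent) score pairs that are ranked correctly, with ties counted as one half. A minimal sketch follows; the function name and the example scores are invented for illustration and are not taken from the study.

```python
def mann_whitney_auc(signal_scores, noise_scores):
    """Mann-Whitney (Wilcoxon) estimate of the area under the ROC curve.

    Counts the pairs in which the signal-present score exceeds the
    signal-absent score; tied pairs contribute one half.
    """
    wins = 0.0
    for s in signal_scores:
        for n in noise_scores:
            if s > n:
                wins += 1.0
            elif s == n:
                wins += 0.5
    return wins / (len(signal_scores) * len(noise_scores))


# Hypothetical observer outputs, for illustration only.
signal_present = [0.9, 0.8, 0.6, 0.55]
signal_absent = [0.7, 0.5, 0.4, 0.3]

auc = mann_whitney_auc(signal_present, signal_absent)
print(f"AUC estimate: {auc:.3f}")
```

The double loop is quadratic in the number of observations, which is acceptable for the small sample sizes the abstract is concerned with; a rank-based formulation would be preferable for large samples.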
Mood state sub-types in adults who stutter: A prospective study.
Tran, Yvonne; Blumgart, Elaine; Craig, Ashley
2018-06-01
Many adults who stutter have elevated negative mood states like anxiety and depressive mood. Little is known about how mood states change over time. The purpose of this study was to determine the trajectories or sub-types of mood states in adults who stutter over a 6 month period, and establish factors that contribute to these sub-types. Participants included 129 adults who stutter who completed an assessment regimen at baseline, including a measure of mood states (Symptom Checklist-90-Revised). Three mood states were assessed (interpersonal sensitivity or IS, depressive mood and anxiety) once a month over 6 months. Latent class growth mixture modeling was used to establish trajectories of change in these mood states over time. Logistic regression was then used to determine factors assessed at baseline that contribute to the IS trajectories. Three-class trajectory models were accepted as the best fit for IS, depressive mood and anxiety mood sub-types. Stable and normal mood state sub-types were found, incorporating around 60% of participants. Up to 40% belonged to sub-types comprising elevated levels of negative mood states. The logistic regression was conducted only with the IS domain, and revealed four factors that significantly contributed to IS mood sub-types. Those with low perceived control, low vitality, elevated social fears and being female were more likely to belong to elevated IS classes. This research revealed mood sub-types in adults who stutter, providing direction for the treatment of stuttering. Clarification of how much stuttering influences mood sub-types versus pre-existing mood is required. Copyright © 2017 Elsevier Inc. All rights reserved.
Ultrasonographic Diagnosis of Biliary Atresia Based on a Decision-Making Tree Model.
Lee, So Mi; Cheon, Jung-Eun; Choi, Young Hun; Kim, Woo Sun; Cho, Hyun-Hae; Cho, Hyun-Hye; Kim, In-One; You, Sun Kyoung
2015-01-01
To assess the diagnostic value of various ultrasound (US) findings and to make a decision-tree model for US diagnosis of biliary atresia (BA). From March 2008 to January 2014, the following US findings were retrospectively evaluated in 100 infants with cholestatic jaundice (BA, n = 46; non-BA, n = 54): length and morphology of the gallbladder, triangular cord thickness, hepatic artery and portal vein diameters, and visualization of the common bile duct. Logistic regression analyses were performed to determine the features that would be useful in predicting BA. Conditional inference tree analysis was used to generate a decision-making tree for classifying patients into the BA or non-BA groups. Multivariate logistic regression analysis showed that abnormal gallbladder morphology and greater triangular cord thickness were significant predictors of BA (p = 0.003 and 0.001; adjusted odds ratio: 345.6 and 65.6, respectively). In the decision-making tree using conditional inference tree analysis, gallbladder morphology and triangular cord thickness (optimal cutoff value of triangular cord thickness, 3.4 mm) were also selected as significant discriminators for differential diagnosis of BA, and gallbladder morphology was the first discriminator. The diagnostic performance of the decision-making tree was excellent, with sensitivity of 100% (46/46), specificity of 94.4% (51/54), and overall accuracy of 97% (97/100). Abnormal gallbladder morphology and greater triangular cord thickness (> 3.4 mm) were the most useful predictors of BA on US. We suggest that the gallbladder morphology should be evaluated first and that triangular cord thickness should be evaluated subsequently in cases with normal gallbladder morphology.
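The diagnostic performance figures quoted in the abstract above follow directly from the reported counts (46/46, 51/54, 97/100). As a small worked check, the sketch below recomputes them; the function name `diagnostic_metrics` is hypothetical, and the confusion-matrix cells are inferred from the abstract's fractions.

```python
def diagnostic_metrics(tp, fn, tn, fp):
    """Return (sensitivity, specificity, accuracy) from confusion-matrix counts."""
    sensitivity = tp / (tp + fn)   # true positives among all diseased
    specificity = tn / (tn + fp)   # true negatives among all non-diseased
    accuracy = (tp + tn) / (tp + fn + tn + fp)
    return sensitivity, specificity, accuracy


# Cells inferred from the abstract: 46 of 46 BA cases detected,
# 51 of 54 non-BA cases correctly excluded.
sens, spec, acc = diagnostic_metrics(tp=46, fn=0, tn=51, fp=3)
print(f"sensitivity={sens:.1%} specificity={spec:.1%} accuracy={acc:.1%}")
```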
Chen, Yimin; Zhao, Ying; Feng, Linmin; Zhang, Jie; Zhang, Juanwen; Feng, Guofang
2016-04-27
Metabolic syndrome is closely associated with an increased risk for fatty liver disease morbidity and mortality. Recently, studies have reported that participants with fatty liver disease have higher serum alpha-fetoprotein levels than those without. We investigated the association between alpha-fetoprotein levels and the prevalence of metabolic syndrome in a Chinese asymptomatic population. A cross-sectional study was performed with 7,755 participants who underwent individual health examinations. Clinical and anthropometric parameters were collected and serum alpha-fetoprotein levels and other clinical and laboratory parameters were measured. Logistic regression analysis was used to examine associations between alpha-fetoprotein and metabolic syndrome. Participants with metabolic syndrome had significantly higher (p < 0.001) alpha-fetoprotein levels than those without, though all alpha-fetoprotein levels were within the reference interval. The association between the components of metabolic syndrome (central obesity, elevated blood pressure, elevated triglycerides, reduced high-density lipoprotein cholesterol, and elevated fasting plasma glucose) and alpha-fetoprotein levels was evaluated. Alpha-fetoprotein levels in the elevated triglycerides, reduced high-density lipoprotein cholesterol, and elevated fasting plasma glucose groups were significantly different (p=0.002, p < 0.001, p=0.020) compared with alpha-fetoprotein in the normal triglycerides, high-density lipoprotein cholesterol, and fasting plasma glucose groups. Logistic regression analyses showed an association between alpha-fetoprotein levels and increased risk for metabolic syndrome, the presence of reduced high-density lipoprotein cholesterol, and elevated fasting plasma glucose, but not with obesity, elevated blood pressure, or triglycerides. These results suggest a significant association between alpha-fetoprotein and metabolic syndrome.
Dietary and Physical Activity Counseling Trends in U.S. Children, 2002-2011.
Odulana, Adebowale; Basco, William T; Bishu, Kinfe G; Egede, Leonard E
2017-07-01
In 2007 and 2010, Expert Committee and U.S. Preventive Services Task Force guidelines were released, respectively, urging U.S. practitioners to deliver preventive obesity counseling for children. This study determined the frequency and evaluated predictors of receiving counseling for diet and physical activity among a national sample of children from 2002 to 2011. Data on children aged 6-17 years from the 2002-2011 Medical Expenditure Panel Surveys were analyzed in 2016. Parental responses to two questions were used to assess whether children received both dietary and exercise counseling from the provider. Children were grouped by weight category. Bivariate analyses compared the frequency of receiving counseling; logistic regression evaluated predictors of receiving counseling. The sample included 36,114 children; <50% of children received counseling. Across all time periods, children were more likely to receive counseling with increasing weight. Logistic regression models showed that obese children had greater odds of receiving counseling versus normal-weight children, even after adjusting for covariates. Additional significant positive correlates of receiving counseling were Hispanic ethnicity, living in an urban setting, and being in the highest income stratum. Being uninsured was associated with lower odds of counseling. Years 2007-2009 and 2010-2011 were associated with increased counseling versus the benchmark year category in the multivariable model. Counseling appears more likely with greater weight and increased after both guidelines in 2007 and 2010. Overall counseling rates for children remain low. Future work should focus on marginalized groups, such as racial and ethnic minorities and rural populations. Copyright © 2017 American Journal of Preventive Medicine. Published by Elsevier Inc. All rights reserved.
Phenomapping of rangelands in South Africa using time series of RapidEye data
NASA Astrophysics Data System (ADS)
Parplies, André; Dubovyk, Olena; Tewes, Andreas; Mund, Jan-Peter; Schellberg, Jürgen
2016-12-01
Phenomapping is an approach which allows the derivation of spatial patterns of vegetation phenology and rangeland productivity based on time series of vegetation indices. In our study, we propose a new spatial mapping approach which combines phenometrics derived from high resolution (HR) satellite time series with spatial logistic regression modeling to discriminate land management systems in rangelands. From the RapidEye time series for selected rangelands in South Africa, we calculated bi-weekly noise-reduced Normalized Difference Vegetation Index (NDVI) images. For the growing season of 2011-2012, we further derived principal phenology metrics such as start, end and length of growing season and related phenological variables such as amplitude, left derivative and small integral of the NDVI curve. We then mapped these phenometrics across two different tenure systems, communal and commercial, at the very detailed spatial resolution of 5 m. The result of a binary logistic regression (BLR) has shown that the amplitude and the left derivative of the NDVI curve were statistically significant. These indicators are useful to discriminate commercial from communal rangeland systems. We conclude that phenomapping combined with spatial modeling is a powerful tool that allows efficient aggregation of phenology and productivity metrics for spatially explicit analysis of the relationships of crop phenology with site conditions and management. This approach has particular potential for disaggregated and patchy environments such as in farming systems in semi-arid South Africa, where phenology varies considerably among and within years. Further, we see a strong perspective for phenomapping to support spatially explicit modelling of vegetation.
Burton, Anya; Martin, Richard M; Holly, Jeff; Lane, J Athene; Donovan, Jenny L; Hamdy, Freddie C; Neal, David E; Tilling, Kate
2013-02-01
Obesity has been associated with an increased risk of advanced and fatal prostate cancer; adipokines may mediate this association. We examined associations of the adipokines leptin and adiponectin with the stage and grade of PSA-detected prostate cancer. We conducted a nested case-control study comparing 311 men with mainly locally advanced (≥T3, N1, or M1 cases) vs. 413 men with localized (T ≤2 & NX-0 & M0 controls) PSA-detected prostate cancer, recruited 2001-2009 from 9 UK regions to the ProtecT study. Associations of body mass index and adipokine levels with prostate cancer stage were determined by conditional logistic regression and with grade (Gleason score ≥7 vs. ≤6) by unconditional logistic regression. Adiponectin was inversely associated with prostate cancer stage in overweight and obese men (OR 0.62; 95 % CI 0.42-0.90; p = 0.01), but not in normal weight men (OR 1.48; 0.77-2.82; p = 0.24) (p for interaction 0.007), or all men (OR 0.86; 0.66-1.11; p = 0.24). There was no compelling evidence of associations between leptin or leptin to adiponectin ratio and prostate cancer stage. No strong associations of adiponectin, leptin, or leptin:adiponectin ratio with grade were seen. This study provides some evidence that adiponectin levels may be associated with prostate cancer stage, dependent on the degree of adiposity of the man. Our results are consistent with adiponectin countering the adverse effects of obesity on prostate cancer progression.
[Bone mineral density in overweight and obese adolescents].
Cobayashi, Fernanda; Lopes, Luiz A; Taddei, José Augusto de A C
2005-01-01
To study bone density as a concomitant factor for obesity in post-pubertal adolescents, controlling for other variables that may interfere in such a relation. The study comprised 83 overweight and obese adolescents (BMI ≥ P85) and 89 non-obese ones (P5 ≤ BMI ≤ P85). Cases and controls were selected out of 1,420 students (aged 14-19) from a public school in the city of São Paulo. The bone mineral density of the lumbar spine (L2-L4 in g/cm²) was assessed by dual-energy x-ray absorptiometry (LUNAR™ DPX-L). The variable bone density was dichotomized using 1.194 g/cm² as cutoff point. Bivariate analyses were conducted considering the prevalence of overweight and obesity, followed by multivariate analysis (logistic regression) according to a hierarchical conceptual model. Bone density above the median was about twice as frequent among cases (69.3%) as among controls (32.1%). In the bivariate analysis such prevalence resulted in an odds ratio (OR) of 4.78. The logistic regression model showed that the association between obesity and mineral density is even stronger, with an OR of 6.65 after controlling for variables related to sedentary lifestyle and intake of milk and dairy products. Obese and overweight adolescents in the final stages of sexual maturity presented higher bone mineral density in relation to their normal-weight counterparts; however, cohort studies will be necessary to evaluate the influence of such characteristic on bone resistance in adulthood and, consequently, on the incidence of osteopenia and osteoporosis at older ages.
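The bivariate odds ratio in this abstract can be approximately reproduced from the two reported prevalences. The following is a minimal Python sketch; the function names are illustrative, and because it uses only the rounded percentages from the abstract, the result (about 4.77) is close to, but not exactly, the reported 4.78.

```python
def odds(p):
    """Convert a probability (0 < p < 1) into odds p / (1 - p)."""
    return p / (1.0 - p)


def odds_ratio(p_group, p_reference):
    """Odds ratio comparing one group's prevalence with a reference group's."""
    return odds(p_group) / odds(p_reference)


# Prevalence of bone density above the cutoff, as reported in the abstract.
p_cases = 0.693      # overweight and obese adolescents
p_controls = 0.321   # non-obese adolescents

or_bivariate = odds_ratio(p_cases, p_controls)
print(f"Approximate bivariate OR: {or_bivariate:.2f}")
```

The adjusted OR of 6.65 cannot be recovered this way, since it depends on the covariates included in the logistic model.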
Turk, Tahir; Newton, Fiona; Choudhury, Sohel; Islam, Md Shafiqul
2018-06-01
Tobacco use contributes to an estimated 14.6% of male and 5.7% of female deaths in Bangladesh. We examine the determinants of tobacco-related quit attempts among Bangladeshis with and without awareness of the synergized "People Behind the Packs" (PBTP) communication campaign used to support the introduction of pack-based graphic warning labels (GWLs) in 2016. Data from 1,796 adults were collected using multistage sampling and a cross-sectional face-to-face survey. Analyses used a normalized design weight to ensure representativeness to the national population of smokers within Bangladesh. For the overall sample, the multivariable logistic regression model revealed quit attempts were associated with having seen the pack-based GWLs, recalling ≥1 PBTP campaign message, higher levels of self-efficacy to quit, and recognizing more potential side-effects associated with using tobacco products. Conversely, the likelihood of quit attempts was lower among dual tobacco users (relative to smokers) and those using tobacco at least daily (vs. less than daily). The hierarchical multivariable logistic regression model among those aware of ≥1 PBTP campaign message indicated quit attempts were positively associated with recalling more of the campaign messages and discussing them with others. This national evaluation of pack-based GWLs and accompanying PBTP campaign within Bangladesh supports the efficacy of using synergized communication messages when introducing such labels. That quit attempts are more likely among those discussing PBTP campaign messages with others and recalling more PBTP campaign messages highlights the importance of ensuring message content is both memorable and engaging.
Chen, Carla Chia-Ming; Schwender, Holger; Keith, Jonathan; Nunkesser, Robin; Mengersen, Kerrie; Macrossan, Paula
2011-01-01
Due to advancements in computational ability, enhanced technology and a reduction in the price of genotyping, more data are being generated for understanding genetic associations with diseases and disorders. However, with the availability of large data sets comes the inherent challenges of new methods of statistical analysis and modeling. Considering a complex phenotype may be the effect of a combination of multiple loci, various statistical methods have been developed for identifying genetic epistasis effects. Among these methods, logic regression (LR) is an intriguing approach incorporating tree-like structures. Various methods have built on the original LR to improve different aspects of the model. In this study, we review four variations of LR, namely Logic Feature Selection, Monte Carlo Logic Regression, Genetic Programming for Association Studies, and Modified Logic Regression-Gene Expression Programming, and investigate the performance of each method using simulated and real genotype data. We contrast these with another tree-like approach, namely Random Forests, and a Bayesian logistic regression with stochastic search variable selection.
Regression analysis for solving diagnosis problem of children's health
NASA Astrophysics Data System (ADS)
Cherkashina, Yu A.; Gerget, O. M.
2016-04-01
This paper applies statistical techniques, namely regression analysis, to assess the health status of children in the neonatal period based on medical data (hemostatic parameters, blood test parameters, gestational age, vascular endothelial growth factor) measured at 3-5 days of life. The studied medical data are described in detail, and a binary logistic regression procedure is discussed. The main results of the research are presented. A classification table of predicted versus observed values is shown, and the overall percentage of correct classification is determined. Regression coefficients are calculated, and the general regression equation is written from them. Based on the logistic regression results, ROC analysis was performed: the sensitivity and specificity of the model were calculated and ROC curves were constructed. These mathematical techniques allow diagnosis of children's health with high recognition quality. The results make a significant contribution to the development of evidence-based medicine and are of high practical importance in the professional activity of the author.
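The workflow this abstract describes (a fitted regression equation, a classification table, and an overall percentage of correct classification) can be sketched in a few lines of Python. Everything numeric below is hypothetical: the intercept, the coefficients, the two predictors, and the five observations are made up for illustration and are not taken from the paper.

```python
import math


def predict_prob(intercept, coefs, x):
    """Logistic regression equation: p = 1 / (1 + exp(-(b0 + sum(b_i * x_i))))."""
    z = intercept + sum(b * v for b, v in zip(coefs, x))
    return 1.0 / (1.0 + math.exp(-z))


def classification_table(y_true, probs, cutoff=0.5):
    """Cross-tabulate observed outcomes against predictions at a given cutoff."""
    table = {"TP": 0, "FP": 0, "TN": 0, "FN": 0}
    for y, p in zip(y_true, probs):
        predicted = 1 if p >= cutoff else 0
        if predicted == 1:
            table["TP" if y == 1 else "FP"] += 1
        else:
            table["FN" if y == 1 else "TN"] += 1
    return table


# Hypothetical fitted model and data (illustration only).
intercept, coefs = -1.0, [0.8, -0.5]
X = [[2.0, 1.0], [0.5, 2.0], [3.0, 0.0], [1.0, 1.0], [0.0, 0.5]]
y = [1, 0, 1, 1, 0]

probs = [predict_prob(intercept, coefs, row) for row in X]
table = classification_table(y, probs)
accuracy = (table["TP"] + table["TN"]) / len(y)
sensitivity = table["TP"] / (table["TP"] + table["FN"])
specificity = table["TN"] / (table["TN"] + table["FP"])
print(table, accuracy, sensitivity, specificity)
```

Repeating the table over a range of cutoffs, rather than the single 0.5 used here, gives the sensitivity and specificity pairs from which a ROC curve is drawn.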
[Calculating Pearson residual in logistic regressions: a comparison between SPSS and SAS].
Xu, Hao; Zhang, Tao; Li, Xiao-song; Liu, Yuan-yuan
2015-01-01
To compare the results of Pearson residual calculations in logistic regression models using SPSS and SAS. We reviewed Pearson residual calculation methods, and used two sets of data to test logistic models constructed by SPSS and STATA. One model contained a small number of covariates compared to the number of observed. The other contained a similar number of covariates as the number of observed. The two software packages produced similar Pearson residual estimates when the models contained a similar number of covariates as the number of observed, but the results differed when the number of observed was much greater than the number of covariates. The two software packages produce different results of Pearson residuals, especially when the models contain a small number of covariates. Further studies are warranted.
Bharadwaj, Shruthi K; Vishnu Bhat, B; Vickneswaran, V; Adhisivam, B; Bobby, Zachariah; Habeebullah, S
2018-05-01
To measure the oxidative stress and antioxidant status in preeclamptic mother-newborn dyads and correlate them with neurodevelopmental outcome at one year of corrected age. This cohort study, conducted in a tertiary care teaching hospital in south India, included 71 preeclamptic and 72 normal mother-newborn dyads. Biochemical parameters including total antioxidant status (TAS), protein carbonyls and malondialdehyde (MDA) levels were measured in both maternal and cord blood. Infants in both groups were followed up to one year of corrected age, and neurodevelopmental assessment was done using the Developmental Assessment Scale for Indian Infants (DASII). Correlation and multivariate regression analyses were done to evaluate the oxidative stress markers in relation to neurodevelopmental outcome. All oxidative stress markers were higher in maternal and cord blood of the preeclampsia group compared to the normal group. Maternal total antioxidant status (M-TAS) was lower in the preeclampsia group than in the normal group. More neonates in the preeclampsia group were preterm and had intrauterine growth restriction (IUGR), and they had a higher incidence of morbidities such as respiratory distress syndrome (RDS) and early onset sepsis (EOS). Infants in the preeclampsia group had lower motor age, motor score and motor developmental quotient (MoDQ). On multivariate logistic regression analyses, lower M-TAS levels were strongly associated with poor neuro-motor outcomes at 1 y of corrected age. Maternal TAS with a cut-off value of 0.965 mmol/L had a sensitivity of 77.8% and specificity of 55.3% in predicting MoDQ <70 at one year corrected age in infants born to preeclamptic mothers. Oxidative stress is increased in preeclamptic mother-newborn dyads. Low maternal TAS levels are associated with poor neuro-motor outcomes. Maternal TAS in preeclampsia is useful in predicting poor motor development at one year corrected age.
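To put the reported cut-off performance in context, the sensitivity and specificity can be combined into standard summary measures. This short Python sketch uses only the two percentages quoted in the abstract; the derived quantities (Youden's J and the likelihood ratios) are not reported by the authors and are computed here purely as an illustration.

```python
# Values reported in the abstract for maternal TAS at a 0.965 mmol/L cut-off.
sensitivity = 0.778
specificity = 0.553

# Youden's J: 0 means no better than chance, 1 means perfect discrimination.
youden_j = sensitivity + specificity - 1.0

# Likelihood ratios: how much a positive or negative result shifts the odds.
lr_positive = sensitivity / (1.0 - specificity)
lr_negative = (1.0 - sensitivity) / specificity

print(f"Youden J = {youden_j:.3f}")
print(f"LR+ = {lr_positive:.2f}, LR- = {lr_negative:.2f}")
```

Under these figures a positive result raises the odds of MoDQ <70 by a factor of roughly 1.7, which is a modest shift.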
Krell-Roesch, Janina; Ruider, Hanna; Lowe, Val J; Stokin, Gorazd B; Pink, Anna; Roberts, Rosebud O; Mielke, Michelle M; Knopman, David S; Christianson, Teresa J; Machulda, Mary M; Jack, Clifford R; Petersen, Ronald C; Geda, Yonas E
2016-07-14
A key research agenda in the field of aging is the investigation of presymptomatic Alzheimer's disease (AD). Furthermore, abnormalities in brain glucose metabolism (as measured by FDG-PET) have been reported among cognitively normal elderly persons. However, little is known about the association of FDG-PET abnormalities with neuropsychiatric symptoms (NPS) in a population-based setting. Thus, we conducted a cross-sectional study derived from the ongoing population-based Mayo Clinic Study of Aging in order to examine the association between brain glucose metabolism and NPS among cognitively normal (CN) persons aged > 70 years. Participants underwent FDG-PET and completed the Neuropsychiatric Inventory Questionnaire (NPI-Q), Beck Depression Inventory (BDI), and Beck Anxiety Inventory (BAI). Cognitive classification was made by an expert consensus panel. We conducted multivariable logistic regression analyses to compute odds ratios (OR) and 95% confidence intervals after adjusting for age, sex, and education. For continuous variables, we used linear regression and Spearman rank-order correlations. Of 668 CN participants (median 78.1 years, 55.4% males), 205 had an abnormal FDG-PET (i.e., standardized uptake value ratio < 1.32 in AD-related regions). Abnormal FDG-PET was associated with depression as measured by NPI-Q (OR = 2.12; 1.23-3.64); the point estimate was further elevated for APOE ɛ4 carriers (OR = 2.59; 1.00-6.69), though only marginally significant. Additionally, we observed a significant association between abnormal FDG-PET and depressive and anxiety symptoms when treated as continuous measures. These findings indicate that NPS, even in community-based samples, can be an important additional tool to the biomarker-based investigation of presymptomatic AD.
Elshatanoufy, Solafa; Matthews, Alexandra; Yousif, Mairy; Jamil, Marcus; Gutta, Sravanthi; Gill, Harmanjit; Galvin, Shelley L; Luck, Ali M
2018-05-04
The aim of our study was to assess midurethral sling (MUS) failure rate in the morbidly obese (body mass index [BMI] ≥40 kg/m²) population as compared with normal-weight individuals. Our secondary objective was to assess the difference in complication rates. This is a retrospective cohort study. We included all patients who underwent a synthetic MUS procedure from January 1, 2008, to December 31, 2015, in our health system. Failure was defined as reported stress urinary incontinence symptoms or treatment for stress urinary incontinence. Variables collected were BMI; smoking status; comorbidities; perioperative (≤24 hours), short-term (≤30 days), and long-term (>30 days) complications; and follow-up time. Statistics include analysis of variance, χ² test, logistic regression, Kaplan-Meier method, and Cox regression. There were 431 patients included in our analysis. Forty-nine patients were in class 3 with a BMI mean of 44.9 ± 5.07 kg/m². Median follow-up time was 52 months (range, 6-119 months). Class 3 obesity (BMI ≥40 kg/m²) was the only group that had an increased risk of failure when compared with the normal-weight group (P = 0.03; odds ratio, 2.47; 95% confidence interval, 1.09-5.59). Obesity was not a significant predictor of perioperative, short-term, or long-term postoperative complications (P = 0.19, P = 0.28, and P = 0.089, respectively) after controlling for other comorbidities. Patients in the class 3 obesity group who are treated with an MUS are 2 times as likely to fail when compared with those in the normal-weight category on long-term follow-up with similar low complication rates.
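The reported odds ratio, confidence interval, and P value in this abstract can be checked against one another. The Python sketch below assumes the interval is a Wald interval that is symmetric on the log-odds scale, which is the usual logistic regression output but is not stated in the abstract; the function name is illustrative.

```python
import math


def wald_from_ci(or_hat, ci_low, ci_high, z_crit=1.959964):
    """Recover the standard error, z statistic and two-sided p-value
    from an odds ratio and its 95% CI (assumed Wald, log scale)."""
    log_or = math.log(or_hat)
    se = (math.log(ci_high) - math.log(ci_low)) / (2.0 * z_crit)
    z = log_or / se
    # Two-sided p-value under a standard normal: 2 * (1 - Phi(|z|)).
    p = math.erfc(abs(z) / math.sqrt(2.0))
    return se, z, p


# Class 3 obesity vs. normal weight, as reported in the abstract.
se, z, p = wald_from_ci(2.47, 1.09, 5.59)
print(f"SE(log OR) = {se:.3f}, z = {z:.2f}, p = {p:.3f}")
```

Under that assumption the recovered p-value rounds to 0.03, in line with the reported P = 0.03.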
Gudzune, Kimberly A; Bleich, Sara N; Richards, Thomas M; Weiner, Jonathan P; Hodges, Krista; Clark, Jeanne M
2013-07-01
Negative interactions with healthcare providers may lead patients to switch physicians or "doctor shop." We hypothesized that overweight and obese patients would be more likely to doctor shop, and as a result, have increased rates of emergency department (ED) visits and hospitalizations as compared to normal weight nonshoppers. We combined claims data from a health plan in one state with information from beneficiaries' health risk assessments. The primary outcome was "doctor shopping," which we defined as having outpatient claims with ≥5 different primary care physicians (PCPs) during a 24-month period. The independent variable was standard NIH categories of weight by BMI. We performed multivariate logistic regression to evaluate the association between weight categories and doctor shopping. We conducted multivariate zero-inflated negative binomial regression to evaluate the association between weight-doctor shopping categories and counts of ED visits and hospitalizations. Of the 20,726 beneficiaries, the mean BMI was 26.3 kg/m² (SD 5.1), mean age was 44.4 years (SD 11.1) and 53% were female. As compared to normal weight beneficiaries, overweight beneficiaries had 23% greater adjusted odds of doctor shopping (OR 1.23, 95%CI 1.04-1.46) and obese beneficiaries had 52% greater adjusted odds of doctor shopping (OR 1.52, 95%CI 1.26-1.82). As compared to normal weight non-shoppers, overweight and obese shoppers had higher rates of ED visits (IRR 1.85, 95%CI 1.37-2.45; IRR 1.83, 95%CI 1.34-2.50, respectively), which persisted in within-weight-group comparisons (Overweight IRR 1.50, 95%CI 1.10-2.03; Obese IRR 1.54, 95%CI 1.12-2.11). Frequently changing PCPs may impair continuity and result in increased healthcare utilization. Copyright © 2012 The Obesity Society.
Alderete, E; Bejarano, I; Rodríguez, A
2015-12-07
Sugar-sweetened beverages (SSB) are thought to play an important role in weight gain. We examined the relationship between the intake of caloric and noncaloric beverages (SSB and water) and the nutritional status of children. In 2014, we randomly selected 16 public health clinics in four cities of Northwest Argentina and conducted a survey among mothers of children 0-6 years of age. Children's beverage intake was ascertained by 24-h dietary recall provided by the mothers. Children's weight and height measures were obtained from clinic registries. We calculated the body mass index using the International Obesity Task Force standards. The analysis included 562 children 25 months to 6 years of age with normal or above normal nutritional status. Children's beverage consumption was as follows: water 81.8%, carbonated soft drinks (CSD) 49.7%, coffee/tea/cocoa 44.0%, artificial fruit drinks 35.6%, flavored water 17.9%, natural fruit juice 14.5%. In multivariate logistic regression models the likelihood of being obese v. being overweight or having normal weight doubled with an intake of one to five glasses of CSD (OR=2.2) and increased by more than three-fold with an intake of more than five glasses (OR=3.5). Drinking more than five glasses of water reduced the likelihood of being obese to less than half (OR=0.3). The percentage of children drinking more than five glasses of other beverages was low (0.9-3.3%) and regression models did not yield significant results. The study contributed evidence for reducing children's CSD intake and for promoting water consumption, together with the implementation of comprehensive regulatory public health policies.
Yan, Qun; Sun, Dongmei; Li, Xu; Chen, Guoliang; Zheng, Qinghu; Li, Lun; Gu, Chenhong; Feng, Bo
2016-07-13
There is a scarcity of epidemiological research examining the relationship between blood pressure (BP) and glucose level among older adults. The objective of the current study was to investigate the association of high BP and glucose level in elderly Chinese. A cross-sectional study of a population of 2092 Chinese individuals aged over 65 years was conducted. Multiple logistic analysis was used to explore the association between hypertension and hyperglycemia. Independent risk factors for systolic and diastolic BP were analyzed using stepwise linear regression. Subjects in the impaired fasting glucose (IFG) group (n = 144) and the diabetes group (n = 346), as compared with normal fasting glucose (NFG) (n = 1277), had a significantly higher risk for hypertension, with odds ratios (ORs) of 1.81 (95 % CI, 1.39-2.35) (P = 0.000) and 1.40 (95 % CI, 1.09-1.80) (P = 0.009), respectively. Higher fasting plasma glucose (FPG) levels in the normal range were still significantly associated with a higher prevalence of hypertension in both genders, with ORs of 1.24 (95 % CI, 0.85-1.80), R² = 0.114, P = 0.023 in men and 1.61 (95 % CI, 1.12-2.30), R² = 0.082, P = 0.010 in women, respectively, when compared with lower FPG. Linear regression analysis revealed FPG was an independent factor of systolic and diastolic BP. Our findings suggest that hyperglycemia as well as higher FPG within the normal range is associated with a higher prevalence of hypertension independent of other cardiovascular risk factors in elderly Chinese. Further studies are needed to explore the relationship between hyperglycemia and hypertension in a longitudinal setting.
Greeven, Anja; van Balkom, Anton J L M; Spinhoven, Philip
2014-05-01
We aimed to investigate whether personality characteristics predict time to remission and psychiatric status. The follow-up was at most 6 years and was performed within the scope of a randomized controlled trial that investigated the efficacy of cognitive behavioral therapy, paroxetine, and placebo in hypochondriasis. The Life Chart Interview was administered to investigate for each year if remission had occurred. Personality was assessed at pretest by the Abbreviated Dutch Temperament and Character Inventory. Cox's regression models for recurrent events were compared with logistic regression models. Sixteen (36.4%) of 44 patients achieved remission during the follow-up period. Cox's regression yielded approximately the same results as the logistic regression. Being less harm avoidant and more cooperative were associated with a shorter time to remission and a remitted state after the follow-up period. Personality variables seem to be relevant for describing patients with a more chronic course of hypochondriacal complaints.
Goo, Yeong-Jia James; Shen, Zone-De
2014-01-01
As fraudulent financial statements by enterprises grow more serious with each passing day, establishing a valid model for forecasting fraudulent financial statements has become an important question for academic research and financial practice. After screening the important variables using stepwise regression, the study uses logistic regression, support vector machine, and decision tree methods to construct classification models for comparison. The study adopts financial and nonfinancial variables to help establish the forecasting model. The research objects are companies with fraudulent and nonfraudulent financial statements between 1998 and 2012. The findings are that financial and nonfinancial information can be used effectively to distinguish fraudulent financial statements, and that the decision tree C5.0 has the best classification accuracy, 85.71%. PMID:25302338
Steen, Paul J.; Passino-Reader, Dora R.; Wiley, Michael J.
2006-01-01
As a part of the Great Lakes Regional Aquatic Gap Analysis Project, we evaluated methodologies for modeling associations between fish species and habitat characteristics at a landscape scale. To do this, we created brook trout Salvelinus fontinalis presence and absence models based on four different techniques: multiple linear regression, logistic regression, neural networks, and classification trees. The models were tested in two ways: by application to an independent validation database and cross-validation using the training data, and by visual comparison of statewide distribution maps with historically recorded occurrences from the Michigan Fish Atlas. Although differences in the accuracy of our models were slight, the logistic regression model predicted with the least error, followed by multiple regression, then classification trees, then the neural networks. These models will provide natural resource managers a way to identify habitats requiring protection for the conservation of fish species.
Ngo, Long H; Inouye, Sharon K; Jones, Richard N; Travison, Thomas G; Libermann, Towia A; Dillon, Simon T; Kuchel, George A; Vasunilashorn, Sarinnapha M; Alsop, David C; Marcantonio, Edward R
2017-06-06
The nested case-control study (NCC) design within a prospective cohort study is used when outcome data are available for all subjects, but the exposure of interest has not been collected, and is difficult or prohibitively expensive to obtain for all subjects. A NCC analysis with good matching procedures yields estimates that are as efficient and unbiased as estimates from the full cohort study. We present methodological considerations in a matched NCC design and analysis, which include the choice of match algorithms, analysis methods to evaluate the association of exposures of interest with outcomes, and consideration of overmatching. Matched NCC design within a longitudinal observational prospective cohort study in the setting of two academic hospitals. Study participants are patients aged over 70 years who underwent scheduled major non-cardiac surgery. The primary outcome was postoperative delirium from in-hospital interviews and medical record review. The main exposure was IL-6 concentration (pg/ml) from blood sampled at three time points before delirium occurred. We used the nonparametric signed-rank test to test the median of the paired differences. We used conditional logistic regression to model the association of IL-6 with delirium incidence. Simulation was used to generate a sample of cohort data on which unconditional multivariable logistic regression was used, and the results were compared to those of the conditional logistic regression. Partial R-square was used to assess the level of overmatching. We found that the optimal match algorithm yielded more matched pairs than the greedy algorithm. The choice of analytic strategy (whether to consider measured cytokine levels as the predictor or the outcome) yielded inferences that have different clinical interpretations but similar levels of statistical significance. Estimation results from NCC design using conditional logistic regression, and from simulated cohort design using unconditional logistic regression, were similar. We found minimal evidence for overmatching. Using a matched NCC approach introduces methodological challenges into the study design and data analysis. Nonetheless, with careful selection of the match algorithm, match factors, and analysis methods, this design is cost effective and, for our study, yields estimates that are similar to those from a prospective cohort study design.
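The conditional logistic regression used for matched pairs has a simple likelihood that is easy to write down. The Python sketch below is a generic illustration for 1:1 matched pairs with a single exposure, not the authors' analysis: the pair counts are hypothetical, and the exposure is taken as binary so that the maximum-likelihood odds ratio has the well-known closed form (ratio of the two kinds of discordant pairs).

```python
import math


def conditional_loglik(beta, diffs):
    """Conditional log-likelihood for 1:1 matched case-control pairs.

    diffs[i] is the exposure of the case minus the exposure of its matched
    control. Each pair contributes log(1 / (1 + exp(-beta * diff))), so
    pairs with diff == 0 (concordant) carry no information about beta.
    """
    return -sum(math.log1p(math.exp(-beta * d)) for d in diffs)


# Hypothetical matched pairs with a binary exposure:
#   12 pairs where only the case is exposed      (diff = +1)
#    4 pairs where only the control is exposed   (diff = -1)
#    5 concordant pairs                          (diff =  0)
diffs = [1] * 12 + [-1] * 4 + [0] * 5

# For a binary exposure the conditional MLE is the log of the ratio of
# discordant pair counts.
beta_hat = math.log(12 / 4)
print(f"Conditional OR estimate: {math.exp(beta_hat):.2f}")
print(f"Log-likelihood at the estimate: {conditional_loglik(beta_hat, diffs):.3f}")
```

For a continuous exposure such as a cytokine concentration the same function applies with real-valued differences, but the estimate then has to be found numerically rather than in closed form.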
DOE Office of Scientific and Technical Information (OSTI.GOV)
Liu, Shujie; Kawamoto, Taisuke; Morita, Osamu
Chemical exposure often results in liver hypertrophy in animal tests, characterized by increased liver weight, hepatocellular hypertrophy, and/or cell proliferation. While most of these changes are considered adaptive responses, there is concern that they may be associated with carcinogenesis. In this study, we have employed a toxicogenomic approach using a logistic ridge regression model to identify genes responsible for liver hypertrophy and hypertrophic hepatocarcinogenesis and to develop a predictive model for assessing hypertrophy-inducing compounds. Logistic regression models have previously been used in the quantification of epidemiological risk factors. DNA microarray data from the Toxicogenomics Project-Genomics Assisted Toxicity Evaluation System were used to identify hypertrophy-related genes that are expressed differently in hypertrophy induced by carcinogens and non-carcinogens. Data were collected for 134 chemicals (72 non-hypertrophy-inducing chemicals, 27 hypertrophy-inducing non-carcinogenic chemicals, and 15 hypertrophy-inducing carcinogenic compounds). After applying logistic ridge regression analysis, 35 genes for liver hypertrophy (e.g., Acot1 and Abcc3) and 13 genes for hypertrophic hepatocarcinogenesis (e.g., Asns and Gpx2) were selected. The predictive models built using these genes were 94.8% and 82.7% accurate, respectively. Pathway analysis of the genes indicates that, aside from a xenobiotic metabolism-related pathway as an adaptive response for liver hypertrophy, amino acid biosynthesis and oxidative responses appear to be involved in hypertrophic hepatocarcinogenesis. Early detection and toxicogenomic characterization of liver hypertrophy using our models may be useful for predicting carcinogenesis. In addition, the identified genes provide novel insight into discrimination between adverse hypertrophy associated with carcinogenesis and adaptive hypertrophy in risk assessment. 
- Highlights: • Hypertrophy (H) and hypertrophic carcinogenesis (C) were studied by toxicogenomics. • Important genes for H and C were selected by logistic ridge regression analysis. • Amino acid biosynthesis and oxidative responses may be involved in C. • Predictive models for H and C provided 94.8% and 82.7% accuracy, respectively. • The identified genes could be useful for assessment of liver hypertrophy.
Zhang, Xingyu; Kim, Joyce; Patzer, Rachel E; Pitts, Stephen R; Patzer, Aaron; Schrager, Justin D
2017-10-26
To describe and compare logistic regression and neural network modeling strategies to predict hospital admission or transfer following initial presentation to Emergency Department (ED) triage with and without the addition of natural language processing elements. Using data from the National Hospital Ambulatory Medical Care Survey (NHAMCS), a cross-sectional probability sample of United States EDs from 2012 and 2013 survey years, we developed several predictive models with the outcome being admission to the hospital or transfer vs. discharge home. We included patient characteristics immediately available after the patient has presented to the ED and undergone a triage process. We used this information to construct logistic regression (LR) and multilayer neural network models (MLNN) which included natural language processing (NLP) and principal component analysis from the patient's reason for visit. Ten-fold cross validation was used to test the predictive capacity of each model, and the area under the receiver operating characteristic curve (AUC) was then calculated for each model. Of the 47,200 ED visits from 642 hospitals, 6,335 (13.42%) resulted in hospital admission (or transfer). A total of 48 principal components were extracted by NLP from the reason for visit fields, which explained 75% of the overall variance for hospitalization. In the model including only structured variables, the AUC was 0.824 (95% CI 0.818-0.830) for logistic regression and 0.823 (95% CI 0.817-0.829) for MLNN. Models including only free-text information generated AUC of 0.742 (95% CI 0.731-0.753) for logistic regression and 0.753 (95% CI 0.742-0.764) for MLNN. When both structured variables and free text variables were included, the AUC reached 0.846 (95% CI 0.839-0.853) for logistic regression and 0.844 (95% CI 0.836-0.852) for MLNN. 
The predictive accuracy of hospital admission or transfer for patients who presented to ED triage overall was good, and was improved with the inclusion of free text data from a patient's reason for visit regardless of modeling approach. Natural language processing and neural networks that incorporate patient-reported outcome free text may increase predictive accuracy for hospital admission.
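The AUC values compared in this abstract have a direct probabilistic reading: the AUC is the chance that a randomly chosen admitted patient receives a higher predicted score than a randomly chosen discharged patient. The Python sketch below computes it by pairwise comparison on a tiny made-up example; the labels and scores are hypothetical, and the quadratic-time loop is for clarity rather than for survey-sized data.

```python
def auc(labels, scores):
    """Area under the ROC curve via pairwise comparison.

    Counts the fraction of (positive, negative) pairs in which the positive
    case has the higher score; ties count as half a win.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative case")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical outcomes (1 = admitted, 0 = discharged) and model scores.
labels = [0, 0, 1, 1]
scores = [0.1, 0.4, 0.35, 0.8]
print(auc(labels, scores))  # 3 of the 4 positive-negative pairs are ordered correctly
```

In a ten-fold cross-validation setting, a function like this would be applied to the held-out predictions of each fold, so that every score comes from a model that did not see that patient during fitting.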
Cevenini, Gabriele; Barbini, Emanuela; Scolletta, Sabino; Biagioli, Bonizella; Giomarelli, Pierpaolo; Barbini, Paolo
2007-11-22
Popular predictive models for estimating morbidity probability after heart surgery are compared critically in a unitary framework. The study is divided into two parts. In the first part modelling techniques and intrinsic strengths and weaknesses of different approaches were discussed from a theoretical point of view. In this second part the performances of the same models are evaluated in an illustrative example. Eight models were developed: Bayes linear and quadratic models, k-nearest neighbour model, logistic regression model, Higgins and direct scoring systems and two feed-forward artificial neural networks with one and two layers. Cardiovascular, respiratory, neurological, renal, infectious and hemorrhagic complications were defined as morbidity. Training and testing sets each of 545 cases were used. The optimal set of predictors was chosen among a collection of 78 preoperative, intraoperative and postoperative variables by a stepwise procedure. Discrimination and calibration were evaluated by the area under the receiver operating characteristic curve and Hosmer-Lemeshow goodness-of-fit test, respectively. Scoring systems and the logistic regression model required the largest set of predictors, while Bayesian and k-nearest neighbour models were much more parsimonious. In testing data, all models showed acceptable discrimination capacities, however the Bayes quadratic model, using only three predictors, provided the best performance. All models showed satisfactory generalization ability: again the Bayes quadratic model exhibited the best generalization, while artificial neural networks and scoring systems gave the worst results. Finally, poor calibration was obtained when using scoring systems, k-nearest neighbour model and artificial neural networks, while Bayes (after recalibration) and logistic regression models gave adequate results. 
Although all the predictive models showed acceptable discrimination performance in the example considered, the Bayes and logistic regression models seemed better than the others, because they also had good generalization and calibration. The Bayes quadratic model seemed to be a convincing alternative to the much more usual Bayes linear and logistic regression models. It showed its capacity to identify a minimum core of predictors generally recognized as essential to pragmatically evaluate the risk of developing morbidity after heart surgery.
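As an illustration of the discrimination measure used above, the area under the ROC curve can be read as the probability that a randomly chosen patient with morbidity receives a higher predicted risk than a randomly chosen patient without it. The sketch below computes it by pairwise comparison; the function name and the scores are hypothetical and are not data from the study.

```python
def auc(scores_pos, scores_neg):
    """Area under the ROC curve via pairwise comparison.

    Counts the fraction of (positive, negative) pairs in which the
    positive case has the higher score; ties count as half a win.
    """
    wins = 0.0
    for p in scores_pos:
        for n in scores_neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(scores_pos) * len(scores_neg))


# Hypothetical predicted morbidity probabilities (illustrative only).
with_morbidity = [0.9, 0.7, 0.6, 0.4]
without_morbidity = [0.5, 0.3, 0.2, 0.1]
print(auc(with_morbidity, without_morbidity))
```

The quadratic loop is fine for small test sets such as the one described; for large samples a rank-based computation would be the usual choice.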
NASA Astrophysics Data System (ADS)
Ozdemir, Adnan
2011-07-01
Summary: The purpose of this study is to produce a groundwater spring potential map of the Sultan Mountains in central Turkey, based on a logistic regression method within a Geographic Information System (GIS) environment. Using field surveys, the locations of the springs (440 springs) were determined in the study area. In this study, 17 spring-related factors were used in the analysis: geology, relative permeability, land use/land cover, precipitation, elevation, slope, aspect, total curvature, plan curvature, profile curvature, wetness index, stream power index, sediment transport capacity index, distance to drainage, distance to fault, drainage density, and fault density map. The coefficients of the predictor variables were estimated using binary logistic regression analysis and were used to calculate the groundwater spring potential for the entire study area. The accuracy of the final spring potential map was evaluated based on the observed springs. The accuracy of the model was evaluated by calculating the relative operating characteristics. The area value of the relative operating characteristic curve model was found to be 0.82. These results indicate that the model is a good estimator of the spring potential in the study area. The spring potential map shows that the areas of very low, low, moderate and high groundwater spring potential classes are 105.586 km² (28.99%), 74.271 km² (19.906%), 101.203 km² (27.14%), and 90.05 km² (24.671%), respectively. The interpretations of the potential map showed that stream power index, relative permeability of lithologies, geology, elevation, aspect, wetness index, plan curvature, and drainage density play major roles in spring occurrence and distribution in the Sultan Mountains. The logistic regression approach has not yet been used to delineate groundwater potential zones. In this study, the logistic regression method was used to locate potential zones for groundwater springs in the Sultan Mountains. 
The evolved model was found to be in strong agreement with the available groundwater spring test data. Hence, this method can be used routinely in groundwater exploration under favourable conditions.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Yan, H; Chen, Z; Nath, R
Purpose: kV fluoroscopic imaging combined with MV treatment beam imaging has been investigated for intrafractional motion monitoring and correction. It is, however, subject to additional kV imaging dose to normal tissue. To balance tracking accuracy and imaging dose, we previously proposed an adaptive imaging strategy to dynamically decide future imaging type and moments based on motion tracking uncertainty. kV imaging may be used continuously for maximal accuracy or only when the position uncertainty (probability of out of threshold) is high if a preset imaging dose limit is considered. In this work, we propose more accurate methods to estimate tracking uncertainty through analyzing acquired data in real-time. Methods: We simulated the motion tracking process based on a previously developed imaging framework (MV + initial seconds of kV imaging) using real-time breathing data from 42 patients. Motion tracking errors for each time point were collected together with the time point’s corresponding features, such as tumor motion speed and 2D tracking error of previous time points, etc. We tested three methods for error uncertainty estimation based on the features: conditional probability distribution, logistic regression modeling, and support vector machine (SVM) classification to detect errors exceeding a threshold. Results: For conditional probability distribution, polynomial regressions on three features (previous tracking error, prediction quality, and cosine of the angle between the trajectory and the treatment beam) showed strong correlation with the variation (uncertainty) of the mean 3D tracking error and its standard deviation: R-square = 0.94 and 0.90, respectively. The logistic regression and SVM classification successfully identified about 95% of tracking errors exceeding a 2.5 mm threshold. 
Conclusion: The proposed methods can reliably estimate the motion tracking uncertainty in real-time, which can be used to guide adaptive additional imaging to confirm the tumor is within the margin or initialize motion compensation if it is out of the margin.
A VARI-Based Relative Greenness from MODIS Data for Computing the Fire Potential Index
NASA Technical Reports Server (NTRS)
Schneider, P.; Roberts, D. A.; Kyriakidis, P. C.
2008-01-01
The Fire Potential Index (FPI) relies on relative greenness (RG) estimates from remote sensing data. The Normalized Difference Vegetation Index (NDVI), derived from NOAA Advanced Very High Resolution Radiometer (AVHRR) imagery, is currently used to calculate RG operationally. Here we evaluated an alternate measure of RG using the Visible Atmospheric Resistant Index (VARI) derived from Moderate Resolution Imaging Spectrometer (MODIS) data. VARI was chosen because it has previously been shown to have the strongest relationship with Live Fuel Moisture (LFM) out of a wide selection of MODIS-derived indices in southern California shrublands. To compare MODIS-based NDVI-FPI and VARI-FPI, RG was calculated from a 6-year time series of MODIS composites and validated against in-situ observations of LFM as a surrogate for vegetation greenness. RG from both indices was then compared in terms of its performance for computing the FPI using historical wildfire data. Computed RG values were regressed against ground-sampled LFM at 14 sites within Los Angeles County. The results indicate the VARI-based RG consistently shows a stronger relationship with observed LFM than NDVI-based RG. With an average R² of 0.727 compared to a value of only 0.622 for NDVI-RG, VARI-RG showed stronger relationships at 13 out of 14 sites. Based on these results, daily FPI maps were computed for the years 2001 through 2005 using both NDVI-RG and VARI-RG. These were then validated against 12,490 fire detections from the MODIS active fire product using logistic regression. Deviance of the logistic regression model was 408.8 for NDVI-FPI and 176.2 for VARI-FPI. The c-index was found to be 0.69 and 0.78, respectively. The results show that VARI-FPI outperforms NDVI-FPI in distinguishing between fire and no-fire events for historical wildfire data in southern California for the given time period.
Prevalence and Evolution of Renal Impairment in People Living With HIV in Rural Tanzania.
Mapesi, Herry; Kalinjuma, Aneth V; Ngerecha, Alphonce; Franzeck, Fabian; Hatz, Christoph; Tanner, Marcel; Mayr, Michael; Furrer, Hansjakob; Battegay, Manuel; Letang, Emilio; Weisser, Maja; Glass, Tracy R
2018-04-01
We assessed the prevalence, incidence, and predictors of renal impairment among people living with HIV (PLWHIV) in rural Tanzania. In a cohort of PLWHIV aged ≥15 years enrolled from January 2013 to June 2016, we assessed the association between renal impairment (estimated glomerular filtration rate < 90 mL/min/1.73 m²) at enrollment and during follow-up with demographic and clinical characteristics using logistic regression and Cox proportional hazards models. Of 1093 PLWHIV, 172 (15.7%) had renal impairment at enrollment. Of 921 patients with normal renal function at baseline, 117 (12.7%) developed renal impairment during a median follow-up (interquartile range) of 6.2 (0.4-14.7) months. The incidence of renal impairment was 110 cases per 1000 person-years (95% confidence interval [CI], 92-132). At enrollment, logistic regression identified older age (adjusted odds ratio [aOR], 1.79; 95% CI, 1.52-2.11), hypertension (aOR, 1.84; 95% CI, 1.08-3.15), CD4 count <200 cells/mm³ (aOR, 1.80; 95% CI, 1.23-2.65), and World Health Organization (WHO) stage III/IV (aOR, 3.00; 95% CI, 1.96-4.58) as risk factors for renal impairment. Cox regression model confirmed older age (adjusted hazard ratio [aHR], 1.85; 95% CI, 1.56-2.20) and CD4 count <200 cells/mm³ (aHR, 2.05; 95% CI, 1.36-3.09) to be associated with the development of renal impairment. Our study found a low prevalence of renal impairment among PLWHIV despite high usage of tenofovir and its association with age, hypertension, low CD4 count, and advanced WHO stage. These important and reassuring safety data stress the significance of noncommunicable disease surveillance in aging HIV populations in sub-Saharan Africa.
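The adjusted odds ratios above are exponentiated logistic regression coefficients, and their Wald confidence intervals are symmetric on the log scale. As a rough sketch (the function names are hypothetical, and a Wald-type interval with z = 1.96 is an assumption, since the abstract does not state how the intervals were computed), the coefficient and standard error can be backed out of a reported odds ratio and interval, and the interval rebuilt from them:

```python
import math


def coef_from_or(odds_ratio, ci_low, ci_high, z=1.96):
    """Recover a log-odds coefficient and its standard error
    from a reported odds ratio and Wald confidence interval."""
    beta = math.log(odds_ratio)
    # The CI width on the log scale equals 2 * z * SE.
    se = (math.log(ci_high) - math.log(ci_low)) / (2 * z)
    return beta, se


def or_with_ci(beta, se, z=1.96):
    """Odds ratio and Wald CI from a coefficient and standard error."""
    return (math.exp(beta),
            math.exp(beta - z * se),
            math.exp(beta + z * se))


# Reported aOR for older age: 1.79 (95% CI, 1.52-2.11).
beta, se = coef_from_or(1.79, 1.52, 2.11)
print(round(beta, 3), round(se, 3))
print(tuple(round(v, 2) for v in or_with_ci(beta, se)))
```

Because published values are rounded, the rebuilt interval only matches the reported one to about two decimal places.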
Independent Prognostic Factors for Acute Organophosphorus Pesticide Poisoning.
Tang, Weidong; Ruan, Feng; Chen, Qi; Chen, Suping; Shao, Xuebo; Gao, Jianbo; Zhang, Mao
2016-07-01
Acute organophosphorus pesticide poisoning (AOPP) is becoming a significant problem and a potential cause of human mortality because of the abuse of organophosphate compounds. This study aims to determine the independent prognostic factors of AOPP by using multivariate logistic regression analysis. The clinical data for 71 subjects with AOPP admitted to our hospital were retrospectively analyzed. This information included the Acute Physiology and Chronic Health Evaluation II (APACHE II) scores, 6-h post-admission blood lactate levels, post-admission 6-h lactate clearance rates, admission blood cholinesterase levels, 6-h post-admission blood cholinesterase levels, cholinesterase activity, blood pH, and other factors. Univariate analysis and multivariate logistic regression analyses were conducted to identify all prognostic factors and independent prognostic factors, respectively. A receiver operating characteristic curve was plotted to analyze the testing power of independent prognostic factors. Twelve of 71 subjects died. Admission blood lactate levels, 6-h post-admission blood lactate levels, post-admission 6-h lactate clearance rates, blood pH, and APACHE II scores were identified as prognostic factors for AOPP according to the univariate analysis, whereas only 6-h post-admission blood lactate levels, post-admission 6-h lactate clearance rates, and blood pH were independent prognostic factors identified by multivariate logistic regression analysis. The receiver operating characteristic analysis suggested that post-admission 6-h lactate clearance rates were of moderate diagnostic value. High 6-h post-admission blood lactate levels, low blood pH, and low post-admission 6-h lactate clearance rates were independent prognostic factors identified by multivariate logistic regression analysis. Copyright © 2016 by Daedalus Enterprises.
Sargolzaie, Narjes; Miri-Moghaddam, Ebrahim
2014-01-01
The most common differential diagnosis of β-thalassemia (β-thal) trait is iron deficiency anemia. Several red blood cell equations were introduced during different studies for differential diagnosis between β-thal trait and iron deficiency anemia. Due to genetic variations in different regions, these equations cannot be useful in all population. The aim of this study was to determine a native equation with high accuracy for differential diagnosis of β-thal trait and iron deficiency anemia for the Sistan and Baluchestan population by logistic regression analysis. We selected 77 iron deficiency anemia and 100 β-thal trait cases. We used binary logistic regression analysis and determined best equations for probability prediction of β-thal trait against iron deficiency anemia in our population. We compared diagnostic values and receiver operative characteristic (ROC) curve related to this equation and another 10 published equations in discriminating β-thal trait and iron deficiency anemia. The binary logistic regression analysis determined the best equation for best probability prediction of β-thal trait against iron deficiency anemia with area under curve (AUC) 0.998. Based on ROC curves and AUC, Green & King, England & Frazer, and then Sirdah indices, respectively, had the most accuracy after our equation. We suggest that to get the best equation and cut-off in each region, one needs to evaluate specific information of each region, specifically in areas where populations are homogeneous, to provide a specific formula for differentiating between β-thal trait and iron deficiency anemia.
Selenium in irrigated agricultural areas of the western United States
Nolan, B.T.; Clark, M.L.
1997-01-01
A logistic regression model was developed to predict the likelihood that Se exceeds the USEPA chronic criterion for aquatic life (5 µg/L) in irrigated agricultural areas of the western USA. Preliminary analysis of explanatory variables used in the model indicated that surface-water Se concentration increased with increasing dissolved solids (DS) concentration and with the presence of Upper Cretaceous, mainly marine sediment. The presence or absence of Cretaceous sediment was the major variable affecting Se concentration in surface-water samples from the National Irrigation Water Quality Program. Median Se concentration was 14 µg/L in samples from areas underlain by Cretaceous sediments and < 1 µg/L in samples from areas underlain by non-Cretaceous sediments. Wilcoxon rank sum tests indicated that elevated Se concentrations in samples from areas with Cretaceous sediments, irrigated areas, and from closed lakes and ponds were statistically significant. Spearman correlations indicated that Se was positively correlated with a binary geology variable (0.64) and DS (0.45). Logistic regression models indicated that the concentration of Se in surface water was almost certain to exceed the Environmental Protection Agency aquatic-life chronic criterion of 5 µg/L when DS was greater than 3000 mg/L in areas with Cretaceous sediments. The 'best' logistic regression model correctly predicted Se exceedances and nonexceedances 84.4% of the time, and model sensitivity was 80.7%. A regional map of Cretaceous sediment showed the location of potential problem areas. The map and logistic regression model are tools that can be used to determine the potential for Se contamination of irrigated agricultural areas in the western USA.
Fang, Xingang; Bagui, Sikha; Bagui, Subhash
2017-08-01
The readily available high throughput screening (HTS) data from the PubChem database provides an opportunity for mining of small molecules in a variety of biological systems using machine learning techniques. From the thousands of available molecular descriptors developed to encode useful chemical information representing the characteristics of molecules, descriptor selection is an essential step in building an optimal quantitative structural-activity relationship (QSAR) model. For the development of a systematic descriptor selection strategy, we need the understanding of the relationship between: (i) the descriptor selection; (ii) the choice of the machine learning model; and (iii) the characteristics of the target bio-molecule. In this work, we employed the Signature descriptor to generate a dataset on the Human kallikrein 5 (hK 5) inhibition confirmatory assay data and compared multiple classification models including logistic regression, support vector machine, random forest and k-nearest neighbor. Under optimal conditions, the logistic regression model provided extremely high overall accuracy (98%) and precision (90%), with good sensitivity (65%) in the cross validation test. In testing the primary HTS screening data with more than 200K molecular structures, the logistic regression model exhibited the capability of eliminating more than 99.9% of the inactive structures. As part of our exploration of the descriptor-model-target relationship, the excellent predictive performance of the combination of the Signature descriptor and the logistic regression model on the assay data of the Human kallikrein 5 (hK 5) target suggested a feasible descriptor/model selection strategy on similar targets. Copyright © 2017 Elsevier Ltd. All rights reserved.
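The combination of very high accuracy with more modest sensitivity reported above is typical of screening data in which inactive structures greatly outnumber actives. A minimal sketch of how the three metrics are derived from a confusion matrix follows; the function name and the counts are hypothetical and chosen only to illustrate the pattern, not taken from the assay.

```python
def classification_metrics(tp, fp, tn, fn):
    """Accuracy, precision and sensitivity from confusion-matrix counts."""
    total = tp + fp + tn + fn
    return {
        "accuracy": (tp + tn) / total,    # all correct calls
        "precision": tp / (tp + fp),      # predicted actives that are active
        "sensitivity": tp / (tp + fn),    # true actives that were found
    }


# Hypothetical imbalanced screen: 20 actives among 100 structures.
m = classification_metrics(tp=13, fp=2, tn=78, fn=7)
print(m)
```

With these made-up counts, accuracy stays above 90% even though roughly a third of the actives are missed, which is why sensitivity and precision are reported alongside it.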
Prenatal Lead Exposure and Fetal Growth: Smaller Infants Have Heightened Susceptibility
Rodosthenous, Rodosthenis S.; Burris, Heather H.; Svensson, Katherine; Amarasiriwardena, Chitra J.; Cantoral, Alejandra; Schnaas, Lourdes; Mercado-García, Adriana; Coull, Brent A.; Wright, Robert O.; Téllez-Rojo, Martha M.; Baccarelli, Andrea A.
2016-01-01
Background As population lead levels decrease, the toxic effects of lead may be distributed to more sensitive populations, such as infants with poor fetal growth. Objectives To determine the association of prenatal lead exposure and fetal growth; and to evaluate whether infants with poor fetal growth are more susceptible to lead toxicity than those with normal fetal growth. Methods We examined the association of second trimester maternal blood lead levels (BLL) with birthweight-for-gestational age (BWGA) z-score in 944 mother-infant participants of the PROGRESS cohort. We determined the association between maternal BLL and BWGA z-score by using both linear and quantile regression. We estimated odds ratios for small-for-gestational age (SGA) infants between maternal BLL quartiles using logistic regression. Maternal age, body mass index, socioeconomic status, parity, household smoking exposure, hemoglobin levels, and infant sex were included as confounders. Results While linear regression showed a negative association between maternal BLL and BWGA z-score (β=−0.06 z-score units per log2 BLL increase; 95% CI: −0.13, 0.003; P=0.06), quantile regression revealed larger magnitudes of this association in the <30th percentiles of BWGA z-score (β range [−0.08, −0.13] z-score units per log2 BLL increase; all P values <0.05). Mothers in the highest BLL quartile had an odds ratio of 1.62 (95% CI: 0.99–2.65) for having a SGA infant compared to the lowest BLL quartile. Conclusions While both linear and quantile regression showed a negative association between prenatal lead exposure and birthweight, quantile regression revealed that smaller infants may represent a more susceptible subpopulation. PMID:27923585
Prenatal lead exposure and fetal growth: Smaller infants have heightened susceptibility.
Rodosthenous, Rodosthenis S; Burris, Heather H; Svensson, Katherine; Amarasiriwardena, Chitra J; Cantoral, Alejandra; Schnaas, Lourdes; Mercado-García, Adriana; Coull, Brent A; Wright, Robert O; Téllez-Rojo, Martha M; Baccarelli, Andrea A
2017-02-01
As population lead levels decrease, the toxic effects of lead may be distributed to more sensitive populations, such as infants with poor fetal growth. To determine the association of prenatal lead exposure and fetal growth; and to evaluate whether infants with poor fetal growth are more susceptible to lead toxicity than those with normal fetal growth. We examined the association of second trimester maternal blood lead levels (BLL) with birthweight-for-gestational age (BWGA) z-score in 944 mother-infant participants of the PROGRESS cohort. We determined the association between maternal BLL and BWGA z-score by using both linear and quantile regression. We estimated odds ratios for small-for-gestational age (SGA) infants between maternal BLL quartiles using logistic regression. Maternal age, body mass index, socioeconomic status, parity, household smoking exposure, hemoglobin levels, and infant sex were included as confounders. While linear regression showed a negative association between maternal BLL and BWGA z-score (β=-0.06 z-score units per log 2 BLL increase; 95% CI: -0.13, 0.003; P=0.06), quantile regression revealed larger magnitudes of this association in the <30th percentiles of BWGA z-score (β range [-0.08, -0.13] z-score units per log 2 BLL increase; all P values<0.05). Mothers in the highest BLL quartile had an odds ratio of 1.62 (95% CI: 0.99-2.65) for having a SGA infant compared to the lowest BLL quartile. While both linear and quantile regression showed a negative association between prenatal lead exposure and birthweight, quantile regression revealed that smaller infants may represent a more susceptible subpopulation. Copyright © 2016 Elsevier Ltd. All rights reserved.
Ileus in children presenting with diarrhea and severe acute malnutrition: A chart review
Shahid, Abu SMSB; Shahunja, K. M.; Bardhan, Pradip Kumar; Faruque, Abu Syeed Golam; Shahrin, Lubaba; Das, Sumon Kumar; Barua, Dipesh Kumar; Hossain, Md Iqbal; Ahmed, Tahmeed
2017-01-01
Background Severely malnourished children aged under five years requiring hospital admission for diarrheal illness frequently develop ileus during hospitalization with often fatal outcomes. However, there is no data on risk factors and outcome of ileus in such children. We intended to evaluate predictive factors for ileus during hospitalization and their outcomes. Methodology/Principal findings This was a retrospective chart review that enrolled severely malnourished children under five years old with diarrhea, admitted to the Dhaka Hospital of the International Centre for Diarrhoeal Disease Research, Bangladesh between April 2011 and August 2012. We used an electronic database to abstract charts of children previously admitted to the hospital. The clinical and laboratory characteristics of children with (cases = 45), and without ileus (controls = 261) were compared. Cases were first identified by observation of abnormal bowel sounds on physical examination and confirmed with abdominal radiographs. For this comparison, Chi-square test was used to measure the difference in proportion, Student’s t-test to calculate the difference in mean for normally distributed data and Mann-Whitney test for data that were not normally distributed. Finally, in identifying independent risk factors for ileus, logistic regression analysis was performed. Ileus was defined if a child developed abdominal distension and had hyperactive or sluggish or absent bowel sound and a radiologic evidence of abdominal gas-fluid level during hospitalization. Logistic regression analysis adjusting for potential confounders revealed that the independent risk factors for admission for ileus were reluctance to feed (odds ratio [OR] = 3.22, 95% confidence interval [CI] = 1.24–8.39, p = 0.02), septic shock (OR = 3.62, 95% CI = 1.247–8.95, p<0.01), and hypokalemia (OR = 1.99, 95% CI = 1.03–3.86, p = 0.04). Mortality was significantly higher in cases compared to controls (22% vs. 8%, p<0.01) in univariate analysis; however, in multivariable regression analysis, after adjusting for potential confounders such as septic shock, no association was found between ileus and death (OR = 2.05, 95% CI = 0.68–6.14, p = 0.20). In a separate regression analysis model, after adjusting for potential confounders such as ileus, reluctance to feed, hypokalemia, hypocalcemia, and blood transfusion, septic shock (OR = 168.84, 95% CI = 19.27–1479.17, p<0.01) emerged as the only independent predictor of death in severely malnourished diarrheal children. Conclusions/Significance This study suggests that the identification of simple independent admission risk factors for ileus and risk factors for death in hospitalized severely malnourished diarrheal children may prompt clinicians to be more vigilant in managing these conditions, especially in resource-limited settings in order to decrease ileus and ileus-related fatal outcomes in such children. PMID:28493871
Ileus in children presenting with diarrhea and severe acute malnutrition: A chart review.
Chisti, Mohammod Jobayer; Shahid, Abu Smsb; Shahunja, K M; Bardhan, Pradip Kumar; Faruque, Abu Syeed Golam; Shahrin, Lubaba; Das, Sumon Kumar; Barua, Dipesh Kumar; Hossain, Md Iqbal; Ahmed, Tahmeed
2017-05-01
Severely malnourished children aged under five years requiring hospital admission for diarrheal illness frequently develop ileus during hospitalization with often fatal outcomes. However, there is no data on risk factors and outcome of ileus in such children. We intended to evaluate predictive factors for ileus during hospitalization and their outcomes. This was a retrospective chart review that enrolled severely malnourished children under five years old with diarrhea, admitted to the Dhaka Hospital of the International Centre for Diarrhoeal Disease Research, Bangladesh between April 2011 and August 2012. We used an electronic database to abstract charts of children previously admitted to the hospital. The clinical and laboratory characteristics of children with (cases = 45), and without ileus (controls = 261) were compared. Cases were first identified by observation of abnormal bowel sounds on physical examination and confirmed with abdominal radiographs. For this comparison, Chi-square test was used to measure the difference in proportion, Student's t-test to calculate the difference in mean for normally distributed data and Mann-Whitney test for data that were not normally distributed. Finally, in identifying independent risk factors for ileus, logistic regression analysis was performed. Ileus was defined if a child developed abdominal distension and had hyperactive or sluggish or absent bowel sound and a radiologic evidence of abdominal gas-fluid level during hospitalization. Logistic regression analysis adjusting for potential confounders revealed that the independent risk factors for admission for ileus were reluctance to feed (odds ratio [OR] = 3.22, 95% confidence interval [CI] = 1.24-8.39, p = 0.02), septic shock (OR = 3.62, 95% CI = 1.247-8.95, p<0.01), and hypokalemia (OR = 1.99, 95% CI = 1.03-3.86, p = 0.04). Mortality was significantly higher in cases compared to controls (22% vs. 8%, p<0.01) in univariate analysis; however, in multivariable regression analysis, after adjusting for potential confounders such as septic shock, no association was found between ileus and death (OR = 2.05, 95% CI = 0.68-6.14, p = 0.20). In a separate regression analysis model, after adjusting for potential confounders such as ileus, reluctance to feed, hypokalemia, hypocalcemia, and blood transfusion, septic shock (OR = 168.84, 95% CI = 19.27-1479.17, p<0.01) emerged as the only independent predictor of death in severely malnourished diarrheal children. This study suggests that the identification of simple independent admission risk factors for ileus and risk factors for death in hospitalized severely malnourished diarrheal children may prompt clinicians to be more vigilant in managing these conditions, especially in resource-limited settings in order to decrease ileus and ileus-related fatal outcomes in such children.
Non-ignorable missingness in logistic regression.
Wang, Joanna J J; Bartlett, Mark; Ryan, Louise
2017-08-30
Nonresponses and missing data are common in observational studies. Ignoring or inadequately handling missing data may lead to biased parameter estimation, incorrect standard errors and, as a consequence, incorrect statistical inference and conclusions. We present a strategy for modelling non-ignorable missingness where the probability of nonresponse depends on the outcome. Using a simple case of logistic regression, we quantify the bias in regression estimates and show the observed likelihood is non-identifiable under non-ignorable missing data mechanism. We then adopt a selection model factorisation of the joint distribution as the basis for a sensitivity analysis to study changes in estimated parameters and the robustness of study conclusions against different assumptions. A Bayesian framework for model estimation is used as it provides a flexible approach for incorporating different missing data assumptions and conducting sensitivity analysis. Using simulated data, we explore the performance of the Bayesian selection model in correcting for bias in a logistic regression. We then implement our strategy using survey data from the 45 and Up Study to investigate factors associated with worsening health from the baseline to follow-up survey. Our findings have practical implications for the use of the 45 and Up Study data to answer important research questions relating to health and quality-of-life. Copyright © 2017 John Wiley & Sons, Ltd.
Prediction model for the return to work of workers with injuries in Hong Kong.
Xu, Yanwen; Chan, Chetwyn C H; Lo, Karen Hui Yu-Ling; Tang, Dan
2008-01-01
This study attempts to formulate a prediction model of return to work for a group of workers who have been suffering from chronic pain and physical injury while also being out of work in Hong Kong. The study used Case-based Reasoning (CBR) method, and compared the result with the statistical method of logistic regression model. The database of the algorithm of CBR was composed of 67 cases who were also used in the logistic regression model. The testing cases were 32 participants who had a similar background and characteristics to those in the database. The methods of setting constraints and Euclidean distance metric were used in CBR to search the closest cases to the trial case based on the matrix. The usefulness of the algorithm was tested on 32 new participants, and the accuracy of predicting return to work outcomes was 62.5%, which was no better than the 71.2% accuracy derived from the logistic regression model. The results of the study would enable us to have a better understanding of the CBR applied in the field of occupational rehabilitation by comparing with the conventional regression analysis. The findings would also shed light on the development of relevant interventions for the return-to-work process of these workers.
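The retrieval step described above, finding the stored cases closest to a trial case by Euclidean distance, can be sketched as a nearest-case lookup followed by a majority vote. This is only an illustration under stated assumptions: the feature values, the choice of k = 3, the voting rule, and the function names are hypothetical, and the constraint-setting step mentioned in the abstract is omitted.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two equal-length feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def predict_by_nearest_cases(case_base, query, k=3):
    """Predict an outcome (1 = returned to work, 0 = did not) by
    majority vote among the k stored cases closest to the query."""
    ranked = sorted(case_base, key=lambda case: euclidean(case[0], query))
    nearest = ranked[:k]
    votes = sum(outcome for _, outcome in nearest)
    return 1 if votes * 2 > len(nearest) else 0


# Hypothetical case base: (normalized features, outcome).
case_base = [
    ((0.20, 0.10), 1),
    ((0.30, 0.20), 1),
    ((0.80, 0.90), 0),
    ((0.70, 0.80), 0),
    ((0.25, 0.30), 1),
]
print(predict_by_nearest_cases(case_base, (0.30, 0.25)))
```

Features should be on comparable scales before distances are computed; otherwise a single wide-ranging variable dominates the match.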
Ensemble of trees approaches to risk adjustment for evaluating a hospital's performance.
Liu, Yang; Traskin, Mikhail; Lorch, Scott A; George, Edward I; Small, Dylan
2015-03-01
A commonly used method for evaluating a hospital's performance on an outcome is to compare the hospital's observed outcome rate to the hospital's expected outcome rate given its patient (case) mix and service. The process of calculating the hospital's expected outcome rate given its patient mix and service is called risk adjustment (Iezzoni 1997). Risk adjustment is critical for accurately evaluating and comparing hospitals' performances since we would not want to unfairly penalize a hospital just because it treats sicker patients. The key to risk adjustment is accurately estimating the probability of an Outcome given patient characteristics. For cases with binary outcomes, the method that is commonly used in risk adjustment is logistic regression. In this paper, we consider ensemble of trees methods as alternatives for risk adjustment, including random forests and Bayesian additive regression trees (BART). Both random forests and BART are modern machine learning methods that have been shown recently to have excellent performance for prediction of outcomes in many settings. We apply these methods to carry out risk adjustment for the performance of neonatal intensive care units (NICU). We show that these ensemble of trees methods outperform logistic regression in predicting mortality among babies treated in NICU, and provide a superior method of risk adjustment compared to logistic regression.
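One common way to express the comparison described above is an observed-to-expected (O/E) ratio, where the expected count is the sum of each patient's predicted outcome probability from the risk model (logistic regression, random forest, or BART). The sketch below assumes those probabilities are already available; the function name and the numbers are hypothetical, and the abstract does not state that this exact ratio was used.

```python
def observed_to_expected(outcomes, predicted_probs):
    """Ratio of observed events to events expected under the risk model.

    outcomes: 0/1 outcome per patient treated at the hospital.
    predicted_probs: model-predicted outcome probability per patient.
    """
    observed = sum(outcomes)
    expected = sum(predicted_probs)
    return observed / expected


# Hypothetical hospital with five patients (illustrative only).
outcomes = [1, 0, 0, 1, 0]
predicted = [0.5, 0.25, 0.25, 0.5, 0.5]
print(observed_to_expected(outcomes, predicted))
```

A ratio above 1 means more events than the patient mix predicts, and a ratio below 1 means fewer; the quality of the comparison depends entirely on how well the predicted probabilities are calibrated.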
Ludvigsson, Jonas F; Reichenberg, Abraham; Hultman, Christina M; Murray, Joseph A
2013-11-01
Most case reports suggest an association between autistic spectrum disorders (ASDs) and celiac disease (CD) or positive CD serologic test results, but larger studies are contradictory. To examine the association between ASDs and CD according to small intestinal histopathologic findings. Nationwide case-control study in Sweden. Through 28 Swedish biopsy registers, we collected data about 26,995 individuals with CD (equal to villous atrophy, Marsh stage 3), 12,304 individuals with inflammation (Marsh stages 1-2), and 3719 individuals with normal mucosa (Marsh stage 0) but positive CD serologic test results (IgA/IgG gliadin, endomysium, or tissue transglutaminase) and compared them with 213,208 age- and sex-matched controls. Conditional logistic regression estimated odds ratios (ORs) for having a prior diagnosis of an ASD according to the Swedish National Patient Register. In another analysis, we used the Cox proportional hazards regression model to estimate hazard ratios (HRs) for future ASDs in individuals undergoing small intestinal biopsy. A prior ASD was not associated with CD (OR, 0.93; 95% CI, 0.51-1.68) or inflammation (OR 1.03; 95% CI, 0.40-2.64) but was associated with a markedly increased risk of having a normal mucosa but a positive CD serologic test result (OR, 4.57; 95% CI, 1.58-13.22). Restricting our data to individuals without a diagnosis of an ASD at the time of biopsy, CD (HR, 1.39; 95% CI, 1.13-1.71) and inflammation (HR, 2.01; 95% CI, 1.29-3.13) were both associated with moderate excess risks of later ASDs, whereas the HR for later ASDs in individuals with normal mucosa but positive CD serologic test results was 3.09 (95% CI, 1.99-4.80). Although this study found no association between CD or inflammation and earlier ASDs, there was a markedly increased risk of ASDs in individuals with normal mucosa but a positive CD serologic test result.
Güleç, Hüseyin; Sayar, Kemal; Yazici Güleç, Medine
2007-01-01
The aim of this study was to examine whether cognitive factors, such as attributions, expectations, and anger management style, contribute to the decision to seek medical care for fibromyalgia syndrome (FMS). We recruited 3 groups of subjects: patients from an FMS tertiary care setting, community residents with FMS who had not sought medical care for their FMS symptoms (nonpatients), and healthy controls. In all, 38 FMS nonpatients were compared to 37 FMS patients and 41 healthy controls on measures of anxiety, depression, anger, locus of control (LOC), attributions, pain intensity, and disability, as well as demographic characteristics. The prevalence of FMS nonpatients was 2%. There was a significant difference between the 3 groups on the measures of anxiety, depression, LOC, and somatic and normalizing subscale scores of the symptom interpretation questionnaire (SIQ). FMS nonpatients, relative to FMS patients and healthy controls, were characterized by a significantly higher measure of both LOC and normalizing subscale score on the SIQ. There were no differences between the 2 FMS groups in demographic characteristics and other psychometric measures. A hierarchical logistic regression model showed that the number of tender points, normalizing attribution style, and depression were independent predictors of help-seeking behavior. The rate of psychiatric and medical history was not related to FMS. Expectations and a normalizing attribution style may contribute to help-seeking behavior for FMS.
Erosive Esophagitis in the Obese: The Effect of Ethnicity and Gender on Its Association.
Abraham, Albin; Lipka, Seth; Hajar, Rabab; Krishnamachari, Bhuma; Virdi, Ravi; Jacob, Bobby; Viswanathan, Prakash; Mustacchia, Paul
2016-01-01
Background. Data examining the association between obesity and erosive esophagitis (ErE) have been inconsistent, with very little known about interracial variation. Goals. To examine the association between obesity and ErE among patients of different ethnic/racial backgrounds. Methods. The study sample included 2251 patients who underwent esophagogastroduodenoscopy (EGD). The effects of body mass index (BMI) on ErE were assessed by gender and in different ethnic groups. Odds ratios (ORs) and 95% confidence intervals (CIs) were calculated using multivariate logistic regression analysis. Results. The prevalence of ErE was 29.4% (661/2251). Overweight and obese subjects were significantly more likely to have ErE than individuals with a normal BMI, with the highest risk seen in the morbidly obese (OR 6.26; 95% CI 3.82-10.28; p < 0.0001). Normal weight Black patients were less likely to have ErE as compared to Caucasians (OR 0.46; 95% CI 0.27-0.79; p = 0.005), while the odds ratio comparing normal weight Hispanics to normal weight Whites was not statistically significant. No effect modification was seen between BMI and race/ethnicity or BMI and gender. Significant trends were seen in each gender and ethnicity. Conclusions. The effect of BMI on ErE does not appear to vary by race/ethnicity or gender.
Hosseinifard, Behshad; Moradi, Mohammad Hassan; Rostami, Reza
2013-03-01
Diagnosing depression in the early curable stages is very important and may even save the life of a patient. In this paper, we study nonlinear analysis of the EEG signal for discriminating depression patients from normal controls. Forty-five unmedicated depressed patients and 45 normal subjects participated in this study. Power of four EEG bands and four nonlinear features, including detrended fluctuation analysis (DFA), Higuchi fractal, correlation dimension and Lyapunov exponent, were extracted from the EEG signal. For discriminating the two groups, k-nearest neighbor, linear discriminant analysis and logistic regression (LR) are then used as classifiers. The highest classification accuracy of 83.3% is obtained by correlation dimension and the LR classifier among the nonlinear features. For further improvement, all nonlinear features are combined and applied to the classifiers. A classification accuracy of 90% is achieved by all nonlinear features and the LR classifier. In all experiments, a genetic algorithm is employed to select the most important features. The proposed technique is compared and contrasted with the other reported methods and it is demonstrated that by combining nonlinear features, the performance is enhanced. This study shows that nonlinear analysis of EEG can be a useful method for discriminating depressed patients and normal subjects. It is suggested that this analysis may be a complementary tool to help psychiatrists diagnose depressed patients.
Simental-Mendía, Luis E; Hernández-Ronquillo, Gabriela; Gómez-Díaz, Rita; Rodríguez-Morán, Martha; Guerrero-Romero, Fernando
2017-12-01
Background: Given the usefulness of the product of triglycerides and glucose (TyG) to recognize individuals at high risk for developing cardiovascular events, the aim of this study was to determine whether the TyG index is associated with the presence of cardiovascular risk factors in apparently healthy normal-weight children and adolescents. Methods: Apparently healthy children and adolescents with normal weight, aged 6-15 years, were enrolled in a population-based cross-sectional study. The children were allocated into groups with and without cardiovascular risk factors. Cardiovascular risk factors were considered as the occurrence of at least one of the following: elevated blood pressure, hypertriglyceridemia, low high-density lipoprotein cholesterol (HDL-C), or hyperglycemia. Results: A total of 2,117 children and adolescents were enrolled in the study; of them, 1,078 (50.9%) participants exhibited cardiovascular risk. The adjusted logistic regression analysis showed that elevated TyG index was significantly associated with hypertriglyceridemia (odds ratio (OR)=96.45, 95% confidence interval (CI): 48.44-192.04), low HDL-C (OR=2.07, 95% CI: 1.46-2.92), and hyperglycemia (OR=3.11, 95% CI: 2.05-4.72), but not with elevated blood pressure (OR=1.39, 95% CI: 0.89-2.16). Conclusion: The elevated TyG index is associated with the presence of cardiovascular risk factors in healthy normal-weight children and adolescents.
Snell, Kym Ie; Ensor, Joie; Debray, Thomas Pa; Moons, Karel Gm; Riley, Richard D
2017-01-01
If individual participant data are available from multiple studies or clusters, then a prediction model can be externally validated multiple times. This allows the model's discrimination and calibration performance to be examined across different settings. Random-effects meta-analysis can then be used to quantify overall (average) performance and heterogeneity in performance. This typically assumes a normal distribution of 'true' performance across studies. We conducted a simulation study to examine this normality assumption for various performance measures relating to a logistic regression prediction model. We simulated data across multiple studies with varying degrees of variability in baseline risk or predictor effects and then evaluated the shape of the between-study distribution in the C-statistic, calibration slope, calibration-in-the-large, and E/O statistic, and possible transformations thereof. We found that a normal between-study distribution was usually reasonable for the calibration slope and calibration-in-the-large; however, the distributions of the C-statistic and E/O were often skewed across studies, particularly in settings with large variability in the predictor effects. Normality was vastly improved when using the logit transformation for the C-statistic and the log transformation for E/O, and therefore we recommend these scales to be used for meta-analysis. An illustrated example is given using a random-effects meta-analysis of the performance of QRISK2 across 25 general practices.
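The recommendation to pool C-statistics on the logit scale can be illustrated with a short sketch. It assumes each study reports a C-statistic and its standard error, moves the standard error to the logit scale with the delta method, and uses the DerSimonian-Laird moment estimator for the between-study variance; that estimator is an assumption chosen for brevity and may not be the one used in the study.

```python
import math

def logit(p):
    return math.log(p / (1.0 - p))

def inv_logit(x):
    return 1.0 / (1.0 + math.exp(-x))

def pool_c_statistics(c_stats, std_errs):
    """Random-effects pooling of C-statistics on the logit scale.

    Returns (pooled C-statistic, between-study variance on the logit scale).
    """
    y = [logit(c) for c in c_stats]
    # Delta method: se(logit(c)) is approximately se(c) / (c * (1 - c)).
    v = [(se / (c * (1.0 - c))) ** 2 for c, se in zip(c_stats, std_errs)]
    w = [1.0 / vi for vi in v]
    fixed = sum(wi * yi for wi, yi in zip(w, y)) / sum(w)
    # Cochran's Q and the DerSimonian-Laird estimate of tau^2.
    q = sum(wi * (yi - fixed) ** 2 for wi, yi in zip(w, y))
    scale = sum(w) - sum(wi ** 2 for wi in w) / sum(w)
    tau2 = max(0.0, (q - (len(y) - 1)) / scale) if scale > 0 else 0.0
    w_re = [1.0 / (vi + tau2) for vi in v]
    pooled = sum(wi * yi for wi, yi in zip(w_re, y)) / sum(w_re)
    return inv_logit(pooled), tau2
```

Back-transforming with the inverse logit keeps the pooled estimate inside the (0, 1) range, which is one practical reason to work on this scale.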
Belshaw, N J; Elliott, G O; Foxall, R J; Dainty, J R; Pal, N; Coupe, A; Garg, D; Bradburn, D M; Mathers, J C; Johnson, I T
2008-07-08
Aberrant CpG island (CGI) methylation occurs early in colorectal neoplasia. Quantitative methylation-specific PCR profiling applied to biopsies was used to quantify low levels of CGI methylation of 18 genes in the morphologically normal colonic mucosa of neoplasia-free subjects, adenomatous polyp patients, cancer patients and their tumours. Multivariate statistical analyses distinguished tumour from mucosa with a sensitivity of 78.9% and a specificity of 100% (P=3 x 10(-7)). In morphologically normal mucosa, age-dependent CGI methylation was observed for APC, AXIN2, DKK1, HPP1, N33, p16, SFRP1, SFRP2 and SFRP4 genes, and significant differences in CGI methylation levels were detected between groups. Multinomial logistic regression models based on the CGI methylation profiles from normal mucosa correctly identified 78.9% of cancer patients and 87.9% of non-cancer (neoplasia-free+polyp) patients (P=4.93 x 10(-7)) using APC, HPP1, p16, SFRP4, WIF1 and ESR1 methylation as the most informative variables. Similarly, CGI methylation of SFRP4, SFRP5 and WIF1 correctly identified 61.5% of polyp patients and 78.9% of neoplasia-free subjects (P=0.0167). The apparently normal mucosal field of patients presenting with neoplasia has evidently undergone significant epigenetic modification. Methylation of the genes selected by the models may play a role in the earliest stages of the development of colorectal neoplasia.
NASA Astrophysics Data System (ADS)
Jokar Arsanjani, Jamal; Helbich, Marco; Kainz, Wolfgang; Darvishi Boloorani, Ali
2013-04-01
This research analyses the suburban expansion in the metropolitan area of Tehran, Iran. A hybrid model consisting of logistic regression model, Markov chain (MC), and cellular automata (CA) was designed to improve the performance of the standard logistic regression model. Environmental and socio-economic variables dealing with urban sprawl were operationalised to create a probability surface of spatiotemporal states of built-up land use for the years 2006, 2016, and 2026. For validation, the model was evaluated by means of relative operating characteristic values for different sets of variables. The approach was calibrated for 2006 by cross comparing of actual and simulated land use maps. The achieved outcomes represent a match of 89% between simulated and actual maps of 2006, which was satisfactory to approve the calibration process. Thereafter, the calibrated hybrid approach was implemented for forthcoming years. Finally, future land use maps for 2016 and 2026 were predicted by means of this hybrid approach. The simulated maps illustrate a new wave of suburban development in the vicinity of Tehran at the western border of the metropolis during the next decades.
A statistical method for predicting seizure onset zones from human single-neuron recordings
NASA Astrophysics Data System (ADS)
Valdez, André B.; Hickman, Erin N.; Treiman, David M.; Smith, Kris A.; Steinmetz, Peter N.
2013-02-01
Objective. Clinicians often use depth-electrode recordings to localize human epileptogenic foci. To advance the diagnostic value of these recordings, we applied logistic regression models to single-neuron recordings from depth-electrode microwires to predict seizure onset zones (SOZs). Approach. We collected data from 17 epilepsy patients at the Barrow Neurological Institute and developed logistic regression models to calculate the odds of observing SOZs in the hippocampus, amygdala and ventromedial prefrontal cortex, based on statistics such as the burst interspike interval (ISI). Main results. Analysis of these models showed that, for a single-unit increase in burst ISI ratio, the left hippocampus was approximately 12 times more likely to contain a SOZ; and the right amygdala, 14.5 times more likely. Our models were most accurate for the hippocampus bilaterally (at 85% average sensitivity), and performance was comparable with current diagnostics such as electroencephalography. Significance. Logistic regression models can be combined with single-neuron recording to predict likely SOZs in epilepsy patients being evaluated for resective surgery, providing an automated source of clinically useful information.
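The "12 times more likely" statement is an odds ratio, so a short sketch of how a logistic regression coefficient maps to an odds ratio, and how an odds ratio changes a probability, may help. The function names and the baseline probability used below are illustrative assumptions and are not taken from the study.

```python
import math

def odds_ratio(beta, delta=1.0):
    """Odds ratio for a change of `delta` units in a predictor with coefficient beta."""
    return math.exp(beta * delta)

def apply_odds_ratio(prob, ratio):
    """Probability obtained after multiplying the odds of `prob` by `ratio`."""
    odds = prob / (1.0 - prob)
    new_odds = odds * ratio
    return new_odds / (1.0 + new_odds)
```

For example, a coefficient of `math.log(12)` gives an odds ratio of 12, and applying it to a hypothetical baseline probability of 0.5 yields 12/13. The odds are multiplied by 12, not the probability itself.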
Gazolla, Fernanda Mussi; Neves Bordallo, Maria Alice; Madeira, Isabel Rey; de Miranda Carvalho, Cecilia Noronha; Vieira Monteiro, Alexandra Maria; Pinheiro Rodrigues, Nádia Cristina; Borges, Marcos Antonio; Collett-Solberg, Paulo Ferrez; Muniz, Bruna Moreira; de Oliveira, Cecilia Lacroix; Pinheiro, Suellen Martins; de Queiroz Ribeiro, Rebeca Mathias
2015-05-01
Early exposure to cardiovascular risk factors creates a chronic inflammatory state that could damage the endothelium, followed by thickening of the carotid intima-media. To investigate the association of cardiovascular risk factors and thickening of the carotid intima-media in prepubertal children. In this cross-sectional study, carotid intima-media thickness (cIMT) and cardiovascular risk factors were assessed in 129 prepubertal children aged from 5 to 10 years. Association was assessed by simple and multivariate logistic regression analyses. In simple logistic regression analyses, body mass index (BMI) z-score, waist circumference, and systolic blood pressure (SBP) were positively associated with increased left, right, and average cIMT, whereas diastolic blood pressure was positively associated only with increased left and average cIMT (p<0.05). In multivariate logistic regression analyses, increased left cIMT was positively associated with BMI z-score and SBP, and increased average cIMT was only positively associated with SBP (p<0.05). BMI z-score and SBP were the strongest risk factors for increased cIMT.
New machine-learning algorithms for prediction of Parkinson's disease
NASA Astrophysics Data System (ADS)
Mandal, Indrajit; Sairam, N.
2014-03-01
This article presents an enhanced prediction accuracy for the diagnosis of Parkinson's disease (PD), to prevent the delay and misdiagnosis of patients, using the proposed robust inference system. New machine-learning methods are proposed and performance comparisons are based on specificity, sensitivity, accuracy and other measurable parameters. The robust methods of treating Parkinson's disease (PD) include sparse multinomial logistic regression, rotation forest ensemble with support vector machines and principal components analysis, artificial neural networks, and boosting methods. A new ensemble method comprising the Bayesian network optimised by Tabu search algorithm as classifier and Haar wavelets as projection filter is used for relevant feature selection and ranking. The highest accuracy obtained by linear logistic regression and sparse multinomial logistic regression is 100%, with sensitivity and specificity of 0.983 and 0.996, respectively. All the experiments are conducted at 95% and 99% confidence levels and the results are established with corrected t-tests. This work shows a high degree of advancement in software reliability and quality of the computer-aided diagnosis system and experimentally shows best results with supportive statistical inference.
Landslide Hazard Mapping in Rwanda Using Logistic Regression
NASA Astrophysics Data System (ADS)
Piller, A.; Anderson, E.; Ballard, H.
2015-12-01
Landslides in the United States cause more than $1 billion in damages and 50 deaths per year (USGS 2014). Globally, figures are much more grave, yet monitoring, mapping and forecasting of these hazards are less than adequate. Seventy-five percent of the population of Rwanda earns a living from farming, mostly subsistence. Loss of farmland, housing, or life, to landslides is a very real hazard. Landslides in Rwanda have an impact at the economic, social, and environmental level. In a developing nation that faces challenges in tracking, cataloging, and predicting the numerous landslides that occur each year, satellite imagery and spatial analysis allow for remote study. We have focused on the development of a landslide inventory and a statistical methodology for assessing landslide hazards. Using logistic regression on approximately 30 test variables (i.e. slope, soil type, land cover, etc.) and a sample of over 200 landslides, we determine which variables are statistically most relevant to landslide occurrence in Rwanda. A preliminary predictive hazard map for Rwanda has been produced, using the variables selected from the logistic regression analysis.
Evaluating penalized logistic regression models to predict Heat-Related Electric grid stress days
DOE Office of Scientific and Technical Information (OSTI.GOV)
Bramer, L. M.; Rounds, J.; Burleyson, C. D.
Understanding the conditions associated with stress on the electricity grid is important in the development of contingency plans for maintaining reliability during periods when the grid is stressed. In this paper, heat-related grid stress and the relationship with weather conditions is examined using data from the eastern United States. Penalized logistic regression models were developed and applied to predict stress on the electric grid using weather data. The inclusion of other weather variables, such as precipitation, in addition to temperature improved model performance. Several candidate models and datasets were examined. A penalized logistic regression model fit at the operation-zone level was found to provide predictive value and interpretability. Additionally, the importance of different weather variables observed at different time scales was examined. Maximum temperature and precipitation were identified as important across all zones while the importance of other weather variables was zone specific. The methods presented in this work are extensible to other regions and can be used to aid in planning and development of the electrical grid.
GIS-based rare events logistic regression for mineral prospectivity mapping
NASA Astrophysics Data System (ADS)
Xiong, Yihui; Zuo, Renguang
2018-02-01
Mineralization is a special type of singularity event, and can be considered as a rare event, because within a specific study area the number of prospective locations (1s) is considerably smaller than the number of non-prospective locations (0s). In this study, GIS-based rare events logistic regression (RELR) was used to map the mineral prospectivity in the southwestern Fujian Province, China. An odds ratio was used to measure the relative importance of the evidence variables with respect to mineralization. The results suggest that formations, granites, and skarn alterations, followed by faults and aeromagnetic anomaly, are the most important indicators for the formation of Fe-related mineralization in the study area. The prediction rate and the area under the curve (AUC) values show that areas with higher probability have a strong spatial relationship with the known mineral deposits. Comparing the results with original logistic regression (OLR) demonstrates that the GIS-based RELR performs better than OLR. The prospectivity map obtained in this study benefits the search for skarn Fe-related mineralization in the study area.
Sun, Shi-Guang; Li, Zi-Feng; Xie, Yan-Ming; Liu, Jian; Lu, Yan; Song, Yi-Fei; Han, Ying-Hua; Liu, Li-Da; Peng, Ting-Ting
2013-09-01
Rationalizing clinical use and ensuring safety are among the key issues in the surveillance of traditional Chinese medicine injections (TCMIs). In this 2011 study, 240 medical records of patients who had been discharged following treatment with TCMIs between 1 and 12 months previously were randomly selected from hospital records. Consistency between clinical use and the description of TCMIs was evaluated. Research on drug use and adverse drug reactions/events using logistic regression analysis was carried out. There was poor consistency between clinical use and best practice advised in manuals on TCMIs. Over-dosage and overly concentrated administration of TCMIs occurred, with the outcome of modifying properties of the blood. Logistic regression analysis showed that drug concentration was a valid predictor for both adverse drug reactions/events and benefits associated with TCMIs. Surveillance of rational clinical use and safety of TCMIs finds that clinical use should be consistent with technical drug manual specifications, and drug use should draw on multi-layered logistic regression analysis research to help avoid adverse drug reactions/events.
NASA Astrophysics Data System (ADS)
Oh, Hyun-Joo; Lee, Saro; Chotikasathien, Wisut; Kim, Chang Hwan; Kwon, Ju Hyoung
2009-04-01
For predictive landslide susceptibility mapping, this study applied and verified a probability model (the frequency ratio) and a statistical model (logistic regression) at Pechabun, Thailand, using a geographic information system (GIS) and remote sensing. Landslide locations were identified in the study area from interpretation of aerial photographs and field surveys, and maps of the topography, geology and land cover were compiled into a spatial database. The factors that influence landslide occurrence, such as slope gradient, slope aspect and curvature of topography and distance from drainage, were calculated from the topographic database. Lithology and distance from fault were extracted and calculated from the geology database. Land cover was classified from a Landsat TM satellite image. The frequency ratio and logistic regression coefficients were overlaid for landslide susceptibility mapping as each factor’s ratings. Then the landslide susceptibility map was verified and compared using the existing landslide locations. In the verification, the frequency ratio model showed 76.39% and the logistic regression model showed 70.42% prediction accuracy. The method can be used to reduce hazards associated with landslides and to plan land cover.
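The frequency ratio named in this abstract is commonly defined, per factor class, as the share of landslide cells falling in the class divided by the share of all cells in the class. The sketch below assumes that definition and uses hypothetical class names and cell counts; it is not the authors' GIS workflow.

```python
def frequency_ratios(class_cells, class_landslides):
    """Frequency ratio for each class of one factor (e.g. slope gradient bins).

    class_cells: mapping from class name to total number of grid cells.
    class_landslides: mapping from class name to number of landslide cells.
    """
    total_cells = sum(class_cells.values())
    total_slides = sum(class_landslides.values())
    ratios = {}
    for name, n_cells in class_cells.items():
        slide_share = class_landslides.get(name, 0) / total_slides
        area_share = n_cells / total_cells
        ratios[name] = slide_share / area_share
    return ratios
```

A ratio above 1 marks a class in which landslides are over-represented relative to its area. Summing the ratios of the classes that a cell belongs to, across factors, is one common way to form a susceptibility index.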
Wang, Shuang; Zhang, Yuchen; Dai, Wenrui; Lauter, Kristin; Kim, Miran; Tang, Yuzhe; Xiong, Hongkai; Jiang, Xiaoqian
2016-01-01
Motivation: Genome-wide association studies (GWAS) have been widely used in discovering the association between genotypes and phenotypes. Human genome data contain valuable but highly sensitive information. Unprotected disclosure of such information might put individuals’ privacy at risk. It is important to protect human genome data. Exact logistic regression is a bias-reduction method based on a penalized likelihood to discover rare variants that are associated with disease susceptibility. We propose the HEALER framework to facilitate secure rare variants analysis with a small sample size. Results: We target an algorithm design that reduces the computational and storage costs of learning a homomorphic exact logistic regression model (i.e. evaluating P-values of coefficients), where the circuit depth is proportional to the logarithmic scale of data size. We evaluate the algorithm performance using rare Kawasaki Disease datasets. Availability and implementation: Download HEALER at http://research.ucsd-dbmi.org/HEALER/ Contact: shw070@ucsd.edu Supplementary information: Supplementary data are available at Bioinformatics online. PMID:26446135
Testing Gene-Gene Interactions in the Case-Parents Design
Yu, Zhaoxia
2011-01-01
The case-parents design has been widely used to detect genetic associations as it can prevent spurious association that could occur in population-based designs. When examining the effect of an individual genetic locus on a disease, logistic regressions developed by conditioning on parental genotypes provide complete protection from spurious association caused by population stratification. However, when testing gene-gene interactions, it is unknown whether conditional logistic regressions are still robust. Here we evaluate the robustness and efficiency of several gene-gene interaction tests that are derived from conditional logistic regressions. We found that in the presence of SNP genotype correlation due to population stratification or linkage disequilibrium, tests with incorrectly specified main-genetic-effect models can lead to inflated type I error rates. We also found that a test with fully flexible main genetic effects always maintains correct test size and its robustness can be achieved with negligible sacrifice of its power. When testing gene-gene interactions is the focus, the test allowing fully flexible main effects is recommended to be used. PMID:21778736
Li, Saijiao; He, Aiyan; Yang, Jing; Yin, TaiLang; Xu, Wangming
2011-01-01
To investigate factors that can affect compliance with treatment of polycystic ovary syndrome (PCOS) in infertile patients and to provide a basis for clinical treatment, specialist consultation and health education. Patient compliance was assessed via a questionnaire based on the Morisky-Green test and the treatment principles of PCOS. Then interviews were conducted with 99 infertile patients diagnosed with PCOS at Renmin Hospital of Wuhan University in China, from March to September 2009. Finally, these data were analyzed using logistic regression analysis. Logistic regression analysis revealed that a total of 23 (25.6%) of the participants showed good compliance. Factors that significantly (p < 0.05) affected compliance with treatment were the patient's body mass index, convenience of medical treatment and concerns about adverse drug reactions. Patients who are obese, experience inconvenient medical treatment or are concerned about adverse drug reactions are more likely to exhibit noncompliance. Treatment education and intervention aimed at these patients should be strengthened in the clinic to improve treatment compliance. Further research is needed to better elucidate the compliance behavior of patients with PCOS.
A general equation to obtain multiple cut-off scores on a test from multinomial logistic regression.
Bersabé, Rosa; Rivas, Teresa
2010-05-01
The authors derive a general equation to compute multiple cut-offs on a total test score in order to classify individuals into more than two ordinal categories. The equation is derived from the multinomial logistic regression (MLR) model, which is an extension of the binary logistic regression (BLR) model to accommodate polytomous outcome variables. From this analytical procedure, cut-off scores are established at the test score (the predictor variable) at which an individual is as likely to be in category j as in category j+1 of an ordinal outcome variable. The application of the complete procedure is illustrated by an example with data from an actual study on eating disorders. In this example, two cut-off scores on the Eating Attitudes Test (EAT-26) scores are obtained in order to classify individuals into three ordinal categories: asymptomatic, symptomatic and eating disorder. Diagnoses were made from the responses to a self-report (Q-EDD) that operationalises DSM-IV criteria for eating disorders. Alternatives to the MLR model to set multiple cut-off scores are discussed.
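The idea of placing a cut-off where two adjacent categories are equally likely can be sketched as follows. The sketch assumes a baseline-category multinomial logit with one predictor, where category j has linear predictor a_j + b_j x relative to a reference category (a = 0, b = 0); setting a_j + b_j x = a_{j+1} + b_{j+1} x gives x = (a_j - a_{j+1}) / (b_{j+1} - b_j). This is a derivation under that assumed parameterization, and the authors' general equation may be written differently.

```python
def cutoff_scores(intercepts, slopes):
    """Scores at which adjacent categories j and j+1 are equally likely.

    intercepts, slopes: per-category parameters in ordinal order, with the
    reference category entered as 0.0 and 0.0.
    """
    cuts = []
    for j in range(len(intercepts) - 1):
        denom = slopes[j + 1] - slopes[j]
        if denom == 0:
            raise ValueError("adjacent categories have equal slopes; no crossing point")
        cuts.append((intercepts[j] - intercepts[j + 1]) / denom)
    return cuts
```

With hypothetical parameters `cutoff_scores([0.0, -4.0, -10.0], [0.0, 0.2, 0.4])`, the two cut-offs fall near 20 and 30, which splits the score range into three ordered bands.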
Sparse Logistic Regression for Diagnosis of Liver Fibrosis in Rat by Using SCAD-Penalized Likelihood
Yan, Fang-Rong; Lin, Jin-Guan; Liu, Yu
2011-01-01
The objective of the present study is to find the quantitative relationship between progression of liver fibrosis and the levels of certain serum markers using a mathematical model. We provide sparse logistic regression using the smoothly clipped absolute deviation (SCAD) penalty function to diagnose liver fibrosis in rats. Not only does it give a sparse solution with high accuracy, it also provides the users with the precise probabilities of classification with the class information. In the simulated and experimental cases, the proposed method is comparable to the stepwise linear discriminant analysis (SLDA) and the sparse logistic regression with least absolute shrinkage and selection operator (LASSO) penalty, by using receiver operating characteristic (ROC) with Bayesian bootstrap estimating area under the curve (AUC) diagnostic sensitivity for selected variable. Results show that the new approach provides a good correlation between the serum marker levels and the liver fibrosis induced by thioacetamide (TAA) in rats. Meanwhile, this approach might also be used in predicting the development of liver cirrhosis. PMID:21716672
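For readers unfamiliar with the SCAD penalty mentioned in this abstract, the sketch below evaluates its standard piecewise form: linear (like the LASSO) for small coefficients, quadratic in a transition zone, and constant for large coefficients. The default a = 3.7 is the value commonly suggested in the SCAD literature; the function name and arguments are illustrative, and the abstract does not state which tuning values were used.

```python
def scad_penalty(theta, lam, a=3.7):
    """SCAD penalty for a single coefficient theta (requires lam > 0 and a > 2)."""
    t = abs(theta)
    if t <= lam:
        # Same as the LASSO penalty near zero, which encourages sparsity.
        return lam * t
    if t <= a * lam:
        # Quadratic transition that gradually relaxes the penalty rate.
        return (2.0 * a * lam * t - t * t - lam * lam) / (2.0 * (a - 1.0))
    # Constant for large coefficients, so they are not shrunk further.
    return lam * lam * (a + 1.0) / 2.0
```

Because the penalty flattens out beyond a * lam, large coefficients are penalized no more than moderate ones, which is the usual argument for SCAD giving less biased estimates than the LASSO.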
Bingham, P; Verlander, N Q; Cheal, M J
2004-09-01
This paper examines why Snow's contention that cholera was principally spread by water was not accepted in the 1850s by the medical elite. The consequence of rejection was that hundreds in the UK continued to die. Logistic regression was used to re-analyse data, first published in 1852 by William Farr, consisting of the 1849 mortality rate from cholera and eight potential explanatory variables for the 38 registration districts of London. Logistic regression does not support Farr's original conclusion that a district's elevation above high water was the most important explanatory variable. Elevation above high water, water supply and poor rate each have an independent significant effect on district cholera mortality rate, but in terms of size of effect, it can be argued that water supply most strongly 'invited' further consideration. The science of epidemiology, that Farr helped to found, has continued to advance. Had logistic regression been available to Farr, its application to his 1852 data set would have changed his conclusion.
Evaluating penalized logistic regression models to predict Heat-Related Electric grid stress days
Bramer, Lisa M.; Rounds, J.; Burleyson, C. D.; ...
2017-09-22
Understanding the conditions associated with stress on the electricity grid is important in the development of contingency plans for maintaining reliability during periods when the grid is stressed. In this paper, heat-related grid stress and the relationship with weather conditions were examined using data from the eastern United States. Penalized logistic regression models were developed and applied to predict stress on the electric grid using weather data. The inclusion of other weather variables, such as precipitation, in addition to temperature improved model performance. Several candidate models and combinations of predictive variables were examined. A penalized logistic regression model which was fit at the operation-zone level was found to provide predictive value and interpretability. Additionally, the importance of different weather variables observed at various time scales was examined. Maximum temperature and precipitation were identified as important across all zones while the importance of other weather variables was zone specific. In conclusion, the methods presented in this work are extensible to other regions and can be used to aid in planning and development of the electrical grid.
Evaluating the Locational Attributes of Education Management Organizations (EMOs)
ERIC Educational Resources Information Center
Gulosino, Charisse; Miron, Gary
2017-01-01
This study uses logistic and multinomial logistic regression models to analyze neighborhood factors affecting EMO (Education Management Organization)-operated schools' locational attributes (using census tracts) in 41 states for the 2014-2015 school year. Our research combines market-based school reform, institutional theory, and resource…
NASA Astrophysics Data System (ADS)
Nandy, Sreyankar; Mostafa, Atahar; Kumavor, Patrick D.; Sanders, Melinda; Brewer, Molly; Zhu, Quing
2016-10-01
A spatial frequency domain imaging (SFDI) system was developed for characterizing ex vivo human ovarian tissue using wide-field absorption and scattering properties and their spatial heterogeneities. Based on the observed differences between absorption and scattering images of different ovarian tissue groups, six parameters were quantitatively extracted. These are the mean absorption and scattering, spatial heterogeneities of both absorption and scattering maps measured by a standard deviation, and a fitting error of a Gaussian model fitted to normalized mean Radon transform of the absorption and scattering maps. A logistic regression model was used for classification of malignant and normal ovarian tissues. A sensitivity of 95%, specificity of 100%, and area under the curve of 0.98 were obtained using six parameters extracted from the SFDI images. The preliminary results demonstrate the diagnostic potential of the SFDI method for quantitative characterization of wide-field optical properties and the spatial distribution heterogeneity of human ovarian tissue. SFDI could be an extremely robust and valuable tool for evaluation of the ovary and detection of neoplastic changes of ovarian cancer.
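The sensitivity, specificity, and area under the curve reported above are standard summaries of a binary classifier. The sketch below shows how they can be computed from predicted scores. The function names, labels, and scores are hypothetical and are not the study's data.

```python
def sensitivity_specificity(labels, scores, threshold=0.5):
    """Return (sensitivity, specificity) when score >= threshold means positive."""
    tp = sum(1 for y, s in zip(labels, scores) if y == 1 and s >= threshold)
    fn = sum(1 for y, s in zip(labels, scores) if y == 1 and s < threshold)
    tn = sum(1 for y, s in zip(labels, scores) if y == 0 and s < threshold)
    fp = sum(1 for y, s in zip(labels, scores) if y == 0 and s >= threshold)
    return tp / (tp + fn), tn / (tn + fp)


def auc(labels, scores):
    """Area under the ROC curve as the probability that a randomly chosen
    positive scores higher than a randomly chosen negative (ties count 0.5)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical example: 1 = malignant, 0 = normal.
labels = [1, 1, 1, 0, 0]
scores = [0.9, 0.8, 0.4, 0.6, 0.1]
print(sensitivity_specificity(labels, scores), auc(labels, scores))
```

Sensitivity and specificity depend on the chosen threshold, whereas the AUC summarises ranking quality across all thresholds.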
Szyda, Joanna; Liu, Zengting; Zatoń-Dobrowolska, Magdalena; Wierzbicki, Heliodor; Rzasa, Anna
2008-01-01
We analysed data from a selective DNA pooling experiment with 130 individuals of the arctic fox (Alopex lagopus), which originated from 2 different types regarding body size. The association between alleles of 6 selected unlinked molecular markers and body size was tested by using univariate and multinomial logistic regression models, applying odds ratio and test statistics from the power divergence family. Due to the small sample size and the resulting sparseness of the data table, in hypothesis testing we could not rely on the asymptotic distributions of the tests. Instead, we tried to account for data sparseness by (i) modifying confidence intervals of odds ratio; (ii) using a normal approximation of the asymptotic distribution of the power divergence tests with different approaches for calculating moments of the statistics; and (iii) assessing P values empirically, based on bootstrap samples. As a result, a significant association was observed for 3 markers. Furthermore, we used simulations to assess the validity of the normal approximation of the asymptotic distribution of the test statistics under the conditions of small and sparse samples.
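Approach (iii), assessing P values empirically from bootstrap samples, can be sketched as follows. This is a simplified illustration under stated assumptions, not the authors' procedure: it uses the absolute difference in allele frequency between two groups as the test statistic rather than a power divergence statistic, and it resamples from the pooled data to mimic the null hypothesis of no association. All names and data are hypothetical.

```python
import random


def empirical_p_value(null_stats, observed):
    """Share of resampled statistics at least as extreme as the observed one.

    The +1 in numerator and denominator keeps the estimate above zero.
    """
    exceed = sum(1 for t in null_stats if t >= observed)
    return (exceed + 1) / (len(null_stats) + 1)


def freq_diff(a, b):
    """Absolute difference in allele frequency between two 0/1 samples."""
    return abs(sum(a) / len(a) - sum(b) / len(b))


def bootstrap_null(a, b, n_boot=999, seed=0):
    """Resample both groups with replacement from the pooled data, so that
    any difference between the resampled groups arises by chance alone."""
    rng = random.Random(seed)
    pooled = list(a) + list(b)
    stats = []
    for _ in range(n_boot):
        ra = rng.choices(pooled, k=len(a))
        rb = rng.choices(pooled, k=len(b))
        stats.append(freq_diff(ra, rb))
    return stats


# Hypothetical allele indicators (1 = allele present) for two body-size types.
large = [1] * 30 + [0] * 10
small = [1] * 10 + [0] * 30
observed = freq_diff(large, small)
p_value = empirical_p_value(bootstrap_null(large, small), observed)
print(observed, p_value)
```

Because the reference distribution is built by resampling, the P value does not rely on an asymptotic approximation, which is the motivation the abstract gives for small and sparse samples.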
de Paula, Jonas J.; Bicalho, Maria A.; Ávila, Rafaela T.; Cintra, Marco T. G.; Diniz, Breno S.; Romano-Silva, Marco A.; Malloy-Diniz, Leandro F.
2016-01-01
Depressive symptoms are associated with cognitive-functional impairment in normal aging older adults (NA). However, less is known about this effect in people with mild cognitive impairment (MCI) and mild Alzheimer's disease dementia (AD). We investigated this relationship along the NA-MCI-AD continuum by reanalyzing a previously published dataset. Participants (N = 274) underwent comprehensive neuropsychological assessment including measures of Executive Function, Language/Semantic Memory, Episodic Memory, Visuospatial Abilities, Activities of Daily Living (ADL), and the Geriatric Depression Scale. MANOVA, logistic regression and chi-square tests were performed to assess the association between depression and cognitive-functional performance in each group. In the NA group, depressed participants had a lower performance compared to non-depressed participants in all cognitive and functional domains. However, the same pattern was not observed in the MCI group or in AD. The results suggest a progressive loss of association between depression and worse cognitive-functional performance along the NA-MCI-AD continuum. PMID:26858666
Tocopherols and tocotrienols plasma levels are associated with cognitive impairment.
Mangialasche, Francesca; Xu, Weili; Kivipelto, Miia; Costanzi, Emanuela; Ercolani, Sara; Pigliautile, Martina; Cecchetti, Roberta; Baglioni, Mauro; Simmons, Andrew; Soininen, Hilkka; Tsolaki, Magda; Kloszewska, Iwona; Vellas, Bruno; Lovestone, Simon; Mecocci, Patrizia
2012-10-01
Vitamin E includes 8 natural compounds (4 tocopherols, 4 tocotrienols) with potential neuroprotective activity. α-Tocopherol has mainly been investigated in relation to cognitive impairment. We examined the relation of all plasma vitamin E forms and markers of vitamin E damage (α-tocopherylquinone, 5-nitro-γ-tocopherol) to mild cognitive impairment (MCI) and Alzheimer's disease (AD). Within the AddNeuroMed-Project, plasma tocopherols, tocotrienols, α-tocopherylquinone, and 5-nitro-γ-tocopherol were assessed in 168 AD cases, 166 MCI, and 187 cognitively normal (CN) people. Compared with cognitively normal subjects, AD and MCI had lower levels of total tocopherols, total tocotrienols, and total vitamin E. In multivariable-polytomous-logistic regression analysis, both MCI and AD cases had 85% lower odds to be in the highest tertile of total tocopherols and total vitamin E, and they were, respectively, 92% and 94% less likely to be in the highest tertile of total tocotrienols than the lowest tertile. Further, both disorders were associated with increased vitamin E damage. Low plasma tocopherols and tocotrienols levels are associated with increased odds of MCI and AD. Copyright © 2012 Elsevier Inc. All rights reserved.
Pierce Campbell, Christine M; Gheit, Tarik; Tommasino, Massimo; Lin, Hui-Yi; Torres, B Nelson; Messina, Jane L; Stoler, Mark H; Rollison, Dana E; Sirak, Bradley A; Abrahamsen, Martha; Carvalho da Silva, Roberto J; Sichero, Laura; Villa, Luisa L; Lazcano-Ponce, Eduardo; Giuliano, Anna R
2016-10-01
Cutaneous human papillomaviruses (HPVs) increase the risk of non-melanoma skin cancer in sun-exposed skin. We examined the role of beta-HPV in the development of male external genital lesions (EGLs), a sun-unexposed site. In this nested case-control study (67 men with pathologically-confirmed EGLs and 134 controls), exfoliated cells collected from the surface of lesions and normal genital skin 0, 6, and 12 months preceding EGL development were tested for beta-HPV DNA using a type-specific multiplex genotyping assay. Beta-HPV prevalence was estimated and conditional logistic regression was used to evaluate the association with condyloma, the most common EGL. While beta-HPV prevalence among controls remained stable, the prevalence among cases was lowest on the surface of the lesion. Detecting beta-HPV on the normal genital skin was not associated with the presence or development of condyloma. Cutaneous beta-HPV does not appear to be contributing to pathogenesis in male genital skin. Copyright © 2016. Published by Elsevier Inc.
Predicting the names of the best teams after the knock-out phase of a cricket series.
Lemmer, Hermanus Hofmeyr
2014-01-01
Cricket players' performances can best be judged after a large number of matches have been played. For test or one-day international (ODI) players, career data are normally used to calculate performance measures. These are normally good indicators of future performances, although various factors influence the performance of a player in a specific match. It is often necessary to judge players' performances based on a small number of scores, e.g. to identify the best players after a short series of matches. The challenge then is to use the best available criteria in order to assess performances as accurately and fairly as possible. In the present study the results of the knock-out phase of an International Cricket Council (ICC) World Cup ODI Series are used to predict the names of the best teams by means of a suitably formulated logistic regression model. Despite using very sparse data, the methods used are reasonably successful. It is also shown that if the same technique is applied to career ratings, very good results are obtained.