Science.gov

Sample records for age-adjusted logistic regression

  1. Logistic Regression

    NASA Astrophysics Data System (ADS)

    Grégoire, G.

    2014-12-01

    Logistic regression was originally intended to explain the relationship between the probability of an event and a set of covariates. The model's coefficients can be interpreted via odds and odds ratios, which are presented in the introduction of the chapter. When the observations are obtained individually, we speak of binary logistic regression; when they are grouped, the logistic regression is said to be binomial. In our presentation we mainly focus on the binary case. For statistical inference the main tool is maximum likelihood methodology: we present the Wald, Rao and likelihood ratio results and their use to compare nested models. The problems we intend to deal with are essentially the same as in multiple linear regression: testing global effects, testing individual effects, selecting variables to build a model, measuring the fit of the model, predicting new values, and so on. The methods are demonstrated on data sets using R. Finally we briefly consider the binomial case and the situation where we are interested in several events, that is, polytomous (multinomial) logistic regression and the particular case of ordinal logistic regression.
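
    The odds-ratio interpretation of a coefficient mentioned in this abstract can be illustrated with a minimal Python sketch. The coefficient values and function names below are hypothetical and are not taken from the chapter (which uses R); the point is only that, in a binary logistic model, a one-unit increase in a covariate multiplies the odds by exp(beta1).

```python
import math

def sigmoid(z):
    """Logistic function: maps a linear predictor to a probability."""
    return 1.0 / (1.0 + math.exp(-z))

def odds(p):
    """Odds corresponding to a probability p."""
    return p / (1.0 - p)

# Hypothetical coefficients, chosen only for illustration.
beta0, beta1 = -2.0, 0.7

def prob(x):
    """Event probability under the model logit(p) = beta0 + beta1 * x."""
    return sigmoid(beta0 + beta1 * x)

# Ratio of the odds at x = 3 to the odds at x = 2 (a one-unit increase).
odds_ratio = odds(prob(3.0)) / odds(prob(2.0))

# The same quantity read directly off the coefficient.
odds_ratio_from_coefficient = math.exp(beta1)
```

    The two quantities agree for any pair of covariate values one unit apart, which is why exp(beta1) is reported as the odds ratio.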

  2. Practical Session: Logistic Regression

    NASA Astrophysics Data System (ADS)

    Clausel, M.; Grégoire, G.

    2014-12-01

    An exercise is proposed to illustrate logistic regression. It investigates the different risk factors in the onset of coronary heart disease. The exercise was proposed in Chapter 5 of the book by D.G. Kleinbaum and M. Klein, "Logistic Regression", Statistics for Biology and Health, Springer Science Business Media, LLC (2010) and also by D. Chessel and A.B. Dufour in Lyon 1 (see Sect. 6 of http://pbil.univ-lyon1.fr/R/pdf/tdr341.pdf). This example is based on data given in the file evans.txt coming from http://www.sph.emory.edu/dkleinb/logreg3.htm#data.

  3. [Understanding logistic regression].

    PubMed

    El Sanharawi, M; Naudet, F

    2013-10-01

    Logistic regression is one of the most common multivariate analysis models utilized in epidemiology. It allows the measurement of the association between the occurrence of an event (qualitative dependent variable) and factors susceptible to influence it (explicative variables). The choice of explicative variables that should be included in the logistic regression model is based on prior knowledge of the disease physiopathology and the statistical association between the variable and the event, as measured by the odds ratio. The main steps for the procedure, the conditions of application, and the essential tools for its interpretation are discussed concisely. We also discuss the importance of the choice of variables that must be included and retained in the regression model in order to avoid the omission of important confounding factors. Finally, by way of illustration, we provide an example from the literature, which should help the reader test his or her knowledge.

  4. Steganalysis using logistic regression

    NASA Astrophysics Data System (ADS)

    Lubenko, Ivans; Ker, Andrew D.

    2011-02-01

    We advocate Logistic Regression (LR) as an alternative to the Support Vector Machine (SVM) classifiers commonly used in steganalysis. LR offers more information than traditional SVM methods - it estimates class probabilities as well as providing a simple classification - and can be adapted more easily and efficiently for multiclass problems. Like SVM, LR can be kernelised for nonlinear classification, and it shows comparable classification accuracy to SVM methods. This work is a case study, comparing accuracy and speed of SVM and LR classifiers in detection of LSB Matching and other related spatial-domain image steganography, through the state-of-the-art 686-dimensional SPAM feature set, in three image sets.

  5. Logistic Regression: Concept and Application

    ERIC Educational Resources Information Center

    Cokluk, Omay

    2010-01-01

    The main focus of logistic regression analysis is classification of individuals in different groups. The aim of the present study is to explain basic concepts and processes of binary logistic regression analysis intended to determine the combination of independent variables which best explain the membership in certain groups called dichotomous…

  6. Standards for Standardized Logistic Regression Coefficients

    ERIC Educational Resources Information Center

    Menard, Scott

    2011-01-01

    Standardized coefficients in logistic regression analysis have the same utility as standardized coefficients in linear regression analysis. Although there has been no consensus on the best way to construct standardized logistic regression coefficients, there is now sufficient evidence to suggest a single best approach to the construction of a…

  7. Should metacognition be measured by logistic regression?

    PubMed

    Rausch, Manuel; Zehetleitner, Michael

    2017-03-01

    Are logistic regression slopes suitable to quantify metacognitive sensitivity, i.e. the efficiency with which subjective reports differentiate between correct and incorrect task responses? We analytically show that logistic regression slopes are independent of rating criteria in one specific model of metacognition, which assumes (i) that rating decisions are based on sensory evidence generated independently of the sensory evidence used for primary task responses and (ii) that the distributions of evidence are logistic. Given a hierarchical model of metacognition, logistic regression slopes depend on rating criteria. According to all considered models, regression slopes depend on the primary task criterion. A reanalysis of previous data revealed that massive numbers of trials are required to distinguish between hierarchical and independent models with tolerable accuracy. It is argued that researchers who wish to use logistic regression as a measure of metacognitive sensitivity need to control the primary task criterion and rating criteria.

  8. Satellite rainfall retrieval by logistic regression

    NASA Technical Reports Server (NTRS)

    Chiu, Long S.

    1986-01-01

    The potential use of logistic regression in rainfall estimation from satellite measurements is investigated. Satellite measurements provide covariate information in terms of radiances from different remote sensors. The logistic regression technique can effectively accommodate many covariates and test their significance in the estimation. The outcome from the logistic model is the probability that the rainrate of a satellite pixel is above a certain threshold. By varying the thresholds, a rainrate histogram can be obtained, from which the mean and the variance can be estimated. A logistic model is developed and applied to rainfall data collected during GATE, using as covariates the fractional rain area and a radiance measurement which is deduced from a microwave temperature-rainrate relation. It is demonstrated that the fractional rain area is an important covariate in the model, consistent with the use of the so-called Area Time Integral in estimating total rain volume in other studies. To calibrate the logistic model, simulated rain fields generated by rainfield models with prescribed parameters are needed. A stringent test of the logistic model is its ability to recover the prescribed parameters of simulated rain fields. A rain field simulation model which preserves the fractional rain area and lognormality of rainrates as found in GATE is developed. A stochastic regression model of branching and immigration whose solutions are lognormally distributed in some asymptotic limits has also been developed.
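
    The step from threshold exceedance probabilities to a rainrate histogram, and from there to a mean and variance, can be sketched as follows. This is an illustrative Python sketch under stated assumptions, not the report's procedure: the thresholds and probabilities are hypothetical, each bin is represented by its midpoint, and the moments are conditional on the rainrate lying between the lowest and highest thresholds.

```python
def histogram_from_exceedance(exceed_probs):
    """Probability mass in each bin [t_k, t_{k+1}) from P(rate > t_k)."""
    return [exceed_probs[k] - exceed_probs[k + 1]
            for k in range(len(exceed_probs) - 1)]

def mean_and_variance(thresholds, exceed_probs):
    """Approximate mean and variance using bin midpoints (an assumption)."""
    masses = histogram_from_exceedance(exceed_probs)
    mids = [(thresholds[k] + thresholds[k + 1]) / 2.0
            for k in range(len(thresholds) - 1)]
    total = sum(masses)
    mean = sum(m * x for m, x in zip(masses, mids)) / total
    var = sum(m * (x - mean) ** 2 for m, x in zip(masses, mids)) / total
    return mean, var

# Hypothetical thresholds (mm/h) and exceedance probabilities, such as
# a logistic model evaluated at several thresholds might produce.
thresholds = [0.0, 2.0, 4.0, 6.0, 8.0]
exceed_probs = [1.0, 0.5, 0.2, 0.05, 0.0]

rain_mean, rain_var = mean_and_variance(thresholds, exceed_probs)
```

    Finer thresholds reduce the error introduced by the midpoint approximation.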

  9. Predicting Social Trust with Binary Logistic Regression

    ERIC Educational Resources Information Center

    Adwere-Boamah, Joseph; Hufstedler, Shirley

    2015-01-01

    This study used binary logistic regression to predict social trust with five demographic variables from a national sample of adult individuals who participated in The General Social Survey (GSS) in 2012. The five predictor variables were respondents' highest degree earned, race, sex, general happiness and the importance of personally assisting…

  10. Logistic models--an odd(s) kind of regression.

    PubMed

    Jupiter, Daniel C

    2013-01-01

    The logistic regression model bears some similarity to the multivariable linear regression with which we are familiar. However, the differences are great enough to warrant a discussion of the need for and interpretation of logistic regression.

  11. Model selection for logistic regression models

    NASA Astrophysics Data System (ADS)

    Duller, Christine

    2012-09-01

    Model selection for logistic regression models decides which of some given potential regressors have an effect and hence should be included in the final model. The second interesting question is whether a certain factor is heterogeneous among some subsets, i.e. whether the model should include a random intercept or not. In this paper these questions will be answered with classical as well as with Bayesian methods. The applications show some results of recent research projects in medicine and business administration.

  12. Transfer Learning Based on Logistic Regression

    NASA Astrophysics Data System (ADS)

    Paul, A.; Rottensteiner, F.; Heipke, C.

    2015-08-01

    In this paper we address the problem of classification of remote sensing images in the framework of transfer learning with a focus on domain adaptation. The main novel contribution is a method for transductive transfer learning in remote sensing on the basis of logistic regression. Logistic regression is a discriminative probabilistic classifier of low computational complexity, which can deal with multiclass problems. This research area deals with methods that solve problems in which labelled training data sets are assumed to be available only for a source domain, while classification is needed in the target domain with different, yet related characteristics. Classification takes place with a model of weight coefficients for hyperplanes which separate features in the transformed feature space. In terms of logistic regression, our domain adaptation method adjusts the model parameters by iterative labelling of the target test data set. These labelled data features are iteratively added to the current training set which, at the beginning, only contains source features and, simultaneously, a number of source features are deleted from the current training set. Experimental results based on a test series with synthetic and real data constitute a first proof-of-concept of the proposed method.

  13. Logistic Regression Applied to Seismic Discrimination

    SciTech Connect

    BG Amindan; DN Hagedorn

    1998-10-08

    The usefulness of logistic discrimination was examined in an effort to learn how it performs in a regional seismic setting. Logistic discrimination provides an easily understood method, works with user-defined models and few assumptions about the population distributions, and handles both continuous and discrete data. Seismic event measurements from a data set compiled by Los Alamos National Laboratory (LANL) of Chinese events recorded at station WMQ were used in this demonstration study. PNNL applied logistic regression techniques to the data. All possible combinations of the Lg and Pg measurements were tried, and a best-fit logistic model was created. The best combination of Lg and Pg frequencies for predicting the source of a seismic event (earthquake or explosion) used Lg_{3.0-6.0} and Pg_{3.0-6.0} as the predictor variables. A cross-validation test was run, which showed that this model was able to correctly predict 99.7% of earthquakes and 98.0% of explosions for this given data set. Two other models were identified that used Pg and Lg measurements from the 1.5 to 3.0 Hz frequency range. Although these other models did a good job of correctly predicting the earthquakes, they were not as effective at predicting the explosions. Two possible biases were discovered which affect the predicted probabilities for each outcome. The first bias was due to this being a case-control study. The sampling fractions caused a bias in the probabilities that were calculated using the models. The second bias is caused by a change in the proportions for each event. If at a later date the proportions (a priori probabilities) of explosions versus earthquakes change, this would cause a bias in the predicted probability for an event. When using logistic regression, the user needs to be aware of the possible biases and what effect they will have on the predicted probabilities.
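
    Both biases described in this abstract come from a mismatch between the class proportions in the fitting sample and those in the population of interest. A standard way to adjust for such a mismatch is to shift the predicted log-odds by the difference between the target and training prior log-odds. The Python sketch below illustrates that textbook correction; it is not necessarily the adjustment used in the report, and all numbers and names are hypothetical.

```python
import math

def logit(p):
    """Log-odds of a probability."""
    return math.log(p / (1.0 - p))

def sigmoid(z):
    """Inverse of logit."""
    return 1.0 / (1.0 + math.exp(-z))

def adjust_probability(p_model, train_prior, target_prior):
    """Re-express a model probability under a different prior.

    p_model      -- probability of 'explosion' predicted by the fitted model
    train_prior  -- proportion of explosions in the fitting sample
    target_prior -- assumed proportion of explosions where the model is used
    """
    z = logit(p_model) - logit(train_prior) + logit(target_prior)
    return sigmoid(z)

# Hypothetical example: the model was fitted on a balanced sample but
# explosions are assumed to make up only 10% of events in practice.
p_adjusted = adjust_probability(0.8, train_prior=0.5, target_prior=0.1)
```

    Only the intercept of the logistic model is affected by this shift; the slope coefficients, and therefore the odds ratios, are unchanged.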

  14. Supporting Regularized Logistic Regression Privately and Efficiently.

    PubMed

    Li, Wenfa; Liu, Hongzhe; Yang, Peng; Xie, Wei

    2016-01-01

    As one of the most popular statistical and machine learning models, logistic regression with regularization has found wide adoption in biomedicine, social sciences, information technology, and so on. These domains often involve data of human subjects that are contingent upon strict privacy regulations. Concerns over data privacy make it increasingly difficult to coordinate and conduct large-scale collaborative studies, which typically rely on cross-institution data sharing and joint analysis. Our work here focuses on safeguarding regularized logistic regression, a widely used statistical model that has nonetheless not been investigated from a data security and privacy perspective. We consider a common use scenario of multi-institution collaborative studies, such as in the form of research consortia or networks as widely seen in genetics, epidemiology, social sciences, etc. To make our privacy-enhancing solution practical, we demonstrate a non-conventional and computationally efficient method leveraging distributed computing and strong cryptography to provide comprehensive protection over individual-level and summary data. Extensive empirical evaluations on several studies validate the privacy guarantee, efficiency and scalability of our proposal. We also discuss the practical implications of our solution for large-scale studies and applications from various disciplines, including genetic and biomedical studies, smart grid, network analysis, etc.

  15. Supporting Regularized Logistic Regression Privately and Efficiently

    PubMed Central

    Li, Wenfa; Liu, Hongzhe; Yang, Peng; Xie, Wei

    2016-01-01

    As one of the most popular statistical and machine learning models, logistic regression with regularization has found wide adoption in biomedicine, social sciences, information technology, and so on. These domains often involve data of human subjects that are contingent upon strict privacy regulations. Concerns over data privacy make it increasingly difficult to coordinate and conduct large-scale collaborative studies, which typically rely on cross-institution data sharing and joint analysis. Our work here focuses on safeguarding regularized logistic regression, a widely used statistical model that has nonetheless not been investigated from a data security and privacy perspective. We consider a common use scenario of multi-institution collaborative studies, such as in the form of research consortia or networks as widely seen in genetics, epidemiology, social sciences, etc. To make our privacy-enhancing solution practical, we demonstrate a non-conventional and computationally efficient method leveraging distributed computing and strong cryptography to provide comprehensive protection over individual-level and summary data. Extensive empirical evaluations on several studies validate the privacy guarantee, efficiency and scalability of our proposal. We also discuss the practical implications of our solution for large-scale studies and applications from various disciplines, including genetic and biomedical studies, smart grid, network analysis, etc. PMID:27271738

  16. Using Dominance Analysis to Determine Predictor Importance in Logistic Regression

    ERIC Educational Resources Information Center

    Azen, Razia; Traxel, Nicole

    2009-01-01

    This article proposes an extension of dominance analysis that allows researchers to determine the relative importance of predictors in logistic regression models. Criteria for choosing logistic regression R² analogues were determined and measures were selected that can be used to perform dominance analysis in logistic regression. A…

  17. [Logistic regression against a divergent Bayesian network].

    PubMed

    Sánchez Trujillo, Noel Antonio

    2015-02-03

    This article is a discussion of two statistical tools used for prediction and causality assessment: logistic regression and Bayesian networks. Using data from a simulated example of a study assessing factors that might predict pulmonary emphysema (where fingertip pigmentation and smoking are considered), we posed the following questions. Is pigmentation a confounding, causal or predictive factor? Is there perhaps another factor, like smoking, that confounds? Is there a synergy between pigmentation and smoking? The results, in terms of prediction, are similar with the two techniques; regarding causation, differences arise. We conclude that, in decision-making, the combination of a statistical tool used with common sense and previous evidence, which takes years or even centuries to develop, is better than the automatic and exclusive use of statistical resources.

  18. A Methodology for Generating Placement Rules that Utilizes Logistic Regression

    ERIC Educational Resources Information Center

    Wurtz, Keith

    2008-01-01

    The purpose of this article is to provide the necessary tools for institutional researchers to conduct a logistic regression analysis and interpret the results. Aspects of the logistic regression procedure that are necessary to evaluate models are presented and discussed with an emphasis on cutoff values and choosing the appropriate number of…

  19. Preserving Institutional Privacy in Distributed binary Logistic Regression.

    PubMed

    Wu, Yuan; Jiang, Xiaoqian; Ohno-Machado, Lucila

    2012-01-01

    Privacy is becoming a major concern when sharing biomedical data across institutions. Although methods for protecting the privacy of individual patients have been proposed, it is not clear how to protect institutional privacy, which is often a critical concern of data custodians. Building upon our previous work, Grid Binary LOgistic REgression (GLORE), we developed an Institutional Privacy-preserving Distributed binary Logistic Regression model (IPDLR) that considers both individual and institutional privacy for building a logistic regression model in a distributed manner. We tested our method using both simulated and clinical data, showing how it is possible to protect the privacy of individuals and of institutions using a distributed strategy.

  20. Large unbalanced credit scoring using Lasso-logistic regression ensemble.

    PubMed

    Wang, Hong; Xu, Qingsong; Zhou, Lifeng

    2015-01-01

    Recently, various ensemble learning methods with different base classifiers have been proposed for credit scoring problems. However, for various reasons, there has been little research using logistic regression as the base classifier. In this paper, given large unbalanced data, we consider the plausibility of ensemble learning using regularized logistic regression as the base classifier to deal with credit scoring problems. In this research, the data is first balanced and diversified by clustering and bagging algorithms. Then we apply a Lasso-logistic regression learning ensemble to evaluate the credit risks. We show that the proposed algorithm outperforms popular credit scoring models such as decision tree, Lasso-logistic regression and random forests in terms of AUC and F-measure. We also provide two importance measures for the proposed model to identify important variables in the data.

  1. MODELING SNAKE MICROHABITAT FROM RADIOTELEMETRY STUDIES USING POLYTOMOUS LOGISTIC REGRESSION

    EPA Science Inventory

    Multivariate analysis of snake microhabitat has historically used techniques that were derived under assumptions of normality and common covariance structure (e.g., discriminant function analysis, MANOVA). In this study, polytomous logistic regression (PLR which does not require ...

  2. Large Unbalanced Credit Scoring Using Lasso-Logistic Regression Ensemble

    PubMed Central

    Wang, Hong; Xu, Qingsong; Zhou, Lifeng

    2015-01-01

    Recently, various ensemble learning methods with different base classifiers have been proposed for credit scoring problems. However, for various reasons, there has been little research using logistic regression as the base classifier. In this paper, given large unbalanced data, we consider the plausibility of ensemble learning using regularized logistic regression as the base classifier to deal with credit scoring problems. In this research, the data is first balanced and diversified by clustering and bagging algorithms. Then we apply a Lasso-logistic regression learning ensemble to evaluate the credit risks. We show that the proposed algorithm outperforms popular credit scoring models such as decision tree, Lasso-logistic regression and random forests in terms of AUC and F-measure. We also provide two importance measures for the proposed model to identify important variables in the data. PMID:25706988

  3. [Unconditioned logistic regression and sample size: a bibliographic review].

    PubMed

    Ortega Calvo, Manuel; Cayuela Domínguez, Aurelio

    2002-01-01

    Unconditioned logistic regression is a highly useful risk prediction method in epidemiology. This article reviews the solutions provided by different authors concerning the interface between the calculation of the sample size and the use of logistic regression. Based on the knowledge of the information initially provided, a review is made of the customized regression and predictive constriction phenomenon, the design of an ordinal exposure with a binary output, the events of interest per variable concept, the indicator variables, the classic Freeman equation, etc. Some skeptical ideas regarding this subject are also included.

  4. Estimating the exceedance probability of rain rate by logistic regression

    NASA Technical Reports Server (NTRS)

    Chiu, Long S.; Kedem, Benjamin

    1990-01-01

    Recent studies have shown that the fraction of an area with rain intensity above a fixed threshold is highly correlated with the area-averaged rain rate. To estimate the fractional rainy area, a logistic regression model, which estimates the conditional probability that rain rate over an area exceeds a fixed threshold given the values of related covariates, is developed. The problem of dependency in the data in the estimation procedure is bypassed by the method of partial likelihood. Analyses of simulated scanning multichannel microwave radiometer and observed electrically scanning microwave radiometer data during the Global Atlantic Tropical Experiment period show that the use of logistic regression in pixel classification is superior to multiple regression in predicting whether rain rate at each pixel exceeds a given threshold, even in the presence of noisy data. The potential of the logistic regression technique in satellite rain rate estimation is discussed.

  5. Differentially private distributed logistic regression using private and public data

    PubMed Central

    2014-01-01

    Background: Privacy protection is an important issue in medical informatics and differential privacy is a state-of-the-art framework for data privacy research. Differential privacy offers provable privacy against attackers who have auxiliary information, and can be applied to data mining models (for example, logistic regression). However, differentially private methods sometimes introduce too much noise and make outputs less useful. Given available public data in medical research (e.g. from patients who sign open-consent agreements), we can design algorithms that use both public and private data sets to decrease the amount of noise that is introduced. Methodology: In this paper, we modify the update step in the Newton-Raphson method to propose a differentially private distributed logistic regression model based on both public and private data. Experiments and results: We try our algorithm on three different data sets, and show its advantage over: (1) a logistic regression model based solely on public data, and (2) a differentially private distributed logistic regression model based on private data under various scenarios. Conclusion: Logistic regression models built with our new algorithm based on both private and public datasets demonstrate better utility than models trained on private or public datasets alone, without sacrificing the rigorous privacy guarantee. PMID:25079786
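
    For orientation, the ordinary (non-private, single-site) Newton-Raphson update for logistic regression, which the abstract says the authors modify, can be sketched in Python as below. This sketch does not include the paper's differential-privacy noise or its distributed protocol; the data, the single-covariate restriction and all names are assumptions made for illustration.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def score_and_information(beta, xs, ys):
    """Gradient of the log-likelihood and the information matrix X'WX
    for a model with an intercept and one covariate."""
    b0, b1 = beta
    g0 = g1 = 0.0
    h00 = h01 = h11 = 0.0
    for x, y in zip(xs, ys):
        p = sigmoid(b0 + b1 * x)
        w = p * (1.0 - p)
        g0 += y - p
        g1 += (y - p) * x
        h00 += w
        h01 += w * x
        h11 += w * x * x
    return (g0, g1), (h00, h01, h11)

def newton_step(beta, xs, ys):
    """One update: beta + (X'WX)^{-1} X'(y - p), via an explicit 2x2 inverse."""
    (g0, g1), (h00, h01, h11) = score_and_information(beta, xs, ys)
    det = h00 * h11 - h01 * h01
    d0 = (h11 * g0 - h01 * g1) / det
    d1 = (h00 * g1 - h01 * g0) / det
    return (beta[0] + d0, beta[1] + d1)

def fit(xs, ys, iterations=25):
    beta = (0.0, 0.0)
    for _ in range(iterations):
        beta = newton_step(beta, xs, ys)
    return beta

# Hypothetical, non-separable toy data.
xs = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0]
ys = [0, 0, 1, 0, 1, 0, 1, 1]
beta_hat = fit(xs, ys)
```

    Because the gradient and information matrix are sums over observations, each site in a distributed setting could in principle compute its own partial sums; how those sums are protected is the subject of the paper and is not reproduced here.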

  6. Identification of high leverage points in binary logistic regression

    NASA Astrophysics Data System (ADS)

    Fitrianto, Anwar; Wendy, Tham

    2016-10-01

    Leverage points are observations that are unusual in the x space of a regression. Detecting high leverage points is vital because they can mask outliers. In regression, observations that lie at the extremes of the space of explanatory variables, far from the average of the data, are called leverage points. In this project, a method for identifying high leverage points in logistic regression is shown using a numerical example. We investigate the effects of high leverage points on the logistic regression model. The results of the model with and without the leverage points are compared. Some graphical analyses based on the results are presented. We found that the presence of high leverage points (HLP) affects the h_ii, the estimated probabilities, the estimated coefficients, the p-values of the variables, the odds ratios and the regression equation.
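
    The h_ii mentioned in this abstract are conventionally the diagonal elements of the logistic regression hat matrix, h_ii = w_i x_i' (X'WX)^{-1} x_i with w_i = p_i(1 - p_i). The Python sketch below computes them for a model with an intercept and one covariate. The data, the coefficient values and the 2p/n cutoff (one common rule of thumb) are assumptions for illustration and are not taken from the paper.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def leverages(xs, beta):
    """Diagonal of the hat matrix for logit(p) = b0 + b1 * x."""
    b0, b1 = beta
    ws = []
    for x in xs:
        p = sigmoid(b0 + b1 * x)
        ws.append(p * (1.0 - p))
    # Entries of the 2x2 matrix X'WX.
    a = sum(ws)
    b = sum(w * x for w, x in zip(ws, xs))
    c = sum(w * x * x for w, x in zip(ws, xs))
    det = a * c - b * b
    inv00, inv01, inv11 = c / det, -b / det, a / det
    # h_ii = w_i * [1, x_i] (X'WX)^{-1} [1, x_i]'
    return [w * (inv00 + 2.0 * inv01 * x + inv11 * x * x)
            for w, x in zip(ws, xs)]

# Hypothetical covariate values with one point far from the rest,
# and hypothetical fitted coefficients.
xs = [1.0, 2.0, 3.0, 4.0, 5.0, 20.0]
beta = (-0.5, 0.1)

h = leverages(xs, beta)
n_params = 2
cutoff = 2.0 * n_params / len(xs)  # rule-of-thumb threshold (assumption)
flagged = [i for i, value in enumerate(h) if value > cutoff]
```

    The leverages always sum to the number of parameters, which gives a quick sanity check on any implementation.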

  7. Improving Predictions in Imbalanced Data Using Pairwise Expanded Logistic Regression

    PubMed Central

    Jiang, Xiaoqian; El-Kareh, Robert; Ohno-Machado, Lucila

    2011-01-01

    Building classifiers for medical problems often involves dealing with rare, but important events. Imbalanced datasets pose challenges to ordinary classification algorithms such as Logistic Regression (LR) and Support Vector Machines (SVM). The lack of effective strategies for dealing with imbalanced training data often results in models that exhibit poor discrimination. We propose a novel approach to estimate class memberships based on the evaluation of pairwise relationships in the training data. The method we propose, Pairwise Expanded Logistic Regression, improved discrimination and had higher accuracy when compared to existing methods in two imbalanced datasets, thus showing promise as a potential remedy for this problem. PMID:22195118

  8. Model building strategy for logistic regression: purposeful selection.

    PubMed

    Zhang, Zhongheng

    2016-03-01

    Logistic regression is one of the most commonly used models to account for confounders in the medical literature. The article introduces how to perform the purposeful selection model building strategy with R. I stress the use of the likelihood ratio test to see whether deleting a variable has a significant impact on model fit. A deleted variable should also be checked for whether it is an important adjustment for the remaining covariates. Interactions should be checked to disentangle complex relationships between covariates and their synergistic effect on the response variable. The model should be checked for goodness-of-fit (GOF), in other words, how well the fitted model reflects the real data. The Hosmer-Lemeshow GOF test is the most widely used for logistic regression models.
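
    The likelihood ratio test for deleting a single variable compares the maximized log-likelihoods of the full and reduced models. The article works in R; the Python sketch below only illustrates the arithmetic for the one-degree-of-freedom case, where the chi-square tail probability can be written with the complementary error function. The log-likelihood values are hypothetical.

```python
import math

def lr_test_one_df(loglik_full, loglik_reduced):
    """Likelihood ratio statistic and p-value for dropping one parameter.

    For a chi-square variable with 1 degree of freedom,
    P(X > s) = erfc(sqrt(s / 2)).
    """
    stat = max(2.0 * (loglik_full - loglik_reduced), 0.0)
    p_value = math.erfc(math.sqrt(stat / 2.0))
    return stat, p_value

# Hypothetical maximized log-likelihoods of two nested models.
stat, p_value = lr_test_one_df(loglik_full=-100.0, loglik_reduced=-101.920729)
```

    A small p-value suggests that removing the variable worsens the fit, so the variable would be retained under this criterion.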

  9. Sugarcane Land Classification with Satellite Imagery using Logistic Regression Model

    NASA Astrophysics Data System (ADS)

    Henry, F.; Herwindiati, D. E.; Mulyono, S.; Hendryli, J.

    2017-03-01

    This paper discusses the classification of sugarcane plantation areas from Landsat-8 satellite imagery. The classification process uses binary logistic regression with time series of the normalized difference vegetation index as input. The process is divided into two steps: training and classification. The purpose of the training step is to identify the best parameters of the regression model using a gradient descent algorithm. The best-fit model can then be used to classify sugarcane and non-sugarcane areas. The experiment shows high accuracy and successfully maps the sugarcane plantation area, with a best result of Cohen's kappa 0.7833 (strong) and 89.167% accuracy.
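
    The relation between accuracy and Cohen's kappa reported in this abstract can be illustrated with a short Python sketch. The confusion matrix below is hypothetical: the paper's actual counts are not given in the abstract, and these values were chosen only so that the arithmetic lands on the reported accuracy and kappa. Kappa is (p_o - p_e) / (1 - p_e), where p_o is the observed agreement and p_e the agreement expected by chance.

```python
def accuracy_and_kappa(confusion):
    """Observed accuracy and Cohen's kappa from a square confusion matrix
    (rows: reference classes, columns: predicted classes)."""
    n = sum(sum(row) for row in confusion)
    k = len(confusion)
    observed = sum(confusion[i][i] for i in range(k)) / n
    row_totals = [sum(confusion[i]) for i in range(k)]
    col_totals = [sum(confusion[i][j] for i in range(k)) for j in range(k)]
    expected = sum(row_totals[i] * col_totals[i] for i in range(k)) / (n * n)
    kappa = (observed - expected) / (1.0 - expected)
    return observed, kappa

# Hypothetical counts for a balanced two-class test set of 120 samples.
confusion = [[54, 6],
             [7, 53]]
accuracy, kappa = accuracy_and_kappa(confusion)
```

    With two equally frequent reference classes the chance agreement is 0.5, so kappa equals twice the accuracy minus one.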

  10. Saddlepoint approximations for small sample logistic regression problems.

    PubMed

    Platt, R W

    2000-02-15

    Double saddlepoint approximations provide quick and accurate approximations to exact conditional tail probabilities in a variety of situations. This paper describes the use of these approximations in two logistic regression problems. An investigation of regression analysis of the log-odds ratio in a sequence or set of 2x2 tables via simulation studies shows that in practical settings the saddlepoint methods closely approximate exact conditional inference. The double saddlepoint approximation in the test for trend in a sequence of binomial random variates is also shown, via simulation studies, to be an effective approximation to exact conditional inference.

  11. Computing group cardinality constraint solutions for logistic regression problems.

    PubMed

    Zhang, Yong; Kwon, Dongjin; Pohl, Kilian M

    2017-01-01

    We derive an algorithm to directly solve logistic regression based on a cardinality constraint and group sparsity, and use it to classify intra-subject MRI sequences (e.g. cine MRIs) of healthy from diseased subjects. Group cardinality constraint models are often applied to medical images in order to avoid overfitting of the classifier to the training data. Solutions within these models are generally determined by relaxing the cardinality constraint to a weighted feature selection scheme. However, these solutions relate to the original sparse problem only under specific assumptions, which generally do not hold for medical image applications. In addition, inferring clinical meaning from features weighted by a classifier is an ongoing topic of discussion. To avoid weighting features, we propose to directly solve the group cardinality constraint logistic regression problem by generalizing the Penalty Decomposition method. To do so, we assume that an intra-subject series of images represents repeated samples of the same disease patterns. We model this assumption by combining series of measurements created by a feature across time into a single group. Our algorithm then derives a solution within that model by decoupling the minimization of the logistic regression function from enforcing the group sparsity constraint. The minimum of the smooth and convex logistic regression problem is determined via gradient descent while we derive a closed form solution for finding a sparse approximation of that minimum. We apply our method to cine MRI of 38 healthy controls and 44 adult patients who received reconstructive surgery of Tetralogy of Fallot (TOF) during infancy. Our method correctly identifies regions impacted by TOF and generally obtains statistically significantly higher classification accuracy than alternative solutions to this model, i.e., ones relaxing group cardinality constraints.

  12. Logistic Regression Model on Antenna Control Unit Autotracking Mode

    DTIC Science & Technology

    2015-10-20

    412TW-PA-15240. Logistic Regression Model on Antenna Control Unit Autotracking Mode. Daniel T. Laird, Air Force Test Center, Edwards AFB, CA ... October 20, 2015. Approved for public release; distribution is unlimited. ... Abstract: Over the past several years

  13. Classifying machinery condition using oil samples and binary logistic regression

    NASA Astrophysics Data System (ADS)

    Phillips, J.; Cripps, E.; Lau, John W.; Hodkiewicz, M. R.

    2015-08-01

    The era of big data has resulted in an explosion of condition monitoring information. The result is an increasing motivation to automate the costly and time-consuming human elements involved in the classification of machine health. When working with industry it is important to build an understanding and hence some trust in the classification scheme for those who use the analysis to initiate maintenance tasks. Typically, "black box" approaches such as artificial neural networks (ANN) and support vector machines (SVM) are difficult to interpret. In contrast, this paper argues that logistic regression offers easy interpretability to industry experts, providing insight to the drivers of the human classification process and to the ramifications of potential misclassification. Of course, accuracy is of foremost importance in any automated classification scheme, so we also provide a comparative study based on predictive performance of logistic regression, ANN and SVM. A real-world oil analysis data set from engines on mining trucks is presented and using cross-validation we demonstrate that logistic regression out-performs the ANN and SVM approaches in terms of prediction for healthy/not healthy engines.

  14. Using logistic regression to estimate delay-discounting functions.

    PubMed

    Wileyto, E Paul; Audrain-McGovern, Janet; Epstein, Leonard H; Lerman, Caryn

    2004-02-01

    The monetary choice questionnaire (MCQ) and similar computer tasks ask preference questions in order to ascertain indifference, the perceived equivalence of immediate versus larger delayed rewards. Indifference data are then fitted with a hyperbolic function, summarizing the decline in perceived value with delay time. We present a fitting method that estimates the hyperbolic parameter k directly from survey responses. Binary preferences are modeled as a function of time (X2) and a transformed reward ratio (X1), yielding logistic regression coefficients beta 2 and beta 1. The hyperbolic parameter emerges as k = beta 2/beta 1, where the logistic predicted p = .5 (the definition of indifference). The MCQ was administered to 1,073 adolescents and was scored using both standard and logistic methods. The means for ln(k) were similar (standard, -4.53; logistic, -4.51), and the results were highly correlated (rho = .973). Simulated MCQ data showed that k was unbiased, except where beta 1 ≥ -1, indicating a vague survey response. Jackknife standard errors provided excellent coverage.
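    A short derivation sketch of why k = beta 2/beta 1 at indifference. The abstract does not state the reward transformation; the form X1 = 1 - 1/R, with R the ratio of the immediate to the delayed reward, the delay D as X2, and a model without intercept are assumptions made here for illustration.

```latex
\begin{align*}
\operatorname{logit}(p) &= \beta_1 X_1 + \beta_2 X_2,
  \qquad X_1 = 1 - \frac{1}{R}, \quad X_2 = D \\
\text{hyperbolic indifference:}\quad R &= \frac{1}{1 + kD}
  \;\Longrightarrow\; X_1 = 1 - (1 + kD) = -kD \\
p = 0.5 \;\Longrightarrow\; 0 &= \beta_1(-kD) + \beta_2 D
  \;\Longrightarrow\; k = \frac{\beta_2}{\beta_1}
\end{align*}
```

    Under these assumptions the delay D cancels, so a single pair of coefficients determines k for all questionnaire items.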

  15. Detecting nonsense for Chinese comments based on logistic regression

    NASA Astrophysics Data System (ADS)

    Zhuolin, Ren; Guang, Chen; Shu, Chen

    2016-07-01

    To understand cyber citizens' opinions accurately from Chinese news comments, a clear definition of nonsense is presented, and a detection model based on logistic regression (LR) is proposed. The detection of nonsense can be treated as a binary classification problem. Besides traditional lexical features, we propose three kinds of features in terms of emotion, structure and relevance. With these features, we train an LR model and demonstrate its effect in understanding Chinese news comments. We find that each of the proposed features can significantly improve the result. In our experiments, we achieve a prediction accuracy of 84.3%, which improves on the 77.3% baseline by 7 percentage points.

  16. Landslide Hazard Mapping in Rwanda Using Logistic Regression

    NASA Astrophysics Data System (ADS)

    Piller, A.; Anderson, E.; Ballard, H.

    2015-12-01

    Landslides in the United States cause more than $1 billion in damages and 50 deaths per year (USGS 2014). Globally, figures are much more grave, yet monitoring, mapping and forecasting of these hazards are less than adequate. Seventy-five percent of the population of Rwanda earns a living from farming, mostly subsistence. Loss of farmland, housing, or life to landslides is a very real hazard. Landslides in Rwanda have an impact at the economic, social, and environmental level. In a developing nation that faces challenges in tracking, cataloging, and predicting the numerous landslides that occur each year, satellite imagery and spatial analysis allow for remote study. We have focused on the development of a landslide inventory and a statistical methodology for assessing landslide hazards. Using logistic regression on approximately 30 test variables (e.g., slope, soil type, land cover) and a sample of over 200 landslides, we determine which variables are statistically most relevant to landslide occurrence in Rwanda. A preliminary predictive hazard map for Rwanda has been produced, using the variables selected from the logistic regression analysis.

  17. Parental Vaccine Acceptance: A Logistic Regression Model Using Previsit Decisions.

    PubMed

    Lee, Sara; Riley-Behringer, Maureen; Rose, Jeanmarie C; Meropol, Sharon B; Lazebnik, Rina

    2016-10-26

    This study explores how parents' intentions regarding vaccination prior to their children's visit were associated with actual vaccine acceptance. A convenience sample of parents accompanying 6-week-old to 17-year-old children completed a written survey at 2 pediatric practices. Using hierarchical logistic regression, for hospital-based participants (n = 216), vaccine refusal history (P < .01) and vaccine decision made before the visit (P < .05) explained 87% of vaccine refusals. In community-based participants (n = 100), vaccine refusal history (P < .01) explained 81% of refusals. Over 1 in 5 parents changed their minds about vaccination during the visit. Thirty parents who were previous vaccine refusers accepted current vaccines, and 37 who had intended not to vaccinate chose vaccination. Twenty-nine parents without a refusal history declined vaccines, and 32 who did not intend to refuse before the visit declined vaccination. Future research should identify key factors to nudge parent decision making in favor of vaccination.

  18. Metadata for ReVA logistic regression dataset

    USGS Publications Warehouse

    LaMotte, Andrew E.

    2004-01-01

    The U.S. Geological Survey in cooperation with the U.S. Environmental Protection Agency's Regional Vulnerability Assessment Program, has developed a set of statistical tools to support regional-scale, ground-water quality and vulnerability assessments. The Regional Vulnerability Assessment Program goals are to develop and demonstrate approaches to comprehensive, regional-scale assessments that effectively inform water-resources managers and decision-makers as to the magnitude, extent, distribution, and uncertainty of current and anticipated environmental risks. The U.S. Geological Survey is developing and exploring the use of statistical probability models to characterize the relation between ground-water quality and geographic factors in the Mid-Atlantic Region. Available water-quality data obtained from U.S. Geological Survey National Water-Quality Assessment Program studies conducted in the Mid-Atlantic Region were used in association with geographic data (land cover, geology, soils, and others) to develop logistic-regression equations that use explanatory variables to predict the presence of a selected water-quality parameter exceeding specified management concentration thresholds. The resulting logistic-regression equations were transformed to determine the probability, P(X), of a water-quality parameter exceeding a specified management threshold. Additional statistical procedures modified by the U.S. Geological Survey were used to compare the observed values to model-predicted values at each sample point. In addition, procedures to evaluate the confidence of the model predictions and estimate the uncertainty of the probability value were developed and applied. The resulting logistic-regression models were applied to the Mid-Atlantic Region to predict the spatial probability of nitrate concentrations exceeding specified management thresholds. These thresholds are usually set or established by regulators or managers at national or local levels. At management
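    The transformation from a logistic-regression equation to a probability P(X) mentioned above can be sketched as follows. This is a minimal illustration of the standard inverse-logit transform; the function name, coefficients and input values are hypothetical and are not taken from the USGS dataset.

```python
import math


def exceedance_probability(intercept, coefficients, values):
    """Return P(X) = 1 / (1 + exp(-(b0 + sum(b_i * x_i))))."""
    logit = intercept + sum(b * x for b, x in zip(coefficients, values))
    return 1.0 / (1.0 + math.exp(-logit))


# Hypothetical model: intercept plus two explanatory variables
# (for example, percent agricultural land cover and a soil indicator).
p = exceedance_probability(-2.0, [0.03, 1.2], [40.0, 0.5])
```

    A value of p close to 1 would indicate a high predicted probability that the water-quality parameter exceeds the chosen management threshold at that location.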

  19. On the Usefulness of a Multilevel Logistic Regression Approach to Person-Fit Analysis

    ERIC Educational Resources Information Center

    Conijn, Judith M.; Emons, Wilco H. M.; van Assen, Marcel A. L. M.; Sijtsma, Klaas

    2011-01-01

    The logistic person response function (PRF) models the probability of a correct response as a function of the item locations. Reise (2000) proposed to use the slope parameter of the logistic PRF as a person-fit measure. He reformulated the logistic PRF model as a multilevel logistic regression model and estimated the PRF parameters from this…

  20. The crux of the method: assumptions in ordinary least squares and logistic regression.

    PubMed

    Long, Rebecca G

    2008-10-01

    Logistic regression has increasingly become the tool of choice when analyzing data with a binary dependent variable. While resources relating to the technique are widely available, clear discussions of why logistic regression should be used in place of ordinary least squares regression are difficult to find. The current paper compares and contrasts the assumptions of ordinary least squares with those of logistic regression and explains why logistic regression's looser assumptions make it adept at handling violations of the more important assumptions in ordinary least squares.

  1. Risk factors for temporomandibular disorder: Binary logistic regression analysis

    PubMed Central

    Magalhães, Bruno G.; de-Sousa, Stéphanie T.; de Mello, Victor V C.; da-Silva-Barbosa, André C.; de-Assis-Morais, Mariana P L.; Barbosa-Vasconcelos, Márcia M V.

    2014-01-01

    Objectives: To analyze the influence of socioeconomic and demographic factors (gender, economic class, age and marital status) on the occurrence of temporomandibular disorder. Study Design: One hundred individuals from urban areas in the city of Recife (Brazil) registered at Family Health Units were examined using Axis I of the Research Diagnostic Criteria for Temporomandibular Disorders (RDC/TMD), which addresses myofascial pain and joint problems (disc displacement, arthralgia, osteoarthritis and osteoarthrosis). The Brazilian Economic Classification Criteria (CCEB) was used for the collection of socioeconomic and demographic data. Economic class was then categorized as Class A (high social class), Classes B/C (middle class) and Classes D/E (very poor social class). The results were analyzed using Pearson’s chi-square test for proportions, Fisher’s exact test, nonparametric Mann-Whitney test and binary logistic regression analysis. Results: None of the participants belonged to Class A, 72% belonged to Classes B/C and 28% belonged to Classes D/E. The multivariate analysis revealed that participants from Classes D/E had a 4.35-fold greater chance of exhibiting myofascial pain and 11.3-fold greater chance of exhibiting joint problems. Conclusions: Poverty is an important condition for exhibiting myofascial pain and joint problems. Key words: Temporomandibular joint disorders, risk factors, prevalence. PMID:24316706

  2. Application of Bayesian logistic regression to mining biomedical data.

    PubMed

    Avali, Viji R; Cooper, Gregory F; Gopalakrishnan, Vanathi

    2014-01-01

    Mining high dimensional biomedical data with existing classifiers is challenging and the predictions are often inaccurate. We investigated the use of Bayesian Logistic Regression (B-LR) for mining such data to predict and classify various disease conditions. The analysis was done on twelve biomedical datasets with binary class variables and the performance of B-LR was compared to those from other popular classifiers on these datasets with 10-fold cross validation using the WEKA data mining toolkit. The statistical significance of the results was analyzed by paired two tailed t-tests and non-parametric Wilcoxon signed-rank tests. We observed overall that B-LR with non-informative Gaussian priors performed on par with other classifiers in terms of accuracy, balanced accuracy and AUC. These results suggest that it is worthwhile to explore the application of B-LR to predictive modeling tasks in bioinformatics using informative biological prior probabilities. With informative prior probabilities, we conjecture that the performance of B-LR will improve.

  3. Autoregressive logistic regression applied to atmospheric circulation patterns

    NASA Astrophysics Data System (ADS)

    Guanche, Y.; Mínguez, R.; Méndez, F. J.

    2014-01-01

    Autoregressive logistic regression models have been successfully applied in medical and pharmacology research fields, and in simple models to analyze weather types. The main purpose of this paper is to introduce a general framework to study atmospheric circulation patterns capable of dealing simultaneously with seasonality, interannual variability, long-term trends, and autocorrelation of different orders. To show its effectiveness on modeling performance, daily atmospheric circulation patterns identified from observed sea level pressure fields over the Northeastern Atlantic have been analyzed using this framework. Model predictions are compared with probabilities from the historical database, showing very good fitting diagnostics. In addition, the fitted model is used to simulate the evolution over time of atmospheric circulation patterns using the Monte Carlo method. Simulation results are statistically consistent with respect to the historical sequence in terms of (1) probability of occurrence of the different weather types, (2) transition probabilities and (3) persistence. The proposed model constitutes an easy-to-use and powerful tool for a better understanding of the climate system.

  4. Predicting β-Turns in Protein Using Kernel Logistic Regression

    PubMed Central

    Elbashir, Murtada Khalafallah; Sheng, Yu; Wang, Jianxin; Wu, FangXiang; Li, Min

    2013-01-01

    A β-turn is a secondary protein structure type that plays a significant role in protein configuration and function. On average 25% of amino acids in protein structures are located in β-turns. It is very important to develop an accurate and efficient method for β-turns prediction. Most of the current successful β-turns prediction methods use support vector machines (SVMs) or neural networks (NNs). The kernel logistic regression (KLR) is a powerful classification technique that has been applied successfully in many classification problems. However, it is often not found in β-turns classification, mainly because it is computationally expensive. In this paper, we used KLR to obtain sparse β-turns prediction in short evolution time. Secondary structure information and position-specific scoring matrices (PSSMs) are utilized as input features. We achieved Qtotal of 80.7% and MCC of 50% on BT426 dataset. These results show that KLR method with the right algorithm can yield performance equivalent to or even better than NNs and SVMs in β-turns prediction. In addition, KLR yields probabilistic outcome and has a well-defined extension to multiclass case. PMID:23509793

  5. Analyzing Student Learning Outcomes: Usefulness of Logistic and Cox Regression Models. IR Applications, Volume 5

    ERIC Educational Resources Information Center

    Chen, Chau-Kuang

    2005-01-01

    Logistic and Cox regression methods are practical tools used to model the relationships between certain student learning outcomes and their relevant explanatory variables. The logistic regression model fits an S-shaped curve into a binary outcome with data points of zero and one. The Cox regression model allows investigators to study the duration…

  6. What Are the Odds of that? A Primer on Understanding Logistic Regression

    ERIC Educational Resources Information Center

    Huang, Francis L.; Moon, Tonya R.

    2013-01-01

    The purpose of this Methodological Brief is to present a brief primer on logistic regression, a commonly used technique when modeling dichotomous outcomes. Using data from the National Education Longitudinal Study of 1988 (NELS:88), logistic regression techniques were used to investigate student-level variables in eighth grade (i.e., enrolled in a…

  7. Power and Sample Size Calculations for Logistic Regression Tests for Differential Item Functioning

    ERIC Educational Resources Information Center

    Li, Zhushan

    2014-01-01

    Logistic regression is a popular method for detecting uniform and nonuniform differential item functioning (DIF) effects. Theoretical formulas for the power and sample size calculations are derived for likelihood ratio tests and Wald tests based on the asymptotic distribution of the maximum likelihood estimators for the logistic regression model.…

  8. Comparing Methodologies for Developing an Early Warning System: Classification and Regression Tree Model versus Logistic Regression. REL 2015-077

    ERIC Educational Resources Information Center

    Koon, Sharon; Petscher, Yaacov

    2015-01-01

    The purpose of this report was to explicate the use of logistic regression and classification and regression tree (CART) analysis in the development of early warning systems. It was motivated by state education leaders' interest in maintaining high classification accuracy while simultaneously improving practitioner understanding of the rules by…

  9. A Bayesian goodness of fit test and semiparametric generalization of logistic regression with measurement data.

    PubMed

    Schörgendorfer, Angela; Branscum, Adam J; Hanson, Timothy E

    2013-06-01

    Logistic regression is a popular tool for risk analysis in medical and population health science. With continuous response data, it is common to create a dichotomous outcome for logistic regression analysis by specifying a threshold for positivity. Fitting a linear regression to the nondichotomized response variable assuming a logistic sampling model for the data has been empirically shown to yield more efficient estimates of odds ratios than ordinary logistic regression of the dichotomized endpoint. We illustrate that risk inference is not robust to departures from the parametric logistic distribution. Moreover, the model assumption of proportional odds is generally not satisfied when the condition of a logistic distribution for the data is violated, leading to biased inference from a parametric logistic analysis. We develop novel Bayesian semiparametric methodology for testing goodness of fit of parametric logistic regression with continuous measurement data. The testing procedures hold for any cutoff threshold and our approach simultaneously provides the ability to perform semiparametric risk estimation. Bayes factors are calculated using the Savage-Dickey ratio for testing the null hypothesis of logistic regression versus a semiparametric generalization. We propose a fully Bayesian and a computationally efficient empirical Bayesian approach to testing, and we present methods for semiparametric estimation of risks, relative risks, and odds ratios when parametric logistic regression fails. Theoretical results establish the consistency of the empirical Bayes test. Results from simulated data show that the proposed approach provides accurate inference irrespective of whether parametric assumptions hold or not. Evaluation of risk factors for obesity shows that different inferences are derived from an analysis of a real data set when deviations from a logistic distribution are permissible in a flexible semiparametric framework.

  10. Factors Influencing Drug Injection History among Prisoners: A Comparison between Classification and Regression Trees and Logistic Regression Analysis

    PubMed Central

    Rastegari, Azam; Haghdoost, Ali Akbar; Baneshi, Mohammad Reza

    2013-01-01

    Background Due to the importance of medical studies, researchers of this field should be familiar with various types of statistical analyses to select the most appropriate method based on the characteristics of their data sets. Classification and regression trees (CARTs) can be complementary to regression models. We compared the performance of a logistic regression model and a CART in predicting drug injection among prisoners. Methods Data of 2720 Iranian prisoners was studied to determine the factors influencing drug injection. The collected data was divided into two groups of training and testing. A logistic regression model and a CART were applied on training data. The performance of the two models was then evaluated on testing data. Findings The regression model and the CART had 8 and 4 significant variables, respectively. Overall, heroin use, history of imprisonment, age at first drug use, and marital status were important factors in determining the history of drug injection. Subjects without the history of heroin use or heroin users with short-term imprisonment were at lower risk of drug injection. Among heroin addicts with long-term imprisonment, individuals with higher age at first drug use and married subjects were at lower risk of drug injection. Although the logistic regression model was more sensitive than the CART, the two models had the same levels of specificity and classification accuracy. Conclusion In this study, both sensitivity and specificity were important. While the logistic regression model had better performance, the graphical presentation of the CART simplifies the interpretation of the results. In general, a combination of different analytical methods is recommended to explore the effects of variables. PMID:24494152

  11. Comparison of artificial neural networks with logistic regression for detection of obesity.

    PubMed

    Heydari, Seyed Taghi; Ayatollahi, Seyed Mohammad Taghi; Zare, Najaf

    2012-08-01

    Obesity is a common problem in nutrition, both in the developed and developing countries. The aim of this study was to classify obesity by artificial neural networks and logistic regression. This cross-sectional study comprised 414 healthy military personnel in southern Iran. All subjects completed questionnaires on their socio-economic status, and their anthropometric measurements were taken by a trained nurse. Classification of obesity was done by artificial neural networks and logistic regression. The mean age±SD of participants was 34.4 ± 7.5 years. A total of 187 (45.2%) were obese. For logistic regression and neural networks, respectively, 80.2% and 81.2% of subjects were correctly classified, sensitivity was 80.2 and 79.7, and specificity was 81.9 and 83.7; the areas under the Receiver-Operating Characteristic (ROC) curve were 0.888 and 0.884, and the Kappa statistics were 0.600 and 0.629. We conclude that neural networks and logistic regression were both good classifiers for obesity detection, but they did not differ significantly in classification performance.
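    The performance measures reported above (proportion correctly classified, sensitivity, specificity and the Kappa statistic) can all be computed from a 2x2 confusion matrix. The following is a minimal sketch; the function name and the counts are hypothetical and are not the study's data.

```python
def classification_metrics(tp, fn, fp, tn):
    """Compute accuracy, sensitivity, specificity and Cohen's kappa.

    tp/fn: obese subjects classified as obese / not obese
    fp/tn: non-obese subjects classified as obese / not obese
    """
    n = tp + fn + fp + tn
    accuracy = (tp + tn) / n
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    # Agreement expected by chance, from the row and column marginals.
    p_yes = ((tp + fn) / n) * ((tp + fp) / n)
    p_no = ((fp + tn) / n) * ((fn + tn) / n)
    expected = p_yes + p_no
    kappa = (accuracy - expected) / (1 - expected)
    return {
        "accuracy": accuracy,
        "sensitivity": sensitivity,
        "specificity": specificity,
        "kappa": kappa,
    }


# Illustrative counts only (100 subjects, 50 of them obese).
metrics = classification_metrics(tp=40, fn=10, fp=5, tn=45)
```

    Kappa corrects the raw accuracy for agreement expected by chance, which is why it is lower than the proportion correctly classified.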

  12. Binary logistic regression-Instrument for assessing museum indoor air impact on exhibits.

    PubMed

    Bucur, Elena; Danet, Andrei Florin; Lehr, Carol Blaziu; Lehr, Elena; Nita-Lazar, Mihai

    2017-04-01

    This paper presents a new way to assess the environmental impact on historical artifacts using binary logistic regression. The prediction of the impact on the exhibits during certain pollution scenarios (environmental impact) was calculated by a mathematical model based on binary logistic regression; it allows the identification, from a multitude of possible environmental parameters, of those with a significant impact on exhibits and ranks them according to the severity of their effect. Air quality (NO2, SO2, O3 and PM2.5) and microclimate parameters (temperature, humidity) monitoring data from a case study conducted within exhibition and storage spaces of the Romanian National Aviation Museum Bucharest have been used for developing and validating the binary logistic regression method and the mathematical model. The logistic regression analysis was performed on 794 data combinations (715 to develop the model and 79 to validate it) using the Statistical Package for the Social Sciences (SPSS 20.0). The results from the binary logistic regression analysis demonstrated that, of the six parameters taken into consideration, four present a significant effect upon exhibits in the following order: O3>PM2.5>NO2>humidity, followed at a significant distance by the effects of SO2 and temperature. The mathematical model developed in this study correctly predicted 95.1% of the cumulated effect of the environmental parameters upon the exhibits. Moreover, this model could also be used in the decisional process regarding the preventive preservation measures that should be implemented within the exhibition space.

  13. Modelling of binary logistic regression for obesity among secondary students in a rural area of Kedah

    NASA Astrophysics Data System (ADS)

    Kamaruddin, Ainur Amira; Ali, Zalila; Noor, Norlida Mohd.; Baharum, Adam; Ahmad, Wan Muhamad Amir W.

    2014-07-01

    Logistic regression analysis examines the influence of various factors on a dichotomous outcome by estimating the probability of the event's occurrence. Logistic regression, also called a logit model, is a statistical procedure used to model dichotomous outcomes. In the logit model the log odds of the dichotomous outcome is modeled as a linear combination of the predictor variables. The log odds ratio in logistic regression provides a description of the probabilistic relationship of the variables and the outcome. In conducting logistic regression, selection procedures are used to select important predictor variables; diagnostics are used to check that assumptions are valid, which include independence of errors, linearity in the logit for continuous variables, absence of multicollinearity, and lack of strongly influential outliers; and a test statistic is calculated to determine the aptness of the model. This study used the binary logistic regression model to investigate overweight and obesity among rural secondary school students on the basis of their demographic profile, medical history, diet and lifestyle. The results indicate that overweight and obesity of students are influenced by obesity in family and the interaction between a student's ethnicity and routine meals intake. The odds of a student being overweight and obese are higher for a student having a family history of obesity and for a non-Malay student who frequently takes routine meals as compared to a Malay student.
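    The logit model described above can be written out in standard notation. The symbols (p for the event probability, x_j for the predictors, beta_j for the coefficients) are generic and are not taken from the paper.

```latex
\begin{align*}
\log\frac{p}{1-p} &= \beta_0 + \beta_1 x_1 + \dots + \beta_m x_m \\
p &= \frac{1}{1 + \exp\!\bigl(-(\beta_0 + \beta_1 x_1 + \dots + \beta_m x_m)\bigr)} \\
\text{OR}_j &= e^{\beta_j}
  \quad\text{(odds ratio for a one-unit increase in } x_j \text{, other predictors held fixed)}
\end{align*}
```

    When the model contains an interaction term, as in the ethnicity by routine-meals result above, the odds ratio for one variable depends on the level of the other and is no longer a single exponentiated coefficient.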

  14. Standardization of age-adjusted mortality rates

    SciTech Connect

    Selvin, S.; Sacks, S.T.; Merrill, D.W.

    1980-02-01

    Because age is a significant variable in the occurrence and frequency of human disease, any comparison of disease or mortality rates, to be useful, must be age-specific or age-adjusted. Age-specific comparisons are not always appropriate or possible, however. A common method of eliminating the influence of age in comparing mortality rates from one community to another is to employ statistical methods of age-adjustment. While a variety of methods will accomplish this task, most are weighted averages of the age-specific rates. Two widely used adjustment procedures are direct and indirect age-adjustment.
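    The two procedures named above can be sketched in a few lines. This is a minimal illustration assuming the textbook definitions: direct adjustment is a weighted average of the community's age-specific rates using a standard population as weights, and indirect adjustment scales the standard population's crude rate by the ratio of observed to expected deaths. All function names and numbers are hypothetical.

```python
def direct_adjusted_rate(age_specific_rates, standard_population):
    """Weighted average of age-specific rates, weights = standard population."""
    total = sum(standard_population)
    weighted = sum(r * w for r, w in zip(age_specific_rates, standard_population))
    return weighted / total


def indirect_adjusted_rate(observed_deaths, study_population,
                           standard_rates, standard_crude_rate):
    """Standard crude rate scaled by the standardized mortality ratio (SMR)."""
    # Deaths expected if the study community had the standard age-specific rates.
    expected = sum(r * n for r, n in zip(standard_rates, study_population))
    smr = observed_deaths / expected
    return smr * standard_crude_rate


# Illustrative values for three age groups (young, middle, old).
direct = direct_adjusted_rate([0.001, 0.005, 0.020], [500, 300, 200])
indirect = indirect_adjusted_rate(
    observed_deaths=24,
    study_population=[1000, 1000, 1000],
    standard_rates=[0.002, 0.004, 0.010],
    standard_crude_rate=0.005,
)
```

    Direct adjustment needs age-specific rates for the community being studied, whereas indirect adjustment needs only its age distribution and total deaths, which is why the indirect method is often used for small communities.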

  15. Empirical Bayesian LASSO-logistic regression for multiple binary trait locus mapping

    PubMed Central

    2013-01-01

    Background Complex binary traits are influenced by many factors including the main effects of many quantitative trait loci (QTLs), the epistatic effects involving more than one QTL, environmental effects and the effects of gene-environment interactions. Although a number of QTL mapping methods for binary traits have been developed, an efficient and powerful method that can handle both main and epistatic effects of a relatively large number of possible QTLs is still lacking. Results In this paper, we use a Bayesian logistic regression model as the QTL model for binary traits that includes both main and epistatic effects. Our logistic regression model employs hierarchical priors for regression coefficients similar to the ones used in the Bayesian LASSO linear model for multiple QTL mapping for continuous traits. We develop efficient empirical Bayesian algorithms to infer the logistic regression model. Our simulation study shows that our algorithms can easily handle a QTL model with a large number of main and epistatic effects on a personal computer, and outperform five other methods examined including the LASSO, HyperLasso, BhGLM, RVM and the single-QTL mapping method based on logistic regression in terms of power of detection and false positive rate. The utility of our algorithms is also demonstrated through analysis of a real data set. A software package implementing the empirical Bayesian algorithms in this paper is freely available upon request. Conclusions The EBLASSO logistic regression method can handle a large number of effects possibly including the main and epistatic QTL effects, environmental effects and the effects of gene-environment interactions. It will be a very useful tool for multiple QTL mapping for complex binary traits. PMID:23410082

  16. PARAMETRIC AND NON PARAMETRIC (MARS: MULTIVARIATE ADDITIVE REGRESSION SPLINES) LOGISTIC REGRESSIONS FOR PREDICTION OF A DICHOTOMOUS RESPONSE VARIABLE WITH AN EXAMPLE FOR PRESENCE/ABSENCE OF AMPHIBIANS

    EPA Science Inventory

    The purpose of this report is to provide a reference manual that could be used by investigators for making informed use of logistic regression using two methods (standard logistic regression and MARS). The details for analyses of relationships between a dependent binary response ...

  17. Use and interpretation of logistic regression in habitat-selection studies

    USGS Publications Warehouse

    Keating, Kim A.; Cherry, Steve

    2004-01-01

    Logistic regression is an important tool for wildlife habitat-selection studies, but the method frequently has been misapplied due to an inadequate understanding of the logistic model, its interpretation, and the influence of sampling design. To promote better use of this method, we review its application and interpretation under 3 sampling designs: random, case-control, and use-availability. Logistic regression is appropriate for habitat use-nonuse studies employing random sampling and can be used to directly model the conditional probability of use in such cases. Logistic regression also is appropriate for studies employing case-control sampling designs, but careful attention is required to interpret results correctly. Unless bias can be estimated or probability of use is small for all habitats, results of case-control studies should be interpreted as odds ratios, rather than probability of use or relative probability of use. When data are gathered under a use-availability design, logistic regression can be used to estimate approximate odds ratios if probability of use is small, at least on average. More generally, however, logistic regression is inappropriate for modeling habitat selection in use-availability studies. In particular, using logistic regression to fit the exponential model of Manly et al. (2002:100) does not guarantee maximum-likelihood estimates, valid probabilities, or valid likelihoods. We show that the resource selection function (RSF) commonly used for the exponential model is proportional to a logistic discriminant function. Thus, it may be used to rank habitats with respect to probability of use and to identify important habitat characteristics or their surrogates, but it is not guaranteed to be proportional to probability of use. Other problems associated with the exponential model also are discussed. We describe an alternative model based on Lancaster and Imbens (1996) that offers a method for estimating conditional probability of use in…
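
    The abstract's point that odds ratios approximate relative probability of use only when use is rare can be checked numerically. The sketch below is an illustration, not the authors' method: the coefficient and the two intercepts are hypothetical values chosen to contrast a rare-use habitat with a commonly used one.

```python
import math


def logistic(x):
    """Inverse logit: map a linear predictor to a probability."""
    return 1.0 / (1.0 + math.exp(-x))


def odds(p):
    """Odds corresponding to probability p."""
    return p / (1.0 - p)


def compare(intercept, beta):
    """Return (odds ratio, probability ratio) for raising x from 0 to 1."""
    p0 = logistic(intercept)
    p1 = logistic(intercept + beta)
    return odds(p1) / odds(p0), p1 / p0


beta = 0.7  # hypothetical coefficient, not taken from the paper

# Rare use (baseline probability about 0.0025): the two ratios nearly agree.
rare_or, rare_pr = compare(-6.0, beta)

# Common use (baseline probability 0.5): the odds ratio overstates the
# relative probability of use.
common_or, common_pr = compare(0.0, beta)

print(f"rare:   odds ratio {rare_or:.3f}, probability ratio {rare_pr:.3f}")
print(f"common: odds ratio {common_or:.3f}, probability ratio {common_pr:.3f}")
```

    In both cases the odds ratio equals exp(beta); only the probability ratio depends on the baseline, which is why the interpretation of the coefficient hinges on how rare use is.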

  18. Mixed-Effects Logistic Regression for Estimating Transitional Probabilities in Sequentially Coded Observational Data

    ERIC Educational Resources Information Center

    Ozechowski, Timothy J.; Turner, Charles W.; Hops, Hyman

    2007-01-01

    This article demonstrates the use of mixed-effects logistic regression (MLR) for conducting sequential analyses of binary observational data. MLR is a special case of the mixed-effects logit modeling framework, which may be applied to multicategorical observational data. The MLR approach is motivated in part by G. A. Dagne, G. W. Howe, C. H.…

  19. Detecting DIF in Polytomous Items Using MACS, IRT and Ordinal Logistic Regression

    ERIC Educational Resources Information Center

    Elosua, Paula; Wells, Craig

    2013-01-01

    The purpose of the present study was to compare the Type I error rate and power of two model-based procedures, the mean and covariance structure model (MACS) and the item response theory (IRT), and an observed-score based procedure, ordinal logistic regression, for detecting differential item functioning (DIF) in polytomous items. A simulation…

  20. Accuracy of Bayes and Logistic Regression Subscale Probabilities for Educational and Certification Tests

    ERIC Educational Resources Information Center

    Rudner, Lawrence

    2016-01-01

    In the machine learning literature, it is commonly accepted as fact that as calibration sample sizes increase, Naïve Bayes classifiers initially outperform Logistic Regression classifiers in terms of classification accuracy. Applied to subtests from an on-line final examination and from a highly regarded certification examination, this study shows…

  1. A Note on Three Statistical Tests in the Logistic Regression DIF Procedure

    ERIC Educational Resources Information Center

    Paek, Insu

    2012-01-01

    Although logistic regression became one of the well-known methods in detecting differential item functioning (DIF), its three statistical tests, the Wald, likelihood ratio (LR), and score tests, which are readily available under the maximum likelihood, do not seem to be consistently distinguished in DIF literature. This paper provides a clarifying…

  2. Comparison of IRT Likelihood Ratio Test and Logistic Regression DIF Detection Procedures

    ERIC Educational Resources Information Center

    Atar, Burcu; Kamata, Akihito

    2011-01-01

    The Type I error rates and the power of IRT likelihood ratio test and cumulative logit ordinal logistic regression procedures in detecting differential item functioning (DIF) for polytomously scored items were investigated in this Monte Carlo simulation study. For this purpose, 54 simulation conditions (combinations of 3 sample sizes, 2 sample…

  3. Modeling Polytomous Item Responses Using Simultaneously Estimated Multinomial Logistic Regression Models

    ERIC Educational Resources Information Center

    Anderson, Carolyn J.; Verkuilen, Jay; Peyton, Buddy L.

    2010-01-01

    Survey items with multiple response categories and multiple-choice test questions are ubiquitous in psychological and educational research. We illustrate the use of log-multiplicative association (LMA) models that are extensions of the well-known multinomial logistic regression model for multiple dependent outcome variables to reanalyze a set of…

  4. Risk Factors of Falls in Community-Dwelling Older Adults: Logistic Regression Tree Analysis

    ERIC Educational Resources Information Center

    Yamashita, Takashi; Noe, Douglas A.; Bailer, A. John

    2012-01-01

    Purpose of the Study: A novel logistic regression tree-based method was applied to identify fall risk factors and possible interaction effects of those risk factors. Design and Methods: A nationally representative sample of American older adults aged 65 years and older (N = 9,592) in the Health and Retirement Study 2004 and 2006 modules was used.…

  5. Comparison of a Bayesian Network with a Logistic Regression Model to Forecast IgA Nephropathy

    PubMed Central

    Ducher, Michel; Kalbacher, Emilie; Combarnous, François; Finaz de Vilaine, Jérome; McGregor, Brigitte; Fouque, Denis; Fauvel, Jean Pierre

    2013-01-01

    Models are increasingly used in clinical practice to improve the accuracy of diagnosis. The aim of our work was to compare a Bayesian network to logistic regression to forecast IgA nephropathy (IgAN) from simple clinical and biological criteria. Retrospectively, we pooled the results of all biopsies (n = 155) performed by nephrologists in a specialist clinical facility between 2002 and 2009. Two groups were constituted at random. The first subgroup was used to determine the parameters of the models adjusted to data by logistic regression or Bayesian network, and the second was used to compare the performances of the models using receiver operating characteristics (ROC) curves. IgAN was found (on pathology) in 44 patients. Areas under the ROC curves provided by both methods were highly significant but not different from each other. Based on the highest Youden indices, sensitivity was 100% versus 67% and specificity was 73% versus 95% for the Bayesian network and logistic regression, respectively. A Bayesian network is at least as efficient as logistic regression to estimate the probability of a patient suffering IgAN, using simple clinical and biological data obtained during consultation. PMID:24328031
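
    The Youden index mentioned in this abstract is J = sensitivity + specificity - 1. As a small worked example, the sketch below applies that standard definition to the sensitivity and specificity values reported above; the function and variable names are illustrative only.

```python
def youden_index(sensitivity, specificity):
    """Youden's J statistic: sensitivity + specificity - 1."""
    return sensitivity + specificity - 1.0


# (sensitivity, specificity) pairs as reported in the abstract.
reported = {
    "Bayesian network": (1.00, 0.73),
    "logistic regression": (0.67, 0.95),
}

j = {name: youden_index(se, sp) for name, (se, sp) in reported.items()}

for name, value in j.items():
    print(f"{name}: J = {value:.2f}")
```

    J ranges from 0 for a test that performs at chance level to 1 for a perfect test, so it summarizes each operating point in a single number.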

  6. A Comparison of Logistic Regression, Neural Networks, and Classification Trees Predicting Success of Actuarial Students

    ERIC Educational Resources Information Center

    Schumacher, Phyllis; Olinsky, Alan; Quinn, John; Smith, Richard

    2010-01-01

    The authors extended previous research by 2 of the authors who conducted a study designed to predict the successful completion of students enrolled in an actuarial program. They used logistic regression to determine the probability of an actuarial student graduating in the major or dropping out. They compared the results of this study with those…

  7. Comparison of a Bayesian network with a logistic regression model to forecast IgA nephropathy.

    PubMed

    Ducher, Michel; Kalbacher, Emilie; Combarnous, François; Finaz de Vilaine, Jérome; McGregor, Brigitte; Fouque, Denis; Fauvel, Jean Pierre

    2013-01-01

    Models are increasingly used in clinical practice to improve the accuracy of diagnosis. The aim of our work was to compare a Bayesian network to logistic regression to forecast IgA nephropathy (IgAN) from simple clinical and biological criteria. Retrospectively, we pooled the results of all biopsies (n = 155) performed by nephrologists in a specialist clinical facility between 2002 and 2009. Two groups were constituted at random. The first subgroup was used to determine the parameters of the models adjusted to data by logistic regression or Bayesian network, and the second was used to compare the performances of the models using receiver operating characteristics (ROC) curves. IgAN was found (on pathology) in 44 patients. Areas under the ROC curves provided by both methods were highly significant but not different from each other. Based on the highest Youden indices, sensitivity was 100% versus 67% and specificity was 73% versus 95% for the Bayesian network and logistic regression, respectively. A Bayesian network is at least as efficient as logistic regression to estimate the probability of a patient suffering IgAN, using simple clinical and biological data obtained during consultation.

  8. Propensity Score Estimation with Data Mining Techniques: Alternatives to Logistic Regression

    ERIC Educational Resources Information Center

    Keller, Bryan S. B.; Kim, Jee-Seon; Steiner, Peter M.

    2013-01-01

    Propensity score analysis (PSA) is a methodological technique which may correct for selection bias in a quasi-experiment by modeling the selection process using observed covariates. Because logistic regression is well understood by researchers in a variety of fields and easy to implement in a number of popular software packages, it has…

  9. Multiple Logistic Regression Analysis of Cigarette Use among High School Students

    ERIC Educational Resources Information Center

    Adwere-Boamah, Joseph

    2011-01-01

    A binary logistic regression analysis was performed to predict high school students' cigarette smoking behavior from selected predictors from 2009 CDC Youth Risk Behavior Surveillance Survey. The specific target student behavior of interest was frequent cigarette use. Five predictor variables included in the model were: a) race, b) frequency of…

  10. Monte Carlo Evaluation of Two-Level Logistic Regression for Assessing Person Fit

    ERIC Educational Resources Information Center

    Woods, Carol M.

    2008-01-01

    Person fit is the degree to which an item response model fits for individual examinees. Reise (2000) described how two-level logistic regression can be used to detect heterogeneity in person fit, evaluate potential predictors of person fit heterogeneity, and identify potentially aberrant individuals. The method has apparently never been applied to…

  11. Strategies for Testing Statistical and Practical Significance in Detecting DIF with Logistic Regression Models

    ERIC Educational Resources Information Center

    Fidalgo, Angel M.; Alavi, Seyed Mohammad; Amirian, Seyed Mohammad Reza

    2014-01-01

    This study examines three controversial aspects in differential item functioning (DIF) detection by logistic regression (LR) models: first, the relative effectiveness of different analytical strategies for detecting DIF; second, the suitability of the Wald statistic for determining the statistical significance of the parameters of interest; and…

  12. Comparing the Classification Accuracy among Nonparametric, Parametric Discriminant Analysis and Logistic Regression Methods.

    ERIC Educational Resources Information Center

    Ferrer, Alvaro J. Arce; Wang, Lin

    This study compared the classification performance among parametric discriminant analysis, nonparametric discriminant analysis, and logistic regression in a two-group classification application. Field data from an organizational survey were analyzed and bootstrapped for additional exploration. The data were observed to depart from multivariate…

  13. An Additional Measure of Overall Effect Size for Logistic Regression Models

    ERIC Educational Resources Information Center

    Allen, Jeff; Le, Huy

    2008-01-01

    Users of logistic regression models often need to describe the overall predictive strength, or effect size, of the model's predictors. Analogs of R[superscript 2] have been developed, but none of these measures are interpretable on the same scale as effects of individual predictors. Furthermore, R[superscript 2] analogs are not invariant to the…

  14. Comparison of Two Approaches for Handling Missing Covariates in Logistic Regression

    ERIC Educational Resources Information Center

    Peng, Chao-Ying Joanne; Zhu, Jin

    2008-01-01

    For the past 25 years, methodological advances have been made in missing data treatment. Most published work has focused on missing data in dependent variables under various conditions. The present study seeks to fill the void by comparing two approaches for handling missing data in categorical covariates in logistic regression: the…

  15. Iterative Purification and Effect Size Use with Logistic Regression for Differential Item Functioning Detection

    ERIC Educational Resources Information Center

    French, Brian F.; Maller, Susan J.

    2007-01-01

    Two unresolved implementation issues with logistic regression (LR) for differential item functioning (DIF) detection include ability purification and effect size use. Purification is suggested to control inaccuracies in DIF detection as a result of DIF items in the ability estimate. Additionally, effect size use may be beneficial in controlling…

  16. Predictors of Placement Stability at the State Level: The Use of Logistic Regression to Inform Practice

    ERIC Educational Resources Information Center

    Courtney, Jon R.; Prophet, Retta

    2011-01-01

    Placement instability is often associated with a number of negative outcomes for children. To gain state level contextual knowledge of factors associated with placement stability/instability, logistic regression was applied to selected variables from the New Mexico Adoption and Foster Care Administrative Reporting System dataset. Predictors…

  17. Predictive Discriminant Analysis Versus Logistic Regression in Two-Group Classification Problems.

    ERIC Educational Resources Information Center

    Meshbane, Alice; Morris, John D.

    A method for comparing the cross-validated classification accuracies of predictive discriminant analysis and logistic regression classification models is presented under varying data conditions for the two-group classification problem. With this method, separate-group, as well as total-sample proportions of the correct classifications, can be…

  18. A hybrid approach of stepwise regression, logistic regression, support vector machine, and decision tree for forecasting fraudulent financial statements.

    PubMed

    Chen, Suduan; Goo, Yeong-Jia James; Shen, Zone-De

    2014-01-01

    As the fraudulent financial statement of an enterprise is increasingly serious with each passing day, establishing a valid forecasting fraudulent financial statement model of an enterprise has become an important question for academic research and financial practice. After screening the important variables using the stepwise regression, the study also matches the logistic regression, support vector machine, and decision tree to construct the classification models to make a comparison. The study adopts financial and nonfinancial variables to assist in establishment of the forecasting fraudulent financial statement model. Research objects are the companies with fraudulent and nonfraudulent financial statements between 1998 and 2012. The findings are that financial and nonfinancial information are effectively used to distinguish the fraudulent financial statement, and decision tree C5.0 has the best classification effect (85.71%).

  19. A Hybrid Approach of Stepwise Regression, Logistic Regression, Support Vector Machine, and Decision Tree for Forecasting Fraudulent Financial Statements

    PubMed Central

    Goo, Yeong-Jia James; Shen, Zone-De

    2014-01-01

    As the fraudulent financial statement of an enterprise is increasingly serious with each passing day, establishing a valid forecasting fraudulent financial statement model of an enterprise has become an important question for academic research and financial practice. After screening the important variables using the stepwise regression, the study also matches the logistic regression, support vector machine, and decision tree to construct the classification models to make a comparison. The study adopts financial and nonfinancial variables to assist in establishment of the forecasting fraudulent financial statement model. Research objects are the companies with fraudulent and nonfraudulent financial statements between 1998 and 2012. The findings are that financial and nonfinancial information are effectively used to distinguish the fraudulent financial statement, and decision tree C5.0 has the best classification effect (85.71%). PMID:25302338

  20. Regressive logistic models for familial diseases: a formulation assuming an underlying liability model.

    PubMed Central

    Demenais, F M

    1991-01-01

    Statistical models have been developed to delineate the major-gene and non-major-gene factors accounting for the familial aggregation of complex diseases. The mixed model assumes an underlying liability to the disease, to which a major gene, a multifactorial component, and random environment contribute independently. Affection is defined by a threshold on the liability scale. The regressive logistic models assume that the logarithm of the odds of being affected is a linear function of major genotype, phenotypes of antecedents and other covariates. An equivalence between these two approaches cannot be derived analytically. I propose a formulation of the regressive logistic models on the supposition of an underlying liability model of disease. Relatives are assumed to have correlated liabilities to the disease; affected persons have liabilities exceeding an estimable threshold. Under the assumption that the correlation structure of the relatives' liabilities follows a regressive model, the regression coefficients on antecedents are expressed in terms of the relevant familial correlations. A parsimonious parameterization is a consequence of the assumed liability model, and a one-to-one correspondence with the parameters of the mixed model can be established. The logits, derived under the class A regressive model and under the class D regressive model, can be extended to include a large variety of patterns of family dependence, as well as gene-environment interactions. PMID:1897524

  1. Predicting Outcomes of Hospitalization for Heart Failure Using Logistic Regression and Knowledge Discovery Methods

    PubMed Central

    Phillips, Kirk T.; Street, W. Nick

    2005-01-01

    The purpose of this study is to determine the best prediction of heart failure outcomes, resulting from two methods -- standard epidemiologic analysis with logistic regression and knowledge discovery with supervised learning/data mining. Heart failure was chosen for this study as it exhibits higher prevalence and cost of treatment than most other hospitalized diseases. The prevalence of heart failure has exceeded 4 million cases in the U.S. Findings of this study should be useful for the design of quality improvement initiatives, as particular aspects of patient comorbidity and treatment are found to be associated with mortality. This is also a proof of concept study, considering the feasibility of emerging health informatics methods of data mining in conjunction with or in lieu of traditional logistic regression methods of prediction. Findings may also support the design of decision support systems and quality improvement programming for other diseases. PMID:16779367

  2. Hierarchical Bayesian Logistic Regression to forecast metabolic control in type 2 DM patients.

    PubMed

    Dagliati, Arianna; Malovini, Alberto; Decata, Pasquale; Cogni, Giulia; Teliti, Marsida; Sacchi, Lucia; Cerra, Carlo; Chiovato, Luca; Bellazzi, Riccardo

    2016-01-01

    In this work we present our efforts in building a model able to forecast patients' changes in clinical conditions when repeated measurements are available. In this case the available risk calculators are typically not applicable. We propose a Hierarchical Bayesian Logistic Regression model, which allows taking into account individual and population variability in model parameters estimate. The model is used to predict metabolic control and its variation in type 2 diabetes mellitus. In particular we have analyzed a population of more than 1000 Italian type 2 diabetic patients, collected within the European project Mosaic. The results obtained in terms of Matthews Correlation Coefficient are significantly better than the ones gathered with standard logistic regression model, based on data pooling.
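
    The Matthews Correlation Coefficient used as the performance measure here is computed from the four cells of a binary confusion matrix. The sketch below implements the standard formula; the counts are hypothetical and are not results from the study.

```python
import math


def matthews_corrcoef(tp, fp, fn, tn):
    """Matthews correlation coefficient from binary confusion-matrix counts.

    Returns 0.0 when any marginal total is zero, a common convention
    for the otherwise undefined case.
    """
    numerator = tp * tn - fp * fn
    denominator = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return numerator / denominator if denominator else 0.0


# Hypothetical counts for illustration only.
mcc = matthews_corrcoef(tp=40, fp=10, fn=20, tn=30)
print(f"MCC = {mcc:.3f}")
```

    The coefficient ranges from -1 (total disagreement) through 0 (no better than chance) to 1 (perfect prediction), and unlike raw accuracy it accounts for all four cells, which matters when the classes are unbalanced.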

  3. Hierarchical Bayesian Logistic Regression to forecast metabolic control in type 2 DM patients

    PubMed Central

    Dagliati, Arianna; Malovini, Alberto; Decata, Pasquale; Cogni, Giulia; Teliti, Marsida; Sacchi, Lucia; Cerra, Carlo; Chiovato, Luca; Bellazzi, Riccardo

    2016-01-01

    In this work we present our efforts in building a model able to forecast patients’ changes in clinical conditions when repeated measurements are available. In this case the available risk calculators are typically not applicable. We propose a Hierarchical Bayesian Logistic Regression model, which allows taking into account individual and population variability in model parameters estimate. The model is used to predict metabolic control and its variation in type 2 diabetes mellitus. In particular we have analyzed a population of more than 1000 Italian type 2 diabetic patients, collected within the European project Mosaic. The results obtained in terms of Matthews Correlation Coefficient are significantly better than the ones gathered with standard logistic regression model, based on data pooling. PMID:28269842

  4. EXpectation Propagation LOgistic REgRession (EXPLORER): Distributed Privacy-Preserving Online Model Learning

    PubMed Central

    Wang, Shuang; Jiang, Xiaoqian; Wu, Yuan; Cui, Lijuan; Cheng, Samuel; Ohno-Machado, Lucila

    2013-01-01

    We developed an EXpectation Propagation LOgistic REgRession (EXPLORER) model for distributed privacy-preserving online learning. The proposed framework provides a high level guarantee for protecting sensitive information, since the information exchanged between the server and the client is the encrypted posterior distribution of coefficients. Through experimental results, EXPLORER shows the same performance (e.g., discrimination, calibration, feature selection etc.) as the traditional frequentist Logistic Regression model, but provides more flexibility in model updating. That is, EXPLORER can be updated one point at a time rather than having to retrain the entire data set when new observations are recorded. The proposed EXPLORER supports asynchronized communication, which relieves the participants from coordinating with one another, and prevents service breakdown from the absence of participants or interrupted communications. PMID:23562651

  5. Assessing the effects of different types of covariates for binary logistic regression

    NASA Astrophysics Data System (ADS)

    Hamid, Hamzah Abdul; Wah, Yap Bee; Xie, Xian-Jin; Rahman, Hezlin Aryani Abd

    2015-02-01

    It is well known that the type of data distribution in the independent variable(s) may affect many statistical procedures. This paper investigates and illustrates the effect of different types of covariates on the parameter estimation of a binary logistic regression model. A simulation study with different sample sizes and different types of covariates (uniform, normal, skewed) was carried out. Results showed that the parameters of the binary logistic regression model are severely overestimated when the sample size is less than 150 for covariates with normal or uniform distributions, while the parameters are underestimated when the covariate distribution is skewed. Parameter estimation improves for all types of covariates when the sample size is large, that is, at least 500.

  6. Predicting Student Success on the Texas Chemistry STAAR Test: A Logistic Regression Analysis

    ERIC Educational Resources Information Center

    Johnson, William L.; Johnson, Annabel M.; Johnson, Jared

    2012-01-01

    Background: The context is the new Texas STAAR end-of-course testing program. Purpose: The authors developed a logistic regression model to predict who would pass-or-fail the new Texas chemistry STAAR end-of-course exam. Setting: Robert E. Lee High School (5A) with an enrollment of 2700 students, Tyler, Texas. Date of the study was the 2011-2012…

  7. A logistic regression model for predicting axillary lymph node metastases in early breast carcinoma patients.

    PubMed

    Xie, Fei; Yang, Houpu; Wang, Shu; Zhou, Bo; Tong, Fuzhong; Yang, Deqi; Zhang, Jiaqing

    2012-01-01

    Nodal staging in breast cancer is a key predictor of prognosis. This paper presents the results of potential clinicopathological predictors of axillary lymph node involvement and develops an efficient prediction model to assist in predicting axillary lymph node metastases. Seventy patients with primary early breast cancer who underwent axillary dissection were evaluated. Univariate and multivariate logistic regression were performed to evaluate the association between clinicopathological factors and lymph node metastatic status. A logistic regression predictive model was built from 50 randomly selected patients; the model was also applied to the remaining 20 patients to assess its validity. Univariate analysis showed a significant relationship between lymph node involvement and absence of nm-23 (p = 0.010) and Kiss-1 (p = 0.001) expression. Absence of Kiss-1 remained significantly associated with positive axillary node status in the multivariate analysis (p = 0.018). Seven clinicopathological factors were involved in the multivariate logistic regression model: menopausal status, tumor size, ER, PR, HER2, nm-23 and Kiss-1. The model was accurate and discriminating, with an area under the receiver operating characteristic curve of 0.702 when applied to the validation group. Moreover, there is a need to discover more specific candidate proteins and molecular biology tools to select more variables which should improve predictive accuracy.
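
    The area under the ROC curve reported for the validation group can be read as the probability that a randomly chosen node-positive patient receives a higher predicted score than a randomly chosen node-negative patient. The sketch below computes it by direct pairwise comparison; the scores are hypothetical and are not the study's data.

```python
def auc(scores_pos, scores_neg):
    """Area under the ROC curve by pairwise comparison.

    Counts the fraction of (positive, negative) pairs in which the
    positive case has the higher score; ties count as one half.
    """
    wins = 0.0
    for sp in scores_pos:
        for sn in scores_neg:
            if sp > sn:
                wins += 1.0
            elif sp == sn:
                wins += 0.5
    return wins / (len(scores_pos) * len(scores_neg))


# Hypothetical predicted probabilities for a small validation set.
positives = [0.9, 0.6, 0.4]        # patients with nodal metastases
negatives = [0.7, 0.3, 0.2, 0.5]   # patients without

validation_auc = auc(positives, negatives)
print(f"AUC = {validation_auc:.2f}")
```

    The pairwise loop is quadratic in the number of patients, which is acceptable for a validation group of this size; a rank-based computation would be the usual choice for larger samples.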

  8. A Logistic Regression Model for Predicting Axillary Lymph Node Metastases in Early Breast Carcinoma Patients

    PubMed Central

    Xie, Fei; Yang, Houpu; Wang, Shu; Zhou, Bo; Tong, Fuzhong; Yang, Deqi; Zhang, Jiaqing

    2012-01-01

    Nodal staging in breast cancer is a key predictor of prognosis. This paper presents the results of potential clinicopathological predictors of axillary lymph node involvement and develops an efficient prediction model to assist in predicting axillary lymph node metastases. Seventy patients with primary early breast cancer who underwent axillary dissection were evaluated. Univariate and multivariate logistic regression were performed to evaluate the association between clinicopathological factors and lymph node metastatic status. A logistic regression predictive model was built from 50 randomly selected patients; the model was also applied to the remaining 20 patients to assess its validity. Univariate analysis showed a significant relationship between lymph node involvement and absence of nm-23 (p = 0.010) and Kiss-1 (p = 0.001) expression. Absence of Kiss-1 remained significantly associated with positive axillary node status in the multivariate analysis (p = 0.018). Seven clinicopathological factors were involved in the multivariate logistic regression model: menopausal status, tumor size, ER, PR, HER2, nm-23 and Kiss-1. The model was accurate and discriminating, with an area under the receiver operating characteristic curve of 0.702 when applied to the validation group. Moreover, there is a need to discover more specific candidate proteins and molecular biology tools to select more variables which should improve predictive accuracy. PMID:23012578

  9. Variable Selection in Logistic Regression for Detecting SNP-SNP Interactions: the Rheumatoid Arthritis Example

    PubMed Central

    Lin, H. Y.; Desmond, R.; Liu, Y. H.; Bridges, S. L.; Soong, S. J.

    2013-01-01

    Summary: Many complex disease traits are observed to be associated with single nucleotide polymorphism (SNP) interactions. In testing small-scale SNP-SNP interactions, variable selection procedures in logistic regressions are commonly used. The empirical evidence of variable selection for testing interactions in logistic regressions is limited. This simulation study was designed to compare nine variable selection procedures in logistic regressions for testing SNP-SNP interactions. Data on 10 SNPs were simulated for 400 and 1000 subjects (case/control ratio=1). The simulated model included one main effect and two 2-way interactions. The variable selection procedures included automatic selection (stepwise, forward and backward), common 2-step selection, AIC- and BIC-based selection. The hierarchical rule effect, in which all main effects and lower order terms of the highest-order interaction term are included in the model regardless of their statistical significance, was also examined. We found that the stepwise variable selection without the hierarchical rule, which had reasonably high authentic (true positive) proportion and low noise (false positive) proportion, is a better method compared to other variable selection procedures. The procedure without the hierarchical rule requires fewer terms in testing interactions, so it can accommodate more SNPs than the procedure with the hierarchical rule. For testing interactions, the procedures without the hierarchical rule had higher authentic proportion and lower noise proportion compared with ones with the hierarchical rule. These variable selection procedures were also applied and compared in a rheumatoid arthritis study. PMID:18231122

  10. A comparative study on entrepreneurial attitudes modeled with logistic regression and Bayes nets.

    PubMed

    López Puga, Jorge; García García, Juan

    2012-11-01

    Entrepreneurship research is receiving increasing attention in our context, as entrepreneurs are key social agents involved in economic development. We compare the success of the dichotomic logistic regression model and the Bayes simple classifier to predict entrepreneurship, after manipulating the percentage of missing data and the level of categorization in predictors. A sample of undergraduate university students (N = 1230) completed five scales (motivation, attitude towards business creation, obstacles, deficiencies, and training needs) and we found that each of them predicted different aspects of the tendency to business creation. Additionally, our results show that the receiver operating characteristic (ROC) curve is affected by the rate of missing data in both techniques, but logistic regression seems to be more vulnerable when faced with missing data, whereas Bayes nets underperform slightly when categorization has been manipulated. Our study sheds light on the potential entrepreneur profile and we propose to use Bayesian networks as an additional alternative to overcome the weaknesses of logistic regression when missing data are present in applied research.

  11. Regularized logistic regression with adjusted adaptive elastic net for gene selection in high dimensional cancer classification.

    PubMed

    Algamal, Zakariya Yahya; Lee, Muhammad Hisyam

    2015-12-01

    Cancer classification and gene selection in high-dimensional data have been popular research topics in genetics and molecular biology. Recently, adaptive regularized logistic regression using the elastic net regularization, which is called the adaptive elastic net, has been successfully applied in high-dimensional cancer classification to tackle both estimating the gene coefficients and performing gene selection simultaneously. The adaptive elastic net originally used elastic net estimates as the initial weight, however, using this weight may not be preferable for certain reasons: First, the elastic net estimator is biased in selecting genes. Second, it does not perform well when the pairwise correlations between variables are not high. Adjusted adaptive regularized logistic regression (AAElastic) is proposed to address these issues and encourage grouping effects simultaneously. The real data results indicate that AAElastic is significantly consistent in selecting genes compared to the other three competitor regularization methods. Additionally, the classification performance of AAElastic is comparable to the adaptive elastic net and better than other regularization methods. Thus, we can conclude that AAElastic is a reliable adaptive regularized logistic regression method in the field of high-dimensional cancer classification.

  12. Prevalence and Predictive Factors of Sexual Dysfunction in Iranian Women: Univariate and Multivariate Logistic Regression Analyses

    PubMed Central

    Direkvand-Moghadam, Ashraf; Suhrabi, Zainab; Akbari, Malihe

    2016-01-01

    Background Female sexual dysfunction, which can occur during any stage of normal sexual activity, is a serious condition for individuals and couples. The present study aimed to determine the prevalence and predictive factors of female sexual dysfunction in women referred to health centers in Ilam, western Iran, in 2014. Methods In the present cross-sectional study, 444 women who attended health centers in Ilam were enrolled from May to September 2014. Participants were selected according to the simple random sampling method. Univariate and multivariate logistic regression analyses were used to predict the risk factors of female sexual dysfunction. Differences with an alpha error of 0.05 were regarded as statistically significant. Results Overall, 75.9% of the study population exhibited sexual dysfunction. Univariate logistic regression analysis demonstrated that there was a significant association between female sexual dysfunction and age, menarche age, gravidity, parity, and education (P<0.05). Multivariate logistic regression analysis indicated that menarche age (odds ratio, 1.26), education level (odds ratio, 1.71), and gravida (odds ratio, 1.59) were independent predictive variables for female sexual dysfunction. Conclusion The majority of Iranian women suffer from sexual dysfunction. A lack of awareness of Iranian women's sexual pleasure and formal training on sexual function and its influencing factors, such as menarche age, gravida, and level of education, may lead to a high prevalence of female sexual dysfunction. PMID:27688863
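
    The odds ratios reported above relate to logistic regression coefficients through OR = exp(beta). The short sketch below only converts the three published odds ratios back to the log-odds scale. The dictionary keys and the function name are illustrative, and the abstract does not state the coding or units of each predictor, so the coefficients should be read as "per unit of whatever coding the authors used".

```python
import math

# Adjusted odds ratios as reported in the abstract.
odds_ratios = {
    "menarche age": 1.26,
    "education level": 1.71,
    "gravida": 1.59,
}


def log_odds_coefficient(odds_ratio):
    """Return the logistic regression coefficient beta with exp(beta) == odds_ratio."""
    return math.log(odds_ratio)


# Coefficients on the log-odds scale, one per reported predictor.
coefficients = {name: log_odds_coefficient(value) for name, value in odds_ratios.items()}

for name, beta in coefficients.items():
    print(f"{name}: OR = {odds_ratios[name]:.2f}, beta = {beta:.3f}")
```

    An odds ratio above 1 corresponds to a positive coefficient, so all three predictors are associated with higher odds of the outcome in the fitted model.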

  13. Application of likelihood ratio and logistic regression models to landslide susceptibility mapping using GIS.

    PubMed

    Lee, Saro

    2004-08-01

    For landslide susceptibility mapping, this study applied and verified a Bayesian probability model, a likelihood ratio and statistical model, and logistic regression to Janghung, Korea, using a Geographic Information System (GIS). Landslide locations were identified in the study area from interpretation of IRS satellite imagery and field surveys; and a spatial database was constructed from topographic maps, soil type, forest cover, geology and land cover. The factors that influence landslide occurrence, such as slope gradient, slope aspect, and curvature of topography, were calculated from the topographic database. Soil texture, material, drainage, and effective depth were extracted from the soil database, while forest type, diameter, and density were extracted from the forest database. Land cover was classified from Landsat TM satellite imagery using unsupervised classification. The likelihood ratio and logistic regression coefficient were overlaid to determine each factor's rating for landslide susceptibility mapping. Then the landslide susceptibility map was verified and compared with known landslide locations. The logistic regression model had higher prediction accuracy than the likelihood ratio model. The method can be used to reduce hazards associated with landslides and in land cover planning.

  14. Graphical assessment of internal and external calibration of logistic regression models by using loess smoothers.

    PubMed

    Austin, Peter C; Steyerberg, Ewout W

    2014-02-10

    Predicting the probability of the occurrence of a binary outcome or condition is important in biomedical research. While assessing discrimination is an essential issue in developing and validating binary prediction models, less attention has been paid to methods for assessing model calibration. Calibration refers to the degree of agreement between observed and predicted probabilities and is often assessed by testing for lack-of-fit. The objective of our study was to examine the ability of graphical methods to assess the calibration of logistic regression models. We examined lack of internal calibration, which was related to misspecification of the logistic regression model, and external calibration, which was related to an overfit model or to shrinkage of the linear predictor. We conducted an extensive set of Monte Carlo simulations with a locally weighted least squares regression smoother (i.e., the loess algorithm) to examine the ability of graphical methods to assess model calibration. We found that loess-based methods were able to provide evidence of moderate departures from linearity and indicate omission of a moderately strong interaction. Misspecification of the link function was harder to detect. Visual patterns were clearer with higher sample sizes, higher incidence of the outcome, or higher discrimination. Loess-based methods were also able to identify the lack of calibration in external validation samples when an overfit regression model had been used. In conclusion, loess-based smoothing methods are adequate tools to graphically assess calibration and merit wider application.
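
    A graphical calibration check of the kind described above smooths the observed binary outcomes against the predicted probabilities and compares the smooth with the diagonal. The sketch below is a simplified, dependency-free local linear smoother with tricube weights and no robustness iterations. It is not the authors' implementation, and the function names, the span value, and the toy data are assumptions made for illustration.

```python
def tricube(u):
    """Tricube kernel weight for a scaled distance u in [0, 1]."""
    return (1.0 - u ** 3) ** 3 if u < 1.0 else 0.0


def loess_calibration_curve(predicted, observed, grid, span=0.75):
    """Smooth observed outcomes against predicted probabilities.

    For each grid point, fit a tricube-weighted straight line to the
    nearest `span` fraction of the data and evaluate it at that point.
    """
    n = len(predicted)
    k = max(2, min(n, int(round(span * n))))  # neighbours used per local fit
    curve = []
    for x0 in grid:
        nearest = sorted((abs(p - x0), p, y) for p, y in zip(predicted, observed))[:k]
        d_max = nearest[-1][0]
        weights = [tricube(d / d_max) if d_max > 0 else 1.0 for d, _, _ in nearest]
        if sum(weights) == 0.0:  # all neighbours equally far: fall back to equal weights
            weights = [1.0] * len(nearest)
        total = sum(weights)
        mean_p = sum(w * p for w, (_, p, _) in zip(weights, nearest)) / total
        mean_y = sum(w * y for w, (_, _, y) in zip(weights, nearest)) / total
        s_pp = sum(w * (p - mean_p) ** 2 for w, (_, p, _) in zip(weights, nearest))
        s_py = sum(w * (p - mean_p) * (y - mean_y) for w, (_, p, y) in zip(weights, nearest))
        slope = s_py / s_pp if s_pp > 0 else 0.0
        curve.append(mean_y + slope * (x0 - mean_p))
    return curve


# Hypothetical predicted probabilities and observed 0/1 outcomes.
toy_predicted = [0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9]
toy_observed = [0, 0, 0, 1, 0, 1, 1, 1, 1]
toy_grid = [0.25, 0.5, 0.75]
for x0, smoothed in zip(toy_grid, loess_calibration_curve(toy_predicted, toy_observed, toy_grid)):
    print(f"predicted {x0:.2f} -> smoothed observed {smoothed:.2f}")
```

    A well calibrated model gives a smoothed curve close to the line of identity. Systematic departures above or below that line are the visual patterns the abstract refers to.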

  15. Multiple Imputation of a Randomly Censored Covariate Improves Logistic Regression Analysis.

    PubMed

    Atem, Folefac D; Qian, Jing; Maye, Jacqueline E; Johnson, Keith A; Betensky, Rebecca A

    2016-01-01

    Randomly censored covariates arise frequently in epidemiologic studies. The most commonly used methods, including complete case and single imputation or substitution, suffer from inefficiency and bias. They make strong parametric assumptions or they consider limit of detection censoring only. We employ multiple imputation, in conjunction with semi-parametric modeling of the censored covariate, to overcome these shortcomings and to facilitate robust estimation. We develop a multiple imputation approach for randomly censored covariates within the framework of a logistic regression model. We use the non-parametric estimate of the covariate distribution or the semiparametric Cox model estimate in the presence of additional covariates in the model. We evaluate this procedure in simulations, and compare its operating characteristics to those from the complete case analysis and a survival regression approach. We apply the procedures to an Alzheimer's study of the association between amyloid positivity and maternal age of onset of dementia. Multiple imputation achieves lower standard errors and higher power than the complete case approach under heavy and moderate censoring and is comparable under light censoring. The survival regression approach achieves the highest power among all procedures, but does not produce interpretable estimates of association. Multiple imputation offers a favorable alternative to complete case analysis and ad hoc substitution methods in the presence of randomly censored covariates within the framework of logistic regression.

  16. A general framework for the use of logistic regression models in meta-analysis.

    PubMed

    Simmonds, Mark C; Higgins, Julian Pt

    2016-12-01

    Where individual participant data are available for every randomised trial in a meta-analysis of dichotomous event outcomes, "one-stage" random-effects logistic regression models have been proposed as a way to analyse these data. Such models can also be used even when individual participant data are not available and we have only summary contingency table data. One benefit of this one-stage regression model over conventional meta-analysis methods is that it maximises the correct binomial likelihood for the data and so does not require the common assumption that effect estimates are normally distributed. A second benefit of using this model is that it may be applied, with only minor modification, in a range of meta-analytic scenarios, including meta-regression, network meta-analyses and meta-analyses of diagnostic test accuracy. This single model can potentially replace the variety of often complex methods used in these areas. This paper considers, with a range of meta-analysis examples, how random-effects logistic regression models may be used in a number of different types of meta-analyses. This one-stage approach is compared with widely used meta-analysis methods including Bayesian network meta-analysis and the bivariate and hierarchical summary receiver operating characteristic (ROC) models for meta-analyses of diagnostic test accuracy.

  17. The assessment of groundwater nitrate contamination by using logistic regression model in a representative rural area

    NASA Astrophysics Data System (ADS)

    Ko, K.; Cheong, B.; Koh, D.

    2010-12-01

    Groundwater is a main source of drinking water in rural areas of Korea that have no regional potable water supply system. More than 50 percent of rural residents depend on groundwater for drinking water. Research on predicting groundwater pollution is therefore needed for sustainable groundwater use and protection from potential pollutants. This study was carried out to assess the vulnerability of groundwater to nitrate contamination, reflecting the effect of land use, in Nonsan city, a representative rural area of South Korea. About 47% of the study area is occupied by cultivated land that is highly vulnerable to groundwater nitrate contamination because its nitrogen fertilizer input of 62.3 tons/km2 is higher than the country’s average of 44.0 tons/km2. Two vulnerability assessment methods, logistic regression and the DRASTIC model, were tested and compared to identify the more suitable technique for assessing groundwater nitrate contamination in the Nonsan area. The groundwater quality data were compiled from analyses of 111 samples from small potable supply systems in the study area. The measured nitrate values were classified by land use: resident, upland, paddy, and field area. One dependent and two independent variables were used in the logistic regression analysis. The dependent variable was binary (0 or 1), indicating whether nitrate exceeded thresholds of 1 through 10 mg/L. The independent variables were slope, a continuous variable indicating topography, and land use, a categorical variable with the classes resident, upland, paddy, and field area. Levene’s test and the t-test for slope and land use showed significant differences in mean values among groups at the 95% confidence level. The logistic regression showed a negative correlation between slope and nitrate, which was caused by the decrease of contaminant inputs into groundwater with

  18. Comprehensible Predictive Modeling Using Regularized Logistic Regression and Comorbidity Based Features

    PubMed Central

    Stiglic, Gregor; Povalej Brzan, Petra; Fijacko, Nino; Wang, Fei; Delibasic, Boris; Kalousis, Alexandros; Obradovic, Zoran

    2015-01-01

    Different studies have demonstrated the importance of comorbidities to better understand the origin and evolution of medical complications. This study focuses on improvement of the predictive model interpretability based on simple logical features representing comorbidities. We use group lasso based feature interaction discovery followed by a post-processing step, where simple logic terms are added. In the final step, we reduce the feature set by applying lasso logistic regression to obtain a compact set of non-zero coefficients that represent a more comprehensible predictive model. The effectiveness of the proposed approach was demonstrated on a pediatric hospital discharge dataset that was used to build a readmission risk estimation model. The evaluation of the proposed method demonstrates a reduction of the initial set of features in a regression model by 72%, with a slight improvement in the Area Under the ROC Curve metric from 0.763 (95% CI: 0.755–0.771) to 0.769 (95% CI: 0.761–0.777). Additionally, our results show improvement in comprehensibility of the final predictive model using simple comorbidity based terms for logistic regression. PMID:26645087

  19. A comparison of discriminant analysis and logistic regression for the prediction of coliform mastitis in dairy cows.

    PubMed Central

    Montgomery, M E; White, M E; Martin, S W

    1987-01-01

    Results from discriminant analysis and logistic regression were compared using two data sets from a study on predictors of coliform mastitis in dairy cows. Both techniques selected the same set of variables as important predictors and were of nearly equal value in classifying cows as having or not having mastitis. The logistic regression model made fewer classification errors. The magnitudes of the effects were considerably different for some variables. Given the failure to meet the underlying assumptions of discriminant analysis, the coefficients from logistic regression are preferable. PMID:3453271

  20. A logistic regression equation for estimating the probability of a stream in Vermont having intermittent flow

    USGS Publications Warehouse

    Olson, Scott A.; Brouillette, Michael C.

    2006-01-01

    A logistic regression equation was developed for estimating the probability of a stream flowing intermittently at unregulated, rural stream sites in Vermont. These determinations can be used for a wide variety of regulatory and planning efforts at the Federal, State, regional, county and town levels, including such applications as assessing fish and wildlife habitats, wetlands classifications, recreational opportunities, water-supply potential, waste-assimilation capacities, and sediment transport. The equation will be used to create a derived product for the Vermont Hydrography Dataset having the streamflow characteristic of 'intermittent' or 'perennial.' The Vermont Hydrography Dataset is Vermont's implementation of the National Hydrography Dataset and was created at a scale of 1:5,000 based on statewide digital orthophotos. The equation was developed by relating field-verified perennial or intermittent status of a stream site during normal summer low-streamflow conditions in the summer of 2005 to selected basin characteristics of naturally flowing streams in Vermont. The database used to develop the equation included 682 stream sites with drainage areas ranging from 0.05 to 5.0 square miles. When the 682 sites were observed, 126 were intermittent (had no flow at the time of the observation) and 556 were perennial (had flowing water at the time of the observation). The results of the logistic regression analysis indicate that the probability of a stream having intermittent flow in Vermont is a function of drainage area, elevation of the site, the ratio of basin relief to basin perimeter, and the areal percentage of well- and moderately well-drained soils in the basin. Using a probability cutpoint (a lower probability indicates the site has perennial flow and a higher probability indicates the site has intermittent flow) of 0.5, the logistic regression equation correctly predicted the perennial or intermittent status of 116 test sites 85 percent of the time.
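
    The cutpoint rule described above can be sketched in a few lines. The abstract does not give the fitted coefficients, so the example takes a linear predictor as input rather than computing one from basin characteristics. The function names, and the choice to label a probability exactly equal to the cutpoint as intermittent, are assumptions.

```python
import math


def logistic(linear_predictor):
    """Map a linear predictor (log-odds) to a probability."""
    return 1.0 / (1.0 + math.exp(-linear_predictor))


def classify_stream(probability, cutpoint=0.5):
    """Apply the probability cutpoint from the abstract.

    Assumption: a probability exactly at the cutpoint is labeled intermittent.
    """
    return "intermittent" if probability >= cutpoint else "perennial"


# Observed class counts reported for the development data set.
observed_counts = {"intermittent": 126, "perennial": 556}
total_sites = sum(observed_counts.values())
intermittent_share = observed_counts["intermittent"] / total_sites

print(f"{total_sites} sites, {intermittent_share:.1%} intermittent")
print(classify_stream(logistic(-2.0)))  # hypothetical log-odds below zero
print(classify_stream(logistic(1.5)))   # hypothetical log-odds above zero
```

    Because fewer than one in five development sites were intermittent, a fixed 0.5 cutpoint is a design choice worth noting: a different cutpoint would trade off errors on the two classes differently.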

  1. A secure distributed logistic regression protocol for the detection of rare adverse drug events

    PubMed Central

    El Emam, Khaled; Samet, Saeed; Arbuckle, Luk; Tamblyn, Robyn; Earle, Craig; Kantarcioglu, Murat

    2013-01-01

    Background There is limited capacity to assess the comparative risks of medications after they enter the market. For rare adverse events, the pooling of data from multiple sources is necessary to have the power and sufficient population heterogeneity to detect differences in safety and effectiveness in genetic, ethnic and clinically defined subpopulations. However, combining datasets from different data custodians or jurisdictions to perform an analysis on the pooled data creates significant privacy concerns that would need to be addressed. Existing protocols for addressing these concerns can result in reduced analysis accuracy and can allow sensitive information to leak. Objective To develop a secure distributed multi-party computation protocol for logistic regression that provides strong privacy guarantees. Methods We developed a secure distributed logistic regression protocol using a single analysis center with multiple sites providing data. A theoretical security analysis demonstrates that the protocol is robust to plausible collusion attacks and does not allow the parties to gain new information from the data that are exchanged among them. The computational performance and accuracy of the protocol were evaluated on simulated datasets. Results The computational performance scales linearly as the dataset sizes increase. The addition of sites results in an exponential growth in computation time. However, for up to five sites, the time is still short and would not affect practical applications. The model parameters are the same as the results on pooled raw data analyzed in SAS, demonstrating high model accuracy. Conclusion The proposed protocol and prototype system would allow the development of logistic regression models in a secure manner without requiring the sharing of personal health information. This can alleviate one of the key barriers to the establishment of large-scale post-marketing surveillance programs. 
We extended the secure protocol to account for

  2. Mining pharmacovigilance data using Bayesian logistic regression with James-Stein type shrinkage estimation.

    PubMed

    An, Lihua; Fung, Karen Y; Krewski, Daniel

    2010-09-01

    Spontaneous adverse event reporting systems are widely used to identify adverse reactions to drugs following their introduction into the marketplace. In this article, a James-Stein type shrinkage estimation strategy was developed in a Bayesian logistic regression model to analyze pharmacovigilance data. This method is effective in detecting signals as it combines information and borrows strength across medically related adverse events. Computer simulation demonstrated that the shrinkage estimator is uniformly better than the maximum likelihood estimator in terms of mean squared error. This method was used to investigate the possible association of a series of diabetic drugs and the risk of cardiovascular events using data from the Canada Vigilance Online Database.

  3. Semi-parametric estimation of random effects in a logistic regression model using conditional inference.

    PubMed

    Petersen, Jørgen Holm

    2016-01-15

    This paper describes a new approach to the estimation in a logistic regression model with two crossed random effects where special interest is in estimating the variance of one of the effects while not making distributional assumptions about the other effect. A composite likelihood is studied. For each term in the composite likelihood, a conditional likelihood is used that eliminates the influence of the random effects, which results in a composite conditional likelihood consisting of only one-dimensional integrals that may be solved numerically. Good properties of the resulting estimator are described in a small simulation study.

  4. Analyzing Beijing's in-use vehicle emissions test results using logistic regression.

    PubMed

    Chang, Cheng; Ortolano, Leonard

    2008-10-01

    A logistic regression model was built using vehicle emissions test data collected in 2003 for 129 604 motor vehicles in Beijing. The regression model uses vehicle model, model year, inspection station, ownership, and vehicle registration area as covariates to predict the probability that a vehicle fails an annual emissions test on the first try. Vehicle model is the most influential predictor variable: some vehicle models are much more likely to fail in emissions tests than an "average" vehicle. Five out of 14 vehicle models that performed the worst (out of a total of 52 models) were manufactured by foreign companies or by their joint ventures with Chinese enterprises. These 14 vehicle model types may have failed at relatively high rates because of design and manufacturing deficiencies, and such deficiencies cannot be easily detected and corrected without further efforts, such as programs for in-use surveillance and vehicle recall.

  5. Comparative study of biodegradability prediction of chemicals using decision trees, functional trees, and logistic regression.

    PubMed

    Chen, Guangchao; Li, Xuehua; Chen, Jingwen; Zhang, Ya-Nan; Peijnenburg, Willie J G M

    2014-12-01

    Biodegradation is the principal environmental dissipation process of chemicals. As such, it is a dominant factor determining the persistence and fate of organic chemicals in the environment, and is therefore of critical importance to chemical management and regulation. In the present study, the authors developed in silico methods assessing biodegradability based on a large heterogeneous set of 825 organic compounds, using the techniques of the C4.5 decision tree, the functional inner regression tree, and logistic regression. External validation was subsequently carried out by 2 independent test sets of 777 and 27 chemicals. As a result, the functional inner regression tree exhibited the best predictability with predictive accuracies of 81.5% and 81.0%, respectively, on the training set (825 chemicals) and test set I (777 chemicals). Performance of the developed models on the 2 test sets was subsequently compared with that of the Estimation Program Interface (EPI) Suite Biowin 5 and Biowin 6 models, which also showed a better predictability of the functional inner regression tree model. The model built in the present study exhibits a reasonable predictability compared with existing models while possessing a transparent algorithm. Interpretation of the mechanisms of biodegradation was also carried out based on the models developed.

  6. Simultaneous confidence bands for log-logistic regression with applications in risk assessment.

    PubMed

    Kerns, Lucy X

    2017-01-27

    In risk assessment, it is often desired to make inferences on the low dose levels at which a specific benchmark risk is attained. Applications of simultaneous hyperbolic confidence bands for low-dose risk estimation with quantal data under different dose-response models (multistage, Abbott-adjusted Weibull, and Abbott-adjusted log-logistic models) have appeared in the literature. The use of simultaneous three-segment bands under the multistage model has also been proposed recently. In this article, we present explicit formulas for constructing asymptotic one-sided simultaneous hyperbolic and three-segment bands for the simple log-logistic regression model. We use the simultaneous construction to estimate upper hyperbolic and three-segment confidence bands on extra risk and to obtain lower limits on the benchmark dose by inverting the upper bands on risk under the Abbott-adjusted log-logistic model. Monte Carlo simulations evaluate the characteristics of the simultaneous limits. An example is given to illustrate the use of the proposed methods and to compare the two types of simultaneous limits at very low dose levels.

  7. Sub-resolution assist feature (SRAF) printing prediction using logistic regression

    NASA Astrophysics Data System (ADS)

    Tan, Chin Boon; Koh, Kar Kit; Zhang, Dongqing; Foong, Yee Mei

    2015-03-01

    In optical proximity correction (OPC), the sub-resolution assist feature (SRAF) has been used to enhance the process window of main structures. However, the printing of SRAF on wafer is undesirable as this may adversely degrade the overall process yield if it is transferred into the final pattern. A reasonably accurate prediction model is needed during OPC to ensure that the SRAF placement and size have no risk of SRAF printing. Current common practice in OPC is either using the main OPC model or model threshold adjustment (MTA) solution to predict the SRAF printing. This paper studies the feasibility of SRAF printing prediction using logistic regression (LR). Logistic regression is a probabilistic classification model that gives discrete binary outputs after receiving sufficient input variables from SRAF printing conditions. In the application of SRAF printing prediction, the binary outputs can be treated as 1 for SRAF-Printing and 0 for No-SRAF-Printing. The experimental work was performed using a 20nm line/space process layer. The results demonstrate that the LR approach predicts SRAF printing more accurately than the MTA solution. Overall error rates as low as 2% in calibration and 5% in verification were achieved by the LR approach, compared to 6% in calibration and 15% in verification for the MTA solution. In addition, the performance of the LR approach was found to be relatively independent and consistent across different resist image planes compared to the MTA solution.

  8. Predicting students' success at pre-university studies using linear and logistic regressions

    NASA Astrophysics Data System (ADS)

    Suliman, Noor Azizah; Abidin, Basir; Manan, Norhafizah Abdul; Razali, Ahmad Mahir

    2014-09-01

    The study aims to find the most suitable model for predicting students' success at the medical pre-university studies, Centre for Foundation in Science, Languages and General Studies of Cyberjaya University College of Medical Sciences (CUCMS). The predictors under investigation were the national high school exit examination (Sijil Pelajaran Malaysia, SPM) achievements, such as Biology, Chemistry, Physics, Additional Mathematics, Mathematics, English and Bahasa Malaysia results, as well as gender and high school background factors. The outcomes showed a significant difference by gender in the final CGPA and in the Biology and Mathematics subjects at pre-university, and a significant difference by high school background in the Mathematics subject. In general, the correlation between academic achievement at high school and at medical pre-university is moderately significant at an α-level of 0.05, except for language subjects. It was also found that logistic regression techniques gave better prediction models than the multiple linear regression technique for this data set. The developed logistic models gave probabilities that closely matched the observed outcomes. Hence, they could be used to identify successful students who are qualified to enter the CUCMS medical faculty before accepting any students to its foundation program.

  9. Application of ordinal logistic regression analysis in determining risk factors of child malnutrition in Bangladesh

    PubMed Central

    2011-01-01

    Background The study attempts to develop an ordinal logistic regression (OLR) model to identify the determinants of child malnutrition, instead of the traditional binary logistic regression (BLR) model, using data from the Bangladesh Demographic and Health Survey 2004. Methods Based on the weight-for-age anthropometric index (Z-score), child nutrition status is categorized into three groups: severely undernourished (< -3.0), moderately undernourished (-3.0 to -2.01) and nourished (≥-2.0). Since nutrition status is ordinal, an OLR model, the proportional odds model (POM), can be developed instead of two separate BLR models to find predictors of both malnutrition and severe malnutrition if the proportional odds assumption is satisfied. The assumption is satisfied with low p-value (0.144) due to violation of the assumption for one covariate. So a partial proportional odds model (PPOM) and two BLR models have also been developed to check the applicability of the OLR model. A graphical test has also been adopted for checking the proportional odds assumption. Results All the models determine that age of child, birth interval, mothers' education, maternal nutrition, household wealth status, child feeding index, and incidence of fever, ARI & diarrhoea were the significant predictors of child malnutrition; however, results of PPOM were more precise than those of other models. Conclusion These findings clearly justify that OLR models (POM and PPOM) are appropriate to find predictors of malnutrition instead of BLR models. PMID:22082256
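
    The three-level outcome and the two binary outcomes it replaces can be sketched as follows. The cut-offs come from the abstract. The function names are illustrative, and the code treats the Z-score as continuous, so the "-3.0 to -2.01" band is implemented as -3.0 <= z < -2.0, which is an assumption about how values between -2.01 and -2.0 are handled.

```python
def nutrition_status(z_score):
    """Ordinal category from the weight-for-age Z-score cut-offs in the abstract."""
    if z_score < -3.0:
        return "severely undernourished"
    if z_score < -2.0:
        return "moderately undernourished"
    return "nourished"


def cumulative_binary_outcomes(z_score):
    """The two binary outcomes that separate BLR models would use.

    The proportional odds model fits both splits at once and assumes each
    covariate has the same odds ratio across the two splits.
    """
    return {
        "malnourished": int(z_score < -2.0),
        "severely_malnourished": int(z_score < -3.0),
    }


# Hypothetical Z-scores, one per category boundary case.
for z in (-3.4, -3.0, -2.5, -2.0, 0.3):
    print(z, nutrition_status(z), cumulative_binary_outcomes(z))
```

    The partial proportional odds model mentioned in the abstract relaxes the equal-odds-ratio assumption for the covariate that violates it while keeping it for the others.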

  10. An application of a microcomputer compiler program to multiple logistic regression analysis.

    PubMed

    Sakai, R

    1988-01-01

    Microcomputer programs for multiple logistic regression analysis were written in BASIC language to determine the usefulness of microcomputers for multivariate analysis, which is an important method in epidemiological studies. The program, carried out by an interpreter system, required a comparatively long computing time for a small amount of data. For example, it took approximately thirty minutes to compute the data of 6 independent variables and 63 matched sets of case and controls (1:4). The majority of the calculation time was spent computing a matrix. The matrix computation time increased cumulatively in proportion to additions in the number of subjects, and increased exponentially with the number of variables. A BASIC compiler was utilized for the program of multiple logistic regression analysis. The compiled program carried out the same computations as above, but within 4 minutes. Therefore, it is evident that a compiler can be an extremely convenient tool for computing multivariate analysis. The two programs produced here were also easily linked with spreadsheet packages to enter data.

  11. Ordinal logistic regression analysis on the nutritional status of children in KarangKitri village

    NASA Astrophysics Data System (ADS)

    Ohyver, Margaretha; Yongharto, Kimmy Octavian

    2015-09-01

    Ordinal logistic regression is a statistical technique that can be used to describe the relationship between an ordinal response variable and one or more independent variables. This method has been used in various fields, including health. In this research, ordinal logistic regression is used to describe the relationship between the nutritional status of children and age, gender, height, and family status. Nutritional status in this research is divided into over nutrition, well nutrition, less nutrition, and malnutrition. The purpose of this research is to describe the characteristics of children in KarangKitri village and to determine the factors that influence their nutritional status. Three findings were obtained. First, there are still children who are not categorized as having a well nutritional status. Second, some children from families of sufficient economic level are in a not-normal nutritional status. Third, the factors that affect the nutritional level of children are age, family status, and height.

  12. Short Term Forecasts of Volcanic Activity Using An Event Tree Analysis System and Logistic Regression

    NASA Astrophysics Data System (ADS)

    Junek, W. N.; Jones, W. L.; Woods, M. T.

    2011-12-01

    An automated event tree analysis system for estimating the probability of short term volcanic activity is presented. The algorithm is driven by a suite of empirical statistical models that are derived through logistic regression. Each model is constructed from a multidisciplinary dataset that was assembled from a collection of historic volcanic unrest episodes. The dataset consists of monitoring measurements (e.g. InSAR, seismic), source modeling results, and historic eruption activity. This provides a simple mechanism for simultaneously accounting for the geophysical changes occurring within the volcano and the historic behavior of analog volcanoes. The algorithm is extensible and can be easily recalibrated to include new or additional monitoring, modeling, or historic information. Standard cross validation techniques are employed to optimize its forecasting capabilities. Analysis results from several recent volcanic unrest episodes are presented.

  13. Modeling data for pancreatitis in presence of a duodenal diverticula using logistic regression

    NASA Astrophysics Data System (ADS)

    Dineva, S.; Prodanova, K.; Mlachkova, D.

    2013-12-01

    The presence of a periampullary duodenal diverticulum (PDD) is often observed during upper digestive tract barium meal studies and endoscopic retrograde cholangiopancreatography (ERCP). A few papers reported that the diverticulum had something to do with the incidence of pancreatitis. The aim of this study is to investigate whether the presence of duodenal diverticula predisposes to the development of pancreatic disease. A total of 3966 patients who had undergone ERCP were studied retrospectively. They were divided into 2 groups: with and without PDD. Patients with a duodenal diverticulum had a higher rate of acute pancreatitis. A duodenal diverticulum is a risk factor for acute idiopathic pancreatitis. A multiple logistic regression was performed to obtain adjusted estimates of odds and to identify whether a PDD is a predictor of acute or chronic pancreatitis. The software package STATISTICA 10.0 was used for analyzing the real data.

  14. Binary logistic regression modelling: Measuring the probability of relapse cases among drug addict

    NASA Astrophysics Data System (ADS)

    Ismail, Mohd Tahir; Alias, Siti Nor Shadila

    2014-07-01

    For many years Malaysia has faced drug addiction issues. The most serious problem is relapse among treated drug addicts (those who have undergone the rehabilitation programme at the Narcotic Addiction Rehabilitation Centre, PUSPEN). Thus, the main objective of this study is to find the most significant factor that contributes to relapse. Binary logistic regression analysis was employed to model the relationship between the independent variables (predictors) and the dependent variable. The dependent variable is the status of the drug addict: relapse (Yes, coded as 1) or no relapse (No, coded as 0). The predictors are age, age at first drug use, family history, education level, family crisis, community support and self-motivation. The sample comprises 200 cases, with data provided by AADK (National Antidrug Agency). The study found that age and self-motivation are statistically significant predictors of relapse.

  15. Using GIS and logistic regression to estimate agricultural chemical concentrations in rivers of the midwestern USA

    USGS Publications Warehouse

    Battaglin, W.A.

    1996-01-01

    Agricultural chemicals (herbicides, insecticides, other pesticides and fertilizers) in surface water may constitute a human health risk. Recent research on unregulated rivers in the midwestern USA documents that elevated concentrations of herbicides occur for 1-4 months following application in spring and early summer. In contrast, nitrate concentrations in unregulated rivers are elevated during the fall, winter and spring. Natural and anthropogenic variables of river drainage basins, such as soil permeability, the amount of agricultural chemicals applied or percentage of land planted in corn, affect agricultural chemical concentrations in rivers. Logistic regression (LGR) models are used to investigate relations between various drainage basin variables and the concentration of selected agricultural chemicals in rivers. The method is successful in contributing to the understanding of agricultural chemical concentration in rivers. Overall accuracies of the best LGR models, defined as the number of correct classifications divided by the number of attempted classifications, averaged about 66%.
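
The accuracy measure defined in this abstract (correct classifications divided by attempted classifications) can be sketched in a few lines. The labels below are hypothetical and are not taken from the study; they only illustrate the arithmetic.

```python
def overall_accuracy(predicted, observed):
    """Fraction of attempted classifications that were correct."""
    if len(predicted) != len(observed):
        raise ValueError("predicted and observed must have the same length")
    if not predicted:
        raise ValueError("no classifications were attempted")
    correct = sum(p == o for p, o in zip(predicted, observed))
    return correct / len(predicted)


# Hypothetical labels: 1 = concentration above a chosen threshold, 0 = below.
observed = [1, 0, 1, 1, 0, 0]
predicted = [1, 0, 0, 1, 0, 1]

accuracy = overall_accuracy(predicted, observed)  # 4 of the 6 labels agree
print(f"overall accuracy: {accuracy:.2f}")
```

An overall accuracy of about 66%, as reported above, would correspond to roughly two correct classifications out of every three attempted.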

  16. Confidence interval of difference of proportions in logistic regression in presence of covariates.

    PubMed

    Reeve, Russell

    2016-03-16

    Comparison of treatment differences in incidence rates is an important objective of many clinical trials. However, often the proportion is affected by covariates, and the adjustment of the predicted proportion is made using logistic regression. It is desirable to estimate the treatment differences in proportions adjusting for the covariates, similarly to the comparison of adjusted means in analysis of variance. Because of the correlation between the point estimates in the different treatment groups, the standard methods for constructing confidence intervals are inadequate. The problem is more difficult in the binary case, as the comparison is not uniquely defined, and the sampling distribution more difficult to analyze. Four procedures for analyzing the data are presented, which expand upon existing methods and generalize the link function. It is shown that, among the four methods studied, the resampling method based on the exact distribution function yields a coverage rate closest to the nominal.

  17. A modified approach to estimating sample size for simple logistic regression with one continuous covariate.

    PubMed

    Novikov, I; Fund, N; Freedman, L S

    2010-01-15

    Different methods for the calculation of sample size for simple logistic regression (LR) with one normally distributed continuous covariate give different results. Sometimes the difference can be large. Furthermore, some methods require the user to specify the prevalence of cases when the covariate equals its population mean, rather than the more natural population prevalence. We focus on two commonly used methods and show through simulations that the power for a given sample size may differ substantially from the nominal value for one method, especially when the covariate effect is large, while the other method performs poorly if the user provides the population prevalence instead of the required parameter. We propose a modification of the method of Hsieh et al. that requires specification of the population prevalence and that employs Schouten's sample size formula for a t-test with unequal variances and group sizes. This approach appears to increase the accuracy of the sample size estimates for LR with one continuous covariate.

  18. Application of Logistic Regression for Landslide Susceptibility Mapping in the Alishan Area, Southern Taiwan

    NASA Astrophysics Data System (ADS)

    Chan, H. C.; Chang, C. C.; Laio, P. Y.

    2012-04-01

    Landslide susceptibility analysis usually combines several factors, including the terrain, geology, and hydrology. The analysis tries to find a suitable combination of these factors in order to establish a landslide susceptibility model and calculate the susceptibility value. A potential landslide map can be established from the calculated landslide susceptibility values. This study took the Alishan area as an example and aimed to assess landslide susceptibility by logistic regression, a multivariate analysis method. To select the factors efficiently, calibration and selection procedures were performed. The results were verified against a previous typhoon event. The classification error matrix was used to evaluate the accuracy of the landslides predicted by the present model. Finally, this study applied precipitation with 10-, 25-, 50-, and 100-year return periods to estimate the susceptibility values for the study area. The landslide susceptibilities were separated into four levels (high, medium-high, medium, and low) to delineate the map of potential landslides.

  19. Maximum likelihood training of connectionist models: comparison with least squares back-propagation and logistic regression.

    PubMed Central

    Spackman, K. A.

    1991-01-01

    This paper presents maximum likelihood back-propagation (ML-BP), an approach to training neural networks. The widely reported original approach uses least squares back-propagation (LS-BP), minimizing the sum of squared errors (SSE). Unfortunately, least squares estimation does not give a maximum likelihood (ML) estimate of the weights in the network. Logistic regression, on the other hand, gives ML estimates for single layer linear models only. This report describes how to obtain ML estimates of the weights in a multi-layer model, and compares LS-BP to ML-BP using several examples. It shows that in many neural networks, least squares estimation gives inferior results and should be abandoned in favor of maximum likelihood estimation. Questions remain about the potential uses of multi-level connectionist models in such areas as diagnostic systems and risk-stratification in outcomes research. PMID:1807606
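
The contrast between the two training objectives can be illustrated for a single sigmoid output unit. This is a minimal sketch, not the paper's experiments: the factor 0.5 in the squared error and the example values of the target and pre-activation are assumptions chosen for illustration.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def squared_error(y, p):
    # Least squares objective for one case (0.5 factor is a convention assumed here).
    return 0.5 * (y - p) ** 2


def neg_log_likelihood(y, p):
    # Bernoulli negative log-likelihood; minimizing it gives the ML estimate.
    return -(y * math.log(p) + (1 - y) * math.log(1 - p))


def grad_squared_error(y, z):
    # d/dz of 0.5 * (y - sigmoid(z))**2
    p = sigmoid(z)
    return (p - y) * p * (1 - p)


def grad_neg_log_likelihood(y, z):
    # d/dz of the negative log-likelihood; the p * (1 - p) factor cancels.
    return sigmoid(z) - y


# Hypothetical case: the target is 1 but the unit is confidently wrong.
y, z = 1, -4.0
p = sigmoid(z)
grad_ls = grad_squared_error(y, z)
grad_ml = grad_neg_log_likelihood(y, z)
print(f"p = {p:.4f}, LS gradient = {grad_ls:.4f}, ML gradient = {grad_ml:.4f}")
```

In this example the least squares gradient is damped by the factor p(1 - p), so a confidently wrong unit receives a much smaller correction than under the likelihood-based objective.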

  20. Logistic regression analysis of pedestrian casualty risk in passenger vehicle collisions in China.

    PubMed

    Kong, Chunyu; Yang, Jikuang

    2010-07-01

    A large number of pedestrian fatalities have been reported in China since the 1990s; however, the exposure of pedestrians in public traffic has never been measured quantitatively using in-depth accident data. This study aimed to investigate the association between the impact speed and risk of pedestrian casualties in passenger vehicle collisions based on real-world accident cases in China. The cases were selected from a database of in-depth investigation of vehicle accidents in Changsha (IVAC). The sampling criteria were defined as (1) the accident was a frontal impact that occurred between 2003 and 2009; (2) the pedestrian age was above 14; (3) the injury according to the Abbreviated Injury Scale (AIS) was 1+; (4) the accident involved passenger cars, SUVs, or MPVs; and (5) the vehicle impact speed could be determined. The selected IVAC data set, which included 104 pedestrian accident cases, was weighted based on the national traffic accident data. Logistic regression models of the risks of pedestrian fatalities and AIS 3+ injuries were developed in terms of vehicle impact speed using the unweighted and weighted data sets. A multiple logistic regression model of the risk of pedestrian AIS 3+ injury was developed with age and impact speed as two variables. It was found that the risk of pedestrian fatality is 26% at 50 km/h, 50% at 58 km/h, and 82% at 70 km/h. At an impact speed of 80 km/h, the pedestrian rarely survives. The weighted risk curves indicated that the risks of pedestrian fatality and injury in China were higher than those in other high-income countries, whereas the risks of pedestrian casualty were lower than in these countries 30 years ago. The findings could contribute to a better understanding of the exposure of pedestrians in urban traffic in China, and provide background knowledge for the development of strategies for pedestrian protection.
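
The reported fatality risks can be related to a logistic risk curve with a back-of-envelope calculation. The sketch below assumes a simple logistic model in impact speed alone and solves for an intercept and slope from two of the reported points (26% at 50 km/h and 50% at 58 km/h). These are not the authors' fitted coefficients, which the abstract does not give.

```python
import math


def logit(p):
    return math.log(p / (1 - p))


def fit_two_points(x1, p1, x2, p2):
    """Intercept and slope of the logistic curve passing through two (speed, risk) points."""
    slope = (logit(p2) - logit(p1)) / (x2 - x1)
    intercept = logit(p1) - slope * x1
    return intercept, slope


def risk(speed, intercept, slope):
    """Predicted probability from the logistic curve."""
    return 1.0 / (1.0 + math.exp(-(intercept + slope * speed)))


# Two points taken from the abstract; everything derived from them is illustrative.
intercept, slope = fit_two_points(50, 0.26, 58, 0.50)
risk_at_70 = risk(70, intercept, slope)
print(f"intercept = {intercept:.3f}, slope = {slope:.4f}, risk at 70 km/h = {risk_at_70:.2f}")
```

Under these assumptions the curve gives a risk of about 0.83 at 70 km/h, which is close to the 82% reported in the abstract.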

  1. Modeling Group Size and Scalar Stress by Logistic Regression from an Archaeological Perspective

    PubMed Central

    Alberti, Gianmarco

    2014-01-01

    Johnson’s scalar stress theory, describing the mechanics of (and the remedies to) the increase in in-group conflictuality that parallels the increase in groups’ size, provides scholars with a useful theoretical framework for the understanding of different aspects of the material culture of past communities (i.e., social organization, communal food consumption, ceramic style, architecture and settlement layout). Due to its relevance in archaeology and anthropology, the article aims at proposing a predictive model of critical level of scalar stress on the basis of community size. Drawing upon Johnson’s theory and on Dunbar’s findings on the cognitive constraints on human group size, a model is built by means of logistic regression on the basis of the data on colony fissioning among the Hutterites of North America. On the grounds of the theoretical framework sketched in the first part of the article, the absence or presence of colony fissioning is considered an expression of a non-critical vs. critical level of scalar stress for the sake of the model building. The model, which is also tested against a sample of archaeological and ethnographic cases: a) confirms the existence of a significant relationship between critical scalar stress and group size, setting the issue on firmer statistical grounds; b) allows calculating the intercept and slope of the logistic regression model, which can be used at any time to estimate the probability that a community experienced a critical level of scalar stress; c) allows locating a critical scalar stress threshold at community size 127 (95% CI: 122–132), while the maximum probability of critical scalar stress is predicted at size 158 (95% CI: 147–170). The model ultimately provides grounds to assess, for the sake of any further archaeological/anthropological interpretation, the probability that a group reached a hot spot of size development critical for its internal cohesion. PMID:24626241

  2. Optimization of Game Formats in U-10 Soccer Using Logistic Regression Analysis

    PubMed Central

    Amatria, Mario; Arana, Javier; Anguera, M. Teresa; Garzón, Belén

    2016-01-01

    Small-sided games provide young soccer players with better opportunities to develop their skills and progress as individual and team players. There is, however, little evidence on the effectiveness of different game formats in different age groups, and furthermore, these formats can vary between and even within countries. The Royal Spanish Soccer Association replaced the traditional grassroots 7-a-side format (F-7) with the 8-a-side format (F-8) in the 2011-12 season and the country’s regional federations gradually followed suit. The aim of this observational methodology study was to investigate which of these formats best suited the learning needs of U-10 players transitioning from 5-a-side futsal. We built a multiple logistic regression model to predict the success of offensive moves depending on the game format and the area of the pitch in which the move was initiated. Success was defined as a shot at the goal. We also built two simple logistic regression models to evaluate how the game format influenced the acquisition of technical-tactical skills. It was found that the probability of a shot at the goal was higher in F-7 than in F-8 for moves initiated in the Creation Sector-Own Half (0.08 vs 0.07) and the Creation Sector-Opponent's Half (0.18 vs 0.16). The probability was the same (0.04) in the Safety Sector. Children also had more opportunities to control the ball and pass or take a shot in the F-7 format (0.24 vs 0.20), and these were also more likely to be successful in this format (0.28 vs 0.19). PMID:28031768

  3. Modeling group size and scalar stress by logistic regression from an archaeological perspective.

    PubMed

    Alberti, Gianmarco

    2014-01-01

    Johnson's scalar stress theory, describing the mechanics of (and the remedies to) the increase in in-group conflictuality that parallels the increase in groups' size, provides scholars with a useful theoretical framework for the understanding of different aspects of the material culture of past communities (i.e., social organization, communal food consumption, ceramic style, architecture and settlement layout). Due to its relevance in archaeology and anthropology, the article aims at proposing a predictive model of critical level of scalar stress on the basis of community size. Drawing upon Johnson's theory and on Dunbar's findings on the cognitive constraints on human group size, a model is built by means of logistic regression on the basis of the data on colony fissioning among the Hutterites of North America. On the grounds of the theoretical framework sketched in the first part of the article, the absence or presence of colony fissioning is considered an expression of a non-critical vs. critical level of scalar stress for the sake of the model building. The model, which is also tested against a sample of archaeological and ethnographic cases: a) confirms the existence of a significant relationship between critical scalar stress and group size, setting the issue on firmer statistical grounds; b) allows calculating the intercept and slope of the logistic regression model, which can be used at any time to estimate the probability that a community experienced a critical level of scalar stress; c) allows locating a critical scalar stress threshold at community size 127 (95% CI: 122-132), while the maximum probability of critical scalar stress is predicted at size 158 (95% CI: 147-170). The model ultimately provides grounds to assess, for the sake of any further archaeological/anthropological interpretation, the probability that a group reached a hot spot of size development critical for its internal cohesion.

  4. Genomic-Enabled Prediction of Ordinal Data with Bayesian Logistic Ordinal Regression

    PubMed Central

    Montesinos-López, Osval A.; Montesinos-López, Abelardo; Crossa, José; Burgueño, Juan; Eskridge, Kent

    2015-01-01

    Most genomic-enabled prediction models developed so far assume that the response variable is continuous and normally distributed. The exception is the probit model, developed for ordered categorical phenotypes. In statistical applications, because of the easy implementation of the Bayesian probit ordinal regression (BPOR) model, Bayesian logistic ordinal regression (BLOR) is rarely implemented in the context of genomic-enabled prediction [sample size (n) is much smaller than the number of parameters (p)]. For this reason, in this paper we propose a BLOR model using the Pólya-Gamma data augmentation approach that produces a Gibbs sampler with full conditional distributions similar to those of the BPOR model and with the advantage that the BPOR model is a particular case of the BLOR model. We evaluated the proposed model by using simulation and two real data sets. Results indicate that our BLOR model is a good alternative for analyzing ordinal data in the context of genomic-enabled prediction with the probit or logit link. PMID:26290569

  5. Power comparisons between similarity-based multilocus association methods, logistic regression, and score tests for haplotypes.

    PubMed

    Lin, Wan-Yu; Schaid, Daniel J

    2009-04-01

    Recently, a genomic distance-based regression for multilocus associations was proposed (Wessel and Schork [2006] Am. J. Hum. Genet. 79:792-806) in which either locus or haplotype scoring can be used to measure genetic distance. Although it allows various measures of genomic similarity and simultaneous analyses of multiple phenotypes, its power relative to other methods for case-control analyses is not well known. We compare the power of traditional methods with this new distance-based approach, for both locus-scoring and haplotype-scoring strategies. We discuss the relative power of these association methods with respect to five properties: (1) the marker informativity; (2) the number of markers; (3) the causal allele frequency; (4) the preponderance of the most common high-risk haplotype; (5) the correlation between the causal single-nucleotide polymorphism (SNP) and its flanking markers. We found that locus-based logistic regression and the global score test for haplotypes suffered from power loss when many markers were included in the analyses, due to many degrees of freedom. In contrast, the distance-based approach was not as vulnerable to more markers or more haplotypes. A genotype counting measure was more sensitive to the marker informativity and the correlation between the causal SNP and its flanking markers. After examining the impact of the five properties on power, we found that on average, the genomic distance-based regression that uses a matching measure for diplotypes was the most powerful and robust method among the seven methods we compared.

  6. Feature Selection and Cancer Classification via Sparse Logistic Regression with the Hybrid L1/2 +2 Regularization

    PubMed Central

    Huang, Hai-Hui; Liu, Xiao-Ying; Liang, Yong

    2016-01-01

    Cancer classification and feature (gene) selection play an important role in knowledge discovery in genomic data. Although logistic regression is one of the most popular classification methods, it does not induce feature selection. In this paper, we present a new hybrid L1/2 +2 regularization (HLR) function, a linear combination of L1/2 and L2 penalties, to select the relevant genes in logistic regression. The HLR approach inherits some fascinating characteristics from the L1/2 (sparsity) and L2 (grouping effect, where highly correlated variables are in or out of a model together) penalties. We also propose a novel univariate HLR thresholding approach to update the estimated coefficients and develop a coordinate descent algorithm for the HLR penalized logistic regression model. The empirical results and simulations indicate that the proposed method is highly competitive amongst several state-of-the-art methods. PMID:27136190
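
A penalty formed as a linear combination of L1/2 and L2 terms can be sketched as follows. The abstract does not give the exact parametrization, so the overall strength `lam` and the mixing weight `alpha` below are hypothetical names and one possible form, not necessarily the authors' definition.

```python
def hlr_penalty(coefs, lam=1.0, alpha=0.5):
    """Hybrid penalty: lam * (alpha * sum(|b|^(1/2)) + (1 - alpha) * sum(b^2)).

    lam and alpha are illustrative assumptions, not taken from the paper.
    """
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must lie in [0, 1]")
    l_half = sum(abs(b) ** 0.5 for b in coefs)  # encourages sparsity
    l2 = sum(b * b for b in coefs)              # encourages the grouping effect
    return lam * (alpha * l_half + (1 - alpha) * l2)


# Hypothetical coefficient vector with one exact zero.
penalty = hlr_penalty([0.0, 4.0, -1.0])
print(f"penalty = {penalty}")  # 0.5 * (0 + 2 + 1) + 0.5 * (0 + 16 + 1)
```

Setting `alpha` to 1 recovers a pure L1/2 penalty and setting it to 0 recovers a pure L2 (ridge-type) penalty in this parametrization.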

  7. Multiple logistic regression model of signalling practices of drivers on urban highways

    NASA Astrophysics Data System (ADS)

    Puan, Othman Che; Ibrahim, Muttaka Na'iya; Zakaria, Rozana

    2015-05-01

    Signalling is a way of informing other road users, especially conflicting drivers, of a driver's intention to change his/her course of movement. Other users are exposed to hazardous situations and the risk of accidents if the driver who changes course fails to signal as required. This paper describes the application of a logistic regression model to the analysis of drivers' signalling practices on multilane highways based on possible factors affecting the driver's decision, such as the driver's gender, vehicle type, vehicle speed and traffic flow intensity. Data on these factors were collected manually. More than 2000 drivers who performed a lane-changing manoeuvre while driving on two sections of multilane highways were observed. The findings show that a relatively large proportion of drivers failed to give any signal when changing lanes. The analysis indicates that, although the proportion of drivers who failed to signal prior to a lane-changing manoeuvre is high, compliance among female drivers is better than among male drivers. A binary logistic model was developed to represent the probability that a driver signals prior to a lane-changing manoeuvre. The model indicates that the driver's gender, the type of vehicle driven, vehicle speed and traffic volume influence the driver's decision to signal prior to a lane-changing manoeuvre on a multilane urban highway. In terms of the types of vehicles driven, about 97% of motorcyclists failed to comply with the signalling requirement. The proportion of non-compliant drivers under stable traffic flow conditions is much higher than when the flow is relatively heavy. This is consistent with the data, which indicate a high degree of non-compliance when the average speed of the traffic stream is relatively high.

  8. Sample size matters: Investigating the optimal sample size for a logistic regression debris flow susceptibility model

    NASA Astrophysics Data System (ADS)

    Heckmann, Tobias; Gegg, Katharina; Becht, Michael

    2013-04-01

    Statistical approaches to landslide susceptibility modelling on the catchment and regional scale are used very frequently compared to heuristic and physically based approaches. In the present study, we deal with the problem of the optimal sample size for a logistic regression model. More specifically, a stepwise approach has been chosen in order to select those independent variables (from a number of derivatives of a digital elevation model and landcover data) that explain best the spatial distribution of debris flow initiation zones in two neighbouring central alpine catchments in Austria (used mutually for model calculation and validation). In order to minimise problems arising from spatial autocorrelation, we sample a single raster cell from each debris flow initiation zone within an inventory. In addition, as suggested by previous work using the "rare events logistic regression" approach, we take a sample of the remaining "non-event" raster cells. The recommendations given in the literature on the size of this sample appear to be motivated by practical considerations, e.g. the time and cost of acquiring data for non-event cases, which do not apply to the case of spatial data. In our study, we aim at finding empirically an "optimal" sample size in order to avoid two problems: First, a sample too large will violate the independent sample assumption as the independent variables are spatially autocorrelated; hence, a variogram analysis leads to a sample size threshold above which the average distance between sampled cells falls below the autocorrelation range of the independent variables. Second, if the sample is too small, repeated sampling will lead to very different results, i.e. the independent variables and hence the result of a single model calculation will be extremely dependent on the choice of non-event cells. Using a Monte-Carlo analysis with stepwise logistic regression, 1000 models are calculated for a wide range of sample sizes. For each sample size

  9. A logistic regression based approach for the prediction of flood warning threshold exceedance

    NASA Astrophysics Data System (ADS)

    Diomede, Tommaso; Trotter, Luca; Stefania Tesini, Maria; Marsigli, Chiara

    2016-04-01

    A method based on logistic regression is proposed for the prediction of river level threshold exceedance at short (+0-18h) and medium (+18-42h) lead times. The aim of the study is to provide a valuable tool for issuing warnings by the authority responsible for public safety in case of flood. The role of different precipitation periods as predictors for the exceedance of a fixed river level has been investigated, in order to derive significant information for flood forecasting. Based on catchment-averaged values, a separation of "antecedent" and "peak-triggering" rainfall amounts as independent variables is attempted. In particular, the following flood-related precipitation periods have been considered: (i) the period from 1 to n days before the forecast issue time, which may be relevant for the soil saturation, (ii) the last 24 hours, which may be relevant for the current water level in the river, and (iii) the period from 0 to x hours in advance with respect to the forecast issue time, when the flood-triggering precipitation generally occurs. Several combinations and values of these predictors have been tested to optimise the method implementation. In particular, the period for the precursor antecedent precipitation ranges between 5 and 45 days; the state of the river can be represented by the last 24-h precipitation or, as an alternative, by the current river level. The flood-triggering precipitation has been cumulated over the next 18 hours (for the short lead time) and 36-42 hours (for the medium lead time). The proposed approach requires a specific implementation of logistic regression for each river section and warning threshold. The method performance has been evaluated over the Santerno river catchment (about 450 km²) in the Emilia-Romagna Region, northern Italy. A statistical analysis in terms of false alarms, misses and related scores was carried out by using an 8-year-long database. The results are quite satisfactory, with slightly better performances

  10. Geographical variation of unmet medical needs in Italy: a multivariate logistic regression analysis

    PubMed Central

    2013-01-01

    Background: Unmet health needs should be, in theory, a minor issue in Italy where a publicly funded and universally accessible health system exists. This, however, does not seem to be the case. Moreover, in the last two decades responsibilities for health care have been progressively decentralized to regional governments, which have organized health service delivery differently within their territories. Regional decision-making has affected the use of health care services, further increasing the existing geographical disparities in access to care across the country. This study aims at comparing self-perceived unmet needs across Italian regions and assessing how the reported reasons (grouped into the categories of availability, accessibility and acceptability) vary geographically. Methods: Data from the 2006 Italian component of the European Union Statistics on Income and Living Conditions are employed to explore reasons and predictors of self-reported unmet medical needs among 45,175 Italian respondents aged 18 and over. Multivariate logistic regression models are used to determine adjusted rates for overall unmet medical needs and for each of the three categories of reasons. Results: Overall, 6.9% of the Italian population stated having experienced at least one unmet medical need during the last 12 months. The unadjusted rates vary markedly across regions, resulting in a clear-cut north–south divide (4.6% in the North-East vs. 10.6% in the South). Among those reporting unmet medical needs, the leading reason was problems of accessibility related to cost or transportation (45.5%), followed by acceptability (26.4%) and availability due to excessively long waiting lists (21.4%). In the South, more than one out of two individuals with an unmet need refrained from seeing a physician due to economic reasons. In the northern regions, working and family responsibilities contribute relatively more to the underutilization of medical

  11. Alternatives for logistic regression in cross-sectional studies: an empirical comparison of models that directly estimate the prevalence ratio

    PubMed Central

    Barros, Aluísio JD; Hirakata, Vânia N

    2003-01-01

    Background: Cross-sectional studies with binary outcomes analyzed by logistic regression are frequent in the epidemiological literature. However, the odds ratio can importantly overestimate the prevalence ratio, the measure of choice in these studies. Also, controlling for confounding is not equivalent for the two measures. In this paper we explore alternatives for modeling data of such studies with techniques that directly estimate the prevalence ratio. Methods: We compared Cox regression with constant time at risk, Poisson regression and log-binomial regression against the standard Mantel-Haenszel estimators. Models with robust variance estimators in Cox and Poisson regressions and variance corrected by the scale parameter in Poisson regression were also evaluated. Results: Three outcomes, from a cross-sectional study carried out in Pelotas, Brazil, with different levels of prevalence were explored: weight-for-age deficit (4%), asthma (31%) and mother in a paid job (52%). Unadjusted Cox/Poisson regression and Poisson regression with scale parameter adjusted by deviance performed worst in terms of interval estimates. Poisson regression with scale parameter adjusted by χ² showed variable performance depending on the outcome prevalence. Cox/Poisson regression with robust variance, and log-binomial regression performed equally well when the model was correctly specified. Conclusions: Cox or Poisson regression with robust variance and log-binomial regression provide correct estimates and are a better alternative for the analysis of cross-sectional studies with binary outcomes than logistic regression, since the prevalence ratio is more interpretable and easier to communicate to non-specialists than the odds ratio. However, precautions are needed to avoid estimation problems in specific situations. PMID:14567763
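
The point that the odds ratio can overestimate the prevalence ratio can be checked with simple arithmetic. The prevalences below are hypothetical (they are not the Pelotas data); both scenarios have the same prevalence ratio of 1.5 and differ only in how common the outcome is.

```python
def prevalence_ratio(p_exposed, p_unexposed):
    return p_exposed / p_unexposed


def odds_ratio(p_exposed, p_unexposed):
    odds_exposed = p_exposed / (1 - p_exposed)
    odds_unexposed = p_unexposed / (1 - p_unexposed)
    return odds_exposed / odds_unexposed


# Hypothetical prevalences in exposed vs. unexposed groups.
pr_rare = prevalence_ratio(0.06, 0.04)
or_rare = odds_ratio(0.06, 0.04)
pr_common = prevalence_ratio(0.60, 0.40)
or_common = odds_ratio(0.60, 0.40)

print(f"rare outcome:   PR = {pr_rare:.2f}, OR = {or_rare:.2f}")
print(f"common outcome: PR = {pr_common:.2f}, OR = {or_common:.2f}")
```

With a rare outcome the two measures nearly coincide, while with a common outcome the odds ratio (2.25 here) is well above the prevalence ratio (1.5).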

  12. An epidemiological survey on road traffic crashes in Iran: application of the two logistic regression models.

    PubMed

    Bakhtiyari, Mahmood; Mehmandar, Mohammad Reza; Mirbagheri, Babak; Hariri, Gholam Reza; Delpisheh, Ali; Soori, Hamid

    2014-01-01

    Risk factors for human-related traffic crashes are the most important and preventable challenges for community health, owing to their noteworthy burden, in developing countries in particular. The present study aims to investigate the role of human risk factors in road traffic crashes in Iran. Through a cross-sectional study using the COM 114 data collection forms, the police records of almost 600,000 crashes that occurred in 2010 are investigated. The binary logistic regression and proportional odds regression models are used. The odds ratio for each risk factor is calculated. These models are adjusted for known confounding factors including age, sex and driving time. The traffic crash reports of 537,688 men (90.8%) and 54,480 women (9.2%) are analysed. The mean age is 34.1 ± 14 years. Not maintaining eyes on the road (53.7%) and losing control of the vehicle (21.4%) are the main causes of drivers' deaths in traffic crashes within cities. Not maintaining eyes on the road is also the most frequent human risk factor for road traffic crashes outside cities. Sudden lane excursion (OR = 9.9, 95% CI: 8.2-11.9) and seat belt non-compliance (OR = 8.7, CI: 6.7-10.1), exceeding authorised speed (OR = 17.9, CI: 12.7-25.1) and exceeding safe speed (OR = 9.7, CI: 7.2-13.2) are the most significant human risk factors for traffic crashes in Iran. The high mortality rate of 39 people per 100,000 population emphasises the importance of traffic crashes in Iran. Considering the important role of human risk factors in traffic crashes, strenuous efforts are required to control dangerous driving behaviours such as exceeding speed, illegal overtaking and not maintaining eyes on the road.
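
Odds ratios and confidence intervals of the kind reported here are commonly obtained by exponentiating a logistic regression coefficient and its Wald interval. The sketch below assumes that approach, which the abstract does not state, and uses a hypothetical standard error of 0.095 chosen only so that the result resembles the reported interval for sudden lane excursion.

```python
import math


def odds_ratio_ci(coef, se, z=1.96):
    """Odds ratio and Wald confidence limits from a log-odds coefficient and its standard error."""
    return math.exp(coef), math.exp(coef - z * se), math.exp(coef + z * se)


# The coefficient is back-calculated from the reported OR of 9.9;
# the standard error is a hypothetical value, not taken from the paper.
coef = math.log(9.9)
or_estimate, lower, upper = odds_ratio_ci(coef, se=0.095)
print(f"OR = {or_estimate:.1f}, 95% CI: {lower:.1f}-{upper:.1f}")
```

Because the interval is symmetric on the log-odds scale, it is asymmetric around the odds ratio itself, as in the intervals quoted above.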

  13. The likelihood of achieving quantified road safety targets: a binary logistic regression model for possible factors.

    PubMed

    Sze, N N; Wong, S C; Lee, C Y

    2014-12-01

    In the past several decades, many countries have set quantified road safety targets to motivate transport authorities to develop systematic road safety strategies and measures and facilitate the achievement of continuous road safety improvement. Studies have been conducted to evaluate the association between the setting of quantified road safety targets and road fatality reduction, in both the short and long run, by comparing road fatalities before and after the implementation of a quantified road safety target. However, not much work has been done to evaluate whether the quantified road safety targets are actually achieved. In this study, we used a binary logistic regression model to examine the factors - including vehicle ownership, fatality rate, and national income, in addition to level of ambition and duration of target - that contribute to a target's success. We analyzed 55 quantified road safety targets set by 29 countries from 1981 to 2009, and the results indicate that targets that are in progress and with lower levels of ambition had a higher likelihood of eventually being achieved. Moreover, possible interaction effects on the association between level of ambition and the likelihood of success are also revealed.

  14. Investigating flight response of Pacific brant to helicopters at Izembek Lagoon, Alaska by using logistic regression

    USGS Publications Warehouse

    Erickson, Wallace P.; Nick, Todd G.; Ward, David H.; Peck, Roxy; Haugh, Larry D.; Goodman, Arnold

    1998-01-01

    Izembek Lagoon, an estuary in Alaska, is a very important staging area for Pacific brant, a small migratory goose. Each fall, nearly the entire Pacific Flyway population of 130,000 brant flies to Izembek Lagoon and feeds on eelgrass to accumulate fat reserves for nonstop transoceanic migration to wintering areas as distant as Mexico. In the past 10 years, offshore drilling activities in this area have increased, and, as a result, the air traffic in and out of the nearby Cold Bay airport has also increased. There has been a concern that this increased air traffic could affect the brant by disturbing them from their feeding and resting activities, which in turn could result in reduced energy intake and buildup. This may increase the mortality rates during their migratory journey. Because of these concerns, a study was conducted to investigate the flight response of brant to overflights of large helicopters. Response was measured on flocks during experimental overflights of large helicopters flown at varying altitudes and lateral (perpendicular) distances from the flocks. Logistic regression models were developed for predicting probability of flight response as a function of these distance variables. Results of this study may be used in the development of new FAA guidelines for aircraft near Izembek Lagoon.
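
    A logistic model of the kind described here turns a linear combination of the distance variables into a probability through the inverse logit. The following is a minimal sketch only: the coefficient values are hypothetical placeholders, not estimates from the study, and the kilometre units are likewise an assumption.

```python
import math


def flight_probability(altitude, lateral_distance, b0, b_alt, b_lat):
    """Inverse-logit of a linear predictor in altitude and lateral distance."""
    linear_predictor = b0 + b_alt * altitude + b_lat * lateral_distance
    return 1.0 / (1.0 + math.exp(-linear_predictor))


# Hypothetical coefficients (per km); negative slopes mean the modelled
# response probability falls as the helicopter is higher or farther away.
HYPOTHETICAL = dict(b0=2.0, b_alt=-1.5, b_lat=-1.0)

near = flight_probability(0.3, 0.2, **HYPOTHETICAL)
far = flight_probability(1.2, 1.5, **HYPOTHETICAL)
print(f"near overflight: {near:.2f}, distant overflight: {far:.2f}")
```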

  15. Penalization, bias reduction, and default priors in logistic and related categorical and survival regressions.

    PubMed

    Greenland, Sander; Mansournia, Mohammad Ali

    2015-10-15

    Penalization is a very general method of stabilizing or regularizing estimates, which has both frequentist and Bayesian rationales. We consider some questions that arise when considering alternative penalties for logistic regression and related models. The most widely programmed penalty appears to be the Firth small-sample bias-reduction method (albeit with small differences among implementations and the results they provide), which corresponds to using the log density of the Jeffreys invariant prior distribution as a penalty function. The latter representation raises some serious contextual objections to the Firth reduction, which also apply to alternative penalties based on t-distributions (including Cauchy priors). Taking simplicity of implementation and interpretation as our chief criteria, we propose that the log-F(1,1) prior provides a better default penalty than other proposals. Penalization based on more general log-F priors is trivial to implement and facilitates mean-squared error reduction and sensitivity analyses of penalty strength by varying the number of prior degrees of freedom. We caution however against penalization of intercepts, which are unduly sensitive to covariate coding and design idiosyncrasies.
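
    To make the idea of a log-F penalty concrete, the sketch below evaluates the log density of a log-F(m,m) prior for a single coefficient, up to an additive constant, in the form mβ/2 − m·ln(1 + e^β). This form is stated here as an assumption drawn from the standard log-F density rather than from the abstract itself; in a penalized fit the term would be added to the log-likelihood for each penalized coefficient, excluding the intercept as the authors advise.

```python
import math


def log_f_penalty(beta, m=1.0):
    """Log density of a log-F(m, m) prior at beta, up to an additive constant."""
    return m * beta / 2.0 - m * math.log1p(math.exp(beta))


# The penalty is symmetric about zero and largest at beta = 0,
# so it pulls estimates toward an odds ratio of 1.
for beta in (-4.0, -1.0, 0.0, 1.0, 4.0):
    print(f"beta = {beta:+.1f}: penalty = {log_f_penalty(beta):.3f}")
```

    Raising m strengthens the pull toward zero, which corresponds to the sensitivity analysis over prior degrees of freedom mentioned in the abstract.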

  16. Predictive occurrence models for coastal wetland plant communities: delineating hydrologic response surfaces with multinomial logistic regression

    USGS Publications Warehouse

    Snedden, Gregg A.; Steyer, Gregory D.

    2013-01-01

    Understanding plant community zonation along estuarine stress gradients is critical for effective conservation and restoration of coastal wetland ecosystems. We related the presence of plant community types to estuarine hydrology at 173 sites across coastal Louisiana. Percent relative cover by species was assessed at each site near the end of the growing season in 2008, and hourly water level and salinity were recorded at each site Oct 2007–Sep 2008. Nine plant community types were delineated with k-means clustering, and indicator species were identified for each of the community types with indicator species analysis. An inverse relation between salinity and species diversity was observed. Canonical correspondence analysis (CCA) effectively segregated the sites across ordination space by community type, and indicated that salinity and tidal amplitude were both important drivers of vegetation composition. Multinomial logistic regression (MLR) and Akaike's Information Criterion (AIC) were used to predict the probability of occurrence of the nine vegetation communities as a function of salinity and tidal amplitude, and probability surfaces obtained from the MLR model corroborated the CCA results. The weighted kappa statistic, calculated from the confusion matrix of predicted versus actual community types, was 0.7 and indicated good agreement between observed community types and model predictions. Our results suggest that models based on a few key hydrologic variables can be valuable tools for predicting vegetation community development when restoring and managing coastal wetlands.

  17. Predicting the "graduate on time (GOT)" of PhD students using binary logistics regression model

    NASA Astrophysics Data System (ADS)

    Shariff, S. Sarifah Radiah; Rodzi, Nur Atiqah Mohd; Rahman, Kahartini Abdul; Zahari, Siti Meriam; Deni, Sayang Mohd

    2016-10-01

    The Malaysian government has recently set a new goal of producing 60,000 Malaysian PhD holders by the year 2023. As Malaysia's largest institution of higher learning in terms of size and population, offering more than 500 academic programmes in a conducive and vibrant environment, UiTM has taken several initiatives to fill the gap. Increasing the number of PhD graduates is a challenging process. On many occasions, it has been observed that reaching the target is even more daunting than expected and that implementation falls short of the ideal. Progress has slowed further as the attrition rate increases. This study applies a model that incorporates several factors to predict the number of PhD students who will complete their studies on time. A binary logistic regression model is proposed and applied to the data to determine this number. The results show that only 6.8% of the 2014 PhD students are predicted to graduate on time, and the results are compared with the actual number for validation purposes.

  18. [Clinical research XX. From clinical judgment to multiple logistic regression model].

    PubMed

    Berea-Baltierra, Ricardo; Rivas-Ruiz, Rodolfo; Pérez-Rodríguez, Marcela; Palacios-Cruz, Lino; Moreno, Jorge; Talavera, Juan O

    2014-01-01

    The complexity of the causality phenomenon in clinical practice implies that the result of a maneuver is not solely caused by the maneuver, but by the interaction among the maneuver and other baseline factors or variables occurring during the maneuver. This requires methodological designs that allow the evaluation of these variables. When the outcome is a binary variable, we use the multiple logistic regression model (MLRM). This multivariate model is useful when we want to predict or explain, adjusting for the effect of several risk factors, the effect of a maneuver or exposure on the outcome. In order to perform an MLRM, the outcome or dependent variable must be a binary variable and both categories must mutually exclude each other (i.e. alive/dead, healthy/ill); on the other hand, independent variables or risk factors may be either qualitative or quantitative. The effect measure obtained from this model is the odds ratio (OR) with 95% confidence intervals (CI), from which we can estimate the proportion of the outcome's variability explained through the risk factors. For these reasons, the MLRM is used in clinical research, since one of the main objectives in clinical practice comprises the ability to predict or explain an event where different risk or prognostic factors are taken into account.
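
    The odds ratio and its 95% confidence interval are usually obtained by exponentiating a fitted coefficient and its Wald limits. The sketch below assumes a coefficient and standard error are already available from some fitted model; the numeric values are hypothetical and serve only to show the arithmetic.

```python
import math


def odds_ratio_with_ci(beta, se, z=1.96):
    """Return (OR, lower, upper) from a logistic coefficient and its standard error."""
    return math.exp(beta), math.exp(beta - z * se), math.exp(beta + z * se)


# Hypothetical coefficient and standard error for one risk factor.
or_, lower, upper = odds_ratio_with_ci(beta=0.693, se=0.25)
print(f"OR = {or_:.2f} (95% CI: {lower:.2f}-{upper:.2f})")
```

    Because the interval is built on the log-odds scale and then exponentiated, it is asymmetric around the odds ratio itself.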

  19. Validation data-based adjustments for outcome misclassification in logistic regression: an illustration.

    PubMed

    Lyles, Robert H; Tang, Li; Superak, Hillary M; King, Caroline C; Celentano, David D; Lo, Yungtai; Sobel, Jack D

    2011-07-01

    Misclassification of binary outcome variables is a known source of potentially serious bias when estimating adjusted odds ratios. Although researchers have described frequentist and Bayesian methods for dealing with the problem, these methods have seldom fully bridged the gap between statistical research and epidemiologic practice. In particular, there have been few real-world applications of readily grasped and computationally accessible methods that make direct use of internal validation data to adjust for differential outcome misclassification in logistic regression. In this paper, we illustrate likelihood-based methods for this purpose that can be implemented using standard statistical software. Using main study and internal validation data from the HIV Epidemiology Research Study, we demonstrate how misclassification rates can depend on the values of subject-specific covariates, and we illustrate the importance of accounting for this dependence. Simulation studies confirm the effectiveness of the maximum likelihood approach. We emphasize clear exposition of the likelihood function itself, to permit the reader to easily assimilate appended computer code that facilitates sensitivity analyses as well as the efficient handling of main/external and main/internal validation-study data. These methods are readily applicable under random cross-sectional sampling, and we discuss the extent to which the main/internal analysis remains appropriate under outcome-dependent (case-control) sampling.

  20. CPAT: Coding-Potential Assessment Tool using an alignment-free logistic regression model

    PubMed Central

    Wang, Liguo; Park, Hyun Jung; Dasari, Surendra; Wang, Shengqin; Kocher, Jean-Pierre; Li, Wei

    2013-01-01

    Thousands of novel transcripts have been identified using deep transcriptome sequencing. This discovery of large and ‘hidden’ transcriptome rejuvenates the demand for methods that can rapidly distinguish between coding and noncoding RNA. Here, we present a novel alignment-free method, Coding Potential Assessment Tool (CPAT), which rapidly recognizes coding and noncoding transcripts from a large pool of candidates. To this end, CPAT uses a logistic regression model built with four sequence features: open reading frame size, open reading frame coverage, Fickett TESTCODE statistic and hexamer usage bias. CPAT software outperformed (sensitivity: 0.96, specificity: 0.97) other state-of-the-art alignment-based software such as Coding-Potential Calculator (sensitivity: 0.99, specificity: 0.74) and Phylo Codon Substitution Frequencies (sensitivity: 0.90, specificity: 0.63). In addition to high accuracy, CPAT is approximately four orders of magnitude faster than Coding-Potential Calculator and Phylo Codon Substitution Frequencies, enabling its users to process thousands of transcripts within seconds. The software accepts input sequences in either FASTA- or BED-formatted data files. We also developed a web interface for CPAT that allows users to submit sequences and receive the prediction results almost instantly. PMID:23335781
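
    The sensitivity and specificity figures quoted above are ratios taken from a confusion matrix of predicted versus true coding status. A minimal sketch follows; the counts are hypothetical and chosen only so that the ratios match the 0.96 and 0.97 reported for CPAT, and they do not come from the paper.

```python
def sensitivity(true_pos, false_neg):
    """Fraction of truly coding transcripts that are called coding."""
    return true_pos / (true_pos + false_neg)


def specificity(true_neg, false_pos):
    """Fraction of truly noncoding transcripts that are called noncoding."""
    return true_neg / (true_neg + false_pos)


# Hypothetical counts per 100 coding and 100 noncoding transcripts.
print(sensitivity(true_pos=96, false_neg=4))
print(specificity(true_neg=97, false_pos=3))
```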

  1. Extreme value distribution based gene selection criteria for discriminant microarray data analysis using logistic regression.

    PubMed

    Li, Wentian; Sun, Fengzhu; Grosse, Ivo

    2004-01-01

    One important issue commonly encountered in the analysis of microarray data is to decide which and how many genes should be selected for further studies. For discriminant microarray data analyses based on statistical models, such as the logistic regression models, gene selection can be accomplished by a comparison of the maximum likelihood of the model given the real data, L(D|M), and the expected maximum likelihood of the model given an ensemble of surrogate data with randomly permuted label, L(D(0)|M). Typically, the computational burden for obtaining L(D(0)|M) is immense, often exceeding the limits of available computing resources by orders of magnitude. Here, we propose an approach that circumvents such heavy computations by mapping the simulation problem to an extreme-value problem. We present the derivation of an asymptotic distribution of the extreme-value as well as its mean, median, and variance. Using this distribution, we propose two gene selection criteria, and we apply them to two microarray datasets and three classification tasks for illustration.

  2. Using Logistic Regression To Predict the Probability of Debris Flows Occurring in Areas Recently Burned By Wildland Fires

    USGS Publications Warehouse

    Rupert, Michael G.; Cannon, Susan H.; Gartner, Joseph E.

    2003-01-01

    Logistic regression was used to predict the probability of debris flows occurring in areas recently burned by wildland fires. Multiple logistic regression is conceptually similar to multiple linear regression because statistical relations between one dependent variable and several independent variables are evaluated. In logistic regression, however, the dependent variable is transformed to a binary variable (debris flow did or did not occur), and the actual probability of the debris flow occurring is statistically modeled. Data from 399 basins located within 15 wildland fires that burned during 2000-2002 in Colorado, Idaho, Montana, and New Mexico were evaluated. More than 35 independent variables describing the burn severity, geology, land surface gradient, rainfall, and soil properties were evaluated. The models were developed as follows: (1) Basins that did and did not produce debris flows were delineated from National Elevation Data using a Geographic Information System (GIS). (2) Data describing the burn severity, geology, land surface gradient, rainfall, and soil properties were determined for each basin. These data were then downloaded to a statistics software package for analysis using logistic regression. (3) Relations between the occurrence/non-occurrence of debris flows and burn severity, geology, land surface gradient, rainfall, and soil properties were evaluated and several preliminary multivariate logistic regression models were constructed. All possible combinations of independent variables were evaluated to determine which combination produced the most effective model. The multivariate model that best predicted the occurrence of debris flows was selected. (4) The multivariate logistic regression model was entered into a GIS, and a map showing the probability of debris flows was constructed. 
    The most effective model incorporates the percentage of each basin with slope greater than 30 percent, percentage of land burned at medium and high burn severity

  3. Estimation of Logistic Regression Models in Small Samples. A Simulation Study Using a Weakly Informative Default Prior Distribution

    ERIC Educational Resources Information Center

    Gordovil-Merino, Amalia; Guardia-Olmos, Joan; Pero-Cebollero, Maribel

    2012-01-01

    In this paper, we used simulations to compare the performance of classical and Bayesian estimations in logistic regression models using small samples. In the performed simulations, conditions were varied, including the type of relationship between independent and dependent variable values (i.e., unrelated and related values), the type of variable…

  4. The Overall Odds Ratio as an Intuitive Effect Size Index for Multiple Logistic Regression: Examination of Further Refinements

    ERIC Educational Resources Information Center

    Le, Huy; Marcus, Justin

    2012-01-01

    This study used Monte Carlo simulation to examine the properties of the overall odds ratio (OOR), which was recently introduced as an index for overall effect size in multiple logistic regression. It was found that the OOR was relatively independent of study base rate and performed better than most commonly used R-square analogs in indexing model…

  5. A Comparison of the Logistic Regression and Contingency Table Methods for Simultaneous Detection of Uniform and Nonuniform DIF

    ERIC Educational Resources Information Center

    Guler, Nese; Penfield, Randall D.

    2009-01-01

    In this study, we investigate the logistic regression (LR), Mantel-Haenszel (MH), and Breslow-Day (BD) procedures for the simultaneous detection of both uniform and nonuniform differential item functioning (DIF). A simulation study was used to assess and compare the Type I error rate and power of a combined decision rule (CDR), which assesses DIF…

  6. Multinomial logistic regression modelling of obesity and overweight among primary school students in a rural area of Negeri Sembilan

    NASA Astrophysics Data System (ADS)

    Ghazali, Amirul Syafiq Mohd; Ali, Zalila; Noor, Norlida Mohd; Baharum, Adam

    2015-10-01

    Multinomial logistic regression is widely used to model the outcomes of a polytomous response variable, a categorical dependent variable with more than two categories. The model assumes that the conditional mean of the dependent categorical variables is the logistic function of an affine combination of predictor variables. Its procedure gives a number of logistic regression models that make specific comparisons of the response categories. When there are q categories of the response variable, the model consists of q-1 logit equations which are fitted simultaneously. The model is validated by variable selection procedures, tests of regression coefficients, a significance test of the overall model, goodness-of-fit measures, and validation of predicted probabilities using odds ratio. This study used the multinomial logistic regression model to investigate obesity and overweight among primary school students in a rural area on the basis of their demographic profiles, lifestyles, and diet and food intake. The results indicated that obesity and overweight of students are related to gender, religion, sleep duration, time spent on electronic games, breakfast intake in a week, with whom meals are taken, protein intake, and also, the interaction between breakfast intake in a week with sleep duration, and the interaction between gender and protein intake.
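
    The q-1 logit equations mentioned above each compare one response category with a reference category, and together they determine all q category probabilities. The sketch below shows that conversion under the usual baseline-category parameterization; the coefficient values and the three-category setup are hypothetical and are not taken from the study.

```python
import math


def multinomial_probs(x, coefs):
    """Category probabilities from q-1 baseline-category logit equations.

    coefs holds one (intercept, slopes) pair per non-reference category;
    the reference category has its logit fixed at 0 and comes first.
    """
    logits = [0.0]
    for intercept, slopes in coefs:
        logits.append(intercept + sum(b * xi for b, xi in zip(slopes, x)))
    largest = max(logits)  # subtract the maximum for numerical stability
    exps = [math.exp(value - largest) for value in logits]
    total = sum(exps)
    return [e / total for e in exps]


# Hypothetical example: q = 3 categories, two predictors, so two equations.
coefs = [(-0.5, [0.8, -0.2]), (-1.0, [1.1, 0.4])]
probs = multinomial_probs([1.0, 2.0], coefs)
print([round(p, 3) for p in probs])
```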

  7. Differential Item Functioning Detection and Effect Size: A Comparison between Logistic Regression and Mantel-Haenszel Procedures

    ERIC Educational Resources Information Center

    Hidalgo, M. Dolores; Lopez-Pina, Jose Antonio

    2004-01-01

    This article compares several procedures in their efficacy for detecting differential item functioning (DIF): logistic regression analysis, the Mantel-Haenszel (MH) procedure, and the modified Mantel-Haenszel procedure by Mazor, Clauser, and Hambleton. It also compares the effect size measures that these procedures provide. In this study,…

  8. Multinomial logistic regression modelling of obesity and overweight among primary school students in a rural area of Negeri Sembilan

    SciTech Connect

    Ghazali, Amirul Syafiq Mohd; Ali, Zalila; Noor, Norlida Mohd; Baharum, Adam

    2015-10-22

    Multinomial logistic regression is widely used to model the outcomes of a polytomous response variable, a categorical dependent variable with more than two categories. The model assumes that the conditional mean of the dependent categorical variables is the logistic function of an affine combination of predictor variables. Its procedure gives a number of logistic regression models that make specific comparisons of the response categories. When there are q categories of the response variable, the model consists of q-1 logit equations which are fitted simultaneously. The model is validated by variable selection procedures, tests of regression coefficients, a significance test of the overall model, goodness-of-fit measures, and validation of predicted probabilities using odds ratio. This study used the multinomial logistic regression model to investigate obesity and overweight among primary school students in a rural area on the basis of their demographic profiles, lifestyles, and diet and food intake. The results indicated that obesity and overweight of students are related to gender, religion, sleep duration, time spent on electronic games, breakfast intake in a week, with whom meals are taken, protein intake, and also, the interaction between breakfast intake in a week with sleep duration, and the interaction between gender and protein intake.

  9. Loss of Power in Logistic, Ordinal Logistic, and Probit Regression When an Outcome Variable Is Coarsely Categorized

    ERIC Educational Resources Information Center

    Taylor, Aaron B.; West, Stephen G.; Aiken, Leona S.

    2006-01-01

    Variables that have been coarsely categorized into a small number of ordered categories are often modeled as outcome variables in psychological research. The authors employ a Monte Carlo study to investigate the effects of this coarse categorization of dependent variables on power to detect true effects using three classes of regression models:…

  10. Using Logistic Regression to Predict the Probability of Debris Flows in Areas Burned by Wildfires, Southern California, 2003-2006

    USGS Publications Warehouse

    Rupert, Michael G.; Cannon, Susan H.; Gartner, Joseph E.; Michael, John A.; Helsel, Dennis R.

    2008-01-01

    Logistic regression was used to develop statistical models that can be used to predict the probability of debris flows in areas recently burned by wildfires by using data from 14 wildfires that burned in southern California during 2003-2006. Twenty-eight independent variables describing the basin morphology, burn severity, rainfall, and soil properties of 306 drainage basins located within those burned areas were evaluated. The models were developed as follows: (1) Basins that did and did not produce debris flows soon after the 2003 to 2006 fires were delineated from data in the National Elevation Dataset using a geographic information system; (2) Data describing the basin morphology, burn severity, rainfall, and soil properties were compiled for each basin. These data were then input to a statistics software package for analysis using logistic regression; and (3) Relations between the occurrence or absence of debris flows and the basin morphology, burn severity, rainfall, and soil properties were evaluated, and five multivariate logistic regression models were constructed. All possible combinations of independent variables were evaluated to determine which combinations produced the most effective models, and the multivariate models that best predicted the occurrence of debris flows were identified. Percentage of high burn severity and 3-hour peak rainfall intensity were significant variables in all models. Soil organic matter content and soil clay content were significant variables in all models except Model 5. Soil slope was a significant variable in all models except Model 4. The most suitable model can be selected from these five models on the basis of the availability of independent variables in the particular area of interest and field checking of probability maps. The multivariate logistic regression models can be entered into a geographic information system, and maps showing the probability of debris flows can be constructed in recently burned areas of

  11. A logistic regression equation for estimating the probability of a stream flowing perennially in Massachusetts

    USGS Publications Warehouse

    Bent, Gardner C.; Archfield, Stacey A.

    2002-01-01

    A logistic regression equation was developed for estimating the probability of a stream flowing perennially at a specific site in Massachusetts. The equation provides city and town conservation commissions and the Massachusetts Department of Environmental Protection with an additional method for assessing whether streams are perennial or intermittent at a specific site in Massachusetts. This information is needed to assist these environmental agencies, who administer the Commonwealth of Massachusetts Rivers Protection Act of 1996, which establishes a 200-foot-wide protected riverfront area extending along the length of each side of the stream from the mean annual high-water line along each side of perennial streams, with exceptions in some urban areas. The equation was developed by relating the verified perennial or intermittent status of a stream site to selected basin characteristics of naturally flowing streams (no regulation by dams, surface-water withdrawals, ground-water withdrawals, diversion, waste-water discharge, and so forth) in Massachusetts. Stream sites used in the analysis were identified as perennial or intermittent on the basis of review of measured streamflow at sites throughout Massachusetts and on visual observation at sites in the South Coastal Basin, southeastern Massachusetts. Measured or observed zero flow(s) during months of extended drought as defined by the 310 Code of Massachusetts Regulations (CMR) 10.58(2)(a) were not considered when designating the perennial or intermittent status of a stream site. The database used to develop the equation included a total of 305 stream sites (84 intermittent- and 89 perennial-stream sites in the State, and 50 intermittent- and 82 perennial-stream sites in the South Coastal Basin). Stream sites included in the database had drainage areas that ranged from 0.14 to 8.94 square miles in the State and from 0.02 to 7.00 square miles in the South Coastal Basin. Results of the logistic regression analysis

  12. LOGISTIC NETWORK REGRESSION FOR SCALABLE ANALYSIS OF NETWORKS WITH JOINT EDGE/VERTEX DYNAMICS.

    PubMed

    Almquist, Zack W; Butts, Carter T

    2014-08-01

    Change in group size and composition has long been an important area of research in the social sciences. Similarly, interest in interaction dynamics has a long history in sociology and social psychology. However, the effects of endogenous group change on interaction dynamics are a surprisingly understudied area. One way to explore these relationships is through social network models. Network dynamics may be viewed as a process of change in the edge structure of a network, in the vertex set on which edges are defined, or in both simultaneously. Although early studies of such processes were primarily descriptive, recent work on this topic has increasingly turned to formal statistical models. Although showing great promise, many of these modern dynamic models are computationally intensive and scale very poorly in the size of the network under study and/or the number of time points considered. Likewise, currently used models focus on edge dynamics, with little support for endogenously changing vertex sets. Here, the authors show how an existing approach based on logistic network regression can be extended to serve as a highly scalable framework for modeling large networks with dynamic vertex sets. The authors place this approach within a general dynamic exponential family (exponential-family random graph modeling) context, clarifying the assumptions underlying the framework (and providing a clear path for extensions), and they show how model assessment methods for cross-sectional networks can be extended to the dynamic case. Finally, the authors illustrate this approach on a classic data set involving interactions among windsurfers on a California beach.

  13. Logistic regression modeling to assess groundwater vulnerability to contamination in Hawaii, USA.

    PubMed

    Mair, Alan; El-Kadi, Aly I

    2013-10-01

    Capture zone analysis combined with a subjective susceptibility index is currently used in Hawaii to assess vulnerability to contamination of drinking water sources derived from groundwater. In this study, we developed an alternative objective approach that combines well capture zones with multiple-variable logistic regression (LR) modeling and applied it to the highly-utilized Pearl Harbor and Honolulu aquifers on the island of Oahu, Hawaii. Input for the LR models utilized explanatory variables based on hydrogeology, land use, and well geometry/location. A suite of 11 target contaminants detected in the region, including elevated nitrate (>1 mg/L), four chlorinated solvents, four agricultural fumigants, and two pesticides, was used to develop the models. We then tested the ability of the new approach to accurately separate groups of wells with low and high vulnerability, and the suitability of nitrate as an indicator of other types of contamination. Our results produced contaminant-specific LR models that accurately identified groups of wells with the lowest/highest reported detections and the lowest/highest nitrate concentrations. Current and former agricultural land uses were identified as significant explanatory variables for eight of the 11 target contaminants, while elevated nitrate was a significant variable for five contaminants. The utility of the combined approach is contingent on the availability of hydrologic and chemical monitoring data for calibrating groundwater and LR models. Application of the approach using a reference site with sufficient data could help identify key variables in areas with similar hydrogeology and land use but limited data. In addition, elevated nitrate may also be a suitable indicator of groundwater contamination in areas with limited data. 
    The objective LR modeling approach developed in this study is flexible enough to address a wide range of contaminants and represents a suitable addition to the current subjective approach.

  14. Logistic regression modeling to assess groundwater vulnerability to contamination in Hawaii, USA

    NASA Astrophysics Data System (ADS)

    Mair, Alan; El-Kadi, Aly I.

    2013-10-01

    Capture zone analysis combined with a subjective susceptibility index is currently used in Hawaii to assess vulnerability to contamination of drinking water sources derived from groundwater. In this study, we developed an alternative objective approach that combines well capture zones with multiple-variable logistic regression (LR) modeling and applied it to the highly-utilized Pearl Harbor and Honolulu aquifers on the island of Oahu, Hawaii. Input for the LR models utilized explanatory variables based on hydrogeology, land use, and well geometry/location. A suite of 11 target contaminants detected in the region, including elevated nitrate (> 1 mg/L), four chlorinated solvents, four agricultural fumigants, and two pesticides, was used to develop the models. We then tested the ability of the new approach to accurately separate groups of wells with low and high vulnerability, and the suitability of nitrate as an indicator of other types of contamination. Our results produced contaminant-specific LR models that accurately identified groups of wells with the lowest/highest reported detections and the lowest/highest nitrate concentrations. Current and former agricultural land uses were identified as significant explanatory variables for eight of the 11 target contaminants, while elevated nitrate was a significant variable for five contaminants. The utility of the combined approach is contingent on the availability of hydrologic and chemical monitoring data for calibrating groundwater and LR models. Application of the approach using a reference site with sufficient data could help identify key variables in areas with similar hydrogeology and land use but limited data. In addition, elevated nitrate may also be a suitable indicator of groundwater contamination in areas with limited data. The objective LR modeling approach developed in this study is flexible enough to address a wide range of contaminants and represents a suitable addition to the current subjective approach.

  15. Bias-reduced and separation-proof conditional logistic regression with small or sparse data sets.

    PubMed

    Heinze, Georg; Puhr, Rainer

    2010-03-30

    Conditional logistic regression is used for the analysis of binary outcomes when subjects are stratified into several subsets, e.g. matched pairs or blocks. Log odds ratio estimates are usually found by maximizing the conditional likelihood. This approach eliminates all strata-specific parameters by conditioning on the number of events within each stratum. However, in the analyses of both an animal experiment and a lung cancer case-control study, conditional maximum likelihood (CML) resulted in infinite odds ratio estimates and monotone likelihood. Estimation can be improved by using Cytel Inc.'s well-known LogXact software, which provides a median unbiased estimate and exact or mid-p confidence intervals. Here, we suggest and outline point and interval estimation based on maximization of a penalized conditional likelihood in the spirit of Firth's (Biometrika 1993; 80:27-38) bias correction method (CFL). We present comparative analyses of both studies, demonstrating some advantages of CFL over competitors. We report on a small-sample simulation study where CFL log odds ratio estimates were almost unbiased, whereas LogXact estimates showed some bias and CML estimates exhibited serious bias. Confidence intervals and tests based on the penalized conditional likelihood had close-to-nominal coverage rates and yielded highest power among all methods compared, respectively. Therefore, we propose CFL as an attractive solution to the stratified analysis of binary data, irrespective of the occurrence of monotone likelihood. A SAS program implementing CFL is available at: http://www.muw.ac.at/msi/biometrie/programs.

  16. LOGISTIC NETWORK REGRESSION FOR SCALABLE ANALYSIS OF NETWORKS WITH JOINT EDGE/VERTEX DYNAMICS

    PubMed Central

    Almquist, Zack W.; Butts, Carter T.

    2015-01-01

    Change in group size and composition has long been an important area of research in the social sciences. Similarly, interest in interaction dynamics has a long history in sociology and social psychology. However, the effects of endogenous group change on interaction dynamics are a surprisingly understudied area. One way to explore these relationships is through social network models. Network dynamics may be viewed as a process of change in the edge structure of a network, in the vertex set on which edges are defined, or in both simultaneously. Although early studies of such processes were primarily descriptive, recent work on this topic has increasingly turned to formal statistical models. Although showing great promise, many of these modern dynamic models are computationally intensive and scale very poorly in the size of the network under study and/or the number of time points considered. Likewise, currently used models focus on edge dynamics, with little support for endogenously changing vertex sets. Here, the authors show how an existing approach based on logistic network regression can be extended to serve as a highly scalable framework for modeling large networks with dynamic vertex sets. The authors place this approach within a general dynamic exponential family (exponential-family random graph modeling) context, clarifying the assumptions underlying the framework (and providing a clear path for extensions), and they show how model assessment methods for cross-sectional networks can be extended to the dynamic case. Finally, the authors illustrate this approach on a classic data set involving interactions among windsurfers on a California beach. PMID:26120218

  17. Shock Index Correlates with Extravasation on Angiographs of Gastrointestinal Hemorrhage: A Logistics Regression Analysis

    SciTech Connect

    Nakasone, Yutaka; Ikeda, Osamu; Yamashita, Yasuyuki; Kudoh, Kouichi; Shigematsu, Yoshinori; Harada, Kazunori

    2007-09-15

    We applied multivariate analysis to the clinical findings in patients with acute gastrointestinal (GI) hemorrhage and compared the relationship between these findings and angiographic evidence of extravasation. Our study population consisted of 46 patients with acute GI bleeding. They were divided into two groups. In group 1 we retrospectively analyzed 41 angiograms obtained in 29 patients (age range, 25-91 years; average, 71 years). Their clinical findings, including the shock index (SI), diastolic blood pressure, hemoglobin, platelet counts, and age, were quantitatively analyzed. In group 2, consisting of 17 patients (age range, 21-78 years; average, 60 years), we prospectively applied statistical analysis by a logistic regression model to their clinical findings and then assessed 21 angiograms obtained in these patients to determine whether our model was useful for predicting the presence of angiographic evidence of extravasation. On 18 of 41 (43.9%) angiograms in group 1 there was evidence of extravasation; in 3 patients it was demonstrated only by selective angiography. Factors significantly associated with angiographic visualization of extravasation were the SI and patient age. For differentiation between cases with and cases without angiographic evidence of extravasation, the maximum cutoff point was between 0.51 and 0.53. Of the 21 angiograms obtained in group 2, 13 (61.9%) showed evidence of extravasation; in 1 patient it was demonstrated only on selective angiograms. We found that in 90% of the cases, the prospective application of our model correctly predicted the angiographically confirmed presence or absence of extravasation. We conclude that in patients with GI hemorrhage, angiographic visualization of extravasation is associated with the pre-embolization SI. Patients with a high SI value should undergo study to facilitate optimal treatment planning.

  18. Robust Inference from Conditional Logistic Regression Applied to Movement and Habitat Selection Analysis

    PubMed Central

    Duchesne, Thierry; Fortin, Daniel

    2017-01-01

    Conditional logistic regression (CLR) is widely used to analyze habitat selection and movement of animals when resource availability changes over space and time. Observations used for these analyses are typically autocorrelated, which biases model-based variance estimation of CLR parameters. This bias can be corrected using generalized estimating equations (GEE), an approach that requires partitioning the data into independent clusters. Here we establish the link between clustering rules in GEE and their effectiveness to remove statistical biases in variance estimation of CLR parameters. The current lack of guidelines is such that broad variation in clustering rules can be found among studies (e.g., 14–450 clusters) with unknown consequences on the robustness of statistical inference. We simulated datasets reflecting conditions typical of field studies. Longitudinal data were generated based on several parameters of habitat selection with varying strength of autocorrelation and some individuals having more observations than others. We then evaluated how changing the number of clusters impacted the effectiveness of variance estimators. Simulations revealed that 30 clusters were sufficient to get unbiased and relatively precise estimates of variance of parameter estimates. The use of destructive sampling to increase the number of independent clusters was successful at removing statistical bias, but only when observations were temporally autocorrelated and the strength of inter-individual heterogeneity was weak. GEE also provided robust estimates of variance for different magnitudes of unbalanced datasets. Our simulations demonstrate that GEE should be estimated by assigning each individual to a cluster when at least 30 animals are followed, or by using destructive sampling for studies with fewer individuals having intermediate level of behavioural plasticity in selection and temporally autocorrelated observations. The simulations provide valuable information to

  19. Binary logistic regression analysis of hard palate dimensions for sexing human crania

    PubMed Central

    Asif, Muhammed; Shetty, Radhakrishna; Avadhani, Ramakrishna

    2016-01-01

    Sex determination is the preliminary step in every forensic investigation and the hard palate assumes significance in cranial sexing in cases involving burns and explosions due to its resistant nature and secluded location. This study analyzes the sexing potential of incisive foramen to posterior nasal spine length, palatine process of maxilla length, horizontal plate of palatine bone length and transverse length between the greater palatine foramina. The study deviates from the conventional method of measuring the maxillo-alveolar length and breadth as the dimensions considered in this study are more heat resistant and useful in situations with damaged alveolar margins. The study involves 50 male and 50 female adult dry skulls of Indian ethnic group. The dimensions measured were statistically analyzed using Student's t test, binary logistic regression and receiver operating characteristic curve. It was observed that the incisive foramen to posterior nasal spine length is a definite sex marker with sex predictability of 87.2%. The palatine process of maxilla length with 66.8% sex predictability and the horizontal plate of palatine bone length with 71.9% sex predictability cannot be relied upon as definite sex markers. The transverse length between the greater palatine foramina is statistically insignificant in sexing crania (P=0.318). Considering a significant overlap of values in both the sexes the palatal dimensions singularly cannot be relied upon for sexing. Nevertheless, considering the high sex predictability of incisive foramen to posterior nasal spine length this dimension can definitely be used to supplement other sexing evidence available to precisely conclude the cranial sex. PMID:27382518

  20. Education-Based Gaps in eHealth: A Weighted Logistic Regression Approach

    PubMed Central

    2016-01-01

    Background Persons with a college degree are more likely to engage in eHealth behaviors than persons without a college degree, compounding the health disadvantages of undereducated groups in the United States. However, the extent to which quality of recent eHealth experience reduces the education-based eHealth gap is unexplored. Objective The goal of this study was to examine how eHealth information search experience moderates the relationship between college education and eHealth behaviors. Methods Based on a nationally representative sample of adults who reported using the Internet to conduct the most recent health information search (n=1458), I evaluated eHealth search experience in relation to the likelihood of engaging in different eHealth behaviors. I examined whether Internet health information search experience reduces the eHealth behavior gaps among college-educated and noncollege-educated adults. Weighted logistic regression models were used to estimate the probability of different eHealth behaviors. Results College education was significantly positively related to the likelihood of 4 eHealth behaviors. In general, eHealth search experience was negatively associated with health care behaviors, health information-seeking behaviors, and user-generated or content sharing behaviors after accounting for other covariates. Whereas Internet health information search experience has narrowed the education gap in terms of likelihood of using email or Internet to communicate with a doctor or health care provider and likelihood of using a website to manage diet, weight, or health, it has widened the education gap in the instances of searching for health information for oneself, searching for health information for someone else, and downloading health information on a mobile device. Conclusion The relationship between college education and eHealth behaviors is moderated by Internet health information search experience in different ways depending on the type of e

  1. An Original Stepwise Multilevel Logistic Regression Analysis of Discriminatory Accuracy: The Case of Neighbourhoods and Health

    PubMed Central

    Wagner, Philippe; Ghith, Nermin; Leckie, George

    2016-01-01

    Background and Aim Many multilevel logistic regression analyses of “neighbourhood and health” focus on interpreting measures of associations (e.g., odds ratio, OR). In contrast, multilevel analysis of variance is rarely considered. We propose an original stepwise analytical approach that distinguishes between “specific” (measures of association) and “general” (measures of variance) contextual effects. Performing two empirical examples we illustrate the methodology, interpret the results and discuss the implications of this kind of analysis in public health. Methods We analyse 43,291 individuals residing in 218 neighbourhoods in the city of Malmö, Sweden in 2006. We study two individual outcomes (psychotropic drug use and choice of private vs. public general practitioner, GP) for which the relative importance of neighbourhood as a source of individual variation differs substantially. In Step 1 of the analysis, we evaluate the OR and the area under the receiver operating characteristic (AUC) curve for individual-level covariates (i.e., age, sex and individual low income). In Step 2, we assess general contextual effects using the AUC. Finally, in Step 3 the OR for a specific neighbourhood characteristic (i.e., neighbourhood income) is interpreted jointly with the proportional change in variance (i.e., PCV) and the proportion of ORs in the opposite direction (POOR) statistics. Results For both outcomes, information on individual characteristics (Step 1) provide a low discriminatory accuracy (AUC = 0.616 for psychotropic drugs; = 0.600 for choosing a private GP). Accounting for neighbourhood of residence (Step 2) only improved the AUC for choosing a private GP (+0.295 units). High neighbourhood income (Step 3) was strongly associated to choosing a private GP (OR = 3.50) but the PCV was only 11% and the POOR 33%. Conclusion Applying an innovative stepwise multilevel analysis, we observed that, in Malmö, the neighbourhood context per se had a negligible

  2. The Application of the Cumulative Logistic Regression Model to Automated Essay Scoring

    ERIC Educational Resources Information Center

    Haberman, Shelby J.; Sinharay, Sandip

    2010-01-01

    Most automated essay scoring programs use a linear regression model to predict an essay score from several essay features. This article applied a cumulative logit model instead of the linear regression model to automated essay scoring. Comparison of the performances of the linear regression model and the cumulative logit model was performed on a…

  3. What's the Risk? A Simple Approach for Estimating Adjusted Risk Measures from Nonlinear Models Including Logistic Regression

    PubMed Central

    Kleinman, Lawrence C; Norton, Edward C

    2009-01-01

    Objective To develop and validate a general method (called regression risk analysis) to estimate adjusted risk measures from logistic and other nonlinear multiple regression models. We show how to estimate standard errors for these estimates. These measures could supplant various approximations (e.g., adjusted odds ratio [AOR]) that may diverge, especially when outcomes are common. Study Design Regression risk analysis estimates were compared with internal standards as well as with Mantel–Haenszel estimates, Poisson and log-binomial regressions, and a widely used (but flawed) equation to calculate adjusted risk ratios (ARR) from AOR. Data Collection Data sets produced using Monte Carlo simulations. Principal Findings Regression risk analysis accurately estimates ARR and differences directly from multiple regression models, even when confounders are continuous, distributions are skewed, outcomes are common, and effect size is large. It is statistically sound and intuitive, and has properties favoring it over other methods in many cases. Conclusions Regression risk analysis should be the new standard for presenting findings from multiple regression analysis of dichotomous outcomes for cross-sectional, cohort, and population-based case–control studies, particularly when outcomes are common or effect size is large. PMID:18793213
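
    The abstract's remark that odds ratios may diverge from risk ratios when outcomes are common can be illustrated with a small calculation. The sketch below is not the authors' regression risk analysis; it only compares the two measures for a rare and a common outcome, using made-up risks chosen for illustration.

```python
def risk_ratio(p_exposed, p_unexposed):
    """Ratio of outcome probabilities (risks) in the two groups."""
    return p_exposed / p_unexposed


def odds_ratio(p_exposed, p_unexposed):
    """Ratio of outcome odds, where odds = p / (1 - p)."""
    odds_exposed = p_exposed / (1 - p_exposed)
    odds_unexposed = p_unexposed / (1 - p_unexposed)
    return odds_exposed / odds_unexposed


# Hypothetical risks (assumptions for illustration, not data from the study):
# the risk ratio is 2 in both scenarios, but the odds ratio is close to 2
# only when the outcome is rare.
scenarios = {"rare outcome": (0.02, 0.01), "common outcome": (0.60, 0.30)}
for label, (p1, p0) in scenarios.items():
    print(f"{label}: RR = {risk_ratio(p1, p0):.2f}, OR = {odds_ratio(p1, p0):.2f}")
```

    With these assumed risks the odds ratio stays near the risk ratio for the rare outcome and overstates it substantially for the common one, which is the situation the abstract warns about.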

  4. Updated logistic regression equations for the calculation of post-fire debris-flow likelihood in the western United States

    USGS Publications Warehouse

    Staley, Dennis M.; Negri, Jacquelyn A.; Kean, Jason W.; Laber, Jayme L.; Tillery, Anne C.; Youberg, Ann M.

    2016-06-30

    Wildfire can significantly alter the hydrologic response of a watershed to the extent that even modest rainstorms can generate dangerous flash floods and debris flows. To reduce public exposure to hazard, the U.S. Geological Survey produces post-fire debris-flow hazard assessments for select fires in the western United States. We use publicly available geospatial data describing basin morphology, burn severity, soil properties, and rainfall characteristics to estimate the statistical likelihood that debris flows will occur in response to a storm of a given rainfall intensity. Using an empirical database and refined geospatial analysis methods, we defined new equations for the prediction of debris-flow likelihood using logistic regression methods. We showed that the new logistic regression model outperformed previous models used to predict debris-flow likelihood.
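
    As a general illustration of how a fitted logistic regression equation turns predictor values into a likelihood, the sketch below applies the standard logistic function to a linear combination of predictors. The function name, coefficients, and predictor values are hypothetical placeholders; they are not the equations or coefficients published in the report.

```python
import math


def logistic_probability(intercept, coefficients, predictors):
    """Standard logistic model: p = 1 / (1 + exp(-x)),
    where x = intercept + sum(coefficient_i * predictor_i)."""
    x = intercept + sum(b * v for b, v in zip(coefficients, predictors))
    return 1.0 / (1.0 + math.exp(-x))


# Hypothetical coefficients and predictor values, for illustration only.
example_coefficients = [0.4, 0.7]
example_predictors = [2.0, 1.5]
p = logistic_probability(-3.0, example_coefficients, example_predictors)
print(f"Estimated likelihood: {p:.3f}")
```

    The output is always between 0 and 1 and increases with any predictor that has a positive coefficient, which is what makes the form convenient for expressing a likelihood of occurrence.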

  5. Analysis of an environmental exposure health questionnaire in a metropolitan minority population utilizing logistic regression and Support Vector Machines.

    PubMed

    Chen, Chau-Kuang; Bruce, Michelle; Tyler, Lauren; Brown, Claudine; Garrett, Angelica; Goggins, Susan; Lewis-Polite, Brandy; Weriwoh, Mirabel L; Juarez, Paul D; Hood, Darryl B; Skelton, Tyler

    2013-02-01

    The goal of this study was to analyze a 54-item instrument for assessment of perception of exposure to environmental contaminants within the context of the built environment, or exposome. This exposome was defined in five domains to include 1) home and hobby, 2) school, 3) community, 4) occupation, and 5) exposure history. Interviews were conducted with child-bearing-age minority women at Metro Nashville General Hospital at Meharry Medical College. Data were analyzed utilizing DTReg software for Support Vector Machine (SVM) modeling followed by an SPSS package for a logistic regression model. The target (outcome) variable of interest was respondent's residence by ZIP code. The results demonstrate that the rank order of important variables with respect to SVM modeling versus traditional logistic regression models is almost identical. This is the first study documenting that SVM analysis has discriminate power for determination of higher-ordered spatial relationships on an environmental exposure history questionnaire.

  6. An empirical study of statistical properties of variance partition coefficients for multi-level logistic regression models

    USGS Publications Warehouse

    Li, J.; Gray, B.R.; Bates, D.M.

    2008-01-01

    Partitioning the variance of a response by design levels is challenging for binomial and other discrete outcomes. Goldstein (2003) proposed four definitions for variance partitioning coefficients (VPC) under a two-level logistic regression model. In this study, we explicitly derived formulae for multi-level logistic regression models and subsequently studied the distributional properties of the calculated VPCs. Using simulations and a vegetation dataset, we demonstrated associations between different VPC definitions, the importance of methods for estimating VPCs (by comparing VPC obtained using Laplace and penalized quasi-likelihood methods), and bivariate dependence between VPCs calculated at different levels. Such an empirical study lends immediate support to wider applications of VPC in scientific data analysis.
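
    One commonly used VPC definition for a two-level random-intercept logistic model is the latent-variable formulation, which takes the level-1 residual variance to be that of the standard logistic distribution, π²/3. The sketch below shows only that one definition; the abstract does not say how it maps onto the four definitions studied, and the variance values are assumptions chosen for illustration.

```python
import math

# Variance of the standard logistic distribution (level-1 residual variance
# under the latent-variable formulation).
LOGISTIC_RESIDUAL_VARIANCE = math.pi ** 2 / 3


def vpc_latent(sigma2_u):
    """Latent-variable VPC: share of total latent variance attributable to
    the level-2 random intercept with variance sigma2_u."""
    return sigma2_u / (sigma2_u + LOGISTIC_RESIDUAL_VARIANCE)


# Hypothetical level-2 variances, for illustration only.
for sigma2_u in (0.1, 0.5, 1.0, 3.0):
    print(f"sigma2_u = {sigma2_u:.1f} -> VPC = {vpc_latent(sigma2_u):.3f}")
```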

  7. Susceptibility mapping of shallow landslides using kernel-based Gaussian process, support vector machines and logistic regression

    NASA Astrophysics Data System (ADS)

    Colkesen, Ismail; Sahin, Emrehan Kutlug; Kavzoglu, Taskin

    2016-06-01

    Identification of landslide prone areas and production of accurate landslide susceptibility zonation maps have been crucial topics for hazard management studies. Since the prediction of susceptibility is one of the main processing steps in landslide susceptibility analysis, selection of a suitable prediction method plays an important role in the success of the susceptibility zonation process. Although simple statistical algorithms (e.g. logistic regression) have been widely used in the literature, the use of advanced non-parametric algorithms in landslide susceptibility zonation has recently become an active research topic. The main purpose of this study is to investigate the possible application of kernel-based Gaussian process regression (GPR) and support vector regression (SVR) for producing landslide susceptibility map of Tonya district of Trabzon, Turkey. Results of these two regression methods were compared with logistic regression (LR) method that is regarded as a benchmark method. Results showed that while kernel-based GPR and SVR methods generally produced similar results (90.46% and 90.37%, respectively), they outperformed the conventional LR method by about 18%. While confirming the superiority of the GPR method, statistical tests based on ROC statistics, success rate and prediction rate curves revealed the significant improvement in susceptibility map accuracy by applying kernel-based GPR and SVR methods.

  8. Use of genetic programming, logistic regression, and artificial neural nets to predict readmission after coronary artery bypass surgery.

    PubMed

    Engoren, Milo; Habib, Robert H; Dooner, John J; Schwann, Thomas A

    2013-08-01

    As many as 14 % of patients undergoing coronary artery bypass surgery are readmitted within 30 days. Readmission is usually the result of morbidity and may lead to death. The purpose of this study is to develop and compare statistical and genetic programming models to predict readmission. Patients were divided into separate Construction and Validation populations. Using 88 variables, logistic regression, genetic programs, and artificial neural nets were used to develop predictive models. Models were first constructed and tested on the Construction population, then validated on the Validation population. Areas under the receiver operator characteristic curves (AU ROC) were used to compare the models. Two hundred and two patients (7.6 %) in the 2,644 patient Construction group and 216 (8.0 %) of the 2,711 patient Validation group were re-admitted within 30 days of CABG surgery. Logistic regression predicted readmission with AU ROC = .675 ± .021 in the Construction group. Genetic programs significantly improved the accuracy (AU ROC = .767 ± .001, p < .001). Artificial neural nets were less accurate with AU ROC = 0.597 ± .001 in the Construction group. Predictive accuracy of all three techniques fell in the Validation group. However, the accuracy of genetic programming (AU ROC = .654 ± .001) was still trivially but statistically non-significantly better than that of the logistic regression (AU ROC = .644 ± .020, p = .61). Genetic programming and logistic regression provide alternative methods to predict readmission that are similarly accurate.

  9. The identification of menstrual blood in forensic samples by logistic regression modeling of miRNA expression.

    PubMed

    Hanson, Erin K; Mirza, Mohid; Rekab, Kamel; Ballantyne, Jack

    2014-11-01

    We report the identification of sensitive and specific miRNA biomarkers for menstrual blood, a tissue that might provide probative information in certain specialized instances. We incorporated these biomarkers into qPCR assays and developed a quantitative statistical model using logistic regression that permits the prediction of menstrual blood in a forensic sample with a high, and measurable, degree of accuracy. Using the developed model, we achieved 100% accuracy in determining the body fluid of interest for a set of test samples (i.e. samples not used in model development). The development, and details, of the logistic regression model are described. Testing and evaluation of the finalized logistic regression modeled assay using a small number of samples was carried out to preliminarily estimate the limit of detection (LOD), specificity in admixed samples and expression of the menstrual blood miRNA biomarkers throughout the menstrual cycle (25-28 days). The LOD was <1 ng of total RNA, the assay performed as expected with admixed samples and menstrual blood was identified only during the menses phase of the female reproductive cycle in two donors.

  10. Logistic Regression Analysis of Contrast-Enhanced Ultrasound and Conventional Ultrasound Characteristics of Sub-centimeter Thyroid Nodules.

    PubMed

    Zhao, Rui-Na; Zhang, Bo; Yang, Xiao; Jiang, Yu-Xin; Lai, Xing-Jian; Zhang, Xiao-Yan

    2015-12-01

    The purpose of the study described here was to determine specific characteristics of thyroid microcarcinoma (TMC) and explore the value of contrast-enhanced ultrasound (CEUS) combined with conventional ultrasound (US) in the diagnosis of TMC. Characteristics of 63 patients with TMC and 39 with benign sub-centimeter thyroid nodules were retrospectively analyzed. Multivariate logistic regression analysis was performed to determine independent risk factors. Four variables were included in the logistic regression models: age, shape, blood flow distribution and enhancement pattern. The area under the receiver operating characteristic curve was 0.919. With 0.113 selected as the cutoff value, sensitivity, specificity, positive predictive value, negative predictive value and accuracy were 90.5%, 82.1%, 89.1%, 84.2% and 87.3%, respectively. Independent risk factors for TMC determined with the combination of CEUS and conventional US were age, shape, blood flow distribution and enhancement pattern. Age was negatively correlated with malignancy, whereas shape, blood flow distribution and enhancement pattern were positively correlated. The logistic regression model involving CEUS and conventional US was found to be effective in the diagnosis of sub-centimeter thyroid nodules.
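
    The reported diagnostic measures can be cross-checked against a 2x2 table. The cell counts below are not given in the abstract; they are reconstructed under the assumption that the 63 TMC and 39 benign nodules are the units being classified, in which case 57 true positives and 32 true negatives reproduce the reported percentages after rounding.

```python
# Reconstructed (assumed) 2x2 counts; only the group sizes 63 and 39
# come from the abstract.
true_positive = 57   # TMC nodules classified as malignant
false_negative = 6   # TMC nodules classified as benign
true_negative = 32   # benign nodules classified as benign
false_positive = 7   # benign nodules classified as malignant

total = true_positive + false_negative + true_negative + false_positive

metrics = {
    "sensitivity": true_positive / (true_positive + false_negative),
    "specificity": true_negative / (true_negative + false_positive),
    "positive predictive value": true_positive / (true_positive + false_positive),
    "negative predictive value": true_negative / (true_negative + false_negative),
    "accuracy": (true_positive + true_negative) / total,
}

for name, value in metrics.items():
    print(f"{name}: {value * 100:.1f}%")
```

    Under these assumed counts the five values round to 90.5%, 82.1%, 89.1%, 84.2% and 87.3%, matching the figures in the abstract.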

  11. Comparing performances of heuristic and logistic regression models for a spatial landslide susceptibility assessment in Maramures County, Northwestern Romania

    NASA Astrophysics Data System (ADS)

    Mǎgut, F. L.; Zaharia, S.; Glade, T.; Irimuş, I. A.

    2012-04-01

    Various methods exist in analyzing spatial landslide susceptibility and classing the results in susceptibility classes. The prediction of spatial landslide distribution can be performed by using a variety of methods based on GIS techniques. The two very common methods of a heuristic assessment and a logistic regression model are employed in this study in order to compare their performance in predicting the spatial distribution of previously mapped landslides for a study area located in Maramureš County, in Northwestern Romania. The first model determines a susceptibility index by combining the heuristic approach with GIS techniques of spatial data analysis. The criteria used for quantifying each susceptibility factor and the expression used to determine the susceptibility index are taken from the Romanian legislation (Governmental Decision 447/2003). This procedure is followed in any Romanian state-ordered study which relies on financial support. The logistic regression model predicts the spatial distribution of landslides by statistically calculating regressive coefficients which describe the dependency of previously mapped landslides on different factors. The identified shallow landslides correspond generally to Pannonian marl and Quaternary contractile clay deposits. The study region is located in the Northwestern part of Romania, including the Baia Mare municipality, the capital of Maramureš County. The study focuses on the former piedmontal region situated to the south of the volcanic mountains Gutâi, in the Baia Mare Depression, where most of the landslide activity has been recorded. In addition, a narrow sector of the volcanic mountains which borders the city of Baia Mare to the north has also been included to test the accuracy of the models in different lithologic units. The results of both models indicate a general medium landslide susceptibility of the study area. The more detailed differences will be discussed with respect to the advantages and

  12. Trends in age-adjusted coronary heart disease mortality rates in Slovakia between 1993 and 2009.

    PubMed

    Psota, Marek; Pekarciková, Jarmila; O'Mullane, Monica; Rusnák, Martin

    2013-06-01

    Cardiovascular diseases (CVD) and especially coronary heart disease (CHD) are the main causes of death in the Slovak Republic (SR). The aim of this study is to explore trends in age-adjusted coronary heart disease mortality rates in the whole Slovak population and in the population of working age between the years 1993 and 2009. A related indicator - potential years of life lost (PYLL) due to CHD - was calculated in the same period for males and females. Crude CHD mortality rates were age-adjusted using the European standard population. Joinpoint Poisson regression was performed in order to find the annual percentage change in trends. The age-adjusted CHD mortality rates decreased in the Slovak population and also in the population of working age. The change was significant only within the working-age sub-group. We found that partial diagnoses (myocardial infarction and chronic ischaemic heart disease) developed in a mirror-like manner. PYLL per 100,000 decreased during the observed period and the decline was more prominent in males. For further research we recommend focusing on several other issues, namely, to examine the validity of cause of death codes, to examine the development of mortality rates in selected age groups, to find out the cause of differential development of mortality rates in the Slovak Republic in comparison with the Czech Republic and Poland, and to explain the causes of decrease of the age-adjusted CHD mortality rates in younger age groups in Slovakia.
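
    Age adjustment against a standard population is usually done by direct standardization: each age-specific rate is weighted by that age group's share of the standard population. The sketch below assumes this is the method meant in the abstract. The age bands, counts, and weights are hypothetical and are not the European standard population or the Slovak data.

```python
def age_adjusted_rate(deaths, population, standard_weights, per=100_000):
    """Directly standardized rate: weighted average of age-specific rates,
    with weights taken from a standard population."""
    total_weight = sum(standard_weights)
    weighted = sum(
        w * d / n for d, n, w in zip(deaths, population, standard_weights)
    )
    return weighted / total_weight * per


def crude_rate(deaths, population, per=100_000):
    """Total deaths divided by total population."""
    return sum(deaths) / sum(population) * per


# Hypothetical data for three age bands (young, middle, old).
deaths = [10, 50, 200]
population = [100_000, 50_000, 20_000]
standard_weights = [0.5, 0.3, 0.2]  # assumed standard-population shares

print(f"Crude rate:        {crude_rate(deaths, population):.1f} per 100,000")
print(f"Age-adjusted rate: {age_adjusted_rate(deaths, population, standard_weights):.1f} per 100,000")
```

    In this made-up example the adjusted rate is higher than the crude rate because the assumed standard population has a larger share of older people than the study population.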

  13. Age-adjusted mortality and its association to variations in urban conditions in Shanghai.

    PubMed

    Takano, Takehito; Fu, Jia; Nakamura, Keiko; Uji, Kazuyuki; Fukuda, Yoshiharu; Watanabe, Masafumi; Nakajima, Hiroshi

    2002-09-01

    The objective of this study was to explore the association between health and urbanization in a megacity, Shanghai, by calculating the age-adjusted mortality ratio by ward-unit of Shanghai and by examining relationships between mortalities and urban indicators. Crude mortality rates and age-adjusted mortality ratios by ward-unit were calculated. Demographic, residential environment, healthcare, and socioeconomic indicators were formulated for each of the ward-units between 1995 and 1998. Correlation and Poisson regression analyses were performed to examine the association between urban indicators and mortalities. The crude mortality rate by ward-unit in 1997 varied from 6.3 to 9.4 deaths per 1000 population. The age-adjusted mortality ratio in 1997 by ward-units as reference to the average mortality of urban China varied from 57.8 to 113.3 within Shanghai. Age-adjusted mortalities were inversely related with indicators of a larger floor space of dwellings per population, a larger proportion of parks, gardens, and green areas to total land area; a greater number of health professionals per population; and a greater number of employees in retail business per population. Spacious living showed independent association to a higher standard of community health in Shanghai (P < 0.05). Consequences of health policy and the developments of urban infrastructural resources from the viewpoint of the Healthy Cities concept were discussed.

  14. Comparison of Logistic Regression and Random Forests techniques for shallow landslide susceptibility assessment in Giampilieri (NE Sicily, Italy)

    NASA Astrophysics Data System (ADS)

    Trigila, Alessandro; Iadanza, Carla; Esposito, Carlo; Scarascia-Mugnozza, Gabriele

    2015-11-01

    The aim of this work is to define reliable susceptibility models for shallow landslides using Logistic Regression and Random Forests multivariate statistical techniques. The study area, located in North-East Sicily, was hit on October 1st 2009 by a severe rainstorm (225 mm of cumulative rainfall in 7 h) which caused flash floods and more than 1000 landslides. Several small villages, such as Giampilieri, were hit with 31 fatalities, 6 missing persons and damage to buildings and transportation infrastructures. Landslides, mainly types such as earth and debris translational slides evolving into debris flows, were triggered on steep slopes and involved colluvium and regolith materials which cover the underlying metamorphic bedrock. The work has been carried out with the following steps: i) realization of a detailed event landslide inventory map through field surveys coupled with observation of high resolution aerial colour orthophoto; ii) identification of landslide source areas; iii) data preparation of landslide controlling factors and descriptive statistics based on a bivariate method (Frequency Ratio) to get an initial overview on existing relationships between causative factors and shallow landslide source areas; iv) choice of criteria for the selection and sizing of the mapping unit; v) implementation of 5 multivariate statistical susceptibility models based on Logistic Regression and Random Forests techniques and focused on landslide source areas; vi) evaluation of the influence of sample size and type of sampling on results and performance of the models; vii) evaluation of the predictive capabilities of the models using ROC curve, AUC and contingency tables; viii) comparison of model results and obtained susceptibility maps; and ix) analysis of temporal variation of landslide susceptibility related to input parameter changes. Models based on Logistic Regression and Random Forests have demonstrated excellent predictive capabilities. Land use and wildfire

  15. An investigation of the speeding-related crash designation through crash narrative reviews sampled via logistic regression.

    PubMed

    Fitzpatrick, Cole D; Rakasi, Saritha; Knodler, Michael A

    2017-01-01

    Speed is one of the most important factors in traffic safety as higher speeds are linked to increased crash risk and higher injury severities. Nearly a third of fatal crashes in the United States are designated as "speeding-related", which is defined as either "the driver behavior of exceeding the posted speed limit or driving too fast for conditions." While many studies have utilized the speeding-related designation in safety analyses, no studies have examined the underlying accuracy of this designation. Herein, we investigate the speeding-related crash designation through the development of a series of logistic regression models that were derived from the established speeding-related crash typologies and validated using a blind review, by multiple researchers, of 604 crash narratives. The developed logistic regression model accurately identified crashes which were not originally designated as speeding-related but had crash narratives that suggested speeding as a causative factor. Only 53.4% of crashes designated as speeding-related contained narratives which described speeding as a causative factor. Further investigation of these crashes revealed that the driver contributing code (DCC) of "driving too fast for conditions" was being used in three separate situations. Additionally, this DCC was also incorrectly used when "exceeding the posted speed limit" would likely have been a more appropriate designation. Finally, it was determined that the responding officer only utilized one DCC in 82% of crashes not designated as speeding-related but contained a narrative indicating speed as a contributing causal factor. The use of logistic regression models based upon speeding-related crash typologies offers a promising method by which all possible speeding-related crashes could be identified.

  16. An Alternative Flight Software Trigger Paradigm: Applying Multivariate Logistic Regression to Sense Trigger Conditions Using Inaccurate or Scarce Information

    NASA Technical Reports Server (NTRS)

    Smith, Kelly M.; Gay, Robert S.; Stachowiak, Susan J.

    2013-01-01

    In late 2014, NASA will fly the Orion capsule on a Delta IV-Heavy rocket for the Exploration Flight Test-1 (EFT-1) mission. For EFT-1, the Orion capsule will be flying with a new GPS receiver and new navigation software. Given the experimental nature of the flight, the flight software must be robust to the loss of GPS measurements. Once the high-speed entry is complete, the drogue parachutes must be deployed within the proper conditions to stabilize the vehicle prior to deploying the main parachutes. When GPS is available in nominal operations, the vehicle will deploy the drogue parachutes based on an altitude trigger. However, when GPS is unavailable, the navigated altitude errors become excessively large, driving the need for a backup barometric altimeter to improve altitude knowledge. In order to increase overall robustness, the vehicle also has an alternate method of triggering the parachute deployment sequence based on planet-relative velocity if both the GPS and the barometric altimeter fail. However, this backup trigger results in large altitude errors relative to the targeted altitude. Motivated by this challenge, this paper demonstrates how logistic regression may be employed to semi-automatically generate robust triggers based on statistical analysis. Logistic regression is used as a ground processor pre-flight to develop a statistical classifier. The classifier would then be implemented in flight software and executed in real-time. This technique offers improved performance even in the face of highly inaccurate measurements. Although the logistic regression-based trigger approach will not be implemented within EFT-1 flight software, the methodology can be carried forward for future missions and vehicles.
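
    The split described here, fitting on the ground and evaluating in flight, can be sketched as follows. Only the evaluation step is shown: a fitted logistic model reduces to an intercept and a coefficient vector, so the onboard check is a dot product followed by a sigmoid. The single feature (planet-relative velocity), the coefficient values, and the 0.5 threshold are hypothetical and do not come from the paper.

```python
import math

# Evaluation of a pre-fitted logistic classifier as a trigger.
# Coefficients, feature choice, and threshold are hypothetical illustrations.

def trigger_probability(features, intercept, coefficients):
    """Probability that trigger conditions are met, from a fitted logistic model."""
    if len(features) != len(coefficients):
        raise ValueError("features and coefficients must have the same length")
    z = intercept + sum(b * x for b, x in zip(coefficients, features))
    return 1.0 / (1.0 + math.exp(-z))

def should_deploy(features, intercept, coefficients, threshold=0.5):
    """Fire the trigger once the modelled probability reaches the threshold."""
    return trigger_probability(features, intercept, coefficients) >= threshold

# Hypothetical model: the probability rises as planet-relative velocity (m/s) falls.
INTERCEPT = 12.0
COEFFICIENTS = [-0.05]

for velocity in (300.0, 240.0, 200.0):
    p = trigger_probability([velocity], INTERCEPT, COEFFICIENTS)
    print(velocity, round(p, 3), should_deploy([velocity], INTERCEPT, COEFFICIENTS))
```

    With these made-up coefficients the decision boundary sits at 240 m/s, where the linear term is zero and the probability is exactly 0.5.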

  17. An Alternative Flight Software Paradigm: Applying Multivariate Logistic Regression to Sense Trigger Conditions using Inaccurate or Scarce Information

    NASA Technical Reports Server (NTRS)

    Smith, Kelly; Gay, Robert; Stachowiak, Susan

    2013-01-01

    In late 2014, NASA will fly the Orion capsule on a Delta IV-Heavy rocket for the Exploration Flight Test-1 (EFT-1) mission. For EFT-1, the Orion capsule will be flying with a new GPS receiver and new navigation software. Given the experimental nature of the flight, the flight software must be robust to the loss of GPS measurements. Once the high-speed entry is complete, the drogue parachutes must be deployed within the proper conditions to stabilize the vehicle prior to deploying the main parachutes. When GPS is available in nominal operations, the vehicle will deploy the drogue parachutes based on an altitude trigger. However, when GPS is unavailable, the navigated altitude errors become excessively large, driving the need for a backup barometric altimeter to improve altitude knowledge. In order to increase overall robustness, the vehicle also has an alternate method of triggering the parachute deployment sequence based on planet-relative velocity if both the GPS and the barometric altimeter fail. However, this backup trigger results in large altitude errors relative to the targeted altitude. Motivated by this challenge, this paper demonstrates how logistic regression may be employed to semi-automatically generate robust triggers based on statistical analysis. Logistic regression is used as a ground processor pre-flight to develop a statistical classifier. The classifier would then be implemented in flight software and executed in real-time. This technique offers improved performance even in the face of highly inaccurate measurements. Although the logistic regression-based trigger approach will not be implemented within EFT-1 flight software, the methodology can be carried forward for future missions and vehicles.

  18. An Alternative Flight Software Trigger Paradigm: Applying Multivariate Logistic Regression to Sense Trigger Conditions using Inaccurate or Scarce Information

    NASA Technical Reports Server (NTRS)

    Smith, Kelly M.; Gay, Robert S.; Stachowiak, Susan J.

    2013-01-01

    In late 2014, NASA will fly the Orion capsule on a Delta IV-Heavy rocket for the Exploration Flight Test-1 (EFT-1) mission. For EFT-1, the Orion capsule will be flying with a new GPS receiver and new navigation software. Given the experimental nature of the flight, the flight software must be robust to the loss of GPS measurements. Once the high-speed entry is complete, the drogue parachutes must be deployed within the proper conditions to stabilize the vehicle prior to deploying the main parachutes. When GPS is available in nominal operations, the vehicle will deploy the drogue parachutes based on an altitude trigger. However, when GPS is unavailable, the navigated altitude errors become excessively large, driving the need for a backup barometric altimeter. In order to increase overall robustness, the vehicle also has an alternate method of triggering the drogue parachute deployment based on planet-relative velocity if both the GPS and the barometric altimeter fail. However, this velocity-based trigger results in large altitude errors relative to the targeted altitude. Motivated by this challenge, this paper demonstrates how logistic regression may be employed to automatically generate robust triggers based on statistical analysis. Logistic regression is used as a ground processor pre-flight to develop a classifier. The classifier would then be implemented in flight software and executed in real-time. This technique offers excellent performance even in the face of highly inaccurate measurements. Although the logistic regression-based trigger approach will not be implemented within EFT-1 flight software, the methodology can be carried forward for future missions and vehicles.

  19. Remote sensing and GIS-based landslide hazard analysis and cross-validation using multivariate logistic regression model on three test areas in Malaysia

    NASA Astrophysics Data System (ADS)

    Pradhan, Biswajeet

    2010-05-01

    This paper presents the results of the cross-validation of a multivariate logistic regression model using remote sensing data and GIS for landslide hazard analysis on the Penang, Cameron, and Selangor areas in Malaysia. Landslide locations in the study areas were identified by interpreting aerial photographs and satellite images, supported by field surveys. SPOT 5 and Landsat TM satellite imagery were used to map landcover and vegetation index, respectively. Maps of topography, soil type, lineaments and land cover were constructed from the spatial datasets. Ten factors which influence landslide occurrence, i.e., slope, aspect, curvature, distance from drainage, lithology, distance from lineaments, soil type, landcover, rainfall precipitation, and normalized difference vegetation index (NDVI), were extracted from the spatial database and the logistic regression coefficient of each factor was computed. Then the landslide hazard was analysed using the multivariate logistic regression coefficients derived not only from the data for the respective area but also using the logistic regression coefficients calculated from each of the other two areas (nine hazard maps in all) as a cross-validation of the model. For verification of the model, the results of the analyses were then compared with the field-verified landslide locations. Among the three cases of the application of logistic regression coefficient in the same study area, the case of Selangor based on the Selangor logistic regression coefficients showed the highest accuracy (94%), whereas Penang based on the Penang coefficients showed the lowest accuracy (86%). Similarly, among the six cases from the cross application of logistic regression coefficient in the other two areas, the case of Selangor based on logistic coefficient of Cameron showed the highest (90%) prediction accuracy whereas the case of Penang based on the Selangor logistic regression coefficients showed the lowest accuracy (79%). Qualitatively, the cross

  20. Applicability of the Ricketts’ posteroanterior cephalometry for sex determination using logistic regression analysis in Hispano American Peruvians

    PubMed Central

    Perez, Ivan; Chavez, Allison K.; Ponce, Dario

    2016-01-01

    Background: The Ricketts' posteroanterior (PA) cephalometry seems to be the most widely used and it has not been tested by multivariate statistics for sex determination. Objective: The objective was to determine the applicability of Ricketts' PA cephalometry for sex determination using the logistic regression analysis. Materials and Methods: The logistic models were estimated at distinct age cutoffs (all ages, 11 years, 13 years, and 15 years) in a database from 1,296 Hispano American Peruvians between 5 years and 44 years of age. Results: The logistic models were composed of six cephalometric measurements; the accuracy achieved by resubstitution varied between 60% and 70% and all the variables, with one exception, exhibited a direct relationship with the probability of being classified as male; the nasal width exhibited an indirect relationship. Conclusion: The maxillary and facial widths were present in all models and may represent a sexual dimorphism indicator. The accuracy found was lower than that reported in the literature and the Ricketts' PA cephalometry may not be adequate for sex determination. The indirect relationship of the nasal width in models with data from patients of 12 years of age or less may be a trait related to age or a characteristic in the studied population, which could be better studied and confirmed. PMID:27555732

  1. Estimating the susceptibility of surface water in Texas to nonpoint-source contamination by use of logistic regression modeling

    USGS Publications Warehouse

    Battaglin, William A.; Ulery, Randy L.; Winterstein, Thomas; Welborn, Toby

    2003-01-01

    In the State of Texas, surface water (streams, canals, and reservoirs) and ground water are used as sources of public water supply. Surface-water sources of public water supply are susceptible to contamination from point and nonpoint sources. To help protect sources of drinking water and to aid water managers in designing protective yet cost-effective and risk-mitigated monitoring strategies, the Texas Commission on Environmental Quality and the U.S. Geological Survey developed procedures to assess the susceptibility of public water-supply source waters in Texas to the occurrence of 227 contaminants. One component of the assessments is the determination of susceptibility of surface-water sources to nonpoint-source contamination. To accomplish this, water-quality data at 323 monitoring sites were matched with geographic information system-derived watershed-characteristic data for the watersheds upstream from the sites. Logistic regression models then were developed to estimate the probability that a particular contaminant will exceed a threshold concentration specified by the Texas Commission on Environmental Quality. Logistic regression models were developed for 63 of the 227 contaminants. Of the remaining contaminants, 106 were not modeled because monitoring data were available at fewer than 10 percent of the monitoring sites; 29 were not modeled because there were fewer than 15 percent detections of the contaminant in the monitoring data; 27 were not modeled because of the lack of any monitoring data; and 2 were not modeled because threshold values were not specified.
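
    The contaminant counts reported in this abstract can be tallied to confirm that the modeled and unmodeled groups account for all 227 contaminants. The counts below are the ones given in the abstract; the dictionary keys are shorthand labels chosen for this sketch.

```python
# Tally of the contaminant counts reported in the abstract.
# Dictionary keys are shorthand labels invented for this sketch.

TOTAL_CONTAMINANTS = 227
modeled = 63

not_modeled = {
    "data at too few monitoring sites": 106,
    "too few detections": 29,
    "no monitoring data": 27,
    "no threshold value specified": 2,
}

unmodeled_total = sum(not_modeled.values())
accounted_for = modeled + unmodeled_total

print(f"modeled: {modeled}, not modeled: {unmodeled_total}, total: {accounted_for}")
print(f"share modeled: {100 * modeled / TOTAL_CONTAMINANTS:.1f}%")
```

    The groups sum to 227, so roughly 28 percent of the contaminants had enough data and a specified threshold to support a model.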

  2. HEALER: homomorphic computation of ExAct Logistic rEgRession for secure rare disease variants analysis in GWAS

    PubMed Central

    Wang, Shuang; Zhang, Yuchen; Dai, Wenrui; Lauter, Kristin; Kim, Miran; Tang, Yuzhe; Xiong, Hongkai; Jiang, Xiaoqian

    2016-01-01

    Motivation: Genome-wide association studies (GWAS) have been widely used in discovering the association between genotypes and phenotypes. Human genome data contain valuable but highly sensitive information. Unprotected disclosure of such information might put individuals' privacy at risk. It is important to protect human genome data. Exact logistic regression is a bias-reduction method based on a penalized likelihood to discover rare variants that are associated with disease susceptibility. We propose the HEALER framework to facilitate secure rare variants analysis with a small sample size. Results: We target an algorithm design that reduces the computational and storage costs to learn a homomorphic exact logistic regression model (i.e. evaluate P-values of coefficients), where the circuit depth is proportional to the logarithmic scale of data size. We evaluate the algorithm performance using rare Kawasaki Disease datasets. Availability and implementation: Download HEALER at http://research.ucsd-dbmi.org/HEALER/ Contact: shw070@ucsd.edu Supplementary information: Supplementary data are available at Bioinformatics online. PMID:26446135

  3. Adjusting for unmeasured confounding due to either of two crossed factors with a logistic regression model.

    PubMed

    Li, Li; Brumback, Babette A; Weppelmann, Thomas A; Morris, J Glenn; Ali, Afsar

    2016-08-15

    Motivated by an investigation of the effect of surface water temperature on the presence of Vibrio cholerae in water samples collected from different fixed surface water monitoring sites in Haiti in different months, we investigated methods to adjust for unmeasured confounding due to either of the two crossed factors site and month. In the process, we extended previous methods that adjust for unmeasured confounding due to one nesting factor (such as site, which nests the water samples from different months) to the case of two crossed factors. First, we developed a conditional pseudolikelihood estimator that eliminates fixed effects for the levels of each of the crossed factors from the estimating equation. Using the theory of U-Statistics for independent but non-identically distributed vectors, we show that our estimator is consistent and asymptotically normal, but that its variance depends on the nuisance parameters and thus cannot be easily estimated. Consequently, we apply our estimator in conjunction with a permutation test, and we investigate use of the pigeonhole bootstrap and the jackknife for constructing confidence intervals. We also incorporate our estimator into a diagnostic test for a logistic mixed model with crossed random effects and no unmeasured confounding. For comparison, we investigate between-within models extended to two crossed factors. These generalized linear mixed models include covariate means for each level of each factor in order to adjust for the unmeasured confounding. We conduct simulation studies, and we apply the methods to the Haitian data. Copyright © 2016 John Wiley & Sons, Ltd.

  4. Moment Adjusted Imputation for Multivariate Measurement Error Data with Applications to Logistic Regression

    PubMed Central

    Thomas, Laine; Stefanski, Leonard A.; Davidian, Marie

    2013-01-01

    In clinical studies, covariates are often measured with error due to biological fluctuations, device error and other sources. Summary statistics and regression models that are based on mismeasured data will differ from the corresponding analysis based on the “true” covariate. Statistical analysis can be adjusted for measurement error; however, various methods exhibit a trade-off between convenience and performance. Moment Adjusted Imputation (MAI) is a method for measurement error in a scalar latent variable that is easy to implement and performs well in a variety of settings. In practice, multiple covariates may be similarly influenced by biological fluctuations, inducing correlated multivariate measurement error. The extension of MAI to the setting of multivariate latent variables involves unique challenges. Alternative strategies are described, including a computationally feasible option that is shown to perform well. PMID:24072947

  5. Predictors of psychiatric disorders in liver transplantation candidates: logistic regression models.

    PubMed

    Rocca, Paola; Cocuzza, Elena; Rasetti, Roberta; Rocca, Giuseppe; Zanalda, Enrico; Bogetto, Filippo

    2003-07-01

    This study has two goals. The first goal is to assess the prevalence of psychiatric disorders in orthotopic liver transplantation (OLT) candidates by means of standardized procedures because there has been little research concerning psychiatric problems of potential OLT candidates using standardized instruments. The second goal focuses on identifying predictors of these psychiatric disorders. One hundred sixty-five elective OLT candidates were assessed by our unit. Psychiatric diagnoses were based on the Structured Clinical Interview for the Diagnostic and Statistical Manual of Mental Disorders, Fourth Edition. Patients also were assessed using the Hamilton Depression Rating Scale (HDRS) and the Spielberger Anxiety Index, State and Trait forms (STAI-X1 and STAI-X2). Severity of cirrhosis was assessed by applying Child-Pugh score criteria. Chi-squared and general linear model analysis of variance were used to test the univariate association between patient characteristics and both clinical psychiatric diagnoses and severity of psychiatric diseases. Variables with P less than .10 in univariate analyses were included in multiple regression models. Forty-three percent of patients presented at least one psychiatric diagnosis. Child-Pugh score and previous psychiatric diagnoses were independent significant predictors of depressive disorders. Severity of psychiatric symptoms measured by psychometric scales (HDRS, STAI-X1, and STAI-X2) was associated with Child-Pugh score in the multiple regression model. Our data suggest a high rate of psychiatric disorders, particularly adjustment disorders, in our sample of OLT candidates. Severity of liver disease emerges as the most important variable in predicting severity of psychiatric disorders in these patients.

  6. Dynamic Network Logistic Regression: A Logistic Choice Analysis of Inter- and Intra-Group Blog Citation Dynamics in the 2004 US Presidential Election.

    PubMed

    Almquist, Zack W; Butts, Carter T

    2013-10-01

    Methods for analysis of network dynamics have seen great progress in the past decade. This article shows how Dynamic Network Logistic Regression techniques (a special case of the Temporal Exponential Random Graph Models) can be used to implement decision theoretic models for network dynamics in a panel data context. We also provide practical heuristics for model building and assessment. We illustrate the power of these techniques by applying them to a dynamic blog network sampled during the 2004 US presidential election cycle. This is a particularly interesting case because it marks the debut of Internet-based media such as blogs and social networking web sites as institutionally recognized features of the American political landscape. Using a longitudinal sample of all Democratic National Convention/Republican National Convention-designated blog citation networks, we are able to test the influence of various strategic, institutional, and balance-theoretic mechanisms as well as exogenous factors such as seasonality and political events on the propensity of blogs to cite one another over time. Using a combination of deviance-based model selection criteria and simulation-based model adequacy tests, we identify the combination of processes that best characterizes the choice behavior of the contending blogs.

  7. Dynamic Network Logistic Regression: A Logistic Choice Analysis of Inter- and Intra-Group Blog Citation Dynamics in the 2004 US Presidential Election

    PubMed Central

    2013-01-01

    Methods for analysis of network dynamics have seen great progress in the past decade. This article shows how Dynamic Network Logistic Regression techniques (a special case of the Temporal Exponential Random Graph Models) can be used to implement decision theoretic models for network dynamics in a panel data context. We also provide practical heuristics for model building and assessment. We illustrate the power of these techniques by applying them to a dynamic blog network sampled during the 2004 US presidential election cycle. This is a particularly interesting case because it marks the debut of Internet-based media such as blogs and social networking web sites as institutionally recognized features of the American political landscape. Using a longitudinal sample of all Democratic National Convention/Republican National Convention–designated blog citation networks, we are able to test the influence of various strategic, institutional, and balance-theoretic mechanisms as well as exogenous factors such as seasonality and political events on the propensity of blogs to cite one another over time. Using a combination of deviance-based model selection criteria and simulation-based model adequacy tests, we identify the combination of processes that best characterizes the choice behavior of the contending blogs. PMID:24143060

  8. Age-adjusted Labor Force Participation Rates, 1960-2045.

    ERIC Educational Resources Information Center

    Szafran, Robert F.

    2002-01-01

    A proposed new age-adjusted measure for calculating labor force participation rate eliminates the effect of changes in the age distribution. According to the new criterion, increases in women's labor force participation from 1960-2000 would have been even greater if shifts in the age distribution had not occurred. (Contains 12 references.) (JOW)

  9. Trinculo: Bayesian and frequentist multinomial logistic regression for genome-wide association studies of multi-category phenotypes

    PubMed Central

    Jostins, Luke; McVean, Gilean

    2016-01-01

    Motivation: For many classes of disease the same genetic risk variants underlie many related phenotypes or disease subtypes. Multinomial logistic regression provides an attractive framework to analyze multi-category phenotypes, and explore the genetic relationships between these phenotype categories. We introduce Trinculo, a program that implements a wide range of multinomial analyses in a single fast package that is designed to be easy to use by users of standard genome-wide association study software. Availability and implementation: An open source C implementation, with code and binaries for Linux and Mac OSX, is available for download at http://sourceforge.net/projects/trinculo Supplementary information: Supplementary data are available at Bioinformatics online. Contact: lj4@well.ox.ac.uk PMID:26873930

  10. Exploring the performance of logistic regression model types on growth/no growth data of Listeria monocytogenes.

    PubMed

    Gysemans, K P M; Bernaerts, K; Vermeulen, A; Geeraerd, A H; Debevere, J; Devlieghere, F; Van Impe, J F

    2007-03-20

    Several model types have already been developed to describe the boundary between growth and no growth conditions. In this article two types were thoroughly studied and compared, namely (i) the ordinary (linear) logistic regression model, i.e., with a polynomial on the right-hand side of the model equation (type I) and (ii) the (nonlinear) logistic regression model derived from a square root-type kinetic model (type II). The examination was carried out on the basis of the data described in Vermeulen et al. [Vermeulen, A., Gysemans, K.P.M., Bernaerts, K., Geeraerd, A.H., Van Impe, J.F., Debevere, J., Devlieghere, F., 2006-this issue. Influence of pH, water activity and acetic acid concentration on Listeria monocytogenes at 7 degrees C: data collection for the development of a growth/no growth model. International Journal of Food Microbiology.]. These data sets consist of growth/no growth data for Listeria monocytogenes as a function of water activity (0.960-0.990), pH (5.0-6.0) and acetic acid percentage (0-0.8% (w/w)), both for a monoculture and a mixed strain culture. Numerous replicates, namely twenty, were performed at closely spaced conditions. In this way detailed information was obtained about the position of the interface and the transition zone between growth and no growth. The main questions investigated were (i) which model type performs best on the monoculture and the mixed strain data, (ii) are there differences between the growth/no growth interfaces of monocultures and mixed strain cultures, (iii) which parameter estimation approach works best for the type II models, and (iv) how sensitive is the performance of these models to the values of their nonlinear-appearing parameters. The results showed that both type I and II models performed well on the monoculture data with respect to goodness-of-fit and predictive power. The type I models were, however, more sensitive to anomalous data points. The situation was different for the mixed strain culture. In

  11. Assessment of the classification abilities of the CNS multi-parametric optimization approach by the method of logistic regression.

    PubMed

    Raevsky, O A; Polianczyk, D E; Mukhametov, A; Grigorev, V Y

    2016-08-01

    Assessment of "CNS drugs/CNS candidates" classification abilities of the multi-parametric optimization (CNS MPO) approach was performed by logistic regression. It was found that five of the six separately used physical-chemical properties (topological polar surface area, number of hydrogen-bonded donor atoms, basicity, lipophilicity of compound in neutral form and at pH = 7.4) provided accuracy of recognition below 60%. Only the descriptor of molecular weight (MW) could correctly classify two-thirds of the studied compounds. Aggregation of all six properties in the MPOscore did not improve the classification, which was worse than the classification using only MW. The results of our study demonstrate the imperfection of the CNS MPO approach; in its current form it is not very useful for computer design of new, effective CNS drugs.

  12. Analysis of vegetation distribution in Interior Alaska and sensitivity to climate change using a logistic regression approach

    USGS Publications Warehouse

    Calef, M.P.; McGuire, A.D.; Epstein, H.E.; Rupp, T.S.; Shugart, H.H.

    2005-01-01

    Aim: To understand drivers of vegetation type distribution and sensitivity to climate change. Location: Interior Alaska. Methods: A logistic regression model was developed that predicts the potential equilibrium distribution of four major vegetation types: tundra, deciduous forest, black spruce forest and white spruce forest based on elevation, aspect, slope, drainage type, fire interval, average growing season temperature and total growing season precipitation. The model was run in three consecutive steps. The hierarchical logistic regression model was used to evaluate how scenarios of changes in temperature, precipitation and fire interval may influence the distribution of the four major vegetation types found in this region. Results: At the first step, tundra was distinguished from forest, which was mostly driven by elevation, precipitation and south to north aspect. At the second step, forest was separated into deciduous and spruce forest, a distinction that was primarily driven by fire interval and elevation. At the third step, the identification of black vs. white spruce was driven mainly by fire interval and elevation. The model was verified for Interior Alaska, the region used to develop the model, where it predicted vegetation distribution among the steps with an accuracy of 60-83%. When the model was independently validated for north-west Canada, it predicted vegetation distribution among the steps with an accuracy of 53-85%. Black spruce remains the dominant vegetation type under all scenarios, potentially expanding most under warming coupled with increasing fire interval. White spruce is clearly limited by moisture once average growing season temperatures exceeded a critical limit (+2 °C). Deciduous forests expand their range the most when any two of the following scenarios are combined: decreasing fire interval, warming and increasing precipitation. Tundra can be replaced by forest under warming but expands under precipitation increase. Main

  13. Prediction of Foreign Object Debris/Damage (FOD) type for elimination in the aeronautics manufacturing environment through logistic regression model

    NASA Astrophysics Data System (ADS)

    Espino, Natalia V.

    Foreign Object Debris/Damage (FOD) is a costly and high-risk problem that aeronautics companies such as Boeing and Lockheed Martin, among others, face on their production lines every day. They spend an average of $350 thousand per year fixing FOD problems. FOD can put the lives of pilots, passengers and other crew at high risk. FOD refers to any type of foreign object, particle, debris or agent in the manufacturing environment that could contaminate or damage the product or otherwise undermine quality control standards. FOD falls into the following categories: panstock, manufacturing debris, tools/shop aids, consumables and trash. Although the aeronautics industry has put many prevention plans in place, such as housekeeping and "clean as you go" philosophies, training, and the use of RFID for tooling control, none of them has been able to completely eradicate the problem. This research presents a logistic regression model to predict the probability of each FOD type under specific circumstances such as workstation, month and aircraft/jet being built. FOD Quality Assurance Reports from the last three years were provided by an aeronautical company for this study. By predicting the type of FOD, custom reduction/elimination plans can be put in place and the problem thereby diminished. Different aircraft were analyzed, and a separate model was developed for each using the same methodology. The results presented are predictions of FOD type for each aircraft and workstation throughout the year, obtained by applying the proposed logistic regression models. This research could help the aeronautics industry to address the FOD problem correctly, to identify root causes and to establish actual reduction/elimination plans.

  14. Multinomial Logistic Regression Predicted Probability Map To Visualize The Influence Of Socio-Economic Factors On Breast Cancer Occurrence in Southern Karnataka

    NASA Astrophysics Data System (ADS)

    Madhu, B.; Ashok, N. C.; Balasubramanian, S.

    2014-11-01

    Multinomial logistic regression analysis was used to develop a statistical model that can predict the probability of breast cancer in Southern Karnataka using breast cancer occurrence data from 2007-2011. Independent socio-economic variables describing breast cancer occurrence, such as age, education, occupation, parity, type of family, health insurance coverage, residential locality and socioeconomic status, were obtained for each case. The models were developed as follows: i) The urban-rural distribution of breast cancer cases obtained from the Bharat Hospital and Institute of Oncology was visualized spatially. ii) Socio-economic risk factors describing the breast cancer occurrences were compiled for each case. These data were then analysed using multinomial logistic regression in SPSS statistical software; relations between the occurrence of breast cancer across socio-economic status and the influence of other socio-economic variables were evaluated, and multinomial logistic regression models were constructed. iii) The model that best predicted the occurrence of breast cancer was identified. This multivariate logistic regression model was entered into a geographic information system, and maps showing the predicted probability of breast cancer occurrence in Southern Karnataka were created. This study demonstrates that multinomial logistic regression is a valuable tool for developing models that predict the probability of breast cancer occurrence in Southern Karnataka.

  15. The implementation of rare events logistic regression to predict the distribution of mesophotic hard corals across the main Hawaiian Islands

    PubMed Central

    Franklin, Erik C.; Kelley, Christopher; Frazer, L. Neil; Toonen, Robert J.

    2016-01-01

    Predictive habitat suitability models are powerful tools for cost-effective, statistically robust assessment of the environmental drivers of species distributions. The aim of this study was to develop predictive habitat suitability models for two genera of scleractinian corals (Leptoseris and Montipora) found within the mesophotic zone across the main Hawaiian Islands. The mesophotic zone (30–180 m) is challenging to reach, and therefore historically understudied, because it falls between the maximum limit of SCUBA divers and the minimum typical working depth of submersible vehicles. Here, we implement a logistic regression with rare events corrections to account for the scarcity of presence observations within the dataset. These corrections reduced the coefficient error and improved overall prediction success (73.6% and 74.3%) for both original regression models. The final models included depth, rugosity, slope, mean current velocity, and wave height as the best environmental covariates for predicting the occurrence of the two genera in the mesophotic zone. Using an objectively selected theta (“presence”) threshold, the predicted presence probability values (average of 0.051 for Leptoseris and 0.040 for Montipora) were translated to spatially-explicit habitat suitability maps of the main Hawaiian Islands at 25 m grid cell resolution. Our maps are the first of their kind to use extant presence and absence data to examine the habitat preferences of these two dominant mesophotic coral genera across Hawai‘i. PMID:27441122
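
    With average predicted presence probabilities as low as those reported here, a conventional 0.5 cutoff would label almost every grid cell as absence, which is why a separately chosen threshold matters. The sketch below only illustrates the thresholding step; the grid values and the threshold are hypothetical, and the abstract does not state how theta was selected.

```python
def suitability_map(prob_grid, theta):
    """Convert predicted presence probabilities to 1 (presence) / 0 (absence)."""
    return [[1 if p >= theta else 0 for p in row] for row in prob_grid]


# Hypothetical 3x3 grid of predicted presence probabilities (not study data).
prob_grid = [
    [0.01, 0.03, 0.08],
    [0.02, 0.06, 0.12],
    [0.00, 0.04, 0.05],
]

theta = 0.05  # hypothetical threshold, NOT the value used in the study

presence = suitability_map(prob_grid, theta)
default_cutoff = suitability_map(prob_grid, 0.5)  # conventional cutoff, for contrast
```

    In this made-up grid the low threshold marks four cells as suitable, whereas the 0.5 cutoff marks none.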

  16. The implementation of rare events logistic regression to predict the distribution of mesophotic hard corals across the main Hawaiian Islands.

    PubMed

    Veazey, Lindsay M; Franklin, Erik C; Kelley, Christopher; Rooney, John; Frazer, L Neil; Toonen, Robert J

    2016-01-01

    Predictive habitat suitability models are powerful tools for cost-effective, statistically robust assessment of the environmental drivers of species distributions. The aim of this study was to develop predictive habitat suitability models for two genera of scleractinian corals (Leptoseris and Montipora) found within the mesophotic zone across the main Hawaiian Islands. The mesophotic zone (30-180 m) is challenging to reach, and therefore historically understudied, because it falls between the maximum limit of SCUBA divers and the minimum typical working depth of submersible vehicles. Here, we implement a logistic regression with rare events corrections to account for the scarcity of presence observations within the dataset. These corrections reduced the coefficient error and improved overall prediction success (73.6% and 74.3%) for both original regression models. The final models included depth, rugosity, slope, mean current velocity, and wave height as the best environmental covariates for predicting the occurrence of the two genera in the mesophotic zone. Using an objectively selected theta ("presence") threshold, the predicted presence probability values (average of 0.051 for Leptoseris and 0.040 for Montipora) were translated to spatially-explicit habitat suitability maps of the main Hawaiian Islands at 25 m grid cell resolution. Our maps are the first of their kind to use extant presence and absence data to examine the habitat preferences of these two dominant mesophotic coral genera across Hawai'i.

  17. Assessment of susceptibility to earth-flow landslide using logistic regression and multivariate adaptive regression splines: A case of the Belice River basin (western Sicily, Italy)

    NASA Astrophysics Data System (ADS)

    Conoscenti, Christian; Ciaccio, Marilena; Caraballo-Arias, Nathalie Almaru; Gómez-Gutiérrez, Álvaro; Rotigliano, Edoardo; Agnesi, Valerio

    2015-08-01

    In this paper, terrain susceptibility to earth-flow occurrence was evaluated by using geographic information systems (GIS) and two statistical methods: logistic regression (LR) and multivariate adaptive regression splines (MARS). LR has already been shown to provide reliable predictions of earth-flow occurrence, whereas MARS, as far as we know, has never been used to generate earth-flow susceptibility models. The experiment was carried out in a basin of western Sicily (Italy), which extends for 51 km² and is severely affected by earth-flows. In total, we mapped 1376 earth-flows, covering an area of 4.59 km². To explore the effect of pre-failure topography on earth-flow spatial distribution, we performed a reconstruction of topography before the landslide occurrence. This was achieved by preparing a digital terrain model (DTM) where the altitude of areas hosting landslides was interpolated from the adjacent undisturbed land surface by using the topo-to-raster algorithm. This DTM was exploited to extract 15 morphological and hydrological variables that, in addition to outcropping lithology, were employed as explanatory variables of earth-flow spatial distribution. The predictive skill of the earth-flow susceptibility models and the robustness of the procedure were tested by preparing five datasets, each including a different subset of landslides and stable areas. The accuracy of the predictive models was evaluated by drawing receiver operating characteristic (ROC) curves and by calculating the area under the ROC curve (AUC). The results demonstrate that the overall accuracy of LR and MARS earth-flow susceptibility models is from excellent to outstanding. However, AUC values of the validation datasets attest to a higher predictive power of MARS-models (AUC between 0.881 and 0.912) with respect to LR-models (AUC between 0.823 and 0.870). The adopted procedure proved to be resistant to overfitting and stable when changes of the learning and validation samples are
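
    The AUC values quoted in this record can be interpreted as the probability that a randomly chosen landslide cell receives a higher susceptibility score than a randomly chosen stable cell. The following sketch computes the AUC directly from that pairwise definition; the scores are hypothetical and the function name is an illustrative choice, not part of the study's workflow.

```python
def auc(scores_pos, scores_neg):
    """Area under the ROC curve via pairwise comparison (ties count as 0.5)."""
    wins = 0.0
    for p in scores_pos:
        for n in scores_neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(scores_pos) * len(scores_neg))


# Hypothetical model scores (not study data).
landslide_cells = [0.9, 0.8, 0.6, 0.4]   # cells that host an earth-flow
stable_cells = [0.7, 0.3, 0.2, 0.1]      # cells without an earth-flow

example_auc = auc(landslide_cells, stable_cells)  # 14 of 16 pairs ranked correctly
```

    The pairwise loop is quadratic in the number of cells, so for large rasters a rank-based computation would be preferable; the simple form is used here only because it mirrors the definition.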

  18. The role of prediction modeling in propensity score estimation: an evaluation of logistic regression, bCART, and the covariate-balancing propensity score.

    PubMed

    Wyss, Richard; Ellis, Alan R; Brookhart, M Alan; Girman, Cynthia J; Jonsson Funk, Michele; LoCasale, Robert; Stürmer, Til

    2014-09-15

    The covariate-balancing propensity score (CBPS) extends logistic regression to simultaneously optimize covariate balance and treatment prediction. Although the CBPS has been shown to perform well in certain settings, its performance has not been evaluated in settings specific to pharmacoepidemiology and large database research. In this study, we use both simulations and empirical data to compare the performance of the CBPS with logistic regression and boosted classification and regression trees. We simulated various degrees of model misspecification to evaluate the robustness of each propensity score (PS) estimation method. We then applied these methods to compare the effect of initiating glucagonlike peptide-1 agonists versus sulfonylureas on cardiovascular events and all-cause mortality in the US Medicare population in 2007-2009. In simulations, the CBPS was generally more robust in terms of balancing covariates and reducing bias compared with misspecified logistic PS models and boosted classification and regression trees. All PS estimation methods performed similarly in the empirical example. For settings common to pharmacoepidemiology, logistic regression with balance checks to assess model specification is a valid method for PS estimation, but it can require refitting multiple models until covariate balance is achieved. The CBPS is a promising method to improve the robustness of PS models.

  19. Extension of the Peters–Belson method to estimate health disparities among multiple groups using logistic regression with survey data

    PubMed Central

    Li, Y.; Graubard, B. I.; Huang, P.; Gastwirth, J. L.

    2015-01-01

    Determining the extent of a disparity, if any, between groups of people, for example, race or gender, is of interest in many fields, including public health for medical treatment and prevention of disease. An observed difference in the mean outcome between an advantaged group (AG) and disadvantaged group (DG) can be due to differences in the distribution of relevant covariates. The Peters–Belson (PB) method fits a regression model with covariates to the AG to predict, for each DG member, their outcome measure as if they had been from the AG. The difference between the mean predicted and the mean observed outcomes of DG members is the (unexplained) disparity of interest. We focus on applying the PB method to estimate the disparity based on binary/multinomial/proportional odds logistic regression models using data collected from complex surveys with more than one DG. Estimators of the unexplained disparity, an analytic variance–covariance estimator that is based on the Taylor linearization variance–covariance estimation method, as well as a Wald test for testing a joint null hypothesis of zero for unexplained disparities between two or more minority groups and a majority group, are provided. Simulation studies with data selected from simple random sampling and cluster sampling, as well as the analyses of disparity in body mass index in the National Health and Nutrition Examination Survey 1999–2004, are conducted. Empirical results indicate that the Taylor linearization variance–covariance estimation is accurate and that the proposed Wald test maintains the nominal level. PMID:25382235
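
    The core Peters–Belson calculation described in this record can be sketched for a binary outcome as follows. The sketch assumes a logistic model has already been fitted to the advantaged group; the coefficients, covariates, outcomes and optional weights are hypothetical, and the variance estimation and Wald test from the paper are not shown.

```python
import math


def predict_prob(intercept, coefs, x):
    """Predicted probability from a logistic model fitted to the advantaged group."""
    z = intercept + sum(c * v for c, v in zip(coefs, x))
    return 1.0 / (1.0 + math.exp(-z))


def pb_unexplained_disparity(ag_intercept, ag_coefs, dg_covariates, dg_outcomes, weights=None):
    """Mean predicted minus mean observed outcome among disadvantaged-group members.

    `weights` are optional survey weights (hypothetical here); equal weights by default.
    """
    if weights is None:
        weights = [1.0] * len(dg_outcomes)
    total_w = sum(weights)
    mean_pred = sum(
        w * predict_prob(ag_intercept, ag_coefs, x)
        for w, x in zip(weights, dg_covariates)
    ) / total_w
    mean_obs = sum(w * y for w, y in zip(weights, dg_outcomes)) / total_w
    return mean_pred - mean_obs


# Hypothetical AG model and DG sample (not study data).
ag_intercept, ag_coefs = -0.5, [0.03, 0.4]
dg_covariates = [[40, 1], [55, 0], [30, 1], [62, 0]]
dg_outcomes = [0, 1, 0, 0]

disparity = pb_unexplained_disparity(ag_intercept, ag_coefs, dg_covariates, dg_outcomes)
```

    The sign convention follows the wording of the abstract (predicted minus observed), so a positive value means the disadvantaged group's observed mean is lower than the advantaged-group model would predict for members with the same covariates.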

  20. Quantifying determinants of cash crop expansion and their relative effects using logistic regression modeling and variance partitioning

    NASA Astrophysics Data System (ADS)

    Xiao, Rui; Su, Shiliang; Mai, Gengchen; Zhang, Zhonghao; Yang, Chenxue

    2015-02-01

    Cash crop expansion has been a major land use change in tropical and subtropical regions worldwide. Quantifying the determinants of cash crop expansion should provide deeper spatial insights into the dynamics and ecological consequences of cash crop expansion. This paper investigated the process of cash crop expansion in the Hangzhou region (China) from 1985 to 2009 using remotely sensed data. The corresponding determinants (neighborhood, physical, and proximity) and their relative effects during three periods (1985-1994, 1994-2003, and 2003-2009) were quantified by logistic regression modeling and variance partitioning. Results showed that the total area of cash crops increased from 58,874.1 ha in 1985 to 90,375.1 ha in 2009, a net growth of 53.5%. Cash crops were more likely to grow in loam soils. Steep areas at higher elevation were less likely to experience cash crop expansion. A consistently higher probability of cash crop expansion was found in places with abundant farmland and forest cover in all three periods. In addition, distance to rivers and lakes, distance to the county center, and distance to provincial roads were decisive determinants of farmers' choice of cash crop plantation. Different categories of determinants and their combinations exerted different influences on cash crop expansion. The joint effects of neighborhood and proximity determinants were the strongest, and the unique effect of physical determinants decreased with time. Our study contributes to the understanding of the proximate drivers of cash crop expansion in subtropical regions.

  1. A revised logistic regression equation and an automated procedure for mapping the probability of a stream flowing perennially in Massachusetts

    USGS Publications Warehouse

    Bent, Gardner C.; Steeves, Peter A.

    2006-01-01

    A revised logistic regression equation and an automated procedure were developed for mapping the probability of a stream flowing perennially in Massachusetts. The equation provides city and town conservation commissions and the Massachusetts Department of Environmental Protection a method for assessing whether streams are intermittent or perennial at a specific site in Massachusetts by estimating the probability of a stream flowing perennially at that site. This information could assist the environmental agencies who administer the Commonwealth of Massachusetts Rivers Protection Act of 1996, which establishes a 200-foot-wide protected riverfront area extending from the mean annual high-water line along each side of a perennial stream, with exceptions for some urban areas. The equation was developed by relating the observed intermittent or perennial status of a stream site to selected basin characteristics of naturally flowing streams (defined as having no regulation by dams, surface-water withdrawals, ground-water withdrawals, diversion, wastewater discharge, and so forth) in Massachusetts. This revised equation differs from the equation developed in a previous U.S. Geological Survey study in that it is solely based on visual observations of the intermittent or perennial status of stream sites across Massachusetts and on the evaluation of several additional basin and land-use characteristics as potential explanatory variables in the logistic regression analysis. The revised equation estimated more accurately the intermittent or perennial status of the observed stream sites than the equation from the previous study. Stream sites used in the analysis were identified as intermittent or perennial based on visual observation during low-flow periods from late July through early September 2001. The database of intermittent and perennial streams included a total of 351 naturally flowing (no regulation) sites, of which 85 were observed to be intermittent and 266 perennial

  2. Seismic features and automatic discrimination of deep and shallow induced-microearthquakes using neural network and logistic regression

    NASA Astrophysics Data System (ADS)

    Mousavi, S. Mostafa; Horton, Stephen P.; Langston, Charles A.; Samei, Borhan

    2016-10-01

    We develop an automated strategy for discriminating deep microseismic events from shallow ones on the basis of the waveforms recorded on a limited number of surface receivers. Machine-learning techniques are employed to explore the relationship between event hypocentres and seismic features of the recorded signals in the time, frequency and time-frequency domains. We applied the technique to 440 microearthquakes (-1.7 < Mw < 1.29) induced by an underground cavern collapse in the Napoleonville Salt Dome in Bayou Corne, Louisiana. Forty different seismic attributes of whole seismograms, including degree of polarization and spectral attributes, were measured. A selected set of features was then used to train the system to discriminate between deep and shallow events based on the knowledge gained from existing patterns. The cross-validation test showed that events shallower than 250 m can be discriminated from events with hypocentral depth between 1000 and 2000 m with 88 per cent and 90.7 per cent accuracy using logistic regression and artificial neural network models, respectively. Similar results were obtained using single-station seismograms. The results show that the spectral features have the highest correlation with source depth. Spectral centroids and 2-D cross-correlations in the time-frequency domain are two new seismic features used in this study that proved to be promising measures for seismic event classification. The machine-learning techniques used here can be applied to the efficient automatic classification of low-energy signals recorded at one or more seismic stations.

  3. Financial performance monitoring of the technical efficiency of critical access hospitals: a data envelopment analysis and logistic regression modeling approach.

    PubMed

    Wilson, Asa B; Kerr, Bernard J; Bastian, Nathaniel D; Fulton, Lawrence V

    2012-01-01

    From 1980 to 1999, rural designated hospitals closed at a disproportionally high rate. In response to this emergent threat to healthcare access in rural settings, the Balanced Budget Act of 1997 made provisions for the creation of a new rural hospital--the critical access hospital (CAH). The conversion to CAH and the associated cost-based reimbursement scheme significantly slowed the closure rate of rural hospitals. This work investigates which methods can ensure the long-term viability of small hospitals. This article uses a two-step design to focus on a hypothesized relationship between technical efficiency of CAHs and a recently developed set of financial monitors for these entities. The goal is to identify the financial performance measures associated with efficiency. The first step uses data envelopment analysis (DEA) to differentiate efficient from inefficient facilities within a data set of 183 CAHs. Determining DEA efficiency is an a priori categorization of hospitals in the data set as efficient or inefficient. In the second step, DEA efficiency is the categorical dependent variable (efficient = 0, inefficient = 1) in the subsequent binary logistic regression (LR) model. A set of six financial monitors selected from the array of 20 measures were the LR independent variables. We use a binary LR to test the null hypothesis that recently developed CAH financial indicators had no predictive value for categorizing a CAH as efficient or inefficient, (i.e., there is no relationship between DEA efficiency and fiscal performance).

  4. Spatial Analysis of Severe Fever with Thrombocytopenia Syndrome Virus in China Using a Geographically Weighted Logistic Regression Model

    PubMed Central

    Wu, Liang; Deng, Fei; Xie, Zhong; Hu, Sheng; Shen, Shu; Shi, Junming; Liu, Dan

    2016-01-01

    Severe fever with thrombocytopenia syndrome (SFTS) is caused by severe fever with thrombocytopenia syndrome virus (SFTSV), which has had a serious impact on public health in parts of Asia. There is no specific antiviral drug or vaccine for SFTSV and, therefore, it is important to determine the factors that influence the occurrence of SFTSV infections. This study aimed to explore the spatial associations between SFTSV infections and several potential determinants, and to predict the high-risk areas in mainland China. The analysis was carried out at the level of provinces in mainland China. The potential explanatory variables that were investigated consisted of meteorological factors (average temperature, average monthly precipitation and average relative humidity), the average proportion of rural population and the average proportion of primary industries over three years (2010–2012). We constructed a geographically weighted logistic regression (GWLR) model in order to explore the associations between the selected variables and confirmed cases of SFTSV. The study showed that: (1) meteorological factors have a strong influence on the SFTSV cover; (2) a GWLR model is suitable for exploring SFTSV cover in mainland China; (3) our findings can be used for predicting high-risk areas and highlighting when meteorological factors pose a risk in order to aid in the implementation of public health strategies. PMID:27845737

  5. A Study of Domain Adaptation Classifiers Derived from Logistic Regression for the Task of Splice Site Prediction

    PubMed Central

    Herndon, Nic; Caragea, Doina

    2016-01-01

    Supervised classifiers are highly dependent on abundant labeled training data. Alternatives for addressing the lack of labeled data include: labeling data (but this is costly and time consuming); training classifiers with abundant data from another domain (however, the classification accuracy usually decreases as the distance between domains increases); or complementing the limited labeled data with abundant unlabeled data from the same domain and learning semi-supervised classifiers (but the unlabeled data can mislead the classifier). A better alternative is to use both the abundant labeled data from a source domain, the limited labeled data and optionally the unlabeled data from the target domain to train classifiers in a domain adaptation setting. We propose two such classifiers, based on logistic regression, and evaluate them for the task of splice site prediction – a difficult and essential step in gene prediction. Our classifiers achieved high accuracy, with highest areas under the precision-recall curve between 50.83% and 82.61%. PMID:26849871

  6. Gully erosion susceptibility assessment by means of GIS-based logistic regression: A case of Sicily (Italy)

    NASA Astrophysics Data System (ADS)

    Conoscenti, Christian; Angileri, Silvia; Cappadonia, Chiara; Rotigliano, Edoardo; Agnesi, Valerio; Märker, Michael

    2014-01-01

    This research aims at characterizing susceptibility conditions to gully erosion by means of GIS and multivariate statistical analysis. The study area is a 9.5 km² river catchment in central-northern Sicily, where agricultural activities are limited by intense erosion. By means of field surveys and interpretation of aerial images, we prepared a digital map of the spatial distribution of 260 gullies in the study area. In addition, from available thematic maps, a 5 m cell size digital elevation model and field checks, we derived 27 environmental attributes that describe the variability of lithology, land use, topography and road position. These attributes were selected for their potential influence on erosion processes, while the dependent variable was given by presence or absence of gullies within two different types of mapping units: 5 m grid cells and slope units (average size = 2.66 ha). The functional relationships between gully occurrence and the controlling factors were obtained from forward stepwise logistic regression to calculate the probability of hosting a gully for each mapping unit. In order to train and test the predictive models, three calibration and three validation subsets, of both grid cells and slope units, were randomly selected. Results of validation, based on ROC (receiver operating characteristic) curves, attest to acceptable to excellent accuracies of the models, showing better predictive skill and more stable performance of the susceptibility model based on grid cells.

  7. Measuring decision weights in recognition experiments with multiple response alternatives: comparing the correlation and multinomial-logistic-regression methods.

    PubMed

    Dai, Huanping; Micheyl, Christophe

    2012-11-01

    Psychophysical "reverse-correlation" methods allow researchers to gain insight into the perceptual representations and decision weighting strategies of individual subjects in perceptual tasks. Although these methods have gained momentum, until recently their development was limited to experiments involving only two response categories. Recently, two approaches for estimating decision weights in m-alternative experiments have been put forward. One approach extends the two-category correlation method to m > 2 alternatives; the second uses multinomial logistic regression (MLR). In this article, the relative merits of the two methods are discussed, and the issues of convergence and statistical efficiency of the methods are evaluated quantitatively using Monte Carlo simulations. The results indicate that, for a range of values of the number of trials, the estimated weighting patterns are closer to their asymptotic values for the correlation method than for the MLR method. Moreover, for the MLR method, weight estimates for different stimulus components can exhibit strong correlations, making the analysis and interpretation of measured weighting patterns less straightforward than for the correlation method. These and other advantages of the correlation method, which include computational simplicity and a close relationship to other well-established psychophysical reverse-correlation methods, make it an attractive tool to uncover decision strategies in m-alternative experiments.

  8. Predicting Student Success in a Major's Introductory Biology Course via Logistic Regression Analysis of Scientific Reasoning Ability and Mathematics Scores

    NASA Astrophysics Data System (ADS)

    Thompson, E. David; Bowling, Bethany V.; Markle, Ross E.

    2017-02-01

    Studies over the last 30 years have considered various factors related to student success in introductory biology courses. While much of the available literature suggests that the best predictors of success in a college course are prior college grade point average (GPA) and class attendance, faculty often require a valuable predictor of success in those courses wherein the majority of students are in the first semester and have no previous record of college GPA or attendance. In this study, we evaluated the efficacy of the ACT Mathematics subject exam and Lawson's Classroom Test of Scientific Reasoning in predicting success in a major's introductory biology course. A logistic regression was utilized to determine the effectiveness of a combination of scientific reasoning (SR) scores and ACT math (ACT-M) scores to predict student success. In summary, we found that the model—with both SR and ACT-M as significant predictors—could be an effective predictor of student success and thus could potentially be useful in practical decision making for the course, such as directing students to support services at an early point in the semester.

  9. Identifying Environmental and Social Factors Predisposing to Pathological Gambling Combining Standard Logistic Regression and Logic Learning Machine.

    PubMed

    Parodi, Stefano; Dosi, Corrado; Zambon, Antonella; Ferrari, Enrico; Muselli, Marco

    2017-03-02

    Identifying potential risk factors for problem gambling (PG) is of primary importance for planning preventive and therapeutic interventions. We illustrate a new approach based on the combination of standard logistic regression and an innovative method of supervised data mining (Logic Learning Machine or LLM). Data were taken from a pilot cross-sectional study to identify subjects with PG behaviour, assessed by two internationally validated scales (SOGS and Lie/Bet). Information was obtained from 251 gamblers recruited in six betting establishments. Data on socio-demographic characteristics, lifestyle and cognitive-related factors, and type, place and frequency of preferred gambling were obtained by a self-administered questionnaire. The following variables associated with PG were identified: instant gratification games, alcohol abuse, cognitive distortion, illegal behaviours and having started gambling with a relative or a friend. Furthermore, the combination of LLM and LR indicated the presence of two different types of PG, namely: (a) daily gamblers, more prone to illegal behaviour, with poor money management skills and who started gambling at an early age, and (b) non-daily gamblers, characterised by superstitious beliefs and a higher preference for immediate reward games. Finally, instant gratification games were strongly associated with the number of games usually played. Studies on gamblers habitually frequenting betting shops are rare. The finding of different types of PG among habitual gamblers deserves further analysis in larger studies. Advanced data mining algorithms, like LLM, are powerful tools and potentially useful in identifying risk factors for PG.

  10. To Set Up a Logistic Regression Prediction Model for Hepatotoxicity of Chinese Herbal Medicines Based on Traditional Chinese Medicine Theory

    PubMed Central

    Liu, Hongjie; Li, Tianhao; Zhan, Sha; Pan, Meilan; Ma, Zhiguo; Li, Chenghua

    2016-01-01

    Aims. To establish a logistic regression (LR) prediction model for hepatotoxicity of Chinese herbal medicines (HMs) based on traditional Chinese medicine (TCM) theory and to provide a statistical basis for predicting hepatotoxicity of HMs. Methods. The correlations of hepatotoxic and nonhepatotoxic Chinese HMs with four properties, five flavors, and channel tropism were analyzed with the chi-square test for two-way unordered categorical data. An LR prediction model was established and the accuracy of its predictions was evaluated. Results. Hepatotoxicity of Chinese HMs was related to the four properties (p < 0.05), with a coefficient of 0.178 (p < 0.05), and to the five flavors (p < 0.05), with a coefficient of 0.145 (p < 0.05); it was not related to channel tropism (p > 0.05). In total, 12 variables from the four properties and five flavors were available for the LR. Four variables, warm and neutral of the four properties and pungent and salty of the five flavors, were selected to establish the LR prediction model, with a cutoff value of 0.204. Conclusions. Warm and neutral of the four properties and pungent and salty of the five flavors were the variables affecting hepatotoxicity. Based on these results, the established LR prediction model had some predictive power for hepatotoxicity of Chinese HMs. PMID:27656240

  11. Comparative study of two sparse multinomial logistic regression models in decoding visual stimuli from brain activity of fMRI

    NASA Astrophysics Data System (ADS)

    Song, Sutao; Chen, Gongxiang; Zhan, Yu; Zhang, Jiacai; Yao, Li

    2014-03-01

    Recently, sparse algorithms, such as Sparse Multinomial Logistic Regression (SMLR), have been successfully applied in decoding visual information from functional magnetic resonance imaging (fMRI) data, where the contrast of visual stimuli was predicted by a classifier. The contrast classifier combined brain activities of voxels with sparse weights. For sparse algorithms, the goal is to learn a classifier whose weights are distributed as sparsely as possible by introducing some prior belief about the weights. There are two ways to introduce sparse prior constraints on the weights: Automatic Relevance Determination (ARD-SMLR) and the Laplace prior (LAP-SMLR). In this paper, we present comparison results between the ARD-SMLR and LAP-SMLR models in computational time, classification accuracy and voxel selection. Results showed that, for fMRI data, no significant difference was found in classification accuracy between these two methods when voxels in V1 were chosen as input features (1017 voxels in total). As for computation time, LAP-SMLR was superior to ARD-SMLR; fewer voxels survived for ARD-SMLR than for LAP-SMLR. Using simulation data, we confirmed that the classification performance of the two SMLR models was sensitive to the sparsity of the initial features: when the ratio of relevant features to the initial features was larger than 0.01, ARD-SMLR outperformed LAP-SMLR; otherwise, LAP-SMLR outperformed ARD-SMLR. Simulation data showed ARD-SMLR was more efficient in selecting relevant features.
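
    The link between a Laplace prior and sparse weights can be sketched in a few lines: the MAP estimate under a Laplace prior is an L1-penalised fit, and the L1 penalty sets small weights exactly to zero. The Python sketch below is an illustration under stated assumptions, not the authors' algorithm: it uses the binary rather than the multinomial model, omits the intercept, fits by plain proximal gradient descent (ISTA), and runs on made-up toy data.

```python
import math


def soft_threshold(value, threshold):
    """Proximal operator of the L1 penalty: shrink toward zero, clip at zero."""
    if value > threshold:
        return value - threshold
    if value < -threshold:
        return value + threshold
    return 0.0


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def fit_l1_logistic(rows, labels, lam, step=0.5, iterations=200):
    """Proximal gradient descent (ISTA) for binary logistic regression.

    Minimises mean negative log-likelihood + lam * sum(|w_j|), which is the
    MAP estimate under a Laplace prior on the weights (no intercept here).
    """
    n_features = len(rows[0])
    weights = [0.0] * n_features
    for _ in range(iterations):
        grads = [0.0] * n_features
        for row, label in zip(rows, labels):
            error = sigmoid(sum(w * x for w, x in zip(weights, row))) - label
            for j, x in enumerate(row):
                grads[j] += error * x / len(rows)
        # Gradient step on the likelihood, then shrinkage from the penalty.
        weights = [
            soft_threshold(w - step * g, step * lam)
            for w, g in zip(weights, grads)
        ]
    return weights


# Toy data (hypothetical): feature 0 tracks the label, feature 1 is a weak distractor.
rows = [(1.0, 0.2), (0.8, -0.1), (-1.0, 0.1), (-0.9, -0.2)]
labels = [1, 1, 0, 0]
print(fit_l1_logistic(rows, labels, lam=0.05))
print(fit_l1_logistic(rows, labels, lam=10.0))  # penalty dominates: all zeros
```

    Raising `lam` corresponds to a tighter prior and prunes more features, which is the sense in which voxels "survive" or not in a sparse classifier.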

  12. Development of synthetic velocity - depth damage curves using a Weighted Monte Carlo method and Logistic Regression analysis

    NASA Astrophysics Data System (ADS)

    Vozinaki, Anthi Eirini K.; Karatzas, George P.; Sibetheros, Ioannis A.; Varouchakis, Emmanouil A.

    2014-05-01

    Damage curves are the most significant component of the flood loss estimation models. Their development is quite complex. Two types of damage curves exist, historical and synthetic curves. Historical curves are developed from historical loss data from actual flood events. However, due to the scarcity of historical data, synthetic damage curves can be alternatively developed. Synthetic curves rely on the analysis of expected damage under certain hypothetical flooding conditions. A synthetic approach was developed and presented in this work for the development of damage curves, which are subsequently used as the basic input to a flood loss estimation model. A questionnaire-based survey took place among practicing and research agronomists, in order to generate rural loss data based on the responders' loss estimates, for several flood condition scenarios. In addition, a similar questionnaire-based survey took place among building experts, i.e. civil engineers and architects, in order to generate loss data for the urban sector. By answering the questionnaire, the experts were in essence expressing their opinion on how damage to various crop types or building types is related to a range of values of flood inundation parameters, such as floodwater depth and velocity. However, the loss data compiled from the completed questionnaires were not sufficient for the construction of workable damage curves; to overcome this problem, a Weighted Monte Carlo method was implemented, in order to generate extra synthetic datasets with statistical properties identical to those of the questionnaire-based data. The data generated by the Weighted Monte Carlo method were processed via Logistic Regression techniques in order to develop accurate logistic damage curves for the rural and the urban sectors. A Python-based code was developed, which combines the Weighted Monte Carlo method and the Logistic Regression analysis into a single code (WMCLR Python code). Each WMCLR code execution

  13. Binary Logistic Regression Analysis for Detecting Differential Item Functioning: Effectiveness of R[superscript 2] and Delta Log Odds Ratio Effect Size Measures

    ERIC Educational Resources Information Center

    Hidalgo, Mª Dolores; Gómez-Benito, Juana; Zumbo, Bruno D.

    2014-01-01

    The authors analyze the effectiveness of the R[superscript 2] and delta log odds ratio effect size measures when using logistic regression analysis to detect differential item functioning (DIF) in dichotomous items. A simulation study was carried out, and the Type I error rate and power estimates under conditions in which only statistical testing…

  14. Logistic regression and artificial neural network models for mapping of regional-scale landslide susceptibility in volcanic mountains of West Java (Indonesia)

    NASA Astrophysics Data System (ADS)

    Ngadisih, Bhandary, Netra P.; Yatabe, Ryuichi; Dahal, Ranjan K.

    2016-05-01

    West Java Province is the most landslide-prone area in Indonesia owing to extreme geomorphological conditions, climatic conditions and densely populated settlements with immense completed and ongoing development activities. A landslide susceptibility map at regional scale in this province is therefore a fundamental tool for risk management and land-use planning. Logistic regression and Artificial Neural Network (ANN) models are the most frequently used tools for landslide susceptibility assessment, mainly because they are capable of handling the nature of landslide data. The main objective of this study is to apply logistic regression and ANN models and compare their performance for landslide susceptibility mapping in the volcanic mountains of West Java Province. In addition, the models are used to identify the factors that contribute most to landslide events in the study area. The spatial database built on a GIS platform consists of a landslide inventory, four topographical parameters (slope, aspect, relief, distance to river), three geological parameters (distance to volcano crater, distance to thrust and fault, geological formation), and two anthropogenic parameters (distance to road, land use). The logistic regression model in this study revealed that slope, geological formation, distance to road and distance to volcano crater are the most influential factors of landslide events, while the ANN model revealed that distance to volcano crater, geological formation, distance to road, and land use are the most important causal factors of landslides in the study area. Moreover, an evaluation of the models showed that the ANN model has a higher accuracy than the logistic regression model.

  15. The Association Between Changes in Insulin Sensitivity and Consumption of Tobacco and Alcohol in Young Adults: Ordinal Logistic Regression Approach

    PubMed Central

    Fufaa, Gudeta; Cai, Bin

    2016-01-01

    Context Reduced insulin sensitivity is one of the traditional risk factors for chronic diseases such as type 2 diabetes. Reduced insulin sensitivity leads to insulin resistance, which in turn can lead to the development of type 2 diabetes. Few studies have examined factors such as blood pressure, tobacco and alcohol consumption that influence changes in insulin sensitivity over time especially among young adults. Purpose To examine temporal changes in insulin sensitivity in young adults (18-30 years of age at baseline) over a period of 20 years by taking into account the effects of tobacco and alcohol consumptions at baseline. In other words, the purpose of the present study is to examine if baseline tobacco and alcohol consumptions can be used in predicting lowered insulin sensitivity. Method This is a retrospective study using data collected by the Coronary Artery Risk Development in Young Adults (CARDIA) study from the National Heart, Lung, and Blood Institute. Participants were enrolled into the study in 1985 (baseline) and followed up to 2005. Insulin sensitivity, measured by the quantitative insulin sensitivity check index (QUICKI), was recorded at baseline and 20 years later, in 2005. The number of participants included in the study was 3,547. The original study included a total of 5,112 participants at baseline. Of these, 54.48% were female, and 45.52% were male; 45.31% were 18 to 24 years of age, and 54.69% were 25 to 30 years of age. Ordinal logistic regression was used to assess changes in insulin sensitivity. Changes in insulin sensitivity from baseline were calculated and grouped into three categories (more than 15%, more than 8.5% to at most 15%, and at most 8.5%), which provided the basis for employing ordinal logistic regression to assess changes in insulin sensitivity. The effects of alcohol and smoking consumption at baseline on the change in insulin sensitivity were accounted for by including these variables in the model. Results Daily alcohol

  16. Factors associated with trait anger level of juvenile offenders in Hubei province: A binary logistic regression analysis.

    PubMed

    Tang, Li-Na; Ye, Xiao-Zhou; Yan, Qiu-Ge; Chang, Hong-Juan; Ma, Yu-Qiao; Liu, De-Bin; Li, Zhi-Gen; Yu, Yi-Zhen

    2017-02-01

    The risk factors of high trait anger of juvenile offenders were explored through a questionnaire study in a youth correctional facility of Hubei province, China. A total of 1090 juvenile offenders in Hubei province were investigated by a self-compiled social-demographic questionnaire, the Childhood Trauma Questionnaire (CTQ), and the State-Trait Anger Expression Inventory-II (STAXI-II). The risk factors were analyzed by chi-square tests, correlation analysis, and binary logistic regression analysis with SPSS 19.0. A total of 1082 valid questionnaires were collected. The high trait anger group (n=316) was defined as those who scored in the upper 27th percentile of the STAXI-II trait anger scale (TAS), and the rest were defined as the low trait anger group (n=766). The risk factors associated with a high level of trait anger included: childhood emotional abuse, childhood sexual abuse, step family, frequent drug abuse, and frequent internet use (P<0.05 or P<0.01). Birth sequence, number of siblings, ranking in the family, identity of the main care-taker, the education level of the care-taker, educational style of the care-taker, family income, relationship between parents, social atmosphere of the local area, frequent drinking, and frequent smoking did not predict a high level of trait anger (P>0.05). It was suggested that traumatic experience in childhood and an unhealthy lifestyle may significantly increase the level of trait anger in adulthood. The risk factors of high trait anger and their effects should be taken into serious consideration.

  17. Classification of Urban Aerial Data Based on Pixel Labelling with Deep Convolutional Neural Networks and Logistic Regression

    NASA Astrophysics Data System (ADS)

    Yao, W.; Polewski, P.; Krzystek, P.

    2016-06-01

    The recent success of deep convolutional neural networks (CNN) on a large number of applications can be attributed to large amounts of available training data and increasing computing power. In this paper, a semantic pixel labelling scheme for urban areas using multi-resolution CNN and hand-crafted spatial-spectral features of airborne remotely sensed data is presented. Both CNN and hand-crafted features are applied to image/DSM patches to produce per-pixel class probabilities with an L1-norm regularized logistic regression classifier. The evidence theory infers a degree of belief for pixel labelling from different sources to smooth regions by handling the conflicts present in both classifiers while reducing the uncertainty. The aerial data used in this study were provided by ISPRS as benchmark datasets for 2D semantic labelling tasks in urban areas, and consist of two data sources, LiDAR and a color infrared camera. The test sites are parts of a city in Germany which is assumed to consist of typical object classes including impervious surfaces, trees, buildings, low vegetation, vehicles and clutter. The evaluation is based on the computation of pixel-based confusion matrices by random sampling. The performance of the strategy with respect to scene characteristics and method combination strategies is analyzed and discussed. The competitive classification accuracy can be explained not only by the nature of the input data sources (e.g. the above-ground height of the nDSM highlights the vertical dimension of houses, trees and even cars, and the near-infrared spectrum indicates vegetation), but can also be attributed to decision-level fusion of CNN's texture-based approach with multichannel spatial-spectral hand-crafted features based on the evidence combination theory.

  18. Developing a Referral Protocol for Community-Based Occupational Therapy Services in Taiwan: A Logistic Regression Analysis

    PubMed Central

    Chang, Ling-Hui; Tsai, Athena Yi-Jung; Huang, Wen-Ni

    2016-01-01

    Because resources for long-term care services are limited, timely and appropriate referral for rehabilitation services is critical for optimizing clients’ functions and successfully integrating them into the community. We investigated which client characteristics are most relevant in predicting Taiwan’s community-based occupational therapy (OT) service referral based on experts’ beliefs. Data were collected in face-to-face interviews using the Multidimensional Assessment Instrument (MDAI). Community-dwelling participants (n = 221) ≥ 18 years old who reported disabilities in the previous National Survey of Long-term Care Needs in Taiwan were enrolled. The standard for referral was the judgment and agreement of two experienced occupational therapists who reviewed the results of the MDAI. Logistic regressions and Generalized Additive Models were used for analysis. Two predictive models were proposed, one using basic activities of daily living (BADLs) and one using instrumental ADLs (IADLs). Dementia, psychiatric disorders, cognitive impairment, joint range-of-motion limitations, fear of falling, behavioral or emotional problems, expressive deficits (in the BADL-based model), and limitations in IADLs or BADLs were significantly correlated with the need for referral. Both models showed high area under the curve (AUC) values in receiver operating characteristic (ROC) curve analysis (AUC = 0.977 and 0.972, respectively). The probability of being referred for community OT services was calculated using the referral algorithm. The referral protocol facilitated communication between healthcare professionals to make appropriate decisions for OT referrals. The methods and findings should be useful for developing referral protocols for other long-term care services. PMID:26863544
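
    For readers unfamiliar with the AUC figures quoted above, the quantity can be computed directly from its rank interpretation: it is the probability that a randomly chosen referred client receives a higher predicted probability than a randomly chosen non-referred client. The Python sketch below uses hypothetical predicted probabilities and expert decisions, not data from the study.

```python
def auc(scores, labels):
    """Probability that a random positive outranks a random negative.

    Ties count as one half (the Mann-Whitney U formulation of ROC AUC).
    """
    positives = [s for s, y in zip(scores, labels) if y == 1]
    negatives = [s for s, y in zip(scores, labels) if y == 0]
    if not positives or not negatives:
        raise ValueError("AUC needs at least one positive and one negative")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


# Hypothetical predicted referral probabilities and expert decisions (1 = refer).
probabilities = [0.95, 0.80, 0.70, 0.40, 0.30, 0.10]
referred = [1, 1, 0, 1, 0, 0]
print(round(auc(probabilities, referred), 3))  # 8 of 9 pairs ordered correctly
```

    An AUC of 0.5 corresponds to chance-level ranking and 1.0 to perfect separation, so values near 0.97 indicate that the model almost always ranks referred clients above non-referred ones.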

  19. Personality, Driving Behavior and Mental Disorders Factors as Predictors of Road Traffic Accidents Based on Logistic Regression

    PubMed Central

    Alavi, Seyyed Salman; Mohammadi, Mohammad Reza; Souri, Hamid; Mohammadi Kalhori, Soroush; Jannatifard, Fereshteh; Sepahbodi, Ghazal

    2017-01-01

    Background: The aim of this study was to evaluate the effect of variables such as personality traits, driving behavior and mental illness on road traffic accidents among drivers with and without a history of road crashes. Methods: In this cohort study, 800 bus and truck drivers were recruited. Participants were selected among drivers who referred to Imam Sajjad Hospital (Tehran, Iran) during 2013-2015. The Manchester driving behavior questionnaire (MDBQ), big five personality test (NEO personality inventory) and semi-structured interview (schizophrenia and affective disorders scale) were used. After two years, we surveyed all accidents due to human factors that involved the recruited drivers. The data were analyzed using the SPSS software with descriptive statistics, t-tests, and multiple logistic regression analysis. P values less than 0.05 were considered statistically significant. Results: After controlling for the effective and demographic variables, the findings revealed significant differences between the two groups of drivers that were and were not involved in road accidents. In addition, it was found that depression and anxiety could increase the odds of road accidents 2.4- and 2.7-fold, respectively (P=0.04, P=0.004). Notably, neuroticism alone can increase the odds of road accidents 1.1-fold (P=0.009), but other personality factors did not have a significant effect on the equation. Conclusion: The results revealed that some mental disorders affect the incidence of road collisions. Considering the importance and sensitivity of driving behavior, it is necessary to evaluate multiple psychological factors influencing drivers before and after receiving or renewing their driver’s license. PMID:28293047
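
    Because odds ratios are easy to misread as risk ratios, a short Python sketch may help: in logistic regression the odds ratio for a predictor is exp(beta), and it multiplies the odds, not the probability. The ratios 2.4 and 2.7 are taken from the abstract; the 10% baseline accident probability is a hypothetical value chosen only for illustration.

```python
import math


def odds_ratio(beta):
    """Odds ratio implied by a logistic regression coefficient."""
    return math.exp(beta)


def apply_odds_ratio(baseline_probability, ratio):
    """Probability after multiplying the baseline odds by an odds ratio."""
    odds = baseline_probability / (1.0 - baseline_probability)
    new_odds = odds * ratio
    return new_odds / (1.0 + new_odds)


baseline = 0.10  # hypothetical accident probability without the condition
for condition, ratio in [("depression", 2.4), ("anxiety", 2.7)]:
    beta = math.log(ratio)  # coefficient that would yield this odds ratio
    print(condition, round(beta, 3), round(apply_odds_ratio(baseline, ratio), 3))
```

    Under this assumed baseline, an odds ratio of 2.4 moves the probability from 0.10 to about 0.21 rather than to 0.24, and the gap between the two readings widens as the baseline probability grows.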

  20. Developing a Referral Protocol for Community-Based Occupational Therapy Services in Taiwan: A Logistic Regression Analysis.

    PubMed

    Mao, Hui-Fen; Chang, Ling-Hui; Tsai, Athena Yi-Jung; Huang, Wen-Ni; Wang, Jye

    2016-01-01

    Because resources for long-term care services are limited, timely and appropriate referral for rehabilitation services is critical for optimizing clients' functions and successfully integrating them into the community. We investigated which client characteristics are most relevant in predicting Taiwan's community-based occupational therapy (OT) service referral based on experts' beliefs. Data were collected in face-to-face interviews using the Multidimensional Assessment Instrument (MDAI). Community-dwelling participants (n = 221) ≥ 18 years old who reported disabilities in the previous National Survey of Long-term Care Needs in Taiwan were enrolled. The standard for referral was the judgment and agreement of two experienced occupational therapists who reviewed the results of the MDAI. Logistic regressions and Generalized Additive Models were used for analysis. Two predictive models were proposed, one using basic activities of daily living (BADLs) and one using instrumental ADLs (IADLs). Dementia, psychiatric disorders, cognitive impairment, joint range-of-motion limitations, fear of falling, behavioral or emotional problems, expressive deficits (in the BADL-based model), and limitations in IADLs or BADLs were significantly correlated with the need for referral. Both models showed high area under the curve (AUC) values in receiver operating characteristic (ROC) curve analysis (AUC = 0.977 and 0.972, respectively). The probability of being referred for community OT services was calculated using the referral algorithm. The referral protocol facilitated communication between healthcare professionals to make appropriate decisions for OT referrals. The methods and findings should be useful for developing referral protocols for other long-term care services.

  1. SU-F-BRD-01: A Logistic Regression Model to Predict Objective Function Weights in Prostate Cancer IMRT

    SciTech Connect

    Boutilier, J; Chan, T; Lee, T; Craig, T; Sharpe, M

    2014-06-15

    Purpose: To develop a statistical model that predicts optimization objective function weights from patient geometry for intensity-modulated radiotherapy (IMRT) of prostate cancer. Methods: A previously developed inverse optimization method (IOM) is applied retrospectively to determine optimal weights for 51 treated patients. We use an overlap volume ratio (OVR) of bladder and rectum for different PTV expansions in order to quantify patient geometry in explanatory variables. Using the optimal weights as ground truth, we develop and train a logistic regression (LR) model to predict the rectum weight and thus the bladder weight. Post hoc, we fix the weights of the left femoral head, right femoral head, and an artificial structure that encourages conformity to the population average while normalizing the bladder and rectum weights accordingly. The population average of objective function weights is used for comparison. Results: The OVR at 0.7cm was found to be the most predictive of the rectum weights. The LR model performance is statistically significant when compared to the population average over a range of clinical metrics including bladder/rectum V53Gy, bladder/rectum V70Gy, and mean voxel dose to the bladder, rectum, CTV, and PTV. On average, the LR model predicted bladder and rectum weights that are both 63% closer to the optimal weights compared to the population average. The treatment plans resulting from the LR weights have, on average, a rectum V70Gy that is 35% closer to the clinical plan and a bladder V70Gy that is 43% closer. Similar results are seen for bladder V54Gy and rectum V54Gy. Conclusion: Statistical modelling from patient anatomy can be used to determine objective function weights in IMRT for prostate cancer. Our method allows the treatment planners to begin the personalization process from an informed starting point, which may lead to more consistent clinical plans and reduce overall planning time.

  2. Functional Logistic Regression Approach to Detecting Gene by Longitudinal Environmental Exposure Interaction in a Case-Control Study

    PubMed Central

    Wei, Peng; Tang, Hongwei; Li, Donghui

    2014-01-01

    Most complex human diseases are likely the consequence of the joint actions of genetic and environmental factors. Identification of gene-environment (GxE) interactions not only contributes to a better understanding of the disease mechanisms, but also improves disease risk prediction and targeted intervention. In contrast to the large number of genetic susceptibility loci discovered by genome-wide association studies, there have been very few successes in identifying GxE interactions which may be partly due to limited statistical power and inaccurately measured exposures. While existing statistical methods only consider interactions between genes and static environmental exposures, many environmental/lifestyle factors, such as air pollution and diet, change over time, and cannot be accurately captured at one measurement time point or by simply categorizing into static exposure categories. There is a dearth of statistical methods for detecting gene by time-varying environmental exposure interactions. Here we propose a powerful functional logistic regression (FLR) approach to model the time-varying effect of longitudinal environmental exposure and its interaction with genetic factors on disease risk. Capitalizing on the powerful functional data analysis framework, our proposed FLR model is capable of accommodating longitudinal exposures measured at irregular time points and contaminated by measurement errors, commonly encountered in observational studies. We use extensive simulations to show that the proposed method can control the Type I error and is more powerful than alternative ad hoc methods. We demonstrate the utility of this new method using data from a case-control study of pancreatic cancer to identify the windows of vulnerability of lifetime body mass index on the risk of pancreatic cancer as well as genes which may modify this association. PMID:25219575

  3. Adjusted Age-Adjusted Charlson Comorbidity Index Score as a Risk Measure of Perioperative Mortality before Cancer Surgery

    PubMed Central

    Chang, Chun-Ming; Yin, Wen-Yao; Wei, Chang-Kao; Wu, Chin-Chia; Su, Yu-Chieh; Yu, Chia-Hui; Lee, Ching-Chih

    2016-01-01

    Background Identification of patients at risk of death from cancer surgery should aid in preoperative preparation. The purpose of this study is to assess and adjust the age-adjusted Charlson comorbidity index (ACCI) to identify cancer patients with increased risk of perioperative mortality. Methods We identified 156,151 patients undergoing surgery for one of the ten common cancers between 2007 and 2011 in the Taiwan National Health Insurance Research Database. Half of the patients were randomly selected, and a multivariate logistic regression analysis was used to develop an adjusted-ACCI score for estimating the risk of 90-day mortality by variables from the original ACCI. The score was validated. The association between the score and perioperative mortality was analyzed. Results The adjusted-ACCI score yielded better discrimination of mortality after cancer surgery than the original ACCI score, with c-statistics of 0.75 versus 0.71. Age over 80 years, age 70–80 years, and renal disease had the strongest impact on mortality, with hazard ratios of 8.40, 3.63, and 3.09 (P < 0.001), respectively. The overall 90-day mortality rates in the entire cohort were 0.9%, 2.9%, 7.0%, and 13.2% in the four risk groups stratified by the adjusted-ACCI score; the adjusted hazard ratio for score 4–7, 8–11, and ≥ 12 was 2.84, 6.07, and 11.17 (P < 0.001), respectively, in 90-day mortality compared to score 0–3. Conclusions The adjusted-ACCI score helps to identify patients with a higher risk of 90-day mortality after cancer surgery. It might be particularly helpful for preoperative evaluation of patients over 80 years of age. PMID:26848761

  4. Use of age-adjusted rates of suicide in time series studies in Israel.

    PubMed

    Bridges, F Stephen; Tankersley, William B

    2009-01-01

    Durkheim's modified theory of suicide was examined to explore how consistent it was in predicting Israeli rates of suicide from 1965 to 1997 when using age-adjusted rates rather than crude ones. In this time-series study, Israeli male and female rates of suicide increased and decreased, respectively, between 1965 and 1997. Conforming to Durkheim's modified theory, the Israeli male rate of suicide was lower in years when rates of marriage and birth were higher, and higher in years when rates of divorce were higher, the opposite of the pattern for Israeli women. The corrected regression coefficients suggest that the Israeli female rate of suicide remained lower in years when the rate of divorce was higher, again the opposite of that suggested by Durkheim's modified theory. These results may indicate that divorce affects the mental health of Israeli women as suggested by their lower rate of suicide. Perhaps the "multiple roles held by Israeli females creates suicidogenic stress" and divorce provides some sense of stress relief, mentally speaking. The results were not as consistent with predictions suggested by Durkheim's modified theory of suicide as were rates from the United States for the same period, nor were they consistent with rates based on "crude" suicide data. Thus, using age-adjusted rates of suicide had an influence on the prediction of the Israeli rate of suicide during this period.
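
    The difference between crude and age-adjusted rates can be illustrated with direct standardization, a common way to age-adjust: age-specific rates are weighted by the age distribution of a fixed standard population. The abstract does not state which adjustment method the authors used, and all counts in the Python sketch below are hypothetical.

```python
def crude_rate(deaths, population):
    """Deaths per 100,000 ignoring age structure."""
    return 100000.0 * sum(deaths) / sum(population)


def age_adjusted_rate(deaths, population, standard_population):
    """Directly standardized rate per 100,000.

    Each age-specific rate is weighted by that age group's share of the
    standard population.
    """
    total_standard = sum(standard_population)
    weighted = 0.0
    for d, n, s in zip(deaths, population, standard_population):
        weighted += (d / n) * (s / total_standard)
    return 100000.0 * weighted


# Hypothetical counts for three age groups (young, middle, old).
deaths = [5, 20, 30]
population = [100000, 100000, 50000]
standard_population = [50000, 30000, 20000]

print(round(crude_rate(deaths, population), 2))
print(round(age_adjusted_rate(deaths, population, standard_population), 2))
```

    With these made-up numbers the crude rate is 22.0 and the adjusted rate 20.5 per 100,000, because the standard population gives less weight to the middle age group than the observed population does. In a time series, adjusting every year to the same standard removes shifts in age structure from the trend.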

  5. Building vulnerability to hydro-geomorphic hazards: Estimating damage probability from qualitative vulnerability assessment using logistic regression

    NASA Astrophysics Data System (ADS)

    Ettinger, Susanne; Mounaud, Loïc; Magill, Christina; Yao-Lafourcade, Anne-Françoise; Thouret, Jean-Claude; Manville, Vern; Negulescu, Caterina; Zuccaro, Giulio; De Gregorio, Daniela; Nardone, Stefano; Uchuchoque, Juan Alexis Luque; Arguedas, Anita; Macedo, Luisa; Manrique Llerena, Nélida

    2016-10-01

    bivariate analyses were applied to better characterize each vulnerability parameter. Multiple correspondence analyses revealed strong relationships between the "Distance to channel or bridges", "Structural building type", "Building footprint" and the observed damage. Logistic regression enabled quantification of the contribution of each explanatory parameter to potential damage, and determination of the significant parameters that express the damage susceptibility of a building. The model was applied 200 times on different calibration and validation data sets in order to examine performance. Results show that 90% of these tests have a success rate of more than 67%. Probabilities (at building scale) of experiencing different damage levels during a future event similar to the 8 February 2013 flash flood are the major outcomes of this study.
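
    The repeated calibration/validation procedure described above can be sketched as a resampling loop. The Python sketch below rests on several assumptions that are not in the abstract: splits are drawn at random with 70% of buildings used for calibration, a simple threshold rule on one hypothetical predictor (distance to channel) stands in for the fitted logistic regression model, and the building records are invented.

```python
import random


def fit_threshold(calibration):
    """Placeholder model: split at the midpoint between the class means."""
    damaged = [x for x, y in calibration if y == 1]
    intact = [x for x, y in calibration if y == 0]
    mean_damaged = sum(damaged) / len(damaged)
    mean_intact = sum(intact) / len(intact)
    midpoint = (mean_damaged + mean_intact) / 2.0
    damaged_is_lower = mean_damaged < midpoint
    # Predict damage on the side of the midpoint where damaged cases cluster.
    return lambda x: int((x < midpoint) == damaged_is_lower)


def repeated_split_success(records, fit, n_repeats=200,
                           calib_fraction=0.7, seed=0):
    """Success rate on the validation part of each random split."""
    rng = random.Random(seed)
    rates = []
    for _ in range(n_repeats):
        shuffled = records[:]
        rng.shuffle(shuffled)
        cut = int(len(shuffled) * calib_fraction)
        calibration, validation = shuffled[:cut], shuffled[cut:]
        predict = fit(calibration)
        hits = sum(1 for x, y in validation if predict(x) == y)
        rates.append(hits / len(validation))
    return rates


# Hypothetical records: (distance to channel in metres, damaged flag).
# Ten records per class, so every calibration set contains both classes.
damaged = [(float(d), 1) for d in range(5, 55, 5)]
intact = [(float(d), 0) for d in range(40, 140, 10)]
records = damaged + intact

rates = repeated_split_success(records, fit_threshold)
share = sum(1 for r in rates if r > 0.67) / len(rates)
print(f"share of runs with success rate above 67%: {share:.2f}")
```

    Reporting the share of runs above a success threshold, as the abstract does, summarises the whole distribution of validation results instead of a single split that may be lucky or unlucky.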

  6. Sample size matters: investigating the effect of sample size on a logistic regression susceptibility model for debris flows

    NASA Astrophysics Data System (ADS)

    Heckmann, T.; Gegg, K.; Gegg, A.; Becht, M.

    2014-02-01

    Predictive spatial modelling is an important task in natural hazard assessment and regionalisation of geomorphic processes or landforms. Logistic regression is a multivariate statistical approach frequently used in predictive modelling; it can be conducted stepwise in order to select from a number of candidate independent variables those that lead to the best model. In our case study on a debris flow susceptibility model, we investigate the sensitivity of model selection and quality to different sample sizes in light of the following problem: on the one hand, a sample has to be large enough to cover the variability of geofactors within the study area, and to yield stable and reproducible results; on the other hand, the sample must not be too large, because a large sample is likely to violate the assumption of independent observations due to spatial autocorrelation. Using stepwise model selection with 1000 random samples for a number of sample sizes between n = 50 and n = 5000, we investigate the inclusion and exclusion of geofactors and the diversity of the resulting models as a function of sample size; the multiplicity of different models is assessed using numerical indices borrowed from information theory and biodiversity research. Model diversity decreases with increasing sample size and reaches either a local minimum or a plateau; even larger sample sizes do not further reduce it, and they approach the upper limit of sample size given, in this study, by the autocorrelation range of the spatial data sets. In this way, an optimised sample size can be derived from an exploratory analysis. Model uncertainty due to sampling and model selection, and its predictive ability, are explored statistically and spatially through the example of 100 models estimated in one study area and validated in a neighbouring area: depending on the study area and on sample size, the predicted probabilities for debris flow release differed, on average, by 7 to 23 percentage points. In

  7. Sample size matters: investigating the effect of sample size on a logistic regression debris flow susceptibility model

    NASA Astrophysics Data System (ADS)

    Heckmann, T.; Gegg, K.; Gegg, A.; Becht, M.

    2013-06-01

    Predictive spatial modelling is an important task in natural hazard assessment and regionalisation of geomorphic processes or landforms. Logistic regression is a multivariate statistical approach frequently used in predictive modelling; it can be conducted stepwise in order to select from a number of candidate independent variables those that lead to the best model. In our case study on a debris flow susceptibility model, we investigate the sensitivity of model selection and quality to different sample sizes in light of the following problem: on the one hand, a sample has to be large enough to cover the variability of geofactors within the study area, and to yield stable results; on the other hand, the sample must not be too large, because a large sample is likely to violate the assumption of independent observations due to spatial autocorrelation. Using stepwise model selection with 1000 random samples for a number of sample sizes between n = 50 and n = 5000, we investigate the inclusion and exclusion of geofactors and the diversity of the resulting models as a function of sample size; the multiplicity of different models is assessed using numerical indices borrowed from information theory and biodiversity research. Model diversity decreases with increasing sample size and reaches either a local minimum or a plateau; even larger sample sizes do not further reduce it, and approach the upper limit of sample size given, in this study, by the autocorrelation range of the spatial datasets. In this way, an optimised sample size can be derived from an exploratory analysis. Model uncertainty due to sampling and model selection, and its predictive ability, are explored statistically and spatially through the example of 100 models estimated in one study area and validated in a neighbouring area: depending on the study area and on sample size, the predicted probabilities for debris flow release differed, on average, by 7 to 23 percentage points. 
In view of these results, we
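    The record above measures how many different models stepwise selection produces across repeated random samples, using indices borrowed from information theory. The abstract does not name the indices, so the following is only a minimal sketch that assumes the Shannon entropy is one of them; the function name and the variable sets are hypothetical.

```python
import math
from collections import Counter


def shannon_diversity(models):
    """Shannon entropy (natural log) of the distribution of distinct models.

    Each model is represented by the set of variables that stepwise
    selection retained; identical sets count as the same model.
    """
    counts = Counter(frozenset(m) for m in models)
    total = sum(counts.values())
    return -sum((c / total) * math.log(c / total) for c in counts.values())


# Hypothetical outcome of four stepwise runs on four random samples.
runs = [
    {"slope", "curvature"},
    {"slope", "curvature"},
    {"slope", "geology"},
    {"slope", "curvature"},
]
print(round(shannon_diversity(runs), 4))
```

    An index of zero means every sample led to the same model; larger values mean the selected models are more varied, which is the behaviour the abstract reports for small sample sizes.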

  8. Using Logistic Regression and Random Forests multivariate statistical methods for landslide spatial probability assessment in North-Est Sicily, Italy

    NASA Astrophysics Data System (ADS)

    Trigila, Alessandro; Iadanza, Carla; Esposito, Carlo; Scarascia-Mugnozza, Gabriele

    2015-04-01

    first phase of the work aimed to identify the spatial relationships between the landslide locations and the 13 related factors using the Frequency Ratio bivariate statistical method. The analysis was then carried out with a multivariate statistical approach, using the Logistic Regression technique and the Random Forests technique, which gave the best results in terms of AUC. The models were run and evaluated with different sample sizes, also taking into account the temporal variation of input variables such as areas burned by wildfire. The most significant outcomes of this work are the relevant influence of the sample size on the model results and the strong importance of some environmental factors (e.g. land use and wildfires) for the identification of the depletion zones of extremely rapid shallow landslides.

  9. Using ordinal logistic regression to evaluate the performance of laser-Doppler predictions of burn-healing time

    PubMed Central

    2009-01-01

    Background: Laser-Doppler imaging (LDI) of cutaneous blood flow is beginning to be used by burn surgeons to predict the healing time of burn wounds; predicted healing time is used to determine wound treatment as either dressings or surgery. In this paper, we do a statistical analysis of the performance of the technique. Methods: We used data from a study carried out by five burn centers: LDI was done once between days 2 to 5 post burn, and healing was assessed at both 14 days and 21 days post burn. Random-effects ordinal logistic regression and other models such as the continuation ratio model were used to model healing-time as a function of the LDI data, and of demographic and wound history variables. Statistical methods were also used to study the false-color palette, which enables the laser-Doppler imager to be used by clinicians as a decision-support tool. Results: Overall performance is that diagnoses are over 90% correct. Related questions addressed were what was the best blood flow summary statistic and whether, given the blood flow measurements, demographic and observational variables had any additional predictive power (age, sex, race, % total body surface area burned (%TBSA), site and cause of burn, day of LDI scan, burn center). It was found that mean laser-Doppler flux over a wound area was the best statistic, and that, given the same mean flux, women recover slightly more slowly than men. Further, the likely degradation in predictive performance on moving to a patient group with larger %TBSA than those in the data sample was studied, and shown to be small. Conclusion: Modeling healing time is a complex statistical problem, with random effects due to multiple burn areas per individual, and censoring caused by patients missing hospital visits and undergoing surgery. This analysis applies state-of-the-art statistical methods such as the bootstrap and permutation tests to a medical problem of topical interest. New medical findings are that age and %TBSA are

  10. Estimating allele dropout probabilities by logistic regression: Assessments using Applied Biosystems 3500xL and 3130xl Genetic Analyzers with various commercially available human identification kits.

    PubMed

    Inokuchi, Shota; Kitayama, Tetsushi; Fujii, Koji; Nakahara, Hiroaki; Nakanishi, Hiroaki; Saito, Kazuyuki; Mizuno, Natsuko; Sekiguchi, Kazumasa

    2016-03-01

    Phenomena called allele dropouts are often observed in crime stain profiles. Allele dropouts are generated because one of a pair of heterozygous alleles is underrepresented by stochastic influences and is indicated by a low peak detection threshold. Therefore, it is important that such risks are statistically evaluated. In recent years, attempts to interpret allele dropout probabilities by logistic regression using the information on peak heights have been reported. However, these previous studies are limited to the use of a human identification kit and fragment analyzer. In the present study, we calculated allele dropout probabilities by logistic regression using contemporary capillary electrophoresis instruments, 3500xL Genetic Analyzer and 3130xl Genetic Analyzer with various commercially available human identification kits such as AmpFℓSTR® Identifiler® Plus PCR Amplification Kit. Furthermore, the differences in logistic curves between peak detection thresholds using analytical threshold (AT) and values recommended by the manufacturer were compared. The standard logistic curves for calculating allele dropout probabilities from the peak height of sister alleles were characterized. The present study confirmed that ATs were lower than the values recommended by the manufacturer in human identification kits; therefore, it is possible to reduce allele dropout probabilities and obtain more information using AT as the peak detection threshold.

  11. Comparing machine learning and logistic regression methods for predicting hypertension using a combination of gene expression and next-generation sequencing data.

    PubMed

    Held, Elizabeth; Cape, Joshua; Tintle, Nathan

    2016-01-01

    Machine learning methods continue to show promise in the analysis of data from genetic association studies because of the high number of variables relative to the number of observations. However, few best practices exist for the application of these methods. We extend a recently proposed supervised machine learning approach for predicting disease risk by genotypes to be able to incorporate gene expression data and rare variants. We then apply 2 different versions of the approach (radial and linear support vector machines) to simulated data from Genetic Analysis Workshop 19 and compare performance to logistic regression. Method performance was not radically different across the 3 methods, although the linear support vector machine tended to show small gains in predictive ability relative to a radial support vector machine and logistic regression. Importantly, as the number of genes in the models was increased, even when those genes contained causal rare variants, model predictive ability showed a statistically significant decrease in performance for both the radial support vector machine and logistic regression. The linear support vector machine showed more robust performance to the inclusion of additional genes. Further work is needed to evaluate machine learning approaches on larger samples and to evaluate the relative improvement in model prediction from the incorporation of gene expression data.

  12. A local equation for differential diagnosis of β-thalassemia trait and iron deficiency anemia by logistic regression analysis in Southeast Iran.

    PubMed

    Sargolzaie, Narjes; Miri-Moghaddam, Ebrahim

    2014-01-01

    The most common differential diagnosis of β-thalassemia (β-thal) trait is iron deficiency anemia. Several red blood cell equations were introduced in different studies for differential diagnosis between β-thal trait and iron deficiency anemia. Due to genetic variations in different regions, these equations cannot be useful in all populations. The aim of this study was to determine a native equation with high accuracy for differential diagnosis of β-thal trait and iron deficiency anemia for the Sistan and Baluchestan population by logistic regression analysis. We selected 77 iron deficiency anemia and 100 β-thal trait cases. We used binary logistic regression analysis and determined the best equations for probability prediction of β-thal trait against iron deficiency anemia in our population. We compared diagnostic values and receiver operating characteristic (ROC) curves for this equation and another 10 published equations in discriminating β-thal trait and iron deficiency anemia. The binary logistic regression analysis determined the best equation for probability prediction of β-thal trait against iron deficiency anemia, with an area under the curve (AUC) of 0.998. Based on ROC curves and AUC, the Green & King, England & Frazer, and then Sirdah indices, respectively, had the highest accuracy after our equation. We suggest that to get the best equation and cut-off in each region, one needs to evaluate specific information of each region, specifically in areas where populations are homogeneous, to provide a specific formula for differentiating between β-thal trait and iron deficiency anemia.
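    The record above ranks diagnostic equations by the area under the ROC curve. As a minimal sketch of what that number measures, the AUC can be computed as the fraction of (case, control) pairs in which the case receives the higher predicted probability, with ties counted as one half. The function name and the scores below are hypothetical and are not taken from the study.

```python
def auc_pairwise(case_scores, control_scores):
    """AUC as the probability that a random case outranks a random control."""
    wins = 0.0
    for case in case_scores:
        for control in control_scores:
            if case > control:
                wins += 1.0
            elif case == control:
                wins += 0.5  # ties count as half a win
    return wins / (len(case_scores) * len(control_scores))


# Hypothetical predicted probabilities from a fitted logistic model.
cases = [0.9, 0.8, 0.4]
controls = [0.5, 0.3, 0.2]
print(auc_pairwise(cases, controls))
```

    An AUC of 1.0 means every case is scored above every control, and 0.5 corresponds to chance-level discrimination.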

  13. Comparison of dimension reduction-based logistic regression models for case-control genome-wide association study: principal components analysis vs. partial least squares

    PubMed Central

    Yi, Honggang; Wo, Hongmei; Zhao, Yang; Zhang, Ruyang; Dai, Junchen; Jin, Guangfu; Ma, Hongxia; Wu, Tangchun; Hu, Zhibin; Lin, Dongxin; Shen, Hongbing; Chen, Feng

    2015-01-01

    With recent advances in biotechnology, genome-wide association study (GWAS) has been widely used to identify genetic variants that underlie human complex diseases and traits. In case-control GWAS, the typical statistical strategy is traditional logistic regression (LR) based on single-locus analysis. However, such a single-locus analysis leads to the well-known multiplicity problem, with a risk of inflating type I error and reducing power. Dimension reduction-based techniques, such as principal component-based logistic regression (PC-LR) and partial least squares-based logistic regression (PLS-LR), have recently gained much attention in the analysis of high dimensional genomic data. However, the performance of these methods is still not clear, especially in GWAS. We conducted simulations and real data application to compare the type I error and power of PC-LR, PLS-LR and LR applicable to GWAS within a defined single nucleotide polymorphism (SNP) set region. We found that PC-LR and PLS-LR can reasonably control type I error under the null hypothesis. In contrast, LR, which is corrected by the Bonferroni method, was more conservative in all simulation settings. In particular, we found that PC-LR and PLS-LR had comparable power and they both outperformed LR, especially when the causal SNP was in high linkage disequilibrium with genotyped ones and had a small effect size in simulation. Based on SNP set analysis, we applied all three methods to analyze non-small cell lung cancer GWAS data. PMID:26243516

  14. Age-adjusted charlson comorbidity index score as predictor of prolonged postoperative ileus in patients with colorectal cancer who underwent surgical resection.

    PubMed

    Tian, Yaohua; Xu, Beibei; Yu, Guopei; Li, Yan; Liu, Hui

    2017-02-11

    Comorbidities had considerable effects on the development of postoperative ileus (POI). The primary aim of the present study was to determine the influence of the age-adjusted Charlson comorbidity index (ACCI) score on the risk of prolonged POI in patients with colorectal cancer who underwent surgical resection. Using the electronic Hospitalization Summary Reports, we identified 11,397 patients with colorectal cancer who underwent surgical resection from 2013 through 2015. Logistic regression models were applied to evaluate the effect of the ACCI score on the risk of prolonged POI. The ACCI score had a positive graded association with the risk of prolonged POI in both colon and rectal cancer (P for trend < 0.05). Among patients with rectal cancer, after adjusting for potential confounders, those with an ACCI score of 4-5 had a 108% higher risk of prolonged POI than those with an ACCI score of 0-1 (odds ratio [OR], 2.08; 95% confidence interval [CI], 1.09-3.98), and those with an ACCI score of ≥ 6 had a 130% higher risk (OR, 2.30; 95% CI, 1.08-4.89). Among patients with colon cancer, those with an ACCI score of ≥ 6 had a 47% greater risk of prolonged POI than those with an ACCI score of 0-1 (OR, 1.47; 95% CI, 1.07-2.02). These findings suggested that a higher ACCI score was an independent predictor of the development of prolonged POI.
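    The percentages quoted in the record above follow directly from the odds ratios: an odds ratio of 2.08 corresponds to odds that are 108% higher than in the reference group. The short sketch below reproduces that arithmetic for the three odds ratios given in the abstract; the function name is hypothetical. Strictly speaking, the percentages describe a change in odds, which approximates a change in risk only when the outcome is uncommon.

```python
def percent_change_in_odds(odds_ratio):
    """Percentage by which the odds differ from the reference group."""
    return (odds_ratio - 1.0) * 100.0


# Odds ratios quoted in the abstract (ACCI 4-5 and >= 6 in rectal cancer,
# ACCI >= 6 in colon cancer), each compared with an ACCI score of 0-1.
for odds_ratio in (2.08, 2.30, 1.47):
    print(odds_ratio, round(percent_change_in_odds(odds_ratio)))
```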

  15. Assessing the probability of acquisition of meticillin-resistant Staphylococcus aureus (MRSA) in a dog using a nested stochastic simulation model and logistic regression sensitivity analysis.

    PubMed

    Heller, J; Innocent, G T; Denwood, M; Reid, S W J; Kelly, L; Mellor, D J

    2011-05-01

    Meticillin-resistant Staphylococcus aureus (MRSA) is an important nosocomial and community-acquired pathogen with zoonotic potential. The relationship between MRSA in humans and companion animals is poorly understood. This study presents a quantitative exposure assessment, based on expert opinion and published data, in the form of a second order stochastic simulation model with accompanying logistic regression sensitivity analysis that aims to define the most important factors for MRSA acquisition in dogs. The simulation model was parameterised using expert opinion estimates, along with published and unpublished data. The outcome of the model was biologically plausible and found to be dominated by uncertainty over variability. The sensitivity analysis, in the form of four separate logistic regression models, found that both veterinary and non-veterinary routes of acquisition of MRSA are likely to be relevant for dogs. The effects of exposure to, and probability of, transmission of MRSA from the home environment were ranked as the most influential predictors in all sensitivity analyses, although it is unlikely that this environmental source of MRSA is independent of alternative sources of MRSA (human and/or animal). Exposure to and transmission from MRSA positive family members were also found to be influential for acquisition of MRSA in pet dogs, along with veterinary clinic attendance and, while exposure to and transmission from the veterinary clinic environment was also found to be influential, it was difficult to differentiate between the importance of independent sources of MRSA within the veterinary clinic. The implementation of logistic regression analyses directly to the input/output relationship within the simulation model presented in this paper represents the application of a variance based sensitivity analysis technique in the area of veterinary medicine and is a useful means of ranking the relative importance of input variables.

  16. Discriminating between adaptive and carcinogenic liver hypertrophy in rat studies using logistic ridge regression analysis of toxicogenomic data: The mode of action and predictive models.

    PubMed

    Liu, Shujie; Kawamoto, Taisuke; Morita, Osamu; Yoshinari, Kouichi; Honda, Hiroshi

    2017-03-01

    Chemical exposure often results in liver hypertrophy in animal tests, characterized by increased liver weight, hepatocellular hypertrophy, and/or cell proliferation. While most of these changes are considered adaptive responses, there is concern that they may be associated with carcinogenesis. In this study, we have employed a toxicogenomic approach using a logistic ridge regression model to identify genes responsible for liver hypertrophy and hypertrophic hepatocarcinogenesis and to develop a predictive model for assessing hypertrophy-inducing compounds. Logistic regression models have previously been used in the quantification of epidemiological risk factors. DNA microarray data from the Toxicogenomics Project-Genomics Assisted Toxicity Evaluation System were used to identify hypertrophy-related genes that are expressed differently in hypertrophy induced by carcinogens and non-carcinogens. Data were collected for 134 chemicals (72 non-hypertrophy-inducing chemicals, 27 hypertrophy-inducing non-carcinogenic chemicals, and 15 hypertrophy-inducing carcinogenic compounds). After applying logistic ridge regression analysis, 35 genes for liver hypertrophy (e.g., Acot1 and Abcc3) and 13 genes for hypertrophic hepatocarcinogenesis (e.g., Asns and Gpx2) were selected. The predictive models built using these genes were 94.8% and 82.7% accurate, respectively. Pathway analysis of the genes indicates that, aside from a xenobiotic metabolism-related pathway as an adaptive response for liver hypertrophy, amino acid biosynthesis and oxidative responses appear to be involved in hypertrophic hepatocarcinogenesis. Early detection and toxicogenomic characterization of liver hypertrophy using our models may be useful for predicting carcinogenesis. In addition, the identified genes provide novel insight into discrimination between adverse hypertrophy associated with carcinogenesis and adaptive hypertrophy in risk assessment.

  17. Comparing Machine Learning Classifiers and Linear/Logistic Regression to Explore the Relationship between Hand Dimensions and Demographic Characteristics

    PubMed Central

    2016-01-01

    Understanding the relationship between physiological measurements from human subjects and their demographic data is important within both the biometric and forensic domains. In this paper we explore the relationship between measurements of the human hand and a range of demographic features. We assess the ability of linear regression and machine learning classifiers to predict demographics from hand features, thereby providing evidence on both the strength of relationship and the key features underpinning this relationship. Our results show that we are able to predict sex, height, weight and foot size accurately within various data-range bin sizes, with machine learning classification algorithms out-performing linear regression in most situations. In addition, we identify the features used to provide these relationships applicable across multiple applications. PMID:27806075

  18. Comparing Machine Learning Classifiers and Linear/Logistic Regression to Explore the Relationship between Hand Dimensions and Demographic Characteristics.

    PubMed

    Miguel-Hurtado, Oscar; Guest, Richard; Stevenage, Sarah V; Neil, Greg J; Black, Sue

    2016-01-01

    Understanding the relationship between physiological measurements from human subjects and their demographic data is important within both the biometric and forensic domains. In this paper we explore the relationship between measurements of the human hand and a range of demographic features. We assess the ability of linear regression and machine learning classifiers to predict demographics from hand features, thereby providing evidence on both the strength of relationship and the key features underpinning this relationship. Our results show that we are able to predict sex, height, weight and foot size accurately within various data-range bin sizes, with machine learning classification algorithms out-performing linear regression in most situations. In addition, we identify the features used to provide these relationships applicable across multiple applications.

  19. Models of logistic regression analysis, support vector machine, and back-propagation neural network based on serum tumor markers in colorectal cancer diagnosis.

    PubMed

    Zhang, B; Liang, X L; Gao, H Y; Ye, L S; Wang, Y G

    2016-05-13

    We evaluated the application of three machine learning algorithms, including logistic regression, support vector machine and back-propagation neural network, for diagnosing congenital heart disease and colorectal cancer. By inspecting related serum tumor marker levels in colorectal cancer patients and healthy subjects, early diagnosis models for colorectal cancer were built using three machine learning algorithms to assess their corresponding diagnostic values. Except for serum alpha-fetoprotein, the levels of 11 other serum markers of patients in the colorectal cancer group were higher than those in the benign colorectal cancer group (P < 0.05). The results of logistic regression analysis indicated that individual detection of serum carcinoembryonic antigens, CA199, CA242, CA125, and CA153 and their combined detection was effective for diagnosing colorectal cancer. Combined detection had a better diagnostic effect, with a sensitivity of 94.2% and specificity of 97.7%. Using serum carcinoembryonic antigens, CA199, CA242, CA125, and CA153, support vector machine and back-propagation neural network diagnosis models were built with diagnostic accuracies of 82 and 75%, sensitivities of 85 and 80%, and specificities of 80 and 70%, respectively. Colorectal cancer diagnosis models based on the three machine learning algorithms showed high diagnostic value and can help obtain evidence for the early diagnosis of colorectal cancer.
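    The record above compares models by sensitivity, specificity and accuracy. As a minimal sketch of how these three figures relate to a confusion matrix, the function below computes them from counts of true and false positives and negatives. The function name and the counts are hypothetical; the abstract does not report the underlying tables.

```python
def diagnostic_metrics(tp, fn, tn, fp):
    """Return (sensitivity, specificity, accuracy) from confusion-matrix counts."""
    sensitivity = tp / (tp + fn)  # share of patients correctly flagged
    specificity = tn / (tn + fp)  # share of healthy subjects correctly cleared
    accuracy = (tp + tn) / (tp + fn + tn + fp)
    return sensitivity, specificity, accuracy


# Hypothetical counts: 100 patients and 100 healthy subjects.
sens, spec, acc = diagnostic_metrics(tp=85, fn=15, tn=80, fp=20)
print(sens, spec, acc)
```

    Accuracy depends on the mix of patients and healthy subjects in the sample, whereas sensitivity and specificity do not, which is one reason abstracts usually report all three.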

  20. Multi-label ℓ2-regularized logistic regression for predicting activation/inhibition relationships in human protein-protein interaction networks

    PubMed Central

    Mei, Suyu; Zhang, Kun

    2016-01-01

    Protein-protein interaction (PPI) networks are naturally viewed as infrastructure to infer signalling pathways. The descriptors of signal events between two interacting proteins such as upstream/downstream signal flow, activation/inhibition relationship and protein modification are indispensable for inferring signalling pathways from PPI networks. However, such descriptors are not available in most cases as most PPI networks are seldom semantically annotated. In this work, we extend ℓ2-regularized logistic regression to the scenario of multi-label learning for predicting the activation/inhibition relationships in human PPI networks. The phenomenon that both activation and inhibition relationships exist between two interacting proteins is computationally modelled by multi-label learning framework. The problem of GO (gene ontology) sparsity is tackled by introducing the homolog knowledge as independent homolog instances. ℓ2-regularized logistic regression is accordingly adopted here to penalize the homolog noise and to reduce the computational complexity of the double-sized training data. Computational results show that the proposed method achieves satisfactory multi-label learning performance and outperforms the existing phenotype correlation method on the experimental data of Drosophila melanogaster. Several predictions have been validated against recent literature. The predicted activation/inhibition relationships in human PPI networks are provided in the supplementary file for further biomedical research. PMID:27819359

  1. A case study using support vector machines, neural networks and logistic regression in a GIS to identify wells contaminated with nitrate-N

    NASA Astrophysics Data System (ADS)

    Dixon, Barnali

    2009-09-01

    Accurate and inexpensive identification of potentially contaminated wells is critical for water resources protection and management. The objectives of this study are to 1) assess the suitability of approximation tools such as neural networks (NN) and support vector machines (SVM) integrated in a geographic information system (GIS) for identifying contaminated wells and 2) use logistic regression and feature selection methods to identify significant variables for transporting contaminants in and through the soil profile to the groundwater. Fourteen GIS derived soil hydrogeologic and landuse parameters were used as initial inputs in this study. Well water quality data (nitrate-N) from 6,917 wells provided by Florida Department of Environmental Protection (USA) were used as an output target class. The use of the logistic regression and feature selection methods reduced the number of input variables to nine. Receiver operating characteristics (ROC) curves were used for evaluation of these approximation tools. Results showed superior performance with the NN as compared to SVM especially on training data while testing results were comparable. Feature selection did not improve accuracy; however, it helped increase the sensitivity or true positive rate (TPR). Thus, a higher TPR was obtainable with fewer variables.

  2. Antimicrobial activity of aroma compounds against Saccharomyces cerevisiae and improvement of microbiological stability of soft drinks as assessed by logistic regression.

    PubMed

    Belletti, Nicoletta; Kamdem, Sylvain Sado; Patrignani, Francesca; Lanciotti, Rosalba; Covelli, Alessandro; Gardini, Fausto

    2007-09-01

    The combined effects of a mild heat treatment (55 °C) and the presence of three aroma compounds [citron essential oil, citral, and (E)-2-hexenal] on the spoilage of noncarbonated beverages inoculated with different amounts of a Saccharomyces cerevisiae strain were evaluated. The results, expressed as growth/no growth, were elaborated using a logistic regression in order to assess the probability of beverage spoilage as a function of thermal treatment length, concentration of flavoring agents, and yeast inoculum. The logit models obtained for the three substances were extremely precise. The thermal treatment alone, even if prolonged for 20 min, was not able to prevent yeast growth. However, the presence of increasing concentrations of aroma compounds improved the stability of the products. The inhibiting effect of the compounds was enhanced by a prolonged thermal treatment. In fact, it influenced the vapor pressure of the molecules, which can easily interact within microbial membranes when they are in gaseous form. (E)-2-Hexenal showed a threshold level, related to initial inoculum and thermal treatment length, over which yeast growth was rapidly inhibited. Concentrations over 100 ppm of citral and thermal treatment longer than 16 min allowed a 90% probability of stability for bottles inoculated with 10^5 CFU/bottle. Citron gave the most interesting responses: beverages with 500 ppm of essential oil needed only 3 min of treatment to prevent yeast growth. In this framework, the logistic regression proved to be an important tool to study alternative hurdle strategies for the stabilization of noncarbonated beverages.

  3. [Multiple imputation and complete case analysis in logistic regression models: a practical assessment of the impact of incomplete covariate data].

    PubMed

    Camargos, Vitor Passos; César, Cibele Comini; Caiaffa, Waleska Teixeira; Xavier, Cesar Coelho; Proietti, Fernando Augusto

    2011-12-01

    Researchers in the health field often deal with the problem of incomplete databases. Complete Case Analysis (CCA), which restricts the analysis to subjects with complete data, reduces the sample size and may result in biased estimates. Based on statistical grounds, Multiple Imputation (MI) uses all collected data and is recommended as an alternative to CCA. Data from the study Saúde em Beagá, attended by 4,048 adults from two of nine health districts in the city of Belo Horizonte, Minas Gerais State, Brazil, in 2008-2009, were used to evaluate CCA and different MI approaches in the context of logistic models with incomplete covariate data. Peculiarities in some variables in this study allowed analyzing a situation in which the missing covariate data are recovered and thus the results before and after recovery are compared. Based on the analysis, even the more simplistic MI approach performed better than CCA, since it was closer to the post-recovery results.

  4. The comparison of landslide ratio-based and general logistic regression landslide susceptibility models in the Chishan watershed after 2009 Typhoon Morakot

    NASA Astrophysics Data System (ADS)

    WU, Chunhung

    2015-04-01

    The research built the original logistic regression landslide susceptibility model (abbreviated as or-LRLSM) and the landslide ratio-based logistic regression landslide susceptibility model (abbreviated as lr-LRLSM), compared their performance, and explained the error sources of the two models. The research assumes that the performance of the logistic regression model can be better if the distribution of the landslide ratio and the weighted value of each variable are similar. Landslide ratio is the ratio of landslide area to total area in a specific area and a useful index to evaluate the seriousness of landslide disasters in Taiwan. The research adopted the landslide inventory induced by 2009 Typhoon Morakot in the Chishan watershed, which was the most serious disaster event of the last decade in Taiwan. The research adopted the 20 m grid as the basic unit in building the LRLSM, and six variables, including elevation, slope, aspect, geological formation, accumulated rainfall, and bank erosion, were included in the two models. In building the or-LRLSM, the six variables were divided into continuous variables (elevation, slope, and accumulated rainfall) and categorical variables (aspect, geological formation, and bank erosion), while in building the lr-LRLSM all variables were classified based on landslide ratio and treated as categorical. Because the number of basic units in the Chishan watershed was too large to process with commercial software, the research used random sampling instead of all basic units. The research adopted equal proportions of landslide and non-landslide units in the logistic regression analysis. The research drew 10 random samples and selected the group with the best Cox & Snell R2 and Nagelkerke R2 values as the database for the following analysis. 
Based on the best result from 10 random sampling groups, the or-LRLSM (lr-LRLSM) is significant at the 1% level with Cox & Snell R2 = 0.190 (0.196) and Nagelkerke R2
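    The record above selects samples by their Cox & Snell and Nagelkerke R2 values. As a minimal sketch, assuming the standard definitions of these two pseudo-R2 measures, both can be computed from the log-likelihoods of the intercept-only and fitted models. The sample size and the fitted log-likelihood below are hypothetical; only the balanced design (equal numbers of landslide and non-landslide units) is taken from the abstract.

```python
import math


def cox_snell_r2(ll_null, ll_model, n):
    """Cox & Snell pseudo-R2 from null and fitted log-likelihoods."""
    return 1.0 - math.exp(2.0 * (ll_null - ll_model) / n)


def nagelkerke_r2(ll_null, ll_model, n):
    """Cox & Snell R2 rescaled by its maximum attainable value."""
    max_r2 = 1.0 - math.exp(2.0 * ll_null / n)
    return cox_snell_r2(ll_null, ll_model, n) / max_r2


n = 1000                      # hypothetical sample size
ll_null = n * math.log(0.5)   # intercept-only model on a balanced sample
ll_model = -600.0             # hypothetical fitted log-likelihood

print(round(cox_snell_r2(ll_null, ll_model, n), 3))
print(round(nagelkerke_r2(ll_null, ll_model, n), 3))
```

    With a balanced sample the Cox & Snell measure cannot exceed 0.75, which is why the Nagelkerke rescaling always reports a somewhat larger value for the same model.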

  5. Driver injury severity outcome analysis in rural interstate highway crashes: a two-level Bayesian logistic regression interpretation.

    PubMed

    Chen, Cong; Zhang, Guohui; Liu, Xiaoyue Cathy; Ci, Yusheng; Huang, Helai; Ma, Jianming; Chen, Yanyan; Guan, Hongzhi

    2016-12-01

    There is a high potential of severe injury outcomes in traffic crashes on rural interstate highways due to the significant amount of high speed traffic on these corridors. Hierarchical Bayesian models are capable of incorporating between-crash variance and within-crash correlations into traffic crash data analysis and are increasingly utilized in traffic crash severity analysis. This paper applies a hierarchical Bayesian logistic model to examine the significant factors at crash and vehicle/driver levels and their heterogeneous impacts on driver injury severity in rural interstate highway crashes. Analysis results indicate that the majority of the total variance is induced by the between-crash variance, showing the appropriateness of the utilized hierarchical modeling approach. Three crash-level variables and six vehicle/driver-level variables are found significant in predicting driver injury severities: road curve, maximum vehicle damage in a crash, number of vehicles in a crash, wet road surface, vehicle type, driver age, driver gender, driver seatbelt use and driver alcohol or drug involvement. Among these variables, road curve, functional and disabled vehicle damage in crash, single-vehicle crashes, female drivers, senior drivers, motorcycles and driver alcohol or drug involvement tend to increase the odds of drivers being incapably injured or killed in rural interstate crashes, while wet road surface, male drivers and driver seatbelt use are more likely to decrease the probability of severe driver injuries. The developed methodology and estimation results provide insightful understanding of the internal mechanism of rural interstate crashes and beneficial references for developing effective countermeasures for rural interstate crash prevention.

  6. lordif: An R Package for Detecting Differential Item Functioning Using Iterative Hybrid Ordinal Logistic Regression/Item Response Theory and Monte Carlo Simulations

    PubMed Central

    Choi, Seung W.; Gibbons, Laura E.; Crane, Paul K.

    2011-01-01

    Logistic regression provides a flexible framework for detecting various types of differential item functioning (DIF). Previous efforts extended the framework by using item response theory (IRT) based trait scores, and by employing an iterative process using group-specific item parameters to account for DIF in the trait scores, analogous to purification approaches used in other DIF detection frameworks. The current investigation advances the technique by developing a computational platform integrating both statistical and IRT procedures into a single program. Furthermore, a Monte Carlo simulation approach was incorporated to derive empirical criteria for various DIF statistics and effect size measures. For purposes of illustration, the procedure was applied to data from a questionnaire of anxiety symptoms for detecting DIF associated with age from the Patient-Reported Outcomes Measurement Information System. PMID:21572908

  7. The severity of Minamata disease declined in 25 years: temporal profile of the neurological findings analyzed by multiple logistic regression model.

    PubMed

    Uchino, Makoto; Hirano, Teruyuki; Satoh, Hiroshi; Arimura, Kimiyoshi; Nakagawa, Masanori; Wakamiya, Jyunji

    2005-01-01

    Minamata disease (MD) was caused by ingestion of seafood from the methylmercury-contaminated areas. Although 50 years have passed since the discovery of MD, there have been only a few studies on the temporal profile of neurological findings in certified MD patients. Thus, we evaluated changes in neurological symptoms and signs of MD using discriminants by multiple logistic regression analysis. The severity of predictive index declined in 25 years in most of the patients. Only a few patients showed aggravation of neurological findings, which was due to complications such as spino-cerebellar degeneration. Patients with chronic MD aged over 45 years had several concomitant diseases so that their clinical pictures were complicated. It was difficult to differentiate chronic MD using statistically established discriminants based on sensory disturbance alone. In conclusion, the severity of MD declined in 25 years along with the modification by age-related concomitant disorders.

  8. Spatio-temporal analyses of cropland degradation in the irrigated lowlands of Uzbekistan using remote-sensing and logistic regression modeling.

    PubMed

    Dubovyk, Olena; Menz, Gunter; Conrad, Christopher; Kan, Elena; Machwitz, Miriam; Khamzina, Asia

    2013-06-01

    Advancing land degradation in the irrigated areas of Central Asia hinders sustainable development of this predominantly agricultural region. To support decisions on mitigating cropland degradation, this study combines linear trend analysis and spatial logistic regression modeling to expose a land degradation trend in the Khorezm region, Uzbekistan, and to analyze the causes. Time series of the 250-m MODIS NDVI, summed over the growing seasons of 2000-2010, were used to derive areas with an apparent negative vegetation trend; this was interpreted as an indicator of land degradation. About one third (161,000 ha) of the region's area experienced negative trends of different magnitude. The vegetation decline was particularly evident on the low-fertility lands bordering on the natural sandy desert, suggesting that these areas should be prioritized in mitigation planning. The results of logistic modeling indicate that the spatial pattern of the observed trend is mainly associated with the level of the groundwater table (odds = 330 %), land-use intensity (odds = 103 %), low soil quality (odds = 49 %), slope (odds = 29 %), and salinity of the groundwater (odds = 26 %). Areas, threatened by land degradation, were mapped by fitting the estimated model parameters to available data. The elaborated approach, combining remote-sensing and GIS, can form the basis for developing a common tool for monitoring land degradation trends in irrigated croplands of Central Asia.

  9. Novel Logistic Regression Model of Chest CT Attenuation Coefficient Distributions for the Automated Detection of Abnormal (Emphysema or ILD) versus Normal Lung

    PubMed Central

    Chan, Kung-Sik; Jiao, Feiran; Mikulski, Marek A.; Gerke, Alicia; Guo, Junfeng; Newell, John D; Hoffman, Eric A.; Thompson, Brad; Lee, Chang Hyun; Fuortes, Laurence J.

    2015-01-01

    Rationale and Objectives We evaluated the role of an automated quantitative computed tomography (CT) scan interpretation algorithm in detecting Interstitial Lung Disease (ILD) and/or emphysema in a sample of elderly subjects with mild lung disease. We hypothesized that the quantification and distributions of CT attenuation values on lung CT, over a subset of Hounsfield Units (HU) range [−1000 HU, 0 HU], can differentiate early or mild disease from normal lung. Materials and Methods We compared results of quantitative spiral rapid end-exhalation (functional residual capacity; FRC) and end-inhalation (total lung capacity; TLC) CT scan analyses in 52 subjects with radiographic evidence of mild fibrotic lung disease to 17 normal subjects. Several CT value distributions were explored, including (i) that from the peripheral lung taken at TLC (with peels at 15 or 65 mm), (ii) the ratio of (i) to that from the core of lung, and (iii) the ratio of (ii) to its FRC counterpart. We developed a fused-lasso logistic regression model that can automatically identify sub-intervals of [−1000 HU, 0 HU] over which a CT value distribution provides optimal discrimination between abnormal and normal scans. Results The fused-lasso logistic regression model based on (ii) with 15 mm peel identified the relative frequency of CT values over [−1000, −900] and that over [−450, −200] HU as a means of discriminating abnormal versus normal, resulting in a zero out-of-sample false positive rate and a 15% false negative rate that was lowered to 12% by pooling information. Conclusions We demonstrated the potential usefulness of this novel quantitative imaging analysis method in discriminating ILD and/or emphysema from normal lungs. PMID:26776294

  10. Predicting Grade 3 Acute Diarrhea During Radiation Therapy for Rectal Cancer Using a Cutoff-Dose Logistic Regression Normal Tissue Complication Probability Model

    SciTech Connect

    Robertson, John M.; Soehn, Matthias; Yan Di

    2010-05-01

    Purpose: Understanding the dose-volume relationship of small bowel irradiation and severe acute diarrhea may help reduce the incidence of this side effect during adjuvant treatment for rectal cancer. Methods and Materials: Consecutive patients treated curatively for rectal cancer were reviewed, and the maximum grade of acute diarrhea was determined. The small bowel was outlined on the treatment planning CT scan, and a dose-volume histogram was calculated for the initial pelvic treatment (45 Gy). Logistic regression models were fitted for varying cutoff-dose levels from 5 to 45 Gy in 5-Gy increments. The model with the highest LogLikelihood was used to develop a cutoff-dose normal tissue complication probability (NTCP) model. Results: There were a total of 152 patients (48% preoperative, 47% postoperative, 5% other), predominantly treated prone (95%) with a three-field technique (94%) and a protracted venous infusion of 5-fluorouracil (78%). Acute Grade 3 diarrhea occurred in 21%. The largest LogLikelihood was found for the cutoff-dose logistic regression model with 15 Gy as the cutoff-dose, although the models for 20 Gy and 25 Gy had similar significance. According to this model, highly significant correlations (p <0.001) between small bowel volumes receiving at least 15 Gy and toxicity exist in the considered patient population. Similar findings applied to both the preoperatively (p = 0.001) and postoperatively irradiated groups (p = 0.001). Conclusion: The incidence of Grade 3 diarrhea was significantly correlated with the volume of small bowel receiving at least 15 Gy using a cutoff-dose NTCP model.
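
    The cutoff-dose selection described in the Methods above can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the small bowel volumes and toxicity outcomes are made-up numbers, only two cutoff doses are shown, and `fit_logistic_1d` and `best_cutoff` are hypothetical helper names. One single-predictor logistic model is fitted per cutoff dose by Newton's method, and the cutoff with the largest log-likelihood is kept.

```python
import math

def _sigmoid(t):
    # Numerically stable logistic function.
    if t >= 0:
        return 1.0 / (1.0 + math.exp(-t))
    e = math.exp(t)
    return e / (1.0 + e)

def _loglik(b0, b1, x, y):
    # Bernoulli log-likelihood: sum of y*eta - log(1 + exp(eta)).
    total = 0.0
    for xi, yi in zip(x, y):
        eta = b0 + b1 * xi
        softplus = max(eta, 0.0) + math.log1p(math.exp(-abs(eta)))
        total += yi * eta - softplus
    return total

def fit_logistic_1d(x, y, iters=50, tol=1e-10):
    """Fit P(y=1) = sigmoid(b0 + b1*x) by Newton's method with step halving."""
    b0 = b1 = 0.0
    ll = _loglik(b0, b1, x, y)
    for _ in range(iters):
        g0 = g1 = h00 = h01 = h11 = 0.0
        for xi, yi in zip(x, y):
            p = _sigmoid(b0 + b1 * xi)
            w = p * (1.0 - p)
            g0 += yi - p
            g1 += (yi - p) * xi
            h00 += w
            h01 += w * xi
            h11 += w * xi * xi
        det = h00 * h11 - h01 * h01
        if det <= 1e-12:
            break
        # Newton direction: inverse of the 2x2 information matrix times the score.
        d0 = (h11 * g0 - h01 * g1) / det
        d1 = (h00 * g1 - h01 * g0) / det
        step = 1.0
        while step > 1e-8:
            new_ll = _loglik(b0 + step * d0, b1 + step * d1, x, y)
            if new_ll >= ll:
                break
            step /= 2.0
        else:
            break  # no improving step found
        b0 += step * d0
        b1 += step * d1
        improved = new_ll - ll
        ll = new_ll
        if improved < tol:
            break
    return b0, b1, ll

def best_cutoff(volumes_by_cutoff, outcomes):
    """Return the cutoff dose whose model has the largest log-likelihood."""
    fits = {d: fit_logistic_1d(v, outcomes) for d, v in volumes_by_cutoff.items()}
    best = max(fits, key=lambda d: fits[d][2])
    return best, fits

# Hypothetical data: 1 = Grade 3 diarrhea, volumes in cc receiving >= cutoff dose.
outcomes = [0, 0, 0, 1, 0, 1, 1, 1]
volumes_by_cutoff = {
    15: [10, 20, 30, 40, 50, 60, 70, 80],
    30: [20, 40, 60, 30, 50, 30, 50, 40],
}
best, fits = best_cutoff(volumes_by_cutoff, outcomes)
print("best cutoff (Gy):", best)
```

    With real data the loop would cover every cutoff from 5 to 45 Gy in 5-Gy increments; in practice a statistics package would normally replace the hand-written fitting routine.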

  11. Using occupancy modeling and logistic regression to assess the distribution of shrimp species in lowland streams, Costa Rica: Does regional groundwater create favorable habitat?

    USGS Publications Warehouse

    Snyder, Marcia; Freeman, Mary C.; Purucker, S. Thomas; Pringle, Catherine M.

    2016-01-01

    Freshwater shrimps are an important biotic component of tropical ecosystems. However, they can have a low probability of detection when abundances are low. We sampled 3 of the most common freshwater shrimp species, Macrobrachium olfersii, Macrobrachium carcinus, and Macrobrachium heterochirus, and used occupancy modeling and logistic regression models to improve our limited knowledge of distribution of these cryptic species by investigating both local- and landscape-scale effects at La Selva Biological Station in Costa Rica. Local-scale factors included substrate type and stream size, and landscape-scale factors included presence or absence of regional groundwater inputs. Capture rates for 2 of the sampled species (M. olfersii and M. carcinus) were sufficient to compare the fit of occupancy models. Occupancy models did not converge for M. heterochirus, but M. heterochirus had high enough occupancy rates that logistic regression could be used to model the relationship between occupancy rates and predictors. The best-supported models for M. olfersii and M. carcinus included conductivity, discharge, and substrate parameters. Stream size was positively correlated with occupancy rates of all 3 species. High stream conductivity, which reflects the quantity of regional groundwater input into the stream, was positively correlated with M. olfersii occupancy rates. Boulder substrates increased occupancy rate of M. carcinus and decreased the detection probability of M. olfersii. Our models suggest that shrimp distribution is driven by factors that function at local (substrate and discharge) and landscape (conductivity) scales.

  12. Binary Logistic Regression Versus Boosted Regression Trees in Assessing Landslide Susceptibility for Multiple-Occurring Regional Landslide Events: Application to the 2009 Storm Event in Messina (Sicily, southern Italy).

    NASA Astrophysics Data System (ADS)

    Lombardo, L.; Cama, M.; Maerker, M.; Parisi, L.; Rotigliano, E.

    2014-12-01

    This study aims at comparing the performances of Binary Logistic Regression (BLR) and Boosted Regression Trees (BRT) methods in assessing landslide susceptibility for multiple-occurrence regional landslide events within the Mediterranean region. A test area was selected in the north-eastern sector of Sicily (southern Italy), corresponding to the catchments of the Briga and the Giampilieri streams both stretching for few kilometres from the Peloritan ridge (eastern Sicily, Italy) to the Ionian sea. This area was struck on the 1st October 2009 by an extreme climatic event resulting in thousands of rapid shallow landslides, mainly of debris flows and debris avalanches types involving the weathered layer of a low to high grade metamorphic bedrock. Exploiting the same set of predictors and the 2009 landslide archive, BLR- and BRT-based susceptibility models were obtained for the two catchments separately, adopting a random partition (RP) technique for validation; besides, the models trained in one of the two catchments (Briga) were tested in predicting the landslide distribution in the other (Giampilieri), adopting a spatial partition (SP) based validation procedure. All the validation procedures were based on multi-folds tests so to evaluate and compare the reliability of the fitting, the prediction skill, the coherence in the predictor selection and the precision of the susceptibility estimates. All the obtained models for the two methods produced very high predictive performances, with a general congruence between BLR and BRT in the predictor importance. In particular, the research highlighted that BRT-models reached a higher prediction performance with respect to BLR-models, for RP based modelling, whilst for the SP-based models the difference in predictive skills between the two methods dropped drastically, converging to an analogous excellent performance. 
However, when looking at the precision of the probability estimates, BLR demonstrated to produce more robust

  13. The impact of meteorology on the occurrence of waterborne outbreaks of vero cytotoxin-producing Escherichia coli (VTEC): a logistic regression approach.

    PubMed

    O'Dwyer, Jean; Morris Downes, Margaret; Adley, Catherine C

    2016-02-01

    This study analyses the relationship between meteorological phenomena and outbreaks of waterborne-transmitted vero cytotoxin-producing Escherichia coli (VTEC) in the Republic of Ireland over an 8-year period (2005-2012). Data pertaining to the notification of waterborne VTEC outbreaks were extracted from the Computerised Infectious Disease Reporting system, which is administered through the national Health Protection Surveillance Centre as part of the Health Service Executive. Rainfall and temperature data were obtained from the national meteorological office and categorised as cumulative rainfall, heavy rainfall events in the previous 7 days, and mean temperature. Regression analysis was performed using logistic regression (LR) analysis. The LR model was significant (p < 0.001), with all independent variables: cumulative rainfall, heavy rainfall and mean temperature making a statistically significant contribution to the model. The study has found that rainfall, particularly heavy rainfall in the preceding 7 days of an outbreak, is a strong statistical indicator of a waterborne outbreak and that temperature also impacts waterborne VTEC outbreak occurrence.

  14. Age adjustment in ecological studies: using a study on arsenic ingestion and bladder cancer as an example

    PubMed Central

    2011-01-01

    Background Despite its limitations, ecological study design is widely applied in epidemiology. In most cases, adjustment for age is necessary, but different methods may lead to different conclusions. To compare three methods of age adjustment, a study on the associations between arsenic in drinking water and incidence of bladder cancer in 243 townships in Taiwan was used as an example. Methods A total of 3068 cases of bladder cancer, including 2276 men and 792 women, were identified during a ten-year study period in the study townships. Three methods were applied to analyze the same data set on the ten-year study period. The first (Direct Method) applied direct standardization to obtain standardized incidence rate and then used it as the dependent variable in the regression analysis. The second (Indirect Method) applied indirect standardization to obtain standardized incidence ratio and then used it as the dependent variable in the regression analysis instead. The third (Variable Method) used proportions of residents in different age groups as a part of the independent variables in the multiple regression models. Results All three methods showed a statistically significant positive association between arsenic exposure above 0.64 mg/L and incidence of bladder cancer in men and women, but different results were observed for the other exposure categories. In addition, the risk estimates obtained by different methods for the same exposure category were all different. Conclusions Using an empirical example, the current study confirmed the argument made by other researchers previously that whereas the three different methods of age adjustment may lead to different conclusions, only the third approach can obtain unbiased estimates of the risks. The third method can also generate estimates of the risk associated with each age group, but the other two are unable to evaluate the effects of age directly. PMID:22014275
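
    The first two adjustment methods named above rest on standard formulas, which the sketch below works through for a single township. All numbers are invented for illustration and are not taken from the study, and the function names are hypothetical. The directly standardized rate is the weighted average of age-specific rates with standard-population weights; the standardized incidence ratio is observed cases divided by the cases expected under reference age-specific rates.

```python
def direct_standardized_rate(cases, population, standard_population):
    """Weighted average of age-specific rates; weights are standard-population shares."""
    total_std = sum(standard_population)
    return sum(
        (c / n) * (s / total_std)
        for c, n, s in zip(cases, population, standard_population)
    )

def standardized_incidence_ratio(cases, population, reference_rates):
    """Observed cases divided by cases expected under reference age-specific rates."""
    expected = sum(n * r for n, r in zip(population, reference_rates))
    return sum(cases) / expected

# Hypothetical township with three age groups (made-up numbers).
cases = [2, 6, 12]
population = [10000, 6000, 4000]
standard_population = [5000, 3000, 2000]
reference_rates = [0.0001, 0.0005, 0.002]

dsr = direct_standardized_rate(cases, population, standard_population)
sir = standardized_incidence_ratio(cases, population, reference_rates)
print(f"directly standardized rate: {dsr:.4f}")
print(f"standardized incidence ratio: {sir:.3f}")
```

    In the third (Variable Method) approach, the age-group proportions would instead enter the regression as independent variables alongside the exposure categories.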

  15. Power of multifactor dimensionality reduction and penalized logistic regression for detecting gene-gene Interaction in a case-control study

    PubMed Central

    2009-01-01

    Background There is a growing awareness that interactions between multiple genes play an important role in the risk of common, complex multi-factorial diseases. Many common diseases are affected by certain genotype combinations (associated with some genes and their interactions). The identification and characterization of these susceptibility genes and gene-gene interactions have been limited by small sample sizes and the large number of potential interactions between genes. Several methods have been proposed to detect gene-gene interaction in a case-control study. Penalized logistic regression (PLR), a variant of logistic regression with L2 regularization, is a parametric approach to detecting gene-gene interaction. On the other hand, Multifactor Dimensionality Reduction (MDR) is a nonparametric and genetic model-free approach to detecting genotype combinations associated with disease risk. Methods We compared the power of MDR and PLR for detecting two-way and three-way interactions in a case-control study through extensive simulations. We generated several interaction models with different magnitudes of interaction effect. For each model, we simulated 100 datasets, each with 200 cases and 200 controls and 20 SNPs. We considered a wide variety of models such as models with just main effects, models with only interaction effects or models with both main and interaction effects. We also compared the performance of MDR and PLR in detecting gene-gene interaction associated with acute rejection (AR) in kidney transplant patients. Results In this paper, we have studied the power of MDR and PLR for detecting gene-gene interaction in a case-control study through extensive simulation. We have compared their performances for different two-way and three-way interaction models. We have studied the effect of different allele frequencies on these methods. We have also evaluated their performance on a real dataset. As expected, none of these methods were consistently better for all

  16. Forest cover dynamics analysis and prediction modelling using logistic regression model (case study: forest cover at Indragiri Hulu Regency, Riau Province)

    NASA Astrophysics Data System (ADS)

    Nahib, Irmadi; Suryanta, Jaka

    2017-01-01

    Forest destruction, climate change and global warming could reduce the indirect benefits of forests, because forest is the largest carbon sink and plays a very important role in the global carbon cycle. To support the Reducing Emissions from Deforestation and Forest Degradation (REDD+) program, attention is paid to forest cover changes as the basis for calculating carbon stock changes. This study explores the forest cover dynamics and a prediction model of forest cover in Indragiri Hulu Regency, Riau Province, Indonesia. The study aims to analyse explanatory variables associated with forest conversion processes and to predict forest cover change using a logistic regression model (LRM). The main data used in this study are land use/cover maps (1990 – 2011). Performance of the developed model was assessed through a comparison of the predicted forest cover change and the actual forest cover in 2011. The analysis showed that forest cover decreased continuously between 1990 and 2011, with a loss of 165,284.82 ha (35.19 %) of forest area. The LRM successfully predicted the forest cover for the period 2010 with reasonably high accuracy (ROC = 92.97 % and 70.26 %).

  17. Supply and demand analysis for flood insurance by using logistic regression model: case study at Citarum watershed in South Bandung, West Java, Indonesia

    NASA Astrophysics Data System (ADS)

    Sidi, P.; Mamat, M.; Sukono; Supian, S.

    2017-01-01

    Floods have always occurred in the Citarum river basin. Their adverse effects can extend to all of the residents' property, including the destruction of houses. The cost of damage to residential buildings is usually not small. After each flood, the government and several social organizations provide funds to repair buildings, but these donations are very limited and cannot cover the entire cost of the necessary repairs. The presence of insurance products for property damage caused by floods is therefore considered very important. However, is their presence also considered necessary by the public? In this paper, the factors that affect the supply and demand of insurance products for buildings damaged by floods are analyzed. The method used in this analysis is ordinal logistic regression. The analysis indicates that the factors affecting the supply and demand of insurance products for buildings damaged by floods include age, economic circumstances, family situations, insurance motivations, and lifestyle. Simultaneously, the factors affecting supply and demand of insurance products for buildings damaged by floods amounted to 65.7%.

  18. A Simulation Study to Assess the Effect of the Number of Response Categories on the Power of Ordinal Logistic Regression for Differential Item Functioning Analysis in Rating Scales

    PubMed Central

    Allahyari, Elahe; Jafari, Peyman; Bagheri, Zahra

    2016-01-01

    Objective. The present study uses simulated data to find what the optimal number of response categories is to achieve adequate power in ordinal logistic regression (OLR) model for differential item functioning (DIF) analysis in psychometric research. Methods. A hypothetical ten-item quality of life scale with three, four, and five response categories was simulated. The power and type I error rates of OLR model for detecting uniform DIF were investigated under different combinations of ability distribution (θ), sample size, sample size ratio, and the magnitude of uniform DIF across reference and focal groups. Results. When θ was distributed identically in the reference and focal groups, increasing the number of response categories from 3 to 5 resulted in an increase of approximately 8% in power of OLR model for detecting uniform DIF. The power of OLR was less than 0.36 when ability distribution in the reference and focal groups was highly skewed to the left and right, respectively. Conclusions. The clearest conclusion from this research is that the minimum number of response categories for DIF analysis using OLR is five. However, the impact of the number of response categories in detecting DIF was lower than might be expected. PMID:27403207

  19. Shallow landslide susceptibility model for the Oria river basin, Gipuzkoa province (North of Spain). Application of the logistic regression and comparison with previous studies.

    NASA Astrophysics Data System (ADS)

    Bornaetxea, Txomin; Antigüedad, Iñaki; Ormaetxea, Orbange

    2016-04-01

    In the Oria river basin (885 km²) shallow landslides are very frequent; they produce roadblocks and damage to infrastructure and property, causing large economic losses every year. Considering that the zonation of the territory into different landslide susceptibility levels provides a useful tool for territorial planning and natural risk management, this study aims to identify the most landslide-prone places by applying an objective and reproducible methodology. To do so, a quantitative multivariate methodology, logistic regression, has been used. Landslide points collected in fieldwork and randomly selected stable points have been used along with the independent variables Lithology, Land Use, Distance to the transport infrastructure, Altitude, Senoidal Slope and Normalized Difference Vegetation Index (NDVI) to produce a landslide susceptibility map. The model has been validated by the prediction and success rate curves and their corresponding area under the curve (AUC). In addition, the result has been compared to those from two landslide susceptibility models covering the study area that were previously applied at different scales: ELSUS1000 version 1 (2013) and the Landslide Susceptibility Map of Gipuzkoa (2007). Validation results show an excellent prediction capacity of the proposed model (AUC 0.962), and comparisons highlight big differences with previous studies.

  20. Emergency department mental health presentations by people born in refugee source countries: an epidemiological logistic regression study in a Medicare Local region in Australia.

    PubMed

    Enticott, Joanne C; Cheng, I-Hao; Russell, Grant; Szwarc, Josef; Braitberg, George; Peek, Anne; Meadows, Graham

    2015-01-01

    This study investigated whether people born in refugee source countries are disproportionately represented among those receiving a diagnosis of mental illness within emergency departments (EDs). The setting was the Cities of Greater Dandenong and Casey, the resettlement region for one-twelfth of Australia's refugees. An epidemiological, secondary data analysis compared mental illness diagnoses received in EDs by refugee and non-refugee populations. The data source was the Victorian Emergency Minimum Dataset for the 2008-09 financial year. Univariate and multivariate logistic regression created predictive models for mental illness using five variables: age, sex, refugee background, interpreter use and preferred language. Collinearity, model fit and model stability were examined. Multivariate analysis showed age and sex to be the only significant risk factors for mental illness diagnosis in EDs. 'Refugee status', 'interpreter use' and 'preferred language' were not associated with a mental health diagnosis following risk adjustment for the effects of age and sex. The disappearance of the univariate association after adjustment for age and sex is a salutary lesson for Medicare Locals and other health planners regarding the importance of adjusting analyses of health service data for demographic characteristics.

  1. Logistic regression analysis of multiple noninvasive tests for the prediction of the presence and extent of coronary artery disease in men

    SciTech Connect

    Hung, J.; Chaitman, B.R.; Lam, J.; Lesperance, J.; Dupras, G.; Fines, P.; Cherkaoui, O.; Robert, P.; Bourassa, M.G.

    1985-08-01

    The incremental diagnostic yield of clinical data, exercise ECG, stress thallium scintigraphy, and cardiac fluoroscopy to predict coronary and multivessel disease was assessed in 171 symptomatic men by means of multiple logistic regression analyses. When clinical variables alone were analyzed, chest pain type and age were predictive of coronary disease, whereas chest pain type, age, a family history of premature coronary disease before age 55 years, and abnormal ST-T wave changes on the rest ECG were predictive of multivessel disease. The percentage of patients correctly classified by cardiac fluoroscopy (presence or absence of coronary artery calcification), exercise ECG, and thallium scintigraphy was 9%, 25%, and 50%, respectively, greater than for clinical variables, when the presence or absence of coronary disease was the outcome, and 13%, 25%, and 29%, respectively, when multivessel disease was studied; 5% of patients were misclassified. When the 37 clinical and noninvasive test variables were analyzed jointly, the most significant variable predictive of coronary disease was an abnormal thallium scan and for multivessel disease, the amount of exercise performed. The data from this study provide a quantitative model and confirm previous reports that optimal diagnostic efficacy is obtained when noninvasive tests are ordered sequentially. In symptomatic men, cardiac fluoroscopy is a relatively ineffective test when compared to exercise ECG and thallium scintigraphy.

  2. Spatial prediction of Lactarius deliciosus and Lactarius salmonicolor mushroom distribution with logistic regression models in the Kızılcasu Planning Unit, Turkey.

    PubMed

    Mumcu Kucuker, Derya; Baskent, Emin Zeki

    2015-01-01

    Integration of non-wood forest products (NWFPs) into forest management planning has become an increasingly important issue in forestry over the last decade. Among NWFPs, mushrooms are valued due to their medicinal, commercial, high nutritional and recreational importance. Commercial mushroom harvesting also provides important income to local dwellers and contributes to the economic value of regional forests. Sustainable management of these products at the regional scale requires information on their locations in diverse forest settings and the ability to predict and map their spatial distributions over the landscape. This study focuses on modeling the spatial distribution of commercially harvested Lactarius deliciosus and L. salmonicolor mushrooms in the Kızılcasu Forest Planning Unit, Turkey. The best models were developed based on topographic, climatic and stand characteristics, separately through logistic regression analysis using SPSS™. The best topographic model provided better classification success (69.3 %) than the best climatic (65.4 %) and stand (65 %) models. However, the overall best model, with 73 % overall classification success, used a mix of several variables. The best models were integrated into an Arc/Info GIS program to create spatial distribution maps of L. deliciosus and L. salmonicolor in the planning area. Our approach may be useful to predict the occurrence and distribution of other NWFPs and provide a valuable tool for designing silvicultural prescriptions and preparing multiple-use forest management plans.

  3. Quasi-Likelihood Techniques in a Logistic Regression Equation for Identifying Simulium damnosum s.l. Larval Habitats Intra-cluster Covariates in Togo

    PubMed Central

    JACOB, BENJAMIN G.; NOVAK, ROBERT J.; TOE, LAURENT; SANFO, MOUSSA S.; AFRIYIE, ABENA N.; IBRAHIM, MOHAMMED A.; GRIFFITH, DANIEL A.; UNNASCH, THOMAS R.

    2013-01-01

    The standard methods for regression analyses of clustered riverine larval habitat data of Simulium damnosum s.l., a major black-fly vector of onchocerciasis, postulate models relating observational ecological-sampled parameter estimators to prolific habitats without accounting for residual intra-cluster error correlation effects. Generally, this correlation comes from two sources: (1) the design of the random effects and their assumed covariance from the multiple levels within the regression model; and (2) the correlation structure of the residuals. Unfortunately, inconspicuous errors in residual intra-cluster correlation estimates can overstate precision in forecasted S. damnosum s.l. riverine larval habitat explanatory attributes regardless of how they are treated (e.g., independent, autoregressive, Toeplitz, etc.). In this research, the geographical locations for multiple riverine-based S. damnosum s.l. larval ecosystem habitats sampled from 2 pre-established epidemiological sites in Togo were identified and recorded from July 2009 to June 2010. Initially the data was aggregated into proc genmod. An agglomerative hierarchical residual cluster-based analysis was then performed. The sampled clustered study site data was then analyzed for statistical correlations using Monthly Biting Rates (MBR). Euclidean distance measurements and terrain-related geomorphological statistics were then generated in ArcGIS. A digital overlay was then performed, also in ArcGIS, using the georeferenced ground coordinates of high and low density clusters stratified by Annual Biting Rates (ABR). This data was overlain onto multitemporal sub-meter pixel resolution satellite data (i.e., QuickBird 0.61m wavebands). Orthogonal spatial filter eigenvectors were then generated in SAS/GIS. Univariate and non-linear regression-based models (i.e., Logistic, Poisson and Negative Binomial) were also employed to determine probability distributions and to identify statistically significant parameter

  4. Quasi-Likelihood Techniques in a Logistic Regression Equation for Identifying Simulium damnosum s.l. Larval Habitats Intra-cluster Covariates in Togo.

    PubMed

    Jacob, Benjamin G; Novak, Robert J; Toe, Laurent; Sanfo, Moussa S; Afriyie, Abena N; Ibrahim, Mohammed A; Griffith, Daniel A; Unnasch, Thomas R

    2012-01-01

    The standard methods for regression analyses of clustered riverine larval habitat data of Simulium damnosum s.l., a major black-fly vector of onchocerciasis, postulate models relating observational ecological-sampled parameter estimators to prolific habitats without accounting for residual intra-cluster error correlation effects. Generally, this correlation comes from two sources: (1) the design of the random effects and their assumed covariance from the multiple levels within the regression model; and (2) the correlation structure of the residuals. Unfortunately, inconspicuous errors in residual intra-cluster correlation estimates can overstate precision in forecasted S. damnosum s.l. riverine larval habitat explanatory attributes regardless of how they are treated (e.g., independent, autoregressive, Toeplitz, etc.). In this research, the geographical locations for multiple riverine-based S. damnosum s.l. larval ecosystem habitats sampled from 2 pre-established epidemiological sites in Togo were identified and recorded from July 2009 to June 2010. Initially the data was aggregated into proc genmod. An agglomerative hierarchical residual cluster-based analysis was then performed. The sampled clustered study site data was then analyzed for statistical correlations using Monthly Biting Rates (MBR). Euclidean distance measurements and terrain-related geomorphological statistics were then generated in ArcGIS. A digital overlay was then performed, also in ArcGIS, using the georeferenced ground coordinates of high and low density clusters stratified by Annual Biting Rates (ABR). This data was overlain onto multitemporal sub-meter pixel resolution satellite data (i.e., QuickBird 0.61m wavebands). Orthogonal spatial filter eigenvectors were then generated in SAS/GIS. Univariate and non-linear regression-based models (i.e., Logistic, Poisson and Negative Binomial) were also employed to determine probability distributions and to identify statistically significant parameter

  5. Unitary Response Regression Models

    ERIC Educational Resources Information Center

    Lipovetsky, S.

    2007-01-01

    The dependent variable in a regular linear regression is a numerical variable, and in a logistic regression it is a binary or categorical variable. In these models the dependent variable has varying values. However, there are problems yielding an identity output of a constant value which can also be modelled in a linear or logistic regression with…

  6. Predicting reintubation, prolonged mechanical ventilation and death in post-coronary artery bypass graft surgery: a comparison between artificial neural networks and logistic regression models

    PubMed Central

    Mendes, Renata G.; de Souza, César R.; Machado, Maurício N.; Correa, Paulo R.; Di Thommazo-Luporini, Luciana; Arena, Ross; Myers, Jonathan; Pizzolato, Ednaldo B.

    2015-01-01

    Introduction In coronary artery bypass (CABG) surgery, the common complications are the need for reintubation, prolonged mechanical ventilation (PMV) and death. Thus, a reliable model for the prognostic evaluation of those particular outcomes is a worthwhile pursuit. The existence of such a system would lead to better resource planning, cost reductions and an increased ability to guide preventive strategies. The aim of this study was to compare different methods – logistic regression (LR) and artificial neural networks (ANNs) – in accomplishing this goal. Material and methods Subjects undergoing CABG (n = 1315) were divided into training (n = 1053) and validation (n = 262) groups. The set of independent variables consisted of age, gender, weight, height, body mass index, diabetes, creatinine level, cardiopulmonary bypass, presence of preserved ventricular function, moderate and severe ventricular dysfunction and total number of grafts. The PMV was also an input for the prediction of death. The ability of ANN to discriminate outcomes was assessed using receiver-operating characteristic (ROC) analysis and the results were compared using a multivariate LR. Results The ROC curve areas for LR and ANN models, respectively, were: for reintubation 0.62 (CI: 0.50–0.75) and 0.65 (CI: 0.53–0.77); for PMV 0.67 (CI: 0.57–0.78) and 0.72 (CI: 0.64–0.81); and for death 0.86 (CI: 0.79–0.93) and 0.85 (CI: 0.80–0.91). No differences were observed between models. Conclusions The ANN has similar discriminating power in predicting reintubation, PMV and death outcomes. Thus, both models may be applicable as a predictor for these outcomes in subjects undergoing CABG. PMID:26322087
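
    The comparison above rests on the area under the ROC curve. As a minimal sketch of that measure (not the study's code; the scores and labels below are invented for illustration), the AUC can be computed as the fraction of positive/negative pairs that a model ranks correctly, with ties counted as half:

```python
def auc(scores, labels):
    """Area under the ROC curve via pairwise ranking (Mann-Whitney form)."""
    pos = [s for s, l in zip(scores, labels) if l == 1]
    neg = [s for s, l in zip(scores, labels) if l == 0]
    # A pair counts 1 if the positive case scores higher, 0.5 on a tie.
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

# Hypothetical validation-set outcomes and predicted probabilities.
labels = [1, 1, 0, 0, 1, 0]
lr_scores = [0.8, 0.4, 0.5, 0.2, 0.7, 0.1]   # assumed logistic regression output
ann_scores = [0.9, 0.6, 0.5, 0.3, 0.7, 0.2]  # assumed neural network output

print("LR AUC:", auc(lr_scores, labels))
print("ANN AUC:", auc(ann_scores, labels))
```

    The confidence intervals reported in the abstract would need an additional step (for example bootstrapping), which this sketch omits.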

  7. Use of Logistic Regression for Prediction of the Fate of Staphylococcus aureus in Pasteurized Milk in the Presence of Two Lytic Phages ▿

    PubMed Central

    Obeso, José M.; García, Pilar; Martínez, Beatriz; Arroyo-López, Francisco Noé; Garrido-Fernández, Antonio; Rodriguez, Ana

    2010-01-01

    The use of bacteriophages provides an attractive approach to the fight against food-borne pathogenic bacteria, since they can be found in different environments and are unable to infect humans, both characteristics of which support their use as biocontrol agents. Two lytic bacteriophages, vB_SauS-phiIPLA35 (phiIPLA35) and vB_SauS-phiIPLA88 (phiIPLA88), previously isolated from the dairy environment inhibited the growth of Staphylococcus aureus. To facilitate the successful application of both bacteriophages as biocontrol agents, probabilistic models for predicting S. aureus inactivation by the phages in pasteurized milk were developed. A linear logistic regression procedure was used to describe the survival/death interface of S. aureus after 8 h of storage as a function of the initial phage titer (2 to 8 log10 PFU/ml), initial bacterial contamination (2 to 6 log10 CFU/ml), and temperature (15 to 37°C). Two successive models were built, with the first including only data from the experimental design and a global one in which results derived from the validation experiments were also included. The temperature, interaction temperature-initial level of bacterial contamination, and initial level of bacterial contamination-phage titer contributed significantly to the first model prediction. However, only the phage titer and temperature were significantly involved in the global model prediction. The predictions of both models were fail-safe and highly consistent with the observed S. aureus responses. Nevertheless, the global model, deduced from a higher number of experiments (with a higher degree of freedom), was dependent on a lower number of variables and had an apparent better fit. Therefore, it can be considered a convenient evolution of the first model. Besides, the global model provides the minimum phage concentration (about 2 × 10^8 PFU/ml) required to inactivate S. aureus in milk at different temperatures, irrespective of the bacterial contamination level. PMID

  8. Hip fracture risk assessment: artificial neural network outperforms conditional logistic regression in an age- and sex-matched case control study

    PubMed Central

    2013-01-01

    Background Osteoporotic hip fractures with a significant morbidity and excess mortality among the elderly have imposed huge health and economic burdens on societies worldwide. In this age- and sex-matched case control study, we examined the risk factors of hip fractures and assessed the fracture risk by conditional logistic regression (CLR) and ensemble artificial neural network (ANN). The performances of these two classifiers were compared. Methods The study population consisted of 217 pairs (149 women and 68 men) of fractures and controls with an age older than 60 years. All the participants were interviewed with the same standardized questionnaire including questions on 66 risk factors in 12 categories. Univariate CLR analysis was initially conducted to examine the unadjusted odds ratio of all potential risk factors. The significant risk factors were then tested by multivariate analyses. For fracture risk assessment, the participants were randomly divided into modeling and testing datasets for 10-fold cross validation analyses. The predicting models built by CLR and ANN in modeling datasets were applied to testing datasets for generalization study. The performances, including discrimination and calibration, were compared with non-parametric Wilcoxon tests. Results In univariate CLR analyses, 16 variables achieved significant level, and six of them remained significant in multivariate analyses, including low T score, low BMI, low MMSE score, milk intake, walking difficulty, and significant fall at home. For discrimination, ANN outperformed CLR in both 16- and 6-variable analyses in modeling and testing datasets (p?

  9. Improving Case-Based Reasoning Systems by Combining K-Nearest Neighbour Algorithm with Logistic Regression in the Prediction of Patients’ Registration on the Renal Transplant Waiting List

    PubMed Central

    Campillo-Gimenez, Boris; Jouini, Wassim; Bayat, Sahar; Cuggia, Marc

    2013-01-01

    Introduction Case-based reasoning (CBR) is an emerging decision making paradigm in medical research where new cases are solved relying on previously solved similar cases. Usually, a database of solved cases is provided, and every case is described through a set of attributes (inputs) and a label (output). Extracting useful information from this database can help the CBR system provide more reliable results on the yet to be solved cases. Objective We suggest a general framework where a CBR system, viz. K-Nearest Neighbour (K-NN) algorithm, is combined with various information obtained from a Logistic Regression (LR) model, in order to improve prediction of access to the transplant waiting list. Methods LR is applied, on the case database, to assign weights to the attributes as well as the solved cases. Thus, five possible decision making systems based on K-NN and/or LR were identified: a standalone K-NN, a standalone LR and three soft K-NN algorithms that rely on the weights based on the results of the LR. The evaluation was performed under two conditions, either using predictive factors known to be related to registration, or using a combination of factors related and not related to registration. Results and Conclusion The results show that our suggested approach, where the K-NN algorithm relies on both weighted attributes and cases, can efficiently deal with irrelevant attributes, whereas the four other approaches suffer from this kind of noisy setup. The robustness of this approach suggests interesting perspectives for medical problem solving tools using CBR methodology. PMID:24039730
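
    One way to picture a K-NN that borrows attribute weights from an LR model is sketched below. This is an illustration under stated assumptions, not the authors' algorithm: the weighting rule (normalised absolute LR coefficients), the toy data, and all function names are hypothetical, and the case weights mentioned in the abstract are left out.

```python
import math
from collections import Counter

def fit_logistic(X, y, lr=0.5, epochs=2000):
    """Batch gradient ascent on the log-likelihood; returns [intercept, b1, ...]."""
    beta = [0.0] * (len(X[0]) + 1)
    for _ in range(epochs):
        grad = [0.0] * len(beta)
        for row, label in zip(X, y):
            z = beta[0] + sum(b * v for b, v in zip(beta[1:], row))
            err = label - 1.0 / (1.0 + math.exp(-z))
            grad[0] += err
            for j, v in enumerate(row):
                grad[j + 1] += err * v
        beta = [b + lr * g / len(X) for b, g in zip(beta, grad)]
    return beta

def attribute_weights(beta):
    """Assumed rule: |coefficient| (intercept excluded), normalised to sum to 1."""
    mags = [abs(b) for b in beta[1:]]
    total = sum(mags)
    return [m / total for m in mags]

def weighted_knn(X, y, query, weights, k=3):
    """Majority vote among the k nearest cases under a weighted Euclidean distance."""
    def dist(row):
        return math.sqrt(sum(w * (a - b) ** 2
                             for w, a, b in zip(weights, row, query)))
    nearest = sorted(zip(X, y), key=lambda pair: dist(pair[0]))[:k]
    return Counter(label for _, label in nearest).most_common(1)[0][0]

# Toy case base: attribute 0 is informative, attribute 1 is mostly noise.
X = [(0.0, 0.9), (0.1, 0.1), (0.2, 0.8), (0.3, 0.2),
     (0.7, 0.85), (0.8, 0.15), (0.9, 0.75), (1.0, 0.05)]
y = [0, 0, 0, 0, 1, 1, 1, 1]

weights = attribute_weights(fit_logistic(X, y))
prediction = weighted_knn(X, y, (0.85, 0.9), weights)
print("attribute weights:", weights)
print("predicted label:", prediction)
```

    Because the noisy attribute receives a small weight, it contributes little to the distance, which is the intuition behind the robustness to irrelevant attributes reported above.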

  10. Comparing performances of logistic regression and neural networks for predicting melatonin excretion patterns in the rat exposed to ELF magnetic fields.

    PubMed

    Jahandideh, Samad; Abdolmaleki, Parviz; Movahedi, Mohammad Mehdi

    2010-02-01

    Various studies have been reported on the bioeffects of magnetic field exposure; however, no consensus or guideline is available for experimental designs relating to exposure conditions as yet. In this study, logistic regression (LR) and artificial neural networks (ANNs) were used in order to analyze and predict the melatonin excretion patterns in the rat exposed to extremely low frequency magnetic fields (ELF-MF). Subsequently, on a database containing 33 experiments, performances of LR and ANNs were compared through resubstitution and jackknife tests. Predictor variables were more effective parameters and included frequency, polarization, exposure duration, and strength of magnetic fields. Also, five performance measures, including accuracy, sensitivity, specificity, Matthews correlation coefficient (MCC) and normalized percentage better than random (S), were used to evaluate the performance of models. The LR as a conventional model obtained poor prediction performance. Nonetheless, LR distinguished the duration of magnetic fields as a statistically significant parameter. Also, horizontal polarization of magnetic fields, with the highest logit coefficient (or parameter estimate) with negative sign, was found to be the strongest indicator for experimental designs relating to exposure conditions. This means that each experiment with horizontal polarization of magnetic fields has a higher probability to result in a "not changed melatonin level" pattern. On the other hand, ANNs, a more powerful model which has not been introduced in predicting melatonin excretion patterns in the rat exposed to ELF-MF, showed high performance measure values and higher reliability, especially obtaining a 0.55 value of MCC through jackknife tests. Obtained results showed that such predictor models are promising and may play a useful role in defining guidelines for experimental designs relating to exposure conditions. In conclusion, analysis of the bioelectromagnetic data could result in

  11. Logistic regression analyses for predicting clinically important differences in motor capacity, motor performance, and functional independence after constraint-induced therapy in children with cerebral palsy.

    PubMed

    Wang, Tien-ni; Wu, Ching-yi; Chen, Chia-ling; Shieh, Jeng-yi; Lu, Lu; Lin, Keh-chung

    2013-03-01

    Given the growing evidence for the effects of constraint-induced therapy (CIT) in children with cerebral palsy (CP), there is a need for investigating the characteristics of potential participants who may benefit most from this intervention. This study aimed to establish predictive models for the effects of pediatric CIT on motor and functional outcomes. Therapists administered CIT to 49 children (aged 3-11 years) with CP. Sessions were 1-3.5h a day, twice a week, for 3-4 weeks. Parents were asked to document the number of restraint hours outside of the therapy sessions. Domains of treatment outcomes included motor capacity (measured by the Peabody Developmental Motor Scales II), motor performance (measured by the Pediatric Motor Activity Log), and functional independence (measured by the Pediatric Functional Independence Measure). Potential predictors included age, affected side, compliance (measured by time of restraint), and the initial level of motor impairment severity. Tests were administered before, immediately after, and 3 months after the intervention. Logistic regression analyses showed that total amount of restraint time was the only significant predictor for improved motor capacity immediately after CIT. Younger children who restrained the less affected arm for a longer time had a greater chance to achieve clinically significant improvements in motor performance. For outcomes of functional independence in daily life, younger age was associated with clinically meaningful improvement in the self-care domain. Baseline motor abilities were significantly predictive of better improvement in mobility and cognition. Significant predictors varied according to the aspects of motor outcomes after 3 months of follow-up. The potential predictors identified in this study allow clinicians to target those children who may benefit most from CIT.

  12. Age-Adjustment and Related Epidemiology Rates in Education and Research

    ERIC Educational Resources Information Center

    Baker, John D.; Kruckman, Laurence; George, Joyce

    2006-01-01

    A quick review of introductory textbooks reveals that while gerontology authors and instructors introduce some aspect of demography and epidemiology data, there is limited focus on age adjustment or other important epidemiology rates. The goal of this paper is to reintroduce a variety of basic epidemiology strategies such as incidence, prevalence,…

  13. The role of multicollinearity in landslide susceptibility assessment by means of Binary Logistic Regression: comparison between VIF and AIC stepwise selection

    NASA Astrophysics Data System (ADS)

    Cama, Mariaelena; Cristi Nicu, Ionut; Conoscenti, Christian; Quénéhervé, Geraldine; Maerker, Michael

    2016-04-01

    Landslide susceptibility can be defined as the likelihood of a landslide occurring in a given area on the basis of local terrain conditions. In recent decades much research has focused on its evaluation by means of stochastic approaches under the assumption that 'the past is the key to the future', which means that if a model is able to reproduce a known landslide spatial distribution, it will be able to predict the future locations of new (i.e. unknown) slope failures. Among the various stochastic approaches, Binary Logistic Regression (BLR) is one of the most used because it calculates the susceptibility in probabilistic terms and its results are easily interpretable from a geomorphological point of view. However, very often not much importance is given to multicollinearity assessment, whose effect is that the coefficient estimates are unstable, may have the opposite sign, and are therefore difficult to interpret. Therefore, it should be evaluated every time in order to make a model whose results are geomorphologically correct. In this study the effects of multicollinearity on the predictive performance and robustness of landslide susceptibility models are analyzed. In particular, the multicollinearity is estimated by means of the Variance Inflation Factor (VIF), which is also used as a selection criterion for the independent variables (VIF Stepwise Selection) and compared to the more commonly used AIC Stepwise Selection. The robustness of the results is evaluated through 100 replicates of the dataset. The study area selected to perform this analysis is the Moldavian Plateau, where landslides are among the most frequent geomorphological processes. This area has an increasing trend of urbanization and a very high potential regarding the cultural heritage, being the place of discovery of the largest settlement belonging to the Cucuteni Culture from Eastern Europe (that led to the development of the great complex Cucuteni-Trypillia). Therefore, identifying the areas susceptible to
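
    For readers unfamiliar with the VIF, the sketch below shows one standard way to compute it: the VIF of predictor j equals the j-th diagonal element of the inverse of the predictors' correlation matrix, which is equivalent to 1 / (1 - R_j^2). The covariate names and values are invented for illustration and are not taken from the study.

```python
def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5

def invert(matrix):
    """Gauss-Jordan inversion with partial pivoting (small matrices only)."""
    n = len(matrix)
    aug = [list(row) + [1.0 if i == j else 0.0 for j in range(n)]
           for i, row in enumerate(matrix)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        p = aug[col][col]
        aug[col] = [v / p for v in aug[col]]
        for r in range(n):
            if r != col:
                factor = aug[r][col]
                aug[r] = [v - factor * w for v, w in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]

def vif(columns):
    """columns: dict of name -> values. Returns dict of name -> VIF."""
    names = list(columns)
    corr = [[pearson(columns[a], columns[b]) for b in names] for a in names]
    inverse = invert(corr)
    return {name: inverse[i][i] for i, name in enumerate(names)}

# Hypothetical terrain covariates; slope and relief are strongly related.
covariates = {
    "slope": [5.0, 12.0, 18.0, 25.0, 31.0, 40.0],
    "relief": [11.0, 22.0, 39.0, 48.0, 65.0, 79.0],
    "curvature": [0.3, -0.2, 0.1, -0.4, 0.2, 0.0],
}
for name, value in vif(covariates).items():
    print(name, round(value, 2))
```

    In a VIF-based stepwise selection, the predictor with the largest VIF would typically be dropped and the values recomputed until all fall below a chosen cut-off; the cut-off used in the study is not stated in the abstract.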

  14. Three-Level Mixed-Effects Logistic Regression Analysis Reveals Complex Epidemiology of Swine Rotaviruses in Diagnostic Samples from North America

    PubMed Central

    Diaz, Andres; Rossow, Stephanie; Ciarlet, Max; Marthaler, Douglas

    2016-01-01

    Rotaviruses (RV) are important causes of diarrhea in animals, especially in domestic animals. Of the 9 RV species, rotavirus A, B, and C (RVA, RVB, and RVC, respectively) had been established as important causes of diarrhea in pigs. The Minnesota Veterinary Diagnostic Laboratory receives swine stool samples from North America to determine the etiologic agents of disease. Between November 2009 and October 2011, 7,508 samples from pigs with diarrhea were submitted to determine if enteric pathogens, including RV, were present in the samples. All samples were tested for RVA, RVB, and RVC by real time RT-PCR. The majority of the samples (82%) were positive for RVA, RVB, and/or RVC. To better understand the risk factors associated with RV infections in swine diagnostic samples, three-level mixed-effects logistic regression models (3L-MLMs) were used to estimate associations among RV species, age, and geographical variability within the major swine production regions in North America. The conditional odds ratios (cORs) for RVA and RVB detection were lower for 1–3 day old pigs when compared to any other age group. However, the cOR of RVC detection in 1–3 day old pigs was significantly higher (p < 0.001) than pigs in the 4–20 days old and >55 day old age groups. Furthermore, pigs in the 21–55 day old age group had statistically higher cORs of RV co-detection compared to 1–3 day old pigs (p < 0.001). The 3L-MLMs indicated that RV status was more similar within states than among states or within each region. Our results indicated that 3L-MLMs are a powerful and adaptable tool to handle and analyze large-hierarchical datasets. In addition, our results indicated that, overall, swine RV epidemiology is complex, and RV species are associated with different age groups and vary by regions in North America. PMID:27145176

  15. Comparison and validation of Logistic Regression and Analytic Hierarchy Process models of landslide susceptibility in monoclinic regions. A case study in Moldavian Plateau, N-E Romania

    NASA Astrophysics Data System (ADS)

    Ciprian Margarint, Mihai; Niculita, Mihai

    2014-05-01

    The regions with monoclinic geological structure are large portions of the earth's surface where the repetition of similar landform patterns is very distinct, the scarps of cuestas being characterized by similar values of morphometrical variables. Landslides are associated with these scarps of cuestas and, consequently, a very high value of landslide susceptibility can be reported on their surface. In these regions, landslide susceptibility mapping can be realized for the entire region, or for test areas, with accurate, reliable, and available datasets concerning multi-temporal inventories and landslide predictors. Because of the similar geomorphologic and landslide distribution, we think that if there is any relevance in using test areas for extrapolating susceptibility models, these areas should be targeted first. This case study tries to establish how far the influence of landslide predictors, obtained for a 90 km² sample located in the northern part of the Moldavian Plateau (N-E Romania), can be used in other areas of the same physio-geographic region. In a first phase, landslide susceptibility assessment was carried out and validated using a logistic regression (LR) approach and a multiple landslide inventory. This inventory was created using orthorectified aerial images from 1978 and 2005, with both old and active landslides considered for each period. The modeling strategy was based on a distinct inventory of the depletion areas of all landslides for the 1978 phase, and on 30 covariates extracted from topographical and aerial images (both from the 1978 and 2005 periods). The geomorphometric variables were computed from a Digital Elevation Model (DEM) obtained by interpolation from 1:5000 contour data (2.5 m equidistance), at 10x10 m resolution. Distance from river network, distance from roads and land use were extracted from topographic maps and aerial images. By applying Akaike Information Criterion (AIC) the covariates with significance under 0.001 level

  16. Personal, Social, and Game-Related Correlates of Active and Non-Active Gaming Among Dutch Gaming Adolescents: Survey-Based Multivariable, Multilevel Logistic Regression Analyses

    PubMed Central

    de Vet, Emely; Chinapaw, Mai JM; de Boer, Michiel; Seidell, Jacob C; Brug, Johannes

    2014-01-01

    Background Playing video games contributes substantially to sedentary behavior in youth. A new generation of video games—active games—seems to be a promising alternative to sedentary games to promote physical activity and reduce sedentary behavior. At this time, little is known about correlates of active and non-active gaming among adolescents. Objective The objective of this study was to examine potential personal, social, and game-related correlates of both active and non-active gaming in adolescents. Methods A survey assessing game behavior and potential personal, social, and game-related correlates was conducted among adolescents (12-16 years, N=353) recruited via schools. Multivariable, multilevel logistic regression analyses, adjusted for demographics (age, sex and educational level of adolescents), were conducted to examine personal, social, and game-related correlates of active gaming ≥1 hour per week (h/wk) and non-active gaming >7 h/wk. Results Active gaming ≥1 h/wk was significantly associated with a more positive attitude toward active gaming (OR 5.3, CI 2.4-11.8; P<.001), a less positive attitude toward non-active games (OR 0.30, CI 0.1-0.6; P=.002), a higher score on habit strength regarding gaming (OR 1.9, CI 1.2-3.2; P=.008) and having brothers/sisters (OR 6.7, CI 2.6-17.1; P<.001) and friends (OR 3.4, CI 1.4-8.4; P=.009) who spend more time on active gaming and a little bit lower score on game engagement (OR 0.95, CI 0.91-0.997; P=.04). Non-active gaming >7 h/wk was significantly associated with a more positive attitude toward non-active gaming (OR 2.6, CI 1.1-6.3; P=.035), a stronger habit regarding gaming (OR 3.0, CI 1.7-5.3; P<.001), having friends who spend more time on non-active gaming (OR 3.3, CI 1.46-7.53; P=.004), and a more positive image of a non-active gamer (OR 2, CI 1.07–3.75; P=.03). Conclusions Various factors were significantly associated with active gaming ≥1 h/wk and non-active gaming >7 h/wk. Active gaming is most

  17. Basic Diagnosis and Prediction of Persistent Contrail Occurrence using High-resolution Numerical Weather Analyses/Forecasts and Logistic Regression. Part I: Effects of Random Error

    NASA Technical Reports Server (NTRS)

    Duda, David P.; Minnis, Patrick

    2009-01-01

    Straightforward application of the Schmidt-Appleman contrail formation criteria to diagnose persistent contrail occurrence from numerical weather prediction data is hindered by significant bias errors in the upper tropospheric humidity. Logistic models of contrail occurrence have been proposed to overcome this problem, but basic questions remain about how random measurement error may affect their accuracy. A set of 5000 synthetic contrail observations is created to study the effects of random error in these probabilistic models. The simulated observations are based on distributions of temperature, humidity, and vertical velocity derived from Advanced Regional Prediction System (ARPS) weather analyses. The logistic models created from the simulated observations were evaluated using two common statistical measures of model accuracy, the percent correct (PC) and the Hanssen-Kuipers discriminant (HKD). To convert the probabilistic results of the logistic models into a dichotomous yes/no choice suitable for the statistical measures, two critical probability thresholds are considered. The HKD scores are higher when the climatological frequency of contrail occurrence is used as the critical threshold, while the PC scores are higher when the critical probability threshold is 0.5. For both thresholds, typical random errors in temperature, relative humidity, and vertical velocity are found to be small enough to allow for accurate logistic models of contrail occurrence. The accuracy of the models developed from synthetic data is over 85 percent for both the prediction of contrail occurrence and non-occurrence, although in practice, larger errors would be anticipated.
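
    The two accuracy measures named above can be computed from a 2x2 contingency table once the model probabilities are thresholded. The sketch below uses the standard definitions (PC is the fraction of correct yes/no forecasts; HKD is the hit rate minus the false alarm rate) with invented probabilities and observations, and it compares a 0.5 threshold with the climatological frequency of the toy sample.

```python
def contingency(probs, observed, threshold):
    """Return (hits, misses, false_alarms, correct_negatives) at a threshold."""
    hits = misses = false_alarms = correct_negatives = 0
    for p, o in zip(probs, observed):
        forecast_yes = p >= threshold
        if forecast_yes and o:
            hits += 1
        elif o:
            misses += 1
        elif forecast_yes:
            false_alarms += 1
        else:
            correct_negatives += 1
    return hits, misses, false_alarms, correct_negatives

def percent_correct(h, m, f, c):
    """Fraction of all forecasts that were correct."""
    return (h + c) / (h + m + f + c)

def hanssen_kuipers(h, m, f, c):
    """Hit rate minus false alarm rate; 1 is perfect, 0 is no skill."""
    return h / (h + m) - f / (f + c)

# Hypothetical logistic-model probabilities and observed contrail occurrence.
probs = [0.9, 0.6, 0.4, 0.45, 0.2, 0.1, 0.35, 0.05]
observed = [1, 1, 1, 0, 0, 0, 0, 0]
climatology = sum(observed) / len(observed)  # frequency of occurrence in the sample

for label, threshold in (("0.5", 0.5), ("climatology", climatology)):
    table = contingency(probs, observed, threshold)
    print(label, "PC =", percent_correct(*table),
          "HKD =", round(hanssen_kuipers(*table), 3))
```

    PC rewards agreement on the more common class, whereas HKD weighs occurrence and non-occurrence equally, so the two measures can favour different thresholds.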

  18. Effectiveness of Combining Statistical Tests and Effect Sizes When Using Logistic Discriminant Function Regression to Detect Differential Item Functioning for Polytomous Items

    ERIC Educational Resources Information Center

    Gómez-Benito, Juana; Hidalgo, Maria Dolores; Zumbo, Bruno D.

    2013-01-01

    The objective of this article was to find an optimal decision rule for identifying polytomous items with large or moderate amounts of differential functioning. The effectiveness of combining statistical tests with effect size measures was assessed using logistic discriminant function analysis and two effect size measures: R² and…

  19. Basic Diagnosis and Prediction of Persistent Contrail Occurrence using High-resolution Numerical Weather Analyses/Forecasts and Logistic Regression. Part II: Evaluation of Sample Models

    NASA Technical Reports Server (NTRS)

    Duda, David P.; Minnis, Patrick

    2009-01-01

    Previous studies have shown that probabilistic forecasting may be a useful method for predicting persistent contrail formation. A probabilistic forecast to accurately predict contrail formation over the contiguous United States (CONUS) is created by using meteorological data based on hourly meteorological analyses from the Advanced Regional Prediction System (ARPS) and from the Rapid Update Cycle (RUC) as well as GOES water vapor channel measurements, combined with surface and satellite observations of contrails. Two groups of logistic models were created. The first group of models (SURFACE models) is based on surface-based contrail observations supplemented with satellite observations of contrail occurrence. The second group of models (OUTBREAK models) is derived from a selected subgroup of satellite-based observations of widespread persistent contrails. The mean accuracies for both the SURFACE and OUTBREAK models typically exceeded 75 percent when based on the RUC or ARPS analysis data, but decreased when the logistic models were derived from ARPS forecast data.

  20. Large scale landslide susceptibility assessment using the statistical methods of logistic regression and BSA - study case: the sub-basin of the small Niraj (Transylvania Depression, Romania)

    NASA Astrophysics Data System (ADS)

    Roşca, S.; Bilaşco, Ş.; Petrea, D.; Fodorean, I.; Vescan, I.; Filip, S.; Măguţ, F.-L.

    2015-11-01

    The existence of a large number of GIS models for the identification of landslide occurrence probability makes it difficult to select a specific one. The present study focuses on the application of two quantitative models: the logistic and the BSA models. The comparative analysis of the results aims at identifying the most suitable model. The territory corresponding to the Niraj Mic Basin (87 km²) is an area characterised by a wide variety of landforms, with their morphometric, morphographical and geological characteristics, as well as by a high complexity of land use types where active landslides exist. This is the reason why it represents the test area for applying the two models and for the comparison of the results. The large complexity of input variables is illustrated by 16 factors which were represented as 72 dummy variables, analysed on the basis of their importance within the model structures. The testing of the statistical significance corresponding to each variable reduced the number of dummy variables to 12 which were considered significant for the test area within the logistic model, whereas for the BSA model all the variables were employed. The predictability degree of the models was tested through the identification of the area under the ROC curve which indicated a good accuracy (AUROC = 0.86 for the testing area) and predictability of the logistic model (AUROC = 0.63 for the validation area).

  1. Impact of Colic Pain as a Significant Factor for Predicting the Stone Free Rate of One-Session Shock Wave Lithotripsy for Treating Ureter Stones: A Bayesian Logistic Regression Model Analysis

    PubMed Central

    Chung, Doo Yong; Cho, Kang Su; Lee, Dae Hun; Han, Jang Hee; Kang, Dong Hyuk; Jung, Hae Do; Kown, Jong Kyou; Ham, Won Sik; Choi, Young Deuk; Lee, Joo Yong

    2015-01-01

    Purpose This study was conducted to evaluate colic pain as a prognostic pretreatment factor that can influence ureter stone clearance and to estimate the probability of stone-free status in shock wave lithotripsy (SWL) patients with a ureter stone. Materials and Methods We retrospectively reviewed the medical records of 1,418 patients who underwent their first SWL between 2005 and 2013. Among these patients, 551 had a ureter stone measuring 4–20 mm and were thus eligible for our analyses. Colic pain as the chief complaint was defined as subjective flank pain reported during history taking and physical examination. Propensity scores for colic pain were calculated for each patient using multivariate logistic regression based upon the following covariates: age, maximal stone length (MSL), and mean stone density (MSD). Each factor was evaluated as a predictor for stone-free status by Bayesian and non-Bayesian logistic regression models. Results After propensity-score matching, 217 patients were extracted in each group from the total patient cohort. There were no statistical differences in the variables used in propensity-score matching. One-session success and stone-free rates were also higher in the painful group (73.7% and 71.0%, respectively) than in the painless group (63.6% and 60.4%, respectively). In multivariate non-Bayesian and Bayesian logistic regression models, a painful stone, shorter MSL, and lower MSD were significant factors for one-session stone-free status in patients who underwent SWL. Conclusions Colic pain in patients with ureter calculi was one of the significant predicting factors, along with MSL and MSD, for one-session stone-free status of SWL. PMID:25902059
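
    The abstract does not say which matching algorithm was used. As one common possibility, the sketch below performs 1:1 greedy nearest-neighbour matching without replacement on precomputed propensity scores, with a caliper. The patient identifiers, scores and caliper value are all hypothetical; in practice the scores would come from a logistic regression on the covariates listed above.

```python
def greedy_match(treated, controls, caliper=0.05):
    """Pair each treated unit with the closest unused control within the caliper.

    treated, controls: dict of id -> propensity score.
    Returns a list of (treated_id, control_id) pairs.
    """
    available = dict(controls)
    pairs = []
    # Process treated units in order of increasing score for reproducibility.
    for t_id, t_score in sorted(treated.items(), key=lambda kv: kv[1]):
        if not available:
            break
        c_id = min(available, key=lambda c: abs(available[c] - t_score))
        if abs(available[c_id] - t_score) <= caliper:
            pairs.append((t_id, c_id))
            del available[c_id]  # without replacement
    return pairs

# Hypothetical propensity scores for the painful and painless groups.
painful = {"p1": 0.30, "p2": 0.62, "p3": 0.90}
painless = {"c1": 0.28, "c2": 0.33, "c3": 0.60, "c4": 0.10}

matched = greedy_match(painful, painless)
print(matched)
```

    Units with no control inside the caliper (here "p3") stay unmatched, which is why a matched cohort is usually smaller than the eligible one.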

  2. Completing the Remedial Sequence and College-Level Credit-Bearing Math: Comparing Binary, Cumulative, and Continuation Ratio Logistic Regression Models

    ERIC Educational Resources Information Center

    Davidson, J. Cody

    2016-01-01

    Mathematics is the most common subject area of remedial need and the majority of remedial math students never pass a college-level credit-bearing math class. The majority of studies that investigate this phenomenon are conducted at community colleges and use some type of regression model; however, none have used a continuation ratio model. The…

  3. Evaluating the High Risk Groups for Suicide: A Comparison of Logistic Regression, Support Vector Machine, Decision Tree and Artificial Neural Network

    PubMed Central

    AMINI, Payam; AHMADINIA, Hasan; POOROLAJAL, Jalal; MOQADDASI AMIRI, Mohammad

    2016-01-01

    Background: We aimed to assess the high-risk group for suicide using different classification methods including logistic regression (LR), decision tree (DT), artificial neural network (ANN), and support vector machine (SVM). Methods: We used the dataset of a study conducted to predict risk factors of completed suicide in Hamadan Province, the west of Iran, in 2010. To evaluate the high-risk groups for suicide, LR, SVM, DT and ANN were performed. The applied methods were compared using sensitivity, specificity, positive predictive value, negative predictive value, accuracy and the area under the curve. The Cochran-Q test was applied to check differences in proportion among methods. To assess the association between the observed and predicted values, the Ø coefficient, contingency coefficient, and Kendall tau-b were calculated. Results: Gender, age, and job were the most important risk factors for fatal suicide attempts in common for the four methods. The SVM method showed the highest accuracy, 0.68 and 0.67 for the training and testing samples, respectively. However, this method resulted in the highest specificity (0.67 for training and 0.68 for testing sample) and the highest sensitivity for the training sample (0.85), but the lowest sensitivity for the testing sample (0.53). The Cochran-Q test resulted in differences between proportions in different methods (P<0.001). For the association between SVM predictions and observed values, the Ø coefficient, contingency coefficient, and Kendall tau-b were 0.239, 0.232 and 0.239, respectively. Conclusion: SVM had the best performance in classifying fatal suicide attempts compared with DT, LR and ANN. PMID:27957463
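
    Most of the comparison measures listed above follow directly from a 2x2 confusion matrix. The sketch below uses the standard definitions with invented counts (not the study's data); for a 2x2 table the Ø (phi) coefficient coincides with the Matthews correlation coefficient.

```python
import math

def binary_metrics(tp, fp, fn, tn):
    """Standard 2x2 classification measures from confusion-matrix counts."""
    return {
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "ppv": tp / (tp + fp),          # positive predictive value
        "npv": tn / (tn + fn),          # negative predictive value
        "accuracy": (tp + tn) / (tp + fp + fn + tn),
        "phi": (tp * tn - fp * fn) / math.sqrt(
            (tp + fp) * (tp + fn) * (tn + fp) * (tn + fn)),
    }

# Hypothetical counts for one classifier on a test sample of 100 cases.
metrics = binary_metrics(tp=40, fp=10, fn=20, tn=30)
for name, value in metrics.items():
    print(name, round(value, 3))
```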

  4. A Latent Transition Model with Logistic Regression

    ERIC Educational Resources Information Center

    Chung, Hwan; Walls, Theodore A.; Park, Yousung

    2007-01-01

    Latent transition models increasingly include covariates that predict prevalence of latent classes at a given time or transition rates among classes over time. In many situations, the covariate of interest may be latent. This paper describes an approach for handling both manifest and latent covariates in a latent transition model. A Bayesian…

  5. Logistic and linear regression model documentation for statistical relations between continuous real-time and discrete water-quality constituents in the Kansas River, Kansas, July 2012 through June 2015

    USGS Publications Warehouse

    Foster, Guy M.; Graham, Jennifer L.

    2016-04-06

    The Kansas River is a primary source of drinking water for about 800,000 people in northeastern Kansas. Source-water supplies are treated by a combination of chemical and physical processes to remove contaminants before distribution. Advanced notification of changing water-quality conditions and cyanobacteria and associated toxin and taste-and-odor compounds provides drinking-water treatment facilities time to develop and implement adequate treatment strategies. The U.S. Geological Survey (USGS), in cooperation with the Kansas Water Office (funded in part through the Kansas State Water Plan Fund), and the City of Lawrence, the City of Topeka, the City of Olathe, and Johnson County Water One, began a study in July 2012 to develop statistical models at two Kansas River sites located upstream from drinking-water intakes. Continuous water-quality monitors have been operated and discrete-water quality samples have been collected on the Kansas River at Wamego (USGS site number 06887500) and De Soto (USGS site number 06892350) since July 2012. Continuous and discrete water-quality data collected during July 2012 through June 2015 were used to develop statistical models for constituents of interest at the Wamego and De Soto sites. Logistic models to continuously estimate the probability of occurrence above selected thresholds were developed for cyanobacteria, microcystin, and geosmin. Linear regression models to continuously estimate constituent concentrations were developed for major ions, dissolved solids, alkalinity, nutrients (nitrogen and phosphorus species), suspended sediment, indicator bacteria (Escherichia coli, fecal coliform, and enterococci), and actinomycetes bacteria. These models will be used to provide real-time estimates of the probability that cyanobacteria and associated compounds exceed thresholds and of the concentrations of other water-quality constituents in the Kansas River. 
    The models documented in this report are useful for characterizing changes

  6. Sequential PET/CT with [18F]-FDG Predicts Pathological Tumor Response to Preoperative Short Course Radiotherapy with Delayed Surgery in Patients with Locally Advanced Rectal Cancer Using Logistic Regression Analysis

    PubMed Central

    Pecori, Biagio; Lastoria, Secondo; Caracò, Corradina; Celentani, Marco; Tatangelo, Fabiana; Avallone, Antonio; Rega, Daniela; De Palma, Giampaolo; Mormile, Maria; Budillon, Alfredo; Muto, Paolo; Bianco, Francesco; Aloj, Luigi; Petrillo, Antonella; Delrio, Paolo

    2017-01-01

    Previous studies indicate that FDG PET/CT may predict pathological response in patients undergoing neoadjuvant chemo-radiotherapy for locally advanced rectal cancer (LARC). The aim of the current study is to evaluate whether pathological response can be similarly predicted in LARC patients after short course radiation therapy alone. Methods: Thirty-three patients with cT2-3, N0-2, M0 rectal adenocarcinoma treated with hypofractionated short course neoadjuvant RT (5x5 Gy) with delayed surgery (SCRTDS) were prospectively studied. All patients underwent 3 PET/CT studies at baseline, 10 days from RT end (early), and 53 days from RT end (delayed). Maximal standardized uptake value (SUVmax), mean standardized uptake value (SUVmean) and total lesion glycolysis (TLG) of the primary tumor were measured and recorded at each PET/CT study. We used logistic regression analysis to aggregate different measures of metabolic response to predict the pathological response in the course of SCRTDS. Results: We provide straightforward formulas to classify response and estimate the probability of being a major responder (TRG1-2) or a complete responder (TRG1) for each individual. The formulas are based on the level of TLG at the early PET and on the overall proportional reduction of TLG between baseline and delayed PET studies. Conclusions: This study demonstrates that in the course of SCRTDS it is possible to estimate the probabilities of pathological tumor responses on the basis of PET/CT with FDG. Our formulas make it possible to assess the risks associated with LARC borne by a patient in the course of SCRTDS. These risk assessments can be balanced against other health risks associated with further treatments and can therefore be used to make informed therapy adjustments during SCRTDS. PMID:28060889

  7. Data mining methods in the prediction of Dementia: A real-data comparison of the accuracy, sensitivity and specificity of linear discriminant analysis, logistic regression, neural networks, support vector machines, classification trees and random forests

    PubMed Central

    2011-01-01

    Background: Dementia and cognitive impairment associated with aging are a major medical and social concern. Neuropsychological testing is a key element in the diagnostic procedures of Mild Cognitive Impairment (MCI), but has presently a limited value in the prediction of progression to dementia. We advance the hypothesis that newer statistical classification methods derived from data mining and machine learning methods like Neural Networks, Support Vector Machines and Random Forests can improve accuracy, sensitivity and specificity of predictions obtained from neuropsychological testing. Seven nonparametric classifiers derived from data mining methods (Multilayer Perceptrons Neural Networks, Radial Basis Function Neural Networks, Support Vector Machines, CART, CHAID and QUEST Classification Trees and Random Forests) were compared to three traditional classifiers (Linear Discriminant Analysis, Quadratic Discriminant Analysis and Logistic Regression) in terms of overall classification accuracy, specificity, sensitivity, area under the ROC curve and Press' Q. Model predictors were 10 neuropsychological tests currently used in the diagnosis of dementia. Statistical distributions of classification parameters obtained from a 5-fold cross-validation were compared using Friedman's nonparametric test. Results: Press' Q test showed that all classifiers performed better than chance alone (p < 0.05). Support Vector Machines showed the largest overall classification accuracy (Median (Me) = 0.76) and area under the ROC (Me = 0.90). However, this method showed high specificity (Me = 1.0) but low sensitivity (Me = 0.3). Random Forest ranked second in overall accuracy (Me = 0.73) with high area under the ROC (Me = 0.73), specificity (Me = 0.73) and sensitivity (Me = 0.64). Linear Discriminant Analysis also showed acceptable overall accuracy (Me = 0.66), with acceptable area under the ROC (Me = 0.72), specificity (Me = 0.66) and sensitivity (Me = 0.64). The remaining classifiers showed

  8. Multiple Logistic Regression Analysis of Risk Factors Associated with Denture Plaque and Staining in Chinese Removable Denture Wearers over 40 Years Old in Xi’an – a Cross-Sectional Study

    PubMed Central

    Chai, Zhiguo; Chen, Jihua; Zhang, Shaofeng

    2014-01-01

    Background Removable dentures are subject to plaque and/or staining problems. Denture hygiene habits and risk factors differ among countries and regions. The aims of this study were to assess hygiene habits and denture plaque and staining risk factors in Chinese removable denture wearers aged >40 years in Xi’an through multiple logistic regression analysis (MLRA). Methods Questionnaires were administered to 222 patients whose removable dentures were examined clinically to assess wear status and levels of plaque and staining. Univariate analyses were performed to identify potential risk factors for denture plaque/staining. MLRA was performed to identify significant risk factors. Results Brushing (77.93%) was the most prevalent cleaning method in the present study. Only 16.4% of patients regularly used commercial cleansers. Most (81.08%) patients removed their dentures overnight. MLRA indicated that potential risk factors for denture plaque were the duration of denture use (reference, ≤0.5 years; 2.1–5 years: OR = 4.155, P = 0.001; >5 years: OR = 7.238, P<0.001) and cleaning method (reference, chemical cleanser; running water: OR = 7.081, P = 0.010; brushing: OR = 3.567, P = 0.005). Potential risk factors for denture staining were female gender (OR = 0.377, P = 0.013), smoking (OR = 5.471, P = 0.031), tea consumption (OR = 3.957, P = 0.002), denture scratching (OR = 4.557, P = 0.036), duration of denture use (reference, ≤0.5 years; 2.1–5 years: OR = 7.899, P = 0.001; >5 years: OR = 27.226, P<0.001), and cleaning method (reference, chemical cleanser; running water: OR = 29.184, P<0.001; brushing: OR = 4.236, P = 0.007). Conclusion Denture hygiene habits need further improvement. An understanding of the risk factors for denture plaque and staining may provide the basis for preventive efforts. PMID:24498369

  9. Comparison and applicability of landslide susceptibility models based on landslide ratio-based logistic regression, frequency ratio, weight of evidence, and instability index methods in an extreme rainfall event

    NASA Astrophysics Data System (ADS)

    Wu, Chunhung

    2016-04-01

    Few studies have discussed the applicability of statistical landslide susceptibility (LS) models to extreme rainfall-induced landslide events. This research focuses on the comparison and applicability of LS models based on four methods, namely landslide ratio-based logistic regression (LRBLR), frequency ratio (FR), weight of evidence (WOE), and instability index (II), in an extreme rainfall-induced landslide case. The landslide inventory of the Chishan river watershed, Southwestern Taiwan, after 2009 Typhoon Morakot is the main material of this research. The Chishan river watershed is a tributary watershed of the Kaoping river watershed, which is a landslide- and erosion-prone watershed with an annual average suspended load of 3.6×10^7 MT/yr (ranked 11th in the world). Typhoon Morakot struck Southern Taiwan from Aug. 6-10 in 2009 and dumped nearly 2,000 mm of rainfall in the Chishan river watershed. The 24-hour, 48-hour, and 72-hour accumulated rainfall in the Chishan river watershed exceeded the 200-year return period accumulated rainfall. A total of 2,389 landslide polygons in the Chishan river watershed were extracted from SPOT 5 images after 2009 Typhoon Morakot. The total landslide area is around 33.5 km², equal to a landslide ratio of 4.1%. The main landslide types based on Varnes' (1978) classification are rotational and translational slides. The two characteristics of an extreme rainfall-induced landslide event are dense landslide distribution and a large share of downslope landslide areas owing to headward erosion and bank erosion in the flooding processes. The area of downslope landslides in the Chishan river watershed after 2009 Typhoon Morakot is 3.2 times higher than that of upslope landslide areas. The prediction accuracy of the LS models based on the LRBLR, FR, WOE, and II methods has been shown to exceed 70%. The model performance and applicability of four models in a landslide-prone watershed with dense distribution of rainfall

  10. QuickStats: Age-Adjusted Death Rates* for Top Five Causes of Cancer Death,(†) by Race/Hispanic Ethnicity - United States, 2014.

    PubMed

    2016-09-16

    In 2014, the top five causes of cancer deaths for the total population were lung, colorectal, female breast, pancreatic, and prostate cancer. The non-Hispanic black population had the highest age-adjusted death rates for each of these five cancers, followed by non-Hispanic white and Hispanic groups. The age-adjusted death rate for lung cancer, the leading cause of cancer death in all groups, was 42.1 per 100,000 standard population for the total population, 45.4 for non-Hispanic white, 45.7 for non-Hispanic black, and 18.3 for Hispanic populations.

  11. The logistics of choice.

    PubMed

    Killeen, Peter R

    2015-07-01

    The generalized matching law (GML) is reconstructed as a logistic regression equation that privileges no particular value of the sensitivity parameter, a. That value will often approach 1 due to the feedback that drives switching that is intrinsic to most concurrent schedules. A model of that feedback reproduced some features of concurrent data. The GML is a law only in the strained sense that any equation that maps data is a law. The machine under the hood of matching is in all likelihood the very law that was displaced by the Matching Law. It is now time to return the Law of Effect to centrality in our science.
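
    The recasting of the GML as a logistic equation can be sketched in a few lines. The symbols below are assumptions, since the abstract names only the sensitivity parameter a: B_1 and B_2 denote response rates on the two alternatives, R_1 and R_2 the obtained reinforcement rates, and b a bias term. Under the usual power-law statement of the GML, the proportion of responses on alternative 1 is a logistic function of the log reinforcement ratio, with a as the slope.

```latex
\frac{B_1}{B_2} = b\left(\frac{R_1}{R_2}\right)^{a}
\quad\Longrightarrow\quad
p = \frac{B_1}{B_1 + B_2}
  = \frac{1}{1 + \exp\!\left[-\left(\ln b + a \ln\frac{R_1}{R_2}\right)\right]},
\qquad
\operatorname{logit}(p) = \ln b + a \ln\frac{R_1}{R_2}.
```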

  12. Changes in Age-Adjusted Mortality Rates and Disparities for Rural Physician Shortage Areas Staffed by the National Health Service Corps: 1984-1998

    ERIC Educational Resources Information Center

    Pathman, Donald E.; Fryer, George E.; Green, Larry A.; Phillips, Robert L.

    2005-01-01

    This study assesses whether the National Health Service Corps's legislated goals to see health improve and health disparities lessen are being met in rural health professional shortage areas for a key population health indicator: age-adjusted mortality. In a descriptive study using a pre-post design with comparison groups, the authors calculated…

  13. Changes in Age-Adjusted Mortality Rates and Disparities for Rural Physician Shortage Areas Staffed by the National Health Service Corps: 1984-1998

    ERIC Educational Resources Information Center

    Pathman, Donald E.; Fryer, George E.; Green, Larry A.; Phillips, Robert L.

    2005-01-01

    Objective: This study assesses whether the National Health Service Corps's legislated goals to see health improve and health disparities lessen are being met in rural health professional shortage areas for a key population health indicator: age-adjusted mortality. Methods: In a descriptive study using a pre-post design with comparison groups, the…

  14. Country logistics performance and disaster impact.

    PubMed

    Vaillancourt, Alain; Haavisto, Ira

    2016-04-01

    The aim of this paper is to deepen the understanding of the relationship between country logistics performance and disaster impact. The relationship is analysed through correlation analysis and regression models for 117 countries for the years 2007 to 2012 with disaster impact variables from the International Disaster Database (EM-DAT) and logistics performance indicators from the World Bank. The results show a significant relationship between country logistics performance and disaster impact overall and for five out of six specific logistic performance indicators. These specific indicators were further used to explore the relationship between country logistic performance and disaster impact for three specific disaster types (epidemic, flood and storm). The findings enhance the understanding of the role of logistics in a humanitarian context with empirical evidence of the importance of country logistics performance in disaster response operations.

  15. Biomass Logistics

    SciTech Connect

    J. Richard Hess; Kevin L. Kenney; William A. Smith; Ian Bonner; David J. Muth

    2015-04-01

    Equipment manufacturers have made rapid improvements in biomass harvesting and handling equipment. These improvements have increased transportation and handling efficiencies due to higher biomass densities and reduced losses. Improvements in grinder efficiencies and capacity have reduced biomass grinding costs. Biomass collection efficiencies (the ratio of biomass collected to the amount available in the field) as high as 75% for crop residues and greater than 90% for perennial energy crops have also been demonstrated. However, as collection rates increase, the fraction of entrained soil in the biomass increases, and high biomass residue removal rates can violate agronomic sustainability limits. Advancements in quantifying multi-factor sustainability limits to increase removal rate as guided by sustainable residue removal plans, and mitigating soil contamination through targeted removal rates based on soil type and residue type/fraction is allowing the use of new high efficiency harvesting equipment and methods. As another consideration, single pass harvesting and other technologies that improve harvesting costs cause biomass storage moisture management challenges, which challenges are further perturbed by annual variability in biomass moisture content. Monitoring, sampling, simulation, and analysis provide basis for moisture, time, and quality relationships in storage, which has allowed the development of moisture tolerant storage systems and best management processes that combine moisture content and time to accommodate baled storage of wet material based upon “shelf-life.” The key to improving biomass supply logistics costs has been developing the associated agronomic sustainability and biomass quality technologies and processes that allow the implementation of equipment engineering solutions.

  16. The original and simplified Wells rules and age-adjusted D-dimer testing to rule out pulmonary embolism: an individual patient data meta-analysis.

    PubMed

    van Es, N; Kraaijpoel, N; Klok, F A; Huisman, M V; Den Exter, P L; Mos, I C M; Galipienzo, J; Büller, H R; Bossuyt, P M

    2017-04-01

    Essentials Evidence for the simplified Wells rule in ruling out acute pulmonary embolism (PE) is scarce. This was a post-hoc analysis on data from 6 studies comprising 7268 patients with suspected PE. The simplified Wells rule combined with age-adjusted D-dimer testing may safely rule out PE. Given its ease of use, the simplified Wells rule is to be preferred over the original Wells rule.

  17. Community water fluoridation predicts increase in age-adjusted incidence and prevalence of diabetes in 22 states from 2005 and 2010.

    PubMed

    Fluegge, Kyle

    2016-10-01

    Community water fluoridation is considered a significant public health achievement of the 20th century. In this paper, the hypothesis that added water fluoridation has contributed to diabetes incidence and prevalence in the United States was investigated. Panel data from publicly available sources were used with population-averaged models to test the associations of added and natural fluoride on the outcomes at the county level in 22 states for the years 2005 and 2010. The findings suggest that a 1 mg increase in the county mean added fluoride significantly positively predicts a 0.23 per 1,000 person increase in age-adjusted diabetes incidence (P < 0.001), and a 0.17% increase in age-adjusted diabetes prevalence percent (P < 0.001), while natural fluoride concentration is significantly protective. For counties using fluorosilicic acid as the chemical additive, both outcomes were lower: by 0.45 per 1,000 persons (P < 0.001) and 0.33% (P < 0.001), respectively. These findings are adjusted for county-level and time-varying changes in per capita tap water consumption, poverty, year, population density, age-adjusted obesity and physical inactivity, and mean number of years since water fluoridation started. Sensitivity analyses revealed robust effects for both types of fluoride. Community water fluoridation is associated with epidemiological outcomes for diabetes.

  18. Community water fluoridation predicts increase in age-adjusted incidence and prevalence of diabetes in 22 states from 2005 and 2010

    PubMed Central

    Fluegge, Kyle

    2016-01-01

    Community water fluoridation is considered a significant public health achievement of the 20th century. In this paper, the hypothesis that added water fluoridation has contributed to diabetes incidence and prevalence in the United States was investigated. Panel data from publicly available sources were used with population-averaged models to test the associations of added and natural fluoride on the outcomes at the county level in 22 states for the years 2005 and 2010. The findings suggest that a 1 mg increase in the county mean added fluoride significantly positively predicts a 0.23 per 1,000 person increase in age-adjusted diabetes incidence (P < 0.001), and a 0.17% increase in age-adjusted diabetes prevalence percent (P < 0.001), while natural fluoride concentration is significantly protective. For counties using fluorosilicic acid as the chemical additive, both outcomes were lower: by 0.45 per 1,000 persons (P < 0.001) and 0.33% (P < 0.001), respectively. These findings are adjusted for county-level and time-varying changes in per capita tap water consumption, poverty, year, population density, age-adjusted obesity and physical inactivity, and mean number of years since water fluoridation started. Sensitivity analyses revealed robust effects for both types of fluoride. Community water fluoridation is associated with epidemiological outcomes for diabetes. PMID:27740551

  19. Height and age adjustment for cross sectional studies of lung function in children aged 6-11 years.

    PubMed Central

    Chinn, S; Rona, R J

    1992-01-01

    BACKGROUND: No standard exists for the adjustment of lung function for height and age in children. Multiple regression should not be used on untransformed data because, for example, forced expiratory volume (FEV1), though normally distributed for height, age, and sex, has increasing standard deviation. A solution to the conflict is proposed. METHODS: Spirometry on representative samples of children aged 6.5 to 11.99 years in primary schools in England. After exclusion of children who did not provide two repeatable blows 910 white English boys and 722 girls had data on FEV1 and height. Means and standard deviations of FEV1 divided by height were plotted to determine whether logarithmic transformation of FEV1 was appropriate. Multiple regression was used to give predicted FEV1 for height and age on the transformed scale; back transformation gave predicted values in litres. Other lung function measures were analysed, and data on inner city children, children from ethnic minority groups, and Scottish children were described. RESULTS: After logarithmic (ln) transformation of FEV1 standard deviation was constant. The ratios of actual and predicted values of FEV1 were normally distributed in boys and girls. From the means and standard deviations of these distributions, and the predicted values, centiles and standard deviation scores can be calculated. CONCLUSION: The method described is valid because the assumption of stable variance for multiple regression was satisfied on the log scale and the variation of ratios of actual to predicted values on the original scale was well described by a normal distribution. The adoption of the method will lead to uniformity and greater ease of comparison of research findings. PMID:1440464
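
    The procedure described in this abstract (regress ln FEV1 on height and age, back-transform to litres, then score each child by the ratio of actual to predicted FEV1) can be sketched as follows. This is a minimal illustration on synthetic data, assuming NumPy is available; the generating coefficients, sample size, and variable names are invented and are not the published reference equations.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 200

# Synthetic children (illustrative values only, not study data).
height = rng.uniform(1.10, 1.55, n)   # metres
age = rng.uniform(6.5, 12.0, n)       # years

# Multiplicative noise on the original scale means constant SD on the log scale.
ln_fev1 = -1.0 + 1.2 * height + 0.02 * age + rng.normal(0.0, 0.1, n)
fev1 = np.exp(ln_fev1)                # litres

# Multiple regression on the log-transformed outcome.
X = np.column_stack([np.ones(n), height, age])
coef, *_ = np.linalg.lstsq(X, np.log(fev1), rcond=None)

# Back-transform to get predicted FEV1 in litres.
predicted = np.exp(X @ coef)

# Ratio of actual to predicted, then a standard deviation score per child.
ratio = fev1 / predicted
sd_score = (ratio - ratio.mean()) / ratio.std(ddof=1)

print("coefficients (intercept, height, age):", np.round(coef, 3))
print("first five SD scores:", np.round(sd_score[:5], 2))
```

    Working with the ratio on the original scale keeps the scores interpretable in litres-relative terms while the fit itself is done where the variance is stable.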

  20. Development and validation of a continuously age-adjusted measure of patient condition for hospitalized children using the electronic medical record.

    PubMed

    Rothman, Michael J; Tepas, Joseph J; Nowalk, Andrew J; Levin, James E; Rimar, Joan M; Marchetti, Albert; Hsiao, Allen L

    2017-02-01

    Awareness of a patient's clinical status during hospitalization is a primary responsibility for hospital providers. One tool to assess status is the Rothman Index (RI), a validated measure of patient condition for adults, based on empirically derived relationships between 1-year post-discharge mortality and each of 26 clinical measurements available in the electronic medical record. However, such an approach cannot be used for pediatrics, where the relationships between risk and clinical variables are distinct functions of patient age, and sufficient 1-year mortality data for each age group simply do not exist. We report the development and validation of a new methodology to use adult mortality data to generate continuously age-adjusted acuity scores for pediatrics. Clinical data were extracted from EMRs at three pediatric hospitals covering 105,470 inpatient visits over a 3-year period. The RI input variable set was used as a starting point for the development of the pediatric Rothman Index (pRI). Age-dependence of continuous variables was determined by plotting mean values versus age. For variables determined to be age-dependent, polynomial functions of mean value and mean standard deviation versus age were constructed. Mean values and standard deviations for adult RI excess risk curves were separately estimated. Based on the "find the center of the channel" hypothesis, univariate pediatric risk was then computed by applying a z-score transform to adult mean and standard deviation values based on polynomial pediatric mean and standard deviation functions. Multivariate pediatric risk is estimated as the sum of univariate risk. Other age adjustments for categorical variables were also employed. Age-specific pediatric excess risk functions were compared to age-specific expert-derived functions and to in-hospital mortality. AUC for 24-h mortality and pRI scores prior to unplanned ICU transfers were computed. Age-adjusted risk functions correlated well with similar

  1. QuickStats: Age-Adjusted Rate* for Suicide,(†) by Sex - National Vital Statistics System, United States, 1975-2015.

    PubMed

    2017-03-17

    There was an overall decline of 24% in the age-adjusted suicide rate from 1977 (13.7 per 100,000) to 2000 (10.4). The rate increased in most years from 2000 to 2015. The 2015 suicide rate (13.3) was 28% higher than in 2000. The rates for males and females followed the overall pattern; however, the rate for males was approximately 3-5 times higher than the rate for females throughout the study period.

  2. Survival Data and Regression Models

    NASA Astrophysics Data System (ADS)

    Grégoire, G.

    2014-12-01

    We start this chapter by introducing some basic elements for the analysis of censored survival data. Then we focus on right censored data and develop two types of regression models. The first one concerns the so-called accelerated failure time models (AFT), which are parametric models where a function of a parameter depends linearly on the covariables. The second one is a semiparametric model, where the covariables enter in a multiplicative form in the expression of the hazard rate function. The main statistical tool for analysing these regression models is the maximum likelihood methodology and, although we recall some essential results about the ML theory, we refer to the chapter "Logistic Regression" for a more detailed presentation.

  3. Incremental logistic regression for customizing automatic diagnostic models.

    PubMed

    Tortajada, Salvador; Robles, Montserrat; García-Gómez, Juan Miguel

    2015-01-01

    In the last decades, and following the new trends in medicine, statistical learning techniques have been used for developing automatic diagnostic models for aiding the clinical experts through the use of Clinical Decision Support Systems. The development of these models requires a large, representative amount of data, which is commonly obtained from one hospital or a group of hospitals after an expensive and time-consuming gathering, preprocessing, and validation of cases. After the model development, it has to pass an external validation that is often carried out in a different hospital or health center. Experience shows that the models underperform expectations. Furthermore, patient data needs ethical approval and patient consent to send and store data. For these reasons, we introduce an incremental learning algorithm based on the Bayesian inference approach that may allow us to build an initial model with a smaller number of cases and update it incrementally when new data are collected or even perform a new calibration of a model from a different center by using a reduced number of cases. The performance of our algorithm is demonstrated by employing different benchmark datasets and a real brain tumor dataset; and we compare its performance to a previous incremental algorithm and a non-incremental Bayesian model, showing that the algorithm is independent of the data model, iterative, and has a good convergence.

  4. Assessing Lake Trophic Status: A Proportional Odds Logistic Regression Model

    EPA Science Inventory

    Lake trophic state classifications are good predictors of ecosystem condition and are indicative of both ecosystem services (e.g., recreation and aesthetics), and disservices (e.g., harmful algal blooms). Methods for classifying trophic state are based off the foundational work o...

  5. Analyzing Army Reserve Unsatisfactory Participants through Logistic Regression

    DTIC Science & Technology

    2012-06-08

    interactions, and referenced Dr. Fredric Jablin’s four stages of socialization (anticipatory socialization, organizational encounter, metamorphosis ...Soldier’s encounter is positive he begins the metamorphosis stage, which “marks the newcomer’s alignment of expectations to those of the organization...anticipatory socialization, encounter, and metamorphosis stages do not have enough time in the unit to make educated decisions. The metamorphosis stage should

  6. Measuring NAVSPASUR Sensor Performance using Logistic Regression Models

    DTIC Science & Technology

    1992-09-01

    consists of three transmitting and six receiving stations or sites. The three transmitting sites, located in Lake Kickapoo TX, Gila River AZ and Jordan...frequency of 216.980 MHz. The largest of the transmitters, located in Lake Kickapoo TX, operates at 810 kW of output power and consists of eighteen...parameters for the Lake Kickapoo transmitter and the San Diego receiver’ will be used. The following parameters are given [Ref. 1: p. 6-7]: P, = 6.76 X 10

  7. Interpreting Multiple Logistic Regression Coefficients in Prospective Observational Studies

    DTIC Science & Technology

    1982-11-01

    prompted close examination of the issue at a workshop on hypertriglyceridemia where some of the cautions and perspectives given in this paper were...longevity," Circulation, 34, 679-697, (1966). 19. Lippel, K., Tyroler, H., Eder, H., Gotto, A., and Vahouny, G. "Relationship of hypertriglyceridemia

  8. Background or Experience? Using Logistic Regression to Predict College Retention

    ERIC Educational Resources Information Center

    Synco, Tracee M.

    2012-01-01

    Tinto, Astin and countless others have researched the retention and attrition of students from college for more than thirty years. However, the six year graduation rate for all first-time full-time freshmen for the 2002 cohort was 57%. This study sought to determine the retention variables that predicted continued enrollment of entering freshmen…

  9. KSC ISS Logistics Support

    NASA Technical Reports Server (NTRS)

    Tellado, Joseph

    2014-01-01

    The presentation contains a status of KSC ISS Logistics Operations. It presents current top-level ISS Logistics tasks being conducted at KSC, current International Partner activities, hardware processing flow focusing on late Stow operations, a list of KSC Logistics POCs, and a backup list of Logistics launch site services. This presentation is being given at the annual International Space Station (ISS) Multi-lateral Logistics Maintenance Control Panel meeting to be held in Turin, Italy during the week of May 13-16. The presentation content doesn't contain any potential lessons learned.

  10. Regression: A Bibliography.

    ERIC Educational Resources Information Center

    Pedrini, D. T.; Pedrini, Bonnie C.

    Regression, another mechanism studied by Sigmund Freud, has had much research, e.g., hypnotic regression, frustration regression, schizophrenic regression, and infra-human-animal regression (often directly related to fixation). Many investigators worked with hypnotic age regression, which has a long history, going back to Russian reflexologists.…

  11. Just-In-Time Logistics

    DTIC Science & Technology

    2006-02-07

    Logistics modernization, end-to-end logistics, logistics transformation, and just-in-time logistics -- whichever name it is being called today, it is...demonstrations as to what these systems will do for the end user, consumer confidence will increase and "just-in-time" logistics will lead to a lighter and leaner combat logistics support.

  12. Focused Logistics: Putting Agility in Agile Logistics

    DTIC Science & Technology

    2011-05-19

    envisioned an agile and adaptable logistics system built around common situational understanding. The Focused Logistics concept specified the requirement to...the tracking of resources moving through the TD network. As a result, distribution centers tagged inbound

  13. Ridge Regression: A Panacea?

    ERIC Educational Resources Information Center

    Walton, Joseph M.; And Others

    1978-01-01

    Ridge regression is an approach to the problem of large standard errors of regression estimates of intercorrelated regressors. The effect of ridge regression on the estimated squared multiple correlation coefficient is discussed and illustrated. (JKS)
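    The idea summarized above can be illustrated with a minimal sketch, assuming two centered predictors and no intercept. The function name, the toy data, and the penalty value are hypothetical and are not taken from the paper; the sketch only shows how adding a penalty to the diagonal of the normal equations shrinks the estimates when the regressors are highly intercorrelated.

```python
def ridge_two_predictors(x1, x2, y, lam):
    """Solve (X'X + lam*I) b = X'y for two centered predictors, no intercept."""
    s11 = sum(a * a for a in x1) + lam   # penalized diagonal entries
    s22 = sum(b * b for b in x2) + lam
    s12 = sum(a * b for a, b in zip(x1, x2))
    t1 = sum(a * c for a, c in zip(x1, y))
    t2 = sum(b * c for b, c in zip(x2, y))
    det = s11 * s22 - s12 * s12          # determinant of the 2x2 system
    return ((s22 * t1 - s12 * t2) / det, (s11 * t2 - s12 * t1) / det)

# Hypothetical, nearly collinear toy data (already centered).
x1 = [-2.0, -1.0, 0.0, 1.0, 2.0]
x2 = [-2.1, -0.9, 0.1, 0.9, 2.0]
y = [-4.0, -2.1, 0.1, 1.9, 4.1]

ols = ridge_two_predictors(x1, x2, y, 0.0)    # lam = 0 is ordinary least squares
ridge = ridge_two_predictors(x1, x2, y, 1.0)  # lam = 1 is an arbitrary penalty
```

    With the penalty set to zero the function reduces to ordinary least squares, so the two results can be compared directly on the same data.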

  14. The "Smarter Regression" Add-In for Linear and Logistic Regression in Excel

    DTIC Science & Technology

    2007-07-01

    only Visual Basic, the built-in programming language of Excel (Walkenbach, 1999). We wanted to avoid the use of external Dynamic Linked Libraries...least two different ways of entering results into workbook cells in Visual Basic. One is to establish an array in Visual Basic, fill up the elements of

  15. QuickStats: Age-Adjusted Death Rates* for Females Aged 15-44 Years, by the Five Leading Causes of Death(†) - United States, 1999 and 2014.

    PubMed

    2016-07-01

    The age-adjusted death rate for females aged 15-44 years was 5% lower in 2014 (82.1 per 100,000 population) than in 1999 (86.5). Among the five leading causes of death, the age-adjusted rates of three were lower in 2014 than in 1999: cancer (from 19.6 to 15.3, a 22% decline), heart disease (8.9 to 8.2, an 8% decline), and homicide (4.2 to 2.8, a 33% decline). The age-adjusted death rates for two of the five causes were higher in 2014 than in 1999: unintentional injuries (from 17.0 to 20.1, an 18% increase) and suicide (4.8 to 6.5, a 35% increase). Unintentional injuries replaced cancer as the leading cause of death in this demographic group.
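    The percentage changes quoted above follow from simple arithmetic on the two rates. A short sketch, using only the figures in this summary (the function and variable names are illustrative):

```python
def percent_change(old_rate, new_rate):
    """Relative change from old_rate to new_rate, in percent."""
    return (new_rate - old_rate) / old_rate * 100.0

# Age-adjusted death rates per 100,000 population (1999, 2014), from the summary above.
rates = {
    "all causes": (86.5, 82.1),
    "cancer": (19.6, 15.3),
    "heart disease": (8.9, 8.2),
    "homicide": (4.2, 2.8),
    "unintentional injuries": (17.0, 20.1),
    "suicide": (4.8, 6.5),
}

# Negative values are declines, positive values are increases.
changes = {cause: round(percent_change(a, b)) for cause, (a, b) in rates.items()}
```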

  16. QuickStats: Age-Adjusted Death Rates* for Males Aged 15-44 Years, by the Five Leading Causes of Death(†) - United States, 1999 and 2014.

    PubMed

    2016-08-12

    The age-adjusted death rate for males aged 15-44 years was 10% lower in 2014 (156.6 per 100,000 population) than in 1999 (174.1). Among the five leading causes of death, the age-adjusted rates for three were lower in 2014 than in 1999: cancer (from 17.1 to 12.8; 25% decline), heart disease (20.1 to 17.0; 15% decline), and homicide (15.7 to 13.8; 12% decline). The age-adjusted death rates for two of the five causes were higher in 2014 than in 1999: suicide (20.1 to 22.5; 12% increase), and unintentional injuries (from 48.7 to 51.0; 5% increase).

  17. Introduction to the use of regression models in epidemiology.

    PubMed

    Bender, Ralf

    2009-01-01

    Regression modeling is one of the most important statistical techniques used in analytical epidemiology. By means of regression models the effect of one or several explanatory variables (e.g., exposures, subject characteristics, risk factors) on a response variable such as mortality or cancer can be investigated. From multiple regression models, adjusted effect estimates can be obtained that take the effect of potential confounders into account. Regression methods can be applied in all epidemiologic study designs so that they represent a universal tool for data analysis in epidemiology. Different kinds of regression models have been developed in dependence on the measurement scale of the response variable and the study design. The most important methods are linear regression for continuous outcomes, logistic regression for binary outcomes, Cox regression for time-to-event data, and Poisson regression for frequencies and rates. This chapter provides a nontechnical introduction to these regression models with illustrating examples from cancer research.

  18. Security in Logistics

    NASA Astrophysics Data System (ADS)

    Cempírek, Václav; Nachtigall, Petr; Široký, Jaromír

    2016-12-01

    This paper deals with the security of logistic chains with respect to incorrect declaration of transported goods, fraudulent transport and forwarding companies, and possible threats caused by political influences. The main goal of this paper is to highlight the possible increase in logistic costs due to these fraudulent threats. An analysis of technological processes is provided, and the increase in transport times under the possible threats is evaluated in terms of economic costs. In the conclusion, the possible threat to companies' efficiency in logistics due to increases in costs, means of transport and human resources is pointed out.

  19. Green Logistics Management

    NASA Astrophysics Data System (ADS)

    Chang, Yoon S.; Oh, Chang H.

    Nowadays, environmental management has become a critical business consideration for companies that must survive many regulations and tough business requirements. Most world-leading companies are now aware that environment-friendly technology and management are critical to the sustainable growth of the company. The environment market has seen continuous growth, marking 532B in 2000 and 590B in 2004, and is expected to grow to 700B in 2010. It is not hard to see environment-friendly efforts in almost all aspects of business operations. Such trends can easily be found in the logistics area. Green logistics aims to make environmentally friendly decisions throughout a product lifecycle. Therefore, for the success of green logistics, it is critical to have real-time tracking capability on the product throughout the product lifecycle and a smart solution service architecture. In this chapter, we introduce an RFID-based green logistics solution and service.

  20. Lunar Commercial Mining Logistics

    NASA Astrophysics Data System (ADS)

    Kistler, Walter P.; Citron, Bob; Taylor, Thomas C.

    2008-01-01

    Innovative commercial logistics is required for supporting lunar resource recovery operations and assisting larger consortiums in lunar mining, base operations, camp consumables and the future commercial sales of propellant over the next 50 years. To assist in lowering overall development costs, "reuse" innovation is suggested in reusing modified LTS in-space hardware for use on the moon's surface, developing product lines for recovered gases, regolith construction materials, surface logistics services, and other services as they evolve (Kistler, Citron and Taylor, 2005). Surface logistics architecture is designed to have sustainable growth over 50 years, financed by private sector partners and capable of cargo transportation in both directions in support of lunar development and resource recovery development. The author's perspective on the importance of logistics is based on five years' experience at remote sites on Earth, where remote base supply chain logistics didn't always work (Taylor, 1975a). The planning and control of the flow of goods and materials to and from the moon's surface may be the most complicated logistics challenge yet to be attempted. Affordability is tied to the innovation and ingenuity used to keep the transportation and surface operations costs as low as practical. Eleven innovations are proposed and discussed by an entrepreneurial commercial space startup team that has had success in introducing commercial space innovation and reducing the cost of space operations in the past. This logistics architecture offers NASA and other exploring nations a commercial alternative for non-essential cargo. Five transportation technologies and eleven surface innovations create the logistics transportation system discussed.

  1. Clinical usefulness and safety of an age-adjusted D-dimer cutoff levels to exclude pulmonary embolism: a retrospective analysis.

    PubMed

    Flores, Julio; García de Tena, Jaime; Galipienzo, Javier; García-Avello, Ángel; Pérez-Rodríguez, Esteban; Tortuero, José Ignacio; Álvarez, Concepción; Ruíz, Antonio; Arribas, Ignacio

    2016-02-01

    Age-adjusted D-dimer (AADD) appears to increase the proportion of patients in whom pulmonary embolism (PE) can safely be excluded compared with conventional D-dimer (CDD), according to a limited number of studies. The aim of this study was to assess whether the use of an AADD might safely increase the clinical usefulness of CDD for the diagnosis of PE in our setting. Three hundred and sixty-two consecutive outpatients with clinically suspected PE in whom plasma samples were obtained to measure D-dimer were included in this post hoc analysis of a previous study. CDD cutoff value was 500 ng/mL and AADD was calculated as (patient's age × 10) ng/mL in patients aged >50. Sensitivity, specificity, clinical usefulness (i.e., proportion of true-negative tests among all patients with suspected PE), and the proportion of false negatives were calculated for both AADD and CDD among patients with low-to-moderate clinical probability of PE according to the Wells criteria. PE was confirmed in 98 patients (27%). Among 331 patients with low-to-moderate clinical probability of PE, sensitivity and clinical usefulness were 100 and 27.8% for CDD, respectively, and 100 and 36.5% for AADD, respectively. In 29 patients aged >50 with CDD >500 ng/mL, AADD showed values under its normal cutoff point, without false negatives for the diagnosis of PE (0%, 95% CI 0-11%). AADD increases clinical usefulness notably with respect to that of CDD in patients with clinically suspected PE without losing sensitivity in our cohort. The use of AADD apparently does not reduce the safety of CDD for the exclusion of PE.
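    The cutoff rule described above (500 ng/mL conventionally, age × 10 ng/mL for patients older than 50) can be written as a small sketch. The function names are hypothetical, and treating a value strictly below the cutoff as negative is an assumption about boundary handling that the abstract does not spell out.

```python
def d_dimer_cutoff(age_years, conventional_cutoff=500.0):
    """Cutoff in ng/mL: age x 10 for patients older than 50, else the conventional value."""
    if age_years > 50:
        return age_years * 10.0
    return conventional_cutoff

def is_negative(d_dimer_ng_ml, age_years):
    """Assumption: a test is negative when the value is strictly below the cutoff."""
    return d_dimer_ng_ml < d_dimer_cutoff(age_years)

# A hypothetical 72-year-old with 650 ng/mL is above the conventional cutoff
# but below the age-adjusted cutoff of 720 ng/mL.
example = is_negative(650.0, 72)
```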

  2. Space Station fluid management logistics

    NASA Technical Reports Server (NTRS)

    Dominick, Sam M.

    1990-01-01

    Viewgraphs and discussion on space station fluid management logistics are presented. Topics covered include: fluid management logistics - issues for Space Station Freedom evolution; current fluid logistics approach; evolution of Space Station Freedom fluid resupply; launch vehicle evolution; ELV logistics system approach; logistics carrier configuration; expendable fluid/propellant carrier description; fluid carrier design concept; logistics carrier orbital operations; carrier operations at space station; summary/status of orbital fluid transfer techniques; Soviet progress tanker system; and Soviet propellant resupply system observations.

  3. Regressive systemic sclerosis.

    PubMed Central

    Black, C; Dieppe, P; Huskisson, T; Hart, F D

    1986-01-01

    Systemic sclerosis is a disease which usually progresses or reaches a plateau with persistence of symptoms and signs. Regression is extremely unusual. Four cases of established scleroderma are described in which regression is well documented. The significance of this observation and possible mechanisms of disease regression are discussed. PMID:3718012

  4. NCCS Regression Test Harness

    SciTech Connect

    Tharrington, Arnold N.

    2015-09-09

    The NCCS Regression Test Harness is a software package that provides a framework to perform regression and acceptance testing on NCCS High Performance Computers. The package is written in Python and has only the dependency of a Subversion repository to store the regression tests.

  5. Fully Regressive Melanoma

    PubMed Central

    Ehrsam, Eric; Kallini, Joseph R.; Lebas, Damien; Modiano, Philippe; Cotten, Hervé

    2016-01-01

    Fully regressive melanoma is a phenomenon in which the primary cutaneous melanoma becomes completely replaced by fibrotic components as a result of host immune response. Although 10 to 35 percent of cases of cutaneous melanomas may partially regress, fully regressive melanoma is very rare; only 47 cases have been reported in the literature to date. All of the cases of fully regressive melanoma reported in the literature were diagnosed in conjunction with metastasis on a patient. The authors describe a case of fully regressive melanoma without any metastases at the time of its diagnosis. Characteristic findings on dermoscopy, as well as the absence of melanoma on final biopsy, confirmed the diagnosis. PMID:27672418

  6. Nuclear Shuttle Logistics Configuration

    NASA Technical Reports Server (NTRS)

    1971-01-01

    This 1971 artist's concept shows the Nuclear Shuttle in both its lunar logistics configuration and geosynchronous station configuration. As envisioned by Marshall Space Flight Center Program Development personnel, the Nuclear Shuttle would deliver payloads to lunar orbits or other destinations then return to Earth orbit for refueling and additional missions.

  7. Optimal estimation for regression models on τ-year survival probability.

    PubMed

    Kwak, Minjung; Kim, Jinseog; Jung, Sin-Ho

    2015-01-01

    A logistic regression method can be applied to regressing the τ-year survival probability to covariates, if there are no censored observations before time τ. But if some observations are incomplete due to censoring before time τ, then the logistic regression cannot be applied. Jung (1996) proposed to modify the score function for logistic regression to accommodate the right-censored observations. His modified score function, motivated for a consistent estimation of regression parameters, becomes a regular logistic score function if no observations are censored before time τ. In this article, we propose a modification of Jung's estimating function for an optimal estimation for the regression parameters in addition to consistency. We prove that the optimal estimator is more efficient than Jung's estimator. This theoretical comparison is illustrated with a real example data analysis and simulations.

  8. Assessing risk factors for periodontitis using regression

    NASA Astrophysics Data System (ADS)

    Lobo Pereira, J. A.; Ferreira, Maria Cristina; Oliveira, Teresa

    2013-10-01

    Multivariate statistical analysis is indispensable to assess the associations and interactions between different factors and the risk of periodontitis. Among others, regression analysis is a statistical technique widely used in healthcare to investigate and model the relationship between variables. In our work we study the impact of socio-demographic, medical and behavioral factors on periodontal health. Using linear and logistic regression models, we can assess the relevance, as risk factors for periodontitis, of the following independent variables (IVs): Age, Gender, Diabetic Status, Education, Smoking Status and Plaque Index. The multiple linear regression model was built to evaluate the influence of the IVs on mean Attachment Loss (AL). Thus, the regression coefficients are obtained along with the respective p-values from the significance tests. The classification of a case (individual) adopted in the logistic model was the extent of the destruction of periodontal tissues defined by an Attachment Loss greater than or equal to 4 mm in 25% (AL≥4mm/≥25%) of sites surveyed. The association measures include the Odds Ratios together with the corresponding 95% confidence intervals.
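    An odds ratio and its 95% confidence interval are commonly derived from a logistic regression coefficient by exponentiation. The sketch below assumes a Wald-type interval, which the abstract does not confirm; the function name and the coefficient and standard error values are hypothetical and are not results from the study.

```python
import math

def odds_ratio_ci(beta, std_err, z=1.96):
    """Odds ratio and Wald confidence interval from a logistic regression coefficient."""
    point = math.exp(beta)                  # odds ratio for a one-unit increase
    lower = math.exp(beta - z * std_err)    # lower confidence limit
    upper = math.exp(beta + z * std_err)    # upper confidence limit
    return point, lower, upper

# Hypothetical coefficient and standard error, for illustration only.
or_point, or_lower, or_upper = odds_ratio_ci(beta=0.69, std_err=0.25)
```

    An interval that excludes 1 corresponds to a coefficient whose Wald test is significant at the matching level.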

  9. Morse–Smale Regression

    SciTech Connect

    Gerber, Samuel; Rubel, Oliver; Bremer, Peer -Timo; Pascucci, Valerio; Whitaker, Ross T.

    2012-01-19

    This paper introduces a novel partition-based regression approach that incorporates topological information. Partition-based regression typically introduces a quality-of-fit-driven decomposition of the domain. The emphasis in this work is on a topologically meaningful segmentation. Thus, the proposed regression approach is based on a segmentation induced by a discrete approximation of the Morse–Smale complex. This yields a segmentation with partitions corresponding to regions of the function with a single minimum and maximum that are often well approximated by a linear model. This approach yields regression models that are amenable to interpretation and have good predictive capacity. Typically, regression estimates are quantified by their geometrical accuracy. For the proposed regression, an important aspect is the quality of the segmentation itself. Thus, this article introduces a new criterion that measures the topological accuracy of the estimate. The topological accuracy provides a complementary measure to the classical geometrical error measures and is very sensitive to overfitting. The Morse–Smale regression is compared to state-of-the-art approaches in terms of geometry and topology and yields comparable or improved fits in many cases. Finally, a detailed study on climate-simulation data demonstrates the application of the Morse–Smale regression. Supplementary Materials are available online and contain an implementation of the proposed approach in the R package msr, an analysis and simulations on the stability of the Morse–Smale complex approximation, and additional tables for the climate-simulation study.

  10. Improved Regression Calibration

    ERIC Educational Resources Information Center

    Skrondal, Anders; Kuha, Jouni

    2012-01-01

    The likelihood for generalized linear models with covariate measurement error cannot in general be expressed in closed form, which makes maximum likelihood estimation taxing. A popular alternative is regression calibration which is computationally efficient at the cost of inconsistent estimation. We propose an improved regression calibration…

  11. Morse-Smale Regression

    PubMed Central

    Gerber, Samuel; Rübel, Oliver; Bremer, Peer-Timo; Pascucci, Valerio; Whitaker, Ross T.

    2012-01-01

    This paper introduces a novel partition-based regression approach that incorporates topological information. Partition-based regression typically introduces a quality-of-fit-driven decomposition of the domain. The emphasis in this work is on a topologically meaningful segmentation. Thus, the proposed regression approach is based on a segmentation induced by a discrete approximation of the Morse-Smale complex. This yields a segmentation with partitions corresponding to regions of the function with a single minimum and maximum that are often well approximated by a linear model. This approach yields regression models that are amenable to interpretation and have good predictive capacity. Typically, regression estimates are quantified by their geometrical accuracy. For the proposed regression, an important aspect is the quality of the segmentation itself. Thus, this paper introduces a new criterion that measures the topological accuracy of the estimate. The topological accuracy provides a complementary measure to the classical geometrical error measures and is very sensitive to over-fitting. The Morse-Smale regression is compared to state-of-the-art approaches in terms of geometry and topology and yields comparable or improved fits in many cases. Finally, a detailed study on climate-simulation data demonstrates the application of the Morse-Smale regression. Supplementary materials are available online and contain an implementation of the proposed approach in the R package msr, an analysis and simulations on the stability of the Morse-Smale complex approximation and additional tables for the climate-simulation study. PMID:23687424

  12. Logistics Assessment Guidebook

    DTIC Science & Technology

    2011-07-01

    significant. Another case involved the integration of an aircraft and ground vehicle system. Integration issues were identified early in the design...for height and width; insufficient power requirements to support maintenance actions; and insufficient design of the autonomic logistics system...evaluation at IOT&E and thus on track for Initial Operational Capability (IOC). If not, a plan is in place to ensure they are met (ref DoDI 5000; CJCSM

  13. Logistics Software Implementation.

    DTIC Science & Technology

    1986-02-28

    to obtain benchmark performance results. The list presented in Appendix A is not final and should be considered only as the initial worksheet. d...has been used in more applications and its imple-... mentation is better understood. All of the preliminary tests have indicated excellent...environment. The fifth-generation languages, such as PROLOG, also provide excellent dynamic data base support. The proposed approach for the Logistics

  14. Logistics and Strategy

    DTIC Science & Technology

    2014-12-04

    2014 by: , Director, Graduate Degree Programs Robert F. Baumann, PhD The opinions and conclusions expressed herein are those of the...Washington, DC: Center for Military History, 1993), 244. 24 Spencer C. Tucker and Priscilla Mary Roberts, Encyclopedia of World War I (Santa Barbara...Theater of Operations. See Richard M. Leighton and Robert W. Coakley, Global Logistics and Strategy 1940-1943 (Washington, DC: Center for Military

  15. Relative risk regression analysis of epidemiologic data.

    PubMed

    Prentice, R L

    1985-11-01

    Relative risk regression methods are described. These methods provide a unified approach to a range of data analysis problems in environmental risk assessment and in the study of disease risk factors more generally. Relative risk regression methods are most readily viewed as an outgrowth of Cox's regression and life model. They can also be viewed as a regression generalization of more classical epidemiologic procedures, such as that due to Mantel and Haenszel. In the context of an epidemiologic cohort study, relative risk regression methods extend conventional survival data methods and binary response (e.g., logistic) regression models by taking explicit account of the time to disease occurrence while allowing arbitrary baseline disease rates, general censorship, and time-varying risk factors. This latter feature is particularly relevant to many environmental risk assessment problems wherein one wishes to relate disease rates at a particular point in time to aspects of a preceding risk factor history. Relative risk regression methods also adapt readily to time-matched case-control studies and to certain less standard designs. The uses of relative risk regression methods are illustrated and the state of development of these procedures is discussed. It is argued that asymptotic partial likelihood estimation techniques are now well developed in the important special case in which the disease rates of interest have interpretations as counting process intensity functions. Estimation of relative risks processes corresponding to disease rates falling outside this class has, however, received limited attention. The general area of relative risk regression model criticism has, as yet, not been thoroughly studied, though a number of statistical groups are studying such features as tests of fit, residuals, diagnostics and graphical procedures. Most such studies have been restricted to exponential form relative risks as have simulation studies of relative risk estimation

  16. Boosted Beta Regression

    PubMed Central

    Schmid, Matthias; Wickler, Florian; Maloney, Kelly O.; Mitchell, Richard; Fenske, Nora; Mayr, Andreas

    2013-01-01

    Regression analysis with a bounded outcome is a common problem in applied statistics. Typical examples include regression models for percentage outcomes and the analysis of ratings that are measured on a bounded scale. In this paper, we consider beta regression, which is a generalization of logit models to situations where the response is continuous on the interval (0,1). Consequently, beta regression is a convenient tool for analyzing percentage responses. The classical approach to fit a beta regression model is to use maximum likelihood estimation with subsequent AIC-based variable selection. As an alternative to this established - yet unstable - approach, we propose a new estimation technique called boosted beta regression. With boosted beta regression estimation and variable selection can be carried out simultaneously in a highly efficient way. Additionally, both the mean and the variance of a percentage response can be modeled using flexible nonlinear covariate effects. As a consequence, the new method accounts for common problems such as overdispersion and non-binomial variance structures. PMID:23626706

  17. George: Gaussian Process regression

    NASA Astrophysics Data System (ADS)

    Foreman-Mackey, Daniel

    2015-11-01

    George is a fast and flexible library, implemented in C++ with Python bindings, for Gaussian Process regression useful for accounting for correlated noise in astronomical datasets, including those for transiting exoplanet discovery and characterization and stellar population modeling.

  18. Understanding poisson regression.

    PubMed

    Hayat, Matthew J; Higgins, Melinda

    2014-04-01

    Nurse investigators often collect study data in the form of counts. Traditional methods of data analysis have historically approached analysis of count data either as if the count data were continuous and normally distributed or with dichotomization of the counts into the categories of occurred or did not occur. These outdated methods for analyzing count data have been replaced with more appropriate statistical methods that make use of the Poisson probability distribution, which is useful for analyzing count data. The purpose of this article is to provide an overview of the Poisson distribution and its use in Poisson regression. Assumption violations for the standard Poisson regression model are addressed with alternative approaches, including addition of an overdispersion parameter or negative binomial regression. An illustrative example is presented with an application from the ENSPIRE study, and regression modeling of comorbidity data is included for illustrative purposes.
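    One quick way to see why the alternatives mentioned above may be needed is to compare the sample variance of the counts with their mean, since a Poisson distribution has equal mean and variance. The sketch below is a rough descriptive check, not the method used in the article; the function name and the counts are hypothetical.

```python
from statistics import mean, variance

def dispersion_index(counts):
    """Ratio of sample variance to sample mean; values well above 1 suggest overdispersion."""
    return variance(counts) / mean(counts)

# Hypothetical count data, for illustration only.
counts = [0, 0, 0, 12]
index = dispersion_index(counts)
```

    A ratio far above 1 is the kind of pattern that motivates an overdispersion parameter or a negative binomial model, although a formal assessment would be based on the fitted regression rather than the raw counts.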

  19. Squared sine logistic map

    NASA Astrophysics Data System (ADS)

    de Carvalho, R. Egydio; Leonel, Edson D.

    2016-12-01

    A periodic time perturbation is introduced in the logistic map as an attempt to investigate new scenarios of bifurcations and new mechanisms toward the chaos. With a squared sine perturbation we observe that a point attractor reaches the chaotic attractor without following a cascade of bifurcations. One fixed point of the system presents a new scenario of bifurcations through an infinite sequence of alternating changes of stability. At the bifurcations, the perturbation does not modify the scaling features observed in the convergence toward the stationary state.

  20. An age adjustment of very young children of India, 1981 and reappraisal of fertility and mortality rates--A model approach.

    PubMed

    Mukhopadhyay, B K

    1986-01-01

    Several approaches were made by actuaries and demographers to correct and smooth the Indian age distribution with special emphasis on population in age group 0-4 at different points of time. The present analysis conceives the life table stationary population (using the West Model) as 'reference standard'. 2 parameters were estimated from a regression equation using the proportion of population in age groups 5-14 and 60-plus as independent variables and that in 0-4 as the dependent variable. The corrected census proportions in age group 0-4 obtained from the regression model under certain assumptions for the 14 major states and India seem to be consistent and to have slightly lower values than those of the 1971 adjusted data. Moreover, unadjusted and adjusted proportions in 5-14 and 60-plus do not show any significant difference between the predicted values. Using the corrected population aged 0-4 years, the average annual birth and death rates during the 5-year period preceding the 1981 census have been estimated for those 14 states and India as well. The estimated birth rates so obtained were further adjusted using an appropriate factor from the West Model and Indian life table survival ratios. The final estimates seem to be consistent, except for a few, and to have slightly higher values than those of earlier estimates. As the present analysis is based on a 5% sample and confined to only 14 states, it is proposed to study the same for all the states and India in greater detail using full count data on age distribution and actual life tables as and when available.

  1. Estimation of the Regression Effect Using a Latent Trait Model.

    ERIC Educational Resources Information Center

    Quinn, Jimmy L.

    A logistic model was used to generate data to serve as a proxy for an immediate retest from item responses to a fourth grade standardized reading comprehension test of 45 items. Assuming that the actual test may be considered a pretest and the proxy data may be considered a retest, the effect of regression was investigated using a percentage of…

  2. Ridge Regression: A Regression Procedure for Analyzing Correlated Independent Variables

    ERIC Educational Resources Information Center

    Rakow, Ernest A.

    1978-01-01

    Ridge regression is a technique used to ameliorate the problem of highly correlated independent variables in multiple regression analysis. This paper explains the fundamentals of ridge regression and illustrates its use. (JKS)

  3. Developing Future Strategic Logistics Leaders

    DTIC Science & Technology

    2013-03-01

    and our focus on tactical level lessons learned has skewed our emphasis in PME to a very narrow band of excellence. As such our current...A partnership between senior logistics leaders, PME developers, and personnel managers is essential to constructing and maintaining strategic...short term and inconsistent PME system that does not produce strategic logistic leaders. The Logistics Corps, sustainment branches, LOO lead for leader

  4. Space Shuttle operational logistics plan

    NASA Technical Reports Server (NTRS)

    Botts, J. W.

    1983-01-01

    The Kennedy Space Center plan for logistics to support Space Shuttle Operations and to establish the related policies, requirements, and responsibilities are described. The Directorate of Shuttle Management and Operations logistics responsibilities required by the Kennedy Organizational Manual, and the self-sufficiency contracting concept are implemented. The Space Shuttle Program Level 1 and Level 2 logistics policies and requirements applicable to KSC that are presented in HQ NASA and Johnson Space Center directives are also implemented.

  5. Modern Regression Discontinuity Analysis

    ERIC Educational Resources Information Center

    Bloom, Howard S.

    2012-01-01

    This article provides a detailed discussion of the theory and practice of modern regression discontinuity (RD) analysis for estimating the effects of interventions or treatments. Part 1 briefly chronicles the history of RD analysis and summarizes its past applications. Part 2 explains how in theory an RD analysis can identify an average effect of…

  6. Multiple linear regression analysis

    NASA Technical Reports Server (NTRS)

    Edwards, T. R.

    1980-01-01

    Program rapidly selects best-suited set of coefficients. User supplies only vectors of independent and dependent data and specifies confidence level required. Program uses stepwise statistical procedure for relating minimal set of variables to set of observations; final regression contains only most statistically significant coefficients. Program is written in FORTRAN IV for batch execution and has been implemented on NOVA 1200.

  7. Explorations in Statistics: Regression

    ERIC Educational Resources Information Center

    Curran-Everett, Douglas

    2011-01-01

    Learning about statistics is a lot like learning about science: the learning is more meaningful if you can actively explore. This seventh installment of "Explorations in Statistics" explores regression, a technique that estimates the nature of the relationship between two things for which we may only surmise a mechanistic or predictive…

  8. Nuclear Lunar Logistics Study

    NASA Technical Reports Server (NTRS)

    1963-01-01

    This document has been prepared to incorporate all presentation aid material, together with some explanatory text, used during an oral briefing on the Nuclear Lunar Logistics System given at the George C. Marshall Space Flight Center, National Aeronautics and Space Administration, on 18 July 1963. The briefing and this document are intended to present the general status of the NERVA (Nuclear Engine for Rocket Vehicle Application) nuclear rocket development, the characteristics of certain operational NERVA-class engines, and appropriate technical and schedule information. Some of the information presented herein is preliminary in nature and will be subject to further verification, checking and analysis during the remainder of the study program. In addition, more detailed information will be prepared in many areas for inclusion in a final summary report. This work has been performed by REON, a division of Aerojet-General Corporation under Subcontract 74-10039 from the Lockheed Missiles and Space Company. The presentation and this document have been prepared in partial fulfillment of the provisions of the subcontract. From the inception of the NERVA program in July 1961, the stated emphasis has centered around the demonstration of the ability of a nuclear rocket to perform safely and reliably in the space environment, with the understanding that the assignment of a mission (or missions) would place undue emphasis on performance and operational flexibility. However, all were aware that the ultimate justification for the development program must lie in the application of the nuclear propulsion system to the national space objectives.

  9. Selected Logistics Models and Techniques.

    DTIC Science & Technology

    1984-09-01

    Programmable Calculator LCC...Program 27 TI-59 Programmable Calculator LCC Model 30 Unmanned Spacecraft Cost Model 31 iv I: TABLE OF CONTENTS (CONT’D) (Subject Index) LOGISTICS...LOGISTICS ANALYSIS MODEL/TECHNIQUE DATA MODEL/TECHNIQUE NAME: TI-59 Programmable Calculator LCC Model TYPE MODEL: Cost Estimating DEVELOPED BY:

  10. Calculating a Stepwise Ridge Regression.

    ERIC Educational Resources Information Center

    Morris, John D.

    1986-01-01

    Although methods for using ordinary least squares regression computer programs to calculate a ridge regression are available, the calculation of a stepwise ridge regression requires a special purpose algorithm and computer program. The correct stepwise ridge regression procedure is given, and a parallel FORTRAN computer program is described.…

  11. Orthogonal Regression: A Teaching Perspective

    ERIC Educational Resources Information Center

    Carr, James R.

    2012-01-01

    A well-known approach to linear least squares regression is that which involves minimizing the sum of squared orthogonal projections of data points onto the best fit line. This form of regression is known as orthogonal regression, and the linear model that it yields is known as the major axis. A similar method, reduced major axis regression, is…
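
    The two fits named in this abstract can be sketched in a few lines. The following is a minimal illustration, not the article's own material: the function names and the sample points are hypothetical, and the slopes use the standard closed forms for the major axis (minimizing squared perpendicular distances) and the reduced major axis.

```python
import math


def _centered_sums(xs, ys):
    """Return mean of x, mean of y, and the centered sums sxx, syy, sxy."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    return mx, my, sxx, syy, sxy


def major_axis_fit(xs, ys):
    """Slope and intercept of the major axis (orthogonal regression) line."""
    mx, my, sxx, syy, sxy = _centered_sums(xs, ys)
    if sxy == 0:
        raise ValueError("sxy is zero; this closed form does not apply")
    # Root of sxy*b^2 + (sxx - syy)*b - sxy = 0 whose sign matches sxy.
    slope = (syy - sxx + math.sqrt((syy - sxx) ** 2 + 4 * sxy ** 2)) / (2 * sxy)
    return slope, my - slope * mx


def reduced_major_axis_fit(xs, ys):
    """Slope and intercept of the reduced major axis line."""
    mx, my, sxx, syy, sxy = _centered_sums(xs, ys)
    # Ratio of standard deviations, carrying the sign of the covariance.
    slope = math.copysign(math.sqrt(syy / sxx), sxy)
    return slope, my - slope * mx


# Hypothetical noisy points scattered around y = 2x + 1.
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [1.2, 2.7, 5.1, 7.2, 8.8]
print("major axis:", major_axis_fit(xs, ys))
print("reduced major axis:", reduced_major_axis_fit(xs, ys))
```

    Unlike ordinary least squares, both fits treat x and y symmetrically, so swapping the two variables describes the same line.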

  12. Structural regression trees

    SciTech Connect

    Kramer, S.

    1996-12-31

    In many real-world domains the task of machine learning algorithms is to learn a theory for predicting numerical values. In particular several standard test domains used in Inductive Logic Programming (ILP) are concerned with predicting numerical values from examples and relational and mostly non-determinate background knowledge. However, so far no ILP algorithm except one can predict numbers and cope with nondeterminate background knowledge. (The only exception is a covering algorithm called FORS.) In this paper we present Structural Regression Trees (SRT), a new algorithm which can be applied to the above class of problems. SRT integrates the statistical method of regression trees into ILP. It constructs a tree containing a literal (an atomic formula or its negation) or a conjunction of literals in each node, and assigns a numerical value to each leaf. SRT provides more comprehensible results than purely statistical methods, and can be applied to a class of problems most other ILP systems cannot handle. Experiments in several real-world domains demonstrate that the approach is competitive with existing methods, indicating that the advantages are not at the expense of predictive accuracy.

  13. Logistics in smallpox: the legacy.

    PubMed

    Wickett, John; Carrasco, Peter

    2011-12-30

    Logistics, defined as "the time-related positioning of resources" was critical to the implementation of the smallpox eradication strategy of surveillance and containment. Logistical challenges in the smallpox programme included vaccine delivery, supplies, staffing, vehicle maintenance, and financing. Ensuring mobility was essential as health workers had to travel to outbreaks to contain them. Three examples illustrate a range of logistic challenges which required imagination and innovation. Standard price lists were developed to expedite vehicle maintenance and repair in Bihar, India. Innovative staffing ensured an adequate infrastructure for vehicle maintenance in Bangladesh. The use of disaster relief mechanisms in Somalia provided airlifts, vehicles and funding within 27 days of their initiation. In contrast the Expanded Programme on Immunization (EPI) faces more complex logistical challenges.

  14. What Exactly is Space Logistics?

    DTIC Science & Technology

    2011-01-01

    information on the space shuttle in the news and have inferred by now that resupply missions to the International Space Station, Hubble Space ... Telescope repair missions, and satellite deployment missions are space logistics—and you would be technically correct. The science of logistics as applied...What Exactly is Space Logistics? James C. Breidenbach Report Documentation Page Form Approved OMB No. 0704-0188 Public reporting burden for the

  15. Logistics Transformation: The Paradigm Shift

    DTIC Science & Technology

    2007-05-14

    NUMBER 6. AUTHOR(S) MAJ Derrick A. Corbett 5d. PROJECT NUMBER 5e. TASK NUMBER 5f. WORK UNIT NUMBER 7. PERFORMING ORGANIZATION NAME(S) AND...ADDRESS(ES) 8. PERFORMING ORGANIZATION REPORT U.S. Army Command and General Staff College School of Advanced Military Studies 250 Gibbon...logistics transformation be defined. In the 18th Century, the French invented “a third military science which they called Logistique, or Logistics

  16. Ridge regression processing

    NASA Technical Reports Server (NTRS)

    Kuhl, Mark R.

    1990-01-01

    Current navigation requirements depend on a geometric dilution of precision (GDOP) criterion. As long as the GDOP stays below a specific value, navigation requirements are met. The GDOP will exceed the specified value when the measurement geometry becomes too collinear. A new signal processing technique, called Ridge Regression Processing, can reduce the effects of nearly collinear measurement geometry; thereby reducing the inflation of the measurement errors. It is shown that the Ridge signal processor gives a consistently better mean squared error (MSE) in position than the Ordinary Least Mean Squares (OLS) estimator. The applicability of this technique is currently being investigated to improve the following areas: receiver autonomous integrity monitoring (RAIM), coverage requirements, availability requirements, and precision approaches.
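
    The effect of near-collinearity described here can be illustrated with the textbook ridge estimator, which solves (X'X + kI)b = X'y. This is a minimal sketch under stated assumptions, not the report's signal processor: the function name, the two-predictor restriction, the data, and the ridge constant k = 1 are all hypothetical.

```python
import math


def ridge_two_predictors(x1, x2, y, k):
    """Ridge coefficients for two centered predictors: solve (X'X + kI) b = X'y.

    Setting k = 0 gives the ordinary least squares solution.
    """
    n = len(y)
    m1, m2, my = sum(x1) / n, sum(x2) / n, sum(y) / n
    c1 = [v - m1 for v in x1]
    c2 = [v - m2 for v in x2]
    cy = [v - my for v in y]
    # Entries of X'X + kI and of X'y for the centered data.
    a11 = sum(u * u for u in c1) + k
    a22 = sum(u * u for u in c2) + k
    a12 = sum(u * v for u, v in zip(c1, c2))
    g1 = sum(u * v for u, v in zip(c1, cy))
    g2 = sum(u * v for u, v in zip(c2, cy))
    # Solve the 2x2 system by Cramer's rule.
    det = a11 * a22 - a12 * a12
    return (a22 * g1 - a12 * g2) / det, (a11 * g2 - a12 * g1) / det


# Hypothetical data: x2 is almost a copy of x1, and y is roughly 2 * x1.
x1 = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
x2 = [1.02, 1.97, 3.01, 4.03, 4.98, 6.01]
y = [2.1, 3.9, 6.2, 7.8, 10.1, 12.0]

ols = ridge_two_predictors(x1, x2, y, k=0.0)
ridge = ridge_two_predictors(x1, x2, y, k=1.0)
print("OLS coefficients:  ", ols)
print("ridge coefficients:", ridge)
print("coefficient norms: ", math.hypot(*ols), math.hypot(*ridge))
```

    With nearly collinear predictors the least squares coefficients are inflated and poorly determined, while the ridge term shrinks them toward a stable split between the two predictors. The choice of k trades bias against variance and is left open here.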

  17. NASA Space Rocket Logistics Challenges

    NASA Technical Reports Server (NTRS)

    Neeley, James R.; Jones, James V.; Watson, Michael D.; Bramon, Christopher J.; Inman, Sharon K.; Tuttle, Loraine

    2014-01-01

    The Space Launch System (SLS) is the new NASA heavy lift launch vehicle and is scheduled for its first mission in 2017. The goal of the first mission, which will be uncrewed, is to demonstrate the integrated system performance of the SLS rocket and spacecraft before a crewed flight in 2021. SLS has many of the same logistics challenges as any other large scale program. Common logistics concerns for SLS include integration of discrete programs geographically separated, multiple prime contractors with distinct and different goals, schedule pressures and funding constraints. However, SLS also faces unique challenges. The new program is a confluence of new hardware and heritage, with heritage hardware constituting seventy-five percent of the program. This unique approach to design makes logistics concerns such as commonality especially problematic. Additionally, a very low manifest rate of one flight every four years makes logistics comparatively expensive. That, along with the SLS architecture being developed using a block upgrade evolutionary approach, exacerbates long-range planning for supportability considerations. These common and unique logistics challenges must be clearly identified and tackled to allow SLS to have a successful program. This paper will address the common and unique challenges facing the SLS programs, along with the analysis and decisions the NASA Logistics engineers are making to mitigate the threats posed by each.

  18. Logistic Mixed Models to Investigate Implicit and Explicit Belief Tracking

    PubMed Central

    Lages, Martin; Scheel, Anne

    2016-01-01

    We investigated the proposition of a two-systems Theory of Mind in adults’ belief tracking. A sample of N = 45 participants predicted the choice of one of two opponent players after observing several rounds in an animated card game. Three matches of this card game were played and initial gaze direction on target and subsequent choice predictions were recorded for each belief task and participant. We conducted logistic regressions with mixed effects on the binary data and developed Bayesian logistic mixed models to infer implicit and explicit mentalizing in true belief and false belief tasks. Although logistic regressions with mixed effects predicted the data well, a Bayesian logistic mixed model with latent task- and subject-specific parameters gave a better account of the data. As expected, explicit choice predictions suggested a clear understanding of true and false beliefs (TB/FB). Surprisingly, however, model parameters for initial gaze direction also indicated belief tracking. We discuss why task-specific parameters for initial gaze directions are different from choice predictions yet reflect second-order perspective taking. PMID:27853440

  19. Joint regression analysis of correlated data using Gaussian copulas.

    PubMed

    Song, Peter X-K; Li, Mingyao; Yuan, Ying

    2009-03-01

    This article concerns a new joint modeling approach for correlated data analysis. Utilizing Gaussian copulas, we present a unified and flexible machinery to integrate separate one-dimensional generalized linear models (GLMs) into a joint regression analysis of continuous, discrete, and mixed correlated outcomes. This essentially leads to a multivariate analogue of the univariate GLM theory and hence an efficiency gain in the estimation of regression coefficients. The availability of joint probability models enables us to develop a full maximum likelihood inference. Numerical illustrations are focused on regression models for discrete correlated data, including multidimensional logistic regression models and a joint model for mixed normal and binary outcomes. In the simulation studies, the proposed copula-based joint model is compared to the popular generalized estimating equations, which is a moment-based estimating equation method to join univariate GLMs. Two real-world data examples are used in the illustration.

  20. Statistical power analyses using G*Power 3.1: tests for correlation and regression analyses.

    PubMed

    Faul, Franz; Erdfelder, Edgar; Buchner, Axel; Lang, Albert-Georg

    2009-11-01

    G*Power is a free power analysis program for a variety of statistical tests. We present extensions and improvements of the version introduced by Faul, Erdfelder, Lang, and Buchner (2007) in the domain of correlation and regression analyses. In the new version, we have added procedures to analyze the power of tests based on (1) single-sample tetrachoric correlations, (2) comparisons of dependent correlations, (3) bivariate linear regression, (4) multiple linear regression based on the random predictor model, (5) logistic regression, and (6) Poisson regression. We describe these new features and provide a brief introduction to their scope and handling.

  1. Logistics Management: New trends in the Reverse Logistics

    NASA Astrophysics Data System (ADS)

    Antonyová, A.; Antony, P.; Soewito, B.

    2016-04-01

    The present level and quality of the environment depend directly on our access to natural resources and on their sustainability. Production activities and the phenomena associated with them in particular have a direct impact on the future of our planet. The recycling process, which often becomes an important and integral part of the production program in large enterprises, is usually problematic in small and medium-sized enterprises. We can specify a few factors that have a direct impact on the development and successful application of an effective reverse logistics system. Finding an economically acceptable model of reverse logistics, focused on converting waste materials for renewable energy, is a task in progress.

  2. Regression analysis for solving diagnosis problem of children's health

    NASA Astrophysics Data System (ADS)

    Cherkashina, Yu A.; Gerget, O. M.

    2016-04-01

    The paper presents the results of research on applying statistical techniques, namely regression analysis, to assess the health status of children in the neonatal period from medical data (hemostatic parameters, blood test parameters, gestational age, vascular-endothelial growth factor) measured at 3-5 days of life. The studied medical data are described in detail, and a binary logistic regression procedure is discussed. The basic results of the research are presented. A classification table of predicted and observed values is shown, and the overall percentage of correct recognition is determined. The regression equation coefficients are calculated and used to write the general regression equation. Based on the results of the logistic regression, ROC analysis was performed: the sensitivity and specificity of the model are calculated and ROC curves are constructed. These mathematical techniques allow the health of children to be diagnosed with a high quality of recognition. The results make a significant contribution to the development of evidence-based medicine and have high practical importance in the professional activity of the author.
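
    The evaluation steps listed in this abstract (classification table, overall percentage correct, sensitivity, specificity, ROC analysis) can be sketched as follows. The labels and predicted probabilities are made-up illustration values, not data from the paper, and the function names and the 0.5 cut-off are assumptions.

```python
def classification_table(labels, probs, threshold=0.5):
    """Count (tp, fp, tn, fn) when predicting 1 for probabilities >= threshold."""
    tp = fp = tn = fn = 0
    for y, p in zip(labels, probs):
        predicted = 1 if p >= threshold else 0
        if predicted == 1 and y == 1:
            tp += 1
        elif predicted == 1 and y == 0:
            fp += 1
        elif predicted == 0 and y == 0:
            tn += 1
        else:
            fn += 1
    return tp, fp, tn, fn


def area_under_roc(labels, probs):
    """AUC as the share of (positive, negative) pairs ranked correctly; ties count half."""
    pos = [p for y, p in zip(labels, probs) if y == 1]
    neg = [p for y, p in zip(labels, probs) if y == 0]
    wins = sum(1.0 if p > q else 0.5 if p == q else 0.0 for p in pos for q in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical observed outcomes and model-predicted probabilities.
labels = [1, 1, 1, 0, 0, 0, 1, 0]
probs = [0.9, 0.8, 0.4, 0.3, 0.2, 0.6, 0.7, 0.1]

tp, fp, tn, fn = classification_table(labels, probs)
sensitivity = tp / (tp + fn)          # true positive rate
specificity = tn / (tn + fp)          # true negative rate
accuracy = (tp + tn) / len(labels)    # overall share of correct recognition
auc = area_under_roc(labels, probs)
print(tp, fp, tn, fn, sensitivity, specificity, accuracy, auc)
```

    Sweeping the threshold from 0 to 1 and plotting sensitivity against 1 - specificity traces the ROC curve; the pairwise count used for the AUC is equivalent to the area under that curve.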

  3. Operational Logistics: Lessons from the Inchon Landing.

    DTIC Science & Technology

    2007-11-02

    logistics at the strategic, operational, or tactical levels. An analysis of Operation Chromite, the amphibious landing at Inchon, reveals that logistical...commander’s intent, and define the logistics focus of effort demonstrate that operational logistics was a key enabler in Operation Chromite. This analysis

  4. Logistics background study: underground mining

    SciTech Connect

    Hanslovan, J. J.; Visovsky, R. G.

    1982-02-01

    Logistical functions that are normally associated with US underground coal mining are investigated and analyzed. These functions imply all activities and services that support the producing sections of the mine. The report provides a better understanding of how these functions impact coal production in terms of time, cost, and safety. Major underground logistics activities are analyzed and include: transportation of personnel, supplies and equipment; transportation of coal and rock; electrical distribution and communications systems; water handling; hydraulics; and ventilation systems. Recommended areas for future research are identified and prioritized.

  5. Continual Improvement in Shuttle Logistics

    NASA Technical Reports Server (NTRS)

    Flowers, Jean; Schafer, Loraine

    1995-01-01

    It has been said that Continual Improvement (CI) is difficult to apply to service oriented functions, especially in a government agency such as NASA. However, a constrained budget and increasing requirements are a way of life at NASA Kennedy Space Center (KSC), making it a natural environment for the application of CI tools and techniques. This paper describes how KSC, and specifically the Space Shuttle Logistics Project, a key contributor to KSC's mission, has embraced the CI management approach as a means of achieving its strategic goals and objectives. An overview of how the KSC Space Shuttle Logistics Project has structured its CI effort and examples of some of the initiatives are provided.

  6. Evaluating differential effects using regression interactions and regression mixture models

    PubMed Central

    Van Horn, M. Lee; Jaki, Thomas; Masyn, Katherine; Howe, George; Feaster, Daniel J.; Lamont, Andrea E.; George, Melissa R. W.; Kim, Minjung

    2015-01-01

    Research increasingly emphasizes understanding differential effects. This paper focuses on understanding regression mixture models, a relatively new statistical method for assessing differential effects by comparing results to using an interactive term in linear regression. The research questions which each model answers, their formulation, and their assumptions are compared using Monte Carlo simulations and real data analysis. The capabilities of regression mixture models are described and specific issues to be addressed when conducting regression mixtures are proposed. The paper aims to clarify the role that regression mixtures can take in the estimation of differential effects and increase awareness of the benefits and potential pitfalls of this approach. Regression mixture models are shown to be a potentially effective exploratory method for finding differential effects when these effects can be defined by a small number of classes of respondents who share a typical relationship between a predictor and an outcome. It is also shown that the comparison between regression mixture models and interactions becomes substantially more complex as the number of classes increases. It is argued that regression interactions are well suited for direct tests of specific hypotheses about differential effects and regression mixtures provide a useful approach for exploring effect heterogeneity given adequate samples and study design. PMID:26556903

  7. Long-term follow-up of tandem high-dose therapy with autologous stem cell support for adults with high-risk age-adjusted international prognostic index aggressive non-Hodgkin Lymphomas: a GOELAMS pilot study.

    PubMed

    Monjanel, Hélène; Deconinck, Eric; Perrodeau, Elodie; Gastinne, Thomas; Delwail, Vincent; Moreau, Anne; François, Sylvie; Berthou, Christian; Gyan, Emmanuel; Milpied, Noël

    2011-06-01

    Single high-dose therapy (HDT) followed by autologous peripheral blood stem cell (PBSC) support improves complete response and overall survival (OS) in untreated aggressive non-Hodgkin's lymphoma (NHL). However, patients with a high age-adjusted international prognostic index (aa-IPI equal to 3) still have poor clinical outcome despite high dose intensity regimen. To improve complete response in this subgroup, the French Groupe Ouest-Est des Leucémies et Autres Maladies du Sang (GOELAMS) conducted a pilot phase II trial (073) evaluating tandem HDT with PBSC support in a series of 45 patients with aa-IPI equal to 3 untreated aggressive non-Hodgkin's lymphoma. After induction with an anthracycline-containing regimen, responders underwent tandem HDT conditioned by high-dose mitoxantrone plus cytarabine for the first HDT and total-body irradiation (TBI), carmustine, etoposide, and cyclophosphamide for the second HDT. Thirty-one patients out of 41 evaluable patients completed the program. There were 4 toxic deaths. The complete response rate was 49%. With a median follow-up of 114 months for surviving patients, the OS was 51%, and 19 out of the 22 patients (86%) who reached a complete response are alive and relapse-free. Recent prospective evaluation of quality of life and comorbidities of surviving patients does not reveal long-term toxicities of the procedure. In the era of monoclonal antibodies and response-adapted therapy, the role of tandem HDT still needs to be determined.

  8. NASA Space Rocket Logistics Challenges

    NASA Technical Reports Server (NTRS)

    Bramon, Chris; Neeley, James R.; Jones, James V.; Watson, Michael D.; Inman, Sharon K.; Tuttle, Loraine

    2014-01-01

    The Space Launch System (SLS) is the new NASA heavy lift launch vehicle in development and is scheduled for its first mission in 2017. SLS has many of the same logistics challenges as any other large scale program. However, SLS also faces unique challenges. This presentation will address the SLS challenges, along with the analysis and decisions to mitigate the threats posed by each.

  9. Equivalent Linear Logistic Test Models.

    ERIC Educational Resources Information Center

    Bechger, Timo M.; Verstralen, Huub H. F. M.; Verhelst, Norma D.

    2002-01-01

    Discusses the Linear Logistic Test Model (LLTM) and demonstrates that there are many equivalent ways to specify a model. Analyzed a real data set (300 responses to 5 analogies) using a Lagrange multiplier test for the specification of the model, and demonstrated that there may be many ways to change the specification of an LLTM and achieve the…

  10. Evaluating Differential Effects Using Regression Interactions and Regression Mixture Models

    ERIC Educational Resources Information Center

    Van Horn, M. Lee; Jaki, Thomas; Masyn, Katherine; Howe, George; Feaster, Daniel J.; Lamont, Andrea E.; George, Melissa R. W.; Kim, Minjung

    2015-01-01

    Research increasingly emphasizes understanding differential effects. This article focuses on understanding regression mixture models, which are relatively new statistical methods for assessing differential effects by comparing results to using an interactive term in linear regression. The research questions which each model answers, their…

  11. Error bounds in cascading regressions

    USGS Publications Warehouse

    Karlinger, M.R.; Troutman, B.M.

    1985-01-01

    Cascading regressions is a technique for predicting a value of a dependent variable when no paired measurements exist to perform a standard regression analysis. Biases in coefficients of a cascaded-regression line as well as error variance of points about the line are functions of the correlation coefficient between dependent and independent variables. Although this correlation cannot be computed because of the lack of paired data, bounds can be placed on errors through the required properties of the correlation coefficient. The potential mean squared error of a cascaded-regression prediction can be large, as illustrated through an example using geomorphologic data. © 1985 Plenum Publishing Corporation.

  12. Logistics, electronic commerce, and the environment

    NASA Astrophysics Data System (ADS)

    Sarkis, Joseph; Meade, Laura; Talluri, Srinivas

    2002-02-01

    Organizations realize that a strong supporting logistics or electronic logistics (e-logistics) function is important from both commercial and consumer perspectives. The implications of e-logistics models and practices cover the forward and reverse logistics functions of organizations. They also have direct and profound impact on the natural environment. This paper will focus on a discussion of forward and reverse e-logistics and their relationship to the natural environment. After discussion of the many pertinent issues in these areas, directions of practice and implications for study and research are then described.

  13. Precision Efficacy Analysis for Regression.

    ERIC Educational Resources Information Center

    Brooks, Gordon P.

    When multiple linear regression is used to develop a prediction model, sample size must be large enough to ensure stable coefficients. If the derivation sample size is inadequate, the model may not predict well for future subjects. The precision efficacy analysis for regression (PEAR) method uses a cross-validity approach to select sample sizes…

  14. Logistics support of lunar bases

    NASA Astrophysics Data System (ADS)

    Woodcock, Gordon R.

    Concepts for lunar base design and operations have been studied for over twenty years. A brief summary of past concepts is presented, with reference to new ideas on design, uses and logistics schemes. Transportation systems and mission modes are reviewed, showing the significant cost benefits of reusable transportation at the higher traffic rates representative of lunar base buildup and logistics operations. Parametric studies are presented over the range of base size from 6 crew to 1000 crew, and the "choke points", or barriers to growth are identified and means for their resolution presented. The leverages of food growth on the Moon, crew stay time, use of lunar oxygen, indigenous resources for construction of facilities, and transportation systems size and operations modes are presented. It is concluded that bases as large as 1000 people are affordable at less than twice the cost of an initial base of six people if these leverages of advanced basing technologies are exploited.

  15. Logistic analysis of algae cultivation.

    PubMed

    Slegers, P M; Leduc, S; Wijffels, R H; van Straten, G; van Boxtel, A J B

    2015-03-01

    Energy requirements for resource transport of algae cultivation are unknown. This work describes the quantitative analysis of energy requirements for water and CO2 transport. Algae cultivation models were combined with the quantitative logistic decision model 'BeWhere' for the regions Benelux (Northwest Europe), southern France and Sahara. For photobioreactors, the energy consumed for transport of water and CO2 turns out to be a small percentage of the energy contained in the algae biomass (0.1-3.6%). For raceway ponds the share for transport is higher (0.7-38.5%). The energy consumption for transport is the lowest in the Benelux due to good availability of both water and CO2. Analysing transport logistics is still important, despite the low energy consumption for transport. The results demonstrate that resource requirements, resource distribution and availability and transport networks have a profound effect on the location choices for algae cultivation.

  16. Logistical teamwork tames Madagascar wildcats

    SciTech Connect

    Twa, W.

    1986-03-01

    Amoco Production Company's exploration program in western Madagascar's Sakalava coastal plain exemplifies the unique logistical challenges both operator and drilling contractor must undergo to reach the few remaining onshore frontier areas. Sakalava is characterized by deep rivers, flood prone tributaries, and a lone 40 km hard-surface road. Problems caused by a lack of port facilities and oil field services are complicated by thousands of square miles of unimproved wooded plains. Rainwater from the nearby mountains of central Madagascar frequently floods rivers in the Sakalava coastal plain leaving impassable marshes in their wake. Prior to this project, about 45 wells had been drilled in Madagascar. Most recently, state oil company Omnis contracted Bawden Drilling International Inc. to drill nine wells for its heavy oil project on the Tsimioro oil prospect. Bawden provided both logistical and drilling services for that program.

  17. Fleet Logistics Center, Puget Sound

    DTIC Science & Technology

    2012-08-01

    ORGANIZATION NAME(S) AND ADDRESS(ES) Naval Supply Systems Command, Fleet Logistics Center, Puget Sound, 467 W Street, Bremerton, WA, 98314-5100 8... Bremerton, WA Established: October 1967 Name Changes: Naval Supply Center Puget Sound, Fleet and Industrial Supply Center Puget...or Sasebo) deployed Ships in the Western Pacific (WestPac) Naval Base Kitsap at Bremerton and Bangor (NBK at Bremerton or Bangor) Navy Region

  18. Trend Analysis of Cancer Mortality and Incidence in Panama, Using Joinpoint Regression Analysis

    PubMed Central

    Politis, Michael; Higuera, Gladys; Chang, Lissette Raquel; Gomez, Beatriz; Bares, Juan; Motta, Jorge

    2015-01-01

    Cancer is one of the leading causes of death worldwide and its incidence is expected to increase in the future. In Panama, cancer is also one of the leading causes of death. In 1964, a nationwide cancer registry was started and it was restructured and improved in 2012. The aim of this study is to utilize Joinpoint regression analysis to study the trends of the incidence and mortality of cancer in Panama in the last decade. Cancer mortality was estimated from the Panamanian National Institute of Census and Statistics Registry for the period 2001 to 2011. Cancer incidence was estimated from the Panamanian National Cancer Registry for the period 2000 to 2009. The Joinpoint Regression Analysis program, version 4.0.4, was used to calculate trends by age-adjusted incidence and mortality rates for selected cancers. Overall, the trend of age-adjusted cancer mortality in Panama has declined over the last 10 years (−1.12% per year). The cancers for which there was a significant increase in the trend of mortality were female breast cancer and ovarian cancer; while the highest increases in incidence were shown for breast cancer, liver cancer, and prostate cancer. Significant decrease in the trend of mortality was evidenced for the following: prostate cancer, lung and bronchus cancer, and cervical cancer; with respect to incidence, only oral and pharynx cancer in both sexes had a significant decrease. Some cancers showed no significant trends in incidence or mortality. This study reveals contrasting trends in cancer incidence and mortality in Panama in the last decade. Although Panama is considered an upper middle income nation, this study demonstrates that some cancer mortality trends, like the ones seen in cervical and lung cancer, behave similarly to the ones seen in high income countries. In contrast, other types, like breast cancer, follow a pattern seen in countries undergoing a transition to a developed economy with its associated lifestyle, nutrition, and

  19. Trend Analysis of Cancer Mortality and Incidence in Panama, Using Joinpoint Regression Analysis.

    PubMed

    Politis, Michael; Higuera, Gladys; Chang, Lissette Raquel; Gomez, Beatriz; Bares, Juan; Motta, Jorge

    2015-06-01

    Cancer is one of the leading causes of death worldwide and its incidence is expected to increase in the future. In Panama, cancer is also one of the leading causes of death. In 1964, a nationwide cancer registry was started and it was restructured and improved in 2012. The aim of this study is to utilize Joinpoint regression analysis to study the trends of the incidence and mortality of cancer in Panama in the last decade. Cancer mortality was estimated from the Panamanian National Institute of Census and Statistics Registry for the period 2001 to 2011. Cancer incidence was estimated from the Panamanian National Cancer Registry for the period 2000 to 2009. The Joinpoint Regression Analysis program, version 4.0.4, was used to calculate trends by age-adjusted incidence and mortality rates for selected cancers. Overall, the trend of age-adjusted cancer mortality in Panama has declined over the last 10 years (-1.12% per year). The cancers for which there was a significant increase in the trend of mortality were female breast cancer and ovarian cancer; while the highest increases in incidence were shown for breast cancer, liver cancer, and prostate cancer. Significant decrease in the trend of mortality was evidenced for the following: prostate cancer, lung and bronchus cancer, and cervical cancer; with respect to incidence, only oral and pharynx cancer in both sexes had a significant decrease. Some cancers showed no significant trends in incidence or mortality. This study reveals contrasting trends in cancer incidence and mortality in Panama in the last decade. Although Panama is considered an upper middle income nation, this study demonstrates that some cancer mortality trends, like the ones seen in cervical and lung cancer, behave similarly to the ones seen in high income countries. In contrast, other types, like breast cancer, follow a pattern seen in countries undergoing a transition to a developed economy with its associated lifestyle, nutrition, and body weight
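
    A trend quoted as a percentage per year, such as the -1.12% above, is conventionally an annual percent change (APC) obtained from a log-linear fit of rate on calendar year: if ln(rate) = a + b * year, then APC = 100 * (exp(b) - 1). The sketch below covers a single segment only and makes no attempt at the change-point search that the Joinpoint program performs. The function name is hypothetical and the rates are synthetic, generated to decline by exactly 1.12% per year; they are not the Panamanian data.

```python
import math


def annual_percent_change(years, rates):
    """APC from an ordinary least squares fit of ln(rate) on calendar year."""
    n = len(years)
    mean_year = sum(years) / n
    logs = [math.log(r) for r in rates]
    mean_log = sum(logs) / n
    slope = (
        sum((t - mean_year) * (v - mean_log) for t, v in zip(years, logs))
        / sum((t - mean_year) ** 2 for t in years)
    )
    return 100.0 * (math.exp(slope) - 1.0)


# Synthetic age-adjusted rates per 100,000 that fall by 1.12% each year.
years = list(range(2001, 2012))
rates = [100.0 * (1.0 - 0.0112) ** (t - 2001) for t in years]

apc = annual_percent_change(years, rates)
print(f"APC: {apc:.2f}% per year")
```

    Fitting on the log scale makes the estimated change a constant percentage of the current rate rather than a constant absolute amount, which is why trends in rates of very different size can be compared directly.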

  20. A Predictive Logistic Regression Model of World Conflict Using Open Source Data

    DTIC Science & Technology

    2015-03-26

    model that predicts the future state of the world where nations will either be in a state of “violent conflict” or “not in violent conflict” based on...state of violent conflict in 2015, seventeen of them are new to conflict since the last published list in 2013. A prediction tool is created to allow...war-game subject matter experts and students to identify future predicted violent conflict and the responsible variables.

  1. Comparative Analysis of Decision Trees with Logistic Regression in Predicting Fault-Prone Classes

    NASA Astrophysics Data System (ADS)

    Singh, Yogesh; Takkar, Arvinder Kaur; Malhotra, Ruchika

    Metrics are available for predicting fault-prone classes, which may help software organizations plan and perform testing activities. This is possible through proper allocation of resources to the fault-prone parts of the design and code of the software. Hence, the importance and usefulness of such metrics is understandable, but empirical validation of these metrics is always a great challenge. Decision Tree (DT) methods have been successfully applied to classification problems in many applications. This paper evaluates the capability of three DT methods and compares their performance with that of a statistical method in predicting fault-prone software classes using a publicly available NASA data set. The results indicate that the prediction performance of DT is generally better than that of the statistical model. However, similar studies are required in order to establish the acceptability of the DT models.

  2. Ordinal Logistic Regression to Detect Differential Item Functioning for Gender in the Institutional Integration Scale

    ERIC Educational Resources Information Center

    Breidenbach, Daniel H.; French, Brian F.

    2011-01-01

    Many factors can influence a student's decision to withdraw from college. Intervention programs aimed at retention can benefit from understanding the factors related to such decisions, especially in underrepresented groups. The Institutional Integration Scale (IIS) has been suggested as a predictor of student persistence. Accurate prediction of…

  3. Integration of geographic information systems and logistic multiple regression for aquatic macrophyte modeling

    SciTech Connect

    Narumalani, S.; Jensen, J.R.; Althausen, J.D.; Burkhalter, S.; Mackey, H.E. Jr.

    1994-06-01

    Since aquatic macrophytes have an important influence on the physical and chemical processes of an ecosystem while simultaneously affecting human activity, it is imperative that they be inventoried and managed wisely. However, mapping wetlands can be a major challenge because they are found in diverse geographic areas ranging from small tributary streams, to shrub or scrub and marsh communities, to open water lacustrine environments. In addition, the type and spatial distribution of wetlands can change dramatically from season to season, especially when nonpersistent species are present. This research focuses on developing a model for predicting the future growth and distribution of aquatic macrophytes. This model will use a geographic information system (GIS) to analyze some of the biophysical variables that affect aquatic macrophyte growth and distribution. The data will provide scientists information on the future spatial growth and distribution of aquatic macrophytes. This study focuses on the Savannah River Site Par Pond (1,000 ha) and L Lake (400 ha), two cooling ponds that have received thermal effluent from nuclear reactor operations. Par Pond was constructed in 1958, and natural invasion of wetland has occurred over its 35-year history, with much of the shoreline having developed extensive beds of persistent and non-persistent aquatic macrophytes.

  4. Fitting Proportional Odds Models to Educational Data in Ordinal Logistic Regression Using Stata, SAS and SPSS

    ERIC Educational Resources Information Center

    Liu, Xing

    2008-01-01

    The proportional odds (PO) model, which is also called the cumulative odds model (Agresti, 1996, 2002; Armstrong & Sloan, 1989; Long, 1997; Long & Freese, 2006; McCullagh, 1980; McCullagh & Nelder, 1989; Powers & Xie, 2000; O'Connell, 2006), is one of the most commonly used models for the analysis of ordinal categorical data and comes from the class…
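
    As a minimal sketch of what the proportional odds assumption means, assume the parameterization logit P(Y <= j | x) = alpha_j - beta * x (sign conventions differ between software packages). The cutpoints, the coefficient, and the function names below are hypothetical values chosen for illustration, not estimates from the article.

```python
import math

def logistic(z):
    """Inverse logit."""
    return 1.0 / (1.0 + math.exp(-z))

def category_probabilities(cutpoints, beta, x):
    """Category probabilities implied by logit P(Y <= j | x) = alpha_j - beta * x."""
    cumulative = [logistic(a - beta * x) for a in cutpoints] + [1.0]
    probs = [cumulative[0]]
    for j in range(1, len(cumulative)):
        probs.append(cumulative[j] - cumulative[j - 1])
    return probs

def cumulative_odds(probs, j):
    """Odds of falling in category j or lower."""
    c = sum(probs[: j + 1])
    return c / (1.0 - c)

# Hypothetical model with four ordered categories (three cutpoints).
cutpoints = [-1.0, 0.5, 2.0]
beta = 0.8
p0 = category_probabilities(cutpoints, beta, 0.0)
p1 = category_probabilities(cutpoints, beta, 1.0)

# The cumulative odds ratio is the same at every cutpoint: exp(-beta).
ratios = [cumulative_odds(p1, j) / cumulative_odds(p0, j) for j in range(3)]
print(ratios)
```

    The single shared coefficient is what makes the odds ratio identical across all cutpoints; a model that let beta vary by cutpoint would no longer be a proportional odds model.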

  5. Logistic Regression Analysis of the Response of Winter Wheat to Components of Artificial Freezing Episodes

    Technology Transfer Automated Retrieval System (TEKTRAN)

    Improvement of cold tolerance of winter wheat (Triticum aestivum L.) through breeding methods has been problematic. A better understanding of how individual wheat cultivars respond to components of the freezing process may provide new information that can be used to develop more cold tolerance culti...

  6. Analysis of Variables: Predicting Sophomore Persistence Using Logistic Regression Analysis at the University of South Florida

    ERIC Educational Resources Information Center

    Miller, Thomas E.; Herreid, Charlene H.

    2009-01-01

    This is the fifth in a series of articles describing an attrition prediction and intervention project at the University of South Florida (USF) in Tampa. The project was originally presented in the 83(2) issue (Miller 2007). The statistical model for predicting attrition was described in the 83(3) issue (Miller and Herreid 2008). The methods and…

  7. AP® Potential Predicted by PSAT/NMSQT® Scores Using Logistic Regression. Statistical Report 2014-1

    ERIC Educational Resources Information Center

    Zhang, Xiuyuan; Patel, Priyank; Ewing, Maureen

    2014-01-01

    AP Potential™ is an educational guidance tool that uses PSAT/NMSQT® scores to identify students who have the potential to do well on one or more Advanced Placement® (AP®) Exams. Students identified as having AP potential, perhaps students who would not have been otherwise identified, should consider enrolling in the corresponding AP course if they…

  8. Achievement Gap Projection for Standardized Testing through Logistic Regression within a Large Arizona School District

    ERIC Educational Resources Information Center

    Kellermeyer, Steven Bruce

    2011-01-01

    In the last few decades high-stakes testing has become more political than educational. The Districts within Arizona are bound by the mandates of both AZ LEARNS and the No Child Left Behind Act of 2001. At the time of this writing, both legislative mandates relied on the Arizona Instrument for Measuring Standards (AIMS) as State Tests for gauging…

  9. Role of Social Performance in Predicting Learning Problems: Prediction of Risk Using Logistic Regression Analysis

    ERIC Educational Resources Information Center

    Del Prette, Zilda Aparecida Pereira; Prette, Almir Del; De Oliveira, Lael Almeida; Gresham, Frank M.; Vance, Michael J.

    2012-01-01

    Social skills are specific behaviors that individuals exhibit in order to successfully complete social tasks whereas social competence represents judgments by significant others that these social tasks have been successfully accomplished. The present investigation identified the best sociobehavioral predictors obtained from different raters…

  10. Modeling Small Unmanned Aerial System Mishaps Using Logistic Regression and Artificial Neural Networks

    DTIC Science & Technology

    2012-03-22

    preconditions for unsafe acts which ultimately result in active failures (mishaps). The DoD HFACS classification system has been used to categorize the...reoccur as future individuals repeat the same locally rational acts, while a classification scheme on these errors merely provides a label to what in...boundaries, or public endangerment”. The second document, AFRLMAN 99-103, defines Class A through C mishaps much like the DoD classification

  11. Exploring Person Fit with an Approach Based on Multilevel Logistic Regression

    ERIC Educational Resources Information Center

    Walker, A. Adrienne; Engelhard, George, Jr.

    2015-01-01

    The idea that test scores may not be valid representations of what students know, can do, and should learn next is well known. Person fit provides an important aspect of validity evidence. Person fit analyses at the individual student level are not typically conducted and person fit information is not communicated to educational stakeholders. In…

  12. Prediction of early postoperative infections in pediatric liver transplantation by logistic regression

    NASA Astrophysics Data System (ADS)

    Uzunova, Yordanka; Prodanova, Krasimira; Spassov, Lubomir

    2016-12-01

    Orthotopic liver transplantation (OLT) is the only curative treatment for end-stage liver disease. Early diagnosis and treatment of infections after OLT are usually associated with improved outcomes. This study's objective is to identify reliable factors that can predict postoperative infectious morbidity. The analysis included 27 children who underwent liver transplantation in our department. The correlation between two parameters (the level of blood glucose on the 5th postoperative day and the duration of the anhepatic phase) and postoperative infections was analyzed, using univariate analysis. In this analysis, an independent predictive factor was derived which adequately identifies patients at risk of infectious complications after a liver transplantation.

  13. Investigating the Effect of Complexity Factors in Stoichiometry Problems Using Logistic Regression and Eye Tracking

    ERIC Educational Resources Information Center

    Tang, Hui; Kirk, John; Pienta, Norbert J.

    2014-01-01

    This paper includes two experiments, one investigating complexity factors in stoichiometry word problems, and the other identifying students' problem-solving protocols by using eye-tracking technology. The word problems used in this study had five different complexity factors, which were randomly assigned by a Web-based tool that we developed. The…

  14. Comparing the Discrete and Continuous Logistic Models

    ERIC Educational Resources Information Center

    Gordon, Sheldon P.

    2008-01-01

    The solutions of the discrete logistic growth model based on a difference equation and the continuous logistic growth model based on a differential equation are compared and contrasted. The investigation is conducted using a dynamic interactive spreadsheet. (Contains 5 figures.)
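
    The comparison described above can be sketched numerically. This assumes the usual forms of the two models, the difference equation P[n+1] = P[n] + r * P[n] * (1 - P[n] / K) and the closed-form solution of dP/dt = r * P * (1 - P / K); the parameter values are hypothetical, and the article itself works in a spreadsheet rather than in code.

```python
import math

def discrete_logistic(p0, r, k, steps):
    """Iterate the difference equation P[n+1] = P[n] + r*P[n]*(1 - P[n]/K)."""
    values = [p0]
    for _ in range(steps):
        p = values[-1]
        values.append(p + r * p * (1.0 - p / k))
    return values

def continuous_logistic(p0, r, k, t):
    """Closed-form solution of dP/dt = r*P*(1 - P/K) with P(0) = p0."""
    return k / (1.0 + ((k - p0) / p0) * math.exp(-r * t))

# Hypothetical parameters: growth rate, carrying capacity, initial population.
r, k, p0 = 0.3, 1000.0, 10.0
discrete = discrete_logistic(p0, r, k, 100)
continuous = [continuous_logistic(p0, r, k, t) for t in range(101)]

for n in (0, 10, 20, 40, 100):
    print(n, round(discrete[n], 2), round(continuous[n], 2))
```

    With the same r, the two curves share a starting value and a limiting value but differ in between, because the discrete model applies each period's growth in one jump while the continuous model compounds it continuously.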

  15. EMS incident management: emergency medical logistics.

    PubMed

    Maniscalco, P M; Christen, H T

    1999-01-01

    If you had to get x amount of supplies to point A or point B, or both, in 10 minutes, how would you do it? The answer lies in the following steps: 1. Develop a logistics plan. 2. Use emergency management as a partner agency for developing your logistics plan. 3. Implement a push logistics system by determining what supplies/medications and equipment are important. 4. Place mass casualty/disaster caches at key locations for rapid deployment. Have medication/fluid caches available at local hospitals. 5. Develop and implement command caches for key supervisors and managers. 6. Anticipate the logistics requirements of a terrorism/tactical violence event based on a community threat assessment. 7. Educate the public about preparing a BLS family disaster kit. 8. Test logistics capabilities at disaster exercises. 9. Budget for logistics needs. 10. Never underestimate the importance of logistics. When logistics support fails, the EMS system fails.

  16. Rank regression: an alternative regression approach for data with outliers.

    PubMed

    Chen, Tian; Tang, Wan; Lu, Ying; Tu, Xin

    2014-10-01

    Linear regression models are widely used in mental health and related health services research. However, the classic linear regression analysis assumes that the data are normally distributed, an assumption that is not met by the data obtained in many studies. One method of dealing with this problem is to use semi-parametric models, which do not require that the data be normally distributed. But semi-parametric models are quite sensitive to outlying observations, so the generated estimates are unreliable when study data includes outliers. In this situation, some researchers trim the extreme values prior to conducting the analysis, but the ad-hoc rules used for data trimming are based on subjective criteria so different methods of adjustment can yield different results. Rank regression provides a more objective approach to dealing with non-normal data that includes outliers. This paper uses simulated and real data to illustrate this useful regression approach for dealing with outliers and compares it to the results generated using classical regression models and semi-parametric regression models.

  17. Practical Session: Simple Linear Regression

    NASA Astrophysics Data System (ADS)

    Clausel, M.; Grégoire, G.

    2014-12-01

    Two exercises are proposed to illustrate simple linear regression. The first one is based on Galton's famous data set on heredity. We use the lm R command and get coefficient estimates, the standard error of the error, R2, residuals… In the second example, devoted to data related to the vapor tension of mercury, we fit a simple linear regression, predict values, and look ahead to multiple linear regression. This practical session is an excerpt from practical exercises proposed by A. Dalalyan at EPNC (see Exercises 1 and 2 of http://certis.enpc.fr/~dalalyan/Download/TP_ENPC_4.pdf).

  18. The Evolution of Centralized Operational Logistics

    DTIC Science & Technology

    2012-05-17

    Command Reports 1-10; Lieutenant General William G. Pagonis and Jeffery L. Cruikshank. Moving Mountains: Lessons in Leadership and Logistics from the Gulf...1943-1945, (Washington, DC: Government Printing Office, 1955), 320-321. 22 Alan Gropman, The Big "L" - American Logistics in World War II...29 Carter B. Magruder, Recurring Logistic Problems As I Have Observed Them, 29. 30 Ibid., 245; Lieutenant Colonel Robert L. Burke, “Corps Logistic

  19. Research on 6R Military Logistics Network

    NASA Astrophysics Data System (ADS)

    Jie, Wan; Wen, Wang

    Building a military logistics network is an important issue in the construction of new forces. This paper proposes a conceptual model of a 6R military logistics network based on JIT. We then outline three sub-networks: an axis-spoke logistics centers network, a flexible 6R organizational network, and a grid-based lean 6R military information network. Finally, strategies and proposals for constructing the three sub-networks of the 6R military logistics network are given.

  20. Nowcasting sunshine number using logistic modeling

    NASA Astrophysics Data System (ADS)

    Brabec, Marek; Badescu, Viorel; Paulescu, Marius

    2013-04-01

    In this paper, we present a formalized approach to statistical modeling of the sunshine number, a binary indicator of whether the Sun is covered by clouds, introduced previously by Badescu (Theor Appl Climatol 72:127-136, 2002). Our statistical approach is based on Markov chains and logistic regression and yields fully specified probability models that are relatively easily identified (and their unknown parameters estimated) from a set of empirical data (observed sunshine number and sunshine stability number series). We discuss the general structure of the model and its advantages, demonstrate its performance on real data, and compare its results to those of the classical ARIMA approach as a competitor. Since the model parameters have a clear interpretation, we also illustrate how, e.g., their inter-seasonal stability can be tested. We conclude with an outlook on future developments oriented to the construction of models allowing for a practically desirable smooth transition between data observed with different frequencies, and with a short discussion of the technical problems that such a goal brings.

  1. Crisis Management- Operational Logistics & Asset Visibility Technologies

    DTIC Science & Technology

    2006-06-01

    greatly increase the effectiveness of future crisis response operations. The proposed logistics framework serves as a viable solution for common logistical...

  2. Forecasting Workload for Defense Logistics Agency Distribution

    DTIC Science & Technology

    2014-12-01

    Naval Postgraduate School, Monterey, California. MBA Professional Report: Forecasting Workload for Defense Logistics Agency...December 2014...The Defense Logistics Agency (DLA) predicts issue and receipt workload for its distribution agency in order to maintain

  3. Naval Logistics Integration Through Interoperable Supply Systems

    DTIC Science & Technology

    2014-06-13

    A thesis presented to the Faculty of the U.S. Army Command...Dates covered: Aug 2013 – June 2014...This research investigates how the Navy and the Marine Corps could increase Naval Logistics

  4. Logistics hardware and services control system

    NASA Technical Reports Server (NTRS)

    Koromilas, A.; Miller, K.; Lamb, T.

    1973-01-01

    Software system permits onsite direct control of logistics operations, which include spare parts, initial installation, tool control, and repairable parts status and control, through all facets of operations. System integrates logistics actions and controls receipts, issues, loans, repairs, fabrications, and modifications and assets in predicting and allocating logistics parts and services effectively.

  5. Multiple Regression and Its Discontents

    ERIC Educational Resources Information Center

    Snell, Joel C.; Marsh, Mitchell

    2012-01-01

    Multiple regression is part of a larger statistical strategy originated by Gauss. The authors raise questions about the theory and suggest some changes that would make room for Mandelbrot and Serendipity.

  6. Wrong Signs in Regression Coefficients

    NASA Technical Reports Server (NTRS)

    McGee, Holly

    1999-01-01

    When using parametric cost estimation, it is important to note the possibility of the regression coefficients having the wrong sign. A wrong sign is defined as a sign on the regression coefficient opposite to the researcher's intuition and experience. Some possible causes for the wrong sign discussed in this paper are a small range of x's, leverage points, missing variables, multicollinearity, and computational error. Additionally, techniques for determining the cause of the wrong sign are given.
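
    One of the causes listed above, a missing variable, can be illustrated with a small sketch. The data are hypothetical and noise-free: the outcome is built as y = 1*x1 - 3*x2, with x2 closely tracking x1. The helper names are illustrative only and are not taken from the paper.

```python
def centered_sums(a, b):
    """Sum of cross-products of two sequences after centering each on its mean."""
    mean_a = sum(a) / len(a)
    mean_b = sum(b) / len(b)
    return sum((u - mean_a) * (v - mean_b) for u, v in zip(a, b))

def simple_slope(x, y):
    """OLS slope of y on a single predictor x."""
    return centered_sums(x, y) / centered_sums(x, x)

def two_predictor_slopes(x1, x2, y):
    """OLS slopes of y on x1 and x2 from the centered normal equations."""
    s11, s22, s12 = centered_sums(x1, x1), centered_sums(x2, x2), centered_sums(x1, x2)
    s1y, s2y = centered_sums(x1, y), centered_sums(x2, y)
    det = s11 * s22 - s12 * s12
    return (s22 * s1y - s12 * s2y) / det, (s11 * s2y - s12 * s1y) / det

# Hypothetical data: x2 closely follows x1, and the true model is y = 1*x1 - 3*x2.
x1 = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
x2 = [a + d for a, d in zip(x1, [0.5, -0.5, 0.5, -0.5, 0.5, -0.5])]
y = [a - 3.0 * b for a, b in zip(x1, x2)]

omitted = simple_slope(x1, y)              # x2 left out of the model
b1, b2 = two_predictor_slopes(x1, x2, y)   # both predictors included
print(round(omitted, 3), round(b1, 3), round(b2, 3))
```

    With x2 omitted, x1 absorbs the negative effect of its correlated partner and its coefficient turns negative even though its true coefficient is +1; including x2 restores the expected sign.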

  7. Mini pressurized logistics module (MPLM)

    NASA Astrophysics Data System (ADS)

    Vallerani, E.; Brondolo, D.; Basile, L.

    1996-06-01

    The MPLM Program was initiated through a Memorandum of Understanding (MOU) between the United States' National Aeronautics and Space Administration (NASA) and Italy's ASI, the Italian Space Agency, that was signed on 6 December 1991. The MPLM is a pressurized logistics module that will be used to transport supplies and materials (up to 20,000 lb), including user experiments, between Earth and International Space Station Alpha (ISSA) using the Shuttle, to support active and passive storage, and to provide a habitable environment for two people when docked to the Station. The Italian Space Agency has selected Alenia Spazio to develop MPLM modules that have always been considered a key element for the new International Space Station taking benefit from its design flexibility and consequent possible cost saving based on the maximum utilization of the Shuttle launch capability for any mission. In the frame of the very recent agreement between the U.S. and Russia for cooperation in space, that foresees the utilization of MIR 1 hardware, the Italian MPLM will remain an important element of the logistics system, being the only pressurized module designed for re-entry. Within the new scenario of anticipated Shuttle flights to MIR 1 during Space Station phase 1, MPLM remains a candidate for one or more missions to provide MIR 1 resupply capabilities and advanced ISSA hardware/procedures verification. Based on the concept of Flexible Carriers, Alenia Spazio is providing NASA with three MPLM flight units that can be configured according to the requirements of the Human-Tended Capability (HTC) and Permanent Human Capability (PHC) of the Space Station. Configurability will allow transportation of passive cargo only, or a combination of passive and cold cargo accommodated in R/F racks. Having developed and qualified the baseline configuration with respect to the worst enveloping condition, each unit could be easily configured to the passive or active version depending upon the

  8. Accounting for Slipping and Other False Negatives in Logistic Models of Student Learning

    ERIC Educational Resources Information Center

    MacLellan, Christopher J.; Liu, Ran; Koedinger, Kenneth R.

    2015-01-01

    Additive Factors Model (AFM) and Performance Factors Analysis (PFA) are two popular models of student learning that employ logistic regression to estimate parameters and predict performance. This is in contrast to Bayesian Knowledge Tracing (BKT) which uses a Hidden Markov Model formalism. While all three models tend to make similar predictions,…

  9. Complementary Log Regression for Sufficient-Cause Modeling of Epidemiologic Data.

    PubMed

    Lin, Jui-Hsiang; Lee, Wen-Chung

    2016-12-13

    The logistic regression model is the workhorse of epidemiological data analysis. The model helps to clarify the relationship between multiple exposures and a binary outcome. Logistic regression analysis is readily implemented using existing statistical software, and this has contributed to it becoming a routine procedure for epidemiologists. In this paper, the authors focus on a causal model which has recently received much attention from the epidemiologic community, namely, the sufficient-component cause model (causal-pie model). The authors show that the sufficient-component cause model is associated with a particular 'link' function: the complementary log link. In a complementary log regression, the exponentiated coefficient of a main-effect term corresponds to an adjusted 'peril ratio', and the coefficient of a cross-product term can be used directly to test for causal mechanistic interaction (sufficient-cause interaction). The authors provide detailed instructions on how to perform a complementary log regression using existing statistical software and use three datasets to illustrate the methodology. Complementary log regression is the model of choice for sufficient-cause analysis of binary outcomes. Its implementation is as easy as conventional logistic regression.

  10. XRA image segmentation using regression

    NASA Astrophysics Data System (ADS)

    Jin, Jesse S.

    1996-04-01

    Segmentation is an important step in image analysis, and thresholding is one of the most important approaches. Segmentation poses several difficulties, such as automatically selecting a threshold, dealing with intensity distortion, and removing noise. We have developed an adaptive segmentation scheme by applying the Central Limit Theorem in regression. A Gaussian regression is used to separate the distribution of background from foreground in a single-peak histogram. The separation helps to determine the threshold automatically. A small 3 by 3 window is applied and the mode of the local histogram is used to overcome noise. Thresholding is based on local weighting, where regression is used again for parameter estimation. A connectivity test is applied to the final results to remove impulse noise. We have applied the algorithm to x-ray angiogram images to extract brain arteries. The algorithm works well for single-peak distributions where there is no valley in the histogram. The regression provides a method to apply knowledge in clustering. Extending regression for multiple-level segmentation needs further investigation.

  11. Regressive evolution in Astyanax cavefish.

    PubMed

    Jeffery, William R

    2009-01-01

    A diverse group of animals, including members of most major phyla, have adapted to life in the perpetual darkness of caves. These animals are united by the convergence of two regressive phenotypes, loss of eyes and pigmentation. The mechanisms of regressive evolution are poorly understood. The teleost Astyanax mexicanus is of special significance in studies of regressive evolution in cave animals. This species includes an ancestral surface-dwelling form and many conspecific cave-dwelling forms, some of which have evolved their regressive phenotypes independently. Recent advances in Astyanax development and genetics have provided new information about how eyes and pigment are lost during cavefish evolution; namely, they have revealed some of the molecular and cellular mechanisms involved in trait modification, the number and identity of the underlying genes and mutations, the molecular basis of parallel evolution, and the evolutionary forces driving adaptation to the cave environment.

  12. A conditional likelihood approach for regression analysis using biomarkers measured with batch-specific error.

    PubMed

    Wang, Ming; Flanders, W Dana; Bostick, Roberd M; Long, Qi

    2012-12-20

    Measurement error is common in epidemiological and biomedical studies. When biomarkers are measured in batches or groups, measurement error is potentially correlated within each batch or group. In regression analysis, most existing methods are not applicable in the presence of batch-specific measurement error in predictors. We propose a robust conditional likelihood approach to account for batch-specific error in predictors when batch effect is additive and the predominant source of error, which requires no assumptions on the distribution of measurement error. Although a regression model with batch as a categorical covariable yields the same parameter estimates as the proposed conditional likelihood approach for linear regression, this result does not hold in general for all generalized linear models, in particular, logistic regression. Our simulation studies show that the conditional likelihood approach achieves better finite sample performance than the regression calibration approach or a naive approach without adjustment for measurement error. In the case of logistic regression, our proposed approach is shown to also outperform the regression approach with batch as a categorical covariate. In addition, we also examine a 'hybrid' approach combining the conditional likelihood method and the regression calibration method, which is shown in simulations to achieve good performance in the presence of both batch-specific and measurement-specific errors. We illustrate our method by using data from a colorectal adenoma study.

  13. A linear regression solution to the spatial autocorrelation problem

    NASA Astrophysics Data System (ADS)

    Griffith, Daniel A.

    The Moran Coefficient spatial autocorrelation index can be decomposed into orthogonal map pattern components. This decomposition relates it directly to standard linear regression, in which corresponding eigenvectors can be used as predictors. This paper reports comparative results between these linear regressions and their auto-Gaussian counterparts for the following georeferenced data sets: Columbus (Ohio) crime, Ottawa-Hull median family income, Toronto population density, southwest Ohio unemployment, Syracuse pediatric lead poisoning, Glasgow standard mortality rates, and a small remotely sensed image of the High Peak district. This methodology is extended to auto-logistic and auto-Poisson situations, with selected data analyses including percentage of urban population across Puerto Rico, and the frequency of SIDS cases across North Carolina. These data analytic results suggest that this approach to georeferenced data analysis offers considerable promise.
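
    For orientation, the index itself can be computed directly from its standard definition, I = (n / S0) * sum_ij w_ij (x_i - mean)(x_j - mean) / sum_i (x_i - mean)^2, where S0 is the sum of all weights. This is a minimal sketch on a hypothetical four-location chain; it does not reproduce the eigenvector decomposition that the paper builds on.

```python
def morans_i(values, weights):
    """Moran Coefficient for values observed at n locations with an n x n weight matrix."""
    n = len(values)
    mean = sum(values) / n
    dev = [v - mean for v in values]
    s0 = sum(sum(row) for row in weights)
    cross = sum(weights[i][j] * dev[i] * dev[j] for i in range(n) for j in range(n))
    total = sum(d * d for d in dev)
    return (n / s0) * cross / total

# Hypothetical map: four locations in a row, each adjacent to its immediate neighbours.
chain = [
    [0, 1, 0, 0],
    [1, 0, 1, 0],
    [0, 1, 0, 1],
    [0, 0, 1, 0],
]

smooth = morans_i([1.0, 2.0, 3.0, 4.0], chain)        # neighbours have similar values
alternating = morans_i([1.0, 4.0, 1.0, 4.0], chain)   # neighbours have opposite values
print(round(smooth, 4), round(alternating, 4))
```

    A positive value indicates that neighbouring locations tend to carry similar values, and a negative value indicates that they tend to alternate.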

  14. Cactus: An Introduction to Regression

    ERIC Educational Resources Information Center

    Hyde, Hartley

    2008-01-01

    When the author first used "VisiCalc," the author thought it a very useful tool when he had the formulas. But how could he design a spreadsheet if there was no known formula for the quantities he was trying to predict? A few months later, the author relates he learned to use multiple linear regression software and suddenly it all clicked into…

  15. Multiple Regression: A Leisurely Primer.

    ERIC Educational Resources Information Center

    Daniel, Larry G.; Onwuegbuzie, Anthony J.

    Multiple regression is a useful statistical technique when the researcher is considering situations in which variables of interest are theorized to be multiply caused. It may also be useful in those situations in which the researcher is interested in studies of predictability of phenomena of interest. This paper provides an introduction to…

  16. Weighting Regressions by Propensity Scores

    ERIC Educational Resources Information Center

    Freedman, David A.; Berk, Richard A.

    2008-01-01

    Regressions can be weighted by propensity scores in order to reduce bias. However, weighting is likely to increase random error in the estimates, and to bias the estimated standard errors downward, even when selection mechanisms are well understood. Moreover, in some cases, weighting will increase the bias in estimated causal parameters. If…
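
    A minimal sketch of the weighting idea, assuming inverse-probability weights of 1/p for treated units and 1/(1 - p) for controls, and assuming the propensity scores are known exactly. The data are hypothetical and noise-free, built so that the true treatment effect is 2; the function names are illustrative only. The weighted difference in means computed here equals the slope from a weighted regression of the outcome on the treatment indicator.

```python
def weighted_mean(values, weights):
    """Weighted average of values."""
    return sum(v * w for v, w in zip(values, weights)) / sum(weights)

def difference_in_means(treated, outcome, weights):
    """Weighted mean outcome of treated units minus that of control units."""
    t_y = [y for t, y in zip(treated, outcome) if t == 1]
    t_w = [w for t, w in zip(treated, weights) if t == 1]
    c_y = [y for t, y in zip(treated, outcome) if t == 0]
    c_w = [w for t, w in zip(treated, weights) if t == 0]
    return weighted_mean(t_y, t_w) - weighted_mean(c_y, c_w)

# Hypothetical units: covariate x drives both treatment assignment and outcome.
x       = [0, 0, 0, 0, 1, 1, 1, 1]
treated = [1, 0, 0, 0, 1, 1, 1, 0]
propensity = [0.25 if xi == 0 else 0.75 for xi in x]         # assumed known P(treated | x)
outcome = [2.0 * t + 10.0 * xi for t, xi in zip(treated, x)]  # true effect is 2

ipw = [1.0 / p if t == 1 else 1.0 / (1.0 - p) for t, p in zip(treated, propensity)]

naive = difference_in_means(treated, outcome, [1.0] * len(x))
adjusted = difference_in_means(treated, outcome, ipw)
print(naive, adjusted)
```

    In this noise-free setting the weights remove the confounding exactly. The abstract's caution still applies to real data: large weights on a few units inflate the variance of the estimate, and conventional standard errors can understate it.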

  17. Quantile Regression with Censored Data

    ERIC Educational Resources Information Center

    Lin, Guixian

    2009-01-01

    The Cox proportional hazards model and the accelerated failure time model are frequently used in survival data analysis. They are powerful, yet have limitations due to their model assumptions. Quantile regression offers a semiparametric approach to model data with possible heterogeneity. It is particularly powerful for censored responses, where the…

  18. Integrated Computer System of Management in Logistics

    NASA Astrophysics Data System (ADS)

    Chwesiuk, Krzysztof

    2011-06-01

    This paper aims at presenting a concept of an integrated computer system of management in logistics, particularly in supply and distribution chains. Consequently, the paper includes the basic idea of the concept of computer-based management in logistics and components of the system, such as CAM and CIM systems in production processes, and management systems for storage, materials flow, and for managing transport, forwarding and logistics companies. The platform which integrates computer-aided management systems is that of electronic data interchange.

  19. Optimal distributions for multiplex logistic networks.

    PubMed

    Solá Conde, Luis E; Used, Javier; Romance, Miguel

    2016-06-01

    This paper presents some mathematical models for distribution of goods in logistic networks based on spectral analysis of complex networks. Given a steady distribution of a finished product, some numerical algorithms are presented for computing the weights in a multiplex logistic network that reach the equilibrium dynamics with high convergence rate. As an application, the logistic networks of Germany and Spain are analyzed in terms of their convergence rates.

  20. Hierarchical Adaptive Regression Kernels for Regression with Functional Predictors

    PubMed Central

    Woodard, Dawn B.; Crainiceanu, Ciprian; Ruppert, David

    2013-01-01

    We propose a new method for regression using a parsimonious and scientifically interpretable representation of functional predictors. Our approach is designed for data that exhibit features such as spikes, dips, and plateaus whose frequency, location, size, and shape varies stochastically across subjects. We propose Bayesian inference of the joint functional and exposure models, and give a method for efficient computation. We contrast our approach with existing state-of-the-art methods for regression with functional predictors, and show that our method is more effective and efficient for data that include features occurring at varying locations. We apply our methodology to a large and complex dataset from the Sleep Heart Health Study, to quantify the association between sleep characteristics and health outcomes. Software and technical appendices are provided in online supplemental materials. PMID:24293988

  1. Logistics Reduction Technologies for Exploration Missions

    NASA Technical Reports Server (NTRS)

    Broyan, James L., Jr.; Ewert, Michael K.; Fink, Patrick W.

    2014-01-01

    Human exploration missions under study are limited by the launch mass capacity of existing and planned launch vehicles. The logistical mass of crew items is typically considered separate from the vehicle structure, habitat outfitting, and life support systems. Although mass is typically the focus of exploration missions, due to its strong impact on launch vehicle and habitable volume for the crew, logistics volume also needs to be considered. NASA's Advanced Exploration Systems (AES) Logistics Reduction and Repurposing (LRR) Project is developing six logistics technologies guided by a systems engineering cradle-to-grave approach to enable after-use crew items to augment vehicle systems. Specifically, AES LRR is investigating the direct reduction of clothing mass, the repurposing of logistical packaging, the use of autonomous logistics management technologies, the processing of spent crew items to benefit radiation shielding and water recovery, and the conversion of trash to propulsion gases. Reduction of mass has a corresponding and significant impact to logistical volume. The reduction of logistical volume can reduce the overall pressurized vehicle mass directly, or indirectly benefit the mission by allowing for an increase in habitable volume during the mission. The systematic implementation of these types of technologies will increase launch mass efficiency by enabling items to be used for secondary purposes and improve the habitability of the vehicle as mission durations increase. Early studies have shown that the use of advanced logistics technologies can save approximately 20 m³ of volume during transit alone for a six-person Mars conjunction class mission.

  2. Analysis of Jingdong Mall Logistics Distribution Model

    NASA Astrophysics Data System (ADS)

    Shao, Kang; Cheng, Feng

    In recent years, the development of electronic commerce in our country has accelerated. The role of logistics has been highlighted, and more and more electronic commerce enterprises are beginning to realize the importance of logistics in the success or failure of the enterprise. In this paper, the authors take Jingdong Mall as an example, perform a SWOT analysis of the current state of its self-built logistics system, identify the problems in Jingdong Mall's current logistics distribution, and give appropriate recommendations.

  3. Regression Verification Using Impact Summaries

    NASA Technical Reports Server (NTRS)

    Backes, John; Person, Suzette J.; Rungta, Neha; Thachuk, Oksana

    2013-01-01

    Regression verification techniques are used to prove equivalence of syntactically similar programs. Checking equivalence of large programs, however, can be computationally expensive. Existing regression verification techniques rely on abstraction and decomposition techniques to reduce the computational effort of checking equivalence of the entire program. These techniques are sound but not complete. In this work, we propose a novel approach to improve scalability of regression verification by classifying the program behaviors generated during symbolic execution as either impacted or unimpacted. Our technique uses a combination of static analysis and symbolic execution to generate summaries of impacted program behaviors. The impact summaries are then checked for equivalence using an off-the-shelf decision procedure. We prove that our approach is both sound and complete for sequential programs, with respect to the depth bound of symbolic execution. Our evaluation on a set of sequential C artifacts shows that reducing the size of the summaries can help reduce the cost of software equivalence checking. Various reduction, abstraction, and compositional techniques have been developed to help scale software verification techniques to industrial-sized systems. Although such techniques have greatly increased the size and complexity of systems that can be checked, analysis of large software systems remains costly. Regression analysis techniques, e.g., regression testing [16], regression model checking [22], and regression verification [19], restrict the scope of the analysis by leveraging the differences between program versions. These techniques are based on the idea that if code is checked early in development, then subsequent versions can be checked against a prior (checked) version, leveraging the results of the previous analysis to reduce analysis cost of the current version.
Regression verification addresses the problem of proving equivalence of closely related program

  4. Interaction Models for Functional Regression

    PubMed Central

    USSET, JOSEPH; STAICU, ANA-MARIA; MAITY, ARNAB

    2015-01-01

    A functional regression model with a scalar response and multiple functional predictors is proposed that accommodates two-way interactions in addition to their main effects. The proposed estimation procedure models the main effects using penalized regression splines, and the interaction effect by a tensor product basis. Extensions to generalized linear models and data observed on sparse grids or with measurement error are presented. A hypothesis testing procedure for the functional interaction effect is described. The proposed method can be easily implemented through existing software. Numerical studies show that fitting an additive model in the presence of interaction leads to both poor estimation performance and lost prediction power, while fitting an interaction model where there is in fact no interaction leads to negligible losses. The methodology is illustrated on the AneuRisk65 study data. PMID:26744549

  5. A regularization corrected score method for nonlinear regression models with covariate error.

    PubMed

    Zucker, David M; Gorfine, Malka; Li, Yi; Tadesse, Mahlet G; Spiegelman, Donna

    2013-03-01

    Many regression analyses involve explanatory variables that are measured with error, and failing to account for this error is well known to lead to biased point and interval estimates of the regression coefficients. We present here a new general method for adjusting for covariate error. Our method consists of an approximate version of the Stefanski-Nakamura corrected score approach, using the method of regularization to obtain an approximate solution of the relevant integral equation. We develop the theory in the setting of classical likelihood models; this setting covers, for example, linear regression, nonlinear regression, logistic regression, and Poisson regression. The method is extremely general in terms of the types of measurement error models covered, and is a functional method in the sense of not involving assumptions on the distribution of the true covariate. We discuss the theoretical properties of the method and present simulation results in the logistic regression setting (univariate and multivariate). For illustration, we apply the method to data from the Harvard Nurses' Health Study concerning the relationship between physical activity and breast cancer mortality in the period following a diagnosis of breast cancer.

  6. Astronomical Methods for Nonparametric Regression

    NASA Astrophysics Data System (ADS)

    Steinhardt, Charles L.; Jermyn, Adam

    2017-01-01

    I will discuss commonly used techniques for nonparametric regression in astronomy. We find that several of them, particularly running averages and running medians, are generically biased, asymmetric between dependent and independent variables, and perform poorly in recovering the underlying function, even when errors are present only in one variable. We then examine less-commonly used techniques such as Multivariate Adaptive Regressive Splines and Boosted Trees and find them superior in bias, asymmetry, and variance both theoretically and in practice under a wide range of numerical benchmarks. In this context the chief advantage of the common techniques is runtime, which even for large datasets is now measured in microseconds compared with milliseconds for the more statistically robust techniques. This points to a tradeoff between bias, variance, and computational resources which in recent years has shifted heavily in favor of the more advanced methods, primarily driven by Moore's Law. Along these lines, we also propose a new algorithm which has better overall statistical properties than all techniques examined thus far, at the cost of significantly worse runtime, in addition to providing guidance on choosing the nonparametric regression technique most suitable to any specific problem. We then examine the more general problem of errors in both variables and provide a new algorithm which performs well in most cases and lacks the clear asymmetry of existing non-parametric methods, which fail to account for errors in both variables.
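
    For readers unfamiliar with the "common techniques" named in this abstract, the following is a minimal sketch of a running median (or, by swapping the statistic, a running average) over points sorted by the independent variable. The function name `running_stat`, the centered window, and the shrinking of the window at the edges are illustrative assumptions, not the authors' implementation or benchmark setup.

```python
import statistics

def running_stat(x, y, window, stat=statistics.median):
    """Sort points by x, then apply `stat` to y over a centered
    sliding window of up to `window` points (smaller at the edges)."""
    pairs = sorted(zip(x, y))
    xs = [p[0] for p in pairs]
    ys = [p[1] for p in pairs]
    half = window // 2
    smoothed = []
    for i in range(len(ys)):
        lo = max(0, i - half)
        hi = min(len(ys), i + half + 1)
        smoothed.append(stat(ys[lo:hi]))
    return xs, smoothed

# Illustrative data with one outlier (y = 400).
xs, med = running_stat([3, 1, 2, 4, 5], [30, 10, 20, 400, 50], window=3)
_, avg = running_stat([3, 1, 2, 4, 5], [30, 10, 20, 400, 50], window=3,
                      stat=statistics.mean)
```

    Note that the smoothing treats x and y differently (points are ordered by x and only y is averaged), which is one way to see the asymmetry between dependent and independent variables that the abstract mentions.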

  7. Biomass Supply Logistics and Infrastructure

    SciTech Connect

    Sokhansanj, Shahabaddine

    2009-04-01

    Feedstock supply system encompasses numerous unit operations necessary to move lignocellulosic feedstock from the place where it is produced (in the field or on the stump) to the start of the conversion process (reactor throat) of the Biorefinery. These unit operations, which include collection, storage, preprocessing, handling, and transportation, represent one of the largest technical and logistics challenges to the emerging lignocellulosic biorefining industry. This chapter briefly reviews methods of estimating the quantities of biomass followed by harvesting and collection processes based on current practices on handling wet and dry forage materials. Storage and queuing are used to deal with seasonal harvest times, variable yields, and delivery schedules. Preprocessing can be as simple as grinding and formatting the biomass for increased bulk density or improved conversion efficiency, or it can be as complex as improving feedstock quality through fractionation, tissue separation, drying, blending, and densification. Handling and Transportation consists of using a variety of transport equipment (truck, train, ship) for moving the biomass from one point to another. The chapter also provides typical cost figures for harvest and processing of biomass.

  8. Biomass supply logistics and infrastructure.

    PubMed

    Sokhansanj, Shahabaddine; Hess, J Richard

    2009-01-01

    Feedstock supply system encompasses numerous unit operations necessary to move lignocellulosic feedstock from the place where it is produced (in the field or on the stump) to the start of the conversion process (reactor throat) of the biorefinery. These unit operations, which include collection, storage, preprocessing, handling, and transportation, represent one of the largest technical and logistics challenges to the emerging lignocellulosic biorefining industry. This chapter briefly reviews the methods of estimating the quantities of biomass, followed by harvesting and collection processes based on current practices on handling wet and dry forage materials. Storage and queuing are used to deal with seasonal harvest times, variable yields, and delivery schedules. Preprocessing can be as simple as grinding and formatting the biomass for increased bulk density or improved conversion efficiency, or it can be as complex as improving feedstock quality through fractionation, tissue separation, drying, blending, and densification. Handling and transportation consists of using a variety of transport equipment (truck, train, ship) for moving the biomass from one point to another. The chapter also provides typical cost figures for harvest and processing of biomass.

  9. FUTURE LOGISTICS AND OPERATIONAL ADAPTABILITY

    SciTech Connect

    Houck, Roger P.

    2009-10-01

    While we cannot predict the future, we can ascertain trends and examine them through the use of alternative futures methodologies and tools. From a logistics perspective, we know that many different futures are possible, all of which are obviously dependent on decisions we make in the present. As professional logisticians we are obligated to provide the field - our Soldiers - with our best professional opinion of what will result in success on the battlefield. Our view of the future should take history and contemporary conflict into account, but it must also consider that continuity with the past cannot be taken for granted. If we are too focused on past and current experience, then our vision of the future will be limited indeed. On the one hand, the future must be explained in language that does not defy common sense. On the other hand, the pace of change is such that we must conduct qualitative and quantitative trend analyses, forecasting, and explorative scenario development in ways that allow for significant breaks - or "shocks" - that may "change the game". We will need capabilities and solutions that are constantly evolving - and improving - to match the operational tempo of a radically changing threat environment. For those who provide quartermaster services, this article will briefly examine what this means from the perspective of creating what might be termed a preferred future.

  10. The Automated Logistics Element Planning System (ALEPS)

    NASA Technical Reports Server (NTRS)

    Schwaab, Douglas G.

    1991-01-01

    The design and functions of ALEPS (Automated Logistics Element Planning System) are described. ALEPS is a computer system that will automate planning and decision support for Space Station Freedom Logistical Elements (LEs) resupply and return operations. ALEPS provides data management, planning, analysis, monitoring, interfacing, and flight certification for support of LE flight load planning activities. The prototype ALEPS algorithm development is described.

  11. Visualizing the Logistic Map with a Microcontroller

    ERIC Educational Resources Information Center

    Serna, Juan D.; Joshi, Amitabh

    2012-01-01

    The logistic map is one of the simplest nonlinear dynamical systems that clearly exhibits the route to chaos. In this paper, we explore the evolution of the logistic map using an open-source microcontroller connected to an array of light-emitting diodes (LEDs). We divide the one-dimensional domain interval [0,1] into ten equal parts, an associate…
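
    As a minimal sketch of the idea in this abstract, the logistic map x_{k+1} = r x_k (1 - x_k) can be iterated and each iterate assigned to one of ten equal parts of [0,1]. The function names, the parameter value r = 3.2, and the 0-9 part index are illustrative assumptions; the abstract is truncated before it says how the parts are used, so no mapping to the LED hardware is implied here.

```python
def logistic_map(r, x0, n):
    """Return the first n iterates of x_{k+1} = r * x_k * (1 - x_k),
    starting from (and including) x0."""
    xs = [x0]
    for _ in range(n - 1):
        xs.append(r * xs[-1] * (1.0 - xs[-1]))
    return xs

def part_index(x, parts=10):
    """Map x in [0, 1] to the index (0 .. parts-1) of the
    equal-width part of the interval that contains it."""
    return min(int(x * parts), parts - 1)

# Hypothetical parameter choice: r = 3.2 lies in the period-2 regime.
orbit = logistic_map(r=3.2, x0=0.5, n=200)
visited = sorted({part_index(x) for x in orbit[-4:]})
print(visited)
```

    Raising r toward 4 makes the late iterates visit more and more parts, which is the route to chaos the abstract refers to.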

  12. Exploration Mission Benefits From Logistics Reduction Technologies

    NASA Technical Reports Server (NTRS)

    Broyan, James Lee, Jr.; Ewert, Michael K.; Schlesinger, Thilini

    2016-01-01

    Technologies that reduce logistical mass, volume, and the crew time dedicated to logistics management become more important as exploration missions extend further from the Earth. Even modest reductions in logistical mass can have a significant impact because it also reduces the packaging burden. NASA's Advanced Exploration Systems' Logistics Reduction Project is developing technologies that can directly reduce the mass and volume of crew clothing and metabolic waste collection. Also, cargo bags have been developed that can be reconfigured for crew outfitting, and trash processing technologies are under development to increase habitable volume and improve protection against solar storm events. Additionally, Mars class missions are sufficiently distant that even logistics management without resupply can be problematic due to the communication time delay with Earth. Although exploration vehicles are launched with all consumables and logistics in a defined configuration, the configuration continually changes as the mission progresses. Traditionally significant ground and crew time has been required to understand the evolving configuration and to help locate misplaced items. For key mission events and unplanned contingencies, the crew will not be able to rely on the ground for logistics localization assistance. NASA has been developing a radio-frequency-identification autonomous logistics management system to reduce crew time for general inventory and enable greater crew self-response to unplanned events when a wide range of items may need to be located in a very short time period. This paper provides a status of the technologies being developed and their mission benefits for exploration missions.

  13. The Automated Logistics Element Planning System (ALEPS)

    NASA Technical Reports Server (NTRS)

    Schwaab, Douglas G.

    1992-01-01

    ALEPS, which is being developed to provide the SSF program with a computer system to automate logistics resupply/return cargo load planning and verification, is presented. ALEPS will make it possible to simultaneously optimize both the resupply flight load plan and the return flight reload plan for any of the logistics carriers. In the verification mode ALEPS will support the carrier's flight readiness reviews and control proper execution of the approved plans. It will also support the SSF inventory management system by providing electronic block updates to the inventory database on the cargo arriving at or departing the station aboard a logistics carrier. A prototype drawer packing algorithm is described which is capable of generating solutions for 3D packing of cargo items into a logistics carrier storage accommodation. It is concluded that ALEPS will provide the capability to generate and modify optimized loading plans for the logistics elements fleet.

  14. Regression analysis of cytopathological data

    SciTech Connect

    Whittemore, A.S.; McLarty, J.W.; Fortson, N.; Anderson, K.

    1982-12-01

    Epithelial cells from the human body are frequently labelled according to one of several ordered levels of abnormality, ranging from normal to malignant. The label of the most abnormal cell in a specimen determines the score for the specimen. This paper presents a model for the regression of specimen scores against continuous and discrete variables, as in host exposure to carcinogens. Application to data and tests for adequacy of model fit are illustrated using sputum specimens obtained from a cohort of former asbestos workers.

  15. Logistics Modeling for Lunar Exploration Systems

    NASA Technical Reports Server (NTRS)

    Andraschko, Mark R.; Merrill, R. Gabe; Earle, Kevin D.

    2008-01-01

    The extensive logistics required to support extended crewed operations in space make effective modeling of logistics requirements and deployment critical to predicting the behavior of human lunar exploration systems. This paper discusses the software that has been developed as part of the Campaign Manifest Analysis Tool in support of strategic analysis activities under the Constellation Architecture Team - Lunar. The described logistics module enables definition of logistics requirements across multiple surface locations and allows for the transfer of logistics between those locations. A key feature of the module is the loading algorithm that is used to efficiently load logistics by type into carriers and then onto landers. Attention is given to the capabilities and limitations of this loading algorithm, particularly with regard to surface transfers. These capabilities are described within the context of the object-oriented software implementation, with details provided on the applicability of using this approach to model other human exploration scenarios. Some challenges of incorporating probabilistics into this type of logistics analysis model are discussed at a high level.

  16. ISS Logistics Hardware Disposition and Metrics Validation

    NASA Technical Reports Server (NTRS)

    Rogers, Toneka R.

    2010-01-01

    I was assigned to the Logistics Division of the International Space Station (ISS)/Spacecraft Processing Directorate. The Division consists of eight NASA engineers and specialists that oversee the logistics portion of the Checkout, Assembly, and Payload Processing Services (CAPPS) contract. Boeing, their sub-contractors and the Boeing Prime contract out of Johnson Space Center, provide the Integrated Logistics Support for the ISS activities at Kennedy Space Center. Essentially they ensure that spares are available to support flight hardware processing and the associated ground support equipment (GSE). Boeing maintains a Depot for electrical, mechanical and structural modifications and/or repair capability as required. My assigned task was to learn project management techniques utilized by NASA and its contractors to provide an efficient and effective logistics support infrastructure to the ISS program. Within the Space Station Processing Facility (SSPF) I was exposed to Logistics support components, such as the NASA Spacecraft Services Depot (NSSD) capabilities, Mission Processing tools, techniques and Warehouse support issues, required for integrating Space Station elements at the Kennedy Space Center. I also supported the identification of near-term ISS Hardware and Ground Support Equipment (GSE) candidates for excessing/disposition prior to October 2010; and the validation of several Logistics Metrics used by the contractor to measure logistics support effectiveness.

  17. Exploration Mission Benefits From Logistics Reduction Technologies

    NASA Technical Reports Server (NTRS)

    Broyan, James Lee, Jr.; Schlesinger, Thilini; Ewert, Michael K.

    2016-01-01

    Technologies that reduce logistical mass, volume, and the crew time dedicated to logistics management become more important as exploration missions extend further from the Earth. Even modest reductions in logistical mass can have a significant impact because it also reduces the packing burden. NASA's Advanced Exploration Systems' Logistics Reduction Project is developing technologies that can directly reduce the mass and volume of crew clothing and metabolic waste collection. Also, cargo bags have been developed that can be reconfigured for crew outfitting, and trash processing technologies to increase habitable volume and improve protection against solar storm events are under development. Additionally, Mars class missions are sufficiently distant that even logistics management without resupply can be problematic due to the communication time delay with Earth. Although exploration vehicles are launched with all consumables and logistics in a defined configuration, the configuration continually changes as the mission progresses. Traditionally significant ground and crew time has been required to understand the evolving configuration and locate misplaced items. For key mission events and unplanned contingencies, the crew will not be able to rely on the ground for logistics localization assistance. NASA has been developing a radio frequency identification autonomous logistics management system to reduce crew time for general inventory and enable greater crew self-response to unplanned events when a wide range of items may need to be located in a very short time period. This paper provides a status of the technologies being developed and their mission benefits for exploration missions.

  18. Multiatlas segmentation as nonparametric regression.

    PubMed

    Awate, Suyash P; Whitaker, Ross T

    2014-09-01

    This paper proposes a novel theoretical framework to model and analyze the statistical characteristics of a wide range of segmentation methods that incorporate a database of label maps or atlases; such methods are termed as label fusion or multiatlas segmentation. We model these multiatlas segmentation problems as nonparametric regression problems in the high-dimensional space of image patches. We analyze the nonparametric estimator's convergence behavior that characterizes expected segmentation error as a function of the size of the multiatlas database. We show that this error has an analytic form involving several parameters that are fundamental to the specific segmentation problem (determined by the chosen anatomical structure, imaging modality, registration algorithm, and label-fusion algorithm). We describe how to estimate these parameters and show that several human anatomical structures exhibit the trends modeled analytically. We use these parameter estimates to optimize the regression estimator. We show that the expected error for large database sizes is well predicted by models learned on small databases. Thus, a few expert segmentations can help predict the database sizes required to keep the expected error below a specified tolerance level. Such cost-benefit analysis is crucial for deploying clinical multiatlas segmentation systems.

  19. Multiatlas Segmentation as Nonparametric Regression

    PubMed Central

    Awate, Suyash P.; Whitaker, Ross T.

    2015-01-01

    This paper proposes a novel theoretical framework to model and analyze the statistical characteristics of a wide range of segmentation methods that incorporate a database of label maps or atlases; such methods are termed as label fusion or multiatlas segmentation. We model these multiatlas segmentation problems as nonparametric regression problems in the high-dimensional space of image patches. We analyze the nonparametric estimator’s convergence behavior that characterizes expected segmentation error as a function of the size of the multiatlas database. We show that this error has an analytic form involving several parameters that are fundamental to the specific segmentation problem (determined by the chosen anatomical structure, imaging modality, registration algorithm, and label-fusion algorithm). We describe how to estimate these parameters and show that several human anatomical structures exhibit the trends modeled analytically. We use these parameter estimates to optimize the regression estimator. We show that the expected error for large database sizes is well predicted by models learned on small databases. Thus, a few expert segmentations can help predict the database sizes required to keep the expected error below a specified tolerance level. Such cost-benefit analysis is crucial for deploying clinical multiatlas segmentation systems. PMID:24802528

  20. Recognition of caudal regression syndrome.

    PubMed

    Boulas, Mari M

    2009-04-01

    Caudal regression syndrome, also referred to as caudal dysplasia and sacral agenesis syndrome, is a rare congenital malformation characterized by varying degrees of developmental failure early in gestation. It involves the lower extremities, the lumbar and coccygeal vertebrae, and corresponding segments of the spinal cord. This is a rare disorder, and true pathogenesis is unclear. The etiology is thought to be related to maternal diabetes, genetic predisposition, and vascular hypoperfusion, but no true causative factor has been determined. Fetal diagnostic tools allow for early recognition of the syndrome, and careful examination of the newborn is essential to determine the extent of the disorder. Associated organ system dysfunction depends on the severity of the disease. Related defects are structural, and systematic problems including respiratory, cardiac, gastrointestinal, urinary, orthopedic, and neurologic can be present in varying degrees of severity and in different combinations. A multidisciplinary approach to management is crucial. Because the primary pathology is irreversible, treatment is only supportive.

  1. Practical Session: Multiple Linear Regression

    NASA Astrophysics Data System (ADS)

    Clausel, M.; Grégoire, G.

    2014-12-01

    Three exercises are proposed to illustrate the simple linear regression. The first one investigates the influence of several factors on atmospheric pollution. It has been proposed by D. Chessel and A.B. Dufour in Lyon 1 (see Sect. 6 of http://pbil.univ-lyon1.fr/R/pdf/tdr33.pdf) and is based on data coming from 20 U.S. cities. Exercise 2 is an introduction to model selection whereas Exercise 3 provides a first example of analysis of variance. Exercises 2 and 3 have been proposed by A. Dalalyan at ENPC (see Exercises 2 and 3 of http://certis.enpc.fr/~dalalyan/Download/TP_ENPC_5.pdf).

  2. Featured Partner: Saddle Creek Logistics Services

    EPA Pesticide Factsheets

    This EPA fact sheet spotlights Saddle Creek Logistics as a SmartWay partner committed to sustainability in reducing greenhouse gas emissions and air pollution caused by freight transportation, partly by growing its compressed natural gas (CNG) vehicles for

  3. Lumbar herniated disc: spontaneous regression

    PubMed Central

    Yüksel, Kasım Zafer

    2017-01-01

    Background Low back pain is a frequent condition that results in substantial disability and causes admission of patients to neurosurgery clinics. This study aimed to evaluate and present the therapeutic outcomes in lumbar disc hernia (LDH) patients treated by means of a conservative approach, consisting of bed rest and medical therapy. Methods This retrospective cohort was carried out in the neurosurgery departments of hospitals in Kahramanmaraş city and 23 patients diagnosed with LDH at the levels of L3−L4, L4−L5 or L5−S1 were enrolled. Results The average age was 38.4 ± 8.0 and the chief complaint was low back pain and sciatica radiating to one or both lower extremities. Conservative treatment was administered. Neurological examination findings, durations of treatment and intervals until symptomatic recovery were recorded. Laségue tests and neurosensory examination revealed that mild neurological deficits existed in 16 of our patients. Previously, 5 patients had received physiotherapy and 7 patients had been on medical treatment. The number of patients with LDH at the level of L3−L4, L4−L5, and L5−S1 were 1, 13, and 9, respectively. All patients reported that they had benefited from medical treatment and bed rest, and radiologic improvement was observed simultaneously on MRI scans. The average duration until symptomatic recovery and/or regression of LDH symptoms was 13.6 ± 5.4 months (range: 5−22). Conclusions It should be kept in mind that lumbar disc hernias could regress with medical treatment and rest without surgery, and there should be an awareness that these patients could recover radiologically. This condition must be taken into account during decision making for surgical intervention in LDH patients devoid of indications for emergent surgery. PMID:28119770

  4. Sustainment and Logistics in Better Buying Power

    DTIC Science & Technology

    2015-07-01

    37 Defense AT&L: July–August 2015 Sustainment and Logistics in Better Buying Power David J. Berteau Berteau is Assistant Secretary of Defense...Better Buying Power (BBP) in 2010, its key sustainment initiative has focused on Performance-Based Logistics (PBL). With the updated guidance...for BBP 3.0 issued April 9, it is worth expanding the view of these updated initiatives through the sustainment prism. This article finds that

  5. Logistics: Supply Based or Distribution Based?

    DTIC Science & Technology

    2007-04-01

    supply chain management, the need to adapt systems as conditions and capabilities change may not be clear or the root causes of problems may not be...logistics-system and supply-chain design and management positions. They need to become adept at integrating the full range of options available to...distribution-based and supply-based designs, the Army, in conjunction with its joint supply-chain partners, should seek optimal, balanced logistics system designs that it can adapt quickly to changing conditions.

  6. Logistics Handbook for Strategic Mobility Planning

    DTIC Science & Technology

    1992-09-01

    AD-A257 075 MTMCTEA Reference 92-700-2 LOGISTICS HANDBOOK FOR STRATEGIC MOBILITY PLANNING September 1992 MILITARY TRAFFIC MANAGEMENT...Military Traffic Management Command Transportation Engineering Agency (MTMCTEA) transportation and transportability studies. D. REFERENCES Source references

  7. Estimating Contraceptive Prevalence Using Logistics Data for Short-Acting Methods: Analysis Across 30 Countries

    PubMed Central

    Cunningham, Marc; Brown, Niquelle; Sacher, Suzy; Hatch, Benjamin; Inglis, Andrew; Aronovich, Dana

    2015-01-01

    Background: Contraceptive prevalence rate (CPR) is a vital indicator used by country governments, international donors, and other stakeholders for measuring progress in family planning programs against country targets and global initiatives as well as for estimating health outcomes. Because of the need for more frequent CPR estimates than population-based surveys currently provide, alternative approaches for estimating CPRs are being explored, including using contraceptive logistics data. Methods: Using data from the Demographic and Health Surveys (DHS) in 30 countries, population data from the United States Census Bureau International Database, and logistics data from the Procurement Planning and Monitoring Report (PPMR) and the Pipeline Monitoring and Procurement Planning System (PipeLine), we developed and evaluated 3 models to generate country-level, public-sector contraceptive prevalence estimates for injectable contraceptives, oral contraceptives, and male condoms. Models included: direct estimation through existing couple-years of protection (CYP) conversion factors, bivariate linear regression, and multivariate linear regression. Model evaluation consisted of comparing the referent DHS prevalence rates for each short-acting method with the model-generated prevalence rate using multiple metrics, including mean absolute error and proportion of countries where the modeled prevalence rate for each method was within 1, 2, or 5 percentage points of the DHS referent value. Results: For the methods studied, family planning use estimates from public-sector logistics data were correlated with those from the DHS, validating the quality and accuracy of current public-sector logistics data. Logistics data for oral and injectable contraceptives were significantly associated (P<.05) with the referent DHS values for both bivariate and multivariate models. For condoms, however, that association was only significant for the bivariate model. With the exception of the CYP
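
    The two evaluation metrics named in this abstract (mean absolute error, and the proportion of countries within 1, 2, or 5 percentage points of the DHS referent) can be sketched as follows. This is a minimal illustration, not the authors' code: the function names and the four prevalence values are hypothetical and do not come from the study.

```python
def mean_absolute_error(modeled, referent):
    """Average absolute gap, in percentage points, between paired estimates."""
    if len(modeled) != len(referent) or not modeled:
        raise ValueError("inputs must be non-empty and the same length")
    return sum(abs(m - r) for m, r in zip(modeled, referent)) / len(modeled)


def share_within(modeled, referent, tolerance):
    """Proportion of countries whose modeled rate is within `tolerance` points."""
    hits = sum(1 for m, r in zip(modeled, referent) if abs(m - r) <= tolerance)
    return hits / len(modeled)


# Hypothetical prevalence rates (percent) for four made-up countries.
modeled = [10.0, 14.5, 7.0, 21.0]
referent = [11.0, 12.0, 7.5, 27.0]

mae = mean_absolute_error(modeled, referent)
within = {k: share_within(modeled, referent, k) for k in (1, 2, 5)}
```

    Reporting the within-k shares alongside the mean absolute error shows whether a low average error hides a few countries with large misses.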

  8. Logistics Reduction Technologies for Exploration Missions

    NASA Technical Reports Server (NTRS)

    Broyan, James L., Jr.; Ewert, Michael K.; Fink, Patrick W.

    2014-01-01

    Human exploration missions under study are very limited by the launch mass capacity of existing and planned vehicles. The logistical mass of crew items is typically considered separate from the vehicle structure, habitat outfitting, and life support systems. Consequently, crew item logistical mass is typically competing with vehicle systems for mass allocation. NASA's Advanced Exploration Systems (AES) Logistics Reduction and Repurposing (LRR) Project is developing five logistics technologies guided by a systems engineering cradle-to-grave approach to enable used crew items to augment vehicle systems. Specifically, AES LRR is investigating the direct reduction of clothing mass, the repurposing of logistical packaging, the use of autonomous logistics management technologies, the processing of spent crew items to benefit radiation shielding and water recovery, and the conversion of trash to propulsion gases. The systematic implementation of these types of technologies will increase launch mass efficiency by enabling items to be used for secondary purposes and improve the habitability of the vehicle as the mission duration increases. This paper provides a description and the challenges of the five technologies under development and the estimated overall mission benefits of each technology.

  9. ISS Update: Logistics Reduction and Repurposing (Part 1)

    NASA Video Gallery

    Public Affairs Officer Brandi Dean interviews Sarah Shull, Deputy Project Manager Logistics Reduction and Repurposing. Shull, who is with the Advanced Exploration Systems, discusses the Logistics t...

  10. ISS Update: Logistics Reduction and Repurposing (Part 2)

    NASA Video Gallery

    Public Affairs Officer Brandi Dean interviews Sarah Shull, Deputy Project Manager Logistics Reduction and Repurposing. Shull, who is with the Advanced Exploration Systems, discusses the Logistics t...

  11. Genetics Home Reference: caudal regression syndrome

    MedlinePlus

    ... of a genetic condition? Genetic and Rare Diseases Information Center Frequency Caudal regression syndrome is estimated to occur in 1 to ... parts of the skeleton, gastrointestinal system, and genitourinary ... caudal regression syndrome results from the presence of an abnormal ...

  12. Semiparametric regression during 2003–2007*

    PubMed Central

    Ruppert, David; Wand, M.P.; Carroll, Raymond J.

    2010-01-01

    Semiparametric regression is a fusion between parametric regression and nonparametric regression that integrates low-rank penalized splines, mixed model and hierarchical Bayesian methodology – thus allowing more streamlined handling of longitudinal and spatial correlation. We review progress in the field over the five-year period between 2003 and 2007. We find semiparametric regression to be a vibrant field with substantial involvement and activity, continual enhancement and widespread application. PMID:20305800

  13. Using GA-Ridge regression to select hydro-geological parameters influencing groundwater pollution vulnerability.

    PubMed

    Ahn, Jae Joon; Kim, Young Min; Yoo, Keunje; Park, Joonhong; Oh, Kyong Joo

    2012-11-01

    For groundwater conservation and management, it is important to accurately assess groundwater pollution vulnerability. This study proposed an integrated model using ridge regression and a genetic algorithm (GA) to effectively select the major hydro-geological parameters influencing groundwater pollution vulnerability in an aquifer. The GA-Ridge regression method determined that depth to water, net recharge, topography, and the impact of vadose zone media were the hydro-geological parameters that influenced trichloroethene pollution vulnerability in a Korean aquifer. When using these selected hydro-geological parameters, the accuracy was improved for various statistical nonlinear and artificial intelligence (AI) techniques, such as multinomial logistic regression, decision trees, artificial neural networks, and case-based reasoning. These results provide a proof of concept that the GA-Ridge regression is effective at determining influential hydro-geological parameters for the pollution vulnerability of an aquifer, and in turn, improves the AI performance in assessing groundwater pollution vulnerability.

  14. Bayesian Unimodal Density Regression for Causal Inference

    ERIC Educational Resources Information Center

    Karabatsos, George; Walker, Stephen G.

    2011-01-01

    Karabatsos and Walker (2011) introduced a new Bayesian nonparametric (BNP) regression model. Through analyses of real and simulated data, they showed that the BNP regression model outperforms other parametric and nonparametric regression models of common use, in terms of predictive accuracy of the outcome (dependent) variable. The other,…

  15. Developmental Regression in Autism Spectrum Disorders

    ERIC Educational Resources Information Center

    Rogers, Sally J.

    2004-01-01

    The occurrence of developmental regression in autism is one of the more puzzling features of this disorder. Although several studies have documented the validity of parental reports of regression using home videos, accumulating data suggest that most children who demonstrate regression also demonstrated previous, subtle, developmental differences.…

  16. Regression Analysis by Example. 5th Edition

    ERIC Educational Resources Information Center

    Chatterjee, Samprit; Hadi, Ali S.

    2012-01-01

    Regression analysis is a conceptually simple method for investigating relationships among variables. Carrying out a successful application of regression analysis, however, requires a balance of theoretical results, empirical rules, and subjective judgment. "Regression Analysis by Example, Fifth Edition" has been expanded and thoroughly…

  17. Synthesizing Regression Results: A Factored Likelihood Method

    ERIC Educational Resources Information Center

    Wu, Meng-Jia; Becker, Betsy Jane

    2013-01-01

    Regression methods are widely used by researchers in many fields, yet methods for synthesizing regression results are scarce. This study proposes using a factored likelihood method, originally developed to handle missing data, to appropriately synthesize regression models involving different predictors. This method uses the correlations reported…

  18. Streamflow forecasting using functional regression

    NASA Astrophysics Data System (ADS)

    Masselot, Pierre; Dabo-Niang, Sophie; Chebana, Fateh; Ouarda, Taha B. M. J.

    2016-07-01

    Streamflow, as a natural phenomenon, is continuous in time and so are the meteorological variables which influence its variability. In practice, it can be of interest to forecast the whole flow curve instead of points (daily or hourly). To this end, this paper introduces functional linear models and adapts them to hydrological forecasting. More precisely, functional linear models are regression models based on curves instead of single values. They make it possible to consider the whole process instead of a limited number of time points or features. We apply these models to analyse the flow volume and the whole streamflow curve during a given period by using precipitation curves. The functional model is shown to lead to encouraging results. The potential of functional linear models to detect special features that would have been hard to see otherwise is pointed out. The functional model is also compared to the artificial neural network approach and the advantages and disadvantages of both models are discussed. Finally, future research directions involving the functional model in hydrology are presented.

  19. Survival analysis and Cox regression.

    PubMed

    Benítez-Parejo, N; Rodríguez del Águila, M M; Pérez-Vicente, S

    2011-01-01

    The data provided by clinical trials are often expressed in terms of survival. The analysis of survival comprises a series of statistical analytical techniques in which the measurements analysed represent the time elapsed between a given exposure and the outcome of a certain event. Despite the name of these techniques, the outcome in question does not necessarily have to be either survival or death, and may be healing versus no healing, relief versus pain, complication versus no complication, relapse versus no relapse, etc. The present article describes the analysis of survival from both a descriptive perspective, based on the Kaplan-Meier estimation method, and in terms of bivariate comparisons using the log-rank statistic. Likewise, a description is provided of the Cox regression models for the study of risk factors or covariables associated with the probability of survival. These models are defined in both simple and multiple forms, and a description is provided of how they are calculated and how the postulates for application are checked, accompanied by illustrative examples with the free software application R.
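
    The Kaplan-Meier estimator mentioned in this abstract can be sketched in a few lines. The article itself works in R; the version below is a plain-Python illustration of the product-limit formula, and the function name and the five follow-up records are hypothetical.

```python
def kaplan_meier(times, events):
    """Return [(t, S(t))] at each distinct time where an event was observed.

    times  -- follow-up time for each subject
    events -- 1 if the event was observed at that time, 0 if censored
    """
    data = sorted(zip(times, events))
    at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = 0
        leaving = 0
        # Group all subjects sharing the same time t.
        while i < len(data) and data[i][0] == t:
            deaths += data[i][1]
            leaving += 1
            i += 1
        if deaths:
            # Product-limit step: multiply by the share surviving past t.
            survival *= 1 - deaths / at_risk
            curve.append((t, survival))
        # Both events and censored subjects leave the risk set after t.
        at_risk -= leaving
    return curve


# Hypothetical follow-up data: times in months, 1 = event, 0 = censored.
times = [2, 3, 3, 5, 8]
events = [1, 1, 0, 1, 0]
curve = kaplan_meier(times, events)
```

    Censored subjects never lower the curve directly; they only shrink the risk set used at later event times.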

  20. Estimating equivalence with quantile regression

    USGS Publications Warehouse

    Cade, B.S.

    2011-01-01

    Equivalence testing and corresponding confidence interval estimates are used to provide more enlightened statistical statements about parameter estimates by relating them to intervals of effect sizes deemed to be of scientific or practical importance rather than just to an effect size of zero. Equivalence tests and confidence interval estimates are based on a null hypothesis that a parameter estimate is either outside (inequivalence hypothesis) or inside (equivalence hypothesis) an equivalence region, depending on the question of interest and assignment of risk. The former approach, often referred to as bioequivalence testing, is often used in regulatory settings because it reverses the burden of proof compared to a standard test of significance, following a precautionary principle for environmental protection. Unfortunately, many applications of equivalence testing focus on establishing average equivalence by estimating differences in means of distributions that do not have homogeneous variances. I discuss how to compare equivalence across quantiles of distributions using confidence intervals on quantile regression estimates that detect differences in heterogeneous distributions missed by focusing on means. I used one-tailed confidence intervals based on inequivalence hypotheses in a two-group treatment-control design for estimating bioequivalence of arsenic concentrations in soils at an old ammunition testing site and bioequivalence of vegetation biomass at a reclaimed mining site. Two-tailed confidence intervals based both on inequivalence and equivalence hypotheses were used to examine quantile equivalence for negligible trends over time for a continuous exponential model of amphibian abundance. © 2011 by the Ecological Society of America.
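
    The idea of relating a confidence interval to an equivalence region can be sketched with a generic interval-inclusion rule. This is an illustrative decision rule under stated assumptions, not the author's procedure: the function name, the three labels, and the example bounds are hypothetical, and the confidence interval is assumed to have been computed elsewhere (for example from a quantile regression fit).

```python
def classify_equivalence(ci_low, ci_high, region_low, region_high):
    """Compare a confidence interval with an equivalence region.

    Returns "equivalent" if the whole interval lies inside the region,
    "not equivalent" if it lies wholly outside, else "inconclusive".
    """
    if ci_low > ci_high or region_low > region_high:
        raise ValueError("lower bounds must not exceed upper bounds")
    if region_low <= ci_low and ci_high <= region_high:
        return "equivalent"
    if ci_high < region_low or ci_low > region_high:
        return "not equivalent"
    return "inconclusive"


# Hypothetical interval for a treatment-control difference at one quantile,
# judged against a made-up equivalence region of -1.0 to 1.0.
verdict = classify_equivalence(-0.4, 0.7, -1.0, 1.0)
```

    Applying the same check at several quantiles, each with its own interval, is one way to see whether groups differ in the tails even when they look equivalent in the middle of the distribution.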

  1. NASA Space Exploration Logistics Workshop Proceedings

    NASA Technical Reports Server (NTRS)

    de Weck, Olivier; Evans, William A.; Parrish, Joe; James, Sarah

    2006-01-01

    As NASA has embarked on a new Vision for Space Exploration, there is new energy and focus around the area of manned space exploration. These activities encompass the design of new vehicles such as the Crew Exploration Vehicle (CEV) and Crew Launch Vehicle (CLV) and the identification of commercial opportunities for space transportation services, as well as continued operations of the Space Shuttle and the International Space Station. Reaching the Moon and eventually Mars with a mix of both robotic and human explorers for short-term missions is a formidable challenge in itself. How to achieve this in a safe, efficient and long-term sustainable way is yet another question. The challenge is not only one of vehicle design, launch, and operations but also one of space logistics. Oftentimes, logistical issues are not given enough consideration upfront, in relation to the large share of operating budgets they consume. In this context, a group of 54 experts in space logistics met for a two-day workshop to discuss the following key questions: 1. What is the current state of the art in space logistics, in terms of architectures, concepts, technologies as well as enabling processes? 2. What are the main challenges for space logistics for future human exploration of the Moon and Mars, at the intersection of engineering and space operations? 3. What lessons can be drawn from past successes and failures in human space flight logistics? 4. What lessons and connections do we see from terrestrial analogies as well as activities in other areas, such as U.S. military logistics? 5. What key advances are required to enable long-term success in the context of a future interplanetary supply chain? These proceedings summarize the outcomes of the workshop, reference particular presentations, panels and breakout sessions, and record specific observations that should help guide future efforts.

  2. Logistics Lessons Learned in NASA Space Flight

    NASA Technical Reports Server (NTRS)

    Evans, William A.; DeWeck, Olivier; Laufer, Deanna; Shull, Sarah

    2006-01-01

    The Vision for Space Exploration sets out a number of goals, involving both strategic and tactical objectives. These include returning the Space Shuttle to flight, completing the International Space Station, and conducting human expeditions to the Moon by 2020. Each of these goals has profound logistics implications. In the consideration of these objectives, a need for a study on NASA logistics lessons learned was recognized. The study endeavors to identify both needs for space exploration and challenges in the development of past logistics architectures, as well as in the design of space systems. This study may also be appropriately applied as guidance in the development of an integrated logistics architecture for future human missions to the Moon and Mars. This report first summarizes current logistics practices for the Space Shuttle Program (SSP) and the International Space Station (ISS) and examines the practices of manifesting, stowage, inventory tracking, waste disposal, and return logistics. The key findings of this examination are that while the current practices do have many positive aspects, there are also several shortcomings. These shortcomings include a high level of excess complexity, redundancy of information/lack of a common database, and a large human-in-the-loop component. Later sections of this report describe the methodology and results of our work to systematically gather logistics lessons learned from past and current human spaceflight programs as well as validating these lessons through a survey of the opinions of current space logisticians. To consider the perspectives on logistics lessons, we searched several sources within NASA, including organizations with direct and indirect connections with the system flow in mission planning. We utilized crew debriefs, the John Commonsense lessons repository for the JSC Mission Operations Directorate, and the Skylab Lessons Learned. Additionally, we searched the public version of the Lessons Learned

  3. Humanitarian response: improving logistics to save lives.

    PubMed

    McCoy, Jessica

    2008-01-01

    Each year, millions of people worldwide are affected by disasters, underscoring the importance of effective relief efforts. Many highly visible disaster responses have been inefficient and ineffective. Humanitarian agencies typically play a key role in disaster response (eg, procuring and distributing relief items to an affected population, assisting with evacuation, providing healthcare, assisting in the development of long-term shelter), and thus their efficiency is critical for a successful disaster response. The field of disaster and emergency response modeling is well established, but the application of such techniques to humanitarian logistics is relatively recent. This article surveys models of humanitarian response logistics and identifies promising opportunities for future work. Existing models analyze a variety of preparation and response decisions (eg, warehouse location and the distribution of relief supplies), consider both natural and manmade disasters, and typically seek to minimize cost or unmet demand. Opportunities to enhance the logistics of humanitarian response include the adaptation of models developed for general disaster response; the use of existing models, techniques, and insights from the literature on commercial supply chain management; the development of working partnerships between humanitarian aid organizations and private companies with expertise in logistics; and the consideration of behavioral factors relevant to a response. Implementable, realistic models that support the logistics of humanitarian relief can improve the preparation for and the response to disasters, which in turn can save lives.

  4. Reverse logistics in the construction industry.

    PubMed

    Hosseini, M Reza; Rameezdeen, Raufdeen; Chileshe, Nicholas; Lehmann, Steffen

    2015-06-01

    Reverse logistics in construction refers to the movement of products and materials from salvaged buildings to a new construction site. While there is a plethora of studies looking at various aspects of the reverse logistics chain, there is no systematic review of literature on this important subject as applied to the construction industry. Therefore, the objective of this study is to integrate the fragmented body of knowledge on reverse logistics in construction, with the aim of promoting the concept among industry stakeholders and the wider construction community. Through a qualitative meta-analysis, the study synthesises the findings of previous studies and presents some actions needed by industry stakeholders to promote this concept within the real-life context. First, the trend of research and terminology related to reverse logistics is introduced. Second, the study identifies the main advantages of and barriers to reverse logistics in construction while providing some suggestions to harness the advantages and mitigate the barriers. Finally, it provides a future research direction based on the review.

  5. Developmental regression in autism spectrum disorder.

    PubMed

    Al Backer, Nouf Backer

    2015-01-01

    The occurrence of developmental regression in autism spectrum disorder (ASD) is one of the most puzzling phenomena of this disorder. Little is known about the nature and mechanism of developmental regression in ASD. About one-third of young children with ASD lose some skills during the preschool period, usually speech, but sometimes nonverbal communication, social or play skills are also affected. There is a lot of evidence suggesting that most children who demonstrate regression also had previous, subtle, developmental differences. It is difficult to predict the prognosis of autistic children with developmental regression. It seems that the earlier development of social, language, and attachment behaviors followed by regression does not predict the later recovery of skills or better developmental outcomes. The underlying mechanisms that lead to regression in autism are unknown. The role of subclinical epilepsy in the developmental regression of children with autism remains unclear.

  6. 78 FR 25853 - Defense Logistics Agency Privacy Program

    Federal Register 2010, 2011, 2012, 2013, 2014

    2013-05-03

    ... of the Secretary 32 CFR Part 323 RIN 0790-AI86 Defense Logistics Agency Privacy Program AGENCY: Defense Logistics Agency, DoD. ACTION: Final rule. SUMMARY: The Defense Logistics Agency (DLA) is revising... Defense Logistics Agency's implementation of the Privacy Act of 1974, as amended. In addition, DLA...

  7. 77 FR 39662 - Hazardous Materials; Reverse Logistics (RRR)

    Federal Register 2010, 2011, 2012, 2013, 2014

    2012-07-05

    ... Materials; Reverse Logistics (RRR) AGENCY: Pipeline and Hazardous Materials Safety Administration (PHMSA... materials in the ``reverse logistics'' supply chain. Reverse logistics is the process that is initiated when... will propose to simplify the regulations for reverse logistics shipments and provide avenue means...

  8. Right-sizing the Logistics Deployment Footprint

    DTIC Science & Technology

    2008-06-01

    ISER-MLB-PR-08-61 Right-sizing the Logistics Deployment Footprint 76th MORS Symposium 10-12 June, 2008 Tom Collipi Mike Albright Northrop Grumman...Form 298 (Rev. 8-98) Prescribed by ANSI Std Z39-18 ISER-MLB-PR-08-61 3 Purpose • Present A Simulation-Based Methodology For Sizing The Logistics...Avoidance Associated With The “Optimal” Footprint ISER-MLB-PR-08-61 4 Agenda • Tool Overview • Scenario And Trade Space • Input Data • Initial Results

  9. Reverse logistics in the Brazilian construction industry.

    PubMed

    Nunes, K R A; Mahler, C F; Valle, R A

    2009-09-01

    In Brazil most Construction and Demolition Waste (C&D waste) is not recycled. This situation is expected to change significantly, since new federal regulations oblige municipalities to create and implement sustainable C&D waste management plans which assign an important role to recycling activities. The recycling organizational network and its flows and components are fundamental to C&D waste recycling feasibility. Organizational networks, flows and components involve reverse logistics. The aim of this work is to introduce the concepts of reverse logistics and reverse distribution channel networks and to study the Brazilian C&D waste case.

  10. Using Cultural Algorithms to Improve Intelligent Logistics

    NASA Astrophysics Data System (ADS)

    Ochoa, Alberto; García, Yazmani; Yañez, Javier; Teymanoglu, Yaddik

    Today logistics is so important within companies that some have departments devoted exclusively to it. It has evolved over time and is now a fundamental aspect of the competition among businesses seeking to consolidate or retain leadership in their field. Logistics can be divided into different classes; our study focuses on timely distribution to the customer at lower cost, with higher sales and better utilization of space, resulting in excellent service. Finally, we prepare a comparative analysis of the results with respect to another method of optimizing the solution space.

  11. Strategies on the Implementation of China's Logistics Information Network

    NASA Astrophysics Data System (ADS)

    Dong, Yahui; Li, Wei; Guo, Xuwen

    Economic globalization and the trend toward networked e-commerce mean that the logistics industry will develop rapidly in the 21st century. In order to achieve the optimal allocation of resources, a rapid and sound worldwide customer service system should be established. Establishing a corresponding modern logistics system is the inevitable response to this requirement, and it is also the inevitable choice for the development of the modern logistics industry in China. The combination of modern logistics and information networks can better promote the development of the logistics industry. Through an analysis of the status of the logistics industry in China, this paper summarizes some common problems in the building of logistics information systems by domestic logistics enterprises. Following the planning methods and principles of logistics information systems, it sets out a logistics information system to optimize the management model.

  12. Estimating the effect of regression toward the mean under stochastic censoring.

    PubMed

    Hannan, P J; Jacobs, D R; McGovern, P; Klepp, K I; Elmer, P

    1994-02-15

    This investigation estimated regression toward the mean in the evaluation of a blood cholesterol educational program in which the probability of return for a follow-up measure was positively related to the initial measure--a situation the authors term "stochastic censoring." The situation in which selection is by a fixed cutpoint is well known. The authors estimate the extent of regression toward the mean when selection is probabilistically related to an initially higher blood cholesterol level. A Monte Carlo method is proposed, conditional on the observed data, for estimating the regression toward the mean and its precision. In 106 simulations, regression toward the mean was estimated to be 0.012 mmol/liter (standard deviation (SD), 0.032 mmol/liter). Similar estimates were obtained using Empirical Bayes shrinkage estimates of baseline cholesterol (regression toward the mean = 0.0106 mmol/liter) and numeric integration (0.0109 mmol/liter). The observed reduction in blood cholesterol was 0.271 mmol/liter (SD, 0.061 mmol/liter); after correction for regression toward the mean, the estimate of the true educational effect was 0.259 mmol/liter (SD, 0.069 mmol/liter). Functions are presented that will enable investigators to predict regression toward the mean and its standard deviation under the conditions that errors are gaussian and stochastic censoring can be approximated by a logistic model.
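
    The correction reported in this abstract can be reproduced as a short worked calculation. The four input figures are taken from the abstract; the assumption that the two estimates are independent, so that their variances add, is ours and is not stated in the source, although it happens to reproduce the reported standard deviation.

```python
import math

# Figures quoted in the abstract, all in mmol/liter.
observed_reduction, observed_sd = 0.271, 0.061
rtm_estimate, rtm_sd = 0.012, 0.032  # Monte Carlo estimate of regression toward the mean

# The corrected effect is the observed reduction minus the regression-toward-the-mean part.
corrected = observed_reduction - rtm_estimate

# Assumption (not from the source): the two estimates are independent,
# so the variance of their difference is the sum of their variances.
corrected_sd = math.sqrt(observed_sd ** 2 + rtm_sd ** 2)
```

    Rounded to three decimals, these give the 0.259 mmol/liter effect and 0.069 mmol/liter standard deviation quoted in the abstract.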

  13. Factors associated with remission and/or regression of microalbuminuria in type 2 diabetes mellitus.

    PubMed

    Ono, Tetsuichiro; Shikata, Kenichi; Obika, Mikako; Miyatake, Nobuyuki; Kodera, Ryo; Hirota, Daisyo; Wada, Jun; Kataoka, Hitomi; Ogawa, Daisuke; Makino, Hirofumi

    2014-01-01

    The aim of this study was to clarify the factors associated with the remission and/or regression of microalbuminuria in Japanese patients with type 2 diabetes mellitus. We retrospectively analyzed the data of 130 patients with type 2 diabetes mellitus with microalbuminuria for 2-6 years (3.39±1.31 years). Remission was defined as improving from microalbuminuria to normoalbuminuria using the albumin/creatinine ratio (ACR), and regression of microalbuminuria was defined as a decrease in ACR of 50% or more from baseline. Progression of microalbuminuria was defined as progressing from microalbuminuria to overt proteinuria during the follow-up period. Among 130 patients with type 2 diabetes mellitus with microalbuminuria, 57 and 13 patients were defined as having remission and regression, respectively, while 26 patients progressed to overt proteinuria. Sex (female), higher HDL cholesterol and lower HbA1c were determinant factors associated with remission/regression of microalbuminuria by logistic regression analysis. Lower systolic blood pressure (SBP) was also correlated with remission/regression, but not at a significant level. These results suggest that proper control of blood glucose, BP and lipid profiles may be associated with remission and/or regression of type 2 diabetes mellitus with microalbuminuria in clinical practice.
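
    The outcome definitions in this abstract can be written as a small classifier. This is a sketch under stated assumptions: the 30 and 300 mg/g albumin/creatinine cutoffs are the conventional category boundaries and are not given in the abstract, the rule that remission takes precedence over regression is our reading of the separate patient counts, and the function name and "unchanged" label are hypothetical.

```python
NORMO_UPPER = 30.0   # mg/g; assumed cutoff, not stated in the abstract
OVERT_LOWER = 300.0  # mg/g; assumed cutoff, not stated in the abstract


def classify_course(baseline_acr, followup_acr):
    """Label the change in albumin/creatinine ratio (ACR) for one patient."""
    if not NORMO_UPPER <= baseline_acr < OVERT_LOWER:
        raise ValueError("baseline must be in the microalbuminuria range")
    if followup_acr < NORMO_UPPER:
        return "remission"    # microalbuminuria to normoalbuminuria
    if followup_acr >= OVERT_LOWER:
        return "progression"  # microalbuminuria to overt proteinuria
    if followup_acr <= 0.5 * baseline_acr:
        return "regression"   # ACR decreased by 50% or more from baseline
    return "unchanged"


# Hypothetical patients: (baseline ACR, follow-up ACR) in mg/g.
labels = [classify_course(b, f) for b, f in [(100, 20), (200, 90), (100, 350), (100, 80)]]
```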

  14. Improving non-homogeneous regression for probabilistic precipitation forecasts

    NASA Astrophysics Data System (ADS)

    Presser, Manuel; Messner, Jakob W.; Mayr, Georg J.; Zeileis, Achim

    2016-04-01

    Non-homogeneous regression is a state-of-the-art ensemble post-processing technique that statistically corrects ensemble forecasts and predicts a full probability distribution. Originally, a Gaussian model is employed that linearly links the predicted distribution mean and variance to the ensemble mean and variance, respectively. Regarding non-normally distributed precipitation data, this model can be censored at zero to account for periods without precipitation. We improve this regression approach in several directions. First, we consider link functions in the variance sub-model that assure positivity of the model variance. Second, we consider a censored Logistic (instead of censored Gaussian) distribution to accommodate more frequent events with high precipitation. Third, we introduce a splitting procedure, which appropriately accounts for perfect prediction cases, i.e., where no precipitation is observed when all ensemble members predict no precipitation. This study is applied to different accumulation periods (3, 6, 12, 24 hours) for short-range precipitation forecasts in Northern Italy. The choice of link function for the variance parameter, the splitting procedure, and an appropriate distribution assumption for precipitation data significantly improve the probabilistic forecast skill, especially for shorter accumulation periods. KEYWORDS: heteroscedastic ensemble post-processing, censored distribution, maximum likelihood estimation, probabilistic precipitation forecasting

  15. Bayesian Isotonic Regression Dose-response (BIRD) Model.

    PubMed

    Li, Wen; Fu, Haoda

    2016-12-21

    Understanding the dose-response relationship is a crucial step in drug development. There are a few parametric methods to estimate dose-response curves, such as the Emax model and the logistic model. These parametric models are easy to interpret and, hence, widely used. However, these models often require the inclusion of patients on high-dose levels; otherwise, the model parameters cannot be reliably estimated. To have robust estimation, nonparametric models are used. However, these models are not able to estimate certain important clinical parameters, such as ED50 and Emax. Furthermore, in many therapeutic areas, dose-response curves can be assumed to be non-decreasing functions. This creates an additional challenge for nonparametric methods. In this paper, we propose a new Bayesian isotonic regression dose-response model which features advantages from both parametric and nonparametric models. The ED50 and Emax can be derived from this model. Simulations are provided to evaluate the Bayesian isotonic regression dose-response model performance against two parametric models. We apply this model to a data set from a diabetes dose-finding study.
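
    Two ingredients named in this abstract can be illustrated in a few lines: the standard Emax curve, and a non-decreasing (isotonic) fit. The sketch below uses the classical pool-adjacent-violators algorithm as a stand-in; it is not the Bayesian model proposed in the paper, and the function names and toy numbers are assumptions made for illustration.

```python
def emax_response(dose, e0, emax, ed50):
    """Standard Emax curve: baseline e0 plus a saturating drug effect.
    At dose == ed50 the effect is half of emax."""
    return e0 + emax * dose / (ed50 + dose)


def pava(values):
    """Least-squares non-decreasing fit by pool-adjacent-violators
    (equal weights). Returns fitted values in the original order."""
    blocks = []  # each block is [mean, size]
    for value in values:
        blocks.append([float(value), 1])
        # Merge backwards while the monotonicity constraint is violated.
        while len(blocks) > 1 and blocks[-2][0] > blocks[-1][0]:
            mean_b, size_b = blocks.pop()
            mean_a, size_a = blocks.pop()
            size = size_a + size_b
            blocks.append([(mean_a * size_a + mean_b * size_b) / size, size])
    fitted = []
    for mean, size in blocks:
        fitted.extend([mean] * size)
    return fitted


# Toy usage: made-up mean responses at four increasing dose levels.
observed_means = [1.0, 3.0, 2.0, 4.0]
monotone_means = pava(observed_means)
```

    The isotonic fit pools the two middle doses because their observed means violate the non-decreasing assumption.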

  16. Linear Logistic Test Modeling with R

    ERIC Educational Resources Information Center

    Baghaei, Purya; Kubinger, Klaus D.

    2015-01-01

    The present paper gives a general introduction to the linear logistic test model (Fischer, 1973), an extension of the Rasch model with linear constraints on item parameters, along with eRm (an R package to estimate different types of Rasch models; Mair, Hatzinger, & Mair, 2014) functions to estimate the model and interpret its parameters. The…

  17. Biomass round bales infield aggregation logistic scenarios

    Technology Transfer Automated Retrieval System (TEKTRAN)

    Biomass bales often need to be aggregated (collected into groups and transported) to a field-edge stack for temporary storage for feedlots or processing facilities. Aggregating the bales with the least total distance involved is a goal of producers and bale handlers. Several logistics scenarios for ...

  18. Mission Benefits Analysis of Logistics Reduction Technologies

    NASA Technical Reports Server (NTRS)

    Ewert, Michael K.; Broyan, James L.

    2012-01-01

    Future space exploration missions will need to use fewer logistical supplies if humans are to live for longer periods away from our home planet. Anything that can be done to reduce initial mass and volume of supplies or reuse or recycle items that have been launched will be very valuable. Reuse and recycling also reduce the trash burden and associated nuisances, such as smell, but require good systems engineering and operations integration to reap the greatest benefits. A systems analysis was conducted to quantify the mass and volume savings of four different technologies currently under development by NASA's Advanced Exploration Systems (AES) Logistics Reduction and Repurposing project. Advanced clothing systems lead to savings by direct mass reduction and increased wear duration. Reuse of logistical items, such as packaging, for a second purpose allows fewer items to be launched. A device known as a heat melt compactor drastically reduces the volume of trash, recovers water and produces a stable tile that can be used instead of launching additional radiation protection. The fourth technology, called trash-to-supply-gas, can benefit a mission by supplying fuel such as methane to the propulsion system. This systems engineering work will help improve logistics planning and overall mission architectures by determining the most effective use, and reuse, of all resources.

  19. LOGISTICS OF ECOLOGICAL SAMPLING ON LARGE RIVERS

    EPA Science Inventory

    The objectives of this document are to provide an overview of the logistical problems associated with the ecological sampling of boatable rivers and to suggest solutions to those problems. It is intended to be used as a resource for individuals preparing to collect biological dat...

  20. Co-providing: understanding the logistics.

    PubMed

    Dickerson, Pamela S

    2011-11-01

    Continuing nursing education providers have sometimes said that they don't want to co-provide because "it's too much trouble" or they "won't be able to control what happens" or because they don't understand the process. This column clarifies the logistics of the co-provider relationship.

  1. [Discussion on logistics management of medical consumables].

    PubMed

    Deng, Sutong; Wang, Miao; Jiang, Xiali

    2011-09-01

    Management of medical consumables is an important part of modern hospital management. In modern medical behavior, drugs and medical devices act directly on the patient, and are important factors affecting the quality of medical practice. With the increasing use of medical materials, based on practical application, this article proposes the management model of medical consumables, and discusses the essence of medical materials logistics management.

  2. Multi-Purpose Logistics Module Briefing

    NASA Technical Reports Server (NTRS)

    2001-01-01

    Silvanna Rabbi, MPLM Program Manager, Italian Space Agency, gives an overview of the Multi-Purpose Logistics Module (MPLM) in a prelaunch press conference. She describes the objectives, construction, specifications, and purpose of the three Italian-built modules, Leonardo, Raffaello, and Donatello. Ms. Rabbi then answers questions from the press.

  3. Mission Benefits Analysis of Logistics Reduction Technologies

    NASA Technical Reports Server (NTRS)

    Ewert, Michael K.; Broyan, James Lee, Jr.

    2013-01-01

    Future space exploration missions will need to use fewer logistical supplies if humans are to live for longer periods away from our home planet. Anything that can be done to reduce initial mass and volume of supplies or reuse or recycle items that have been launched will be very valuable. Reuse and recycling also reduce the trash burden and associated nuisances, such as smell, but require good systems engineering and operations integration to reap the greatest benefits. A systems analysis was conducted to quantify the mass and volume savings of four different technologies currently under development by NASA's Advanced Exploration Systems (AES) Logistics Reduction and Repurposing project. Advanced clothing systems lead to savings by direct mass reduction and increased wear duration. Reuse of logistical items, such as packaging, for a second purpose allows fewer items to be launched. A device known as a heat melt compactor drastically reduces the volume of trash, recovers water and produces a stable tile that can be used instead of launching additional radiation protection. The fourth technology, called trash-to-gas, can benefit a mission by supplying fuel such as methane to the propulsion system. This systems engineering work will help improve logistics planning and overall mission architectures by determining the most effective use, and reuse, of all resources.

  4. SIMULATION AND EVALUATION OF LOGISTICS SYSTEMS,

    DTIC Science & Technology

    A description of PLANET (Planned Logistics Analysis and Evaluation Technique), designed to increase the possibility of discovering flaws in a weapon...equipment, and spare parts. Even though PLANET cannot optimize the hardware configuration, it permits examination of different configurations. It

  5. Process modeling with the regression network.

    PubMed

    van der Walt, T; Barnard, E; van Deventer, J

    1995-01-01

    A new connectionist network topology called the regression network is proposed. The structural and underlying mathematical features of the regression network are investigated. Emphasis is placed on the intricacies of the optimization process for the regression network and some measures to alleviate these difficulties of optimization are proposed and investigated. The ability of the regression network algorithm to perform either nonparametric or parametric optimization, as well as a combination of both, is also highlighted. It is further shown how the regression network can be used to model systems which are poorly understood on the basis of sparse data. A semi-empirical regression network model is developed for a metallurgical processing operation (a hydrocyclone classifier) by building mechanistic knowledge into the connectionist structure of the regression network model. Poorly understood aspects of the process are provided for by use of nonparametric regions within the structure of the semi-empirical connectionist model. The performance of the regression network model is compared to the corresponding generalization performance results obtained by some other nonparametric regression techniques.

  6. Quantile regression applied to spectral distance decay

    USGS Publications Warehouse

    Rocchini, D.; Cade, B.S.

    2008-01-01

    Remotely sensed imagery has long been recognized as a powerful support for characterizing and estimating biodiversity. Spectral distance among sites has proven to be a powerful approach for detecting species composition variability. Regression analysis of species similarity versus spectral distance allows us to quantitatively estimate the amount of turnover in species composition with respect to spectral and ecological variability. In classical regression analysis, the residual sum of squares is minimized for the mean of the dependent variable distribution. However, many ecological data sets are characterized by a high number of zeroes that add noise to the regression model. Quantile regressions can be used to evaluate trend in the upper quantiles rather than a mean trend across the whole distribution of the dependent variable. In this letter, we used ordinary least squares (OLS) and quantile regressions to estimate the decay of species similarity versus spectral distance. The achieved decay rates were statistically nonzero (p < 0.01), considering both OLS and quantile regressions. Nonetheless, the OLS regression estimate of the mean decay rate was only half the decay rate indicated by the upper quantiles. Moreover, the intercept value, representing the similarity reached when the spectral distance approaches zero, was very low compared with the intercepts of the upper quantiles, which detected high species similarity when habitats are more similar. In this letter, we demonstrated the power of using quantile regressions applied to spectral distance decay to reveal species diversity patterns otherwise lost or underestimated by OLS regression. © 2008 IEEE.
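
    The contrast between a mean fit and an upper-quantile fit on zero-heavy data can be seen even without predictors. The sketch below is an illustration only: it uses the check (pinball) loss that quantile regression minimizes, restricted to a constant fit, on made-up numbers; the function names are hypothetical.

```python
def pinball_loss(residual, tau):
    """Check loss used by quantile regression for quantile level tau."""
    return tau * residual if residual >= 0 else (tau - 1.0) * residual


def best_constant(values, tau):
    """Constant that minimizes total pinball loss, searched over the
    observed values (a sample quantile is always among them)."""
    return min(values, key=lambda c: sum(pinball_loss(v - c, tau) for v in values))


# Toy similarity values with many zeroes, as the abstract describes.
similarity = [0, 0, 0, 0, 0, 0, 2, 4, 6, 8]
mean_fit = sum(similarity) / len(similarity)   # pulled down by the zeroes
median_fit = best_constant(similarity, 0.5)
upper_fit = best_constant(similarity, 0.85)
```

    Here the mean and the median sit near the zeroes, while the 0.85 quantile tracks the upper part of the distribution, which is the behaviour the abstract exploits.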

  7. [From clinical judgment to linear regression model].

    PubMed

    Palacios-Cruz, Lino; Pérez, Marcela; Rivas-Ruiz, Rodolfo; Talavera, Juan O

    2013-01-01

    When we think about mathematical models, such as the linear regression model, we think that these terms are only used by those engaged in research, a notion that is far from the truth. Legendre described the first mathematical model in 1805, and Galton introduced the formal term in 1886. Linear regression is one of the most commonly used regression models in clinical practice. It is useful to predict or show the relationship between two or more variables as long as the dependent variable is quantitative and has a normal distribution. Stated in another way, the regression is used to predict a measure based on the knowledge of at least one other variable. Linear regression has as its first objective to determine the slope or inclination of the regression line: Y = a + bx, where "a" is the intercept or regression constant and is equivalent to the value of "Y" when "x" equals 0, and "b" (also called the slope) indicates the increase or decrease in "Y" that occurs when the variable "x" increases or decreases by one unit. In the regression line, "b" is called the regression coefficient. The coefficient of determination (R²) indicates the importance of the independent variables in the outcome.
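
    The quantities described in this abstract (intercept a, slope b, and the coefficient of determination) can be computed directly from the least-squares formulas. The following is a minimal sketch on made-up data; the function name fit_line is hypothetical.

```python
def fit_line(xs, ys):
    """Ordinary least squares for Y = a + b*x.
    Returns (a, b, r_squared)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    b = sxy / sxx                 # slope: change in Y per unit change in x
    a = mean_y - b * mean_x       # intercept: value of Y when x equals 0
    ss_res = sum((y - (a + b * x)) ** 2 for x, y in zip(xs, ys))
    ss_tot = sum((y - mean_y) ** 2 for y in ys)
    return a, b, 1.0 - ss_res / ss_tot


# Toy usage: points lying exactly on Y = 1 + 2x.
a, b, r_squared = fit_line([0, 1, 2, 3], [1, 3, 5, 7])
```

    With noisy data the same function returns an R² below 1, reflecting the share of variance in Y that the line does not explain.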

  8. Geodesic least squares regression on information manifolds

    SciTech Connect

    Verdoolaege, Geert

    2014-12-05

    We present a novel regression method targeted at situations with significant uncertainty on both the dependent and independent variables or with non-Gaussian distribution models. Unlike the classic regression model, the conditional distribution of the response variable suggested by the data need not be the same as the modeled distribution. Instead they are matched by minimizing the Rao geodesic distance between them. This yields a more flexible regression method that is less constrained by the assumptions imposed through the regression model. As an example, we demonstrate the improved resistance of our method against some flawed model assumptions and we apply this to scaling laws in magnetic confinement fusion.

  9. Factor complexity of crash occurrence: An empirical demonstration using boosted regression trees.

    PubMed

    Chung, Yi-Shih

    2013-12-01

    Factor complexity is a characteristic of traffic crashes. This paper proposes a novel method, namely boosted regression trees (BRT), to investigate the complex and nonlinear relationships in high-variance traffic crash data. The Taiwanese 2004-2005 single-vehicle motorcycle crash data are used to demonstrate the utility of BRT. Traditional logistic regression and classification and regression tree (CART) models are also used to compare their estimation results and external validities. Both the in-sample cross-validation and out-of-sample validation results show that an increase in tree complexity provides improved, although declining, classification performance, indicating a limited factor complexity of single-vehicle motorcycle crashes. The effects of crucial variables including geographical, time, and sociodemographic factors explain some fatal crashes. Relatively unique fatal crashes are better approximated by interactive terms, especially combinations of behavioral factors. BRT models generally provide better transferability than conventional logistic regression and CART models. This study also discusses the implications of the results for devising safety policies.

  10. Suppression Situations in Multiple Linear Regression

    ERIC Educational Resources Information Center

    Shieh, Gwowen

    2006-01-01

    This article proposes alternative expressions for the two most prevailing definitions of suppression without resorting to the standardized regression modeling. The formulation provides a simple basis for the examination of their relationship. For the two-predictor regression, the author demonstrates that the previous results in the literature are…

  11. Regression Analysis: Legal Applications in Institutional Research

    ERIC Educational Resources Information Center

    Frizell, Julie A.; Shippen, Benjamin S., Jr.; Luna, Andrew L.

    2008-01-01

    This article reviews multiple regression analysis, describes how its results should be interpreted, and instructs institutional researchers on how to conduct such analyses using an example focused on faculty pay equity between men and women. The use of multiple regression analysis will be presented as a method with which to compare salaries of…

  12. Principles of Quantile Regression and an Application

    ERIC Educational Resources Information Center

    Chen, Fang; Chalhoub-Deville, Micheline

    2014-01-01

    Newer statistical procedures are typically introduced to help address the limitations of those already in practice or to deal with emerging research needs. Quantile regression (QR) is introduced in this paper as a relatively new methodology, which is intended to overcome some of the limitations of least squares mean regression (LMR). QR is more…

  13. Three-Dimensional Modeling in Linear Regression.

    ERIC Educational Resources Information Center

    Herman, James D.

    Linear regression examines the relationship between one or more independent (predictor) variables and a dependent variable. By using a particular formula, regression determines the weights needed to minimize the error term for a given set of predictors. With one predictor variable, the relationship between the predictor and the dependent variable…

  14. A Practical Guide to Regression Discontinuity

    ERIC Educational Resources Information Center

    Jacob, Robin; Zhu, Pei; Somers, Marie-Andrée; Bloom, Howard

    2012-01-01

    Regression discontinuity (RD) analysis is a rigorous nonexperimental approach that can be used to estimate program impacts in situations in which candidates are selected for treatment based on whether their value for a numeric rating exceeds a designated threshold or cut-point. Over the last two decades, the regression discontinuity approach has…

  15. Regression Analysis and the Sociological Imagination

    ERIC Educational Resources Information Center

    De Maio, Fernando

    2014-01-01

    Regression analysis is an important aspect of most introductory statistics courses in sociology but is often presented in contexts divorced from the central concerns that bring students into the discipline. Consequently, we present five lesson ideas that emerge from a regression analysis of income inequality and mortality in the USA and Canada.

  16. Fatigue design of a cellular phone folder using regression model-based multi-objective optimization

    NASA Astrophysics Data System (ADS)

    Kim, Young Gyun; Lee, Jongsoo

    2016-08-01

    In a folding cellular phone, the folding device is repeatedly opened and closed by the user, which eventually results in fatigue damage, particularly to the front of the folder. Hence, it is important to improve the safety and endurance of the folder while also reducing its weight. This article presents an optimal design for the folder front that maximizes its fatigue endurance while minimizing its thickness. Design data for analysis and optimization were obtained experimentally using a test jig. Multi-objective optimization was carried out using a nonlinear regression model. Three regression methods were employed: back-propagation neural networks, logistic regression and support vector machines. The AdaBoost ensemble technique was also used to improve the approximation. Two-objective Pareto-optimal solutions were identified using the non-dominated sorting genetic algorithm (NSGA-II). Finally, a numerically optimized solution was validated against experimental product data, in terms of both fatigue endurance and thickness index.

  17. Atherosclerotic plaque regression: fact or fiction?

    PubMed

    Shanmugam, Nesan; Román-Rego, Ana; Ong, Peter; Kaski, Juan Carlos

    2010-08-01

    Coronary artery disease is the major cause of death in the western world. The formation and rapid progression of atheromatous plaques can lead to serious cardiovascular events in patients with atherosclerosis. The better understanding, in recent years, of the mechanisms leading to atheromatous plaque growth and disruption and the availability of powerful HMG CoA-reductase inhibitors (statins) has permitted the consideration of plaque regression as a realistic therapeutic goal. This article reviews the existing evidence underpinning current therapeutic strategies aimed at achieving atherosclerotic plaque regression. In this review we also discuss imaging modalities for the assessment of plaque regression, predictors of regression and whether plaque regression is associated with a survival benefit.

  18. Almost efficient estimation of relative risk regression

    PubMed Central

    Fitzmaurice, Garrett M.; Lipsitz, Stuart R.; Arriaga, Alex; Sinha, Debajyoti; Greenberg, Caprice; Gawande, Atul A.

    2014-01-01

    Relative risks (RRs) are often considered the preferred measures of association in prospective studies, especially when the binary outcome of interest is common. In particular, many researchers regard RRs to be more intuitively interpretable than odds ratios. Although RR regression is a special case of generalized linear models, specifically with a log link function for the binomial (or Bernoulli) outcome, the resulting log-binomial regression does not respect the natural parameter constraints. Because log-binomial regression does not ensure that predicted probabilities are mapped to the [0,1] range, maximum likelihood (ML) estimation is often subject to numerical instability that leads to convergence problems. To circumvent these problems, a number of alternative approaches for estimating RR regression parameters have been proposed. One approach that has been widely studied is the use of Poisson regression estimating equations. The estimating equations for Poisson regression yield consistent, albeit inefficient, estimators of the RR regression parameters. We consider the relative efficiency of the Poisson regression estimator and develop an alternative, almost efficient estimator for the RR regression parameters. The proposed method uses near-optimal weights based on a Maclaurin series (Taylor series expanded around zero) approximation to the true Bernoulli or binomial weight function. This yields an almost efficient estimator while avoiding convergence problems. We examine the asymptotic relative efficiency of the proposed estimator for an increase in the number of terms in the series. Using simulations, we demonstrate the potential for convergence problems with standard ML estimation of the log-binomial regression model and illustrate how this is overcome using the proposed estimator. We apply the proposed estimator to a study of predictors of pre-operative use of beta blockers among patients undergoing colorectal surgery after diagnosis of colon cancer. PMID
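
    The remark that relative risks and odds ratios diverge when the outcome is common can be checked with simple arithmetic on a 2x2 table. The sketch below uses made-up counts and hypothetical function names; it illustrates the two measures only, not the estimator proposed in the paper.

```python
def relative_risk(events_exposed, n_exposed, events_unexposed, n_unexposed):
    """Ratio of outcome probabilities between exposed and unexposed groups."""
    return (events_exposed / n_exposed) / (events_unexposed / n_unexposed)


def odds_ratio(events_exposed, n_exposed, events_unexposed, n_unexposed):
    """Ratio of outcome odds between exposed and unexposed groups."""
    odds_exposed = events_exposed / (n_exposed - events_exposed)
    odds_unexposed = events_unexposed / (n_unexposed - events_unexposed)
    return odds_exposed / odds_unexposed


# Common outcome (60% vs 30%): the odds ratio overstates the relative risk.
rr_common = relative_risk(60, 100, 30, 100)
or_common = odds_ratio(60, 100, 30, 100)

# Rare outcome (0.6% vs 0.3%): the two measures nearly coincide.
rr_rare = relative_risk(6, 1000, 3, 1000)
or_rare = odds_ratio(6, 1000, 3, 1000)
```

    In the common-outcome table the relative risk is 2 while the odds ratio is 3.5, which is why the two are not interchangeable in such studies.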

  19. Multicenter Comparison of Machine Learning Methods and Conventional Regression for Predicting Clinical Deterioration on the Wards

    PubMed Central

    Churpek, Matthew M; Yuen, Trevor C; Winslow, Christopher; Meltzer, David O; Kattan, Michael W; Edelson, Dana P

    2016-01-01

    OBJECTIVE: Machine learning methods are flexible prediction algorithms that may be more accurate than conventional regression. We compared the accuracy of different techniques for detecting clinical deterioration on the wards in a large, multicenter database. DESIGN: Observational cohort study. SETTING: Five hospitals, from November 2008 until January 2013. PATIENTS: Hospitalized ward patients. INTERVENTIONS: None. MEASUREMENTS AND MAIN RESULTS: Demographic variables, laboratory values, and vital signs were utilized in a discrete-time survival analysis framework to predict the combined outcome of cardiac arrest, intensive care unit transfer, or death. Two logistic regression models (one using linear predictor terms and a second utilizing restricted cubic splines) were compared to several different machine learning methods. The models were derived in the first 60% of the data by date and then validated in the next 40%. For model derivation, each event time window was matched to a non-event window. All models were compared to each other and to the Modified Early Warning Score (MEWS), a commonly cited early warning score, using the area under the receiver operating characteristic curve (AUC). A total of 269,999 patients were admitted, and 424 cardiac arrests, 13,188 intensive care unit transfers, and 2,840 deaths occurred in the study. In the validation dataset, the random forest model was the most accurate model (AUC 0.80 [95% CI 0.80–0.80]). The logistic regression model with spline predictors was more accurate than the model utilizing linear predictors (AUC 0.77 vs 0.74; p<0.01), and all models were more accurate than the MEWS (AUC 0.70 [95% CI 0.70–0.70]). CONCLUSIONS: In this multicenter study, we found that several machine learning methods more accurately predicted clinical deterioration than logistic regression. Use of detection algorithms derived from these techniques may result in improved identification of critically ill patients on the wards. PMID:26771782
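
    The AUC used to compare the models in this abstract equals the probability that a randomly chosen event receives a higher score than a randomly chosen non-event. A minimal sketch of that pairwise definition follows; the function name and the toy scores are assumptions, and the quadratic loop is meant for illustration rather than for data of the size reported above.

```python
def auc(scores, labels):
    """Area under the ROC curve via pairwise comparison.
    labels: 1 for event, 0 for non-event. Ties count as half a win."""
    positives = [s for s, label in zip(scores, labels) if label == 1]
    negatives = [s for s, label in zip(scores, labels) if label == 0]
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


# Toy usage: four made-up risk scores, two events and two non-events.
toy_auc = auc([0.1, 0.4, 0.35, 0.8], [0, 0, 1, 1])
```

    An AUC of 0.5 corresponds to chance-level ranking and 1.0 to perfect separation of events from non-events.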

  20. Logistics Models: Evolution and Future Trends,

    DTIC Science & Technology

    1982-03-01

    combat capable aircraft available to fly as a function of time Fig. 1--The Logistics Model ...performed. Many were measures of internal ...difficult combinatorial problem, which many modelers have studied with only limited success. Generally, only the two-machine problem has a known solution...currently uses a large simulation called MIDAS to study airlift requirements and capabilities. The use of ton miles/day as a measure of the military