NASA Astrophysics Data System (ADS)
Grégoire, G.
2014-12-01
Logistic regression was originally intended to explain the relationship between the probability of an event and a set of covariates. The model's coefficients can be interpreted via the odds and the odds ratio, which are presented in the introduction of the chapter. When the observations are obtained individually we speak of binary logistic regression; when they are grouped, the logistic regression is said to be binomial. In our presentation we mainly focus on the binary case. For statistical inference the main tool is the maximum likelihood methodology: we present the Wald, Rao and likelihood ratio results and their use in comparing nested models. The problems we deal with are essentially the same as in multiple linear regression: testing a global effect, testing individual effects, selecting variables to build a model, measuring the goodness of fit of the model, predicting new values, etc. The methods are demonstrated on data sets using R. Finally we briefly consider the binomial case and the situation where we are interested in several events, that is, polytomous (multinomial) logistic regression and the particular case of ordinal logistic regression.
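The chapter demonstrates these methods in R; as a purely illustrative sketch (the toy data and helper function below are hypothetical, not taken from the chapter), a binary logistic model can be fitted by Newton-Raphson and its slope read as an odds ratio in Python:

```python
import math

def fit_logistic(x, y, iters=25):
    """Fit P(y=1 | x) = 1 / (1 + exp(-(b0 + b1*x))) by Newton-Raphson."""
    b0 = b1 = 0.0
    for _ in range(iters):
        g0 = g1 = h00 = h01 = h11 = 0.0
        for xi, yi in zip(x, y):
            p = 1.0 / (1.0 + math.exp(-(b0 + b1 * xi)))
            w = p * (1.0 - p)
            g0 += yi - p            # score for the intercept
            g1 += (yi - p) * xi     # score for the slope
            h00 += w                # observed information entries
            h01 += w * xi
            h11 += w * xi * xi
        det = h00 * h11 - h01 * h01
        b0 += (h11 * g0 - h01 * g1) / det   # Newton step: H^{-1} * score
        b1 += (h00 * g1 - h01 * g0) / det
    return b0, b1

# Toy data: the event becomes more likely as the covariate x grows.
x = [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
y = [0, 0, 0, 0, 1, 0, 1, 1, 1, 1]
b0, b1 = fit_logistic(x, y)
odds_ratio = math.exp(b1)
```

Under this sketch, `math.exp(b1)` estimates the multiplicative change in the odds of the event for a one-unit increase in `x`, which is the odds-ratio interpretation the chapter introduces.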
Algamal, Zakariya Yahya; Lee, Muhammad Hisyam
2015-12-01
Cancer classification and gene selection in high-dimensional data have been popular research topics in genetics and molecular biology. Recently, adaptive regularized logistic regression using the elastic net regularization, called the adaptive elastic net, has been successfully applied in high-dimensional cancer classification to tackle estimating the gene coefficients and performing gene selection simultaneously. The adaptive elastic net originally used elastic net estimates as the initial weight; however, this weight may not be preferable for two reasons: first, the elastic net estimator is biased in selecting genes; second, it does not perform well when the pairwise correlations between variables are not high. Adjusted adaptive regularized logistic regression (AAElastic) is proposed to address these issues while simultaneously encouraging grouping effects. The real-data results indicate that AAElastic is significantly more consistent in selecting genes than the three competing regularization methods. Additionally, the classification performance of AAElastic is comparable to that of the adaptive elastic net and better than that of the other regularization methods. Thus, we can conclude that AAElastic is a reliable adaptive regularized logistic regression method for high-dimensional cancer classification. PMID:26520484
Kleinman, Lawrence C; Norton, Edward C
2009-01-01
Objective To develop and validate a general method (called regression risk analysis) to estimate adjusted risk measures from logistic and other nonlinear multiple regression models. We show how to estimate standard errors for these estimates. These measures could supplant various approximations (e.g., adjusted odds ratio [AOR]) that may diverge, especially when outcomes are common. Study Design Regression risk analysis estimates were compared with internal standards as well as with Mantel–Haenszel estimates, Poisson and log-binomial regressions, and a widely used (but flawed) equation to calculate adjusted risk ratios (ARR) from AOR. Data Collection Data sets produced using Monte Carlo simulations. Principal Findings Regression risk analysis accurately estimates ARR and differences directly from multiple regression models, even when confounders are continuous, distributions are skewed, outcomes are common, and effect size is large. It is statistically sound and intuitive, and has properties favoring it over other methods in many cases. Conclusions Regression risk analysis should be the new standard for presenting findings from multiple regression analysis of dichotomous outcomes for cross-sectional, cohort, and population-based case–control studies, particularly when outcomes are common or effect size is large. PMID:18793213
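The gist of regression risk analysis, reporting average predicted risks from the fitted model rather than odds ratios, can be sketched as follows. The coefficients and confounder values are hypothetical, and the standard-error machinery the paper develops is omitted:

```python
import math

def predict_prob(b0, b_exp, b_conf, exposed, confounder):
    """Predicted risk from a (hypothetically) fitted logistic model."""
    z = b0 + b_exp * exposed + b_conf * confounder
    return 1.0 / (1.0 + math.exp(-z))

# Hypothetical fitted coefficients and observed confounder values.
b0, b_exp, b_conf = -1.0, 0.8, 0.5
confounders = [0.2, 1.1, -0.4, 0.9, 0.0, 1.5, -1.2, 0.3]

# Standardize: predict everyone's risk as if exposed, then as if unexposed.
n = len(confounders)
risk_exposed = sum(predict_prob(b0, b_exp, b_conf, 1, c) for c in confounders) / n
risk_unexposed = sum(predict_prob(b0, b_exp, b_conf, 0, c) for c in confounders) / n

arr = risk_exposed / risk_unexposed   # adjusted risk ratio
ard = risk_exposed - risk_unexposed   # adjusted risk difference
aor = math.exp(b_exp)                 # adjusted odds ratio, for contrast
```

For these hypothetical numbers the adjusted risk ratio comes out noticeably smaller than the adjusted odds ratio, illustrating the divergence the authors warn about when outcomes are common.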
Li, Li; Brumback, Babette A; Weppelmann, Thomas A; Morris, J Glenn; Ali, Afsar
2016-08-15
Motivated by an investigation of the effect of surface water temperature on the presence of Vibrio cholerae in water samples collected from different fixed surface water monitoring sites in Haiti in different months, we investigated methods to adjust for unmeasured confounding due to either of the two crossed factors site and month. In the process, we extended previous methods that adjust for unmeasured confounding due to one nesting factor (such as site, which nests the water samples from different months) to the case of two crossed factors. First, we developed a conditional pseudolikelihood estimator that eliminates fixed effects for the levels of each of the crossed factors from the estimating equation. Using the theory of U-Statistics for independent but non-identically distributed vectors, we show that our estimator is consistent and asymptotically normal, but that its variance depends on the nuisance parameters and thus cannot be easily estimated. Consequently, we apply our estimator in conjunction with a permutation test, and we investigate use of the pigeonhole bootstrap and the jackknife for constructing confidence intervals. We also incorporate our estimator into a diagnostic test for a logistic mixed model with crossed random effects and no unmeasured confounding. For comparison, we investigate between-within models extended to two crossed factors. These generalized linear mixed models include covariate means for each level of each factor in order to adjust for the unmeasured confounding. We conduct simulation studies, and we apply the methods to the Haitian data. Copyright © 2016 John Wiley & Sons, Ltd. PMID:26892025
Practical Session: Logistic Regression
NASA Astrophysics Data System (ADS)
Clausel, M.; Grégoire, G.
2014-12-01
An exercise is proposed to illustrate logistic regression. One investigates the different risk factors in the occurrence of coronary heart disease. It has been proposed in Chapter 5 of the book by D.G. Kleinbaum and M. Klein, "Logistic Regression", Statistics for Biology and Health, Springer Science+Business Media, LLC (2010) and also by D. Chessel and A.B. Dufour in Lyon 1 (see Sect. 6 of http://pbil.univ-lyon1.fr/R/pdf/tdr341.pdf). This example is based on data given in the file evans.txt from http://www.sph.emory.edu/dkleinb/logreg3.htm#data.
Steganalysis using logistic regression
NASA Astrophysics Data System (ADS)
Lubenko, Ivans; Ker, Andrew D.
2011-02-01
We advocate Logistic Regression (LR) as an alternative to the Support Vector Machine (SVM) classifiers commonly used in steganalysis. LR offers more information than traditional SVM methods, since it estimates class probabilities as well as providing a simple classification, and it can be adapted more easily and efficiently for multiclass problems. Like SVM, LR can be kernelised for nonlinear classification, and it shows comparable classification accuracy to SVM methods. This work is a case study, comparing the accuracy and speed of SVM and LR classifiers in the detection of LSB Matching and other related spatial-domain image steganography, using the state-of-the-art 686-dimensional SPAM feature set, on three image sets.
Multinomial logistic regression ensembles.
Lee, Kyewon; Ahn, Hongshik; Moon, Hojin; Kodell, Ralph L; Chen, James J
2013-05-01
This article proposes a method for multiclass classification problems using ensembles of multinomial logistic regression models. A multinomial logit model is used as a base classifier in ensembles built from random partitions of predictors. The multinomial logit model can be applied to each mutually exclusive subset of the feature space without variable selection. By combining multiple models, the proposed method can handle a huge database without the constraints needed for analyzing high-dimensional data, and the random partition can improve prediction accuracy by reducing the correlation among base classifiers. The proposed method is implemented using R, and the performance, including overall prediction accuracy, sensitivity, and specificity for each category, is evaluated on two real data sets and simulated data sets. To investigate the quality of prediction in terms of sensitivity and specificity, the area under the receiver operating characteristic (ROC) curve (AUC) is also examined. The performance of the proposed model is compared to that of a single multinomial logit model and shows a substantial improvement in overall prediction accuracy. The proposed method is also compared with other classification methods such as the random forest, support vector machines, and the random multinomial logit model. PMID:23611203
Logistic Regression: Concept and Application
ERIC Educational Resources Information Center
Cokluk, Omay
2010-01-01
The main focus of logistic regression analysis is classification of individuals in different groups. The aim of the present study is to explain basic concepts and processes of binary logistic regression analysis intended to determine the combination of independent variables which best explain the membership in certain groups called dichotomous…
Fungible weights in logistic regression.
Jones, Jeff A; Waller, Niels G
2016-06-01
In this article we develop methods for assessing parameter sensitivity in logistic regression models. To set the stage for this work, we first review Waller's (2008) equations for computing fungible weights in linear regression. Next, we describe 2 methods for computing fungible weights in logistic regression. To demonstrate the utility of these methods, we compute fungible logistic regression weights using data from the Centers for Disease Control and Prevention's (2010) Youth Risk Behavior Surveillance Survey, and we illustrate how these alternate weights can be used to evaluate parameter sensitivity. To make our work accessible to the research community, we provide R code (R Core Team, 2015) that will generate both kinds of fungible logistic regression weights. (PsycINFO Database Record) PMID:26651981
Partial covariate adjusted regression
Şentürk, Damla; Nguyen, Danh V.
2008-01-01
Covariate adjusted regression (CAR) is a recently proposed adjustment method for regression analysis where both the response and predictors are not directly observed (Şentürk and Müller, 2005). The available data has been distorted by unknown functions of an observable confounding covariate. CAR provides consistent estimators for the coefficients of the regression between the variables of interest, adjusted for the confounder. We develop a broader class of partial covariate adjusted regression (PCAR) models to accommodate both distorted and undistorted (adjusted/unadjusted) predictors. The PCAR model allows for unadjusted predictors, such as age, gender and demographic variables, which are common in the analysis of biomedical and epidemiological data. The available estimation and inference procedures for CAR are shown to be invalid for the proposed PCAR model. We propose new estimators and develop new inference tools for the more general PCAR setting. In particular, we establish the asymptotic normality of the proposed estimators and propose consistent estimators of their asymptotic variances. Finite sample properties of the proposed estimators are investigated using simulation studies and the method is also illustrated with a Pima Indians diabetes data set. PMID:20126296
Transfer Learning Based on Logistic Regression
NASA Astrophysics Data System (ADS)
Paul, A.; Rottensteiner, F.; Heipke, C.
2015-08-01
In this paper we address the problem of classification of remote sensing images in the framework of transfer learning, with a focus on domain adaptation. This research area deals with methods for problems in which labelled training data are assumed to be available only for a source domain, while classification is needed in a target domain with different, yet related, characteristics. The main novel contribution is a method for transductive transfer learning in remote sensing on the basis of logistic regression, a discriminative probabilistic classifier of low computational complexity which can deal with multiclass problems. Classification takes place with a model of weight coefficients for hyperplanes which separate features in the transformed feature space. In terms of logistic regression, our domain adaptation method adjusts the model parameters by iterative labelling of the target test data set. These labelled data features are iteratively added to the current training set, which at the beginning contains only source features, while a number of source features are simultaneously deleted from the current training set. Experimental results based on a test series with synthetic and real data constitute a first proof of concept of the proposed method.
Standards for Standardized Logistic Regression Coefficients
ERIC Educational Resources Information Center
Menard, Scott
2011-01-01
Standardized coefficients in logistic regression analysis have the same utility as standardized coefficients in linear regression analysis. Although there has been no consensus on the best way to construct standardized logistic regression coefficients, there is now sufficient evidence to suggest a single best approach to the construction of a…
Computing measures of explained variation for logistic regression models.
Mittlböck, M; Schemper, M
1999-01-01
The proportion of explained variation (R2) is frequently used in the general linear model but in logistic regression no standard definition of R2 exists. We present a SAS macro which calculates two R2-measures based on Pearson and on deviance residuals for logistic regression. Also, adjusted versions for both measures are given, which should prevent the inflation of R2 in small samples. PMID:10195643
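The SAS macro itself is not reproduced here, but the two measures can be sketched in Python from their definitions; the outcomes and fitted probabilities below are hypothetical, and the small-sample adjusted versions are omitted:

```python
import math

def r2_measures(y, p):
    """Pearson- and deviance-based explained variation for a fitted
    logistic model, given binary outcomes y and fitted probabilities p."""
    n = len(y)
    ybar = sum(y) / n
    # Pearson (sum-of-squares) version
    ss_res = sum((yi - pi) ** 2 for yi, pi in zip(y, p))
    ss_tot = sum((yi - ybar) ** 2 for yi in y)
    r2_pearson = 1.0 - ss_res / ss_tot
    # Deviance version: compare model deviance to the null (intercept-only) deviance
    def dev(yi, pi):
        return -2.0 * (yi * math.log(pi) + (1 - yi) * math.log(1 - pi))
    d_model = sum(dev(yi, pi) for yi, pi in zip(y, p))
    d_null = sum(dev(yi, ybar) for yi in y)
    r2_deviance = 1.0 - d_model / d_null
    return r2_pearson, r2_deviance

y = [0, 0, 1, 0, 1, 1, 1, 0]
p = [0.1, 0.3, 0.8, 0.2, 0.7, 0.9, 0.6, 0.4]  # hypothetical fitted values
r2_p, r2_d = r2_measures(y, p)
```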
Satellite rainfall retrieval by logistic regression
NASA Technical Reports Server (NTRS)
Chiu, Long S.
1986-01-01
The potential use of logistic regression in rainfall estimation from satellite measurements is investigated. Satellite measurements provide covariate information in terms of radiances from different remote sensors. The logistic regression technique can effectively accommodate many covariates and test their significance in the estimation. The outcome of the logistic model is the probability that the rain rate of a satellite pixel is above a certain threshold. By varying the thresholds, a rain-rate histogram can be obtained, from which the mean and the variance can be estimated. A logistic model is developed and applied to rainfall data collected during GATE, using as covariates the fractional rain area and a radiance measurement deduced from a microwave temperature-rain-rate relation. It is demonstrated that the fractional rain area is an important covariate in the model, consistent with the use of the so-called Area Time Integral in estimating total rain volume in other studies. To calibrate the logistic model, simulated rain fields generated by rain-field models with prescribed parameters are needed. A stringent test of the logistic model is its ability to recover the prescribed parameters of simulated rain fields. A rain-field simulation model which preserves the fractional rain area and lognormality of rain rates as found in GATE is developed. A stochastic regression model of branching and immigration whose solutions are lognormally distributed in some asymptotic limits has also been developed.
A Logistic Regression Model for Personnel Selection.
ERIC Educational Resources Information Center
Raju, Nambury S.; And Others
1991-01-01
A two-parameter logistic regression model for personnel selection is proposed. The model was tested with a database of 84,808 military enlistees. The probability of job success was related directly to trait levels, addressing such topics as selection, validity generalization, employee classification, selection bias, and utility-based fair…
Predicting Social Trust with Binary Logistic Regression
ERIC Educational Resources Information Center
Adwere-Boamah, Joseph; Hufstedler, Shirley
2015-01-01
This study used binary logistic regression to predict social trust with five demographic variables from a national sample of adult individuals who participated in The General Social Survey (GSS) in 2012. The five predictor variables were respondents' highest degree earned, race, sex, general happiness and the importance of personally assisting…
NASA Astrophysics Data System (ADS)
Lin, Yingzhi; Deng, Xiangzheng; Li, Xing; Ma, Enjun
2014-12-01
Spatially explicit simulation of land use change is the basis for estimating the effects of land use and cover change on energy fluxes, ecology and the environment. At the pixel level, logistic regression is one of the most common approaches used in spatially explicit land use allocation models to determine the relationship between land use and its causal factors in driving land use change, and thereby to evaluate land use suitability. However, these models have a drawback in that they do not determine/allocate land use based on the direct relationship between land use change and its driving factors. Consequently, a multinomial logistic regression method was introduced to address this flaw, and thereby, judge the suitability of a type of land use in any given pixel in a case study area of the Jiangxi Province, China. A comparison of the two regression methods indicated that the proportion of correctly allocated pixels using multinomial logistic regression was 92.98%, which was 8.47% higher than that obtained using logistic regression. Paired t-test results also showed that pixels were more clearly distinguished by multinomial logistic regression than by logistic regression. In conclusion, multinomial logistic regression is a more efficient and accurate method for the spatial allocation of land use changes. The application of this method in future land use change studies may improve the accuracy of predicting the effects of land use and cover change on energy fluxes, ecology, and environment.
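As a minimal sketch of the allocation step (with hypothetical coefficients, a single driving factor, and hypothetical reference labels, far simpler than the Jiangxi case study), multinomial logistic regression assigns each pixel to the class with the highest modelled probability:

```python
import math

def softmax_probs(x, betas):
    """Class probabilities from a multinomial logistic model.
    betas[k] = (intercept, slope) for class k+1; class 0 is the baseline."""
    scores = [0.0] + [b0 + b1 * x for (b0, b1) in betas]
    m = max(scores)                          # subtract max for numerical stability
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

# Hypothetical fitted coefficients for two non-baseline land-use classes.
betas = [(-1.0, 2.0), (0.5, -1.5)]

# Allocate each "pixel" (driving-factor value) to its most probable class.
pixels = [-2.0, -0.5, 0.0, 0.4, 1.0, 2.5]
true_classes = [2, 2, 2, 0, 1, 1]            # hypothetical reference labels
allocated = [max(range(3), key=lambda k: softmax_probs(x, betas)[k]) for x in pixels]
accuracy = sum(a == t for a, t in zip(allocated, true_classes)) / len(pixels)
```

The proportion of correctly allocated pixels reported in the study corresponds to `accuracy` here, computed against reference land-use labels.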
Model selection for logistic regression models
NASA Astrophysics Data System (ADS)
Duller, Christine
2012-09-01
Model selection for logistic regression models decides which of some given potential regressors have an effect and hence should be included in the final model. The second interesting question is whether a certain factor is heterogeneous among some subsets, i.e. whether the model should include a random intercept or not. In this paper these questions are answered with classical as well as Bayesian methods. The applications show some results of recent research projects in medicine and business administration.
Supporting Regularized Logistic Regression Privately and Efficiently.
Li, Wenfa; Liu, Hongzhe; Yang, Peng; Xie, Wei
2016-01-01
As one of the most popular statistical and machine learning models, logistic regression with regularization has found wide adoption in biomedicine, social sciences, information technology, and so on. These domains often involve data on human subjects that are subject to strict privacy regulations. Concerns over data privacy make it increasingly difficult to coordinate and conduct large-scale collaborative studies, which typically rely on cross-institution data sharing and joint analysis. Our work here focuses on safeguarding regularized logistic regression, a widely used statistical model that has not yet been investigated from a data security and privacy perspective. We consider a common use scenario of multi-institution collaborative studies, such as research consortia or networks as widely seen in genetics, epidemiology, social sciences, etc. To make our privacy-enhancing solution practical, we demonstrate a non-conventional and computationally efficient method leveraging distributed computing and strong cryptography to provide comprehensive protection over individual-level and summary data. Extensive empirical evaluations on several studies validate the privacy guarantee, efficiency and scalability of our proposal. We also discuss the practical implications of our solution for large-scale studies and applications from various disciplines, including genetic and biomedical studies, smart grid, network analysis, etc. PMID:27271738
Using Dominance Analysis to Determine Predictor Importance in Logistic Regression
ERIC Educational Resources Information Center
Azen, Razia; Traxel, Nicole
2009-01-01
This article proposes an extension of dominance analysis that allows researchers to determine the relative importance of predictors in logistic regression models. Criteria for choosing logistic regression R[superscript 2] analogues were determined and measures were selected that can be used to perform dominance analysis in logistic regression. A…
Logistic Regression: Going beyond Point-and-Click.
ERIC Educational Resources Information Center
King, Jason E.
A review of the literature reveals that important statistical algorithms and indices pertaining to logistic regression are being underused. This paper describes logistic regression in comparison with discriminant analysis and linear regression, and suggests that some techniques only accessible through computer syntax should be consulted in…
Sparse logistic regression with Lp penalty for biomarker identification.
Liu, Zhenqiu; Jiang, Feng; Tian, Guoliang; Wang, Suna; Sato, Fumiaki; Meltzer, Stephen J; Tan, Ming
2007-01-01
In this paper, we propose a novel method for sparse logistic regression with the non-convex regularization Lp (p < 1). Based on smooth approximation, we develop several fast algorithms for learning the classifier that are applicable to high-dimensional datasets such as gene expression data. To the best of our knowledge, these are the first algorithms to perform sparse logistic regression with an Lp and elastic net (Le) penalty. The regularization parameters are chosen by maximizing the area under the ROC curve (AUC) on the test data. Experimental results on methylation and microarray data attest to the accuracy, sparsity, and efficiency of the proposed algorithms. Biomarkers identified with our methods are compared with those in the literature. Our computational results show that Lp logistic regression (p < 1) outperforms L1 logistic regression and SCAD SVM. Software is available upon request from the first author. PMID:17402921
MODELING SNAKE MICROHABITAT FROM RADIOTELEMETRY STUDIES USING POLYTOMOUS LOGISTIC REGRESSION
Multivariate analysis of snake microhabitat has historically used techniques that were derived under assumptions of normality and common covariance structure (e.g., discriminant function analysis, MANOVA). In this study, polytomous logistic regression (PLR which does not require ...
Large unbalanced credit scoring using Lasso-logistic regression ensemble.
Wang, Hong; Xu, Qingsong; Zhou, Lifeng
2015-01-01
Recently, various ensemble learning methods with different base classifiers have been proposed for credit scoring problems. However, for various reasons, there has been little research using logistic regression as the base classifier. In this paper, given large unbalanced data, we consider the plausibility of ensemble learning using regularized logistic regression as the base classifier to deal with credit scoring problems. In this research, the data is first balanced and diversified by clustering and bagging algorithms. Then we apply a Lasso-logistic regression learning ensemble to evaluate the credit risks. We show that the proposed algorithm outperforms popular credit scoring models such as decision tree, Lasso-logistic regression and random forests in terms of AUC and F-measure. We also provide two importance measures for the proposed model to identify important variables in the data. PMID:25706988
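The balance-and-bag stage can be sketched as follows, substituting an unpenalized one-predictor logistic fit by gradient ascent for the authors' Lasso-logistic base learner; the credit data here are simulated toy values, not the study's:

```python
import math
import random

def sigmoid(z):
    z = max(-30.0, min(30.0, z))  # guard against overflow
    return 1.0 / (1.0 + math.exp(-z))

def fit_logistic(xs, ys, lr=0.5, iters=300):
    """One-predictor logistic regression fitted by gradient ascent."""
    b0 = b1 = 0.0
    n = len(xs)
    for _ in range(iters):
        g0 = sum(y - sigmoid(b0 + b1 * x) for x, y in zip(xs, ys))
        g1 = sum((y - sigmoid(b0 + b1 * x)) * x for x, y in zip(xs, ys))
        b0 += lr * g0 / n
        b1 += lr * g1 / n
    return b0, b1

random.seed(0)
# Unbalanced toy credit data: defaulters (y=1) are rare and have higher risk scores.
goods = [(random.gauss(0.0, 1.0), 0) for _ in range(200)]
bads = [(random.gauss(2.0, 1.0), 1) for _ in range(20)]

# Balance each bag by undersampling the majority class, then bag the fits.
models = []
for _ in range(11):
    bag = random.sample(goods, len(bads)) + bads
    xs = [x for x, _ in bag]
    ys = [y for _, y in bag]
    models.append(fit_logistic(xs, ys))

def ensemble_prob(x):
    """Average predicted default probability across the bagged models."""
    return sum(sigmoid(b0 + b1 * x) for b0, b1 in models) / len(models)
```

Averaging predicted probabilities across bags plays the role of the ensemble vote; a real implementation would use a penalized multivariate fit plus the clustering step described in the abstract.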
Estimating the exceedance probability of rain rate by logistic regression
NASA Technical Reports Server (NTRS)
Chiu, Long S.; Kedem, Benjamin
1990-01-01
Recent studies have shown that the fraction of an area with rain intensity above a fixed threshold is highly correlated with the area-averaged rain rate. To estimate the fractional rainy area, a logistic regression model, which estimates the conditional probability that rain rate over an area exceeds a fixed threshold given the values of related covariates, is developed. The problem of dependency in the data in the estimation procedure is bypassed by the method of partial likelihood. Analyses of simulated scanning multichannel microwave radiometer and observed electrically scanning microwave radiometer data during the Global Atlantic Tropical Experiment period show that the use of logistic regression in pixel classification is superior to multiple regression in predicting whether rain rate at each pixel exceeds a given threshold, even in the presence of noisy data. The potential of the logistic regression technique in satellite rain rate estimation is discussed.
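The step from estimated exceedance probabilities to a rain-rate histogram and mean can be sketched with hypothetical values: differencing P(rain rate > t) across successive thresholds gives bin probabilities, from which an area-averaged rate follows:

```python
# Exceedance probabilities S(t) = P(rain rate > t) at fixed thresholds, as a
# logistic model estimated per threshold would provide (hypothetical values, mm/h).
thresholds = [0.0, 1.0, 2.0, 5.0, 10.0]
exceed = [0.30, 0.20, 0.12, 0.05, 0.0]

# Histogram: probability mass in each bin (t_k, t_{k+1}].
bin_probs = [exceed[k] - exceed[k + 1] for k in range(len(exceed) - 1)]

# Area-averaged rain rate approximated with bin midpoints.
midpoints = [(thresholds[k] + thresholds[k + 1]) / 2 for k in range(len(thresholds) - 1)]
mean_rate = sum(p * m for p, m in zip(bin_probs, midpoints))
```

Note that `exceed[0]` is the fractional rainy area, which is why the two quantities are so closely tied in the studies cited above.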
Residuals and regression diagnostics: focusing on logistic regression.
Zhang, Zhongheng
2016-05-01
Up to now I have introduced most steps in regression model building and validation. The last step is to check whether there are observations that have a significant impact on model coefficients and specification. The article first describes plotting Pearson residuals against predictors. Such plots are helpful in identifying non-linearity and provide hints on how to transform predictors. Next, I focus on outlying, high-leverage and influential observations that may have a significant impact on model building. An outlier is an observation whose response value is unusual conditional on its covariate pattern. A leverage point is an observation whose covariate pattern lies far from the rest of the regressor space. Influence is the product of outlyingness and leverage: when an influential observation is dropped from the model, there is a significant shift in the coefficients. Summary statistics for outlierness, leverage and influence are studentized residuals, hat values and Cook's distance, respectively. They can be easily visualized with graphs and formally tested using the car package. PMID:27294091
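The three summary statistics can be computed directly for a simple one-predictor logistic model; the data and coefficients below are hypothetical (in practice one would take the fitted values from the model object and use R's car package, as the article recommends):

```python
import math

def diagnostics(x, y, b0, b1):
    """Pearson residuals, hat values and Cook's distances for a logistic
    model with intercept b0 and slope b1 (hypothetical fit here)."""
    p = [1.0 / (1.0 + math.exp(-(b0 + b1 * xi))) for xi in x]
    w = [pi * (1.0 - pi) for pi in p]
    # X'WX for the design matrix [1, x], inverted in closed form (2x2).
    s00 = sum(w)
    s01 = sum(wi * xi for wi, xi in zip(w, x))
    s11 = sum(wi * xi * xi for wi, xi in zip(w, x))
    det = s00 * s11 - s01 * s01
    inv = ((s11 / det, -s01 / det), (-s01 / det, s00 / det))
    pearson, hat, cooks = [], [], []
    k = 2  # number of fitted parameters
    for xi, yi, pi, wi in zip(x, y, p, w):
        r = (yi - pi) / math.sqrt(wi)                       # Pearson residual
        quad = inv[0][0] + 2 * inv[0][1] * xi + inv[1][1] * xi * xi
        h = wi * quad                                       # hat (leverage) value
        d = r * r * h / (k * (1.0 - h) ** 2)                # Cook's distance
        pearson.append(r)
        hat.append(h)
        cooks.append(d)
    return pearson, hat, cooks

x = [0, 1, 2, 3, 4, 5, 6, 7]
y = [0, 0, 1, 0, 1, 1, 1, 1]
pearson, hat, cooks = diagnostics(x, y, b0=-2.0, b1=0.8)  # hypothetical coefficients
```

A useful sanity check is that the hat values sum to the number of fitted parameters, since the trace of the weighted hat matrix equals its rank.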
Ksantini, Riadh; Ziou, Djemel; Colin, Bernard; Dubeau, François
2008-02-01
In this paper, we investigate the effectiveness of a Bayesian logistic regression model to compute the weights of a pseudo-metric, in order to improve its discriminatory capacity and thereby increase image retrieval accuracy. In the proposed Bayesian model, the prior knowledge of the observations is incorporated and the posterior distribution is approximated by a tractable Gaussian form using variational transformation and Jensen's inequality, which allow a fast and straightforward computation of the weights. The pseudo-metric makes use of the compressed and quantized versions of wavelet decomposed feature vectors, and in our previous work, the weights were adjusted by classical logistic regression model. A comparative evaluation of the Bayesian and classical logistic regression models is performed for content-based image retrieval as well as for other classification tasks, in a decontextualized evaluation framework. In this same framework, we compare the Bayesian logistic regression model to some relevant state-of-the-art classification algorithms. Experimental results show that the Bayesian logistic regression model outperforms these linear classification algorithms, and is a significantly better tool than the classical logistic regression model to compute the pseudo-metric weights and improve retrieval and classification performance. Finally, we perform a comparison with results obtained by other retrieval methods. PMID:18084057
Multinomial logistic regression-based feature selection for hyperspectral data
NASA Astrophysics Data System (ADS)
Pal, Mahesh
2012-02-01
This paper evaluates the performance of three feature selection methods based on multinomial logistic regression, and compares the performance of the best multinomial logistic regression-based feature selection approach with the support vector machine based recursive feature elimination approach. Two hyperspectral datasets, one consisting of 65 features (DAIS data) and the other of 185 features (AVIRIS data), were used. Results suggest that a total of between 10 and 15 features selected by the multinomial logistic regression-based feature selection approach proposed by Cawley and Talbot achieves a significant improvement in classification accuracy in comparison to the use of all the features of the DAIS and AVIRIS datasets. In addition to the improved performance, the Cawley and Talbot approach does not require any user-defined parameter, thus avoiding the requirement of a model selection stage. In comparison, the other two multinomial logistic regression-based feature selection approaches require one user-defined parameter and do not perform as well as the Cawley and Talbot approach in terms of (i) the number of features required to achieve classification accuracy comparable to that achieved using the full dataset, and (ii) the classification accuracy achieved by the selected features. The Cawley and Talbot approach was also found to be computationally more efficient than the SVM-RFE technique, though both use the same number of selected features to achieve an equal or even higher level of accuracy than that achieved with the full hyperspectral datasets.
Detecting Differential Item Functioning Using Logistic Regression Procedures.
ERIC Educational Resources Information Center
Swaminathan, Hariharan; Rogers, H. Jane
1990-01-01
A logistic regression model for characterizing differential item functioning (DIF) between two groups is presented. A distinction is drawn between uniform and nonuniform DIF in terms of model parameters. A statistic for testing the hypotheses of no DIF is developed, and simulation studies compare it with the Mantel-Haenszel procedure. (Author/TJH)
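A minimal illustration of the logistic-regression DIF idea on simulated data (a sketch of the general technique, not the authors' exact procedure): fit nested models with a group main effect (uniform DIF) and a group-by-score interaction (nonuniform DIF), then compare likelihood-ratio statistics:

```python
import numpy as np

def fit(X, y, iters=30):
    """Newton-Raphson fit of a binary logistic model."""
    beta = np.zeros(X.shape[1])
    for _ in range(iters):
        p = 1/(1 + np.exp(-X @ beta))
        H = (X * (p*(1-p))[:, None]).T @ X     # observed information
        beta += np.linalg.solve(H, X.T @ (y - p))
    return beta

def loglik(X, y, beta):
    eta = X @ beta
    return np.sum(y*eta - np.log1p(np.exp(eta)))

rng = np.random.default_rng(4)
n = 1000
theta = rng.normal(size=n)                  # matching (ability) score
g = rng.integers(0, 2, n).astype(float)     # group membership
eta = theta - 0.8*g                         # uniform DIF: item harder for group 1
y = (rng.random(n) < 1/(1+np.exp(-eta))).astype(float)

X0 = np.column_stack([np.ones(n), theta])              # no DIF
X1 = np.column_stack([np.ones(n), theta, g])           # + uniform DIF term
X2 = np.column_stack([np.ones(n), theta, g, theta*g])  # + nonuniform DIF term
ll0, ll1, ll2 = (loglik(X, y, fit(X, y)) for X in (X0, X1, X2))
lr_uniform = 2*(ll1 - ll0)      # tests the group main effect
lr_nonuniform = 2*(ll2 - ll1)   # tests the interaction
print(lr_uniform, lr_nonuniform)
```

With uniform DIF built into the simulation, the first statistic should be large against a chi-squared(1) reference while the second stays small.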
Hierarchical Logistic Regression: Accounting for Multilevel Data in DIF Detection
ERIC Educational Resources Information Center
French, Brian F.; Finch, W. Holmes
2010-01-01
The purpose of this study was to examine the performance of differential item functioning (DIF) assessment in the presence of a multilevel structure that often underlies data from large-scale testing programs. Analyses were conducted using logistic regression (LR), a popular, flexible, and effective tool for DIF detection. Data were simulated…
Weather adjustment using seemingly unrelated regression
Noll, T.A.
1995-05-01
Seemingly unrelated regression (SUR) is a system estimation technique that accounts for time-contemporaneous correlation between individual equations within a system of equations. SUR is suited to weather adjustment estimations when the estimation is: (1) composed of a system of equations and (2) the system of equations represents either different weather stations, different sales sectors or a combination of different weather stations and different sales sectors. SUR utilizes the cross-equation error values to develop more accurate estimates of the system coefficients than are obtained using ordinary least-squares (OLS) estimation. SUR estimates can be generated using a variety of statistical software packages including MicroTSP and SAS.
Determination of riverbank erosion probability using Locally Weighted Logistic Regression
NASA Astrophysics Data System (ADS)
Ioannidou, Elena; Flori, Aikaterini; Varouchakis, Emmanouil A.; Giannakis, Georgios; Vozinaki, Anthi Eirini K.; Karatzas, George P.; Nikolaidis, Nikolaos
2015-04-01
Riverbank erosion is a natural geomorphologic process that affects the fluvial environment. The most important issue concerning riverbank erosion is the identification of the vulnerable locations. An alternative to the usual hydrodynamic models to predict vulnerable locations is to quantify the probability of erosion occurrence. This can be achieved by identifying the underlying relations between riverbank erosion and the geomorphological or hydrological variables that prevent or stimulate erosion. Thus, riverbank erosion can be determined by a regression model using independent variables that are considered to affect the erosion process. The impact of such variables may vary spatially; therefore, a non-stationary regression model is preferred to a stationary equivalent. Locally Weighted Regression (LWR) is proposed as a suitable choice. This method can be extended to predict the binary presence or absence of erosion based on a series of independent local variables by using the logistic regression model. It is referred to as Locally Weighted Logistic Regression (LWLR). Logistic regression is a type of regression analysis used for predicting the outcome of a categorical dependent variable (e.g. binary response) based on one or more predictor variables. The method can be combined with LWR to assign weights to the local independent variables of the dependent one. LWR allows model parameters to vary over space in order to reflect spatial heterogeneity. The probabilities of the possible outcomes are modelled as a function of the independent variables using a logistic function. Logistic regression measures the relationship between a categorical dependent variable and, usually, one or several continuous independent variables by converting the dependent variable to probability scores. Then, a logistic regression is formed, which predicts success or failure of a given binary variable (e.g. erosion presence or absence) for any value of the independent variables. The
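A compact sketch of the locally weighted logistic idea under stated assumptions (synthetic erosion data, a Gaussian kernel, weighted Newton steps; the bandwidth, covariate name, and seed are illustrative, not from the study):

```python
import numpy as np

rng = np.random.default_rng(5)
n = 400
coords = rng.uniform(0, 1, size=(n, 2))      # hypothetical bank locations
bank_slope = rng.normal(size=n)              # one local covariate
# Binary erosion outcome generated from a logistic relationship.
y = (rng.random(n) < 1/(1+np.exp(-1.2*bank_slope))).astype(float)

def lwlr_coeffs(target, bandwidth=0.4, iters=30):
    """Weighted IRLS centred on one location: each observation gets a
    Gaussian kernel weight based on its distance to the target point."""
    d = np.linalg.norm(coords - target, axis=1)
    k = np.exp(-0.5 * (d / bandwidth) ** 2)  # geographic weights
    X = np.column_stack([np.ones(n), bank_slope])
    beta = np.zeros(2)
    for _ in range(iters):
        p = 1/(1+np.exp(-X @ beta))
        H = (X * (k*p*(1-p))[:, None]).T @ X
        beta += np.linalg.solve(H, X.T @ (k*(y - p)))
    return beta

b_local = lwlr_coeffs(np.array([0.5, 0.5]))
print(b_local)   # local intercept and covariate effect near the target point
```

Refitting at each location lets the coefficients vary over space, which is the non-stationarity the abstract describes.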
Modeling urban growth with geographically weighted multinomial logistic regression
NASA Astrophysics Data System (ADS)
Luo, Jun; Kanala, Nagaraj Kapi
2008-10-01
Spatial heterogeneity is usually ignored in previous land use change studies. This paper presents a geographically weighted multinomial logistic regression model for investigating multiple land use conversions in the urban growth process. The proposed model makes estimates at each sample location and generates local coefficients of the driving factors for land use conversion. A Gaussian function is used to determine the geographic weights, guaranteeing that all other samples are involved in the calibration of the model for one location. A case study of the Springfield metropolitan area is conducted. A set of independent variables is selected as driving factors. A traditional multinomial logistic regression model is set up and compared with the proposed model. Spatial variations in the coefficients of the independent variables are revealed by investigating the estimates at sample locations.
ERIC Educational Resources Information Center
Shih, Ching-Lin; Liu, Tien-Hsiang; Wang, Wen-Chung
2014-01-01
The simultaneous item bias test (SIBTEST) method regression procedure and the differential item functioning (DIF)-free-then-DIF strategy are applied to the logistic regression (LR) method simultaneously in this study. These procedures are used to adjust the effects of matching true score on observed score and to better control the Type I error…
Classifying machinery condition using oil samples and binary logistic regression
NASA Astrophysics Data System (ADS)
Phillips, J.; Cripps, E.; Lau, John W.; Hodkiewicz, M. R.
2015-08-01
The era of big data has resulted in an explosion of condition monitoring information. The result is an increasing motivation to automate the costly and time-consuming human elements involved in the classification of machine health. When working with industry it is important to build an understanding of, and hence some trust in, the classification scheme for those who use the analysis to initiate maintenance tasks. Typical "black box" approaches such as artificial neural networks (ANN) and support vector machines (SVM) offer little interpretability. In contrast, this paper argues that logistic regression offers easy interpretability to industry experts, providing insight into the drivers of the human classification process and into the ramifications of potential misclassification. Of course, accuracy is of foremost importance in any automated classification scheme, so we also provide a comparative study based on the predictive performance of logistic regression, ANN and SVM. A real-world oil analysis data set from engines on mining trucks is presented and, using cross-validation, we demonstrate that logistic regression outperforms the ANN and SVM approaches in terms of prediction for healthy/not healthy engines.
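The interpretability argument above rests on the plain binary logistic model; a minimal numpy sketch (synthetic data standing in for the oil-analysis features, which the abstract does not reproduce) fits it by Newton-Raphson:

```python
import numpy as np

def fit_logistic(X, y, iters=25):
    """Binary logistic regression via Newton-Raphson (IRLS).
    X must include an intercept column."""
    beta = np.zeros(X.shape[1])
    for _ in range(iters):
        p = 1.0/(1.0 + np.exp(-X @ beta))        # fitted probabilities
        grad = X.T @ (y - p)                     # score vector
        hess = (X * (p*(1-p))[:, None]).T @ X    # observed information
        beta += np.linalg.solve(hess, grad)
    return beta

# Synthetic "healthy / not healthy" outcome driven by one feature.
rng = np.random.default_rng(0)
x = rng.normal(size=500)
y = (rng.random(500) < 1/(1+np.exp(-(-0.5 + 2.0*x)))).astype(float)
X = np.column_stack([np.ones_like(x), x])
beta = fit_logistic(X, y)
print(beta)   # estimates should land near the true values (-0.5, 2.0)
```

Each coefficient exponentiates to an odds ratio, which is exactly the property that makes the model easy to explain to domain experts.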
Comparison of Logistic Regression and Linear Regression in Modeling Percentage Data
Zhao, Lihui; Chen, Yuhuan; Schaffner, Donald W.
2001-01-01
Percentage is widely used to describe different results in food microbiology, e.g., probability of microbial growth, percent inactivated, and percent of positive samples. Four sets of percentage data, percent-growth-positive, germination extent, probability for one cell to grow, and maximum fraction of positive tubes, were obtained from our own experiments and the literature. These data were modeled using linear and logistic regression. Five methods were used to compare the goodness of fit of the two models: percentage of predictions closer to observations, range of the differences (predicted value minus observed value), deviation of the model, linear regression between the observed and predicted values, and bias and accuracy factors. Logistic regression was a better predictor of at least 78% of the observations in all four data sets. In all cases, the deviation of logistic models was much smaller. The linear correlation between observations and logistic predictions was always stronger. Validation (accomplished using part of one data set) also demonstrated that the logistic model was more accurate in predicting new data points. Bias and accuracy factors were found to be less informative when evaluating models developed for percentage data, since neither of these indices can compare predictions at zero. Model simplification for the logistic model was demonstrated with one data set. The simplified model was as powerful in making predictions as the full linear model, and it also gave clearer insight in determining the key experimental factors. PMID:11319091
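The boundedness argument can be seen in a toy example (illustrative numbers, not the paper's data): a straight line fit to S-shaped percentage data escapes [0, 1], while a logistic curve cannot:

```python
import numpy as np

# Illustrative percent-positive data following an S-shaped trend.
x = np.array([0., 1., 2., 3., 4., 5., 6.])
p = np.array([0.02, 0.05, 0.12, 0.50, 0.88, 0.95, 0.98])
A = np.column_stack([np.ones_like(x), x])

b_lin = np.linalg.lstsq(A, p, rcond=None)[0]                 # linear fit
b_log = np.linalg.lstsq(A, np.log(p/(1-p)), rcond=None)[0]   # fit on logit scale

pred_lin = A @ b_lin
pred_log = 1/(1+np.exp(-(A @ b_log)))
print(pred_lin.min(), pred_lin.max())   # strays below 0 and above 1
print(pred_log.min(), pred_log.max())   # always inside (0, 1)
```

This is one mechanical reason the logistic model out-predicted the linear one on the four percentage data sets.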
Prediction of siRNA potency using sparse logistic regression.
Hu, Wei; Hu, John
2014-06-01
RNA interference (RNAi) can modulate gene expression at post-transcriptional as well as transcriptional levels. Short interfering RNA (siRNA) serves as a trigger for the RNAi gene inhibition mechanism, and therefore is a crucial intermediate step in RNAi. There have been extensive studies to identify the sequence characteristics of potent siRNAs. One such study built a linear model using LASSO (Least Absolute Shrinkage and Selection Operator) to measure the contribution of each siRNA sequence feature. This model is simple and interpretable, but it requires a large number of nonzero weights. We have introduced a novel technique, sparse logistic regression, to build a linear model using single-position specific nucleotide compositions that has the same prediction accuracy as the linear model based on LASSO. The weights in our new model share the same general trend as those in the previous model, but have only 25 nonzero weights out of a total of 84 weights, a 54% reduction compared to the previous model. Contrary to the linear model based on LASSO, our model suggests that only a few positions are influential on the efficacy of the siRNA, namely the 5' and 3' ends and the seed region of siRNA sequences. We also employed sparse logistic regression to build a linear model using dual-position specific nucleotide compositions, a task LASSO is not able to accomplish well due to its high-dimensional nature. Our results demonstrate the superiority of sparse logistic regression as a technique for both feature selection and regression over LASSO in the context of siRNA design. PMID:21091052
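A hedged sketch of sparse (L1-penalised) logistic regression via proximal gradient descent; the data are synthetic and the penalty weight is arbitrary, but the soft-thresholding step shows how irrelevant positions get exactly zero weight:

```python
import numpy as np

def sparse_logistic(X, y, lam=0.1, lr=0.1, iters=2000):
    """L1-penalised logistic regression by proximal gradient (ISTA)."""
    n, p = X.shape
    w = np.zeros(p)
    for _ in range(iters):
        prob = 1/(1+np.exp(-X @ w))
        w = w - lr * X.T @ (prob - y) / n                     # gradient step
        w = np.sign(w) * np.maximum(np.abs(w) - lr*lam, 0.0)  # soft-threshold
    return w

rng = np.random.default_rng(1)
X = rng.normal(size=(400, 10))
# Only the first two features influence the outcome.
y = (rng.random(400) < 1/(1+np.exp(-(1.5*X[:, 0] - 2.0*X[:, 1])))).astype(float)
w = sparse_logistic(X, y)
print(np.nonzero(w)[0])   # the surviving (nonzero) feature indices
```

The exact zeros are what distinguish this from plain shrinkage: uninformative positions drop out of the model entirely, which is the behaviour the abstract exploits for siRNA position selection.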
Using and Interpreting Logistic Regression: A Guide for Teachers and Students.
ERIC Educational Resources Information Center
Lottes, Ilsa L.; And Others
1996-01-01
Defines and illustrates basic concepts of dichotomous logistic regression (DLR) and presents strategies for teaching these concepts. Strategies include using analogies between ordinary least squares regression and logistic regression; illustrating concepts with contingency tables; and linking logistic regression concepts to interpretation of…
Robust Logistic and Probit Methods for Binary and Multinomial Regression
Tabatabai, MA; Li, H; Eby, WM; Kengwoung-Keumo, JJ; Manne, U; Bae, S; Fouad, M; Singh, KP
2015-01-01
In this paper we introduce new robust estimators for logistic and probit regressions for binary, multinomial, nominal and ordinal data, and apply these models to estimate the parameters when outliers or influential observations are present. Maximum likelihood estimates do not behave well when outliers or influential observations are present. One remedy is to remove influential observations from the data and then apply the maximum likelihood technique to the deleted data. Another approach is to employ a robust technique that can handle outliers and influential observations without removing any observations from the data sets. The robustness of the method is tested using real and simulated data sets. PMID:26078914
Ranked Tag Recommendation Systems Based on Logistic Regression
NASA Astrophysics Data System (ADS)
Quevedo, J. R.; Montañés, E.; Ranilla, J.; Díaz, I.
This work proposes a tag recommendation approach based on logistic regression. The goal of the method is to support users of current social network systems by providing a rank of new meaningful tags for a resource. The system provides a ranked tag set and it feeds on different posts depending on the resource for which the user requests the recommendation. The performance of this approach is tested according to several evaluation measures, one of them proposed in this paper (F_1^+). The experiments show that this learning system outperforms certain benchmark recommenders.
Comparison of a Bayesian Network with a Logistic Regression Model to Forecast IgA Nephropathy
Ducher, Michel; Kalbacher, Emilie; Combarnous, François; Finaz de Vilaine, Jérome; McGregor, Brigitte; Fouque, Denis; Fauvel, Jean Pierre
2013-01-01
Models are increasingly used in clinical practice to improve the accuracy of diagnosis. The aim of our work was to compare a Bayesian network to logistic regression to forecast IgA nephropathy (IgAN) from simple clinical and biological criteria. Retrospectively, we pooled the results of all biopsies (n = 155) performed by nephrologists in a specialist clinical facility between 2002 and 2009. Two groups were constituted at random. The first subgroup was used to determine the parameters of the models adjusted to data by logistic regression or Bayesian network, and the second was used to compare the performances of the models using receiver operating characteristics (ROC) curves. IgAN was found (on pathology) in 44 patients. Areas under the ROC curves provided by both methods were highly significant but not different from each other. Based on the highest Youden indices, sensitivity reached 100% versus 67%, and specificity 73% versus 95%, using the Bayesian network and logistic regression, respectively. A Bayesian network is at least as efficient as logistic regression to estimate the probability of a patient suffering from IgAN, using simple clinical and biological data obtained during consultation. PMID:24328031
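The ROC comparison above relies on standard quantities; a small deterministic sketch (toy scores, not the study's data) computes sensitivity, specificity, the Youden index, and AUC:

```python
import numpy as np

def roc_curve(scores, labels):
    """Sensitivity and specificity at every threshold, sorted by score."""
    order = np.argsort(-scores)
    lab = np.asarray(labels)[order]
    P, N = lab.sum(), (1 - lab).sum()
    sens = np.cumsum(lab) / P             # true positive rate
    spec = 1 - np.cumsum(1 - lab) / N     # true negative rate
    return sens, spec

def auc(scores, labels):
    """Mann-Whitney form of the area under the ROC curve."""
    pos, neg = scores[labels == 1], scores[labels == 0]
    return np.mean(pos[:, None] > neg[None, :])

scores = np.array([0.9, 0.8, 0.7, 0.6, 0.55, 0.5, 0.4, 0.3])
labels = np.array([1, 1, 0, 1, 0, 1, 0, 0])
sens, spec = roc_curve(scores, labels)
youden = sens + spec - 1
best = np.argmax(youden)        # threshold maximizing Youden's J
print(sens[best], spec[best])   # 0.5 1.0
print(auc(scores, labels))      # 0.8125
```

Reporting sensitivity and specificity at the highest Youden index, as the abstract does, corresponds to picking the `best` threshold here.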
[Outliers and robust logistic regression in Health Sciences].
Cutanda Henríquez, Francisco
2008-01-01
Logistic regression methods have many applications in Health Sciences. There is a vast literature about procedures to be followed and the way to find the estimators for the parameters from the observed values, and these methods are implemented to all the usual statistical packages. These estimators are of the 'maximum likelihood' kind, i.e., they are the ones that make the observed values the most probable among all the models that could have been used. The good properties of the maximum likelihood estimators are widely demonstrated. However, there are some practical circumstances that may cause the presence of 'outliers', i.e., observed values not corresponding to the logistic model we are assuming as a hypothesis. Occasionally, these anomalous observations can have a strong effect on the fit, and lead the study to the wrong conclusion. The causes of these outliers depend on the particular study, but it is possible to point out classification errors, observations (subjects) with special features which have not been taken into account, uncertainty in the measurement of some parameters, etc. The problem with maximum likelihood estimators is that they are not 'robust', i.e., their sensitivity to outliers could be arbitrarily large, and a minority of outliers could lead to a wrong logistic model. In this work, we will show two cases illustrating possible consequences, and we will discuss the application of robust methods. PMID:19180273
Evaluating sediment chemistry and toxicity data using logistic regression modeling
Field, L.J.; MacDonald, D.D.; Norton, S.B.; Severn, C.G.; Ingersoll, C.G.
1999-01-01
This paper describes the use of logistic-regression modeling for evaluating matching sediment chemistry and toxicity data. Contaminant-specific logistic models were used to estimate the percentage of samples expected to be toxic at a given concentration. These models enable users to select the probability of effects of concern corresponding to their specific assessment or management objective or to estimate the probability of observing specific biological effects at any contaminant concentration. The models were developed using a large database (n = 2,524) of matching saltwater sediment chemistry and toxicity data for field-collected samples compiled from a number of different sources and geographic areas. The models for seven chemicals selected as examples showed a wide range in goodness of fit, reflecting high variability in toxicity at low concentrations and limited data on toxicity at higher concentrations for some chemicals. The models for individual test endpoints (e.g., amphipod mortality) provided a better fit to the data than the models based on all endpoints combined. A comparison of the relative sensitivity of two amphipod species to specific contaminants illustrated an important application of the logistic model approach.
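The two uses described above, reading a probability of toxicity off the curve and inverting it for a target probability, reduce to simple arithmetic once a model is fitted. A sketch with hypothetical coefficients (assumed for illustration; the paper's fitted values are not reproduced here), taking the model as logistic in log10 concentration:

```python
import math

# Hypothetical coefficients for one contaminant: logit P(toxic) = a + b*log10(C).
a, b = -4.0, 2.0

def prob_toxic(conc):
    """Predicted probability that a sample is toxic at concentration conc."""
    return 1/(1 + math.exp(-(a + b*math.log10(conc))))

def conc_at(p):
    """Invert the curve: concentration where predicted P(toxic) equals p."""
    return 10 ** ((math.log(p/(1-p)) - a) / b)

print(prob_toxic(100.0))   # 0.5, since a + b*log10(100) = 0
print(conc_at(0.5))        # 100.0
```

`conc_at` is the management-oriented direction: pick an acceptable probability of effects and read off the corresponding concentration threshold.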
Landslide Hazard Mapping in Rwanda Using Logistic Regression
NASA Astrophysics Data System (ADS)
Piller, A.; Anderson, E.; Ballard, H.
2015-12-01
Landslides in the United States cause more than $1 billion in damages and 50 deaths per year (USGS 2014). Globally, figures are much more grave, yet monitoring, mapping and forecasting of these hazards are less than adequate. Seventy-five percent of the population of Rwanda earns a living from farming, mostly subsistence. Loss of farmland, housing, or life, to landslides is a very real hazard. Landslides in Rwanda have an impact at the economic, social, and environmental level. In a developing nation that faces challenges in tracking, cataloging, and predicting the numerous landslides that occur each year, satellite imagery and spatial analysis allow for remote study. We have focused on the development of a landslide inventory and a statistical methodology for assessing landslide hazards. Using logistic regression on approximately 30 test variables (i.e. slope, soil type, land cover, etc.) and a sample of over 200 landslides, we determine which variables are statistically most relevant to landslide occurrence in Rwanda. A preliminary predictive hazard map for Rwanda has been produced, using the variables selected from the logistic regression analysis.
Change point testing in logistic regression models with interaction term.
Fong, Youyi; Di, Chongzhi; Permar, Sallie
2015-04-30
A threshold effect takes place in situations where the relationship between an outcome variable and a predictor variable changes as the predictor value crosses a certain threshold/change point. Threshold effects are often plausible in a complex biological system, especially in defining immune responses that are protective against infections such as HIV-1, which motivates the current work. We study two hypothesis testing problems in change point models. We first compare three different approaches to obtaining a p-value for the maximum of scores test in a logistic regression model with change point variable as a main effect. Next, we study the testing problem in a logistic regression model with the change point variable both as a main effect and as part of an interaction term. We propose a test based on the maximum of likelihood ratios test statistic and obtain its reference distribution through a Monte Carlo method. We also propose a maximum of weighted scores test that can be more powerful than the maximum of likelihood ratios test when we know the direction of the interaction effect. In simulation studies, we show that the proposed tests have a correct type I error and higher power than several existing methods. We illustrate the application of change point model-based testing methods in a recent study of immune responses that are associated with the risk of mother to child transmission of HIV-1. PMID:25612253
Metadata for ReVA logistic regression dataset
LaMotte, Andrew E.
2004-01-01
The U.S. Geological Survey in cooperation with the U.S. Environmental Protection Agency's Regional Vulnerability Assessment Program, has developed a set of statistical tools to support regional-scale, ground-water quality and vulnerability assessments. The Regional Vulnerability Assessment Program goals are to develop and demonstrate approaches to comprehensive, regional-scale assessments that effectively inform water-resources managers and decision-makers as to the magnitude, extent, distribution, and uncertainty of current and anticipated environmental risks. The U.S. Geological Survey is developing and exploring the use of statistical probability models to characterize the relation between ground-water quality and geographic factors in the Mid-Atlantic Region. Available water-quality data obtained from U.S. Geological Survey National Water-Quality Assessment Program studies conducted in the Mid-Atlantic Region were used in association with geographic data (land cover, geology, soils, and others) to develop logistic-regression equations that use explanatory variables to predict the presence of a selected water-quality parameter exceeding specified management concentration thresholds. The resulting logistic-regression equations were transformed to determine the probability, P(X), of a water-quality parameter exceeding a specified management threshold. Additional statistical procedures modified by the U.S. Geological Survey were used to compare the observed values to model-predicted values at each sample point. In addition, procedures to evaluate the confidence of the model predictions and estimate the uncertainty of the probability value were developed and applied. The resulting logistic-regression models were applied to the Mid-Atlantic Region to predict the spatial probability of nitrate concentrations exceeding specified management thresholds. These thresholds are usually set or established by regulators or managers at national or local levels. At management
Drought Patterns Forecasting using an Auto-Regressive Logistic Model
NASA Astrophysics Data System (ADS)
del Jesus, M.; Sheffield, J.; Méndez Incera, F. J.; Losada, I. J.; Espejo, A.
2014-12-01
Drought is characterized by a water deficit that may manifest across a large range of spatial and temporal scales. Drought may create important socio-economic consequences, often of catastrophic dimensions. A quantifiable definition of drought is elusive because, depending on its impacts, consequences and generation mechanism, different water deficit periods may be identified as a drought by virtue of some definitions but not by others. Droughts are linked to the water cycle and, although a climate change signal may not have emerged yet, they are also intimately linked to climate. In this work we develop an auto-regressive logistic model for drought prediction at different temporal scales that makes use of a spatially explicit framework. Our model allows the inclusion of covariates, continuous or categorical, to improve the performance of the auto-regressive component. Our approach makes use of dimensionality reduction (principal component analysis) and classification techniques (K-Means and maximum dissimilarity) to simplify the representation of complex climatic patterns, such as sea surface temperature (SST) and sea level pressure (SLP), while including information on their spatial structure, i.e. considering their spatial patterns. This procedure allows us to include in the analysis multivariate representations of complex climatic phenomena, such as the El Niño-Southern Oscillation. We also explore the impact of other climate-related variables such as sun spots. The model allows us to quantify the uncertainty of the forecasts and can be easily adapted to make predictions under future climatic scenarios. The framework herein presented may be extended to other applications such as flash flood analysis or risk assessment of natural hazards.
NASA Astrophysics Data System (ADS)
Ariffin, Syaiba Balqish; Midi, Habshah
2014-06-01
This article is concerned with the performance of the logistic ridge regression estimation technique in the presence of multicollinearity and high leverage points. In logistic regression, multicollinearity exists among predictors and in the information matrix. The maximum likelihood estimator suffers a huge setback in the presence of multicollinearity, which causes regression estimates to have unduly large standard errors. To remedy this problem, a logistic ridge regression estimator is put forward. It is evident that the logistic ridge regression estimator outperforms the maximum likelihood approach in handling multicollinearity. The effect of high leverage points is then investigated on the performance of the logistic ridge regression estimator through a real data set and a simulation study. The findings signify that the logistic ridge regression estimator fails to provide better parameter estimates in the presence of both high leverage points and multicollinearity.
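A minimal sketch of the ridge remedy (synthetic, nearly collinear predictors; the penalty value is arbitrary): the L2 term keeps the Newton step well-conditioned where plain maximum likelihood would produce wildly unstable estimates:

```python
import numpy as np

def ridge_logistic(X, y, lam=5.0, iters=50):
    """L2-penalised (ridge) logistic regression via Newton's method."""
    beta = np.zeros(X.shape[1])
    I = np.eye(X.shape[1])
    for _ in range(iters):
        p = 1/(1+np.exp(-X @ beta))
        grad = X.T @ (y - p) - lam*beta              # penalised score
        hess = (X * (p*(1-p))[:, None]).T @ X + lam*I  # regularised information
        beta += np.linalg.solve(hess, grad)
    return beta

rng = np.random.default_rng(2)
x1 = rng.normal(size=300)
x2 = x1 + 0.01*rng.normal(size=300)     # nearly perfect collinearity
y = (rng.random(300) < 1/(1+np.exp(-(x1 + x2)))).astype(float)
beta = ridge_logistic(np.column_stack([x1, x2]), y)
print(beta)   # two moderate, nearly equal coefficients instead of wild estimates
```

The ridge term penalises only the direction the data cannot identify (the difference between the two collinear coefficients), so their sum, which the data do identify, is recovered with only mild shrinkage.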
On the Usefulness of a Multilevel Logistic Regression Approach to Person-Fit Analysis
ERIC Educational Resources Information Center
Conijn, Judith M.; Emons, Wilco H. M.; van Assen, Marcel A. L. M.; Sijtsma, Klaas
2011-01-01
The logistic person response function (PRF) models the probability of a correct response as a function of the item locations. Reise (2000) proposed to use the slope parameter of the logistic PRF as a person-fit measure. He reformulated the logistic PRF model as a multilevel logistic regression model and estimated the PRF parameters from this…
Concept formation vs. logistic regression: predicting death in trauma patients.
Hadzikadic, M; Hakenewerth, A; Bohren, B; Norton, J; Mehta, B; Andrews, C
1996-10-01
This study compares two classification models used to predict survival of injured patients entering the emergency department. Concept formation is a machine learning technique that summarizes known example cases in the form of a tree. After the tree is constructed, it can then be used to predict the classification of new cases. Logistic regression, on the other hand, is a statistical model that allows for a quantitative relationship for a dichotomous event with several independent variables. The outcome (dependent) variable must have only two choices, e.g. does or does not occur, alive or dead, etc. The result of this model is an equation which is then used to predict the probability of class membership of a new case. The two models were evaluated on a trauma registry database composed of information on all trauma patients admitted in 1992 to a Level I trauma center. A total of 2155 records, representing all trauma patients admitted for more than 24 h or who died in the Emergency Department, were grouped into two databases as follows: (1) discharge status of 'died' (containing 151 records), and (2) any discharge status other than 'died' (containing 2004 records). Both databases contained the same variables. PMID:8955858
Application of Bayesian Logistic Regression to Mining Biomedical Data
Avali, Viji R.; Cooper, Gregory F.; Gopalakrishnan, Vanathi
2014-01-01
Mining high dimensional biomedical data with existing classifiers is challenging and the predictions are often inaccurate. We investigated the use of Bayesian Logistic Regression (B-LR) for mining such data to predict and classify various disease conditions. The analysis was done on twelve biomedical datasets with binary class variables and the performance of B-LR was compared to those from other popular classifiers on these datasets with 10-fold cross validation using the WEKA data mining toolkit. The statistical significance of the results was analyzed by paired two tailed t-tests and non-parametric Wilcoxon signed-rank tests. We observed overall that B-LR with non-informative Gaussian priors performed on par with other classifiers in terms of accuracy, balanced accuracy and AUC. These results suggest that it is worthwhile to explore the application of B-LR to predictive modeling tasks in bioinformatics using informative biological prior probabilities. With informative prior probabilities, we conjecture that the performance of B-LR will improve. PMID:25954328
Efficient Logistic Regression Designs Under an Imperfect Population Identifier
Albert, Paul S.; Liu, Aiyi; Nansel, Tonja
2013-01-01
Motivated by actual study designs, this article considers efficient logistic regression designs where the population is identified with a binary test that is subject to diagnostic error. We consider the case where the imperfect test is obtained on all participants, while the gold standard test is measured on a small chosen subsample. Under maximum-likelihood estimation, we evaluate the optimal design in terms of sample selection as well as verification. We show that there may be substantial efficiency gains by choosing a small percentage of individuals who test negative on the imperfect test for inclusion in the sample (e.g., verifying 90% test-positive cases). We also show that a two-stage design may be a good practical alternative to a fixed design in some situations. Under optimal and nearly optimal designs, we compare maximum-likelihood and semi-parametric efficient estimators under correct and misspecified models with simulations. The methodology is illustrated with an analysis from a diabetes behavioral intervention trial. PMID:24261471
Risk factors for temporomandibular disorder: Binary logistic regression analysis
Magalhães, Bruno G.; de-Sousa, Stéphanie T.; de Mello, Victor V C.; da-Silva-Barbosa, André C.; de-Assis-Morais, Mariana P L.; Barbosa-Vasconcelos, Márcia M V.
2014-01-01
Objectives: To analyze the influence of socioeconomic and demographic factors (gender, economic class, age and marital status) on the occurrence of temporomandibular disorder. Study Design: One hundred individuals from urban areas in the city of Recife (Brazil) registered at Family Health Units were examined using Axis I of the Research Diagnostic Criteria for Temporomandibular Disorders (RDC/TMD), which addresses myofascial pain and joint problems (disc displacement, arthralgia, osteoarthritis and osteoarthrosis). The Brazilian Economic Classification Criteria (CCEB) was used for the collection of socioeconomic and demographic data. Respondents were categorized as Class A (high social class), Classes B/C (middle class) and Classes D/E (very poor social class). The results were analyzed using Pearson’s chi-square test for proportions, Fisher’s exact test, the nonparametric Mann-Whitney test and binary logistic regression analysis. Results: None of the participants belonged to Class A, 72% belonged to Classes B/C and 28% belonged to Classes D/E. The multivariate analysis revealed that participants from Classes D/E had a 4.35-fold greater chance of exhibiting myofascial pain and an 11.3-fold greater chance of exhibiting joint problems. Conclusions: Poverty is an important risk factor for myofascial pain and joint problems. Key words: Temporomandibular joint disorders, risk factors, prevalence. PMID:24316706
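The reported 4.35-fold odds can be connected back to a logistic coefficient with generic arithmetic (the baseline probability below is assumed for illustration, not taken from the study):

```python
import math

OR = 4.35            # reported odds ratio for myofascial pain
b = math.log(OR)     # the corresponding logistic regression coefficient

# Odds ratios multiply odds, not probabilities. With an assumed baseline
# probability of 0.2, the implied probability in the exposed group is:
baseline_odds = 0.2 / (1 - 0.2)
exposed_odds = baseline_odds * math.exp(b)
exposed_p = exposed_odds / (1 + exposed_odds)
print(round(exposed_p, 3))   # 0.521
```

The distinction matters when interpreting such results: a 4.35-fold increase in odds is not a 4.35-fold increase in probability unless the baseline probability is very small.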
Autoregressive logistic regression applied to atmospheric circulation patterns
NASA Astrophysics Data System (ADS)
Guanche, Y.; Mínguez, R.; Méndez, F. J.
2014-01-01
Autoregressive logistic regression models have been successfully applied in medical and pharmacology research fields, and in simple models to analyze weather types. The main purpose of this paper is to introduce a general framework to study atmospheric circulation patterns capable of dealing simultaneously with seasonality, interannual variability, long-term trends, and autocorrelation of different orders. To show its effectiveness on modeling performance, daily atmospheric circulation patterns identified from observed sea level pressure fields over the Northeastern Atlantic have been analyzed using this framework. Model predictions are compared with probabilities from the historical database, showing very good fitting diagnostics. In addition, the fitted model is used to simulate the evolution over time of atmospheric circulation patterns using a Monte Carlo method. Simulation results are statistically consistent with the historical sequence in terms of (1) probability of occurrence of the different weather types, (2) transition probabilities and (3) persistence. The proposed model constitutes an easy-to-use and powerful tool for a better understanding of the climate system.
ERIC Educational Resources Information Center
Chen, Chau-Kuang
2005-01-01
Logistic and Cox regression methods are practical tools used to model the relationships between certain student learning outcomes and their relevant explanatory variables. The logistic regression model fits an S-shaped curve to a binary outcome coded zero and one. The Cox regression model allows investigators to study the duration…
Mixed-Effects Logistic Regression Models for Indirectly Observed Discrete Outcome Variables
ERIC Educational Resources Information Center
Vermunt, Jeroen K.
2005-01-01
A well-established approach to modeling clustered data introduces random effects in the model of interest. Mixed-effects logistic regression models can be used to predict discrete outcome variables when observations are correlated. An extension of the mixed-effects logistic regression model is presented in which the dependent variable is a latent…
What Are the Odds of that? A Primer on Understanding Logistic Regression
ERIC Educational Resources Information Center
Huang, Francis L.; Moon, Tonya R.
2013-01-01
The purpose of this Methodological Brief is to present a brief primer on logistic regression, a commonly used technique when modeling dichotomous outcomes. Using data from the National Education Longitudinal Study of 1988 (NELS:88), logistic regression techniques were used to investigate student-level variables in eighth grade (i.e., enrolled in a…
Power and Sample Size Calculations for Logistic Regression Tests for Differential Item Functioning
ERIC Educational Resources Information Center
Li, Zhushan
2014-01-01
Logistic regression is a popular method for detecting uniform and nonuniform differential item functioning (DIF) effects. Theoretical formulas for the power and sample size calculations are derived for likelihood ratio tests and Wald tests based on the asymptotic distribution of the maximum likelihood estimators for the logistic regression model.…
Analysis of sparse data in logistic regression in medical research: A newer approach
Devika, S; Jeyaseelan, L; Sebastian, G
2016-01-01
Background and Objective: In the analysis of a dichotomous response variable, logistic regression is usually used. However, the performance of logistic regression in the presence of sparse data is questionable. In such a situation, a common problem is the presence of high odds ratios (ORs) with very wide 95% confidence intervals (CIs) (OR: >999.999, 95% CI: <0.001, >999.999). In this paper, we addressed this issue by using the penalized logistic regression (PLR) method. Materials and Methods: Data from a case-control study on hyponatremia and hiccups conducted at Christian Medical College, Vellore, Tamil Nadu, India were used. The outcome variable was the presence/absence of hiccups and the main exposure variable was hyponatremia status. Simulated datasets were created with different sample sizes and different numbers of covariates. Results: A total of 23 cases and 50 controls were used for the analysis by the ordinary and PLR methods. The main exposure variable, hyponatremia, was present in nine (39.13%) of the cases and in four (8.0%) of the controls. Of the 23 hiccup cases, all were males, and among the controls, 46 (92.0%) were males. Thus, the complete separation between gender and the disease group led to an infinite OR with 95% CI (OR: >999.999, 95% CI: <0.001, >999.999), whereas PLR gave a finite and consistent regression coefficient for gender (OR: 5.35; 95% CI: 0.42, 816.48). After adjusting for all the confounding variables, hyponatremia was associated with a 7.9-fold (95% CI: 2.06, 38.86) higher risk of developing hiccups using PLR, whereas the conventional method overestimated the risk (OR: 10.76; 95% CI: 2.17, 53.41). The simulation experiment shows that the estimated coverage probability of this method is near the nominal level of 95% even for small sample sizes and for a large number of covariates. Conclusions: PLR is almost equal to ordinary logistic regression when the sample size is large and is superior in
ERIC Educational Resources Information Center
Koon, Sharon; Petscher, Yaacov
2015-01-01
The purpose of this report was to explicate the use of logistic regression and classification and regression tree (CART) analysis in the development of early warning systems. It was motivated by state education leaders' interest in maintaining high classification accuracy while simultaneously improving practitioner understanding of the rules…
Asymptotically Unbiased Estimation of Exposure Odds Ratios in Complete Records Logistic Regression
Bartlett, Jonathan W.; Harel, Ofer; Carpenter, James R.
2015-01-01
Missing data are a commonly occurring threat to the validity and efficiency of epidemiologic studies. Perhaps the most common approach to handling missing data is to simply drop those records with 1 or more missing values, in so-called “complete records” or “complete case” analysis. In this paper, we bring together earlier-derived yet perhaps now somewhat neglected results which show that a logistic regression complete records analysis can provide asymptotically unbiased estimates of the association of an exposure of interest with an outcome, adjusted for a number of confounders, under a surprisingly wide range of missing-data assumptions. We give detailed guidance describing how the observed data can be used to judge the plausibility of these assumptions. The results mean that in large epidemiologic studies which are affected by missing data and analyzed by logistic regression, exposure associations may be estimated without bias in a number of settings where researchers might otherwise assume that bias would occur. PMID:26429998
Sub-resolution assist feature (SRAF) printing prediction using logistic regression
NASA Astrophysics Data System (ADS)
Tan, Chin Boon; Koh, Kar Kit; Zhang, Dongqing; Foong, Yee Mei
2015-03-01
In optical proximity correction (OPC), the sub-resolution assist feature (SRAF) has been used to enhance the process window of main structures. However, the printing of SRAF on wafer is undesirable, as this may adversely degrade the overall process yield if it is transferred into the final pattern. A reasonably accurate prediction model is needed during OPC to ensure that the SRAF placement and size pose no risk of SRAF printing. Current common practice in OPC is to use either the main OPC model or a model threshold adjustment (MTA) solution to predict SRAF printing. This paper studies the feasibility of SRAF printing prediction using logistic regression (LR). Logistic regression is a probabilistic classification model that gives discrete binary outputs after receiving sufficient input variables from SRAF printing conditions. In the application of SRAF printing prediction, the binary outputs can be treated as 1 for SRAF-Printing and 0 for No-SRAF-Printing. The experimental work was performed using a 20nm line/space process layer. The results demonstrate that the accuracy of SRAF printing prediction using the LR approach outperforms the MTA solution. Overall error rates as low as 2% (calibration) and 5% (verification) were achieved by the LR approach, compared with 6% (calibration) and 15% (verification) for the MTA solution. In addition, the performance of the LR approach was found to be relatively independent and consistent across different resist image planes compared to the MTA solution.
Modeling data for pancreatitis in presence of a duodenal diverticula using logistic regression
NASA Astrophysics Data System (ADS)
Dineva, S.; Prodanova, K.; Mlachkova, D.
2013-12-01
The presence of a periampullary duodenal diverticulum (PDD) is often observed during upper digestive tract barium meal studies and endoscopic retrograde cholangiopancreatography (ERCP). A few papers reported that the diverticulum was associated with the incidence of pancreatitis. The aim of this study is to investigate whether the presence of a duodenal diverticulum predisposes to the development of pancreatic disease. A total of 3966 patients who had undergone ERCP were studied retrospectively. They were divided into two groups: with and without PDD. Patients with a duodenal diverticulum had a higher rate of acute pancreatitis. A duodenal diverticulum is a risk factor for acute idiopathic pancreatitis. Multiple logistic regression was performed to obtain adjusted odds estimates and to identify whether a PDD is a predictor of acute or chronic pancreatitis. The software package STATISTICA 10.0 was used to analyze the data.
Dreiseitl, Stephan; Harbauer, Alexandra; Binder, Michael; Kittler, Harald
2005-10-01
Logistic regression models are widely used in medicine, but difficult to apply without the aid of electronic devices. In this paper, we present a novel approach to represent logistic regression models as nomograms that can be evaluated by simple line drawings. As a case study, we show how data obtained from a questionnaire-based patient self-assessment study on the risks of developing melanoma can be used to first identify a subset of significant covariates, build a logistic regression model, and finally transform the model to a graphical format. The advantage of the nomogram is that it can easily be mass-produced, distributed and evaluated, while providing the same information as the logistic regression model it represents. PMID:16198997
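The nomogram idea, each covariate contributing a graphical length proportional to its coefficient times its value, with the total mapped through the logistic function, can be sketched as a paper point chart. The covariates and coefficients below are invented for illustration, not the melanoma study's fitted values:

```python
import math

# Illustrative melanoma-risk-style model (made-up coefficients).
coef = {"age_over_50": 0.7, "fair_skin": 0.9, "many_nevi": 1.1}
intercept = -3.0

def points(answers, scale=10):
    """Integer point score per covariate, as printed on a nomogram axis."""
    return {k: round(coef[k] * answers[k] * scale) for k in coef}

def risk(answers):
    """Probability from the full logistic model, for comparison."""
    z = intercept + sum(coef[k] * answers[k] for k in coef)
    return 1.0 / (1.0 + math.exp(-z))

answers = {"age_over_50": 1, "fair_skin": 1, "many_nevi": 0}
pts = points(answers)   # {'age_over_50': 7, 'fair_skin': 9, 'many_nevi': 0}
p = risk(answers)       # the probability the printed chart encodes
```

Summing the points and reading the total off a single probability axis reproduces the model's prediction without any electronic device, which is the nomogram's advantage.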
A solution to the problem of separation in logistic regression.
Heinze, Georg; Schemper, Michael
2002-08-30
The phenomenon of separation or monotone likelihood is observed in the fitting process of a logistic model if the likelihood converges while at least one parameter estimate diverges to +/- infinity. Separation primarily occurs in small samples with several unbalanced and highly predictive risk factors. A procedure by Firth originally developed to reduce the bias of maximum likelihood estimates is shown to provide an ideal solution to separation. It produces finite parameter estimates by means of penalized maximum likelihood estimation. Corresponding Wald tests and confidence intervals are available but it is shown that penalized likelihood ratio tests and profile penalized likelihood confidence intervals are often preferable. The clear advantage of the procedure over previous options of analysis is impressively demonstrated by the statistical analysis of two cancer studies. PMID:12210625
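Separation can be shown numerically on a tiny invented dataset: the unpenalized maximum likelihood slope drifts toward infinity, while a penalized estimate stays finite. The sketch below uses a simple L2 (ridge) penalty as an illustrative stand-in for Firth's Jeffreys-prior penalty, which is more involved to implement:

```python
import math

# Completely separated toy data: y = 1 exactly when x > 0.
xs = [-2.0, -1.0, 1.0, 2.0]
ys = [0, 0, 1, 1]

def fit(l2=0.0, iters=20000, lr=0.1):
    """Gradient ascent on the (optionally L2-penalized) log-likelihood."""
    b0 = b1 = 0.0
    for _ in range(iters):
        g0 = g1 = 0.0
        for x, y in zip(xs, ys):
            p = 1.0 / (1.0 + math.exp(-(b0 + b1 * x)))
            g0 += y - p
            g1 += (y - p) * x
        b0 += lr * g0
        b1 += lr * (g1 - l2 * b1)   # the penalty shrinks the slope
    return b0, b1

_, slope_mle = fit(l2=0.0)   # monotone likelihood: keeps growing with iterations
_, slope_pen = fit(l2=0.5)   # finite penalized estimate
```

With more iterations the unpenalized slope grows without bound while the likelihood converges, which is exactly the monotone-likelihood phenomenon the abstract describes.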
NASA Astrophysics Data System (ADS)
Kamaruddin, Ainur Amira; Ali, Zalila; Noor, Norlida Mohd.; Baharum, Adam; Ahmad, Wan Muhamad Amir W.
2014-07-01
Logistic regression analysis examines the influence of various factors on a dichotomous outcome by estimating the probability of the event's occurrence. Logistic regression, also called a logit model, is a statistical procedure used to model dichotomous outcomes. In the logit model the log odds of the dichotomous outcome is modeled as a linear combination of the predictor variables. The log odds ratio in logistic regression provides a description of the probabilistic relationship between the variables and the outcome. In conducting logistic regression, selection procedures are used to choose important predictor variables; diagnostics are used to check that assumptions are valid, including independence of errors, linearity in the logit for continuous variables, absence of multicollinearity, and lack of strongly influential outliers; and a test statistic is calculated to determine the aptness of the model. This study used the binary logistic regression model to investigate overweight and obesity among rural secondary school students on the basis of their demographic profile, medical history, diet and lifestyle. The results indicate that overweight and obesity of students are influenced by obesity in the family and the interaction between a student's ethnicity and routine meal intake. The odds of a student being overweight or obese are higher for a student with a family history of obesity and for a non-Malay student who frequently takes routine meals, as compared to a Malay student.
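The odds-ratio interpretation of the logit model can be checked numerically: because the log odds are linear in the predictors, a one-unit increase in a predictor multiplies the odds by exp(beta), regardless of the intercept. The coefficients below are illustrative, not the study's fitted values:

```python
import math

# Illustrative coefficients:
# log-odds(overweight) = b0 + b1 * family_history  (1 = yes, 0 = no)
b0, b1 = -1.2, 0.9

def prob(x):
    """P(outcome = 1 | x) under the logit model."""
    return 1.0 / (1.0 + math.exp(-(b0 + b1 * x)))

def odds(p):
    return p / (1.0 - p)

# Odds ratio computed directly from the two probabilities:
or_direct = odds(prob(1)) / odds(prob(0))
# ... equals exp(b1), the usual reported odds ratio.
```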
Yang, Lixue; Chen, Kean
2015-11-01
To improve the design of underwater target recognition systems based on auditory perception, this study compared human listeners with automatic classifiers. Performance measures and strategies were compared in three discrimination experiments: between man-made and natural targets, between ships and submarines, and among three types of ships. In the experiments, the subjects were asked to assign a score to each sound based on how confident they were about the category to which it belonged, and logistic regression, representing linear discriminative models, completed three similar tasks by utilizing many auditory features. The results indicated that the performance of logistic regression improved as the ratio between inter- and intra-class differences became larger, whereas the performance of the human subjects was limited by their unfamiliarity with the targets. Logistic regression performed better than the human subjects in all tasks except the discrimination between man-made and natural targets, and the strategies employed by excellent human subjects were similar to that of logistic regression. Logistic regression and several human subjects demonstrated similar performance when discriminating man-made and natural targets, but in this case their strategies were not similar. An appropriate fusion of their strategies led to further improvement in recognition accuracy. PMID:26627787
Solving Logistic Regression with Group Cardinality Constraints for Time Series Analysis
Zhang, Yong; Pohl, Kilian M.
2016-01-01
We propose an algorithm to distinguish 3D+t images of healthy from diseased subjects by solving logistic regression based on cardinality-constrained group sparsity. This method reduces the risk of overfitting by providing an elegant solution to identifying the anatomical regions most impacted by disease. It also ensures consistent identification across the time series by grouping each image feature across time and counting the number of non-zero groupings. While popular in medical imaging, group cardinality constrained problems are generally solved by relaxing counting with summing over the groupings. We instead solve the original problem by generalizing a penalty decomposition algorithm, which alternates between minimizing a logistic regression function with a regularizer based on the Frobenius norm and enforcing sparsity. Applied to 86 cine MRIs of healthy cases and subjects with Tetralogy of Fallot (TOF), our method correctly identifies regions impacted by TOF and obtains a statistically significantly higher classification accuracy than logistic regression both without sparsity constraints and with the relaxed group sparsity constraint.
Estimation of adjusted rate differences using additive negative binomial regression.
Donoghoe, Mark W; Marschner, Ian C
2016-08-15
Rate differences are an important effect measure in biostatistics and provide an alternative perspective to rate ratios. When the data are event counts observed during an exposure period, adjusted rate differences may be estimated using an identity-link Poisson generalised linear model, also known as additive Poisson regression. A problem with this approach is that the assumption of equality of mean and variance rarely holds in real data, which often show overdispersion. An additive negative binomial model is the natural alternative to account for this; however, standard model-fitting methods are often unable to cope with the constrained parameter space arising from the non-negativity restrictions of the additive model. In this paper, we propose a novel solution to this problem using a variant of the expectation-conditional maximisation-either algorithm. Our method provides a reliable way to fit an additive negative binomial regression model and also permits flexible generalisations using semi-parametric regression functions. We illustrate the method using a placebo-controlled clinical trial of fenofibrate treatment in patients with type II diabetes, where the outcome is the number of laser therapy courses administered to treat diabetic retinopathy. An R package is available that implements the proposed method. Copyright © 2016 John Wiley & Sons, Ltd. PMID:27073156
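For a single two-group comparison with no covariates, the additive (identity-link) model above reduces to a closed form. A minimal sketch with invented counts, noting in the comments why the Poisson variance motivates the negative binomial extension:

```python
import math

# Hypothetical event counts and person-time for two groups.
events = {"control": 40, "treated": 24}
ptime  = {"control": 100.0, "treated": 100.0}

def rate_diff(e1, t1, e0, t0):
    """Unadjusted rate difference with a Poisson-based 95% CI.

    Under overdispersion (variance > mean), this Poisson standard error
    understates the true uncertainty; that is the motivation for the
    additive negative binomial model proposed in the paper.
    """
    rd = e1 / t1 - e0 / t0
    se = math.sqrt(e1 / t1**2 + e0 / t0**2)
    return rd, (rd - 1.96 * se, rd + 1.96 * se)

rd, ci = rate_diff(events["treated"], ptime["treated"],
                   events["control"], ptime["control"])
# rd = 0.24 - 0.40 = -0.16 events per unit person-time
```

With covariates, the non-negativity constraints on the fitted rates are what make standard fitting break down, hence the ECME-type algorithm of the paper.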
Use and interpretation of logistic regression in habitat-selection studies
Keating, K.A.; Cherry, S.
2004-01-01
Logistic regression is an important tool for wildlife habitat-selection studies, but the method frequently has been misapplied due to an inadequate understanding of the logistic model, its interpretation, and the influence of sampling design. To promote better use of this method, we review its application and interpretation under 3 sampling designs: random, case-control, and use-availability. Logistic regression is appropriate for habitat use-nonuse studies employing random sampling and can be used to directly model the conditional probability of use in such cases. Logistic regression also is appropriate for studies employing case-control sampling designs, but careful attention is required to interpret results correctly. Unless bias can be estimated or probability of use is small for all habitats, results of case-control studies should be interpreted as odds ratios, rather than probability of use or relative probability of use. When data are gathered under a use-availability design, logistic regression can be used to estimate approximate odds ratios if probability of use is small, at least on average. More generally, however, logistic regression is inappropriate for modeling habitat selection in use-availability studies. In particular, using logistic regression to fit the exponential model of Manly et al. (2002:100) does not guarantee maximum-likelihood estimates, valid probabilities, or valid likelihoods. We show that the resource selection function (RSF) commonly used for the exponential model is proportional to a logistic discriminant function. Thus, it may be used to rank habitats with respect to probability of use and to identify important habitat characteristics or their surrogates, but it is not guaranteed to be proportional to probability of use. Other problems associated with the exponential model also are discussed. We describe an alternative model based on Lancaster and Imbens (1996) that offers a method for estimating conditional probability of use in
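The point that case-control (and, approximately, use-availability) results should be read as odds ratios can be illustrated with a 2x2 table: subsampling the "unused" records at any fraction leaves the odds ratio unchanged while inflating the apparent probability of use. The counts below are hypothetical:

```python
# Hypothetical population 2x2 table:
# rows = habitat attribute present/absent, columns = used/unused.
a, b = 30, 970     # attribute present: used, unused
c, d = 10, 990     # attribute absent:  used, unused

def odds_ratio(a, b, c, d):
    return (a * d) / (b * c)

or_full = odds_ratio(a, b, c, d)

# Case-control-style design: keep all "used" records, sample 10% of "unused".
f = 0.10
or_cc = odds_ratio(a, b * f, c, d * f)   # unchanged: f cancels

# The apparent probability of use, by contrast, is inflated by the design.
p_full = a / (a + b)
p_cc = a / (a + b * f)
```

This invariance is why the odds ratio, not the probability of use, is the interpretable quantity under these sampling designs.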
Comparison of Two Approaches for Handling Missing Covariates in Logistic Regression
ERIC Educational Resources Information Center
Peng, Chao-Ying Joanne; Zhu, Jin
2008-01-01
For the past 25 years, methodological advances have been made in missing data treatment. Most published work has focused on missing data in dependent variables under various conditions. The present study seeks to fill the void by comparing two approaches for handling missing data in categorical covariates in logistic regression: the…
Multiple Logistic Regression Analysis of Cigarette Use among High School Students
ERIC Educational Resources Information Center
Adwere-Boamah, Joseph
2011-01-01
A binary logistic regression analysis was performed to predict high school students' cigarette smoking behavior from selected predictors from 2009 CDC Youth Risk Behavior Surveillance Survey. The specific target student behavior of interest was frequent cigarette use. Five predictor variables included in the model were: a) race, b) frequency of…
Optimizing the Classification Performance of Logistic Regression and Fisher's Discriminant Analyses.
ERIC Educational Resources Information Center
Yarnold, Paul R.; And Others
1994-01-01
A methodology is proposed to optimize the training classification performance of any suboptimal model. The method, referred to as univariate optimal discriminant analysis (UniODA), is illustrated through application to a two-group logistic regression analysis with 12 empirical examples. Maximizing percentage accuracy in classification is…
Heteroscedastic Extended Logistic Regression for Post-Processing of Ensemble Guidance
NASA Astrophysics Data System (ADS)
Messner, Jakob W.; Mayr, Georg J.; Wilks, Daniel S.; Zeileis, Achim
2014-05-01
To achieve well-calibrated probabilistic weather forecasts, numerical ensemble forecasts are often statistically post-processed. One recent ensemble-calibration method is extended logistic regression, which extends the popular logistic regression to yield full probability distribution forecasts. Although the purpose of this method is to post-process ensemble forecasts, usually only the ensemble mean is used as a predictor variable, whereas the ensemble spread is neglected because it does not improve the forecasts. In this study we show that when simply used as an ordinary predictor variable in extended logistic regression, the ensemble spread only affects the location but not the variance of the predictive distribution. Uncertainty information contained in the ensemble spread is therefore not utilized appropriately. To address this drawback we propose a new approach where the ensemble spread is directly used to predict the dispersion of the predictive distribution. With wind speed data and ensemble forecasts from the European Centre for Medium-Range Weather Forecasts (ECMWF) we show that with this approach, the ensemble spread can be used effectively to improve forecasts from extended logistic regression.
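Extended logistic regression models the cumulative probability P(y <= q) as a logistic function of a monotone transformation g(q) of the threshold plus a predictor term, so a single fit yields a full, non-crossing predictive distribution. A sketch with illustrative parameter values (not fitted ECMWF values), using g(q) = sqrt(q), a common choice for wind speed and precipitation:

```python
import math

# Illustrative parameters; alpha > 0 with a monotone g keeps the
# cumulative probabilities ordered in the threshold q.
alpha, beta0, beta1 = 1.5, -2.0, 0.8

def cdf(q, ens_mean):
    """P(wind speed <= q | ensemble mean) under extended logistic regression."""
    z = alpha * math.sqrt(q) + beta0 + beta1 * ens_mean
    return 1.0 / (1.0 + math.exp(-z))

thresholds = [1.0, 4.0, 9.0, 16.0]
probs = [cdf(q, ens_mean=2.0) for q in thresholds]
# probs is automatically non-decreasing in q: no crossing between thresholds.
```

The heteroscedastic extension of the paper additionally lets the ensemble spread scale the dispersion (effectively dividing z by a spread-dependent factor) rather than entering as an ordinary location predictor.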
ERIC Educational Resources Information Center
Schumacher, Phyllis; Olinsky, Alan; Quinn, John; Smith, Richard
2010-01-01
The authors extended previous research by 2 of the authors who conducted a study designed to predict the successful completion of students enrolled in an actuarial program. They used logistic regression to determine the probability of an actuarial student graduating in the major or dropping out. They compared the results of this study with those…
The Use and Interpretation of Logistic Regression in Higher Education Journals: 1988-1999.
ERIC Educational Resources Information Center
Peng, Chao-Ying Joanne; So, Tak-Shing Harry; Stage, Frances K.; St. John, Edward P.
2002-01-01
Examined use and interpretation of logistic regression in leading higher education journals. Found increasingly sophisticated use for a wide range of topics; however, there continued to be confusion over terminology. Sample sizes did not always achieve a desired level of stability in parameters estimated, and discussion of results in terms of…
ERIC Educational Resources Information Center
Fidalgo, Angel M.; Alavi, Seyed Mohammad; Amirian, Seyed Mohammad Reza
2014-01-01
This study examines three controversial aspects in differential item functioning (DIF) detection by logistic regression (LR) models: first, the relative effectiveness of different analytical strategies for detecting DIF; second, the suitability of the Wald statistic for determining the statistical significance of the parameters of interest; and…
Predictive Discriminant Analysis Versus Logistic Regression in Two-Group Classification Problems.
ERIC Educational Resources Information Center
Meshbane, Alice; Morris, John D.
A method for comparing the cross-validated classification accuracies of predictive discriminant analysis and logistic regression classification models is presented under varying data conditions for the two-group classification problem. With this method, separate-group, as well as total-sample proportions of the correct classifications, can be…
ERIC Educational Resources Information Center
French, Brian F.; Maller, Susan J.
2007-01-01
Two unresolved implementation issues with logistic regression (LR) for differential item functioning (DIF) detection include ability purification and effect size use. Purification is suggested to control inaccuracies in DIF detection as a result of DIF items in the ability estimate. Additionally, effect size use may be beneficial in controlling…
Risk Factors of Falls in Community-Dwelling Older Adults: Logistic Regression Tree Analysis
ERIC Educational Resources Information Center
Yamashita, Takashi; Noe, Douglas A.; Bailer, A. John
2012-01-01
Purpose of the Study: A novel logistic regression tree-based method was applied to identify fall risk factors and possible interaction effects of those risk factors. Design and Methods: A nationally representative sample of American older adults aged 65 years and older (N = 9,592) in the Health and Retirement Study 2004 and 2006 modules was used.…
Propensity Score Estimation with Data Mining Techniques: Alternatives to Logistic Regression
ERIC Educational Resources Information Center
Keller, Bryan S. B.; Kim, Jee-Seon; Steiner, Peter M.
2013-01-01
Propensity score analysis (PSA) is a methodological technique which may correct for selection bias in a quasi-experiment by modeling the selection process using observed covariates. Because logistic regression is well understood by researchers in a variety of fields and easy to implement in a number of popular software packages, it has…
Comparison of IRT Likelihood Ratio Test and Logistic Regression DIF Detection Procedures
ERIC Educational Resources Information Center
Atar, Burcu; Kamata, Akihito
2011-01-01
The Type I error rates and the power of IRT likelihood ratio test and cumulative logit ordinal logistic regression procedures in detecting differential item functioning (DIF) for polytomously scored items were investigated in this Monte Carlo simulation study. For this purpose, 54 simulation conditions (combinations of 3 sample sizes, 2 sample…
A Note on Three Statistical Tests in the Logistic Regression DIF Procedure
ERIC Educational Resources Information Center
Paek, Insu
2012-01-01
Although logistic regression became one of the well-known methods in detecting differential item functioning (DIF), its three statistical tests, the Wald, likelihood ratio (LR), and score tests, which are readily available under the maximum likelihood, do not seem to be consistently distinguished in DIF literature. This paper provides a clarifying…
Detecting DIF in Polytomous Items Using MACS, IRT and Ordinal Logistic Regression
ERIC Educational Resources Information Center
Elosua, Paula; Wells, Craig
2013-01-01
The purpose of the present study was to compare the Type I error rate and power of two model-based procedures, the mean and covariance structure model (MACS) and the item response theory (IRT), and an observed-score based procedure, ordinal logistic regression, for detecting differential item functioning (DIF) in polytomous items. A simulation…
NASA Astrophysics Data System (ADS)
Das, Iswar; Stein, Alfred; Kerle, Norman; Dadhwal, Vinay K.
2012-12-01
Landslide susceptibility mapping (LSM) along road corridors in the Indian Himalayas is an essential exercise that helps planners and decision makers determine the severity of probable slope failure areas. Logistic regression is commonly applied for this purpose, as it is a robust and straightforward technique that is relatively easy to handle. Ordinary logistic regression as a data-driven technique, however, does not allow inclusion of prior information. This study presents Bayesian logistic regression (BLR) for landslide susceptibility assessment along road corridors. The methodology is tested in a landslide-prone area in the Bhagirathi river valley in the Indian Himalayas. Parameter estimates from BLR are compared with those obtained from ordinary logistic regression. By means of iterative Markov chain Monte Carlo simulation, BLR provides a rich set of results on parameter estimation. We assessed model performance by receiver operating characteristic (ROC) curve analysis, and validated the model using the 50% of the landslide cells kept apart for testing and validation. The study concludes that BLR performs better in posterior parameter estimation in general and in uncertainty estimation in particular.
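A minimal random-walk Metropolis sketch of Bayesian logistic regression for a single slope with a normal prior; the toy data are invented, not the landslide dataset, and the sampler is a bare-bones stand-in for the MCMC machinery the paper uses:

```python
import math, random

# Toy data: the outcome tends toward 1 for larger x.
xs = [-2.0, -1.0, -0.5, 0.5, 1.0, 2.0]
ys = [0, 0, 1, 0, 1, 1]

def log_post(b):
    """Log posterior: Bernoulli log-likelihood + N(0, 10^2) prior on the slope."""
    lp = -b * b / (2 * 10.0**2)
    for x, y in zip(xs, ys):
        z = b * x
        # log sigmoid(z) if y == 1, else log(1 - sigmoid(z)), computed stably.
        lp += -math.log1p(math.exp(-z)) if y == 1 else -math.log1p(math.exp(z))
    return lp

def metropolis(n=20000, step=1.0, seed=0):
    rng = random.Random(seed)
    b, lp = 0.0, log_post(0.0)
    samples = []
    for _ in range(n):
        prop = b + rng.gauss(0.0, step)
        lp_prop = log_post(prop)
        if math.log(rng.random()) < lp_prop - lp:   # accept/reject
            b, lp = prop, lp_prop
        samples.append(b)
    return samples[n // 2:]   # discard the first half as burn-in

draws = metropolis()
post_mean = sum(draws) / len(draws)   # posterior mean of the slope
```

Unlike a point estimate from ordinary logistic regression, the retained draws give the full posterior, so credible intervals and other uncertainty summaries come for free.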
ERIC Educational Resources Information Center
Anderson, Carolyn J.; Verkuilen, Jay; Peyton, Buddy L.
2010-01-01
Survey items with multiple response categories and multiple-choice test questions are ubiquitous in psychological and educational research. We illustrate the use of log-multiplicative association (LMA) models that are extensions of the well-known multinomial logistic regression model for multiple dependent outcome variables to reanalyze a set of…
ERIC Educational Resources Information Center
Monahan, Patrick O.; McHorney, Colleen A.; Stump, Timothy E.; Perkins, Anthony J.
2007-01-01
Previous methodological and applied studies that used binary logistic regression (LR) for detection of differential item functioning (DIF) in dichotomously scored items either did not report an effect size or did not employ several useful measures of DIF magnitude derived from the LR model. Equations are provided for these effect size indices.…
ERIC Educational Resources Information Center
Magis, David; Raiche, Gilles; Beland, Sebastien; Gerard, Paul
2011-01-01
We present an extension of the logistic regression procedure to identify dichotomous differential item functioning (DIF) in the presence of more than two groups of respondents. Starting from the usual framework of a single focal group, we propose a general approach to estimate the item response functions in each group and to test for the presence…
ERIC Educational Resources Information Center
Ozechowski, Timothy J.; Turner, Charles W.; Hops, Hyman
2007-01-01
This article demonstrates the use of mixed-effects logistic regression (MLR) for conducting sequential analyses of binary observational data. MLR is a special case of the mixed-effects logit modeling framework, which may be applied to multicategorical observational data. The MLR approach is motivated in part by G. A. Dagne, G. W. Howe, C. H.…
ERIC Educational Resources Information Center
Courtney, Jon R.; Prophet, Retta
2011-01-01
Placement instability is often associated with a number of negative outcomes for children. To gain state level contextual knowledge of factors associated with placement stability/instability, logistic regression was applied to selected variables from the New Mexico Adoption and Foster Care Administrative Reporting System dataset. Predictors…
The purpose of this report is to provide a reference manual that could be used by investigators for making informed use of logistic regression using two methods (standard logistic regression and MARS). The details for analyses of relationships between a dependent binary response ...
Using logistic regression to describe the length of breastfeeding: a study in Guadalajara, Mexico.
Gonzalez-Perez, G J; Vega-Lopez, M G; Cabrera-Pivaral, C
1998-12-01
This study seeks, through a logistic regression model, to describe the pattern of breastfeeding duration in Guadalajara, Mexico, during 1993. A multistage random sample of children under 1 year of age (n = 1036) was studied; observational data regarding breastfeeding duration, obtained through a "status quo" procedure, were compared with prevalence rates obtained from the logistic regression model. Modeling the duration of breastfeeding during the first year of life rather than only analyzing observational data helps researchers to understand this process in a dynamic and quantitative way. For example, uncommon indicators of breastfeeding were derived from the model. These indicators are impossible to obtain from observational data. The prevalence curve estimated through the logistic model was adequately fitted to observed data: there were no significant differences between the number or distribution of breastfed infants observed and those predicted by the model. Moreover, the model revealed that less than 40% of the children were breastfed in the fourth month of life; the median age for weaning was 39.3 days; 55% of the potential breastfeeding in the first 4 months did not occur; and the greatest abandonment of breastfeeding in the first 4 months was observed in the first 60 days. Thus, logistic regression seems a suitable option to construct a population-based model that describes breastfeeding duration during the first year of life. The indicators derived from the model offer health care providers valuable information for developing programs that promote breastfeeding. PMID:10205448
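The indicators described above follow from inverting the fitted logistic prevalence curve p(t) = 1/(1 + exp(-(b0 + b1*t))); for instance, the median duration is the age at which p(t) crosses 0.5, i.e. t = -b0/b1. The coefficients below are illustrative, not the fitted Guadalajara values:

```python
import math

# Illustrative coefficients for prevalence of breastfeeding at age t (days);
# b1 < 0 so that prevalence declines with age.
b0, b1 = 1.5, -0.038

def prevalence(t):
    """Model-based proportion still breastfed at age t."""
    return 1.0 / (1.0 + math.exp(-(b0 + b1 * t)))

# Median duration: the age at which the prevalence curve crosses 0.5.
median_age = -b0 / b1
```

Other indicators in the paper, such as the share of potential breastfeeding not realized over an interval, are likewise integrals or level crossings of this same curve.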
Chen, Suduan; Goo, Yeong-Jia James; Shen, Zone-De
2014-01-01
As fraudulent financial statements by enterprises become an increasingly serious problem, establishing a valid model for forecasting fraudulent financial statements has become an important question for academic research and financial practice. After screening the important variables using stepwise regression, the study fits logistic regression, support vector machine, and decision tree classification models and compares them. The study adopts financial and nonfinancial variables to assist in establishing the forecasting model. The research objects are companies in which fraudulent or nonfraudulent financial statements occurred between 1998 and 2012. The findings are that financial and nonfinancial information can be used effectively to distinguish fraudulent financial statements, and that the decision tree C5.0 has the best classification accuracy at 85.71%. PMID:25302338
Aranda, Diana; Lopez, Jose V; Solo-Gabriele, Helena M; Fleisher, Jay M
2016-02-01
Recreational water quality surveillance involves comparing bacterial levels against set threshold values to determine beach closures. Bacterial levels can be predicted through models, which are traditionally based upon multiple linear regression. The objective of this study was to evaluate exceedance probabilities, as opposed to bacterial levels, as an alternate method to express beach risk. Data were incorporated into a logistic regression for the purpose of identifying environmental parameters most closely correlated with exceedance probabilities. The analysis was based on 7,422 historical sample data points from the years 2000-2010 for 15 South Florida beach sample sites. Probability analyses showed which beaches in the dataset were most susceptible to exceedances. No yearly trends were observed, nor were any relationships apparent with monthly rainfall or hurricanes. Results from logistic regression analyses found that among the environmental parameters evaluated, tide was most closely associated with exceedances, with exceedances 2.475 times more likely to occur at high tide than at low tide. The logistic regression methodology proved useful for predicting future exceedances at a beach location in terms of probability and for modeling water quality environmental parameters with dependence on a binary response. This methodology can be used by beach managers for allocating resources when sampling more than one beach. PMID:26837832
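The "2.475 times more likely" statement above is the exponential of a fitted logistic coefficient on a binary tide indicator. As a minimal sketch of that computation, not the study's actual code, the following fits a logistic regression by Newton-Raphson to synthetic exceedance data; the sample size, intercept, and data-generating process are invented for illustration.

```python
import numpy as np

def fit_logistic(X, y, n_iter=25):
    """Fit logistic regression by Newton-Raphson; X includes an intercept column."""
    beta = np.zeros(X.shape[1])
    for _ in range(n_iter):
        p = 1.0 / (1.0 + np.exp(-X @ beta))
        H = X.T @ (X * (p * (1.0 - p))[:, None])   # observed information matrix
        beta += np.linalg.solve(H, X.T @ (y - p))  # Newton step
    return beta

rng = np.random.default_rng(0)
n = 20000
tide = rng.integers(0, 2, n).astype(float)         # 1 = high tide, 0 = low tide
log_odds = -2.0 + np.log(2.475) * tide             # target odds ratio of 2.475
exceed = (rng.random(n) < 1.0 / (1.0 + np.exp(-log_odds))).astype(float)
beta = fit_logistic(np.column_stack([np.ones(n), tide]), exceed)
odds_ratio = float(np.exp(beta[1]))                # exp(coefficient) = odds ratio
print(round(odds_ratio, 2))                        # should land near 2.475
```

Exponentiating the tide coefficient turns the fitted log-odds difference directly into the "times more likely" phrasing used in the abstract.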
Wang, Shuang; Jiang, Xiaoqian; Wu, Yuan; Cui, Lijuan; Cheng, Samuel; Ohno-Machado, Lucila
2013-01-01
We developed an EXpectation Propagation LOgistic REgRession (EXPLORER) model for distributed privacy-preserving online learning. The proposed framework provides a high level of protection for sensitive information, since the information exchanged between the server and the client is the encrypted posterior distribution of coefficients. Through experimental results, EXPLORER shows the same performance (e.g., discrimination, calibration, feature selection) as the traditional frequentist logistic regression model, but provides more flexibility in model updating. That is, EXPLORER can be updated one point at a time rather than having to retrain on the entire data set when new observations are recorded. The proposed EXPLORER supports asynchronous communication, which relieves the participants from coordinating with one another and prevents service breakdown from the absence of participants or interrupted communications. PMID:23562651
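The one-point-at-a-time updating described above can be contrasted with the simplest online alternative: stochastic gradient steps on the logistic log-likelihood. This numpy sketch is not the EXPLORER algorithm (which propagates encrypted posterior distributions); it only illustrates, on synthetic data with invented coefficients, how a logistic model can absorb new observations without retraining on past data.

```python
import numpy as np

def online_update(beta, x, y, lr=0.02):
    """One stochastic gradient ascent step on the logistic log-likelihood."""
    p = 1.0 / (1.0 + np.exp(-x @ beta))
    return beta + lr * (y - p) * x

rng = np.random.default_rng(1)
beta_true = np.array([0.5, -1.0, 2.0])   # invented for illustration
beta = np.zeros(3)
for _ in range(100000):                  # observations arrive one at a time
    x = np.append(1.0, rng.normal(size=2))            # intercept + two features
    y = float(rng.random() < 1.0 / (1.0 + np.exp(-x @ beta_true)))
    beta = online_update(beta, x, y)     # no retraining over past data
print(np.round(beta, 1))                 # hovers near beta_true
```

With a constant learning rate the estimate fluctuates around the truth; a Bayesian scheme like EXPLORER instead carries forward a full posterior, which also yields uncertainty estimates.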
Menditto, Anthony A; Linhorst, Donald M; Coleman, James C; Beck, Niels C
2006-04-01
Developing policies and procedures to contend with the risks presented by elopement, aggression, and suicidal behaviors is a long-standing challenge for mental health administrators. Guidance in making such judgments can be obtained through the use of a multivariate statistical technique known as logistic regression. This procedure can be used to develop a predictive equation that is mathematically formulated to use the best combination of predictors, rather than considering just one factor at a time. This paper presents an overview of logistic regression and its utility in mental health administrative decision making. A case example of its application is presented using data on elopements from Missouri's long-term state psychiatric hospitals. Ultimately, the use of statistical prediction analyses, tempered with differential qualitative weighting of classification errors, can augment decision-making processes in a manner that provides guidance and flexibility in wrestling with the complex problem of risk assessment and decision making. PMID:16645908
Phillips, Kirk T.; Street, W. Nick
2005-01-01
The purpose of this study is to determine the best prediction of heart failure outcomes resulting from two methods: standard epidemiologic analysis with logistic regression, and knowledge discovery with supervised learning/data mining. Heart failure was chosen for this study as it exhibits higher prevalence and cost of treatment than most other hospitalized diseases. The prevalence of heart failure has exceeded 4 million cases in the U.S. Findings of this study should be useful for the design of quality improvement initiatives, as particular aspects of patient comorbidity and treatment are found to be associated with mortality. This is also a proof-of-concept study, considering the feasibility of emerging health informatics methods of data mining in conjunction with or in lieu of traditional logistic regression methods of prediction. Findings may also support the design of decision support systems and quality improvement programming for other diseases. PMID:16779367
Predicting Student Success on the Texas Chemistry STAAR Test: A Logistic Regression Analysis
ERIC Educational Resources Information Center
Johnson, William L.; Johnson, Annabel M.; Johnson, Jared
2012-01-01
Background: The context is the new Texas STAAR end-of-course testing program. Purpose: The authors developed a logistic regression model to predict who would pass or fail the new Texas chemistry STAAR end-of-course exam. Setting: Robert E. Lee High School (5A) with an enrollment of 2,700 students, Tyler, Texas. The date of the study was the 2011-2012…
Ohlmacher, G.C.; Davis, J.C.
2003-01-01
Landslides in the hilly terrain along the Kansas and Missouri rivers in northeastern Kansas have caused millions of dollars in property damage during the last decade. To address this problem, a statistical method called multiple logistic regression has been used to create a landslide-hazard map for Atchison, Kansas, and surrounding areas. Data included digitized geology, slopes, and landslides, manipulated using ArcView GIS. Logistic regression relates predictor variables to the occurrence or nonoccurrence of landslides within geographic cells and uses the relationship to produce a map showing the probability of future landslides, given local slopes and geologic units. Results indicated that slope is the most important variable for estimating landslide hazard in the study area. Geologic units consisting mostly of shale, siltstone, and sandstone were most susceptible to landslides. Soil type and slope aspect were considered but excluded from the final analysis because these variables did not significantly add to the predictive power of the logistic regression. Soil types were highly correlated with the geologic units, and no significant relationships existed between landslides and slope aspect. © 2003 Elsevier Science B.V. All rights reserved.
A comparative study on entrepreneurial attitudes modeled with logistic regression and Bayes nets.
López Puga, Jorge; García García, Juan
2012-11-01
Entrepreneurship research is receiving increasing attention in our context, as entrepreneurs are key social agents involved in economic development. We compare the success of the dichotomous logistic regression model and the simple Bayes classifier in predicting entrepreneurship, after manipulating the percentage of missing data and the level of categorization in predictors. A sample of undergraduate university students (N = 1230) completed five scales (motivation, attitude towards business creation, obstacles, deficiencies, and training needs), and we found that each of them predicted different aspects of the tendency to business creation. Additionally, our results show that the receiver operating characteristic (ROC) curve is affected by the rate of missing data in both techniques, but logistic regression seems to be more vulnerable when faced with missing data, whereas Bayes nets underperform slightly when categorization has been manipulated. Our study sheds light on the potential entrepreneur profile, and we propose Bayesian networks as an additional alternative to overcome the weaknesses of logistic regression when missing data are present in applied research. PMID:23156922
Kupper, Lawrence L.
2012-01-01
A common goal in environmental epidemiologic studies is to undertake logistic regression modeling to associate a continuous measure of exposure with binary disease status, adjusting for covariates. A frequent complication is that exposure may only be measurable indirectly, through a collection of subject-specific variables assumed associated with it. Motivated by a specific study to investigate the association between lung function and exposure to metal working fluids, we focus on a multiplicative-lognormal structural measurement error scenario and approaches to address it when external validation data are available. Conceptually, we emphasize the case in which true untransformed exposure is of interest in modeling disease status, but measurement error is additive on the log scale and thus multiplicative on the raw scale. Methodologically, we favor a pseudo-likelihood (PL) approach that exhibits fewer computational problems than direct full maximum likelihood (ML) yet maintains consistency under the assumed models without necessitating small exposure effects and/or small measurement error assumptions. Such assumptions are required by computationally convenient alternative methods like regression calibration (RC) and ML based on probit approximations. We summarize simulations demonstrating considerable potential for bias in the latter two approaches, while supporting the use of PL across a variety of scenarios. We also provide accessible strategies for obtaining adjusted standard errors to accompany RC and PL estimates. PMID:24027381
Stiglic, Gregor; Povalej Brzan, Petra; Fijacko, Nino; Wang, Fei; Delibasic, Boris; Kalousis, Alexandros; Obradovic, Zoran
2015-01-01
Different studies have demonstrated the importance of comorbidities to better understand the origin and evolution of medical complications. This study focuses on improvement of the predictive model interpretability based on simple logical features representing comorbidities. We use group lasso based feature interaction discovery followed by a post-processing step, where simple logic terms are added. In the final step, we reduce the feature set by applying lasso logistic regression to obtain a compact set of non-zero coefficients that represent a more comprehensible predictive model. The effectiveness of the proposed approach was demonstrated on a pediatric hospital discharge dataset that was used to build a readmission risk estimation model. The evaluation of the proposed method demonstrates a reduction of the initial set of features in a regression model by 72%, with a slight improvement in the Area Under the ROC Curve metric from 0.763 (95% CI: 0.755–0.771) to 0.769 (95% CI: 0.761–0.777). Additionally, our results show improvement in comprehensibility of the final predictive model using simple comorbidity based terms for logistic regression. PMID:26645087
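The 72% feature reduction reported above comes from the lasso's ability to zero out coefficients. As an illustrative sketch only (proximal gradient descent on synthetic data, not the paper's group-lasso pipeline or its comorbidity features), an L1-penalised logistic fit that retains only a handful of informative features:

```python
import numpy as np

def lasso_logistic(X, y, lam=0.05, lr=0.5, n_iter=2000):
    """L1-penalised logistic regression via proximal gradient descent.
    The intercept (column 0) is left unpenalised."""
    n, d = X.shape
    beta = np.zeros(d)
    for _ in range(n_iter):
        p = 1.0 / (1.0 + np.exp(-X @ beta))
        beta = beta - lr * (X.T @ (p - y)) / n      # gradient step on average loss
        shrunk = np.sign(beta) * np.maximum(np.abs(beta) - lr * lam, 0.0)
        beta[1:] = shrunk[1:]                       # soft-threshold the slopes
    return beta

rng = np.random.default_rng(2)
n = 500
X = np.column_stack([np.ones(n), rng.normal(size=(n, 10))])  # 10 candidate features
beta_true = np.zeros(11)
beta_true[1:4] = [1.5, -1.0, 0.8]                  # only three are truly informative
y = (rng.random(n) < 1.0 / (1.0 + np.exp(-X @ beta_true))).astype(float)
beta = lasso_logistic(X, y)
print(int(np.sum(beta[1:] != 0)))                  # number of features retained
```

Raising `lam` yields a sparser, more comprehensible model; the abstract's result is exactly this trade-off, a large feature reduction at essentially unchanged AUC.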
Olson, Scott A.; Brouillette, Michael C.
2006-01-01
A logistic regression equation was developed for estimating the probability of a stream flowing intermittently at unregulated, rural stream sites in Vermont. These determinations can be used for a wide variety of regulatory and planning efforts at the Federal, State, regional, county and town levels, including such applications as assessing fish and wildlife habitats, wetlands classifications, recreational opportunities, water-supply potential, waste-assimilation capacities, and sediment transport. The equation will be used to create a derived product for the Vermont Hydrography Dataset having the streamflow characteristic of 'intermittent' or 'perennial.' The Vermont Hydrography Dataset is Vermont's implementation of the National Hydrography Dataset and was created at a scale of 1:5,000 based on statewide digital orthophotos. The equation was developed by relating field-verified perennial or intermittent status of a stream site during normal summer low-streamflow conditions in the summer of 2005 to selected basin characteristics of naturally flowing streams in Vermont. The database used to develop the equation included 682 stream sites with drainage areas ranging from 0.05 to 5.0 square miles. When the 682 sites were observed, 126 were intermittent (had no flow at the time of the observation) and 556 were perennial (had flowing water at the time of the observation). The results of the logistic regression analysis indicate that the probability of a stream having intermittent flow in Vermont is a function of drainage area, elevation of the site, the ratio of basin relief to basin perimeter, and the areal percentage of well- and moderately well-drained soils in the basin. Using a probability cutpoint of 0.5 (probabilities below the cutpoint indicate perennial flow; probabilities above it indicate intermittent flow), the logistic regression equation correctly predicted the perennial or intermittent status of 116 test sites 85 percent of the time.
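The cutpoint logic above generalises: moving the cutpoint trades sensitivity (catching intermittent sites) against specificity (not flagging perennial ones). A small simulation with invented score distributions, borrowing only the study's class balance of 126 intermittent sites out of 682, shows the trade-off:

```python
import numpy as np

rng = np.random.default_rng(4)
n = 682
intermittent = rng.random(n) < 126 / 682         # class balance from the study
# Hypothetical fitted probabilities: higher on average for intermittent sites
scores = np.clip(rng.normal(np.where(intermittent, 0.65, 0.30), 0.15), 0.0, 1.0)

rates = []
for cut in (0.3, 0.5, 0.7):
    pred = scores > cut                          # the cutpoint rule
    sens = float(np.mean(pred[intermittent]))    # true-positive rate
    spec = float(np.mean(~pred[~intermittent]))  # true-negative rate
    rates.append((cut, sens, spec))
    print(cut, round(sens, 2), round(spec, 2))
```

Raising the cutpoint lowers sensitivity and raises specificity; the 0.5 cutpoint in the abstract is the conventional balanced choice, not the only one.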
NASA Astrophysics Data System (ADS)
Sugihara, Shigemitsu; Shinozaki, Tsuguhiro; Ohishi, Hiroyuki; Araki, Yoshinori; Furukawa, Kohei
It is difficult to lift sediment-related disaster warnings, because the risk of disaster after heavy rain is difficult to quantify. If the risk can be quantified according to the rain situation, it can serve as an indication for lifting the warning. In this study, using logistic regression analysis, we quantified the risk for a given rain situation as the probability of disaster occurrence, and we analyzed how to set criteria for lifting sediment-related disaster warning information. As a result, we improved the convenience of this method for evaluating the probability of disaster occurrence, which is useful for providing information in imminent situations.
Hu, Wei
2013-01-01
We utilised Sparse Logistic Regression (SLR) to build two sparse and interpretable predictors. The first one (SLR-65) was based on a signature consisting of the top 65 probe sets (59 genes) differentially expressed between Pathologic Complete Response (PCR) and Residual Disease (RD) cases, and the second one (SLR-Notch) was based on the genes involved in the Notch signaling-related pathways (113 genes). The two predictors produced better predictions than the predictor in a previous study. The SLR-65 predictor selected 16 informative genes and the SLR-Notch predictor selected 12 informative genes. PMID:23649738
Regularized logistic regression and multiobjective variable selection for classifying MEG data.
Santana, Roberto; Bielza, Concha; Larrañaga, Pedro
2012-09-01
This paper addresses the question of maximizing classifier accuracy for classifying task-related mental activity from magnetoencephalography (MEG) data. We propose the use of different sources of information and introduce an automatic channel selection procedure. To determine an informative set of channels, our approach combines a variety of machine learning algorithms: feature subset selection methods, classifiers based on regularized logistic regression, information fusion, and multiobjective optimization based on probabilistic modeling of the search space. The experimental results show that our proposal is able to improve classification accuracy compared to approaches whose classifiers use only one type of MEG information or for which the set of channels is fixed a priori. PMID:22854976
de Irala, J; Fernandez-Crehuet Navajas, R; Serrano del Castillo, A
1997-03-01
This study describes the behavior of eight statistical programs (BMDP, EGRET, JMP, SAS, SPSS, STATA, STATISTIX, and SYSTAT) when performing a logistic regression with a simulated data set that contains a numerical problem created by the presence of a cell value equal to zero. The programs respond in different ways to this problem. Most of them give a warning, although many simultaneously present incorrect results, among which are confidence intervals that tend toward infinity. Such results can mislead the user. Various guidelines are offered for detecting these problems in actual analyses, and users are reminded of the importance of critical interpretation of the results of statistical programs. PMID:9162592
Penalised logistic regression and dynamic prediction for discrete-time recurrent event data.
Elgmati, Entisar; Fiaccone, Rosemeire L; Henderson, R; Matthews, John N S
2015-10-01
We consider methods for the analysis of discrete-time recurrent event data when interest is mainly in prediction. The Aalen additive model provides an extremely simple and effective method for the determination of covariate effects for this type of data, especially in the presence of time-varying effects and time-varying covariates, including dynamic summaries of prior event history. The method is weakened for predictive purposes by the presence of negative estimates. The obvious alternative of a standard logistic regression analysis at each time point can have problems of stability when event frequency is low and maximum likelihood estimation is used. The Firth penalised likelihood approach is stable, but in removing bias in regression coefficients it introduces bias into predicted event probabilities. We propose an alternative modified penalised likelihood, intermediate between Firth and no penalty, as a pragmatic compromise between stability and bias. Illustration on two data sets is provided. PMID:25626559
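The instability the authors describe is easiest to see with complete separation, where unpenalised maximum likelihood sends coefficients to infinity. The sketch below uses a plain L2 (ridge) penalty as the stabiliser; this is a simpler device than Firth's correction or the authors' intermediate penalty, shown only to illustrate why some penalisation keeps estimates finite. The toy data and penalty weight are invented.

```python
import numpy as np

def ridge_logistic(X, y, lam=0.1, n_iter=50):
    """Newton-Raphson for logistic regression with L2 penalty weight lam."""
    d = X.shape[1]
    beta = np.zeros(d)
    for _ in range(n_iter):
        p = 1.0 / (1.0 + np.exp(-X @ beta))
        grad = X.T @ (y - p) - lam * beta
        hess = X.T @ (X * (p * (1.0 - p))[:, None]) + lam * np.eye(d)
        beta += np.linalg.solve(hess, grad)
    return beta

# Perfectly separated toy data: every positive x is an event, every negative is not.
X = np.array([[1.0, -2.0], [1.0, -1.0], [1.0, 1.0], [1.0, 2.0]])
y = np.array([0.0, 0.0, 1.0, 1.0])
beta = ridge_logistic(X, y)
print(np.isfinite(beta).all())   # True: the penalty keeps the fit finite
```

With `lam=0` the slope here would grow without bound across Newton iterations; any of the penalties discussed in the abstract bounds it, and the methodological question is how much bias each one trades for that stability.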
Predicting pilot-error incidents of US airline pilots using logistic regression.
McFadden, K L
1997-06-01
In a population of 70,164 airline pilots obtained from the Federal Aviation Administration, 475 males and 22 females had pilot-error incidents in the years 1986-1992. A simple chi-squared test revealed that female pilots employed by major airlines had a significantly greater likelihood of pilot-error incidents than their male colleagues. In order to control for age, experience (total flying hours), risk exposure (recent flying hours) and employer (major/non-major airline) simultaneously, the author built a model of male pilot-error incidents using logistic regression. The regression analysis indicated that youth, inexperience and non-major airline employer were independent contributors to the increased risk of pilot-error incidents. The results also provide further support to the literature that pilot performance does not differ significantly between male and female airline pilots. PMID:9414359
Ordinal logistic regression analysis on the nutritional status of children in KarangKitri village
NASA Astrophysics Data System (ADS)
Ohyver, Margaretha; Yongharto, Kimmy Octavian
2015-09-01
Ordinal logistic regression is a statistical technique that can be used to describe the relationship between an ordinal response variable and one or more independent variables. This method has been used in various fields, including the health field. In this research, ordinal logistic regression is used to describe the relationship between the nutritional status of children and age, gender, height, and family status. Nutritional status of children in this research is divided into over nutrition, well nutrition, less nutrition, and malnutrition. The purpose of this research is to describe the characteristics of children in the KarangKitri village and to determine the factors that influence their nutritional status. Three findings were obtained from this research. First, there are still children whose status is not categorized as well nutrition. Second, there are children from families of sufficient economic level whose nutritional status is nevertheless not normal. Third, the factors that affect the nutritional level of children are age, family status, and height.
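Ordinal logistic regression usually means the proportional-odds (cumulative logit) model, in which every cumulative split P(Y > k) shares one slope per predictor. A common quick check, sketched below on synthetic data (one invented "age" predictor; not the KarangKitri data), fits a separate binary logistic regression to each cumulative split and verifies that the slopes roughly agree:

```python
import numpy as np

def fit_logistic(X, y, n_iter=25):
    """Binary logistic regression by Newton-Raphson; X includes an intercept."""
    beta = np.zeros(X.shape[1])
    for _ in range(n_iter):
        p = 1.0 / (1.0 + np.exp(-X @ beta))
        H = X.T @ (X * (p * (1.0 - p))[:, None])
        beta += np.linalg.solve(H, X.T @ (y - p))
    return beta

rng = np.random.default_rng(3)
n = 4000
age = rng.normal(size=n)                     # standardised, invented predictor
# Latent-variable view of ordinal data: category rises with the latent score
latent = 1.2 * age + rng.logistic(size=n)
status = np.digitize(latent, [-1.0, 1.0])    # 0, 1, 2: three ordered levels
X = np.column_stack([np.ones(n), age])
slopes = [fit_logistic(X, (status > k).astype(float))[1] for k in (0, 1)]
print(np.round(slopes, 2))                   # both near the true slope of 1.2
```

Proportional-odds software estimates one shared slope plus ordered intercepts jointly; close agreement across the cumulative splits, as here, is what justifies that shared-slope constraint.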
Reliability estimation for cutting tools based on logistic regression model using vibration signals
NASA Astrophysics Data System (ADS)
Chen, Baojia; Chen, Xuefeng; Li, Bing; He, Zhengjia; Cao, Hongrui; Cai, Gaigai
2011-10-01
As an important part of a CNC machine, the reliability of cutting tools influences the overall manufacturing effectiveness and the stability of the equipment. The present study proposes a novel reliability estimation approach for cutting tools based on a logistic regression model using vibration signals. The operating condition information of the CNC machine is incorporated into the reliability analysis to reflect the product's time-varying characteristics. The proposed approach is superior to other degradation estimation methods in that it does not necessitate any assumption about degradation paths or the probability density functions of condition parameters. The new reliability estimation approach for cutting tools proceeds as follows. First, on-line vibration signals of cutting tools are measured during the manufacturing process. Second, the wavelet packet (WP) transform is employed to decompose the original signals, and correlation analysis is employed to find the feature frequency bands that indicate tool wear. Third, correlation analysis is also used to select the salient feature parameters, which are composed of feature band energy, energy entropy, and time-domain features. Finally, reliability estimation is carried out based on the logistic regression model. The approach has been validated on an NC lathe. Under different failure thresholds, the reliability and failure time of the cutting tools are all estimated accurately. The positive results show the plausibility and effectiveness of the proposed approach, which can facilitate machine performance and reliability estimation.
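Of the features listed, energy entropy is the least self-explanatory: it measures how evenly the signal energy is spread across frequency bands, and broadband wear-induced vibration raises it. The sketch below substitutes equal-width FFT bands for the wavelet-packet decomposition (a simplification, not the paper's method) and uses invented signals for a sharp and a worn tool:

```python
import numpy as np

def band_energies(signal, n_bands=8):
    """Sum of squared FFT magnitudes in equal-width frequency bands
    (a crude stand-in for wavelet-packet band energies)."""
    spectrum = np.abs(np.fft.rfft(signal)) ** 2
    return np.array([b.sum() for b in np.array_split(spectrum, n_bands)])

def energy_entropy(energies):
    """Shannon entropy of the normalised band-energy distribution."""
    q = energies / energies.sum()
    return float(-(q * np.log(q + 1e-12)).sum())

rng = np.random.default_rng(5)
t = np.linspace(0.0, 1.0, 2048)
tone = np.sin(2 * np.pi * 50 * t)             # dominant machining frequency
fresh = tone + 0.1 * rng.normal(size=t.size)  # sharp tool: narrow-band energy
worn = tone + 0.8 * rng.normal(size=t.size)   # worn tool: broadband noise added
print(energy_entropy(band_energies(fresh)) < energy_entropy(band_energies(worn)))
```

In the paper's pipeline, features of this kind become the covariates of the logistic regression that maps tool condition to failure probability.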
Grid Binary LOgistic REgression (GLORE): building shared models without sharing data
Jiang, Xiaoqian; Kim, Jihoon; Ohno-Machado, Lucila
2012-01-01
Objective The classification of complex or rare patterns in clinical and genomic data requires the availability of a large, labeled patient set. While methods that operate on large, centralized data sources have been extensively used, little attention has been paid to understanding whether models such as binary logistic regression (LR) can be developed in a distributed manner, allowing researchers to share models without necessarily sharing patient data. Material and methods Instead of bringing data to a central repository for computation, we bring computation to the data. The Grid Binary LOgistic REgression (GLORE) model integrates decomposable partial elements or non-privacy-sensitive prediction values to obtain model coefficients, the variance-covariance matrix, the goodness-of-fit test statistic, and the area under the receiver operating characteristic (ROC) curve. Results We conducted experiments on both simulated and clinically relevant data, and compared the computational costs of GLORE with those of a traditional LR model estimated using the combined data. We showed that our results are the same as those of LR to a precision of 10^-15. In addition, GLORE is computationally efficient. Limitation In GLORE, the calculation of coefficient gradients must be synchronized at different sites, which involves some effort to ensure the integrity of communication. Ensuring that the predictors have the same format and meaning across the data sets is necessary. Conclusion The results suggest that GLORE performs as well as LR and allows data to remain protected at their original sites. PMID:22511014
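The decomposability GLORE exploits is visible in the Newton update for logistic regression: both the gradient and the Hessian are sums over rows, so each site can contribute its partial sums without revealing patient-level data. A minimal numpy sketch with synthetic data and three invented "sites" (not the GLORE implementation, which adds encryption and asynchrony) shows why the distributed fit matches the pooled one:

```python
import numpy as np

def local_pieces(X, y, beta):
    """A site's additive contributions to the global gradient and Hessian;
    only these aggregates, never the rows of X, leave the site."""
    p = 1.0 / (1.0 + np.exp(-X @ beta))
    return X.T @ (y - p), X.T @ (X * (p * (1.0 - p))[:, None])

rng = np.random.default_rng(6)
n = 3000
beta_true = np.array([-0.5, 1.0, 0.7])        # invented coefficients
X = np.column_stack([np.ones(n), rng.normal(size=(n, 2))])
y = (rng.random(n) < 1.0 / (1.0 + np.exp(-X @ beta_true))).astype(float)
sites = np.array_split(np.arange(n), 3)       # three hospitals' row partitions

beta = np.zeros(3)
for _ in range(25):                           # server-coordinated Newton steps
    pieces = [local_pieces(X[s], y[s], beta) for s in sites]
    grad = sum(g for g, _ in pieces)          # server sums the contributions
    hess = sum(h for _, h in pieces)
    beta += np.linalg.solve(hess, grad)

beta_central = np.zeros(3)                    # the same fit with pooled data
for _ in range(25):
    grad, hess = local_pieces(X, y, beta_central)
    beta_central += np.linalg.solve(hess, grad)
print(np.allclose(beta, beta_central))        # True: the fits coincide
```

Because summation is exact up to floating-point order, the distributed estimate agrees with the pooled one to near machine precision, which is the abstract's 10^-15 claim.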
Regularization Paths for Conditional Logistic Regression: The clogitL1 Package
Reid, Stephen; Tibshirani, Rob
2014-01-01
We apply the cyclic coordinate descent algorithm of Friedman, Hastie, and Tibshirani (2010) to the fitting of a conditional logistic regression model with lasso (ℓ1) and elastic net penalties. The sequential strong rules of Tibshirani, Bien, Hastie, Friedman, Taylor, Simon, and Tibshirani (2012) are also used in the algorithm and it is shown that these offer a considerable speed up over the standard coordinate descent algorithm with warm starts. Once implemented, the algorithm is used in simulation studies to compare the variable selection and prediction performance of the conditional logistic regression model against that of its unconditional (standard) counterpart. We find that the conditional model performs admirably on datasets drawn from a suitable conditional distribution, outperforming its unconditional counterpart at variable selection. The conditional model is also fit to a small real world dataset, demonstrating how we obtain regularization paths for the parameters of the model and how we apply cross validation for this method where natural unconditional prediction rules are hard to come by. PMID:26257587
Deng, Weiping; Chen, Hanfeng; Li, Zhaohai
2006-01-01
Often in genetic research, presence or absence of a disease is affected by not only the trait locus genotypes but also some covariates. The finite logistic regression mixture models and the methods under the models are developed for detection of a binary trait locus (BTL) through an interval-mapping procedure. The maximum-likelihood estimates (MLEs) of the logistic regression parameters are asymptotically unbiased. The null asymptotic distributions of the likelihood-ratio test (LRT) statistics for detection of a BTL are found to be given by the supremum of a χ2-process. The limiting null distributions are free of the null model parameters and are determined explicitly through only four (backcross case) or nine (intercross case) independent standard normal random variables. Therefore a threshold for detecting a BTL in a flanking marker interval can be approximated easily by using a Monte Carlo method. It is pointed out that use of a threshold incorrectly determined by reading off a χ2-probability table can result in an excessive false BTL detection rate much more severely than many researchers might anticipate. Simulation results show that the BTL detection procedures based on the thresholds determined by the limiting distributions perform quite well when the sample sizes are moderately large. PMID:16272416
Predicting students' success at pre-university studies using linear and logistic regressions
NASA Astrophysics Data System (ADS)
Suliman, Noor Azizah; Abidin, Basir; Manan, Norhafizah Abdul; Razali, Ahmad Mahir
2014-09-01
This study aims to find the most suitable model for predicting students' success in the medical pre-university studies at the Centre for Foundation in Science, Languages and General Studies of Cyberjaya University College of Medical Sciences (CUCMS). The predictors under investigation were achievements in the national high school exit examination, Sijil Pelajaran Malaysia (SPM), namely the Biology, Chemistry, Physics, Additional Mathematics, Mathematics, English and Bahasa Malaysia results, as well as gender and high school background factors. The outcomes showed significant differences by gender in the final CGPA and in the Biology and Mathematics subjects at pre-university, and a significant difference by high school background in the Mathematics subject. In general, the correlation between academic achievements at high school and at medical pre-university is moderately significant at an α-level of 0.05, except for the language subjects. It was also found that logistic regression techniques gave better prediction models than the multiple linear regression technique for this data set. The developed logistic models were able to give probabilities that closely matched the real cases. Hence, they could be used to identify successful students who are qualified to enter the CUCMS medical faculty before accepting any students to its foundation program.
NASA Astrophysics Data System (ADS)
Jokar Arsanjani, Jamal; Helbich, Marco; Kainz, Wolfgang; Darvishi Boloorani, Ali
2013-04-01
This research analyses suburban expansion in the metropolitan area of Tehran, Iran. A hybrid model consisting of a logistic regression model, a Markov chain (MC), and cellular automata (CA) was designed to improve the performance of the standard logistic regression model. Environmental and socio-economic variables dealing with urban sprawl were operationalised to create a probability surface of spatiotemporal states of built-up land use for the years 2006, 2016, and 2026. For validation, the model was evaluated by means of relative operating characteristic values for different sets of variables. The approach was calibrated for 2006 by cross-comparing actual and simulated land use maps. The achieved outcomes represent a match of 89% between the simulated and actual maps of 2006, which was satisfactory to approve the calibration process. Thereafter, the calibrated hybrid approach was implemented for forthcoming years. Finally, future land use maps for 2016 and 2026 were predicted by means of this hybrid approach. The simulated maps illustrate a new wave of suburban development in the vicinity of Tehran at the western border of the metropolis during the next decades.
Hill, Benjamin David; Womble, Melissa N; Rohling, Martin L
2015-01-01
This study utilized logistic regression to determine whether performance patterns on Concussion Vital Signs (CVS) could differentiate known groups with either genuine or feigned performance. For the embedded measure development group (n = 174), clinical patients and undergraduate students categorized as feigning obtained significantly lower scores on the overall test battery mean for the CVS, Shipley-2 composite score, and California Verbal Learning Test-Second Edition subtests than did genuinely performing individuals. The final full model of 3 predictor variables (Verbal Memory immediate hits, Verbal Memory immediate correct passes, and Stroop Test complex reaction time correct) was significant and correctly classified individuals in their known group 83% of the time (sensitivity = .65; specificity = .97) in a mixed sample of young-adult clinical cases and simulators. The CVS logistic regression function was applied to a separate undergraduate college group (n = 378) that was asked to perform genuinely and identified 5% as having possibly feigned performance indicating a low false-positive rate. The failure rate was 11% and 16% at baseline cognitive testing in samples of high school and college athletes, respectively. These findings have particular relevance given the increasing use of computerized test batteries for baseline cognitive testing and return-to-play decisions after concussion. PMID:25371976
Assessing Longitudinal Change: Adjustment for Regression to the Mean Effects
ERIC Educational Resources Information Center
Rocconi, Louis M.; Ethington, Corinna A.
2009-01-01
Pascarella (J Coll Stud Dev 47:508-520, 2006) has called for an increase in use of longitudinal data with pretest-posttest design when studying effects on college students. However, such designs that use multiple measures to document change are vulnerable to an important threat to internal validity, regression to the mean. Herein, we discuss a…
Procedures for adjusting regional regression models of urban-runoff quality using local data
Hoos, Anne B.; Lizarraga, Joy S.
1996-01-01
Statistical operations termed model-adjustment procedures can be used to incorporate local data into existing regression models to improve the prediction of urban-runoff quality. Each procedure is a form of regression analysis in which the local data base is used as a calibration data set; the resulting adjusted regression models can then be used to predict storm-runoff quality at unmonitored sites. Statistical tests of the calibration data set guide selection among proposed procedures.
Coercively Adjusted Auto Regression Model for Forecasting in Epilepsy EEG
Kim, Sun-Hee; Faloutsos, Christos; Yang, Hyung-Jeong
2013-01-01
Recently, data with complex characteristics, such as epilepsy electroencephalography (EEG) time series, have emerged. Epilepsy EEG data have special characteristics, including nonlinearity, nonnormality, and nonperiodicity. Therefore, it is important to find a suitable forecasting method that accommodates these characteristics. In this paper, we propose a coercively adjusted autoregression (CA-AR) method that forecasts future values from a multivariable epilepsy EEG time series. We use the technique of random coefficients, which forcefully adjusts the coefficients to −1 and 1. The fractal dimension is used to determine the order of the CA-AR model. We applied the CA-AR method, which reflects the special characteristics of the data, to forecast future values of epilepsy EEG data. Experimental results show that, compared with previous methods, the proposed method forecasts faster and more accurately. PMID:23710252
Variable selection for logistic regression using a prediction-focused information criterion.
Claeskens, Gerda; Croux, Christophe; Van Kerckhoven, Johan
2006-12-01
In biostatistical practice, it is common to use information criteria as a guide for model selection. We propose new versions of the focused information criterion (FIC) for variable selection in logistic regression. The FIC gives, depending on the quantity to be estimated, possibly different sets of selected variables. The standard version of the FIC measures the mean squared error of the estimator of the quantity of interest in the selected model. In this article, we propose more general versions of the FIC, allowing other risk measures such as the one based on L_p error. When prediction of an event is important, as is often the case in medical applications, we construct an FIC using the error rate as a natural risk measure. The advantages of using an information criterion which depends on both the quantity of interest and the selected risk measure are illustrated by means of a simulation study and application to a study on diabetic retinopathy. PMID:17156270
Accounting for informatively missing data in logistic regression by means of reassessment sampling.
Lin, Ji; Lyles, Robert H
2015-05-20
We explore the 'reassessment' design in a logistic regression setting, where a second wave of sampling is applied to recover a portion of the missing data on a binary exposure and/or outcome variable. We construct a joint likelihood function based on the original model of interest and a model for the missing data mechanism, with emphasis on non-ignorable missingness. The estimation is carried out by numerical maximization of the joint likelihood function with close approximation of the accompanying Hessian matrix, using sharable programs that take advantage of general optimization routines in standard software. We show how likelihood ratio tests can be used for model selection and how they facilitate direct hypothesis testing for whether missingness is at random. Examples and simulations are presented to demonstrate the performance of the proposed method. PMID:25707010
Battaglin, W.A.
1996-01-01
Agricultural chemicals (herbicides, insecticides, other pesticides and fertilizers) in surface water may constitute a human health risk. Recent research on unregulated rivers in the midwestern USA documents that elevated concentrations of herbicides occur for 1-4 months following application in spring and early summer. In contrast, nitrate concentrations in unregulated rivers are elevated during the fall, winter and spring. Natural and anthropogenic variables of river drainage basins, such as soil permeability, the amount of agricultural chemicals applied or percentage of land planted in corn, affect agricultural chemical concentrations in rivers. Logistic regression (LGR) models are used to investigate relations between various drainage basin variables and the concentration of selected agricultural chemicals in rivers. The method is successful in contributing to the understanding of agricultural chemical concentration in rivers. Overall accuracies of the best LGR models, defined as the number of correct classifications divided by the number of attempted classifications, averaged about 66%.
Statistical modelling for thoracic surgery using a nomogram based on logistic regression.
Liu, Run-Zhong; Zhao, Ze-Rui; Ng, Calvin S H
2016-08-01
A well-developed clinical nomogram is a popular decision-tool, which can be used to predict the outcome of an individual, bringing benefits to both clinicians and patients. With just a few steps on a user-friendly interface, the approximate clinical outcome of patients can easily be estimated based on their clinical and laboratory characteristics. Therefore, nomograms have recently been developed to predict the different outcomes or even the survival rate at a specific time point for patients with different diseases. However, on the establishment and application of nomograms, there is still a lot of confusion that may mislead researchers. The objective of this paper is to provide a brief introduction on the history, definition, and application of nomograms and then to illustrate simple procedures to develop a nomogram with an example based on a multivariate logistic regression model in thoracic surgery. In addition, validation strategies and common pitfalls have been highlighted. PMID:27621910
Binary logistic regression modelling: Measuring the probability of relapse cases among drug addict
NASA Astrophysics Data System (ADS)
Ismail, Mohd Tahir; Alias, Siti Nor Shadila
2014-07-01
For many years, Malaysia has faced drug addiction issues. The most serious is the relapse phenomenon among treated drug addicts (addicts who have undergone the rehabilitation programme at the Narcotic Addiction Rehabilitation Centre, PUSPEN). Thus, the main objective of this study is to find the most significant factors that contribute to relapse. Binary logistic regression analysis was employed to model the relationship between the independent variables (predictors) and the dependent variable. The dependent variable is the status of the drug addict: relapse (Yes, coded as 1) or not (No, coded as 0). The predictors are age, age at first taking drugs, family history, education level, family crisis, community support and self-motivation. The sample comprises 200 records, provided by AADK (National Anti-Drug Agency). The findings of the study reveal that age and self-motivation are statistically significant predictors of relapse.
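The kind of binary logit fit this study describes can be sketched with a small maximum-likelihood fitter. The data below are synthetic stand-ins for the AADK sample, and the predictor names and effect sizes are illustrative assumptions, not the study's actual values:

```python
import numpy as np

def fit_logit(X, y, iters=25):
    """Maximum-likelihood logistic regression via Newton-Raphson (IRLS)."""
    beta = np.zeros(X.shape[1])
    for _ in range(iters):
        p = 1.0 / (1.0 + np.exp(-(X @ beta)))
        grad = X.T @ (y - p)                         # score vector
        hess = X.T @ (X * (p * (1 - p))[:, None])    # observed information
        beta += np.linalg.solve(hess, grad)
    return beta

# Synthetic stand-in for the relapse data: outcome coded 1 = relapse, 0 = not.
rng = np.random.default_rng(0)
n = 200
age = rng.uniform(18, 60, n)
motivation = rng.uniform(1, 10, n)           # e.g. a self-motivation score
X = np.column_stack([np.ones(n), age, motivation])
true_beta = np.array([-2.0, 0.05, -0.30])    # made-up effect sizes
y = rng.binomial(1, 1 / (1 + np.exp(-(X @ true_beta))))

beta_hat = fit_logit(X, y)
print(beta_hat)   # intercept, age, and self-motivation coefficient estimates
```

Significance of each coefficient would then follow from the Wald statistic, using the inverse of the information matrix as the covariance estimate.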
Matthews, J N S; Badi, N H
2015-08-30
When the difference between treatments in a clinical trial is estimated by a difference in means, then it is well known that randomization ensures unbiased estimation, even if no account is taken of important baseline covariates. However, when the treatment effect is assessed by other summaries, for example by an odds ratio if the outcome is binary, then bias can arise if some covariates are omitted, regardless of the use of randomization for treatment allocation or the size of the trial. We present accurate closed-form approximations for this asymptotic bias when important normally distributed covariates are omitted from a logistic regression. We compare this approximation with ones in the literature and derive more convenient forms for some of these existing results. The expressions give insight into the form of the bias, which simulations show is usable for distributions other than the normal. The key result applies even when there are additional binary covariates in the model. PMID:25869059
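The attenuation the abstract describes can be illustrated with a standard approximation from the literature (not necessarily the authors' own, possibly sharper, closed forms): marginalizing a logistic model over an omitted N(0, σ²) covariate shrinks the slope by roughly 1/√(1 + c²σ²), with c = 16√3/(15π). The sketch checks this against direct numerical marginalization:

```python
import numpy as np
from math import sqrt, pi

# Standard attenuation approximation for an omitted normal covariate:
# beta_marginal ≈ beta / sqrt(1 + c^2 * sigma^2), c = 16*sqrt(3)/(15*pi).
c = 16 * sqrt(3) / (15 * pi)

def marginal_beta(beta, sigma):
    return beta / sqrt(1 + c**2 * sigma**2)

def marginal_prob(x, beta, sigma, n=20001):
    # Direct numerical marginalization: E_U[expit(beta*x + U)], U ~ N(0, sigma^2).
    z = np.linspace(-8 * sigma, 8 * sigma, n)
    w = np.exp(-z**2 / (2 * sigma**2))
    w /= w.sum()
    return float(np.sum(w / (1 + np.exp(-(beta * x + z)))))

exact = marginal_prob(1.0, 1.0, 2.0)
approx = 1 / (1 + np.exp(-marginal_beta(1.0, 2.0) * 1.0))
print(exact, approx)   # the two probabilities agree closely
```

The marginal model is not exactly logistic, which is why the conditional and marginal odds ratios differ even under randomization.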
Ghouri, A F; Zamora, R L; Sessions, D G; Spitznagel, E L; Harvey, J E
1994-10-01
The ability to accurately predict the presence of subclinical metastatic neck disease in clinically N0 patients with primary epidermoid cancer of the larynx would be of great value in determining whether to perform an elective neck dissection. We describe a statistical approach to estimating the probability of occult neck disease given pretreatment clinical parameters. A retrospective study was performed involving 736 clinically N0 patients with primary laryngeal cancer who were treated surgically with primary resection and ipsilateral neck dissection. Nodal involvement was determined histologically after surgical lymphadenectomy. A logistic regression model was used to derive an equation that calculated the probability of occult neck metastasis based on pretreatment T stage, tumor location, and histologic grade. The model has a sensitivity of 74%, a specificity of 87%, and can be entered into a programmable calculator. PMID:7934602
Modeling Group Size and Scalar Stress by Logistic Regression from an Archaeological Perspective
Alberti, Gianmarco
2014-01-01
Johnson’s scalar stress theory, describing the mechanics of (and the remedies to) the increase in in-group conflict that parallels the increase in group size, provides scholars with a useful theoretical framework for understanding different aspects of the material culture of past communities (i.e., social organization, communal food consumption, ceramic style, architecture and settlement layout). Given its relevance in archaeology and anthropology, this article aims to propose a predictive model of critical levels of scalar stress on the basis of community size. Drawing upon Johnson’s theory and on Dunbar’s findings on the cognitive constraints to human group size, a model is built by means of logistic regression on the basis of data on colony fissioning among the Hutterites of North America. On the grounds of the theoretical framework sketched in the first part of the article, the absence or presence of colony fissioning is treated as an expression of non-critical vs. critical levels of scalar stress for the purpose of model building. The model, which is also tested against a sample of archaeological and ethnographic cases: a) confirms the existence of a significant relationship between critical scalar stress and group size, setting the issue on firmer statistical grounds; b) provides the intercept and slope of the logistic regression, which can be used at any time to estimate the probability that a community experienced a critical level of scalar stress; c) locates a critical scalar stress threshold at community size 127 (95% CI: 122–132), while the maximum probability of critical scalar stress is predicted at size 158 (95% CI: 147–170). The model ultimately provides grounds to assess, for the sake of any further archaeological/anthropological interpretation, the probability that a group reached a hot spot of size development critical for its internal cohesion. PMID:24626241
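The abstract does not reproduce the fitted intercept and slope, so the sketch below uses an assumed slope and chooses the intercept so the 50% crossing lands at the reported threshold of 127; the curve is illustrative, not the paper's fitted model:

```python
import numpy as np

b1 = 0.08          # assumed slope per additional community member (illustrative)
b0 = -b1 * 127     # chosen so that P = 0.5 falls exactly at the reported threshold

def p_critical(size):
    # Probability of a critical level of scalar stress at a given community size.
    return 1.0 / (1.0 + np.exp(-(b0 + b1 * size)))

print(p_critical(127))                    # 0.5 by construction
print(p_critical(100) < p_critical(158))  # probability rises with group size
```

Any actual intercept/slope pair from the paper could be dropped into `b0` and `b1` to reproduce its probability estimates.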
Screening for ketosis using multiple logistic regression based on milk yield and composition.
Kayano, Mitsunori; Kataoka, Tomoko
2015-11-01
Multiple logistic regression was applied to milk yield and composition data for 632 records of healthy cows and 61 records of ketotic cows in Hokkaido, Japan. The purpose was to diagnose ketosis based on milk yield and composition, simultaneously. The cows were divided into two groups: (1) multiparous, including 314 healthy cows and 45 ketotic cows and (2) primiparous, including 318 healthy cows and 16 ketotic cows, since nutritional status, milk yield and composition are affected by parity. Multiple logistic regression was applied to these groups separately. For multiparous cows, milk yield (kg/day/cow) and protein-to-fat (P/F) ratio in milk were significant factors (P<0.05) for the diagnosis of ketosis. For primiparous cows, lactose content (%), solid not fat (SNF) content (%) and milk urea nitrogen (MUN) content (mg/dl) were significantly associated with ketosis (P<0.01). A diagnostic rule was constructed for each group of cows: (1) 9.978 × P/F ratio + 0.085 × milk yield <10 and (2) 2.327 × SNF - 2.703 × lactose + 0.225 × MUN <10. The sensitivity, specificity and the area under the curve (AUC) of the diagnostic rules were (1) 0.800, 0.729 and 0.811; (2) 0.813, 0.730 and 0.787, respectively. The P/F ratio, which is a widely used measure of ketosis, provided the sensitivity, specificity and AUC values of (1) 0.711, 0.726 and 0.781; and (2) 0.678, 0.767 and 0.738, respectively. PMID:26074408
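The study's published screening rules are simple linear scores, easy to apply directly. The milk-record values below are made up for illustration, and the abstract's cutoffs (score < 10) are read as the ketosis-positive side:

```python
def multiparous_score(pf_ratio, milk_yield):
    # Rule (1): 9.978 x P/F ratio + 0.085 x milk yield; scores below 10 screen positive.
    return 9.978 * pf_ratio + 0.085 * milk_yield

def primiparous_score(snf, lactose, mun):
    # Rule (2): 2.327 x SNF - 2.703 x lactose + 0.225 x MUN; scores below 10 screen positive.
    return 2.327 * snf - 2.703 * lactose + 0.225 * mun

# A hypothetical multiparous cow with a low protein-to-fat ratio and modest yield:
s = multiparous_score(pf_ratio=0.70, milk_yield=25.0)
print(s, s < 10)   # score ≈ 9.11, below the cutoff, so screened positive
```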
Genomic-Enabled Prediction of Ordinal Data with Bayesian Logistic Ordinal Regression
Montesinos-López, Osval A.; Montesinos-López, Abelardo; Crossa, José; Burgueño, Juan; Eskridge, Kent
2015-01-01
Most genomic-enabled prediction models developed so far assume that the response variable is continuous and normally distributed. The exception is the probit model, developed for ordered categorical phenotypes. In statistical applications, because of the easy implementation of the Bayesian probit ordinal regression (BPOR) model, Bayesian logistic ordinal regression (BLOR) is implemented rarely in the context of genomic-enabled prediction [sample size (n) is much smaller than the number of parameters (p)]. For this reason, in this paper we propose a BLOR model using the Pólya-Gamma data augmentation approach that produces a Gibbs sampler with similar full conditional distributions of the BPOR model and with the advantage that the BPOR model is a particular case of the BLOR model. We evaluated the proposed model by using simulation and two real data sets. Results indicate that our BLOR model is a good alternative for analyzing ordinal data in the context of genomic-enabled prediction with the probit or logit link. PMID:26290569
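The Pólya-Gamma Gibbs sampler itself is beyond a short sketch, but the ordinal-logit likelihood that BLOR targets is simple: P(y ≤ k | x) = expit(θ_k − xβ) with ordered cutpoints θ_k. The cutpoints and linear predictor below are made-up illustrations:

```python
import numpy as np

def ordinal_probs(x_beta, cutpoints):
    # Category probabilities under a cumulative-logit (ordinal logistic) model:
    # P(y <= k | x) = expit(theta_k - x.beta), with ordered cutpoints theta.
    cdf = 1.0 / (1.0 + np.exp(-(np.asarray(cutpoints) - x_beta)))
    cdf = np.concatenate([[0.0], cdf, [1.0]])
    return np.diff(cdf)          # one probability per ordered category

probs = ordinal_probs(x_beta=0.5, cutpoints=[-1.0, 0.0, 1.5])
print(probs, probs.sum())        # four category probabilities summing to 1
```

Swapping the logistic CDF for the normal CDF in the same construction gives the probit (BPOR) counterpart mentioned in the abstract.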
Adjustment of regional regression equations for urban storm-runoff quality using at-site data
Barks, C.S.
1996-01-01
Regional regression equations have been developed to estimate urban storm-runoff loads and mean concentrations using a national data base. Four statistical methods using at-site data to adjust the regional equation predictions were developed to provide better local estimates. The four adjustment procedures are a single-factor adjustment, a regression of the observed data against the predicted values, a regression of the observed values against the predicted values and additional local independent variables, and a weighted combination of a local regression with the regional prediction. Data collected at five representative storm-runoff sites during 22 storms in Little Rock, Arkansas, were used to verify, and, when appropriate, adjust the regional regression equation predictions. Comparison of observed values of storm-runoff loads and mean concentrations to the predicted values from the regional regression equations for nine constituents (chemical oxygen demand, suspended solids, total nitrogen as N, total ammonia plus organic nitrogen as N, total phosphorus as P, dissolved phosphorus as P, total recoverable copper, total recoverable lead, and total recoverable zinc) showed large prediction errors ranging from 63 percent to more than several thousand percent. Prediction errors for 6 of the 18 regional regression equations were less than 100 percent and could be considered reasonable for water-quality prediction equations. The regression adjustment procedure was used to adjust five of the regional equation predictions to improve the predictive accuracy. For seven of the regional equations the observed and the predicted values are not significantly correlated. Thus neither the unadjusted regional equations nor any of the adjustments were appropriate. The mean of the observed values was used as a simple estimator when the regional equation predictions and adjusted predictions were not appropriate.
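The second adjustment procedure, regressing observed local values on the regional-equation predictions, can be sketched as follows. The data are synthetic stand-ins for the 22-storm calibration set, not the Little Rock records:

```python
import numpy as np

# Regression-adjustment sketch: regress local observed loads on the
# regional-equation predictions, then use the fitted line to correct
# new regional predictions at the same site.
rng = np.random.default_rng(1)
predicted = rng.uniform(5, 50, 22)                      # regional-equation output
observed = 0.4 * predicted + 3 + rng.normal(0, 2, 22)   # biased local "truth"

# Ordinary least squares of observed on predicted (the calibration step).
A = np.column_stack([np.ones_like(predicted), predicted])
coef, *_ = np.linalg.lstsq(A, observed, rcond=None)

def adjust(regional_prediction):
    return coef[0] + coef[1] * regional_prediction

print(coef)   # intercept and slope of the adjustment line
```

A single-factor adjustment would instead scale predictions by the mean ratio of observed to predicted values, with no intercept.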
Huang, Hai-Hui; Liu, Xiao-Ying; Liang, Yong
2016-01-01
Cancer classification and feature (gene) selection play an important role in knowledge discovery in genomic data. Although logistic regression is one of the most popular classification methods, it does not induce feature selection. In this paper, we present a new hybrid L1/2+L2 regularization (HLR) function, a linear combination of L1/2 and L2 penalties, to select the relevant genes in logistic regression. The HLR approach inherits some fascinating characteristics from the L1/2 (sparsity) and L2 (grouping effect, where highly correlated variables are in or out of a model together) penalties. We also propose a novel univariate HLR thresholding approach to update the estimated coefficients and develop a coordinate descent algorithm for the HLR-penalized logistic regression model. The empirical results and simulations indicate that the proposed method is highly competitive amongst several state-of-the-art methods. PMID:27136190
Nie, Z Q; Ou, Y Q; Zhuang, J; Qu, Y J; Mai, J Z; Chen, J M; Liu, X Q
2016-05-10
Conditional and unconditional logistic regression analyses are commonly used in case-control studies, whereas the Cox proportional hazards model is often used in survival analysis. Most of the literature refers only to main-effect models; however, generalized linear models differ from general linear models, and interaction comprises both multiplicative and additive components. The former has only statistical significance, whereas the latter has biological significance. In this paper, macros were written in SAS 9.4 to calculate the contrast ratio, the attributable proportion due to interaction, and the synergy index while estimating the interaction terms of logistic and Cox regressions, and the Wald, delta, and profile-likelihood confidence intervals were used to evaluate additive interaction, for reference in big-data analysis in clinical epidemiology and in the analysis of genetic multiplicative and additive interactions. PMID:27188374
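The additive-interaction measures named in the abstract have standard definitions in terms of the joint and single-exposure odds ratios. This sketch computes them for made-up odds ratios; the paper's macros additionally attach confidence intervals, which are omitted here:

```python
def additive_interaction(or11, or10, or01):
    # Standard additive-interaction measures from two-exposure odds ratios.
    reri = or11 - or10 - or01 + 1                 # relative excess risk due to interaction
    ap = reri / or11                              # attributable proportion due to interaction
    s = (or11 - 1) / ((or10 - 1) + (or01 - 1))    # synergy index
    return reri, ap, s

reri, ap, s = additive_interaction(or11=4.5, or10=1.8, or01=2.0)
print(round(reri, 3), round(ap, 3), round(s, 3))   # 1.7 0.378 1.944
```

RERI > 0, AP > 0, or S > 1 each indicate positive interaction on the additive scale.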
Multiple logistic regression model of signalling practices of drivers on urban highways
NASA Astrophysics Data System (ADS)
Puan, Othman Che; Ibrahim, Muttaka Na'iya; Zakaria, Rozana
2015-05-01
Giving a signal is a way of informing other road users, especially conflicting drivers, of a driver's intention to change his/her course of movement. Other users are exposed to hazardous situations and accident risk if a driver who changes course fails to signal as required. This paper describes the application of a logistic regression model to the analysis of drivers' signalling practices on multilane highways, based on factors potentially affecting the decision to signal, such as the driver's gender, vehicle type, vehicle speed and traffic flow intensity. Data pertaining to these factors were collected manually. More than 2000 drivers who performed a lane-changing manoeuvre while driving on two sections of multilane highways were observed. Findings from the study show that a relatively large proportion of drivers failed to give any signal when changing lane. The analysis indicates that, although the proportion of drivers who failed to signal prior to a lane-changing manoeuvre is high, compliance among female drivers is better than among male drivers. A binary logistic model was developed to represent the probability that a driver provides a signal indication prior to a lane-changing manoeuvre. The model indicates that the driver's gender, the type of vehicle driven, vehicle speed and traffic volume influence the driver's decision to signal prior to a lane-changing manoeuvre on a multilane urban highway. In terms of vehicle type, about 97% of motorcyclists failed to comply with the signalling requirement. The proportion of non-compliant drivers under stable traffic flow conditions is much higher than when the flow is relatively heavy. This is consistent with the data, which indicate a high degree of non-compliance when the average speed of the traffic stream is relatively high.
A logistic regression based approach for the prediction of flood warning threshold exceedance
NASA Astrophysics Data System (ADS)
Diomede, Tommaso; Trotter, Luca; Stefania Tesini, Maria; Marsigli, Chiara
2016-04-01
A method based on logistic regression is proposed for the prediction of river level threshold exceedance at short (+0-18h) and medium (+18-42h) lead times. The aim of the study is to provide a valuable tool for the issue of warnings by the authority responsible for public safety in case of flood. The role of different precipitation periods as predictors for the exceedance of a fixed river level has been investigated, in order to derive significant information for flood forecasting. Based on catchment-averaged values, a separation of "antecedent" and "peak-triggering" rainfall amounts as independent variables is attempted. In particular, the following flood-related precipitation periods have been considered: (i) the period from 1 to n days before the forecast issue time, which may be relevant for soil saturation; (ii) the last 24 hours, which may be relevant for the current water level in the river; and (iii) the period from 0 to x hours ahead of the forecast issue time, when the flood-triggering precipitation generally occurs. Several combinations and values of these predictors have been tested to optimise the implementation of the method. In particular, the period for the antecedent-precipitation predictor ranges between 5 and 45 days; the state of the river can be represented by the last 24-h precipitation or, alternatively, by the current river level. The flood-triggering precipitation has been cumulated over the next 18 hours (for the short lead time) and 36-42 hours (for the medium lead time). The proposed approach requires a specific implementation of logistic regression for each river section and warning threshold. The method's performance has been evaluated over the Santerno river catchment (about 450 km2) in the Emilia-Romagna Region, northern Italy. A statistical analysis in terms of false alarms, misses and related scores was carried out using an 8-year-long database. The results are quite satisfactory, with slightly better performances
NASA Astrophysics Data System (ADS)
Pradhan, Biswajeet
In 2006 and 2007, heavy monsoon rainfall triggered floods along Malaysia's east coast as well as in the southern state of Johor. The hardest-hit areas were along the east coast of peninsular Malaysia in the states of Kelantan, Terengganu and Pahang; the city of Johor in the south was particularly hard hit. The floods cost nearly a billion ringgit in property and many lives. The extent of the damage could have been reduced or minimized if an early warning system had been in place. This paper deals with flood susceptibility analysis using a logistic regression model. We evaluated flood susceptibility and the effect of flood-related factors along the Kelantan river basin using a Geographic Information System (GIS) and remote sensing data. Previously flooded areas were extracted from archived RADARSAT images using image processing tools. Flood susceptibility mapping was conducted in the study area along the Kelantan River using RADARSAT imagery and then enlarged to a 1:25,000 scale. Topographical, hydrological and geological data and satellite images were collected, processed, and assembled into a spatial database using GIS and image processing. The factors chosen that influence flood occurrence were: topographic slope, topographic aspect, topographic curvature, DEM and distance from river drainage, all from the topographic database; flow direction and flow accumulation, extracted from the hydrological database; geology and distance from lineament, taken from the geological database; land use from SPOT satellite images; soil texture from the soil database; and the vegetation index value from SPOT satellite images. Flood-susceptible areas were analyzed and mapped using the probability-logistic regression model. Results indicate that flood-prone-area mapping can be performed at 1:25,000, which is comparable to some conventional flood hazard map scales. The flood-prone areas delineated on these maps correspond to areas that would be inundated by significant flooding.
Heinze, Georg; Ploner, Meinhard; Beyea, Jan
2013-12-20
In the logistic regression analysis of a small-sized, case-control study on Alzheimer's disease, some of the risk factors exhibited missing values, motivating the use of multiple imputation. Usually, Rubin's rules (RR) for combining point estimates and variances would then be used to estimate (symmetric) confidence intervals (CIs), on the assumption that the regression coefficients were distributed normally. Yet, rarely is this assumption tested, with or without transformation. In analyses of small, sparse, or nearly separated data sets, such symmetric CI may not be reliable. Thus, RR alternatives have been considered, for example, Bayesian sampling methods, but not yet those that combine profile likelihoods, particularly penalized profile likelihoods, which can remove first order biases and guarantee convergence of parameter estimation. To fill the gap, we consider the combination of penalized likelihood profiles (CLIP) by expressing them as posterior cumulative distribution functions (CDFs) obtained via a chi-squared approximation to the penalized likelihood ratio statistic. CDFs from multiple imputations can then easily be averaged into a combined CDF_c, allowing confidence limits for a parameter β at level 1 − α to be identified as those β* and β** that satisfy CDF_c(β*) = α/2 and CDF_c(β**) = 1 − α/2. We demonstrate that the CLIP method outperforms RR in analyzing both simulated data and data from our motivating example. CLIP can also be useful as a confirmatory tool, should it show that the simpler RR are adequate for extended analysis. We also compare the performance of CLIP to Bayesian sampling methods using Markov chain Monte Carlo. CLIP is available in the R package logistf. PMID:23873477
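The CDF-combination step can be sketched numerically: average per-imputation CDFs of a coefficient, then invert the combined CDF at α/2 and 1 − α/2. Here each imputation's CDF is a normal stand-in for the CDF obtained from the chi-squared approximation to the penalized likelihood ratio; the (mean, sd) pairs are invented:

```python
import numpy as np
from math import erf, sqrt

def norm_cdf(x, mu, sd):
    return 0.5 * (1 + erf((x - mu) / (sd * sqrt(2))))

# Invented per-imputation (mu, sd) summaries of the coefficient's CDF.
imputation_fits = [(0.9, 0.30), (1.1, 0.28), (1.0, 0.35)]

def combined_cdf(beta):
    # CLIP-style combination: average the per-imputation CDFs.
    return float(np.mean([norm_cdf(beta, m, s) for m, s in imputation_fits]))

def invert(target, lo=-5.0, hi=5.0):
    # Bisection on the monotone combined CDF to find its target-quantile.
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        lo, hi = (mid, hi) if combined_cdf(mid) < target else (lo, mid)
    return 0.5 * (lo + hi)

alpha = 0.05
print(invert(alpha / 2), invert(1 - alpha / 2))   # combined 95% limits
```

Because the combined CDF is an average of possibly skewed component CDFs, the resulting interval need not be symmetric about the pooled point estimate, which is the method's advantage over Rubin's rules.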
Shayan, Zahra; Mezerji, Naser Mohammad Gholi; Shayan, Leila; Naseri, Parisa
2016-01-01
Background: Logistic regression (LR) and linear discriminant analysis (LDA) are two popular statistical models for prediction of group membership. Although they are very similar, LDA makes more assumptions about the data. When categorical and continuous variables are used simultaneously, the optimal choice between the two models is questionable. In most studies, classification error (CE) is used to discriminate between subjects in several groups, but this index is not suitable to predict the accuracy of the outcome. The present study compared LR and LDA models using classification indices. Methods: This cross-sectional study selected 243 cancer patients. Sample sets of different sizes (n = 50, 100, 150, 200, 220) were randomly selected and the CE, B, and Q classification indices were calculated by the LR and LDA models. Results: CE revealed a lack of superiority for one model over the other, but the results showed that LR performed better than LDA for the B and Q indices in all situations. No significant effect for sample size on CE was noted for selection of an optimal model. Assessment of the accuracy of prediction of real data indicated that the B and Q indices are appropriate for selection of an optimal model. Conclusion: The results of this study showed that LR performs better in some cases and LDA in others when based on CE. The CE index is not appropriate for classification, although the B and Q indices performed better and offered more efficient criteria for comparison and discrimination between groups.
Zhao, Hongya; Logothetis, Christopher J.; Gorlov, Ivan P.; Zeng, Jia; Dai, Jianguo
2013-01-01
Predicting disease progression is one of the most challenging problems in prostate cancer research. Adding gene expression data to prediction models that are based on clinical features has been proposed to improve accuracy. In the current study, we applied a logistic regression (LR) model combining clinical features and gene co-expression data to improve the accuracy of predicting prostate cancer progression. The top-scoring pair (TSP) method was used to select genes for the model. The proposed models not only preserved the basic properties of the TSP algorithm but also incorporated the clinical features into the prognostic models. Based on statistical inference with iterative cross-validation, we demonstrated that LR prediction models that included genes selected by the TSP method provided better predictions of prostate cancer progression than those using clinical variables only and/or those that included genes selected by the one-gene-at-a-time approach. Thus, we conclude that TSP selection is a useful tool for feature (and/or gene) selection in prognostic models, and our model also provides an alternative for predicting prostate cancer progression. PMID:24367394
Kadilar, Gamze Ozel
2016-06-01
The aim of the study is to examine the factors that appear to have a higher potential for serious injury or death of drivers in traffic accidents in Turkey, such as collision type, roadway surface, vehicle speed, alcohol/drug use, and restraint use. Driver crash severity is the dependent variable of this study, with two categories, fatal and non-fatal. Given the binary nature of the dependent variable, a conditional logistic regression analysis was found suitable. Of the 16 independent variables obtained from Turkish police accident reports, 11 were found to be most significantly associated with driver crash severity: age, education level, restraint use, roadway condition, roadway type, time of day, collision location, collision type, number and direction of vehicles, vehicle speed, and alcohol/drug use. This study found that belted drivers aged 18-25 years, in accidents involving two vehicles travelling in the same direction, in an urban area, during the daytime, and on an avenue or a street, have better chances of survival in traffic accidents. PMID:25087577
NASA Astrophysics Data System (ADS)
Snedden, Gregg A.; Steyer, Gregory D.
2013-02-01
Understanding plant community zonation along estuarine stress gradients is critical for effective conservation and restoration of coastal wetland ecosystems. We related the presence of plant community types to estuarine hydrology at 173 sites across coastal Louisiana. Percent relative cover by species was assessed at each site near the end of the growing season in 2008, and hourly water level and salinity were recorded at each site Oct 2007-Sep 2008. Nine plant community types were delineated with k-means clustering, and indicator species were identified for each of the community types with indicator species analysis. An inverse relation between salinity and species diversity was observed. Canonical correspondence analysis (CCA) effectively segregated the sites across ordination space by community type, and indicated that salinity and tidal amplitude were both important drivers of vegetation composition. Multinomial logistic regression (MLR) and Akaike's Information Criterion (AIC) were used to predict the probability of occurrence of the nine vegetation communities as a function of salinity and tidal amplitude, and probability surfaces obtained from the MLR model corroborated the CCA results. The weighted kappa statistic, calculated from the confusion matrix of predicted versus actual community types, was 0.7 and indicated good agreement between observed community types and model predictions. Our results suggest that models based on a few key hydrologic variables can be valuable tools for predicting vegetation community development when restoring and managing coastal wetlands.
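The weighted kappa reported above can be computed directly from the confusion matrix of predicted versus actual community types. The sketch below uses linear disagreement weights and a small hypothetical 3x3 matrix (the study's nine communities and actual counts are not reproduced here), and it assumes the categories can be treated as ordered.

```python
def weighted_kappa(confusion):
    """Weighted kappa with linear disagreement weights |i-j|/(k-1):
    1 minus the ratio of observed to chance-expected weighted disagreement."""
    k = len(confusion)
    n = sum(sum(row) for row in confusion)
    row_tot = [sum(row) for row in confusion]
    col_tot = [sum(confusion[i][j] for i in range(k)) for j in range(k)]
    num = den = 0.0
    for i in range(k):
        for j in range(k):
            w = abs(i - j) / (k - 1)                   # disagreement weight
            num += w * confusion[i][j] / n             # observed disagreement
            den += w * row_tot[i] * col_tot[j] / n**2  # expected under chance
    return 1.0 - num / den

# Hypothetical predicted (rows) vs actual (columns) counts for three classes.
confusion = [[20, 5, 0], [4, 30, 6], [1, 3, 31]]
kappa = weighted_kappa(confusion)
```

A kappa near 0.7, as in the study, indicates agreement well above what the marginal frequencies alone would produce.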
Ismail, Abbas; Josephat, Peter
2014-01-01
Tuberculosis (TB) is one of the most important public health problems in Tanzania and was declared a national public health emergency in 2006. Community and individual knowledge and perceptions are critical factors in the control of the disease. The objective of this study was to analyze knowledge and perceptions of TB transmission in Tanzania. Multinomial logistic regression analysis was used to quantify the impact of knowledge and perception on TB. The data were secondary data from the larger national 2007-08 Tanzania HIV/AIDS and Malaria Indicator Survey. The findings across groups revealed that knowledge of TB transmission increased with age and level of education. People in rural areas had less knowledge of tuberculosis transmission than people in urban areas [OR = 0.7]. People with access to radio [OR = 1.7] were more knowledgeable about tuberculosis transmission than those without access to radio. People who did not have a telephone [OR = 0.6] were less knowledgeable about the route of tuberculosis transmission than those who had a telephone. The findings showed that socio-demographic factors such as age, education, place of residence, and owning a telephone or radio varied systematically with knowledge of tuberculosis transmission. PMID:26867270
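The reported odds ratios translate between logistic coefficients and probabilities in a standard way: OR = exp(β), and applying an OR to a group's baseline odds yields the adjusted probability. A minimal sketch, with the baseline probability of 0.5 assumed purely for illustration (it is not given in the abstract):

```python
import math

def odds(p):
    """Convert a probability to odds."""
    return p / (1.0 - p)

def prob_from_odds(o):
    """Convert odds back to a probability."""
    return o / (1.0 + o)

# Odds ratios reported in the abstract; in a logistic model OR = exp(beta),
# so the corresponding coefficient is beta = log(OR).
or_rural, or_radio = 0.7, 1.7
beta_rural = math.log(or_rural)

# With an assumed reference-group probability of 0.5 (illustrative only):
p_rural = prob_from_odds(or_rural * odds(0.5))  # rural residence lowers the probability
p_radio = prob_from_odds(or_radio * odds(0.5))  # radio access raises it
```

An OR below 1 (rural residence, no telephone) shifts the probability of being knowledgeable downward; an OR above 1 (radio access) shifts it upward.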
Kernel-based logistic regression model for protein sequence without vectorialization.
Fong, Youyi; Datta, Saheli; Georgiev, Ivelin S; Kwong, Peter D; Tomaras, Georgia D
2015-07-01
Protein sequence data arise more and more often in vaccine and infectious disease research. These types of data are discrete, high-dimensional, and complex. We propose to study the impact of protein sequences on binary outcomes using a kernel-based logistic regression model, which models the effect of protein through a random effect whose variance-covariance matrix is mostly determined by a kernel function. We propose a novel, biologically motivated, profile hidden Markov model (HMM)-based mutual information (MI) kernel. Hypothesis testing can be carried out using the maximum of the score statistics and a parametric bootstrap procedure. To improve the power of testing, we propose intuitive modifications to the test statistic. We show through simulation studies that the profile HMM-based MI kernel can be substantially more powerful than competing kernels, and that the modified test statistics bring incremental gains in power. We use these proposed methods to investigate two problems from HIV-1 vaccine research: (1) identifying segments of HIV-1 envelope (Env) protein that confer resistance to neutralizing antibody and (2) identifying segments of Env that are associated with attenuation of protective vaccine effect by antibodies of isotype A in the RV144 vaccine trial. PMID:25532524
Binary Logistic Regression Analysis of Foramen Magnum Dimensions for Sex Determination
Kamath, Venkatesh Gokuldas; Asif, Muhammed; Shetty, Radhakrishna; Avadhani, Ramakrishna
2015-01-01
Purpose. The structural integrity of foramen magnum is usually preserved in fire accidents and explosions due to its resistant nature and secluded anatomical position and this study attempts to determine its sexing potential. Methods. The sagittal and transverse diameters and area of foramen magnum of seventy-two skulls (41 male and 31 female) from south Indian population were measured. The analysis was done using Student's t-test, linear correlation, histogram, Q-Q plot, and Binary Logistic Regression (BLR) to obtain a model for sex determination. The predicted probabilities of BLR were analysed using Receiver Operating Characteristic (ROC) curve. Result. BLR analysis and ROC curve revealed that the predictability of the dimensions in sexing the crania was 69.6% for sagittal diameter, 66.4% for transverse diameter, and 70.3% for area of foramen. Conclusion. The sexual dimorphism of foramen magnum dimensions is established. However, due to considerable overlapping of male and female values, it is unwise to singularly rely on the foramen measurements. However, considering the high sex predictability percentage of its dimensions in the present study and the studies preceding it, the foramen measurements can be used to supplement other sexing evidence available so as to precisely ascertain the sex of the skeleton. PMID:26346917
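The ROC analysis of predicted BLR probabilities rests on the area under the curve, which equals the probability that a randomly chosen case from one sex receives a higher predicted probability than a randomly chosen case from the other. A minimal rank-based sketch with hypothetical scores (the study's actual predicted probabilities are not reproduced):

```python
def auc(scores, labels):
    """AUC via the Mann-Whitney identity: the fraction of (positive, negative)
    pairs in which the positive case gets the higher score (ties count 0.5)."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

# Hypothetical predicted probabilities (label 1 = male, 0 = female).
scores = [0.9, 0.8, 0.7, 0.55, 0.4, 0.3]
labels = [1, 1, 0, 1, 0, 0]
area = auc(scores, labels)
```

Predictability percentages near 70%, as reported for the foramen dimensions, correspond to substantial overlap between the male and female score distributions, which is why the authors caution against relying on these measurements alone.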
NASA Astrophysics Data System (ADS)
Kumarasamy, K.; Kaluarachchi, J.
2006-12-01
Groundwater is an important natural resource for numerous human activities, accounting for more than 50% of the total water used in the United States. It is vulnerable to contamination by several organic and inorganic pollutants such as nitrates, heavy metals, and pesticides. Assessment of groundwater vulnerability aids in the management of limited groundwater resources. The focus of this thesis is to (1) statistically compare two groundwater vulnerability assessment models, DRASTIC (an acronym for Depth to water, net Recharge, Aquifer media, Soil media, Topography, Impact of the vadose zone, and hydraulic Conductivity of the aquifer) and ordinal logistic regression, on a national scale by using nitrate concentration as the performance indicator; (2) analyze any discrepancies in the predictability of each of these models; and (3) assess the advantages of each model with respect to the knowledge, expertise, time, and data required and its ability to predict vulnerability. The breadth of nitrate concentration data in groundwater allows for a reliable comparison of the two models. This work will contribute toward the ultimate goal of developing a universal method to predict the vulnerability of groundwater.
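Ordinal logistic regression, as used above, is typically the proportional-odds (cumulative logit) model: a single slope shifts all cumulative thresholds together. A minimal sketch with assumed thresholds, slope, and vulnerability classes (none of these values come from the thesis):

```python
import math

def ordinal_probs(x, cuts, beta):
    """Proportional-odds model: P(Y <= j) = 1 / (1 + exp(-(cut_j - beta * x))),
    with category probabilities as successive differences of the cumulative CDF."""
    cum = [1.0 / (1.0 + math.exp(-(c - beta * x))) for c in cuts] + [1.0]
    return [cum[0]] + [cum[j] - cum[j - 1] for j in range(1, len(cum))]

# Hypothetical setup: three ordered vulnerability classes (low < medium < high);
# x is a standardized predictor (e.g. recharge), cuts and beta are assumed values.
probs = ordinal_probs(1.5, [-0.5, 1.0], 0.8)  # [P(low), P(medium), P(high)]
```

Because the ordering of the classes is built into the cumulative thresholds, the model needs only one slope per predictor, unlike an unordered multinomial fit.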
Lewis, Kristin Nicole; Heckman, Bernadette Davantes; Himawan, Lina
2011-08-01
Growth mixture modeling (GMM) identified latent groups based on treatment outcome trajectories of headache disability measures in patients in headache subspecialty treatment clinics. Using a longitudinal design, 219 patients in headache subspecialty clinics in 4 large cities throughout Ohio provided data on their headache disability at pretreatment and 3 follow-up assessments. GMM identified 3 treatment outcome trajectory groups: (1) patients who initiated treatment with elevated disability levels and who reported statistically significant reductions in headache disability (high-disability improvers; 11%); (2) patients who initiated treatment with elevated disability but who reported no reductions in disability (high-disability nonimprovers; 34%); and (3) patients who initiated treatment with moderate disability and who reported statistically significant reductions in headache disability (moderate-disability improvers; 55%). Based on the final multinomial logistic regression model, a dichotomized treatment appointment attendance variable was a statistically significant predictor for differentiating high-disability improvers from high-disability nonimprovers. Three-fourths of patients who initiated treatment with elevated disability levels did not report reductions in disability after 5 months of treatment with new preventive pharmacotherapies. Preventive headache agents may be most efficacious for patients with moderate levels of disability and for patients with high disability levels who attend all treatment appointments. PMID:21420240
Use of binary logistic regression technique with MODIS data to estimate wild fire risk
NASA Astrophysics Data System (ADS)
Fan, Hong; Di, Liping; Yang, Wenli; Bonnlander, Brian; Li, Xiaoyan
2007-11-01
Many forest fires occur across the globe each year, destroying life and property and strongly impacting ecosystems. In recent years, wildland fires and altered fire disturbance regimes have become a significant management and science problem affecting ecosystems and the wildland/urban interface across the United States and globally. In this paper, we discuss the estimation of 504 probability models for forecasting fire risk for 14 fuel types, 12 months, and one day/week/month in advance, which use 19 years of historical fire data in addition to meteorological and vegetation variables. MODIS land products are utilized as a major data source, and binary logistic regression was adopted to estimate fire probability. To better model the change of fire risk with the transition of seasons, spatial and temporal stratification strategies were applied. To explore the possibility of real-time prediction, the Matlab distributed computing toolbox was used to accelerate the prediction. Finally, this study evaluates and validates the predictions against the ground truth collected. The validation results indicate that these fire risk models achieved nearly 70% prediction accuracy and that MODIS data are a potential data source with which to implement near real-time fire risk prediction.
Sze, N N; Wong, S C; Lee, C Y
2014-12-01
In the past several decades, many countries have set quantified road safety targets to motivate transport authorities to develop systematic road safety strategies and measures and to facilitate continuous road safety improvement. Studies have evaluated the association between the setting of quantified road safety targets and road fatality reduction, in both the short and long run, by comparing road fatalities before and after the implementation of a quantified road safety target. However, little work has been done to evaluate whether quantified road safety targets are actually achieved. In this study, we used a binary logistic regression model to examine the factors - including vehicle ownership, fatality rate, and national income, in addition to level of ambition and duration of the target - that contribute to a target's success. We analyzed 55 quantified road safety targets set by 29 countries from 1981 to 2009, and the results indicate that targets that were still in progress and had lower levels of ambition had a higher likelihood of eventually being achieved. Moreover, possible interaction effects on the association between level of ambition and the likelihood of success are also revealed. PMID:25255417
Comparison of linear discriminant analysis and logistic regression for data classification
NASA Astrophysics Data System (ADS)
Liong, Choong-Yeun; Foo, Sin-Fan
2013-04-01
Linear discriminant analysis (LDA) and logistic regression (LR) are often used for classifying populations or groups using a set of predictor variables. Assumptions of multivariate normality and equal variance-covariance matrices across groups are required before proceeding with LDA, but such assumptions are not required for LR, and hence LR is considered to be much more robust than LDA. In this paper, several real datasets that differ in normality, number of independent variables, and sample size are used to study the performance of both methods. The methods are compared based on the percentage of correct classification and the B index. The results show that, overall, LR performs better regardless of whether the distribution of the data is normal or nonnormal. However, LR requires longer computing time than LDA as sample size increases. The performance of LDA was also tested using various prior probabilities. The results show that the average percentage of correct classification and the B index are higher when the prior probabilities are set based on group size rather than set equal for all groups.
ERIC Educational Resources Information Center
Taylor, Aaron B.; West, Stephen G.; Aiken, Leona S.
2006-01-01
Variables that have been coarsely categorized into a small number of ordered categories are often modeled as outcome variables in psychological research. The authors employ a Monte Carlo study to investigate the effects of this coarse categorization of dependent variables on power to detect true effects using three classes of regression models:…
Rupert, Michael G.; Cannon, Susan H.; Gartner, Joseph E.
2003-01-01
Logistic regression was used to predict the probability of debris flows occurring in areas recently burned by wildland fires. Multiple logistic regression is conceptually similar to multiple linear regression because statistical relations between one dependent variable and several independent variables are evaluated. In logistic regression, however, the dependent variable is transformed to a binary variable (debris flow did or did not occur), and the actual probability of the debris flow occurring is statistically modeled. Data from 399 basins located within 15 wildland fires that burned during 2000-2002 in Colorado, Idaho, Montana, and New Mexico were evaluated. More than 35 independent variables describing the burn severity, geology, land surface gradient, rainfall, and soil properties were evaluated. The models were developed as follows: (1) Basins that did and did not produce debris flows were delineated from National Elevation Data using a Geographic Information System (GIS). (2) Data describing the burn severity, geology, land surface gradient, rainfall, and soil properties were determined for each basin. These data were then downloaded to a statistics software package for analysis using logistic regression. (3) Relations between the occurrence/non-occurrence of debris flows and burn severity, geology, land surface gradient, rainfall, and soil properties were evaluated and several preliminary multivariate logistic regression models were constructed. All possible combinations of independent variables were evaluated to determine which combination produced the most effective model. The multivariate model that best predicted the occurrence of debris flows was selected. (4) The multivariate logistic regression model was entered into a GIS, and a map showing the probability of debris flows was constructed. The most effective model incorporates the percentage of each basin with slope greater than 30 percent, percentage of land burned at medium and high burn severity
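The core of step (3), fitting a binary logistic regression by maximum likelihood, can be sketched with a small Newton-Raphson routine for one predictor plus an intercept. The data below are hypothetical stand-ins for the basin variable "percentage of basin with slope greater than 30 percent" and are not from the study:

```python
import math

def fit_logistic(xs, ys, iters=25):
    """Maximum-likelihood fit of P(debris flow) = 1 / (1 + exp(-(b0 + b1*x)))
    by Newton-Raphson, the standard algorithm behind logistic regression software."""
    b0 = b1 = 0.0
    for _ in range(iters):
        g0 = g1 = h00 = h01 = h11 = 0.0
        for x, y in zip(xs, ys):
            p = 1.0 / (1.0 + math.exp(-(b0 + b1 * x)))
            w = p * (1.0 - p)
            g0 += y - p                 # gradient of the log-likelihood
            g1 += (y - p) * x
            h00 += w                    # (negative) Hessian entries
            h01 += w * x
            h11 += w * x * x
        det = h00 * h11 - h01 * h01
        b0 += (h11 * g0 - h01 * g1) / det   # Newton step: add H^-1 * gradient
        b1 += (h00 * g1 - h01 * g0) / det
    return b0, b1

# Hypothetical basins: x = percent of basin steeper than 30%, y = debris flow (1) or not (0).
xs = [5, 10, 20, 30, 40, 55, 60, 70, 80, 90]
ys = [0, 0, 0, 0, 1, 0, 1, 1, 1, 1]
b0, b1 = fit_logistic(xs, ys)
prob_at_50 = 1.0 / (1.0 + math.exp(-(b0 + b1 * 50)))  # modeled probability for a new basin
```

Unlike linear regression, the fitted output is a probability between 0 and 1, which is what makes the resulting GIS probability maps possible.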
Ghazali, Amirul Syafiq Mohd; Ali, Zalila; Noor, Norlida Mohd; Baharum, Adam
2015-10-22
Multinomial logistic regression is widely used to model the outcomes of a polytomous response variable, a categorical dependent variable with more than two categories. The model assumes that the conditional mean of the dependent categorical variable is the logistic function of an affine combination of predictor variables. Its procedure gives a number of logistic regression models that make specific comparisons of the response categories. When there are q categories of the response variable, the model consists of q-1 logit equations which are fitted simultaneously. The model is validated by variable selection procedures, tests of regression coefficients, a significance test of the overall model, goodness-of-fit measures, and validation of predicted probabilities using odds ratios. This study used the multinomial logistic regression model to investigate obesity and overweight among primary school students in a rural area on the basis of their demographic profiles, lifestyles, and diet and food intake. The results indicated that obesity and overweight of students are related to gender, religion, sleep duration, time spent on electronic games, breakfast intake in a week, with whom meals are taken, protein intake, and also the interaction between breakfast intake in a week and sleep duration, and the interaction between gender and protein intake.
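The q-1 baseline-category logit equations described above determine all q category probabilities once a reference category's linear predictor is fixed at zero. A minimal sketch with hypothetical coefficients (not the study's fitted values):

```python
import math

def multinomial_probs(x, coefs):
    """Category probabilities from q-1 baseline-category logit equations.
    Each (b0, b1) pair gives log(P(cat)/P(reference)) = b0 + b1*x; the
    reference category's linear predictor is fixed at zero."""
    etas = [0.0] + [b0 + b1 * x for b0, b1 in coefs]
    denom = sum(math.exp(e) for e in etas)
    return [math.exp(e) / denom for e in etas]

# Hypothetical coefficients: reference = normal weight, then overweight and obese;
# x = hours of electronic games per day (assumed single predictor for illustration).
coefs = [(-1.2, 0.3), (-2.0, 0.5)]
probs = multinomial_probs(2.0, coefs)  # [P(normal), P(overweight), P(obese)]
```

Exponentiating a coefficient gives the odds ratio of that category versus the reference per unit change in the predictor, which is how the study's validation via odds ratios proceeds.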
Asghari, Mehdi Poursheikhali; Hayatshahi, Sayyed Hamed Sadat; Abdolmaleki, Parviz
2012-01-01
From both the structural and functional points of view, β-turns play important biological roles in proteins. In the present study, a novel two-stage hybrid procedure has been developed to identify β-turns in proteins. Binary logistic regression was used, for the first time, to select sequence parameters significant for the identification of β-turns, using a re-substitution test procedure. The sequence parameters consisted of 80 amino acid positional occurrences and 20 amino acid percentages in the sequence. Among these parameters, the most significant ones selected by the binary logistic regression model were the percentages of Gly and Ser and the occurrence of Asn in position i+2. These significant parameters have the highest effect on the constitution of a β-turn sequence. A neural network model was then constructed and fed with the parameters selected by binary logistic regression to build a hybrid predictor. The network was trained and tested on a non-homologous dataset of 565 protein chains. Applying a nine-fold cross-validation test on the dataset, the network reached an overall accuracy (Qtotal) of 74, which is comparable with the results of other β-turn prediction methods. In conclusion, this study shows that the parameter selection ability of binary logistic regression together with the prediction capability of neural networks leads to the development of more precise models for identifying β-turns in proteins.
ERIC Educational Resources Information Center
Le, Huy; Marcus, Justin
2012-01-01
This study used Monte Carlo simulation to examine the properties of the overall odds ratio (OOR), which was recently introduced as an index for overall effect size in multiple logistic regression. It was found that the OOR was relatively independent of study base rate and performed better than most commonly used R-square analogs in indexing model…
ERIC Educational Resources Information Center
Guler, Nese; Penfield, Randall D.
2009-01-01
In this study, we investigate the logistic regression (LR), Mantel-Haenszel (MH), and Breslow-Day (BD) procedures for the simultaneous detection of both uniform and nonuniform differential item functioning (DIF). A simulation study was used to assess and compare the Type I error rate and power of a combined decision rule (CDR), which assesses DIF…
ERIC Educational Resources Information Center
Gordovil-Merino, Amalia; Guardia-Olmos, Joan; Pero-Cebollero, Maribel
2012-01-01
In this paper, we used simulations to compare the performance of classical and Bayesian estimations in logistic regression models using small samples. In the performed simulations, conditions were varied, including the type of relationship between independent and dependent variable values (i.e., unrelated and related values), the type of variable…
NASA Astrophysics Data System (ADS)
Tunusluoglu, M. C.; Gokceoglu, C.; Nefeslioglu, H. A.; Sonmez, H.
2008-03-01
Debris flow is one of the most destructive mass movements. Regional debris flow susceptibility or hazard assessments can sometimes be more difficult than for other mass movements. Determination of debris accumulation zones and debris source areas, one of the most crucial stages in debris flow investigations, can be very difficult because of morphological restrictions. The main goal of the present study is to extract debris source areas by logistic regression analyses based on data from the slopes of the Barla, Besparmak, and Kapi Mountains in the SW part of the Taurids Mountain belt of Turkey, where the formation of debris material is clearly evident and common. To achieve this goal, extensive field observations to identify the areal extent of debris source areas and debris material, air-photo studies to determine the debris source areas, and desk studies including Geographical Information System (GIS) applications and statistical assessments were performed. To ensure that the training data used in the logistic regression analyses were representative, a random sampling procedure was applied. Using the results of the logistic regression analysis, the debris source area probability map of the region was produced. However, according to the field experience of the authors, the produced map yielded over-predicted results. The main source of the over-prediction is the structural relation between the bedding planes and slope aspect: on the basis of the field observations, the dip of the bedding planes relative to the slope face must be taken into consideration for the generation of debris. To eliminate this problem, an approach was developed using the probability distribution of the aspect values. With the application of this structural adjustment, the final adjusted debris source area probability map was obtained for the study area. The field observations revealed that the actual debris source areas in the field coincide with
Rupert, Michael G.; Cannon, Susan H.; Gartner, Joseph E.; Michael, John A.; Helsel, Dennis R.
2008-01-01
Logistic regression was used to develop statistical models that can be used to predict the probability of debris flows in areas recently burned by wildfires by using data from 14 wildfires that burned in southern California during 2003-2006. Twenty-eight independent variables describing the basin morphology, burn severity, rainfall, and soil properties of 306 drainage basins located within those burned areas were evaluated. The models were developed as follows: (1) Basins that did and did not produce debris flows soon after the 2003 to 2006 fires were delineated from data in the National Elevation Dataset using a geographic information system; (2) Data describing the basin morphology, burn severity, rainfall, and soil properties were compiled for each basin. These data were then input to a statistics software package for analysis using logistic regression; and (3) Relations between the occurrence or absence of debris flows and the basin morphology, burn severity, rainfall, and soil properties were evaluated, and five multivariate logistic regression models were constructed. All possible combinations of independent variables were evaluated to determine which combinations produced the most effective models, and the multivariate models that best predicted the occurrence of debris flows were identified. Percentage of high burn severity and 3-hour peak rainfall intensity were significant variables in all models. Soil organic matter content and soil clay content were significant variables in all models except Model 5. Soil slope was a significant variable in all models except Model 4. The most suitable model can be selected from these five models on the basis of the availability of independent variables in the particular area of interest and field checking of probability maps. The multivariate logistic regression models can be entered into a geographic information system, and maps showing the probability of debris flows can be constructed in recently burned areas of
Bent, Gardner C.; Archfield, Stacey A.
2002-01-01
A logistic regression equation was developed for estimating the probability of a stream flowing perennially at a specific site in Massachusetts. The equation provides city and town conservation commissions and the Massachusetts Department of Environmental Protection with an additional method for assessing whether streams are perennial or intermittent at a specific site in Massachusetts. This information is needed to assist these environmental agencies, who administer the Commonwealth of Massachusetts Rivers Protection Act of 1996, which establishes a 200-foot-wide protected riverfront area extending along the length of each side of the stream from the mean annual high-water line along each side of perennial streams, with exceptions in some urban areas. The equation was developed by relating the verified perennial or intermittent status of a stream site to selected basin characteristics of naturally flowing streams (no regulation by dams, surface-water withdrawals, ground-water withdrawals, diversion, waste-water discharge, and so forth) in Massachusetts. Stream sites used in the analysis were identified as perennial or intermittent on the basis of review of measured streamflow at sites throughout Massachusetts and on visual observation at sites in the South Coastal Basin, southeastern Massachusetts. Measured or observed zero flow(s) during months of extended drought as defined by the 310 Code of Massachusetts Regulations (CMR) 10.58(2)(a) were not considered when designating the perennial or intermittent status of a stream site. The database used to develop the equation included a total of 305 stream sites (84 intermittent- and 89 perennial-stream sites in the State, and 50 intermittent- and 82 perennial-stream sites in the South Coastal Basin). Stream sites included in the database had drainage areas that ranged from 0.14 to 8.94 square miles in the State and from 0.02 to 7.00 square miles in the South Coastal Basin. Results of the logistic regression analysis
NASA Astrophysics Data System (ADS)
Nowicki, M. A.; Hearne, M.; Thompson, E.; Wald, D. J.
2012-12-01
Seismically induced landslides present costly and often fatal threats in many mountainous regions. Substantial effort has been invested to understand where seismically induced landslides may occur in the future. Both slope-stability methods and, more recently, statistical approaches to the problem are described throughout the literature. Though some regional efforts have succeeded, no uniformly agreed-upon method is available for predicting the likelihood and spatial extent of seismically induced landslides. For use in the U.S. Geological Survey (USGS) Prompt Assessment of Global Earthquakes for Response (PAGER) system, we would like to routinely make such estimates, in near-real time, around the globe. Here we use the recently produced USGS ShakeMap Atlas of historic earthquakes to develop an empirical landslide probability model. We focus on recent events, yet include any digitally mapped landslide inventories for which well-constrained ShakeMaps are also available. We combine these uniform estimates of the input shaking (e.g., peak acceleration and velocity) with broadly available susceptibility proxies, such as topographic slope and surface geology. The resulting database is used to build a predictive model of the probability of landslide occurrence with logistic regression. The landslide database includes observations from the Northridge, California (1994); Wenchuan, China (2008); Chi-Chi, Taiwan (1999); and Chuetsu, Japan (2004) earthquakes; we also provide ShakeMaps for moderate-sized events without landslides for proper model testing and training. The performance of the regression model is assessed with both statistical goodness-of-fit metrics and a qualitative review of whether or not the model is able to capture the spatial extent of landslides for each event. Part of our goal is to determine which variables can be employed based on globally available data or proxies, and whether or not modeling results from one region are transferable to
Logistic regression modeling to assess groundwater vulnerability to contamination in Hawaii, USA
NASA Astrophysics Data System (ADS)
Mair, Alan; El-Kadi, Aly I.
2013-10-01
Capture zone analysis combined with a subjective susceptibility index is currently used in Hawaii to assess vulnerability to contamination of drinking water sources derived from groundwater. In this study, we developed an alternative objective approach that combines well capture zones with multiple-variable logistic regression (LR) modeling and applied it to the highly-utilized Pearl Harbor and Honolulu aquifers on the island of Oahu, Hawaii. Input for the LR models utilized explanatory variables based on hydrogeology, land use, and well geometry/location. A suite of 11 target contaminants detected in the region, including elevated nitrate (> 1 mg/L), four chlorinated solvents, four agricultural fumigants, and two pesticides, was used to develop the models. We then tested the ability of the new approach to accurately separate groups of wells with low and high vulnerability, and the suitability of nitrate as an indicator of other types of contamination. Our results produced contaminant-specific LR models that accurately identified groups of wells with the lowest/highest reported detections and the lowest/highest nitrate concentrations. Current and former agricultural land uses were identified as significant explanatory variables for eight of the 11 target contaminants, while elevated nitrate was a significant variable for five contaminants. The utility of the combined approach is contingent on the availability of hydrologic and chemical monitoring data for calibrating groundwater and LR models. Application of the approach using a reference site with sufficient data could help identify key variables in areas with similar hydrogeology and land use but limited data. In addition, elevated nitrate may also be a suitable indicator of groundwater contamination in areas with limited data. The objective LR modeling approach developed in this study is flexible enough to address a wide range of contaminants and represents a suitable addition to the current subjective approach.
Landslide susceptibility analysis with logistic regression model based on FCM sampling strategy
NASA Astrophysics Data System (ADS)
Wang, Liang-Jie; Sawada, Kazuhide; Moriguchi, Shuji
2013-08-01
Several mathematical models are used to predict the spatial distribution characteristics of landslides to mitigate damage caused by landslide disasters. Although some studies have achieved excellent results around the world, few studies take the inter-relationship of the selected points (training points) into account. In this paper, we present the fuzzy c-means (FCM) algorithm as an optimal method for choosing the appropriate input landslide points as training data. Based on different combinations of the fuzzy exponent (m) and the number of clusters (c), five groups of sampling points were derived from seed cell points and applied to analyze the landslide susceptibility in Mizunami City, Gifu Prefecture, Japan. A logistic regression model is applied to model the relationships between landslide-conditioning factors and landslide occurrence. The pre-existing landslide bodies and the area under the receiver operating characteristic (ROC) curve were used to evaluate the performance of all the models with different m and c. The results revealed that Model no. 4 (m=1.9, c=4) and Model no. 5 (m=1.9, c=5) had significantly high classification accuracies of 90.0%. Moreover, over 30% of the landslide bodies were grouped under the very high susceptibility zone. In addition, Model no. 4 and Model no. 5 had higher area under the ROC curve (AUC) values, 0.78 and 0.79, respectively. Therefore, Model no. 4 and Model no. 5 offer better results for landslide susceptibility mapping. Maps derived from Model no. 4 and Model no. 5 would offer the local authorities crucial information for city planning and development.
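Ranking candidate models by the area under the ROC curve, as done for the FCM-based models above, can be sketched as follows; the labels and susceptibility scores are synthetic:

```python
# Hedged sketch: AUC-based evaluation of a susceptibility model. A score of 0.5
# is no better than chance; higher values indicate better discrimination between
# landslide and non-landslide cells. Data are synthetic, not the study's.
import numpy as np
from sklearn.metrics import roc_auc_score

rng = np.random.default_rng(2)
labels = rng.binomial(1, 0.3, 200)                 # 1 = landslide cell
scores = labels * 0.3 + rng.uniform(0, 0.6, 200)   # partially overlapping scores
auc = roc_auc_score(labels, scores)
print(round(auc, 2))
```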
Nakasone, Yutaka; Ikeda, Osamu; Yamashita, Yasuyuki; Kudoh, Kouichi; Shigematsu, Yoshinori; Harada, Kazunori
2007-09-15
We applied multivariate analysis to the clinical findings in patients with acute gastrointestinal (GI) hemorrhage and compared the relationship between these findings and angiographic evidence of extravasation. Our study population consisted of 46 patients with acute GI bleeding. They were divided into two groups. In group 1 we retrospectively analyzed 41 angiograms obtained in 29 patients (age range, 25-91 years; average, 71 years). Their clinical findings, including the shock index (SI), diastolic blood pressure, hemoglobin, platelet counts, and age, were quantitatively analyzed. In group 2, consisting of 17 patients (age range, 21-78 years; average, 60 years), we prospectively applied statistical analysis by a logistic regression model to their clinical findings and then assessed 21 angiograms obtained in these patients to determine whether our model was useful for predicting the presence of angiographic evidence of extravasation. On 18 of 41 (43.9%) angiograms in group 1 there was evidence of extravasation; in 3 patients it was demonstrated only by selective angiography. Factors significantly associated with angiographic visualization of extravasation were the SI and patient age. For differentiation between cases with and cases without angiographic evidence of extravasation, the optimal cutoff point was between 0.51 and 0.53. Of the 21 angiograms obtained in group 2, 13 (61.9%) showed evidence of extravasation; in 1 patient it was demonstrated only on selective angiograms. We found that in 90% of the cases, the prospective application of our model correctly predicted the angiographically confirmed presence or absence of extravasation. We conclude that in patients with GI hemorrhage, angiographic visualization of extravasation is associated with the pre-embolization SI. Patients with a high SI value should undergo study to facilitate optimal treatment planning.
LOGISTIC NETWORK REGRESSION FOR SCALABLE ANALYSIS OF NETWORKS WITH JOINT EDGE/VERTEX DYNAMICS
Almquist, Zack W.; Butts, Carter T.
2015-01-01
Change in group size and composition has long been an important area of research in the social sciences. Similarly, interest in interaction dynamics has a long history in sociology and social psychology. However, the effects of endogenous group change on interaction dynamics are a surprisingly understudied area. One way to explore these relationships is through social network models. Network dynamics may be viewed as a process of change in the edge structure of a network, in the vertex set on which edges are defined, or in both simultaneously. Although early studies of such processes were primarily descriptive, recent work on this topic has increasingly turned to formal statistical models. While showing great promise, many of these modern dynamic models are computationally intensive and scale very poorly in the size of the network under study and/or the number of time points considered. Likewise, currently used models focus on edge dynamics, with little support for endogenously changing vertex sets. Here, the authors show how an existing approach based on logistic network regression can be extended to serve as a highly scalable framework for modeling large networks with dynamic vertex sets. The authors place this approach within a general dynamic exponential family (exponential-family random graph modeling) context, clarifying the assumptions underlying the framework (and providing a clear path for extensions), and they show how model assessment methods for cross-sectional networks can be extended to the dynamic case. Finally, the authors illustrate this approach on a classic data set involving interactions among windsurfers on a California beach. PMID:26120218
Binary logistic regression analysis of hard palate dimensions for sexing human crania
Asif, Muhammed; Shetty, Radhakrishna; Avadhani, Ramakrishna
2016-01-01
Sex determination is the preliminary step in every forensic investigation, and the hard palate assumes significance in cranial sexing in cases involving burns and explosions due to its resistant nature and secluded location. This study analyzes the sexing potential of the incisive foramen to posterior nasal spine length, the palatine process of maxilla length, the horizontal plate of palatine bone length and the transverse length between the greater palatine foramina. The study deviates from the conventional method of measuring the maxillo-alveolar length and breadth, as the dimensions considered in this study are more heat resistant and useful in situations with damaged alveolar margins. The study involves 50 male and 50 female adult dry skulls of an Indian ethnic group. The dimensions measured were statistically analyzed using Student's t test, binary logistic regression and the receiver operating characteristic curve. It was observed that the incisive foramen to posterior nasal spine length is a definite sex marker with sex predictability of 87.2%. The palatine process of maxilla length with 66.8% sex predictability and the horizontal plate of palatine bone length with 71.9% sex predictability cannot be relied upon as definite sex markers. The transverse length between the greater palatine foramina is statistically insignificant in sexing crania (P=0.318). Considering the significant overlap of values between the sexes, the palatal dimensions alone cannot be relied upon for sexing. Nevertheless, considering the high sex predictability of the incisive foramen to posterior nasal spine length, this dimension can definitely be used to supplement other available sexing evidence to precisely conclude the cranial sex. PMID:27382518
Banno, Masahiro; Koide, Takayoshi; Aleksic, Branko; Okada, Takashi; Kikuchi, Tsutomu; Kohmura, Kunihiro; Adachi, Yasunori; Kawano, Naoko; Iidaka, Tetsuya; Ozaki, Norio
2012-01-01
Objectives This study investigated which clinical and sociodemographic factors affected Wisconsin Card Sorting Test (WCST) factor scores of patients with schizophrenia in order to evaluate parameters or items of the WCST. Design Cross-sectional study. Setting Patients with schizophrenia from three hospitals participated. Participants Participants were recruited from July 2009 to August 2011. 131 Japanese patients with schizophrenia (84 men and 47 women, 43.5±13.8 years (mean±SD)) entered and completed the study. Participants were included in the study if they (1) met DSM-IV criteria for schizophrenia; (2) were physically healthy and (3) had no mood disorders, substance abuse, neurodevelopmental disorders, epilepsy or mental retardation. We examined their basic clinical and sociodemographic factors (sex, age, education years, age of onset, duration of illness, chlorpromazine equivalent doses and the positive and negative syndrome scale (PANSS) scores). Primary and secondary outcome measures All patients completed the Keio version of the WCST. Five indicators were calculated, including categories achieved (CA), perseverative errors in Milner (PEM) and Nelson (PEN), total errors (TE) and difficulties of maintaining set (DMS). From the principal component analysis, we identified two factors (1 and 2). We assessed the relationship between these factor scores and clinical and sociodemographic factors using multiple logistic regression analysis. Results Factor 1 was mainly composed of CA, PEM, PEN and TE. Factor 2 was mainly composed of DMS. The factor 1 score was affected by age, education years and the PANSS negative scale score. The factor 2 score was affected by duration of illness. Conclusions Age, education years, PANSS negative scale score and duration of illness affected WCST factor scores in patients with schizophrenia. Using WCST factor scores may reduce the possibility of type I errors due to multiple comparisons. PMID:23135537
Bias-reduced and separation-proof conditional logistic regression with small or sparse data sets.
Heinze, Georg; Puhr, Rainer
2010-03-30
Conditional logistic regression is used for the analysis of binary outcomes when subjects are stratified into several subsets, e.g. matched pairs or blocks. Log odds ratio estimates are usually found by maximizing the conditional likelihood. This approach eliminates all strata-specific parameters by conditioning on the number of events within each stratum. However, in the analyses of both an animal experiment and a lung cancer case-control study, conditional maximum likelihood (CML) resulted in infinite odds ratio estimates and monotone likelihood. Estimation can be improved by using Cytel Inc.'s well-known LogXact software, which provides a median unbiased estimate and exact or mid-p confidence intervals. Here, we suggest and outline point and interval estimation based on maximization of a penalized conditional likelihood in the spirit of Firth's (Biometrika 1993; 80:27-38) bias correction method (CFL). We present comparative analyses of both studies, demonstrating some advantages of CFL over competitors. We report on a small-sample simulation study where CFL log odds ratio estimates were almost unbiased, whereas LogXact estimates showed some bias and CML estimates exhibited serious bias. Confidence intervals and tests based on the penalized conditional likelihood had close-to-nominal coverage rates and yielded highest power among all methods compared, respectively. Therefore, we propose CFL as an attractive solution to the stratified analysis of binary data, irrespective of the occurrence of monotone likelihood. A SAS program implementing CFL is available at: http://www.muw.ac.at/msi/biometrie/programs. PMID:20213709
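Firth's correction penalizes the likelihood by the Jeffreys prior, which keeps estimates finite even under separation. A minimal numpy sketch of the unconditional Firth fit (not the conditional CFL variant proposed in the paper), with the hat-value adjustment applied to the score at each Newton step:

```python
# Hedged sketch: Firth's bias-reduced logistic regression on a toy dataset with
# perfect separation, where ordinary maximum likelihood diverges. This is the
# unconditional Firth estimator, shown only to illustrate the penalization idea.
import numpy as np

def firth_logit(X, y, n_iter=50):
    """Newton iterations with Firth's hat-value correction added to the score."""
    beta = np.zeros(X.shape[1])
    for _ in range(n_iter):
        p = 1.0 / (1.0 + np.exp(-X @ beta))
        W = p * (1.0 - p)
        XtWX = X.T @ (X * W[:, None])
        # diagonal of the weighted hat matrix: h_i = W_i * x_i' (X'WX)^{-1} x_i
        h = W * np.einsum("ij,ij->i", X @ np.linalg.inv(XtWX), X)
        # Firth-adjusted score: X' (y - p + h * (0.5 - p))
        score = X.T @ (y - p + h * (0.5 - p))
        beta = beta + np.linalg.solve(XtWX, score)
    return beta

# Perfectly separated data: ML would push the slope toward infinity,
# but the Firth estimate remains finite.
X = np.column_stack([np.ones(6), [-3.0, -2.0, -1.0, 1.0, 2.0, 3.0]])
y = np.array([0, 0, 0, 1, 1, 1])
beta = firth_logit(X, y)
print(beta.round(2))
```

Production analyses would use a maintained implementation (e.g., the `logistf` R package or the authors' SAS program) rather than this sketch, which omits step-halving and convergence checks.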
Mallinis, Georgios; Koutsias, Nikos
2008-01-01
Improvement of satellite sensor characteristics motivates the development of new techniques for satellite image classification. Spatial information seems to be critical in classification processes, especially for heterogeneous and complex landscapes such as those observed in the Mediterranean basin. In our study, a spectral classification method for LANDSAT-5 TM imagery that uses several binomial logistic regression models was developed, evaluated and compared to the familiar parametric maximum likelihood algorithm. The classification approach based on logistic regression modelling was extended to a contextual one by using autocovariates to consider spatial dependencies of every pixel with its neighbours. Finally, the maximum likelihood algorithm was upgraded to contextual by considering typicality, a measure which indicates the strength of class membership. The use of logistic regression for broad-scale land cover classification presented higher overall accuracy (75.61%) than the maximum likelihood algorithm (64.23%), although the difference was not statistically significant, even when the latter was refined following a spatial approach based on Mahalanobis distance (66.67%). However, the consideration of the spatial autocovariate in the logistic models significantly improved the fit of the models and increased the overall accuracy from 75.61% to 80.49%.
NASA Astrophysics Data System (ADS)
Al-Mudhafar, W. J.
2013-12-01
Precise prediction of rock facies leads to adequate reservoir characterization by improving the porosity-permeability relationships used to estimate the properties in non-cored intervals. It also helps to accurately identify the spatial facies distribution in order to build an accurate reservoir model for optimal future reservoir performance. In this paper, facies estimation has been done through multinomial logistic regression (MLR) with respect to the well logs and core data in a well in the upper sandstone formation of the South Rumaila oil field. The independent variables are gamma ray, formation density, water saturation, shale volume, log porosity, core porosity, and core permeability. First, a robust sequential imputation algorithm has been applied to impute the missing data. This algorithm starts from a complete subset of the dataset and estimates sequentially the missing values in an incomplete observation by minimizing the determinant of the covariance of the augmented data matrix. Then, the observation is added to the complete data matrix and the algorithm continues with the next observation with missing values. The MLR has been chosen to estimate the maximum likelihood and minimize the standard error of the nonlinear relationships between facies and the core and log data. The MLR is used to predict the probabilities of the different possible facies given each independent variable by constructing a linear predictor function having a set of weights that are linearly combined with the independent variables by using a dot product. A beta distribution of facies has been considered as prior knowledge, and the predicted probability (posterior) has been estimated from the MLR based on Bayes' theorem, which relates the posterior probability to the conditional probability and the prior knowledge. To assess the statistical accuracy of the model, the bootstrap should be carried out to estimate extra-sample prediction error by randomly
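Multinomial logistic regression of facies class probabilities on log data can be sketched with scikit-learn; the three facies classes and two predictors below are synthetic stand-ins for the paper's log and core variables:

```python
# Hedged sketch: multinomial logistic regression predicting facies class
# probabilities from two log-derived predictors. Labels, thresholds, and
# features are illustrative, not the South Rumaila data.
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(3)
n = 300
gamma_ray = rng.uniform(20, 120, n)     # API units (illustrative)
porosity = rng.uniform(0.05, 0.30, n)   # fraction
# Three synthetic facies classes driven mostly by gamma ray
facies = np.digitize(gamma_ray + rng.normal(0, 10, n), [55, 90])  # 0, 1, 2

X = np.column_stack([gamma_ray, porosity])
mlr = LogisticRegression(max_iter=1000).fit(X, facies)

# Class probabilities for one depth interval; they sum to 1 across facies
probs = mlr.predict_proba([[40.0, 0.2]])[0]
print(probs.round(2))
```

Each observation receives a full probability vector over facies, which is what allows the posterior-style interpretation described in the abstract.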
Wagner, Philippe; Ghith, Nermin; Leckie, George
2016-01-01
Background and Aim Many multilevel logistic regression analyses of “neighbourhood and health” focus on interpreting measures of association (e.g., the odds ratio, OR). In contrast, multilevel analysis of variance is rarely considered. We propose an original stepwise analytical approach that distinguishes between “specific” (measures of association) and “general” (measures of variance) contextual effects. Performing two empirical examples we illustrate the methodology, interpret the results and discuss the implications of this kind of analysis in public health. Methods We analyse 43,291 individuals residing in 218 neighbourhoods in the city of Malmö, Sweden in 2006. We study two individual outcomes (psychotropic drug use and choice of private vs. public general practitioner, GP) for which the relative importance of neighbourhood as a source of individual variation differs substantially. In Step 1 of the analysis, we evaluate the OR and the area under the receiver operating characteristic (AUC) curve for individual-level covariates (i.e., age, sex and individual low income). In Step 2, we assess general contextual effects using the AUC. Finally, in Step 3 the OR for a specific neighbourhood characteristic (i.e., neighbourhood income) is interpreted jointly with the proportional change in variance (PCV) and the proportion of ORs in the opposite direction (POOR) statistics. Results For both outcomes, information on individual characteristics (Step 1) provided low discriminatory accuracy (AUC = 0.616 for psychotropic drug use and 0.600 for choosing a private GP). Accounting for neighbourhood of residence (Step 2) only improved the AUC for choosing a private GP (+0.295 units). High neighbourhood income (Step 3) was strongly associated with choosing a private GP (OR = 3.50), but the PCV was only 11% and the POOR 33%. Conclusion Applying an innovative stepwise multilevel analysis, we observed that, in Malmö, the neighbourhood context per se had a negligible
The Application of the Cumulative Logistic Regression Model to Automated Essay Scoring
ERIC Educational Resources Information Center
Haberman, Shelby J.; Sinharay, Sandip
2010-01-01
Most automated essay scoring programs use a linear regression model to predict an essay score from several essay features. This article applied a cumulative logit model instead of the linear regression model to automated essay scoring. Comparison of the performances of the linear regression model and the cumulative logit model was performed on a…
Comparison of the Properties of Regression and Categorical Risk-Adjustment Models
Averill, Richard F.; Muldoon, John H.; Hughes, John S.
2016-01-01
Clinical risk-adjustment, the ability to standardize the comparison of individuals with different health needs, is based upon 2 main alternative approaches: regression models and clinical categorical models. In this article, we examine the impact of the differences in the way these models are constructed on end user applications. PMID:26945302
ERIC Educational Resources Information Center
Olejnik, Stephen; Mills, Jamie; Keselman, Harvey
2000-01-01
Evaluated the use of Mallows' C(p) and Wherry's adjusted R squared (R. Wherry, 1931) statistics to select a final model from a pool of model solutions using computer-generated data. Neither statistic identified the underlying regression model any better than, and usually less well than, the stepwise selection method, which itself was poor for…
Li, J.; Gray, B.R.; Bates, D.M.
2008-01-01
Partitioning the variance of a response by design levels is challenging for binomial and other discrete outcomes. Goldstein (2003) proposed four definitions for variance partitioning coefficients (VPC) under a two-level logistic regression model. In this study, we explicitly derived formulae for the multilevel logistic regression model and subsequently studied the distributional properties of the calculated VPCs. Using simulations and a vegetation dataset, we demonstrated associations between different VPC definitions, the importance of methods for estimating VPCs (by comparing VPCs obtained using Laplace and penalized quasi-likelihood methods), and bivariate dependence between VPCs calculated at different levels. Such an empirical study lends immediate support to wider applications of VPC in scientific data analysis.
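One of Goldstein's VPC definitions, the latent-variable formulation, has a simple closed form: the level-2 variance divided by itself plus the standard-logistic residual variance π²/3. A minimal sketch:

```python
# Hedged sketch: the latent-variable VPC for a two-level logistic model,
# one of the definitions discussed by Goldstein (2003). The input is the
# estimated cluster-level variance on the logit scale.
import math

def vpc_latent(sigma2_u):
    """Share of latent-response variance attributable to the cluster level."""
    return sigma2_u / (sigma2_u + math.pi ** 2 / 3)

# A cluster variance of 1.0 on the logit scale implies a VPC of about 0.23
print(round(vpc_latent(1.0), 3))
```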
Staley, Dennis M.; Negri, Jacquelyn A.; Kean, Jason W.; Laber, Jayme L.; Tillery, Anne C.; Youberg, Ann M.
2016-01-01
Wildfire can significantly alter the hydrologic response of a watershed to the extent that even modest rainstorms can generate dangerous flash floods and debris flows. To reduce public exposure to hazard, the U.S. Geological Survey produces post-fire debris-flow hazard assessments for select fires in the western United States. We use publicly available geospatial data describing basin morphology, burn severity, soil properties, and rainfall characteristics to estimate the statistical likelihood that debris flows will occur in response to a storm of a given rainfall intensity. Using an empirical database and refined geospatial analysis methods, we defined new equations for the prediction of debris-flow likelihood using logistic regression methods. We showed that the new logistic regression model outperformed previous models used to predict debris-flow likelihood.
Chen, Chau-Kuang; Bruce, Michelle; Tyler, Lauren; Brown, Claudine; Garrett, Angelica; Goggins, Susan; Lewis-Polite, Brandy; Weriwoh, Mirabel L; Juarez, Paul D.; Hood, Darryl B.; Skelton, Tyler
2014-01-01
The goal of this study was to analyze a 54-item instrument for assessment of perception of exposure to environmental contaminants within the context of the built environment, or exposome. This exposome was defined in five domains to include 1) home and hobby, 2) school, 3) community, 4) occupation, and 5) exposure history. Interviews were conducted with child-bearing-age minority women at Metro Nashville General Hospital at Meharry Medical College. Data were analyzed utilizing DTReg software for Support Vector Machine (SVM) modeling followed by an SPSS package for a logistic regression model. The target (outcome) variable of interest was the respondent's residence ZIP code. The results demonstrate that the rank order of important variables with respect to SVM modeling versus traditional logistic regression models is almost identical. This is the first study documenting that SVM analysis has discriminatory power for determination of higher-ordered spatial relationships on an environmental exposure history questionnaire. PMID:23395953
Gordóvil-Merino, Amalia; Guàrdia-Olmos, Joan; Peró-Cebollero, Maribel; de la Fuente-Solanas, Emilia I
2010-04-01
The limitations inherent to classical estimation of the logistic regression models are known. The Bayesian approach in statistical analysis is an alternative to be considered, given that it makes it possible to introduce prior information about the phenomenon under study. The aim of the present work is to analyze binary and multinomial logistic regression simple models estimated by means of a Bayesian approach in comparison to classical estimation. To that effect, Child Attention Deficit Hyperactivity Disorder (ADHD) clinical data were analyzed. The sample included 286 participants of 6-12 years (78% boys, 22% girls) with ADHD positive diagnosis in 86.7% of the cases. The results show a reduction of standard errors associated to the coefficients obtained from the Bayesian analysis, thus bringing a greater stability to the coefficients. Complex models where parameter estimation may be easily compromised could benefit from this advantage. PMID:20524554
NASA Astrophysics Data System (ADS)
Colkesen, Ismail; Sahin, Emrehan Kutlug; Kavzoglu, Taskin
2016-06-01
Identification of landslide prone areas and production of accurate landslide susceptibility zonation maps have been crucial topics for hazard management studies. Since the prediction of susceptibility is one of the main processing steps in landslide susceptibility analysis, selection of a suitable prediction method plays an important role in the success of the susceptibility zonation process. Although simple statistical algorithms (e.g. logistic regression) have been widely used in the literature, the use of advanced non-parametric algorithms in landslide susceptibility zonation has recently become an active research topic. The main purpose of this study is to investigate the possible application of kernel-based Gaussian process regression (GPR) and support vector regression (SVR) for producing a landslide susceptibility map of the Tonya district of Trabzon, Turkey. Results of these two regression methods were compared with the logistic regression (LR) method, which is regarded as a benchmark method. Results showed that while the kernel-based GPR and SVR methods generally produced similar accuracies (90.46% and 90.37%, respectively), they outperformed the conventional LR method by about 18%. While confirming the superiority of the GPR method, statistical tests based on ROC statistics, success rate and prediction rate curves revealed the significant improvement in susceptibility map accuracy achieved by applying the kernel-based GPR and SVR methods.
NASA Astrophysics Data System (ADS)
Saro, Lee; Woo, Jeon Seong; Kwan-Young, Oh; Moung-Jin, Lee
2016-02-01
The aim of this study is to predict landslide susceptibility using spatial analysis and a statistical methodology based on GIS. Logistic regression and artificial neural network models were applied and validated to analyze landslide susceptibility in Inje, Korea. Landslide occurrence areas in the study were identified based on interpretations of optical remote sensing data (aerial photographs) followed by field surveys. A spatial database considering forest, geophysical, soil and topographic data was built for the study area using a geographical information system (GIS). These factors were analysed using artificial neural network (ANN) and logistic regression models to generate a landslide susceptibility map. The study validates the landslide susceptibility map by comparing it with landslide occurrence areas. The locations of landslide occurrence were divided randomly into a training set (50%) and a test set (50%). The training set was used to build the landslide susceptibility map with the artificial neural network and logistic regression models, and the test set was retained to validate the prediction map. The validation results revealed that the artificial neural network model (with an accuracy of 80.10%) was better at predicting landslides than the logistic regression model (with an accuracy of 77.05%). Of the weights used in the artificial neural network model, 'slope' yielded the highest weight value (1.330), and 'aspect' yielded the lowest value (1.000). This research applied two statistical analysis methods in a GIS and compared their results. Based on the findings, we were able to derive a more effective method for analyzing landslide susceptibility.
Ren, Yilong; Wang, Yunpeng; Wu, Xinkai; Yu, Guizhen; Ding, Chuan
2016-10-01
Red light running (RLR) has become a major safety concern at signalized intersections. To prevent RLR-related crashes, it is critical to identify the factors that significantly affect drivers' RLR behavior, and to predict potential RLR in real time. In this research, nine months of RLR events extracted from high-resolution traffic data collected by loop detectors at three signalized intersections were used to identify the factors that significantly affect RLR behavior. The data analysis indicated that occupancy time, time gap, used yellow time, time left to yellow start, whether the preceding vehicle runs through the intersection during yellow, and whether there is a vehicle passing through the intersection on the adjacent lane were significant factors for RLR behavior. Furthermore, due to the rare-events nature of RLR, a modified rare events logistic regression model was developed for RLR prediction. The rare events logistic regression method has been applied in many fields and shows impressive performance, but no previous research has applied this method to study RLR. The results showed that the rare events logistic regression model performed significantly better than the standard logistic regression model. More importantly, the proposed RLR prediction method is purely based on loop detector data collected from a single advance loop detector located 400 feet away from the stop bar. This brings great potential for future field applications of the proposed method since loops have been widely implemented at many intersections and can collect data in real time. This research is expected to contribute significantly to the improvement of intersection safety. PMID:27472815
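A generic way to handle such rare-events imbalance is to reweight the classes during fitting. The sketch below uses scikit-learn's class weighting, which is a standard imbalance correction and not the paper's specific modified rare-events estimator; the detector variables and event-generating rule are synthetic:

```python
# Hedged sketch: class-weighted logistic regression for a rare binary outcome
# (red-light running), with hypothetical loop-detector predictors.
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(4)
n = 2000
time_gap = rng.uniform(0.5, 6.0, n)    # seconds to the preceding vehicle
occupancy = rng.uniform(0.0, 1.0, n)   # detector occupancy (illustrative scale)
# Synthetic rule: short gaps and high occupancy raise (still small) RLR odds
eta = -3.5 - 0.8 * time_gap + 2.0 * occupancy
rlr = rng.binomial(1, 1 / (1 + np.exp(-eta)))  # rare positive class

# "balanced" reweights each class inversely to its frequency, so the few
# RLR events are not swamped by the overwhelming majority of non-events.
model = LogisticRegression(class_weight="balanced").fit(
    np.column_stack([time_gap, occupancy]), rlr)
print(f"event rate: {rlr.mean():.3f}")
```

Reweighting changes the intercept calibration, so predicted probabilities need recalibration if absolute risk estimates are required; dedicated rare-events corrections (e.g., King and Zeng's estimator) address this directly.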
2012-01-01
Background Identifying risk factors for Salmonella Enteritidis (SE) infections in Ontario will assist public health authorities to design effective control and prevention programs to reduce the burden of SE infections. Our research objective was to identify risk factors for acquiring SE infections with various phage types (PT) in Ontario, Canada. We hypothesized that certain PTs (e.g., PT8 and PT13a) have specific risk factors for infection. Methods Our study included endemic SE cases with various PTs whose isolates were submitted to the Public Health Laboratory-Toronto from January 20th to August 12th, 2011. Cases were interviewed using a standardized questionnaire that included questions pertaining to demographics, travel history, clinical symptoms, contact with animals, and food exposures. A multinomial logistic regression method using the Generalized Linear Latent and Mixed Model procedure and a case-case study design were used to identify risk factors for acquiring SE infections with various PTs in Ontario, Canada. In the multinomial logistic regression model, the outcome variable had three categories representing human infections caused by SE PT8, PT13a, and all other SE PTs (i.e., non-PT8/non-PT13a) as a referent category to which the other two categories were compared. Results In the multivariable model, SE PT8 was positively associated with contact with dogs (OR=2.17, 95% CI 1.01-4.68) and negatively associated with pepper consumption (OR=0.35, 95% CI 0.13-0.94), after adjusting for age categories and gender, and using exposure periods and health regions as random effects to account for clustering. Conclusions Our study findings offer interesting hypotheses about the role of phage type-specific risk factors. Multinomial logistic regression analysis and the case-case study approach are novel methodologies to evaluate associations among SE infections with different PTs and various risk factors. PMID:23057531
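The multinomial logistic regression setup with a referent outcome category, as used in the case-case analysis above, can be sketched in a few lines. This is a plain softmax fit with the referent class's coefficients pinned at zero; it is illustrative only and omits the study's random effects and GLLAMM machinery.

```python
import numpy as np

def fit_multinomial(X, y, n_classes, lr=0.5, iters=6000):
    """Multinomial logistic regression with class 0 as the referent:
    its coefficient row is pinned at zero, so each other row gives
    log-odds (exp -> odds ratios) relative to the referent category."""
    n, p = X.shape
    Xb = np.hstack([np.ones((n, 1)), X])      # add intercept column
    B = np.zeros((n_classes, p + 1))          # row 0 stays zero (referent)
    Y = np.eye(n_classes)[y]                  # one-hot outcomes
    for _ in range(iters):
        P = np.exp(Xb @ B.T)
        P /= P.sum(axis=1, keepdims=True)     # softmax probabilities
        grad = (Y - P).T @ Xb / n             # log-likelihood gradient
        grad[0] = 0.0                         # keep referent pinned
        B += lr * grad
    return B
```

Odds ratios relative to the referent are then `np.exp(B[1:, 1:])`, mirroring how the PT8 and PT13a associations above are reported against the non-PT8/non-PT13a category.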
Teshnizi, Saeed Hosseini; Ayatollahi, Sayyed Mohhamad Taghi
2015-01-01
Background and objective: Artificial Neural Networks (ANNs) have recently been applied in situations where an analysis based on logistic regression (LR) is the standard statistical approach; direct comparisons of the results, however, are seldom attempted. In this study, we compared logistic regression models and feed-forward neural networks on an academic failure data set. Methods: The data for this study comprised 18 questions about the study situation of 275 undergraduate students selected randomly from the nursing and midwifery and paramedic schools of Hormozgan University of Medical Sciences in 2013. Logistic regression with forward selection and a feed-forward Artificial Neural Network with 15 neurons in the hidden layer were fitted to the dataset. The accuracy of the models in predicting academic failure was compared using ROC (Receiver Operating Characteristic) analysis and classification accuracy. Results: Among nine ANNs, the ANN with 15 neurons in the hidden layer performed best. The Area Under the Receiver Operating Characteristic curve (AUROC) of the LR model and of the ANN with 15 neurons in the hidden layer were estimated as 0.55 and 0.89, respectively, and the ANN's was significantly greater than the LR's. The LR and ANN models correctly classified 77.5% and 84.3% of the students, respectively. Conclusion: Based on this dataset, it seems the classification of the students into groups with and without academic failure using an ANN with 15 neurons in the hidden layer is better than the LR model. PMID:26635438
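The AUROC comparison above rests on the rank (Mann-Whitney) interpretation of the area under the ROC curve: the probability that a randomly chosen positive scores higher than a randomly chosen negative, with ties counted as 1/2. A minimal implementation:

```python
def auroc(labels, scores):
    """Area under the ROC curve via the rank (Mann-Whitney) formulation:
    fraction of positive/negative pairs ranked correctly, ties count 1/2."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))
```

Perfect separation gives 1.0, a reversed ranking 0.0, and an uninformative scorer 0.5, which is why the LR's 0.55 above is barely better than chance while the ANN's 0.89 is substantially informative.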
NASA Astrophysics Data System (ADS)
Schaeben, Helmut; Semmler, Georg
2016-09-01
The objective of prospectivity modeling is prediction of the conditional probability of the presence T = 1 or absence T = 0 of a target T given favorable or prohibitive predictors B, or construction of a two-class {0,1} classification of T. A special case of logistic regression called weights-of-evidence (WofE) is geologists' favorite method of prospectivity modeling due to its apparent simplicity. However, the numerical simplicity is deceiving, as it is implied by the severe mathematical modeling assumption of joint conditional independence of all predictors given the target. General weights of evidence are explicitly introduced which are as simple to estimate as conventional weights, i.e., by counting, but do not require conditional independence. Complementary to the regression view is the classification view on prospectivity modeling. Boosting is the construction of a strong classifier from a set of weak classifiers. From the regression point of view it is closely related to logistic regression. Boost weights-of-evidence (BoostWofE) was introduced into prospectivity modeling to counterbalance violations of the assumption of conditional independence, even though relaxation of modeling assumptions with respect to weak classifiers was not the (initial) purpose of boosting. In the original publication of BoostWofE a fabricated dataset was used to "validate" this approach. Using the same fabricated dataset it is shown that BoostWofE cannot generally compensate for a lack of conditional independence, whatever the consecutive processing order of predictors. Thus the alleged features of BoostWofE are disproved by way of counterexamples, while theoretical findings are confirmed that logistic regression including interaction terms can exactly compensate for violations of joint conditional independence if the predictors are indicators.
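Conventional WofE weights are indeed estimated by counting. A sketch from a 2x2 predictor-by-target count table (the cell naming is ours): W+ is the log ratio of the predictor's presence rate among targets versus non-targets, W- the same for its absence, and the contrast C = W+ - W- summarizes the predictor's strength.

```python
import math

def weights_of_evidence(n11, n10, n01, n00):
    """Conventional WofE weights from a 2x2 count table:
    n11 = predictor present & target present, n10 = present & absent,
    n01 = absent & present,                   n00 = absent & absent.
    W+ = ln[P(B|T) / P(B|~T)],  W- = ln[P(~B|T) / P(~B|~T)]."""
    w_plus = math.log((n11 / (n11 + n01)) / (n10 / (n10 + n00)))
    w_minus = math.log((n01 / (n11 + n01)) / (n00 / (n10 + n00)))
    contrast = w_plus - w_minus
    return w_plus, w_minus, contrast
```

A favorable predictor yields W+ > 0 and W- < 0; summing weights over predictors to get a posterior log-odds is exactly where the joint conditional independence assumption criticized above enters.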
Procedures for adjusting regional regression models of urban-runoff quality using local data
Hoos, A.B.; Sisolak, J.K.
1993-01-01
Statistical operations termed model-adjustment procedures (MAPs) can be used to incorporate local data into existing regression models to improve the prediction of urban-runoff quality. Each MAP is a form of regression analysis in which the local data base is used as a calibration data set. Regression coefficients are determined from the local data base, and the resulting 'adjusted' regression models can then be used to predict storm-runoff quality at unmonitored sites. The response variable in the regression analyses is the observed load or mean concentration of a constituent in storm runoff for a single storm. The set of explanatory variables used in the regression analyses is different for each MAP, but always includes the predicted value of load or mean concentration from a regional regression model. The four MAPs examined in this study were: single-factor regression against the regional model prediction, P (termed MAP-1F-P); regression against P (termed MAP-R-P); regression against P and additional local variables (termed MAP-R-P+nV); and a weighted combination of P and a local-regression prediction (termed MAP-W). The procedures were tested by means of split-sample analysis, using data from three cities included in the Nationwide Urban Runoff Program: Denver, Colorado; Bellevue, Washington; and Knoxville, Tennessee. The MAP that provided the greatest predictive accuracy for the verification data set differed among the three test data bases and among model types (MAP-W for Denver and Knoxville, MAP-1F-P and MAP-R-P for Bellevue load models, and MAP-R-P+nV for Bellevue concentration models) and, in many cases, was not clearly indicated by the values of standard error of estimate for the calibration data set. A scheme to guide MAP selection, based on exploratory data analysis of the calibration data set, is presented and tested. The MAPs were tested for sensitivity to the size of a calibration data set. As expected, predictive accuracy of all MAPs for
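A heavily simplified sketch of the single-factor adjustment idea: regress the local observations against the regional prediction P with no intercept, so the adjusted model is just a rescaling of P. The actual USGS procedures operate on log-transformed loads with bias corrections, which this illustration omits; the function name is ours.

```python
def map_single_factor(local_obs, regional_pred):
    """Single-factor model adjustment (sketch): least-squares slope of
    local observations on the regional prediction P, through the origin,
    so that  adjusted = slope * P.  slope = sum(P*obs) / sum(P*P)."""
    num = sum(p * o for p, o in zip(regional_pred, local_obs))
    den = sum(p * p for p in regional_pred)
    slope = num / den
    return slope, [slope * p for p in regional_pred]
```

When the regional model systematically over- or under-predicts at the local site by a constant factor, this one-parameter adjustment recovers it exactly; the richer MAPs add intercepts and extra local variables.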
Demenais, F M; Laing, A E; Bonney, G E
1992-01-01
Segregation analysis of discrete traits can be conducted by the classical mixed model and the recently introduced regressive models. The mixed model assumes an underlying liability to the disease, to which a major gene, a multifactorial component, and random environment contribute independently. Affected persons have a liability exceeding a threshold. The regressive logistic models assume that the logarithm of the odds of being affected is a linear function of major genotype effects, the phenotypes of older relatives, and other covariates. A formulation of the regressive models, based on an underlying liability model, has recently been proposed. The regression coefficients on antecedents are expressed in terms of the relevant familial correlations, and a one-to-one correspondence with the parameters of the mixed model can thus be established. Computer simulations are conducted to evaluate the fit of the two formulations of the regressive models to the mixed model on nuclear families. The two forms of the class D regressive model provide a good fit to a generated mixed model, in terms of both hypothesis testing and parameter estimation. The simpler class A regressive model, which assumes that the outcomes of children depend solely on the outcomes of parents, is not robust against a sib-sib correlation exceeding that specified by the model, emphasizing the need to test class A against class D. The studies reported here show that if the true state of nature is that described by the mixed model, then a regressive model will do just as well. Moreover, the regressive models, allowing for more patterns of family dependence, provide a flexible framework for understanding gene-environment interactions in complex diseases. PMID:1487139
NASA Astrophysics Data System (ADS)
Mǎgut, F. L.; Zaharia, S.; Glade, T.; Irimuş, I. A.
2012-04-01
Various methods exist for analyzing spatial landslide susceptibility and classifying the results into susceptibility classes. The prediction of spatial landslide distribution can be performed using a variety of methods based on GIS techniques. The two very common methods of heuristic assessment and logistic regression modelling are employed in this study in order to compare their performance in predicting the spatial distribution of previously mapped landslides for a study area located in Maramureş County, in Northwestern Romania. The first model determines a susceptibility index by combining the heuristic approach with GIS techniques of spatial data analysis. The criteria used for quantifying each susceptibility factor and the expression used to determine the susceptibility index are taken from the Romanian legislation (Governmental Decision 447/2003). This procedure is followed in any Romanian state-ordered study which relies on financial support. The logistic regression model predicts the spatial distribution of landslides by statistically estimating regression coefficients which describe the dependency of previously mapped landslides on different factors. The identified shallow landslides correspond generally to Pannonian marl and Quaternary contractile clay deposits. The study region is located in the Northwestern part of Romania, including the Baia Mare municipality, the capital of Maramureş County. The study focuses on the former piedmontal region situated to the south of the Gutâi volcanic mountains, in the Baia Mare Depression, where most of the landslide activity has been recorded. In addition, a narrow sector of the volcanic mountains which borders the city of Baia Mare to the north has also been included to test the accuracy of the models in different lithologic units. The results of both models indicate a general medium landslide susceptibility of the study area. The more detailed differences will be discussed with respect to the advantages and
Smith, Matthew I.; de Lusignan, Simon; Mullett, David; Correa, Ana; Tickner, Jermaine; Jones, Simon
2016-01-01
Introduction Falls are the leading cause of injury in older people. Reducing falls could reduce financial pressures on health services. We carried out this research to develop a falls risk model, using routine primary care and hospital data to identify those at risk of falls, and applied a cost analysis to enable commissioners of health services to identify those in whom savings can be made through referral to a falls prevention service. Methods Multilevel logistic regression was performed on routinely collected general practice and hospital data from 74,751 people aged over 65, to produce a risk model for falls. Validation measures were carried out. A cost analysis was performed to identify at which level of risk it would be cost-effective to refer patients to a falls prevention service. 95% confidence intervals were calculated using a Monte Carlo Model (MCM), allowing us to adjust for uncertainty in the estimates of these variables. Results A risk model for falls was produced with an area under the receiver operating characteristic curve of 0.87. The risk cut-off with the highest combination of sensitivity and specificity was at p = 0.07 (sensitivity of 81% and specificity of 78%). The risk cut-off at which savings outweigh costs was p = 0.27 and the risk cut-off with the maximum savings was p = 0.53, which would result in referral of 1.8% and 0.45% of the over-65 population respectively. Above a risk cut-off of p = 0.27, costs do not exceed savings. Conclusions This model is the best performing falls predictive tool developed to date; it has been developed on a large UK city population; can be readily run from routine data; and can be implemented in a way that optimises the use of health service resources. Commissioners of health services should use this model to flag and refer patients at risk to their falls service and save resources. PMID:27448280
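Choosing a risk cut-off by scanning sensitivity and specificity, as in the validation above, can be sketched with Youden's J (sensitivity + specificity) as the criterion. This is illustrative only; the paper's cost analysis additionally weighs referral costs against savings, which shifts the chosen cut-off away from the J-optimal one.

```python
def best_cutoff(labels, risks, cutoffs):
    """Scan candidate risk cut-offs; return (cutoff, J) maximising
    Youden's J = sensitivity + specificity on held-out data."""
    best = None
    for c in cutoffs:
        tp = sum(1 for y, r in zip(labels, risks) if y == 1 and r >= c)
        fn = sum(1 for y, r in zip(labels, risks) if y == 1 and r < c)
        tn = sum(1 for y, r in zip(labels, risks) if y == 0 and r < c)
        fp = sum(1 for y, r in zip(labels, risks) if y == 0 and r >= c)
        sens = tp / (tp + fn)
        spec = tn / (tn + fp)
        if best is None or sens + spec > best[1]:
            best = (c, sens + spec)
    return best
```

The same scan with a cost-minus-savings objective in place of J reproduces the paper's logic of distinct cut-offs for "best discrimination" versus "maximum savings".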
NASA Astrophysics Data System (ADS)
Nong, Yu; Du, Qingyun; Wang, Kun; Miao, Lei; Zhang, Weiwei
2008-10-01
Urban growth modeling, one of the most important aspects of land use and land cover change study, has attracted substantial attention because it helps to explain the mechanisms of land use change and thus informs relevant policy-making. This study applied multinomial logistic regression to model urban growth in Jiayu county of Hubei province, China, to discover the relationship between urban growth and its driving forces, with biophysical and socio-economic factors selected as independent variables. This type of regression is similar to binary logistic regression, but it is more general because the dependent variable is not restricted to two categories, as it was in previous studies. The multinomial model can simulate the process of multiple land use competition between urban land, bare land, cultivated land and orchard land. Taking the urban land use type as the reference category, parameters could be estimated as odds ratios. A probability map is generated from the model to predict where urban growth will occur.
NASA Astrophysics Data System (ADS)
Trigila, Alessandro; Iadanza, Carla; Esposito, Carlo; Scarascia-Mugnozza, Gabriele
2015-11-01
The aim of this work is to define reliable susceptibility models for shallow landslides using Logistic Regression and Random Forests multivariate statistical techniques. The study area, located in North-East Sicily, was hit on October 1st 2009 by a severe rainstorm (225 mm of cumulative rainfall in 7 h) which caused flash floods and more than 1000 landslides. Several small villages, such as Giampilieri, were hit, with 31 fatalities, 6 missing persons and damage to buildings and transportation infrastructure. Landslides, mainly earth and debris translational slides evolving into debris flows, were triggered on steep slopes and involved colluvium and regolith materials which cover the underlying metamorphic bedrock. The work has been carried out in the following steps: i) realization of a detailed event landslide inventory map through field surveys coupled with observation of high resolution aerial colour orthophotos; ii) identification of landslide source areas; iii) data preparation of landslide controlling factors and descriptive statistics based on a bivariate method (Frequency Ratio) to get an initial overview of existing relationships between causative factors and shallow landslide source areas; iv) choice of criteria for the selection and sizing of the mapping unit; v) implementation of 5 multivariate statistical susceptibility models based on Logistic Regression and Random Forests techniques and focused on landslide source areas; vi) evaluation of the influence of sample size and type of sampling on results and performance of the models; vii) evaluation of the predictive capabilities of the models using ROC curves, AUC and contingency tables; viii) comparison of model results and obtained susceptibility maps; and ix) analysis of temporal variation of landslide susceptibility related to input parameter changes. Models based on Logistic Regression and Random Forests have demonstrated excellent predictive capabilities. Land use and wildfire
NASA Technical Reports Server (NTRS)
Smith, Kelly M.; Gay, Robert S.; Stachowiak, Susan J.
2013-01-01
In late 2014, NASA will fly the Orion capsule on a Delta IV-Heavy rocket for the Exploration Flight Test-1 (EFT-1) mission. For EFT-1, the Orion capsule will be flying with a new GPS receiver and new navigation software. Given the experimental nature of the flight, the flight software must be robust to the loss of GPS measurements. Once the high-speed entry is complete, the drogue parachutes must be deployed within the proper conditions to stabilize the vehicle prior to deploying the main parachutes. When GPS is available in nominal operations, the vehicle will deploy the drogue parachutes based on an altitude trigger. However, when GPS is unavailable, the navigated altitude errors become excessively large, driving the need for a backup barometric altimeter to improve altitude knowledge. In order to increase overall robustness, the vehicle also has an alternate method of triggering the parachute deployment sequence based on planet-relative velocity if both the GPS and the barometric altimeter fail. However, this backup trigger results in large altitude errors relative to the targeted altitude. Motivated by this challenge, this paper demonstrates how logistic regression may be employed to semi-automatically generate robust triggers based on statistical analysis. Logistic regression is used as a ground processor pre-flight to develop a statistical classifier. The classifier would then be implemented in flight software and executed in real-time. This technique offers improved performance even in the face of highly inaccurate measurements. Although the logistic regression-based trigger approach will not be implemented within EFT-1 flight software, the methodology can be carried forward for future missions and vehicles.
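The offline-training / cheap-online-evaluation split described above can be illustrated with a plain logistic classifier on synthetic two-feature data. This is entirely illustrative: no Orion data, feature definitions, or flight software interfaces are used, and the function names are ours.

```python
import math

def train_trigger(rows, labels, lr=0.1, iters=5000):
    """Offline ("ground processor") step: fit a two-feature logistic
    classifier by batch gradient ascent on the log-likelihood."""
    w = [0.0, 0.0, 0.0]                       # bias, w1, w2
    n = len(rows)
    for _ in range(iters):
        g = [0.0, 0.0, 0.0]
        for (x1, x2), y in zip(rows, labels):
            p = 1.0 / (1.0 + math.exp(-(w[0] + w[1] * x1 + w[2] * x2)))
            err = y - p
            g[0] += err
            g[1] += err * x1
            g[2] += err * x2
        w = [wi + lr * gi / n for wi, gi in zip(w, g)]
    return w

def trigger(w, x1, x2, threshold=0.5):
    """Online step: the learned rule is a dot product and a sigmoid,
    cheap enough to execute in real-time flight software."""
    p = 1.0 / (1.0 + math.exp(-(w[0] + w[1] * x1 + w[2] * x2)))
    return p >= threshold
```

The design point is that all expensive statistical work happens pre-flight; the deployed trigger is a fixed, deterministic inequality on the navigated states.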
NASA Technical Reports Server (NTRS)
Smith, Kelly M.; Gay, Robert S.; Stachowiak, Susan J.
2013-01-01
In late 2014, NASA will fly the Orion capsule on a Delta IV-Heavy rocket for the Exploration Flight Test-1 (EFT-1) mission. For EFT-1, the Orion capsule will be flying with a new GPS receiver and new navigation software. Given the experimental nature of the flight, the flight software must be robust to the loss of GPS measurements. Once the high-speed entry is complete, the drogue parachutes must be deployed within the proper conditions to stabilize the vehicle prior to deploying the main parachutes. When GPS is available in nominal operations, the vehicle will deploy the drogue parachutes based on an altitude trigger. However, when GPS is unavailable, the navigated altitude errors become excessively large, driving the need for a backup barometric altimeter. In order to increase overall robustness, the vehicle also has an alternate method of triggering the drogue parachute deployment based on planet-relative velocity if both the GPS and the barometric altimeter fail. However, this velocity-based trigger results in large altitude errors relative to the targeted altitude. Motivated by this challenge, this paper demonstrates how logistic regression may be employed to automatically generate robust triggers based on statistical analysis. Logistic regression is used as a ground processor pre-flight to develop a classifier. The classifier would then be implemented in flight software and executed in real-time. This technique offers excellent performance even in the face of highly inaccurate measurements. Although the logistic regression-based trigger approach will not be implemented within EFT-1 flight software, the methodology can be carried forward for future missions and vehicles.
Xu, Jun-Fang; Xu, Jing; Li, Shi-Zhu; Jia, Tia-Wu; Huang, Xi-Bao; Zhang, Hua-Ming; Chen, Mei; Yang, Guo-Jing; Gao, Shu-Jing; Wang, Qing-Yun; Zhou, Xiao-Nong
2013-01-01
Background The transmission of schistosomiasis japonica in a local setting is still poorly understood in the lake regions of the People's Republic of China (P. R. China), and its transmission patterns are closely related to human, social and economic factors. Methodology/Principal Findings We aimed to apply an integrated approach of artificial neural network (ANN) and logistic regression modelling to assess the transmission risks of Schistosoma japonicum with epidemiological data collected from 2339 villagers from 1247 households in six villages of Jiangling County, P.R. China. Using back-propagation (BP) in the ANN model, 16 of 27 factors were screened, and the top five factors ranked by the absolute value of the mean impact value (MIV) were mainly related to human behavior, i.e. integration of water contact history and infection history, family with past infection, history of water contact, infection history, and infection times. The top five factors screened by the logistic regression model were mainly related to social economics, i.e. village level, economic conditions of family, age group, education level, and infection times. The risk of human infection with S. japonicum is higher in those aged 15 or younger, with lower education, from villages with higher infection rates, from poor families, or infected more than once. Conclusion/Significance Both the BP artificial neural network and the logistic regression model, established at a small scale, suggested that individual behavior and socioeconomic status are the most important risk factors in the transmission of schistosomiasis japonica. It was revealed that the young population (≤15) in higher-risk areas should be the main target of interventions for disease transmission control. PMID:23556015
NASA Astrophysics Data System (ADS)
Mărgărint, M. C.; Grozavu, A.; Patriche, C. V.
2013-12-01
In landslide susceptibility assessment, an important issue is the correct identification of significant contributing factors, which leads to improved predictions of this type of geomorphological process. In the scientific literature, different weightings are assigned to these factors, but with large variations. This study aims to identify the spatial variability and range of variation of the coefficients of landslide predictors in different geographical conditions. Four sectors of 15 km × 15 km (225 km2 each) were selected for analysis from representative regions of Romania in terms of spatial extent of landslides, situated both in hilly areas (the Transylvanian Plateau and Moldavian Plateau) and in a lower mountain region (the Subcarpathians). The following factors were taken into consideration: elevation, slope angle, slope height, terrain curvature (mean, plan and profile), distance from drainage network, slope aspect, land use, and lithology. For each sector, a landslide inventory, digital elevation model and thematic layers of the mentioned predictors were produced and integrated in a georeferenced environment. Logistic regression was applied separately to the four study sectors as the statistical method for assessing terrain landsliding susceptibility. Maps of landslide susceptibility were produced, the values of which were classified using the natural breaks (Jenks) method. The accuracy of the logistic regression outcomes was evaluated using the ROC (receiver operating characteristic) curve and the AUC (area under the curve) parameter, which show values between 0.852 and 0.922 for training samples, and between 0.851 and 0.940 for validation samples. The values of the coefficients are generally confined within the limits specified by the scientific literature. In each sector, landslide susceptibility is essentially related to some specific predictors, such as slope angle, land use, slope height, and lithology. The study points out that the
NASA Astrophysics Data System (ADS)
Margarint, M. C.; Grozavu, A.; Patriche, C. V.
2012-04-01
Landslides represent a significant natural hazard in hilly areas of Romania, causing important damage. The scientific interest in landslide susceptibility mapping is quite recent and standardized through legislation. However, the methodology needs improving in order for the susceptibility maps to constitute a sound basis for territorial planning. Logistic regression is one of the main statistical methods used for assessing terrain susceptibility to landsliding. The scientific literature assigns different weights to the landslide causal factors, but with large variations. This study aims to identify the range of variation of landslide causal factor coefficients for different regions in Romania. The following factors were taken into consideration: slope angle, terrain altitude, terrain curvature (mean, plan and profile), soil type, lithologic class, land use, distance from drainage network and roads, and mean annual precipitation. Four square perimeters of 15x15 km were chosen from representative regions in terms of spatial extent of landslides: two situated in the central-northern part of the Moldavian Plateau, one in the Transylvania Depression and one in the Moldavian Subcarpathians. The logistic regression was applied separately for the four sectors. In order to monitor the differences in the final results, numerous attempts were made, starting from landslide polygons acquired from both the topographic maps at scale 1:25,000 (1984-1985 edition) and the orthophotoimages (2005-2006). The other elements were acquired from cartographic materials at appropriate scales, according to the international methodology. The data integration was accomplished in the georeferenced environment provided by the TNTmips 6.9, ArcGIS 9.3 and SAGA 2.0.8 software packages, while the statistical analysis was performed using Excel 2003 and XLSTAT 2010 trial version. Maps for all landslide causal factors were achieved for each perimeter. The logistic
NASA Astrophysics Data System (ADS)
Pradhan, Biswajeet
2010-05-01
This paper presents the results of the cross-validation of a multivariate logistic regression model using remote sensing data and GIS for landslide hazard analysis in the Penang, Cameron, and Selangor areas in Malaysia. Landslide locations in the study areas were identified by interpreting aerial photographs and satellite images, supported by field surveys. SPOT 5 and Landsat TM satellite imagery were used to map land cover and vegetation index, respectively. Maps of topography, soil type, lineaments and land cover were constructed from the spatial datasets. Ten factors which influence landslide occurrence, i.e., slope, aspect, curvature, distance from drainage, lithology, distance from lineaments, soil type, land cover, rainfall precipitation, and normalized difference vegetation index (NDVI), were extracted from the spatial database, and the logistic regression coefficient of each factor was computed. The landslide hazard was then analysed using the multivariate logistic regression coefficients derived not only from the data for the respective area but also using the logistic regression coefficients calculated from each of the other two areas (nine hazard maps in all) as a cross-validation of the model. For verification of the model, the results of the analyses were compared with the field-verified landslide locations. Among the three cases of applying logistic regression coefficients within the same study area, Selangor based on the Selangor coefficients showed the highest accuracy (94%), whereas Penang based on the Penang coefficients showed the lowest accuracy (86%). Similarly, among the six cases of cross-applying logistic regression coefficients to the other two areas, Selangor based on the Cameron coefficients showed the highest prediction accuracy (90%), whereas Penang based on the Selangor coefficients showed the lowest accuracy (79%). Qualitatively, the cross
Perez, Ivan; Chavez, Allison K.; Ponce, Dario
2016-01-01
Background: Ricketts' posteroanterior (PA) cephalometry seems to be the most widely used, and it has not been tested by multivariate statistics for sex determination. Objective: The objective was to determine the applicability of Ricketts' PA cephalometry for sex determination using logistic regression analysis. Materials and Methods: Logistic models were estimated at distinct age cutoffs (all ages, 11 years, 13 years, and 15 years) in a database of 1,296 Hispano-American Peruvians between 5 and 44 years of age. Results: The logistic models comprised six cephalometric measurements; the accuracy achieved by resubstitution varied between 60% and 70%, and all the variables, with one exception, exhibited a direct relationship with the probability of being classified as male; the nasal width exhibited an indirect relationship. Conclusion: The maxillary and facial widths were present in all models and may represent an indicator of sexual dimorphism. The accuracy found was lower than that reported in the literature, and Ricketts' PA cephalometry may not be adequate for sex determination. The indirect relationship of the nasal width in models with data from patients 12 years of age or younger may be a trait related to age or a characteristic of the studied population, which should be further studied and confirmed. PMID:27555732
2013-01-01
Background Genome-wide association studies have become very popular for identifying genetic contributions to phenotypes. Millions of SNPs are being tested for their association with diseases and traits using linear or logistic regression models. This conceptually simple strategy encounters the following computational issues: a large number of tests and very large genotype files (many gigabytes) which cannot be directly loaded into the software memory. One of the solutions applied on a grand scale is cluster computing involving large-scale resources. We show how to speed up the computations using matrix operations in pure R code. Results We improve speed: computation time is reduced from 6 hours to 10-15 minutes. Our approach can handle essentially an unlimited number of covariates efficiently, using projections. Data files in GWAS are vast and reading them into computer memory becomes an important issue. However, much improvement can be made if the data are structured beforehand in a way allowing easy access to blocks of SNPs. We propose several solutions based on the R packages ff and ncdf. We adapted the semi-parallel computations for logistic regression. We show that in a typical GWAS setting, where SNP effects are very small, we do not lose any precision and our computations are a few hundred times faster than standard procedures. Conclusions We provide very fast algorithms for GWAS written in pure R code. We also show how to rearrange SNP data for fast access. PMID:23711206
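Vectorising per-SNP tests with matrix operations, as the authors do in R, can be sketched in NumPy for the covariate-free case via the score test against an intercept-only logistic null: one matrix product replaces a loop of per-SNP model fits. This is our simplified illustration; the paper's semi-parallel approach additionally projects out covariates and handles out-of-memory genotype blocks.

```python
import numpy as np

def gwas_score_stats(G, y):
    """Per-SNP score statistics against an intercept-only logistic null,
    vectorised over a genotype matrix G (n samples x m SNPs):
        U_j = g_j' (y - p0)
        V_j = p0 (1 - p0) * sum_i (g_ij - mean_j)^2
        stat_j = U_j^2 / V_j   (~ chi-square with 1 df under the null)."""
    p0 = y.mean()                      # fitted probability under the null
    r = y - p0                         # null residuals, shared by all SNPs
    U = G.T @ r                        # all m score numerators at once
    Gc = G - G.mean(axis=0)
    V = p0 * (1 - p0) * (Gc ** 2).sum(axis=0)
    return U ** 2 / V
```

Because `r` and `p0` are computed once and reused for every SNP, the cost per SNP is a single dot product, which is the essence of the speed-up claimed above.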
Battaglin, William A.; Ulery, Randy L.; Winterstein, Thomas; Welborn, Toby
2003-01-01
In the State of Texas, surface water (streams, canals, and reservoirs) and ground water are used as sources of public water supply. Surface-water sources of public water supply are susceptible to contamination from point and nonpoint sources. To help protect sources of drinking water and to aid water managers in designing protective yet cost-effective and risk-mitigated monitoring strategies, the Texas Commission on Environmental Quality and the U.S. Geological Survey developed procedures to assess the susceptibility of public water-supply source waters in Texas to the occurrence of 227 contaminants. One component of the assessments is the determination of susceptibility of surface-water sources to nonpoint-source contamination. To accomplish this, water-quality data at 323 monitoring sites were matched with geographic information system-derived watershed-characteristic data for the watersheds upstream from the sites. Logistic regression models then were developed to estimate the probability that a particular contaminant will exceed a threshold concentration specified by the Texas Commission on Environmental Quality. Logistic regression models were developed for 63 of the 227 contaminants. Of the remaining contaminants, 106 were not modeled because monitoring data were available at fewer than 10 percent of the monitoring sites; 29 were not modeled because the contaminant was detected in fewer than 15 percent of the monitoring data; 27 were not modeled because of the lack of any monitoring data; and 2 were not modeled because threshold values were not specified.
Reboussin, Beth A; Preisser, John S; Song, Eun-Young; Wolfson, Mark
2012-07-01
Under-age drinking is an enormous public health issue in the USA. Evidence that community-level structures may impact on under-age drinking has led to a proliferation of efforts to change the environment surrounding the use of alcohol. Although the focus of these efforts is to reduce drinking by individual youths, environmental interventions are typically implemented at the community level, with entire communities randomized to the same intervention condition. A distinct feature of these trials is the tendency of the behaviours of individuals residing in the same community to be more alike than those of individuals residing in different communities, which is herein called 'clustering'. Statistical analyses and sample size calculations must account for this clustering to avoid type I errors and to ensure an appropriately powered trial. Clustering itself may also be of scientific interest. We consider the alternating logistic regressions procedure within the population-averaged modelling framework to estimate the effect of a law enforcement intervention on the prevalence of under-age drinking behaviours while modelling the clustering at multiple levels, e.g. within communities and within neighbourhoods nested within communities, by using pairwise odds ratios. We then derive sample size formulae for estimating intervention effects when planning a post-test-only or repeated cross-sectional community-randomized trial using the alternating logistic regressions procedure. PMID:24347839
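The pairwise odds ratio that alternating logistic regressions uses to quantify clustering can be read off a 2 x 2 table of same-community pairs of youths; the counts below are invented purely for illustration:

```python
def pairwise_odds_ratio(n11, n10, n01, n00):
    # n11: both members of the pair report drinking; n00: neither does;
    # n10 and n01: discordant pairs. OR > 1 indicates that drinking
    # clusters within communities.
    return (n11 * n00) / (n10 * n01)

# Hypothetical pair counts from one community-randomized arm.
or_within = pairwise_odds_ratio(30, 20, 20, 55)
```

An `or_within` well above 1 is exactly the within-community dependence that sample size formulae for community-randomized trials must absorb.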
NASA Astrophysics Data System (ADS)
Ariffin, Syaiba Balqish; Midi, Habshah; Arasan, Jayanthi; Rana, Md Sohel
2015-02-01
This article is concerned with the performance of the maximum estimated likelihood estimator in the presence of separation in the space of the independent variables and high leverage points. The maximum likelihood estimator suffers from the problem of non-overlapping cases in the covariates, where the regression coefficients are not identifiable and the maximum likelihood estimate does not exist. Consequently, the iteration scheme fails to converge and gives faulty results. To remedy this problem, the maximum estimated likelihood estimator is put forward. It is evident that the maximum estimated likelihood estimator is resistant to separation and that its estimates always exist. The effect of high leverage points on the performance of the maximum estimated likelihood estimator is then investigated through real data sets and a Monte Carlo simulation study. The findings signify that the maximum estimated likelihood estimator fails to provide better parameter estimates in the presence of both separation and high leverage points.
2013-01-01
Methods for analysis of network dynamics have seen great progress in the past decade. This article shows how Dynamic Network Logistic Regression techniques (a special case of the Temporal Exponential Random Graph Models) can be used to implement decision theoretic models for network dynamics in a panel data context. We also provide practical heuristics for model building and assessment. We illustrate the power of these techniques by applying them to a dynamic blog network sampled during the 2004 US presidential election cycle. This is a particularly interesting case because it marks the debut of Internet-based media such as blogs and social networking web sites as institutionally recognized features of the American political landscape. Using a longitudinal sample of all Democratic National Convention/Republican National Convention–designated blog citation networks, we are able to test the influence of various strategic, institutional, and balance-theoretic mechanisms as well as exogenous factors such as seasonality and political events on the propensity of blogs to cite one another over time. Using a combination of deviance-based model selection criteria and simulation-based model adequacy tests, we identify the combination of processes that best characterizes the choice behavior of the contending blogs. PMID:24143060
ERIC Educational Resources Information Center
Osborne, Jason W.
2012-01-01
Logistic regression is slowly gaining acceptance in the social sciences, and fills an important niche in the researcher's toolkit: being able to predict important outcomes that are not continuous in nature. While OLS regression is a valuable tool, it cannot routinely be used to predict outcomes that are binary or categorical in nature. These…
NASA Astrophysics Data System (ADS)
Angillieri, María Yanina Esper
2010-01-01
This study employs statistical modeling techniques and geomorphological mapping to analyze the distribution of active rock glaciers in relation to altitude, aspect, slope, lithology and solar radiation, using optical remote sensing techniques with GIS. The study area includes a portion of the Dry Andes of the Cordillera Frontal of San Juan around 30°S latitude, where few geomorphological studies have been conducted. Over 155 rock glaciers have been identified, of which 85 are considered active. The relationship between the variables and the distribution of rock glaciers was analyzed using the frequency ratio method and logistic regression models. The analytical results show that elevations > 3824 m a.s.l., a south-facing or east-facing aspect, areas with relatively low solar radiation, and slopes between 2° and 20° favor the existence of rock glaciers, and demonstrate that lithology and slope exert major influences.
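The frequency ratio method used here compares each terrain class's share of rock glacier sites with its share of the study area; FR > 1 marks a class favourable to rock glaciers. A sketch with invented numbers (not the study's data):

```python
def frequency_ratio(sites_in_class, sites_total, area_in_class, area_total):
    # FR = (share of sites in the class) / (share of area in the class).
    # FR > 1: the class is over-represented among rock glacier sites.
    return (sites_in_class / sites_total) / (area_in_class / area_total)

# Hypothetical: 40 of 85 active rock glaciers sit on south-facing slopes
# that cover only 20% of the study area.
fr_south = frequency_ratio(40, 85, 20.0, 100.0)
```

Class-wise FR values can then be summed per grid cell, or fed alongside the logistic regression as a comparison model.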
2015-01-01
This paper considers the problem of estimation and variable selection for large high-dimensional data (high number of predictors p and large sample size N, without excluding the possibility that N < p) resulting from an individually matched case-control study. We develop a simple algorithm for the adaptation of the Lasso and related methods to the conditional logistic regression model. Our proposal relies on the simplification of the calculations involved in the likelihood function. Then, the proposed algorithm iteratively solves reweighted Lasso problems using cyclical coordinate descent, computed along a regularization path. This method can handle large problems and deal with sparse features efficiently. We discuss benefits and drawbacks with respect to the existing available implementations. We also illustrate the interest and use of these techniques on a pharmacoepidemiological study of medication use and traffic safety. PMID:25916593
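The likelihood simplification the authors exploit is easiest to see in the 1:1 matched case: the conditional logistic likelihood for a case-control pair depends only on the within-pair covariate difference, so the model reduces to an intercept-free logistic regression on those differences. A toy gradient-ascent fit (made-up data, single covariate, no Lasso penalty) illustrating the reduction:

```python
import math

def fit_matched_pairs(diffs, steps=5000, lr=0.5):
    # Each 1:1 pair contributes sigmoid(beta . (x_case - x_control)) to the
    # conditional likelihood, i.e. logistic regression on differences with
    # outcome fixed at 1 and no intercept.
    beta = [0.0] * len(diffs[0])
    for _ in range(steps):
        grad = [0.0] * len(beta)
        for d in diffs:
            eta = sum(b * dj for b, dj in zip(beta, d))
            p = 1.0 / (1.0 + math.exp(-eta))
            for j, dj in enumerate(d):
                grad[j] += (1.0 - p) * dj  # d(log-likelihood)/d(beta_j)
        beta = [b + lr * g / len(diffs) for b, g in zip(beta, grad)]
    return beta

# Invented exposure differences (case minus matched control); mostly
# positive, so the fitted log-odds-ratio should be positive.
beta_hat = fit_matched_pairs([[1.0], [0.5], [-0.2], [0.8], [0.3], [-0.1]])
```

The paper's algorithm applies the same reduction and then solves reweighted Lasso problems on the resulting working data by cyclical coordinate descent.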
Jostins, Luke; McVean, Gilean
2016-01-01
Motivation: For many classes of disease the same genetic risk variants underlie many related phenotypes or disease subtypes. Multinomial logistic regression provides an attractive framework to analyze multi-category phenotypes, and explore the genetic relationships between these phenotype categories. We introduce Trinculo, a program that implements a wide range of multinomial analyses in a single fast package that is designed to be easy to use by users of standard genome-wide association study software. Availability and implementation: An open source C implementation, with code and binaries for Linux and Mac OSX, is available for download at http://sourceforge.net/projects/trinculo Supplementary information: Supplementary data are available at Bioinformatics online. Contact: lj4@well.ox.ac.uk PMID:26873930
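Multinomial logistic regression with a reference category, the framework Trinculo implements, turns per-category linear predictors into phenotype-category probabilities via a softmax. A small sketch (coefficients are invented, not from any fitted model):

```python
import math

def multinomial_probs(x, betas):
    # Reference-category parameterisation: category 0 has linear predictor 0;
    # each remaining category k has its own (intercept, coefficients) tuple.
    etas = [0.0] + [b0 + sum(bj * xj for bj, xj in zip(b, x))
                    for b0, *b in betas]
    m = max(etas)
    exps = [math.exp(e - m) for e in etas]  # subtract max for stability
    s = sum(exps)
    return [e / s for e in exps]

# Hypothetical genotype/covariate vector and coefficients for two
# non-reference disease subtypes.
p = multinomial_probs([1.0, 0.0], [(0.5, 1.0, -0.5), (-1.0, 0.2, 0.3)])
```

Comparing the non-reference coefficient vectors is what lets such models probe whether the same variant drives different disease subtypes.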
NASA Astrophysics Data System (ADS)
Mehrjoo, Saeed; Bashiri, Mahdi
2013-05-01
Production planning and control (PPC) systems have to deal with rising complexity and dynamics. The complexity of planning tasks is due to multiple interacting variables and dynamic factors arising from the uncertainties surrounding PPC. Although the literature on exact scheduling algorithms, simulation approaches, and heuristic methods in production planning is extensive, these methods tend to be inefficient in the face of daily fluctuations in real factories. Decision support systems can provide productive tools for production planners, offering feasible and prompt decisions for effective and robust production planning. In this paper, we propose a robust decision support tool for detailed production planning based on multivariate statistical methods, namely principal component analysis and logistic regression. The proposed approach has been applied to a real case in the Iranian automotive industry. In the presence of multisource uncertainties, the results of applying the proposed method in the selected case show that the accuracy of daily production planning increases in comparison with the existing method.
NASA Astrophysics Data System (ADS)
Notario del Pino, Jesús S.; Ruiz-Gallardo, José-Reyes
2015-03-01
Treatments that minimize soil erosion after large wildfires depend, among other factors, on fire severity and landscape configuration so that, in practice, most of them are applied according to emergency criteria. Therefore, simple tools to predict soil erosion risk help to decide where the available resources should be used first. In this study, a predictive model for soil erosion degree, based on ordinal logistic regression, has been developed and evaluated using data from three large forest fires in South-eastern Spain. The field data were successfully fit to the model in 60% of cases after 50 runs (i.e., agreement between observed and predicted soil erosion degrees), using slope steepness, slope aspect, and fire severity as predictors. North-facing slopes were shown to be less prone to soil erosion than the rest.
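In an ordinal (proportional odds) logistic model like the one used here, a single shared coefficient vector and ordered cutpoints yield a probability for each soil erosion degree. A sketch with invented parameters (not the fitted model from the study):

```python
import math

def ordinal_probs(x, beta, cutpoints):
    # Proportional-odds model: P(Y <= k) = sigmoid(alpha_k - x . beta),
    # with one shared beta and ordered cutpoints alpha_1 < ... < alpha_{K-1}.
    eta = sum(b * xi for b, xi in zip(beta, x))
    cum = [1.0 / (1.0 + math.exp(-(a - eta))) for a in cutpoints] + [1.0]
    return [cum[0]] + [cum[k] - cum[k - 1] for k in range(1, len(cum))]

# Hypothetical predictors (standardised slope steepness, fire severity),
# coefficients, and cutpoints for three erosion degrees (low/medium/high).
p = ordinal_probs([1.0, 1.0], [0.8, 0.6], [-0.5, 1.0])
```

With positive coefficients, steeper or more severely burnt cells shift probability mass toward the higher erosion categories, matching the qualitative pattern the study reports.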
Gysemans, K P M; Bernaerts, K; Vermeulen, A; Geeraerd, A H; Debevere, J; Devlieghere, F; Van Impe, J F
2007-03-20
Several model types have already been developed to describe the boundary between growth and no growth conditions. In this article two types were thoroughly studied and compared, namely (i) the ordinary (linear) logistic regression model, i.e., with a polynomial on the right-hand side of the model equation (type I) and (ii) the (nonlinear) logistic regression model derived from a square root-type kinetic model (type II). The examination was carried out on the basis of the data described in Vermeulen et al. [Vermeulen, A., Gysemans, K.P.M., Bernaerts, K., Geeraerd, A.H., Van Impe, J.F., Debevere, J., Devlieghere, F., 2006-this issue. Influence of pH, water activity and acetic acid concentration on Listeria monocytogenes at 7 degrees C: data collection for the development of a growth/no growth model. International Journal of Food Microbiology.]. These data sets consist of growth/no growth data for Listeria monocytogenes as a function of water activity (0.960-0.990), pH (5.0-6.0) and acetic acid percentage (0-0.8% (w/w)), both for a monoculture and a mixed strain culture. Numerous replicates, namely twenty, were performed at closely spaced conditions. In this way detailed information was obtained about the position of the interface and the transition zone between growth and no growth. The main questions investigated were (i) which model type performs best on the monoculture and the mixed strain data, (ii) are there differences between the growth/no growth interfaces of monocultures and mixed strain cultures, (iii) which parameter estimation approach works best for the type II models, and (iv) how sensitive is the performance of these models to the values of their nonlinear-appearing parameters. The results showed that both type I and II models performed well on the monoculture data with respect to goodness-of-fit and predictive power. The type I models were, however, more sensitive to anomalous data points. The situation was different for the mixed strain culture. In
NASA Astrophysics Data System (ADS)
Espino, Natalia V.
Foreign Object Debris/Damage (FOD) is a costly, high-risk problem that aeronautics companies such as Boeing and Lockheed Martin face on their production lines every day, spending an average of $350,000 per year fixing FOD problems. FOD can put the lives of pilots, passengers, and crews at high risk. FOD refers to any type of foreign object, particle, debris, or agent in the manufacturing environment that could contaminate or damage the product or otherwise undermine quality-control standards. FOD can take the form of any of the following categories: panstock, manufacturing debris, tools/shop aids, consumables, and trash. Although aeronautics companies have put many prevention plans in place, such as housekeeping and "clean as you go" philosophies, training, and the use of RFID for tooling control, none of them has completely eradicated the problem. This research presents a logistic regression statistical modeling approach to predict the probability of each FOD type under specific circumstances such as the workstation, month, and aircraft/jet being built. FOD Quality Assurance Reports from the last three years were provided by an aeronautics company for this study. By predicting the type of FOD, custom reduction/elimination plans can be put in place to diminish the problem. Different aircraft were analyzed, and different models were developed using the same methodology. The results of the study are predictions of FOD type for each aircraft and workstation throughout the year, obtained by applying the proposed logistic regression models. This research should help aeronautics companies address the FOD problem correctly, identify root causes, and establish effective reduction/elimination plans.
Bejaei, M; Wiseman, K; Cheng, K M
2015-01-01
Consumers' interest in specialty eggs appears to be growing in Europe and North America. The objective of this research was to develop logistic regression models that utilise purchaser attributes and demographics to predict the probability of a consumer purchasing a specific type of table egg, including regular (white and brown), non-caged (free-run, free-range and organic) or nutrient-enhanced eggs. These purchase prediction models, together with the purchasers' attributes, can be used to assess market opportunities for different egg types, specifically in British Columbia (BC). An online survey was used to gather data for the models. A total of 702 completed questionnaires were submitted by BC residents. Selected independent variables were included in the logistic regressions to develop models for different egg types, predicting the probability of a consumer purchasing a specific type of table egg. The variables used in the models accounted for 54% and 49% of the variance in the purchase of regular and non-caged eggs, respectively. Research results indicate that consumers of different egg types exhibit a set of unique and statistically significant characteristics and/or demographics. For example, consumers of regular eggs were less educated, older, price sensitive, major chain store buyers, and store flyer users, and had lower awareness of different types of eggs and less concern regarding animal welfare issues. However, most of the non-caged egg consumers were less concerned about price, had higher awareness of different types of table eggs, purchased their eggs from local/organic grocery stores, farm gates or farmers' markets, and were more concerned about the care and feeding of hens compared with consumers of other egg types. PMID:26103791
Calef, M.P.; McGuire, A.D.; Epstein, H.E.; Rupp, T.S.; Shugart, H.H.
2005-01-01
Aim: To understand drivers of vegetation type distribution and sensitivity to climate change. Location: Interior Alaska. Methods: A logistic regression model was developed that predicts the potential equilibrium distribution of four major vegetation types: tundra, deciduous forest, black spruce forest and white spruce forest based on elevation, aspect, slope, drainage type, fire interval, average growing season temperature and total growing season precipitation. The model was run in three consecutive steps. The hierarchical logistic regression model was used to evaluate how scenarios of changes in temperature, precipitation and fire interval may influence the distribution of the four major vegetation types found in this region. Results: At the first step, tundra was distinguished from forest, which was mostly driven by elevation, precipitation and south to north aspect. At the second step, forest was separated into deciduous and spruce forest, a distinction that was primarily driven by fire interval and elevation. At the third step, the identification of black vs. white spruce was driven mainly by fire interval and elevation. The model was verified for Interior Alaska, the region used to develop the model, where it predicted vegetation distribution among the steps with an accuracy of 60-83%. When the model was independently validated for north-west Canada, it predicted vegetation distribution among the steps with an accuracy of 53-85%. Black spruce remains the dominant vegetation type under all scenarios, potentially expanding most under warming coupled with increasing fire interval. White spruce is clearly limited by moisture once average growing season temperatures exceed a critical limit (+2 °C). Deciduous forests expand their range the most when any two of the following scenarios are combined: decreasing fire interval, warming and increasing precipitation. Tundra can be replaced by forest under warming but expands under precipitation increase. Main
Veazey, Lindsay M; Franklin, Erik C; Kelley, Christopher; Rooney, John; Frazer, L Neil; Toonen, Robert J
2016-01-01
Predictive habitat suitability models are powerful tools for cost-effective, statistically robust assessment of the environmental drivers of species distributions. The aim of this study was to develop predictive habitat suitability models for two genera of scleractinian corals (Leptoseris and Montipora) found within the mesophotic zone across the main Hawaiian Islands. The mesophotic zone (30-180 m) is challenging to reach, and therefore historically understudied, because it falls between the maximum limit of SCUBA divers and the minimum typical working depth of submersible vehicles. Here, we implement a logistic regression with rare events corrections to account for the scarcity of presence observations within the dataset. These corrections reduced the coefficient error and improved overall prediction success (73.6% and 74.3%) for both original regression models. The final models included depth, rugosity, slope, mean current velocity, and wave height as the best environmental covariates for predicting the occurrence of the two genera in the mesophotic zone. Using an objectively selected theta ("presence") threshold, the predicted presence probability values (average of 0.051 for Leptoseris and 0.040 for Montipora) were translated to spatially-explicit habitat suitability maps of the main Hawaiian Islands at 25 m grid cell resolution. Our maps are the first of their kind to use extant presence and absence data to examine the habitat preferences of these two dominant mesophotic coral genera across Hawai'i. PMID:27441122
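One widely used rare-events correction (in the spirit of King and Zeng's prior correction; the abstract does not give the study's exact formulas, so treat this as an illustration) shifts the fitted intercept so that predicted probabilities reflect the true population presence rate rather than the over-represented sample rate:

```python
import math

def prior_corrected_intercept(b0, sample_rate, true_rate):
    # Shift the intercept of a logistic model fitted on a sample in which
    # presences make up sample_rate, so predictions match the (assumed)
    # true presence rate in the population of grid cells.
    return b0 - math.log(((1 - true_rate) / true_rate)
                         * (sample_rate / (1 - sample_rate)))

# Hypothetical: a balanced case-control sample (50% presences) drawn from
# survey cells where only 5% actually contain the coral genus.
b0_corrected = prior_corrected_intercept(0.0, 0.5, 0.05)
p_baseline = 1.0 / (1.0 + math.exp(-b0_corrected))
```

Without the correction, a model trained on the balanced sample would report a baseline presence probability of 0.5; after it, the baseline matches the assumed 5% population rate, consistent with the low average presence probabilities the study reports.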
NASA Astrophysics Data System (ADS)
Madhu, B.; Ashok, N. C.; Balasubramanian, S.
2014-11-01
Multinomial logistic regression analysis was used to develop a statistical model that can predict the probability of breast cancer in Southern Karnataka using breast cancer occurrence data from 2007-2011. Independent socio-economic variables describing breast cancer occurrence, such as age, education, occupation, parity, type of family, health insurance coverage, residential locality and socio-economic status, were obtained for each case. The models were developed as follows: i) spatial visualization of the urban-rural distribution of breast cancer cases obtained from the Bharat Hospital and Institute of Oncology; ii) socio-economic risk factors describing the breast cancer occurrences were compiled for each case, these data were then analysed using multinomial logistic regression in the SPSS statistical software, and relations between the occurrence of breast cancer across socio-economic status and the influence of other socio-economic variables were evaluated and multinomial logistic regression models constructed; iii) the model that best predicted the occurrence of breast cancer was identified. This multivariate logistic regression model was entered into a geographic information system, and maps showing the predicted probability of breast cancer occurrence in Southern Karnataka were created. This study demonstrates that multinomial logistic regression is a valuable tool for developing models that predict the probability of breast cancer occurrence in Southern Karnataka.
Narushima, Daichi; Kawasaki, Yohei; Takamatsu, Shoji
2016-01-01
Background: Spontaneous Reporting Systems (SRSs) are passive systems composed of reports of suspected Adverse Drug Events (ADEs), and are used for Pharmacovigilance (PhV), namely, drug safety surveillance. Exploration of analytical methodologies to enhance SRS-based discovery will contribute to more effective PhV. In this study, we proposed a statistical modeling approach for SRS data to address heterogeneity by a reporting time point. Furthermore, we applied this approach to analyze ADEs of incretin-based drugs such as DPP-4 inhibitors and GLP-1 receptor agonists, which are widely used to treat type 2 diabetes. Methods: SRS data were obtained from the Japanese Adverse Drug Event Report (JADER) database. Reported adverse events were classified according to the MedDRA High Level Terms (HLTs). A mixed effects logistic regression model was used to analyze the occurrence of each HLT. The model treated DPP-4 inhibitors, GLP-1 receptor agonists, hypoglycemic drugs, concomitant suspected drugs, age, and sex as fixed effects, while the quarterly period of reporting was treated as a random effect. Before application of the model, Fisher’s exact tests were performed for all drug-HLT combinations. Mixed effects logistic regressions were performed for the HLTs that were found to be associated with incretin-based drugs. Statistical significance was determined by a two-sided p-value <0.01 or a 99% two-sided confidence interval. Finally, the models with and without the random effect were compared based on Akaike’s Information Criteria (AIC), in which a model with a smaller AIC was considered satisfactory. Results: The analysis included 187,181 cases reported from January 2010 to March 2015. It showed that 33 HLTs, including pancreatic, gastrointestinal, and cholecystic events, were significantly associated with DPP-4 inhibitors or GLP-1 receptor agonists. In the AIC comparison, half of the HLTs reported with incretin-based drugs favored the random effect, whereas HLTs
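The AIC comparison used to judge the reporting-quarter random effect is simple to reproduce; with invented log-likelihoods, the mixed model (one extra parameter for the random-intercept variance) is preferred whenever its AIC is smaller despite the added complexity:

```python
def aic(log_likelihood, n_params):
    # Akaike's Information Criterion: 2k - 2 ln L; smaller is better.
    return 2 * n_params - 2 * log_likelihood

# Hypothetical fits for one HLT: the random-effect model adds one
# parameter but achieves a higher log-likelihood.
aic_fixed = aic(-1524.3, 6)   # fixed effects only
aic_mixed = aic(-1510.8, 7)   # plus random intercept for reporting quarter
favours_random_effect = aic_mixed < aic_fixed
```

Repeating this comparison per HLT is how the study decides, event by event, whether heterogeneity across reporting quarters is worth modelling.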
Jen, Min-Hua; Bottle, Alex; Kirkwood, Graham; Johnston, Ron; Aylin, Paul
2011-09-01
We have previously described a system for monitoring a number of healthcare outcomes using case-mix adjustment models. It is desirable to automate the model fitting process in such a system if monitoring covers a large number of outcome measures or subgroup analyses. Our aim was to compare the performance of three different variable selection strategies: "manual", "automated" backward elimination and re-categorisation, and including all variables at once, irrespective of their apparent importance, with automated re-categorisation. Logistic regression models for predicting in-hospital mortality and emergency readmission within 28 days were fitted to an administrative database for 78 diagnosis groups and 126 procedures from 1996 to 2006 for National Health Services hospital trusts in England. The performance of models was assessed with Receiver Operating Characteristic (ROC) c statistics, (measuring discrimination) and Brier score (assessing the average of the predictive accuracy). Overall, discrimination was similar for diagnoses and procedures and consistently better for mortality than for emergency readmission. Brier scores were generally low overall (showing higher accuracy) and were lower for procedures than diagnoses, with a few exceptions for emergency readmission within 28 days. Among the three variable selection strategies, the automated procedure had similar performance to the manual method in almost all cases except low-risk groups with few outcome events. For the rapid generation of multiple case-mix models we suggest applying automated modelling to reduce the time required, in particular when examining different outcomes of large numbers of procedures and diseases in routinely collected administrative health data. PMID:21556848
NASA Astrophysics Data System (ADS)
Song, Sutao; Chen, Gongxiang; Zhan, Yu; Zhang, Jiacai; Yao, Li
2014-03-01
Recently, sparse algorithms, such as Sparse Multinomial Logistic Regression (SMLR), have been successfully applied to decoding visual information from functional magnetic resonance imaging (fMRI) data, where the contrast of visual stimuli is predicted by a classifier. The contrast classifier combines the brain activities of voxels with sparse weights. For sparse algorithms, the goal is to learn a classifier whose weights are distributed as sparsely as possible by introducing a prior belief about the weights. There are two ways to introduce sparse prior constraints on the weights: Automatic Relevance Determination (ARD-SMLR) and the Laplace prior (LAP-SMLR). In this paper, we present comparison results between the ARD-SMLR and LAP-SMLR models in computational time, classification accuracy and voxel selection. Results showed that, for fMRI data, no significant difference was found in classification accuracy between the two methods when voxels in V1 were chosen as input features (1017 voxels in total). As for computation time, LAP-SMLR was superior to ARD-SMLR; fewer voxels survived for ARD-SMLR than for LAP-SMLR. Using simulation data, we confirmed that the classification performance of the two SMLR models is sensitive to the sparsity of the initial features: when the ratio of relevant features to initial features was larger than 0.01, ARD-SMLR outperformed LAP-SMLR; otherwise, LAP-SMLR outperformed ARD-SMLR. The simulation data also showed that ARD-SMLR was more efficient in selecting relevant features.
Swartz, Michael D.; Yu, Robert K.; Shete, Sanjay
2011-01-01
When modeling the risk of a disease, the very act of selecting the factors to include can heavily impact the results. This study compares the performance of several variable selection techniques applied to logistic regression. We performed realistic simulation studies to compare five methods of variable selection: (1) a confidence interval approach for significant coefficients (CI), (2) backward selection, (3) forward selection, (4) stepwise selection, and (5) Bayesian stochastic search variable selection (SSVS) using both informed and uniformed priors. We defined our simulated diseases mimicking odds ratios for cancer risk found in the literature for environmental factors, such as smoking; dietary risk factors, such as fiber; genetic risk factors such as XPD; and interactions. We modeled the distribution of our covariates, including correlation, after the reported empirical distributions of these risk factors. We also used a null data set to calibrate the priors of the Bayesian method and evaluate its sensitivity. Of the standard methods (95% CI, backward, forward and stepwise selection) the CI approach resulted in the highest average percent of correct associations and lowest average percent of incorrect associations. SSVS with an informed prior had higher average percent of correct associations and lower average percent of incorrect associations than did the CI approach. This study shows that Bayesian methods offer a way to use prior information to both increase power and decrease false-positive results when selecting factors to model complex disease risk. PMID:18937224
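The confidence-interval approach that performed best among the standard methods amounts to keeping a factor whenever the Wald CI for its coefficient excludes zero. A sketch with invented estimates and standard errors (the factor names echo the simulation's risk-factor types, not real fitted values):

```python
def ci_select(estimates, std_errors, names, z=1.96):
    # Keep a factor when its 95% Wald confidence interval for the
    # log-odds coefficient excludes zero.
    kept = []
    for b, se, name in zip(estimates, std_errors, names):
        lo, hi = b - z * se, b + z * se
        if lo > 0 or hi < 0:
            kept.append(name)
    return kept

# Hypothetical coefficients: a positive smoking effect, a negligible
# fiber effect, and a protective XPD genotype effect.
kept = ci_select([0.85, 0.10, -0.60],
                 [0.30, 0.20, 0.25],
                 ["smoking", "fiber", "XPD"])
```

The Bayesian SSVS alternative replaces this hard screen with posterior inclusion probabilities, which is how informative priors can raise power while holding down false positives.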
NASA Astrophysics Data System (ADS)
Wang, Ying; Song, Chongzhen; Lin, Qigen; Li, Juan
2016-04-01
The Newmark displacement model has been used to predict earthquake-triggered landslides. Logistic regression (LR) is also a common landslide hazard assessment method. We combined the Newmark displacement model and LR and applied them to Wenchuan County and Beichuan County in China, which were affected by the Ms. 8.0 Wenchuan earthquake on May 12th, 2008, to develop a mechanism-based landslide occurrence probability model and improve the predictive accuracy. A total of 1904 landslide sites in Wenchuan County and 3800 random non-landslide sites were selected as the training dataset. We applied the Newmark model and obtained the distribution of permanent displacement (Dn) for a 30 × 30 m grid. Four factors (Dn, topographic relief, and distances to drainages and roads) were used as independent variables for LR. Then, a combined model was obtained, with an AUC (area under the curve) value of 0.797 for Wenchuan County. A total of 617 landslide sites and non-landslide sites in Beichuan County were used as a validation dataset with AUC = 0.753. The proposed method may also be applied to earthquake-induced landslides in other regions.
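The AUC values reported (0.797 for training, 0.753 for validation) have a direct probabilistic reading: the chance that a randomly chosen landslide cell receives a higher predicted probability than a randomly chosen non-landslide cell. A brute-force sketch on made-up model scores:

```python
def auc(scores_pos, scores_neg):
    # Probability that a random landslide cell outscores a random
    # non-landslide cell; ties count one half (equivalent to the
    # Mann-Whitney U formulation of AUC).
    wins = 0.0
    for p in scores_pos:
        for n in scores_neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(scores_pos) * len(scores_neg))

# Invented predicted probabilities for landslide and non-landslide cells.
example_auc = auc([0.9, 0.8, 0.6, 0.55], [0.7, 0.5, 0.4, 0.3])
```

For real grids the quadratic loop is replaced by a rank-based computation, but the interpretation of the statistic is identical.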
NASA Astrophysics Data System (ADS)
Xiao, Rui; Su, Shiliang; Mai, Gengchen; Zhang, Zhonghao; Yang, Chenxue
2015-02-01
Cash crop expansion has been a major land use change in tropical and subtropical regions worldwide. Quantifying the determinants of cash crop expansion should provide deeper spatial insights into the dynamics and ecological consequences of cash crop expansion. This paper investigated the process of cash crop expansion in Hangzhou region (China) from 1985 to 2009 using remotely sensed data. The corresponding determinants (neighborhood, physical, and proximity) and their relative effects during three periods (1985-1994, 1994-2003, and 2003-2009) were quantified by logistic regression modeling and variance partitioning. Results showed that the total area of cash crops increased from 58,874.1 ha in 1985 to 90,375.1 ha in 2009, with a net growth of 53.5%. Cash crops were more likely to grow in loam soils. Steep areas at higher elevations were less likely to experience cash crop expansion. A consistently higher probability of cash crop expansion was found in places with abundant farmland and forest cover in the three periods. In addition, distance to rivers and lakes, distance to the county center, and distance to the provincial road were decisive determinants of farmers' choice of cash crop plantation. Different categories of determinants and their combinations exerted different influences on cash crop expansion. The joint effects of neighborhood and proximity determinants were the strongest, and the unique effect of physical determinants decreased with time. Our study contributes to the understanding of the proximate drivers of cash crop expansion in subtropical regions.
Bent, Gardner C.; Steeves, Peter A.
2006-01-01
A revised logistic regression equation and an automated procedure were developed for mapping the probability of a stream flowing perennially in Massachusetts. The equation provides city and town conservation commissions and the Massachusetts Department of Environmental Protection a method for assessing whether streams are intermittent or perennial at a specific site in Massachusetts by estimating the probability of a stream flowing perennially at that site. This information could assist the environmental agencies that administer the Commonwealth of Massachusetts Rivers Protection Act of 1996, which establishes a 200-foot-wide protected riverfront area extending from the mean annual high-water line along each side of a perennial stream, with exceptions for some urban areas. The equation was developed by relating the observed intermittent or perennial status of a stream site to selected basin characteristics of naturally flowing streams (defined as having no regulation by dams, surface-water withdrawals, ground-water withdrawals, diversion, wastewater discharge, and so forth) in Massachusetts. This revised equation differs from the equation developed in a previous U.S. Geological Survey study in that it is solely based on visual observations of the intermittent or perennial status of stream sites across Massachusetts and on the evaluation of several additional basin and land-use characteristics as potential explanatory variables in the logistic regression analysis. The revised equation estimated more accurately the intermittent or perennial status of the observed stream sites than the equation from the previous study. Stream sites used in the analysis were identified as intermittent or perennial based on visual observation during low-flow periods from late July through early September 2001. The database of intermittent and perennial streams included a total of 351 naturally flowing (no regulation) sites, of which 85 were observed to be intermittent and 266 perennial.
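A logistic regression equation of this kind maps basin characteristics to a perennial-flow probability through the logistic link. A minimal numpy sketch with hypothetical coefficients and covariates (the published Massachusetts equation uses different variables and values):

```python
import numpy as np

# hypothetical intercept and coefficients for illustration only; the published
# equation's basin characteristics and fitted values differ
intercept, b_area, b_sand = -2.0, 1.5, 0.8

def perennial_probability(log_drainage_area, sand_gravel_fraction):
    """Probability that a stream site flows perennially (logistic link)."""
    z = intercept + b_area * log_drainage_area + b_sand * sand_gravel_fraction
    return 1.0 / (1.0 + np.exp(-z))

p = perennial_probability(2.0, 0.5)
print(round(float(p), 3))   # -> 0.802
```

The automated mapping procedure amounts to evaluating such a function over every stream site's basin characteristics.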
NASA Astrophysics Data System (ADS)
Dong, Longjun; Wesseloo, Johan; Potvin, Yves; Li, Xibing
2016-01-01
Seismic events and blasts generate seismic waveforms with different characteristics. The challenge of confidently differentiating these two signatures is complex and requires the integration of physical and statistical techniques. In this paper, the differing characteristics of blasts and seismic events were investigated by comparing the probability density distributions of several parameters. Five typical parameters of blasts and events, the probability density function of blast time, and the probability density function of the origin-time difference between neighbouring blasts were extracted as discriminant indicators. The Fisher classifier, the naive Bayesian classifier and logistic regression were used to establish discriminators. Databases from three Australian and Canadian mines were assembled for training, calibrating and testing the discriminant models. The classification performance and discriminant precision of the three statistical techniques were discussed and compared. The proposed discriminators have explicit and simple functional forms that can easily be used by mine workers or researchers. Back-testing, application results, cross-validated results and analysis of receiver operating characteristic curves at the different mines show that the discriminator for one of the mines has reasonably good discriminating performance.
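A rough sketch of such a three-way comparison on synthetic indicator data, with scikit-learn's linear discriminant analysis standing in for the Fisher classifier; the features, class shifts, and sample size are invented:

```python
import numpy as np
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import GaussianNB

rng = np.random.default_rng(7)
n = 800
# synthetic stand-ins for the waveform discriminant indicators;
# blasts (class 1) are shifted relative to genuine seismic events (class 0)
y = rng.integers(0, 2, n)
X = rng.normal(size=(n, 5)) + y[:, None] * np.array([1.2, 0.6, 0.0, 0.9, 0.4])

X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.3, random_state=0)
scores = {}
for name, clf in [("Fisher (via LDA)", LinearDiscriminantAnalysis()),
                  ("naive Bayes", GaussianNB()),
                  ("logistic regression", LogisticRegression())]:
    scores[name] = clf.fit(X_tr, y_tr).score(X_te, y_te)
    print(f"{name}: {scores[name]:.2f}")
```

On Gaussian data with a shared covariance, the three discriminators give similar held-out accuracy, which mirrors the paper's finding that the techniques are broadly comparable.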
NASA Astrophysics Data System (ADS)
Mousavi, S. Mostafa; Horton, Stephen P.; Langston, Charles A.; Samei, Borhan
2016-07-01
We develop an automated strategy for discriminating deep microseismic events from shallow ones on the basis of the waveforms recorded on a limited number of surface receivers. Machine-learning techniques are employed to explore the relationship between event hypocenters and seismic features of the recorded signals in time, frequency, and time-frequency domains. We applied the technique to 440 microearthquakes -1.7
Duwe, Grant; Freske, Pamela J
2012-08-01
This study presents the results from efforts to revise the Minnesota Sex Offender Screening Tool-Revised (MnSOST-R), one of the most widely used sex offender risk-assessment tools. The updated instrument, the MnSOST-3, contains nine individual items, six of which are new. The population for this study consisted of the cross-validation sample for the MnSOST-R (N = 220) and a contemporary sample of 2,315 sex offenders released from Minnesota prisons between 2003 and 2006. To score and select items for the MnSOST-3, we used predicted probabilities generated from a multiple logistic regression model. We used bootstrap resampling to not only refine our selection of predictors but also internally validate the model. The results indicate the MnSOST-3 has a relatively high level of predictive discrimination, as evidenced by an apparent AUC of .821 and an optimism-corrected AUC of .796. The findings show the MnSOST-3 is well calibrated with actual recidivism rates for all but the highest risk offenders. Although estimating a penalized maximum likelihood model did not improve the overall calibration, the results suggest the MnSOST-3 may still be useful in helping identify high-risk offenders whose sexual recidivism risk exceeds 50%. Results from an interrater reliability assessment indicate the instrument, which is scored in a Microsoft Excel application, has an adequate degree of consistency across raters (ICC = .83 for both consistency and absolute agreement). PMID:22291047
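The optimism-corrected AUC reported here follows the standard bootstrap internal-validation recipe: refit the model on each bootstrap resample, measure how much better it scores on its own resample than on the original data, and subtract the average gap from the apparent AUC. A sketch on synthetic data (scikit-learn assumed; the sample size, item effects, and replicate count are invented):

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score

rng = np.random.default_rng(1)
n = 500
X = rng.normal(size=(n, 9))                   # nine items, as in the MnSOST-3
true_logit = X[:, 0] + 0.5 * X[:, 1] - 1.0    # invented effects for two items
y = (rng.random(n) < 1 / (1 + np.exp(-true_logit))).astype(int)

model = LogisticRegression(max_iter=1000).fit(X, y)
apparent = roc_auc_score(y, model.predict_proba(X)[:, 1])

# optimism bootstrap: fit on each resample, score on resample vs. original data
optimism = []
for _ in range(100):
    idx = rng.integers(0, n, n)
    m = LogisticRegression(max_iter=1000).fit(X[idx], y[idx])
    auc_boot = roc_auc_score(y[idx], m.predict_proba(X[idx])[:, 1])
    auc_orig = roc_auc_score(y, m.predict_proba(X)[:, 1])
    optimism.append(auc_boot - auc_orig)

corrected = apparent - np.mean(optimism)
print(round(apparent, 3), round(corrected, 3))
```

The corrected AUC sits a little below the apparent AUC, just as the study's optimism-corrected .796 sits below its apparent .821.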
Chao, Cheng-Min; Yu, Ya-Wen; Cheng, Bor-Wen; Kuo, Yao-Lung
2014-10-01
The aim of this paper is to use data mining technology to establish a classification of breast cancer survival patterns and to offer a treatment decision-making reference for women diagnosed with breast cancer in Taiwan. We studied patients with breast cancer at a specific hospital in Central Taiwan and obtained 1,340 data sets. We employed a support vector machine (SVM), logistic regression, and a C5.0 decision tree to construct classification models of breast cancer patients' survival rates, and used a 10-fold cross-validation approach to validate the models. The results show that the classification models yielded average accuracy rates of more than 90%, with the SVM providing the best method for constructing the three-category classification system for survival mode. The experiments show that the three methods used to create the classification system achieved high accuracy, predicted the survival of women diagnosed with breast cancer more accurately, and could serve as a reference when creating a medical decision-making framework. PMID:25119239
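The 10-fold cross-validated comparison can be sketched as follows on a synthetic stand-in for the patient records; scikit-learn's `DecisionTreeClassifier` stands in for C5.0, which scikit-learn does not provide, and the data generator is invented:

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier

# synthetic stand-in for the 1,340 patient records
X, y = make_classification(n_samples=1340, n_features=10, n_informative=6,
                           random_state=0)

results = {}
for name, clf in [("SVM", SVC()),
                  ("logistic regression", LogisticRegression(max_iter=1000)),
                  ("decision tree", DecisionTreeClassifier(random_state=0))]:
    results[name] = cross_val_score(clf, X, y, cv=10).mean()  # 10-fold CV accuracy
    print(f"{name}: {results[name]:.3f}")
```

Cross-validation gives each model a comparable out-of-sample accuracy estimate, which is the basis on which the paper ranks the three methods.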
NASA Astrophysics Data System (ADS)
Vozinaki, Anthi Eirini K.; Karatzas, George P.; Sibetheros, Ioannis A.; Varouchakis, Emmanouil A.
2014-05-01
Damage curves are the most significant component of the flood loss estimation models. Their development is quite complex. Two types of damage curves exist, historical and synthetic curves. Historical curves are developed from historical loss data from actual flood events. However, due to the scarcity of historical data, synthetic damage curves can be alternatively developed. Synthetic curves rely on the analysis of expected damage under certain hypothetical flooding conditions. A synthetic approach was developed and presented in this work for the development of damage curves, which are subsequently used as the basic input to a flood loss estimation model. A questionnaire-based survey took place among practicing and research agronomists, in order to generate rural loss data based on the responders' loss estimates, for several flood condition scenarios. In addition, a similar questionnaire-based survey took place among building experts, i.e. civil engineers and architects, in order to generate loss data for the urban sector. By answering the questionnaire, the experts were in essence expressing their opinion on how damage to various crop types or building types is related to a range of values of flood inundation parameters, such as floodwater depth and velocity. However, the loss data compiled from the completed questionnaires were not sufficient for the construction of workable damage curves; to overcome this problem, a Weighted Monte Carlo method was implemented, in order to generate extra synthetic datasets with statistical properties identical to those of the questionnaire-based data. The data generated by the Weighted Monte Carlo method were processed via Logistic Regression techniques in order to develop accurate logistic damage curves for the rural and the urban sectors. A Python-based code was developed, which combines the Weighted Monte Carlo method and the Logistic Regression analysis into a single code (WMCLR Python code). Each WMCLR code execution
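The final step, fitting a logistic damage curve to depth-damage observations, can be sketched as follows. The data are synthetic questionnaire-style responses and the latent curve is invented; the Weighted Monte Carlo augmentation step is omitted for brevity:

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(3)
# synthetic questionnaire-style data: floodwater depth (m) and whether experts
# judged a damage threshold exceeded; the latent curve below is an assumption
depth = rng.uniform(0.0, 3.0, 400)
p_true = 1 / (1 + np.exp(-(3.0 * depth - 4.0)))
damaged = (rng.random(400) < p_true).astype(int)

# fit the logistic damage curve: P(damage | depth)
model = LogisticRegression(C=1e6).fit(depth[:, None], damaged)
curve = model.predict_proba(np.array([[1.0], [2.0]]))[:, 1]
print(np.round(curve, 2))      # damage probability at 1 m and 2 m depth
```

A second flood parameter such as flow velocity would enter as an additional column, giving a damage surface rather than a single curve.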
Ong, Hong Choon; Alih, Ekele
2015-01-01
The tendency for experimental and industrial variables to include a certain proportion of outliers has become a rule rather than an exception. These clusters of outliers, if left undetected, have the capability to distort the mean and the covariance matrix of the Hotelling’s T2 multivariate control charts constructed to monitor individual quality characteristics. The effect of this distortion is that the resulting control chart becomes unreliable, as it exhibits masking and swamping, phenomena in which an out-of-control process is erroneously declared to be in control or an in-control process is erroneously declared to be out of control. To handle these problems, this article proposes a control chart based on cluster-regression adjustment for retrospective monitoring of individual quality characteristics in a multivariate setting. The performance of the proposed method is investigated through Monte Carlo simulation experiments and historical datasets. The results indicate that the proposed method improves on state-of-the-art methods in terms of outlier detection as well as in keeping the masking and swamping rates under control. PMID:25923739
ERIC Educational Resources Information Center
So, Tak-Shing Harry; Peng, Chao-Ying Joanne
This study compared the accuracy of predicting two-group membership obtained from K-means clustering with that derived from linear probability modeling, linear discriminant function analysis, and logistic regression under various data properties. Multivariate normally distributed populations were simulated based on combinations of population proportions,…
ERIC Educational Resources Information Center
Woods, Carol M.; Oltmanns, Thomas F.; Turkheimer, Eric
2008-01-01
Person-fit assessment is used to identify persons who respond aberrantly to a test or questionnaire. In this study, S. P. Reise's (2000) method for evaluating person fit using 2-level logistic regression was applied to 13 personality scales of the Schedule for Nonadaptive and Adaptive Personality (SNAP; L. Clark, 1996) that had been administered…
NASA Astrophysics Data System (ADS)
Ngadisih; Bhandary, Netra P.; Yatabe, Ryuichi; Dahal, Ranjan K.
2016-05-01
West Java Province is the most landslide-prone area in Indonesia owing to its extreme geomorphological and climatic conditions and its densely populated settlements with immense completed and ongoing development activities. A landslide susceptibility map at regional scale for this province is therefore a fundamental tool for risk management and land-use planning. Logistic regression and Artificial Neural Network (ANN) models are the most frequently used tools for landslide susceptibility assessment, mainly because they are capable of handling the nature of landslide data. The main objective of this study is to apply logistic regression and ANN models, compare their performance for landslide susceptibility mapping in the volcanic mountains of West Java Province, and identify the factors contributing most to landslide events in the study area. The spatial database built in a GIS platform consists of a landslide inventory, four topographical parameters (slope, aspect, relief, distance to river), three geological parameters (distance to volcano crater, distance to thrust and fault, geological formation), and two anthropogenic parameters (distance to road, land use). The logistic regression model revealed that slope, geological formation, distance to road and distance to volcano crater are the most influential factors for landslide events, while the ANN model revealed that distance to volcano crater, geological formation, distance to road, and land use are the most important causal factors of landslides in the study area. Moreover, an evaluation of the models showed that the ANN model has a higher accuracy than the logistic regression model.
Mala, A; Ravichandran, B; Raghavan, S; Rajmohan, H R
2010-08-01
There are only a few studies of multinomial logistic regression applied to benzene-exposed occupational groups. A study was carried out to assess the relationship between benzene concentration and trans,trans-muconic acid (t,t-MA), a biomarker measured in urine samples from petrol filling workers. A total of 117 workers in this occupation were selected for the study. Logistic regression analysis (LR) is a common statistical technique that can be used to predict the likelihood of categorical, binary or dichotomous outcome variables. Multinomial logistic regression equations were used to model the relationship between benzene concentration and t,t-MA. The results showed a significant correlation between benzene and t,t-MA among the petrol fillers. Prediction equations were estimated using the workers' physical characteristics, viz. age, years of experience and job category. Interestingly, no significant difference was observed across years of experience. Petrol fillers and cashiers at higher occupational risk were those in the age groups ≤24 and 25-34 years. Among the petrol fillers, t,t-MA levels exceeding the ACGIH TWA-TLV were the most significant. This study demonstrated that multinomial logistic regression is an effective model for profiling the greatest risk within a benzene-exposed group associated with different explanatory variables. PMID:21120078
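A multinomial logistic regression of this kind can be sketched as below; the covariate coding, category cutoffs, and all data are invented stand-ins, not the study's measurements:

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(5)
n = 117                      # matches the study's sample size; all data synthetic
age = rng.integers(18, 55, n)
years = rng.integers(0, 20, n)
job = rng.integers(0, 3, n)  # hypothetical coding: 0=filler, 1=cashier, 2=other

# synthetic 3-level t,t-MA category (0=low, 1=medium, 2=high), loosely age-linked
score = 0.08 * (40 - age) + rng.normal(0, 1, n)
tta_level = np.digitize(score, [-0.5, 0.5])

X = np.c_[age, years, job]
model = LogisticRegression(max_iter=1000).fit(X, tta_level)  # multinomial by default
probs = model.predict_proba(np.array([[24, 2, 0]]))          # a young petrol filler
print(np.round(probs, 2))
```

The fitted model returns one probability per biomarker category for each worker profile, which is what the study's prediction equations provide.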
ERIC Educational Resources Information Center
Hidalgo, Mª Dolores; Gómez-Benito, Juana; Zumbo, Bruno D.
2014-01-01
The authors analyze the effectiveness of the R[superscript 2] and delta log odds ratio effect size measures when using logistic regression analysis to detect differential item functioning (DIF) in dichotomous items. A simulation study was carried out, and the Type I error rate and power estimates under conditions in which only statistical testing…
NASA Astrophysics Data System (ADS)
Das, Iswar; Sahoo, Sashikant; van Westen, Cees; Stein, Alfred; Hack, Robert
2010-02-01
Landslide studies are commonly guided by ground knowledge and field measurements of rock strength and slope failure criteria. With increasing sophistication of GIS-based statistical methods, however, landslide susceptibility studies benefit from the integration of data collected from various sources and methods at different scales. This study presents a logistic regression method for landslide susceptibility mapping and verifies the result by comparing it with the geotechnical-based slope stability probability classification (SSPC) methodology. The study was carried out in a landslide-prone national highway road section in the northern Himalayas, India. Logistic regression model performance was assessed by the receiver operating characteristic (ROC) curve, showing an area under the curve equal to 0.83. Field validation of the SSPC results showed a correspondence of 72% between the high and very high susceptibility classes and present landslide occurrences. A spatial comparison of the two susceptibility maps revealed the significance of the geotechnical-based SSPC method, as 90% of the area classified as high or very high susceptibility zones by the logistic regression method corresponds to the high and very high classes in the SSPC method. On the other hand, only 34% of the area classified as high or very high by the SSPC method falls in the high and very high classes of the logistic regression method. The underestimation by the logistic regression method can be attributed to the generalisation made by the statistical methods, so that a number of slopes existing in critical equilibrium condition might not be classified as high or very high susceptibility zones.
Boutilier, J; Chan, T; Lee, T; Craig, T; Sharpe, M
2014-06-15
Purpose: To develop a statistical model that predicts optimization objective function weights from patient geometry for intensity-modulated radiotherapy (IMRT) of prostate cancer. Methods: A previously developed inverse optimization method (IOM) is applied retrospectively to determine optimal weights for 51 treated patients. We use an overlap volume ratio (OVR) of bladder and rectum for different PTV expansions in order to quantify patient geometry in explanatory variables. Using the optimal weights as ground truth, we develop and train a logistic regression (LR) model to predict the rectum weight and thus the bladder weight. Post hoc, we fix the weights of the left femoral head, right femoral head, and an artificial structure that encourages conformity to the population average while normalizing the bladder and rectum weights accordingly. The population average of objective function weights is used for comparison. Results: The OVR at 0.7 cm was found to be the most predictive of the rectum weights. The LR model performs statistically significantly better than the population average over a range of clinical metrics including bladder/rectum V53Gy, bladder/rectum V70Gy, and mean voxel dose to the bladder, rectum, CTV, and PTV. On average, the LR model predicted bladder and rectum weights that are both 63% closer to the optimal weights than the population average. The treatment plans resulting from the LR weights have, on average, a rectum V70Gy that is 35% closer to the clinical plan and a bladder V70Gy that is 43% closer. Similar results are seen for bladder V54Gy and rectum V54Gy. Conclusion: Statistical modelling from patient anatomy can be used to determine objective function weights in IMRT for prostate cancer. Our method allows treatment planners to begin the personalization process from an informed starting point, which may lead to more consistent clinical plans and reduce overall planning time.
Burkness, Eric C; Galvan, Tederson L; Hutchison, W D
2009-04-01
Late-season plantings of sweet corn in Minnesota result in an abundant supply of silking corn, Zea mays L., throughout August to early September that is highly attractive to the corn earworm, Helicoverpa zea (Boddie) (Lepidoptera: Noctuidae). During a 10-yr period, 1997-2006, insecticide efficacy trials were conducted in late-planted sweet corn in Minnesota for management of H. zea. These data were used to develop a logistic regression model to identify the variables and interactions that most influenced efficacy (proportion control) against late-instar H. zea. The pyrethroid lambda-cyhalothrin (0.028 kg [AI]/ha) is a commonly used insecticide in sweet corn and was therefore chosen for use in parameter evaluation. Three variables were found to be significant (alpha = 0.05): the percentage of plants silking at the time of the first insecticide application, the interval between the first and second insecticide applications, and the interval between the last insecticide application and harvest. Odds ratio estimates indicated that as the percentage of plants silking at the time of first application increased, control of H. zea increased. As the interval between the first and second insecticide application increased, control of H. zea decreased. Finally, as the interval between the last insecticide application and harvest increased, control of H. zea increased. An additional timing trial was conducted in 2007 using lambda-cyhalothrin to evaluate the impact of the percentage of plants silking at the first application. The results indicated no significant differences in efficacy against late-instar H. zea at 0, 50, 90, and 100% of plants silking at the first application (regimes of five or more sprays). The implications of these effects are discussed within the context of current integrated pest management programs for late-planted sweet corn in the upper midwestern United States. PMID:19449649
Kononen, Douglas W; Flannagan, Carol A C; Wang, Stewart C
2011-01-01
A multivariate logistic regression model, based upon National Automotive Sampling System Crashworthiness Data System (NASS-CDS) data for calendar years 1999-2008, was developed to predict the probability that a crash-involved vehicle will contain one or more occupants with serious or incapacitating injuries. These vehicles were defined as containing at least one occupant coded with an Injury Severity Score (ISS) of greater than or equal to 15, in planar, non-rollover crash events involving Model Year 2000 and newer cars, light trucks, and vans. The target injury outcome measure was developed by the Centers for Disease Control and Prevention (CDC)-led National Expert Panel on Field Triage in their recent revision of the Field Triage Decision Scheme (American College of Surgeons, 2006). The parameters to be used for crash injury prediction were subsequently specified by the National Expert Panel. Model input parameters included: crash direction (front, left, right, and rear), change in velocity (delta-V), multiple vs. single impacts, belt use, presence of at least one older occupant (≥ 55 years old), presence of at least one female in the vehicle, and vehicle type (car, pickup truck, van, and sport utility). The model was developed using predictor variables that may be readily available, post-crash, from OnStar-like telematics systems. Model sensitivity and specificity were 40% and 98%, respectively, using a probability cutpoint of 0.20. The area under the receiver operator characteristic (ROC) curve for the final model was 0.84. Delta-V (mph), seat belt use and crash direction were the most important predictors of serious injury. Due to the complexity of factors associated with rollover-related injuries, a separate screening algorithm is needed to model injuries associated with this crash mode. PMID:21094304
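The sensitivity/specificity figures quoted at a 0.20 probability cutpoint follow directly from thresholding the model's predicted probabilities. A sketch on synthetic crash-like data (two of the predictors are simulated; the coefficients and prevalence are invented, not the NASS-CDS estimates):

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score

rng = np.random.default_rng(9)
n = 5000
# hypothetical stand-ins for two of the model's predictors
delta_v = rng.uniform(0, 50, n)          # crash delta-V (mph)
belted = rng.integers(0, 2, n)           # seat belt use
true_logit = -5.0 + 0.12 * delta_v - 1.0 * belted
y = (rng.random(n) < 1 / (1 + np.exp(-true_logit))).astype(int)

model = LogisticRegression(max_iter=1000).fit(np.c_[delta_v, belted], y)
p = model.predict_proba(np.c_[delta_v, belted])[:, 1]
auc = roc_auc_score(y, p)

# classify "serious injury likely" at a probability cutpoint of 0.20
pred = (p >= 0.20).astype(int)
sensitivity = (pred[y == 1] == 1).mean()
specificity = (pred[y == 0] == 0).mean()
print(f"sens={sensitivity:.2f} spec={specificity:.2f} auc={auc:.2f}")
```

Lowering the cutpoint trades specificity for sensitivity, which is the design decision behind choosing 0.20 for a triage application.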
Chang, Ling-Hui; Tsai, Athena Yi-Jung; Huang, Wen-Ni
2016-01-01
Because resources for long-term care services are limited, timely and appropriate referral for rehabilitation services is critical for optimizing clients’ functions and successfully integrating them into the community. We investigated which client characteristics are most relevant in predicting Taiwan’s community-based occupational therapy (OT) service referral based on experts’ beliefs. Data were collected in face-to-face interviews using the Multidimensional Assessment Instrument (MDAI). Community-dwelling participants (n = 221) ≥ 18 years old who reported disabilities in the previous National Survey of Long-term Care Needs in Taiwan were enrolled. The standard for referral was the judgment and agreement of two experienced occupational therapists who reviewed the results of the MDAI. Logistic regressions and Generalized Additive Models were used for analysis. Two predictive models were proposed, one using basic activities of daily living (BADLs) and one using instrumental ADLs (IADLs). Dementia, psychiatric disorders, cognitive impairment, joint range-of-motion limitations, fear of falling, behavioral or emotional problems, expressive deficits (in the BADL-based model), and limitations in IADLs or BADLs were significantly correlated with the need for referral. Both models showed high area under the curve (AUC) values on receiver operating curve testing (AUC = 0.977 and 0.972, respectively). The probability of being referred for community OT services was calculated using the referral algorithm. The referral protocol facilitated communication between healthcare professionals to make appropriate decisions for OT referrals. The methods and findings should be useful for developing referral protocols for other long-term care services. PMID:26863544
NASA Astrophysics Data System (ADS)
Yao, W.; Poleswki, P.; Krzystek, P.
2016-06-01
The recent success of deep convolutional neural networks (CNNs) on a large number of applications can be attributed to large amounts of available training data and increasing computing power. In this paper, a semantic pixel labelling scheme for urban areas using a multi-resolution CNN and hand-crafted spatial-spectral features of airborne remotely sensed data is presented. Both CNN and hand-crafted features are applied to image/DSM patches to produce per-pixel class probabilities with an L1-norm regularized logistic regression classifier. Evidence theory infers a degree of belief for pixel labelling from the different sources to smooth regions, handling the conflicts between the two classifiers while reducing uncertainty. The aerial data used in this study were provided by ISPRS as benchmark datasets for 2D semantic labelling tasks in urban areas and consist of two data sources: LiDAR and a color infrared camera. The test sites are parts of a city in Germany assumed to consist of typical object classes, including impervious surfaces, trees, buildings, low vegetation, vehicles and clutter. The evaluation is based on the computation of pixel-based confusion matrices by random sampling. The performance of the strategy with respect to scene characteristics and method combination strategies is analyzed and discussed. The competitive classification accuracy can be explained not only by the nature of the input data sources (e.g., the above-ground heights in the nDSM highlight the vertical dimension of houses, trees and even cars, and the near-infrared spectrum indicates vegetation) but also by the decision-level fusion of the CNN's texture-based approach with multichannel spatial-spectral hand-crafted features based on evidence combination theory.
Wei, Peng; Tang, Hongwei; Li, Donghui
2014-11-01
Most complex human diseases are likely the consequence of the joint actions of genetic and environmental factors. Identification of gene-environment (G × E) interactions not only contributes to a better understanding of the disease mechanisms, but also improves disease risk prediction and targeted intervention. In contrast to the large number of genetic susceptibility loci discovered by genome-wide association studies, there have been very few successes in identifying G × E interactions, which may be partly due to limited statistical power and inaccurately measured exposures. Although existing statistical methods only consider interactions between genes and static environmental exposures, many environmental/lifestyle factors, such as air pollution and diet, change over time, and cannot be accurately captured at one measurement time point or by simply categorizing into static exposure categories. There is a dearth of statistical methods for detecting gene by time-varying environmental exposure interactions. Here, we propose a powerful functional logistic regression (FLR) approach to model the time-varying effect of longitudinal environmental exposure and its interaction with genetic factors on disease risk. Capitalizing on the powerful functional data analysis framework, our proposed FLR model is capable of accommodating longitudinal exposures measured at irregular time points and contaminated by measurement errors, commonly encountered in observational studies. We use extensive simulations to show that the proposed method can control the Type I error and is more powerful than alternative ad hoc methods. We demonstrate the utility of this new method using data from a case-control study of pancreatic cancer to identify the windows of vulnerability of lifetime body mass index on the risk of pancreatic cancer as well as genes that may modify this association. PMID:25219575
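The core idea of functional logistic regression, expanding the time-varying coefficient in a small basis so the functional model reduces to an ordinary logistic regression on functional inner products, can be sketched as follows. This is a deliberately simplified illustration (polynomial basis, regular time grid, no measurement-error correction), with all trajectories and effects synthetic:

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(11)
n, T = 400, 20
t = np.linspace(0.0, 1.0, T)             # normalized age axis
# synthetic longitudinal exposure trajectories (stand-ins for lifetime BMI)
Z = rng.normal(size=(n, 3))
traj = Z[:, [0]] + Z[:, [1]] * t + Z[:, [2]] * t**2 \
       + 0.3 * rng.normal(size=(n, T))

# assumed truth: only mid-life exposure affects disease risk (a narrow window)
w = 4.0 * np.exp(-((t - 0.5) ** 2) / 0.02)
eta = traj @ w / T - 0.2
y = (rng.random(n) < 1 / (1 + np.exp(-eta))).astype(int)

# FLR via basis expansion: beta(t) = B(t) @ gamma turns the functional model
# into an ordinary logistic regression on functional inner products
B = np.vstack([np.ones(T), t, t**2, t**3]).T     # (T, 4) polynomial basis
scores = traj @ B / T
model = LogisticRegression(max_iter=1000).fit(scores, y)
beta_t = B @ model.coef_[0]                      # reconstructed beta(t)
print(np.round(beta_t[[0, T // 2, T - 1]], 2))
```

The reconstructed beta(t) is the estimated time-varying effect; peaks in it correspond to the "windows of vulnerability" the paper seeks.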
Khedhiri, Sami; Stambouli, Nejla; Ouerhani, Slah; Rouissi, Kamel; Marrakchi, Raja; Gaaied, Amel B; Slama, M B
2010-07-01
Cigarette smoking is the predominant risk factor for bladder cancer in males and females. The tobacco carcinogens are metabolized by various xenobiotic metabolizing enzymes such as N-acetyltransferases (NAT) and glutathione S-transferases (GST). Polymorphisms in NAT and GST genes alter the ability of these enzymes to metabolize carcinogens. In this paper, we conduct a statistical analysis based on logistic regressions to assess the impact of smoking and metabolizing enzyme genotypes on the risk to develop bladder cancer using a case-control study from Tunisia. We also use the generalized ordered logistic model to investigate whether these factors do have an impact on the progression of bladder tumors. PMID:20063011
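The ordered-logit structure used to model tumor progression can be sketched with a minimal proportional-odds calculation; the thresholds, coefficients, and category coding below are invented illustrative values, not the study's estimates:

```python
import numpy as np

# proportional-odds (ordinal logistic) sketch with invented values:
# tumour grade categories 0 < 1 < 2; predictors: smoker and GST-null genotype
cuts = np.array([-0.5, 1.0])      # hypothetical thresholds between categories
beta = np.array([0.9, 0.6])       # hypothetical log-odds for smoking, GST-null

def category_probs(x):
    """P(Y <= k) = sigmoid(cut_k - x @ beta); differencing gives P(Y = k)."""
    cum = 1.0 / (1.0 + np.exp(-(cuts - x @ beta)))
    return np.diff(np.r_[0.0, cum, 1.0])

p_ref = category_probs(np.array([0.0, 0.0]))     # non-smoker, GST present
p_risk = category_probs(np.array([1.0, 1.0]))    # smoker with GST-null genotype
print(np.round(p_risk, 2))   # -> [0.12 0.26 0.62]
```

The generalized ordered logit used in the paper relaxes the proportional-odds constraint by allowing the coefficient vector to vary across the thresholds.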
NASA Astrophysics Data System (ADS)
Trigila, Alessandro; Iadanza, Carla; Esposito, Carlo; Scarascia-Mugnozza, Gabriele
2015-04-01
The first phase of the work aimed to identify the spatial relationships between landslide locations and 13 related factors using the Frequency Ratio bivariate statistical method. The analysis was then carried out with a multivariate statistical approach, according to the Logistic Regression technique and the Random Forests technique, which gave the best results in terms of AUC. The models were built and evaluated with different sample sizes, also taking into account the temporal variation of input variables such as areas burned by wildfire. The most significant outcomes of this work are: the relevant influence of the sample size on the model results, and the strong importance of some environmental factors (e.g. land use and wildfires) for the identification of the depletion zones of extremely rapid shallow landslides.
JACOB, BENJAMIN G.; NOVAK, ROBERT J.; TOE, LAURENT; SANFO, MOUSSA S.; AFRIYIE, ABENA N.; IBRAHIM, MOHAMMED A.; GRIFFITH, DANIEL A.; UNNASCH, THOMAS R.
2013-01-01
The standard methods for regression analyses of clustered riverine larval habitat data of Simulium damnosum s.l., a major black-fly vector of onchocerciasis, postulate models relating observational ecological-sampled parameter estimators to prolific habitats without accounting for residual intra-cluster error correlation effects. Generally, this correlation comes from two sources: (1) the design of the random effects and their assumed covariance from the multiple levels within the regression model; and (2) the correlation structure of the residuals. Unfortunately, inconspicuous errors in residual intra-cluster correlation estimates can overstate precision in forecasted S. damnosum s.l. riverine larval habitat explanatory attributes regardless of how they are treated (e.g., independent, autoregressive, Toeplitz, etc.). In this research, the geographical locations of multiple riverine-based S. damnosum s.l. larval ecosystem habitats sampled from 2 pre-established epidemiological sites in Togo were identified and recorded from July 2009 to June 2010. Initially the data were aggregated in PROC GENMOD. An agglomerative hierarchical residual cluster-based analysis was then performed. The sampled clustered study site data were then analyzed for statistical correlations using Monthly Biting Rates (MBR). Euclidean distance measurements and terrain-related geomorphological statistics were then generated in ArcGIS. A digital overlay was then performed, also in ArcGIS, using the georeferenced ground coordinates of high- and low-density clusters stratified by Annual Biting Rates (ABR). These data were overlain onto multitemporal sub-meter pixel resolution satellite data (i.e., QuickBird 0.61 m wavebands). Orthogonal spatial filter eigenvectors were then generated in SAS/GIS. Univariate and non-linear regression-based models (i.e., Logistic, Poisson and Negative Binomial) were also employed to determine probability distributions and to identify statistically significant parameter
ERIC Educational Resources Information Center
Kasapoglu, Koray
2014-01-01
This study aims to investigate which factors are associated with Turkey's 15-year-olds' scoring above the OECD average (493) on the PISA'09 reading assessment. Collected from a total of 4,996 15-year-old students from Turkey, data were analyzed by logistic regression analysis in order to model the data of students who were split…
Barks, C.S.
1995-01-01
Storm-runoff water-quality data were used to verify and, when appropriate, adjust regional regression models previously developed to estimate urban storm-runoff loads and mean concentrations in Little Rock, Arkansas. Data collected at 5 representative sites during 22 storms from June 1992 through January 1994 compose the Little Rock data base. Comparison of observed values (O) of storm-runoff loads and mean concentrations to the predicted values (Pu) from the regional regression models for nine constituents (chemical oxygen demand, suspended solids, total nitrogen, total ammonia plus organic nitrogen as nitrogen, total phosphorus, dissolved phosphorus, total recoverable copper, total recoverable lead, and total recoverable zinc) shows large prediction errors ranging from 63 to several thousand percent. Prediction errors for six of the regional regression models are less than 100 percent, and can be considered reasonable for water-quality models. Differences between O and Pu are due to variability in the Little Rock data base and error in the regional models. Where applicable, a model-adjustment procedure (termed MAP-R-P) based upon regression of O against Pu was applied to improve predictive accuracy. For 11 of the 18 regional water-quality models, O and Pu are significantly correlated; that is, much of the variation in O is explained by the regional models. Five of these 11 regional models consistently overestimate O; therefore, MAP-R-P can be used to provide a better estimate. For the remaining seven regional models, O and Pu are not significantly correlated, thus neither the unadjusted regional models nor MAP-R-P is appropriate. A simple estimator, such as the mean of the observed values, may be used if the regression models are not appropriate. Standard error of estimate of the adjusted models ranges from 48 to 130 percent. Calibration results may be biased due to the limited data set sizes in the Little Rock data base. The relatively large values of
Quantile Regression Adjusting for Dependent Censoring from Semi-Competing Risks
Li, Ruosha; Peng, Limin
2014-01-01
Summary In this work, we study quantile regression when the response is an event time subject to potentially dependent censoring. We consider the semi-competing risks setting, where time to censoring remains observable after the occurrence of the event of interest. While such a scenario frequently arises in biomedical studies, most of current quantile regression methods for censored data are not applicable because they generally require the censoring time and the event time be independent. By imposing rather mild assumptions on the association structure between the time-to-event response and the censoring time variable, we propose quantile regression procedures, which allow us to garner a comprehensive view of the covariate effects on the event time outcome as well as to examine the informativeness of censoring. An efficient and stable algorithm is provided for implementing the new method. We establish the asymptotic properties of the resulting estimators including uniform consistency and weak convergence. The theoretical development may serve as a useful template for addressing estimating settings that involve stochastic integrals. Extensive simulation studies suggest that the proposed method performs well with moderate sample sizes. We illustrate the practical utility of our proposals through an application to a bone marrow transplant trial. PMID:25574152
Hoos, Anne B.; Patel, Anant R.
1996-01-01
Model-adjustment procedures were applied to the combined data bases of storm-runoff quality for Chattanooga, Knoxville, and Nashville, Tennessee, to improve predictive accuracy for storm-runoff quality for urban watersheds in these three cities and throughout Middle and East Tennessee. Data for 45 storms at 15 different sites (five sites in each city) constitute the data base. Comparison of observed values of storm-runoff load and event-mean concentration to the predicted values from the regional regression models for 10 constituents shows prediction errors as large as 806,000 percent. Model-adjustment procedures, which combine the regional model predictions with local data, are applied to improve predictive accuracy. Standard error of estimate after model adjustment ranges from 67 to 322 percent. Calibration results may be biased due to sampling error in the Tennessee data base. The relatively large values of standard error of estimate for some of the constituent models, although representing significant reduction (at least 50 percent) in prediction error compared to estimation with unadjusted regional models, may be unacceptable for some applications. The user may wish to collect additional local data for these constituents and repeat the analysis, or calibrate an independent local regression model.
Chen, Chen-Hsin; Tsay, Yuh-Chyuan; Wu, Ya-Chi; Horng, Cheng-Fang
2013-10-30
In conventional survival analysis there is an underlying assumption that all study subjects are susceptible to the event. In general, this assumption does not adequately hold when investigating the time to an event other than death. Owing to genetic and/or environmental etiology, study subjects may not be susceptible to the disease. Analyzing nonsusceptibility has become an important topic in biomedical, epidemiological, and sociological research, with recent statistical studies proposing several mixture models for right-censored data in regression analysis. In longitudinal studies, we often encounter left, interval, and right-censored data because of incomplete observations of the time endpoint, as well as possibly left-truncated data arising from the dissimilar entry ages of recruited healthy subjects. To analyze these kinds of incomplete data while accounting for nonsusceptibility and possible crossing hazards in the framework of mixture regression models, we utilize a logistic regression model to specify the probability of susceptibility, and a generalized gamma distribution, or a log-logistic distribution, in the accelerated failure time location-scale regression model to formulate the time to the event. Relative times of the conditional event time distribution for susceptible subjects are extended in the accelerated failure time location-scale submodel. We also construct graphical goodness-of-fit procedures on the basis of the Turnbull-Frydman estimator and newly proposed residuals. Simulation studies were conducted to demonstrate the validity of the proposed estimation procedure. The mixture regression models are illustrated with alcohol abuse data from the Taiwan Aboriginal Study Project and hypertriglyceridemia data from the Cardiovascular Disease Risk Factor Two-township Study in Taiwan. PMID:23661280
NASA Astrophysics Data System (ADS)
Ozdemir, Adnan
2011-12-01
In this study, groundwater spring potential maps produced by three different methods, frequency ratio, weights of evidence, and logistic regression, were evaluated using validation data sets and compared to each other. Groundwater spring occurrence potential maps in the Sultan Mountains (Konya, Turkey) were constructed using the relationship between groundwater spring locations and their causative factors. Groundwater spring locations were identified in the study area from a topographic map. Different thematic maps of the study area, such as geology, topography, geomorphology, hydrology, and land use/cover, have been used to identify groundwater potential zones. Seventeen spring-related parameter layers of the entire study area were used to generate groundwater spring potential maps. These are geology (lithology), fault density, distance to fault, relative permeability of lithologies, elevation, slope aspect, slope steepness, curvature, plan curvature, profile curvature, topographic wetness index, stream power index, sediment transport capacity index, drainage density, distance to drainage, land use/cover, and precipitation. The predictive capability of each model was determined by the area under the relative operating characteristic curve. The areas under the curve for the frequency ratio, weights of evidence, and logistic regression methods were calculated as 0.903, 0.880, and 0.840, respectively. These results indicate that the frequency ratio and weights of evidence models are relatively good estimators, whereas logistic regression is a relatively poor estimator of groundwater spring potential mapping in the study area. The frequency ratio model is simple; the process of input, calculation and output can be readily understood. The produced groundwater spring potential maps can serve planners and engineers in groundwater development plans and land-use planning.
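The model comparison above rests on the area under the ROC curve. As a minimal illustration of how such AUC values are computed, the sketch below uses the rank-based (Mann-Whitney) identity AUC = P(score of a positive case > score of a negative case); the labels and scores are invented for illustration and are not the study's validation data.

```python
def roc_auc(labels, scores):
    """Area under the ROC curve via the rank-sum (Mann-Whitney U) identity:
    AUC = P(score_pos > score_neg), counting ties as 1/2."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

# Hypothetical validation set: 1 = spring present, 0 = absent,
# scored by two competing susceptibility models.
labels  = [1, 1, 1, 0, 0, 0]
model_a = [0.9, 0.8, 0.4, 0.5, 0.3, 0.1]
model_b = [0.7, 0.6, 0.2, 0.8, 0.4, 0.3]
print(roc_auc(labels, model_a), roc_auc(labels, model_b))
```

With these made-up numbers the first score set ranks the positive cases higher and so earns the larger AUC, mirroring how the frequency ratio map (AUC 0.903) outranked logistic regression (AUC 0.840) in the study.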
Sargolzaie, Narjes; Miri-Moghaddam, Ebrahim
2014-01-01
The most common differential diagnosis of β-thalassemia (β-thal) trait is iron deficiency anemia. Several red blood cell equations have been introduced in different studies for differential diagnosis between β-thal trait and iron deficiency anemia. Due to genetic variation across regions, these equations cannot be useful in all populations. The aim of this study was to determine a native equation with high accuracy for differential diagnosis of β-thal trait and iron deficiency anemia for the Sistan and Baluchestan population by logistic regression analysis. We selected 77 iron deficiency anemia and 100 β-thal trait cases. We used binary logistic regression analysis and determined the best equations for predicting the probability of β-thal trait against iron deficiency anemia in our population. We compared the diagnostic values and receiver operating characteristic (ROC) curves of this equation and another 10 published equations in discriminating β-thal trait and iron deficiency anemia. The binary logistic regression analysis determined the equation giving the best probability prediction of β-thal trait against iron deficiency anemia, with an area under the curve (AUC) of 0.998. Based on ROC curves and AUC, the Green & King, England & Frazer, and Sirdah indices, respectively, had the most accuracy after our equation. We suggest that to get the best equation and cut-off in each region, one needs to evaluate the specific information of each region, particularly in areas where populations are homogeneous, to provide a specific formula for differentiating between β-thal trait and iron deficiency anemia. PMID:25155260
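A discriminating equation of this kind is obtained by maximum-likelihood fitting of a binary logistic regression. The sketch below fits a single-predictor model by gradient descent on invented MCV-like values; the data, the centering constant, and the predictor choice are hypothetical and are not the study's equation.

```python
import math

def fit_logistic(xs, ys, lr=0.1, steps=5000):
    """Minimal single-predictor binary logistic regression fitted by
    gradient descent on the negative log-likelihood (illustrative sketch)."""
    b0 = b1 = 0.0
    for _ in range(steps):
        g0 = g1 = 0.0
        for x, y in zip(xs, ys):
            p = 1.0 / (1.0 + math.exp(-(b0 + b1 * x)))
            g0 += p - y
            g1 += (p - y) * x
        b0 -= lr * g0 / len(xs)
        b1 -= lr * g1 / len(xs)
    return b0, b1

# Hypothetical MCV values (fL); 1 = beta-thal trait, 0 = iron deficiency.
mcv = [60, 62, 63, 65, 70, 72, 75, 78]
lab = [1, 1, 1, 1, 0, 0, 0, 0]
xc = [x - 68 for x in mcv]          # centre the predictor for stable fitting
b0, b1 = fit_logistic(xc, lab)
prob = 1.0 / (1.0 + math.exp(-(b0 + b1 * (61 - 68))))
print(round(prob, 3))               # probability of beta-thal trait at MCV = 61
```

Because the toy classes are perfectly separable, the fitted probability at a low MCV is close to 1; a real analysis would use several red cell indices and evaluate the result by ROC/AUC as the paper does.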
NASA Astrophysics Data System (ADS)
Ozdemir, Adnan
2011-07-01
The purpose of this study is to produce a groundwater spring potential map of the Sultan Mountains in central Turkey, based on a logistic regression method within a Geographic Information System (GIS) environment. Using field surveys, the locations of the springs (440 springs) were determined in the study area. In this study, 17 spring-related factors were used in the analysis: geology, relative permeability, land use/land cover, precipitation, elevation, slope, aspect, total curvature, plan curvature, profile curvature, wetness index, stream power index, sediment transport capacity index, distance to drainage, distance to fault, drainage density, and fault density map. The coefficients of the predictor variables were estimated using binary logistic regression analysis and were used to calculate the groundwater spring potential for the entire study area. The accuracy of the final spring potential map was evaluated based on the observed springs. The accuracy of the model was evaluated by calculating the relative operating characteristics. The area value of the relative operating characteristic curve model was found to be 0.82. These results indicate that the model is a good estimator of the spring potential in the study area. The spring potential map shows that the areas of very low, low, moderate and high groundwater spring potential classes are 105.586 km² (28.99%), 74.271 km² (19.906%), 101.203 km² (27.14%), and 90.05 km² (24.671%), respectively. The interpretations of the potential map showed that stream power index, relative permeability of lithologies, geology, elevation, aspect, wetness index, plan curvature, and drainage density play major roles in spring occurrence and distribution in the Sultan Mountains. The logistic regression approach has not yet been used to delineate groundwater potential zones. In this study, the logistic regression method was used to locate potential zones for groundwater springs in the Sultan Mountains. The evolved model
Methods for Adjusting U.S. Geological Survey Rural Regression Peak Discharges in an Urban Setting
Moglen, Glenn E.; Shivers, Dorianne E.
2006-01-01
A study was conducted of 78 U.S. Geological Survey gaged streams that have been subjected to varying degrees of urbanization over the last three decades. Flood-frequency analysis coupled with nonlinear regression techniques were used to generate a set of equations for converting peak discharge estimates determined from rural regression equations to a set of peak discharge estimates that represent known urbanization. Specifically, urban regression equations for the 2-, 5-, 10-, 25-, 50-, 100-, and 500-year return periods were calibrated as a function of the corresponding rural peak discharge and the percentage of impervious area in a watershed. The results of this study indicate that two sets of equations, one set based on imperviousness and one set based on population density, performed well. Both sets of equations are dependent on rural peak discharges, a measure of development (average percentage of imperviousness or average population density), and a measure of homogeneity of development within a watershed. Average imperviousness was readily determined by using geographic information system methods and commonly available land-cover data. Similarly, average population density was easily determined from census data. Thus, a key advantage to the equations developed in this study is that they do not require field measurements of watershed characteristics as did the U.S. Geological Survey urban equations developed in an earlier investigation. During this study, the U.S. Geological Survey PeakFQ program was used as an integral tool in the calibration of all equations. The scarcity of historical land-use data, however, made exclusive use of flow records necessary for the 30-year period from 1970 to 2000. Such relatively short-duration streamflow time series required a nonstandard treatment of the historical data function of the PeakFQ program in comparison to published guidelines. Thus, the approach used during this investigation does not fully comply with the
Agogo, George O; van der Voet, Hilko; Van't Veer, Pieter; van Eeuwijk, Fred A; Boshuizen, Hendriek C
2016-07-01
Dietary questionnaires are prone to measurement error, which biases the perceived association between dietary intake and risk of disease. Short-term measurements are required to adjust for the bias in the association. For foods that are not consumed daily, the short-term measurements are often characterized by excess zeroes. Via a simulation study, the performance of a two-part calibration model that was developed for a single-replicate study design was assessed by mimicking leafy vegetable intake reports from the multicenter European Prospective Investigation into Cancer and Nutrition (EPIC) study. In part I of the fitted two-part calibration model, a logistic distribution was assumed; in part II, a gamma distribution was assumed. The model was assessed with respect to the magnitude of the correlation between the consumption probability and the consumed amount (hereafter, cross-part correlation), the number and form of covariates in the calibration model, the percentage of zero response values, and the magnitude of the measurement error in the dietary intake. From the simulation study results, transforming the dietary variable in the regression calibration to an appropriate scale was found to be the most important factor for the model performance. Reducing the number of covariates in the model could be beneficial, but was not critical in large-sample studies. The performance was remarkably robust when fitting a one-part rather than a two-part model. The model performance was minimally affected by the cross-part correlation. PMID:27003183
Mesa, José Luis
2004-01-01
In clinical research, suitable visualization techniques for data after statistical analysis are crucial for researchers' and physicians' understanding. Common statistical techniques for analyzing data in clinical research are logistic regression models. Among these, the application of binary logistic regression analysis (LRA) has greatly increased in past years, due to its diagnostic accuracy and because scientists often want to analyze in a dichotomous way whether some event will occur or not. Such an analysis lacks a suitable, understandable, and widely used graphical display, instead providing an understandable logit function based on a linear model for the natural logarithm of the odds in favor of the occurrence of the dependent variable, Y. By simple exponential transformation, such a logit equation can be transformed into a logistic function, resulting in predicted probabilities for the presence of the dependent variable, P(Y=1|X). This model can be used to generate a simple graphical display for binary LRA. For the case of a single predictor or explanatory (independent) variable, X, a plot can be generated with X represented on the abscissa (i.e., horizontal axis) and P(Y=1|X) on the ordinate (i.e., vertical axis). For the case of multiple-predictor models, I propose here a relief 3D surface graphic in order to plot up to four independent variables (two continuous and two discrete). By using this technique, any researcher or physician would be able to transform a less understandable logit function into a figure easier to grasp, thus leading to better knowledge and interpretation of data in clinical research. For this, a sophisticated statistical package is not necessary, because the graphical display may be generated using any 2D or 3D surface plotter. PMID:14962632
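The exponential transformation described here is the standard logit-to-probability conversion. A minimal sketch, with hypothetical fitted coefficients b0 and b1:

```python
import math

def logit_to_prob(b0, b1, x):
    """Exponential transformation of the logit:
    logit = b0 + b1*x  ->  P(Y=1|X=x) = 1 / (1 + exp(-logit))."""
    return 1.0 / (1.0 + math.exp(-(b0 + b1 * x)))

# Hypothetical coefficients for a single continuous predictor.
b0, b1 = -4.0, 0.08
for x in range(0, 101, 25):
    print(x, round(logit_to_prob(b0, b1, x), 3))
```

Plotting x against these probabilities yields the S-shaped logistic curve that the author proposes as the graphical display; the probability passes through 0.5 exactly where the logit equals zero (here at x = 50).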
Koseki, Shige; Nonaka, Junko
2012-09-01
The objective of this study was to develop a probabilistic model to predict the end of lag time (λ) during the growth of Bacillus cereus vegetative cells as a function of temperature, pH, and salt concentration using logistic regression. The developed λ model was subsequently combined with a logistic differential equation to simulate bacterial numbers over time. To develop a novel model for λ, we determined whether bacterial growth had begun, i.e., whether λ had ended, at each time point during the growth kinetics. The growth of B. cereus was evaluated by optical density (OD) measurements in culture media for various pHs (5.5 ∼ 7.0) and salt concentrations (0.5 ∼ 2.0%) at static temperatures (10 ∼ 20°C). The probability of the end of λ was modeled using dichotomous judgments obtained at each OD measurement point concerning whether a significant increase had been observed. The probability of the end of λ was described as a function of time, temperature, pH, and salt concentration and showed a high goodness of fit. The λ model was validated with independent data sets of B. cereus growth in culture media and foods, indicating acceptable performance. Furthermore, the λ model, in combination with a logistic differential equation, enabled a simulation of the population of B. cereus in various foods over time at static and/or fluctuating temperatures with high accuracy. Thus, this newly developed modeling procedure enables the description of λ using observable environmental parameters without any conceptual assumptions and the simulation of bacterial numbers over time with the use of a logistic differential equation. PMID:22729541
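The final step described above, combining the lag-time model with a logistic differential equation, can be sketched as an Euler integration of dN/dt = mu * N * (1 - N/Nmax) in which growth is suppressed until the predicted end of lag. The parameters below are hypothetical, not the paper's fitted values.

```python
def simulate_growth(n0, mu, nmax, lag_end, t_end, dt=0.1):
    """Euler integration of the logistic differential equation
    dN/dt = mu * N * (1 - N / Nmax), with growth suppressed until the
    end of the lag time lag_end (hours); returns (time, N) pairs."""
    n, t, out = n0, 0.0, []
    while t <= t_end:
        out.append((round(t, 1), n))
        if t >= lag_end:                     # growth begins only after lag
            n += mu * n * (1 - n / nmax) * dt
        t += dt
    return out

# Hypothetical parameters: 1e3 CFU/ml inoculum, 5 h lag, capacity 1e8 CFU/ml.
traj = simulate_growth(n0=1e3, mu=0.5, nmax=1e8, lag_end=5.0, t_end=40.0)
print(traj[-1][1])  # population approaches nmax by the end of the run
```

In the paper's procedure the fixed lag_end would be replaced by the probabilistic λ model evaluated at the current temperature, pH, and salt concentration, which is what allows simulation under fluctuating conditions.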
de Vet, Emely; Chinapaw, Mai JM; de Boer, Michiel; Seidell, Jacob C; Brug, Johannes
2014-01-01
Background Playing video games contributes substantially to sedentary behavior in youth. A new generation of video games—active games—seems to be a promising alternative to sedentary games to promote physical activity and reduce sedentary behavior. At this time, little is known about correlates of active and non-active gaming among adolescents. Objective The objective of this study was to examine potential personal, social, and game-related correlates of both active and non-active gaming in adolescents. Methods A survey assessing game behavior and potential personal, social, and game-related correlates was conducted among adolescents (12-16 years, N=353) recruited via schools. Multivariable, multilevel logistic regression analyses, adjusted for demographics (age, sex and educational level of adolescents), were conducted to examine personal, social, and game-related correlates of active gaming ≥1 hour per week (h/wk) and non-active gaming >7 h/wk. Results Active gaming ≥1 h/wk was significantly associated with a more positive attitude toward active gaming (OR 5.3, CI 2.4-11.8; P<.001), a less positive attitude toward non-active games (OR 0.30, CI 0.1-0.6; P=.002), a higher score on habit strength regarding gaming (OR 1.9, CI 1.2-3.2; P=.008), having brothers/sisters (OR 6.7, CI 2.6-17.1; P<.001) and friends (OR 3.4, CI 1.4-8.4; P=.009) who spend more time on active gaming, and a slightly lower score on game engagement (OR 0.95, CI 0.91-0.997; P=.04). Non-active gaming >7 h/wk was significantly associated with a more positive attitude toward non-active gaming (OR 2.6, CI 1.1-6.3; P=.035), a stronger habit regarding gaming (OR 3.0, CI 1.7-5.3; P<.001), having friends who spend more time on non-active gaming (OR 3.3, CI 1.46-7.53; P=.004), and a more positive image of a non-active gamer (OR 2.0, CI 1.07-3.75; P=.03). Conclusions Various factors were significantly associated with active gaming ≥1 h/wk and non-active gaming >7 h/wk. Active gaming is most
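The odds ratios and confidence intervals reported here follow the usual Wald construction from logistic regression coefficients: OR = exp(b), with 95% CI exp(b ± 1.96 * SE). In this sketch the coefficient and standard error are hypothetical values chosen so the output roughly reproduces the first reported result (OR 5.3, CI 2.4-11.8); they are not taken from the study.

```python
import math

def odds_ratio_ci(beta, se, z=1.96):
    """Odds ratio and Wald 95% confidence interval from a logistic
    regression coefficient and its standard error."""
    return (math.exp(beta),
            math.exp(beta - z * se),
            math.exp(beta + z * se))

# Hypothetical coefficient and SE consistent with an OR near 5.3 (CI 2.4-11.8).
or_, lo, hi = odds_ratio_ci(1.67, 0.41)
print(round(or_, 1), round(lo, 1), round(hi, 1))
```

The same transformation applies to every OR in the abstract; an OR below 1 (such as the 0.30 for attitude toward non-active games) corresponds to a negative coefficient.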
NASA Astrophysics Data System (ADS)
WU, Chunhung
2015-04-01
The research built an original logistic regression landslide susceptibility model (abbreviated as or-LRLSM) and a landslide ratio-based logistic regression landslide susceptibility model (abbreviated as lr-LRLSM), compared the performance of the two models, and explained their error sources. The research assumes that the performance of the logistic regression model can be better if the distributions of the landslide ratio and the weighted value of each variable are similar. Landslide ratio is the ratio of landslide area to total area in a specific area and a useful index to evaluate the seriousness of landslide disasters in Taiwan. The research adopted the landslide inventory induced by 2009 Typhoon Morakot in the Chishan watershed, which was the most serious disaster event of the last decade in Taiwan. The research adopted the 20 m grid as the basic unit in building the LRLSM, and six variables, including elevation, slope, aspect, geological formation, accumulated rainfall, and bank erosion, were included in the two models. The six variables were divided into continuous variables, including elevation, slope, and accumulated rainfall, and categorical variables, including aspect, geological formation, and bank erosion, in building the or-LRLSM, while all variables, classified based on landslide ratio, were categorical in building the lr-LRLSM. Because the number of basic units in the Chishan watershed was too large to process with commercial software, the research used random sampling instead of the whole set of basic units, adopting equal proportions of landslide and non-landslide units in the logistic regression analysis. The research took 10 random samples and selected the group with the best Cox & Snell R2 and Nagelkerke R2 values as the database for the following analysis. Based on the best result from the 10 random sampling groups, the or-LRLSM (lr-LRLSM) is significant at the 1% level with Cox & Snell R2 = 0.190 (0.196) and Nagelkerke R2
2013-01-01
Background In Brazil, 99% of malaria cases are concentrated in the Amazon region, with a high level of transmission. The objectives of the study were to use geographic information systems (GIS) analysis and logistic regression as tools to identify and analyse the relative likelihood of malaria infection and its socio-environmental determinants in the Vale do Amanhecer rural settlement, Brazil. Methods A GIS database of georeferenced malaria cases, recorded in 2005, and multiple explanatory data layers was built, based on a multispectral Landsat 5 TM image, a digital map of the settlement blocks, and an SRTM digital elevation model. Satellite imagery was used to map the spatial patterns of land use and cover (LUC) and to derive spectral indices of vegetation density (NDVI) and soil/vegetation humidity (VSHI). A Euclidean distance operator was applied to measure the proximity of domiciles to potential mosquito breeding habitats and gold mining areas. The malaria risk model was generated by multiple logistic regression, in which environmental factors were the independent variables and the number of cases, binarized by a threshold value, was the dependent variable. Results Out of a total of 336 cases of malaria, 133 positive slides were from inhabitants of Road 08, corresponding to 37.60% of the notifications. The southern region of the settlement presented 276 cases and a greater number of domiciles in which more than ten cases per home were notified. Of these, 102 (30.36%) cases were caused by Plasmodium falciparum and 174 (51.79%) by Plasmodium vivax. Malaria risk is highest in the south of the settlement, associated with proximity to gold mining sites, intense land use, high levels of soil/vegetation humidity, and low vegetation density. Conclusions Mid-resolution remote sensing data and GIS-derived distance measures can be successfully combined with digital maps of the housing locations of (non-)infected inhabitants to predict relative
NASA Astrophysics Data System (ADS)
Ozdemir, Adnan; Altural, Tolga
2013-03-01
This study evaluated and compared landslide susceptibility maps produced with three different methods, frequency ratio, weights of evidence, and logistic regression, by using validation datasets. The field surveys performed as part of this investigation mapped the locations of 90 landslides that had been identified in the Sultan Mountains of south-western Turkey. The landslide influence parameters used for this study are geology, relative permeability, land use/land cover, precipitation, elevation, slope, aspect, total curvature, plan curvature, profile curvature, wetness index, stream power index, sediment transportation capacity index, distance to drainage, distance to fault, drainage density, fault density, and spring density maps. The relationships between landslide distributions and these parameters were analysed using the three methods, and the results of these methods were then used to calculate the landslide susceptibility of the entire study area. The accuracy of the final landslide susceptibility maps was evaluated based on the landslides observed during the fieldwork, and the accuracy of the models was evaluated by calculating each model's relative operating characteristic curve. The predictive capability of each model was determined from the area under the relative operating characteristic curve and the areas under the curves obtained using the frequency ratio, logistic regression, and weights of evidence methods are 0.976, 0.952, and 0.937, respectively. These results indicate that the frequency ratio and weights of evidence models are relatively good estimators of landslide susceptibility in the study area. Specifically, the results of the correlation analysis show a high correlation between the frequency ratio and weights of evidence results, and the frequency ratio and logistic regression methods exhibit correlation coefficients of 0.771 and 0.727, respectively. The frequency ratio model is simple, and its input, calculation and output processes are
Woods, Carol M; Oltmanns, Thomas F; Turkheimer, Eric
2008-06-01
Person-fit assessment is used to identify persons who respond aberrantly to a test or questionnaire. In this study, S. P. Reise's (2000) method for evaluating person fit using 2-level logistic regression was applied to 13 personality scales of the Schedule for Nonadaptive and Adaptive Personality (SNAP; L. Clark, 1996) that had been administered to military recruits (N = 2,026). Results revealed significant person-fit heterogeneity and indicated that for 5 SNAP scales (Disinhibition, Entitlement, Exhibitionism, Negative Temperament, and Workaholism), the scale was more discriminating for some people than for others. Possible causes of aberrant responding were explored with several covariates. On all 5 scales, severe pathology emerged as a key influence on responses, and there was evidence of differential test functioning with respect to gender, ethnicity, or both. Other potential sources of aberrancy were carelessness, haphazard responding, or uncooperativeness. Social desirability was not as influential as expected. PMID:18557693
Choi, Seung W; Gibbons, Laura E; Crane, Paul K
2011-03-01
Logistic regression provides a flexible framework for detecting various types of differential item functioning (DIF). Previous efforts extended the framework by using item response theory (IRT) based trait scores, and by employing an iterative process using group-specific item parameters to account for DIF in the trait scores, analogous to purification approaches used in other DIF detection frameworks. The current investigation advances the technique by developing a computational platform integrating both statistical and IRT procedures into a single program. Furthermore, a Monte Carlo simulation approach was incorporated to derive empirical criteria for various DIF statistics and effect size measures. For purposes of illustration, the procedure was applied to data from a questionnaire of anxiety symptoms for detecting DIF associated with age from the Patient-Reported Outcomes Measurement Information System. PMID:21572908
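At the core of logistic-regression DIF detection is a likelihood-ratio test between nested models, with and without a group term, at a fixed trait level. The sketch below simulates an item with uniform DIF and runs that test with a small Newton-method logistic fit; the sample size, effect sizes, and variable names are illustrative assumptions, not the procedure of the cited program.

```python
import numpy as np
from scipy.stats import chi2

def logit_loglik(X, y, iters=25):
    """Fit a logistic regression by Newton's method; return its log-likelihood."""
    X = np.column_stack([np.ones(len(y)), X])
    b = np.zeros(X.shape[1])
    for _ in range(iters):
        p = 1.0 / (1.0 + np.exp(-X @ b))
        H = X.T @ (X * (p * (1 - p))[:, None]) + 1e-9 * np.eye(X.shape[1])
        b += np.linalg.solve(H, X.T @ (y - p))
    p = 1.0 / (1.0 + np.exp(-X @ b))
    return np.sum(y * np.log(p) + (1 - y) * np.log(1 - p))

rng = np.random.default_rng(0)
n = 2000
trait = rng.normal(size=n)           # trait score (IRT-based in the framework)
group = rng.integers(0, 2, n)        # reference vs focal group
logits = 1.2 * trait + 0.8 * group   # simulated uniform DIF: item easier for focal group
y = (rng.random(n) < 1 / (1 + np.exp(-logits))).astype(float)

ll_null = logit_loglik(trait[:, None], y)                    # trait only
ll_full = logit_loglik(np.column_stack([trait, group]), y)   # trait + group (uniform DIF)
lr_stat = 2 * (ll_full - ll_null)                            # ~ chi2(1) under no DIF
p_value = chi2.sf(lr_stat, df=1)
```

Non-uniform DIF would be tested the same way by adding a trait-by-group interaction term and one more degree of freedom.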
Chan, Kung-Sik; Jiao, Feiran; Mikulski, Marek A.; Gerke, Alicia; Guo, Junfeng; Newell, John D; Hoffman, Eric A.; Thompson, Brad; Lee, Chang Hyun; Fuortes, Laurence J.
2015-01-01
Rationale and Objectives We evaluated the role of an automated quantitative computed tomography (CT) scan interpretation algorithm in detecting Interstitial Lung Disease (ILD) and/or emphysema in a sample of elderly subjects with mild lung disease. We hypothesized that the quantification and distributions of CT attenuation values on lung CT, over a subset of the Hounsfield Units (HU) range [−1000 HU, 0 HU], can differentiate early or mild disease from normal lung. Materials and Methods We compared results of quantitative spiral rapid end-exhalation (functional residual capacity; FRC) and end-inhalation (total lung capacity; TLC) CT scan analyses in 52 subjects with radiographic evidence of mild fibrotic lung disease with those in 17 normal subjects. Several CT value distributions were explored, including (i) that from the peripheral lung taken at TLC (with peels at 15 or 65 mm), (ii) the ratio of (i) to that from the core of the lung, and (iii) the ratio of (ii) to its FRC counterpart. We developed a fused-lasso logistic regression model that can automatically identify sub-intervals of [−1000 HU, 0 HU] over which a CT value distribution provides optimal discrimination between abnormal and normal scans. Results The fused-lasso logistic regression model based on (ii) with a 15 mm peel identified the relative frequency of CT values over [−1000, −900] HU and that over [−450, −200] HU as a means of discriminating abnormal versus normal scans, resulting in a zero out-of-sample false positive rate and a 15% false negative rate that was lowered to 12% by pooling information. Conclusions We demonstrated the potential usefulness of this novel quantitative imaging analysis method in discriminating ILD and/or emphysema from normal lungs. PMID:26776294

Robertson, John M.; Soehn, Matthias; Yan Di
2010-05-01
Purpose: Understanding the dose-volume relationship of small bowel irradiation and severe acute diarrhea may help reduce the incidence of this side effect during adjuvant treatment for rectal cancer. Methods and Materials: Consecutive patients treated curatively for rectal cancer were reviewed, and the maximum grade of acute diarrhea was determined. The small bowel was outlined on the treatment planning CT scan, and a dose-volume histogram was calculated for the initial pelvic treatment (45 Gy). Logistic regression models were fitted for varying cutoff-dose levels from 5 to 45 Gy in 5-Gy increments. The model with the highest LogLikelihood was used to develop a cutoff-dose normal tissue complication probability (NTCP) model. Results: There were a total of 152 patients (48% preoperative, 47% postoperative, 5% other), predominantly treated prone (95%) with a three-field technique (94%) and a protracted venous infusion of 5-fluorouracil (78%). Acute Grade 3 diarrhea occurred in 21%. The largest LogLikelihood was found for the cutoff-dose logistic regression model with 15 Gy as the cutoff-dose, although the models for 20 Gy and 25 Gy had similar significance. According to this model, highly significant correlations (p <0.001) between small bowel volumes receiving at least 15 Gy and toxicity exist in the considered patient population. Similar findings applied to both the preoperatively (p = 0.001) and postoperatively irradiated groups (p = 0.001). Conclusion: The incidence of Grade 3 diarrhea was significantly correlated with the volume of small bowel receiving at least 15 Gy using a cutoff-dose NTCP model.
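The cutoff-dose selection step, fitting one logistic model per candidate dose level and keeping the one with the highest log-likelihood, can be sketched as follows. The dose-volume histograms, falloff scales, and coefficients below are synthetic assumptions invented for illustration, not the study's data.

```python
import numpy as np

def fit_logit(x, y, iters=30):
    """Newton-method logistic fit of y on one covariate; returns (coefs, log-likelihood)."""
    X = np.column_stack([np.ones(len(y)), x])
    b = np.zeros(2)
    for _ in range(iters):
        p = 1 / (1 + np.exp(-X @ b))
        H = X.T @ (X * (p * (1 - p))[:, None]) + 1e-9 * np.eye(2)
        b += np.linalg.solve(H, X.T @ (y - p))
    p = 1 / (1 + np.exp(-X @ b))
    return b, np.sum(y * np.log(p) + (1 - y) * np.log(1 - p))

rng = np.random.default_rng(1)
n = 152
volume = rng.uniform(100, 600, n)     # total small-bowel volume, cc (hypothetical)
falloff = rng.uniform(10, 30, n)      # per-patient dose falloff scale, Gy (hypothetical)
# V[d]: cc of small bowel receiving at least d Gy -- a toy decreasing DVH
V = {d: volume * np.exp(-d / falloff) for d in range(5, 50, 5)}
p_tox = 1 / (1 + np.exp(-(-3 + 1.2 * V[15] / 100)))   # risk driven by V15 here
tox = (rng.random(n) < p_tox).astype(float)

# fit one model per candidate cutoff dose; keep the best log-likelihood
loglik = {d: fit_logit(V[d] / 100, tox)[1] for d in V}
best_cutoff = max(loglik, key=loglik.get)
```

Because volumes above neighbouring cutoff doses are highly correlated, adjacent cutoffs often yield similar log-likelihoods, matching the study's observation that the 15, 20, and 25 Gy models were comparable.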
Ho Hoang, Khai-Long; Mombaur, Katja
2015-10-15
Dynamic modeling of the human body is an important tool to investigate the fundamentals of the biomechanics of human movement. To model the human body in terms of a multi-body system, it is necessary to know the anthropometric parameters of the body segments. For young healthy subjects, several data sets exist that are widely used in the research community, e.g. the tables provided by de Leva. No such comprehensive anthropometric parameter set exists for elderly people. It is, however, well known that body proportions change significantly during aging, e.g. due to degenerative effects in the spine, such that parameters for young people cannot be used for realistically simulating the dynamics of elderly people. In this study, regression equations are derived from the inertial parameters, center of mass positions, and body segment lengths provided by de Leva to be adjustable to the changes in proportion of the body parts of male and female humans due to aging. Additional adjustments are made to the reference points of the parameters for the upper body segments, as they are chosen in a more practicable way in the context of creating a multi-body model in a chain structure with the pelvis representing the most proximal segment. PMID:26338096
O'Dwyer, Jean; Morris Downes, Margaret; Adley, Catherine C
2016-02-01
This study analyses the relationship between meteorological phenomena and outbreaks of waterborne-transmitted vero cytotoxin-producing Escherichia coli (VTEC) in the Republic of Ireland over an 8-year period (2005-2012). Data pertaining to the notification of waterborne VTEC outbreaks were extracted from the Computerised Infectious Disease Reporting system, which is administered through the national Health Protection Surveillance Centre as part of the Health Service Executive. Rainfall and temperature data were obtained from the national meteorological office and categorised as cumulative rainfall, heavy rainfall events in the previous 7 days, and mean temperature. Regression analysis was performed using logistic regression (LR). The LR model was significant (p < 0.001), with all independent variables (cumulative rainfall, heavy rainfall, and mean temperature) making a statistically significant contribution to the model. The study has found that rainfall, particularly heavy rainfall in the 7 days preceding an outbreak, is a strong statistical indicator of a waterborne outbreak and that temperature also affects waterborne VTEC outbreak occurrence. PMID:26837828
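Since findings like these rest on the sign and size of logistic-regression coefficients, a compact sketch of extracting odds ratios from such a model may be useful. The weekly data, effect sizes, and units below are fabricated assumptions purely for illustration, not the Irish surveillance data.

```python
import numpy as np

def fit_logit(X, y, iters=30):
    """Newton-method logistic regression; returns coefficients (intercept first)."""
    X1 = np.column_stack([np.ones(len(y)), X])
    b = np.zeros(X1.shape[1])
    for _ in range(iters):
        p = 1 / (1 + np.exp(-X1 @ b))
        H = X1.T @ (X1 * (p * (1 - p))[:, None]) + 1e-9 * np.eye(X1.shape[1])
        b += np.linalg.solve(H, X1.T @ (y - p))
    return b

rng = np.random.default_rng(2)
n = 800                                       # observation weeks (hypothetical)
rain = rng.gamma(2.0, 10.0, n)                # cumulative rainfall, mm
heavy = (rng.random(n) < 0.2).astype(float)   # heavy-rainfall event in prior 7 days
temp = rng.normal(10, 4, n)                   # mean temperature, deg C
logits = -3 + 0.05 * rain + 2.0 * heavy + 0.05 * temp
outbreak = (rng.random(n) < 1 / (1 + np.exp(-logits))).astype(float)

b = fit_logit(np.column_stack([rain / 10, heavy, temp]), outbreak)
# odds ratios: per 10 mm rainfall, per heavy-rain event, per deg C
odds_ratios = np.exp(b[1:])
```

An odds ratio above 1 for the heavy-rainfall indicator corresponds to the study's conclusion that heavy rainfall in the preceding week raises the odds of an outbreak.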
2013-01-01
Background Microarray technology can acquire information about thousands of genes simultaneously. We analyzed published breast cancer microarray databases to predict five-year recurrence and compared the performance of three data mining algorithms, artificial neural networks (ANN), decision trees (DT) and logistic regression (LR), and two composite models, DT-ANN and DT-LR. From the collection of microarray datasets in the Gene Expression Omnibus, four breast cancer datasets were pooled for predicting five-year breast cancer relapse. After data compilation, 757 subjects, 5 clinical variables and 13,452 genetic variables were aggregated. The bootstrap method, Mann–Whitney U test and 20-fold cross-validation were performed to investigate candidate genes with the 100 most significant p-values. The predictive powers of the DT, LR and ANN models were assessed using accuracy and the area under the ROC curve. The associated genes were evaluated using Cox regression. Results The DT models exhibited the lowest predictive power and the poorest extrapolation when applied to the test samples. The ANN models displayed the best predictive power and showed the best extrapolation. The 21 most-associated genes, as determined by integration of each model, were analyzed using Cox regression with a 3.53-fold (95% CI: 2.24-5.58) increased risk of breast cancer five-year recurrence… Conclusions The 21 selected genes can predict breast cancer recurrence. Among these genes, CCNB1, PLK1 and TOP2A are in the cell cycle G2/M DNA damage checkpoint pathway. Oncologists can draw on this genetic information when discussing gene expression profiles related to breast cancer recurrence with patients. PMID:23506640
Allahyari, Elahe; Jafari, Peyman; Bagheri, Zahra
2016-01-01
Objective. The present study uses simulated data to determine the optimal number of response categories for achieving adequate power in an ordinal logistic regression (OLR) model for differential item functioning (DIF) analysis in psychometric research. Methods. A hypothetical ten-item quality of life scale with three, four, and five response categories was simulated. The power and type I error rates of the OLR model for detecting uniform DIF were investigated under different combinations of ability distribution (θ), sample size, sample size ratio, and the magnitude of uniform DIF across reference and focal groups. Results. When θ was distributed identically in the reference and focal groups, increasing the number of response categories from 3 to 5 resulted in an increase of approximately 8% in the power of the OLR model for detecting uniform DIF. The power of OLR was less than 0.36 when the ability distribution in the reference and focal groups was highly skewed to the left and right, respectively. Conclusions. The clearest conclusion from this research is that the minimum number of response categories for DIF analysis using OLR is five. However, the impact of the number of response categories in detecting DIF was lower than might be expected. PMID:27403207
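The simulation design above hinges on drawing item responses from a proportional-odds (cumulative logit) model with a varying number of categories. A small sketch of that sampling step follows; the thresholds, sample size, and ability distribution are invented for illustration.

```python
import numpy as np

def simulate_ordinal(theta, thresholds, rng):
    """Draw ordinal responses from a proportional-odds (cumulative logit) model:
    P(Y <= k) = logistic(thresholds[k] - theta)."""
    cum = 1 / (1 + np.exp(-(np.asarray(thresholds)[None, :] - theta[:, None])))
    u = rng.random((len(theta), 1))
    return (u > cum).sum(axis=1)   # category index 0 .. K-1

rng = np.random.default_rng(3)
theta = rng.normal(size=5000)                            # latent ability scores
y3 = simulate_ordinal(theta, [-1.0, 1.0], rng)           # 3 response categories
y5 = simulate_ordinal(theta, [-2, -0.7, 0.7, 2], rng)    # 5 response categories
```

Repeating such draws for reference and focal groups with shifted thresholds (uniform DIF) and fitting OLR to each replicate yields the power and type I error estimates the study reports.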
NASA Astrophysics Data System (ADS)
Wang, Liang-Jie; Sawada, Kazuhide; Moriguchi, Shuji
2013-01-01
To mitigate the damage caused by landslide disasters, different mathematical models have been applied to predict landslide spatial distribution characteristics. Although some researchers have achieved excellent results around the world, few studies take the spatial resolution of the database into account. Four digital elevation models (DEMs) with resolutions ranging from 2 to 20 m, derived from light detection and ranging (LiDAR) technology, are used to analyze landslide susceptibility in Mizunami City, Gifu Prefecture, Japan. Fifteen landslide-causative factors are considered using a logistic-regression approach to create models for landslide potential analysis. Pre-existing landslide bodies are used to evaluate the performance of the four models. The results revealed that the 20-m model had the highest classification accuracy (71.9%), whereas the 2-m model had the lowest value (68.7%). In the 2-m model, 89.4% of the landslide bodies fit in the medium to very high categories. For the 20-m model, only 83.3% of the landslide bodies were concentrated in the medium to very high classes. When the cell size decreases from 20 to 2 m, the area under the relative operative characteristic increases from 0.68 to 0.77. Therefore, higher-resolution DEMs would provide better results for landslide-susceptibility mapping.
NASA Astrophysics Data System (ADS)
Venkataraman, Kartik; Uddameri, Venkatesh
2012-08-01
The occurrence of elevated levels of arsenic and nitrate in aquifers impacted by agricultural activities is common and can result in adverse health effects in rural areas. Numerous wells located in the Ogallala aquifer in the Southern High Plains of Texas have tested positive for both arsenic and nitrate MCL exceedance. To model the simultaneous exceedance of both chemicals, two types of Logistic Regression (LR) models were developed by (a) treating arsenic and nitrate independently and combining the marginal probabilities of their exceedance, and (b) treating the two exceedances together by using a multinomial model. Influencing variables that are representative of both soil and aquifer properties and for which data were readily available were identified. The predictive capacities of the two models were evaluated using Receiver Operating Characteristics (ROCs), and spatial trends in the predictions were studied. The LR model constructed from the marginal probabilities had lower overall accuracy (59% correct classifications) and was extremely conservative, over-predicting outcomes. In contrast, the multinomial model showed good overall accuracy (79% correct classifications), made the correct predictions 90% of the time when both arsenic and nitrate MCL exceedances were observed, and was a good fit for wells located in agricultural areas. The results of the multinomial model also confirm previous studies that attributed shallow subsurface arsenic to anthropogenic activities. Based on the insights provided by the model, it is recommended that where agricultural areas are concerned, the occurrences of arsenic and nitrate are better evaluated together.
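The contrast between the two modeling strategies, multiplying marginal exceedance probabilities versus fitting one multinomial model over the four joint outcomes, can be sketched with scikit-learn. The covariates, coefficients, and data below are hypothetical stand-ins, not the Ogallala aquifer variables.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(4)
n = 1000
depth = rng.uniform(0, 1, n)    # standardized well depth (hypothetical covariate)
clay = rng.uniform(0, 1, n)     # standardized soil clay content (hypothetical covariate)
X = np.column_stack([depth, clay])
# shallow wells tend to exceed both MCLs, so the two exceedances are correlated
p_as = 1 / (1 + np.exp(-(-1 + 2.5 * (1 - depth))))
p_no3 = 1 / (1 + np.exp(-(-1 + 2.5 * (1 - depth) + 1.0 * clay)))
arsenic = (rng.random(n) < p_as).astype(int)
nitrate = (rng.random(n) < p_no3).astype(int)
joint = arsenic + 2 * nitrate   # 0 = neither, 1 = As only, 2 = NO3 only, 3 = both

# (a) marginal binary models combined under an independence assumption
p_both_indep = (LogisticRegression().fit(X, arsenic).predict_proba(X)[:, 1]
                * LogisticRegression().fit(X, nitrate).predict_proba(X)[:, 1])
# (b) a single multinomial model over the four joint outcomes
m_joint = LogisticRegression(max_iter=1000).fit(X, joint)
p_both_joint = m_joint.predict_proba(X)[:, list(m_joint.classes_).index(3)]
```

When the two exceedances are dependent, as in the study area, the product of marginal probabilities misstates the joint risk, which is why the multinomial formulation classified wells more accurately.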
Botha, J; de Ridder, J H; Potgieter, J C; Steyn, H S; Malan, L
2013-10-01
A recently proposed model for waist circumference cut points (RPWC), driven by increased blood pressure, was demonstrated in an African population. We therefore aimed to validate the RPWC by comparing the RPWC and the Joint Statement Consensus (JSC) models via Logistic Regression (LR) and Neural Networks (NN) analyses. Urban African gender groups (N=171) were stratified according to the JSC and RPWC cut point models. Ultrasound carotid intima media thickness (CIMT), blood pressure (BP) and fasting bloods (glucose, high density lipoprotein (HDL) and triglycerides) were obtained in a well-controlled setting. The RPWC male model (LR ROC AUC: 0.71, NN ROC AUC: 0.71) was practically equal to the JSC model (LR ROC AUC: 0.71, NN ROC AUC: 0.69) in predicting structural vascular disease. Similarly, the female RPWC model (LR ROC AUC: 0.84, NN ROC AUC: 0.82) and JSC model (LR ROC AUC: 0.82, NN ROC AUC: 0.81) equally predicted CIMT as a surrogate marker for structural vascular disease. Odds ratios supported validity, where prediction of CIMT revealed clinical significance, well over 1, for both the JSC and RPWC models in African males and females (OR 3.75-13.98). In conclusion, the proposed RPWC model was substantially validated utilizing linear and non-linear analyses. We therefore propose ethnic-specific WC cut points (African males, ≥90 cm; females, ≥98 cm) to predict a surrogate marker for structural vascular disease. PMID:23934678
Hung, J.; Chaitman, B.R.; Lam, J.; Lesperance, J.; Dupras, G.; Fines, P.; Cherkaoui, O.; Robert, P.; Bourassa, M.G.
1985-08-01
The incremental diagnostic yield of clinical data, exercise ECG, stress thallium scintigraphy, and cardiac fluoroscopy to predict coronary and multivessel disease was assessed in 171 symptomatic men by means of multiple logistic regression analyses. When clinical variables alone were analyzed, chest pain type and age were predictive of coronary disease, whereas chest pain type, age, a family history of premature coronary disease before age 55 years, and abnormal ST-T wave changes on the rest ECG were predictive of multivessel disease. The percentage of patients correctly classified by cardiac fluoroscopy (presence or absence of coronary artery calcification), exercise ECG, and thallium scintigraphy was 9%, 25%, and 50%, respectively, greater than for clinical variables, when the presence or absence of coronary disease was the outcome, and 13%, 25%, and 29%, respectively, when multivessel disease was studied; 5% of patients were misclassified. When the 37 clinical and noninvasive test variables were analyzed jointly, the most significant variable predictive of coronary disease was an abnormal thallium scan and for multivessel disease, the amount of exercise performed. The data from this study provide a quantitative model and confirm previous reports that optimal diagnostic efficacy is obtained when noninvasive tests are ordered sequentially. In symptomatic men, cardiac fluoroscopy is a relatively ineffective test when compared to exercise ECG and thallium scintigraphy.
Feizi, Awat; Aliyari, Roqayeh; Roohafza, Hamidreza
2012-01-01
Objective. The present paper aimed at investigating the association between perceived stress and major life event stressors in the Iranian general population. Methods. In a cross-sectional large-scale community-based study, 4583 people aged 19 and older, living in Isfahan, Iran, were investigated. Logistic quantile regression was used to model perceived stress, measured by the GHQ questionnaire, as the bounded outcome (dependent) variable and as a function of the most important stressful life events as predictor variables, controlling for major lifestyle and sociodemographic factors. This model provides empirical evidence of heterogeneity in the predictors' effects depending on an individual's location on the distribution of perceived stress. Results. The results showed that among the four stressful life events, family conflicts and social problems were most correlated with the level of perceived stress. Higher levels of education were negatively associated with perceived stress, and its coefficients monotonically decrease beyond the 30th percentile. Also, higher levels of physical activity were associated with perception of low levels of stress. The pattern of gender's coefficient over the majority of quantiles implied that females are more affected by stressors. Also, high perceived stress was associated with low or middle levels of income. Conclusions. The results of the current research suggest that in a developing society with a high prevalence of stress, interventions targeted toward promoting financial and social equality, social skills training, and healthy lifestyles may have potential benefits for large parts of the population, most notably females and less educated people. PMID:23091560
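Logistic quantile regression handles a bounded outcome by logit-transforming it onto the real line and fitting quantile (pinball-loss) regressions there. The sketch below shows that idea with a crude optimizer; the score bounds, stressor counts, and effect sizes are invented assumptions, not the GHQ data or the study's estimation method.

```python
import numpy as np
from scipy.optimize import minimize

rng = np.random.default_rng(5)
n = 1000
stressors = rng.integers(0, 5, n).astype(float)   # number of stressful life events
# bounded GHQ-like stress score on (0, 36); data and effects invented for illustration
y = np.clip(8 + 3 * stressors + rng.normal(0, 4, n), 0.5, 35.5)

lo, hi = 0.0, 36.0
z = np.log((y - lo) / (hi - y))          # logit transform maps (lo, hi) to the real line
X = np.column_stack([np.ones(n), stressors])

def fit_quantile(tau):
    """Minimize the pinball (check) loss on the logit scale for quantile tau."""
    loss = lambda b: np.sum(np.maximum(tau * (z - X @ b), (tau - 1) * (z - X @ b)))
    return minimize(loss, np.zeros(2), method="Nelder-Mead").x

b50, b90 = fit_quantile(0.5), fit_quantile(0.9)
# back-transform the fitted median to the original bounded scale
q50 = lo + (hi - lo) / (1 + np.exp(-(X @ b50)))
```

Comparing coefficients across quantiles (here b50 versus b90) is what lets such a model reveal the effect heterogeneity across the stress distribution that the study describes.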
Mumcu Kucuker, Derya; Baskent, Emin Zeki
2015-01-01
Integration of non-wood forest products (NWFPs) into forest management planning has become an increasingly important issue in forestry over the last decade. Among NWFPs, mushrooms are valued due to their medicinal, commercial, high nutritional and recreational importance. Commercial mushroom harvesting also provides important income to local dwellers and contributes to the economic value of regional forests. Sustainable management of these products at the regional scale requires information on their locations in diverse forest settings and the ability to predict and map their spatial distributions over the landscape. This study focuses on modeling the spatial distribution of commercially harvested Lactarius deliciosus and L. salmonicolor mushrooms in the Kızılcasu Forest Planning Unit, Turkey. The best models were developed based on topographic, climatic and stand characteristics, separately through logistic regression analysis using SPSS™. The best topographic model provided better classification success (69.3 %) than the best climatic (65.4 %) and stand (65 %) models. However, the overall best model, with 73 % overall classification success, used a mix of several variables. The best models were integrated into an Arc/Info GIS program to create spatial distribution maps of L. deliciosus and L. salmonicolor in the planning area. Our approach may be useful to predict the occurrence and distribution of other NWFPs and provide a valuable tool for designing silvicultural prescriptions and preparing multiple-use forest management plans. PMID:24821473
NASA Astrophysics Data System (ADS)
Bornaetxea, Txomin; Antigüedad, Iñaki; Ormaetxea, Orbange
2016-04-01
In the Oria river basin (885 km2) shallow landslides are very frequent, producing roadblocks and damage to infrastructure and property and causing major economic losses every year. Considering that the zonation of the territory into different landslide susceptibility levels provides a useful tool for territorial planning and natural risk management, this study has the objective of identifying the places most prone to landslides by applying an objective and reproducible methodology. To do so, a quantitative multivariate methodology, logistic regression, has been used. Fieldwork landslide points and randomly selected stable points have been used along with the independent variables Lithology, Land Use, Distance to transport infrastructure, Altitude, Senoidal Slope and Normalized Difference Vegetation Index (NDVI) to produce a landslide susceptibility map. The model has been validated by the prediction and success rate curves and their corresponding area under the curve (AUC). In addition, the result has been compared to those from two landslide susceptibility models covering the study area that were previously applied at different scales, namely ELSUS1000 version 1 (2013) and the Landslide Susceptibility Map of Gipuzkoa (2007). Validation results show an excellent prediction capacity of the proposed model (AUC 0.962), and the comparisons highlight substantial differences with previous studies.
NASA Astrophysics Data System (ADS)
Esposito, Carlo; Barra, Anna; Evans, Stephen G.; Scarascia Mugnozza, Gabriele; Delaney, Keith
2014-05-01
The study of landslide susceptibility by multivariate statistical methods is based on finding a quantitative relationship between controlling factors and landslide occurrence. Such studies have become popular in the last few decades thanks to the development of geographic information systems (GIS) software and the related improved data management. In this work we applied a statistical approach to an area of high landslide susceptibility, mainly due to its tropical climate and geological-geomorphological setting. The study area is located in the south-east region of Brazil, which has frequently been affected by flood and landslide hazards, especially because of heavy rainfall events during the summer season. In this work we studied a disastrous event that occurred on January 11th and 12th of 2011, which involved Região Serrana (the mountainous region of Rio de Janeiro State) and caused more than 5000 landslides and at least 904 deaths. In order to produce susceptibility maps, we focused our attention on an area of 93.6 km2 that includes Nova Friburgo city. We utilized two different multivariate statistical methods: Logistic Regression (LR), already widely used in applied geosciences, and Random Forest (RF), which has only recently been applied to landslide susceptibility analysis. With reference to each mapping unit, the first method (LR) results in a probability of landslide occurrence, while the second one (RF) gives a prediction in terms of the % of area susceptible to slope failure. With this aim in mind, a landslide inventory map (related to the studied event) has been drawn up through analyses of high-resolution GeoEye satellite images in a GIS environment. Data layers of 11 causative factors have been created and processed in order to be used as continuous numerical or discrete categorical variables in the statistical analysis. In particular, the logistic regression method has frequent difficulties in managing numerical continuous and discrete categorical variables
Mendes, Renata G.; de Souza, César R.; Machado, Maurício N.; Correa, Paulo R.; Di Thommazo-Luporini, Luciana; Arena, Ross; Myers, Jonathan; Pizzolato, Ednaldo B.
2015-01-01
Introduction In coronary artery bypass (CABG) surgery, the common complications are the need for reintubation, prolonged mechanical ventilation (PMV) and death. Thus, a reliable model for the prognostic evaluation of those particular outcomes is a worthwhile pursuit. The existence of such a system would lead to better resource planning, cost reductions and an increased ability to guide preventive strategies. The aim of this study was to compare different methods – logistic regression (LR) and artificial neural networks (ANNs) – in accomplishing this goal. Material and methods Subjects undergoing CABG (n = 1315) were divided into training (n = 1053) and validation (n = 262) groups. The set of independent variables consisted of age, gender, weight, height, body mass index, diabetes, creatinine level, cardiopulmonary bypass, presence of preserved ventricular function, moderate and severe ventricular dysfunction and total number of grafts. The PMV was also an input for the prediction of death. The ability of ANN to discriminate outcomes was assessed using receiver-operating characteristic (ROC) analysis and the results were compared using a multivariate LR. Results The ROC curve areas for LR and ANN models, respectively, were: for reintubation 0.62 (CI: 0.50–0.75) and 0.65 (CI: 0.53–0.77); for PMV 0.67 (CI: 0.57–0.78) and 0.72 (CI: 0.64–0.81); and for death 0.86 (CI: 0.79–0.93) and 0.85 (CI: 0.80–0.91). No differences were observed between models. Conclusions The ANN has similar discriminating power in predicting reintubation, PMV and death outcomes. Thus, both models may be applicable as a predictor for these outcomes in subjects undergoing CABG. PMID:26322087
Jahandideh, Samad; Abdolmaleki, Parviz; Movahedi, Mohammad Mehdi
2010-02-01
Various studies have been reported on the bioeffects of magnetic field exposure; however, no consensus or guideline is available for experimental designs relating to exposure conditions as yet. In this study, logistic regression (LR) and artificial neural networks (ANNs) were used in order to analyze and predict the melatonin excretion patterns in rats exposed to extremely low frequency magnetic fields (ELF-MF). Subsequently, on a database containing 33 experiments, the performances of LR and ANNs were compared through resubstitution and jackknife tests. Predictor variables were the more influential parameters and included frequency, polarization, exposure duration, and strength of the magnetic fields. Also, five performance measures, including accuracy, sensitivity, specificity, Matthews Correlation Coefficient (MCC) and normalized percentage better than random (S), were used to evaluate the performance of the models. LR, as a conventional model, obtained poor prediction performance. Nonetheless, LR distinguished the duration of magnetic fields as a statistically significant parameter. Also, horizontal polarization of magnetic fields, with the highest logit coefficient (or parameter estimate) with a negative sign, was found to be the strongest indicator for experimental designs relating to exposure conditions. This means that each experiment with horizontal polarization of magnetic fields has a higher probability of resulting in a "not changed melatonin level" pattern. On the other hand, ANNs, a more powerful model that had not previously been applied to predicting melatonin excretion patterns in rats exposed to ELF-MF, showed high performance measure values and higher reliability, especially obtaining an MCC value of 0.55 in jackknife tests. The obtained results show that such predictor models are promising and may play a useful role in defining guidelines for experimental designs relating to exposure conditions. In conclusion, analysis of the bioelectromagnetic data could result in
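A jackknife (leave-one-out) evaluation scored with the Matthews Correlation Coefficient can be sketched as follows. The two-feature data and the nearest-centroid stand-in classifier are assumptions for illustration only; the study itself evaluated LR and ANN models.

```python
import numpy as np

def mcc(y_true, y_pred):
    """Matthews correlation coefficient from the 2x2 confusion counts."""
    y_true, y_pred = np.asarray(y_true), np.asarray(y_pred)
    tp = np.sum((y_true == 1) & (y_pred == 1))
    tn = np.sum((y_true == 0) & (y_pred == 0))
    fp = np.sum((y_true == 0) & (y_pred == 1))
    fn = np.sum((y_true == 1) & (y_pred == 0))
    denom = np.sqrt(float((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn)))
    return (tp * tn - fp * fn) / denom if denom else 0.0

rng = np.random.default_rng(6)
n = 33                                   # one row per experiment, as in the study
X = np.vstack([rng.normal(0, 1, (17, 2)), rng.normal(2, 1, (16, 2))])
y = np.array([0] * 17 + [1] * 16)

preds = np.empty(n, int)
for i in range(n):                       # jackknife: hold out one experiment at a time
    mask = np.arange(n) != i
    c0 = X[mask & (y == 0)].mean(axis=0)
    c1 = X[mask & (y == 1)].mean(axis=0)
    preds[i] = int(np.linalg.norm(X[i] - c1) < np.linalg.norm(X[i] - c0))
score = mcc(y, preds)
```

MCC ranges from -1 to 1 with 0 for chance-level prediction, which is why it is a stricter summary than raw accuracy on the small, slightly unbalanced dataset the study used.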
NASA Astrophysics Data System (ADS)
El Kadiri, R.; Sultan, M.; Elbayoumi, T.; Sefry, S.
2013-12-01
Efforts to map the distribution of debris flows, to assess the factors controlling their development, and to identify the areas prone to their development are often hampered by the absence or paucity of appropriate monitoring systems and historical databases and by the inaccessibility of these areas in many parts of the world. We developed methodologies that rely heavily on readily available observations extracted from remote sensing datasets and successfully applied these techniques over the Jazan province, in the Red Sea hills of Saudi Arabia. We first identified debris flows (10,334 locations) from high spatial resolution satellite datasets (e.g., GeoEye, Orbview), and verified a subset of these occurrences in the field. We then constructed a GIS to host the identified debris flow locations together with co-registered relevant data (e.g., lithology, elevation) and derived products (e.g., slope, normalized difference vegetation index, etc.). Spatial analysis of the datasets in the GIS indicated various degrees of correspondence between the distribution of debris flows and various variables (e.g., stream power index, topographic position index, normalized difference vegetation index, distance to stream, flow accumulation, slope and soil weathering index, aspect, elevation), suggesting a causal effect. For example, debris flows were found in areas of high slope, low distance to low stream orders and low vegetation index. To evaluate the extent to which these factors control landslide distribution, we constructed and applied: (1) a stepwise input selection by testing all input combinations to make the final model more compact and effective, (2) a statistic-based binary logistic regression (BLR) model, and (3) a mathematical-based artificial neural network (ANN) model. Only 80% (8267 locations) of the data was used for the construction of each of the models and the remaining samples (2067 locations) were used for accuracy assessment purposes. Results
Diaz, Andres; Rossow, Stephanie; Ciarlet, Max; Marthaler, Douglas
2016-01-01
Rotaviruses (RV) are important causes of diarrhea in animals, especially in domestic animals. Of the 9 RV species, rotavirus A, B, and C (RVA, RVB, and RVC, respectively) have been established as important causes of diarrhea in pigs. The Minnesota Veterinary Diagnostic Laboratory receives swine stool samples from North America to determine the etiologic agents of disease. Between November 2009 and October 2011, 7,508 samples from pigs with diarrhea were submitted to determine if enteric pathogens, including RV, were present in the samples. All samples were tested for RVA, RVB, and RVC by real-time RT-PCR. The majority of the samples (82%) were positive for RVA, RVB, and/or RVC. To better understand the risk factors associated with RV infections in swine diagnostic samples, three-level mixed-effects logistic regression models (3L-MLMs) were used to estimate associations among RV species, age, and geographical variability within the major swine production regions in North America. The conditional odds ratios (cORs) for RVA and RVB detection were lower for 1–3 day old pigs when compared to any other age group. However, the cOR of RVC detection in 1–3 day old pigs was significantly higher (p < 0.001) than in pigs in the 4–20 day old and >55 day old age groups. Furthermore, pigs in the 21–55 day old age group had statistically higher cORs of RV co-detection compared to 1–3 day old pigs (p < 0.001). The 3L-MLMs indicated that RV status was more similar within states than among states or within each region. Our results indicated that 3L-MLMs are a powerful and adaptable tool for handling and analyzing large hierarchical datasets. In addition, our results indicated that, overall, swine RV epidemiology is complex, and RV species are associated with different age groups and vary by region in North America. PMID:27145176
NASA Astrophysics Data System (ADS)
Cama, Mariaelena; Cristi Nicu, Ionut; Conoscenti, Christian; Quénéhervé, Geraldine; Maerker, Michael
2016-04-01
Landslide susceptibility can be defined as the likelihood of a landslide occurring in a given area on the basis of local terrain conditions. In recent decades much research has focused on its evaluation by means of stochastic approaches under the assumption that 'the past is the key to the future', which means that if a model is able to reproduce a known landslide spatial distribution, it will be able to predict the future locations of new (i.e. unknown) slope failures. Among the various stochastic approaches, Binary Logistic Regression (BLR) is one of the most widely used because it expresses susceptibility in probabilistic terms and its results are easily interpretable from a geomorphological point of view. However, very often little importance is given to multicollinearity assessment, whose effect is that the coefficient estimates become unstable, possibly with reversed signs, and therefore difficult to interpret. It should therefore be evaluated each time in order to build a model whose results are geomorphologically sound. In this study the effects of multicollinearity on the predictive performance and robustness of landslide susceptibility models are analyzed. In particular, multicollinearity is estimated by means of the Variance Inflation Factor (VIF), which is also used as a selection criterion for the independent variables (VIF Stepwise Selection) and compared to the more commonly used AIC Stepwise Selection. The robustness of the results is evaluated through 100 replicates of the dataset. The study area selected for this analysis is the Moldavian Plateau, where landslides are among the most frequent geomorphological processes. This area shows an increasing trend of urbanization and a very high cultural heritage potential, being the place of discovery of the largest settlement of the Cucuteni Culture in Eastern Europe (which led to the development of the great Cucuteni-Trypillia complex). Therefore, identifying the areas susceptible to
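VIF-based screening of the kind used in the VIF Stepwise Selection can be sketched with plain NumPy: each predictor is regressed on the others and VIF_j = 1/(1 − R_j²); in practice predictors above a conventional cutoff (often 5 or 10) are dropped. The data here are synthetic, not the study's terrain variables:

```python
import numpy as np

def vif(X):
    """Variance Inflation Factor for each column of predictor matrix X."""
    X = np.asarray(X, dtype=float)
    out = []
    for j in range(X.shape[1]):
        y = X[:, j]
        others = np.delete(X, j, axis=1)
        A = np.column_stack([np.ones(len(y)), others])  # intercept + remaining predictors
        coef, *_ = np.linalg.lstsq(A, y, rcond=None)
        r2 = 1.0 - (y - A @ coef).var() / y.var()       # R^2 of X_j on the others
        out.append(1.0 / (1.0 - r2))
    return np.array(out)

rng = np.random.default_rng(1)
x1 = rng.normal(size=200)
x2 = rng.normal(size=200)
x3 = x1 + 0.05 * rng.normal(size=200)   # nearly collinear with x1
X = np.column_stack([x1, x2, x3])
print(vif(X))  # x1 and x3 show large VIFs; x2 stays near 1
```

A stepwise variant would iteratively drop the column with the largest VIF until all values fall below the cutoff.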
Zhang, Shi-Yan; Lin, Bi-Ding; Li, Bu-Ren
2015-01-01
The purpose of this study was to evaluate the diagnostic efficiency for hepatocellular carcinoma (HCC) with the combined analysis of alpha-l-fucosidase (AFU), alpha-fetoprotein (AFP) and thymidine kinase 1 (TK1). Serum levels of AFU, AFP and TK1 were measured in 116 patients with HCC, 109 patients with benign hepatic diseases, and 104 normal subjects. The diagnostic value was analyzed using the logistic regression equation and receiver operating characteristic (ROC) curves. The statistical distribution of the three tested tumor markers was non-normal in every group (Kolmogorov–Smirnov test, Z = 0.156–0.517, P < 0.001). The serum levels of AFP and TK1 in patients with HCC were significantly higher than those in patients with benign hepatic diseases (Mann–Whitney U test, Z = −8.570 to −5.943, all P < 0.001). However, there was no statistically significant difference in AFU between these two groups (Mann–Whitney U test, Z = −1.820, P = 0.069). The levels of AFU were significantly higher in patients with benign hepatic diseases than in normal subjects (Mann–Whitney U test, Z = −7.984, P < 0.001). ROC curves for patients with HCC versus those without HCC indicated that the optimal cut-off value was 40.80 U/L for AFU, 10.86 μg/L for AFP and 1.92 pmol/L for TK1, respectively. The area under the ROC curve (AUC) was 0.718 for AFU, 0.832 for AFP, 0.773 for TK1 and 0.900 for the combination of the three tumor markers. The combination resulted in a higher Youden index and a sensitivity of 85.3%. The combined detection of serum AFU, AFP and TK1 could play a complementary role in the diagnosis of HCC, and could significantly improve the sensitivity of HCC diagnosis. PMID:25870783
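Optimal cut-off values of the kind quoted above are conventionally found by maximizing the Youden index (J = sensitivity + specificity − 1) along the ROC curve; a minimal scikit-learn sketch on toy marker values (not the study's data):

```python
import numpy as np
from sklearn.metrics import roc_curve, roc_auc_score

# Toy serum-marker values: 0 = non-HCC, 1 = HCC (illustrative, not study data)
y = np.array([0, 0, 0, 0, 1, 1, 1, 1])
marker = np.array([5.0, 8.0, 12.0, 20.0, 18.0, 35.0, 50.0, 60.0])

fpr, tpr, thresholds = roc_curve(y, marker)
j = tpr - fpr                     # Youden index at each candidate cut-off
best = thresholds[np.argmax(j)]   # cut-off maximizing sensitivity + specificity - 1
auc = roc_auc_score(y, marker)
print(f"AUC = {auc:.3f}, optimal cut-off = {best}")
```

Combining markers, as in the study, amounts to using the fitted logistic-regression probability in place of a single marker value as the score passed to `roc_curve`.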
Unitary Response Regression Models
ERIC Educational Resources Information Center
Lipovetsky, S.
2007-01-01
The dependent variable in a regular linear regression is a numerical variable, and in a logistic regression it is a binary or categorical variable. In these models the dependent variable has varying values. However, there are problems yielding an identity output of a constant value which can also be modelled in a linear or logistic regression with…
ERIC Educational Resources Information Center
Tipton, Elizabeth; Pustejovsky, James E.
2015-01-01
Randomized experiments are commonly used to evaluate the effectiveness of educational interventions. The goal of the present investigation is to develop small-sample corrections for multiple contrast hypothesis tests (i.e., F-tests) such as the omnibus test of meta-regression fit or a test for equality of three or more levels of a categorical…
ERIC Educational Resources Information Center
Thatcher, Greg W.; Henson, Robin K.
This study examined research in training and development to determine effect size reporting practices. It focused on the reporting of corrected effect sizes in research articles using multiple regression analyses. When possible, the researchers calculated corrected effect sizes and determined whether the associated shrinkage could have impacted researcher…
A new method for dealing with measurement error in explanatory variables of regression models.
Freedman, Laurence S; Fainberg, Vitaly; Kipnis, Victor; Midthune, Douglas; Carroll, Raymond J
2004-03-01
We introduce a new method, moment reconstruction, of correcting for measurement error in covariates in regression models. The central idea is similar to regression calibration in that the values of the covariates that are measured with error are replaced by "adjusted" values. In regression calibration the adjusted value is the expectation of the true value conditional on the measured value. In moment reconstruction the adjusted value is the variance-preserving empirical Bayes estimate of the true value conditional on the outcome variable. The adjusted values thereby have the same first two moments and the same covariance with the outcome variable as the unobserved "true" covariate values. We show that moment reconstruction is equivalent to regression calibration in the case of linear regression, but leads to different results for logistic regression. For case-control studies with logistic regression and covariates that are normally distributed within cases and controls, we show that the resulting estimates of the regression coefficients are consistent. In simulations we demonstrate that for logistic regression, moment reconstruction carries less bias than regression calibration, and for case-control studies is superior in mean-square error to the standard regression calibration approach. Finally, we give an example of the use of moment reconstruction in linear discriminant analysis and a nonstandard problem where we wish to adjust a classification tree for measurement error in the explanatory variables. PMID:15032787
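Under the classical additive error model the regression-calibration substitution has a familiar closed form (a standard textbook result, stated here for orientation rather than taken from the paper): with $W = X + U$, where $U \sim N(0, \sigma_U^2)$ is independent of $X \sim N(\mu_X, \sigma_X^2)$,

```latex
E[X \mid W] = \mu_X + \frac{\sigma_X^2}{\sigma_X^2 + \sigma_U^2}\,(W - \mu_X)
```

so regression calibration shrinks the measured value toward the mean by the reliability ratio $\sigma_X^2/(\sigma_X^2 + \sigma_U^2)$, whereas moment reconstruction instead rescales the adjusted values so that their first two moments, and their covariance with the outcome, match those of the unobserved $X$.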
NASA Technical Reports Server (NTRS)
Duda, David P.; Minnis, Patrick
2009-01-01
Straightforward application of the Schmidt-Appleman contrail formation criteria to diagnose persistent contrail occurrence from numerical weather prediction data is hindered by significant bias errors in the upper tropospheric humidity. Logistic models of contrail occurrence have been proposed to overcome this problem, but basic questions remain about how random measurement error may affect their accuracy. A set of 5000 synthetic contrail observations is created to study the effects of random error in these probabilistic models. The simulated observations are based on distributions of temperature, humidity, and vertical velocity derived from Advanced Regional Prediction System (ARPS) weather analyses. The logistic models created from the simulated observations were evaluated using two common statistical measures of model accuracy, the percent correct (PC) and the Hanssen-Kuipers discriminant (HKD). To convert the probabilistic results of the logistic models into a dichotomous yes/no choice suitable for the statistical measures, two critical probability thresholds are considered. The HKD scores are higher when the climatological frequency of contrail occurrence is used as the critical threshold, while the PC scores are higher when the critical probability threshold is 0.5. For both thresholds, typical random errors in temperature, relative humidity, and vertical velocity are found to be small enough to allow for accurate logistic models of contrail occurrence. The accuracy of the models developed from synthetic data is over 85 percent for both the prediction of contrail occurrence and non-occurrence, although in practice, larger errors would be anticipated.
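The two verification scores used above can be computed directly from a 2 × 2 contingency table of forecasts versus observations; a minimal sketch with hypothetical counts (not the study's results):

```python
def pc_and_hkd(hits, misses, false_alarms, correct_negs):
    """Percent correct (PC) and Hanssen-Kuipers discriminant (HKD)
    from a dichotomous forecast contingency table."""
    n = hits + misses + false_alarms + correct_negs
    pc = (hits + correct_negs) / n
    # HKD = probability of detection - probability of false detection
    hkd = hits / (hits + misses) - false_alarms / (false_alarms + correct_negs)
    return pc, hkd

# Hypothetical verification counts for contrail-occurrence forecasts
pc, hkd = pc_and_hkd(hits=420, misses=80, false_alarms=120, correct_negs=380)
print(f"PC = {pc:.2f}, HKD = {hkd:.2f}")  # PC = 0.80, HKD = 0.60
```

The choice of critical probability threshold (0.5 versus the climatological frequency) determines how the logistic model's probabilities are converted into the yes/no entries of this table.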
ERIC Educational Resources Information Center
Gómez-Benito, Juana; Hidalgo, Maria Dolores; Zumbo, Bruno D.
2013-01-01
The objective of this article was to find an optimal decision rule for identifying polytomous items with large or moderate amounts of differential functioning. The effectiveness of combining statistical tests with effect size measures was assessed using logistic discriminant function analysis and two effect size measures: R[superscript 2] and…
NASA Technical Reports Server (NTRS)
Duda, David P.; Minnis, Patrick
2009-01-01
Previous studies have shown that probabilistic forecasting may be a useful method for predicting persistent contrail formation. A probabilistic forecast to accurately predict contrail formation over the contiguous United States (CONUS) is created by using meteorological data based on hourly meteorological analyses from the Advanced Regional Prediction System (ARPS) and from the Rapid Update Cycle (RUC) as well as GOES water vapor channel measurements, combined with surface and satellite observations of contrails. Two groups of logistic models were created. The first group of models (SURFACE models) is based on surface-based contrail observations supplemented with satellite observations of contrail occurrence. The second group of models (OUTBREAK models) is derived from a selected subgroup of satellite-based observations of widespread persistent contrails. The mean accuracies for both the SURFACE and OUTBREAK models typically exceeded 75 percent when based on the RUC or ARPS analysis data, but decreased when the logistic models were derived from ARPS forecast data.
NASA Astrophysics Data System (ADS)
Roşca, S.; Bilaşco, Ş.; Petrea, D.; Fodorean, I.; Vescan, I.; Filip, S.; Măguţ, F.-L.
2015-11-01
The existence of a large number of GIS models for the identification of landslide occurrence probability makes the selection of a specific one difficult. The present study focuses on the application of two quantitative models: the logistic and the BSA models. The comparative analysis of the results aims at identifying the most suitable model. The territory of the Niraj Mic Basin (87 km²) is characterised by a wide variety of landforms with diverse morphometric, morphographic and geological characteristics, as well as by a high complexity of land use types where active landslides exist. For this reason it serves as the test area for applying the two models and comparing their results. The large complexity of input variables is illustrated by 16 factors represented as 72 dummy variables, analysed on the basis of their importance within the model structures. Testing the statistical significance of each variable reduced the number of dummy variables to 12 that were considered significant for the test area within the logistic model, whereas for the BSA model all the variables were employed. The predictive power of the models was tested through the identification of the area under the ROC curve, which indicated good accuracy (AUROC = 0.86 for the testing area) and predictability of the logistic model (AUROC = 0.63 for the validation area).
Black, L. E.; Brion, G. M.; Freitas, S. J.
2007-01-01
Predicting the presence of enteric viruses in surface waters is a complex modeling problem. Multiple water quality parameters that indicate the presence of human fecal material, the load of fecal material, and the amount of time fecal material has been in the environment are needed. This paper presents the results of a multiyear study of raw-water quality at the inlet of a potable-water plant that related 17 physical, chemical, and biological indices to the presence of enteric viruses as indicated by cytopathic changes in cell cultures. It was found that several simple multivariate logistic regression models could be fitted that reliably identified observations of the presence or absence of total culturable virus. The best models developed combined a fecal age indicator (the atypical coliform [AC]/total coliform [TC] ratio), the detectable presence of a human-associated sterol (epicoprostanol) to indicate the fecal source, and one of several fecal load indicators (the levels of Giardia species cysts, coliform bacteria, and coprostanol). The best fit to the data was found when the AC/TC ratio, the presence of epicoprostanol, and the density of fecal coliform bacteria were input into a simple multivariate logistic regression equation, resulting in 84.5% and 78.6% accuracies for the identification of the presence and absence of total culturable virus, respectively. The AC/TC ratio was the most influential input variable in all of the models generated, but producing the best prediction required additional input related to the fecal source and the fecal load. The potential for replacing microbial indicators of fecal load with levels of coprostanol was proposed and evaluated by multivariate logistic regression modeling for the presence and absence of virus. PMID:17468270
Chung, Doo Yong; Cho, Kang Su; Lee, Dae Hun; Han, Jang Hee; Kang, Dong Hyuk; Jung, Hae Do; Kown, Jong Kyou; Ham, Won Sik; Choi, Young Deuk; Lee, Joo Yong
2015-01-01
Purpose This study was conducted to evaluate colic pain as a pretreatment prognostic factor that can influence ureter stone clearance and to estimate the probability of stone-free status in shock wave lithotripsy (SWL) patients with a ureter stone. Materials and Methods We retrospectively reviewed the medical records of 1,418 patients who underwent their first SWL between 2005 and 2013. Among these patients, 551 had a ureter stone measuring 4–20 mm and were thus eligible for our analyses. Colic pain as the chief complaint was defined as subjective flank pain reported during history taking or physical examination. Propensity scores for colic pain were calculated for each patient using multivariate logistic regression based upon the following covariates: age, maximal stone length (MSL), and mean stone density (MSD). Each factor was evaluated as a predictor of stone-free status by Bayesian and non-Bayesian logistic regression models. Results After propensity-score matching, 217 patients were extracted in each group from the total patient cohort. There were no statistical differences in the variables used in propensity-score matching. One-session success and stone-free rates were also higher in the painful group (73.7% and 71.0%, respectively) than in the painless group (63.6% and 60.4%, respectively). In multivariate non-Bayesian and Bayesian logistic regression models, a painful stone, shorter MSL, and lower MSD were significant factors for one-session stone-free status in patients who underwent SWL. Conclusions Colic pain in patients with ureter calculi was, together with MSL and MSD, a significant predictor of one-session stone-free status after SWL. PMID:25902059
ERIC Educational Resources Information Center
Davidson, J. Cody
2016-01-01
Mathematics is the most common subject area of remedial need, and the majority of remedial math students never pass a college-level credit-bearing math class. The majority of studies that investigate this phenomenon are conducted at community colleges and use some type of regression model; however, none have used a continuation ratio model. The…
Kjelstrom, L.C.
1995-01-01
Previously developed U.S. Geological Survey regional regression models of runoff and 11 chemical constituents were evaluated to assess their suitability for use in urban areas in Boise and Garden City. Data collected in the study area were used to develop adjusted regional models of storm-runoff volumes and mean concentrations and loads of chemical oxygen demand, dissolved and suspended solids, total nitrogen and total ammonia plus organic nitrogen as nitrogen, total and dissolved phosphorus, and total recoverable cadmium, copper, lead, and zinc. Explanatory variables used in these models were drainage area, impervious area, land-use information, and precipitation data. Mean annual runoff volume and loads at the five outfalls were estimated from 904 individual storms during 1976 through 1993. Two methods were used to compute individual storm loads. The first method used adjusted regional models of storm loads and the second used adjusted regional models for mean concentration and runoff volume. For large storms, the first method seemed to produce excessively high loads for some constituents and the second method provided more reliable results for all constituents except suspended solids. The first method provided more reliable results for large storms for suspended solids.
Asquith, William H.; Roussel, Meghan C.
2009-01-01
Annual peak-streamflow frequency estimates are needed for flood-plain management; for objective assessment of flood risk; for cost-effective design of dams, levees, and other flood-control structures; and for design of roads, bridges, and culverts. Annual peak-streamflow frequency represents the peak streamflow for nine recurrence intervals of 2, 5, 10, 25, 50, 100, 200, 250, and 500 years. Common methods for estimation of peak-streamflow frequency for ungaged or unmonitored watersheds are regression equations for each recurrence interval developed for one or more regions; such regional equations are the subject of this report. The method is based on analysis of annual peak-streamflow data from U.S. Geological Survey streamflow-gaging stations (stations). Beginning in 2007, the U.S. Geological Survey, in cooperation with the Texas Department of Transportation and in partnership with Texas Tech University, began a 3-year investigation concerning the development of regional equations to estimate annual peak-streamflow frequency for undeveloped watersheds in Texas. The investigation focuses primarily on 638 stations with 8 or more years of data from undeveloped watersheds and other criteria. The general approach is explicitly limited to the use of L-moment statistics, which are used in conjunction with a technique of multi-linear regression referred to as PRESS minimization. The approach used to develop the regional equations, which was refined during the investigation, is referred to as the 'L-moment-based, PRESS-minimized, residual-adjusted approach'. For the approach, seven unique distributions are fit to the sample L-moments of the data for each of 638 stations and trimmed means of the seven results of the distributions for each recurrence interval are used to define the station specific, peak-streamflow frequency. As a first iteration of regression, nine weighted-least-squares, PRESS-minimized, multi-linear regression equations are computed using the watershed
Nepal, Reecha; Spencer, Joanna; Bhogal, Guneet; Nedunuri, Amulya; Poelman, Thomas; Kamath, Thejas; Chung, Edwin; Kantardjieff, Katherine; Gottlieb, Andrea; Lustig, Brooke
2015-01-01
A working example of relative solvent accessibility (RSA) prediction for proteins is presented. Novel logistic regression models with various qualitative descriptors that include amino acid type and quantitative descriptors that include 20- and six-term sequence entropy have been built and validated. A domain-complete learning set of over 1300 proteins is used to fit initial models with various sequence homology descriptors as well as query residue qualitative descriptors. Homology descriptors are derived from BLASTp sequence alignments, whereas the RSA values are determined directly from the crystal structure. The logistic regression models are fitted using dichotomous responses indicating buried or accessible solvent, with binary classifications obtained from the RSA values. The fitted models determine binary predictions of residue solvent accessibility with accuracies comparable to other less computationally intensive methods using the standard RSA threshold criteria 20 and 25% as solvent accessible. When an additional non-homology descriptor describing Lobanov–Galzitskaya residue disorder propensity is included, incremental improvements in accuracy are achieved with 25% threshold accuracies of 76.12 and 74.79% for the Manesh-215 and CASP(8+9) test sets, respectively. Moreover, the described software and the accompanying learning and validation sets allow students and researchers to explore the utility of RSA prediction with simple, physically intuitive models in any number of related applications. PMID:26664348
Keith, Scott W.; Allison, David B.
2014-01-01
This paper details the design, evaluation, and implementation of a framework for detecting and modeling non-linearity between a binary outcome and a continuous predictor variable adjusted for covariates in complex samples. The framework provides familiar-looking parameterizations of output in terms of linear slope coefficients and odds ratios. Estimation methods focus on maximum likelihood optimization of piecewise linear free-knot splines formulated as B-splines. Correctly specifying the optimal number and positions of the knots improves the model, but is marked by computational intensity and numerical instability. Our inference methods utilize both parametric and non-parametric bootstrapping. Unlike other non-linear modeling packages, this framework is designed to incorporate multistage survey sample designs common to nationally representative datasets. We illustrate the approach and evaluate its performance in specifying the correct number of knots under various conditions with an example using body mass index (BMI, kg/m2) and the complex multistage sampling design from the Third National Health and Nutrition Examination Survey to simulate binary mortality outcomes data having realistic non-linear sample-weighted risk associations with BMI. BMI and mortality data provide a particularly apt example and area of application since BMI is commonly recorded in large health surveys with complex designs, often categorized for modeling, and non-linearly related to mortality. When complex sample design considerations were ignored, our method was generally similar to or more accurate than two common model selection procedures, Schwarz’s Bayesian Information Criterion (BIC) and Akaike’s Information Criterion (AIC), in terms of correctly selecting the correct number of knots. Our approach provided accurate knot selections when complex sampling weights were incorporated, while AIC and BIC were not effective under these conditions. PMID:25610831
Robertson, D.M.; Saad, D.A.; Heisey, D.M.
2006-01-01
Various approaches are used to subdivide large areas into regions containing streams that have similar reference or background water quality and that respond similarly to different factors. For many applications, such as establishing reference conditions, it is preferable to use physical characteristics that are not affected by human activities to delineate these regions. However, most approaches, such as ecoregion classifications, rely on land use to delineate regions or have difficulties compensating for the effects of land use. Land use not only directly affects water quality, but it is often correlated with the factors used to define the regions. In this article, we describe modifications to SPARTA (spatial regression-tree analysis), a relatively new approach applied to water-quality and environmental characteristic data to delineate zones with similar factors affecting water quality. In this modified approach, land-use-adjusted (residualized) water quality and environmental characteristics are computed for each site. Regression-tree analysis is applied to the residualized data to determine the most statistically important environmental characteristics describing the distribution of a specific water-quality constituent. Geographic information for small basins throughout the study area is then used to subdivide the area into relatively homogeneous environmental water-quality zones. For each zone, commonly used approaches are subsequently used to define its reference water quality and how its water quality responds to changes in land use. SPARTA is used to delineate zones of similar reference concentrations of total phosphorus and suspended sediment throughout the upper Midwestern part of the United States. © 2006 Springer Science+Business Media, Inc.
Trashball: A Logistic Regression Classroom Activity
ERIC Educational Resources Information Center
Morrell, Christopher H.; Auer, Richard E.
2007-01-01
In the early 1990s, the National Science Foundation funded many research projects for improving statistical education. Many of these stressed the need for classroom activities that illustrate important issues of designing experiments, generating quality data, fitting models, and performing statistical tests. Our paper describes such an activity on…
A Latent Transition Model with Logistic Regression
ERIC Educational Resources Information Center
Chung, Hwan; Walls, Theodore A.; Park, Yousung
2007-01-01
Latent transition models increasingly include covariates that predict prevalence of latent classes at a given time or transition rates among classes over time. In many situations, the covariate of interest may be latent. This paper describes an approach for handling both manifest and latent covariates in a latent transition model. A Bayesian…
Technology Transfer Automated Retrieval System (TEKTRAN)
A method of accounting for differences in variation in components of test-day milk production records was developed. This method could improve the accuracy of genetic evaluations. A random regression model is used to analyze the data, then a transformation is applied to the random regression coeffic...
Bond, H S; Sullivan, S G; Cowling, B J
2016-06-01
Influenza vaccination is the most practical means available for preventing influenza virus infection and is widely used in many countries. Because vaccine components and circulating strains frequently change, it is important to continually monitor vaccine effectiveness (VE). The test-negative design is frequently used to estimate VE. In this design, patients meeting the same clinical case definition are recruited and tested for influenza; those who test positive are the cases and those who test negative form the comparison group. When determining VE in these studies, the typical approach has been to use logistic regression, adjusting for potential confounders. Because vaccine coverage and influenza incidence change throughout the season, time is included among these confounders. While most studies use unconditional logistic regression, adjusting for time, an alternative approach is to use conditional logistic regression, matching on time. Here, we used simulation data to examine the potential for both regression approaches to permit accurate and robust estimates of VE. In situations where vaccine coverage changed during the influenza season, the conditional model and unconditional models adjusting for categorical week and using a spline function for week provided more accurate estimates. We illustrated the two approaches on data from a test-negative study of influenza VE against hospitalization in children in Hong Kong which resulted in the conditional logistic regression model providing the best fit to the data. PMID:26732691
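In a test-negative study, the unadjusted logistic-regression estimate of VE reduces to the familiar 2 × 2 odds ratio, with VE = (1 − OR) × 100%; a minimal sketch with hypothetical counts (the adjusted and conditional models discussed above additionally control for, or match on, calendar time):

```python
def vaccine_effectiveness(vacc_pos, unvacc_pos, vacc_neg, unvacc_neg):
    """VE from a test-negative 2x2 table: cases are test-positive,
    controls test-negative; OR compares the odds of vaccination."""
    odds_ratio = (vacc_pos / unvacc_pos) / (vacc_neg / unvacc_neg)
    return (1.0 - odds_ratio) * 100.0

# Hypothetical counts: 30 of 130 influenza-positive patients vaccinated,
# 80 of 180 test-negative patients vaccinated
ve = vaccine_effectiveness(vacc_pos=30, unvacc_pos=100, vacc_neg=80, unvacc_neg=100)
print(f"VE = {ve:.1f}%")  # VE = 62.5%
```

The conditional approach computes, in effect, a time-stratified version of this odds ratio, which is why it remains accurate when vaccine coverage shifts during the season.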
Foster, Guy M.; Graham, Jennifer L.
2016-01-01
The Kansas River is a primary source of drinking water for about 800,000 people in northeastern Kansas. Source-water supplies are treated by a combination of chemical and physical processes to remove contaminants before distribution. Advanced notification of changing water-quality conditions and cyanobacteria and associated toxin and taste-and-odor compounds provides drinking-water treatment facilities time to develop and implement adequate treatment strategies. The U.S. Geological Survey (USGS), in cooperation with the Kansas Water Office (funded in part through the Kansas State Water Plan Fund), and the City of Lawrence, the City of Topeka, the City of Olathe, and Johnson County Water One, began a study in July 2012 to develop statistical models at two Kansas River sites located upstream from drinking-water intakes. Continuous water-quality monitors have been operated and discrete-water quality samples have been collected on the Kansas River at Wamego (USGS site number 06887500) and De Soto (USGS site number 06892350) since July 2012. Continuous and discrete water-quality data collected during July 2012 through June 2015 were used to develop statistical models for constituents of interest at the Wamego and De Soto sites. Logistic models to continuously estimate the probability of occurrence above selected thresholds were developed for cyanobacteria, microcystin, and geosmin. Linear regression models to continuously estimate constituent concentrations were developed for major ions, dissolved solids, alkalinity, nutrients (nitrogen and phosphorus species), suspended sediment, indicator bacteria (Escherichia coli, fecal coliform, and enterococci), and actinomycetes bacteria. These models will be used to provide real-time estimates of the probability that cyanobacteria and associated compounds exceed thresholds and of the concentrations of other water-quality constituents in the Kansas River. The models documented in this report are useful for characterizing changes
Correlation and simple linear regression.
Eberly, Lynn E
2007-01-01
This chapter highlights important steps in using correlation and simple linear regression to address scientific questions about the association of two continuous variables with each other. These steps include estimation and inference, assessing model fit, the connection between regression and ANOVA, and study design. Examples in microbiology are used throughout. This chapter provides a framework that is helpful in understanding more complex statistical techniques, such as multiple linear regression, linear mixed effects models, logistic regression, and proportional hazards regression. PMID:18450049
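The estimation steps summarized above can be sketched in a few lines of NumPy on toy data (the chapter's microbiology examples are not reproduced here):

```python
import numpy as np

rng = np.random.default_rng(42)
x = rng.uniform(0, 10, size=50)
y = 2.0 + 0.5 * x + rng.normal(scale=0.5, size=50)  # true intercept 2, slope 0.5

r = np.corrcoef(x, y)[0, 1]                          # Pearson correlation
slope = np.cov(x, y, ddof=1)[0, 1] / np.var(x, ddof=1)
intercept = y.mean() - slope * x.mean()              # least-squares line
print(f"r = {r:.3f}, fitted line: y = {intercept:.2f} + {slope:.2f}x")
```

The identity slope = r · (s_y/s_x) connects the two quantities, which is the link between correlation and regression the chapter develops.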
2011-01-01
Background Dementia and cognitive impairment associated with aging are a major medical and social concern. Neuropsychological testing is a key element in the diagnostic procedures for Mild Cognitive Impairment (MCI), but presently has limited value in the prediction of progression to dementia. We advance the hypothesis that newer statistical classification methods derived from data mining and machine learning, such as Neural Networks, Support Vector Machines and Random Forests, can improve the accuracy, sensitivity and specificity of predictions obtained from neuropsychological testing. Seven nonparametric classifiers derived from data mining methods (Multilayer Perceptron Neural Networks, Radial Basis Function Neural Networks, Support Vector Machines, CART, CHAID and QUEST Classification Trees, and Random Forests) were compared to three traditional classifiers (Linear Discriminant Analysis, Quadratic Discriminant Analysis and Logistic Regression) in terms of overall classification accuracy, specificity, sensitivity, area under the ROC curve and Press' Q. Model predictors were 10 neuropsychological tests currently used in the diagnosis of dementia. Statistical distributions of classification parameters obtained from a 5-fold cross-validation were compared using Friedman's nonparametric test. Results Press' Q test showed that all classifiers performed better than chance alone (p < 0.05). Support Vector Machines showed the largest overall classification accuracy (median (Me) = 0.76) and area under the ROC curve (Me = 0.90). However, this method showed high specificity (Me = 1.0) but low sensitivity (Me = 0.3). Random Forests ranked second in overall accuracy (Me = 0.73), with a high area under the ROC curve (Me = 0.73), specificity (Me = 0.73) and sensitivity (Me = 0.64). Linear Discriminant Analysis also showed acceptable overall accuracy (Me = 0.66), with an acceptable area under the ROC curve (Me = 0.72), specificity (Me = 0.66) and sensitivity (Me = 0.64). The remaining classifiers showed
Killeen, Peter R
2015-07-01
The generalized matching law (GML) is reconstructed as a logistic regression equation that privileges no particular value of the sensitivity parameter, a. That value will often approach 1 due to the feedback that drives switching that is intrinsic to most concurrent schedules. A model of that feedback reproduced some features of concurrent data. The GML is a law only in the strained sense that any equation that maps data is a law. The machine under the hood of matching is in all likelihood the very law that was displaced by the Matching Law. It is now time to return the Law of Effect to centrality in our science. PMID:25988932
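The GML's regression form can be made concrete. A minimal sketch with assumed parameter values (sensitivity a = 0.9, bias b = 1.2, illustrative only), recovering both from simulated log behavior/reinforcer ratios:

```python
import numpy as np

# Generalized matching law in regression form:
#   log(B1/B2) = a * log(R1/R2) + log(b)
# Simulated choice data with assumed sensitivity a = 0.9 and bias b = 1.2.
rng = np.random.default_rng(1)
log_r = np.linspace(-2, 2, 50)                      # log reinforcer ratios
log_behavior = 0.9 * log_r + np.log(1.2) + rng.normal(0, 0.1, 50)

# Least-squares fit recovers sensitivity (slope) and bias (intercept).
a_hat, log_b_hat = np.polyfit(log_r, log_behavior, 1)
print(f"sensitivity a = {a_hat:.2f}, bias b = {np.exp(log_b_hat):.2f}")
```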
Jiang, Honghua; Kulkarni, Pandurang M; Mallinckrodt, Craig H; Shurzinske, Linda; Molenberghs, Geert; Lipkovich, Ilya
2015-01-01
The benefits of adjusting for baseline covariates are not as straightforward with repeated binary responses as with continuous response variables. Therefore, in this study, we compared different methods for analyzing repeated binary data through simulations when the outcome at the study endpoint is of interest. Methods compared included the chi-square test, Fisher's exact test, covariate-adjusted/unadjusted logistic regression (Adj.logit/Unadj.logit), covariate-adjusted/unadjusted generalized estimating equations (Adj.GEE/Unadj.GEE), and covariate-adjusted/unadjusted generalized linear mixed models (Adj.GLMM/Unadj.GLMM). All these methods preserved the type I error close to the nominal level. Covariate-adjusted methods improved power compared with the unadjusted methods because of the increased treatment effect estimates, especially when the correlation between the baseline and outcome was strong, even though there was an apparent increase in standard errors. Results of the chi-square test were identical to those for the unadjusted logistic regression. Fisher's exact test was the most conservative regarding the type I error rate and also had the lowest power. Without missing data, there was no gain in using a repeated measures approach over a simple logistic regression at the final time point. Analysis of results from five phase III diabetes trials of the same compound was consistent with the simulation findings. Therefore, covariate-adjusted analysis is recommended for repeated binary data when the study endpoint is of interest. PMID:25866149
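The "increased treatment effect estimates" under adjustment reflect the non-collapsibility of the odds ratio. A simplified single-timepoint simulation (hypothetical effect sizes, not the paper's design) illustrates it:

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

# Randomized treatment t, prognostic baseline covariate b.
# True conditional model: logit p = 0.5*t + 1.5*b.
rng = np.random.default_rng(0)
n = 20000
t = rng.binomial(1, 0.5, n)
b = rng.normal(size=n)
y = rng.binomial(1, 1 / (1 + np.exp(-(0.5 * t + 1.5 * b))))

# C=1e6 makes the fit effectively unpenalized.
unadj = LogisticRegression(C=1e6, max_iter=1000).fit(t[:, None], y).coef_[0, 0]
adj = LogisticRegression(C=1e6, max_iter=1000).fit(
    np.column_stack([t, b]), y).coef_[0, 0]

# The marginal (unadjusted) log odds ratio is attenuated relative to the
# conditional (adjusted) one, even though treatment is randomized.
print(f"unadjusted {unadj:.2f} < adjusted {adj:.2f}")
```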
NASA Astrophysics Data System (ADS)
Wu, Chunhung
2016-04-01
Few studies have discussed the applicability of statistical landslide susceptibility (LS) models to extreme rainfall-induced landslide events. This research focuses on the comparison and applicability of LS models based on four methods, including landslide ratio-based logistic regression (LRBLR), frequency ratio (FR), weight of evidence (WOE), and instability index (II) methods, in an extreme rainfall-induced landslide case. The landslide inventory in the Chishan river watershed, Southwestern Taiwan, after 2009 Typhoon Morakot is the main material in this research. The Chishan river watershed is a tributary of the Kaoping river watershed, which is a landslide- and erosion-prone watershed with an annual average suspended load of 3.6×10⁷ MT/yr (ranking 11th in the world). Typhoon Morakot struck Southern Taiwan from Aug. 6-10 in 2009 and dumped nearly 2,000 mm of rainfall on the Chishan river watershed. The 24-hour, 48-hour, and 72-hour accumulated rainfall in the Chishan river watershed exceeded the 200-year return period accumulated rainfall. 2,389 landslide polygons in the Chishan river watershed were extracted from SPOT 5 images after 2009 Typhoon Morakot. The total landslide area is around 33.5 km², equal to a landslide ratio of 4.1%. The main landslide types based on Varnes' (1978) classification are rotational and translational slides. The two characteristics of an extreme rainfall-induced landslide event are dense landslide distribution and a large share of downslope landslide areas owing to headward erosion and bank erosion during flooding. The area of downslope landslides in the Chishan river watershed after 2009 Typhoon Morakot is 3.2 times that of upslope landslide areas. The prediction accuracy of LS models based on the LRBLR, FR, WOE, and II methods has been proven to be over 70%. The model performance and applicability of four models in a landslide-prone watershed with dense distribution of rainfall
NASA Astrophysics Data System (ADS)
Liberman, Neomi; Ben-David Kolikant, Yifat; Beeri, Catriel
2012-09-01
Due to a program reform in Israel, experienced CS high-school teachers faced the need to master and teach a new programming paradigm. This situation served as an opportunity to explore the relationship between teachers' content knowledge (CK) and their pedagogical content knowledge (PCK). This article focuses on three case studies, with emphasis on one of them. Using observations and interviews, we examine how the teachers we observed taught and what development of their teaching occurred as a result of their teaching experience, if at all. Our findings suggest that this situation creates a new hybrid state of teachers, which we term "regressed experts." These teachers incorporate in their professional practice some elements typical of novices and some typical of experts. We also found that these teachers' experience, although established when teaching different CK, serves as leverage to improve their knowledge and understanding of aspects of the new content.
J. Richard Hess; Kevin L. Kenney; William A. Smith; Ian Bonner; David J. Muth
2015-04-01
Equipment manufacturers have made rapid improvements in biomass harvesting and handling equipment. These improvements have increased transportation and handling efficiencies due to higher biomass densities and reduced losses. Improvements in grinder efficiencies and capacity have reduced biomass grinding costs. Biomass collection efficiencies (the ratio of biomass collected to the amount available in the field) as high as 75% for crop residues and greater than 90% for perennial energy crops have also been demonstrated. However, as collection rates increase, the fraction of entrained soil in the biomass increases, and high biomass residue removal rates can violate agronomic sustainability limits. Advancements in quantifying multi-factor sustainability limits to increase removal rates as guided by sustainable residue removal plans, and in mitigating soil contamination through targeted removal rates based on soil type and residue type/fraction, are allowing the use of new high-efficiency harvesting equipment and methods. As another consideration, single-pass harvesting and other technologies that improve harvesting costs create biomass storage moisture management challenges, which are further compounded by annual variability in biomass moisture content. Monitoring, sampling, simulation, and analysis provide a basis for moisture, time, and quality relationships in storage, which has allowed the development of moisture-tolerant storage systems and best management processes that combine moisture content and time to accommodate baled storage of wet material based upon “shelf-life.” The key to improving biomass supply logistics costs has been developing the associated agronomic sustainability and biomass quality technologies and processes that allow the implementation of equipment engineering solutions.
Country logistics performance and disaster impact.
Vaillancourt, Alain; Haavisto, Ira
2016-04-01
The aim of this paper is to deepen the understanding of the relationship between country logistics performance and disaster impact. The relationship is analysed through correlation analysis and regression models for 117 countries for the years 2007 to 2012, with disaster impact variables from the International Disaster Database (EM-DAT) and logistics performance indicators from the World Bank. The results show a significant relationship between country logistics performance and disaster impact overall, and for five out of six specific logistics performance indicators. These specific indicators were further used to explore the relationship between country logistics performance and disaster impact for three specific disaster types (epidemic, flood and storm). The findings enhance the understanding of the role of logistics in a humanitarian context with empirical evidence of the importance of country logistics performance in disaster response operations. PMID:26282578
ERIC Educational Resources Information Center
Zumbo, Bruno D.; Ochieng, Charles O.
Many measures found in educational research are ordered categorical response variables that are empirical realizations of an underlying normally distributed variate. These ordered categorical variables are commonly referred to as Likert or rating scale data. Regression models are commonly fit using these ordered categorical variables as the…
Logistic Stick-Breaking Process
Ren, Lu; Du, Lan; Carin, Lawrence; Dunson, David B.
2013-01-01
A logistic stick-breaking process (LSBP) is proposed for non-parametric clustering of general spatially- or temporally-dependent data, imposing the belief that proximate data are more likely to be clustered together. The sticks in the LSBP are realized via multiple logistic regression functions, with shrinkage priors employed to favor contiguous and spatially localized segments. The LSBP is also extended for the simultaneous processing of multiple data sets, yielding a hierarchical logistic stick-breaking process (H-LSBP). The model parameters (atoms) within the H-LSBP are shared across the multiple learning tasks. Efficient variational Bayesian inference is derived, and comparisons are made to related techniques in the literature. Experimental analysis is performed for audio waveforms and images, and it is demonstrated that for segmentation applications the LSBP yields generally homogeneous segments with sharp boundaries. PMID:25258593
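The stick-breaking construction named here can be sketched numerically: K-1 logistic regression outputs break a unit "stick" into K cluster probabilities. This is a minimal illustration of the mapping, not the full LSBP model:

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def stick_breaking_probs(logits):
    """Map K-1 logistic-regression outputs to K cluster probabilities.

    Segment k takes sigma(logit_k) of the stick remaining after
    segments 1..k-1; the last segment takes whatever is left.
    """
    v = sigmoid(np.asarray(logits, dtype=float))
    # remaining[k] = product of (1 - v_j) for j < k, starting at 1.0
    remaining = np.concatenate(([1.0], np.cumprod(1.0 - v)))
    probs = np.append(v, 1.0) * remaining
    return probs

p = stick_breaking_probs([0.5, -1.0, 2.0])
print(p, p.sum())  # the K probabilities sum to 1
```

In the LSBP the logits would themselves be functions of spatial or temporal location, so nearby data receive similar segment probabilities.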
Quality Reporting of Multivariable Regression Models in Observational Studies
Real, Jordi; Forné, Carles; Roso-Llorach, Albert; Martínez-Sánchez, Jose M.
2016-01-01
Controlling for confounders is a crucial step in analytical observational studies, and multivariable models are widely used as statistical adjustment techniques. However, the validation of the assumptions of the multivariable regression models (MRMs) should be made clear in scientific reporting. The objective of this study is to review the quality of statistical reporting of the most commonly used MRMs (logistic, linear, and Cox regression) that were applied in analytical observational studies published between 2003 and 2014 by journals indexed in MEDLINE. We reviewed a representative sample of articles indexed in MEDLINE (n = 428) with an observational design and use of MRMs (logistic, linear, and Cox regression). We assessed the quality of reporting about: model assumptions and goodness-of-fit, interactions, sensitivity analysis, crude and adjusted effect estimates, and specification of more than 1 adjusted model. The tests of underlying assumptions or goodness-of-fit of the MRMs used were described in 26.2% (95% CI: 22.0–30.3) of the articles, and 18.5% (95% CI: 14.8–22.1) reported the interaction analysis. Reporting of all items assessed was higher in articles published in journals with a higher impact factor. A low percentage of articles indexed in MEDLINE that used multivariable techniques provided information demonstrating rigorous application of the model selected as an adjustment method. Given the importance of these methods to the final results and conclusions of observational studies, greater rigor is required in reporting the use of MRMs in the scientific literature. PMID:27196467
A regularization corrected score method for nonlinear regression models with covariate error.
Zucker, David M; Gorfine, Malka; Li, Yi; Tadesse, Mahlet G; Spiegelman, Donna
2013-03-01
Many regression analyses involve explanatory variables that are measured with error, and failing to account for this error is well known to lead to biased point and interval estimates of the regression coefficients. We present here a new general method for adjusting for covariate error. Our method consists of an approximate version of the Stefanski-Nakamura corrected score approach, using the method of regularization to obtain an approximate solution of the relevant integral equation. We develop the theory in the setting of classical likelihood models; this setting covers, for example, linear regression, nonlinear regression, logistic regression, and Poisson regression. The method is extremely general in terms of the types of measurement error models covered, and is a functional method in the sense of not involving assumptions on the distribution of the true covariate. We discuss the theoretical properties of the method and present simulation results in the logistic regression setting (univariate and multivariate). For illustration, we apply the method to data from the Harvard Nurses' Health Study concerning the relationship between physical activity and breast cancer mortality in the period following a diagnosis of breast cancer. PMID:23379851
NASA Technical Reports Server (NTRS)
Tellado, Joseph
2014-01-01
The presentation contains a status of KSC ISS Logistics Operations. It presents current top-level ISS logistics tasks being conducted at KSC, current International Partner activities, the hardware processing flow focusing on late stow operations, a list of KSC Logistics POCs, and a backup list of logistics launch site services. This presentation is being given at the annual International Space Station (ISS) Multi-lateral Logistics Maintenance Control Panel meeting to be held in Turin, Italy during the week of May 13-16. The presentation content doesn't contain any potential lessons learned.
Huang, Dong; Cabral, Ricardo; De la Torre, Fernando
2016-02-01
Discriminative methods (e.g., kernel regression, SVM) have been extensively used to solve problems such as object recognition, image alignment and pose estimation from images. These methods typically map image features (X) to continuous (e.g., pose) or discrete (e.g., object category) values. A major drawback of existing discriminative methods is that samples are directly projected onto a subspace and hence fail to account for outliers common in realistic training sets due to occlusion, specular reflections or noise. It is important to note that existing discriminative approaches assume the input variables X to be noise free. Thus, discriminative methods experience significant performance degradation when gross outliers are present. Despite its obvious importance, the problem of robust discriminative learning has been relatively unexplored in computer vision. This paper develops the theory of robust regression (RR) and presents an effective convex approach that uses recent advances in rank minimization. The framework applies to a variety of problems in computer vision, including robust linear discriminant analysis, regression with missing data, and multi-label classification. Several synthetic and real examples with applications to head pose estimation from images, image and video classification and facial attribute classification with missing data are used to illustrate the benefits of RR. PMID:26761740
Adolescent suicide attempts and adult adjustment
Brière, Frédéric N.; Rohde, Paul; Seeley, John R.; Klein, Daniel; Lewinsohn, Peter M.
2014-01-01
Background Adolescent suicide attempts are disproportionately prevalent and frequently of low severity, raising questions regarding their long-term prognostic implications. In this study, we examined whether adolescent attempts were associated with impairments related to suicidality, psychopathology, and psychosocial functioning in adulthood (objective 1) and whether these impairments were better accounted for by concurrent adolescent confounders (objective 2). Method 816 adolescents were assessed using interviews and questionnaires at four time points from adolescence to adulthood. We examined whether lifetime suicide attempts in adolescence (by T2, mean age 17) predicted adult outcomes (by T4, mean age 30) using linear and logistic regressions in unadjusted models (objective 1) and adjusting for sociodemographic background, adolescent psychopathology, and family risk factors (objective 2). Results In unadjusted analyses, adolescent suicide attempts predicted poorer adjustment on all outcomes, except those related to social role status. After adjustment, adolescent attempts remained predictive of axis I and II psychopathology (anxiety disorder, antisocial and borderline personality disorder symptoms), global and social adjustment, risky sex, and psychiatric treatment utilization. However, adolescent attempts no longer predicted most adult outcomes, notably suicide attempts and major depressive disorder. Secondary analyses indicated that associations did not differ by sex and attempt characteristics (intent, lethality, recurrence). Conclusions Adolescent suicide attempters are at high risk of protracted and wide-ranging impairments, regardless of the characteristics of their attempt. Although attempts specifically predict (and possibly influence) several outcomes, results suggest that most impairments reflect the confounding contributions of other individual and family problems or vulnerabilities in adolescent attempters. PMID:25421360
Packaging for logistical support
NASA Astrophysics Data System (ADS)
Twede, Diana; Hughes, Harold
Logistical packaging is conducted to furnish protection, utility, and communication for elements of a logistical system. Once the functional requirements of space logistical support packaging have been identified, decision-makers have a reasonable basis on which to compare package alternatives. Flexible packages may be found, for example, to provide adequate protection and superior utility to that of rigid packages requiring greater storage and postuse waste volumes.
Computational Techniques for Spatial Logistic Regression with Large Datasets
Paciorek, Christopher J.
2007-01-01
In epidemiological research, outcomes are frequently non-normal, sample sizes may be large, and effect sizes are often small. To relate health outcomes to geographic risk factors, fast and powerful methods for fitting spatial models, particularly for non-normal data, are required. I focus on binary outcomes, with the risk surface a smooth function of space, but the development herein is relevant for non-normal data in general. I compare penalized likelihood models, including the penalized quasi-likelihood (PQL) approach, and Bayesian models based on fit, speed, and ease of implementation. A Bayesian model using a spectral basis representation of the spatial surface via the Fourier basis provides the best tradeoff of sensitivity and specificity in simulations, detecting real spatial features while limiting overfitting and being reasonably computationally efficient. One of the contributions of this work is further development of this underused representation. The spectral basis model outperforms the penalized likelihood methods, which are prone to overfitting, but is slower to fit and not as easily implemented. A Bayesian Markov random field model performs less well statistically than the spectral basis model, but is very computationally efficient. We illustrate the methods on a real dataset of cancer cases in Taiwan. The success of the spectral basis with binary data and similar results with count data suggest that it may be generally useful in spatial models and more complicated hierarchical models. PMID:18443645
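The spectral-basis idea can be sketched in a toy form: represent the smooth spatial risk surface with a small Fourier basis and fit a penalized logistic regression, with the L2 penalty standing in for the smoothing prior. This is a sketch of the representation only, not the paper's Bayesian model:

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
coords = rng.uniform(0, 1, size=(500, 2))       # spatial locations, unit square

def fourier_basis(xy, n_freq=3):
    """Low-frequency sine/cosine features of each coordinate."""
    feats = [np.ones(len(xy))]
    for k in range(1, n_freq + 1):
        for d in (0, 1):
            feats.append(np.sin(2 * np.pi * k * xy[:, d]))
            feats.append(np.cos(2 * np.pi * k * xy[:, d]))
    return np.column_stack(feats)

X = fourier_basis(coords)
true_surface = np.sin(2 * np.pi * coords[:, 0])  # smooth spatial log-odds
y = rng.binomial(1, 1 / (1 + np.exp(-true_surface)))

# L2 penalty plays the role of a shrinkage prior on basis coefficients.
model = LogisticRegression(C=1.0, max_iter=1000).fit(X, y)
print("training accuracy:", round(model.score(X, y), 2))
```

Keeping only low frequencies is what limits overfitting: the surface cannot wiggle faster than the basis allows.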
Incremental logistic regression for customizing automatic diagnostic models.
Tortajada, Salvador; Robles, Montserrat; García-Gómez, Juan Miguel
2015-01-01
In recent decades, and following new trends in medicine, statistical learning techniques have been used to develop automatic diagnostic models that aid clinical experts through Clinical Decision Support Systems. The development of these models requires a large, representative amount of data, which is commonly obtained from one hospital or a group of hospitals after an expensive and time-consuming gathering, preprocessing, and validation of cases. After development, the model has to pass an external validation that is often carried out in a different hospital or health center. The experience is that such models often fall short of expectations. Furthermore, sending and storing patient data requires ethical approval and patient consent. For these reasons, we introduce an incremental learning algorithm based on the Bayesian inference approach that may allow us to build an initial model with a smaller number of cases and update it incrementally when new data are collected, or even recalibrate a model from a different center by using a reduced number of cases. The performance of our algorithm is demonstrated on several benchmark datasets and a real brain tumor dataset; we compare its performance to a previous incremental algorithm and a non-incremental Bayesian model, showing that the algorithm is independent of the data model, iterative, and exhibits good convergence. PMID:25417079
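One common way to realize incremental Bayesian logistic regression (a sketch of the general idea, not necessarily the authors' algorithm) is to carry a Gaussian posterior forward as the prior for the next batch via a Laplace approximation:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def laplace_update(X, y, prior_mean, prior_prec, n_iter=20):
    """Newton (IRLS) MAP fit; returns the Gaussian posterior (mean, precision)."""
    w = prior_mean.copy()
    for _ in range(n_iter):
        p = sigmoid(X @ w)
        grad = X.T @ (y - p) - prior_prec @ (w - prior_mean)
        hess = prior_prec + X.T @ (X * (p * (1 - p))[:, None])
        w = w + np.linalg.solve(hess, grad)
    p = sigmoid(X @ w)
    return w, prior_prec + X.T @ (X * (p * (1 - p))[:, None])

rng = np.random.default_rng(0)
w_true = np.array([1.5, -2.0])           # hypothetical true coefficients
mean, prec = np.zeros(2), np.eye(2)      # initial N(0, I) prior
for _ in range(5):                       # batches of cases arriving over time
    Xb = rng.normal(size=(100, 2))
    yb = rng.binomial(1, sigmoid(Xb @ w_true))
    mean, prec = laplace_update(Xb, yb, mean, prec)
print(mean)  # posterior mean moves toward w_true as evidence accumulates
```

Recalibration at a new center corresponds to running one more update with that center's (small) dataset, starting from the transported posterior.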
Background or Experience? Using Logistic Regression to Predict College Retention
ERIC Educational Resources Information Center
Synco, Tracee M.
2012-01-01
Tinto, Astin and countless others have researched the retention and attrition of students from college for more than thirty years. However, the six year graduation rate for all first-time full-time freshmen for the 2002 cohort was 57%. This study sought to determine the retention variables that predicted continued enrollment of entering freshmen…
Assessing Lake Trophic Status: A Proportional Odds Logistic Regression Model
Lake trophic state classifications are good predictors of ecosystem condition and are indicative of both ecosystem services (e.g., recreation and aesthetics) and disservices (e.g., harmful algal blooms). Methods for classifying trophic state are based on the foundational work o...
Significant Improvements to LOGIST.
ERIC Educational Resources Information Center
Wingersky, Marilyn S.
The computer program LOGIST (Wingersky, Patrick, and Lord, 1988) estimates the item parameters and the examinee's abilities for Birnbaum's three-parameter logistic item response theory model using Newton's method for solving the joint maximum likelihood equations. In 1989, Martha Stocking discovered a problem with this procedure in that when the…
Regression calibration method for correcting measurement-error bias in nutritional epidemiology.
Spiegelman, D; McDermott, A; Rosner, B
1997-04-01
Regression calibration is a statistical method for adjusting point and interval estimates of effect obtained from regression models commonly used in epidemiology for bias due to measurement error in assessing nutrients or other variables. Previous work developed regression calibration for use in estimating odds ratios from logistic regression. We extend this here to estimating incidence rate ratios from Cox proportional hazards models and regression slopes from linear-regression models. Regression calibration is appropriate when a gold standard is available in a validation study and a linear measurement error with constant variance applies or when replicate measurements are available in a reliability study and linear random within-person error can be assumed. In this paper, the method is illustrated by correction of rate ratios describing the relations between the incidence of breast cancer and dietary intakes of vitamin A, alcohol, and total energy in the Nurses' Health Study. An example using linear regression is based on estimation of the relation between ultradistal radius bone density and dietary intakes of caffeine, calcium, and total energy in the Massachusetts Women's Health Study. Software implementing these methods uses SAS macros. PMID:9094918
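The core substitution can be sketched for the linear-regression case on synthetic data (the logistic and Cox extensions described follow the same principle): regress the gold standard on the error-prone measurement in the validation subset, then fit the outcome model on the calibrated values.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 2000
x = rng.normal(size=n)                    # true exposure (e.g., nutrient intake)
w = x + rng.normal(scale=0.8, size=n)     # measurement with classical error
y = 2.0 * x + rng.normal(size=n)          # outcome; true slope is 2.0

naive = np.polyfit(w, y, 1)[0]            # attenuated toward zero

# Validation subset where the gold standard x is also observed.
val = slice(0, 500)
calib = np.polyfit(w[val], x[val], 1)     # estimate E[X | W] = a*W + b
x_hat = np.polyval(calib, w)              # calibrated exposure for everyone
corrected = np.polyfit(x_hat, y, 1)[0]

print(f"naive {naive:.2f}, corrected {corrected:.2f}")
```

The naive slope is shrunk by the reliability ratio var(X)/var(W); substituting E[X|W] undoes that attenuation in point estimates (interval estimates additionally need the calibration uncertainty).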
Definition of main pollen season using a logistic model.
Ribeiro, Helena; Cunha, Mário; Abreu, Ilda
2007-01-01
This paper proposes a method to unify the definition of the main pollen season based on statistical analysis. For this, an aerobiological study was carried out in Porto region (Portugal), from 2003-2005 using a 7-day Hirst-type volumetric spore trap. To define the main pollen season, a non-linear logistic regression model was fitted to the values of the accumulated sum of the daily airborne pollen concentration from several allergological species. An important feature of this method is that the main pollen season will be characterized by the model parameters calculated. These parameters are identifiable aspects of the flowering phenology, and determine not only the beginning and end of the main pollen season, but are also influenced by the meteorological conditions. The results obtained with the proposed methodology were also compared with two of the most used percentage methods. The logistic model fitted well with the sum of accumulated pollen. The explained variance was always higher than 97%, and the exponential part of the predicted curve was well adjusted to the time when higher atmospheric pollen concentration was sampled. The comparison between the different methods tested showed large divergence in the duration and end dates of the main pollen season of the studied species. PMID:18247462
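The curve-fitting step can be sketched with an assumed three-parameter logistic form on synthetic accumulated-pollen data (the paper's exact parameterization may differ):

```python
import numpy as np
from scipy.optimize import curve_fit

def logistic(t, K, r, t0):
    """Accumulated pollen: asymptote K, growth rate r, inflection day t0."""
    return K / (1.0 + np.exp(-r * (t - t0)))

days = np.arange(1, 121)                        # day of season
true = logistic(days, K=5000, r=0.12, t0=60)    # synthetic accumulated pollen
rng = np.random.default_rng(0)
obs = true + rng.normal(0, 50, days.size)

(K, r, t0), _ = curve_fit(logistic, days, obs, p0=[4000, 0.1, 50])

# Season limits can be read off the fitted curve, e.g. the days at which
# 2.5% and 97.5% of the total K are reached.
start = t0 - np.log(0.975 / 0.025) / r
end = t0 + np.log(0.975 / 0.025) / r
print(f"K = {K:.0f}, season from day {start:.0f} to day {end:.0f}")
```

Unlike fixed-percentage definitions, the fitted parameters (K, r, t0) characterize the flowering phenology itself, which is the feature the paper emphasizes.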
NASA Astrophysics Data System (ADS)
Chang, Yoon S.; Oh, Chang H.
Nowadays, environmental management has become a critical business consideration for companies striving to survive amid many regulations and tough business requirements. Most world-leading companies are now aware that environment-friendly technology and management are critical to the sustainable growth of the company. The environment market has seen continuous growth, marking 532B in 2000 and 590B in 2004, and is expected to reach 700B in 2010. Environment-friendly efforts can be seen in almost all aspects of business operations, and such trends are easily found in the logistics area. Green logistics aims to make environmentally friendly decisions throughout a product lifecycle. Therefore, for the success of green logistics, it is critical to have real-time tracking capability on the product throughout the product lifecycle and a smart solution service architecture. In this chapter, we introduce an RFID-based green logistics solution and service.
Survival Data and Regression Models
NASA Astrophysics Data System (ADS)
Grégoire, G.
2014-12-01
We start this chapter by introducing some basic elements for the analysis of censored survival data. Then we focus on right-censored data and develop two types of regression models. The first type concerns the so-called accelerated failure time (AFT) models, which are parametric models where a function of a parameter depends linearly on the covariables. The second is a semiparametric model, where the covariables enter in a multiplicative form in the expression of the hazard rate function. The main statistical tool for analysing these regression models is the maximum likelihood methodology and, although we recall some essential results of ML theory, we refer to the chapter "Logistic Regression" for a more detailed presentation.
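A minimal AFT illustration (log-normal errors, no censoring; real censored data require the maximum likelihood machinery the chapter develops): log survival time is linear in the covariable, so ordinary least squares on log(T) recovers the acceleration effect.

```python
import numpy as np

# Hypothetical log-normal AFT: log T = 2.0 + 0.7*x + 0.5*eps, x a treatment flag.
rng = np.random.default_rng(0)
n = 1000
x = rng.binomial(1, 0.5, n)
log_t = 2.0 + 0.7 * x + rng.normal(0, 0.5, n)
t = np.exp(log_t)

beta = np.polyfit(x, np.log(t), 1)[0]   # OLS slope on the log scale
factor = np.exp(beta)                   # acceleration factor, true value exp(0.7)
print(f"acceleration factor = {factor:.2f}")
```

An acceleration factor of 2 means survival times in the x = 1 group are stretched twofold; with censoring, the same coefficient would instead be estimated by maximizing the censored-data likelihood.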
Logistics planning for phased programs.
NASA Technical Reports Server (NTRS)
Cook, W. H.
1973-01-01
It is pointed out that proper and early integration of logistics planning into the phased program planning process will drastically reduce logistics costs. Phased project planning is a phased approach to the planning, approval, and conduct of major research and development activity. It provides a progressive build-up of knowledge of all aspects of the program. Elements of logistics are discussed together with aspects of integrated logistics support, logistics program planning, and logistics activities for phased programs. Continuing logistics support can be assured only if there is a comprehensive sequential listing of all logistics activities tied to the program schedule and a real-time inventory of assets.
de Barros, Márcio Vinícius Lins; Arancibia, Ana Elisa Loyola; Costa, Ana Paula; Bueno, Fernando Brito; Martins, Marcela Aparecida Corrêa; Magalhães, Maria Cláudia; Silva, José Luiz Padilha; de Bastos, Marcos
2016-04-01
Deep venous thrombosis (DVT) management includes prediction rule evaluation to define standard pretest DVT probabilities in symptomatic patients. The aim of this study was to evaluate the incremental usefulness of hormonal therapy added to the Wells prediction rules for DVT in women. We studied women undergoing compression ultrasound scanning for suspected DVT. We adjusted the Wells score for DVT, taking into account the β-coefficients of the logistic regression model. Data discrimination was evaluated by the receiver operating characteristic (ROC) curve. The adjusted score calibration was assessed graphically and by the Hosmer-Lemeshow test. Reclassification tables and the net reclassification index were used for comparison of the adjusted score with the Wells score for DVT. We observed 461 women, including 103 DVT events. The mean age was 56 years (±21 years). The adjusted logistic regression model included hormonal therapy and six Wells prediction rules for DVT. The adjusted score weights ranged from -4 to 4. The Hosmer-Lemeshow test showed a nonsignificant P value (0.69) and the calibration graph showed no differences between the expected and the observed values. The area under the ROC curve was 0.92 [95% confidence interval (CI) 0.90-0.95] for the adjusted model and 0.87 (95% CI 0.84-0.91) for the Wells score for DVT (DeLong test, P value < 0.01). The net reclassification index for the adjusted score was 0.22 (95% CI 0.11-0.33, P value < 0.01). Our results suggest an incremental usefulness of hormonal therapy as an independent DVT prediction rule in women compared with the Wells score for DVT. The adjusted score must be evaluated in different populations before clinical use. PMID:26757018
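The score-building step described (integer point weights derived from logistic β-coefficients) might be sketched as follows on synthetic binary rules (hypothetical data and effect sizes, not the study's cohort):

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score

rng = np.random.default_rng(0)
n = 500
X = rng.binomial(1, 0.3, size=(n, 3))            # three binary prediction rules
beta_true = np.array([1.2, 0.8, 1.5])            # hypothetical log odds ratios
y = rng.binomial(1, 1 / (1 + np.exp(-(X @ beta_true - 1.5))))

model = LogisticRegression(max_iter=1000).fit(X, y)
coefs = model.coef_[0]

# Scale so the smallest effect gets 1 point, then round to integer weights.
weights = np.round(coefs / np.abs(coefs).min()).astype(int)
score = X @ weights
auc = roc_auc_score(y, score)
print("point weights:", weights, "AUC:", round(auc, 2))
```

Discrimination of two competing scores on the same patients would then be compared with a correlated-AUC test such as DeLong's.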
Maso, Gianpaolo; Alberico, Salvatore; Monasta, Lorenzo; Ronfani, Luca; Montico, Marcella; Businelli, Caterina; Soini, Valentina; Piccoli, Monica; Gigli, Carmine; Domini, Daniele; Fiscella, Claudio; Casarsa, Sara; Zompicchiatti, Carlo; De Agostinis, Michela; D'Atri, Attilio; Mugittu, Raffaela; La Valle, Santo; Di Leonardo, Cristina; Adamo, Valter; Smiroldo, Silvia; Frate, Giovanni Del; Olivuzzi, Monica; Giove, Silvio; Parente, Maria; Bassini, Daniele; Melazzini, Simona; Guaschino, Secondo; De Seta, Francesco; Demarini, Sergio; Travan, Laura; Marchesoni, Diego; Rossi, Alberto; Simon, Giorgio; Zicari, Sandro; Tamburlini, Giorgio
2013-01-01
Background Caesarean delivery (CD) rates are commonly used as an indicator of quality in obstetric care and risk adjustment evaluation is recommended to assess inter-institutional variations. The aim of this study was to evaluate whether the Ten Group classification system (TGCS) can be used in case-mix adjustment. Methods Standardized data on 15,255 deliveries from 11 different regional centers were prospectively collected. Crude Risk Ratios of CDs were calculated for each center. Two multiple logistic regression models were herein considered by using: Model 1- maternal (age, Body Mass Index), obstetric variables (gestational age, fetal presentation, single or multiple, previous scar, parity, neonatal birth weight) and presence of risk factors; Model 2- TGCS either with or without maternal characteristics and presence of risk factors. Receiver Operating Characteristic (ROC) curves of the multivariate logistic regression analyses were used to assess the diagnostic accuracy of each model. The null hypothesis that Areas under ROC Curve (AUC) were not different from each other was verified with a Chi Square test and post hoc pairwise comparisons by using a Bonferroni correction. Results Crude evaluation of CD rates showed all centers had significantly higher Risk Ratios than the referent. Both multiple logistic regression models reduced these variations. However the two methods ranked institutions differently: model 1 and model 2 (adjusted for TGCS) identified respectively nine and eight centers with significantly higher CD rates than the referent with slightly different AUCs (0.8758 and 0.8929 respectively). In the adjusted model for TGCS and maternal characteristics/presence of risk factors, three centers had CD rates similar to the referent with the best AUC (0.9024). Conclusions The TGCS might be considered as a reliable variable to adjust CD rates. The addition of maternal characteristics and risk factors to TGCS substantially increase the predictive
ERIC Educational Resources Information Center
Pedrini, D. T.; Pedrini, Bonnie C.
Regression, another mechanism studied by Sigmund Freud, has been the subject of much research, e.g., hypnotic regression, frustration regression, schizophrenic regression, and infra-human animal regression (often directly related to fixation). Many investigators worked with hypnotic age regression, which has a long history, going back to Russian reflexologists.…
Lee, Myung Hee; Liu, Yufeng
2013-12-01
The continuum regression technique provides an appealing regression framework connecting ordinary least squares, partial least squares and principal component regression in one family. It offers some insight on the underlying regression model for a given application. Moreover, it helps to provide deep understanding of various regression techniques. Despite the useful framework, however, the current development on continuum regression is only for linear regression. In many applications, nonlinear regression is necessary. The extension of continuum regression from linear models to nonlinear models using kernel learning is considered. The proposed kernel continuum regression technique is quite general and can handle very flexible regression model estimation. An efficient algorithm is developed for fast implementation. Numerical examples have demonstrated the usefulness of the proposed technique. PMID:24058224
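The kernel-learning extension can be illustrated with kernel ridge regression, a simpler relative of the kernel continuum estimator proposed above (not the authors' method); the Gaussian kernel, its bandwidth `gamma`, and the ridge penalty `lam` here are all illustrative choices.

```python
import numpy as np

# Minimal sketch of kernel-based regression estimation using kernel ridge
# regression with a Gaussian (RBF) kernel: the nonlinear fit is obtained by
# solving (K + lam*I) alpha = y, then evaluating f(t) = sum_i alpha_i k(t, x_i).

rng = np.random.default_rng(0)
x = np.linspace(0.0, 1.0, 40)
y = np.sin(2 * np.pi * x) + 0.1 * rng.standard_normal(40)

def rbf(a, b, gamma=20.0):
    # Gram matrix K[i, j] = exp(-gamma * (a_i - b_j)^2)
    return np.exp(-gamma * (a[:, None] - b[None, :]) ** 2)

lam = 1e-3
alpha = np.linalg.solve(rbf(x, x) + lam * np.eye(len(x)), y)
fitted = rbf(x, x) @ alpha

print(round(float(np.corrcoef(fitted, y)[0, 1]), 3))
```

Continuum regression would additionally tune a continuum parameter that moves the solution between OLS-like and PCR-like fits; the ridge system above corresponds to just one point in that family.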
Real, Jordi; Forné, Carles; Roso-Llorach, Albert; Martínez-Sánchez, Jose M
2016-05-01
Controlling for confounders is a crucial step in analytical observational studies, and multivariable models are widely used as statistical adjustment techniques. However, the validation of the assumptions of the multivariable regression models (MRMs) should be made clear in scientific reporting. The objective of this study is to review the quality of statistical reporting of the most commonly used MRMs (logistic, linear, and Cox regression) that were applied in analytical observational studies published between 2003 and 2014 by journals indexed in MEDLINE. We reviewed a representative sample of articles indexed in MEDLINE (n = 428) with an observational design and use of MRMs (logistic, linear, and Cox regression). We assessed the quality of reporting of: model assumptions and goodness-of-fit, interactions, sensitivity analysis, crude and adjusted effect estimates, and specification of more than one adjusted model. The tests of underlying assumptions or goodness-of-fit of the MRMs used were described in 26.2% (95% CI: 22.0-30.3) of the articles, and 18.5% (95% CI: 14.8-22.1) reported the interaction analysis. Reporting of all items assessed was higher in articles published in journals with a higher impact factor. A low percentage of articles indexed in MEDLINE that used multivariable techniques provided information demonstrating rigorous application of the model selected as an adjustment method. Given the importance of these methods to the final results and conclusions of observational studies, greater rigor is required in reporting the use of MRMs in the scientific literature. PMID:27196467
Bao, J Y
1991-04-01
The commonly used microforceps have a much greater opening distance and spring resistance than needed. A piece of plastic ring or rubber band can be used to adjust the opening distance and reduce most of the spring resistance, making the user feel more comfortable and less fatigued. PMID:2051437
Space Shuttle operational logistics plan
NASA Technical Reports Server (NTRS)
Botts, J. W.
1983-01-01
The Kennedy Space Center plan for logistics to support Space Shuttle operations, and the related policies, requirements, and responsibilities it establishes, are described. The plan implements the Directorate of Shuttle Management and Operations logistics responsibilities required by the Kennedy Organizational Manual, as well as the self-sufficiency contracting concept. The Space Shuttle Program Level 1 and Level 2 logistics policies and requirements applicable to KSC, presented in HQ NASA and Johnson Space Center directives, are also implemented.
Jieke theory and logistic model
Cao, H.; Feng, G.
1996-06-01
The concept of a shell, or jieke (in Chinese), is first introduced; a jieke is a sort of system boundary. Based on jieke theory, a new logistic model is proposed that takes account of the switch effect of the jieke. The model is analyzed and its nonlinear mapping is constructed. The results show that the behavior of the switch logistic model differs greatly from that of the original logistic model. © 1996 American Institute of Physics.
Observational Studies: Matching or Regression?
Brazauskas, Ruta; Logan, Brent R
2016-03-01
In observational studies with an aim of assessing treatment effect or comparing groups of patients, several approaches could be used. Often, baseline characteristics of patients may be imbalanced between groups, and adjustments are needed to account for this. It can be accomplished either via appropriate regression modeling or, alternatively, by conducting a matched pairs study. The latter is often chosen because it makes groups appear to be comparable. In this article we considered these 2 options in terms of their ability to detect a treatment effect in time-to-event studies. Our investigation shows that a Cox regression model applied to the entire cohort is often a more powerful tool in detecting treatment effect as compared with a matched study. Real data from a hematopoietic cell transplantation study is used as an example. PMID:26712591
Harry, Herbert H.
1989-01-01
Apparatus and method for the adjustment and alignment of shafts in high power devices. A plurality of adjacent rotatable angled cylinders are positioned between a base and the shaft to be aligned which when rotated introduce an axial offset. The apparatus is electrically conductive and constructed of a structurally rigid material. The angled cylinders allow the shaft such as the center conductor in a pulse line machine to be offset in any desired alignment position within the range of the apparatus.
Technical issues: logistics. AAMC.
Stillman, P L
1993-06-01
The author states that she became interested in standardized patients (SPs) around 20 years ago as a means of developing a more uniform and effective way to provide instruction and evaluation of basic clinical skills. She reflects in detail upon: (1) the logistics of using SPs in teaching; (2) how SPs are used in assessment; (3) what aspects of performance SPs can be trained to record and evaluate; (4) issues concerning checklists; (5) evaluation of interviewing skills; (6) evaluation of written communication skills; (7) importance of defining what is being tested; (8) various kinds and uses of inter-station exercises and problems of scoring them; (9) case development and the various sources for case material; (10) ways to generate scores; (11) selecting and training SPs; (12) role of the faculty and primary importance of bedside training with real patients; and (13) pros and cons of national versus single-school efforts to use SPs. She concludes by cautioning that further research must be done before SPs can be used for high-stakes certifying and licensing examinations. PMID:8507311
A note on Verhulst's logistic equation and related logistic maps
NASA Astrophysics Data System (ADS)
Ranferi Gutiérrez, M.; Reyes, M. A.; Rosu, H. C.
2010-05-01
We consider the Verhulst logistic equation and a couple of forms of the corresponding logistic maps. For the case of the logistic equation we show that using the general Riccati solution only changes the initial conditions of the equation. Next, we consider two forms of corresponding logistic maps, reporting the following results. For the map x_{n+1} = r x_n (1 - x_n) we propose a new way to write the solution for r = -2 which allows better precision of the iterative terms, while for the map x_{n+1} - x_n = r x_n (1 - x_{n+1}) we show that it behaves identically to the logistic equation from the standpoint of the general Riccati solution, which is also provided herein for any value of the parameter r.
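The map discussed above can be iterated directly; the sketch below only illustrates the map itself for an ordinary attracting case (r = 2, where the fixed point 1 - 1/r = 0.5 is superattracting), not the authors' closed-form r = -2 solution.

```python
# Minimal sketch: iterating the logistic map x_{n+1} = r * x_n * (1 - x_n).
# For r = 2 and x0 in (0, 1) the iterates converge rapidly to the fixed
# point 1 - 1/r = 0.5; values of r near 4 instead produce chaotic orbits.

def logistic_map(r, x0, n):
    xs = [x0]
    for _ in range(n):
        xs.append(r * xs[-1] * (1.0 - xs[-1]))
    return xs

traj = logistic_map(2.0, 0.2, 50)
print(traj[:3])   # first few iterates
print(traj[-1])   # near the fixed point 0.5
```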
Eberly, Lynn E
2007-01-01
This chapter describes multiple linear regression, a statistical approach used to describe the simultaneous associations of several variables with one continuous outcome. Important steps in using this approach include estimation and inference, variable selection in model building, and assessing model fit. The special cases of regression with interactions among the variables, polynomial regression, regressions with categorical (grouping) variables, and separate slopes models are also covered. Examples in microbiology are used throughout. PMID:18450050
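The estimation step and the interaction models mentioned above can be sketched with ordinary least squares; the microbiology-flavored variable names and coefficients below are simulated illustrations, not examples from the chapter.

```python
import numpy as np

# Minimal sketch of multiple linear regression with an interaction term,
# estimated by ordinary least squares via np.linalg.lstsq.

rng = np.random.default_rng(42)
n = 200
temp = rng.uniform(20, 40, n)                  # covariate 1 (e.g., temperature)
ph = rng.uniform(5, 8, n)                      # covariate 2 (e.g., medium pH)
growth = (1.0 + 0.3 * temp - 0.5 * ph + 0.02 * temp * ph
          + 0.1 * rng.standard_normal(n))      # continuous outcome

# Design matrix: intercept, main effects, and temp x pH interaction
X = np.column_stack([np.ones(n), temp, ph, temp * ph])
beta, *_ = np.linalg.lstsq(X, growth, rcond=None)

resid = growth - X @ beta
print(np.round(beta, 3), round(float(resid.std()), 3))
```

Model fit can then be assessed from the residuals (here their standard deviation should be close to the simulated noise level of 0.1), and variable selection amounts to comparing such fits across candidate design matrices.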
2015-09-09
The NCCS Regression Test Harness is a software package that provides a framework for performing regression and acceptance testing on NCCS High Performance Computers. The package is written in Python, and its only dependency is a Subversion repository used to store the regression tests.
Orthogonal Regression and Equivariance.
ERIC Educational Resources Information Center
Blankmeyer, Eric
Ordinary least-squares regression treats the variables asymmetrically, designating a dependent variable and one or more independent variables. When it is not obvious how to make this distinction, a researcher may prefer to use orthogonal regression, which treats the variables symmetrically. However, the usual procedure for orthogonal regression is…
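The usual procedure for orthogonal regression (total least squares) can be sketched via the singular value decomposition of the centered data; the five data points below are illustrative.

```python
import numpy as np

# Sketch: orthogonal regression fitted via SVD. The line minimizes
# perpendicular (not vertical) distances, so x and y are treated
# symmetrically, unlike ordinary least squares.

def orthogonal_fit(x, y):
    mx, my = x.mean(), y.mean()
    # The line's normal vector is the right-singular vector associated with
    # the smallest singular value of the centered data matrix.
    _, _, vt = np.linalg.svd(np.column_stack([x - mx, y - my]))
    a, b = vt[-1]
    slope = -a / b
    return slope, my - slope * mx

x = np.array([0.0, 1.0, 2.0, 3.0, 4.0])
y = np.array([0.1, 0.9, 2.1, 2.9, 4.1])
slope, intercept = orthogonal_fit(x, y)
print(round(slope, 3), round(intercept, 3))
```

The symmetry (equivariance) is visible in that swapping x and y yields the reciprocal slope, which is not true of ordinary least squares.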
Weine, Stevan Merrill; Ware, Norma; Tugenberg, Toni; Hakizimana, Leonce; Dahnweih, Gonwo; Currie, Madeleine; Wagner, Maureen; Levin, Elise
2013-01-01
Objectives The purpose of this mixed method study was to characterize the patterns of psychosocial adjustment among adolescent African refugees in U.S. resettlement. Methods A purposive sample of 73 recently resettled refugee adolescents from Burundi and Liberia were followed for two years, and qualitative and quantitative data were analyzed using a mixed methods exploratory design. Results Protective resources identified were the family and community capacities that can promote youth psychosocial adjustment through: 1) Finances for necessities; 2) English proficiency; 3) Social support networks; 4) Engaged parenting; 5) Family cohesion; 6) Cultural adherence and guidance; 7) Educational support; and, 8) Faith and religious involvement. The researchers first inductively identified 19 thriving, 29 managing, and 25 struggling youths based on review of cases. Univariate analyses then indicated significant associations with country of origin, parental education, and parental employment. Multiple regressions indicated that better psychosocial adjustment was associated with Liberians and living with both parents. Logistic regressions showed that thriving was associated with Liberians and higher parental education, managing with more parental education, and struggling with Burundians and living parents. Qualitative analysis identified how these factors were proxy indicators for protective resources in families and communities. Conclusion These three trajectories of psychosocial adjustment and six domains of protective resources could assist in developing targeted prevention programs and policies for refugee youth. Further rigorous longitudinal mixed-methods studies of adolescent refugees in U.S. resettlement are needed. PMID:24205467
Qadir, Farah; Khalid, Amna; Medhin, Girmay
2015-01-01
This study aimed to identify prevalence rates of psychological distress among Pakistani women seeking help for primary infertility. The associations of social support, marital adjustment, and sociodemographic factors with psychological distress were also examined. A total of 177 women with primary infertility were interviewed from one hospital in Islamabad using a Self-Reporting Questionnaire, the Multidimensional Scale of Perceived Social Support, and the Locke-Wallace Marital Adjustment Test. The data were collected between November 2012 and March 2013. The prevalence of psychological distress was 37.3 percent. The results of the logistic regression suggested that marital adjustment and social support were significantly negatively associated with psychological distress in this sample. These associations were not confounded by any of the demographic variables controlled in the multivariable regression models. Perceived social support and marital adjustment are important factors in understanding the psychological distress of women experiencing primary infertility. The results of this small-scale effort highlight the need for social and familial awareness to help tackle the psychological distress related to infertility. Future research needs to focus on the way the experience of infertility is conditioned by social structural realities. New ways need to be developed to better take into account the process and nature of the infertility experience. PMID:25837531
Ratios as a size adjustment in morphometrics.
Albrecht, G H; Gelvin, B R; Hartman, S E
1993-08-01
Simple ratios in which a measurement variable is divided by a size variable are commonly used but known to be inadequate for eliminating size correlations from morphometric data. Deficiencies in the simple ratio can be alleviated by incorporating regression coefficients describing the bivariate relationship between the measurement and size variables. Recommendations have included: 1) subtracting the regression intercept to force the bivariate relationship through the origin (intercept-adjusted ratios); 2) exponentiating either the measurement or the size variable using an allometry coefficient to achieve linearity (allometrically adjusted ratios); or 3) both subtracting the intercept and exponentiating (fully adjusted ratios). These three strategies for deriving size-adjusted ratios imply different data models for describing the bivariate relationship between the measurement and size variables (i.e., the linear, simple allometric, and full allometric models, respectively). Algebraic rearrangement of the equation associated with each data model leads to a correctly formulated adjusted ratio whose expected value is constant (i.e., size correlation is eliminated). Alternatively, simple algebra can be used to derive an expected value function for assessing whether any proposed ratio formula is effective in eliminating size correlations. Some published ratio adjustments were incorrectly formulated as indicated by expected values that remain a function of size after ratio transformation. Regression coefficients incorporated into adjusted ratios must be estimated using least-squares regression of the measurement variable on the size variable. Use of parameters estimated by any other regression technique (e.g., major axis or reduced major axis) results in residual correlations between size and the adjusted measurement variable. Correctly formulated adjusted ratios, whose parameters are estimated by least-squares methods, do control for size correlations. The size-adjusted
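The allometric adjustment described above can be sketched numerically: estimate the allometry coefficient b by least-squares regression of log(measurement) on log(size), then form the adjusted ratio y / x**b. Data are simulated under the simple allometric model; the parameter values are illustrative.

```python
import numpy as np

# Sketch: a simple ratio y/x retains a size correlation when y scales
# allometrically with size (y = c * x**b, b != 1), whereas the
# allometrically adjusted ratio y / x**b removes it. The coefficient b is
# estimated by least-squares regression on the log-log scale, as recommended.

rng = np.random.default_rng(1)
size = rng.uniform(10, 100, 500)
y = 2.0 * size ** 0.75 * np.exp(0.05 * rng.standard_normal(500))

b, log_c = np.polyfit(np.log(size), np.log(y), 1)   # slope = allometry coefficient

simple_ratio = y / size             # still correlated with size
adjusted_ratio = y / size ** b      # size correlation (approximately) eliminated

print(round(float(b), 3),
      round(float(np.corrcoef(size, simple_ratio)[0, 1]), 2),
      round(float(np.corrcoef(size, adjusted_ratio)[0, 1]), 2))
```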
Logistic Discriminant Function Analysis for DIF Identification of Polytomously Scored Items.
ERIC Educational Resources Information Center
Miller, Timothy R.; Spray, Judith A.
1993-01-01
Presents logistic discriminant analysis as a means of detecting differential item functioning (DIF) in items that are polytomously scored. Provides examples of DIF detection using a 27-item mathematics test with 1,977 examinees. The proposed method is simpler and more practical than polytomous extensions of the logistic regression DIF procedure.…
Assessing risk factors for periodontitis using regression
NASA Astrophysics Data System (ADS)
Lobo Pereira, J. A.; Ferreira, Maria Cristina; Oliveira, Teresa
2013-10-01
Multivariate statistical analysis is indispensable for assessing the associations and interactions between different factors and the risk of periodontitis. Among other techniques, regression analysis is widely used in healthcare to investigate and model the relationship between variables. In our work we study the impact of socio-demographic, medical, and behavioral factors on periodontal health. Using linear and logistic regression models, we assess the relevance, as risk factors for periodontitis, of the following independent variables (IVs): Age, Gender, Diabetic Status, Education, Smoking Status, and Plaque Index. A multiple linear regression model was built to evaluate the influence of the IVs on mean Attachment Loss (AL); the regression coefficients are obtained along with the p-values from the corresponding significance tests. In the logistic model, a case (individual) was classified by the extent of destruction of periodontal tissues, defined as an Attachment Loss greater than or equal to 4 mm in at least 25% (AL≥4mm/≥25%) of the sites surveyed. The association measures include odds ratios together with the corresponding 95% confidence intervals.
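For a single binary risk factor, the odds ratio and its 95% confidence interval reported by such a logistic model reduce to a 2x2-table calculation (Woolf's method); the counts below are illustrative, not the study's data.

```python
import math

# Sketch: odds ratio and Woolf 95% confidence interval from a 2x2 table,
# the single-risk-factor analogue of the adjusted odds ratios a logistic
# regression model would provide. Counts are hypothetical.

#                 periodontitis   no periodontitis
# smoker                a=40            b=60
# non-smoker            c=20            d=80
a, b, c, d = 40, 60, 20, 80

odds_ratio = (a * d) / (b * c)
se = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)   # SE of log(OR)
lo = math.exp(math.log(odds_ratio) - 1.96 * se)
hi = math.exp(math.log(odds_ratio) + 1.96 * se)

print(round(odds_ratio, 2), round(lo, 2), round(hi, 2))
```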
Heteroscedastic transformation cure regression models.
Chen, Chyong-Mei; Chen, Chen-Hsin
2016-06-30
Cure models have been applied to analyze clinical trials with cures and age-at-onset studies with nonsusceptibility. Lu and Ying (On semiparametric transformation cure model. Biometrika 2004; 91:331-343. DOI: 10.1093/biomet/91.2.331) developed a general class of semiparametric transformation cure models, which assumes that the failure times of uncured subjects, after an unknown monotone transformation, follow a regression model with homoscedastic residuals. However, it cannot deal with frequently encountered heteroscedasticity, which may result from dispersed ranges of failure time span among uncured subjects' strata. To tackle the phenomenon, this article presents semiparametric heteroscedastic transformation cure models. The cure status and the failure time of an uncured subject are fitted by a logistic regression model and a heteroscedastic transformation model, respectively. Unlike the approach of Lu and Ying, we derive score equations from the full likelihood for estimating the regression parameters in the proposed model. A martingale difference function similar to the one in their proposal is used to estimate the infinite-dimensional transformation function. Our proposed estimating approach is intuitively applicable and can be conveniently extended to other complicated models when the maximization of the likelihood may be too tedious to be implemented. We conduct simulation studies to validate large-sample properties of the proposed estimators and to compare with the approach of Lu and Ying via the relative efficiency. The estimating method and the two relevant goodness-of-fit graphical procedures are illustrated by using breast cancer data and melanoma data. Copyright © 2016 John Wiley & Sons, Ltd. PMID:26887342
Harmonic regression and scale stability.
Lee, Yi-Hsuan; Haberman, Shelby J
2013-10-01
Monitoring a very frequently administered educational test with a relatively short history of stable operation imposes a number of challenges. Test scores usually vary by season, and the frequency of administration of such educational tests is also seasonal. Although it is important to react to unreasonable changes in the distributions of test scores in a timely fashion, it is not a simple matter to ascertain what sort of distribution is really unusual. Many commonly used approaches for seasonal adjustment are designed for time series with evenly spaced observations that span many years and, therefore, are inappropriate for data from such educational tests. Harmonic regression, a seasonal-adjustment method, can be useful in monitoring scale stability when the number of years available is limited and when the observations are unevenly spaced. Additional forms of adjustments can be included to account for variability in test scores due to different sources of population variations. To illustrate, real data are considered from an international language assessment. PMID:24092490
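A minimal version of harmonic regression can be sketched as ordinary least squares on sine/cosine terms at an annual frequency, which works with unevenly spaced administration dates; the score series below is simulated, with times in fractional years.

```python
import numpy as np

# Sketch of harmonic regression for seasonal adjustment: seasonal variation
# in a score series is modeled with cos/sin terms at one cycle per year,
# fitted by OLS, then subtracted to give a deseasonalized series.

rng = np.random.default_rng(7)
t = np.sort(rng.uniform(0, 3, 120))            # 3 years, unevenly spaced
score = (500 + 8 * np.cos(2 * np.pi * t) + 3 * np.sin(2 * np.pi * t)
         + 2 * rng.standard_normal(120))

X = np.column_stack([np.ones_like(t),
                     np.cos(2 * np.pi * t),
                     np.sin(2 * np.pi * t)])
beta, *_ = np.linalg.lstsq(X, score, rcond=None)

deseasonalized = score - X[:, 1:] @ beta[1:]
print(np.round(beta, 1))
```

Additional harmonics (two or more cycles per year) or covariates for population mix can be appended as extra columns of the design matrix, as the abstract suggests.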
Tailored logistics: the next advantage.
Fuller, J B; O'Conor, J; Rawlinson, R
1993-01-01
How many top executives have ever visited with managers who move materials from the factory to the store? How many still reduce the costs of logistics to the rent of warehouses and the fees charged by common carriers? To judge by hours of senior management attention, logistics problems do not rank high. But logistics have the potential to become the next governing element of strategy. Whether they know it or not, senior managers of every retail store and diversified manufacturing company compete in logistically distinct businesses. Customer needs vary, and companies can tailor their logistics systems to serve their customers better and more profitably. Companies do not create value for customers and sustainable advantage for themselves merely by offering varieties of goods. Rather, they offer goods in distinct ways. A particular can of Coca-Cola, for example, might be a can of Coca-Cola going to a vending machine, or a can of Coca-Cola that comes with billing services. There is a fortune buried in this distinction. The goal of logistics strategy is building distinct approaches to distinct groups of customers. The first step is organizing a cross-functional team to proceed through the following steps: segmenting customers according to purchase criteria, establishing different standards of service for different customer segments, tailoring logistics pipelines to support each segment, and creating economies of scale to determine which assets can be shared among various pipelines. The goal of establishing logistically distinct businesses is familiar: improved knowledge of customers and improved means of satisfying them. PMID:10126157
NASA Space Rocket Logistics Challenges
NASA Technical Reports Server (NTRS)
Neeley, James R.; Jones, James V.; Watson, Michael D.; Bramon, Christopher J.; Inman, Sharon K.; Tuttle, Loraine
2014-01-01
The Space Launch System (SLS) is the new NASA heavy lift launch vehicle and is scheduled for its first mission in 2017. The goal of the first mission, which will be uncrewed, is to demonstrate the integrated system performance of the SLS rocket and spacecraft before a crewed flight in 2021. SLS has many of the same logistics challenges as any other large scale program. Common logistics concerns for SLS include integration of discrete, geographically separated programs, multiple prime contractors with distinct and different goals, schedule pressures and funding constraints. However, SLS also faces unique challenges. The new program is a confluence of new and heritage hardware, with heritage hardware constituting seventy-five percent of the program. This unique approach to design makes logistics concerns such as commonality especially problematic. Additionally, a very low manifest rate of one flight every four years makes logistics comparatively expensive. That, along with the SLS architecture being developed using a block upgrade evolutionary approach, exacerbates long-range planning for supportability considerations. These common and unique logistics challenges must be clearly identified and tackled to allow SLS to have a successful program. This paper will address the common and unique challenges facing the SLS program, along with the analysis and decisions the NASA logistics engineers are making to mitigate the threats posed by each.
Binary Regression with Differentially Misclassified Response and Exposure Variables
Tang, Li; Lyles, Robert H.; King, Caroline C.; Celentano, David D.; Lo, Yungtai
2015-01-01
Misclassification is a long-standing statistical problem in epidemiology. In many real studies, either an exposure or a response variable or both may be misclassified. As such, potential threats to the validity of the analytic results (e.g., estimates of odds ratios) that stem from misclassification are widely discussed in the literature. Much of the discussion has been restricted to the nondifferential case, in which misclassification rates for a particular variable are assumed not to depend on other variables. However, complex differential misclassification patterns are common in practice, as we illustrate here using bacterial vaginosis (BV) and Trichomoniasis data from the HIV Epidemiology Research Study (HERS). Therefore, clear illustrations of valid and accessible methods that deal with complex misclassification are still in high demand. We formulate a maximum likelihood (ML) framework that allows flexible modeling of misclassification in both the response and a key binary exposure variable, while adjusting for other covariates via logistic regression. The approach emphasizes the use of internal validation data in order to evaluate the underlying misclassification mechanisms. Data-driven simulations show that the proposed ML analysis outperforms less flexible approaches that fail to appropriately account for complex misclassification patterns. The value and validity of the method are further demonstrated through a comprehensive analysis of the HERS example data. PMID:25652841
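The basic threat the paper addresses is easy to demonstrate by simulation: even nondifferential misclassification of a binary exposure attenuates the crude odds ratio toward 1. All probabilities below are illustrative; the paper's ML framework goes further by modeling the (possibly differential) misclassification probabilities explicitly with internal validation data.

```python
import random

# Simulation sketch: flip the observed exposure with probability 0.15,
# independently of the outcome (nondifferential), and compare the crude
# odds ratio from the true vs. the observed 2x2 table.

random.seed(0)
n = 200_000
p_exposed = 0.3
odds_unexposed = 0.1
true_or = 3.0
p0 = odds_unexposed / (1 + odds_unexposed)                      # P(Y=1 | E=0)
p1 = odds_unexposed * true_or / (1 + odds_unexposed * true_or)  # P(Y=1 | E=1)
flip = 0.15                                # Pr(observed exposure != true)

true_tab = {(e, y): 0 for e in (0, 1) for y in (0, 1)}
obs_tab = {(e, y): 0 for e in (0, 1) for y in (0, 1)}
for _ in range(n):
    e = int(random.random() < p_exposed)
    y = int(random.random() < (p1 if e else p0))
    e_obs = 1 - e if random.random() < flip else e
    true_tab[e, y] += 1
    obs_tab[e_obs, y] += 1

def crude_or(t):
    return (t[1, 1] * t[0, 0]) / (t[1, 0] * t[0, 1])

print(round(crude_or(true_tab), 2), round(crude_or(obs_tab), 2))
```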
Prediction in Multiple Regression.
ERIC Educational Resources Information Center
Osborne, Jason W.
2000-01-01
Presents the concept of prediction via multiple regression (MR) and discusses the assumptions underlying multiple regression analyses. Also discusses shrinkage, cross-validation, and double cross-validation of prediction equations and describes how to calculate confidence intervals around individual predictions. (SLD)
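The shrinkage and cross-validation ideas mentioned above can be made concrete with a split-sample check: R-squared computed on the derivation sample overstates predictive accuracy on new cases. Data are simulated; the sample sizes are illustrative and chosen so the drop is pronounced.

```python
import numpy as np

# Sketch of shrinkage in multiple regression prediction: fit on 30 cases
# with 10 predictors (only one of which matters), then compare R^2 on the
# derivation sample with R^2 on a held-out sample.

rng = np.random.default_rng(3)
X = rng.standard_normal((60, 10))
y = 0.5 * X[:, 0] + rng.standard_normal(60)

Xtr, ytr, Xte, yte = X[:30], y[:30], X[30:], y[30:]
beta, *_ = np.linalg.lstsq(np.column_stack([np.ones(30), Xtr]), ytr, rcond=None)

def r_squared(Xm, yv):
    pred = np.column_stack([np.ones(len(yv)), Xm]) @ beta
    return 1 - float(np.sum((yv - pred) ** 2) / np.sum((yv - yv.mean()) ** 2))

print(round(r_squared(Xtr, ytr), 2), round(r_squared(Xte, yte), 2))
```

Double cross-validation simply repeats this with the roles of the two halves reversed, giving two shrunken estimates of the prediction equation's accuracy.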
Improved Regression Calibration
ERIC Educational Resources Information Center
Skrondal, Anders; Kuha, Jouni
2012-01-01
The likelihood for generalized linear models with covariate measurement error cannot in general be expressed in closed form, which makes maximum likelihood estimation taxing. A popular alternative is regression calibration which is computationally efficient at the cost of inconsistent estimation. We propose an improved regression calibration…
Gerber, Samuel; Rübel, Oliver; Bremer, Peer-Timo; Pascucci, Valerio; Whitaker, Ross T.
2012-01-01
This paper introduces a novel partition-based regression approach that incorporates topological information. Partition-based regression typically introduces a quality-of-fit-driven decomposition of the domain. The emphasis in this work is on a topologically meaningful segmentation. Thus, the proposed regression approach is based on a segmentation induced by a discrete approximation of the Morse-Smale complex. This yields a segmentation with partitions corresponding to regions of the function with a single minimum and maximum that are often well approximated by a linear model. This approach yields regression models that are amenable to interpretation and have good predictive capacity. Typically, regression estimates are quantified by their geometrical accuracy. For the proposed regression, an important aspect is the quality of the segmentation itself. Thus, this paper introduces a new criterion that measures the topological accuracy of the estimate. The topological accuracy provides a complementary measure to the classical geometrical error measures and is very sensitive to over-fitting. The Morse-Smale regression is compared to state-of-the-art approaches in terms of geometry and topology and yields comparable or improved fits in many cases. Finally, a detailed study on climate-simulation data demonstrates the application of the Morse-Smale regression. Supplementary materials are available online and contain an implementation of the proposed approach in the R package msr, an analysis and simulations on the stability of the Morse-Smale complex approximation, and additional tables for the climate-simulation study. PMID:23687424
Miki, Takako; Kochi, Takeshi; Kuwahara, Keisuke; Eguchi, Masafumi; Kurotani, Kayo; Tsuruoka, Hiroko; Ito, Rie; Kabe, Isamu; Kawakami, Norito; Mizoue, Tetsuya; Nanri, Akiko
2015-09-30
Depression has been linked to the overall diet using both exploratory and pre-defined methods. However, neither of these methods incorporates specific knowledge on nutrient-disease associations. The aim of the present study was to empirically identify dietary patterns using reduced rank regression and to examine their relations to depressive symptoms. Participants were 2,006 Japanese employees aged 19-69 years. Depressive symptoms were assessed using the Center for Epidemiologic Studies Depression Scale. Diet was assessed using a validated, self-administered diet history questionnaire. Dietary patterns were extracted by reduced rank regression with 6 depression-related nutrients as response variables. Logistic regression was used to estimate odds ratios of depressive symptoms adjusted for potential confounders. A dietary pattern characterized by a high intake of vegetables, mushrooms, seaweeds, soybean products, green tea, potatoes, fruits, and small fish with bones and a low intake of rice was associated with fewer depressive symptoms. The multivariable-adjusted odds ratios of having depressive symptoms were 0.62 (95% confidence interval, 0.48-0.81) in the highest versus lowest tertiles of dietary score. Results suggest that adherence to a diet rich in vegetables, fruits, and typical Japanese foods, including mushrooms, seaweeds, soybean products, and green tea, is associated with a lower probability of having depressive symptoms. PMID:26208984
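The pattern-extraction step can be sketched with a basic reduced rank regression: an OLS fit of the nutrient responses on the food-group predictors, followed by an SVD of the fitted responses to obtain a low-rank coefficient matrix and pattern scores. Data, dimensions, and the single underlying pattern below are simulated illustrations, not the study's food groups or nutrients.

```python
import numpy as np

# Sketch of reduced rank regression (RRR): combine food-group intakes X into
# pattern scores that maximize explained variation in nutrient responses Y.
# Rank-r RRR coefficients are B_ols V_r V_r', where V_r holds the top right
# singular vectors of the OLS-fitted responses X @ B_ols.

rng = np.random.default_rng(5)
n, p, q = 300, 8, 3                    # subjects, food groups, nutrients
X = rng.standard_normal((n, p))
w_true = np.array([1.0, 1.0, 1.0, 0.0, 0.0, 0.0, 0.0, -1.0])  # one true pattern
pattern = X @ w_true
Y = np.outer(pattern, [0.5, 0.3, 0.2]) + 0.5 * rng.standard_normal((n, q))

B_ols = np.linalg.lstsq(X, Y, rcond=None)[0]     # p x q full-rank coefficients
U, s, Vt = np.linalg.svd(X @ B_ols, full_matrices=False)
B_rrr = B_ols @ Vt[:1].T @ Vt[:1]                # rank-1 coefficient matrix
scores = X @ B_ols @ Vt[0]                       # first dietary-pattern score

print(int(np.linalg.matrix_rank(B_rrr)),
      round(abs(float(np.corrcoef(scores, pattern)[0, 1])), 3))
```

In the study, scores like these (tertiles of the dietary score) would then enter a logistic regression for depressive symptoms.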
Schmid, Matthias; Wickler, Florian; Maloney, Kelly O.; Mitchell, Richard; Fenske, Nora; Mayr, Andreas
2013-01-01
Regression analysis with a bounded outcome is a common problem in applied statistics. Typical examples include regression models for percentage outcomes and the analysis of ratings that are measured on a bounded scale. In this paper, we consider beta regression, which is a generalization of logit models to situations where the response is continuous on the interval (0,1). Consequently, beta regression is a convenient tool for analyzing percentage responses. The classical approach to fitting a beta regression model is to use maximum likelihood estimation with subsequent AIC-based variable selection. As an alternative to this established - yet unstable - approach, we propose a new estimation technique called boosted beta regression. With boosted beta regression, estimation and variable selection can be carried out simultaneously in a highly efficient way. Additionally, both the mean and the variance of a percentage response can be modeled using flexible nonlinear covariate effects. As a consequence, the new method accounts for common problems such as overdispersion and non-binomial variance structures. PMID:23626706
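The underlying beta regression model being boosted can be sketched with a plain maximum-likelihood fit (a hedged illustration with simulated data and invented parameter values, not the paper's boosting algorithm): the mean gets a logit link and the precision a log link, and both are estimated by minimizing the negative beta log-likelihood.

```python
import numpy as np
from scipy.optimize import minimize
from scipy.special import gammaln

def neg_loglik(params, x, y):
    """Negative log-likelihood of a beta regression:
    logit link for the mean mu, log link for the precision phi, y in (0, 1)."""
    b0, b1, log_phi = params
    mu = 1.0 / (1.0 + np.exp(-(b0 + b1 * x)))
    phi = np.exp(log_phi)
    a, b = mu * phi, (1.0 - mu) * phi        # standard beta shape parameters
    ll = (gammaln(phi) - gammaln(a) - gammaln(b)
          + (a - 1) * np.log(y) + (b - 1) * np.log(1 - y))
    return -np.sum(ll)

rng = np.random.default_rng(1)
n = 500
x = rng.uniform(-2, 2, n)
mu_true = 1.0 / (1.0 + np.exp(-(-0.5 + 1.0 * x)))      # true b0 = -0.5, b1 = 1.0
y = rng.beta(mu_true * 30.0, (1.0 - mu_true) * 30.0)   # true precision phi = 30

fit = minimize(neg_loglik, x0=[0.0, 0.0, np.log(10.0)],
               args=(x, y), method="Nelder-Mead")
b0_hat, b1_hat, log_phi_hat = fit.x
```

Because the variance of a beta response is mu(1-mu)/(1+phi), modeling phi as well as mu is what lets the approach handle overdispersion and non-binomial variance structures.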
Logistics Management: New trends in the Reverse Logistics
NASA Astrophysics Data System (ADS)
Antonyová, A.; Antony, P.; Soewito, B.
2016-04-01
The present level and quality of the environment depend directly on our access to natural resources and on their sustainability. Production activities in particular, and the phenomena associated with them, have a direct impact on the future of our planet. The recycling process, which in large enterprises often becomes an important and integral part of the production program, is usually problematic in small and medium-sized enterprises. We can specify a few factors that have a direct impact on the development and successful application of an effective reverse logistics system. Finding a way to an economically acceptable model of reverse logistics, focused on converting waste materials into renewable energy, is the task in progress.
Lessons in logistics from Somalia.
Kemball-Cook, D; Stephenson, R
1984-03-01
By February 1981 the refugee relief operation in Somalia was close to breakdown. The Government of Somalia and the United Nations High Commissioner for Refugees (UNHCR) contracted the agency CARE to manage the logistics of the operation. By August 1981 over 99% of food received at Mogadishu was reaching the camps. Here we describe this apparent success and attempt to diagnose the contributing factors. Chief among these are dynamic leadership, 'systems' management, adaptability of personnel, the use of professional Indian food monitors in the camps, and the support given by the Government. The chief qualification on the success of the operation has been the continued dependency on expatriate expertise. General conclusions are offered relating to the management of logistics in relief operations. The most important conclusion is that there is a prime need for logistics to be centralized in a single organization at the start of major emergencies. We point to the current inadequacy of an international relief system which fails to ensure this, and suggest that a new or existing part of the United Nations family be given a 'brief for in-country logistics' to become a UN Emergency Logistics Office. PMID:20958559
George: Gaussian Process regression
NASA Astrophysics Data System (ADS)
Foreman-Mackey, Daniel
2015-11-01
George is a fast and flexible library, implemented in C++ with Python bindings, for Gaussian Process regression useful for accounting for correlated noise in astronomical datasets, including those for transiting exoplanet discovery and characterization and stellar population modeling.
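George's own API is documented with the package; as a library-agnostic sketch of the computation it accelerates, the core of GP regression is the standard posterior under a squared-exponential kernel with white observation noise (all values below are illustrative, not George code):

```python
import numpy as np

def sq_exp_kernel(x1, x2, amp=1.0, scale=1.0):
    """Squared-exponential covariance between two sets of 1-D inputs."""
    return amp ** 2 * np.exp(-0.5 * ((x1[:, None] - x2[None, :]) / scale) ** 2)

def gp_predict(x_train, y_train, x_test, noise=0.1):
    """Posterior mean and variance of a zero-mean GP with white observation
    noise; this is the regression a GP library like George speeds up."""
    K = sq_exp_kernel(x_train, x_train) + noise ** 2 * np.eye(len(x_train))
    Ks = sq_exp_kernel(x_test, x_train)
    Kss = sq_exp_kernel(x_test, x_test)
    alpha = np.linalg.solve(K, y_train)     # K^{-1} y, via a linear solve
    mean = Ks @ alpha
    cov = Kss - Ks @ np.linalg.solve(K, Ks.T)
    return mean, np.diag(cov)

x = np.linspace(0, 10, 30)
y = np.sin(x)                               # stand-in for a noisy light curve
x_new = np.array([2.5, 7.5])
mean, var = gp_predict(x, y, x_new, noise=1e-3)
```

In the correlated-noise use case the abstract mentions, the kernel models the noise covariance itself, so the same solve marginalizes over the nuisance structure (e.g., stellar variability) while fitting the signal of interest.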
Multivariate Regression with Calibration*
Liu, Han; Wang, Lie; Zhao, Tuo
2014-01-01
We propose a new method named calibrated multivariate regression (CMR) for fitting high dimensional multivariate regression models. Compared to existing methods, CMR calibrates the regularization for each regression task with respect to its noise level so that it is simultaneously tuning insensitive and achieves an improved finite-sample performance. Computationally, we develop an efficient smoothed proximal gradient algorithm which has a worst-case iteration complexity O(1/ε), where ε is a pre-specified numerical accuracy. Theoretically, we prove that CMR achieves the optimal rate of convergence in parameter estimation. We illustrate the usefulness of CMR by thorough numerical simulations and show that CMR consistently outperforms other high dimensional multivariate regression methods. We also apply CMR on a brain activity prediction problem and find that CMR is as competitive as the handcrafted model created by human experts. PMID:25620861
Logistics background study: underground mining
Hanslovan, J. J.; Visovsky, R. G.
1982-02-01
Logistical functions that are normally associated with US underground coal mining are investigated and analyzed. These functions comprise all activities and services that support the producing sections of the mine. The report provides a better understanding of how these functions impact coal production in terms of time, cost, and safety. Major underground logistics activities are analyzed, including: transportation of personnel, supplies, and equipment; transportation of coal and rock; electrical distribution and communications systems; water handling; hydraulics; and ventilation systems. Recommended areas for future research are identified and prioritized.
Continual Improvement in Shuttle Logistics
NASA Technical Reports Server (NTRS)
Flowers, Jean; Schafer, Loraine
1995-01-01
It has been said that Continual Improvement (CI) is difficult to apply to service oriented functions, especially in a government agency such as NASA. However, a constrained budget and increasing requirements are a way of life at NASA Kennedy Space Center (KSC), making it a natural environment for the application of CI tools and techniques. This paper describes how KSC, and specifically the Space Shuttle Logistics Project, a key contributor to KSC's mission, has embraced the CI management approach as a means of achieving its strategic goals and objectives. An overview of how the KSC Space Shuttle Logistics Project has structured its CI effort and examples of some of the initiatives are provided.
A development of logistics management models for the Space Transportation System
NASA Technical Reports Server (NTRS)
Carrillo, M. J.; Jacobsen, S. E.; Abell, J. B.; Lippiatt, T. F.
1983-01-01
A new analytic queueing approach was described which relates stockage levels, repair level decisions, and the project network schedule of prelaunch operations directly to the probability distribution of the space transportation system launch delay. Finite source population and limited repair capability were additional factors included in this logistics management model developed specifically for STS maintenance requirements. Data presently available to support logistics decisions were based on a comparability study of heavy aircraft components. A two-phase program is recommended by which NASA would implement an integrated data collection system, assemble logistics data from previous STS flights, revise extant logistics planning and resource requirement parameters using Bayes-Lin techniques, and adjust for uncertainty surrounding logistics systems performance parameters. The implementation of these recommendations can be expected to deliver more cost-effective logistics support.
Moderated Multiple Regression, Spurious Interaction Effects, and IRT
ERIC Educational Resources Information Center
Kang, Sun-Mee; Waller, Niels G.
2005-01-01
Two Monte Carlo studies were conducted to explore the Type I error rates in moderated multiple regression (MMR) of observed scores and estimated latent trait scores from a two-parameter logistic item response theory (IRT) model. The results of both studies showed that MMR Type I error rates were substantially higher than the nominal alpha levels…
Logistics support of space facilities
NASA Technical Reports Server (NTRS)
Lewis, William C.
1988-01-01
The logistic support of space facilities is described, with special attention given to the problem of sizing the inventory of ready spares kept at the space facility. Where possible, data from the Space Shuttle Orbiter is extrapolated to provide numerical estimates for space facilities. Attention is also given to repair effort estimation and long duration missions.
NASA Space Rocket Logistics Challenges
NASA Technical Reports Server (NTRS)
Bramon, Chris; Neeley, James R.; Jones, James V.; Watson, Michael D.; Inman, Sharon K.; Tuttle, Loraine
2014-01-01
The Space Launch System (SLS) is the new NASA heavy lift launch vehicle in development and is scheduled for its first mission in 2017. SLS has many of the same logistics challenges as any other large scale program. However, SLS also faces unique challenges. This presentation will address the SLS challenges, along with the analysis and decisions to mitigate the threats posed by each.
Regression versus No Regression in the Autistic Disorder: Developmental Trajectories
ERIC Educational Resources Information Center
Bernabei, P.; Cerquiglini, A.; Cortesi, F.; D' Ardia, C.
2007-01-01
Developmental regression is a complex phenomenon which occurs in 20-49% of the autistic population. Aim of the study was to assess possible differences in the development of regressed and non-regressed autistic preschoolers. We longitudinally studied 40 autistic children (18 regressed, 22 non-regressed) aged 2-6 years. The following developmental…
Explorations in Statistics: Regression
ERIC Educational Resources Information Center
Curran-Everett, Douglas
2011-01-01
Learning about statistics is a lot like learning about science: the learning is more meaningful if you can actively explore. This seventh installment of "Explorations in Statistics" explores regression, a technique that estimates the nature of the relationship between two things for which we may only surmise a mechanistic or predictive connection.…
Modern Regression Discontinuity Analysis
ERIC Educational Resources Information Center
Bloom, Howard S.
2012-01-01
This article provides a detailed discussion of the theory and practice of modern regression discontinuity (RD) analysis for estimating the effects of interventions or treatments. Part 1 briefly chronicles the history of RD analysis and summarizes its past applications. Part 2 explains how in theory an RD analysis can identify an average effect of…
Webcast entitled Statistical Tools for Making Sense of Data, by the National Nutrient Criteria Support Center, N-STEPS (Nutrients-Scientific Technical Exchange Partnership). The section "Correlation and Regression" provides an overview of these two techniques in the context of nut...
Multiple linear regression analysis
NASA Technical Reports Server (NTRS)
Edwards, T. R.
1980-01-01
Program rapidly selects best-suited set of coefficients. User supplies only vectors of independent and dependent data and specifies confidence level required. Program uses stepwise statistical procedure for relating minimal set of variables to set of observations; final regression contains only most statistically significant coefficients. Program is written in FORTRAN IV for batch execution and has been implemented on NOVA 1200.
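The program itself is FORTRAN IV, but the stepwise idea it describes can be sketched in a few lines (a hedged illustration with an invented stopping rule and simulated data, not the NTRS code): greedily add the predictor that most reduces the residual sum of squares, stopping when the relative improvement becomes negligible.

```python
import numpy as np

def forward_stepwise(X, y, min_improvement=0.05):
    """Greedy forward selection: add the column cutting RSS the most,
    stop when the relative improvement drops below the threshold."""
    n, p = X.shape
    selected, rss = [], float(np.sum((y - y.mean()) ** 2))
    while len(selected) < p:
        best = None
        for j in range(p):
            if j in selected:
                continue
            cols = selected + [j]
            beta, *_ = np.linalg.lstsq(X[:, cols], y, rcond=None)
            r = float(np.sum((y - X[:, cols] @ beta) ** 2))
            if best is None or r < best[1]:
                best = (j, r)
        if (rss - best[1]) / rss < min_improvement:
            break                      # no candidate is significant enough
        selected.append(best[0])
        rss = best[1]
    return selected

rng = np.random.default_rng(2)
X = rng.normal(size=(200, 5))
y = 2.0 * X[:, 0] - 1.5 * X[:, 3] + 0.2 * rng.normal(size=200)  # only 0 and 3 matter
chosen = forward_stepwise(X, y)
```

A production version would use an F-test at a user-specified confidence level, as the abstract describes, rather than this fixed relative-improvement threshold.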
Mechanisms of neuroblastoma regression
Brodeur, Garrett M.; Bagatell, Rochelle
2014-01-01
Recent genomic and biological studies of neuroblastoma have shed light on the dramatic heterogeneity in the clinical behaviour of this disease, which spans from spontaneous regression or differentiation in some patients, to relentless disease progression in others, despite intensive multimodality therapy. This evidence also suggests several possible mechanisms to explain the phenomena of spontaneous regression in neuroblastomas, including neurotrophin deprivation, humoral or cellular immunity, loss of telomerase activity and alterations in epigenetic regulation. A better understanding of the mechanisms of spontaneous regression might help to identify optimal therapeutic approaches for patients with these tumours. Currently, the most druggable mechanism is the delayed activation of developmentally programmed cell death regulated by the tropomyosin receptor kinase A pathway. Indeed, targeted therapy aimed at inhibiting neurotrophin receptors might be used in lieu of conventional chemotherapy or radiation in infants with biologically favourable tumours that require treatment. Alternative approaches consist of breaking immune tolerance to tumour antigens or activating neurotrophin receptor pathways to induce neuronal differentiation. These approaches are likely to be most effective against biologically favourable tumours, but they might also provide insights into treatment of biologically unfavourable tumours. We describe the different mechanisms of spontaneous neuroblastoma regression and the consequent therapeutic approaches. PMID:25331179
Bayesian ARTMAP for regression.
Sasu, L M; Andonie, R
2013-10-01
Bayesian ARTMAP (BA) is a recently introduced neural architecture which uses a combination of Fuzzy ARTMAP competitive learning and Bayesian learning. Training is generally performed online, in a single epoch. During training, BA creates input data clusters as Gaussian categories, and also infers the conditional probabilities between input patterns and categories, and between categories and classes. During prediction, BA uses Bayesian posterior probability estimation. So far, BA has been used only for classification. The goal of this paper is to analyze the efficiency of BA for regression problems. Our contributions are: (i) we generalize the BA algorithm using the clustering functionality of both ART modules, and name it BA for Regression (BAR); (ii) we prove that BAR is a universal approximator with the best approximation property. In other words, BAR approximates arbitrarily well any continuous function (universal approximation) and, for every given continuous function, there is one in the set of BAR approximators situated at minimum distance (best approximation); (iii) we experimentally compare the online trained BAR with several neural models, on the following standard regression benchmarks: CPU Computer Hardware, Boston Housing, Wisconsin Breast Cancer, and Communities and Crime. Our results show that BAR is an appropriate tool for regression tasks, both for theoretical and practical reasons. PMID:23665468
Multisource information fusion for logistics
NASA Astrophysics Data System (ADS)
Woodley, Robert; Petrov, Plamen; Noll, Warren
2011-05-01
Current Army logistical systems and databases contain massive amounts of data that need an effective method to extract actionable information. The databases do not contain the root-cause and case-based analysis needed to diagnose or predict breakdowns. A system is needed to gather data from as many sources as possible, process it in an integrated fashion, and disseminate information products on the readiness of the fleet vehicles. 21st Century Systems, Inc. introduces the Agent-Enabled Logistics Enterprise Intelligence System (AELEIS) tool, designed to assist logistics analysts with assessing the availability and prognostics of assets in the logistics pipeline. AELEIS extracts data from multiple, heterogeneous data sets. This data is then aggregated and mined for data trends. Finally, data reasoning tools and prognostics tools evaluate the data for relevance and potential issues. Multiple types of data mining tools may be employed, and an information reasoning capability determines which tools are needed to extract the information. This can be visualized as a push-pull system where data trends fire a reasoning engine to search for corroborating evidence and then integrate the data into actionable information. The architecture decides which reasoning engine to use (i.e., it may start with a rule-based method but, if needed, go to condition-based reasoning, and even a model-based reasoning engine for certain types of equipment). Initial results show that AELEIS is able to alert the user to potential fault conditions and root-cause information mined from a database.
Logistics, electronic commerce, and the environment
NASA Astrophysics Data System (ADS)
Sarkis, Joseph; Meade, Laura; Talluri, Srinivas
2002-02-01
Organizations realize that a strong supporting logistics or electronic logistics (e-logistics) function is important from both commercial and consumer perspectives. The implications of e-logistics models and practices cover the forward and reverse logistics functions of organizations. They also have direct and profound impact on the natural environment. This paper will focus on a discussion of forward and reverse e-logistics and their relationship to the natural environment. After discussion of the many pertinent issues in these areas, directions of practice and implications for study and research are then described.
You, Hua; Gu, Hai; Ning, Weiqing; Zhou, Hua; Dong, Hengjin
2016-01-01
Background: The New Rural Cooperative Medical Scheme (NCMS) includes a maternal care benefits package that is associated with increasing maternal health services. The local compensation policies have been frequently adjusted in recent years. This study examined the association between the NCMS maternal-services policy adjustment and expense reimbursement in Yuyao, China. Methods: Two household surveys were conducted in Yuyao in 2008 and 2011 (before and after the NCMS policy adjustment, respectively). Local women (N = 154) who had a delivery history in the past three years were recruited. A questionnaire was used to collect information about delivery history, maternal health services utilization (prenatal care, postnatal care, and the grade of delivery institutions), NCMS participation, and reimbursement status. Logistic regression analyses were used to predict the association between policy adjustment and maternal health utilization and the association between policy adjustment and out-of-pocket proportion. Next, t-tests and covariance analyses adjusting for household income were used to compare the out-of-pocket proportion between 2008 and 2011. Results: Results revealed that compensation policy adjustment was associated with an increase in postnatal visits (adjusted OR = 3.32, p = 0.009) and the use of second-level or above institutions for delivery (adjusted OR = 2.32, p = 0.03) among participants. In 2008, only 9.1% of pregnant women received reimbursement from the NCMS; however, this rate increased to 36.8% in 2011. After policy adjustment, there were no significant changes in the out-of-pocket share of the delivery fee (F = 0.24, p = 0.63), including after adjustment for household income (F = 0.46, p = 0.50). Conclusions: Increased financial compensation improved maternal health services utilization; however, this effect was limited. Although the reimbursement rate was raised, the out-of-pocket proportion was not significantly changed; therefore, the compensation design
Shapiro, M F; Park, R E; Keesey, J; Brook, R H
1994-01-01
OBJECTIVE. This study investigated how mortality differences between groups of municipal versus voluntary hospitals are affected by case-mix adjustment methods. DATA SOURCES AND STUDY SETTING. We sampled about 10,000 random admissions from administrative data for patients hospitalized with each of six conditions in hospitals in New York City during 1984-1987. STUDY DESIGN. We developed logistic regression models adjusting for age and gender, for principal diagnosis, for "limited other diagnoses" (secondary diagnoses that were very unlikely to result from care received), for "full other diagnoses" (all secondary diagnoses irrespective of whether they might have been due to care received), for previous diagnoses, and for other variables. PRINCIPAL FINDINGS. For five of the six conditions, when the limited other diagnoses adjustment was used there was higher mortality in the municipal hospitals (p < .05), with 3.3 additional deaths/100 admissions for myocardial infarction, 1.2 for pneumonia, 8.3 for stroke, 2.8 for head trauma, and 0.8 for hip repair. However, when the full other diagnoses adjustment was used, differences remained significant only for stroke (4.3 additional deaths/100 admissions) and head trauma (1.3) (p < .05). CONCLUSIONS. Estimates of mortality differences between New York City municipal and voluntary hospitals are substantially affected by which secondary diagnoses are used in case-mix adjustment. Judgments of quality should not be based on administrative data unless models can be developed that validly capture level of sickness at admission. PMID:8163382
Ridge Regression: A Regression Procedure for Analyzing Correlated Independent Variables.
ERIC Educational Resources Information Center
Rakow, Ernest A.
Ridge regression is presented as an analytic technique to be used when predictor variables in a multiple linear regression situation are highly correlated, a situation which may result in unstable regression coefficients and difficulties in interpretation. Ridge regression avoids the problem of selection of variables that may occur in stepwise…
Ridge Regression Signal Processing
NASA Technical Reports Server (NTRS)
Kuhl, Mark R.
1990-01-01
The introduction of the Global Positioning System (GPS) into the National Airspace System (NAS) necessitates the development of Receiver Autonomous Integrity Monitoring (RAIM) techniques. In order to guarantee a certain level of integrity, a thorough understanding of modern estimation techniques applied to navigational problems is required. The extended Kalman filter (EKF) is derived and analyzed under poor geometry conditions. It was found that the performance of the EKF is difficult to predict, since the EKF is designed for a Gaussian environment. A novel approach is implemented which incorporates ridge regression to explain the behavior of an EKF in the presence of dynamics under poor geometry conditions. The basic principles of ridge regression theory are presented, followed by the derivation of a linearized recursive ridge estimator. Computer simulations are performed to confirm the underlying theory and to provide a comparative analysis of the EKF and the recursive ridge estimator.
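The ridge estimator at the heart of the report can be shown in a minimal numerical sketch (not the recursive EKF-based version the report derives; data and λ values invented): the penalty stabilizes a nearly collinear design, and the coefficient norm shrinks monotonically as λ grows.

```python
import numpy as np

def ridge(X, y, lam):
    """Closed-form ridge estimate: (X'X + lam*I)^{-1} X'y."""
    p = X.shape[1]
    return np.linalg.solve(X.T @ X + lam * np.eye(p), X.T @ y)

rng = np.random.default_rng(3)
n = 100
x1 = rng.normal(size=n)
x2 = x1 + 0.01 * rng.normal(size=n)      # nearly collinear with x1
X = np.column_stack([x1, x2])
y = x1 + x2 + 0.1 * rng.normal(size=n)

# Coefficient norm as the ridge penalty increases
norms = [np.linalg.norm(ridge(X, y, lam)) for lam in (0.0, 0.1, 1.0, 10.0)]
mse_ridge = np.mean((X @ ridge(X, y, 1.0) - y) ** 2)
```

With λ = 0 this is ordinary least squares, whose coefficients inflate under the near-collinearity; a modest λ tames them while barely affecting the fitted values, which is exactly the "poor measurement geometry" situation described above.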
Fast Censored Linear Regression
HUANG, YIJIAN
2013-01-01
Weighted log-rank estimating function has become a standard estimation method for the censored linear regression model, or the accelerated failure time model. Well established statistically, the estimator defined as a consistent root has, however, rather poor computational properties because the estimating function is neither continuous nor, in general, monotone. We propose a computationally efficient estimator through an asymptotics-guided Newton algorithm, in which censored quantile regression methods are tailored to yield an initial consistent estimate and a consistent derivative estimate of the limiting estimating function. We also develop fast interval estimation with a new proposal for sandwich variance estimation. The proposed estimator is asymptotically equivalent to the consistent root estimator and barely distinguishable in samples of practical size. However, computation time is typically reduced by two to three orders of magnitude for point estimation alone. Illustrations with clinical applications are provided. PMID:24347802
Comparing the Discrete and Continuous Logistic Models
ERIC Educational Resources Information Center
Gordon, Sheldon P.
2008-01-01
The solutions of the discrete logistic growth model based on a difference equation and the continuous logistic growth model based on a differential equation are compared and contrasted. The investigation is conducted using a dynamic interactive spreadsheet. (Contains 5 figures.)
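The article's comparison uses a spreadsheet; the same experiment can be sketched in a few lines (parameter values illustrative): the difference equation x_{n+1} = x_n + r·x_n(1 − x_n) tracks the closed-form continuous solution for small r, but oscillates once r exceeds the stability bound r = 2.

```python
import numpy as np

def discrete_logistic(x0, r, steps):
    """Iterate the difference equation x_{n+1} = x_n + r * x_n * (1 - x_n)."""
    x = [x0]
    for _ in range(steps):
        x.append(x[-1] + r * x[-1] * (1 - x[-1]))
    return np.array(x)

def continuous_logistic(x0, r, t):
    """Closed-form solution of dx/dt = r * x * (1 - x), carrying capacity 1."""
    c = (1 - x0) / x0
    return 1.0 / (1.0 + c * np.exp(-r * t))

t = np.arange(201)
slow_discrete = discrete_logistic(0.1, 0.1, 200)
slow_continuous = continuous_logistic(0.1, 0.1, t)
fast_discrete = discrete_logistic(0.1, 2.5, 200)   # beyond the bound r = 2
```

For r = 0.1 the two solutions are nearly indistinguishable; for r = 2.5 the discrete model settles into a persistent oscillation around the carrying capacity instead of converging, which is the central contrast the article develops.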
Research on 6R Military Logistics Network
NASA Astrophysics Data System (ADS)
Jie, Wan; Wen, Wang
The building of a military logistics network is an important issue for the construction of new forces. This paper proposes a concept model of a 6R military logistics network based on JIT. We then conceive of a hub-and-spoke logistics center network, a flexible 6R organizational network, and a lean, grid-based 6R military information network. Strategies and proposals for the construction of these three sub-networks of the 6R military logistics network are then given.
Orthogonal Regression: A Teaching Perspective
ERIC Educational Resources Information Center
Carr, James R.
2012-01-01
A well-known approach to linear least squares regression is that which involves minimizing the sum of squared orthogonal projections of data points onto the best fit line. This form of regression is known as orthogonal regression, and the linear model that it yields is known as the major axis. A similar method, reduced major axis regression, is…
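A minimal numerical sketch of the distinction (an illustration with invented data, not from the article): the major axis is the leading eigenvector of the sample covariance matrix of (x, y), whereas OLS minimizes only vertical distances.

```python
import numpy as np

def major_axis(x, y):
    """Orthogonal (major-axis) regression: slope and intercept from the leading
    eigenvector of the 2x2 covariance matrix of (x, y)."""
    cov = np.cov(x, y)
    eigvals, eigvecs = np.linalg.eigh(cov)
    v = eigvecs[:, np.argmax(eigvals)]        # direction of largest variance
    slope = v[1] / v[0]
    intercept = y.mean() - slope * x.mean()   # line passes through the centroid
    return slope, intercept

x = np.array([0.0, 1.0, 2.0, 3.0, 4.0])
y = 2.0 * x + 1.0                             # exact line: both methods agree
slope_ma, intercept_ma = major_axis(x, y)
slope_ols = np.polyfit(x, y, 1)[0]
```

On exact data the two slopes coincide; once noise is added to x as well as y, the OLS slope is attenuated toward zero while the major-axis slope treats the two variables symmetrically, which is the teaching point of orthogonal regression.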
Nowcasting sunshine number using logistic modeling
NASA Astrophysics Data System (ADS)
Brabec, Marek; Badescu, Viorel; Paulescu, Marius
2013-04-01
In this paper, we present a formalized approach to statistical modeling of the sunshine number, a binary indicator of whether the Sun is covered by clouds, introduced previously by Badescu (Theor Appl Climatol 72:127-136, 2002). Our statistical approach is based on a Markov chain and logistic regression and yields fully specified probability models that are relatively easily identified (and their unknown parameters estimated) from a set of empirical data (observed sunshine number and sunshine stability number series). We discuss the general structure of the model and its advantages, demonstrate its performance on real data, and compare its results to a classical ARIMA approach as a competitor. Since the model parameters have a clear interpretation, we also illustrate how, e.g., their inter-seasonal stability can be tested. We conclude with an outlook on future developments oriented to the construction of models allowing for a practically desirable smooth transition between data observed at different frequencies, and with a short discussion of technical problems that such a goal brings.
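A stripped-down sketch of this model class (not the authors' fitted model; all parameter values invented): the next sunshine state is Bernoulli with a logistic probability depending on the current state, i.e., a two-state Markov chain parameterized by a logistic regression, fitted here with Newton's method (IRLS).

```python
import numpy as np

def fit_logistic_irls(X, y, iters=25):
    """Maximum-likelihood logistic regression via iteratively reweighted
    least squares (Newton's method)."""
    beta = np.zeros(X.shape[1])
    for _ in range(iters):
        p = 1.0 / (1.0 + np.exp(-X @ beta))
        W = p * (1 - p)                               # Bernoulli variances
        beta = beta + np.linalg.solve(X.T @ (W[:, None] * X), X.T @ (y - p))
    return beta

# Simulate a sunshine-number series: P(sunny now) depends on the previous state
rng = np.random.default_rng(4)
b0_true, b1_true = -1.0, 2.5        # strong persistence: sunny tends to stay sunny
s = [1]
for _ in range(5000):
    p = 1.0 / (1.0 + np.exp(-(b0_true + b1_true * s[-1])))
    s.append(int(rng.random() < p))
s = np.array(s)

X = np.column_stack([np.ones(len(s) - 1), s[:-1]])    # intercept + previous state
beta_hat = fit_logistic_irls(X, s[1:])
```

Extra covariates (season, time of day) slot into further columns of X, which is what turns the bare Markov chain into the fully specified, interpretable probability model the abstract describes.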
Mini pressurized logistics module (MPLM)
NASA Astrophysics Data System (ADS)
Vallerani, E.; Brondolo, D.; Basile, L.
1996-06-01
The MPLM Program was initiated through a Memorandum of Understanding (MOU) between the United States' National Aeronautics and Space Administration (NASA) and Italy's ASI, the Italian Space Agency, that was signed on 6 December 1991. The MPLM is a pressurized logistics module that will be used to transport supplies and materials (up to 20,000 lb), including user experiments, between Earth and International Space Station Alpha (ISSA) using the Shuttle, to support active and passive storage, and to provide a habitable environment for two people when docked to the Station. The Italian Space Agency has selected Alenia Spazio to develop the MPLM modules, which have always been considered a key element of the new International Space Station, benefiting from their design flexibility and consequent possible cost saving based on the maximum utilization of the Shuttle launch capability for any mission. In the framework of the very recent agreement between the U.S. and Russia for cooperation in space, which foresees the utilization of MIR 1 hardware, the Italian MPLM will remain an important element of the logistics system, being the only pressurized module designed for re-entry. Within the new scenario of anticipated Shuttle flights to MIR 1 during Space Station phase 1, MPLM remains a candidate for one or more missions to provide MIR 1 resupply capabilities and advanced ISSA hardware/procedures verification. Based on the concept of Flexible Carriers, Alenia Spazio is providing NASA with three MPLM flight units that can be configured according to the requirements of the Human-Tended Capability (HTC) and Permanent Human Capability (PHC) of the Space Station. Configurability will allow transportation of passive cargo only, or a combination of passive and cold cargo accommodated in R/F racks. Having developed and qualified the baseline configuration with respect to the worst enveloping condition, each unit could be easily configured to the passive or active version depending upon the
Accounting for the correlation between fellow eyes in regression analysis.
Glynn, R J; Rosner, B
1992-03-01
Regression techniques that appropriately use all available eyes have infrequently been applied in the ophthalmologic literature, despite advances both in the development of statistical models and in the availability of computer software to fit these models. We considered the general linear model and polychotomous logistic regression approaches of Rosner and the estimating equation approach of Liang and Zeger, applied to both linear and logistic regression. Methods were illustrated with the use of two real data sets: (1) impairment of visual acuity in patients with retinitis pigmentosa and (2) overall visual field impairment in elderly patients evaluated for glaucoma. We discuss the interpretation of coefficients from these models and the advantages of these approaches compared with alternative approaches, such as treating individuals rather than eyes as the unit of analysis, separate regression analyses of right and left eyes, or utilization of ordinary regression techniques without accounting for the correlation between fellow eyes. Specific advantages include enhanced statistical power, more interpretable regression coefficients, greater precision of estimation, and less sensitivity to missing data for some eyes. We concluded that these models should be used more frequently in ophthalmologic research, and we provide guidelines for choosing between alternative models. PMID:1543458
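One of the simplest ways to account for the fellow-eye correlation, in the spirit of the Liang–Zeger estimating-equation approach, is a cluster-robust (sandwich) variance for an ordinary regression with each person as a cluster. The sketch below is a generic illustration with simulated data, not the models compared in the article.

```python
import numpy as np

def ols_cluster_se(X, y, clusters):
    """OLS point estimates with naive and cluster-robust (sandwich) standard
    errors; `clusters` indexes the person each eye belongs to."""
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    n, p = X.shape
    bread = np.linalg.inv(X.T @ X)
    sigma2 = resid @ resid / (n - p)
    naive_se = np.sqrt(np.diag(bread) * sigma2)       # assumes independent eyes
    meat = np.zeros((p, p))
    for c in np.unique(clusters):
        m = clusters == c
        g = X[m].T @ resid[m]                         # per-person score sum
        meat += np.outer(g, g)
    robust_se = np.sqrt(np.diag(bread @ meat @ bread))
    return beta, naive_se, robust_se

rng = np.random.default_rng(5)
n_people = 300
person = np.repeat(np.arange(n_people), 2)            # two eyes per person
age = np.repeat(rng.normal(60, 10, n_people), 2)      # person-level covariate
shared = np.repeat(rng.normal(0, 2.0, n_people), 2)   # within-person correlation
acuity = 0.05 * age + shared + 0.5 * rng.normal(size=2 * n_people)
X = np.column_stack([np.ones(2 * n_people), age])
beta, naive_se, robust_se = ols_cluster_se(X, acuity, person)
```

For a person-level covariate like age, ignoring the correlation makes the naive standard error too small; the cluster-robust version widens it appropriately, illustrating why eyes-as-independent-observations analyses overstate precision.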
Logistics Handbook, 1976. Colorado Outward Bound School.
ERIC Educational Resources Information Center
Colorado Outward Bound School, Denver.
Logistics, a support mission, is vital to the successful operation of the Colorado Outward Bound School (COBS) courses. Logistics is responsible for purchasing, maintaining, transporting, and replenishing a wide variety of items, i.e., food, mountaineering and camping equipment, medical and other supplies, and vehicles. The Logistics coordinator…
Incremental hierarchical discriminant regression.
Weng, Juyang; Hwang, Wey-Shiuan
2007-03-01
This paper presents incremental hierarchical discriminant regression (IHDR) which incrementally builds a decision tree or regression tree for very high-dimensional regression or decision spaces by an online, real-time learning system. Biologically motivated, it is an approximate computational model for automatic development of associative cortex, with both bottom-up sensory inputs and top-down motor projections. At each internal node of the IHDR tree, information in the output space is used to automatically derive the local subspace spanned by the most discriminating features. Embedded in the tree is a hierarchical probability distribution model used to prune very unlikely cases during the search. The number of parameters in the coarse-to-fine approximation is dynamic and data-driven, enabling the IHDR tree to automatically fit data with unknown distribution shapes (thus, it is difficult to select the number of parameters up front). The IHDR tree dynamically assigns long-term memory to avoid the loss-of-memory problem typical with a global-fitting learning algorithm for neural networks. A major challenge for an incrementally built tree is that the number of samples varies arbitrarily during the construction process. An incrementally updated probability model, called sample-size-dependent negative-log-likelihood (SDNLL) metric is used to deal with large sample-size cases, small sample-size cases, and unbalanced sample-size cases, measured among different internal nodes of the IHDR tree. We report experimental results for four types of data: synthetic data to visualize the behavior of the algorithms, large face image data, continuous video stream from robot navigation, and publicly available data sets that use human defined features. PMID:17385628
NASA Technical Reports Server (NTRS)
Kuhl, Mark R.
1990-01-01
Current navigation requirements depend on a geometric dilution of precision (GDOP) criterion. As long as the GDOP stays below a specific value, navigation requirements are met. The GDOP will exceed the specified value when the measurement geometry becomes too collinear. A new signal processing technique, called Ridge Regression Processing, can reduce the effects of nearly collinear measurement geometry, thereby reducing the inflation of the measurement errors. It is shown that the Ridge signal processor gives a consistently better mean squared error (MSE) in position than the Ordinary Least Mean Squares (OLS) estimator. The applicability of this technique to improving the following areas is currently being investigated: receiver autonomous integrity monitoring (RAIM), coverage requirements, availability requirements, and precision approaches.
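The stabilising effect of ridge estimation under nearly collinear geometry can be sketched in a few lines. This is an illustrative synthetic example, not the navigation processing itself: two almost identical regressors make the OLS normal equations ill-conditioned, while the ridge penalty keeps the individual coefficients near their true values.

```python
import numpy as np

rng = np.random.default_rng(0)

# Nearly collinear design: x2 is almost a copy of x1.
n = 200
x1 = rng.normal(size=n)
x2 = x1 + 1e-3 * rng.normal(size=n)
X = np.column_stack([x1, x2])
beta_true = np.array([1.0, 1.0])
y = X @ beta_true + 0.1 * rng.normal(size=n)

# OLS: solve (X'X) b = X'y  -- unstable when X'X is near-singular,
# so individual coefficients inflate wildly (though their sum stays sane).
beta_ols = np.linalg.solve(X.T @ X, X.T @ y)

# Ridge: solve (X'X + lambda*I) b = X'y  -- the penalty shrinks the
# ill-determined difference direction, stabilising the estimates.
lam = 1.0
beta_ridge = np.linalg.solve(X.T @ X + lam * np.eye(2), X.T @ y)

print("OLS  :", beta_ols)
print("Ridge:", beta_ridge)
```

Running this typically shows OLS coefficients far from (1, 1) while the ridge pair stays close, mirroring the MSE advantage described in the abstract.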
Logistic equation of arbitrary order
NASA Astrophysics Data System (ADS)
Grabowski, Franciszek
2010-08-01
The paper is concerned with a new logistic equation of arbitrary order, which describes the performance of complex executive systems X vs. the number of tasks N, operating with limited resources K, under non-extensive, heterogeneous self-organization processes characterized by a parameter f. In contrast to the classical logistic equation, which relates exclusively to the special case of sub-extensive homogeneous self-organization processes at f=1, the proposed model covers both homogeneous and heterogeneous processes in the sub-extensive and super-extensive regimes. The parameter of arbitrary order f, where -∞
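For reference, the classical logistic equation that the paper generalizes, dX/dt = rX(1 - X/K), has a familiar closed-form solution, sketched below. This is standard textbook material, not the arbitrary-order model itself.

```python
import numpy as np

def logistic(t, x0, r, k):
    """Closed-form solution of the classical logistic equation
    dX/dt = r*X*(1 - X/K), starting from X(0) = x0."""
    return k / (1.0 + (k - x0) / x0 * np.exp(-r * t))

# Growth from x0 = 1 toward the carrying capacity K = 100.
t = np.linspace(0.0, 10.0, 101)
x = logistic(t, x0=1.0, r=1.0, k=100.0)
print(x[0], x[-1])   # starts at x0 and saturates toward K
```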
Do subfertile women adjust their habits when trying to conceive?
Joelsson, Lana Salih; Berglund, Anna; Wånggren, Kjell; Lood, Mikael; Rosenblad, Andreas; Tydén, Tanja
2016-01-01
Aim: The aim of this study was to investigate lifestyle habits and lifestyle adjustments among subfertile women trying to conceive. Materials and methods: Women (n = 747) were recruited consecutively at their first visit to fertility clinics in mid-Sweden. Participants completed a questionnaire. Data were analyzed using logistic regression, t tests, and chi-square tests. Results: The response rate was 62% (n = 466). Mean duration of infertility was 1.9 years. During this time 13.2% used tobacco daily, 13.6% drank more than three cups of coffee per day, and 11.6% consumed more than two glasses of alcohol weekly. In this sample, 23.9% of the women were overweight (body mass index, BMI 25–29.9 kg/m2), and 12.5% were obese (BMI ≥30 kg/m2). Obese women exercised more and changed to healthy diets more frequently than normal-weight women (odds ratio 7.43; 95% confidence interval 3.7–14.9). Six out of ten women (n = 266) took folic acid when they started trying to conceive, but 11% stopped taking folic acid after some time. Taking folic acid was associated with a higher level of education (p < 0.001). Conclusions: Among subfertile women, one-third were overweight or obese, and some had other lifestyle factors with known adverse effects on fertility such as use of tobacco. Overweight and obese women adjusted their habits but did not reduce their body mass index. Women of fertile age would benefit from preconception counseling, and the treatment of infertility should routinely offer interventions for lifestyle changes. PMID:27216564
Adjusting for confounding by neighborhood using complex survey data.
Brumback, Babette A; He, Zhulin
2011-04-30
Recently, we examined methods of adjusting for confounding by neighborhood of an individual exposure effect on a binary outcome, using complex survey data; the methods were found to fail when the neighborhood sample sizes are small and the selection bias is strongly informative. More recently, other authors have adapted an older method from the genetics literature for application to complex survey data; their adaptation achieves a consistent estimator under a broad range of circumstances. The method is based on weighted pseudolikelihoods, in which the contribution from each neighborhood involves all pairs of cases and controls in the neighborhood. The pairs are treated as if they were independent, a pairwise pseudo-conditional likelihood is thus derived, and then the corresponding score equation is weighted by the inverse probabilities of sampling each case-control pair. We have greatly simplified the implementation by translating the pairwise pseudo-conditional likelihood into an equivalent ordinary weighted log-likelihood formulation. We show how to program the method using standard software for ordinary logistic regression with complex survey data (e.g. SAS PROC SURVEYLOGISTIC). We also show that the methodology applies to a broader set of sampling scenarios than the ones considered by the previous authors. We demonstrate the validity of our simplified implementation by applying it to a simulation for which previous methods failed; the new method performs beautifully. We also apply the new method to an analysis of 2009 National Health Interview Survey (NHIS) public-use data, to estimate the effect of education on health insurance coverage, adjusting for confounding by neighborhood. PMID:21287588
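The core idea of weighting each likelihood contribution by its inverse probability of sampling can be sketched with an ordinary weighted logistic fit. This is a minimal illustration on simulated data; the paper's pairwise pseudo-conditional likelihood and the survey design structure are not reproduced here, and the variable names are hypothetical.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(1)

# Simulated units with unequal sampling probabilities, standing in for
# the survey-weighted pseudo-likelihood described above.
n = 1000
x = rng.normal(size=(n, 1))
p = 1 / (1 + np.exp(-(0.5 + 1.5 * x[:, 0])))   # true slope = 1.5
y = rng.binomial(1, p)
sampling_prob = rng.uniform(0.2, 1.0, size=n)  # prob. each unit was sampled
weights = 1.0 / sampling_prob                  # inverse-probability weights

# Weighted logistic regression: each observation's log-likelihood term is
# multiplied by its survey weight, as in SAS PROC SURVEYLOGISTIC.
# (Large C makes sklearn's default penalty effectively negligible.)
model = LogisticRegression(C=1e6).fit(x, y, sample_weight=weights)
print("slope estimate:", model.coef_[0][0])
```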
Regression analysis for solving diagnosis problem of children's health
NASA Astrophysics Data System (ADS)
Cherkashina, Yu A.; Gerget, O. M.
2016-04-01
The paper presents the results of research on applying statistical techniques, namely regression analysis, to assess the health status of children in the neonatal period based on medical data (hemostatic parameters, blood test parameters, gestational age, vascular endothelial growth factor) measured at 3-5 days of life. A detailed description of the studied medical data is given, and a binary logistic regression procedure is discussed. The main results are presented: a classification table of predicted versus observed values is shown, and the overall percentage of correct recognition is determined. The regression coefficients are calculated, and the general regression equation is written out from them. Based on the logistic regression results, ROC analysis was performed: the sensitivity and specificity of the model are calculated and ROC curves are constructed. These techniques allow children's health to be diagnosed with a high quality of recognition. The results make a significant contribution to the development of evidence-based medicine and are of high practical importance in the author's professional activity.
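The pipeline the paper describes, a binary logistic fit followed by a classification table and ROC analysis, can be sketched as follows. The data here are hypothetical stand-ins for the clinical predictors, not the study's dataset.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import confusion_matrix, roc_auc_score

rng = np.random.default_rng(42)

# Hypothetical stand-ins for the neonatal predictors (e.g. blood-test values).
n = 300
X = rng.normal(size=(n, 3))
logit = 1.0 * X[:, 0] - 0.8 * X[:, 1]
y = rng.binomial(1, 1 / (1 + np.exp(-logit)))   # 1 = at-risk, 0 = healthy

clf = LogisticRegression().fit(X, y)
pred = clf.predict(X)

# Classification table and overall percentage of correct recognition.
tn, fp, fn, tp = confusion_matrix(y, pred).ravel()
accuracy = (tp + tn) / n
sensitivity = tp / (tp + fn)      # true-positive rate
specificity = tn / (tn + fp)      # true-negative rate
auc = roc_auc_score(y, clf.predict_proba(X)[:, 1])
print(f"accuracy={accuracy:.2f} sensitivity={sensitivity:.2f} "
      f"specificity={specificity:.2f} AUC={auc:.2f}")
```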
Recursive Algorithm For Linear Regression
NASA Technical Reports Server (NTRS)
Varanasi, S. V.
1988-01-01
Order of model determined easily. Linear-regression algorithm includes recursive equations for coefficients of model of increased order. Algorithm eliminates duplicative calculations and facilitates search for minimum order of linear-regression model that fits set of data satisfactorily.
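The report's recursive coefficient equations are not reproduced here, but the underlying task, searching for the minimum polynomial order that fits the data satisfactorily, can be sketched with a straightforward (non-recursive) order search scored by BIC:

```python
import numpy as np

rng = np.random.default_rng(7)

# Noisy samples from a cubic; we look for the lowest-order polynomial
# that fits adequately, penalising extra coefficients via BIC.
x = np.linspace(-1, 1, 500)
y = 2 - x + 3 * x**3 + 0.05 * rng.normal(size=x.size)

n = x.size
best_order, best_bic = None, np.inf
for order in range(0, 8):
    coeffs = np.polyfit(x, y, order)
    rss = np.sum((np.polyval(coeffs, x) - y) ** 2)
    # BIC = n*log(RSS/n) + k*log(n), with k = order + 1 coefficients.
    bic = n * np.log(rss / n) + (order + 1) * np.log(n)
    if bic < best_bic:
        best_order, best_bic = order, bic
print("selected order:", best_order)
```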
Technology Transfer Automated Retrieval System (TEKTRAN)
Improvement of cold tolerance of winter wheat (Triticum aestivum L.) through breeding methods has been problematic. A better understanding of how individual wheat cultivars respond to components of the freezing process may provide new information that can be used to develop more cold tolerance culti...
Narumalani, S.; Jensen, J.R.; Althausen, J.D.; Burkhalter, S.; Mackey, H.E. Jr.
1994-06-01
Since aquatic macrophytes have an important influence on the physical and chemical processes of an ecosystem while simultaneously affecting human activity, it is imperative that they be inventoried and managed wisely. However, mapping wetlands can be a major challenge because they are found in diverse geographic areas ranging from small tributary streams, to shrub or scrub and marsh communities, to open water lacustrian environments. In addition, the type and spatial distribution of wetlands can change dramatically from season to season, especially when nonpersistent species are present. This research focuses on developing a model for predicting the future growth and distribution of aquatic macrophytes. The model will use a geographic information system (GIS) to analyze some of the biophysical variables that affect aquatic macrophyte growth and distribution, and the resulting data will provide scientists with information on the future spatial growth and distribution of aquatic macrophytes. This study focuses on the Savannah River Site's Par Pond (1,000 ha) and L Lake (400 ha), two cooling ponds that have received thermal effluent from nuclear reactor operations. Par Pond was constructed in 1958, and natural invasion of wetland species has occurred over its 35-year history, with much of the shoreline having developed extensive beds of persistent and non-persistent aquatic macrophytes.
ERIC Educational Resources Information Center
Miller, Thomas E.; Herreid, Charlene H.
2009-01-01
This is the fifth in a series of articles describing an attrition prediction and intervention project at the University of South Florida (USF) in Tampa. The project was originally presented in the 83(2) issue (Miller 2007). The statistical model for predicting attrition was described in the 83(3) issue (Miller and Herreid 2008). The methods and…
Eggen, S; Natvig, B
1991-02-01
The prevalence of torus mandibularis was assessed in two groups of dental patients, altogether 2010 individuals over 10 yr of age: 1181 individuals native to the Lofoten Islands in North Norway, situated at 68 degrees latitude; and 829 patients indigenous to Gudbrandsdal, an inland district in the southeastern part of the country at 61 degrees latitude. Both groups were assumed to be of the same Caucasian stock and, therefore, to have similar genetic predisposition to torus on average. The following observations were made: 1) the prevalence of torus mandibularis was much greater in Gudbrandsdal than in Lofoten (P much less than 0.001); 2) the prevalence decreased among persons above 50 yr of age as compared with those of the age classes 10-49 yr (P less than 0.01); and 3) it was smaller among women than men (P less than 0.05), mainly due to such a decrease in Lofoten. In a recent investigation of people living in Gudbrandsdal, the fraction of the variation of torus that was attributable to genetic differences was estimated at about 30%, whereas approximately 70% of the causes seemed to be ascribable to environmental influences in terms of occlusal stress. It is suggested that dietary habits and the number of existing teeth are environmental variables with an influence on the observed variation of torus prevalence between geographical regions, age classes, and sexes. The question of a possible sexual difference in the genetic component of liability to torus mandibularis was outside the scope of the present study. PMID:2019087
ERIC Educational Resources Information Center
Tang, Hui; Kirk, John; Pienta, Norbert J.
2014-01-01
This paper includes two experiments, one investigating complexity factors in stoichiometry word problems, and the other identifying students' problem-solving protocols by using eye-tracking technology. The word problems used in this study had five different complexity factors, which were randomly assigned by a Web-based tool that we…
NASA Astrophysics Data System (ADS)
Lu, Q.; Wang, X. L.
2009-04-01
Detecting changepoints in a sequence of continuous random variables has been extensively explored in both the statistics and climatology literature. There is little, however, on the case of multicategory random variables. For instance, the sky-cloudiness condition in Canada is reported in tenths of the sky dome and thus has 11 categories (from 0 for clear sky, to 10 tenths for overcast). This study develops an overall likelihood-ratio test statistic for detecting a sudden change in the parameters of the continuation-ratio logit random intercept model for a sequence of multinomial variables. A method of partitioning the overall test statistic is also proposed, which allows one to assess the significance of the effect of the detected change on individual categories. An application of this new technique to real sky-cloudiness data is also presented.
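The flavour of a likelihood-ratio changepoint test can be shown in a simplified two-category (Bernoulli) setting, rather than the paper's 11-category continuation-ratio logit model: profile the candidate changepoint and take the maximal likelihood-ratio statistic.

```python
import numpy as np

rng = np.random.default_rng(3)

# Bernoulli sequence with a shift in success probability at t = 120
# (a simplified two-category stand-in for the multicategory setting).
y = np.concatenate([rng.binomial(1, 0.3, 120), rng.binomial(1, 0.6, 80)])
n = y.size

def bernoulli_loglik(seg):
    """Maximised Bernoulli log-likelihood of a segment (MLE p = segment mean)."""
    p = seg.mean()
    if p in (0.0, 1.0):
        return 0.0
    return seg.sum() * np.log(p) + (seg.size - seg.sum()) * np.log(1 - p)

null_ll = bernoulli_loglik(y)   # single-regime (no-change) fit

# Profile the changepoint: maximise the split log-likelihood over times.
candidates = range(10, n - 10)
lrs = [2 * (bernoulli_loglik(y[:t]) + bernoulli_loglik(y[t:]) - null_ll)
       for t in candidates]
t_hat = candidates[np.argmax(lrs)]
print("estimated changepoint:", t_hat, "max LR statistic:", round(max(lrs), 1))
```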
ERIC Educational Resources Information Center
Liu, Xing
2008-01-01
The proportional odds (PO) model, which is also called the cumulative odds model (Agresti, 1996, 2002; Armstrong & Sloan, 1989; Long, 1997; Long & Freese, 2006; McCullagh, 1980; McCullagh & Nelder, 1989; Powers & Xie, 2000; O'Connell, 2006), is one of the most commonly used models for the analysis of ordinal categorical data and comes from the class…
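The defining feature of the PO model, a single slope shared across all cumulative logits, can be sketched directly; the parameter values below are purely illustrative.

```python
import numpy as np

def prop_odds_probs(x, beta, cutpoints):
    """Category probabilities under a proportional-odds (cumulative logit)
    model: logit P(Y <= j | x) = cutpoints[j] - beta * x, with one shared
    slope beta (the 'proportional' part)."""
    cum = 1 / (1 + np.exp(-(np.asarray(cutpoints) - beta * x)))  # P(Y <= j)
    cum = np.append(cum, 1.0)              # last category closes at 1
    return np.diff(cum, prepend=0.0)       # per-category probabilities

# Three ordered categories require two cutpoints; every cumulative logit
# shifts by the same amount beta * x as the covariate changes.
probs = prop_odds_probs(x=1.0, beta=0.8, cutpoints=[-0.5, 1.0])
print(probs, probs.sum())
```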
AP® Potential Predicted by PSAT/NMSQT® Scores Using Logistic Regression. Statistical Report 2014-1
ERIC Educational Resources Information Center
Zhang, Xiuyuan; Patel, Priyank; Ewing, Maureen
2014-01-01
AP Potential™ is an educational guidance tool that uses PSAT/NMSQT® scores to identify students who have the potential to do well on one or more Advanced Placement® (AP®) Exams. Students identified as having AP potential, perhaps students who would not have been otherwise identified, should consider enrolling in the corresponding AP course if they…
NASA Astrophysics Data System (ADS)
Maas, A.; Rottensteiner, F.; Heipke, C.
2016-06-01
Supervised classification of remotely sensed images is a classical method to update topographic geospatial databases. The task requires training data in the form of image data with known class labels, whose generation is time-consuming. To avoid this problem one can use the labels from the outdated database for training. As some of these labels may be wrong due to changes in land cover, one has to use training techniques that can cope with wrong class labels in the training data. In this paper we adapt a label noise tolerant training technique to the problem of database updating. No labelled data other than the existing database are necessary. The resulting label image and transition matrix between the labels can help to update the database and to detect changes between the two time epochs. Our experiments are based on different test areas, using real images with simulated existing databases. Our results show that this method can indeed detect changes that would remain undetected if label noise were not considered in training.
ERIC Educational Resources Information Center
Del Prette, Zilda Aparecida Pereira; Prette, Almir Del; De Oliveira, Lael Almeida; Gresham, Frank M.; Vance, Michael J.
2012-01-01
Social skills are specific behaviors that individuals exhibit in order to successfully complete social tasks whereas social competence represents judgments by significant others that these social tasks have been successfully accomplished. The present investigation identified the best sociobehavioral predictors obtained from different raters…
Logistic regression analysis to predict weaning-to-estrous interval in first-litter gilts
Technology Transfer Automated Retrieval System (TEKTRAN)
Delayed return to estrus after weaning is a significant problem for swine producers. In this study, we investigated the relationships between weaning-to-estrous interval (WEI) and body weight (BW), back fat (BF), plasma leptin (L), glucose (G), albumin (A), urea nitrogen (PUN) concentrations and lit...
ADJUSTABLE DOUBLE PULSE GENERATOR
Gratian, J.W.; Gratian, A.C.
1961-08-01
A modulator pulse source having adjustable pulse width and adjustable pulse spacing is described. The generator consists of a cross coupled multivibrator having adjustable time constant circuitry in each leg, an adjustable differentiating circuit in the output of each leg, a mixing and rectifying circuit for combining the differentiated pulses and generating in its output a resultant sequence of negative pulses, and a final amplifying circuit for inverting and square-topping the pulses. (AEC)
Adjustable sutures in children.
Engel, J Mark; Guyton, David L; Hunter, David G
2014-06-01
Although adjustable sutures are considered a standard technique in adult strabismus surgery, most surgeons are hesitant to attempt the technique in children, who are believed to be unlikely to cooperate for postoperative assessment and adjustment. Interest in using adjustable sutures in pediatric patients has increased with the development of surgical techniques specific to infants and children. This workshop briefly reviews the literature supporting the use of adjustable sutures in children and presents the approaches currently used by three experienced strabismus surgeons. PMID:24924284
Accounting for Slipping and Other False Negatives in Logistic Models of Student Learning
ERIC Educational Resources Information Center
MacLellan, Christopher J.; Liu, Ran; Koedinger, Kenneth R.
2015-01-01
Additive Factors Model (AFM) and Performance Factors Analysis (PFA) are two popular models of student learning that employ logistic regression to estimate parameters and predict performance. This is in contrast to Bayesian Knowledge Tracing (BKT) which uses a Hidden Markov Model formalism. While all three models tend to make similar predictions,…
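The logistic form shared by AFM-style models can be sketched from its standard statement: the success logit sums a student proficiency with, for each skill the item exercises, a difficulty term plus a learning rate times practice opportunities. The parameter values below are hypothetical; real AFM parameters are of course estimated from data.

```python
import numpy as np

def afm_prob(theta, q, beta, gamma, opportunities):
    """Additive Factors Model success probability:
    logit P = theta + sum over skills of q_k * (beta_k + gamma_k * T_k),
    where q is the item's q-matrix row and T_k counts practice so far."""
    logit = theta + np.sum(q * (beta + gamma * opportunities))
    return 1 / (1 + np.exp(-logit))

# Hypothetical numbers: one student, an item tapping 2 of 3 skills.
q = np.array([1, 1, 0])              # q-matrix row for the item
beta = np.array([-0.4, 0.2, 0.0])    # skill easiness
gamma = np.array([0.15, 0.1, 0.0])   # learning rates
for t in range(4):                   # practice raises predicted success
    print(t, round(afm_prob(theta=0.1, q=q, beta=beta, gamma=gamma,
                            opportunities=np.array([t, t, 0])), 3))
```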
Optimal distributions for multiplex logistic networks.
Solá Conde, Luis E; Used, Javier; Romance, Miguel
2016-06-01
This paper presents some mathematical models for distribution of goods in logistic networks based on spectral analysis of complex networks. Given a steady distribution of a finished product, some numerical algorithms are presented for computing the weights in a multiplex logistic network that reach the equilibrium dynamics with high convergence rate. As an application, the logistic networks of Germany and Spain are analyzed in terms of their convergence rates. PMID:27368801
Integrated Computer System of Management in Logistics
NASA Astrophysics Data System (ADS)
Chwesiuk, Krzysztof
2011-06-01
This paper aims at presenting a concept of an integrated computer system of management in logistics, particularly in supply and distribution chains. Consequently, the paper covers the basic idea of computer-based management in logistics and the components of the system, such as CAM and CIM systems in production processes, and management systems for storage, materials flow, and for transport, forwarding and logistics companies. The platform that integrates these computer-aided management systems is electronic data interchange.
Optimal distributions for multiplex logistic networks
NASA Astrophysics Data System (ADS)
Solá Conde, Luis E.; Used, Javier; Romance, Miguel
2016-06-01
This paper presents some mathematical models for distribution of goods in logistic networks based on spectral analysis of complex networks. Given a steady distribution of a finished product, some numerical algorithms are presented for computing the weights in a multiplex logistic network that reach the equilibrium dynamics with high convergence rate. As an application, the logistic networks of Germany and Spain are analyzed in terms of their convergence rates.
Bayesian Spatial Quantile Regression
Reich, Brian J.; Fuentes, Montserrat; Dunson, David B.
2013-01-01
Tropospheric ozone is one of the six criteria pollutants regulated by the United States Environmental Protection Agency under the Clean Air Act and has been linked with several adverse health effects, including mortality. Due to the strong dependence on weather conditions, ozone may be sensitive to climate change and there is great interest in studying the potential effect of climate change on ozone, and how this change may affect public health. In this paper we develop a Bayesian spatial model to predict ozone under different meteorological conditions, and use this model to study spatial and temporal trends and to forecast ozone concentrations under different climate scenarios. We develop a spatial quantile regression model that does not assume normality and allows the covariates to affect the entire conditional distribution, rather than just the mean. The conditional distribution is allowed to vary from site-to-site and is smoothed with a spatial prior. For extremely large datasets our model is computationally infeasible, and we develop an approximate method. We apply the approximate version of our model to summer ozone from 1997–2005 in the Eastern U.S., and use deterministic climate models to project ozone under future climate conditions. Our analysis suggests that holding all other factors fixed, an increase in daily average temperature will lead to the largest increase in ozone in the Industrial Midwest and Northeast. PMID:23459794
Bayesian Spatial Quantile Regression.
Reich, Brian J; Fuentes, Montserrat; Dunson, David B
2011-03-01
Tropospheric ozone is one of the six criteria pollutants regulated by the United States Environmental Protection Agency under the Clean Air Act and has been linked with several adverse health effects, including mortality. Due to the strong dependence on weather conditions, ozone may be sensitive to climate change and there is great interest in studying the potential effect of climate change on ozone, and how this change may affect public health. In this paper we develop a Bayesian spatial model to predict ozone under different meteorological conditions, and use this model to study spatial and temporal trends and to forecast ozone concentrations under different climate scenarios. We develop a spatial quantile regression model that does not assume normality and allows the covariates to affect the entire conditional distribution, rather than just the mean. The conditional distribution is allowed to vary from site-to-site and is smoothed with a spatial prior. For extremely large datasets our model is computationally infeasible, and we develop an approximate method. We apply the approximate version of our model to summer ozone from 1997-2005 in the Eastern U.S., and use deterministic climate models to project ozone under future climate conditions. Our analysis suggests that holding all other factors fixed, an increase in daily average temperature will lead to the largest increase in ozone in the Industrial Midwest and Northeast. PMID:23459794
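The mechanics of quantile regression, minimising the check (pinball) loss so that covariates can shift the upper quantiles differently from the median, can be sketched on heteroscedastic synthetic data. This is a plain non-spatial, non-Bayesian illustration of the underlying estimator, not the paper's model.

```python
import numpy as np
from scipy.optimize import minimize

rng = np.random.default_rng(5)

# Heteroscedastic data: the noise scale grows with x, so the upper
# quantiles have a steeper slope than the mean, which OLS cannot capture.
n = 500
x = rng.uniform(0, 1, n)
y = 1.0 + 2.0 * x + (0.2 + 1.0 * x) * rng.normal(size=n)

def pinball_loss(params, tau):
    """Check loss for the tau-th conditional quantile of a linear model."""
    a, b = params
    r = y - (a + b * x)
    return np.mean(np.where(r >= 0, tau * r, (tau - 1) * r))

# Fit the 0.5 and 0.9 conditional quantiles by minimising the check loss
# (Nelder-Mead, since the loss is piecewise linear and non-smooth).
q50 = minimize(pinball_loss, x0=[0.0, 0.0], args=(0.5,),
               method="Nelder-Mead").x
q90 = minimize(pinball_loss, x0=[0.0, 0.0], args=(0.9,),
               method="Nelder-Mead").x
print("median slope:", round(q50[1], 2),
      " 0.9-quantile slope:", round(q90[1], 2))
```

The fitted 0.9-quantile slope comes out steeper than the median slope, which is the "covariates affect the entire conditional distribution" behaviour the abstract describes.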
Luo, Chongliang; Liu, Jin; Dey, Dipak K; Chen, Kun
2016-07-01
In many fields, multi-view datasets, measuring multiple distinct but interrelated sets of characteristics on the same set of subjects, together with data on certain outcomes or phenotypes, are routinely collected. The objective in such a problem is often two-fold: both to explore the association structures of multiple sets of measurements and to develop a parsimonious model for predicting the future outcomes. We study a unified canonical variate regression framework to tackle the two problems simultaneously. The proposed criterion integrates multiple canonical correlation analysis with predictive modeling, balancing between the association strength of the canonical variates and their joint predictive power on the outcomes. Moreover, the proposed criterion seeks multiple sets of canonical variates simultaneously to enable the examination of their joint effects on the outcomes, and is able to handle multivariate and non-Gaussian outcomes. An efficient algorithm based on variable splitting and Lagrangian multipliers is proposed. Simulation studies show the superior performance of the proposed approach. We demonstrate the effectiveness of the proposed approach in an [Formula: see text] intercross mice study and an alcohol dependence study. PMID:26861909
Logistics Reduction Technologies for Exploration Missions
NASA Technical Reports Server (NTRS)
Broyan, James L., Jr.; Ewert, Michael K.; Fink, Patrick W.
2014-01-01
Human exploration missions under study are limited by the launch mass capacity of existing and planned launch vehicles. The logistical mass of crew items is typically considered separate from the vehicle structure, habitat outfitting, and life support systems. Although mass is typically the focus of exploration missions, due to its strong impact on launch vehicle and habitable volume for the crew, logistics volume also needs to be considered. NASA's Advanced Exploration Systems (AES) Logistics Reduction and Repurposing (LRR) Project is developing six logistics technologies guided by a systems engineering cradle-to-grave approach to enable after-use crew items to augment vehicle systems. Specifically, AES LRR is investigating the direct reduction of clothing mass, the repurposing of logistical packaging, the use of autonomous logistics management technologies, the processing of spent crew items to benefit radiation shielding and water recovery, and the conversion of trash to propulsion gases. Reduction of mass has a corresponding and significant impact on logistical volume. The reduction of logistical volume can reduce the overall pressurized vehicle mass directly, or indirectly benefit the mission by allowing for an increase in habitable volume during the mission. The systematic implementation of these types of technologies will increase launch mass efficiency by enabling items to be used for secondary purposes and improve the habitability of the vehicle as mission durations increase. Early studies have shown that the use of advanced logistics technologies can save approximately 20 m³ of volume during transit alone for a six-person Mars conjunction class mission.
Analysis of Jingdong Mall Logistics Distribution Model
NASA Astrophysics Data System (ADS)
Shao, Kang; Cheng, Feng
In recent years, the development of electronic commerce in China has accelerated. The role of logistics has been highlighted, and more and more e-commerce enterprises are beginning to realize the importance of logistics to the success or failure of the enterprise. In this paper, the author takes Jingdong Mall as an example, performs a SWOT analysis of its current self-built logistics system, identifies the problems in Jingdong Mall's current logistics distribution, and gives appropriate recommendations.
Linear regression in astronomy. I
NASA Technical Reports Server (NTRS)
Isobe, Takashi; Feigelson, Eric D.; Akritas, Michael G.; Babu, Gutti Jogesh
1990-01-01
Five methods for obtaining linear regression fits to bivariate data with unknown or insignificant measurement errors are discussed: ordinary least-squares (OLS) regression of Y on X, OLS regression of X on Y, the bisector of the two OLS lines, orthogonal regression, and 'reduced major-axis' regression. These methods have been used by various researchers in observational astronomy, most importantly in cosmic distance scale applications. Formulas for calculating the slope and intercept coefficients and their uncertainties are given for all the methods, including a new general form of the OLS variance estimates. The accuracy of the formulas was confirmed using numerical simulations. The applicability of the procedures is discussed with respect to their mathematical properties, the nature of the astronomical data under consideration, and the scientific purpose of the regression. It is found that, for problems needing symmetrical treatment of the variables, the OLS bisector performs significantly better than orthogonal or reduced major-axis regression.
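The symmetric treatment provided by the OLS bisector can be sketched numerically on synthetic data with scatter in both coordinates; the bisector slope formula follows Isobe et al. (1990).

```python
import numpy as np

rng = np.random.default_rng(11)

# Bivariate data with comparable scatter in both coordinates,
# as in cosmic distance scale applications (true symmetric slope = 1).
n = 400
t = rng.normal(size=n)
x = t + 0.3 * rng.normal(size=n)
y = t + 0.3 * rng.normal(size=n)

xm, ym = x - x.mean(), y - y.mean()
b1 = np.sum(xm * ym) / np.sum(xm * xm)   # OLS(Y|X) slope: biased low
b2 = np.sum(ym * ym) / np.sum(xm * ym)   # OLS(X|Y) as a Y-vs-X slope: biased high
# OLS bisector: slope of the line bisecting the angle between the two
# OLS lines (Isobe et al. 1990), symmetric in the two variables.
b3 = (b1 * b2 - 1 + np.sqrt((1 + b1**2) * (1 + b2**2))) / (b1 + b2)
print(f"OLS(Y|X)={b1:.3f}  OLS(X|Y)={b2:.3f}  bisector={b3:.3f}")
```

The two ordinary OLS slopes bracket the true value from below and above, while the bisector lands between them, illustrating why it suits problems needing symmetrical treatment of the variables.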
Psychological Adjustment in Young Korean American Adolescents and Parental Warmth
Kim, Eunjung
2008-01-01
Problem: The relation between parental warmth and psychological adjustment is not known for young Korean American adolescents. Methods: 103 adolescents' perceived parental warmth and psychological adjustment were assessed using, respectively, the Parental Acceptance-Rejection Questionnaire and the Child Personality Assessment Questionnaire. Findings: Low perceived maternal and paternal warmth were positively related to adolescents' overall poor psychological adjustment and almost all of its attributes. When maternal and paternal warmth were entered simultaneously into the regression equation, only low maternal warmth was related to adolescents' poor psychological adjustment. Conclusion: Perceived parental warmth is important in predicting young adolescents' psychological adjustment as suggested in the parental acceptance-rejection theory. PMID:19885379
FUTURE LOGISTICS AND OPERATIONAL ADAPTABILITY
Houck, Roger P.
2009-10-01
While we cannot predict the future, we can ascertain trends and examine them through the use of alternative futures methodologies and tools. From a logistics perspective, we know that many different futures are possible, all of which are obviously dependent on decisions we make in the present. As professional logisticians we are obligated to provide the field - our Soldiers - with our best professional opinion of what will result in success on the battlefield. Our view of the future should take history and contemporary conflict into account, but it must also consider that continuity with the past cannot be taken for granted. If we are too focused on past and current experience, then our vision of the future will be limited indeed. On the one hand, the future must be explained in language that does not defy common sense. On the other hand, the pace of change is such that we must conduct qualitative and quantitative trend analyses, forecasting, and explorative scenario development in ways that allow for significant breaks - or "shocks" - that may "change the game". We will need capabilities and solutions that are constantly evolving - and improving - to match the operational tempo of a radically changing threat environment. For those who provide quartermaster services, this article will briefly examine what this means from the perspective of creating what might be termed a preferred future.
Biomass supply logistics and infrastructure.
Sokhansanj, Shahabaddine; Hess, J Richard
2009-01-01
Feedstock supply system encompasses numerous unit operations necessary to move lignocellulosic feedstock from the place where it is produced (in the field or on the stump) to the start of the conversion process (reactor throat) of the biorefinery. These unit operations, which include collection, storage, preprocessing, handling, and transportation, represent one of the largest technical and logistics challenges to the emerging lignocellulosic biorefining industry. This chapter briefly reviews the methods of estimating the quantities of biomass, followed by harvesting and collection processes based on current practices on handling wet and dry forage materials. Storage and queuing are used to deal with seasonal harvest times, variable yields, and delivery schedules. Preprocessing can be as simple as grinding and formatting the biomass for increased bulk density or improved conversion efficiency, or it can be as complex as improving feedstock quality through fractionation, tissue separation, drying, blending, and densification. Handling and transportation consists of using a variety of transport equipment (truck, train, ship) for moving the biomass from one point to another. The chapter also provides typical cost figures for harvest and processing of biomass. PMID:19768612
Biomass Supply Logistics and Infrastructure
Sokhansanj, Shahabaddine
2009-04-01
Feedstock supply system encompasses numerous unit operations necessary to move lignocellulosic feedstock from the place where it is produced (in the field or on the stump) to the start of the conversion process (reactor throat) of the biorefinery. These unit operations, which include collection, storage, preprocessing, handling, and transportation, represent one of the largest technical and logistics challenges to the emerging lignocellulosic biorefining industry. This chapter briefly reviews methods of estimating the quantities of biomass, followed by harvesting and collection processes based on current practices on handling wet and dry forage materials. Storage and queuing are used to deal with seasonal harvest times, variable yields, and delivery schedules. Preprocessing can be as simple as grinding and formatting the biomass for increased bulk density or improved conversion efficiency, or it can be as complex as improving feedstock quality through fractionation, tissue separation, drying, blending, and densification. Handling and transportation consists of using a variety of transport equipment (truck, train, ship) for moving the biomass from one point to another. The chapter also provides typical cost figures for harvest and processing of biomass.
Biomass Supply Logistics and Infrastructure
NASA Astrophysics Data System (ADS)
Sokhansanj, Shahabaddine; Hess, J. Richard
Risk-adjusted monitoring of survival times
Sego, Landon H.; Reynolds, Marion R.; Woodall, William H.
2009-02-26
We consider the monitoring of clinical outcomes, where each patient has a different risk of death prior to undergoing a health care procedure. We propose a risk-adjusted survival time CUSUM chart (RAST CUSUM) for monitoring clinical outcomes where the primary endpoint is a continuous, time-to-event variable that may be right censored. Risk adjustment is accomplished using accelerated failure time regression models. We compare the average run length performance of the RAST CUSUM chart to the risk-adjusted Bernoulli CUSUM chart, using data from cardiac surgeries to motivate the details of the comparison. The comparisons show that the RAST CUSUM chart is more efficient at detecting a sudden decrease in the odds of death than the risk-adjusted Bernoulli CUSUM chart, especially when the fraction of censored observations is not too high. We also discuss the implementation of a prospective monitoring scheme using the RAST CUSUM chart.
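The risk-adjusted CUSUM idea can be sketched in a few lines. The sketch below uses Steiner-style risk-adjusted Bernoulli weights rather than the authors' survival-time (RAST) formulation, and the odds ratio, risks, and threshold are illustrative assumptions only:

```python
import math

def cusum_weight(y, p, odds_ratio=2.0):
    # Log-likelihood-ratio score for patient outcome y (1 = death) with
    # predicted risk p, testing a shift in the odds of death by odds_ratio.
    return y * math.log(odds_ratio) - math.log(1.0 - p + odds_ratio * p)

def ra_cusum(outcomes, risks, odds_ratio=2.0, threshold=3.0):
    """Return the CUSUM path and the index of the first signal (or None)."""
    s, path, signal = 0.0, [], None
    for t, (y, p) in enumerate(zip(outcomes, risks)):
        s = max(0.0, s + cusum_weight(y, p, odds_ratio))
        path.append(s)
        if signal is None and s > threshold:
            signal = t
    return path, signal

# Low-risk patients who unexpectedly die push the statistic up quickly.
outcomes = [0, 0, 1, 1, 1, 1, 1, 1]
risks = [0.05] * 8
path, signal = ra_cusum(outcomes, risks)
```

Each death adds roughly log(2) minus a small risk-dependent term, while the chart floors at zero, so a run of expected outcomes does not mask a later shift.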
Evaluating differential effects using regression interactions and regression mixture models
Van Horn, M. Lee; Jaki, Thomas; Masyn, Katherine; Howe, George; Feaster, Daniel J.; Lamont, Andrea E.; George, Melissa R. W.; Kim, Minjung
2015-01-01
Research increasingly emphasizes understanding differential effects. This paper focuses on understanding regression mixture models, relatively new statistical methods for assessing differential effects, by comparing their results to those of linear regression with an interaction term. The research questions that each model answers, their formulation, and their assumptions are compared using Monte Carlo simulations and real data analysis. The capabilities of regression mixture models are described and specific issues to be addressed when conducting regression mixtures are proposed. The paper aims to clarify the role that regression mixtures can take in the estimation of differential effects and to increase awareness of the benefits and potential pitfalls of this approach. Regression mixture models are shown to be a potentially effective exploratory method for finding differential effects when these effects can be defined by a small number of classes of respondents who share a typical relationship between a predictor and an outcome. It is also shown that the comparison between regression mixture models and interactions becomes substantially more complex as the number of classes increases. It is argued that regression interactions are well suited for direct tests of specific hypotheses about differential effects and that regression mixtures provide a useful approach for exploring effect heterogeneity given adequate samples and study design. PMID:26556903
Visualizing the Logistic Map with a Microcontroller
ERIC Educational Resources Information Center
Serna, Juan D.; Joshi, Amitabh
2012-01-01
The logistic map is one of the simplest nonlinear dynamical systems that clearly exhibits the route to chaos. In this paper, we explore the evolution of the logistic map using an open-source microcontroller connected to an array of light-emitting diodes (LEDs). We divide the one-dimensional domain interval [0,1] into ten equal parts, and associate…
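The setup described can be simulated without hardware: iterate the logistic map and convert each state into the index of the LED that would light. The map form is the standard one; the parameter values below are illustrative only.

```python
def logistic_map(r, x0, n):
    """Iterate x_{n+1} = r * x_n * (1 - x_n) and return the orbit."""
    x, orbit = x0, []
    for _ in range(n):
        x = r * x * (1.0 - x)
        orbit.append(x)
    return orbit

def led_index(x, bins=10):
    """Map x in [0, 1] to one of `bins` equal sub-intervals (the LED to light)."""
    return min(int(x * bins), bins - 1)

# r = 2.5: the orbit settles on the fixed point 1 - 1/r = 0.6.
settled = logistic_map(2.5, 0.2, 200)[-1]
# r = 3.9: chaotic regime; the orbit wanders across most of the LED array.
visited = {led_index(x) for x in logistic_map(3.9, 0.2, 2000)}
```

In the periodic regime only one or two LEDs stay lit; past the period-doubling cascade nearly the whole array is visited, which is exactly what makes the route to chaos visible on the device.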
Exploration Mission Benefits From Logistics Reduction Technologies
NASA Technical Reports Server (NTRS)
Broyan, James Lee, Jr.; Ewert, Michael K.; Schlesinger, Thilini
2016-01-01
Technologies that reduce logistical mass, volume, and the crew time dedicated to logistics management become more important as exploration missions extend further from the Earth. Even modest reductions in logistical mass can have a significant impact because it also reduces the packaging burden. NASA's Advanced Exploration Systems' Logistics Reduction Project is developing technologies that can directly reduce the mass and volume of crew clothing and metabolic waste collection. Also, cargo bags have been developed that can be reconfigured for crew outfitting, and trash processing technologies are under development to increase habitable volume and improve protection against solar storm events. Additionally, Mars class missions are sufficiently distant that even logistics management without resupply can be problematic due to the communication time delay with Earth. Although exploration vehicles are launched with all consumables and logistics in a defined configuration, the configuration continually changes as the mission progresses. Traditionally, significant ground and crew time has been required to understand the evolving configuration and to help locate misplaced items. For key mission events and unplanned contingencies, the crew will not be able to rely on the ground for logistics localization assistance. NASA has been developing a radio-frequency-identification autonomous logistics management system to reduce crew time for general inventory and enable greater crew self-response to unplanned events when a wide range of items may need to be located in a very short time period. This paper provides a status of the technologies being developed and their mission benefits for exploration missions.
2015-01-01
Background Over the past 50,000 years, shifts in human-environmental or human-human interactions shaped genetic differences within and among human populations, including variants under positive selection. Shaped by environmental factors, such variants influence the genetics of modern health, disease, and treatment outcome. Because evolutionary processes tend to act on gene regulation, we test whether regulatory variants are under positive selection. We introduce a new approach to enhance detection of genetic markers undergoing positive selection, using conditional entropy to capture recent local selection signals. Results We use conditional logistic regression to compare our Adjusted Haplotype Conditional Entropy (H|H) measure of positive selection to existing positive selection measures. H|H and existing measures were applied to published regulatory variants acting in cis (cis-eQTLs), with conditional logistic regression testing whether regulatory variants undergo stronger positive selection than the surrounding gene. These cis-eQTLs were drawn from six independent studies of genotype and RNA expression. The conditional logistic regression shows that, overall, H|H is substantially more powerful than existing positive-selection methods in identifying cis-eQTLs against other Single Nucleotide Polymorphisms (SNPs) in the same genes. When broken down by Gene Ontology, H|H predictions are particularly strong in some biological process categories, where regulatory variants are under strong positive selection compared to the bulk of the gene, distinct from those GO categories under overall positive selection. However, cis-eQTLs in a second group of genes lack positive selection signatures detectable by H|H, consistent with ancient short haplotypes compared to the surrounding gene (for example, in innate immunity GO:0042742); under such other modes of selection, H|H would not be expected to be a strong predictor. These conditional logistic regression models are
On some generalized discrete logistic maps
Radwan, Ahmed G.
2012-01-01
Recently, conventional logistic maps have been used in different vital applications like modeling and security. Unfortunately, however, the conventional logistic map admits only one changeable parameter. In this paper, three different generalized logistic maps are introduced with arbitrary powers, each of which can be reduced to the conventional logistic map. The added parameter (arbitrary power) increases the degree of freedom of each map and gives a versatile response that can fit many applications. Therefore, the conventional logistic map is only a special case of each proposed map. This new parameter increases the flexibility of the system and illustrates the performance of the conventional system within any required neighborhood. Many cases are illustrated showing the effect of the arbitrary power and the equation parameter on the number of equilibrium points, their locations, stability conditions, and bifurcation diagrams up to the chaotic behavior. PMID:25685414
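The abstract does not give the exact forms of the three generalized maps, so the sketch below uses one plausible arbitrary-power generalization, x_{n+1} = r·x^α(1 − x^α), purely as an illustration of the idea; it reduces to the conventional map at α = 1.

```python
def gen_logistic(r, alpha, x0, n):
    # Hypothetical generalized form, for illustration only:
    # x_{n+1} = r * x^alpha * (1 - x^alpha).
    # alpha = 1 recovers the conventional logistic map.
    x = x0
    for _ in range(n):
        x = r * x**alpha * (1.0 - x**alpha)
    return x

# alpha = 1 recovers the conventional fixed point 1 - 1/r for r = 2.8.
conventional = gen_logistic(2.8, 1.0, 0.3, 500)
# Raising alpha moves the attractor even though r is unchanged,
# which is the extra degree of freedom the abstract describes.
generalized = gen_logistic(2.8, 1.2, 0.3, 500)
```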
Comparison of logistic equations for population growth.
Jensen, A L
1975-12-01
Two different forms of the logistic equation for population growth appear in the ecological literature. In the form of the logistic equation that appears in recent ecology textbooks the parameters are the instantaneous rate of natural increase per individual and the carrying capacity of the environment. In the form of the logistic equation that appears in some older literature the parameters are the instantaneous birth rate per individual and the carrying capacity. The decision whether to use one form or the other depends on which form of the equation is biologically more realistic. In this study the form of the logistic equation in which the instantaneous birth rate per individual is a parameter is shown to be more realistic in terms of the birth and death processes of population growth. Application of the logistic equation to calculate yield from an exploited fish population also shows that the parameters must be the instantaneous birth rate per individual and the carrying capacity. PMID:1203427
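For reference, the textbook parameterization discussed above can be integrated numerically in a few lines; the parameter values are arbitrary, and the alternative birth-rate parameterization differs in how the growth coefficient is interpreted rather than in the code structure:

```python
def logistic_growth(r, K, N0, t_end, dt=0.001):
    """Euler-integrate dN/dt = r * N * (1 - N / K) (textbook form).

    r: instantaneous rate of increase per individual; K: carrying capacity.
    """
    N, t, traj = N0, 0.0, []
    while t < t_end:
        N += dt * r * N * (1.0 - N / K)
        t += dt
        traj.append(N)
    return traj

# Sigmoid growth from N0 = 10 toward the carrying capacity K = 1000.
traj = logistic_growth(r=0.5, K=1000.0, N0=10.0, t_end=40.0)
```

The trajectory rises monotonically and saturates at K, matching the closed-form solution N(t) = K / (1 + ((K − N0)/N0) e^(−rt)) to within the Euler step error.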
Linear regression in astronomy. II
NASA Technical Reports Server (NTRS)
Feigelson, Eric D.; Babu, Gutti J.
1992-01-01
A wide variety of least-squares linear regression procedures used in observational astronomy, particularly investigations of the cosmic distance scale, are presented and discussed. The classes of linear models considered are (1) unweighted regression lines, with bootstrap and jackknife resampling; (2) regression solutions when measurement error, in one or both variables, dominates the scatter; (3) methods to apply a calibration line to new data; (4) truncated regression models, which apply to flux-limited data sets; and (5) censored regression models, which apply when nondetections are present. For the calibration problem we develop two new procedures: a formula for the intercept offset between two parallel data sets, which propagates slope errors from one regression to the other; and a generalization of the Working-Hotelling confidence bands to nonstandard least-squares lines. They can provide improved error analysis for Faber-Jackson, Tully-Fisher, and similar cosmic distance scale relations.
Quantile regression for climate data
NASA Astrophysics Data System (ADS)
Marasinghe, Dilhani Shalika
Quantile regression is a developing statistical tool used to explain the relationship between response and predictor variables. This thesis describes two applications of quantile regression to climatology. Our main goal is to estimate derivatives of a conditional mean and/or conditional quantile function. We introduce a method to handle autocorrelation in the framework of quantile regression and apply it to temperature data. We also examine some properties of tornado data, which are non-normally distributed. Even though quantile regression provides a more comprehensive view, when the normality and constant-variance assumptions on the residuals hold, we would prefer least squares regression for our temperature analysis. When the normality and constant-variance assumptions fail, quantile regression is the better candidate for estimating the derivative.
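The building block behind quantile regression is the check (pinball) loss: minimizing it over a constant yields the empirical quantile, which quantile regression then generalizes to a linear predictor. A minimal sketch with invented data:

```python
def pinball_loss(residual, tau):
    """Check (pinball) loss: tau*u for u >= 0, (tau - 1)*u for u < 0."""
    return residual * tau if residual >= 0 else residual * (tau - 1.0)

def fit_constant_quantile(y, tau, grid_steps=2001):
    """Grid-search the constant c minimizing the total pinball loss.

    The minimizer is the empirical tau-quantile -- the scalar case that
    quantile regression extends to c = x'beta.
    """
    lo, hi = min(y), max(y)
    best_c, best_loss = lo, float("inf")
    for i in range(grid_steps):
        c = lo + (hi - lo) * i / (grid_steps - 1)
        loss = sum(pinball_loss(v - c, tau) for v in y)
        if loss < best_loss:
            best_c, best_loss = c, loss
    return best_c

y = [1.0, 2.0, 3.0, 4.0, 100.0]  # heavy right tail, as with tornado-like data
median_fit = fit_constant_quantile(y, tau=0.5)
```

Unlike the mean (which the outlier drags to 22), the tau = 0.5 fit stays at the median, which is why quantile methods suit non-normal data such as the tornado series.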
Improvement in fresh fruit and vegetable logistics quality: berry logistics field studies.
do Nascimento Nunes, M Cecilia; Nicometo, Mike; Emond, Jean Pierre; Melis, Ricardo Badia; Uysal, Ismail
2014-06-13
Shelf life of fresh fruits and vegetables is greatly influenced by environmental conditions. Increasing temperature usually results in accelerated loss of quality and shelf-life reduction, which is not physically visible until too late in the supply chain to adjust logistics to match shelf life. A blackberry study showed that temperatures inside pallets varied significantly and 57% of the berries arriving at the packinghouse did not have enough remaining shelf life for the longest supply routes. Yet, the advanced shelf-life loss was not physically visible. Some of those pallets would be sent on longer supply routes than necessary, creating avoidable waste. Other studies showed that variable pre-cooling at the centre of pallets resulted in physically invisible uneven shelf life. We have shown that, with simple temperature measurements, much waste can be avoided by applying 'first expiring, first out' logistics. Results from our studies showed that shelf-life prediction should not be based on a single quality factor as, depending on the temperature history, the quality attribute that limits shelf life may vary. Finally, methods to use air temperature to predict product temperature for highest shelf-life prediction accuracy in the absence of individual sensors for each monitored product have been developed. Our results show a significant reduction of up to 98% in the root-mean-square-error difference between the product temperature and air temperature when advanced estimation methods are used. PMID:24797135
Evaluating Differential Effects Using Regression Interactions and Regression Mixture Models
ERIC Educational Resources Information Center
Van Horn, M. Lee; Jaki, Thomas; Masyn, Katherine; Howe, George; Feaster, Daniel J.; Lamont, Andrea E.; George, Melissa R. W.; Kim, Minjung
2015-01-01
Research increasingly emphasizes understanding differential effects. This article focuses on understanding regression mixture models, which are relatively new statistical methods for assessing differential effects by comparing results to using an interactive term in linear regression. The research questions which each model answers, their…
Retro-regression--another important multivariate regression improvement.
Randić, M
2001-01-01
We review the serious problem associated with instabilities of the coefficients of regression equations, referred to as the MRA (multivariate regression analysis) "nightmare of the first kind". This is manifested when, in a stepwise regression, a descriptor is included in or excluded from a regression. The consequence is an unpredictable change in the coefficients of the descriptors that remain in the regression equation. We follow with consideration of an even more serious problem, referred to as the MRA "nightmare of the second kind", arising when optimal descriptors are selected from a large pool of descriptors. This process typically causes, at different steps of the stepwise regression, a replacement of several previously used descriptors by new ones. We describe a procedure that resolves these difficulties. The approach is illustrated on boiling points of nonanes, which are considered (1) by using an ordered connectivity basis; (2) by using an ordering resulting from application of a greedy algorithm; and (3) by using an ordering derived from an exhaustive search for optimal descriptors. A novel variant of multiple regression analysis, called retro-regression (RR), is outlined showing how it resolves the ambiguities associated with both "nightmares" of the first and the second kind of MRA. PMID:11410035
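The "nightmare of the first kind" is easy to reproduce: with two nearly collinear descriptors, adding the second radically changes the coefficient of the first even though the fit barely changes. The data below are constructed for the demonstration (not from the nonane study), designed so the coefficient shift is exact:

```python
def ols(X, y):
    """Ordinary least squares via the normal equations (X'X) b = X'y,
    solved with Gaussian elimination -- enough for a tiny demo."""
    m, k = len(X), len(X[0])
    A = [[sum(X[i][r] * X[i][c] for i in range(m)) for c in range(k)]
         for r in range(k)]
    b = [sum(X[i][r] * y[i] for i in range(m)) for r in range(k)]
    for col in range(k):                      # forward elimination, pivoting
        piv = max(range(col, k), key=lambda r: abs(A[r][col]))
        A[col], A[piv] = A[piv], A[col]
        b[col], b[piv] = b[piv], b[col]
        for r in range(col + 1, k):
            f = A[r][col] / A[col][col]
            for c in range(col, k):
                A[r][c] -= f * A[col][c]
            b[r] -= f * b[col]
    coef = [0.0] * k                          # back substitution
    for r in range(k - 1, -1, -1):
        coef[r] = (b[r] - sum(A[r][c] * coef[c]
                              for c in range(r + 1, k))) / A[r][r]
    return coef

x1 = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
noise = [0.1, -0.2, 0.1, 0.1, -0.2, 0.1]        # orthogonal to 1 and x1
x2 = [0.95 * a + e for a, e in zip(x1, noise)]  # nearly collinear with x1
y = [2.0 * a + e for a, e in zip(x1, noise)]

b_single = ols([[1.0, a] for a in x1], y)                 # y ~ 1 + x1
b_both = ols([[1.0, a, c] for a, c in zip(x1, x2)], y)    # y ~ 1 + x1 + x2
```

Regressing y on x1 alone gives slope 2.0; adding the nearly collinear x2 shifts the x1 coefficient to 1.05 while x2 takes 1.0. The fitted values barely move, but the interpretation of the coefficients changes completely.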
Pazvakawambwa, Lillian; Indongo, Nelago; Kazembe, Lawrence N.
2013-01-01
Background Marriage is a significant event in the life-course of individuals, and creates a system that characterizes societal and economic structures. Marital patterns and dynamics have changed a lot over the years, with decreasing proportions of marriage and increased levels of divorce and co-habitation in developing countries. Although such changes have been reported in African societies, including Namibia, they have largely remained unexplained. Objectives and Methods In this paper, we examined trends and patterns of marital status of women of marriageable age, 15 to 49 years, in Namibia using the 1992, 2000 and 2006 Demographic and Health Survey (DHS) data. Trends were established for selected demographic variables. Two binary logistic regression models, for ever-married versus never married and cohabitation versus married, were fitted to establish factors associated with such nuptial systems. Further, multinomial logistic regression models, adjusted for bio-demographic and socio-economic variables, were fitted separately for each year to establish determinants of type of union (never married, married and cohabitation). Results and Conclusions Findings indicate a general change away from marriage, with a shift in singulate mean age at marriage. Cohabitation was prevalent among those less than 30 years of age; the odds were higher in urban areas and increased since 1992. Be that as it may, marriage remained a persistent nuptiality pattern, common among the less educated and the employed, but with lower odds in urban areas. Results from the multinomial model suggest that marital status was associated with age at marriage, total children born, region, place of residence, education level and religion. We conclude that marital patterns have undergone significant transformation over the past two decades in Namibia, with a coexistence of the traditional marriage framework with co-habitation, and a sizeable proportion remaining unmarried into the late 30s. A shift in the singulate mean age is
A SAS macro for residual deviance of ordinal regression analysis.
Wan, J Y; Wang, W; Bromberg, J
1994-12-01
In this paper, a SAS macro is described for calculating the likelihood of the 'saturated' model in the analysis of ordinal regression. The outcome variable is multinomial on an ordinal scale, while the explanatory variables can be nominal or ordinal. Several ordinal regression models may be fitted to the data. One method of testing the goodness of fit of these regression models is to compare the residual deviance with the chi-squared distribution. In SAS, PROC LOGISTIC may be used to fit this type of data with a proportional odds model. Unfortunately, the residual deviance is not available from the output. Our SAS macro supplements the SAS output so that the residual deviance test may be carried out. Data from an ongoing HIV study are used as an illustration. PMID:7736732
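The quantity the macro supplies can be sketched directly: the residual deviance is twice the log-likelihood gap between the saturated model (observed group proportions) and the fitted model. The sketch below handles generic grouped multinomial counts with invented numbers; it is an illustration of the statistic, not the SAS macro itself:

```python
import math

def residual_deviance(observed_counts, model_probs):
    """D = 2 * (LL_saturated - LL_model) for grouped multinomial data.

    observed_counts: per-group count vectors over the ordinal categories.
    model_probs: fitted category probabilities for each group.
    Compare D to a chi-squared distribution to judge goodness of fit.
    """
    dev = 0.0
    for counts, probs in zip(observed_counts, model_probs):
        n = sum(counts)
        for o, p in zip(counts, probs):
            if o > 0:
                dev += 2.0 * o * math.log((o / n) / p)
    return dev

# Two groups, three ordinal categories (made-up counts).
obs = [[10, 20, 30], [15, 15, 10]]
fit_perfect = [[10/60, 20/60, 30/60], [15/40, 15/40, 10/40]]
fit_off = [[1/3, 1/3, 1/3], [1/3, 1/3, 1/3]]
d0 = residual_deviance(obs, fit_perfect)   # perfect fit -> deviance 0
d1 = residual_deviance(obs, fit_off)       # misfit -> positive deviance
```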
Frank, Laura K; Jannasch, Franziska; Kröger, Janine; Bedu-Addo, George; Mockenhaupt, Frank P; Schulze, Matthias B; Danquah, Ina
2015-07-01
Reduced rank regression (RRR) is an innovative technique to establish dietary patterns related to biochemical risk factors for type 2 diabetes, but has not been applied in sub-Saharan Africa. In a hospital-based case-control study for type 2 diabetes in Kumasi (diabetes cases, 538; controls, 668) dietary intake was assessed by a specific food frequency questionnaire. After random split of our study population, we derived a dietary pattern in the training set using RRR with adiponectin, HDL-cholesterol and triglycerides as responses and 35 food items as predictors. This pattern score was applied to the validation set, and its association with type 2 diabetes was examined by logistic regression. The dietary pattern was characterized by a high consumption of plantain, cassava, and garden egg, and a low intake of rice, juice, vegetable oil, eggs, chocolate drink, sweets, and red meat; the score correlated positively with serum triglycerides and negatively with adiponectin. The multivariate-adjusted odds ratio of type 2 diabetes for the highest quintile compared to the lowest was 4.43 (95% confidence interval: 1.87-10.50, p for trend < 0.001). The identified dietary pattern increases the odds of type 2 diabetes in urban Ghanaians, which is mainly attributed to increased serum triglycerides. PMID:26198248
Logistics Modeling for Lunar Exploration Systems
NASA Technical Reports Server (NTRS)
Andraschko, Mark R.; Merrill, R. Gabe; Earle, Kevin D.
2008-01-01
The extensive logistics required to support extended crewed operations in space make effective modeling of logistics requirements and deployment critical to predicting the behavior of human lunar exploration systems. This paper discusses the software that has been developed as part of the Campaign Manifest Analysis Tool in support of strategic analysis activities under the Constellation Architecture Team - Lunar. The described logistics module enables definition of logistics requirements across multiple surface locations and allows for the transfer of logistics between those locations. A key feature of the module is the loading algorithm that is used to efficiently load logistics by type into carriers and then onto landers. Attention is given to the capabilities and limitations of this loading algorithm, particularly with regard to surface transfers. These capabilities are described within the context of the object-oriented software implementation, with details provided on the applicability of using this approach to model other human exploration scenarios. Some challenges of incorporating probabilistic methods into this type of logistics analysis model are discussed at a high level.
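The loading step can be illustrated with a standard greedy heuristic. This first-fit-decreasing sketch with made-up masses is a stand-in for, not a reconstruction of, the Campaign Manifest Analysis Tool's actual algorithm (which also handles logistics types and surface transfers):

```python
def load_items(items, carrier_capacity):
    """First-fit-decreasing: pack item masses into as few carriers as possible.

    Sort items heaviest-first, place each in the first carrier with room,
    and open a new carrier only when none fits.
    """
    remaining = []  # leftover capacity of each opened carrier
    loads = []      # parallel list: masses loaded into each carrier
    for mass in sorted(items, reverse=True):
        for i, room in enumerate(remaining):
            if mass <= room:
                remaining[i] -= mass
                loads[i].append(mass)
                break
        else:
            remaining.append(carrier_capacity - mass)
            loads.append([mass])
    return loads

# Ten logistics items of mixed mass (kg), carriers of 100 kg capacity.
items = [70, 60, 50, 40, 30, 20, 20, 10, 10, 10]
loads = load_items(items, carrier_capacity=100)
```

For these masses the heuristic reaches the theoretical minimum of four carriers (320 kg total at 100 kg per carrier), though first-fit-decreasing is not guaranteed optimal in general.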
ISS Logistics Hardware Disposition and Metrics Validation
NASA Technical Reports Server (NTRS)
Rogers, Toneka R.
2010-01-01
I was assigned to the Logistics Division of the International Space Station (ISS)/Spacecraft Processing Directorate. The Division consists of eight NASA engineers and specialists who oversee the logistics portion of the Checkout, Assembly, and Payload Processing Services (CAPPS) contract. Boeing, their sub-contractors, and the Boeing Prime contract out of Johnson Space Center provide the Integrated Logistics Support for the ISS activities at Kennedy Space Center. Essentially, they ensure that spares are available to support flight hardware processing and the associated ground support equipment (GSE). Boeing maintains a Depot for electrical, mechanical and structural modifications and/or repair capability as required. My assigned task was to learn project management techniques utilized by NASA and its contractors to provide an efficient and effective logistics support infrastructure to the ISS program. Within the Space Station Processing Facility (SSPF), I was exposed to Logistics support components, such as the NASA Spacecraft Services Depot (NSSD) capabilities, Mission Processing tools, techniques, and Warehouse support issues, required for integrating Space Station elements at the Kennedy Space Center. I also supported the identification of near-term ISS Hardware and Ground Support Equipment (GSE) candidates for excessing/disposition prior to October 2010; and the validation of several Logistics Metrics used by the contractor to measure logistics support effectiveness.
Precision Efficacy Analysis for Regression.
ERIC Educational Resources Information Center
Brooks, Gordon P.
When multiple linear regression is used to develop a prediction model, sample size must be large enough to ensure stable coefficients. If the derivation sample size is inadequate, the model may not predict well for future subjects. The precision efficacy analysis for regression (PEAR) method uses a cross- validity approach to select sample sizes…
Ecological Regression and Voting Rights.
ERIC Educational Resources Information Center
Freedman, David A.; And Others
1991-01-01
The use of ecological regression in voting rights cases is discussed in the context of a lawsuit against Los Angeles County (California) in 1990. Ecological regression assumes that systematic voting differences between precincts are explained by ethnic differences. An alternative neighborhood model is shown to lead to different conclusions. (SLD)
NASA Astrophysics Data System (ADS)
Koloc, Z.; Korf, J.; Kavan, P.
This adjustment (modification) concerns gear chains that transmit motion between sprocket wheels on parallel shafts. Its purpose is to remove unwanted effects by guiding the chain links with a sliding guide rail, ensuring a smooth fit of the chain rollers into the tooth gaps of the wheel.
Adjustment to Recruit Training.
ERIC Educational Resources Information Center
Anderson, Betty S.
The thesis examines problems of adjustment encountered by new recruits entering the military services. Factors affecting adjustment are discussed: the recruit training staff and environment, recruit background characteristics, the military's image, the changing values and motivations of today's youth, and the recruiting process. Sources of…
[Regression grading in gastrointestinal tumors].
Tischoff, I; Tannapfel, A
2012-02-01
Preoperative neoadjuvant chemoradiation therapy is a well-established and essential part of the interdisciplinary treatment of gastrointestinal tumors. Neoadjuvant treatment leads to regressive changes in tumors. To evaluate the histological tumor response, different scoring systems describing these regressive changes are used, known as tumor regression grading. Tumor regression grading is usually based on the presence of residual vital tumor cells in proportion to the total tumor size. Currently, no nationally or internationally accepted grading system exists. In general, common guidelines should be used in the pathohistological diagnostics of tumors after neoadjuvant therapy. In particular, the standard tumor grading is replaced by tumor regression grading. Furthermore, tumors after neoadjuvant treatment are marked with the prefix "y" in the TNM classification. PMID:22293790
Cook, D A; Coory, M; Webster, R A
2011-06-01
OBJECTIVE To introduce a new type of risk-adjusted (RA) exponentially weighted moving average (EWMA) chart and to compare it to a commonly used type of variable life adjusted display chart for the analysis of patient outcomes. DATA Routine inpatient data on mortality following admission for acute myocardial infarction, from all public and private hospitals in Queensland, Australia. METHODS The RA-EWMA plots the EWMA of the observed and predicted values. Predicted values were obtained from a logistic regression model for all hospitals in Queensland. The EWMA of the predicted values is a moving centre line, reflecting current patient case mix at a particular hospital. Thresholds around this moving centre line provide a scale by which to assess the importance of trends in the EWMA of the observed values. RESULTS The RA-EWMA chart can be designed to have performance equivalent, in terms of average run lengths, to the variable life adjusted display chart. The advantages of the RA-EWMA are that it communicates information about the current level of an indicator in a direct and understandable way, and it explicitly displays information about the current patient case mix. Also, because it is not reset, the RA-EWMA is a more natural chart to use in health, where it is exceedingly rare to stop or dramatically and abruptly alter a process of care. CONCLUSION The RA-EWMA chart is a direct and intuitive way to display information about an indicator while accounting for differences in case mix. PMID:21209145
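The chart's mechanics can be sketched as two parallel EWMAs: one of observed outcomes and one of model-predicted risks, the latter forming the moving centre line. The smoothing constant and data below are illustrative assumptions, not values from the study:

```python
def ra_ewma(observed, predicted, lam=0.1):
    """EWMA-smooth both the observed outcomes and the model-predicted risks.

    The smoothed predictions form a moving centre line reflecting the
    current case mix; drift of the observed EWMA away from it flags a
    change in outcomes not explained by case mix.
    """
    z_obs, z_pred = observed[0], predicted[0]
    signal_line, centre = [z_obs], [z_pred]
    for y, p in zip(observed[1:], predicted[1:]):
        z_obs = lam * y + (1 - lam) * z_obs
        z_pred = lam * p + (1 - lam) * z_pred
        signal_line.append(z_obs)
        centre.append(z_pred)
    return signal_line, centre

# Predicted risk stays at 10%, but observed deaths jump after patient 50.
predicted = [0.1] * 100
observed = [0] * 50 + [1] * 50
signal_line, centre = ra_ewma(observed, predicted)
gap = [s - c for s, c in zip(signal_line, centre)]
```

Because the statistic is never reset, the gap between the observed line and the centre line tracks the process continuously, which is the property the authors highlight for health-care monitoring.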
Front-End Analysis Cornerstone of Logistics
NASA Technical Reports Server (NTRS)
Nager, Paul J.
2000-01-01
The presentation provides an overview of Front-End Logistics Support Analysis (FELSA), when it should be performed, benefits of performing FELSA and why it should be performed, how it is conducted, and examples.
Practical Session: Simple Linear Regression
NASA Astrophysics Data System (ADS)
Clausel, M.; Grégoire, G.
2014-12-01
Two exercises are proposed to illustrate simple linear regression. The first one is based on Galton's famous data set on heredity. We use the lm R command and get the coefficient estimates, the residual standard error, R2, the residuals… In the second example, devoted to data on the vapor tension of mercury, we fit a simple linear regression, predict values, and look ahead to multiple linear regression. This practical session is an excerpt from practical exercises proposed by A. Dalalyan at ENPC (see Exercises 1 and 2 of http://certis.enpc.fr/~dalalyan/Download/TP_ENPC_4.pdf).
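The quantities the session reads off from lm can be reproduced by hand. A minimal Python sketch with made-up Galton-style numbers (the real exercise uses Galton's data set and R):

```python
import numpy as np

def simple_lm(x, y):
    """Least-squares fit of y = a + b*x, as R's lm(y ~ x) computes.

    Returns the intercept, slope and R^2.
    """
    x, y = np.asarray(x, float), np.asarray(y, float)
    b = np.sum((x - x.mean()) * (y - y.mean())) / np.sum((x - x.mean()) ** 2)
    a = y.mean() - b * x.mean()
    resid = y - (a + b * x)
    r2 = 1.0 - np.sum(resid ** 2) / np.sum((y - y.mean()) ** 2)
    return a, b, r2

# Illustrative numbers: child height regressed on mid-parent height
a, b, r2 = simple_lm([64, 66, 68, 70, 72], [65.5, 66.5, 67.5, 68.5, 69.5])
```

The slope below 1 on such data is exactly Galton's "regression toward the mean" phenomenon.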
Splines for Diffeomorphic Image Regression
Singh, Nikhil; Niethammer, Marc
2016-01-01
This paper develops a method for splines on diffeomorphisms for image regression. In contrast to previously proposed methods to capture image changes over time, such as geodesic regression, the method can capture more complex spatio-temporal deformations. In particular, it is a first step towards capturing periodic motions for example of the heart or the lung. Starting from a variational formulation of splines the proposed approach allows for the use of temporal control points to control spline behavior. This necessitates the development of a shooting formulation for splines. Experimental results are shown for synthetic and real data. The performance of the method is compared to geodesic regression. PMID:25485370
Logistics Reduction Technologies for Exploration Missions
NASA Technical Reports Server (NTRS)
Broyan, James L., Jr.; Ewert, Michael K.; Fink, Patrick W.
2014-01-01
Human exploration missions under study are very limited by the launch mass capacity of existing and planned vehicles. The logistical mass of crew items is typically considered separate from the vehicle structure, habitat outfitting, and life support systems. Consequently, crew item logistical mass is typically competing with vehicle systems for mass allocation. NASA's Advanced Exploration Systems (AES) Logistics Reduction and Repurposing (LRR) Project is developing five logistics technologies guided by a systems engineering cradle-to-grave approach to enable used crew items to augment vehicle systems. Specifically, AES LRR is investigating the direct reduction of clothing mass, the repurposing of logistical packaging, the use of autonomous logistics management technologies, the processing of spent crew items to benefit radiation shielding and water recovery, and the conversion of trash to propulsion gases. The systematic implementation of these types of technologies will increase launch mass efficiency by enabling items to be used for secondary purposes and improve the habitability of the vehicle as the mission duration increases. This paper provides a description and the challenges of the five technologies under development and the estimated overall mission benefits of each technology.
McKenzie, K.R.
1959-07-01
An electrode support which permits accurate alignment and adjustment of the electrode in a plurality of planes and about a plurality of axes in a calutron is described. The support will align the slits in the electrode with the slits of an ionizing chamber so as to provide for the egress of ions. The support comprises an insulator, a leveling plate carried by the insulator and having diametrically opposed attaching screws screwed to the plate and the insulator and diametrically opposed adjusting screws for bearing against the insulator, and an electrode associated with the plate for adjustment therewith.
Kautter, John; Pope, Gregory C.
2004-01-01
The authors document the development of the CMS frailty adjustment model, a Medicare payment approach that adjusts payments to a Medicare managed care organization (MCO) according to the functional impairment of its community-residing enrollees. Beginning in 2004, this approach is being applied to certain organizations, such as Program of All-Inclusive Care for the Elderly (PACE), that specialize in providing care to the community-residing frail elderly. In the future, frailty adjustment could be extended to more Medicare managed care organizations. PMID:25372243
Abstract Expression Grammar Symbolic Regression
NASA Astrophysics Data System (ADS)
Korns, Michael F.
This chapter examines the use of Abstract Expression Grammars to perform the entire Symbolic Regression process without the use of Genetic Programming per se. The techniques explored produce a symbolic regression engine which has absolutely no bloat, which allows total user control of the search space and output formulas, and which is faster and more accurate than the engines produced in our previous papers using Genetic Programming. The genome is an all-vector structure with four chromosomes plus additional epigenetic and constraint vectors, allowing total user control of the search space and the final output formulas. A combination of specialized compiler techniques, genetic algorithms, particle swarm, age-layered populations, plus discrete and continuous differential evolution is used to produce an improved symbolic regression system. Nine base test cases from the literature are used to test the improvement in speed and accuracy. The improved results indicate that these techniques move us a big step closer toward future industrial-strength symbolic regression systems.
Multiple Regression and Its Discontents
ERIC Educational Resources Information Center
Snell, Joel C.; Marsh, Mitchell
2012-01-01
Multiple regression is part of a larger statistical strategy originated by Gauss. The authors raise questions about the theory and suggest some changes that would make room for Mandelbrot and Serendipity.
Time-Warped Geodesic Regression
Hong, Yi; Singh, Nikhil; Kwitt, Roland; Niethammer, Marc
2016-01-01
We consider geodesic regression with parametric time-warps. This allows, for example, capturing saturation effects as typically observed during brain development or degeneration. While highly flexible models to analyze time-varying image and shape data based on generalizations of splines and polynomials have been proposed recently, they come at the cost of substantially more complex inference. Our focus in this paper is therefore to keep the model and its inference as simple as possible while still capturing expected biological variation. We demonstrate that by augmenting geodesic regression with parametric time-warp functions, we can achieve flexibility comparable to more complex models while retaining model simplicity. In addition, the time-warp parameters provide useful information about underlying anatomical changes, as demonstrated for the analysis of corpora callosa and rat calvariae. We exemplify our strategy for shape regression on the Grassmann manifold, but note that the method is generally applicable to time-warped geodesic regression. PMID:25485368
Basis Selection for Wavelet Regression
NASA Technical Reports Server (NTRS)
Wheeler, Kevin R.; Lau, Sonie (Technical Monitor)
1998-01-01
A wavelet basis selection procedure is presented for wavelet regression. Both the basis and the threshold are selected using cross-validation. The method includes the capability of incorporating prior knowledge on the smoothness (or shape of the basis functions) into the basis selection procedure. The results of the method are demonstrated on sampled functions widely used in the wavelet regression literature. The results of the method are contrasted with other published methods.
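As a toy illustration of wavelet regression, here is a one-level Haar transform with soft thresholding in Python. The paper's actual contribution, selecting the basis and threshold by cross-validation, is not reproduced; the threshold below is simply supplied:

```python
import numpy as np

def haar_denoise(y, threshold):
    """One-level Haar wavelet regression with soft thresholding.

    A minimal sketch: transform, shrink the detail coefficients,
    invert. Input length must be even.
    """
    y = np.asarray(y, float)
    s = np.sqrt(2.0)
    approx = (y[0::2] + y[1::2]) / s   # smooth coefficients
    detail = (y[0::2] - y[1::2]) / s   # detail coefficients
    detail = np.sign(detail) * np.maximum(np.abs(detail) - threshold, 0.0)
    out = np.empty_like(y)             # inverse transform
    out[0::2] = (approx + detail) / s
    out[1::2] = (approx - detail) / s
    return out
```

With threshold 0 the signal is reconstructed exactly; a large threshold kills all detail coefficients and leaves pairwise local averages.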
Regression methods for spatial data
NASA Technical Reports Server (NTRS)
Yakowitz, S. J.; Szidarovszky, F.
1982-01-01
The kriging approach, a parametric regression method used by hydrologists and mining engineers, among others, also provides an error estimate for the integral of the regression function. The kriging method is explored and some of its statistical characteristics are described. The Watson method and theory are extended so that the kriging features are displayed. Theoretical and computational comparisons of the kriging and Watson approaches are offered.
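A minimal ordinary-kriging predictor illustrates the approach. The linear variogram used here is an assumption for the sketch (in practice the variogram is fitted to the data), and this is not the paper's implementation:

```python
import numpy as np

def ordinary_kriging(coords, values, target, variogram=lambda h: h):
    """Ordinary kriging prediction at a target point.

    Solves the kriging system [Gamma 1; 1' 0][w; mu] = [gamma0; 1]
    with an assumed linear variogram gamma(h) = h.
    """
    coords = np.asarray(coords, float)
    n = len(values)
    # pairwise distances among data points, and to the target
    d = np.linalg.norm(coords[:, None, :] - coords[None, :, :], axis=2)
    d0 = np.linalg.norm(coords - np.asarray(target, float), axis=1)
    A = np.ones((n + 1, n + 1))
    A[:n, :n] = variogram(d)
    A[n, n] = 0.0
    b = np.append(variogram(d0), 1.0)
    w = np.linalg.solve(A, b)[:n]      # kriging weights (sum to 1)
    return float(w @ np.asarray(values, float))
```

Kriging is an exact interpolator: predicting at an observed location returns the observed value.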
Wrong Signs in Regression Coefficients
NASA Technical Reports Server (NTRS)
McGee, Holly
1999-01-01
When using parametric cost estimation, it is important to note the possibility of the regression coefficients having the wrong sign. A wrong sign is defined as a sign on the regression coefficient opposite to the researcher's intuition and experience. Some possible causes for the wrong sign discussed in this paper are a small range of x's, leverage points, missing variables, multicollinearity, and computational error. Additionally, techniques for determining the cause of the wrong sign are given.
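The multicollinearity cause is easy to reproduce with made-up numbers: below, y clearly rises with x2 taken on its own, yet the fitted partial coefficient on x2 comes out negative, the "wrong" sign relative to intuition:

```python
import numpy as np

# Two nearly collinear predictors: x2 is x1 plus a small wiggle.
x1 = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
x2 = x1 + np.array([0.1, -0.1, 0.1, -0.1, 0.1])
y = 3.5 * x1 - 0.5 * x2          # x2's true partial effect is negative

X = np.column_stack([np.ones(5), x1, x2])
coef, *_ = np.linalg.lstsq(X, y, rcond=None)
# The marginal correlation of x2 with y is strongly positive (x2
# tracks x1), so intuition expects a positive sign, but the fitted
# partial coefficient on x2 is -0.5.
marginal_corr = np.corrcoef(x2, y)[0, 1]
```

The remedy is not to flip the sign by hand but to diagnose the cause, here near-collinearity of x1 and x2.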
Remotely Adjustable Hydraulic Pump
NASA Technical Reports Server (NTRS)
Kouns, H. H.; Gardner, L. D.
1987-01-01
Outlet pressure adjusted to match varying loads. Electrohydraulic servo has positioned sleeve in leftmost position, adjusting outlet pressure to maximum value. Sleeve in equilibrium position, with control land covering control port. For lowest pressure setting, sleeve shifted toward right by increased pressure on sleeve shoulder from servovalve. Pump used in aircraft and robots, where hydraulic actuators repeatedly turned on and off, changing pump load frequently and over wide range.
Shrinkage regression-based methods for microarray missing value imputation
2013-01-01
Background Missing values commonly occur in microarray data, which usually contain more than 5% missing values with up to 90% of genes affected. Inaccurate missing value estimation reduces the power of downstream microarray data analyses. Many types of methods have been developed to estimate missing values. Among them, the regression-based methods are very popular and have been shown to perform better than the other types of methods on many testing microarray datasets. Results To further improve the performance of the regression-based methods, we propose shrinkage regression-based methods. Our methods take advantage of the correlation structure in the microarray data and select similar genes for the target gene by Pearson correlation coefficients. In addition, our methods incorporate the least squares principle, utilize a shrinkage estimation approach to adjust the coefficients of the regression model, and then use the new coefficients to estimate missing values. Simulation results show that the proposed methods provide more accurate missing value estimation in six testing microarray datasets than the existing regression-based methods do. Conclusions Imputation of missing values is a very important aspect of microarray data analyses because most of the downstream analyses require a complete dataset. Therefore, exploring accurate and efficient methods for estimating missing values has become an essential issue. Since our proposed shrinkage regression-based methods can provide accurate missing value estimation, they are competitive alternatives to the existing regression-based methods. PMID:24565159
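The select-similar-genes-then-shrink idea can be sketched as follows. The fixed shrinkage factor and the choice of k are placeholders (the paper derives its shrinkage from the data), so this is an illustration, not the proposed method:

```python
import numpy as np

def impute_missing(data, gene, sample, k=3, shrink=0.9):
    """Sketch of shrinkage regression-based imputation.

    data: genes x samples matrix with np.nan at (gene, sample).
    Similar genes are chosen by |Pearson correlation|; their
    least-squares slopes are shrunk toward zero before predicting.
    """
    obs = [j for j in range(data.shape[1]) if j != sample]
    target = data[gene, obs]
    others = np.array([i for i in range(data.shape[0]) if i != gene])
    corr = np.array([abs(np.corrcoef(data[i, obs], target)[0, 1])
                     for i in others])
    similar = others[np.argsort(corr)[::-1][:k]]   # k most similar genes
    X = np.column_stack([np.ones(len(obs)), data[similar][:, obs].T])
    beta, *_ = np.linalg.lstsq(X, target, rcond=None)
    beta[1:] *= shrink                             # shrink slopes only
    x_new = np.concatenate(([1.0], data[similar, sample]))
    return float(x_new @ beta)
```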
Background stratified Poisson regression analysis of cohort data.
Richardson, David B; Langholz, Bryan
2012-03-01
Background stratified Poisson regression is an approach that has been used in the analysis of data derived from a variety of epidemiologically important studies of radiation-exposed populations, including uranium miners, nuclear industry workers, and atomic bomb survivors. We describe a novel approach to fit Poisson regression models that adjust for a set of covariates through background stratification while directly estimating the radiation-disease association of primary interest. The approach makes use of an expression for the Poisson likelihood that treats the coefficients for stratum-specific indicator variables as 'nuisance' variables and avoids the need to explicitly estimate the coefficients for these stratum-specific parameters. Log-linear models, as well as other general relative rate models, are accommodated. This approach is illustrated using data from the Life Span Study of Japanese atomic bomb survivors and data from a study of underground uranium miners. The point estimate and confidence interval obtained from this 'conditional' regression approach are identical to the values obtained using unconditional Poisson regression with model terms for each background stratum. Moreover, it is shown that the proposed approach allows estimation of background stratified Poisson regression models of non-standard form, such as models that parameterize latency effects, as well as regression models in which the number of strata is large, thereby overcoming the limitations of previously available statistical software for fitting background stratified Poisson regression models. PMID:22193911
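The profiling idea can be illustrated for a linear excess relative rate model, rate = exp(a_s)(1 + beta*dose): for fixed beta the stratum intercepts a_s have a closed-form MLE, cases_s divided by the stratum's expected count, so they drop out of the likelihood. This is a sketch of the idea with toy data, not the authors' software:

```python
import numpy as np

def profile_loglik(beta, cases, pyr, dose, stratum):
    """Profile Poisson log-likelihood, stratum intercepts profiled out.

    Terms constant in beta are dropped; for each stratum the nuisance
    intercept exp(a_s) = D_s / sum(pyr * (1 + beta*dose)) is plugged in
    rather than estimated explicitly.
    """
    rr = 1.0 + beta * dose                       # relative rate per cell
    ll = 0.0
    for s in np.unique(stratum):
        m = stratum == s
        expected = np.sum(pyr[m] * rr[m])
        d_s = np.sum(cases[m])
        if d_s > 0:
            ll += d_s * np.log(d_s / expected)   # profiled intercept term
        ll += np.sum(cases[m] * np.log(rr[m])) - d_s
    return ll
```

Maximizing this one-dimensional function over beta (e.g. on a grid) gives the same estimate as unconditional Poisson regression with one indicator term per stratum.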
Weighted triangulation adjustment
Anderson, Walter L.
1969-01-01
The variation of coordinates method is employed to perform a weighted least squares adjustment of horizontal survey networks. Geodetic coordinates are required for each fixed and adjustable station. A preliminary inverse geodetic position computation is made for each observed line. Weights associated with each observation equation for direction, azimuth, and distance are applied in the formation of the normal equations in the least squares adjustment. The number of normal equations that may be solved is twice the number of new stations and less than 150. When the normal equations are solved, shifts are produced at adjustable stations. Previously computed correction factors are applied to the shifts and a most probable geodetic position is found for each adjustable station. Final azimuths and distances are computed. These may be written onto magnetic tape for subsequent computation of state plane or grid coordinates. Input consists of punch cards containing project identification, program options, and position and observation information. Results listed include preliminary and final positions, residuals, observation equations, solution of the normal equations showing magnitudes of shifts, and a plot of each adjusted and fixed station. During processing, data sets containing irrecoverable errors are rejected and the type of error is listed. The computer resumes processing of additional data sets. Other conditions cause warning errors to be issued, and processing continues with the current data set.
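The normal-equation step of such an adjustment reduces to generic weighted least squares. A sketch of that step (not the original program's formulation):

```python
import numpy as np

def weighted_ls_adjustment(A, b, w):
    """Weighted least squares solution of observation equations A x = b.

    A : design matrix of the observation equations
        (direction / azimuth / distance rows)
    b : observed-minus-computed misclosures
    w : observation weights
    Returns the coordinate shifts x and the residuals v = b - A x.
    """
    A, b, w = np.asarray(A, float), np.asarray(b, float), np.asarray(w, float)
    N = A.T @ (w[:, None] * A)   # normal equations  N x = t
    t = A.T @ (w * b)
    x = np.linalg.solve(N, t)
    return x, b - A @ x
```

The shifts x are then added to the provisional station coordinates, exactly the "variation of coordinates" step.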
NASA Space Exploration Logistics Workshop Proceedings
NASA Technical Reports Server (NTRS)
de Weck, Olivier; Evans, William A.; Parrish, Joe; James, Sarah
2006-01-01
As NASA has embarked on a new Vision for Space Exploration, there is new energy and focus around the area of manned space exploration. These activities encompass the design of new vehicles such as the Crew Exploration Vehicle (CEV) and Crew Launch Vehicle (CLV) and the identification of commercial opportunities for space transportation services, as well as continued operations of the Space Shuttle and the International Space Station. Reaching the Moon and eventually Mars with a mix of both robotic and human explorers for short term missions is a formidable challenge in itself. How to achieve this in a safe, efficient and long-term sustainable way is yet another question. The challenge is not only one of vehicle design, launch, and operations but also one of space logistics. Oftentimes, logistical issues are not given enough consideration upfront, in relation to the large share of operating budgets they consume. In this context, a group of 54 experts in space logistics met for a two-day workshop to discuss the following key questions: 1. What is the current state-of the art in space logistics, in terms of architectures, concepts, technologies as well as enabling processes? 2. What are the main challenges for space logistics for future human exploration of the Moon and Mars, at the intersection of engineering and space operations? 3. What lessons can be drawn from past successes and failures in human space flight logistics? 4. What lessons and connections do we see from terrestrial analogies as well as activities in other areas, such as U.S. military logistics? 5. What key advances are required to enable long-term success in the context of a future interplanetary supply chain? These proceedings summarize the outcomes of the workshop, reference particular presentations, panels and breakout sessions, and record specific observations that should help guide future efforts.
ISS Update: Logistics Reduction and Repurposing (Part 2)
Public Affairs Officer Brandi Dean interviews Sarah Shull, Deputy Project Manager Logistics Reduction and Repurposing. Shull, who is with the Advanced Exploration Systems, discusses the Logistics t...
ISS Update: Logistics Reduction and Repurposing (Part 1)
Public Affairs Officer Brandi Dean interviews Sarah Shull, Deputy Project Manager Logistics Reduction and Repurposing. Shull, who is with the Advanced Exploration Systems, discusses the Logistics t...
Logistics Lessons Learned in NASA Space Flight
NASA Technical Reports Server (NTRS)
Evans, William A.; DeWeck, Olivier; Laufer, Deanna; Shull, Sarah
2006-01-01
The Vision for Space Exploration sets out a number of goals, involving both strategic and tactical objectives. These include returning the Space Shuttle to flight, completing the International Space Station, and conducting human expeditions to the Moon by 2020. Each of these goals has profound logistics implications. In the consideration of these objectives, a need for a study on NASA logistics lessons learned was recognized. The study endeavors to identify both needs for space exploration and challenges in the development of past logistics architectures, as well as in the design of space systems. This study may also be appropriately applied as guidance in the development of an integrated logistics architecture for future human missions to the Moon and Mars. This report first summarizes current logistics practices for the Space Shuttle Program (SSP) and the International Space Station (ISS) and examines the practices of manifesting, stowage, inventory tracking, waste disposal, and return logistics. The key findings of this examination are that while the current practices do have many positive aspects, there are also several shortcomings. These shortcomings include a high level of excess complexity, redundancy of information and lack of a common database, and a large human-in-the-loop component. Later sections of this report describe the methodology and results of our work to systematically gather logistics lessons learned from past and current human spaceflight programs and to validate these lessons through a survey of the opinions of current space logisticians. To consider the perspectives on logistics lessons, we searched several sources within NASA, including organizations with direct and indirect connections with the system flow in mission planning. We utilized crew debriefs, the John Commonsense lessons repository for the JSC Mission Operations Directorate, and the Skylab Lessons Learned. Additionally, we searched the public version of the Lessons Learned
Cunningham, Marc; Brown, Niquelle; Sacher, Suzy; Hatch, Benjamin; Inglis, Andrew; Aronovich, Dana
2015-01-01
Background: Contraceptive prevalence rate (CPR) is a vital indicator used by country governments, international donors, and other stakeholders for measuring progress in family planning programs against country targets and global initiatives as well as for estimating health outcomes. Because of the need for more frequent CPR estimates than population-based surveys currently provide, alternative approaches for estimating CPRs are being explored, including using contraceptive logistics data. Methods: Using data from the Demographic and Health Surveys (DHS) in 30 countries, population data from the United States Census Bureau International Database, and logistics data from the Procurement Planning and Monitoring Report (PPMR) and the Pipeline Monitoring and Procurement Planning System (PipeLine), we developed and evaluated 3 models to generate country-level, public-sector contraceptive prevalence estimates for injectable contraceptives, oral contraceptives, and male condoms. Models included: direct estimation through existing couple-years of protection (CYP) conversion factors, bivariate linear regression, and multivariate linear regression. Model evaluation consisted of comparing the referent DHS prevalence rates for each short-acting method with the model-generated prevalence rate using multiple metrics, including mean absolute error and proportion of countries where the modeled prevalence rate for each method was within 1, 2, or 5 percentage points of the DHS referent value. Results: For the methods studied, family planning use estimates from public-sector logistics data were correlated with those from the DHS, validating the quality and accuracy of current public-sector logistics data. Logistics data for oral and injectable contraceptives were significantly associated (P<.05) with the referent DHS values for both bivariate and multivariate models. For condoms, however, that association was only significant for the bivariate model. With the exception of the CYP
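The direct CYP-based estimation the abstract describes amounts to simple arithmetic. A sketch with illustrative conversion factors (approximate published values are on the order of 120 condoms, 15 pill cycles, or 4 injectable doses per CYP, but actual factors should be taken from the standard CYP tables):

```python
def cyp_prevalence(quantity_distributed, units_per_cyp, women_reproductive_age):
    """Direct contraceptive prevalence estimate from logistics data.

    quantity_distributed   : units of a method issued over one year
    units_per_cyp          : units per couple-year of protection
                             (illustrative, see note above)
    women_reproductive_age : population denominator
    Returns prevalence as a percentage.
    """
    users = quantity_distributed / units_per_cyp   # estimated users (CYP)
    return 100.0 * users / women_reproductive_age

# e.g. 1.5 million pill cycles distributed, 2 million WRA
rate = cyp_prevalence(1_500_000, 15.0, 2_000_000)
```

The paper's bivariate and multivariate models then regress DHS-measured prevalence on such logistics-derived quantities to correct their systematic biases.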
Reverse logistics in the construction industry.
Hosseini, M Reza; Rameezdeen, Raufdeen; Chileshe, Nicholas; Lehmann, Steffen
2015-06-01
Reverse logistics in construction refers to the movement of products and materials from salvaged buildings to a new construction site. While there is a plethora of studies looking at various aspects of the reverse logistics chain, there is no systematic review of literature on this important subject as applied to the construction industry. Therefore, the objective of this study is to integrate the fragmented body of knowledge on reverse logistics in construction, with the aim of promoting the concept among industry stakeholders and the wider construction community. Through a qualitative meta-analysis, the study synthesises the findings of previous studies and presents some actions needed by industry stakeholders to promote this concept within the real-life context. First, the trend of research and terminology related with reverse logistics is introduced. Second, it unearths the main advantages and barriers of reverse logistics in construction while providing some suggestions to harness the advantages and mitigate these barriers. Finally, it provides a future research direction based on the review. PMID:26018543
Humanitarian response: improving logistics to save lives.
McCoy, Jessica
2008-01-01
Each year, millions of people worldwide are affected by disasters, underscoring the importance of effective relief efforts. Many highly visible disaster responses have been inefficient and ineffective. Humanitarian agencies typically play a key role in disaster response (eg, procuring and distributing relief items to an affected population, assisting with evacuation, providing healthcare, assisting in the development of long-term shelter), and thus their efficiency is critical for a successful disaster response. The field of disaster and emergency response modeling is well established, but the application of such techniques to humanitarian logistics is relatively recent. This article surveys models of humanitarian response logistics and identifies promising opportunities for future work. Existing models analyze a variety of preparation and response decisions (eg, warehouse location and the distribution of relief supplies), consider both natural and manmade disasters, and typically seek to minimize cost or unmet demand. Opportunities to enhance the logistics of humanitarian response include the adaptation of models developed for general disaster response; the use of existing models, techniques, and insights from the literature on commercial supply chain management; the development of working partnerships between humanitarian aid organizations and private companies with expertise in logistics; and the consideration of behavioral factors relevant to a response. Implementable, realistic models that support the logistics of humanitarian relief can improve the preparation for and the response to disasters, which in turn can save lives. PMID:19069032
Boerner, Kathrin; Klinger, Julie; Rosen, Jules
2015-01-01
Background: Preparedness for death as a predictor of post-bereavement adjustment has not been studied prospectively. Little is known about pre-death factors associated with feeling prepared prior to the death of a loved one. Objective: Our aim was to prospectively assess the role of preparedness for death as a predictor of post-bereavement adjustment in informal caregivers (CGs) who experienced the death of their loved one and to identify predictors and correlates of complicated grief, depression, and preparedness for death among informal CGs. Methods: We conducted a prospective, longitudinal study using data collected for a randomized trial testing the efficacy of an intervention for CGs of recently placed care recipients (CRs). Subjects were 217 informal CGs of care recipients recently placed in nursing homes, and they were followed for 18 months. CGs were assessed in person by certified interviewers at 6-month intervals. Eighty-nine CGs experienced the death of their loved one in the course of the study. Measurements used included preparedness for death, advance care planning (ACP), complicated grief, depression, and sociodemographic characteristics. Results: CGs who reported feeling more prepared for the death experienced lower levels of complicated grief post-bereavement. A multivariate ordinal logistic regression model showed that spouses as opposed to adult child CGs were less prepared for the death, depressed CGs were less prepared, and patients who engaged in ACP had CGs who felt more prepared. CR overt expressions about wanting to die were also related to higher levels of preparedness in the CG. Conclusions: We show prospectively that preparedness for death facilitates post-bereavement adjustment and identify factors associated with preparedness. ACP can be an effective means for preparing informal CGs for the death of their CRs. PMID:25469737
Predicting Graduate Student Success in an MBA Program: Regression versus Classification.
ERIC Educational Resources Information Center
Wilson, Rick L.; Hardgrave, Bill C.
1995-01-01
A study of the ability of different models--including the classification techniques of discriminant analysis, logistic regression, and neural networks--to predict the academic success of master's degree students in business administration suggests that prediction is difficult, but that classification and nonparametric techniques may be…
2009-01-01
Background To allow direct comparison of bloodstream infection (BSI) rates between hospitals for performance measurement, observed rates need to be risk adjusted according to the types of patients cared for by the hospital. However, attribute data on all individual patients are often unavailable and hospital-level risk adjustment needs to be done using indirect indicator variables of patient case mix, such as hospital level. We aimed to identify medical services associated with high or low BSI rates, and to evaluate the services provided by the hospital as indicators that can be used for more objective hospital-level risk adjustment. Methods From February 2001-December 2007, 1719 monthly BSI counts were available from 18 hospitals in Queensland, Australia. BSI outcomes were stratified into four groups: overall BSI (OBSI), Staphylococcus aureus BSI (STAPH), intravascular device-related S. aureus BSI (IVD-STAPH) and methicillin-resistant S. aureus BSI (MRSA). Twelve services were considered as candidate risk-adjustment variables. For OBSI, STAPH and IVD-STAPH, we developed generalized estimating equation Poisson regression models that accounted for autocorrelation in longitudinal counts. Due to a lack of autocorrelation, a standard logistic regression model was specified for MRSA. Results Four risk services were identified for OBSI: AIDS (IRR 2.14, 95% CI 1.20 to 3.82), infectious diseases (IRR 2.72, 95% CI 1.97 to 3.76), oncology (IRR 1.60, 95% CI 1.29 to 1.98) and bone marrow transplants (IRR 1.52, 95% CI 1.14 to 2.03). Four protective services were also found. A similar but smaller group of risk and protective services were found for the other outcomes. Acceptable agreement between observed and fitted values was found for the OBSI and STAPH models but not for the IVD-STAPH and MRSA models. However, the IVD-STAPH and MRSA models successfully discriminated between hospitals with higher and lower BSI rates. Conclusion The high model goodness-of-fit and the higher
Urinary arsenic concentration adjustment factors and malnutrition.
Nermell, Barbro; Lindberg, Anna-Lena; Rahman, Mahfuzar; Berglund, Marika; Persson, Lars Ake; El Arifeen, Shams; Vahter, Marie
2008-02-01
This study aims at evaluating the suitability of adjusting urinary concentrations of arsenic, or any other urinary biomarker, for variations in urine dilution by creatinine and specific gravity in a malnourished population. We measured the concentrations of metabolites of inorganic arsenic, creatinine and specific gravity in spot urine samples collected from 1466 individuals, 5-88 years of age, in Matlab, rural Bangladesh, where arsenic-contaminated drinking water and malnutrition are prevalent (about 30% of the adults had body mass index (BMI) below 18.5 kg/m(2)). The urinary concentrations of creatinine were low; on average 0.55 g/L in the adolescents and adults and about 0.35 g/L in the 5-12 years old children. Therefore, adjustment by creatinine gave much higher numerical values for the urinary arsenic concentrations than did the corresponding data expressed as microg/L, adjusted by specific gravity. As evaluated by multiple regression analyses, urinary creatinine, adjusted by specific gravity, was more affected by body size, age, gender and season than was specific gravity. Furthermore, urinary creatinine was found to be significantly associated with urinary arsenic, which further disqualifies the creatinine adjustment. PMID:17900556
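The two dilution adjustments compared in the study are simple formulas. A sketch (the reference specific gravity of 1.020 is an assumed value here; a study would use the mean specific gravity of its own samples):

```python
def sg_adjust(conc, sg_sample, sg_ref=1.020):
    """Specific-gravity adjustment of a urinary analyte concentration.

    conc_adjusted = conc * (sg_ref - 1) / (sg_sample - 1)
    sg_ref is an assumed population reference value.
    """
    return conc * (sg_ref - 1.0) / (sg_sample - 1.0)

def creatinine_adjust(conc_ug_per_l, creatinine_g_per_l):
    """Creatinine adjustment: analyte expressed per gram of creatinine."""
    return conc_ug_per_l / creatinine_g_per_l
```

With the low creatinine concentrations reported (about 0.55 g/L in adults), dividing by creatinine roughly doubles the numerical value, which is the inflation the abstract describes.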
Interpretation of Standardized Regression Coefficients in Multiple Regression.
ERIC Educational Resources Information Center
Thayer, Jerome D.
The extent to which standardized regression coefficients (beta values) can be used to determine the importance of a variable in an equation was explored. The beta value and the part correlation coefficient--also called the semi-partial correlation coefficient and reported in squared form as the incremental "r squared"--were compared for variables…
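For a two-predictor model, both quantities the report compares can be computed directly from the pairwise correlations (a standard-formula sketch, not code from the report):

```python
import math

def beta_and_part(ry1, ry2, r12):
    """Standardized betas and semi-partial (part) correlations for a
    two-predictor regression, from the three pairwise correlations.
    The squared part correlation is a predictor's incremental R^2."""
    denom = 1.0 - r12**2
    beta1 = (ry1 - ry2 * r12) / denom
    beta2 = (ry2 - ry1 * r12) / denom
    sr1 = beta1 * math.sqrt(denom)   # part correlation of predictor 1
    sr2 = beta2 * math.sqrt(denom)
    return beta1, beta2, sr1, sr2
```

With uncorrelated predictors (r12 = 0) the beta value and the part correlation coincide; they diverge as collinearity grows, which is why beta alone can mislead about a variable's importance.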
Demosaicing Based on Directional Difference Regression and Efficient Regression Priors.
Wu, Jiqing; Timofte, Radu; Van Gool, Luc
2016-08-01
Color demosaicing is a key image processing step that aims to reconstruct the missing pixel values from a recorded raw image. On the one hand, numerous interpolation methods exploiting spatial-spectral correlations are computationally efficient, but they yield poor image quality and strong visible artifacts. On the other hand, optimization strategies, such as learned simultaneous sparse coding and sparsity- and adaptive-principal-component-analysis-based algorithms, were shown to greatly improve image quality compared with that delivered by interpolation methods, but unfortunately are computationally heavy. In this paper, we propose efficient regression priors as a novel, fast post-processing algorithm that learns the regression priors offline from training data. We also propose an independent efficient demosaicing algorithm based on directional difference regression, and introduce its enhanced version based on fused regression. We achieve image quality comparable to that of the state-of-the-art methods on three benchmarks, while being order(s) of magnitude faster. PMID:27254866
A linear regression solution to the spatial autocorrelation problem
NASA Astrophysics Data System (ADS)
Griffith, Daniel A.
The Moran Coefficient spatial autocorrelation index can be decomposed into orthogonal map pattern components. This decomposition relates it directly to standard linear regression, in which the corresponding eigenvectors can be used as predictors. This paper reports comparative results between these linear regressions and their auto-Gaussian counterparts for the following georeferenced data sets: Columbus (Ohio) crime, Ottawa-Hull median family income, Toronto population density, southwest Ohio unemployment, Syracuse pediatric lead poisoning, Glasgow standard mortality rates, and a small remotely sensed image of the High Peak district. This methodology is extended to auto-logistic and auto-Poisson situations, with selected data analyses including the percentage of urban population across Puerto Rico and the frequency of SIDS cases across North Carolina. These data analytic results suggest that this approach to georeferenced data analysis offers considerable promise.
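The Moran Coefficient at the core of this decomposition can be sketched in a few lines (a textbook implementation, not the paper's code):

```python
def morans_i(values, w):
    """Moran coefficient I for values at n locations, given an n x n
    spatial-weights matrix w (zero diagonal). Positive I indicates
    positive spatial autocorrelation."""
    n = len(values)
    mean = sum(values) / n
    z = [v - mean for v in values]
    s0 = sum(sum(row) for row in w)
    num = sum(w[i][j] * z[i] * z[j] for i in range(n) for j in range(n))
    den = sum(zi * zi for zi in z)
    return (n / s0) * num / den

# Four locations on a line with rook adjacency; a monotone trend
# across the locations produces positive autocorrelation.
w = [[0, 1, 0, 0], [1, 0, 1, 0], [0, 1, 0, 1], [0, 0, 1, 0]]
print(morans_i([1.0, 2.0, 3.0, 4.0], w))
```

The eigenvectors of the (doubly centered) weights matrix underlying this index are what the paper feeds into standard linear regression as map-pattern predictors.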
Interquantile Shrinkage in Regression Models
Jiang, Liewen; Wang, Huixia Judy; Bondell, Howard D.
2012-01-01
Conventional analysis using quantile regression typically focuses on fitting the regression model at different quantiles separately. However, in situations where the quantile coefficients share some common feature, joint modeling of multiple quantiles to accommodate the commonality often leads to more efficient estimation. One example of common features is that a predictor may have a constant effect over one region of quantile levels but varying effects in other regions. To automatically perform estimation and detection of the interquantile commonality, we develop two penalization methods. When the quantile slope coefficients indeed do not change across quantile levels, the proposed methods will shrink the slopes towards constant and thus improve the estimation efficiency. We establish the oracle properties of the two proposed penalization methods. Through numerical investigations, we demonstrate that the proposed methods lead to estimations with competitive or higher efficiency than the standard quantile regression estimation in finite samples. Supplemental materials for the article are available online. PMID:24363546
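The two ingredients of such an approach, the quantile check loss and a fused penalty on slopes at adjacent quantile levels, can be sketched as follows (an illustrative simplification of the paper's penalized objective, not the authors' estimator):

```python
def check_loss(u, tau):
    """Quantile-regression check (pinball) loss rho_tau(u)."""
    return tau * u if u >= 0 else (tau - 1.0) * u

def fused_penalty(slopes, lam):
    """Penalty on differences between a predictor's slopes at adjacent
    quantile levels; shrinking these differences to zero recovers a
    constant slope over that region of quantiles."""
    return lam * sum(abs(b2 - b1) for b1, b2 in zip(slopes, slopes[1:]))
```

Minimizing the summed check losses across quantiles plus this penalty is what lets estimation and detection of interquantile commonality happen automatically: where the data support a common slope, the penalty collapses adjacent slopes to the same value.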
Using Cultural Algorithms to Improve Intelligent Logistics
NASA Astrophysics Data System (ADS)
Ochoa, Alberto; García, Yazmani; Yañez, Javier; Teymanoglu, Yaddik
Today logistics is a very important issue within companies, to the extent that some have departments devoted exclusively to it. It has evolved over time and is now a fundamental aspect of the competitive struggle for businesses seeking to consolidate their position or remain leaders in their field. Although logistics can be divided into different classes, our study focuses on timely distribution to the customer at lower cost, with higher sales and better utilization of space, resulting in excellent service. Finally, we prepare a comparative analysis of the results with respect to another optimization method over the solution space.
Asymptotic behavior of degenerate logistic equations
NASA Astrophysics Data System (ADS)
Arrieta, José M.; Pardo, Rosa; Rodríguez-Bernal, Aníbal
2015-12-01
We analyze the asymptotic behavior of positive solutions of parabolic equations with a class of degenerate logistic nonlinearities of the type λu - n(x)u^ρ. An important characteristic of this work is that the region where the logistic term n(·) vanishes, that is K0 = {x : n(x) = 0}, may be non-smooth. We analyze conditions on λ, ρ, n(·) and K0 guaranteeing that the solution starting at a positive initial condition remains bounded or blows up as time goes to infinity. The asymptotic behavior may not be the same in different parts of K0.
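A spatially homogeneous caricature of this dichotomy drops the x-dependence: u' = λu - n u^ρ stays bounded when n > 0 but grows without bound when n = 0, mimicking the degenerate region K0. A forward-Euler sketch (all parameters invented for illustration):

```python
def evolve(u0, lam, n, rho, dt=1e-3, steps=20000):
    """Forward Euler for u' = lam*u - n*u**rho up to time steps*dt."""
    u = u0
    for _ in range(steps):
        u += dt * (lam * u - n * u**rho)
    return u

# n > 0: the solution settles at the equilibrium (lam/n)**(1/(rho-1)) = 1
print(evolve(0.1, lam=1.0, n=1.0, rho=2.0))
# n = 0 (degenerate): pure exponential growth, unbounded as t -> infinity
print(evolve(0.1, lam=1.0, n=0.0, rho=2.0))
```

In the PDE setting the picture is subtler because diffusion couples the degenerate and non-degenerate regions, which is why the behavior can differ across parts of K0.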
2014-01-01
Background Risk adjustment is crucial for comparison of outcome in medical care. Knowledge of the external factors that impact measured outcome but that cannot be influenced by the physician is a prerequisite for this adjustment. To date, a universal and reproducible method for identification of the relevant external factors has not been published. The selection of external factors in current quality assurance programmes is mainly based on expert opinion. We propose and demonstrate a methodology for identification of external factors requiring risk adjustment of outcome indicators and we apply it to a cataract surgery register. Methods Defined test criteria to determine the relevance for risk adjustment are “clinical relevance” and “statistical significance”. Clinical relevance of the association is presumed when observed success rates of the indicator in the presence and absence of the external factor exceed a pre-specified range of 10%. Statistical significance of the association between the external factor and outcome indicators is assessed by univariate stratification and multivariate logistic regression adjustment. The cataract surgery register was set up as part of a German multi-centre register trial for out-patient cataract surgery in three high-volume surgical sites. A total of 14,924 patient follow-ups have been documented since 2005. Eight external factors potentially relevant for risk adjustment were related to the outcome indicators “refractive accuracy” and “visual rehabilitation” 2–5 weeks after surgery. Results The clinical relevance criterion confirmed 2 (“refractive accuracy”) and 5 (“visual rehabilitation”) external factors. The significance criterion was verified in two ways. Univariate and multivariate analyses revealed almost identical external factors: 4 were related to “refractive accuracy” and 7 (6) to “visual rehabilitation”. Two (“refractive accuracy”) and 5 (“visual rehabilitation”) factors
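The "clinical relevance" criterion above reduces to a simple threshold on the difference of observed success rates (a sketch; the 10% range is from the abstract, the function name and example rates are ours):

```python
def clinically_relevant(rate_with_factor, rate_without_factor, threshold=0.10):
    """True when the outcome-indicator success rates in the presence and
    absence of an external factor differ by more than the pre-specified
    range (10 percentage points in the register study)."""
    return abs(rate_with_factor - rate_without_factor) > threshold

print(clinically_relevant(0.85, 0.70))  # True: 15-point gap
print(clinically_relevant(0.85, 0.80))  # False: 5-point gap
```

Factors passing this screen are then tested for statistical significance by stratification and multivariate logistic regression, as described in the Methods.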
Dehesh, Tania; Zare, Najaf; Ayatollahi, Seyyed Mohammad Taghi
2015-01-01
Background. The univariate meta-analysis (UM) procedure, a technique that provides a single overall result, has become increasingly popular. Neglecting the existence of other concomitant covariates in the models leads to loss of treatment efficiency. Our aim was to propose four new approximation approaches for the covariance matrix of the coefficients, which is not readily available for the multivariate generalized least squares (MGLS) method as a multivariate meta-analysis approach. Methods. We evaluated the efficiency of four new approaches, zero correlation (ZC), common correlation (CC), estimated correlation (EC), and multivariate multilevel correlation (MMC), in terms of estimation bias, mean square error (MSE), and 95% probability coverage of the confidence interval (CI) in the synthesis of Cox proportional hazards model coefficients in a simulation study. Result. Comparing the simulation results on the MSE, bias, and CI of the estimated coefficients indicated that the MMC approach was the most accurate procedure compared to the EC, CC, and ZC procedures. The precision ranking of the four approaches across all the above settings was MMC ≥ EC ≥ CC ≥ ZC. Conclusion. This study highlights the advantages of MGLS meta-analysis over the UM approach. The results suggest using the MMC procedure to overcome the lack of information needed for a complete covariance matrix of the coefficients. PMID:26413142
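The ZC and CC approximations both reconstruct a covariance matrix from the reported standard errors and an assumed between-coefficient correlation; only the assumed r differs. A sketch (our illustration, not the authors' code):

```python
def cov_from_correlation(std_errors, r):
    """Approximate covariance matrix of coefficient estimates when only
    standard errors are reported: r = 0 gives the zero-correlation (ZC)
    approximation; a shared r > 0 gives the common-correlation (CC) one."""
    k = len(std_errors)
    return [[std_errors[i] * std_errors[j] * (1.0 if i == j else r)
             for j in range(k)] for i in range(k)]
```

The EC and MMC approaches replace the assumed r with correlations estimated from the data, which is why they came out more accurate in the simulation.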
ERIC Educational Resources Information Center
Abramson, Jane A.
Personal interviews with 100 former farm operators living in Saskatoon, Saskatchewan, were conducted in an attempt to understand the nature of the adjustment process caused by migration from rural to urban surroundings. Requirements for inclusion in the study were that respondents had owned or operated a farm for at least 3 years, had left their…
Hunter, Steven L.
2002-01-01
An inclinometer utilizing synchronous demodulation for high resolution and electronic offset adjustment provides a wide dynamic range without any moving components. A device encompassing a tiltmeter and accompanying electronic circuitry provides quasi-leveled tilt sensors that detect highly resolved tilt change without signal saturation.
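Synchronous demodulation of the kind the device relies on multiplies the sensor signal by an in-phase reference and low-pass filters the product; the resulting DC term is proportional to the tilt-encoded amplitude. A noise-free numerical sketch (sample rate, carrier frequency, and amplitude all invented):

```python
import math

def sync_demod(signal, reference):
    """Multiply by the in-phase reference and average (the crudest
    low-pass filter); returns the amplitude of the in-phase component."""
    return 2.0 * sum(s * r for s, r in zip(signal, reference)) / len(signal)

fs, f, amp = 10000, 1000, 0.37           # sample rate, carrier, tilt amplitude
t = [i / fs for i in range(fs)]          # one second of samples
sig = [amp * math.sin(2 * math.pi * f * ti) for ti in t]
ref = [math.sin(2 * math.pi * f * ti) for ti in t]
print(sync_demod(sig, ref))              # recovers ~0.37
```

Because the demodulated output is a slowly varying DC level, an electronic offset can be subtracted after demodulation, which is how such designs achieve high resolution over a wide dynamic range without signal saturation.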